
spymemcached - issue #10
CancelledKeyException while handling IO with a downed server
I've configured two memcached servers and shut one of them down during a loop of get calls. I can see some INFO logs saying the client is attempting to reconnect. But when I bring the failed memcached back up, the app hangs and stops printing any logs. This happens with both sync and async get.
environment: windows xp/java1.5/memcached-2.0.jar
Comment #1
Posted on Mar 14, 2008 by Helpful Elephant
I've reproduced this. Thanks.
Comment #2
Posted on Mar 14, 2008 by Helpful Elephant
Actually, scratch that. I got a cancellation exception on the client, but once I caught that, the client continued to do the right thing.
Can you provide a small test case? This is what I did:
Comment #3
Posted on Mar 14, 2008 by Happy Hippo
I've found the problem is exception related. The following is my test case; if I put the get() call in a try/catch, it works fine.
===============================================
String m_sMemcachedHosts = "localhost:11211 localhost:11212";
MemcachedClient mc3 = new MemcachedClient(AddrUtil.getAddresses(m_sMemcachedHosts));
for (int i = 0; i < 50; i++)
{
    long t1 = System.currentTimeMillis();
    String val = (String) mc3.get("test");
    long t2 = System.currentTimeMillis();
    System.out.println("[" + i + "] mc3 test=" + val + " (" + (t2 - t1) + ")");
    try
    {
        Thread.sleep(1000);
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}
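For readers following along, the try/catch workaround mentioned above might look roughly like the sketch below. It is not code from the report: `SafeGetDemo` and `safeGet` are hypothetical names, the lookup is passed in as a plain `Function` so the example runs without a memcached server, and catching `RuntimeException` is an assumption about what the client's get() throws. It also uses lambdas, so it needs Java 8 or later rather than the Java 1.5 in the report.

```java
import java.util.function.Function;

public class SafeGetDemo {
    /**
     * Hypothetical helper: runs the lookup and returns null instead of
     * letting a runtime failure escape into the calling loop.
     */
    static Object safeGet(Function<String, Object> lookup, String key) {
        try {
            return lookup.apply(key);
        } catch (RuntimeException e) {
            // Assumption: the failure surfaces as an unchecked exception.
            System.err.println("get(" + key + ") failed: " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a client whose server is down and one that is up.
        Function<String, Object> failing = k -> {
            throw new IllegalStateException("server down");
        };
        Function<String, Object> working = k -> "value-for-" + k;

        System.out.println(safeGet(failing, "test"));
        System.out.println(safeGet(working, "test"));
    }
}
```

With a real client, the loop body above would presumably call something like `safeGet(mc3::get, "test")`, so a single failed lookup is logged and the loop moves on to the next iteration.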
Comment #4
Posted on Mar 14, 2008 by Helpful Elephant
OK, that's similar enough to what I've done.
I'm going to change this to a documentation bug and attempt to make it clearer what this behavior is.
Comment #5
Posted on Mar 14, 2008 by Helpful Elephant
Sorry for thrashing this bug so much, but I just read the detailed report on the list, and there's definitely something going wrong in my client.
Exception in thread "Memcached IO over {MemcachedConnection to /127.0.0.1:11211}" java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
    at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:271)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:262)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:180)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:730)
I'm not certain I can reproduce this myself, but I should be able to anticipate it and make the client handle it a bit better.
Comment #6
Posted on Mar 16, 2008 by Helpful Elephant
Status update:
I was looking into this tonight, and I don't see how that exception should be able to escape. The code in question looks like this:
try {
    [...]
    if(sk.isReadable()) { // CancelledKeyException thrown here
        handleReads(sk, qa);
    }
    [...]
} catch(Exception e) {
    getLogger().info("Reconnecting due to exception on %s", qa, e);
    queueReconnect(qa);
}
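The reasoning above can be checked in isolation with plain java.nio, outside spymemcached. The sketch below is not the client's code: `CancelledKeyDemo` and `cancelledKeyIsCaught` are hypothetical names, and an unbound `ServerSocketChannel` stands in for a memcached connection. It cancels a key, calls `isReadable()` on it, and reports whether a surrounding `catch(Exception)` sees a `CancelledKeyException`. It uses try-with-resources and `UncheckedIOException`, so it assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class CancelledKeyDemo {
    /**
     * Returns true if isReadable() on a cancelled key throws a
     * CancelledKeyException that a plain catch(Exception) intercepts.
     */
    static boolean cancelledKeyIsCaught() {
        try (Selector selector = Selector.open();
             ServerSocketChannel channel = ServerSocketChannel.open()) {
            // Channels must be non-blocking before they can be registered.
            channel.configureBlocking(false);
            SelectionKey sk = channel.register(selector, SelectionKey.OP_ACCEPT);
            sk.cancel(); // the key is invalid from this point on

            try {
                sk.isReadable(); // expected to throw on a cancelled key
                return false;
            } catch (Exception e) {
                // CancelledKeyException is unchecked, so it lands here.
                return e instanceof CancelledKeyException;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("caught CancelledKeyException: " + cancelledKeyIsCaught());
    }
}
```

If the method returns true, the exception is an ordinary unchecked exception that a `catch(Exception)` around the `isReadable()` call intercepts, which fits the observation that it should not be able to escape from that block.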
For my reference, the full report is here:
http://www.nabble.com/Re%3A-MemcachedClient-and-timeout-p16046771.html
Comment #7
Posted on Apr 28, 2008 by Massive Camel
I am seeing this, but I'm still using 1.4.
Exception in thread "Memcached IO over {MemcachedConnection to xxx/xx.xx.xx.xx:11211 xxx/xx.xx.xx.xx:11211}" java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
    at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:271)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:263)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:181)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:715)
Comment #8
Posted on Apr 28, 2008 by Helpful Elephant
1.4 is quite old. I haven't done any work on that branch since July 2007. I've fixed numerous bugs since then (which I believe includes this one).
Comment #9
Posted on Oct 2, 2008 by Helpful Elephant
I can't reproduce this, and the code path I'm aware of suggests it's all but impossible, so I'm going to close this as invalid unless someone can get me a test.
Status: Invalid
Labels:
Type-Defect
Priority-High
Component-Logic