[SPY-94] handle disappearing nodes better, do not block io for other operations Created: 29/Jul/12  Updated: 29/May/13

Status: Open
Project: Spymemcached Java Client
Component/s: None
Affects Version/s: None
Fix Version/s: .next
Security Level: Public

Type: Improvement Priority: Major
Reporter: Matt Ingenthron Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
In testing, we've found that the client can end up blocked until the continuous op timeout comes along and kills a dead connection.

Right now, in practice, this "network cable falls out" failure mode is worse than it needs to be, because the handleIO() method on the evented IO loop blocks waiting for something to do. It's not 100% clear why, but it's believed to be related to all of the caller threads eventually blocking on the "int selected = selector.select(delay);" call in that method. Since our continuous timeout threshold is 1000, the worst case is 1000 * timeout before we dump the connection, which could be several minutes. Ideally, we'd probably push this IO down a layer so that data a new caller wants to send to a particular node isn't caught up in everything else going on. There may be a simpler fix, though, since these should all be non-blocking and selector.select() should pretty much always come back with something.
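
To make the two mechanics above concrete, here is a minimal sketch using only java.nio. It is not spymemcached code: the class SelectDelaySketch and its method names are hypothetical, and the 2500 ms operation timeout is an assumed value picked only for the arithmetic. It shows (a) the worst-case "threshold * timeout" calculation, and (b) that Selector.select(delay) blocks for the full delay when no channel is ready, while a prior Selector.wakeup() makes the next select return immediately.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical illustration class; not part of spymemcached.
final class SelectDelaySketch {

  // Worst-case time before a dead connection is dropped, as described above:
  // `continuousTimeoutThreshold` consecutive timeouts, each lasting one full
  // operation timeout.
  static long worstCaseMillis(int continuousTimeoutThreshold, long opTimeoutMillis) {
    return Math.multiplyExact((long) continuousTimeoutThreshold, opTimeoutMillis);
  }

  // Calls select(delayMillis) and returns how long the call blocked, in
  // milliseconds. delayMillis must be > 0 (select(0) would block indefinitely).
  static long timedSelectMillis(Selector selector, long delayMillis, boolean wakeupFirst)
      throws IOException {
    if (wakeupFirst) {
      // A wakeup() issued while no select is in progress causes the next
      // select call to return immediately.
      selector.wakeup();
    }
    long start = System.nanoTime();
    selector.select(delayMillis);
    return (System.nanoTime() - start) / 1_000_000L;
  }

  public static void main(String[] args) throws IOException {
    // Threshold of 1000 is from the description; 2500 ms is an assumption.
    System.out.println("worst case ms: " + worstCaseMillis(1000, 2500L));

    try (Selector selector = Selector.open()) {
      // No channels are registered, so nothing can become ready.
      System.out.println("idle select(200) blocked ms: "
          + timedSelectMillis(selector, 200L, false));
      System.out.println("select(5000) after wakeup() blocked ms: "
          + timedSelectMillis(selector, 5000L, true));
    }
  }
}
```

Under the assumed 2500 ms timeout, 1000 * 2500 ms is 2,500,000 ms, or roughly 41 minutes; a shorter timeout shrinks this proportionally. The wakeup() behavior is only one mechanism that might be relevant to a simpler fix; whether it applies depends on how handleIO() is actually driven.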
Generated at Wed Apr 16 14:01:20 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.