[SPY-93] Hard-coded 1 second delay in spy could be avoided Created: 27/Jul/12  Updated: 29/May/13

Status: Open
Project: Spymemcached Java Client
Component/s: library
Affects Version/s: 2.7.3
Fix Version/s: .next
Security Level: Public

Type: Bug Priority: Minor
Reporter: Raghavan Srinivas (Inactive) Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
This is based on cbse-87

We must supply a username to the MemcachedClient constructor, and this sets up an authentication step which I'd thought wasn't required**. Authentication causes issues when a node goes down: every operation destined for that node waits one second (hard-coded) for the authentication to clear*. Of course, since the node is unavailable, the auth never completes, so every op takes a full second instead of failing fast. The client app then slows to a crawl because one node is down.

    Is it possible to eliminate this authentication? I've tried setting the username to null but the connections to the cluster keep resetting.

    * See spy class TCPMemcachedNodeImpl. Method addOp() waits on the authLatch. The authLatch is initialized to 1 but is never counted down.

    **Our buckets use dedicated ports and ...
    "When a client connects to a server on a host and dedicated port number, it may use either memcached/membase binary or ASCII protocols, and SASL authentication is not required."
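The failure mode described above can be reproduced in isolation with a plain CountDownLatch: a latch initialized to 1, never counted down, makes every await call pay the full timeout. This is a minimal sketch of the pattern (class name and timing are illustrative, not the actual spy code):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AuthLatchDemo {
    public static void main(String[] args) throws InterruptedException {
        // Mirrors the pattern in TCPMemcachedNodeImpl.addOp():
        // the latch starts at 1 and, with the node down, is never counted down.
        CountDownLatch authLatch = new CountDownLatch(1);

        long start = System.nanoTime();
        // Hard-coded one-second wait for authentication to clear.
        boolean authed = authLatch.await(1, TimeUnit.SECONDS);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        // The op pays the full timeout instead of failing fast.
        System.out.println("authed=" + authed + " elapsedMs=" + elapsedMs);
    }
}
```

Since nothing ever calls authLatch.countDown(), await() returns false only after the full second has elapsed, and it does so again for every subsequent operation routed to the dead node.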

Workaround:

I implemented our own check for down nodes: a connection tracker that implements ConnectionObserver and is registered via MemcachedClient.addObserver(). It tracks the status of all nodes and maintains a list of down nodes. Before an operation is submitted to the MemcachedClient, the tracker checks (using the SocketAddress from NodeLocator.getPrimary(key)) whether the operation would be routed to a down node and, if so, prevents it from being submitted. Only while one or more nodes are down does every operation pay this small hashing-check cost. When a node comes back up, the tracker is notified, the node is cleared from the down list, and the hash check stops.

Something like that should be built into the spy code.
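The workaround above can be sketched roughly as follows. This is a self-contained approximation: in real code the connectionLost/connectionEstablished callbacks come from a ConnectionObserver registered with MemcachedClient.addObserver(), and the primary lookup would be NodeLocator.getPrimary(key).getSocketAddress(); here a Function stands in for the locator, and the class name is hypothetical.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a down-node tracker along the lines of the described workaround.
public class DownNodeTracker {
    private final Set<SocketAddress> downNodes = ConcurrentHashMap.newKeySet();
    private final Function<String, SocketAddress> primaryLocator;

    public DownNodeTracker(Function<String, SocketAddress> primaryLocator) {
        // Stand-in for NodeLocator.getPrimary(key).getSocketAddress().
        this.primaryLocator = primaryLocator;
    }

    // In spymemcached these two would be the ConnectionObserver callbacks.
    public void connectionLost(SocketAddress sa)        { downNodes.add(sa); }
    public void connectionEstablished(SocketAddress sa) { downNodes.remove(sa); }

    /** Gate to call before submitting an op to the MemcachedClient. */
    public boolean isReachable(String key) {
        // Only pay the hashing-check cost while at least one node is down.
        return downNodes.isEmpty()
            || !downNodes.contains(primaryLocator.apply(key));
    }
}
```

Usage would be: skip (or fail fast on) any operation for which isReachable(key) returns false, instead of letting it queue behind the one-second auth wait.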
Generated at Thu Jul 31 10:39:26 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.