
Problem with spymemcached when server is failed over

Wed, 10/19/2011 - 15:00
awarner_theladders
Joined: 10/14/2011

I'm seeing a problem where, with a simple cluster of 3 machines with auto-failover enabled, spymemcached doesn't properly connect to a different node when one node is killed manually. The ultimate stack trace is as follows:

Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancelled
at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:84)
at net.spy.memcached.internal.GetFuture.get(GetFuture.java:38)

Our client library wraps the spymemcached MemcachedClient asyncGet method, cancelling the operation if it exceeds a certain timeout. I've tried both a 1000 ms and a 5000 ms timeout, and neither seemed to affect the number of cancelled operations. The steps to reproduce it for us are as follows:
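
For readers unfamiliar with the pattern, a wrapper of the kind described above might look roughly like the following. This is a minimal sketch, not the poster's actual code: the class TimedGet and the method getOrCancel are hypothetical names, and the helper is written against java.util.concurrent.Future so that it compiles without spymemcached on the classpath. In real use the future would be the one returned by MemcachedClient.asyncGet(key).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of spymemcached.
final class TimedGet {
    private TimedGet() {}

    /**
     * Waits up to timeoutMs for the future to complete. If the wait times
     * out, the operation is cancelled and the fallback value is returned.
     * Any other failure of the operation propagates to the caller.
     */
    static <T> T getOrCancel(Future<T> future, long timeoutMs, T fallback)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on this operation so it does not linger in the queue.
            future.cancel(false);
            return fallback;
        }
    }
}
```

Note that a wrapper like this only covers the case where the wait times out. An operation that has already been cancelled by the time get is called fails immediately, which would be consistent with the timeout length making no difference.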

Three machines running the membase-server-enterprise_x86_64_1.7.1.1.deb distribution of Membase, with auto-failover enabled. The client (I have tried spymemcached 2.7.2 and 2.7.3) is started using vbuckets, with all 3 nodes passed into the URI list (the /pools URI in all cases).

The test creates 50 records in membase (trying to ensure at least one on each server), then loops through each record and does a GET on each with a 1 second sleep in between each GET.

All of this works fine until I forcibly kill the server on one node with sudo pkill -u membase. Membase auto-fails over (verified through the UI), at which point I continue to see the above stack trace on roughly every third GET. It seems the client is having trouble working out which server to fetch values from when the killed server was their primary (possibly locking somewhere and timing out?). If I telnet to each server on port 11213, I see that the value I'm expecting is actually there on both of the remaining nodes.

If I then bring the killed server up again and rebalance the cluster, the client recovers gracefully.

Any advice?

Thu, 10/20/2011 - 09:34
dan
Joined: 01/05/2011

With earlier versions of the library we ran into issues like this, where the client had trouble with server failures and cluster changes. What we did was catch Exception e (yeah, I know), destroy the client, and create a new one.

We use a connection pool which makes it rather easy to do.
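
The workaround described above could be sketched as follows. This is a rough illustration under stated assumptions, not dan's actual code: RecreatingClient and its Op interface are hypothetical names, the type parameter C stands in for the client type, and the factory and closer are supplied by the caller. With spymemcached the closer would typically call MemcachedClient.shutdown().

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical wrapper for illustration; C stands in for the client type.
final class RecreatingClient<C> {
    /** An operation against the client that may throw any exception. */
    interface Op<C, R> {
        R apply(C client) throws Exception;
    }

    private final Supplier<C> factory; // builds a fresh client
    private final Consumer<C> closer;  // releases a discarded client
    private volatile C client;

    RecreatingClient(Supplier<C> factory, Consumer<C> closer) {
        this.factory = factory;
        this.closer = closer;
        this.client = factory.get();
    }

    /** Runs op; on any exception, replaces the client and rethrows. */
    <R> R call(Op<C, R> op) throws Exception {
        C current = client;
        try {
            return op.apply(current);
        } catch (Exception e) {
            rebuild(current);
            throw e; // the caller decides whether to retry
        }
    }

    private synchronized void rebuild(C failed) {
        if (client != failed) {
            return; // another thread has already replaced this client
        }
        try {
            closer.accept(failed);
        } catch (RuntimeException ignored) {
            // A failure while closing a broken client is not actionable here.
        }
        client = factory.get();
    }
}
```

Rebuilding on every exception is deliberately blunt, as the post itself admits. Narrowing the catch to the exceptions that actually indicate a broken connection would avoid recreating the client for unrelated errors.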

Thu, 10/20/2011 - 09:57
awarner_theladders

Thanks for the response. Out of curiosity, when you say you use a connection pool, do you mean that you keep around a pool of client objects? We've been using one instance of the client (a Spring bean), so destroying and recreating it might be a pain, although certainly not unreasonably difficult.

Is this the recommended way of handling any exception?

Tue, 11/08/2011 - 10:45
ingenthr
Joined: 03/16/2010

Note there was a fix related to this in spymemcached 2.7.3. I believe all of the known issues have been solved here now.

Mon, 08/13/2012 - 13:40
ingenthr

Note that version 2.8 has since come out, and there was a split to a CouchbaseClient 1.0. See http://www.couchbase.com/develop/ for details.
