
Node failure management

Wed, 10/06/2010 - 03:27
Bets

Hey there.
I would like some explanation of node failure scenarios.

Here is our scenario:

We are using the .NET client to connect to the NS cluster and retrieve the active nodes (via REST).
We are running a cluster with 6 nodes.
During some server maintenance this morning we took one of the nodes out of the cluster and received some timeout errors.
The question is: what is the best practice for handling single-node failures?
When a node goes down (a timeout occurs), should we instruct our NS client to retrieve the active nodes again, or is there a way to do this automatically?

Regards

Wed, 10/06/2010 - 10:09
Perry Krug

Thanks for the inquiry. What version of the Enyim client are you using? There have been some recent changes to improve the behavior around handling of failures.

At a high level, the behavior should be that when a node goes down, there will be a brief timeout period (configurable, 10 seconds by default I believe) until the failure is detected by the client. It should then mark that node as dead and continue operations with the remaining servers. The client will continue to periodically poll that dead server and will add it back into its pool when it can connect to it again.
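
The mark-dead-and-retry behavior described above can be pictured with a minimal sketch. It is not the Enyim client's implementation or API: the `NodePool` class, the `probe` callback, and the `retry_interval` parameter are hypothetical names chosen for illustration, and plain modulo hashing stands in for whatever key distribution a real client uses.

```python
import time
import zlib


class NodePool:
    """Hypothetical client-side node pool (illustration only, not the Enyim API)."""

    def __init__(self, nodes, probe, retry_interval=10.0, clock=time.monotonic):
        self.nodes = list(nodes)              # every configured node
        self.probe = probe                    # callable(node) -> True if reachable
        self.retry_interval = retry_interval  # assumed seconds between re-probes
        self.clock = clock                    # injectable so it can be tested
        self.dead = {}                        # node -> time it was last found down

    def mark_dead(self, node):
        """Call this after an operation against `node` has timed out."""
        self.dead[node] = self.clock()

    def revive_recovered(self):
        """Re-probe dead nodes whose retry interval has elapsed."""
        now = self.clock()
        for node, last_checked in list(self.dead.items()):
            if now - last_checked >= self.retry_interval:
                if self.probe(node):
                    del self.dead[node]       # back into the pool
                else:
                    self.dead[node] = now     # still down, try again later

    def alive(self):
        return [n for n in self.nodes if n not in self.dead]

    def node_for(self, key):
        """Pick a live node for `key`, skipping nodes marked dead."""
        self.revive_recovered()
        alive = self.alive()
        if not alive:
            raise RuntimeError("no live nodes in the pool")
        return alive[zlib.crc32(key.encode("utf-8")) % len(alive)]
```

In this sketch the caller invokes `mark_dead(node)` when an operation times out; later calls to `node_for(key)` then route around that node until a probe succeeds again.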

In the interest of full transparency, we have also seen some server-side issues, which are already fixed in the soon-to-be-released 1.6.0 version.

Thanks again, please let me know what else I can do to help.

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Membase: http://www.membase.com/products-and-services/overview
Call or email "sales -at- membase -dot- com" today!

Thu, 10/07/2010 - 00:46
Bets

Thank you for your response, Perry.
Very prompt, as always.

It seems that our problem may be the outdated Enyim client version.
I've seen that the "detect offline nodes" feature was introduced in the latest release, so we will update our clients and see how they handle the failures.

I will report back here.

I hope that this fixes the issue, because right now it seems that we have a single point of failure, and that is not acceptable.
Also, do you think that increasing the connection timeout in the Enyim configuration could help prevent errors while the client is switching to another node after one has failed?

Thanks again.

Thu, 10/07/2010 - 10:37
Perry Krug

While increasing the connection timeout will reduce the errors the client sees, it will also increase the amount of time that a client may hang waiting for data. In the case of memcached (as opposed to Membase), I think it is more desirable to have the client "miss" the data and go about regenerating it, so that the user can continue rather than wait. It's a balance that depends on the application and its behavior.
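
The "miss and regenerate" approach can be sketched as follows. This is an illustration only, not Enyim's API: `get_or_regenerate`, `cache_get`, `cache_set`, and `regenerate` are hypothetical names, and the sketch assumes the cache call raises `TimeoutError` when a node does not answer within the (short) configured timeout.

```python
def get_or_regenerate(key, cache_get, regenerate, cache_set=None):
    """Return the cached value, treating a cache timeout like a cache miss.

    Assumption: cache_get(key) returns None on a miss and raises
    TimeoutError when the node does not respond in time.
    """
    try:
        value = cache_get(key)
    except TimeoutError:
        value = None                  # node is slow or down: treat as a miss

    if value is None:
        value = regenerate(key)       # rebuild from the source of truth
        if cache_set is not None:
            try:
                cache_set(key, value)  # best effort; ignore a slow node
            except TimeoutError:
                pass
    return value
```

With a short timeout, a failed node costs the user one regeneration rather than a long wait; with a long timeout, the same call would block until the timeout expires before falling back.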

Perry
