How does Hard Failover work? Our nodejs client kept trying to hit a node that was in Pend state

alexegli · September 8, 2015, 7:00pm

We have a Couchbase cluster of two nodes, both running Enterprise 3.0.2 on Ubuntu 12.04. We recently had to upgrade the memory on one of our Couchbase Server nodes, which involved restarting the VM. Our buckets are configured with 1 replica, so I thought that when one node went down, the other node would activate its replicas and take over for it. However, our Node.js client kept getting connection errors while the node was in the Pend state, and the errors stopped once it was back up and running. Does a hard failover not happen if the node is in the Pend state, or do we have to do something special in the cluster to make sure our clients don’t try to contact nodes that have gone down?

cihangirb · September 8, 2015, 9:10pm

If you do not have auto-failover enabled, we won’t fail over a node automatically. You can still do this manually, but it would be easier with auto-failover.
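For reference, auto-failover can also be switched on outside the web console. The snippet below is only a sketch: it builds, but does not send, a request to the `/settings/autoFailover` REST endpoint as I remember it from the Couchbase REST API, so please check the path and parameters against the documentation for your server version. The host name, credentials, and the 30-second timeout are placeholder assumptions, not values from this thread.

```python
import base64
import urllib.parse
import urllib.request


def build_auto_failover_request(host, user, password, timeout_s=30):
    """Build (but do not send) a request that enables auto-failover.

    Assumption: the cluster manager listens on port 8091 and accepts
    form-encoded `enabled` and `timeout` fields on /settings/autoFailover.
    """
    body = urllib.parse.urlencode({"enabled": "true", "timeout": timeout_s})
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"http://{host}:8091/settings/autoFailover",
        data=body.encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


# Hypothetical host and credentials, for illustration only.
req = build_auto_failover_request("cb1.example.com", "Administrator", "password")
# Send it with urllib.request.urlopen(req) once the values match your cluster.
```

Building the request separately from sending it makes it easy to inspect exactly what would be posted before touching a live cluster.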

In a situation like yours, it is easy to avoid these failures in the app:

1. Fail over the node that will be restarted.
2. Restart the Couchbase Server node.
3. Once it is back online, add the node back and rebalance.

With these steps, after the failover, new incoming operations are sent to the node that takes over. Once you add the node back and rebalance, the restarted node starts taking traffic again, and throughout the failover the Node.js app will automatically keep working without mass failures.
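If you want to script the planned-restart procedure, here is a rough sketch of the order of calls. It only builds a plan and does not contact any server. The endpoint paths (`/controller/failOver`, `/controller/setRecovery`, `/controller/rebalance`) and their parameters are as I recall them from the Couchbase REST API and should be verified for your version. The `maintenance_plan` helper, the node names, and the choice of delta recovery are assumptions made for illustration.

```python
def maintenance_plan(node, other_nodes):
    """Return the ordered steps for a planned restart of `node`.

    Node names use the otpNode form, e.g. "ns_1@10.0.0.2".
    Each step is (action, target, form_params); nothing is sent.
    """
    known = ",".join([node, *other_nodes])
    return [
        # 1. Fail over the node so its replicas are activated elsewhere.
        ("POST", "/controller/failOver", {"otpNode": node}),
        # 2. Restart the machine; this happens outside the REST API.
        ("RESTART", node, {}),
        # 3. Mark the node for recovery, i.e. add it back.
        ("POST", "/controller/setRecovery",
         {"otpNode": node, "recoveryType": "delta"}),
        # 4. Rebalance so the node takes traffic again.
        ("POST", "/controller/rebalance",
         {"knownNodes": known, "ejectedNodes": ""}),
    ]


# Hypothetical two-node cluster, for illustration only.
plan = maintenance_plan("ns_1@10.0.0.2", ["ns_1@10.0.0.1"])
for action, target, params in plan:
    print(action, target, params)
```

Each POST step would go to the cluster manager on a node that stays up, so the requests still succeed while the other node is restarting.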
thanks
-cihan

alexegli · September 8, 2015, 9:14pm

The node restarted because the VM ran out of memory and crashed. I then kept the node down and added more memory, but I thought Couchbase would handle cases where a node in a cluster dies and becomes unresponsive. Is enabling auto-failover as simple as going to the Auto-Failover page in settings and clicking the enable checkbox? Are there any downsides to enabling it, or anything we need to do to prepare for it? Will the act of enabling it temporarily take down the cluster or impact it in any way?
