Auto-Rebalance after node failover

dpudi1122 · February 7, 2018, 5:04pm

Hello,

We had a node fail, and Couchbase handled the auto-failover but did not run an automatic rebalance, which is fine. After that, however, a second node failed and was not failed over. I am looking for a way to rebalance automatically after a failover, so that a second node failure would also trigger an auto-failover and a rebalance.
Our cluster has 28 nodes and 4 buckets; each node has 24 cores and 96 GB of RAM.

anil · February 7, 2018, 8:20pm

Hi @dpudi1122,

Refer to the “Using automatic failover” chapter in our documentation. It explains how the auto-failover feature works and how to reset the quota so that a second failed node is automatically failed over.
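As a rough illustration of the quota reset mentioned above, here is a minimal Python sketch that builds (but does not send) a request to the cluster REST endpoint for resetting the auto-failover count. The host name, credentials, and the helper name build_reset_count_request are placeholders for illustration, and the default admin port 8091 is assumed; check the endpoint path against the documentation for your server version before relying on it.

```python
import base64
import urllib.request


def build_reset_count_request(host, user, password, port=8091):
    """Build a POST request that asks the cluster to reset its auto-failover count.

    Assumptions: the cluster admin REST API listens on `port` over plain HTTP
    and accepts HTTP Basic authentication. Host and credentials are placeholders.
    """
    url = f"http://{host}:{port}/settings/autoFailover/resetCount"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # An empty body with method="POST" is enough; the endpoint takes no parameters.
    req = urllib.request.Request(url, data=b"", method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req


# Hypothetical usage: build the request, inspect it, then send it explicitly.
req = build_reset_count_request("cb-node1.example.com", "Administrator", "password")
# urllib.request.urlopen(req)  # uncomment to actually send the request
```

Building the request separately from sending it makes it easy to inspect or log the call before it touches a production cluster.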

Thanks

guy.klages · February 8, 2018, 12:23am

Hello Dpudi1122,

For a while, there has been discussion about this topic and the exact features related to it for our next Server version, so please tell us more about your requirements and expected behavior. Do you want the option for a 2nd auto-failover, or do you need a setting for any user-specified number of auto-failovers? Should it be adjustable only by the Administrator, or by which user roles? Do you need an auto-rebalance every time an auto-failover occurs, or only after the 2nd auto-failover? Do you need auto-failover and auto-rebalance to always happen together, or to be separate (mutually exclusive) operations? Please let us know any other related details or considerations.

perry · February 9, 2018, 2:27pm

In our next major release (this year) we plan on extending the auto-failover quota to be configurable up to 3 without needing manual intervention to reset it. Would that meet your requirements here?

The challenge with doing an automatic rebalance after failover is that it ends up recreating data and putting more load onto an already degraded (N-1) cluster, so historically we have recommended strongly against that.

We are indeed having more discussions internally about whether it makes sense to add an option for auto-rebalance after failover, but it would then require you to be aware of a possible “cascading” failure and to size the cluster so that it can cope at reduced capacity.
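To make the sizing point concrete, here is a small back-of-the-envelope sketch in Python using the cluster from the original question (28 nodes, 96 GB of RAM each). It assumes data and load are spread evenly across nodes and that the 96 GB figure is the RAM available per node; both are simplifications, and the function names are illustrative only.

```python
def load_factor(total_nodes, failed_nodes):
    """Relative per-node load once the survivors absorb the failed nodes' share.

    Assumes an even distribution of data and traffic across nodes.
    """
    if not 0 <= failed_nodes < total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes - 1")
    return total_nodes / (total_nodes - failed_nodes)


def remaining_ram_gb(total_nodes, failed_nodes, ram_per_node_gb):
    """Total RAM left in the cluster after the given number of node failures."""
    if not 0 <= failed_nodes < total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes - 1")
    return (total_nodes - failed_nodes) * ram_per_node_gb


# The 28-node, 96 GB-per-node cluster with 0 to 3 nodes failed over.
for failed in range(4):
    factor = load_factor(28, failed)
    ram = remaining_ram_gb(28, failed, 96)
    print(f"{failed} failed: {ram} GB RAM left, per-node load x{factor:.3f}")
```

If the working set and peak load still fit comfortably at the worst case you want to tolerate, the cluster has headroom for a rebalance onto the surviving nodes; if not, a rebalance on the degraded cluster risks the cascading failure described above.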
