Couchbase cluster stuck after node failure

paolop · February 6, 2017, 10:54am

Hello,
Today one of the servers in a small cluster of ours suffered a hardware failure. The cluster had 4 nodes, running only ‘memcached’ type buckets, on version 4.1.1-5914 Community Edition (build-5914).

We fixed the server (it required a full re-install), but now we are stuck: the cluster lists the old server as “pending rebalance” and won’t come out of that state.

From the CLI I tried server-remove -force, which fails with “unable to rebalance cluster (400) Bad Request {“deltaRecoveryNotPossible”:1}”. A rebalance command returns the same error. I also tried setting the recovery type to full, but it did not help.

Any idea how I can evict that node for good and rebalance the remaining three? As I mentioned, all buckets are memcached, so there is very little data to rebalance, per se.

Many thanks.

drigby · February 6, 2017, 3:33pm

You should be able to Failover the dead node on the UI (use hard failover), then add the new one in and Rebalance.

paolop · February 7, 2017, 2:20am

Yes, thanks for the reply. It worked, eventually. I had to ‘cancel’ the rebalance via CLI, then do a rebalance again to remove the node. Not terribly intuitive.
Thanks again.
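
For anyone landing here later, here is a rough sketch of the sequence described above: stop the stuck rebalance, hard-failover the dead node, then rebalance it out. The order is one reading of the two replies, not a confirmed procedure. The addresses and credentials are placeholders, the helper names (cli, recovery_plan, run_plan) are made up for illustration, and the flags are the ones I believe the 4.x couchbase-cli accepts, so check couchbase-cli <command> --help on your version first. By default it only prints the commands.

```python
import shlex
import subprocess

# Placeholders: substitute a healthy node, the failed node, and real credentials.
CLUSTER = "10.0.0.1:8091"
DEAD_NODE = "10.0.0.4:8091"
USER = "Administrator"
PASSWORD = "password"


def cli(command, *extra):
    """Build one couchbase-cli invocation as an argument list."""
    return ["couchbase-cli", command, "-c", CLUSTER,
            "-u", USER, "-p", PASSWORD, *extra]


def recovery_plan(dead_node=DEAD_NODE):
    """Commands for the assumed sequence, in order."""
    return [
        # 1. Cancel the rebalance that is stuck in the pending state.
        cli("rebalance-stop"),
        # 2. Hard failover of the dead node (--force skips graceful failover).
        cli("failover", f"--server-failover={dead_node}", "--force"),
        # 3. Rebalance the remaining nodes, ejecting the dead one.
        cli("rebalance", f"--server-remove={dead_node}"),
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Dry run: shows what would be executed without touching the cluster.
run_plan(recovery_plan())
```

Once the printed commands look right for your cluster, run_plan(recovery_plan(), dry_run=False) executes them one by one and stops at the first failure. The rebuilt server can then be added back and rebalanced in as a fresh node.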
