2 of 6 nodes have been in the pending state for a very long time, so I can neither rebalance the cluster nor remove nodes.
This happened while I was doing an online upgrade from 2.2.0 CE to 3.0.1 CE on 3 nodes.
My cloud server cost is increasing (because of the 3 extra servers), and I do not know how to handle this.
Can anyone give me a few tips?
Apologies for the delayed response. Do you still need help?
It looks like the pending servers are warming up at the moment (high CPU utilization). Were they restarted? Could you share the logs as well?
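In case it helps to confirm which nodes are still warming up, here is a minimal sketch that lists the non-healthy nodes from the cluster's REST endpoint `/pools/default`. It assumes the admin port 8091 and basic auth. The host, credentials and function names are placeholders, and the exact status strings (for example `warmup`) should be checked against what your version returns.

```python
import base64
import json
import urllib.request


def fetch_pools_default(host, user, password):
    """Fetch the cluster overview JSON (assumes admin port 8091 and basic auth)."""
    req = urllib.request.Request(f"http://{host}:8091/pools/default")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def non_healthy_nodes(pools_default):
    """Return (hostname, status) for every node whose status is not 'healthy'."""
    return [
        (node.get("hostname", "?"), node.get("status", "?"))
        for node in pools_default.get("nodes", [])
        if node.get("status") != "healthy"
    ]


if __name__ == "__main__":
    # Placeholder host and credentials; replace with your own.
    info = fetch_pools_default("127.0.0.1", "Administrator", "password")
    for hostname, status in non_healthy_nodes(info):
        print(hostname, status)
```

Running this a few minutes apart shows whether the pending nodes are making progress or are stuck in the same state.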
Actually, I have another critical problem.
I force-stopped 3 of the 2.2.0 CE nodes, upgraded them offline to 3.0.1 CE, and tried rebalancing the cluster.
The 3 old nodes (previously 2.2.0 CE) have 50GB disks; their disk usage was 91% before rebalancing.
The 3 new nodes (3.0.1 CE) have 200GB disks.
I marked the 3 old nodes as ‘Remove’ and performed ‘Rebalance’.
I expected the data from the 3 old nodes to move to the new nodes.
But after a few hours, 2 of the 3 old nodes (50GB) went down because their disks were full.
I couldn’t understand the situation. How could disk usage on the old nodes (marked as ‘Remove’) go up during rebalancing?
The 2 dead nodes didn’t come back up, so I had to fail them over (losing the data on them).
Now I have 1 more 50GB node to remove, and I am afraid that the same thing will happen.
Am I missing something here?
How can I properly rebalance the cluster and safely remove the remaining 50GB node?
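For reference, the headroom implied by the numbers above is small: 91% of a 50GB disk leaves about 4.5GB free on each old node. A minimal sketch of that arithmetic follows. The function name is only illustrative, and it says nothing about how much space Couchbase itself needs during a rebalance.

```python
def free_gb(capacity_gb, used_percent):
    """Free space in GB for a disk of the given size and usage percentage."""
    return capacity_gb * (100 - used_percent) / 100


# Old nodes: 50GB disks at 91% usage before the rebalance.
old_node_free = free_gb(50, 91)
print(f"free space per old node: {old_node_free} GB")
```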
I am also seeing this behaviour in Version: 3.0.2-1603 Enterprise Edition (build-1603).
We use Couchbase Server in production at MakeMyTrip.
Could you please provide more help with this? Out of 25 servers, 14 are in the pending state.
Please advise what needs to be done.