Node failure while rebalancing, won't come back up... data loss?

I had a 6-node cluster with 1 replica and decided to add 4 new nodes and run a rebalance (10 nodes now).
While it was rebalancing (between 90% and 99% done), one of the nodes had a disk issue and became unresponsive on port 8091. That node was one of the original 6 and showed 97% done.
Restarting the couchbase-server service was a no go: the service wouldn't stop even after a few hours, and I saw no disk IO either. So I reset the machine, and now the cluster sees that node as down.
After the reboot the node won't show up in the cluster...
When I go to the address of that failed node, I see the new node setup page on port 8091, but the cluster still sees that node as part of the cluster, in a down state.
How can I get that node to rejoin the cluster without losing my data? I'm pretty sure I'm missing replicas for data on that down node...

Thanks for any help or suggestions.

1 Answer

Losing one node is OK since you have a replica configured: you should be able to fail over the down node and then rebalance.

Since your environment is not stable, I recommend taking a backup first.
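
For reference, here is a rough sketch of how the fail over and rebalance steps could be scripted against the cluster's REST interface on port 8091, once the backup is done. It is only an illustration under assumptions: it assumes `/pools/default` lists each node with an `otpNode` name and a `status` field, and that `/controller/failOver` and `/controller/rebalance` accept the form fields shown. Please check these against the REST documentation for your Couchbase version before running anything. The helper names `plan_failover` and `post_form`, the host, and the credentials are all hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def plan_failover(pool_info):
    """Build the form fields for fail over and rebalance from a
    /pools/default style payload (assumed shape: {"nodes": [...]})."""
    nodes = pool_info["nodes"]
    # Only nodes explicitly reported as unhealthy are treated as down;
    # nodes that are still warming up are left alone.
    down = [n["otpNode"] for n in nodes if n.get("status") == "unhealthy"]
    known = [n["otpNode"] for n in nodes]
    return {
        # One fail over request per down node.
        "failover": [{"otpNode": name} for name in down],
        # Rebalance lists every known node and ejects the failed ones.
        "rebalance": {
            "knownNodes": ",".join(known),
            "ejectedNodes": ",".join(down),
        },
    }


def post_form(base_url, path, fields, user, password):
    """POST url-encoded form fields with HTTP basic auth (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        base_url + path,
        data=urllib.parse.urlencode(fields).encode(),
        headers={"Authorization": "Basic " + token},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode()


# Example use against a healthy node of the cluster (not executed here;
# the address and credentials are placeholders):
#
#   plan = plan_failover(pool_info)   # pool_info fetched from /pools/default
#   for fields in plan["failover"]:
#       post_form("http://10.0.0.1:8091", "/controller/failOver",
#                 fields, "Administrator", "password")
#   post_form("http://10.0.0.1:8091", "/controller/rebalance",
#             plan["rebalance"], "Administrator", "password")
```

The planning step is kept separate from the HTTP calls so you can print and review exactly which node would be failed over before sending anything to the cluster. The same two steps are also available from the web console if you prefer not to script them.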