Rebalancing a large Couchbase cluster

Hi, we are running Couchbase as a KV store with around 43 TB of data in a 16-node cluster, with replication factor 1.
Whenever we need to scale up or down we have to rebalance, and rebalancing is becoming painful for us.
As soon as we start rebalancing we suffer around 2 days of downtime: our clients are not able to read from or write to CB.
Is there any way to avoid this?


Rebalance is a completely online operation. It can take a long time, but there should be no downtime. Can you explain what you mean by ‘downtime of around 2 days’?
Also, 5 TB of data per node (including the replica data) is over what we currently recommend. How much RAM do you have allocated on each node?

Hi Shivani_g,
We are using nodes with 32 GB RAM, with 25 GB allocated to the bucket on each node.
Rebalancing takes approx 2-2.5 days to reach 100%. CB does not allow reads or writes (which leads to application downtime) until rebalancing on a node is 100% complete. As soon as the first node is 100% rebalanced, reads and writes on CB resume, while rebalancing continues on the other nodes.

Is there a specific config for rebalancing without downtime?

@shivani_g, any update on this?

Based on the details you have provided, it seems your memory-to-data ratio is << 1%.
Couchbase recommends at least a 10% memory-to-data ratio for operational (including rebalance) stability.
Do you see errors during rebalance? If you do, can you share them?

@shivani_g, is there any documentation related to this? Also, is this a default feature in the Community Edition as well, or do we need to configure it?

Rebalance is available in the Community Edition and there is no need to configure it; it is automatically configured. One rule of thumb to follow for a stable rebalance is that the ratio of memory to data on disk should be > 10%. If you go lower than that you can run into issues.

Hi @shivani_g, can you give an example of “the ratio of memory to data on disk should be > 10%”? I didn’t get this. Thanks.

E.g. if you have 32 GB of RAM on a node allocated to the bucket, you should not be storing more than 320 GB of data on that node (this includes replica data as well). Staying at or below 320 GB ensures that you do not go below the 10% memory-to-disk ratio.

In your case, you have 43 TB of data on 16 nodes, which means around 2.6 TB of data per node. Each node has 25 GB allocated to the bucket as per your comment. That is < 1% memory-to-disk, which can cause rebalance instability and long rebalance durations, as well as the significant impact to the front-end workload that you are seeing.
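To make that arithmetic concrete, here is a small Python sketch using the numbers from this thread (the function name is just illustrative, not a Couchbase API):

```python
def mem_to_disk_ratio(ram_gb: float, data_on_disk_gb: float) -> float:
    """Ratio of bucket RAM quota to data on disk (active + replica)."""
    return ram_gb / data_on_disk_gb

# Rule-of-thumb example: 32 GB RAM with at most 320 GB of data keeps the ratio at 10%
print(f"{mem_to_disk_ratio(32, 320):.1%}")            # 10.0%

# This cluster: 43 TB / 16 nodes ~= 2687.5 GB per node, 25 GB quota per node
print(f"{mem_to_disk_ratio(25, 43_000 / 16):.2%}")    # 0.93%
```

The second number is roughly a tenth of the recommended minimum, which matches the < 1% figure above.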

You can either add more nodes to your cluster or use nodes with more RAM: 256 GB at least, preferably 512 GB per node if you are going to keep the same number of nodes.
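As a sketch of the sizing math behind that recommendation (assuming the 10% rule of thumb and the figures in this thread):

```python
TOTAL_DATA_GB = 43_000   # ~43 TB across the cluster, including replicas
NODES = 16
TARGET_RATIO = 0.10      # 10% memory-to-disk rule of thumb

data_per_node_gb = TOTAL_DATA_GB / NODES
ram_needed_per_node_gb = data_per_node_gb * TARGET_RATIO

print(f"Data per node: {data_per_node_gb:.1f} GB")                  # 2687.5 GB
print(f"Bucket RAM needed per node: {ram_needed_per_node_gb:.1f} GB")  # 268.8 GB
```

Roughly 269 GB of bucket quota per node is needed at the current node count, which is why 256 GB is the floor and 512 GB gives comfortable headroom.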

Thanks @shivani_g for the clarification. Is there any official documentation regarding the 10% rule you mentioned? If you can share it, that would be great, so that I can take this forward in my implementation.

There is no documentation around it currently. However, it is a Best Practice that we provide during Sizing engagements.