Rebalancing takes a long time on Couchbase Server (several days)

Hi,

Environment:
Couchbase version: 2.1.1 Community Edition (build-764)

Cluster details:
3-node Couchbase cluster

Memory:
In Use: 33.7 GB
Unused: 14.9 GB

Disk:
In Use: 55.7 GB

Total buckets: 10
Item count: > 92000000
Replication: 1

Issue:
Couchbase Server takes a long time to finish rebalancing when the cluster holds a lot of data. Is there any way to speed up rebalancing on such clusters?

Thanks,
Richards Peter.

Hi,

We’ve made substantial improvements since version 2.1.1, including several optimizations around rebalance operations. The number of buckets listed (10) is on the upper end of what is supported, and resource contention is the likely culprit for the slow rebalance. If you can schedule a maintenance window and migrate the data to a newer, more powerful cluster with additional nodes, this could substantially improve performance.

Thanks

Todd

Hi,

We are also facing a similar issue. Rebalancing takes a long time and eventually gets stuck.
We use Couchbase Community Edition 3.0.1. This is a 6-node cluster with 5 Couchbase buckets and 1 memcached bucket, with no replication.

Memory:
Total: 120 GB
In Use: ~1.5 GB

Disk: ~1.5 GB

When data is present, the rebalance takes a long time. We waited for more than 2 hours and it was still running. The log shows that the rebalance is a “swap rebalance”.
Once the data is flushed, the rebalance completes within 15 minutes. We had a hardware failure in production yesterday, and in the end we had to flush all the data to complete the rebalance.
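
To tell a slow rebalance from a stuck one, it can help to sample the progress figure over time instead of watching the UI. Below is a minimal Python sketch, assuming the cluster exposes the REST endpoint /pools/default/rebalanceProgress on port 8091 and that it returns a "status" field plus one {"progress": fraction} entry per node. The host, credentials, helper names, and the stall threshold are hypothetical and not taken from this thread.

```python
import base64
import json
import urllib.request


def fetch_rebalance_progress(host, user, password, port=8091):
    """Fetch the raw rebalance progress document from one cluster node.

    Assumes the REST endpoint /pools/default/rebalanceProgress and
    HTTP basic authentication with an administrator account.
    """
    url = f"http://{host}:{port}/pools/default/rebalanceProgress"
    request = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode())


def summarize_progress(payload):
    """Return (status, average progress across nodes, or None if absent).

    Assumes per-node entries look like {"progress": 0.27}; any other
    keys in the payload are ignored.
    """
    status = payload.get("status", "unknown")
    per_node = [
        value["progress"]
        for value in payload.values()
        if isinstance(value, dict) and "progress" in value
    ]
    if not per_node:
        return status, None
    return status, sum(per_node) / len(per_node)


def is_stalled(samples, min_delta=0.001):
    """True if average progress barely moved between first and last sample.

    `samples` is a list of (status, progress) tuples collected over time;
    `min_delta` is an arbitrary threshold chosen for illustration.
    """
    values = [progress for _, progress in samples if progress is not None]
    return len(values) >= 2 and (values[-1] - values[0]) < min_delta
```

Calling summarize_progress(fetch_rebalance_progress(...)) every few minutes and passing the collected tuples to is_stalled gives a rough signal of whether the rebalance is still moving data or has stopped advancing.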

We would appreciate any help in solving this issue.

Dear Team,

I’m also facing a similar issue. The data rebalance completes normally, as expected, but the indexer rebalance sometimes takes more than 4 hours. We have roughly 200 to 300 indexes with replica=1. At other times, the whole rebalance on the same cluster completes in no more than 1 hour. Why do we see such a long delay?

Could you share any possible reasons? We need to find out what is causing this problem.