Rebalance failing - possibly due to primary index running out of space?

mvgetz · July 21, 2016, 2:07pm

We created a primary index on a bucket with 450M documents, and the indexing volume ran out of space. This caused the indexing thread on that node to restart continually. A graceful removal of that node from the cluster did not work, so we did a hard failover. Since then, rebalances have been failing. I cleaned the data off the node and added it back to the cluster, but that did not help.

There are a lot of errors in the logs:
memcached.log
2016-07-20T21:59:20.487794-05:00 WARNING (stats) Notified the timeout on checkpoint persistence for vbucket 877, id 0, cookie 0x7fe182ab9a80
2016-07-20T21:59:20.487841-05:00 WARNING 121: Slow SEQNO_PERSISTENCE operation on connection (127.0.0.1:58703 => 127.0.0.1:11209): 31000 ms
2016-07-20T21:59:20.498013-05:00 WARNING (stats) Notified the timeout on checkpoint persistence for vbucket 876, id 0, cookie 0x7fe182af4780
2016-07-20T21:59:20.498067-05:00 WARNING 122: Slow SEQNO_PERSISTENCE operation on connection (127.0.0.1:54497 => 127.0.0.1:11209): 31000 ms
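
To get a feel for how widespread these warnings are, something like the following rough Python sketch could tally them. It assumes the message wording is exactly as in the excerpt above; the function name `summarize` and the two regexes are my own and not part of any Couchbase tooling.

```python
import re
from collections import Counter

# Matches the checkpoint-persistence timeout warning and captures the vbucket id.
TIMEOUT_RE = re.compile(
    r"Notified the timeout on checkpoint persistence for vbucket (\d+)"
)
# Matches the slow SEQNO_PERSISTENCE warning and captures the duration in ms.
SLOW_RE = re.compile(
    r"Slow SEQNO_PERSISTENCE operation on connection \([^)]*\): (\d+) ms"
)


def summarize(lines):
    """Return (timeouts per vbucket, list of slow-operation durations in ms)."""
    timeouts = Counter()
    slow_ms = []
    for line in lines:
        # finditer handles several entries that ended up on one physical line.
        for m in TIMEOUT_RE.finditer(line):
            timeouts[int(m.group(1))] += 1
        for m in SLOW_RE.finditer(line):
            slow_ms.append(int(m.group(1)))
    return timeouts, slow_ms


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python summarize.py /path/to/memcached.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        per_vbucket, durations = summarize(fh)
    print("timeouts per vbucket:", dict(per_vbucket))
    print("slow operations:", len(durations))
```

This only counts the two message types shown here; any other warnings in the log are ignored.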

There are many others in the various logs. Here is the collected info from the node that ran out of space while creating the primary index on the stats bucket.

The cluster is still working and it does not look like we have lost any data yet.

Any help would be appreciated!
Thanks!
Mark
