In versions prior to 7.1, auto-failover of the index service was not supported. But due to Data Service preference, the index service would still fail over (automatically). Can someone help me understand the difference between the “auto-failover of the index service in lower versions, which would take place due to Data Service preference” and the “auto-failover introduced starting with version 7.1”?
a. The cluster manager checks the index service health every few seconds, and if it is not healthy, the node will be automatically failed over.
b. Also, the index service tries to ensure index availability after auto-failover, e.g. if an index exists only on node A and has no replica, failing over node A will lead to complete unavailability of that index. Auto-failover is not allowed in such a case. However, the Data Service preference takes precedence over this.
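The two checks in a. and b. can be sketched roughly as follows. This is a minimal Python sketch; the names and data structures (`Index`, `Cluster`, `can_autofailover`, the `dsp` flag) are illustrative assumptions, not actual Couchbase internals:

```python
from dataclasses import dataclass

@dataclass
class Index:
    name: str
    nodes: set  # nodes holding this index or one of its replicas

@dataclass
class Cluster:
    indexes: list
    dsp: bool = False  # Data Service preference enabled?

def can_autofailover(node: str, healthy: bool, cluster: Cluster) -> bool:
    # (a.) Only an unhealthy node is a failover candidate.
    if healthy:
        return False
    # (b.) Refuse failover if some index exists only on this node
    # (no replica elsewhere) -- unless Data Service preference
    # takes precedence over the availability guard.
    for idx in cluster.indexes:
        if idx.nodes == {node}:
            return cluster.dsp
    return True
```

With DSP enabled, the availability guard in b. is bypassed, which is what allows the node to fail over even when an index would become unavailable.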
b. is clear, but in a. I cannot make out the difference between 6.6 with DSP and 7.1. Although the cluster manager does not check index service health in 6.6 with DSP, ultimately the index service fails over in both cases.
Also, regarding the example in b.: if the index service is down, the index is not accessible anyway, so not doing auto-failover does not make things any better. It does not ensure index availability either way. At least if auto-failover had happened, the cluster map would have been correctly updated and requests would not have hit the failed node, isn’t it?
Lower server versions, e.g. 6.6, had auto-failover of the index service disabled to avoid the possibility of false failovers due to CPU saturation. 7.1 has automatic CPU throttling built in for the indexer process once it exceeds a high threshold. This ensures that the cluster manager component running on the indexer node can still communicate with the master node and avoid a node failover.
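A threshold-based throttle like the one described could look roughly like this. This is an illustrative Python sketch only; the 90% threshold, the linear back-off, and the `throttle_factor` name are assumptions, not the indexer's actual policy:

```python
def throttle_factor(cpu_percent: float, high: float = 90.0) -> float:
    """Return the fraction of work the indexer should admit.

    Above the high threshold, scale work down so that other processes
    on the node (e.g. the cluster manager) keep getting CPU time and
    can answer the master node's health probes.
    """
    if cpu_percent <= high:
        return 1.0  # below the threshold: no throttling
    # Linearly back off between the threshold and full saturation,
    # never dropping below a small floor so indexing still progresses.
    return max(0.1, (100.0 - cpu_percent) / (100.0 - high))
```

The key point is only that throttling keeps the node responsive, so the cluster manager is reachable and no false failover is triggered.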
Also, 7.1 is more advanced in that the index service can signal that it is unhealthy, as opposed to the cluster manager merely being unable to reach the node. This allows more capabilities to be added in the future, e.g. if the indexer detects a bad disk because disk-write failures exceed a threshold, it can signal that it is unhealthy.
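The bad-disk example could be modeled as a sliding-window failure counter. This is a hypothetical Python sketch; the class name, window size, and threshold are assumptions for illustration:

```python
import time
from collections import deque

class DiskHealthMonitor:
    """Signal 'unhealthy' when disk-write failures within a sliding
    time window exceed a threshold (illustrative only)."""

    def __init__(self, max_failures: int = 5, window_secs: float = 60.0):
        self.max_failures = max_failures
        self.window_secs = window_secs
        self.failures = deque()  # timestamps of recent failed writes

    def record_write(self, ok: bool, now: float = None):
        now = time.monotonic() if now is None else now
        if not ok:
            self.failures.append(now)
        # Drop failures that have fallen out of the window.
        while self.failures and now - self.failures[0] > self.window_secs:
            self.failures.popleft()

    def unhealthy(self) -> bool:
        # This is the signal the indexer would report to the cluster
        # manager, rather than waiting to become unreachable.
        return len(self.failures) > self.max_failures
```

The difference from pre-7.1 is that the service reports its own condition instead of the cluster manager inferring it from lost connectivity.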
The issue with that is that index metadata is local to a node. If the last replica of an index is lost, the system cannot repair it on subsequent capacity addition and rebalance, because the metadata is lost.
For the index service, all index replicas are considered active, and query requests are automatically load-balanced across them. If a node is unhealthy, queries will automatically use the other replica. There is no dependency on a cluster-map update as such.
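That replica-aware balancing can be sketched as round-robin selection that skips unhealthy nodes. This is illustrative Python only, not the actual client logic; `ReplicaBalancer` and `is_healthy` are invented names:

```python
import itertools

class ReplicaBalancer:
    """Round-robin over index replicas, skipping unhealthy nodes."""

    def __init__(self, replicas):
        self.cycle = itertools.cycle(replicas)
        self.count = len(replicas)

    def pick(self, is_healthy):
        # Try each replica at most once per request; the first healthy
        # one wins, so no cluster-map update is needed to route around
        # a failed node.
        for _ in range(self.count):
            node = next(self.cycle)
            if is_healthy(node):
                return node
        raise RuntimeError("no healthy replica available")
```

Because every replica is active, routing around a dead node is a per-request decision rather than a cluster-wide state change.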