@rajib761
My index services are separate, but I have only one replica. I think to survive two nodes going down I will need to bump the index replica to two.
That is correct. The service guarantee is that if you have N replicas of an index, that index will survive the loss of N nodes. (Note that N = 3 is the maximum number supported.) You also need at least N + 1 Index nodes (master + N replicas), or not all N replicas will be created.
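To make the arithmetic concrete, here is a small Python sketch of the rule above. It is only a toy model, not Couchbase code: the function names are made up for illustration, and it assumes each copy of an index (master or replica) lands on a different Index node.

```python
MAX_REPLICAS = 3  # maximum number of replicas, as stated above


def copies_created(num_replica: int, index_nodes: int) -> int:
    """Copies of an index (master + replicas) that can actually be placed.

    Assumption: one copy per Index node, so the node count caps the copies.
    """
    if not 0 <= num_replica <= MAX_REPLICAS:
        raise ValueError("num_replica must be between 0 and 3")
    if index_nodes < 1:
        raise ValueError("at least one Index node is required")
    return min(num_replica + 1, index_nodes)


def tolerated_node_losses(num_replica: int, index_nodes: int) -> int:
    """Index nodes that can be lost while one copy of the index still exists."""
    return copies_created(num_replica, index_nodes) - 1


# Two replicas on three Index nodes: all three copies fit, two losses tolerated.
two_replicas_three_nodes = tolerated_node_losses(2, 3)
# Two replicas on only two Index nodes: one replica is never created.
two_replicas_two_nodes = tolerated_node_losses(2, 2)
```

In this model, asking for two replicas only buys you two-node fault tolerance if there are at least three Index nodes to hold the copies.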
I also did not understand why the auto-failover is a limitation for the index service. There is no concept of active and replica vbucket for index. Both main and replica indexes are read-write, so why would an auto-failover be required?
Autofailover will take the failing-over nodes out of the cluster, so it has to have a way of determining when it should do this. This is a trickier problem than it seems on its face. For example, if the node becomes unresponsive for a while, how long should Cluster Manager wait to hear from the node before it fails it over? If the wait is too short, it could trigger a failover of a node that was just experiencing a temporary spike in workload. This then makes the problem worse instead of better, because the remaining nodes need to handle the workload of the failed-over nodes in addition to the workloads they were already handling. That makes it more likely they become overloaded and unresponsive, thus triggering a cascade of auto-failovers.
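The cascade effect can be illustrated with a deliberately simplified Python sketch. This is not how Cluster Manager actually decides anything: the function name, the even load split, and the fixed per-node capacity are all assumptions made purely for illustration.

```python
def simulate_cascade(node_count: int, total_load: float,
                     capacity_per_node: float, initially_failed: int = 1) -> int:
    """Return how many nodes remain once failovers stop.

    Toy assumptions: load is spread evenly over the remaining nodes, and a
    node whose share exceeds its capacity becomes unresponsive and is
    failed over in turn.
    """
    remaining = node_count - initially_failed
    while remaining > 0:
        load_per_node = total_load / remaining
        if load_per_node <= capacity_per_node:
            break  # the survivors can absorb the extra work
        remaining -= 1  # another overloaded node is failed over
    return remaining


# Five nodes with headroom: one premature failover is absorbed.
with_headroom = simulate_cascade(5, total_load=400, capacity_per_node=100)
# Slightly more load: the same single failover takes down every node.
without_headroom = simulate_cascade(5, total_load=420, capacity_per_node=100)
```

In this model a small change in load is the difference between losing one node and losing all of them, which is why the timeout for declaring a node dead cannot simply be made short.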
So the Autofailover decision takes input from the individual service, and the Index service has not implemented this yet. We also want to put in place some shock absorbers that can absorb transient workload spikes without making the service unresponsive to the health checks from Cluster Manager that feed the decision whether to fail over automatically. Currently Autofailover is done at the node level instead of the service level. That is why, if KV and Index are on the same node and KV decides it needs to autofailover, the side effect is that Index is failed over too, as KV is considered the priority service in this situation. We have had some discussions on whether to change Autofailover to be done at the service level instead of the node level product-wide, but this will be a ways out if it happens. Conceptually this makes more sense, as one service may be unhealthy while another on the same node is not. Not all failures are node-level failures like loss of power to the node.
Are you hinting at auto-failover for index when there are no replicas for index?
No, the number of replicas has nothing to do with the decision to autofailover – it only impacts what indexes actually survive the failover. If no indexes have any replicas, losing any Index node will also lose all the indexes it contains, and they will need to be recreated manually via new “create index” statements. The reason for this index loss is that each Index node only has the metadata describing the indexes it hosts (whether the main one or a replica). It is possible we could enhance the metadata handling so that all Index nodes have the metadata for all indexes in the entire cluster in order to eliminate the need to manually recreate any indexes if at least one Index node survives. There have been some discussions about this but this would also be a ways out in the future if it happens.
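As a rough illustration of why indexes are lost, the following Python sketch models each Index node as knowing only about the indexes it hosts. The function name, node names, and index names are all hypothetical, and this is not Couchbase's actual metadata format.

```python
def surviving_indexes(placement: dict, failed_nodes: set) -> tuple:
    """Split indexes into (surviving, lost) after the given nodes fail.

    placement maps an Index node name to the set of index names it hosts
    (main copy or replica). An index survives only if some surviving node
    hosts a copy, because only hosting nodes hold its metadata.
    """
    all_indexes = set().union(*placement.values())
    surviving = set()
    for node, hosted in placement.items():
        if node not in failed_nodes:
            surviving |= hosted
    return surviving, all_indexes - surviving


# Hypothetical layout: idx_a has one replica, idx_b and idx_c have none.
placement = {
    "index-node-1": {"idx_a", "idx_b"},
    "index-node-2": {"idx_a"},
    "index-node-3": {"idx_c"},
}
kept, lost = surviving_indexes(placement, failed_nodes={"index-node-1"})
```

Here idx_a survives through its replica on index-node-2, while idx_b has no other copy and would have to be recreated by hand.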