Couchbase query fails when one data node goes down

Indeed, failover is different from rebalance. However, if you have a sufficient number of replicas, you should be back to full availability within seconds of failover. This is something we test all the time. You may be left with fewer replicas, though, so to restore redundancy you need to rebalance, or add a repaired/new node and then rebalance.

Yes, you can use getFromReplica if you are okay with the idea that, in transient failure situations, you may get an older copy of the data. See the discussion of this in the docs. I should also say that with N1QL, SDK 3.x and later will automatically retry a query if it is safe to do so. You don't indicate which SDK you're using, but all of the modern SDKs retry by default until the timeout is reached, and you can change that behavior to best effort if you see fit.
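
To make the replica-read trade-off concrete, here is a minimal sketch of the fallback pattern in Python. It deliberately does not call the Couchbase SDK: `read_active` and `read_replica` are hypothetical callables standing in for whatever your SDK provides (in recent Python SDKs I believe these would be `collection.get` and `collection.get_any_replica`, but check the docs for your version), and the exception types are placeholders for the SDK's own timeout and unavailability errors.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def read_with_replica_fallback(
    read_active: Callable[[], T],
    read_replica: Callable[[], T],
    transient_errors: Tuple[type, ...] = (TimeoutError, ConnectionError),
) -> Tuple[T, bool]:
    """Try the active copy first; on a transient error, read a replica.

    Returns (value, from_replica). The flag lets the caller know the
    value may be older than the latest write.
    """
    try:
        return read_active(), False
    except transient_errors:
        # The active copy is unreachable; a replica may lag behind it.
        return read_replica(), True


# Hypothetical stand-ins for real SDK calls, used only for illustration.
def _active_down() -> dict:
    raise TimeoutError("active node unavailable")


def _replica_copy() -> dict:
    return {"name": "alice", "version": 1}


value, from_replica = read_with_replica_fallback(_active_down, _replica_copy)
print(value, from_replica)
```

Returning the `from_replica` flag is a design choice: it keeps the possible staleness visible to the caller rather than hiding it, so code that cannot tolerate an older copy can decide to fail instead.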