Hello Everyone,
I am currently testing Enterprise features in Couchbase. The feature I am testing today is Rack Awareness.
I have 3 nodes and have deployed the travel-sample bucket. As shown below, the distribution of the data looks odd. It seems like Rack Awareness is not working.
Server Node Name | Group      | Services | Data/Disk Usage | Items (Active / Replica)
10.50.51.36      | us-east-1a | Data     | 22.5MB / 36.4MB | 10.4 K / 10.5 K
10.50.51.37      | us-east-1b | Data     | 22.7MB / 36.6MB | 10.5 K / 10.5 K
10.50.52.20      | us-east-1b | Data     | 22.7MB / 36.7MB | 10.5 K / 10.4 K
Since 2 nodes are in the same group, shouldn't all the replicas of the data in us-east-1b go to group us-east-1a? I was under the impression that in the above scenario, I should have more data on the node in group us-east-1a.
Here is the number of items in the travel-sample bucket:
SELECT count(*) FROM `travel-sample`;
[
{
"$1": 31591
}
]
I simulated an outage and failed over the 2 nodes in group us-east-1b.
Here is the number of items in the bucket after the failover:
[
{
"$1": 21090
}
]
Data was lost.
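To put a number on the loss, here is a quick Python sketch using only the counts from the two queries and the rounded item figures from the node table above. The variable names are mine, and the "surviving estimate" is only an approximation because the UI rounds to the nearest 0.1 K.

```python
# Counts taken from the two N1QL results in this post.
total_items = 31591           # before the failover
items_after_failover = 21090  # after failing over both us-east-1b nodes

lost_items = total_items - items_after_failover
print(lost_items)  # 10501

# Rounded UI figures for 10.50.51.36 (the only node left):
# 10.4 K active + 10.5 K replica items.
surviving_estimate = 10_400 + 10_500
print(surviving_estimate)  # 20900 (approximate, from rounded figures)
```

The lost items are roughly one node's worth of active data, and what is left is close to the active plus replica items held by the us-east-1a node. That suggests some of the replicas for us-east-1b's active data were stored inside us-east-1b itself.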
So I decided to add another node in group us-east-1a so that I have the same number of nodes in each group. In this case, data is properly distributed.
I then shut down both nodes in us-east-1a, and no data was lost, which is a good thing.
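Here is how I currently reason about the two layouts, as a rough sketch. It is a toy capacity model, not Couchbase's actual replica placement algorithm. It assumes one replica per bucket and that active and replica data are spread evenly across nodes (which is what the node table shows). The function name is hypothetical.

```python
from fractions import Fraction

def min_same_group_replica_fraction(group_sizes):
    """Lower bound on the fraction of data whose single replica must
    sit in the same group as its active copy.

    Toy model (my assumption): each of the n nodes holds 1/n of the
    active data and 1/n of the replica data.
    """
    n = sum(group_sizes)
    forced = Fraction(0)
    for size in group_sizes:
        # The group owns size/n of the active data, but the other
        # groups only have room for (n - size)/n of the replicas.
        forced += max(Fraction(0), Fraction(2 * size - n, n))
    return forced

# 3 nodes: 1 in us-east-1a, 2 in us-east-1b
print(min_same_group_replica_fraction([1, 2]))  # 1/3
# 4 nodes: 2 in each group
print(min_same_group_replica_fraction([2, 2]))  # 0

# Items at risk in the 3-node layout if us-east-1b is lost
print(round(31591 * min_same_group_replica_fraction([1, 2])))  # 10530
```

Under these assumptions, about a third of the data cannot have its replica in the other group in the 1 + 2 layout, which is close to the drop I saw from 31591 to 21090 items. In the 2 + 2 layout the bound is zero, which matches the failover with no loss.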
Questions:
Does Rack Awareness require an even number of nodes in total?
Or does Rack Awareness require the same number of nodes per group?
Or both of the above?
If the answer to any of the above questions is yes, is it mentioned in the Couchbase documentation? If so, can you please share the link? I would very much like to help contribute to the Couchbase documentation. Is mentioning this in the forum enough, or do I need to go somewhere else?
I got the following warning when I clicked the Fail Over button:
“Attention – There are not replica (backup) copies of all data on this node! Failing over the node now will irrecoverably lose that data when the incomplete replica is activated and this node is removed from the cluster. If the node might come back online, it is recommended to wait. Check this box if you want to failover the node, despite the resulting data loss”
But in my example above with an equal number of nodes per group, no data loss happened. I still had all the rows in my bucket:
SELECT count(*) FROM `travel-sample`;
[
{
"$1": 31591
}
]
Does the failover warning take into account a Rack Aware failover scenario where no data loss is possible?
I am asking because if I am in the middle of maintenance in production, everything was done correctly, and I still receive this warning, the paranoia will start to kick in.
Thanks,
Steeve