Which part of the data is stored on which node?
I've set up a cluster.
It's working, and right now there are no problems.
I'm just curious: when Couchbase performs auto-sharding,
which part of my bucket goes to which node?
Do the replicas hold all of the data?
For example: I have 3 nodes, rebalanced successfully, and 2 replicas.
As I understand it, the replicas hold a copy of the whole dataset,
while the nodes use auto-sharding and each store some part of the data.
Am I right about that? Or am I looking at this completely the wrong way?
You can read about how sharding is done here.
A deeper dive can be found in this white paper.
At a high level, every Couchbase bucket is split into 1024 partitions, called vBuckets (virtual buckets), per copy of the data.
So you have 1024 partitions for the active dataset and another 1024 for each replica copy; with 2 replicas, that is 2048 replica vBuckets.
These are uniformly distributed across all the nodes in the cluster so that there are no hotspots.
If you have 3 nodes, every node holds 1/3 of the active data and 1/3 of each replica copy, so each replica copy is itself sharded across the nodes rather than stored whole on one node.
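To make the key-to-node lookup more concrete, here is a minimal Python sketch of the idea. It is an illustration under assumptions, not Couchbase's actual implementation: the SDKs hash the document key with CRC32 to pick a vBucket, but the exact bit manipulation in `vbucket_id` is my approximation, and `build_vbucket_map` is a hypothetical round-robin layout. In a real cluster the vBucket map is computed by the cluster manager and handed to clients, so the actual placement will differ.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # partitions per copy of the data


def vbucket_id(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a document key to a vBucket ID with a CRC32-based hash.

    Assumption: the shift/mask below approximates what the SDKs do;
    treat it as illustrative only.
    """
    crc = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


def build_vbucket_map(nodes, num_replicas, num_vbuckets=NUM_VBUCKETS):
    """Hypothetical round-robin vBucket map.

    Entry i is a list [active_node, replica_1, ..., replica_n] for
    vBucket i. A real cluster computes this map itself.
    """
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas")
    return [
        [nodes[(vb + i) % len(nodes)] for i in range(num_replicas + 1)]
        for vb in range(num_vbuckets)
    ]


def locate(key, vb_map):
    """Return (vbucket, active_node, replica_nodes) for a key."""
    vb = vbucket_id(key, len(vb_map))
    chain = vb_map[vb]
    return vb, chain[0], chain[1:]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    vb_map = build_vbucket_map(nodes, num_replicas=2)

    vb, active, replicas = locate("user::1001", vb_map)
    print(f"vBucket {vb}: active on {active}, replicas on {replicas}")

    # How many active vBuckets does each node own?
    print(Counter(chain[0] for chain in vb_map))
```

With 3 nodes and 2 replicas, this toy layout gives each node roughly a third of the active vBuckets (342, 341 and 341 of the 1024), and every vBucket has its three copies on three different nodes. That also means that in this particular setup each node ends up holding a copy of every document, either as active or as replica data.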