How replication and vbuckets work
Hi,
It is unclear to me how replication actually works.
Let's say that I have 4 servers: A1, A2, B1, B2, each with (for example) 1 GB. I want to have A1 and A2 in one vbucket (so the total usable size will be 2 GB). Both servers will be replicated to B (B1 and B2), so I have 1 replica of 2 servers.
How do I do this?
Right now, in the settings, I set up the first node (A1) with 1 replica. In the next steps I add servers A2, B1 and B2. Is that OK? What does "Join cluster" actually mean? How do I determine how the data is distributed across the servers? Or in what order must I add servers A2, B1 and B2 to achieve this?
Thanks.
It may make sense for you guys to read about how we've implemented vbuckets: http://wiki.membase.org/display/membase/vBuckets and http://dustin.github.com/2010/06/29/memcached-vbuckets.html.
Be sure that you're not confusing "vbuckets" with "buckets" (yes, I know it's confusing...we apologize). "Buckets" are simply a logical keyspace/database concept that we use to apply different configurations and access methods to different datasets. Each "bucket" is split into 1024 "vbuckets" which is what we then distribute evenly across the cluster (this is the "balancing" or "rebalancing" concept). Additionally, each "vbucket" is replicated as per the configuration and each replica is also distributed evenly across the cluster so that no replica "vbucket" is on the same node as its active "vbucket".
In the current release, each "bucket" is spread evenly across all nodes in the cluster (when you create it initially and whenever you go through a rebalance).
There is also no concept of a "master" server or "replica/slave" server. Every server is active for some slice(s) of the dataset and holds replica data for other slice(s).
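To make the key -> vbucket -> server idea concrete, here is a minimal Python sketch. It is not the actual Membase code: the CRC32-modulo hash, the round-robin placement, and the names build_map and vbucket_for_key are assumptions made up for illustration. Only the 1024 vbuckets per bucket and the rule that a replica never sits on the same node as its active vbucket come from the description above.

```python
import zlib

NUM_VBUCKETS = 1024                # each bucket is split into 1024 vbuckets
NODES = ["A1", "A2", "B1", "B2"]   # the four servers from the question


def build_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Return one (active_node, replica_node) pair per vbucket.

    Round-robin placement is an illustrative assumption, not the
    real placement algorithm.
    """
    n = len(nodes)
    vmap = []
    for vb in range(num_vbuckets):
        active = nodes[vb % n]
        # With a single node there is nowhere to put a replica.
        replica = nodes[(vb + 1) % n] if n > 1 else None
        vmap.append((active, replica))
    return vmap


def vbucket_for_key(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a vbucket id (hash choice is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


vbucket_map = build_map(NODES)
vb = vbucket_for_key("user:42")
active, replica = vbucket_map[vb]
print(f"key user:42 -> vbucket {vb}: active on {active}, replica on {replica}")
```

In this sketch every node ends up active for 256 vbuckets and holds replicas for another 256, which is the "no master, no slave" picture: each server is active for some slices and a replica for others.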
To answer your question specifically:
-Starting with server A1, you create a "bucket" of 1GB. At this point, even if you have configured replicas, none get created because you don't have enough nodes in the cluster.
-Add servers A2, B1 and B2 to the cluster. They go onto the "pending rebalance" list. This is a list that we maintain to allow you to add and/or remove multiple nodes at once without shuffling the data around each time. When the nodes initially join the cluster, nothing has actually happened other than syncing the current configuration.
-Press the "rebalance" button. Now, the Membase software will begin to move slices of data (vbuckets) from server A1 to servers A2, B1 and B2 until they are all responsible for equal portions of the dataset. Once the rebalance is complete, the software determines which (if any) replicas need to be recreated. In your case, they all need to be created so you'll see some extra activity after rebalancing has completed (depending on how much data you have).
-You now have 1 bucket spread across 4 servers with 1 replica. The bucket is actually 4GB in size (since each node has allocated 1GB to this bucket) and you essentially have 2GB to work with (since half of the space will be used for replicas).
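The sizing arithmetic in the steps above can be sketched as a small Python function. This is a simplification under stated assumptions: the name bucket_capacity_gb is hypothetical, and dividing the total by (1 + replicas) simply generalizes the "half of the space is used for replicas" reasoning. Capping replicas at (nodes - 1) is likewise an assumption, generalized from the single-node case where no replicas can be created.

```python
def bucket_capacity_gb(per_node_quota_gb, num_nodes, num_replicas):
    """Return (total_gb, usable_gb) for one bucket.

    Hypothetical helper: assumes every node allocates the same quota
    and that each replica copy costs as much space as the active data.
    """
    total = per_node_quota_gb * num_nodes
    # Replicas need other nodes to live on, so cap them at nodes - 1.
    effective_replicas = min(num_replicas, num_nodes - 1)
    usable = total / (1 + effective_replicas)
    return total, usable


# Only A1 in the cluster: 1 replica is configured but none can exist yet.
print(bucket_capacity_gb(1, 1, 1))   # (1, 1.0)

# After adding A2, B1, B2 and rebalancing: 4GB total, 2GB to work with.
print(bucket_capacity_gb(1, 4, 1))   # (4, 2.0)
```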
That's the beginnings of things ;-) Let me know if you have any other questions.
Perry
Hi Perry,
thank you for the comprehensive answer. If I understand correctly, the administrator has no control over the physical placement of data, but the system is flexible about adding/removing servers. And when I set 1 replica, any one server can fail and no data will be lost. If I set 1 replica and 2 servers fail, some data will very probably be lost. Am I right?
Thanks.
Sorry for the late reply...
Yes, you are correct. Keep in mind that the data may not actually be "lost"...it is still available in the database files, but you'd have to recover it manually rather than just being able to fail over to replicas and continue accessing the data.
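A rough way to see why one failure is safe with 1 replica while two failures may not be: count the vbuckets whose active and replica copies both sit on failed nodes. The Python sketch below uses a made-up layout (active on one node, replica on the next one in the list); the real placement will differ, so the exact numbers are only illustrative. The name unavailable_vbuckets is hypothetical.

```python
from itertools import combinations

NODES = ["A1", "A2", "B1", "B2"]
NUM_VBUCKETS = 1024

# Assumed layout: active copy on node i, replica on the next node.
layout = [
    (NODES[vb % len(NODES)], NODES[(vb + 1) % len(NODES)])
    for vb in range(NUM_VBUCKETS)
]


def unavailable_vbuckets(layout, failed):
    """Return ids of vbuckets whose active AND replica nodes both failed."""
    failed = set(failed)
    return [
        vb for vb, (active, replica) in enumerate(layout)
        if active in failed and replica in failed
    ]


# Try every single-node and two-node failure.
for k in (1, 2):
    for failed in combinations(NODES, k):
        count = len(unavailable_vbuckets(layout, failed))
        print(f"failed={failed}: {count} vbuckets without a live copy")
```

With this assumed layout, any single failure leaves every vbucket with a live copy, while some (not all) two-node failures leave a quarter of the vbuckets reachable only through manual recovery from the database files.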
Perry
I have the same doubt, any answer?
Grover Campos Ancajima