According to a more technical explanation of how vBuckets work, each vBucket is in one of the following states: Active, Pending, Replica, or Dead. (Source: http://dustin.github.com/2010/06/29/memcached-vbuckets.html)
Quoting the document, "A replica vbucket is similar to a dead vbucket in that from a normal client’s perspective. That is, all requests are refused, but replication commands are allowed."
Let's say we have a simple two-node cluster with one replica copy of each vBucket. We can assume that each vBucket is Active on one server and Replica on the other (once the data in the bucket is fully replicated). The description above makes me believe that the Replica copy is never used, even for reading. Say you are connected to server B, which holds a Replica copy of the vBucket containing the key you're trying to read; the request will always go to server A, which holds the Active vBucket, correct?
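To make my understanding concrete, here is a minimal sketch of the two-node scenario as I read it from the linked document. The hash function (`zlib.crc32`), the tiny vBucket count, the server names, and the return strings are all my own assumptions for illustration, not Couchbase's actual implementation.

```python
import zlib
from enum import Enum


class VBucketState(Enum):
    ACTIVE = "active"
    REPLICA = "replica"
    PENDING = "pending"
    DEAD = "dead"


NUM_VBUCKETS = 4  # deliberately tiny; a real cluster uses far more


def vbucket_for(key: str) -> int:
    """Map a key to a vBucket id (illustrative hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


# Two nodes, one replica: every vBucket is ACTIVE on one server
# and REPLICA on the other.
STATES = {
    "A": {vb: VBucketState.ACTIVE if vb % 2 == 0 else VBucketState.REPLICA
          for vb in range(NUM_VBUCKETS)},
    "B": {vb: VBucketState.REPLICA if vb % 2 == 0 else VBucketState.ACTIVE
          for vb in range(NUM_VBUCKETS)},
}


def active_server(vb: int) -> str:
    """Return the name of the server holding the ACTIVE copy of a vBucket."""
    for server, states in STATES.items():
        if states[vb] is VBucketState.ACTIVE:
            return server
    raise LookupError(f"no active copy of vBucket {vb}")


def handle_get(server: str, key: str) -> str:
    """How a server answers a normal client GET, as I read the document."""
    state = STATES[server][vbucket_for(key)]
    if state is VBucketState.ACTIVE:
        return "served"
    if state is VBucketState.PENDING:
        return "blocked"
    # REPLICA and DEAD both refuse normal client requests.
    return "not my vbucket"


if __name__ == "__main__":
    for key in ("user:1", "user:2", "session:abc"):
        vb = vbucket_for(key)
        print(key, "-> vBucket", vb, "-> active on", active_server(vb))
```

If this model is right, a GET sent to the server that only holds the Replica copy is refused, and the client has to go to the other server.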
Is it possible to control this behavior so that a Replica copy could also serve read-only requests for key values (which, I assume, could be stale because of replication lag)? This could be defined at the bucket level (just like the number of replicas).
I see some benefits in this:
1) Automatic "failover" of key reads during the window before the cluster learns that a node has gone offline. Writes would still block until the node has actually been failed over.
2) Reduced network traffic between Couchbase nodes, since reads could be satisfied from a Replica on the server you're connecting to (assuming that server holds a Replica copy of the vBucket you need).
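Roughly, the read path I have in mind for benefit 1 looks like the sketch below. Everything in it (`Node`, `NodeDown`, `ReadResult`, `read_with_replica_fallback`) is a hypothetical name I made up to illustrate the idea; it is not an existing Couchbase API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class NodeDown(Exception):
    """Raised when a node cannot be reached (hypothetical)."""


@dataclass
class Node:
    name: str
    data: Dict[str, str]
    online: bool = True

    def get(self, key: str) -> Optional[str]:
        if not self.online:
            raise NodeDown(self.name)
        return self.data.get(key)


@dataclass
class ReadResult:
    value: Optional[str]
    source: str
    maybe_stale: bool  # True when the value came from a Replica copy


def read_with_replica_fallback(key: str, active: Node, replica: Node) -> ReadResult:
    """Try the Active copy first; fall back to the Replica if it is unreachable."""
    try:
        return ReadResult(active.get(key), active.name, maybe_stale=False)
    except NodeDown:
        # The Replica may lag behind the Active copy, so flag the result.
        return ReadResult(replica.get(key), replica.name, maybe_stale=True)


if __name__ == "__main__":
    a = Node("A", {"user:1": "v2"})
    b = Node("B", {"user:1": "v1"})  # replica lagging one write behind
    print(read_with_replica_fallback("user:1", a, b))
    a.online = False  # node A drops off before failover happens
    print(read_with_replica_fallback("user:1", a, b))
```

The `maybe_stale` flag is there so the application can decide for itself whether a possibly outdated value is acceptable.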
A slightly improved vBucket design could provide "floating-master" (Active) vBuckets, which would migrate between servers if writes were to block on the node holding the currently Active vBucket (because the node was taken offline, for example). When the node re-joined the cluster, its vBuckets would move from the Pending state into the Replica state (to copy the updated data from the cluster), and then eventually, as with a rebalance operation, the Active vBucket states and data could be redistributed across the nodes.
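As a rough sketch of the state transitions I am proposing for a single vBucket (the function names and the choice of which replica gets promoted are my own assumptions, and this is not how Couchbase currently behaves as far as I know):

```python
from enum import Enum
from typing import Dict


class State(Enum):
    ACTIVE = "active"
    REPLICA = "replica"
    PENDING = "pending"
    DEAD = "dead"


def node_offline(states: Dict[str, State], failed: str) -> Dict[str, State]:
    """Mark the failed node's copy Dead; if it was Active, promote a Replica."""
    new = dict(states)
    new[failed] = State.DEAD
    if states[failed] is State.ACTIVE:
        for node, state in states.items():
            if node != failed and state is State.REPLICA:
                new[node] = State.ACTIVE  # the "floating master" moves here
                break
    return new


def node_rejoined(states: Dict[str, State], node: str) -> Dict[str, State]:
    """A returning node starts out Pending while it is brought up to date."""
    new = dict(states)
    new[node] = State.PENDING
    return new


def catch_up_done(states: Dict[str, State], node: str) -> Dict[str, State]:
    """Once the data has been copied over, the Pending copy becomes a Replica."""
    new = dict(states)
    new[node] = State.REPLICA
    return new


if __name__ == "__main__":
    vb = {"A": State.ACTIVE, "B": State.REPLICA}
    vb = node_offline(vb, "A")
    print(vb)
    vb = node_rejoined(vb, "A")
    vb = catch_up_done(vb, "A")
    print(vb)
```

A later rebalance step, not shown here, would then redistribute the Active copies across the nodes.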
Is this something you're researching, or has the architecture already changed significantly since the original vBuckets document was written almost 2 years ago? How relevant is that document today with CB 2.0?