Durability / Consistency Enhancement
It would be nice to be able to tune the durability of the system, essentially by adding some of the features of other clustered systems that support quorum reads and writes.
I like the approach a certain Apache product takes: on each read or write you specify how consistent you want the data to be. Something like:
set(Key, Value, Expiration, Consistency, Durable)
Where Consistency is one of the following values:
ONE : Any replica has received the data. (Weaker consistency than today)
MASTER_ONLY : Master node has received the data. (Same as today)
MAJORITY / QUORUM : Majority of replicas have received the change. (Better consistency, slower)
ALL : All replicas have received the change. (Slowest)
Where Durable is either:
TRUE: Nodes report success only once the data is committed to disk.
FALSE: Nodes report immediately upon receipt of the data. (Memory write is ok.)
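To make the proposed write semantics concrete, here is a minimal Python sketch of how a client might decide that a set() has been acknowledged. Everything in it is hypothetical: the names `Consistency`, `required_acks` and `write_acknowledged` are made up for illustration and are not part of any Couchbase API. It also assumes the master counts as one of the copies when computing a majority.

```python
from enum import Enum


class Consistency(Enum):
    ONE = "one"
    MASTER_ONLY = "master_only"
    QUORUM = "quorum"
    ALL = "all"


def required_acks(level, copies):
    """How many copies (master plus replicas) must acknowledge a write."""
    if copies < 1:
        raise ValueError("a key needs at least one copy")
    if level in (Consistency.ONE, Consistency.MASTER_ONLY):
        return 1
    if level is Consistency.QUORUM:
        return copies // 2 + 1  # strict majority
    return copies  # Consistency.ALL


def write_acknowledged(level, durable, acks, copies, master="master"):
    """Decide whether a write may be reported as successful.

    acks maps a node name to True if that node has persisted the data
    to disk, or False if it only holds the data in memory.
    """
    # With Durable=TRUE only on-disk acknowledgements count.
    counted = {node for node, on_disk in acks.items() if on_disk or not durable}
    if level is Consistency.MASTER_ONLY:
        return master in counted
    return len(counted) >= required_acks(level, copies)
```

For example, with three copies and acknowledgements `{"master": True, "r1": False}`, a QUORUM write succeeds with Durable=FALSE but not yet with Durable=TRUE, because only one node has persisted the data.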
Similarly for reads:
get(Key, Consistency):
ONE: Any replica responds. (Weaker consistency, more HA)
MASTER_ONLY: Master node responds with data. (Same as today)
QUORUM: Majority of replicas respond with consistent values. (More consistent, possibly better HA since master node can be down)
ALL: All replicas must respond with consistent values.
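The read side could be sketched along the same lines. This is only an illustration of the idea, not an existing API: `resolve_read` and `ReadUnavailable` are invented names, the master is assumed to count as one of the copies, and values are assumed to be hashable so they can be compared and counted.

```python
from collections import Counter
from enum import Enum


class Consistency(Enum):
    ONE = "one"
    MASTER_ONLY = "master_only"
    QUORUM = "quorum"
    ALL = "all"


class ReadUnavailable(Exception):
    """Raised when the requested consistency level cannot be met."""


def resolve_read(level, responses, copies, master="master"):
    """Pick the value to return for a get().

    responses maps node name -> value for every node that replied.
    copies is the total number of copies (master plus replicas).
    """
    if level is Consistency.MASTER_ONLY:
        if master not in responses:
            raise ReadUnavailable("master did not respond")
        return responses[master]
    if not responses:
        raise ReadUnavailable("no copy responded")
    if level is Consistency.ONE:
        # Any responder will do; take the first one that replied.
        return next(iter(responses.values()))
    needed = copies // 2 + 1 if level is Consistency.QUORUM else copies
    value, votes = Counter(responses.values()).most_common(1)[0]
    if votes < needed:
        raise ReadUnavailable("only %d of %d copies agree" % (votes, needed))
    return value
```

With three copies, `resolve_read(Consistency.QUORUM, {"r1": "v2", "r2": "v2"}, 3)` still returns a value even though the master did not reply, which is the availability benefit described for QUORUM reads.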
I have read what little information there is about the planned "observe" command. Would it be possible to build this framework on top of that?
This would allow a client to write/read data in a fashion that matches its requirements for the various pieces of data.
If easier, this could be set on the bucket itself instead, at the expense of not being able to store data in the same bucket using different schemes.
My main issues are:
1) Lack of ability to reach data if a "master" node goes down. This means that if any node in the system fails, then some part of the data set is not available. I'm used to systems that "heal" by re-electing a new master, etc. Quorum Reads / Writes are obviously a different approach - where there is no real master to the system.
2) Rebalance operations seem very expensive in your system. Adding capacity seems like it would be best done during a lull, but the time you are most likely to want to spin up additional resources is when the system is at its busiest.
3) Consistent reads/writes are very good. I like that you are focused on the C part of CAP. However, availability of data is also very important. It seems like Couchbase is focused on CP, and I'm more interested in CA since partitions are generally rare compared to node failure. (At least this is my understanding).
Inconsistent reads would also help satisfy the A part, but there is still an issue with writes when the master node is down.
It is good to hear that something like OBSERVE is coming, but I am having trouble finding out when it will arrive. Do you have more information about that?
And I am interested in the Java client library.
Observe is a Couchbase 2.0 feature and the design has almost been finalized. We will post the design to our public wiki when it is completed.
Something exactly like that is coming with the OBSERVE command, with values for both replication and persistence. The only difference is that Couchbase Server will always be consistent and won't require quorum reads. As a system, it's designed for consistency, which we believe makes for a much simpler programming model.
That said, we're likely to also introduce client support for inconsistent reads. The thought there is that in a failure scenario, a client library can automatically read from a replica.
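As a rough illustration of what such a fallback could look like from the client side, here is a small Python sketch. It is not the planned design or any existing client API: `get_with_fallback`, `read_master` and `read_replicas` are hypothetical names, and a node failure is assumed to surface as a `ConnectionError`.

```python
def get_with_fallback(key, read_master, read_replicas):
    """Read from the master, falling back to replicas if it is unreachable.

    read_master and each entry of read_replicas are callables taking a key
    and either returning the value or raising ConnectionError.
    Returns (value, from_replica); from_replica=True means the value may
    be stale, so the caller can decide whether that is acceptable.
    """
    try:
        return read_master(key), False
    except ConnectionError:
        pass  # master is down: try the replicas in order
    for read_replica in read_replicas:
        try:
            return read_replica(key), True
        except ConnectionError:
            continue
    raise ConnectionError("no copy of %r is reachable" % (key,))
```

Returning the `from_replica` flag keeps the default model consistent while letting the application opt in to possibly stale data only in a failure scenario.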
We would love your feedback on this. Also, what client library would you be most interested in?