
The existing .NET Client Library mutate methods will be overloaded to include two new options. The first option specifies the number of replicas to which a key must be written before the operation is considered a success. The second option specifies whether a write must be persisted to be considered successful.
{code:java}
// result.Success is true only if the key is successfully replicated to at least 2 nodes.
var result = client.ExecuteStore(StoreMode.Set, "foo", "{ \"message\" : \"bar\" }", ReplicateTo.Two);
Assert.That(result.Success, Is.True);

// The key is modified by another actor before replication is confirmed.
var modified = client.ExecuteStore(StoreMode.Set, "foo", "{ \"message\" : \"bar\" }", ReplicateTo.Two);
client.ExecuteStore(StoreMode.Set, "foo", "{ \"message\" : \"not bar\" }"); // someone changes it
Assert.That(modified.Success, Is.False);
Assert.That(modified.Message, Is.StringMatching("Modified"));

// persisted.Success is true only if the key is successfully persisted to the master and at least 1 slave.
var persisted = client.ExecuteStore(StoreMode.Set, "foo", "{ \"message\" : \"bar\" }", PersistTo.MasterSlave);
Assert.That(persisted.Success, Is.True);

// Combining the two durability checks.
var combined = client.ExecuteStore(StoreMode.Set, "foo", "{ \"message\" : \"bar\" }", PersistTo.Master, ReplicateTo.Two);
{code}
In the examples above, {{ReplicateTo}} and {{PersistTo}} are enums with values from 1 to 4.
The original mutate operations defined in the MemcachedClient returned only Boolean values. Because of the need to report a 'modified' result, only the ExecuteXXX mutate methods will be modified.
h2. PHP
h2. Ruby
The {{observe}} method will return a false value on timeout and a true value on success.
{code:none}
# observe single key and wait for 5 seconds
# until it will be replicated to at least one replica
observe("foo", :cas => 6635827497922002944, :ttl => 5, :replicas => 1)
# returns true or false
# observe multiple keys and wait for 5 seconds
# until they have been replicated to at least one replica
observe({"foo" => 6635827497922002944, "bar" => 16213820143098331136},
:ttl => 5, :replicas => 1)
# returns {"foo" => true, "bar" => true}
{code}
As an alternative API, all mutators (i.e. set/add/replace/append/prepend) should accept two new options:
* {{:num_replicas}} --- the number of replicas that must receive the mutation for it to be considered successful. Because the client knows how many replicas are configured, it could also raise an {{ArgumentError}} exception if this option is negative or exceeds that number.
* {{:timeout}} --- the timeout for replication. In this case it would be better to raise an exception than to return a falsy value.
{code:none}
set("foo", "bar", :num_replicas => 2, :timeout => 5)
{code}
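The argument check described for {{:num_replicas}} could look roughly like the sketch below. The helper name {{validate_observe_options}} and the {{configured_replicas}} parameter are hypothetical and not part of any existing client; the replica count is assumed to come from the cluster configuration.

```ruby
# Hypothetical helper (not an existing client API): validates the proposed
# :num_replicas and :timeout options before a mutator is sent.
# configured_replicas is assumed to be known from the cluster configuration.
def validate_observe_options(options, configured_replicas)
  replicas = options.fetch(:num_replicas, 0)
  timeout  = options.fetch(:timeout, 0)

  # Reject negative or excessive replica counts, as suggested above.
  unless replicas.is_a?(Integer) && replicas.between?(0, configured_replicas)
    raise ArgumentError,
          "num_replicas must be between 0 and #{configured_replicas}, got #{replicas.inspect}"
  end

  # The timeout is assumed to be expressed in seconds.
  unless timeout.is_a?(Numeric) && timeout >= 0
    raise ArgumentError, "timeout must be a non-negative number, got #{timeout.inspect}"
  end

  { :num_replicas => replicas, :timeout => timeout }
end
```

A call such as {{validate_observe_options({:num_replicas => 2, :timeout => 5}, 3)}} returns the normalized options, while asking for more replicas than are configured raises {{ArgumentError}}.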
Better still, these options could be combined into a single {{:observe}} option. This makes it clearer that an additional operation is being applied.
{code:none}
set("foo", "bar", :observe => {:replicas => 2, :timeout => 5})
{code}
Because the code above is written in a synchronous fashion, it is acceptable to wait *before* the method returns its value. In asynchronous mode, however, the call can return immediately.
{code:none}
conn.run do |c|
c.set("foo", "bar", :observe => {:replicas => 1, :persisted => 2})
c.append("baz", "bar", :observe => {:persisted => 4})
c.observe_and_wait
end
{code}
h2. C
For the C client (libcouchbase), it is better to implement only a *low-level* operation because of the library's asynchronous nature. This means there should be a {{libcouchbase_observe()}} function that accepts the key and reports its status on all replicas to the corresponding callback.
h1. Recommended Implementation
{note}These recommendations are preliminary, and have not been reviewed.{note}
Given that the item being mutated may be changed at any time by another actor in a deployment, the key idea here is for the client to use the CAS value returned from the mutation operation. Since the client library knows at all times (though, asynchronously) which node is responsible for the vbucket for a given key, and knows which nodes are slaves for that key, the client library can use the {{OBSERVE}} command as a method of determining what has happened with a key on this given node.
h3. Implementation Approach
If, for example, application code wanted to check for "foo" with CAS value _12345_, it would use the {{OBSERVE}} command against whichever nodes it needs to in order to report the status. This would be done in a loop, with some reasonable backoff (possibly guided by recommended polling times from server stats), until a reasonable or user-specified timeout is reached.
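One possible shape for the backoff is a capped exponential schedule that never exceeds the overall timeout. The sketch below is only an illustration: the function name and every default value are assumptions, and it does not model polling hints from server stats.

```ruby
# Hypothetical helper: returns the list of polling delays (in milliseconds)
# a client could sleep between OBSERVE attempts. Delays grow by `factor`,
# are capped at `max_ms`, and together never exceed `timeout_ms`.
# All default values are illustrative assumptions.
def backoff_delays_ms(timeout_ms, initial_ms = 10, factor = 2, max_ms = 500)
  raise ArgumentError, "initial_ms and max_ms must be positive" if initial_ms <= 0 || max_ms <= 0
  raise ArgumentError, "factor must be at least 1" if factor < 1

  delays  = []
  elapsed = 0
  delay   = initial_ms
  while elapsed < timeout_ms
    # Never sleep past the cap or past the remaining timeout budget.
    step = [delay, max_ms, timeout_ms - elapsed].min
    delays << step
    elapsed += step
    delay *= factor
  end
  delays
end
```

With a 100 ms timeout and the defaults above, the schedule is 10, 20, 40, and then a final 30 ms that uses up the remaining budget.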
Each iteration of the loop would consist of the following steps.
First the client would check to ensure that "foo" _12345_ is still the current value on the master for this vbucket via {{OBSERVE}}. If it is not the current value, it simply returns saying the item is modified.
If it is still the current value, the next step for the client depends on whether the application code is simply checking for persistence or index processing. This can be evaluated from the {{OBSERVE}} response. If the application code is checking for replication or whether or not the modification has been persisted on multiple nodes, it can then proceed to check any slaves, as identified by the cluster configuration, for status of persistence or replication of "foo" with CAS _12345_. If successful, it returns that response to the application code. If it is not successful, then it waits an interval and loops again, blocking the application code until status is determined or a timeout value has been reached.
Thus the possible set of return values would be either OBS_SUCCESS, OBS_MODIFIED, or OBS_TIMEDOUT.
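The loop described above can be sketched as follows. This is not a real client implementation: the nodes are modelled as hypothetical callables that return either {{nil}} or a hash with {{:cas}} and {{:persisted}} entries, standing in for an {{OBSERVE}} round trip, and the three results are represented as the symbols {{:obs_success}}, {{:obs_modified}}, and {{:obs_timedout}}. A fixed polling interval is used for brevity.

```ruby
# Sketch of the observe loop under stated assumptions:
#  - master and each entry of replicas respond to #call and return nil or
#    {:cas => Integer, :persisted => true/false} (a stand-in for OBSERVE)
#  - option names (:replicate_to, :persist_to, :timeout, :interval) are hypothetical
def observe_until(master, replicas, cas, options = {})
  replicate_to = options.fetch(:replicate_to, 0)
  persist_to   = options.fetch(:persist_to, 0)
  timeout      = options.fetch(:timeout, 5.0)    # seconds
  interval     = options.fetch(:interval, 0.01)  # seconds between polls
  deadline     = Time.now + timeout

  loop do
    # Step 1: the master must still hold the CAS returned by the mutation.
    current = master.call
    return :obs_modified if current.nil? || current[:cas] != cas

    # Step 2: count replicas holding this CAS and nodes that have persisted it.
    matching  = replicas.map(&:call).select { |status| status && status[:cas] == cas }
    persisted = ([current] + matching).count { |status| status[:persisted] }
    return :obs_success if matching.size >= replicate_to && persisted >= persist_to

    # Step 3: wait and loop again, unless that would exceed the timeout.
    return :obs_timedout if Time.now + interval > deadline
    sleep(interval)
  end
end
```

Checking the master first on every iteration means a concurrent change is reported as modified even if the replicas have not yet caught up with either version.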
Note that the status OBS_MODIFIED does not indicate monotonic forward mutation. For example, in one scenario a failover may have occurred and the item key "foo" being observed may have been reverted to a previous state. This state may even be some value prior to the initial fetch before the application code mutated the value of the document.
h2. Implementation Constraints
{{OBSERVE}} is a binary-protocol-only operation. It could be implemented in ASCII, but that would currently be complicated by the fact that mutation operations do not return the new CAS value in the ASCII protocol. Since Couchbase Server uses the binary protocol exclusively, an ASCII version is not currently implemented.
h2. Expected Use Cases
h3. Ensure Indexable
The current implementation of Couchbase Server does not consider items in stale=false view queries until those items are persisted. The combination of an application making changes, ensuring they are persisted, and then issuing a stale=false query gives the application the ability to consistently access the view with respect to any changes that application made.
Further information on this use case is to be documented elsewhere.
h3. Ensure Durability
Through client-library-specific APIs, applications may wish to verify that a recently made change is durable. For example, in some cases a user application may wish to verify that a change to a data item has been made on the master and at least one replica. As another example, a user application may be extremely paranoid that a particular data update not be lost. In that case, it would want to ensure that the item has been replicated to at least 3 servers and persisted to at least 4 servers (the maximum durability supported by Couchbase Server).
h2. Implementation Questions
{quote}
q: Is it better to return the values OBS_SUCCESS, OBS_MODIFIED, and OBS_TIMEDOUT instead of true/false and treating the timeout as an exceptional condition? Since OBS_MODIFIED and OBS_TIMEDOUT may effectively require the same error handling, it may be easier to switch on these or return some extended boolean with status.
{quote}
a: ?
{quote}
q: Is a C implementation of {{OBSERVE}} needed?
{quote}