In Part-1 of this blog series on understanding Ordering in Couchbase Functions, we observed how the mutations are consumed by Couchbase Eventing Service in different scenarios.
Now, let’s look under the hood of the Eventing Service and try to understand how the Eventing Workers actually get assigned for processing the mutations.
vBucket to Worker Assignments
Couchbase automatically shards data across the data nodes; these shards are called vBuckets. Every bucket is sharded into 1024 vBuckets. To learn more about auto-sharding and vBuckets, check out this whitepaper.
Let us assume that our sample cluster has 3 Data nodes and 1 Eventing node. Two functions are deployed onto the Eventing node, and both listen to a single source bucket on the Data nodes.
The source bucket is automatically sharded into 1024 vBuckets, which are spread across the 3 Data nodes. The illustration below shows the vBuckets assigned in monotonically increasing order, but this might not always be the case; the illustration lists the vBucket sequence in order for better readability. The two functions deployed onto the Eventing node are fn-email and fn-score, with 4 and 3 workers assigned to them respectively.
As there is just 1 Eventing node, all 1024 vBuckets are assigned to it. Each worker of a given Function listens to roughly 1024/<num_workers> of the vBuckets. When 1024 is not evenly divisible by the number of workers (that is, 1024 % num_workers != 0), the vBucket distribution is off by one for a few workers. Hence, as seen in the illustration, each of the 4 workers of the fn-email function listens to 256 (= 1024/4) vBuckets; by the same logic, each of the 3 workers of the fn-score function listens to 341 (= 1024/3, rounded down) vBuckets, with one worker alone seeing 342 vBuckets.
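The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counts, not the Eventing Service's actual assignment code; the function name vbuckets_per_worker is hypothetical, and which worker receives the extra vBucket is an assumption made for the example.

```python
NUM_VBUCKETS = 1024  # every Couchbase bucket is sharded into 1024 vBuckets


def vbuckets_per_worker(num_workers, num_vbuckets=NUM_VBUCKETS):
    """Return how many vBuckets each worker would listen to.

    Assumption for illustration: the leftover vBuckets
    (num_vbuckets % num_workers) go to the first workers, one each.
    """
    base, extra = divmod(num_vbuckets, num_workers)
    return [base + 1 if i < extra else base for i in range(num_workers)]


if __name__ == "__main__":
    print("fn-email:", vbuckets_per_worker(4))
    print("fn-score:", vbuckets_per_worker(3))
```

For fn-email this gives 256 vBuckets for each of the 4 workers; for fn-score it gives one worker with 342 vBuckets and two workers with 341, which adds up to 1024.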
If the cluster is stable and no Cluster Rebalance is in progress, all mutations to a particular vBucket are consumed by the worker assigned to that vBucket, and all changes to a particular document are consumed by that worker in order. (Essentially, each worker process spawns multiple threads, and the mutations of a given document are seen in order by the thread.)
This explains why documents are not processed in the sequence in which they were inserted (Take-Away#1 and Take-Away#3 in Part-1): documents can belong to different vBuckets, and the different workers assigned to those vBuckets operate on them concurrently, which leads to non-sequential behaviour across documents.
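To make this concrete, here is a small Python simulation of routing mutations to workers. It is a sketch under stated assumptions: vbucket_for uses a plain CRC32 modulo 1024 as a stand-in and is not Couchbase's exact key-hashing formula, and worker_for assumes contiguous vBucket ranges per worker, which (as noted above) may not match the real assignment.

```python
import zlib
from collections import defaultdict

NUM_VBUCKETS = 1024


def vbucket_for(key):
    # Stand-in hash for illustration only; not Couchbase's exact formula.
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def worker_for(vbucket, num_workers):
    # Hypothetical assignment: contiguous vBucket ranges per worker.
    return vbucket * num_workers // NUM_VBUCKETS


def route(mutations, num_workers):
    """Group (sequence, key) pairs into one queue per worker.

    'mutations' is a list of document keys in arrival order.
    """
    queues = defaultdict(list)
    for seq, key in enumerate(mutations):
        worker = worker_for(vbucket_for(key), num_workers)
        queues[worker].append((seq, key))
    return dict(queues)


if __name__ == "__main__":
    arrivals = ["doc-a", "doc-b", "doc-a", "doc-c", "doc-b", "doc-a"]
    for worker, queue in sorted(route(arrivals, 4).items()):
        print("worker", worker, "->", queue)
```

Every mutation of a given key lands in the same worker's queue in arrival order, so per-document ordering is preserved. Nothing coordinates the queues with each other, though, so the order in which different documents are handled depends on how the workers happen to be scheduled.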
Now, let's say the mutation rate increases and the backlog on the single Eventing node grows, driving up the CPU load on that node. The Administrator adds one more Eventing node and performs a Cluster Rebalance.
Each of the Eventing nodes now has 1024/2 = 512 vBuckets assigned to it, and each worker sees about half as many vBuckets as it was assigned earlier. The vBucket-to-worker assignments are performed seamlessly by the Eventing Service during the Rebalance and are completely transparent to the Administrator.
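The effect of the Rebalance on the counts can be worked out with a short Python sketch. It assumes that every Eventing node runs the same number of workers per Function and that leftover vBuckets are handed out one each to the first nodes and workers; the names split_evenly and layout are hypothetical and exist only for this example.

```python
def split_evenly(total, parts):
    """Split 'total' items into 'parts' shares that differ by at most one."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def layout(num_nodes, workers_per_function, num_vbuckets=1024):
    """Return {node_index: {function_name: [vBuckets per worker]}}.

    Assumption for illustration: each Eventing node runs the same number
    of workers for each Function.
    """
    result = {}
    for node, node_share in enumerate(split_evenly(num_vbuckets, num_nodes)):
        result[node] = {
            fn: split_evenly(node_share, workers)
            for fn, workers in workers_per_function.items()
        }
    return result


if __name__ == "__main__":
    functions = {"fn-email": 4, "fn-score": 3}
    print("1 node :", layout(1, functions))
    print("2 nodes:", layout(2, functions))
```

With one Eventing node, the fn-email workers see 256 vBuckets each. With two nodes, each node owns 512 vBuckets, so each fn-email worker sees 128, and the fn-score workers on each node see 171, 171 and 170.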
Because the metadata bucket stores both the checkpoint information for the processing being done and the worker assignments, no mutation is lost, and the Couchbase Eventing Service continues to operate seamlessly during Cluster Rebalances.
This behaviour offers elastic scalability as load varies with seasonality or traffic patterns.
Behaviour During DEBUG Mode
The online real-time debugger helps developers debug the deployed code as the mutations happen. A debug session operates only on one mutation at a time, and this mutation is arbitrarily selected. A new debug session should be started for every successive change to be observed.
Once a Debug session is started, a single mutation is assigned to a worker for the debugging session, and the rest of the mutations happening to the source bucket continue to be processed by the Function; they are not blocked while the debugging session is underway.
It is important to note that the Debugger should not be used in Production environments and is best used in Development environments. The vBucket-to-worker assignments that we saw earlier still hold, but the mutation picked up by the Debugger is processed out of sequence. So, a Debugger session in a Production environment can introduce timing issues and can cause lost mutations if the Debug session is terminated midway. Also, if the operation in the Function is sensitive to the ordering of mutations for a particular document, this out-of-sequence processing of the mutation might lead to incorrect states.
We hope that this two-part series gave you a good understanding of some of the worker ordering semantics, along with a quick peek under the hood of the Couchbase Eventing Service.