Your cluster is set up to replicate data between the nodes in the cluster. Every node holds both active data and replica data. Active data is all the data that has been written to the node by a client, while replica data is a copy of data from another node in the cluster. Data replication enables high availability of data in a cluster: should any node in the cluster fail, the data will still be available from a replica.
On any given node, both active and replica data must wait in a disk write queue before being written to disk. If your node experiences a heavy load of writes, the disk write queue can become overloaded with replica and active data waiting to be persisted.
By default, a node sends backoff messages when its disk write queue contains one million items or 10% of all items, whichever is greater. When other nodes receive this message, they reduce the rate at which they send replica data. You can change this default to a given number, so long as the value is less than 10% of the total items currently in a replica partition. For instance, if a node contains 20 million items, it sends a backoff message to the nodes sending replica data when its disk write queue reaches 2 million items. Use the Couchbase command-line tool, cbepctl, to change this configuration:
shell> ./cbepctl 10.5.2.31:11210 -b bucketname -p bucketpassword set tap_param tap_throttle_queue_cap 2000000
In this example we specify that a node sends replication backoff requests when it has two million items or 10% of all items, whichever is greater. You will see a response similar to the following:
setting param: tap_throttle_queue_cap 2000000
In this next example, we change the default percentage used to manage the replication stream. If the items in a disk write queue reach the greater of this percentage or a specified number of items, replication requests will slow down:
shell> ./management/cbepctl 10.5.2.31:11210 set -b bucketname tap_param tap_throttle_cap_pcnt 15
In this example, we set the threshold to 15% of all items at a replica node. When a disk write queue on a node reaches this point, it will send replication backoff requests to other nodes.
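To make the "whichever is greater" rule concrete, the following is a minimal sketch of the arithmetic as this section describes it. The function name backoff_threshold is hypothetical and is not part of cbepctl or the server. It assumes the effective threshold is simply the larger of the item cap and the given percentage of total items.

```shell
#!/bin/sh
# Hypothetical helper (not part of Couchbase): estimate the disk write
# queue size at which a node would start sending backoff messages.
# Assumption: threshold = max(queue_cap, total_items * cap_pcnt / 100).
backoff_threshold() {
    queue_cap=$1     # value of tap_throttle_queue_cap
    cap_pcnt=$2      # value of tap_throttle_cap_pcnt
    total_items=$3   # items held at the node
    pct_items=$(( total_items * cap_pcnt / 100 ))
    if [ "$pct_items" -gt "$queue_cap" ]; then
        echo "$pct_items"
    else
        echo "$queue_cap"
    fi
}

# Defaults (one million items, 10%) on a node holding 20 million items.
backoff_threshold 1000000 10 20000000   # prints 2000000
# Same defaults on a node holding 5 million items: the item cap wins.
backoff_threshold 1000000 10 5000000    # prints 1000000
```

The first call matches the 20 million item example given earlier: 10% of the items (2 million) exceeds the one million item cap, so the percentage determines the threshold.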
Be aware that cbepctl operates per node and per bucket. To perform this operation, you must specify the IP address of a node in the cluster and a named bucket. If you do not provide a named bucket, the server applies the setting to any default bucket that exists at the specified node. To apply the setting to an entire cluster, you must run the command for every node/bucket combination in that cluster.
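Because the setting must be repeated for every node/bucket combination, a small loop can help. The sketch below only prints the commands (a dry run) so that you can review them first. The node address 10.5.2.32, the bucket list, and the function name print_throttle_commands are assumptions for illustration. The cbepctl invocation follows the form used in the examples above. If a bucket has a password, add the -p option as in the first example.

```shell
#!/bin/sh
# Hypothetical node addresses and bucket names; replace with your own.
NODES="10.5.2.31 10.5.2.32"
BUCKETS="default bucketname"

# Print one cbepctl command per node/bucket combination (dry run only;
# nothing is sent to the cluster).
print_throttle_commands() {
    param=$1   # e.g. tap_throttle_queue_cap or tap_throttle_cap_pcnt
    value=$2
    for node in $NODES; do
        for bucket in $BUCKETS; do
            echo "./cbepctl ${node}:11210 -b ${bucket} set tap_param ${param} ${value}"
        done
    done
}

print_throttle_commands tap_throttle_queue_cap 2000000
```

Once the printed commands look correct, you could pipe the output to sh from the directory that contains cbepctl to run them.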
For more information about changing this setting, see Section 7.6, “cbepctl Tool”. You can also monitor the progress of this backoff operation in Couchbase Web Console under Tap Queue Statistics | back-off rate. For more information, see Section 6.4.1.4, “Monitoring TAP Queues”.