XDCR "aggressiveness"

ergroot · September 19, 2013, 8:50am

Hi,

I am in the situation where I only have 2 nodes to run CB. This means I have to use XDCR replication to make sure both nodes have the same information.

I have XDCR up and running, but I notice that it is sort of “lazy”. When I put write/read load on the nodes, at some point each CB instance has thousands of pending replications, which means the systems are not in sync. After a while it seems to pick up speed, and after 4 to 5 minutes it is in sync.

Unfortunately by that time the data in CB is a mess because updates and delete operations have gone wrong as the application tries to delete a record on CB instance B while it was added to CB instance A and not replicated yet.
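To make the failure mode concrete, here is a minimal toy model in Python. It is not Couchbase code: the `Node` class, the `outbox` queue, and the `replicate` helper are hypothetical stand-ins for two single-node clusters with a replication backlog between them. It only shows how a delete sent to B before the add has arrived from A ends up lost.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for a single-node cluster (not a Couchbase API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = deque()  # local mutations not yet replicated to the peer

    def set(self, key, value):
        self.data[key] = value
        self.outbox.append(("set", key, value))

    def delete(self, key):
        # Deleting a key this node has never seen does nothing,
        # so there is no mutation to replicate either.
        if key not in self.data:
            return False
        del self.data[key]
        self.outbox.append(("delete", key, None))
        return True


def replicate(src, dst):
    """Drain the pending mutations of src into dst (the 'catch-up' phase)."""
    while src.outbox:
        op, key, value = src.outbox.popleft()
        if op == "set":
            dst.data[key] = value
        else:
            dst.data.pop(key, None)


a, b = Node("A"), Node("B")
a.set("order:1", "pending")        # record is added on A
deleted = b.delete("order:1")      # app deletes on B before replication arrives
replicate(a, b)                    # backlog finally drains A -> B
replicate(b, a)                    # nothing to send: the delete never happened
print(deleted, "order:1" in a.data, "order:1" in b.data)  # False True True
```

Once the backlog drains, both nodes hold a record the application believes it deleted, which matches the kind of mess described above.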

I’ve experimented with XDCR settings such as xdcrOptimisticReplicationThreshold and xdcrMaxConcurrentReps. It appears to get in sync quicker, but it still takes 4 to 5 minutes, and that is way too long.

For my application I need instant replication between 2 CB instances.

Please provide some hints towards a solution.

Best Regards,
Erik

dipti · September 19, 2013, 3:08pm

Erik,

Do you mean that you have only a 1 node cluster on both sides?
What kind of hardware are you running these on? What is your front-end workload?
If you want high availability, you need to use intra-cluster replication. If you have more than 1 node, you can set intra-cluster replication to 1 and it will create a replica copy for you. You can manually fail over a node to promote the replicas to active.

If you are looking for disaster recovery, you can use XDCR, but XDCR is a heavier operation and requires a minimum of 3 nodes. It works on a per-partition basis and by default uses 32 streams. These streams round-robin across the 1024 vBuckets. It uses data that is persisted to disk, meaning that you have to wait for data to persist to disk before it gets picked up by the XDCR engine.
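As a rough illustration of the numbers above, the sketch below spreads 1024 vBuckets over 32 streams in round-robin order. It is a simplified static model, not the actual XDCR scheduler; the function name `assign_round_robin` is made up for this example, and only the two constants come from the description above.

```python
from collections import defaultdict

NUM_VBUCKETS = 1024  # partitions per bucket, as described above
NUM_STREAMS = 32     # default number of XDCR streams, as described above


def assign_round_robin(num_vbuckets=NUM_VBUCKETS, num_streams=NUM_STREAMS):
    """Toy model: hand out vBuckets to streams one at a time, in rotation."""
    assignment = defaultdict(list)
    for vb in range(num_vbuckets):
        assignment[vb % num_streams].append(vb)
    return dict(assignment)


assignment = assign_round_robin()
print(len(assignment))     # 32 streams
print(len(assignment[0]))  # 32 vBuckets handled by each stream
print(assignment[0][:4])   # [0, 32, 64, 96]
```

In this model each stream is responsible for 32 vBuckets, so under heavy write load a given vBucket may have to wait for its stream to work through the others, on top of the wait for persistence to disk.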

XDCR can be done in seconds if not milliseconds depending on various factors. Hope this helps explain how XDCR works.

penacho · September 20, 2013, 8:34am

Dipti,
XDCR requiring 3 nodes seems a bit strange.
In earlier posts, users were advised to use two single-node clusters with XDCR between them rather than a dual-node cluster with replication. The reason was that such a dual-node cluster does not seem to give access to the replicated documents of a failed server. E.g.:

Also in our (=Erik’s) case, two nodes are sufficient for capacity and performance. And it should also be sufficient for redundancy. How can we ensure fast replication, plus availability of all data when one node goes down?
