XDCR performance drops dramatically after a period of time

I just installed CS 2.1 for evaluation (I had been testing with 2.0.1). I created an XDCR replication from node1, bucket1 to node2, bucket1, then added 5 million documents to node1.bucket1. From the console attached to node2 (the receiver in my XDCR configuration), I noticed that the incoming XDCR 'ops per second' was fairly constant (some variability, but not much). At some point (usually towards the end of the replication, judging by the item count on the receiver being close to 5 million), the XDCR 'ops per second' drops dramatically. From the console, I see the following pattern:

A burst of incoming items (ops per second is at a reasonable number), then it drops to zero for anywhere from 15 to 30 seconds. Then I see another burst of incoming transfers for a few seconds, then it goes back to zero for 15-30 seconds. This continues until the replication is complete. The last 5-10% of the data takes just as much time as the initial 90-95%. I didn't see this in 2.0.1.



At the very end of replication (e.g., the last few thousand items), most of the vBucket replicators have already finished and are idle, so it is OK to see some fluctuation, but not in the proportion you are describing here. It looks like something is going wrong during the replication of the last 10% of items.

You can look at the logs. An easy way to see what happened is to check the ns_server logs on the destination side for the time when the uneven replication was observed. If there are a lot of timeouts or memcached errors there, that may explain the uneven replication rate.
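As a rough illustration of that check, here is a minimal Python sketch that counts log lines mentioning a timeout or a memcached error. The function name and the keyword matching are assumptions made for illustration; the actual wording of messages in the ns_server logs may differ, so adjust the keywords to what you really see.

```python
def count_suspect_lines(lines):
    """Count lines that mention a timeout or a memcached error.

    `lines` is any iterable of strings, for example an open log file.
    The keywords are an assumption, not the exact ns_server message text.
    """
    counts = {"timeout": 0, "memcached_error": 0}
    for line in lines:
        lower = line.lower()
        if "timeout" in lower:
            counts["timeout"] += 1
        if "memcached" in lower and "error" in lower:
            counts["memcached_error"] += 1
    return counts


# Invented sample lines, only to show the call; real log lines look different.
sample = [
    "[xdcr:error] timeout while waiting",
    "info: all good",
    "memcached returned error: tmpfail",
    "Memcached ERROR after Timeout",
]
print(count_suspect_lines(sample))
```

To use it on a real file, open the log and pass the file object in, e.g. `with open(path) as f: count_suspect_lines(f)`, where `path` is whichever log file you are inspecting.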

Also, if you can, share your logs with me (both source and destination) and specify the time frame of the last 10% of items (the time when the uneven replication was observed).
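If you have noted the 'ops per second' values from the console along with the time of each reading, a small script can help pin down that time frame. This is only a sketch under assumptions: the `(timestamp_seconds, ops_per_second)` sample format and the `find_stalls` helper are hypothetical and not something Couchbase provides; you would have to record the samples yourself.

```python
def find_stalls(samples, min_stall_seconds=15):
    """Return (start, end) timestamps of periods with zero ops.

    `samples` is a list of (timestamp_seconds, ops_per_second) tuples,
    sorted by time. A period counts as a stall when ops stays at zero
    for at least `min_stall_seconds`.
    """
    stalls = []
    start = None      # timestamp of the first zero reading in the current run
    last_zero = None  # timestamp of the most recent zero reading
    for ts, ops in samples:
        if ops == 0:
            if start is None:
                start = ts
            last_zero = ts
        else:
            if start is not None and last_zero - start >= min_stall_seconds:
                stalls.append((start, last_zero))
            start = None
    # Handle a stall that runs to the end of the samples.
    if start is not None and last_zero - start >= min_stall_seconds:
        stalls.append((start, last_zero))
    return stalls


# Made-up readings: a burst, a 20-reading gap, a burst, then a short gap.
samples = (
    [(t, 500) for t in range(0, 5)]
    + [(t, 0) for t in range(5, 25)]
    + [(t, 400) for t in range(25, 28)]
    + [(t, 0) for t in range(28, 33)]
)
print(find_stalls(samples))
```

The start and end times it reports are the windows worth looking up in the logs on both the source and the destination.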

To share the logs, follow the steps documented here:
(share the XDCR logs and the cbcollect_info output)


1 Answer


What are considered the 'ns_server' logs? Looking in /opt/couchbase/var/lib/couchbase/logs, I see lots of different 'types' of log files:


but no ns_server log files.

Also, I'm using 2.1 for evaluation purposes only. Since I haven't purchased CS, is it OK to upload the log files and the cbcollect_info output you requested?

In your case you can start with the XDCR log.

When you run the cbcollect_info tool, it generates a log file for you.