[MB-6643] Very slow replication rate (less than 100 on average) with 2 unidirectional replications between 2 clusters. Created: 13/Sep/12  Updated: 26/Sep/12  Resolved: 19/Sep/12

Status: Closed
Project: Couchbase Server
Component/s: cross-datacenter-replication
Affects Version/s: 2.0-beta-2
Fix Version/s: 2.0-beta-2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Ketaki Gangal Assignee: Junyi Xie (Inactive)
Resolution: Duplicate Votes: 0
Labels: 2.0-beta-release-notes, pblock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 2.0-1717
1024 vbuckets

2 unidirectional replications on 2 clusters.

Centos

6 GB RAM, 4-core VMs

Attachments: PNG File Screen Shot 2012-09-13 at 10.14.38 AM.png     PNG File Screen Shot 2012-09-13 at 10.14.47 AM.png    

 Description   
- Set up two 3-node clusters.
- Load 3M items on bucket1 on cluster1 and 3M on bucket2 on cluster2 [no expirations on any load]
Example:
- nohup lib/perf_engines/mcsoda.py localhost:23201 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 min-value-size=256 max-items=1000000 exit-after-creates=1 prefix=a_one
- nohup lib/perf_engines/mcsoda.py localhost:23202 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 min-value-size=256 max-items=1000000 exit-after-creates=1 prefix=a_two
- nohup lib/perf_engines/mcsoda.py localhost:23203 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 min-value-size=256 max-items=1000000 exit-after-creates=1 prefix=a_three

Start unidirectional replication from cluster1 bucket1 to cluster2 bucket1.
Start unidirectional replication from cluster2 bucket2 to cluster1 bucket2.
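For reference, the two replications were created through the UI, but the same setup can be driven via the REST endpoint POST /controller/createReplication. This sketch only builds the form-encoded request body; the remote cluster reference name ("cluster2") is a placeholder for whatever name the remote cluster was registered under.

```python
# Sketch only: build the form body for POST /controller/createReplication.
# "cluster2" is a placeholder remote cluster reference name.
from urllib.parse import urlencode

def create_replication_payload(from_bucket, to_cluster_ref, to_bucket):
    """Form-encoded body for POST /controller/createReplication."""
    return urlencode({
        "fromBucket": from_bucket,
        "toCluster": to_cluster_ref,      # name of the remote cluster reference
        "toBucket": to_bucket,
        "replicationType": "continuous",  # keep replicating new mutations
    })

# Unidirectional: cluster1/bucket1 -> cluster2/bucket1
payload = create_replication_payload("bucket1", "cluster2", "bucket1")
```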

Keep load running on cluster1 - a new load with deletes/expirations:
lib/perf_engines/mcsoda.py localhost:23203 vbuckets=1024 doc-gen=0 doc-cache=0 ratio-creates=1 ratio-sets=1 ratio-deletes=0.02 ratio-expirations=0.05 expirations=1200 min-value-size=256 max-items=1000000 exit-after-creates=1 prefix=a_four &


Observing a very slow replication rate on cluster1 and close to 100 percent CPU usage.

Replication rate on cluster1 is between 10-115.
Replication rate on cluster2 is between 1-2k.
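The rates above were read off the UI graphs; the same figure can be derived from two samples of a cumulative replicated-docs counter taken from the per-bucket stats endpoint. A minimal sketch, assuming a stat named "replication_docs_written" (the name is an assumption for illustration):

```python
# Hypothetical helper: average replication rate from two samples of a
# cumulative counter, e.g. from /pools/default/buckets/<bucket>/stats.
# The stat name "replication_docs_written" is assumed for illustration.
def avg_replication_rate(sample1, sample2, interval_secs):
    delta = sample2["replication_docs_written"] - sample1["replication_docs_written"]
    return delta / interval_secs

# Two samples taken 60 seconds apart
s1 = {"replication_docs_written": 120000}
s2 = {"replication_docs_written": 126600}
rate = avg_replication_rate(s1, s2, 60)  # 110.0 docs/sec, within the 10-115 band seen here
```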


Adding screenshots from cluster1.



 Comments   
Comment by Ketaki Gangal [ 13/Sep/12 ]
Over a period of time, XDC gets are much higher but XDC sets are low.

Seeing 9K/sec XDC ops and 9K/sec gets, but creates <100 per sec on the cluster.
Comment by Junyi Xie (Inactive) [ 13/Sep/12 ]
Unable to tell a lot from the screenshot. Please post more information about the two clusters, e.g., the complete screenshots with the XDC sections (Replication and Destination).

Also, from the commands above it looks like you are loading data from your local machine. Can you please load the data from another machine in your test?
Comment by Ketaki Gangal [ 13/Sep/12 ]
The clusters are still running here:

http://10.3.121.32:8091/index.html#sec=analytics&statsBucket=%2Fpools%2Fdefault%2Fbuckets%2Fbucket1%3Fbucket_uuid%3Db0a2f01b5f787f2f85f7978a42d99a6b&zoom=zoom_minute&graph=ep_ops_create

http://10.3.121.38:8091/index.html#sec=analytics&statsBucket=%2Fpools%2Fdefault%2Fbuckets%2Fbucket1%3Fbucket_uuid%3Db0a2f01b5f787f2f85f7978a42d99a6b&zoom=zoom_minute&graph=ep_ops_create

The loading commands are in the bug description above. I am not loading data from my local machine; it is from different clients.

The screenshot was from one of the relevant clusters, showing the XDC ops/sec and XDC creates/sec rates. I will add the other one as well.

Comment by Junyi Xie (Inactive) [ 19/Sep/12 ]
This sounds like the same issue as MB-6662, because you have expired items in the workload, and you see extremely high getMeta ops but very low setWithMeta/DeleteWithMeta.

MB-6662 has been fixed by recent commits.
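For illustration only (a simplified model, not the actual XDCR implementation): the replicator issues a getMeta against the destination for every source mutation, and only follows up with setWithMeta when the source revision wins; a workload dominated by expired items therefore shows high getMeta counts with near-zero setWithMeta, matching the stats reported above.

```python
# Simplified model of the XDCR metadata check (illustrative, not the real code).
# Each mutation is (key, source_revision, expired_flag); dest_meta maps
# key -> revision currently on the destination.
def replicate(mutations, dest_meta):
    get_meta_ops = set_with_meta_ops = 0
    for key, src_rev, expired in mutations:
        get_meta_ops += 1                 # metadata check on the destination
        if expired:
            continue                      # expired item: nothing to write
        if src_rev > dest_meta.get(key, 0):
            set_with_meta_ops += 1        # source wins: write with metadata
            dest_meta[key] = src_rev
    return get_meta_ops, set_with_meta_ops
```

With mostly-expired mutations, getMeta ops stay proportional to the full mutation stream while setWithMeta ops collapse, which is the gets-high/sets-low pattern in the screenshots.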
Comment by Junyi Xie (Inactive) [ 19/Sep/12 ]
MB-6662
Generated at Wed Jul 23 11:15:31 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.