I’m running a set of XDCR tests. My model is a master/slave environment, where the master cluster always pushes data to the destinations. In my tests, I insert data into the master (a cluster of two nodes), which replicates to all the destinations. All is working well (your XDCR 2.2 improvements are a major step up from 2.1). However, beam.smp on the master nodes keeps eating up memory and eventually crashes. In both instances where beam.smp crashed, it started up again and finished the replication. Is this expected behavior?
I’m running CS 2.2.
As you can guess, crashing processes are not expected behavior, but I would like to understand your topology a little better.
Can you tell us how many nodes you have in each cluster, and how many documents/operations per second you are handling? (You can monitor the XDCR activity in your administration console.)
What are the options you have chosen for the replication itself, based on the parameters available here:
The ‘main’ cluster contains 2 nodes, and I have two buckets on it. I replicate to 8 external clusters, each of which contains 1 node. I’m inserting about 8K documents per second into each bucket on the master cluster, and every one of those documents is replicated to each of the 8 external clusters over XDCR. I run ‘top’ on each of the nodes that make up the ‘master’ cluster, and the amount of memory each node uses increases rather quickly. Eventually, one of the nodes uses up all its memory; I can see the process go away and then come back, after which the replication finishes (albeit at a slower rate).
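To put the load described above in numbers, here is a rough back-of-the-envelope sketch. It assumes every inserted document is sent once to every destination, that the load is spread evenly over the two source nodes, and that the "about 8K" figure is exact; the function and key names are made up for illustration and are not part of any Couchbase API.

```python
def xdcr_fanout(docs_per_sec_per_bucket, buckets, destinations, source_nodes):
    """Estimate outbound XDCR load for a one-to-many topology.

    Assumption: each written document is replicated exactly once to
    every destination cluster, with no batching or de-duplication.
    """
    # One replication stream per (bucket, destination) pair.
    replications = buckets * destinations
    # Total documents per second leaving the source cluster.
    outbound_docs_per_sec = docs_per_sec_per_bucket * replications
    # Assumption: load is balanced evenly across the source nodes.
    per_node_docs_per_sec = outbound_docs_per_sec / source_nodes
    return {
        "replications": replications,
        "outbound_docs_per_sec": outbound_docs_per_sec,
        "per_node_docs_per_sec": per_node_docs_per_sec,
    }


# Figures from the post: ~8K docs/s per bucket, 2 buckets,
# 8 destination clusters, 2 nodes in the source cluster.
load = xdcr_fanout(8000, 2, 8, 2)
print(load)
```

Under those assumptions, the two source nodes together have to push roughly sixteen times the per-bucket insert rate, which gives a sense of how much outbound work sits behind the memory growth.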
The XDCR params are as follows:
max concurrent reps: 20
optimistic replication threshold: 20000
All other parameters are left at their default values.
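Since the memory growth was observed by watching ‘top’, a small logger can make the trend easier to share in a thread like this. The following is only a sketch: it assumes a Linux node with the procps `ps` command and Python 3.7+, and it sums the resident set size over every process named beam.smp. The helper names are hypothetical.

```python
import subprocess
import time


def total_rss_kb(ps_output):
    """Sum the RSS values (KiB, one per line) printed by `ps -o rss=`."""
    return sum(int(value) for value in ps_output.split())


def sample_rss_kb(process_name="beam.smp"):
    """Return the combined RSS in KiB of all processes with this name."""
    # `-C` selects by command name; `-o rss=` prints RSS with no header.
    # If nothing matches, ps prints nothing and the total is 0.
    result = subprocess.run(
        ["ps", "-C", process_name, "-o", "rss="],
        capture_output=True,
        text=True,
    )
    return total_rss_kb(result.stdout)


def log_samples(count, interval_sec=5.0, process_name="beam.smp"):
    """Print `count` timestamped RSS samples, `interval_sec` apart."""
    for _ in range(count):
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {process_name} rss_kb={sample_rss_kb(process_name)}")
        time.sleep(interval_sec)


# Example (not run here): one sample every 5 seconds for 10 minutes.
# log_samples(count=120, interval_sec=5.0)
```

Running this on each source node during a test would show whether the growth is steady or tied to particular phases of the replication, and a sudden drop to a small value would mark the moment beam.smp restarted.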