Replicate Data to Capella via XDCR

Question from a customer:

“We want to use XDCR to replicate data to Capella. We have a cluster on-prem with internal set of IP addresses. My understanding is when doing XDCR, all nodes where the Data Service is running should be whitelisted. In our case, all the IPs are internal. Can someone please advise on general steps/guidelines/best practices on what we should do in order to setup XDCR? Thanks!”

The IPs that need to be whitelisted are the external IPs that traffic from those nodes appears to come from when it leaves your network. The specific terminology will depend on how the “on-prem” environment is set up, but the concept is the same in every case.

As a simple example, you may have 3 nodes in your datacenter, each with an internal IP; they probably won’t have their own external IPs. However, the datacenter will have some kind of network router that is used to talk to the internet, and that router will have a public IP. Kind of like your laptop on your home network :wink:

What you need to do is identify the outbound IP(s) for the nodes based on the network routing. One route is to talk to your network/infra team to find out what the outbound IP(s) are. Another is to test it yourself from the nodes using something like https://linuxconfig.org/how-to-use-curl-to-get-public-ip-address. However, in cases where traffic is balanced between multiple outbound gateways this may be misleading, as it would only return a single IP rather than the set of all possible outbound IPs.
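If you go the self-test route, here is a minimal Python sketch of the same idea as the curl check, repeated several times so that you have a better chance of seeing more than one outbound IP. The echo service URL (`https://api.ipify.org`) and the function names are my own assumptions for illustration; any service that returns your public IP as plain text should work. Repeating the request still does not guarantee you will see every gateway, so treat the result as a cross-check against what your network team tells you, not a replacement for it.

```python
import ipaddress
import urllib.request
from typing import Callable, Set

# Assumption: a plain-text "what is my public IP" echo service.
ECHO_URL = "https://api.ipify.org"


def fetch_public_ip(url: str = ECHO_URL, timeout: float = 5.0) -> str:
    """Return the public IP the echo service saw for this request."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def collect_outbound_ips(
    fetch: Callable[[], str] = fetch_public_ip, attempts: int = 10
) -> Set[str]:
    """Call `fetch` several times and return the distinct valid IPs seen."""
    seen: Set[str] = set()
    for _ in range(attempts):
        try:
            candidate = fetch()
            # Raises ValueError if the response is not an IP address.
            ipaddress.ip_address(candidate)
        except (OSError, ValueError):
            # Network failure or unexpected response: skip this attempt.
            continue
        seen.add(candidate)
    return seen
```

Running `collect_outbound_ips()` on each Data Service node and merging the results gives you a candidate list of IPs to whitelist.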

In a public cloud environment (AWS/GCP/Azure) you will likely have one or more network gateways that are used for outbound network traffic. It’s the public IPs of these gateways that you need to whitelist.

Once you’ve got it set up, you can use SDK Doctor from each node of the source cluster to confirm that it is able to communicate with the Capella cluster.
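If you want a quick pre-check before running SDK Doctor, a plain TCP reachability test from each source node can catch obvious firewall or whitelist problems. The sketch below is not SDK Doctor and does far less than it does; it only tries to open a TCP connection. The hostname in the usage note is a hypothetical placeholder, and the ports (18091 and 11207, the TLS ports Couchbase uses for cluster management and key-value traffic) are my assumption of what matters here, so confirm them against the Capella documentation.

```python
import socket
from typing import Dict, Iterable


def check_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Try a TCP connection to each port and report which ones succeed."""
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            # Open and immediately close the connection; no data is sent.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[port] = False
    return results
```

For example, `check_ports("your-cluster.example.com", [18091, 11207])` returns a dictionary mapping each port to `True` or `False`. A `False` entry usually means the outbound IP for that node is not whitelisted yet, or a local firewall is blocking the traffic.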