Configure Couchbase Cluster - Autonomous Operator with Public Network to enable external XDCR

Hi,

We are trying to set up a Couchbase cluster with the Autonomous Operator, and we need to enable Public Networking because we have to migrate data from an old Couchbase Server that is neither in the current K8S cluster nor in another K8S cluster. Therefore, we cannot set up replication between Kubernetes clusters.
As far as I understand from the documentation for AO 2.2, which is the version we are using, the only way to do this is to enable Public Networking.

When we enable Public Networking with a domain we already own, following all the instructions in the documentation, we get an SSL Handshake error, and because of it none of the LoadBalancers become available. We are on AWS.

I have tried different workarounds and none of them work. First, creating a completely new set of certs using easyrsa, as the docs suggest: this gives the SSL Handshake error. Second, using the certificates we already have: here the issue is that they do not have the SANs the Couchbase cluster requires, as described in Creating TLS Certificates | Couchbase Docs. So when we import these certs into the secret, the operator says they are not valid because some host is not contained in the SAN list.
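
To see which names an existing certificate is missing, a small check like the one below might help. It is only a sketch: the SAN list is what I recall from the Creating TLS Certificates page for a cluster on the default `cluster.local` Kubernetes domain, so please verify it against the docs for your operator version. The cluster name `cb-example`, the namespace `default`, the domain `cb-example.example.com` and the helper names are all hypothetical.

```python
from typing import List, Optional


def required_sans(cluster: str, namespace: str,
                  public_domain: Optional[str] = None) -> List[str]:
    """SANs the operator is assumed to expect (check the docs for your version)."""
    sans = [
        f"*.{cluster}",
        f"*.{cluster}.{namespace}",
        f"*.{cluster}.{namespace}.svc",
        f"*.{cluster}.{namespace}.svc.cluster.local",
        f"{cluster}-srv",
        f"{cluster}-srv.{namespace}",
        f"{cluster}-srv.{namespace}.svc",
        f"*.{cluster}-srv.{namespace}.svc.cluster.local",
        "localhost",
    ]
    if public_domain:
        # Public Networking adds a wildcard for the external DNS domain.
        sans.append(f"*.{public_domain}")
    return sans


def covers(cert_san: str, name: str) -> bool:
    """True if one SAN on the certificate covers the required name."""
    cert_san, name = cert_san.lower(), name.lower()
    if cert_san == name:
        return True
    if cert_san.startswith("*.") and not name.startswith("*."):
        # A wildcard stands in for exactly one left-most label.
        head, _, rest = name.partition(".")
        return bool(head) and rest == cert_san[2:]
    return False


def missing_sans(cert_sans: List[str], required: List[str]) -> List[str]:
    """Required names that no SAN on the certificate covers."""
    return [r for r in required if not any(covers(c, r) for c in cert_sans)]


# Hypothetical example: a certificate issued only for the main application domain.
existing_cert = ["example.com", "*.example.com"]
needed = required_sans("cb-example", "default", "cb-example.example.com")
for name in missing_sans(existing_cert, needed):
    print("missing:", name)
```

A certificate issued only for the main application domain would be missing every internal name, which would be consistent with the validation error described above.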

We are kind of lost on how to set this up in an existing environment. The Autonomous Operator is supposed to make this kind of thing simpler and more straightforward, but that does not seem to be the case.

Is there something we are missing? We have strictly followed Couchbase TLS | Couchbase Docs, Configure Public Networking | Couchbase Docs and Configure TLS | Couchbase Docs.

On the other hand, we would have liked to set this up with an internal Route53 DNS zone, using the Nginx Ingress that we already have inside the EKS cluster. Is this possible? We haven’t found how to do this in the current documentation.

Welcome @jprmxhero!

I know this can work, as I’ve set it up manually myself before, and we have automated tests for this kind of configuration. Usually the sticky bits are around setting up the external DNS or the certificates. Both are core K8S functionality that our operator automates around. External DNS, for example, is standardized, but the permission sets vary from provider to provider.

One question: what version is the old Couchbase Server environment running and what’s the new one running? I ask this because there are various known issues in some of the older versions with XDCR. I seem to recall some of those were with TLS setup.

I don’t think you’re missing anything, but it’s hard to say what’s going wrong without some details. One debugging step I’d recommend is to try sdk-doctor against your public networking setup. It’s designed to probe for different configuration issues, and it might find something that is also an issue for XDCR. The other thing I’d recommend is looking through the operator logs for anything unusual around setting up the LBs or External DNS. One easy way to do this is to run cbopinfo to gather all of the details and look at them locally.

If you have an Enterprise Subscription, you can certainly contact support and ask for it to be looked over as well.

XDCR uses the binary memcached protocol between the two clusters, and that can’t go through an Nginx Ingress (assuming I’m picking up the configuration you’re suggesting). The LB and associated DNS entries can be propagated to the internal Route 53 just fine, but it can’t be an Nginx ingress. The only way to get the binprot to traverse is with an LB per node.

Thanks for the reply and the deep dive into the subject. The problem is definitely in the certs: I cannot get this to work, and because of that I haven’t started configuring XDCR yet.

External DNS is working fine because it is creating the records in Route53 automatically. The LBs are created as well, but the nodes behind them are always out of service.
When I look at the Couchbase pod logs, the only error I see is “error: SSL Handshake”. This happens if I set up a new certificate with easyrsa as the documentation says. But we also have other certs for the main domain, issued by Amazon. If I use those certs, the original ones that we use for the rest of our application domains, I cannot even create the cluster, because I get an error saying my certs do not have the internal SANs for the K8S cluster that the Couchbase cluster seems to require in order to work.
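
To narrow down where the handshake fails, a small probe along these lines could be run against a pod or LB address. This is a rough sketch using only the Python standard library, not an official tool: `probe_tls` is a made-up helper, the hostname in the usage comment is hypothetical, and 18091 is the TLS admin port (the TLS data port 11207 could be probed the same way). Pass the CA certificate created with easyrsa as `ca_file`.

```python
import socket
import ssl
from typing import Optional


def probe_tls(host: str, port: int = 18091,
              ca_file: Optional[str] = None, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and report how far it got."""
    # With ca_file=None the system trust store is used instead of a custom CA.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                cert = tls.getpeercert()
                sans = [value for kind, value in cert.get("subjectAltName", ())
                        if kind == "DNS"]
                return f"ok: {tls.version()}, SANs={sans}"
    except ssl.SSLCertVerificationError as exc:
        # Untrusted chain, or the hostname is not covered by the SANs.
        return f"certificate rejected: {exc.verify_message}"
    except ssl.SSLError as exc:
        # The peer answered, but not with a usable TLS handshake.
        return f"handshake failed: {exc.reason}"
    except OSError as exc:
        # DNS failure, refused connection, or timeout.
        return f"connection failed: {exc}"


# Usage (hypothetical hostname and CA path):
# print(probe_tls("cb-example-0000.cb-example.example.com", ca_file="pki/ca.crt"))
```

A "certificate rejected" result would point at the chain or SANs, while "connection failed" would point at the LB, security groups or DNS rather than at the certificates.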

We have enterprise support, but it only starts in April, and they will not give us support until then.