In my previous article I discussed, at a high level, the new Public Connectivity feature in Autonomous Operator 1.2.0.  That was intentionally an abstract overview, intended to coax the reader into learning about the joys of DDNS, TLS and layer 3 networking.

Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime.

Hopefully you have all invested the time to learn how to fish (or are at least ready to get your feet wet).  This article provides a practical tutorial on configuring the Operator so that you can safely expose your Couchbase clusters on the public internet.

Why Use This Feature?

Today’s tech startups are more cloud-focused than the traditional enterprise. Some would argue that the traditional enterprise is entrenched, guarding data behind firewalls in private data centers, and from a security standpoint that is arguably the correct thing to do.

Increased exposure in the cloud, while a bigger risk, is becoming less of a concern as time goes on.  More importantly, the cloud opens myriad doors to gains in agility and innovation. Connecting to public service offerings over the public internet is a huge benefit, and one that cannot be easily or economically achieved with services that live on-premises, hidden behind NAT boundaries.

The one example that I am personally quite fond of is the rise of Function as a Service (FaaS).  Functions are short-lived jobs (typically based on containers) that respond to stimuli and return a result.  They are created on demand and automatically scale horizontally, near-instantaneously, to handle the required workload.  You can use public FaaS offerings today, with no time wasted installing and configuring virtual or physical infrastructure.  AWS Lambda is one such incarnation you may well be familiar with.

Unless your function is pure (in the sense that it just processes data), it will require inputs, typically in the form of data from a database.  These FaaS offerings, given that they operate on the public internet, will also require a connection to a public database. Establishing private VPN tunnels between these services may be difficult or impossible.

It is for these reasons—interconnectivity, simplicity, and agility—that we offer the option of public connectivity.

Security, Security, Security

A service placed on the public internet will face scrutiny from malicious third-party actors.  The internet is awash with attempts to glean and exploit personal information. As a simple test, connect a UNIX system to the internet: your SSH logs will fill up fairly quickly with attempts to access the machine using dictionaries of common or stolen usernames and passwords, and your firewall will show attempts to scan for open ports.  This is just the accepted norm, and has been for as long as I can remember.

Databases in particular are honey pots to criminals trying to exploit systems in order to gain access to mailing lists for phishing attacks, or extract credit card details for fraud and identity theft. You quite simply have to make these services secure.

The Public Connectivity feature of the Operator mandates the use of full end-to-end encryption.  This prevents snoopers from seeing confidential information while it traverses public networks. Digital certificates establish trust between clients and servers: a client verifies that the server certificate is valid for the host name it tried to connect to, and that it is signed by a trusted certificate authority.

The Operator allows the use of server certificate chains only; it does not act as a certificate authority, signing server certificates for individual servers as the topology changes.  Acting as a CA would allow any certificate to be created and signed, so we opt for the safer approach. As a result, we support a wildcard certificate for the cluster as a whole. When using wildcard certificates we also need to use public DNS, so that the client can verify the server certificate is valid for the host being contacted.

This background gives us enough knowledge to begin deploying our database with public connectivity.

Let’s Get Started

DNS

As discussed, we need to use public DNS in order to contact the Couchbase cluster nodes when using public connectivity.  A public DNS domain can be bought relatively cheaply online from registrars such as Gandi, GoDaddy, Namecheap, etc.

We also need to be able to use Dynamic DNS.  As nodes are added to and removed from our Couchbase cluster, we need corresponding entries to be added to and removed from the DNS.  They also need to be updated if the public IP addresses of these nodes change. This is because of the high-performance, client-side sharding used by Couchbase clients and XDCR.  We will be using the Kubernetes external-dns service to perform DDNS updates; its documentation lists the supported DDNS providers.  Once you have purchased a DNS domain you will need to delegate its name servers to your chosen DDNS provider.  My choice for this example is Cloudflare. The final preparation step is the creation of an API key or other credentials for the external-dns controller to authenticate with the DDNS provider and manage the DNS records required by the Couchbase cluster.

TLS

For most people this is the most mystical part of the process.  HTTPS web pages just work transparently, so the average user has little need to be concerned with TLS from day to day.  I’m not going to go into much detail (that is for another post), but we do need to discuss how the certificates are tied to your chosen DNS configuration.

I’m using my personal DNS domain, spjmurray.co.uk, for this demonstration.  I will be installing my Couchbase cluster in its own namespace called 6c3c0075-b44a-11e9-9518-4a8d7629c69a, and the cluster itself will be called couchbase. These are important parameters to know because they allow us to uniquely address a Couchbase cluster within our Kubernetes cluster.  The Couchbase cluster will be configured so that its domain is couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk. The Operator will require the creation of A records within this domain for each node, as well as for the Couchbase Web Console.

Knowing our domain, we can now determine the DNS wildcard certificate subject alternative name *.couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk.

OpenVPN’s EasyRSA tool is a simple method of generating certificates.  First, clone the repository and initialize it.
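As a minimal sketch, assuming the upstream EasyRSA 3 repository on GitHub:

    git clone https://github.com/OpenVPN/easy-rsa.git
    cd easy-rsa/easyrsa3
    ./easyrsa init-pki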

Generate the CA certificate and key pair.  If you recall, the CA’s private key is used to digitally sign a server certificate; a client can then verify that the server certificate is authentic using the CA’s public key.  The following command will prompt you for a CA name and a password. After completion, the CA certificate can be found in pki/ca.crt.
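In the EasyRSA 3 layout cloned above, this is:

    ./easyrsa build-ca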

The server certificate and key pair are created next.  When TLS is specified in your Couchbase cluster configuration, the Operator will use TLS to communicate with the cluster.  This prevents any passwords or sensitive data from being transmitted in plain text. To support Kubernetes’ private DNS names we need another DNS wildcard subject alternative name.  The nopass option must also be specified so that the private key is not encrypted and can be read by Couchbase Server. The following command will prompt for a password; this is the CA private key’s password, used to digitally sign the certificate.
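A sketch of the invocation, using the public wildcard derived earlier; the in-cluster wildcard shown (*.couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.svc) is my recollection of the form the Operator documentation asks for, so verify it before use:

    ./easyrsa --subject-alt-name="DNS:*.couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk,DNS:*.couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.svc" \
        build-server-full couchbase-server nopass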

You can verify that the certificate is as expected by examining it in OpenSSL:
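For example, using the EasyRSA 3 default output path:

    openssl x509 -in pki/issued/couchbase-server.crt -noout -text | grep -A 1 'Subject Alternative Name'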

EasyRSA creates private keys in the modern PKCS#8 format; however, Couchbase Server only supports PKCS#1.  To remedy this we need to convert between the formats.
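The conversion can be done with openssl rsa, which writes out a traditional PKCS#1 key (the output name pkey.key is my choice, reused below when building the server secret):

    openssl rsa -in pki/private/couchbase-server.key -out pkey.key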

Now that TLS is configured, collect your CA certificate and server certificate/private key pair, as they will be needed when configuring your Couchbase cluster in a later step.

DDNS Setup

Now we can start deploying some actual Kubernetes resources.  First up, let’s create our namespace for the external-dns controller to run in and a service account to run as.
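As a minimal sketch (the service account name external-dns is my choice, reused by the role binding and deployment below):

    kubectl create namespace 6c3c0075-b44a-11e9-9518-4a8d7629c69a
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a create serviceaccount external-dns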

A role is required to grant permission for the external-dns controller to interrogate Kubernetes resources in the namespace it is running in.  The role is bound to the service account that the external-dns controller will run as. I will use a cluster role in this example so that it can be shared between all instances of the external-dns controller; it will be bound within the namespace, however, as the controller does not need access to all namespaces. OpenShift users: you will need admin privileges for role creation and binding, as they require privilege escalation and, for security reasons, cannot be performed by normal users.  The role looks like the following:
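(Sketched here from the stock external-dns RBAC example, trimmed to the service source used in this setup, and saved as, say, external-dns-role.yaml.)

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: external-dns
    rules:
    - apiGroups: [""]
      resources: ["services", "endpoints", "pods"]
      verbs: ["get", "watch", "list"]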

And is installed with the following:
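Assuming the file and service account names above:

    kubectl create -f external-dns-role.yaml
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a create rolebinding external-dns \
        --clusterrole=external-dns \
        --serviceaccount=6c3c0075-b44a-11e9-9518-4a8d7629c69a:external-dns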

The final step is to install the external-dns controller.  We will configure it to look for services within the namespace.  If a service has an external-dns.alpha.kubernetes.io/hostname annotation, the external-dns controller will create DNS A records with our DDNS provider, mapping that host name to the service’s IP address.

It is possible that multiple instances of external-dns are synchronizing DNS records to the same domain.  If a controller sees a record that doesn’t correspond to a service it is managing, it will delete it. To prevent two or more controllers from continuously adding their own records and deleting each other’s, we add a GUID so that each controller only responds to records it owns.  For the curious, ownership is managed through DNS TXT records. The deployment YAML looks like the following; you should substitute your own Cloudflare API key and email address in the environment parameters.
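(Sketched here from the stock external-dns example for the Cloudflare provider; the image reference, the reuse of the namespace GUID as the TXT owner ID, and the placeholder credentials are my assumptions to illustrate the shape.)

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: external-dns
      namespace: 6c3c0075-b44a-11e9-9518-4a8d7629c69a
    spec:
      selector:
        matchLabels:
          app: external-dns
      template:
        metadata:
          labels:
            app: external-dns
        spec:
          serviceAccountName: external-dns
          containers:
          - name: external-dns
            image: registry.opensource.zalan.do/teapot/external-dns:latest
            args:
            - --source=service
            - --namespace=6c3c0075-b44a-11e9-9518-4a8d7629c69a
            - --domain-filter=spjmurray.co.uk
            - --provider=cloudflare
            - --registry=txt
            - --txt-owner-id=6c3c0075-b44a-11e9-9518-4a8d7629c69a
            env:
            - name: CF_API_KEY
              value: "<your Cloudflare API key>"
            - name: CF_API_EMAIL
              value: "<your Cloudflare email address>"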

This can be created with the following:
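Assuming the manifest above was saved as external-dns-deployment.yaml:

    kubectl create -f external-dns-deployment.yaml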

Check that the deployment is running; once it is, we are ready to install our Couchbase cluster.
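For example:

    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a get deployment external-dns
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a logs deployment/external-dns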

Install the Operator

This is covered extensively in the official documentation.  First, you will need to install the custom resource definitions.  Then install the dynamic admission controller into a namespace of your choice and connect it to the Kubernetes API.
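As a sketch only, with illustrative file names from the 1.2.0 download package (the official documentation is authoritative):

    # Custom resource definitions (cluster scoped)
    kubectl create -f crd.yaml
    # Dynamic admission controller, in a namespace of your choice
    kubectl create namespace admission
    kubectl -n admission create -f admission.yaml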

The admission controller is a required component of the Operator 1.2.0 deployment.  It applies default values to the cluster and, most importantly, performs validation beyond the scope of native JSON schema validation.  The most important validation it performs for this setup is ensuring that DNS and TLS are configured correctly in your Couchbase cluster definition.

The Operator is installed into the same namespace as the external-dns controller, using a very similar process.
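Again as a sketch, with illustrative file names from the download package:

    NAMESPACE=6c3c0075-b44a-11e9-9518-4a8d7629c69a
    kubectl -n ${NAMESPACE} create -f operator-service-account.yaml
    kubectl -n ${NAMESPACE} create -f operator-role.yaml
    kubectl -n ${NAMESPACE} create -f operator-role-binding.yaml
    kubectl -n ${NAMESPACE} create -f operator-deployment.yaml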

Public Couchbase Cluster

The final step is actually the easiest.  Here’s the YAML definition:
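What follows is a trimmed sketch of a 1.2.0 CouchbaseCluster resource rather than a complete manifest; the field names reflect my reading of the 1.2.0 schema, the couchbase-auth secret is assumed to exist, and the admission controller will point out anything required that has been left out.

    apiVersion: couchbase.com/v1
    kind: CouchbaseCluster
    metadata:
      name: couchbase
      namespace: 6c3c0075-b44a-11e9-9518-4a8d7629c69a
    spec:
      baseImage: couchbase/server
      version: enterprise-6.0.1
      # Assumes a basic auth secret, e.g.:
      #   kubectl create secret generic couchbase-auth \
      #     --from-literal=username=Administrator --from-literal=password=password
      authSecret: couchbase-auth
      # Expose the admin console and per-pod services publicly
      exposeAdminConsole: true
      adminConsoleServiceType: LoadBalancer
      exposedFeatures:
      - xdcr
      - client
      exposedFeatureServiceType: LoadBalancer
      # Public DNS domain (check the documentation for whether the
      # cluster name is prepended automatically)
      dns:
        domain: couchbase.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk
      # TLS secrets created in the TLS section
      tls:
        static:
          serverSecret: couchbase-server-tls
          operatorSecret: couchbase-operator-tls
      cluster:
        dataServiceMemoryQuota: 256
        indexServiceMemoryQuota: 256
        searchServiceMemoryQuota: 256
        indexStorageSetting: memory_optimized
        autoFailoverTimeout: 120
      buckets:
      - name: default
        type: couchbase
        memoryQuota: 128
      servers:
      - name: all_services
        size: 3
        services:
        - data
        - index
        - query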

The admin console and exposed features (per-pod services) are exposed with new parameters that allow the service type to be specified.  On this occasion I’m running in GKE, where a LoadBalancer service gets a public IP address associated with it when it is created.

The new DNS setting, when specified, will annotate the admin console and per-pod services with the annotations understood by the external-dns controller.  For the admin console, for example, this is console.${metadata.name}.${spec.dns.domain}.

Finally, as we are using public connectivity and DNS, the dynamic admission controller will force us to use TLS.  The TLS parameters are populated with secrets containing the TLS certificates we created earlier for this cluster.
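As a sketch, the secrets can be created from the files generated in the TLS section; the key names (chain.pem, pkey.key, ca.crt) are the ones I recall the 1.2.0 Operator expecting, so confirm them against the TLS documentation:

    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a create secret generic couchbase-server-tls \
        --from-file=chain.pem=pki/issued/couchbase-server.crt \
        --from-file=pkey.key=pkey.key
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a create secret generic couchbase-operator-tls \
        --from-file=ca.crt=pki/ca.crt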

Create the cluster and watch the status or the Operator logs for completion; the commands below show one way to do this.  Once load balancer IPs have been allocated and DNS records added, you should be able to connect to the console at the URL https://console.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk:18091/.  You can use this same address to establish XDCR remote clusters and to bootstrap Couchbase client SDKs.  Congratulations, you have enabled public connectivity!
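For example, assuming the cluster manifest was saved as couchbase-cluster.yaml and the Operator deployment is named couchbase-operator:

    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a create -f couchbase-cluster.yaml
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a get couchbaseclusters couchbase -o yaml
    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a logs -f deployment/couchbase-operator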

Troubleshooting

Merely explaining how to configure public connectivity is only half the job.  You also need to be able to determine where a problem lies before raising support cases.  Given that it’s always the network’s fault (well, mostly), here are some tips to help you.

DNS is not instantaneous: it takes time for records to appear, and it takes time for modifications to propagate as TTLs expire.  To check that DNS is behaving as expected, first look up the expected DNS names. Find the service names:
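For example:

    kubectl -n 6c3c0075-b44a-11e9-9518-4a8d7629c69a get services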

Look up the calculated DNS name:
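For example, for the console address used above:

    dig +short console.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk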

Does the DNS A record exist?  Does the IP address correspond to the service’s public IP address?

Next you need to be sure that the requested ports are listening.  We can check that the TLS-enabled admin port is listening and that we can establish a TCP session on that port:
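A quick probe with netcat, for example:

    nc -zv console.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk 18091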

The final thing to do is establish whether TLS is working as expected using the CA certificate:
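For example; a successful handshake here should report a verify return code of 0 (ok):

    openssl s_client -connect console.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk:18091 -CAfile pki/ca.crt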

Additionally, for the particularly brave, you can check that the DNS addresses passed to the clients are correct:
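One way is to query the node services REST endpoint (the credentials here are placeholders) and inspect the alternate addresses advertised to clients:

    curl --cacert pki/ca.crt -u Administrator:password \
        "https://console.6c3c0075-b44a-11e9-9518-4a8d7629c69a.spjmurray.co.uk:18091/pools/default/nodeServices"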

Next Steps

Couchbase Autonomous Operator 1.2.0 is a big release with many new features.  The main focuses are upgradability and ease of use. We hope you enjoy doing cool new things with it as much as we have enjoyed creating it.  As always your feedback is key!


Author

Posted by Simon Murray, Senior Software Engineer, Couchbase

Simon has almost 20 years’ experience in diverse topics such as systems programming, application performance and scale-out storage. The cloud is now his focus, specializing in enterprise network architecture, information security and platform orchestration across a wide range of technologies.
