No, I haven’t lost my mind. Unicorn is the internal code name for Couchbase Autonomous Operator 2.0.0. For a release this large in scope, we wanted something mythical and fantastical to sum up the sheer effort and passion that went into it. It’s also mildly amusing to give management a platform where they have to talk about unicorns and pixies! To underscore the difficulty of this release, we will focus on one key aspect of the Operator: Kubernetes custom resources. Custom resources underpin everything the Operator does.
This post is part of a series of technical blogs taking a deep dive into the inner workings of the Operator. This one focuses on getting from A to B. Sounds simple, doesn’t it? (Or perhaps just vague.) As you will see, it was anything but. We shall define what we mean by this, examine the technical challenges, and finally arrive at why we chose to do what we did.
From the development team to you–the end user–we hope you have as good an experience using it as we have had creating it.
A to B with CRDs?
So what do we mean by this? Let’s first look at why we need a B in the first place. Operator 1.x manages Couchbase clusters with a single configuration object–a custom resource, in Kubernetes terminology. This approach has benefits–everything is controlled in a centralized manner–but it also has drawbacks–to make any change to the cluster, you must have access to every aspect of it.
I like the simplicity of this approach, and I tell the team all the time:
However, we are providing an enterprise service first and foremost. Security needs to be a prime concern–from the networking layer right up to the business processes that exist during a cluster’s lifetime.
Separation of Concerns
A clear business requirement was that a user should be able to create a private data bucket and use it. That same user should not be given the ability to do anything else. Being able to scale the cluster should not be allowed–this may cause disruption to other users. Our goals were therefore:
- Define roles within Kubernetes
- Users can manage buckets and process data without affecting other users
- Administrators can modify the cluster topology as needs demand
As well as providing strong security guarantees, roles separate concerns. End users need only know about manipulating documents and running N1QL queries. Administrators’ primary concerns are security compliance, resource utilization, and cost. There is, however, a trade-off: limiting knowledge to specific domains also erects barriers between those domains. The solution needs to be flexible enough to let you choose where any barriers exist.
Custom Resource Decomposition
The key to meeting our goals is Kubernetes RBAC, which allows specific users to perform specific actions on specific resource types. We move towards our goal by splitting our monolithic Couchbase cluster resource into separate core cluster and bucket resource types. Each resource type can then have its own set of users who are allowed to use it.
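As an illustrative sketch of this model, a Kubernetes Role can grant a user full control over bucket resources and nothing else. (The API group and resource names here are based on the Couchbase CRDs but are assumptions–check the Operator documentation for the exact names in your release.)

```yaml
# Role granting control over Couchbase bucket resources only;
# the user cannot touch the cluster resource itself.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bucket-user
  namespace: default
rules:
- apiGroups: ["couchbase.com"]          # assumed CRD API group
  resources: ["couchbasebuckets"]       # assumed resource name
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
# Bind the role to an individual end user.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bucket-user-binding
  namespace: default
subjects:
- kind: User
  name: jane                            # hypothetical end user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: bucket-user
  apiGroup: rbac.authorization.k8s.io
```

A separate Role granting access to `couchbaseclusters` would then be bound only to administrators, giving them–and only them–control over cluster topology.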
Another consideration we have to contend with is how the ability to configure one aspect of the cluster can compromise another. Operator 2.0 introduces Cross Data Center Replication (XDCR) management. It would be nice to allow bucket users to set up their own replication strategies, but in order to configure a replication you need to connect to a remote cluster. This requires credentials that are configured with Kubernetes secrets, and by granting a user access to secrets you reveal every password and TLS private key in the namespace.
We therefore consider any configuration requiring access to secrets as an “admin only” operation. This helps limit scope between knowledge domains as best we are able.
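To make the problem concrete: Kubernetes RBAC on secrets is namespace-wide for list operations, so a rule like the following sketch exposes far more than the one credential it was intended for.

```yaml
# Granting "list" on secrets is namespace-wide: this rule exposes
# every password and TLS private key in the namespace, not just
# the XDCR remote-cluster credential the user actually needs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list"]
```

This is why any configuration that references a secret is gated behind the administrator role rather than handed to bucket users.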
Custom Resource Aggregation
A Couchbase cluster’s configuration is now distributed across many different resources. These are controlled by many different people with specific roles. The Operator, however, needs to see the full picture in order to create and manage the cluster.
By default, and through no fault of its own, Operator 2.0 acts naively. When it starts monitoring a Couchbase cluster, it will also monitor any bucket resources it finds. A single logical cluster is created from the aggregation of these cluster and bucket resources. If you create another cluster, it will find and aggregate the same bucket resources. You may not want this behavior, and again Kubernetes provides a way out.
For each resource type that is aggregated, there is a corresponding label selector that can be configured. This label selector allows you to control exactly what resources are aggregated. Only resources with the matching labels will be selected by the Operator.
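As a sketch of how this looks in practice (the field names are based on the 2.0 API and may differ between releases), a cluster selects only the bucket resources carrying a matching label:

```yaml
apiVersion: couchbase.com/v2            # assumed API version
kind: CouchbaseCluster
metadata:
  name: production
spec:
  buckets:
    managed: true
    # Only bucket resources matching this selector are aggregated
    # into this cluster; unlabeled buckets are ignored.
    selector:
      matchLabels:
        cluster: production
---
apiVersion: couchbase.com/v2
kind: CouchbaseBucket
metadata:
  name: private-data
  labels:
    cluster: production                 # picked up by the selector above
```

A second cluster with a different selector would then aggregate a disjoint set of buckets, avoiding the accidental sharing described above.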
We now know that decomposing the Couchbase cluster resource is a good thing. While doing this is not without its faults, the benefits outweigh them.
Kubernetes provides primitives–RBAC and label selection–that help us achieve our goal. Our journey from A to B will be easy, right? Think again.
In a perfect world the Operator would run on every Kubernetes version available. This is probably true for future versions of Kubernetes that maintain backward compatibility. I say probably because extensions may be introduced that are incompatible with an application written before their release. For this reason, each Operator release has a window of certified Kubernetes versions.
Certified Kubernetes versions have an upper and lower limit, and those limits are out of our hands: we can only certify against versions the cloud vendors support. At the time of writing, Google Kubernetes Engine (GKE) only allows Kubernetes versions 1.14 and 1.15. The Operator is also certified on Amazon Elastic Kubernetes Service (EKS), Microsoft Azure Kubernetes Service (AKS), and Red Hat OpenShift Container Platform (OCP), so we are limited to the common set of Kubernetes versions supported across all of these platforms.
The lower bound for CAO 2.0.0 was Kubernetes 1.13.
Custom Resource Versioning
Kubernetes 1.13 introduced versioned custom resource definitions (CRDs). This allows Kubernetes to be aware of multiple different custom resource formats. This sounds perfect for our needs, but the devil is in the detail.
CRD versioning stores resources as a single version only. When we define a v1 and a v2 of a resource type, all resources are stored by Kubernetes as one version, e.g. v2. Custom resource conversion hooks provide a way to convert from the stored v2 to a v1 resource when a client accesses it with the v1 API. Conversion is designed to take one resource and return one resource, performing only simple operations. For the Operator, converting between a monolithic and a modular cluster configuration is a step too far.
Another concern was that in Kubernetes 1.13, CRD conversion was an alpha-grade feature. I’m a fan of anything that provides a more seamless user experience (UX). Users, however, are less enthusiastic about enabling alpha-grade features via intrusive Kubernetes API reconfiguration. In general we only begin using features once their APIs are generally available, usually at the beta stage.
The final problem was that CRD versioning in Kubernetes 1.13 only supported a single JSON schema. This means that, once installed, a v2 Couchbase cluster CRD could serve both v1 and v2 custom resource types, but any update to a v1 resource would fail because it would be validated against the single v2 schema.
It Never Rains, it Pours
On the face of it, it appears we are stuck: a good idea is impossible to achieve without starting over. We cannot use CRD conversion, as it’s unavailable and not designed for what we want. We cannot run different versions of the Operator concurrently, because only a single schema is supported. But I’m British:
If only a single version of a custom resource is supported, so be it–we just upgrade everything at once. That leaves the conversion process.
Are We Nearly There Yet?
Actually, yes. The key to success is simple: read in an old, v1 Couchbase cluster resource–this isn’t validated–then write out a new, v2 Couchbase cluster resource, plus any bucket resources it requires–these are validated.
As alluded to before, UX is very important, so we provide a tool to perform the conversion for you. As you would expect, this uses label selectors and plays it safe. No new features are enabled–every new feature is opt-in–so existing XDCR and Couchbase RBAC settings are not affected by the upgrade process.
We finally made it to B! Somehow, through adversity, we were able to go from a monolithic to a modular configuration model, and to define a near-enough seamless online upgrade process. Now the power is in your hands to explore the possibilities offered by CAO 2.0.
As always, you chose the directions we have taken, and you will choose the directions we take. We look forward to your feedback, suggestions, and–dare I say it–criticism!
It’s The Journey Not The Destination
I’ve spent a huge amount of time on technical documentation for this release (and I really hope you enjoy it). Rather than a dry–and overly technical–blog, I thought a change was in order, hence the endless analogy. What I’ve been striving to express are all the technical challenges and trade-offs we–as a team–face to deliver these milestones. What may seem like a simple new feature from a product marketing perspective is orders of magnitude more complex under the hood (bonnet).
By undertaking these complicated tasks on your behalf–and being open about it–I sincerely hope that you gain an appreciation of the work that goes in. I hope our story will resonate with others too. Kubernetes can only get better for everyone as these ideas are refined and incorporated. We still have a long way to go, but thank you for accompanying us on this journey.