The first critical vulnerability (CVE-2018-1002105) in the Kubernetes container orchestration platform was disclosed earlier this month. It affects all Kubernetes versions from 1.0.0 onward, as well as Red Hat OpenShift Container Platform versions from 3.0 onward.
In this post we look at the causes, mitigations and how this affects the Couchbase Autonomous Operator.
When connecting to Kubernetes pods the user will typically use the command:
kubectl exec -ti my-pod-name /bin/bash
For users of the Docker container management suite, this command will look very familiar. The command is first sent to the Kubernetes API server, which acts as a reverse proxy, forwarding the request to the Kubernetes agent running on the node where the requested pod resides. Finally, the agent connects directly to the Docker container.
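At the protocol level, the exec request is an ordinary API call that asks the API server to upgrade the connection and proxy it through to the node agent. A simplified sketch of the request (path and parameters abbreviated for illustration):

```
POST /api/v1/namespaces/default/pods/my-pod-name/exec?command=%2Fbin%2Fbash&stdin=true&stdout=true&tty=true HTTP/1.1
Connection: Upgrade
Upgrade: SPDY/3.1
```

It is this upgraded, proxied connection that the vulnerability leaves open when an error occurs.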
The vulnerability lies in this reverse proxy: when an error occurs with the remote exec command, the connection to the downstream service is left open. This, in turn, provides a way for a user to escalate their privileges to those of a Kubernetes cluster administrator. The user must, however, be permitted to perform the exec command by role-based access control in the first place.
With administrator-level privileges, the user is then able to connect to any container running on the compromised Kubernetes node. This allows access to any secrets (typically usernames, passwords, TLS private keys, etc.) and data volumes that are mounted in those containers. This is even more troubling if a container is running with elevated (root) privileges, or has the host file system mounted in read/write mode.
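To make the worst case concrete, a pod specified along the following lines (a hypothetical example, not something a well-configured cluster should run) would expose both its mounted secret and the entire host file system to anyone who can connect to it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: risky-pod
spec:
  containers:
  - name: app
    image: example/app:latest        # hypothetical image
    securityContext:
      privileged: true               # runs with root-level capabilities
    volumeMounts:
    - name: creds
      mountPath: /etc/creds          # secret material readable inside the container
    - name: host-root
      mountPath: /host               # host file system mounted read/write
  volumes:
  - name: creds
    secret:
      secretName: app-credentials    # hypothetical secret name
  - name: host-root
    hostPath:
      path: /
```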
A second variant relies on the same underlying reverse-proxy vulnerability. In the first variant, the user must be authenticated and have access rights to execute commands on containers before the privilege escalation can be performed; with the second variant, no authentication is required at all to gain administrative access. The scope of this variant is limited to the modification of service brokers: managed services that may be discovered and used by an application.
For the first variant of the vulnerability, you can prevent privilege escalation by removing your users' permissions on the pods/exec, pods/attach and pods/portforward subresources. These permissions are granted by default in most installations. They are, however, typically required for debugging and testing, so disabling access may not be the correct answer for your environment.
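If locking this down is acceptable in your environment, a role can be written that still allows pods to be viewed but omits the sensitive subresources entirely. A sketch (names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
# Note the absence of pods/exec, pods/attach and pods/portforward:
# users bound to this role cannot trigger the vulnerable code path.
```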
The best method of preventing both variants, therefore, is to upgrade the Kubernetes cluster to a version in which the vulnerability is patched. Patched versions begin at Kubernetes 1.10.11, 1.11.5 and 1.12.3.
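As a quick sanity check, a version string can be compared against the minimum patched release on its minor line using `sort -V`. The `is_patched` helper below is our own illustrative sketch, not part of kubectl; feed it the server version reported by your cluster:

```shell
# Report whether a given Kubernetes version includes the fix,
# based on the patched releases 1.10.11, 1.11.5 and 1.12.3.
is_patched() {
  v="$1"
  case "$v" in
    1.10.*)    min=1.10.11 ;;
    1.11.*)    min=1.11.5 ;;
    1.12.*)    min=1.12.3 ;;
    1.[0-9].*) echo "vulnerable (upgrade required)"; return ;;
    *)         echo "unknown"; return ;;
  esac
  # sort -V orders version strings numerically; if $min sorts first
  # (or the two are equal), $v is at or beyond the patched release.
  lowest=$(printf '%s\n%s\n' "$min" "$v" | sort -V | head -n1)
  if [ "$lowest" = "$min" ]; then
    echo "patched"
  else
    echo "vulnerable (upgrade required)"
  fi
}

is_patched 1.11.3   # → vulnerable (upgrade required)
is_patched 1.12.3   # → patched
```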
The Couchbase Autonomous Operator is supported on Kubernetes versions 1.11.0 onward, so we recommend upgrading to either of the applicable patched versions (1.11.5 or 1.12.3) at the earliest convenience. The Autonomous Operator itself is not affected in any way by these vulnerabilities or fixes; however, the Couchbase Server pods it creates may be subject to the first variant, compromising data integrity.
Autonomous Operator Considerations
While it is never nice to be subjected to a critical vulnerability, we prepare for such eventualities. Full Kubernetes upgrade instructions are provided in the Operator 1.1.0 documentation; however, we will discuss them in further detail in this post.
Because Couchbase is a stateful service, care needs to be taken when upgrading the Kubernetes cluster in order to maintain the data integrity of the Couchbase Data Platform.
A standard rolling upgrade, as performed by many Kubernetes management applications, upgrades each worker node in turn. To upgrade a single node, pods are first evicted (deleted); if a pod is part of an application such as a Deployment, the controller managing that application will recreate any evicted pods to restore the correct scale. After eviction, the Kubernetes agent can be upgraded and the underlying host rebooted, which is also a good time for operating system patches to be applied. Finally, the node can be added back into the cluster.
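Sketched as kubectl operations, one node's worth of the rolling upgrade looks roughly like this (the node name is a placeholder, and `drain` performs the eviction step described above):

```shell
# 1. Evict the pods and mark the node unschedulable.
kubectl drain node-1 --ignore-daemonsets --delete-local-data

# 2. Upgrade the Kubernetes agent and apply OS patches, rebooting
#    if required. (The exact commands depend on your distribution
#    and management tooling.)

# 3. Return the node to service so pods can be scheduled onto it again.
kubectl uncordon node-1
```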
For a stateless service, it only takes a few seconds to recreate a pod and add it back into the service pool. For a stateful application like Couchbase Server, this may take a lot longer.
What Happens During an Eviction?
Take, for example, a Couchbase cluster with 3 nodes, containing a single bucket with a single replica.
When the first Couchbase pod is evicted, the Autonomous Operator will detect that it is down and wait for failover to occur; failover restores the availability of your data from a replica. Once failover has taken place, the Autonomous Operator will create a new pod and begin rebalancing data across the Couchbase cluster, restoring the correct number of replicas. It is important to note that during this process you will need an additional Kubernetes node to accommodate the new pod.
Because we have lost one of three nodes, 66% of the documents in the bucket will be degraded, each left with only a single copy. To illustrate, the following table shows 6 documents, with the active (master) copies in bold and their replicas in plain text. If server 1 were evicted, documents A, B, D and E would each be left with a single copy.
| Server 1 | Server 2 | Server 3 |
|----------|----------|----------|
| **A**    | **C**    | **B**    |
| **D**    | **F**    | **E**    |
| B        | A        | C        |
| E        | D        | F        |
While the rebalance is occurring, there is nothing to stop the Kubernetes upgrade process from evicting another Couchbase pod. If this were allowed to happen, up to 33% of your data could be left with no copies at all and be lost. In the example above, evicting server 2 would leave documents A and D unrecoverable.
It is therefore critical to control how and when Kubernetes nodes are evicted. Once a Couchbase Server pod is evicted, you must allow the Autonomous Operator to create a replacement pod, and allow the data to be fully rebalanced, before upgrading the next Kubernetes node.
Autonomous Operator Best Practices
We recommend the use of pod anti-affinity when configuring your Couchbase cluster. This feature ensures that no two Couchbase Server pods from the same Couchbase cluster reside on the same Kubernetes node, so that a single node eviction can take down at most one pod, and therefore at most a single replica.
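In the CouchbaseCluster specification this is a single flag. A minimal fragment (the cluster name is illustrative; see the Operator 1.1 documentation for the full specification):

```yaml
apiVersion: couchbase.com/v1
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  antiAffinity: true   # no two pods of this cluster share a Kubernetes node
```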
We also highly recommend using persistent volumes. In our last post we described why we mandate the use of persistent volumes: to protect against data loss and to make the deployment supportable. A key benefit is that, in the examples given in the previous section, the data would not have been lost.
The Autonomous Operator can create a new pod that reuses the existing data volume. Rebalancing data across the cluster is then substantially faster, as only documents that have been updated since the failover need to be replicated onto the replacement pod.
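Persistent volumes are configured on the CouchbaseCluster resource by defining claim templates and referencing them from the server specification. A hedged fragment based on the Operator 1.1 documentation (storage class and sizes are illustrative):

```yaml
spec:
  servers:
  - name: data
    size: 3
    services:
    - data
    pod:
      volumeMounts:
        default: couchbase   # persist the default data path on this claim
  volumeClaimTemplates:
  - metadata:
      name: couchbase
    spec:
      storageClassName: standard
      resources:
        requests:
          storage: 10Gi
```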
Even with persistent volumes enabled, it is still possible that multiple pods may be evicted in quick succession. This may lead to certain documents being temporarily unavailable until the Autonomous Operator can restore the data.
Looking To The Future
We want the whole Kubernetes upgrade experience to be seamless, precisely because bugs will continue to be found in the Kubernetes platform and the operating system, and these will need to be patched to keep your applications stable and secure.
While persistent volumes are part of the answer, they are not a complete solution; there are still situations where documents may become temporarily unavailable. To address this, we are happy to announce that Autonomous Operator 1.2.0 will feature enhancements that limit any disruption to your documents to the duration of failover only.
By leveraging Kubernetes pod disruption budgets, we can control precisely how many pods the Autonomous Operator will allow to be evicted at any one time. This prevents multiple data replicas from being evicted at the same time.
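A pod disruption budget of the kind the Operator will manage can be sketched as follows (the label selector is illustrative). With `maxUnavailable: 1`, `kubectl drain` will refuse to evict a second Couchbase pod until the first is back and reporting ready:

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: cb-example
spec:
  maxUnavailable: 1        # at most one Couchbase pod down at any time
  selector:
    matchLabels:
      couchbase_cluster: cb-example   # illustrative label
```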
Further to this, the pod readiness checks used by the disruption budgets have been enhanced to report not just whether a Couchbase Server pod is alive, but whether all data is fully replicated across the cluster. This allows the Autonomous Operator to control precisely when the next eviction is allowed to occur.
The Couchbase Autonomous Operator is designed to be more than a product that simplifies deployment and management of the Couchbase Data Platform. We take the broader view that operating on managed public cloud platforms brings its own unique challenges, and the team is continuously working to protect your data and provide high availability, both in situations under our control and those outside it.