
How to Clone Couchbase Clusters for CI/CD On-Demand Ephemeral Environments

Continuous Integration and Continuous Deployment are now common software development practices. In the world of databases, this translates into a need for on-demand, stateful, ephemeral environments.

Provisioning a stateless environment is not tied to any particular source of data. All that is needed is to run the code you want to test in your CI environment. This is the basis of most CI/CD tools and won’t be covered in this article. 

The slightly harder part comes from the dependencies the application needs in order to be tested properly, often referred to as external services, Couchbase being one of them. There are different ways to get those: through Docker containers, hosted in your test infrastructure, or via an external as-a-Service solution. It does not really matter which, as long as they are available while running your tests. A good practice is to use environment variables to refer to those instances, as in the illustration below.
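For instance, in a Nushell-based CI step, that could look like the following. The variable names are purely illustrative, not a convention of any particular tool:

```nu
# Illustrative only: expose the test instance through environment variables
# so the test code never hard-codes a connection string or credentials.
$env.COUCHBASE_CONNSTR = "couchbase://127.0.0.1"
$env.COUCHBASE_USERNAME = "Administrator"
$env.COUCHBASE_PASSWORD = "password"
```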

Assuming these services are running, like a Couchbase Free Tier instance or a Docker container, the next step is to make sure that they are configured correctly, and seeded with the data needed for the test.

A while ago, I posted about using Couchbase Shell in GitHub Actions. That post covers the basics of using Couchbase Shell with GitHub Actions, but the approach applies to most CI/CD solutions as well. Today, I want to go further and show you some useful scripts to clone a cluster, or elements of a cluster, for your on-demand environments.

Using Couchbase Shell to clone environments

When using Couchbase Shell, the first question that comes to mind when wanting to do something is: is there a function for that? As of now, we don't have a function to clone anything. Most of the available functions reflect our APIs' capabilities, and we have no cloning API today. But we have the ability to write scripts, which means we can make our own!

When managing databases, the first need that comes to mind is often to recreate the structure and schemas. As Couchbase is schemaless, this only consists of the buckets, scopes, collections, and indexes that exist in the source cluster. The first step is to export that structure so it can be reimported later. The function below lists every bucket, then the scopes and collections inside each one, and adds them to an array. It then lists all indexes and adds them to the output JSON.
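Here is a minimal sketch of such a function. It relies on the cbsh buckets, scopes, collections, and query indexes commands; the --bucket/--scope/--definitions flags and the column names (name, scope, collection, definition) are assumptions to double-check with help <command> in your cbsh version:

```nu
# Export buckets, scopes, collections, and index definitions to a JSON file.
def cluster-export [filename: string] {
    let bucket_list = (buckets | get name | each { |bucket|
        {
            name: $bucket,
            scopes: (scopes --bucket $bucket | get scope | each { |scope|
                {
                    name: $scope,
                    collections: (collections --bucket $bucket --scope $scope | get collection)
                }
            })
        }
    })
    # Keep the index definitions as SQL++ statements so they can be replayed later
    let index_list = (query indexes --definitions | get definition)
    { buckets: $bucket_list, indexes: $index_list } | save $filename
}
```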

This works because, under the hood, Couchbase Shell uses Nushell, a new type of shell that is portable (meaning it works the same way on Linux, Windows, or OS X, which is great for CI/CD scripts that have to support different operating systems) and that treats any structured data as a dataframe, making the manipulation of JSON extremely easy.

To try it out, run cbsh, then source the file containing the function; for me it's ci_scripts.nu. I already have a cluster configured in my cbsh config, called local:
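Something along these lines, assuming cb-env cluster is how you switch the active cluster in your cbsh version:

```nu
> source ci_scripts.nu
> cb-env cluster local
> cluster-export local-cluster-export.json
```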

Now if you open local-cluster-export.json, you will get the structure of your cluster:
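With the sketch above, the export would look something like this (shown here for the travel-sample bucket only, with index definitions trimmed for brevity):

```json
{
  "buckets": [
    {
      "name": "travel-sample",
      "scopes": [
        { "name": "_default", "collections": ["_default"] },
        { "name": "inventory", "collections": ["airline", "airport", "hotel", "landmark", "route"] }
      ]
    }
  ],
  "indexes": []
}
```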

For the purpose of this test, I have deleted the travel-sample bucket so it can be reimported later: buckets drop travel-sample.

The next logical step is to have a function that takes this file as input and recreates the complete structure in another cluster:
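A sketch of the matching import function; again, the exact cbsh signatures (the RAM quota argument of buckets create, the --bucket/--scope flags) are assumptions to verify with help:

```nu
# Recreate buckets, scopes, collections, and indexes from an export file.
def cluster-import [filename: string] {
    let export = (open $filename)
    $export.buckets | each { |bucket|
        # 256 MiB RAM quota as a placeholder; the original quota is not part of the export
        buckets create $bucket.name 256
        # Note: a fuller script would skip the _default scope and collection, which already exist
        $bucket.scopes | each { |scope|
            scopes create $scope.name --bucket $bucket.name
            $scope.collections | each { |collection|
                collections create $collection --bucket $bucket.name --scope $scope.name
            }
        }
    }
    # Replay the captured index definitions as SQL++ statements
    $export.indexes | each { |idx| query $idx }
}
```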

Now to run that function:
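For instance, pointing cbsh at a destination cluster (here a hypothetical remote identifier from the cbsh config) and replaying the export:

```nu
> cb-env cluster remote
> cluster-import local-cluster-export.json
```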

And there you have it: functions that allow you to export and import the data structure from one cluster to another. While this is a good starting point, there are still open questions about how to reimport the data itself, and about granularity. Also, you may not want to export and import a complete cluster.

Filtering buckets to import is fairly easy as Nushell allows you to filter dataframes:
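For example, the following keeps only the travel-sample bucket from the export, along with its indexes (matched here with a simple string filter on the definitions):

```nu
let export = (open local-cluster-export.json)
{
    buckets: ($export.buckets | where name == "travel-sample"),
    indexes: ($export.indexes | where $it =~ "travel-sample")
} | save travel-sample-export.json
```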

This will recreate a JSON object containing only a bucket named travel-sample and indexes for this bucket.

From there you should be all set to manage basic cluster structure. What about the data? There are different ways you can import data with cbsh, as it covers most key/value operations as well as any INSERT/UPSERT queries. And then we have the doc import command. Its usage is fairly straightforward: all you need is a list of rows with a field that identifies the document id. This can be anything that can be turned into a dataframe by Nushell (XML, CSV, TSV, Parquet, and more). And of course, it can be a JSON file produced by a Couchbase SQL++ query. This is an example that saves a query result to a file and imports that file back into a collection:
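A possible sketch, using the travel-sample inventory.hotel collection; the --id-column flag name and the remote cluster identifier are assumptions to check against help doc import and your own cbsh config:

```nu
# Export the query result to a JSON file, exposing the document key as an `id` column
query "SELECT META(h).id, h.* FROM `travel-sample`.inventory.hotel AS h" | save hotels.json

# Point cbsh at the destination collection and import, using `id` as the document key
cb-env cluster remote
cb-env bucket travel-sample
cb-env scope inventory
cb-env collection hotel
doc import hotels.json --id-column id
```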


That's one particular example, but the whole point of using a scripting language is to make these scripts your own. You will find a more complete example in this GitHub Gist. It supports environment variables for the source and destination, and lets you decide whether to clone all buckets of a cluster, or a specific bucket, scope, or collection.

Don't hesitate to drop us a comment here or on Discord; we are always looking for suggestions to improve the overall Couchbase experience.


Author

Posted by Laurent Doguin

Laurent is a nerdy metal head who lives in Paris. He mostly writes code in Java and structured text in AsciiDoc, and often talks about data, reactive programming and other buzzwordy stuff. He is also a former Developer Advocate for Clever Cloud and Nuxeo where he devoted his time and expertise to helping those communities grow bigger and stronger. He now runs Developer Relations at Couchbase.
