Couchbase introduced its fully managed Capella Database-as-a-Service (DBaaS) offering on Amazon Web Services (AWS) in 2021, and more recently on Google Cloud Platform (GCP). Customers no longer need to worry about the day-to-day management and maintenance of their Couchbase clusters. With Capella you can benefit from a faster, easier and more affordable NoSQL database while still using SQL to query your data. But how do you take advantage of Couchbase Capella if your data is stored in a legacy relational database? This is where MOLO17, a long-standing Couchbase partner, helps customers transition smoothly from the old to the new.

Data Migration from RDBMS to NoSQL

Moving data between data stores can be complex and time-consuming to set up. Customers often need to develop ETL data pipelines using expensive data integration tools, and most of these tools do not support all the capabilities needed to transition to a modern NoSQL database. Migrating from a relational database management system (RDBMS) to a modern database is not a one-time event in which users switch over overnight. Rather, it is a gradual process in which the existing RDBMS needs to coexist with the new NoSQL database for a period of weeks, months, or even years. During this transition period it is critical to synchronize data between the databases: changes in the RDBMS need to be reflected in the NoSQL database, and vice versa, in near real-time.

Document databases like Couchbase provide flexibility in how you store data in JSON documents. These JSON documents can then be organized in scopes and collections. When moving a relational database to Couchbase you might move data from multiple tables into a single JSON document to optimize query performance. However, when upserts or deletions are taking place in these JSON documents, it will be critical to synchronize the appropriate tables back in the relational database. This level of complexity, which has been challenging for many data integration tools, can now be addressed with MOLO17’s GlueSync data replication platform.
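
To make the idea of combining tables concrete, here is a minimal Python sketch that folds rows from two relational tables into one JSON document. The table names (customers, orders), the column names, and the build_customer_document helper are all hypothetical and chosen only for illustration; this is not GlueSync's implementation or its output format.

```python
import json

# Hypothetical rows, as they might come back from two relational tables.
customers = [
    {"customer_id": 1, "name": "Ada Lovelace", "country": "UK"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.5},
    {"order_id": 102, "customer_id": 2, "total": 12.0},
]


def build_customer_document(customer, all_orders):
    """Embed a customer's orders inside a single customer document."""
    embedded = [
        {"order_id": o["order_id"], "total": o["total"]}
        for o in all_orders
        if o["customer_id"] == customer["customer_id"]
    ]
    return {
        "type": "customer",
        "customer_id": customer["customer_id"],
        "name": customer["name"],
        "country": customer["country"],
        "orders": embedded,
    }


doc = build_customer_document(customers[0], orders)
print(json.dumps(doc, indent=2))
```

Note that a change to one embedded order in such a document corresponds to a row in the orders table, not the customers table, which is why writing changes back to the RDBMS requires knowing which table each part of the document came from.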

Data replication diagram for GlueSync

Data Replication with GlueSync

Let’s break down the different data replication options available with GlueSync. Assuming your data is currently stored in one of the major relational databases such as Microsoft SQL Server, Oracle, IBM Db2, MySQL, PostgreSQL, Sybase and others, your options are:

    1. In the simplest case, you want to migrate data to Couchbase Capella in a single event, then only use Capella for all your data needs. GlueSync can achieve this using a one-time snapshot replication. You just need to identify the source tables and columns to replicate, and the target JSON document structure.
    2. More typically, after the data has been replicated to Capella in a single event, all the changes will happen in Capella. However, you also need to move those changes from Capella back to the RDBMS until you are confident in your new Capella environment and decide to decommission your old database. GlueSync can capture the data changes as they occur in Capella using the native Couchbase Eventing service and replicate them in real-time to the RDBMS, taking advantage of Couchbase's multi-dimensional scalability and performance.
    3. Often, the original database applications are still in daily use, so new data regularly arrives in your existing RDBMS, and it is critical to get that data into Capella in near real-time. GlueSync supports this approach by first replicating all the identified data in the source database(s) using a one-time snapshot replication. At the same time, GlueSync starts monitoring data changes and replicates them to Capella as they occur in the RDBMS, using change data capture (CDC). As in case 2, but in the opposite direction, GlueSync replicates only the changed data, which limits the load on both source and target databases while keeping the data current in near real-time.
    4. Finally, GlueSync can also manage the most complex use case, in which changes occur in both the original RDBMS and Capella and the two databases must always stay synchronized. This combines cases 2 and 3 above. After performing an initial snapshot to copy data from the RDBMS to Capella, GlueSync uses CDC to replicate new data to Capella while using the Couchbase Eventing service to capture changes in Capella and propagate them back to the RDBMS.

To summarize, GlueSync can replicate entire datasets or a subset of data in a single event (snapshot), and it can also replicate only the changed data (CDC) for optimal performance. If data updated in Couchbase must be moved back to the RDBMS, GlueSync can keep both data systems, the RDBMS and Couchbase, up to date with two-way replication. Having GlueSync take care of this critical and highly specialized task allows you to focus on your core business, leaving the job of moving the data back and forth to GlueSync.
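
The difference between a snapshot and change-only replication can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of GlueSync's internals: the in-memory dictionary stands in for the target database, and the shape of the change events (the op, row, and id keys) is hypothetical.

```python
def take_snapshot(source_rows):
    """Copy every source row into the target, keyed by primary key."""
    return {row["id"]: dict(row) for row in source_rows}


def apply_change(target, event):
    """Apply one change event (insert, update, or delete) to the target."""
    op = event["op"]
    if op in ("insert", "update"):
        # Upsert: the same write works for new and existing keys.
        target[event["row"]["id"]] = dict(event["row"])
    elif op == "delete":
        target.pop(event["id"], None)
    else:
        raise ValueError(f"unknown operation: {op}")


# Step 1: one-time snapshot of the whole source table.
source_rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
target = take_snapshot(source_rows)

# Step 2: afterwards, only the changes are shipped (hypothetical event shape).
change_log = [
    {"op": "update", "row": {"id": 1, "name": "alpha-v2"}},
    {"op": "insert", "row": {"id": 3, "name": "gamma"}},
    {"op": "delete", "id": 2},
]
for event in change_log:
    apply_change(target, event)

print(target)
```

After the snapshot, the amount of work depends on the number of changes rather than on the size of the table, which is the reason change-only replication puts less load on both databases.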

Connecting GlueSync to Couchbase and Relational Databases

Installation and Configuration

GlueSync’s cloud native, containerized architecture makes installation and setup a snap. GlueSync is distributed as a Docker containerized app that is at its best when deployed in Kubernetes. You set source database and Couchbase target connection parameters in a simple JSON configuration file which is then used to run the containerized application within your environment. The JSON configuration file also identifies tables and objects to replicate and contains optimization parameters.

Data Modeling

Data replication often involves selecting a subset of relational data to replicate: perhaps only certain fields in a table are needed, or values from different fields must be combined in the JSON output. GlueSync supports data modeling on-the-fly. When you set up your replication parameters in the configuration file, you can identify fields to skip or rename, and define SQL query statements to aggregate and map the data into an output structure that is then transformed into a JSON document.
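
As a rough illustration of what skipping, renaming, and combining fields means for a single row, consider the Python sketch below. The transform_row function, its parameters, and the column names are hypothetical; GlueSync expresses these choices in its configuration file, whose actual syntax is not shown here.

```python
def transform_row(row, skip=(), rename=None, derive=None):
    """Build a JSON-ready dict from one relational row.

    skip   -- column names to leave out of the document
    rename -- mapping of column name to document field name
    derive -- mapping of new field name to a function of the original row
    """
    rename = rename or {}
    derive = derive or {}
    doc = {
        rename.get(column, column): value
        for column, value in row.items()
        if column not in skip
    }
    for field, compute in derive.items():
        doc[field] = compute(row)
    return doc


# Hypothetical source row with legacy column names.
row = {"CUST_ID": 7, "FIRST_NM": "Grace", "LAST_NM": "Hopper", "SSN": "000-00-0000"}

doc = transform_row(
    row,
    skip={"SSN"},
    rename={"CUST_ID": "id", "FIRST_NM": "firstName", "LAST_NM": "lastName"},
    derive={"fullName": lambda r: f"{r['FIRST_NM']} {r['LAST_NM']}"},
)
print(doc)
```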

Database Connections

To connect to the relational database containing the data to replicate (the source database), you will need a JDBC driver, usually provided by the database vendor. GlueSync uses the Couchbase Java SDK to connect to Couchbase. The MOLO17 engineering team worked jointly with the Couchbase engineering team to achieve best-in-class native integration with Capella and with Couchbase Server. GlueSync always uses the latest Couchbase SDKs to avoid deprecation or incompatibility issues.

Whether you are working with Couchbase Capella, Server or Mobile, GlueSync supports native Couchbase technologies such as Eventing, App Services and Sync Gateway to replicate data between RDBMS or NoSQL databases and Couchbase.

Benefits of GlueSync

If your company is going through a data modernization or application modernization process and has decided to adopt Couchbase Capella as its DBaaS data platform, look at MOLO17 GlueSync to replicate data easily and securely from existing relational databases to Capella and vice versa. Here are the main benefits GlueSync will provide:

Improved Data Availability
GlueSync will create a reliable and secure pipeline to convey data coming from your still-relevant relational databases to the new and strategically important Capella platform. Once GlueSync is properly set up and configured, you can forget about it! It will do its job behind the scenes, providing you with consistent, reliable data where you need it.

Increased Overall Performance
Efficiently moving your data to a high-performance platform such as Couchbase Capella with MOLO17 GlueSync will allow you to properly scale your enterprise applications for optimal business results. GlueSync's resilient, low-latency and fault-tolerant design is built to sustain performance when moving your data from on-premises systems to the cloud.

De-Risked Replication Process
By entrusting the replication process to GlueSync rather than following a “do it yourself” approach, you will be able to rely on a product implemented and tested by MOLO17, a veteran of the data replication market that is always there to support you in your data replication journey.

Better Data Analytics
GlueSync will allow you to offload data from an overloaded RDBMS to a highly scalable database-as-a-service platform. This will empower distributed teams working on analytics to take full advantage of Couchbase Capella.

Lower TCO on the Overall Solution
Using GlueSync to move data from RDBMS to Capella means relying on products that are almost maintenance free. Additionally, you’ll have a solution with high availability and automated scaling where you can easily add, remove, or change nodes to meet your current needs with no application changes.

Free Trial

You can access a free evaluation license of MOLO17 GlueSync by completing this contact form (specifying you want to evaluate GlueSync).

For a free trial of Couchbase Capella, please visit: https://cloud.couchbase.com/sign-up

 

Author

Posted by Giacomo Lorenzin

Giacomo is Executive Vice President, Head of North America Operations at MOLO17.
