One request I found frequently in customers is about moving data between Oracle and Couchbase. Oracle is a widely deployed relational database, and customers moving to Couchbase have this starting point challenge. How do we load my Couchbase database from the relational data in Oracle?
First thing to consider is that we are talking about different kind of database. Oracle database hold data organized in rows in tables. Each table has a fixed column set, and each row in a column is uniquely determined by a set of fields: the primary key. A database instance has many different tables.
On the other hand, Couchbase is a general purpose Document database. Couchbase follows a key-value approach where the key is a single String, and the value has no specific format. Usually, we use Couchbase as a Document store, where the value has a JSON format. In this way, there are some advanced features we can use, like global secondary indexes, views and N1QL, a SQL-like language for data access.
In Couchbase, there is not a fixed schema for holding data. Each JSON Document can have different attributes. In case of a new kind of data coming in, you do not need to modify the schema or create a new table, like in the relational world. This flexibility is one of the reasons customers are moving to Couchbase.
Another big difference between relational word and document databases are about data normalization. Relational databases general rule is to organize the columns (attributes) and tables of a relational database to minimize data redundancy. Data modeling in Couchbase, on the other hand, focus on maximizing performance on data access, so denormalize data is a common practice. For this exercise we will use the following conventions:
- Each JSON Document will include a root-element equal to the name of the table
- All the attribute names will be lower-case
- Couchbase is a key-value document database. The format of the key will be derived from the relational primary key as follows:
[table name in lower case]::[value of field1 of the PK]::[value of field 2 of the PK]:: ...
- Numeric, boolean and text data types will be preserved in the transformation
- Oracle Date and Timestamps types will be stored in Couchbase as milliseconds since January 1, 1970, 00:00:00 GMT
So now it is time to write some code to do the magic … Don’t panic. I did it for you.
We will use the oracle2couchbase tool available here (Disclaimer: this is an unsupported tool!). Feel free to spent some time looking the code. Not too complex.
You will find instructions on how to run in your environment. As example data from Oracle, let us use the HR model provided with any Oracle distribution as an example schema. We will move that to a Couchbase bucket named HR.
The command line looks as follows:
java -cp ./oracle2couchbase.jar:./lib/couchbase-java-client-2.2.2.jar:./lib/couchbase-core-io-1.2.2.jar:./lib/rxjava-1.0.15.jar:./lib/ojdbc6.jar -DcbClusterAddress=couchbaseMachine -DcbBucketName=HR -DoraAddress=oracleMachine -DoraUser=HR -DoraPassword=oracle -DoraService=XE -DoraTables=COUNTRIES,DEPARTMENTS,EMPLOYEES,JOBS,JOB_HISTORY,LOCATIONS,REGIONS com.oracle2couchbase.Loader
Now we can access to our data. Let us use SQL for Oracle and N1QL for Couchbase.
First query: look for the locations whose city name starts with ‘B’
Couchbase Query Workbench
This was the same query syntax.
Let us do some more complex: Count for the name of locations by country. Include the country name by doing a JOIN with the related JSON document countries (or table, in the relational world!), and order by total, then on name. This example shows how relationships are maintained.
Couchbase Query Workbench
Very similar syntax, as you can see.
To finish, let me notice that this is just an example, a first step on how to load data from Oracle to Couchbase. A more proper data design may include denormalize some data by embedding referred data on JSON documents.
If you want to go for an in depth discussion, do not miss the following excellent content:
- Data Modeling in Document Databases for the RDBMS-Minded, by Cihan Biyikoglu
- Data modelling for N1QL, by Matthew Revell
- User Profile Store: Advanced Data Modeling Part 1, by Kirk Kirkconnell
- Moving SQL database content to Couchbase, by Laurent Doguin, for a similar approach for Postgres data sources