Couchbase Hadoop Connector

The Couchbase Hadoop Connector was developed in conjunction with Cloudera to give Hadoop users an easy way to move data back and forth between Couchbase and Hadoop. The two systems are commonly deployed alongside one another: Couchbase for operational data and Hadoop for data analytics.

This plugin allows you to connect to Couchbase Server to stream data into HDFS or Hive for processing with Hadoop. If you have used Sqoop before for imports and exports from other databases, the plugin should be straightforward to use because it follows a similar command-line argument structure.

Installation

After you download and extract the plugin, you will find a set of files to be moved into your Sqoop installation. Refer to the documentation included in the download archive.

Automatic Installation

Automatic installation uses the install.sh script that comes with the plugin download. The script takes one argument, the path to your Sqoop installation. Below is an example of how to use the script.

./install.sh path_to_sqoop_home

Manual Installation

Manual installation of the Couchbase plugin requires copying the files downloaded from Cloudera into your Sqoop installation. Below is a list of the files contained in the plugin and the directory in your Sqoop installation to copy each file to.

File                                     Installation location
couchbase-hadoop-plugin-<version>.jar    lib
jettison-1.1.jar                         lib
netty-<version>.Final.jar                lib
spymemcached-<version>.jar               lib
couchsqoop-config.xml                    conf
couchsqoop-manager                       conf/managers.
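
The copy steps in the list above can be sketched as a small shell function. This is only an illustration, not the plugin's own install.sh: the function name copy_plugin_files and its three arguments are assumptions made for this example. Shell globs stand in for the <version> placeholders, and the managers directory is passed in as an argument rather than hard-coded, so that its exact name can be taken from the documentation included in the download archive.

```shell
#!/bin/sh
# Illustrative sketch of the manual installation steps (hypothetical helper,
# not part of the plugin download).
#
# copy_plugin_files PLUGIN_DIR SQOOP_HOME MANAGERS_DIR
#   PLUGIN_DIR   - directory where the plugin archive was extracted
#   SQOOP_HOME   - root of the Sqoop installation
#   MANAGERS_DIR - managers directory, relative to SQOOP_HOME
#                  (assumption: supplied by the caller, see the plugin docs)
copy_plugin_files() {
    plugin_dir=$1
    sqoop_home=$2
    managers_dir=$3

    if [ -z "$plugin_dir" ] || [ -z "$sqoop_home" ] || [ -z "$managers_dir" ]; then
        echo "usage: copy_plugin_files PLUGIN_DIR SQOOP_HOME MANAGERS_DIR" >&2
        return 1
    fi

    # Create the target directories if they do not exist yet.
    mkdir -p "$sqoop_home/lib" "$sqoop_home/conf" \
             "$sqoop_home/$managers_dir" || return 1

    # Jar files go into lib. An unmatched glob stays a literal pattern,
    # which the -f check below reports as a missing file.
    for jar in "$plugin_dir"/couchbase-hadoop-plugin-*.jar \
               "$plugin_dir"/jettison-1.1.jar \
               "$plugin_dir"/netty-*.Final.jar \
               "$plugin_dir"/spymemcached-*.jar; do
        if [ ! -f "$jar" ]; then
            echo "missing file: $jar" >&2
            return 1
        fi
        cp "$jar" "$sqoop_home/lib/" || return 1
    done

    # The configuration file goes into conf, the manager file into the
    # managers directory.
    cp "$plugin_dir/couchsqoop-config.xml" "$sqoop_home/conf/" || return 1
    cp "$plugin_dir/couchsqoop-manager" "$sqoop_home/$managers_dir/" || return 1
}
```

After sourcing the function, it would be called with the extraction directory, the Sqoop home, and the managers directory as its three arguments. It stops at the first missing file, so a partial download is reported before the configuration files are copied.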