Suggestions for doing bulk transactions

Hi All,

I want to write a script that pulls some data from a bucket on one Couchbase cluster and imports it into the bucket of the same name on another cluster.

Not only this: using an attribute of the document pulled above, I want to pull more data (filtered by a condition in a WHERE clause) and import that as well. And I will have to repeat these steps for around 11-12 buckets.

So, could someone help me design the bulk operation here?
By bulk operation I mean that each step has to run multiple times, depending on the input size.

To make this concrete, here is an example.

Let's say I have 10 buckets, A through J.

  1. I will pull data from bucket A and import it into the same bucket on the target cluster.
  2. I will extract a field value from bucket A's document, use it to fetch another document from bucket B, and then import that into the target cluster (roughly as sketched below).
  3. I will have to do this for all the buckets, so I will probably need loops, error handling, debug logging, etc.
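
In code terms, steps 1-2 would look roughly like this (a Python SDK sketch; the hostnames, credentials, and the `b_ref` field name are placeholders I made up):

```python
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

# Connect to both clusters (hosts/credentials are placeholders).
source = Cluster("couchbase://source-host",
                 ClusterOptions(PasswordAuthenticator("user", "password")))
target = Cluster("couchbase://target-host",
                 ClusterOptions(PasswordAuthenticator("user", "password")))

bucket_a = source.bucket("A").default_collection()
bucket_b = source.bucket("B").default_collection()
target_a = target.bucket("A").default_collection()
target_b = target.bucket("B").default_collection()

# Step 1: pull a document from bucket A and import it on the target.
doc_a = bucket_a.get("some-key").content_as[dict]
target_a.upsert("some-key", doc_a)

# Step 2: use a field of the A document ("b_ref" is a made-up field
# name) to fetch the related document from bucket B, then import it.
b_key = doc_a["b_ref"]
doc_b = bucket_b.get(b_key).content_as[dict]
target_b.upsert(b_key, doc_b)
```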

If someone can offer suggestions around this, that would be really helpful.

As of now, I have only written commands like cbq and cbimport to do these tasks. But I want something that handles bulk data smoothly and can be easily understood and debugged by anyone else.

Consider XDCR replication.

Is it possible to filter the data from one bucket using another bucket's data in a WHERE clause?
Can you please suggest an example, if any?

The latest version of XDCR supports filtering: XDCR Advanced Filtering | Couchbase Docs
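
For example, a filter expression along the lines of `REGEXP_CONTAINS(META().id, '^user::')` (the key prefix here is made up; see the linked docs for the exact expression syntax) would replicate only the matching documents to the target cluster.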

You can also consider the SDKs to smooth things out and handle errors/debugging the way you want in a single program:

  1. Connect to the source cluster and the target cluster.
  2. For each bucket and each value, run a N1QL query to generate the document keys from the source cluster.
  3. Get the documents from the source cluster and write them to the target cluster.
  4. Repeat for each value and each bucket (see the sketch below).
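
A minimal Python SDK sketch of that loop (the hostnames, credentials, bucket list, and the `type = $v` condition are placeholders; adapt them to your data):

```python
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.exceptions import CouchbaseException, DocumentNotFoundException
from couchbase.options import ClusterOptions, QueryOptions

# Step 1: connect to the source and target clusters.
source = Cluster("couchbase://source-host",
                 ClusterOptions(PasswordAuthenticator("user", "password")))
target = Cluster("couchbase://target-host",
                 ClusterOptions(PasswordAuthenticator("user", "password")))

BUCKETS = ["A", "B", "C"]          # extend to all 10-12 buckets
CONDITION_VALUES = ["v1", "v2"]    # the values used in the WHERE clause

for name in BUCKETS:
    src = source.bucket(name).default_collection()
    dst = target.bucket(name).default_collection()
    for value in CONDITION_VALUES:
        # Step 2: generate keys on the source cluster with N1QL.
        # "type" is a made-up attribute; use your real condition.
        rows = source.query(
            f"SELECT META().id AS id FROM `{name}` WHERE type = $v",
            QueryOptions(named_parameters={"v": value}))
        for row in rows:
            key = row["id"]
            try:
                # Step 3: read from source, write to target.
                doc = src.get(key).content_as[dict]
                dst.upsert(key, doc)
            except DocumentNotFoundException:
                print(f"{name}/{key}: missing on source, skipped")
            except CouchbaseException as err:
                print(f"{name}/{key}: failed ({err})")
```

For large buckets you would batch or parallelize the gets and upserts, but the overall structure stays the same.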

The SDKs also have options to subscribe to the XDCR stream; based on what you need, you can then write to the target cluster too.