Bulk upload using the Java API in Couchbase Server

We have a requirement to load 240K records using the Java API during server startup. If I use the set method, I lose data because the writes are asynchronous: only 15K records show up in total. If I add a one-millisecond sleep after each insert, it works fine but takes a huge amount of time. If I use set(key,value).get(), throughput drops to single-digit ops/sec, which is a huge performance bottleneck.

I have also tried an alternative tool, cbdocloader, which takes more than 5 minutes to load all the data; that is far too slow for us.
Has anyone come up with a solution for this?

1 Answer

Yes, this is a common mistake caused by the async API. If you never block on or orchestrate the results, your main thread shuts down while only 15K items have reached the server and the rest are still sitting in the client's write queue.

You did the right thing in the first place by blocking on the future, but if you do nothing else you end up with a single-threaded synchronous loop, and that is not the best way to use the available resources.

So here are the alternatives, use the one that fits your programming model best:

1) Stick with blocking .get() calls, but fan them out into an Executor to get more concurrency. The client is thread-safe, so you don't need to worry about synchronization.

2) Do not block on the future but rather use a listener to get notified once the future is complete. You can use CountDownLatches to orchestrate with your main thread so that it doesn't shut down prematurely.

3) Write without blocking and put all the futures in a list. Once you are done writing, iterate over the list, exit the main thread only once every operation has completed, and potentially retry any op that failed.
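
To make the three options a bit more concrete, here is a rough sketch. It does not use the Couchbase client directly: AsyncStore and InMemoryStore are hypothetical stand-ins I made up so the code runs without a server, and CompletableFuture takes the place of the client's own future type. With the real client you would call its set(key, value) and use whatever listener hook your client version offers in place of whenComplete. The 5 minute timeouts, the pool size and the single retry are arbitrary choices too.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

class BulkLoadSketch {
    /** Hypothetical stand-in for the client's asynchronous set(key, value). */
    interface AsyncStore {
        CompletableFuture<Boolean> set(String key, String value);
    }

    /** In-memory fake so the sketch runs without a server; not a Couchbase class. */
    static class InMemoryStore implements AsyncStore {
        final Map<String, String> data = new ConcurrentHashMap<>();
        public CompletableFuture<Boolean> set(String key, String value) {
            return CompletableFuture.supplyAsync(() -> { data.put(key, value); return true; });
        }
    }

    /** Blocks on one write and reports any failure or interruption as false. */
    static boolean waitFor(Future<Boolean> future) {
        try {
            return Boolean.TRUE.equals(future.get());
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        } catch (Exception ex) {
            return false;
        }
    }

    /** Option 1: blocking get() calls, fanned out over a fixed thread pool. */
    static int loadWithExecutor(AsyncStore store, Map<String, String> docs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger stored = new AtomicInteger();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            pool.execute(() -> {
                if (waitFor(store.set(doc.getKey(), doc.getValue()))) {
                    stored.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            pool.awaitTermination(5, TimeUnit.MINUTES);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return stored.get();
    }

    /** Option 2: never block per write; a completion callback counts down a latch. */
    static int loadWithLatch(AsyncStore store, Map<String, String> docs) {
        CountDownLatch latch = new CountDownLatch(docs.size());
        AtomicInteger stored = new AtomicInteger();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            store.set(doc.getKey(), doc.getValue()).whenComplete((ok, error) -> {
                if (error == null && Boolean.TRUE.equals(ok)) {
                    stored.incrementAndGet();
                }
                latch.countDown();
            });
        }
        try {
            latch.await(5, TimeUnit.MINUTES); // keeps the calling thread alive until done
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return stored.get();
    }

    /** Option 3: issue every write first, then walk the futures and retry failures once. */
    static int loadWithFutureList(AsyncStore store, Map<String, String> docs) {
        List<String> keys = new ArrayList<>(docs.keySet());
        List<Future<Boolean>> futures = new ArrayList<>();
        for (String key : keys) {
            futures.add(store.set(key, docs.get(key)));
        }
        int stored = 0;
        for (int i = 0; i < keys.size(); i++) {
            String key = keys.get(i);
            // Short-circuit: the second set() only runs if the first attempt failed.
            boolean ok = waitFor(futures.get(i)) || waitFor(store.set(key, docs.get(key)));
            if (ok) {
                stored++;
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            docs.put("key-" + i, "value-" + i);
        }
        System.out.println(loadWithFutureList(new InMemoryStore(), docs)); // count of stored docs
    }
}
```

Each method returns the number of documents that were confirmed as stored, so you can compare it against the input size before letting startup continue. For 240K documents you would probably also want to cap the number of in-flight writes in options 2 and 3, but that is beyond this sketch.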

Does that help you move forward?