Getting more than 30k blobs at the same time

Hello folks, I hope you all are doing great!

I know that a document's blobs are stored in a separate, indexed content store and are only retrieved on demand.

But I need to know if there is a way to improve performance when I retrieve up to 30k blobs. My query currently takes around 9 seconds, and each blob is about 1,040 KB.

In my query I only select 3 columns, and I use the Result object's getBlob function, which is where most of the time is spent.

I am new to Couchbase Lite and was expecting better performance from blobs. I have been looking for techniques to improve this in my application, since all of this data is needed for it to work.

Any advice?

How long do you expect it to take to retrieve 30,000 * 1,040,000 bytes, roughly 30 GB? Do you have a 100-gigabit network? That would help.
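
To put numbers on that, here is a quick sketch of the arithmetic. It assumes 1,040 KB means 1,040,000 bytes and that every blob's content really is read within the 9 seconds mentioned above; the class and method names are made up for illustration.

```java
// Back-of-the-envelope size estimate for the numbers quoted in this thread.
final class BlobSizeEstimate {
    /** Total payload in bytes for `count` blobs of `bytesPerBlob` each. */
    static long totalBytes(long count, long bytesPerBlob) {
        return Math.multiplyExact(count, bytesPerBlob); // throws on overflow
    }

    /** Throughput in decimal MB/s needed to move `bytes` in `seconds`. */
    static double requiredMegabytesPerSecond(long bytes, double seconds) {
        return bytes / 1_000_000.0 / seconds;
    }

    public static void main(String[] args) {
        // Assumption: 1,040 KB is taken as 1,040,000 bytes.
        long total = totalBytes(30_000, 1_040_000L);
        System.out.println("total bytes: " + total);
        System.out.println("MB/s needed for 9 s: " + requiredMegabytesPerSecond(total, 9.0));
    }
}
```

Under those assumptions the total is 31.2 GB, and finishing in 9 seconds would need a sustained read rate of roughly 3.5 GB/s from wherever the data lives.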

The problem is not related to data transfer through Sync Gateway. I mean the case where all the data is already on the device and I need to perform an offline query on Couchbase Lite using Flow or LiveData. Retrieving all the documents into memory takes 9+ seconds.

To determine if there is some improvement that can be made on those 9 seconds, it would be useful to know what is occurring during those 9 seconds. Is it possible to obtain any profiling? For Java, kill -3 will give thread dumps that may be useful.
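
If sending a signal to the process is awkward (on a device, for example), a rough in-process equivalent can be built from the standard Thread.getAllStackTraces() call. This is only a sketch: the ThreadDump class is a hypothetical helper, and the output is less detailed than a real kill -3 dump (no lock information).

```java
import java.util.Map;

final class ThreadDump {
    /** Formats the current stack trace of every live thread as text. */
    static String capture() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture());
    }
}
```

Calling capture() a few times from another thread while the slow query is running, and comparing which frames keep showing up, gives a crude picture of where the 9 seconds go.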

If this is running where there is less than 30GB RAM available for the documents, then some of them would need to be paged in from secondary storage. That might be worth investigating.
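
As a quick sanity check on that point, the sketch below compares the runtime's maximum heap with the blob size from this thread. It assumes blob contents end up as byte arrays on the Java heap and ignores every other allocation; HeapCheck and blobsThatFit are illustrative names, not part of any library.

```java
final class HeapCheck {
    /** How many blobs of `bytesPerBlob` fit in `heapBytes`, ignoring all other allocations. */
    static long blobsThatFit(long heapBytes, long bytesPerBlob) {
        return heapBytes / bytesPerBlob;
    }

    public static void main(String[] args) {
        // Upper bound on the heap the runtime will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("max heap bytes: " + maxHeap);
        // Assumption: 1,040 KB is taken as 1,040,000 bytes per blob.
        System.out.println("blobs that fit at most: " + blobsThatFit(maxHeap, 1_040_000L));
    }
}
```

If the second number comes out far below 30,000, the blobs cannot all be resident at once, whatever the query does.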

I am going to profile it with Android Studio's profiling tool to look at the memory behavior and see which operations are running, so I can tell whether any improvements are possible.

Thanks for the advice, I will start with that.

Each blob is stored as an individual file in the filesystem. This is because blobs tend to be large, and databases are not efficient at storing huge values.

Opening and reading 30,000 1MB files is never going to be fast! I strongly suspect you need to rethink your data schema – part of the benefit of a database is that you can have fine-grained access to individual parts of a record, and only retrieve the data you need at that moment. If you’re going to load all the data at once, you might as well just dump it into one big file…
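
One way to read "only retrieve the data you need at that moment" is to keep a cheap handle per record and defer the file read until the content is actually used. The sketch below shows the idea with plain files from the standard library. LazyPayload is a hypothetical stand-in, not a Couchbase Lite class; in a real app the equivalent would be to run the query without touching blob contents and read a blob only for the items being displayed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a record whose large payload lives in its own file. */
final class LazyPayload {
    private final Path file;
    private byte[] cached; // stays null until the content is first requested

    LazyPayload(Path file) {
        this.file = file;
    }

    boolean isLoaded() {
        return cached != null;
    }

    /** Reads the file on the first call only; later calls reuse the bytes. */
    byte[] content() {
        if (cached == null) {
            try {
                cached = Files.readAllBytes(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return cached;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blobs");
        List<LazyPayload> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Path p = dir.resolve("blob-" + i);
            Files.write(p, new byte[1024]); // small dummy payloads for the demo
            items.add(new LazyPayload(p));
        }
        // Wrapping the paths read nothing back; only the requested item is loaded.
        byte[] visible = items.get(42).content();
        System.out.println("loaded bytes: " + visible.length);
    }
}
```

Caching the bytes is a design choice for this demo. With 30,000 one-megabyte payloads you would probably want to drop the cache, or bound it, so that scrolling through the list does not slowly pull everything into memory anyway.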
