Very bad performance with 5000 documents

Hi,

We’re using Couchbase Lite in our Xamarin application. We’ve recently loaded the app with 5000 test documents (each document consisting of ~44 lines of pretty-printed JSON).

We then created a view like this:

        _elementRepository.GetView()
            .SetMap((doc, emit) =>
            {
                if (!doc.ContainsKey("type") || !doc.ContainsKey("companyId"))
                    return;

                if (doc.ContainsKey("archived") && doc["archived"].Equals(bool.TrueString))
                    return;

                if (!doc["companyId"].Equals(_settingsService.CompanyId))
                    return;

                if (!doc["type"].Equals("Item"))
                    return;

                emit(doc["ItemTypeId"], null);

            }, "1.20");

We’re then querying these documents by an “ItemTypeId” parameter. We first ran a query for all documents matching this filter (i.e. all 5000 documents), and it took 30 seconds to complete on my Samsung S8.

We then tried to paginate the result, by fetching 20 rows each call: each call took roughly 4 seconds.

We might have to do some programmatic filtering, so fetching the documents by pagination might not work for us.

How can we improve the performance of our queries? I should also mention that 5000 documents is quite small in comparison to what we might process in a production environment.

(Additional info: we are using Couchbase Lite version 1.4.1, and the corresponding SGW version)

That’s much slower than I’d expect. Could you show the code that performs the query?

Are the documents very large? What’s the average size in bytes?

I suggest you try out Couchbase Lite 2, which is in beta and almost ready for release. It is much, much faster, especially at queries.

Good to hear that it is slower than expected.
Do you have a timeframe for CBL 2? It’s not released as a pre-release NuGet package (from what I can tell).

Our documents are each 1.09 kB in size.
The code that performs the query looks like this:

            var query = _database.GetView(nameof(T)).CreateQuery();
            query.Descending = true;
            query.AllDocsMode = AllDocsMode.AllDocs;

            if (queryObject != null)
            {
                if (queryObject.Keys != null)
                    query.Keys = queryObject.Keys;

                if (queryObject.StartKey != null)
                    query.StartKey = queryObject.StartKey;

                if (queryObject.EndKey != null)
                    query.EndKey = queryObject.EndKey;

                if (queryObject.Skip > 0)
                    query.Skip = queryObject.Skip;

                if (queryObject.Limit > 0)
                    query.Limit = queryObject.Limit;
            }

            var queryRows = query.Run();
            var types = new List<T>();
            foreach (var row in queryRows)
            {
                types.Add(ToObject(row.Document?.Properties));
            }

            return types;

CBL 2.0 should be going GA in a month or so…we will be having our second beta in a week.

It is available through NuGet, but you will have to specify the relevant developer source.

Add http://mobile.nuget.couchbase.com/nuget/Developer/ to your NuGet package sources.

You can read about that here

From the uppercase method names I assume you’re using CBL for .NET, with Xamarin?

In this specific query, what values did you end up setting for the query properties?

I don’t know much about .NET development; is there a way you can do some profiling to find out where the time is going?

Hi, I want to share something I’m going through; apologies if this kind of post doesn’t belong here.

I have a similar problem in my application: it is really difficult to deserialize more than 1000 JSON documents at once. After a lot of searching I found that newer versions of Xamarin have lower System.Reflection performance (I think most JSON parsers rely on it). In my project we use JSON.Net for heavy deserialization, and I found someone saying that downgrading to version 8.0.3 really improves deserialization performance, maybe because it requires a lower version of Xamarin, which handles System.Reflection differently. I tried to downgrade, but I had a really hard time working with the Mono libs in Visual Studio and spent too much time on the downgrade without completing it.

So I put that aside and focused on making the current app fetch only the data it really needs from the replication channels (double-checking the sync routines and optimizing them). I also created some “mini JSON parsers” for matching things in CreateQuery(): they parse a stored JSON with as few fields as possible, and only if the query matches do we deserialize the huge one. This gave a performance gain of 30 to 60%, and the number went up further after adding async queries too.

In my scenario, our customer asked us to test the current solution, which is built for small market businesses, on some big markets, and the work described above helped me prove to the team that the tech works very well.

To finish: my system uses the same lib and Couchbase Lite 1.4.1, and runs on both Xamarin and Windows 10 Professional. It is really impressive how much better JSON.Net performs on Windows 10 than on Android with Xamarin. I really think some of my problems come from the current version of Xamarin. I hope this adds something useful to the thread.

Regards

In Couchbase Lite 2 we don’t use JSON at all — the documents are stored in a binary format called Fleece that’s extremely fast to parse, and all the API access is through objects, so there’s no need for the app to work with JSON.

After reading that about JSON, upgrading is my next priority for improving the app’s performance. :heart_eyes:

Sorry for my late response. We queried this with an “elementTypeId” consisting of a GUID string.

We’ve played around with CBL 2.0 the last few days - and I absolutely love it!! I’ve done complex queries with much better performance.

I noticed that none of the query methods are asynchronous though (or did I miss it?)… Do you have general best practices for awaiting a result (e.g. “Thread.Sleep”)?

The API is synchronous because we want to keep it as similar as possible between platforms, and .NET, Java and Cocoa have very different mechanisms for async operations. But you can write your own async functions that wrap around our API.

In the case of a query, all the hard work is done in the execute method — the returned ResultSet contains all the query results. So you could write an async C# method that takes QueryParameters, then runs the Query on a background thread and returns the ResultSet.

For best performance you should have a separate Database instance for background tasks, because each Database can only perform one database operation at a time, so other methods will block while a query is executing.
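
A minimal sketch of the wrapper described above, assuming the blocking query call is passed in as a delegate. `QueryAsyncSketch` and `RunAsync` are hypothetical names, not part of the Couchbase Lite API; only `Task.Run` and the standard collection types are real here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class QueryAsyncSketch
{
    // Hypothetical helper: runs a blocking query delegate on a thread-pool
    // thread and copies the rows into a list before returning them.
    public static Task<List<T>> RunAsync<T>(Func<IEnumerable<T>> blockingQuery)
    {
        if (blockingQuery == null)
            throw new ArgumentNullException(nameof(blockingQuery));

        // Enumerating inside Task.Run keeps any lazy row iteration
        // off the caller's thread as well.
        return Task.Run(() => new List<T>(blockingQuery()));
    }
}
```

A caller would then write something like `var rows = await QueryAsyncSketch.RunAsync(() => backgroundQuery.Execute());`, where `backgroundQuery` is assumed to be built on a separate Database instance reserved for background work, and `Execute` stands for whatever the blocking execute method is called in the version you use.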