2.6.0 CBLite Pull Replication High Disk Usage

Hi,

I am using Objective-C Couchbase Lite 2.6.0 Community in a Cordova application and I am experiencing an issue with the pull replication.
I have 2 different app bundles which are the same, except for the amount of data. The dev app has ~1 million documents and is 1.3GB; the staging app has ~5 million documents and is 4.3GB.
The dev app reads about 1GB before pulling documents, whereas the staging app reads 40GB before pulling documents. Reading 40GB takes about 10 minutes and I’m not sure why it needs to read that much every time the replication starts.

This wasn’t an issue in 2.5.3, where the replication would start straight away. 2.5.3 did have a different issue, though: memory would fill up with all the documents it was pulling.


I’ve tried turning continuous off and using push and pull rather than just pull, but neither helped. Push on its own works fine.
This also occurs when offline.

- (void)startReplication:(CDVInvokedUrlCommand*)urlCommand {
    NSDictionary *json = [[urlCommand arguments] objectAtIndex:0];
    NSURL *url = [NSURL URLWithString:json[@"source"]];
    CBLURLEndpoint *target = [[CBLURLEndpoint alloc] initWithURL: url];
    CBLReplicatorConfiguration *config = [[CBLReplicatorConfiguration alloc] initWithDatabase:database target:target];
    config.replicatorType = kCBLReplicatorTypePull;
    config.authenticator = [[CBLBasicAuthenticator alloc] initWithUsername:json[@"username"] password:json[@"password"]];
    config.continuous = true;
    CBLReplicator* replicator = [[CBLReplicator alloc] initWithConfig:config];
    [replicator addChangeListener:^(CBLReplicatorChange *change) {
        [self replicationNotification: change: @"replicationStatus"];
    }];
    [replicator start];
    CDVPluginResult* pluginResult = [CDVPluginResult resultWithStatus:CDVCommandStatus_OK];
    [self.commandDelegate sendPluginResult:pluginResult callbackId:urlCommand.callbackId];
}

iPad Pro (2nd generation)
iOS 13.2.2

Even 1GB is probably more than it should be. On first launch the app should only be pulling in the changes since the built-in database was last synced by you.

Just to make sure: Your subject says “high disk usage”, but the post talks about network usage. You mean network usage, right?

Are you using the CBLDatabase method to copy the database file out of the app bundle? (Copying it manually will cause problems with replication.)
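For reference, a minimal sketch of what copying a prebuilt database with the CBLDatabase method might look like. The bundled resource name "db.cblite2" and the helper name are assumptions for illustration only; adjust them to your app.

```objc
#import <CouchbaseLite/CouchbaseLite.h>

// Hypothetical helper: installs a prebuilt database from the app bundle
// the first time the app runs, using CBLDatabase's own copy method
// rather than copying the files by hand.
static BOOL InstallPrebuiltDatabaseIfNeeded(NSError **outError) {
    // Nothing to do if the database already exists in the default directory.
    if ([CBLDatabase databaseExists:@"db" inDirectory:nil]) {
        return YES;
    }
    // Assumption: the prebuilt database ships in the bundle as "db.cblite2".
    NSString *path = [[NSBundle mainBundle] pathForResource:@"db" ofType:@"cblite2"];
    if (path == nil) {
        return NO; // no bundled database found
    }
    return [CBLDatabase copyFromPath:path
                          toDatabase:@"db"
                          withConfig:nil
                               error:outError];
}
```

You would call this once before opening the database with initWithName:error:.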

When you build the database yourself to include in the app, does your pull replication have the same settings that the app will use?

If you turn on sync logging, do you see any warnings about checkpoints being mismatched?
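In case it helps, here is a rough sketch of turning on verbose console logging in 2.5+ before the replicator is created. The helper name is made up for illustration, and you may want to narrow the domains to just the replicator once you know where the problem is.

```objc
#import <CouchbaseLite/CouchbaseLite.h>

// Hypothetical helper: call this early, before creating the replicator,
// so that checkpoint and replication messages show up in the Xcode console.
static void EnableVerboseSyncLogging(void) {
    // Log every domain; kCBLLogDomainReplicator alone is a narrower option.
    CBLDatabase.log.console.domains = kCBLLogDomainAll;
    CBLDatabase.log.console.level = kCBLLogLevelVerbose;
}
```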

I used the replication initially to get all of the documents, so I haven’t done any copying using the CBLDatabase method. Just a brand new database that didn’t exist previously.
NSError *error = nil;
CBLDatabase *database = [[CBLDatabase alloc] initWithName:@"db" error:&error];
The first time I pulled all the data from Sync Gateway it was fine, as there was nothing to read on the device. But when I close the app and open it again, starting the replication seems to read everything.

I do have some indexes, so maybe it is worth seeing if I have the same problem without them?

This all appears to be local activity, since it does the same thing if I turn the internet off. The screenshots from Xcode and Instruments show high disk activity, with little network activity.
If I use incorrect auth, don’t specify auth at all, or use the wrong URL, it still has this disk activity.

As I said, this didn’t occur in 2.5.3, but I can’t really use 2.5.3 because memory usage kept growing until the app crashed at about 1GB. This doesn’t happen in 2.6.0, so I assume that is now fixed.

Below I’ve posted the logs for domain all. This covers the replication from start to finish; as you can see, there isn’t really anything logged for about 10 minutes after starting the replication.

couchbase pull replication.log.zip (5.0 KB)

Oh, I think I misunderstood your description. I thought you had an app bundled with a 1.3GB database, but you meant that it pulls a remote database with 1.3GB of data, right?

This all appears to be local activity, since it does the same thing if I turn the internet off. The screenshots from Xcode and Instruments show high disk activity, with little network activity.

Thanks for posting the screenshots from Instruments. The 2nd one with the backtrace is exactly what I wanted to see, with the I/O-heavy backtrace.

It looks like the I/O is due to the new conflict-resolution feature in 2.6. When a replicator starts, it looks for any pre-existing unresolved conflicts in the database. Unfortunately the way this is done results in a linear scan of every document’s metadata, which becomes very inefficient with large databases.

I’ll file a bug report on this. I can’t think of a workaround, but hopefully we can get a fix out ASAP.

Filed issue CBL-536.

Sorry, yeah, I meant I had 2 versions of the same app with different bundle IDs on the device. They pull from different remote databases and are therefore different sizes.

Thanks very much for the quick response.

I see you’ve found a fix for this problem now, which is great!
Are you able to give any time estimate for when we can expect this to be released, or is there a beta with the fix that I can help test?
Thanks again.

It should appear in 2.7, which is nearing its code freeze right now. I don’t think we have a public release date for it yet.

You can test the CE version of Couchbase Lite by checking out the master branch from GitHub and building it, but there’s no way for you to build EE yourself since it includes some small closed-source bits.

I have pulled from git and it is working, thank you.