We upgraded Couchbase Lite Swift Enterprise from 3.2.4 to 4.0.2. All of our local QA/testing passed, but we are seeing a production-only crash after the upgrade. The crash happens during collection.save(document:) on a background queue. We didn’t have these crashes with the previous version.
This looks like a bug in our code. Is it reproducible? Can you tell us anything about the contents of the document or the circumstances under which it was saved?
Thanks for the quick response. The crash is frequent and at scale in production since the upgrade from 3.2.4 → 4.0.2, but we don’t have a deterministic repro yet: all QA tests pass and we can’t reproduce it ourselves. There are also several other, different crashes. Here are stack traces:
Same syncing mechanism as in 3.2.4 (no concurrency model change).
The only functional change was updating deprecated APIs: we now use collection.save(document:) instead of database.save(document:).
The crash happens on a background queue during collection.save(document:).
Payloads are JSON‑backed models from server data, with nested dictionaries/arrays. We serialize our models into a MutableDocument and then save that document.
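For anyone following along, the save path described above might look roughly like the sketch below. The `Order` model, the queue label, and the `save(_:in:)` and `saveInBackground` helpers are hypothetical names made up for illustration, not our actual code; the Couchbase Lite calls assumed here are `MutableDocument(id:)`, `setJSON(_:)`, and `collection.save(document:)`.

```swift
import Foundation
import CouchbaseLiteSwift

// Hypothetical model standing in for a server-backed payload
// with nested arrays and dictionaries.
struct Order: Codable {
    let id: String
    let items: [String]
    let attributes: [String: String]
}

// Hypothetical serial queue: all saves are funnelled through it,
// so two saves never run at the same time.
let saveQueue = DispatchQueue(label: "com.example.cbl.save")

// Encodes the model to JSON, loads it into a MutableDocument,
// and saves it into the given collection.
// The type is module-qualified to avoid any clash with Swift's
// own Collection protocol.
func save(_ order: Order, in collection: CouchbaseLiteSwift.Collection) throws {
    let data = try JSONEncoder().encode(order)
    let json = String(decoding: data, as: UTF8.self)

    let doc = MutableDocument(id: order.id)
    try doc.setJSON(json)
    try collection.save(document: doc)
}

// Dispatches the save onto the background queue and reports failures.
func saveInBackground(_ order: Order, in collection: CouchbaseLiteSwift.Collection) {
    saveQueue.async {
        do {
            try save(order, in: collection)
        } catch {
            print("Save failed for \(order.id): \(error)")
        }
    }
}
```

The closure captures the collection for the duration of the save; whether the owning Database also stays alive that long depends on how the app holds it, which this sketch does not show.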
We understand the docs say no downgrade support between major versions (4.x → 3.x). However, we’re facing severe production issues after upgrading to 4.0.2 and want to ship a hotfix that downgrades to 3.3.1.
We’re considering this approach to avoid forcing a clean install:
Ship an app update with CBL 3.3.1
On first launch, delete the existing 4.x database/collections
Recreate the database and repopulate from our backend sync
Would this be a safe/acceptable path, or are there still incompatibilities even if we delete the 4.x data files and rebuild? If not safe, is a full clean install the only supported downgrade option?
Also, are there any other recommended downgrade/workaround options we should consider?
First off, if you’re a customer, please contact support and have them file a CBSE.
You’re correct, downgrading CBL 4 to 3 won’t work because modified documents will now have version vectors, which 3.x doesn’t understand.
The clean-sync approach you describe will work, since Sync Gateway / App Services 4 supports 3.x clients and sends them rev-tree revisions instead of version vectors.
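A minimal sketch of the "delete on first launch" step, assuming the database lives as a `<name>.cblite2` bundle in a directory the app knows about. The function name, the `flagKey` default, and the use of UserDefaults as a one-time marker are all hypothetical choices for illustration. It uses only Foundation, so it can run before any Couchbase Lite 3.x code touches the 4.x files.

```swift
import Foundation

// Removes the database bundle once per install and records that it did so.
// Returns true if the wipe step ran on this call, false if it had already
// run earlier. All parameter names and the flag key are hypothetical.
@discardableResult
func wipeDatabaseOnce(named name: String,
                      in directory: URL,
                      defaults: UserDefaults = .standard,
                      flagKey: String = "didWipeCBL4Database") throws -> Bool {
    // Skip if a previous launch already performed the wipe.
    if defaults.bool(forKey: flagKey) {
        return false
    }

    // Couchbase Lite stores a database as a "<name>.cblite2" directory.
    let bundle = directory.appendingPathComponent("\(name).cblite2",
                                                  isDirectory: true)
    if FileManager.default.fileExists(atPath: bundle.path) {
        try FileManager.default.removeItem(at: bundle)
    }

    // Set the flag only after the removal succeeded, so a failed
    // attempt is retried on the next launch.
    defaults.set(true, forKey: flagKey)
    return true
}
```

Call it before opening the database, passing the same directory the app uses in its DatabaseConfiguration, then open the database as usual and let replication repopulate it. Couchbase Lite also has a static `Database.delete(withName:inDirectory:)`; this thread does not confirm how a 3.x build behaves when pointed at a 4.x bundle, which is why the sketch removes the files directly.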
@nikolaios, in your QA tests do you enable MallocScribble? This causes free() to fill the heap block with garbage (0x55 bytes), which can help expose use-after-free bugs. Xcode’s scheme editor has a checkbox for it, or you can just set the environment variable MallocScribble.
I ask because my suspicion is that this is a race condition where a document is being saved at the same time its database is being closed/freed. CBL is supposed to handle this condition correctly, but if it didn’t, something like these crashes could occur. MallocScribble would make the crash more deterministic.
Also: If you have full crash reports for any of these, could you post them? If my theory is correct we may see another thread in the process of closing the database.
We have an idea of what the regression is; basically, the -[CBLDatabase dealloc] method in the iOS codebase doesn’t properly synchronize releasing the underlying C++ C4Database object, making it possible for all the database-related objects to be freed while a poor Document is trying to save.
We’re going to fix this in the upcoming 4.0.3 release; until then you should be able to work around it by adding your own synchronization/locking to avoid releasing the Database while a Document is saving.
Sorry, disregard the above – we were accidentally looking at an older branch, so that synchronization issue does not exist in 4.0. Moreover, your crash reports don’t show any CBL code running on other threads at the time of the crash.
I’m back to being stumped, and I’m reviewing all the code involved.