I am using Couchbase Lite on the iPhone with Swift 3.0. I have found some JSON examples for getting the size in bytes, but they do not work with a CBLDocument in Swift. I have also thrown together a somewhat terrible way of accomplishing what I need. Please show me there is a better way to get the size of a CBLDocument in bytes.
We don’t have a method for that, and there’s no way to get the raw JSON stored in the database. (In fact there’s no guarantee we are storing JSON in the database; we do in 1.x, but the 2.0 in development uses a more compact binary format. It’s an implementation detail.)
You could use something like what you’re doing, except that:
For numbers you need to work out the number of decimal digits. The memory size occupied by an NSNumber has nothing to do with the written length of the number in ASCII.
For arrays, again, looking at NSArray won’t tell you anything. Instead you need to recurse over the elements in the array, adding up their encoded sizes.
Same for dictionaries, but add up the lengths of the keys as well as the values.
Don’t forget to take into account the two quote characters around strings, the commas between array/dictionary elements, the colon after a dictionary key, and the braces/brackets around arrays and dictionaries.
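The rules above can be sketched roughly as follows in Java. This is only an illustration, not a Couchbase Lite API: the class name JsonSizeEstimator is made up, it assumes the document content has already been converted to plain Map/List/String/Number/Boolean values, and it ignores escape sequences inside strings, so the result is an approximation of the compact JSON size.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Couchbase Lite): estimates the size in
// bytes of the compact JSON encoding of a value built from maps, lists,
// strings, numbers, booleans, and null.
final class JsonSizeEstimator {
    private JsonSizeEstimator() {}

    static long estimate(Object value) {
        if (value == null) return 4;                        // null
        if (value instanceof String) return stringSize((String) value);
        if (value instanceof Boolean) return ((Boolean) value) ? 4 : 5; // true / false
        if (value instanceof Number) {
            // Written length of the number, not its in-memory size.
            return value.toString().length();
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            long size = 2;                                  // [ and ]
            for (Object item : list) size += estimate(item);
            if (list.size() > 1) size += list.size() - 1;   // commas between elements
            return size;
        }
        if (value instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) value;
            long size = 2;                                  // { and }
            for (Map.Entry<?, ?> e : map.entrySet()) {
                size += stringSize(String.valueOf(e.getKey())); // quoted key
                size += 1;                                  // colon after the key
                size += estimate(e.getValue());
            }
            if (map.size() > 1) size += map.size() - 1;     // commas between entries
            return size;
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // UTF-8 byte length plus the two quote characters.
    // Assumption: escape sequences (\", \\, \n, ...) are not counted.
    private static long stringSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 2;
    }
}
```

The number case uses Java's own toString(), which may differ slightly from what a JSON encoder writes (for example 1.0 versus 1), so treat the total as a close estimate rather than an exact byte count.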
Some CBL documents in my iOS and Android apps will exceed the 20MB limit sooner or later, so I need to handle this somehow. The solution I have in mind is to check the CBLDocument size when adding content and, if it has reached 20MB, create another document. So I have the same question, for both Swift and Java. I noticed the last message in this thread was from Feb 2017, so I was wondering what the answer to the original question is in November 2019 (e.g. does a newer version of CBLite offer a convenient approach for this)?
Honestly, I think it’s a bad idea to have documents anywhere near that large. (You wouldn’t put 20MB of data into a single row in a traditional database, would you?) It introduces performance problems:
Any time you load the document, even to read one field, 20MB of data has to be read from disk.
Any time a query reads the document, even for just one field, it also has to read 20MB from disk.
Sync Gateway has to parse all 20MB of JSON when it stores the document, which is not fast.
The replicator does support deltas now, so it will only transmit the parts of the document that changed, but computing what changed can be expensive. (And there are limits: changes in an array are too complex for our current algorithms to handle, so it will send the whole array even if just one item changed.)
If changing your schema to break these up into smaller documents isn’t an option, I suggest taking the portions of the document that aren’t necessary for queries, or which change less often, and making them into a blob/attachment in JSON format. (Or multiple attachments, to keep them under 20MB.)
Thanks for the quick and detailed response. I don’t care much about loading the document or querying it, as that will almost never happen on the client side or on the production server in my use case (this type of document is primarily for record-keeping purposes, which is why its size will grow over time, and I need to let it grow), but the 3rd and 4th points are indeed issues for me.
Breaking up the document into smaller documents isn’t an option for me (because even one of the fields could eventually grow beyond 20MB for active users). What are the size limits on the “blob/attachment in JSON format” you mentioned? I mean, what’s the size limit per blob/attachment, and what’s the limit on the number of such blobs/attachments per document? And are there any performance considerations for blobs/attachments when the total size is on the order of tens of MB?
(this type of document is primarily for record-keeping purposes, which is why its size will grow over time, and I need to let it grow)
It would probably be more efficient to create new documents rather than keep appending to an existing document. Or at least only append to an existing doc for a limited time or until it reaches a size threshold.
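A minimal sketch of the "append until a size threshold, then start a new document" idea, in plain Java. Everything here is hypothetical: RollingLog is not a Couchbase Lite class, each inner list merely stands in for one document holding an array of string entries, and the size is a running estimate of the JSON array length rather than the real stored size.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each inner list stands in for one document that
// holds an array of string entries. A new "document" is started whenever
// the estimated JSON size of the current one would exceed maxBytes.
final class RollingLog {
    private final long maxBytes;
    private final List<List<String>> documents = new ArrayList<>();
    private long currentBytes;

    RollingLog(long maxBytes) {
        this.maxBytes = maxBytes;
        startNewDocument();
    }

    void append(String entry) {
        // UTF-8 length plus two quotes and one comma (a slight overestimate
        // for the first entry, which has no comma).
        long entryBytes = entry.getBytes(StandardCharsets.UTF_8).length + 3;
        List<String> current = documents.get(documents.size() - 1);
        // Roll over only if the current document already has content, so an
        // oversized single entry still gets stored somewhere.
        if (!current.isEmpty() && currentBytes + entryBytes > maxBytes) {
            startNewDocument();
            current = documents.get(documents.size() - 1);
        }
        current.add(entry);
        currentBytes += entryBytes;
    }

    private void startNewDocument() {
        documents.add(new ArrayList<>());
        currentBytes = 2; // the enclosing [ and ]
    }

    int documentCount() { return documents.size(); }

    List<String> document(int index) { return documents.get(index); }
}
```

Tracking the size incrementally like this avoids re-measuring the whole document on every append; in a real app the rollover step would create and save a new document instead of adding a list.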
what’s the size limit per blob/attachment, and what’s the limit on the number of such blobs/attachments per document?
Individual blobs are also limited to 20MB since they’re also stored as documents in Couchbase Server.
There is no limit on the number of blobs attached to a document, though.
Thanks Jens! I am thinking about setting the size threshold to 5MB: once a document reaches this size, just create another document. This circles back to the question: what’s the best way to programmatically get the CBLDocument size at run time on the mobile end (iOS and Android) in November 2019?
The premise doesn’t make much sense in terms of Couchbase Lite since it stores files in a completely different format than Couchbase Server does. The closest estimate you could probably get would be to transform your document into a JSON string piece by piece and then see how long it becomes (blobs would have to be special cased but they are all roughly the same size in the actual document, so you could probably just add a delta onto your total per blob). Compare the size to the size on Server and adjust as necessary.
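As a rough illustration of that estimate in Java: take the byte length of the JSON string, add a fixed amount per blob, and scale by a factor obtained from comparing against the size seen on Server. The class name, the per-blob constant, and the calibration parameter are all assumptions for the sake of the example; the 120-byte figure in particular is a placeholder you would need to measure yourself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Couchbase Lite API.
final class DocumentSizeEstimate {
    // ASSUMPTION: placeholder for the average size of the metadata entry a
    // blob leaves inside the document. Measure real documents and adjust.
    static final long ASSUMED_BLOB_STUB_BYTES = 120;

    private DocumentSizeEstimate() {}

    /**
     * @param jsonWithoutBlobs the document body serialized to JSON, with blob
     *                         properties left out
     * @param blobCount        number of blobs attached to the document
     * @param calibration      correction factor from comparing earlier
     *                         estimates with sizes observed on Server
     *                         (1.0 means no correction)
     */
    static long estimate(String jsonWithoutBlobs, int blobCount, double calibration) {
        long jsonBytes = jsonWithoutBlobs.getBytes(StandardCharsets.UTF_8).length;
        long raw = jsonBytes + blobCount * ASSUMED_BLOB_STUB_BYTES;
        return Math.round(raw * calibration);
    }
}
```

Since the goal is only to stay under a threshold, picking a threshold with some headroom matters more than getting this number exact.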