Single Document Structure or Multiple Documents for User Data

I have a Couchbase document for each user that provides profile and authorization information related to the user. This is the user-profile document. It can be updated by administrators from a web-based admin. Currently the user document is read-only on the mobile device.

There is a new feature that requires saving user-related data on the mobile device. This information will only ever be maintained on the mobile device and is read-only in the web-based admin.

The new feature provides the ability for the user to choose the type of data they are interested in having offline access to. When a user requests offline access to data, the channel associated with the offline access data is saved and needs to be linked to the logged in user. This information is only updated from the mobile device.

To avoid sync conflicts, the offline channels are stored in a small document that also contains the userid of the associated user. This is the user-offline-data document. That way if the admin updates the user-profile and the user changes their user-offline-data document at the “same time”, there will not be a sync conflict, since they are two different documents.

Note: The document key of the user-offline-data document is easily created based on the key of the user document and a doc type string. From the mobile app the document will be accessed using a key-value lookup instead of a query.
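
As a rough sketch of that key scheme: the `::` separator, the doc type string, the field names, and the in-memory dict standing in for the local database are all assumptions made for illustration, not the actual app code.

```python
from typing import Dict, List

# Hypothetical doc type string; the real value is not given in this post.
OFFLINE_DOC_TYPE = "user-offline-data"


def offline_data_key(user_doc_key: str, doc_type: str = OFFLINE_DOC_TYPE) -> str:
    """Derive the user-offline-data key from the user document key."""
    return f"{user_doc_key}::{doc_type}"


def new_offline_data_doc(user_id: str, channels: List[str]) -> dict:
    """Build the small document that links offline channels to a user."""
    return {
        "type": OFFLINE_DOC_TYPE,
        "userId": user_id,
        # De-duplicate and sort so repeated saves produce the same content.
        "offlineChannels": sorted(set(channels)),
    }


# A plain dict stands in for the local database (key-value access only).
store: Dict[str, dict] = {}
user_key = "user::1234"  # hypothetical user document key

store[offline_data_key(user_key)] = new_offline_data_doc(
    "1234", ["channel-b", "channel-a", "channel-b"]
)

# Key-value lookup: the key is recomputed from the user key, no query needed.
doc = store[offline_data_key(user_key)]
```

Because the key is derived deterministically, the app never has to store or search for it; knowing the logged-in user's document key is enough.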

This document structure works well for data syncing. I was just wondering if there are any issues that it may cause later on. The only thing I can think of would be:

  1. The extra complication/overhead of merging the two documents together when retrieving the users for the web-based admin.
  2. Eventually the user-offline-data can expire, so the user-offline-data document would be updated by a cloud-based script in the Couchbase database. This record would get synced to the user’s mobile device. This opens up the possibility of a sync conflict. I tend to lean toward a custom sync conflict resolver that lets the mobile version override any changes in the cloud. For this use case, the cloud-based script would just expire the user-offline-data again if needed.
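
Both points can be sketched with plain dictionaries. This is only an illustration of the logic: the field names are assumptions, `None` is used here to represent a missing or deleted revision, and none of this calls the actual Couchbase Lite conflict resolver API.

```python
from typing import Optional


def merge_for_admin(profile: dict, offline: Optional[dict]) -> dict:
    """Point 1: combine both documents into one read-only view for the admin."""
    merged = dict(profile)  # copy so the profile document is not mutated
    # A user who never enabled offline access has no offline-data document.
    merged["offlineChannels"] = list(offline["offlineChannels"]) if offline else []
    return merged


def resolve_offline_conflict(local: Optional[dict], remote: Optional[dict]) -> Optional[dict]:
    """Point 2: the mobile (local) revision always wins, even if it is a deletion.

    The remote revision written by the cloud script is discarded; the script
    can simply expire the data again on its next run if that is still needed.
    """
    return local


profile = {"userId": "1234", "name": "Example User"}
local_rev = {"userId": "1234", "offlineChannels": ["channel-a", "channel-b"]}
remote_rev = {"userId": "1234", "offlineChannels": []}  # expired by the script

winner = resolve_offline_conflict(local_rev, remote_rev)
admin_view = merge_for_admin(profile, winner)
```

The merge step is cheap because the offline document can be fetched by its derived key, so the admin side needs one extra key-value read per user rather than a join.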

Are there any other considerations related to having multiple small documents, instead of one large document for user data?

Are there general recommendations for using one large document versus having lots of smaller documents that are linked together with an ID?

You’ve already identified one of the key benefits of splitting this information into separate documents - if multiple authors are updating different sets of information in the same doc, it can be useful to split those sets into different documents to avoid unnecessary conflicts.

Using separate, smaller documents will also generally reduce the amount of data being synchronized when only a subset of a larger document is being modified. Delta sync accomplishes the same thing, but you’re correct that it’s preferable to use smaller documents.

What you may want to be wary of is using many small documents that are all frequently changing. There’s some replication work associated with notifying the replicating peer about a changed document - this includes document metadata (key, revision, sequence). So replicating the same information in 100s of small documents is going to have more replication overhead than a single large document.
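
To make that trade-off concrete, here is a back-of-the-envelope model. The byte counts are invented for illustration and are not measured Couchbase figures; the only point is that the per-document metadata cost is multiplied by the number of changed documents.

```python
def replication_bytes(doc_count: int, payload_bytes_total: int, metadata_bytes_per_doc: int) -> int:
    """Rough cost model: total payload plus fixed per-document metadata."""
    return payload_bytes_total + doc_count * metadata_bytes_per_doc


PAYLOAD = 20_000  # assumed total size of the data being replicated
META = 100        # assumed per-document overhead (key, revision, sequence)

one_large = replication_bytes(1, PAYLOAD, META)     # 20_100
many_small = replication_bytes(200, PAYLOAD, META)  # 40_000
```

Under these assumed numbers, splitting the same data across 200 frequently changing documents roughly doubles what has to be communicated. When only one small document changes at a time, the comparison goes the other way, which is the case described in the question.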

Thanks for the information. It is very helpful to have confirmation that using small documents is a good approach. I’ll keep the issue of frequent updates in mind as the document architecture grows.
