Couchbase Mobile supports a JSON document-style NoSQL data model. In addition to the standard JSON data types, Couchbase Mobile also supports binary data such as images, audio, video, and PDF files. A JSON document can be associated with one or more elements of binary data, referred to as “attachments” or “blobs”. The binary data can be synced between Couchbase Lite clients and the server via the Sync Gateway. In this post, we discuss how to create, retrieve, and update binary data attachments. We also take a look under the hood at how attachments are represented internally, the related idiosyncrasies, and how to deal with them.

Everything in this post applies to a Couchbase Mobile 2.x based deployment.

Background: Attachments and Blobs

Support for associating binary data with JSON documents within Couchbase Mobile has evolved over the years, and the internal representation of binary data within the JSON document has changed across versions. In Couchbase Mobile 1.x, binary data was stored in the form of “attachments” within a top-level “_attachments” attribute. Couchbase Mobile 2.0 introduced the blob data type for storing binary data. In most cases, the discrepancy between the representations is handled seamlessly by Couchbase Mobile, so end users don’t have to do anything special within their apps to deal with it. However, there are certain cases in which app developers have to take extra measures to deal with the discrepancy. We will discuss those measures in this post and try to address some commonly asked questions.

Workflow #1: Handling attachments created on Couchbase Lite

Let’s take a look at how you can create JSON documents with binary data attachments with Couchbase Lite and sync them over to the server-side. This is the flow that we will be describing:

Create binary data attachments on Couchbase Lite

Developers must use the Blob API for creating blob data. A document can be associated with one or more attachments or blobs. Here is a code snippet that shows the usage of this API in Swift. Refer to the developer documentation for equivalent code snippets for other platforms.

Internal Representation

When the document is created in Couchbase Lite, internally, it looks something like this:

Notice the “@type”: “blob” type entry created for the image type data.

Note that several system-level metadata attributes, such as _id, are included in the document. For brevity, not all of them are shown in the example. Applications must never make any assumptions about the format and availability of system-level metadata, and for that reason apps must never access those attributes directly. Always use the metadata retrieval options such as meta().id.
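To make the shape of a blob entry concrete, here is a minimal Python sketch. It is not the Couchbase Lite API, and it is only an approximation of the internal format: the “@type”: “blob” marker comes from the description above, while the content_type, digest, and length field names and the “sha1-” plus base64 digest format are assumptions made for illustration.

```python
import base64
import hashlib


def blob_metadata(content: bytes, content_type: str) -> dict:
    """Build a blob-style metadata entry for some bytes (illustrative only)."""
    # Assumed digest format: "sha1-" followed by the base64-encoded SHA-1 hash.
    sha1 = base64.b64encode(hashlib.sha1(content).digest()).decode("ascii")
    return {
        "@type": "blob",
        "content_type": content_type,
        "digest": "sha1-" + sha1,
        "length": len(content),
    }


# Stand-in bytes; a real app would load an actual image file.
image_bytes = b"\xff\xd8\xff\xe0 not a real jpeg"
document = {
    "type": "user",
    "name": "jane",
    "image": blob_metadata(image_bytes, "image/jpeg"),
}
```

The document itself carries only this small metadata entry; the binary content is stored separately and referenced by its digest.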

Syncing attachments to Sync Gateway

Sync Gateway is backward compatible with Couchbase Mobile 1.x. This implies that the Sync Gateway needs to be capable of processing binary data using the 1.x style _attachments representation as well as the 2.x blob type. It also implies that when a Couchbase Lite 2.x client pushes data up to the Sync Gateway, it needs to send it in a format that is compatible with 1.x clients.

For that reason, when Couchbase Lite syncs the document with the Sync Gateway, it adds the _attachments entry into the document. So the document when pushed up would look something like the example below. The list of attachments associated with the document is specified within the _attachments object.

Note that several system-level metadata attributes, such as _id, are included in the document. For brevity, not all of them are shown in the example. Applications must never make any assumptions about the format and availability of system-level metadata, and for that reason apps must never access those attributes directly. Always use the metadata retrieval options such as meta().id.
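The relationship between the blob entry and the _attachments entry can be sketched as follows. This is a rough Python model, not the replicator's actual implementation: the “blob_/” name prefix follows the attachment name used later in this post, and the revpos and stub fields, along with the sample digest value, are assumptions for illustration.

```python
def with_attachments(document: dict, revpos: int = 1) -> dict:
    """Return a copy of the document with an _attachments entry per top-level blob."""
    attachments = {}
    for key, value in document.items():
        if isinstance(value, dict) and value.get("@type") == "blob":
            # Assumed naming: "blob_/" followed by the property name.
            attachments["blob_/" + key] = {
                "content_type": value["content_type"],
                "digest": value["digest"],
                "length": value["length"],
                "revpos": revpos,  # assumed field: revision that added the attachment
                "stub": True,      # assumed field: content is sent separately
            }
    pushed = dict(document)
    if attachments:
        pushed["_attachments"] = attachments
    return pushed


local_doc = {
    "name": "jane",
    "image": {
        "@type": "blob",
        "content_type": "image/jpeg",
        "digest": "sha1-hypothetical-digest",  # placeholder, not a real hash
        "length": 1024,
    },
}
pushed_doc = with_attachments(local_doc)
```

Note that the blob entry stays in place; the _attachments entry is added alongside it, which is why both appear in the pushed document.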

Retrieval of attachment on Sync Gateway

The attachment data must be retrieved through the Sync Gateway _attachments REST endpoint. At the time of writing of this post, attachments cannot be directly managed using the Couchbase Server SDKs.

Here is a sample curl command to retrieve the attachment(s) associated with a document using the _attachments REST endpoint. You would replace the authorization header with suitable credentials corresponding to the user configured in your system. Also, notice that the name of the attachment, “blob_%2Fimage”, is the URL-encoded version of “blob_/image”.
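As a rough equivalent of such a request, the Python sketch below builds (but does not send) a GET request using only the standard library. The host, port, database name “demo”, user name, password, and the /{db}/{doc}/{attachment} path layout are assumptions; adjust them to match your Sync Gateway configuration.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def attachment_url(base_url: str, db: str, doc_id: str, attachment_name: str) -> str:
    """Join the URL parts, percent-encoding every path segment (including '/')."""
    segments = [quote(part, safe="") for part in (db, doc_id, attachment_name)]
    return base_url.rstrip("/") + "/" + "/".join(segments)


def basic_auth(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


# Hypothetical endpoint and credentials.
url = attachment_url("http://localhost:4984", "demo", "user::jane", "blob_/image")
request = Request(url, method="GET", headers={"Authorization": basic_auth("jane", "password")})
# urllib.request.urlopen(request).read() would return the attachment bytes.
```

Encoding each segment with safe="" is what turns the “/” inside the attachment name into “%2F”, so it is not mistaken for a path separator.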

Updating attachments on Sync Gateway

The attachment data must be updated through the Sync Gateway _attachments REST endpoint. At the time of writing of this post, mobile attachments cannot be managed using Couchbase Server SDKs.

Here is a sample curl command to update an attachment associated with a document using the _attachments REST endpoint. You would replace the authorization header with suitable credentials corresponding to the user configured in your setup. Also, notice that the “rev” parameter must be provided. This parameter corresponds to the revision of the document that is to be updated. You can retrieve the revision ID using the GET document REST API.
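The update request can be sketched along the same lines. This Python snippet only constructs the PUT request; the URL layout, database name, revision ID, and credentials placeholder are all hypothetical values chosen for illustration.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def update_attachment_request(base_url, db, doc_id, name, rev, data, content_type, auth_header):
    """Build a PUT request that replaces one attachment on a given document revision."""
    path = "/".join(quote(part, safe="") for part in (db, doc_id, name))
    # The "rev" query parameter identifies the document revision being updated.
    url = f"{base_url.rstrip('/')}/{path}?{urlencode({'rev': rev})}"
    headers = {"Content-Type": content_type, "Authorization": auth_header}
    return Request(url, data=data, method="PUT", headers=headers)


new_image = b"replacement image bytes"  # stand-in for real image data
put_request = update_attachment_request(
    "http://localhost:4984", "demo", "user::jane", "blob_/image",
    rev="2-abc123",                      # hypothetical revision ID
    data=new_image,
    content_type="image/jpeg",
    auth_header="Basic <credentials>",   # placeholder, not real credentials
)
```

If the revision passed in “rev” is not the current one, the server is expected to reject the update, so fetch the document first and use the revision it reports.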

But wait… a mismatch!

Now, after you update the attachment, suppose you retrieve the document using the GET document REST API.

The corresponding response would look something like this:

You will notice that the _attachments entry and the blob entry do not match. While the _attachments entry points to the latest image, the blob entry still describes the old image. But that is OK!

The reason for this discrepancy is that the Sync Gateway only deals with 1.x style attachments.

So how does this still work?

This works because, within the context of the Sync Gateway, which deals with 1.x style attachments, only the _attachments entry is honored.

But what about Couchbase Lite? What happens when the updated document is synced down by the Couchbase Lite 2.x client?

When the document is replicated by a Couchbase Lite 2.x client, Couchbase Lite looks for the presence of both _attachments and blobs within the document and uses that to identify a 2.x style document that was created by a 2.x client but subsequently updated by a 1.x client (such as the Sync Gateway REST API). It therefore treats the _attachments entry as the “real” attachment and reconciles the corresponding blob entry with it.
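The reconciliation idea can be illustrated with a small Python sketch. This is not Couchbase Lite's actual logic, only a simplified model of the rule described above: when both entries exist and disagree, the _attachments entry wins. The field names, the “blob_/” prefix, and the digest values are assumptions.

```python
def reconcile_blob(document: dict, key: str) -> dict:
    """Return a copy whose blob entry matches the _attachments entry, if they differ."""
    attachment = document.get("_attachments", {}).get("blob_/" + key)
    blob = document.get(key)
    if attachment is None or not isinstance(blob, dict) or blob.get("@type") != "blob":
        return document  # nothing to reconcile
    if blob.get("digest") == attachment["digest"]:
        return document  # already consistent
    updated = dict(document)
    # Treat the _attachments entry as the source of truth.
    updated[key] = {
        "@type": "blob",
        "content_type": attachment["content_type"],
        "digest": attachment["digest"],
        "length": attachment["length"],
    }
    return updated


synced = {
    "image": {"@type": "blob", "content_type": "image/jpeg",
              "digest": "sha1-old", "length": 100},          # stale blob entry
    "_attachments": {"blob_/image": {"content_type": "image/png",
                                     "digest": "sha1-new", "length": 250}},
}
fixed = reconcile_blob(synced, "image")
```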

From a developer’s perspective, all this is handled automatically, so you don’t really have to worry about any of the details. As a developer, you only need to know how to retrieve the updated attachment from within a Couchbase Lite enabled app.

Retrieval of updated attachment on Couchbase Lite

When the updated attachment is synced over to your Couchbase Lite app, use the Blob API to retrieve the data. Here is a code snippet that shows the usage of this API in Swift. Refer to the developer documentation for equivalent code snippets for other platforms.

Now let’s look at the reverse flow.

Workflow #2: Handling of attachments created on Sync Gateway

Let’s take a look at how you can create JSON documents with binary data attachments on Sync Gateway and sync them over to the Couchbase Lite side.
This is the flow that we will be describing:

Create binary data attachments on Server

In order to attach binary data on the Couchbase Server side that can be synced over to the clients via the Sync Gateway, you have to use the Sync Gateway attachments REST endpoint. At the time of writing, mobile-compatible attachments cannot be created directly using the Couchbase Server SDKs. The attachment data that is created through the Sync Gateway REST endpoint is persisted in the Couchbase Server bucket and synced over to Couchbase Lite clients, subject to the access control policies configured on the Sync Gateway.

In order to do this, first create a document (or retrieve a previously created document) and then create the attachment for the document. Alternatively, you could create a multi-part document with both JSON and binary data, but that could be tedious, as you would also need to generate the relevant attachment metadata. So the steps outlined below are my preferred option.

  • Create Document

A JSON document can be created directly on Couchbase Server using the Couchbase Server SDK or the admin UI, or you can create it using the PUT document REST API. The document could also have been synced up from a Couchbase Lite client.

Here is an example of using the Sync Gateway REST endpoint for creating a document with ID “user::jane”. You would replace the authorization header with suitable credentials corresponding to the user configured in your setup.

The response would look something like below:
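As an illustration of this step, the Python sketch below builds the PUT document request and extracts the revision ID from a reply. Everything specific here is hypothetical: the endpoint, the database name “demo”, the document body, and the sample response (including its id, ok, and rev fields) are assumptions, not output captured from a real Sync Gateway.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def create_document_request(base_url, db, doc_id, body, auth_header):
    """Build a PUT request that creates (or updates) a JSON document."""
    url = f"{base_url.rstrip('/')}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
    headers = {"Content-Type": "application/json", "Authorization": auth_header}
    return Request(url, data=json.dumps(body).encode("utf-8"), method="PUT", headers=headers)


def revision_from_response(response_text: str) -> str:
    """Return the revision ID reported in the reply, or raise if the save failed."""
    reply = json.loads(response_text)
    if not reply.get("ok"):
        raise ValueError(f"document was not saved: {reply}")
    return reply["rev"]


doc_request = create_document_request(
    "http://localhost:4984", "demo", "user::jane",
    {"type": "user", "name": "jane"},
    "Basic <credentials>",  # placeholder, not real credentials
)
# Hypothetical reply body, used here only to show how the revision is read.
sample_response = '{"id": "user::jane", "ok": true, "rev": "1-abc123"}'
rev = revision_from_response(sample_response)
```

The revision ID obtained this way is the value to pass as the “rev” parameter when creating the attachment in the next step.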

  • Create Attachment for Document on Sync Gateway

The attachment data must be created through the Sync Gateway _attachments REST endpoint. This step is identical to the previous flow when an attachment was being updated through the REST endpoint.

Here is a sample curl command to create an attachment associated with a document using the _attachments REST endpoint. You would replace the authorization header with suitable credentials corresponding to the user configured in your setup. The “rev” parameter must be provided. This parameter corresponds to the revision of the document that is to be updated.

Internal Representation

After the document is updated by the Sync Gateway, you can inspect its representation by retrieving the document using the GET document REST API.

The corresponding response would look something like this:

Creating attachments using Sync Gateway’s attachments REST API results in the 1.x style representation of attachments. Notice that there is no 2.x style “blob” metadata. This is important to note when you access the document on Couchbase Lite.

Retrieval of updated attachment on Couchbase Lite

When the previously created document is synced over to the Couchbase Lite side, Couchbase Lite detects that this is a 1.x style document and leaves the _attachments entry intact. It treats objects nested within the _attachments entry as blobs. However, the document is not automatically updated to include a 2.x style “blob” entry. So your app would need to use the Blob API to look for blobs in both locations: as a blob property of the document and within the _attachments entry.
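The “look in both locations” rule can be modeled in a few lines of Python. This is a plain-dictionary sketch, not the Couchbase Lite Blob API; the lookup order, the “blob_/” name prefix, and the sample field values are assumptions made for illustration.

```python
def find_attachment_metadata(document: dict, key: str):
    """Look for binary-data metadata as a 2.x blob property, then under _attachments."""
    value = document.get(key)
    if isinstance(value, dict) and value.get("@type") == "blob":
        return value  # 2.x style blob entry
    attachments = document.get("_attachments", {})
    # 1.x style: the attachment may be stored under the plain or prefixed name.
    for name in (key, "blob_/" + key):
        if name in attachments:
            return attachments[name]
    return None


created_on_gateway = {
    "name": "jane",
    "_attachments": {"image": {"content_type": "image/jpeg",
                               "digest": "sha1-hypothetical", "length": 512}},
}
created_on_client = {
    "name": "jane",
    "image": {"@type": "blob", "content_type": "image/jpeg",
              "digest": "sha1-hypothetical", "length": 512},
}
```

Both sample documents resolve to usable metadata, which is the behavior an app needs if it handles documents created on either side.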

FAQ

To wrap things up, I have compiled a list of commonly asked questions related to the handling of attachments in Couchbase Mobile.

Where are attachments stored?

On Couchbase Lite, attachments are stored in the Couchbase Lite database instance that contains the corresponding document. They are stored separately from the document, which contains the metadata that holds the reference to the attachment. If the same attachment is shared by multiple documents, only a single instance of the attachment is stored in the database.

On Couchbase Server, attachments are stored in the same Couchbase Server bucket as the corresponding document. They are stored separately from the document, which contains the metadata that holds the reference to the attachment. If the same attachment is shared by multiple documents, only a single instance of the attachment is stored in the bucket.
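One way to picture the single-instance behavior is a content-addressed store, where the attachment bytes are keyed by their digest. The Python sketch below is a toy model under that assumption; it does not describe the actual storage layout of Couchbase Lite or Couchbase Server.

```python
import base64
import hashlib


class AttachmentStore:
    """Toy content-addressed store: identical bytes are kept only once."""

    def __init__(self):
        self._by_digest = {}

    def put(self, content: bytes) -> str:
        digest = "sha1-" + base64.b64encode(hashlib.sha1(content).digest()).decode("ascii")
        # setdefault keeps the existing copy if this digest is already present.
        self._by_digest.setdefault(digest, content)
        return digest

    def get(self, digest: str) -> bytes:
        return self._by_digest[digest]

    def __len__(self) -> int:
        return len(self._by_digest)


store = AttachmentStore()
logo = b"shared logo bytes"
doc_a = {"logo_digest": store.put(logo)}
doc_b = {"logo_digest": store.put(logo)}  # same content, no second copy stored
```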

Is there a limit on the number of attachments that can be associated with a document?

You can attach one or more attachments to a JSON document. There is no hard limit on the number of attachments that can be associated with a document. However, since the attachment metadata is stored in the document xattrs (when shared_bucket_access is enabled), the number of attachments is bounded by the allowed sync metadata size per document. With attachment metadata ranging from 100 to 200 bytes per attachment and a sync metadata size limit of 1MB per document, there are practical limits on the number of attachments that can be associated with a document.
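A back-of-the-envelope calculation in Python gives a feel for that practical limit. It assumes 1MB means 1,048,576 bytes and that the entire sync metadata budget is available for attachment metadata, which overstates the real ceiling because other sync metadata shares the same space.

```python
SYNC_METADATA_LIMIT_BYTES = 1024 * 1024  # assumed interpretation of "1MB"


def max_attachments(bytes_per_attachment: int) -> int:
    """Upper bound on attachment count if metadata were the only thing stored."""
    return SYNC_METADATA_LIMIT_BYTES // bytes_per_attachment


most = max_attachments(100)   # smallest metadata entries
least = max_attachments(200)  # largest metadata entries
print(f"roughly {least} to {most} attachments per document")
```

So the ceiling is on the order of several thousand attachments per document, and in practice somewhat lower.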

What is the maximum size of an attachment?

The maximum size of each attachment is 20MB. This follows from the limits on document sizes on Couchbase Server. Couchbase Lite itself allows attachments larger than 20MB, which is fine as long as the attachment is local-only and is guaranteed never to be synced to the server. However, developers are cautioned against creating such large attachments, as they will be rejected by the Sync Gateway.
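An app can guard against this by checking the size before attaching data that is meant to be synced. The Python sketch below shows the idea; the helper name is hypothetical and the limit is assumed to be 20 × 1024 × 1024 bytes.

```python
MAX_SYNCABLE_ATTACHMENT_BYTES = 20 * 1024 * 1024  # assumed interpretation of "20MB"


def check_syncable(content_length: int) -> None:
    """Raise if an attachment of this size would be too large to sync."""
    if content_length > MAX_SYNCABLE_ATTACHMENT_BYTES:
        raise ValueError(
            f"attachment is {content_length} bytes; "
            f"the sync limit is {MAX_SYNCABLE_ATTACHMENT_BYTES} bytes"
        )


check_syncable(5 * 1024 * 1024)  # a 5MB attachment passes silently
```

Failing early on the client gives the user a clear error instead of a rejected sync later on.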

Do attachments sync every time the associated JSON document changes?

The replication protocol is optimized to only sync attachments when there are updates to them. This implies that they are not pushed or pulled by Couchbase Lite clients even if there are updates to other data in the associated JSON documents.
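The effect of this optimization can be modeled by comparing attachment digests between two revisions of a document. The Python sketch below is a simplification and not the replication protocol itself; the field names and digest values are assumptions.

```python
def attachments_to_transfer(old_doc: dict, new_doc: dict) -> list:
    """List attachment names whose content is new or changed between revisions."""
    old_digests = {
        name: meta["digest"]
        for name, meta in old_doc.get("_attachments", {}).items()
    }
    return [
        name
        for name, meta in new_doc.get("_attachments", {}).items()
        if old_digests.get(name) != meta["digest"]
    ]


rev1 = {"name": "jane", "_attachments": {"blob_/image": {"digest": "sha1-aaa"}}}
rev2 = {"name": "jane doe", "_attachments": {"blob_/image": {"digest": "sha1-aaa"}}}
rev3 = {"name": "jane doe", "_attachments": {"blob_/image": {"digest": "sha1-bbb"}}}
```

Going from rev1 to rev2 changes only the JSON, so no attachment needs to move; going from rev2 to rev3 changes the image digest, so that one attachment is transferred.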

How does the protocol handle failures when syncing attachments?

The protocol is robust in handling sync failures caused, for instance, by network disruptions. Documents are not persisted by the Sync Gateway or by Couchbase Lite until all associated attachments or blobs are successfully synced. So there could be a time window in which you end up with orphaned attachments/blobs that have no associated documents. That’s not an issue, because a subsequent sync of the document will recognize that the attachment is already persisted and will not attempt to resynchronize it.

What’s Next

Couchbase Mobile provides an easy-to-use interface to manage attachments. Check out the documentation for details on blob handling on each of the platforms.
If you have questions or feedback, please leave a comment below or feel free to reach out to me via Twitter or email. The Couchbase dev forums are a great place to engage with the Couchbase development community.

 

Author

Posted by Priya Rajagopal, Senior Director, Product Management

Priya Rajagopal is a Senior Director of Product Management at Couchbase responsible for developer platforms for the cloud and the edge. She has been professionally developing software for over 20 years in several technical and product leadership positions, with 10+ years focused on mobile technologies. As a TISPAN IPTV standards delegate, she was a key contributor to the IPTV standards specifications. She has 22 patents in the areas of networking and platform security.

8 Comments

  1. Hi Priya,
    Here is another question for the FAQ if possible.

    Since BLOBs or attachments are binaries and cannot be processed, searched, compressed, diff’d, etc., why are they stored inside a huge database instead of in a folder with a unique filename?

    Is it possible to have a folder containing a website, managing its database-documents-contents, being replicated to multiple nodes?

  2. Hi
    It depends on the use case. If you are talking about large volumes of binary data, then storing it in an external CDN/store is the preferred option. In that case, you will store binary data external to Couchbase Server and include the reference URL in the document. The app will be responsible for pulling down and pushing up the attachments and updating the corresponding references in the document when such changes are detected.

    The advantage of treating the binary attachments as “Documents” is that you are getting (eventual) consistency guarantees because the push and pull is handled by our sync protocol. We take care of detecting changes to attachments and pulling down changes as and when needed.

    Even if you choose to store the attachments in an external store to bypass storing them in the server bucket, if you are looking for offline-first capability, you will still need to locally persist the attachments in Couchbase Lite so they are available even when the client is offline.

    BTW, binary data compresses well.

  3. Hi Priya,

    Not able to understand why Couchbase Server SDKs don’t provide an option to directly add attachment (blobs) and we need to go through the sync gateway. Could you please elaborate on that? Are there any plans of adding that facility in the future releases?

    Best,
    PShri

  4. Hi
    Support for using the CBS SDK for creating binary data that can be synced to Couchbase Lite clients is on our radar. In the meantime, the typical pattern is to create the document via the SDK and use the Sync Gateway REST endpoint for associating attachments/binary data.

  5. How can I know when that gets added to the CBS SDK? Are there any places where I can subscribe and get to know about it. Thanks for taking time to respond.

    1. Not sure I understood the question. The same SDK client that is writing the document to the server bucket will be responsible for using the REST endpoint to associate/create the attachment. So instead of a single call to create the document and attachment, the SDK client will make two calls: an INSERT/UPSERT call to create the document in the bucket, followed by the _attachment REST API to create the attachment associated with the doc.

  6. Hi Priya,

    Do you propose another way to support huge attachments (size > 20MB)?
    Is it possible to split an attachment that is greater than 20MB?

    Thanks

    1. If you have such large attachments, it is recommended that you store them in an external CDN and stick the URL in the document. You probably don’t want to store such large binary data in a database. Your app would be responsible for managing the attachments. There is no built-in mechanism for chunking and merging the attachments.
