Couchbase Server 4.5 has just been released, so let’s try it out! A complete overview of all the great new features is available separately; this article highlights the new Sub-Document API feature.
What’s a sub-document? It is a fragment nested inside a JSON document. For example, a document may contain a sub-document that is accessible via the field ‘tags’.
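To make the idea concrete, here is a hypothetical document modelled as nested Java maps. The field names and values (`name`, `price`, `tags`, `color`, `labels`) are assumptions chosen for illustration; only the `tags` field comes from the text above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example document; all values are made up for illustration.
class SubDocumentExample {

    static Map<String, Object> document() {
        // The nested value stored under "tags" is the sub-document.
        Map<String, Object> tags = new LinkedHashMap<>();
        tags.put("color", "red");
        tags.put("labels", Arrays.asList("new", "sale"));

        // The enclosing map plays the role of the whole JSON document.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "T-Shirt");
        doc.put("price", 20);
        doc.put("tags", tags);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = document();
        System.out.println("whole document: " + doc);
        System.out.println("sub-document:   " + doc.get("tags"));
    }
}
```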
With earlier Couchbase versions (before 4.5), updating a document had to follow this pattern:
- Get the whole document which needs to be updated
- Update the document on the client side (e.g. by changing only a few properties)
- Write the whole document back
A simple Java code example would be:
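As a minimal sketch of this three-step pattern, the following plain-Java model uses an in-memory map in place of a real bucket. The `SERVER` map, the `get` and `replace` helpers, the key `user::1` and the field names are all assumptions made for illustration; they are not Couchbase SDK calls.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: an in-memory map stands in for the bucket. None of these
// names are Couchbase SDK calls; they only mirror the three steps above.
class FullDocumentUpdate {

    // Hypothetical server-side storage: document key -> whole document.
    static final Map<String, Map<String, Object>> SERVER = new HashMap<>();

    // Step 1: fetch a copy of the whole document.
    static Map<String, Object> get(String key) {
        return new HashMap<>(SERVER.get(key));
    }

    // Step 3: write the whole document back.
    static void replace(String key, Map<String, Object> doc) {
        SERVER.put(key, new HashMap<>(doc));
    }

    static Map<String, Object> demo() {
        Map<String, Object> original = new HashMap<>();
        original.put("name", "Ada");
        original.put("city", "London");
        original.put("visits", 1);
        SERVER.put("user::1", original);

        Map<String, Object> doc = get("user::1"); // 1. get the whole document
        doc.put("city", "Paris");                 // 2. change one property locally
        replace("user::1", doc);                  // 3. send the whole document back

        return SERVER.get("user::1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that all three fields travel in both directions even though only `city` changes.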
The new sub-document API is a server-side feature which allows you to (surprise, surprise …) only get or modify a sub-document of an existing document in Couchbase. The advantages are:
- Better usability on the client side
  - CRUD operations can be performed based on paths
  - In cases where the modification doesn’t rely on the previous value, you can update a document without fetching it first
  - You can more easily maintain key references between documents
- Improved performance
  - It saves network bandwidth and improves latency because you don’t need to transfer the whole document over the wire
The sub-document API also allows you to get or modify inner values or arrays of a (sub-)document. Operations fall into two groups:
- Lookup operations: Query the document for a specific path, e.g. GET, EXISTS
- Mutation operations: Modify one or multiple paths in a document, e.g. UPSERT, ARRAY_APPEND, COUNTER
A more detailed description of the API can be found in the Couchbase documentation: http://developer.couchbase.com/documentation/server/4.5/sdk/subdocument-operations.html.
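To illustrate what path-based lookup and mutation operations do, here is a toy model that resolves dotted paths on a nested Java map. It is a sketch only: the helper names mirror the operation names listed above, but the `SubDocPaths` class, its signatures and the sample document are assumptions, not the Couchbase SDK API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of path-based operations on a nested map.
// These helpers are illustrative; they are not the Couchbase SDK API.
class SubDocPaths {

    // Resolves every path segment except the last; null if the path breaks.
    @SuppressWarnings("unchecked")
    static Map<String, Object> parent(Map<String, Object> doc, String path) {
        String[] parts = path.split("\\.");
        Map<String, Object> node = doc;
        for (int i = 0; i < parts.length - 1; i++) {
            Object next = node.get(parts[i]);
            if (!(next instanceof Map)) {
                return null;
            }
            node = (Map<String, Object>) next;
        }
        return node;
    }

    static Map<String, Object> requireParent(Map<String, Object> doc, String path) {
        Map<String, Object> p = parent(doc, path);
        if (p == null) {
            throw new IllegalArgumentException("path not found: " + path);
        }
        return p;
    }

    static String leaf(String path) {
        return path.substring(path.lastIndexOf('.') + 1);
    }

    // Lookup: GET returns the value at the path, or null if absent.
    static Object get(Map<String, Object> doc, String path) {
        Map<String, Object> p = parent(doc, path);
        return p == null ? null : p.get(leaf(path));
    }

    // Lookup: EXISTS only reports whether the path is present.
    static boolean exists(Map<String, Object> doc, String path) {
        Map<String, Object> p = parent(doc, path);
        return p != null && p.containsKey(leaf(path));
    }

    // Mutation: UPSERT sets the value at the path, creating or replacing it.
    static void upsert(Map<String, Object> doc, String path, Object value) {
        requireParent(doc, path).put(leaf(path), value);
    }

    // Mutation: ARRAY_APPEND adds an element to the end of an existing array.
    @SuppressWarnings("unchecked")
    static void arrayAppend(Map<String, Object> doc, String path, Object value) {
        Object target = get(doc, path);
        if (!(target instanceof List)) {
            throw new IllegalArgumentException("not an array: " + path);
        }
        ((List<Object>) target).add(value);
    }

    // Mutation: COUNTER adds a delta to a number and returns the new value.
    static long counter(Map<String, Object> doc, String path, long delta) {
        Map<String, Object> p = requireParent(doc, path);
        Object current = p.get(leaf(path));
        long updated = (current == null ? 0L : ((Number) current).longValue()) + delta;
        p.put(leaf(path), updated);
        return updated;
    }

    // Hypothetical sample document with a "tags" sub-document.
    static Map<String, Object> sample() {
        List<Object> labels = new ArrayList<>();
        labels.add("new");
        Map<String, Object> tags = new HashMap<>();
        tags.put("color", "red");
        tags.put("labels", labels);
        Map<String, Object> doc = new HashMap<>();
        doc.put("visits", 1);
        doc.put("tags", tags);
        return doc;
    }

    static Map<String, Object> demo() {
        Map<String, Object> doc = sample();
        upsert(doc, "tags.color", "blue");
        arrayAppend(doc, "tags.labels", "sale");
        counter(doc, "visits", 2);
        return doc;
    }
}
```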
Updating a document can now follow this pattern:
- Update a property or sub-document directly by specifying the path under which it can be found
Our Java example would now be simplified to:
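A rough sketch of the simplified pattern, again as a plain-Java toy model rather than real SDK code: the client sends only the key, the path and the new value, and the change is applied where the document lives. The `SERVER` map, the `upsert` helper and the sample document are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: the path update is applied "server side", so the client
// never fetches or resends the whole document. Not the Couchbase SDK API.
class SubDocumentUpdate {

    // Hypothetical server-side storage: document key -> whole document.
    static final Map<String, Map<String, Object>> SERVER = new HashMap<>();

    // Only key, path and value cross the (simulated) wire.
    @SuppressWarnings("unchecked")
    static void upsert(String key, String path, Object value) {
        Map<String, Object> node = SERVER.get(key);
        String[] parts = path.split("\\.");
        for (int i = 0; i < parts.length - 1; i++) {
            node = (Map<String, Object>) node.get(parts[i]);
        }
        node.put(parts[parts.length - 1], value);
    }

    static Map<String, Object> demo() {
        Map<String, Object> address = new HashMap<>();
        address.put("city", "London");
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "Ada");
        doc.put("address", address);
        SERVER.put("user::1", doc);

        // Single step: update one property by its path.
        upsert("user::1", "address.city", "Paris");
        return SERVER.get("user::1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```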
Couchbase Server does not have a built-in transaction manager, but when people talk about transactional behavior, the requirements are quite often less than what an ACID transaction manager would provide (e.g. handling just concurrent access instead of being fully ACID compliant). In Couchbase a document has a so-called C(ompare) A(nd) S(wap) value. This value changes as soon as a document is modified on the server side, which enables the following update pattern:
- Get a document with a specific CAS value
- Change the properties on the client side
- Try to replace the document by passing the old CAS value. If the CAS value has changed on the server side in the meantime, you know that someone else modified the document, so you can retry applying your changes.
So CAS is used for an optimistic locking approach. It’s optimistic because you expect that you can apply your changes, and you handle the case where this wasn’t possible because someone else changed the document first. A pessimistic approach would be to lock the document upfront so that no one else can write it until the lock is released.
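The optimistic pattern can be sketched with a small in-memory model. `Store`, `Versioned` and `update` are hypothetical names invented for this example, and the CAS value is modelled as a simple counter; a real server generates its own CAS values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of optimistic locking with a CAS value.
// Store, Versioned and update are hypothetical names, not SDK classes.
class CasRetry {

    static final class Versioned {
        final long cas;
        final Map<String, Object> content;

        Versioned(long cas, Map<String, Object> content) {
            this.cas = cas;
            this.content = content;
        }
    }

    static final class Store {
        private final Map<String, Versioned> docs = new HashMap<>();
        private long nextCas = 1;

        // Every successful write assigns a fresh CAS value.
        synchronized void upsert(String key, Map<String, Object> content) {
            docs.put(key, new Versioned(nextCas++, new HashMap<>(content)));
        }

        // Returns a private copy together with the CAS value it was read at.
        synchronized Versioned get(String key) {
            Versioned v = docs.get(key);
            return new Versioned(v.cas, new HashMap<>(v.content));
        }

        // Succeeds only if nobody wrote the document since expectedCas was read.
        synchronized boolean replace(String key, Map<String, Object> content, long expectedCas) {
            if (docs.get(key).cas != expectedCas) {
                return false;
            }
            upsert(key, content);
            return true;
        }
    }

    // Get, change locally, replace with the old CAS; retry on conflict.
    // Returns the number of attempts that were needed.
    static int update(Store store, String key, Consumer<Map<String, Object>> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            Versioned current = store.get(key);
            change.accept(current.content);
            if (store.replace(key, current.content, current.cas)) {
                return attempts;
            }
        }
    }

    // Shows that a write based on an outdated read is rejected.
    static boolean staleWriteRejected() {
        Store store = new Store();
        Map<String, Object> doc = new HashMap<>();
        doc.put("city", "London");
        store.upsert("user::1", doc);

        Versioned mine = store.get("user::1");                 // my read
        update(store, "user::1", d -> d.put("city", "Paris")); // someone else writes
        mine.content.put("city", "Berlin");
        return !store.replace("user::1", mine.content, mine.cas); // stale CAS
    }

    public static void main(String[] args) {
        System.out.println("stale write rejected: " + staleWriteRejected());
    }
}
```

The retry loop in `update` re-reads the document on every attempt, so the change is always applied to the latest version rather than to the stale copy.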
You could now ask the following question:
- What happens if I modify a sub-document and someone else updates the same or another sub-document of the same document?
Sub-document operations are atomic. Atomicity means all or nothing: if you update a sub-document and do not receive an error, you can be sure that the update was performed on the server side. For example, if 5 clients each append an element to an embedded array, all 5 values will have been appended. However, atomicity does not imply consistency of state, so it tells you nothing about conflicts. If 2 clients update the same sub-document, both updates are performed, but to find out whether there was a conflict between them you still need the CAS value (or pessimistic locking instead). If you are sure that the clients act on different sub-documents, there can be no conflict and the CAS value is not required.
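The difference between atomicity and conflict detection can be illustrated with a small thread-based model. This is a sketch under assumptions: the `AtomicAppend` class is hypothetical, and `synchronized` methods stand in for the server applying each sub-document operation atomically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model: synchronized methods stand in for atomic server-side
// sub-document operations. Not the Couchbase SDK API.
class AtomicAppend {

    private final List<String> tags = new ArrayList<>();
    private String color = "red";

    // Each append is applied completely or not at all.
    synchronized void arrayAppend(String value) {
        tags.add(value);
    }

    // Each upsert is atomic too, but it silently overwrites the old value.
    synchronized void upsertColor(String value) {
        color = value;
    }

    synchronized List<String> tags() {
        return new ArrayList<>(tags);
    }

    synchronized String color() {
        return color;
    }

    static AtomicAppend demo() {
        AtomicAppend doc = new AtomicAppend();

        // Five concurrent clients each append one element.
        ExecutorService clients = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 5; i++) {
            final String value = "tag-" + i;
            clients.submit(() -> doc.arrayAppend(value));
        }
        clients.shutdown();
        try {
            if (!clients.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("clients did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Two writers update the same path: both succeed, and without a
        // CAS check neither learns that the other one was there.
        doc.upsertColor("blue");
        doc.upsertColor("green");
        return doc;
    }

    public static void main(String[] args) {
        AtomicAppend doc = demo();
        System.out.println(doc.tags().size() + " tags, color=" + doc.color());
    }
}
```

All five appends survive regardless of the order in which the threads run, while the first of the two color updates is overwritten without any error being raised.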
The new Sub-Document API is one of the great new features of Couchbase 4.5. It lets you avoid fetching the whole document when you only need to read or modify a part of it, which means better usability from a client-side point of view. One of the main advantages is improved performance, especially when working with bigger documents.