Insert in transactions - Moving from staged to new document creation - how does it work?

Hi

For key-value transactions, it is evident that during a replace the new version of the document is held in the xattrs of the document's metadata. This staged value in the xattrs is moved to the original document after the transaction is committed.

However, for a new document inserted during a transaction, how does the document move from staged to an actual inserted document?
Are these staged new documents held in the Couchbase cluster or in the SDK?

Kindly let me know, or please point me to any available documentation that explains how this is handled in Couchbase.

Regards,
Venkat

Hi @sri_ram
That’s correct, for replaces and removes the post-transaction version of the document is held in the xattrs.
For inserts a hidden document is created for this staging purpose, which is converted to a real document at commit point. This hidden document handles write-write conflicts if two transactions try to concurrently insert the same document, and allows the transaction to be rolled back or completed by the async cleanup process if the application is unable to finish the transaction (for example if it crashes).
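
To make the staging idea concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the behaviour described above, not how Couchbase or the SDK implements it; `ToyBucket`, `stage_insert`, `commit_insert` and `rollback_insert` are hypothetical names invented for illustration.

```python
class ToyBucket:
    """In-memory stand-in for a bucket; not a Couchbase API."""

    def __init__(self):
        self.docs = {}    # real documents: key -> body
        self.hidden = {}  # staged inserts: key -> (txn_id, body)

    def stage_insert(self, txn_id, key, body):
        # The hidden document doubles as a marker, so a second
        # transaction inserting the same key is detected as a conflict.
        if key in self.docs or key in self.hidden:
            raise RuntimeError(f"write-write conflict on {key!r}")
        self.hidden[key] = (txn_id, body)

    def commit_insert(self, txn_id, key):
        # Convert the hidden document into a real one.
        owner, body = self.hidden[key]
        if owner != txn_id:
            raise RuntimeError(f"{key!r} is staged by {owner!r}")
        del self.hidden[key]
        self.docs[key] = body

    def rollback_insert(self, txn_id, key):
        # Drop the hidden document; nothing ever became visible.
        owner, _ = self.hidden[key]
        if owner == txn_id:
            del self.hidden[key]


bucket = ToyBucket()
bucket.stage_insert("t1", "order::1", {"total": 10})
visible_while_staged = "order::1" in bucket.docs    # False

try:
    bucket.stage_insert("t2", "order::1", {"total": 99})
    conflict_detected = False
except RuntimeError:
    conflict_detected = True                        # t2 loses the race

bucket.commit_insert("t1", "order::1")
visible_after_commit = "order::1" in bucket.docs    # True
```

The point of the sketch is that the staged insert lives next to the real data (on the cluster side in the real system), so both the conflict check and a later rollback or completion can happen without the original client.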

@graham.pople

Thanks for the quick response. As I understand it from the above, this hidden document also resides on one of the data nodes.

So for a replace we would have 3 round trips from client to cluster:
get document → stage (update the xattrs) → trigger to replace the document body with the xattrs content (if the transaction is successful).

For an insert it would be 2 round trips from client to cluster:
stage (create hidden document) → trigger to convert the hidden document to a real document.

Please let me know if my understanding is correct.

Yes, correct on all counts, including that the hidden document is on the data nodes.

Hi @graham.pople

Apologies for one more query; I hope this will be the last one on transactions 🙂

During inserts for the below:

For insert it would be 2 round trips from client to cluster:
stage(create hidden document) → trigger to convert hidden document to real document.

Say a single transaction contains 1k insertions, and after the transaction is committed the client starts converting hidden documents to real documents. What happens if my client crashes after converting only 500 documents, without triggering the conversion for the remaining ones?

Could you explain how Couchbase achieves ACID compliance in this scenario?

  • Would the documents already moved from hidden to real be available for lookups by other clients?
  • How would the remaining documents be moved from hidden to real documents?

Regards,
Venkat

Hi @sri_ram
There is an asynchronous cleanup that’s responsible for cleaning up any transactions that couldn’t be completed, due to an application crash or some other reason, which you can read up on here. So this will find the failed transaction (usually within 60 seconds; this is configurable) and finish committing the remaining 500 documents.

This async cleanup is currently run client-side, so at least one application needs to be running and have initialised the Transactions object.
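
As a rough illustration of what such a cleanup pass does, here is a toy Python sketch. The `cleanup` function, the dictionary layout and the state names (`COMMITTED`, `PENDING`, `COMPLETED`, `ROLLED_BACK`) are assumptions made up for this example, and it skips whatever checks the real cleanup uses to decide that a transaction has been abandoned.

```python
def cleanup(atr, hidden, docs):
    """Toy cleanup pass: finish or roll back abandoned transactions."""
    for txn_id, entry in atr.items():
        for key in entry["keys"]:
            if key not in hidden:
                continue  # the original client already converted this one
            body = hidden.pop(key)
            if entry["state"] == "COMMITTED":
                docs[key] = body  # finish the commit
            # any other state: the staged insert is discarded (rollback)
        entry["state"] = (
            "COMPLETED" if entry["state"] == "COMMITTED" else "ROLLED_BACK"
        )


# t2 committed 1000 inserts, but its client crashed after converting 500.
docs = {f"doc{i}": {"n": i} for i in range(500)}
hidden = {f"doc{i}": {"n": i} for i in range(500, 1000)}
atr = {"t2": {"state": "COMMITTED", "keys": [f"doc{i}" for i in range(1000)]}}

# t3 never reached the commit point before its client went away.
hidden["tmp"] = {"n": -1}
atr["t3"] = {"state": "PENDING", "keys": ["tmp"]}

cleanup(atr, hidden, docs)
```

After the pass, all 1000 of t2's documents are real documents, and t3's staged insert is gone without ever having been visible.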

In addition we have Monotonic Atomic View (MAV) reads inside transactions. This means that any transaction T1 reading a document that has staged data from transaction T2, where T2 has reached the commit point, will see the post-transaction version of the document. So T1 will see all 1k inserts as committed once it does reads after T2 has committed, regardless of the commit state of individual documents.
This is called Read Atomicity, since it presents an atomic commit at the read point - rather than at the write point, which would be too expensive in a distributed system as it would require locking across multiple nodes. The mechanics of MAV reads are that if T1 finds a document that has staged data from T2, it checks T2’s ATR (Active Transaction Record) entry to see whether T2 has committed.
So our isolation level is really Monotonic Atomic View, higher than the Read Committed we state. We just don’t tend to mention MAV in the documentation as it’s not a widely known concept at present - though we expect that to change. If you’re interested, there is more information on MAV on jepsen.io and the Bailis paper.
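
The MAV read rule described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: `transactional_get`, the `staged` and `atr` dictionaries and the state strings are hypothetical names for illustration, not the SDK's API or data layout.

```python
def transactional_get(key, docs, staged, atr):
    """MAV-style read (toy): honour staged data from committed transactions."""
    if key in staged:
        txn_id, staged_body = staged[key]
        # Extra lookup, only paid when the document is in a transaction.
        if atr.get(txn_id) == "COMMITTED":
            return staged_body  # post-transaction version
    return docs.get(key)        # current body, or None if not visible


docs = {"a": "v1"}              # current document body
staged = {"a": ("t2", "v2")}    # T2 has staged a replace of "a"
atr = {"t2": "PENDING"}         # T2 has not reached the commit point

read_before_commit = transactional_get("a", docs, staged, atr)  # "v1"

atr["t2"] = "COMMITTED"         # T2 reaches the commit point
read_after_commit = transactional_get("a", docs, staged, atr)   # "v2"
```

Note that the body of `"a"` never changes in this example; only the ATR state flips, and that alone changes what the transactional reader sees.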

This makes me wonder what happens with a document lookup that is not inside a transaction. If Monotonic Atomic View (MAV) reads only happen inside transactions, does that mean a non-transactional lookup on an inserted or replaced document that is still staged in a committed transaction would return non-committed data? Is my understanding correct?

A regular non-transactional KV or N1QL read will not perform MAV logic, and will return the document without T2’s staged changes (the “non-committed” version), regardless of the state of T2’s ATR entry.
In other words, transactional reads are at MAV isolation level, and non-transactional reads are at Read Committed. It’s in keeping with our philosophy that you should only pay for what you use, as MAV reads do involve a small cost if the document is discovered to be in a transaction and an ATR entry needs to be looked up.
Note that if you always need MAV reads, a read-only transaction is very cheap as it doesn’t need to create an ATR entry.

Thanks for responding to so many of my queries.

The above explanation makes it clear how non-committed data would be read by a regular non-transactional KV or N1QL read.

My new query is about inserts where only some of the hidden documents have been converted to real documents. Say, for example, the steps below happen in order.

  1. A transaction begins for 100 inserts.
  2. The transaction is committed.
  3. The client initiates the move of hidden documents to real documents.
  4. 90 documents are moved from hidden documents to real documents.
  5. Before the other 10 documents are moved from hidden to real documents, my client crashes.
  6. The cleanup process has not yet been initiated by any other client.
  7. Before the cleanup process starts, another client does a non-transactional KV lookup on one of the 90 documents.
  8. As per my understanding, this client would still get the document it looked up in step 7, as the transaction is committed and the inserted document has been moved from hidden to real.

Please let me know whether my understanding of step 8 is correct.

Yes, that’s correct. A non-transactional read will see any of the 90 documents as committed, but will not see the remaining 10 documents until those are handled by cleanup. That’s the difference between MAV and Read Committed.

But as mentioned, if it is an issue for your use-case then you can always get transactional MAV reads very cheaply with read-only transactions. If the document is not in a transaction, then a transactional read is essentially the same cost as a non-transactional one (the difference is a few bytes on the wire).
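
The 90/10 scenario can be replayed with a small Python toy to show the two read paths side by side. Everything here (`plain_get`, `mav_get`, the dictionaries and the `COMMITTED` state string) is a hypothetical model for illustration, not the SDK's API.

```python
keys = [f"doc{i}" for i in range(100)]

# State after the client crash: 90 real documents, 10 still hidden,
# and the transaction's ATR entry already marked as committed.
docs = {k: {"id": k} for k in keys[:90]}
staged = {k: ("t1", {"id": k}) for k in keys[90:]}
atr = {"t1": "COMMITTED"}


def plain_get(key):
    """Non-transactional read: only real documents are visible."""
    return docs.get(key)


def mav_get(key):
    """Transactional (MAV) read: also honours committed staged data."""
    if key in staged:
        txn_id, body = staged[key]
        if atr.get(txn_id) == "COMMITTED":
            return body
    return docs.get(key)


plain_visible = sum(plain_get(k) is not None for k in keys)  # 90
mav_visible = sum(mav_get(k) is not None for k in keys)      # 100
```

For the 90 documents that are already real, `mav_get` never touches `atr`, which mirrors the point that a transactional read of a document not in a transaction costs about the same as a plain read.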

And no problem with the questions, keep them coming 🙂
