As I understand it, there can be at most 1,024 ATR records, one per vBucket. When a new transaction begins, how does the SDK decide which ATR record that transaction goes to?
From a POC we ran, the choice always appears to be based on the ID of the first document being mutated, and to have nothing to do with the transaction ID.
Is my understanding correct?
That’s right. The ATR used is in the default collection of the bucket of the first mutated document, and the ATR document is on the same vBucket as that document - hence the 1,024 ATRs (by default), to match the default number of vBuckets. This minimises the set of nodes that are pulled into a transaction, to maximise overall availability (Bailis’ paper Highly Available Transactions is interesting reading if you’re curious). It also makes it more likely that initialising the ATR entry and staging the first document will be written to physical disk in the same flush, for a performance boost when using persist-level durability settings.
These are current implementation details though, so shouldn’t be relied upon.
If you want to change where the ATRs are you can do so, using the custom metadata collection feature.
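To make the selection rule above concrete, here is a minimal sketch rather than the SDK's actual code. It assumes the usual CRC32-based key-to-vBucket mapping. The function names and the `atr-vb-<n>` label are hypothetical placeholders standing in for "the ATR document that lives on vBucket n"; the real ATR keys look different. As noted above, this is an implementation detail and may change.

```python
import zlib
from typing import Sequence

NUM_VBUCKETS = 1024  # default vBucket count mentioned in the answer


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a document key to a vBucket using a CRC32-based hash (assumed scheme)."""
    crc = zlib.crc32(key.encode("utf-8"))
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


def atr_for_transaction(mutated_doc_ids: Sequence[str]) -> str:
    """Pick the ATR from the first mutated document only.

    "atr-vb-<n>" is a hypothetical label, not the real ATR key format.
    The transaction ID plays no part in the choice.
    """
    first_doc_id = mutated_doc_ids[0]
    return f"atr-vb-{vbucket_for_key(first_doc_id)}"


# Two transactions that start with the same document land on the same ATR,
# whatever they mutate afterwards.
print(atr_for_transaction(["order::1001", "customer::42"]))
print(atr_for_transaction(["order::1001", "invoice::7"]))
```

Because only the first mutated document is consulted, the ATR and that document share a vBucket, and therefore the same active node.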
Thanks for the explanation @graham.pople
One more question on this thread: can you please explain how remove works with respect to staging and the eventual removal of the document?
During staging it’s very similar to replaces, except it doesn’t stage a post-transaction version of the doc. The staged metadata also includes an indication of what type of operation it is. During the commit phase the document is just removed.
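As a rough illustration of that difference, the sketch below models documents as in-memory dicts. It is not the SDK's implementation: the `txn`, `op_type` and `staged_body` field names and all function names are made up for this example.

```python
from typing import Any, Dict

# In-memory stand-in for a collection:
# doc id -> {"body": current content, "txn": staged metadata or None}
Store = Dict[str, Dict[str, Any]]


def stage_replace(store: Store, doc_id: str, new_body: Any) -> None:
    """Stage a replace: record the op type and the post-transaction body."""
    store[doc_id]["txn"] = {"op_type": "replace", "staged_body": new_body}


def stage_remove(store: Store, doc_id: str) -> None:
    """Stage a remove: record only the op type; no post-transaction body."""
    store[doc_id]["txn"] = {"op_type": "remove"}


def commit(store: Store, doc_id: str) -> None:
    """Apply whatever was staged on the document."""
    staged = store[doc_id]["txn"]
    if staged is None:
        return  # nothing staged on this document
    if staged["op_type"] == "remove":
        del store[doc_id]  # the document is simply removed
    else:
        store[doc_id] = {"body": staged["staged_body"], "txn": None}


store: Store = {
    "a": {"body": {"n": 1}, "txn": None},
    "b": {"body": {"n": 5}, "txn": None},
}
stage_replace(store, "a", {"n": 2})
stage_remove(store, "b")
# Until commit, both documents still hold their original bodies.
commit(store, "a")
commit(store, "b")
print(store)
```

In this model the original body stays in place while the operation is staged, and the staged operation type tells the commit step whether to write the staged body or delete the document.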