Supplying CAS to set_multi() or replace_multi() ...

Gentlefolk,

Neither set_multi() nor replace_multi() appears to have a way for me to supply a CAS value per item. delete() does supply such a mechanism. It is documented in the connection.delete() section of the Python SDK documentation:

Remove multiple keys with CAS:

            oks = cb.delete({
                "key1" : cas1,
                "key2" : cas2,
                "key3" : cas3
            })

Oddly, delete_multi() does not support supplying CAS. Nor does it appear that set() or replace() accepts a dictionary of keys the way delete() does.

There are three best practices I am trying to follow in building my app: 1) use optimistic locking, 2) use idempotent mutation operations so that optimistic locking can be retried, and 3) use bulk operations. To follow these practices, I must track and maintain the key/value/cas triple for each document. If the bulk APIs do not support CAS, then I have to either abandon bulk operations or use pessimistic locking and endure the performance hit that strategy entails. I doubt that either is the intention of the Couchbase team.
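To make the first two practices concrete, here is a minimal sketch of a CAS-guarded retry loop. It runs against a small in-memory stand-in rather than the real SDK: FakeStore, CasMismatch, and update_with_retry are hypothetical names invented for illustration, and the real client's exceptions and return types will differ.

```python
import itertools


class CasMismatch(Exception):
    """Raised when the supplied CAS no longer matches the stored one."""


class FakeStore:
    """In-memory stand-in for a CAS-aware key/value store (hypothetical)."""

    def __init__(self):
        self._data = {}                       # key -> (value, cas)
        self._next_cas = itertools.count(1)

    def get(self, key):
        return self._data[key]                # returns (value, cas)

    def set(self, key, value, cas=0):
        current = self._data.get(key)
        # cas=0 means "no check"; otherwise it must match the stored CAS.
        if cas and current is not None and current[1] != cas:
            raise CasMismatch(key)
        new_cas = next(self._next_cas)
        self._data[key] = (value, new_cas)
        return new_cas


def update_with_retry(store, key, mutate, attempts=5):
    """Read, apply an idempotent mutation, and write back guarded by CAS."""
    for _ in range(attempts):
        value, cas = store.get(key)
        try:
            return store.set(key, mutate(value), cas=cas)
        except CasMismatch:
            continue                          # another writer won; re-read
    raise RuntimeError("gave up after %d attempts" % attempts)
```

Because the mutation is idempotent, re-running it after a CAS mismatch is safe. The open question in this thread is how to do the same thing for many keys in one bulk call.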

In my opinion, the Couchbase Python SDK is making it difficult to follow these three best practices. Clearly, the Couchbase team and the user community need to have a discussion about how to actually follow these best practices using Python.

Finally, if the Couchbase Python SDK documentation is in massive error, then that needs to be corrected. While this product has open source roots, your salesmen remind me in email that it is a commercial product. They seem to be in a hurry for me to finish my prototyping effort. Perhaps the documentation might be in the way of my completing this prototype? Or perhaps the SDK is not really a v1.0 design and it has been inappropriately promoted to that status?

This SDK could be better. Let's start to talk about how and why.

Andrew

P.S. To establish my operational bona fides: I have a pair of daemons inserting and deleting items in a database of 10+ million documents. I do bulk reads and use single-operation CAS set()s and delete()s. I am happy with the results so far, but I haven't really started stressing things or using the database aggressively.

P.P.S. I've also attended one of your Couchbase Dev Days. Hence, I think I have a pretty good intellectual understanding of your technology. While I'm a Python and Couchbase noob, I know my data and its structure very well.

Mark & Company,

I've now tried to use the documented feature of delete() to bulk delete with CAS and it, in fact, throws an error:

def delete_rows(rows):
 
    cas_dict = {row.key: row.cas for row in rows}
 
    cb.delete(cas_dict, quiet=True)

Error:

   ...
   File "/Users/awd/Projects/ClientX/CXTwitter/src/trim_statuses.py", line 334, in delete_rows
    cb.delete(cas_dict, quiet=True)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/couchbase/connection.py", line 503, in delete
    return _Base.delete(self, key, cas, quiet)
couchbase.exceptions.ValueFormatError: <Must be unicode or string, C Source=(src/convert.c,93), OBJ={'us:111657404': 1735979066832715776, 'us:1456779943': 8407121342714085376, ...}>

Please note that the dictionary, "{'us:111657404': 1735979066832715776, 'us:1456779943': 8407121342714085376, ...}", other than the ellipsis, is the exact documented format.

Clearly, the C function is not expecting the dictionary of keys and CASes.

Having to do this kind of experimental programming is very annoying. The documentation says it works. The reality is it doesn't. Considering the exception being thrown, I doubt it ever did.
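Until the dictionary form works, a per-key fallback is possible. This is only a sketch: it assumes that delete() accepts cas and quiet as keyword arguments (the traceback above shows it forwarding key, cas, quiet) and that each row exposes .key and .cas as in delete_rows(). The name delete_rows_one_by_one is made up for illustration.

```python
def delete_rows_one_by_one(cb, rows, quiet=True):
    """Delete each row with its own CAS; one round trip per key.

    `cb` is any client whose delete() accepts (key, cas=..., quiet=...);
    that keyword signature is an assumption, not confirmed documentation.
    Returns a dict mapping each key to whatever delete() returned.
    """
    results = {}
    for row in rows:
        results[row.key] = cb.delete(row.key, cas=row.cas, quiet=quiet)
    return results
```

This keeps optimistic locking intact but gives up the bulk round trip, which is exactly the trade-off I am trying to avoid.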

Andrew

Have you tried delete_multi?

1 Answer

set_multi does allow a CAS. However, this is currently neither supported nor documented (it will likely be exposed in future versions). The key issue is allowing the "idiomatic" way of setting values using a dictionary while also being able to supply the CAS.

There's actually support for this. The reason you cannot simply supply a CAS to set_multi is the ambiguity of each dictionary value: is it a value, or is it a set of parameters? For this reason there is an 'Arguments' class designed for this purpose. Unfortunately it has not been documented in the current version, as tests still need to be written for it.

It's available from couchbase.Arguments; and it's a dict subclass which accepts "value", "cas", and "ttl" fields. You use this as a dictionary value for each item you pass into the dict for set_multi.
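As a rough sketch of the shape described above, the snippet below builds a per-key payload from key/value/cas triples. The Arguments class here is a local stand-in defined only so the example runs on its own; the real couchbase.Arguments is undocumented and untested, so its exact behavior is an assumption. build_set_multi_payload is a hypothetical helper name.

```python
class Arguments(dict):
    """Local stand-in for the undocumented couchbase.Arguments (assumption)."""


def build_set_multi_payload(triples):
    """Map each key to an Arguments holding its value and CAS.

    `triples` is an iterable of (key, value, cas) tuples.
    """
    return {key: Arguments(value=value, cas=cas) for key, value, cas in triples}


payload = build_set_multi_payload([
    ("key1", {"n": 1}, 111),
    ("key2", {"n": 2}, 222),
])
# With the real class, the payload would presumably be handed to
# cb.set_multi(payload); that call is not verified here.
```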

In general, each method in the set() family accepts more than a single parameter for each key; for example, there are "ttl", "flags", "format", "value", and "cas" (the last is not applicable to add(), though).

Feel free to file bugs on our issue trackers if you think there are features missing.

Mark,

Thank you for your comments.

As you can see from my comment on my own post, I am quite frustrated trying to use the more advanced features of your SDK. Promoting a commercial product to v1.0.0 comes with certain assurances, such as that it performs as documented. Consider this a bug report that your documentation is in error and needs to be fixed. (I also filed a bug report on another thread about CAS not being documented in ValueResult or its superclasses.) Frankly, as documentation is part of being a commercial product, I think the v1.0.0 promotion is perhaps premature.

My main question is pretty simple: what is the point of exposing the *_multi() variants of the API calls if I cannot pass along the CAS value? Without the CAS, they are useful in only a few cases, none of which is likely to be on any app's critical performance path. You probably shouldn't bother exposing them. Regardless of whether you agree with my sentiment about the *_multi() variants, Couchbase needs to remove the documentation on passing a key/cas dictionary to the non-functioning delete() method.

As to your naming problem, this gets back to the point I've made in a different post -- that the API should really revolve around the key/value/cas triple. Your ValueResult is a suitable container. Following your advice, I've made my own property-compatible container, CBResult, for when a ValueResult is unavailable. Your *_multi() variants of the API should accept iterables of ValueResult or interface-compatible CBResult-like objects. That would solve your naming problem. Basically, your query API and your get_multi() call return structures that are a list comprehension away from being what should be sent to your *_multi() variants.
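To illustrate the "list comprehension away" point, here is a minimal sketch. It assumes a get_multi()-style result that behaves like a dict mapping each key to an object with .value and .cas attributes. The CBResult below is a simple namedtuple stand-in for my container, and to_triples is a name invented for this example.

```python
from collections import namedtuple

# Stand-in for a ValueResult-compatible container: the key/value/cas triple.
CBResult = namedtuple("CBResult", ["key", "value", "cas"])


def to_triples(multi_result):
    """Turn a {key: result} mapping into a list of key/value/cas triples.

    Each result is assumed to expose .value and .cas attributes.
    """
    return [CBResult(key, r.value, r.cas) for key, r in multi_result.items()]
```

A *_multi() call that accepted such an iterable directly would not need to guess whether a dictionary value is a document or a set of parameters.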

Because you and your team are responsive, we will be continuing to develop our initial prototype with Couchbase. That means I'll be willing to help you refine your product's *_multi() variants. What is your current thinking in this area? Do you have an issue for it in your issue tracker?

Anon,
Andrew

P.S. Notwithstanding my offer above, it isn't my job to help design someone else's commercial product. I get paid to use your product well. As Couchbase has certain characteristics we find very attractive, I'm willing to review your SDK design. But my client would rather I pick technologies that truly are ready for commercial use. Couchbase v2.1 looks like it is ready. Is the Python client? Should I pick one of the other client SDKs?