[TAP Protocol] How does the backfill option work for streaming future changes only?


We are trying to use the TAP protocol starting with the tap_example.py and are not able to stream future changes only.

According to the wiki page (http://www.couchbase.com/wiki/display/couchbase/TAP+Protocol), specifying a backfill of -1 as an additional TAP option should do the trick. However, when we modify /opt/couchbase/lib/python/tap_example.py and give opts the following value, we still get all the keys that already exist in the bucket, not only the ones we “mutate” after the TAP stream has started:

opts = {memcacheConstants.TAP_FLAG_BACKFILL: 0xFFFFFFFFFFFFFFFF}
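For reference, 0xFFFFFFFFFFFFFFFF is simply -1 written as an unsigned 64-bit integer, so the value itself should match what the wiki describes. Here is a minimal sketch to check this, assuming the backfill field is sent as a big-endian 64-bit integer (encode_backfill is a hypothetical helper, not part of tap.py):

```python
import struct

# -1 expressed as an unsigned 64-bit value, as used in opts above.
BACKFILL_FUTURE_ONLY = 0xFFFFFFFFFFFFFFFF


def encode_backfill(value):
    """Pack a backfill value as 8 big-endian bytes.

    Assumption: the TAP backfill option is a 64-bit big-endian integer.
    Masking lets callers pass either -1 or the unsigned equivalent.
    """
    return struct.pack(">Q", value & 0xFFFFFFFFFFFFFFFF)


if __name__ == "__main__":
    # Both spellings produce the same 8 bytes on the wire.
    print(encode_backfill(-1) == encode_backfill(BACKFILL_FUTURE_ONLY))
```

If both spellings encode identically, the remaining question is how the server interprets the flag, not how the client writes it.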

Here is the command we use:

/opt/couchbase/lib/python/tap_example.py -u bucketName couchServer:11210

Are we missing something, or has the TAP protocol changed?
Which backfill option should we use?
(We would rather not use the Java SDK, so please don’t tell us to use it.)

Moreover, something seems to have changed in the TAP protocol: the server now sends some “empty” data with an extralen of 0 from time to time (roughly every minute?), which makes processCommand in tap.py crash when it happens:

error: uncaptured python exception, closing channel <tap.TapConnection connected couchServer:11210 at 0x7f46d830bcb0> (<class ‘struct.error’>:unpack requires a string argument of length 16 [/usr/lib64/python2.6/asyncore.py|read|78] [/usr/lib64/python2.6/asyncore.py|handle_read_event|428] [/opt/couchbase/lib/python/mc_bin_server.py|handle_read|336] [/opt/couchbase/lib/python/tap.py|processCommand|98])

extralen: 0
data is empty

As a workaround, a try/except does the trick, but could you document this behaviour (the empty-data “polling”) and/or adapt tap.py accordingly?
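An alternative to a blanket try/except is to check the length before unpacking, so that real decoding errors are not swallowed. This is only a sketch: unpack_extras is a hypothetical helper, and the ">QQ" format is a placeholder that merely happens to be 16 bytes long; the actual format used in tap.py may differ.

```python
import struct


def unpack_extras(fmt, data, extralen):
    """Unpack the extras section of a TAP message, or return None.

    Returns None when the message carries no (or too few) extras,
    such as the periodic "empty" messages with extralen 0.
    """
    expected = struct.calcsize(fmt)
    if extralen != expected or len(data) < expected:
        return None
    return struct.unpack(fmt, data[:expected])


if __name__ == "__main__":
    # Placeholder 16-byte format; not the real TAP extras layout.
    fmt = ">QQ"
    print(unpack_extras(fmt, b"", 0))             # empty message is skipped
    print(unpack_extras(fmt, b"\x00" * 16, 16))   # full extras are unpacked
```

The caller can then ignore messages for which the helper returns None instead of letting struct.error close the channel.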

One last question: it seems it is now possible to receive all changes from Couchbase using the REST API of XDCR. Is there any documentation about those APIs?

Ref: http://www.couchbase.com/communities/q-and-a/feature-events-couchbase




The XDCR API is not a documented feature. You can use the CAPI Server that we used in the Elasticsearch plugin:

It is not documented because it is not an official API: we may change it in the future, so if you use it you need to be careful and make sure you monitor the changes there.

For the TAP question I am investigating.


For XDCR, you can look at the Couchbase API adapter, which simulates a remote cluster; you can use its output to stream to something else.

This is used by the Elasticsearch plug-in as well.

Note that both of these approaches (using TAP or the XDCR stream) are not officially supported for external use because the protocol could change in any release. In fact, we are in the process of updating the TAP protocol and in the future plan to have a changes feed that could be directly consumed.

‘plan to have a changes feed that could be directly consumed’ -> Great news !!


‘For the TAP question I am investigating’ -> Thanks