Counters Idea - Feedback Requested
I'm currently using Membase for some use cases, and I'd like to ask for some feedback/ideas about using it for another use case that's coming up.
I want to store a bunch of realtime counters, so the store obviously has to handle a high volume of writes in addition to reads. I know that by itself is something Membase is good at.
I also want to store counters over time ranges (per-minute, hourly, daily, etc.), and am thinking about pushing the counter values periodically to a time-series database. These counters can also be specific to a user, so I won't know all of the keys for the counters stored in Membase unless I run some complicated queries to determine who has which counters. Even that wouldn't tell me which counters were actually updated since the last push.
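To make the time ranges concrete, here is a rough sketch of how a counter value could be keyed per time bucket before it goes to the time-series database. The granularity names, the key layout, and the `bucket_key` helper are all my own assumptions for illustration; none of this comes from Membase.

```python
from datetime import datetime, timezone

# Hypothetical granularities and timestamp formats (illustrative only).
BUCKET_FORMATS = {
    "minute": "%Y%m%d%H%M",
    "hour": "%Y%m%d%H",
    "day": "%Y%m%d",
}


def bucket_key(counter_key, ts, granularity):
    """Build a time-series key for a counter at the given granularity.

    ts is a Unix timestamp in seconds; buckets are computed in UTC.
    """
    stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime(
        BUCKET_FORMATS[granularity]
    )
    return f"{counter_key}:{granularity}:{stamp}"
```

For example, a per-user counter such as `user:42:clicks` would get one key per minute, one per hour, and one per day, all derived from the same timestamp.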
Is the TAP protocol the only real way to handle this use case with Membase? How exactly does one use the protocol? The documentation at http://www.couchbase.org/wiki/display/membase/TAP+Protocol is pretty light on explanation.
Looking specifically at jtap, does the "TAP stream," once you connect to it, just push to you all changes being made in Membase? Would the right thing to do be to:
1. make a custom Exporter that constantly streams in all of the updates
2. filter on just counters and write those values out to a file
3. have another process periodically read the values and push them to the time series database
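Roughly, I picture the three steps like the sketch below. It is plain Python with no TAP client in it: `on_mutation` stands in for whatever callback the stream would drive, the `counter:` key prefix is an assumed naming convention, and `sink` is a hypothetical stand-in for the time-series writer. It also keeps the latest value per key in memory instead of writing to a file, just to keep the example short.

```python
COUNTER_PREFIX = "counter:"  # assumed naming convention for counter keys


class CounterBuffer:
    """Holds the most recent value of every counter changed since the last drain."""

    def __init__(self):
        self._latest = {}

    def on_mutation(self, key, value):
        # Steps 1 and 2: see every mutation, keep only the counters.
        if not key.startswith(COUNTER_PREFIX):
            return
        # Assumes counter values arrive as decimal text (str or bytes).
        self._latest[key] = int(value)

    def drain(self):
        # Hand back everything that changed and start a fresh batch.
        changed, self._latest = self._latest, {}
        return changed


def flush(buffer, sink):
    """Step 3: push each changed counter to sink(key, value); return the count."""
    changed = buffer.drain()
    for key, value in sorted(changed.items()):
        sink(key, value)
    return len(changed)
```

Calling `flush` on a timer would push only the counters that were touched since the previous call, which is the part I can't get from querying Membase directly.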
Thanks for your feedback!
I want to add a few comments here about the TAP protocol and how to use it. First, TAP allows you to do many different things, but the two you are probably most interested in are the dump and backfill features. Doing a TAP dump will take a snapshot of the database and then stream all of the key-value pairs through the TAP stream. Doing a TAP backfill is like having a changes feed: you get a stream of all of the key-value mutations that take place on the server.
The other thing I wanted to add is that you shouldn't use jtap. I wrote that library almost a year ago and soon realized that it would be much better as part of the Java SDK (Spymemcached). As a result, I basically put all of the code from jtap into Spymemcached. The API is very similar, but it allows you to connect to all of the nodes in a Membase cluster automatically and also contains some bug fixes that make it much more resilient than the jtap library. I highly recommend using Spymemcached for TAP.