Skip to end of metadata
Go to start of metadata
You are viewing an old version of this page. View the current version. Compare with Current  |   View Page History

Project Left Ranger

Introduction

Project/Component Working Name:

Left Ranger Matches Again
For pop culture reference, see: {+}https://www.youtube.com/watch?v=jqg2bY2VoCo+
Since this proposal "matches again" any incoming changes, and is expected to be used with left-handled ranges, the riff on 'Lone Ranger Rides Again' seemed appropriate, if a bit dated.

Name of Document Author/Supplier:

Matt Ingenthron <matt@couchbase.com>

Date of this Document:

11 March 2014

Proposal Status:

DRAFT

Name of Major Document Customer(s)/Consumer(s):

Developers using Couchbase, SDK Development Team.

Communications:

couchbase@googlegroups.com

Project Summary:

This project will extend the system to allow for range operations. It will specifically add support for range queries which will also consider incoming changes that match that same range while the range query is being executed.

Project Description:

Prior to the creation of Couchbase, we have long had developer requests for range queries over keys. Many extensions of memcached, both those directly extended like memcachedb and those which are logical extensions, like redis, support various different ways of doing range queries. This project will bring range operations to Couchbase.
Support for range operations would be added in a recommendation similar to the RangeOps document on the memcached wiki. This was originally authored with the same binary protocol in mind that has been successfully used in Couchbase Server and is part of our existing server and SDKs.
This project would add a subset of the operations described on the RangeOps page: get, getq, delete, deleteq. It also proposes a similar approach for Couchbase extended operations touch and touchq. It also proposes a new statistic, whose interface should be considered volatile, to be used in conjunction with these range operations to get statistics for a given range based on what is on disk. As described there, these RangeOps would use EXTRAS space in the memcached binary protocol.
Unlike the recommendations on the RangeOps document, this project SHOULD NOT implement the ASCII protocol equivalents. Couchbase's direct protocol interface is binary only.

Risks and Assumptions:

This project assumes that the extensions would only apply to ep-engine's implementation of these protocol operations. The memcached "default" engine would not be extended to support these new command.
The project also assumes that initial support would be needed for the Couchbase C SDK (a.k.a. libcouchbase). Support for other SDKs would be added as time allows in those SDKs development lifecycle.
As mentioned above, it's also assumed that there will be no changes made to moxi and the commands will not be available over the ASCII protocol.

Business Summary

Problem Area:

Couchbase has grown into more significant deployments with much larger sets of data. Further, time in the field has shown that frequently to hit high performance needs on the part of our end users, we need to consider modeling the application on a simpler key-value approach rather than using more advanced features such as Couchbase Views.
One area is a need to be able to work with a set of related data. Users may currently model these applications by having a specific key format which has some sort of prefix or delimiter. Couchbase has encouraged this approach with the development of many applications. However, these are of limited use if there is no mechanism by which all given documents/items prefixed by a given key can be retrieved, updated or deleted.
Current approaches call for the creation of a Couchbase View to have an index. While this is close to meeting the functional requirements, it has a few problems.
First, building views is very expensive. Even for a simple view which would be ordered by the key/id alone, the view requires the execution of a javascript function for every mutation persisted and a set of separate files for the views. It must also do this for replicas in order to properly be able to support this after a rebalance.
Second, views may not have the expected sort order. Generally, developers expect this to be sorted in value order. Existing views are unicode collated and may require padding of portions of the first argument to the emit to get the desired behavior.
Third, views are very late with respect to when changes are made and have no provision for handling changes that may be in memory but are not yet persisted.
By providing for range operations which can work with concurrent modifications and not yet persisted data, we can address this area of common demand for features.

Market/Requester:

There is a quite diverse market for this.

How will you know when you are done?

The functional requirements will be implemented and tested by both Couchbase Server and libcouchbase. Regular feature testing from libcouchbase will demonstrate feature completeness.

Technical Description

Details:

This project touches a few different components and internal interfaces between those components. The general outline for how it would be done was outlined in a set of meetings on February 18 and 19, 2014 lead by Matt Ingenthron. Attendees included Chiyoung Seo, Aaron Miller, Cihan Biyikoglu, Mark Azad, Alex Ma and Kirk Kirkconnnell.
The components in play are couchstore, ep-engine and the client libraries. Most of the work would be between components and on private interfaces. Looking at this from a bottom up perspective, the project calls for...
Updating the couchstore project to allow for a way to query the ID b-tree and a way to query disk level statistics reduced on the interior nodes of that b-tree. This would allow other components, most notably ep-engine, to query the files and get aggregate statistics. Currently, couchstore has three fields, "deletedfalsecount", "deletedtruecount", "disksize" on the ID b-tree. These would all be queryable by range. Note that these statistics exist primarily for determining file fragmentation and disk usage for stored, compressed documents which makes them suitable for making approximations about a set of items in a range but does not give exact information on the range.
Updating ep-engine to extend existing opcodes:

  • get, getq: extend to allow for embedding range requests in the EXTRAS field and extended to return additional metadata. This may be done with a different opcode since the metadata being returned is different from a traditional get.
  • delete, deleteq: extend to allow for embedding range requests in the EXTRAS field.
  • touch, touchq: extend to allow for embedding range requests in the EXTRAS field.
  • stats: create a new stat which takes the string argument "range_disk_usage" and a range in the EXTRAS field.

For get operations, ep-engine would also use existing mechanisms to include incoming operations in the result of the query and the query would not be considered complete until both the disk query and any changes in memory have been considered. Then, as suggested on the RangeOps draft, an empty response would be sent with an opaque that is associated with the request. One important extension to the RangeOps draft is that it must be possible to specify a single vbucket or a set of vbuckets. Also important, there will be a flag to indicate the range query would continue to run until terminated, or automatically terminate. Note that owing to MB-10291 "cbmcd connections cannot be efficiently used since operations are never interleaved" these features will probably best be used from dedicated connections
.
It would be a goal in this project to add additional monitoring statistics for any range operations being executed, since they are potentially long running operations with memory management and disk IO considerations. The exact mechanism for exposing these statistics will not be defined here, but at a minimum it should include the approximate number of items matching the range and the number of items visited over the execution of the query.
The libcouchbase client library would also be updated to support the range queries specified. Also, libcouchbase would need to expose a new broadcast operation on top of stats which would be sent to all servers with ranges embedded in the EXTRAS for the stats request.

Issues/Issues to be Opened:

TBD.

Other Solutions Previously Considered:

Use of views, as discussed above.

In Scope:

Changes to ep-engine, couchstore and libcouchbase. Documentation of new extensions and all feature testing.

Out of Scope:

Changes to moxi to expose ASCII versions of these new operations.
Any changes to existing tools for backup, transfer, etc.

Interfaces:

A new set of operations at the Couchbase memcached protocol level:

  • get, getq
  • delete, deleteq
  • touch, touchq
  • a new stat for "range_disk_usage"

Doc Impact:

Additional documentation for libcouchbase about how to use the new API.
New documentation on the new statistic.
Additional documentation on the Couchbase specific extensions to the memcached protocol. As of this writing, it's not clear where those are documented.
Documentation enhancements to the dev guide to describe how these new operations may be used in various application development patterns.

Admin/Config Impact:

No Administration impact.
For configuration, there may be some sizing impact. Note that the new range operations may use a fair amount of disk IO as the ID b-tree is walked and then the individual items are fetched or background fetched. That said, this is not expected to be any worse than existing workload generated by querying views with the logical all_docs configuration.

Packaging and Delivery Impact:

New dot-minor (or greater) versions of all components mentioned would be required. Given that no existing interfaces are expected to be removed, there is no additional impact over a regular dot-minor upgrade.

Security Impact:

None.

Dependencies:

None outside the mentioned components. This spans many components on both the client and the server.

Reference Documents:

The memcached RangeOps documentation on the memcached wiki.
This should link to private commands in memcached and public commands in memcached, but it's not clear where those are documented outside the code. Need assistance from ep-engine/memcached teams.

Resources and Schedule

Projected Availability:

This project would be delivered in Couchbase Server 3.0.

Cost of Effort:

(unknown)

Prototype Availability

Prototype Availability:

It'd be expected that builds in leading to 3.0 or separate builds would deliver the first prototype, likely with reference implementation and documentation being lead by the libcouchbase project.

Prototype Cost:

Since this is a committed change, cost has not been estimated.

Open Questions

  1. Do we need additional statistics on a per range-query basis to be able to monitor?
  2. What statistics should be added? The draft currently includes stats about approximate number of items and number of items visited.
  3. Should it be a new CMD_GET opcode?
  4. Open question: what is the input to the range? A pair of bytestrings and the inclusion flags seem reasonable.
  5. Are we comfortable with the "and keep it open" behavior of the range query response?

Changelog

Initial draft posted on 2014-02-24.
Updated on 2014-02-25. Indicate metadata must be returned with range gets. Added list of Open Questions. Moved question on the range to the open questions. Added options on statistics.
Minor updates on 2014-03-11. Posted on community wiki.

Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.