Couchbase
  • Why NoSQL?
  • Couchbase Server
  • Download
  • Resources
  • Careers
Home | Forums | Couchbase | Couchbase Single Server 1.x

Membase and CouchDB Buckets

17 replies [Last post]
  • Login or register to post comments
Fri, 06/03/2011 - 02:54
adamdavies
Offline
Joined: 06/02/2011
Groups: None

From what I’ve been reading membase stores data in vbuckets and these are/can be spread over a number of servers. So each server has a sub-set of the data. However, CouchDB stores everything in a single database which is not spread over any servers (? Is this right ?) which means the ‘view’ mechanism of CouchDB is performed on a server by server bases. Is it proposed that each couchbase instance in a cluster (from the membase world) will have its own CouchDB instance for storing its own partition (vbucket) of the data set? Does this allow for distributed views and distributed map/reduce functions etc?

Top
  • Login or register to post comments
Fri, 06/03/2011 - 03:31
jan
Offline
Joined: 02/15/2011
Groups: None

Hi Adam,

in Membase, a vBucket is is backed by a number of distributed buckets behind the scenes. Persistence on these buckets (that like you say hold a subset of your data) is done in SQLite. We are now replacing SQLite with CouchDB, so you can do querying and other things on top of Membase (now Couchbase). CouchDB's Map Reduce is designed to be distributed and we'll be adding a component to Membase that will do the distribution for you. Your application will only see a single "database" or "vBucket" where behind the scenes it is many. All the distribution work is done for you transparently.

Cheers
Jan
--
Edit: Jan is basically correct here, except that "buckets" and "vBuckets" are reversed. A Membase (and soon Couchbase) administrator will create "buckets" that are then automatically broken up into 1024 "vbuckets" which are assigned to individual nodes of a cluster. This is really independent of whether we are using sqlite or CouchDB and you can read more about the concept here: http://www.couchbase.org/wiki/display/membase/vBuckets.

Edit result: I'm going to switch 'buckets' with 'vbuckets' in the right places so it's not confusing to future readers of these posts.

Top
  • Login or register to post comments
Fri, 06/03/2011 - 07:22
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Thank Jan for you reply.

The buckets hold indexes. I assume these indexes point to data (as in relational B Tree indexes), and that data is held on the ‘vbuckets’ (CouchDB). So is it the case that there can be any number of vbuckets for each Bucket? For example, so you can have a single bucket holding loads of indexes which may point to data stored in any number of vbuckets.

Also can you have a number of vbuckets all of which are referenced by a single bucket? So is the vBucket and backing bucket can be represented by a One-To-Many relationship or a Many-To-One relationship.

Also, if you take a single Couchbase bucket, which is backed by number of vbuckets, can those vbuckets be distributed across several servers? (I think from your replay that it’s a YES)? If so can you also replicate these distributed vbuckets to other distributed vbuckets? (e.g.: http://dustin.github.com/2010/06/29/memcached-vbuckets.html)?

Thanks
Adam

Top
  • Login or register to post comments
Fri, 06/03/2011 - 07:32
jan
Offline
Joined: 02/15/2011
Groups: None

Hey Adam,

this is where I gotta get a little handwavey because I don't know Membase that well yet.

The buckets are just a lookup table to where the actual data is (vbuckets (the naming is unfortunate here)). CouchDB internally uses a B Tree, but that has nothing to do with anything "relational" :)

Buckets and vbuckets have a 1:N relationship (iirc, N is 1024 by default). On top of that, any of N vbuckets can also hold a copy of another vbucket's data, for failover reasons. Each vbucket has a 1:1 relationship with CouchDB storage nodes. You will not get direct access to these storage nodes (at least initially) because you may accidentally mess up the cluster state. A vbucket and it's storage node are always on the same machine. All (1024) vbuckets and their storage nodes can live on multiple machines and comprise a single bucket.

I hope this makes sense.

Cheers
Jan
--

Top
  • Login or register to post comments
Fri, 06/03/2011 - 08:44
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Thanks Jan, just wanna repeat back a simple scenario of what I understand just to make sure I’ve got it right:
So given 4 servers/commodity PCs, you can have a configuration where a Bucket index node is sat on one machine, on the other 3 machines you can have CouchDB/Storage Nodes (the vbuckets) distributed (an individual vbucket is a single atomic unit of CouchDB/Storage Node and can not be distributed). All together the 4 machines (1 as Bucket indexer and the other 3 as data storage) represent a single bucket.

Following a client request, the ID for the item they want is hashed, that identifies our bucket as the location of the data, the client sends the request and it is picked up at the machine holding the indexing Bucket which identifies the machine holding the vbucket and forwards the request to get the data. Job done.

Latter, our dataset is getting to big for the 3 data machines so we add another machine and now we have 4 machines holding StorageNodes and we maintain our single machine for holding the index Bucket (so we’ve scaled out at the data level).

Latter still, our index Bucket machine is starting to feel the strain, so now we add another machine and the indexing Bucket is split between these machines (are these 6 machines still a single Bucket? If so does the two index Buckets have a parent root bucket/manager to act as a single entry point?)

Thanks again Jan, if what I am thinking is correct then this is going to be one hell of a system.

Edit: As I will explain in a later post, there is no "indexing server". Each server is only responsible for (and only know about) the vbuckets that it holds. This prevents us from having any single point of failure or bottleneck. The client driver understands the vbucket maps and hashes it's keys directly into a vbucket which is assigned to a server. This way, the clients themselves do the load-balancing and we don't have to worry about any "special" servers. And yes, that makes it one hell of a system ;-)

Top
  • Login or register to post comments
Fri, 06/03/2011 - 15:40
Frank
Offline
Joined: 06/28/2010
Groups: None

I know we suck at naming; sadly the whole bucket/vbucket confusion is very common, and indeed it gets even more complex with CouchDB and views in the picture.

A Bucket is what you'd call a Database in CouchDB. It is a logical sandbox or namespace. I.e. you can have multiple buckets that contain documents with the same keys but different values.

For our purposes we can entirely ignore multiple buckets, buckets are entirely isolated, so for this purpose let's say there is a single Bucket on the system.

Any bucket is sharded into 1024 active vBuckets. A specific key will always hash to the same vBucket (this happens in your client). There then is a vbucket Map which tells smart clients or Moxi on which node a specific vbucket resides at this point in time. At any point there can only be one node on which a specific vBucket is active on at any time. The map gets updated when new nodes are added/removed or a node is failed over (and in those cases it is always carefully managed that a vbucket can't be active on two nodes at the same time).

With the addition of CouchDB, each vBucket will be stored in a seperate CouchDB database. (well, they will be prefixed by the bucketname, as there is for example a vbucket 42 for each bucket of course). There are also 0..3 replicas/copies of vbuckets, but they are basically doing exactly the same, they just don't accept regular reads or writes, so we can ignore them for this explanation.

These are standard CouchDB databases, so they don't know that they are part of a greater scheme :)

Therefore each vBucket Database will deal with views just like it does now. There will be a view for the data contained in that database.

What we do in order to give you the illusion that there is only a single database your view is running on, is to do scatter gather across the vBuckets.

So each vBucket provides answers for the data it contains and then that is aggregated to provide the overall answer.

So no indexing vbuckets or anything, just lots of vBucket DBs that deal with data stored inside them like they do now, each of them doing their own indexing.

So in your 4 node case each node will hold 256 vBucket Databases. A data request gets hashed to one vbucket and then the client or moxi look at the map to identify which node in the cluster hlds this vbucket and will request the data directly from that node then. Data is typically in the memory caching layer, so most likely couchDB won't actually get involved to serve the data.

If a ndoe is added a subset of vBuckets held on each node are now moved to the new node and once the data has arrived there, that node will be marked as actve for those vBucets in the map and new requests will be resolved from there.

As each vBucket has it's own view generation ,the indexing work is now also spread over 4 machines instead of 3.

As distribution of data to vBucket is statistically balanced, approximately the work done on each node is the same, there shouldn't be specific nodes that do a lot more indexing work than others.

Hoep that helps!

Top
  • Login or register to post comments
Sat, 06/04/2011 - 12:34
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Frank, this last reply has left me a bit confused. I understand the first part of your reply. I think what I need is an example, or maybe I should ask a different question.

In data storage we have two layers; the caching layer which holds indexes and read-often-data in RAM memory for quick access (this is the space memcache fits into). And a persistence layer which writes to non-volatile storage.

From what I understand about CouchDB it is probable that you generally create many more indexes than a SQL database for the purposes of access data very quickly.

Bringing couchDB and memcache together gives the opportunity of scaling out memory storage and disk storage separately. So if you have a small dataset but want loads of indexing done in a read-often environment then you can have a small hard drive and lots of memory. Alternatively, if you have loads of data, but not many indexes in a write-often environment then you can have large disk arrays but small memory. To deal with bottlenecks you can scale and reduce this two layers independently of each other, and even deploy them on separate physical hardware.

I've read almost everything on both the membase/couchbase sites and the CouchDB site including the Definitive Guide. So have a reasonable grasp of what is happening, but what I dont understand is the architecture of the couchDB + membase (elastic membase) offering. In particular the various deployment strategies (what machines can do what).

If it is the case that a vBucket is a self contained unit compromising the indexes and the data then this is a limitation for the situation where you have 1023 data sets all of say 10K and then one of 2mb, and its the 2mb dataset that is read all the time (Is this why there is a 2Mbyte limit for each storage item?).

I'm really excited by what you are working on and think it's great and a real game changer, but what I'm not sure about is how this technology is deployed under various circumstances, its architecture, and where the limitations are.

BTW: are there always 1024 vbuckets, even if you only store one document? And why the 1024 limit?

Top
  • Login or register to post comments
Mon, 06/06/2011 - 00:26
perry
Offline
Joined: 10/11/2010
Groups:

Adam, at a high level I think you're correct. One of the things that Membase (and now Couchbase) set out to do was to remove the complexity of managing both a caching layer and a data storage layer. Typically with memcached+MySQL, this is how the Web has grown up. At Couchbase, we've unified these two layers to give you a single, scalabale, highly-performant data storage layer that has the high-speed caching layer built right in. Now you don't have to scale them separately, just scale them as one unit.

Within the system, there are no "special" nodes. Each node is the same, and it is responsible for and handles only the data that it needs to. As we've been discussing, this is done through the use of vbuckets. A given dataset (bucket, database, whatever you want to call it) is split up into 1024 vbuckets. These vbuckets are then evenly distributed throughout all the nodes of a cluster. (i.e, 1 node will have all 1024 assigned to it, 2 nodes will have 512 each and so on.) Those vbuckets are also replicated as many times as necessary and the replicas are spread evenly throughout the cluster as well...making sure that no replica vbucket is on the same node as it's primary (or "active") vbucket.

When we want to add (or remove) servers from a cluster, it's a simple matter of moving a few vbuckets from their current location to the new location (either onto the new server that's being added or redistributing them throughout the cluster if removing a server). The same thing happens for replica vbuckets.

All of this happens in the currently existing Membase server.

Under the covers a bit, each vbucket has a corresponding disk persistence layer. In Membase this is sqlite, in Couchbase 2.0, it will be CouchDB. When moving vbuckets around, both the memory information and the disk information is moved.

That's the high level stuff, now to answer your more specific questions:
-Yes, it's theoretically possible that a single vbucket could be "overloaded" with requests. In practice, however, the even distribution of keys throughout vbuckets makes this very unlikely. The user does not have the ability to make the mistake of putting a hotly requested dataset all in one vbucket, the software will take care of distributing that across multiple. If you have one very hotly requested key, no auto-sharding system in the world will help ;-) In this case, it is better for your application to break that key up into multiple keys which will then distribute it across multiple vbuckets (and therefore multiple servers).

Memcached has a 1mb object size limit, Membase (and the upcoming Couchbase) has raised that to 20mb. However, allow me to caution against using such large object sizes. One of the major benefits of this system is the extremely reduced lookup time of finding an object within the database. If you spend a significant amount of time transferring the data over the network, you lose some of the advantage of the memory layer.

-Yes, there are always 1024 vbuckets, even if you only store one document. vbuckets have nothing to do with the amount of or actual data being stored in them. You can really think of vbuckets as "pre-sharding" a database so that you don't have to take any downtime to do it afterwards (as with a traditional RDBMS). 1024 was chosen as something we could reliably test while still being large enough to support a very large cluster (technically about 1024 servers). We have some plans in the future to move this to 65k, but it's definitely not necessary at the moment.

I hope that starts to clear things up for you, I've edited a few of the posts above to hopefully make the reading a bit easier for new people coming in.

Please let me know if you have any other questions or followups.

While Couchbase 2.0 is still on the horizon, the main concepts and capabilities of the system have been well proven by running in production at sites like Zynga, AOL Advertising and a whole slew of others.

Like you, we are quite excited to explore all the new things that CouchDB brings to our world (and that Membase brings to the CouchDB world).

Perry Krug
Solutions Architect

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Top
  • Login or register to post comments
Mon, 06/06/2011 - 01:56
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Thanks, that's great and has cleared things up a lot.

The key to success (from a user/my perspective), seems to be to keep the data in small chunks. Thanks also for the pre-sharding concept. That will help in defining our usage semantics/data structures etc.

Again thanks
Adam

Top
  • Login or register to post comments
Mon, 06/06/2011 - 01:56
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Thanks, that's great and has cleared things up a lot.

The key to success (from a user/my perspective), seems to be to keep the data in small chunks. Thanks also for the pre-sharding concept. That will help in defining our usage semantics/data structures etc.

Again thanks
Adam

Top
  • Login or register to post comments
Mon, 06/06/2011 - 12:57
perry
Offline
Joined: 10/11/2010
Groups:

Adam, I'm not really sure how the pre-sharding is related to semantics/data structures.

In general, an application should be accessing Membase/Couchbase as a unified data store. The underlying mechanics of how we scale are transparent to this.

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Top
  • Login or register to post comments
Tue, 06/07/2011 - 05:54
adamdavies
Offline
Joined: 06/02/2011
Groups: None

Sure, but knowing that the underlying database is structured into 1024 separate databases which are ballanced across the number of servers means that its better to have lots of small data units rather than a few large ones.

Top
  • Login or register to post comments
Thu, 06/09/2011 - 15:09
perry
Offline
Joined: 10/11/2010
Groups:

Makes sense.

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Top
  • Login or register to post comments
Mon, 06/13/2011 - 17:57
teslan
Offline
Joined: 06/13/2011
Groups: None

Being a noob (:in the Web 0.2 camp :) let me see if I have at all followed at least the meaning of these buckets and vbuckets.

bucket - means the same thing as "database"

vbucket - clever way of chopping up a database into 1024 "pieces", so that later on, those pieces could be easier moved around and re-arranged onto different machines (nodes).

The main point that seems to be here is that we (even the web 0.2 developers) write applications with one single database in mind (at the bucket level). CouchBase will in some fashion (round robin?) determine into which one of the vbuckets to place our next newly created document and most importantly, it will not mess things up by making sure that a single key (a single "_id", in CouchDB terms?) will never be stored within more than one single vbucket.

Now, lets try an example ...

If we have 10 databases, it would seem that we would have 10 "buckets" and that each of them would have 1024 "vbuckets" for a total of 10,240 vbuckets for the 10 databases. With that in mind, we have 10,240 "pieces" (not just 1024) that can then be (in any combination) moved from a single machine (node) and re-arranged onto serveral machines (nodes). If so, then is this where the DB amdin determines what goes where or is this fully automated?

If my example makes any sense then I have followed at least the surface of the discussion. If it does not make any sense then feel free not to waste your time going for a noob explanation ;)

Regards,
teslan

Edit: Having just re-read what I wrote, does existence of buckets and vbuckets either totally take away or at least greatly reduce the need for having multiple databases or would we (as app devs) still approach the things the old way by creating multiple databases, if for no other reason then to have things "more logically" organized and/or easier being able to move to a more "native" CouchDB that is without buckets and vbuckets ?

__________________

We are all prisoners of our own experiences.

Top
  • Login or register to post comments
Mon, 06/13/2011 - 18:31
Frank
Offline
Joined: 06/28/2010
Groups: None

Teslan,

yup, I think you've got it!

Luckily you don't have to worry about re-arranging the pieces (shards), that's what the system does. Basically they are equally distributed and every time you add or remove a node from the cluster the minimal set of these pieces gets move to the new node from all other nodes (or fron the node that will be removed back to the other nodes in the cluster).

Creating Buckets to seperate namespaces can still make sense. HOwever, you also assign a fixed amount of your cluster RAM to each bucket as a Cache. So from that side combining items into one bucket and just using key prefixes will let you share that cache. Seperate buckets allow you to prevent data for one "database" crowding out data from another one out of the cache RAM and of course for views it can make things easier.

However, in general i'd agree that you probably find things easier with fewer buckets (and rather using a doctype to address different types of docs in a view).

Hope you can make it to CouchConf, we'll have a lot more detail on the combined product there, including a demo!

Cheers,

Frank

Top
  • Login or register to post comments
Mon, 06/13/2011 - 21:40
teslan
Offline
Joined: 06/13/2011
Groups: None

Thanks Frank,

Now lets consider just about the only approach that is too often suggested (often as a silver bullet) when it comes to private data - where each and every user ends up with their own private DB. So with say 10k users we end up with 10k buckets and a whopping 10 million vbuckets. While number of files is often mentioned as one of constraining (resource related) factors, would it not be all that much worse with vBuckets in the play? Perhaps MemBase brings a real security solution to the table or is it just as loose minded about data privacy as CouchDB at least appears to be (: at least to us mindless noobs ;?)

Thanks again,
teslan

__________________

We are all prisoners of our own experiences.

Top
  • Login or register to post comments
Mon, 06/13/2011 - 22:03
Frank
Offline
Joined: 06/28/2010
Groups: None

I'll let some of our CouchDB experts tell you more about best practices to deal with DB security on CouchDB.

If you have 10k's of individual databases, my guess is each one is rather small (probably 10's of MB or so). The combined Couchbase Server product line (i.e. Membase plus CouchDB) is all about introducing clustering to distribute a database and scale it out. If each DB is small, then there really is no need for scaling out per db :) So no need for the clustered product if what you want to do is about having lots of small DBs. In that case just use our distribution of CouchDB (i.e. the part we use together with Membase ro provide the combined product). That will always stay fully compatible with Apache CouchDB and of course we are contributing our changes back to ASF. (Actually ,keep an eye out for some interesting update on what our team has been doing on that front on Wednesday...).

Of course if you use a single DB and just have owners as document properties, you can restrict DB access to your trusted application servers (plus there is per bucket authentication of course) and then enforce your user security at the application level. That is what big Membase deployments are doing (and of course most traditional DBs). I.e. no user never has access directly to the DB, everything goes via your app.
Clearly that loses you some of the neat CouchDB patterns of everyone having their "own" db, but that would be another way of solving the problem so that you could leverage the clustering and elasticity of the combined product more effectively (i.e. you'd get dynamic adding of nodes, failover of downed nodes and the memory caching for sub ms access latency to your cached working set).

So it really depends on what your key requirements are to decide whether you want the clustered solution or stick with the (potentially lots of) single node solution.

Top
  • Login or register to post comments
Mon, 06/13/2011 - 22:14
teslan
Offline
Joined: 06/13/2011
Groups: None

Besides answering my specific questions, this also answers some lingering questions and curiosity and is therefore worth repeating - from the overall community point of view ... so, I will do just that ;)

"In that case just use our distribution of CouchDB (i.e. the part we use together with Membase ro provide the combined product). That will always stay fully compatible with Apache CouchDB and of course we are contributing our changes back to ASF. " -Frank

Thanks again.

__________________

We are all prisoners of our own experiences.

Top
  • Login or register to post comments
  • Login or register to post comments
  • Login
  • Register

Company

  • About Us
  • Leadership
  • Customers
  • Partners
  • Contact Us

Product

  • Couchbase Server
  • Couchbase SDKs
  • Use Cases
  • Documentation
  • Forums

Open Source

  • Couchbase Project
  • Couchbase vs. CouchDB

Commercial

  • Subscriptions & Support
  • Training & Services

News

  • Blog
  • Newsletter
  • Press Releases
  • Buzz

Follow Us

    
  • Customer Login
  • Terms of Service
  • Privacy Policy
  • Trademark Policy
  • Site Map

© 2013 COUCHBASE All rights reserved.

Sign in to Couchbase Community

close
  • Create new account
  • Request new password
You are logging into the Forums, Wiki and Issue Tracker