Bucket Design to Optimize Queries

layerxfounder · January 26, 2015, 2:45pm

I was wondering is having multiple document types in a single bucket is still the desired bucket design when using N1QL. It seem that creating a different bucket per docType would be faster that keeping all the documents in one bucket. The documentation states KeySpaces are analogous to buckets.

If I had 10,000 documents which was set up like this:

BUCKET_PETS
5,000 docTypes dogs
2,500 docTypes cats
2,500 docTypes birds
Create an index on docType

select * from BUCKET_PETS where docType = ‘dogs’;

Is there a benefit to separate the different doctype in their own buckets?
So in this example one for dogs,cats, and birds.

Thanks much.
Keith

geraldss · January 26, 2015, 4:44pm

Hi Keith,

The tradeoff for using multiple buckets depends on a couple of things–

cardinality: do some types of pets much more / less frequently than others? For example, if some types of pets had 100,000 documents, and others had 1,000 documents, that difference would be significant when querying only the 1,000 pets.

-access patterns: do all queries cut across various types of pets, or are most queries type-aware? If you use multiple buckets, then queries that cut across types will need to use UNION and UNION ALL. If all your queries are type-aware, you probably want multiple buckets.

Finally, if you keep your documents in one bucket, you can index on type, and use type in your WHERE clause. This will help speed up your queries (see EXPLAIN).

Thanks.

daschl · January 26, 2015, 5:18pm

Also, @layerxfounder keep in mind that Couchbase also treats buckets not only as “namespaces”, but rather to allocate resources and authentication. Creating many buckets certainly adds some overhead, so I’d always start out from a single bucket and really only move on to more if it makes sense (both from a resource and application semantic perspective).

layerxfounder · January 26, 2015, 5:22pm

Thanks Gerald,

I know there are a lot of other factors with design. I do in fact have an index on docType. Some queries are fast and some not so much. I also understand this is a developer preview and there are optimizations in the works.

Thank you.
Keith

geraldss · January 26, 2015, 5:33pm

Which queries are slow? Can you use EXPLAIN to make sure they are using an index, and if they are still slow, please post the queries here (or send them to me privately). We certainly don’t want any queries to be slower than they should be, even in this developer preview.

Thanks,
Gerald

kris_smarty · January 30, 2015, 4:32am

It worked thanks!

One more quick question. I dont need the Default bucket. Is it fine to delete that? or it may cause some issue later on.

Sorry - this is on the wrong post.
Admin: pls delete/ignore.

Topic		Replies	Views
Should I be using multiple buckets? SQL++	1	1867	April 30, 2014
Application design, how many buckets is too much Couchbase Server n1ql	5	1178	April 1, 2019
One bucket with multiple document kinds vs one bucket per document type Couchbase Server	0	561	June 10, 2019
Buckets vs Tables vs Keyspace SQL++	2	8358	May 23, 2015
Best Approach for Large Bucket(s) Design Couchbase Server	2	527	March 15, 2021

Bucket Design to Optimize Queries

Related topics