I was wondering is having multiple document types in a single bucket is still the desired bucket design when using N1QL. It seem that creating a different bucket per docType would be faster that keeping all the documents in one bucket. The documentation states KeySpaces are analogous to buckets.
If I had 10,000 documents which was set up like this:
5,000 docTypes dogs
2,500 docTypes cats
2,500 docTypes birds
Create an index on docType
select * from BUCKET_PETS where docType = ‘dogs’;
Is there a benefit to separate the different doctype in their own buckets?
So in this example one for dogs,cats, and birds.
The tradeoff for using multiple buckets depends on a couple of things–
cardinality: do some types of pets much more / less frequently than others? For example, if some types of pets had 100,000 documents, and others had 1,000 documents, that difference would be significant when querying only the 1,000 pets.
-access patterns: do all queries cut across various types of pets, or are most queries type-aware? If you use multiple buckets, then queries that cut across types will need to use UNION and UNION ALL. If all your queries are type-aware, you probably want multiple buckets.
Finally, if you keep your documents in one bucket, you can index on type, and use type in your WHERE clause. This will help speed up your queries (see EXPLAIN).
Also, @layerxfounder keep in mind that Couchbase also treats buckets not only as “namespaces”, but rather to allocate resources and authentication. Creating many buckets certainly adds some overhead, so I’d always start out from a single bucket and really only move on to more if it makes sense (both from a resource and application semantic perspective).
I know there are a lot of other factors with design. I do in fact have an index on docType. Some queries are fast and some not so much. I also understand this is a developer preview and there are optimizations in the works.
Which queries are slow? Can you use EXPLAIN to make sure they are using an index, and if they are still slow, please post the queries here (or send them to me privately). We certainly don’t want any queries to be slower than they should be, even in this developer preview.