Does view indexing get slower the more documents in the bucket?
In CouchDB we have some views that apply to only some of the documents in the database (bucket). When new documents are added along with their own new view, the view has to consider every document in the bucket, even if the first line of the map function simply returns because the document is of the wrong type. As more documents are added to the database, initial view creation takes longer and longer, which makes the application look bad.
Does Couchbase 2 solve this problem in any way? I did consider putting all related documents into their own bucket, but I gather Couchbase prefers a few big buckets rather than many small ones, so that does not sound like the best idea, as we could end up with hundreds of thousands of buckets.
Thoughts?
Thanks
Ian
Hello,
For Couchbase we still need to process all documents and apply the map function against them to get a list of key-value pairs to insert into the index.
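To illustrate why the build cost follows the bucket size and not the number of matching documents, here is a minimal in-memory sketch. It is not the actual indexer: `buildIndex`, the sample documents, and the `type` field are hypothetical, and `emit` is passed in as a third argument only so the snippet runs standalone (in a real view it is a global provided by the view engine).

```javascript
// Hypothetical sample documents; the "type" field is an assumed convention.
const docs = [
  { id: "order::1", body: { type: "order", total: 40 } },
  { id: "user::1", body: { type: "user", name: "Ian" } },
  { id: "order::2", body: { type: "order", total: 15 } },
];

// Map function in the Couchbase style (doc, meta). It returns early for
// documents of the wrong type, but it is still invoked for each of them.
function ordersById(doc, meta, emit) {
  if (doc.type !== "order") {
    return;
  }
  emit(meta.id, doc.total);
}

// Toy index build: runs mapFn over every document and collects emitted rows.
function buildIndex(allDocs, mapFn) {
  const rows = [];
  let invocations = 0;
  for (const d of allDocs) {
    invocations += 1;
    const emit = (key, value) => rows.push({ id: d.id, key, value });
    mapFn(d.body, { id: d.id }, emit);
  }
  // Keep rows ordered by key, as a view index would.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return { rows, invocations };
}

const result = buildIndex(docs, ordersById);
console.log(result.invocations, result.rows.length);
```

The early return keeps the index small, but the map function is still called once per document, so the number of invocations grows with the whole bucket.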
However, we have several optimizations that Apache CouchDB doesn't have, from a more efficient index file format to an in-process V8-based MapReduce engine, amongst many others.
Our latest nightly builds (the latest is build 1495 at the moment) include a new approach for the initial index build. Its advantages are:
1) Orders of magnitude faster (e.g., for a dataset that took about 7.5 hours with previous builds, it takes less than 30 minutes with the latest build);
2) Uses much less disk space (orders of magnitude less as well);
3) When it finishes, the index file has 0% fragmentation, meaning it doesn't need to be compacted right away (unlike the old approach, which is the same as Apache CouchDB's).
The only "downside" with this approach is that it's not incremental as before. This means that if you query the index during the initial build, you'll always get an empty result set - with the old approach you get partial results, which are closer to the full result set as time passes. To get results, you'll have to wait for the initial index build to finish.
If you could try our latest 2.0 nightly build (1495) and compare the time it takes to build your index against Apache CouchDB, the feedback would be very welcome and much appreciated (for a fair comparison, use Couchbase with a single node, of course).
You can get the latest nightly build from here: http://www.couchbase.com/downloads-all
(only available for GNU/Linux and Mac OS X however)
Thanks