Couchbase Server 2.0: General instability when views change on server
A quick rundown of our setup:
- We have only one server running, with no replicas. It has 16 GB of RAM.
- We set up an Elasticsearch server on the same box and replicate all the data into ES
- That server has 800,000 documents, all in one bucket (although the docs are of different types), mostly around 2-4 KB each
- We have a small amount of user generated content (~1000 documents) and a large amount of auto-generated content. Each document is typed within the JSON.
- We use views heavily, and have 28 views, including an "all" view for each document type, that returns all the ids that match the JSON type field we want.
- We rely heavily on incremental MapReduce (MR)
- We use Stale = FALSE as our default view fetching parameter, because certain view queries need to be 100% up to date. Note that not every query in our system uses Stale = FALSE.
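To make the Stale setup concrete, here is a minimal sketch of how a per-view staleness policy could be expressed when building view query URLs. It assumes the standard view REST endpoint on port 8092 and the `stale` values `false`, `ok`, and `update_after`; the host, bucket, design document, and view names are hypothetical placeholders, not our real ones.

```python
from urllib.parse import urlencode

# Hypothetical per-view policy: only views that must be fully
# current ask the server to update the index before answering.
STALE_POLICY = {
    "all_users": "false",            # must reflect the latest writes
    "all_articles": "update_after",  # may lag; index refreshes after the response
}
DEFAULT_STALE = "false"  # matches the default described above


def view_url(host, bucket, ddoc, view, **params):
    """Build a view query URL, choosing `stale` from the policy table."""
    query = {"stale": STALE_POLICY.get(view, DEFAULT_STALE)}
    query.update(params)  # extra query parameters such as limit or key
    return "http://%s:8092/%s/_design/%s/_view/%s?%s" % (
        host, bucket, ddoc, view, urlencode(query))
```

For example, `view_url("localhost", "default", "content", "all_articles", limit=10)` would request the hypothetical `all_articles` view with `stale=update_after`, while any view missing from the table falls back to `stale=false`.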
Here's where things break down:
1. When we push new updates to the server, sometimes our view code changes, and we programmatically update the view code on the server
2. The updates trigger the views to be recomputed, and the recomputation takes ~15 minutes
3. During the recomputation, even the trivially small or unchanged views are basically inaccessible and there are lots of timeouts (and we are using Stale = FALSE for the smaller ones)
4. This leads to timeouts in our system and would mean long downtime for our users (15 minutes) if we were to push to production.
5. Note that we might be querying the same view with Stale = FALSE multiple (many) times during this period. I'm not sure how these stack.
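One thing that can be checked on the client side for step 1 is whether a push actually changes any view code. The sketch below is only an illustration under assumptions: design documents are represented as plain dictionaries with a `views` key, and the names `needs_upload` and `plan_uploads` are hypothetical helpers, not part of any Couchbase SDK. It selects only the design documents whose view definitions differ from what the server already holds, so unchanged ones are not re-uploaded.

```python
import json


def canonical_views(ddoc):
    """Serialize only the view definitions, with a stable key order."""
    return json.dumps(ddoc.get("views", {}), sort_keys=True)


def needs_upload(local_ddoc, remote_ddoc):
    """True when the server has no copy or its view code differs."""
    if remote_ddoc is None:
        return True
    return canonical_views(local_ddoc) != canonical_views(remote_ddoc)


def plan_uploads(local_ddocs, remote_ddocs):
    """Return the sorted names of design documents that changed."""
    return sorted(
        name for name, ddoc in local_ddocs.items()
        if needs_upload(ddoc, remote_ddocs.get(name)))
```

The comparison is textual, so even a whitespace change inside a map function counts as a change. Whether skipping unchanged design documents is enough to keep their views responsive during a rebuild is exactly what we are unsure about.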
A few questions:
- What is the suggested design for a system such as ours? We initially put each document type in its own bucket, but we have ~20 types and the number was growing quickly, so that didn't work. Now we've put all the documents into one bucket, distinguished by type fields.
- Are there any simple fixes/workarounds to what we are doing? Any general suggestions?