View performance

Hi, is it normal that view performance drops like a rock at about 10,000 view entries (it's a _stats map/reduce view)? A get() query against the view takes about 40 milliseconds with the new Couchbase Python SDK 1.0.0. What will it look like after a few hundred million view entries?
I didn't test the throughput, but can I expect several thousand get() queries per second against a large (100 million entries) view on a ~10-node cluster? Or are views really just a replacement for traditional MapReduce jobs, which you usually run once a day and which take a few hours depending on what you're doing?

The query I run is always:
bucket.query('s', 'd', query='key="aKey"&group=true&stale=ok&limit=1')
I read the documentation on how to query most efficiently in Python, and this is what I came up with.

It's important to add that the view gets several thousand updates per second. But even when I stopped the updates, let Couchbase compact the view and the database, and then queried the view, I still got that high latency.

The results I receive look like this JSON: {u'count': 0, u'max': 0, u'sum': 0, u'sumsqr': 0, u'min': 0}

1 Answer

The actual performance of the view depends on a variety of factors:

1) How much work the server has to do in order to aggregate those results
2) Whether the include_docs option is being used at the client
3) Whether the streaming option is being used at the client.

Generally, I'd recommend using the streaming option in the query method if you expect to get a large number of results.

I can help further if you post example code showing which options are passed to the query() method.

I edited my question.

I uninstalled and reinstalled, and now I get an average latency of 300 microseconds at well over 100,000 view entries (though I don't have an SSD). I don't know why, but everything seems to be fine now.
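
For anyone who wants to reproduce latency figures like the ones in this thread, here is a minimal sketch of a timing harness. It assumes Python 3 (for time.perf_counter), and the names measure_latency, summarize and run_query are hypothetical helpers, not part of the Couchbase SDK. The "sequential_qps" value is only what a single client issuing one query at a time could reach; it says nothing about the throughput of a cluster serving many concurrent clients.

```python
import statistics
import time


def measure_latency(run_query, repeats=1000):
    """Call run_query() `repeats` times and return each call's latency in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw latencies (seconds) to a few headline numbers."""
    mean = statistics.mean(samples)
    return {
        "mean_ms": mean * 1000.0,
        "median_ms": statistics.median(samples) * 1000.0,
        "max_ms": max(samples) * 1000.0,
        # Upper bound for ONE client that waits for each reply before sending the next.
        "sequential_qps": 1.0 / mean if mean > 0 else float("inf"),
    }


# Usage (hypothetical): wrap the exact call from the question in a function.
# If the SDK returns a lazy iterator, consume it so the request is actually sent.
#
# def run_query():
#     list(bucket.query('s', 'd', query='key="aKey"&group=true&stale=ok&limit=1'))
#
# print(summarize(measure_latency(run_query, repeats=1000)))
```

With the numbers reported above, a 40 millisecond mean corresponds to about 25 sequential queries per second per client, and a 300 microsecond mean to roughly 3,300.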