Efficient way to create views on a small subset of the data?
Suppose I have a view with a map step that I know will only emit for a small subset of the data, such as
function (doc, meta) {
  if (doc.foo) {
    ...
  }
}
where doc.foo is defined for fewer than 0.1% of documents. Suppose I already have a list of the documents that meet this condition, and want to create this view dynamically. As I understand it, the view index is built by evaluating ALL documents, which will be very inefficient in this case. Is there any way to make this more efficient (e.g. using an existing index to create a subindex)? If not, are there plans to add this capability in the future?
Thanks!
Thanks for the info, glad to know you are thinking about such things. One other workaround I was thinking of was having a dummy field, doc.subset, that is initially not defined on any document.
function (doc, meta) {
  if (doc.subset && doc.foo) {
    ...
  }
}
Then, when I want to run the map step on my subset, I set the doc.subset field in those documents, wait for the view index to update, then run the query. This should only run the map step on the subset as desired. After I am done, I can remove the doc.subset fields. Not sure about the efficiency of this option though.
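To make the flag workflow above concrete, here is a minimal sketch that simulates it outside Couchbase. Everything except the map condition is an assumption for illustration: the in-memory docs array, the buildIndex helper, and the emit(meta.id, doc.foo) body are hypothetical stand-ins. The helper also rebuilds from scratch on every call, whereas a real view index would only reprocess the documents that changed.

```javascript
// Wraps the map function so that `emit` can be supplied by this sketch.
// Inside a real view, emit is a global provided by the view engine.
function makeMap(emit) {
  return function (doc, meta) {
    if (doc.subset && doc.foo) {
      emit(meta.id, doc.foo); // assumed body; the original elides it
    }
  };
}

// Hypothetical in-memory stand-in for a bucket.
const docs = [
  { meta: { id: "a" }, doc: { foo: 1 } },
  { meta: { id: "b" }, doc: { bar: 2 } },
  { meta: { id: "c" }, doc: { foo: 3 } },
];

// Runs the map function over every entry and collects the emitted rows.
function buildIndex(entries) {
  const rows = [];
  const map = makeMap((key, value) => rows.push({ key, value }));
  for (const entry of entries) {
    map(entry.doc, entry.meta);
  }
  return rows;
}

const before = buildIndex(docs);  // no document carries the flag yet
docs[0].doc.subset = true;        // flag the document we care about
const after = buildIndex(docs);   // only the flagged document emits
delete docs[0].doc.subset;        // clean up once the query is done
const cleaned = buildIndex(docs);

console.log(before.length, after.length, cleaned.length); // 0 1 0
```

Document "c" has doc.foo but is never flagged, so it never emits, which is the behaviour I am after.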
Also, to get a dynamic map function, you could embed some JavaScript in a doc.body field and do:
function (doc, meta) {
  if (doc.subset && doc.foo) {
    eval(doc.body);
  }
}
You could create a separate bucket for that 0.1%. Your application logic would need to know to go to the second bucket for any documents that have doc.foo, so there is some extra logic involved, but that would solve it. Remember, what you are talking about is the initial indexing. It's incremental after that: only documents that have changed are reprocessed, which is much faster.
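As a rough illustration of the routing logic, the application could pick the bucket with a small helper like the one below. This is only a sketch: the bucket names and the bucketFor function are hypothetical, and it uses the same truthiness check on doc.foo as the map function in the question.

```javascript
// Hypothetical bucket names, chosen only for this example.
const MAIN_BUCKET = "main";
const SUBSET_BUCKET = "foo_subset";

// Returns the name of the bucket a document should be stored in
// and read from, based on whether doc.foo is set.
function bucketFor(doc) {
  return doc.foo ? SUBSET_BUCKET : MAIN_BUCKET;
}

console.log(bucketFor({ foo: "x" })); // foo_subset
console.log(bucketFor({ bar: 1 }));   // main
```

A view defined on the smaller bucket then only ever has to index the documents that carry doc.foo.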
Using indexes to create indexes is tricky for all kinds of reasons, mostly consistency related. We are currently working on ad hoc querying functionality but I don't have a timeline for it. But it's very exciting!
@scalabl3
Technical Evangelist
Couchbase Inc.