Problems Accessing Full Set in Couchbase

I've created a bucket called "data" with roughly 1.5 million keys; it consumes 464 MB of RAM. An example key and value:

key: 2000006394700604500@1384480800
value: {
    "products": {
        "CA534APF87KKQ": 2,
        "CA534APF07KJW": 0.5,
        "AT573APF09KPQ": 0.5,
        "FI911APF90OUB": 0.5,
        "FI911APF88LZB": 0.5,
        "CA534APF88KKP": 1.5,
        "AT573APF45KOG": 3.5
    },
    "eid": 2000006394700604400,
    "updated_at": 1384480800
}

The updated_at value is a timestamp.

After that I created a View using the following Map / Reduce functions:

Map:
function (doc, meta) {
    if ("eid" in doc) {
        var eid = doc.eid;
        var timestamp = doc.updated_at;

        var data = {};
        data[eid] = doc.products;

        emit(timestamp, data);
    }
}

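Reduce:
function(key, values, rereduce) {
    // Merge every emitted {eid: products} object into a single object.
    var data = {};

    values.forEach(function(value) {
        for (var eid in value) {  // declare eid to avoid an implicit global
            data[eid] = value[eid];
        }
    });

    return data;
}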

And I could get results such as (using group_level = 1):

Key Value
1383789600 { "2137025934600734500": { "WH587SHF92ECD": 0.5...}}
1383876000 { "2137407496700965400": { "PI931SHF91WQY": 0.5,}}
1383962400 { "2137864596000155600": { "PI931APF16CIX": 1.5}}

After that we created a Python application to process the data, for example:

>>> from couchbase import Couchbase
>>> cb = Couchbase.connect(host="localhost", bucket="data")
>>> r = cb.query("webtrack", "webtrack", True, group_level = 1)

We got correct results, but only 1085 values.

Since then I have been trying to get the full data set without success, and we couldn't work around this issue.

I tried publishing the view to see if that would create a production view, but so far no results appear. In Python, the command:

>>> r = cb.query("webtrack", "webtrack", False, group_level = 1)

returns no results. I also tried:

>>> r = cb.query("webtrack", "webtrack", False, group_level = 1, full_set = True)

But again, no results came back.

After researching this issue I learned that building the view might take a while. I wonder whether I misconfigured something when creating the production view, so that it is taking too long to finish building.

Am I doing something wrong? How do I access the full data set from Couchbase?

I appreciate your help in advance,

Will

1 Answer

Firstly, the object returned by the query is an iterator; if you're not iterating over it, you won't get any rows.
Additionally, you can check the underlying Query object's encoded property to see what is actually being passed over the network. For example:

viter = cb.query(.....)
print(viter.query.encoded)  # the query string sent to the server

for row in viter:
    print(row)  # ...

Hi mnunberg,

Thanks for the help. We were already doing what you suggested to retrieve data from Couchbase, and this brought us 1085 values. We're still not able to retrieve the whole data set.

Printing encoded yielded "full_set=1&group_level=1".

Is there anything else we can do?

Thanks in advance!

Well, let's first establish whether you are having an issue in the client, or are having an issue in getting the server to return your data. full_set should always yield the results.
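
One quick client-side check is to parse the encoded string you printed and confirm that nothing in it caps the number of rows. The snippet below is only a sketch in Python 3 using the standard library; it does not talk to Couchbase, and the choice of "limit" and "skip" as the parameters to look for is my assumption about what could truncate a result.

```python
from urllib.parse import parse_qs

# The encoded query string reported above.
encoded = "full_set=1&group_level=1"

# parse_qs returns a list of values per name; keep the first of each.
params = {name: values[0] for name, values in parse_qs(encoded).items()}

# View parameters that would cap or offset the rows if present (assumed list).
row_limiting = [name for name in ("limit", "skip") if name in params]

print("parameters sent:", params)
print("row-limiting parameters:", row_limiting or "none")
```

If neither parameter shows up, the client is probably not the one cutting the result short, and the row count is more likely a property of the view itself.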

Additionally, since you are using reduce (i.e. group_level=1), your results will only be the results of the reductions, which will be fewer than the total number of items emitted by 'map', since reduce by definition will reduce the results through some cumulative function.
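
To make that concrete, here is a rough plain-Python imitation of what a grouped reduce does with your map output. It is a sketch, not Couchbase code: the sample rows, the eid labels, and the helper names reduce_merge and grouped_reduce are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows as the map function would emit them:
# (timestamp, {eid: products}). The eids and products are invented.
mapped_rows = [
    (1383789600, {"eid-1": {"WH587SHF92ECD": 0.5}}),
    (1383789600, {"eid-2": {"PI931SHF91WQY": 0.5}}),
    (1383876000, {"eid-3": {"PI931APF16CIX": 1.5}}),
    (1383876000, {"eid-1": {"CA534APF87KKQ": 2}}),
    (1383962400, {"eid-4": {"AT573APF45KOG": 3.5}}),
]


def reduce_merge(values):
    """Mirror the view's reduce: merge every {eid: products} dict into one."""
    merged = {}
    for value in values:
        merged.update(value)
    return merged


def grouped_reduce(rows):
    """Group mapped rows by key, then reduce each group to a single row."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: reduce_merge(values) for key, values in sorted(groups.items())}


reduced_rows = grouped_reduce(mapped_rows)
print(len(mapped_rows), "mapped rows ->", len(reduced_rows), "reduced rows")
```

Five mapped rows collapse into three reduced rows, one per distinct timestamp. If your 1.5 million documents happen to share only 1085 distinct updated_at values, then 1085 rows would already be the complete reduced result; every eid would still be present, nested inside the value of its timestamp's row.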