Limitation of the reduce function.

I am developing a web analytics service (http://bitscup.com) based on Couchbase. One of the problems I have run into is a limit on the size of the reduce function's output:

reason: error (Reduction too large (66762 bytes))

I need to get a structure where the key is the URL and the value is the median time, and all of this must be grouped by user.

We are still in the development stage and our database is very small, only 200,000 records, so I am surprised to have already run into this problem. Could you tell me what I am doing wrong?

Example of the document:

{
   "ts": 1383629627580,
   "fts": 1383629627580,
   "hits": 1,
   "utm": {
       "source": "yandex cpc",
       "campaign": "noviy_sveet",
       "term": "c411103f2a634f973f604276418e6d2031d40a1c"
   },
   "site": "bitscup.com",
   "convs": [],
   "type": "session",
   "user": "93cfe4d7bd7a",
   "source": "yandex cpc"
}

My map function:

function (doc, meta) {
  if (meta.type === 'json' && doc.type === 'session') {
    var time = doc.ts - doc.fts;
    emit([doc.site, doc.ts.toString()], {src: doc.source, user: doc.user, time: time});
  }
}

My reduce function:

function(key, values, rereduce) {
  var result = {}, group = {}, median;
 
  median = function(array) {
    array.sort(function(a, b) { return a - b; });
    var mid = Math.floor(array.length / 2);
    if ((array.length % 2) === 1) {
      return array[mid];
    } else {
      return (array[mid - 1] + array[mid]) / 2;
    }
  };
 
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      for (var src in values[i]) {
        if (!(result[src])) {result[src] = {"count": 0, "median": []};}
        result[src] = {
          "count": (result[src].count + values[i][src].count),
          "median": result[src].median.concat(values[i][src].median)
        };
      }
    } else {
      group[values[i].user] = {
        "src": values[i].src,
        "time": values[i].time
      };
    }
  }
 
  for (var user in group) {
    if (!(result[group[user].src])) {result[group[user].src] = {"count": 0, "median": []};}
    result[group[user].src] = {
      "count": (result[group[user].src].count + 1),
      "median": result[group[user].src].median.concat(group[user].time)
    };
  }
 
  for (var src in result) {
    result[src].median = median(result[src].median);
  }
 
  return result;
}

Try putting most of the work in the map function and keeping the reduce function simple.
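As a rough sketch of what that could look like (not a tested solution for your data): move the grouping fields into the emitted key, emit only the session time as a small number, and use one of the built-in reduces such as _count or _stats. The median is then computed by the client from the unreduced rows. The emit stub, the key layout [site, source, user] and the helper medianBySource are my own assumptions for illustration. Note that this counts sessions, not distinct users as your reduce tries to do.

```javascript
// Stand-in for the emit() that the Couchbase view engine provides,
// so this sketch can run outside the server (assumption for illustration).
var rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Map: grouping fields go into the key, the value is one small number.
function map(doc, meta) {
  if (meta.type === 'json' && doc.type === 'session') {
    emit([doc.site, doc.source, doc.user], doc.ts - doc.fts);
  }
}

// In the view definition the reduce would just be the built-in
// "_count" or "_stats", so its output stays small.

// Hypothetical client-side helper: median time per source,
// computed from rows fetched without the reduce step.
function medianBySource(viewRows) {
  var bySource = {};
  viewRows.forEach(function (row) {
    var src = row.key[1];
    (bySource[src] = bySource[src] || []).push(row.value);
  });
  var result = {};
  Object.keys(bySource).forEach(function (src) {
    var times = bySource[src].sort(function (a, b) { return a - b; });
    var mid = Math.floor(times.length / 2);
    result[src] = {
      count: times.length,
      median: times.length % 2 === 1
        ? times[mid]
        : (times[mid - 1] + times[mid]) / 2
    };
  });
  return result;
}
```

The point of the design is that the reduce result no longer grows with the number of sources or users, which is what the "Reduction too large" error is complaining about.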

I had a feeling I was doing something wrong. We have changed the way we store the data, and now there is no need for a reduce function. Thank you.
