Storing time-series data
Sat, 03/02/2013 - 03:38
We want to store time-series data in Couchbase. At present we use RRD for this.
I understand that by using dateToArray and group_level we can generate graphs per day, week, month, etc.
This works well for a single item, e.g. network bandwidth, but what if multiple items will be added to the system? Will we have to create a separate view for each item, e.g. cpu, disk, processes, etc.?
I am trying to think of a schema or view for this scenario. Can someone give some pointers?
Fri, 03/08/2013 - 14:53
Documents:
{"event": "cpu", "value": 1, "date": …}
{"event": "memory", "value": 2, "date": …}
Map:
function (doc, meta) {
  // key: [year, month, day, hour, minute, second]; value: [event, value]
  emit(dateToArray(doc.date), [doc.event, doc.value]);
}
Reduce:
function (key, values, rereduce) {
  var result = {};
  if (!rereduce) {
    // values are the [event, value] pairs emitted by the map function
    values.forEach(function (arr) {
      var event = arr[0], value = arr[1];
      if (!result.hasOwnProperty(event)) {
        result[event] = {count: 1, avg: value};
      } else {
        // incremental (running) mean
        var count = result[event].count + 1,
            avg = result[event].avg;
        avg += (value - avg) / count;
        result[event].count = count;
        result[event].avg = avg;
      }
    });
  } else {
    // values are partial results returned by earlier reduce calls
    values.forEach(function (res) {
      for (var event in res) {
        var part = res[event];
        if (!result.hasOwnProperty(event)) {
          result[event] = {count: part.count, avg: part.avg};
        } else {
          // merge two averages, weighting each by its count
          var count = result[event].count + part.count;
          var avg = result[event].avg * result[event].count / count + part.avg * part.count / count;
          result[event].count = count;
          result[event].avg = avg;
        }
      }
    });
  }
  return result;
}
Output:
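{"rows":[{
  "key": [2012, 3, 8],
  "value": {"cpu": {"count": %count_cpu_for_2012-03-08%, "avg": %avg_cpu_for_2012-03-08%}, "memory": {"count": %count_memory_for_2012-03-08%, "avg": %avg_memory_for_2012-03-08%}}
}]}
The code might be wrong, but it shows the idea.
By controlling group_level you set the resolution of the average: since dateToArray yields [year, month, day, hour, minute, second], group_level=3 groups by day (as in the output above) and group_level=2 groups by month.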
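To make the effect of group_level concrete, here is a minimal local simulation in plain JavaScript. It does not talk to Couchbase; it only mimics what grouping does, namely truncating each emitted key to its first N elements and reducing the rows that share the truncated key. The sample rows and the helper name averageByGroupLevel are made up for illustration, and unlike a real view the result is in insertion order rather than sorted by key.

```javascript
// Local simulation only: Couchbase performs this grouping server-side.
// Rows as the map function would emit them:
// key = dateToArray(doc.date), value = [event, value]. Sample data is invented.
const rows = [
  { key: [2013, 3, 8, 10, 0, 0], value: ["cpu", 10] },
  { key: [2013, 3, 8, 10, 30, 0], value: ["cpu", 20] },
  { key: [2013, 3, 8, 11, 0, 0], value: ["cpu", 60] },
  { key: [2013, 3, 9, 9, 0, 0], value: ["cpu", 30] },
  { key: [2013, 3, 8, 10, 0, 0], value: ["memory", 2] },
];

// Hypothetical helper: truncate every key to `groupLevel` elements and
// average the values per event within each truncated key.
function averageByGroupLevel(rows, groupLevel) {
  const groups = new Map();
  for (const { key, value: [event, value] } of rows) {
    const groupKey = JSON.stringify(key.slice(0, groupLevel));
    if (!groups.has(groupKey)) groups.set(groupKey, {});
    const stats = groups.get(groupKey);
    if (!stats[event]) stats[event] = { count: 0, sum: 0 };
    stats[event].count += 1;
    stats[event].sum += value;
  }
  return [...groups].map(([groupKey, stats]) => {
    const value = {};
    for (const event of Object.keys(stats)) {
      value[event] = stats[event].sum / stats[event].count;
    }
    return { key: JSON.parse(groupKey), value };
  });
}

const daily = averageByGroupLevel(rows, 3);  // one row per day
const hourly = averageByGroupLevel(rows, 4); // one row per hour
console.log(JSON.stringify(daily));
console.log(JSON.stringify(hourly));
```

The same map output therefore serves every resolution; only the group_level passed at query time changes.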
Yes, you will need to create a separate view for each metric unless grouping them is meaningful.
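A possible alternative that keeps a single view, offered as a sketch and not taken from the replies above, is to put the event name first in the emitted key. The emit and dateToArray definitions below are local stand-ins so the map function can be run outside Couchbase (inside a view both are provided by the server), and the ISO 8601 date strings are an assumption, since the thread does not say how doc.date is stored.

```javascript
// Stand-ins for local testing only; inside a Couchbase view these are built in.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}
function dateToArray(date) {
  // Approximation of the built-in: UTC date parts as numbers.
  const d = new Date(date);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
          d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()];
}

// Map function: key = [event, year, month, day, hour, minute, second].
function map(doc, meta) {
  if (doc.event && doc.date) {
    emit([doc.event].concat(dateToArray(doc.date)), doc.value);
  }
}

// Assumed sample documents with ISO 8601 dates.
map({ event: "cpu", value: 1, date: "2013-03-08T14:53:00Z" }, {});
map({ event: "memory", value: 2, date: "2013-03-08T14:53:00Z" }, {});
console.log(JSON.stringify(emitted));
```

With keys shaped like this, the built-in _stats reduce should return sum, count, min and max per group, so an average is sum divided by count. Querying with group_level=4 would then give one row per event per day, and a range such as startkey=["cpu"] with endkey=["cpu", {}] would restrict the result to a single metric. New metrics would need no new view, only documents with a new event name.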