Question : Cached count

socketman2016 · January 15, 2019, 5:36am

I am creating a tenant-based service. Each tenant has a dashboard that shows some stats, such as the user count.
How can I keep a cached count of users per tenant? I don’t want to count them each time.

What is the best way? I have hundreds of thousands of tenants and hundreds of millions of users.

vsr1 · January 15, 2019, 6:27am

How about having each tenant store its user count as an atomic counter? https://docs.couchbase.com/server/4.0/developer-guide/counters.html

socketman2016 · January 15, 2019, 7:48am

@vsr1, is there an easier way?
How can I ensure consistency? My counter must be eventually consistent. What should I do if the user count and the atomic counter get out of sync, for instance if I add a user but fail to increment the counter?

ingenthr · January 16, 2019, 2:05am

JavaScript views would work well for this.

Views retain inner data structures in the index on the counted items that qualify within the tree. These are built asynchronously. So, when you query the aggregation, you’ll get a cached copy of the count from the most recent update.

socketman2016 · January 17, 2019, 5:22am

@ingenthr Can you show me an example?

socketman2016 · January 20, 2019, 2:28pm

@ingenthr reply please

socketman2016 · January 21, 2019, 5:33pm

@ingenthr I apologize for the mention, but I need help. Please help.

ingenthr · January 21, 2019, 6:27pm

Sorry for the delay, just a quick note to say I saw this and will get you an example shortly.

ingenthr · January 23, 2019, 2:12am

Given a set of documents which represent users where the key has the userid and tenant ID embedded like this…

key: u0:t0

{
  "tenant": 0,
  "user": 0,
  "name": "Groucho"
}

And then given a View with this index code that matches the keys to a pattern, grabbing the IDs out and emitting an entry for each tenant ID into the results…

function (doc, meta) {
  // Keys look like "u0:t0": user ID first, then tenant ID.
  var re = /^u(\d+):t(\d+)$/;
  var matched = re.exec(meta.id); // null when the key does not match

  if (matched) {
    emit("tenant" + matched[2], null); // the tenant matched
  }
}

… and given a reduce using the built-in _count, a query against this view would give you the count of users in tenant0 and tenant1. Here is the query string I used from my browser against a bucket named test (in the app I’d use an SDK, making sure I specify to “group”):

http://localhost:8092/test/_design/dev_agg/_view/usercount?limit=6&stale=false&connection_timeout=60000&inclusive_end=true&skip=0&full_set=true&group=true

The response:

{"rows":[
{"key":"tenant0","value":2},
{"key":"tenant1","value":2}
]
}

My bucket had four users across two tenants and I get a count of two from each.
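
To make the map and _count steps concrete, here is a small local simulation in plain JavaScript. It is only a sketch and does not talk to Couchbase: the emit callback, the runView helper, and the extra sample users are assumptions for illustration, and the pattern assumes keys of the form u<user>:t<tenant>.

```javascript
// Map function in the style of the view; emit is passed in so it can run locally.
function mapUser(doc, meta, emit) {
  var matched = /^u(\d+):t(\d+)$/.exec(meta.id);
  if (matched) {
    emit("tenant" + matched[2], null);
  }
}

// Hypothetical helper: runs the map over all docs, then counts
// emitted rows per key, like a _count reduce queried with group=true.
function runView(docs) {
  var counts = {};
  docs.forEach(function (entry) {
    mapUser(entry.doc, { id: entry.id }, function (key) {
      counts[key] = (counts[key] || 0) + 1;
    });
  });
  return Object.keys(counts).sort().map(function (key) {
    return { key: key, value: counts[key] };
  });
}

// Four sample users across two tenants (names other than Groucho are made up).
var sample = [
  { id: "u0:t0", doc: { tenant: 0, user: 0, name: "Groucho" } },
  { id: "u1:t0", doc: { tenant: 0, user: 1, name: "Harpo" } },
  { id: "u2:t1", doc: { tenant: 1, user: 2, name: "Chico" } },
  { id: "u3:t1", doc: { tenant: 1, user: 3, name: "Zeppo" } }
];

console.log(JSON.stringify(runView(sample)));
```

In a real view the function takes only (doc, meta) and emit is provided by the view engine; the third parameter here exists only so the sketch runs on its own.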

I intentionally made this one quite compact. It depends on being able to determine a user from the key, but you could also just match on a field of the document if you didn’t want to use the key pattern. Just change the logic in the function to emit whatever tenant keys and values make sense in your resulting dataset. This should perform quite well as it will only re-perform the count aggregation upon request and if you specify a range to the view query, it’ll only recalculate the count for that tenant.
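
If you prefer to match on a document field instead of the key pattern, the map function could look roughly like the sketch below. It assumes every user document carries a numeric tenant field, as in the sample document; the emit parameter and the collectEmits helper are hypothetical and only there so the snippet can be run outside Couchbase.

```javascript
// Field-based variant: emits one row per user document that has a tenant field.
function mapByTenantField(doc, meta, emit) {
  if (doc && typeof doc.tenant === "number") {
    emit("tenant" + doc.tenant, null);
  }
}

// Hypothetical local helper that records what the map function emits.
function collectEmits(doc, id) {
  var rows = [];
  mapByTenantField(doc, { id: id }, function (key, value) {
    rows.push({ key: key, value: value });
  });
  return rows;
}

console.log(collectEmits({ tenant: 0, user: 0, name: "Groucho" }, "u0:t0"));
console.log(collectEmits({ type: "invoice" }, "inv1")); // no tenant field, nothing emitted
```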

socketman2016 · January 23, 2019, 5:51am

As I understand it, the reduce function is executed each time I run a query. Right?
If the reduce function runs each time, I have two issues:

How can I get the count for tenant0 only? I don’t want to calculate it for all tenants. How can I filter?
Assume a tenant has millions of users. If the reducer needs to run each time and must count millions of documents, I don’t think that is efficient.

ingenthr · January 23, 2019, 7:02am

No. The views engine will execute it only when data is changed and will store summaries on the interior of the index, so the cost is minimal.

You can specify ranges when querying the view. That’ll constrain it to the count of the range of interest, even if that’s just one value. See the docs on view querying.
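
As a rough illustration of constraining the query to one tenant, the sketch below builds a view URL with the key parameter. The base URL reuses the bucket, design document, and view names from the earlier example; the tenantCountUrl helper is hypothetical. View keys are passed as JSON, so the string key is JSON-encoded and then URL-encoded.

```javascript
// Hypothetical helper: builds a view query URL limited to a single tenant key.
function tenantCountUrl(baseUrl, tenantId) {
  var key = JSON.stringify("tenant" + tenantId); // "\"tenant0\"" as JSON
  return baseUrl +
    "?group=true&stale=false&full_set=true" +
    "&key=" + encodeURIComponent(key);
}

var base = "http://localhost:8092/test/_design/dev_agg/_view/usercount";
console.log(tenantCountUrl(base, 0));
```

For a span of keys rather than a single one, startkey and endkey can be used in place of key, with the same JSON encoding.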

The reduce will only be run for a subset containing changed documents.

I might recommend running a quick benchmark to prove it to yourself. Add a million docs. Measure how long the first view query takes. Then time subsequent requests. Then change a random subset of docs, maybe 5%, and run a view query again.

I expect you’ll see initial high cost, then subsequent low cost as the aggregation is summarized in the index for a subset of the data. As long as it isn’t changing quickly, even a stale=false request shouldn’t be too expensive.

socketman2016 · March 9, 2019, 6:29am

@ingenthr how can I group count by day?

{
  "tenant": 0,
  "user": 0,
  "name": "Groucho",
  "registerDate": 1552112926
}

I want to get count per day per tenant

socketman2016 · March 11, 2019, 2:20pm

Help needed!!!..
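
For the per-day, per-tenant count asked about above, one possible approach (a sketch, not something confirmed in this thread) is to emit an array key of tenant, year, month, and day, keep the built-in _count reduce, and query with group_level=4. The sketch assumes registerDate is a Unix timestamp in seconds and that days are bucketed in UTC; the emit parameter and the keyFor helper are hypothetical and only make the snippet runnable locally.

```javascript
// Emits [tenant, year, month, day] so _count can be grouped per tenant per day.
function mapByTenantDay(doc, meta, emit) {
  if (typeof doc.tenant !== "number" || typeof doc.registerDate !== "number") {
    return; // skip documents without the expected fields
  }
  var d = new Date(doc.registerDate * 1000); // seconds to milliseconds
  emit([doc.tenant, d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate()], null);
}

// Hypothetical local helper: returns the key the map function would emit.
function keyFor(doc) {
  var emitted = null;
  mapByTenantDay(doc, { id: "u0:t0" }, function (key) { emitted = key; });
  return emitted;
}

console.log(JSON.stringify(keyFor({
  tenant: 0, user: 0, name: "Groucho", registerDate: 1552112926
})));
```

With a key shaped like this, a range such as startkey=[0] and endkey=[0,{}] would limit the result to tenant 0, and a lower group_level would roll the counts up per month or per year.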
