Views and count distinct

I have searched a lot but failed to find a solution. I have documents like:

{
   "client":"a",
   "user":"b",
   ...
}

{
   "client":"a",
   "user":"b",
   ...
}

{
   "client":"a",
   "user":"c",
   ...
}

I want a view that shows me how many DISTINCT users exist for client a.

Please reply …

Any ideas?

:sob: help please …

After 5 days!! UP …:disappointed_relieved:

@matthew.groves @brett19 @vsr1 @ingenthr I really need your help

@socketman2016, Just got alerted on this one. Started looking at it. We shall respond soon.


Hi @socketman2016,

Does it need to be a map/reduce view, or is N1QL on the table? (I notice you tagged @vsr1.)

In N1QL it is very easy.
I want to know how I can do it in a map/reduce view; the reason is explained here: Question : Cached count

I know one solution would be: Feature request : _approx_count_distinct

Hi @socketman2016,

You could use N1QL embedded in an Eventing handler and store the result in a bucket. You can then refresh the aggregate periodically using timers.

Please see:
https://blog.couchbase.com/using-n1ql-with-couchbase-eventing-functions/
https://docs.couchbase.com/server/5.5/eventing/eventing-examples.html
https://blog.couchbase.com/timers-couchbase-functions/

Best Regards,
Siri


Hi @socketman2016,

You could create a map function that emits client and user as a compound key, with the built-in reduce function _count.
Then query the view with group_level=2, startkey [client, null] and endkey [client, "\uffff"], and count the number of rows returned. That row count is the number of distinct users for the client given in startkey[0].

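Map:
function (doc, meta) {
  // Emit a compound key so rows can be grouped by client, then by user.
  if (doc.client && doc.user) {
    emit([doc.client, doc.user], null);
  }
}

Reduce: _count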

Query:
Querying for distinct users for client "a":
stale=false&inclusive_end=true&full_set=true&group_level=2&startkey=%5B"a"%2C%20null%5D&endkey=%5B"a"%2C%20"%5Cuffff"%5D

Result:
{"rows":[
{"key":["a","Value1"],"value":100},
{"key":["a","Value2"],"value":100}
]
}
Each row is one distinct [client, user] pair, so counting the rows gives the number of distinct users for client "a" (2 in this result).
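To make the grouping idea concrete, here is a minimal sketch that emulates the map function and a _count reduce with group_level=2 in plain JavaScript, using the three sample documents from the question. It runs outside Couchbase; the emit callback parameter and the helper names groupedCount and countDistinctUsers are assumptions made for illustration, not part of the view engine's API.

```javascript
// Map function from the answer, adapted to receive emit as a parameter so it
// can run outside Couchbase (hypothetical harness, not the server API).
function mapFn(doc, meta, emit) {
  if (doc.client !== undefined && doc.user !== undefined) {
    emit([doc.client, doc.user], null);
  }
}

// Emulates _count with group_level=2: one row per distinct [client, user]
// key, whose value is the number of rows emitted for that key.
function groupedCount(docs) {
  const groups = new Map();
  for (const doc of docs) {
    mapFn(doc, {}, (key) => {
      const k = JSON.stringify(key);
      groups.set(k, (groups.get(k) || 0) + 1);
    });
  }
  return [...groups].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

// Distinct users for a client = number of grouped rows for that client.
function countDistinctUsers(docs, client) {
  return groupedCount(docs).filter((row) => row.key[0] === client).length;
}

// Sample documents from the question.
const docs = [
  { client: "a", user: "b" },
  { client: "a", user: "b" },
  { client: "a", user: "c" },
];

console.log(countDistinctUsers(docs, "a")); // prints 2
```

Note that the count comes from the number of rows, not from the row values, so the client still has to receive one row per distinct user.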


@AnkitPrabhu your approach does not scale.
What about 100 million distinct users per client? The query would have to return one row per user.
N1QL works better than this approach.

@Siri, as Eventing is not available in Community Edition, is there an alternative?

You could try out Enterprise Edition (EE) if you would like to play around with Eventing. Support in Community Edition is on the future roadmap.


@jeelan.poola is there a roadmap reference? Can you tell me when it will be available in CE?
@Siri, as Eventing is not available in Community Edition, is there an alternative?
@Siri, is Eventing eventually consistent? Assume a function increments three counters: what happens if the first counter has been incremented and a node fails while the second is being incremented?

Hi Socketman2016,

We don’t have a firm date unfortunately. As Eventing is considered a developer feature, it is a candidate for inclusion to CE, but I don’t know the scope or timing of it, sorry.

Regarding consistency - eventing sees mutations after they have occurred. So there is a time lag between mutation and handler running.

Regarding counters - you shouldn’t count mutations because mutations are de-duplicated. When you create a handler, it sees all documents (“everything”), and if a document was overwritten multiple times, the handler will only see the newest few versions of that document. So you can’t rely on counting mutations, regardless of node failure.

Regarding node failure - the Eventing engine regularly checkpoints the sequence number up to which it has processed. If there is a node failure, a new node takes over from the last checkpoint. This means that there can be duplicate processing within the checkpoint window, but not missed processing. The checkpoint window size is configurable as an advanced setting.
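As an illustration of why duplicate processing matters for counters, here is a small in-memory sketch in plain JavaScript. It does not use the Eventing API; the replayWithDuplicates model and the other helper names are assumptions that only imitate "resume from the last checkpoint" by delivering the tail of the mutation stream twice. A blind increment over-counts under replay, while adding users to a set is idempotent.

```javascript
// Hypothetical model of at-least-once delivery: all mutations are processed
// once, then everything after the last checkpoint is delivered again.
function replayWithDuplicates(mutations, checkpointIndex) {
  return mutations.concat(mutations.slice(checkpointIndex));
}

// Non-idempotent: every delivery bumps the counter, so replays are counted twice.
function countByIncrement(stream) {
  let counter = 0;
  for (const m of stream) {
    counter += 1;
  }
  return counter;
}

// Idempotent: re-adding the same user to a set does not change its size.
function countBySet(stream) {
  const users = new Set();
  for (const m of stream) {
    users.add(m.user);
  }
  return users.size;
}

const mutations = [
  { client: "a", user: "b" },
  { client: "a", user: "c" },
  { client: "a", user: "d" },
];

// Assume the failure happened after the checkpoint taken at index 1.
const stream = replayWithDuplicates(mutations, 1);

console.log(countByIncrement(stream)); // prints 5 (over-counted)
console.log(countBySet(stream));       // prints 3 (correct)
```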

We do have features planned to address some of the above and make Eventing more comparable with View capabilities over time, but as those are not scheduled for a release yet, I will not go into the details.

It seems to me that if @AnkitPrabhu can refine his suggestion, Views may work better for your use case.

Siri
