How to GROUP BY and Filter using the Array values in Analytics N1QL query

Hi Couchbase Gurus,

I need your help and expertise: is there a way in an Analytics N1QL query to GROUP BY and filter the results based on a given array of values?

So, the array values would be:

["Sent Order", "Contact Emailed", "Contact Called"]

My documents:
// user documents:

[
 {
	"id": "user::0001",
	"firstName": "John",
	"lastName": "Doe"
 },
 {
	"id": "user::0002",
	"firstName": "Lyn",
	"lastName": "Max"
 },
 {
	"id": "user::0005",
	"firstName": "Garry",
	"lastName": "Poe"
 }
]

// activity documents:

[
 {
 	"id": "activity::1001",
 	"user": {
 	  "id": "user::0001",
 	  "name": "John Doe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Sent Order"
 },
 {
 	"id": "activity::1002",
 	"user": {
 	  "id": "user::0002",
 	  "name": "Lyn Max",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Sent Order"
 },
 {
 	"id": "activity::1003",
 	"user": {
 	  "id": "user::0001",
 	  "name": "John Doe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Sent Order"
 },
 {
 	"id": "activity::1004",
 	"user": {
 	  "id": "user::0005",
 	  "name": "Garry Poe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Sent Order"
 },
 {
 	"id": "activity::1005",
 	"user": {
 	  "id": "user::0005",
 	  "name": "Garry Poe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Contact Called"
 },
  {
 	"id": "activity::1006",
 	"user": {
 	  "id": "user::0005",
 	  "name": "Garry Poe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Contact Email"
 },
 {
 	"id": "activity::1007",
 	"user": {
 	  "id": "user::0002",
 	  "name": "Lyn Max",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Check In"
 },
  {
 	"id": "activity::1008",
 	"user": {
 	  "id": "user::0005",
 	  "name": "Garry Poe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Check In"
 }
]

Here is my partial query, but I have had no luck getting the desired result :frowning:

SELECT
  META(users).id,
  users.firstName,
  users.lastName,
  activitySummary
FROM `company_analytics`.users AS users

LEFT OUTER JOIN (
  SELECT 
    `user`.id AS id,
    COUNT(activityType) AS sentOrders
  FROM `company_analytics`.activities
  WHERE activityType IN ["Sent Order", "Contact Emailed", "Contact Called"]
  GROUP BY activityType, `user`.id
) AS activitySummary ON META(users).id = activitySummary.id

WHERE users.`title`.id = 'administrator'
ORDER BY activitySummary.sentOrders DESC

This would be my desired results:

[
 {
	"id": "user::0001",
	"firstName": "John",
	"lastName": "Doe",
	"activitySummary": {
	  "sentOrders": 2,
	  "calls": 0,
	  "emails": 0
	}
 },
 {
	"id": "user::0002",
	"firstName": "Lyn",
	"lastName": "Max",
	"activitySummary": {
	  "sentOrders": 1,
	  "calls": 0,
	  "emails": 0
	}
 },
 {
	"id": "user::0005",
	"firstName": "Garry",
	"lastName": "Poe",
	"activitySummary": {
	  "sentOrders": 1,
	  "calls": 1,
	  "emails": 1
	}
 }
]

So whatever array values are given, for example ["Sent Order", "Contact Emailed", "Contact Called"], those should be the fields in my activitySummary.

I really need your help.

Thanks Guys,

Hi @lyndondonz,

Please use ‘user_activity_summary’ from the following query on the inner side of your LEFT OUTER JOIN:

with requestedActivities as ["Sent Order", "Contact Email", "Contact Called"],
user_activity_count as (
  select user_id, activityType, count(*) as activityCount 
  from activities
  where array_contains(requestedActivities, activityType)
  group by user.id as user_id, activityType
), 
user_activity_summary as (
 select user_id, object_add_fields( {}, (from g as gi select gi.ua.activityType as `field-name`, gi.ua.activityCount as `field-value`)) activitySummary 
 from user_activity_count ua 
 group by user_id group as g
)
select * from user_activity_summary
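
To see what the first stage contributes, it can help to run user_activity_count by itself. This is a minimal sketch, assuming the `activities` dataset holds exactly the eight sample activity documents from the question; the ORDER BY is added here only to make the listing stable.

```sql
-- First stage on its own: one row per (user, activityType) pair.
with requestedActivities as ["Sent Order", "Contact Email", "Contact Called"]
select user_id, activityType, count(*) as activityCount
from activities
where array_contains(requestedActivities, activityType)
group by user.id as user_id, activityType
order by user_id, activityType;

-- Rows expected for the sample documents ("Check In" is filtered out):
--   user::0001   Sent Order       2
--   user::0002   Sent Order       1
--   user::0005   Contact Called   1
--   user::0005   Contact Email    1
--   user::0005   Sent Order       1
```

The second stage then folds each user's rows into a single object via GROUP AS, which is why only activity types that actually occur for a user appear as fields.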

Here’s the overall query:

with requestedActivities as ["Sent Order", "Contact Email", "Contact Called"],
user_activity_count as (
  select user_id, activityType, count(*) as activityCount 
  from activities
  where array_contains(requestedActivities, activityType)
  group by user.id as user_id, activityType
), 
user_activity_summary as (
 select user_id, object_add_fields( {}, (from g as gi select gi.ua.activityType as `field-name`, gi.ua.activityCount as `field-value`)) activitySummary 
 from user_activity_count ua 
 group by user_id group as g
)
SELECT
  META(users).id,
  users.firstName,
  users.lastName,
  uas.activitySummary
FROM users LEFT OUTER JOIN user_activity_summary AS uas
ON META(users).id = uas.user_id

Hi @dmitry.lychagin,

Cool, thanks man! it works :slight_smile:

I have a small problem with the response: how do I include the other array values from requestedActivities ["Sent Order", "Contact Email", "Contact Called"]?

Here is the result for the query:

[
  {
    "id": "user::53fe5737-9b8c-4c8b-b9d7-87521055b995",
    "activitySummary": {
      "Contact Called": 7,
      "Sent Order": 2
    },
    "firstName": "Lyn",
    "lastName": "Max"
  }
]

How can I include the other array values (for example "Contact Email") even when the count would be 0, null, or missing?
My desired output would be:

{
	"id": "user::0005",
	"firstName": "Garry",
	"lastName": "Poe",
	"activitySummary": {
	  "sentOrders":  2,
	  "calls": 7,
	  "emails": 0
	}
 }

Also, I would like this kind of response so that I can use ORDER BY (for example: activitySummary.sentOrders).

Thanks :slight_smile:

Hi @lyndondonz,

Here’s the query (also does ORDER BY):

with requestedActivities as ["Sent Order", "Contact Email", "Contact Called"],
activitySummary0 as object_add_fields( toobject([]), (from requestedActivities as ra select ra as `field-name`, 0 as `field-value`) ),
user_activity_count as (
  select user_id, activityType, count(*) as activityCount 
  from activities
  where array_contains(requestedActivities, activityType)
  group by user.id as user_id, activityType
), 
user_activity_summary as (
 select user_id, 
   object_add_fields({}, (from g as gi select gi.ua.activityType as `field-name`, gi.ua.activityCount as `field-value`)) as activitySummary
 from user_activity_count ua 
 group by user_id group as g
)
select meta(users).id, users.firstName, users.lastName,
  case when uas.activitySummary is missing then activitySummary0 else object_concat( activitySummary0, uas.activitySummary ) end activitySummary
from users left outer join user_activity_summary as uas on meta(users).id = uas.user_id
order by activitySummary.`Sent Order`
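
The zero-fill works because object_concat is given the defaults first and the real counts second, so the second object takes precedence for fields present in both. A small standalone sketch with literal objects illustrates this; the counts are copied from the result posted earlier in the thread, and the expected result is inferred from how the query above behaves.

```sql
-- Defaults first, real counts second: fields present in both take the
-- value from the second object, and missing ones keep the default 0.
select value object_concat(
  { "Sent Order": 0, "Contact Email": 0, "Contact Called": 0 },
  { "Contact Called": 7, "Sent Order": 2 }
);
-- Expected: one object with "Sent Order": 2, "Contact Email": 0, "Contact Called": 7
```

Note that ORDER BY sorts ascending by default; append DESC to list the users with the most sent orders first.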

Hi @dmitry.lychagin,

Cool man!, the query works :slight_smile:

I get a Couchbase warning when I run the query:
cross_product_join

Is this OK? :slight_smile:

This cross join in this query is not a problem because there’s only a single document on the right side of that join.
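
One plausible reading of the warning, stated here as an assumption rather than something confirmed in the thread: activitySummary0 is computed once, independently of users, so it is attached to every user row without a join condition. The sketch below simply evaluates that definition on its own to show that it yields a single object.

```sql
-- Evaluate the default-summary definition by itself.
-- It produces exactly one object with a zero-valued field per requested activity.
with requestedActivities as ["Sent Order", "Contact Email", "Contact Called"],
activitySummary0 as object_add_fields( toobject([]), (from requestedActivities as ra select ra as `field-name`, 0 as `field-value`) )
select value activitySummary0;
```

Because that side has only one value, pairing it with every user row adds no extra rows to the result.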

Hi @dmitry.lychagin,

Whew!.. cool man thanks :slight_smile:

Thank you so much.

Hi @dmitry.lychagin,

The query you provided works great! :slight_smile:

But I have a slight problem: I forgot to mention that I have another set of array values, ["On Premise", "Off Premise"]:

// the same activity document.

[
 {
 	"id": "activity::1007",
 	"user": {
 	  "id": "user::0002",
 	  "name": "Lyn Max",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Check In",
 	"onPremise": false
 },
  {
 	"id": "activity::1008",
 	"user": {
 	  "id": "user::0005",
 	  "name": "Garry Poe",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Check In",
 	"onPremise": true
 },
{
 	"id": "activity::1009",
 	"user": {
 	  "id": "user::0002",
 	  "name": "Lyn Max",
 	  "title": "Sales Rep"
 	},
 	"activityType": "Check In",
 	"onPremise": false
 }
]

My desired response for activitySummary when it includes ["On Premise", "Off Premise"]:

"activitySummary": {
      "Contact Called": 7,
      "Sent Order": 2,
      "Contact Email": 0,
      "On Premise": 1,
      "Off Premise": 2
    }

Getting / filtering the On Premise:
WHERE activityType = 'Check In' AND onPremise = true

Getting / filtering the Off Premise:
WHERE activityType = 'Check In' AND onPremise = false

I hope you can help me solve this query. I have been trying to solve it but failed, and I really need help.

Thanks

Hi @lyndondonz,

Is the "onPremise" field only defined for "activityType" = "Check In", or is it applicable to other activities? If it's only for "activityType" = "Check In", then we could rewrite activityType in the query to be "On Premise" / "Off Premise" instead of "Check In" to differentiate between the two.

user_activity_count as (
  select user_id, activityType2 as activityType, count(*) as activityCount 
  from activities
  let activityType2 = case when activityType = "Check In" and onPremise then "On Premise" when activityType = "Check In" and not onPremise then "Off Premise" else activityType end
  where array_contains(requestedActivities, activityType2)
  group by user.id as user_id, activityType2
)

Note that the filter has to test the rewritten activityType2, not the original activityType. You'll also need to modify the activitySummary0 definition so that it has "On Premise": 0 and "Off Premise": 0 instead of "Check In": 0. Since it is built from requestedActivities, that means listing "On Premise" and "Off Premise" there.


@dmitry.lychagin,

Regarding your question, "Is the onPremise field only defined for activityType = Check In": yes.

Great, it works. Thank you so much.

I have one last question: is there a way to change the property names in activitySummary?
From this:

"activitySummary": {
      "Contact Called": 7,
      "Off Premise": 3,
      "On Premise": 6,
      "Sent Order": 2,
      "Contact Email": 0
    }

to this:

"activitySummary": {
      "contactCalled": 7,
      "offPremise": 3,
      "onPremise": 6,
      "sentOrder": 2,
      "contactEmail": 0
    }

This is my full query:

WITH activityTypes AS ["Sent Order", "Contact Email", "Contact Called", "On Premise", "Off Premise"],
activitySummary0 AS object_add_fields( toobject([]), (FROM activityTypes AS ra SELECT ra AS `field-name`, 0 AS `field-value`) ),
activityTypesCount AS (
  SELECT 
    user_id,
    activityType2 as activityType, 
    count(*) AS activityCount 
  FROM `company_analytics`.activities
  LET activityType2 = (
    CASE WHEN activityType = "Check In" AND onPremise THEN "On Premise"
         WHEN activityType = "Check In" AND NOT onPremise THEN "Off Premise" 
         ELSE activityType END)
  WHERE array_contains(activityTypes, activityType2)
  AND createdAt BETWEEN '2020-01-01T06:00:00.000Z' AND '2020-09-30T22:34:46.621Z'
  GROUP BY user.id AS user_id, activityType2
),
user_activity_summary AS (
 SELECT 
   user_id, 
   object_add_fields({}, (FROM g AS gi SELECT gi.ua.activityType AS `field-name`, gi.ua.activityCount AS `field-value`)) AS activitySummary
 FROM activityTypesCount ua 
 GROUP BY user_id GROUP AS g
)
SELECT
  meta(users).id,
  users.firstName,
  users.lastName,
  CASE 
    WHEN uas.activitySummary IS MISSING
  THEN activitySummary0 
    ELSE object_concat( activitySummary0, uas.activitySummary ) END activitySummary
FROM `company_analytics`.users AS users 
LEFT OUTER JOIN user_activity_summary AS uas ON meta(users).id = uas.user_id

WHERE users.`title`.id = 'administrator'

ORDER BY activitySummary.`Off Premise` DESC

Thanks.

Hi @lyndondonz,

Here’s one way:

let x = "Contact Called"
select lower(substr(x, 0, 1)) || replace(substr(x, 1), " ", "")
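
As a quick check, the same expression can be applied to every activity name used in this thread. This is a sketch only: it lowercases the first character and strips spaces, so it suits names like these but is not a general camelCase converter.

```sql
-- Apply the renaming expression to each requested activity name.
from ["Sent Order", "Contact Email", "Contact Called", "On Premise", "Off Premise"] as x
select value lower(substr(x, 0, 1)) || replace(substr(x, 1), " ", "");
-- Expected values (row order is not guaranteed):
--   "sentOrder", "contactEmail", "contactCalled", "onPremise", "offPremise"
```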

There are two places in the query that create these field names:

from ... as x select x as `field-name`

You need to replace x with the above expression

from requestedActivities as ra select lower(substr(ra, 0, 1)) || replace(substr(ra, 1), " ", "") as `field-name`, ...

and

from g as gi select lower(substr(gi.ua.activityType, 0, 1)) || replace(substr(gi.ua.activityType, 1), " ", "")
 as `field-name`, ...
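
Putting the two replacements into the full query from the previous post might look like the sketch below. It is assembled from the pieces in this thread and has not been run against the real dataset; the one additional change is that ORDER BY has to reference the renamed field (offPremise) rather than `Off Premise`.

```sql
WITH activityTypes AS ["Sent Order", "Contact Email", "Contact Called", "On Premise", "Off Premise"],
-- Zero-valued defaults, keyed by the camelCase names.
activitySummary0 AS object_add_fields( toobject([]), (
  FROM activityTypes AS ra
  SELECT lower(substr(ra, 0, 1)) || replace(substr(ra, 1), " ", "") AS `field-name`,
         0 AS `field-value`) ),
activityTypesCount AS (
  SELECT user_id, activityType2 AS activityType, count(*) AS activityCount
  FROM `company_analytics`.activities
  LET activityType2 = (
    CASE WHEN activityType = "Check In" AND onPremise THEN "On Premise"
         WHEN activityType = "Check In" AND NOT onPremise THEN "Off Premise"
         ELSE activityType END)
  WHERE array_contains(activityTypes, activityType2)
  AND createdAt BETWEEN '2020-01-01T06:00:00.000Z' AND '2020-09-30T22:34:46.621Z'
  GROUP BY user.id AS user_id, activityType2
),
user_activity_summary AS (
  SELECT user_id,
    -- Real counts, keyed by the same camelCase names as the defaults.
    object_add_fields({}, (
      FROM g AS gi
      SELECT lower(substr(gi.ua.activityType, 0, 1)) || replace(substr(gi.ua.activityType, 1), " ", "") AS `field-name`,
             gi.ua.activityCount AS `field-value`)) AS activitySummary
  FROM activityTypesCount ua
  GROUP BY user_id GROUP AS g
)
SELECT meta(users).id, users.firstName, users.lastName,
  CASE WHEN uas.activitySummary IS MISSING THEN activitySummary0
       ELSE object_concat(activitySummary0, uas.activitySummary) END activitySummary
FROM `company_analytics`.users AS users
LEFT OUTER JOIN user_activity_summary AS uas ON meta(users).id = uas.user_id
WHERE users.`title`.id = 'administrator'
-- The sort key must use the renamed field as well.
ORDER BY activitySummary.offPremise DESC;
```

Both field-name expressions must produce identical names; otherwise object_concat would keep the zero default and the real count side by side under different keys.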

Hi @dmitry.lychagin,

Thanks Man! it works :slight_smile: appreciated.