Analytics N1QL Query: GROUP BY `dates` base on the given date values

Hi Guys,

I’m wondering if there’s a way in Analytics N1QL query I could GROUP BY dates based on the given date values:
defined dates:

[
  "2020-02-11T00:00:00.000Z",
  "2020-03-25T00:00:00.000Z",
  "2020-04-05T00:00:00.000Z",
  "2020-05-03T00:00:00.000Z",
  "2020-06-09T00:00:00.000Z",
  "2020-07-15T00:00:00.000Z",
  "2020-08-20T00:00:00.000Z",
]

Here is my documents: sales

[
  {
  	"entityId": "territory::0000",
  	"periodAt": "2020-02-01T00:00:00.000Z",
  	"sales": 541.45
  },
 {
  	"entityId": "territory::0000",
  	"periodAt": "2020-02-09T00:00:00.000Z",
  	"sales": 200.11
  },
  {
  	"entityId": "territory::0000",
  	"periodAt": "2020-03-05T00:00:00.000Z",
  	"sales": 41.45
  },
  {
  	"entityId": "territory::0000",
  	"periodAt": "2020-04-05T00:00:00.000Z",
  	"sales": 11.31
  },
 {
  	"entityId": "territory::0000",
  	"periodAt": "2020-05-05T00:00:00.000Z",
  	"sales": 51.31
  },
   {
  	"entityId": "territory::0000",
  	"periodAt": "2020-05-25T00:00:00.000Z",
  	"sales": 31.51
  }
  
]

My goal for this query is to group this by the given/define dates. Even though, there are no documents matched based on the given/defined dates it will still display it.

The result I will trying to accomplished is these:

[
  {
  	"month": "Feb",
  	"year": "2020",
  	"totalSales": 741.56
  },
  {
  	"month": "Mar",
  	"year": "2020",
  	"totalSales": 41.45
  },
  {
  	"month": "Apr",
  	"year": "2020",
  	"totalSales": 11.31
  },
  {
  	"month": "May",
  	"year": "2020",
  	"totalSales": 82.82
  },
  {
  	"month": "Jun",
  	"year": "2020",
  	"totalSales": 0
  },
  {
  	"month": "Jul",
  	"year": "2020",
  	"totalSales": 0
  },
  {
  	"month": "Aug",
  	"year": "2020",
  	"totalSales": 0
  }
]

or perhaps it would be better, the grouping of dates are base on periodAt date range:

sa.periodAt BETWEEN '2020-08-01T00:00:00.000Z' AND '2020-08-31T23:59:59.999Z'

Here is my initial query but haven’t luck on the result.

SELECT
     ARRAY_AGG(sa.periodAt) AS periodAt, 
     SUM(sa.sales) AS totalSales
 FROM `company_analytics`.`sales-aggregates` AS sa
 WHERE sa.entityId IN [
"territory::0000",
"territory::0001"
 ]
 AND sa.periodAt BETWEEN '2020-08-01T00:00:00.000Z' AND '2020-08-31T23:59:59.999Z'
 GROUP BY DATE_PART_STR(periodAt,'month')

Thanks really need with this guys.

This doesn’t solve your entire issue, but since dates are strings you can use ‘substr’ to group by any fraction of the timestamp. E.g., for days you would do:

group by substr(periodAt,0,10)

and for months you would do:

group by substr(periodAt,0,7)

That still doesn’t provide you with the months where no documents match, I don’t know if that is possible. Hopefully some one else will chime in with ideas on that.

1 Like

I am not expertise with Analytic. But with N1QL the following query produces the desired results. Change DATE_RANGE_STR() group by based on your needs.

SELECT
     DATE_PART_STR(d.periodAt,'year') AS year,
     DATE_PART_STR(d.periodAt,'month') AS month,
     SUM(d.sales) AS totalSales
FROM (SELECT m AS periodAt FROM DATE_RANGE_STR('2020-01-01T00:00:00.000Z', '2020-12-31T00:00:00.000Z', 'month') AS m
      UNION ALL
      SELECT sa.periodAt, sa.sales
      FROM `company_analytics`.`sales-aggregates` AS sa
      WHERE sa.entityId IN [
     "territory::0000",
     "territory::0001"
      ]
      AND sa.periodAt BETWEEN 2020-01-01T00:00:00.000Z' AND '2020-08-31T23:59:59.999Z') AS d
GROUP BY DATE_PART_STR(d.periodAt,'year'), DATE_PART_STR(d.periodAt,'month');
1 Like

Hi @lyndondonz,

This query should work too. Similar idea. Use DATE_RANGE_STR() to get the required range, then OUTER JOIN with aggregated sales data.

WITH 
  DATE_START as '2020-01-01T00:00:00Z',
  DATE_END as '2020-08-01T00:00:00Z',
  DATES AS (
    SELECT date_part_str(dt, 'year') year, date_part_str(dt, 'month') month
    FROM date_range_str(DATE_START, DATE_END, 'month', 1) dt
  ),
  SALES AS (
    SELECT year, month, SUM(sa.sales) AS totalSales
    FROM `company_analytics`.`sales-aggregates` AS sa
    WHERE sa.entityId IN ["territory::0000", "territory::0001"]
    AND sa.periodAt BETWEEN DATE_START AND DATE_END
    GROUP BY DATE_PART_STR(sa.periodAt, 'year') year, DATE_PART_STR(sa.periodAt,'month') month
  )
SELECT d.year, d.month, if_missing(s.totalSales, 0) totalSales
FROM DATES d LEFT OUTER JOIN SALES s ON d.year = s.year AND d.month = s.month
ORDER BY d.year, d.month
1 Like

Hi @dmitry.lychagin and @vsr1,

Thanks guys, your solution works. :slight_smile: