Take and delete n-last modified records based on array containing list of updates

Hello,
I need to create a query to delete the oldest modified records based on this criteria:
The date to consider for each record is the biggest timestamp in updates.timestamp.
So for example I want to delete all record with the biggest timestamp contained in the array older then 2015-01-01 …

the bucket contains million of records with this format:

{
"field1": "test",
"field2": "test2",
"updates": [
                {
                    "status_code": "STARTED",
                    "timestamp": 1659006728751
                },
                {
                    "status_code": "JOINING",
                    "timestamp": 1659006729224
                },
                {
                    "status_code": "SUCCESS",
                    "timestamp": 1659006729688
                }
            ]
}

Thanks.

SELECT  d.*, ARRAY v FOR v IN d.updates WHEN v.timestamp >= MILLIS("2015-01-01") END AS updates
FROM default AS d
WHERE .....;
UPDATE default AS d
SET d.updates = ARRAY v FOR v IN d.updates WHEN v.timestamp >= MILLIS("2015-01-01") END
WHERE ......;

sorry it’s not working…

Not sure what is not working. It works.
The following document array 3 elements (1 before 2015-01-01 and 2 after that)
It removes before 2015-01-01 . If need opposite change the condition

WITH doc AS ( { "field1": "test", "field2": "test2", "updates": [ { "status_code": "STARTED", "timestamp": 1420099100000 }, { "status_code": "JOINING", "timestamp": 1659006729224 }, { "status_code": "SUCCESS", "timestamp": 1659006729688 } ] })
SELECT  d.*, ARRAY v FOR v IN d.updates WHEN v.timestamp >= MILLIS("2015-01-01") END AS updates
FROM doc AS d;

so I have many records like that one…

let’s say

[{
"id": 1,
"field1": "test",
"field2": "test2",
"updates": [
                {
                    "status_code": "STARTED",
                    "timestamp": 1659006729000
                },
                {
                    "status_code": "SUCCESS",
                    "timestamp": 1659006725100
                }
            ]
},
{
"id": 2,
"field1": "test",
"field2": "test2",
"updates": [
                {
                    "status_code": "STARTED",
                    "timestamp": 1659006728123
                }
            ]
},
{
"id": 3,
"field1": "test",
"field2": "test2",
"updates": [
                {
                    "status_code": "STARTED",
                    "timestamp": 1659006728751
                },
                {
                    "status_code": "JOINING",
                    "timestamp": 1659006731234
                },
                {
                    "status_code": "SUCCESS",
                    "timestamp": 1659006429188
                }
            ]
}
]

so I need to get only the records where the max timestamp is older then some date I want to specify…
I hope it was clearer…
I can achieve that with something like this: (modifying your query a bit)

SELECT ARRAY_MAX(ARRAY v.timestamp FOR v IN b.updates WHEN (v.timestamp < 1655356182966) END) AS max
FROM `GatewayHistory` b

but I’m not able to remove the records with null…
the result is something like:

[
  {
    "maxx": null
  },
  {
    "maxx": 1655356182966
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  },
  {
    "maxx": null
  }
]

Thanks.

SELECT max
FROM `GatewayHistory` b
LET max = ARRAY_MAX(ARRAY v.timestamp FOR v IN b.updates WHEN (v.timestamp < 1655356182966)
WHERE max IS VALUED

OR

SELECT max
FROM `GatewayHistory` b
LET max = (SELECT RAW MAX(u.timestamp) FILTER WHERE (u.timestamp < 1655356182966) 
           FROM b.updates AS u)[0]
WHERE max IS VALUED;

OR

CREATE INDEX ix1 ON GatewayHistory(ALL ARRAY u.timestamp FOR u IN updates END);
SELECT MAX(u.timestamp) AS maxx, META(b).id
FROM `GatewayHistory` b
UNNEST b.updates AS u
WHERE u.timestamp < 1655356182966
GROUP BY META(b).id;
1 Like