Select by modified time field in a large bucket is very slow

I have an indexed field lastTimeSeenInbound which is a milliseconds field. The field is indexed -

CREATE INDEX `index_last_timeseeninbound` ON `mybucket`(`lastTimeSeenInbound`)

The bucket contains 60 million docs, and a query like this takes minutes:

SELECT count(*) FROM `mybucket`
WHERE 1628236287000 >= lastTimeSeenInbound

Why is it so slow? What can we do to fix it?

You are looking for lastTimeSeenInbound <= "2021-08-06T00:51:27-07:00".
It depends on how many index entries it needs to scan and compute over.

If you use EE, it can use index aggregation and be faster. See Index Grouping and Aggregation Overview - The Couchbase Blog.

Thanks! Correct, we would like to get the count of everything older than roughly half a year ago.
We are using Community Edition 6.5 - I assume it's scanning around 70% of the 60 million docs.
I thought there would be some way to improve it, since the field is indexed and it's a pretty simple count aggregation. Is there any way to make such an index operation faster?

I would also like to count how many documents I'm going to delete - how do you suggest deleting such a huge number of documents?

Say you want to delete the documents with lastTimeSeenInbound < 1628236287000:

CREATE INDEX ix1 ON mybucket(lastTimeSeenInbound) WHERE lastTimeSeenInbound < 1628236287000;

Check the UI to see the index item count, or use the REST API to see index stats.
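A small sketch of reading the index item count programmatically. It assumes the indexer's stats REST endpoint (port 9102, /api/v1/stats) and per-index stat keys ending in :items_count; both are assumptions you should verify against your server version:

```python
def items_counts(stats):
    """Pick out per-index item counts from a parsed indexer stats dict.

    Assumes stat keys of the form "<bucket>:<index>:items_count"
    (an assumption; check the stat names on your version).
    """
    return {k: v for k, v in stats.items() if k.endswith(":items_count")}

# Hypothetical fetch (verify endpoint and credentials for your version):
# import json, base64, urllib.request
# req = urllib.request.Request("http://indexer-host:9102/api/v1/stats")
# req.add_header("Authorization",
#                "Basic " + base64.b64encode(b"user:pass").decode())
# stats = json.load(urllib.request.urlopen(req))
# print(items_counts(stats))
```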

Now you want to delete.
Write a small SDK program that runs the following query.
As the results stream asynchronously, set the expiration of each document to 10s and let document expiry delete the documents.

SELECT META(t).id
FROM mybucket AS t
WHERE lastTimeSeenInbound < 1628236287000;
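A minimal Python sketch of that approach, assuming the Couchbase Python SDK 3.x shape (cluster.query streaming rows, collection.touch setting an expiry). The touch loop is factored out so it works with anything exposing a touch(key, expiry) method:

```python
from datetime import timedelta

# The N1QL query from above; streams only the document keys.
QUERY = """
SELECT RAW META(t).id
FROM mybucket AS t
WHERE lastTimeSeenInbound < 1628236287000
"""

def expire_docs(collection, doc_ids, ttl_seconds=10):
    """Set a short expiry on each document; expiry then deletes it.

    `collection` is anything with a touch(key, expiry) method,
    e.g. a Collection from the Couchbase Python SDK 3.x.
    """
    touched = 0
    for doc_id in doc_ids:
        collection.touch(doc_id, timedelta(seconds=ttl_seconds))
        touched += 1
    return touched

# Hypothetical connection code (SDK 3.x, not run here):
# from couchbase.cluster import Cluster
# cluster = Cluster("couchbase://localhost", ...)
# collection = cluster.bucket("mybucket").default_collection()
# rows = cluster.query(QUERY)   # results stream asynchronously
# expire_docs(collection, rows)
```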

Thanks, I'm not sure I fully understood the expiry part. Since we are using Community Edition, there is no way to set a TTL on documents, is there?
How does selecting with the SDK set the expiry on docs? Can you refer me to some doc?

You can. Try exploring the touch operation in the SDKs.

If you want to use N1QL:

DELETE FROM mybucket AS t 
WHERE  lastTimeSeenInbound  BETWEEN  1528236287000 AND 1628236287000;
  • If needed, use small ranges (and, if needed, run the different ranges in separate requests in parallel)
  • If required, give a LIMIT and repeat until the mutation count is 0
  • Try request_plus to let the indexer catch up
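Splitting the timestamp range into equal sub-ranges can be done with a small helper. A sketch (the boundaries and range count here are just the values from this thread):

```python
def split_range(lo, hi, parts):
    """Split [lo, hi) into `parts` contiguous (start, end) sub-ranges."""
    step = (hi - lo) // parts
    ranges = []
    for i in range(parts):
        start = lo + i * step
        # Last sub-range absorbs any remainder so the whole interval is covered.
        end = hi if i == parts - 1 else start + step
        ranges.append((start, end))
    return ranges

# Ten sub-ranges of the millisecond interval used in the DELETE above;
# each can drive its own
# DELETE ... WHERE lastTimeSeenInbound >= start AND lastTimeSeenInbound < end
# LIMIT 50000 statement.
ranges = split_range(1528236287000, 1628236287000, 10)
```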

NOTE: mutations take time and might be slow.


  • Divide into 10 different ranges (depending on cluster resources)
  • Make 10 DELETE statements for the different ranges, each with LIMIT 50000
  • Run all the queries in parallel
  • Wait a couple of minutes to let the indexer catch up (or use request_plus)
  • Loop again until each query returns a mutation count of 0, or all desired documents are deleted
  • You can ignore a timeout if one is raised
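The loop-until-done part of those steps can be sketched as a small driver. Here `run_delete` stands in for whatever executes one DELETE ... LIMIT statement and returns its mutation count (getting that count from the SDK's query result metadata is an assumption to verify):

```python
def delete_until_done(run_delete, max_rounds=1000):
    """Repeat a limited DELETE until one round mutates nothing.

    `run_delete` executes one DELETE ... LIMIT statement and returns
    its mutation count; returns the total documents deleted.
    """
    total = 0
    for _ in range(max_rounds):
        try:
            mutated = run_delete()
        except TimeoutError:
            # Per the advice above: a timeout can be ignored; the next
            # round (after the indexer catches up, or with request_plus)
            # picks up whatever is left.
            continue
        if mutated == 0:
            break
        total += mutated
    return total

# Each of the 10 range-specific DELETE statements can get its own
# delete_until_done call, run in parallel with e.g.
# concurrent.futures.ThreadPoolExecutor.
```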

I appreciate your fast and detailed response.