I have created a CTE based n1ql query that works fine in lower environments where the data is very low when compared to PROD.
We have the primary index deleted and hence the query will take some or the other index ( as we have not created any specific index for this particular query)
When I ran this query in prod environment, one of my query node went down and failed over.
My doubt is, even if the query is not picking up right index and the data is huge, isn’t the expected behaviour is to get the query time out ( kept at default time out of 600 ) and not bring the entire query node (only a single service running 34gb ram) down and failed over?
We can only tune a query for index if we have the query service running and able to re-run the query more than once to tune it further.
CTE must materialized all the data and need to be present till the end of query. Depends on the query may need other memory . Run your CTE separately and see how many documents it generate. Avoid unnecessary fields in CTE. If it is huge you may want consider JOIN vs CTE.
All the results are in memory. In 7.0 there is memory_quota per request The N1QL Query Language Now Has a Per-Request Memory Quota
@bora.mohan Index service has a memory quota and will be very good at staying within this limit, but Query service does not have similar enforcement to limit memory usage. Index can also spill over to disk. Thus it may be to your benefit to create a secondary index for this query.
WITH nonConcludedRefId AS (SELECT DISTINCT RAW cte.refId
FROM bucket1 AS cte
where cte.docType='xxx' and cte.refNo is not missing
and cte.stage <> 'Concluded')
SELECT meta().id as docId, *
FROM bucket2 use keys (select DISTINCT raw meta(a).id from bucket1 i
inner join bucket2 a use hash(build) on i.ref = a.ref
let x = (select RAW c from nonConcludedRefId as c )
WHERE i.docType='xxx'
and a.docType='yyy'
and (i.stage = 'Concluded')
and a.ref NOT IN x
and i.refNo is not missing
and STR_TO_MILLIS(a.modifiedTimestamp) < STR_TO_MILLIS(DATE_ADD_STR(CLOCK_LOCAL(), -10, 'day'))
and a.modifiedTimestamp between '2021-07-01' and '2021-07-02');
WITH dockeys AS (WITH nonConcludedRefId AS ( SELECT DISTINCT RAW cte.refId
FROM bucket1 AS cte
WHERE cte.docType="xxx" AND cte.refNo IS NOT MISSING
AND cte.stage <> "Concluded")
SELECT DISTINCT RAW meta(a).id
FROM bucket1 AS i
INNER JOIN bucket2 AS a USE HASH(BUILD) ON i.ref = a.ref
WHERE i.docType="xxx"
AND a.docType="yyy"
AND (i.stage = "Concluded")
AND i.refNo IS NOT MISSING
AND a.ref NOT IN nonConcludedRefId
AND a.modifiedTimestamp < DATE_ADD_STR(CLOCK_LOCAL(), -10, "day")
AND a.modifiedTimestamp BETWEEN "2021-07-01" AND "2021-07-02")
SELECT META().id AS docId, *
FROM bucket2 USE KEYS dockeys;
CREATE INDEX ix1 ON bucket1(stage,refNo, ref, refId ) WHERE docType="xxx";
CREATE INDEX ix2 ON bucket2(modifiedTimestamp, ref) WHERE docType="yyy";
Also try PROBE instead of BUILD
BUILD side should be smallest. It depends on how many documents it produce after the constant predicates of bucket is applied
WITH nonConcludedRefIds AS ( SELECT DISTINCT RAW cte.refId
FROM bucket1 AS cte
WHERE cte.docType="xxx" AND cte.refNo IS NOT MISSING
AND cte.stage <> "Concluded"),
ConcludedRefIds AS ( SELECT DISTINCT RAW cte1.ref
FROM bucket1 AS cte1
WHERE cte1.docType="xxx" AND cte1.refNo IS NOT MISSING
AND cte1.stage = "Concluded")
SELECT meta(a).id, a.*
FROM bucket2 AS a
WHERE a.docType="yyy"
AND a.ref NOT IN nonConcludedRefIds
AND a.ref IN ConcludedRefIds
AND a.modifiedTimestamp < DATE_ADD_STR(CLOCK_LOCAL(), -10, "day")
AND a.modifiedTimestamp BETWEEN "2021-07-01" AND "2021-07-02";
CREATE INDEX ix1 ON bucket1(stage, refNo, ref, refId ) WHERE docType="xxx";
CREATE INDEX ix2 ON bucket2(modifiedTimestamp) WHERE docType="yyy";
Thank you for the queries, but I am still not sure why the node has to go down instead of query timeout.
I had plans of optimising my query provided it times out or is not performant.
How do i ensure that a node will not go down in future as there will be ad-hoc queries that will be executed in the system to serve customers.
If primary is deleted and query is running on a non distant index (almost all predicate fields are indexed), it should give me some space to analyse the query and not bring the node down.