USE KEYS references wrong data in correlated subquery

Hello everyone,

I was looking for a way to greatly optimize a query linking two types of documents : persons and time sheets. Work site managers input time sheets for their employees on work sites, then, a table is built for wage managers using these time sheets.
Our client wants to add a warning to the table when there haven’t been any timesheets in the 90 days before the displayed month. To do so, I directly used the persons’ id and an EXISTS correlated subquery because I wanted to avoid GROUP BY or JOINs but I may not have understood correctly how to use USE KEYS.
Below I wrote the query I was testing but USE KEYS references the person’s document instead of allowing me to test if they have at least one time sheet during the given period.

SELECT
  p AS `id_personne`,
  EXISTS(
    SELECT
      t.*
    FROM
      `bucket` t
    USE KEYS
      p
    WHERE
      t.`type` = 'pointage'
      AND
      t.`id_personne` = p
      AND
      t.`id_chantier` = 'work site id'
      AND
      t.`date_imputation` BETWEEN 1625781600000 AND 1633557599999
      AND
      t.`est_valide` IN [1, 2]
  ) AS `pointage`
FROM
  [
    'person id 1',
    'person id 2',
    'person id 3'
  ] AS p

Did I misunderstand how to use USE KEYS? Is what I want to do even possible? I even would like to do this query with a different list of work sites per person (person 1 on work site 1 and 2, person 2 on work site 2 and 3, etc) but I wanted to try simpler first.

Thanks and have a good day!

Pre 7.0 correlated subqueries requires USE KEYS. When USE KEYS provided subquery only consider the document keys specified in USE KEYS ONLY vs all the documents on the bucket.

Use 7.0 or you need to use JOIN or GROUP.

Thank you for such a fast reply! I’ll see with my manager if I should use JOIN/GROUP BY or if an upgrade is a possibility.

If you don’t want use join, execute separate query each person from the client.

CREATE INDEX ix1 ON bucket( id_personne, id_chantier,  est_valide, date_imputation ) WHERE type = "pointage";
SELECT RAW true
FROM `bucket` t
WHERE  t.`type` = 'pointage' AND  t.`id_personne` = $p AND  t.`id_chantier` = 'work site id'
      AND t.`date_imputation` BETWEEN 1625781600000 AND 1633557599999
      AND t.`est_valide` IN [1, 2]
LIMIT 1;

Even if use EXISTS query try use LIMIT 1 in subquery to avoid materialize all the data. Also avoid * in exists query as it not required.

GROUP BY query

SELECT t.`id_personne`, true AS pointage
FROM `bucket` t
WHERE t.`type` = 'pointage' AND t.`id_personne` IN $personids
      AND t.`id_chantier` = 'work site id'
      AND t.`date_imputation` BETWEEN 1625781600000 AND 1633557599999
      AND t.`est_valide` IN [1, 2]
GROUP BY t.`id_personne`;

Thank you, I’ll see what I do when it’s time, I was only looking for ways to do this in a fast way but I don’t have to implement it yet!

Hello,

We were able to upgrade to 7.x and the query seems to be working, thank you!