I have a single bucket (Couchbase Community edition 6.5) consisting of the following documents:
employees {
employeeGroupId: string,
type: "Employee"
}
clocks {
employeeId: string,
areaId: string
date: string,
type: "Clock"
}
Each employee has multiple corresponding clock items for each day. I need to get the following:
- first clock -> clockIn
- last clock -> clockOut
I have written the following query which gets the first and last clock items with execution time <100 ms:
SELECT META(employee).id AS employeeId,
employee.employeeGroupId,
MIN(clock.date) AS clockIn,
MAX(clock.date) AS clockOut
FROM `bucket` employee LEFT
JOIN `bucket` clock ON clock.employeeId = META(employee).id
AND type = "Clock"
AND clock.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
WHERE employee.type = "Employee"
GROUP BY employee;
The problem is I need to get the corresponding areaId with the matching clock.
I have written the following query that does. I create two separate sub queries that sort all of the clock items for the day first ascending and then descending and select the first item.
CREATE INDEX adv_employeeId_type_date_blockId ON `bucket`(`employeeId`,`type`,`date`,`blockId`)
CREATE INDEX adv_employeeId_type_date ON `bucket`(`employeeId`,`type`,`date`)
CREATE INDEX adv_type_employeeId_date ON `bucket`(`type`,`employeeId`,`date`)
SELECT META(employee).id AS employeeId,
employee.employeeGroupId,
clockIn,
clockOut
FROM `bucket` employee
LEFT JOIN (
SELECT obj.employeeId,
obj.date,
obj.areaId
FROM `bucket` obj
WHERE obj.employeeId = META(employee).id
AND obj.type = "Clock"
AND obj.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
ORDER BY obj.date
LIMIT 1) clockIn ON clockIn.employeeId = META(employee).id
LEFT JOIN (
SELECT obj.employeeId,
obj.date,
obj.areaId
FROM `bucket` obj
WHERE obj.employeeId = META(employee).id
AND obj.type = "Clock"
AND obj.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
ORDER BY obj.date DESC
LIMIT 1) clockOut ON clockOut.employeeId = META(employee).id
WHERE employee.type = "Employee"
GROUP BY employee,
clockIn,
clockOut;
The problem is that the above query is inefficient with execution time >10 seconds.
In other words I need to get additional object values from the aggregate MIN() and MAX() functions.
I am sure the second query is not the most efficient method to achieve this, does anyone have any other suggestions?