Hello,
To find addresses based on a certain distance I added a calculated field to the select statement. The field is named ‘distance’. When I use the ‘distance’ field in the where and order by clause the query execution time increases dramatically.
To investigate the cause I moved the calculation of the distance to the where clause and removed the order by on distance.
The result was almost immediate. So it seems that the calculation of the distance is not the culprit.
Instead adding a calculated field I added “10 AS test” in the select statement and “AND test > 10” to the where clause.
I expected no results which was indeed correct. But changing “AND test > 10” to “AND test = 10” did not produce any result either and it took a very long time to execute. I did not expect this.
It seems, correct me if I’m wrong, that using a calculated field in the where or order by clause is not correctly evaluated and increases the execution time dramatically.
This is my original query
SELECT
comp.companyInfo,
comp.datlinqId,
comp.externalId,
comp.isWholesale,
comp.name,
comp.salesmappId,
comp.segmentActivityId,
comp.segmentId,
comp.type,
comp.website,
META(comp
).id AS documentId,
ARRAY address FOR address IN comp.companyAddresses
WHEN (
address.latitude BETWEEN 50.455962604247
AND 54.050753395753
)
AND (
address.longitude BETWEEN 2.1195652898364
AND 7.9769727101636
)
END AS companyAddresses
,
ACOS(
SIN(RADIANS(52.0)) * SIN(RADIANS(companyAddresses[0].latitude)) +
COS(RADIANS(52.0)) * COS(RADIANS(companyAddresses[0].latitude)) *
COS(RADIANS(companyAddresses[0].longitude) - RADIANS(5.0))
) * 6371 AS distance
FROM
data_os
AS comp
USE INDEX (idx_company_long, idx_company_lat)
WHERE
META(comp
).id NOT LIKE “_sync:%”
AND comp
.type
= “COMPANY”
AND ARRAY_LENGTH(companyAddresses) > 0
AND distance <= 100
ORDER BY distance
LIMIT 100
I changed this by moving the distance calculation to the where clause
SELECT
comp.companyInfo,
comp.datlinqId,
comp.externalId,
comp.isWholesale,
comp.name,
comp.salesmappId,
comp.segmentActivityId,
comp.segmentId,
comp.type,
comp.website,
META(comp
).id AS documentId,
ARRAY address FOR address IN comp.companyAddresses
WHEN (
address.latitude BETWEEN 50.455962604247
AND 54.050753395753
)
AND (
address.longitude BETWEEN 2.1195652898364
AND 7.9769727101636
)
END AS companyAddresses
,
ACOS(
SIN(RADIANS(52.0)) * SIN(RADIANS(companyAddresses[0].latitude)) +
COS(RADIANS(52.0)) * COS(RADIANS(companyAddresses[0].latitude)) *
COS(RADIANS(companyAddresses[0].longitude) - RADIANS(5.0))
) * 6371 AS distance
FROM
data_os
AS comp
USE INDEX (idx_company_long, idx_company_lat)
WHERE
META(comp
).id NOT LIKE “_sync:%”
AND comp
.type
= “COMPANY”
AND ARRAY_LENGTH(companyAddresses) > 0
AND
ACOS(
SIN(RADIANS(52.0)) * SIN(RADIANS(companyAddresses[0].latitude)) +
COS(RADIANS(52.0)) * COS(RADIANS(companyAddresses[0].latitude)) *
COS(RADIANS(companyAddresses[0].longitude) - RADIANS(5.0))
) * 6371 BETWEEN 40 AND 100
LIMIT 100
Then reduced query to investigate the issue with using a calculated or derived field
SELECT
META(comp
).id AS documentId,
10 as test
FROM
data
AS comp
WHERE
META(comp
).id NOT LIKE “_sync:%”
AND comp
.type
= “COMPANY”
AND test = 10
LIMIT 100
This produces no result and increased the execution time
Removing the “AND test = 10” produces a direct result and shows the additional test field
So it seems using a calculated / derived field in the where or order by is not possible. Is this true?