I have a query which is returning the result in 4 miliseconds
SELECT `provider_uuid`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7
now, if I want to add the row number and change the query to the below query I am getting the results in 3.4 seconds
SELECT `provider_uuid` , ROW_NUMBER() OVER (
ORDER BY d.`all_games_gpfs`
) AS `row`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7
Is there a way in which I can improve this?
What I am trying to acheive is to get the “row” number of a specific provider_uuid based on the where clause. So, ideal would be to be able to provide the provider_uuid somehow and get the row number based on the where clause and sort by clause.
OLAP functions first materialize all possible values and then apply pagination can’t use index order and terminate early because calculation depends on results before pagination.
If final sort still needs all_games_gpfs NOT on row you can try this, Also if there is duplicates in all_games_gpfs will vary results in each run ( Avoid that ORDER BY all_games_gpfs, provider_uuid).
CREATE INDEX ix1 ON `app-live` (country_code, chart_date, all_games_gpfs, provider_uuid ) where sub_type="provider-extra" ;
SELECT provider_uuid, ROW_NUMBER() OVER ( ORDER BY `all_games_gpfs` ) AS `row`
FROM (SELECT `provider_uuid` , all_games_gpfs
FROM `app-live` AS d use index (ix1)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us"
order by all_games_gpfs
LIMIT 7) AS t;
OR
SELECT u AS provider_uuid, UNNEST_POS(u)+1 AS `row`
FROM 1 AS d1
UNNEST (SELECT RAW `provider_uuid`
FROM `app-live` AS d use index (ix1)
WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
ORDER BY all_games_gpfs
LIMIT 7) AS u;
if I use the second query example without the LIMIT I get a 6.8 seconds query response time ( I have 70k entried for the where clause). the first example you sent is slower
Do you think we could change this to something that will return just the row number of a specifit provider_uuid something like provider_uuid = “000d1552-b6f8-46b8-aaca-f1c501267dd7” and to have the results in less than one second?
Do u want specific one uuid and its row id only not all of them?
WITH uuids AS (SELECT DISTINCT RAW `provider_uuid`
FROM `app-live` AS d use index (ix1)
WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
ORDER BY all_games_gpfs)
SELECT FIRST {"row":pos+1, "provider_uuid":v } FOR pos:v IN uuids WHEN v = "000d1552-b6f8-46b8-aaca-f1c501267dd7" END.*;
This may not reduce time much (as you get 1 row vs many)
I just need the row id of one specific provider_uuid not all of them (but this row id can differ depending on the “WHERE” clause and “ORDER BY” form the sub_query)
WITH uuids AS (SELECT DISTINCT RAW `provider_uuid`
FROM `app-live` AS d use index (ix1)
WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
ORDER BY all_games_gpfs)
SELECT FIRST {"row":pos+1, "provider_uuid":v } FOR pos:v IN uuids WHEN v = "000d1552-b6f8-46b8-aaca-f1c501267dd7" END.*;
Also check EXPLAIN of subquery and see it avoiding explicit order by
FWIW, and ideally with a stable result-set (i.e. unique order), I’d try something like:
SELECT t.*, ROW_NUMBER() OVER ()
FROM
(SELECT `provider_uuid`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7) t;
Obviously if the index can cover & provide order it should then perform best.