Use ROW_NUMBER() slow

I have a query which is returning the result in 4 miliseconds

SELECT `provider_uuid`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7

now, if I want to add the row number and change the query to the below query I am getting the results in 3.4 seconds

SELECT `provider_uuid` , ROW_NUMBER() OVER (
  ORDER BY d.`all_games_gpfs`
) AS `row`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7

Is there a way in which I can improve this?

What I am trying to acheive is to get the “row” number of a specific provider_uuid based on the where clause. So, ideal would be to be able to provide the provider_uuid somehow and get the row number based on the where clause and sort by clause.

OLAP functions first materialize all possible values and then apply pagination can’t use index order and terminate early because calculation depends on results before pagination.

If final sort still needs all_games_gpfs NOT on row you can try this, Also if there is duplicates in all_games_gpfs will vary results in each run ( Avoid that ORDER BY all_games_gpfs, provider_uuid).

CREATE INDEX ix1 ON  `app-live` (country_code, chart_date, all_games_gpfs, provider_uuid   ) where sub_type="provider-extra" ;

SELECT provider_uuid, ROW_NUMBER() OVER ( ORDER BY `all_games_gpfs` ) AS `row`
FROM (SELECT `provider_uuid` , all_games_gpfs
                  FROM `app-live` AS d use index (ix1)
                  where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" 
                  order by all_games_gpfs
                  LIMIT 7) AS t;

OR

SELECT u AS provider_uuid, UNNEST_POS(u)+1 AS `row`
FROM 1 AS d1
UNNEST (SELECT RAW `provider_uuid`
        FROM `app-live` AS d use index (ix1)
        WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
        ORDER BY all_games_gpfs
        LIMIT 7) AS u;

Thanks @vsr1 you are super nice like always.

if I use the second query example without the LIMIT I get a 6.8 seconds query response time ( I have 70k entried for the where clause). the first example you sent is slower

Do you think we could change this to something that will return just the row number of a specifit provider_uuid something like provider_uuid = “000d1552-b6f8-46b8-aaca-f1c501267dd7” and to have the results in less than one second?

@flaviu ,

Do u want specific one uuid and its row id only not all of them?

WITH uuids AS (SELECT DISTINCT RAW `provider_uuid`
                      FROM `app-live` AS d use index (ix1)
                     WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
                    ORDER BY all_games_gpfs)
SELECT FIRST {"row":pos+1, "provider_uuid":v } FOR pos:v IN uuids WHEN v = "000d1552-b6f8-46b8-aaca-f1c501267dd7" END.*;

This may not reduce time much (as you get 1 row vs many)

I just need the row id of one specific provider_uuid not all of them (but this row id can differ depending on the “WHERE” clause and “ORDER BY” form the sub_query)

updated previous post.

getting a “syntax error - at AS”,

Missed WITH

WITH uuids AS (SELECT DISTINCT RAW `provider_uuid`
                      FROM `app-live` AS d use index (ix1)
                     WHERE sub_type="provider-extra" AND chart_date= 1638316800 AND country_code = "us"
                    ORDER BY all_games_gpfs)
SELECT FIRST {"row":pos+1, "provider_uuid":v } FOR pos:v IN uuids WHEN v = "000d1552-b6f8-46b8-aaca-f1c501267dd7" END.*;

Also check EXPLAIN of subquery and see it avoiding explicit order by

1 Like

yes, working now… still a little bit slow (3 seconds) do you think we could do anything to improve on this?

FWIW, and ideally with a stable result-set (i.e. unique order), I’d try something like:

SELECT t.*, ROW_NUMBER() OVER ()
FROM
(SELECT `provider_uuid`
FROM `app-live` AS d use index (provider_all_games_asc)
where sub_type='provider-extra' and chart_date= 1638316800 and country_code = "us" order by all_games_gpfs
LIMIT 7) t;

Obviously if the index can cover & provide order it should then perform best.

That said, @vsr1’s solution is better. HTH.

3sec might be too long , is it CE. Do u have explain of the subquery.

give me a second to add the provider_uuid to the index and try again

it was fetching the provider_uuid from the documets. Now with that in the index I am gettting a 300 ms resulst.

@vsr1 You are fantastic, like always. Thanks and I appreciate your help