Order by makes query very slow

mustanish · February 7, 2018, 5:24am

Hi,

I have a bucket named user with 160k records in it. When I run query to select recebt 10 records from the it, it takes more than 8 sec to select 10 record.

vsr1 · February 7, 2018, 10:22am

ORDER BY needs to produce all the qualifying records. Check the article and see if you can create index based on query order https://dzone.com/articles/designing-index-for-query-in-couchbase-n1ql

If further help needed please post exact query and sample document

mustanish · February 15, 2018, 12:52pm

@vsr1

Hi,

sorry for the delay didn’t get time to come here, however query which is creating problem is this

SELECT , (6371acos(cos(radians(28.549056336607507))*cos(radians(place_lat))*cos(radians(place_long)-radians(77.2689280000161))+
sin(radians(28.549056336607507))*sin(radians(place_lat)))) AS distance,(SELECT t2.user_photo,t2.name FROM users t2 use keys pins.user_id
WHERE t2.user_name= pins.user_name) AS details FROM pins USE INDEX (pin_geo_latlon_new USING GSI)
WHERE user_name NOT IN [‘mustanish123’] AND place_lat>=28.546153612815345 AND place_lat<=28.551958980392783 AND
place_long>=77.26699680953641 AND place_long<=77.27085919047727 AND status=1 AND cat_id!=1 ORDER BY distance ASC LIMIT 10 OFFSET 0;

I also want to sort this record based on pin_created_date for that I wrote this indexing

CREATE INDEX my-journey-new ON pins(user_name,place_lat,place_long,pin_create_date DESC) but this is not working

Sample data after running query

[
{
“details”: [
{
“name”: “Abhinav”,
“user_photo”: “Abhi.mmi.jpg”
}
],
“distance”: 0.026640982767172527,
“pins”: {
“action”: {
“commentCount”: 0,
“flagCount”: 0,
“likeCount”: 0
},
“alt”: 0,
“cat_id”: 0,
“cat_name”: null,
“child_cat_id”: 0,
“child_cat_name”: null,
“course”: 0,
“expiry_date”: 0,
“flag”: 0,
“images”: ,
“modified_by”: null,
“name”: “Abhinav”,
“photo”: “Abhi.mmi.jpg”,
“pin_create_date”: “2016-10-14 11:02:06”,
“pin_id”: “pin3100259512”,
“pin_privacy”: 0,
“pin_rate”: 1,
“pin_story”: “Grt”,
“pin_story_title”: null,
“pin_type”: “review”,
“place_id”: “2FHGLJ”,
“place_lat”: 28.549148,
“place_long”: 77.268676,
“place_name”: “Navketan Engineering Corporation”,
“point”: 0,
“progress”: 0,
“quality”: 0,
“social_id”: “0”,
“speed”: 0,
“status”: 1,
“user_id”: “4d01892401d0cbe60835148cc4a09737”,
“user_name”: “abhi.mmi”,
“utc”: 0
}
},
{
“details”: [
{
“name”: “lab”,
“user_photo”: “lab1235a813709b5436.jpg”
}
],
“distance”: 0.04750143392658823,
“pins”: {
“action”: {
“commentCount”: null,
“flagCount”: null,
“likeCount”: null
},
“alt”: 0,
“cat_id”: 3,
“cat_name”: “Smart City Issue”,
“child_cat_id”: 19,
“child_cat_name”: “Electricity Failure”,
“course”: 0,
“expiry_date”: 1834224456,
“flag”: 0,
“identifier”: “faa6daf3b45b46a2bf02ad1798ff428c”,
“images”: [
{
“created_on”: “2018-02-15 16:17:37”,
“photo”: “9b5a8bc969f476cd.jpg”,
“status”: 0,
“user_name”: “lab123”
}
],
“modified_by”: “”,
“name”: “lab”,
“old_pin_id”: 35166,
“photo”: “lab1235a813709b5436.jpg”,
“pin_create_date”: “2018-02-15 16:17:36”,
“pin_id”: “pin73012bf47bb9399a235c”,
“pin_privacy”: 0,
“pin_rate”: 0,
“pin_story”: “Bhruutbtbtyi”,
“pin_story_title”: “”,
“pin_type”: “report”,
“place_id”: “0”,
“place_lat”: 28.5491023,
“place_long”: 77.2684445,
“place_name”: “Reported Near 226, Okhla Industrial Estate Phase 3, New Delhi, Delhi. 5 M From Airtel Parking Area Pin 110020”,
“point”: 0,
“progress”: 0,
“quality”: 0,
“social_id”: 104,
“speed”: 0,
“status”: 1,
“user_id”: “205000fd12d7e7ee820c351e18899406”,
“user_name”: “lab123”,
“utc”: 0
}
}
]

vsr1 · February 15, 2018, 1:33pm

ORDER BY needs to produce all the qualified rows before applying LIMIT or OFFSET because lost row may come first in the ORDER.

mustanish · February 15, 2018, 5:23pm

So their is no way to optimize order by queries in n1ql but creating index similar to this CREATE INDEX my-journey-new ON pins(user_name,place_lat,place_long,pin_create_date DESC) reduced execution time to 30 ms for another query but how do I optimize this query.I mean their must be a way to optimize this.

vsr1 · February 15, 2018, 5:55pm

ORDER BY needs to produce all the qualified rows before applying LIMIT or OFFSET because lost row may come first in the ORDER. If query ORDER BY matches with index key order left to right and index or query doesn’t change order of items produced by indexer we will use index order avoid sort (query can’t have NOT, OR, IN predicates, no JOINS, no GROUP BY).

Your query doesn’t qualify it needs produce all qualified documents.

mustanish · February 15, 2018, 7:04pm

Thanks for explaining that one more thing where can I learn more about scan in couchbase I mean this
simple select * from pins where user_name NOT IN[‘mustanish123’] LIMIT 10 on explain does union scan but When IN operator is used exlain says Indexscan2.I hope I made my question clear

vsr1 · February 15, 2018, 7:14pm

https://developer.couchbase.com/documentation/server/current/performance/index-scans.html

You are using NOT IN operator.

user_name NOT IN[‘mustanish123’] is same as
user_name < ‘mustanish123’ OR user_name > ‘mustanish123’

Couchbase IndexScans are range scans those are converted to equivalent range to minimize scans.

mustanish · February 16, 2018, 11:02am

Hi,

SELECT p.pin_id,p.place_id,p.place_name,p.pin_story,p.pin_rate,p.cat_name,p.child_cat_name,p.pin_create_date,p.place_lat,p.place_long,p.pin_type,p.user_name,p.images, (acos(sin(RADIANS(28.549056336607507))*sin(RADIANS(p.place_lat)) + cos( RADIANS(28.549056336607507))*cos(RADIANS(p.place_lat))*cos (RADIANS(p.place_long) - RADIANS(77.2689280000161))) * 6371) AS distance,u.user_photo,u.name FROM pins AS p INNER JOIN users AS u ON KEYS p.user_id WHERE p.cat_id = 2 AND p.status=1 ORDER BY distance ASC LIMIT 10 OFFSET 0

Above query sort result based on distance and as suggested by you I am not using NOT IN but this query also taked more than 600ms.

Index used by this query is

CREATE INDEX my_world_idx1 ON pins(cat_id) WHERE (status = 1)

Explain for this query is

{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexScan2”,
“index”: “my_world_idx1”,
“index_id”: “2a1ca5e34e467f97”,
“index_projection”: {
“primary_key”: true
},
“keyspace”: “pins”,
“namespace”: “default”,
“spans”: [
{
“exact”: true,
“range”: [
{
“high”: “2”,
“inclusion”: 3,
“low”: “2”
}
]
}
],
“using”: “gsi”
},
{
"#operator": “Fetch”,
“as”: “p”,
“keyspace”: “pins”,
“namespace”: “default”
},
{
"#operator": “Join”,
“as”: “u”,
“keyspace”: “users”,
“namespace”: “default”,
“on_keys”: “(p.user_id)”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “(((p.cat_id) = 2) and ((p.status) = 1))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “(p.pin_id)”
},
{
“expr”: “(p.place_id)”
},
{
“expr”: “(p.place_name)”
},
{
“expr”: “(p.pin_story)”
},
{
“expr”: “(p.pin_rate)”
},
{
“expr”: “(p.cat_name)”
},
{
“expr”: “(p.child_cat_name)”
},
{
“expr”: “(p.pin_create_date)”
},
{
“expr”: “(p.place_lat)”
},
{
“expr”: “(p.place_long)”
},
{
“expr”: “(p.pin_type)”
},
{
“expr”: “(p.user_name)”
},
{
“expr”: “(p.images)”
},
{
“as”: “distance”,
“expr”: “(acos(((sin(radians(28.549056336607507)) * sin(radians((p.place_lat)))) + ((cos(radians(28.549056336607507)) * cos(radians((p.place_lat)))) * cos((radians((p.place_long)) - radians(77.2689280000161)))))) * 6371)”
},
{
“expr”: “(u.user_photo)”
},
{
“expr”: “(u.name)”
}
]
}
]
}
}
]
},
{
"#operator": “Order”,
“limit”: “10”,
“sort_terms”: [
{
“expr”: “distance”
}
]
},
{
"#operator": “Limit”,
“expr”: “10”
},
{
"#operator": “FinalProject”
}
]
},
“text”: “SELECT p.pin_id,p.place_id,p.place_name,p.pin_story,p.pin_rate,p.cat_name,p.child_cat_name,p.pin_create_date,p.place_lat,p.place_long,\np.pin_type,p.user_name,p.images, (acos(sin(RADIANS(28.549056336607507))sin(RADIANS(p.place_lat)) + cos( RADIANS(28.549056336607507))\ncos(RADIANS(p.place_lat))*cos (RADIANS(p.place_long) - RADIANS(77.2689280000161))) * 6371) AS distance,u.user_photo,u.name FROM \npins AS p INNER JOIN users AS u ON KEYS p.user_id WHERE p.cat_id=2 AND p.status=1 ORDER BY distance \nASC LIMIT 10 OFFSET 0”
}

vsr1 · February 16, 2018, 1:31pm

Your Query doesn’t qualify. Please Check article https://dzone.com/articles/designing-index-for-query-in-couchbase-n1ql Step 6

Your query ORDER BY distance and distance is calculated in the query projection and NOT a leading Indexkey that has chosen. Also you have JOIN.
So, ORDER BY needs to produce all the qualified rows before applying LIMIT or OFFSET because lost row may come first in the ORDER.

mustanish · February 16, 2018, 6:18pm

So what should be done to optimize these kinds of query.Their must be a way to optimize.

Topic		Replies	Views
N1QL Query order by desc is slow SQL++	5	974	January 20, 2021
"Order by " making the N1QL Query slow SQL++ query , n1ql	2	761	October 10, 2018
Order by works very slow SQL++	1	658	February 13, 2019
Index for query with order by SQL++	11	7756	June 25, 2018
Select query with sorting performance advice SQL++ query , n1ql	3	1341	October 25, 2017

Order by makes query very slow

Related topics