Not Sure How I should Index

I have the below query, it take about 1 min to execute. I realize this is doing a lot of math here, when this was ran in MySQL with only a primary key index, it took less than 3 secs. What would be the best index to achieve a query result with less than a few seconds or quicker?

A single document looks like this:
{
“address”: {
“city”: “Des Moines”,
“country”: “US”,
“fullAddress”: “2428 Hubbell Ave, Des Moines, IA 50317”,
“phoneNumber”: “515-263-8576”,
“postalCode”: “50317”,
“state”: “IA”,
“streetAddress”: “2428 Hubbell Ave”
},
“geo”: {
“latitude”: 41.606425997287,
“longitude”: -93.57212305069
},
“id”: 45,
“image”: “”,
“name”: “QuikTrip”
}

SELECT stationID, stationName, address.stationFullAddress, (acos(sin(radians(39.9958)) * sin(radians(geo.latitude)) + cos(radians(39.9958)) * cos(radians(geo.latitude)) * cos(radians(-82.7974) - radians(geo.longitude))) * 6378) as distance FROM Stations where (acos(sin(radians(39.9958)) * sin(radians(geo.latitude)) + cos(radians(39.9958)) * cos(radians(geo.latitude)) * cos(radians(-82.7974) - radians(geo.longitude))) * 6378) < 5 ORDER BY distance ASC;

The couch base bucket can have different type of documents without proper type predicate it needs to fetch all documents and that is why it taking time.

You can check this article how to create GSI index. Also checkout FTS service geo spatial. cc @keshav_m

There are lot of constants if there same all the time you can try this. Check EXPLAIN see Spans has value 5 is pushed to index.

CREATE INDEX ix1 ON Stations((acos(sin(radians(39.9958)) * sin(radians(geo.latitude)) + cos(radians(39.9958)) * cos(radians(geo.latitude)) * cos(radians(-82.7974) - radians(geo.longitude))) * 6378), stationID, stationName, address.stationFullAddress);

Thank you for pointing me in the correct direction. Between the article and a few other sources I was able to comes up with a very fast solution. The solution was actually a lot faster than what I was working with in MySQL.

@brandon.wilson11, Good to know you found better solution. Is it possible post the solution so that others can benefit.

No problem. Since I am using NodeJS for our API I used geopoint to calculate my radians for my SW coordinates and NE coordinates. I provided my starting coordinates and distance. Once calculated I would able to build the query, as seen below.

Basically, all that geopoint is doing is creating a box around my location and subtracting points now within my radius. Basically, a box with a circle in it and all the points in the circle is all I am interested in.

If you need more detail about the maths involved, I highly recommend going to Finding Points Within a Distance of a Latitude/Longitude Using Bounding Coordinates

After I was able to get this working, I created an index, which indexed all the radians points. My average query time in a 200k bucket is 20 ms.
create index idx_stations_geo_latlon on Stations(RADIANS(geo.latitude), RADIANS(geo.longitude));

However, this is the end result.

SELECT stationName, address, geo, stationID, 
(acos(sin(RADIANS(<<GIVEN LAT>>)) * sin(RADIANS(geo.latitude)) + cos(RADIANS(<<GIVEN LAT>>)) * cos(RADIANS(geo.latitude)) * cos(RADIANS(geo.longitude) - RADIANS(<<GIVEN LON>>))) * 3959) as distance 
FROM Stations 
WHERE (RADIANS(geo.latitude) >= <<CALCULATED SW RAD LAT>> 
    AND RADIANS(geo.latitude) <= <<CALCULATED NE RAD LAT>> ) 
AND (RADIANS(geo.longitude) >= <<CALCULATED SW RAD LAT>>  
    AND RADIANS(geo.longitude) <= <<CALCULATED NE RAD LON>> ) 
AND (acos(sin(RADIANS(<<GIVEN LAT>>)) * sin(RADIANS(geo.latitude)) + 
cos(RADIANS(<<GIVEN LAT>>)) * cos(RADIANS(geo.latitude)) * 
cos(RADIANS(geo.longitude) - RADIANS(<<GIVEN LON>>))) * 3959) <= 15 
ORDER BY distance ASC;
1 Like