Trying to do geo spatial searches

I’m getting my feet wet trying to do geo spatial searches directly on the server :slight_smile:

I have looked through some documentation and a couple of blogs/tutorials that show how to do this. However, my data structure is slightly different and when searching I don’t seem to get it right.

First, my data structure is either a polygon (e.g. a lake) or a poly line (e.g. a stream), i.e. an array of points. It’s all about fishing and where you fished, hence the names and this structure :wink:

So here is a snippet showing the data that I want to reach in:

{
  "key": "8642",
  "name": "The lake",
  "points": [
    {
      "lat": 55.5747937187,
      "lon": 11.2856648546
    },
    {
      "lat": 55.5930342762,
      "lon": 11.2790527292
    },
   :
   :
    {
      "lat": 55.5566764979,
      "lon": 11.3007332022
    }
  ],
  "type": "Lake"
}

When searching I will have the location specified by the user (i.e. a lat/lon point). So now I want to find out if this point is inside (or perhaps just close to) a lake or near a stream (poly line).

I have created a FTS index:

And then I have tried to search using PostMan and posting a request with this content (adapted from one of the examples in the blogs):

{
  "from": 0,
  "size": 2,
  "fields": [
    "points"
  ],
  "query": {
    "location": {
      "lon": -55.6,
      "lat": 11.33
     },
    "distance": "1mi",
    "field": "points"
  },
  "sort": [
    {
      "by": "geo_distance",
      "field": "points",
      "unit": "mi",
      "location": {
      "lon": -55.6,
      "lat": 11.33
      }
    }
  ]
}

But this is what I get back:

{
    "status": {
        "total": 6,
        "failed": 0,
        "successful": 6
    },
    "request": {
        "query": {
            "location": [
                55.58,
                11.28
            ],
            "distance": "1mi",
            "field": "points"
        },
        "size": 2,
        "from": 0,
        "highlight": null,
        "fields": [
            "points"
        ],
        "facets": null,
        "explain": false,
        "sort": [
            {
                "by": "geo_distance",
                "field": "points",
                "location": {
                    "lat": 11.28,
                    "lon": 55.58
                },
                "unit": "mi"
            }
        ],
        "includeLocations": false
    },
    "hits": [],
    "total_hits": 0,
    "max_score": 0,
    "took": 389578,
    "facets": null
}

I would believe that the point should be less than a mile off the lake (or a nearby stream) so I’m sure this may have to do with the data structure that differs from the samples I have found.

Any ideas would be much appreciated :+1:

Thanks in advance.

/John

Hi @jda,

Short answer here is that - FTS don’t support polygon indexing or querying capability at the moment for queries like intersect/contains etc.
As your index definition points out - it only lets you index geo points and that won’t help you to capture the varied geo shapes like a polygon/lake or a polyline/stream as in your requirements.

Can we have an approximate, work-around solution possible here?

One approximation approach pops to mind is that,
Change the source document a bit include all the vertex points of a lake or a stream as individual, explicit fields in the source document. (fieldPt1, fieldPt2, fieldPt3, etc as against an array of coordinates today)
(This could be tricky as one never knows in advance how many points are there in the original source document to represent the lake/stream.)
One could also index a centroid /centre/midpoint point for the lake/stream as well to improve accuracy or work around the above-varied number of coordinates wrinkle.
And the user needs to explicitly index all those fields (fieldPt1, fieldPt2, fieldPt3… fieldPtN, centroidPt) explicitly in the index definition.

Later during query time, the user may issue the “point-distance-query” against all those individual new fields (Disjunct query).
If any of the polygon coordinates or centroid point is within the distance, then it should return the document hit for you.

You could also leverage/enable the “_all” option for these new fields and issue the query without the target field
"“field”: “points” to simplify your query.
This trick is explained here - Full-Text Search - 5 Tips To Improve Your Query Performance in the first point.

Oh that is slightly disappointing… I’ve had the challenges in Couchbase Lite as I could not directly do geospatial queries - and understood that I could in N1QL or on the server.

I think I can live with just getting the “nearest” points. Splitting the arrays out in separate fields is not an option as I never know how many there will be - and it could vary from very few to many…

Explore whether you could have a single centroid/midpoint for the polygon shapes in the source document?

Hmmm… Not sure I can. These are lakes and streams - so some of them are pretty close and I need to know where the user pressed on the map. So if you have a very long stream (or large lake) then one central point will introduce wrong results.

I’ll have to investigate the way I did it in CBLite and see if I can reuse that approach.

I’m working on using the “bounding box” concept for this instead as that will allow me to search for all points (similarly to what I do on mobile). So the initial N1QL looks like this:

select d.*,p.* from data d unnest points p 
where d.type in ['Lake','Stream'] and 
p.lat between 55.5 and 55.6 and 
p.lon between 11.0 and 11.5

I then need to calculate the distance from the point I start with to the points found, probably using some of the ideas from this article: How to Speed Up Spatial Search in Couchbase N1QL - DZone Performance

So far the query above is Ok in terms of speed and possibly could be improved by adding an index specifically to help it.

Edit:
This query now shows the distance:

select d.type,d.name,d.`key`,p.*,
    (acos(sin( RADIANS(55.55)) * sin (RADIANS(p.lat)) + 
    cos( RADIANS(55.55 ))  * cos(RADIANS(p.lat)) * 
    cos (RADIANS(p.lon) - RADIANS( 11.25))) * 6371) distance
from data d unnest points p 
where d.type in ['Lake','Stream'] and 
p.lat between 55.5 and 55.6 and 
p.lon between 11.0 and 11.5
order by distance asc
limit 5

Result looks like this:

	[
        {
            "distance": 1.1393731463644405,
            "key": "2272",
            "lat": 55.5496163108,
            "lon": 11.268100815,
            "name": "Bøstrup Å",
            "type": "Stream"
        },
        {
            "distance": 1.2050571143258948,
            "key": "2272",
            "lat": 55.5577846145,
            "lon": 11.2633297918,
            "name": "Bøstrup Å",
            "type": "Stream"
        },
        {
            "distance": 1.5156318704030634,
            "key": "2272",
            "lat": 55.5413010688,
            "lon": 11.2685482159,
            "name": "Bøstrup Å",
            "type": "Stream"
        },
        {
            "distance": 1.789016643436557,
            "key": "2272",
            "lat": 55.5344048529,
            "lon": 11.2569913897,
            "name": "Bøstrup Å",
            "type": "Stream"
        },
        {
            "distance": 1.9525790822056994,
            "key": "2276",
            "lat": 55.5667725132,
            "lon": 11.2408064902,
            "name": "Halleby Å",
            "type": "Stream"
        }
    ]

So now I just need to figure out if I can get rid of the “duplicates” (each point from an entity) in the query - or I need to remove that in code afterwards. But hey, this is doable :wink:

1 Like