Question on Join Syntax: brewery-beer joins beer-sample

morrie · May 4, 2015, 11:01am

Example: list all of the beers manufactured by brewery “Yards Brewing”.

The following query works:

select br.name brewery, b.name beer, b.style style from beer-sample b join beer-sample br on keys b.brewery_id where br.name = “Yards Brewing” order by beer, style;

However, I am not able to make this join work using the USE KEYS argument to define the parent join key. It seems to me that this query would perform much faster if I could specify the parent (brewery) item key. Anybody know how I would use the USE KEYS argument in this kind of parent-child join?

cihangirb · May 4, 2015, 3:37pm

Hi @morrie, If your objective is to optimize the query, I’d recommend optimizing the beer segment of the query. You could create the following index
create index beer_brewery on beer-sample(type,brewery_id) where type=“beer” using GSI;

and push the optimization before the join to filter as much as possible using both the b.type=“beer” and b.brewery_id=“…”. with the nested loop join, this will eliminate the number of items you need to loop over in “b” (beers).

select br.name brewery, b.name beer, b.style style, b.brewery_id from beer-sample b join beer-sample br on keys b.brewery_id where b.type=“beer” and b.brewery_id=“yards_brewing” order by beer, style;

if you can compare the “explains”, you will see that you can avoid the primary scan with the index and push the filter earlier in the exec plan.

does this help?
thanks
-cihan

morrie · May 5, 2015, 6:12am

Thanks for the response. Those are all valid performance optimizations.

I’m actually specifically interested in the question of how/if this query would be written to use the “USE KEYS” clause. I’m wondering if USE KEYS can only be used in joins of separate buckets and cannot be used on a same-bucket parent/child join.

colm · May 5, 2015, 12:56pm

Hi @morrie,

Your query has the correct syntax. You could put a USE KEYS clause on the LHS of the join:

select br.name brewery, b.name beer, b.style style 
from beer-sample b use keys ["b1", "b2", "b3" ] join beer-sample br on keys b.brewery_id 
where br.name = "Yards Brewing" order by beer, style;

This would only use beers that have keys “b1”, “b2” and “b3”.

For a join, “ON KEYS” is the correct syntax to use to specify the keys from the LHS that the RHS is joined with.

Does this help?

morrie · May 6, 2015, 12:19pm

Actually what I think I’m trying to do is use USE KEYS for the brewery id , which in this case is “yards_brewing” . I want to eliminate the where -clause reference to br.name, which is forcing a full scan of the bucket.

geraldss · May 6, 2015, 9:56pm

Hi @morrie,

USE KEYS only applies to the leftmost bucket in your query. So in your example, you would need to start with a set of known keys, in order to benefit from USE KEYS.

Your query is fundamentally based on brewery name, not brewery id. That is the mismatch with USE KEYS. As @cihangirb said, you can create an index on brewery name, which would avoid your full bucket scan. You can confirm this by using CREATE INDEX followed by EXPLAIN, which should show an IndexScan (as opposed to a PrimaryScan, which is a full bucket scan).

Please let us know after you try this.

Thanks,
Gerald

morrie · May 7, 2015, 6:33am

Thanks Gerald - that was the issue I was addressing and which you have now confirmed: USE KEYS can only be used against the left most bucket in the join. This means that in the parent-child join which I am trying to execute, the parent cannot be addressed by its key.

Incidentally, I’d like to see that changed: it should be possible to access the parent by its key; we should not need to support a 2nd index if we already know the key Id.

I’m satisfied with Gerald’s response: I believe we can close this thread. Thanks to all who replied.

geraldss · May 7, 2015, 7:48pm

Hi @morrie, one more point about the change you’d like to see.

If you model your data so that each brewery contains an array of beer_ids, then you can achieve your desired access paths: USE KEYS on the brewery_id, then JOIN or NEST the corresponding beers using the array of beer_ids.

Gerald

Topic		Replies	Views
How to USE KEYS in N1QL subquery SQL++ 40-rc	15	18586	September 23, 2020
JOIN in many to one mapping SQL++	1	1158	December 15, 2017
Possible To Execute Joins Using Non-Key Fields? SQL++	8	3774	February 10, 2016
Create sub quey use keys meta().id SQL++ query	8	1273	May 22, 2020
Join using joined bucket as key SQL++ query , n1ql , server	8	1662	June 6, 2018

Question on Join Syntax: brewery-beer joins beer-sample

Related topics