INFER statement returns array of arrays of schemas

Hello, I was working with the INFER statement and found unexpected behavior in its output. The documentation says “At the top level, the output contains an array of schemas.” (link)
In practice, though, INFER returns not a plain array of JSON objects but an array of arrays, with the schemas described in the inner array.

Examples:
INFER `beer-sample` returns:
[
[
{//schema of document},
{//schema of document},
{//schema of document}…
]
]

So my question is: what is the purpose of this behavior? For now I don’t see any reason for returning an array inside an array; logically, a plain array of schemas would be enough. Or am I missing something?

All SQL statements return their results as an ARRAY.
The inferencer returns an ARRAY of the different schemas, so you end up with an ARRAY of ARRAYs. It has been like that since CB 4.5 (it may have been overlooked at the time). It can’t be changed now because of backward compatibility. Just dereference [0].

OK, I get it, and your explanation seems logical, but if that is so, why doesn’t SELECT * FROM bucket return the same structure, with an ARRAY in an ARRAY?

Because it is just an array of documents. In the case of INFER each individual result is an ARRAY. As @vsr1 noted, for legacy compatibility we can’t change this now.
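
To make the difference concrete, here is a minimal Python sketch that compares the shape of the two response bodies. The JSON payloads are hand-written, heavily abbreviated stand-ins (an assumption for illustration, not captured server output), and `element_types` is a hypothetical helper name.

```python
import json

# Hand-written, abbreviated stand-ins for query service response bodies.
# These shapes are assumed for illustration; they are not real server output.
select_body = json.loads('{"results": [{"name": "a"}, {"name": "b"}]}')
infer_body = json.loads('{"results": [[{"type": "object"}, {"type": "object"}]]}')


def element_types(body):
    """Return the Python type name of each element of the "results" array."""
    return [type(item).__name__ for item in body["results"]]


select_types = element_types(select_body)  # every result is a document (dict)
infer_types = element_types(infer_body)    # the single result is an array (list)
print(select_types, infer_types)
```

In both cases "results" is an array; the only difference is whether its elements are objects or, for INFER, one array holding the schemas.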

INFER isn’t a selectable statement (yet); when this is implemented we will probably be able to flatten the outer array.

Thanks for your answer!

It seems like INFER can return multiple results, but I tried so hard and got so far, and in the end I couldn’t find any case in which INFER returns something like
[
[
//schemas
],
[
//schemas
],
[
//schemas
]…
]

It doesn’t (so please don’t waste your time trying) but it was written to return an array of schemas (back when it was introduced) and we can’t change that now without breaking things. results[0] will reliably give you the naked array of INFER results.
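
A minimal Python sketch of that dereference, assuming the "results" field of the response has already been parsed into a list. `unwrap_infer` is a hypothetical helper name, the sample data is hand-written rather than real server output, and the shape check is an optional safeguard, not something the server requires.

```python
def unwrap_infer(results):
    """Return the inner array of schemas from a parsed INFER "results" array."""
    # INFER is expected to yield exactly one result, and that result is an array.
    if len(results) != 1 or not isinstance(results[0], list):
        raise ValueError("expected a single array of schemas in results")
    return results[0]


# Hand-written stand-in for the parsed "results" field (not real server output).
sample_results = [[{"type": "object"}, {"type": "object"}]]
schemas = unwrap_infer(sample_results)
print(len(schemas))
```

Failing loudly on an unexpected shape means the calling code would notice if the outer array were ever flattened in a later release.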

(I have a vague inkling that originally the inferencer was a standalone tool and when it was incorporated into the product directly we must have missed the unnecessary additional nesting.)

For what it is worth, I don’t think the documentation is incorrect here either; “at the top level” means the elements in the results array. “At the top level” for a select statement is similarly the contents of results, albeit in that case the elements are documents. You can see the arrays clearly in the examples on the INFER documentation page you pointed to.

(https://docs.couchbase.com/server/current/n1ql/n1ql-rest-api/index.html#_response_body does note that the contents of the “results” array can be any JSON value; the elements don’t have to be JSON objects, JSON arrays are OK too.)

HTH.

Thank you very much, now everything makes sense.
