N1QL - Creating an index for nested elements


I’m trying to set up an index that will index the value of a nested object. See the sample object below:

"created": "2015-09-24 11:03:19",
  "items": [
      "itemtype": "mobile-phone",
      "identifiers": [
          "identifier": "0898864134062374616",
          "type": "unknown"
      "manufacturer": "",
      "model": "",
      "value": "",
      "itemNumber": 1,
      "referenceItemNumber": null

I can query this no problem using the below query:

SELECT * FROM `default` 
    UNNEST items i 
        UNNEST i.identifiers b 
WHERE b.identifier = "0898864134062374616"

However I’ve not yet figured out how to write an index create statement that will create an index on the sub value “identifier” for the above object.

Any help would be appreciated. Thus far all attempts have resulted in the select query using the primary index instead.



Array indexing is something we are looking to expand in future releases. However you can index in some limited ways today: take an example such as this one:

       "id": "00000000000001",
       "desc": ”---",
       "type": "friends",
       "tags": [0,1,2,3,4,5,6,7,8,9]

CREATE INDEX itags_sorted ON default(ARRAY_SORT(tags)) USING GSI;

The following Query will use the index;

SELECT * FROM default WHERE tags= ARRAY_SORT([0,1,2,3,4,5,6,7,8,9]);

you obviously build references instead of embedding as well to solve the issue. In that case instead of using unnest, you would use joins.

Hi Cihan,

Thank you for your reply. Unfortunately I can see how to apply that type of index to our document structure. It’s unfortunate this feature isn’t available, I hadn’t anticipated this limitation on the GSI. We’d been looking forward to 4.0 for N1QL but it seems like for us it’s not going to be powerful enough day 1.

We can’t work around this either, we’ve got multiple live clusters with a dataset containing almost 200 million records in the above format at the moment. The suggested workaround of using reference objects would not only seem like a devolution (given that the attraction of couchbase is it allowed us to store JSON objects with more complexity than an RDB table) but it would unnecessarily multiply the number of stored documents by 5 - 10 times.

At present we use a view for looking up records by “identifier” but its limited and requires alot of filtering. A N1QL query would have been perfect for this but we’ll have to start looking at other solutions.

I understand and yes N1QL is very well suited for the type of processing. Indexing arrays is something we are working on now but I don’t have a preview that I can share at the moment. I am at cihan@couchbase.com if you’d like to check back in in a few weeks.

Hi Cihan,

is there any progress to share on this topic?

yes there is. could you reach out at cihan@couchbase.com so I can give you a preview of what we are planning to get your feedback?

Hi All,
I’m interested too in this kind of indexing and modifying the GSI index.
We have millions of object like the following and querying on Signals and Samples is a common use case.

  "StartTime": "2014-06-25T03:22:28",
  "EndTime": "2014-06-25T04:53:26",
  "EventDetails": "blablabla",  
  "Signals": [{
    "StartTime": "2014-06-25T03:22:28",
    "Value": 123,
    "Samples": [
      {"SampleTime": "2014-06-25T03:22:27","Value": 0.0,"Offset": -1},
      {"SampleTime": "2014-06-25T03:22:28","Value": 0.0,"Offset": 0},
      {"SampleTime": "2014-06-25T03:22:29","Value": 0.0,"Offset": 1}
    "OtherPropsDetails": "blablabla"
  "OtherPropsDetails": "blablabla"

Thank you, Raoul

Also interested in array indexing. Any updates on progress in that area is greatly appreciated. Thanks!

We are really enjoying couchbase.

But for production this feature is critical !!!

Any updates on progress in that area is greatly appreciated. Thanks!

Hi folks,

Just last week, we released 4.5 developer preview release that includes array indexing feature. Download and try out.


the new documentation for CREATE INDEX on embedded arrays is documented here.