Proper way to FTS index and search Meta().id or doc IDs using regex or wildcard searches

“Doc ID with regex” is how Couchbase FTS lets you extract “type identifiers” from document IDs for indexing. Let me explain this with an example …

Ex: Say you have these documents in your Couchbase bucket…

  • doc-airline-23
  • doc-airline-421
  • doc-airport-992

Using the regexp -[a-z]*-, the parts of the doc IDs that match the pattern are “-airline-” and “-airport-”.
With that regexp set in your “Doc ID with regex” field, you can define top-level type mappings named “-airline-” or “-airport-” to index only those documents whose doc IDs contain that term.
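
To see what the regexp pulls out of each doc ID, here is a rough Python sketch. It only imitates the extraction locally; FTS evaluates the pattern server-side, and I'm assuming it takes the first match in the ID. The helper name type_from_doc_id is made up for illustration.

```python
import re

# Same pattern as in the "Doc ID with regex" field above.
DOCID_REGEXP = re.compile(r"-[a-z]*-")

def type_from_doc_id(doc_id):
    """Return the first substring of doc_id that matches the pattern, or None.

    Hypothetical helper: assumes the first match is what becomes the type.
    """
    match = DOCID_REGEXP.search(doc_id)
    return match.group(0) if match else None

doc_ids = ["doc-airline-23", "doc-airline-421", "doc-airport-992"]

# Map each doc ID to the type identifier extracted from it.
types = {doc_id: type_from_doc_id(doc_id) for doc_id in doc_ids}

for doc_id, type_name in types.items():
    print(doc_id, "->", type_name)
```

A doc ID with no match yields None here, which corresponds to a document that would not fall under any of the type mappings.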

So setting this as your index definition would index all attributes of documents with “-airline-” in their IDs, and nothing else, since the default mapping is disabled …

{
  "name": "temp",
  "type": "fulltext-index",
  "params": {
    "doc_config": {
      "docid_regexp": "-[a-z]*-",
      "mode": "docid_regexp"
    },
    "mapping": {
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": true,
        "enabled": false
      },
      "types": {
        "-airline-": {
          "dynamic": true,
          "enabled": true
        }
      }
    },
    "store": {
      "indexType": "scorch"
    }
  },
  "sourceType": "couchbase",
  "sourceName": "bucket",
  "sourceUUID": "",
  "uuid": ""
}
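
If you'd rather create the index over REST than through the UI, something like the following Python sketch should work. It only builds the PUT request against the FTS endpoint /api/index/{indexName} and does not send it. The host, port 8094, and the credentials are placeholders I've assumed, and build_create_index_request is a made-up helper name.

```python
import base64
import json
import urllib.request

# The index definition from above, as a Python dict.
index_def = {
    "name": "temp",
    "type": "fulltext-index",
    "params": {
        "doc_config": {"docid_regexp": "-[a-z]*-", "mode": "docid_regexp"},
        "mapping": {
            "default_analyzer": "standard",
            "default_datetime_parser": "dateTimeOptional",
            "default_field": "_all",
            "default_mapping": {"dynamic": True, "enabled": False},
            "types": {"-airline-": {"dynamic": True, "enabled": True}},
        },
        "store": {"indexType": "scorch"},
    },
    "sourceType": "couchbase",
    "sourceName": "bucket",
    "sourceUUID": "",
    "uuid": "",
}

def build_create_index_request(definition, host="localhost", port=8094,
                               user="Administrator", password="password"):
    """Build (but do not send) a PUT request that creates the index.

    host, port, user and password are assumed placeholder values.
    """
    url = f"http://{host}:{port}/api/index/{definition['name']}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

request = build_create_index_request(index_def)
# To actually create the index, send it against a running cluster:
# urllib.request.urlopen(request)
```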

Hope that explains what “Doc ID with regex” is intended to do. Note that this won’t help you search doc IDs by regexp; it only controls which documents get indexed, namely the ones whose IDs match the regexp and have an enabled type mapping.

The Doc ID query would only assist you to retrieve documents if you specify the exact ID(s) - so I gather this isn’t what you want.

Now if you want to search across all documents by applying a regexp you would need to issue a regexp query against the “_id” field that FTS automatically indexes.

Taking the previous example again, the following query would fetch all documents that have “airline” in their IDs …

{
  "query": {
    "regexp": ".*airline.*",
    "field": "_id"
  }
}
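
Here is a small Python sketch that builds the same query body and imitates the match locally, just to show why the leading and trailing .* are there. I'm assuming the regexp has to match the whole “_id” term rather than a substring, so the local emulation uses a full match. Both function names are made up for illustration.

```python
import json
import re

def build_doc_id_regexp_query(pattern):
    """Return the search request body for a regexp query on the "_id" field."""
    return {"query": {"regexp": pattern, "field": "_id"}}

def emulate_locally(pattern, doc_ids):
    """Imitate the query in Python, assuming whole-term (anchored) matching."""
    compiled = re.compile(pattern)
    return [doc_id for doc_id in doc_ids if compiled.fullmatch(doc_id)]

doc_ids = ["doc-airline-23", "doc-airline-421", "doc-airport-992"]

# JSON body you would send with the search request.
body = json.dumps(build_doc_id_regexp_query(".*airline.*"))

# With the wildcards on both sides, both airline IDs match.
matches = emulate_locally(".*airline.*", doc_ids)

# Without them, nothing matches under the whole-term assumption.
unanchored = emulate_locally("airline", doc_ids)

print(body)
print(matches)
print(unanchored)
```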