Is there a way to tell if an index has been used?

jda · April 10, 2024, 10:06am

I have started playing with the external index advisor. This leads to creating more indexes

However, then I would like to get rid of indexes that may not be used. The previous indexes have been created a while ago with more like a “guess” approach (sometimes fueled by good advices here).

I suppose I shouldn’t create a new index for each and every different type of query but ideally have some overlap - or what is the good advice?

A little more specific question about the index suggested by the advisor:

Here is an example. This query:

SELECT data.date,data.count,data.length
FROM data
WHERE type='Catch' AND year=2023
    AND statslocation IN [ "150", "153", "647", "127", "605", "655", "641", "673", "597", "644", "643", "45", "671", "645", "8" ]

leads to a recommended index:

CREATE INDEX adv_type_year_statslocation ON `data`(`type`,`year`,`statslocation`)

I know that this index is only relevant for documents of type “Catch” - so should I add a WHERE type='Catch' phrase to the index creation statement - or doesn’t it matter?

And another question/clarification. In the advisor I get this full response for the above query:

{
  "results": [
    {
      "#operator": "Advise",
      "advice": {
        "#operator": "IndexAdvice",
        "adviseinfo": {
          "recommended_indexes": {
            "covering_indexes": [
              {
                "index_statement": "CREATE INDEX adv_type_year_statslocation_date_count_length ON `data`(`type`,`year`,`statslocation`,`date`,`count`,`length`)",
                "keyspace_alias": "data"
              }
            ],
            "indexes": [
              {
                "index_statement": "CREATE INDEX adv_type_year_statslocation ON `data`(`type`,`year`,`statslocation`)",
                "keyspace_alias": "data",
                "recommending_rule": "Index keys follow order of predicate types: 2. equality/null/missing, 3. in."
              }
            ]
          }
        }
      },
      "query": "SELECT data.date,data.count,data.length\nFROM data\nWHERE type='Catch' AND year=2023\n    AND statslocation IN [ \"150\", \"153\", \"647\", \"127\", \"605\", \"655\", \"641\", \"673\", \"597\", \"644\", \"643\", \"45\", \"671\", \"645\", \"8\" ]"
    }
  ]
}

Does this mean that I should also create the first “covering” index?

dh · April 10, 2024, 10:45am

The system:indexes collection has a field metadata.stats.last_scan_time; if this is non-null then you can see when the index was last used since the last restart.
So keep an eye on this and you may get an idea of which indices are no longer used.

You can get similar information directly via the Index statistics endpoint: https://docs.couchbase.com/server/current/rest-api/rest-index-stats.html#_get_index_stats

Whether or not to create partial indices depends on your usage. If for example you will always filter on type='Catch' then it does make sense to create the partial index, assuming the data volume warrants it. (One of the main advantages to a partial index is that it is smaller - and is therefore faster to traverse - but if (almost) all your data matches the filter then the effect is diminished.) (Using collections may be a better fit in some cases, doing away with a “type” field.)

The advisor will attempt to suggest a covering index if it can. If your query won’t change - including the fields/expressions that are projected - then the covering index will typically give you superior performance, usually at the cost of being a larger index (more fields are typically included). If you will be running the query often then it likely makes sense to trade the space for the speed; if this is an occasional-only query or one where you may change the selected fields (but not the filtering criteria) then likely it doesn’t. (There will of course always be exceptions.)

You may want to refer to https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/covering-indexes.html

HTH,

jda · April 10, 2024, 11:03am

Ok, so it is really a matter of creating either the covering index or the other index. Great to know - that makes totally sense.

I will have a look at system:indexes to see if some of the indexes appear obsolete

jda · April 10, 2024, 11:24am

Is there any way I can get this information from the Admin interface (e.g. via a query)? I can see that I can get it via the REST API. However, I have a server where I only have easy access via the browser admin interface on port 8091

dh · April 10, 2024, 11:56am

Can you not query via the Query Workbench ? (http://:8091/ui/index.html#/query/workbench)

Otherwise AFAIK you’d need access to the appropriate Indexer port (9102 typically) to get the stats that way.

jda · April 11, 2024, 12:01pm

Well, I thought so, but if I run:

SELECT *
FROM system:indexes

I don’t get that property as far as I can see…

[
  {
    "indexes": {
      "condition": "((`type` = \"Catch\") and (`userkey` is valued))",
      "datastore_id": "http://127.0.0.1:8091",
      "id": "14e2468855630e91",
      "index_key": [
        "`userkey`",
        "`specieskey`",
        "`locationtype`",
        "`statslocation`",
        "`statspublic`",
        "`statstrippublic`",
        "`statstriptarget`",
        "`year`",
        "`month`",
        "`count`",
        "`length`",
        "`weight`"
      ],
      "keyspace_id": "data",
      "metadata": {
        "num_replica": 0
      },
      "name": "stats_catch_species_count_2",
      "namespace_id": "default",
      "state": "online",
      "using": "gsi"
    }
  },
:
:

dh · April 11, 2024, 12:30pm

Sorry, I forgot it is only in 7.6+ (MB-34863 exposed it, MB-54629 refined it) and you’re undoubtedly on an earlier version.

I believe the stat is still available via the index API though - just not through Query/SQL++.

jda · April 11, 2024, 12:32pm

Yeah, I am on 7.2.4 and see that 7.6 requires an OS update as our servers are on CentOS7 - so it will take a little more to upgrade to the next version

… but good to know that it is coming!

Tejas · May 9, 2024, 5:13am

Hello,
I am using “https://couchbase-server:18091/indexStatus” to get the details of “last_scan_time” for indexes.
I got time for most of the indexes but some indexes have time as null “lastScanTime”: “NA”. What does null value indicates here?
Thank you.

jda · May 10, 2024, 8:22am

As I understand it that means that no queries have used that index since the server was started - which is actually what I was trying to find out

vsr1 · May 10, 2024, 2:57pm

https://docs.couchbase.com/server/current/rest-api/rest-index-stats.html

last_known_scan_time
optional Time of the last scan request received for this index (Unix timestamp in nanoseconds). This may be useful for determining whether this index is currently unused.

This statistic is persisted to disk every 15 minutes, so it is preserved when the indexer restarts. integer (int64)

system · August 8, 2024, 2:58pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Finding unused indexes SQL++ index	6	2112	June 4, 2020
Query analyzer doesn't use index specified by "use index" statement SQL++	7	1355	June 12, 2021
How to find all un-used indexes? Couchbase Server index	8	2821	June 18, 2020
Create indexes in Couchbase SQL++	2	618	January 23, 2020
How to find which Index is being used? Couchbase Server query , java , index	2	945	November 2, 2022

Is there a way to tell if an index has been used?

Related topics