How to improve the speed of regular expression queries

Hello,
Is there any way to improve the speed of regular expression queries?
I have a collection with over 6 million records and the following query is taking 3 minutes 30 seconds. I would like to know if there is any way to reduce the query speed.
Thank you.

SELECT *
FROM test._default.record USE INDEX (USING FTS)
WHERE SEARCH(test,{ “fields”: [“*”], “highlight”: {}, “size”: 100, “from” : 0, “query”: { “query”: “+email:/test.+/”}})

@abhinav , since this is FTS not Query.

@ryosek The regex query will try to run a disjunction over all candidates from your data that it could find. The run time for this query will depend on that number of clauses for the disjunction. Alternatively, you can try using one of search’s custom analyzer to hold edge-ngrams of length:5 and then simply try running a term query for test. - which should be much faster than the regexp because of a single clause lookup.

That said, if you’re claiming your query is taking over 3min to run -

  • I suspect the bulk of the time being spent is because of your query being non-covering for the index.
  • Using the * clause with your SELECT predicate will cause KV fetches after the search is run - which will tremendously slow down your queries.
  • So identifying what in your documents matters to you becomes essential here - and storeing those fields in the search index to remove the fetch component in the EXPLAIN description for your query should be your first angle of attack.

Sorry, I think it is FTS, not Query as you say.
I didn’t know what to call FTS with a query clause like this code
“query”: { “query”: “+email:/test.+/”}

Thank you for your detailed reply!

I even didn’t know that I can index a specific field in FTS.
Is the index for FTS different from the index for a query? I have already created an index for email by running the following query.
CREATE INDEX EMAIL_IDX ON test. _default. record(email)

A member of my team wants to search many fields using regular expressions(e.g. name, address, etc.). That means I need to index each of these fields on the FTS, right?

And I also will try all the options you gave me.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.