Efficient pattern matching

socketman2016 · February 14, 2019, 5:00pm

I have millions of documents that contain an email field.
How can I efficiently find email addresses that contain a specific word? A pattern like %myWord%

vsr1 · February 14, 2019, 5:43pm

You can follow these:
A Couchbase Index Technique for LIKE Predicates With Wildcard - DZone
Example 6 Token Functions | Couchbase Docs

Or you can use FTS index. cc @keshav_m

keshav_m · February 14, 2019, 5:48pm

Tokens() gives you a simple way to create a standard index and query.

FTS is also an alternative, depending on your stemming and fuzziness requirements.
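
To make the Tokens() idea concrete, here is a minimal Python sketch of what a token-based index buys you: each email is split into tokens once, and a lookup by word then becomes a direct index hit instead of a scan over every document. The `tokens` helper is only a rough stand-in for N1QL TOKENS() (lowercasing is my simplification), and the bucket, index, and document names are hypothetical.

```python
import re
from collections import defaultdict

# Roughly corresponding N1QL, with hypothetical bucket and index names:
#   CREATE INDEX idx_email_tokens ON `mybucket`
#     (DISTINCT ARRAY t FOR t IN TOKENS(email) END);
#   SELECT META().id FROM `mybucket`
#     WHERE ANY t IN TOKENS(email) SATISFIES t = "smith" END;


def tokens(text):
    """Rough stand-in for TOKENS(): lowercase runs of letters and digits."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_token_index(docs):
    """Map each token to the set of document IDs whose email contains it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for tok in tokens(doc["email"]):
            index[tok].add(doc_id)
    return index


# Hypothetical sample documents.
docs = {
    "u1": {"email": "abc.def@xyz.com"},
    "u2": {"email": "john.smith@gmail.com"},
    "u3": {"email": "smith99@xyz.com"},
}

index = build_token_index(docs)
print(sorted(index["smith"]))  # ['u2']
print(sorted(index["xyz"]))    # ['u1', 'u3']
```

Note that this matches whole tokens only: "smith" does not find "smith99". If you need a true substring match like %myWord%, the LIKE-with-wildcard technique from the article linked above, or FTS, is probably the better fit.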

socketman2016 · February 14, 2019, 7:34pm

Can you show me how I can do this with FTS?

vsr1 · February 15, 2019, 12:19am

You can check this for a quick reference: Compare text search in Couchbase & MongoDB - The Couchbase Blog

@abhinav, could you help here?

abhinav · February 15, 2019, 12:37am

Hey @socketman2016, here’s one way you can do it …
Set up an FTS type mapping over your specific field (emailID in this case).
I’d recommend using the Simple analyzer to start with …

An example …

emailID: abc.def@xyz.com
"." and "@" act as token separators with the simple analyzer, which keeps only runs of letters.
So the tokens generated with this analyzer would be: abc, def, xyz, com
However, numbers won't be tokenized by this analyzer, i.e.
the tokens generated from abc.123@xyz.com would be: abc, xyz, com
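
If you want to reason about this locally, the behaviour in the example can be approximated in a few lines of Python. This is only a sketch, assuming the simple analyzer amounts to "split on anything that is not a letter, then lowercase"; it handles ASCII letters only, and `simple_analyze` is a name I made up for illustration.

```python
import re


def simple_analyze(text):
    """Approximate the simple analyzer: keep runs of ASCII letters, lowercased."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


print(simple_analyze("abc.def@xyz.com"))  # ['abc', 'def', 'xyz', 'com']
print(simple_analyze("abc.123@xyz.com"))  # ['abc', 'xyz', 'com']
```

The second call shows why digits disappear: "123" contains no letters, so it never becomes a token.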

You also would have the power to set up a custom analyzer that will generate tokens based on the rule set you define.
Here’s where you can test what tokens are generated for search terms while using different analyzers…
http://bleveanalysis.couchbase.com/analysis
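
Once the index exists, one way to get close to %myWord% is a wildcard query against the FTS REST endpoint. The sketch below only builds the request and does not send it; the host, the index name `email-index`, and the helper name are assumptions for illustration, and a real call would also need credentials.

```python
import json
import urllib.request


def build_wildcard_request(host, index_name, word, field="emailID", size=10):
    """Build (but do not send) an FTS wildcard query for *word* on one field."""
    body = {
        # Wildcard terms are not analyzed, so lowercase the word to match
        # the lowercased tokens that the simple analyzer stores.
        "query": {"wildcard": f"*{word.lower()}*", "field": field},
        "size": size,
    }
    # 8094 is the default FTS port; the index name is hypothetical.
    url = f"http://{host}:8094/api/index/{index_name}/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_wildcard_request("localhost", "email-index", "myWord")
print(req.full_url)
print(req.data.decode("utf-8"))
```

Keep in mind that the wildcard is matched against individual tokens, so with the simple analyzer the word has to sit inside a single token, and leading wildcards can be expensive on large indexes.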
