FTS analyzer for Russian

Hi! The FTS documentation says: “FTS includes Bleve’s general-purpose analyzers as well as pre-built text analyzers for the following languages: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Persian, Portuguese, Romanian, Russian, Sorani, Spanish, Swedish, Thai, and Turkish.”

On a fresh install of 4.6.0-3453, the analyzer list shown when creating an FTS index contains only: cjk, ckb, en, fa, fr, hi, it, keyword, pt, simple, standard, and web.
So my question is: how can I get an analyzer for Russian?

Same with 4.6.1-3652 and 4.5.1
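
To make the gap concrete, here is a small sketch that compares the languages named in the documentation quote with the analyzer names visible in the UI on 4.6.0-3453. The language-to-code mapping is my assumption, based on the analyzer names Bleve uses; the function name is purely illustrative and nothing here calls Couchbase or Bleve.

```python
# Languages named in the documentation quote, mapped to the analyzer
# names Bleve uses for them. This mapping is an assumption, not taken
# from the Couchbase documentation.
DOCUMENTED = {
    "Danish": "da", "Dutch": "nl", "English": "en", "Finnish": "fi",
    "French": "fr", "German": "de", "Hungarian": "hu", "Italian": "it",
    "Norwegian": "no", "Persian": "fa", "Portuguese": "pt",
    "Romanian": "ro", "Russian": "ru", "Sorani": "ckb", "Spanish": "es",
    "Swedish": "sv", "Thai": "th", "Turkish": "tr",
}

# Analyzer names listed in the FTS index creation UI on 4.6.0-3453.
AVAILABLE = {
    "cjk", "ckb", "en", "fa", "fr", "hi", "it",
    "keyword", "pt", "simple", "standard", "web",
}


def missing_languages(documented, available):
    """Return documented language names with no matching analyzer, sorted."""
    return sorted(
        name for name, code in documented.items() if code not in available
    )


if __name__ == "__main__":
    # Print each documented language that has no analyzer in the UI list.
    for name in missing_languages(DOCUMENTED, AVAILABLE):
        print(f"{name} ({DOCUMENTED[name]})")
```

Under that mapping, Russian is one of several documented languages with no matching analyzer in the list, so the mismatch is not specific to Russian.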

Bump. Any ideas? We really need to get clarity on this issue.

@mschoch any suggestions?

I think this is a documentation issue right now, because to support those languages Bleve needs to bundle the libstemmer library (through cgo), which we don’t at the moment. So while it can work in the future, I don’t know if and when it’s on the roadmap; I guess @mschoch can clarify.

As @daschl says, Bleve support for Russian is accomplished by using libstemmer. Currently, Couchbase FTS does not incorporate that feature. In general, Couchbase FTS support for languages will likely be a subset of what Bleve supports.

As an open-source project, Bleve can and does support community-contributed languages. However, as a commercial product, Couchbase has to meet a higher standard, which includes things like testing these languages.

The documentation is incorrect in this case and needs to be updated.

I have opened a ticket to get the documentation updated here: https://issues.couchbase.com/browse/MB-23630

Thank you for the clarification. I hope the PMs consider adding Russian to the roadmap. For now we will try to set up search with Elasticsearch and the transport plug-in. Thanks again for your replies.