johnfak
January 10, 2020, 7:59pm
21
Yes, quicker, but now it's not returning results. I must have done something wrong … I'll play around with it.
{
"name": "faksearch2",
"type": "fulltext-index",
"params": {
"doc_config": {
"docid_prefix_delim": "",
"docid_regexp": "",
"mode": "type_field",
"type_field": "type"
},
"mapping": {
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"dynamic": true,
"enabled": false
},
"default_type": "_default",
"docvalues_dynamic": true,
"index_dynamic": true,
"store_dynamic": false,
"type_field": "_type",
"types": {
"type": {
"dynamic": false,
"enabled": true,
"properties": {
"type": {
"enabled": true,
"dynamic": false,
"fields": [
{
"docvalues": true,
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "type",
"type": "text"
}
]
}
}
}
}
},
"store": {
"indexType": "scorch"
}
},
"sourceType": "couchbase",
"sourceName": "ice_us",
"sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
"sourceParams": {},
"planParams": {
"maxPartitionsPerPIndex": 171,
"indexPartitions": 6,
"numReplicas": 0
},
"uuid": "6f40814d6c9bcb55"
}
abhinav
January 10, 2020, 8:05pm
22
Cool. You'd want to make sure that the field name you're indexing matches the field name you're searching on.
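As a minimal sketch of what "matching field names" means in practice: an FTS match query can be restricted to a single field, and that field name has to be the one the index mapping actually indexed. The helper name `build_match_query` is hypothetical, and the search term "addressbook" is only an illustrative value taken from the sample document posted further down the thread.

```python
import json


def build_match_query(field: str, term: str, size: int = 10) -> dict:
    """Build an FTS request body for a match query limited to one field.

    `field` must be the name the index mapping indexed (here: "type").
    """
    return {"query": {"match": term, "field": field}, "size": size}


# The index in this thread maps a child field named "type",
# so the query has to target "type" as well.
body = build_match_query("type", "addressbook")
print(json.dumps(body, indent=2))
```

A body like this would be POSTed to the index's query endpoint (`/api/index/faksearch2/query` on the FTS port). If the query named a field the mapping never indexed, the request would succeed but return no hits.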
johnfak
January 10, 2020, 8:18pm
23
I'm trying to index the literal "type" value in the JSON document:
{
…
<other data>
…
"l7": "Harding",
"d0": "CUSTACCP",
"l8": "362 Helmson Ave",
"type": "addressbook", <=== INDEX THIS
"l9": "Apt 21",
"cas": 0,
"u0": "00918050",
"u1": 4,
"u2": "",
"s1": "",
"s2": false
}
In the UI I do the following:
Select JSON type field (type).
Unselect the default type mapping.
Click "+ Add type mapping".
Add "type" as the type name, leave inherit on, and select "only index specified fields".
Hit OK.
Deselect the default/dynamic mapping.
Hover back over the new type mapping and select "insert child field".
Select field: type
Select type: text
Select searchable as: type
Select analyzer: inherit
Select all options, but ensure "store" is deselected.
Create the index.
abhinav
January 10, 2020, 8:57pm
24
Just noticed that your index wouldn’t have indexed anything at all.
Do NOT un-select the default type mapping.
Do select “Only index specified fields”.
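Why the first definition indexes nothing can be sketched as follows. This is a simplified model, not the FTS implementation: it assumes a document is routed to the type mapping named after the value of its type field, and falls back to the default mapping when no such type mapping exists. `mapping_for` is a hypothetical helper.

```python
from typing import Optional


def mapping_for(index_mapping: dict, doc: dict, type_field: str) -> Optional[dict]:
    """Return the mapping that would handle `doc`, or None if it is skipped."""
    doc_type = doc.get(type_field)
    types = index_mapping.get("types", {})
    # Use the type mapping named after the document's type, else the default.
    chosen = types.get(doc_type, index_mapping.get("default_mapping", {}))
    return chosen if chosen.get("enabled") else None


doc = {"type": "addressbook", "l7": "Harding"}

# First attempt: a type mapping literally named "type", default mapping disabled.
first_attempt = {
    "default_mapping": {"dynamic": True, "enabled": False},
    "types": {"type": {"dynamic": False, "enabled": True}},
}

# Suggested fix: keep the default mapping enabled and non-dynamic.
fixed = {
    "default_mapping": {"enabled": True, "dynamic": False, "properties": {"type": {}}},
}

print(mapping_for(first_attempt, doc, "type"))
print(mapping_for(fixed, doc, "type") is not None)
```

Under this model, the sample document has the type "addressbook", no type mapping carries that name, and the disabled default mapping drops it. A type mapping named "type" would only catch documents whose type value is the string "type".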
johnfak
January 10, 2020, 9:13pm
25
Thanks, that works, but it is slow again.
Do I still need to select 'insert child field', or not, as I'm applying this to all 'type'?
abhinav
January 10, 2020, 9:25pm
27
This is incorrect. Follow these steps:
Drop the type mapping “type” and everything within it.
Within the default mapping, first select “only index specified fields” and then add a child field “type”.
Or instead, to make this even simpler, let's do this …
Copy this index mapping into a file, say "temp.json":
{
"name": "faksearch2",
"type": "fulltext-index",
"uuid": "",
"sourceType": "couchbase",
"sourceName": "ice_us",
"sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
"sourceParams": {},
"planParams": {
"maxPartitionsPerPIndex": 171,
"numReplicas": 0,
"indexPartitions": 6
},
"params": {
"mapping": {
"default_mapping": {
"enabled": true,
"dynamic": false,
"properties": {
"type": {
"enabled": true,
"dynamic": false,
"fields": [
{
"name": "type",
"type": "text",
"store": false,
"index": true,
"include_term_vectors": true,
"include_in_all": true,
"docvalues": true
}
]
}
}
},
"default_type": "_default",
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"store_dynamic": false,
"index_dynamic": true
},
"store": {
"indexType": "scorch"
},
"doc_config": {
"mode": "type_field",
"type_field": "type",
"docid_prefix_delim": "",
"docid_regexp": ""
}
}
}
Next run this command against your node …
curl -XPUT -H "Content-type:application/json" http://<username>:<password>@<ip>:8094/api/index/faksearch2 -d @temp.json
The UI would look like this …
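For anyone who would rather script the PUT than use curl, here is a rough Python equivalent using only the standard library. It only builds the request; the host, user name, and password below are placeholder assumptions, `build_index_put` is a hypothetical helper, and the definition is abbreviated (in practice you would load the full temp.json).

```python
import base64
import json
import urllib.request


def build_index_put(host: str, user: str, password: str,
                    definition: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT that creates or updates an FTS index."""
    url = f"http://{host}:8094/api/index/{definition['name']}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )


# Abbreviated stand-in for the contents of temp.json.
definition = {"name": "faksearch2", "type": "fulltext-index"}

# Placeholder host and credentials; substitute your own.
req = build_index_put("127.0.0.1", "Administrator", "password", definition)
print(req.get_method(), req.full_url)

# To actually send it against a running node:
# urllib.request.urlopen(req)
```

Basic auth is sent as a header here rather than embedded in the URL as in the curl command; both carry the same credentials.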
johnfak
January 10, 2020, 9:32pm
28
Works … but it's back to slow.
Around 7-9k docs per 3-second refresh of the screen.
So let's call it 3k docs per second …
That's 4 hours to index the docstore …
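The estimate above is plain throughput arithmetic, sketched below. The size of the docstore is not stated in the thread, so the document count here is only what the two quoted figures (3k docs per second, 4 hours) imply; the function names are hypothetical.

```python
def docs_implied(rate_per_sec: float, hours: float) -> int:
    """Number of documents processed at a given rate over a given time."""
    return int(rate_per_sec * hours * 3600)


def hours_to_index(total_docs: int, rate_per_sec: float) -> float:
    """Time in hours to index `total_docs` at a steady rate."""
    return total_docs / rate_per_sec / 3600


# 7-9k docs per 3-second refresh, expressed per second.
low_rate, high_rate = 7000 / 3, 9000 / 3

implied_docs = docs_implied(3000, 4)
print(implied_docs)
print(round(hours_to_index(implied_docs, low_rate), 2))
print(round(hours_to_index(implied_docs, high_rate), 2))
```

Rounding up to 3k per second uses the top of the observed range, so at the lower end of the range the same docstore would take somewhat longer than 4 hours.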
johnfak
January 10, 2020, 9:32pm
29
CPU is idle.
04:28:25 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
04:28:30 PM  all   15.70    0.00    7.66    0.14    0.00    0.24    0.00    0.00    0.00   76.25
04:28:30 PM    0   14.37    0.00    7.38    0.00    0.00    0.19    0.00    0.00    0.00   78.06
04:28:30 PM    1   17.90    0.00    7.78    0.19    0.00    0.19    0.00    0.00    0.00   73.93
04:28:30 PM    2   16.92    0.00    7.50    0.19    0.00    0.19    0.00    0.00    0.00   75.19
04:28:30 PM    3   13.66    0.00    8.16    0.00    0.00    0.19    0.00    0.00    0.00   77.99
abhinav
January 10, 2020, 9:37pm
30
That is pretty slow. Can you confirm that the index definition is exactly identical to what I shared earlier?
sreeks
January 13, 2020, 4:29pm
31
Hi @johnfak, are you sure about the CPU usage of the cbft process?
What's the amount of RAM, and what FTS RAM quota is set?
A healthy RAM/FTS quota helps with faster indexing too.
johnfak
January 14, 2020, 2:17pm
32
I believe so … although it formats it differently. Sorry, I was out Monday.
{
"type": "fulltext-index",
"name": "faksearch2",
"uuid": "612e4eeb653ebc1b",
"sourceType": "couchbase",
"sourceName": "ice_us",
"sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
"planParams": {
"maxPartitionsPerPIndex": 171,
"indexPartitions": 6
},
"params": {
"doc_config": {
"docid_prefix_delim": "",
"docid_regexp": "",
"mode": "type_field",
"type_field": "type"
},
"mapping": {
"analysis": {},
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"dynamic": false,
"enabled": true,
"properties": {
"type": {
"dynamic": false,
"enabled": true,
"fields": [
{
"docvalues": true,
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "type",
"type": "text"
}
]
}
}
},
"default_type": "_default",
"docvalues_dynamic": true,
"index_dynamic": true,
"store_dynamic": false,
"type_field": "_type"
},
"store": {
"indexType": "scorch"
}
},
"sourceParams": {}
}
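Because the server returns the definition with a different key order and fills in fields such as the uuid, comparing the two by eye is error-prone. A minimal sketch of a structural comparison is shown below; `diff_paths` is a hypothetical helper, and the two dictionaries are abbreviated excerpts of the shared and returned definitions, not the full documents.

```python
def diff_paths(a, b, path=""):
    """Return the paths where two JSON-like values differ.

    Dict key order is ignored; lists are compared element by element.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        out = []
        for key in sorted(set(a) | set(b)):
            here = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                out.append(here)  # key present on one side only
            else:
                out.extend(diff_paths(a[key], b[key], here))
        return out
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        out = []
        for i, (x, y) in enumerate(zip(a, b)):
            out.extend(diff_paths(x, y, f"{path}[{i}]"))
        return out
    return [] if a == b else [path]


# Abbreviated excerpts of the definition as shared and as returned.
shared = {
    "uuid": "",
    "planParams": {"numReplicas": 0, "indexPartitions": 6},
    "params": {"mapping": {"default_mapping": {"enabled": True, "dynamic": False}}},
}
returned = {
    "planParams": {"indexPartitions": 6},
    "uuid": "612e4eeb653ebc1b",
    "params": {"mapping": {"default_mapping": {"dynamic": False, "enabled": True}}},
}

print(diff_paths(shared, returned))
```

The reordered mapping keys produce no difference, while the uuid and the omitted numReplicas are flagged. Whether an omitted key is equivalent to its default value is something this sketch cannot decide; it only narrows down where to look.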
johnfak
January 14, 2020, 2:19pm
33
Basically, each of the 3 servers has 16GB RAM.
Not using MDS.
10GB to data
3GB to query
512MB (default) to search
1GB to analytics
sreeks
January 14, 2020, 4:03pm
34
512 MB is very low; try bumping this up to 2-3GB or so. You may adjust the memory quotas of the other services you are not using accordingly.
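To see why the other quotas may need adjusting, here is the per-node budget as quick arithmetic. The service labels and sizes are the ones quoted in the previous post; `headroom_mb` is a hypothetical helper, and how much memory should be left for the operating system is not covered by this sketch.

```python
NODE_RAM_MB = 16 * 1024

# Quotas as quoted in the thread, in MB.
quotas_mb = {
    "data": 10 * 1024,
    "query": 3 * 1024,
    "search": 512,
    "analytics": 1024,
}


def headroom_mb(quotas: dict, node_ram_mb: int) -> int:
    """RAM left on the node after all service quotas are subtracted."""
    return node_ram_mb - sum(quotas.values())


print(headroom_mb(quotas_mb, NODE_RAM_MB))

# Raising only the search quota to 2 GB or 3 GB, everything else unchanged.
for search_mb in (2048, 3072):
    trial = dict(quotas_mb, search=search_mb)
    print(search_mb, headroom_mb(trial, NODE_RAM_MB))
```

With the other quotas left unchanged, a 2GB search quota uses up all remaining RAM on the node and 3GB exceeds it, so some memory has to come from one of the other services.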
johnfak
January 14, 2020, 4:40pm
35
Definitely quicker, and more acceptable/normal for a large index.
The query is down from 40 seconds (N1QL) to around 3.5 to 4 seconds, pretty much on par with Redis.
Nice job @abhinav @sreeks
Final side note.
There seem to be many more options, and possibly faster access, via FTS than via N1QL and Analytics (based on this basic use case).
Is there a guide on when to use one over the other based on performance/features?
Thanks again … very insightful.
abhinav
January 15, 2020, 6:44pm
36
There’s documentation on various query types supported by FTS here …
https://docs.couchbase.com/server/6.5/fts/fts-query-types.html
And here’s how to leverage FTS from within N1QL in the upcoming release …
As for guidelines on when to use N1QL+GSI vs FTS, I'll ping @binh.le to check whether he can point you to any available documentation, or whether it's a work in progress.
johnfak
January 15, 2020, 6:52pm
37
Thanks @abhinav
That would be great. I'm also going to reach out to our Couchbase engagement guys and ask for a team demo on FTS/Analytics.
Appreciated.