discuss
1
(from github.com/anatomo)
Hello.
I have a question about Synonym List.
I uploaded synonym.txt.
A=>B
C=>B
When I search for “A”, documents containing “C” are also hit.
(“A” is not similar to “C”.)
Maybe I could fix this by setting the “expand=false” option on the Synonym Filter.
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html
Could you please advise how to do it?
discuss
2
(from github.com/marevol)
- Modify app/WEB-INF/classes/fess_indices/fess.json
- On the Upgrade page, start Reindex with updating aliases
discuss
3
(from github.com/anatomo)
Thanks for your advice.
I modified app/WEB-INF/classes/fess_indices/fess.json.
(Added “expand:false” to the synonym tokenizers.)
"tokenizer": {
  "japanese_tokenizer": {
    "type": "fess_japanese_reloadable_tokenizer",
    "mode": "normal",
    "user_dictionary": "${fess.dictionary.path}ja/kuromoji.txt",
    "discard_punctuation": false,
    "reload_interval": "1m"
  },
  "korean_tokenizer": {
    "type": "fess_korean_tokenizer",
    "index_eojeol": false,
    "pos_tagging": false,
    "user_dict_path": "${fess.dictionary.path}ko/seunjeon.txt"
  },
  "simplified_chinese_tokenizer": {
    "type": "fess_simplified_chinese_tokenizer"
  },
  "vietnamese_tokenizer": {
    "type": "fess_vietnamese_tokenizer",
    "sentence_detector": false,
    "ambiguities_resolved": false
  },
  "unigram_synonym_tokenizer": {
    "type": "ngram_synonym",
    "n": "1",
    "synonyms_path": "${fess.dictionary.path}synonym.txt",
    "expand": false,
    "dynamic_reload": true,
    "reload_interval": "1m"
  },
  "bigram_synonym_tokenizer": {
    "type": "ngram_synonym",
    "n": "2",
    "synonyms_path": "${fess.dictionary.path}synonym.txt",
    "expand": false,
    "dynamic_reload": true,
    "reload_interval": "1m"
  }
},
Then I started Reindex with updating aliases.
After the reindex, I searched for “A”, but the keywords “B” and “C” are not hit.
(Only the keyword “A” is hit.)
Is this setting wrong?
discuss
4
(from github.com/marevol)
I think what you want to do is:
B=>A,C
discuss
5
(from github.com/anatomo)
Thank you, it works.
But I have already created synonym.txt in the wrong format.
(It has about 250,000 words.)
Hmm… I will try to fix the synonym.txt format.
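For a file of that size, the conversion from the A=>B, C=>B layout to the B=>A,C layout suggested above could probably be scripted rather than done by hand. The following is only a minimal Python sketch and is not part of Fess. The function names invert_synonym_rules and convert_file are hypothetical. It assumes every rule line has the form left=>right with optional comma-separated terms on either side, and it skips blank lines, lines starting with # and lines without =>.

```python
def invert_synonym_rules(lines):
    """Group 'A=>B' style rules by their right-hand side and emit 'B=>A,C' rules."""
    grouped = {}  # right-hand term -> dict used as an ordered set of left-hand terms
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines, '#' comments and lines without '=>' are ignored.
        if not line or line.startswith("#") or "=>" not in line:
            continue
        left, right = line.split("=>", 1)
        sources = [term.strip() for term in left.split(",") if term.strip()]
        targets = [term.strip() for term in right.split(",") if term.strip()]
        for target in targets:
            bucket = grouped.setdefault(target, {})
            for source in sources:
                bucket[source] = None  # dict keys keep insertion order and drop duplicates
    return [f"{target}=>{','.join(bucket)}" for target, bucket in grouped.items()]


def convert_file(src_path, dst_path, encoding="utf-8"):
    """Read rules from src_path and write the inverted rules to dst_path."""
    with open(src_path, encoding=encoding) as src:
        converted = invert_synonym_rules(src)
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write("\n".join(converted) + "\n")
```

Calling convert_file("synonym.txt", "synonym_fixed.txt") (both file names are placeholders) would write the regrouped rules to a new file, so the original stays untouched and the result can be checked before uploading it.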