My index shows duplicate results when a URL is accessible through both http and https. How can I tell Fess to index only the https version when both exist?
Can I control how Fess creates the “_id” field? I believe the duplicates happen because the id contains the protocol (http or https), so http://example.com and https://example.com get a different id and a different doc_id, which causes the duplication.
(from github.com/marevol)
Fess supports the “canonical” tag, so I think it should be used.
If you cannot change the target HTML pages, you can set http:.* in Excluded URLs For Crawling.
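To illustrate what the http:.* pattern does, here is a minimal Python sketch. It is not Fess code: Fess evaluates the pattern as a Java regex inside the crawler, and the sketch assumes the pattern has to match the whole URL. The helper name `is_excluded` and the sample URLs are made up for the example.

```python
import re

# Pattern taken from the reply above. Fess evaluates it as a Java regex;
# Python's re module is used here only for illustration.
EXCLUDE_PATTERN = re.compile(r"http:.*")


def is_excluded(url: str) -> bool:
    """Return True when the whole URL matches the exclusion pattern (assumed semantics)."""
    return EXCLUDE_PATTERN.fullmatch(url) is not None


urls = ["http://example.com/site1", "https://example.com/site1"]

# Keep only the URLs the pattern does not exclude.
kept = [u for u in urls if not is_excluded(u)]
print(kept)
```

Because the pattern requires a colon right after "http", https URLs do not match and stay crawlable, so only the https copy of each page would end up in the index.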
When you have more than 600 subsites, it is hard to set up path mapping for all of them.
So I believe a better solution is to tell Fess (somehow) that when the same URL exists under two protocols, it should show only the preferred one in the results. In my case I prefer https over http.
For example, if there are two results, http://example.com/site1 and https://example.com/site1,
it should show only the https one.
I will try to figure out how to do that, but any help is welcome.
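The behavior asked for here can be sketched in Python as a post-processing step over a list of result URLs. This is only an illustration of the idea, not a Fess feature or setting: the function names `protocol_free_key` and `prefer_https` are hypothetical, and the sketch assumes two URLs are "the same" when host, path, and query are equal.

```python
from urllib.parse import urlsplit


def protocol_free_key(url: str) -> str:
    """Build a comparison key that ignores the protocol (hypothetical helper)."""
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path
    if parts.query:
        key += "?" + parts.query
    return key


def prefer_https(urls):
    """Keep one URL per protocol-free key, choosing https when both exist."""
    chosen = {}
    for url in urls:
        key = protocol_free_key(url)
        current = chosen.get(key)
        is_https = urlsplit(url).scheme == "https"
        # Take the first URL seen for a key, and replace it only
        # when an https variant shows up for an http entry.
        if current is None or (is_https and urlsplit(current).scheme != "https"):
            chosen[key] = url
    return list(chosen.values())


results = [
    "http://example.com/site1",
    "https://example.com/site1",
    "http://example.com/only-http",
]
print(prefer_https(results))
```

A page that is reachable only over http is kept as-is, so nothing disappears from the results just because it has no https variant.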