Prefer indexing https when http is also available

Hi there,

My index shows duplicate results when a URL is accessible through both the https and http protocols. How can I “tell” Fess to index only the https version when both exist?

Can I control how Fess creates the “_id” field? I believe the duplicates happen because the id contains the protocol (http, https), so the two URLs get a different id and a different doc_id, which causes the duplication.


Fess supports the “canonical” tag, so I think it should be used.
If you cannot change the target HTML pages, you can add http:.* to Excluded URLs For Crawling.
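
To see what the http:.* pattern would and would not exclude, here is a minimal Python sketch. It only illustrates the regular expression itself; it assumes the pattern is matched against the whole URL, and the example.com URLs and the is_excluded helper are hypothetical, not part of Fess.

```python
import re

# Pattern taken from the reply above; everything else here is illustrative.
EXCLUDE_PATTERN = re.compile(r"http:.*")


def is_excluded(url: str) -> bool:
    """Return True when the whole URL matches the exclusion pattern."""
    return EXCLUDE_PATTERN.fullmatch(url) is not None


# Hypothetical URLs: only the plain-http one matches, because "https:"
# has an "s" where the pattern expects ":".
print(is_excluded("http://www.example.com/page.html"))
print(is_excluded("https://www.example.com/page.html"))
```

Note that a site reachable only over http would be dropped entirely by such a rule.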

Thanks, but I am not sure canonical is the right choice for this. Canonical is basically for saying “this page is the same as that one”, not for protocols.

In Google you can “say” whether you prefer to show results in https or http when the same link appears under both… I wish Fess could resolve this somehow too.

Another solution might be Path Mapping with Crawling process type.
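
As a rough illustration of the idea, a path mapping is a regular expression plus a replacement applied to the URL. The Python sketch below mimics that rewrite outside of Fess; the regex, the replacement, the map_path helper, and the example.com URLs are all assumptions for demonstration, not values confirmed in this thread.

```python
import re

# Hypothetical mapping rule: rewrite a leading "http://" to "https://".
MAPPING_REGEX = re.compile(r"^http://")
MAPPING_REPLACEMENT = "https://"


def map_path(url: str) -> str:
    """Apply the mapping once; URLs that do not match are returned unchanged."""
    return MAPPING_REGEX.sub(MAPPING_REPLACEMENT, url, count=1)


print(map_path("http://www.example.com/docs/index.html"))
print(map_path("https://www.example.com/docs/index.html"))
```

A rule this broad would also rewrite sites that are served only over http, so it is only safe where every matching site has a working https version.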


Yes, it can work, but it means I need to map each website.

Let's say I have many websites under one domain

and can be access through http and also https
but and is not

so i have to pathmap only
and if there is i also need to map it… and so on

When you have more than 600 subsites, it's hard to path-map them.

So I believe a better solution is to tell Fess (somehow) that when there are two identical URLs that differ only in protocol, it should show the preferred one in the results. So if I prefer https over http
and there are two results,
it needs to show only the https one.
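
The rule described above can be sketched in plain Python. This is not a Fess feature or API, only an illustration of the logic: group URLs by everything except the protocol, and keep the https one when both exist. The prefer_https function and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit


def prefer_https(urls):
    """Keep one URL per host/path/query, preferring https over http."""
    chosen = {}
    for url in urls:
        parts = urlsplit(url)
        # Key ignores the scheme, so http and https variants collide.
        key = (parts.netloc.lower(), parts.path, parts.query)
        current = chosen.get(key)
        if current is None:
            chosen[key] = url
        elif parts.scheme == "https" and urlsplit(current).scheme != "https":
            chosen[key] = url  # replace the http variant with https
    return list(chosen.values())


results = [
    "http://a.example.com/page",
    "https://a.example.com/page",
    "http://b.example.com/other",  # http-only site is kept as-is
]
print(prefer_https(results))
```

Unlike a blanket exclusion or mapping, this keeps sites that exist only over http and needs no per-site configuration.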

I will try to see how to do that, but any help is welcome.