Japanese File Names are garbled.

(from github.com/noffin)
I have some files hosted on a Windows IIS server, and they have Japanese names. The names are garbled in Fess pages.
It seems they are URL-encoded as Shift-JIS at crawl time and decoded as UTF-8 at display time.
Is there a solution, such as a configuration setting?
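
The suspected mismatch can be illustrated in isolation. The following is a minimal Python sketch, not Fess code: it percent-encodes a file name as Shift-JIS and then decodes the result as UTF-8. The file name is the one from the example link given later in this thread; whether Fess follows exactly this path internally is an assumption.

```python
from urllib.parse import quote, unquote

# File name taken from the example link in this thread.
name = "健康診断申請.pdf"

# Percent-encode the name from its Shift-JIS bytes, as a client following
# a link on a Shift-JIS page might do (assumption about the crawler).
encoded = quote(name, encoding="shift_jis")

# Decoding with a mismatched charset (UTF-8) cannot recover the name;
# invalid byte sequences become U+FFFD replacement characters.
garbled = unquote(encoded, encoding="utf-8", errors="replace")

# Decoding with the matching charset restores the original name.
restored = unquote(encoded, encoding="shift_jis")

print(encoded)
print(garbled)
print(restored)
```

The exact garbled string shown by Fess may differ from what this prints, since it depends on how and where the decoding actually happens.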

(from github.com/marevol)
Could you provide steps to reproduce it in my environment?

(from github.com/noffin)
Thank you for your reply.

  1. I’m using Fess 12.1 on CentOS 7 with OpenJDK.

  2. Each problem file has a name containing Japanese characters and is linked from a Shift-JIS HTML or ASPX page on IIS 5.

      e.g. a href=“./files/健康診断申請.pdf”
    
  3. I use the web crawler for the site, with basic authentication and a label configured.

  4. After crawling, garbled file names appear in the Fess search results.

I will add more detail next Wednesday, when I have access to the environment.

(from github.com/noffin)
I now understand that you need steps you can run in your own environment.
I tried to set up a web server that reproduces the situation for you, but I couldn’t. AWS S3 rejects files with Japanese names, and apache2 did not return the correct file for Shift-JIS-named files even with “AddDefaultCharset” set to Off.

Could you give me some other advice? Would logging any particular information during the crawl be useful?
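
One possible way to approximate the setup without IIS is a small throwaway server. The sketch below uses Python's standard `http.server` to serve a Shift-JIS index page containing the example link, and it accepts the linked path whether the client sends it percent-encoded or as raw Shift-JIS bytes. The names `ShiftJisHandler`, `decode_request_path`, `make_server`, and port 8000 are hypothetical. Basic authentication is omitted, the PDF body is only a placeholder, and it is not confirmed that this behaves the same way as IIS 5.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote_to_bytes

# File name taken from the example link in this thread.
FILE_NAME = "健康診断申請.pdf"

# Index page stored as Shift-JIS bytes, with the link written as raw text.
INDEX_HTML = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=Shift_JIS"></head>'
    f'<body><a href="./files/{FILE_NAME}">{FILE_NAME}</a></body></html>'
).encode("shift_jis")


def decode_request_path(path: str) -> str:
    """Interpret a request path as Shift-JIS, percent-encoded or raw."""
    # http.server decodes the request line as ISO-8859-1, so encoding it
    # back with the same codec recovers the bytes the client sent.
    raw = unquote_to_bytes(path.encode("iso-8859-1"))
    return raw.decode("shift_jis", errors="replace")


class ShiftJisHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        path = decode_request_path(self.path)
        if path == "/":
            self._send(200, "text/html; charset=Shift_JIS", INDEX_HTML)
        elif path == f"/files/{FILE_NAME}":
            # Placeholder body, not a valid PDF; swap in a real file if
            # the crawler needs to parse the content.
            self._send(200, "application/pdf", b"%PDF-1.4\n% placeholder\n")
        else:
            self._send(404, "text/plain", b"not found")

    def _send(self, status, content_type, body):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    # Binds to localhost only; change the address if the crawler runs
    # on another machine.
    return HTTPServer(("127.0.0.1", port), ShiftJisHandler)
```

Calling `make_server().serve_forever()` starts the server, after which a web crawl configuration could be pointed at `http://127.0.0.1:8000/` to check whether the file name is garbled in the search results.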