A few questions not covered in the documentation

(from github.com/ghost)
Hi!

I have a few questions about the use and configuration of FESS. The documentation does not seem to cover these issues.

I’m using FESS 13 (current version) on Windows 10.

1.) I’m using FESS mainly to index files on Windows SMB shares. The files of interest here are documents such as doc and PDF files. However, FESS also crawls and indexes file types that are irrelevant to my searches (such as xml, zip and other types). How can I limit indexing to the relevant file types? (For example, I have to index all pdf files, but I do NOT need the content of an xml file indexed.)

2.) Is it possible to install FESS as a system service that starts automatically, so that I do not need to be logged on to Windows? There are a few files called fess-service-*** and a “service.bat” file in the bin folder, which would suggest that the whole search appliance can be installed as a system service. How can that be done? Is there any documentation available?

3.) Thumbnail view is not working. I’ve already enabled the option under General Configuration, but no thumbnails are displayed in the search results yet.

4.) Index reset: Is there any way to completely reset (or erase, if you will) the entire index, in order to start a ‘clean recrawl’ of all the resources?

Thank you for your help.

(from github.com/marevol)

  1. You can specify .*.pdf$ in Included Paths For Indexing. To skip crawling xml files and other types, use Excluded Paths For Crawling.

  2. Yes, but it’s not documented for the OSS product at the moment. If you need more support, please contact Commercial Support.

  3. The thumbnail generator is not supported on Windows.

  4. Search for *:* on the Admin Search page, and then click the Delete button at the bottom of the page.
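
To illustrate answer 1, here is a minimal Python sketch of how such a path pattern behaves. It assumes the pattern has to match the whole URL of each file; Fess itself uses Java regular expressions, which behave the same way for patterns this simple. The smb:// URLs are made up for illustration. One detail worth noting is that an unescaped dot matches any character, so a variant with the dot escaped is stricter.

```python
import re

# Pattern as written in the answer, plus a stricter variant with the dot escaped.
LOOSE = re.compile(r".*.pdf$")
STRICT = re.compile(r".*\.pdf$")

# Hypothetical SMB URLs, for illustration only.
urls = [
    "smb://fileserver/share/report.pdf",
    "smb://fileserver/share/config.xml",
    "smb://fileserver/share/notes_pdf",
]

def matches(pattern, url):
    # Assumption: the pattern must match the whole URL, not just a part of it.
    return pattern.fullmatch(url) is not None

loose_hits = [u for u in urls if matches(LOOSE, u)]
strict_hits = [u for u in urls if matches(STRICT, u)]

print(loose_hits)   # report.pdf and notes_pdf
print(strict_hits)  # report.pdf only
```

The loose pattern also accepts "notes_pdf" because the dot before "pdf" can stand for the underscore; the escaped form accepts only names that really end in ".pdf".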

(from github.com/ghost)
A follow-up question on No. 1:

I would specify .*.pdf$ in ‘Included Paths For Indexing’ to include pdf files. So it is my understanding that I could either whitelist items that should be indexed or blacklist items that should not be indexed (or crawled).
Now how do I enter several file types, e.g. in the ‘Included Paths For Indexing’ section?
Do I need to enter those file types one by one, or do I have to build a single-line string (such as "*.pdf;*doc;*docx" in Windows, for example)?

(from github.com/marevol)
You can put them on multiple lines, one pattern per line.
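
As a rough illustration of this approach, the Python sketch below splits a multi-line field value into separate regular expressions and accepts a URL if any one of them matches. The helper names, the sample field contents, and the whole-URL matching are assumptions made for illustration; they are not Fess internals.

```python
import re

# Hypothetical contents of the "Included Paths For Indexing" field,
# with one regular expression per line.
INCLUDED_FOR_INDEXING = r"""
.*\.pdf$
.*\.doc$
.*\.docx$
"""

def parse_patterns(text):
    # Skip blank lines and compile each remaining line as its own regex.
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]

def is_included(url, patterns):
    # Assumption: a URL is included when any single pattern matches all of it.
    return any(p.fullmatch(url) for p in patterns)

patterns = parse_patterns(INCLUDED_FOR_INDEXING)

print(is_included("smb://fileserver/share/manual.docx", patterns))
print(is_included("smb://fileserver/share/archive.zip", patterns))
```

Each line is an independent pattern, so no separator such as ";" is needed between file types.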