Document refresh period

(from github.com/rustyx)
Is there a setting to control how often a document needs to be re-crawled (in case there is no Expires header)?

(from github.com/marevol)
There is no such setting, but you can control the schedule and create crawling configs.

Regarding "in case there is no Expires header": Fess does not use the Expires header.

(from github.com/rustyx)
Does it mean that all the URLs are retrieved every time the schedule is started?

(from github.com/marevol)
It depends on your requirements…
If you have many updated URLs, you can also use CsvListDataStore in Data Store Crawling.
CsvListDataStore crawls URLs by reading them from a CSV file.
A sample script that creates test data is csvlistdatastore.sh.
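
(editorial sketch, not part of the original reply) To illustrate the idea of feeding only updated URLs to a CSV-driven crawl, here is a minimal Python sketch that writes one CSV row per page modified after a cutoff time. The function name `write_url_list`, the `pages` input, and the two-column row layout (an event label followed by the URL) are all assumptions made for illustration. The layout CsvListDataStore actually expects should be checked against csvlistdatastore.sh and the data store configuration.

```python
import csv
import io
from datetime import datetime, timezone


def write_url_list(pages, since, out):
    """Write one CSV row per page modified after `since`; return the row count.

    `pages` is an iterable of (url, last_modified) pairs. This helper is
    hypothetical and is not part of Fess.
    """
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for url, modified in pages:
        if modified > since:
            # Assumed row layout: event label, then URL. Verify against
            # csvlistdatastore.sh before using this with a real data store.
            writer.writerow(["modify", url])
            count += 1
    return count


# Example input: one page changed after the cutoff, one did not.
pages = [
    ("https://example.com/a.html", datetime(2024, 1, 10, tzinfo=timezone.utc)),
    ("https://example.com/b.html", datetime(2023, 6, 1, tzinfo=timezone.utc)),
]
buf = io.StringIO()
n = write_url_list(pages, datetime(2024, 1, 1, tzinfo=timezone.utc), buf)
print(buf.getvalue(), end="")
```

In practice the output would go to a file that the data store config points at, and the list of changed pages would come from whatever system knows about the updates (a CMS, a sitemap, a change log).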

(from github.com/rustyx)
Ok, so the Web Crawler has no such setting. Clear, thanks.