maze13
1
When I create a new web crawler in Fess v13.6.0, the crawler doesn’t work:
…
2020-02-28 08:17:08,399 [main] INFO Starting Crawler…
2020-02-28 08:17:08,438 [WebFsCrawler] INFO No crawling target urls.
2020-02-28 08:17:08,439 [main] INFO Finished Crawler
2020-02-28 08:17:08,476 [main] INFO [CRAWL INFO] CrawlerEndTime=2020-02-28T08:17:08.439+0000,CrawlerStatus=true,CrawlerStartTime=2020-02-28T08:17:08.400+0000,WebFsCrawlEndTime=2020-02-28T08:17:08.438+0000,CrawlerExecTime=40,WebFsCrawlStartTime=2020-02-28T08:17:08.423+0000
2020-02-28 08:17:08,482 [main] INFO Destroyed LaContainer.
When I add the values of the web crawler to an existing web crawler (created before v13.6.0), the crawler works as expected.
What are your crawling settings?
I suspect they are not configured correctly.
maze13
3
The crawling settings are:
URLs: https://domain
Included URLs for crawling: https://domain/.*
Included URLs for indexing: https://domain/.*
The other config parameters are left at their defaults.
When I take the parameters above and enter them into an existing web crawler, crawling works as expected.
https://domain/.* does not match https://domain because the URL is missing the trailing /.
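The mismatch can be checked with a short sketch. It assumes the include pattern has to match the whole URL, which Python's re.fullmatch models here; whether Fess applies the pattern in exactly this way is an assumption, and is_included is a hypothetical helper, not part of Fess.

```python
import re

# Include pattern taken from the settings quoted above.
INCLUDE_PATTERN = r"https://domain/.*"


def is_included(url, pattern=INCLUDE_PATTERN):
    """Return True if the entire URL matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, url) is not None


# The pattern requires a "/" after the host, so the bare URL is rejected.
for url in ("https://domain", "https://domain/", "https://domain/page"):
    print(url, is_included(url))
```

Under that assumption, only the URLs that contain the slash after the host pass the check.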
maze13
5
I also tried (without success):
URLs: https://domain
      https://domain/
Included URLs for crawling: https://domain/.*
                            https://domain
Included URLs for indexing: https://domain/.*
                            https://domain
The error is always the same: No crawling target urls.
Try:
URL: https://domain
Included URL (for crawling): https://domain.*
“No crawling target urls.” means the specified URL does not match the Included URL pattern.
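As a rough illustration of that explanation, the sketch below filters start URLs against include patterns and shows how the stricter pattern leaves nothing to crawl. The function crawl_targets is hypothetical and is not Fess's actual implementation; full-string matching is assumed.

```python
import re


def crawl_targets(start_urls, include_patterns):
    """Keep only start URLs that fully match at least one include pattern.

    Hypothetical helper for illustration; not taken from Fess.
    """
    compiled = [re.compile(p) for p in include_patterns]
    return [u for u in start_urls if any(c.fullmatch(u) for c in compiled)]


# Pattern with the slash: the bare start URL is filtered out.
strict = crawl_targets(["https://domain"], ["https://domain/.*"])
# Pattern without the slash: the bare start URL is kept.
loose = crawl_targets(["https://domain"], ["https://domain.*"])

print(strict)
print(loose)
```

In this model an empty result corresponds to the "No crawling target urls." message. Note that the looser pattern also matches any longer host name that begins with the same text, such as https://domain2.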
maze13
7
That doesn’t work either.
It is strange that these crawling configurations work in web crawlers created before v13.6.0. Am I the only one having this issue?