Hello! I have a problem. I created a configuration to crawl some URLs with the parameter "Included URLs For Crawling". I checked the log file and saw that Fess had crawled the URLs without any errors, and in the admin dashboard I saw the count and size of docs constantly increasing. But when I searched for text from the processed sites (the ones whose processing I saw in the log file), I couldn't find it. How can I solve this problem? I use Fess with an Elasticsearch cluster.
What is the crawling config? I think the setting of "Included URLs" is wrong.
Sorry for taking so long to answer! I have a web crawling configuration with the following parameters (listing some of them):

Included URLs For Crawling: https://somelocalresource/.*
Config Parameters: config.html.canonical.xpath=
The number of threads: 10
As for the parameter config.html.canonical.xpath=, I set it because I noticed that whenever the phrase "INFO CANONICAL …some url" appeared in my fess_crawler log file, my crawler stopped. After I added config.html.canonical.xpath=, it continued to crawl (I saw "crawling url" messages in the log file), but I still couldn't find any text from the crawled pages in the search results. How do I check that the crawling worked correctly? An HTTP request to an Elasticsearch index? (Which index stores the data I need?) What should I do for the fess-crawler to work correctly?
I think it's better to check fess-crawler.log with the debug log level.
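If you also want to query Elasticsearch directly, something like the sketch below can confirm whether documents actually reached the index. The host/port (localhost:9200) and the alias name are assumptions from a typical setup — in recent Fess versions the document index is usually reachable through the "fess.search" alias, but verify the actual name on your cluster first:

```shell
# List aliases to find the index that holds the Fess documents
# ("fess.search" is the usual alias -- confirm it on your cluster)
curl -s "http://localhost:9200/_cat/aliases?v"

# Count the documents whose url starts with the crawled prefix
curl -s -H 'Content-Type: application/json' \
  "http://localhost:9200/fess.search/_count" \
  -d '{"query":{"prefix":{"url":"https://somelocalresource/"}}}'
```

If the count is 0 while the dashboard shows a growing document count, the documents may be going into a different index, or they are being filtered out before indexing — the debug-level crawler log should show which.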
Please, can you tell me how I can set the debug log level for the Fess crawler?
Please see Log Level Setting.
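For reference, this is roughly the change involved — a sketch only, since the file location and logger names can differ by Fess version and install method; check your own log4j2.xml:

```xml
<!-- log4j2.xml: app/WEB-INF/classes/log4j2.xml in a zip install,
     or /etc/fess/log4j2.xml for rpm/deb packages (paths may vary) -->
<Loggers>
    <!-- raise the crawler's log level from info to debug -->
    <Logger name="org.codelibs.fess.crawler" level="debug" />
    <!-- ... keep the other existing loggers unchanged ... -->
</Loggers>
```

After editing, restart Fess and re-run the crawl; fess-crawler.log should then show why each page is or is not indexed.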