discuss
October 17, 2018, 3:20pm
#1
(from github.com/burple6 )
Given the following HTML code, FESS fails to see the linked pages:
<a href="start-here.html">Start Here</a>
<a href="our-brand.html">Our Brand</a>
<a href="logo-basics.html">Logo Basics</a>
If the links are changed to absolute URLs, the FESS crawler sees and crawls them.
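For reference, here is a minimal sketch of how relative hrefs like these would normally be resolved against the URL of the page that contains them. It uses Python's standard urllib, not FESS's own code, and the base URL (http://example.com/branding/index.html) and the function name resolve_links are placeholders chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical page URL, used only for illustration.
BASE_URL = "http://example.com/branding/index.html"

# The relative hrefs from the HTML snippet above.
RELATIVE_HREFS = ["start-here.html", "our-brand.html", "logo-basics.html"]


def resolve_links(base_url, hrefs):
    """Resolve each relative href against the URL of the containing page."""
    return [urljoin(base_url, href) for href in hrefs]


if __name__ == "__main__":
    for url in resolve_links(BASE_URL, RELATIVE_HREFS):
        print(url)
```

Under standard URL resolution, each relative link ends up in the same directory as the page that contains it, so a crawler would be expected to queue those absolute URLs.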
discuss
October 17, 2018, 4:59pm
#2
(from github.com/burple6 )
This issue occurs when crawling http://med.umich.edu/branding/index.html with the included URL pattern ".*/med.umich.edu/branding/.*".
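As a sanity check, the sketch below tests whether the resolved link URLs would pass that included pattern. It assumes the pattern is applied as a regular expression that must match the whole URL, which is an assumption about how FESS evaluates it and not something confirmed here. The helper name is_included is hypothetical.

```python
import re

# Included URL pattern from the crawl configuration described above.
INCLUDED_PATTERN = r".*/med.umich.edu/branding/.*"


def is_included(url, pattern=INCLUDED_PATTERN):
    """Return True if the whole URL matches the pattern (assumed full-match semantics)."""
    return re.fullmatch(pattern, url) is not None


if __name__ == "__main__":
    candidates = [
        "http://med.umich.edu/branding/start-here.html",
        "http://med.umich.edu/branding/our-brand.html",
        "http://med.umich.edu/branding/logo-basics.html",
    ]
    for url in candidates:
        print(url, is_included(url))
```

If all three URLs match, the pattern itself is unlikely to be what keeps the linked pages out of the crawl.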
discuss
October 18, 2018, 6:54am
#3
(from github.com/marevol )
I could not reproduce this. It works:
2018-10-18 15:50:30,579 [WebFsCrawler] INFO Target URL: http://med.umich.edu/branding/index.html
2018-10-18 15:50:30,579 [WebFsCrawler] INFO Included URL: .*/med.umich.edu/branding/.*
2018-10-18 15:50:30,685 [Crawler-20181018155023-1-1] INFO Crawling URL: http://med.umich.edu/branding/index.html
2018-10-18 15:50:30,713 [Crawler-20181018155023-1-1] INFO Checking URL: http://med.umich.edu/robots.txt
2018-10-18 15:50:33,552 [Crawler-20181018155023-1-1] INFO Crawling URL: http://med.umich.edu/branding/our-brand.html
2018-10-18 15:50:34,979 [Crawler-20181018155023-1-1] INFO Crawling URL: http://med.umich.edu/branding/start-here.html
discuss
November 2, 2018, 2:16pm
#4
(from github.com/burple6 )
You are correct. This was resolved by clearing the crawler indices and crawling again.