Failure URL needs to show referring page

We’re getting some errors in the Failure URL log which we could fix if we could find the errant page.

For example, we’ve got some bad URLs resulting in a java.lang.IllegalArgumentException from org.codelibs.fess.crawler.exception.CrawlingAccessException.

If the details page showed which page had been crawled when it encountered that URL, that would help us fix the problem.

Check fess-crawler.log.

All I can find is this:

2017-05-17 03:25:07,866 [Crawler-20170517000000-9-1] INFO  Crawling URL:
2017-05-17 03:25:10,324 [Crawler-20170517000000-11-2] INFO  Crawling URL:
2017-05-17 03:25:10,395 [Crawler-20170517000000-9-1] INFO  Failed to access to; The url may not be valid:; The url may not be valid:; The url may not be valid:; The url may not be valid:; The url may not be valid:

But that isn’t telling me where it got that original bad URL from …
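For anyone sifting through a large fess-crawler.log, here is a minimal sketch that pairs each "Failed to access" entry with the last "Crawling URL" entry logged by the same crawler thread. It assumes the line layout shown above (timestamp, thread name in brackets, level, message). The function name `failures_by_thread` and the example.com URLs in the usage note are hypothetical. It does not reveal the referring page; it only narrows down which thread and crawl preceded each failure.

```python
import re

# Assumed layout: "<date> <time> [<thread>] <LEVEL>  <message>"
LINE_RE = re.compile(r"^(\S+ \S+) \[([^\]]+)\] (\w+)\s+(.*)$")


def failures_by_thread(lines):
    """Return (thread, last_crawled_url, failure_message) for each failure line."""
    last_crawled = {}  # thread name -> URL from its most recent "Crawling URL:" line
    failures = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip stack traces and other continuation lines
        _timestamp, thread, _level, message = match.groups()
        if message.startswith("Crawling URL:"):
            last_crawled[thread] = message[len("Crawling URL:"):].strip()
        elif message.startswith("Failed to access"):
            failures.append((thread, last_crawled.get(thread), message))
    return failures
```

Usage would be something like `failures_by_thread(open("fess-crawler.log", encoding="utf-8"))`, then printing each tuple.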

To check child links:

  1. Change to in
  2. Restart Fess
  3. Search by anchor:“” or anchor:“*
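Step 3 can be scripted if there are many failing URLs to look up. The sketch below only builds the `anchor:"..."` query string and URL-encodes it as a `q` parameter. The helper names, the `search_endpoint` argument, and the `q` parameter name are assumptions for illustration, not confirmed Fess API details, so adjust them to however your Fess instance accepts queries.

```python
from urllib.parse import urlencode


def anchor_query(url):
    """Build an anchor:"..." query for one failing URL, escaping quotes and backslashes."""
    escaped = url.replace("\\", "\\\\").replace('"', '\\"')
    return f'anchor:"{escaped}"'


def search_url(search_endpoint, url):
    """Append the encoded query to a search endpoint (hypothetical; passed in by the caller)."""
    return search_endpoint + "?" + urlencode({"q": anchor_query(url)})
```

Pages returned for such a query would be the ones whose child links include the bad URL, which is the referring-page information asked for above.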

Thank you for that.

Reading, does this mean I need to wait for version 11.2.0? I’m presuming that issue 1074 adds support for “anchor” as a search property?

In Fess 11.2,

Change to in

is not needed (broken-links reporting will be added in Fess 11.2).
So, I think that it works in current releases if modifying