(from github.com/pcolmer)
We’re getting some errors in the Failure URL log which we could fix if we could find the errant page.
For example, we’ve got some bad URLs resulting in a java.lang.IllegalArgumentException from org.codelibs.fess.crawler.exception.CrawlingAccessException.
If the details page showed which page had been crawled when it encountered that URL, that would help us fix the problem.
2017-05-17 03:25:07,866 [Crawler-20170517000000-9-1] INFO Crawling URL: http://www.96boards.org/register/ http://www.96boards.org/register/
2017-05-17 03:25:10,324 [Crawler-20170517000000-11-2] INFO Crawling URL: http://connect.linaro.org/tag/narayana-prasad-athreya/
2017-05-17 03:25:10,395 [Crawler-20170517000000-9-1] INFO Failed to access to http://www.96boards.org/register/ http://www.96boards.org/register/; The url may not be valid: http://www.96boards.org/register/ http://www.96boards.org/register/; The url may not be valid: http://www.96boards.org/register/ http://www.96boards.org/register/; The url may not be valid: http://www.96boards.org/register/ http://www.96boards.org/register/; The url may not be valid: http://www.96boards.org/register/ http://www.96boards.org/register/; The url may not be valid: http://www.96boards.org/register/ http://www.96boards.org/register/
But that isn’t telling me where it got that original bad URL from …
Reading https://github.com/codelibs/fess/issues/1074, does this mean I need to wait for version 11.2.0? I’m presuming that issue 1074 adds support for “anchor” as a search property?
Set query.additional.search.fields=anchor in fess_config.properties.
Waiting for 11.2.0 is not needed (broken-links reporting will be added in Fess 11.2).
So I think it works in current releases once query.additional.search.fields is modified.
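
As a rough sketch of that setting, assuming the stock fess_config.properties is edited and Fess is restarted afterwards (the file's location depends on how Fess was installed). The query in the comment is an assumption about how the field could then be searched; it is not confirmed in this thread:

```properties
# fess_config.properties
# Make the "anchor" field searchable in queries.
# (Assumption: "anchor" holds the links extracted from each crawled page.)
query.additional.search.fields=anchor

# Assumed usage after a restart: look for pages that link to the failing URL, e.g.
#   anchor:"http://www.96boards.org/register/"
```

If that assumption holds, the pages returned by such a query would be the candidates that contain the bad link.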