(from github.com/shwetazilpe)
Hi,
I’m trying to crawl the Redmine website as described in the documentation. I have created an account on www.redmine.org, and my configuration is:
Web Crawling Configuration :
Name = Redmine
URL = https://www.redmine.org/my/page
Web Authentication :
Hostname = www.redmine.org
Port =
Realm =
Scheme = Form
Username= my-username
Password= ******
Parameters= encoding=UTF-8
token_method=GET
token_url=https://www.redmine.org/login
token_pattern=name="authenticity_token" +value="([^"]+)"
token_name=authenticity_token
login_method=POST
login_url=https://www.redmine.org/login
login_parameters=username=${username}&password=${password}
Web Config= Redmine
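
To check the token pattern outside of Fess, a minimal sketch along these lines can be used. It assumes the pattern is applied as a plain regular-expression search over the login page HTML, uses Python's `re` as a stand-in for the Java regex engine, and works on hypothetical HTML snippets (the real markup returned by the login page may differ). The pattern is written with plain ASCII quotes.

```python
import re

# token_pattern from the configuration, written with plain ASCII quotes.
TOKEN_PATTERN = r'name="authenticity_token" +value="([^"]+)"'


def extract_token(html, pattern=TOKEN_PATTERN):
    """Return the first capture group if the pattern is found, else None."""
    match = re.search(pattern, html)
    return match.group(1) if match else None


# Hypothetical login-form snippets, for illustration only.
SAMPLE_MATCHING = '<input type="hidden" name="authenticity_token" value="abc123==" />'
SAMPLE_REORDERED = '<input type="hidden" value="abc123==" name="authenticity_token" />'

if __name__ == "__main__":
    # Attribute order matches the pattern, so the token is captured.
    print(extract_token(SAMPLE_MATCHING))
    # Same field with attributes in a different order: the pattern finds nothing.
    print(extract_token(SAMPLE_REORDERED))
```

If the pattern finds nothing in the HTML actually served at token_url, that would be consistent with a "Token is not found" message, although this sketch does not confirm how Fess evaluates the pattern.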
But I’m getting the following error in the fess_crawler log:
`2019-03-19 17:46:54,918 [WebFsCrawler] INFO org.codelibs.fess.helper.WebFsIndexHelper - Target URL: https://www.redmine.org/my/page
2019-03-19 17:46:54,918 [WebFsCrawler] DEBUG org.codelibs.fess.helper.WebFsIndexHelper - Crawling https://www.redmine.org/my/page
2019-03-19 17:46:54,932 [IndexUpdater] DEBUG org.codelibs.fess.indexer.IndexUpdater - Starting indexUpdater.
2019-03-19 17:46:54,970 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.crawler.service.impl.EsUrlQueueService - Queued URL: [UrlQueueImpl [id=20190319174625-1.aHR0cHM6Ly93d3cucmVkbWluZS5vcmcvbXkvcGFnZQ, sessionId=20190319174625-1, method=GET, url=https://www.redmine.org/my/page, encoding=null, parentUrl=null, depth=0, lastModified=0, createTime=1552997814730]]
2019-03-19 17:46:55,232 [Crawler-20190319174625-1-1] INFO org.codelibs.fess.crawler.helper.impl.LogHelperImpl - Crawling URL: https://www.redmine.org/my/page
2019-03-19 17:46:55,234 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.crawler.FessCrawlerThread - Searching indexed document: https:%2F%2Fwww.redmine.org%2Fmy%2Fpage;role=Rguest
2019-03-19 17:46:55,238 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.es.client.FessEsClient - Query DSL:
{"timeout":"10000ms","query":{"ids":{"type":["doc"],"values":["https:%2F%2Fwww.redmine.org%2Fmy%2Fpage;role=Rguest"],"boost":1.0}},"version":true,"_source":{"includes":["_id","last_modified","anchor","segment","expires","click_count","favorite_count"],"excludes":[]}}
2019-03-19 17:46:55,262 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.es.client.FessEsClient - Query DSL:
{"size":0,"timeout":"10000ms","query":{"term":{"parent_id":{"value":"https:%2F%2Fwww.redmine.org%2Fmy%2Fpage;role=Rguest","boost":1.0}}},"_source":{"includes":["url"],"excludes":[]}}
2019-03-19 17:46:55,266 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.crawler.helper.impl.LogHelperImpl - Getting the content from URL: https://www.redmine.org/my/page
2019-03-19 17:46:55,279 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.crawler.client.http.HcHttpClient - Initializing org.codelibs.fess.crawler.client.http.HcHttpClient
2019-03-19 17:46:56,668 [Crawler-20190319174625-1-1] DEBUG org.codelibs.fess.crawler.client.http.form.FormScheme - Token is not found.
.
.
.
2019-03-19 17:46:56,832 [Crawler-20190319174625-1-1] WARN org.codelibs.fess.crawler.client.http.form.FormScheme - Failed to login on https://www.redmine.org/login. The http status is 422.`
Where am I going wrong? Did I miss any other configuration?
Thanks in advance!