How to set http.proxy.host in fess_config.properties?

(from github.com/sho-suzuki)
related issue #1100

Hi. I’m using the Fess 11.2.1 Docker image.
I set fess_config.properties as follows, then ran the GitBucket crawler and got an error.

fess_config.properties

http.proxy.host=proxy_host   (like host.example.com)
http.proxy.port=proxy_port

fess-crawler.log

2017-07-21 13:38:38,290 [AVw08iP-f4ysEdHmb2Lr-1] WARN  Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
java.lang.NumberFormatException: For input string: "host.example.com"
       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_131]
       at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_131]
       at java.lang.Integer.valueOf(Integer.java:766) ~[?:1.8.0_131]

Next, I changed fess_config.properties, but the same error occurs.

fess_config.properties

http.proxy.host=proxy_host_ip   (like 127.0.0.1)
http.proxy.port=proxy_port

fess-crawler.log

2017-07-21 13:38:38,290 [AVw08iP-f4ysEdHmb2Lr-1] WARN  Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
java.lang.NumberFormatException: For input string: "127.0.0.1"
       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_131]
       at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_131]
       at java.lang.Integer.valueOf(Integer.java:766) ~[?:1.8.0_131]

Since http.proxy.host designates a host name or IP address, I think the conversion to Integer is unnecessary.

(from github.com/marevol)
Could you post the full stack trace, not a partial one?

(from github.com/sho-suzuki)
I don’t think it’s a partial issue.
When http.proxy.host is set, the same error always occurs.

When I leave http.proxy.host at its default value, the stack trace is as follows.

http.proxy.host=
http.proxy.port=8080
http.proxy.username=
http.proxy.password=
2017-07-23 00:00:19,196 [20170723000000-2] WARN  Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
org.codelibs.elasticsearch.runner.net.CurlException: Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
        at org.codelibs.elasticsearch.runner.net.CurlRequest.execute(CurlRequest.java:159) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        at org.codelibs.elasticsearch.runner.net.CurlRequest.execute(CurlRequest.java:169) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.getRepositoryList(GitBucketDataStoreImpl.java:205) [classes/:?]
        at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.storeData(GitBucketDataStoreImpl.java:78) [classes/:?]
        at org.codelibs.fess.ds.impl.AbstractDataStoreImpl.store(AbstractDataStoreImpl.java:106) [classes/:?]
        at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.process(DataIndexHelper.java:236) [classes/:?]
        at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.run(DataIndexHelper.java:222) [classes/:?]
Caused by: org.codelibs.elasticsearch.runner.net.CurlException: Failed to access the response.
        at org.codelibs.elasticsearch.runner.net.CurlRequest$1.onResponse(CurlRequest.java:193) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        at org.codelibs.elasticsearch.runner.net.CurlRequest.execute(CurlRequest.java:157) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        ... 6 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_131]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_131]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_131]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_131]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_131]
        at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_131]
        at java.net.Socket.connect(Socket.java:538) ~[?:1.8.0_131]
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180) ~[?:1.8.0_131]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463) ~[?:1.8.0_131]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558) ~[?:1.8.0_131]
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242) ~[?:1.8.0_131]
        at sun.net.www.http.HttpClient.New(HttpClient.java:339) ~[?:1.8.0_131]
        at sun.net.www.http.HttpClient.New(HttpClient.java:357) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1181) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546) ~[?:1.8.0_131]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) ~[?:1.8.0_131]
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[?:1.8.0_131]
        at org.codelibs.elasticsearch.runner.net.CurlRequest$1.onResponse(CurlRequest.java:174) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        at org.codelibs.elasticsearch.runner.net.CurlRequest.execute(CurlRequest.java:157) ~[elasticsearch-cluster-runner-5.4.2.0.jar:?]
        ... 6 more
2017-07-23 00:00:19,199 [20170723000000-2] INFO  There exist 0 repositories
2017-07-23 00:00:19,199 [20170723000000-2] WARN  Token is invalid or no Repository
2017-07-23 00:00:19,205 [20170723000000-2] INFO  Deleted 0 old docs.
2017-07-23 00:00:24,161 [DataStoreCrawler] INFO  [EXEC TIME] crawling time: 10003ms
2017-07-23 00:00:24,162 [main] INFO  Finished Crawler
2017-07-23 00:00:24,228 [main] INFO  [CRAWL INFO] DataCrawlExecTime=10003,DataCrawlEndTime=2017-07-23T00:00:24.161+0900,CrawlerEndTime=2017-07-23T00:00:24.162+0900,DataIndexExecTime=0,CrawlerStatus=true,CrawlerStartTime=2017-07-23T00:00:12.955+0900,WebFsCrawlEndTime=2017-07-23T00:00:14.170+0900,DataIndexSize=0,CrawlerExecTime=11207,DataCrawlStartTime=2017-07-23T00:00:14.109+0900,WebFsCrawlStartTime=2017-07-23T00:00:14.108+0900
2017-07-23 00:00:25,840 [main] INFO  Destroyed LaContainer.

Then I set the proxy host, and the stack trace changed.

http.proxy.host=host.example.com
http.proxy.port=proxy_port
http.proxy.username=
http.proxy.password=
2017-07-23 18:00:18,492 [main] INFO  Lasta Di boot successfully.
2017-07-23 18:00:18,496 [main] INFO    SmartDeploy Mode: Warm Deploy
2017-07-23 18:00:18,496 [main] INFO    Smart Package: org.codelibs.fess.app
2017-07-23 18:00:18,611 [main] INFO  Starting Crawler..
2017-07-23 18:00:19,577 [AVw08iP-f4ysEdHmb2Lr-1] WARN  Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
java.lang.NumberFormatException: For input string: "host.example.com"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_131]
        at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_131]
        at java.lang.Integer.valueOf(Integer.java:766) ~[?:1.8.0_131]
        at org.dbflute.util.DfTypeUtil.doParseStringAsInteger(DfTypeUtil.java:556) ~[dbflute-runtime-1.1.3.jar:?]
        at org.dbflute.util.DfTypeUtil.doConvertToInteger(DfTypeUtil.java:536) ~[dbflute-runtime-1.1.3.jar:?]
        at org.dbflute.util.DfTypeUtil.toInteger(DfTypeUtil.java:520) ~[dbflute-runtime-1.1.3.jar:?]
        at org.dbflute.helper.jprop.ObjectiveProperties.getAsInteger(ObjectiveProperties.java:191) ~[dbflute-runtime-1.1.3.jar:?]
        at org.lastaflute.core.direction.ObjectiveConfig.getAsInteger(ObjectiveConfig.java:210) ~[lastaflute-0.9.8.jar:?]
        at org.codelibs.fess.mylasta.direction.FessConfig$SimpleImpl.getHttpProxyHostAsInteger(FessConfig.java:5331) ~[classes/:?]
        at org.codelibs.fess.mylasta.direction.FessProp.getHttpProxy(FessProp.java:1664) ~[classes/:?]
        at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.getRepositoryList(GitBucketDataStoreImpl.java:205) [classes/:?]
        at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.storeData(GitBucketDataStoreImpl.java:78) [classes/:?]
        at org.codelibs.fess.ds.impl.AbstractDataStoreImpl.store(AbstractDataStoreImpl.java:106) [classes/:?]
        at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.process(DataIndexHelper.java:236) [classes/:?]
        at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.run(DataIndexHelper.java:222) [classes/:?]
2017-07-23 18:00:19,583 [AVw08iP-f4ysEdHmb2Lr-1] INFO  There exist 0 repositories
2017-07-23 18:00:19,583 [AVw08iP-f4ysEdHmb2Lr-1] WARN  Token is invalid or no Repository
2017-07-23 18:00:19,593 [AVw08iP-f4ysEdHmb2Lr-1] INFO  Deleted 0 old docs.
2017-07-23 18:00:24,535 [DataStoreCrawler] INFO  [EXEC TIME] crawling time: 5008ms
2017-07-23 18:00:24,536 [main] INFO  Finished Crawler
2017-07-23 18:00:24,682 [main] INFO  [CRAWL INFO] DataCrawlExecTime=5008,DataCrawlEndTime=2017-07-23T18:00:24.536+0900,CrawlerEndTime=2017-07-23T18:00:24.536+0900,CrawlerStatus=true,DataIndexExecTime=0,CrawlerStartTime=2017-07-23T18:00:18.611+0900,DataIndexSize=0,CrawlerExecTime=5925,DataCrawlStartTime=2017-07-23T18:00:18.663+0900
2017-07-23 18:00:26,542 [main] INFO  Destroyed LaContainer.

(from github.com/marevol)
Thank you for the info.
I needed the full stack trace.
The cause is

    at org.codelibs.fess.mylasta.direction.FessProp.getHttpProxy(FessProp.java:1664) ~[classes/:?]

I’ll fix it.
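
For readers hitting the same error: the trace shows the host value going through getHttpProxyHostAsInteger, which explains why any host name or IP address fails to parse. Below is a minimal sketch of the intended handling, where only the port is parsed as a number and the host stays a string. This is not the actual Fess fix; the class ProxyConfigSketch, the helper buildProxy, and the port 3128 are hypothetical names and values chosen for illustration.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.Properties;

public class ProxyConfigSketch {

    // Hypothetical helper (not Fess code): builds a java.net.Proxy from
    // http.proxy.host and http.proxy.port.
    static Proxy buildProxy(Properties props) {
        // The host is a name or IP address, so it is read as a plain string.
        String host = props.getProperty("http.proxy.host", "").trim();
        if (host.isEmpty()) {
            // An empty host means no proxy is configured.
            return Proxy.NO_PROXY;
        }
        // Only the port is numeric; 8080 mirrors the default shown in this thread.
        int port = Integer.parseInt(props.getProperty("http.proxy.port", "8080").trim());
        // createUnresolved avoids a DNS lookup while building the address.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("http.proxy.host", "host.example.com");
        props.setProperty("http.proxy.port", "3128"); // assumed example port
        InetSocketAddress address = (InetSocketAddress) buildProxy(props).address();
        System.out.println(address.getHostString() + ":" + address.getPort());
    }
}
```

With this shape, a value such as host.example.com or 127.0.0.1 never reaches Integer parsing, so the NumberFormatException above cannot occur for the host setting.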

(from github.com/sho-suzuki)
How can I get a complete stack trace to suit your needs?
I have changed the following settings.

  1. job crawler script
- 	return container.getComponent("crawlJob").logLevel("info").sessionId("AVw08iP-f4ys****").webConfigIds([] as String[]).fileConfigIds([] as String[]).dataConfigIds(["AVw08iP-f4ys*****"] as String[]).jobExecutor(executor).execute();
+ 	return container.getComponent("crawlJob").logLevel("debug").sessionId("AVw08iP-f4ys****").webConfigIds([] as String[]).fileConfigIds([] as String[]).dataConfigIds(["AVw08iP-f4ys****"] as String[]).jobExecutor(executor).execute();

-> Debug logs are written to fess-crawler.log.

  2. fess.in.sh
- FESS_JAVA_OPTS="$FESS_JAVA_OPTS -Dfess.log.level=warn"
+ FESS_JAVA_OPTS="$FESS_JAVA_OPTS -Dfess.log.level=debug"

-> Debug logs are written to fess.log.

Or is there another way?

(from github.com/marevol)
In your first post, the stacktrace was only:

2017-07-21 13:38:38,290 [AVw08iP-f4ysEdHmb2Lr-1] WARN  Failed to access to http://gitbucket:8080/gitbucket/api/v3/fess/repos?offset=0
java.lang.NumberFormatException: For input string: "host.example.com"
       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_131]
       at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_131]
       at java.lang.Integer.valueOf(Integer.java:766) ~[?:1.8.0_131]

I needed:

    at org.codelibs.fess.mylasta.direction.FessProp.getHttpProxy(FessProp.java:1664) ~[classes/:?]

(from github.com/sho-suzuki)
Understood. Thank you in advance!

(from github.com/sho-suzuki)
Is this problem fixed in #1181?

(from github.com/marevol)
Yes.

(from github.com/sho-suzuki)
OK. Thank you. I’ll try it.

(from github.com/sho-suzuki)
@marevol
Thanks a lot.
I can now crawl via the proxy.