Processing no docs

(from github.com/Anders-Bergqvist)
In fess-crawler.log I first see the website being crawled, but then there are hours of [IndexUpdater] INFO Processing no docs (Doc:{access 5ms, cleanup 20ms}, Mem:{used 165MB, heap 512MB, max 512MB})

What does this mean? Has the crawler completed the website crawl, with nothing else left to crawl? Why does it not stop and send the mail saying that the crawl is completed?

(from github.com/marevol)
What is your crawling configuration?

(from github.com/Anders-Bergqvist)
General
image

Web Crawler 1

Web Crawler 2

(from github.com/marevol)
What are the target URLs and filters?

(from github.com/Anders-Bergqvist)
Crawler1


Crawler2


(from github.com/marevol)
The canonical links on this site may not be correct.
To ignore canonical links, you can add config.html.canonical.xpath= to Config Parameters.

(from github.com/Anders-Bergqvist)
Canonicalization is an important part of our website architecture. It works fine with Google.com and other search engines. We can’t just turn it off. If we did, the pages that turn up in a Fess search would be doubled, and half of them would be copies.

(from github.com/marevol)
Could you attach fess-crawler.log with debug level?

(from github.com/Anders-Bergqvist)
Yes, I have put it in debug mode now. I noticed that when the crawler starts reporting “Processing no docs”, and does so for several hours, the last page crawled is always the final page of one of our paginated news archives. The meta tags in it are:

Could the issue be that the crawler comes to a dead end, having already crawled all the links on the current page? Or is it that we have set meta noindex? We don’t want to index the list page itself, only the pages being listed.

The pages after which the crawler started reporting “Processing no docs” are (each one is the last page of a paginated list):
https://www.oru.se/nyheter/nyhetsarkiv/nyhetsarkiv-2016/?page=26
https://www.oru.se/nyheter/nyhetsarkiv/nyhetsarkiv-2015/?page=21
https://www.oru.se/nyheter/nyhetsarkiv/nyhetsarkiv-2014/?page=12

(from github.com/marevol)
Hmm, these URLs seem to work.
Some other pages may be blocking the crawl.
If so, a thread dump shows up in fess-crawler.log at the end of crawling.

(from github.com/Anders-Bergqvist)
It seems that the crawler sometimes goes on until “emptyListCount is over 3600”, BUT sometimes it only reports “Processing no docs” about thirty times before it ends crawling.

See the end of fess-crawler.log from a run where it reports "Processing no docs" for several hours:

2019-08-20 22:46:28,220 [CoreLib-TimeoutManager] INFO  [SYSTEM MONITOR] {"os":{"memory":{"physical":{"free":3239723008,"total":16820518912},"swap_space":{"free":4294963200,"total":4294963200}},"cpu":{"percent":0},"load_averages":[0.02, 0.03, 0.01]},"process":{"file_descriptor":{"open":251,"max":65535},"cpu":{"percent":0,"total":1911270},"virtual_memory":{"total":2293112832}},"jvm":{"memory":{"heap":{"used":194915952,"committed":536870912,"max":536870912,"percent":36},"non_heap":{"used":152216568,"committed":224931840}},"pools":{"mapped":{"count":0,"used":0,"capacity":0},"direct":{"count":2,"used":16384,"capacity":16384}},"gc":{"young":{"count":23923,"time":123737},"old":{"count":80,"time":17590}},"threads":{"count":26,"peak":32},"classes":{"loaded":17727,"total_loaded":371992,"unloaded":354265},"uptime":81987940},"elasticsearch":{"nodes":{"XRQNfCDlTC2RX4o5ZU5QTQ":{"timestamp":1566333988216,"name":"fess-01-test","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0.1:9300","roles":["master","data","ingest"],"os":{"timestamp":1566333988217,"cpu":{"percent":0,"load_average":{"1m":0.02,"5m":0.03,"15m":0.01}},"mem":{"total_in_bytes":16820518912,"free_in_bytes":3239723008,"used_in_bytes":13580795904,"free_percent":19,"used_percent":81},"swap":{"total_in_bytes":4294963200,"free_in_bytes":4294963200,"used_in_bytes":0},"cgroup":{"cpuacct":{"control_group":"/system.slice/elasticsearch.service","usage_nanos":248620501781106},"cpu":{"control_group":"/system.slice/elasticsearch.service","cfs_period_micros":100000,"cfs_quota_micros":-1,"stat":{"number_of_elapsed_periods":0,"number_of_times_throttled":0,"time_throttled_nanos":0}},"memory":{"control_group":"/system.slice/elasticsearch.service","limit_in_bytes":"9223372036854771712","usage_in_bytes":"7928774656"}}},"process":{"timestamp":1566333988217,"open_file_descriptors":1231,"max_file_descriptors":65535,"cpu":{"percent":0,"total_in_millis":248611530},"mem":{"total_virtual_in_bytes":4837883904}},"jvm":{"timestamp
":1566333988219,"uptime_in_millis":5298433017,"mem":{"heap_used_in_bytes":407490888,"heap_used_percent":38,"heap_committed_in_bytes":1056309248,"heap_max_in_bytes":1056309248,"non_heap_used_in_bytes":0,"non_heap_committed_in_bytes":144976544,"pools":{"young":{"used_in_bytes":128591960,"max_in_bytes":139591680,"peak_used_in_bytes":139591680,"peak_max_in_bytes":139591680},"survivor":{"used_in_bytes":1488096,"max_in_bytes":17432576,"peak_used_in_bytes":17432576,"peak_max_in_bytes":17432576},"old":{"used_in_bytes":277423184,"max_in_bytes":899284992,"peak_used_in_bytes":694867512,"peak_max_in_bytes":899284992}}},"threads":{"count":84,"peak_count":90},"gc":{"collectors":{"young":{"collection_count":1060104,"collection_time_in_millis":9401842},"old":{"collection_count":1126,"collection_time_in_millis":50012}}},"buffer_pools":{"mapped":{"count":537,"used_in_bytes":800608942,"total_capacity_in_bytes":800608942},"direct":{"count":456,"used_in_bytes":11469873,"total_capacity_in_bytes":11469872}},"classes":{"current_loaded_count":17811,"total_loaded_count":0,"total_unloaded_count":118}},"thread_pool":{"analyze":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":10756887},"ccr":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"fetch_shard_started":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":4,"completed":89},"fetch_shard_store":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"flush":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":80135},"force_merge":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"generic":{"threads":42,"queue":0,"active":0,"rejected":0,"largest":42,"completed":18736130},"get":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":104327313},"listener":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"management":{"threads":5,"queue":0,"active":1,"rejected":0,"largest":5,"completed":2360083
8},"ml_autodetect":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"ml_datafeed":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"ml_utility":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":1},"refresh":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":217018790},"rollup_indexing":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"search":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":1244698537},"search_throttled":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"security-token-key":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"snapshot":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"warmer":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"watcher":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"write":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":4880347}},"fs":{"timestamp":1566333988219,"total":{"total_in_bytes":527367929856,"free_in_bytes":502146899968,"available_in_bytes":475286736896},"least_usage_estimate":{"path":"/var/lib/elasticsearch/nodes/0","total_in_bytes":527367929856,"available_in_bytes":475286740992,"used_disk_percent":9.875683733408096},"most_usage_estimate":{"path":"/var/lib/elasticsearch/nodes/0","total_in_bytes":527367929856,"available_in_bytes":475286740992,"used_disk_percent":9.875683733408096},"data":[{"mount":"/ (/dev/sda2)","total_in_bytes":527367929856,"free_in_bytes":502146899968,"available_in_bytes":475286736896}]},"transport":{"server_open":0,"rx_count":0,"rx_size_in_bytes":0,"tx_count":0,"tx_size_in_bytes":0},"http":{"current_open":2,"total_opened":562895}}}},"timestamp":1566333988220}
2019-08-20 22:46:34,903 [IndexUpdater] INFO  Processing no docs (Doc:{access 4ms, cleanup 20ms}, Mem:{used 186MB, heap 512MB, max 512MB})
2019-08-20 22:46:44,909 [IndexUpdater] INFO  Processing no docs (Doc:{access 10ms, cleanup 20ms}, Mem:{used 187MB, heap 512MB, max 512MB})
2019-08-20 22:46:54,902 [IndexUpdater] INFO  Processing no docs (Doc:{access 3ms, cleanup 20ms}, Mem:{used 187MB, heap 512MB, max 512MB})
2019-08-20 22:47:04,902 [IndexUpdater] INFO  Processing no docs (Doc:{access 3ms, cleanup 20ms}, Mem:{used 188MB, heap 512MB, max 512MB})
2019-08-20 22:47:14,901 [IndexUpdater] INFO  Processing no docs (Doc:{access 2ms, cleanup 20ms}, Mem:{used 188MB, heap 512MB, max 512MB})
2019-08-20 22:47:14,925 [IndexUpdater] INFO  Terminating indexUpdater. emptyListCount is over 3600.
2019-08-20 22:47:14,955 [IndexUpdater] INFO  Thread: Thread[eshttp,5,main]
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1628)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  Thread: Thread[WebFsCrawler,5,main]
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.join(Thread.java:1313)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.Crawler.awaitTermination(Crawler.java:125)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at app//org.codelibs.fess.helper.WebFsIndexHelper.doCrawl(WebFsIndexHelper.java:418)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at app//org.codelibs.fess.helper.WebFsIndexHelper.crawl(WebFsIndexHelper.java:81)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.lambda$doCrawl$7(Crawler.java:474)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler$$Lambda$601/0x00000008405ca040.run(Unknown Source)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  Thread: Thread[CommandGeneratorDestoryTimer-1566252004964,5,main]
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Object.java:328)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.mainLoop(Timer.java:527)
2019-08-20 22:47:14,955 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.run(Timer.java:506)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  Thread: Thread[CommandGeneratorDestoryTimer-1566252004971,5,main]
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Object.java:328)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.mainLoop(Timer.java:527)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.run(Timer.java:506)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  Thread: Thread[ProcessCommand,5,main]
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.sleep(Native Method)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.lambda$main$3(Crawler.java:236)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler$$Lambda$547/0x00000008405b0040.run(Unknown Source)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  Thread: Thread[CommandGeneratorDestoryTimer-1566252004987,5,main]
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Object.java:328)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.mainLoop(Timer.java:527)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.run(Timer.java:506)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  Thread: Thread[ThumbnailGenerator,5,main]
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:458)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at app//org.codelibs.fess.thumbnail.ThumbnailManager.lambda$init$0(ThumbnailManager.java:119)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at app//org.codelibs.fess.thumbnail.ThumbnailManager$$Lambda$501/0x0000000840592440.run(Unknown Source)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  Thread: Thread[Crawler-20190820000000-2-5,5,Crawler-20190820000000-2]
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.net.SocketInputStream.socketRead0(Native Method)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.net.SocketInputStream.read(SocketInputStream.java:168)
2019-08-20 22:47:14,956 [IndexUpdater] INFO  	at java.base@11.0.4/java.net.SocketInputStream.read(SocketInputStream.java:140)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at java.base@11.0.4/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:448)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at java.base@11.0.4/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at java.base@11.0.4/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1104)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at java.base@11.0.4/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:823)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
2019-08-20 22:47:14,965 [IndexUpdater] INFO  	at app//org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.http.HcHttpClient.executeHttpClient(HcHttpClient.java:866)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.http.HcHttpClient.processHttpMethod(HcHttpClient.java:690)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.http.HcHttpClient.doHttpMethod(HcHttpClient.java:653)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.http.HcHttpClient.doGet(HcHttpClient.java:612)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.AbstractCrawlerClient.execute(AbstractCrawlerClient.java:142)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.client.FaultTolerantClient.execute(FaultTolerantClient.java:67)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.CrawlerThread.run(CrawlerThread.java:164)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  Thread: Thread[IndexUpdater,10,main]
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.dumpThreads(Native Method)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.getAllStackTraces(Thread.java:1657)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.util.ThreadDumpUtil.processThreadDump(ThreadDumpUtil.java:66)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.util.ThreadDumpUtil.printThreadDump(ThreadDumpUtil.java:39)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.indexer.IndexUpdater.run(IndexUpdater.java:273)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  Thread: Thread[Finalizer,8,system]
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:170)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  Thread: Thread[main,5,main]
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.join(Thread.java:1305)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.join(Thread.java:1379)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.joinCrawlerThread(Crawler.java:524)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.doCrawl(Crawler.java:490)
2019-08-20 22:47:14,966 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.process(Crawler.java:345)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at app//org.codelibs.fess.exec.Crawler.main(Crawler.java:264)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[Crawler-20190820000000-2,5,main]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.join(Thread.java:1305)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.join(Thread.java:1379)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at app//org.codelibs.fess.crawler.Crawler.run(Crawler.java:243)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[Common-Cleaner,8,InnocuousThreadGroup]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:148)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:134)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[Keep-Alive-Timer,8,InnocuousThreadGroup]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.sleep(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/sun.net.www.http.KeepAliveCache.run(KeepAliveCache.java:168)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:134)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[Signal Dispatcher,9,system]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[eshttp,5,main]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1628)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[eshttp,5,main]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:275)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1619)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[CoreLib-TimeoutManager,5,main]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.sleep(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at app//org.codelibs.core.timer.TimeoutManager.run(TimeoutManager.java:150)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  Thread: Thread[Reference Handler,10,system]
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
2019-08-20 22:47:14,967 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.Reference.processPendingReferences(Reference.java:241)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:213)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  Thread: Thread[CommandGeneratorDestoryTimer-1566252004948,5,main]
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Object.java:328)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.mainLoop(Timer.java:527)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.util.TimerThread.run(Timer.java:506)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  Thread: Thread[Java2D Disposer,10,system]
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Object.wait(Native Method)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.desktop@11.0.4/sun.java2d.Disposer.run(Disposer.java:144)
2019-08-20 22:47:14,968 [IndexUpdater] INFO  	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
2019-08-20 22:47:14,971 [IndexUpdater] INFO  [EXEC TIME] index update time: 34223617ms
2019-08-20 22:47:18,046 [WebFsCrawler] INFO  [EXEC TIME] crawling time: 82032836ms
2019-08-20 22:47:18,047 [main] INFO  Finished Crawler
2019-08-20 22:47:18,082 [main] INFO  [CRAWL INFO] DataCrawlEndTime=2019-08-20T00:00:05.207+0200,CrawlerEndTime=2019-08-20T22:47:18.048+0200,WebFsCrawlExecTime=82032836,CrawlerStatus=false,CrawlerStartTime=2019-08-20T00:00:05.111+0200,WebFsCrawlEndTime=2019-08-20T22:47:18.047+0200,CrawlerErrors=QueueTimeout,WebFsIndexExecTime=34223617,WebFsIndexSize=17476,CrawlerExecTime=82032937,DataCrawlStartTime=2019-08-20T00:00:05.192+0200,WebFsCrawlStartTime=2019-08-20T00:00:05.182+0200
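
The closing [CRAWL INFO] line packs the run summary into comma-separated key=value pairs. As a rough reading aid, here is a minimal Python sketch that splits such a line into a dictionary and converts the millisecond counters to hours. The function names are made up for illustration and are not part of Fess; the sketch only assumes the line format visible in the log above.

```python
def parse_crawl_info(line):
    """Split a '[CRAWL INFO] key=value,key=value,...' log line into a dict.

    Returns an empty dict when the marker is not present.
    """
    _, _, payload = line.partition("[CRAWL INFO]")
    fields = {}
    for pair in payload.strip().split(","):
        key, sep, value = pair.partition("=")
        if sep:  # skip fragments that are not key=value
            fields[key.strip()] = value.strip()
    return fields


def ms_to_hours(ms):
    """Convert a millisecond counter (given as str or int) to hours."""
    return int(ms) / 3_600_000


# Shortened copy of the summary line from the log above.
sample = (
    "2019-08-20 22:47:18,082 [main] INFO  [CRAWL INFO] "
    "WebFsCrawlExecTime=82032836,CrawlerStatus=false,"
    "CrawlerErrors=QueueTimeout,WebFsIndexExecTime=34223617,"
    "WebFsIndexSize=17476"
)
info = parse_crawl_info(sample)
print(info["CrawlerErrors"], info["CrawlerStatus"])
print(round(ms_to_hours(info["WebFsCrawlExecTime"]), 1), "hours crawling")
```

Read this way, the run above reports CrawlerStatus=false with CrawlerErrors=QueueTimeout after a crawl time of close to 23 hours.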

(from github.com/marevol)
“Processing no docs” means that the Indexer cannot get a document from the indexing queue while the Crawler is still running. So the Crawler is blocked (it is accessing a web page, but the web server does not return a response).
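
To make the mechanism easier to picture, here is a small Python sketch of an indexer loop that counts consecutive empty polls and gives up past a threshold. It is an illustration only, not Fess's actual IndexUpdater code: the function name, its parameters and the return values are assumptions. The one thing taken from the log is the idea of a limit on empty polls ("emptyListCount is over 3600"); since the log lines are 10 seconds apart, 3600 empty polls in a row would add up to about ten hours.

```python
import queue


def run_indexer(docs, crawler_running, max_empty_polls):
    """Drain `docs` until the crawler stops or too many polls in a row are empty.

    docs            -- a queue.Queue filled by the crawler threads
    crawler_running -- callable returning True while any crawler thread is alive
    max_empty_polls -- hypothetical counterpart of the 3600 limit in the log
    Returns (indexed_docs, reason).
    """
    indexed = []
    empty_polls = 0
    while True:
        try:
            indexed.append(docs.get_nowait())
            empty_polls = 0  # any document resets the counter
            continue
        except queue.Empty:
            pass
        if not crawler_running():
            return indexed, "crawler finished"
        # Queue is empty but the crawler is still alive: this is the
        # situation in which "Processing no docs" would be logged.
        empty_polls += 1
        if empty_polls > max_empty_polls:
            return indexed, "queue timeout"
        # A real loop would sleep between polls; omitted to keep the sketch fast.
```

In this sketch, a crawler thread that hangs forever on one request keeps `crawler_running()` true, so the loop can only end through the empty-poll limit.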

2019-08-20 22:47:14,956 [IndexUpdater] INFO Thread: Thread[Crawler-20190820000000-2-5,5,Crawler-20190820000000-2]

This thread seems to be blocked.
The cause is the last URL accessed by this thread.
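
For long dumps, picking out such threads by eye gets tedious. The following Python sketch scans the dump lines and lists the threads whose top stack frame contains a given method name, socketRead0 by default. It assumes the "Thread: Thread[name,priority,group]" and "at ..." line shapes shown in the log above; the function name is made up for this example. A thread sitting in a socket read at the moment of the dump is only a candidate, not proof of a hang.

```python
import re

# "Thread: Thread[Crawler-20190820000000-2-5,5,Crawler-20190820000000-2]"
THREAD_RE = re.compile(r"Thread: Thread\[(?P<name>.+?),\d+,[^\]]*\]")
# "\tat java.base@11.0.4/java.net.SocketInputStream.socketRead0(Native Method)"
FRAME_RE = re.compile(r"\bat (?P<frame>\S+\(.*\))\s*$")


def threads_with_top_frame(lines, needle="socketRead0"):
    """Return names of dumped threads whose first stack frame contains `needle`."""
    hits = []
    current = None  # thread whose first frame we have not seen yet
    for line in lines:
        thread = THREAD_RE.search(line)
        if thread:
            current = thread.group("name")
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            if needle in frame.group("frame"):
                hits.append(current)
            current = None  # only the top frame is inspected
    return hits
```

Fed the dump above, for example via `threads_with_top_frame(open("fess-crawler.log", encoding="utf-8"))`, this would be expected to single out the Crawler-20190820000000-2-5 thread.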

(from github.com/Anders-Bergqvist)
Is there a way to see which URL that was?

(from github.com/marevol)
You can search for Crawler-20190820000000-2-5 in fess-crawler.log.
It’s the thread name.
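
Searching by hand works, but a short script can pull out the last URL a given thread started on. This Python sketch assumes the "[thread-name] INFO Crawling URL: ..." line format that fess-crawler.log shows in this thread; the function name is hypothetical.

```python
import re

# Matches e.g. "... [Crawler-20190820000000-2-5] INFO Crawling URL: https://..."
CRAWL_RE = re.compile(
    r"\[(?P<thread>[^\]]+)\]\s+INFO\s+Crawling URL:\s*(?P<url>\S+)"
)


def last_crawled_url(lines, thread_name):
    """Return the last URL that `thread_name` logged as 'Crawling URL', or None."""
    last = None
    for line in lines:
        match = CRAWL_RE.search(line)
        if match and match.group("thread") == thread_name:
            last = match.group("url")  # keep overwriting; the final one wins
    return last
```

A typical call would be `last_crawled_url(open("fess-crawler.log", encoding="utf-8"), "Crawler-20190820000000-2-5")`. If the thread really is stuck, the URL returned is the request it never finished.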

(from github.com/Anders-Bergqvist)
It’s this URL, but I can’t see anything wrong with it.
Line 5940: 2019-08-20 01:00:19,842 [Crawler-20190820000000-2-5] INFO Crawling URL: https://www.oru.se/personal/kristin_ewins

(from github.com/Anders-Bergqvist)
I see that the thread has problems with our profile pages. Could it be that the crawler can’t deal with an <a href="mailto:someperson"> link? When I inspect the mail link in Chrome and click it, the site URL is prepended to the mail address, so that this:


Becomes this:
image
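
One possible explanation for the prepended site URL, offered only as an assumption since the actual markup is not shown here: standard URL resolution leaves an href alone when it carries its own scheme such as mailto:, but treats an href without a scheme as a path relative to the current page. The Python sketch below shows that behaviour with urllib.parse.urljoin. The address someperson@example.com is a placeholder, and the sketch says nothing about how the Fess crawler itself handles such links or whether this is what blocks the thread.

```python
from urllib.parse import urljoin


def resolve_href(page_url, href):
    """Resolve an href found on page_url the way standard URL joining does."""
    return urljoin(page_url, href)


page = "https://www.oru.se/personal/kristin_ewins"

# With the mailto: scheme the link is absolute and is returned unchanged.
print(resolve_href(page, "mailto:someperson@example.com"))

# Without a scheme the address is treated as a relative path,
# so the site URL ends up in front of it.
print(resolve_href(page, "someperson@example.com"))
```

If the second form matches what the browser shows, the href in the page source is probably missing or mangling the mailto: prefix, which is worth checking in the HTML itself.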