I have also checked the Elasticsearch log files.
There were several errors, and I tried changing and reviewing the configuration, but the situation did not improve.
Below is the Elasticsearch log from today's attempt.
I would appreciate your continued assistance.
* Note: due to the character limit, this is only an excerpt of the log.
[2022-01-12T17:45:47,081][WARN ][o.e.t.TcpTransport ] [fess02] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.4.163.51:58592, remoteAddress=10.4.162.44/10.4.162.44:9300, profile=default}], closing connection
java.io.IOException: 既存の接続はリモート ホストに強制的に切断されました。 (An existing connection was forcibly closed by the remote host.)
at sun.nio.ch.SocketDispatcher.read0(Native Method) ~[?:?]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) ~[?:?]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276) ~[?:?]
at sun.nio.ch.IOUtil.read(IOUtil.java:233) ~[?:?]
at sun.nio.ch.IOUtil.read(IOUtil.java:223) ~[?:?]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:356) ~[?:?]
at org.elasticsearch.transport.CopyBytesSocketChannel.readFromSocketChannel(CopyBytesSocketChannel.java:130) ~[transport-netty4-client-7.12.1.jar:7.12.1]
at org.elasticsearch.transport.CopyBytesSocketChannel.doReadBytes(CopyBytesSocketChannel.java:115) ~[transport-netty4-client-7.12.1.jar:7.12.1]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
[2022-01-12T17:45:47,112][INFO ][o.e.c.r.a.AllocationService] [fess02] updating number_of_replicas to [2] for indices [.configsync]
[2022-01-12T17:45:47,128][INFO ][o.e.c.s.MasterService ] [fess02] node-left[{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true} reason: disconnected], term: 37, version: 2487785, delta: removed {{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true}}
[2022-01-12T17:45:47,519][INFO ][o.e.c.s.ClusterApplierService] [fess02] removed {{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true}}, term: 37, version: 2487785, reason: Publication{term=37, version=2487785}
[2022-01-12T17:45:47,534][INFO ][o.e.c.r.DelayedAllocationService] [fess02] scheduling reroute for delayed shards in [59.5s] (46 delayed shards)
[2022-01-12T17:45:49,925][INFO ][o.e.c.r.a.AllocationService] [fess02] updating number_of_replicas to [3] for indices [.configsync]
[2022-01-12T17:45:49,925][INFO ][o.e.c.s.MasterService ] [fess02] node-join[{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true} join existing leader], term: 37, version: 2487787, delta: added {{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true}}
[2022-01-12T17:45:50,409][INFO ][o.e.c.s.ClusterApplierService] [fess02] added {{fess03}{ZV5sqAofSeurCgV17nOwcw}{UEYX8-udTT603tciL_zheQ}{10.4.162.44}{10.4.162.44:9300}{cdfhilmrstw}{ml.machine_memory=16105050112, ml.max_open_jobs=20, xpack.installed=true, ml.max_jvm_size=6416302080, transform.node=true}}, term: 37, version: 2487787, reason: Publication{term=37, version=2487787}
[2022-01-12T17:46:08,222][WARN ][o.e.c.r.a.AllocationService] [fess02] failing shard [failed shard, shard [.configsync][0], node[ZV5sqAofSeurCgV17nOwcw], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=XFMdifmSS6eqWAigA3dlHw], unassigned_info[[reason=REPLICA_ADDED], at[2022-01-12T08:45:49.925Z], delayed=false, allocation_status[no_attempt]], expected_shard_size[114178], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[.configsync][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [17982ms]]; ], markAsStale [true]]
java.io.IOException: failed to obtain in-memory shard lock
at org.elasticsearch.index.IndexService.createShard(IndexService.java:488) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:776) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:170) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:565) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:542) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:230) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:499) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:489) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:460) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:407) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.access$000(ClusterApplierService.java:57) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:151) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673) [elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) [elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) [elasticsearch-7.12.1.jar:7.12.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [.configsync][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [17982ms]
at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:787) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:693) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.index.IndexService.createShard(IndexService.java:406) ~[elasticsearch-7.12.1.jar:7.12.1]
... 17 more
[2022-01-12T17:46:18,565][WARN ][o.e.c.r.a.AllocationService] [fess02] failing shard [failed shard, shard [.configsync][0], node[ZV5sqAofSeurCgV17nOwcw], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=C2DTL-AxTdCVeFR4Y_1flA], unassigned_info[[reason=ALLOCATION_FAILED], at[2022-01-12T08:46:08.222Z], failed_attempts[1], failed_nodes[[ZV5sqAofSeurCgV17nOwcw]], delayed=false, details[failed shard on node [ZV5sqAofSeurCgV17nOwcw]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[.configsync][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [17982ms]]; ], allocation_status[no_attempt]], expected_shard_size[114178], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[.configsync][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [28435ms]]; ], markAsStale [true]]
java.io.IOException: failed to obtain in-memory shard lock
at org.elasticsearch.index.IndexService.createShard(IndexService.java:488) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:776) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:170) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:565) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:542) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:230) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:499) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:489) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:460) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:407) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService.access$000(ClusterApplierService.java:57) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:151) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673) [elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) [elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) [elasticsearch-7.12.1.jar:7.12.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [.configsync][0]: obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [28435ms]
at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:787) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:693) ~[elasticsearch-7.12.1.jar:7.12.1]
at org.elasticsearch.index.IndexService.createShard(IndexService.java:406) ~[elasticsearch-7.12.1.jar:7.12.1]
... 17 more
For what it is worth, although errors do appear on the Elasticsearch side, the service itself is not failing to start, and the cluster status is Green.
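For reference, the status check I am using is roughly the following request (a minimal check; the host and the default HTTP port 9200 are assumptions on my part, since only the transport port 9300 appears in the logs above). The "status" field in the response comes back as "green".

curl -s 'http://localhost:9200/_cluster/health?pretty'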
I also wondered whether the process had died during crawling, but following the logs it looks as though the crawl completes; it also feels as though Fess is failing to read the Elasticsearch files or the JDK files.
I will keep investigating for a while longer, and if that still does not resolve it, I will rebuild the environment and increase the heap size, which is how a similar case was handled in the past, and see how it goes.
* That said, current heap usage is below 10% on each server, so I do not expect this to be a fundamental fix, and I am concerned the problem will recur.
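For reference, the heap-usage figure comes from a check along the following lines (again, the host and default HTTP port 9200 are assumptions on my part), and the planned heap change would be a simple edit to config/jvm.options on each node; the 8g value below is only an illustrative number, not a setting I have decided on:

curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.current,heap.max'

# config/jvm.options (example values only)
-Xms8g
-Xmx8g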
That is all for now.
Thank you again for your continued support.