Is it possible to crawl data from two Fess instances into different indices?

discuss · November 12, 2019, 5:44am

(from github.com/cross1154)
Hi,
I have two Fess instances working on one Elasticsearch cluster.
I want to crawl data into different indices at the same time, like:
FessA >> IndexA
FessB >> IndexB

I changed fess_config.properties as below:
FessA:
index.document.update.index=alias1 (bound to IndexA)
FessB:
index.document.update.index=alias2 (bound to IndexB)

and fess.update is bound to IndexC.

But when I restart the Fess server and run a crawling job, all the data ends up in IndexC.
Is it possible to crawl data into different indices at the same time with two instances?
If so, how do I configure it?

discuss · November 12, 2019, 6:49am

(from github.com/cross1154)
More information:
Fess: 12.3.2
Elasticsearch: 6.4.2

discuss · November 12, 2019, 9:27pm

(from github.com/marevol)
Fess uses index.document.search.index for searching.
You need to change all the following settings.

index.document.search.index=fess.search
index.document.update.index=fess.update
index.document.suggest.index=fess
index.document.crawler.index=.crawler
index.config.index=.fess_config
index.user.index=.fess_user
index.log.index=fess_log
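
To see which of these settings two instances still share, their fess_config.properties files can be compared. A minimal sketch in Python, assuming plain key=value lines; the helper names parse_properties and shared_index_settings are hypothetical and not part of Fess:

```python
# The index-name settings listed above.
INDEX_KEYS = [
    "index.document.search.index",
    "index.document.update.index",
    "index.document.suggest.index",
    "index.document.crawler.index",
    "index.config.index",
    "index.user.index",
    "index.log.index",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def shared_index_settings(props_a, props_b):
    """Return the index keys that have the same value in both instances."""
    return [
        key for key in INDEX_KEYS
        if key in props_a and props_a.get(key) == props_b.get(key)
    ]


# Example values taken from the first post in this thread.
fess_a = parse_properties("""
index.document.search.index=fess.search
index.document.update.index=alias1
""")
fess_b = parse_properties("""
index.document.search.index=fess.search
index.document.update.index=alias2
""")

print(shared_index_settings(fess_a, fess_b))
```

With the values from the first post, only index.document.search.index is reported as shared, because the update index is the one setting that was changed.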

discuss · November 13, 2019, 7:31am

(from github.com/cross1154)
Thanks for your answer.
Suppose I use one Fess instance for search only (for example, FessZ)
and use FessA and FessB for crawling only (into IndexA and IndexB).
fess.search would be bound to both IndexA and IndexB,
and FessZ, FessA, and FessB would share the same .fess_user and fess.search in the Elasticsearch cluster.
If this is possible, could you tell me the minimal change needed so that FessA and FessB crawl data into different indices?

discuss · November 13, 2019, 11:39am

(from github.com/marevol)
Why do you want to split an index?

discuss · November 14, 2019, 1:11am

(from github.com/cross1154)
I want to crawl multiple domains
(crawl the data of file server A into IndexA,
and the data of file server B into IndexB).
There are also batch jobs that modify the data in the different indices every evening
(the modification methods differ).

discuss · November 14, 2019, 2:16am

(from github.com/marevol)
For your use case, scheduler.target.name is available.
See https://github.com/codelibs/fess/issues/1315#issuecomment-339515720.

discuss · November 15, 2019, 2:59am

(from github.com/cross1154)
Thank you for the answer.
I changed instance A’s settings like this:
1. Changed fess_config.properties:
scheduler.target.name=node4 (this is one of my ES cluster nodes)
index.document.update.index=fessA.update
2. Renamed app/WEB-INF/classes/fess_indices/fess/alias/fess.update.json
to app/WEB-INF/classes/fess_indices/fess/alias/fessA.update.json
and restarted my instance.

Now my setup is as follows:
Fess instance A: alias fessA.update bound to indexA, scheduler target node4
Fess instance B: alias fess.update bound to indexB, scheduler target all

I ran the scheduler of instance A, but the data was crawled into indexB (the index bound to fess.update).
Could you tell me if there is anything I missed?

discuss · November 15, 2019, 9:47pm

(from github.com/marevol)

Do not change index.document.*.index
Set the target name in the job's Target field

discuss · November 18, 2019, 3:46am

(from github.com/cross1154)
Thank you for the answer.
Do you mean that jobs on multiple instances can crawl data into different indices in parallel just by setting the target name to a node? I don’t understand how to set it.
As I see it, if I do not change index.document.*.index, I have to use fess.update in all of my Fess instances, and all jobs started at the same time will crawl data into the same index, the one fess.update is bound to.

I also tried setting the target name to an index (alias) name in fess_config.properties
and editing the scheduled job’s target to the same index (alias) name,
but it doesn’t work. The crawler job still maps to fess.update.

discuss · November 23, 2019, 1:04pm

(from github.com/cross1154)
I stopped trying to use scheduler.target.name and tried this instead:
1. Modified fess_config.properties:

index.document.search.index=fess185.search
index.document.update.index=fess185.update
index.document.suggest.index=fess185
index.document.crawler.index=.crawler185
index.config.index=.fess_config185
index.user.index=.fess_user185
index.log.index=fess_log185

2. Renamed the JSON files:

app/WEB-INF/classes/fess_indices/fess/alias/fess185.search.json
app/WEB-INF/classes/fess_indices/fess/alias/fess185.update.json

Then I restarted Fess and got an error that .fess_user185.user was not found.
I reindexed another Fess instance’s .fess_user.user into .fess_user185.user, and then the restart succeeded with no errors in the log.
The aliases [fess185.search] and [fess185.update] were created and bound to fess.20191123.
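
To check which index each alias actually resolves to, the JSON returned by Elasticsearch's GET /_alias endpoint can be inverted into an alias-to-index map. A minimal sketch in Python, assuming the usual response shape of that endpoint; the function name aliases_to_indices is hypothetical, and the sample response is modeled on the names in this post rather than copied from a real cluster:

```python
def aliases_to_indices(alias_response):
    """Invert {index: {"aliases": {alias: {...}}}} into {alias: [indices]}."""
    result = {}
    for index_name, info in alias_response.items():
        for alias_name in info.get("aliases", {}):
            result.setdefault(alias_name, []).append(index_name)
    # Sort so the output is stable regardless of response ordering.
    return {alias: sorted(indices) for alias, indices in result.items()}


# Hand-written sample in the shape of a GET /_alias response (assumption).
sample = {
    "fess.20191123": {
        "aliases": {"fess185.search": {}, "fess185.update": {}},
    },
}

bindings = aliases_to_indices(sample)
print(bindings["fess185.update"])
```

If the update aliases of two instances resolve to the same index, documents crawled by both instances end up in that one index.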

discuss · November 23, 2019, 1:29pm