One table - three DataStore Crawlers

Hi Marevol,

I will try to describe my problem as clearly as I can.
I have 3 DataStore crawlers working on the same Oracle table.

I run the crawlers in this order:

Crawler 1:
sql = select * from AD_FILES where files_tags like '% <TAG_001>%'
Crawler 2:
sql = select * from AD_FILES where files_tags like '% <TAG_002>%'

If I search now, I get the results I expect from Crawlers 1 and 2.

I then run the third crawler:

Crawler 3:
sql = select * from AD_FILES

If I search now, I get results only from Crawler 3; the results from Crawlers 1 and 2 are gone.

But if I run the crawlers in this order:

Crawler 3 (sql = select * from AD_FILES)
Crawler 1 (sql = select * from AD_FILES where files_tags like '% <TAG_001>%')
Crawler 2 (sql = select * from AD_FILES where files_tags like '%<TAG_002>%')

then searches return results for all crawlers.

The Fess version is 13.6.2.

Many thanks
Luigi

Could you check how the document values differ on the Admin Search page?

Same result.

Only this order:

Crawler 3
Crawler 1
Crawler 2

works correctly.

It seems that crawling without a 'where' clause resets the index entries created by earlier crawlers that used a specific filter on the same table.

Many thanks
Luigi

DataStore crawling deletes old documents at the end of each crawl.
To avoid the deletion, you can add delete_old_docs=false to Parameter.

It is still not working.
I entered the option like this:

delete_old_docs=false
driver=oracle.jdbc.OracleDriver
url=jdbc:oracle:thin:@//localhost:1521/ORA9ZV

Regards
Luigi

Hi Marevol,

I solved my problem: it all came down to my URL, which was the same in all three crawlers.
Making the URLs different was enough to get the results I expected.
I had not understood (sorry for my poor intuition) that the 'unique key' that differentiates crawlers is precisely the URL string.

Many thanks
Luigi

Hi Marevol,

Can you confirm the solution I found for my problem?
And regarding the parameter 'delete_old_docs = false': did I enter it correctly in Parameter?
Under what circumstances would it be useful?

Many thanks
Luigi

URL is used as the document ID.
delete_old_docs is a flag that indicates whether documents from the previous crawl are deleted.
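
To illustrate the first point (URL used as the document ID), here is a rough toy model in Python. It is not Fess code: a plain dict keyed by URL stands in for the index, and the rows, the `db://...` URL patterns, and the helper names `crawl` and `owners` are all made up for the example. It only sketches why documents that share a URL overwrite each other, which would explain why the crawl order mattered.

```python
# Toy model only: a dict keyed by URL stands in for the search index.
# This is not Fess code; rows, tags and URL patterns are hypothetical.

ROWS = [
    {"id": 1, "files_tags": "<TAG_001>"},
    {"id": 2, "files_tags": "<TAG_002>"},
    {"id": 3, "files_tags": "<TAG_003>"},
]


def crawl(index, crawler, url_prefix, tag=None):
    """Index every row whose tags contain `tag` (all rows if tag is None)."""
    for row in ROWS:
        if tag is None or tag in row["files_tags"]:
            url = f"{url_prefix}/{row['id']}"
            # Same URL means same document ID, so an existing entry is replaced.
            index[url] = {"crawler": crawler, "row": row["id"]}


def owners(index):
    """Return the crawler that owns each indexed document, sorted by URL."""
    return [index[url]["crawler"] for url in sorted(index)]


# Order 1, 2, 3 with a shared URL pattern: crawler 3 overwrites everything.
shared = {}
crawl(shared, 1, "db://files", "<TAG_001>")
crawl(shared, 2, "db://files", "<TAG_002>")
crawl(shared, 3, "db://files")
print(owners(shared))  # [3, 3, 3]

# Order 3, 1, 2 with the same pattern: crawlers 1 and 2 overwrite their rows last.
reordered = {}
crawl(reordered, 3, "db://files")
crawl(reordered, 1, "db://files", "<TAG_001>")
crawl(reordered, 2, "db://files", "<TAG_002>")
print(owners(reordered))  # [1, 2, 3]

# A distinct URL pattern per crawler: nothing is overwritten, in any order.
distinct = {}
crawl(distinct, 1, "db://c1", "<TAG_001>")
crawl(distinct, 2, "db://c2", "<TAG_002>")
crawl(distinct, 3, "db://c3")
print(len(distinct))  # 5
```

In this model the fix Luigi describes corresponds to the last case: once each crawler produces its own URLs, the documents no longer collide.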

Hi Marevol,

'delete_old_docs=false' did not work in my test.
I think this parameter only works under certain conditions that are not present in my case.

Many thanks
Luigi