(from github.com/arengore )
I’m indexing several .msg email files and I want to overwrite each file’s last_modified with the email date, so I picked the field “Last-Save-Date” and mapped it to dmodified:
crawler.metadata.name.mapping=\
title=title:string\n\
Title=title:string\n\
Last-Save-Date=dmodified:string\n
query.additional.response.fields=dmodified
query.additional.search.fields=dmodified
The field is picked up by the crawler and I can see it in ES.
In the crawler’s Config Parameters I entered
doc.last_modified=doc.dmodified
but last_modified doesn’t change its value.
Where am I going wrong?
discuss
January 30, 2019, 10:13pm
(from marevol (Shinsuke Sugaya) · GitHub )
doc.last_modified=doc.dmodified? The above setting does not exist…
It may work if Last-Save-Date is in ISO 8601 format and you replace dmodified with last_modified.
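Whether the suggestion above applies depends on the value actually being ISO 8601. A minimal sketch for checking a value copied from the index, assuming Python is at hand; the function name to_iso8601_utc is hypothetical and not part of Fess, and treating offset-less values as UTC is an assumption:

```python
from datetime import datetime, timezone


def to_iso8601_utc(value: str) -> str:
    """Parse an ISO 8601 timestamp and return it normalised to UTC.

    Raises ValueError if the value is not in ISO 8601 format.
    """
    text = value.strip()
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

A value such as an RFC 2822 mail date ("Wed, 30 Jan 2019 22:13:00") raises ValueError here, which would indicate the field is not in the format the reply refers to.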
discuss
February 13, 2019, 3:28pm
(from github.com/arengore )
No luck: the metadata is ingested into Elasticsearch and then becomes unmodifiable.
So I’m using a bash script to change the files’ timestamps.
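The bash script itself is not shown in the thread. As a rough illustration of the same workaround in Python, the sketch below sets a file's modification time to a given date before crawling. The helper name set_mtime and the example path are hypothetical, and extracting the date from the .msg file is left out:

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def set_mtime(path: Path, when: datetime) -> None:
    """Set the file's modification time to `when`, keeping its access time."""
    if when.tzinfo is None:
        # Assumption: a naive datetime is interpreted as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = path.stat()
    # os.utime takes (access_time, modification_time) in seconds since the epoch.
    os.utime(path, (current.st_atime, when.timestamp()))


if __name__ == "__main__":
    # Hypothetical usage: stamp one mail file with its email date.
    target = Path("mail/example.msg")
    if target.exists():
        set_mtime(target, datetime(2019, 1, 30, 22, 13, tzinfo=timezone.utc))
```

Because the crawler reads the timestamp from the file system, the change has to happen before the file is crawled.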
discuss
February 14, 2019, 5:07pm
(from github.com/freestyle68 )
Hello,
I have seen your commit #2019 and tested it with an msg file and also with PDFs.
With msg, Last-Save-Date wins over the file timestamp, but with PDF the file timestamp always wins.
Shouldn’t Last-Save-Date always win if it is present?
I attach a sample PDF where the file timestamp wins.
069858.pdf
discuss
February 14, 2019, 9:04pm
(from github.com/marevol )
It’s not for PDF.