(from github.com/rodrigobml)
I need to extract phone numbers with a regex into an additional field, but I couldn't find where the relevant code is.
(from github.com/rodrigobml)
I need to extract phone numbers or email addresses from eml files, docs or PDFs and add them as additional fields in the index for faceted search.
I debugged the code but could not find the method that extracts the file content.
I think the regex extraction method should be called after Tika extracts the file content. Open Semantic Search has a regex extraction plugin for this.
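The regex step described here could look roughly like the sketch below. It uses only java.util.regex and is not Fess or Tika code: the class name ContactExtractor, its methods, and both patterns are hypothetical and deliberately simple, and the input string stands in for the text that Tika would have extracted.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Fess: pulls email addresses and
// phone-like strings out of already-extracted plain text.
class ContactExtractor {
    // Simplified patterns for illustration; real data will need tuning.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern PHONE =
            Pattern.compile("\\+?\\d[\\d\\s().-]{6,}\\d");

    // Collects every distinct match of the pattern, in order of appearance.
    static Set<String> findAll(Pattern pattern, String text) {
        Set<String> hits = new LinkedHashSet<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    static Set<String> emails(String text) {
        return findAll(EMAIL, text);
    }

    static Set<String> phones(String text) {
        return findAll(PHONE, text);
    }

    public static void main(String[] args) {
        // Stand-in for the content returned by the text extractor.
        String content = "Contact: jane.doe@example.com, phone +55 11 91234-5678.";
        System.out.println(emails(content));
        System.out.println(phones(content));
    }
}
```

The resulting sets would then be written to the additional index fields used for faceting; how that is wired into the crawler is not shown here.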
(from github.com/marevol)
crawler.metadata.name.mapping is a mapping parameter.
The source code is in AbstractFessFileTransformer.