XPath from HTML body in custom field

Hi,

We would like to extract part of an HTML string into a custom field. Two questions:

  1. When I add a custom field in fess_config.properties, what type does the field have, and where can I set the type?
  2. What is the right syntax to get the content of a div for this field? My attempts were not successful:
    field.xpath.custom_breadcrumb=//DIV[@id='breadcrumb']
    field.xpath.custom_breadcrumb=//[@id='breadcrumb']
    field.xpath.custom_breadcrumb=//DIV[@id='breadcrumb']/DIV

I think either the XPath syntax is wrong or the HTML of the document is not being parsed.
//BODY stores the whole text with the HTML markup stripped. How can I get the content of a div including its HTML?

Thanks.

You need to add the field definition to doc.json and then start reindexing on the maintenance page. You can check past topics about doc.json.
Please see Part 3: Web Scraping with Fess.
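
To check an XPath expression outside of Fess before reindexing, something like the following may help. It is only a sketch using Python's standard library, not Fess's own parser: the sample page, the helper names (select_by_id, inner_text, inner_markup) and the lowercase tag names are assumptions for illustration. It shows the difference between the plain text of a div and its content including markup, and uses straight quotes around the id value, since typographic quotes are not valid in XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; it is well-formed so the stdlib XML parser accepts it.
SAMPLE = """<html><body>
<div id="breadcrumb"><a href="/">Home</a> &gt; <a href="/docs">Docs</a></div>
<div id="main"><p>Body text</p></div>
</body></html>"""


def select_by_id(markup, element_id):
    """Return the first div with the given id, or None if there is none."""
    root = ET.fromstring(markup)
    # Straight quotes delimit the string literal inside the predicate.
    return root.find(f".//div[@id='{element_id}']")


def inner_text(node):
    """Text of the node and its descendants, with all markup dropped."""
    return "".join(node.itertext()).strip()


def inner_markup(node):
    """Content of the node with child markup kept (the div tag itself is omitted)."""
    parts = [node.text or ""]
    # tostring() serializes each child together with the text that follows it.
    parts.extend(ET.tostring(child, encoding="unicode") for child in node)
    return "".join(parts).strip()


node = select_by_id(SAMPLE, "breadcrumb")
print(inner_text(node))
print(inner_markup(node))
```

The first print gives the text only, which is comparable to what the question describes for //BODY; the second keeps the anchor tags. Whether Fess can store the markup form in a field is not something this sketch answers.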
