external results question

(from github.com/Outstep)

I am exploring Fess to see if it can work for a project that is being worked on at the moment and I really like what I have seen in Fess.

I am not completely sure whether this is a Fess question or an Elasticsearch question. I am still researching the best approach, and it might require writing a plugin of some type.

We have some JSON and/or XML data feeds coming from a type of metasearch engine server. Basically, that data source is sent a query and it returns a list of URLs along with supporting fields such as description, date, etc.

I would like Fess to send the query to these metasearch engines, then receive and aggregate the results and store them in the Fess Elasticsearch distributed storage servers. On the next similar query, Fess would first check whether it already has some of the data available. If it does, it could merge those stored results with the newly returned data from the metasearch engines and return the combined results to the front end.
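The flow described above can be sketched roughly as follows. This is only an illustration of the idea, not Fess code: the `CachingMetasearch` class, the `fetch` callable, and the `url` field are all hypothetical names, and a plain dictionary stands in for the Elasticsearch store.

```python
from typing import Callable, Dict, List

Result = Dict[str, str]


class CachingMetasearch:
    """Query-through cache: merge stored results with fresh external ones."""

    def __init__(self, fetch: Callable[[str], List[Result]]) -> None:
        # Hypothetical callable that queries the external metasearch engine.
        self.fetch = fetch
        # Stand-in for the Fess/Elasticsearch index: query key -> {url: result}.
        self.store: Dict[str, Dict[str, Result]] = {}

    def search(self, query: str) -> List[Result]:
        # Naive notion of a "similar" query: same words, ignoring case and spacing.
        key = " ".join(query.lower().split())
        cached = self.store.setdefault(key, {})
        for item in self.fetch(query):
            # Only add results whose URL is not already stored.
            cached.setdefault(item["url"], item)
        # Stored results come first, followed by the newly added ones.
        return list(cached.values())
```

In this sketch the external engine is called on every query and the store only prevents duplicates. Skipping the external call when enough stored results exist would be a separate design decision.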

One of the components I have also been exploring for use with Fess and/or Elasticsearch is Carrot2, which clusters results. For now, though, I am just trying to see how I might have Fess or Elasticsearch call out to the external metasearch engines and aggregate the results with what it already has, if any of this is possible at all.

Any ideas or suggestions would be sincerely appreciated :slight_smile:

(from github.com/marevol)
For JSON, see fess-ds-json.
Fess provides a Data Store feature for crawling other data sources, such as databases and CSV files.

(from github.com/Outstep)
Thanks for the reply, but crawling external data sources is not really what I was asking about, and I apologize for not being clear.

What I am trying to find out is this. Suppose I have an external data source, and a query entered in the Fess frontend is sent to it. The external data source returns results, and I would like those results stored in the Fess data store so that they are effectively cached. A similar query in the future would then only add results that are not already in the Fess cache.

I want to have Fess send the query to the external service, add those results to the existing Fess data store, and then send results back to the frontend from the data store.

Does this clear up my question any?

(from github.com/Outstep)
Is there any way to have Fess send the user query to the external metasearch engine, and then pipe the JSON results it returns both into the Elasticsearch data store and back to the user?
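For the "pipe JSON into the Elasticsearch data store" step, one possible approach outside of Fess is to convert the feed into a request body for the Elasticsearch bulk API, which expects newline-delimited JSON with an action line before each document. The sketch below makes several assumptions: the feed has a top-level `results` list whose items carry a `url` field, and `metasearch_cache` is a made-up index name. It does not attempt to match the field layout of Fess's own index.

```python
import hashlib
import json
from typing import Dict, List


def to_bulk_body(feed_json: str, index: str = "metasearch_cache") -> str:
    """Turn a JSON feed into an Elasticsearch bulk (NDJSON) request body."""
    # Assumed feed layout: {"results": [{"url": ..., "description": ...}, ...]}
    items: List[Dict[str, str]] = json.loads(feed_json)["results"]
    lines: List[str] = []
    for item in items:
        # Derive the document ID from the URL so that re-indexing the same
        # URL overwrites the earlier copy instead of creating a duplicate.
        doc_id = hashlib.sha1(item["url"].encode("utf-8")).hexdigest()
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(item))
    # The bulk API requires every line, including the last, to end with "\n".
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent as the body of a `POST /_bulk` request with the `Content-Type: application/x-ndjson` header.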

(from github.com/marevol)
I think you would need to modify the Fess source code.