(from github.com/Anders-Bergqvist)
Hi!
Does anyone know how to write the expression for decreasing the boost for PDFs and Word documents? I want web pages (HTML) to rank slightly higher than PDF and Word as a baseline.
(from github.com/marevol)
You can add a Function Score query. For example:
<component name="queryHelper" class="org.codelibs.fess.helper.QueryHelper">
  <postConstruct name="addBoostFunction">
    <arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "pdf")</arg>
    <arg>
      <component class="org.elasticsearch.index.query.functionscore.WeightBuilder">
        <postConstruct name="setWeight">
          <arg>0.5</arg>
        </postConstruct>
      </component>
    </arg>
  </postConstruct>
  ...
</component>
(from github.com/Anders-Bergqvist)
I put that in app.xml and got this in crawler log:
2019-03-07 17:39:41,775 [main] ERROR Crawler does not work correctly.
org.lastaflute.di.core.factory.dixml.exception.DiXmlParseFailureException: Look! Read the message below.
Failed to parse the dependency XML.
[Dependency XML]
app.xml
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.throwDependencyXmlParseFailureException(DiXmlLaContainerBuilder.java:92) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.parse(DiXmlLaContainerBuilder.java:79) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.redefiner.core.RedefinableXmlLaContainerBuilder.parse(RedefinableXmlLaContainerBuilder.java:77) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.build(DiXmlLaContainerBuilder.java:61) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.conbuilder.impl.AbstractLaContainerBuilder.build(AbstractLaContainerBuilder.java:41) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.provider.LaContainerDefaultProvider.build(LaContainerDefaultProvider.java:112) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.provider.LaContainerDefaultProvider.create(LaContainerDefaultProvider.java:67) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.LaContainerFactory.doCreate(LaContainerFactory.java:79) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.LaContainerFactory.create(LaContainerFactory.java:72) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.SingletonLaContainerFactory.createContainer(SingletonLaContainerFactory.java:99) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.SingletonLaContainerFactory.init(SingletonLaContainerFactory.java:64) ~[lasta-di-0.7.7.jar:?]
at org.codelibs.fess.exec.Crawler.main(Crawler.java:223) [classes/:?]
Caused by: org.lastaflute.di.core.factory.dixml.exception.TagComponentCreationFailureException: Failed to create component defined at component tag: org.codelibs.fess.helper.QueryHelper
at org.lastaflute.di.core.factory.dixml.taghandler.ComponentTagHandler.start(ComponentTagHandler.java:57) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:157) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:151) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.startElement(SaxHandler.java:64) ~[lasta-di-0.7.7.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.xinclude.XIncludeHandler.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.lastaflute.di.util.LdiSAXParserUtil.parse(LdiSAXParserUtil.java:38) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:68) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:64) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.parse(DiXmlLaContainerBuilder.java:74) ~[lasta-di-0.7.7.jar:?]
... 10 more
Caused by: org.lastaflute.di.helper.beans.exception.BeanNoClassDefFoundError: Failed to analyze the bean class: org.codelibs.fess.helper.QueryHelper
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.<init>(BeanDescImpl.java:113) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.factory.BeanDescFactory.getBeanDesc(BeanDescFactory.java:48) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.ConstantAnnotationHandler.createComponentDef(ConstantAnnotationHandler.java:57) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.TigerAnnotationHandler.createComponentDef(TigerAnnotationHandler.java:243) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:86) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:81) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.taghandler.ComponentTagHandler.start(ComponentTagHandler.java:52) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:157) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:151) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.startElement(SaxHandler.java:64) ~[lasta-di-0.7.7.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.xinclude.XIncludeHandler.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.lastaflute.di.util.LdiSAXParserUtil.parse(LdiSAXParserUtil.java:38) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:68) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:64) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.parse(DiXmlLaContainerBuilder.java:74) ~[lasta-di-0.7.7.jar:?]
... 10 more
Caused by: java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletRequest
at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:1.8.0_191]
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) ~[?:1.8.0_191]
at java.lang.Class.getConstructors(Class.java:1651) ~[?:1.8.0_191]
at org.lastaflute.di.helper.beans.impl.PropertyDescImpl.setupStringConstructor(PropertyDescImpl.java:90) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.PropertyDescImpl.<init>(PropertyDescImpl.java:84) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.setupReadMethod(BeanDescImpl.java:488) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.setupPropertyDescs(BeanDescImpl.java:437) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.<init>(BeanDescImpl.java:109) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.factory.BeanDescFactory.getBeanDesc(BeanDescFactory.java:48) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.ConstantAnnotationHandler.createComponentDef(ConstantAnnotationHandler.java:57) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.TigerAnnotationHandler.createComponentDef(TigerAnnotationHandler.java:243) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:86) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:81) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.taghandler.ComponentTagHandler.start(ComponentTagHandler.java:52) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:157) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:151) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.startElement(SaxHandler.java:64) ~[lasta-di-0.7.7.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.xinclude.XIncludeHandler.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.lastaflute.di.util.LdiSAXParserUtil.parse(LdiSAXParserUtil.java:38) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:68) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:64) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.parse(DiXmlLaContainerBuilder.java:74) ~[lasta-di-0.7.7.jar:?]
... 10 more
Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletRequest
at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[?:1.8.0_191]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_191]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_191]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_191]
at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:1.8.0_191]
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) ~[?:1.8.0_191]
at java.lang.Class.getConstructors(Class.java:1651) ~[?:1.8.0_191]
at org.lastaflute.di.helper.beans.impl.PropertyDescImpl.setupStringConstructor(PropertyDescImpl.java:90) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.PropertyDescImpl.<init>(PropertyDescImpl.java:84) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.setupReadMethod(BeanDescImpl.java:488) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.setupPropertyDescs(BeanDescImpl.java:437) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.impl.BeanDescImpl.<init>(BeanDescImpl.java:109) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.beans.factory.BeanDescFactory.getBeanDesc(BeanDescFactory.java:48) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.ConstantAnnotationHandler.createComponentDef(ConstantAnnotationHandler.java:57) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.TigerAnnotationHandler.createComponentDef(TigerAnnotationHandler.java:243) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:86) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.annohandler.impl.AbstractAnnotationHandler.createComponentDef(AbstractAnnotationHandler.java:81) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.taghandler.ComponentTagHandler.start(ComponentTagHandler.java:52) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:157) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.start(SaxHandler.java:151) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandler.startElement(SaxHandler.java:64) ~[lasta-di-0.7.7.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.xinclude.XIncludeHandler.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.11.0.jar:?]
at org.lastaflute.di.util.LdiSAXParserUtil.parse(LdiSAXParserUtil.java:38) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:68) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.helper.xml.SaxHandlerParser.parse(SaxHandlerParser.java:64) ~[lasta-di-0.7.7.jar:?]
at org.lastaflute.di.core.factory.dixml.DiXmlLaContainerBuilder.parse(DiXmlLaContainerBuilder.java:74) ~[lasta-di-0.7.7.jar:?]
... 10 more
2019-03-07 17:39:41,779 [main] INFO Destroyed LaContainer.
(from github.com/Anders-Bergqvist)
I don’t know what that means. How do I make it valid? I copied your code into it and then I got the error. Do you mean that some syntax is wrong?
(from github.com/Anders-Bergqvist)
I found that there is an app.xml in four places. Which one should I modify?
/usr/share/fess/app/WEB-INF/classes/app.xml
/usr/share/fess/app/WEB-INF/env/crawler/resources/app.xml
/usr/share/fess/app/WEB-INF/env/suggest/resources/app.xml
/usr/share/fess/app/WEB-INF/env/thumbnail/resources/app.xml
(from github.com/Anders-Bergqvist)
When I stopped Fess and added:
<component name="queryHelper" class="org.codelibs.fess.helper.QueryHelper">
  <postConstruct name="addBoostFunction">
    <arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "pdf")</arg>
    <arg>
      <component class="org.elasticsearch.index.query.functionscore.WeightBuilder">
        <postConstruct name="setWeight">
          <arg>0.5</arg>
        </postConstruct>
      </component>
    </arg>
  </postConstruct>
</component>
into:
/usr/share/fess/app/WEB-INF/classes/app.xml
I got a 404 from the server and was not able to log in. I took that part away and restarted, and then it worked again. What's wrong?
(from github.com/marevol)
Did you insert the postConstruct element into the existing component element of queryHelper?
(from github.com/Anders-Bergqvist)
I just pasted the code above, exactly as shown, into app.xml after another </component> tag.
(from github.com/Anders-Bergqvist)
OK! I see my error now. There was already an entry for <component name="queryHelper" class="org.codelibs.fess.helper.QueryHelper">
I had added a new one, so there was a duplicate that made it crash. Now I have only inserted the <postConstruct …> part into the existing one.
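For anyone hitting the same problem, the fix described above might look roughly like the sketch below. The XML comment is only a placeholder for whatever the existing queryHelper entry in your app.xml already contains, which is not shown in this thread; the postConstruct block is the one posted earlier.

```xml
<!-- Sketch: keep the single existing queryHelper component; do not add a second one. -->
<component name="queryHelper" class="org.codelibs.fess.helper.QueryHelper">
  <!-- (placeholder) settings already present in your app.xml stay here unchanged -->

  <!-- Added: lower the score of documents whose filetype is "pdf" -->
  <postConstruct name="addBoostFunction">
    <arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "pdf")</arg>
    <arg>
      <component class="org.elasticsearch.index.query.functionscore.WeightBuilder">
        <postConstruct name="setWeight">
          <arg>0.5</arg>
        </postConstruct>
      </component>
    </arg>
  </postConstruct>
</component>
```

The point of the layout is that the component name queryHelper appears only once in the file, which is what resolved the crash described in this post.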
(from github.com/Anders-Bergqvist)
By the way, if I want the same setting for Word documents, how do I declare that? Can I write:
<arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "pdf", "doc", "docx")</arg>
?
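A possible alternative, offered as a sketch rather than a confirmed answer: termQuery as used in this thread takes one field name and one value, so one option is to add a second addBoostFunction block for Word documents next to the PDF one. The value "word" below is an assumption about how Word files are labelled in the filetype field; check the filetype values in your own index before relying on it.

```xml
<!-- Sketch: both blocks go inside the existing queryHelper component. -->
<postConstruct name="addBoostFunction">
  <arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "pdf")</arg>
  <arg>
    <component class="org.elasticsearch.index.query.functionscore.WeightBuilder">
      <postConstruct name="setWeight">
        <arg>0.5</arg>
      </postConstruct>
    </component>
  </arg>
</postConstruct>
<!-- Assumption: Word documents are indexed with filetype "word" -->
<postConstruct name="addBoostFunction">
  <arg>org.elasticsearch.index.query.QueryBuilders.termQuery("filetype", "word")</arg>
  <arg>
    <component class="org.elasticsearch.index.query.functionscore.WeightBuilder">
      <postConstruct name="setWeight">
        <arg>0.5</arg>
      </postConstruct>
    </component>
  </arg>
</postConstruct>
```

Keeping one block per file type also makes it possible to give each type its own weight later.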
(from github.com/Anders-Bergqvist)
Could the value in <arg>0.5</arg> be set to 0.8?
When I go to System info > Search and search for a PDF which has been crawled since the change, it still shows Boost 1.0 as the value. Shouldn't it be 0.5?
(from github.com/marevol)
The boost value shown there is the one stored for the document, so it is different from the weight set by addBoostFunction.
I think you can only check the final score on the admin search page.
(from github.com/Anders-Bergqvist)
So there is no way to see that the PDF files actually have a 0.5 boost?
(from github.com/Anders-Bergqvist)
I have implemented the addBoostFunction as stated above. My question is whether there is a way to see that a PDF document actually has a 0.5 boost. It would be useful to have a record that shows the final score and the parameters that affect it.