Strict Open XML

This is not a question or a feature request, just sharing a POI bug.

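When a file has the .xlsx extension but was saved from Excel with the file type "Strict Open XML Spreadsheet (*.xlsx)", extraction fails with an error and no content is created.
If you rename the xlsx to zip and look inside, the content looks like it could be picked up, though. That's a pity.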
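
For anyone who wants to find the affected files up front, the zip-level inspection mentioned above can be sketched roughly as follows. This is only a heuristic and is not how Fess, Tika, or POI detect the format: it assumes the workbook part is stored at `xl/workbook.xml` (where Excel normally writes it) and simply looks for the Strict SpreadsheetML namespace in that part. The class name `StrictOoxmlCheck` and the helper `sampleXlsx` are made up for illustration; the helper builds a tiny stand-in archive in memory, not a valid workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Fess, Tika, or POI.
final class StrictOoxmlCheck {
    // Main SpreadsheetML namespace used by Strict Open XML workbooks.
    static final String STRICT_NS = "http://purl.oclc.org/ooxml/spreadsheetml/main";
    // Main SpreadsheetML namespace used by ordinary (Transitional) workbooks.
    static final String TRANSITIONAL_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Returns true if xl/workbook.xml mentions the Strict namespace.
    // Assumption: the workbook part lives at xl/workbook.xml.
    static boolean isStrict(byte[] xlsxBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(xlsxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("xl/workbook.xml".equals(entry.getName())) {
                    // readAllBytes() stops at the end of the current zip entry.
                    String xml = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                    return xml.contains(STRICT_NS);
                }
            }
            return false; // no workbook part found
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a minimal in-memory archive containing only a workbook part,
    // so the check can be tried without a real Excel file.
    static byte[] sampleXlsx(String namespace) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("xl/workbook.xml"));
            String xml = "<workbook xmlns=\"" + namespace + "\"><sheets/></workbook>";
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(isStrict(sampleXlsx(STRICT_NS)));       // true
        System.out.println(isStrict(sampleXlsx(TRANSITIONAL_NS))); // false
    }
}
```

For a real file, the bytes could come from `java.nio.file.Files.readAllBytes(path)`. Files flagged this way can then be re-saved from Excel as a regular "Excel Workbook (*.xlsx)" before crawling.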

2026-02-05 10:52:07,824 [Crawler-20260205105201-1-2] DEBUG Could not get a text.
org.codelibs.fess.crawler.exception.ExtractException: Could not extract a content.
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getText(TikaExtractor.java:483) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getText(TikaExtractor.java:244) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.transformer.AbstractFessFileTransformer.getExtractData(AbstractFessFileTransformer.java:450) [classes/?:?]
    at org.codelibs.fess.crawler.transformer.AbstractFessFileTransformer.generateData(AbstractFessFileTransformer.java:134) [classes/?:?]
    at org.codelibs.fess.crawler.transformer.AbstractFessFileTransformer.transform(AbstractFessFileTransformer.java:109) [classes/?:?]
    at org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor.process(DefaultResponseProcessor.java:112) [fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.CrawlerThread.processResponse(CrawlerThread.java:392) [fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.FessCrawlerThread.processResponse(FessCrawlerThread.java:332) [classes/?:?]
    at org.codelibs.fess.crawler.CrawlerThread.run(CrawlerThread.java:241) [fess-crawler-15.4.0.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1474) [?:?]
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@313356a0
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:312) ~[tika-core-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) ~[tika-core-3.2.3.jar:3.2.3]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor$TikaDetectParser.parse(TikaExtractor.java:687) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.lambda$getText$0(TikaExtractor.java:300) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getContent(TikaExtractor.java:563) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getText(TikaExtractor.java:289) ~[fess-crawler-15.4.0.jar:?]
    ... 9 more
Caused by: org.apache.poi.ooxml.POIXMLException: Strict OOXML isn't currently supported, please see bug #57699
    at org.apache.poi.xssf.eventusermodel.XSSFReader.<init>(XSSFReader.java:112) ~[poi-ooxml-5.4.1.jar:5.4.1]
    at org.apache.poi.xssf.eventusermodel.XSSFReader.<init>(XSSFReader.java:86) ~[poi-ooxml-5.4.1.jar:5.4.1]
    at org.apache.tika.parser.microsoft.ooxml.XSSFEcelExtractorDecorator.buildXHTML(XSSFEcelExtractorDecorator.java:146) ~[tika-parser-microsoft-module-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:143) ~[tika-parser-microsoft-module-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.microsoft.ooxml.XSSFEcelExtractorDecorator.getXHTML(XSSFEcelExtractorDecorator.java:130) ~[tika-parser-microsoft-module-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:247) ~[tika-parser-microsoft-module-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:117) ~[tika-parser-microsoft-module-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) ~[tika-core-3.2.3.jar:3.2.3]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) ~[tika-core-3.2.3.jar:3.2.3]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor$TikaDetectParser.parse(TikaExtractor.java:687) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.lambda$getText$0(TikaExtractor.java:300) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getContent(TikaExtractor.java:563) ~[fess-crawler-15.4.0.jar:?]
    at org.codelibs.fess.crawler.extractor.impl.TikaExtractor.getText(TikaExtractor.java:289) ~[fess-crawler-15.4.0.jar:?]
    ... 9 more
2026-02-05 10:52:07,825 [Crawler-20260205105201-1-2] DEBUG ExtractData: ExtractData [metadata={}, content=null]

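It might be possible to process these files by switching to something like GitHub - pjfanning/excel-streaming-reader (An easy-to-use implementation of a streaming Excel reader using Apache POI), but unless it is requested through commercial support, there are currently no plans to address this.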