Exception when using Apache POI to read XLSX file - java

I'm currently working with Apache POI, as I need to read data from an XLSX file that will later be converted to CSV. Here's the code I'm using to create my XSSFWorkbook; it causes an exception every single time. From what I could find, XMLBeans is part of the cause. It has been deprecated; however, it is a dependency of POI in this instance.
public static void appendCSV(File inputFile, String outputFile, String tag)
{
    System.out.println(inputFile.getAbsolutePath());
    InputStream inp = null;
    try {
        inp = new FileInputStream(inputFile);
        XSSFWorkbook wb = new XSSFWorkbook(inp);
My exception gets thrown at the last line in the block above.
Exception in thread "main" org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:65)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:601)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:174)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:279)
at BigBangarang.appendCSV(BigBangarang.java:68)
at BigBangarang.main(BigBangarang.java:268)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:56)
at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:62)
... 5 more
Caused by: java.lang.NoSuchMethodError: org.apache.xmlbeans.XmlOptions.setLoadEntityBytesLimit(I)Lorg/apache/xmlbeans/XmlOptions;
at org.apache.poi.POIXMLTypeLoader.<clinit>(POIXMLTypeLoader.java:50)
at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:127)
at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:108)
... 11 more
Has anybody run into this situation before? I have the most up-to-date release of XMLBeans, and I'm starting to feel I may need to find an older version, since the error says a method is missing. I'm not sure whether there is an alternative or easier way to either read an XLSX file or simply convert it to CSV before handling any data.

You either need to upgrade your XMLBeans version to 2.6, or upgrade your Apache POI version to 3.15 beta 1 or later.
You're hitting Apache POI bug #59195, for which a temporary workaround was applied around a month ago and is included in the 3.15 beta 1 release (and in nightly builds from the time of the commit onwards). A full fix will take a bit longer; follow that bug if you're interested!
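A NoSuchMethodError like the one above usually means the class was found, but in a jar version that lacks the method. As a rough diagnostic sketch, the snippet below reports which jar a class was loaded from and whether it exposes a public method with a given name. The class name ClasspathProbe and its helper methods are made up for illustration and are not part of POI or XMLBeans; the defaults in main simply mirror the class and method named in the stack trace.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper (not part of POI or XMLBeans): reports where a class was
// loaded from and whether it declares a public method with a given name.
class ClasspathProbe {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM exposes no code source (typical for core JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return (source == null || source.getLocation() == null)
                ? "(bootstrap/unknown)"
                : source.getLocation().toString();
    }

    // True if the class exposes at least one public method with this name.
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults match the error in the question; override them via arguments.
        String className = args.length > 0 ? args[0] : "org.apache.xmlbeans.XmlOptions";
        String methodName = args.length > 1 ? args[1] : "setLoadEntityBytesLimit";

        Class<?> type = Class.forName(className);
        System.out.println(className + " loaded from: " + locationOf(type));
        System.out.println("has " + methodName + ": " + hasPublicMethod(type, methodName));
    }
}
```

Run it with the same classpath as the failing application. If the reported jar is not the XMLBeans version you expected, an older copy is probably shadowing the newer one.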

Related

Poi 5.0 Jar movement Issue in XWPFDocument

I am currently using the Apache POI 3.14 jar to create Word documents, and I am now looking to upgrade to the latest stable version, 5.0. After upgrading, I cannot even load a document stream into XWPFDocument. I have attached sample code that reads a simple docx file and writes it back out; the error occurs as soon as the file is loaded into XWPFDocument. I am baffled, so any detailed help would be really appreciated.
package basePackage;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class PoiJars {
    public static void main(String[] args) throws Exception {
        String docxFilePath = "SimpleWordFile.docx";
        InputStream stream = new FileInputStream(docxFilePath);
        XWPFDocument document = new XWPFDocument(stream);
        FileOutputStream outFile = new FileOutputStream("output.docx");
        document.write(outFile);
    }
}
Exception that occurs is:
Exception in thread "main" java.lang.NoClassDefFoundError: org/openxmlformats/schemas/drawingml/x2006/chart/ChartSpaceDocument$Factory
at org.apache.poi.xddf.usermodel.chart.XDDFChart.<init>(XDDFChart.java:155)
at org.apache.poi.xwpf.usermodel.XWPFChart.<init>(XWPFChart.java:75)
at org.apache.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:61)
at org.apache.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:660)
at org.apache.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:165)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:126)
at basePackage.PoiJars.main(PoiJars.java:18)
Caused by: java.lang.ClassNotFoundException: org.openxmlformats.schemas.drawingml.x2006.chart.ChartSpaceDocument$Factory
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 7 more
Jars I am using in Class path:
poi-5.0.0.jar, poi-ooxml-5.0.0.jar, xmlbeans-4.0.0.jar along with other commons & codec jar dependencies.
My questions are:
1) Why am I not even able to load a basic docx file into XWPFDocument?
2) If I use poi-ooxml-full-5.0.0.jar instead of poi-ooxml-5.0.0.jar, the XWPFDocument class is not present in it. What is the reason?
3) Could someone please share some links that explain the POI architecture and code flow, so I can modify classes in the jar according to my needs?

Error when parsing an embedded .xlsx file from a .ppt using apache-poi. The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry

I am facing an issue when using Apache POI to extract an embedded .xlsx file from a .ppt file. It would be really great if somebody could help me out.
The subject of the problem:
Problem trying to solve: Extracting a ".xlsx" file embedded inside a ".ppt".
I am currently using apache-poi.
It seems that when I call hslfSlideShow.getEmbeddedObjects(), I get the xlsx object just fine, but when I try to convert it to an XSSFWorkbook object using, say, WorkbookFactory.create(inputStream), it throws the following error:
java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file? Had: [OlePres000, Ole, CompObj, Package]
at org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:286)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:326)
at org.apache.poi.hssf.usermodel.HSSFWorkbookFactory.createWorkbook(HSSFWorkbookFactory.java:64)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:167)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:112)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:253)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:221)
Interestingly, it is calling HSSFWorkbookFactory even though it's an xlsx file.
And no the xlsx file is not corrupted/password-protected. I can open it just fine.
Also, it works fine if I try parsing the .xlsx file without embedding it in the .ppt.
And the parsing works fine when I embed it in a .pptx file and call methods such as xmlSlideShow.getAllEmbeddedParts() to get the embedded objects from .pptx.
Promoting some comments and investigation to an answer...
This was a limitation in older versions of Apache POI, but it was fixed in July in r1880164.
For backwards-compatibility reasons, PowerPoint will often (but not always...) write embedded OOXML resources wrapped in an intermediate OLE2 layer. This has the advantage that tools and programs which expect embedded Office documents to be something like an xls or doc can still cope, at the expense of another layer of wrapping.
Newer versions of Apache POI (5.0 should be the first released one with the fix in) have support in WorkbookFactory for receiving an OLE2 wrapper like this, pulling out the underlying xlsx stream and handing that off to XSSFWorkbook. (Older versions did this for OLE2-based password-protected xlsx files, but not their unencrypted cousins)
For now, if you're stuck on an affected POI version, the code you'll want is something like this (largely taken from the unit test verifying support!):
POIFSFileSystem fs = new POIFSFileSystem(data.getInputStream());
if (fs.getRoot().hasEntry("Package")) {
    DocumentInputStream dis = new DocumentInputStream((DocumentEntry) fs.getRoot().getEntry("Package"));
    try (OPCPackage pkg = OPCPackage.open(dis)) {
        XSSFWorkbook wb = new XSSFWorkbook(pkg);
        handleWorkbook(wb);
        wb.close();
    }
} else {
    try (HSSFWorkbook wb = new HSSFWorkbook(fs)) {
        handleWorkbook(wb);
    }
}
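To see which kind of container an extracted embedded object really is, it can help to look at its first few bytes before handing it to POI. The sketch below is a standalone illustration using only the JDK: OLE2 compound files start with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1, while OOXML files are ZIP archives and start with "PK" followed by 03 04. The class ContainerSniffer and its Kind enum are hypothetical names, not POI API (recent POI versions ship their own FileMagic class for the same job).

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI: classifies data by its leading bytes.
class ContainerSniffer {

    enum Kind { OLE2, ZIP, UNKNOWN }

    // Signature at the start of every OLE2 compound file (.xls, .doc, .ppt, OLE2 wrappers).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature; .xlsx, .docx and .pptx are ZIP archives.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies a header (ideally the first 8 bytes of the data).
    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of a file to inspect.
        try (InputStream in = new FileInputStream(args[0])) {
            byte[] header = new byte[8];
            int read = in.read(header); // a single read is normally enough for 8 bytes of a local file
            System.out.println(sniff(Arrays.copyOf(header, Math.max(read, 0))));
        }
    }
}
```

An embedded object that reports OLE2 but was inserted as an .xlsx is a strong hint that it is one of the wrapped packages described above, so the "Package" entry is worth checking.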

OutOfMemoryError while creating XSSFWorkbook in Apache POI

I have a Spring Boot REST service which I am using to create an Excel file (xlsm). I am getting a strange issue: after the application starts, the first request creates the Excel file without trouble, but calling the REST endpoint again throws an OutOfMemoryError.
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3044)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3065)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3198)
at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:498)
at org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:180)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:275)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1653)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324)
at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:890)
at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:813)
at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108)
at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1198)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:564)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3414)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1272)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:228)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:220)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.parseSheet(XSSFWorkbook.java:452)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:417)
at org.apache.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:184)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
at com.service.ExcelReportManager.runReport(ExcelReportManager.java:248)
at com.report.controller.ReportingEndPoint.runReport(ReportingEndPoint.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
Here is the code that triggers this exception:
try (OPCPackage pkg = OPCPackage.open(fileCopy);
XSSFWorkbook workbook = new XSSFWorkbook(pkg))
I am closing the resources, but something is still causing the problem. I have read other existing questions on this, but nothing seems to work for me. Is there any clue how to resolve this?
Is the Excel file very large?
If it is, that can cause this error; generating and reading big Excel files with POI uses a lot of memory.
Your try-with-resources syntax is correct, so the resources will always be closed.
You can try increasing the maximum heap size of the JVM.
Pass -Xmx2048m at startup, for example.
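To confirm that a heap setting such as -Xmx2048m actually reached the running JVM (it is easy to set it on the wrong process when a service is launched through a wrapper script), a small check along these lines can help. This is only a sketch using the standard Runtime API; the class name HeapReport is made up for illustration.

```java
// Hypothetical helper: prints the heap limits of the JVM it runs in.
class HeapReport {

    // Converts a byte count to whole mebibytes (rounding down).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx; depending on the garbage collector it may
        // report slightly less than the configured value.
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Logging the same three values from inside the service before and after each report run would also show whether memory is being retained between requests.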
Instead of XSSFWorkbook (which keeps the entire Excel workbook in memory), try the streaming SXSSFWorkbook class, like below:
SXSSFWorkbook workbook = new SXSSFWorkbook(100);
where 100 is the row access window size, that is, the number of rows kept in memory before older rows are flushed to disk (100 is also the default). Note that SXSSFWorkbook only streams rows as they are written; it does not lower the memory needed to parse an existing workbook.

Apache POI: Open Template, write, Save as. crashes and creates corrupt file

I have been writing a program to write names on a roster and have written most of the methods for finding and sorting.
The goal of this method is to open an existing file, write to a cell, and save the result as a different file.
I think the problem may be the template file, because it consists of two sheets. One sheet collects all the names, and the second sheet has an image with text boxes linked to the first sheet, so the names are printed in the text boxes.
public static void FindTemplate(String Session) throws FileNotFoundException, IOException
{
    if(Session.toLowerCase().contains("Level 1".toLowerCase()))
    // generic roster as an else
    {
        FileInputStream In = new FileInputStream("Directory\\Templates\\A Template.xls");
        HSSFWorkbook wb = new HSSFWorkbook(In);
        HSSFSheet sheet = wb.getSheetAt(0);
        Cell cell = null;
        cell = sheet.getRow(0).getCell(0);
        cell.setCellValue("Found it");//just as a test for now
        In.close();
        wb.write(new FileOutputStream("WA1.xls"));
        wb.close();
    }
}
When I run it I get a large error that I don't understand. The error is at wb.write(new FileOutputStream("WA1.xls")); I have saved files like this before in my other methods. A new file is also created, but it is corrupt.
Error:
WAException in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/collections4/bidimap/TreeBidiMap
at org.apache.poi.hpsf.Section.<init>(Section.java:178)
at org.apache.poi.hpsf.MutableSection.<init>(MutableSection.java:41)
at org.apache.poi.hpsf.PropertySet.init(PropertySet.java:494)
at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:196)
at org.apache.poi.hpsf.MutablePropertySet.<init>(MutablePropertySet.java:44)
at org.apache.poi.hpsf.SpecialPropertySet.<init>(SpecialPropertySet.java:47)
at org.apache.poi.hpsf.DocumentSummaryInformation.<init>(DocumentSummaryInformation.java:99)
at org.apache.poi.hpsf.PropertySetFactory.create(PropertySetFactory.java:116)
at org.apache.poi.POIDocument.getPropertySet(POIDocument.java:236)
at org.apache.poi.POIDocument.getPropertySet(POIDocument.java:197)
at org.apache.poi.POIDocument.readPropertySet(POIDocument.java:175)
at org.apache.poi.POIDocument.readProperties(POIDocument.java:158)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.updateEncryptionInfo(HSSFWorkbook.java:2295)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.getBytes(HSSFWorkbook.java:1506)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.write(HSSFWorkbook.java:1428)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.write(HSSFWorkbook.java:1414)
at rosterWrite.FindTemplate(rosterWrite.java:79)
at rosterWrite.main(rosterWrite.java:24)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.collections4.bidimap.TreeBidiMap
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 18 more
The error message is Caused by: java.lang.ClassNotFoundException: org.apache.commons.collections4.bidimap.TreeBidiMap, which means your project is missing the TreeBidiMap class from the Apache Commons Collections library (version 4, as the collections4 package name indicates). If you are using Maven, just add the commons-collections4 library to your pom.xml as this page shows. If not, you need to download the library from the official site and add it to your project's classpath.

Aspose.Cells exception: com.ctc.wstx.sr.ValidatingStreamReader cannot be cast to com.ctc.wstx.sr.ValidatingStreamReader

I am using Aspose.Cells (trial version) to parse a .xls (Excel) file for Java. But when I try to load the file, it throws the exception given below:
SEVERE: java.lang.IllegalStateException: XML Stream Exception: XMLStreamException: com.ctc.wstx.sr.ValidatingStreamReader cannot be cast to com.ctc.wstx.sr.ValidatingStreamReader
Here is my code
Workbook workbook = new Workbook();
try {
workbook.open(path+fileName);
} catch (Exception e) {
e.printStackTrace();
}
Worksheet worksheet = workbook.getWorksheets().get(0);
The exception is thrown at the line workbook.open(path+fileName);. I am quite sure this is not due to a wrong path, because when I give a wrong path Aspose throws a FileNotFoundException. So now I am stuck and unable to find out why this is happening. Note: while searching for this problem, I found this answer on the Aspose forum, but it is neither helpful nor feasible (it means checking all the classes present in the jars placed in lib).
We recommend that you try the latest version of the product (e.g. v7.7.x for Java), as we removed some interdependent jars and included our own custom XML parsers to perform some XML operations in the product. In the new versions we have removed the conflicting "com.ctc.wstx" jar from the product, so you should not see this exception any more.
Thanks,
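An "X cannot be cast to X" message generally points to the same class being available from more than one place, loaded by different class loaders. Rather than inspecting every jar in lib by hand, a rough sketch like the one below asks the class loader for every location that provides a given class. DuplicateClassFinder is a hypothetical name, and the default class name is simply the one from the exception; inside an application server the result depends on which class loader the code runs under.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a class.
class DuplicateClassFinder {

    // Converts "a.b.C" to the resource path "a/b/C.class" used by class loaders.
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns all locations visible to the loader (including its parents);
    // more than one entry suggests the class is packaged in several jars.
    static List<URL> locationsOf(String className, ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(toResourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : "com.ctc.wstx.sr.ValidatingStreamReader";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        List<URL> hits = locationsOf(className, loader);
        System.out.println(className + " found in " + hits.size() + " location(s):");
        for (URL hit : hits) {
            System.out.println("  " + hit);
        }
    }
}
```

Calling locationsOf from within the web application itself (for example, from a temporary debug endpoint) gives the most relevant answer, since that is the class loader hierarchy in which the cast fails.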
