Read Excel spreadsheet XML file [Java] [duplicate]

This question already has answers here:
How to load old Microsoft Office XML file (Excel) using Java
(6 answers)
Closed 4 years ago.
I have an Excel spreadsheet XML file (generated from Excel via Save As "XML Spreadsheet 2003") and I need to extract data from it in Java.
I found similar topic:
How to load old Microsoft Office XML file (Excel) using Java
But the last answer there is 2 years old, so something may have changed.
If you have any idea how to parse an Excel XML file (or convert it), please let me know.
Thanks in advance.

I had the same problem some time ago and ended up writing a SAX parser to read the XML file. I wrote a blog post about it here.
You can find the sample project to parse the file in Github.
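For readers who cannot reach the linked project, here is a minimal sketch of what such a SAX parser might look like. It is not the code from the blog post. It assumes the usual XML Spreadsheet 2003 layout (Workbook > Worksheet > Table > Row > Cell > Data in the urn:schemas-microsoft-com:office:spreadsheet namespace) and pads columns skipped via the ss:Index attribute on Cell. The class and method names are hypothetical, and ss:Index on Row, merged cells, and cell comments are not handled.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reads all rows of all worksheets into lists of strings.
final class SpreadsheetMlSaxSketch {

    // Namespace used by the XML Spreadsheet 2003 elements and ss: attributes.
    static final String SS = "urn:schemas-microsoft-com:office:spreadsheet";

    static final class RowHandler extends DefaultHandler {
        final List<List<String>> rows = new ArrayList<>();
        private List<String> row;      // non-null only while inside <Row>
        private StringBuilder text;    // non-null only while inside <Data>

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (!SS.equals(uri)) {
                return;
            }
            if ("Row".equals(local)) {
                row = new ArrayList<>();
            } else if ("Cell".equals(local) && row != null) {
                // ss:Index is 1-based; fill the skipped columns with empty strings.
                String index = atts.getValue(SS, "Index");
                if (index != null) {
                    int target = Integer.parseInt(index) - 1;
                    while (row.size() < target) {
                        row.add("");
                    }
                }
                row.add(""); // placeholder, overwritten when <Data> closes
            } else if ("Data".equals(local) && row != null && !row.isEmpty()) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (!SS.equals(uri)) {
                return;
            }
            if ("Data".equals(local) && text != null) {
                row.set(row.size() - 1, text.toString());
                text = null;
            } else if ("Row".equals(local) && row != null) {
                rows.add(row);
                row = null;
            }
        }
    }

    static List<List<String>> parse(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match elements by namespace URI
        RowHandler handler = new RowHandler();
        factory.newSAXParser().parse(in, handler);
        return handler.rows;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SpreadsheetMlSaxSketch <file.xml>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            for (List<String> r : parse(in)) {
                System.out.println(r);
            }
        }
    }
}
```

Every cell value comes back as a string; the ss:Type attribute on Data (String, Number, DateTime, ...) could be read in the same handler if typed values are needed.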

You can certainly export your Excel file to plain XML.
Then there are various libraries you can parse it with:
JDOM
SAX
DOM
Simple XML
and more...
You should also read this topic: Best way to read XML in Java
Enjoy
EDIT
If you need to extract data from Excel XML, then your file has the extension XLS, right? Then you can easily use jXLS.

Related

Adding cell comments from raw xml for Excel 2007 using java

I need to create cell comments for an existing Excel file by extracting the xlsx file and editing the raw XML.
Is it possible to do this in Java without using the Apache POI library?
Well, I don't know much about Excel, but I found this website that might help you.

JAVA parse a CSV file [duplicate]

This question already has answers here:
CSV API for Java [closed]
(10 answers)
Closed 9 years ago.
I recently worked through various examples and tutorials on parsing XML, and was introduced to JSoup, DOM, StAX, and maybe one other.
I now want to branch off into opening and reading a CSV (comma-delimited) file and searching it for particular data.
A brief internet search shows similar to the XML exercise - plenty of options.
What technique do you recommend (within the Java world) for opening, reading, and searching a CSV file?
I would also like to cover writing a CSV file.
Thanks for the advice.
I can recommend the opencsv library, which is a simple and easy-to-use CSV parser. You can find examples of how to read and how to write in the FAQ on the opencsv site.

Can I save an excel spreadsheet as a jpg using java?

Does anyone know if it's possible to save an Excel Spreadsheet as a jpg using java? Currently, I am reading and manipulating Excel Spreadsheets in java using Apache POI. It's working great for everything else, but I haven't been able to find an answer to this question in their documentation or online.
There are some commercial libraries that do this. Another solution would be to use Apache POI to render your Excel file as HTML and then convert that to an image with a library such as java-html2image.

How to load old Microsoft Office XML file (Excel) using Java

I'm not able to load an Excel file in the older Office XML format (think Office 2002 or 2003 version) into Java. I tried JXL and Apache's POI (version 3.7). POI doesn't work since it appears to want the newer Office .xlsx format.
Here's an example of the older Office XML format.
One can generate a similar XML file from MS Excel 2010 by saving the workbook in the format "XML Spreadsheet 2003".
Are there any open-source Java libraries that will load the XMLSS format? Otherwise I have no choice but to write a custom parser: read the XML file, then interpret the cell tags to build out the cell matrix. In this XML format, cells with empty values are skipped, and the next cell that has data is positioned with an Index attribute that acts as a column offset, I assume to save space in the XML file.
The format is called SpreadsheetML (not to be confused with .xlsx, which is also XML-based). A library called Xelem can handle it:
import nl.fountain.xelem.excel.Workbook;
import nl.fountain.xelem.lex.ExcelReader;
//...
ExcelReader reader = new ExcelReader();
Workbook xlWorkbook = reader.getWorkbook("c:\\my\\spreadsheet.xml");
System.out.println(xlWorkbook.getSheetNames());
Copying Mark Beardsley's answer from POI team http://apache-poi.1045710.n5.nabble.com/How-to-convert-xml-to-xls-td2306602.html :
You have got an Office 2003 xml file there, not an OpenXML file; it is an early attempt by Microsoft to create an xml based file format for Excel and it is in that sense a 'valid' Office file format.
Sadly, POI cannot interpret this file at all and that is why you saw the exception when you tried to wrap it up in the InputStream and pass it to WorkbookFactory's constructor. You do however have a number of options:
You could use Excel itself and manually open and save each file you wish to convert, as you already have done.
If you have access to Visual Studio and can write Visual Basic or C# code then you could use a control that will allow you to control Excel programmatically. This way you could automate a file conversion process using Excel itself. Then once the file has been converted either to the binary or OpenXML formats, POI can be used to process it.
If you are running on a stand alone PC on which a copy of Excel is installed and using the Windows operating system, then you could use OLE to do something very similar from Java code. As above, POI can be used to process the file following the conversion.
If you have access to OpenOffice, it has a rather good API that is accessible from Java code. You could use it to convert between the file types for you - it is simply a matter of discovering the correct filter to use in this case. OpenOffice is good for all except the most complex files and you should be able to use POI to process the file following conversion. However, if you choose this route, it may be best to do all of the work using OpenOffice's UNO api.
Depending upon what you want to do with the file's contents, you could create your own parser using core java code and either the SAX or Xerces parsers (consider using xmlBeans (http://xmlbeans.apache.org/) ). If you simply open the original xml file using a simple text editor, you can see that the structure is not complex and, if all you wish to get at is the raw data it contains, this could be your best option.
After a lot of pain I've found a solution to this. JODConverter uses the OpenOffice.org/LibreOffice API and can convert SpreadsheetML to whatever formats OpenOffice.org supports.
You might get some results using the OpenOffice API. If it cannot read the file directly, you could probably convert it to a 'supported' format.
Otherwise, the schema for the Office 2003 'SpreadsheetML' isn't very complicated. I have successfully created an XSLT scenario to convert a result set (database query) to a (simple yet effective) Excel 2003 document (XML format). The other way around should not be very hard to achieve.
Cheers,
Wim
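As an illustration of the "other way around" direction mentioned above, here is a rough sketch that runs an XSLT 1.0 stylesheet from Java to flatten a SpreadsheetML file into tab-separated text. It is not the stylesheet the answer describes. It uses only the JDK's javax.xml.transform API, the class name, method name, and stylesheet are hypothetical, and it deliberately ignores the ss:Index attribute, so columns will shift when a row has skipped cells.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: SpreadsheetML (XML Spreadsheet 2003) to tab-separated text.
final class SpreadsheetMlXsltSketch {

    // One output line per Row, cells separated by tabs. ss:Index is NOT honoured.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + "    xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">"
        + "  <xsl:output method=\"text\"/>"
        + "  <xsl:template match=\"/\">"
        + "    <xsl:for-each select=\"//ss:Row\">"
        + "      <xsl:for-each select=\"ss:Cell\">"
        + "        <xsl:if test=\"position() &gt; 1\"><xsl:text>&#9;</xsl:text></xsl:if>"
        + "        <xsl:value-of select=\"ss:Data\"/>"
        + "      </xsl:for-each>"
        + "      <xsl:text>&#10;</xsl:text>"
        + "    </xsl:for-each>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    static String toTabSeparated(Source workbook) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(workbook, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SpreadsheetMlXsltSketch <file.xml>");
            return;
        }
        System.out.print(toTabSeparated(new StreamSource(new File(args[0]))));
    }
}
```

If the skipped-cell offsets matter, a hand-written parser that tracks the Index attribute is probably easier to get right than doing it in XSLT 1.0.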
The answer today was to ask the vendor to change their Excel file format to an Excel binary rather than the old Office XML. Doing so allowed me to use Apache POI 3.7 to read the file with no issues. I appreciate the answers, as I had no idea there was no direct support in the Java-based open source libraries for this old Office XML format. Now I know next time to check earlier to see what format the Excel files are in before committing to a timeline.
I had the same problem some time ago and ended up writing a SAX parser to read the XML file. I wrote a blog post about it here.
You can find the sample project to parse the file in Github.

xlsx file reading in java [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Read xlsx file in Java
Can anybody explain how to read an xlsx file in Java?
Try Apache POI - the Java API for Microsoft Documents
Have a look at http://poi.apache.org/spreadsheet/index.html
"Why would you use docx4j to do this", I hear you ask, "rather than POI, which focuses on xlsx and binary xls?"
Probably because you like JAXB (as opposed to XML Beans), or you are already using docx4j for docx or pptx, and need to be able to do some stuff with xlsx as well.
Another possible reason is that the jar XML Beans generates from the OpenXML schemas is too big for your purposes. (To get around this, POI offers a 'lite' subset: the 'big' ooxml-schemas-1.0.jar is 14.5 MB! But if you need to support arbitrary spreadsheets, you'll probably need the complete jar). In contrast, the whole of docx4j/pptx4j/xlsx4j weighs in at about the same as POI's lite subset.
If you are processing spreadsheets only (i.e. not docx or pptx), and the preceding paragraph is not a concern for you, then you would probably be best off using POI.
