Read excel file in java without using any jars - java

I have a requirement to read an Excel file in Java without using any third-party library JAR such as POI or JExcel. I don't know whether Spring offers any support for this. Please suggest something if you know of a way.
Thanks in advance.
I have already done this using POI, but I need to read the file without using any JAR:
public static void readFromExcel(String file) throws IOException {
    HSSFWorkbook myExcelBook = new HSSFWorkbook(new FileInputStream(file));
    HSSFSheet myExcelSheet = myExcelBook.getSheet("Birthdays");
    HSSFRow row = myExcelSheet.getRow(0);
    if (row.getCell(0).getCellType() == HSSFCell.CELL_TYPE_STRING) {
        String name = row.getCell(0).getStringCellValue();
        System.out.println("name : " + name);
    }
}

If you are not allowed to use a 3rd party JAR or library you will need to write a parser to read the document into your data classes.
I would advise you to take a look at the file format specification for Microsoft Office. You will need to understand this to build a reliable parser.
It would be much easier to just use Apache POI and see if the requirements can be changed to allow it.
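For the newer .xlsx format specifically, the container is a zip archive of XML parts, so a rough reader can be put together from JDK classes alone. The sketch below is only that: it assumes the sheet lives in a part such as xl/worksheets/sheet1.xml (real files map sheet names to parts through xl/workbook.xml), that the XML elements carry no namespace prefix, and it ignores styles, dates and inline strings. The class and method names are made up for illustration. It does not help with the binary .xls format that HSSFWorkbook reads in the question, which would need a parser for the OLE2/BIFF8 layout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads cell values from an .xlsx using only the JDK.
class XlsxPlainReader {

    // Parses one XML part of the zip container; returns null if the part is absent.
    static Document parseEntry(ZipFile zip, String name) throws Exception {
        ZipEntry entry = zip.getEntry(name);
        if (entry == null) {
            return null;
        }
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations so untrusted files cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try (InputStream in = zip.getInputStream(entry)) {
            return factory.newDocumentBuilder().parse(in);
        }
    }

    // Returns the raw text of every non-empty cell of one sheet part, in document order.
    static List<String> readCells(String xlsxPath, String sheetPart) throws Exception {
        List<String> values = new ArrayList<>();
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            // String cells store an index into the shared string table.
            List<String> shared = new ArrayList<>();
            Document sst = parseEntry(zip, "xl/sharedStrings.xml");
            if (sst != null) {
                NodeList items = sst.getElementsByTagName("si");
                for (int i = 0; i < items.getLength(); i++) {
                    shared.add(items.item(i).getTextContent());
                }
            }
            Document sheet = parseEntry(zip, sheetPart);
            if (sheet == null) {
                throw new IOException("No such part: " + sheetPart);
            }
            NodeList cells = sheet.getElementsByTagName("c");
            for (int i = 0; i < cells.getLength(); i++) {
                Element cell = (Element) cells.item(i);
                NodeList v = cell.getElementsByTagName("v");
                if (v.getLength() == 0) {
                    continue; // empty cell, or an inline string this sketch skips
                }
                String raw = v.item(0).getTextContent();
                boolean isShared = "s".equals(cell.getAttribute("t"));
                values.add(isShared ? shared.get(Integer.parseInt(raw)) : raw);
            }
        }
        return values;
    }
}
```

A call such as XlsxPlainReader.readCells("birthdays.xlsx", "xl/worksheets/sheet1.xml") would return the cell texts as strings; numbers and dates come back in their stored form, which is part of why a full parser is a sizeable job.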

Related

Error when parsing an embedded .xlsx file from a .ppt using apache-poi. The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry

I am facing an issue when using Apache POI to extract an embedded .xlsx file from a .ppt file. It would be really great if somebody could help me out.
The subject of the problem:
Problem trying to solve: Extracting a ".xlsx" file embedded inside a ".ppt".
I am currently using apache-poi.
When I do this using hslfSlideShow.getEmbeddedObjects(), I get the xlsx object just fine, but when I try converting it to an XSSFWorkbook object using, say, WorkbookFactory.create(inputStream), it throws this error:
java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file? Had: [OlePres000, Ole, CompObj, Package]
at org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:286)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:326)
at org.apache.poi.hssf.usermodel.HSSFWorkbookFactory.createWorkbook(HSSFWorkbookFactory.java:64)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:167)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:112)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:253)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:221)
Interestingly, it is calling HSSFWorkbookFactory even though it's an xlsx file.
And no, the xlsx file is not corrupted or password-protected. I can open it just fine.
Also, it works fine if I try parsing the .xlsx file without embedding it in the .ppt.
And the parsing works fine when I embed it in a .pptx file and call methods such as xmlSlideShow.getAllEmbeddedParts() to get the embedded objects from .pptx.
Promoting some comments and investigation to an answer...
This was a limitation in older versions of Apache POI, but was fixed in July in r1880164.
For backwards-compatibility reasons, PowerPoint will often (but not always...) write embedded OOXML resources wrapped in an intermediate OLE2 layer. This lets tools/programs which expect embedded office documents to be something like an xls / doc cope with them, but at the expense of another layer of wrapping.
Newer versions of Apache POI (5.0 should be the first released one with the fix in) have support in WorkbookFactory for receiving an OLE2 wrapper like this, pulling out the underlying xlsx stream and handing that off to XSSFWorkbook. (Older versions did this for OLE2-based password-protected xlsx files, but not their unencrypted cousins.)
For now, if you're stuck on an affected POI version, the code you'll want is something like this (largely taken from the unit test verifying support!):
POIFSFileSystem fs = new POIFSFileSystem(data.getInputStream());
if (fs.getRoot().hasEntry("Package")) {
    DocumentInputStream dis = new DocumentInputStream((DocumentEntry) fs.getRoot().getEntry("Package"));
    try (OPCPackage pkg = OPCPackage.open(dis)) {
        XSSFWorkbook wb = new XSSFWorkbook(pkg);
        handleWorkbook(wb);
        wb.close();
    }
} else {
    try (HSSFWorkbook wb = new HSSFWorkbook(fs)) {
        handleWorkbook(wb);
    }
}
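To see why the HSSF code path gets picked for an embedded xlsx, it can help to look at the first bytes of the stream: an OLE2 container starts with the signature D0 CF 11 E0 A1 B1 1A E1, while a bare OOXML file is a zip starting with PK 03 04. The following is a rough, JDK-only sketch of that kind of content sniffing. The class and enum names are hypothetical, and it is not POI's own detection code, just an illustration of the idea.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: tells an OLE2 container from a zip by its leading bytes.
class ContainerSniffer {

    enum Kind { OLE2, ZIP, UNKNOWN }

    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Peeks at up to 8 bytes and rewinds, so the caller can still read the whole stream.
    static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[8];
        in.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than 8 bytes
            }
            n += r;
        }
        in.reset();
        if (startsWith(head, n, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(head, n, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

An embedded object wrapped as described above would report OLE2 here even though the payload inside its Package entry is a zip, which matches the stack trace going through HSSFWorkbookFactory.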

Reading data from ods(Open office spreadsheet) for selenium in java

Is it possible to use ods (OpenOffice spreadsheet) files to read and write data in Selenium using Java? I searched the internet and could not find any way. Please, can anyone help me?
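File file = new File("file location");
Sheet sheet = SpreadSheet.createFromFile(file).getSheet(0);
MutableCell domain = null;
// Write: set the value of a cell (jOpenDocument's getCellAt(x, y) takes the column first, then the row)
sheet.getCellAt(columnIndex, rowIndex).setValue(valueToWrite);
// Read: getValue() fetches the value stored in the cell
Object cellValue = sheet.getCellAt(columnIndex, rowIndex).getValue();
sheet.getSpreadSheet().saveAs(file);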
Note:
To use this, you need the jOpenDocument jar file added to your project.

While Reading the data from Excel file with extension xlsx using apache poi it takes long time

When reading an Excel file with the xlsx extension using Apache POI, it takes a long time to identify the extension. Can you please explain why it takes so long?
if (file.getExcelFile().getOriginalFilename().endsWith("xls")) {
    workbook = new HSSFWorkbook(file.getExcelFile().getInputStream());
} else if (file.getExcelFile().getOriginalFilename().endsWith("xlsx")) {
    workbook = new XSSFWorkbook(file.getExcelFile().getInputStream());
} else {
    throw new IllegalArgumentException("Received file does not have a standard excel extension.");
}
Promoting a comment to an answer - don't try to do this yourself, Apache POI has built-in code for doing this for you!
You should use WorkbookFactory.create(File) to do it, e.g. just:
workbook = WorkbookFactory.create(file.getExcelFile());
As explained in the Apache POI docs, use a File directly in preference to an InputStream for quicker and lower-memory processing.
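The question's code only has an InputStream from the upload, so a File may not be directly at hand. One possible way to bridge that, sketched with JDK classes only, is to spool the stream to a temporary file first. The helper name and the "upload-" prefix are invented for this example, and whether the extra disk write pays off for a given workload is something to measure rather than assume.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies an uploaded stream to a temp file so a File-based API can be used.
class UploadSpooler {

    // suffix should carry the extension, e.g. ".xlsx"; the caller deletes the file when done.
    static Path spoolToTempFile(InputStream upload, String suffix) throws IOException {
        Path tmp = Files.createTempFile("upload-", suffix);
        try {
            // createTempFile already created an empty file, hence REPLACE_EXISTING.
            Files.copy(upload, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a partial file behind
            throw e;
        }
        return tmp;
    }
}
```

The resulting path could then be handed to WorkbookFactory.create(tmp.toFile()), with the workbook closed and the temp file deleted once processing is finished.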

How can I process an xlsx file (from a template) in Java?

In my application we generate Excel files using the JExcel API, which allows us to use XLS template files. Now we must also handle the XLSX format, but JExcel cannot handle it. What other API can be used? I wanted to use POI, but it does not take the templates into account. This forces us to change the code to fully recreate a file each time.
Thanks.
The xlsx format is just a zip of XML files, possibly with some other files.
You could use ZipFile, but a zip file system may be easier for operating on single embedded XML files:
Map<String, String> zipProperties = new HashMap<>();
zipProperties.put("encoding", "UTF-8");
try (FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties)) {
    Path mediaPath = zipFS.getPath("/word/media");
    ...
}
You can copy, rename, move and so on. (The docxUri and /word/media names come from a Word document; an .xlsx keeps its parts under /xl.) Excel is a bit harder, as it keeps shared strings in a separate xl/sharedStrings.xml.
This approach lets you stay close to a current Excel format variant, which Apache POI seems to have difficulty achieving.
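Applied to the template use case, the zip file system idea might look roughly like the sketch below: copy the template, open the copy as a file system, and substitute placeholders inside xl/sharedStrings.xml. The ${name} placeholder syntax and the class name are assumptions made for this example, not an Excel feature, and the sketch only works when each placeholder sits in a single text run (Excel may split rich text into several runs).

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collections;
import java.util.Map;

// Hypothetical helper: fills ${key} placeholders in an .xlsx template using only the JDK.
class XlsxTemplateFiller {

    static void fill(Path template, Path output, Map<String, String> values) throws IOException {
        // Work on a copy so the template itself stays untouched.
        Files.copy(template, output, StandardCopyOption.REPLACE_EXISTING);
        URI zipUri = URI.create("jar:" + output.toUri());
        try (FileSystem zipFs = FileSystems.newFileSystem(zipUri, Collections.<String, Object>emptyMap())) {
            Path strings = zipFs.getPath("/xl/sharedStrings.xml");
            String xml = new String(Files.readAllBytes(strings), StandardCharsets.UTF_8);
            for (Map.Entry<String, String> e : values.entrySet()) {
                xml = xml.replace("${" + e.getKey() + "}", escapeXml(e.getValue()));
            }
            // Overwrites the entry; the archive is rewritten when the file system closes.
            Files.write(strings, xml.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Minimal escaping for text placed inside an XML element.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because every other part of the archive is left as it was, styles, column widths and formulas from the template carry over unchanged; numeric cells are not stored in the shared string table and would need the sheet XML edited instead.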

Java library for reading Word documents

Is there an open-source Java library for reading Word documents (both .docx and the older .doc format)?
Read-only access is sufficient; I do not need to modify the Word documents using Java. However, I would like to have access to images and style information.
EDIT
I've checked out Apache POI, but it doesn't look like it is being actively maintained. See http://poi.apache.org/hwpf/index.html:
At the moment we unfortunately do not have someone taking care for HWPF and fostering its development.
Use Apache POI: HWPF for .doc files and XWPF for .docx files.
There is an apache project that does this: http://poi.apache.org//
public class XParseTest {
    public static void main(String[] args) throws XmlException, OpenXML4JException, IOException {
        File file = new File("e:\\testing\\new.docx");
        FileInputStream fs = new FileInputStream(file);
        OPCPackage d = OPCPackage.open(fs);
        XWPFWordExtractor xw = new XWPFWordExtractor(d);
        System.out.println(xw.getText());
    }
}
This will parse a .docx file and print its text.
