Biff exception in Java - java

When I try to read an Excel file in Java, it throws a BiffException.
What does this mean? I tried to Google it but couldn't find a proper explanation.
jxl.read.biff.BiffException: Unable to recognize OLE stream
at jxl.read.biff.CompoundFile.<init>(CompoundFile.java:116)
at jxl.read.biff.File.<init>(File.java:127)
at jxl.Workbook.getWorkbook(Workbook.java:221)
at jxl.Workbook.getWorkbook(Workbook.java:198)
at Com.Parsing.ExcelFile.excel(Extract.java:20)
at Com.Parsing.Extract.main(Extract.java:55)

I faced a similar problem and was able to fix it.
I was using a .xlsx file, and when I changed it to a .xls file it worked just fine. It seems JXL doesn't support the .xlsx format.
Please correct me if anybody knows that it does.
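To see which container a file really uses before handing it to JXL, you can look at its first bytes. The sketch below is only an illustration: the class and method names are made up, and it assumes the usual signatures (an OLE2 compound document starts with D0 CF 11 E0 A1 B1 1A E1, a ZIP archive such as .xlsx starts with PK\x03\x04). A match only identifies the container, so "OLE2" could still be a .doc and "ZIP" could be any archive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of JXL or POI.
class ExcelFormatSniffer {
    // Signature that opens an OLE2 compound document (the container of binary .xls).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local-file-header signature of a ZIP archive (the container of .xlsx).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Classifies the leading bytes of a file as "OLE2", "ZIP" or "UNKNOWN". */
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n;
            // Keep reading until the 8-byte buffer is full or the file ends.
            while (filled < header.length
                    && (n = in.read(header, filled, header.length - filled)) > 0) {
                filled += n;
            }
        }
        System.out.println(args[0] + ": " + sniff(header));
    }
}
```

If this reports "ZIP" for a file you are passing to JXL, that would be consistent with the "Unable to recognize OLE stream" message above.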

The javadoc for BiffException.
Exception thrown when reading a biff file.
This exception has a number of messages that should provide some information about the cause:
excelFileNotFound
excelFileTooBig
expectedGlobals
passwordProtected
streamNotFound
unrecognizedBiffVersion
unrecognizedOLEFile
Edit:
unrecognizedOLEFile seems to mean that something is embedded in the file that cannot be read.

An Excel workbook with several sheets (from BIFF5 on) is stored using the compound document file format (also known as “OLE2 storage file format” or “Microsoft Office compatible storage file format”). It contains several streams for different types of data.
A complete documentation of the format of compound document files can be found at
http://sc.openoffice.org/compdocfileformat.pdf
I think the exception means that your parsing library cannot recognise the file (for example, the BIFF5 format cannot be parsed by POI or JExcelApi).
You can check your file's version by opening it in Office and clicking 'Save As'; the format shown in the file dialog's list is the file's current version.

Related

Reading a .xls file that is actually neither .xls nor .xlsx with POI or jxl throws "Unable to recognize OLE stream"

I want to open a .xls file that is not supported by Apache POI, because the file is actually neither a .xls file nor a .xlsx one. If you open it in a text editor, you'll see that it is HTML instead!
Please see https://bz.apache.org/bugzilla/show_bug.cgi?id=51031
POI fails with java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream
I also tried jxl, and it throws jxl.read.biff.BiffException: Unable to recognize OLE stream.
The only way around this is to open the file and save it again. Since I am using an automation script, can anyone suggest a workaround?
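One possible direction for the script, sketched under assumptions: detect the disguised HTML up front and route those files to an HTML parser instead of POI or jxl. The class name and the list of tags checked here are made up for illustration; a real export may begin with different markup.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: guesses whether a ".xls" file is really an HTML page.
class HtmlDisguiseCheck {
    /** Returns true if the leading bytes look like HTML markup rather than a binary workbook. */
    static boolean looksLikeHtml(byte[] head) {
        // ISO-8859-1 maps every byte to exactly one char, so decoding cannot fail on binary input.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        // Drop a UTF-8 byte order mark (EF BB BF) if one is present.
        if (text.startsWith("\u00EF\u00BB\u00BF")) {
            text = text.substring(3);
        }
        String start = text.trim().toLowerCase(Locale.ROOT);
        // Assumed set of opening tags; extend it to match the files you actually receive.
        return start.startsWith("<html")
            || start.startsWith("<!doctype html")
            || start.startsWith("<table")
            || start.startsWith("<meta");
    }

    public static void main(String[] args) {
        byte[] fake = "\r\n<html><body><table><tr><td>1</td></tr></table></body></html>"
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("looks like HTML: " + looksLikeHtml(fake));
    }
}
```

Passing the first few hundred bytes of the file is enough for this check.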

Validation of files based on their file extensions

I get files from queues in Java. They may be of the following formats.
docx
pdf
doc
xls
xlsx
txt
rtf
After reading their extensions, I want to validate whether they are actually files of these types.
For example, I got a file and checked that it has the extension .xls. Afterwards, I want to check whether it is actually an .xls file or whether someone uploaded a file of some other format after changing its extension.
EDIT: I'd like to check the file's MIME type by actually checking its content, not its extension. How can it be done?
I don't think this is a problem you should be solving. Any solution to this problem would be brittle and based upon your current understanding of what constitutes a valid file of a particular type.
For example, take an XLS file. Do you know for sure what Excel accepts when opening such a file? Can you be sure you'll keep abreast of any changes in future releases that might support a different encoding style?
Ask yourself: what's the worst that could happen if the user uploads a file of the wrong type? Perhaps you'll pass the file to the application that handles that file extension and you'll get an error? Not a problem, just pass that on to the user!
Without using external libraries:
You can get the file mimetype using MimetypesFileTypeMap:
File f = new File(...);
System.out.println(new MimetypesFileTypeMap().getContentType(f));
You can get a similar result with:
URLConnection.guessContentTypeFromName
Both these solutions, according to the documentation, look only at the extension.
A better option: URLConnection.guessContentTypeFromStream
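File f = new File(...);
try (InputStream in = new BufferedInputStream(new FileInputStream(f))) { // the stream must support mark/reset, otherwise null is returned
    System.out.println(URLConnection.guessContentTypeFromStream(in));
}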
This tries to guess from the first bytes of the file. Be warned that this is only a guess: I found it works in most cases, but it fails to detect some obvious types.
I recommend a combination of both.
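A minimal sketch of that combination, using only the JDK: try the content-based guess first and fall back to the name-based one. The class and method names are made up for illustration, and either call may return null, so the caller still has to handle "unknown".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.URLConnection;

// Hypothetical helper combining the two JDK guessers mentioned above.
class ContentTypeGuess {
    /** Content-based guess first, extension-based guess as a fallback; may return null. */
    static String guess(String fileName, byte[] firstBytes) {
        String fromContent = null;
        try {
            // ByteArrayInputStream supports mark/reset, which guessContentTypeFromStream needs.
            fromContent = URLConnection.guessContentTypeFromStream(
                new ByteArrayInputStream(firstBytes));
        } catch (IOException e) {
            // Reading from memory should not fail; treat a failure as "no guess".
        }
        return fromContent != null
            ? fromContent
            : URLConnection.guessContentTypeFromName(fileName);
    }

    public static void main(String[] args) {
        // A GIF header hiding behind an .xls name: the content-based guess takes precedence.
        byte[] gif = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(guess("upload.xls", gif));
    }
}
```

Comparing the two guesses (instead of just falling back) is another option if you want to flag files whose extension disagrees with their content.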

Read excel file extension from struts control <s:file>

I want to read the contents of an Excel file selected using <s:file>. I am using POI 3.8. The problem is that I am not able to identify whether the file is xls or xlsx. Please help.
Why do you need to know?
Just use WorkbookFactory to load the excel file, and it'll autodetect for you. You then don't need to know, and your code remains completely generic between the two file formats.

Need to find corrupt document(docx file format)

I am using XSLT to convert my HTML to the docx file format (which is Office Open XML). When I open some of the generated docx files in Word, it shows an error (maybe a mistake in the XML nodes). Is it possible to find out whether a created document will open or show errors while opening? Is it possible to recover the document programmatically (as Word does when a document contains errors)? Or is there any Word API we can use in our code to recover it?
Please help me. Thanks in advance.
Try checking the relationships XML file within word/_rels and compare it with the one in a working docx. My docx files get corrupted when I forget to add the corresponding entries there.
Update:
Also check that all your image file extensions are defined in the [Content_Types].xml file.
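The content-types check can be automated with the JDK's zip classes. The sketch below is an illustration under assumptions: the class and method names are made up, images are assumed to live under word/media/, and only Default Extension="..." entries are considered (Override entries for individual parts are ignored).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any OOXML library.
class DocxMediaCheck {
    private static final Pattern DEFAULT_EXT =
        Pattern.compile("<Default\\b[^>]*\\bExtension\\s*=\\s*\"([^\"]+)\"");

    /** Extensions used under word/media/ that have no Default entry in [Content_Types].xml. */
    static Set<String> undeclaredMediaExtensions(byte[] docx) {
        Set<String> used = new TreeSet<>();
        Set<String> declared = new TreeSet<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(docx))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.equals("[Content_Types].xml")) {
                    String xml = new String(readAll(zin), StandardCharsets.UTF_8);
                    Matcher m = DEFAULT_EXT.matcher(xml);
                    while (m.find()) {
                        declared.add(m.group(1).toLowerCase(Locale.ROOT));
                    }
                } else if (name.startsWith("word/media/") && !entry.isDirectory()
                        && name.contains(".")) {
                    used.add(name.substring(name.lastIndexOf('.') + 1).toLowerCase(Locale.ROOT));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        used.removeAll(declared);
        return used;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Builds a tiny in-memory package for the demo; arguments alternate entry name and content. */
    static byte[] zipOf(String... nameThenContent) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            for (int i = 0; i + 1 < nameThenContent.length; i += 2) {
                zout.putNextEntry(new ZipEntry(nameThenContent[i]));
                zout.write(nameThenContent[i + 1].getBytes(StandardCharsets.UTF_8));
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] pkg = zipOf(
            "[Content_Types].xml",
            "<Types><Default Extension=\"png\" ContentType=\"image/png\"/></Types>",
            "word/media/image1.png", "fake png bytes",
            "word/media/image2.jpeg", "fake jpeg bytes");
        System.out.println("Undeclared extensions: " + undeclaredMediaExtensions(pkg));
    }
}
```

For a real file you would pass the bytes of the generated docx instead of the demo package.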
Is it possible to find whether the created document will open or show errors while opening
In theory, you should be able to use a validating XML parser to validate your created document against the XML schemas for OOXML. In practice:
You might need to do some searching to locate machine-readable versions of the relevant schemas.
It is not inconceivable that the problems are due to things that would not be picked up by schema validation.
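Before full schema validation, a cheaper first step is to confirm that each XML part parses at all. Note that this is a weaker check than the schema validation described above: it needs no OOXML schemas, but it only catches malformed XML, not schema violations. The class name is made up for illustration, and each part's bytes are assumed to have been extracted from the docx ZIP beforehand.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: well-formedness only, no schema validation.
class PartWellFormedness {
    /** Returns true if the bytes parse as namespace-well-formed XML. */
    static boolean isWellFormed(byte[] xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // DefaultHandler rethrows fatal errors, so malformed input ends up in the catch below.
            factory.newSAXParser().parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return true;
        } catch (SAXException | IOException e) {
            // With in-memory input, an IOException can only come from undecodable bytes.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        byte[] good = "<w:p xmlns:w=\"urn:example\"><w:r/></w:p>".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "<p><r></p>".getBytes(StandardCharsets.UTF_8);
        System.out.println("good part: " + isWellFormed(good));
        System.out.println("bad part: " + isWellFormed(bad));
    }
}
```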
Is it possible to recover the document programmatically(what word do if the document contains error)?
In general no. If the document is sufficiently different from what MS Office expects, it won't be able to "make head nor tail of it". (It ain't magical ...)
or any word api to use in our code to recover
Again, no. If the document is sufficiently different from the schema, a schema-conforming reader / writer API won't be able to cope with it.
The real solution is to figure out what the errors in your conversion software are and rectify them. Apart from validating against the schema, there are unlikely to be any real short-cuts.
Your file has probably become corrupted. To recover it, you would need to use a third-party Word recovery tool.

Why is a cell value returned as an empty string even though it has content?

I wrote a Java method that uses POI HSSF API to convert an Excel file to my data structure.
The code worked just fine for a while. But now there are suddenly problems. I suspect it might be because recently we installed Office 2007 on all the client computers, previously we had Office 2003.
The problem I ran into is: Inside the XLS file I have a column of cells that is filled with serial-numbers by the user. When the Java application gets the cell, it has a cell type STRING. And when I ask for the string value of the cell I get an empty string.
The file is originally created by the application, then the users fill it with data and load it back into the application. So I don't think the file format is wrong, since it's created by the same version of the API.
What could be the problem?
EDIT:
Clarification: We upgraded Office installation to 2007, but the application still uses HSSF and XLS format. Only the users open and edit the files with Office 2007. Is that a problem?
Have you checked if Excel automatically switched the cell type to NUMERIC when the user entered the value?
Excel has this annoying feature to "intelligently guess" what kind of value the user enters which then often causes a problem in POI.
HSSF is the POI Project's pure Java implementation of the Excel '97(-2007) file format. XSSF is the POI Project's pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.
Read further:
http://poi.apache.org/spreadsheet/index.html
