Java Apache POI Excel file corrupt

I am using the Apache POI library in Java to generate an Excel report.
To generate the report, I read a pipe-delimited text file and write its contents into an Excel sheet.
In most cases, say 98%, I don't see any issue. Sometimes, however, the generated Excel file cannot be opened; Excel reports "Unreadable content".
I can work around the issue by adding or removing one or two characters (e.g. XX or AA) to or from the data in the text file. I have compared the data file of a working report with that of a corrupted one, and I don't see any difference at all.
I am not sure what could be the reason for this issue. I don't see any exception or error while writing data into the Excel file.
The data file is thousands of lines long, and the source code has around 15 classes of roughly 300 to 400 lines each, so it's hard to post both. Sharing a partial feed or class is not worthwhile, since adding or removing any character makes the sheet readable.
The report has 10 to 15 sheets with multiple formats, merged cells, and formulas.
Does anyone have a suggestion? Could there be a memory issue while reading or writing? The file is read line by line and written sheet by sheet.

I went through the whole application and could not find any trace of the cause. So I upgraded the application from Apache POI HSSF to Apache POI XSSF and refactored all the methods in the application, which fixed this issue.

Related

One computer corrupts Excel files (that are created through my program through Apache POI), while other computers work fine

One aspect of my Java program deals with Excel file creation and manipulation (and simply opening the file) using Apache POI.
In the office in which the program is being used, one computer seems to corrupt any Excel file that it opens (but only Excel files created by my program, other Excel files work fine). However, other computers have no such issue.
When other computers attempt to open the Excel file, it is corrupted (I have tried everything to repair the files, but nothing works).
Furthermore, the program used to work fine on this computer as well. Suddenly, one day, it began corrupting all Excel files created through the system.
Error Popup: "We found a problem with some content in '6077 - model mixed - July 2018 - EHF 16837.xlsm'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes."
There are various reasons that could cause this behaviour. I'd suggest analysing the Excel file in more detail as outlined here: https://stackoverflow.com/a/54077245/812093 (e.g. find the exact row number in the XML that is causing Excel to fail). If you can provide that information here, we can help you further.
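One part of that analysis can be scripted. An .xlsx or .xlsm file is a zip archive of XML parts, so a minimal sketch using only the JDK can walk the archive and report the line and column of the first well-formedness error in each part. The class and method names here are hypothetical, and this only catches malformed XML; content that is well-formed but that Excel still rejects will not show up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists XML parts of a zip-based workbook that fail to parse.
class XlsxPartChecker {

    static List<String> findMalformedParts(InputStream workbook) {
        List<String> problems = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        try (ZipInputStream zip = new ZipInputStream(workbook)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (!(name.endsWith(".xml") || name.endsWith(".rels"))) {
                    continue; // skip binary parts such as images or VBA projects
                }
                byte[] bytes = zip.readAllBytes(); // reads the current entry only
                try {
                    // DefaultHandler rethrows fatal errors, so a broken part ends up in the catch below.
                    factory.newSAXParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
                } catch (SAXParseException e) {
                    problems.add(name + " line " + e.getLineNumber()
                            + " column " + e.getColumnNumber() + ": " + e.getMessage());
                } catch (Exception e) {
                    problems.add(name + ": " + e);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }
}
```

Calling findMalformedParts with a FileInputStream on the corrupted workbook returns one entry per broken part; an empty list means every XML part is at least well-formed.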

How to stop forceformularecalculation apache poi

I am using Apache POI 3.12 to generate an Excel file from database data. For some cells I also use formulas; all calculations work properly and the file is generated successfully. My issue is below.
1) Opened the generated file using MS Excel application.
2) Verified the values and all are Ok.
3) As the Excel file was password protected, after verification I tried to close the file.
4) While closing, it showed the message "MS Excel recalculates formulas when opening files last saved by an earlier version of Excel".
I added "workbook.setForceFormulaRecalculation(false);" before writing my input stream to workbook.
According to the API docs, the formulas will not be recalculated once the workbook is opened if the value is set to "false".
FYI, I have used the unimplemented Excel function "YIELD" in one cell.
Can anybody help with this?

pagination of xlsx file to XSSFworkbook using apache POI

Right now my code reads an xlsx file into an XSSFWorkbook and then writes it into a database. But when the size of the xlsx file increases, it causes an OutOfMemoryError.
I cannot increase the server size or divide the xlsx file into pieces.
I tried loading the workbook from a File (instead of an InputStream), but that didn't help either.
I am looking for a way to read 10k rows at a time (instead of the entire file at once) and iteratively write to the workbook and then to the database.
Is there a good way to do this with Apache POI?
POI contains something called an "eventmodel" which is designed exactly for this purpose. It's mentioned in the FAQ:
The SS eventmodel package is an API for reading Excel files without loading the whole spreadsheet into memory. It does require more knowledge on the part of the user, but reduces memory consumption by more than tenfold. It is based on the AWT event model in combination with SAX. If you need read-only access, this is the best way to do it.
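To illustrate the idea behind that event model (this is not POI's own API), here is a minimal sketch using only the JDK's SAX parser. It streams the rows of a worksheet XML part and hands them to a callback in batches, so only one batch is held in memory at a time. BatchingRowHandler and its parameters are hypothetical names. It assumes unprefixed row and v elements and returns the raw text of each v element, whereas a real workbook stores most text in a shared-strings table that POI's event API (XSSFReader with XSSFSheetXMLHandler) resolves for you.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects rows from worksheet XML and flushes them in fixed-size batches.
class BatchingRowHandler extends DefaultHandler {
    private final int batchSize;
    private final Consumer<List<List<String>>> sink;
    private List<List<String>> batch = new ArrayList<>();
    private List<String> row;      // cells of the row currently being read
    private StringBuilder value;   // text of the <v> element currently being read

    BatchingRowHandler(int batchSize, Consumer<List<List<String>>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("row")) {
            row = new ArrayList<>();
        } else if (qName.equals("v")) {
            value = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (value != null) {
            value.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("v") && row != null) {
            row.add(value.toString());
            value = null;
        } else if (qName.equals("row")) {
            batch.add(row);
            row = null;
            if (batch.size() >= batchSize) {
                flush();
            }
        }
    }

    @Override
    public void endDocument() {
        flush(); // hand over the final, possibly partial batch
    }

    private void flush() {
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batch = new ArrayList<>();
        }
    }

    // Streams the given worksheet XML and calls sink once per batch of rows.
    static void read(InputStream sheetXml, int batchSize, Consumer<List<List<String>>> sink) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(sheetXml, new BatchingRowHandler(batchSize, sink));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
    }
}
```

With a batch size of 10,000, the callback would be the place to write each batch to the database before the next one is read.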
However, you may want to double check first if the issue is somewhere else. Check out this item:
I think POI is using too much memory! What can I do?
This one comes up quite a lot, but often the reason isn't what you might initially think. So, the first thing to check is - what's the source of the problem? Your file? Your code? Your environment? Or Apache POI?
(If you're here, you probably think it's Apache POI. However, it often isn't! A moderate laptop, with a decent but not excessive heap size, from a standing start, can normally read or write a file with 100 columns and 100,000 rows in under a couple of seconds, including the time to start the JVM).
Apache POI ships with a few programs and a few example programs, which can be used to do some basic performance checks. For testing file generation, the class to use is in the examples package, SSPerformanceTest. Run SSPerformanceTest with arguments of the writing type (HSSF, XSSF or SXSSF), the number of rows, the number of columns, and whether the file should be saved. If you can't run that with 50,000 rows and 50 columns in HSSF and SXSSF in under 3 seconds, and XSSF in under 10 seconds (and ideally all 3 in less than that!), then the problem is with your environment.
Next, use the example program ToCSV to try reading a file in with HSSF or XSSF. Related is XLSX2CSV, which uses SAX parsing for .xlsx. Run this against both your problem file, and a simple one generated by SSPerformanceTest of the same size. If this is slow, then there could be an Apache POI problem with how the file is being processed (POI makes some assumptions that might not always be right on all files). If these tests are fast, then any performance problems are in your code!

Is it possible using apache poi to load data from an open excel file that is constantly updating?

I have a Java program that uses Apache POI to load data from an Excel file.
The problem I'm facing is that I can't seem to load data from an Excel file that is constantly updating. I only get the initial data when I run my Java program.
You have to reread the data from the excel file. POI makes a copy into java objects when it reads it, so any further changes won't get reflected in your Java code without rereading the file.
If you mean that you do reread the file but don't see the updates, then it could be that someone is making changes in excel but not saving them, so POI can't see them yet.
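A minimal sketch of the reread-on-change idea, using only the JDK: remember the file's last-modified time and reload only when it moves. ReloadOnSave is a hypothetical helper; the actual reread (for example with POI's WorkbookFactory.create(file)) is left to the caller, and it can only ever see what has been saved to disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical helper: tells the caller when the workbook on disk has been saved again.
class ReloadOnSave {
    private final Path file;
    private FileTime lastSeen; // null until the first check

    ReloadOnSave(Path file) {
        this.file = file;
    }

    // Returns true on the first call and whenever the last-modified time has changed since the previous call.
    boolean changedSinceLastCheck() {
        try {
            FileTime now = Files.getLastModifiedTime(file);
            boolean changed = !now.equals(lastSeen);
            lastSeen = now;
            return changed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A polling loop would call changedSinceLastCheck() every few seconds and, when it returns true, open the workbook again from scratch and read the cells it needs.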
This answer is based on "Fetch Data From Excel"; have a look at that answer for more details. This question may be a duplicate of it, or vice versa.
The problem is that the Excel data is not saved. I was dealing with the same problem and came up with a different solution that worked for me: I created a macro in Excel that saves the workbook whenever its cell values change. The file on disk then always holds up-to-date data, which can be read from Java code and used for other purposes.

Saving different csv files as different sheets in a single excel workbook

Related to this question: how can I save many different CSV files into one Excel workbook, with one sheet per CSV? I would like to know how to do this programmatically in Java.
You'll need some form of library for accessing Excel from Java. A Google search turned this one up:
http://j-integra.intrinsyc.com/support/com/doc/excel_example.html
An alternative is to use the XML Excel format that came into being with Office 2003. You'll end up with an XML file, but you can open it in Excel and see the different sheets.
http://www.javaworld.com/javaworld/jw-07-2004/jw-0712-officeml.html
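A minimal sketch of that XML route, using only the JDK: it builds a SpreadsheetML 2003 document with one Worksheet element per CSV. CsvToSpreadsheetMl is a hypothetical name, every cell is written as a string, and the naive split on commas does not handle quoted fields; a real CSV parser would replace that line.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns several CSV files (already read into lines) into one SpreadsheetML 2003 document.
class CsvToSpreadsheetMl {

    // Keys are sheet names, values are the lines of the corresponding CSV file.
    static String toWorkbookXml(Map<String, List<String>> csvBySheetName) {
        StringBuilder out = new StringBuilder();
        out.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
           .append("<?mso-application progid=\"Excel.Sheet\"?>\n")
           .append("<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"")
           .append(" xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">\n");
        for (Map.Entry<String, List<String>> sheet : csvBySheetName.entrySet()) {
            out.append(" <Worksheet ss:Name=\"").append(escape(sheet.getKey())).append("\">\n")
               .append("  <Table>\n");
            for (String line : sheet.getValue()) {
                out.append("   <Row>");
                // Assumption: no quoted fields or embedded commas; keep trailing empty fields.
                for (String field : line.split(",", -1)) {
                    out.append("<Cell><Data ss:Type=\"String\">")
                       .append(escape(field))
                       .append("</Data></Cell>");
                }
                out.append("</Row>\n");
            }
            out.append("  </Table>\n </Worksheet>\n");
        }
        out.append("</Workbook>\n");
        return out.toString();
    }

    // Escapes the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

Passing a LinkedHashMap keeps the sheets in insertion order. Saving the returned string with an .xml extension should give a file that Excel opens with one tab per map entry.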
If you want open source, the POI library can be used to generate Excel files.
A nice CSV parser is Open CSV.
That should set the stage for what you are trying to do: basically, use the CSV parser to get the data, then write the data to an XLS file.
Take a look at the Aspose products, I've used them before when working with Excel and they saved me a huge amount of headache and time. Excel has several quirks that can make importing and exporting spreadsheets painful.
Aspose.Cells
