I need to append contents to an existing excel file using JExcel.
I am trying the following approach:
Read from existing workbook
workbook = Workbook.getWorkbook(new File(errorFilePath));
Create a writable workbook from the existing workbook in a temp file
if (!tempFile.exists()) {
tempFile.getParentFile().mkdirs();
tempFile.createNewFile();
}
newCopy = Workbook.createWorkbook(tempFile, workbook);
excelSheet = newCopy.getSheet(0);
Write to the writable workbook (times is a writable cell format variable)
Label label;
label = new Label(column, row, stringData, times);
excelSheet.addCell(label);
Close both the existing and the writable workbook, then delete the existing workbook
In the finally block, rename the temp file to the name of the existing (now deleted) workbook
finally {
if (null != newCopy) {
newCopy.write();
newCopy.close();
}
if (null != workbook) {
workbook.close();
}
if (null != errorFile && errorFile.exists()) {
errorFile.delete();
}
if (null != tempFile) {
tempFile.renameTo(new File(errorFilePath));
}
}
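One thing the finally block hides is whether the delete and rename steps succeeded: File.delete() and File.renameTo() return false on failure instead of throwing. A minimal sketch of the swap step, assuming Java 7 or later (the question uses 1.6, where java.nio.file is not available); TempFileSwap and replaceWithTemp are hypothetical names, not part of JExcel:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of JExcel: swaps a finished temp file
// into the place of the original workbook file.
class TempFileSwap {

    // Moves tempFile over target, replacing target if it already exists.
    // Unlike File.delete()/File.renameTo(), Files.move throws an
    // IOException describing the failure instead of returning false.
    static void replaceWithTemp(Path tempFile, Path target) throws IOException {
        Files.move(tempFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Both workbooks should be closed before the move, otherwise an open handle can make it fail on Windows.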
The problem is that everything works fine for the first run (without redeploying).
But whenever I change some Java code and the web application redeploys, I get a NullPointerException while closing the newly created workbook (after writing).
I am getting the following stack trace (originating from the line newCopy.write()):
java.lang.NullPointerException
at jxl.write.biff.CellValue.getData(CellValue.java:259)
at jxl.write.biff.LabelRecord.getData(LabelRecord.java:141)
at jxl.biff.WritableRecordData.getBytes(WritableRecordData.java:71)
at jxl.write.biff.File.write(File.java:147)
at jxl.write.biff.RowRecord.writeCells(RowRecord.java:329)
at jxl.write.biff.SheetWriter.write(SheetWriter.java:479)
at jxl.write.biff.WritableSheetImpl.write(WritableSheetImpl.java:1514)
at jxl.write.biff.WritableWorkbookImpl.write(WritableWorkbookImpl.java:950)
Java Version : 1.6
JExcel Version : 2.6.10
Windows 7
Well, first suspicion is, in this line:
label = new Label(column, row, stringData, times);
you pass null argument(s).
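A quick way to test that suspicion is to fail fast where the cell is created, instead of later inside write(). A minimal sketch using only the JDK; CellArgs and requireCellArgs are hypothetical names, and the text and format parameters stand in for stringData and times from the question:

```java
// Hypothetical guard, not part of JExcel: rejects null values before
// they are handed to new Label(column, row, stringData, times).
class CellArgs {

    // Throws immediately, naming the null argument and the cell position,
    // so the stack trace points at the caller rather than at write().
    static void requireCellArgs(int column, int row, Object text, Object format) {
        if (text == null) {
            throw new IllegalArgumentException(
                "null cell text at column " + column + ", row " + row);
        }
        if (format == null) {
            throw new IllegalArgumentException(
                "null cell format at column " + column + ", row " + row);
        }
    }
}
```

Calling requireCellArgs(column, row, stringData, times) right before constructing the Label would show whether a null reaches it after a redeploy.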
I faced the same issue.
I was trying to add rows to the sheet dynamically in a loop using insertRow. After spending several hours on it, I concluded it was probably a bug in the latest version of the jxl API.
Versions of the JXL API after 2.6.9 seem to have a bug in insertRow. I switched from 2.6.12 to 2.6.9.
This is the code that I have for reading a very large excel file (xlsx) that is 23.5MB with 700,000+ rows.
String dir = rootPath + File.separator + "tmpFiles" + File.separator
+ FILE_NAME;
File fisNew = new File(dir);
Workbook w = StreamingReader.builder()
.rowCacheSize(100)
.open(fisNew);
Sheet worksheet = null;
worksheet = w.getSheetAt(0);
worksheet.getRow(0).getPhysicalNumberOfCells();
I get an UnsupportedOperationException / null pointer error on this line:
worksheet.getRow(0).getPhysicalNumberOfCells();
I also don't get an actual String value when I print SpecialtyUtil.removeWhiteSpaces(excelheader.getCell(0)). I am supposed to get the name of the column, but I get some StreamingSheet string instead. I'm not sure what I need to change here in order to process an xlsx file.
EDIT: Any idea how to write to an excel file using StreamingReader? I know that it is an unsupported operation, but is there a workaround?
If you look at the source code on GitHub (link below), StreamingSheet does not implement some of the Sheet methods; they throw UnsupportedOperationException instead. The snippet below shows one of them, getPhysicalNumberOfRows().
/**
* Not supported
*/
@Override
public int getPhysicalNumberOfRows() {
throw new UnsupportedOperationException();
}
The GitHub link is given below.
https://github.com/monitorjbl/excel-streaming-reader/blob/master/src/main/java/com/monitorjbl/xlsx/impl/StreamingSheet.java#L97
We can use getLastRowNum() instead:
int lastRowIndex = sheet.getLastRowNum(); // row numbers start at 0, so this is the index of the last row
Here is the implementation:
@Override
public int getLastRowNum() {
return reader.getLastRowNum();
}
StreamingSheet.java
I am trying to analyze Excel files that contain links to other files, and I would like to know the linked file's name and path. For that I'm using Apache POI 3.14.
I figured it out for Ref3DPtg objects, but for Ref3DPxg I don't know how to do it. I only get access to the cell address and the sheet name.
Does anyone know how to do it?
Code:
...
if(ptg instanceof Ref3DPxg){
cellAddress = ptg.format2DRefAsString();
sheetName = ptg.getSheetName();
workbookName = ???;
} else if(ptg instanceof Ref3DPtg) {
// by Ref3DPtg is no problem
}
Because of the way that the XLSX file format stores external references, which isn't actually =[Other.xlsx]Sheet1!A1 but =[23]Sheet1!A1, it's a multi-step process. First, get the external workbook number from the Pxg. Next, from the Workbook, get the ExternalLinks table for that workbook number, noting the off-by-one. (External Workbook 0 is actually the current workbook, so External Workbook 1 corresponds to External Link 0.) Finally, fetch the filename for that link.
So, your code should be something like:
if(ptg instanceof Ref3DPxg){
Ref3DPxg pxg = (Ref3DPxg)ptg;
int extWB = pxg.getExternalWorkbookNumber();
int extLink = extWB-1;
ExternalLinksTable links = wb.getExternalLinksTable().get(extLink);
String filename = links.getLinkedFileName();
}
I am using POI to read, edit and write Excel files.
My process flow is: write an Excel file, read it, and write it again.
If I then try to edit the file using the normal desktop Excel application while my Java app is still running, the file cannot be saved; Excel says some process is holding it.
I am properly closing all file handles.
Please help and tell me how to fix this issue.
SXSSFWorkbook wb = new SXSSFWorkbook(WINDOW_SIZE);
Sheet sheet = getCurrentSheet();//stores the current sheet in a instance variable
for (int rowNum = 0; rowNum < data.size(); rowNum++) {
if (rowNum > RECORDS_PER_SHEET) {
if (rowNum % (RECORDS_PER_SHEET * numberOfSheets) == 1) {
numberOfSheets++;
setCurrentSheet(wb.getXSSFWorkbook().createSheet());
sheet = getCurrentSheet();
}
}
final Row row = sheet.createRow(effectiveRowCnt);
for (int columnCount = 0; columnCount < data.get(rowNum).size(); columnCount++) {
final Object value = data.get(rowNum).get(columnCount);
final Cell cell = row.createCell(columnCount);
//This method creates the row and cell at the given loc and adds value
createContent(value, cell, effectiveRowCnt, columnCount, false, false);
}
}
public void closeFile(boolean toOpen) throws IOException {
    FileOutputStream out = null;
    try {
        out = new FileOutputStream(getFileName());
        wb.write(out);
    } finally {
        // out is non-null only if the file could be opened for writing
        if (out != null) {
            out.close();
            out = null;
            if (toOpen) {
                // Open the file for the user with the default program
                final Desktop dt = Desktop.getDesktop();
                dt.open(new File(getFileName()));
            }
        }
    }
}
The code looks correct. After out.close();, there shouldn't be any locks left.
Things that could still happen:
You have another Java process (for example hanging in a debugger). Your new process tries to write the file, fails (because of process 1) and in the finally, it tries to open Excel which sees the same problem. Make sure you log all exceptions that happen in wb.write(out);
Note: The code above looks correct in this respect, since it only starts Excel when out != null and that should only be the case when Java could open the file.
Maybe the file wasn't written completely (i.e. there was an exception during write()). Excel tries to open the corrupt file and gives you the wrong error message.
Use a tool like Process Explorer to find out which process keeps a lock on a file.
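Both failure scenarios above come down to never letting a failed write go unnoticed. A minimal sketch with plain JDK streams, assuming Java 7 or later for try-with-resources; SafeWriter and writeAll are hypothetical names, and in the POI case the body of the try would call wb.write(out) instead of out.write(data):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: writes a file and makes failures visible.
class SafeWriter {
    private static final Logger LOG = Logger.getLogger(SafeWriter.class.getName());

    // Writes data to target and reports whether the write completed.
    // The stream is closed by try-with-resources even if write() throws.
    static boolean writeAll(File target, byte[] data) {
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(data);
            return true;
        } catch (IOException e) {
            // Log instead of swallowing, so a locked or unwritable file is visible.
            LOG.log(Level.SEVERE, "Could not write " + target, e);
            return false;
        }
    }
}
```

The caller can then open the file in Excel only when writeAll returns true.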
I tried all the options. After looking thoroughly, it seems the problem is the event user model (the event API).
I am using the code below for reading the data:
final OPCPackage pkg = OPCPackage.open(getFileName());
final XSSFReader r = new XSSFReader(pkg);
final SharedStringsTable sst = r.getSharedStringsTable();
final XMLReader parser = fetchSheetParser(sst);
final Iterator<InputStream> sheets = r.getSheetsData();
while (sheets.hasNext()) {
final InputStream sheet = sheets.next();
final InputSource sheetSource = new InputSource(sheet);
parser.parse(sheetSource);
sheet.close();
}
I assume the Excel file is for some reason held by this process for some time. If I remove this code and use the code below:
final File file = new File(getFileName());
final FileInputStream fis = new FileInputStream(file);
final XSSFWorkbook xWb = new XSSFWorkbook(fis);
The process works fine and the Excel file does not remain locked.
I figured it out, actually.
A very simple line was required, but for some reason it was not explained in the New Halloween Document (http://poi.apache.org/spreadsheet/how-to.html#sxssf).
I checked the Busy Developer's Guide and got the solution.
I needed to add
pkg.close(); //To close the OPCPackage
With this added, my code works fine with any number of reads and writes on the same Excel file.
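An .xlsx file is a ZIP archive, and whatever opens it keeps the file handle until it is closed. The effect can be sketched with the JDK's own java.util.zip.ZipFile standing in for OPCPackage. This is an analogy, not POI code; ZipHandleDemo and countEntries are hypothetical names, and Java 7 or later is assumed:

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical demo: ZipFile plays the role of the opened package.
class ZipHandleDemo {

    // Counts the entries of a ZIP-based file and releases the file handle
    // on exit: try-with-resources calls zip.close() even on exceptions.
    static int countEntries(File archive) throws IOException {
        try (ZipFile zip = new ZipFile(archive)) {
            return zip.size();
        }
    }
}
```

Once the method returns, the file can be deleted, renamed or saved by another program. With POI, the equivalent is making sure pkg.close() runs after parsing, for example in a finally block.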
I am trying to validate an Excel file using Java before dumping it to database.
Here is my code snippet which causes error.
try {
fis = new FileInputStream(file);
wb = new XSSFWorkbook(fis);
XSSFSheet sh = wb.getSheet("Sheet1");
for(int i = 0 ; i < 44 ; i++){
XSSFCell a1 = sh.getRow(1).getCell(i);
printXSSFCellType(a1);
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Here is the error I get
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.ArrayList.<init>(Unknown Source)
at java.util.ArrayList.<init>(Unknown Source)
at org.apache.xmlbeans.impl.values.NamespaceContext$NamespaceContextStack.<init>(NamespaceContext.java:78)
at org.apache.xmlbeans.impl.values.NamespaceContext$NamespaceContextStack.<init>(NamespaceContext.java:75)
at org.apache.xmlbeans.impl.values.NamespaceContext.getNamespaceContextStack(NamespaceContext.java:98)
at org.apache.xmlbeans.impl.values.NamespaceContext.push(NamespaceContext.java:106)
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1273)
at org.apache.xmlbeans.impl.values.XmlObjectBase.stringValue(XmlObjectBase.java:1484)
at org.apache.xmlbeans.impl.values.XmlObjectBase.getStringValue(XmlObjectBase.java:1492)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTCellImpl.getR(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFCell.<init>(XSSFCell.java:105)
at org.apache.poi.xssf.usermodel.XSSFRow.<init>(XSSFRow.java:70)
at org.apache.poi.xssf.usermodel.XSSFSheet.initRows(XSSFSheet.java:179)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:143)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:130)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:286)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:207)
at com.xls.validate.ExcelValidator.main(ExcelValidator.java:79)
This works perfectly fine when the .xlsx file is less than 1 MB.
I understand this is because my .xlsx file is around 5-10 MB and POI tries to load the entire sheet at once in JVM memory.
What can be a possible workaround?
There are two options available to you. Option #1 - increase the size of your JVM heap, so that Java has more memory available to it. Processing Excel files in POI using the UserModel code is DOM based, so the whole file (including its parsed form) needs to be buffered into memory. Try a question like this one for advice on how to increase the heap.
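For Option #1, it helps to confirm what heap limit the JVM actually received before and after changing the setting. A minimal sketch using only the JDK; HeapInfo and maxHeapMegabytes are hypothetical names:

```java
// Hypothetical utility: reports the heap limit of the running JVM.
class HeapInfo {

    // Returns the maximum heap the JVM will try to use, in whole megabytes.
    // This reflects -Xmx when it is set, otherwise a platform default.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as `java -Xmx2g HeapInfo` should report a value of roughly 2048 MB; the exact number can be slightly lower depending on the JVM.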
Option #2, which is more work - switch to event based (SAX) processing. This only processes part of the file at a time, so needs much much less memory. However, it requires more work from you, which is why you might be better throwing a few more GB of memory at the problem - memory is cheap while programmers aren't! The SpreadSheet howto page has instructions on how to do SAX parsing of .xlsx files, and there are various example files provided by POI you can look at for advice.
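To show what event-based processing looks like without the POI plumbing, here is a minimal sketch that uses only the JDK's SAX parser on sheet-like XML. It assumes the worksheet layout of row elements containing c (cell) elements; RowCounter and countRows are hypothetical names, and a real .xlsx additionally needs POI's XSSFReader to locate the sheet streams and the shared strings:

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts rows while the XML streams past.
class RowCounter extends DefaultHandler {
    private int rows;

    // Called once per opening tag; nothing is kept in memory except the counter.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("row".equals(qName)) {
            rows++;
        }
    }

    // Streams the XML through the handler and returns the number of row elements.
    static int countRows(InputStream sheetXml) throws Exception {
        RowCounter handler = new RowCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        return handler.rows;
    }
}
```

Because the handler only sees one element at a time, memory use stays flat no matter how many rows the sheet has, which is the trade-off described above: less memory, more code.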
Also, another thing - you seem to be loading a File via a stream, which is bad as it means even more stuff needs buffering into memory. See the POI Documentation for more on this, including instructions on how to work with the File directly.
You can use SXSSF workbook from POI for memory related issues. Refer here
I faced a similar issue while reading and merging multiple CSVs into a single XLSX file.
I had a total of 3 CSV sheets, each with 30k rows, totalling 90k.
It got resolved by using SXSSF as below:
public static void mergeCSVsToXLSX(Long jobExecutionId, Map<String, String> csvSheetNameAndFile, String xlsxFile) {
// Keep 100 rows in memory; rows beyond that are flushed to disk.
try (SXSSFWorkbook wb = new SXSSFWorkbook(100)) {
csvSheetNameAndFile.forEach((sheetName, csv) -> {
try (CSVReader reader = new CSVReader(new FileReader(csv))) {
wb.setCompressTempFiles(true);
SXSSFSheet sheet = wb.createSheet(sheetName);
sheet.setRandomAccessWindowSize(100);
String[] nextLine;
int r = 0;
while ((nextLine = reader.readNext()) != null) {
Row row = sheet.createRow(r++); // int index; a (short) cast would overflow past row 32767
for (int i = 0; i < nextLine.length; i++) {
Cell cell = row.createCell(i);
cell.setCellValue(nextLine[i]);
}
}
} catch (IOException ioException) {
logger.error("Error in reading CSV file {} for jobId {} with exception {}", csv, jobExecutionId,
ioException.getMessage());
}
});
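try (FileOutputStream out = new FileOutputStream(xlsxFile)) {
wb.write(out); // the stream is closed even if write() throws
}
wb.dispose(); // delete the temporary files backing the workbook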
} catch (IOException ioException) {
logger.error("Error in creating workbook for jobId {} with exception {}", jobExecutionId,
ioException.getMessage());
}
}
Use Event API (HSSF Only).
The event API is newer than the User API. It is intended for intermediate developers who are willing to learn a little bit of the low level API structures. It's relatively simple to use, but requires a basic understanding of the parts of an Excel file (or willingness to learn). The advantage provided is that you can read an XLS with a relatively small memory footprint.
Well, here's a link with some detailed info about your error, and how to fix it: http://javarevisited.blogspot.com/2011/09/javalangoutofmemoryerror-permgen-space.html?m=1.
Well, let me try to explain your error:
The java.lang.OutOfMemoryError has two variants. One in the Java Heap Space, and the other in PermGen Space.
Your error could be caused by a memory leak, a low amount of system RAM, or very little RAM allocated to the Java Virtual Machine.
The difference between the two variants is where the memory runs out. PermGen space holds class and method metadata and the pool of interned Strings, while the Java heap holds the objects your program creates. So if you have a lot of strings or classes in your project, and not enough allocated/system RAM, you will get an OutOfMemoryError. The default amount of RAM the JVM allocates to PermGen is 64 MB, which is quite small. The linked article explains much more about this error and provides detailed information about how to fix it.
Hope this helps!
To resolve the OutOfMemoryError, follow this.
You cannot modify existing cells in an SXSSFWorkbook, but you can create a new file with your modifications using SXSSFWorkbook.
This is possible by passing the workbook object along with the row access window size.
SXSSFWorkbook workbook = new SXSSFWorkbook(new XSSFWorkbook(new FileInputStream(file)), 100);
//Your changes in workbook
workbook.write(out);
I too faced the same OOM issue while parsing an xlsx file. After two days of struggle, I finally found the code below, which worked perfectly.
This code is based on sjxlsx. It reads the xlsx and stores it in an HSSF sheet.
[code=java]
// read the xlsx file
SimpleXLSXWorkbook workbook = new SimpleXLSXWorkbook(new File("C:/test.xlsx"));
HSSFWorkbook hsfWorkbook = new HSSFWorkbook();
org.apache.poi.ss.usermodel.Sheet hsfSheet = hsfWorkbook.createSheet();
Sheet sheetToRead = workbook.getSheet(0, false);
SheetRowReader reader = sheetToRead.newReader();
Cell[] row;
int rowPos = 0;
while ((row = reader.readRow()) != null) {
org.apache.poi.ss.usermodel.Row hfsRow = hsfSheet.createRow(rowPos);
int cellPos = 0;
for (Cell cell : row) {
if(cell != null){
org.apache.poi.ss.usermodel.Cell hfsCell = hfsRow.createCell(cellPos);
hfsCell.setCellType(org.apache.poi.ss.usermodel.Cell.CELL_TYPE_STRING);
hfsCell.setCellValue(cell.getValue());
}
cellPos++;
}
rowPos++;
}
return hsfSheet;[/code]
I'm trying to get the following code to run and am getting an IOException:
String cellText = null;
InputStream is = null;
try {
// Find /mydata/myworkbook.xlsx
is = new FileInputStream("/mydata/myworkbook.xlsx");
is.close();
System.out.println("Found the file!");
// Read it in as a workbook and then obtain the "widgets" sheet.
Workbook wb = new XSSFWorkbook(is);
Sheet sheet = wb.getSheet("widgets");
System.out.println("Obtained the widgets sheet!");
// Grab the 2nd row in the sheet (that contains the data we want).
Row row = sheet.getRow(1);
// Grab the 7th cell/col in the row (containing the Plot 500 English Description).
Cell cell = row.getCell(6);
cellText = cell.getStringCellValue();
System.out.println("Cell text is: " + cellText);
} catch(Throwable throwable) {
System.err.println(throwable.getMessage());
} finally {
if(is != null) {
try {
is.close();
} catch(IOException ioexc) {
ioexc.printStackTrace();
}
}
}
The output from running this in Eclipse is:
Found the file!
Stream Closed
java.io.IOException: Stream Closed
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:236)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at java.util.zip.ZipInputStream.readFully(ZipInputStream.java:414)
at java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:247)
at java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:91)
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:267)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:204)
at me.myorg.MyAppRunner.run(MyAppRunner.java:39)
at me.myorg.MyAppRunner.main(MyAppRunner.java:25)
The exception is coming from the line:
Workbook wb = new XSSFWorkbook(is);
According to the XSSFWorkbook Java Docs this is a valid constructor for an XSSFWorkbook object, and I don't see anything "jumping out" at me to indicate that I'm using my InputStream incorrectly. Can any POI gurus help spot where I'm going awry? Thanks in advance.
The problem is simple:
is = new FileInputStream("/mydata/myworkbook.xlsx");
is.close();
You are closing your input stream before passing it to the constructor, so it cannot be read.
Simply delete the is.close() here to fix the issue, as it will be cleaned up in the finally statement at the end.
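The failure is easy to reproduce with the JDK alone. A minimal sketch, assuming Java 7 or later; StreamOrder and its two methods are hypothetical names. The first method reads after an early close(), the second reads inside try-with-resources:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo of reading before versus after close().
class StreamOrder {

    // Mirrors the bug: the stream is closed before it is read.
    // Returns the exception message instead of a byte.
    static String readAfterClose(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        is.close();
        try {
            is.read();
            return "no error";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    // The fix: read first, and let try-with-resources close the stream afterwards.
    static int readThenClose(File file) throws IOException {
        try (InputStream is = new FileInputStream(file)) {
            return is.read();
        }
    }
}
```

The same ordering applies to the workbook: construct it from the stream first, and close the stream only after the workbook has been loaded.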
You are closing the stream with is.close();
and then using it. Don't close it until you have finished using it.
As the others have pointed out, you are closing your InputStream, which is breaking things.
However, you really shouldn't be using an InputStream in the first place! POI uses less memory when given the File object directly rather than going through an InputStream.
I'd suggest you have a read through the POI FAQ on File vs InputStream, then change your code to be:
OPCPackage pkg = OPCPackage.open(new File("/mydata/myworkbook.xlsx"));
Workbook wb = new XSSFWorkbook(pkg);