I'm using Apache POI to create records and save them into a workbook.
I have 5000+ new records to write and save into the workbook.
But when the workbook is written to the FileOutputStream, execution slows down so much that it practically halts.
Specifically, when this line executes:
workbook.write(fileOutputStream);
it almost stops while processing the 5000+ records. I measured that it takes nearly 1 hour (!) to write the workbook.
How can I improve the performance and overcome this drawback? Please suggest.
Note: The rest of the code is ordinary Apache POI code that runs fine without any issue, so I did not include all of it. I am only stuck at the line above.
I found one discussion here:
FileOutputStream (Apachhe POI) taking too long time to save
but it did not help me. I need to save the whole file.
Another solution, as I understand it: while iterating over the rows and creating cells, do NOT create a CellStyle or call sheet.autoSizeColumn(colNumber) inside the loop. Instead, do both only once outside the loop, and inside the loop only set the values and styles, i.e. cell.setCellValue and cell.setCellStyle.
Doing those two things on every iteration drastically degrades POI's performance.
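The "create once, reuse inside the loop" idea can be sketched without POI. This is only a toy model: StyleCache is a hypothetical helper (not part of Apache POI), and a plain String stands in for a real style object. With POI, the factory would presumably call workbook.createCellStyle() and set the data format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Apache POI:
// builds each distinct style once and hands out the same instance afterwards.
class StyleCache<S> {
    private final Map<String, S> styles = new HashMap<>();
    private final Function<String, S> factory;
    private int created = 0;

    StyleCache(Function<String, S> factory) {
        this.factory = factory;
    }

    // Returns the cached style for this format, creating it only on first use.
    S get(String format) {
        S style = styles.get(format);
        if (style == null) {
            style = factory.apply(format);
            styles.put(format, style);
            created++;
        }
        return style;
    }

    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        // A String stands in for a real style object in this sketch.
        StyleCache<String> cache = new StyleCache<>(f -> "style[" + f + "]");
        for (int row = 0; row < 5000; row++) {
            cache.get("#,##0.00");   // number column
            cache.get("yyyy-mm-dd"); // date column
        }
        System.out.println("styles created: " + cache.createdCount());
    }
}
```

However many rows the loop processes, only one style per distinct format is created. A call such as sheet.autoSizeColumn would likewise go once after the loop, not on every iteration.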
Let's have a concrete example we can talk about:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import java.util.GregorianCalendar;
class CreateExcel100000Rows {
    public static void main(String[] args) throws Exception {
        System.out.println("whole program starts " + java.time.LocalDateTime.now());
        try (
            //Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
            //Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
            Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
        ) {
            int rows = 100000;
            if (workbook instanceof HSSFWorkbook) rows = 65536;
            Object[][] data = new Object[rows][4];
            data[0] = new Object[] {"Value", "Date", "Formatted value", "Formula"};
            for (int i = 1; i < rows; i++) {
                data[i] = new Object[] {1.23456789*i, new GregorianCalendar(2000, 0, i), 1.23456789*i, "ROUND(A" + (i+1) + ",2)"};
            }
            DataFormat dataFormat = workbook.createDataFormat();
            CellStyle dateStyle = workbook.createCellStyle();
            dateStyle.setDataFormat(dataFormat.getFormat("DDDD, MMMM, DD, YYYY"));
            CellStyle numberStyle = workbook.createCellStyle();
            numberStyle.setDataFormat(dataFormat.getFormat("#,##0.00 \" Coins\""));
            Sheet sheet = workbook.createSheet();
            sheet.setColumnWidth(0, 12*256);
            sheet.setColumnWidth(1, 35*256);
            sheet.setColumnWidth(2, 17*256);
            sheet.setColumnWidth(3, 10*256);
            for (int r = 0; r < data.length; r++) {
                Row row = sheet.createRow(r);
                for (int c = 0; c < data[0].length; c++) {
                    Cell cell = row.createCell(c);
                    if (r == 0) cell.setCellValue((String)data[r][c]);
                    if (r > 0 && c == 0) {
                        cell.setCellValue((Double)data[r][c]);
                    } else if (r > 0 && c == 1) {
                        cell.setCellValue((GregorianCalendar)data[r][c]);
                        cell.setCellStyle(dateStyle);
                    } else if (r > 0 && c == 2) {
                        cell.setCellValue((Double)data[r][c]);
                        cell.setCellStyle(numberStyle);
                    } else if (r > 0 && c == 3) {
                        cell.setCellFormula((String)data[r][c]);
                    }
                }
            }
            System.out.println("write starts " + java.time.LocalDateTime.now());
            workbook.write(fileout);
            System.out.println("write ends " + java.time.LocalDateTime.now());
            if (workbook instanceof SXSSFWorkbook) ((SXSSFWorkbook)workbook).dispose();
        }
        System.out.println("whole program ends " + java.time.LocalDateTime.now());
    }
}
This code creates an HSSFWorkbook whose first sheet is filled from row 1 to row 65,536 with different kinds of cell values in columns A:D.
Using java -Xms256M -Xmx512M, that is, a heap space of 256 to 512 MByte, this takes 2 seconds in total. HSSFWorkbook.write takes less than a second.
If you do
...
try (
Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
) {
...
This code creates an XSSFWorkbook whose first sheet is filled from row 1 to row 100,000 with different kinds of cell values in columns A:D.
Using java -Xms256M -Xmx512M, that is, a heap space of 256 to 512 MByte, this takes 7 seconds in total. XSSFWorkbook.write takes 2 seconds. This can be improved by making more heap space available.
If you do
...
try (
//Workbook workbook = new XSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
Workbook workbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xlsx")
//Workbook workbook = new HSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("Excel.xls")
) {
...
This code creates an SXSSFWorkbook whose first sheet is filled from row 1 to row 100,000 with different kinds of cell values in columns A:D.
Using java -Xms256M -Xmx512M, that is, a heap space of 256 to 512 MByte, this takes 2 seconds in total. SXSSFWorkbook.write takes less than a second.
Note: When using SXSSFWorkbook, ((SXSSFWorkbook)workbook).dispose() is necessary to get rid of the temporary files it used.
If you are using merged cells, this answer might be helpful.
I once had 3000+ records and it took 10 minutes to generate the output xlsx.
After using a Java profiler, I found that
org.apache.poi.xssf.usermodel.XSSFSheet#getMergedRegion
took most of the time.
Based on my data set, I found that the time spent in this method grows as O(n^2) (n is the number of records), which explains why it works for small record sets (fewer than 1K) but takes a lot of time for large ones.
I checked the template and the output; they had a lot of merged cells generated by jx:each:
Excel headers
| A | B | C |
| headers |
`jx:each` cells
| a | b | <- merged
| a | b |
...
| footers |
So I unmerged the cells in the jx:each template, and it now takes less than 1 second.
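To get a feel for the quadratic growth described above, here is a toy cost model in plain Java. It is not POI code, and it rests on an assumption not confirmed by the answer: that fetching one merged region costs time proportional to the total number of merged regions (modeled here by copying the whole list on each lookup). MergedRegionCost and stepsToVisitAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy cost model, not POI code. Assumption: each lookup of a single
// merged region touches every stored region (an O(n) lookup).
class MergedRegionCost {

    // Counts elementary steps needed to look up all regions once.
    static long stepsToVisitAll(int regionCount) {
        List<int[]> regions = new ArrayList<>();
        for (int i = 0; i < regionCount; i++) {
            regions.add(new int[] {i, i, 0, 1}); // firstRow, lastRow, firstCol, lastCol
        }
        long steps = 0;
        for (int i = 0; i < regionCount; i++) {
            // Hypothetical O(n) lookup: copy the whole list, then pick one entry.
            List<int[]> copy = new ArrayList<>(regions);
            steps += copy.size();
            int[] region = copy.get(i);
            if (region[0] != i) throw new IllegalStateException("unexpected region");
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 3_000}) {
            System.out.println(n + " regions -> " + stepsToVisitAll(n) + " steps");
        }
    }
}
```

Under this model, tripling the number of merged regions multiplies the work by nine, which is consistent with a job that is fine below 1K records becoming very slow at 3000+.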
I am facing an issue when I write a huge set of data to an Excel file with multiple sheets. I am using Apache POI for the Excel export.
File file = new File("../path/file.xls");
FileOutputStream fout = new FileOutputStream(file);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int limit = 100000, offset = 0, count = 0, sheetIndex = 0;
XSSFWorkbook workbook = new XSSFWorkbook();
do {
    XSSFSheet sheet = null;
    if (file.exists() && sheetIndex > 0) {
        try {
            workbook = (XSSFWorkbook) WorkbookFactory.create(file);
        } catch (InvalidFormatException e) {
            e.printStackTrace();
        }
        sheet = workbook.createSheet("Sheet-" + sheetIndex);
    } else {
        workbook = new XSSFWorkbook();
        sheet = workbook.createSheet("Sheet-" + sheetIndex);
    }
    Row header = sheet.createRow(0);
    //...Header row creation...
    List<DataType> results = query(criteria, offset, limit);
    offset = offset + limit;
    count = results.size();
    sheetIndex++;
    int rowCount = 1;
    for (DataType rowData : results) {
        Row row = sheet.createRow(rowCount++);
        //row creation....
    }
    try {
        workbook.write(outputStream);
        outputStream.writeTo(fout);
    } finally {
        outputStream.flush();
    }
} while (count == limit);
workbook.write(outputStream);
outputStream.writeTo(fout);
outputStream.close();
fout.close();
In the loop I fetch 100k records from the DB and write them to the Excel file, and for each 100k I create a new sheet until there are no more records from the DB.
This code has 2 issues:
1. I have trouble opening the file: Excel alerts me that the file has problems when I try to open it, and only after I click OK does it load the data.
2. I can see only 1 sheet with 100k rows, though my DB contains 240M records. I can also see that the loop runs multiple times.
How can I get these issues resolved? I am really stuck!
Thanks in advance.
The XSSFWorkbook workbook is created multiple times, and each new one overwrites the one created in the previous iteration. The workbook needs to be created only once.
I suggest changing the loop entry to the following:
XSSFWorkbook workbook = new XSSFWorkbook();
do {
    XSSFSheet sheet = workbook.createSheet("Sheet-" + sheetIndex);
    Row header = sheet.createRow(0);
    //...Header row creation...
    // remaining code
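The overall shape of the corrected loop (one workbook, one sheet per page, a single write at the end) can be sketched in plain Java. This is a toy model under stated assumptions: a map stands in for the workbook, lists of integers stand in for sheets, and query is a hypothetical stand-in for the database call in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the paging loop; a Map stands in for the workbook.
class PagedSheets {

    // Hypothetical stand-in for the DB query: up to `limit` record ids from `offset`.
    static List<Integer> query(int totalRecords, int offset, int limit) {
        List<Integer> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, totalRecords); i++) {
            page.add(i);
        }
        return page;
    }

    // Creates the "workbook" once and adds one "sheet" per fetched page.
    static Map<String, List<Integer>> export(int totalRecords, int limit) {
        Map<String, List<Integer>> workbook = new LinkedHashMap<>(); // created only once
        int offset = 0;
        int count;
        int sheetIndex = 0;
        do {
            List<Integer> results = query(totalRecords, offset, limit);
            count = results.size();
            offset += limit;
            workbook.put("Sheet-" + sheetIndex, results);
            sheetIndex++;
        } while (count == limit);
        // Real code would write the workbook to the output stream exactly once, here.
        return workbook;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> workbook = export(250, 100);
        for (Map.Entry<String, List<Integer>> sheet : workbook.entrySet()) {
            System.out.println(sheet.getKey() + ": " + sheet.getValue().size() + " rows");
        }
    }
}
```

Note that with the `count == limit` condition from the question, a record count that is an exact multiple of the limit produces one trailing empty sheet; checking for an empty page before creating the sheet would avoid that.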
I changed the workbook type to SXSSFWorkbook and set the flush limit to 100, and it worked.
Performance is about 5 times better than with XSSFWorkbook.
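Assuming the "flush limit" above refers to SXSSF's row access window size, the following toy model illustrates why a small window keeps memory use flat. It is not POI's SXSSFSheet: RowWindow is a hypothetical class, and "flushing" here simply drops the row from memory instead of writing it to a temporary file.

```java
import java.util.TreeMap;

// Toy model of a streaming sheet's row window; not POI's SXSSFSheet.
class RowWindow {
    private final int windowSize;
    private final TreeMap<Integer, String> inMemory = new TreeMap<>();
    private int lastFlushedRow = -1;
    private int flushedCount = 0;

    RowWindow(int windowSize) {
        this.windowSize = windowSize;
    }

    void createRow(int rowNum, String value) {
        // A row at or below the last flushed index can no longer be touched.
        if (rowNum <= lastFlushedRow) {
            throw new IllegalArgumentException(
                "row " + rowNum + " is in the range [0," + lastFlushedRow + "] that is already flushed");
        }
        inMemory.put(rowNum, value);
        // When the window is exceeded, the lowest-numbered rows leave memory.
        while (inMemory.size() > windowSize) {
            lastFlushedRow = inMemory.pollFirstEntry().getKey();
            flushedCount++;
        }
    }

    int rowsInMemory() {
        return inMemory.size();
    }

    int rowsFlushed() {
        return flushedCount;
    }

    public static void main(String[] args) {
        RowWindow sheet = new RowWindow(100);
        for (int r = 0; r < 100_000; r++) {
            sheet.createRow(r, "row " + r);
        }
        System.out.println("in memory: " + sheet.rowsInMemory() + ", flushed: " + sheet.rowsFlushed());
    }
}
```

The trade-off is that flushed rows can no longer be read or modified, so this approach fits write-once, top-to-bottom exports.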
I am writing a small utility that creates a pivot table in an Excel sheet using POI, and I want to read the data from the pivot table back into the program, which will save it as a PDF file using iText. I am running into a problem where the program cannot read the data from the pivot table after it is created. The program can only "see" the information in the pivot table after I manually open the created file and hit the save button in Excel. Does anyone know a way to read the data from the pivot table through the XSSFPivotTable object, or otherwise force the file to "save" so that the program can access it again?
Here is a snippet of code so you can see what I'm talking about. I'm a student so any advice on best practices would be greatly appreciated as well.
public void returnPivotData() throws IOException {
    FileInputStream fs = new FileInputStream(this.xlsxFile);
    XSSFWorkbook book = new XSSFWorkbook(fs);
    XSSFSheet dataSheet = book.getSheet("Sheet1");
    AreaReference dataRef = new AreaReference("A1:E15",
            SpreadsheetVersion.EXCEL2007);
    XSSFPivotTable table = dataSheet.createPivotTable(dataRef,
            new CellReference("A16"));
    table.addRowLabel(0);
    table.addColumnLabel(DataConsolidateFunction.SUM, 3);
    // Save the data back to the file
    FileOutputStream fsOut = new FileOutputStream(
            "D:\\workspace\\test.xlsx");
    book.write(fsOut);
    fsOut.close();
    book.close();
    fs.close();
    // This does not allow access
    FileInputStream fsIn = new FileInputStream("D:\\workspace\\test.xlsx");
    XSSFWorkbook bookNew = new XSSFWorkbook(fsIn);
    XSSFSheet sheet = bookNew.getSheet("Sheet1");
    for (int i = 0; i < 20; i++) {
        XSSFRow rowNew = sheet.getRow(i);
        XSSFCell cellNew = rowNew.getCell(0,
                MissingCellPolicy.CREATE_NULL_AS_BLANK);
        System.out.println(cellNew.toString());
    }
    fsIn.close();
    bookNew.close();
}
I am trying to update an existing .xlsm file using Apache POI. Every time I run my code I receive the error shown below.
Exception in thread "main" java.lang.IllegalArgumentException: Attempting to write a row[1] in the range [0,9] that is already written to disk.
at org.apache.poi.xssf.streaming.SXSSFSheet.createRow(SXSSFSheet.java:136)
at com.log.test.Test.main(Test.java:41)
Basically, I want to use a macro-enabled Excel file as a standard template: using Java code, I want to make a copy of the template, update the data in some columns of a sheet, and save the file.
I am trying the sample code below:
OPCPackage pkg = OPCPackage.open(new File("C:/LogTest/testme.xlsm"));
XSSFWorkbook wb_template;
wb_template = new XSSFWorkbook(pkg);
System.out.println("package loaded");
SXSSFWorkbook wb = new SXSSFWorkbook(wb_template);
wb.setCompressTempFiles(true);
SXSSFSheet sh = (SXSSFSheet) wb.getSheet("Asset Names");
sh.setRandomAccessWindowSize(100);
for (int rownum = 1; rownum < 10; rownum++) {
    Row row = sh.createRow(rownum);
    for (int cellnum = 0; cellnum < 2; cellnum++) {
        Cell cell = row.createCell(cellnum);
        String address = new CellReference(cell).formatAsString();
        cell.setCellValue("hello");
    }
}
FileOutputStream out = new FileOutputStream(new File("C:/output/new.xlsm"));
wb.write(out);
out.close();
wb.dispose();
System.out.println("Done !!!");
Can this be achieved using Apache POI, or do I need to use some other library?
I want to read and write large Excel files. Therefore, I used SXSSFWorkbook to write the Excel file, and XSSF with the SAX event API to read the files.
However, when an Excel file written with SXSSFWorkbook is read, the cell content is empty. If I open the written Excel file and save it again, the content is shown correctly.
The following is the code I used to write the Excel file.
SXSSFWorkbook wb = new SXSSFWorkbook();
wb.setCompressTempFiles(true);
SXSSFSheet sh = (SXSSFSheet) wb.createSheet();
// sh.setRandomAccessWindowSize(100);// keep 100 rows in memory,
// exceeding rows will be flushed to disk
for (int rownum = 0; rownum < 100; rownum++) {
    Row row = sh.createRow(rownum);
    for (int cellnum = 0; cellnum < 10; cellnum++) {
        Cell cell = row.createCell(cellnum);
        String address = new CellReference(cell).formatAsString();
        cell.setCellValue(address);
    }
}
FileOutputStream out = new FileOutputStream("D:\\tempsxssf.xlsx");
wb.write(out);
out.flush();
out.close();
wb.dispose();
I am in big trouble. Can someone help me figure out the issue?
I used another constructor, following the POI documentation:
SXSSFWorkbook(workbook, rowAccessWindowSize, compressTmpFiles, useSharedStringsTable)
Like this:
SXSSFWorkbook workbook = new SXSSFWorkbook(null, 1000, true, true);
The last argument enables the shared strings table.
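For readers unfamiliar with the term, a shared strings table stores each distinct text once, and cells refer to it by index. The sketch below is a toy model in plain Java, not POI's implementation; SharedStrings and its methods are hypothetical names. Presumably the reader in the question only resolved such indexed strings, which would explain why cells written without the table appeared empty.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a shared strings table; not POI's implementation.
class SharedStrings {
    private final List<String> table = new ArrayList<>();
    private final Map<String, Integer> index = new HashMap<>();

    // Returns the index a cell would store instead of the text itself.
    int add(String text) {
        Integer existing = index.get(text);
        if (existing != null) {
            return existing;
        }
        table.add(text);
        int position = table.size() - 1;
        index.put(text, position);
        return position;
    }

    // Resolves a cell's stored index back to its text.
    String get(int position) {
        return table.get(position);
    }

    int uniqueCount() {
        return table.size();
    }

    public static void main(String[] args) {
        SharedStrings strings = new SharedStrings();
        int[] cells = new int[6];
        String[] values = {"hello", "world", "hello", "hello", "world", "A1"};
        for (int i = 0; i < values.length; i++) {
            cells[i] = strings.add(values[i]);
        }
        System.out.println("cells: " + cells.length + ", unique strings: " + strings.uniqueCount());
        System.out.println("cell 2 resolves to: " + strings.get(cells[2]));
    }
}
```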