How to retrieve some specific rows and columns from an excel sheet? - java

I am reading an xlsx file in Java using Apache POI.
I have created a Document class whose fields correspond to the Excel column headings.
I have to read each row of the sheet, map it to a Document object, and collect those objects in a list.
The problem is that I have to start reading from row 2, take only columns 7 to 35, and map the corresponding values to the Document class.
I cannot figure out exactly what the code should look like.
I have written the following lines of code:
List sheetData = new ArrayList();
InputStream excelFile = new BufferedInputStream(new FileInputStream("D:\\Excel file\\data.xlsx"));
Workbook workBook = new XSSFWorkbook(excelFile); // Creates Workbook
XSSFSheet sheet = (XSSFSheet) workBook.getSheet("Daily");
DataFormatter formatter = new DataFormatter();
for (int i = 7; i <= 35; i++) {
XSSFRow row = sheet.getRow(i);
Cell cell = row.getCell(i);
String val = formatter.formatCellValue(cell);
sheetData.add(val);
}

Assuming I've understood your question correctly, I believe you want to process every row which exists from row 2 onwards to the end of the file, and for each of those rows consider the cells in columns 7 through 35. I believe you also might need to process those values, but you haven't said how, so for this example I'll just stuff them in a list of strings and hope for the best...
This is based on the Apache POI documentation for iterating over rows and cells:
File excelFile = new File("D:\\Excel file\\data.xlsx");
Workbook workBook = WorkbookFactory.create(excelFile);
Sheet sheet = workBook.getSheet("Daily");
DataFormatter formatter = new DataFormatter();
// Start from the 2nd row, processing all to the end
// Note - Rows and Columns in Apache POI are 0-based not 1-based
for (int rn=1; rn<=sheet.getLastRowNum(); rn++) {
Row row = sheet.getRow(rn);
if (row == null) {
// Whole row is empty. Handle as required here
continue;
}
List<String> values = new ArrayList<String>();
for (int cn=6; cn<35; cn++) {
Cell cell = row.getCell(cn);
String val = null;
if (cell != null) { val = formatter.formatCellValue(cell); }
if (val == null || val.isEmpty()) {
// Cell is empty. Handle as required here
}
// Save the value to list. Save to an object instead if required
values.add(val);
}
}
workBook.close();
Depending on your business requirements, put in logic for handling blank rows and cells. Then, do whatever you need to do with the values you find, again as per your business requirements!
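As a rough sketch of the "save to an object instead" step, the per-row list of strings could be mapped to a Document like this. The Document fields (id, title) and their positions are hypothetical, because the question does not list the real column headings, and the POI reading is left out so that only the mapping logic is shown:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the asker's Document class; the real one
// would have one field per Excel column heading.
class Document {
    final String id;
    final String title;

    Document(String id, String title) {
        this.id = id;
        this.title = title;
    }
}

class DocumentMapper {
    // Converts a 1-based Excel row or column number to a 0-based POI index,
    // e.g. Excel column 7 is POI column index 6.
    static int toPoiIndex(int excelNumber) {
        return excelNumber - 1;
    }

    // values holds the formatted strings for columns 7..35 of one row,
    // so values.get(0) is column 7. Which column feeds which field is an
    // assumption made for this example.
    static Document toDocument(List<String> values) {
        return new Document(clean(values.get(0)), clean(values.get(1)));
    }

    static List<Document> toDocuments(List<List<String>> rows) {
        List<Document> documents = new ArrayList<>();
        for (List<String> values : rows) {
            documents.add(toDocument(values));
        }
        return documents;
    }

    // Blank cells may arrive as null or ""; both are normalised to "".
    private static String clean(String value) {
        return value == null ? "" : value.trim();
    }
}
```

In the loop above you would collect each row's values list into a List<List<String>> (or call toDocument directly at the end of the inner loop) rather than discarding it.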

You could iterate over the sheet with an Iterator, but you can also address a single cell directly with getRow() and getCell():
XSSFWorkbook workbook = new XSSFWorkbook(excelFile);
// use the first sheet of the workbook
XSSFSheet data = workbook.getSheetAt(0);
// iterating over a sheet yields its rows
Iterator<Row> iterator = data.iterator();
// or address a cell directly: y is the 0-based row index, x the 0-based column index
Row row = data.getRow(y);
Cell pointingCell = row.getCell(x);
// note: row and cell can be null if empty, and getStringCellValue() throws for numeric cells
String pointingString = pointingCell.getStringCellValue();

Related

Cannot invoke "org.apache.poi.ss.usermodel.Cell.setCellValue(String)" because "cell" is null for existing XLSM file

I use Apache POI to write data to a predefined XLSM file. I use this code to open the existing file:
Cell cell;
File file = new File(XLSMPath);
FileInputStream inputStream = new FileInputStream(file);
XSSFWorkbook workbook = XSSFWorkbookFactory.createWorkbook(inputStream);
XSSFSheet sheet = workbook.getSheetAt(0);
XSSFRow row = sheet.getRow(recordcount+4);
Data is written in first iteration in the 5th row and so on. Code to set value of a given cell:
cell = row.getCell(CellReference.convertColStringToIndex("A"));
cell.setCellValue(myvalue);
It worked fine for the first 400 iterations, but after that I get the following error message:
Cannot invoke "org.apache.poi.ss.usermodel.Cell.setCellValue(String)"
because "cell" is null
You need to create the worksheet cells yourself: getCell returns null for any cell that has never been written in the file. Just check whether your Cell is null and, if so, create a new Cell using createCell(int):
cell = row.getCell(CellReference.convertColStringToIndex("A"));
if (cell == null) {
// maybe in your case index should be taken in other way
cell = row.createCell(CellReference.convertColStringToIndex("A"));
}
cell.setCellValue(myvalue);
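For reference, the letters-to-index conversion that CellReference.convertColStringToIndex performs can be sketched in plain Java as below. This is an illustrative re-implementation under my own names (ColumnIndex, fromLetters), not POI's code, and it only handles plain letters:

```java
class ColumnIndex {
    // Converts Excel column letters ("A", "Z", "AA", ...) to a 0-based index
    // by reading the letters as a base-26 number with digits 1..26.
    static int fromLetters(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("Empty column name");
        }
        int oneBased = 0;
        for (char c : letters.toUpperCase().toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not a column name: " + letters);
            }
            oneBased = oneBased * 26 + (c - 'A' + 1);
        }
        // POI indices are 0-based, so "A" maps to 0.
        return oneBased - 1;
    }
}
```

So column "A" is index 0, and the row index recordcount+4 in the question addresses the 5th row when recordcount is 0.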

Sort excel by a column using shiftRows- Apache POI - XmlValueDisconnectedException

I have an XSSFWorkbook with n columns, and my requirement is to sort the entire sheet by the first column.
I referred to this link but did not get any information about sorting.
I have also tried the code from here but it gives exception at
sheet.shiftRows(row2.getRowNum(), row2.getRowNum(), -1);
I am using Apache POI 3.17.
Does anyone have a suggestion or solution?
There seems to be a bug in POI when shifting rows; it was reportedly fixed in 3.9, but I used 3.17 and still hit it:
Exception in thread "main" org.apache.xmlbeans.impl.values.XmlValueDisconnectedException
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_orphaned(XmlObjectBase.java:1258)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTRowImpl.getR(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFRow.getRowNum(XSSFRow.java:394)
...
I assume that is the same exception you are getting. So I worked out another way:
Sort the rows, then create a new workbook and copy the rows into it in the correct order.
Then write this sorted workbook to the original file.
For simplicity, I assume all cell values are Strings (if not, modify the code accordingly).
private static final String FILE_NAME = "/home/userName/Workspace/fileToSort.xlsx";
public static void main(String[] args) {
Workbook originalWorkbook;
//create a workbook from your file
try(FileInputStream excelFile = new FileInputStream(new File(FILE_NAME))) {
originalWorkbook = new XSSFWorkbook(excelFile);
} catch (IOException e) {
throw new RuntimeException("Couldn't open file: " + FILE_NAME, e);
}
Sheet originalSheet = originalWorkbook.getSheetAt(0);
// Create a sorted map from the value of the first column to its row
// A TreeMap keeps its keys in order, so this automatically sorts the rows
// (rows sharing the same first-column value would overwrite each other)
Map<String, Row> sortedRowsMap = new TreeMap<>();
// save headerRow
Row headerRow = originalSheet.getRow(0);
Iterator<Row> rowIterator = originalSheet.rowIterator();
// skip header row as we saved it already
rowIterator.next();
// sort the remaining rows
while(rowIterator.hasNext()) {
Row row = rowIterator.next();
sortedRowsMap.put(row.getCell(0).getStringCellValue(), row);
}
// Create a new workbook
try(Workbook sortedWorkbook = new XSSFWorkbook();
FileOutputStream out = new FileOutputStream(FILE_NAME)) {
Sheet sortedSheet = sortedWorkbook.createSheet(originalSheet.getSheetName());
// Copy all the sorted rows to the new workbook
// - header first
Row newRow = sortedSheet.createRow(0);
copyRowToRow(headerRow, newRow);
// then other rows, from row 1 up (not row 0)
int rowIndex = 1;
for(Row row : sortedRowsMap.values()) {
newRow = sortedSheet.createRow(rowIndex);
copyRowToRow(row, newRow);
rowIndex++;
}
// Write your new workbook to your file
sortedWorkbook.write(out);
} catch (Exception e) {
e.printStackTrace();
}
}
// Utility method to copy rows
private static void copyRowToRow(Row row, Row newRow) {
Iterator<Cell> cellIterator = row.cellIterator();
while(cellIterator.hasNext()) {
Cell cell = cellIterator.next();
// keep each cell in its original column, so gaps in sparse rows are preserved
Cell newCell = newRow.createCell(cell.getColumnIndex());
newCell.setCellValue(cell.getStringCellValue());
}
}
I tried it out on the following file
A B
---------------
Header1 Header2
a one
c three
d four
b two
and it sorts it this way:
A B
---------------
Header1 Header2
a one
b two
c three
d four
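One caveat with the TreeMap approach above: if two rows have the same value in the first column, the second put replaces the first and a row is lost. A minimal sketch of a duplicate-safe alternative is to sort a list with a comparator instead. Here each row is modelled as a plain List<String> rather than a POI Row, and RowSorter is a made-up name:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RowSorter {
    // rows.get(0) is the header and stays in place; all other rows are
    // sorted by their first column. List.sort is stable, so rows with
    // equal keys keep their original relative order and none are dropped.
    static List<List<String>> sortByFirstColumn(List<List<String>> rows) {
        List<List<String>> sorted = new ArrayList<>(rows);
        if (sorted.size() > 1) {
            sorted.subList(1, sorted.size())
                  .sort(Comparator.comparing((List<String> row) -> row.get(0)));
        }
        return sorted;
    }
}
```

With POI rows the same idea would apply by collecting the rows into a List<Row> and comparing on row.getCell(0).getStringCellValue(), under the same all-strings assumption as above.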

Excel sheet merged cell reading using apache poi in java

I am reading an Excel sheet using Apache POI in Java, and I am using CellRangeAddress to get the merged regions.
Case 1: If I merge 2-3 data cells and then go on to the next cell, it works:
I get the next merged region.
Case 2: If I merge more than 6 values and then go on to the next region, it throws an IndexOutOfBoundsException for the merged region.
Here is the code:
List<OrganizationDB> orgList = new ArrayList<OrganizationDB>();
List<EmployeeDB> empList;
XSSFWorkbook workBook;
XSSFSheet excelSheet;
XSSFRow row;
XSSFCell cells;
TreeViewer treeViewer = null;
File excelFile = new File("D:\\ExcelExport\\ExcelSheet2.xls");
FileInputStream fis;
if (excelFile.exists()) {
fis = new FileInputStream(excelFile);
workBook = new XSSFWorkbook(fis);
excelSheet = workBook.getSheetAt(0);
int count = 1;
while (count <= excelSheet.getLastRowNum()) {
CellRangeAddress region = excelSheet.getMergedRegion(count);
row = excelSheet.getRow(count);
//XSSFCell cell = row.getCell(0);
orgDb = new OrganizationDB();
orgDb.setOrganizationName(row.getCell(0).getStringCellValue());
orgDb.setCityName(row.getCell(4).getStringCellValue());
orgDb.setStateName(row.getCell(5).getStringCellValue());
empList = new ArrayList<EmployeeDB>();
while(count<=region.getLastRow()) {
row = excelSheet.getRow(count);
empDb = new EmployeeDB();
empDb.setCompanyName(row.getCell(0).getStringCellValue());
empDb.setEmpID(row.getCell(1).getStringCellValue());
empDb.setEmpName(row.getCell(2).getStringCellValue());
empDb.setPhoneNo((int) row.getCell(3).getNumericCellValue());
empList.add(empDb);
orgDb.setEmpList(empList);
count++;
}
orgList.add(orgDb);
}
There is some logic in your code that I do not fully understand. Could you check it, please?
You have one counter, count, shared by two nested while loops:
while (count <= excelSheet.getLastRowNum()) {
CellRangeAddress region = excelSheet.getMergedRegion(count);
...
while(count<=region.getLastRow()){
...
count++;
Suppose there are two merged regions, covering rows 1 to 3 and rows 4 to 6. After the first pass of the outer while, count is 4, because the nested while has already incremented it past the last row of the first region.
The code then calls getMergedRegion(4), but count is a row number, not a merged-region index,
and there is no merged region with index 4, hence the exception.
What you want at that point is the next merged region, the one holding the next set of rows.
I guess you have to use separate counters for the merged regions and for the rows inside them.
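The two-counter idea could look roughly like this. It is a plain-Java sketch in which each merged region is modelled as an int[] {firstRow, lastRow} instead of a CellRangeAddress, and MergedRegionWalker is a made-up name. With POI, the outer bound would come from sheet.getNumMergedRegions(), and the sketch assumes the regions are stored in top-to-bottom order, which may not hold for every file:

```java
import java.util.ArrayList;
import java.util.List;

class MergedRegionWalker {
    // Returns, for each region, the row numbers that belong to it.
    static List<List<Integer>> rowsPerRegion(List<int[]> regions) {
        List<List<Integer>> result = new ArrayList<>();
        // regionIndex counts merged regions only.
        for (int regionIndex = 0; regionIndex < regions.size(); regionIndex++) {
            int[] region = regions.get(regionIndex);
            List<Integer> rows = new ArrayList<>();
            // rowNum counts sheet rows only, within the current region.
            for (int rowNum = region[0]; rowNum <= region[1]; rowNum++) {
                rows.add(rowNum);
            }
            result.add(rows);
        }
        return result;
    }
}
```

In the original code, the organisation would be built once per regionIndex and an employee once per rowNum.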

How to get rid of "Want to save your changes to 'Test.xlsx'" while inserting formula cells into Excel using POI and opening /closing from file system

I am writing numeric cells to a new excel file 'Test.xlsx' using POI.
I first insert 2 rows with 2 columns each with values:
Row 1 --> 10 (Cell A1), 20
Row 2 --> 15, 20 (Cell B2)
For row 3, column 1, I set the formula SUM(A1:B2).
XSSFWorkbook wb = null;
String absPath = "C:\\excel\\Test.xlsx";
File f2 = createFileIfDirExist(absPath); //method will return handle if file created successfully
if (f2 != null) {
wb = new XSSFWorkbook();
XSSFSheet sheet = wb.createSheet("MySheet");
if (sheet != null) {
XSSFRow row0 = sheet.createRow(0);
XSSFCell cell0 = row0.createCell(0);
cell0.setCellValue(10);
XSSFCell cell1 = row0.createCell(1);
cell1.setCellValue(20);
XSSFRow row1 = sheet.createRow(1);
XSSFCell cell2 = row1.createCell(0);
cell2.setCellValue(15);
XSSFCell cell3 = row1.createCell(1);
cell3.setCellValue(20);
XSSFRow row2 = sheet.createRow(2);
XSSFCell cell4 = row2.createCell(0);
cell4.setCellType(Cell.CELL_TYPE_FORMULA);
cell4.setCellFormula("SUM(A1:B2)");
FileOutputStream outFile = new FileOutputStream(f2); // f2 is already a File
wb.write(outFile);
outFile.close();
// call method to evaluate formulaEval.evaluateFormulaCell(cell) on every cell in sheet.
}
}
In my actual implementation, I create these sheets, rows and cells dynamically from user input using loops, which I have left out here. This is a basic version of what my code does.
After I run this program, the file is created. I open the file and see that cell A3 (the formula cell) has the value 65. However, when I close the file, I get the prompt 'Want to save your changes to Test.xlsx?'
I unzipped Test.xlsx before and after saving via that prompt and saw that the formula cell has the same values for the F and V tags in MySheet.xml in both cases.
I also noticed that calcChain.xml is missing before I save the Excel file manually and appears once I have saved it.
Am I missing something needed to get rid of the prompt when I close Excel? Any pointers will be greatly appreciated.

How to initialize cells when creating a new Excel file (Apache POI)

I am currently using Apache POI to create empty Excel files (that I will later read and edit). The problem is that whenever I try to get a cell, I always get null. I have tried initializing the first few columns and rows (those cells are no longer null), but with this approach I cannot insert new rows and columns. How can I initialize all cells of a sheet without having to fix the number of rows and columns? Thanks
EDIT: Hi, this is my code for creating the Excel files. I could not use the iterator to initialize all my cells, since the spreadsheet has no rows and columns yet.
FileOutputStream fileOut = new FileOutputStream(loc + formID +".xls");
HSSFWorkbook workbook = new HSSFWorkbook();
Sheet sheet = workbook.createSheet("Sheet1");
for (Row row : sheet) {
for (Cell cell : row) {
CellStyle style = workbook.createCellStyle();
style.setFillBackgroundColor(IndexedColors.BLACK.getIndex());
cell.setCellValue("");
cell.setCellStyle(style);
}
}
workbook.write(fileOut);
fileOut.flush();
fileOut.close();
You might find that MissingCellPolicy helps here. When calling getCell on a row, you can specify a MissingCellPolicy that says what should happen if no cell is there, e.g.
// getRow can still return null for a row that does not exist yet, so create it first
Row r = sheet.getRow(2);
if (r == null) { r = sheet.createRow(2); }
Cell c = r.getCell(5, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
c.setCellValue("This will always work, c will never be null");
org.apache.poi.ss.util.CellUtil can help too:
// Get the row, creating it if needed
Row r = CellUtil.getRow(4, sheet);
// Get the cell, creating it if needed
Cell c = CellUtil.getCell(2, r);
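To see why there is nothing to "initialize" up front, it may help to picture the sheet as a sparse structure in which rows and cells only exist once something creates them. The following is a plain-Java model of that idea, not POI code; SparseSheet and its methods are made-up names:

```java
import java.util.HashMap;
import java.util.Map;

class SparseSheet {
    // row index -> (column index -> value); nothing exists until created.
    private final Map<Integer, Map<Integer, String>> rows = new HashMap<>();

    // Plain lookup: returns null when the row or the cell is missing.
    String getCell(int rowIndex, int colIndex) {
        Map<Integer, String> row = rows.get(rowIndex);
        return row == null ? null : row.get(colIndex);
    }

    // Create-if-missing lookup: never returns null, a missing cell
    // is created as a blank value first.
    String getOrCreateCell(int rowIndex, int colIndex) {
        return rows.computeIfAbsent(rowIndex, r -> new HashMap<>())
                   .computeIfAbsent(colIndex, c -> "");
    }

    void setCell(int rowIndex, int colIndex, String value) {
        rows.computeIfAbsent(rowIndex, r -> new HashMap<>()).put(colIndex, value);
    }
}
```

Because cells are created on demand, there is no need to decide the number of rows and columns in advance; any position can be written to later.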
