Exception when reading empty cells from Excel file - java

When trying to read an Excel sheet I get an exception if some cell is empty:
Cell[] rowCells = sheet.getRow(1);
or
Cell cell = sheet.getCell(0,1);
I always get the same message:
java.lang.ArrayIndexOutOfBoundsException: 1
at jxl.read.biff.SheetImpl.getCell(SheetImpl.java:356)
at gui.ReadExcel.read(ReadExcel.java:45)
at gui.GUIcontroller.chooseSaveFile(GUIcontroller.java:101)
What is the problem? How can I know if the cell is empty, so I won't copy its value?

You can use the getRows() and getColumns() methods to check the bounds of the sheet. The ArrayIndexOutOfBoundsException occurs because you are trying to access a position beyond the farthest non-empty cell, which is outside the range the sheet stores.
int rows = sheet.getRows();
int columns = sheet.getColumns();
int row = 1;
int col = 0;
if (row < rows) {
    Cell[] rowCells = sheet.getRow(row); // Won't throw an exception
}
if (row < rows && col < columns) {
    Cell cell = sheet.getCell(col, row); // jxl expects (column, row)
}
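The bounds check described above can be sketched without jxl at all. The following is only an illustration: GridBounds and cellAt are hypothetical names, and a plain String[][] stands in for the sheet, so none of this is jxl's API. It shows the idea of testing the row and column index against the stored size before touching the data, and returning an empty result instead of throwing.

```java
import java.util.Optional;

// Hypothetical stand-in for a sheet: an array of rows, where only stored rows exist.
final class GridBounds {
    // Returns the value at (column, row), or empty if that position lies outside the stored data.
    static Optional<String> cellAt(String[][] grid, int column, int row) {
        if (row < 0 || row >= grid.length) {
            return Optional.empty();
        }
        String[] cells = grid[row];
        if (cells == null || column < 0 || column >= cells.length) {
            return Optional.empty();
        }
        return Optional.ofNullable(cells[column]);
    }

    public static void main(String[] args) {
        String[][] grid = { { "name", "age" } }; // only one row is stored
        System.out.println(cellAt(grid, 0, 0).orElse("<empty>"));
        System.out.println(cellAt(grid, 0, 1).orElse("<empty>")); // row 1 does not exist
    }
}
```

Checking first keeps the "no such cell" case on the normal control path, so a caller can skip the copy without relying on an exception.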

In this case you can't read the cell because, as far as jxl is concerned, it doesn't exist in the spreadsheet. It has not been created yet, so there is no cell to get. That may sound odd, because Excel sheets appear to go on forever, but the file doesn't store data for all of those empty cells; if it did, the file would be huge. So when jxl goes to read the data, it simply tells you there is nothing there.
If you want to read the cells and all your cells are grouped together, then you could try:
int width = sheet.getColumns();
int height = sheet.getRows();
List<Cell> cells = new ArrayList<Cell>();
for (int i = 0; i < width; i++) {
    for (int j = 0; j < height; j++) {
        cells.add(sheet.getCell(i, j));
    }
}
If they're not grouped together and you're not sure which cells may be empty, there is still a fairly simple solution:
List<Cell> cells = new ArrayList<Cell>();
Cell cell = null;
try {
    cell = sheet.getCell(0, 1);
} catch (Exception e) {
    e.printStackTrace();
} finally {
    if (cell != null) {
        cells.add(cell);
    }
}
This way you can safely attempt to read a cell and throw it away if it doesn't contain anything.
I hope this is what you were looking for.

Related

TreeMap is missing a key from the Object Rows

I am using the Apache POI library to read values from an Excel sheet into a Java program.
I iterate through each row of a table to get the values I need.
Within the object Row, there is a TreeMap that contains XSSFCell objects as values.
Normally I get the following TreeMap:
Where key 4 is included. The value often is an empty string as chosen in this picture.
For some reason, for some objects I get the following TreeMap:
Where the key 4 is missing.
Both Row Objects belong to the same table.
This is how I use my object Row:
XSSFSheet mySheet = myWorkBook.getSheet("nameOfSheet");
Iterator<Row> rowIterator = mySheet.iterator();
while (rowIterator.hasNext()) {
Row row = rowIterator.next();
// here I call my method
}
You can prevent this from causing inconsistency in your application by calling getCell() with both an index and a MissingCellPolicy, probably Row.RETURN_BLANK_AS_NULL.
The Apache POI guide explains:
In some cases, when iterating, you need full control over how missing
or blank rows and cells are treated, and you need to ensure you visit
every cell and not just those defined in the file. (The CellIterator
will only return the cells defined in the file, which is largely those
with values or stylings, but it depends on Excel).
In cases such as these, you should fetch the first and last column
information for a row, then call getCell(int, MissingCellPolicy) to
fetch the cell. Use a MissingCellPolicy to control how blank or null
cells are handled.
// Decide which rows to process
int rowStart = Math.min(15, sheet.getFirstRowNum());
int rowEnd = Math.max(1400, sheet.getLastRowNum());
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
Row r = sheet.getRow(rowNum);
if (r == null) {
// This whole row is empty
// Handle it as needed
continue;
}
int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
for (int cn = 0; cn < lastColumn; cn++) {
Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
if (c == null) {
// The spreadsheet is empty in this cell
} else {
// Do something useful with the cell's contents
}
}
}

Removing several blank lines in XLS using Apache POI HSSF with an incrementing loop

I need to remove several lines of an Excel XLS sheet.
These lines always contain the same first cell, which is why I check the first cell of every row to find them:
HSSFCell myCell = myRow.getCell(0);
myCell.setCellType(Cell.CELL_TYPE_STRING);
String foundString = myCell.getStringCellValue();
if (foundString.equals(searchString)) {
    foundRows.add(rowCount);
}
rowCount++;
I then go on and "remove" those rows using removeRow which nulls all values
public static void removeRows() {
List<Integer> foundRowsToDelete = new ArrayList<Integer>();
//Copy values to another list
for(int i=0; i<foundRows.size(); i++){
foundRowsToDelete.add(foundRows.get(i));
}
//Delete values from rows, leaving empty rows
while(foundRowsToDelete.size()!=0){
int rowIndex = foundRowsToDelete.get(0);
Row removingRow = mySheet.getRow(rowIndex);
if (removingRow != null) {
mySheet.removeRow(removingRow);
foundRowsToDelete.remove(0);
}
}
//Move empty rows to bottom of the sheet
for(int i = 0; i < mySheet.getLastRowNum(); i++){
if(isRowEmpty(i)){
mySheet.shiftRows(i+1, mySheet.getLastRowNum(), -1);
i--;
}
}
}
I check whether they are empty using the duplicated row counter:
//Comparison of previously detected empty rows and given row count
public static boolean isRowEmpty(int suspectedRowNumber) {
for(int i=0;i<foundRows.size();i++){
if (suspectedRowNumber == foundRows.get(i)){
foundRows.remove(i);
return true;
}
}
return false;
}
However, only the first of these rows gets deleted. The rest stay empty.
I therefore assume that something is wrong with my incrementing, but I just can't figure out exactly what.
Thanks for your help in advance.
It's not immediately clear why your code isn't working, but I would look at a couple of things to debug it:
Your foundRowsToDelete ArrayList is populated with the values contained in the foundRows list. Are you sure that what you expect to find in foundRows is actually there?
Is there a reason you don't remove the row when initially iterating through the rows in your sheet? Maybe something like this:
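Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
    Cell myCell = row.getCell(0);
    // getCell returns null for a cell that is not defined in the file
    if (myCell != null && myCell.getCellType() == Cell.CELL_TYPE_STRING) {
        String foundString = myCell.getStringCellValue();
        if (foundString.equalsIgnoreCase(searchString)) {
            // why not just remove here?
            // (if removing while iterating fails, collect the rows and remove them after the loop)
            sheet.removeRow(row);
        }
    }
}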
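One pitfall that often comes up when deleting by row number is that each deletion shifts the rows below it, so indexes collected earlier no longer point at the same rows. Whether that is the cause in the question is not established here. The sketch below only illustrates the pitfall with a plain List<String[]> standing in for the sheet; RowRemoval and removeMatching are hypothetical names and no POI calls are used. Walking backwards means a removal never changes the index of a row that is still to be visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for a sheet: each String[] is one row of cell values.
final class RowRemoval {
    // Removes every row whose first cell equals searchString, iterating from the
    // last row to the first so that deletions do not shift unvisited rows.
    static void removeMatching(List<String[]> rows, String searchString) {
        for (int i = rows.size() - 1; i >= 0; i--) {
            String[] row = rows.get(i);
            if (row.length > 0 && searchString.equals(row[0])) {
                rows.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>(Arrays.asList(
                new String[] { "keep", "1" },
                new String[] { "drop", "2" },
                new String[] { "drop", "3" },
                new String[] { "keep", "4" }));
        removeMatching(rows, "drop");
        for (String[] row : rows) {
            System.out.println(String.join(";", row));
        }
    }
}
```

The same ordering idea would apply to any structure where removing an element renumbers the ones after it.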

How to remove iterator loop and allow cellDateType to work in xlsx java?

Hi to all experts out there,
Currently I have a problem with an iterator loop. I need to remove it in order for my data to appear on my xlsx Excel sheet, but I am not sure how to remove it so that my code stays error-free. I suspect that the error may be in the iterator loop.
For now, this is my code and an image link showing how it looks now.
The data in the Excel sheet is not supposed to have a space in between, but apparently there is one. There isn't any data in the first column because I didn't key in any data on the website, so it's okay for the first column to be blank.
int r = 3;
for (Iterator iter = Cells.iterator();iter.hasNext();) {
Object[] _o = (Object[]) iter.next();
currentRow = s.createRow(r);
for(int colNum = 0; colNum < _col_cnt; colNum++){
XSSFCell currentCell =currentRow.createCell(colNum);
if (CellDataType[c].equals("STRING")
|| CellDataType[c].equals("VARCHAR")) {
String _l = (String) _o[colNum];
if (_l != null) {
currentCell.setCellValue(_l);
System.out.println("Data: " + _l);
}
}
hardcode (Testing):
int r = 3;
for (Iterator iter = Cells.iterator();iter.hasNext();) {
Object[] _o = (Object[]) iter.next();
currentRow = s.createRow(r);
for(int colNum = 0; colNum < _col_cnt; colNum++){
XSSFCell currentCell =currentRow.createCell(colNum);
currentCell.setCellValue("Hello");
Your problem is not in the code you've shown here, but it's in the code that you showed in your earlier question. Two words of advice.
If you used indentation correctly, and lined up the curly braces where they're supposed to be, you would see the error almost immediately.
If you stepped through your code with the debugger, and looked at the value of r, you would have found the problem immediately.
Your line r++; is inside the inner loop. This means it gets incremented 3 times for each iteration of the outer loop. You need to move r++; down one line, so that it's outside the inner loop, but inside the outer loop. That way, it will get incremented just once per row, which is what you need.
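The effect of the counter's position can be shown with the loops alone. This is a minimal sketch in plain Java, with no spreadsheet library: RowCounterPlacement and both method names are hypothetical, and recording the value of r stands in for the call to s.createRow(r). With 3 records and 3 columns starting at row 3, the first method uses consecutive rows while the second skips rows between records.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the nested loops; no spreadsheet library is involved.
final class RowCounterPlacement {
    // Counter advanced once per record, after the inner column loop has finished.
    static List<Integer> rowsUsed(int recordCount, int columnCount, int firstRow) {
        List<Integer> used = new ArrayList<>();
        int r = firstRow;
        for (int record = 0; record < recordCount; record++) {
            used.add(r); // stands in for s.createRow(r)
            for (int colNum = 0; colNum < columnCount; colNum++) {
                // a cell would be created and filled here
            }
            r++;
        }
        return used;
    }

    // Same loops with the counter advanced inside the column loop.
    static List<Integer> rowsUsedWithMisplacedCounter(int recordCount, int columnCount, int firstRow) {
        List<Integer> used = new ArrayList<>();
        int r = firstRow;
        for (int record = 0; record < recordCount; record++) {
            used.add(r);
            for (int colNum = 0; colNum < columnCount; colNum++) {
                r++; // runs columnCount times per record, leaving gaps between rows
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(rowsUsed(3, 3, 3));                     // [3, 4, 5]
        System.out.println(rowsUsedWithMisplacedCounter(3, 3, 3)); // [3, 6, 9]
    }
}
```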

handling empty columns in apache poi

I have a Java program that reads an Excel sheet.
I am iterating over cells as below:
Iterator cells = row.cellIterator();
String temp;
StringBuilder sb;
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
while (cells.hasNext()) {
Cell cell = (Cell) cells.next();
temp = null;
switch (cell.getCellType()) {
case Cell.CELL_TYPE_STRING:
temp = cell.getRichStringCellValue().getString();
break;
case Cell.CELL_TYPE_NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
temp = sdf.format(cell.getDateCellValue());
} else {
temp = df.format(new BigDecimal(cell.getNumericCellValue()));
}
break;
default:
}
if (temp == null || temp.equalsIgnoreCase("null")) {
sb.append("").append(";");
} else {
sb.append(temp).append(";");
}
}
As seen, I am trying to build a string containing the values from an Excel row, separated by semicolons.
The issue is that if a column value is empty, I want it to appear as an empty value in the string builder, i.e. as two consecutive semicolons.
However, the call
Cell cell = (Cell) cells.next();
simply ignores the empty cells and jumps over to next non empty cell.
So the line
if (temp == null || temp.equalsIgnoreCase("null"))
is never met.
How to get a handle on empty column values as well in the iterator ?
This is virtually a duplicate of this question, and so my answer to that question basically applies exactly to you too.
The Cell Iterator only iterates over cells that are defined in the file. If the cell has never been used in Excel, it probably won't appear in the file (Excel isn't always consistent...), so POI won't see it.
If you want to make sure you hit every cell, you should look up by index instead, and either check for null cells (indicating the cell has never existed in the file), or set a MissingCellPolicy to control how you want null and blank cells to be treated.
So, if you really do want to get every cell, do something like:
Row r = sheet.getRow(myRowNum);
int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
for (int cn=0; cn<lastColumn; cn++) {
Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
if (c == null) {
// The spreadsheet is empty in this cell
} else {
// Do something useful with the cell's contents
}
}
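The lookup-by-index idea above is what produces the consecutive semicolons the question asks for. The sketch below illustrates it without POI: a Map<Integer, String> keyed by column index stands in for the cells defined in a row, and SparseRowJoiner and join are hypothetical names. Every column from 0 up to the chosen count gets exactly one semicolon, whether or not a value exists for it.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for a row: only defined cells appear as keys in the map.
final class SparseRowJoiner {
    // Joins columns 0..columnCount-1, emitting an empty field for every column
    // that is missing from the map.
    static String join(Map<Integer, String> definedCells, int columnCount) {
        StringBuilder sb = new StringBuilder();
        for (int col = 0; col < columnCount; col++) {
            String value = definedCells.get(col); // null when the column was never defined
            if (value != null) {
                sb.append(value);
            }
            sb.append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> row = new TreeMap<>();
        row.put(0, "a");
        row.put(2, "c");
        System.out.println(join(row, 4)); // a;;c;;
    }
}
```

Iterating over the map's entries instead would visit only columns 0 and 2, which is the behaviour the question observed with the cell iterator.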
You can do this
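int previous = -1;
while (cells.hasNext()) {
    Cell cell = (Cell) cells.next();
    int current = cell.getColumnIndex();
    // one empty field for every column skipped since the previous cell
    int numberOfSemicolons = current - previous - 1;
    for (int k = 0; k < numberOfSemicolons; k++) {
        sb.append("").append(";");
    }
    // append the cell's own value and its ";" here
    previous = current;
}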
or you can do
int i = 0; // next column index that still needs a field
while (cells.hasNext()) {
    Cell cell = (Cell) cells.next();
    int current = cell.getColumnIndex();
    while (i < current) {
        sb.append("").append(";"); // empty field for a skipped column
        i++;
    }
    // append the cell's own value and its ";" here
    i++;
}
The answer posted by @Gagravarr works perfectly for me, but MissingCellPolicy is an enum now, so when getting the cell value, instead of using
Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
I have used
Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);

Apache POI rows number

I am using Apache POI java and want to get the total number of rows which are not empty. I successfully processed a whole row with all its columns. Now I am assuming that I get an excel sheet with multiple rows and not a single row...so how to go about that? I was thinking of getting total number of rows (int n) and then loop until i<=n but not sure.
Suggestions are most welcome :)
Note: Apache POI version is 3.8. I am not dealing with Xlsx format...only xls.
Yes I tried this code but got 20 in return....which is not possible given I have only 5 rows
FileInputStream fileInputStream = new FileInputStream("COD.xls");
HSSFWorkbook workbook = new HSSFWorkbook(fileInputStream);
HSSFSheet worksheet = workbook.getSheet("COD");
HSSFRow row1 = worksheet.getRow(3);
Iterator rows = worksheet.rowIterator();
int noOfRows = 0;
while( rows.hasNext() ) {
HSSFRow row = (HSSFRow) rows.next();
noOfRows++;
}
System.out.println("Number of Rows: " + noOfRows);
for (int i = 0; i <= sheet.getLastRowNum(); i++) {
if ((tempRow = sheet.getRow(i)) != null) {
//Your Code Here
}
}
The problem is that POI considers empty rows as physical rows. This happens at times in Excel and while they are not visible to the eye, the rows certainly exist.
If you were to open your Excel sheet, select everything below your data, and delete it (I know it looks empty, but do it anyway), POI will return the right number.
You may want to use getPhysicalNumberOfRows() rather than getLastRowNum().
You can iterate over the rows which are not empty using this:
Iterator<Row> rowIterator = sheet.rowIterator();
while (rowIterator.hasNext()) {
Row row = (Row) rowIterator.next();
// Your code here
}
Thanks
worksheet.getLastRowNum() // *base index 0*
This method gives you the number of the last row that contains anything. Even if you have filled only 5 rows, you might have entered something, such as spaces, in the remaining 15 rows or in the 21st row, which is why it reports 20 as the last row number.
It can also happen that you enter data in every 5th row (starting from 1); your 5th entry is then in the 21st row, so this method again returns 20.
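The difference between "index of the last row" and "number of stored rows" in that example can be sketched with a sorted map. This is an illustration only: SparseRowCounts and its methods are hypothetical names, a TreeMap keyed by zero-based row index stands in for the sheet, and returning -1 for an empty map is a choice made for this sketch rather than a statement about POI's behaviour.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a sheet: only rows that are actually stored appear as keys.
final class SparseRowCounts {
    // Zero-based index of the last stored row, or -1 if nothing is stored.
    static int lastRowIndex(TreeMap<Integer, String> rows) {
        return rows.isEmpty() ? -1 : rows.lastKey();
    }

    // Number of rows that are actually stored, regardless of where they sit.
    static int storedRowCount(TreeMap<Integer, String> rows) {
        return rows.size();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> rows = new TreeMap<>();
        for (int index = 0; index <= 20; index += 5) {
            rows.put(index, "entry"); // data in the 1st, 6th, 11th, 16th and 21st rows
        }
        System.out.println(lastRowIndex(rows));   // 20
        System.out.println(storedRowCount(rows)); // 5
    }
}
```

A loop that runs up to the last index has to expect gaps, while a count of stored rows says nothing about where those rows are.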
