xssf How to get anything as String - java

I am trying to parse an Excel file into XML using Apache POI XSSF.
Given a cell whose content I don't know, I just want to get a String out of it.
But when I use
cell.getStringCellValue()
it throws an exception, which is not very surprising, since it is documented that way.
So I worked around that by checking whether it is a numeric or a text cell. But what should I do with formula cells? They may contain numbers, like
= A2 + B2
which gives me the sum (e.g. 4), or a reference to another text cell, like
= C2
which might refer to a text like "Hans".
How can I know what is really in my cell, and how do I get a String out of it?

Excel stores some cells as strings, but most as numbers with special formatting rules applied to them. If you want to get the raw values, use a switch statement based on cell.getCellType() as some of the other answers have shown.
However, if you want the cell as a string that looks the same as what Excel would show, with all of the cell's formatting rules and its cell type applied, then Apache POI has a class to do just that: DataFormatter.
All you need to do is something like:
Workbook wb = WorkbookFactory.create(new File("myfile.xls"));
DataFormatter df = new DataFormatter();
Sheet s = wb.getSheetAt(0);
Row r1 = s.getRow(0);
Cell cA1 = r1.getCell(0);
String asItLooksInExcel = df.formatCellValue(cA1);
It doesn't matter what the cell type is: DataFormatter will format it as best it can, using the rules applied in Excel, and give you back a nicely formatted string.
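To make the "raw value versus what Excel shows" distinction concrete, here is a minimal sketch that uses only the JDK. It is not how DataFormatter is implemented; java.text.DecimalFormat patterns merely resemble Excel number formats, and the class and method names (DisplayFormatSketch, display) are made up for illustration. It shows one stored double rendered differently depending on the format attached to it.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class DisplayFormatSketch {

    // Renders a raw numeric value with a java.text pattern.
    // The pattern is a stand-in for the number format stored on an Excel cell.
    static String display(double raw, String pattern) {
        // Fixed locale so the separators do not depend on the machine's settings.
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(raw);
    }

    public static void main(String[] args) {
        double raw = 0.125;
        System.out.println(Double.toString(raw));        // raw value: 0.125
        System.out.println(display(raw, "0.0%"));        // shown as a percentage: 12.5%
        System.out.println(display(1234.5, "#,##0.00")); // shown with grouping: 1,234.50
    }
}
```

The raw getters give you the first kind of output; DataFormatter aims for the second kind, using the format actually stored in the workbook.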

The accepted answer does not work with formula cells (the resulting String contains the formula, not the result of the formula).
Here is what worked for me in every case:
final XSSFWorkbook workbook = new XSSFWorkbook(file);
final DataFormatter dataFormatter = new DataFormatter();
final FormulaEvaluator objFormulaEvaluator = new XSSFFormulaEvaluator(workbook);
final Cell cell = ...;
objFormulaEvaluator.evaluate(cell);
final String cellValue = dataFormatter.formatCellValue(cell, objFormulaEvaluator);

You can add a check on the cell type, as below:
switch(cell.getCellType()) {
case Cell.CELL_TYPE_BOOLEAN:
System.out.print(cell.getBooleanCellValue() + "\t\t");
break;
case Cell.CELL_TYPE_NUMERIC:
System.out.print(cell.getNumericCellValue() + "\t\t");
break;
case Cell.CELL_TYPE_STRING:
System.out.print(cell.getStringCellValue() + "\t\t");
break;
}

For formula cells, try switching on the cached result type:
case Cell.CELL_TYPE_FORMULA:
switch (cell.getCachedFormulaResultType()) {
case Cell.CELL_TYPE_STRING:
System.out.println(cell.getRichStringCellValue().getString());
break;
case Cell.CELL_TYPE_NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
System.out.println(cell.getDateCellValue() + "");
} else {
System.out.println(cell.getNumericCellValue());
}
break;
}
break;
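The date check in the numeric branch matters because Excel stores dates as plain numbers (days since a fixed starting point), so a date cell is numeric as far as the cell type goes. A minimal JDK-only sketch of that idea follows. It assumes the default 1900 date system, ignores the time-of-day fraction, and is only valid for serial numbers from 61 (1 March 1900) onwards; the class and method names are hypothetical, and in real code DateUtil or cell.getDateCellValue() does this for you.

```java
import java.time.LocalDate;

public class ExcelSerialDateSketch {

    // Effective day 0 for the 1900 date system. Using 30 Dec 1899 compensates
    // for Excel treating 1900 as a leap year, which is why this sketch is only
    // valid for serial numbers >= 61.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // Converts an Excel serial number to a calendar date, dropping the time fraction.
    static LocalDate toLocalDate(double serial) {
        long wholeDays = (long) Math.floor(serial);
        return EPOCH.plusDays(wholeDays);
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(43831.0));  // 2020-01-01
        System.out.println(toLocalDate(43831.75)); // same day, 18:00 is dropped
    }
}
```

Without the isCellDateFormatted check, such a cell would simply print as 43831.0.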

Related

Cannot get a numeric value from a text cell while getting data from xlsx file

I am getting the error below while reading data from an .xlsx file, and it prevents me from reading the data. The error message is "Cannot get a numeric value from a text cell".
Here is the code:
switch (cell.getCellType()) {
case HSSFCell.CELL_TYPE_FORMULA:
rowArray[count] = isCellDateFormatted(cell) ? dateFormat.format(cell.getDateCellValue()) : Double.toString(cell.getNumericCellValue());
break;
case Cell.CELL_TYPE_BOOLEAN:
rowArray[count] = Boolean.toString(cell.getBooleanCellValue());
break;
case Cell.CELL_TYPE_NUMERIC:
rowArray[count] = isCellDateFormatted(cell) ? dateFormat.format(cell.getDateCellValue()) : Double.toString(cell.getNumericCellValue());
break;
case Cell.CELL_TYPE_STRING:
rowArray[count] = cell.getStringCellValue().replace(separatorStr, escapeStr + separatorStr).replace("\n", " ");
break;
default:
rowArray[count] = "";
}
Here is the exception:
java.lang.IllegalStateException: Cannot get a numeric value from a text cell
at org.apache.poi.xssf.usermodel.XSSFCell.typeMismatch(XSSFCell.java:994)
at org.apache.poi.xssf.usermodel.XSSFCell.getNumericCellValue(XSSFCell.java:305)
at org.apache.poi.ss.usermodel.DateUtil.isCellDateFormatted(DateUtil.java:494)
at cvx.qwer.adfffg.excel.XlsxToCsv.convertToCsv(XlsxToCsv.java:76)
at cvx.qwer.adfffg.excel.XlsxToCsv.main(XlsxToCsv.java:136)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Judging from the code you showed, rowArray seems to be a String array, so what you need is a String representation of every cell value. The best way to get that is to use DataFormatter in combination with FormulaEvaluator:
...
Workbook workbook ...
...
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
DataFormatter formatter = new DataFormatter();
...
Cell cell ...
...
rowArray[count] = formatter.formatCellValue(cell, evaluator);
...
With this, the whole switch over the cell types is no longer necessary.
DataFormatter.formatCellValue:
Returns the formatted value of a cell as a String regardless of the
cell type.

Apache POI: non-deprecated way to get cell content? [duplicate]

I'm reading an Excel file (file extension xlsx) using org.apache.poi 3.15.
This is my code:
try (FileInputStream fileInputStream = new FileInputStream(file); XSSFWorkbook workbook = new XSSFWorkbook(file)) {
XSSFSheet sheet = workbook.getSheetAt(0);
Iterator<Row> rowIterator = sheet.iterator();
while (rowIterator.hasNext()) {
Row row = rowIterator.next();
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
System.out.print(cell.getNumericCellValue() + "(Integer)\t");
break;
case Cell.CELL_TYPE_STRING:
System.out.print(cell.getStringCellValue() + "(String)\t");
break;
}
}
System.out.println("");
}
} catch (Exception e) {
e.printStackTrace();
}
I get a warning that cell.getCellType() is deprecated. Can anyone tell me the alternative?
The accepted answer explains the reason for the deprecation but fails to name the alternative:
CellType getCellTypeEnum()
where CellType is the enum describing the type of the cell.
The plan is to rename getCellTypeEnum() back to getCellType() in POI 4.0.
You can use:
cell.getCellTypeEnum()
To compare the cell type, use CellType as follows:
if(cell.getCellTypeEnum() == CellType.STRING){
.
.
.
}
You can refer to the documentation; it's pretty helpful:
https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/Cell.html
Use getCellType() (from POI 4.0 on it returns the CellType enum):
switch (cell.getCellType()) {
case BOOLEAN :
//To-do
break;
case NUMERIC:
//To-do
break;
case STRING:
//To-do
break;
}
FileInputStream fis = new FileInputStream(new File("C:/Test.xlsx"));
//create workbook instance
XSSFWorkbook wb = new XSSFWorkbook(fis);
//create a sheet object to retrieve the sheet
XSSFSheet sheet = wb.getSheetAt(0);
//to evaluate cell type
FormulaEvaluator formulaEvaluator = wb.getCreationHelper().createFormulaEvaluator();
for(Row row : sheet)
{
for(Cell cell : row)
{
switch(formulaEvaluator.evaluateInCell(cell).getCellTypeEnum())
{
case NUMERIC:
System.out.print(cell.getNumericCellValue() + "\t");
break;
case STRING:
System.out.print(cell.getStringCellValue() + "\t");
break;
default:
break;
}
}
System.out.println();
}
This code works fine. Use getCellTypeEnum(), and in the case labels use just NUMERIC or STRING.
From the documentation:
int getCellType()
Deprecated. POI 3.15. Will return a CellType enum in the future.
Return the cell type. Will return CellType in version 4.0 of POI. For forwards compatibility, do not hard-code cell type literals in your code.
It looks like 3.15 offers no satisfying solution: either one uses the old style with Cell.CELL_TYPE_*, or one uses the method getCellTypeEnum(), which is marked as deprecated.
A lot of disruption for little added value...
For POI 3.17 this worked for me:
switch (cellh.getCellTypeEnum()) {
case FORMULA:
if (cellh.getCellFormula().indexOf("LINEST") >= 0) {
value = Double.toString(cellh.getNumericCellValue());
} else {
value = XLS_getDataFromCellValue(evaluator.evaluate(cellh));
}
break;
case NUMERIC:
value = Double.toString(cellh.getNumericCellValue());
break;
case STRING:
value = cellh.getStringCellValue();
break;
case BOOLEAN:
if(cellh.getBooleanCellValue()){
value = "true";
} else {
value = "false";
}
break;
default:
value = "";
break;
}
You can do this:
private String cellToString(HSSFCell cell) {
CellType type;
Object result;
type = cell.getCellType();
switch (type) {
case NUMERIC : //numeric value in excel
result = cell.getNumericCellValue();
break;
case STRING : //String Value in Excel
result = cell.getStringCellValue();
break;
default :
throw new RuntimeException("There is no support for this type of value in Apache POI");
}
return result.toString();
}
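One caveat with turning numeric cells into text via toString() or Double.toString(), as several snippets above do: getNumericCellValue() returns a double, so a cell showing 4 comes out as "4.0", and large values switch to scientific notation. If you want the plain digits without going through DataFormatter, a minimal JDK-only sketch could look like this. The class and method names are made up for illustration, it only handles finite values (BigDecimal.valueOf rejects NaN and infinities), and it does not apply any Excel formatting.

```java
import java.math.BigDecimal;

public class PlainNumberSketch {

    // Converts a finite double to digits without a trailing ".0"
    // and without scientific notation.
    static String toPlainString(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(Double.toString(4.0));   // 4.0
        System.out.println(toPlainString(4.0));     // 4
        System.out.println(Double.toString(1.0E10)); // 1.0E10
        System.out.println(toPlainString(1.0E10));  // 10000000000
        System.out.println(toPlainString(2.5));     // 2.5
    }
}
```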

Reading cell content as rich text using Apache POI. Handling numeric cells when using cell.getRichStringCellValue () method

I want to read a cell value from an Excel spreadsheet as rich text, not as a String, but an exception is thrown when the cell type is numeric and I call the cell.getRichStringCellValue() method. What would be a good approach to handle this problem?
You need to follow the approach carefully and lovingly laid out in the Apache POI documentation (who'd have thought?!). You'll want to do something like:
import java.io.File;
import java.util.Date;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.CellReference;
Workbook wb = WorkbookFactory.create(new File("input.xls"));
Sheet sheet1 = wb.getSheetAt(0);
for (Row row : sheet1) {
for (Cell cell : row) {
CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
System.out.print(cellRef.formatAsString());
System.out.print(" - ");
switch (cell.getCellType()) {
case Cell.CELL_TYPE_STRING:
RichTextString contents = cell.getRichStringCellValue();
// TODO Handle contents
break;
case Cell.CELL_TYPE_NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
Date date = cell.getDateCellValue();
// TODO Handle Date value
} else {
double number = cell.getNumericCellValue();
// TODO Handle number
}
break;
case Cell.CELL_TYPE_BOOLEAN:
boolean value = cell.getBooleanCellValue();
// TODO Handle
break;
case Cell.CELL_TYPE_FORMULA:
// Either get formula, or check last value, or evaluate
break;
default:
// Shouldn't happen
}
}
}
Then add your own logic for handling the contents now that you have fetched them.

How to get the formula cell value(data) using apache poi 3.1

I am using Apache POI poi-3.1-FINAL-20080629 in my application, and I have a problem with formulas.
My cell has a formula (sheet2!C10), and the data in that cell is of String type (e.g. $3,456). How can I access that cell's value? I also want to display the formula.
My code looks like this:
HSSFWorkbook wb = new HSSFWorkbook(file);
HSSFSheet sheet = wb.getSheetAt(0);
HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);
Iterator rows = sheet.rowIterator();
while (rows.hasNext()) {
HSSFRow row = (HSSFRow) rows.next();
System.out.println("\n");
Iterator cells = row.cellIterator();
while (cells.hasNext()) {
HSSFCell cell = (HSSFCell) cells.next();
int cellType = cell.getCellType();
if (HSSFCell.CELL_TYPE_NUMERIC == cellType)
System.out.print(cell.getNumericCellValue() + " ");
else if (HSSFCell.CELL_TYPE_STRING == cellType)
System.out.print(cell.getStringCellValue() + " ");
else if (HSSFCell.CELL_TYPE_BOOLEAN == cellType)
System.out.print(cell.getBooleanCellValue() + " ");
else if (HSSFCell.CELL_TYPE_BLANK == cellType)
System.out.print("BLANK ");
else if (HSSFCell.CELL_TYPE_FORMULA == cellType)
System.out.print(evaluator.evaluateInCell(cell).toString());
else
System.out.print("Unknown cell type " + cellType);
}
}
evaluator.evaluateInCell(cell).toString() is throwing a NullPointerException. Please help!
OK, so first up, Apache POI 3.1 is rather old. The clue is in the jar name - poi-3.1-FINAL-20080629 dates from 2008, so 6 years ago! It's well worth upgrading, the number of bug fixes since then is pretty vast....
As for your problem, I'm fairly sure that you don't want to be calling evaluator.evaluateInCell. That's almost never the right method to call.
You have three options available to you, depending on what you want to do with your formula cell:
I want the formula string
Call Cell.getCellFormula and you'll get back the formula string
I want the last formula value Excel calculated
When Excel writes out a file, it normally stores the last evaluated value for the formula, in a cache, to make opening nice and quick. You can read this, if present, from Apache POI. It's only a cache, so it might not always be correct, but it normally is.
You need to call Cell.getCachedFormulaResultType, then switch on that, and read the values out using the normal getters, e.g.:
int cellType = cell.getCellType();
handleCell(cell, cellType);
private void handleCell(Cell cell, int cellType) {
if (HSSFCell.CELL_TYPE_NUMERIC == cellType)
System.out.print(cell.getNumericCellValue() + " ");
else if (HSSFCell.CELL_TYPE_STRING == cellType)
System.out.print(cell.getStringCellValue() + " ");
else if (HSSFCell.CELL_TYPE_BOOLEAN == cellType)
System.out.print(cell.getBooleanCellValue() + " ");
else if (HSSFCell.CELL_TYPE_BLANK == cellType)
System.out.print("BLANK ");
else if (HSSFCell.CELL_TYPE_FORMULA == cellType)
handleCell(cell, cell.getCachedFormulaResultType());
else
System.out.print("Unknown cell type " + cellType);
}
I want Apache POI to calculate the formula result for me
Your best bet here is to get a FormulaEvaluator object and call evaluate:
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
// suppose your formula is in B3
CellReference cellReference = new CellReference("B3");
Row row = sheet.getRow(cellReference.getRow());
Cell cell = row.getCell(cellReference.getCol());
CellValue cellValue = evaluator.evaluate(cell);
switch (cellValue.getCellType()) {
case Cell.CELL_TYPE_BOOLEAN:
System.out.println(cellValue.getBooleanValue());
break;
case Cell.CELL_TYPE_NUMERIC:
System.out.println(cellValue.getNumberValue());
break;
case Cell.CELL_TYPE_STRING:
System.out.println(cellValue.getStringValue());
break;
case Cell.CELL_TYPE_BLANK:
break;
case Cell.CELL_TYPE_ERROR:
break;
// CELL_TYPE_FORMULA will never happen
case Cell.CELL_TYPE_FORMULA:
break;
}
The Apache POI Formula Evaluation page has more on all of these.
