Java POI FormulaEvaluator giving unexpected floating point value - java

I am using Java POI library to read an Excel file and then display it in HTML table. The Excel file is very simple with 1 row and 3 columns:
A1 cell= 21.7
B1 cell= 20.0
C1 cell is a formula cell with the formula =(A1-B1)/B1 and it has a custom format of "Percentage" with 0 decimal places. Excel displays its value as 9%. This is because 1.7/20 on a calculator gives result as 0.085; when it is converted to "Percentage" format it becomes 8.5% and because format says include 0 decimal places, it gets rounded up to 9%, so that's what Excel displays. All good.
However, POI displays the value as 8% instead. I observe that 1.7/20 is calculated to be 0.084999999. Because of the Percentage format as applied above it converts to 8.4999999% and because of 0 decimal places, it gets rounded down to 8%.
How can I have POI return me 9% instead of 8%? Here is the code snippet:
String myFormat="0%";
CreationHelper helper = wbWrapper.getWb().getCreationHelper();
CellUtil.setCellStyleProperty(cell, CellUtil.DATA_FORMAT,helper.createDataFormat().getFormat(myFormat));
String val = dataFormatter.formatCellValue(cell, evaluator);
Here evaluator is an instance of org.apache.poi.ss.usermodel.FormulaEvaluator and dataFormatter is an instance of org.apache.poi.ss.usermodel.DataFormatter
When I print the variable "val" it is returning 8% instead of what is displayed in Excel (9%).

Your observations are correct. The problem is caused by the general limitations of binary floating-point arithmetic. It can be shown simply:
...
System.out.println(1.7/20.0); //0.08499999999999999
System.out.println((21.7-20.0)/20.0); //0.08499999999999996
...
As you can see, dividing the double value 1.7 by the double value 20.0 results in 0.08499999999999999. This would be fine, since DecimalFormat would treat this value as 0.085. But the more complex expression (21.7-20.0)/20.0 results in 0.08499999999999996, which is clearly lower than 0.085.
Excel tries to work around those problems with an additional rule for floating-point values: it always uses only 15 significant decimal digits of a floating-point value. So Excel does something like this:
...
BigDecimal bd = new BigDecimal((21.7-20.0)/20.0);
System.out.println(bd.round(new MathContext(15)).doubleValue()); //0.085
...
Neither Apache POI's FormulaEvaluator nor its DataFormatter behaves like Excel on this point. That explains the difference.
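The 15-significant-digit rule can be tried out with the JDK alone, without POI. The following is only a minimal sketch of the idea, not Excel's or POI's actual implementation. The class and method names are hypothetical, and the HALF_UP rounding mode is an assumption made here to mimic Excel's display rounding (a plain DecimalFormat defaults to HALF_EVEN, which would turn an exact 8.5 into 8).

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class ExcelLikeRounding {

    // Round a double to 15 significant decimal digits, as described above for Excel.
    public static double roundTo15Digits(double value) {
        return new BigDecimal(value).round(new MathContext(15)).doubleValue();
    }

    // Format as a whole-number percentage; HALF_UP is an assumption of this sketch.
    public static String formatPercent(double value) {
        DecimalFormat format = new DecimalFormat("0%", DecimalFormatSymbols.getInstance(Locale.US));
        format.setRoundingMode(RoundingMode.HALF_UP);
        return format.format(value);
    }

    public static void main(String[] args) {
        double raw = (21.7 - 20.0) / 20.0;
        System.out.println(formatPercent(raw));                  // 8% (raw value is slightly below 0.085)
        System.out.println(formatPercent(roundTo15Digits(raw))); // 9% (value rounded to 15 digits first)
    }
}
```

The second call shows why rounding before formatting matters: once the value is 0.085 again, the percentage format rounds it up instead of down.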
One could write a custom MyDataFormatter whose only difference from /org/apache/poi/ss/usermodel/DataFormatter.java is:
...
private String getFormattedNumberString(Cell cell, ConditionalFormattingEvaluator cfEvaluator) {
if (cell == null) {
return null;
}
Format numberFormat = getFormat(cell, cfEvaluator);
double d = cell.getNumericCellValue();
java.math.BigDecimal bd = new java.math.BigDecimal(d);
d = bd.round(new java.math.MathContext(15)).doubleValue();
if (numberFormat == null) {
return String.valueOf(d);
}
String formatted = numberFormat.format(Double.valueOf(d));
return formatted.replaceFirst("E(\\d)", "E+$1"); // to match Excel's E-notation
}
...
Using that MyDataFormatter instead of DataFormatter would then be more compatible with Excel's behavior.
Example:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
class CreateExcelEvaluateFormula {
public static void main(String[] args) throws Exception {
Workbook workbook = new XSSFWorkbook();
CreationHelper creationHelper = workbook.getCreationHelper();
FormulaEvaluator formulaEvaluator = creationHelper.createFormulaEvaluator();
Sheet sheet = workbook.createSheet();
Row row = sheet.createRow(0);
Cell cell = row.createCell(0); cell.setCellValue(21.7);
cell = row.createCell(1); cell.setCellValue(20.0);
cell = row.createCell(2); cell.setCellFormula("(A1-B1)/B1");
formulaEvaluator.evaluateFormulaCell(cell);
double d = cell.getNumericCellValue();
System.out.println(d); //0.08499999999999996
MyDataFormatter dataFormatter = new MyDataFormatter();
String myFormat="0%";
CellUtil.setCellStyleProperty(cell, CellUtil.DATA_FORMAT, creationHelper.createDataFormat().getFormat(myFormat));
String val = dataFormatter.formatCellValue(cell, formulaEvaluator);
System.out.println(val); //9%
FileOutputStream out = new FileOutputStream("Excel.xlsx");
workbook.write(out);
out.close();
workbook.close();
}
}

Related

Date cell mask doesn't come correctly

I'm reading an .xlsx file with Apache POI that has a cell with the following value: Jan/17.
But when I check the cell with
String pattern = cell.getCellStyle().getDataFormatString()
pattern returns the wrong mask, mmm-yy, and the wrong value, Jan-17.
I tried using DataFormatter, but it returns the same result.
I tried using CellDateFormatter, but it returns the same result too.
Replacing the character "-" with "/" is not an option, because "-" may legitimately be used in the file.
This is not an error; it is how Microsoft Excel localizes date formats. Excel files always store formats in the en_US locale, which for this format is mmm-yy. It is the Excel built-in format with ID 17 (0x11, "mmm-yy"); see BuiltinFormats. It gets applied when the user enters only a month and a year in a cell, for example 1/17, which is month 1 in year 17.
In my German Excel, using the German Windows locale, it looks like this:
The Excel GUI then interprets that format differently depending on the Excel and/or Windows locale settings. For example, if I change the Windows locale settings to Portuguese (Brazil) in Control Panel - Time and Region:
then it looks like this:
Note that nothing has changed in the Excel file. Only the Windows locale settings have changed.
Unfortunately, Apache POI's DataFormatter fails to interpret locale settings exactly like Excel does.
The following code interprets the Excel built-in format with ID 17 (0x11, "mmm-yy") as mmm-yy = Jan-17 using the en_US locale. This is correct. But using the de_DE locale it interprets it as mmm.-yy = Jan.-17. This is wrong; it should be mmm yy = Jan 17, as in Excel. And using the pt_BR locale it interprets it as mmm.-yy = jan.-17. This is wrong too; it should be mmm/yy = jan/17, as in Excel.
import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;
class ReadExcel {
public static void main(String[] args) throws Exception {
Workbook workbook = WorkbookFactory.create(new FileInputStream("./ExcelExampleIn.xlsx"));
// up to apache poi 5.1.0 a FormulaEvaluator is needed to evaluate the formulas while using DataFormatter
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
//DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("en", "US"));
//DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("de", "DE"));
DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("pt", "BR"));
// from 5.2.0 on the DataFormatter can be set to use cached values for formula cells
dataFormatter.setUseCachedValuesForFormulaCells(true);
Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
String pattern = cell.getCellStyle().getDataFormatString();
System.out.println(pattern);
//String value = dataFormatter.formatCellValue(cell, evaluator); // up to apache poi 5.1.0
String value = dataFormatter.formatCellValue(cell); // from apache poi 5.2.0 on
System.out.println(value);
}
}
workbook.close();
}
}
To work around this inaccuracy in Apache POI, one could add a special data format for mmm-yy to the DataFormatter, depending on the locale used. This can be achieved using public void addFormat(java.lang.String excelFormatStr, java.text.Format format).
Complete example again:
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.util.LocaleUtil;
import java.io.FileInputStream;
class ReadExcel {
public static void main(String[] args) throws Exception {
Workbook workbook = WorkbookFactory.create(new FileInputStream("./ExcelExampleIn.xlsx"));
// up to apache poi 5.1.0 a FormulaEvaluator is needed to evaluate the formulas while using DataFormatter
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
//LocaleUtil.setUserLocale(new java.util.Locale("en", "US"));
//LocaleUtil.setUserLocale(new java.util.Locale("de", "DE"));
LocaleUtil.setUserLocale(new java.util.Locale("pt", "BR"));
DataFormatter dataFormatter = new DataFormatter(); // uses user locale set
// from 5.2.0 on the DataFormatter can be set to use cached values for formula cells
dataFormatter.setUseCachedValuesForFormulaCells(true);
if (LocaleUtil.getUserLocale().equals(new java.util.Locale("de", "DE"))) {
dataFormatter.addFormat("mmm-yy", new java.text.SimpleDateFormat("MMM yy", new java.util.Locale("en", "US")));
} else if (LocaleUtil.getUserLocale().equals(new java.util.Locale("en", "US"))) {
dataFormatter.addFormat("mmm-yy", new java.text.SimpleDateFormat("MMM-yy", new java.util.Locale("en", "US")));
} else if (LocaleUtil.getUserLocale().equals(new java.util.Locale("pt", "BR"))) {
dataFormatter.addFormat("mmm-yy", new java.text.SimpleDateFormat("MMM/yy", new java.util.Locale("en", "US")));
}
Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
String pattern = cell.getCellStyle().getDataFormatString();
System.out.println(pattern);
//String value = dataFormatter.formatCellValue(cell, evaluator); // up to apache poi 5.1.0
String value = dataFormatter.formatCellValue(cell); // from apache poi 5.2.0 on
System.out.println(value);
}
}
workbook.close();
}
}
Note that the java.text.SimpleDateFormat always gets created using the en_US locale. Otherwise, month abbreviations may be followed by a dot, for example Jan. 17 or Jan./17.
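The locale-to-pattern choice from the example above can also be kept in a small lookup table instead of an if/else chain. The following is a minimal JDK-only sketch under that assumption; the class name, the map, and the fallback pattern are hypothetical and not part of Apache POI. The patterns are the ones listed in this answer.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class MonthYearPatterns {

    // Java date patterns mimicking how Excel shows built-in format 17 ("mmm-yy") per user locale.
    private static final Map<Locale, String> PATTERNS = new HashMap<>();
    static {
        PATTERNS.put(Locale.US, "MMM-yy");
        PATTERNS.put(Locale.GERMANY, "MMM yy");
        PATTERNS.put(Locale.forLanguageTag("pt-BR"), "MMM/yy");
    }

    // Month names always come from en_US so that no trailing dot is appended.
    // Unknown locales fall back to the stored en_US pattern (an assumption of this sketch).
    public static SimpleDateFormat formatFor(Locale userLocale) {
        String pattern = PATTERNS.getOrDefault(userLocale, "MMM-yy");
        return new SimpleDateFormat(pattern, Locale.US);
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(2017, Calendar.JANUARY, 15).getTime();
        System.out.println(formatFor(Locale.US).format(date));                        // Jan-17
        System.out.println(formatFor(Locale.GERMANY).format(date));                   // Jan 17
        System.out.println(formatFor(Locale.forLanguageTag("pt-BR")).format(date));   // Jan/17
    }
}
```

The returned format could then be registered with dataFormatter.addFormat("mmm-yy", ...) exactly as in the complete example above.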

Apache POI: DataFormatter trimming values beyond ten decimal places

I have some simple code that populates values into a database:
try {
for(int indexRow = 1; indexRow < numberOfRecords; indexRow++) {
record = new String[noOfColumns];
for(int indexColumn = 0;indexColumn < noOfColumns; indexColumn++) {
indexError = indexColumn;
String value = "";
XSSFRow row = sheet.getRow(indexRow);
if(row != null) {
XSSFCell cell = sheet.getRow(indexRow).getCell(indexColumn);
if(cell != null) {
value = fmt.formatCellValue(cell);
}
record[indexColumn] = value;
}
}
records.add(record);
}
}
I have gone through the source code as well, but I cannot find a way to configure the default DataFormatter so that its DecimalFormat accommodates the extra digits.
Any help would be greatly appreciated.
E.g., in Excel I have
-5.57055337362326
but through the code it is written into the DB as
-5.5705533736
Important: The purpose of the DataFormatter.formatCellValue() method is to return the cell's value the way it is shown in the Excel document.
Say you define a numeric format in Excel that shows 4 fractional digits, and your document looks like this:
Your code sample will return -5,5706; if you change the numeric format to show 8 fractional digits, the result will be -5,57055337.
By default, the numeric format in Excel is based on 10 digits (in Apache POI, see the ExcelGeneralNumberFormat.decimalFormat constant), and judging by your output it looks like this is the one used in your document.
Solution
As mentioned by @samabcde (I am adding my answer to fix a couple of issues in his answer and to provide additional details), the solution is to use cell.getNumericCellValue() instead:
DecimalFormat decimalFormat = new DecimalFormat("#.###############");
String cellValue = decimalFormat.format(cell.getNumericCellValue());
Here we've used the "#.###############" format with 15 digits, since that is the maximum precision for Excel.
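The effect of the two patterns can be checked with the JDK alone. This is a small sketch, not POI code: the class name is hypothetical, and the ten-digit pattern "#.##########" is used here as an assumed stand-in for the default (General) number format mentioned above.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class FractionDigitsDemo {

    // Format with US symbols so the decimal separator is always a dot.
    public static String format(String pattern, double value) {
        return new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US)).format(value);
    }

    public static void main(String[] args) {
        double value = -5.57055337362326;
        System.out.println(format("#.##########", value));      // -5.5705533736
        System.out.println(format("#.###############", value)); // -5.57055337362326
    }
}
```

Because "#" suppresses trailing zeros, the 15-digit pattern does not pad shorter values: -5.5 is still printed as -5.5.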
Additional information
Please pay attention to this article: When reading Excel with POI, beware of floating points
As for configuring DataFormatter: you can set a default number format using DataFormatter#setDefaultNumberFormat(Format format), and it will be used when you call formatCellValue(cell), but only when the Excel document uses unknown or broken formats.
P.S.: Answer to the first comment
It is not fully clear from your comment which cases you want to cover. My assumption is that DataFormatter works for you in all cases except numeric values, and that DecimalFormat with the "#.###############" pattern works for you in that case. If you need more specific logic, you will just have to check some additional conditions.
Here is a utility method you can use in this case:
private static final DecimalFormat DECIMAL_FORMAT = new DecimalFormat("#.###############");
private static final DataFormatter DATA_FORMATTER = new DataFormatter();
public static String formatCellValue(HSSFCell cell) {
if (cell != null && cell.getCellTypeEnum() == CellType.NUMERIC
&& !DateUtil.isCellDateFormatted(cell)) {
return DECIMAL_FORMAT.format(cell.getNumericCellValue());
} else {
return DATA_FORMATTER.formatCellValue(cell);
}
}
For the Excel file below:
Field A1 has format of 4 fractional digits with a real value 28,9999999999999
Field A2 has format of 4 fractional digits with a real value -5.5
Field A3 has default Excel format with a real value 28,9999999999999
The utility method above will return the real values here, i.e.: 28,9999999999999, -5.5 and 28,9999999999999
DataFormatter.formatCellValue() will return the values as they look in Excel itself, i.e.: 29,0000, -5,5000 and 29.
Reason for losing digit
The code below is from POI version 3.14. For a numeric cell, the DataFormatter.formatCellValue method eventually calls the getFormattedNumberString method. The code is:
private String getFormattedNumberString(Cell cell, ConditionalFormattingEvaluator cfEvaluator) {
Format numberFormat = getFormat(cell, cfEvaluator);
double d = cell.getNumericCellValue();
if (numberFormat == null) {
return String.valueOf(d);
}
String formatted = numberFormat.format(new Double(d));
return formatted.replaceFirst("E(\\d)", "E+$1"); // to match Excel's E-notation
}
The numberFormat will be a DecimalFormat with 10 decimal places by default. Assuming you have not set any format, a cell with the value '-5.57055337362326' will return '-5.5705533736', as expected.
Solution
If only the exact value is needed, using the cell.getNumericCellValue method and creating a DecimalFormat can solve this problem. If the display in Excel also needs to change, then we need to create a custom data format. Both solutions are illustrated in the following example:
import java.io.IOException;
import java.text.DecimalFormat;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormat;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.usermodel.XSSFCellStyle;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class TestDataFormatter {
public static void main(String[] args) throws IOException {
Double testValue = Double.valueOf("-5.57055337362326");
System.out.println("test value:\t\t" + testValue.toString());
XSSFWorkbook workbook = new XSSFWorkbook();
Sheet sheet = workbook.createSheet();
Row row = sheet.createRow(0);
Cell cell = row.createCell(0);
cell.setCellValue(testValue);
// 10 decimal place shown by default
DataFormatter dataFormatter = new DataFormatter();
String defaultFormatted = dataFormatter.formatCellValue(cell);
System.out.println("default formatted:\t" + defaultFormatted);
// Create custom format
XSSFCellStyle style = workbook.createCellStyle();
DataFormat customDataFormat = workbook.createDataFormat();
int dataFormatIndex = customDataFormat.getFormat("0.00000000000000");
style.setDataFormat(dataFormatIndex);
cell.setCellStyle(style);
String customFormatted = dataFormatter.formatCellValue(cell);
System.out.println("custom formatted:\t" + customFormatted);
// Get numeric value and then format by DecimalFormat
System.out.println("get numeric:\t\t" + cell.getNumericCellValue());
System.out.println(
"format numeric:\t\t" + new DecimalFormat("0.00000000000000").format(cell.getNumericCellValue()));
workbook.close();
}
}

How can Apache POI use formulas in streaming mode?

I am using Apache POI 3.17 (current). When I use HSSFCell.setCellFormula() to insert a formula like "A1+17", it works. When I do the same in streaming mode, using SXSSFCell.setCellFormula(), the formula appears (with a leading "=") in the input line, but the displayed result in the cell is always 0.
I tried with the cell types NUMERIC and FORMULA. Here is my minimal non-working example:
final SXSSFWorkbook wb = new SXSSFWorkbook();
final SXSSFSheet sheet = wb.createSheet("Test-S");
final SXSSFRow row = sheet.createRow(0);
final SXSSFCell cell1 = row.createCell(0);
cell1.setCellType(CellType.NUMERIC);
cell1.setCellValue(124);
final SXSSFCell formulaCell1 = row.createCell(1);
formulaCell1.setCellType(CellType.FORMULA);
formulaCell1.setCellFormula("A1 + 17");
final SXSSFCell formulaCell2 = row.createCell(2);
formulaCell2.setCellType(CellType.NUMERIC);
formulaCell2.setCellFormula("A1+18");
FileOutputStream os = new FileOutputStream("/tmp/test-s.xlsx");
wb.write(os);
wb.close();
os.close();
The three cells display as 124/0/0, although the formulae are displayed correctly in the input line.
Any hints are appreciated.
It works for me with Excel 2016; I get the correct results in the cells when I open the sample file. Older versions of Excel probably handle this slightly differently. Please try to force evaluation of the formulas with the following two calls:
// evaluate all formulas and store cached results
wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
// suggest to Excel to recalculate the formulas itself as well
sheet.setForceFormulaRecalculation(true);
Hopefully one of those two will make it work for you as well.
The answers do not explain why this problem with OpenOffice/LibreOffice only occurs if an SXSSFCell is used as the formula cell. When an XSSFCell is used as the formula cell, it does not occur.
The answer is that SXSSFCell always writes a cell value, even if the formula was not evaluated at all. Worse, it writes the value 0 (zero) in that case. This is a fundamental misuse of the value 0 in mathematics. The value 0 explicitly does not mean that there is no value or that the value is unknown. It means that the value is 0 and nothing else. So 0 should not be used as the cached formula result of a formula that has not been evaluated. Instead, no value should be written until the formula is evaluated, exactly as XSSFCell does.
So the really correct answer must be that Apache POI should correct the SXSSFCell code.
Workaround until then:
import java.io.FileOutputStream;
import org.apache.poi.xssf.streaming.*;
import org.apache.poi.ss.usermodel.CellType;
import java.lang.reflect.Field;
import java.util.TreeMap;
public class CreateExcelSXSSFFormula {
public static void main(String[] args) throws Exception {
SXSSFWorkbook wb = new SXSSFWorkbook();
SXSSFSheet sheet = wb.createSheet("Test-S");
SXSSFRow row = sheet.createRow(0);
SXSSFCell cell = row.createCell(0);
cell.setCellValue(124);
SXSSFFormulaonlyCell formulacell = new SXSSFFormulaonlyCell(row, 1);
formulacell.setCellFormula("A1+17");
cell = row.createCell(2);
cell.setCellFormula("A1+17");
formulacell = new SXSSFFormulaonlyCell(row, 3);
formulacell.setCellFormula("A1+18");
cell = row.createCell(4);
cell.setCellFormula("A1+18");
try (FileOutputStream out = new FileOutputStream("test-s.xlsx")) { // close the stream after writing
wb.write(out);
}
wb.close();
wb.dispose();
}
private static class SXSSFFormulaonlyCell extends SXSSFCell {
SXSSFFormulaonlyCell(SXSSFRow row, int cellidx) throws Exception {
super(row, CellType.BLANK);
Field _cells = SXSSFRow.class.getDeclaredField("_cells");
_cells.setAccessible(true);
@SuppressWarnings("unchecked") //we know the problem and expect runtime error if it possibly occurs
TreeMap<Integer, SXSSFCell> cells = (TreeMap<Integer, SXSSFCell>)_cells.get(row);
cells.put(cellidx, this);
}
@Override
public CellType getCachedFormulaResultTypeEnum() {
return CellType.BLANK;
}
}
}
Of course I should have mentioned that I use LibreOffice. I have now found that LibreOffice intentionally does not recalculate formulae from an Excel-created sheet, and it considers POI sheets as Excel-created.
See https://ask.libreoffice.org/en/question/12165/calc-auto-recalc-does-not-work/ .
Changing the LibreOffice settings (Tools – Options – LibreOffice Calc – formula – Recalculation on file load) helps.

Java POI : How to read Excel cell value and not the formula computing it?

I am using the Apache POI API to get values from an Excel file.
Everything is working great except for cells containing formulas. In fact, cell.getStringCellValue() returns the formula used in the cell and not the value of the cell.
I tried to use the evaluateFormulaCell() method, but it does not work because I am using the GETPIVOTDATA Excel formula, and this formula is not implemented in the API:
Exception in thread "main" org.apache.poi.ss.formula.eval.NotImplementedException: Error evaluating cell Landscape!K11
at org.apache.poi.ss.formula.WorkbookEvaluator.addExceptionInfo(WorkbookEvaluator.java:321)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:288)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:221)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCellValue(HSSFFormulaEvaluator.java:320)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCell(HSSFFormulaEvaluator.java:213)
at fromExcelToJava.ExcelSheetReader.unAutreTest(ExcelSheetReader.java:193)
at fromExcelToJava.ExcelSheetReader.main(ExcelSheetReader.java:224)
Caused by: org.apache.poi.ss.formula.eval.NotImplementedException: GETPIVOTDATA
at org.apache.poi.hssf.record.formula.functions.NotImplementedFunction.evaluate(NotImplementedFunction.java:42)
For formula cells, Excel stores two things. One is the formula itself; the other is the "cached" value (the last value that the formula was evaluated as).
If you want to get the last cached value (which may no longer be correct, but as long as Excel saved the file and you haven't changed it, it should be), you'll want something like:
for(Cell cell : row) {
if(cell.getCellType() == Cell.CELL_TYPE_FORMULA) {
System.out.println("Formula is " + cell.getCellFormula());
switch(cell.getCachedFormulaResultType()) {
case Cell.CELL_TYPE_NUMERIC:
System.out.println("Last evaluated as: " + cell.getNumericCellValue());
break;
case Cell.CELL_TYPE_STRING:
System.out.println("Last evaluated as \"" + cell.getRichStringCellValue() + "\"");
break;
}
}
}
Previously posted solutions did not work for me. cell.getRawValue() returned the same formula as stated in the cell. The following function worked for me:
public void readFormula() throws IOException {
FileInputStream fis = new FileInputStream("Path of your file");
Workbook wb = new XSSFWorkbook(fis);
Sheet sheet = wb.getSheetAt(0);
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
CellReference cellReference = new CellReference("C2"); // pass the cell which contains the formula
Row row = sheet.getRow(cellReference.getRow());
Cell cell = row.getCell(cellReference.getCol());
CellValue cellValue = evaluator.evaluate(cell);
switch (cellValue.getCellType()) {
case Cell.CELL_TYPE_BOOLEAN:
System.out.println(cellValue.getBooleanValue());
break;
case Cell.CELL_TYPE_NUMERIC:
System.out.println(cellValue.getNumberValue());
break;
case Cell.CELL_TYPE_STRING:
System.out.println(cellValue.getStringValue());
break;
case Cell.CELL_TYPE_BLANK:
break;
case Cell.CELL_TYPE_ERROR:
break;
// CELL_TYPE_FORMULA will never happen
case Cell.CELL_TYPE_FORMULA:
break;
}
}
There is an alternative method that gets the raw value of a cell containing a formula. Its return type is String. Use:
cell.getRawValue();
If the need is to read values from Excel sheets and have them as strings, for example to present them somewhere or to use them in text file formats, then using DataFormatter will be the best option.
DataFormatter is able to get a string from each cell value, whether the cell value itself is a string, boolean, number, error or date. This string then looks the same as Excel shows it in the cells in its GUI.
The only problem is formula cells. Up to Apache POI 5.1.0, a FormulaEvaluator is needed to evaluate the formulas while using DataFormatter. This fails when Apache POI is not able to evaluate the formula. From 5.2.0 on, the DataFormatter can be set to use cached values for formula cells. Then no formula evaluation is needed if Excel had evaluated the formulas before.
Complete example:
import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;
class ReadExcel {
public static void main(String[] args) throws Exception {
Workbook workbook = WorkbookFactory.create(new FileInputStream("./ExcelExample.xlsx"));
// up to apache poi 5.1.0 a FormulaEvaluator is needed to evaluate the formulas while using DataFormatter
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("en", "US"));
// from 5.2.0 on the DataFormatter can be set to use cached values for formula cells
dataFormatter.setUseCachedValuesForFormulaCells(true);
Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
//String value = dataFormatter.formatCellValue(cell, evaluator); // up to apache poi 5.1.0
String value = dataFormatter.formatCellValue(cell); // from apache poi 5.2.0 on
System.out.println(value);
}
}
workbook.close();
}
}
If you want to extract a raw-ish value from a HSSF cell, you can use something like this code fragment:
CellBase base = (CellBase) cell;
CellType cellType = cell.getCellType();
base.setCellType(CellType.STRING);
String result = cell.getStringCellValue();
base.setCellType(cellType);
At least for strings that are completely composed of digits (and automatically converted to numbers by Excel), this returns the original string (e.g. "12345") instead of a fractional value (e.g. "12345.0"). Note that setCellType is available in the Cell interface (as of v. 4.1) but deprecated and announced to be eliminated in v. 5.x, whereas this method is still available in class CellBase. Obviously, it would be nicer either to have getRawValue in the Cell interface or at least to be able to use getStringCellValue on non-STRING cell types. Unfortunately, none of the replacements for setCellType mentioned in the description cover this use case (maybe a member of the POI dev team reads this answer).
SelThroughJava's answer was very helpful; I had to modify it a bit to make it work with my code.
I used https://mvnrepository.com/artifact/org.apache.poi/poi and https://mvnrepository.com/artifact/org.testng/testng as dependencies.
The full code is given below with exact imports.
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.util.CellReference;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class ReadExcelFormulaValue {
private static final CellType NUMERIC = null;
public static void main(String[] args) {
try {
readFormula();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void readFormula() throws IOException {
FileInputStream fis = new FileInputStream("C:eclipse-workspace\\sam-webdbriver-diaries\\resources\\tUser_WS.xls");
org.apache.poi.ss.usermodel.Workbook workbook = WorkbookFactory.create(fis);
org.apache.poi.ss.usermodel.Sheet sheet = workbook.getSheetAt(0);
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
CellReference cellReference = new CellReference("G2"); // pass the cell which contains the formula
Row row = sheet.getRow(cellReference.getRow());
Cell cell = row.getCell(cellReference.getCol());
CellValue cellValue = evaluator.evaluate(cell);
System.out.println("Cell type month is "+cellValue.getCellTypeEnum());
System.out.println("getNumberValue month is "+cellValue.getNumberValue());
// System.out.println("getStringValue "+cellValue.getStringValue());
cellReference = new CellReference("H2"); // pass the cell which contains the formula
row = sheet.getRow(cellReference.getRow());
cell = row.getCell(cellReference.getCol());
cellValue = evaluator.evaluate(cell);
System.out.println("getNumberValue DAY is "+cellValue.getNumberValue());
}
}

How to get the formatted value of a number for a cell in Apache POI?

I wanted to get the value of a Numeric cell as a simple string.
Suppose the cell is of numeric type with the value 90%.
Now I cannot use cell.getStringCellValue(), as it will throw an exception.
I also cannot use cell.getNumericCellValue(), as it will return .9 and not 90%.
I want to store it in a DB column of type varchar2, so I want the value as a string only.
I cannot change the cell type in the xls, as that is the end user's job; I have to handle this in the code itself.
Also, a formatter doesn't work well, as there could be different cell types in the xls: dd:mm, dd:mm:ss, formula, etc.
All I want is that, whatever the cell type is, I get its value as a simple String.
You can force the value to be returned as a String using the method below:
HSSFDataFormatter hdf = new HSSFDataFormatter();
System.out.println (hdf.formatCellValue(mycell));
This will return "90%".
The API for this method is at http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DataFormatter.html#formatCellValue%28org.apache.poi.ss.usermodel.Cell%29
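To see why the stored number and the displayed text differ, the relationship can be reproduced with plain java.text. This is only an illustrative sketch and not how POI implements formatCellValue; the class and method names are hypothetical. The pattern "0%" happens to be valid both as an Excel format and as a DecimalFormat pattern, which is not true of Excel format strings in general.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class PercentAsText {

    // Apply a number pattern to the stored numeric value (US symbols for stable output).
    public static String asDisplayed(double storedValue, String pattern) {
        DecimalFormat format = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(storedValue);
    }

    public static void main(String[] args) {
        double stored = 0.9; // the numeric value behind a cell that shows 90%
        System.out.println(asDisplayed(stored, "0%"));   // 90%
        System.out.println(asDisplayed(stored, "0.00")); // 0.90
    }
}
```

The same stored value yields different strings depending on the format attached to the cell, which is why a formatter, rather than getNumericCellValue(), is needed to get the text the user sees.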
This works directly even with an HSSFCell.
It worked for me even when my cell is an HSSFCell.
I've also tried this cast, which works:
HSSFCell cell1 = (HSSFCell) row1.getCell(2);
HSSFDataFormatter hdf = new HSSFDataFormatter();
System.out.println ("formatted "+ hdf.formatCellValue(cell1));
Try
cell.getRichStringCellValue ().getString();
Have a look at this example
Here is Doc
The following code uses the current Apache POI versions as of 2021. DataFormatter can now be used for XSSF (Office Open XML *.xlsx) as well as for HSSF (BIFF *.xls) formats. It should be used together with FormulaEvaluator to get values from formula cells too.
import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;
class ReadExcel {
public static void main(String[] args) throws Exception {
Workbook workbook = WorkbookFactory.create(new FileInputStream("Excel.xlsx"));
//Workbook workbook = WorkbookFactory.create(new FileInputStream("Excel.xls"));
DataFormatter dataFormatter = new DataFormatter(java.util.Locale.US);
FormulaEvaluator formulaEvaluator = workbook.getCreationHelper().createFormulaEvaluator();
String cellValue = "";
for (Sheet sheet: workbook) {
System.out.println(sheet.getSheetName());
for (Row row : sheet) {
for (Cell cell : row) {
cellValue = dataFormatter.formatCellValue(cell, formulaEvaluator);
System.out.println(cell.getAddress() + ":" + cellValue);
// do something with cellValue
}
}
}
workbook.close();
}
}
