Java 8 and Apache POI 4.1.x here.
The expected "cell formatting behavior" of Excel, as I've gathered from being an Excel user for most of my life, is as follows:
There appears to be an underlying "internal value" of a cell, and then there is the "visualization" of that value
The internal value is a number, value, etc. that is the actual/real value of that cell
The visualization is an optional way of representing the internal value to the end user
Example: The internal value of a cell might be 0.678261, but the cell might be formatted to handle decimals as percentages with hundredths-place precision, and so the end user might see that cell represented as 67.83%. But if they were to use it in a formula, or modify its value, the underlying value of 0.678261 is what would be used/modified
I'm trying to figure out how to do the same with POI.
Meaning, I would like to use POI's API to write an internal value to a cell, but then configure that cell to (visually) represent the value with a different formatting applied.
My two use cases are:
Representing a numeric/decimal as a valid price (e.g. visually representing 203.9483495949 as $203.95 to the end users, or 0.8375733 as $0.84); and
Representing a numeric/decimal as a valid percentage (e.g. visually representing 0.009383484 as 0.94% to the end users, or 0.53282 as 53.28%)
Currently I'm writing these values as follows:
BigDecimal validPrice = BigDecimal.valueOf(203.9483495949);
BigDecimal validPct = BigDecimal.valueOf(0.53282);
Row nextRow = sheet.createRow(rowNum);
nextRow.createCell(0).setCellValue(validPrice.doubleValue());
nextRow.createCell(1).setCellValue(validPct.doubleValue());
But when I write the data to an Excel file, I just see those same raw/internal values visualized (203.9483495949 and 0.53282 respectively) in the columns, and I have to manually set formatting on them after I open the files up. Rather than forcing end users to apply this formatting manually, I'd like POI to apply the formatting in the code, so that when the files are opened, the values are already formatted as $203.95 and 53.28%.
Any idea as to how to do this?
As shown in Quick-guide - DataFormats, you need to use cell styles with a special number format to format numbers in Excel cells.
Those CellStyles are on workbook level and can be created as needed as shown in above linked examples. But you should not create exactly the same CellStyle multiple times as there are limits for unique cell formats/cell styles in Excel.
The more flexible way is using CellUtil. There:
The various methods that deal with style's allow you to create your
CellStyles as you need them. When you apply a style change to a cell,
the code will attempt to see if a style already exists that meets your
needs. If not, then it will create a new style. This is to prevent
creating too many styles.
Using this we can create a structure which holds data objects per row and cell and a structure which holds data formats per column.
Example:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.CellUtil;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

class CreateExcelFormattedValues {

    public static void main(String[] args) throws Exception {

        Workbook workbook = new XSSFWorkbook();
        DataFormat format = workbook.createDataFormat();

        // structure which holds data objects per row and cell
        Object[][] data = new Object[][]{
            new Object[]{"Price", "Percent"},
            new Object[]{203.9483495949, 0.53282},
            new Object[]{0.8375733, 0.009383484}
        };

        // structure which holds data formats per column
        short currencyDataFormat = format.getFormat("$#,##0.00");
        short percentDataFormat = format.getFormat("0.00%");
        short[] dataFormats = {currencyDataFormat, percentDataFormat};

        Sheet sheet = workbook.createSheet();
        Row row;
        int firstRow = 1; // first row is second row
        int r = 0; // loop variable for row
        Cell cell;
        int firstCol = 1; // first column is column B
        int c = 0; // loop variable for column
        for (Object[] dataRow : data) {
            row = sheet.createRow(firstRow + r);
            c = 0;
            for (Object dataValue : dataRow) {
                cell = row.createCell(firstCol + c);
                CellUtil.setCellStyleProperty(cell, CellUtil.DATA_FORMAT, dataFormats[c]);
                if (dataValue instanceof String) {
                    cell.setCellValue((String) dataValue);
                } else if (dataValue instanceof Double) {
                    cell.setCellValue((Double) dataValue);
                }
                c++;
            }
            r++;
        }

        FileOutputStream out = new FileOutputStream("Excel.xlsx");
        workbook.write(out);
        out.close();
        workbook.close();
    }
}
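The two format strings used above, "$#,##0.00" and "0.00%", also happen to be valid java.text.DecimalFormat patterns, so their effect on the sample values can be previewed without opening Excel. This is only a sketch under that assumption: FormatPreview and preview are made-up names, Excel's format language is not the same as DecimalFormat's in general, and the two may round exact ties differently.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class FormatPreview {

    // Hypothetical helper: formats a raw value with a DecimalFormat pattern.
    // US symbols are fixed so the decimal separator does not depend on the default locale.
    static String preview(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(preview("$#,##0.00", 203.9483495949)); // $203.95
        System.out.println(preview("$#,##0.00", 0.8375733));      // $0.84
        System.out.println(preview("0.00%", 0.53282));            // 53.28%
        System.out.println(preview("0.00%", 0.009383484));        // 0.94%
    }
}
```

The raw double passed in is untouched; only the returned text is rounded, which mirrors the stored value versus displayed value distinction from the question.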
I am using Aspose to read a CSV file.
I do not beforehand know the number of cells for each row of the file, but I will need to know it for further processing.
Unfortunately, I see no way to find out the number of cells in a CSV row.
Imagine the following row in the CSV file. It contains 7 cells, 4 of which are empty:
1,2,,4,,,
Using
row.iterator();
Aspose will only return 3 cells, as it ignores all empty cells.
As an alternative, I now do the following:
Cell lastCell = row.getLastCell();
Cell cell;
int count = 0;
do {
    cell = row.getCellOrNull(count);
    String cellValue = cell == null ? "" : cell.getStringValueWithoutFormat();
    // do something with the cell value...
    count++;
} while (cell == null || !lastCell.equals(cell));
This works better, as it returns the first 4 cells.
However, it still ignores the last 3 cells.
Is there any way to get information about the missing cells?
(It would be sufficient for me if Aspose could return the original Row as a String - I could then count the number of commas and find out the number of cells this way)
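The comma-counting idea can be sketched with plain JDK calls, independent of Aspose. It assumes the raw line is already available as a String and that no field contains a quoted comma; CsvCellCount and countCells are hypothetical names used only for illustration.

```java
final class CsvCellCount {

    // Counts the cells of one CSV line, including empty ones.
    // The limit of -1 tells split() to keep trailing empty strings,
    // which the one-argument split(",") would silently drop.
    static int countCells(String line) {
        return line.split(",", -1).length;
    }

    public static void main(String[] args) {
        System.out.println(countCells("1,2,,4,,,"));       // 7
        System.out.println("1,2,,4,,,".split(",").length); // 4, trailing empties dropped
    }
}
```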
You may use Worksheet.getCells().getMaxDisplayRange() method to get the maximum display range.
Please consider this CSV. If you open it in MS-Excel and check the last cell, you will find it is Q2
Book1.csv
2,,,,1,,,,,,,,,,,,,
,,3,,,,
Aspose.Cells returns the same via the following code.
TxtLoadOptions opts = new TxtLoadOptions(LoadFormat.CSV);
Workbook wb = new Workbook("Book1.csv", opts);
Worksheet ws = wb.getWorksheets().get(0);
Range rng = ws.getCells().getMaxDisplayRange();
System.out.println(rng);
Here is the console output of the code.
Console Output
Aspose.Cells.Range [ Sheet1!A1:Q2 ]
Note: I am working as Developer Evangelist at Aspose
I have been assigned to update an existing project. The previous developer used the jcom API to export data to Excel sheets, but the jcom API doesn't work on 64-bit systems, so I decided to change the code to use the Apache POI API. I have managed to port many of the methods. My problem is array formulas: I need to implement them using Apache POI. The formulas are posted below; any help would be much appreciated. Thanks in advance, guys.
Formulas:
String formula1 = "SUM(R[-2]C/1.200)";//net income calculation from gross income
String formula2 = "SUM(R[-1]C-R[1]C)";
String formula3 = "SUM(R[-" + Integer.toString(listTypeTotals.size()+1) + "]C:R[-2]C)";
I tried setting the cell to formula type and passing the formula as a string.
setCellFormulaStyle(sheet, 4, i+2, formula2);

public static void setCellFormulaStyle(HSSFSheet sheet, int row, int column, String value)
{
    HSSFRow temprow = null;
    temprow = getRow_CreateRow(sheet, row);
    temprow.createCell(column).setCellFormula(value);
}

// returns the existing row, or creates it if it does not exist yet
public static HSSFRow getRow_CreateRow(HSSFSheet sheet, int row)
{
    HSSFRow excelrow = null;
    excelrow = sheet.getRow(row);
    if (excelrow == null)
    {
        excelrow = sheet.createRow(row);
        return excelrow;
    } else
        return excelrow;
}
I am getting the following exception:
org.apache.poi.ss.formula.FormulaParseException: Specified named range 'R' does not exist in the current workbook.
at org.apache.poi.ss.formula.FormulaParser.parseNonRange(FormulaParser.java:569)
at org.apache.poi.ss.formula.FormulaParser.parseRangeable(FormulaParser.java:517)
at org.apache.poi.ss.formula.FormulaParser.parseRangeExpression(FormulaParser.java:268)
at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1119)
at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1079)
at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1066)
at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1426)
at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1526)
at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1510)
at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1467)
at org.apache.poi.ss.formula.FormulaParser.Arguments(FormulaParser.java:1051)
at org.apache.poi.ss.formula.FormulaParser.function(FormulaParser.java:936)
at org.apache.poi.ss.formula.FormulaParser.parseNonRange(FormulaParser.java:558)
at org.apache.poi.ss.formula.FormulaParser.parseRangeable(FormulaParser.java:429)
at org.apache.poi.ss.formula.FormulaParser.parseRangeExpression(FormulaParser.java:268)
at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1119)
at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1079)
at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1066)
at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1426)
at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1526)
at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1510)
at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1467)
at org.apache.poi.ss.formula.FormulaParser.unionExpression(FormulaParser.java:1447)
at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:1568)
at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:176)
at org.apache.poi.hssf.model.HSSFFormulaParser.parse(HSSFFormulaParser.java:72)
at org.apache.poi.hssf.usermodel.HSSFCell.setCellFormula(HSSFCell.java:594)
I am not sure how you have used the formula yet. However, if your problem is about building cell references dynamically from rows and columns, perhaps I can help.
For any cell, say "B5", at runtime,
cell.getReference();
will give you the cell reference (in this example it returns "B5").
cell.getReference().toString().charAt(0);
will give you the column reference ("B" if the current cell is B5). Now
cell.getRowIndex();
OR
cell.getReference().toString().charAt(1);
will give you the row: getRowIndex() returns the 0-based row index, while charAt(1) returns the row digit from the reference and only works for single-letter columns and single-digit rows. I have used that multiple times to update/create the named ranges and formulas in my workbook.
A small change to handle addresses like AZ89, using a small hack:
String str = cell.getReference();
int index;
for (index = 0; index < str.length(); index++) {
    if (str.charAt(index) < 'A') { // the first digit marks the end of the column letters
        break;
    }
}
String col = str.substring(0, index); // e.g. "AZ"
String row = str.substring(index);    // e.g. "89"
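Coming back to the exception in the question: the message suggests the parser read "R" as a named range, which is consistent with POI's formula parser expecting A1-style references rather than R1C1-style ones. Assuming that is the cause, one option is to translate the relative references before calling setCellFormula. The sketch below uses only the JDK; R1C1ToA1, columnLetters and toA1 are hypothetical names, only relative forms such as R[-2]C are handled, and there is no check for references that would fall outside the sheet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class R1C1ToA1 {

    // Relative references such as R[-2]C, RC[1] or R[-1]C[3], not embedded in a longer name.
    // Absolute forms like R5C2 are deliberately not handled in this sketch.
    private static final Pattern RELATIVE = Pattern.compile(
            "(?<![A-Za-z0-9_])R(?:\\[(-?\\d+)\\])?C(?:\\[(-?\\d+)\\])?(?![A-Za-z0-9_])");

    // Converts a 0-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA.
    static String columnLetters(int col) {
        StringBuilder sb = new StringBuilder();
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    // Rewrites the relative references in a formula for the cell at the given 0-based row and column.
    static String toA1(String formula, int row, int col) {
        Matcher m = RELATIVE.matcher(formula);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int r = row + (m.group(1) == null ? 0 : Integer.parseInt(m.group(1)));
            int c = col + (m.group(2) == null ? 0 : Integer.parseInt(m.group(2)));
            m.appendReplacement(out, columnLetters(c) + (r + 1)); // A1 rows are 1-based
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Target cell: row index 4, column index 2, i.e. C5
        System.out.println(toA1("SUM(R[-2]C/1.200)", 4, 2)); // SUM(C3/1.200)
        System.out.println(toA1("SUM(R[-1]C-R[1]C)", 4, 2)); // SUM(C4-C6)
    }
}
```

The converted string could then be passed to the question's setCellFormulaStyle helper in place of the R1C1 form.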
If I've got a list of parameters 'x,y,z' that aren't sorted, is there a straightforward way to write them to particular cells in an Excel document created with POI, as though the first two parameters are X and Y coordinates?
For example, I have rows like:
10,4,100
Is it possible to write the value '100' in the cell at the 10th row, 4th column?
Looking at the documentation, it looks straightforward to iterate values into the next row, but I can't see any way of creating a fixed number of rows and columns and writing particular values to only certain cells.
Any advice or suggestions would be appreciated, thanks!
Sure, it's very easy, just remember that POI is 0 based not 1 based in addressing. Assuming you want to write to the 10th row, 4th column, you'd do something like
Row r = sheet.getRow(9); // 10-1
if (r == null) {
    // First cell in the row, create
    r = sheet.createRow(9);
}
Cell c = r.getCell(3); // 4-1
if (c == null) {
    // New cell
    c = r.createCell(3, Cell.CELL_TYPE_NUMERIC);
}
c.setCellValue(100);
I have excel file with such contents:
A1: SomeString
A2: 2
All fields are set to String format.
When I read the file in java using POI, it tells that A2 is in numeric cell format.
The problem is that the value in A2 can be 2 or 2.0 (and I want to be able to distinguish them) so I can't just use .toString().
What can I do to read the value as string?
I had same problem. I did cell.setCellType(Cell.CELL_TYPE_STRING); before reading the string value, which solved the problem regardless of how the user formatted the cell.
I don't think we had this class back when you asked the question, but today there is an easy answer.
What you want to do is use the DataFormatter class. You pass this a cell, and it does its best to return you a string containing what Excel would show you for that cell. If you pass it a string cell, you'll get the string back. If you pass it a numeric cell with formatting rules applied, it will format the number based on them and give you the string back.
For your case, I'd assume that the numeric cells have an integer formatting rule applied to them. If you ask DataFormatter to format those cells, it'll give you back a string with the integer string in it.
Also, note that lots of people suggest doing cell.setCellType(Cell.CELL_TYPE_STRING), but the Apache POI JavaDocs quite clearly state that you shouldn't do this! Calling setCellType will lose the formatting; as the JavaDocs explain, the only way to convert to a String with the formatting preserved is to use the DataFormatter class.
A simple example of using this class:
DataFormatter dataFormatter = new DataFormatter();
String formattedCellStr = dataFormatter.formatCellValue(cell);
The code below worked for me for any type of cell.
InputStream inp = getClass().getResourceAsStream("filename.xls");
Workbook wb = WorkbookFactory.create(inp);
DataFormatter objDefaultFormat = new DataFormatter();
FormulaEvaluator objFormulaEvaluator = new HSSFFormulaEvaluator((HSSFWorkbook) wb);
Sheet sheet = wb.getSheetAt(0);
Iterator<Row> objIterator = sheet.rowIterator();
while (objIterator.hasNext()) {
    Row row = objIterator.next();
    Cell cellValue = row.getCell(0);
    objFormulaEvaluator.evaluate(cellValue); // evaluate the cell so that any cell type can be returned as a string
    String cellValueStr = objDefaultFormat.formatCellValue(cellValue, objFormulaEvaluator);
}
I would recommend the following approach when modifying the cell's type is undesirable:
if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
    String str = NumberToTextConverter.toText(cell.getNumericCellValue());
}
NumberToTextConverter can correctly convert double value to a text using Excel's rules without precision loss.
As already mentioned in the Poi's JavaDocs (https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/Cell.html#setCellType%28int%29) don't use:
cell.setCellType(Cell.CELL_TYPE_STRING);
but use:
DataFormatter df = new DataFormatter();
String value = df.formatCellValue(cell);
More examples on http://massapi.com/class/da/DataFormatter.html
Yes, this works perfectly
recommended:
DataFormatter dataFormatter = new DataFormatter();
String value = dataFormatter.formatCellValue(cell);
old:
cell.setCellType(Cell.CELL_TYPE_STRING);
even if you have a problem with retrieving a value from cell having formula, still this works.
Try:
new java.text.DecimalFormat("0").format( cell.getNumericCellValue() )
Should format the number correctly.
You can read numeric cells as a String in Java:
String value;
int type = cell.getCellType();
if (type == 0) { // 0 is the numeric cell type
    value = NumberToTextConverter.toText(cell.getNumericCellValue());
} else {
    value = String.valueOf(cell.getStringCellValue());
}
Here,
0 => numeric cell
getCellType() => returns the type of the Excel cell.
As long as the cell is in text format before the user types in the number, POI will allow you to obtain the value as a string. One key is that if there is a small green triangle in the upper left-hand corner of cell that is formatted as Text, you will be able to retrieve its value as a string (the green triangle appears whenever something that appears to be a number is coerced into a text format). If you have Text formatted cells that contain numbers, but POI will not let you fetch those values as strings, there are a few things you can do to the Spreadsheet data to allow that:
Double click on the cell so that the editing cursor is present inside the cell, then click on Enter (which can be done only one cell at a time).
Use the Excel 2007 text conversion function (which can be done on multiple cells at once).
Cut out the offending values to another location, reformat the spreadsheet cells as text, then repaste the previously cut out values as Unformatted Values back into the proper area.
One final thing you can do, if you are using POI to obtain data from an Excel 2007 spreadsheet, is to use the XSSFCell class's 'getRawValue()' method. This does not care what the format is. It will simply return a string with the raw data.
When we read the MS Excel's numeric cell value using Apache POI library, it read it as numeric. But sometime we want it to read as string (e.g. phone numbers, etc.). This is how I did it:
1. Insert a new column with first cell =CONCATENATE("!",D2). I assume D2 is the cell id of your phone-number column. Drag the new cell down to the end.
2. Now if you read the cell using POI, it will read the formula instead of the calculated value. Now do the following:
3. Add another column
4. Select the complete column created in step 1 and choose Edit->Copy
5. Go to the top cell of the column created in step 3 and select Edit->Paste Special
6. In the opened window, select the "Values" radio button
7. Select "OK"
8. Now read using the POI API ... after reading in Java ... just remove the first character, i.e. "!"
I also have had a similar issue on a data set of thousands of numbers and I think that I have found a simple way to solve. I needed to get the apostrophe inserted before a number so that a separate DB import always sees the numbers as text. Before this the number 8 would be imported as 8.0.
Solution:
Keep all the formatting as General.
Here I am assuming numbers are stored in Column A starting at Row 1.
Put in the ' in Column B and copy down as many rows as needed. Nothing appears in the worksheet, but clicking on the cell you can see the apostrophe in the Formula bar.
In Column C: =B1&A1.
Select all the Cells in Column C and do a Paste Special into Column D using the Values option.
Hey Presto all the numbers but stored as Text.
getStringCellValue throws an IllegalStateException if the cell type is numeric. If you don't want to change the cell type to string, you can do this.
String rsdata = "";
try {
    rsdata = cell.getStringCellValue();
} catch (IllegalStateException ex) {
    // not a string cell: fall back to the numeric value
    rsdata = cell.getNumericCellValue() + "";
}
Many of these answers reference old POI documentation and classes. In the newest POI, 3.16, the int cell type constants on Cell have been deprecated:
Cell.CELL_TYPE_STRING
Instead the CellType enum can be used.
CellType.STRING
Just be sure to update your pom with the poi dependency as well as the poi-ooxml dependency to the new 3.16 version otherwise you will continue to get exceptions. One advantage with this version is that you can specify the cell type at the time the cell is created eliminating all the extra steps described in previous answers:
titleRowCell = currentReportRow.createCell(currentReportColumnIndex, CellType.STRING);
This worked perfect for me.
Double legacyRow = row.getCell(col).getNumericCellValue();
String legacyRowStr = legacyRow.toString();
if (legacyRowStr.endsWith(".0")) { // strip only a trailing ".0", so values like 10.05 stay intact
    legacyRowStr = legacyRowStr.substring(0, legacyRowStr.length() - 2);
}
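Trimming a trailing ".0" from a double's text form, as the snippet above does, can also be sketched with BigDecimal instead of string surgery. This is a plain-JDK alternative and not a POI feature; PlainNumberText and toPlainText are hypothetical names, and the method assumes a finite value (BigDecimal.valueOf rejects NaN and infinities).

```java
import java.math.BigDecimal;

final class PlainNumberText {

    // Renders a double without a trailing ".0" and without scientific notation.
    static String toPlainText(double value) {
        return BigDecimal.valueOf(value)   // built from Double.toString(value)
                .stripTrailingZeros()      // 2.0 -> 2, 100.0 -> 1E+2
                .toPlainString();          // 1E+2 -> "100"
    }

    public static void main(String[] args) {
        System.out.println(toPlainText(2.0));   // 2
        System.out.println(toPlainText(10.05)); // 10.05
        System.out.println(toPlainText(100.0)); // 100
    }
}
```

As with the original snippet, this cannot tell whether the user typed 2 or 2.0, since both are stored as the same double.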
I would much rather go the route of wil's or Vinayak Dornala's answer; unfortunately, they affected my performance far too much.
I went for a HACKY solution of implicit casting:
for (Row row : sheet){
String strValue = (row.getCell(numericColumn)+""); // hack
...
I don't suggest you do this, for my situation it worked because of the nature of how the system worked and I had a reliable file source.
Footnote:
numericColumn
Is an int which is generated from reading the header of the file processed.
public class Excellib {
    public String getExceldata(String sheetname, int rownum, int cellnum, boolean isString) {
        String retVal = null;
        try {
            FileInputStream fis = new FileInputStream("E:\\Sample-Automation-Workspace\\SampleTestDataDriven\\Registration.xlsx");
            Workbook wb = WorkbookFactory.create(fis);
            Sheet s = wb.getSheet(sheetname);
            Row r = s.getRow(rownum);
            Cell c = r.getCell(cellnum);
            if (c.getCellType() == Cell.CELL_TYPE_STRING)
                retVal = c.getStringCellValue();
            else {
                retVal = String.valueOf(c.getNumericCellValue());
            }
        } catch (Exception e) {
            // the posted snippet was cut off before this point; minimal handling added so it compiles
            e.printStackTrace();
        }
        return retVal;
    }
}
I Tried This and It worked For me
There is a ready-to-use wrapper
(some additional optimizations can be applied)
it supports numeric and String cells
formulas are recognized and handled automatically
avoid some boilerplate
public final class Cell {
private final static DataFormatter FORMATTER = new DataFormatter();
private XSSFCell mCell;
public Cell(@NotNull XSSFCell cell) {
mCell = cell;
if (isFormula()) {
XSSFWorkbook book = mCell.getSheet().getWorkbook();
FormulaEvaluator evaluator = book.getCreationHelper().createFormulaEvaluator();
mCell = (XSSFCell) evaluator.evaluateInCell(mCell);
}
}
/**
* Get content
*/
public final int getInt() {
return (int) getLong();
}
public final long getLong() {
return Math.round(getDouble());
}
public final double getDouble() {
return mCell.getNumericCellValue();
}
public final String getString() {
if (!isString()) {
return FORMATTER.formatCellValue(mCell);
}
return mCell.getStringCellValue();
}
/**
* Get properties
*/
public final boolean isNumber() {
if (isFormula()) {
return mCell.getCachedFormulaResultType().equals(CellType.NUMERIC);
}
return mCell.getCellType().equals(CellType.NUMERIC);
}
public final boolean isString() {
if (isFormula()) {
return mCell.getCachedFormulaResultType().equals(CellType.STRING);
}
return mCell.getCellType().equals(CellType.STRING);
}
public final boolean isFormula() {
return mCell.getCellType().equals(CellType.FORMULA);
}
/**
* Debug info
*/
@Override
public String toString() {
return getString();
}
}
I encountered the same issue, and the easiest fix would be setting the cell type to STRING. This avoids the exceptions.
FileInputStream fis = new FileInputStream(new File(filePath));
XSSFWorkbook wb = new XSSFWorkbook(fis);
XSSFSheet sheet = wb.getSheetAt(0); // get first sheet
XSSFRow row = sheet.getRow(0); // assumption: the original snippet never defined row; the first row is used here
row.getCell(1).setCellType(CellType.STRING); // set Cell Type as String
String val = row.getCell(1).getStringCellValue(); // get the value as String type
System.out.println(val); // prints the value;
Another option would be to force excel to evaluate the integer value as a string. To achieve that you will have to prefix a single quote before the number.
Here is an example appending a single quote to number 1:
Do you control the excel worksheet in anyway? Is there a template the users have for giving you the input? If so, you can have code format the input cells for you.
It looks like this can't be done in the current version of POI, based on the fact that this bug:
https://issues.apache.org/bugzilla/show_bug.cgi?id=46136
is still outstanding.
We had the same problem and forced our users to format the cells as 'text' before entering the value. That way Excel correctly stores even numbers as text.
If the format is changed afterwards Excel only changes the way the value is displayed but does not change the way the value is stored unless the value is entered again (e.g. by pressing return when in the cell).
Whether or not Excel correctly stored the value as text is indicated by the little green triangle that Excel displays in the upper left corner of the cell if it thinks the cell contains a number but is formatted as text.
cell.setCellType(Cell.CELL_TYPE_STRING); is working fine for me
cast to an int then do a .toString(). It is ugly but it works.