apache poi get default column type - java

I'm using POI to create a new row of cells in an existing spreadsheet. POI allows you to get the column default style, but there's no equivalent (as near as I can tell) for getting a default type. I'm getting a String from my user interface and I don't know how to set the cell type. If the string is a double, then fine, it's NUMERIC. But if the String specifies a date, how would I best detect it so that it is also set to NUMERIC? There are so many formatting types for a date that it is impractical to detect the type from the cell style format. Does POI support a way to parse based on a format?

To set the cell type you use:
setCellType()
as outlined in the docs:
https://poi.apache.org/apidocs/
setCellType
void setCellType(int cellType)
Set the cells type (numeric, formula or string). If the cell currently contains a value, the value will be converted to match the new type, if possible. Formatting is generally lost in the process however.
If what you want to do is get a String value for your numeric cell, stop! This is not the way to do it. Instead, for fetching the string value of a numeric or boolean or date cell, use DataFormatter instead.
Throws:
java.lang.IllegalArgumentException - if the specified cell type is invalid
java.lang.IllegalStateException - if the current value cannot be converted to the new type
See Also: CELL_TYPE_NUMERIC, CELL_TYPE_STRING, CELL_TYPE_FORMULA, CELL_TYPE_BLANK, CELL_TYPE_BOOLEAN, CELL_TYPE_ERROR
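For the question as asked (deciding whether a user-entered string should end up as a number or a date), one option is to classify the string before touching the cell at all. What follows is only a minimal sketch using the JDK, not a POI feature: the class name InputClassifier, the Kind enum and the short list of accepted date patterns are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

// Hypothetical helper (not part of POI): decides how a user-entered string should be stored.
class InputClassifier {
    enum Kind { NUMBER, DATE, TEXT }

    // Assumed list of accepted date patterns; extend it to match what your UI allows.
    private static final List<DateTimeFormatter> DATE_PATTERNS = List.of(
            DateTimeFormatter.ISO_LOCAL_DATE,          // e.g. 2024-03-15
            DateTimeFormatter.ofPattern("M/d/yyyy"));  // e.g. 3/15/2024

    static Kind classify(String input) {
        String s = input.trim();
        try {
            Double.parseDouble(s);
            return Kind.NUMBER;
        } catch (NumberFormatException notANumber) {
            // fall through and try the date patterns
        }
        return parseDate(s) != null ? Kind.DATE : Kind.TEXT;
    }

    // Returns the parsed date, or null if no pattern matches.
    static LocalDate parseDate(String s) {
        for (DateTimeFormatter pattern : DATE_PATTERNS) {
            try {
                return LocalDate.parse(s, pattern);
            } catch (DateTimeParseException noMatch) {
                // try the next pattern
            }
        }
        return null;
    }
}
```

A NUMBER result can be written with setCellValue(double). A DATE result can be written as a date value on a cell whose style carries a date format; Excel stores dates as numbers, so such a cell is still NUMERIC.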

Related

Java Apache POI Excel Numeric Cell value with Locale pt-BR

I want an XSSFCell in Excel to have CellType numeric, and I need it to display the value in the pt-BR Locale (with . as the separator instead of , ; not decimal/fractional, but still a double). I managed to display the correct cell value, but the CellType ends up as String. I know that a double uses the US convention (0,000.00) and that a simple NumberFormat with a Locale does the trick, but then the CellType becomes text, not number.
The number already comes configured with Locale, which comes straight from Database. (Eg: 412.000), but double converts it to US (412,000).
So, how can I format the cell to my Locale and still set the CellType as Numeric ?
This is what I got so far:
XSSFCell cell = row.createCell(anyCellNumber);
double value = 412.000; //valueThatComesFromDataBaseWithLocaleFormatted
cell.setCellType(CellType.NUMERIC);
//At this point, double has already converted the cell value to 412,000 - So, What I did was:
cell.setCellValue(String.format("%.3f", value).replace(",", ".")); // Replaced the , to . to match Locale
So this is the point: the CellType is set, but NumberFormat, DecimalFormat, Locale, String.format, etc. turn the cell value into a String, and thus the CellType is text.
I can't use Double.parseDouble(valueFormatted.toString()) because it throws an exception if the String is Locale-formatted.
So how (if it is possible) can I have CellType.NUMERIC AND the cell value formatted for my Locale (pt-BR)?
Thanks in advance!
As far as I understand, you specifically want your Cell to be of type numeric and your display to be something "not Date"? The only approach that I could think of would be to define your own cellStyle. This is possible in Excel with custom functions. In apache-poi it could be possible by tinkering with cell styles:
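...
Cell cell = row.createCell(cellIndex);
CellStyle cellStyle = workbook.createCellStyle();
CreationHelper createHelper = workbook.getCreationHelper();
// createDataFormat() takes no arguments; getFormat(String) returns the index of the format
cellStyle.setDataFormat(createHelper.createDataFormat().getFormat("some specific custom data format"));
cell.setCellStyle(cellStyle);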
I am however not sure that this works: these cell styles seem to be limited to the default Excel styles, see https://www.roseindia.net/java/poi/setdataformat.shtml
The simplest approach would be to make your cell of type "Date". Even if your result from the database is a number, it can be converted to a Date, since a Date is only a layer upon a timestamp. I would highly suggest looking at this approach instead of "hacking" your way around number formats that look similar to dates.
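Another way to keep the cell numeric is to turn the locale-formatted text back into a double before it reaches the cell, pass that double to setCellValue, and leave the display to a number format on the cell style. The parsing step can be sketched with the JDK alone. The class name PtBrNumbers is hypothetical, and the sketch assumes the database value arrives as text in pt-BR conventions ("." for grouping, "," for decimals).

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper: converts pt-BR formatted text into a plain double.
class PtBrNumbers {
    private static final Locale PT_BR = Locale.forLanguageTag("pt-BR");

    static double parse(String text) {
        try {
            // In pt-BR, '.' is the grouping separator and ',' the decimal separator.
            return NumberFormat.getInstance(PT_BR).parse(text.trim()).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a pt-BR number: " + text, e);
        }
    }
}
```

With this, PtBrNumbers.parse("412.000") yields 412000.0, which can be handed to setCellValue(double) so the cell stays NUMERIC. How the separators are shown is then a matter of the cell's number format and the regional settings of whoever opens the file.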

Why do we need converters for TextFormatters

My professor said that it is 'ideal' to use a filter and a converter for a TextFormatter. I looked at his examples and tried them, but couldn't understand why we need a converter at all.
From the docs:
A Formatter describes a format of a TextInputControl text by using two distinct mechanisms:
A filter (getFilter()) that can intercept and modify user input. This helps to keep the text in the desired format. A default text supplier can be used to provide the initial text.
A value converter (getValueConverter()) and value (valueProperty()) can be used to provide special format that represents a value of type V. If the control is editable and the text is changed by the user, the value is then updated to correspond to the text.
I am clearly missing something here. I get why you would want to convert a string to an integer (for calculations etc.). But why does it have to be part of the TextFormatter? Can't we just use getText() and then cast the text to whatever type we want?
One more thing: If we have a filter that doesn't allow non-numeric characters, then why do we need to take care of the conversion of the text to integer/double etc. with a converter?
Maybe I am just missing something very obvious.
You can't cast a String to an Integer (or any other type, except an Object): you have to convert it. Even if the text formatter has a filter that only allows numeric entry, the text field's getText() method still returns a string, which is usually not very convenient (as the entry in the text field likely represents a numeric value in some object).
You might need to get the integer (for example) value represented by the text field in many different places, so you centralize the conversion code in one place by including the converter as part of the formatter.
Additionally, the formatter's value is an observable property, so you can easily bind other properties to it, etc. This would be tricky if you needed to perform the conversion in a binding on the text field's text property.
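The difference between casting and converting can be shown without any JavaFX at all. The following is a small sketch with a hypothetical class name, CastVersusConvert. In a real TextFormatter, the conversion would live in the converter handed to the formatter rather than in a helper like this.

```java
// Hypothetical illustration of "cast" versus "convert" for text that holds digits.
class CastVersusConvert {
    // Conversion: builds an int from the characters of the text.
    static int convert(String text) {
        return Integer.parseInt(text.trim());
    }

    // A cast only reinterprets the reference; a String object is never an Integer.
    static boolean castSucceeds(Object fieldContent) {
        try {
            Integer ignored = (Integer) fieldContent;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }
}
```

Here convert("42") gives the int 42, while castSucceeds("42") is false because the cast throws a ClassCastException. That is why the text has to go through a conversion step somewhere, and the formatter's converter is one central place for it.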

how to parse a custom column in excel file using POI

I need to parse an Excel file which has a date time (mm/dd/yyyy hh:mm:ss) stored in a custom format. When I check the cell type it indicates a numeric value, but I am not able to retrieve the value. It throws the exception below.
java.lang.IllegalStateException: Cannot get a numeric value from a text cell
at org.apache.poi.hssf.usermodel.HSSFCell.typeMismatch(HSSFCell.java:648)
at org.apache.poi.hssf.usermodel.HSSFCell.getNumericCellValue(HSSFCell.java:673)
When I try to read the value as a date using 'cell.getDateCellValue()', I get the same exception.
If I try to retrieve the value as a string, then the exception is just reversed.
Any idea how to make this work ?
I was having similar issues with apache poi.
My suggestion would be to:
1) force the cell to be text: cell.setCellType(Cell.CELL_TYPE_STRING)
2) read in the string value: cell.getStringCellValue()
3) do something on your end to parse the string into meaningful data
Hope this helps. Apache POI has its limitations, so sometimes you have to hack it like that.
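Step 3 can be sketched with java.time, assuming the string read from the cell really has the mm/dd/yyyy hh:mm:ss layout from the question and uses a 24-hour clock. The class name CellTextParser is made up for this example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for step 3: turns the cell text into a date-time value.
class CellTextParser {
    // Layout taken from the question; a 24-hour clock (HH) is an assumption.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

    // Throws DateTimeParseException if the text does not match the layout.
    static LocalDateTime parse(String cellText) {
        return LocalDateTime.parse(cellText.trim(), FORMAT);
    }
}
```

For example, CellTextParser.parse("03/15/2024 14:30:05") gives the LocalDateTime for 15 March 2024, 14:30:05.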

What is the difference between getRichStringCellValue() and getStringCellValue() methods of POI HSSFCell class?

I am trying to read data stored in an Excel sheet using Java POI. I am confused by these two methods because both return the string value stored in the cell. Could anyone explain the difference between them?
The important clue is to look at the documentation and note the different return types.
getRichStringCellValue() returns a rich text object (HSSFRichTextString for an HSSFCell, XSSFRichTextString for an XSSFCell), while getStringCellValue() returns a plain old Java String.
You probably only want to use getStringCellValue(), unless you're doing something like copying a spreadsheet and wish to retain any formatting. If that's the case, the rich text object returned by getRichStringCellValue() will contain any format information like bold or italic.
From Apache's Documentation:
getRichStringCellValue():
get the value of the cell as a string - for numeric cells we throw an
exception. For blank cells we return an empty string. For formulaCells
that are not string Formulas, we throw an exception.
getStringCellValue():
get the value of the cell as a string - for numeric cells we throw an
exception.

Reading string value from Excel with HSSF but it's double

I'm using HSSF-POI for reading excel data. The problem is I have values in a cell that look like a number but really are strings. If I look at the format cell in Excel, it says the type is "text". Still the HSSF Cell thinks it's numeric. How can I get the value as a string?
If I try to use cell.getRichStringValue, I get exception; if cell.toString, it's not the exact same value as in Excel sheet.
Edit: until this gets resolved, I'll use
new BigDecimal(cell.getNumericCellValue()).toString()
The class you're looking for in POI is DataFormatter
When Excel writes the file, some cells are stored as literal Strings, while others are stored as numbers. For the latter, a floating point value representing the cell is stored in the file, so when you ask POI for the value of the cell that's what it actually has.
Sometimes though, especially when doing Text Extraction (but not always), you want to make the cell value look like it does in Excel. It isn't always possible to get that exactly in a String (non full space padding for example), but the DataFormatter class will get you close.
If you're after a String of the cell, looking much as you had it looking in Excel, just do:
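import java.util.Locale;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.util.CellReference;

// Create a formatter, do this once
DataFormatter formatter = new DataFormatter(Locale.US);
.....
for (Cell cell : row) {
    CellReference ref = new CellReference(cell);
    // eg "The value of B12 is 12.4%"
    System.out.println("The value of " + ref.formatAsString() + " is " + formatter.formatCellValue(cell));
}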
The formatter will return String cells as-is, and for Numeric cells will apply the formatting rules on the style to the number of the cell
If the documents you are parsing are always in a specific layout, you can change the cell type to "string" on the fly and then retrieve the value. For example, if column 2 should always be string data, set its cell type to string and then read it with the string-type get methods.
cell.setCellType(Cell.CELL_TYPE_STRING);
In my testing, changing the cell type did not modify the contents of the cell, but did allow it to be retrieved with either of the following approaches:
cell.getStringCellValue();
cell.getRichStringCellValue().getString();
Without an example of a value that is not converting properly, it is difficult to know if this will behave any differently than the cell.toString() approach you described in the description.
You mean HSSF-POI says
cell.getCellType() == Cell.CELL_TYPE_NUMERIC
NOT
Cell.CELL_TYPE_STRING as it should be?
I would think it's a bug in POI, but every cell contains a Variant, and Variant has a type. It's kind of hard to make a bug there, so instead I think Excel uses some extra data or heuristic to report the field as text. Usual MS way, alas.
P.S. You cannot use any getString() on a Variant containing a numeric, as the binary representation of the Variant data depends on its type, and trying to get a string from what is actually a number would result in garbage -- hence the exception.
The code below works fine for reading any cell type, as long as the cell contains a numeric value:
new BigDecimal(cell.getNumericCellValue());
e.g.
ase.setGss(new BigDecimal(hssfRow.getCell(3).getNumericCellValue()));
where variable gss is of BigDecimal type.
Excel will convert anything that looks like a number or date or time from a string. See MS Knowledge base article, which basically suggests to enter the number with an extra character that makes it a string.
You are probably dealing with an Excel problem. When you create the spreadsheet, the default cell format is General. With this format, Excel guesses the type based on the input and this type is saved with each cell.
When you later change the cell format to Text, you are just changing the default. Excel doesn't change every cell's type automatically. I haven't found a way to do this automatically.
To confirm this, you can go to Excel and retype one of the numbers and see if it's text in HSSF.
You can also look at the real cell type by using this function,
=CELL("type", A1)
A1 is the cell for the number. It shows "l" for text, "v" for numbers.
The problem with Excel is that the default format is General. With this format Excel stores numbers entered in the cell as numeric. You have to change the format to text before entering the values. Reentering the values after changing the format will also work.
That will lead to little green triangles in the left upper corner of the cells if the content looks like a number to Excel. If this is the case the value is really stored as text.
With new BigDecimal(cell.getNumericCellValue()).toString() you will still have a lot of problems. For example if you have identifying numbers (e.g. part numbers or classification numbers) you probably have cases that have leading zeros which will be a problem with the getNumericCellValue() approach.
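Both problems can be reproduced without POI. The sketch below uses a hypothetical class, NumericCellPitfalls, and plain doubles standing in for what getNumericCellValue() would return.

```java
import java.math.BigDecimal;

// Hypothetical demo of what happens once a value has been stored as a double.
class NumericCellPitfalls {
    // new BigDecimal(double) exposes the exact binary value of the double.
    static String viaConstructor(double cellValue) {
        return new BigDecimal(cellValue).toString();
    }

    // BigDecimal.valueOf(double) goes through Double.toString, giving the short decimal form.
    static String viaValueOf(double cellValue) {
        return BigDecimal.valueOf(cellValue).toString();
    }

    // Simulates an identifier such as "00123" that was stored as a number.
    static String roundTrip(String identifier) {
        double stored = Double.parseDouble(identifier);
        return BigDecimal.valueOf(stored).toPlainString();
    }
}
```

For 0.1, viaConstructor returns a long expansion beginning 0.1000000000000000055, whereas viaValueOf returns "0.1". roundTrip("00123") no longer contains the leading zeros, which is why identifiers of this kind need to be stored and read as text.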
I try to thoroughly explain how to correctly create the Excel to the party creating the files I have to handle with POI. If the files are uploaded by end users I even have created a validation program to check for expected cell types if I know the columns in advance. As a by-product you can also check various other things of the supplied files (e.g. are the right columns provided or mandatory values).
"The problem is I have values in a cell that look like a number" => look like number when viewed in Excel?
"but really are strings" => what does that mean? How do you KNOW that they really are strings?
"If I look at the format cell" => what's "the format cell"???
'... in Excel, it says the type is "text"' => Please explain.
"Still the HSSF Cell thinks it's numeric." => do you mean that the_cell.getCellType() returns Cell.CELL_TYPE_NUMERIC?
"How can I get the value as a string?" => if it's NUMERIC, get the numeric value using the_cell.getNumericCellValue(), then format it as a string any way you want to.
"If I try to use cell.getRichStringValue, I get exception;" => so it's not a string.
"if cell.toString, it's not the exact same value as in Excel sheet." => so cell.toString() doesn't format it the way that Excel formats it.
Whatever heuristic Excel uses to determine type is irrelevant to you. It's the RESULT of that decision as stored in the file and revealed by getCellType() that matters.
