Getting null while reading Hyperlink from excel using poi - java

I have an Excel file in which each row contains some string cells and some numeric or hyperlink cells.
I want to read the data from the sheet, so I wrote the code below:
HSSFCell cell =row.getCell(j+1);
cell.setCellType(CellType.STRING);
String cellValue = cell.getStringCellValue();
The code above reads numeric and string cells, but cells that contain a hyperlink are read as null. I could wrap the hyperlink in double quotes ("abc#cd.com") in the sheet, but I want to handle this in code. Is there a way to handle this scenario?

You should use cell.getHyperlink() to get a hyperlink from the cell.
if(cell.getCellTypeEnum() == CellType.STRING){
Hyperlink hyperlink = cell.getHyperlink();
String value = cell.getRichStringCellValue().getString();
if(hyperlink == null) {
return value;
} else {
return value + " " + hyperlink.getAddress();
}
}

Related

How to get cell style of empty cell apache POI

I am using poi-ooxml#3.17 to read and write an Excel file. I have added styles/protection to some of the cells. When I read the file, I cannot get the cell styles applied to cells with no value, because accessing a row/cell with an empty value returns null.
Below is the code that writes data to the same Excel file.
public static void writeDataToSheet(final Sheet sheet, final List<Map<String, Object>> sheetData) {
List<String> columns = getColumnNames(sheet);
LOGGER.debug("Inside XLSXHelper writeDataToSheet {}", Arrays.asList(columns));
IntStream.range(0, sheetData.size()).forEach((index) -> {
if (Objects.isNull(sheet.getRow(index + 1))) {
sheet.createRow(index + 1);
}
Row row = sheet.getRow(index + 1);
Map<String, Object> data = sheetData.get(index);
IntStream.range(0, columns.size()).forEach((colIndex) -> {
String column = columns.get(colIndex);
Cell cell = row.getCell(colIndex);
if (Objects.isNull(cell)) {
cell = row.createCell(colIndex);
}
cell.setCellValue(data.get(column) != null ? data.get(column).toString() : null);
});
});
}
Could anyone suggest how I can read the styles applied to a cell when the cell is empty?
Thanks.
Cells without content or an explicitly applied style are not stored in the sheet, to avoid increasing the file size unnecessarily. So apache poi returns null for such cells.
When you look at the sheet in a spreadsheet application, it may look as if every cell in a row or column has the same style applied. That is not the case: in reality the style is applied to the row and/or the column. Only cells at the intersection of styled rows and columns must be present in the sheet, carrying the style that was applied last.
When a new cell needs to be created, the spreadsheet application looks up the preferred style for that cell. This is either the already applied cell style or, if that is not present, the row style (default cell style for this row) or, if that is not present, the column style (default cell style for this column). Unfortunately apache poi does not do so, so we need to do this ourselves:
public CellStyle getPreferredCellStyle(Cell cell) {
// a method to get the preferred cell style for a cell
// this is either the already applied cell style
// or if that not present, then the row style (default cell style for this row)
// or if that not present, then the column style (default cell style for this column)
CellStyle cellStyle = cell.getCellStyle();
if (cellStyle.getIndex() == 0) cellStyle = cell.getRow().getRowStyle();
if (cellStyle == null) cellStyle = cell.getSheet().getColumnStyle(cell.getColumnIndex());
if (cellStyle == null) cellStyle = cell.getCellStyle();
return cellStyle;
}
This method may be used in code every time a new cell needs to be created:
...
if (Objects.isNull(cell)) {
cell = row.createCell(colIndex);
cell.setCellStyle(getPreferredCellStyle(cell));
}
...

Issue with reading blank cells in large xlsx files using apache POI/monitorjbl/excel-streaming-reader

I am working on a requirement where I need to read a large xlsx file containing more than one million records. Apache POI is not memory efficient when reading large files, so I am using https://github.com/monitorjbl/excel-streaming-reader, which is a wrapper around the POI streaming API that preserves the syntax of the standard POI API. Everything works fine except reading blank cells in a row: the API throws a NullPointerException if a cell is blank.
for(int i=0; i<=expectedColumns-1; i++) {
Cell cell = row.getCell(i);
switch (cell.getCellType()) {
}
}
java.lang.NullPointerException
at test.XLSXToCSVConverterStreamer.xlsx(XLSXToCSVConverterStreamer.java:67)
at test.XLSXToCSVConverterStreamer.main(XLSXToCSVConverterStreamer.java:164)
If a cell in the row is null, a NullPointerException is thrown at the switch statement, i.e. at cell.getCellType(). I modified the code to read null cells as blank cells, but that is not supported:
for(int i=0; i<=expectedColumns-1; i++) {
//Cell cell = row.getCell(i);
Cell cell = row.getCell(i, Row.CREATE_NULL_AS_BLANK);
switch (cell.getCellType()) {
}
}
If I use Cell cell = row.getCell(i, Row.CREATE_NULL_AS_BLANK) to read empty cells as blank, I get the exception below. Kindly help me resolve this.
com.monitorjbl.xlsx.exceptions.NotSupportedException
at com.monitorjbl.xlsx.impl.StreamingRow.getCell(StreamingRow.java:108)
Many methods are not supported by the streaming reader, but it has the advantage of being able to read large Excel files. You can detect blank cells in a row as follows (using Streaming Excel Reader v1.1.0):
// true once an empty column has been found in this row
boolean colFlag = false;
int colno;
int lastcolno = row.getLastCellNum();
for (colno = 0; colno < lastcolno; colno++) {
    colFlag = isColumnEmpty(row, colno);
    // stop at the first empty column so colno still points at it
    if (colFlag)
        break;
}
if (colFlag) {
    System.out.println("In index row, column no: "
            + (colno + 1) + " is empty");
}
public static boolean isColumnEmpty(Row row, int colno) {
Cell c = row.getCell(colno);
if (c == null || c.getCellType() == Cell.CELL_TYPE_BLANK)
return true;
return false;
}
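Since a missing cell comes back as null, one way to keep the CSV columns aligned is to emit an empty field whenever the lookup returns null. The following is only a minimal sketch of that idea in plain Java, with no POI dependency: the class name SparseRowToCsv and the Map standing in for a row are assumptions made for illustration, and no CSV quoting or escaping is attempted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

final class SparseRowToCsv {

    // Builds one CSV line with exactly expectedColumns fields.
    // cellsByColumn holds only the cells that exist; a missing key plays the
    // role of the null returned by row.getCell(i) for a blank cell.
    // Assumption: values contain no commas or quotes, so no escaping is done.
    static String toCsvLine(Map<Integer, String> cellsByColumn, int expectedColumns) {
        StringJoiner line = new StringJoiner(",");
        for (int i = 0; i < expectedColumns; i++) {
            String value = cellsByColumn.get(i);
            // Treat a missing cell as an empty field instead of dereferencing null
            line.add(value == null ? "" : value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> row = new HashMap<>();
        row.put(0, "alice");
        row.put(2, "42");
        System.out.println(toCsvLine(row, 4)); // prints "alice,,42,"
    }
}
```

In the real converter the same null check would sit in front of the switch on cell.getCellType().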

Evaluate Excel Column with expression

I have an Excel sheet that contains many columns as input. I have another sheet that contains column names combined into expressions.
Example: the expression sheet contains
{{CS_RRC_Successful (number)+PS_RRC_Sucessful (number)}*100},
where CS_RRC_Successful (number) and PS_RRC_Sucessful (number) are columns available in the input sheet.
How should I evaluate these expressions with POI using Java?
The cell type will tell you whether the cell holds a formula. Then call getCellFormula():
if(theCell.getCellType()== Cell.CELL_TYPE_FORMULA)
{
String formulaVal = theCell.getCellFormula();
}
Check out the API docs to see the other formula-related methods available on a Cell. https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/Cell.html
This is just sample code that evaluates the formula SUM(A1:B1)+10 for a cell. You will have to replace it with the formula you are using and set the cell type to numeric (or whichever data type your Excel formula returns).
See the example below:
FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
//you will need to create the row in case you are making a new file.
Row row = sheet.getRow(0);
Cell cell2 = row.createCell(2);
//set cell formula
cell2.setCellFormula("SUM(A1:B1)+10");
//evaluate the formula
evaluator.evaluateFormulaCell(cell2);
//set the cell type to numeric
cell2.setCellType(Cell.CELL_TYPE_NUMERIC); //or CELL_TYPE_FORMULA
System.out.println(cell2.getNumericCellValue());
//write data to excel file
FileOutputStream out = new FileOutputStream(new File("res/Book1.xlsx"));
workbook.write(out);
out.close();
System.out.println("Excel written successfully..");
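Before a formula like the one above can be set on a cell, the column names in the expression have to be turned into cell references. The following is a rough sketch of that translation step in plain Java. It rests on assumptions that are not confirmed by the question: that the braces are meant as grouping, that each column name maps to a known column letter, and that no column name is a substring of another after the longest-first ordering. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ExpressionToFormula {

    // Rewrites an expression that uses column names into an A1-style formula
    // for one row, e.g. "{A (number)+B (number)}" becomes "(A2+B2)".
    // Assumption: '{' and '}' are plain grouping and map to '(' and ')'.
    static String toFormula(String expression, Map<String, String> columnLetters, int excelRow) {
        String formula = expression.replace('{', '(').replace('}', ')');
        // Replace longer names first so a short name cannot clobber part of a longer one
        List<String> names = new ArrayList<>(columnLetters.keySet());
        names.sort((a, b) -> Integer.compare(b.length(), a.length()));
        for (String name : names) {
            formula = formula.replace(name, columnLetters.get(name) + excelRow);
        }
        return formula;
    }

    public static void main(String[] args) {
        Map<String, String> letters = new LinkedHashMap<>();
        letters.put("CS_RRC_Successful (number)", "A");
        letters.put("PS_RRC_Sucessful (number)", "B");
        String expr = "{{CS_RRC_Successful (number)+PS_RRC_Sucessful (number)}*100}";
        System.out.println(toFormula(expr, letters, 2)); // prints "((A2+B2)*100)"
    }
}
```

The resulting string could then be passed to setCellFormula and evaluated with the FormulaEvaluator as in the answer above.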

Results address of empty cell in Excel sheet using XSSF

I am reading an Excel sheet using POI's XSSF. The Excel sheet has thousands of rows of user information such as user name, address, age, department etc.
I can read and write the Excel sheet successfully, but I want to locate the addresses of empty/null cells and print them as results in another sheet.
An example of the result I want:
Empty cell : C5
Empty cell : E7
Empty cell : H8
Thanks, I appreciate any discussion and replies.
You need to check for both null and blank cells. Null cells are ones that have never been used; blank cells are ones that have been used or styled in some way, so Excel has decided to keep them in the file.
The easiest way to control this fetching is with a MissingCellPolicy. Your code can then be something like:
Row r = getRow(); // Logic here to get the row of interest
// Iterate over all cells in the row, up to at least the 10th column
int lastColumn = Math.max(r.getLastCellNum(), 10);
for (int cn = 0; cn < lastColumn; cn++) {
Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
if (c == null) {
// The spreadsheet is empty in this cell
} else {
// Do something useful with the cell's contents
// eg...
System.out.println("There is data in cell " + (new CellReference(c)).formatAsString());
}
}
You can find more on this in the POI docs on iterating over rows and cells.
If I understood your question correctly, this should do the trick:
XSSFCell cell = (XSSFCell) row.getCell( index );
if( cell == null )
{
cell = (XSSFCell) row.createCell( index );
}
if( cell.getStringCellValue().trim().isEmpty() )
{
String cellRef = CellReference.convertNumToColString( cell.getColumnIndex() );
// getRowNum() is zero-based, so add 1 to get the row number shown in Excel
System.err.println( "Empty cell found: " + cellRef
+ Integer.toString( row.getRowNum() + 1 ) );
}
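For reference, the mapping from zero-based row and column indices to an address such as C5 can be sketched in plain Java as below. This is only an illustration of the arithmetic; the class CellAddress and its methods are hypothetical and not part of POI, which already provides CellReference for this purpose.

```java
final class CellAddress {

    // Converts a zero-based column index to Excel column letters:
    // 0 -> "A", 25 -> "Z", 26 -> "AA".
    static String columnLetters(int columnIndex) {
        StringBuilder letters = new StringBuilder();
        int n = columnIndex + 1; // work with a one-based number
        while (n > 0) {
            int remainder = (n - 1) % 26;
            letters.insert(0, (char) ('A' + remainder));
            n = (n - 1) / 26;
        }
        return letters.toString();
    }

    // Builds an A1-style address from zero-based row and column indices.
    // The row index is shifted by one because Excel rows start at 1.
    static String toAddress(int rowIndex, int columnIndex) {
        return columnLetters(columnIndex) + (rowIndex + 1);
    }

    public static void main(String[] args) {
        System.out.println("Empty cell : " + toAddress(4, 2)); // prints "Empty cell : C5"
    }
}
```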

Read excel cells and determine formatted words in cell

Is it possible to read the format of a cell from an Excel sheet and determine which words are bold or italic?
I can read and write cells, and I also know that JExcel can write formatted cells. By formatted cells I mean that the text is italic or bold.
Is it possible to read a cell's data and determine which words are bold?
For instance I will have this in cell:
"A sample text from one excel cell"
I want to know that the string "excel cell" is bold, and the string "sample" is italic.
Is this possible in JExcel? If not, how would I do that in Java? Can somebody suggest an API?
Maybe a better approach would be to parse an XML file.
I don't know about JExcel, but I can tell you this is fairly easy to do in Apache POI. Here is a simple application to show one way it can be done. It isn't incredibly pretty, but it should be enough to get you started:
public static final void main(String... args) throws Exception
{
InputStream is = ExcelFormatTest.class.getResourceAsStream("Test.xlsx");
Workbook wb = new XSSFWorkbook(is);
Sheet sheet = wb.getSheetAt(0);
Cell cell = sheet.getRow(0).getCell(0);
XSSFRichTextString richText = (XSSFRichTextString)cell.getRichStringCellValue();
int formattingRuns = cell.getRichStringCellValue().numFormattingRuns();
for(int i = 0; i < formattingRuns; i++)
{
int startIdx = richText.getIndexOfFormattingRun(i);
int length = richText.getLengthOfFormattingRun(i);
System.out.println("Text: " + richText.getString().substring(startIdx, startIdx + length));
if(i == 0)
{
short fontIndex = cell.getCellStyle().getFontIndex();
Font f = wb.getFontAt(fontIndex);
System.out.println("Bold: " + (f.getBoldweight() == Font.BOLDWEIGHT_BOLD));
System.out.println("Italics: " + f.getItalic() + "\n");
}
else
{
Font f = richText.getFontOfFormattingRun(i);
System.out.println("Bold: " + (f.getBoldweight() == Font.BOLDWEIGHT_BOLD));
System.out.println("Italics: " + f.getItalic() + "\n");
}
}
}
Basically, you get a RichTextString object from a cell (make sure it is a String cell first, though), then iterate over the formatting runs and check the font for each one. It looks like the first run uses the Cell's CellStyle/font, so you have to look it up that way (you get an NPE if you try to get it from the RichTextString).
Once you have the font, you can get all of its attributes. Here is the Javadoc for POI's Font.
If you are using older, non-XLSX files, replace XSSF with HSSF in the class names, and you'll have to change the RichTextString code a bit to lookup the font using the font index. Here are the JavaDocs for XSSFRichTextString and HSSFRichTextString.
Running this with the following in Sheet 1, A1: "A sample text from one excel cell" gives the following results:
Text: A 
Bold: false
Italics: false
Text: sample
Bold: true
Italics: false
Text:  text 
Bold: false
Italics: false
Text: from
Bold: false
Italics: true
Text:  one 
Bold: false
Italics: false
Text: excel cell
Bold: true
Italics: true
Here's how I'd do it in VBA. Maybe you can translate:
Sub ListBoldStrings()
Dim cell As Excel.Range
Dim i As Long
Dim BoldChars As String
Dim BoldStrings() As String
'replace "|" with a char that will not appear in evaluated strings
Const SEPARATOR_CHAR As String = "|"
Set cell = ActiveCell
With cell
For i = 1 To .Characters.Count
If .Characters(i, 1).Font.Bold Then
BoldChars = BoldChars + .Characters(i, 1).Text
Else
BoldChars = BoldChars + SEPARATOR_CHAR
End If
If Right$(BoldChars, 2) = WorksheetFunction.Rept(SEPARATOR_CHAR, 2) Then
BoldChars = Left$(BoldChars, Len(BoldChars) - 1)
End If
Next i
End With
BoldStrings = Split(BoldChars, SEPARATOR_CHAR)
For i = LBound(BoldStrings) To UBound(BoldStrings)
Debug.Print BoldStrings(i)
Next i
End Sub
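A rough Java rendering of the same idea might look like the sketch below. It assumes the per-character bold flags are already available as a boolean array (a hypothetical input; with POI they would have to be derived from the formatting runs shown earlier), and it collects runs directly instead of using a separator character, so it is not a line-by-line translation of the VBA.

```java
import java.util.ArrayList;
import java.util.List;

final class BoldSegments {

    // Returns every maximal run of consecutive characters flagged as bold.
    // Assumption: bold.length equals text.length(), one flag per character.
    static List<String> boldSegments(String text, boolean[] bold) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (bold[i]) {
                current.append(text.charAt(i));
            } else if (current.length() > 0) {
                // A non-bold character closes the current bold run
                segments.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            segments.add(current.toString());
        }
        return segments;
    }

    public static void main(String[] args) {
        String text = "A sample text";
        boolean[] bold = new boolean[text.length()];
        for (int i = 2; i < 8; i++) {
            bold[i] = true; // flag the word "sample"
        }
        System.out.println(boldSegments(text, bold)); // prints "[sample]"
    }
}
```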
