I'm trying to read some values from an Excel .xlsx file in my Java program using Apache POI, but my loop sometimes encounters an empty cell and I get a NullPointerException. How can I "test" the cell before reading it? Here's a piece of my code:
FileInputStream file = new FileInputStream(new File(file));
XSSFWorkbook workbook = new XSSFWorkbook(file);
XSSFSheet sheet = workbook.getSheetAt(0);
int rows;
rows = sheet.getPhysicalNumberOfRows();
for (int i=1;i<rows;i++){
Row row = sheet.getRow(i);
Cell cell = row.getCell(2); // Here is the NullPointerException
String cellString = cell.getStringCellValue();
myArrayList.add(cellString);
}
Which brings me to :
java.lang.NullPointerException
at analyse.Analyser.getExcelWords3(Analyser.java:73)
at analyse.Analyser.main(Analyser.java:21)
I want to know whether it is possible to check if the cell is empty before trying to read it, so that I don't get the NPE. Thank you in advance!
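To avoid the NullPointerException, pass Row.MissingCellPolicy.CREATE_NULL_AS_BLANK as the second argument to getCell:
Cell cell = row.getCell(2, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
This returns a blank cell instead of null when the cell is missing, so reading it no longer throws an NPE. Note that sheet.getRow(i) can itself return null for a row that was never written, so check the row as well.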
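For illustration, here is a minimal sketch that combines the missing-cell policy with a null check on the row. The class name NullSafeColumnReader and the method readColumn are made up for this example, and it assumes POI 3.15 or later, where Row.MissingCellPolicy is an enum. DataFormatter is used so that numeric cells are also returned as text.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;

// Hypothetical helper, not part of Apache POI
public class NullSafeColumnReader {

    /**
     * Collects the text of one column, starting at firstRow (0-based).
     * Missing rows and missing cells yield an empty string instead of an NPE.
     */
    public static List<String> readColumn(Sheet sheet, int firstRow, int column) {
        DataFormatter formatter = new DataFormatter();
        List<String> values = new ArrayList<>();
        // getLastRowNum() is the index of the last row, so gaps in between are visited too
        for (int i = firstRow; i <= sheet.getLastRowNum(); i++) {
            Row row = sheet.getRow(i);
            if (row == null) {
                // The whole row was never written
                values.add("");
                continue;
            }
            // A missing cell is created as a blank cell in the in-memory row
            Cell cell = row.getCell(column, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
            // A blank cell formats as an empty string
            values.add(formatter.formatCellValue(cell));
        }
        return values;
    }
}
```

With the code from the question, this would be called as readColumn(sheet, 1, 2). If you would rather not modify the in-memory sheet, RETURN_BLANK_AS_NULL plus an explicit null check on the cell should work as well.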
Wrap your code in a try/catch statement; that is what it's there for.
https://docs.oracle.com/javase/tutorial/essential/exceptions/catch.html
Some untested code below to give you the idea:
for (int i=1;i<rows;i++){
try{
Row row = sheet.getRow(i);
Cell cell = row.getCell(2); // Here is the NullPointerException
String cellString = cell.getStringCellValue();
myArrayList.add(cellString);
}catch(NullPointerException NPE)
{
//what to do when the exception occurs?
}
}
Look at this method:
/**
* Returns the cell at the given (0 based) index, with the specified {@link org.apache.poi.ss.usermodel.Row.MissingCellPolicy}
*
* @return the cell at the given (0 based) index
* @throws IllegalArgumentException if cellnum < 0 or the specified MissingCellPolicy is invalid
* @see Row#RETURN_NULL_AND_BLANK
* @see Row#RETURN_BLANK_AS_NULL
* @see Row#CREATE_NULL_AS_BLANK
*/
public XSSFCell getCell(int cellnum, MissingCellPolicy policy) {
It should help you.
FileInputStream file = new FileInputStream(new File(file));
XSSFWorkbook workbook = new XSSFWorkbook(file);
XSSFSheet sheet = workbook.getSheetAt(0);
int rows;
String cellString = null;
rows = sheet.getPhysicalNumberOfRows();
for (int i=1;i<rows;i++){
Row row = sheet.getRow(i);
Cell cell = row.getCell(2); // 3rd column of the Excel sheet, so getCell(2)
if(cell == null)
{
// the cell is missing, so there is nothing to read
cellString = null;
}
else
{
cellString = cell.getStringCellValue();
}
myArrayList.add(cellString);
}
If the Excel cell is empty (missing), null is stored in the list in its place.
I am reading an Excel file using the POI library in my Java code, and so far it works fine. But now I have a new requirement. The Excel file contains many records (e.g. 1000 rows) and has column headers in the first row. I apply an Excel filter to it: say I have a 'year' column and I filter all rows for year=2019, which leaves 15 rows.
Question: I want to process only these 15 rows in my Java code. Is there any method in the POI library, or any other way, to know whether the row being read is filtered out or not?
Thanks.
I already have working code, but right now I am looking for a way to read only the filtered rows. I have not tried anything new yet other than searching the library and forums.
The code below is inside a method. I am not used to formatting on Stack Overflow, so kindly ignore any formatting issues.
// For storing data into CSV files
StringBuffer data = new StringBuffer();
try {
SimpleDateFormat dtFormat = new SimpleDateFormat(CommonConstants.YYYY_MM_DD); // "yyyy-MM-dd"
String doubleQuotes = "\"";
FileOutputStream fos = new FileOutputStream(outputFile);
// Get the workbook object for XLSX file
XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
wBook.setMissingCellPolicy(Row.RETURN_BLANK_AS_NULL);
// Get first sheet from the workbook
//XSSFSheet sheet = wBook.getSheetAt(0);
XSSFSheet sheet = wBook.getSheet(CommonConstants.METADATA_WORKSHEET);
//Row row;
//Cell cell;
// Iterate through each rows from first sheet
int rows = sheet.getLastRowNum();
int totalRows = 0;
int colTitelNumber = 0;
Row firstRowRecord = sheet.getRow(1);
for (int cn = 0; cn < firstRowRecord.getLastCellNum(); cn++) {
Cell cellObj = firstRowRecord.getCell(cn);
if(cellObj != null) {
String str = cellObj.toString();
if(CommonConstants.COLUMN_TITEL.equalsIgnoreCase(str)) {
colTitelNumber = cn;
break;
}
}
}
// Start with row Number 1. We don't need 0th number row as it is for Humans to read but not required for processing.
for (int rowNumber = 1; rowNumber <= rows; rowNumber++) {
StringBuffer rowData = new StringBuffer();
boolean skipRow = false;
Row rowRecord = sheet.getRow(rowNumber);
if (rowRecord == null) {
LOG.error("Empty/Null record found");
} else {
for (int cn = 0; cn < rowRecord.getLastCellNum(); cn++) {
Cell cellObj = rowRecord.getCell(cn);
if(cellObj == null) {
if(cn == colTitelNumber) {
skipRow = true;
break; // The first column cell value is empty/null. Which means Titel column cell doesn't have value so don't add this row in csv.
}
rowData.append(CommonConstants.CSV_SEPARTOR);
continue;
}
switch (cellObj.getCellType()) {
case Cell.CELL_TYPE_BOOLEAN:
rowData.append(cellObj.getBooleanCellValue() + CommonConstants.CSV_SEPARTOR);
//LOG.error("Boolean:" + cellObj.getBooleanCellValue());
break;
case Cell.CELL_TYPE_NUMERIC:
if (DateUtil.isCellDateFormatted(cellObj)) {
Date date = cellObj.getDateCellValue();
rowData.append(dtFormat.format(date).toString() + CommonConstants.CSV_SEPARTOR);
//LOG.error("Date:" + cellObj.getDateCellValue());
} else {
rowData.append(cellObj.getNumericCellValue() + CommonConstants.CSV_SEPARTOR);
//LOG.error("Numeric:" + cellObj.getNumericCellValue());
}
break;
case Cell.CELL_TYPE_STRING:
String cellValue = cellObj.getStringCellValue();
// If string contains double quotes then replace it with pair of double quotes.
cellValue = cellValue.replaceAll(doubleQuotes, doubleQuotes + doubleQuotes);
// If string contains comma then surround the string with double quotes.
rowData.append(doubleQuotes + cellValue + doubleQuotes + CommonConstants.CSV_SEPARTOR);
//LOG.error("String:" + cellObj.getStringCellValue());
break;
case Cell.CELL_TYPE_BLANK:
rowData.append("" + CommonConstants.CSV_SEPARTOR);
//LOG.error("Blank:" + cellObj.toString());
break;
default:
rowData.append(cellObj + CommonConstants.CSV_SEPARTOR);
}
}
if(!skipRow) {
rowData.append("\r\n");
data.append(rowData); // Appending one entire row to main data string buffer.
totalRows++;
}
}
}
pTransferObj.put(CommonConstants.TOTAL_ROWS, (totalRows));
fos.write(data.toString().getBytes());
fos.close();
wBook.close();
} catch (Exception ex) {
LOG.error("Exception Caught while generating CSV file", ex);
}
All rows which are not visible in the sheet have a zero height. So if you only need to read the visible rows, you can check each row via Row.getZeroHeight().
Example
Sheet:
Code:
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.*;
class ReadExcelOnlyVisibleRows {
public static void main(String[] args) throws Exception {
Workbook workbook = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx"));
DataFormatter dataFormatter = new DataFormatter();
CreationHelper creationHelper = workbook.getCreationHelper();
FormulaEvaluator formulaEvaluator = creationHelper.createFormulaEvaluator();
Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
if (!row.getZeroHeight()) { // if row.getZeroHeight() is true then this row is not visible
for (Cell cell : row) {
String cellContent = dataFormatter.formatCellValue(cell, formulaEvaluator);
System.out.print(cellContent + "\t");
}
System.out.println();
}
}
workbook.close();
}
}
Result:
F1 F2 F3 F4
V2 2 2-Mai FALSE
V4 4 4-Mai FALSE
V2 6 6-Mai FALSE
V4 8 8-Mai FALSE
You have to use the auto filter provided in the Apache POI library, and you also have to set the freeze pane. Below is a brief code snippet that you can adapt.
XSSFSheet sheet = wBook.getSheet(CommonConstants.METADATA_WORKSHEET);
sheet.setAutoFilter(new CellRangeAddress(0, 0, 0, numColumns));
sheet.createFreezePane(0, 1);
I had to override some hooks and come up with my own approach to filter out hidden rows so that they are not processed. Below is a code snippet. My approach consists of opening a second copy of the same sheet so that I can query whether the row currently being processed is hidden. The answer above touches on this; the code below expands on it to show how it can be incorporated into the Spring Batch Excel framework. One drawback is that you have to open a second copy of the same file, but I couldn't figure out a way (perhaps there is none!) to get my hands on the internal Workbook sheet, among other reasons because org.springframework.batch.item.excel.poi.PoiSheet is package-private. (Note that the syntax below is Groovy!)
/**
* Produces a reader that knows how to ingest a file in excel format.
*/
private PoiItemReader<String[]> createExcelReader(String filePath) {
File f = new File(filePath)
PoiItemReader<String[]> reader = new PoiItemReader<>()
reader.setRowMapper(new PassThroughRowMapper())
Resource resource = new DefaultResourceLoader().getResource("file:" + f.canonicalPath)
reader.setResource(resource)
reader.setRowSetFactory(new VisibleRowsOnlyRowSetFactory(resource))
reader.open(new ExecutionContext())
reader
}
...
// The "hooks" I overwrote to inject my logic
static class VisibleRowsOnlyRowSet extends DefaultRowSet {
Workbook workbook
Sheet sheet
VisibleRowsOnlyRowSet(final Sheet sheet, final RowSetMetaData metaData) {
super(sheet, metaData)
}
VisibleRowsOnlyRowSet(final Sheet sheet, final RowSetMetaData metaData, Workbook workbook) {
this(sheet, metaData)
this.workbook = workbook
this.sheet = sheet
}
boolean next() {
boolean moreLeft = super.next()
if (moreLeft) {
Row row = workbook.getSheet(sheet.name).getRow(getCurrentRowIndex())
if (row?.getZeroHeight()) {
log.warn("Row $currentRow is hidden in input excel sheet, will omit it from output.")
currentRow.eachWithIndex { _, int i ->
currentRow[i] = ''
}
}
}
moreLeft
}
}
static class VisibleRowsOnlyRowSetFactory extends DefaultRowSetFactory {
Workbook workbook
VisibleRowsOnlyRowSetFactory(Resource resource) {
this.workbook = WorkbookFactory.create(resource.inputStream)
}
RowSet create(Sheet sheet) {
new VisibleRowsOnlyRowSet(sheet, super.create(sheet).metaData, workbook)
}
}
I am reading an xlsx file using Java (Apache POI).
I have created a Document class (with all Excel column headings as variables).
I have to read each row in the Excel file and map it to the Document class by creating a collection of Document objects.
The problem I am facing is that I have to start reading from row 2, and from column 7 to column 35, and map the corresponding values to the Document class.
I am unable to figure out exactly what the code should look like.
I have written the following lines of code.
List sheetData = new ArrayList();
InputStream excelFile = new BufferedInputStream(new FileInputStream("D:\\Excel file\\data.xlsx"));
Workbook workBook = new XSSFWorkbook(excelFile); // Creates Workbook
XSSFSheet sheet = (XSSFSheet) workBook.getSheet("Daily");
DataFormatter formatter = new DataFormatter();
for (int i = 7; i <= 35; i++) {
XSSFRow row = sheet.getRow(i);
Cell cell = row.getCell(i);
String val = formatter.formatCellValue(cell);
sheetData.add(val);
}
Assuming I've understood your question correctly, I believe you want to process every row which exists from row 2 onwards to the end of the file, and for each of those rows consider the cells in columns 7 through 35. I believe you also might need to process those values, but you haven't said how, so for this example I'll just stuff them in a list of strings and hope for the best...
This is based on the Apache POI documentation for iterating over rows and cells
File excelFile = new File("D:\\Excel file\\data.xlsx");
Workbook workBook = WorkbookFactory.create(excelFile);
Sheet sheet = workBook.getSheet("Daily");
DataFormatter formatter = new DataFormatter();
// Start from the 2nd row, processing all to the end
// Note - Rows and Columns in Apache POI are 0-based not 1-based
for (int rn=1; rn<=sheet.getLastRowNum(); rn++) {
Row row = sheet.getRow(rn);
if (row == null) {
// Whole row is empty. Handle as required here
continue;
}
List<String> values = new ArrayList<String>();
for (int cn=6; cn<35; cn++) {
Cell cell = row.getCell(cn);
String val = null;
if (cell != null) { val = formatter.formatCellValue(cell); }
if (val == null || val.isEmpty()) {
// Cell is empty. Handle as required here
}
// Save the value to list. Save to an object instead if required
values.add(val);
}
}
workBook.close();
Depending on your business requirements, put in logic for handling blank rows and cells. Then, do whatever you need to do with the values you find, again as per your business requirements!
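As a rough sketch of the "save to an object instead" step, the values collected for one row could be turned into a Document like this. The real Document class was not shown in the question, so the fields title and owner below are invented purely for illustration, as are the names RowMappingSketch, toDocument and valueAt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowMappingSketch {

    /** Hypothetical stand-in for the asker's Document class; the real fields are unknown. */
    public static class Document {
        public final String title; // assumed: first of the selected columns
        public final String owner; // assumed: second of the selected columns

        public Document(String title, String owner) {
            this.title = title;
            this.owner = owner;
        }
    }

    /** Builds one Document from the formatted cell values of a single row. */
    public static Document toDocument(List<String> values) {
        return new Document(valueAt(values, 0), valueAt(values, 1));
    }

    // Returns "" for positions that are missing or null, so short rows do not fail
    private static String valueAt(List<String> values, int index) {
        if (index >= values.size() || values.get(index) == null) {
            return "";
        }
        return values.get(index);
    }

    public static void main(String[] args) {
        // In the loop above, toDocument(values) would be called once per row
        List<Document> documents = new ArrayList<>();
        documents.add(toDocument(Arrays.asList("Report", "Alice")));
        documents.add(toDocument(Arrays.asList("Draft")));
        System.out.println(documents.size());
    }
}
```

A real mapping would have one field per column heading and probably convert some values to numbers or dates rather than keeping everything as a String.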
You could iterate over the sheet with an Iterator, but there are also the methods getRow() and getCell():
Workbook workbook = new XSSFWorkbook(excelFile);
// the first sheet of the document
Sheet data = workbook.getSheetAt(0);
// you could iterate over the rows of the sheet with an iterator
Iterator<Row> iterator = data.iterator();
// x/y position in the document (0-based column and row index)
Row row = data.getRow(y);
Cell pointingCell = row.getCell(x);
String pointingString = pointingCell.getStringCellValue();
I have written a program that reads an Excel template sheet in which the first column was hidden. I now have code that unhides that column programmatically (so that the columns start from A1).
I'm using Apache POI version 3.16.
When I open the file, it should show the columns starting from A1, but instead it starts from column B1.
The code below works properly for XLS but does not work for the XLSX format.
sheet.showInPane(0, 0);
I need to move the horizontal scroll bar manually to view my first column. How can I scroll to the first column programmatically for the XLSX format?
Here is my full code.
public Workbook readWorkBookAndWriteErrors(String bufId,String inputFile, String ext) throws Exception {
Workbook workBook =null;
Sheet sheet = null;
if(GlobalVariables.EXCEL_FORMAT_XLS.equalsIgnoreCase(ext)){
// Get the workbook instance for XLS file
workBook = new HSSFWorkbook(new FileInputStream(inputFile));
}else{
// Get the workbook instance for XLSX file
workBook = new XSSFWorkbook(new FileInputStream(inputFile));
}
sheet = workBook.getSheetAt(0);
Row row = null;
if(sheet.isColumnHidden(0)){
sheet.setColumnHidden(0, false);
sheet.setActiveCell(new CellAddress("A1"));
sheet.showInPane(0, 0);
sheet.createFreezePane(0, 1);
Iterator<Row> rowIterator = sheet.iterator();
int rowIndex = 1;
while (rowIterator.hasNext()) {
row = rowIterator.next();
if(rowIndex == 1){
rowIndex++;
continue;
}
Cell cell = row.createCell(0);
cell.setCellValue("error message");
rowIndex++;
}
}
return workBook;
}
Here is the answer to my question. Please refer to this Source.
public Workbook readWorkBookAndWriteErrors(String bufId,String inputFile, String ext) throws Exception {
Workbook workBook =null;
Sheet sheet = null;
if(GlobalVariables.EXCEL_FORMAT_XLS.equalsIgnoreCase(ext)){
// Get the workbook instance for XLS file
workBook = new HSSFWorkbook(new FileInputStream(inputFile));
}else{
// Get the workbook instance for XLSX file
workBook = new XSSFWorkbook(new FileInputStream(inputFile));
}
sheet = workBook.getSheetAt(0);
Row row = null;
if(sheet.isColumnHidden(0)){
sheet.setColumnHidden(0, false);
if(sheet instanceof XSSFSheet){
CTWorksheet ctWorksheet = null;
CTSheetViews ctSheetViews = null;
CTSheetView ctSheetView = null;
XSSFSheet tempSheet = (XSSFSheet) sheet;
// First step is to get at the CTWorksheet bean underlying the worksheet.
ctWorksheet = tempSheet.getCTWorksheet();
// From the CTWorksheet, get at the sheet views.
ctSheetViews = ctWorksheet.getSheetViews();
// Grab a single sheet view from that array
ctSheetView = ctSheetViews.getSheetViewArray(ctSheetViews.sizeOfSheetViewArray() - 1);
// Set the address of the top left hand cell.
ctSheetView.setTopLeftCell("A1");
}else{
sheet.setActiveCell(new CellAddress("A1"));
sheet.showInPane(0, 0);
}
Iterator<Row> rowIterator = sheet.iterator();
int rowIndex = 1;
while (rowIterator.hasNext()) {
row = rowIterator.next();
if(rowIndex == 1){
rowIndex++;
continue;
}
Cell cell = row.createCell(0);
cell.setCellValue("error message");
rowIndex++;
}
}
return workBook;
}
If you are using the latest C# NPOI, I made this utility function:
/// <summary>Set view position at given coordinates.</summary>
/// <param name="sheet">A reference to the Excel sheet.</param>
/// <param name="topLeftCell">The coordinates of the cell that will show in top left corner when you open the Excel sheet (example: "A1").</param>
/// <param name="onlyFirstView">If true, only the first sheet view will have its position adjusted. If false, every view of the sheet will have its position adjusted.</param>
public static void SetViewPosition(ref NPOI.SS.UserModel.ISheet sheet, string topLeftCell = "A1", bool onlyFirstView = true)
{
NPOI.OpenXmlFormats.Spreadsheet.CT_Worksheet worksheet = (NPOI.OpenXmlFormats.Spreadsheet.CT_Worksheet)(typeof(NPOI.XSSF.UserModel.XSSFSheet).GetField("worksheet", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance)?.GetValue((NPOI.XSSF.UserModel.XSSFSheet)sheet));
if (worksheet?.sheetViews?.sheetView != null && worksheet.sheetViews.sheetView.Count > 0)
{
if (onlyFirstView)
worksheet.sheetViews.sheetView[0].topLeftCell = topLeftCell;
else
foreach (NPOI.OpenXmlFormats.Spreadsheet.CT_SheetView view in worksheet.sheetViews.sheetView)
view.topLeftCell = topLeftCell;
}
}
Usage example:
NPOI.SS.UserModel.ISheet mySheet = myWorkbook.GetSheetAt(0);
// First sheet from our workbook opens with view set to top left corner (cell A1 visible in top left corner).
SetViewPosition(ref mySheet, "A1");
The code below worked for me to move the horizontal scrollbar to the required position.
It uses the Apache POI libraries (poi, poi-ooxml).
((XSSFSheet)sheet).getCTWorksheet().getSheetViews().getSheetViewArray(0).setTopLeftCell("AE11");
I have a piece of code which throws the following exception (I have marked the offending line in bold):
Exception in thread "main" java.lang.ClassCastException: org.apache.poi.hssf.usermodel.HSSFCell cannot be cast to java.lang.String
at com.codi.excel.ExcelRead.main(ExcelRead.java:36)
My code is as follows -
HSSFWorkbook wb = new HSSFWorkbook(input);
HSSFSheet sheet = wb.getSheetAt(0);
List MobileSeries=new ArrayList();
MobileSeries = findRow(sheet, cellContent);
if(MobileSeries !=null){
for(Iterator iter=MobileSeries.iterator();iter.hasNext();){
**String mobileSeries=(String)iter.next();**
String LidPattern=extractNumber(mobileSeries);
if (lid.startsWith(LidPattern)) {
System.out.println("This is a mobile number");
Could you please help me out?
Apache POI provides a handy class for you to do just that - DataFormatter
Using DataFormatter, for string cells you'll get the current contents, and for numeric or date cells the value will be formatted based on the formatting / styling rules applied to the cell, then returned as a string with that applied.
To loop over all the rows and cells in a workbook, getting their values, following the pattern in the docs, just do something like:
Workbook wb = WorkbookFactory.create(input);
Sheet sheet = wb.getSheetAt(0);
DataFormatter formatter = new DataFormatter();
for (Row r : sheet) {
for (Cell c : r) {
String value = formatter.formatCellValue(c);
}
}
Easy!
While you are iterating the rows of a worksheet, check whether the HSSFCell cell is of type String or not.
InputStream fileInputStream = null;
HSSFWorkbook hssfWorkbook;
HSSFSheet sheet;
HSSFRow row;
HSSFCell cell;
Iterator rowIterator, cellIterator;
// Put these lines of code in a try-catch block
fileInputStream = new FileInputStream("/path/to/TestExcel.xls");
hssfWorkbook = new HSSFWorkbook(fileInputStream);
sheet = hssfWorkbook.getSheetAt(0);
rowIterator = sheet.rowIterator();
while (rowIterator.hasNext()) {
row = (HSSFRow) rowIterator.next();
cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
cell = (HSSFCell) cellIterator.next();
if (cell.getCellType() == HSSFCell.CELL_TYPE_STRING) {
String someVariable = cell.getStringCellValue();
} else if (cell.getCellType() == HSSFCell.CELL_TYPE_NUMERIC) {
// Handle numeric type
} else {
// Handle other types
}
}
// Other code
}
Try to access the cell and extract the value from it. The snippet below should help you:
Cell cell = (Cell) iter.next();
String mobileSeries = cell.getStringCellValue();
Could anyone help?
Using this code, I was able to get a value from an Excel field. The value in the specific column is 5.
public Integer Check4(int b) throws IOException {
InputStream myxls = new FileInputStream("book.xls");
HSSFWorkbook wb = new HSSFWorkbook(myxls);
HSSFSheet sheet = wb.getSheetAt(0); // first sheet
HSSFRow row = sheet.getRow(0); // first row
//HSSFCell cell0 = row.getCell((short)a); // first arg
HSSFCell cell1 = row.getCell((short)b); // second arg
cell1.setCellType(HSSFCell.CELL_TYPE_NUMERIC);
System.out.println("smth "+ cell1);
return ;
}
However, the output of this code is:
"smth 5.0"
How can I convert the cell1 value 5.0 to 5?
Math.round and Integer.parseInt() don't actually help.
You have to get the string value from the HSSFCell before using Integer.parseInt().
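Use Integer.parseInt(cell1.getStringCellValue()). Note that this only works if the cell actually holds text; for a numeric cell, getStringCellValue() throws an IllegalStateException, so use getNumericCellValue() instead, which returns a double.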
Finally, I solved it with (int) Math.round(cell1.getNumericCellValue()):
public Integer Check4(int b) throws IOException {
InputStream myxls = new FileInputStream("book.xls");
HSSFWorkbook wb = new HSSFWorkbook(myxls);
HSSFSheet sheet = wb.getSheetAt(0); // first sheet
HSSFRow row = sheet.getRow(1); // second row (the index is 0-based)
//HSSFCell cell0 = row.getCell((short)a); // first arg
HSSFCell cell1 = row.getCell((short)b); // second arg
String sell = cell1.toString();
cell1.setCellType(HSSFCell.CELL_TYPE_NUMERIC);
System.out.println("smth "+ (int) Math.round(cell1.getNumericCellValue()));
return (int) Math.round(cell1.getNumericCellValue());
}
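The conversion itself does not depend on POI, so it can be sketched in plain Java. The class and method names below are made up for this example. The first method rounds like the code above but fails loudly on overflow; the second refuses values that have a fractional part, which may be preferable if the column is supposed to hold whole numbers only.

```java
public class WholeNumberCells {

    /** Rounds to the nearest integer, e.g. 5.0 becomes 5 and 5.6 becomes 6. */
    public static int toRoundedInt(double numericCellValue) {
        // Math.round(double) returns a long; toIntExact throws ArithmeticException on overflow
        return Math.toIntExact(Math.round(numericCellValue));
    }

    /** Converts only when the value has no fractional part; otherwise throws. */
    public static int toExactInt(double numericCellValue) {
        // Math.rint returns the closest whole number as a double; NaN never equals itself
        if (numericCellValue != Math.rint(numericCellValue)) {
            throw new IllegalArgumentException("Not a whole number: " + numericCellValue);
        }
        return Math.toIntExact((long) numericCellValue);
    }

    public static void main(String[] args) {
        System.out.println(toRoundedInt(5.0)); // prints 5
        System.out.println(toExactInt(5.0));   // prints 5
    }
}
```

In the method above, this would be used as toRoundedInt(cell1.getNumericCellValue()).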