Read Excel file containing multiple values in single column - Java

I'm reading an Excel file using Apache POI.
My Excel table structure is like this:
|2000s| 2001, 2003, 2008, 2009|
For the data on the right-hand side, I need to assign each value to 2000s.
So far I've implemented it this way:
List<Class> list = new ArrayList<Class>();
File file = new File(file_path);
FileInputStream fis = new FileInputStream(file);
//Create an instance of workbook which refers to an excel file
XSSFWorkbook wb = new XSSFWorkbook(fis);
//This selects the 1st sheet
XSSFSheet sheet = wb.getSheetAt(0);
//Iterate through each row one by one
Iterator<Row> itr = sheet.iterator();
String newName = null;
String oldName = null;
while(itr.hasNext()){
    Row nextRow = itr.next();
    // For each row, iterate through all the columns
    Iterator<Cell> cellIterator = nextRow.cellIterator();
    while (cellIterator.hasNext())
    {
        Cell cell = cellIterator.next();
        newName = nextRow.getCell(0).toString();
        if(nextRow.getCell(1).toString().contains(",")){
            StringTokenizer st = new StringTokenizer(nextRow.getCell(1).toString(),",");
            while(st.hasMoreTokens()){
                oldName = st.nextToken();
            }
        }
        else{
            oldName = nextRow.getCell(1).toString();
        }
    }
    System.out.println();
}
When I run it, it throws a NullPointerException at the nextRow.getCell(1) line.
I don't understand how to map all the comma-separated values to 2000s.
This works perfectly fine for normal data (without commas).

The comma-separated values have been handled.
I'm posting an answer so that somebody else can get help from it.
What I've done is add the StringTokenizer class: if there's a comma in the cell, it splits the value on the comma delimiter.
Let's have a look at the code below:
while(itr.hasNext()){
    Row nextRow = itr.next();
    // For each row, iterate through all the columns
    Iterator<Cell> cellIterator = nextRow.cellIterator();
    while (cellIterator.hasNext())
    {
        Cell cell = cellIterator.next();
        newName = nextRow.getCell(0).toString();
        if(nextRow.getCell(1).toString().contains(",")){
            StringTokenizer st = new StringTokenizer(nextRow.getCell(1).toString(),",");
            while(st.hasMoreTokens()){
                oldName = st.nextToken();
            }
        }
        else{
            oldName = nextRow.getCell(1).toString();
        }
    }
    System.out.println();
}
Here newName gets the value of the first column (2000s),
and oldName gets the tokens split on the ',' delimiter, in this case 2001, 2003, 2008 and 2009.
For all of these oldName values, the newName 2000s is mapped.
UPDATE: The reason I was getting the NullPointerException is that some cells in the 2nd column (nextRow.getCell(1)) are null.
So whenever the iterator reaches a row with a null cell there, it throws a NullPointerException.
To fix this, pass a missing cell policy when reading the cell:
Cell cell2 = row.getCell(j, org.apache.poi.ss.usermodel.Row.CREATE_NULL_AS_BLANK);
(It simply treats null cells as blank. In newer POI versions the constant is Row.MissingCellPolicy.CREATE_NULL_AS_BLANK.)
This way you can also avoid the NullPointerException when reading empty cells from Excel.
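The mapping itself can be sketched without POI. The following is a minimal sketch, assuming both cell values have already been read as strings (for example via cell.toString()), with null standing in for a missing second cell. The class and method names (CommaValueMapper, mapOldToNew) are made up for illustration and are not part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommaValueMapper {

    // Maps every comma-separated token of oldNames to newName.
    // A null or blank oldNames (an empty cell) yields an empty map instead of throwing.
    static Map<String, String> mapOldToNew(String newName, String oldNames) {
        Map<String, String> mapping = new LinkedHashMap<>();
        if (oldNames == null || oldNames.trim().isEmpty()) {
            return mapping;
        }
        for (String token : oldNames.split(",")) {
            String oldName = token.trim(); // drop the space that follows each comma
            if (!oldName.isEmpty()) {
                mapping.put(oldName, newName);
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // prints {2001=2000s, 2003=2000s, 2008=2000s, 2009=2000s}
        System.out.println(mapOldToNew("2000s", "2001, 2003, 2008, 2009"));
        // prints {} because the second cell is missing
        System.out.println(mapOldToNew("1990s", null));
    }
}
```

Unlike the loop in the answer, which overwrites oldName on every token, this keeps every token, so all four years end up mapped to 2000s.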

Related

Require help on Java Logic for Reading Excel file and populating to HashMap

I am reading an Excel data file from Java using the Apache POI API and populating a HashMap with the Excel headers as keys and the data of a specified row as values. All headers are always present, but the data corresponding to some headers may or may not be present.
Below is my code logic:
First I populate all the headers into an ArrayList.
Then I iterate through the cells of the specified Excel data row and add the header value from the previously populated ArrayList and the data cell value from the row as a key-value pair to a HashMap.
Below is my code:
ArrayList<String> headerList = new ArrayList<String>();
Map<String, String> dataMap = new LinkedHashMap<String, String>();
DateFormat df = new SimpleDateFormat("dd/MM/yyyy");
// Create object of XSSFWorkbook to get hold of excel file
FileInputStream fis = new FileInputStream(System.getProperty("user.dir") + "\\Resources\\TestData.xlsx");
XSSFWorkbook workbook = new XSSFWorkbook(fis);
try
{
    int noOfSheets = workbook.getNumberOfSheets();
    for(int i=0; i<noOfSheets; i++)
    {
        if(workbook.getSheetName(i).equalsIgnoreCase(workSheet))
        {
            //Get access to sheet
            XSSFSheet sheet = workbook.getSheetAt(i);
            //Get access to all rows of sheet
            Iterator<Row> rows = sheet.iterator();
            Row headerRow = rows.next();
            Iterator<Cell> headerCells = headerRow.cellIterator();
            while(headerCells.hasNext())
            {
                headerList.add(headerCells.next().getStringCellValue());
            }
            // Get access to specific row
            while(rows.hasNext())
            {
                Row dataRow = rows.next();
                if(dataRow.getCell(0).getStringCellValue().equalsIgnoreCase(testCase))
                {
                    int j = 0;
                    //Get access to collection of cells of the identified rows
                    Iterator<Cell> dataCells = dataRow.cellIterator();
                    //loop through all the cells of the row and add cell data to arraylist.
                    while(dataCells.hasNext())
                    {
                        Cell dataCell = dataCells.next();
                        if(dataCell.getCellType()==CellType.STRING)
                        {
                            //arrList.add(dataCell.getStringCellValue());
                            dataMap.put(headerList.get(j), dataCell.getStringCellValue());
                        }
                        else if(dataCell.getCellType()==CellType.NUMERIC)
                        {
                            if(DateUtil.isCellDateFormatted(dataCell))
                            {
                                //arrList.add(df.format(dataCell.getDateCellValue()));
                                dataMap.put(headerList.get(j), df.format(dataCell.getDateCellValue()));
                            }
                            else
                            {
                                //arrList.add(NumberToTextConverter.toText(dataCell.getNumericCellValue()));
                                dataMap.put(headerList.get(j), NumberToTextConverter.toText(dataCell.getNumericCellValue()));
                            }
                        }
                        else if(dataCell.getCellType()==CellType.BOOLEAN)
                        {
                            //arrList.add(Boolean.toString(dataCell.getBooleanCellValue()));
                            dataMap.put(headerList.get(j), Boolean.toString(dataCell.getBooleanCellValue()));
                        }
                        else
                            dataMap.put(headerList.get(j), null);
                        j++;
                    }
                }
            }
        }
    }
}
finally
{
    // Release the workbook once reading is done
    workbook.close();
}
If a cell contains no data, I do not want the corresponding header to be added to the Map. But when I iterate through the data cells (dataCells.hasNext()), the iterator does not return null for a blank cell; it skips the blank cell entirely. So all headers are added, but cells with no data are not, and the header-data key-value pairs no longer line up.
Example: if the data cell of column 5 is blank, the column 5 header is mapped to the value of data column 6 as a key-value pair in the HashMap. What I want is for the column 5 header to be skipped when the column 5 data is blank. How do I resolve this mismatch?
Excel file screenshot
Debug scenario screenshot
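One way to avoid the shift is to walk the columns by index rather than through cellIterator(), which skips cells that were never created. With POI that would mean looping from 0 up to headerRow.getLastCellNum() and calling dataRow.getCell(col), which returns null for a missing cell. Below is a minimal sketch of the pairing logic only, assuming the data row has already been read into an array in which null or "" marks a blank cell. The class name, method name and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HeaderAligner {

    // Pairs headers with data by column index; a blank cell drops its header
    // instead of shifting the following values one column to the left.
    static Map<String, String> align(List<String> headers, String[] dataRow) {
        Map<String, String> dataMap = new LinkedHashMap<>();
        for (int col = 0; col < headers.size(); col++) {
            String value = col < dataRow.length ? dataRow[col] : null;
            if (value != null && !value.isEmpty()) {
                dataMap.put(headers.get(col), value);
            }
        }
        return dataMap;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("TestCase", "User", "Password", "City");
        String[] row = {"TC01", "alice", null, "Pune"}; // third cell is blank
        System.out.println(align(headers, row)); // {TestCase=TC01, User=alice, City=Pune}
    }
}
```

Because the header and the value are looked up with the same index, the counter j from the question is no longer needed.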

Java: How to push Excel data into a Java map (Map<String, ArrayList<String>>)

I am facing a weird problem here: I need to read Excel data that starts only after some rows.
My Excel input looks like this:
Row1 : This report is to display all the user details
Row2 : Kindly find the below details
Row3 : TABLE1 (This is the identifier; my table data starts after this row.)
Row4 : ID Name DOB
Row5 : 101 RAM 10-07-1986
Row6 : 102 Sita 24-08-1989
Row7 : Table2
Note: I need to read only Row4 to Row6.
I need output like the following in the map:
mymap [ID = [101,102], Name = [RAM,Sita], DOB = [10-07-1986,24-08-1989]]
I have tried the code below, which works absolutely fine if the first 3 rows are not there; it only causes an issue when the first 3 rows are present. Your help is much appreciated.
public static void main(String[] args) {
    try {
        File file = new File("C:\\demo\\employee.xlsx"); //creating a new file instance
        FileInputStream fis = new FileInputStream(file); //obtaining bytes from the file
        //creating Workbook instance that refers to .xlsx file
        XSSFWorkbook wb = new XSSFWorkbook(fis);
        XSSFSheet sheet = wb.getSheetAt(0); //creating a Sheet object to retrieve object
        Iterator<Row> itr = sheet.iterator(); //iterating over excel file
        // CAREFUL HERE! use LinkedHashMap to guarantee the insertion order!
        Map<String, List<String>> myMap = new LinkedHashMap<>();
        // populate map with headers and empty list
        if (itr.hasNext()) {
            Row row = itr.next();
            Iterator<Cell> headerIterator = row.cellIterator();
            while (headerIterator.hasNext()) {
                Cell cell = headerIterator.next();
                myMap.put(getCellValue(cell), new ArrayList<>());
            }
        }
        Iterator<List<String>> columnsIterator;
        // populate lists
        while (itr.hasNext()) {
            // get the list iterator every row to start from first list
            columnsIterator = myMap.values().iterator();
            Row row = itr.next();
            Iterator<Cell> cellIterator = row.cellIterator(); //iterating over each column
            while (cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                // here don't check hasNext() because if the file not contains problems
                // the # of columns is same as # of headers
                columnsIterator.next().add(getCellValue(cell));
            }
        }
        // here your map should be filled with data as expected
    } catch (Exception e) {
        e.printStackTrace();
    }
}
public static String getCellValue(Cell cell) {
    // POI 3.x int constants; POI 4+ uses the CellType enum (case STRING:, case NUMERIC:)
    switch (cell.getCellType()) {
        case Cell.CELL_TYPE_STRING: //field that represents string cell type
            return cell.getStringCellValue();
        case Cell.CELL_TYPE_NUMERIC: //field that represents number cell type
            // POI has no separate date cell type: dates are numeric cells with a date format
            if (DateUtil.isCellDateFormatted(cell)) {
                return cell.getDateCellValue().toString();
            }
            return String.valueOf(cell.getNumericCellValue());
        default:
            return "";
    }
}
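The code above treats the very first row as the header row, which is why the three leading rows break it. One possible approach is to skip rows until the TABLE1 marker, take the next row as the header row, and stop at the Table2 marker. Below is a minimal sketch of that logic, assuming each sheet row has already been converted to a list of strings (for example with getCellValue). The class and method names (TableSectionReader, readTable) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TableSectionReader {

    // Collects the columns of the table that follows startMarker and ends
    // at endMarker (or at the last row if endMarker never appears).
    static Map<String, List<String>> readTable(List<List<String>> rows,
                                               String startMarker, String endMarker) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        List<String> headers = null;
        boolean inTable = false;
        for (List<String> row : rows) {
            String first = row.isEmpty() ? "" : row.get(0).trim();
            if (!inTable) {
                // skip the preamble rows up to and including the start marker
                inTable = first.equalsIgnoreCase(startMarker);
                continue;
            }
            if (first.equalsIgnoreCase(endMarker)) {
                break; // the next table begins here
            }
            if (headers == null) {
                headers = row; // first row after the marker is the header row
                for (String header : headers) {
                    columns.put(header, new ArrayList<>());
                }
                continue;
            }
            for (int col = 0; col < headers.size() && col < row.size(); col++) {
                columns.get(headers.get(col)).add(row.get(col));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("This report is to display all the user details"),
                Arrays.asList("Kindly find the below details"),
                Arrays.asList("TABLE1"),
                Arrays.asList("ID", "Name", "DOB"),
                Arrays.asList("101", "RAM", "10-07-1986"),
                Arrays.asList("102", "Sita", "24-08-1989"),
                Arrays.asList("Table2"));
        // prints {ID=[101, 102], Name=[RAM, Sita], DOB=[10-07-1986, 24-08-1989]}
        System.out.println(readTable(rows, "TABLE1", "Table2"));
    }
}
```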

Reading excel cells containing alphanumeric data in Java

I am trying to get the values from cells in an excel spreadsheet that contain both letters and numbers. Below is a sample of my code:
String excelFilePath = "FILEPATH";
FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet firstSheet = workbook.getSheetAt(0);
Iterator<Row> rowIterator = firstSheet.iterator();
rowIterator.next(); // skip the first row
while (rowIterator.hasNext()) {
    Row nextRow = rowIterator.next();
    Iterator<Cell> cellIterator = nextRow.cellIterator();
    while (cellIterator.hasNext()) {
        Cell nextCell = cellIterator.next();
        int columnIndex = nextCell.getColumnIndex();
        switch (columnIndex) {
            case 0:
                String name = nextCell.getStringCellValue();
                break;
            case 1:
                Date enrollDate = nextCell.getDateCellValue();
                break;
            case 2:
                String userid = nextCell.getStringCellValue();
                break;
        }
    }
}
My question relates to case 2. User IDs can be letters, numbers or a mix of both, for example: AX77552, 112321, ASWTT. If I call getStringCellValue() I get an error, and I also get an error if I call getNumericCellValue(). Is there any way to read both numeric and string values from the cells in this column?
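In POI itself, DataFormatter.formatCellValue(cell) returns the displayed text of a cell whatever its type, which avoids choosing between the two getters. The idea behind it can be sketched without POI: treat the raw value as either text or a number and normalise both to a string. This is a minimal sketch under that assumption, not POI's implementation; CellText and asText are hypothetical names.

```java
import java.math.BigDecimal;

final class CellText {

    // Returns the text form of a raw cell value that may be a String or a Number.
    static String asText(Object raw) {
        if (raw == null) {
            return "";
        }
        if (raw instanceof Number) {
            // A numeric cell holding 112321 comes back as the double 112321.0;
            // strip the trailing ".0" so the ID reads as it was typed.
            return new BigDecimal(raw.toString()).stripTrailingZeros().toPlainString();
        }
        return raw.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(asText("AX77552")); // AX77552
        System.out.println(asText(112321.0));  // 112321
        System.out.println(asText("ASWTT"));   // ASWTT
    }
}
```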

How to get datavalidation source for a cell in java using poi?

I have defined a list of values, my_list, in one Excel sheet as follows:
In another Excel sheet, some cells reference that list so that it is shown as a dropdown in the cell, as follows:
Using POI, I go through the Excel sheet rows/columns and read it cell by cell.
I get the value of each cell using the method:
cell.getStringCellValue()
My question is: how do I get the name of the list, my_list, from the cell?
This problem consists of several separate problems.
First we need to get the sheet's data validations and then, for each data validation, the Excel cell ranges it applies to. If the cell is in one of those cell ranges and the data validation is a list constraint, proceed further. Otherwise return a default value.
If we have an explicit list like "item1, item2, item3, ...", return it.
Else, if a formula creates the list and formula1 is an area reference to a range in the same sheet, get all cells in that cell range, put their values in an array and return it.
Else, if a formula creates the list and formula1 is a reference to a defined name in Excel, get the Excel cell range the name refers to, get all cells in that cell range, put their values in an array and return it.
Complete example. The ExcelWorkbook contains the data validation in cell D1 of the first sheet.
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.ss.SpreadsheetVersion;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import java.io.FileInputStream;
import java.util.List;
public class ExcelGetDataValidationList {
static String[] getDataFromAreaReference(AreaReference areaReference, Sheet sheet) {
DataFormatter dataFormatter = new DataFormatter();
Workbook workbook = sheet.getWorkbook();
CellReference[] cellReferences = areaReference.getAllReferencedCells(); // get all cells in that cell range
String[] listValues = new String[cellReferences.length]; // and put their values in an array
for (int i = 0 ; i < cellReferences.length; i++) {
CellReference cellReference = cellReferences[i];
if (cellReference.getSheetName() == null) {
listValues[i] = dataFormatter.formatCellValue(
sheet.getRow(cellReference.getRow()).getCell(cellReference.getCol())
);
} else {
listValues[i] = dataFormatter.formatCellValue(
workbook.getSheet(cellReference.getSheetName()).getRow(cellReference.getRow()).getCell(cellReference.getCol())
);
}
}
return listValues;
}
static String[] getDataValidationListValues(Sheet sheet, Cell cell) {
List<? extends DataValidation> dataValidations = sheet.getDataValidations(); // get sheet's data validations
for (DataValidation dataValidation : dataValidations) {
CellRangeAddressList addressList = dataValidation.getRegions(); // get Excel cell ranges the data validation applies to
CellRangeAddress[] addresses = addressList.getCellRangeAddresses();
for (CellRangeAddress address : addresses) {
if (address.isInRange(cell)) { // if the cell is in that cell range
DataValidationConstraint constraint = dataValidation.getValidationConstraint();
if (constraint.getValidationType() == DataValidationConstraint.ValidationType.LIST) { // if it is a list constraint
String[] explicitListValues = constraint.getExplicitListValues(); // if we have a explicit list like "item1, item2, item3, ..."
if (explicitListValues != null) return explicitListValues; // then return this
String formula1 = constraint.getFormula1(); // else if we have a formula creating the list
System.out.println(formula1);
Workbook workbook = sheet.getWorkbook();
AreaReference areaReference = null;
try { // is formula1 a area reference?
areaReference = new AreaReference(formula1,
(workbook instanceof XSSFWorkbook)?SpreadsheetVersion.EXCEL2007:SpreadsheetVersion.EXCEL97
);
String[] listValues = getDataFromAreaReference(areaReference, sheet); //get data from that area reference
return listValues; // and return this
} catch (Exception ex) {
//ex.printStackTrace();
// do nothing as creating AreaReference had failed
}
List<? extends Name> names = workbook.getNames(formula1); // is formula1 a reference to a defined name in Excel?
for (Name name : names) {
String refersToFormula = name.getRefersToFormula(); // get the Excel cell range the name refers to
areaReference = new AreaReference(refersToFormula,
(workbook instanceof XSSFWorkbook)?SpreadsheetVersion.EXCEL2007:SpreadsheetVersion.EXCEL97
);
String[] listValues = getDataFromAreaReference(areaReference, sheet); //get data from that area reference
return listValues; // and return this
}
}
}
}
}
return new String[]{}; // by default return an empty array
}
public static void main(String[] args) throws Exception {
//String filePath = "ExcelWorkbook.xls";
String filePath = "ExcelWorkbook.xlsx";
Workbook workbook = WorkbookFactory.create(new FileInputStream(filePath));
Sheet sheet = workbook.getSheetAt(0);
Row row = sheet.getRow(0); if (row == null) row = sheet.createRow(0); // row 1
Cell cell = row.getCell(3); if (cell == null) cell = row.createCell(3); // cell D1
System.out.println(cell.getAddress() + ":" + cell);
String[] dataValidationListValues = getDataValidationListValues(sheet, cell);
for (String dataValidationListValue : dataValidationListValues) {
System.out.println(dataValidationListValue);
}
workbook.close();
}
}
Note: Current Excel versions allow a data validation list reference to be a direct area reference to another sheet without using a named range. But this is not something Apache POI can read; Apache POI is at Excel 2007 level only.
The my_list you mean is a defined name in Excel; honestly, I don't know whether Apache POI can do this or not. But this may be a clue: you can get the my_list formula using .getRefersToFormula(). Please try the code below:
String defineNameFromExcel = "my_list";
// myExcel is the already opened Workbook
List<? extends Name> definedNames = myExcel.getAllNames();
for (Name name : definedNames) {
    if (name.getNameName().equals(defineNameFromExcel)) {
        String sheetName = name.getSheetName();
        String range = name.getRefersToFormula();
        // keep only the part after the "!" that separates sheet name and range
        range = range.substring(range.lastIndexOf("!") + 1);
        System.out.println(sheetName);
        System.out.println(range);
    }
}
It prints the sheet name and the range; with that information you may be able to extract the values you want. Hope this helps.

Append Excel columns with XSSFWorkbook

I have an excel sheet, and I want to selectively transfer its content to a list. The object has 2 attributes, String id, String str.
I want to set the first column as id. I got this part right. I also want to append the values of column 3,4,6,7. For example, if my excel looks like:
4404A01459C1 || A1 || 13 || 14 || B1 || 8 || 7
I want 4404A01459C1 as id (again, I got this part). Then I want 13;14;8;7, skipping A1 and B1 and separating the values with ';'. How do I achieve this?
FileInputStream inputStream = new FileInputStream("D:\\work\\calculatepi\\test.xlsx");
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet firstSheet = workbook.getSheetAt(0);
Iterator<Row> rowIterator = firstSheet.iterator();
List<SampleGene> sgl=new ArrayList<SampleGene>();
while(rowIterator.hasNext()){
    Row row = rowIterator.next();
    Iterator<Cell> cellIterator = row.cellIterator();
    SampleGene sg = new SampleGene();
    sg.setId(row.getCell(0).toString());
    //need help here
    sgl.add(sg);
}
return null;
Try using a StringBuilder and iterate over cellIterator, appending each cell value to the StringBuilder.
StringBuilder sb = new StringBuilder();
while(cellIterator.hasNext())
{
    sb.append(cellIterator.next().toString());
    sb.append(";");
}
sg.setStr(sb.toString());
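Note that the loop above appends every cell, including the id, A1 and B1, and leaves a trailing ';'. To pick only columns 3, 4, 6 and 7 (zero-based indexes 2, 3, 5, 6), the cells can be selected by index and joined with a StringJoiner. This is a minimal sketch, assuming the row's cell values have already been read into a String array; ColumnJoiner and joinColumns are hypothetical names.

```java
import java.util.StringJoiner;

final class ColumnJoiner {

    // Joins the cells at the given zero-based column indexes with ';'.
    // Indexes outside the row and null cells are skipped.
    static String joinColumns(String[] cells, int... columns) {
        StringJoiner joiner = new StringJoiner(";");
        for (int col : columns) {
            if (col >= 0 && col < cells.length && cells[col] != null) {
                joiner.add(cells[col]);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String[] row = {"4404A01459C1", "A1", "13", "14", "B1", "8", "7"};
        System.out.println(joinColumns(row, 2, 3, 5, 6)); // 13;14;8;7
    }
}
```

StringJoiner places the separator only between elements, so there is no trailing ';' to trim afterwards.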
