Using Java to read and clean an Excel file

I'm trying to create a Java application that can read and clean an Excel file in .xlsx format (by comparing it to a master list, also in Excel format) before writing it to a new Excel file. I also need to be able to post the cleaned file to a web app, but that's for later.
I'm using Apache POI and have been able to read the Excel file successfully, although the output in the Eclipse console doesn't start from the first row.
Here's where I'm stuck. I've been able to read the rows using an iterator, but how do I extract all these values to variables? Can I create a data table?
import java.io.FileInputStream;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadAndCleanExcel {
    public void readExcel(String filePath) {
        Object[][] data = null;
        final DataFormatter df = new DataFormatter();
        // try-with-resources closes the workbook and the stream even if reading fails
        try (FileInputStream file = new FileInputStream(filePath);
             XSSFWorkbook workbook = new XSSFWorkbook(file)) {
            // Get first/desired sheet from the workbook
            XSSFSheet sheet = workbook.getSheetAt(0);
            // Iterate through the rows one by one
            Iterator<Row> rowIterator = sheet.iterator();
            int rownum = 0;
            int colnum = 0;
            // This call consumes the first row, which is why the printed output
            // below starts from the second row
            Row r = rowIterator.next();
            int rowcount = sheet.getLastRowNum();
            int colcount = r.getPhysicalNumberOfCells();
            data = new Object[rowcount][colcount];
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                // For each row, iterate through all the columns
                Iterator<Cell> cellIterator = row.cellIterator();
                colnum = 0;
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
                    // Format the cell as text regardless of its type
                    String text = df.formatCellValue(cell);
                    // Guard against rows or cells beyond the size computed above
                    if (rownum < rowcount && colnum < colcount) {
                        data[rownum][colnum] = text;
                    }
                    System.out.print(text.toUpperCase() + " ");
                    colnum++;
                }
                rownum++;
                System.out.println();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
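On the "can I create a data table?" part: one possible shape, sketched here with plain collections rather than POI types, is to treat the first row as headers and turn every later row into a header-to-value map. The String[][] grid stands in for the text that DataFormatter.formatCellValue would produce; the class name TableRows and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TableRows {
    /** Treats grid[0] as the header row and maps each later row to header -> value. */
    static List<Map<String, String>> toRecords(String[][] grid) {
        List<Map<String, String>> records = new ArrayList<>();
        if (grid.length == 0) {
            return records;
        }
        String[] headers = grid[0];
        for (int r = 1; r < grid.length; r++) {
            // A fresh map per row, so rows do not overwrite each other
            Map<String, String> record = new LinkedHashMap<>();
            for (int c = 0; c < headers.length; c++) {
                // Rows shorter than the header are padded with empty strings
                String value = c < grid[r].length ? grid[r][c] : "";
                record.put(headers[c], value);
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for formatted cell values
        String[][] grid = {
            {"id", "name"},
            {"1", "alice"},
            {"2"}
        };
        System.out.println(toRecords(grid));
    }
}
```

A list of maps is easy to compare against a master list later (for example, by looking up a key column in a Set), at the cost of repeating the header strings for every row.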

Related

I have a problem with uploading a file, with this error: Found interface org.apache.poi.util.POILogger, but class was expected

public List<Map<String, String>> parseExcelFileWithHeaders(
        MultipartFile file,
        List<String> headersList)
{
    List<Map<String, String>> parsedFile = new ArrayList<>();
    // try-with-resources closes the workbook even if parsing fails
    try (XSSFWorkbook workbook =
            new XSSFWorkbook(file.getInputStream())) {
        List<String> headers = headersList;
        // Get first/desired sheet from the workbook
        XSSFSheet sheet = workbook.getSheetAt(0);
        // Iterate through the rows one by one
        Iterator<Row> rowIterator = sheet.iterator();
        while (rowIterator.hasNext()) {
            Row row = rowIterator.next();
            // Create a new map per row; reusing one map would make every
            // list entry point at the same (last) row
            Map<String, String> cells = new HashMap<>();
            // For each row, iterate through all the columns
            Iterator<Cell> cellIterator = row.cellIterator();
            int colNum = 0;
            while (cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                // getCharValue is the poster's own helper (not shown here)
                cells.put(headers.get(colNum++),
                        getCharValue(cell).toString());
            }
            parsedFile.add(cells);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return parsedFile;
}
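A message of the form "Found interface X, but class was expected" is what the JVM reports for an IncompatibleClassChangeError: code compiled against one version of a class is running against another. Odds are that the POI jars on your classpath come from different releases (for example, poi and poi-ooxml at different versions), so one jar expects POILogger to be a class while the other supplies it as an interface, rather than the logging subsystem being misconfigured.
I would look outside of this block of code: check which POI artifacts your build pulls in, including transitive dependencies, and align them all to the same version.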
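The wording of the error ("Found interface ..., but class was expected") is what the JVM prints for an IncompatibleClassChangeError, so it can help to see which jar each POI class is actually loaded from. A minimal diagnostic sketch using only the JDK follows; the class name JarLocator is hypothetical, and the two POI class names are just examples taken from the error message and the code above.

```java
import java.security.CodeSource;

class JarLocator {
    /** Returns the location a class was loaded from, or a short note if it is unknown. */
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK have no code source
            if (source == null || source.getLocation() == null) {
                return "(bootstrap/JDK)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Example names: one class from the poi jar, one from poi-ooxml
        String[] names = {
            "org.apache.poi.util.POILogger",
            "org.apache.poi.xssf.usermodel.XSSFWorkbook"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

If the printed jar names carry different version numbers, that mismatch is the first thing to fix. Run it with the same classpath as the failing application, otherwise the result says nothing about the real deployment.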

Sort Excel by a column using shiftRows - Apache POI - XmlValueDisconnectedException

I have an XSSFWorkbook with n columns, and I need to sort the entire sheet by the first column.
I referred to this link but did not find any information about sorting.
I have also tried the code from here, but it throws an exception at
sheet.shiftRows(row2.getRowNum(), row2.getRowNum(), -1);
I am using Apache POI 3.17.
Does anyone have a suggestion or solution?
There seems to be a bug in POI when shifting rows. It was reportedly fixed in 3.9, but I used 3.17 and still hit it:
Exception in thread "main" org.apache.xmlbeans.impl.values.XmlValueDisconnectedException
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_orphaned(XmlObjectBase.java:1258)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTRowImpl.getR(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFRow.getRowNum(XSSFRow.java:394)
...
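I assume it is the same one you have, so I worked out another way:
Sort your rows, then create a new workbook and copy the rows into it in the correct order.
Then write this sorted workbook to the original file.
For simplicity, I assume all cell values are Strings (if not, modify accordingly). Note that a TreeMap keyed by the first column keeps only one row per distinct value, so this also assumes the first column contains no duplicates.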
private static final String FILE_NAME = "/home/userName/Workspace/fileToSort.xlsx";
public static void main(String[] args) {
Workbook originalWorkbook;
//create a workbook from your file
try(FileInputStream excelFile = new FileInputStream(new File(FILE_NAME))) {
originalWorkbook = new XSSFWorkbook(excelFile);
} catch (IOException e) {
throw new RuntimeException("Couldn't open file: " + FILE_NAME);
}
Sheet originalSheet = originalWorkbook.getSheetAt(0);
// Create a SortedMap<String, Row> where the key is the value of the first column
// This will automatically sort the rows
Map<String, Row> sortedRowsMap = new TreeMap<>();
// save headerRow
Row headerRow = originalSheet.getRow(0);
Iterator<Row> rowIterator = originalSheet.rowIterator();
// skip header row as we saved it already
rowIterator.next();
// sort the remaining rows
while(rowIterator.hasNext()) {
Row row = rowIterator.next();
sortedRowsMap.put(row.getCell(0).getStringCellValue(), row);
}
// Create a new workbook
try(Workbook sortedWorkbook = new XSSFWorkbook();
FileOutputStream out = new FileOutputStream(FILE_NAME)) {
Sheet sortedSheet = sortedWorkbook.createSheet(originalSheet.getSheetName());
// Copy all the sorted rows to the new workbook
// - header first
Row newRow = sortedSheet.createRow(0);
copyRowToRow(headerRow, newRow);
// then other rows, from row 1 up (not row 0)
int rowIndex = 1;
for(Row row : sortedRowsMap.values()) {
newRow = sortedSheet.createRow(rowIndex);
copyRowToRow(row, newRow);
rowIndex++;
}
// Write your new workbook to your file
sortedWorkbook.write(out);
} catch (Exception e) {
e.printStackTrace();
}
}
// Utility method to copy rows
private static void copyRowToRow(Row row, Row newRow) {
Iterator<Cell> cellIterator = row.cellIterator();
while(cellIterator.hasNext()) {
Cell cell = cellIterator.next();
// cellIterator skips empty cells, so reuse the original column index to keep columns aligned
Cell newCell = newRow.createCell(cell.getColumnIndex());
newCell.setCellValue(cell.getStringCellValue());
}
}
I tried it out on the following file:
A B
---------------
Header1 Header2
a one
c three
d four
b two
and it sorts it this way:
A B
---------------
Header1 Header2
a one
b two
c three
d four
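If the first column can contain repeated values, a map keyed by that column would keep only one row per value. A possible alternative for the sorting step, sketched with plain lists so it runs without POI: keep the header in place and sort the remaining rows with a stable comparator. The String[] rows stand in for the cell values of each sheet row, and the class name RowSorter is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RowSorter {
    /** Returns a new list: header first, then the other rows ordered by their first column. */
    static List<String[]> sortByFirstColumn(List<String[]> rows) {
        List<String[]> sorted = new ArrayList<>(rows);
        if (sorted.size() > 1) {
            // subList is a view, so sorting it reorders 'sorted' itself.
            // List.sort is stable: rows with equal keys keep their original order.
            sorted.subList(1, sorted.size())
                  .sort(Comparator.comparing((String[] row) -> row[0]));
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"Header1", "Header2"},
            new String[] {"a", "one"},
            new String[] {"c", "three"},
            new String[] {"d", "four"},
            new String[] {"b", "two"});
        for (String[] row : sortByFirstColumn(rows)) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

The sorted list can then be written out row by row with createRow, in the same way as the loop over sortedRowsMap.values() above.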

Is there any way to move the sheet's horizontal scroll bar one column to the left using POI?

I have written a program that reads an Excel template sheet in which the first column is hidden. My code unhides that column programmatically (so the columns start from A1).
I'm using Apache POI version 3.16.
When I open the file, it should show the columns starting from A1, but instead it starts from column B.
The code below works properly for XLS but does not work for the XLSX format:
sheet.showInPane(0, 0);
I have to move the horizontal scroll bar manually to see my first column. How can I scroll to the first column programmatically for the XLSX format?
Here is my full code:
public Workbook readWorkBookAndWriteErrors(String bufId,String inputFile, String ext) throws Exception {
Workbook workBook =null;
Sheet sheet = null;
if(GlobalVariables.EXCEL_FORMAT_XLS.equalsIgnoreCase(ext)){
// Get the workbook instance for XLS file
workBook = new HSSFWorkbook(new FileInputStream(inputFile));
}else{
// Get the workbook instance for XLSX file
workBook = new XSSFWorkbook(new FileInputStream(inputFile));
}
sheet = workBook.getSheetAt(0);
Row row = null;
if(sheet.isColumnHidden(0)){
sheet.setColumnHidden(0, false);
sheet.setActiveCell(new CellAddress("A1"));
sheet.showInPane(0, 0);
sheet.createFreezePane(0, 1);
Iterator<Row> rowIterator = sheet.iterator();
int rowIndex = 1;
while (rowIterator.hasNext()) {
row = rowIterator.next();
if(rowIndex == 1){
rowIndex++;
continue;
}
Cell cell = row.createCell(0);
cell.setCellValue("error message");
rowIndex++;
}
}
return workBook;
}
Here is the answer to my question. Please refer to this source.
public Workbook readWorkBookAndWriteErrors(String bufId,String inputFile, String ext) throws Exception {
Workbook workBook =null;
Sheet sheet = null;
if(GlobalVariables.EXCEL_FORMAT_XLS.equalsIgnoreCase(ext)){
// Get the workbook instance for XLS file
workBook = new HSSFWorkbook(new FileInputStream(inputFile));
}else{
// Get the workbook instance for XLSX file
workBook = new XSSFWorkbook(new FileInputStream(inputFile));
}
sheet = workBook.getSheetAt(0);
Row row = null;
if(sheet.isColumnHidden(0)){
sheet.setColumnHidden(0, false);
if(sheet instanceof XSSFSheet){
CTWorksheet ctWorksheet = null;
CTSheetViews ctSheetViews = null;
CTSheetView ctSheetView = null;
XSSFSheet tempSheet = (XSSFSheet) sheet;
// First step is to get at the CTWorksheet bean underlying the worksheet.
ctWorksheet = tempSheet.getCTWorksheet();
// From the CTWorksheet, get at the sheet views.
ctSheetViews = ctWorksheet.getSheetViews();
// Grab a single sheet view from that array
ctSheetView = ctSheetViews.getSheetViewArray(ctSheetViews.sizeOfSheetViewArray() - 1);
// Set the address of the top left-hand cell.
ctSheetView.setTopLeftCell("A1");
}else{
sheet.setActiveCell(new CellAddress("A1"));
sheet.showInPane(0, 0);
}
Iterator<Row> rowIterator = sheet.iterator();
int rowIndex = 1;
while (rowIterator.hasNext()) {
row = rowIterator.next();
if(rowIndex == 1){
rowIndex++;
continue;
}
Cell cell = row.createCell(0);
cell.setCellValue("error message");
rowIndex++;
}
}
return workBook;
}
If you are using the latest NPOI for C#, I made this utility function:
/// <summary>Set view position at given coordinates.</summary>
/// <param name="sheet">A reference to the Excel sheet.</param>
/// <param name="topLeftCell">The coordinates of the cell that will show in top left corner when you open the Excel sheet (example: "A1").</param>
/// <param name="onlyFirstView">If true, only the first sheet view will have its position adjusted. If false, every view of the sheet will have its position adjusted.</param>
public static void SetViewPosition(ref NPOI.SS.UserModel.ISheet sheet, string topLeftCell = "A1", bool onlyFirstView = true)
{
NPOI.OpenXmlFormats.Spreadsheet.CT_Worksheet worksheet = (NPOI.OpenXmlFormats.Spreadsheet.CT_Worksheet)(typeof(NPOI.XSSF.UserModel.XSSFSheet).GetField("worksheet", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance)?.GetValue((NPOI.XSSF.UserModel.XSSFSheet)sheet));
if (worksheet?.sheetViews?.sheetView != null && worksheet.sheetViews.sheetView.Count > 0)
{
if (onlyFirstView)
worksheet.sheetViews.sheetView[0].topLeftCell = topLeftCell;
else
foreach (NPOI.OpenXmlFormats.Spreadsheet.CT_SheetView view in worksheet.sheetViews.sheetView)
view.topLeftCell = topLeftCell;
}
}
Usage example:
NPOI.SS.UserModel.ISheet mySheet = myWorkbook.GetSheetAt(0);
// First sheet from our workbook opens with view set to top left corner (cell A1 visible in top left corner).
SetViewPosition(ref mySheet, "A1");
The code below worked for me to move the horizontal scrollbar to the required position.
It uses the Apache POI libraries (poi, poi-ooxml).
((XSSFSheet)sheet).getCTWorksheet().getSheetViews().getSheetViewArray(0).setTopLeftCell("AE11");

Row count displayed as -1 while reading downloaded xlsx file in Java

I am trying to read a specific row of an xlsx file, with the row number provided as a parameter.
I get a NullPointerException when I use sheet.getRow(RowNum). I can read the xlsx file if I manually adjust a column width and save it again, but that defeats the purpose of automation. I can read any other xlsx files that were created manually.
Here is the sample code:
public String readCouponCode(int getRowCount) {
try {
Registration reg=new Registration();
File inputFile = new File(this.DownloadFile);
System.out.println(DownloadFile);
// Get the workbook instance for XLSX file
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(inputFile));
// Get first sheet from the workbook
XSSFSheet sheet = wb.getSheetAt(0);
// Get iterator to all the rows in current sheet
Iterator<Row> rowIterator = sheet.iterator();
// Traversing over each row of XLSX file
Row row = sheet.getRow(getRowCount);
// For each row, iterate through each columns
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
int cellIndex = cell.getColumnIndex();
// System.out.println(cellIndex);
if (cellIndex == 0) {
CouponCode = cell.getStringCellValue();
System.out.println(cell.getStringCellValue() + "\t");
}
}
} catch (Exception e) {
System.err.println("Exception :" + e.getMessage());
}
return CouponCode;
}
I also tried with XSSFRow, but it yields the same result.
Note: when I comment out the sheet.getRow() line, it prints only the last row. When I get the number of rows with sheet.getPhysicalNumberOfRows(), it returns 1, but my xlsx file actually has 7 rows.
Jars used:
dom4j
poi-3.13-20150929
poi-excelant-3.13-20150929
poi-ooxml-3.13-20150929
poi-ooxml-schemas-3.13-20150929
xmlbeans-2.5.0

Getting incorrect value from Excel using Apache POI

In Excel the value of the column is 10101101010000000000, but when I read it in Java using POI the value changes to 10101101009999999000. Can anyone explain what is going on and how I can get the exact value from Excel?
I've tried setting the cell type to string and using cell.getStringCellValue(), and also new BigDecimal(cell.getNumericCellValue()).toPlainString(), but I still don't get the same value as in Excel.
Here's my code:
List<BankClassVo> data = new ArrayList<BankClassVo>();
FileInputStream fis = new FileInputStream(new File(Constant.VALIDATION_REFERENCE_FILE_PATH + Constant.BANK_CLASSIFICATION_REF_FILE + ".xlsx"));
XSSFWorkbook myWorkBook = new XSSFWorkbook(fis);
XSSFSheet mySheet = myWorkBook.getSheetAt(1);
Iterator<Row> rowIterator = mySheet.iterator();
while (rowIterator.hasNext()) {
Row row = rowIterator.next();
Iterator<Cell> cellIterator = row.cellIterator();
BankClassVo vo = new BankClassVo ();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
if (cell.getColumnIndex() == 0) {
vo.setsClass(new BigDecimal(cell.getNumericCellValue()).toPlainString());
}
else if (cell.getColumnIndex() == 1) {
vo.setClassification(cell.getStringCellValue());
}
}
data.add(vo);
}
myWorkBook.close();
return data;
Use MathContext and RoundingMode to round the value back to the digits you expect (here, 10 significant digits):
BigDecimal value = new BigDecimal(cell.getNumericCellValue(), new MathContext(10 , RoundingMode.CEILING));
System.out.println(value.toPlainString());
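A precision of 10 with RoundingMode.CEILING happens to fit this particular value. A slightly more general sketch, assuming the cells hold ordinary Excel numbers: Excel stores numbers as doubles and keeps about 15 significant digits, so rounding the exact double to 15 significant digits usually recovers the digits shown in the sheet. The class and method names below are hypothetical, and the literal stands in for cell.getNumericCellValue().

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class ExcelNumberText {
    // Assumption: 15 significant digits, matching Excel's numeric precision
    private static final MathContext EXCEL_PRECISION =
            new MathContext(15, RoundingMode.HALF_UP);

    /** Renders a double read from a cell as plain digits, rounded to 15 significant digits. */
    static String toPlainText(double cellValue) {
        return new BigDecimal(cellValue)
                .round(EXCEL_PRECISION)
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        // Stand-in for cell.getNumericCellValue()
        double cellValue = 10101101010000000000d;
        // Exact binary value of the double, which differs from the typed digits
        System.out.println(new BigDecimal(cellValue).toPlainString());
        // Rounded back to 15 significant digits
        System.out.println(toPlainText(cellValue));
    }
}
```

If the column is really an identifier rather than a quantity, storing it as text in the spreadsheet avoids the floating-point round trip altogether.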
