Java code to extract the check boxes from a Word document - java

package parser;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import org.apache.poi.xwpf.usermodel.IBodyElement;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

public class App {
    public static void main(String[] args) {
        List<List<List<String>>> tablesResults = new LinkedList<>();
        try {
            XWPFDocument doc = new XWPFDocument(new FileInputStream("filename"));
            List<IBodyElement> documentBody = doc.getBodyElements();
            for (IBodyElement i: documentBody){
                if (i.getElementType() == org.apache.poi.xwpf.usermodel.BodyElementType.TABLE){
                    XWPFTable table = (XWPFTable) i;
                    List<XWPFTableRow> tableRows = table.getRows();
                    List<List<String>> tableList = new LinkedList<>();
                    for (XWPFTableRow r: tableRows){
                        List<String> rowList = new LinkedList<>();
                        for (XWPFTableCell cell: r.getTableCells()){
                            rowList.add(cell.getText());
                        }
                        tableList.add(rowList);
                    }
                    tablesResults.add(tableList);
                }
            }
            for (List<List<String>> table: tablesResults){
                for(List<String> row: table){
                    for(String cell: row){
                        System.out.print(cell + ", ");
                    }
                    System.out.println();
                }
                System.out.println("-------------------------");
            }
        } catch (IOException ex) {
            System.out.println("Exception:");
            System.out.println(ex.toString());
        }
    }
}
I am not able to extract the checkboxes from the table cells, or the contents of another table. At present I am using Apache POI. I need your suggestions and help to parse the data from a Word document; in the next step I am going to compare this tabular data with another Word document.
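A possible starting point, offered as a rough sketch rather than a tested POI solution: cell.getText() only returns text, so the checkbox state would have to be read from the underlying XML. The sketch assumes the checkboxes are Word content controls (w14:checkbox elements, whose state sits in the w14:val attribute of a w14:checked child); legacy form-field checkboxes or plain symbol characters are stored differently and are not covered. It uses only the JDK's DOM parser on a hand-written fragment; the class and method names are made up for illustration, and in a real document the XML would come from word/document.xml inside the .docx.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class CheckboxStateSketch {
    // Namespace of the Word 2010 extensions that define w14:checkbox.
    static final String W14 = "http://schemas.microsoft.com/office/word/2010/wordml";

    /** Returns one boolean per w14:checkbox element, in document order. */
    static List<Boolean> readCheckboxStates(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Boolean> states = new ArrayList<>();
            NodeList boxes = dom.getElementsByTagNameNS(W14, "checkbox");
            for (int i = 0; i < boxes.getLength(); i++) {
                Element box = (Element) boxes.item(i);
                NodeList checked = box.getElementsByTagNameNS(W14, "checked");
                boolean on = false;
                if (checked.getLength() > 0) {
                    // The state is stored as w14:val="1" (checked) or "0" (unchecked).
                    String val = ((Element) checked.item(0)).getAttributeNS(W14, "val");
                    on = "1".equals(val) || "true".equals(val);
                }
                states.add(on);
            }
            return states;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML fragment", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written fragment standing in for part of word/document.xml.
        String xml = "<root xmlns:w14=\"" + W14 + "\">"
                + "<w14:checkbox><w14:checked w14:val=\"1\"/></w14:checkbox>"
                + "<w14:checkbox><w14:checked w14:val=\"0\"/></w14:checkbox>"
                + "</root>";
        System.out.println(readCheckboxStates(xml));
    }
}
```

The same lookup could probably be done on the XML behind each XWPFTableCell, but that part is untested here.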

Related

Update a list inside iterator loop

This code tries to iterate over an Excel file and load the data into a List<List<String>>, but it throws java.lang.NullPointerException at resultList.add(rrow), and it is not clear what the problem is:
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReadExcel {
    public List<List<String>> resultList;

    public List<List<String>> ReadExcelToList(String csvPath) throws IOException {
        try {
            FileInputStream excelFile = new FileInputStream(csvPath);
            Workbook workbook = new XSSFWorkbook(excelFile);
            System.out.println(workbook.getSheetName(0));
            Sheet datatypeSheet = workbook.getSheetAt(0);
            Iterator<Row> iterator = datatypeSheet.iterator();
            while (iterator.hasNext()) {
                Row currentRow = iterator.next();
                Iterator<Cell> cellIterator = currentRow.iterator();
                List<String> rrow = new ArrayList<>();
                while (cellIterator.hasNext()) {
                    Cell currentCell = cellIterator.next();
                    switch (currentCell.getCellType()) {
                        case Cell.CELL_TYPE_STRING:
                            rrow.add(currentCell.getStringCellValue());
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            rrow.add(String.valueOf(currentCell.getNumericCellValue()));
                            break;
                    }
                }
                resultList.add(rrow);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return resultList;
    }
}
java.lang.NullPointerException
at ReadExcel.readExcelToList(ReadExcel.java:40)
at BasicCSVReader.main(BasicCSVReader.java:35)
Your resultList was never created (it's null). You can fix it by defining it as follows:
public List<List<String>> resultList = new ArrayList<>();
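To illustrate the fix without POI, here is a minimal sketch using only the JDK. The class name and the addRow helper are made up for the example; addRow stands in for the per-row work done inside the sheet loop. The only point is that the list is created where it is declared, so add() never runs on null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResultListSketch {
    // Initialized at declaration, so resultList is never null when add() is called.
    private final List<List<String>> resultList = new ArrayList<>();

    // Hypothetical stand-in for the work done per row in the POI loop.
    void addRow(String... cells) {
        resultList.add(new ArrayList<>(Arrays.asList(cells)));
    }

    List<List<String>> getResultList() {
        return resultList;
    }

    public static void main(String[] args) {
        ResultListSketch reader = new ResultListSketch();
        reader.addRow("id", "name");
        reader.addRow("1", "Alice");
        System.out.println(reader.getResultList());
    }
}
```

Creating the list as a local variable inside the method and returning it would work just as well, and avoids keeping rows from a previous call.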

How to read and replace bookmark values with apache POI

I'm a complete novice with Apache POI and I have already tried several things. My problem is that I have a few bookmarks in a docx file and I want to replace their values.
I have got as far as adding text at the bookmark, but the previous value is still there.
My code:
InputStream fis = new FileInputStream(fileName);
XWPFDocument document = new XWPFDocument(fis);
List<XWPFParagraph> paragraphs = document.getParagraphs();
for (XWPFParagraph paragraph : paragraphs)
{
    //Here you have your paragraph;
    CTP ctp = paragraph.getCTP();
    // Get all bookmarks and loop through them
    List<CTBookmark> bookmarks = ctp.getBookmarkStartList();
    for(CTBookmark bookmark : bookmarks)
    {
        if(bookmark.getName().equals("Firma1234"))
        {
            System.out.println(bookmark.getName());
            XWPFRun run = paragraph.createRun();
            run.setText(lcFirma);
            ctp.getDomNode().insertBefore(run.getCTR().getDomNode(), bookmark.getDomNode());
        }
    }
}
OutputStream out = new FileOutputStream(output);
document.write(out);
document.close();
out.close();
The value of "lcFirma" is "Firma".
The value of the bookmark is "Testmark".
My docx file before:
Testmark -> name=Firma1234
My docx file after:
FirmaTestmark
As I said, the text is inserted before the value of the bookmark instead of replacing it. How do I replace the text instead?
Greetings,
Kevin
I also had a similar requirement: setting the "Default text" field of a .docx bookmark. I was not able to do that, so I used a workaround: I replaced the entire paragraph containing the bookmark with text. So, instead of the bookmark being populated with a default text, I had a paragraph that held the bookmarked text. In my case the .docx finally had to be converted to a .pdf file, so the absence of the bookmark did not matter; the presence of the correct text was more important.
This is how I did it with Apache POI:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.apache.commons.lang3.StringUtils;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBookmark;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;

/**
 * @author binita.bharati@gmail.com
 *
 * This code will replace a bookmark with plain text. A bookmark is seen as a "Text Form Field" in a .docx file.
 */
public class BookmarkReplacer {
    public static void main(String[] args) throws Exception {
        replaceBookmark();
    }

    // Drops the placeholder characters of the form field and puts bookmarkTxt in their place.
    private static String replaceBookmarkedPara(String input, String bookmarkTxt) {
        char[] tmp = input.toCharArray();
        StringBuilder sb = new StringBuilder();
        int bookmarkedCharCount = 0;
        for (int i = 0 ; i < tmp.length ; i++) {
            int charCode = tmp[i];
            // 8194 is U+2002 (EN SPACE), the placeholder character of the form field.
            if (charCode == 8194) {
                bookmarkedCharCount ++;
                // Insert the replacement once, when the fifth placeholder character is reached.
                if (bookmarkedCharCount == 5) {
                    sb.append(bookmarkTxt);
                }
            }
            else {
                sb.append(tmp[i]);
            }
        }
        return sb.toString();
    }

    private static void removeAllRuns(XWPFParagraph paragraph) {
        int size = paragraph.getRuns().size();
        for (int i = 0; i < size; i++) {
            paragraph.removeRun(0);
        }
    }

    // Re-creates the paragraph text as one run per line of the replacement text.
    private static void insertReplacementRuns(XWPFParagraph paragraph, String replacedText) {
        String[] replacementTextSplitOnCarriageReturn = StringUtils.split(replacedText, "\n");
        for (int j = 0; j < replacementTextSplitOnCarriageReturn.length; j++) {
            String part = replacementTextSplitOnCarriageReturn[j];
            XWPFRun newRun = paragraph.insertNewRun(j);
            newRun.setText(part);
            if (j+1 < replacementTextSplitOnCarriageReturn.length) {
                newRun.addCarriageReturn();
            }
        }
    }

    public static void replaceBookmark () throws Exception
    {
        InputStream fis = new FileInputStream("C:\\input.docx");
        XWPFDocument document = new XWPFDocument(fis);
        List<XWPFParagraph> paragraphs = document.getParagraphs();
        for (XWPFParagraph paragraph : paragraphs)
        {
            //Here you have your paragraph;
            CTP ctp = paragraph.getCTP();
            // Get all bookmarks and loop through them
            List<CTBookmark> bookmarks = ctp.getBookmarkStartList();
            for(CTBookmark bookmark : bookmarks)
            {
                if(bookmark.getName().equals("data_incipit") || bookmark.getName().equals("incipit_Codcli")
                        || bookmark.getName().equals("Incipit_titolo"))
                {
                    String paraText = paragraph.getText();
                    System.out.println("paraText = "+paraText +" for bookmark name "+bookmark.getName());
                    String replacementText = replaceBookmarkedPara(paraText, "haha");
                    removeAllRuns(paragraph);
                    insertReplacementRuns(paragraph, replacementText);
                }
            }
        }
        OutputStream out = new FileOutputStream("C:\\output.docx");
        document.write(out);
        document.close();
        out.close();
    }
}
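The string handling in replaceBookmarkedPara can be tried on its own, without POI. The sketch below is a simplified copy of that logic under one assumption: that the paragraph text contains a run of five U+2002 (EN SPACE, decimal 8194) characters where the form field sits, which is what the code above appears to rely on. The class and method names are made up for the example.

```java
class EnSpacePlaceholderSketch {
    // U+2002 EN SPACE, decimal 8194: the character the answer's code compares against.
    static final char EN_SPACE = '\u2002';

    /** Drops every EN SPACE and inserts the replacement when the fifth one is seen. */
    static String replacePlaceholder(String paragraphText, String replacement) {
        StringBuilder sb = new StringBuilder();
        int seen = 0;
        for (char c : paragraphText.toCharArray()) {
            if (c == EN_SPACE) {
                seen++;
                if (seen == 5) {
                    sb.append(replacement);
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String paragraph = "Company: \u2002\u2002\u2002\u2002\u2002 Ltd";
        System.out.println(replacePlaceholder(paragraph, "haha"));
    }
}
```

Two limits follow from the counting: a paragraph with fewer than five EN SPACE characters loses them without getting the replacement, and a paragraph with two form fields only gets the replacement at the first one.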
Try the code below:
// Assumes the surrounding class has an XWPFDocument field named "document".
private List<XWPFParagraph> collectParagraphs()
{
    List<XWPFParagraph> paragraphs = new ArrayList<>();
    paragraphs.addAll(this.document.getParagraphs());
    for (XWPFTable table : this.document.getTables())
    {
        for (XWPFTableRow row : table.getRows())
        {
            for (XWPFTableCell cell : row.getTableCells())
                paragraphs.addAll(cell.getParagraphs());
        }
    }
    return paragraphs;
}

public List<String> getBookmarkNames()
{
    List<String> bookmarkNames = new ArrayList<>();
    // Step through the XWPFParagraph objects (body and table cells)
    // one at a time.
    for (XWPFParagraph para : collectParagraphs())
    {
        // Get the CTBookmark objects that the paragraph
        // 'contains' and step through these one at a time.
        for (CTBookmark bookmark : para.getCTP().getBookmarkStartList())
        {
            bookmarkNames.add(bookmark.getName());
        }
    }
    return bookmarkNames;
}

Apache poi get table from text box

I'm using Apache POI to iterate over the tables in a docx file. Everything works fine, but if a table is inside a text box, my code doesn't see it: table.size() is 0.
XWPFDocument doc = new XWPFDocument(new FileInputStream(fileName));
List<XWPFTable> table = doc.getTables();
for (XWPFTable xwpfTable : table) {
    List<XWPFTableRow> row = xwpfTable.getRows();
    for (XWPFTableRow xwpfTableRow : row) {
        List<XWPFTableCell> cell = xwpfTableRow.getTableCells();
        for (XWPFTableCell xwpfTableCell : cell) {
            if(xwpfTableCell != null){
                List<XWPFTable> itable = xwpfTableCell.getTables();
                if(itable.size()!=0){
                    for (XWPFTable xwpfiTable : itable) {
                        List<XWPFTableRow> irow = xwpfiTable.getRows();
                        for (XWPFTableRow xwpfiTableRow : irow) {
                            List<XWPFTableCell> icell = xwpfiTableRow.getTableCells();
                            for (XWPFTableCell xwpfiTableCell : icell) {
                                if(xwpfiTableCell!=null){
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
The following code does low-level parsing of a *.docx document and gets all the tables in its document body.
The approach uses an org.apache.xmlbeans.XmlCursor to search for all w:tbl elements in document.xml. Each one found is added to a List<CTTbl>.
Because a text box rectangle shape provides fall-back content in document.xml, we need to skip the mc:Fallback elements. Otherwise we would get the tables within the text boxes twice.
Finally we go through the List<CTTbl> and get the contents of all the tables.
import java.io.*;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTTbl;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTRow;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTTc;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTR;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTText;
import org.apache.xmlbeans.impl.values.XmlAnyTypeImpl;
import org.apache.xmlbeans.XmlCursor;
import javax.xml.namespace.QName;
import java.util.List;
import java.util.ArrayList;

public class WordReadAllTables {
    public static void main(String[] args) throws Exception {
        XWPFDocument document = new XWPFDocument(new FileInputStream("22.docx"));
        CTBody ctbody = document.getDocument().getBody();
        XmlCursor xmlcursor = ctbody.newCursor();
        QName qnameTbl = new QName("http://schemas.openxmlformats.org/wordprocessingml/2006/main", "tbl", "w");
        QName qnameFallback = new QName("http://schemas.openxmlformats.org/markup-compatibility/2006", "Fallback", "mc");
        List<CTTbl> allCTTbls = new ArrayList<CTTbl>();
        while (xmlcursor.hasNextToken()) {
            XmlCursor.TokenType tokentype = xmlcursor.toNextToken();
            if (tokentype.isStart()) {
                if (qnameTbl.equals(xmlcursor.getName())) {
                    if (xmlcursor.getObject() instanceof CTTbl) {
                        allCTTbls.add((CTTbl)xmlcursor.getObject());
                    } else if (xmlcursor.getObject() instanceof XmlAnyTypeImpl) {
                        allCTTbls.add(CTTbl.Factory.parse(xmlcursor.getObject().newInputStream()));
                    }
                } else if (qnameFallback.equals(xmlcursor.getName())) {
                    xmlcursor.toEndToken();
                }
            }
        }
        for (CTTbl cTTbl : allCTTbls) {
            StringBuffer tableHTML = new StringBuffer();
            tableHTML.append("<table>\n");
            for (CTRow cTRow : cTTbl.getTrList()) {
                tableHTML.append(" <tr>\n");
                for (CTTc cTTc : cTRow.getTcList()) {
                    tableHTML.append(" <td>");
                    for (CTP cTP : cTTc.getPList()) {
                        for (CTR cTR : cTP.getRList()) {
                            for (CTText cTText : cTR.getTList()) {
                                tableHTML.append(cTText.getStringValue());
                            }
                        }
                    }
                    tableHTML.append("</td>");
                }
                tableHTML.append("\n </tr>\n");
            }
            tableHTML.append("</table>");
            System.out.println(tableHTML);
        }
        document.close();
    }
}
This code needs the full jar of all of the schemas, ooxml-schemas-1.3.jar, as mentioned in faq-N10025.
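The skip-the-fallback idea can also be shown without POI or XMLBeans. The sketch below is not the answer's code; it swaps the XmlCursor for a plain recursive walk with the JDK's DOM parser, applied to a hand-written fragment that mimics the mc:AlternateContent structure of a text box. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class TableCollectorSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String MC = "http://schemas.openxmlformats.org/markup-compatibility/2006";

    /** Collects every w:tbl element, ignoring anything inside mc:Fallback. */
    static List<Element> collectTables(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<Element> tables = new ArrayList<>();
            walk(root, tables);
            return tables;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML fragment", e);
        }
    }

    private static void walk(Node node, List<Element> tables) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        String ns = node.getNamespaceURI();
        String name = node.getLocalName();
        if (MC.equals(ns) && "Fallback".equals(name)) {
            return; // fallback content repeats the text box tables, so skip the subtree
        }
        if (W.equals(ns) && "tbl".equals(name)) {
            tables.add((Element) node);
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, tables);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:body xmlns:w=\"" + W + "\" xmlns:mc=\"" + MC + "\">"
                + "<w:tbl/>"
                + "<mc:AlternateContent>"
                + "<mc:Choice><w:tbl/></mc:Choice>"
                + "<mc:Fallback><w:tbl/></mc:Fallback>"
                + "</mc:AlternateContent>"
                + "</w:body>";
        System.out.println(collectTables(xml).size() + " tables found");
    }
}
```

Like the cursor loop, this walk also descends into table cells, so nested tables end up in the list as separate entries.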

Apache POI Java Read using Member Class Variable

Hi, how can I create dynamic objects in a loop to store multiple values in each object, and then access those objects to manipulate them?
Is it possible to give dynamic variable names for object creation? Can I put a dynamically built variable name on the left-hand side of an assignment? Please ask me to edit if the question is not clear; if a solution is already available, please point me to it.
package poi;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ExcelRead_UsingMember {
    public static int C, R, i;
    public static double ID;
    private static final String FILE_READ = "C:/Users/m93162/ApachePOI_Excel_Workspace/MyFirstExcel.xlsx";
    //private static final String FILE_WRITE = "C:/Users/m93162/ApachePOI_Excel_Workspace/WriteExcel.xlsx";

    public static void main(String[] args) {
        List<Member> listOfMembers = new ArrayList<Member>();
        try {
            FileInputStream excelFile = new FileInputStream(new File(FILE_READ));
            Workbook workbook = new XSSFWorkbook(excelFile);
            Sheet datatypeSheet = workbook.getSheetAt(0);
            Iterator<Row> iterator = datatypeSheet.iterator();
            while (iterator.hasNext()) {
                Row currentRow = iterator.next();
                Iterator<Cell> cellIterator = currentRow.iterator();
                int i=0;
                while (cellIterator.hasNext()) {
                    Member [member+i] = new Member();
                    // QUESTION: I want to create dynamic objects here and store the values below dynamically. How to approach this.
                    Cell currentCell = cellIterator.next();
                    if (currentCell.getCellTypeEnum() == CellType.STRING) {
                        System.out.print(currentCell.getStringCellValue() + "--");
                    } else if (currentCell.getCellTypeEnum() == CellType.NUMERIC) {
                        System.out.print(currentCell.getNumericCellValue() + "--");
                    }
                }
                System.out.println();
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
You can use a map.
DynamicObject:
import java.util.HashMap;
import java.util.Map;

public class DynamicObject {
    private final Map<String, Object> map = new HashMap<>();

    public void addProperty(String key, Object value) {
        map.put(key, value);
    }
}
Your code:
// Creates one DynamicObject per cell and collects them in a list,
// instead of trying to build variable names at runtime.
private List<DynamicObject> processRow(Row row) {
    List<DynamicObject> dynamicObjects = new ArrayList<>();
    Iterator<Cell> cellIterator = row.iterator();
    while (cellIterator.hasNext()) {
        DynamicObject dynamicObject = new DynamicObject();
        dynamicObjects.add(dynamicObject);
        Cell currentCell = cellIterator.next();
        if (currentCell.getCellTypeEnum() == CellType.STRING) {
            System.out.print(currentCell.getStringCellValue() + "--");
            dynamicObject.addProperty("stringField", currentCell.getStringCellValue());
        } else if (currentCell.getCellTypeEnum() == CellType.NUMERIC) {
            System.out.print(currentCell.getNumericCellValue() + "--");
            dynamicObject.addProperty("numericField", currentCell.getNumericCellValue());
        }
    }
    return dynamicObjects;
}
Then you can traverse the keys of the map to get all possible values. You can also store other objects (for example, nested maps) inside the map as values.
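As a rough illustration of traversing the keys and nesting a map, here is a JDK-only sketch. The class name, the getProperties accessor and the sample values are made up for the example; a LinkedHashMap is used here (an assumption, the answer uses HashMap) only so that keys come back in insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DynamicObjectSketch {
    // LinkedHashMap keeps the keys in the order they were added.
    private final Map<String, Object> properties = new LinkedHashMap<>();

    void addProperty(String key, Object value) {
        properties.put(key, value);
    }

    // Hypothetical accessor so callers can walk over the stored keys.
    Map<String, Object> getProperties() {
        return properties;
    }

    public static void main(String[] args) {
        DynamicObjectSketch member = new DynamicObjectSketch();
        member.addProperty("stringField", "Alice");
        member.addProperty("numericField", 42.0);

        // A nested map stored as a value.
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Berlin");
        member.addProperty("address", address);

        // Traverse the keys to reach every stored value.
        for (String key : member.getProperties().keySet()) {
            System.out.println(key + " = " + member.getProperties().get(key));
        }
    }
}
```

Because the values are typed as Object, the caller has to cast them back (for example to String or Map) before using them, which is the price of this kind of flexibility.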

Issue with inserting string into excel sheet

I am trying to create an Excel file that holds names and IDs as values.
The lines I have marked in bold (between *** markers) give me errors in my program. Kindly help me find out what the mistake might be.
If possible, also help me with code that writes only the name, as a string, into the Excel file.
package demos;

import jxl.*;
import jxl.write.WritableSheet;
import jxl.write.WritableWorkbook;
import jxl.write.WriteException;
import java.io.*;
import java.util.*;
import com.sun.rowset.internal.Row;
import jxl.CellView;
import jxl.Workbook;
import jxl.WorkbookSettings;
import jxl.write.Label;
import jxl.write.WritableCellFormat;
import jxl.write.biff.RowsExceededException;

public class StringInp
{
    public static void main(String[] args) throws IOException
    {
        try
        {
            String filename="C:\\virclipse\\input.xls";
            WritableWorkbook wb=Workbook.createWorkbook(new File(filename));
            //Create a blank sheet
            WritableSheet sheet = wb.createSheet("Employee Data",0);
            //This data needs to be written (Object[])
            Map<String, Object[]> data = new TreeMap<String, Object[]>();
            data.put("1", new Object[] {"ID", "NAME"});
            data.put("2", new Object[] {101, "Shivany"});
            data.put("3", new Object[] {102, "Nalini"});
            data.put("4", new Object[] {103, "John"});
            data.put("5", new Object[] {104, "Ayush"});
            //Iterate over data and write to sheet
            Set<String> keyset = data.keySet();
            int rownum = 0;
            for (String key : keyset)
            {
                ***Row row = sheet.createRow(rownum++);***
                Object [] objArr = data.get(key);
                int cellnum = 0;
                for (Object obj : objArr)
                {
                    ***Cell cell = row.createCell(cellnum++);***
                    if(obj instanceof String)
                    {
                        ***cell.setCellValue((String)obj);***
                    }
                    else if(obj instanceof Integer)
                    {
                        ***cell.setCellValue((Integer)obj);***
                    }
                }
                //Write the workbook in file system
                FileOutputStream out = new FileOutputStream(new File("filename"));
                wb.write();
                wb.close();
            }
        }
        catch(WriteException e)
        {
            System.out.println("there is an error");
        }
    }
}
You are using the right imports, but you also need to have the required jxl (JExcel API) jar in your build path.
You can read up on what the JExcel API is and follow a tutorial, and you can also download the API. Please download jexcelapi_2_6_12.zip; after downloading, extract it and put jxl.jar in your build path. Those errors should then go away.
