Insert multiple copied paragraphs in XWPFDocument - java
I am trying to copy paragraphs of an XWPFDocument using Apache POI. Since POI has no method to insert a pre-made paragraph at an arbitrary point, I've read plenty of answers suggesting to first insert a throwaway paragraph using insertNewParagraph(), then replace that temporary paragraph with the one I actually want using setParagraph(). This is further complicated by the fact that insertNewParagraph() can't simply take the desired index into the body's list of elements (the way XWPFTable.addRow(row, pos) does); you must pass it an XmlCursor instead.
I created TestIn.docx as a test document with 6 paragraphs: A, B, C, D, E, F.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.IBodyElement;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.xmlbeans.XmlCursor;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
public class ParagraphIssue
{
    public void debugElement (IBodyElement elem, StringBuilder s, XWPFParagraph a, XWPFParagraph b, XWPFParagraph c, XWPFParagraph d, XWPFParagraph e, XWPFParagraph f,
        XWPFParagraph t1, XWPFParagraph r1, XWPFParagraph t2, XWPFParagraph r2)
    {
        if (s.length () > 0) s.append (" ");
        if (elem == a) s.append ("A");
        else if (elem == b) s.append ("B");
        else if (elem == c) s.append ("C");
        else if (elem == d) s.append ("D");
        else if (elem == e) s.append ("E");
        else if (elem == f) s.append ("F");
        else if (elem == t1) s.append ("T1");
        else if (elem == r1) s.append ("R1");
        else if (elem == t2) s.append ("T2");
        else if (elem == r2) s.append ("R2");
        else s.append ("U");
    }
    public void debug (XWPFDocument doc, XWPFParagraph a, XWPFParagraph b, XWPFParagraph c, XWPFParagraph d, XWPFParagraph e, XWPFParagraph f,
        XWPFParagraph t1, XWPFParagraph r1, XWPFParagraph t2, XWPFParagraph r2)
    {
        StringBuilder s = new StringBuilder ();
        for (IBodyElement elem : doc.getBodyElements ())
            debugElement (elem, s, a, b, c, d, e, f, t1, r1, t2, r2);
        System.out.println("Elements: " + s);
        s = new StringBuilder ();
        for (XWPFParagraph para : doc.getParagraphs ())
            debugElement (para, s, a, b, c, d, e, f, t1, r1, t2, r2);
        System.out.println("Paragraphs: " + s);
    }
    public void run (XWPFDocument doc, int insertionPoint)
    {
        XWPFParagraph paraA = doc.getParagraphs().get(0);
        XWPFParagraph paraB = doc.getParagraphs().get(1);
        XWPFParagraph paraC = doc.getParagraphs().get(2);
        XWPFParagraph paraD = doc.getParagraphs().get(3);
        XWPFParagraph paraE = doc.getParagraphs().get(4);
        XWPFParagraph paraF = doc.getParagraphs().get(5);
        System.out.println ("--- Document initial state ---");
        debug (doc, paraA, paraB, paraC, paraD, paraE, paraF, null, null, null, null);
        // Clone the first paragraph
        XWPFParagraph cloneThis = (XWPFParagraph) doc.getBodyElements ().get (0);
        XWPFParagraph clonedPara = new XWPFParagraph ((CTP) cloneThis.getCTP ().copy (), doc);
        // Add new paragraph before the final paragraph
        XWPFParagraph insertBeforePara = (XWPFParagraph) doc.getBodyElements ().get (insertionPoint);
        XmlCursor cursor = insertBeforePara.getCTP ().newCursor ();
        XWPFParagraph newPara = doc.insertNewParagraph (cursor);
        newPara.insertNewRun (0).setText ("this should get replaced");
        System.out.println ("--- Insert 1st temporary para before F ---");
        debug (doc, paraA, paraB, paraC, paraD, paraE, paraF, newPara, clonedPara, null, null);
        int newParaIndex = 0;
        for (IBodyElement elem : doc.getBodyElements ())
        {
            if (elem == newPara)
                break;
            else if (elem.getElementType () == newPara.getElementType ())
                newParaIndex++;
        }
        System.out.println ("1st temporary para is at index " + newParaIndex); // 5, as expected
        // Now replace the added paragraph with the cloned one
        doc.setParagraph (clonedPara, newParaIndex);
        System.out.println ("--- Replace 1st temporary para ---");
        debug (doc, paraA, paraB, paraC, paraD, paraE, paraF, newPara, clonedPara, null, null);
        // Do exactly the same thing again to clone the second paragraph
        XWPFParagraph cloneThis2 = (XWPFParagraph) doc.getBodyElements ().get (1);
        XWPFParagraph clonedPara2 = new XWPFParagraph ((CTP) cloneThis2.getCTP ().copy (), doc);
        XWPFParagraph insertBeforePara2 = (XWPFParagraph) doc.getBodyElements ().get (insertionPoint + 1);
        XmlCursor cursor2 = insertBeforePara2.getCTP ().newCursor ();
        XWPFParagraph newPara2 = doc.insertNewParagraph (cursor2);
        newPara2.insertNewRun (0).setText ("this should get replaced too");
        System.out.println ("--- Insert 2nd temporary para before F ---");
        debug (doc, paraA, paraB, paraC, paraD, paraE, paraF, newPara, clonedPara, newPara2, clonedPara2);
        int newParaIndex2 = 0;
        for (IBodyElement elem : doc.getBodyElements ())
        {
            if (elem == newPara2)
                break;
            else if (elem.getElementType () == newPara2.getElementType ())
                newParaIndex2++;
        }
        System.out.println ("2nd temporary para is at index " + newParaIndex2);
        doc.setParagraph (clonedPara2, newParaIndex2); // So then this replaces the wrong paragraph
        System.out.println ("--- Replace 2nd temporary para ---");
        debug (doc, paraA, paraB, paraC, paraD, paraE, paraF, newPara, clonedPara, newPara2, clonedPara2);
    }
    public final static void main (final String [] args)
    {
        try (FileInputStream in = new FileInputStream ("W:\\TestIn.docx"))
        {
            XWPFDocument doc = new XWPFDocument (in);
            new ParagraphIssue ().run (doc, 5);
            try (FileOutputStream out = new FileOutputStream ("W:\\TestOut.docx"))
            {
                doc.write (out);
            }
        }
        catch (Exception e)
        {
            e.printStackTrace ();
        }
    }
}
A lot of this is debug code, so that I get output showing exactly what's happening:
--- Document initial state ---
Elements: A B C D E F
Paragraphs: A B C D E F
--- Insert 1st temporary para before F ---
Elements: A B C D E T1 F
Paragraphs: A B C D E T1 F
1st temporary para is at index 5 - perfect so far
--- Replace 1st temporary para ---
Elements: A B C D E T1 F
Paragraphs: A B C D E R1 F - The list of paragraphs has the replacement paragraph, but the list of elements still has the temporary paragraph
--- Insert 2nd temporary para before F ---
Elements: A B C D E T1 T2 F
Paragraphs: T2 A B C D E R1 F - now the 2nd temporary paragraph has gone to the front of the list; it's in the correct place in the list of elements
2nd temporary para is at index 6
--- Replace 2nd temporary para ---
Elements: A B C D E T1 T2 F
Paragraphs: T2 A B C D E R2 F - List of elements still contains temporary paragraphs; List of paragraphs has 2nd paragraph in wrong place
Amazingly, the saved Word doc actually looks correct, but I don't understand how when neither list looks correct.
As far as finding where to do the insert goes, so far I could have used int newParaIndex = doc.getPosOfParagraph (newPara);. The problem with this comes when you add tables into the mix. I edited the source doc and inserted a table, so the list of elements now looks like A, B, (table), C, D, E, F, and changed insertionPoint to 6 accordingly.
Now you can no longer use doc.getPosOfParagraph () directly, as it returns the index of the paragraph in the list of elements (including tables), but setParagraph needs the index of the paragraph in the list of paragraphs (excluding tables). Using doc.getParagraphPos() to compensate for this returns 0 for the 2nd inserted temporary paragraph because, as you can see in the output above, that's literally where it is in the list of paragraphs. So I worked around this by walking the elements list and counting only the paragraphs, as you can see in the code.
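In other words, the workaround is an index translation from "position among all body elements" to "position among paragraphs only". A minimal plain-Java sketch of just that counting logic, with no POI involved; the Element class and the paragraph/table helpers are hypothetical stand-ins for IBodyElement, not POI API:

```java
import java.util.List;

class ParagraphIndexSketch {
    // Hypothetical stand-in for a body element; real code would use IBodyElement.
    static final class Element {
        final boolean isParagraph;
        final String label;

        Element(boolean isParagraph, String label) {
            this.isParagraph = isParagraph;
            this.label = label;
        }
    }

    static Element paragraph(String label) { return new Element(true, label); }
    static Element table(String label) { return new Element(false, label); }

    // Counts the paragraphs that precede target in the body element list.
    // For a paragraph target this is its index in the paragraphs-only list.
    // Returns -1 if target is not in the list at all.
    static int paragraphIndexOf(List<Element> bodyElements, Element target) {
        int paragraphsSeen = 0;
        for (Element element : bodyElements) {
            if (element == target) {
                return paragraphsSeen;
            }
            if (element.isParagraph) {
                paragraphsSeen++;
            }
        }
        return -1;
    }
}
```

For a body of A B U C D E T1 F (U being the table), T1 sits at element index 6 but at paragraph index 5, which is the value setParagraph is described as needing.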
Running again with the table added (this is the 'U' in the debug output):
--- Document initial state ---
Elements: A B U C D E F
Paragraphs: A B C D E F
--- Insert 1st temporary para before F ---
Elements: A B U C D E T1 F
Paragraphs: A B C D E T1 F
--- Replace 1st temporary para ---
Elements: A B U C D E T1 F
Paragraphs: A B C D E R1 F
--- Insert 2nd temporary para before F ---
Elements: A B U C D E T1 T2 F
Paragraphs: T2 A B C D E R1 F
2nd temporary para is at index 6
--- Replace 2nd temporary para ---
Elements: A B U C D E T1 T2 F
Paragraphs: T2 A B C D E R2 F
Again this does actually generate the correct output in the saved doc. My questions are:
Is there a better way to do this that fixes the screwiness of temporary paragraphs being replaced in one list but not the other, and of the 2nd temporary paragraph showing up at the front of the list? For example, should I re-use the same XmlCursor to insert the 2nd temporary paragraph? Should I create all the temporary paragraphs in one go and then replace them all in one hit afterwards, rather than doing one at a time? Would anything like this help?
When I try this approach in our real app, Word complains the document is corrupted. It offers to attempt to recover it, and if I click Yes then it opens and the content and all the copied paragraphs all look correct, but the odd behaviour here is causing the corrupt doc warning.
Related
Alloy API throws a Null when executing alloy command
I have been using the Alloy API, which can be called from Java. My goal is to compile an Alloy model, display it visually, and narrow down the search for instances. At this point, I need to execute the commands in an Alloy source file, which either runs correctly or throws a NullPointerException, depending on the source. I have inspected the API classes in the Eclipse debugger, but I cannot make sense of what I see. The issue is: the debugger shows that a java.lang.NullPointerException occurs in TranslateAlloyToKodkod.execute_command. According to the Alloy API documentation, TranslateAlloyToKodkod.execute_command returns null if the user chose "save to FILE" as the SAT solver, and nonnull if the solver finishes the entire solving and is either satisfiable or unsatisfiable. But I never changed the execution options to use "save to FILE" as the SAT solver. For your information, the Alloy Analyzer finishes solving both of the following sources. Could you let me know how to fix the problem?
Here is the Java code I created, with some additions from the API example:
import java.io.File;
import edu.mit.csail.sdg.alloy4.A4Reporter;
import edu.mit.csail.sdg.alloy4.Err;
import edu.mit.csail.sdg.alloy4.ErrorWarning;
import edu.mit.csail.sdg.alloy4compiler.ast.Command;
import edu.mit.csail.sdg.alloy4compiler.ast.Module;
import edu.mit.csail.sdg.alloy4compiler.parser.CompUtil;
import edu.mit.csail.sdg.alloy4compiler.translator.A4Options;
import edu.mit.csail.sdg.alloy4compiler.translator.A4Solution;
import edu.mit.csail.sdg.alloy4compiler.translator.TranslateAlloyToKodkod;
import edu.mit.csail.sdg.alloy4viz.VizGUI;
public final class exportXML {
    private static String outputfilepath;
    public static void main(String[] args) throws Err {
        VizGUI viz = null;
        A4Reporter rep = new A4Reporter() {
            @Override public void warning(ErrorWarning msg) {
                System.out.print("Relevance Warning:\n"+(msg.toString().trim())+"\n\n");
                System.out.flush();
            }
        };
        String args_filename = args[0];
        String[] path_split = args_filename.split("/");
        int pos_fname = path_split.length -1;
        String[] filename_split = path_split[pos_fname].split("\\.");
        for ( int i=0; i<filename_split.length; i++ ) {
            System.out.println(filename_split[i]);
        }
        String dir = "";
        for ( int i = 0; i < path_split.length - 1; i++ ) {
            dir = dir.concat(path_split[i]) + "/";
        }
        String out_fname = "Instance_of_" + filename_split[0];
        outputfilepath = dir + out_fname;
        File outdir = new File(outputfilepath);
        outdir.mkdir();
        for(String filename:args) {
            System.out.println("=========== parse + typechecking: "+filename+" =============");
            Module world = CompUtil.parseEverything_fromFile(rep, null, filename);
            A4Options options = new A4Options();
            options.solver = A4Options.SatSolver.SAT4J;
            for (Command command: world.getAllCommands()) {
                System.out.println("=========== command : "+command+" ============");
                A4Solution ans = TranslateAlloyToKodkod.execute_command(rep, world.getAllReachableSigs(), command, options);
                System.out.println(ans);
                if (ans.satisfiable()) {
                    int cnt = 1;
                    A4Solution tmp = ans.next();
                    while ( tmp.satisfiable() ) {
                        tmp = tmp.next();
                        cnt++;
                    }
                    System.out.println("=========== "+cnt+" satisfiable solution found ============");
                    tmp = ans;
                    String[] outXml = new String[cnt];
                    for ( int i = 0; i < cnt; i++ ) {
                        outXml[i] = outputfilepath + "/" + out_fname + String.valueOf(i+1) + ".xml";
                        tmp.writeXML(outXml[i]);
                        tmp = tmp.next();
                    }
                }
            }
        }
    }
}
This is a sample Alloy source that executes successfully:
module adressBook
open ordering [Book]
abstract sig Target {}
sig Addr extends Target {}
abstract sig Name extends Target {}
sig Alias, Group extends Name {}
sig Book {
    names: set Name,
    addr: names -> some Target
} {
    no n: Name | n in n.^(addr)
    all a: Alias | lone a.addr
}
pred add (b, b': Book, n: Name, t: Target) {
    t in Addr or some lookup [b, t]
    b'.addr = b.addr + n -> t
}
pred del (b, b': Book, n: Name, t: Target) {
    no b.addr.n or some n.(b.addr) - t
    b'.addr = b.addr - n -> t
}
fun lookup (b: Book, n: Name): set Addr {
    n.^(b.addr) & Addr
}
pred init (b: Book) {no b.addr}
fact traces {
    init [first]
    all b: Book - last | let b' = next [b] |
        some n: Name, t: Target | add [b, b', n, t] or del [b, b', n, t]
}
pred show {}
run show for 10
assert lookupYields {
    all b: Book, n: b.names | some lookup [b, n]
}
check lookupYields for 3 but 4 Book
check lookupYields for 6
This is the Alloy source that fails to execute (it throws a NullPointerException):
sig Element {}
one sig Group {
    elements: set Element,
    unit: one elements,
    mult: elements -> elements -> one elements,
    inv: elements -> one elements
}
fact NoRedundantElements {
    all e: Element | e in Group.elements
}
fact UnitLaw1 {
    all a: Group.elements | Group.mult [a] [Group.unit] = a
}
fact UnitLaw2 {
    all a: Group.elements | Group.mult [Group.unit] [a] = a
}
fact AssociativeLaw {
    all a: Group.elements | all b: Group.elements | all c:Group.elements |
        Group.mult [Group.mult [a] [b]] [c] = Group.mult [a] [Group.mult [b] [c]]
}
fact InvLaw1{
    all a: Group.elements | Group.mult [Group.inv[a]] [a] = Group.unit
}
assert InvLaw2 {
    all a: Group.elements | Group.mult [a] [Group.inv[a]] = Group.unit
}
check InvLaw2
assert Commutativity {
    all a: Group.elements | all b: Group.elements | Group.mult [a] [b] = Group.mult [b] [a]
}
check Commutativity for 6
pred subgroup (g: set Element, h: set Element) {
    (all a: g | a in h) and (Group.unit in g) and (all a, b: g | Group.mult [a] [b] in g) and (all a: g | Group.inv[a] in g)
}
pred regularSubgroup(n: set Element, g: set Element) {
    subgroup [n, g] and (all n0: n, g0: g | Group.mult [Group.mult [g0] [n0]] [Group.inv[g0]] in n)
}
pred main(n1: set Element, n2: set Element) {
    let g = Group.elements |
        regularSubgroup [n1, g] and (some g0: g | (not g0 in n1)) and regularSubgroup [n2, n1] and (some n10: n1 | (not n10 in n2)) and (not regularSubgroup [n2, g])
}
run main for 8
I think this should be reported as an issue on the https://github.com/alloytools/org.alloytools.alloy site? Preferably with a PR that fixes it.
XWPFDocument - replacing text doesn't work
I want to replace tags with values in a docx document. Here is a line of the document:
<Site_rattachement>, le <date_avenant>
I want to replace <Site_rattachement> and <date_avenant> with some values. My code:
doc = new XWPFDocument(OPCPackage.open(docxFile));
for (XWPFParagraph p : doc.getParagraphs()) {
    List<XWPFRun> runs = p.getRuns();
    if (runs != null) {
        for (XWPFRun r : runs) {
            String text = r.getText(0);
            replaceIfNeeded(r, text, my_value);
        }
    }
}
But the first r.getText(0) gives me < instead of <Site_rattachement>. The next occurrence gives me Site_rattachement. The one after that gives me >. Is there something wrong with my docx file?
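One likely explanation (an assumption here, since the file itself isn't shown) is that Word stored the tag as several runs, so no single run contains the whole tag. Below is a minimal plain-Java sketch of a common workaround, with runs modelled as plain strings and a hypothetical helper named replaceAcrossRuns. It joins the run texts, replaces the tag, puts the result in the first run and blanks the rest, so any formatting differences between the runs are lost:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RunTextSketch {
    // Runs are modelled as strings; real code would read and write XWPFRun text.
    static List<String> replaceAcrossRuns(List<String> runTexts, String tag, String value) {
        if (runTexts.isEmpty()) {
            return new ArrayList<>();
        }
        // Join all run texts so a tag split across runs becomes visible.
        String joined = String.join("", runTexts);
        if (!joined.contains(tag)) {
            return new ArrayList<>(runTexts); // nothing to replace, keep runs as they are
        }
        // First run receives the whole replaced text, the remaining runs are emptied.
        List<String> result = new ArrayList<>(Collections.nCopies(runTexts.size(), ""));
        result.set(0, joined.replace(tag, value));
        return result;
    }
}
```

With the runs reported in the question ("<", "Site_rattachement", ">, le ", ...), searching each run separately never matches the tag, while the joined text does.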
Insert a bulleted list from an ArrayList Apache POI XWPF
I have an array list that I want to use to create a new bullet list inside a document. I already have numbered lists, and I want to have both kinds (numbered and bulleted) in different lists. My document is pre-populated with some data, and I have some tokens that determine where my data goes. For my list, I have a token like this one, and I am able to reach it:
{{tokenlist1}}
I want to do one of the following:
first option: reach my token, create a new bullet list and delete my token
second option: replace my token with my first element and continue my bullet list.
It would be really appreciated if the bullet shape (square, round, check, ...) could stay the same as it is on the token's line.
EDIT: for those who want an answer, here's my solution.
Action
Map<String, Object> replacements = new HashMap<String, Object>();
replacements.put("{{token1}}", "texte changé 1");
replacements.put("{{token2}}", "ici est le texte du token numéro 2");
replacements.put("{{tokenList1}}", tokenList1);
replacements.put("{{tokenList2}}", tokenList1);
templateWithToken = reportService.findAndReplaceToken(replacements, templateWithToken);
Service
public XWPFDocument findAndReplaceToken (Map<String, Object> replacements, XWPFDocument document) {
    List<XWPFParagraph> paragraphs = document.getParagraphs();
    for (int i = 0; i < paragraphs.size(); i++) {
        XWPFParagraph paragraph = paragraphs.get(i);
        List<XWPFRun> runs = paragraph.getRuns();
        for (Map.Entry<String, Object> replPair : replacements.entrySet()) {
            String find = replPair.getKey();
            Object repl = replPair.getValue();
            TextSegment found = paragraph.searchText(find, new PositionInParagraph());
            if (found != null) {
                if (repl instanceof String) {
                    replaceText(found, runs, find, repl);
                } else if (repl instanceof ArrayList<?>) {
                    Iterator<?> iterArrayList = ((ArrayList) repl).iterator();
                    boolean isPassed = false;
                    while (iterArrayList.hasNext()) {
                        Object object = (Object) iterArrayList.next();
                        if (isPassed == false) {
                            replaceText(found, runs, find, object.toString());
                        } else {
                            XWPFRun run = paragraph.createRun();
                            run.addCarriageReturn();
                            run.setText(object.toString());
                        }
                        isPassed = true;
                    }
                }
            }
        }
    }
    return document;
}
private void replaceText(TextSegment found, List<XWPFRun> runs, String find, Object repl) {
    int biginRun = found.getBeginRun();
    int biginRun2 = found.getEndRun();
    if (found.getBeginRun() == found.getEndRun()) {
        // whole search string is in one Run
        XWPFRun run = runs.get(found.getBeginRun());
        String runText = run.getText(run.getTextPosition());
        String replaced = runText.replace(find, repl.toString());
        run.setText(replaced, 0);
    } else {
        // The search string spans over more than one Run
        // Put the Strings together
        StringBuilder b = new StringBuilder();
        for (int runPos = found.getBeginRun(); runPos <= found.getEndRun(); runPos++) {
            XWPFRun run = runs.get(runPos);
            b.append(run.getText(run.getTextPosition()));
        }
        String connectedRuns = b.toString();
        String replaced = connectedRuns.replace(find, repl.toString());
        // The first Run receives the replaced String of all
        // connected Runs
        XWPFRun partOne = runs.get(found.getBeginRun());
        partOne.setText(replaced, 0);
        // Removing the text in the other Runs.
        for (int runPos = found.getBeginRun() + 1; runPos <= found.getEndRun(); runPos++) {
            XWPFRun partNext = runs.get(runPos);
            partNext.setText("", 0);
        }
    }
}
How to read word document and get parts of it with all styles using docx4j
I am using docx4j to deal with Word document formatting. I have one Word document which is divided into a number of tables. I want to read all the tables, and if I find certain keywords, I want to copy those contents to another Word document with all the formatting. My Word document is as follows. From the above, I want to take the content that is below Some Title. Here my keyword is Sample Text. So whenever Sample Text is repeated, the content needs to be fetched into a new Word document. I am using the following code.
MainDocumentPart mainDocumentPart = null;
WordprocessingMLPackage docxFile = WordprocessingMLPackage.load(new File(fileName));
mainDocumentPart = docxFile.getMainDocumentPart();
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
ClassFinder finder = new ClassFinder(Tbl.class);
new TraversalUtil(mainDocumentPart.getContent(), finder);
Tbl tbl = null;
int noTbls = 0;
int noRows = 0;
int noCells = 0;
int noParas = 0;
int noTexts = 0;
for (Object table : finder.results) {
    noTbls++;
    tbl = (Tbl) table;
    // Get all the Rows in the table
    List<Object> allRows = DocxUtility.getDocxUtility().getAllElementFromObject(tbl, Tr.class);
    for (Object row : allRows) {
        Tr tr = (Tr) row;
        noRows++;
        // Get all the Cells in the Row
        List<Object> allCells = DocxUtility.getDocxUtility().getAllElementFromObject(tr, Tc.class);
        toCell: for (Object cell : allCells) {
            Tc tc = (Tc) cell;
            noCells++;
            // Get all the Paragraph's in the Cell
            List<Object> allParas = DocxUtility.getDocxUtility().getAllElementFromObject(tc, P.class);
            for (Object para : allParas) {
                P p = (P) para;
                noParas++;
                // Get all the Run's in the Paragraph
                List<Object> allRuns = DocxUtility.getDocxUtility().getAllElementFromObject(p, R.class);
                for (Object run : allRuns) {
                    R r = (R) run;
                    // Get the Text in the Run
                    List<Object> allText = DocxUtility.getDocxUtility().getAllElementFromObject(r, Text.class);
                    for (Object text : allText) {
                        noTexts++;
                        Text txt = (Text) text;
                    }
                    System.out.println("No of Text in Para No: " + noParas + "are: " + noTexts);
                }
            }
            System.out.println("No of Paras in Cell No: " + noCells + "are: " + noParas);
        }
        System.out.println("No of Cells in Row No: " + noRows + "are: " + noCells);
    }
    System.out.println("No of Rows in Table No: " + noTbls + "are: " + noRows);
}
System.out.println("Total no of Tables: " + noTbls );
Assuming your text is in a single run (ie not split across runs), then you can search for it via XPath. Or you can manually traverse using TraversalUtil. See docx4j's Getting Started for more info. So finding your stuff is pretty easy. Copying the formatting it uses, and any rels in it, is in the general case, complicated. See my post http://www.docx4java.org/blog/2010/11/merging-word-documents/ for more on the issues involved.
How can I update an existing entry in the csv file
I need to update an existing entry in a CSV file using Java code. The code that I have written is below. I'm able to match the entry given by the user with the entry in the file, but can't figure out how to write a new entry at the same location.
while ((row = reader.readNext()) != null) {
    for (int i = 0; i < 1; i++) {
        System.out.print("row is "+row[i]);
        // display CSV values
        System.out.println("Cell Value: " + row[i]);
        System.out.println("User Input: " + t1);
        System.out.println("-------------");
        if(t1.equals(row[0])) {
            data.add(new String[] { t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11, t12, t13, t14, t15, t16, t17, t18, t19, t20, t21, t22, t23, t24, t26, t27, t28, t29, t30, t31, t32, t33, t34, t35, t36, t37, t38, t39 });
            flag=1;
            writer.writeAll(data);
            break;
        }
    }
    rowno++;
}
This is what you need to do to write a new value at the same location:
1. Have two files: the input one to read from, and a new one to write to.
2. In the loop, read a line from the input file and, if it doesn't match, write it to the output file unchanged.
3. If the entry is found in the input file, write the new entry to the output file instead.
4. Finish looping through the file.
5. Close the streams.
6. Rename the output file to the name of the input file.
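The steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a drop-in replacement for the CSV reader/writer used in the question: the class name CsvEntryUpdater and method replaceRow are hypothetical, rows are split on plain commas (no quoted fields or embedded newlines), and only the first row whose first column matches is replaced.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CsvEntryUpdater {
    // Copies csvFile to a temporary file, swapping in newRow for the first row
    // whose first column equals key, then moves the temporary file over the original.
    // Returns true if a row was replaced.
    // Assumption: simple comma-separated rows without quoting.
    static boolean replaceRow(Path csvFile, String key, String[] newRow) throws IOException {
        Path dir = csvFile.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "csv-update", ".tmp");
        boolean replaced = false;
        // try-with-resources closes both streams before the file is moved
        try (BufferedReader reader = Files.newBufferedReader(csvFile, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String firstColumn = line.split(",", -1)[0];
                if (!replaced && firstColumn.equals(key)) {
                    writer.write(String.join(",", newRow)); // matching row: write the new entry
                    replaced = true;
                } else {
                    writer.write(line); // non-matching row: copy unchanged
                }
                writer.newLine();
            }
        }
        Files.move(temp, csvFile, StandardCopyOption.REPLACE_EXISTING);
        return replaced;
    }
}
```

Writing to a temporary file in the same directory and moving it into place at the end means the original file is only overwritten once the full copy has been written.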