Replace an image with Apache POI


I need help replacing an image with another image in a Word document using Apache POI, or any other library that can do the job. I know how to replace a word using Apache POI, but I can't figure out how to replace an image.
public static void main(String[] args) throws FileNotFoundException {
String c22 = "OTHER WORD";
try {
XWPFDocument doc = new XWPFDocument(OPCPackage.open("imagine.docx"));
for (XWPFParagraph p : doc.getParagraphs()) {
List<XWPFRun> runs = p.getRuns();
if (runs != null) {
for (XWPFRun r : runs) {
String text = r.getText(0);
if (text != null ) {
String imgFile = "imaginedeschis.jpg";
try (FileInputStream is = new FileInputStream(imgFile)) {
r.addPicture(is, XWPFDocument.PICTURE_TYPE_JPEG, imgFile,
Units.toEMU(200), Units.toEMU(200)); // 200x200 points (Units.toEMU expects points, not pixels)
text = text.replace("1ST WORD", c22);
}
r.setText(text, 0);
}
}
}
}
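try (FileOutputStream out = new FileOutputStream("output.docx")) {
doc.write(out);
}
} catch (InvalidFormatException | IOException m) {
m.printStackTrace(); // report the failure instead of swallowing it
}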
}

I am using the Java code below to replace an image in a Word document (*.docx). Please share if anyone has a better approach.
public XWPFDocument replaceImage(XWPFDocument document, String imageOldName, String imagePathNew, int newImageWidth, int newImageHeight) throws Exception {
try {
LOG.info("replaceImage: old=" + imageOldName + ", new=" + imagePathNew);
int imageParagraphPos = -1;
XWPFParagraph imageParagraph = null;
List<IBodyElement> documentElements = document.getBodyElements();
for (int i = 0; i < documentElements.size(); i++) {
IBodyElement documentElement = documentElements.get(i);
if (documentElement instanceof XWPFParagraph) {
XWPFParagraph candidate = (XWPFParagraph) documentElement;
// keep the paragraph only if its XML mentions the old image name,
// so imageParagraph stays null when nothing matches
if (candidate.getCTP() != null && candidate.getCTP().toString().contains(imageOldName)) {
imageParagraph = candidate;
imageParagraphPos = i;
break;
}
}
}
if (imageParagraph == null) {
throw new Exception("Unable to replace image data due to the exception:\n"
+ "'" + imageOldName + "' not found in document.");
}
ParagraphAlignment oldImageAlignment = imageParagraph.getAlignment();
// remove old image
document.removeBodyElement(imageParagraphPos);
// now add new image
// BELOW LINE WILL CREATE AN IMAGE
// PARAGRAPH AT THE END OF THE DOCUMENT.
// REMOVE THIS IMAGE PARAGRAPH AFTER
// SETTING THE NEW IMAGE AT THE OLD IMAGE POSITION
XWPFParagraph newImageParagraph = document.createParagraph();
XWPFRun newImageRun = newImageParagraph.createRun();
//newImageRun.setText(newImageText);
newImageParagraph.setAlignment(oldImageAlignment);
try (FileInputStream is = new FileInputStream(imagePathNew)) {
newImageRun.addPicture(is, XWPFDocument.PICTURE_TYPE_JPEG, imagePathNew,
Units.toEMU(newImageWidth), Units.toEMU(newImageHeight));
}
// set new image at the old image position
document.setParagraph(newImageParagraph, imageParagraphPos);
// NOW REMOVE REDUNDANT IMAGE FROM THE END OF DOCUMENT
document.removeBodyElement(document.getBodyElements().size() - 1);
return document;
} catch (Exception e) {
throw new Exception("Unable to replace image '" + imageOldName + "' due to the exception:\n" + e);
} finally {
// cleanup code
}
}
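A note on the width and height arguments: Units.toEMU takes its argument in points, not pixels, so newImageWidth and newImageHeight above are interpreted as points. The arithmetic behind the conversion can be sketched in plain Java as below. The class and method names are made up for illustration; the two constants are the usual EMU factors (12700 EMU per point, 9525 EMU per pixel at 96 dpi). POI's Units class also appears to offer a pixelToEMU helper if you really want pixel sizes.

```java
// Illustrative stand-in for the EMU arithmetic; not part of Apache POI.
final class EmuSketch {
    static final int EMU_PER_POINT = 12700; // 914400 EMU per inch / 72 points per inch
    static final int EMU_PER_PIXEL = 9525;  // 914400 EMU per inch / 96 pixels per inch

    // Convert a size given in points to EMU.
    static int pointsToEmu(double points) {
        return (int) Math.rint(EMU_PER_POINT * points);
    }

    // Convert a size given in pixels (at 96 dpi) to EMU.
    static int pixelsToEmu(int pixels) {
        return pixels * EMU_PER_PIXEL;
    }

    public static void main(String[] args) {
        System.out.println("200 points = " + pointsToEmu(200) + " EMU");
        System.out.println("200 pixels = " + pixelsToEmu(200) + " EMU");
    }
}
```

So a picture added with a 200 point width comes out about a third wider than one meant to be 200 pixels wide.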
Please visit https://bitbucket.org/wishcoder/java-poi-word-document/wiki/Home for more examples like:
Open existing Microsoft Word Document (*.docx)
Clone Table in Word Document and add new data to cloned table
Update existing Table->Cell data in document
Update existing Hyper Link in document
Replace existing Image in document
Save updated Microsoft Word Document (*.docx)

I recommend using transparent tables to track images. The following code replaces the picture in the table cell at row 0, column 1.
List<XWPFParagraph> paragraphs = table.getRow(0).getCell(1).getParagraphs();
for (XWPFParagraph para : paragraphs) {
for (XWPFRun r : para.getRuns()) {
CTR ctr = r.getCTR();
// remove from the last index down so no drawing is skipped as the list shrinks
for (int i = ctr.getDrawingList().size() - 1; i >= 0; i--) {
ctr.removeDrawing(i);
}
}
}
XWPFParagraph paragraph = table.getRow(0).getCell(1).addParagraph();
XWPFRun run = paragraph.createRun();
// "filepath" and "filename" are placeholders for the new image
try (FileInputStream fis = new FileInputStream("filepath")) {
run.addPicture(fis, XWPFDocument.PICTURE_TYPE_PNG, "filename", Units.toEMU(200), Units.toEMU(60));
}
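One thing to watch when clearing the drawings of a run: removing entries by ascending index from a list that shrinks with every removal skips elements, while counting down from the last index does not. Here is a minimal sketch in plain Java, with an ArrayList standing in for the run's drawing list (assuming that list reflects removals immediately). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: an ArrayList stands in for the drawing list of a run.
final class RemoveAllSketch {
    // Ascending loop: every removal shifts the rest left, so entries get skipped.
    // Returns how many items are left behind.
    static <T> int removeForward(List<T> items) {
        for (int i = 0; i < items.size(); i++) {
            items.remove(i);
        }
        return items.size();
    }

    // Descending loop: the indices still to be visited are never shifted.
    // Returns how many items are left behind.
    static <T> int removeBackward(List<T> items) {
        for (int i = items.size() - 1; i >= 0; i--) {
            items.remove(i);
        }
        return items.size();
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("img1", "img2", "img3"));
        List<String> b = new ArrayList<>(Arrays.asList("img1", "img2", "img3"));
        System.out.println("left after forward removal: " + removeForward(a));
        System.out.println("left after backward removal: " + removeBackward(b));
    }
}
```

With a single picture per run both loops behave the same, which is probably why the ascending version often goes unnoticed.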

Related

Delete an image from a PDF file using PDFBox

I am attempting to delete images from a PDF using Java and PDFBox. The images are not inline, and the PDF does not have patterns or forms. The PDF file contains 2 images; the PDFDebugger tool shows Resources >> XObject >> IM3 and IM5. The problem is that when I open the output PDF file, the images have not been deleted.
public class DeleteImage {
public static void removeImages(String pdfFile) throws Exception {
PDDocument document = PDDocument.load(new File(pdfFile));
for (PDPage page : document.getPages()) {
PDResources pdResources = page.getResources();
pdResources.getXObjectNames().forEach(propertyName -> {
if(!pdResources.isImageXObject(propertyName)) {
return;
}
PDXObject o;
try {
o = pdResources.getXObject(propertyName);
if (o instanceof PDImageXObject) {
System.out.println("propertyName" + propertyName);
page.getCOSObject().removeItem(propertyName);
}
} catch (IOException e) {
e.printStackTrace();
}
});
for (COSName name : page.getResources().getPatternNames()) {
PDAbstractPattern pattern = page.getResources().getPattern(name);
System.out.println("have pattern");
}
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List<Object> tokens = parser.getTokens();
System.out.println("original tokens size" + tokens.size());
List<Object> newTokens = new ArrayList<Object>();
for(int j=0; j<tokens.size(); j++) {
Object token = tokens.get( j );
if( token instanceof Operator ) {
Operator op = (Operator)token;
System.out.println("operation" + op.getName());
//find image - remove it
if( op.getName().equals("Do") ) {
System.out.println("op equals Do");
newTokens.remove(newTokens.size()-1);
continue;
} else if ("BI".equals(op.getName())) {
System.out.println("inline -- op equals BI");
} else {
System.out.println("op not quals Do");
}
}
newTokens.add(token);
}
PDDocument newDoc = new PDDocument();
PDPage newPage = newDoc.importPage(page);
newPage.setResources(page.getResources());
System.out.println("tokens size" + newTokens.size());
PDStream newContents = new PDStream(newDoc);
OutputStream out = newContents.createOutputStream();
ContentStreamWriter writer = new ContentStreamWriter( out );
writer.writeTokens( newTokens);
out.close();
newPage.setContents( newContents );
}
document.save("RemoveImage.pdf");
document.close();
}
public static void remove(String pdfFile) throws Exception {
PDDocument document = PDDocument.load(new File(pdfFile));
PDResources resources = null;
for (PDPage page : document.getPages()) {
resources = page.getResources();
for (COSName name : resources.getXObjectNames()) {
PDXObject xobject = resources.getXObject(name);
if (xobject instanceof PDImageXObject) {
System.out.println("have image");
removeImages(pdfFile);
}
}
}
document.save("RemoveImage.pdf");
document.close();
}
}

If You Call remove...
In remove you:
1. load the PDF into document,
2. iterate over the pages of document, and for each page
3. iterate over the XObject resources, and for each XObject
4. check whether it is an image XObject, and if it is,
5. call removeImages, which loads the same original file, processes it, and saves the result as "RemoveImage.pdf".
After all that processing you save the unchanged document to "RemoveImage.pdf".
So in that last step you overwrite any changes you may have made in removeImages and end up with your original file in "RemoveImage.pdf"!
If You Call removeImages Directly...
In removeImages you do make some changes, but there are certain issues:
1. Whenever you find an image XObject resource, you attempt to remove it from the page directly,
page.getCOSObject().removeItem(propertyName);
but the image XObject resource is not a direct child of the page; it is managed by pdResources, so you should remove it from there.
2. You remove all Do instructions from the page content, not only those for image XObjects, so you probably remove more than you wanted.
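To illustrate the second point, here is a rough sketch of filtering only the Do instructions whose operand names an image XObject. It is plain Java and deliberately does not use the PDFBox types: content stream tokens are modeled as strings, and the set of image names (for example collected via isImageXObject) is passed in. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model: tokens are plain strings, not PDFBox COSName/Operator objects.
final class ImageDrawFilterSketch {
    // Drops "<name> Do" pairs only when <name> refers to an image XObject.
    static List<String> dropImageDraws(List<String> tokens, Set<String> imageNames) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if ("Do".equals(token) && !kept.isEmpty()) {
                String operand = kept.get(kept.size() - 1);
                if (imageNames.contains(operand)) {
                    kept.remove(kept.size() - 1); // drop the operand (the XObject name)
                    continue;                     // and skip the Do operator itself
                }
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("q", "/IM3", "Do", "Q", "/Fm1", "Do");
        Set<String> images = new HashSet<>(Arrays.asList("/IM3", "/IM5"));
        // The form XObject /Fm1 is still drawn; only the image draw is gone.
        System.out.println(dropImageDraws(tokens, images));
    }
}
```

In the real code the same check would compare the COSName operand preceding each Do operator against the image names found in the page resources.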

PDFBox merge 2 pdf files side by side with java

I compare 2 PDF files and highlight the differences in them.
When I use PDFBox to merge them side by side for comparison, the highlights are missing from the output.
I am using the function below.
It merges all pages of the 2 PDF files side by side.
void generateSideBySidePDF() {
File pdf1File = new File(FILE1_PATH);
File pdf2File = new File(FILE2_PATH);
File outPdfFile = new File(OUTFILE_PATH);
PDDocument pdf1 = null;
PDDocument pdf2 = null;
PDDocument outPdf = null;
try {
pdf1 = PDDocument.load(pdf1File);
pdf2 = PDDocument.load(pdf2File);
outPdf = new PDDocument();
for(int pageNum = 0; pageNum < pdf1.getNumberOfPages(); pageNum++) {
// Create output PDF frame
PDRectangle pdf1Frame = pdf1.getPage(pageNum).getCropBox();
PDRectangle pdf2Frame = pdf2.getPage(pageNum).getCropBox();
PDRectangle outPdfFrame = new PDRectangle(pdf1Frame.getWidth()+pdf2Frame.getWidth(), Math.max(pdf1Frame.getHeight(), pdf2Frame.getHeight()));
// Create output page with calculated frame and add it to the document
COSDictionary dict = new COSDictionary();
dict.setItem(COSName.TYPE, COSName.PAGE);
dict.setItem(COSName.MEDIA_BOX, outPdfFrame);
dict.setItem(COSName.CROP_BOX, outPdfFrame);
dict.setItem(COSName.ART_BOX, outPdfFrame);
PDPage outPdfPage = new PDPage(dict);
outPdf.addPage(outPdfPage);
// Source PDF pages has to be imported as form XObjects to be able to insert them at a specific point in the output page
LayerUtility layerUtility = new LayerUtility(outPdf);
PDFormXObject formPdf1 = layerUtility.importPageAsForm(pdf1, pageNum);
PDFormXObject formPdf2 = layerUtility.importPageAsForm(pdf2, pageNum);
// Add form objects to output page
AffineTransform afLeft = new AffineTransform();
layerUtility.appendFormAsLayer(outPdfPage, formPdf1, afLeft, "left" + pageNum);
AffineTransform afRight = AffineTransform.getTranslateInstance(pdf1Frame.getWidth(), 0.0);
layerUtility.appendFormAsLayer(outPdfPage, formPdf2, afRight, "right" + pageNum);
}
outPdf.save(outPdfFile);
outPdf.close();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (pdf1 != null) pdf1.close();
if (pdf2 != null) pdf2.close();
if (outPdf != null) outPdf.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}

Insert this into your code after the "Source PDF pages has to be imported" segment to copy the annotations. The annotations from the right-hand PDF must have their rectangles shifted.
// copy annotations
PDPage src1Page = pdf1.getPage(pageNum);
PDPage src2Page = pdf2.getPage(pageNum);
for (PDAnnotation ann : src1Page.getAnnotations())
{
outPdfPage.getAnnotations().add(ann);
}
for (PDAnnotation ann : src2Page.getAnnotations())
{
PDRectangle rect = ann.getRectangle();
ann.setRectangle(new PDRectangle(rect.getLowerLeftX() + pdf1Frame.getWidth(), rect.getLowerLeftY(), rect.getWidth(), rect.getHeight()));
outPdfPage.getAnnotations().add(ann);
}
Note that this code has a flaw: it works only for annotations that have an appearance stream (most do). It will have weird effects for those that don't; in that case the coordinates have to be adjusted depending on the annotation type. For highlights that means the quadpoints, for lines the line coordinates, and so on.
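For the type-specific coordinates the adjustment is the same idea as for the rectangle: add the width of the left page to every x value. Quadpoints and line coordinates are both flat arrays of x, y pairs, so a small helper like the following would do. This is plain Java with hypothetical names; in PDFBox 2.x the arrays would come from something like getQuadPoints() on a text markup annotation, which I believe returns a float[], but check that against your version.

```java
import java.util.Arrays;

// Illustrative helper; not part of PDFBox.
final class ShiftCoordsSketch {
    // coords is a flat array of x, y pairs (x1, y1, x2, y2, ...).
    // Returns a copy in which every x value is moved right by dx.
    static float[] shiftX(float[] coords, float dx) {
        float[] shifted = coords.clone();
        for (int i = 0; i < shifted.length; i += 2) {
            shifted[i] += dx;
        }
        return shifted;
    }

    public static void main(String[] args) {
        // One highlight quad on the right-hand page; the left page is 595 units wide.
        float[] quad = {10f, 700f, 110f, 700f, 10f, 680f, 110f, 680f};
        System.out.println(Arrays.toString(shiftX(quad, 595f)));
    }
}
```

The y values stay untouched because both pages share the same baseline in the side-by-side layout above.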

Apache poi replace existing picture on header

Is there any way to replace an image in the header of a Word (docx) file, looked up by the name of the image, with Apache POI? I am thinking of something like this:
+--------------------------------+
+HEADER myimage.jpeg-+
+ -----------BODY------------+
+--------------------------------+
replaceImage("myimage.jpeg", newPictureInputStream,
"newPicture_name.jpeg");
Here is what I tried:
XWPFParagraph originalParagraph = null;
originalParagraph = getPictureParagraphInHead(lookingPictureName);
ListIterator<XWPFRun> it = originalParagraph.getRuns().listIterator();
XWPFRun replacedRun = null;
while (it.hasNext()) {
XWPFRun run = it.next();
int runIDX = it.nextIndex();
if (run.getEmbeddedPictures().size() > 0) {
XWPFRun newRun = null;
newRun = new XWPFRun(run.getCTR(), (IRunBody) originalParagraph);
originalParagraph.addRun(newRun);
originalParagraph.removeRun(originalParagraph.getRuns().indexOf(run));
break;
}
}

I'm not sure if you can get the "filename" of the image with POI. It's probably in the XML so you might have to make your own method for finding the image.
To get the Header you do:
XWPFHeaderFooterPolicy policy = new XWPFHeaderFooterPolicy(doc); // XWPFDocument
XWPFHeader header = policy.getDefaultHeader();
And to delete the images, get the XWPFRun from your paragraph (cell/row/table..) and remove its drawings:
CTR ctr = myRun.getCTR();
// iterate backwards so that removals do not shift the remaining indices
for (int i = ctr.getDrawingList().size() - 1; i >= 0; i--)
{
ctr.removeDrawing(i);
}

Apache POI - add multiple paragraphs to header/footer on the same line

I am using Apache POI to process some documents, and I would like to add a header/footer that consists of multiple paragraphs displayed on the same line.
This is my attempt so far:
XWPFDocument document = new XWPFDocument();
// adding header and footer
CTP ctp = CTP.Factory.newInstance();
CTR ctr = ctp.addNewR();
// create footer components
CTText footerCopyrightText = ctr.addNewT();
footerCopyrightText.setStringValue("\u00A9" + " My Website - " + Calendar.getInstance().get(Calendar.YEAR));
CTText footerPageText = ctr.addNewT();
footerPageText.setStringValue(document.getProperties().getExtendedProperties().getUnderlyingProperties().getPages() + "");
XWPFParagraph footerCopyrightParagraph = new XWPFParagraph( ctp, document );
footerCopyrightParagraph.setAlignment(ParagraphAlignment.CENTER);
XWPFParagraph footerPageParagraph = new XWPFParagraph(ctp, document);
footerPageParagraph.setAlignment(ParagraphAlignment.RIGHT);
XWPFParagraph[] footerParagraphs = {footerCopyrightParagraph, footerPageParagraph};
CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
XWPFHeaderFooterPolicy headerFooterPolicy = new XWPFHeaderFooterPolicy(document, sectPr );
headerFooterPolicy.createFooter(STHdrFtr.DEFAULT, footerParagraphs);
However, the end result so far is that I get a single right-aligned text, which consists of the two XWPFParagraphs, concatenated.
I have also checked some other examples here on Stack Overflow (there was one for a Header, but I didn't manage to get it to work).
A basic idea of what I want to achieve is this: http://imgur.com/jrwVO0F
Any ideas on what I am doing wrong?

Add Tabstops and use them
Here's my draft, which prints my name left, center and right on an A4 document. I have no clue whatsoever as to how those position values are calculated, though. The code to add tab stops is from "Java Apache POI Tab Stop word document".
import java.awt.Desktop;
import java.io.*;
import java.math.BigInteger;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
public class POIExample {
public static void main(String[] args) {
try {
XWPFDocument document = new XWPFDocument();
XWPFParagraph paragraph = document.createParagraph();
XWPFRun tmpRun = paragraph.createRun();
tmpRun.setText("JAN");
tmpRun.addTab();
tmpRun.setText("JAN");
tmpRun.addTab();
tmpRun.setText("JAN");
BigInteger pos1 = BigInteger.valueOf(4500);
setTabStop(paragraph, STTabJc.Enum.forString("center"), pos1);
BigInteger pos2 = BigInteger.valueOf(9000);
setTabStop(paragraph, STTabJc.Enum.forString("right"), pos2);
File f = File.createTempFile("poi", ".docx");
try (FileOutputStream fo = new FileOutputStream(f)) {
document.write(fo);
}
Desktop.getDesktop().open(f);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void setTabStop(XWPFParagraph oParagraph, STTabJc.Enum oSTTabJc, BigInteger oPos) {
CTP oCTP = oParagraph.getCTP();
CTPPr oPPr = oCTP.getPPr();
if (oPPr == null) {
oPPr = oCTP.addNewPPr();
}
CTTabs oTabs = oPPr.getTabs();
if (oTabs == null) {
oTabs = oPPr.addNewTabs();
}
CTTabStop oTabStop = oTabs.addNewTab();
oTabStop.setVal(oSTTabJc);
oTabStop.setPos(oPos);
}
}
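On how the positions are calculated: as far as I understand, tab stop positions are given in twentieths of a point (1440 per inch) and measured from the left text margin. Under that assumption, and assuming Word's default 1 inch margins on an A4 page, the center and right stops can be derived as sketched below. This is plain Java with hypothetical names, not POI API.

```java
// Illustrative arithmetic only; assumes positions in twentieths of a point.
final class TabStopSketch {
    static final int TWIPS_PER_INCH = 1440; // 72 points per inch * 20

    // Convert millimetres to twentieths of a point.
    static int mmToTwips(double mm) {
        return (int) Math.round(mm / 25.4 * TWIPS_PER_INCH);
    }

    // Returns { centerStop, rightStop }, measured from the left text margin.
    static int[] centerAndRightStops(int pageWidth, int leftMargin, int rightMargin) {
        int textWidth = pageWidth - leftMargin - rightMargin;
        return new int[] { textWidth / 2, textWidth };
    }

    public static void main(String[] args) {
        int a4Width = mmToTwips(210); // A4 is 210 mm wide
        int[] stops = centerAndRightStops(a4Width, TWIPS_PER_INCH, TWIPS_PER_INCH);
        System.out.println("center=" + stops[0] + ", right=" + stops[1]);
    }
}
```

That lands close to the 4500 and 9000 used in the draft, which would explain why those values look right on A4.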

So, after some tinkering, I finally have a functioning version. Here's hoping it will prove useful to other users as well.
Creating footer object code
// create footer components
XWPFDocument document = new XWPFDocument();
CTP footerCtp = CTP.Factory.newInstance();
CTR footerCtr = footerCtp.addNewR();
XWPFParagraph footerCopyrightParagraph = new XWPFParagraph(footerCtp, document);
document.getProperties().getExtendedProperties().getUnderlyingProperties().getPages();
XWPFRun run = footerCopyrightParagraph.getRun(footerCtr);
run.setText("My Website.com");
run.addTab();
run.setText("\u00A9" + " My Website - " + Calendar.getInstance().get(Calendar.YEAR));
run.addTab();
run.setText("Right Side Text");
setTabStop(footerCtp, STTabJc.Enum.forString("right"), BigInteger.valueOf(9000));
XWPFParagraph[] footerParagraphs = {footerCopyrightParagraph};
CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
XWPFHeaderFooterPolicy headerFooterPolicy = new XWPFHeaderFooterPolicy(document, sectPr);
headerFooterPolicy.createFooter(STHdrFtr.DEFAULT, footerParagraphs);
SetTabStop method
private static void setTabStop(CTP oCTP, STTabJc.Enum oSTTabJc, BigInteger oPos) {
CTPPr oPPr = oCTP.getPPr();
if (oPPr == null) {
oPPr = oCTP.addNewPPr();
}
CTTabs oTabs = oPPr.getTabs();
if (oTabs == null) {
oTabs = oPPr.addNewTabs();
}
CTTabStop oTabStop = oTabs.addNewTab();
oTabStop.setVal(oSTTabJc);
oTabStop.setPos(oPos);
}

Retrieving content of hyperlinked slides in powerpoint files(.PPTX) through apache POI

I am trying to get the text content of PowerPoint files and replace it with some other text. I have a PowerPoint file with 20 slides, where slides 13, 14, 15 and 16 have hyperlinks to slides 17, 18, 19 and 20. I am using XMLSlideShow to traverse the slides, but it gives only 16 slides. It does not give the last 4 hyperlinked slides.
Any idea how I can get the content of all hyperlinked slides and replace it with some other text would be really appreciated.
Here is my code.
public static void replaceContentInPPTX(File inputFile, File outputFile) throws IOException{
FileInputStream fis = null;
FileOutputStream fos = null;
XMLSlideShow ppt = null;
try{
fis = new FileInputStream(inputFile);
fos = new FileOutputStream(outputFile);
ppt = new XMLSlideShow(fis);
// System.out.println("Available slide layouts:"+ppt.getSlideMasters().length);
/* for(XSLFSlideMaster master : ppt.getSlideMasters()){
XSLFShape[] shape = master.getShapes();
for(XSLFSlideLayout layout : master.getSlideLayouts()){
System.out.println(layout.getType());
}
}*/
System.out.println("No of slides:"+ppt.getSlides().length); // gives 16 slides.
for(XSLFSlide slide : ppt.getSlides()) {
for(XSLFShape shape : slide){
if(shape instanceof XSLFTextShape) {
XSLFTextShape txShape = (XSLFTextShape)shape;
for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
String originalText = replaceUnwantedChar(xslfParagraph.getText());
if(! originalText.isEmpty()) {
String translation = "";
if(translation != null ) {
CTRegularTextRun[] ctRegularTextRun = xslfParagraph.getXmlObject().getRArray();
for(int index = ctRegularTextRun.length-1; index > 0 ; index--){
xslfParagraph.getXmlObject().removeR(index);
}
ctRegularTextRun[0].setT(translation);
}
}
}
}
}
}
ppt.write(fos);
fos.close();
fis.close();
}catch(Exception ex){
ex.printStackTrace();
}
}
