iText Flying Saucer pdf headers and ignoring html

We convert XHTML to PDF with good success, but a new requirement came up: headers and a page count on every page. We are using the newest release of Flying Saucer.
I followed the example here: http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html#page-specific-features
...but it did not work. The header appeared at the top left of the first page.
With the r7 version, headers and page numbering work perfectly, but none of the passed-in HTML is rendered. With r8, the headers and page numbers are ignored, but the HTML is rendered perfectly. The XHTML used for the tests is copied from the URL above.
I know I must be missing something very simple. If anyone has any ideas or comments, I would be very grateful to hear them.

I think they changed this functionality in r8. Try this method instead:
https://gist.github.com/626264

We use the same method and everything works perfectly. However, I decided not to use Flying Saucer's built-in headers/footers and instead use a PdfStamper to add them after the PDF is generated. It works quite well; here is an example:
public void modifyPdf(PdfStamper stamper) {
    this.reader = stamper.getReader();
    PdfContentByte under = null;
    PdfPTable header = null;
    PdfPTable footer = null;
    final int total = this.reader.getNumberOfPages();
    for (int page = 1; page <= total; page++) {
        under = stamper.getUnderContent(page);
        final PdfDocument doc = under.getPdfDocument();
        final Rectangle rect = this.reader.getPageSizeWithRotation(page);
        header = ... //build your header
        footer = ... // build your footer
        final float x = 0;
        //write header to PDF
        if (header != null) {
            float y = (rect.getTop() - 0);
            header.writeSelectedRows(0, -1, x, y, under);
        }
        //write footer to PDF
        if (footer != null) {
            float y = (rect.getBottom() + 20);
            footer.writeSelectedRows(0, -1, x, y, under);
        }
    }
}
You can build your stamper like this:
final PdfReader reader = new PdfReader(/*your pdf file*/);
final PdfStamper stamper = new PdfStamper(reader, /* output */);
Hope you find this helpful.

Related

How to use iText to parse paths (such as lines in the document)

I am using iText to parse text in a PDF document, using PdfContentStreamProcessor with a RenderListener. For example:
PdfReader reader = new PdfReader(file.toURI().toURL());
int numberOfPages = reader.getNumberOfPages();
MyRenderListener listener = new MyRenderListener ();
PdfContentStreamProcessor processor = new PdfContentStreamProcessor(listener);
for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) {
    PdfDictionary pageDic = reader.getPageN(pageNumber);
    PdfDictionary resourcesDic = pageDic.getAsDict(PdfName.RESOURCES);
    Rectangle pageSize = reader.getPageSize(pageNumber);
    listener.startPage(pageNumber, pageSize);
    processor.processContent(ContentByteUtils.getContentBytesForPage(reader, pageNumber), resourcesDic);
}
I have no problem getting the text with the renderText(TextRenderInfo) method, but how do I parse the graphic content apart from images? For example, in my case I would like to get:
Text content which is in a box
Horizontal lines

Per mkl's comment, I am able to get the geometries by using ExtRenderListener. I used "How to extract the color of a rectangle in a PDF, with iText" for reference.

iText set up border color and width for check box, but it does not work

I want to use iText to add a check box to a PDF file, and here is my code:
public static void testPdf() throws IOException {
    String src = "/Users/heartisan/Downloads/xx.pdf";
    String dest = "/Users/heartisan/Downloads/yy.pdf";
    PdfDocument pdf = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
    PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
    Document document = new Document(pdf);
    for (int i = 0; i < 3; i++) {
        PdfButtonFormField checkField = PdfFormField.createCheckBox(pdf, new Rectangle(369 + i * 69, 751, 15, 15),
                "experience".concat(String.valueOf(i+1)), "Off", PdfFormField.TYPE_CHECK);
        checkField.setBorderWidth(2);
        checkField.setBorderColor(DeviceGray.GRAY);
        checkField.setVisibility(PdfFormField.VISIBLE);
        checkField.setBackgroundColor(ColorConstants.RED);
        checkField.setToggleOff(false);
        // checkField.getWidgets().get(0).setBorderStyle(PdfAnnotation.STYLE_SOLID);
        form.addField(checkField, pdf.getPage(1));
    }
    document.close();
}
As the code above shows, I set the border color and width, but they do not take effect in the result. When I add the check box with Adobe Acrobat instead, it works.
I then debugged the fields of the two files and found that in the iText output both the color and the width values are gone. Both values were still there just before I called document.close(), and I don't know why.
Can anyone help me?

Do you get the expected result if you open the resulting PDF in Google Chrome or PDF Studio 2020? I get the same result as you in Acrobat, so I don't think it's an iText issue. By the way, if you click on the check boxes in Acrobat, everything looks as expected.

Unable to put a stamp using iText 7 in Java on Skia-generated PDFs only (shows inverted stamp)

I am unable to add a stamp with iText 7 in Java, but only on PDFs generated by Skia (the PDF library used by Google; for example, in Google Docs, click Print -> Save as PDF). The stamp is placed incorrectly: if I stamp at the top left of the page, it appears at the bottom left, and both the image and the text are mirrored and inverted. For all other PDFs the stamping is correct.
It seems that PDFs generated by Skia are missing some metadata.

Since you didn't share any code or any document, I created a PDF document from Google Docs and used the code I wrote in answer to the question "Itextsharp 7 - Scaled and Centered Image as watermark" to add a watermark in the center.
The Document Properties of the result show that the original document was created using Skia/PDF m67 and modified using iText® 7.1.3.
You need a watermark in the top left, so I adapted the code like this:
public void createPdf(String src, String dest) throws IOException {
    PdfDocument pdfDoc = new PdfDocument(
            new PdfReader(src), new PdfWriter(dest));
    Document document = new Document(pdfDoc);
    PdfCanvas over;
    PdfExtGState gs1 = new PdfExtGState();
    gs1.setFillOpacity(0.5f);
    int n = pdfDoc.getNumberOfPages();
    Rectangle pagesize;
    ImageData img = ImageDataFactory.create(IMG);
    float iW = img.getWidth();
    float iH = img.getHeight();
    float x, y;
    for (int i = 1; i <= n; i++)
    {
        PdfPage pdfPage = pdfDoc.getPage(i);
        pagesize = pdfPage.getPageSize();
        x = pagesize.getLeft();
        y = pagesize.getTop() - iH;
        over = new PdfCanvas(pdfDoc.getPage(i));
        over.saveState();
        over.setExtGState(gs1);
        over.addImage(img, iW, 0, 0, iH, x, y);
        over.restoreState();
    }
    document.close();
    pdfDoc.close();
}
In the result, the image isn't mirrored, and it sits at the top-left position of the page. In short: there doesn't seem to be any problem with PDFs created with Skia/PDF m67.

Retrieve the page number of an image in pdf- IText

I am using the code from the link below to render the images:
MyImageRenderListener - IText
Below is the try block of my code. It computes the DPI of each image and, if the DPI is below 300, writes the image's file name to a text file.
Now I also want to write the page numbers where these images are located in the PDF. How can I obtain the page number of an image?
try {
    String filename;
    FileOutputStream os;
    PdfImageObject image = renderInfo.getImage();
    BufferedImage img = null;
    String txtfile = "results/results.txt";
    PdfDictionary imageDict = renderInfo.getImage().getDictionary();
    float widthPx = imageDict.getAsNumber(PdfName.WIDTH).floatValue();
    float heightPx = imageDict.getAsNumber(PdfName.HEIGHT).floatValue();
    float widthUu = renderInfo.getImageCTM().get(Matrix.I11);
    float heigthUu = renderInfo.getImageCTM().get(Matrix.I22);
    float widthIn = widthUu/72;
    float heightIn = heigthUu/72;
    float imagepdi = widthPx/widthIn;
    filename = String.format(path, renderInfo.getRef().getNumber(), image.getFileType());
    System.out.println(filename+"-->"+imagepdi);
    if(imagepdi < 300){
        File file = new File("C:/Users/Abhinav/workspace/itext/results/result.txt");
        if(filename != null){
            if (!file.exists()) {
                file.createNewFile();
            }
            FileWriter fw = new FileWriter(file.getAbsoluteFile(),true);
            file.setReadable(true, false);
            file.setExecutable(true, false);
            file.setWritable(true, false);
            BufferedWriter bw = new BufferedWriter(fw);
            bw.write(filename);
            bw.write("\r\n");
            bw.close();
        }
    }
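For reference, the resolution check in the snippet above comes down to simple arithmetic: PDF user space defaults to 72 units per inch, so the effective DPI is the pixel count divided by the rendered size in inches. The sketch below isolates that calculation in plain Java. ImageDpi, effectiveDpi and isBelow are hypothetical names, not iText API. It assumes the image is neither rotated nor skewed, so that the I11 and I22 entries of the CTM really are the rendered width and height, and it checks both axes where the snippet checks only the width.

```java
// Hypothetical helper for illustration only; none of these names come from iText.
final class ImageDpi {

    /** Default PDF user space: 72 units per inch. */
    static final float UNITS_PER_INCH = 72f;

    /**
     * Effective resolution along one axis.
     *
     * @param pixels    number of image samples along the axis (/Width or /Height)
     * @param userUnits rendered size along the same axis, in user units
     */
    static float effectiveDpi(float pixels, float userUnits) {
        if (userUnits <= 0f) {
            throw new IllegalArgumentException("rendered size must be positive");
        }
        float inches = userUnits / UNITS_PER_INCH;
        return pixels / inches;
    }

    /** True when the resolution along either axis is below minDpi. */
    static boolean isBelow(float widthPx, float heightPx,
                           float widthUu, float heightUu, float minDpi) {
        return effectiveDpi(widthPx, widthUu) < minDpi
                || effectiveDpi(heightPx, heightUu) < minDpi;
    }

    public static void main(String[] args) {
        // 600 samples drawn 144 user units (2 inches) wide
        System.out.println(effectiveDpi(600f, 144f));
    }
}
```

With this in place, the 300 DPI test in the snippet would read something like isBelow(widthPx, heightPx, widthUu, heigthUu, 300).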

This is a strange question, because it is incomplete and illogical.
Why is your question incomplete?
You are using MyImageRenderListener in the context of another example, ExtractImages:
PdfReader reader = new PdfReader(filename);
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
MyImageRenderListener listener = new MyImageRenderListener(RESULT);
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    parser.processContent(i, listener);
}
reader.close();
In this example, you loop over every page number to examine every separate page. Hence you know the page number whenever MyImageRenderListener returns an image.
Images are stored inside a PDF as external objects (aka XObject). MyImageRenderListener returns what's stored in such a stream object (containing the bytes of the image). So far, so good.
Why is your question illogical?
Because the whole purpose of storing images in XObject is to be able to reuse the same image stream. Imagine an image of a logo. That image can be present on every page of the document. In this case, MyImageRenderListener will give you the same image (from the same stream) as many times as there are pages, but in reality, there is only one image, and it's external to the page content. It doesn't make sense for that image to "know" the page it is on: it is on every page. The same logic applies even when the image is only used on one page. That is inherent to the design of PDF: an image stream doesn't know which page it belongs to. The link between the image stream and the page exists through the /XObject entry in the /Resources of the page dictionary.
What would be an elegant way to solve this?
Create a member-variable in MyImageRenderListener, e.g.:
protected int pagenumber;
public void setPagenumber(int pagenumber) {
    this.pagenumber = pagenumber;
}
Use the setter from your loop:
PdfReader reader = new PdfReader(filename);
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
MyImageRenderListener listener = new MyImageRenderListener(RESULT);
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    listener.setPagenumber(i);
    parser.processContent(i, listener);
}
reader.close();
Now you can use pagenumber in the renderImage(ImageRenderInfo renderInfo) method. This way, you'll always know which page is being examined when this method is triggered.
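Because the same image stream can be drawn on several pages, it may be more useful to collect a set of page numbers per image than a single number. Here is a minimal sketch in plain Java, assuming the image is identified by its object number (the value the question already obtains from renderInfo.getRef().getNumber()). ImagePageIndex and its methods are hypothetical names, not part of iText:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical bookkeeping class for illustration; not an iText type.
final class ImagePageIndex {

    // object number of the image stream -> sorted page numbers it was drawn on
    private final Map<Integer, SortedSet<Integer>> pagesByObject = new HashMap<>();

    /** Remember that the image with this object number was drawn on this page. */
    void record(int objectNumber, int pageNumber) {
        pagesByObject
                .computeIfAbsent(objectNumber, k -> new TreeSet<>())
                .add(pageNumber);
    }

    /** Pages on which the image was seen, in ascending order; empty if never seen. */
    SortedSet<Integer> pagesOf(int objectNumber) {
        SortedSet<Integer> pages = pagesByObject.get(objectNumber);
        if (pages == null) {
            return Collections.emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(pages);
    }
}
```

Inside renderImage(ImageRenderInfo renderInfo), the listener would then call something like index.record(renderInfo.getRef().getNumber(), pagenumber), where index is an ImagePageIndex field of the listener. A logo reused on every page ends up with one entry listing all its pages.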

iText PDFDocument page size inaccurate

I am trying to add a header to existing PDF documents in Java with iText. I can add the header at a fixed place on the document, but the documents all have different page sizes, so it does not always end up at the top of the page. I tried getting the page size so that I could calculate the position of the header, but the page size does not seem to be what I expect. On some documents, calling reader.getPageSize(i).getTop(20) places the text in the right place at the top of the page; on other documents it places it halfway down the page. Most of the pages were scanned by a Xerox copier, if that makes a difference. Here is the code I am using:
PdfReader reader = new PdfReader(readFilePath);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(writeFilePath));
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    PdfContentByte cb = stamper.getOverContent(i);
    cb.beginText();
    cb.setFontAndSize(bf, 14);
    float x = reader.getPageSize(i).getWidth() / 2;
    float y = reader.getPageSize(i).getTop(20);
    cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "Copy", x, y, 0);
    cb.endText();
}
stamper.close();
PDF that works correctly
PDF that works incorrectly

Take a look at the StampHeader1 example. I adapted your code, introducing ColumnText.showTextAligned() and using a Phrase for the sake of simplicity (maybe you can change that part of your code too):
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    Phrase header = new Phrase("Copy", new Font(FontFamily.HELVETICA, 14));
    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        float x = reader.getPageSize(i).getWidth() / 2;
        float y = reader.getPageSize(i).getTop(20);
        ColumnText.showTextAligned(
                stamper.getOverContent(i), Element.ALIGN_CENTER,
                header, x, y, 0);
    }
    stamper.close();
    reader.close();
}
As you have found out, this code assumes that no rotation was defined.
Now take a look at the StampHeader2 example. I'm using your "Wrong" file and I've added one extra line:
stamper.setRotateContents(false);
By telling the stamper not to rotate the content I'm adding, I'm adding the content using the coordinates as if the page isn't rotated. Please take a look at the result: stamped_header2.pdf. We added "Copy" at the top of the page, but as the page is rotated, we see the word appear on the side. The word is rotated because the page is rotated.
Maybe that's what you want, maybe it isn't. If it isn't, please take a look at StampHeader3 in which I calculate x and y differently, based on the rotation of the page:
if (reader.getPageRotation(i) % 180 == 0) {
    x = reader.getPageSize(i).getWidth() / 2;
    y = reader.getPageSize(i).getTop(20);
}
else {
    x = reader.getPageSize(i).getHeight() / 2;
    y = reader.getPageSize(i).getRight(20);
}
Now the word "Copy" appears on what is perceived as the "top of the page" (but in reality, it could be the side of the page): stamped_header3.pdf
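The coordinate choice in the StampHeader3 fragment can be tried out without iText. The helper below is a hypothetical sketch (HeaderPosition and compute are not iText names). It assumes the lower-left corner of the page is at (0, 0), so that getTop(20) equals height - 20 and getRight(20) equals width - 20, and it takes the unrotated page size, as getPageSize(i) reports it:

```java
// Hypothetical helper for illustration only; not part of iText.
final class HeaderPosition {

    /**
     * Returns {x, y} for a header centered along the visual top edge.
     *
     * @param width    unrotated page width in user units
     * @param height   unrotated page height in user units
     * @param rotation page rotation in degrees (0, 90, 180 or 270)
     * @param margin   distance from the edge in user units
     */
    static float[] compute(float width, float height, int rotation, float margin) {
        if (rotation % 180 == 0) {
            // Not rotated sideways: center on the width, measure down from the top.
            return new float[] { width / 2, height - margin };
        }
        // Rotated by 90 or 270 degrees: the roles of width and height swap.
        return new float[] { height / 2, width - margin };
    }

    public static void main(String[] args) {
        float[] upright = compute(595f, 842f, 0, 20f);
        float[] sideways = compute(595f, 842f, 90, 20f);
        System.out.println(upright[0] + ", " + upright[1]);
        System.out.println(sideways[0] + ", " + sideways[1]);
    }
}
```

For a 595 x 842 page this gives (297.5, 822) without rotation and (421, 575) with a rotation of 90 degrees, which is why a scanned, rotated page needs the second branch.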
