Changing opacity of transparent image / Changing value of extgstate dictionary - java

I'm trying to implement an invisible watermarking function using iText 7 in Java. So far I've managed to embed the watermark on all pages using the following code:
PdfDocument pdfdoc = new PdfDocument(new PdfReader(source), new PdfWriter(dest));
Document doc = new Document(pdfdoc);
PdfCanvas canvas;
Rectangle pagesize;
PdfExtGState qrcode = new PdfExtGState();
qrcode.setFillOpacity(0); // sets opacity of watermark.
byte[] bytearray = convertBI(watermark);
ImageData imgd = ImageDataFactory.create(bytearray);
float w = imgd.getWidth(), h = imgd.getHeight();
float x, y;
for (int i = 1; i <= pdfdoc.getNumberOfPages(); i++) {
    PdfPage page = pdfdoc.getPage(i);
    pagesize = page.getPageSizeWithRotation();
    page.setIgnorePageRotationForContent(true);
    x = (pagesize.getLeft() + pagesize.getRight()) / 2;
    y = (pagesize.getTop() + pagesize.getBottom()) / 2;
    canvas = new PdfCanvas(pdfdoc.getPage(i));
    canvas.saveState();
    canvas.setExtGState(qrcode);
    canvas.addImage(imgd, w, 0, 0, h, x - (w / 2), y - (h / 2), true);
    canvas.restoreState();
}
doc.close();
However, I'm having trouble retrieving the watermark. So far I've tried redrawing the page on another canvas and setting the fill opacity, but to no avail. The only way I've managed to make the watermark visible is by opening the file in iText RUPS and manually changing the value in the ExtGState dictionary.
Would anyone be able to advise me on whether it is possible to change the value in the ExtGState dictionary from code, or on any alternative method to achieve the same result?
Update: I've tried to access the dictionary in code, but it just returns null.
PdfDocument pdfdoc = new PdfDocument(new PdfReader(source), new PdfWriter(dest));
Document doc = new Document(pdfdoc);
for (int pageNo = 1; pageNo <= pdfdoc.getNumberOfPages(); pageNo++) {
    PdfPage pdfpage = pdfdoc.getPage(pageNo);
    PdfResources rsrc = pdfpage.getResources();
    PdfDictionary pExtGSD = rsrc.getResource(PdfName.ExtGState);
    if (!pExtGSD.isEmpty()) {
        System.out.println(pExtGSD.getAsFloat(new PdfName("/Gs1")));
    }
}
doc.close();
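A likely reason for the null is the shape of the data rather than a missing entry. In the PDF structure, the page's ExtGState resource maps a name such as Gs1 to another dictionary, and the fill opacity is the ca entry inside that inner dictionary, so asking the outer dictionary for a float under that name finds no number. The leading slash is also PDF file syntax and is normally not part of the name string. The sketch below is only a plain-Java model of that nesting built from maps; it uses no iText API, and the class name, helper names and the Gs1 key are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of the nesting: Resources -> ExtGState -> Gs1 -> { ca: 0 }.
// This is NOT iText code; maps stand in for PDF dictionaries (assumption).
class ExtGStateModel {

    // Builds a resources dictionary holding one graphics state with fill opacity 0.
    static Map<String, Object> pageResources() {
        Map<String, Object> gs1 = new LinkedHashMap<>();
        gs1.put("ca", 0f); // fill opacity lives in the inner dictionary
        Map<String, Object> extGState = new LinkedHashMap<>();
        extGState.put("Gs1", gs1); // the key carries no leading slash
        Map<String, Object> resources = new LinkedHashMap<>();
        resources.put("ExtGState", extGState);
        return resources;
    }

    // Returns the value as a Float, or null if it is absent or not a number.
    static Float asFloat(Map<String, Object> dict, String key) {
        Object value = dict.get(key);
        return (value instanceof Float) ? (Float) value : null;
    }

    // Returns the value as a nested dictionary, or null if it is absent or not one.
    @SuppressWarnings("unchecked")
    static Map<String, Object> asDict(Map<String, Object> dict, String key) {
        Object value = dict.get(key);
        return (value instanceof Map) ? (Map<String, Object>) value : null;
    }

    // Overwrites "ca" in every graphics state that has one; returns how many changed.
    static int setFillOpacity(Map<String, Object> resources, float opacity) {
        Map<String, Object> extGState = asDict(resources, "ExtGState");
        if (extGState == null) {
            return 0;
        }
        int changed = 0;
        for (String name : extGState.keySet()) {
            Map<String, Object> gs = asDict(extGState, name);
            if (gs != null && gs.containsKey("ca")) {
                gs.put("ca", opacity);
                changed++;
            }
        }
        return changed;
    }
}
```

In this model, asFloat(extGState, "/Gs1") and asFloat(extGState, "Gs1") both return null, while asFloat(asDict(extGState, "Gs1"), "ca") returns the opacity. With iText 7 the same two-step lookup would presumably go through the dictionary accessors (for example getAsDictionary followed by getAsFloat), but that is not verified here.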

Related

How to use iText to parse paths (such as lines in the document)

I am using iText to parse text in a PDF document, and I am using PdfContentStreamProcessor with a RenderListener, like this:
PdfReader reader = new PdfReader(file.toURI().toURL());
int numberOfPages = reader.getNumberOfPages();
MyRenderListener listener = new MyRenderListener();
PdfContentStreamProcessor processor = new PdfContentStreamProcessor(listener);
for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) {
    PdfDictionary pageDic = reader.getPageN(pageNumber);
    PdfDictionary resourcesDic = pageDic.getAsDict(PdfName.RESOURCES);
    Rectangle pageSize = reader.getPageSize(pageNumber);
    listener.startPage(pageNumber, pageSize);
    processor.processContent(ContentByteUtils.getContentBytesForPage(reader, pageNumber), resourcesDic);
}
I have no problem getting the text with the renderText(TextRenderInfo) method, but how do I parse the graphic content apart from images? For example, in my case I would like to get:
Text content which is in a box
Horizontal lines
Per mkl's comment, I am able to get the geometries by using ExtRenderListener. I used the question "How to extract the color of a rectangle in a PDF, with iText" for reference.
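Once the listener has delivered the path geometry, picking out the horizontal lines is plain coordinate arithmetic. The following is a minimal sketch, assuming each straight segment has already been converted to page coordinates and stored as a java.awt.geom.Line2D; the class name, method names and the tolerance parameter are hypothetical, and no iText API is involved.

```java
import java.awt.geom.Line2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: classifies already-extracted segments (assumption: page coordinates).
class HorizontalLineFilter {

    // A segment counts as horizontal if its endpoints share (almost) the same y
    // and it has a real horizontal extent, so that single points are excluded.
    static boolean isHorizontal(Line2D segment, double tolerance) {
        double dy = Math.abs(segment.getY1() - segment.getY2());
        double dx = Math.abs(segment.getX1() - segment.getX2());
        return dy <= tolerance && dx > tolerance;
    }

    // Keeps only the horizontal segments, preserving their original order.
    static List<Line2D> horizontalOnly(List<Line2D> segments, double tolerance) {
        List<Line2D> result = new ArrayList<>();
        for (Line2D segment : segments) {
            if (isHorizontal(segment, tolerance)) {
                result.add(segment);
            }
        }
        return result;
    }
}
```

A small tolerance (a fraction of a point) is usually needed because scanned or converted documents rarely produce perfectly level lines; the right value depends on the documents being parsed.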

pdfbox - pdf increase size after converting to grayscale

I need to convert a scanned PDF to a grayscale PDF. I found two solutions for that.
The first one is to just use renderImage:
private void convertToGray() throws IOException {
    File pdfFile = new File(PATH);
    try (PDDocument originalPdf = PDDocument.load(pdfFile);
         PDDocument doc = new PDDocument()) {
        LOGGER.info("Current heap after loading file: {}", Runtime.getRuntime().totalMemory());
        PDFRenderer pdfRenderer = new PDFRenderer(originalPdf);
        for (int pageNum = 0; pageNum < originalPdf.getNumberOfPages(); pageNum++) {
            // PDImageXObject pdImage = LosslessFactory.createFromImage(doc, bufferedImage);
            BufferedImage grayImage = pdfRenderer.renderImageWithDPI(pageNum, 300F, ImageType.GRAY);
            PDImageXObject pdImage = JPEGFactory.createFromImage(doc, grayImage);
            float pageWidth = originalPdf.getPage(pageNum).getMediaBox().getWidth();
            float pageHeight = originalPdf.getPage(pageNum).getMediaBox().getHeight();
            PDPage page = new PDPage(new PDRectangle(pageWidth, pageHeight));
            doc.addPage(page);
            try (PDPageContentStream contentStream = new PDPageContentStream(doc, page)) {
                contentStream.drawImage(pdImage, 0F, 0F, pageWidth, pageHeight);
            }
        }
        doc.save(NEW_PATH);
    }
}
But this increases the size of the file, because some PDFs have a resolution lower than 300 DPI.
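The growth is easy to estimate before rendering: a page measured in points (72 per inch) becomes points / 72 * DPI pixels along each side, so the pixel count scales with the square of the DPI. The sketch below works this out for an 8-bit grayscale raster. The class and method names are hypothetical, the figures are uncompressed sizes (the stored JPEG or Flate stream will be smaller), and the rounding may differ by a pixel from what the renderer actually produces.

```java
// Hypothetical estimator: pixel dimensions and raw 8-bit gray size for a rendered page.
class RenderSizeEstimate {

    static final double POINTS_PER_INCH = 72.0;

    // Converts a page dimension in points to pixels at the given resolution.
    static int pixels(double points, double dpi) {
        return (int) Math.round(points / POINTS_PER_INCH * dpi);
    }

    // One byte per pixel for 8-bit grayscale, before any compression.
    static long rawGrayBytes(double widthPt, double heightPt, double dpi) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        // US Letter media box: 612 x 792 points.
        for (double dpi : new double[] {150, 200, 300}) {
            System.out.printf("%.0f DPI -> %d x %d px, %d raw gray bytes%n",
                    dpi, pixels(612, dpi), pixels(792, dpi), rawGrayBytes(612, 792, dpi));
        }
    }
}
```

For a Letter page this gives 2550 x 3300 pixels at 300 DPI, about 8.4 million bytes uncompressed, which is four times the raster of a 150 DPI scan of the same page.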
The second one is to replace each existing image with a grayscale equivalent:
private void convertByImageToGray() throws IOException {
    File pdfFile = new File(PATH);
    try (PDDocument document = PDDocument.load(pdfFile)) {
        List<COSObject> objects = document.getDocument().getObjectsByType(COSName.IMAGE);
        for (COSObject object : objects) {
            LOGGER.info("Class: {}; {}", object.getClass(), object.toString());
        }
        for (int pageNum = 0; pageNum < document.getNumberOfPages(); pageNum++) {
            PDPage page = document.getPage(pageNum);
            replaceImage(document, page);
        }
        document.save(NEW_PATH);
    }
}

private void replaceImage(PDDocument document, PDPage page) throws IOException {
    PDResources resources = page.getResources();
    Iterable<COSName> xObjectNames = resources.getXObjectNames();
    if (xObjectNames != null) {
        for (COSName xObjectName : xObjectNames) {
            PDXObject object = resources.getXObject(xObjectName);
            if (object instanceof PDImageXObject) {
                PDImageXObject img1 = (PDImageXObject) object;
                BufferedImage bufferedImage1 = img1.getImage();
                BufferedImage grayBufferedImage = convertBufferedImageToGray(bufferedImage1);
                // PDImageXObject grayImage = JPEGFactory.createFromImage(document, grayBufferedImage);
                PDImageXObject grayImage = LosslessFactory.createFromImage(document, grayBufferedImage);
                resources.put(xObjectName, grayImage);
            }
        }
    }
}

private static BufferedImage convertBufferedImageToGray(BufferedImage sourceImg) {
    ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_GRAY);
    ColorConvertOp op = new ColorConvertOp(sourceImg.getColorModel().getColorSpace(), cs, null);
    op.filter(sourceImg, sourceImg);
    return sourceImg;
}
But some files still grow to about three times their size (even when they were already grayscale; interestingly, in this case JPEGFactory produces larger files than LosslessFactory). All images in the grayscale PDF have the same size as the original ones, and I don't understand why.
Maybe there is a better way to make a grayscale PDF with a predictable size (other than Ghostscript)?
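One thing worth checking in the second approach: convertBufferedImageToGray filters the image into itself and returns the same object, so the image keeps its original type. If the source was a three-channel RGB image, the result still carries three channels per pixel even though it looks gray, and that may be part of why the output does not shrink. The sketch below is one possible alternative, not a confirmed fix: it draws the source into a new single-channel TYPE_BYTE_GRAY image using only java.awt. The class and method names are hypothetical, and whether PDFBox then stores the image more compactly has not been verified here.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper: produces a true one-channel gray image instead of filtering in place.
class GrayscaleSketch {

    // Draws the source onto a new 8-bit gray image of the same dimensions.
    static BufferedImage toByteGray(BufferedImage source) {
        BufferedImage gray = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            g.drawImage(source, 0, 0, null); // color conversion happens during the draw
        } finally {
            g.dispose();
        }
        return gray;
    }

    public static void main(String[] args) {
        BufferedImage rgb = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        BufferedImage gray = toByteGray(rgb);
        System.out.println("source bands: " + rgb.getRaster().getNumBands());
        System.out.println("gray bands:   " + gray.getRaster().getNumBands());
    }
}
```

Comparing getRaster().getNumBands() before and after is a quick way to confirm whether a conversion really reduced the image to one channel.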
UPDATE: I've just realized that the issue lies in creating a PDF from an image: the result is not compressed as well.
For example, I have a dummy one-page scan file that is smaller than 1 MB. But if I extract the image from it (by copying it from Acrobat Reader into Paint, or via the code above), its size is about 8-10 MB, depending on the method. And if I create a new PDF from this image, it is barely compressed. Here is example code:
File pdfFile = new File(FULL_FILE);
try (PDDocument document = PDDocument.load(pdfFile)) {
    PDPage page = new PDPage();
    document.addPage(page);
    PDImageXObject pdImage = PDImageXObject.createFromFile("example.png", document);
    try (PDPageContentStream contents = new PDPageContentStream(document, page)) {
        contents.drawImage(pdImage, 0F, 0F);
    }
    document.save(FULL_FILE_NEW);
}
Yes, LosslessFactory produces smaller files than JPEGFactory.
The link below lists different methods for achieving the same goal. Overall, the best-quality grayscale image was the one from Option 6, although it was by no means the fastest (I used Option 4 myself). Comparisons are also provided to help you choose.
This link contains possible ways to convert color images to black. It helped me a lot.
Let me know if it works for you, and accept my answer if it helped.

Remove PdfName.Rotate value without rotation

I have to combine multiple pages from several files into one new PDF. The orientation of all the pages must be portrait.
After this work is done, I use a couple of programs to reset the rotation to zero without actually rotating the page.
I want to use iText to remove the rotation value.
Based on the iText examples, I've tried something like this:
protected void manipulatePdf(String dest) throws Exception {
    PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC), new PdfWriter(DEST));
    int n = pdfDoc.getNumberOfPages();
    PdfPage page;
    PdfNumber rotate;
    for (int p = 1; p <= n; p++) {
        page = pdfDoc.getPage(p);
        rotate = page.getPdfObject().getAsNumber(PdfName.Rotate);
        page.setRotation(0);
        pdfDoc.close();
    }
}
And this:
PdfDictionary diccionario = page.getPdfObject();
diccionario.Remove(iText.Kernel.Pdf.PdfName.Rotate);
I also tried the CopyPagesTo function. In every case the result is the same: the page orientation is altered.
The goal is to set the rotation value of every page to zero while keeping portrait orientation. Here is an example file with pages rotated by 0, 90, 180 and 270 degrees:
https://filebin.ca/4vep0uuU1p2s/1.pdf
Any advice would be greatly appreciated.
I have found a solution using the SetIgnorePageRotationForContent function.
VB.NET example:
Dim srcPdf As iText.Kernel.Pdf.PdfDocument = New iText.Kernel.Pdf.PdfDocument(New iText.Kernel.Pdf.PdfReader(srcFile))
Dim destPDF As New iText.Kernel.Pdf.PdfDocument(New iText.Kernel.Pdf.PdfWriter(destFile))
For contador = 1 To srcPdf.GetNumberOfPages
    Dim srcPage = srcPdf.GetPage(contador)
    Dim rotacion As iText.Kernel.Pdf.PdfNumber = srcPage.GetPdfObject().GetAsNumber(iText.Kernel.Pdf.PdfName.Rotate)
    If IsNothing(rotacion) OrElse rotacion.IntValue = 0 Then
        srcPdf.CopyPagesTo(contador, contador, destPDF)
        Continue For
    End If
    Dim destPage As iText.Kernel.Pdf.PdfPage = destPDF.AddNewPage(New iText.Kernel.Geom.PageSize(srcPage.GetPageSizeWithRotation))
    If rotacion.IntValue = 180 Then
        destPage.GetPdfObject().Put(iText.Kernel.Pdf.PdfName.Rotate, New iText.Kernel.Pdf.PdfNumber(180))
    Else
        destPage.GetPdfObject().Put(iText.Kernel.Pdf.PdfName.Rotate, New iText.Kernel.Pdf.PdfNumber(rotacion.IntValue + 180))
    End If
    destPage.SetIgnorePageRotationForContent(True)
    Dim canvas As New iText.Kernel.Pdf.Canvas.PdfCanvas(destPage)
    Dim pageCopy As iText.Kernel.Pdf.Xobject.PdfFormXObject = srcPage.CopyAsFormXObject(destPDF)
    canvas.AddXObject(pageCopy, 0, 0)
    destPage.GetPdfObject().Remove(iText.Kernel.Pdf.PdfName.Rotate)
Next
destPDF.Close()
srcPdf.Close()
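The reason that simply deleting the Rotate entry changes what the reader sees is that the entry only tells the viewer how to display the page; the media box itself stays as it was. A page stored with a landscape media box and a rotation of 90 is shown as portrait, and it falls back to landscape once the entry is gone. That is also why the code above sizes the new page with GetPageSizeWithRotation. The arithmetic can be sketched in plain Java as follows; the class and method names are hypothetical, and rotations are assumed to be multiples of 90.

```java
// Hypothetical helper: the page size a viewer shows for a given media box and rotation.
class RotatedPageSize {

    // Maps any multiple of 90, including negative values, onto 0, 90, 180 or 270.
    static int normalize(int rotate) {
        return ((rotate % 360) + 360) % 360;
    }

    // Returns {width, height} as displayed; 90 and 270 swap the two media box sides.
    static float[] displayedSize(float mediaWidth, float mediaHeight, int rotate) {
        int r = normalize(rotate);
        boolean swapped = (r == 90 || r == 270);
        return swapped
                ? new float[] {mediaHeight, mediaWidth}
                : new float[] {mediaWidth, mediaHeight};
    }

    // A page is shown in portrait if its displayed height is at least its displayed width.
    static boolean isPortraitAsDisplayed(float mediaWidth, float mediaHeight, int rotate) {
        float[] size = displayedSize(mediaWidth, mediaHeight, rotate);
        return size[1] >= size[0];
    }
}
```

For example, a media box of 842 x 595 points with a rotation of 90 is displayed as 595 x 842 (portrait), while the same media box with the rotation removed is displayed as 842 x 595 (landscape).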

iText7. How to flatten an existing pdf document

I have masked an existing PDF document with images, as described in this question: iText7 Image Transparency
My issue is that somebody using Acrobat Reader DC Pro can still edit the document and remove the images, making the masking ineffective.
I have been thinking of flattening the PDF document, but it seems the API applies only to forms, not to the entire document.
I have tried the code below, but it is still possible to edit the PDF and remove the masking images.
Do you have any advice for this?
// Read the pdf input
PdfReader pdfReader = new PdfReader(value);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
PdfWriter pdfWriter = new PdfWriter(outputStream);
PdfDocument pdfDoc = new PdfDocument(pdfReader, pdfWriter);
Document document = new Document(pdfDoc);
// Creating an ImageData object
ImageData data = ImageDataFactory.create(fileName);
for (int x = 1; x < 800; ) {
    for (int y = 1; y < 1000; ) {
        Image image = new Image(data);
        image.setFixedPosition(x, y);
        document.add(image);
        y = y + y1 + 40;
    }
    x = x + x1 + 40;
}
PdfAcroForm.getAcroForm(pdfDoc, true).flattenFields();
// The content has now been modified, return it as a stream
document.close();
I expect: the image cannot be removed, or the document cannot be edited

Unable to put a stamp using iText 7 in Java on Skia-generated PDFs only (shows inverted stamp)

I am unable to place a stamp correctly with iText 7 in Java, but only on PDFs generated by Skia (Skia is the PDF library used by Google; for example, in Google Docs, click Print -> Save as PDF). The stamp lands in the wrong place: if I stamp at the top-left position of the page, it appears at the bottom left, and both the image and the text are shown inverted (mirrored). For all other PDFs the stamping is correct.
It seems that PDFs generated by Skia are missing some metadata.
Since you didn't share any code, nor any document, I created a PDF document from Google docs, and I used the code I wrote in answer to the Itextsharp 7 - Scaled and Centered Image as watermark question to add a Watermark in the center.
The result looked like this:
As you can see in the Document Properties, the original document was created using Skia/PDF m67; modified using iText® 7.1.3.
You need a Watermark in the top left, so I adapted the code like this:
public void createPdf(String src, String dest) throws IOException {
    PdfDocument pdfDoc = new PdfDocument(
            new PdfReader(src), new PdfWriter(dest));
    Document document = new Document(pdfDoc);
    PdfCanvas over;
    PdfExtGState gs1 = new PdfExtGState();
    gs1.setFillOpacity(0.5f);
    int n = pdfDoc.getNumberOfPages();
    Rectangle pagesize;
    ImageData img = ImageDataFactory.create(IMG);
    float iW = img.getWidth();
    float iH = img.getHeight();
    float x, y;
    for (int i = 1; i <= n; i++) {
        PdfPage pdfPage = pdfDoc.getPage(i);
        pagesize = pdfPage.getPageSize();
        x = pagesize.getLeft();
        y = pagesize.getTop() - iH;
        over = new PdfCanvas(pdfDoc.getPage(i));
        over.saveState();
        over.setExtGState(gs1);
        over.addImage(img, iW, 0, 0, iH, x, y);
        over.restoreState();
    }
    document.close();
    pdfDoc.close();
}
The result looks like this:
The image isn't mirrored, and it's at the top-left position of the page. In short: there doesn't seem to be any problem with PDFs created with Skia/PDF m67.
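The placement in the code above relies on PDF coordinates starting at the bottom left, so an image anchored by its lower-left corner sits at the top when y = top - imageHeight. The sketch below reproduces that arithmetic in plain Java. It also shows, purely as an assumption, what a vertical flip left active in the graphics state would do to the same anchor point: it moves it to the bottom left and mirrors the content, which matches the symptom described in the question. Nothing here confirms that such a flip is present in the asker's files, and the class and method names are hypothetical.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical helper: top-left placement arithmetic and the effect of a vertical flip.
class StampPlacement {

    // Lower-left corner for an image whose top-left corner touches the page's top-left.
    static Point2D topLeftAnchor(double pageLeft, double pageTop, double imageHeight) {
        return new Point2D.Double(pageLeft, pageTop - imageHeight);
    }

    // Where a point ends up if the matrix "1 0 0 -1 0 pageHeight" were still in effect
    // (assumption for illustration only): x is kept, y becomes pageHeight - y.
    static Point2D underVerticalFlip(Point2D point, double pageHeight) {
        AffineTransform flip = new AffineTransform(1.0, 0.0, 0.0, -1.0, 0.0, pageHeight);
        return flip.transform(point, null);
    }
}
```

For a 612 x 792 point page and a 50 point tall image, topLeftAnchor gives (0, 742); under the assumed flip that point maps to (0, 50), near the bottom edge.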
