Hi, I want to print a PDF file with a page size of 100x100:
// This is the Java code that creates the page
PDRectangle myPageSize = new PDRectangle(100, 100);
PDPage myPage = new PDPage(myPageSize);
But when I try to print it, the page seems to have a margin and is not fully scaled. Below is the code I use to print with the Java Print API. I use the PDFBox library to create the PDF.
PDFPrintable printable = new PDFPrintable(document, Scaling.STRETCH_TO_FIT);
PrinterJob job = PrinterJob.getPrinterJob();
job.setPrintable(printable);
job.print();
I also use this printer.
I expect the PDF to fill the whole paper (100mm x 100mm) on my thermal printer.
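One thing worth checking, as a sketch rather than a confirmed fix: `PDRectangle` dimensions are in PDF points (1/72 inch), so `new PDRectangle(100, 100)` describes a page of roughly 35 mm x 35 mm, not 100 mm x 100 mm. The margin typically comes from the default `PageFormat`, whose imageable area is smaller than the paper. The helper below uses only `java.awt.print`; the class name `ThermalPaper` and its methods are hypothetical names chosen for illustration.

```java
import java.awt.print.PageFormat;
import java.awt.print.Paper;

final class ThermalPaper {
    /** PDF and Java printing both measure in points: 72 points per inch, 25.4 mm per inch. */
    static final double POINTS_PER_MM = 72.0 / 25.4;

    static double mmToPoints(double mm) {
        return mm * POINTS_PER_MM;
    }

    /** Builds a PageFormat whose imageable area covers the whole sheet (no margins). */
    static PageFormat borderless(double widthMm, double heightMm) {
        double w = mmToPoints(widthMm);
        double h = mmToPoints(heightMm);
        Paper paper = new Paper();
        paper.setSize(w, h);
        paper.setImageableArea(0, 0, w, h);
        PageFormat format = new PageFormat();
        format.setPaper(paper);
        return format;
    }
}
```

The same conversion could size the page itself, e.g. `new PDRectangle((float) ThermalPaper.mmToPoints(100), (float) ThermalPaper.mmToPoints(100))`, and the format could be handed to the job with `job.setPrintable(printable, ThermalPaper.borderless(100, 100))`. Whether the driver honors a zero margin depends on the printer.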
I tried many things to write Hindi characters using Apache PDFBox, but it seems to be an existing issue in the library.
I tried many of the available font files. Can someone help me out with this?
I tried the following:
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDFont font = PDTrueTypeFont.loadTTF( doc, new FileInputStream(new File("D:\\Data\\fonts\\dn.ttf")));
font.setFontEncoding(new WinAnsiEncoding());
PDPageContentStream content = new PDPageContentStream( doc, page, true, false );
content.setFont(font, 10);
content.beginText();
content.moveTextPositionByAmount( 200, 100 );
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
content.endText();
content.close();
doc.save( new FileOutputStream(new File("D:\\testOutput1.pdf")));
doc.close();
It's working for me in PDFBox.
The trick here is to use a non-Unicode string instead of a Unicode string.
Use the Kruti Dev font given in the link below.
Then convert your Unicode string to a non-Unicode string.
Finally, use that converted string in your code.
That means replacing this line:
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
with this line:
content.drawString( "fganh" ); // Writing word "Hindi" in hindi language.
Convert Unicode (Mangal) To Kruti Dev Font
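Why the legacy-font trick works can be illustrated without PDFBox: `WinAnsiEncoding` corresponds closely to the Windows-1252 code page, which has no Devanagari code points, whereas the converted Kruti Dev string is plain ASCII that the font maps to Hindi glyphs. The sketch below uses the JDK's `windows-1252` charset as a stand-in for `WinAnsiEncoding` (an approximation, not PDFBox's own check); the class name `EncodingCheck` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class EncodingCheck {
    /** True if every character of the text exists in Windows-1252 (stand-in for WinAnsiEncoding). */
    static boolean fitsWinAnsi(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        String unicodeHindi = "\u0939\u093f\u0902\u0926\u0940"; // the Unicode word from the question
        String krutiDev = "fganh";                               // the converted string from the answer
        System.out.println(fitsWinAnsi(unicodeHindi)); // prints false
        System.out.println(fitsWinAnsi(krutiDev));     // prints true
    }
}
```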
I think this cannot be done using PDFBox, as there are a lot of issues with it.
I tried many fonts and PDFBox encoding types but failed to write in Hindi.
In the end I tried it in Node.js Express with pdfmaker(), which converts HTML to PDF. I had issues on my Linux server at first, but after I installed the appropriate TTF font it worked!
We have created an application that generates PDF documents with the iText 5 library. As part of the PDF generation, we tried to embed an inline image in the PDF that should be non-editable and read-only. We tried the addImage method of PdfContentByte as shown below:
byte[] decoded = Base64.getDecoder().decode(encodedImage);
image = Image.getInstance(decoded);
After the image is retrieved, we pass it to the addImage method:
PdfContentByte canvas = pdfStamper.getOverContent(item.getPage(0));
canvas.addImage(image, Boolean.TRUE);
Findings: when the image comes from a Base64 string, it is not displayed in the resulting PDF document (if the image is not Base64-encoded, it works fine).
When we open the PDF, the following error is shown: "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem."
How can we handle this situation?
Is there any other way to meet this requirement? Please help.
Code:
PdfReader resultantPdfReader = new PdfReader("template.pdf");
PdfStamper resultandPdfStamper = new PdfStamper(resultantPdfReader, new FileOutputStream("A13.pdf"));
AcroFields acroFields = resultantPdfReader.getAcroFields();
Rectangle fieldPosRec = acroFields.getFieldPositions("imageField").get(0).position;
String encodedSignature = "";
encodedSignature = new String(Files.readAllBytes(Paths.get("MyImage.png")));
if(encodedSignature.indexOf("data:image/png;base64,") != -1) {
encodedSignature = encodedSignature.substring("data:image/png;base64,".length());
}
Image image = null;
try {
byte[] decoded = Base64.getDecoder().decode(encodedSignature);
image = Image.getInstance(decoded);
} catch (Exception e) {
e.printStackTrace();
}
image.scaleAbsoluteHeight(fieldPosRec.getHeight());
image.scaleAbsoluteWidth(fieldPosRec.getWidth());
image.scaleToFit(fieldPosRec);
acroFields.removeField("imageField");
image.setAbsolutePosition(fieldPosRec.getLeft(), fieldPosRec.getBottom());
PdfContentByte canvas = resultandPdfStamper.getOverContent(item.getPage(0));
canvas.addImage(image, Boolean.TRUE);
resultandPdfStamper.close();
resultantPdfReader.close();
Base64 Image String : iVBORw0KGgoAAAANSUhEUgAAAU4AAACWCAYAAACxSWGfAAAAAXNSR0IArs4c6QAAFJFJREFUeAHtnQmwZFV9hw37DgoiaJgZsGBAFgkSIEJgjAZRAliFikrEoOgoGkQlEFJKiXFhr5RCAKMjJhBETAoRImjAB4KAKBgEDBBhQM2IyEDYh2XM90Gf8k7T3a/fe9093X1//6rv3XOXvsvX9/773HNP93vBCxIxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAMxEAPNBl7MhEObJ2Y8BmIgBmKgtYG9mTwBv2vAIBEDMRADMdDKwCuYeCmUhFmGrZbNtBiIgRhoa2AN5uwDZ8L3oCST6Q6vYR3nwnyYC8MQ67MTp8JT4HEthsMbZccTMRADMTCpgU1Z4kPwbXgCppsku3ndIta/vBLpymz7w2CidF9NnCZQE6lR9v+5sfyNgRiIgYqBlSjvASfALVAShsNn4Br4OPwRzDSsZc4Hk+UiqG7LstOcNx/mQj9iM1Z6LFwNZfveonurXmJVCmVemZZhDMRAzQ1swPG/E74GD0BJEg4dPw8OAp8u9zNMjibJc2ERVPfDstOcNx/mwnRjLV54MFwBS6Fsx3X7MKg5jmeCyxzXPCPjnQ38QefZmRsDI2dge/bYJCE7wwpQ4mcULm5wFcOny4wBD02O8yps1LR9b6sfhvsa/GaS8k7M/yvYH9YE41H4NzgLJsAEWQ3dWBs1Xg0/fLaUPzEQA7UysAdH+2UotSyHtl1eAn8Nm8Kwhol0PlgzvA6qxzCVsrVMa5sHg7XPdrEqM24F153aZjtLHaanxtlBTmaNhAFvs0+EdzX29iaGJp+L4DKw5jVqYa1xQ/DYpF3ZeS77A/gJfBXuhMnCW/QjwRq4bbpLIBEDMVADA37ovwfuB2tOj8MnYBVItDfgLbpNFLJT+8UyJwZiYNwMbM0BfR/KbaxPi18+bgfZ4+Pxg+bNcCPkFr3HcrO63hlYl1VtA3vBe+FYsA3uO3Ar/Aouh9PhjbAaJDobWIPZn4MnwYt/EbwNEp0NvI7Z10P5oDmasu2ciRgYGgOvYk9OgnKSdju0Le6bcAhsDIllDezN6F2gz2fgNPDDKdHewI7M+i6Uc9AP6/eBfVoTMTAUBjwZj4HylTbb3qxZWsNcANY4rXlaA7Um6kW/A/gaawNLoZzglp3mvF50yGY1IxsvY8+/AcXNDZT/eGSPZjA7vgWb+TqUc+oBykfB6pCIgaExsCV7Yj84L25P1lNgqrfe1jJNrNY6rX2WROHwHNgX6hQrcrAfhodABw79brXTE60NvJTJZ0L58H6M8nHwQkjEwNAYsMH9MPAE9eJeCPNgpmHS9db0DLCGVZLoBGVvv8Y9rFFWj9uO3NY8E60NmBhNkOU8NHGaQE2kiRgYKgObsDeXQUlqCyiv04c9tAngQ3AfuC1rtGfDLBi3sPniVLAN02O9C/wASbQ24K23t+CLoZwb3qJ7q56IgaEzcBB79CB4st4L+0K/w6RireJxcLsOfcLcj2TNagceB7DF/wWPzafmHqtP0RPPN+CHqQ95fNijL/ku1OFuhMNMjJqBF7PD/w7lZLXstEHGbDZmm6c1T/fDmugHwYtpFMP+l5dCcXoVZR+eJZY1YLPQLnAyXAfF1/WUXweJGBhKA9YqrV16wlrbtNa5PMPaxRVQLqD/przf8tyhKW57FZb3Z9xKDdpeCIeACS
LxnIFqsrybSeW9dngevAXiCwmJ4TPgrfACKCet7Zq2bw5LmCxvg7J/E5SH/ZZtJ/bRC7/s81cpD7rmziaHMjoly3vY41PAmmcS5lC+fdkpDcyDheAF7hPLw2AYT1hv00fhAZLfVrHt0u9J6/QCmAd1jyTLup8BY3L8q3EcfrKXdkT7aM4dgWNr9QDJRGWteXmHtcxbwYRp4jweTKR1jSTLur7zY3rca3FcJ4IXuP3hjgFrdKMUs9jZYXmA1FzLNHnuPEoye7iv1WTpbbfnWCG34T0UnVUN3sD5jZN5gqHfOx/laH6AdCUHczRsNKCDSi3zuaad8jQ8yXJAJ142M1gDR7I5awA+Nd98sJvu69Z8gGTH6FK7sSbtt3H2hH602da1lrk6Pv2wejf8A1wOd0Dx7tDkaZeiPOBBQmL0DbyWQ7DtzXbNfUf/cJ53BCbI14MJ08RZLuY7KfeyFlqHWqYuNwU/kD4B3qXYs6F846m4LUMTaJIlEhLjZWAWh3MfeKJ/arwOreXReKv+d2DSLBf3TGuh41rL9MHarvABOB2uhvLDI8VdGfqNp5+C7ctHwRvgZZCIgbEz4BP068GT/z9gBahLdKqFmli7bQsdh1qmv7o0F+xU/vfgr1TdBSUpNg/9euglcCK8E14JdupPxEAtDHyZo/Si+DnU+ee3plMLHdVa5ga8138Gh8MC+BE8Bs3J0XG/2eR8l3N5X+frEzFQWwPv58i9OB6F7WprYdkDb1cLbZVUyrRh7ZdpDdD39S/hBLCGaE2x7HfzcCHzLoRPw1thS7AmmoiBGGgY2IXhEvDieUdjWgbLGii1UB8oNSeZMm5H9mHol2lb4l5wJJwNN4FtjmU/q0PbKG2rtM3yUNgNbMtMxEBXBqxd1DFewkH/GLzY7DbyEUiMhgFrkVuBbYoF30drh82xlAn/AybRKgsZN5EmYmBaBuqYOP0W0GWwO1wJpRsSxcSQGdiQ/fFWuyRIhybNlaE5bmeCt+HVBHkL47ZdJmKgpwZG7auEvTj4k1iJSfNXYDuW7XOJ5WvA83ALqCZIyxu32C1rkfaZ/K8mftli2UyKgb4YqFuN07ZM+9jZ9rUHXAuJwRqw50JzLXIbpvmEvjlsi7QGWU2SNzOeWmSzqYwP1ECdapxerP/UsHsYwyTN/p5q9od9OTTXIme12KztjXdCNUFaXghpi0RCYrgM1KXGaS3HvnibwQJ4DyR6Z2BtVrUtVJOk42u22IS1Rb9dU02S1iofbrFsJsXAUBqoQ+K05nMR+PU3k6ddT+yGlJiegTm8rJogLfuB1Opc+gXTqwnSsk+5badMxMDIGqjDrfoneXdMmr+F/SFJEwldxOosY9tjNUna3LFui9fq1CfYzUnygRbLZlIMjLyBVrWEkT+oygHsS/kCsIazJ1wOiecbsB9kNUFa9im3tfXm+DUTmhOkT7nTO6HZVMZjYAQNeOE/CD5c+JsR3P9+77Lft9bLhaCjZux5YNvjv8AR8OewISRiIAbG1IA16c+DycAf8E383sCrKf4zPAElWfoDJ34p4BR4F2wPq0AiBmKgRgZ251hNCn6tcq0aHXe7Q9XB++AnUJLlM5R9aPZGaHVLzuREDMRAnQycycGaID5Tp4NucaxbM+0L8H9QEua9lD8LsyERAzEQA88a8BZzMZgoXvHslHr9WZnDPQCugJIsHX4f3g65BUdCIgZiYFkDb2LURHHjspPHfmwWR2gN26feJWE+RPk0sFtRIgZiIAbaGjifOSaOj7VdYnxm2Da5F/hk3DbLkjDtLvR+SPsuEhIxEAOdDazL7MfBJGLfxHENuxIdCT+Hkix9Sn427AqJGIiBGOjawLtZ0kRyWdevGK0F7Upkv8pqV6I7GT8KTKaJGIiBGJiyAROmifPgKb9yeF/g7fZ8aO5K9C2mpSvR8L5v2bMYGAkD3pp7i+6t+jojscedd9KuRKdCuhJ19pS5MRADMzBwBK+1tjnK3xSyq9DboLkr0ZVMS1
ciJCRiIAZ6a6Dcyu7X29UOZG2z2YpdieycbvKX0pXImmciBmIgBnpuwORisrkfRqWDt12J/Lm7Vl2JbNNMVyIkJGIgBvpn4LOs2sR5Rv820dM1b8La/EGNUrssXYl8ap6IgRiIgYEYuIitmIT+dCBbm9lGtuPl/kdG9/eHYH/MdCVCQiIGYmCwBi5hcyaiXQa72Slv7bW8ojwln6C83pTXkBfMxIA/N+g5cjJ4ziRioNYGvBBMnA6HNQ5kx5aA+/k1aPXvcJmc6LGBarK8m3Xrv7BVj7eV1cXASBmwFuHFcA94oQxb+M2epeA+ngTDuI/s1thEp2TpOWL7sudM3oexectzINMx4AXgBWFi8oIYlvDJub9O5H7ZOf8wSPTHQDVZlnNB75Jk2R/nWesYGBi22/XVcXoBeOH6bSb/w2aitwaSLHvrM2uroYFyu/4bjv1zsM1ydOBT8mvApHk/5BeLkNCDMFH6vn4QzoPypYdqzdIP0NyGIyERA90Y8KLaB8pF5PAm+FuYDYOKzdjQ7eD274ItITE9AyvyslfBR8Haux9Ceq0ywXiSJRISMTBdAyZP+3KeDr+FcoH5YMZ/HfEB6GefyR1Z/73gdm+AjSDRvQG/9eWXAI6Gb4NfOy3vYRnezTR/Wu+9MBcSMRADPTTg/975C/hXeBTKhfcU5YvgHbAm9Cr8ibdHwO1cAmtBorMB24FfA5+Ey+ExKO9TGd7BtC/BQTAHEjEQAwMyYIK0H+XFYOIsF6WJ7hzYG0y0041DeOHT4Hq/AitB4vkG1mGS/+rDNuir4Uko74VD7wxuBnsiHAAbQyIGYmAIDGzAPhwKV4EXarlwvbX/R9gNvOXvNo5lwbKOT3X7opostz7H6a9V2W/yR1A+XIovx3/cmP8mhi6fiIGhNzCVBDH0BzONHZzNa7xll20qr19E+T5Y2IYHmG6t8otwMNhH03+O9iWoc1hDtI15D9gdtobqOWZt3wR6BVwJ1jptx0zEwEgZqJ7UI7XjfdjZbVmnCXQH2HOS9Xux625tMBmcBT7MWNjAxFqH8IPHBFkS5eZNB/0449dBSZTXUrYdMxEDI20gibP12/ciJs/pgAmzU5hYF3ZgVBPrFhxTNVHOYrwaDzPyAyiJ8nrKT1YXSDkGxsFAEufU3kX7ZFqznAN+je848MnwnArWwnwI0ikmS6zOXxVWaQy7LXe7XLv1dnq9TRPOr8ZiRmwrLonyRso2WyRiYKwNJHF2//buyqIXwovAW859wAdKrcJl5rShm8Taap3DMM3fD70bSqL0CfjvhmHHsg8xMEgDSZzd2d6fxc6G1eCb8Haw/W660SmxrstK7ThvzW1JA293S9lhdXyq5Zm83vbcRAzEQAxMauAwljCJWbOyf+EKkIiBGIiBGGhhwNr4iWDCXAp+5z0RAzEQAzHQxoAPT84Fk6a3wgdCIgZiIAZioI2B9Zg+ASZN/z+Q/ycoEQMxEAMx0MaAXYu+ACbNX8J2kIiBGIiBGOhgwO+qmzS/A5t0WC6zYiAGYiAGMGC/TJPmE/BKSMRADMRADHQwYL9Jf9TDxHl4h+UyKwZiIAZiAAN2O7oUTJoO86UAJCRiIAZioJOBjzLTpOk/ebPmmYiBGIiBGOhgwLZM2zRNnLZxJmIgBmIgBjoYsOvRLWDS9KuUiRiIgRiIgUkMlK5HJk+TaCIGYiAGYqCDgXQ96iAns2IgBmKg2UC6HjUbyXgMxEAMdDCQrkcd5GRWDMRADLQy8BEmputRKzOZFgMxEAMtDKTrUQspmRQDMRAD7Qyk61E7M5keAzEQA20MpOtRGzGZHAMxEAOtDKTrUSsrmRYDMRADbQxUux75YCgRA3U2sBMHf1ydBeTYJzeQrkeTO8oS9TCwKodpwnwa8rsM9XjPp32U6Xo0bXV54RgZsJZ5K5gwTZzHg4k0MQMD4/rbk3Y9ug48QfaFb0FitAz4n0bXqrB2pTzZ9FbLei74z/fE/yf1i6Zhmeb8cQiP91
g4AlaEn8HB4HWRmKGBlWb4+mF8uR8G88ETx6fpSZpI6HN0k+RaJTMTYKvpTlu5D/u8Jut8KWzVYd2PMK9dUh325DqXfZ/XYHuGW8IzcAIcA0sg0QMD41jj3AUv18CNsCs8DompG/BDdUPwAVth40q5TDPJvQR6HU+xQpPYw42h5UKrad0suwbr+MMG/jM+y81Dk/lk4baGJblamzwQ9oa3QjXOYOQsSC2zaqUH5XGscb6l4eV7DJM0n3+SvJBJJek1D6uJcQOW6/aD9VGWfRJaJbRupzUnPtfX63CdD8LNHVa8LvOak2nzuMnVWmunmut/Mv8mOB9MXLYx9ipKspzHCr39LuFPJP4UJhrcxjDRBwPdXhh92HRfVunx3A2e6H8C10IdYjUO0lpfNfE1J8UybhNGN/EMC/kP7H7dhkWV6Q9RrlN0Sq7WYtcHa+sl7qHwDZhpEi0J05qlNcwSX6EwAeeA71uizwbGLXHugi9v072Nmg29/JRndQONFdiatb6S8By2S4zrTWHPTHImw2ria5Uc/R9MS6ew3iz6ewNeVzuDCe7N4Ad5Cc9NE+hUkmhJmB/ndZs3VnQxw69DkmVDyCAH45Y4T0ae/3ztFPjYIEVOYVu2CVaTYbVcTYzWWLxguglvQe+FagJslxjTfNGN0d4tM90kOpddmNdgW4Zbg3EHfBqSMLWxnGLcEucleHw9nAufh+th1SZ8Atw8rTrez/m2KbutbsLa8v1QTYbV8qLKvMXdrDDLLHcDJYnaDi/VmuhpjN8FO8IWsANUw9qltcwkzKqV5VQet8RpY/2ty8llt5v1FvgRqCa+akIsZWuQPllOjKeBuRzWfPB23vbp5ge1NzDtdphocBvDxJAYGLfEWf1E99bmNfA0LAFvZx22Y1Dzkwx5E2oYJsp5sB/sBjbZNMdFTJAJSKJEQiIGYqC+Bmx2aYVPwK1VngTbQSIGYiAGYqBhoJo0TZQ+Ud8f1mnMzyAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGYiAGRs3A/wMpp3D77pc5JQAAAABJRU5ErkJggg==
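Independent of how iText handles the image, the decoding step in the code above is fragile: it assumes the `data:image/png;base64,` prefix sits at index 0, and a trailing newline in the file would make the strict `Base64.getDecoder()` throw. A more defensive sketch using only the JDK follows; `DataUriDecoder` and `decodePng` are hypothetical names, and the PNG signature check is an added sanity test, not something the original code does.

```java
import java.util.Arrays;
import java.util.Base64;

final class DataUriDecoder {
    private static final String PREFIX = "data:image/png;base64,";
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};

    /** Strips an optional data-URI prefix, decodes the Base64 payload, and checks for a PNG header. */
    static byte[] decodePng(String content) {
        String payload = content.trim();
        if (payload.startsWith(PREFIX)) {
            payload = payload.substring(PREFIX.length());
        }
        // The MIME decoder ignores line breaks inside the payload.
        byte[] decoded = Base64.getMimeDecoder().decode(payload);
        if (decoded.length < PNG_SIGNATURE.length
                || !Arrays.equals(Arrays.copyOf(decoded, PNG_SIGNATURE.length), PNG_SIGNATURE)) {
            throw new IllegalArgumentException("Decoded data is not a PNG image");
        }
        return decoded;
    }
}
```

The returned bytes could then be passed to `Image.getInstance(byte[])` exactly as in the question.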
Your example image has properties that iText cannot properly translate into an inlined image. Unfortunately it does not recognize this and outputs an erroneous result PDF.
In particular your image file uses transparency. Inline images don't allow for Mask or SMask entries (which images in PDFs use to represent transparency). Thus, your image as is cannot be used as inline image.
As a result the inline image created by iText only consists of a black rectangle while the transparency information (which contains the line drawing) is dropped.
Furthermore, your image uses a calibrated RGB color space. Such calibrated RGB color spaces cannot be inlined themselves, so the color space definition has to be put into the page resources. iText, though, when creating an inlined image, fails to reference that non-inlined part properly.
As a result the inline image created by iText references a color space by the wrong name, causing the "An error exists on this page" error message by Adobe Reader. Fixing that reference one gets a valid result PDF showing the black rectangle mentioned above.
In a comment you explain that your actual objective is to prevent copying image from the generated pdf document.
In general this obviously is not possible - any information a PDF viewer can access to draw on screen or paper also can be accessed by some PDF processor designed for that task to copy to some file. (Let's ignore proprietary, viewer specific DRM extensions here.)
What you can try, though, is draw your data in such a way that the common PDF viewers don't offer to copy it. Trying to use an inline image as you did is one approach in that direction. Other approaches wrap the image in other structures, e.g. in a pattern:
PdfContentByte canvas = resultandPdfStamper.getOverContent(1);
Rectangle pageSize = resultantPdfReader.getPageSize(1);
PdfPatternPainter painter = canvas.createPattern(pageSize.getWidth(), pageSize.getHeight());
painter.addImage(image);
canvas.setColorFill(new PatternColor(painter));
canvas.rectangle(0, 0, pageSize.getWidth(), pageSize.getHeight());
canvas.fill();
(AddImageInPattern test testAddToPageTest3)
Adobe Acrobat Reader here does not offer to copy that image. Also my (admittedly older) Adobe Acrobat Pro does not offer to copy it, merely to remove it (more exactly, remove the whole rectangle filled with the pattern).
Beware, though, what the common PDF viewers do or don't offer is a moving target...
I'm trying to open a PDF file in iText 7, write a new piece of text into it, apply a font from the original PDF to that text, and save the result as another PDF document. I'm using Java 1.8.
Thus, I need the set of font names used in the original PDF, from which the user will choose one to apply to a new paragraph.
I also need to apply this font somehow.
For now I have this piece of code, which I took from here:
public static void main(String[] args) throws IOException {
PdfDocument pdf = new PdfDocument(new PdfReader("example.pdf"));
Set<PdfName> fonts = listAllUsedFonts(pdf);
fonts.stream().forEach(System.out::println);
}
public static Set<PdfName> listAllUsedFonts(PdfDocument pdfDoc) throws IOException {
PdfDictionary acroForm = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.AcroForm);
if (acroForm == null) {
return null;
}
PdfDictionary dr = acroForm.getAsDictionary(PdfName.DR);
if (dr == null) {
return null;
}
PdfDictionary font = dr.getAsDictionary(PdfName.Font);
if (font == null) {
return null;
}
return font.keySet();
}
It returns this output:
/Helv
/ZaDb
However, the only font example.pdf uses is Verdana (according to the document properties in Adobe Acrobat Pro). Moreover, Verdana appears in two implementations: bold and normal.
So, I have these questions:
1. Why does this function return two fonts instead of one (Verdana)?
2. How can I generate readable font names to display to the user (e.g. Helvetica instead of Helv)?
3. How can I apply font got from the original document to the new paragraph?
Thank you in advance!
If you only wish to display the names of the fonts being used (which you are legally allowed to do) you can use the following code:
public void go() throws IOException {
final Set<String> usedFontNames = new HashSet<>();
IEventListener fontNameExtractionStrategy = new IEventListener() {
@Override
public void eventOccurred(IEventData iEventData, EventType eventType) {
if(iEventData instanceof TextRenderInfo)
{
TextRenderInfo tri = (TextRenderInfo) iEventData;
String fontName = tri.getFont().getFontProgram().getFontNames().getFontName();
usedFontNames.add(fontName);
}
}
@Override
public Set<EventType> getSupportedEvents() {
return null;
}
};
PdfCanvasProcessor parser = new PdfCanvasProcessor(fontNameExtractionStrategy);
File inputFile = new File("YOUR_INPUT_FILE_HERE.pdf");
PdfDocument pdfDocument = new PdfDocument(new PdfReader(inputFile));
for(int i=1;i<=pdfDocument.getNumberOfPages();i++)
{
parser.processPageContent(pdfDocument.getPage(i));
}
pdfDocument.close();
for(String fontName : usedFontNames)
{
System.out.println(fontName);
}
}
You should not reuse a font from one PDF in another PDF, and here's why: fonts are hardly ever fully embedded in a PDF document. For instance: you use the font Verdana regular (238 KB) and the font Verdana bold (207 KB), but when you create a simple PDF document saying "Hello World" in regular and bold, the file size will be much smaller than 238 + 207 KB. Why is this? Because the PDF will only consist of a subset of the font Verdana regular and a subset of the font Verdana bold.
You may have noticed that I am talking of the font Verdana regular and the font Verdana bold. Those are two different fonts from the same font family. Reading your question, I notice that you don't make that distinction. You talk about the font Verdana with two implementations, bold and normal. This is incorrect. You should talk about the font family Verdana and two fonts, Verdana bold and Verdana regular.
A PDF usually contains subsets of different fonts. It can even contain two different subsets of the same font. See also What are the extra characters in the font name of my PDF?
Your goal is to take the font from one PDF and use it in another PDF. However, suppose that your original PDF only contains the subset that is required to write "Hello World" and that you want to create a new PDF saying "Hello Universe." That will never work, because the subset won't contain the glyphs to render the letters U, n, i, v, and s.
Also take into account that fonts are usually licensed. Many fonts have a license that states that you can use the font to create a document and embed that font in that document. However, there is often a clause that says that other people are not allowed to extract the font to use it in a different context. For instance: you paid for the font when you purchased a copy of MS Windows, but someone who receives a PDF containing that font may not have a license to use that font. See Does one need to have a license for fonts if we are using ttf files in itext?
Given the technical and legal issues related to your question, I don't think it makes sense to work on a code sample. Your design is flawed. You should work with a licensed font program instead of trying to extract a font from an existing PDF. This answers question 3: How can I apply font got from the original document to the new paragraph? You can't: it is forbidden by law (see Extra info below) and it might be technically impossible if the subset doesn't contain all the characters you need!
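The subset problem can be made concrete with plain Java. The sketch below treats the characters of the original text as a stand-in for the glyphs kept in an embedded subset (a simplification: real subsets are defined by glyph IDs, not characters) and lists what a new text would be missing. `SubsetCheck` and `missingGlyphs` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SubsetCheck {
    /** Returns the characters of newText that do not occur in embeddedText, in order of first use. */
    static Set<Character> missingGlyphs(String embeddedText, String newText) {
        Set<Character> available = new LinkedHashSet<>();
        for (char c : embeddedText.toCharArray()) {
            available.add(c);
        }
        Set<Character> missing = new LinkedHashSet<>();
        for (char c : newText.toCharArray()) {
            if (!available.contains(c)) {
                missing.add(c);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(missingGlyphs("Hello World", "Hello Universe")); // prints [U, n, i, v, s]
    }
}
```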
Furthermore, the sample you found on the official iText web site looks for the fonts defined in a form. /Helv and /ZaDb refer to Helvetica and ZapfDingbats. Those are two fonts of a set of 14 known as the Standard Type 1 fonts. These fonts are typically not embedded in the document since every viewer is supposed to know how to render them. You don't need a full font program if you want to use these fonts; the font metrics are sufficient. For instance: iText ships with 14 AFM files (AFM = Adobe Font Metrics) that contain the font metrics.
You wonder why you don't find Verdana, since Verdana is used as the font for the text in your document, but you are looking in the wrong place. You are asking iText for the fonts used for the form, not for the fonts used in the text. This answers question 1: Why does this function return two fonts instead of one (Verdana)?
As for your question 2: you are looking at the internal name of the font, and that internal name can be anything (even /F1, /F2, ...). The PostScript name of the font is stored in the font dictionary (its /BaseFont entry). That's the name you need.
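For display purposes, the PostScript name usually needs one more cleanup step: a subsetted font carries a tag of six uppercase letters and a plus sign in front of its name (for example `ABCDEF+Verdana-Bold`). A small sketch that removes that tag, with `FontNames` and `displayName` as hypothetical names and the sample font names made up for illustration:

```java
import java.util.regex.Pattern;

final class FontNames {
    // Subset tag used for PDF font subsets: six uppercase letters followed by '+'.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+");

    /** Removes a leading subset tag, if present, and returns the remaining font name. */
    static String displayName(String postScriptName) {
        return SUBSET_TAG.matcher(postScriptName).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(displayName("ABCDEF+Verdana-Bold")); // prints Verdana-Bold
        System.out.println(displayName("Helvetica"));           // prints Helvetica
    }
}
```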
Extra info:
I checked the Verdana license:
Microsoft supplied font. You may use this font to create, display, and print content as permitted by the license terms or terms of use, of the Microsoft product, service, or content in which this font was included. You may only (i) embed this font in content as permitted by the embedding restrictions included in this font; and (ii) temporarily download this font to a printer or other output device to help print content. Any other use is prohibited.
The use you want to make of the font is prohibited. If you have a license for Verdana, you can embed the font in a PDF. However, it is not permitted to extract that font and use it for another purpose. You need to use the original font program.
I've located a region of interest in the page by tracking TextPosition objects using PDFTextStripper as shown in the example: https://github.com/apache/pdfbox/blob/trunk/examples/src/main/java/org/apache/pdfbox/examples/util/PrintTextLocations.java
As shown there, the position is read from TextPosition methods such as
text.getXDirAdj(), text.getWidthDirAdj(), text.getYDirAdj(), and text.getHeightDir().
Starting from the following example, I kept everything the same except that I set the crop box of the target page:
https://github.com/apache/pdfbox/blob/2.0.3/tools/src/main/java/org/apache/pdfbox/tools/PDFToImage.java
OLD CROPBOX: [0.0,0.0,595.276,841.89] -> NEW CROPBOX [50.0,42.0,592.0,642.0].
So how can I use getYDirAdj and getXDirAdj to set the crop box correctly?
The original pdf file I'm processing can be downloaded from here: http://downloadcenter.samsung.com/content/UM/201504/20150407095631744/ENG-US_NMATSCJ-1.103-0330.pdf
Cropping the page
In a comment the OP reduced his problem to
Ok. Given a java PDRectangle rect = new PDRectangle(40f, 680f, 510f, 100f) obtained from TextLocation how would a java code snippet, that sets the cropBox of a single page look like ? Or how would you do it? TextLocation based rect --> some transformation --> setCropBox(theRightBox).
To set the crop box of the page twelve of the given document to the given PDRectangle you can use code like this:
PDDocument pdDocument = PDDocument.load(resource);
PDPage page = pdDocument.getPage(12-1);
page.setCropBox(new PDRectangle(40f, 680f, 510f, 100f));
pdDocument.save(new File(RESULT_FOLDER, "ENG-US_NMATSCJ-1.103-0330-page12cropped.pdf"));
(SetCropBox.java test method testSetCropBoxENG_US_NMATSCJ_1_103_0330)
Adobe Reader now shows merely this part of page twelve:
Beware, though: the page in question does not only specify a media box (mandatory) and a crop box, it also defines a bleed box and an art box. Thus, applications that consider those boxes more interesting than the crop box might display the page differently. In particular, some applications might consider the art box (defined as "the extent of the page’s meaningful content") important.
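As for the "some transformation" step: the `...DirAdj` values reported by `TextPosition` are measured from the upper-left corner of the page, while a `PDRectangle` is given by its lower-left corner in PDF user space, so the y coordinate has to be flipped. A sketch of that arithmetic in plain Java, assuming an unrotated page whose media box starts at (0, 0); `CropMath` and `toPdfRect` are hypothetical names, and the sample region is chosen to reproduce the rectangle used above.

```java
final class CropMath {
    /**
     * Converts a region given by its top-left corner (y growing downwards)
     * into {lowerLeftX, lowerLeftY, width, height} in PDF user space.
     */
    static double[] toPdfRect(double pageHeight, double left, double top, double width, double height) {
        double lowerLeftY = pageHeight - top - height;
        return new double[] {left, lowerLeftY, width, height};
    }

    public static void main(String[] args) {
        // 841.89 is the page height from the question's original crop box.
        double[] r = toPdfRect(841.89, 40, 61.89, 510, 100);
        System.out.println(r[1]); // approximately 680
    }
}
```

As far as I know, `getYDirAdj()` refers to the text baseline, so the top of a line would be roughly `getYDirAdj() - getHeightDir()`; that is worth verifying against the actual output before relying on it.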
Rendering the cropped page
In a comment to this answer the OP remarked
This is good and works. It correctly saves the page in the PDF file. I've tried to do the same in JPG and failed.
I reduced the OP's code to the essentials
PDDocument pdDocument = PDDocument.load(resource);
PDPage page = pdDocument.getPage(12-1);
page.setCropBox(new PDRectangle(40f, 680f, 510f, 100f));
PDFRenderer renderer = new PDFRenderer(pdDocument);
BufferedImage img = renderer.renderImage(12 - 1, 4f);
ImageIOUtil.writeImage(img, new File(RESULT_FOLDER, "ENG-US_NMATSCJ-1.103-0330-page12cropped.jpg").getAbsolutePath(), 300);
pdDocument.close();
(SetCropBox.java test method testSetCropBoxImgENG_US_NMATSCJ_1_103_0330)
The result:
Thus, I cannot reproduce an issue here.
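When checking a rendered image, it helps to know what size to expect. `PDFRenderer.renderImage(pageIndex, scale)` treats a scale of 1 as 72 DPI, so the scale of 4 used above corresponds to 288 DPI, and the 510 x 100 point crop box should come out at about 2040 x 400 pixels. The arithmetic as a plain-Java sketch (`RenderSize` is a hypothetical name, and the rounding PDFBox applies internally may differ by a pixel):

```java
final class RenderSize {
    static final double POINTS_PER_INCH = 72.0;

    /** Scale factor to pass to the renderer for a desired resolution. */
    static double scaleForDpi(double dpi) {
        return dpi / POINTS_PER_INCH;
    }

    /** Expected pixel count for a length in points at the given scale (at least 1). */
    static int pixels(double points, double scale) {
        return (int) Math.max(Math.floor(points * scale), 1);
    }
}
```

For a true 300 DPI rendering, the scale would be `RenderSize.scaleForDpi(300)`, roughly 4.17, instead of 4.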
Possible details to check for:
ImageIOUtil is not part of the main PDFBox artifact, instead it is located in pdfbox-tools; does the version of that artifact match the version of the core pdfbox artifact?
I run the code in an Oracle Java 8 environment; other Java environments might give rise to different results.
There are minor differences in our implementations. E.g. I load the PDF via an InputStream, you directly from file system, I have hardcoded the page number, you have it in some variable, ... None of these differences should cause your problem, but who knows...
I'm using C# and iTextSharp to add a watermark to my PDF files:
Document document = new Document();
PdfReader pdfReader = new PdfReader(strFileLocation);
PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileStream(strFileLocationOut, FileMode.Create, FileAccess.Write, FileShare.None));
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(WatermarkLocation);
img.SetAbsolutePosition(100, 300);
PdfContentByte waterMark;
//
for (int pageIndex = 1; pageIndex <= pdfReader.NumberOfPages; pageIndex++)
{
waterMark = pdfStamper.GetOverContent(pageIndex);
waterMark.AddImage(img);
}
//
pdfStamper.FormFlattening = true;
pdfStamper.Close();
It works fine, but in some PDF files no watermark appears even though the file size increases. Any idea why?
The fact that the file size increases is a good indication that the watermark is added. The main problem is that you're adding the watermark outside the visible area of the page. See How to position text relative to page using iText?
You need something like this:
Rectangle pagesize = pdfReader.GetCropBox(pageIndex);
if (pagesize == null)
pagesize = pdfReader.GetMediaBox(pageIndex);
img.SetAbsolutePosition(
pagesize.GetLeft(),
pagesize.GetBottom());
That is: if you want to add the image in the lower-left corner of the page. You can add an offset, but make sure the offset in the x direction doesn't exceed the width of the page, and the offset in the y direction doesn't exceed the height of the page.
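The offset rule can be sketched independently of iTextSharp. The helper below is written in plain Java for illustration and takes the page box and image size as numbers; `WatermarkPosition` and `clamp` are hypothetical names. It keeps the whole image inside the box, which is slightly stricter than the rule above.

```java
final class WatermarkPosition {
    /** Returns {x, y} for the image's lower-left corner, keeping the image inside the page box. */
    static double[] clamp(double boxLeft, double boxBottom, double boxWidth, double boxHeight,
                          double imgWidth, double imgHeight, double offsetX, double offsetY) {
        // Largest offsets that still leave room for the image inside the box.
        double maxX = Math.max(0, boxWidth - imgWidth);
        double maxY = Math.max(0, boxHeight - imgHeight);
        double x = boxLeft + Math.min(Math.max(0, offsetX), maxX);
        double y = boxBottom + Math.min(Math.max(0, offsetY), maxY);
        return new double[] {x, y};
    }
}
```

With the question's offsets (100, 300), a 200 x 400 image on a landscape 842 x 595 box would end up at (100, 195) instead of partly above the page, and a box whose lower-left corner is not at the origin is taken into account as well.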
Although I don't know the specifics of iTextSharp, likely on the pages where your image is not showing, the previous PDF content has modified the current transformation matrix such that whatever you put on the page is moved off the page.
This can be fixed by emitting a gsave operator (q in PDF) before the original page content and a grestore operator (Q) after the original page content (but before yours). This, however, may not fix every case: a PDF document might modify the CTM after a gsave and never issue the matching grestore. This is not supposed to happen in theory, according to the PDF specification:
Occurrences of the q and Q operators shall be balanced within a given content stream (or within the sequence of streams specified in a page dictionary’s Contents array).
but I can tell you from experience that this is not the case in practice.
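Whether a given content stream violates that rule can be checked roughly by counting operators. The sketch below assumes the stream has already been decompressed into text and simply splits on whitespace, so a `q` or `Q` inside string literals or inline image data would be miscounted; `GraphicsStateBalance` and `unbalancedSaves` are hypothetical names.

```java
final class GraphicsStateBalance {
    /** Returns the number of q operators left open at the end of the stream (0 means balanced). */
    static int unbalancedSaves(String contentStream) {
        int depth = 0;
        for (String token : contentStream.trim().split("\\s+")) {
            if (token.equals("q")) {
                depth++;
            } else if (token.equals("Q") && depth > 0) {
                depth--;
            }
        }
        return depth;
    }
}
```

A non-zero result suggests that wrapping the original content in one extra `q` ... `Q` pair is not enough, because the closing `Q` would only undo the innermost unmatched save; the missing `Q` operators would have to be appended first.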