I am trying to insert a barcode into a PDF using PDFBox 2.0.13. I tried using a BufferedImage for this, as described in
How to add Code128 Barcode image to existing pdf using pdfbox(1.8.12) with barcode4j library?
but that answer uses "new PDPixelMap(doc, bim)", and PDPixelMap is deprecated in 2.0.x.
My question: how do we insert a barcode into a PDF with the APIs available in PDFBox 2.0.13 (probably a replacement for PDPixelMap), without using PDPixelMap? A code snippet would be great.
Use LosslessFactory like this:
PDImageXObject img = LosslessFactory.createFromImage(doc, bim);
contentStream.drawImage(img, x, y);
I am trying to get text from elements that are inside a canvas. I capture the canvas portion as an image and then extract text from the image using an OCR library. The problem is that some characters are not converted to the exact text.
I am using Selenium webdriver, Java and Maven.
Code to extract text from image:
Ocr.setUp();
Ocr ocr = new Ocr(); // create a new OCR engine
ocr.startEngine("eng", Ocr.SPEED_FASTEST); // English
textFromImage = ocr.recognize(new File[]{new File("E:\\Device.png")},
Ocr.RECOGNIZE_TYPE_TEXT, Ocr.OUTPUT_FORMAT_PLAINTEXT);
System.out.println(textFromImage);
ocr.stopEngine();
Currently I am trying to get the text in the Google search box from the image below:
I expect the output: What is swift in iOS
But the OCR returns: Whatls swlflln IOS, as per the following screenshot:
Somehow the OCR could not convert some characters to match what is in the image. Is there any other solution for this?
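One thing that may be worth trying before switching libraries is preprocessing the screenshot. Small on-screen fonts often confuse OCR engines ("i" read as "l", spaces dropped), and enlarging the image and converting it to grayscale sometimes helps. The sketch below uses only the JDK; the class name OcrPreprocess and the scale factor are illustrative assumptions, and whether this improves results for this particular OCR engine is untested. Note also that the engine above is started with Ocr.SPEED_FASTEST, which by its name probably trades accuracy for speed.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: enlarges a screenshot and converts it to grayscale before OCR. */
final class OcrPreprocess {

    private OcrPreprocess() {
    }

    /**
     * Returns a grayscale copy of the source image, scaled up by the given
     * integer factor with bicubic interpolation.
     */
    static BufferedImage upscaleToGray(BufferedImage source, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int width = source.getWidth() * factor;
        int height = source.getHeight() * factor;
        // TYPE_BYTE_GRAY makes the drawing step perform the color-to-gray conversion.
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = result.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

The result can be written to a new PNG with ImageIO.write and that file passed to the recognize call instead of the original screenshot.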
I am trying to create a Java project that simply prints a project's source code, be it Java, PHP, C++ or another language.
I can create the PDF just fine with iText, but now I need to highlight the Java code I read, the same way a code editor like Sublime does. I discovered PDFBox, a library for creating/manipulating PDF files, but I can't find how to highlight code text (like Sublime does) using this library. Any help?
Copying from another SO question: highlight text using pdfbox when it's location in the pdf is known
// PDFBox 1.8.x API (getAllPages() was removed in 2.0)
PDDocument doc = PDDocument.load(/*path to the file*/);
PDPage page = (PDPage) doc.getDocumentCatalog().getAllPages().get(i);
List annots = page.getAnnotations();
PDAnnotationTextMarkup markup = new PDAnnotationTextMarkup(PDAnnotationTextMarkup.Su....);
markup.setRectangle(/*your PDRectangle*/);
markup.setQuadPoints(/*float array of size eight with all the vertices of the PDRectangle in anticlockwise order*/);
annots.add(markup);
doc.save(/*path to the output file*/);
doc.close();
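The eight-element float array mentioned in the snippet above can be built from the rectangle's corner coordinates. The following is a minimal sketch in plain Java; the class name QuadPoints and the choice of the lower-left corner as the starting point are assumptions for illustration, not part of PDFBox. Some PDF viewers are reported to expect a different corner order (upper-left, upper-right, lower-left, lower-right), so it is worth checking the result in the viewer you target.

```java
/** Hypothetical helper: builds the quad-point array for an axis-aligned rectangle. */
final class QuadPoints {

    private QuadPoints() {
    }

    /**
     * Returns the four corners as x/y pairs in anticlockwise order,
     * starting at the lower-left corner.
     *
     * @param llx lower-left x
     * @param lly lower-left y
     * @param urx upper-right x
     * @param ury upper-right y
     */
    static float[] anticlockwise(float llx, float lly, float urx, float ury) {
        return new float[] {
            llx, lly, // lower-left
            urx, lly, // lower-right
            urx, ury, // upper-right
            llx, ury  // upper-left
        };
    }
}
```

The returned array is what you would pass to the markup annotation, using the same coordinates as the PDRectangle given to setRectangle.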
I have to generate PDFs in my Spring MVC application. I recently tested the iTextPdf library, but I could not generate a Unicode PDF document; in fact, non-Latin characters did not appear in the generated document. I decided to use Apache PDFBox instead, but I don't know whether it supports Unicode characters. If it does, is there a good tutorial for learning PDFBox? If not, which library should I use?
Thanks in advance.
The 1.8.* versions don't support PDF generation with Unicode, but the 2.0.* versions do. This is the example EmbeddedFonts.java:
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class EmbeddedFonts
{
    public static void main(String[] args) throws IOException
    {
        PDDocument document = new PDDocument();
        PDPage page = new PDPage(PDRectangle.A4);
        document.addPage(page);

        // Load and embed a TrueType font so that non-Latin glyphs are available
        String dir = "../pdfbox/src/main/resources/org/apache/pdfbox/resources/ttf/";
        PDType0Font font = PDType0Font.load(document, new File(dir + "LiberationSans-Regular.ttf"));

        PDPageContentStream stream = new PDPageContentStream(document, page);
        stream.beginText();
        stream.setFont(font, 12);
        stream.setLeading(12 * 1.2);
        stream.newLineAtOffset(50, 600);
        stream.showText("PDFBox Unicode with Embedded TrueType Font");
        stream.newLine();
        stream.showText("Supports full Unicode text ?");
        stream.newLine();
        stream.showText("English русский язык Tiếng Việt");
        stream.endText();
        stream.close();

        document.save("example.pdf");
        document.close();
    }
}
Note that unlike iText, PDFBox support for PDF creation is very low level, i.e. we don't support paragraphs or tables out of the box. There is no tutorial, but there are a lot of examples. The API is modeled on the PDF specification.
The current version of Apache PDFBox can't deal with Unicode, see:
https://pdfbox.apache.org/ideas.html
iTextPdf v. 5.x generates PDF files with Unicode. There is an example here:
iText in Action: Chapter 11: Choosing the right font
part3.chapter11.UnicodeExample
http://itextpdf.com/examples/iia.php?id=199
To run it, you just need to adapt the value of EncodingExample.FONT and to add some code to create the output file.
I am using iText 2.1.7 to create an RTF file dynamically in a Spring MVC controller.
The class I am using is RtfWriter2. This works well, but I am not able to add an image to the document.
The image is a byte array which I get from a JPA domain object. I also tried to read a sample image from a file, but that does not work either. The Image class is from the iText package, and its factory method accepts a byte array.
This is the code I use:
Image img = Image.getInstance(user.getStammdaten().getProfileImage());
document.add(img);
document.newPage();
Any clues?
Maybe these scripts will help you.
https://joseluisbz.wordpress.com/2011/06/22/script-de-clases-rtf-para-jsp-y-php/
Here is a usage example:
https://joseluisbz.wordpress.com/2011/07/16/subiendo-imagenes-png-y-jpg-y-archivos-a-mysql-con-php-y-jsp-y-mostrarlos-en-rtf-usando-clases/
I have the need to convert any multipage PDF file into a set of JPGs.
Since the PDF files are supposed to come from a scanner, we can assume each page just contains a graphic object to extract, but I cannot be 100% sure of that.
So, I need to convert any renderable content from each page into a single JPEG file.
How can I do this with iText?
If I can't do this with iText, what Java library can achieve this?
Thanks.
Ghostscript (available for Windows, Linux, MacOS X, Solaris, AIX,...) can convert...
...from input formats: PDF, PostScript, EPS and AI
...into output formats: JPEG, TIFF, PNG, PNM, PPM, BMP, (and more).
(The ImageMagick mentioned above doesn't do the conversion on its own -- it uses Ghostscript under the hood, as do many other tools.)
ICEpdf - http://www.icepdf.org/ - has an open source entry version which should do what you need.
I believe the primary difference between the open source version and the pay-for version is that the pay-for has much better font support.
You can also use Sun's PDF-Renderer, and JPedal does PDF-to-image conversion (low and high res).
With Apache PDFBox (1.8.x API) you could do the following:
PDDocument document = PDDocument.load(pdffile);
List<PDPage> pages = document.getDocumentCatalog().getAllPages();
for (int i = 0; i < pages.size(); i++) {
    PDPage page = pages.get(i);
    // Render the whole page (not just embedded images) at 72 dpi
    BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 72);
    ImageIO.write(image, "jpg", new File(pdffile.getAbsolutePath() + "_" + i + ".jpg"));
}
document.close();
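The ImageIO.write call above uses the JPEG writer's default compression. For scanned pages you may want explicit control over the quality setting. The following is a minimal sketch using only the JDK's javax.imageio API, assuming you already have one BufferedImage per page; the class name JpegPageWriter is hypothetical and the quality value is up to you.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Iterator;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

/** Hypothetical helper: writes a rendered page as a JPEG with an explicit quality. */
final class JpegPageWriter {

    private JpegPageWriter() {
    }

    /**
     * Writes the image to the target file as JPEG.
     *
     * @param quality compression quality between 0.0 (smallest) and 1.0 (best)
     */
    static void write(BufferedImage image, File target, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG writer available");
        }
        ImageWriter writer = writers.next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);

        // The image output stream does not truncate an existing file, so remove it first.
        Files.deleteIfExists(target.toPath());
        try (ImageOutputStream out = ImageIO.createImageOutputStream(target)) {
            if (out == null) {
                throw new IOException("Cannot open " + target);
            }
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
    }
}
```

Inside the page loop, a call such as JpegPageWriter.write(image, outputFile, 0.9f) would take the place of ImageIO.write. The image should be an RGB type without an alpha channel, as JPEG does not support transparency.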