I am using Flying Saucer (xhtmlrenderer) 9.1.22 to create PDF files from HTML content,
but bold text in the PDF looks blurry.
This is my code snippet:
ITextRenderer renderer = new ITextRenderer();
renderer.getFontResolver().addFont("arial.ttf", BaseFont.CP1252, BaseFont.EMBEDDED);
renderer.setDocumentFromString(htmlAsString);
renderer.layout();
renderer.createPDF(output);
Related
I have been using PDFBox and EasyTable, which extends PDFBox, to draw data tables. I have hit a problem: I have a Java object holding a string of HTML that I need to add to the PDF using PDFBox. Digging through the documentation has not turned up anything useful.
The code below is a hello-world snippet; I want the text in the generated PDF to have H1 formatting.
// Create a document and add a page to it
PDDocument document = new PDDocument();
PDPage page = new PDPage();
document.addPage( page );
// Create a new font object selecting one of the PDF base fonts
PDFont font = PDType1Font.HELVETICA_BOLD;
// Start a new content stream which will "hold" the to be created content
PDPageContentStream contentStream = new PDPageContentStream(document, page);
// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
contentStream.beginText();
contentStream.setFont( font, 12 );
contentStream.moveTextPositionByAmount( 100, 700 );
contentStream.drawString( "<h1>HelloWorld</h1>" );
contentStream.endText();
// Make sure that the content stream is closed:
contentStream.close();
// Save the results and ensure that the document is properly closed:
document.save( "Hello World.pdf");
document.close();
Use Jericho to convert the HTML to plain text while correctly mapping the output of the tags.
Sample:
public String extractAllText(String htmlText){
return new net.htmlparser.jericho
.Source(htmlText)
.getRenderer()
.setMaxLineLength(Integer.MAX_VALUE)
.setNewLine(null)
.toString();
}
Add the dependency to your Gradle build (or the equivalent entry to your Maven pom.xml):
compile group: 'net.htmlparser.jericho', name: 'jericho-html', version: '3.4'
PDFBox does not know HTML, at least not for creating content.
Thus, with plain PDFBox you have to parse the HTML yourself and derive the text drawing characteristics from the tags the text is in.
E.g. when you encounter "<h1>HelloWorld</h1>", you have to extract the text "HelloWorld" and use the information that it is in an h1 tag to select an appropriate primary heading font and font size to draw that "HelloWorld".
Alternatively you can look for a library doing that HTML parsing and transforming to PDF text drawing instructions for PDFBox, e.g. Open HTML to PDF.
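To make the manual approach a bit more concrete, here is a minimal sketch of the "extract the text and derive the font size from the tag" step, using only the JDK. The class name `HeadingRuns`, the `Run` type and the point sizes are assumptions made for illustration, not part of PDFBox, and the regex only handles flat, non-nested `h1`, `h2` and `p` elements; real HTML needs a proper parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns very simple HTML into (text, font size) pairs.
public class HeadingRuns {

    /** A piece of plain text plus the font size it should be drawn with. */
    public static final class Run {
        public final String text;
        public final float fontSize;

        Run(String text, float fontSize) {
            this.text = text;
            this.fontSize = fontSize;
        }
    }

    // Matches flat elements such as <h1>...</h1>; nested tags are not supported.
    private static final Pattern ELEMENT = Pattern.compile(
            "<(h1|h2|p)>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumed sizes in points; choose whatever suits your layout.
    static float sizeFor(String tag) {
        switch (tag.toLowerCase()) {
            case "h1": return 24f;
            case "h2": return 18f;
            default:   return 12f;
        }
    }

    /** Extracts the text of each supported element together with its font size. */
    public static List<Run> parse(String html) {
        List<Run> runs = new ArrayList<>();
        Matcher m = ELEMENT.matcher(html);
        while (m.find()) {
            runs.add(new Run(m.group(2).trim(), sizeFor(m.group(1))));
        }
        return runs;
    }
}
```

Each resulting `Run` could then be drawn with the calls already used in the question, for example `contentStream.setFont(font, run.fontSize)` followed by `contentStream.drawString(run.text)`, moving the text position down between runs.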
I am using the Arial font via BaseFont in iText in my Java app. I am updating the AcroForm fields of an editable PDF and regenerating it. This all works fine.
I am creating the font like this:
ArrayList<BaseFont> fonts = new ArrayList<BaseFont>();
BaseFont baseFont = BaseFont.createFont("/path/to/arial.ttf",BaseFont.IDENTITY_H,BaseFont.EMBEDDED);
fonts.add(baseFont);
and I attach that font to my AcroForm like this:
acroForm.setSubstitutionFonts(fonts);
How can I change the size of this BaseFont?
For my work, I need to convert a PDF document into an image with PDFBox.
PDDocument document = PDDocument.load(new File(fileUrl));
PDFRenderer pdfRenderer=new PDFRenderer(document);
BufferedImage bim=pdfRenderer.renderImageWithDPI(page, dpi.floatValue());
My document has many optional content groups (OCGs) that are not visible (in Acrobat Reader, for example), but after rendering, my image contains these OCGs.
How can I render the PDF document without rendering all the OCGs?
I'm having trouble getting Flying Saucer to use a secondary font for the glyphs/characters which are not present in my main font.
The Java code I'm using for this purpose is more or less:
String result = getPrintHtmlContent(urlString);
result = CharacterConverter.replaceInvalidCharacters(result);
ITextRenderer renderer = new ITextRenderer();
renderer.getFontResolver();
renderer.getFontResolver().addFont(FONTS_DIR_PATH + "ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
renderer.getFontResolver().addFont(FONTS_DIR_PATH + "droidsans/DroidSans.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
renderer.getFontResolver().addFont(FONTS_DIR_PATH + "droidsans/DroidSansBold.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
renderer.setDocumentFromString(result, "http://" + frontendHost + ":" + frontendPort + frontendContextRoot);
renderer.layout();
renderer.createPDF(os);
And the css:
body {
font-family: "Droid Sans", "Arial Unicode MS";
}
I have also included the fonts in the CSS by using the @font-face rule.
I am able to get this to work using either of the fonts separately, so there seems to be no problem with flying saucer finding the fonts or the css not rendering correctly.
If, on the other hand, I do as above and try to use both fonts, the output PDF uses only Droid Sans...
Is it even possible to use a "fallback font" in flying saucer, as it is on websites?
I asked the same question in the Flying Saucer developer community and got a reply:
https://groups.google.com/forum/#!topic/flying-saucer-dev/5p00ISwnxiw
In short, the answer is no: it is not possible to use a secondary font.
I have to generate a PDF file using iText in the NetBeans IDE. The PDF may contain Bangla letters. I can already generate a PDF file with Bangla letters, but the problem is that the letters are not rendered in their correct form.
Suppose I have to show: বরিশাল -- But pdf generate: [1]: http://i.stack.imgur.com/abwOV.jpg
Suppose I have to show: পড়ি -- But pdf generate: পড় ি
My code to generate this file:
Document document = new Document();
BaseFont unicode = BaseFont.createFont("c:/windows/fonts/NikoshBan.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(unicode);
PdfWriter writer=PdfWriter.getInstance(document, new FileOutputStream("TableDat.pdf"));
document.open();
document.add(new Paragraph("বরিশাল",font));
document.close();