My HTML content is stored as a raw string in my database, and I would like to print it to PDF with a custom page size, for example 10 cm wide and 7 cm high, rather than the standard A4 format.
Can someone give me an example, if this is possible?
ByteArrayOutputStream out = new ByteArrayOutputStream();
PDRectangle rec = new PDRectangle(recWidth, recHeight);
PDPage page = new PDPage(rec);
try (PDDocument document = new PDDocument()) {
PdfRendererBuilder builder = new PdfRendererBuilder();
builder.defaultTextDirection(BaseRendererBuilder.TextDirection.LTR);
String htmlContent = "<b>Hello world</b>" + content;
builder.withHtmlContent(htmlContent, "");
document.addPage(page);
builder.usePDDocument(document);
PdfBoxRenderer renderer = builder.buildPdfRenderer();
renderer.createPDFWithoutClosing();
document.save(out);
} catch (Exception e) {
e.printStackTrace();
}
return new ByteArrayInputStream(out.toByteArray());
This code gives me a document with two pages, one small and one A4.
UPDATE:
I tried this one:
try (PDDocument document = new PDDocument()) {
PdfRendererBuilder builder = new PdfRendererBuilder();
builder.defaultTextDirection(BaseRendererBuilder.TextDirection.LTR);
builder.useDefaultPageSize(210, 297, PdfRendererBuilder.PageSizeUnits.MM);
builder.usePdfAConformance(PdfRendererBuilder.PdfAConformance.PDFA_3_A);
String htmlContent = "<b>content</b>";
builder.withHtmlContent(htmlContent, "");
builder.usePDDocument(document);
PdfBoxRenderer renderer = builder.buildPdfRenderer();
renderer.createPDFWithoutClosing();
document.save(out);
} catch (Exception e) {
log.error(">>> The creation of PDF is invalid!");
}
But in this case the content is not shown. If I remove useDefaultPageSize, the content is shown.
I haven't tested this solution, but try initialising the builder object with your desired page size and document type, as below:
builder.useDefaultPageSize(210, 297, PdfRendererBuilder.PageSizeUnits.MM);
builder.usePdfAConformance(PdfRendererBuilder.PdfAConformance.PDFA_3_A);
The library supports several PDF/A formats; see the PdfAConformance enum for the possible values.
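Note that 210 x 297 mm is A4. For the 10 cm by 7 cm page from the question, the values passed with PageSizeUnits.MM would presumably be 100 and 70, while PDRectangle expects points (1/72 inch), which works out to about 283.46 by 198.43. A minimal sketch of the conversion, using only the JDK; the class and method names are hypothetical and not part of openhtmltopdf or PDFBox:

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of openhtmltopdf or PDFBox.
final class PageSizeSketch {
    static final float POINTS_PER_INCH = 72f;
    static final float MM_PER_INCH = 25.4f;

    // Centimetres to millimetres, the unit used with PageSizeUnits.MM.
    static float cmToMm(float cm) {
        return cm * 10f;
    }

    // Millimetres to PDF points, the unit a PDRectangle is measured in.
    static float mmToPoints(float mm) {
        return mm * POINTS_PER_INCH / MM_PER_INCH;
    }

    public static void main(String[] args) {
        float widthMm = cmToMm(10f);
        float heightMm = cmToMm(7f);
        System.out.printf(Locale.ROOT, "mm: %.0f x %.0f%n", widthMm, heightMm);
        System.out.printf(Locale.ROOT, "pt: %.2f x %.2f%n",
                mmToPoints(widthMm), mmToPoints(heightMm));
    }
}
```

With these numbers, the call from the answer would take 100 and 70 instead of 210 and 297, and recWidth and recHeight in the question's first snippet would be the point values.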
How can I add an external link to a PDF that redirects to a web page?
The example image describes it:
On clicking "Google", the user should be redirected to the web page https://www.google.com
Here is my code:
private void createPDFiText() {
int margin = getResources().getDimensionPixelSize(R.dimen._5sdp);
Document document = new Document(PageSize.A4, margin, margin, margin, margin);
try {
PdfWriter.getInstance(document, getOutputStream());
document.open();
for (int i = 12; i <= 17; i++) {
Phrase phrase = new Phrase("Open ");
Phrase phrase1 = new Phrase(" on Click On it.");
Font anchorFont = new Font(Font.FontFamily.UNDEFINED, 25);
anchorFont.setColor(BaseColor.BLUE);
anchorFont.setStyle(Font.FontStyle.UNDERLINE.getValue());
Anchor anchor = new Anchor("Google", anchorFont);
anchor.setReference("www.google.com");
phrase.add(anchor);
phrase.add(phrase1);
document.add(phrase);
}
document.close();
} catch (DocumentException | IOException e) {
e.printStackTrace();
}
}
I am referring to this answer; have a look. Modifying an existing PDF file with iText 5.5.13.2 is complicated, but the referred solution is easier.
iText 7 has a handier way to modify an existing PDF.
There are several other ways, such as PdfStamper.
Following the referred answer, add the following code to create an anchor:
Phrase phrase = new Phrase("Open ");
Phrase phrase1 = new Phrase(" on Click On it.");
Font anchorFont = new Font(Font.FontFamily.UNDEFINED, 11);
anchorFont.setColor(BaseColor.BLUE);
anchorFont.setStyle(Font.FontStyle.UNDERLINE.getValue());
Anchor anchor = new Anchor("Google", anchorFont);
anchor.setReference("https://www.google.com");
phrase.add(anchor);
phrase.add(phrase1);
document.add(phrase);
Change the font and colors based on your needs.
Full code:
try {
PdfReader reader = new PdfReader("test.pdf"); //src pdf path (the pdf I need to modify)
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("test2.pdf")); // destination pdf path
document.open();
PdfContentByte cb = writer.getDirectContent();
PdfImportedPage page = writer.getImportedPage(reader, 1);
document.newPage();
document.setPageSize(reader.getPageSize(1));
cb.addTemplate(page, 0, 0);
Phrase phrase = new Phrase("Open ");
Phrase phrase1 = new Phrase(" on Click On it.");
Font anchorFont = new Font(Font.FontFamily.UNDEFINED, 11);
anchorFont.setColor(BaseColor.BLUE);
anchorFont.setStyle(Font.FontStyle.UNDERLINE.getValue());
Anchor anchor = new Anchor("Google", anchorFont);
anchor.setReference("https://www.google.com");
phrase.add(anchor);
phrase.add(phrase1);
document.add(phrase);
document.close();
} catch (IOException e) {
e.printStackTrace();
} catch (DocumentException e) {
e.printStackTrace();
}
}
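One detail worth checking is the link target itself: the question passes "www.google.com", while the full code above passes "https://www.google.com". Some PDF viewers may treat a target without a scheme as a relative path rather than a web address. A minimal sketch, using only java.net.URI, of adding a scheme when it is missing; the class and method names are hypothetical, and defaulting to https is an assumption:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of iText.
final class LinkTargetSketch {

    // Returns the target unchanged if it already has a scheme,
    // otherwise prefixes "https://" (an assumed default).
    static String withScheme(String target) throws URISyntaxException {
        URI uri = new URI(target);
        if (uri.getScheme() != null) {
            return target;
        }
        return "https://" + target;
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(withScheme("www.google.com"));
        System.out.println(withScheme("https://www.google.com"));
    }
}
```

The result could then be passed to anchor.setReference(...). Inputs such as "localhost:8080" parse with "localhost" as the scheme, so this sketch does not cover every case.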
When I generate a PDF document with the Link API, something strange happens: a rectangle is always drawn around the link text. It looks like a cell border, but I didn't add any cell to the document.
My code is like this:
try (PdfDocument pdfDocument = new PdfDocument(new PdfWriter(file));
Document document = new Document(pdfDocument);) {
PdfAction pdfAction = PdfAction.createURI("https://kb.itextpdf.com/home");
Link link = new Link("https://kb.itextpdf.com/home", pdfAction);
Paragraph paragraph = new Paragraph();
paragraph.add(link);
document.add(paragraph);
} catch (Exception e) {
System.out.println(e);
}
I resolved it like this:
PdfLinkAnnotation linkAnnotation = link.getLinkAnnotation();
linkAnnotation.setBorder(new PdfAnnotationBorder(0, 0, 0));
I compare two PDF files and mark the differences with highlights.
When I use PDFBox to merge them for comparison, the highlights are missing.
I am using the function below, which merges all pages of the two PDF files side by side:
void generateSideBySidePDF() {
File pdf1File = new File(FILE1_PATH);
File pdf2File = new File(FILE2_PATH);
File outPdfFile = new File(OUTFILE_PATH);
PDDocument pdf1 = null;
PDDocument pdf2 = null;
PDDocument outPdf = null;
try {
pdf1 = PDDocument.load(pdf1File);
pdf2 = PDDocument.load(pdf2File);
outPdf = new PDDocument();
for(int pageNum = 0; pageNum < pdf1.getNumberOfPages(); pageNum++) {
// Create output PDF frame
PDRectangle pdf1Frame = pdf1.getPage(pageNum).getCropBox();
PDRectangle pdf2Frame = pdf2.getPage(pageNum).getCropBox();
PDRectangle outPdfFrame = new PDRectangle(pdf1Frame.getWidth()+pdf2Frame.getWidth(), Math.max(pdf1Frame.getHeight(), pdf2Frame.getHeight()));
// Create output page with calculated frame and add it to the document
COSDictionary dict = new COSDictionary();
dict.setItem(COSName.TYPE, COSName.PAGE);
dict.setItem(COSName.MEDIA_BOX, outPdfFrame);
dict.setItem(COSName.CROP_BOX, outPdfFrame);
dict.setItem(COSName.ART_BOX, outPdfFrame);
PDPage outPdfPage = new PDPage(dict);
outPdf.addPage(outPdfPage);
// Source PDF pages has to be imported as form XObjects to be able to insert them at a specific point in the output page
LayerUtility layerUtility = new LayerUtility(outPdf);
PDFormXObject formPdf1 = layerUtility.importPageAsForm(pdf1, pageNum);
PDFormXObject formPdf2 = layerUtility.importPageAsForm(pdf2, pageNum);
// Add form objects to output page
AffineTransform afLeft = new AffineTransform();
layerUtility.appendFormAsLayer(outPdfPage, formPdf1, afLeft, "left" + pageNum);
AffineTransform afRight = AffineTransform.getTranslateInstance(pdf1Frame.getWidth(), 0.0);
layerUtility.appendFormAsLayer(outPdfPage, formPdf2, afRight, "right" + pageNum);
}
outPdf.save(outPdfFile);
outPdf.close();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (pdf1 != null) pdf1.close();
if (pdf2 != null) pdf2.close();
if (outPdf != null) outPdf.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Insert this into your code after the "Source PDF pages has to be imported" segment to copy the annotations. The ones of the right PDF must have their rectangle moved.
// copy annotations
PDPage src1Page = pdf1.getPage(pageNum);
PDPage src2Page = pdf2.getPage(pageNum);
for (PDAnnotation ann : src1Page.getAnnotations())
{
outPdfPage.getAnnotations().add(ann);
}
for (PDAnnotation ann : src2Page.getAnnotations())
{
PDRectangle rect = ann.getRectangle();
ann.setRectangle(new PDRectangle(rect.getLowerLeftX() + pdf1Frame.getWidth(), rect.getLowerLeftY(), rect.getWidth(), rect.getHeight()));
outPdfPage.getAnnotations().add(ann);
}
Note that this code has a flaw: it works only for annotations with an appearance stream (most have one). It will have weird effects for those without one; in that case, one would have to adjust the coordinates depending on the annotation type. For highlights, that would be the quadpoints; for lines, the line coordinates; and so on.
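For the highlight case, the quadpoints are a flat array of x/y pairs, eight numbers per quadrilateral, so moving them to the right half means adding the left page's width to every x value. A minimal sketch of that adjustment in plain Java; the class and method names are hypothetical, and reading and writing the array on the annotation (in PDFBox, presumably via getQuadPoints() and setQuadPoints() of PDAnnotationTextMarkup) is left out:

```java
// Hypothetical helper for illustration; not part of PDFBox.
final class QuadPointsSketch {

    // Quadpoints are stored as x1, y1, x2, y2, ... so every even index is an x value.
    // Returns a shifted copy and leaves the input array untouched.
    static float[] shiftX(float[] quadPoints, float dx) {
        float[] shifted = quadPoints.clone();
        for (int i = 0; i < shifted.length; i += 2) {
            shifted[i] += dx;
        }
        return shifted;
    }

    public static void main(String[] args) {
        float[] quads = {10f, 20f, 110f, 20f, 10f, 5f, 110f, 5f};
        float leftPageWidth = 595f; // assumed width of the left page, in points
        System.out.println(java.util.Arrays.toString(shiftX(quads, leftPageWidth)));
    }
}
```

In the loop over src2Page above, dx would be pdf1Frame.getWidth(), the same offset that is applied to the annotation rectangle.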
I am trying to write a heart symbol to a PDF from Java code.
This is my input to the PDF: ❤️❤️❤️
But the generated PDF is empty, with nothing written.
I am using iText to write the PDF.
The font used is tradegothic_lt_boldcondtwenty.ttf
OutputStream file = new FileOutputStream(fileName);
Document document = new Document(PageSize.A6);
PdfWriter writer = PdfWriter.getInstance(document, file);
document.open();
PdfLayer nested = new PdfLayer("Layer 1", writer);
PdfContentByte cb = writer.getDirectContent();
cb.beginLayer(nested);
ColumnText ct = new ColumnText(cb);
Font font = getFont();
Phrase para1 = new Phrase("❤️❤️❤️",font);
ct.setSimpleColumn(para1,38,0,260,138,15, Element.ALIGN_LEFT);
ct.go();
cb.endLayer();
document.close();
file.close();
private Font getFont() {
final String methodName = "generatePDF";
LOGGER.entering(CLASSNAME, methodName);
Font font = null;
try {
String filename = "tradegothic_lt_boldcondtwenty.ttf";
FontFactory.register(filename, filename);
font = FontFactory.getFont(filename, BaseFont.CP1252, BaseFont.EMBEDDED,11.8f);
} catch(Exception exception) {
LOGGER.logp(Level.SEVERE, CLASSNAME, methodName, "Exception Occurred while fetching the Trade Gothic font." + exception);
font = FontFactory.getFont(FontFactory.HELVETICA_BOLD,11.8f);
}
return font;
}
Phrase para1 contains the hearts correctly, but I cannot see them in the PDF.
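One thing that can be checked without iText: getFont() requests the font with BaseFont.CP1252, and the heart (U+2764, followed here by the variation selector U+FE0F) is not part of the Windows-1252 character set. A minimal sketch of that check using only the JDK; the class and method names are hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration; not part of iText.
final class EncodingCheckSketch {

    // True if every character of the text can be represented in Windows-1252.
    static boolean fitsCp1252(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsCp1252("Hello"));          // plain ASCII text
        System.out.println(fitsCp1252("\u2764\uFE0F"));   // heart plus variation selector
    }
}
```

If the check fails, a Unicode encoding such as BaseFont.IDENTITY_H would probably be needed instead of CP1252, together with a font that actually contains a heart glyph. Whether tradegothic_lt_boldcondtwenty.ttf has one is not known here.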
How can we extract text content from a PDF file? We are using PDFBox to extract the text, but we also get the header and footer, which are not required. I am using the following Java code:
PDFTextStripper stripper = null;
try {
stripper = new PDFTextStripper();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
stripper.setStartPage(pageCount);
stripper.setEndPage(pageCount);
try {
String pageText = stripper.getText(document);
System.out.println(pageText);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
You have tagged this as an itext/itextpdf question, yet you are using PdfBox. That's confusing.
You also claim that your PDF file has headers and footers. This would imply that your PDF is a Tagged PDF and that the header and the footer are marked as artifacts. If that is the case, then you should take advantage of the Tagged nature of the PDF, and extract the content as is done in the ParseTaggedPdf example:
TaggedPdfReaderTool readertool = new TaggedPdfReaderTool();
PdfReader reader = new PdfReader(StructuredContent.RESULT);
readertool.convertToXml(reader, new FileOutputStream(RESULT));
reader.close();
If this doesn't result in anything, you clearly don't have a Tagged PDF in which case there are no headers and footers in your document from a technical point of view. You may see headers and footers with your human eyes, but that doesn't mean that a machine sees these headers and footers. To a machine, it's just text like any other text in the page.
The ExtractPageContentArea example shows how we can define a rectangle that excludes the header and the footer when parsing for the content.
PdfReader reader = new PdfReader(pdf);
PrintWriter out = new PrintWriter(new FileOutputStream(txt));
Rectangle rect = new Rectangle(70, 80, 490, 580);
RenderFilter filter = new RegionTextRenderFilter(rect);
TextExtractionStrategy strategy;
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
out.println(PdfTextExtractor.getTextFromPage(reader, i, strategy));
}
out.flush();
out.close();
reader.close();
In this case, we have examined the document manually and we noticed that the actual text is always added inside the rectangle new Rectangle(70, 80, 490, 580). The header is added above Y coordinate 580 and the footer below Y coordinate 80. By using the RegionTextRenderFilter we can extract the content, excluding anything that doesn't overlap with the rectangle we have defined.
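The idea behind the region filter can be sketched without iText: treat each piece of text as a baseline segment and keep it only if that segment touches the rectangle. The sketch below uses java.awt.geom from the JDK; the Chunk class, its fields, and the sample coordinates are hypothetical, and this is a simplified stand-in rather than iText's actual implementation:

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of region-based filtering; not iText code.
final class RegionFilterSketch {

    // A piece of text with the start and end of its baseline, in PDF points.
    static final class Chunk {
        final String text;
        final float startX, endX, baselineY;

        Chunk(String text, float startX, float endX, float baselineY) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
            this.baselineY = baselineY;
        }
    }

    // Keeps only the chunks whose baseline intersects the region.
    static List<String> keepInside(List<Chunk> chunks, Rectangle2D region) {
        List<String> kept = new ArrayList<>();
        for (Chunk c : chunks) {
            if (region.intersectsLine(c.startX, c.baselineY, c.endX, c.baselineY)) {
                kept.add(c.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same area as new Rectangle(70, 80, 490, 580), given as x, y, width, height.
        Rectangle2D region = new Rectangle2D.Float(70f, 80f, 490f - 70f, 580f - 80f);
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("Page header", 72f, 400f, 600f));
        chunks.add(new Chunk("Body text", 72f, 400f, 300f));
        chunks.add(new Chunk("Page footer", 72f, 400f, 40f));
        System.out.println(keepInside(chunks, region));
    }
}
```

The rectangle coordinates still have to be found by inspecting the document, as described above; the sketch only shows why text above 580 or below 80 drops out.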