PDFBox why is the image not appearing in the PDF output?

PDFBox why is the image not appearing in the PDF output? - java

I have (I think) correctly followed the instructions from: https://memorynotfound.com/apache-pdfbox-add-image-pdf-document/
I am attempting to insert an image logo.png. The code runs and doesn't throw up any errors, but the resulting PDF does not contain an image! The text does appear as expected. Does anybody know why this is and how to fix it?
I'm using Java 8 in Apache NetBeans 11.
Thanks. Here's the code:
public void generate(File samplefile) throws IOException {
PDDocument document = new PDDocument();
//Adding the blank page to the document
//Repeat this next line for further pages
PDPage page = new PDPage();
document.addPage(page);
File dir = new File(ArdenRecord.sadd + "/Sample Reports");
if (!dir.exists()) {
dir.mkdir();
}
String fname = samplefile.toString().split("\\.")[0].split("\\\\")[2];
File f = new File(ArdenRecord.sadd + "/Sample Reports/" + fname + ".pdf");
File imfile = new File(ArdenRecord.sadd + "/logo.png");
PDImageXObject pdImage = PDImageXObject.createFromFile(imfile.toString(), document);
PDPageContentStream contents = new PDPageContentStream(document, page);
PDRectangle mediaBox = page.getMediaBox();
float startX = (mediaBox.getWidth() - pdImage.getWidth()) / 2;
float startY = (mediaBox.getHeight() - pdImage.getHeight()) / 2;
contents.drawImage(pdImage, startX, startY);
contents.beginText();
contents.newLineAtOffset(25, 700);
contents.setFont(PDType1Font.TIMES_ROMAN, 12);
BufferedReader br = new BufferedReader(new FileReader(samplefile));
String st;
int n = 0;
while ((st = br.readLine()) != null) {
if (n < 4 || n > 20 && n < 30) {
contents.showText(st);
contents.newLineAtOffset(0, -18);
}
n++;
}
contents.endText();
contents.close();
document.save(f);
document.close();
Desktop.getDesktop().open(f);
}
}```

Maybe there's nothing wrong with your code. I copy-pasted (removed .split("\\\\")[2] to get path right), compiled and tested it with PDFBox 2.0.17, OpenJDK 8, this PNG file and the first chapter of Lorem Ipsum in a text file. See the result below (Adobe Reader screenshot).
At least you should try with a different PNG file.

Related

How move MediaBox to 0.0 with itext(itextsharp)

The MediaBox coordinate of my PDF file is (-8,-8), now I want to set it (0,0).
I tried to set it directly, but the contents of the file were offset.
So I want to change the MediaBox coordinates and move the content as well.
Here's the itextshare code(c#). I'm glad to be able to solve it with Java itext.
using (PdfReader pdfReader = new PdfReader(#"MediaBoxZero.pdf"))
{
using (PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(#"MediaBoxZero_Result.pdf", FileMode.Create)))
{
var mediaBox = pdfReader.GetBoxSize(1, "media");
PdfArray mediaBoxN = new PdfArray();
mediaBoxN.Add(new float[] { 0, 0, mediaBox.Width, mediaBox.Height });
for (int curPageNum = 1; curPageNum <= pdfReader.NumberOfPages; ++curPageNum)
{
PdfDictionary pagedict = pdfReader.GetPageN(curPageNum);
pagedict.Put(PdfName.MEDIABOX, mediaBoxN);
}
}
}
I tried affine transformation, but it didn't work. Affine transformation should only work when generating new PDFs, and I wanted to edit existing PDFs.
using (PdfReader pdfReader = new PdfReader(#"MediaBoxZero.pdf"))
{
using (PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(#"MediaBoxZero_Result.pdf", FileMode.Create)))
{
PdfContentByte pb = stamper.GetOverContent(1);
AffineTransform at = new AffineTransform();
at.Translate(100,0);
pb.Transform(at);
pb.ConcatCTM(at);
//var mediaBox = pdfReader.GetBoxSize(1, "media");
//PdfArray mediaBoxN = new PdfArray();
//mediaBoxN.Add(new float[] { 0, 0, mediaBox.Width, mediaBox.Height });
//for (int curPageNum = 1; curPageNum <= pdfReader.NumberOfPages; ++curPageNum)
//{
// PdfDictionary pagedict = pdfReader.GetPageN(curPageNum);
// foreach (var item in pagedict.GetEnumerator())
// {
// }
// pagedict.Put(PdfName.MEDIABOX, mediaBoxN);
//}
}
}
}

You can shift the page contents like this with iTextSharp to match the MediaBox change:
using (PdfReader pdfReader = new PdfReader(SOURCE_PDF))
{
for (int i = 1; i <= pdfReader.NumberOfPages; i++)
{
Rectangle mediaBox = pdfReader.GetPageSize(i);
if (mediaBox.Left == 0 && mediaBox.Bottom == 0)
continue;
PdfDictionary pageDict = pdfReader.GetPageN(i);
pageDict.Put(PdfName.MEDIABOX, new PdfArray { new PdfNumber(0), new PdfNumber(0),
new PdfNumber(mediaBox.Width), new PdfNumber(mediaBox.Height) });
Rectangle cropBox = pdfReader.GetBoxSize(i, "crop");
if (cropBox != null)
{
pageDict.Put(PdfName.CROPBOX, new PdfArray { new PdfNumber(cropBox.Left - mediaBox.Left),
new PdfNumber(cropBox.Bottom-mediaBox.Bottom), new PdfNumber(cropBox.Right - mediaBox.Left),
new PdfNumber(cropBox.Top - mediaBox.Bottom) });
}
using (MemoryStream stream = new MemoryStream())
{
string translation = String.Format(CultureInfo.InvariantCulture, "1 0 0 1 {0} {1} cm\n", -mediaBox.Left, -mediaBox.Bottom);
byte[] translationBytes = Encoding.ASCII.GetBytes(translation);
stream.Write(translationBytes, 0, translationBytes.Length);
byte[] contentBytes = pdfReader.GetPageContent(i);
stream.Write(contentBytes, 0, contentBytes.Length);
pdfReader.SetPageContent(i, stream.ToArray());
}
}
using (FileStream fileStream = new FileStream(#"MediaBox-normalized.pdf", FileMode.Create))
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, fileStream))
{
}
}
Some remarks:
You cannot simply manipulate the UnderContent of the respective page because iText attempts to prevent changes to the graphics state there (like a change of the current transformation matrix) from bleeding through to the existing content. Thus, here we update the page content on the byte level.
The code updates the CropBox (if set) alongside the MediaBox. Strictly speaking it should also update the other boxes known to PDF pages.
The code ignores annotations. If your PDFs have annotations, you have to also move the annotation Rects and other annotation properties in userspace coordinates, like QuadPoints, Vertices, etc.

PDFBox does not write the message I want into the page

I'm looking for at least an hour at my code but I can't find the bug. I'm using PDFBox to create PDF's (PDFBox HelloWorld Example). To learn how PDFBox works I just wanted to create some pages with "hello world" and the page number like "hello world 1", "hello world 2" and so on. As you can see I created a for loop to create six pages.
private void drawPDF(PDDocument doc, File file) throws IOException {
for (int pageIndex = 0; pageIndex < 6; pageIndex++) {
PDPage page = new PDPage(PDRectangle.A4);
doc.addPage(page);
PDFont font = PDType1Font.HELVETICA_BOLD;
String message = "hello world " + (pageIndex + 1);
float stringHeight = font.getFontDescriptor().getFontBoundingBox().getHeight() * FONT_SIZE;
PDRectangle pageSize = page.getMediaBox();
try (PDPageContentStream contents = new PDPageContentStream(doc, page)) {
contents.beginText();
contents.setFont(font, FONT_SIZE);
contents.setTextMatrix(Matrix.getTranslateInstance(0, pageSize.getHeight() - stringHeight / 1000f));
contents.showText(message);
System.out.println(message + " - " + doc.getNumberOfPages());
contents.endText();
}
}
doc.save(file);
}
In my console I get the following output (first number is pageIndex, second number is doc.getNumberOfPages()):
hello world 1 - 1
hello world 2 - 2
hello world 3 - 3
hello world 4 - 4
hello world 5 - 5
hello world 6 - 6
This is my load function to view the pdf.
private final ObservableList<Image> pdfFilePages = FXCollections.observableArrayList();
private void loadFile(File file) throws FileNotFoundException, IOException {
if (file != null) {
ByteBuffer buffer = null;
try (RandomAccessFile raf = new RandomAccessFile(file, "r"); FileChannel channel = raf.getChannel()) {
buffer = ByteBuffer.allocate((int) channel.size());
channel.read(buffer);
buffer.flip();
PDFFile pdfFile = new PDFFile(buffer);
List<Image> pages = new ArrayList<>();
for (int i = 0; i < pdfFile.getNumPages(); i++) {
PDFPage page = pdfFile.getPage(i, true);
System.out.println("page: " + i + " - " + pdfFile.getNumPages());
java.awt.geom.Rectangle2D bbox = page.getBBox();
java.awt.geom.Rectangle2D rect = new Rectangle(0, 0, (int) bbox.getWidth(), (int) bbox.getHeight());
BufferedImage buffImage = new BufferedImage((int) (bbox.getWidth() * 2d),
(int) (bbox.getHeight() * 2d), BufferedImage.TYPE_INT_RGB);
java.awt.Image awtImage = page.getImage((int) (bbox.getWidth() * 2.0),
(int) (bbox.getHeight() * 2.0), rect, null, true, true);
java.awt.Graphics2D bufImageGraphics = buffImage.createGraphics();
bufImageGraphics.drawImage(awtImage, 0, 0, null);
Image image = SwingFXUtils.toFXImage(buffImage, null);
pages.add(image);
}
pdfFilePages.addAll(pages);
}
}
}
This is what I get in my console:
page: 0 - 6
page: 1 - 6
page: 2 - 6
page: 3 - 6
page: 4 - 6
page: 5 - 6
When I load the pdf file to display the content in my application i get "hello world 1 - 1" for the first and second page. The following pages have "hello world 2 - 2" to "hello world 5 - 5". I don't understand why I get two pages of "hello world 1 - 1". I hope someone can explain to me where I made a mistake.

With the help of mkl I found the bug in my code. The for loop has to start at 1 and run until the index is equal to pdfFile.getNumPages(). The method getPage() return the first page when the index is 0. Because the index is 1 in the second iteration, the first page will be passed twice. The last page won't be passed since the loop has been finished. This seems to be the right approach.
private void loadFile(File file) throws FileNotFoundException, IOException {
if (file != null) {
ByteBuffer buffer = null;
try (RandomAccessFile raf = new RandomAccessFile(file, "r"); FileChannel channel = raf.getChannel()) {
buffer = ByteBuffer.allocate((int) channel.size());
channel.read(buffer);
buffer.flip();
PDFFile pdfFile = new PDFFile(buffer);
List<Image> pages = new ArrayList<>();
for (int i = 1; i <= pdfFile.getNumPages(); i++) {
PDFPage page = pdfFile.getPage(i, true);
System.out.println("page: " + i + " - " + pdfFile.getNumPages());
java.awt.geom.Rectangle2D bbox = page.getBBox();
java.awt.geom.Rectangle2D rect = new Rectangle(0, 0, (int) bbox.getWidth(), (int) bbox.getHeight());
BufferedImage buffImage = new BufferedImage((int) (bbox.getWidth() * 2d),
(int) (bbox.getHeight() * 2d), BufferedImage.TYPE_INT_RGB);
java.awt.Image awtImage = page.getImage((int) (bbox.getWidth() * 2.0),
(int) (bbox.getHeight() * 2.0), rect, null, true, true);
java.awt.Graphics2D bufImageGraphics = buffImage.createGraphics();
bufImageGraphics.drawImage(awtImage, 0, 0, null);
Image image = SwingFXUtils.toFXImage(buffImage, null);
pages.add(image);
}
pdfFilePages.addAll(pages);
}
}
}

Splitting a large Pdf file with PDFBox gets large result files

I am processing some large pdf files, (up to 100MB and about 2000 pages), with pdfbox. Some of the pages contain a QR code, I want to split those files into smaller ones with the pages from one QR code to the next.
I got this, but the result file sizes are the same as the source file. I mean, if I cut a 100MB pdf file into a ten files I am getting ten files 100MB each.
This is the code:
PDDocument documentoPdf =
PDDocument.loadNonSeq(new File("myFile.pdf"),
new RandomAccessFile(new File("./tmp/temp"), "rw"));
int numPages = documentoPdf.getNumberOfPages();
List pages = documentoPdf.getDocumentCatalog().getAllPages();
int previusQR = 0;
for(int i =0; i<numPages; i++){
PDPage page = (PDPage) pages.get(i);
BufferedImage firstPageImage =
page.convertToImage(BufferedImage.TYPE_USHORT_565_RGB , 200);
String qrText = readQRWithQRCodeMultiReader(firstPageImage, hintMap);
if(qrText != null and i!=0){
PDDocument outputDocument = new PDDocument();
for(int j = previusQR; j<i; j++){
outputDocument.importPage((PDPage)pages.get(j));
}
File f = new File("./splitting_files/"+previusQR+".pdf");
outputDocument.save(f);
outputDocument.close();
documentoPdf.close();
}
I also tried the following code for storing the new file:
PDDocument outputDocument = new PDDocument();
for(int j = previusQR; j<i; j++){
PDStream src = ((PDPage)pages.get(j)).getContents();
PDStream streamD = new PDStream(outputDocument);
streamD.addCompression();
PDPage newPage = new PDPage(new
COSDictionary(((PDPage)pages.get(j)).getCOSDictionary()));
newPage.setContents(streamD);
byte[] buf = new byte[10240];
int amountRead = 0;
InputStream is = null;
OutputStream os = null;
is = src.createInputStream();
os = streamD.createOutputStream();
while((amountRead = is.read(buf,0,10240)) > -1) {
os.write(buf, 0, amountRead);
}
outputDocument.addPage(newPage);
}
File f = new File("./splitting_files/"+previusQR+".pdf");
outputDocument.save(f);
outputDocument.close();
But this code creates files which lacks some content and also have the same size than the original.
How can I create smaller pdfs files from a larger one?
Is it posible with PDFBox? Is there any other library with which I can transform a single page into an image (for qr recognition), and also allows me to split a big pdf file into smaller ones?
Thx!

Thx! Tilman you are right, the PDFSplit command generates smaller files. I checked the PDFSplit code out and found that it removes the page links to avoid not needed resources.
Code extracted from Splitter.class :
private void processAnnotations(PDPage imported) throws IOException
{
List<PDAnnotation> annotations = imported.getAnnotations();
for (PDAnnotation annotation : annotations)
{
if (annotation instanceof PDAnnotationLink)
{
PDAnnotationLink link = (PDAnnotationLink)annotation;
PDDestination destination = link.getDestination();
if (destination == null && link.getAction() != null)
{
PDAction action = link.getAction();
if (action instanceof PDActionGoTo)
{
destination = ((PDActionGoTo)action).getDestination();
}
}
if (destination instanceof PDPageDestination)
{
// TODO preserve links to pages within the splitted result
((PDPageDestination) destination).setPage(null);
}
}
else
{
// TODO preserve links to pages within the splitted result
annotation.setPage(null);
}
}
}
So eventually my code looks like this:
PDDocument documentoPdf =
PDDocument.loadNonSeq(new File("docs_compuestos/50.pdf"), new RandomAccessFile(new File("./tmp/t"), "rw"));
int numPages = documentoPdf.getNumberOfPages();
List pages = documentoPdf.getDocumentCatalog().getAllPages();
int previusQR = 0;
for(int i =0; i<numPages; i++){
PDPage firstPage = (PDPage) pages.get(i);
String qrText ="";
BufferedImage firstPageImage = firstPage.convertToImage(BufferedImage.TYPE_USHORT_565_RGB , 200);
firstPage =null;
try {
qrText = readQRWithQRCodeMultiReader(firstPageImage, hintMap);
} catch (NotFoundException e) {
e.printStackTrace();
} finally {
firstPageImage = null;
}
if(i != 0 && qrText!=null){
PDDocument outputDocument = new PDDocument();
outputDocument.setDocumentInformation(documentoPdf.getDocumentInformation());
outputDocument.getDocumentCatalog().setViewerPreferences(
documentoPdf.getDocumentCatalog().getViewerPreferences());
for(int j = previusQR; j<i; j++){
PDPage importedPage = outputDocument.importPage((PDPage)pages.get(j));
importedPage.setCropBox( ((PDPage)pages.get(j)).findCropBox() );
importedPage.setMediaBox( ((PDPage)pages.get(j)).findMediaBox() );
// only the resources of the page will be copied
importedPage.setResources( ((PDPage)pages.get(j)).getResources() );
importedPage.setRotation( ((PDPage)pages.get(j)).findRotation() );
processAnnotations(importedPage);
}
File f = new File("./splitting_files/"+previusQR+".pdf");
previusQR = i;
outputDocument.save(f);
outputDocument.close();
}
}
}
Thank you very much!!

iText : How does add image to top of PDF page

I going to convert tiff to pdf file, but image displayed bottom of page, how to start image from top of the pdf page.
private static String convertTiff2Pdf(String tiff) {
// target path PDF
String pdf = null;
try {
pdf = tiff.substring(0, tiff.lastIndexOf('.') + 1) + "pdf";
// New document A4 standard (LETTER)
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(pdf));
document.setMarginMirroring(true);
int pages = 0;
document.open();
PdfContentByte cb = writer.getDirectContent();
RandomAccessFileOrArray ra = null;
int comps = 0;
ra = new RandomAccessFileOrArray(tiff);
comps = TiffImage.getNumberOfPages(ra);
// Convertion statement
for (int c = 0; c < comps; ++c) {
Image img = TiffImage.getTiffImage(ra, c+1);
if (img != null) {
img.scalePercent(7200f / img.getDpiX(), 7200f / img.getDpiY());
img.setAbsolutePosition(0, 0);
img.scaleAbsolute(600, 250);
cb.addImage(img);
document.newPage();
++pages;
}
}
ra.close();
document.close();
} catch (Exception e) {
System.out.println(e);
pdf = null;
}
System.out.println("[" + tiff + "] -> [" + pdf + "] OK");
return pdf;
}

You are creating a new document with A4 pages (as opposed to using the LETTER format). These pages have a width of 595 pt and a height of 842 pt. The origin of the coordinate system (0, 0) is in the lower-left corner, which is exactly where you're adding the image using the method setAbsolutePosition(0, 0);
Surprisingly, you don't adapt the size of the page to the size of the image. Instead you want to add the image at the top of the page. In this case, you need to change the coordinates of the absolute position like this:
img.setAbsolutePosition(0, PageSize.A4.getHeight() - img.getScaledHeight());
If img.getScaledHeight() exceeds PageSize.A4.getHeight() (which is equal to 842), your image will be clipped at the bottom. The image will be clipped on the right if img.getScaledWidth() exceeds PageSize.A4.getWidth() (which is equal to 595).

Based in the answer, this code center any size image.
image.setAbsolutePosition((PageSize.A4.getWidth() - img.getScaledWidth())/2, (PageSize.A4.getHeight() - img.getScaledHeight())/2 );

How to generate multiple lines in PDF using Apache pdfbox

I am using Pdfbox to generate PDF files using Java. The problem is that when i add long text contents in the document, it is not displayed properly. Only a part of it is displayed. That too in a single line.
I want text to be in multiple lines.
My code is given below:
PDPageContentStream pdfContent=new PDPageContentStream(pdfDocument, pdfPage, true, true);
pdfContent.beginText();
pdfContent.setFont(pdfFont, 11);
pdfContent.moveTextPositionByAmount(30,750);
pdfContent.drawString("I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox");
pdfContent.endText();
My output:

Adding to the answer of Mark you might want to know where to split your long string. You can use the PDFont method getStringWidth for that.
Putting everything together you get something like this (with minor differences depending on the PDFBox version):
PDFBox 1.8.x
PDDocument doc = null;
try
{
doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(doc, page);
PDFont pdfFont = PDType1Font.HELVETICA;
float fontSize = 25;
float leading = 1.5f * fontSize;
PDRectangle mediabox = page.getMediaBox();
float margin = 72;
float width = mediabox.getWidth() - 2*margin;
float startX = mediabox.getLowerLeftX() + margin;
float startY = mediabox.getUpperRightY() - margin;
String text = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox";
List<String> lines = new ArrayList<String>();
int lastSpace = -1;
while (text.length() > 0)
{
int spaceIndex = text.indexOf(' ', lastSpace + 1);
if (spaceIndex < 0)
spaceIndex = text.length();
String subString = text.substring(0, spaceIndex);
float size = fontSize * pdfFont.getStringWidth(subString) / 1000;
System.out.printf("'%s' - %f of %f\n", subString, size, width);
if (size > width)
{
if (lastSpace < 0)
lastSpace = spaceIndex;
subString = text.substring(0, lastSpace);
lines.add(subString);
text = text.substring(lastSpace).trim();
System.out.printf("'%s' is line\n", subString);
lastSpace = -1;
}
else if (spaceIndex == text.length())
{
lines.add(text);
System.out.printf("'%s' is line\n", text);
text = "";
}
else
{
lastSpace = spaceIndex;
}
}
contentStream.beginText();
contentStream.setFont(pdfFont, fontSize);
contentStream.moveTextPositionByAmount(startX, startY);
for (String line: lines)
{
contentStream.drawString(line);
contentStream.moveTextPositionByAmount(0, -leading);
}
contentStream.endText();
contentStream.close();
doc.save("break-long-string.pdf");
}
finally
{
if (doc != null)
{
doc.close();
}
}
(BreakLongString.java test testBreakString for PDFBox 1.8.x)
PDFBox 2.0.x
PDDocument doc = null;
try
{
doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(doc, page);
PDFont pdfFont = PDType1Font.HELVETICA;
float fontSize = 25;
float leading = 1.5f * fontSize;
PDRectangle mediabox = page.getMediaBox();
float margin = 72;
float width = mediabox.getWidth() - 2*margin;
float startX = mediabox.getLowerLeftX() + margin;
float startY = mediabox.getUpperRightY() - margin;
String text = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox";
List<String> lines = new ArrayList<String>();
int lastSpace = -1;
while (text.length() > 0)
{
int spaceIndex = text.indexOf(' ', lastSpace + 1);
if (spaceIndex < 0)
spaceIndex = text.length();
String subString = text.substring(0, spaceIndex);
float size = fontSize * pdfFont.getStringWidth(subString) / 1000;
System.out.printf("'%s' - %f of %f\n", subString, size, width);
if (size > width)
{
if (lastSpace < 0)
lastSpace = spaceIndex;
subString = text.substring(0, lastSpace);
lines.add(subString);
text = text.substring(lastSpace).trim();
System.out.printf("'%s' is line\n", subString);
lastSpace = -1;
}
else if (spaceIndex == text.length())
{
lines.add(text);
System.out.printf("'%s' is line\n", text);
text = "";
}
else
{
lastSpace = spaceIndex;
}
}
contentStream.beginText();
contentStream.setFont(pdfFont, fontSize);
contentStream.newLineAtOffset(startX, startY);
for (String line: lines)
{
contentStream.showText(line);
contentStream.newLineAtOffset(0, -leading);
}
contentStream.endText();
contentStream.close();
doc.save(new File(RESULT_FOLDER, "break-long-string.pdf"));
}
finally
{
if (doc != null)
{
doc.close();
}
}
(BreakLongString.java test testBreakString for PDFBox 2.0.x)
The result
This looks as expected.
Of course there are numerous improvements to make but this should show how to do it.
Adding unconditional line breaks
In a comment aleskv asked:
could you add line breaks when there are \n in the string?
One can easily extend the solution to unconditionally break at newline characters by first splitting the string at '\n' characters and then iterating over the split result.
E.g. if instead of the long string from above
String text = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox";
you want to process this even longer string with embedded new line characters
String textNL = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox.\nFurthermore, I have added some newline characters to the string at which lines also shall be broken.\nIt should work alright like this...";
you can simply replace
String text = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox";
List<String> lines = new ArrayList<String>();
int lastSpace = -1;
while (text.length() > 0)
{
[...]
}
in the solutions above by
String textNL = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox.\nFurthermore, I have added some newline characters to the string at which lines also shall be broken.\nIt should work alright like this...";
List<String> lines = new ArrayList<String>();
for (String text : textNL.split("\n"))
{
int lastSpace = -1;
while (text.length() > 0)
{
[...]
}
}
(from BreakLongString.java test testBreakStringNL)
The result:

I know it's a bit late, but i had a little problem with mkl's solution. If the last line would only contain one word, your algorithm writes it on the previous one.
For Example: "Lorem ipsum dolor sit amet" is your text and it should add a line break after "sit".
Lorem ipsum dolor sit
amet
But it does this:
Lorem ipsum dolor sit amet
I came up with my own solution i want to share with you.
/**
* #param text The text to write on the page.
* #param x The position on the x-axis.
* #param y The position on the y-axis.
* #param allowedWidth The maximum allowed width of the whole text (e.g. the width of the page - a defined margin).
* #param page The page for the text.
* #param contentStream The content stream to set the text properties and write the text.
* #param font The font used to write the text.
* #param fontSize The font size used to write the text.
* #param lineHeight The line height of the font (typically 1.2 * fontSize or 1.5 * fontSize).
* #throws IOException
*/
private void drawMultiLineText(String text, int x, int y, int allowedWidth, PDPage page, PDPageContentStream contentStream, PDFont font, int fontSize, int lineHeight) throws IOException {
List<String> lines = new ArrayList<String>();
String myLine = "";
// get all words from the text
// keep in mind that words are separated by spaces -> "Lorem ipsum!!!!:)" -> words are "Lorem" and "ipsum!!!!:)"
String[] words = text.split(" ");
for(String word : words) {
if(!myLine.isEmpty()) {
myLine += " ";
}
// test the width of the current line + the current word
int size = (int) (fontSize * font.getStringWidth(myLine + word) / 1000);
if(size > allowedWidth) {
// if the line would be too long with the current word, add the line without the current word
lines.add(myLine);
// and start a new line with the current word
myLine = word;
} else {
// if the current line + the current word would fit, add the current word to the line
myLine += word;
}
}
// add the rest to lines
lines.add(myLine);
for(String line : lines) {
contentStream.beginText();
contentStream.setFont(font, fontSize);
contentStream.moveTextPositionByAmount(x, y);
contentStream.drawString(line);
contentStream.endText();
y -= lineHeight;
}
}

///// FOR PDBOX 2.0.X
// FOR ADDING DYNAMIC PAGE ACCORDING THE LENGTH OF THE CONTENT
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
public class Document_Creation {
public static void main (String args[]) throws IOException {
PDDocument doc = null;
try
{
doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(doc, page);
PDFont pdfFont = PDType1Font.HELVETICA;
float fontSize = 25;
float leading = 1.5f * fontSize;
PDRectangle mediabox = page.getMediaBox();
float margin = 72;
float width = mediabox.getWidth() - 2*margin;
float startX = mediabox.getLowerLeftX() + margin;
float startY = mediabox.getUpperRightY() - margin;
String text = "I am trying to create a PDF file with a lot of text contents in the document. I am using PDFBox.An essay is, generally, a piece of writing that gives the author's own argument — but the definition is vague, overlapping with those of an article, a pamphlet, and a short story. Essays have traditionally been sub-classified as formal and informal. Formal essays are characterized by serious purpose, dignity, logical organization, length,whereas the informal essay is characterized by the personal element (self-revelation, individual tastes and experiences, confidential manner), humor, graceful style, rambling structure, unconventionality or novelty of theme.Lastly, one of the most attractive features of cats as housepets is their ease of care. Cats do not have to be walked. They get plenty of exercise in the house as they play, and they do their business in the litter box. Cleaning a litter box is a quick, painless procedure. Cats also take care of their own grooming. Bathing a cat is almost never necessary because under ordinary circumstances cats clean themselves. Cats are more particular about personal cleanliness than people are. In addition, cats can be left home alone for a few hours without fear. Unlike some pets, most cats will not destroy the furnishings when left alone. They are content to go about their usual activities until their owners return.";
List<String> lines = new ArrayList<String>();
int lastSpace = -1;
while (text.length() > 0)
{
int spaceIndex = text.indexOf(' ', lastSpace + 1);
if (spaceIndex < 0)
spaceIndex = text.length();
String subString = text.substring(0, spaceIndex);
float size = fontSize * pdfFont.getStringWidth(subString) / 1000;
System.out.printf("'%s' - %f of %f\n", subString, size, width);
if (size > width)
{
if (lastSpace < 0)
lastSpace = spaceIndex;
subString = text.substring(0, lastSpace);
lines.add(subString);
text = text.substring(lastSpace).trim();
System.out.printf("'%s' is line\n", subString);
lastSpace = -1;
}
else if (spaceIndex == text.length())
{
lines.add(text);
System.out.printf("'%s' is line\n", text);
text = "";
}
else
{
lastSpace = spaceIndex;
}
}
contentStream.beginText();
contentStream.setFont(pdfFont, fontSize);
contentStream.newLineAtOffset(startX, startY);
float currentY=startY;
for (String line: lines)
{
currentY -=leading;
if(currentY<=margin)
{
contentStream.endText();
contentStream.close();
PDPage new_Page = new PDPage();
doc.addPage(new_Page);
contentStream = new PDPageContentStream(doc, new_Page);
contentStream.beginText();
contentStream.setFont(pdfFont, fontSize);
contentStream.newLineAtOffset(startX, startY);
currentY=startY;
}
contentStream.showText(line);
contentStream.newLineAtOffset(0, -leading);
}
contentStream.endText();
contentStream.close();
doc.save("C:/Users/VINAYAK/Desktop/docccc/break-long-string.pdf");
}
finally
{
if (doc != null)
{
doc.close();
}
}
}
}

Just draw the string in a position below, typically done within a loop:
float textx = margin+cellMargin;
float texty = y-15;
for(int i = 0; i < content.length; i++){
for(int j = 0 ; j < content[i].length; j++){
String text = content[i][j];
contentStream.beginText();
contentStream.moveTextPositionByAmount(textx,texty);
contentStream.drawString(text);
contentStream.endText();
textx += colWidth;
}
texty-=rowHeight;
textx = margin+cellMargin;
}
These are the important lines:
contentStream.beginText();
contentStream.moveTextPositionByAmount(textx,texty);
contentStream.drawString(text);
contentStream.endText();
Just keep drawing new strings in new positions. For an example using a table, see here:
http://fahdshariff.blogspot.ca/2010/10/creating-tables-with-pdfbox.html

contentStream.moveTextPositionByAmount(textx,texty) is key point.
say for example if you are using a A4 size means 580,800 is width and height correspondling(approximately). so you have move your text based on the position of your document size.
PDFBox supports varies page format . so the height and width will vary for different page format

Pdfbox-layout abstracts out all the tedious details of managing the layout. As a complete Kotlin example, here is how to convert a text file to a pdf without worrying about line wrapping and pagination.
import org.apache.pdfbox.pdmodel.font.PDType1Font
import rst.pdfbox.layout.elements.Document
import rst.pdfbox.layout.elements.Paragraph
import java.io.File
fun main() {
val textFile = "input.txt"
val pdfFile = "output.pdf"
val font = PDType1Font.COURIER
val fontSize = 12f
val document = Document(40f, 50f, 40f, 60f)
val paragraph = Paragraph()
File(textFile).forEachLine {
paragraph.addText("$it\n", fontSize, font)
}
document.add(paragraph)
document.save(File(pdfFile))
}

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

PDFBox why is the image not appearing in the PDF output? - java

Related

How move MediaBox to 0.0 with itext(itextsharp)

PDFBox does not write the message I want into the page

Splitting a large Pdf file with PDFBox gets large result files

iText : How does add image to top of PDF page

How to generate multiple lines in PDF using Apache pdfbox

Categories

Resources