How to convert PDDocument to Base64 String in Java? - java

How to convert PDDocument (the pdf document contains words and images, if it is possible) to Base64 String? Is there any suggestion of code. Please.

The answer assumes that you are using jdk8 or higher, if not, please see here.
import java.util.Base64;
...
ByteArrayOutputStream baos = new ByteArrayOutputStream();
doc.save(baos);
String base64String = Base64.getEncoder().encodeToString(baos.toByteArray());
doc.close(); // don't forget to close your document

Related

How to convert an html String to a PDF InputStream?

If we have string with a content of a html page, how can we convert it to a InputStream made after transform this string to a pdf document?
I'm trying to use iText with XMLWorkerHelper, and this following code works, but the problem is I don't want the output on a file. I have tried several variations in order to get the result on a InputStream that I could convert to a Primefaces StreamedContent but no success. How we can do it?
Is there another technique that we can use to solve this problem?
The motivation to this is use xhtml files wich is already rendered and output it as a pdf to be downloaded by the user.
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document,
new FileOutputStream("results/loremipsum.pdf"));
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
new FileInputStream("/html/loremipsum.html"));
document.close();
If you need an InputStream from which some other code can read the PDF your code produces, you can simply create the PDF using a byte array output stream and thereafter wrap the byte array from that stream in a byte array input stream:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, baos);
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream("/html/loremipsum.html"));
document.close();
ByteArrayInputStream pdfInputStream = new ByteArrayInputStream(baos.toByteArray());
You can optimize this a bit by creating and processing the PDF in different threads and using a PipedOutputStream and a PipedInputStream instead.

Convert a PDF with forms to a PDF with text only (preserve data) using iText

I have multiple PDFs that get populated with multiple records (a.pdf,b.pdf,c[0-9].pdf,d[0-9].pdf,ez.pdf) using acroforms and pdfbox.
The resulting files (aflat.pdf,bflat.pdf,c[0-9]flat.pdf,d[0-9]flat.pdf,ezflat.pdf) should have their forms(dictionaries and whatever adobe uses) removed but the fields filled as raw text saved on the pdf (setReadOnly is not what I want!).
PdfStamper can only remove fields without saving their content but I've found some references to PdfContentByte as a way to save the content. Alas, the documentation is too brief to understand how I should do this.
As a last resort I could use FieldPosition to write directly on the PDF. Has anyone ever encountered such problem? How do I solve it?
UPDATE: Saving a single page of b.pdf yields a valid bfilled.pdf but a blank bflattened.pdf. Saving the whole document solved the issue.
populateB();
try (PDDocument doc = new PDDocument(); FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
//importing the page will corrupt the fields
/*wrong approach*/doc.importPage((PDPage)pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
/*wrong approach*/doc.save(stream);
//save the whole document instead
pdfDocuments.get(0).save(stream);//<---right approach
}
try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
stamper.setFormFlattening(true);
stamper.close();
}
Use PdfStamper.setFormFlattening(true) to get rid of the fields and write them as content.
Always use the whole page when working with acroforms
populateB();
try (PDDocument doc = new PDDocument(); FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
//importing the page will corrupt the fields
doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
doc.save(stream);
//save the whole document instead
pdfDocuments.get(0).save(stream);
}
try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
stamper.setFormFlattening(true);
stamper.close();
}

Save itext pdf as blob without physical existence.

I am using this code to generate PDF using iText. First it creates HTML to PDF after that it converts that PDF in byte array or in BLOB or in byte array.
I dont want to create any physical stores of pdf on my server. First i want to convert HTML to blob of PDF using itext, And after that i want to store that blob in my DB(Stores in DB i will done).
String userAccessToken=requests.getSession()
.getAttribute("access_token").toString();
Document document = new Document(PageSize.LETTER);
String name="/pdf/invoice.pdf";
PdfWriter pdfWriter = PdfWriter.getInstance
(document, new FileOutputStream(requests.getSession().getServletContext().getRealPath("")+"/assets"+name));
document.open();
document.addAuthor("Real Gagnon");
document.addCreator("Real's HowTo");
document.addSubject("Thanks for your support");
document.addTitle("Please read this ");
XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
//data is an html string
String str = data;
worker.parseXHtml(pdfWriter, document, new StringReader(str));
document.close();
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
PdfWriter.getInstance(document, byteArrayOutputStream);
byte[] pdfBytes = byteArrayOutputStream.toByteArray();
link=name;
System.out.println("Byte array is "+pdfBytes);
PROBLEM:- Convert html to pdf BLOB using itext, Without physical existence of PDF.
The other answer to this question is almost correct, but not quite.
You can use any OutputStream when you create a PdfWriter. If you want to create a file entirely in memory, you can use a ByteArrayOutputStream like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter.getInstance(document, baos);
document.open();
// add stuff
document.close();
byte[] pdf = baos.toByteArray();
In short: you first create a ByteArrayOutputStream, you pass this OutputStream to the PdfWriter and after the document is closed, you can get the bytes from the OutputStream.
(In the other answer, there was no way to retrieve the bytes. Also: it is important that you don't try to retrieve the bytes before the document is closed.)
Write into a ByteArrayOutputStream (instead of a FileOutputStream):
PdfWriter pdfWriter = PdfWriter.getInstance
(document, new ByteArrayOutputStream());

Itext PDF shows blank page

I have the below sample code which generates a PDF file using iText.
The question I have is when I create the base64Binary by DatatypeConverter.printBase64Binary method..
I tried to copy the Sysem.out.println of "base64Binary".
Used a online base64 online decoder tool to decode the content and save it output as sample.pdf and
when I try to open the sample.pdf it shows empty. I am not sure why its behaving this way and help would be much appreciated.
But when I directly decode using java and write it to a disk file it shows the context.
Can someone help me understand why it shows blank when I try to save "base64Binary" output as sample.pdf.
Thanks.
Below is the code snippet:
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.xml.bind.DatatypeConverter;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.Font;
import com.itextpdf.text.Font.FontFamily;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfWriter;
/**
* Creates a PDF file in memory.
*/
public class HelloWorldMemory {
/** Path to the resulting PDF file. */
public static final String RESULT = "C:////hello_memory.pdf";
public static void main(final String[] args) throws DocumentException, IOException {
// step 1
final Document document = new Document();
final ByteArrayOutputStream baos = new ByteArrayOutputStream();
final PdfWriter writer = PdfWriter.getInstance(document, baos);
document.open();
final PdfContentByte cb = writer.getDirectContent();
cb.beginText();
cb.setFontAndSize(getBaseFont(Font.NORMAL), 24);
final float exPosition = (PageSize.A4.getWidth()) / 2;
cb.showTextAligned(Element.ALIGN_CENTER, "Test No", exPosition, 670, 0);
cb.endText();
document.add(new Paragraph("Hello World!"));
document.close();
System.out.println("baos.toByteArray():" + baos.toByteArray());
final String base64Binary = DatatypeConverter.printBase64Binary(baos.toByteArray());
System.out.println("base64Binary:" + base64Binary);
final byte[] txt = DatatypeConverter.parseBase64Binary(base64Binary);
final FileOutputStream fos = new FileOutputStream(RESULT);
fos.write(txt);
fos.close();
}
private static BaseFont getBaseFont(final int fontType) {
final Font f = new Font(FontFamily.HELVETICA, 0, fontType);
final BaseFont baseFont = f.getCalculatedBaseFont(true);
return baseFont;
}
}
This question is not really related to iText or PDF. You'll have the same problem with any binary data that is base64 encoded. When using the online base64 decoder, your binary data gets corrupted somehow. Bruno already explained in his answer why this does not completely invalidate a file in case of a PDF.
The data is probably corrupted because of encoding issues. Maybe the online base64 decoder displayed the decoded data in a textarea or something and you copy/pasted it into a file? If you use a decoder that offers you a binary file for download, the result should be fine.
I tested with http://www.opinionatedgeek.com/dotnet/tools/base64decode/ (the first hit of a Google search). When I save the .bin file and rename it to .pdf, it displays as expected in a PDF viewer.
PDF is a binary file format based on the Carousel Object System (COS) syntax and the AIM (Adobe Imaging Model). The COS objects use ASCII for the file structures, but the streams for images and AIM syntax are usually binary. When you copy a PDF file without respecting the binary aspect of the file, a PDF viewer can render the document structure (the pages) based on the ASCII COS objects, but not the content (what's on the pages). This is probably what happens in your case: you're corrupting the bytes in the content streams.

How to convert String to Image (Java)

I have an issue about my Application in detail.
- I have a java servlet receive data from mms gateway (MM7 protocol)
I get inputstream (image content , message content ) convert to string
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
//String orgin = new String(byteArrayOutputStream.toByteArray(),"UTF-8");
String orgin = Streams.asString(request.getInputStream(), "ISO-8859-1");
Then I substring orgin for image content and convert to base64 and save to image file
but string that I convert to base64 can not save to image because this error
not a jpeg file
I print out string base64 does not start with /9j that mean not jpg format
please suggest or give an example for me
Best reqard
lieang noob noob
sorry for my english :)
This is at least part of your problem:
String orgin = Streams.asString(request.getInputStream(), "ISO-8859-1");
You shouldn't be converting it into a string to start with. It's binary data, right? So read it from the stream as binary data.
Now it sounds like you basically want to get separate "chunks" of that binary data - but converting the data into a string format to start with is not appropriate, unless that binary data really is encoded text.
I think, this might help: http://www.dailycoding.com/Posts/convert_image_to_base64_string_and_base64_string_to_image.aspx
Encoding of image is very simple.
Encoding Source:
Bitmap bm = BitmapFactory.decodeFile(picturePath);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
bm.compress(Bitmap.CompressFormat.JPEG, 100, baos);
byte[] byteArray = baos.toByteArray();
encodedImage = Base64.encodeToString(byteArray, Base64.DEFAULT);
ImageView imageView = (ImageView) findViewById(R.id.imageView1);
imageView.setImageBitmap(BitmapFactory.decodeFile(picturePath));
Decoding Source:
byte[] decodedString;
decodedString = Base64.decode(picture, Base64.DEFAULT);
imageView1.setImageBitmap( BitmapFactory.decodeByteArray(decodedString, 0, decodedString.length));
Use the below code the for the String to Image
here "origin" is a String
import org.apache.commons.codec.binary.Base64;
byte[] imgByteArray = Base64.decodeBase64(origin);
FileOutputStream imgOutFile = new FileOutputStream("C:\\Workspaces\\String_To_Image.jpg");
imgOutFile.write(imgByteArray);
imgOutFile.close();

Categories