How to change PDF margin while converting .docx to pdf using opensagres - java

I'm using this code to convert an XWPFDocument to PDF:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfOptions options = PdfOptions.create();
PdfConverter.getInstance().convert(xwpfDocument, baos, options);
The converter is setting wrong margins on the pdf. Here is the original .docx:
And this is the pdf after conversion:
How can I retain the original left and right margins?

Related

ă ș ț characters missing from pdf generated from html with PdfWriter

I am trying to convert some html content to a pdf using the itext PdfWriter, like this:
Document document = new Document();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
InputStream stream = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
XMLWorkerHelper.getInstance().parseXHtml(writer, document, stream, Charset.forName("UTF-8"));
document.close();
but the ă ș ț charaters are missing from the generated pdf. I have tried setting the encoding or the font, but with no luck. What I tried was to use a font provider and set it as a param to the parseXHtml method.
I set the encoding, but nothing changed.
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
fontProvider.setUseUnicode(true);
fontProvider.defaultEncoding = BaseFont.CP1257;
I also tried setting the font, but it was not applied to the pdf.
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.register(PATH_TO_TTF_FONT_FILE_HOSTED_ON_S3);
And then set the param for parseXHtml.
XMLWorkerHelper.getInstance().parseXHtml(writer, document, stream, Charset.forName("UTF-8"), fontProvider);
Is there any way I could use the PdfWriter to convert all characters correctly from html to pdf?

SVG to PNG with Apache Batik then attach to PDF with PDFBox without saving images

So as the title says I am looking for a way to turn SVG to PNG with Apache Batik and then attach this image to PDF file using PDFBox without actually creating the svg and png anywhere.
Currently I have a web form that has SVG image with selectable parts of it.
When the form is submitted I take the "html" part of the svg meaning I keep something like <svg bla bla> <path bla bla/></svg> in a string that Spring then uses to create a ".svg" file in a given folder, then Batik creates a PNG file in the same folder and then PDFBox attaches it to the PDF - this works fine(code below).
//Get the svg data from the Form and Create the svg file
String svg = formData.getSvg();
File svgFile = new File("image.svg");
BufferedWriter writer = new BufferedWriter(new FileWriter(svgFile));
writer.write(svg);
writer.close();
// Send to Batik to turn to PNG
PNGTranscoder pngTranscode = new PNGTranscoder();
File svgFile = new File("image.svg");
InputStream in = new FileInputStream(svgFile);
TranscoderInput tIn = new TranscoderInput(in);
OutputStream os = new FileOutputStream("image.png");
TranscoderOutput tOut = new TranscoderOutput(os)
pngTranscode .transcode(tIn , tOut);
os.flush();
os.close();
//Send to PDFBox to attach to pdf
File pngfile = new File("image.png");
String path = pngfile.getAbsolutePath();
PDImageXObject pdImage = PDImageXObject.createFromFile(path, pdf);
PDPageContentStream contents = new PDPageContentStream(pdf, pdf.getPage(1));
contents.drawImage(pdImage, 0, pdf.getPage(1).getMediaBox().getHeight() - pdImage.getHeight());
contents.close();
As you can see there are a lot of files and stuff (need to tidy it up a bit), but is it possible to do this on the run without the creation and constant fetching of the svg and png files?
Given the suggestion in the comments I opted for using ByteArrayOutputStream, ByteArrayInputStream, BufferedImage and LosslessFactory. Its a bit slower than the saving (if you go through it in debug as seems the BufferedImage goes on a holiday first before creating the image).
The sources I found to use are: How to convert SVG into PNG on-the-fly and Print byte[] to pdf using pdfbox
byte[] streamBytes = IOUtils.toByteArray(new ByteArrayInputStream(formData.getSvg().getBytes()));
PNGTranscoder pngTranscoder = new PNGTranscoder();
ByteArrayOutputStream os = new ByteArrayOutputStream();
pngTranscoder.transcode(new TranscoderInput(new ByteArrayInputStream(streamBytes)), new TranscoderOutput(os));
InputStream is = new ByteArrayInputStream(os.toByteArray());
BufferedImage bim = ImageIO.read(is);
PDImageXObject pdImage = LosslessFactory.createFromImage(pdf, bim);
PDPageContentStream contents = new PDPageContentStream(pdf, pdf.getPage(1));
contents.drawImage(pdImage, 0, pdf.getPage(1).getMediaBox().getHeight() - pdImage.getHeight());
contents.close();
Based on the comments and links provided by D.V.D., I also worked through the problem. I just wanted to post the simple but full example, for anyone wanting to review in the future.
public class App {
private static String OUTPUT_PATH = "D:\\so52875145\\output\\PdfWithPngImage.pdf";
public static void main(String[] args) {
try {
// obtain the SVG source (hardcoded here, but the OP would obtain the String from form data)
byte[] svgByteArray = "<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\"><polygon points=\"200,10 250,190 160,210\" style=\"fill:lime;stroke:purple;stroke-width:1\" /></svg>".getBytes();
System.out.println("Converted svg to byte array...");
// convert SVG into PNG image
PNGTranscoder pngTranscoder = new PNGTranscoder();
ByteArrayOutputStream os = new ByteArrayOutputStream();
pngTranscoder.transcode(new TranscoderInput(new ByteArrayInputStream(svgByteArray)), new TranscoderOutput(os));
System.out.println("Transcoded svg to png...");
// create PDF, and add page to it
PDDocument pdf = new PDDocument();
pdf.addPage(new PDPage());
// generate in-memory PDF image object, using the transcoded ByteArray stream
BufferedImage bufferedImage = ImageIO.read( new ByteArrayInputStream(os.toByteArray()) );
PDImageXObject pdImage = LosslessFactory.createFromImage(pdf, bufferedImage);
System.out.println("Created PDF image object...");
// write the in-memory PDF image object to the PDF page
PDPageContentStream contents = new PDPageContentStream(pdf, pdf.getPage(0));
contents.drawImage(pdImage, 0, 0);
contents.close();
System.out.println("Wrote PDF image object to PDF...");
pdf.save(OUTPUT_PATH);
pdf.close();
System.out.println("Saved PDF to path=[" + OUTPUT_PATH + "]");
}
catch (Exception e) {
e.printStackTrace();
}
}
}

How to convert an html String to a PDF InputStream?

If we have string with a content of a html page, how can we convert it to a InputStream made after transform this string to a pdf document?
I'm trying to use iText with XMLWorkerHelper, and this following code works, but the problem is I don't want the output on a file. I have tried several variations in order to get the result on a InputStream that I could convert to a Primefaces StreamedContent but no success. How we can do it?
Is there another technique that we can use to solve this problem?
The motivation to this is use xhtml files wich is already rendered and output it as a pdf to be downloaded by the user.
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document,
new FileOutputStream("results/loremipsum.pdf"));
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
new FileInputStream("/html/loremipsum.html"));
document.close();
If you need an InputStream from which some other code can read the PDF your code produces, you can simply create the PDF using a byte array output stream and thereafter wrap the byte array from that stream in a byte array input stream:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, baos);
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream("/html/loremipsum.html"));
document.close();
ByteArrayInputStream pdfInputStream = new ByteArrayInputStream(baos.toByteArray());
You can optimize this a bit by creating and processing the PDF in different threads and using a PipedOutputStream and a PipedInputStream instead.

Save itext pdf as blob without physical existence.

I am using this code to generate PDF using iText. First it creates HTML to PDF after that it converts that PDF in byte array or in BLOB or in byte array.
I dont want to create any physical stores of pdf on my server. First i want to convert HTML to blob of PDF using itext, And after that i want to store that blob in my DB(Stores in DB i will done).
String userAccessToken=requests.getSession()
.getAttribute("access_token").toString();
Document document = new Document(PageSize.LETTER);
String name="/pdf/invoice.pdf";
PdfWriter pdfWriter = PdfWriter.getInstance
(document, new FileOutputStream(requests.getSession().getServletContext().getRealPath("")+"/assets"+name));
document.open();
document.addAuthor("Real Gagnon");
document.addCreator("Real's HowTo");
document.addSubject("Thanks for your support");
document.addTitle("Please read this ");
XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
//data is an html string
String str = data;
worker.parseXHtml(pdfWriter, document, new StringReader(str));
document.close();
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
PdfWriter.getInstance(document, byteArrayOutputStream);
byte[] pdfBytes = byteArrayOutputStream.toByteArray();
link=name;
System.out.println("Byte array is "+pdfBytes);
PROBLEM:- Convert html to pdf BLOB using itext, Without physical existence of PDF.
The other answer to this question is almost correct, but not quite.
You can use any OutputStream when you create a PdfWriter. If you want to create a file entirely in memory, you can use a ByteArrayOutputStream like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter.getInstance(document, baos);
document.open();
// add stuff
document.close();
byte[] pdf = baos.toByteArray();
In short: you first create a ByteArrayOutputStream, you pass this OutputStream to the PdfWriter and after the document is closed, you can get the bytes from the OutputStream.
(In the other answer, there was no way to retrieve the bytes. Also: it is important that you don't try to retrieve the bytes before the document is closed.)
Write into a ByteArrayOutputStream (instead of a FileOutputStream):
PdfWriter pdfWriter = PdfWriter.getInstance
(document, new ByteArrayOutputStream());

Exporting SWT Image into PDF using Java iText API

I have a an SWT image I want to export this image into a pdf file using iText API.
I have tried saving this image on the disk and then using the path of image to export
it to the pdf, this takes lots of time to generate the pdf.
I have also tried converting the SWT image into AWT image and then exporting it into the
pdf, this approach takes even more time to generate pdf.
Another approach I have been trying is to convert the raw data of image into
jpeg byteArrayOutputStream using ImageLoader Object as shown below :
ImageLoader tempLoader = new ImageLoader();
tempLoader.data = new ImageData[] {
image.getImageData()
};
ByteArrayOutputStream bos = new ByteArrayOutputStream();
tempLoader.save(bos, SWT.IMAGE_JPEG);
Now I am using this ByteArrayOutputStream as input to
OutputStream outStream = new FileOutputStream(selectedPathAndName);
Document document = new Document();
document.setMargins(0,0,0,0);
document.setPageSize(new Rectangle(0,0,width,height));
PdfWriter.getInstance(document, outStream);
document.open();
com.itextpdf.text.Image pdfImage = com.itextpdf.text.Image.getInstance(bos.toByteArray());
document.add(pdfImage);
document.close();
This generates pdf files with the width and height I have set, but the page seems to be empty.
Any suggestions or any other approach is most welcome.
Thank you,
It looks your page sizes are zero, try setting them to something like A4 in the constructor.
Document document = new Document(PageSize.A4, 50, 50, 50, 50);

Categories