Need help with pdf-renderer - java

I'm using PDF-Renderer to view PDF files within my Java application. It works perfectly for normal PDF files.
However, I want the application to be able to display encrypted PDF files. The encrypted file will be decrypted with CipherInputStream, but I do not want to save the decrypted data on disk. I am trying to find a way to pass the decrypted data from CipherInputStream to the PDFFile constructor without having to write it to a file.
I would also appreciate a link to a PDF-Renderer tutorial, so that I can read up more on it.
Thanks.

Try the following class:
import com.sun.pdfview.PDFFile;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
public class PDFFileUtility {
private static final int READ_BLOCK = 8192;
public static PDFFile getPDFFile(InputStream in) throws IOException {
ReadableByteChannel bc = Channels.newChannel(in);
ByteBuffer bb = ByteBuffer.allocate(READ_BLOCK);
while (bc.read(bb) != -1) {
bb = resizeBuffer(bb); //get new buffer for read
}
bb.flip(); // switch the buffer to read mode so its limit marks the end of the data
return new PDFFile(bb);
}
private static ByteBuffer resizeBuffer(ByteBuffer in) {
ByteBuffer result = in;
if (in.remaining() < READ_BLOCK) {
result = ByteBuffer.allocate(in.capacity() * 2);
in.flip();
result.put(in);
}
return result;
}
}
So call:
PDFFileUtility.getPDFFile(myCipherInputStream);
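For completeness, here is a minimal sketch of producing the decrypted stream and buffering it entirely in memory. The cipher transformation (AES/CBC/PKCS5Padding), the key and the IV are assumptions, since the question does not say how the file was encrypted, and InMemoryDecrypt with its two methods is a hypothetical helper. It collects the bytes in a ByteArrayOutputStream rather than a hand-grown ByteBuffer, so the resulting buffer's limit matches the data exactly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; the algorithm, key and IV handling are assumptions.
final class InMemoryDecrypt {

    // Wraps the encrypted stream so that reads return decrypted bytes.
    static InputStream decrypting(InputStream encrypted, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return new CipherInputStream(encrypted, cipher);
    }

    // Reads the whole stream into memory; nothing is written to disk.
    static ByteBuffer readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] block = new byte[8192];
        int n;
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }
}
```

The resulting buffer could then be handed to the same constructor used above, for example new PDFFile(InMemoryDecrypt.readFully(InMemoryDecrypt.decrypting(fileStream, key, iv))), where fileStream, key and iv are placeholders for your own values.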

Converting ANSI to UTF-8 & java.lang.OutOfMemoryError: Java heap space

My final goal is to convert a file from ANSI to UTF-8. To do so, I use the following Java code:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class ConvertFromAnsiToUtf8 {
public static void main(String[] args) throws IOException {
try {
Path p = Paths.get("C:\\shared_to_vm\\test_encode\\test.csv");
ByteBuffer bb = ByteBuffer.wrap(Files.readAllBytes(p));
CharBuffer cb = Charset.forName("windows-1252").decode(bb);
bb = Charset.forName("UTF-8").encode(cb);
Files.write(p, bb.array());
} catch (Exception e) {
System.out.println(e);
}
}
}
The code works perfectly when I test it on small files. My file is converted from ANSI to UTF-8 and all characters are recognized and correctly encoded. But as soon as I try to use it on the file I need to convert, I get the error java.lang.OutOfMemoryError: Java heap space.
As far as I understand, my file has about 1.5 million lines, so I am pretty sure my application creates too many objects.
Of course, I have checked what this error means and how I could solve it (like here or here for example), but is increasing the memory available to my JVM the only way to solve it? And if it is, how much more should I use?
Any kind of help (tip, advice, link or else) would be greatly appreciated!
Don't read the whole file at once:
ByteBuffer bb = ByteBuffer.wrap(Files.readAllBytes(p));
Instead, try to read line-by-line:
Files.lines(p, Charset.forName("windows-1252")).forEach(line -> {
// Convert your line, write to file
});
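The placeholder comment in the loop above could be filled in roughly as follows. This is only a sketch: LineTranscoder and convert are hypothetical names, and it assumes the output goes to a different file than the input, since the input is still being read while the output is written. Note that Files.lines strips line terminators, so the output uses the platform line separator.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper: converts a windows-1252 text file to UTF-8 line by line.
final class LineTranscoder {

    static void convert(Path input, Path output) throws IOException {
        // Both the line stream and the writer are closed by try-with-resources.
        try (Stream<String> lines = Files.lines(input, Charset.forName("windows-1252"));
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (String line : (Iterable<String>) lines::iterator) {
                writer.write(line);
                writer.newLine(); // line terminators are not preserved by Files.lines
            }
        }
    }
}
```

Only one line is held in memory at a time, so heap use no longer grows with the size of the file.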
Stream the input, convert the character encoding, and write the output as you go. This way, you don't need to read the entire file into memory, but only as much as you want.
If you want to minimize the number of (slowish) system calls, you could use a similar approach, but explicitly create a BufferedInputStream with a larger internal buffer, and then wrap that in an InputStreamReader. But the simple approach shown here is unlikely to be a critical point in many applications.
private static final Charset WINDOWS1252 = Charset.forName("windows-1252");
private static final int DEFAULT_BUF_SIZE = 8192;
public static void transcode(Path input, Path output) throws IOException {
try (Reader r = Files.newBufferedReader(input, WINDOWS1252);
Writer w = Files.newBufferedWriter(output, StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW)) {
char[] buf = new char[DEFAULT_BUF_SIZE];
while (true) {
int n = r.read(buf);
if (n < 0) break;
w.write(buf, 0, n);
}
}
}
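The larger-buffer variant mentioned above might look like the sketch below. The class name BufferedTranscoder and the 1 MiB buffer size are assumptions chosen for illustration, not measured values. Unlike the CREATE_NEW option used above, Files.newOutputStream with no options creates the file or truncates an existing one.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: same streaming idea, but with explicitly sized I/O buffers.
final class BufferedTranscoder {
    private static final Charset WINDOWS1252 = Charset.forName("windows-1252");
    private static final int IO_BUF_SIZE = 1 << 20; // assumed 1 MiB buffer, tune as needed

    static void transcode(Path input, Path output) throws IOException {
        try (Reader r = new InputStreamReader(
                     new BufferedInputStream(Files.newInputStream(input), IO_BUF_SIZE), WINDOWS1252);
             Writer w = new OutputStreamWriter(
                     new BufferedOutputStream(Files.newOutputStream(output), IO_BUF_SIZE),
                     StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            // Copy decoded characters; the writer re-encodes them as UTF-8.
            while ((n = r.read(buf)) >= 0) {
                w.write(buf, 0, n);
            }
        }
    }
}
```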
If you have a file that is larger than the available RAM, you should convert the characters chunk by chunk.
Here is an example:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
public class Iconv {
private static void iconv(Charset toCode, Charset fromCode, Path src, Path dst) throws IOException {
CharsetDecoder decoder = fromCode.newDecoder();
CharsetEncoder encoder = toCode.newEncoder();
try (ReadableByteChannel source = FileChannel.open(src, StandardOpenOption.READ);
WritableByteChannel destination = FileChannel.open(dst, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
StandardOpenOption.WRITE);) {
ByteBuffer readBytes = ByteBuffer.allocate(4096);
while (source.read(readBytes) > 0) {
readBytes.flip();
destination.write(encoder.encode(decoder.decode(readBytes)));
readBytes.clear();
}
}
}
public static void main(String[] args) throws Exception {
iconv(Charset.forName("UTF-8"), Charset.forName("Windows-1252"), Paths.get("test.csv") , Paths.get("test-utf8.csv") );
}
}
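One caveat with the chunked example above: decoder.decode(ByteBuffer) treats every chunk as a complete input. That is fine for a single-byte source charset such as Windows-1252, but with a multi-byte source charset (UTF-8, for instance) a character split across two chunks would raise a MalformedInputException. The sketch below is one possible way to carry leftover bytes and characters over to the next chunk using the incremental decode and encode methods; StreamingIconv and its helper names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: chunked transcoding that tolerates characters split across chunks.
final class StreamingIconv {

    static void transcode(Charset from, Charset to, Path src, Path dst) throws IOException {
        CharsetDecoder decoder = from.newDecoder();
        CharsetEncoder encoder = to.newEncoder();
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            ByteBuffer inBytes = ByteBuffer.allocate(4096);
            CharBuffer chars = CharBuffer.allocate(4096);
            ByteBuffer outBytes = ByteBuffer.allocate(4096);
            boolean eof = false;
            while (!eof) {
                eof = in.read(inBytes) < 0;
                inBytes.flip();
                CoderResult result;
                do {
                    // With endOfInput == false, an incomplete trailing sequence is left unread.
                    result = decoder.decode(inBytes, chars, eof);
                    if (result.isError()) {
                        result.throwException();
                    }
                    drain(encoder, chars, outBytes, out, false);
                } while (result.isOverflow());
                inBytes.compact(); // keep unread bytes for the next chunk
            }
            decoder.flush(chars);
            drain(encoder, chars, outBytes, out, true);
        }
    }

    // Encodes the buffered characters and writes the resulting bytes to the channel.
    private static void drain(CharsetEncoder encoder, CharBuffer chars, ByteBuffer outBytes,
                              FileChannel out, boolean endOfInput) throws IOException {
        chars.flip();
        CoderResult result;
        do {
            result = encoder.encode(chars, outBytes, endOfInput);
            if (result.isError()) {
                result.throwException();
            }
            if (endOfInput && result.isUnderflow()) {
                result = encoder.flush(outBytes);
            }
            outBytes.flip();
            while (outBytes.hasRemaining()) {
                out.write(outBytes);
            }
            outBytes.clear();
        } while (result.isOverflow());
        chars.compact(); // keep an unpaired surrogate, if any, for the next call
    }
}
```

For the conversion in the question the call would be StreamingIconv.transcode(Charset.forName("windows-1252"), StandardCharsets.UTF_8, src, dst), with src and dst being two different paths.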

Redis/java - writing and reading binary data

I'm trying to write and read a gzip to/from Redis. The problem is that I tried saving the read bytes to a file and opening it with gzip - it's invalid. The strings are also different when looking at them in the Eclipse console.
Here's my code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import redis.clients.jedis.Jedis;
public class TestRedis
{
public static void main(String[] args) throws IOException
{
String fileName = "D:/temp/test_write.gz";
String jsonKey = fileName;
Jedis jedis = new Jedis("127.0.0.1");
byte[] jsonContent = ReadFile(new File(fileName).getPath());
// test-write data we're storing in redis
FileOutputStream fostream = new FileOutputStream("D:/temp/test_write_before_redis.gz"); // looks ok
fostream.write(jsonContent);
fostream.close();
jedis.set(jsonKey.getBytes(), jsonContent);
System.out.println("writing, key: " + jsonKey + ",\nvalue: " + new String(jsonContent)); // looks ok
byte[] readJsonContent = jedis.get(jsonKey).getBytes();
String readJsonContentString = new String(readJsonContent);
FileOutputStream fos = new FileOutputStream("D:/temp/test_read.gz"); // invalid gz file :(
fos.write(readJsonContent);
fos.close();
System.out.println("\n\nread json content from redis: " + readJsonContentString);
}
private static byte[] ReadFile(String aFilePath) throws IOException
{
Path path = Paths.get(aFilePath);
return Files.readAllBytes(path);
}
}
You are reading with Jedis.get(String), which performs a UTF-8 conversion internally, but writing with Jedis.set(byte[], byte[]), which does not. That mismatch is the likely cause. If so, you can use Jedis.get(byte[]) to read from Redis and skip the UTF-8 conversion, e.g.
byte[] readJsonContent = jedis.get(jsonKey.getBytes());
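The effect of such a conversion can be reproduced without Redis. The sketch below (BinaryRoundTrip is a hypothetical class; the four bytes are the usual start of a gzip stream) pushes binary data through a UTF-8 String and back, which is what happens when binary data is read as text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: binary data does not survive a byte[] -> String -> byte[] round trip.
final class BinaryRoundTrip {

    static byte[] throughString(byte[] data) {
        // Bytes that are not valid UTF-8 are replaced by U+FFFD while decoding.
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] gzipHeader = {(byte) 0x1f, (byte) 0x8b, 0x08, 0x00};
        byte[] mangled = throughString(gzipHeader);
        System.out.println(Arrays.equals(gzipHeader, mangled)); // prints false
    }
}
```

The byte 0x8b is not valid UTF-8 on its own, so it is replaced during decoding and the original value cannot be recovered.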

How to return binary data from AWS Lambda written in Java

Given that it is now possible to handle binary data in Amazon Api Gateway and Amazon Lambda, I wanted to try to make an Amazon Lambda endpoint which returned an Excel spreadsheet. It is entirely possible to do so using node/js, as demonstrated here. Unfortunately, any time I try to do this using Java, it falls to pieces.
My initial attempt was to create a simple workbook using apache XSSFWorkbook, write it to the output stream provided by RequestStreamHandler, and done.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
public class FileRequestHandler implements RequestStreamHandler {
public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context)
throws IOException {
Workbook wb = new XSSFWorkbook();
String sheetName = "Problem sheet";
wb.createSheet(sheetName);
wb.write(outputStream);
}
}
When tested locally, the output stream can be piped to a file resulting in a valid output excel file.
import com.amazonaws.util.StringInputStream;
import org.junit.Test;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
public class FileRequestHandlerTest {
@Test
public void shouldCreateExcelFile() throws IOException {
FileRequestHandler fileRequestHandler = new FileRequestHandler();
InputStream inputStream = new StringInputStream("hello world");
String fileName = "FileRequestLambda";
String path = fileName + ".xlsx";
FileOutputStream fileOutputStream = new FileOutputStream(path);
fileRequestHandler.handleRequest(inputStream, fileOutputStream, TestUtils.createContext());
fileOutputStream.close();
}
}
But when I run it in Amazon Lambda, I get malformed binary output:
PK ... _rels/.rels ... [Content_Types].xml ... docProps/app.xml ... docProps/core.xml ... xl/sharedStrings.xml ... xl/styles.xml ... xl/workbook.xml ... xl/_rels/workbook.xml.rels ... xl/worksheets/sheet1.xml ... PK
(malformed binary output abridged; only the ZIP signature and the entry names are legible)
The output is about 5KB in size, while the output on my local computer is about 3KB in size. This appears to be a problem with binary output in general for Java on Amazon Lambda. When I run code that writes an image to the output stream, it also works locally, but on Amazon Lambda the resulting image is garbled and about twice the size.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
import java.io.*;
import java.net.URL;
public class ImageRequestHandler implements RequestStreamHandler {
public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context)
throws IOException {
String address = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/AmazonWebservices_Logo.svg/580px-AmazonWebservices_Logo.svg.png";
URL url = new URL(address);
InputStream in = new BufferedInputStream(url.openStream());
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
int n;
while (-1!=(n=in.read(buf)))
{
out.write(buf, 0, n);
}
out.close();
in.close();
byte[] response = out.toByteArray();
outputStream.write(response);
}
}
The types of the input and output streams are:
lambdainternal.util.NativeMemoryAsInputStream
lambdainternal.util.LambdaByteArrayOutputStream
Help?
I had the same problem when returning a JPG image from Amazon Lambda, and I found a workaround.
You need to wrap the output stream in a Base64 encoder:
OutputStream encodedStream = Base64.getEncoder().wrap(outputStream);
encodedStream.write(response);
encodedStream.close();
Then you need to update Method Response and Integration Response of your function as described here: AWS Gateway API base64Decode produces garbled binary?
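As a self-contained illustration of the encoding step, the sketch below wraps an arbitrary OutputStream with java.util.Base64 (Java 8 or later). Base64Response and writeBase64 are hypothetical names; in a handler, target would be the outputStream parameter. Closing the wrapper flushes the final padding and also closes the underlying stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;

// Hypothetical helper: writes a binary payload to the target stream as Base64 text.
final class Base64Response {

    static void writeBase64(byte[] payload, OutputStream target) throws IOException {
        // try-with-resources closes the wrapper, which emits any trailing padding.
        try (OutputStream encoded = Base64.getEncoder().wrap(target)) {
            encoded.write(payload);
        }
    }
}
```

Because the bytes on the wire are now plain ASCII, they are not altered by any text conversion on the way out, and the receiving side can decode them back to the original binary.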

Convert DOCX to HTML including IMAGES

I am using docx4j to convert DOCX to HTML. The conversion works and I get the HTML output, which I will embed as the body of an email. However, I have the issues listed below:
Unable to display images in email body
Losing the spaces and bullets
Please find the code which I have written,
WordprocessingMLPackage wordMLPackage;
wordMLPackage = Docx4J.load(new java.io.File(resourcePath2));
HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
htmlSettings.setImageDirPath(imageFolder + resourcePath2 + "_files");
htmlSettings.setImageTargetUri(imageFolder +resourcePath2.substring(resourcePath2.lastIndexOf("/")+1) + "_files");
htmlSettings.setWmlPackage(wordMLPackage);
OutputStream os;
os = new ByteArrayOutputStream();
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_SAVE_FLAT_XML);
DOCX = ((ByteArrayOutputStream)os).toString();
You could do it like this in your code:
package tcg.doc.web.managedBeans;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
@Component
@Scope("session")
@Qualifier("ConvertWord")
public class ConvertWord {
private static final String docName = "TestDocx.docx";
private static final String outputlFolderPath = "d:/";
String htmlNamePath = "docHtml.html";
String zipName="_tmp.zip";
File docFile = new File(outputlFolderPath+docName);
File zipFile = new File(zipName);
public void ConvertWordToHtml() {
try {
// 1) Load DOCX into XWPFDocument
InputStream doc = new FileInputStream(new File(outputlFolderPath+docName));
System.out.println("InputStream"+doc);
XWPFDocument document = new XWPFDocument(doc);
// 2) Prepare XHTML options (here we set the IURIResolver to load images from a "word/media" folder)
XHTMLOptions options = XHTMLOptions.create(); //.URIResolver(new FileURIResolver(new File("word/media")));;
// Extract image
String root = "target";
File imageFolder = new File( root + "/images/" + docName ); // use the document name, not the InputStream's toString()
options.setExtractor( new FileImageExtractor( imageFolder ) );
// URI resolver
options.URIResolver( new FileURIResolver( imageFolder ) );
OutputStream out = new FileOutputStream(new File(htmlPath()));
XHTMLConverter.getInstance().convert(document, out, options);
System.out.println("OutputStream "+out.toString());
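} catch (FileNotFoundException ex) {
ex.printStackTrace(); // report the failure instead of silently swallowing it
} catch (IOException ex) {
ex.printStackTrace();
}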
}
public static void main(String[] args) {
ConvertWord cwoWord=new ConvertWord();
cwoWord.ConvertWordToHtml();
System.out.println();
}
public String htmlPath(){
// d:/docHtml.html
return outputlFolderPath+htmlNamePath;
}
public String zipPath(){
// d:/_tmp.zip
return outputlFolderPath+zipName;
}
}
The Maven dependency for pom.xml:
<dependency>
<groupId>fr.opensagres.xdocreport</groupId>
<artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
<version>1.0.4</version>
</dependency>
or download it from Here
For images to work in an email body, I guess you need to use either a data URI or publish them to a web-reachable location.
In either case, you'll need to write an implementation of:
public interface ConversionImageHandler {
/**
* @param picture
* @param relationship of the image
* @param part of the image, if it is an internal image, otherwise null
* @return uri for the image we've saved, or null
* @throws Docx4JException this exception will be logged, but not propagated
*/
public String handleImage(AbstractWordXmlPicture picture, Relationship relationship, BinaryPart part) throws Docx4JException;
}
and configure docx4j to use it with htmlSettings.setImageHandler.
You can look at some of the existing implementations in the docx4j source code, and take advantage of the helper methods in AbstractConversionImageHandler (eg createEncodedImage if you want data URIs).
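If the data URI route is chosen, the string that an image handler returns as the image's uri could be built roughly like this. This is a plain-Java sketch that does not call docx4j; DataUri and its of method are hypothetical names, and the MIME type is assumed to be known by the caller.

```java
import java.util.Base64;

// Hypothetical helper: builds a data URI that embeds the image bytes directly in the HTML.
final class DataUri {

    static String of(String mimeType, byte[] imageBytes) {
        // Format: data:<mime type>;base64,<Base64-encoded bytes>
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(imageBytes);
    }
}
```

For example, DataUri.of("image/png", pngBytes) yields a string that can be used as the src attribute of an img element, so the HTML no longer refers to files on disk.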

Need to encrypt and decrypt in Java, unable to use Base64... I think

I need to encrypt and decrypt a four-digit PIN stored in a database and am having trouble. I have tried using examples that use Base64, but even after importing the package it can't find the class. What am I doing wrong? I understand that the class below may be correct, but why can't it find the class and create an object? In Eclipse, when I navigate to the Base64 class in reference libraries it says "source not found".
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Random;
import org.apache.commons.codec.binary.Base64;
public class PasswordEncryption {
private static Random random = new Random((new Date()).getTime());
public static String encrypt(String userId) {
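Base64 encoder = new Base64();
byte[] salt = new byte[8];
random.nextBytes(salt);
// encodeToString returns a String; encode(byte[]) returns a byte[], which cannot be concatenated as text
return encoder.encodeToString(salt)+
encoder.encodeToString(userId.getBytes());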
}
public static String decrypt(String encryptKey) {
if (encryptKey.length() > 12) {
String cipher = encryptKey.substring(12);
// Commons Codec's Base64 replaces the unimported BASE64Decoder and throws no IOException
Base64 decoder = new Base64();
return new String(decoder.decode(cipher));
}
return null;
}
}
And apologies if I have not used these forums correctly, this is my first post.
Thanks
I think you need to download Apache Commons Codec. After you've downloaded the jar, you need to add it to your Eclipse project as a library in your build path. (I apologise if you've already done this. It isn't clear from your post.)
Once you've done that, you still won't be able to see the source in Eclipse, but your project should work when you run it.
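If adding a jar is not an option, a similar round trip can be sketched with java.util.Base64, which ships with Java 8 and later. This swaps in a different class from the Commons Codec one used in the question; SaltedBase64 and its method names are hypothetical. Keep in mind that Base64 is a reversible encoding, so the salt prefix only obscures the value and does not encrypt it.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper using the JDK's own Base64 instead of Commons Codec.
final class SaltedBase64 {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int SALT_CHARS = 12; // 8 salt bytes encode to 12 Base64 characters

    static String encode(String value) {
        byte[] salt = new byte[8];
        RANDOM.nextBytes(salt);
        Base64.Encoder encoder = Base64.getEncoder();
        return encoder.encodeToString(salt)
                + encoder.encodeToString(value.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String encoded) {
        if (encoded.length() <= SALT_CHARS) {
            return null; // too short to contain a salt prefix and a value
        }
        byte[] raw = Base64.getDecoder().decode(encoded.substring(SALT_CHARS));
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```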
