How to convert a BLOB to pdf or csv using itext

My goal is to display the appropriate file when the user clicks on a PDF or XLS link.
The contents of a PDF or XLS file are stored as a BLOB in a table. A stored procedure takes the file id as an input parameter and returns the BLOB as output.
I want to display the file but am not sure how to go about it. From some reading, it looks like I could use iText.
Is there a way to convert the BLOB to a PDF (or XLS) using iText? Is this possible?
I was unable to find any examples that use a BLOB datatype.

(I can't comment on David's solution due to low reputation.)
If the content of the BLOB record is PDF binary data, you don't need to do anything with iText.
If you are not saving the BLOB to disk first and (according to your description) just want to display it, you can set the content type on the HTTP response to tell the browser how to handle it:
response.setContentType("application/pdf"); // for PDF
response.setContentType("application/vnd.ms-excel"); // For BIFF .xls files
response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"); // For Excel2007 and above .xlsx files

The IOUtils class in the Apache Commons IO library has a method copy which will copy all the bytes from an InputStream to an OutputStream. See the Javadoc.
So once you've got your blob and your HTTP response, you can just write
OutputStream httpOutputStream = httpResponse.getOutputStream();
InputStream blobInputStream = theBlob.getBinaryStream();
IOUtils.copy(blobInputStream, httpOutputStream);
blobInputStream.close();
httpOutputStream.close();
to copy the data. Or you might want to put the two close() calls in a finally block.
If you're not already using Apache Commons, don't forget to download the jar and add it to your classpath.
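If adding Commons IO is not an option, the same copy can be sketched with the JDK alone. This is a minimal sketch, assuming Java 9 or later for InputStream.transferTo; the class and method names are made up for illustration, and in a servlet the two arguments would be theBlob.getBinaryStream() and httpResponse.getOutputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class BlobStreamCopy {

    // Copies every byte from in to out and closes both streams,
    // even when the copy itself throws.
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        try (InputStream source = in; OutputStream target = out) {
            return source.transferTo(target); // available since Java 9
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the blob stream and the HTTP response stream.
        byte[] fakePdf = "%PDF-1.4 demo".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream fakeResponse = new ByteArrayOutputStream();

        long copied = copyAndClose(new ByteArrayInputStream(fakePdf), fakeResponse);
        System.out.println(copied + " bytes copied");
    }
}
```

The try-with-resources statement plays the role of the finally block mentioned above.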

Related

Read and Append data to the File from a Blob URL path before download

This is my first hands-on project with Java Spring Boot, as I have mostly used C#. I have a requirement to read a file from a blob URL path and append some string data (like a key) to the same file in the stream before my API downloads the file.
Here are the ways that I have tried to do it:
FileOutputStream/InputStream: This throws a FileNotFoundException because it cannot resolve the blob path.
URLConnection: This got me somewhere and I was able to download the file successfully, but when I tried to write/append some value to the file before downloading it, I failed.
Here is the code I have so far:
//EXTERNAL_FILE_PATH is the azure storage path ending with for e.g. *.txt
URL urlPath = new URL(EXTERNAL_FILE_PATH);
URLConnection connection = urlPath.openConnection();
connection.setDoOutput(true); //I am doing this as I need to append some data and the docs mention to set this flag to true.
OutputStreamWriter out = new OutputStreamWriter(connection.getOutputStream());
out.write("I have added this");
out.close();
// This is where the issue occurs: the error says data cannot be read because output is set to true, so only writes are allowed and no read operation is permitted. I get a 405 Method Not Allowed.
inputStream = connection.getInputStream();
I am not sure whether the framework allows me to modify a file at the URL path, read it at the same time, and download it.
Please help me understand whether there is a better way to do this.

From a logical point of view, you are not appending data to the file at the URL. You need to create a new file, write some data to it, and then append the content of the file from the URL. The algorithm could look like this:
Create a new file on disk, for example in a temp folder.
Write some data to the file.
Download the file from the URL and append it to the file on disk.
Some good articles from which you can start:
Download a File From an URL in Java
How to download and save a file from Internet using Java?
How to append text to an existing file in Java
How to write data with FileOutputStream without losing old data?
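The three steps above can be sketched with the JDK alone. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, Java 9 or later is assumed for transferTo, and the remote stream would in practice come from something like new URL(EXTERNAL_FILE_PATH).openStream() (a read-only GET, so setDoOutput is not needed).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the algorithm from the answer.
class PrefixedDownload {

    // Step 1: create a temp file. Step 2: write the extra data.
    // Step 3: stream the remote content in after it.
    static Path writeWithHeader(String header, InputStream remote) throws IOException {
        Path tmp = Files.createTempFile("download-", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp); InputStream in = remote) {
            out.write(header.getBytes(StandardCharsets.UTF_8));
            in.transferTo(out); // copies without holding the whole file in memory
        }
        return tmp;
    }
}
```

Swapping the two writes inside the try block puts the extra data at the end of the file instead of the beginning, which is closer to what the question calls appending.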

Output Streaming multiple files into a zip file

Hi, I'm generating a report in CSV format using Solr, AngularJS, JAX-RS and Java. Each input stream already contains a CSV response because we specified wt=csv when querying Solr. The CSV produced from each input stream might be around 300 MB. At the Java layer the code looks something like this:
InputStream is1;
InputStream is2;
// is1 carries the file csv1
// is2 carries the file csv2
// csvzip is the file created from the two files
// csvzip now needs to be downloaded through a popup
Creating such large files and the zip file in memory is surely not a good approach.
Is there any way to handle this?
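One possible direction, offered only as a sketch: java.util.zip.ZipOutputStream can wrap the response stream directly, so each CSV is compressed as it is read and neither the CSVs nor the zip ever sit fully in memory. The class and method names below are hypothetical, Java 9 or later is assumed for transferTo, and in JAX-RS the target stream would presumably be the one handed to a StreamingOutput.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; not taken from the question.
class CsvZipStreamer {

    // Writes one zip entry per (name, stream) pair straight into target.
    // Pass a LinkedHashMap if the entry order matters.
    static void zipTo(OutputStream target, Map<String, InputStream> sources) throws IOException {
        ZipOutputStream zip = new ZipOutputStream(target);
        for (Map.Entry<String, InputStream> source : sources.entrySet()) {
            zip.putNextEntry(new ZipEntry(source.getKey()));
            try (InputStream in = source.getValue()) {
                in.transferTo(zip); // streams through a small internal buffer
            }
            zip.closeEntry();
        }
        zip.finish(); // writes the zip directory but leaves target open
    }
}
```

Because the total size is not known up front, a response produced this way would typically be sent without a Content-Length header.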

Reading Excel Data Issue From DB (CLOB Column) in Java with POI

I have a question that looked hard at first glance but may have an easy solution that I haven't figured out yet. I need to read the binary data of an Excel file that is stored in an Oracle database CLOB column.
Reading the CLOB as a String in Java works fine. I get the Excel file's binary content in a String variable.
String respXLS = othRaw.getOperationData(); // here I get excel file
InputStream bais = new ByteArrayInputStream(respXLS.getBytes());
POIFSFileSystem fs = new POIFSFileSystem(bais);
HSSFWorkbook wb = new HSSFWorkbook(fs);
When I wrap the bytes in a stream and pass it to POIFSFileSystem, I get this exception:
java.io.IOException: Invalid header signature; read 0x00003F1A3F113F3F, expected 0xE11AB1A1E011CFD0
I googled similar Excel problems, and they mention read access. So I downloaded the same Excel file to disk, changed nothing (I did not even open it), and used a FileInputStream with the file path. That worked flawlessly. So what is the reason?
Any advice or an alternative way to read from the CLOB would be appreciated.
Thanks in advance,
My Regards.

CLOB means Character Large OBject; You want to use a BLOB - Binary Large OBject. Change your database schema.
What happens is that a CLOB will use a Character Set to convert your String to/from the database internal format, whatever that is; this will cause file corruption on non-text contents.
Repeat after me: a String is not a byte[], and a character is not a byte.
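The corruption is easy to reproduce without a database. The sketch below pushes the 8-byte signature that starts an .xls file (the value the exception says it expected) through a String and back. The class and method names are made up for illustration, and US-ASCII is just one charset chosen to make the effect obvious; the charset the database actually uses may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: what a bytes -> String -> bytes round trip does to binary data.
class StringIsNotBytes {

    // The signature bytes behind "expected 0xE11AB1A1E011CFD0" (stored little-endian).
    static final byte[] XLS_HEADER = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Decodes the bytes as text and encodes them again, as a character column would.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] after = roundTrip(XLS_HEADER, StandardCharsets.US_ASCII);
        // Bytes that are not valid characters get replaced, so the arrays differ.
        System.out.println("survived round trip: " + Arrays.equals(XLS_HEADER, after));
    }
}
```

With a BLOB column, the bytes come back through Blob.getBinaryStream() or getBytes and never pass through a character decoder at all.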

Apache POI HSSF XLS reading error

Using the following code to read a .xls file, where s is the file path:
InputStream input = new FileInputStream(s);
Workbook wbs = new HSSFWorkbook(input);
I get the following error message:
Exception in thread "main" java.io.IOException: Invalid header signature; read 0x0010000000060809, expected 0xE11AB1A1E011CFD0
I need a program that is able to read in either XLSX or XLS, and using the exact same code just adjusted for XSSF it has no problem at all reading in the XLSX file.

The Exception you're getting is one telling you that the file you're supplying isn't a valid Excel binary file, at least not a valid Excel file produced since about 1990. The exception you're getting tells you what POI expects, and that it found something else instead which wasn't a valid .xls file, and wasn't anything else POI can detect.
One thing to be aware of is that Excel opens a wide variety of different file formats, including .csv and .html. It's also not very picky about the file extension, so will happily open a CSV file that has been renamed to a .xls one. However, since renaming a .csv to a .xls doesn't magically change the format, POI still can't open it!
From the exception, I can tell what's happening, and I can also tell you're using an ancient version of Apache POI! A header signature of 0x0010000000060809 corresponds to the Excel 4 file format, from about 25 years ago! If you use a more recent version of Apache POI, it'll give you a helpful error message telling you that the file supplied is an old and largely unsupported Excel file. New versions of POI do include the OldExcelExtractor tool which can pull out some information from those ancient formats.
Otherwise, as with all exceptions of this type, try opening the file in Excel and doing a save-as. That will give you an idea of what the file currently is (eg .html saved as .xls, .csv saved as .xls etc), and will also let you re-save it as a proper .xls file for POI to load and work with.
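To see what a file really is before handing it to POI, the first 8 bytes can be inspected with the JDK alone. The sketch below reads them as a little-endian long, which appears to be how the values in the exception message are formed (0xE11AB1A1E011CFD0 is the byte sequence D0 CF 11 E0 A1 B1 1A E1). The class name, method names and returned labels are hypothetical, and only three well-known signatures are recognised.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical diagnostic helper; not part of Apache POI.
class HeaderSniffer {

    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L; // binary .xls container

    // Reads the first 8 bytes of the stream as a little-endian long.
    static long readSignature(InputStream in) throws IOException {
        byte[] head = new byte[8];
        new DataInputStream(in).readFully(head); // throws EOFException if the file is shorter
        return ByteBuffer.wrap(head).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Maps a signature to a rough guess at the real file type.
    static String describe(long signature) {
        long firstFour = signature & 0xFFFFFFFFL;
        if (signature == OLE2_SIGNATURE) return "OLE2 (.xls)";
        if (firstFour == 0x04034B50L) return "ZIP (possibly .xlsx)"; // bytes "PK" 03 04
        if (firstFour == 0x46445025L) return "PDF";                  // bytes "%PDF"
        return "unknown";
    }
}
```

Calling describe(readSignature(new FileInputStream(s))) before constructing the workbook gives a quick hint about which loader, if any, is appropriate.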

If the file is in xlsx format instead of xls you might get this error. I would try using the generic Workbook object (Also called the SS Usermodel)
Check out the Workbook interface and the WorkbookFactory object. The factory should be able to create a generic Workbook for you out of either xlsx or xls.
I thought I had a good tutorial on this, but I can't seem to find it. I'll keep looking though.
Edit
I found this little tiny snippet from Apache's site about reading and rewriting using the SS Usermodel.
I hope this helps!

Invalid header signature; read 0x342E312D46445025, expected 0xE11AB1A1E011CFD0
I got this error when I uploaded a corrupted xls/xlsx file (to produce one, I renamed sample.pdf to sample.xls). Add validation like:
Workbook wbs = null;
// try-with-resources closes the stream even if parsing fails
try (InputStream input = new FileInputStream(s)) {
    wbs = new HSSFWorkbook(input);
} catch (IOException e) {
    // log "file is corrupted", show error message to user
}

Jsp download file size

We are running Tomcat and generating PDF files on the fly. I do not have the file size beforehand, so I cannot directly link to a file on the server. Instead, I send the output directly.
response.setContentType("application/force-download");
OutputStream o = response.getOutputStream();
And then I directly output to this OutputStream.
The only problem is that the receiver does not get the filesize, so they do not know how long the download will take. Is there a way to tell the response how large the file is?
EDIT
I do know the file size; I just can't tell the stream how big the file is.

The response object should have a setContentLength method:
// Assumes response is a ServletResponse
response.setContentLength(sizeHere);

Serialize the PDF byte stream to a file or a byte array, calculate its size, set the size, and then write it to the output stream.
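The byte-array variant can be sketched as follows. This is only a sketch: the Generator interface and the class name are hypothetical, and the IntConsumer stands in for response::setContentLength so that the example runs without a servlet container. It also assumes the generated PDF is small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.IntConsumer;

// Hypothetical helper: buffer first, announce the size, then send.
class BufferedDownload {

    // Whatever produces the PDF bytes (an assumption, not a real API).
    interface Generator {
        void writeTo(OutputStream out) throws IOException;
    }

    static void send(Generator generator, IntConsumer setContentLength,
                     OutputStream responseOut) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        generator.writeTo(buffer);              // render the whole document in memory
        setContentLength.accept(buffer.size()); // must happen before any body bytes are sent
        buffer.writeTo(responseOut);            // now stream the buffered bytes
        responseOut.flush();
    }
}
```

In a servlet this might be called as BufferedDownload.send(pdfGenerator, response::setContentLength, response.getOutputStream()), where pdfGenerator is whatever currently writes to the OutputStream.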

I believe you're answering the question yourself:
quote:
I do not have the file size before hand, so I directly send the output.
If you don't have the size, you can't send it.

Why not generate the PDF into a temporary file, a RAM-based file system, or a memory-mapped file on the fly? Then you can get the file size.
response.setContentType("application/force-download");
response.setContentLength(sizeHere);
OutputStream o = response.getOutputStream();
