Writing spreadsheet to a blob column and retrieving it back using POI - java

We have a requirement to store an uploaded spreadsheet in an Oracle database BLOB column.
The user uploads the spreadsheet through the ADF UI; the same spreadsheet is persisted to the DB and retrieved later.
We are using POI to process the spreadsheet.
The uploaded file is converted to a byte[] and sent to the app server, which persists it to the BLOB column.
But when I try to retrieve it later, I see the message "Excel found unreadable content in *****.xlsx. Do you want to recover the contents of this workbook?".
I could resolve this issue by converting the byte[] to an XSSFWorkbook, converting that back to a byte[], and persisting the result.
But according to my requirements I may get very large spreadsheets, and initializing an XSSFWorkbook might result in out-of-memory errors.
The code to get the byte[] from the uploaded spreadsheet is as below:
if (uploadedFile != null) {
    InputStream inputStream;
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    inputStream = uploadedFile.getInputStream();
    int c = 0;
    while (c != -1) {
        c = inputStream.read();
        byteArrayOutputStream.write((char) c);
    }
    bytes = byteArrayOutputStream.toByteArray();
}
The same byte[] is then persisted into a BLOB column as follows:
1. Assign the byte[] to the BlobColumn.
2. Set the BlobColumn on the SQL UPDATE statement.
3. Execute the statement.
Once the above is done, the spreadsheet is retrieved as follows:
1. Read the BlobColumn.
2. Get the byte[] from the BlobColumn.
3. Set the content type of the response to the spreadsheet type.
4. Send the byte[].
But when I open the downloaded spreadsheet, I get the corrupted-spreadsheet error.
If I introduce an additional step, as below, after receiving the byte[] from the UI, the issue is solved:
InputStream is = new ByteArrayInputStream(uploadedSpreadSheetBytes);
XSSFWorkbook uploadedWorkbook = new XSSFWorkbook(is);
and then derive the byte[] again from the XSSFWorkbook as below:
byteArrayOutputStream = new ByteArrayOutputStream();
uploadedWorkbook.write(byteArrayOutputStream);
byte[] spreadSheetBytes = byteArrayOutputStream.toByteArray();
I feel that converting the byte[] to an XSSFWorkbook and then converting the XSSFWorkbook back to a byte[] is redundant.
Any help would be appreciated.
Thanks.

The memory issues can be avoided by using event-based (SAX) parsing instead of initializing the entire XSSFWorkbook. That way only part of the file is processed at a time, which consumes less memory. See the POI spreadsheet how-to page for more information on SAX parsing.
You could also increase the JVM heap size, but of course that is no guarantee.
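Separately from memory use, the read loop in the question writes the value returned by read() before checking it, so the -1 end-of-stream marker is written as well and the byte[] ends up one byte longer than the uploaded file. That extra trailing byte is a plausible cause of the "unreadable content" message, although this has not been confirmed against the original setup. The sketch below compares the question's loop with a loop that checks the return value first; the class and method names (UploadBytes, readLikeQuestion, readFully) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of ADF or POI.
final class UploadBytes {

    // Mirrors the loop from the question: the result of read() is written
    // before it is checked, so the -1 end-of-stream marker is written too
    // and shows up as one extra trailing byte (0xFF).
    static byte[] readLikeQuestion(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int c = 0;
        while (c != -1) {
            c = in.read();
            out.write((char) c);
        }
        return out.toByteArray();
    }

    // Checks the return value of read() before writing, and copies in
    // chunks instead of one byte at a time.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {0x50, 0x4B, 0x03, 0x04}; // stand-in for the uploaded file
        System.out.println(readLikeQuestion(new ByteArrayInputStream(original)).length);
        System.out.println(readFully(new ByteArrayInputStream(original)).length);
    }
}
```

For the 4-byte input above, the first call returns 5 bytes and the second returns 4. If the corrected bytes open cleanly in Excel, the XSSFWorkbook round trip is no longer needed and the file never has to be parsed on upload.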

Related

Update InputStream before further processing

I've got a system where I process files based on a sample file. Basically I should receive an Excel file with headers followed by rows of information. Now users are sending additional header and trailer rows, which makes processing in Apache POI fail.
I've added fields to the GUI where the user can specify how many additional leading and trailing rows there are, so that I can remove them while parsing the Excel file. I receive the file as an InputStream on a Spring endpoint, then validation happens, and then the file is pushed to S3. So I am wondering whether there is any way to update that InputStream and remove the unwanted rows before the S3 upload.
Do I need to save the updated file and then read it back to get a new InputStream, or is there a better way to do that?
public InputStream cleanFileBeforeS3Upload(InputStream inputStream, Definition definition) throws IOException, InvalidFormatException {
    var workbook = WorkbookFactory.create(inputStream);
    var sheet = workbook.getSheetAt(0);
    ExcelUtils.removeLeadingAndTrailingRowsFromExcel(sheet, definition.getTrimLeadingRows(), definition.getTrimTrailingRows());
    // ????? How to get new updated inputstream from above workbook
    return inputStream;
}
// Line which uploads the file to S3
var request = new PutObjectRequest(s3Properties.getBucket(), s3ObjectKey, file.getInputStream(), metadata);
I think you can do something like:
...
var out = new ByteArrayOutputStream();
workbook.write(out);
return new ByteArrayInputStream(out.toByteArray());

How to retrieve a Long from Oracle using java

I have a database where an image is stored in a column of the LONG datatype.
I want to retrieve it and write it to an image file.
I tried using the getBytes method and writing the result to a file, and it comes out as a corrupt image.
I also tried using getBinaryStream and writing to an image file with a FileOutputStream, and I get the same corrupt-image error.
Code:
InputStream is = RS.getBinaryStream(1);
FileOutputStream fos = new FileOutputStream("image.bmp");
int c;
while ((c = is.read()) != -1)
{
    fos.write(c);
}
fos.close();
To store binary data you should use LONG RAW or a BLOB. The LONG type is for character-based (text) data.

How to save .pfx certificate into a Sqlite DB

I am trying to save a .pfx file in a SQLite DB and install it after reading its data back from the DB itself, not directly from the file. I am storing its byte array in a BLOB column and the other information in columns of the appropriate types, but the certificate fails to install when I get it through a SQL query.
Getting the byte array:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
BufferedInputStream reader = new BufferedInputStream(new FileInputStream(certificateFile));
try {
    IOUtils.copyStream(reader, baos);
} finally {
    reader.close();
    baos.close();
}
return baos.toByteArray();
Please suggest a solution. I searched and found that I have to extract the .pfx file and store the .pem and .cer files separately, but I could not find out how to do that. How can I save that information programmatically?

Excel convert into encoded string and then re-write in excel

I am working on a PoC where I need to convert any text file or Excel file into an encoded string and send it as the string body of a REST API call.
Converting a plain text file into a string and then reconstructing the file works without any problem.
But I am unable to reconstruct the original Excel file from its encoded string.
I get a corrupt file when converting it back to an Excel file.
byte[] decoded = Base64.decodeBase64(encodedExcelString);
BufferedOutputStream w = new BufferedOutputStream(new FileOutputStream("path"));
w.write(decoded.getBytes());
I had the same scenario. I am not very familiar with Excel file creation or character formats, but in the normal case this will work:
byte[] bytes = new sun.misc.BASE64Decoder().decodeBuffer(encodeData);
try (FileOutputStream fos = new FileOutputStream(filePath)) {
    fos.write(bytes);
}
Also, please avoid encoded strings for binary files. For a normal text file it is fine to wrap the content in an encoded string, but for a large binary file it takes much more processing time. Use a byte array instead of a string.
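As a hedged alternative to the snippet above: sun.misc.BASE64Decoder is a JDK-internal class and may not be available on newer JDKs, whereas java.util.Base64 has been part of the standard library since Java 8. The sketch below decodes straight to a byte[] and writes those bytes to a file without any String conversion in between. The class name Base64FileRoundTrip and its method names are invented for this example, and it assumes the encoded string contains no line breaks (Base64.getMimeDecoder() tolerates them if it does).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class for illustration only.
final class Base64FileRoundTrip {

    // Encodes raw file bytes for transport in a text body.
    static String encode(byte[] fileBytes) {
        return Base64.getEncoder().encodeToString(fileBytes);
    }

    // Decodes back to the raw bytes; the result is never turned into a String.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    // Writes the decoded bytes to the target file and closes it.
    static void decodeToFile(String encoded, Path target) throws IOException {
        Files.write(target, decode(encoded));
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, 0x00, 0x7F};
        Path target = Files.createTempFile("decoded", ".bin");
        decodeToFile(encode(original), target);
        // Compare what landed on disk with what was encoded.
        System.out.println(Arrays.equals(original, Files.readAllBytes(target)));
        Files.delete(target);
    }
}
```

Files.write opens, writes, and closes the file in one call, which also avoids losing buffered data when a stream is never flushed or closed.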

Reading Excel Data Issue From DB (CLOB Column) in Java with POI

I have a question that looked hard to me at first glance, but maybe it has a very easy solution that I have not figured out yet. I need to read the binary data of an Excel file that is stored in an Oracle database CLOB column.
Reading the CLOB as a string in Java works fine. I get the binary content of the Excel file in a string parameter.
String respXLS = othRaw.getOperationData(); // here I get excel file
InputStream bais = new ByteArrayInputStream(respXLS.getBytes());
POIFSFileSystem fs = new POIFSFileSystem(bais);
HSSFWorkbook wb = new HSSFWorkbook(fs);
I then try to read the byte stream data and pass it to POIFSFileSystem, but I get this exception:
java.io.IOException: Invalid header signature; read 0x00003F1A3F113F3F, expected 0xE11AB1A1E011CFD0
I googled some Excel problems, and they mention read access. So I downloaded the same Excel file to my hard disk, changed nothing in it (I did not even open it), and used a FileInputStream with the file path. That worked flawlessly. So what is the reason?
Any advice or an alternative way to read from the CLOB will be appreciated.
Thanks in advance,
My Regards.
CLOB means Character Large OBject. You want to use a BLOB, a Binary Large OBject. Change your database schema.
What happens is that a CLOB uses a character set to convert your String to and from the database's internal format, whatever that is; this corrupts any content that is not text.
Repeat after me: a String is not a byte[], and a character is not a byte.
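A small sketch of why that matters, using the eight signature bytes that correspond to the "expected 0xE11AB1A1E011CFD0" value in the exception above. Pushing them through a String with a character set that cannot represent them replaces the unmappable bytes with '?' (0x3F), the same value that appears repeatedly in the "read" signature. US-ASCII is chosen here only to make the effect deterministic; the character set actually used by the database is unknown, and the class name StringIsNotBytes is made up.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class StringIsNotBytes {

    // Simulates storing binary data as text: bytes -> String -> bytes.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.US_ASCII);
        return asText.getBytes(StandardCharsets.US_ASCII);
    }

    // Formats bytes as space-separated hex for printing.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The signature from the exception, in file (little-endian) byte order.
        byte[] signature = {
            (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
            (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
        };
        System.out.println(hex(signature));
        System.out.println(hex(throughString(signature)));
    }
}
```

Only the two bytes that happen to be valid ASCII (0x11 and 0x1A) survive the round trip; every other byte comes back as 0x3F, and the original values cannot be recovered.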
