I have a Word DOCX file (a zip archive) stored in a table as a BLOB. With a Java stored procedure, I want to replace a string inside the "word/document.xml" file and store all the files in a new archive. If I run the code in the Oracle database, I get a corrupted zip file. The output file is smaller than the input file...
If I run the same code on a client (NetBeans IDE), everything works fine. I can't find where the problem is.
public static void zamenjajVsebino(oracle.sql.BLOB srcBlob1, oracle.sql.BLOB[] dstBlob) throws Exception {
InputStream zipBuffer = srcBlob1.getBinaryStream();
OutputStream outBuffer = dstBlob[0].getBinaryOutputStream();
ZipInputStream zipIn = new ZipInputStream(zipBuffer);
ZipOutputStream zipOut = new ZipOutputStream(outBuffer);
ZipEntry inEntry;
while ((inEntry = zipIn.getNextEntry()) != null) {
ZipEntry outEntry = new ZipEntry(inEntry.getName());
zipOut.putNextEntry(outEntry);
if (inEntry.getName().equals("content.xml") | inEntry.getName().equals("word/document.xml")) {
String contentIn = new String(getStringByte(zipIn), "UTF-8");
contentIn = contentIn.replaceAll("%TEXT%", "BLA BLA");
zipOut.write(contentIn.getBytes(), 0, contentIn.getBytes().length);
} else {
copy(zipIn, zipOut);
}
zipOut.closeEntry();
}
zipIn.close();
zipOut.flush();
zipOut.finish();
}
public static void copy(InputStream in, OutputStream out) throws IOException {
int n;
byte[] buffer = new byte[1024];
while ((n = in.read(buffer)) > -1) {
out.write(buffer, 0, n); // Don't allow any extra bytes to creep in, final write
}
out.flush();
}
Here is my PL/SQL source code:
-- select source docx file from blob...
select vdb.vsebina
into SrcBlobLocator
from dok_vsebina_dokumenta_blob vdb
where id=p_vdb_id;
-- prepare output blob
p_vdb_id_out:=dok_lob.f_pripravi_blob;
update dok_vsebina_dokumenta_blob set
naziv_datoteke='test.docx',
vsebina = empty_blob()
where id=p_vdb_id_out;
--select output blob for update...
select vdb.vsebina
into DstBlobLocator
from dok_vsebina_dokumenta_blob vdb
where id=p_vdb_id_out for update;
-- call java stored procedure with source and dest blob...
ZamenjajVsebino(SrcBlobLocator, DstBlobLocator);
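A possible cause, offered as a guess rather than a confirmed diagnosis: the Java procedure above never closes zipOut or the BLOB's output stream, so bytes still sitting in a buffer may never reach the BLOB, which would fit an output that is smaller than the input. It also calls contentIn.getBytes() without a charset, so the database JVM's default encoding is used. The sketch below shows the same rewrite with both points addressed. It works on plain InputStream/OutputStream so that it can run outside the database; the class name DocxRewriteSketch and its helper names are hypothetical, and it assumes Java 7 or later (try-with-resources, StandardCharsets).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxRewriteSketch {

    // Hypothetical helper: copies every entry of the source archive to the target,
    // replacing a literal placeholder inside content.xml / word/document.xml.
    static void rewrite(InputStream src, OutputStream dst,
                        String placeholder, String replacement) throws IOException {
        // try-with-resources closes both zip streams. Closing zipOut finishes the
        // archive and closes dst as well, so nothing is left behind in a buffer.
        try (ZipInputStream zipIn = new ZipInputStream(src);
             ZipOutputStream zipOut = new ZipOutputStream(dst)) {
            ZipEntry inEntry;
            while ((inEntry = zipIn.getNextEntry()) != null) {
                String name = inEntry.getName();
                zipOut.putNextEntry(new ZipEntry(name));
                if (name.equals("content.xml") || name.equals("word/document.xml")) {
                    String xml = new String(readAll(zipIn), StandardCharsets.UTF_8);
                    // String.replace performs a literal (non-regex) replacement.
                    xml = xml.replace(placeholder, replacement);
                    // Encode explicitly as UTF-8 instead of the platform default.
                    zipOut.write(xml.getBytes(StandardCharsets.UTF_8));
                } else {
                    copy(zipIn, zipOut);
                }
                zipOut.closeEntry();
            }
        }
    }

    // Reads the current entry (or any stream) fully into memory.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream mem = new ByteArrayOutputStream();
        copy(in, mem);
        return mem.toByteArray();
    }

    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }
}
```

In the stored procedure, the two stream arguments would be srcBlob1.getBinaryStream() and the output stream obtained from dstBlob[0]. On an older database JVM without try-with-resources, the same idea needs explicit close() calls in a finally block.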
I have checked everywhere online and on Stack Overflow and could not find a match specific to this issue.
I am trying to extract a PDF file that is located in a zip file that is itself inside a zip file (nested zips).
Re-calling the method I am using to extract does not work, nor does changing the whole program to accept InputStreams instead of how I am doing it below.
The .pdf file inside the nested zip is just skipped at this stage.
public static void main(String[] args)
{
try
{
//Paths
String basePath = "C:\\Users\\user\\Desktop\\Scan\\";
File lookupDir = new File(basePath + "Data\\");
String doneFolder = basePath + "DoneUnzipping\\";
File[] directoryListing = lookupDir.listFiles();
for (int i = 0; i < directoryListing.length; i++)
{
if (directoryListing[i].isFile()) //there's definitely a file
{
//Save the current file's path
String pathOrigFile = directoryListing[i].getAbsolutePath();
Path origFileDone = Paths.get(pathOrigFile);
Path newFileDone = Paths.get(doneFolder + directoryListing[i].getName());
//unzip it
if(directoryListing[i].getName().toUpperCase().endsWith(ZIP_EXTENSION)) //ZIP files
{
unzip(directoryListing[i].getAbsolutePath(), DESTINATION_DIRECTORY + directoryListing[i].getName());
//move to the 'DoneUnzipping' folder
Files.move(origFileDone, newFileDone);
}
}
}
} catch (Exception e)
{
e.printStackTrace(System.out);
}
}
private static void unzip(String zipFilePath, String destDir)
{
//buffer for read and write data to file
byte[] buffer = new byte[BUFFER_SIZE];
try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFilePath)))
{
FileInputStream fis = new FileInputStream(zipFilePath);
ZipEntry ze = zis.getNextEntry();
while(ze != null)
{
String fileName = ze.getName();
int index = fileName.lastIndexOf("/");
String newFileName = fileName.substring(index + 1);
File newFile = new File(destDir + File.separator + newFileName);
//Zips inside zips
if(fileName.toUpperCase().endsWith(ZIP_EXTENSION))
{
ZipInputStream innerZip = new ZipInputStream(zis);
ZipEntry innerEntry = null;
while((innerEntry = innerZip.getNextEntry()) != null)
{
System.out.println("The file: " + fileName);
if(fileName.toUpperCase().endsWith("PDF"))
{
FileOutputStream fos = new FileOutputStream(newFile);
int len;
while ((len = innerZip.read(buffer)) > 0)
{
fos.write(buffer, 0, len);
}
fos.close();
}
}
}
//close this ZipEntry
zis.closeEntry(); // java.io.IOException: Stream Closed
ze = zis.getNextEntry();
}
//close last ZipEntry
zis.close();
fis.close();
} catch (IOException e)
{
e.printStackTrace();
}
}
The solution to this is not as obvious as it seems. Despite writing a few zip utilities myself some time ago, getting zip entries from inside another zip file only seems obvious in retrospect
(and I also got the java.io.IOException: Stream Closed on my first attempt).
The Java classes for ZipFile and ZipInputStream really direct your thinking into using the file system, but it is not required.
The functions below will scan a parent-level zip file, and continue scanning until it finds an entry with a specified name. (Nearly) everything is done in-memory.
Naturally, this can be modified to use different search criteria, find multiple file types, etc. and take different actions, but this at least demonstrates the basic technique in question -- zip files inside of zip files -- no guarantees on other aspects of the code, and someone more savvy could most likely improve the style.
final static String ZIP_EXTENSION = ".zip";
public static byte[] getOnePDF() throws IOException
{
final File source = new File("/path/to/MegaData.zip");
final String nameToFind = "FindThisFile.pdf";
final ByteArrayOutputStream mem = new ByteArrayOutputStream();
try (final ZipInputStream in = new ZipInputStream(new BufferedInputStream(new FileInputStream(source))))
{
digIntoContents(in, nameToFind, mem);
}
// Save to disk, if you want
// copy(new ByteArrayInputStream(mem.toByteArray()), new FileOutputStream(new File("/path/to/output.pdf")));
// Otherwise, just return the binary data
return mem.toByteArray();
}
private static void digIntoContents(final ZipInputStream in, final String nameToFind, final ByteArrayOutputStream mem) throws IOException
{
ZipEntry entry;
while (null != (entry = in.getNextEntry()))
{
final String name = entry.getName();
// Found the file we are looking for
if (name.equals(nameToFind))
{
copy(in, mem);
return;
}
// Found another zip file
if (name.toUpperCase().endsWith(ZIP_EXTENSION.toUpperCase()))
{
digIntoContents(new ZipInputStream(new ByteArrayInputStream(getZipEntryFromMemory(in))), nameToFind, mem);
}
}
}
private static byte[] getZipEntryFromMemory(final ZipInputStream in) throws IOException
{
final ByteArrayOutputStream mem = new ByteArrayOutputStream();
copy(in, mem);
return mem.toByteArray();
}
// General purpose, reusable, utility function
// OK for binary data (bad for non-ASCII text, use Reader/Writer instead)
public static void copy(final InputStream from, final OutputStream to) throws IOException
{
final int bufferSize = 4096;
final byte[] buf = new byte[bufferSize];
int len;
while (0 < (len = from.read(buf)))
{
to.write(buf, 0, len);
}
to.flush();
}
Your question asks how to use Java (by implication on Windows) to extract a PDF from a zip inside another outer zip.
On many systems, including Windows, this is a single-line command that depends on the location of the source and target folders. Using the shortest example, run from the current Downloads folder, it would be as simple as this in a shell:
tar -xf "german (2).zip" && tar -xf "german.zip" && german.pdf
To run the command from Java on Windows, see
How do I execute Windows commands in Java?
The default PDF viewer can open the result, so Microsoft Edge or, in my case, SumatraPDF.
There is generally no point in putting a PDF inside a zip, because it cannot be opened in place there. So a single level of nesting would be advisable if it is needed for download transport.
There is no need to add a password to the zip, because PDF has its own password for opening. It is thus unwise to add two levels of complexity. Keep it simple.
If you have multiple zips nested inside multiple zips, with multiple PDFs in each, then you have to be more specific by filtering names. However, avoid that extra onion skin where possible.
\Downloads>tar -xf "german (2).zip" "both.zip" && tar -xf "both.zip" "English language.pdf"
You could complicate that by running in memory or in a temp folder, but using the native file system is reliable and simple, so consider that without Java it is fastest to run
CD /D "C:/Users/user/Desktop/Scan/DoneUnzipping" && for %f in (..\Data\*.zip) do tar -xf "%f" "*.zip" && for %f in (*.zip) do tar -xf "%f" "*.pdf" && del "*.zip"
This will extract all inner zips into the working folder, then extract all PDFs and remove the temporary inner zips. The source double zips will not be deleted, simply touched.
The line that causes your problem looks to be the auto-close block you have created when reading the inner zip:
try(ZipInputStream innerZip = new ZipInputStream(fis)) {
...
}
Several likely issues: firstly it is reading the wrong stream - fis not the existing zis.
Secondly, you shouldn't use try-with-resources for auto-close on innerZip as this implicitly calls innerZip.close() when exiting the block. If you view the source code of ZipInputStream via a good IDE you should see (eventually) that ZipInputStream extends InflaterInputStream which itself extends FilterInputStream. A call to innerZip.close() will close the underlying outer stream zis (fis in your case) hence stream is closed when you resume the next entry of the outer zip.
Therefore remove the try() block and add use of zis:
ZipInputStream innerZip = new ZipInputStream(zis);
Use a try-with-resources block only for the outermost file handling:
try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFilePath))) {
ZipEntry ze = zis.getNextEntry();
...
}
Thirdly, you appear to be copying the wrong stream when extracting a PDF - use innerZip not outer zis. The code will never extract PDF as these 2 lines can never be true at the same time because a file ending ZIP will never end PDF too:
if(fileName.toUpperCase().endsWith(ZIP_EXTENSION)) {
...
// You want innerEntry.getName() here
if(fileName.toUpperCase().endsWith("PDF"))
You should be able to switch to one line Files.copy and make use of the PDF filename not zip filename:
if(innerEntry.getName().toUpperCase().endsWith("PDF")) {
Path newFile = Paths.get(destDir + '-'+innerEntry.getName().replace("/", "-"));
System.out.println("Files.copy to " + newFile);
Files.copy(innerZip, newFile);
}
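Putting the three points above together, a minimal sketch might look like the following. It is an illustration under stated assumptions, not the original poster's final code: the class name NestedPdfExtractSketch and the method extractPdfs are hypothetical, the outer archive is taken as an InputStream rather than a file path, and each PDF is written under its base name only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class NestedPdfExtractSketch {

    // Extracts every PDF found inside zips that are themselves entries of the outer zip.
    static List<Path> extractPdfs(InputStream outer, Path destDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(destDir);
        // Only the outermost stream is managed by try-with-resources.
        try (ZipInputStream zis = new ZipInputStream(outer)) {
            ZipEntry ze;
            while ((ze = zis.getNextEntry()) != null) {
                if (!ze.getName().toUpperCase(Locale.ROOT).endsWith(".ZIP")) {
                    continue;
                }
                // Deliberately NOT closed: closing innerZip would also close zis.
                ZipInputStream innerZip = new ZipInputStream(zis);
                ZipEntry innerEntry;
                while ((innerEntry = innerZip.getNextEntry()) != null) {
                    String innerName = innerEntry.getName();
                    if (innerEntry.isDirectory()
                            || !innerName.toUpperCase(Locale.ROOT).endsWith(".PDF")) {
                        continue;
                    }
                    // Keep only the last path segment of the inner entry name.
                    String baseName = innerName.substring(innerName.lastIndexOf('/') + 1);
                    Path target = destDir.resolve(baseName);
                    // Files.copy reads the current inner entry up to its end
                    // and leaves the input stream open.
                    Files.copy(innerZip, target, StandardCopyOption.REPLACE_EXISTING);
                    written.add(target);
                }
            }
        }
        return written;
    }
}
```

The test is on innerEntry.getName(), not on the outer entry's name, and the copy reads from innerZip, which are the two corrections described above.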
We've recently implemented functionality to load ".zip" files from SFTP. However, before we implemented it, there were already a couple of ".zip" files there, which were treated as ordinary "*.xml" files and downloaded. Now I have the task of restoring those downloaded files to zip format.
The ZIP files were treated as XML, and in doing so they were downloaded in the following manner:
CLOB file = downloadFile(channelSFTP.get(fileName), oConn);
:
private static CLOB downloadFile(InputStream is, OracleConnection oConn) throws Exception {
CLOB clob = CLOB.createTemporary(oConn, true, CLOB.DURATION_SESSION);
Writer writer = clob.setCharacterStream(1);
ByteArrayOutputStream result = new ByteArrayOutputStream();
int length = -1;
byte[] buffer = new byte[1024];
while ((length = is.read(buffer)) != -1) {
result.write(buffer, 0, length);
}
result.flush();
writer.write(result.toString("UTF-8"));
writer.flush();
return clob;
}
Then the CLOB is returned from the Java stored procedure to the Oracle procedure and encoded in BASE64.
What I've tried to restore files:
public static void decodeZip(String path, String resPath) throws FileNotFoundException, IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
new BASE64Decoder().decodeBuffer(new FileInputStream(path), bos);
ZipInputStream zis = new ZipInputStream((InputStream) new ByteArrayInputStream(bos.toByteArray()));
ZipEntry entry;
while((entry = zis.getNextEntry()) != null) {
System.out.println(entry.getName());
}
}
Where "path" is the path of the BASE64-encoded file. However, I'm getting the following error:
java.util.zip.ZipException: invalid stored block lengths
on "zis.getNextEntry()". I think it should be possible to achieve this, but I can't seem to figure out what step I am missing.
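One likely explanation, under the assumption that the archives went through downloadFile exactly as shown: result.toString("UTF-8") decodes the raw zip bytes as text, and Java replaces every byte sequence that is not valid UTF-8 with the replacement character U+FFFD. Compressed data typically contains such sequences, so the original bytes are generally no longer recoverable from the CLOB, with or without the Base64 step. A minimal sketch of that effect (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LossyUtf8RoundTripSketch {

    // Mimics the download path: bytes -> String decoded as UTF-8 -> bytes again.
    static byte[] roundTrip(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // True only if the bytes come back unchanged after the round trip.
    static boolean survives(byte[] raw) {
        return Arrays.equals(raw, roundTrip(raw));
    }
}
```

Plain ASCII content survives this round trip, which is why XML files were unaffected, while bytes such as 0x80 or 0xFF on their own do not. If that is what happened here, re-downloading the affected files from the SFTP server as binary is probably the only reliable fix.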
I'm trying to read .srt files that are located in a zip file which is itself located in a zip file. I succeeded in reading .srt files that were in a simple zip with the code extract below:
for (Enumeration enume = fis.entries(); enume.hasMoreElements();) {
ZipEntry entry = (ZipEntry) enume.nextElement();
fileName = entry.toString().substring(0,entry.toString().length()-4);
try {
InputStream in = fis.getInputStream(entry);
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
String ext = entry.toString().substring(entry.toString().length()-4, entry.toString().length());
But now I don't know how I could get to the zip file inside the zip file.
I tried using ZipFile fis = new ZipFile(filePath) with filePath being the path of the zip file + the name of the zip file inside. It didn't recognize the path, so I don't know if I am being clear.
Thanks.
ZipFile only works with real files, because it's intended for use as a random access mechanism which needs to be able to seek directly to specific locations in the file to read entries by name. But as VGR suggests in the comments, while you can't get random access to the zip-inside-a-zip you can use ZipInputStream, which provides strictly sequential access to the entries and works with any InputStream of zip-format data.
However, ZipInputStream has a slightly odd usage pattern compared to other streams - calling getNextEntry reads the entry metadata and positions the stream to read that entry's data, you read from the ZipInputStream until it reports EOF, then you (optionally) call closeEntry() before moving on to the next entry in the stream.
The critical point is that you must not close() the ZipInputStream until you have finished reading the final entry, so depending what you want to do with the entry data you might need to use something like the commons-io CloseShieldInputStream to guard against the stream getting closed prematurely.
try(ZipInputStream outerZip = new ZipInputStream(fis)) {
ZipEntry outerEntry = null;
while((outerEntry = outerZip.getNextEntry()) != null) {
if(outerEntry.getName().endsWith(".zip")) {
try(ZipInputStream innerZip = new ZipInputStream(
new CloseShieldInputStream(outerZip))) {
ZipEntry innerEntry = null;
while((innerEntry = innerZip.getNextEntry()) != null) {
if(innerEntry.getName().endsWith(".srt")) {
// read the data from the innerZip stream
}
}
}
}
}
}
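If adding commons-io is not an option, the same guard can be sketched with the standard library alone: a FilterInputStream subclass whose close() does nothing. The class name NonClosingInputStream and the helper listNested are hypothetical and only illustrate the idea; they are not part of commons-io or the JDK.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Stand-in for CloseShieldInputStream: forwards reads, ignores close().
final class NonClosingInputStream extends FilterInputStream {

    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally empty so the wrapped outer stream stays open.
    }

    // Lists "outerEntry!innerEntry" for every entry of every nested zip.
    static List<String> listNested(InputStream source) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream outerZip = new ZipInputStream(source)) {
            ZipEntry outerEntry;
            while ((outerEntry = outerZip.getNextEntry()) != null) {
                if (!outerEntry.getName().endsWith(".zip")) {
                    continue;
                }
                // Closing innerZip here is safe: the shield absorbs the close().
                try (ZipInputStream innerZip =
                         new ZipInputStream(new NonClosingInputStream(outerZip))) {
                    ZipEntry innerEntry;
                    while ((innerEntry = innerZip.getNextEntry()) != null) {
                        names.add(outerEntry.getName() + "!" + innerEntry.getName());
                    }
                }
            }
        }
        return names;
    }
}
```

With the shield in place, try-with-resources can be used on the inner stream without cutting off the remaining outer entries.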
Here is code to extract .zip files recursively:
public void extractFolder(String zipFile) throws ZipException, IOException {
System.out.println(zipFile);
int BUFFER = 2048;
File file = new File(zipFile);
ZipFile zip = new ZipFile(file);
String newPath = zipFile.substring(0, zipFile.length() - 4);
new File(newPath).mkdir();
Enumeration zipFileEntries = zip.entries();
// Process each entry
while (zipFileEntries.hasMoreElements())
{
// grab a zip file entry
ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
String currentEntry = entry.getName();
File destFile = new File(newPath, currentEntry);
//destFile = new File(newPath, destFile.getName());
File destinationParent = destFile.getParentFile();
// create the parent directory structure if needed
destinationParent.mkdirs();
if (!entry.isDirectory())
{
BufferedInputStream is = new BufferedInputStream(zip
.getInputStream(entry));
int currentByte;
// establish buffer for writing file
byte data[] = new byte[BUFFER];
// write the current file to disk
FileOutputStream fos = new FileOutputStream(destFile);
BufferedOutputStream dest = new BufferedOutputStream(fos,
BUFFER);
// read and write until last byte is encountered
while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
dest.write(data, 0, currentByte);
}
dest.flush();
dest.close();
is.close();
}
if (currentEntry.endsWith(".zip"))
{
// found a zip file, try to open
extractFolder(destFile.getAbsolutePath());
}
}
zip.close(); // release the archive's file handle once every entry has been processed
}
I am trying to transfer a SQLite database into an app by downloading it and then unzipping it to the correct location. I was successful in transferring the DB when it was unzipped. The error I get is that it cannot find any of the tables I query. I have also been successful in unzipping and reading normal text files.
The DB has Hebrew and English, but that has not caused problems before. The bilingual DB was copied successfully when it was not zipped and bilingual texts have been successfully unzipped and read. Still, it is a possibility that there is an encoding problem going on. That seems weird to me, because as you can see below in the code, I'm just copying the bytes directly.
-EDIT-
Let's say the pre-zipped DB is called test1.db. I zipped it, put it in the app, unzipped it, and called the result test2.db. When I ran a diff command on these two, there were no differences. So there must be a technical issue with the way Android is reading the file, or maybe an encoding issue on Android that doesn't exist on a PC?
I hate to do a code dump, but I will post both functions. copyDatabase() works; it is what I used previously on an unzipped DB file, and I put it here for comparison. Now I'm trying to use unzipDatabase() on a zipped DB file, and it doesn't work. The latter function was copied from How to unzip files programmatically in Android?
private void copyDatabase() throws IOException{
String DB_NAME = "test.db";
String DB_PATH = "/data/data/org.myapp.myappname/databases/";
//Open your local db as the input stream
InputStream myInput = myContext.getAssets().open(DB_NAME);
// Path to the just created empty db
String outFileName = DB_PATH + DB_NAME;
//Open the empty db as the output stream
OutputStream myOutput = new FileOutputStream(outFileName);
//transfer bytes from the inputfile to the outputfile
byte[] buffer = new byte[1024];
int length;
while ((length = myInput.read(buffer))>0){
myOutput.write(buffer, 0, length);
}
//Close the streams
myOutput.flush();
myOutput.close();
myInput.close();
}
private boolean unzipDatabase(String path)
{
String DB_NAME = "test.zip";
InputStream is;
ZipInputStream zis;
try
{
String filename;
is = myContext.getAssets().open(DB_NAME);
zis = new ZipInputStream(is);
ZipEntry ze;
byte[] buffer = new byte[1024];
int count;
while ((ze = zis.getNextEntry()) != null)
{
// write to a file
filename = ze.getName();
// Need to create directories if not exists, or
// it will generate an Exception...
if (ze.isDirectory()) {
Log.d("yo",path + filename);
File fmd = new File(path + filename);
fmd.mkdirs();
continue;
}
OutputStream fout = new FileOutputStream(path + filename);
// reading and writing zip
while ((count = zis.read(buffer)) != -1)
{
fout.write(buffer, 0, count);
}
fout.flush();
fout.close();
zis.closeEntry();
}
zis.close();
}
catch(IOException e)
{
e.printStackTrace();
return false;
}
return true;
}
So I still don't know why, but the problem is solved if I first delete the old copy of the database (located at DB_PATH + DB_NAME) and then unzip the new one there. I didn't need to do this when copying it directly.
So yay, it was a file-overwriting issue... If someone knows why, feel free to comment.
I have the following situation:
I am able to zip my files with the following method:
public boolean generateZip(){
byte[] application = new byte[100000];
ByteArrayOutputStream baos = new ByteArrayOutputStream();
// These are the files to include in the ZIP file
String[] filenames = new String[]{"/subdirectory/index.html", "/subdirectory/webindex.html"};
// Create a buffer for reading the files
try {
// Create the ZIP file
ZipOutputStream out = new ZipOutputStream(baos);
// Compress the files
for (int i=0; i<filenames.length; i++) {
byte[] filedata = VirtualFile.fromRelativePath(filenames[i]).content();
ByteArrayInputStream in = new ByteArrayInputStream(filedata);
// Add ZIP entry to output stream.
out.putNextEntry(new ZipEntry(filenames[i]));
// Transfer bytes from the file to the ZIP file
int len;
while ((len = in.read(application)) > 0) {
out.write(application, 0, len);
}
// Complete the entry
out.closeEntry();
in.close();
}
// Complete the ZIP file
out.close();
} catch (IOException e) {
System.out.println("There was an error generating ZIP.");
e.printStackTrace();
}
downloadzip(baos.toByteArray());
}
This works perfectly and I can download the xy.zip which contains the following directory and file structure:
subdirectory/
----index.html
----webindex.html
My aim is to completely leave out the subdirectory and the zip should only contain the two files. Is there any way to achieve this?
(I am using Java on Google App Engine).
Thanks in advance
If you are sure the names in the filenames array stay unique once the directory is left out, change the line that constructs the ZipEntry:
String zipEntryName = new File(filenames[i]).getName();
out.putNextEntry(new ZipEntry(zipEntryName));
This uses java.io.File#getName()
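As a self-contained illustration of that change, the sketch below zips in-memory file contents under their base names and fails fast if two paths collapse to the same name. The class FlatZipSketch and the method zipFlat are hypothetical, and the Map input stands in for whatever supplies the file bytes in the real application (VirtualFile in the question).

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class FlatZipSketch {

    // Builds a zip whose entries carry only the file name, without any directory part.
    static byte[] zipFlat(Map<String, byte[]> filesByPath) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        Set<String> seen = new HashSet<>();
        try (ZipOutputStream out = new ZipOutputStream(baos)) {
            for (Map.Entry<String, byte[]> file : filesByPath.entrySet()) {
                // "/subdirectory/index.html" -> "index.html"
                String entryName = new File(file.getKey()).getName();
                if (!seen.add(entryName)) {
                    throw new IllegalArgumentException(
                            "Duplicate entry name after flattening: " + entryName);
                }
                out.putNextEntry(new ZipEntry(entryName));
                out.write(file.getValue());
                out.closeEntry();
            }
        }
        return baos.toByteArray();
    }
}
```

The explicit duplicate check only makes the uniqueness assumption visible with a clearer message than the one ZipOutputStream would produce on its own.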
You can use Apache Commons IO to list all your files, then read each one into a byte array.
Replace the line below
String[] filenames = new String[]{"/subdirectory/index.html", "/subdirectory/webindex.html"}
with the following
Collection<File> files = FileUtils.listFiles(new File("/subdirectory"), new String[]{"html"}, true);
for (File file : files)
{
// try-with-resources closes the stream once the bytes have been read
try (FileInputStream fileStream = new FileInputStream(file))
{
byte[] filedata = IOUtils.toByteArray(fileStream);
//From here you can proceed with your zipping.
}
}
Let me know if you have issues.
You could use the isDirectory() method on VirtualFile