I have a directory where I programmatically (in Java) do recursive unzipping (which seems to work), but in the end I'm left with a directory that has a lot of subdirectories and files. Every time I run this method I want to start with a clean slate, so I always delete the folder and its left-over files and subdirectories present in the temp directory.
root = new File(System.getProperty("java.io.tmpdir")+ File.separator + "ProductionTXOnlineCompletionDataPreProcessorRoot");
if(root.exists()){
try {
FileUtils.deleteDirectory(root);
} catch (IOException e) {
e.printStackTrace();
}
}
if(root.mkdir()){
rawFile = createRawDataFile();
}
I'm getting a really strange error from FileUtils.deleteDirectory though.
14:55:27,214 ERROR [stderr] (Thread-3 (HornetQ-client-global-threads-2098205981)) java.io.IOException: Unable to delete directory C:\Users\Admin\AppData\Local\Temp\ProductionTXOnlineCompletionDataPreProcessorRoot\ProductionTXOnlineCompletionDataPreProcessor8718674704286818303.
It seems to think that I have a period at the end of my directory (It doesn't, so it's no surprise that it can't delete it). Sometimes, this error appears on folders in the subdirectory. Has anyone seen this before?
I'm using the Commons IO 2.4 jar.
EDIT I've confirmed that the directories do not have periods, so unless they're invisible, I don't know why the method would think that there are periods. And the File's path that I give the method is set right before feeding it as an argument, and as anyone can see - it doesn't have a period at the end.
I'm running the program on Windows 7.
EDIT This is the code I used for recursively unzipping:
private void extractFolder(String zipFile) throws IOException
{
int BUFFER = 2048;
File file = new File(zipFile);
ZipFile zip = null;
String newPath = zipFile.substring(0, zipFile.length() - 4);
BufferedOutputStream dest = null;
BufferedInputStream is = null;
try{
zip = new ZipFile(zipFile);
Enumeration<? extends ZipEntry> zipFileEntries = zip.entries();
while (zipFileEntries.hasMoreElements())
{
ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
String currentEntry = entry.getName();
File destFile = new File(newPath, currentEntry);
File destinationParent = destFile.getParentFile();
destinationParent.mkdirs();
if (!entry.isDirectory())
{
is = new BufferedInputStream(zip
.getInputStream(entry));
int currentByte;
byte data[] = new byte[BUFFER];
FileOutputStream fos = new FileOutputStream(destFile);
dest = new BufferedOutputStream(fos, BUFFER);
// read and write until last byte is encountered
while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
dest.write(data, 0, currentByte);
}
dest.flush();
}
if (currentEntry.endsWith(".zip")){
// found a zip file, try to open
extractFolder(destFile.getAbsolutePath());
}
}
}catch(Exception e){
e.printStackTrace();
}finally{
if(dest!=null) {dest.close();}
if(is!=null) {is.close();}
zip.close();
}
}
I put the original zip into the root directory, and recursively unzip from there.
This is the relevant code showing that:
downloadInputStreamToFileInRootDir(in, rawFile);
try {
extractFolder(rawFile.getCanonicalPath());
} catch (IOException e) {
e.printStackTrace();
}catch (Exception e){
}
I just noticed that I use rawFile.getCanonicalPath() (rawFile is set in the first code excerpt) as the argument for extractFolder initially and then switch to destFile.getAbsolutePath() ... Maybe that has something to do with it. The problem with testing this is that the issue isn't deterministic. It sometimes happens and sometimes not.
The period is part of the error message. It's not trying to delete a file path with a period at the end. See the FileUtils source:
if (!directory.delete()) {
final String message =
"Unable to delete directory " + directory + ".";
throw new IOException(message);
}
Related
i have checked everywhere online and stackoverflow and could not find a match specific to this issue.
I am trying to extract a pdf file that is located in a zip file that is inside a zip file (nested zips).
Re-calling the method i am using to extract does not work nor does changing the whole program to accept Inputstreams instead of how i am doing it below.
The .pdf file inside the nested zip is just skipped at this stage
public static void main(String[] args)
{
try
{
//Paths
String basePath = "C:\\Users\\user\\Desktop\\Scan\\";
File lookupDir = new File(basePath + "Data\\");
String doneFolder = basePath + "DoneUnzipping\\";
File[] directoryListing = lookupDir.listFiles();
for (int i = 0; i < directoryListing.length; i++)
{
if (directoryListing[i].isFile()) //there's definately a file
{
//Save the current file's path
String pathOrigFile = directoryListing[i].getAbsolutePath();
Path origFileDone = Paths.get(pathOrigFile);
Path newFileDone = Paths.get(doneFolder + directoryListing[i].getName());
//unzip it
if(directoryListing[i].getName().toUpperCase().endsWith(ZIP_EXTENSION)) //ZIP files
{
unzip(directoryListing[i].getAbsolutePath(), DESTINATION_DIRECTORY + directoryListing[i].getName());
//move to the 'DoneUnzipping' folder
Files.move(origFileDone, newFileDone);
}
}
}
} catch (Exception e)
{
e.printStackTrace(System.out);
}
}
private static void unzip(String zipFilePath, String destDir)
{
//buffer for read and write data to file
byte[] buffer = new byte[BUFFER_SIZE];
try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFilePath)))
{
FileInputStream fis = new FileInputStream(zipFilePath);
ZipEntry ze = zis.getNextEntry();
while(ze != null)
{
String fileName = ze.getName();
int index = fileName.lastIndexOf("/");
String newFileName = fileName.substring(index + 1);
File newFile = new File(destDir + File.separator + newFileName);
//Zips inside zips
if(fileName.toUpperCase().endsWith(ZIP_EXTENSION))
{
ZipInputStream innerZip = new ZipInputStream(zis);
ZipEntry innerEntry = null;
while((innerEntry = innerZip.getNextEntry()) != null)
{
System.out.println("The file: " + fileName);
if(fileName.toUpperCase().endsWith("PDF"))
{
FileOutputStream fos = new FileOutputStream(newFile);
int len;
while ((len = innerZip.read(buffer)) > 0)
{
fos.write(buffer, 0, len);
}
fos.close();
}
}
}
//close this ZipEntry
zis.closeEntry(); // java.io.IOException: Stream Closed
ze = zis.getNextEntry();
}
//close last ZipEntry
zis.close();
fis.close();
} catch (IOException e)
{
e.printStackTrace();
}
}
The solution to this is not as obvious as it seems. Despite writing a few zip utilities myself some time ago, getting zip entries from inside another zip file only seems obvious in retrospect
(and I also got the java.io.IOException: Stream Closed on my first attempt).
The Java classes for ZipFile and ZipInputStream really direct your thinking into using the file system, but it is not required.
The functions below will scan a parent-level zip file, and continue scanning until it finds an entry with a specified name. (Nearly) everything is done in-memory.
Naturally, this can be modified to use different search criteria, find multiple file types, etc. and take different actions, but this at least demonstrates the basic technique in question -- zip files inside of zip files -- no guarantees on other aspects of the code, and someone more savvy could most likely improve the style.
final static String ZIP_EXTENSION = ".zip";
public static byte[] getOnePDF() throws IOException
{
final File source = new File("/path/to/MegaData.zip");
final String nameToFind = "FindThisFile.pdf";
final ByteArrayOutputStream mem = new ByteArrayOutputStream();
try (final ZipInputStream in = new ZipInputStream(new BufferedInputStream(new FileInputStream(source))))
{
digIntoContents(in, nameToFind, mem);
}
// Save to disk, if you want
// copy(new ByteArrayInputStream(mem.toByteArray()), new FileOutputStream(new File("/path/to/output.pdf")));
// Otherwise, just return the binary data
return mem.toByteArray();
}
private static void digIntoContents(final ZipInputStream in, final String nameToFind, final ByteArrayOutputStream mem) throws IOException
{
ZipEntry entry;
while (null != (entry = in.getNextEntry()))
{
final String name = entry.getName();
// Found the file we are looking for
if (name.equals(nameToFind))
{
copy(in, mem);
return;
}
// Found another zip file
if (name.toUpperCase().endsWith(ZIP_EXTENSION.toUpperCase()))
{
digIntoContents(new ZipInputStream(new ByteArrayInputStream(getZipEntryFromMemory(in))), nameToFind, mem);
}
}
}
private static byte[] getZipEntryFromMemory(final ZipInputStream in) throws IOException
{
final ByteArrayOutputStream mem = new ByteArrayOutputStream();
copy(in, mem);
return mem.toByteArray();
}
// General purpose, reusable, utility function
// OK for binary data (bad for non-ASCII text, use Reader/Writer instead)
public static void copy(final InputStream from, final OutputStream to) throws IOException
{
final int bufferSize = 4096;
final byte[] buf = new byte[bufferSize];
int len;
while (0 < (len = from.read(buf)))
{
to.write(buf, 0, len);
}
to.flush();
}
Your question asks how to use java (by implication in windows) to extract a pdf from a zip inside another outer zip.
In many systems including windows it is a single line command that will depend on the location of source and target folders, however using the shortest example of current downloads folder it would be in a shell as simple as
tar -xf "german (2).zip" && tar -xf "german.zip" && german.pdf
to shell the command in windows see
How do I execute Windows commands in Java?
The default pdf viewer can open the result so Windows Edge or in my case SumatraPDF
There is generally no point in putting a pdf inside a zip because it cannot be run in there. So single nesting would be advisable if needed for download transportation.
There is no need to add a password to the zip because PDF uses its own password for opening. Thus unwise to add two levels of complexity. Keep it simple.
If you have multiple zips nested inside multiple zips with multiple pdfs in each then you have to be more specific by filtering names. However avoid that extra onion skin where possible.
\Downloads>tar -xf "german (2).zip" "both.zip" && tar -xf "both.zip" "English language.pdf"
You could complicate that by run in a memory or temp folder but it is reliable and simple to use the native file system so consider without Java its fastest to run
CD /D "C:/Users/user/Desktop/Scan/DoneUnzipping" && for %f in (..\Data\*.zip) do tar -xf "%f" "*.zip" && for %f in (*.zip) do tar -xf "%f" "*.pdf" && del "*.zip"
This will extract all inner zips into working folder then extract all PDFs and remove all the essential temporary zips. The source double zips will not be deleted simply touched.
The line that causes your problem looks to be auto-close block you have created when reading the inner zip:
try(ZipInputStream innerZip = new ZipInputStream(fis)) {
...
}
Several likely issues: firstly it is reading the wrong stream - fis not the existing zis.
Secondly, you shouldn't use try-with-resources for auto-close on innerZip as this implicitly calls innerZip.close() when exiting the block. If you view the source code of ZipInputStream via a good IDE you should see (eventually) that ZipInputStream extends InflaterInputStream which itself extends FilterInputStream. A call to innerZip.close() will close the underlying outer stream zis (fis in your case) hence stream is closed when you resume the next entry of the outer zip.
Therefore remove the try() block and add use of zis:
ZipInputStream innerZip = new ZipInputStream(zis);
Use try-catch block only for the outermost file handling:
try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFilePath))) {
ZipEntry ze = zis.getNextEntry();
...
}
Thirdly, you appear to be copying the wrong stream when extracting a PDF - use innerZip not outer zis. The code will never extract PDF as these 2 lines can never be true at the same time because a file ending ZIP will never end PDF too:
if(fileName.toUpperCase().endsWith(ZIP_EXTENSION)) {
...
// You want innerEntry.getName() here
if(fileName.toUpperCase().endsWith("PDF"))
You should be able to switch to one line Files.copy and make use of the PDF filename not zip filename:
if(innerEntry.getName().toUpperCase().endsWith("PDF")) {
Path newFile = Paths.get(destDir + '-'+innerEntry.getName().replace("/", "-"));
System.out.println("Files.copy to " + newFile);
Files.copy(innerZip, newFile);
}
So iam working on a project where i get to a point when i need to zip multiple folders (4folders to be specific) and one file in one output.zip file using java
So is there anyway for me to do it and by the way putting all the folders and file in one directory and then zipping it doesn't give the same result in other words the folders have to be in root level of the zip file
There are several solutions.
This one for example:
public static void zipDirectory(ZipOutputStream zos, File fileToZip, String parentDirectoryName) throws Exception
{
if (fileToZip == null || !fileToZip.exists())
{
return;
}
String zipEntryName = fileToZip.getName();
if (parentDirectoryName!=null && !parentDirectoryName.isEmpty())
{
zipEntryName = parentDirectoryName + "/" + fileToZip.getName();
}
// If we are dealing with a directory:
if (fileToZip.isDirectory())
{
System.out.println("+" + zipEntryName);
if(parentDirectoryName == null) // if parentDirectory is null, that means it's the first iteration of the recursion, so we do not include the first container folder
{
zipEntryName = "";
}
for (File file : fileToZip.listFiles()) // we iterate over all the folders/files and archive them by keeping the structure too.
{
zipDirectory(zos, file, zipEntryName);
}
} else // If we are dealing with a file, then we zip it directly
{
System.out.println(" " + zipEntryName);
byte[] buffer = new byte[1024];
FileInputStream fis = new FileInputStream(fileToZip);
zos.putNextEntry(new ZipEntry(zipEntryName));
int length;
while ((length = fis.read(buffer)) > 0)
{
zos.write(buffer, 0, length);
}
zos.closeEntry();
fis.close();
}
}
then you could use this function like this:
try
{
File directoryToBeZipped = new File("C:\\New\\test");
FileOutputStream fos = new FileOutputStream("C:\\New\\test\\archive.zip");
ZipOutputStream zos = new ZipOutputStream(fos);
zipDirectory(zos, directoryToBeZipped, null);
zos.flush();
fos.flush();
zos.close();
fos.close();
} catch (Exception e)
{
e.printStackTrace();
}
Or you could use the ZeroTurnAround ZIP Library. And do this in one line:
ZipUtil.pack(new File("D:\\sourceFolder\\"), new File("D:\\generatedZipFile.zip"));
Dead easy way (though you'll get warnings about proprietary classes)
final String[] ARGS = { "-cfM", "x.zip", "folder1", "folder2", "folder3", "folder4", "file.txt" };
sun.tools.jar.Main.main(ARGS);
It might be worth getting a similar thing that won't give you warnings
I've been trying to copy a directory recursively from an NFS mount to the local file system in Java. I first tried using FileUtils in Apache Utils. Sadly, it didn't copy recursively (walking through sub-directories, etc.) so I had to go back to the drawing board. I heard that some of those operations are "finicky" when they are cross-device. I was then suggested to try and use linux commands, so I tried doing so:
Process process = new ProcessBuilder()
.command("cp -R " + source.getAbsolutePath() + " " + dest.getAbsolutePath())
.start();
process.waitFor();
That sadly threw a response of "no such file or directory", I slapped on some debug on there and tried again. Even though I got "no such file or directory", my debug stated that both the source and destination directories exist, as well as after checking manually if they exist.
Alright, so I decided to write my own implementation and oddly enough it worked. Sadly, I have little-to-no knowledge why this worked over FileUtils. Here is what I wrote:
private void copy(File source, File destination) throws IOException {
if (source.isDirectory()) {
if (!destination.exists()) {
destination.mkdir();
}
if (destination.isFile()) {
destination.delete();
destination.mkdir();
}
for (File src : source.listFiles()) {
File dest = new File(destination, src.getName());
copy(src, dest);
}
} else {
destination.createNewFile();
FileInputStream input = new FileInputStream(source);
FileOutputStream out = new FileOutputStream(destination);
byte[] buffer = new byte[2048];
int l;
while ((l = input.read(buffer)) > 0) {
out.write(buffer, 0, l);
}
input.close();
out.close();
}
}
i'm using the following code from the web
File dir = new File(dest.getAbsolutePath(), archiveName);
// create output directory if it doesn't exist
if (!dir.exists()) {
dir.mkdirs();
}
System.err.println(pArchivePath);
ZipFile zipFile = null;
try {
zipFile = new ZipFile(pArchivePath);
Enumeration<?> enu = zipFile.entries();
while (enu.hasMoreElements()) {
ZipEntry zipEntry = (ZipEntry) enu.nextElement();
String name = zipEntry.getName();
long size = zipEntry.getSize();
long compressedSize = zipEntry.getCompressedSize();
System.out.printf("name: %-20s | size: %6d | compressed size: %6d\n",
name, size, compressedSize);
File file = new File(name);
if (name.endsWith("/")) {
System.err.println("make dir " + name);
file.mkdirs();
continue;
}
File parent = file.getParentFile();
if (parent != null) {
parent.mkdirs();
}
InputStream is = zipFile.getInputStream(zipEntry);
FileOutputStream fos = new FileOutputStream(file);
byte[] bytes = new byte[1024];
int length;
while ((length = is.read(bytes)) >= 0) {
fos.write(bytes, 0, length);
}
is.close();
fos.close();
}
zipFile.close();
} catch (Exception e) {
e.printStackTrace();
} finally {
if (zipFile != null) {
try {
zipFile.close();
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
to unzip an archive. It's packed with 7zip or Winrar as an .zip archive but renamed to .qtz (but i don't thing this causes the problem..)
So if i run the code to unzip my archive everything works fine: i get the output on sysout/err listing all files and also no exception occurs, but if i look in the destination directory ... it's empty - just the root folder exists.
I also used
Runtime.getRuntime().exec(String.format("unzip %s -d %s", pArchivePath, dest.getPath()));
but I can't use this anymore 'cause a new process is started and I'm continuing working on the archive right after the unzip process in the java code.
Well the question is.. why doesn't this peace of code work? There a lot of similar examples but none of them worked for me.
br, Philipp
EDIT: The following solved my Problem
File file = new File(dir.getParent(), name);
So i didn't set the right parent path for this file.
The below fragment in your code:
File parent = file.getParentFile();
if (parent != null) {
parent.mkdirs();
}
where does this create the parent directories? Because I tried your code and it's not creating in the destination directory but rather in my Eclipse's project directory. Looking at your code, the destination directory is nowhere used, right?
The code actually extracts the contents of the zip file but not where I expected it.
I am thinking because I don't see you doing something like this:
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipPath+ "Zip.zip"));
out.putNextEntry(new ZipEntry("Results.csv"));
I am still working on it but I think that the problem because that makes the file inside of the zip
Also you should probably use the ZipOutputStream to write; like
out.write(bytes, 0, length);
Use Case
I need to package up our kml which is in a String into a kmz response for a network link in Google Earth. I would like to also wrap up icons and such while I'm at it.
Problem
Using the implementation below I receive errors from both WinZip and Google Earth that the archive is corrupted or that the file cannot be opened respectively. The part that deviates from other examples I'd built this from are the lines where the string is added:
ZipEntry kmlZipEntry = new ZipEntry("doc.kml");
out.putNextEntry(kmlZipEntry);
out.write(kml.getBytes("UTF-8"));
Please point me in the right direction to correctly write the string so that it is in doc.xml in the resulting kmz file. I know how to write the string to a temporary file, but I would very much like to keep the operation in memory for understandability and efficiency.
private static final int BUFFER = 2048;
private static void kmz(OutputStream os, String kml)
{
try{
BufferedInputStream origin = null;
ZipOutputStream out = new ZipOutputStream(os);
out.setMethod(ZipOutputStream.DEFLATED);
byte data[] = new byte[BUFFER];
File f = new File("./icons"); //folder containing icons and such
String files[] = f.list();
if(files != null)
{
for (String file: files) {
LOGGER.info("Adding to KMZ: "+ file);
FileInputStream fi = new FileInputStream(file);
origin = new BufferedInputStream(fi, BUFFER);
ZipEntry entry = new ZipEntry(file);
out.putNextEntry(entry);
int count;
while((count = origin.read(data, 0, BUFFER)) != -1) {
out.write(data, 0, count);
}
origin.close();
}
}
ZipEntry kmlZipEntry = new ZipEntry("doc.kml");
out.putNextEntry(kmlZipEntry);
out.write(kml.getBytes("UTF-8"));
}
catch(Exception e)
{
LOGGER.error("Problem creating kmz file", e);
}
}
Bonus points for showing me how to put the supplementary files from the icons folder into a similar folder within the archive as opposed to at the same layer as the doc.kml.
Update Even when saving the string to a temp file the errors occur. Ugh.
Use Case Note The use case is for use in a web app, but the code to get the list of files won't work there. For details see how-to-access-local-files-on-server-in-jboss-application
You forgot to call close() on ZipOutputStream. Best place to call it is the finally block of the try block where it's been created.
Update: To create a folder, just prepend its name in the entry name.
ZipEntry entry = new ZipEntry("icons/" + file);