How to check if a ZIP file is empty in Java?

I have the following piece of code -
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ZipOutputStream zos = new ZipOutputStream(outputStream);
for (int i = 0; i < params.getGrades().size(); i++) {
    generateReport(param1, param2, zos);
}
zos.flush();
zos.close();
In the generateReport method, I have code to generate my reports as xls files and add them to ZIP.
Is there any way to check whether any files have been written to the ZIP file, or whether the ZIP file is empty? Is there any property I can use?
Thanks,
Raaz

You can use ZipFile from the java.util.zip package and invoke its size() method, which returns the number of entries in the ZIP file.
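
A minimal sketch of this suggestion, assuming the archive has been written to a file on disk (ZipFile opens a file, not an in-memory stream). The class name ZipEntryCount and the helper isEmpty are made up for illustration:

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

final class ZipEntryCount {
    // Returns true when the archive at zipPath contains no entries.
    static boolean isEmpty(File zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            return zip.size() == 0; // size() is the number of entries
        }
    }
}
```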

After you close zos, outputStream.size() gives you the number of bytes written. You would have to allow for whatever the ZIP header size is for an empty ZIP file.
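
Instead of comparing byte counts, one alternative (not what this answer proposes) is to read the finished bytes back and count the entries. The sketch below assumes the ZIP was written to a ByteArrayOutputStream as in the question. InMemoryZipCheck and countEntries are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipInputStream;

final class InMemoryZipCheck {
    // Counts the entries of a ZIP held in memory, e.g. outputStream.toByteArray().
    static int countEntries(byte[] zipBytes) throws IOException {
        int count = 0;
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (zin.getNextEntry() != null) {
                count++;
            }
        }
        return count;
    }
}
```

After zos.close(), calling InMemoryZipCheck.countEntries(outputStream.toByteArray()) == 0 would then indicate that nothing was added.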

See:
http://www.java-examples.com/get-number-entries-zip-file-example
and:
Count files in ZIP's directory - JAVA, Android
and:
Android: Get Number of Files within Zip?

@Raaz, please go through this link.
There you can see a class called 'ZipEntry'. It represents a file contained in a ZIP archive and provides some useful methods such as:
zipEntry.getName(); // name of the file contained in the ZIP.
zipEntry.getSize(); // uncompressed size of the file contained in the ZIP.

@Didier - I decided to take your advice on returning a value, but ended up doing it this way -
Instead of checking whether a file has been added to the ZIP, I checked whether the list data I'm trying to write to an xls file (the file which in turn gets added to the ZIP) is empty. If it's empty, I set an error value of "No file generated". If the list is not empty, I assigned an empty value to the string and returned it to the calling function.

Related

Read the data from TXT file inside Zip File without extracting the contents in Matlab

I have tab delimited ascii data in txt files which are zip compressed (and the zip may or may not contain other files). I would like to read this data into a matrix without uncompressing the zip files.
There were a few similar #matlab / #java posts earlier:
Read the data of CSV file inside Zip File without extracting the contents in Matlab
Extracting specific file from zip in matlab
Read Content from Files which are inside Zip file
I have gotten this far thanks to the above - I can identify the .txt inside the zip, but don't know how to actually read its contents. First example:
zipFilename = 'example.zip';
zipJavaFile = java.io.File(zipFilename);
zipFile=org.apache.tools.zip.ZipFile(zipJavaFile);
entries=zipFile.getEntries;
cnt=1;
while entries.hasMoreElements
    tempObj=entries.nextElement;
    file{cnt,1}=tempObj.getName.toCharArray';
    cnt=cnt+1;
end
ind=regexp(file,'$*.xml$');
ind=find(~cellfun(@isempty,ind));
file=file(ind);
file = cellfun(@(x) fullfile('.',x),file,'UniformOutput',false);
% Now Operate Any thing on File.
zipFile.close
HOWEVER, I found no example as to how to "operate anything on file". I can extract the path within the zip file, but don't know how to actually read the contents of this txt file. (I wish to directly read its contents into memory -- a matrix --, without extraction, if possible.)
The other example is
zipFilename = 'example.zip';
zipFile = org.apache.tools.zip.ZipFile(zipFilename);
entries = zipFile.getEntries;
while entries.hasMoreElements
    entry = entries.nextElement;
    entryName = char(entry.getName);
    [~,~,ext] = fileparts(entryName);
    if strcmp(ext,'.txt')
        inputStream = zipFile.getInputStream(entry);
        %Read the contents of the file
        inputStream.close;
    end
end
zipFile.close
The original example contained code to extract the file, but I merely want to read it directly into memory. Again, I don't know how exactly to work with this inputStream.
Could anyone give me a suggestion with a MWE?
It might be a little late, but maybe someone can use it:
(the code was tested in Matlab R2018a)
zipFilename = 'example.zip';
zipFile = org.apache.tools.zip.ZipFile(zipFilename);
entries = zipFile.getEntries;
while entries.hasMoreElements
    entry = entries.nextElement;
    entryName = char(entry.getName);
    [~,~,ext] = fileparts(entryName);
    if strcmp(ext,'.txt')
        inputStream = zipFile.getInputStream(entry);
        %Read the contents of the file
        buffer = java.io.ByteArrayOutputStream();
        org.apache.commons.io.IOUtils.copy(inputStream, buffer);
        data = char(typecast(buffer.toByteArray(), 'uint8')');
        inputStream.close;
    end
end
zipFile.close

Reading files from an embedded ZIP archive

I have a ZIP archive that's embedded inside a larger file. I know the archive's starting offset within the larger file and its length.
Are there any Java libraries that would enable me to directly read the files contained within the archive? I am thinking along the lines of ZipFile.getInputStream(). Unfortunately, ZipFile doesn't work for this use case since its constructors require a standalone ZIP file.
For performance reasons, I cannot copy the ZIP archive into a separate file before opening it.
edit: Just to be clear, I do have random access to the file.
I've come up with a quick hack (which needs to get sanitized here and there), but it reads the contents of files from a ZIP archive which is embedded inside a TAR. It uses Java6, FileInputStream, ZipEntry and ZipInputStream. 'Works on my local machine':
final FileInputStream ins = new FileInputStream("archive.tar");
// Zip starts at 0x1f6400, size is not needed
long toSkip = 0x1f6400;
// Safe skipping
while (toSkip > 0)
    toSkip -= ins.skip(toSkip);
final ZipInputStream zipin = new ZipInputStream(ins);
ZipEntry ze;
while ((ze = zipin.getNextEntry()) != null)
{
    // Note: getSize() returns -1 when the size is not known, which would break this allocation
    final byte[] content = new byte[(int) ze.getSize()];
    int offset = 0;
    while (offset < content.length)
    {
        final int read = zipin.read(content, offset, content.length - offset);
        if (read == -1)
            break;
        offset += read;
    }
    // DEBUG: print out ZIP entry name and filesize
    System.out.println(ze + ": " + offset);
}
zipin.close();
1. Create a FileInputStream: fis = new FileInputStream(..);
2. Position it at the start of the embedded ZIP file: fis.skip(offset);
3. Open a ZipInputStream on it: new ZipInputStream(fis)
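
The steps above can be sketched roughly as follows. This is only an illustration under the assumption that sequential access is acceptable. EmbeddedZipReader and listEntries are hypothetical names, and the skip loop guards against skip() returning 0:

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class EmbeddedZipReader {
    // Lists the entry names of a ZIP that starts at `offset` inside the file at `containerPath`.
    static List<String> listEntries(String containerPath, long offset) throws IOException {
        List<String> names = new ArrayList<>();
        try (InputStream in = new FileInputStream(containerPath)) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    // skip() made no progress: consume one byte to detect end of file
                    if (in.read() == -1) {
                        throw new EOFException("offset is beyond the end of the file");
                    }
                    skipped = 1;
                }
                remaining -= skipped;
            }
            try (ZipInputStream zin = new ZipInputStream(in)) {
                ZipEntry entry;
                while ((entry = zin.getNextEntry()) != null) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```

ZipInputStream reads entries sequentially from their local headers and does not consult the central directory, so the archive length is not needed here and trailing data after the archive is simply never read.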
I suggest using TrueZIP, it provides file system access to many kinds of archives. It worked well for me in the past.
If you're using Java SE 7, it provides a zip file system which allows you to read/write files in the zip directly: http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
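
A minimal sketch of reading one entry through the zip file system, assuming the archive is a standalone file (so it does not by itself cover the embedded case from the question). ZipFsRead and readEntry are hypothetical names:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

final class ZipFsRead {
    // Reads the entry `entryName` of the ZIP at `zipPath` as UTF-8 text.
    static String readEntry(Path zipPath, String entryName) throws IOException {
        // The cast selects the (Path, ClassLoader) overload, which is ambiguous with a bare null on newer JDKs
        try (FileSystem zipFs = FileSystems.newFileSystem(zipPath, (ClassLoader) null)) {
            byte[] bytes = Files.readAllBytes(zipFs.getPath(entryName));
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }
}
```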
I think apache commons compress may help you.
There is a class org.apache.commons.compress.archivers.zip.ZipArchiveEntry, which extends java.util.zip.ZipEntry.
It has a method getDataOffset(), that can get the offset of data stream within the archive file.
7-zip-JavaBinding is a Java wrapper for the 7-zip C++ library.
The code snippets page in particular has some nice examples including printing a list of items in an archive, extracting a single file and opening multi-part archives.
Check whether zip4j helps you or not.
You can try PartInputStream to read zip file as per your use case.
I think it is better to create a temporary ZIP file and then access that.

Preserving file checksum after extract from zip in java

This is what I'm trying to accomplish:
1) Calculate the checksum of all files to be added to a zip file. Currently using apache commons io as follows:
final Checksum oChecksum = new Adler32();
...
//for every file iFile in folder
long lSum = (FileUtils.checksum(iFile, oChecksum)).getValue();
//store this checksum in a log
2) Compress the folder processed as a zip using the Ant zip task.
3) Extract files from the zip one by one to the specified folder (using both commons io and compression for this), and calculate the checksum of the extracted file:
final Checksum oChecksum = new Adler32();
...
ZipFile myZip = new ZipFile("test.zip");
ZipArchiveEntry zipEntry = myZip.getEntry("checksum.log"); //reads the filename from the log
BufferedInputStream myInputStream = new BufferedInputStream(myZip.getInputStream(zipEntry));
File destFile = new File("/mydir", zipEntry.getName());
destFile.createNewFile();
FileUtils.copyInputStreamToFile(myInputStream, destFile);
long newChecksum = FileUtils.checksum(destFile, oChecksum).getValue();
The problem I have is that the value from newChecksum doesn't match the one from the original file. The files' sizes match on disk. Funny thing is that if I run cksum or md5sum commands on both files directly on a terminal, these are the same for both files. The mismatch occurs only from java.
Is this the correct way to approach it or is there any way to preserve the checksum value after extraction?
I also tried using a CheckedInputStream but this also gets me different values from java.
EDIT: This seems related to the Adler32 object used (pre-zip vs unzip checks). If I do "new Adler32()" in the unzip check for every file instead of reusing the same Adler32 for all, I get the correct result.
Are you trying to compute one checksum for all files concatenated? If yes, you need to make sure you're reading them in the same order you checksummed them.
If no, you need to call checksum.reset() between computing the checksum for each file. You'll notice (if you look at the source) that Adler32 is stateful, which means you're computing the checksum of the file plus all the preceding ones during part one.
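
To illustrate the reset() point, here is a small sketch that reuses one Adler32 for several inputs. It works on in-memory byte arrays rather than files to stay self-contained. ChecksumReset and checksumEach are hypothetical names:

```java
import java.util.zip.Adler32;
import java.util.zip.Checksum;

final class ChecksumReset {
    // Computes one checksum per item while sharing a single Checksum instance.
    static long[] checksumEach(byte[][] items) {
        Checksum checksum = new Adler32();
        long[] sums = new long[items.length];
        for (int i = 0; i < items.length; i++) {
            checksum.reset(); // discard the state left over from the previous item
            checksum.update(items[i], 0, items[i].length);
            sums[i] = checksum.getValue();
        }
        return sums;
    }
}
```

Without the reset() call, the second value would be the checksum of the first and second item concatenated, which matches the mismatch described in the question.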

zip a folder structure using java

I am trying to zip the following file structure on my machine,
parent/
parent/test1
parent/test1/image1.jpeg
parent/test2
The problem is that I can't zip the above file structure using Java. I googled and found the following code sample, but it only zips the files directly inside a given folder.
File inFolder=new File("out");
File outFolder=new File("Out.zip");
ZipOutputStream out = new ZipOutputStream(new
    BufferedOutputStream(new FileOutputStream(outFolder)));
BufferedInputStream in = null;
byte[] data = new byte[1000];
String files[] = inFolder.list();
for (int i=0; i<files.length; i++)
{
    in = new BufferedInputStream(new FileInputStream
        (inFolder.getPath() + "/" + files[i]), 1000);
    out.putNextEntry(new ZipEntry(files[i]));
    int count;
    while((count = in.read(data,0,1000)) != -1)
    {
        out.write(data, 0, count);
    }
    out.closeEntry();
}
out.flush();
out.close();
In the above code, out is a folder and it needs to contain some files. The folder cannot be empty, otherwise a java.util.zip.ZipException is thrown, and it cannot contain subfolders, even ones with files inside them (e.g. out\newfolder\image.jpeg), otherwise a java.io.FileNotFoundException: out\newfolder (Access is denied) is thrown.
In my case I am constructing the above file structure by querying a database, so there can sometimes be empty folders within the folder structure.
Can some one please tell me a solution?
Thank You.
What is probably happening is that you're trying to treat every entry as a FileInputStream. However, for a directory, this is not true. Since the path is not to a file, when you try to read it, a FileNotFoundException is thrown. For directories, you still want to create the ZipEntry, but instead of trying to read in any data, just skip it and move on to the next path.
Write two methods. The first one takes the directory path, creates a zip stream and calls a second method, which copies files to the zip stream and calls itself recursively for directories, as below:
open an entry in the zip stream for the given directory
list the files and directories in the given directory and loop through them
if an item is a file, open an entry, copy the file content to the entry and close it
if an item is a directory, call this method again, passing the zip stream
close the entry.
The first method closes the zip stream.
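
The two-method approach might look roughly like the sketch below. It is an illustration under assumptions, not the answerer's code. FolderZipper, zipFolder and addDirectory are hypothetical names. The directory entry is closed right away, because ZipOutputStream keeps only one entry open at a time:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class FolderZipper {
    // First method: opens the zip stream, starts the recursion and closes the stream.
    static void zipFolder(File dir, File zipFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipFile))) {
            addDirectory(dir, dir.getName() + "/", zos);
        }
    }

    // Second method: writes an entry for the directory itself, then handles its children.
    private static void addDirectory(File dir, String entryPath, ZipOutputStream zos) throws IOException {
        zos.putNextEntry(new ZipEntry(entryPath)); // a trailing "/" marks a directory entry
        zos.closeEntry();
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or it could not be listed
        }
        for (File child : children) {
            if (child.isDirectory()) {
                addDirectory(child, entryPath + child.getName() + "/", zos);
            } else {
                zos.putNextEntry(new ZipEntry(entryPath + child.getName()));
                Files.copy(child.toPath(), zos); // stream the file content into the entry
                zos.closeEntry();
            }
        }
    }
}
```

Because every directory gets its own entry, empty folders such as parent/test2 are preserved in the archive.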

When creating a zip archive, what constitutes a duplicate entry

In a Java web application I am creating a zip file from various in-memory files (stored as byte[]).
Here's the key bit of code:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ZipOutputStream zos = new ZipOutputStream(baos);
for (//each member of a collection of objects) {
    PDFDocument pdfDocument = //generate PDF for this member of the collection;
    ZipEntry entry = new ZipEntry(pdfDocument.getFileName());
    entry.setSize(pdfDocument.getBody().length);
    zos.putNextEntry(entry);
    zos.write(pdfDocument.getBody());//pdfDocument.getBody() returns byte[]
    zos.closeEntry();
}
zos.close();
The problem: I'm sometimes getting a "ZipException: duplicate entry" when doing the "putNextEntry()" line.
The PDF files themselves will certainly be different, but they may have the same name ("PDF_File_for_John_Smith.pdf"). Is a name collision sufficient to cause this exception?
You can't store 2 entries with the same name in a zip archive (in the same folder), much like you can't have 2 files with the same name in the same folder in a filesystem.
Edit: while technically the zip file format allows this, the Java API for dealing with ZIP archives does not.
Yes -- you can use a directory structure inside your ZIP file if you need to hold multiple files with the same file name.
I believe so. Zip was originally intended to archive a directory structure, so it expects filenames to be unique. You could add directories to keep your files separated (and provide extra information to differentiate them, if you want).
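
Besides adding directories, another way to avoid the collision is to rename repeated names before calling putNextEntry(). The helper below is a sketch of that alternative, not something the answers propose. UniqueEntryNames and next are hypothetical names:

```java
import java.util.HashSet;
import java.util.Set;

final class UniqueEntryNames {
    private final Set<String> used = new HashSet<>();

    // Returns `name` if it has not been handed out yet, otherwise a variant with " (n)" before the extension.
    String next(String name) {
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        String candidate = name;
        // Set.add returns false when the candidate is already taken
        for (int n = 2; !used.add(candidate); n++) {
            candidate = base + " (" + n + ")" + ext;
        }
        return candidate;
    }
}
```

In the loop from the question this would be used as new ZipEntry(names.next(pdfDocument.getFileName())), with one UniqueEntryNames instance (here called names) per archive.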
