I am creating a program that extracts a zip archive and then inserts the files into a database. Every so often I get the error
java.lang.Exception: java.io.EOFException: Unexpected end of ZLIB input stream
I cannot pinpoint the reason for this, as the extraction code is pretty much the same as all the other code you can find on the web. My code is as follows:
public void extract(String zipName, InputStream content) throws Exception {
    int BUFFER = 2048;
    //create the zipinputstream
    ZipInputStream zis = new ZipInputStream(content);
    //Get the name of the zip
    String containerName = zipName;
    //container for the zip entry
    ZipEntry entry;
    // Process each entry
    while ((entry = zis.getNextEntry()) != null) {
        //get the entry file name
        String currentEntry = entry.getName();
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // establish buffer for writing file
            byte[] data = new byte[BUFFER];
            int currentByte;
            // read and write until last byte is encountered
            while ((currentByte = zis.read(data, 0, BUFFER)) != -1) {
                baos.write(data, 0, currentByte);
            }
            baos.flush(); //flush the buffer
            //this method inserts the file into the database
            insertZipEntry(baos.toByteArray());
            baos.close();
        }
        catch (Exception e) {
            System.out.println("ERROR WITHIN ZIP " + containerName);
        }
    }
}
This is probably caused by this JVM bug (JVM-6519463).
I previously had about one or two errors per 1000 randomly created documents. I applied the proposed solution (catch the EOFException and do nothing with it) and I have no more errors.
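For reference, a minimal sketch of that workaround applied to an extraction loop like the one in the question (readEntry is a hypothetical helper name; the buffer size and stream handling mirror the question's code):

import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.ZipInputStream;

// Reads the remainder of the current zip entry, swallowing the spurious
// EOFException so a truncated ZLIB stream does not abort the whole extraction.
static byte[] readEntry(ZipInputStream zis) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] data = new byte[2048];
    int n;
    try {
        while ((n = zis.read(data, 0, data.length)) != -1) {
            baos.write(data, 0, n);
        }
    } catch (EOFException e) {
        // Workaround: treat the premature end of stream as end of entry
        // and keep whatever bytes were already read.
    }
    return baos.toByteArray();
}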
I would say you are occasionally being given truncated Zip files to process. Check upstream.
I had the same exception, and the problem was in the compressing method, not the extracting one. I did not close the current entry with zos.closeEntry() after writing to the output stream. Without that, compressing worked fine but I got this exception while extracting.
public static byte[] zip(String outputFilename, byte[] output) {
    try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
         ZipOutputStream zos = new ZipOutputStream(baos)) {
        zos.putNextEntry(new ZipEntry(outputFilename));
        zos.write(output, 0, output.length);
        zos.closeEntry(); // this line must be here
        zos.finish();     // write the zip's central directory before grabbing the bytes
        return baos.toByteArray();
    } catch (IOException e) {
        // handle the exception; return an empty array so the method always returns a value
        return new byte[0];
    }
}
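Calling it is then straightforward, e.g. (filename and payload made up for illustration):

byte[] zipped = zip("notes.txt", "hello world".getBytes());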
Never attempt to read more bytes than the entry contains. Call ZipEntry.getSize() to get the actual size of the entry, then use this value to keep track of the number of bytes remaining in the entry while reading from it (note that getSize() returns -1 when the size is not known). See below:
try {
    ...
    int bytesLeft = (int) entry.getSize();
    while (bytesLeft > 0 && (currentByte = zis.read(data, 0, Math.min(BUFFER, bytesLeft))) != -1) {
        ...
    }
    ...
}
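Spelled out a little more (a sketch only, reusing the entry, zis, data, BUFFER and baos names from the question; the fallback branch covers entries whose size is not recorded):

int bytesLeft = (int) entry.getSize();
if (bytesLeft < 0) {
    // size unknown: fall back to reading until end of entry
    int n;
    while ((n = zis.read(data, 0, BUFFER)) != -1) {
        baos.write(data, 0, n);
    }
} else {
    int n;
    while (bytesLeft > 0 && (n = zis.read(data, 0, Math.min(BUFFER, bytesLeft))) != -1) {
        baos.write(data, 0, n);
        bytesLeft -= n; // track how many bytes of the entry remain
    }
}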
I am trying to write the data that I receive from a socket to a file. I store the data in an array, but when I write it out, the file gets too big...
I think it is caused by using a big array, as I don't know the length of the data stream...
But checking the write method, it is stated that write(byte[] b) writes b.length bytes from the specified byte array to this file output stream,
so write() uses the length of the array, and that length is 2000...
How can I know the length of the data that will actually be written?
...
byte[] Rbuffer = new byte[2000];
dis = new DataInputStream(socket.getInputStream());
dis.read(Rbuffer);
writeSDCard.writeToSDFile(Rbuffer);
...
void writeToSDFile(byte[] inputMsg){
    File root = android.os.Environment.getExternalStorageDirectory();
    File dir = new File(root.getAbsolutePath() + "/download");
    if (!(dir.exists())) {
        dir.mkdirs();
    }
    Log.d("WriteSDCard", "Start writing");
    File file = new File(dir, "myData.txt");
    try {
        FileOutputStream f = new FileOutputStream(file, true);
        f.write(inputMsg);
        f.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        Log.i(TAG, "******* File not found. Did you" +
                " add a WRITE_EXTERNAL_STORAGE permission to the manifest?");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
read() returns the number of bytes that were read, or -1. You are ignoring both possibilities, and assuming that it filled the buffer. All you have to do is store the result in a variable, check for -1, and otherwise pass it to the write() method.
Actually you should pass the input stream to your method, and use a loop after creating the file:
int count;
byte[] buffer = new byte[8192];
while ((count = in.read(buffer)) > 0)
{
    out.write(buffer, 0, count);
}
Your statement in a now-deleted comment that a new input stream is created per packet is not correct.
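Put together, a sketch of what writeToSDFile could look like if it took the stream directly (paths and file name carried over from the question; error handling kept minimal):

void writeToSDFile(InputStream in) throws IOException {
    File root = android.os.Environment.getExternalStorageDirectory();
    File dir = new File(root.getAbsolutePath() + "/download");
    if (!dir.exists()) {
        dir.mkdirs();
    }
    File file = new File(dir, "myData.txt");
    try (FileOutputStream out = new FileOutputStream(file, true)) {
        byte[] buffer = new byte[8192];
        int count;
        // copy whatever the socket delivers, however many bytes each read() returns
        while ((count = in.read(buffer)) > 0) {
            out.write(buffer, 0, count);
        }
    }
}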
I want to read files as byte arrays and realised that the number of bytes read varies depending on the method used. Here is the relevant code:
public byte[] readResource() {
    try (InputStream is = getClass().getClassLoader().getResourceAsStream(FILE_NAME)) {
        int available = is.available();
        byte[] result = new byte[available];
        is.read(result, 0, available);
        return result;
    } catch (Exception e) {
        log.error("Failed to load resource '{}'", FILE_NAME, e);
    }
    return new byte[0];
}

public byte[] readFile() {
    File file = new File(FILE_PATH + FILE_NAME);
    try (InputStream is = new FileInputStream(file)) {
        int available = is.available();
        byte[] result = new byte[available];
        is.read(result, 0, available);
        return result;
    } catch (Exception e) {
        log.error("Failed to load file '{}'", FILE_NAME, e);
    }
    return new byte[0];
}
Calling File.length() and reading with the FileInputStream returns the correct length of 21566 bytes for the given test file, whereas reading the file as a resource returns 21622 bytes.
Does anyone know why I get different results and how to fix it so that readResource() returns the correct result?
Why do getResourceAsStream() and reading the file with FileInputStream return arrays of different lengths?
Because you're misusing the available() method in a way that is specifically warned against in the Javadoc:
"It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream."
and
Does anyone know why I get different results and how to fix it so that readResource() returns the correct result?
Read in a loop until end of stream.
According to the API docs of InputStream, InputStream.available() does not return the size of the resource - it returns
an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking
To get the size of a resource from a stream, you need to fully read the stream, and count the bytes read.
To read the stream and return the contents as a byte array, you could do something like this:
try (InputStream is = getClass().getClassLoader().getResourceAsStream(FILE_NAME);
     ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
    byte[] buffer = new byte[4096];
    int bytesRead = 0;
    while ((bytesRead = is.read(buffer)) != -1) {
        bos.write(buffer, 0, bytesRead);
    }
    return bos.toByteArray();
}
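On Java 9 and later, the same read-until-end-of-stream behaviour is available built in, so the loop can be replaced with:

try (InputStream is = getClass().getClassLoader().getResourceAsStream(FILE_NAME)) {
    return is.readAllBytes();
}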
I have a zip file, file.zip, that is compressed. Is there a way to change the compression method of the file to store (no compression)?
I have written and tried the following code and it works, but I will be running this in an environment where memory and storage are limited and there might not be enough space. I am using the zip4j library.
This code extracts the input zip to a folder, then re-zips it with the store compression method. The problem with this is that at one point during execution there are three copies of the zip's contents on storage, which is a problem because space is a limitation.
try {
    String zip = "input.zip";
    final ZipFile zipFile = new ZipFile(zip);
    zipFile.extractAll("dir");
    File file = new File("dir");
    ZipParameters params = new ZipParameters();
    params.setCompressionMethod(Zip4jConstants.COMP_STORE);
    params.setIncludeRootFolder(false);
    ZipFile output = new ZipFile(new File("out.zip"));
    output.addFolder(file, params);
    file.delete();
    return "Done";
} catch (Exception e) {
    e.printStackTrace();
    return "Error";
}
So any suggestions on another way to approach this problem? Or maybe some speed or memory optimizations to my current code?
As an alternative, we can read the files from the zip one by one into memory or into a temp file, like this:
ZipInputStream is = ...
ZipOutputStream os = ...
os.setMethod(ZipOutputStream.STORED);
int bSize = ... // calculate max available size
byte[] buf = new byte[bSize];
for (ZipEntry e; (e = is.getNextEntry()) != null;) {
    ZipEntry e2 = new ZipEntry(e.getName());
    e2.setMethod(ZipEntry.STORED);
    int n = is.read(buf);
    if (is.read() == -1) {
        // in memory
        e2.setSize(n);
        e2.setCompressedSize(n);
        CRC32 crc = new CRC32();
        crc.update(buf, 0, n);
        e2.setCrc(crc.getValue());
        os.putNextEntry(e2);
        os.write(buf, 0, n);
        is.closeEntry();
        os.closeEntry();
    } else {
        // use tmp file
    }
}
Reading into memory is expected to be faster than going through a temp file.
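For completeness, here is one way the "use tmp file" branch could be filled in (a sketch only; it assumes the probe read is captured first, e.g. int probe = is.read(); with the branch testing probe == -1, because in the else case that probe byte belongs to the entry and must be written as well; the other names follow the snippet above):

// else branch: the entry is larger than buf, so spool the rest to a temp file
File tmp = File.createTempFile("entry", null);
CRC32 crc = new CRC32();
crc.update(buf, 0, n);   // bytes already buffered
crc.update(probe);       // the single byte consumed by the size probe
long size = n + 1;
try (FileOutputStream fos = new FileOutputStream(tmp)) {
    fos.write(buf, 0, n);
    fos.write(probe);
    int k;
    while ((k = is.read(buf)) != -1) {
        crc.update(buf, 0, k);
        fos.write(buf, 0, k);
        size += k;
    }
}
e2.setSize(size);
e2.setCompressedSize(size);
e2.setCrc(crc.getValue());
os.putNextEntry(e2);
try (FileInputStream in = new FileInputStream(tmp)) {
    int k;
    while ((k = in.read(buf)) != -1) {
        os.write(buf, 0, k);
    }
}
is.closeEntry();
os.closeEntry();
tmp.delete();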
I finally got it after a few hours by playing around with input streams.
try {
    String zip = "input.zip"; // path to the input archive
    File output = new File("out.zip");
    byte[] read = new byte[1024];
    ZipInputStream zis = new ZipInputStream(new FileInputStream(zip));
    ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(output));
    ZipEntry ze;
    zos.setLevel(ZipOutputStream.STORED);
    zos.setMethod(ZipOutputStream.STORED);
    while ((ze = zis.getNextEntry()) != null) {
        int l;
        zos.putNextEntry(ze);
        System.out.println("WRITING: " + ze.getName());
        while ((l = zis.read(read)) > 0) {
            zos.write(read, 0, l);
        }
        zos.closeEntry();
    }
    zis.close();
    zos.close();
    return "Done";
} catch (Exception e) {
    e.printStackTrace();
    return "Error";
}
Thanks so much for your answer Evgeniy Dorofeev, I literally just got my answer when I read yours! However, I prefer my method as it only takes up a maximum of 1 MB in memory (Am I right?). Also, I tried executing your code and only the first file in the input zip was transferred.
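One caveat worth noting (an observation, not from the thread above): with java.util.zip, a STORED entry must have its size, compressed size and CRC-32 set before putNextEntry(), so re-using the ZipEntry objects returned by ZipInputStream only works when those fields are present in the source archive's local headers. If getSize() returns -1 for an entry, you would have to buffer the entry data and fill the fields in yourself, roughly like this (entryBytes stands for the fully buffered entry data):

CRC32 crc = new CRC32();
crc.update(entryBytes);
ZipEntry stored = new ZipEntry(ze.getName());
stored.setMethod(ZipEntry.STORED);
stored.setSize(entryBytes.length);
stored.setCompressedSize(entryBytes.length);
stored.setCrc(crc.getValue());
zos.putNextEntry(stored);
zos.write(entryBytes);
zos.closeEntry();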
I need to read multiple small files and append them into a bigger single file.
Base64OutputStream baos = new Base64OutputStream(new FileOutputStream(outputFile, true));
for (String fileLocation : fileLocations) {
    InputStream fis = null;
    try {
        fis = new FileInputStream(new File(fileLocation));
        int bytesRead = 0;
        byte[] buf = new byte[65536];
        while ((bytesRead = fis.read(buf)) != -1) {
            if (bytesRead > 0) baos.write(buf, 0, bytesRead);
        }
    }
    catch (Exception e) {
        logger.error(e.getMessage());
    }
    finally {
        try {
            if (fis != null)
                fis.close();
        }
        catch (Exception e) {
            logger.error(e.getMessage());
        }
    }
}
All pretty standard, but I'm finding that, unless I open a new baos per input file (include it inside the loop), all the files following the first one written by baos are wrong (incorrect output).
The questions:
I've been told that opening and closing an output stream back and forth for the same resource is not good practice. Why?
Why does using a single output stream not deliver the same result as multiple separate ones?
Perhaps the problem is that you are assuming that base64-encoding the concatenation of several files should give the same result as concatenating the base64 encoding of each file? That's not necessarily the case; base64 encodes groups of three consecutive input bytes into 4 ASCII characters, so unless you know that each file has a size that is a multiple of three, the base64 encodings will produce completely different output.
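A quick way to see this (a standalone illustration using java.util.Base64 rather than the Commons Codec stream from the question):

import java.util.Base64;

public class Base64Concat {
    public static void main(String[] args) {
        byte[] a = "1234".getBytes();        // 4 bytes, not a multiple of 3
        byte[] b = "5678".getBytes();
        byte[] both = "12345678".getBytes();

        String separate = Base64.getEncoder().encodeToString(a)
                        + Base64.getEncoder().encodeToString(b);
        String together = Base64.getEncoder().encodeToString(both);

        System.out.println(separate); // MTIzNA==NTY3OA==
        System.out.println(together); // MTIzNDU2Nzg=
    }
}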
I tried to create byte-array chunks from a file while another process was still using the file for writing. Actually, I am storing video into a file and I would like to create chunks from that same file while recording.
The following method was supposed to read blocks of bytes from file:
private byte[] getBytesFromFile(File file) throws IOException {
    InputStream is = new FileInputStream(file);
    long length = file.length();
    int numRead = 0;
    byte[] bytes = new byte[(int) length - mReadOffset];
    numRead = is.read(bytes, mReadOffset, bytes.length - mReadOffset);
    if (numRead != (bytes.length - mReadOffset)) {
        throw new IOException("Could not completely read file " + file.getName());
    }
    mReadOffset += numRead;
    is.close();
    return bytes;
}
But the problem is that all the array elements are set to 0, and I guess it is because the writing process locks the file.
I would be very thankful if any of you could show another way to create file chunks while writing into the file.
Solved the problem:
private void getBytesFromFile(File file) throws IOException {
    FileInputStream is = new FileInputStream(file); //videorecorder stores video to file
    java.nio.channels.FileChannel fc = is.getChannel();
    java.nio.ByteBuffer bb = java.nio.ByteBuffer.allocate(10000);
    int chunkCount = 0;
    byte[] bytes;
    while (fc.read(bb) >= 0) {
        bb.flip();
        //copy only the bytes actually read, not the whole 10000-byte backing array
        bytes = java.util.Arrays.copyOf(bb.array(), bb.limit());
        //save this part of the file as a chunk; mRecordingFile is the (String) path to the file
        storeByteArrayToFile(bytes, mRecordingFile + "." + chunkCount);
        chunkCount++;
        bb.clear();
    }
    is.close();
}
private void storeByteArrayToFile(byte[] bytesToSave, String path) throws IOException {
    FileOutputStream fOut = new FileOutputStream(path);
    try {
        fOut.write(bytesToSave);
    }
    catch (Exception ex) {
        Log.e("ERROR", ex.getMessage());
    }
    finally {
        fOut.close();
    }
}
If it were me, I would have it chunked by the process/thread writing to the file. This is how Log4j seems to do it, at any rate. It should be possible to make an OutputStream which automatically starts writing to a new file every N bytes.
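A minimal sketch of such a chunking stream (class name, file naming scheme and size threshold are made up for illustration; a real recorder would also need thread-safety and better error handling):

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Writes to base.0, base.1, ... switching to a new file every maxBytes bytes.
public class ChunkedOutputStream extends OutputStream {
    private final String basePath;
    private final long maxBytes;
    private OutputStream current;
    private long written;
    private int chunkCount;

    public ChunkedOutputStream(String basePath, long maxBytes) {
        this.basePath = basePath;
        this.maxBytes = maxBytes;
    }

    @Override
    public void write(int b) throws IOException {
        if (current == null || written >= maxBytes) {
            roll(); // start the next chunk file
        }
        current.write(b);
        written++;
    }

    private void roll() throws IOException {
        if (current != null) {
            current.close();
        }
        current = new FileOutputStream(basePath + "." + chunkCount++);
        written = 0;
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
        }
    }
}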