Zip files based on InputStreams - java

I have a method to zip files in Java:
public void compress(File[] inputFiles, OutputStream outputStream) {
Validate.notNull(inputFiles, "Input files are required");
Validate.notNull(outputStream, "Output stream is required");
int BUFFER = 2048;
BufferedInputStream origin = null;
ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(outputStream));
byte data[] = new byte[BUFFER];
for (File f : inputFiles) {
FileInputStream fi;
try {
fi = new FileInputStream(f);
} catch (FileNotFoundException e) {
throw new RuntimeException("Input file not found", e);
}
origin = new BufferedInputStream(fi, BUFFER);
ZipEntry entry = new ZipEntry(f.getName());
try {
out.putNextEntry(entry);
} catch (IOException e) {
throw new RuntimeException(e);
}
int count;
try {
while ((count = origin.read(data, 0, BUFFER)) != -1) {
out.write(data, 0, count);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
try {
origin.close();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
try {
out.close();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
As you can see, the parameter inputFiles is an array of File objects. This all works, but I'd like to accept a collection of InputStream objects instead, to make it more flexible.
But then I have the problem that when creating a new ZipEntry (as in the code above)
ZipEntry entry = new ZipEntry(f.getName());
I don't have a filename to give as parameter.
How should I solve this? Maybe a Map with (fileName, inputStream) pairs?
Any thoughts on this are appreciated!
Thanks,
Nathan

I think your suggestion of a Map<String, InputStream> is a good solution.
Just a side note: remember to close the input streams after you are done.
If you want to make it more "fancy", you can always create an interface:
interface ZipOutputInterface {
String getName();
InputStream getInputStream() throws IOException;
}
And have it implemented differently in your different cases, for instance for File:
class FileZipOutputInterface implements ZipOutputInterface {
private final File file;
public FileZipOutputInterface(File file) {
this.file = file;
}
public String getName() {
return file.getName();
}
public InputStream getInputStream() throws IOException {
// The FileInputStream constructor throws FileNotFoundException, a subclass of IOException
return new FileInputStream(file);
}
}

I think the map is good. Just pay attention to the type of map you are using if you wish to preserve the original order of the files in your ZIP; in that case use a LinkedHashMap.
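To make the Map suggestion concrete, here is a minimal sketch of what the rewritten method could look like. The class name StreamZipper and the use of try-with-resources are my own choices, not something from the question, and note that this version also closes the target stream when it finishes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; the name is illustrative only.
final class StreamZipper {
    // Writes one zip entry per (entry name, stream) pair, in the map's iteration order.
    static void compress(Map<String, InputStream> inputs, OutputStream target) throws IOException {
        byte[] buffer = new byte[2048];
        // Closing the ZipOutputStream finishes the archive and closes 'target' too.
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            for (Map.Entry<String, InputStream> input : inputs.entrySet()) {
                zip.putNextEntry(new ZipEntry(input.getKey()));
                // Each input stream is closed as soon as it has been copied.
                try (InputStream in = input.getValue()) {
                    int count;
                    while ((count = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, count);
                    }
                }
                zip.closeEntry();
            }
        }
    }
}
```

Passing a LinkedHashMap keeps the entries in insertion order; with a plain HashMap the order inside the archive is unspecified.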

Related

Concurrent use of ZipOutputStream uses 100% of CPU

I am working on a feature for an LMS to download a bunch of selected files and folders as a zip generated on the fly. I have used ZipOutputStream to prevent OutOfMemory issues.
The feature works nicely, but we ran a stress test, and when several users download zips at the same time (let's say 10 users zipping about 100 MB each), 4 out of 4 CPUs reach 100% load until the zips are created. Our system admins think that this is not acceptable.
I wonder if there is some mechanism to make ZipOutputStream use fewer system resources, even if it takes more time to finish.
My current code:
protected void compressResource(ZipOutputStream zipOut, String collectionId, String rootFolderName, String resourceId) throws Exception
{
if (ContentHostingService.isCollection(resourceId))
{
try
{
ContentCollection collection = ContentHostingService.getCollection(resourceId);
List<String> children = collection.getMembers();
if(children != null)
{
for(int i = children.size() - 1; i >= 0; i--)
{
String child = children.get(i);
compressResource(zipOut,collectionId,rootFolderName,child);
}
}
}
catch (PermissionException e)
{
//Ignore
}
}
else
{
try
{
ContentResource resource = ContentHostingService.getResource(resourceId);
String displayName = isolateName(resource.getId());
displayName = escapeInvalidCharsEntry(displayName);
InputStream content = resource.streamContent();
byte data[] = new byte[1024 * 10];
BufferedInputStream bContent = null;
try
{
bContent = new BufferedInputStream(content, data.length);
String entryName = (resource.getContainingCollection().getId() + displayName);
entryName=entryName.replace(collectionId,rootFolderName+"/");
entryName = escapeInvalidCharsEntry(entryName);
ZipEntry resourceEntry = new ZipEntry(entryName);
zipOut.putNextEntry(resourceEntry); //A duplicate entry throws ZipException here.
int bCount = -1;
while ((bCount = bContent.read(data, 0, data.length)) != -1)
{
zipOut.write(data, 0, bCount);
}
try
{
zipOut.closeEntry();
}
catch (IOException ioException)
{
logger.error("IOException when closing zip file entry",ioException);
}
}
catch (IllegalArgumentException iException)
{
logger.error("IllegalArgumentException while creating zip file",iException);
}
catch (java.util.zip.ZipException e)
{
//Duplicate entry: ignore and continue.
try
{
zipOut.closeEntry();
}
catch (IOException ioException)
{
logger.error("IOException when closing zip file entry",ioException);
}
}
finally
{
if (bContent != null)
{
try
{
bContent.close();
}
catch (IOException ioException)
{
logger.error("IOException when closing zip file",ioException);
}
}
}
}
catch (PermissionException e)
{
//Ignore
}
}
}
Thanks in advance.
I have solved it with a simple hack suggested by @shmosel.
private static Semaphore mySemaphore= new Semaphore(ServerConfigurationService.getInt("content.zip.download.maxconcurrentdownloads",5),true);
(...)
ZipOutputStream zipOut = null;
try
{
mySemaphore.acquire();
ContentCollection collection = ContentHostingService.getCollection(collectionId);
(...)
zipOut.flush();
zipOut.close();
mySemaphore.release();
(...)
This is working in my test server. But if anybody has any objection or any extra advice, I will be happy to hear.
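One thing to watch with the semaphore approach: if zipping throws before release() is reached, the permit is never returned, and after enough failures no download can start. A minimal sketch of the usual acquire/try/finally pattern follows. ZipThrottle and runLimited are hypothetical names, and the limit of 5 is hard-coded here only to mirror the default used above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical wrapper; names and the fixed limit are assumptions for illustration.
final class ZipThrottle {
    // Fair semaphore, so waiting downloads are served in arrival order.
    private static final Semaphore PERMITS = new Semaphore(5, true);

    // Runs the zip job while holding one permit; the permit is returned even if the job throws.
    static <T> T runLimited(Callable<T> zipJob) throws Exception {
        PERMITS.acquire();
        try {
            return zipJob.call();
        } finally {
            PERMITS.release();
        }
    }

    // Exposed only so the behaviour can be observed.
    static int availablePermits() {
        return PERMITS.availablePermits();
    }
}
```

The acquire() call sits outside the try block on purpose: if it is interrupted, no permit was taken and nothing must be released.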

Creating zip file in memory out of byte[]. Zip file is always corrupted

I have a problem with a zip file I create. I am using Java 7. I tried to create a zip file out of byte arrays containing two or more Excel files. The application always finishes without any exceptions, so I thought everything was all right. But when I tried to open the zip file, Windows 7 showed an error message saying that the zip file may be corrupted. I couldn't open it and I have no idea why!
I googled this problem, but the code snippets I found look exactly the same as my implementation.
This is my code:
if (repsList.size() > 1)
{
String today = DateUtilities.convertDateToString(new Date(), "dd_MM_yyyy");
String prefix = "recs_" + today;
String suffix = ".zip";
ByteArrayOutputStream baos = null;
ZipOutputStream zos = null;
try
{
baos = new ByteArrayOutputStream();
zos = new ZipOutputStream(baos);
for (RepBean rep : repsList)
{
String filename = rep.getFilename();
ZipEntry entry = new ZipEntry(filename);
entry.setSize(rep.getContent().length);
zos.putNextEntry(entry);
zos.write(rep.getContent());
zos.closeEntry();
}
// this is the zip file as byte[]
reportContent = baos.toByteArray();
}
catch (UnsupportedEncodingException e)
{
...
}
catch (ZipException e) {
...
}
catch (IOException e)
{
...
}
finally
{
try
{
if (zos != null)
{
zos.close();
}
if (baos != null)
{
baos.close();
}
}
catch (IOException e)
{
// Nothing to do ...
e.printStackTrace();
}
}
}
try
{
response.setContentLength(reportContent.length);
response.getOutputStream().write(reportContent);
}
catch (IOException e)
{
...
}
finally
{
try
{
response.getOutputStream().flush();
response.getOutputStream().close();
}
catch (IOException e)
{
...
}
}
It must be a very simple mistake, but I cannot find it. It would be nice if you could help me with my problem.
Thanks a lot in advance.
You are converting the ByteArrayOutputStream to a byte[] before you have closed the ZipOutputStream. You must ensure zos is closed before you call baos.toByteArray(); the easiest way to ensure this is a try-with-resources construct:
try
{
// baos is declared outside the try-with-resources so it is still in scope afterwards
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try (ZipOutputStream zos = new ZipOutputStream(baos))
{
for (RepBean rep : repsList)
{
String filename = rep.getFilename();
ZipEntry entry = new ZipEntry(filename);
entry.setSize(rep.getContent().length);
zos.putNextEntry(entry);
zos.write(rep.getContent());
zos.closeEntry();
}
}
// zos is closed at this point, so this is the complete zip file as byte[]
reportContent = baos.toByteArray();
}
// catch blocks as before; finally is no longer required because the try-with-resources
// ensures the stream is closed
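If you want to see why the order matters, the following sketch takes the bytes once before and once after the ZipOutputStream is closed. The class and method names are made up for the demonstration. The early snapshot is shorter because the central directory and the end-of-archive record are only written when the stream is finished.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the original code.
final class CloseBeforeToByteArray {
    // Returns { bytes taken before close, bytes taken after close }.
    static byte[][] snapshots(String name, byte[] content) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] beforeClose;
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
            // Too early: the central directory has not been written yet.
            beforeClose = baos.toByteArray();
        }
        // zos is closed now, so this is the complete archive.
        byte[] afterClose = baos.toByteArray();
        return new byte[][] { beforeClose, afterClose };
    }
}
```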

Java Android - Still getting old file

I'm still receiving the first file my app generated.
At first I thought it was because the file already exists, so I wrote:
File file=new File(getCacheDir(), "Competition.xls");
if (file.exists()) {file.delete(); file =new File(getCacheDir(), "Competition.xls");}
But that didn't help; I still receive the first file that was made.
I'm new to working with files, so I decided to copy the entire method here. Sorry for the amount of text.
private void createFileTosend() {
InputStream inputStream = null;
FileOutputStream outputStream = null;
try {
File toSend=null;
try {
toSend = getFile();
} catch (WriteException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
inputStream = new FileInputStream(toSend);
outputStream = openFileOutput("Competition.xls",
Context.MODE_WORLD_READABLE | Context.MODE_APPEND);
byte[] buffer = new byte[1024];
int length = 0;
try {
while ((length = inputStream.read(buffer)) > 0){
outputStream.write(buffer, 0, length);
}
} catch (IOException ioe) {
/* ignore */
}
} catch (FileNotFoundException fnfe) {
/* ignore */
} finally {
try {
inputStream.close();
} catch (IOException ioe) {
/* ignore */
}
try {
outputStream.close();
} catch (IOException ioe) {
/* ignore */
}
}
}
public File getFile() throws IOException, WriteException{
File file=new File(getCacheDir(), "Competition.xls");
if (file.exists()) {file.delete(); file =new File(getCacheDir(), "Competition.xls");}
WritableWorkbook workbook = Workbook.createWorkbook(file);
//then goes long block with creating a .xls file which is not important
workbook.write();
workbook.close();
return file;
}
Please help me understand where the problem is.
You should never have a structure like:
catch(Exception ex ) {
//ignore (or log only)
}
Exceptions are there to tell you something went wrong. What you are doing is called (in French) "eating/hiding exceptions". You are losing the very important information that something went wrong.
You should always either rethrow the exception you catch to your caller, or handle it locally. At the very least, and this is poor practice, you should log it. But doing nothing is just plain wrong.
Here, let the method declare the exceptions and keep only the try/finally, for instance:
private void createFileTosend() throws IOException, WriteException {
InputStream inputStream = null;
FileOutputStream outputStream = null;
try {
File toSend = getFile();
inputStream = new FileInputStream(toSend);
// Note: MODE_APPEND appends to an existing file instead of overwriting it
outputStream = openFileOutput("Competition.xls",
Context.MODE_WORLD_READABLE | Context.MODE_APPEND);
byte[] buffer = new byte[1024];
int length = 0;
while ((length = inputStream.read(buffer)) > 0){
outputStream.write(buffer, 0, length);
}
} finally {
try {
if( inputStream != null ) {
inputStream.close();
}
} catch (IOException ioe) {
Log.e("createFileTosend", "Could not close input stream", ioe);
}
try {
if( outputStream != null ) {
outputStream.close();
}
} catch (IOException ioe) {
Log.e("createFileTosend", "Could not close output stream", ioe);
}
}
}
And now, when you call createFileTosend, do so inside a try/catch structure and show a toast message (or something similar) if you catch an exception.

How to manage streams of bytes and when to close the streams

I'm experiencing java.lang.OutOfMemoryError: Java heap space whenever I try to execute my code. However, if I close my streams in certain instances the error goes away, but because my streams are closing prematurely I'm missing data.
I'm very new to Java and I'm clearly not understanding how to manage the streams. How and when should I close streams?
private void handleFile(File source)
{
FileInputStream fis = null;
try
{
if(source.isFile())
{
fis = new FileInputStream(source);
handleFile(source.getAbsolutePath(), fis);
}
else if(source.isDirectory())
{
for(File file:source.listFiles())
{
if(file.isFile())
{
fis = new FileInputStream(file);
handleFile(file.getAbsolutePath(), fis);
}
else
{
handleFile(file);
}
}
}
}
catch(IOException ioe)
{
ioe.printStackTrace();
}
finally
{
try
{
if(fis != null) { fis.close(); }
}
catch(IOException ioe) { ioe.printStackTrace(); }
}
}
private void handleFile(String fileName, InputStream inputStream)
{
try
{
byte[] initialBytes = isToByteArray(inputStream);
byte[] finalBytes = initialBytes;
if(initialBytes.length == 0) return;
if(isBytesTypeB(initialBytes))
{
finalBytes = getBytesTypeB(initialBytes);
}
// Other similar method checks
// .....
map.put(fileName, finalBytes);
}
catch(IOException ioe)
{
ioe.printStackTrace();
}
}
private byte[] isToByteArray(InputStream inputStream) throws IOException
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int nRead;
while((nRead = inputStream.read(buffer)) != -1)
{
baos.write(buffer, 0, nRead);
}
return baos.toByteArray();
}
private boolean isBytesTypeB(byte[] fileBytes)
{
// Checks if these bytes match a particular type
if(BytesMatcher.matches(fileBytes, fileBytes.length))
{
return true;
}
return false;
}
private byte[] getBytesTypeB(byte[] fileBytes)
{
//decompress bytes
return decompressedBytes;
}
First of all, do not read entire streams into memory. Use buffers when reading and writing.
Use ByteArrayInputStream and ByteArrayOutputStream only if you're sure you'll be reading very small streams (whose data you will need to re-use for some operations) and it really makes sense to keep the data in memory. Otherwise, you will quickly (or unexpectedly) run out of memory.
Define the streams outside the try-catch block and close them in the finally block (if they are not null). For example:
void doSomeIOStuff() throws IOException
{
InputStream is = null;
try
{
is = new MyInputStream(...);
// Do stuff
}
catch (IOException ioExc)
{
// Either just inform (poor decision, but good for illustration):
ioExc.printStackTrace();
// Or re-throw to delegate further on:
throw new IOException(ioExc);
}
finally
{
if (is != null)
{
is.close();
}
}
}
This way your resources are always properly closed after use.
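On Java 7 and later the same guarantee can be written more compactly with try-with-resources. The sketch below is only an illustration with made-up names (StreamCopy, copy): it streams through a fixed-size buffer instead of loading everything into memory, and it closes both streams whether or not an exception occurs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical utility; not taken from the question's code.
final class StreamCopy {
    // Copies everything from source to sink and returns the number of bytes copied.
    static long copy(InputStream source, OutputStream sink) throws IOException {
        // Both streams are closed automatically when the block exits.
        try (InputStream in = source; OutputStream out = sink) {
            byte[] buffer = new byte[8192]; // memory use stays constant regardless of stream size
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```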
Out of curiosity, what should the handleFile(...) method really be doing?

How to check if a generated zip file is corrupted?

We have a piece of code that generates a zip file on our system. Everything is OK, but sometimes this zip file is reported as corrupted when opened with FilZip or WinZip.
So here is my question: how can we check programmatically whether a generated zip file is corrupted?
Here is the code we are using to generate our zip files:
try {
ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(tmpFile));
byte[] buffer = new byte[16384];
int counter = -1;
for (DigitalFile digitalFile : document.getDigitalFiles().getContent()) {
ZipEntry entry = new ZipEntry(digitalFile.getName());
FileInputStream fis = new FileInputStream(digitalFile.getFile());
try {
zos.putNextEntry(entry);
while ((counter = fis.read(buffer)) != -1) {
zos.write(buffer, 0, counter);
}
fis.close();
zos.closeEntry();
} catch (IOException ex) {
throw new OurException("It was not possible to read this file " + digitalFile.getId(), ex);
}
}
try {
zos.close();
} catch (IOException ex) {
throw new OurException("We couldn't close this stream", ex);
}
Is there anything we are doing wrong here?
EDIT:
Actually, the code above is absolutely OK. My problem was that I was sending the WRONG stream to my users. So, instead of opening a zip file they were opening something completely different. Mea culpa :(
BUT the main question remains: how can I verify programmatically that a given zip file is not corrupted?
You can use the ZipFile class to check your file :
static boolean isValid(final File file) {
ZipFile zipfile = null;
try {
zipfile = new ZipFile(file);
return true;
} catch (IOException e) {
return false;
} finally {
try {
if (zipfile != null) {
zipfile.close();
zipfile = null;
}
} catch (IOException e) {
}
}
}
I know it has been a while since this was posted. I used the code that all of you provided and came up with this. It works well for the actual question: checking whether a zip file is corrupted.
private boolean isValid(File file) {
ZipFile zipfile = null;
ZipInputStream zis = null;
try {
zipfile = new ZipFile(file);
zis = new ZipInputStream(new FileInputStream(file));
ZipEntry ze = zis.getNextEntry();
if(ze == null) {
return false;
}
while(ze != null) {
// if it throws an exception fetching any of the following then we know the file is corrupted.
zipfile.getInputStream(ze);
ze.getCrc();
ze.getCompressedSize();
ze.getName();
ze = zis.getNextEntry();
}
return true;
} catch (ZipException e) {
return false;
} catch (IOException e) {
return false;
} finally {
try {
if (zipfile != null) {
zipfile.close();
zipfile = null;
}
} catch (IOException e) {
return false;
} try {
if (zis != null) {
zis.close();
zis = null;
}
} catch (IOException e) {
return false;
}
}
}
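A variation on the same idea, as a rough sketch: instead of only asking for each entry's metadata, read every entry to the end through a ZipInputStream, so that any decompression error surfaces as an exception. As far as I know ZipInputStream also compares each entry's CRC once the entry has been read completely, but treat that as an assumption and test it against your own files. ZipCheck and isReadable are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

// Hypothetical checker; names are illustrative only.
final class ZipCheck {
    // Returns true if at least one entry exists and all entries can be read to the end.
    static boolean isReadable(InputStream rawZip) {
        byte[] buffer = new byte[8192];
        int entries = 0;
        try (ZipInputStream zis = new ZipInputStream(rawZip)) {
            while (zis.getNextEntry() != null) {
                // Drain the entry; corrupt data shows up as a ZipException (an IOException).
                while (zis.read(buffer) != -1) {
                    // contents are discarded on purpose
                }
                entries++;
            }
        } catch (IOException e) {
            return false;
        }
        // Like the code above, an archive without entries is treated as invalid.
        return entries > 0;
    }
}
```

For a file on disk you would pass new FileInputStream(file); the stream is closed by the method.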
I think you'll see a corresponding exception stack trace during zip file generation, so you probably want to enhance your exception handling.
In my implementation it looks like this; maybe it helps you:
//[...]
try {
FileInputStream fis = new FileInputStream(file);
BufferedInputStream bis = new BufferedInputStream(fis);
zos.putNextEntry(new ZipEntry(file.getName()));
try {
final byte[] buf = new byte[BUFFER_SIZE];
while (true) {
final int len = bis.read(buf);
if (len == -1) {
break;
}
zos.write(buf, 0, len);
}
zos.flush();
zos.closeEntry();
} finally {
try {
bis.close();
} catch (IOException e) {
LOG.debug("Buffered Stream closing failed");
} finally {
fis.close();
}
}
} catch (IOException e) {
throw new Exception(e);
}
//[...]
zos.close();
Perhaps swap the following two lines?
fis.close();
zos.closeEntry();
I can imagine that the closeEntry() will still read some data from the stream.
Your code is basically OK; try to find out which file is responsible for the corrupted zip file. Check whether digitalFile.getFile() always returns a valid and accessible argument for FileInputStream. Just add a bit of logging to your code and you will find out what's wrong.
new ZipFile(file)
reads the file a second time, so it duplicates effort and is probably not what you are looking for. Besides, it checks only a single file, while the question compresses n files.
Take a look at this: http://www.kodejava.org/examples/336.html
Create a checksum for your zip:
CheckedOutputStream checksum = new CheckedOutputStream(fos, new CRC32());
ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(checksum));
...
And when you finish the compression, print it:
System.out.println("Checksum : " + checksum.getChecksum().getValue());
You must do the same when reading the zip, with Java or other tools, and check whether the checksums match.
See https://stackoverflow.com/a/10689488/848072 for more information.
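Here is a small sketch of both halves of that idea, writing with a CheckedOutputStream and later recomputing the value with a CheckedInputStream. ZipChecksum, writeZip and checksumOf are hypothetical names, and the single-entry archive is just to keep the example short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; names are illustrative only.
final class ZipChecksum {
    // Writes a one-entry zip to 'target' and returns the CRC32 of every byte written.
    static long writeZip(OutputStream target, String name, byte[] content) throws IOException {
        CheckedOutputStream checked = new CheckedOutputStream(target, new CRC32());
        try (ZipOutputStream zos = new ZipOutputStream(checked)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        }
        // Read the value only after close, so the central directory is included.
        return checked.getChecksum().getValue();
    }

    // Recomputes the CRC32 over all bytes of an existing archive.
    static long checksumOf(InputStream in) throws IOException {
        try (CheckedInputStream checked = new CheckedInputStream(in, new CRC32())) {
            byte[] buffer = new byte[8192];
            while (checked.read(buffer) != -1) {
                // reading is enough; the stream updates the checksum itself
            }
            return checked.getChecksum().getValue();
        }
    }
}
```

If the two values differ, the bytes changed somewhere between writing and reading. Note that this detects accidental corruption in transit or storage; it says nothing about whether the archive was well formed in the first place.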
According to its Javadoc, ZipOutputStream.close() also closes the stream it wraps, so closing zos is normally enough. If you prefer to be explicit, keep a reference to the underlying stream:
FileOutputStream fos = new FileOutputStream(...);
ZipOutputStream zos = new ZipOutputStream(fos);
Then in your closing block:
zos.close(); // finishes the archive and closes fos as well
fos.close(); // harmless at this point; closing an already closed stream has no effect
