Java create InputStream from ZipInputStream entry

I would like to write a method that reads several XML files inside a ZIP, from a single InputStream.
The method would open a ZipInputStream, and for each XML file, get the corresponding InputStream and give it to my XML parser. Here is the skeleton of the method:
private void readZip(InputStream is) throws IOException {
    ZipInputStream zis = new ZipInputStream(is);
    ZipEntry entry = zis.getNextEntry();
    while (entry != null) {
        if (entry.getName().endsWith(".xml")) {
            // READ THE STREAM
        }
        entry = zis.getNextEntry();
    }
}
The problematic part is the "// READ THE STREAM". I have a working solution, which consists of creating a ByteArrayInputStream and feeding my parser with it. But it buffers the whole entry in memory, and for large files I get an OutOfMemoryError. Here is the code, if someone is still interested:
int count;
byte[] buffer = new byte[2048];
ByteArrayOutputStream out = new ByteArrayOutputStream();
while ((count = zis.read(buffer)) != -1) {
    out.write(buffer, 0, count);
}
InputStream is = new ByteArrayInputStream(out.toByteArray());
The ideal solution would be to feed the parser with the original ZipInputStream. It should work, because it works if I just print the entry content with a Scanner:
Scanner sc = new Scanner(zis);
while (sc.hasNextLine()) {
    System.out.println(sc.nextLine());
}
But... the parser I'm currently using (JDOM2, but I also tried javax.xml.parsers.DocumentBuilderFactory) closes the stream after parsing the data, so I'm unable to get the next entry and continue.
So finally the question is:
Does anybody know a DOM parser that doesn't close its stream?
Is there another way to get an InputStream from a ZipEntry?
Thanks.

A small improvement on Tim's solution: the problem with having to call allowToBeClosed() before close() is that it makes it tricky to close the ZipInputStream properly when handling exceptions, and it breaks Java 7's try-with-resources statement.
I suggest creating a wrapper class as follows:
public class UncloseableInputStream extends InputStream {
    private final InputStream input;

    public UncloseableInputStream(InputStream input) {
        this.input = input;
    }

    @Override
    public void close() throws IOException {} // do not close the wrapped stream

    @Override
    public int read() throws IOException {
        return input.read();
    }

    // delegate all other InputStream methods as with read above
}
which can then safely be used as follows:
try (ZipInputStream zipIn = new ZipInputStream(...)) {
    DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    ZipEntry entry;
    while (null != (entry = zipIn.getNextEntry())) {
        if ("file.xml".equals(entry.getName())) {
            Document doc = db.parse(new UncloseableInputStream(zipIn));
        }
    }
}

Thanks to halfbit, I ended up with my own ZipInputStream class, which overrides the close method:
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

public class CustomZipInputStream extends ZipInputStream {
    private boolean _canBeClosed = false;

    public CustomZipInputStream(InputStream is) {
        super(is);
    }

    @Override
    public void close() throws IOException {
        if (_canBeClosed) super.close();
    }

    public void allowToBeClosed() { _canBeClosed = true; }
}

You could wrap the ZipInputStream and intercept the call to close().
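For example, a minimal sketch (an assumption, not code from the question) using java.io.FilterInputStream, which already delegates every method to the wrapped stream, so only close() needs to be overridden:
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: delegates all reads to the wrapped stream but swallows close(),
// so a parser cannot close the underlying ZipInputStream.
class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        // intentionally do nothing; call zis.close() yourself when you are done
    }
}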

If you don't mind external dependencies, Apache Commons IO provides a convenience class named CloseShieldInputStream for blocking the close() call.
private void readZip(InputStream is) throws IOException {
    ZipInputStream zis = new ZipInputStream(is);
    ZipEntry entry = zis.getNextEntry();
    while (entry != null) {
        if (entry.getName().endsWith(".xml")) {
            // commons-io 2.9 and later
            InputStream tempIs = CloseShieldInputStream.wrap(zis);
            // commons-io < 2.9
            // InputStream tempIs = new CloseShieldInputStream(zis);
            // READ THE STREAM
        }
        entry = zis.getNextEntry();
    }
}
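At the "// READ THE STREAM" point you can hand tempIs to the parser; even if the parser closes it, the underlying ZipInputStream stays open for the next entry. A hedged sketch with javax.xml.parsers (exception handling omitted; not part of the original answer):
// Sketch: parse the current entry through the close shield; the
// ZipInputStream survives the parser's close() call.
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = db.parse(tempIs);
// ... process doc, then let the loop fetch the next entry ...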


Java zip files from streams instantly without using byte[]

I want to compress multiple files into a zip file and then let the client download it. I'm dealing with big files. For the moment I'm using this:
@RequestMapping(value = "/download", method = RequestMethod.GET, produces = "application/zip")
public ResponseEntity<StreamingResponseBody> getFile() throws Exception {
    File zippedFile = new File("test.zip");
    FileOutputStream fos = new FileOutputStream(zippedFile);
    ZipOutputStream zos = new ZipOutputStream(fos);
    InputStream[] streams = getStreamsFromAzure();
    for (InputStream stream : streams) {
        addToZipFile(zos, stream);
    }
    final InputStream fecFile = new FileInputStream(zippedFile);
    Long fileLength = zippedFile.length();
    StreamingResponseBody stream = outputStream ->
        readAndWrite(fecFile, outputStream);
    return ResponseEntity.ok()
        .header(HttpHeaders.ACCESS_CONTROL_EXPOSE_HEADERS, HttpHeaders.CONTENT_DISPOSITION)
        .header(HttpHeaders.CONTENT_DISPOSITION, "attachment;filename=" + "download.zip")
        .contentLength(fileLength)
        .contentType(MediaType.parseMediaType("application/zip"))
        .body(stream);
}
private void addToZipFile(ZipOutputStream zos, InputStream fis) throws IOException {
    ZipEntry zipEntry = new ZipEntry(generateFileName());
    zos.putNextEntry(zipEntry);
    byte[] bytes = new byte[1024];
    int length;
    while ((length = fis.read(bytes)) >= 0) {
        zos.write(bytes, 0, length);
    }
    zos.closeEntry();
    fis.close();
}
This takes a lot of time before all files are zipped and the download starts, and for large files that delay can be very long. This is the loop responsible for the delay:
while ((length = fis.read(bytes)) >= 0) {
    zos.write(bytes, 0, length);
}
So is there a way to start the download immediately, while the files are still being zipped?
Try this instead. Rather than wrapping a FileOutputStream with the ZipOutputStream, writing your zip to a file, and then copying it to the client output stream, just wrap the client output stream with the ZipOutputStream so that entries and data go directly to the client as you add them. If you also want to store the zip in a file on the server, you can make your ZipOutputStream write to a split output stream that writes to both destinations at once.
@RequestMapping(value = "/download", method = RequestMethod.GET, produces = "application/zip")
public ResponseEntity<StreamingResponseBody> getFile() throws Exception {
    InputStream[] streamsToZip = getStreamsFromAzure();

    // You could cache already created zip files, maybe something like this:
    // String[] pathsOfResourcesToZip = getPathsFromAzure();
    // String zipId = getZipId(pathsOfResourcesToZip);
    // if (isZipExist(zipId))
    //     // return that zip file
    // else do the following

    StreamingResponseBody streamResponse = clientOut -> {
        FileOutputStream zipFileOut = new FileOutputStream("test.zip");
        ZipOutputStream zos = new ZipOutputStream(new SplitOutputStream(clientOut, zipFileOut));
        for (InputStream in : streamsToZip) {
            addToZipFile(zos, in);
        }
        zos.close(); // finish the zip (writes the central directory) and close both output streams
    };

    return ResponseEntity.ok()
        .header(HttpHeaders.ACCESS_CONTROL_EXPOSE_HEADERS, HttpHeaders.CONTENT_DISPOSITION)
        .header(HttpHeaders.CONTENT_DISPOSITION, "attachment;filename=" + "download.zip")
        .contentType(MediaType.parseMediaType("application/zip")).body(streamResponse);
}
private void addToZipFile(ZipOutputStream zos, InputStream fis) throws IOException {
    ZipEntry zipEntry = new ZipEntry(generateFileName());
    zos.putNextEntry(zipEntry);
    byte[] bytes = new byte[1024];
    int length;
    while ((length = fis.read(bytes)) >= 0) {
        zos.write(bytes, 0, length);
    }
    zos.closeEntry();
    fis.close();
}
public static class SplitOutputStream extends OutputStream {
    private final OutputStream out1;
    private final OutputStream out2;

    public SplitOutputStream(OutputStream out1, OutputStream out2) {
        this.out1 = out1;
        this.out2 = out2;
    }

    @Override public void write(int b) throws IOException {
        out1.write(b);
        out2.write(b);
    }

    @Override public void write(byte[] b) throws IOException {
        out1.write(b);
        out2.write(b);
    }

    @Override public void write(byte[] b, int off, int len) throws IOException {
        out1.write(b, off, len);
        out2.write(b, off, len);
    }

    @Override public void flush() throws IOException {
        out1.flush();
        out2.flush();
    }

    /** Closes all the streams. If there was an IOException this throws the first one. */
    @Override public void close() throws IOException {
        IOException ioException = null;
        for (OutputStream o : new OutputStream[] { out1, out2 }) {
            try {
                o.close();
            } catch (IOException e) {
                if (ioException == null) {
                    ioException = e;
                }
            }
        }
        if (ioException != null) {
            throw ioException;
        }
    }
}
For the first request for a set of resources to be zipped, you won't know the size of the resulting zip file, so you can't send the length along with the response, since you are streaming the file as it is zipped.
But if you expect repeated requests for the same set of resources, you can cache your zip files and simply return them on any subsequent requests; you will also know the length of the cached zip file, so you can send that in the response as well.
If you want to do this, you will have to be able to consistently create the same identifier for each combination of the resources to be zipped, so that you can check whether those resources were already zipped and return the cached file if they were. You could sort the ids (maybe full paths) of the resources that will be zipped and concatenate them to create an id for the zip file, as in the sketch below.
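A hedged sketch of that idea, as one possible implementation of the getZipId(...) placeholder in the commented code above (not part of the original answer): sort the resource paths, concatenate them, and hash the result to get a stable cache key.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Sketch: derive a stable identifier for a set of resource paths.
// Sorting first makes the id independent of the order the paths arrive in.
static String getZipId(String[] pathsOfResourcesToZip) throws NoSuchAlgorithmException {
    String[] sorted = pathsOfResourcesToZip.clone();
    Arrays.sort(sorted);
    String joined = String.join("|", sorted);
    byte[] digest = MessageDigest.getInstance("SHA-256").digest(joined.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
        hex.append(String.format("%02x", b));
    }
    return hex.toString(); // e.g. use this plus ".zip" as the cached file name
}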

inconsistent variable when passing it from one method to another

I have a problem that I have not been able to solve, and I cannot figure out what is causing it.
I have a class to which I am passing an InputStream from the main method. The problem is that when converting the InputStream to a String with AWS's IOUtils.toString, or with the commons-io IOUtils, it returns an empty String.
I cannot see what the problem might be, since inside the main class it works correctly and returns the String it should, but when I use it inside the other class (without having done anything to it), it returns an empty String.
These are my classes:
public class Main {
    public static void main(String[] args) throws IOException {
        InputStream inputStream = new ByteArrayInputStream("{\"name\":\"Camilo\",\"functionName\":\"hello\"}".getBytes());
        OutputStream outputStream = new ByteArrayOutputStream();
        LambdaExecutor lambdaExecutor = new LambdaExecutor();
        String test = IOUtils.toString(inputStream); // this test variable has "{\"name\":\"Camilo\",\"functionName\":\"hello\"}"
        lambdaExecutor.handleRequest(inputStream, outputStream);
    }
}
and this:
public class LambdaExecutor {
    private FrontController frontController;

    public LambdaExecutor() {
        this.frontController = new FrontController();
    }

    public void handleRequest(InputStream inputStream, OutputStream outputStream) throws IOException {
        //Service service = frontController.findService(inputStream);
        String test = IOUtils.toString(inputStream); // this test variable has "" <- empty String
        System.exit(0);
        //service.execute(inputStream, outputStream, context);
    }
}
I used the debug tool, and the InputStream object is the same in both classes
By the time that you've passed the stream into handleRequest(), you've already consumed the stream:
public static void main(String[] args) throws IOException {
    InputStream inputStream = new ByteArrayInputStream("{\"name\":\"Camilo\",\"functionName\":\"hello\"}".getBytes());
    OutputStream outputStream = new ByteArrayOutputStream();
    LambdaExecutor lambdaExecutor = new LambdaExecutor();
    String test = IOUtils.toString(inputStream); // this consumes the stream, and nothing more can be read from it
    lambdaExecutor.handleRequest(inputStream, outputStream);
}
When you took that out, the method worked, as you said in the comments.
If you want to read the same data again, you'll have to use the reset() method, or close and re-open the stream if you want to re-use the variable with different data.
// have your data
byte[] data = "{\"name\":\"Camilo\",\"functionName\":\"hello\"}".getBytes();

// open the stream
InputStream inputStream = new ByteArrayInputStream(data);
...
// do something with the inputStream, and reset it if you need the same data again
if (inputStream.markSupported()) {
    inputStream.reset();
} else {
    inputStream.close();
    inputStream = new ByteArrayInputStream(data);
}
...
// close the stream after use
inputStream.close();
Always close the stream after you use it, or use a try block to take advantage of AutoCloseable; you can do the same with the output stream:
try (InputStream inputStream = new ByteArrayInputStream(data);
     OutputStream outputStream = new ByteArrayOutputStream()) {
    lambdaExecutor.handleRequest(inputStream, outputStream);
} // auto-closes the streams
The reason you can't is that you can only read from a stream once.
To be able to read it twice, you must call the reset() method so the stream returns to the beginning: after reading, call reset() and you can read it again.
Some sources don't support resetting, so you would actually have to create the stream again. To check whether the source supports it, use the stream's markSupported() method.
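For streams that do support it, a minimal runnable sketch (not from the original answer; readAllBytes() needs Java 9+): wrap the source in a BufferedInputStream, mark() the position you want to return to, and reset() to read the same bytes again.
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = "{\"name\":\"Camilo\",\"functionName\":\"hello\"}".getBytes();
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        in.mark(data.length);                           // remember the current position
        String first = new String(in.readAllBytes());   // first read consumes the stream
        in.reset();                                     // jump back to the marked position
        String second = new String(in.readAllBytes());  // same content again
        System.out.println(first.equals(second));       // true
        in.close();
    }
}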

How to assert response in zipoutputstream

I am trying to write a JUnit test using MockitoJUnitRunner.
I am passing a file id to my function, which downloads the file from the cloud and returns a zip file as the download.
Here is my code:
public void getLogFile(HttpServletResponse response, String id) throws IOException {
    response.setContentType("Content-type: application/zip");
    response.setHeader("Content-Disposition", "attachment; filename=LogFiles.zip");
    ServletOutputStream out = response.getOutputStream();
    ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(out));
    zos.putNextEntry(new ZipEntry(id));
    InputStream inputStream = someDao.getFile(id);
    BufferedInputStream fif = new BufferedInputStream(inputStream);
    int data = 0;
    while ((data = fif.read()) != -1) {
        zos.write(data);
    }
    fif.close();
    zos.closeEntry();
    zos.close();
}
And my JUnit function is
@Mock
private MockHttpServletResponse mockHttpServletResponse;

anyInputStream = new ByteArrayInputStream("test data".getBytes());

@Test
public void shouldDownloadFile() throws IOException {
    ServletOutputStream outputStream = mock(ServletOutputStream.class);
    when(mockHttpServletResponse.getOutputStream()).thenReturn(outputStream);
    when(someDao.download(anyString())).thenReturn(anyInputStream);
    controller.getLogFile(mockHttpServletResponse, id);
    verify(mockHttpServletResponse).setContentType("Content-type: application/zip");
    verify(mockHttpServletResponse).setHeader("Content-Disposition", "attachment; filename=LogFiles.zip");
    verify(atmosdao).download(atmosFilePath);
}
This unit test is passing, but I want to verify what is written to outputStream. How can I do it? I am writing "test data" to the mocked outputStream like this:
anyInputStream = new ByteArrayInputStream("test data".getBytes());
when(someDao.download(anyString())).thenReturn(anyInputStream);
mockHttpServletResponse.getContentAsString() is giving me null!
Is it possible to assert the MockHttpServletResponse that is written using ZipOutputStream? If yes, then how can I do it?
Thanks.
Instead of mocking your OutputStream, you could create a custom one:
public class CustomOutputStream extends ServletOutputStream {
    private ByteArrayOutputStream out = new ByteArrayOutputStream();
    private String content;

    @Override
    public void write(int b) throws IOException {
        out.write(b);
    }

    @Override
    public void close() throws IOException {
        content = new String(out.toByteArray());
        out.close();
        super.close();
    }

    public String getContentAsString() {
        return this.content;
    }
}
This class will store all the bytes written to it and keep them in the content field.
Then you replace this:
ServletOutputStream outputStream = mock(ServletOutputStream.class);
by this:
CustomOutputStream outputStream = new CustomOutputStream();
When your servlet calls getOutputStream(), it will use the custom one, and in the end getContentAsString() will return you the output that was written to your servlet.
Note: the output is zipped, so the String will contain strange characters. If you want the original string, you'll have to unzip it (and in this case I'd use the byte array returned by out.toByteArray() instead of the String, because when you create a String this way you can have encoding problems when calling string.getBytes())
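If you go that route, here is a hedged sketch for turning the captured bytes back into the original text inside the test. It assumes you add a getContentAsBytes() accessor to CustomOutputStream that returns out.toByteArray(), and that getLogFile() wrote a single zip entry:
// Sketch: assumes CustomOutputStream gains a getContentAsBytes() accessor
// returning out.toByteArray(), so the zipped payload can be re-read.
byte[] zipped = outputStream.getContentAsBytes();
try (ZipInputStream unzip = new ZipInputStream(new ByteArrayInputStream(zipped))) {
    unzip.getNextEntry();                               // the single entry written by getLogFile()
    String payload = new String(unzip.readAllBytes());  // Java 9+
    assertEquals("test data", payload);
}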
I got what I was looking for...
This is what I did to assert the data written to the ZipOutputStream using PowerMockito.
@Test
public void ShouldAttemptToWriteDownloadedFileToZipOutputStream() throws Exception {
    InputStream anyInputStream = new ByteArrayInputStream("test data".getBytes());
    ServletOutputStream outputStream = mock(ServletOutputStream.class);
    BufferedOutputStream bufferedOutputStream = Mockito.mock(BufferedOutputStream.class);
    PowerMockito.whenNew(BufferedOutputStream.class).withArguments(outputStream).thenReturn(bufferedOutputStream);
    ZipOutputStream zipOutputStream = Mockito.mock(ZipOutputStream.class);
    PowerMockito.whenNew(ZipOutputStream.class).withArguments(bufferedOutputStream).thenReturn(zipOutputStream);
    BufferedInputStream bufferedInputStream = new BufferedInputStream(anyInputStream);
    PowerMockito.whenNew(BufferedInputStream.class).withArguments(anyInputStream).thenReturn(bufferedInputStream);
    subjectUnderTest.getLogFile(mockHttpServletResponse, "12345");
    int data = 0;
    while ((data = bufferedInputStream.read()) != -1) {
        verify(zipOutputStream).write(data);
    }
}
Thanks Hugo for your help !

Why are InputStream instances closed when referenced within an ObservableMap?

I have an ObservableMap to which resource files are added.
private ObservableMap<String, InputStream> resourceFilesData;

resourceFilesData = new ObservableMapWrapper<String, InputStream>(
    new HashMap<String, InputStream>()
);
And InputStreams are added in this way:
resourceFilesData.put(f.getName(), new FileInputStream(f));
And finally, when I want to use the streams, they appear closed!
Why? I can't find the reason.
Maybe there is some way to catch the moment when a stream gets closed? (for debugging)
This is how the streams are used:
private void pack() throws JAXBException, IOException {
    HashMap<String, InputStream> resources = new HashMap<>();
    byte[] buf = new byte[1024];
    ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("../" + fwData.getFileName() + ".iolfw"));
    File xml = fwData.marshal();
    InputStream xmlStream = new FileInputStream(xml);
    resources.put(xml.getName(), xmlStream);
    resources.putAll(resourceFilesData);
    for (Map.Entry<String, InputStream> data : resources.entrySet()) {
        InputStream input = data.getValue();
        zos.putNextEntry(new ZipEntry(data.getKey()));
        for (int readNum = 0; (readNum = input.read(buf)) != -1; ) {
            zos.write(buf, 0, readNum);
        }
        zos.closeEntry();
        input.close();
    }
    zos.close();
    resources.remove(xmlStream);
    xml.delete();
}
trace:
http://pastebin.com/hE21ECL9
I don't know the reason for that behaviour, but you can try to debug the problem using an inherited class:
class FileInputStreamInh extends FileInputStream {
    public FileInputStreamInh(File file) throws FileNotFoundException {
        super(file);
    }

    @Override
    public void close() throws IOException {
        super.close(); // <-- set a breakpoint here
    }
}
So, instead of creating a FileInputStream, you should create a FileInputStreamInh.
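Usage would then look like this, so the breakpoint fires whenever anything closes one of the map's streams:
// Put the debug subclass into the map instead of a plain FileInputStream.
resourceFilesData.put(f.getName(), new FileInputStreamInh(f));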

Copy and validate a Zip File via Java

After some research (How to create a Zip File) and some Google searching, I came up with this Java function:
static void copyFile(File zipFile, File newFile) throws IOException {
    ZipFile zipSrc = new ZipFile(zipFile);
    ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(newFile));
    Enumeration srcEntries = zipSrc.entries();
    while (srcEntries.hasMoreElements()) {
        ZipEntry entry = (ZipEntry) srcEntries.nextElement();
        ZipEntry newEntry = new ZipEntry(entry.getName());
        zos.putNextEntry(newEntry);
        BufferedInputStream bis = new BufferedInputStream(zipSrc.getInputStream(entry));
        while (bis.available() > 0) {
            zos.write(bis.read());
        }
        zos.closeEntry();
        bis.close();
    }
    zos.finish();
    zos.close();
    zipSrc.close();
}
This code is working...but it is not nice and clean at all...anyone got a nice idea or an example?
Edit:
I want to be able to add some kind of validation that the zip archive has the right structure, so copying it like a normal file without regard to its content does not work for me... or would you prefer checking it afterwards? I am not sure about this one.
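One hedged way to do a basic structural check (a sketch, not from the original post): open the file with ZipFile and read every entry to the end; a corrupt archive typically surfaces as a ZipException or IOException along the way. Checks for your own notion of the "right structure" (expected entry names, etc.) would still need to be added.
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Sketch: returns false if the archive cannot be opened or an entry cannot be read.
static boolean isZipReadable(File zipFile) {
    try (ZipFile zf = new ZipFile(zipFile)) {
        Enumeration<? extends ZipEntry> entries = zf.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            try (InputStream in = zf.getInputStream(entry)) {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) {
                    // read through the entry so decompression problems surface
                }
            }
        }
        return true;
    } catch (IOException e) { // ZipException is a subclass of IOException
        return false;
    }
}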
You just want to copy the complete zip file? Then there is no need to open and read the zip file... just copy it like you would copy any other file.
public final static int BUF_SIZE = 1024; // can be much bigger, see comment below

public static void copyFile(File in, File out) throws Exception {
    FileInputStream fis = new FileInputStream(in);
    FileOutputStream fos = new FileOutputStream(out);
    try {
        byte[] buf = new byte[BUF_SIZE];
        int i = 0;
        while ((i = fis.read(buf)) != -1) {
            fos.write(buf, 0, i);
        }
    } catch (Exception e) {
        throw e;
    } finally {
        if (fis != null) fis.close();
        if (fos != null) fos.close();
    }
}
Try: http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html#copyFile
Apache Commons FileUtils#copyFile
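With commons-io on the classpath, usage is a one-liner matching the question's copyFile(File zipFile, File newFile) signature:
// Copies the zip byte-for-byte, preserving it as a valid archive.
FileUtils.copyFile(zipFile, newFile);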
My solution:
import java.io.*;
import javax.swing.*;

public class MovingFile {
    public static void copyStreamToFile() throws IOException {
        FileOutputStream foutOutput = null;
        String oldDir = "F:/UPLOADT.zip";
        System.out.println(oldDir);
        String newDir = "F:/NewFolder/UPLOADT.zip"; // the destination file name
        File f = new File(oldDir);
        f.renameTo(new File(newDir));
    }

    public static void main(String[] args) throws IOException {
        copyStreamToFile();
    }
}
I have updated your code to Java 9+, FWIW
try (ZipFile srcFile = new ZipFile(inputName)) {
    try (ZipOutputStream destFile = new ZipOutputStream(
            Files.newOutputStream(Paths.get(new File(outputName).toURI())))) {
        Enumeration<? extends ZipEntry> entries = srcFile.entries();
        while (entries.hasMoreElements()) {
            ZipEntry src = entries.nextElement();
            ZipEntry dest = new ZipEntry(src.getName());
            destFile.putNextEntry(dest);
            try (InputStream content = srcFile.getInputStream(src)) {
                content.transferTo(destFile);
            }
            destFile.closeEntry();
        }
        destFile.finish();
    }
}
