Creating a Zip file while the client is downloading - Java

I am trying to develop something like Dropbox (a very basic one). Downloading a single file is really easy: just use a ServletOutputStream. What I want is: when the client asks for multiple files, I zip the files on the server side and then send them to the user. But if the files are big, it takes too long to zip them and send them.
Is there any way to send the files while they are being compressed?
Thanks for your help.

Part of the Java API for ZIP files is actually designed to provide "on the fly" compression. It fits nicely with both the java.io API and the servlet API, which means this is even... kind of easy (no multithreading required - even for performance reasons, because your CPU will usually be faster at zipping than your network will be at sending the content).
The part you'll be interacting with is ZipOutputStream. It is a FilterOutputStream (which means it is designed to wrap an OutputStream that already exists - in your case, that would be the response's OutputStream), and it will compress every byte you send it, using ZIP compression.
So, say you have a GET request:
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // Your code to handle the request
    List<YourFileType> responseFiles = ... // Whatever you need to do

    // We declare that the response will contain raw bytes
    response.setContentType("application/octet-stream");

    // We open a ZIP output stream (try-with-resources is Java 7, but not that different from Java 6)
    try (ZipOutputStream zipStream = new ZipOutputStream(response.getOutputStream())) {
        // We need to loop over each file you want to send
        for (YourFileType fileToSend : responseFiles) {
            // We give a name to the file
            zipStream.putNextEntry(new ZipEntry(fileToSend.getName()));
            // and we copy its content
            copy(fileToSend, zipStream);
        }
    }
}
Of course, you should do proper exception handling. A couple of quick notes, though:
The ZIP file format mandates that each file has a name, so you must create a new ZipEntry each time you start a new file (you'll probably get an IllegalStateException if you do not, anyway).
Proper use of the API would be to close each entry once you are done writing to it (at the end of the file). BUT: the Java implementation does that for you: each time you call putNextEntry it closes the previous one (if need be) all by itself.
Likewise, you must not forget to close the ZIP stream, because this will properly close the last entry AND flush everything that is needed to create a proper ZIP file. Failure to do so will result in a corrupt file. Here, the try-with-resources statement does this: it closes the ZipOutputStream once everything is written to it.
The copy method here is just what you would use to transfer all the bytes from the original file to the OutputStream; there is nothing ZIP-specific about it. Just call outputStream.write(byte[] bytes).
EDIT: to clarify...
For example, given a YourFileType that has the following methods:
public interface YourFileType {
    public byte[] getContent();
    public InputStream getContentAsStream();
}
Then the copy method could look like this (it's all very basic Java IO; you could use a library such as Commons IO to avoid reinventing the wheel...):
public void copy(YourFileType file, OutputStream os) throws IOException {
    os.write(file.getContent());
}
Or, for a full streaming implementation:
public void copy(YourFileType file, OutputStream os) throws IOException {
    try (InputStream fileContent = file.getContentAsStream()) {
        byte[] buffer = new byte[4096]; // 4096 is kind of a magic number
        int readBytesCount = 0;
        while ((readBytesCount = fileContent.read(buffer)) >= 0) {
            os.write(buffer, 0, readBytesCount);
        }
    }
}
Using this kind of implementation, your client will start receiving a response almost as soon as you start writing to the ZipOutputStream (the only delay would be that of internal buffers), meaning it should not time out (unless you spend too long building the content to send - but that would not be the fault of the zipping part).

Related

How can I read a Base64 file that comes as a string?

I am currently developing a REST service which receives in its request a field where a file is passed in Base64 format (a string of "n" characters comes in). What I do within the service logic is convert that character string to a File and save it to a predetermined path.
The problem is that when the file is too large (3 MB) the service becomes slow and takes a long time to respond.
This is the code I am using:
String filename = "TEXT.DOCX";
BufferedOutputStream stream = null;
// THE FIELD base64file IS THE BASE64 STRING THAT COMES FROM THE REQUEST
byte[] fileByteArray = java.util.Base64.getDecoder().decode(base64file);
// VALIDATE FILE SIZE
if (1 * 1024 * 1024 < fileByteArray.length) {
    logger.info("The file [" + filename + "] is too large");
} else {
    stream = new BufferedOutputStream(new FileOutputStream(new File("C:\\" + filename)));
    stream.write(fileByteArray);
}
How can I avoid this problem, so that my service does not take so long to convert the string to a File?
Buffering does not improve your performance here, as all you are trying to do is write the file as fast as possible. Generally the code looks fine; change it to use the FileOutputStream directly and see if that improves things:
try (FileOutputStream stream = new FileOutputStream(path)) {
    stream.write(bytes);
}
Alternatively you could also try using something like Apache Commons to do the task for you:
FileUtils.writeByteArrayToFile(new File(path), bytes);
Try the following; it also works for large files.
Path outPath = Paths.get(filename);
try (InputStream in = Base64.getDecoder().wrap(base64file)) {
    Files.copy(in, outPath);
}
This keeps only a small buffer in memory. Your code might become slow because it takes much more memory.
Note that wrap takes an InputStream, which you should provide, not the entire String.
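For example, a minimal sketch (assuming base64file is the Base64 String taken from the request, as in the question; wrapping its bytes in a ByteArrayInputStream is just one way to get an InputStream):
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public static void saveBase64(String base64file, String filename) throws IOException {
    byte[] ascii = base64file.getBytes(StandardCharsets.US_ASCII); // Base64 text is plain ASCII
    try (InputStream in = Base64.getDecoder().wrap(new ByteArrayInputStream(ascii))) {
        Files.copy(in, Paths.get(filename)); // decodes and writes using only a small buffer
    }
}
Ideally you would wrap the request's own InputStream instead, so the whole String never sits in memory.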
From a network point of view:
Both JSON and XML can support large amounts of data exchange. And 3 MB is not really huge. But there is a limit on how much the browser can handle (if this call is made from a user interface).
Also, a web server like Tomcat handles 2 MB by default (check maxPostSize: http://tomcat.apache.org/tomcat-6.0-doc/config/http.html#Common_Attributes).
You can also try chunking the request payload (although it shouldn't be required for a 3 MB file).
From an implementation point of view:
The write operation on your disk could be slow. It also depends on your OS.
If your file size is really large, you can use the Java FileChannel class with a ByteBuffer, as sketched below.
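A minimal sketch of that approach, assuming you already have the decoded fileByteArray from the question:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public static void writeWithChannel(String filename, byte[] fileByteArray) throws IOException {
    try (FileChannel channel = FileChannel.open(Paths.get(filename),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE,
            StandardOpenOption.TRUNCATE_EXISTING)) {
        ByteBuffer buffer = ByteBuffer.wrap(fileByteArray);
        while (buffer.hasRemaining()) {
            channel.write(buffer); // a single write() call may not drain the whole buffer
        }
    }
}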
To find out the cause of the slowness (network delay or code), check the performance with a simple Java program against the web service call.

Zip file turns out to be empty

I'm working on an HTTP server in Java, which for testing purposes is running under Windows 8.1.
The way it's coded, when a certain parameter is set, it changes the header of the HTTP response and sends the file through the socket with something that works kind of like:
socket.outputStream.write(filter.read());
Assume that the communication works fine, since I have tested it with various other filters and it works perfectly.
One of the filters is supposed to grab the HTML file, zip it and then send it to the client, without creating the file on the server machine. This is the header:
"HTTP/1.1 200 OK\nContent-Type: application/zip\nContent-Disposition: filename=\"" + request + ".zip\"\n";
Afterwards, I set my filter to a class I created (copied below) and send the file. My problem is that even though the server is definitely sending data, the client only downloads an empty zip file, with nothing inside.
I've been stuck with this issue for a few days, I can't seem to figure out what's wrong. I think that there's something wrong with how I create the entry or maybe how I close the outputs. I can't be sure.
I'd really appreciate any advice that could be given to me on this issue. Thanks for your attention.
class ZipFilterInputStream extends FilterInputStream
{
    protected ZipFilterInputStream(InputStream inputToFilter) throws IOException
    {
        super(inputToFilter);
        // Get the stuff ready for compression
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ZipOutputStream zout = new ZipOutputStream(out);
        zout.putNextEntry(new ZipEntry("file.html"));
        // Compress the stream
        int data = in.read();
        while (data != -1)
        {
            zout.write(data);
            data = in.read();
        }
        zout.closeEntry();
        zout.finish();
        // Get the stream ready for reading.
        in = new ByteArrayInputStream(out.toByteArray());
        out.close();
    }

    public int read() throws IOException
    {
        return in.read();
    }
}

Why is InputStream.available() so time-consuming?

I have implemented my own class to read pcap files (binary files, i.e. from tcpdump or Wireshark).
public class PcapReader implements Iterator<PcapPacket> {
    private InputStream is;

    public PcapReader(File file) throws FileNotFoundException, IOException {
        is = new DataInputStream(
                 new BufferedInputStream(
                     new FileInputStream(file)));
    }

    @Override
    public boolean hasNext() {
        try {
            return (is.available() > 0);
        } catch (IOException e) {
            return false;
        }
    }

    //pseudo code!
    @Override
    public PcapPacket next() {
        is.read(header);
        is.read(body);
        return new PcapPacket(header, body);
    }

    //more code here
}
Then I use it like this:
PcapReader reader = new PcapReader(file);
while (reader.hasNext()) {
    PcapPacket pcapPacket = reader.next();
    //process packet
}
The file under test is 190 MB, and I use JVisualVM to profile.
hasNext() is called 1.7 million times and takes 7.7 seconds.
next() is called the same number of times and takes 3.6 seconds.
My main question is why hasNext() is so time-consuming in absolute terms, and also twice as expensive as next().
When you call is.available() in your hasNext() method, it goes down to the FileInputStream.available() implementation. This is a native method, as one may see from the FileInputStream source code.
In the end, this is indeed a time-consuming operation, as the operating system's implementation of the file operations will have to check ahead whether more data is available to be read. So it will actually do a read operation without updating the file pointer (or update it back to the original position), just to check if there is a "next" byte.
I'm sure the internal (native) implementation of the available() method is not something as simple as return availableSize;, but more complicated. The stream counts available data using the OS API, especially, for example, for log files which are being written while the stream reads them.
I have implemented my own class to read pcap files.
Because you're not using jNetPcap, or because you are using jNetPcap but need something that can read from a File?
If the latter, you probably want to use a pattern other than one that has a "more data is available" method and a separate "so read that data" method; something that reads the data and either returns a "packet available"/"end of file"/"error" indication or throws an exception for one or both of the latter conditions (DataInputStream appears to throw exceptions for both I/O errors and EOF, so it might make sense to do the same for your class).
Yeah, that means it can't be an Iterator, but maybe Iterators weren't originally intended to represent records in a sequential file (besides, if you really want it to be an Iterator, what are you going to do about the remove method?).
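As an illustration only, a sketch of such a read-driven method might look like this (dataIn is assumed to be the DataInputStream from the question's constructor; the 16-byte record header and the parseCapturedLength helper are assumptions based on the standard pcap record format, not part of the original code):
// Returns the next packet, or null at a clean end of file.
public PcapPacket readPacket() throws IOException {
    byte[] header = new byte[16]; // pcap per-record header (assumed size)
    try {
        dataIn.readFully(header); // DataInputStream throws EOFException at end of stream
    } catch (EOFException eof) {
        return null; // clean EOF: no more packets
    }
    int bodyLength = parseCapturedLength(header); // hypothetical helper parsing the captured-length field
    byte[] body = new byte[bodyLength];
    dataIn.readFully(body); // an EOFException here would mean a truncated file
    return new PcapPacket(header, body);
}
The caller then simply loops until readPacket() returns null; no available() call is needed.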
And if you can avoid needing to read from a File, you could then use jNetPcap's own routines for reading capture files, which, in libpcap 1.1.0 and later, can also read some pcap-ng files.

Want to create a servlet that will save the posted data to a file based on a GUID filename

So I pushed my Java app to a server, pretty excited about that.
Now I want to test something, how can I save the posted data to my servlet to a file, and the filename should be a unique guid.
I have this so far:
public class TestServlet extends javax.servlet.http.HttpServlet {

    protected void doPost(javax.servlet.http.HttpServletRequest request, javax.servlet.http.HttpServletResponse response)
            throws javax.servlet.ServletException, IOException {
    }

    protected void doGet(javax.servlet.http.HttpServletRequest request, javax.servlet.http.HttpServletResponse response)
            throws javax.servlet.ServletException, IOException {
        PrintWriter printWriter = response.getWriter();
        printWriter.print("hello, world from testservlet!");
    }
}
So assuming the HTTP posted data (say around 50K) will be posted to the field 'payload', how can I grab the posted text and save it to a file, with the filename being a GUID?
Does Java have a construct to clean up an open file, like in C#:
using(var file = new ....)
{
// write to file
}
That closes the connection and cleans up memory, etc.
Also, do I need to set special permissions for Tomcat to save this file?
I just set things up by default right now (just playing around on a VPS) using Ubuntu 11 with tomcat6 installed.
Thanks.
You can use the request to read the "payload"; see the API doc for ServletRequest:
request.getParameter("payload");
You can use File to create the file; see the API doc:
File newFile = new File("fileName");
boolean isCreated = newFile.createNewFile();
You can write to the file as follows:
BufferedWriter out = new BufferedWriter(new FileWriter(newFile));
out.write(payLoad);
out.close();
For the GUID, see this: Create a GUID in Java
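In short, java.util.UUID is the standard way to get a GUID-style identifier:
// e.g. "3f2504e0-4f89-11d3-9a0c-0305e82c3301.txt"
String filename = java.util.UUID.randomUUID().toString() + ".txt";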
And for the cleanup, you don't have to worry about it in Java; the garbage collector (What is the garbage collector in Java?) does it for you automatically when the reference goes out of scope.
But you should close resources, like out.close(), to release them back to the system when you are done with them.
Also, do I need to set special permissions for tomcat to save this file?
You do not need to do that, because Tomcat is just a server; it's more a matter of the file system (OS). I use Glassfish on Unix and I don't need to do anything like that to create a file.
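Putting those steps together, a minimal doPost sketch could look like this (the /var/myapp/uploads directory and the .txt extension are example choices, not from the question):
protected void doPost(javax.servlet.http.HttpServletRequest request,
                      javax.servlet.http.HttpServletResponse response)
        throws javax.servlet.ServletException, IOException {
    String payload = request.getParameter("payload"); // the posted field
    String filename = java.util.UUID.randomUUID().toString() + ".txt"; // GUID filename
    File target = new File("/var/myapp/uploads", filename); // example directory, must be writable
    BufferedWriter out = new BufferedWriter(new FileWriter(target));
    try {
        out.write(payload); // write the posted text
    } finally {
        out.close(); // always release the file handle
    }
    response.getWriter().print("saved as " + filename);
}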
Now I want to test something, how can I save the posted data to my servlet to a file, and the filename should be a unique guid.
Use File#createTempFile() to create a file with a unique ID in the given folder.
File file = File.createTempFile("prefix-", ".ext", new File("/path/to/files"));
// ...
See also:
Saving uploaded file in specific location
Does java have a construct to clean up an open file, like in c#: using?
Only in Java 7, which has already been out for some time.
try (FileWriter writer = new FileWriter(file)) {
    writer.write(content);
}
which is equivalent to
FileWriter writer = null;
try {
    writer = new FileWriter(file);
    writer.write(content);
} finally {
    if (writer != null) writer.close();
}
See also:
"using" keyword in java
Also, do I need to set special permissions for tomcat to save this file?
The user who has started Tomcat should indeed have the file write permissions on the given directory.
In the future please ask separate questions in separate SO questions.
Java 7 has a new try-with-resources construct that will take care of closing the file for you. Otherwise... just close the file; no big deal.
As far as "special permissions" go: as long as the user Tomcat is running under can access the directory in question, there's no issue. I'd recommend against storing it under the webapp directories, though (and if it's deployed as a WAR you may not be able to anyway). Keep uploaded files in a known, but separate, directory.

Writing to CSV Files and then Zipping it up in Appengine (Java)

I'm currently working on a project that is done in Java, on google appengine.
Appengine does not allow files to be stored, so no on-disk representation objects can be used; these include the File class.
I want to write data and export it to a few csv files, then zip them up and allow the user to download them.
How may I do this without using any File classes? I'm not very experienced with file handling, so I hope you guys can advise me.
Thanks.
You can create a zip file and add to it while the user is downloading it. If you are using a servlet, this is straightforward:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    // ..... process request
    // ..... then respond
    response.setContentType("application/zip");
    response.setStatus(HttpServletResponse.SC_OK);
    // note: intentionally no content-length set; automatic chunked transfer if the stream is larger than the internal buffer of the response
    ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());
    byte[] buffer = new byte[1024 * 32];
    try {
        // case 1: you already have an input stream, typically a ByteArrayInputStream from a byte[] full of previously prepared csv data
        InputStream in = new BufferedInputStream(getMyFirstInputStream());
        try {
            zipOut.putNextEntry(new ZipEntry("FirstName"));
            int length;
            while ((length = in.read(buffer)) != -1) {
                zipOut.write(buffer, 0, length);
            }
            zipOut.closeEntry();
        } finally {
            in.close();
        }
        // case 2: write directly to the output stream, i.e. you have your raw data but need to create the csv representation
        zipOut.putNextEntry(new ZipEntry("SecondName"));
        // example setup; the key is to use the write methods of the 'zipOut' stream below
        MySerializer mySerializer = new MySerializer(); // i.e. csv-writer
        Object myData = getMyData(); // the data to be processed by the serializer in order to make a csv file
        mySerializer.setOutput(zipOut);
        // write whatever you have to the zipOut
        mySerializer.write(myData);
        zipOut.closeEntry();
        // repeat for the next file.. or make a for-loop
    } finally {
        zipOut.close();
    }
}
There is no reason to store your data in files unless you have memory constraints. Files give you an InputStream and an OutputStream, both of which have in-memory equivalents.
Note that creating a csv writer usually means doing something like the sketch below, where the point is to take a piece of data (an array, list or map, whatever you have) and turn it into byte[] parts. Append the byte[] parts to an OutputStream using a tool like DataOutputStream (make your own if you like) or OutputStreamWriter.
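For illustration, a bare-bones version of that idea, writing rows straight to an existing OutputStream (zipOut from the servlet example above would work as the os argument); the naive joining does no quoting or escaping of commas inside values:
// Sketch: turn in-memory rows into CSV bytes on an existing OutputStream.
void writeCsv(List<String[]> rows, OutputStream os) throws IOException {
    Writer writer = new OutputStreamWriter(os, "UTF-8");
    for (String[] row : rows) {
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                writer.write(','); // naive separator handling
            }
            writer.write(row[i]);
        }
        writer.write("\r\n"); // CRLF line ending, as commonly used for CSV
    }
    writer.flush(); // flush, but do NOT close: closing the writer would close the underlying ZIP stream
}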
If your data is not huge, meaning it can stay in memory, then exporting to CSV, zipping it up and streaming it for download can all be done on the fly. Caching can be done at any of these steps, which greatly depends on your application's business logic.
