Writing CSV files and then zipping them up on App Engine (Java)

I'm currently working on a Java project on Google App Engine.
App Engine does not allow files to be written to disk, so classes that rely on an on-disk representation, such as File, cannot be used.
I want to write data to a few CSV files, zip them up, and let the user download the archive.
How can I do this without using any File classes? I'm not very experienced with file handling, so I hope you can advise me.
Thanks.

You can create a zip file and add to it while the user is downloading it. If you are using a servlet, this is straightforward:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    // ..... process request
    // ..... then respond
    response.setContentType("application/zip");
    response.setStatus(HttpServletResponse.SC_OK);
    // Note: intentionally no Content-Length set; the container switches to chunked transfer if the stream is larger than the internal buffer of the response
    ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());
    byte[] buffer = new byte[1024 * 32];
    try {
        // Case 1: you already have an input stream, typically a ByteArrayInputStream over a byte[] full of previously prepared CSV data
        InputStream in = new BufferedInputStream(getMyFirstInputStream());
        try {
            zipOut.putNextEntry(new ZipEntry("FirstName"));
            int length;
            while ((length = in.read(buffer)) != -1) {
                zipOut.write(buffer, 0, length);
            }
            zipOut.closeEntry();
        } finally {
            in.close();
        }
        // Case 2: write directly to the output stream, i.e. you have your raw data but need to create the CSV representation
        zipOut.putNextEntry(new ZipEntry("SecondName"));
        // Example setup; the key is to use the write methods of the 'zipOut' stream below
        MySerializer mySerializer = new MySerializer(); // i.e. a CSV writer
        Object myData = getMyData(); // the data to be processed by the serializer in order to make a CSV file
        mySerializer.setOutput(zipOut);
        // write whatever you have to zipOut
        mySerializer.write(myData);
        zipOut.closeEntry();
        // repeat for the next file, or use a for-loop
    } finally {
        zipOut.close();
    }
}
There is no reason to store your data in files unless you have memory constraints. Files give you an InputStream and an OutputStream, both of which have in-memory equivalents.
Note that creating a CSV writer usually means taking a piece of data (an array list or a map, whatever you have) and turning it into byte[] parts, then appending those parts to an OutputStream using a tool like DataOutputStream (make your own if you like) or OutputStreamWriter.
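As a rough illustration of the CSV-writer idea, here is a minimal sketch that turns rows of strings into one CSV entry inside a zip archive, entirely in memory. The class and method names (CsvZipSketch, escape, writeCsvEntry, zipCsv) are made up for this example, and the quoting rule is a simple RFC 4180 style assumption rather than anything the original answer prescribes. In a servlet you would pass a ZipOutputStream wrapped around response.getOutputStream() instead of the in-memory buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class CsvZipSketch {

    /** Quotes a field if it contains a comma, a double quote or a line break. */
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Writes the rows as one CSV entry into an already open zip stream. */
    static void writeCsvEntry(ZipOutputStream zipOut, String entryName,
                              List<List<String>> rows) throws IOException {
        zipOut.putNextEntry(new ZipEntry(entryName));
        Writer writer = new OutputStreamWriter(zipOut, StandardCharsets.UTF_8);
        for (List<String> row : rows) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(escape(row.get(i)));
            }
            writer.write(line.append("\r\n").toString());
        }
        // Flush only: closing the writer would also close zipOut.
        writer.flush();
        zipOut.closeEntry();
    }

    /** Builds a complete zip archive in memory; no File is involved. */
    static byte[] zipCsv(String entryName, List<List<String>> rows) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(bytes)) {
            writeCsvEntry(zipOut, entryName, rows);
        }
        return bytes.toByteArray();
    }
}
```

Calling writeCsvEntry once per file on the same ZipOutputStream gives one archive with several CSV files, which matches the "repeat for the next file" step in the servlet above.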

If your data is not huge, meaning it can stay in memory, then exporting to CSV, zipping it up and streaming it for download can all be done on the fly. Caching can be done at any of these steps, depending on your application's business logic.


Creating mp4 file doesn't remove tmp files

I'm trying to write an InputStream containing an mp4, which I get from calling an external SOAP service, to disk. Whenever I do so, tmp files are generated in my chosen temporary directory (java.io.tmpdir); they cannot be removed and stay there after the writing is done.
Writing images that I also get from the SOAP service works normally, without permanent tmp files in the directory. I'm using Java 1.8 and Spring Boot.
This is what I'm doing:
File targetFile = new File("D:/archive/video.mp4");
targetFile.getParentFile().mkdirs();
targetFile.setWritable(true);
InputStream inputStream = filesToWrite.getInputStream();
OutputStream outputStream = new FileOutputStream(targetFile);
try {
    int byteRead;
    while ((byteRead = inputStream.read()) != -1) {
        outputStream.write(byteRead);
    }
} catch (IOException e) {
    logger.fatal("Error# SaveFilesThread for guid: " + guid, e);
} finally {
    try {
        inputStream.close();
        outputStream.flush();
        outputStream.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I also tried:
byte data[] = IOUtils.toByteArray(inputStream);
Path file = Paths.get("video.mp4");
Files.write(file, data);
And from apache commons IO:
FileUtils.copyInputStreamToFile(initialStream, targetFile);
When your code starts, the damage is already done. Your code is not the source of the temporary files (though it is a ton of work for something that could be done much more simply, see below); the source is the framework that ends up handing you that filesToWrite variable.
It is somewhat likely that you can hook in at an earlier point, get the raw InputStream representing the socket or HTTP connection, and start saving the files straight from there. Alternatively, perhaps filesToWrite has a way to get at the files themselves, in which case you can just move them into place instead of copying them over.
But your code to do this is a mess: it has bad exception handling, can leak resources, is far too much code for a simple job, and is possibly 2000x to 10000x slower than needed, depending on your hard disk (I'm not exaggerating: calling single-byte read() on unbuffered streams is thousands of times slower!).
// add `throws IOException` to your method signature.
// it saves files, it's supposed to throw IOException,
// 'doing I/O' is in the very definition of your method!
// Note: InputStream.transferTo requires Java 9 or later.
try (InputStream in = filesToWrite.getInputStream();
        OutputStream out = new FileOutputStream(targetFile)) {
    in.transferTo(out);
}
That's it. That solves all the problems: no leaks, no speed loss, a tiny amount of code, and it fixes the deplorable error handling (which, here, is 'log something to the log, then print something to standard error, then potentially leak a bunch of resources, then don't tell the calling code anything went wrong and return exactly as if the copy operation succeeded').
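Since the question mentions Java 1.8, where InputStream.transferTo is not available, here is a sketch of the same buffered copy using Files.copy, which has existed since Java 7. The class name SaveStreamSketch and the method save are made up for illustration. How you obtain the stream (filesToWrite.getInputStream() in the question) stays the same.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any library.
final class SaveStreamSketch {

    /**
     * Copies everything from the stream into the target file, replacing an
     * existing file, and returns the number of bytes written.
     * The stream is closed whether or not the copy succeeds.
     */
    static long save(InputStream in, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent); // like mkdirs(), but reports failures
        }
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Usage would look like save(filesToWrite.getInputStream(), Paths.get("D:/archive/video.mp4")), with the IOException propagating to the caller as recommended above.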

Creating Zip file while client is downloading

I'm trying to develop something like Dropbox (a very basic one). Downloading one file is really easy: just use ServletOutputStream. What I want is this: when the client asks for multiple files, I zip the files on the server side and then send the archive to the user. But if the files are big, it takes too much time to zip them and send them to the user.
Is there any way to send the files while they are being compressed?
Thanks for your help.
Part of the Java API for ZIP files is actually designed to provide "on the fly" compression. It all fits nicely with both the java.io API and the servlet API, which means this is even... kind of easy (no multithreading required, not even for performance reasons, because your CPU will usually be faster at zipping than your network will be at sending content).
The part you'll be interacting with is ZipOutputStream. It is a FilterOutputStream (which means it is designed to wrap an OutputStream that already exists; in your case, that would be the response's OutputStream), and it will compress every byte you send it, using ZIP compression.
So, say you have a GET request:
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // Your code to handle the request
    List<YourFileObject> responseFiles = ... // Whatever you need to do
    // We declare that the response will contain raw bytes
    response.setContentType("application/octet-stream");
    // We open a ZIP output stream
    try (ZipOutputStream zipStream = new ZipOutputStream(response.getOutputStream())) { // This is Java 7, but not that different from Java 6
        // We need to loop over each file you want to send
        for (YourFileObject fileToSend : responseFiles) {
            // We give a name to the file
            zipStream.putNextEntry(new ZipEntry(fileToSend.getName()));
            // and we copy its content
            copy(fileToSend, zipStream);
        }
    }
}
Of course, you should do proper exception handling. A couple of quick notes, though:
The ZIP file format mandates that each file has a name, so you must create a new ZipEntry each time you start a new file (writing without a current entry fails with a ZipException anyway).
Proper use of the API would be to close each entry once you are done writing to it (at the end of the file). BUT: the Java implementation does that for you. Each time you call putNextEntry, it closes the previous entry (if need be) all by itself.
Likewise, you must not forget to close the ZIP stream, because this properly closes the last entry AND flushes everything that is needed to create a proper ZIP file. Failure to do so will result in a corrupt file. Here, the try-with-resources statement does this: it closes the ZipOutputStream once everything is written to it.
The copy method here is just what you would use to transfer all the bytes from the original file to the output stream; there is nothing ZIP-specific about it. Just call outputStream.write(byte[] bytes).
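The notes above can be checked without a servlet container. The following sketch, with made-up names (ZipEntriesSketch, zip, entryNames), writes several entries to an in-memory stream, relies on putNextEntry to close the previous entry and on try-with-resources to close the archive, and then reads the entry names back with ZipInputStream. It is only meant to illustrate the behavior described here, not to replace the servlet code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class ZipEntriesSketch {

    /** Zips the given name-to-content map into an in-memory archive. */
    static byte[] zip(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zipStream = new ZipOutputStream(sink)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                // Starting a new entry implicitly closes the previous one.
                zipStream.putNextEntry(new ZipEntry(file.getKey()));
                zipStream.write(file.getValue());
            }
        } // close() finishes the last entry and writes the central directory
        return sink.toByteArray();
    }

    /** Reads the entry names back, in the order they were written. */
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```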
EDIT: to clarify...
For example, given a YourFileType that has the following methods:
public interface YourFileType {
    public byte[] getContent();
    public InputStream getContentAsStream();
}
Then the copy method could look like this (this is all very basic Java I/O; you could use a library such as Commons IO to avoid reinventing the wheel):
public void copy(YourFileType file, OutputStream os) throws IOException {
    os.write(file.getContent());
}
Or, for a full streaming implementation:
public void copy(YourFileType file, OutputStream os) throws IOException {
    try (InputStream fileContent = file.getContentAsStream()) {
        byte[] buffer = new byte[4096]; // 4096 is kind of a magic number
        int readBytesCount = 0;
        while ((readBytesCount = fileContent.read(buffer)) >= 0) {
            os.write(buffer, 0, readBytesCount);
        }
    }
}
Using this kind of implementation, your client will start receiving a response almost as soon as you start writing to the ZipOutputStream (the only delay would be that of internal buffers), meaning it should not time out (unless you spend too long building the content to send, but that would not be the zipping part's fault).

Zip file turns out to be empty

I'm working on a HTTP server in Java, which for testing purposes is running under Windows 8.1.
The way it's coded makes it so when a certain parameter is set, it changes the header of the HTTP file and sends the file through the socket with something that works kind of like:
socket.outputStream.write(filter.read());
Assume that the communication works fine, since I have tested it with various other filters and it works perfectly.
One of the filters is supposed to grab the HTML file, zip it and then send it to the client, without creating the file in the server machine. This is the header:
"HTTP/1.1 200 OK\nContent-Type: application/zip\nContent-Disposition: filename=\"" + request + ".zip\"\n";
Afterwards, I set my filter to a class I created (which is copied below) and send the file. My problem is that even though the server is definitely sending data, the client only downloads an empty zip file, with nothing inside.
I've been stuck with this issue for a few days, I can't seem to figure out what's wrong. I think that there's something wrong with how I create the entry or maybe how I close the outputs. I can't be sure.
I'd really appreciate any advice that could be given to me on this issue. Thanks for your attention.
class ZipFilterInputStream extends FilterInputStream
{
    protected ZipFilterInputStream(InputStream inputToFilter) throws IOException
    {
        super(inputToFilter);
        //Get the stuff ready for compression
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ZipOutputStream zout = new ZipOutputStream(out);
        zout.putNextEntry(new ZipEntry("file.html"));
        //Compress the stream
        int data = in.read();
        while (data != -1)
        {
            zout.write(data);
            data = in.read();
        }
        zout.closeEntry();
        zout.finish();
        //Get the stream ready for reading.
        in = new ByteArrayInputStream(out.toByteArray());
        out.close();
    }
    public int read() throws IOException
    {
        return in.read();
    }
}

Display servlet as img src using session attribute

I have a servlet which is used to display an image. This servlet is called by the following tag:
<img src="/displaySessionImage?widgetName=something"/>
My GET and POST handlers delegate to this method:
protected void processRequest(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    HttpSession session = request.getSession();
    String widgetName = request.getParameter("widgetName");
    try {
        // this is my file manager, which was stored earlier
        StorageFile file = (StorageFile) session.getAttribute(widgetName);
        response.setContentType(file.getContentType());
        // the file manager can retrieve the input stream
        InputStream in = file.getInputStream();
        OutputStream outImage = response.getOutputStream();
        byte[] buf = new byte[1024];
        int count = 0;
        while ((count = in.read(buf)) >= 0) {
            outImage.write(buf, 0, count);
        }
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
But this code does not work; the image is not displayed. I think it fails because I stored the file manager, which contains the input stream, in the session. The same method works for another image file that was retrieved from the database and not stored in the session. I have actually printed out the input stream; it contains the same input stream as the database file.
Is something wrong with the code?
Or can I not store the file manager that contains the input stream in a session?
Or am I using the input stream in the wrong way?
You are really not clear about what is actually happening, which is perhaps just ignorance. But storing and passing an InputStream around in the session is already not a good sign. Firstly, it is not serializable. Secondly, you're fully detaching the input stream from the context where it's been created (so it might implicitly have been closed/released when the initial context is finished). Thirdly, an input stream can often be read only once (so once it's read, it cannot be read again anymore, you'd have to create a new one).
The normal approach is to read the InputStream into a byte[] directly after its creation and then store that byte[] in the session instead.
InputStream input = uploadedFile.getInputStream();
ByteArrayOutputStream output = new ByteArrayOutputStream();
// Copy bytes from input to output the usual way.
// ...
byte[] content = output.toByteArray();
// Now store it in session.
And then in the image servlet, just do
// ...
response.getOutputStream().write(content);
You only need to be aware that each byte of a byte[] eats one byte of JVM's memory. Be sure that you don't go overboard. Remove the attribute from the session as soon as you don't need it anymore. Make use of temp file storage if necessary, for sure if you have to deal with large files.
Update: as per your comment on the question:
I am using firebug, response tab is empty, in header tab, response header contain : content-type : image/jpeg, content-length :0 , the server and the date.
A content length of 0 confirms that the input stream had already been read (or its source had implicitly been released). This only confirms my initial guesses. No, manually setting the content length header won't solve the problem. The servlet container already takes care of it automatically when the response body fits fully in the default response buffer; otherwise it switches to chunked encoding anyway.
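To make the "read only once" point concrete, here is a small sketch with made-up names (StreamOnceSketch, readFully). It copies a stream into a byte[] the usual way; calling it a second time on the same stream yields an empty array, which is the same symptom as the content length of 0, while the byte[] itself can be served as often as needed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library.
final class StreamOnceSketch {

    /** Reads whatever is left in the stream into a byte array. */
    static byte[] readFully(InputStream input) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int length;
        while ((length = input.read(buffer)) != -1) {
            output.write(buffer, 0, length);
        }
        return output.toByteArray();
    }
}
```

In the upload code you would call readFully(uploadedFile.getInputStream()) once and store the resulting byte[] in the session; the image servlet then writes that array to the response.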
Your code is not working because you are not able to display the uploaded image as your img syntax is wrong.
Try this:
<img src="${pageContext.request.contextPath}/submit/Java/2.jpg">
Here, /submit is the folder which is created in the project folder. I.e. this is the folder where all uploaded images are saved.

How to pass an InputStream via RMI

Consider these two functions:
Function A takes an InputStream as a parameter.
public void processStream(InputStream stream)
{
    //Do process routine
}
Function B loads a file's content to pass it to Function A as an InputStream.
public void loadFile()
{
    File file = new File("c:\\file.txt");
    //Pass file as InputStream
}
How can I pass the file from Function B to Function A as an InputStream without reading it first?
I did something like this:
File file = new File("c:\\file.txt");
DataInputStream stream= new DataInputStream(new FileInputStream(file));
This generated the exception below:
java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: java.io.DataInputStream
EDIT:
loadFile() is passing the InputStream as RMI response.
The following should work just fine
processStream(new FileInputStream(file));
You just should not attempt to serialize an InputStream instance with ObjectOutputStream, like so:
objectOutputStream.writeObject(inputStream);
which you're apparently doing in the processStream() method. That is exactly what the exception is trying to tell you. How to solve it properly depends on the functional requirement, which you omitted from the question.
Update as per the comment
I am passing the InputStream as an RMI response.
There's the problem. You cannot pass non-serializable objects around as RMI response, let alone unread streams. You need to read the InputStream into a ByteArrayOutputStream the usual IO way and then use its toByteArray() to get a byte[] out of it and pass that instead. Something like:
ByteArrayOutputStream output = new ByteArrayOutputStream();
byte[] buffer = new byte[8192];
try (InputStream input = new FileInputStream(file)) { // closes the file when done
    for (int length; (length = input.read(buffer)) != -1;) {
        output.write(buffer, 0, length);
    }
}
byte[] bytes = output.toByteArray(); // Pass that instead to RMI response.
Be careful with large files though. Every byte of a byte[] eats one byte of JVM's memory.
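The difference between the two kinds of values can be checked locally, without any RMI setup, because RMI marshals arguments and return values using Java serialization. The sketch below uses made-up names (SerializeCheckSketch, canSerialize): it attempts to write a value with ObjectOutputStream and reports whether that worked. A DataInputStream is rejected with NotSerializableException, while a byte[] goes through.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

// Hypothetical helper, not part of any library.
final class SerializeCheckSketch {

    /** Returns true if the value can be written with Java serialization. */
    static boolean canSerialize(Object value) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            // The value, or something it references, does not implement Serializable.
            return false;
        }
    }
}
```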
That exception seems to indicate that you are calling the processStream method on a remote object using something like RMI. If that is the case, you will need to revisit what you are doing. Sending streams of data over RMI is not an easy thing to do. If you are guaranteed to be using small files, you could copy the file data to a byte[] and pass that to the remote method call. If you need to process larger files, however, that will most likely cause memory issues on the client and/or server. In that case, you should use something like rmiio, which provides utilities for streaming data over RMI.
You could just pass the FileInputStream:
processStream(new FileInputStream(yourFile));
The reason you are getting the exception is that DataInputStream is intended for reading primitive Java types from a stream; it does not implement Serializable.
