How to download large files without memory issues in Java

When I try to download a large file (260 MB) from the server, I get this error: java.lang.OutOfMemoryError: Java heap space. I am sure my heap size is less than 252 MB. Is there any way to download large files without increasing the heap size?
How can I download large files without running into this issue? My code is given below:
String path= "C:/temp.zip";
response.addHeader("Content-Disposition", "attachment; filename=\"test.zip\"");
byte[] buf = new byte[1024];
try {
File file = new File(path);
long length = file.length();
BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
ServletOutputStream out = response.getOutputStream();
while ((in != null) && ((length = in.read(buf)) != -1)) {
out.write(buf, 0, (int) length);
}
in.close();
out.close();

There are two places where I can see you could potentially be building up memory usage:
1. In the buffer reading your input file.
2. In the buffer writing to your output stream (HTTPOutputStream?)
For #1, I would suggest reading directly from the file via FileInputStream, without the BufferedInputStream. Try this first and see if it resolves your issue, i.e.:
FileInputStream in = new FileInputStream(file);
instead of:
BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
If #1 does not resolve the issue, you could try periodically flushing the output stream after a certain amount of data has been written (decrease the chunk size if necessary), i.e.:
// 'out' is the output stream obtained from response.getOutputStream() in the question
try (FileInputStream fileInputStream = new FileInputStream(file)) {
    byte[] buf = new byte[8192];
    int bytesRead;
    int bytesBuffered = 0;
    while ((bytesRead = fileInputStream.read(buf)) > -1) {
        out.write(buf, 0, bytesRead);
        bytesBuffered += bytesRead;
        if (bytesBuffered > 1024 * 1024) { // flush after 1 MB
            bytesBuffered = 0;
            out.flush();
        }
    }
} finally {
    if (out != null) {
        out.flush();
    }
}

Unfortunately, you have not mentioned what type out is. If you have memory issues, I guess it is a ByteArrayOutputStream. If so, replace it with a FileOutputStream and write the bytes you are downloading directly to a file.
BTW, do not use the read() method that reads byte by byte. Use read(byte[] arr) instead; it is much faster.
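As a rough illustration of writing the downloaded bytes straight to a file instead of collecting them in memory, here is a minimal sketch using the JDK's Files.copy. The class name SaveToFile, the method name save, and the temp-file target are placeholders chosen for this example, not anything from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SaveToFile {

    /** Streams everything from 'in' into 'target' and returns the number of bytes written. */
    static long save(InputStream in, Path target) throws IOException {
        // Files.copy reads and writes in chunks, so the whole payload is never held in memory.
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real download stream (assumption: any InputStream works here).
        Path target = Files.createTempFile("download", ".bin");
        try (InputStream in = new ByteArrayInputStream(new byte[64 * 1024])) {
            long written = save(in, target);
            System.out.println(written + " bytes written to " + target);
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

This only helps if the data really was being accumulated in memory; a stream that is already written chunk by chunk does not need it.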

First, you can remove the (in != null) check from your while statement; it's unnecessary. Second, try removing the BufferedInputStream and just do:
FileInputStream in = new FileInputStream(file);

There's nothing wrong (with regard to memory usage) with the code you've shown. Either the servlet container is configured to buffer the entire response (look at the web.xml configuration), or the memory is being leaked elsewhere.
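To back up the claim that the copy loop itself needs only one small buffer, here is a minimal sketch that pushes 260 MB through the same kind of loop without ever holding the data. ZeroStream and CountingSink are hypothetical stand-ins for the file and the servlet output stream; they exist only so the example runs on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

final class ConstantMemoryCopy {

    /** Hypothetical source that yields 'size' zero bytes without storing them anywhere. */
    static final class ZeroStream extends InputStream {
        private long remaining;

        ZeroStream(long size) {
            this.remaining = size;
        }

        @Override
        public int read() {
            if (remaining <= 0) return -1;
            remaining--;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = (int) Math.min(len, remaining);
            Arrays.fill(b, off, off + n, (byte) 0);
            remaining -= n;
            return n;
        }
    }

    /** Hypothetical sink that counts the bytes it receives and discards them. */
    static final class CountingSink extends OutputStream {
        long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }
    }

    /** Same shape as the loop in the question: one 1 KB buffer, reused for every chunk. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        long size = 260L * 1024 * 1024; // same order of magnitude as the file in the question
        CountingSink sink = new CountingSink();
        long copied = copy(new ZeroStream(size), sink);
        System.out.println("copied " + copied + " bytes, sink received " + sink.count);
    }
}
```

If a loop like this still runs out of heap in a real servlet, the buffering is happening somewhere behind the output stream rather than in the loop.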

Related

InputStream not fully readable

Why can't I read a POST request with 150k chars?
I can only ever read ~15k chars.
InputStream is = socket.getInputStream();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while (is.available() > 0 && (length = is.read(buffer)) != -1) {
baos.write(buffer, 0, length);
}
System.out.println(baos.toString(StandardCharsets.UTF_8.name()));
UPD: if I drop the is.available() check, the code freezes in the while loop:
InputStream is = socket.getInputStream();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while ((length = is.read(buffer)) != -1) {
baos.write(buffer, 0, length);
}
System.out.println(baos.toString(StandardCharsets.UTF_8.name()));
There are no exceptions.
The docs for available() say:
available()
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking
So I'm going to guess that you internally have 15k buffers and you're only reading up to the end of your own buffer, not to the end of the stream. Frankly, you should ignore available() in this case and just call read(byte[]) until it returns -1.
Your updated code example looks almost exactly like the code I use to read streams. I think the problem must be on the sender's side. Either the sender is not closing the stream properly, or there's some network issue that doesn't allow enough packets through.
For reference, here's the code I use to read an entire stream. (Lightly tested.)
public static ByteArrayOutputStream readFully( InputStream ins )
throws IOException
{
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] bytes = new byte[ 1024 ];
for( int length; ( length = ins.read( bytes ) ) != -1; )
bos.write( bytes, 0, length );
return bos;
}
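A loop that waits for -1 only finishes once the sender closes its side of the connection, and an HTTP client may keep the socket open after sending the POST body, which could explain the freeze. When the body length is known in advance, for example from a Content-Length header, one option is to read exactly that many bytes instead. The sketch below assumes the header has already been parsed into an int; FixedLengthRead, readExactly, and contentLength are names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class FixedLengthRead {

    /**
     * Reads exactly 'length' bytes, looping because a single read() may return fewer.
     * Throws EOFException if the stream ends before 'length' bytes have arrived.
     */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n == -1) {
                throw new EOFException("stream ended after " + off + " of " + length + " bytes");
            }
            off += n;
        }
        return body;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "hello world, followed by unrelated bytes".getBytes(StandardCharsets.UTF_8);
        // Stand-in for the value parsed from the Content-Length header (assumption).
        int contentLength = 11;
        byte[] body = readExactly(new ByteArrayInputStream(wire), contentLength);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The loop stops as soon as the announced number of bytes has arrived, so it does not depend on the sender closing the connection.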

How to get file size before writing it to server?

When I convert the InputStream into a byte array to find the size of the file being uploaded, I do get the size, but afterwards InputStream.read() returns -1. How can I check the file size before writing it to the server?
My current code gives me the size, but the InputStream reaches its end.
private static byte[] readFully(InputStream input) throws IOException
{
byte[] buffer = new byte[8192];
int bytesRead;
byte []bytes=null;
ByteArrayOutputStream output = new ByteArrayOutputStream();
while ((bytesRead = input.read(buffer)) != -1)
{
System.out.println("Bytes read: " + bytesRead); // log the count; a second input.read(buffer) here would consume and drop data
output.write(buffer, 0, bytesRead);
}
bytes=output.toByteArray();
output.close();
return bytes;
}
If you are implementing a web server in Java, please take a look at the following link:
http://www.prasannatech.net/2008/11/http-web-server-java-post-file-upload.html
You must keep reading the incoming data until you find its boundary.
You can't simply call read() at this point, because the InputStream may not be ready to be read yet.
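On the original problem of the stream being exhausted after measuring it: an InputStream can generally be consumed only once, so one workaround is to buffer the upload a single time, take the size from the buffer, and hand later code a fresh stream over the same bytes. This is a minimal sketch that assumes the upload is small enough to hold in memory; SizeThenReuse and readAll are names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class SizeThenReuse {

    /** Drains the stream once into a byte array. */
    static byte[] readAll(InputStream input) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = input.read(buffer)) != -1) {
            output.write(buffer, 0, n);
        }
        return output.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the uploaded file's stream (assumption).
        InputStream upload = new ByteArrayInputStream("example upload".getBytes(StandardCharsets.UTF_8));

        byte[] content = readAll(upload);      // the original stream is now at its end
        int size = content.length;             // size is known before anything is written
        InputStream replay = new ByteArrayInputStream(content); // fresh stream over the same bytes

        System.out.println("size = " + size + ", first byte on replay = " + replay.read());
    }
}
```

For large uploads this trades memory for convenience; spooling to a temporary file and checking its size would be the heavier but safer variant.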

InputStream reader

I'm currently trying to read an image file from the server, but I either get incomplete data or
Exception in thread "main"
java.lang.NegativeArraySizeException.
Does this have something to do with the buffer size? I have tried using a fixed size instead of the content length. Please advise.
URL myURL = new URL(url);
HttpURLConnection connection = (HttpURLConnection)myURL.openConnection();
connection.setRequestMethod("GET");
status = connection.getResponseCode();
if (status == 200)
{
int size = connection.getContentLength() + 1024;
byte[] bytes = new byte[size];
InputStream input = new ByteArrayInputStream(bytes);
FileOutputStream out = new FileOutputStream(file);
input = connection.getInputStream();
int data = input.read(bytes);
while(data != -1){
out.write(bytes);
data = input.read(bytes);
}
out.close();
input.close();
Let's examine the code:
int size = connection.getContentLength() + 1024;
byte[] bytes = new byte[size];
Why do you add 1024 bytes to the size? What's the point? The buffer size should be large enough to avoid too many reads, but small enough to avoid consuming too much memory. Set it to 4096, for example.
InputStream input = new ByteArrayInputStream(bytes);
FileOutputStream out = new FileOutputStream(file);
input = connection.getInputStream();
Why do you create a ByteArrayInputStream, and then forget about it completely? You don't need a ByteArrayInputStream, since you don't read from a byte array, but from the connection's input stream.
int data = input.read(bytes);
This reads bytes from the input. The max number of bytes read is the length of the byte array. The actual number of bytes read is returned and stored in data.
while (data != -1) {
out.write(bytes);
data = input.read(bytes);
}
So you have read data bytes, but you don't write only the first data bytes of the array; you write the whole array. That is wrong. Suppose your array is of size 4096 and data is 400: instead of writing only the 400 bytes that have been read, you write those 400 bytes plus the remaining 3696 bytes of the array, which could be 0 or could hold values from a previous read. It should be
out.write(bytes, 0, data);
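To make the difference concrete, here is a small self-contained comparison of the two write calls. It uses in-memory streams as stand-ins for the connection and the file, and the class and method names are invented for the example. With a 5000-byte input and a 4096-byte buffer, the first variant should produce 8192 bytes (two full buffers) and the second exactly 5000.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartialWriteDemo {

    /** Buggy variant: writes the whole buffer after every read. Returns the output size. */
    static int copyWholeBuffer(InputStream input, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = new byte[bufferSize];
        int data = input.read(bytes);
        while (data != -1) {
            out.write(bytes); // ignores how many bytes were actually read
            data = input.read(bytes);
        }
        return out.size();
    }

    /** Fixed variant: writes only the bytes that the last read returned. Returns the output size. */
    static int copyBytesRead(InputStream input, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = new byte[bufferSize];
        int data = input.read(bytes);
        while (data != -1) {
            out.write(bytes, 0, data);
            data = input.read(bytes);
        }
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[5000];
        System.out.println("whole buffer: " + copyWholeBuffer(new ByteArrayInputStream(source), 4096));
        System.out.println("bytes read:   " + copyBytesRead(new ByteArrayInputStream(source), 4096));
    }
}
```

A network stream may return fewer bytes per read than an in-memory one, so the padding in a real download can appear after any chunk, not only the last.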
Finally:
out.close();
input.close();
If any exception occurs before that point, those two streams will never be closed. Do that a few times, and your whole OS won't have any file descriptors available anymore. Use the try-with-resources statement to be sure your streams are closed, no matter what happens.
This code can help you:
byte[] buffer = new byte[4096];
int n;
// try-with-resources closes both streams even if an exception is thrown
try (InputStream input = connection.getInputStream();
     OutputStream output = new FileOutputStream(file)) {
    while ((n = input.read(buffer)) != -1) {
        // write only the n bytes that were actually read
        output.write(buffer, 0, n);
    }
}

What happens when a file being downloaded is modified on the server?

I'm downloading a zip file from a server but keep getting a corrupted file. I have a slow connection, and I know that the server updates the file frequently. Is this why I get corrupted files? I would assume the network protocol should be smart enough to avoid this kind of situation.
private void downloadFile(String urlString, String fileName)
throws MalformedURLException, IOException {
InputStream input = new URL(urlString).openConnection().getInputStream();
FileOutputStream output = new FileOutputStream(fileName);
int bufferSize = 153600;
byte[] buffer = new byte[bufferSize];
int totalBytesRead = 0;
int bytesRead = 0;
while ((bytesRead = input.read(buffer)) > 0) {
output.write(buffer, 0, bytesRead);
buffer = new byte[bufferSize];
totalBytesRead += bytesRead;
}
output.close();
input.close();
}
Thanks!
It's nothing to do with the protocol, and everything to do with the server software you're using at the other end of your URL. Your code can only read what the server sends you. The server code needs to ensure that it either maintains a write lock on the file while it's streaming it out to you, or otherwise ensures you receive a valid copy of the (unmodified) file.
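One way a server could avoid exposing a half-written file, offered here only as a possible approach and not as something the server in the question does, is to write each new version to a temporary file in the same directory and then move it over the published name in a single step. The sketch below assumes the file system supports atomic moves; whether an existing target is replaced by ATOMIC_MOVE is implementation specific according to the Files.move documentation. AtomicPublish and publish are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicPublish {

    /** Writes newContent to a temp file next to 'target', then moves it into place. */
    static void publish(byte[] newContent, Path target) throws IOException {
        // The temp file lives in the same directory so the move stays on one file system.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "publish", ".tmp");
        try {
            Files.write(tmp, newContent);
            // Readers see either the old file or the new one, never a partial write.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("publish-demo");
        Path target = dir.resolve("data.zip");
        publish("version 1".getBytes(StandardCharsets.UTF_8), target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
        Files.deleteIfExists(target);
        Files.deleteIfExists(dir);
    }
}
```

What happens to a download that is already in progress when the file is swapped still depends on the operating system and the server software.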

Fastest way to copy text from a File to a HttpServletResponse

I need a very fast way to copy text from a file to the body of a HttpServletResponse.
Currently I'm copying byte by byte in a loop, from a BufferedReader to response.getWriter(), but I believe there must be a faster and more straightforward way of doing it.
Thanks!
I like using the read() method that accepts a byte array since you can tweak the size and change the performance.
public static void copy(InputStream is, OutputStream os) throws IOException {
byte buffer[] = new byte[8192];
int bytesRead;
BufferedInputStream bis = new BufferedInputStream(is);
while ((bytesRead = bis.read(buffer)) != -1) {
os.write(buffer, 0, bytesRead);
}
is.close();
os.flush();
os.close();
}
There's no need to do this stuff yourself. It is such a common requirement that open source, battle-tested, optimised solutions exist.
Apache Commons IO has an IOUtils class with a range of static copy methods. Perhaps you could use
IOUtils.copy(reader, writer);
http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html#copy(java.io.Reader, java.io.Writer)
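If adding a dependency is not an option, a buffered Reader-to-Writer copy is short enough to write with the JDK alone. The following is a minimal sketch of that idea, not the Commons IO implementation; TextCopy and the 8192-char buffer size are choices made for this example. In a servlet, the writer argument would be response.getWriter().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

final class TextCopy {

    /** Copies all characters from reader to writer in chunks; returns the number of chars copied. */
    static long copy(Reader reader, Writer writer) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // StringReader/StringWriter stand in for the file reader and the response writer.
        StringWriter out = new StringWriter();
        long copied = copy(new StringReader("some text from a file"), out);
        System.out.println(copied + " chars copied: " + out);
    }
}
```

Copying a block of chars per call avoids the per-character overhead of the loop described in the question.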
This is how I do it in my servlet, with a 4K buffer:
// Send the file.
OutputStream out = response.getOutputStream();
BufferedInputStream is = new BufferedInputStream(new FileInputStream(file));
byte[] buf = new byte[4 * 1024]; // 4K buffer
int bytesRead;
while ((bytesRead = is.read(buf)) != -1) {
out.write(buf, 0, bytesRead);
}
is.close();
out.flush();
out.close();
