Following this thread: Streaming large files in a java servlet.
Is it possible to find the total internet bandwidth available on the current machine through Java?
What I am trying to do: while streaming large files through a servlet, based on the number of parallel requests and the total bandwidth, I want to reduce the BUFFER_SIZE of the stream for each request. Does that make sense?
Is there any pure Java way (without JNI)?
Maybe you can time how long the app needs to send one chunk (the buffer), and if that takes longer than x milliseconds, make your buffer smaller. You can use other values for the initial bufferSize and for the threshold in if (stop - start > 700).
This is based on the thread you mentioned:
ServletOutputStream out = response.getOutputStream();
InputStream in = [ code to get source input stream ];
String mimeType = [ code to get mimetype of data to be served ];
int bufferSize = 1024 * 4;
byte[] bytes = new byte[bufferSize];
int bytesRead;
response.setContentType(mimeType);
while ((bytesRead = in.read(bytes)) != -1) {
    long start = System.currentTimeMillis();
    out.write(bytes, 0, bytesRead);
    long stop = System.currentTimeMillis();
    if (stop - start > 700) {
        // the write took too long: halve the buffer for subsequent chunks
        bufferSize /= 2;
        bytes = new byte[bufferSize];
    }
}
// do the following in a finally block:
in.close();
out.close();
The only way to find available bandwidth is to monitor / measure it. On Windows you have access to Net.exe and can get the throughput of each NIC.
If you're serving the content through a servlet, then you could calculate how fast each servlet output stream is going. Collect that data for all streams for a user/session, and you could determine at least what the current bandwidth usage is.
A possible way to calculate the rate: instead of writing the large files directly to the servlet output stream, write to a new FilterOutputStream that keeps track of your download rates.
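For illustration, a minimal sketch of such a rate-tracking FilterOutputStream might look like the following (the class name and its fields are hypothetical, not from the original answer):
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper: counts bytes written and exposes the average rate so far.
public class RateTrackingOutputStream extends FilterOutputStream {
    private final long startNanos = System.nanoTime();
    private long bytesWritten;

    public RateTrackingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        bytesWritten += len;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        bytesWritten++;
    }

    // Average throughput in bytes per second since the stream was opened.
    public double bytesPerSecond() {
        double seconds = (System.nanoTime() - startNanos) / 1_000_000_000.0;
        return seconds > 0 ? bytesWritten / seconds : 0;
    }
}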
The concept of "total internet bandwidth available in current machine" is really hard to define. However, tweaking the local buffer size will not affect how much data you can push through to an individual client.
The rate at which a given client can take data from your server will vary with the client, and with time. For any given connection, you might be limited by your local upstream connection to the Internet (e.g., server on DSL) or you might be limited somewhere in the core (unlikely) or the remote end (e.g., server in a data center, client on a dialup line). When you have many connections, each individual connection may have a different bottleneck. Measuring this available bandwidth is a hard problem; see for example this list of research and tools on the subject.
In general, TCP will handle using all the available bandwidth fairly for any given connection (though it sometimes reacts to changes in available bandwidth more slowly than you'd like). If the client can't handle more data, the write call will block.
You should only need to tweak the buffer size in the linked question if you find that you are seeing low bandwidth and the cause is insufficient data buffered for writing to the network. Another reason you might tweak the buffer size is if you have so many active connections that you are running low on memory.
In any case, the real answer may be not to buffer at all, but instead to put your static files on a separate server and use something like thttpd to serve them (using a system call like sendfile) instead of a servlet. This helps ensure that the bottleneck is not on your server, but somewhere out in the Internet, beyond your control.
EDIT: Re-reading this, it's a little muddled because it's late here. Basically, you shouldn't have to do this from scratch; use one of the existing highly scalable Java servers, since they'll do it better and more easily.
You're not going to like this, but it actually doesn't make sense, and here's why:
Total bandwidth is independent of the number of connections (though there is some small overhead), so messing with buffer sizes won't help much
Your chunks of data are being broken into variable-sized packets anyway; your network card and protocol stack will deal with this better than your servlet can
Resizing buffers regularly is expensive: it is far better to reuse constant buffers from a fixed-size pool and have all connections queue up for I/O rights
There are a billion and a half libraries that assist with this sort of server
Were this me, I would start by looking at multiplexed I/O using NIO. You can almost certainly find a library to do this for you. The IBM article here may be a useful starting point.
I think the smart money gives you one network I/O thread and one disk I/O thread, with multiplexing. Each connection requests a buffer from a pool, fills it with data (from a shared network or disk Stream or Channel), processes it, then returns the buffer to the pool for reuse. No resizing of buffers, just a bit of a wait for each chunk of data. If you want latency to stay short, limit how many transfers can be active at a time and queue up the others.
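As a rough illustration of that fixed-size pool idea (a sketch, not the answerer's actual design), a blocking queue of pre-allocated ByteBuffers could serve as the pool; connections that cannot get a buffer simply wait:
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical fixed-size pool: connections borrow a buffer, fill and process it, then return it.
public class BufferPool {
    private final BlockingQueue<ByteBuffer> pool;

    public BufferPool(int buffers, int bufferSize) {
        pool = new ArrayBlockingQueue<>(buffers);
        for (int i = 0; i < buffers; i++) {
            pool.add(ByteBuffer.allocateDirect(bufferSize));
        }
    }

    // Blocks until a buffer is free, which is what makes connections queue up.
    public ByteBuffer acquire() throws InterruptedException {
        return pool.take();
    }

    public void release(ByteBuffer buffer) {
        buffer.clear();
        pool.add(buffer);
    }
}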
Related
I'm trying to write a bulk downloader for images. Getting the InputStream from a URLConnection is easy enough, but downloading all files takes a while. Using multithreading certainly speeds it up, but having a lot of threads downloading files could use a lot of memory. Here's what I found:
Let in be the InputStream, file the target File and fos a FileOutputStream to file
The simple way
fos.write(in.readAllBytes());
Reads the whole file and writes the returned byte[]. Probably usable for getting a website's source, but no good for bigger files such as images.
Writing chunks
byte[] buffer = new byte[bufsize];
int read;
while ((read = in.read(buffer, 0, bufsize)) >= 0) {
    fos.write(buffer, 0, read);
}
Seems better to me.
in.transferTo(fos)
in.transferTo(fos);
Writes chunks internally, as seen above.
Files.copy()
Files.copy(in, file.toPath(), StandardCopyOption.REPLACE_EXISTING);
Appears to use native implementations.
Which one of these should I use to minimize memory usage when done dozens of times in parallel?
This is a small project for fun; external libraries are overkill for that IMO. Also, I can't use ImageIO, since it can't handle WebM, some PNGs/JPGs, and animated GIFs.
EDIT:
This question was based on the assumption that concurrent writing is possible. However, it doesn't seem like that is the case. I'll probably get the image links concurrently and then download them one after another. Thanks for the answers anyway!
The short answer is: from a memory usage perspective, the best solution is the version that reads and stores data in chunks.
The buffer size should basically be chosen taking into account the number of simultaneous downloads, the available memory, the download speed, and the efficiency of the target drive in terms of data transfer rate and IOPS.
The long answer is that concurrent download of files doesn't necessarily mean the download will be faster.
Whether simultaneous downloads actually speed up the overall download time mostly depends on:
the number of hosts from which you're downloading
the speed of the internet connection of the host from which you're downloading, limited by the speed of that host's network adapter
the speed of your internet connection, limited by the speed of your machine's network adapter
the IOPS of the storage of the host from which you're downloading
the IOPS of the storage you're downloading onto
the transfer rate of the storage on the host from which you're downloading
the transfer rate of the storage you're downloading onto
the performance of the local and remote hosts; for instance, some older or low-cost Android device could be limited by its CPU speed
For instance, it could turn out that if the source host has a single HDD and a single connection already saturates the link, then it is useless to use multiple connections, as that would make the download slower by creating the overhead of switching between transferred files.
It could also be that the source host imposes a speed limit on a single connection, in which case multiple connections could speed things up.
An HDD usually delivers around 80 IOPS and a transfer rate of about 80 MB/s, and it can limit the speed of download/upload by these factors. So in practice you can't write or read more than roughly 80 files per second from such a disk, nor exceed the transfer limit of around 80 MB/s; of course, this depends heavily on the disk model.
An SSD usually has tens of thousands of IOPS and a transfer rate > 400 MB/s, so the limits are much higher, but for really fast internet connections they are still relevant.
I found a time-based (hence performance) comparison on the internet here: journaldev.com/861/java-copy-file
However, if you are focused on memory, you could try to measure the memory consumption yourself using something like the code proposed by #pasha701 here:
Runtime runtime = Runtime.getRuntime();
long usedMemoryBefore = runtime.totalMemory() - runtime.freeMemory();
System.out.println("Used Memory before" + usedMemoryBefore);
// copy file method here
long usedMemoryAfter = runtime.totalMemory() - runtime.freeMemory();
System.out.println("Memory increased:" + (usedMemoryAfter-usedMemoryBefore));
Notice that the returned values are in bytes; divide by 1,000,000 to get values in MB.
Currently I am using this code on both the server and the client side. The client is an Android device.
BufferedOutputStream os = new BufferedOutputStream(socket.getOutputStream(), 10000000);
BufferedInputStream sin = new BufferedInputStream(socket.getInputStream(), 10000000);
os.write("10000000\n".getBytes());
os.flush();
for (int i = 0; i < 10000000; i++) {
    os.write((sampleRead[i] + " ").getBytes());
}
os.flush();
The problem is that this code takes about 80 seconds to transfer the data from the Android client to the server, while it takes only 8 seconds to transfer the same data back from the server to the client. The code is the same on both sides and the buffer is the same too. I also tried different buffer sizes, but the problem is with this segment:
for (int i = 0; i < 10000000; i++) {
    os.write((sampleRead[i] + " ").getBytes());
}
The buffering takes most of the time, while the actual transfer takes only about 6-7 seconds on a 150 Mbps hotspot connection. What could be the problem and how do I solve it?
First of all, as a commenter has already noted, using a monstrously large buffer is likely to be counterproductive. Once your stream buffer is bigger than the size of a network packet, app-side buffering loses its effectiveness. (The data in your "big" buffer needs to be split into packet-sized chunks by the TCP/IP stack before it goes onto the network.) Indeed, if the app-side buffer is really large, you may find that your data gets stuck in the buffer for a long time waiting for the buffer to fill ... while the network is effectively idle.
(The Buffered... readers, writers and streams are primarily designed to avoid lots of syscalls that transfer tiny amounts of data. Above 10K or so, the buffering doesn't help performance much.)
The other thing to note is that in a lot of OS environments the network throughput is actually limited by virtualization and default network stack tuning parameters. To get better throughput, you may need to tune at the OS level.
Finally, if your traffic is going over a network path that is congested, has high end-to-end latency, or includes links with a constrained data rate, then you are unlikely to get fast data transfers no matter how you tune things.
(Compression might help ... if you can afford the CPU overhead at both ends ... but some data links already do compression transparently.)
You could compress the data before transferring it; it will save a lot of memory, and transferring a compressed stream of data is cheaper. For that you need to implement the compression logic on the client side and the decompression logic on the server side; see GZIPInputStream. Also try reducing the buffer size: it is huge for a mobile device.
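A minimal sketch of that suggestion, reusing the socket and sampleRead array from the question (the exact stream wrapping shown here is illustrative, not the poster's code):
import java.io.BufferedOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Client side: compress on the way out; a modest 8 KB buffer is plenty.
GZIPOutputStream gzip = new GZIPOutputStream(
        new BufferedOutputStream(socket.getOutputStream(), 8 * 1024));
for (int i = 0; i < sampleRead.length; i++) {
    gzip.write((sampleRead[i] + " ").getBytes());
}
gzip.finish();   // writes the gzip trailer; the socket stays open

// Server side: mirror it by wrapping the input stream.
GZIPInputStream gin = new GZIPInputStream(socket.getInputStream());
// ... read from 'gin' exactly as you previously read from the plain stream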
I would like to know the difference in terms of performance between these two blocks that try to send a big file over a TCP socket.
I couldn't find many resources explaining their efficiency.
A-
byte[] buffer = new byte[1024];
int number;
while ((number = fileInputStream.read(buffer)) != -1) {
    socketOutputStream.write(buffer, 0, number);
}
B-
byte[] mybytearray = new byte[filesize];
fileInputStream.read(mybytearray);   // read the entire file into memory first
os.write(mybytearray);
Which one is better in terms of transfer delay?
Also, what is the difference if I set the size to 1024 or 65536? How would that affect the performance?
The latency until the last byte of the file arrives is basically identical. However, the first one is preferable, although with a much larger buffer, for the following reasons (a sketch with a larger buffer follows the list):
The data starts arriving sooner.
There is no assumption that the file size fits into an int.
There is no assumption that the entire file fits into memory, so it scales to very large files without code changes.
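For example, variant A with a larger buffer (64 KB here, an arbitrary but reasonable size) would look like this:
byte[] buffer = new byte[64 * 1024];   // bigger buffer: fewer read/write calls, same streaming behavior
int number;
while ((number = fileInputStream.read(buffer)) != -1) {
    socketOutputStream.write(buffer, 0, number);
}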
Your MTU (Maximum Transmission Unit) size is likely to be around 1500 bytes. This means your data will be broken up into (or combined into) chunks of this size no matter what you do. Any reasonable buffer size from 512 bytes up is likely to give you the same transfer speed.
How you send and receive data affects the amount of CPU you use. Unless you have a fast network, e.g. 10 Gb/s, your CPU will more than keep up with your network.
Writing the code in an efficient manner will ensure you don't waste CPU (which is a good thing), but it shouldn't make much difference to your transfer speed, which is limited by your bandwidth (and the latency of your network).
I am writing an application which grabs an XML file from the server and then works with the data inside. My question is: because TCP ensures that all packets arrive, and how it breaks the data apart is beyond my control, does it make sense to cap the buffer size? If so, I can send the data over in chunks and reassemble them on the client side. Obviously I cannot make an infinite buffer. The XML can get fairly large, up to 256 KB, and I am a bit worried about reserving a buffer of that size. The data is pulled by an Android device, but we can assume the device has 1 GB of RAM.
The TCP receive buffer size has nothing to do with the size of the data being transferred. Obviously, you can transport gigabytes of data over TCP streams and that doesn't require the buffer to be of the same size. The buffer size generally has to do with performance (both network and processor on the endpoints) and can be small - you probably don't have to change the default settings in most cases.
You don't need to reassemble it on the client side yourself. Just attach an XML parser directly to the socket InputStream.
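A minimal sketch of that idea using the standard StAX API (the socket variable is assumed from the question; StAX is one possible parser choice, not prescribed by the answer):
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Pull-parse the XML as it arrives; no need to buffer the whole document first.
// (Exception handling omitted for brevity.)
XMLStreamReader xml = XMLInputFactory.newInstance()
        .createXMLStreamReader(socket.getInputStream());
while (xml.hasNext()) {
    if (xml.next() == XMLStreamConstants.START_ELEMENT) {
        System.out.println("element: " + xml.getLocalName());
    }
}
xml.close();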
The default buffers in the network stack are generally tuned to be good on average. Unless your application is particularly unusual (which it does not sound like), you are better off not changing the buffer size. The fact that the endpoints are different also creates tension that prevents easily selecting anything more optimal for both simultaneously.
As suggested, if you use a streaming parser on the receiving side, the buffer size does not really matter. Send the messages as you have them ready to reduce latency caused by batching the entire document.
I am building a java server that needs to scale. One of the servlets will be serving images stored in Amazon S3.
Recently, under load, I ran out of memory in my VM. It happened after I added the code to serve the images, so I'm pretty sure that streaming larger servlet responses is causing my trouble.
My question is : is there any best practice in how to code a java servlet to stream a large (>200k) response back to a browser when read from a database or other cloud storage?
I've considered writing the file to a local temp drive and then spawning another thread to handle the streaming so that the tomcat servlet thread can be re-used. This seems like it would be io heavy.
Any thoughts would be appreciated. Thanks.
When possible, you should not store the entire contents of a file to be served in memory. Instead, acquire an InputStream for the data and copy the data to the servlet OutputStream in pieces. For example:
ServletOutputStream out = response.getOutputStream();
InputStream in = [ code to get source input stream ];
String mimeType = [ code to get mimetype of data to be served ];
byte[] bytes = new byte[FILEBUFFERSIZE];
int bytesRead;
response.setContentType(mimeType);
while ((bytesRead = in.read(bytes)) != -1) {
    out.write(bytes, 0, bytesRead);
}
// do the following in a finally block:
in.close();
out.close();
I do agree with toby, you should instead "point them to the S3 url."
As for the OOM exception, are you sure it has to do with serving the image data? Let's say your JVM has 256MB of "extra" memory to use for serving image data. With Google's help, "256MB / 200KB" = 1310. For 2GB "extra" memory (these days a very reasonable amount) over 10,000 simultaneous clients could be supported. Even so, 1300 simultaneous clients is a pretty large number. Is this the type of load you experienced? If not, you may need to look elsewhere for the cause of the OOM exception.
Edit - Regarding:
In this use case the images can contain sensitive data...
When I read through the S3 documentation a few weeks ago, I noticed that you can generate time-expiring keys that can be attached to S3 URLs. So, you would not have to open up the files on S3 to the public. My understanding of the technique is:
Initial HTML page has download links to your webapp
User clicks on a download link
Your webapp generates an S3 URL that includes a key that expires in, let's say, 5 minutes (see the sketch after this list).
Send an HTTP redirect to the client with the URL from step 3.
The user downloads the file from S3. This works even if the download takes more than 5 minutes - once a download starts it can continue through completion.
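A rough sketch of steps 3 and 4 using the AWS SDK for Java v1 (the SDK version, bucket name, and object key are assumptions, not from the original answer):
import java.net.URL;
import java.util.Date;
import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;

// Step 3: build a URL that S3 will honor for the next 5 minutes.
Date expiration = new Date(System.currentTimeMillis() + 5 * 60 * 1000);
GeneratePresignedUrlRequest presign =
        new GeneratePresignedUrlRequest("my-bucket", "images/photo.jpg")   // placeholder bucket/key
                .withMethod(HttpMethod.GET)
                .withExpiration(expiration);
URL url = s3Client.generatePresignedUrl(presign);   // s3Client: an AmazonS3 instance (assumed)

// Step 4: redirect the browser to S3.
response.sendRedirect(url.toString());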
Why wouldn't you just point them to the S3 url? Taking an artifact from S3 and then streaming it through your own server to me defeats the purpose of using S3, which is to offload the bandwidth and processing of serving the images to Amazon.
I've seen a lot of code like john-vasilef's (currently accepted) answer, a tight while loop reading chunks from one stream and writing them to the other stream.
The argument I'd make is against needless code duplication and in favor of using Apache's IOUtils. If you are already using it elsewhere, or if another library or framework you're using already depends on it, it's a single line that is known and well-tested.
In the following code, I'm streaming an object from Amazon S3 to the client in a servlet.
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;
InputStream in = null;
OutputStream out = null;
try {
    in = object.getObjectContent();
    out = response.getOutputStream();
    IOUtils.copy(in, out);
} finally {
    IOUtils.closeQuietly(in);
    IOUtils.closeQuietly(out);
}
6 lines of a well-defined pattern with proper stream closing seems pretty solid.
toby is right: you should be pointing straight to S3 if you can. If you cannot, the question is a little vague to give an accurate response:
How big is your Java heap? How many streams are open concurrently when you run out of memory?
How big is your read/write buffer (8K is good)?
You are reading 8K from the stream, then writing 8K to the output, right? You are not trying to read the whole image from S3, buffer it in memory, and then send the whole thing at once?
If you use 8K buffers, you could have 1,000 concurrent streams going in ~8 MB of heap space, so you are definitely doing something wrong...
BTW, I did not pick 8K out of thin air: it is the default size for socket buffers. Send more data, say 1 MB, and you will be blocking on the TCP/IP stack while holding a large amount of memory.
I agree strongly with both toby and John Vasileff: S3 is great for offloading large media objects if you can tolerate the associated issues. (An instance of our own app does that for 10-1000 MB FLVs and MP4s.) For example, there are no partial requests (byte range header), so one has to handle that 'manually'; there is occasional downtime; etc.
If that is not an option, John's code looks good. I have found that a byte buffer of 2k for FILEBUFFERSIZE is the most efficient in microbenchmarks. Another option might be a shared FileChannel. (FileChannels are thread-safe.)
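To illustrate the shared-FileChannel option: a single channel can be shared by all request threads because positional reads do not touch the channel's own position. This is a sketch, not the answerer's code; the file path and the 'out' stream are assumed:
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// One channel shared by every request thread; each thread keeps its own offset.
FileChannel channel = FileChannel.open(
        Paths.get("/data/image.bin"), StandardOpenOption.READ);   // placeholder path

// Per-request copy loop using positional reads (thread-safe on a shared channel).
ByteBuffer buffer = ByteBuffer.allocate(2 * 1024);
long position = 0;
int read;
while ((read = channel.read(buffer, position)) != -1) {
    out.write(buffer.array(), 0, read);   // 'out' is the servlet OutputStream (assumed)
    position += read;
    buffer.clear();
}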
That said, I'd also add that guessing at what caused an out-of-memory error is a classic optimization mistake. You will improve your chances of success by working with hard metrics.
Place -XX:+HeapDumpOnOutOfMemoryError in your JVM startup parameters, just in case
Use jmap on the running JVM (jmap -histo <pid>) under load
Analyze the metrics (the jmap -histo output, or have jhat look at your heap dump). It very well may be that your out-of-memory error is coming from somewhere unexpected.
There are of course other tools out there, but jmap and jhat come with Java 5+ 'out of the box'
I've considered writing the file to a local temp drive and then spawning another thread to handle the streaming so that the tomcat servlet thread can be re-used. This seems like it would be io heavy.
Ah, I don't think you can do that. And even if you could, it sounds dubious. The Tomcat thread that is managing the connection needs to stay in control. If you are experiencing thread starvation, then increase the number of available threads in ./conf/server.xml. Again, metrics are the way to detect this; don't just guess.
Question: are you also running on EC2? What are your Tomcat's JVM startup parameters?
You have to check two things:
Are you closing the streams? Very important.
Maybe you're giving out stream connections "for free". The streams are not large, but many, many streams at the same time can steal all your memory. Create a pool so that you cannot have more than a certain number of streams running at the same time.
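For illustration, one simple way to cap the number of concurrent streams is a Semaphore guarding the copy code (the limit of 100 is an arbitrary placeholder):
import java.util.concurrent.Semaphore;

// Shared by all requests: at most 100 streams may be copying at once (placeholder limit).
private static final Semaphore streamPermits = new Semaphore(100);

// In the request handler:
streamPermits.acquireUninterruptibly();   // blocks once the pool is exhausted
try {
    // open the input stream and copy it to the response here
} finally {
    streamPermits.release();
}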
In addition to what John suggested, you should repeatedly flush the output stream. Depending on your web container, it is possible that it caches parts or even all of your output and flushes it at once (for example, to calculate the Content-Length header). That would burn quite a bit of memory.
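In John's copy loop that would look something like the following (flushing every chunk is the simplest variant; flushing every N chunks would also reduce memory pressure):
while ((bytesRead = in.read(bytes)) != -1) {
    out.write(bytes, 0, bytesRead);
    out.flush();   // push each chunk out instead of letting the container buffer the whole response
}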
If you can structure your files so that the static files are separate and in their own bucket, the fastest performance today can likely be achieved by using the Amazon S3 CDN, CloudFront.