I used a standard Java file stream to upload a file. When I tried to upload a 25 MB zip file, it took almost 11 minutes, but when I uploaded the same file to yousendit.com, a file-upload site, it took just 25 seconds. Here is my code:
File file = new File(destination + fileName);
FileOutputStream fileOutputStream = new FileOutputStream(file);
byte[] buffer = new byte[1024];
InputStream in = dataHandler.getDataSource().getInputStream();
int len = in.read(buffer);
while (len != -1) {
fileOutputStream.write(buffer, 0, len);
len = in.read(buffer);
}
fileOutputStream.flush();
fileOutputStream.close();
I have no idea how to speed up the upload. Is there a third-party API for this, or any other suggestions?
You can split the file into chunks and upload each one in a separate thread. As far as I remember, the HTTP standard defines special headers that help the server join the chunks together.
Start by taking a look at FileUpload from Apache.
You may use a Flash or HTML5 plugin to upload the file to your server, and then process the file once it is on the server; that will be much faster, I think.
There is something terribly wrong if a software stack cannot achieve 40kb per second throughput on an upload.
I suggest that you increase the size of buffer. Make it 10 times bigger and see if you get a speedup.
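A minimal sketch of the larger-buffer suggestion, assuming the upload arrives as a plain InputStream. The class name BufferedCopy and the helper copy() are hypothetical; only the buffer size differs in substance from the loop in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original code.
class BufferedCopy {
    // Ten times the 1 KB buffer used in the question.
    static final int BUFFER_SIZE = 10 * 1024;

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the upload and the target file.
        byte[] data = new byte[100_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
```

If timing this against the 1 KB version shows no difference, that points to the bottleneck being somewhere other than the copy loop.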
If that doesn't help I suggest that you profile your system to try to identify where the bottleneck is. The code you've written should not be CPU intensive. If it is, it would be instructive to understand why.
My guess is either that you've got a particularly badly written filter "upstream" of that code ... or that the problem is not in the application at all, despite what the network team thinks. Perhaps it is a problem with virtualization / virtual networking.
I am currently developing a REST service whose request contains a field carrying a file as a Base64 string (of arbitrary length). In the service logic, I convert that string to a File and save it to a predetermined path.
But the problem is that when the file is too large (3MB) the service becomes slow and takes a long time to respond.
This is the code I am using:
String filename = "TEXT.DOCX";
BufferedOutputStream stream = null;
// base64file is the Base64-encoded string that comes from the request
byte[] fileByteArray = java.util.Base64.getDecoder().decode(base64file);
// validate file size
if (1 * 1024 * 1024 < fileByteArray.length) {
logger.info("The file [" + filename + "] is too large");
} else {
stream = new BufferedOutputStream(new FileOutputStream(new File("C:\\" + filename)));
stream.write(fileByteArray);
}
How can I avoid this problem, so that my service does not take so long to convert the string to a File?
Buffering does not improve performance here, since all you are doing is writing a single byte array as fast as possible. Generally the code looks fine; change it to use the FileOutputStream directly and see if that improves things:
try (FileOutputStream stream = new FileOutputStream(path)) {
stream.write(bytes);
}
Alternatively, you could try using something like Apache Commons IO to do the task for you:
FileUtils.writeByteArrayToFile(new File(path), bytes);
Try the following, also for large files.
Path outPath = Paths.get(filename);
try (InputStream in = Base64.getDecoder ().wrap(base64file)) {
Files.copy(in, outPath);
}
This keeps only a buffer in memory. Your code might become slow because of taking more memory.
wrap takes an InputStream which you should provide, not the entire String.
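As the comment above notes, Base64.Decoder.wrap expects an InputStream. A minimal sketch of the streaming variant, assuming the Base64 text is available as a stream; the class and method names are hypothetical, and the ByteArrayInputStream in main is only a stand-in for wherever the data actually comes from:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Base64;

// Hypothetical helper illustrating Base64.Decoder.wrap(InputStream).
class Base64ToFile {
    /** Decodes Base64 text read from base64In and writes the raw bytes to outPath. */
    static void decodeToFile(InputStream base64In, Path outPath) throws IOException {
        try (InputStream decoded = Base64.getDecoder().wrap(base64In)) {
            // Files.copy streams through a small internal buffer.
            Files.copy(decoded, outPath, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        String base64file = Base64.getEncoder()
                .encodeToString("hello".getBytes(StandardCharsets.UTF_8));
        Path out = Files.createTempFile("decoded", ".bin");
        decodeToFile(new ByteArrayInputStream(
                base64file.getBytes(StandardCharsets.US_ASCII)), out);
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

The memory saving only materializes if the Base64 text is read from a stream in the first place; if the whole String is already in memory, wrapping it in a ByteArrayInputStream mainly avoids the second, decoded copy.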
From Network point of view:
Both json and xml can support large amount of data exchange. And, 3MB is not really huge. But, there is a limitation on how much browser can handle (if this call is from a user interface).
Also, a web server like Tomcat has a property that limits POST handling to 2MB by default (check maxPostSize http://tomcat.apache.org/tomcat-6.0-doc/config/http.html#Common_Attributes)
You can also try chunking the request payload (although it shouldn't be required for a 3MB file)
From Implementation point of view:
Write operation on your disk could be slow. It also depends on your OS.
If your file size is really large, you can use Java FileChannel class with ByteBuffer.
To know the cause of slowness (network delay or code), check the performance with a simple Java program against the web service call.
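A minimal sketch of the FileChannel with ByteBuffer option mentioned above, assuming the decoded bytes are already in a byte array. The class name ChannelWrite is hypothetical. Whether this is measurably faster than a FileOutputStream for a 3MB file is something to measure rather than assume.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper showing a FileChannel write.
class ChannelWrite {
    /** Writes bytes to target, replacing any existing content. */
    static void write(Path target, byte[] bytes) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // A single write() may not drain the buffer, so loop until it is empty.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("channel", ".bin");
        write(target, new byte[3 * 1024 * 1024]);
        System.out.println("wrote " + Files.size(target) + " bytes");
    }
}
```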
I need to upload files to a server. It can be done either through a web service or through a UI. I just need to store the file content in a DB.
Files can be any size, up to 2 to 4 GB. I am not sure how to upload such a large file to the server without
getting an out-of-memory exception.
System configuration: 8 GB, Java 7, 64-bit processor.
I am not sure whats the way to upload big size file on server without getting out of memory exception?
That part is easy. Don't buffer the entire file in memory. Stream it straight to disk.
(Pseudo code ... ignoring exception handling and resource management)
InputStream in = ...
OutputStream out = ... // the place you want to ultimately store the file
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = in.read(buffer)) != -1) {
out.write(buffer, 0, bytesRead);
}
// close streams.
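For completeness, a sketch of the same idea with the resource management filled in, assuming Java 7 or later and a file on disk as the target. The names StreamToDisk and save are hypothetical. Files.copy does the buffered loop internally, so memory use stays small regardless of file size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: stream an upload to disk without holding it in memory.
class StreamToDisk {
    /** Copies the upload to target and returns the number of bytes written. */
    static long save(InputStream upload, Path target) throws IOException {
        // try-with-resources closes the input even if the copy fails.
        try (InputStream in = upload) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // A small in-memory stream stands in for the real upload.
        Path target = Files.createTempFile("upload", ".bin");
        long written = save(new ByteArrayInputStream(new byte[64 * 1024]), target);
        System.out.println("stored " + written + " bytes in " + target);
    }
}
```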
UPDATE
You seem to be confused about how to get the input stream.
If you are using the Servlet APIs, then you can get the request's input stream using ServletRequest.getInputStream().
If you are using different APIs, be specific ... and I'll look into it for you.
The more difficult part is dealing with the various issues to do with uploaded file encoding, encapsulation and so on. For that, the best approach is to look for an existing solution. But that depends on the context in which you are doing the uploads; e.g. what web container you are using, etc.
Here's an example: http://commons.apache.org/proper/commons-fileupload/
Our current project requires us to send an audio file to the server and then use the audio file for further computation.
Using the Java Sound API, I was able to capture the recording and save it as a WAV file on my system. Then, to pass the WAV file to the server, I am using Apache Commons HttpClient to post a request to the server. (I am using the InputStreamEntity provided by Apache and sending the data chunked.)
The problem appears when I try to recreate the WAV file on the server. I understand that I would have to use the AudioSystem.write API to create the WAV file (exactly as was done on my system). However, what I observe is that although the file gets created, it does not play (I am using VLC media player to test it, FYI). I have searched Google for sample code and tried to implement it, but I am unable to play the file once it is created.
The sample code snippets indicates the approaches i have tried:
//******************************************************************
try {
InputStream is = request.getInputStream();
FileOutputStream fs = new FileOutputStream("output123.wav");
byte[] tempbuffer = new byte[4096];
int bytesRead;
while((bytesRead=is.read(tempbuffer))!=-1)
{
fs.write(tempbuffer, 0,bytesRead);
}
is.close();
fs.close();
AudioInputStream inputStream = AudioSystem.getAudioInputStream(new File("output123.wav"));
int numofbytes = inputStream.available();
byte[] buffer = new byte[numofbytes];
inputStream.read(buffer);
int bytesWritten = AudioSystem.write(inputStream, AudioFileFormat.Type.WAVE,new File("outputtest.wav"));
System.out.println("written"+bytesWritten);
Approach 2
InputStream is = request.getInputStream();
System.out.println("inputStream obtained : "+is.toString());
ByteArrayInputStream bais = null;
byte[] audioBuffer = IOUtils.toByteArray(is);
System.out.println(" is audioBuffer empty? : length = ? "+audioBuffer.length);
try {
AudioFileFormat ai = AudioSystem.getAudioFileFormat(is);
System.out.println("ai bytelength ? "+ai.getByteLength());
System.out.println("ai frame length = "+ai.getFrameLength());
Set<Map.Entry<String,Object>> audioProperties = ai.getFormat().properties().entrySet();
System.out.println("entry set is empty ? "+audioProperties.isEmpty());
for(Map.Entry me : audioProperties){
System.out.println("key = "+me.getKey());
System.out.println("value ="+me.getValue());}
bais = new ByteArrayInputStream(audioBuffer);
AudioInputStream ais = new AudioInputStream(bais, new AudioFormat(8000,8,2,true,true), 2);
AudioSystem.write(ais, AudioFileFormat.Type.WAVE,new File("testtest.wav"));
//*************************************************************************************
The audioFormat properties all turned out to be null. Are these null values giving the problem? So while creating the wave file on the server, I tried to set the properties manually once again. But even then the wav file would not play.
I have also tried quite a few approaches already mentioned on this site, but somehow they aren't working. I am sure i am missing something, but I am unable to pinpoint the exact problem.
Would be really helpful, if you guys can point out how to go about the conversion from ServletInputStream to getting a wav.
P.S (1) I know the code is shabby, because i have been under a trial and error situation for quite some time now. But I will give more details on the approaches if needed.
2) Apologise for the clumsiness, this happens to be my first post.. )
This (from Approach 1) is not how you copy a stream; the correct stream-copying code is just above it in your own snippet:
int numofbytes = inputStream.available();
byte[] buffer = new byte[numofbytes];
inputStream.read(buffer);
If all your server wants to do is get the data and write it to a file, then you do not need to use any of the audio API: simply treat the data as a stream of bytes.
So the part of approach 1 that is before any mention of AudioInputStream should be sufficient.
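A sketch of that point, assuming the client sends the complete WAV file (header included) as the raw request body. The class WavByteCopy and its methods are hypothetical, and an in-memory stream stands in for the servlet input stream. The audio API is used only to fabricate a test file and to check the result, not for the copy itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical demo class; names are illustrative only.
class WavByteCopy {
    /** Builds one second of silence as a complete WAV file in memory (stand-in for the recording). */
    static byte[] sampleWav() throws IOException {
        // 8 kHz, 16-bit, mono, signed, little-endian PCM
        AudioFormat format = new AudioFormat(8000f, 16, 1, true, false);
        byte[] pcm = new byte[8000 * 2];
        AudioInputStream ais = new AudioInputStream(
                new ByteArrayInputStream(pcm), format, pcm.length / format.getFrameSize());
        ByteArrayOutputStream wav = new ByteArrayOutputStream();
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, wav);
        return wav.toByteArray();
    }

    /** Server side: a plain byte copy, no audio API involved. */
    static void saveUpload(InputStream body, File target) throws IOException {
        try (InputStream in = body; OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException, UnsupportedAudioFileException {
        File target = File.createTempFile("upload", ".wav");
        saveUpload(new ByteArrayInputStream(sampleWav()), target);
        // The copied file is still recognized as audio because the header was never touched.
        AudioFileFormat fmt = AudioSystem.getAudioFileFormat(target);
        System.out.println(fmt.getType() + ", " + fmt.getFormat());
    }
}
```

If the client sends headerless PCM samples instead of a WAV file, a byte copy will not produce a playable file, and the server would need to know the exact AudioFormat used for recording.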
Although the approach chosen might not be the perfect solution, due to time constraints I adopted a simpler one. Using java.util.zip, I simply zipped the file up and sent it to the server, then wrote a layer in which the file gets unzipped, and then deleted the zip files. It seems like an immature solution (because the original challenge was to send the audio file). I now incur the overhead of zipping the files, but the file transfer should happen relatively faster. Thanks for your help, guys.
I would like to write simple Java downloader for my backup website. What is important, applet should be able to download many files at once.
So, here is my problem. Such applet seems to me easily to hack or infect. What is more, it for sure will need many system resources to run. So, I would like to hear your opinions what is the best, the most optimal and the most secure way to do it.
I thought about something like this:
//user chose directory to download his files
//fc is a FileChooser
//fc.showSaveDialog(this)==JFileChooser.APPROVE_OPTION
try {
for(i=0;i<=urls.length-1;i++){
String fileName = '...';//obtaining filename and extension
fileName=fileName.replaceAll(" ", "_");
//I am not sure if line above resolves all problems with names of files...
String path = file.getAbsolutePath() + File.separator + fileName;
try{
InputStream is=null;
FileOutputStream os=new FileOutputStream(path);
URLConnection uc = urls[i].openConnection();
is = uc.getInputStream();
int a=is.read();
while(a!=-1){
os.write(a);
a=is.read();
}
is.close();
os.close();
}
catch(InterruptedIOException iioe) {
//TODO User cancelled.
}
catch(IOException ioe){
//TODO
}
}
}
but I am sure that there is a better solution.
There is one more thing - when user wants to download really huge amount of files (e.g. 1000, between 10MB and 1GB), there will be several problems. So, I thought about setting a limit for it, but I don't really know how to decide how many files at once is OK. Should I check user's Internet connection or computer's load?
Thanks in advance
BroMan
I would like to write simple Java downloader for my backup website.
What is important, applet should be able to download many files at once.
I hope you mean sequentially like your code is written. There would be no advantage in this situation to run multiple download streams in parallel.
Such applet seems to me easily to hack or infect.
Make sure to encrypt your communication stream. Since it looks like you are just accessing URLs on the server, maybe configure your server to use HTTPS.
What is more, it for sure will need many system
resources to run.
Why do you assume that? The network bandwidth will be the limiting factor. You are not going to be taxing your other resources very much. Maybe you meant avoiding saturating user's bandwidth. You can implement simple throttling by giving user a configurable delay that you insert between every file or even every iteration of your read/write loop. Use Thread.sleep to implement the delay.
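A minimal sketch of the Thread.sleep throttling described above, inserting the delay after every chunk. The class name ThrottledCopy, the 8 KB chunk size, and the delayMillis parameter are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copy a stream with a user-configurable pause per chunk.
class ThrottledCopy {
    /**
     * Copies in to out, sleeping delayMillis after each chunk.
     * A delay of 0 disables throttling. Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, long delayMillis)
            throws IOException, InterruptedException {
        byte[] buf = new byte[8192];
        long total = 0;
        int count;
        while ((count = in.read(buf)) != -1) {
            out.write(buf, 0, count);
            total += count;
            if (delayMillis > 0) {
                Thread.sleep(delayMillis); // crude rate limit: at most 8 KB per delay interval
            }
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(new byte[40_000]), sink, 5);
        System.out.println("copied " + copied + " bytes with throttling");
    }
}
```

With an 8 KB chunk and a 5 ms delay, the upper bound is roughly 1.6 MB per second; the real rate will be lower because the reads and writes take time too.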
So, I thought about setting a limit for it, but I don't
really know how to decide how many files at once is OK.
Assuming you are doing download sequentially, setting limits isn't a technical question. More about what kind of service you want to provide. More files just means the download takes longer.
int a=is.read();
Your implementation of stream read/write is very inefficient. You want to read/write in chunks rather than single bytes. See the versions of read/write methods that take byte[].
Here is the basic logic flow to copy data from an input stream to an output stream.
InputStream in = null;
OutputStream out = null;
try
{
in = ...
out = ...
final byte[] buf = new byte[ 1024 ];
for( int count = in.read( buf ); count != -1; count = in.read( buf ) )
{
out.write( buf, 0, count );
}
}
finally
{
if( in != null )
{
in.close();
}
if( out != null )
{
out.close();
}
}
EDIT:
Got the directory to live. Now there's another issue in sight:
The files in the storage are stored with their DB id as a prefix
to their file names. Of course I don't want the users to see those.
Is there a way to combine the response.redirect and the header setting
for filename and size?
best,
A
Hi again,
new approach:
Is it possible to create an IIS-like virtual directory within Tomcat in order
to avoid streaming and only make use of a header redirect? I played around with
contexts but couldn't get it going...
any ideas?
thx
A
Hi %,
I'm facing a weird issue with the Java heap space which is close
to bringing me to the ropes.
The short version is:
I've written a ContentManagementSystem which needs to handle
huge files (>600mb) too. Tomcat heap settings:
-Xmx700m
-Xms400m
The issue is that uploading huge files works, even though it's
slow. Downloading files results in a Java heap space exception.
Trying to download a 370mb file makes Tomcat jump to 500mb heap
(which should be ok) and end in a Java heap space exception.
I don't get it, why does upload work and download not?
Here's my download code:
byte[] byt = new byte[1024*1024*2];
response.setHeader("Content-Disposition", "attachment;filename=\"" + fileName + "\"");
FileInputStream fis = null;
OutputStream os = null;
fis = new FileInputStream(new File(filePath));
os = response.getOutputStream();
BufferedInputStream buffRead = new BufferedInputStream(fis);
while((read = buffRead.read(byt))>0)
{
os.write(byt,0,read);
os.flush();
}
buffRead.close();
os.close();
If I'm getting it right the buffered reader should take care of any
memory issue, right?
Any help would be highly appreciated since I ran out of ideas
Best regards,
W
If I'm getting it right the buffered
reader should take care of any memory
issue, right?
No, that has nothing to do with memory issues, it's actually unnecessary since you're already using a buffer to read the file. Your problem is with writing, not with reading.
I can't see anything immediately wrong with your code. It looks as though Tomcat is buffering the entire response instead of streaming it. I'm not sure what could cause that.
What does response.getBufferSize() return? And you should try setting response.setContentLength() to the file's size; I vaguely remember that a web container under certain circumstances buffers the entire response in order to determine the content length, so maybe that's what's happening. It's good practice to do it anyway since it enables clients to display the download size and give an ETA for the download.
Try using the setBufferSize and flushBuffer methods of the ServletResponse.
You had better use java.nio for that, so you can read resources partially and free resources already streamed!
Otherwise, you end up with memory problems despite the settings you've done to the JVM environment.
My suggestions:
The Quick-n-easy: Use a smaller array! Yes, it loops more, but this will not be a problem. 5 kilobytes is just fine. You'll know if this works adequately for you in minutes.
byte[] byt = new byte[1024*5];
A little bit harder: If you have access to sendfile (like in Tomcat with the Http11NioProtocol -- documentation here), then use it
A little bit harder, again: Switch your code to Java NIO's FileChannel. I have very, very similar code running on equally large files with hundreds of concurrent connections and similar memory settings with no problem. NIO can be faster than plain old Java streams in these situations: where the operating system and the target channel support it, transferTo() lets the OS move the data (typically via DMA, Direct Memory Access) without copying it through buffers in your Java code. Here is a code snippet from my own code base; I've ripped out much to show the basics. FileChannel.transferTo() is not guaranteed to send every byte, so it is called in a loop.
WritableByteChannel destination = Channels.newChannel(response.getOutputStream());
FileChannel source = file.getFileInputStream().getChannel();
while (total < length) {
long sent = source.transferTo(start + total, length - total, destination);
total += sent;
}
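A self-contained sketch of the same transferTo loop, with the variables the snippet above leaves out declared explicitly. The class name ChannelSend is hypothetical, and a ByteArrayOutputStream stands in for response.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: send a whole file to an OutputStream via FileChannel.transferTo.
class ChannelSend {
    /** Sends the file to out and returns the number of bytes transferred. */
    static long send(Path file, OutputStream out) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            // The channel is deliberately not closed here: closing it would close out,
            // which in a servlet belongs to the container.
            WritableByteChannel destination = Channels.newChannel(out);
            long length = source.size();
            long total = 0;
            while (total < length) {
                long sent = source.transferTo(total, length - total, destination);
                if (sent == 0 && source.size() <= total) {
                    break; // the file shrank while we were sending it
                }
                total += sent;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("download", ".bin");
        Files.write(file, new byte[256 * 1024]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("sent " + send(file, sink) + " bytes");
    }
}
```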
The following code streams data to the client while allocating only a small buffer (BUFFER_SIZE; this is a tuning point, since you may want to adjust it):
private static final int OUTPUT_SIZE = 1024 * 1024 * 50; // 50 MB
private static final int BUFFER_SIZE = 4096;
@Override
protected void doGet(HttpServletRequest request,HttpServletResponse response)
throws ServletException, IOException {
String fileName = "42.txt";
// build response headers
response.setStatus(200);
response.setContentLength(OUTPUT_SIZE);
response.setContentType("text/plain");
response.setHeader("Content-Disposition",
"attachment;filename=\"" + fileName + "\"");
response.flushBuffer(); // write HTTP headers to the client
// streaming result
InputStream fileInputStream = new InputStream() { // fake input stream
int i = 0;
@Override
public int read() throws IOException {
if (i++ < OUTPUT_SIZE) {
return 42;
} else {
return -1;
}
}
};
ReadableByteChannel input = Channels.newChannel(fileInputStream);
WritableByteChannel output = Channels.newChannel(
response.getOutputStream());
ByteBuffer buffer = ByteBuffer.allocate(BUFFER_SIZE);
while (input.read(buffer) != -1) {
buffer.flip();
output.write(buffer);
buffer.clear();
}
input.close();
output.close();
}
Are you required to serve files using Tomcat? For this kind of task we have used a separate download mechanism. We chained Apache -> Tomcat -> storage and then added rewrite rules for downloads. That way you bypass Tomcat, and Apache serves the file to the client (Apache -> storage). But it works only if your files are stored as files; if you read from a DB or another type of non-file storage, this solution cannot be used. The overall scenario is that you generate download links for files as, e.g., domain/binaries/xyz... and write a redirect rule for domain/files using Apache mod_rewrite.
Do you have any filters in the application, or do you use the tcnative library? You could try to profile it with jvisualvm?
Edit: Small remark: Note that you have a HTTP response splitting attack possibility in the setHeader if you do not sanitize fileName.
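One way to sanitize fileName before it goes into the header, as a sketch: replace control characters (which include CR and LF), double quotes, and backslashes. The class name HeaderSafe and the choice of "_" as the replacement are assumptions; stricter whitelisting may suit your application better.

```java
// Hypothetical helper: build a Content-Disposition value from an untrusted file name.
class HeaderSafe {
    /** Returns a header value in which fileName can no longer break out of the quoted string. */
    static String contentDisposition(String fileName) {
        // \p{Cntrl} covers CR, LF and other control characters; quotes and
        // backslashes are replaced so the quoted-string stays well-formed.
        String safe = fileName.replaceAll("[\\p{Cntrl}\"\\\\]", "_");
        return "attachment;filename=\"" + safe + "\"";
    }

    public static void main(String[] args) {
        System.out.println(contentDisposition("report.pdf"));
        System.out.println(contentDisposition("evil\r\nSet-Cookie: x=1"));
    }
}
```

The result would then be passed to response.setHeader("Content-Disposition", ...) in place of the concatenated string from the question.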
Why don't you use Tomcat's own file-serving servlet (DefaultServlet)?
It can surely serve files much better than you can possibly imagine.
A 2 MB buffer is way too large! A few KB should be ample. Megabyte-sized objects are a real issue for the garbage collector, since they often need to be treated separately from "normal" objects (normal == much smaller than a heap generation). To optimize I/O, your buffer only needs to be slightly larger than your I/O buffer size, i.e. at least as large as a disk block or network packet.