Using Akka actors to download many files - java

Background
I'm trying to use Akka actors to download many files. (Akka 2.5.1 & Java 8)
Each actor is assigned a different URL it should download from.
A separate actor creates the downloader actors and should not wait for them to finish. Once a downloader finishes, it will create another actor to handle the downloaded file.
The problem
When I run only one actor - it is able to download the file.
As I increase the number of actors, it seems like none of them is able to finish its task. They download a portion of their files and stop with no particular error / exception.
Actors creation code:
ActorRef downloaderActor = context().actorOf(Props.create(DownloaderActor.class));
downloaderActor.tell("URL to download", this.getSelf());
Inside the DownloaderActor class I have a download function where the problem seems to occur:
public void downloadFile(String fileURL, String saveDir) {
    try {
        URL url = new URL(fileURL);
        HttpURLConnection httpConn = (HttpURLConnection) url.openConnection();
        int responseCode = httpConn.getResponseCode();
        if (responseCode == HttpURLConnection.HTTP_OK) {
            InputStream inputStream = httpConn.getInputStream();
            // derive the target file name from the URL (this line was missing from the original snippet)
            String fileName = fileURL.substring(fileURL.lastIndexOf('/') + 1);
            String saveFilePath = saveDir + File.separator + fileName;
            FileOutputStream outputStream = new FileOutputStream(saveFilePath);
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                outputStream.write(buffer, 0, bytesRead);
            }
            outputStream.close();
            inputStream.close();
            System.out.println("File downloaded: " + fileURL);
        } else {
            System.out.println("No file to download. Server replied HTTP code: "
                    + responseCode + " when accessing " + url);
        }
        httpConn.disconnect();
    } catch (MalformedURLException murl) {
        murl.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}
And to be more specific, the problem seems to be in the "while" loop: if I add logging there, I can see that the loop runs and then stops after a while.
Failed attempts
I also tried to set some HTTP connection parameters:
httpConn.setRequestProperty("Range", "bytes=0-24");
httpConn.setConnectTimeout(10_000_000);
But it didn't seem to help.
I also tried putting the download function as a static function in a separate Util class, but that didn't help either.
I will appreciate any help here.

You are using blocking I/O here to download the file, which means you need a thread for each concurrent download.
You're on the right track in the sense that the Actor model can help you model your problem without requiring one thread per concurrent download: actors are a good way to model concurrent asynchronous processes.
To actually take advantage of this, however, you still need to write the 'implementation' of the actor to be non-blocking. There are a number of non-blocking HTTP libraries available on the JVM; for example, you could use akka-http's future-based client API together with the 'ask' pattern.
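For illustration, here is a minimal sketch of such a non-blocking downloader in Java (assuming akka-http 10.0.x on the classpath; the actor name and target path are illustrative, not from the question). Instead of 'ask' it uses the closely related 'pipe' pattern, delivering the stream's result back to the actor as a message:

import java.nio.file.Paths;
import java.util.concurrent.CompletionStage;
import akka.actor.AbstractActor;
import akka.http.javadsl.Http;
import akka.http.javadsl.model.HttpRequest;
import akka.pattern.PatternsCS;
import akka.stream.ActorMaterializer;
import akka.stream.IOResult;
import akka.stream.Materializer;
import akka.stream.javadsl.FileIO;

public class NonBlockingDownloaderActor extends AbstractActor {
    private final Materializer mat = ActorMaterializer.create(getContext());

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(String.class, url -> {
                    // Fire the request without blocking the actor's thread...
                    CompletionStage<IOResult> done = Http.get(getContext().getSystem())
                            .singleRequest(HttpRequest.create(url), mat)
                            // ...and stream the response body straight to disk.
                            .thenCompose(response -> response.entity().getDataBytes()
                                    .runWith(FileIO.toPath(Paths.get("download.tmp")), mat));
                    // Deliver the IOResult to this actor as an ordinary message.
                    PatternsCS.pipe(done, getContext().getDispatcher()).to(getSelf());
                })
                .match(IOResult.class, result ->
                        System.out.println("Download finished: " + result.wasSuccessful()))
                .build();
    }
}

Because nothing blocks, a handful of threads can drive many concurrent downloads.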

I found the origin of the problem:
I was running the code in a JUnit context. It seems that at some point JUnit cuts the running threads, terminating the activity of the actors.
Once I started running the program in regular run mode, the problem was (seemingly) gone.
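For reference, a minimal sketch of keeping a JUnit test alive until the actor system has terminated (MasterActor and urls are illustrative names, not from the question; it assumes the master calls getContext().getSystem().terminate() once all downloads are done):

@Test
public void downloadsAllFiles() throws Exception {
    ActorSystem system = ActorSystem.create("downloads");
    ActorRef master = system.actorOf(Props.create(MasterActor.class));
    master.tell(urls, ActorRef.noSender());
    // Block the test thread until the system terminates (or time out),
    // so JUnit cannot tear down the actor threads mid-download.
    system.getWhenTerminated().toCompletableFuture().get(5, TimeUnit.MINUTES);
}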

Related

Android Studio returns exception from GitHub API request

I am creating an Android app to get repository info from a GitHub account.
The function below gets repository data from the GitHub account https://github.com/vGrynishyn.
I received a result only 2-3 times using Android Studio; after that I receive the following exception in urlConnection.getResponseCode(): android.os.NetworkOnMainThreadException.
But when I tried running it in IntelliJ IDEA (as a command line app), it worked every time without problems.
Could you please advise me where the error is?
private String getGitHubRepositoryContent() {
    String line = null;
    try {
        URL url = new URL("https://api.github.com/users/vGrynishyn/repos");
        HttpsURLConnection urlConnection = (HttpsURLConnection) url.openConnection();
        int responseCode = urlConnection.getResponseCode();
        InputStream in;
        if (urlConnection.getResponseCode() < HttpsURLConnection.HTTP_BAD_REQUEST) {
            in = urlConnection.getInputStream();
        } else {
            in = urlConnection.getErrorStream();
        }
        BufferedReader br = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        while (br.read() != -1) {
            line = br.readLine();
        }
        System.out.println("Response code: " + responseCode);
        System.out.println(line);
    } catch (Exception e) {
        e.printStackTrace(); // do something
    }
    return line;
}
The error suggests you are executing a network operation on the main thread. Android doesn't allow this, in order to prevent your UI from freezing. Perform network operations on another thread, or use the Retrofit library or AsyncTasks for network operations.
The problem here is that you're running your method getGitHubRepositoryContent() on the main thread. This is not the right way to do network requests on Android. I would also recommend against using HttpsURLConnection. Use Retrofit or Volley for making HTTP requests, since they take care of making the network call in a background thread and give you a callback with the HTTP response on the main thread.
But when I tried running it in IntelliJ IDEA (as a command line app), it worked every time without problems.
This runs without error on the command line because there are no separate UI events to be handled. On the command line, when a request is made, your code makes the request on the same thread it's running on, which freezes the program; but this isn't noticeable because there is no separate UI to show a noticeable freeze.
TL;DR You shouldn't make network requests on the main thread in Android. It's recommended to use Retrofit or Volley.
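For illustration, the simplest library-free fix is to hand the blocking call to a background thread and post the result back to the UI thread; a hedged sketch (assuming this runs inside an Activity; textView is a hypothetical view, not from the question):

new Thread(() -> {
    // Network I/O is now off the main thread, so no NetworkOnMainThreadException.
    final String repos = getGitHubRepositoryContent();
    // UI updates must go back to the main thread.
    runOnUiThread(() -> textView.setText(repos));
}).start();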

Java heap space error when uploading files through http basic authentication (JAVA) [duplicate]

I am trying to publish a large video/image file from the local file system to an HTTP path, but I run into an out-of-memory error after some time...
Here is the code:
public boolean publishFile(URI publishTo, String localPath) throws Exception {
    InputStream istream = null;
    OutputStream ostream = null;
    boolean isPublishSuccess = false;
    URL url = makeURL(publishTo.getHost(), this.port, publishTo.getPath());
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    if (conn != null) {
        try {
            conn.setDoOutput(true);
            conn.setDoInput(true);
            conn.setRequestMethod("PUT");
            istream = new FileInputStream(localPath);
            ostream = conn.getOutputStream();
            int n;
            byte[] buf = new byte[4096];
            while ((n = istream.read(buf, 0, buf.length)) > 0) {
                ostream.write(buf, 0, n); // <--- ERROR happens on this line.......???
            }
            int rc = conn.getResponseCode();
            if (rc == 201) {
                isPublishSuccess = true;
            }
        } catch (Exception ex) {
            log.error(ex);
        } finally {
            if (ostream != null) {
                ostream.close();
            }
            if (istream != null) {
                istream.close();
            }
        }
    }
    return isPublishSuccess;
}
Here is the error I am getting:
Exception in thread "Thread-8773" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
    at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:61)
    at com.test.HTTPClient.publishFile(HTTPClient.java:110)
    at com.test.HttpFileTransport.put(HttpFileTransport.java:97)
HttpURLConnection is buffering the data so that it can set the Content-Length header (per the HTTP spec).
One alternative, if your destination server supports it, is to use "chunked" transfers. This will buffer only a small portion of data at a time. However, not all services support it (Amazon S3, for example, doesn't).
Another alternative (and imo a better one) is to use Jakarta HttpClient. You can set the "entity" in a request from a file, and the connection code will set request headers appropriately.
Edit: nos commented that the OP could call HttpURLConnection.setFixedLengthStreamingMode(int length). I was unaware of this method; it was added in 1.5, and I haven't used this class since then.
However, I still suggest using Jakarta HttpClient, for the simple reason that it reduces the amount of code that the OP has to maintain. Code that is boilerplate, yet still has the potential for errors:
The OP correctly handles the loop to copy between input and output. Usually when I see an example of this, the poster either doesn't properly check the returned buffer size, or keeps re-allocating the buffers. Congratulations, but you now have to ensure that your successors take as much care.
The exception handling isn't quite so good. Yes, the OP remembers to close the connections in a finally block, and again, congratulations on that. Except that either of the close() calls could throw IOException, keeping the other from executing. And the method as a whole throws Exception, so that the compiler isn't going to help catch similar errors.
I count 31 lines of code to set up and execute the request (excluding the response code check and the URL computation, but including the try/catch/finally). With HttpClient, this would be somewhere in the range of a half dozen LOC.
Even if the OP had written this code perfectly, and refactored it into methods similar to those in Jakarta Commons IO, s/he shouldn't do that. This code has been written and tested by others. I know that it's a waste of my time to rewrite it, and suspect that it's a waste of the OP's time as well.
conn.setFixedLengthStreamingMode((int) new File(localPath).length());
And for buffering, you could wrap your streams in a BufferedOutputStream and a BufferedInputStream.
A good example of chunked uploading can be found in the gdata-java-client.
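Put together, a hedged sketch of that fix applied to the OP's loop (assuming Java 7+ for try-with-resources; declaring the length up front stops HttpURLConnection from buffering the whole body):

File local = new File(localPath);
conn.setDoOutput(true);
conn.setRequestMethod("PUT");
// Known length => the connection streams instead of buffering in memory.
// (Files over 2 GB need the long overload, available since Java 7.)
conn.setFixedLengthStreamingMode((int) local.length());
try (InputStream istream = new BufferedInputStream(new FileInputStream(local));
     OutputStream ostream = new BufferedOutputStream(conn.getOutputStream())) {
    byte[] buf = new byte[4096];
    int n;
    while ((n = istream.read(buf)) > 0) {
        ostream.write(buf, 0, n); // no longer accumulates the body on the heap
    }
}
int rc = conn.getResponseCode();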
The problem is that the HttpURLConnection class is using a byte array to store your data. Presumably this video you are pushing is taking more memory than available. You have a few options here:
Increase the memory available to your application. You can use the -Xmx1024m option to give 1GB of memory to your application. This will increase the amount of data you can store in memory.
If you still run out of memory, you might want to consider trying another library to push the video up that does not store the data all in memory at once. The Apache Commons HttpClient has such a feature. See this site for more information: http://hc.apache.org/httpclient-3.x/features.html. See this section for multi-part form upload of large files: http://hc.apache.org/httpclient-3.x/methods/multipartpost.html
For anything other than basic GET operations, the built-in java.net HTTP stuff isn't very good. Using Apache Commons HttpClient is recommended for this. It lets you do much more intuitive stuff like this:
HttpClient client = new HttpClient();
PutMethod put = new PutMethod(url);
put.setRequestEntity(new FileRequestEntity(localFile, contentType));
int responseCode = client.executeMethod(put);
which replaces a lot of your boiler-plate code.
HttpsURLConnection#setChunkedStreamingMode(1024 * 1024 * 10); //10MB chunk
This ensures that any file (of any size) is streamed over an HTTPS connection, without internal buffering. This should be used when the file size or the content length is unknown.
Your problem is that you're trying to fit X video bytes into X/N bytes of RAM, when N > 1.
You either need to read the video into a smaller buffer and write it out as you go, make the file smaller, or increase the memory available to your process.
Check your heap size. You can use -Xmx to increase it if you've taken the default.

Java NIO read large file from inputstream

I want to read a large InputStream and return it as a file, so I need to split the InputStream (or read the InputStream in multiple threads). How can I do this? I'm trying to do something like this:
URL url = new URL("path");
URLConnection connection = url.openConnection();
int fileSize = connection.getContentLength();
InputStream is = connection.getInputStream();
ReadableByteChannel rbc1 = Channels.newChannel(is);
ReadableByteChannel rbc2 = Channels.newChannel(is);
FileOutputStream fos = new FileOutputStream("file.ext");
FileChannel fileChannel1 = fos.getChannel();
FileChannel fileChannel2 = fos.getChannel();
fileChannel1.transferFrom(rbc1, 0, fileSize/2);
fileChannel2.transferFrom(rbc2, fileSize/2, fileSize/2);
fos.close();
But it does not affect performance.
You can open multiple (HTTP) Connections to the same resource (URL) but use the Range: Header of HTTP to make each stream begin to read at another point. This can actually speed up the data transfer, especially when high latency is an issue. You should not overdo the parallelism, be aware that it puts additional load on the server.
connection1.setRequestProperty("Range", "bytes=0-" + half);
connection2.setRequestProperty("Range", "bytes=" + half+1 +"-");
This can also be used to resume downloads. It needs to be supported by the server, which can announce this with Accept-Ranges: bytes but does not have to. Be prepared that the first connection might return the whole requested entity (status 200 vs. 206) instead.
You need to read the input streams from the URLConnections in separate threads as this is blocking IO (not sure if the NIO wrapping helps here).
You can use the position(long) method on each channel to set where reading should start.
Check this.
http://tutorials.jenkov.com/java-nio/file-channel.html#filechannel-position
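Combining the two answers above, a rough sketch (error handling and thread setup omitted; the URL is a placeholder): pre-size the target file, then let each connection fetch its own byte range and write it at the matching offset.

URL url = new URL("http://example.com/file.ext");
long half = fileSize / 2;

// Pre-size the file so a transfer starting past the current end is valid.
RandomAccessFile raf = new RandomAccessFile("file.ext", "rw");
raf.setLength(fileSize);
FileChannel out = raf.getChannel();

// Second half: its own connection, its own range, its own file offset.
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestProperty("Range", "bytes=" + half + "-");
ReadableByteChannel in = Channels.newChannel(conn.getInputStream());
out.transferFrom(in, half, fileSize - half); // a robust version loops until the full count arrives

Run one such block per thread (one for bytes=0-..., one for the rest) to actually download the halves in parallel.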
Besides, if you want to download a file partially:
Parallel Downloading
To download multiple parts of a file in parallel, we need to create multiple threads. Each thread is implemented similarly to the simple thread above, except that it needs to download only a part of the file. To do that, HttpURLConnection or its superclass URLConnection provides the method setRequestProperty to set the range of bytes we want to download.
// open Http connection to URL
HttpURLConnection conn = (HttpURLConnection)mURL.openConnection();
// set the range of byte to download
String byteRange = mStartByte + "-" + mEndByte;
conn.setRequestProperty("Range", "bytes=" + byteRange);
// connect to server
conn.connect();
This should be helpful for you.
I found this answer here; you can check the complete tutorial:
http://luugiathuy.com/2011/03/download-manager-java/

Problem with Sending and Receiving Files with SPP over Bluetooth

I am attempting to transfer files (MP3s about six megabytes in size) between two PCs using SPP over Bluetooth (in Java, with the BlueCove API). I can get the file transfer working fine in one direction (for instance, one file from the client to the server), but when I attempt to send any data in the opposite direction during the same session (i.e., send a file from the server to the client), the program freezes and will not advance.
For example, if I simply:
StreamConnection conn;
OutputStream outputStream;
outputStream = conn.openOutputStream();
....
outputStream.write(data); //Data here is an MP3 file converted to byte array
outputStream.flush();
The transfer works fine. But if I try:
StreamConnection conn;
OutputStream outputStream;
InputStream inputStream;
ByteArrayOutputStream out = new ByteArrayOutputStream();
outputStream = conn.openOutputStream();
inputStream = conn.openInputStream();
....
outputStream.write(data);
outputStream.flush();
int receiveData;
while ((receiveData = inputStream.read()) != -1) {
    out.write(receiveData);
}
Both the client and the server freeze, and will not advance. I can see that the file transfer is actually happening at some point, because if I kill the client, the server will still write the file to the hard drive, with no issues. I can try to respond with another file, or with just an integer, and it still will not work.
Anyone have any ideas what the problem is? I know OBEX is commonly used for file transfers over Bluetooth, but it seemed overkill for what I needed to do. Am I going to have to use OBEX for this functionality?
It could be as simple as both programs being stuck in blocking receive calls, waiting for the other end to say something... try adding a ton of log statements so you can see what "state" each program is in (i.e., so it gives you a running commentary such as "trying to receive", "got xxx data", "trying to reply", etc.), or set up debugging, wait until it gets stuck, and then stop one of them and single-step it.
You can certainly use SPP to transfer files between your applications (assuming you are sending and receiving at both ends using your application). From the code snippet it is difficult to tell what is wrong with your program.
I am guessing that you will have to close the stream as an indication to the other side that you are done with sending the data. Note that even though you write the whole file in one chunk, the SPP / Bluetooth protocol layers might fragment it, and the other end could receive it in fragments, so you need to have some protocol to indicate transfer completion.
It is hard to say without looking at the client side code, but my guess, if the two are running the same code (i.e. both writing first, and then reading), is that the outputStream needs to be closed before the reading occurs (otherwise, both will be waiting for the other to close their side in order to get out of the read loop, since read() only returns -1 when the other side closes).
If the stream should not be closed, then the condition to stop reading cannot be to wait for -1 (so either change it to transmit the file size first, or use some other mechanism), as sketched below.
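A minimal sketch of the 'transmit the file size first' idea (reusing the conn and data variables from the question):

// Sender: prefix the payload with its length so the stream can stay open.
DataOutputStream out = new DataOutputStream(conn.openOutputStream());
out.writeLong(data.length);
out.write(data);
out.flush();

// Receiver: read exactly that many bytes instead of waiting for EOF (-1),
// leaving the connection usable for a reply in the other direction.
DataInputStream in = new DataInputStream(conn.openInputStream());
long remaining = in.readLong();
ByteArrayOutputStream fileBytes = new ByteArrayOutputStream();
byte[] buf = new byte[4096];
while (remaining > 0) {
    int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
    if (n == -1) break; // peer closed early
    fileBytes.write(buf, 0, n);
    remaining -= n;
}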
Why did you decide to use ByteArrayOutputStream? Try the following code:
try {
    byte[] buf = new byte[1024];
    int n;
    OutputStream outputStream = conn.openOutputStream();
    InputStream inputStream = conn.openInputStream();
    try {
        while ((n = inputStream.read(buf, 0, 1024)) > -1) {
            outputStream.write(buf, 0, n);
        }
    } finally {
        outputStream.close();
        inputStream.close();
        log.debug("Closed input streams!");
    }
} catch (Exception e) {
    log.error(e);
    e.printStackTrace();
}
And to convert the outputStream you could do something like this:
byte[] currentMP3Bytes = outputStream.toString().getBytes();
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(currentMP3Bytes);

