Reasonable to hold an HttpUrlConnection open indefinitely to a remote REST endpoint?

I am looking to optimize a process that runs continually and makes frequent calls (more than one per second on average) to an external API via a simple REST-style HTTP POST. One thing I've noticed is that the HttpURLConnection is currently created and closed for every API call, as per the following structure (non-essential code and error handling removed for readability).
//every API call
try {
URL url = new URL("..remote_site..");
conn = (HttpURLConnection) url.openConnection();
setupConnectionOptions(conn); // sets things like timeout and useCaches false
outputWriter = new OutputStreamWriter(new BufferedOutputStream(conn.getOutputStream()));
//send request
} finally {
conn.disconnect();
outputWriter.close();
}
I don't have extensive experience dealing with HTTP directly, but based on common sense and general knowledge of sockets, it seems it would be much more efficient to create the connection once and reuse it, reinitializing it only on a problem, to avoid the connection negotiation each time, like this:
//on startup, or error
private void initializeConnection()
{
URL url = new URL("..remote_site..");
conn = (HttpURLConnection) url.openConnection();
setupConnectionOptions(conn); // sets things like timeout and useCaches false
}
//per request
try {
outputWriter = new OutputStreamWriter(new BufferedOutputStream(conn.getOutputStream()));
//send request
} catch (IOException e) {
conn.disconnect();
initializeConnection();
} finally {
outputWriter.close();
}
//on graceful exit
conn.disconnect();
My questions are:
is this a reasonable optimization in general (will the speed increase be noticeable)?
Assuming yes:
should I reuse the output stream as well as the connection?
is it reasonable to only reinitialize connection on error, or should I do it after a certain number of requests / time?

Basically, yes, and it saves a lot of time: setting up a socket takes significant effort, and even more with SSL. That's why "keepalive" was implemented back in the Old Days. It's a little bit counter to the REST philosophy, but it's a performance optimization.
The one caveat is that sockets are a limited resource; in a really heavy-use environment, you could end up with no sockets left for new connections. This is a Bad Thing.
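A minimal sketch of how this reuse usually looks in practice, assuming the standard JDK implementation of HttpURLConnection, which keeps persistent (keep-alive) sockets in an internal cache. An HttpURLConnection object itself is single-use, so the reuse comes from fully reading and closing the response and from not calling disconnect() after every request. The class name KeepAlivePoster and the 5-second timeouts are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class KeepAlivePoster {
    /** Sends one POST and returns the HTTP status code. */
    static int post(URL endpoint, String body) throws IOException {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        // A fresh HttpURLConnection object per request; the JDK may reuse the underlying socket.
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setConnectTimeout(5000); // hypothetical values
        conn.setReadTimeout(5000);
        conn.setFixedLengthStreamingMode(payload.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload);
        }
        int status = conn.getResponseCode();
        // Drain and close the response so the socket can return to the keep-alive cache.
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        if (in != null) {
            try (InputStream response = in) {
                byte[] scratch = new byte[4096];
                while (response.read(scratch) != -1) {
                    // discard the body
                }
            }
        }
        // Deliberately no conn.disconnect() here: it may close the cached socket.
        return status;
    }
}
```

Whether the socket is actually reused also depends on the server and any proxies honoring keep-alive, so it is worth measuring before and after.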

Related

How do I make a Java function that retries a URL connection every half second if the connection takes too long?

So I have a problem with a Java program I have. The program's basic functionality includes basically connecting to a web API for data. The function that does that is something like this:
public static Object getData(String sURL) throws IOException {
URL url = new URL(sURL);
URLConnection request = url.openConnection();
request.connect();
return request.getContent();
}
The code works fine as it is, but recently, after my house changed ISPs, I have found that sometimes the connections take an unreasonably long amount of time, something like 10 seconds or more in about 10% of attempts, while the other 90% takes only around 200ms. I have found it to be faster to ask my program to call the function again in a different thread than to wait for some of these connections to finally connect.
Therefore, I want to change the function so that if after 500ms, the connection did not establish, it would disconnect and a new connection would be attempted. How could I do this?
Somewhere online I read that HttpURLConnection might help, but I am not sure how.

URLConnection allows you to specify the connect and read timeout prior to calling connect():
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/net/URLConnection.html#setConnectTimeout(int)
Sets a specified timeout value, in milliseconds, to be used when
opening a communications link to the resource referenced by this
URLConnection. If the timeout expires before the connection can be
established, a java.net.SocketTimeoutException is raised. A timeout of
zero is interpreted as an infinite timeout.
With 500ms timeout:
try {
URLConnection request = url.openConnection();
request.setConnectTimeout(500); // 500 ms
request.connect();
// on successful connection
} catch (SocketTimeoutException ex) {
// on request timeout
}
This you can pack into a loop, but I recommend limiting the number of attempts made.
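A minimal sketch of such a bounded loop. The names ConnectRetry, Connector, connectWithRetry and forUrl are hypothetical; the Connector interface only exists so that a single attempt can be swapped out or stubbed.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.URLConnection;

class ConnectRetry {
    /** One attempt at opening a connection. */
    interface Connector {
        URLConnection open() throws IOException;
    }

    /** Tries the connector up to maxAttempts times, retrying only on timeouts. */
    static URLConnection connectWithRetry(Connector connector, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connector.open();
            } catch (SocketTimeoutException ex) {
                last = ex; // timed out: try again if attempts remain
            }
        }
        throw last; // every attempt timed out
    }

    /** Connector that applies a connect timeout (for example 500 ms) before connecting. */
    static Connector forUrl(URL url, int connectTimeoutMillis) {
        return () -> {
            URLConnection request = url.openConnection();
            request.setConnectTimeout(connectTimeoutMillis);
            request.connect();
            return request;
        };
    }
}
```

Usage would be along the lines of ConnectRetry.connectWithRetry(ConnectRetry.forUrl(url, 500), 5). Other IOExceptions are not retried here and propagate immediately.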

Java's URLConnection has no built-in retry support (as of Java 8), so the simplest way to get it is a standalone third-party library such as Apache HttpClient.
It is by far the best standalone third-party HTTP client with advanced capabilities as of 2020, and it is still maintained.
In the 4.x line, whose org.apache.http packages are used below, HttpClient defaults to an implementation of org.apache.http.client.HttpRequestRetryHandler that retries up to 3 times, but you can supply a custom implementation instead. (The 5.x line lives in the org.apache.hc packages and configures retries through HttpRequestRetryStrategy instead.)
The configuration might look like this (fully qualified names are for example's sake):
org.apache.http.client.HttpClient httpClient = org.apache.http.impl.client.HttpClients.custom()
.setRetryHandler(YourCustomImplOfTheRetryHandlerClass)
//other config
.build();

There is no way I can reproduce that problem using my ISP.
I suggest you dig deeper into the problem and find a better solution. Sending another request just doesn't seem good enough to me. Maybe try a different way to get the data and see if that works for you. Can't say for sure as I can't reproduce the problem.

Java heap space error when uploading files through http basic authentication (JAVA) [duplicate]

I am trying to publish a large video/image file from the local file system to an http path, but I run into an out of memory error after some time...
Here is the code:
public boolean publishFile(URI publishTo, String localPath) throws Exception {
InputStream istream = null;
OutputStream ostream = null;
boolean isPublishSuccess = false;
URL url = makeURL(publishTo.getHost(), this.port, publishTo.getPath());
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
if (conn != null) {
try {
conn.setDoOutput(true);
conn.setDoInput(true);
conn.setRequestMethod("PUT");
istream = new FileInputStream(localPath);
ostream = conn.getOutputStream();
int n;
byte[] buf = new byte[4096];
while ((n = istream.read(buf, 0, buf.length)) > 0) {
ostream.write(buf, 0, n); //<--- ERROR happens on this line.......???
}
int rc = conn.getResponseCode();
if (rc == 201) {
isPublishSuccess = true;
}
} catch (Exception ex) {
log.error(ex);
} finally {
if (ostream != null) {
ostream.close();
}
if (istream != null) {
istream.close();
}
}
}
return isPublishSuccess;
}
Here is the error I am getting:
Exception in thread "Thread-8773" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:61)
at com.test.HTTPClient.publishFile(HTTPClient.java:110)
at com.test.HttpFileTransport.put(HttpFileTransport.java:97)

The HttpURLConnection is buffering the data so that it can set the Content-Length header (per HTTP spec).
One alternative, if your destination server supports it, is to use "chunked" transfers. This will buffer only a small portion of data at a time. However, not all services support it (Amazon S3, for example, doesn't).
Another alternative (and imo a better one) is to use Jakarta HttpClient. You can set the "entity" in a request from a file, and the connection code will set request headers appropriately.
Edit: nos commented that the OP could call HttpURLConnection.setFixedLengthStreamingMode(long length). I was unaware of this method; it was added in 1.5, and I haven't used this class since then.
However, I still suggest using Jakarta HttpClient, for the simple reason that it reduces the amount of code that the OP has to maintain. Code that is boilerplate, yet still has the potential for errors:
The OP correctly handles the loop to copy between input and output. Usually when I see an example of this, the poster either doesn't properly check the returned buffer size, or keeps re-allocating the buffers. Congratulations, but you now have to ensure that your successors take as much care.
The exception handling isn't quite so good. Yes, the OP remembers to close the connections in a finally block, and again, congratulations on that. Except that either of the close() calls could throw IOException, keeping the other from executing. And the method as a whole throws Exception, so that the compiler isn't going to help catch similar errors.
I count 31 lines of code to setup and execute the response (excluding the response code check and the URL computation, but including the try/catch/finally). With HttpClient, this would be somewhere in the range of a half dozen LOC.
Even if the OP had written this code perfectly, and refactored it into methods similar to those in Jakarta Commons IO, s/he shouldn't do that. This code has been written and tested by others. I know that it's a waste of my time to rewrite it, and suspect that it's a waste of the OP's time as well.
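For readers who stay with the JDK classes, the close() concern above can be handled with try-with-resources (Java 7+). This is a minimal sketch; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {
    /** Copies everything from in to out, closes both, and returns the number of bytes copied. */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        // Resources are closed in reverse order; if one close() throws, the other is still closed.
        try (InputStream source = in; OutputStream sink = out) {
            byte[] buf = new byte[4096];
            long total = 0;
            int n;
            while ((n = source.read(buf)) != -1) {
                sink.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```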

conn.setFixedLengthStreamingMode(new File(localPath).length()); // long overload (Java 7+); an (int) cast would overflow for files over 2 GB
For buffering, you could wrap your streams in BufferedOutputStream and BufferedInputStream.
A good example of chunked uploading can be found in gdata-java-client.
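A minimal sketch of the fixed-length approach applied to the upload in the question, assuming the server accepts a PUT with a Content-Length header. The class name FixedLengthUpload is hypothetical, and authentication is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

class FixedLengthUpload {
    /** PUTs the file to target without buffering the whole body in memory; returns the status code. */
    static int put(URL target, Path localPath) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("PUT");
        // Declaring the length up front lets the connection stream the body instead of buffering it.
        conn.setFixedLengthStreamingMode(Files.size(localPath));
        try (InputStream in = Files.newInputStream(localPath);
             OutputStream out = conn.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return conn.getResponseCode();
    }
}
```

One known trade-off of streaming mode is that the connection cannot automatically replay the body for redirects or authentication challenges, since it no longer holds a copy of it.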

The problem is that the HttpURLConnection class is using a byte array to store your data. Presumably this video you are pushing is taking more memory than available. You have a few options here:
Increase the memory to your application. You can use the -Xmx1024m option to give 1GB of memory to your application. This will increase the amount of data you can store in memory.
If you still run out of memory, you might want to consider trying another library to push the video up that does not store the data all in memory at once. The Apache Commons HttpClient has such a feature. See this site for more information: http://hc.apache.org/httpclient-3.x/features.html. See this section for multi-part form upload of large files: http://hc.apache.org/httpclient-3.x/methods/multipartpost.html

For anything other than basic GET operations, the built-in java.net HTTP stuff isn't very good. Using Apache Commons HttpClient is recommended for this. It lets you do much more intuitive stuff like this:
HttpClient client = new HttpClient();
PutMethod put = new PutMethod(url);
put.setRequestEntity(new FileRequestEntity(localFile, contentType));
int responseCode = client.executeMethod(put);
which replaces a lot of your boiler-plate code.

HttpsURLConnection#setChunkedStreamingMode(1024 * 1024 * 10); //10MB chunk
This ensures that any file (of any size) is streamed over a https connection, without internal buffering. This should be used when the file size or the content length is unknown.
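A minimal sketch of chunked streaming for a body of unknown length, assuming the server accepts Transfer-Encoding: chunked (as noted elsewhere on this page, not every service does). The class name ChunkedUpload and the chunkSize parameter are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class ChunkedUpload {
    /** POSTs the stream in chunks of roughly chunkSize bytes; returns the status code. */
    static int post(URL target, InputStream data, int chunkSize) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        // No Content-Length is sent; the body goes out chunk by chunk as it is written.
        conn.setChunkedStreamingMode(chunkSize);
        try (InputStream in = data; OutputStream out = conn.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return conn.getResponseCode();
    }
}
```

The same call works on HttpsURLConnection, since setChunkedStreamingMode is inherited from HttpURLConnection.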

Your problem is that you're trying to fit X video bytes into X/N bytes of RAM, when N > 1.
You either need to read the video into a smaller buffer and write it out as you go or make the file smaller or increase the memory available to your process.
Check your heap size. You can use -Xmx to increase it if you've taken the default.

HttpURLConnection slow to disconnect - Java / Android

I want to get the file size of a file on a remote connection without actually downloading the (large) file. I am using the "Content-Length" header of the file. The relevant code is:
URL obj = new URL(FILES_URL + fileName);
String contentLength = "";
HttpURLConnection conn = null;
try {
conn = (HttpURLConnection) obj.openConnection();
conn.setConnectTimeout(3000);
conn.setReadTimeout(3000);
contentLength = conn.getHeaderField("Content-Length");
int responseCode = conn.getResponseCode();
Log.d(TAG, "responseCode: " + responseCode);
} finally {
Log.d(TAG, "pre-disconnect");
if (conn!=null) conn.disconnect();
Log.d(TAG, "post-disconnect");
}
return contentLength;
The command "conn.disconnect();" sometimes seems to take forever. I have seen 23 seconds! Admittedly, this is connecting to a secondary local device which is running a web server, but the WiFi signal is strong, relatively fast, and I have never had any such problems using "curl" from my laptop. I do not have control over the web server I am connecting to.
The problem is possibly worse when making multiple similar connections to different files one after another; I'm not sure. Each of those creates an entirely new HttpURLConnection, however, rather than reusing the old one. Could reusing the connection help?
I never actually download the file or access the inputstream.
I could just not call disconnect, but I understand it is not recommended because resources would not be released. Is this not correct? I notice URLConnection doesn't have a disconnect. It is just suggested to close any streams you open.
This code is in an asynctask. I guess I could try moving the disconnect call itself to a further asynctask because I don't do anything afterwards. Not sure if that is even possible.
Do you have any suggestions? Should I try something other than HttpURLConnection to get the file size without downloading the file?

Thanks to EJP in the comments. Changing the request method to "HEAD" made the disconnect almost instantaneous:
conn.setRequestMethod("HEAD");
From what I have read, HttpURLConnection.disconnect() will skip through the entire response body if it hasn't been read, so for very large files it takes a long time. Using the request method "HEAD" forces the response body to be empty and solves the issue.
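Putting the fix together, a minimal sketch of reading the size with a HEAD request, assuming the server reports Content-Length for HEAD. The class name RemoteFileSize is hypothetical, and the 3-second timeouts are taken from the question.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class RemoteFileSize {
    /** Returns the Content-Length reported for url, or -1 if the server does not send one. */
    static long contentLength(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD"); // headers only, so there is no body to skip on disconnect
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected status " + status);
            }
            // Java 7+; on older Android API levels, parse getHeaderField("Content-Length") instead.
            return conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }
}
```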

I suggest using either Volley or OkHttp for faster networking, depending on your requirements. Go through a comparison of Volley, OkHttp, and Retrofit and decide which library to use.
Also, if you are putting this code inside an AsyncTask, read up on the "dark side" of AsyncTask.

Why do you have to call URLConnection#getInputStream to be able to write out to URLConnection#getOutputStream?

I'm trying to write out to URLConnection#getOutputStream, however, no data is actually sent until I call URLConnection#getInputStream. Even if I set URLConnection#doInput to false, it still will not send. Does anyone know why this is? There's nothing in the API documentation that describes this.
Java API Documentation on URLConnection: http://download.oracle.com/javase/6/docs/api/java/net/URLConnection.html
Java's Tutorial on Reading from and Writing to a URLConnection: http://download.oracle.com/javase/tutorial/networking/urls/readingWriting.html
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;
public class UrlConnectionTest {
private static final String TEST_URL = "http://localhost:3000/test/hitme";
public static void main(String[] args) throws IOException {
URLConnection urlCon = null;
URL url = null;
OutputStreamWriter osw = null;
try {
url = new URL(TEST_URL);
urlCon = url.openConnection();
urlCon.setDoOutput(true);
urlCon.setRequestProperty("Content-Type", "text/plain");
////////////////////////////////////////
// SETTING THIS TO FALSE DOES NOTHING //
////////////////////////////////////////
// urlCon.setDoInput(false);
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
/////////////////////////////////////////////////
// MUST CALL THIS OTHERWISE WILL NOT WRITE OUT //
/////////////////////////////////////////////////
urlCon.getInputStream();
/////////////////////////////////////////////////////////////////////////////////////////////////////////
// If getInputStream is called while doInput=false, the following exception is thrown: //
// java.net.ProtocolException: Cannot read from URLConnection if doInput=false (call setDoInput(true)) //
/////////////////////////////////////////////////////////////////////////////////////////////////////////
} catch (Exception e) {
e.printStackTrace();
} finally {
if (osw != null) {
osw.close();
}
}
}
}

The APIs for URLConnection and HttpURLConnection are (for better or worse) designed for the user to follow a very specific sequence of events:
1. Set request properties
2. (Optional) getOutputStream(), write to the stream, close the stream
3. getInputStream(), read from the stream, close the stream
If your request is a POST or PUT, you need the optional step #2.
To the best of my knowledge, the OutputStream is not like a socket, it is not directly connected to an InputStream on the server. Instead, after you close or flush the stream, AND call getInputStream(), your output is built into a Request and sent. The semantics are based on the assumption that you will want to read the response. Every example that I've seen shows this order of events. I would certainly agree with you and others that this API is counterintuitive when compared to the normal stream I/O API.
The tutorial you link to states that "URLConnection is an HTTP-centric class". I interpret that to mean that the methods are designed around a Request-Response model, and make the assumption that is how they will be used.
For what it's worth, I found this bug report that explains the intended operation of the class better than the javadoc documentation. The evaluation of the report states "The only way to send out the request is by calling getInputStream."
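The sequence described in this answer can be sketched as follows. This is a minimal illustration rather than the only valid ordering; the class name RequestSequence is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class RequestSequence {
    /** POSTs text to url and returns the response body as a string. */
    static String postText(URL url, String text) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Set request properties before any stream is opened.
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
        // Write the request body, then close the stream.
        try (Writer writer = new OutputStreamWriter(conn.getOutputStream(), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        // Asking for the response is what completes the exchange.
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                response.write(buf, 0, n);
            }
            return new String(response.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```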

Although the getInputStream() method can certainly cause a URLConnection object to initiate an HTTP request, it is not a requirement to do so.
Consider the actual workflow:
1. Build a request
2. Submit
3. Process the response
Step 1 includes the possibility of including data in the request, by way of an HTTP entity. It just so happens that the URLConnection class provides an OutputStream object as the mechanism for providing this data (and rightfully so for many reasons that aren't particularly relevant here). Suffice to say that the streaming nature of this mechanism provides the programmer an amount of flexibility when supplying the data, including the ability to close the output stream (and any input streams feeding it), before finishing the request.
In other words, step 1 allows for supplying a data entity for the request, then continuing to build it (such as by adding headers).
Step 2 is really a virtual step, and can be automated (like it is in the URLConnection class), since submitting a request is meaningless without a response (at least within the confines of the HTTP protocol).
Which brings us to Step 3. When processing an HTTP response, the response entity -- retrieved by calling getInputStream() -- is just one of the things we might be interested in. A response consists of a status, headers, and optionally an entity. The first time any one of these is requested, the URLConnection will perform virtual step 2 and submit the request.
No matter if an entity is being sent via the connection's output stream or not, and no matter whether a response entity is expected back, a program will ALWAYS want to know the result (as provided by the HTTP status code). Calling getResponseCode() on the URLConnection provides this status, and switching on the result may end the HTTP conversation without ever calling getInputStream().
So, if data is being submitted, and a response entity is not expected, don't do this:
// request is now built, so...
InputStream ignored = urlConnection.getInputStream();
... do this:
// request is now built, so...
int result = urlConnection.getResponseCode();
// act based on this result

As my experiments have shown (java 1.7.0_01) the code:
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
doesn't send anything to the server. It just saves what is written to an in-memory buffer. So if you are going to upload a large file via POST, you need to be sure that you have enough memory. On a desktop or server that may not be a big problem, but on Android it may result in an out-of-memory error. Here is what the stack trace looks like when writing to the output stream and memory runs out:
Exception in thread "Thread-488" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:78)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
at java.io.Writer.write(Writer.java:157)
at maxela.tables.weboperations.POSTRequest.makePOST(POSTRequest.java:138)
On the bottom of the trace you can see the makePOST() method which does the following:
writer = new OutputStreamWriter(conn.getOutputStream());
for (int j = 0 ; j < 3000 * 100 ; j++)
{
writer.write("&var" + j + "=garbagegarbagegarbage_"+ j);
}
writer.flush();
And writer.write() throws the exception.
Also my experiments have shown that any exception related to the actual connection/IO with the server is thrown only after urlCon.getOutputStream() is called. Even urlCon.connect() seems to be "dummy" method which doesn't do any physical connection.
However if you call urlCon.getContentLengthLong() which returns Content-Length: header field from the server response-headers - then URLConnection.getOutputStream() will be called automatically and in case there's exception - it will be thrown.
The exceptions thrown by urlCon.getOutputStream() are all IOExceptions, and I have met the following ones:
try
{
urlCon.getOutputStream();
}
catch (UnknownServiceException ex)
{
System.out.println("UnknownServiceException():" + ex.getMessage());
}
catch (ConnectException ex)
{
System.out.println("ConnectException()");
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
catch (IOException ex) {
System.out.println("IOException():" + ex.getMessage());
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
Hopefully this little bit of research helps people, as the URLConnection class is a bit counterintuitive in some cases, so when using it one needs to know what it is actually doing.
The second reason is that work with a server may fail for many reasons (connection, DNS, firewall, HTTP responses, the server not being able to accept the connection, the server not being able to process the request in time). It is therefore important to understand what the exceptions raised say about what is actually happening with the connection.

Calling getInputStream() signals that the client is finished sending its request, and is ready to receive the response (per HTTP spec). It seems that the URLConnection class has this notion built into it, and must be flush()ing the output stream when the input stream is asked for.
As the other responder noted, you should be able to call flush() yourself to trigger the write.

The fundamental reason is that it has to compute a Content-length header automatically (unless you are using chunked or streaming mode). It can't do that until it has seen all the output, and it has to send it before the output, so it has to buffer the output. And it needs a decisive event to know when the last output has actually been written. So it uses getInputStream() for that. At that time it writes the headers including the content-length, then the output, then it starts reading the input.

(Repost from your first question. Shameless self-plug)
Don't fiddle around with URLConnection yourself, let Resty handle it.
Here's the code you would need to write (I assume you are getting text back):
import static us.monoid.web.Resty.*;
import us.monoid.web.Resty;
...
new Resty().text(TEST_URL, content("HELLO WORLD")).toString();

