I was recently experimenting with Java networking and found something a bit odd. Suppose you have
URL url = new URL("http://www.google.com");
URLConnection con = url.openConnection();
Then I can call methods such as con.getContentLength() and they return correct values, even though I never invoked con.connect(). How can that be? Where and how does URLConnection get those headers? I haven't invoked con.connect() yet, so no request has been sent and no headers should be available at that moment.
openConnection() does not open a TCP connection. The connection is made implicitly, and the request is sent, when you call any method that requires the response, such as getContentLength(), getInputStream(), or getResponseCode().
The exception is when you are using one of the streaming modes and doing a PUT or POST with request content: in that case the connection is opened when you start writing the request.
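A minimal sketch of how one might check this behaviour locally. It assumes the JDK's built-in com.sun.net.httpserver.HttpServer as a throwaway server; the class name ImplicitConnectDemo and the method countRequests() are made up for this illustration. The server counts the requests it receives, so the counter can be compared before and after the first call that needs the response.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

class ImplicitConnectDemo {

    /** Returns {requests seen after openConnection(), requests seen after getContentLength()}. */
    static int[] countRequests() {
        AtomicInteger requests = new AtomicInteger();
        HttpServer server = null;
        try {
            // Throwaway local server (an assumption of this sketch) that counts requests.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                requests.incrementAndGet();
                byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();

            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            URLConnection con = url.openConnection(); // only creates a local object
            int afterOpen = requests.get();

            con.getContentLength(); // needs response headers, so it connects and sends the GET
            int afterHeaderRead = requests.get();
            return new int[] {afterOpen, afterHeaderRead};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        int[] counts = countRequests();
        System.out.println("requests after openConnection(): " + counts[0]);
        System.out.println("requests after getContentLength(): " + counts[1]);
    }
}
```

If the answer above is right, the first counter stays at zero and the second shows exactly one request.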
I have the following code:
HttpURLConnection conn = null;
BufferedReader in = null;
StringBuilder sb = null;
InputStream is = null;
conn = (HttpURLConnection) url.openConnection();
// Break-point A
conn.setDoInput(true);
conn.setDoOutput(true);
conn.setRequestMethod("POST");
// Break-point B
conn.setRequestProperty("X-TP-APP", Constants.X_TP_APP);
conn.setRequestProperty("X-TP-DEVICE", Constants.X_TP_DEVICE);
conn.setRequestProperty("X-TP-LOCALE", Constants.X_TP_LOCALE);
conn.setRequestProperty("Content-Type", contentType);
conn.setRequestProperty("Accept", accept);
conn.setRequestProperty("Authorization", SystemApi.TOKEN_STR);
conn.setUseCaches(false);
conn.setConnectTimeout(30000);
conn.getOutputStream().write(req.getBytes("UTF-8"));
conn.getOutputStream().flush();
conn.getOutputStream().close();
is = conn.getInputStream();
in = new BufferedReader(new InputStreamReader(is));
int statusCode = conn.getResponseCode();
// Break-point C
The code runs fine when breakpoints A and B are disabled.
I tried to find out when HttpURLConnection really sends the request, so I placed breakpoint A after conn = getConnection(strURL);
and continued. But then, at breakpoint C, the server returned 401 Unauthorized, which means my Authorization header was not in the request.
It seems as if we open a connection first and then have to set the headers as fast as we can. If we are not fast enough, the request is sent anyway, which doesn't seem right.
My question and concern:
When does HttpURLConnection really call the request?
Is this what is actually happening? Is this the correct way to do so?
Is there a better way to make sure the header is set before calling the request?
Per the docs, the actual connection is made when the connect() method is invoked on the [Http]URLConnection. That may be done manually, or it may be done implicitly by certain other methods. The Javadocs for URLConnection.connect() say, in part:
URLConnection objects go through two phases: first they are created, then they are connected. After being created, and before being connected, various options can be specified (e.g., doInput and UseCaches). After connecting, it is an error to try to set them. Operations that depend on being connected, like getContentLength, will implicitly perform the connection, if necessary.
Note in particular the last sentence. I don't see anything in your code that would require the connection to be established until the first conn.getOutputStream(), and I read the docs as saying that the connection object will not enter the "connected" state until some method is invoked on it that requires that. Until such a time, it is ok to set connection properties.
Moreover, the docs definitely state that methods that set properties on the connection (and setRequestProperty() in particular) will throw an IllegalStateException if invoked when the connection object is already connected.
It is possible that your Java library is buggy in the manner you describe, but that would certainly be in conflict with the API specification. I think it's more likely that the explanation for the behavior you observe is different, and I recommend you capture and analyze the actual HTTP traffic to determine what's really going on.
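The two-phase life cycle described in the Javadoc can be observed directly. The sketch below is only an illustration: it assumes a throwaway local server built with the JDK's com.sun.net.httpserver.HttpServer, and the class name SetAfterConnectDemo, the method rejectsLateHeader(), and the X-Example header are hypothetical. A header is set before anything needs the response, the connection is forced with getResponseCode(), and then a second setRequestProperty() is attempted.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

class SetAfterConnectDemo {

    /** Returns true if setRequestProperty() is rejected once the connection is connected. */
    static boolean rejectsLateHeader() {
        HttpServer server = null;
        try {
            // Throwaway local server (an assumption of this sketch); replies 204 with no body.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                exchange.sendResponseHeaders(204, -1);
                exchange.close();
            });
            server.start();

            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();

            // Phase 1: created but not connected, so options may still be set.
            conn.setRequestProperty("X-Example", "set-before-connect");

            // Phase 2: asking for the status implicitly connects and sends the request.
            conn.getResponseCode();

            try {
                conn.setRequestProperty("X-Example", "too-late");
                return false; // no exception: the library accepted a late header
            } catch (IllegalStateException expected) {
                return true;  // documented behaviour: already connected
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("late header rejected: " + rejectsLateHeader());
    }
}
```

A useful side effect: if headers were ever set "too late" in real code, this exception would make it visible instead of the header being silently dropped.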
What actually happened is that, in debug mode, I had conn.getResponseCode() in the watch expressions, which forced conn.getResponseCode() to run.
Because the connection was not connected yet, getResponseCode() called connect() and sent the request before the headers had been set.
Hence the 401.
Since Android uses the same HttpURLConnection, I captured the packet exchange to see what is happening under the hood.
I described the experiment in detail in this post: Can you explain the HttpURLConnection connection process?
Here is an outline of the network activity for your program:
At breakpoint A: No physical connection is made to the remote server. You get a logical handle to a local connection object.
At breakpoint B: You just configure the local connection object, nothing more.
conn.getOutputStream(): The network connection starts here, but no payload is transferred to the server.
conn.getInputStream(): The payload (HTTP headers and content) is sent to the server, and you get the response (buffered into the input stream, along with the response code, etc.).
To Answer your question
When does HttpURLConnection really call the request?
getInputStream() triggers the network layer to send the application payload and receive the response.
Is this what is actually happening? Is this the correct way to do so?
No. openConnection() does not initiate network activity. You are getting back a local handle for a future connection, not an active connection.
Is there a better way to make sure the header is set before calling the request?
You don't need to do anything special to make sure the header is set. The header payload isn't sent to the server until you ask for the response (for example by getting the response code or opening an input stream).
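To make the sequence above concrete, here is a hedged sketch of a POST in the default (non-streaming) mode. It assumes a throwaway local server built with the JDK's com.sun.net.httpserver.HttpServer that records what it received; PostWithHeadersDemo, postAndCapture(), and the token value are hypothetical names for this illustration, and readAllBytes() requires Java 9 or later. The point is only that a header set before the response is requested does reach the server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

class PostWithHeadersDemo {

    /** Returns {status code, Authorization header seen by the server, body seen by the server}. */
    static String[] postAndCapture(String token, String payload) {
        AtomicReference<String> seenAuth = new AtomicReference<>();
        AtomicReference<String> seenBody = new AtomicReference<>();
        HttpServer server = null;
        try {
            // Throwaway local server (an assumption of this sketch) that records the request.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                seenAuth.set(exchange.getRequestHeaders().getFirst("Authorization"));
                seenBody.set(new String(exchange.getRequestBody().readAllBytes(),
                        StandardCharsets.UTF_8));
                exchange.sendResponseHeaders(200, -1); // 200 with no response body
                exchange.close();
            });
            server.start();

            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();

            // Configuration only touches the local object.
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Authorization", token);
            conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");

            // Write the request body; try-with-resources closes the stream.
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload.getBytes(StandardCharsets.UTF_8));
            }

            // Asking for the status completes the request and reads the response.
            int status = conn.getResponseCode();
            return new String[] {String.valueOf(status), seenAuth.get(), seenBody.get()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        String[] result = postAndCapture("Bearer demo-token", "ping");
        System.out.println("status=" + result[0] + ", auth=" + result[1] + ", body=" + result[2]);
    }
}
```

Note that the output stream is obtained once and closed via try-with-resources, rather than calling getOutputStream() three times as in the question.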
I want to readLines from a URL, which resolves to an HTTP service. I can use
Resources.readLines(url, Charsets.SOMETHING)
from com.google.common.io.
This works, but the class javadoc for Resources states the following, without further explanation:
Note that even though these methods use URL parameters, they are usually not appropriate for HTTP or other non-classpath resources.
Why is this method inappropriate for reading from an HTTP service, and what is the recommended approach?
When using URL to send an HTTP request, the typical process is
URL url = new URL(someStringUrl);
HttpURLConnection con = (HttpURLConnection) url.openConnection();
// do some stuff with con, add headers, add request body, etc.
con.getInputStream(); // get body of response
The URL given to Resources skips all that. The methods in Resources depend on URL#openStream(), which skips any modification of the URLConnection; i.e., it is equivalent to url.openConnection().getInputStream(). If the service expects anything beyond a bare request, you may get any of a number of 400-level error codes in the HTTP response because your request wasn't correct.
This won't happen with class path resources because the protocol is simple. You just copy the bytes.
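The difference can be sketched without Guava, since the relevant part is URL#openStream() itself. The example assumes a throwaway local server (JDK's com.sun.net.httpserver.HttpServer) that answers 403 unless a made-up X-Api-Key header is present; OpenStreamDemo, compare(), and the header name and value are hypothetical. It tries a bare openStream() first, then a configured URLConnection.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class OpenStreamDemo {

    /** Returns {did plain openStream() succeed, did the configured connection succeed}. */
    static boolean[] compare() {
        HttpServer server = null;
        try {
            // Throwaway local server (an assumption of this sketch) that insists on a header.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String key = exchange.getRequestHeaders().getFirst("X-Api-Key");
                if ("secret".equals(key)) {
                    byte[] body = "line1\n".getBytes(StandardCharsets.UTF_8);
                    exchange.sendResponseHeaders(200, body.length);
                    exchange.getResponseBody().write(body);
                } else {
                    exchange.sendResponseHeaders(403, -1); // forbidden, no body
                }
                exchange.close();
            });
            server.start();
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");

            // What URL-only helpers do: no chance to configure the request.
            boolean plainWorked;
            try (InputStream in = url.openStream()) {
                in.read();
                plainWorked = true;
            } catch (IOException e) {
                plainWorked = false; // a 4xx status surfaces as an IOException
            }

            // Going through URLConnection leaves room to add what the service needs.
            URLConnection con = url.openConnection();
            con.setRequestProperty("X-Api-Key", "secret");
            boolean configuredWorked;
            try (InputStream in = con.getInputStream()) {
                in.read();
                configuredWorked = true;
            } catch (IOException e) {
                configuredWorked = false;
            }
            return new boolean[] {plainWorked, configuredWorked};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        boolean[] result = compare();
        System.out.println("openStream(): " + result[0] + ", configured: " + result[1]);
    }
}
```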
AFAIK HttpURLConnection doesn't actually send the request out until we attempt to read the input. However, if an exception happens here I can't differentiate between the case where the request was not sent, and the case where the request was sent but some other sort of error occurred (maybe we entered a tunnel so couldn't receive the response).
Is there a way to query and find out if the request was actually sent or not?
You might have some luck with
URL url = new URL(serverUrl);
connection = (HttpURLConnection) url.openConnection();
// write to connection output stream (don't forget to flush())
connection.getResponseCode();
You can get a full list of response codes at http://download.oracle.com/javase/1,5.0/docs/api/java/net/HttpURLConnection.html
Having said that, I can imagine your data going out completely an instant before you enter a tunnel, so that the connection never gets a response code. In that case you would try to send again (perhaps with a unique 'send id'), so that the listener knows to ignore your resend but can still tell you that the data was received.
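One partial approach, offered as a sketch rather than a complete answer: calling connect() explicitly in its own try block at least separates "could not connect, so nothing was sent" from "something failed after connecting", which remains ambiguous for the reason described above. The names SendStageDemo, Outcome, attempt(), and demo() are hypothetical, the timeouts are arbitrary, and demo() assumes a throwaway local server from the JDK's com.sun.net.httpserver.HttpServer plus a port that was just released.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.URL;

class SendStageDemo {

    enum Outcome { CONNECT_FAILED, FAILED_AFTER_CONNECT, RESPONSE_RECEIVED }

    static Outcome attempt(String serverUrl) {
        HttpURLConnection connection;
        try {
            connection = (HttpURLConnection) new URL(serverUrl).openConnection();
            connection.setConnectTimeout(2000);
            connection.setReadTimeout(2000);
            connection.connect(); // opens the connection only; the request is not sent yet
        } catch (IOException e) {
            return Outcome.CONNECT_FAILED; // the request cannot have been sent
        }
        try {
            connection.getResponseCode(); // sends the request and waits for the status line
            return Outcome.RESPONSE_RECEIVED;
        } catch (IOException e) {
            return Outcome.FAILED_AFTER_CONNECT; // may or may not have reached the server
        } finally {
            connection.disconnect();
        }
    }

    /** Tries a reachable local server first, then a port nobody is listening on. */
    static Outcome[] demo() {
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                exchange.sendResponseHeaders(204, -1);
                exchange.close();
            });
            server.start();
            Outcome reachable = attempt("http://127.0.0.1:" + server.getAddress().getPort() + "/");

            // Bind and immediately release a port to get one that refuses connections.
            int closedPort;
            try (ServerSocket probe = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
                closedPort = probe.getLocalPort();
            }
            Outcome unreachable = attempt("http://127.0.0.1:" + closedPort + "/");
            return new Outcome[] {reachable, unreachable};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        Outcome[] outcomes = demo();
        System.out.println("reachable: " + outcomes[0] + ", unreachable: " + outcomes[1]);
    }
}
```

For the FAILED_AFTER_CONNECT case, the resend-with-id idea from the answer is still needed, because the client cannot tell whether the server processed the request.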
This problem is bugging me:
HttpURLConnection con = (HttpURLConnection)new URL(url).openConnection();
con.setRequestMethod("HEAD");
if (con.getResponseCode() != 200) { dosomething(); }
Is this the correct way to set the request method, or is it already too late, since I called URL.openConnection() and it already made the connection using the default, which is GET?
I can't call setRequestMethod("HEAD") on the same line as openConnection() because it returns a URLConnection, not an HttpURLConnection.
So how do I ensure that the method will always be HEAD knowing the default is GET?
Should I just use HttpClient ?
That's the correct method.
Calling openConnection() doesn't actually send anything. The request isn't "committed" (that is, nothing is sent to the server) until you ask for something that is returned in the server's response, like the body of the response (con.getInputStream()), the status (con.getResponseCode()), or some other response header. This gives you time to set options on the HttpURLConnection, such as the request method or whether you plan to send a request body (e.g., for a POST).
By the way, you could set the method "on the same line," but being on the same line is meaningless: either openConnection() sends the request method, or it doesn't. Method calls that happen after are not a factor, regardless of the line they are on.
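A small sketch that checks which method actually reaches the server. As before, the local server comes from the JDK's com.sun.net.httpserver.HttpServer and is an assumption of the example; HeadRequestDemo and methodSeenByServer() are hypothetical names. The server records the request method it saw, so the result shows whether setRequestMethod("HEAD") after openConnection() took effect.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;

class HeadRequestDemo {

    /** Returns "<method the server saw> <status the client got>". */
    static String methodSeenByServer() {
        AtomicReference<String> seenMethod = new AtomicReference<>();
        HttpServer server = null;
        try {
            // Throwaway local server (an assumption of this sketch) that records the method.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                seenMethod.set(exchange.getRequestMethod());
                exchange.sendResponseHeaders(200, -1); // headers only, no body
                exchange.close();
            });
            server.start();

            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
            con.setRequestMethod("HEAD");       // still allowed: nothing has been sent yet
            int status = con.getResponseCode(); // the request goes out here
            return seenMethod.get() + " " + status;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(methodSeenByServer());
    }
}
```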
If I create an HTTP java.net.URL and then call openConnection() on it, does it necessarily imply that an HTTP post is going to happen? I know that openStream() implies a GET. If so, how do you perform one of the other HTTP verbs without having to work with the raw socket layer?
If you retrieve the URLConnection object using openConnection(), it doesn't actually start communicating with the server. That doesn't happen until you get a stream from the URLConnection. When you first get the connection, you can add or change headers and other connection properties before actually opening it.
URLConnection's life cycle is a bit odd. It doesn't send the headers to the server until you've gotten one of the streams. If you just get the input stream then I believe it does a GET, sends the headers, then lets you read the output. If you get the output stream then I believe it sends it as a POST, as it assumes you'll be writing data to it (You may need to call setDoOutput(true) for the output stream to work). As soon as you get the input stream the output stream is closed and it waits for the response from the server.
For example, this should do a POST:
URL myURL = new URL("http://example.com/my/path");
URLConnection conn = myURL.openConnection();
conn.setDoOutput(true);
conn.setDoInput(true);
OutputStream os = conn.getOutputStream();
os.write("Hi there!".getBytes("UTF-8")); // OutputStream.write() takes bytes, not a String
os.close();
InputStream is = conn.getInputStream();
// read stuff here
While this would do a GET:
URL myURL = new URL("http://example.com/my/path");
URLConnection conn = myURL.openConnection();
conn.setDoOutput(false);
conn.setDoInput(true);
InputStream is = conn.getInputStream();
// read stuff here
URLConnection will also do other weird things. If the server specifies a content length, then URLConnection will keep the underlying input stream open until it receives that much data, even if you explicitly close it. This caused a lot of problems for us, because the URLConnection kept the network connection open and made it hard to shut our client down cleanly. This problem probably exists even if you just use openStream(), though.
No, it does not. But if the protocol of the URL is HTTP, you'll get an HttpURLConnection as the returned object. This class has a setRequestMethod method to specify which HTTP method you want to use.
If you want to do more sophisticated stuff you're probably better off using a library like Jakarta HttpClient.