Set new object equal to otherObject.method()? - java

I'm learning Java and came across something which confuses me quite a bit. I'm watching a video explaining how HTTP requests work...
URL theURL = new URL("http://www.google.com");
URLConnection theConn = theURL.openConnection();
I understand the first line in that it is just creating a URL object with an actual url as an argument. But I don't understand how in the second line, a URLConnection object is being created and being set equal to a method of the other object, or is that method returning something?

The method is returning a URLConnection as documented by the URL.openConnection() Javadoc which says (in part)
Returns a URLConnection instance that represents a connection to the remote object referred to by the URL.
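To make the "is that method returning something?" part concrete, here is a minimal sketch. The class name OpenConnectionReturnValue and the helper open() are made up for illustration; only URL and URLConnection come from the question. The right-hand side runs first and produces an object, and the variable on the left merely stores a reference to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

class OpenConnectionReturnValue {

    // Hypothetical helper: calls openConnection() and hands back whatever it returns.
    static URLConnection open(String spec) {
        try {
            URL theURL = new URL(spec);
            // openConnection() is an ordinary method call. It builds a
            // URLConnection object and returns a reference to it; the
            // assignment stores that reference in theConn.
            URLConnection theConn = theURL.openConnection();
            return theConn;
        } catch (IOException e) { // MalformedURLException is an IOException too
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection theConn = open("http://www.google.com");
        System.out.println(theConn.getClass().getName()); // the concrete class that was returned
        System.out.println(theConn.getURL());             // the URL the connection belongs to
    }
}
```

Nothing here touches the network, so it should run offline as well.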

Related

Retrieve redirected URL with Java / HttpURLConnection

Given a URL (String ref), I am attempting to retrieve the redirected URL as follows:
HttpURLConnection con = (HttpURLConnection)new URL(ref).openConnection();
con.setInstanceFollowRedirects(false);
con.setRequestProperty("User-Agent","");
int responseType = con.getResponseCode()/100;
while (responseType == 1)
{
    Thread.sleep(10);
    responseType = con.getResponseCode()/100;
}
if (responseType == 3)
    return con.getHeaderField("Location");
return con.getURL().toString();
I am having several (conceptual and technical) problems with it:
Conceptual problem:
It works in most cases, but I don't quite understand how.
All methods of the 'con' instance are called AFTER the connection is opened (when 'con' is instantiated).
So how do they affect the actual result?
How come calling 'setInstanceFollowRedirects' affects the returned value of 'getHeaderField'?
Is there any point calling 'getResponseCode' over and over until the returned value is not 1xx?
Bottom line, my general question here: is there another request/response sent through the connection every time one of these methods is invoked?
Technical problem:
Sometimes the response-code is 3xx, but 'getHeaderField' does not return the "final" URL.
I tried calling my code with the returned value of 'getHeaderField' until the response-code was 2xx.
But in most other cases where the response-code is 3xx, 'getHeaderField' DOES return the "final" URL, and if I call my code with this URL then I get an empty string.
Can you please advise how to approach the two problems above in order to have a "100% proof" code for retrieving the "final" URL?
Please ignore cases where the response-code is 4xx or 5xx (or anything else other than 1xx / 2xx / 3xx for that matter).
Thanks
Conceptual problems:
0.) Can one URLConnection or HttpURLConnection object be reused?
No, you cannot reuse such an object. You can use it to fetch the content of one URL just once. You cannot use it to retrieve another URL, nor to fetch the content twice (speaking on the network level).
If you want to fetch another URL or to fetch the same URL a second time, you have to call the openConnection() method of the URL class again to instantiate a new connection object.
1.) When is the URLConnection actually connected?
The method name openConnection() is misleading. It only instantiates the connection object. It does not do anything on the network level.
The interaction on the network level starts in this line, which implicitly connects the connection (= the TCP socket under the hood is opened and data is sent and received):
int responseType = con.getResponseCode()/100;
Alternatively, you can use HttpURLConnection.connect() to explicitly connect the connection.
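Points 0 and 1 can be checked with a small experiment. The sketch below is only an illustration under my own assumptions: it starts a throwaway local server using the JDK's com.sun.net.httpserver package, and the names LazyConnectDemo and countRequests are invented. It counts how many requests the server has seen after openConnection(), after the first getResponseCode(), and after a few more getters.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicInteger;

class LazyConnectDemo {

    // Returns the server-side request count at three points:
    // {after openConnection(), after getResponseCode(), after more getters}.
    static int[] countRequests() {
        AtomicInteger hits = new AtomicInteger();
        try {
            // Throwaway local server so the experiment needs no outside network.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                hits.incrementAndGet();
                exchange.sendResponseHeaders(200, -1); // headers only, no body
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
                HttpURLConnection con = (HttpURLConnection) url.openConnection();
                int afterOpen = hits.get();         // nothing has been sent yet

                con.getResponseCode();              // implicit connect: the request goes out here
                int afterResponseCode = hits.get();

                con.getResponseCode();              // answered from the response already received
                con.getHeaderField("Date");
                int afterMoreGetters = hits.get();

                con.disconnect();
                return new int[] {afterOpen, afterResponseCode, afterMoreGetters};
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] n = countRequests();
        System.out.println(n[0] + " " + n[1] + " " + n[2]);
    }
}
```

The counts show that a single request is sent, triggered by the first method that needs the response. That is also why polling getResponseCode() in a loop does not produce new responses.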
2.) How does setInstanceFollowRedirects work?
setInstanceFollowRedirects(true) causes the URLs to be fetched "under the hood" again and again until there is a non-redirect response. The response code of the non-redirect response is returned by your call to getResponseCode().
UPDATE:
Yes, this lets you write simple code if you do not want to handle the redirects yourself. You can simply switch on redirect following and then read the final response from the location you were redirected to, as if no redirect had taken place.
I would be more careful in evaluating the response code. Not every 3xx-code is automatically a kind of redirection. For example the code 304 just stands for "Not modified."
Look at the original definitions here.
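If you do want to follow the redirects yourself, one possible shape is sketched below. It is a rough sketch rather than "100% proof" code: the class name FinalUrlResolver, the hop limit, and the /a, /b, /c paths of the local test server are all my own choices. It opens a fresh connection per hop (see point 0), treats only selected 3xx codes as redirects (so 304 is left alone), and resolves a relative Location value against the current URL.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

class FinalUrlResolver {

    // Only these 3xx codes are treated as redirects; 304 (Not Modified) is not one.
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Follows redirects by hand, using a new connection object for every hop.
    static URL resolve(URL start, int maxHops) throws IOException {
        URL current = start;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection con = (HttpURLConnection) current.openConnection();
            con.setInstanceFollowRedirects(false);
            try {
                int code = con.getResponseCode();
                String location = con.getHeaderField("Location");
                if (!isRedirect(code) || location == null) {
                    return current; // not a redirect: this is the final URL
                }
                current = new URL(current, location); // also handles relative Location values
            } finally {
                con.disconnect();
            }
        }
        throw new IOException("More than " + maxHops + " redirects starting at " + start);
    }

    // Test helper for the local server: status plus optional Location header, no body.
    static void reply(HttpExchange ex, int code, String location) throws IOException {
        if (location != null) {
            ex.getResponseHeaders().set("Location", location);
        }
        ex.sendResponseHeaders(code, -1);
        ex.close();
    }

    // Runs resolve() against a local chain /a -> /b -> /c and returns the final path.
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/a", ex -> reply(ex, 302, "/b"));
            server.createContext("/b", ex -> reply(ex, 301, "/c"));
            server.createContext("/c", ex -> reply(ex, 200, null));
            server.start();
            try {
                URL start = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/a");
                return resolve(start, 10).getPath();
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The hop limit guards against redirect loops. Which status codes to follow, and whether to follow across protocols, is a policy decision you may want to adjust.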

URLConnection.getURL method

I would like to have a second opinion on a small piece of Java code.
Will the method below always return an output string equal to the input string?
private static String func(final String url)
{
    HttpURLConnection con = (HttpURLConnection)new URL(url).openConnection();
    con.setInstanceFollowRedirects(true);
    ...
    ...
    return con.getURL().toString();
}
The question refers to all possible scenarios, such as automatic redirection, etc.
If you look at the URLConnection.getURL() implementation, you can see that it returns the original URL passed to the constructor.
HttpURLConnection also doesn't change the original url.
To get the destination URL of a redirect you're supposed to call con.getHeaderField("Location"); - see for example: Retrieve the final location of a given URL in Java
So you get the original URL until the response has actually been fetched, for example by calling getResponseCode() or getInputStream().
If you set ((HttpURLConnection)con).setInstanceFollowRedirects(true); then once the response has been fetched, and if the server really redirected, getURL() returns the destination URL.
A redirect may not be followed automatically, for example when the protocol changes (e.g. http -> https).
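This can be tried out locally. The sketch below is an experiment, not a statement of what the API guarantees: as far as I know, the Javadoc for getURL() does not promise this behaviour, and what is shown is what the standard OpenJDK HTTP handler does. The class name GetUrlAfterRedirect and the /start and /final paths are invented for the test server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

class GetUrlAfterRedirect {

    // Returns "<path before the request> -> <path after the response was fetched>".
    static String paths() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            int port = server.getAddress().getPort();
            server.createContext("/start", exchange -> {
                // Same protocol and host, so the client is willing to follow it.
                exchange.getResponseHeaders().set("Location", "http://127.0.0.1:" + port + "/final");
                exchange.sendResponseHeaders(302, -1);
                exchange.close();
            });
            server.createContext("/final", exchange -> {
                exchange.sendResponseHeaders(200, -1);
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + port + "/start");
                HttpURLConnection con = (HttpURLConnection) url.openConnection();
                con.setInstanceFollowRedirects(true);

                String before = con.getURL().getPath(); // no request sent yet
                con.getResponseCode();                  // sends the request and follows the redirect
                String after = con.getURL().getPath();

                con.disconnect();
                return before + " -> " + after;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paths());
    }
}
```

So for the func() in the question, the returned string can differ from the input whenever the elided part triggers a request and the server redirects.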

How come URL.openConnection() allows me to read headers?

I was recently experimenting with Java networking and found something a bit odd. Suppose you have
URL url = new URL("http://www.google.com");
URLConnection con = url.openConnection();
Then I can call methods like con.getContentLength() and so on, and they give me correct values even though I didn't invoke con.connect(). How can that be? I mean, where and how does URLConnection get those headers? I haven't invoked con.connect() yet, so no request was sent and no headers should be available at that moment.
The actual TCP connect happens implicitly when you call any method that requires the response, such as getContentLength(), getInputStream(), or getResponseCode(). It doesn't happen at openConnection(). The request is sent when one of those methods is first called.
Unless you are using one of the streaming modes and you're doing a PUT or POST with request content, in which case the connection is opened when you start writing the request.
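One way to see the implicit connect is to watch the connection's state change. In the sketch below, the class name ImplicitConnectDemo, the X-Early and X-Late header names, and the local test server are all illustrative assumptions. A request property can be set before getContentLength() is called, and the same call is rejected afterwards because the connection is already in use.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ImplicitConnectDemo {

    // Returns "<content length>:<true if a late setter was rejected>".
    static String run() {
        byte[] body = "hello".getBytes(StandardCharsets.US_ASCII);
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
                HttpURLConnection con = (HttpURLConnection) url.openConnection();
                con.setRequestProperty("X-Early", "1"); // allowed: not connected yet

                // No connect() call anywhere, yet this needs a response header,
                // so the connection is opened and the request is sent here.
                int length = con.getContentLength();

                boolean rejected = false;
                try {
                    con.setRequestProperty("X-Late", "1");
                } catch (IllegalStateException alreadyConnected) {
                    rejected = true; // too late: the request has already gone out
                }
                con.disconnect();
                return length + ":" + rejected;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```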

Java HttpURLConnection class Program

I am learning Java through use of a textbook, which contains the following code describing the use of an HttpURLConnection ...
import java.net.HttpURLConnection;
import java.net.URL;

class HttpURLDemo {
    public static void main(String[] args) throws Exception {
        URL hp = new URL("http://www.google.com");
        HttpURLConnection hpCon = (HttpURLConnection) hp.openConnection();
        // Display request method.
        System.out.println("Request method is " + hpCon.getRequestMethod());
    }
}
Could someone please explain why the hpCon object is declared in the following way...
HttpURLConnection hpCon = (HttpURLConnection) hp.openConnection();
instead of declaring it like this...
HttpURLConnection hpCon = new HttpURLConnection();
The textbook author provided the following explanation, which I don't really understand...
Java provides a subclass of URLConnection that provides support for HTTP connections.
This class is called HttpURLConnection. You obtain an HttpURLConnection in the same
way just shown, by calling openConnection( ) on a URL object, but you must cast the result
to HttpURLConnection. (Of course, you must make sure that you are actually opening an
HTTP connection.) Once you have obtained a reference to an HttpURLConnection object,
you can use any of the methods inherited from URLConnection
The declaration you are asking about:
HttpURLConnection hpCon = new HttpURLConnection();
does not provide any information about the URL to which you want to open the connection. HttpURLConnection has no no-argument constructor. Its only constructor is:
protected HttpURLConnection(URL u)
which does receive the URL, but it is protected and the class is abstract, so you cannot call it with new from your own code.
hp.openConnection() already knows that you want to open a connection to the URL "http://www.google.com" and returns a suitable instance for you.
java.net.URLConnection is an abstract class that facilitates communication with various types of servers via various protocols (FTP, HTTP, etc.).
The protocol-specific subclasses are hidden inside Sun's packages, and these hidden classes are responsible for the concrete implementation of the protocols.
In your example, since your URL is http://www.google.com, the internals of the URL class know from parsing the URL that an HTTP handler/subclass must be used.
So when you open a connection to the server hp.openConnection(); you get a concrete instance of a class that implements the HTTP protocol.
That class is an instance of HttpURLConnection (actually of a subclass, since HttpURLConnection is also abstract), and that is why you can do:
HttpURLConnection hpCon = (HttpURLConnection) hp.openConnection(); and not get a ClassCastException.
So with Java's design you can't do HttpURLConnection hpCon = new HttpURLConnection(hp); as you ask, since that is not how the designers want you to use these APIs.
You are expected to work with URLs and URLConnections and only worry about input/output.
You shouldn't worry about the rest.
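A quick way to check this for yourself is to inspect the types at runtime. The sketch below is only illustrative. The class name ConnectionTypeDemo, the helper open(), and the file:///tmp/example.txt URL are made up, and the file does not need to exist because nothing is read. It shows that HttpURLConnection is abstract, that an http URL yields an instance of some subclass of it, and that a non-HTTP URL yields a different kind of URLConnection, which is why the textbook warns that the cast only works for HTTP.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Modifier;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

class ConnectionTypeDemo {

    // Hypothetical helper: opens a connection object without contacting any server.
    static URLConnection open(String spec) {
        try {
            return new URL(spec).openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // HttpURLConnection itself is abstract, so "new HttpURLConnection(...)" cannot compile.
        System.out.println(Modifier.isAbstract(HttpURLConnection.class.getModifiers()));

        // For an http URL the protocol handler returns a concrete, hidden subclass.
        URLConnection http = open("http://www.google.com");
        System.out.println(http.getClass().getName());
        System.out.println(http instanceof HttpURLConnection);

        // For another protocol the cast to HttpURLConnection would fail.
        URLConnection file = open("file:///tmp/example.txt");
        System.out.println(file.getClass().getName());
        System.out.println(file instanceof HttpURLConnection);
    }
}
```

The exact class names printed depend on the JDK in use, so treat them as an implementation detail.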

Optimizing HttpURLConnection in Android

This problem is bugging me:
HttpURLConnection con = (HttpURLConnection)new URL(url).openConnection();
con.setRequestMethod("HEAD");
if (con.getResponseCode() != 200) { dosomething(); }
Is this the correct way to set the Request Method, or is it already too late since I called URL.openConnection() and it already made the connection using the default which is GET?
I can't call setRequestMethod("HEAD") in the same line as openConnection because it returns a URLConnection, not an HttpURLConnection.
So how do I ensure that the method will always be HEAD knowing the default is GET?
Should I just use HttpClient ?
That's the correct method.
Calling openConnection() doesn't actually do anything. The request isn't "committed" (that is, nothing is sent to the server) until you ask for something that is returned in the server's response, like the body of the response (con.getInputStream()), the status (con.getResponseCode()), or some other response header. This gives you time to set options on the HttpURLConnection, like whether you plan to send a request body (i.e., POST), set the request method, etc.
By the way, you could set the method "on the same line," but being on the same line is meaningless: either openConnection() sends the request method, or it doesn't. Method calls that happen after are not a factor, regardless of the line they are on.
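If you want to convince yourself that the server really sees HEAD, a local experiment along these lines should do. It is a sketch under my own assumptions: the class name HeadRequestDemo and the throwaway server from the JDK's com.sun.net.httpserver package are not part of the question. The server records the method of the request it receives.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;

class HeadRequestDemo {

    // Returns the HTTP method that the server actually received.
    static String methodSeenByServer() {
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                seen.set(exchange.getRequestMethod()); // record what arrived on the wire
                exchange.sendResponseHeaders(200, -1); // no body, as expected for HEAD
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
                HttpURLConnection con = (HttpURLConnection) url.openConnection();
                con.setRequestMethod("HEAD"); // still allowed: nothing has been sent yet
                if (con.getResponseCode() != 200) { // the request is sent here
                    throw new IOException("Unexpected status " + con.getResponseCode());
                }
                con.disconnect();
                return seen.get();
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(methodSeenByServer());
    }
}
```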
