Java. need to read a url as string and timeout option - java

already asked in another post, but got no solution
see me script bellow, how can i change that, please full demo, so
that i can set a readtimout if the network is down for example
at all i have to connect to an url, read just one line, get that as a string ... done
hope u can give an example. any other working way is welcome.. thx
try {
URL url = new URL("http://mydomain/myfile.php");
//url.setReadTimeout(5000); does not work
InputStreamReader testi= new InputStreamReader(url.openStream());
BufferedReader in = new BufferedReader(testi);
//in.setReadTimeout(5000); does not work
stri = in.readLine();
Log.v ("GotThat: ",stri);
in.close();
} catch (MalformedURLException e) {
} catch (IOException e) {
}

How about using HttpURLConnection
import java.net.HttpURLConnection;
URL url = new URL("http://mydomain/myfile.php");
HttpURLConnection huc = (HttpURLConnection) url.openConnection();
HttpURLConnection.setFollowRedirects(false);
huc.setConnectTimeout(15 * 1000);
huc.setRequestMethod("GET");
huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
huc.connect();
InputStream input = huc.getInputStream();
Code picked from here

Timeout:
URL url = new URL("http://www.google.com");
URLConnection conn = url.openConnection();
conn.connect();
conn.setReadTimeout(10000); //10000 milliseconds (10 seconds)

Related

HttpURLConnection with https InputStream Garbled

I use HttpURLConnection to crawler https://translate.google.com/.
InetSocketAddress addr = new InetSocketAddress("127.0.0.1", 1082);
Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
url = new URL("https://translate.google.com/");
HttpURLConnection conn = (HttpURLConnection) url.openConnection(proxy);
conn.setRequestProperty("Accept-Encoding", "gzip, deflate, sdch");
conn.setRequestProperty("Connection", "keep-alive");
conn.setRequestProperty("User-Agent",
"Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36");
conn.setRequestProperty("Accept", "*/*");
Map<String, List<String>> reqHeaders = conn.getHeaderFields();
List<String> reqTypes = reqHeaders.get("Content-Type");
for (String ss : reqTypes) {
System.out.println(ss);
}
InputStream in = conn.getInputStream();
String s = IOUtils.toString(in, "UTF-8");
System.out.println(s.substring(0, 100));
Map<String, List<String>> resHeader = conn.getHeaderFields();
List<String> resTypes = resHeader.get("Content-Type");
for (String ss : resTypes) {
System.out.println(ss);
}
Console is
But When I change url to http://translate.google.com/.
It works well.
I know actually HttpURLConnection is HttpsURLConnection when i crawler https://translate.google.com/.
I try to use HttpsURLConnection and it still garbled.
Any suggestions?
conn.setRequestProperty("Accept-Encoding", "gzip, deflate, sdch");
The response is compressed, because the above line tells the server that the client is able to understand encodings specified in Accept-Encoding.
Try to comment this line or handle this situation.
There's a more specific implementation for HTTPS i.e. HttpsURLConnection, in case you're interested in https-specific features, e.g.:
import javax.net.ssl.HttpsURLConnection;
....
URL url = new URL("https://www.google.com/");
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
I accept Jerry Chin's answer.Solves my problem.
My answer just recording how i resolve this problem.
If this approach is unreasonable.Let me know, I'll remove this answer.
conn.setRequestProperty("Accept-Encoding", "gzip, deflate, sdch");
And then I check response Content-Encoding.It's gzip.
So i use GZIPInputStream to receive.
InputStream in = conn.getInputStream();
GZIPInputStream gzis=new GZIPInputStream(in);
InputStreamReader reader = new InputStreamReader(gzis);
BufferedReader br = new BufferedReader(reader);
The InputStream is normal.
BTW,If you don't need Accept-Encoding,you can remove it.
And do not forget check user-agent. It's very important and different operating systems corresponding to different user-agent.

How does HTTP POST method transfer data ?

I know what is the HTTP header and how the HTTP data format , I also know how to make a HTTP post from Java ,
like
StringBuffer result = new StringBuffer();
PrintWriter out = null;
BufferedReader in = null;
StringBuffer result = new StringBuffer();
try {
URL realUrl = new URL("http://somesite/somepage.htm");
URLConnection conn = realUrl.openConnection();
conn.setRequestProperty("accept", "*/*");
conn.setRequestProperty("connection", "Keep-Alive");
conn.setRequestProperty("user-agent",
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;SV1)");
conn.setDoOutput(true);
conn.setDoInput(true);
out = new PrintWriter(conn.getOutputStream());
out.print(param);
out.flush();
in = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
String line;
while ((line = in.readLine()) != null) {
result.append(line);
}
} catch (Exception e) {
e.printStackTrace();
}
I can use some code like this write stream to web server , and read stream also .
Here is my question .
What is the HTTP POST method really do , I mean how does the client communicate to web server ?
What will happen if the web server only read the HTTP POST header and don't read the stream from client ?
Does the stream will stuck in somewhere ?
Thanks .

Java HttpURLConnection trying to login with cookie

So am trying to login this website using java but for some reason its not working as expected i got the host and all that stuff but its not going to the account page with the cookie it still shows the login page and yes my account info is correct any help is great
public static void main(String[] args) {
try {
String params = "loginEmail=private#hotmail.com&loginPassword=privatepassword&Submit=Sign+In";
String urls = "http://www.filefactory.com/member/signin.php";
URL url = new URL(urls);
HttpURLConnection connection = (HttpURLConnection)url.openConnection();
connection.setRequestMethod("POST");
connection.setRequestProperty("Content-Type",
"application/x-www-form-urlencoded");
connection.setRequestProperty("Host", "www.filefactory.com");
connection.setRequestProperty("Referer", "http://www.filefactory.com/member/signin.php");
connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36 OPR/25.0.1614.50");
connection.setRequestProperty("Content-Length", "" +
Integer.toString(params.getBytes().length));
connection.setRequestProperty("Content-Language", "en-US");
connection.setDoInput(true);
connection.setDoOutput(true);
//Send request
DataOutputStream wr = new DataOutputStream (
connection.getOutputStream ());
wr.writeBytes (params);
wr.flush ();
wr.close ();
//Get Response
InputStream is = connection.getInputStream();
BufferedReader rd = new BufferedReader(new InputStreamReader(is));
String line;
StringBuilder response = new StringBuilder();
while((line = rd.readLine()) != null) {
response.append(line);
response.append('\r');
}
rd.close();
System.out.println(response.toString());
// get the cookie if need, for login
String cookies = connection.getHeaderField("Set-Cookie");
// open the new connnection again
connection = (HttpURLConnection) new URL("http://www.filefactory.com/account/").openConnection();
connection.setRequestProperty("Cookie", cookies);
connection.addRequestProperty("Accept-Language", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
connection.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36 OPR/25.0.1614.50");
connection.addRequestProperty("Host", "www.filefactory.com");
System.out.println("Redirect to URL : " + "http://www.filefactory.com/account/");
BufferedReader in = new BufferedReader(
new InputStreamReader(connection.getInputStream()));
String inputLine;
StringBuilder html = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
html.append(inputLine);
}
in.close();
System.out.println("URL Content... \n" + html.toString());
System.out.println("Done");
} catch (Exception e) {
e.printStackTrace();
}
}
}
You are using : String cookies = connection.getHeaderField("Set-Cookie");
Are you sure there is only one entry for that header? There could be more.
http://en.wikipedia.org/wiki/HTTP_cookie
Try using chrome or firefox and try logging in manually to capture the request and response. That may give you some hints regarding what could be wrong.
Additionally you could use a tool to view the communication between your client and the server (unless you are already doing so)
It's hard to tell, not knowing the exact way that website works, but you should note that it sends you the cookies first when it presents the login page to you. When you send in your credentials you have to already send them together with the cookies, so that it knows to associate those credentials with this cookie.

HTTP Response code 403 Issue

I am getting
Http 403 code
When I try to access that URL using my application
URL url = new URL(path);
url.openStream();
URL url = new URL(downloadURL);
/*URLConnection conn = url.openConnection();
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");*/
HttpURLConnection httpcon = (HttpURLConnection) url.openConnection();
httpcon.setRequestProperty("User-Agent","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36");
httpcon.connect();
//url.getFile();
InputStream is = httpcon.getInputStream();
//InputStream is = conn.getInputStream();
But I can open the URL which I am using in my application ..
Can anybody please help
I can access that URL In my browser...
That URL is from some server(local) not public.
URL url = new URL(path);
URLConnection conn = url.openConnection();
conn.getInputStream();
Try this way.

How to check if a URL exists or returns 404 with Java?

String urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_001.pdf";
URL url = new URL(urlString);
if(/* Url does not return 404 */) {
System.out.println("exists");
} else {
System.out.println("does not exists");
}
urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf";
url = new URL(urlString);
if(/* Url does not return 404 */) {
System.out.println("exists");
} else {
System.out.println("does not exists");
}
This should print
exists
does not exists
TEST
public static String URL = "http://www.nbc.com/Heroes/novels/downloads/";
public static int getResponseCode(String urlString) throws MalformedURLException, IOException {
URL u = new URL(urlString);
HttpURLConnection huc = (HttpURLConnection) u.openConnection();
huc.setRequestMethod("GET");
huc.connect();
return huc.getResponseCode();
}
System.out.println(getResponseCode(URL + "Heroes_novel_001.pdf"));
System.out.println(getResponseCode(URL + "Heroes_novel_190.pdf"));
System.out.println(getResponseCode("http://www.example.com"));
System.out.println(getResponseCode("http://www.example.com/junk"));
Output
200
200
200
404
SOLUTION
Add the next line before .connect() and the output would be 200, 404, 200, 404
huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
You may want to add
HttpURLConnection.setFollowRedirects(false);
// note : or
// huc.setInstanceFollowRedirects(false)
if you don't want to follow redirection (3XX)
Instead of doing a "GET", a "HEAD" is all you need.
huc.setRequestMethod("HEAD");
return (huc.getResponseCode() == HttpURLConnection.HTTP_OK);
this worked for me:
URL u = new URL ( "http://www.example.com/");
HttpURLConnection huc = ( HttpURLConnection ) u.openConnection ();
huc.setRequestMethod ("GET"); //OR huc.setRequestMethod ("HEAD");
huc.connect () ;
int code = huc.getResponseCode() ;
System.out.println(code);
thanks for the suggestions above.
Use HttpUrlConnection by calling openConnection() on your URL object.
getResponseCode() will give you the HTTP response once you've read from the connection.
e.g.
URL u = new URL("http://www.example.com/");
HttpURLConnection huc = (HttpURLConnection)u.openConnection();
huc.setRequestMethod("GET");
huc.connect() ;
OutputStream os = huc.getOutputStream();
int code = huc.getResponseCode();
(not tested)
Based on the given answers and information in the question, this is the code you should use:
public static boolean doesURLExist(URL url) throws IOException
{
// We want to check the current URL
HttpURLConnection.setFollowRedirects(false);
HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection();
// We don't need to get data
httpURLConnection.setRequestMethod("HEAD");
// Some websites don't like programmatic access so pretend to be a browser
httpURLConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
int responseCode = httpURLConnection.getResponseCode();
// We only accept response code 200
return responseCode == HttpURLConnection.HTTP_OK;
}
Of course tested and working.
There is nothing wrong with your code. It's the NBC.com doing tricks on you. When NBC.com decides that your browser is not capable of displaying PDF, it simply sends back a webpage regardless what you are requesting, even if it doesn't exist.
You need to trick it back by telling it your browser is capable, something like,
conn.setRequestProperty("User-Agent",
"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13");

Categories