How to use HTML response to extract data in Java? - java

When I send an HTTP request from Java, I get the response as HTML code. For example, sending the request http://www.google.com/search?q=what%20is%20mango
returns the HTML code of this page:
https://www.google.co.in/search?q=what+is+mango&rlz=1C1CHBF_enIN743IN743&oq=what+is+mango&aqs=chrome..69i57j0l5.4095j0j7&sourceid=chrome&ie=UTF-8
From this response page, I want to send a second request to the Wikipedia page listed in the results, copy the content about mango from that page, and write it to a file on my system.
This is the code that sends the Google search request:
package api_test;
import java.io.*;
import java.net.*;
import java.util.*;
public class HttpURLConnectionExample {
private final String USER_AGENT= "Mozilla/5.0";
public static void main(String[] args) throws Exception {
HttpURLConnectionExample http= new HttpURLConnectionExample();
System.out.println("testing 1- send http get request");
http.sendGet();
}
private void sendGet() throws Exception{
Scanner s= new Scanner(System.in);
System.out.println("enter the URL");
String url = s.nextLine();
URL obj = new URL("http://"+url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
// optional default is GET
con.setRequestMethod("GET");
//add request header
con.setRequestProperty("User-Agent", USER_AGENT);
int responseCode = con.getResponseCode();
System.out.println("\nSending 'GET' request to URL : " + url);
System.out.println("Response Code : " + responseCode);
BufferedReader in = new BufferedReader(
new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
//print result
System.out.println(response.toString());
}
}

I think what you need is an HTML parser, like jsoup.
You could do something like
Document doc = Jsoup.connect("http://www.google.com/search?q=what%20is%20mango").get();
Element result = doc.select("#search h3.r a").first();
String link = result.attr("data-href");
I'm not sure how often Google's layout changes, but right now the CSS selector "#search h3.r a" is working.
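For the last step in the question, writing the extracted text to a file, the standard library is enough. The following is a minimal sketch, assuming the article text has already been pulled out of the Wikipedia page as a String (for example with jsoup's Element.text()). The class name, method name and file name are placeholders, not part of the original code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExtractedTextWriter {

    // Writes the text as UTF-8, creating the file or overwriting an existing one.
    static Path save(Path target, String text) {
        try {
            return Files.write(target, text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Rethrown unchecked so callers are not forced to declare IOException.
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as ExtractedTextWriter.save(Paths.get("mango.txt"), text) would then store the content next to the program; the encoding is set explicitly so the result does not depend on the platform default.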

Microsoft Graph 401 Unauthorized with access token

Unable to get data from the Microsoft Graph API.
private String getUserNamesFromGraph() throws Exception {
String bearerToken = "Bearer "+getAccessToken();
String url = "https://graph.microsoft.com/v1.0/users";
String returnData = null;
try {
URL apiURL = new URL(url);
URLConnection con = apiURL.openConnection();
con.setRequestProperty("Authorization", bearerToken);
con.setRequestProperty("Content-Type", "application/json");
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
returnData = response.toString();
System.out.println(returnData);
} catch(Exception e) {
System.out.println(e);
}
return returnData;
}
private String getAccessToken() throws Exception {
String url = "https://login.microsoftonline.com/common/oauth2/v2.0/token";
URL obj = new URL(url);
HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
// header
con.setRequestMethod("POST");
con.setRequestProperty("User-Agent", "eTarget API");
con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
String urlParameters = "client_id=*** APPLICATION ID FROM APPLICATION REGISTRATION PORTAL ***"
+ "&scope=https%3A%2F%2Fgraph.microsoft.com%2F.default"
+ "&client_secret=*** APPLICATION SECRET FROM APPLICATION REGISTRATION PORTAL ***"
+ "&grant_type=client_credentials";
// Send post request
con.setDoOutput(true);
DataOutputStream wr = new DataOutputStream(con.getOutputStream());
wr.writeBytes(urlParameters);
wr.flush();
wr.close();
int responseCode = con.getResponseCode();
System.out.println("\nSending 'POST' request to URL : " + url);
System.out.println("Post parameters : " + urlParameters);
System.out.println("Response Code : " + responseCode);
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
//print result
String returnData = response.toString();
System.out.println(returnData);
Map jsonTokenData = new Gson().fromJson(returnData, Map.class);
String accessToken = (String)jsonTokenData.get("access_token");
//System.out.println(accessToken);
return accessToken;
}
The application is registered
I have a method getAccessToken() that successfully returns an access token
The method getUserNamesFromGraph() however returns a 401 Unauthorized instead of the expected data.
I've gone through the documentation countless times, trying different variations and endpoints but to no avail. Any ideas appreciated.
For your application to read the users, it must have an explicitly granted User.Read.All application permission. This permission requires admin consent. Here is one link where it is explained how to grant that permission. You must invoke the interactive consent dialog to grant your application the permissions. Otherwise you will still receive an Insufficient permissions error.
Then here is the complete list of the different Microsoft Graph permissions. In your case, a daemon application without user interaction, you have to look at the application permissions and not the delegated permissions.
Once you grant the appropriate permissions, you will be able to query the users. You do not have to change the scope in your token request. Leave it as it is: https://graph.microsoft.com/.default
Once you have made all these changes, you can use https://jwt.ms to check your access token. There you can extract all the claims and check your audience and scope claims to further understand why you get 401 from Microsoft Graph.
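The same claim check can also be done locally. A JWT consists of three base64url-encoded segments separated by dots, and the second segment is the JSON payload. The sketch below only decodes that payload and does not validate the signature; the class and method names are hypothetical. For an app-only token, granted application permissions are expected to appear in the roles claim, and the aud claim should point at Microsoft Graph.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class JwtPayloadSketch {

    // Returns the JSON payload (the claims) of a JWT without verifying it.
    static String decodePayload(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected header.payload.signature");
        }
        // JWT segments use the URL-safe alphabet and usually omit padding;
        // java.util.Base64's URL decoder accepts both forms.
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }
}
```

Printing JwtPayloadSketch.decodePayload(getAccessToken()) before calling the users endpoint would show whether the token actually carries the permission that was granted.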
The reason for this is that the requested permissions must also be supported for applications. A case in point: an application isn't allowed to list managed devices, as shown in the Prerequisites section of this page:
List managedDevices permissions

HTTP post request fails

I wrote the code below to send a POST request to a URL. When I run it, I get a 500 error code. But when I tried the same URL in SoapUI with the headers below, I got the response back. May I know what is wrong in my code? I suspect I didn't add the headers properly. Thanks in advance.
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:arm="http://siebel.com/Webservice">
<soapenv:Header>
<UsernameToken xmlns="http://siebel.com/webservices">username</UsernameToken>
<PasswordText xmlns="http://siebel.com/webservices">password</PasswordText>
<SessionType xmlns="http://siebel.com/webservices">Stateless</SessionType>
</soapenv:Header>
<soapenv:Body>
<arm:QueryList_Input>
<arm:SRNum></arm:SRNum>
</arm:QueryList_Input>
</soapenv:Body>
</soapenv:Envelope>
Below is my code.
package com.siebel.Webservice;
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;
public class HttpQueryList {
private final String USER_AGENT = "Mozilla/5.0";
public static void main(String[] args) throws Exception {
HttpQueryList http = new HttpQueryList();
System.out.println("\nTesting 2 - Send Http POST request");
http.sendPost();
}
// HTTP POST request
private void sendPost() throws Exception {
String url = "https://mywebsite.org/start.swe";
URL obj = new URL(url);
HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
//add request header
con.setRequestMethod("POST");
con.setRequestProperty("UsernameToken", "username");
con.setRequestProperty("PasswordText", "password");
String urlParameters = "SWEExtSource=WebService&SWEExtCmd=Execute&WSSOAP=1";
// Send post request
con.setDoOutput(true);
DataOutputStream wr = new DataOutputStream(con.getOutputStream());
wr.writeBytes(urlParameters);
wr.flush();
wr.close();
int responseCode = con.getResponseCode();
System.out.println("\nSending 'POST' request to URL : " + url);
System.out.println("Post parameters : " + urlParameters);
System.out.println("Response Code : " + responseCode);
BufferedReader in = new BufferedReader(
new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
//print result
System.out.println(response.toString());
}
}
In your XML, you are specifying a token. When I have done this using SOAP UI, I have a certificate file that I use. In my case, I put it in my C:\Program Files (x86)\SmartBear\SoapUI-5.2.1\bin folder. Then I configured SOAP UI to use this. Do you have a certificate? If yes, are you referencing it?
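Independently of any certificate, it may be worth checking where the data is placed. In the code above, UsernameToken and PasswordText are set as HTTP headers and the SWEExt parameters are sent as the request body, whereas in the SoapUI request they are SOAP header elements inside the XML envelope. The following is a minimal sketch under the assumption that the service expects the envelope as the POST body and the SWE parameters in the query string. The class name, the helper names and the optional SOAPAction parameter are hypothetical; whether a SOAPAction value is required depends on the service's WSDL.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class SoapPostSketch {

    // Escapes the characters that are special inside XML text content.
    static String xmlEscape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Rebuilds the envelope from the question with the given values.
    static String buildEnvelope(String user, String password, String srNum) {
        String ns = " xmlns=\"http://siebel.com/webservices\">";
        return "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
             + " xmlns:arm=\"http://siebel.com/Webservice\">"
             + "<soapenv:Header>"
             + "<UsernameToken" + ns + xmlEscape(user) + "</UsernameToken>"
             + "<PasswordText" + ns + xmlEscape(password) + "</PasswordText>"
             + "<SessionType" + ns + "Stateless</SessionType>"
             + "</soapenv:Header>"
             + "<soapenv:Body><arm:QueryList_Input>"
             + "<arm:SRNum>" + xmlEscape(srNum) + "</arm:SRNum>"
             + "</arm:QueryList_Input></soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    // Posts the envelope as the request body and returns the HTTP status code.
    static int post(String endpoint, String envelope, String soapAction) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        if (soapAction != null) {
            // Assumption: only set when the WSDL names an action for the operation.
            con.setRequestProperty("SOAPAction", soapAction);
        }
        try (OutputStream out = con.getOutputStream()) {
            out.write(envelope.getBytes(StandardCharsets.UTF_8));
        }
        return con.getResponseCode();
    }
}
```

With this shape, the call would look like SoapPostSketch.post("https://mywebsite.org/start.swe?SWEExtSource=WebService&SWEExtCmd=Execute&WSSOAP=1", SoapPostSketch.buildEnvelope("username", "password", ""), null).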

Server returned HTTP response code: 415 Unsupported media type

I have a rest web service like below.
@POST
@Path("/startProcess")
@Produces(MediaType.TEXT_PLAIN)
@Consumes(MediaType.APPLICATION_JSON)
public String startProcess(InputParams inputParams, @Context HttpServletRequest request, @Context HttpServletResponse response) {
ProjectBean projBean = new ProjectBean();
Helper.loadProjectBean(inputParams, projBean);
return "1";
}
Now I am trying to consume it with below main program.
public static void main(String[] args) throws Exception {
StringBuffer response = new StringBuffer();
String taigaServiceUrl = "http://localhost:8181/restServer/rest/TestWebService/startProcess/";
URL url = new URL(taigaServiceUrl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("POST");
conn.setDoOutput(true);
conn.setRequestProperty("Content-Type", "application/json");
String userpass = "admin" + ":" + "admin";
String basicAuth = "Basic " + new String(new Base64().encode(userpass.getBytes()));
conn.setRequestProperty("Authorization", basicAuth);
InputParams inputParams = new InputParams();
inputParams.setXXX("xxxx");
inputParams.setYYYY("123456");
inputParams.setZZZZ("ZZZZ");
String json = new Gson().toJson(inputParams);
DataOutputStream os = new DataOutputStream (conn.getOutputStream());
os.write(json.getBytes());
os.flush();
BufferedReader br = new BufferedReader(new InputStreamReader((conn.getInputStream())));
String inputLine;
while ((inputLine = br.readLine()) != null) {
response.append(inputLine);
}
br.close();
}
But every time I get the error below.
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 415 for URL: http://localhost:8181/restServer/rest/TestWebService/startProcess/
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at scm.controllers.Test.main(Test.java:64)
According to the error, the media type is unsupported. My REST web service consumes JSON and my main program sends JSON, so where is it breaking?
After a lot of debugging I found the solution to my problem. I needed to add the JARs below to the classpath. Without them, Jersey was not able to bind the JSON object to the REST service.
jackson-annotations-2.5.4.jar
jackson-core-2.5.4.jar
jackson-databind-2.5.4.jar
jackson-jaxrs-base-2.5.4.jar
jackson-jaxrs-json-provider-2.5.4.jar
jersey-entity-filtering-2.22.2.jar
jersey-media-json-jackson-2.22.2.jar
I think you need to define a JSON processor. Have a look at this guide:
https://www.nabisoft.com/tutorials/java-ee/producing-and-consuming-json-or-xml-in-java-rest-services-with-jersey-and-jackson
Check how your @Produces and @Consumes annotations line up with the request headers.
@Produces(MediaType.TEXT_PLAIN)
@Consumes(MediaType.APPLICATION_JSON)
As per the annotations, your endpoint receives JSON and returns plain text.
In your client program, you have set the content type to JSON:
conn.setRequestProperty("Content-Type", "application/json");
Content-Type describes the request body, so this matches @Consumes and should stay as it is; changing it to text/plain would itself produce a 415. The response type is negotiated through the Accept header, so if you set one, it should allow plain text:
conn.setRequestProperty("Accept", "text/plain");
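The stack trace in the question comes from getInputStream(), which throws for any 4xx or 5xx status and so hides whatever explanation the server put in the response body. A small sketch of reading that body through getErrorStream() instead; the class and method names are hypothetical, and it assumes the body is UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

class ResponseBodySketch {

    // Reads a stream to the end and decodes it as UTF-8; a null stream yields "".
    static String readFully(InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        try (InputStream stream = in) {
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns the status and body for both success and error responses.
    static String readBody(HttpURLConnection conn) throws IOException {
        int status = conn.getResponseCode();
        // For 4xx/5xx the body is only available on getErrorStream(),
        // which may be null when the server sent no body at all.
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        return "HTTP " + status + ": " + readFully(in);
    }
}
```

Replacing the BufferedReader loop in the client with System.out.println(ResponseBodySketch.readBody(conn)) would print the server's error page for the 415, which often says more than the exception message does.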

Why does QueryParam return null

I have a POST method where I try to add the parameter "enc":
protected void sendPost(String url, String encData) throws Exception {
URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
//add request header
con.setRequestMethod("POST");
con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
// Send post request
con.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(con.getOutputStream());
wr.write("enc="+encData);
wr.flush();
wr.close();
int responseCode = con.getResponseCode();
System.out.println("\nSending 'POST' request to URL : " + url);
//System.out.println("Post parameters : " + urlParameters);
System.out.println("Response Code : " + responseCode);
}
However, in my server code (below) I get null when trying to read the data. It's just a string, not JSON or anything fancy. I've also tried writing the param as "?enc="+encData, and that does not work either. The path encRead is included in the URL, so I don't think that is the issue.
@Path("/encRead")
@POST
public void decryptData(@QueryParam("enc") String enc) {
System.out.println("got endData: "+enc);
}
So far I've been referencing the answers from "Jersey POST Method is receiving null values as parameters" but still have no solution.
The problem is that you are writing to the body of the request with wr.write("enc="+encData);. @QueryParam values are read from the query string, so this would work instead:
public static void main(String[] args) throws Exception {
sendPost(".../encRead", "HelloWorld");
}
protected static void sendPost(String url, String encData) throws Exception {
String concatUrl = url + "?enc=" + encData;
URL obj = new URL(concatUrl);
[...]
//wr.write("enc=" + encData);
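One caveat when concatenating the value into the URL: encrypted data is often Base64 text, and characters such as +, / and = have to be percent-encoded to survive the query string. A minimal sketch, assuming Java 10 or later for the URLEncoder.encode(String, Charset) overload; the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryParamSketch {

    // Appends name=value to the URL, percent-encoding both parts.
    static String withQueryParam(String url, String name, String value) {
        // Use "&" if the URL already carries a query string, "?" otherwise.
        String separator = url.contains("?") ? "&" : "?";
        return url + separator
             + URLEncoder.encode(name, StandardCharsets.UTF_8)
             + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

In sendPost, the line building concatUrl could then become String concatUrl = QueryParamSketch.withQueryParam(url, "enc", encData); by default JAX-RS decodes the value again before it reaches the @QueryParam parameter.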

How to capture response status of all the Get/post request from page using Webdriver

When you hit any URL in a browser, multiple GET/POST/DELETE requests are submitted. I want to capture the status of each of these requests.
I tried the program below, but it only gives the status of the web page itself.
String url = "http://www.google.com/";
WebClient webClient = new WebClient();
HtmlPage htmlPage = webClient.getPage(url);
//verify response
Assert.assertEquals(200,htmlPage.getWebResponse().getStatusCode());
System.out.println(true);
We have a utility class using HttpURLConnection to get GET response codes, but you could use that same approach to get other methods using the setRequestMethod method. Here's how we get the GET responses:
private static int getResponseCode(String url) throws MalformedURLException, IOException{
HttpURLConnection.setFollowRedirects(true);
HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
con.setConnectTimeout(connection_time_out);
con.setRequestMethod("GET");
int responseCode = con.getResponseCode();
if(con != null){
con.disconnect();
}
return responseCode;
}
