I want to get the id cookie that Google issues when you opt in on the ads settings page (if you already accept targeted advertising, you must opt out first to see the page I am referring to).
I've found that, in order to get this cookie, you have to perform an HTTP GET to the action URL of the form on this page. The problem is that this URL contains a hash that changes for every new HTTP connection, so I must first fetch the page to read the URL and then perform the GET to that URL.
I'm using HttpComponents to get http://www.google.com/ads/preferences, but when I parse the contents with Jsoup there is only a script and no form can be found.
I'm afraid this happens because the contents are loaded dynamically using some sort of timeout. Does anyone know a workaround for this?
EDIT: by the way, the code I am using right now is:
HttpClient httpclient = new DefaultHttpClient();
// Create a local instance of cookie store
CookieStore cookieStore = new BasicCookieStore();
// Bind custom cookie store to the client
((AbstractHttpClient) httpclient).setCookieStore(cookieStore);
CookieSpecFactory csf = new CookieSpecFactory() {
    public CookieSpec newInstance(HttpParams params) {
        return new BrowserCompatSpec() {
            @Override
            public void validate(Cookie cookie, CookieOrigin origin)
                    throws MalformedCookieException {
                // Allow all cookies
                System.out.println("Allowed cookie: " + cookie.getName() + " "
                        + cookie.getValue() + " " + cookie.getPath());
            }
        };
    }
};
((AbstractHttpClient) httpclient).getCookieSpecs().register("EASY", csf);
// Create local HTTP context
HttpContext localContext = new BasicHttpContext();
// Bind custom cookie store to the local context
localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
HttpGet httpget = new HttpGet(doubleClickURL);
// Override the default policy for this request
httpclient.getParams().setParameter(
        ClientPNames.COOKIE_POLICY, "EASY");
// Pass local context as a parameter
HttpResponse response = httpclient.execute(httpget, localContext);
HttpEntity entity = response.getEntity();
if (entity != null) {
    // Read the whole body before parsing (the stream must not be closed first,
    // and a single readLine() would only return the first line)
    String html = EntityUtils.toString(entity);
    // Find action attribute of form
    Document document = Jsoup.parse(html);
    Element form = document.select("form").first();
    String optinURL = form.attr("action");
    URL connection = new URL(optinURL);
    // ... get id Cookie
}
You may have more luck using HtmlUnit, Selenium or jWebUnit for such a task. Jsoup does not interpret JavaScript, and the Google page you're pointing to is full of JavaScript that must be executed by a browser to produce what you're seeing.
HtmlUnit is OS independent and does not need anything else installed, but I've never used it for complicated JavaScript sites. HtmlUnit can also extract data from the web page like Jsoup does, but you can still feed the HTML to Jsoup if you prefer using it.
Finally I found it! I found the following site describing the doubleclick cookie protocol:
Privacy Advisory
Then it is as easy as setting a cookie in that domain with name id and value A. Then make an HTTP request to http://www.google.com/ads/preferences and they'll set a correct id value.
It is a very specific question, but I hope this helps future viewers.
By the way, I found that amazon.com, for example, is a member of the AdSense network. A script in its main page sends an HTTP request to DoubleClick at:
http://ad.doubleclick.net/adj/amzn.us.gw.atf
There you can find a script that seems to be the actual code that gives you the id cookie. In any case, if you access this URL with the cookie set to value A, it will set the DoubleClick id.
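The cookie-seeding step described above could be sketched with the JDK's own classes as follows. This is only a sketch under assumptions: it uses java.net.http.HttpClient (Java 11+) rather than HttpComponents, it assumes the cookie belongs on the .doubleclick.net domain with path /, and the class and method names are made up for illustration. Whether the server still honours this protocol is not something the code can guarantee.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;
import java.net.http.HttpClient;

final class SeedCookieSketch {

    // Builds a cookie manager that already holds the seed cookie id=A.
    // Assumption: the cookie is scoped to .doubleclick.net with path "/".
    static CookieManager seededCookieManager() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        HttpCookie seed = new HttpCookie("id", "A");
        seed.setDomain(".doubleclick.net");
        seed.setPath("/");
        seed.setVersion(0); // send it as a plain name=value pair
        manager.getCookieStore().add(URI.create("http://doubleclick.net/"), seed);
        return manager;
    }

    // A client that sends and stores cookies through the given manager.
    static HttpClient clientWith(CookieManager manager) {
        return HttpClient.newBuilder()
                .cookieHandler(manager)
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }
}
```

After sending the GET request with such a client, the updated id value (if the server issues one) would be readable from manager.getCookieStore().getCookies().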
When using the java.net.http.HttpClient classes in Java 11 and later, how does one tell the client to follow through an HTTP 303 to get to the redirected page?
Here is an example. Wikipedia provides a REST URL for getting the summary of a random page of their content. That URL redirects to the URL of the randomly-chosen page. When running this code, I see the 303 when calling HttpResponse#toString. But I do not know how to tell the client class to follow along to the new URL.
HttpClient client = HttpClient.newHttpClient();
HttpRequest request =
HttpRequest
.newBuilder()
.uri( URI.create( "https://en.wikipedia.org/api/rest_v1/page/random/summary" ) )
.build();
try
{
HttpResponse < String > response = client.send( request , HttpResponse.BodyHandlers.ofString() );
System.out.println( "response = " + response ); // ⬅️ We can see the `303` status code.
String body = response.body();
System.out.println( "body = " + body );
}
catch ( IOException e )
{
e.printStackTrace();
}
catch ( InterruptedException e )
{
e.printStackTrace();
}
When run:
response = (GET https://en.wikipedia.org/api/rest_v1/page/random/summary) 303
body =
Problem
You're using HttpClient#newHttpClient(). The documentation of that method states:
Returns a new HttpClient with default settings.
Equivalent to newBuilder().build().
The default settings include: the "GET" request method, a preference of HTTP/2, a redirection policy of NEVER [emphasis added], the default proxy selector, and the default SSL context.
As emphasized, you are creating an HttpClient with a redirection policy of NEVER.
Solution
There are at least two solutions to your problem.
Automatically Follow Redirects
If you want to automatically follow redirects then you need to use HttpClient#newBuilder() (instead of #newHttpClient()) which allows you to configure the to-be-built client. Specifically, you need to call HttpClient.Builder#followRedirects(HttpClient.Redirect) with an appropriate redirect policy before building the client. For example:
HttpClient client =
HttpClient.newBuilder()
.followRedirects(HttpClient.Redirect.NORMAL) // follow redirects
.build();
The different redirect policies are specified by the HttpClient.Redirect enum:
Defines the automatic redirection policy.
The automatic redirection policy is checked whenever a 3XX response code is received. If redirection does not happen automatically, then the response, containing the 3XX response code, is returned, where it can be handled manually.
There are three constants: ALWAYS, NEVER, and NORMAL. The meaning of the first two is obvious from their names. The last one, NORMAL, behaves just like ALWAYS except it won't redirect from https URLs to http URLs.
Manually Follow Redirects
As noted in the documentation of HttpClient.Redirect you could instead manually follow a redirect. I'm not well versed in HTTP and how to properly handle all responses so I won't give an example here. But I believe, at a minimum, this requires you:
1. Check the status code of the response.
2. If the code indicates a redirect, grab the new URI from the response headers.
3. If the new URI is relative then resolve it against the request URI.
4. Send a new request.
5. Repeat steps 1-4 as needed.
Obviously configuring the HttpClient to automatically follow redirects is much easier (and less error-prone), but this approach would give you more control.
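For what it's worth, the manual steps listed above might be sketched roughly like this. It is a simplified illustration, not a complete redirect implementation: it always re-sends a GET, treats only 301, 302, 303, 307 and 308 as redirects, and caps the number of hops at an arbitrary value. The class name, method names and MAX_HOPS are invented for the example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

final class ManualRedirectSketch {

    static final int MAX_HOPS = 5; // arbitrary cap to avoid redirect loops (assumption)

    // Checks the status code, reads the Location value and resolves it
    // against the request URI. Returns empty when there is nothing to follow.
    static Optional<URI> redirectTarget(URI requestUri, int status, Optional<String> location) {
        boolean isRedirect = status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
        if (!isRedirect || location.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(requestUri.resolve(location.get()));
    }

    // Sends a GET and keeps following redirects by hand until a
    // non-redirect response arrives or the hop limit is exceeded.
    static HttpResponse<String> getFollowingRedirects(HttpClient client, URI start)
            throws IOException, InterruptedException {
        URI current = start;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            HttpRequest request = HttpRequest.newBuilder(current).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            Optional<URI> next = redirectTarget(current, response.statusCode(),
                    response.headers().firstValue("Location"));
            if (next.isEmpty()) {
                return response;
            }
            current = next.get();
        }
        throw new IOException("Too many redirects starting at " + start);
    }
}
```

Keeping the decision logic in redirectTarget separate from the network loop makes it possible to add policy checks (for example, refusing an https to http downgrade) in one place.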
Below is the code I used to call another API from my REST API in Java.
Note that I am using Java 17. Configuring the client to follow redirects resolves the 303 status code.
@GetMapping(value = "url/api/url")
private String methodName() throws IOException, InterruptedException {
    var url = "api/url/"; // remote API URL which you want to call
    System.out.println(url);
    var request = HttpRequest.newBuilder().GET().uri(URI.create(url)).setHeader("access-token-key", "accessTokenValue").build();
    System.out.println(request);
    // Follow redirects (such as 303) automatically
    var client = HttpClient.newBuilder().followRedirects(HttpClient.Redirect.NORMAL).build();
    System.out.println(client);
    var response = client.send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response);
    System.out.println(response.body());
    return response.body();
}
I have been trying to find a solution to this the whole evening now...
I am writing an app which requests data from a web server. The server answers in JSON format.
Everything works well except when I enter an umlaut like ä into my app.
In the following, I assume the request URL is http://example.com/?q= and I am searching for "Jäger".
The correct call would then be http://example.com/?q=J%C3%A4ger
So my problem is now:
Whether I pass my URL string to HttpGet encoded or unencoded, it always results in a double-encoded URL.
The request to my server is then http://example.com/?q=J%25C3%25A4ger (it encodes the percent signs),
which leads to the server searching the database for J%C3%A4ger, which is obviously wrong.
So my question is: how can I make sure that, if the user enters "Jäger", my app calls the correctly encoded URL?
Thanks for any help!
Here is the code currently in use... It's probably the worst possible idea I had...
URI url = new URI("http", "//example.com/?q=" + ((EditText)findViewById(R.id.input)).getText().toString(), null);
Log.v("MyLogTag", "API Request: " + url);
HttpGet httpGetRequest = new HttpGet(url);
// Execute the request in the client
HttpResponse httpResponse;
httpResponse = defaultClient.execute(httpGetRequest);
Update: Sorry, HttpParams isn't meant for request parameters but for configuring HttpClient.
On Android, you might want to use Uri.Builder, as suggested in this other SO answer:
Uri uri = new Uri.Builder()
.scheme("http")
.authority("example.com")
.path("someservlet")
.appendQueryParameter("param1", foo)
.appendQueryParameter("param2", bar)
.build();
HttpGet request = new HttpGet(uri.toString());
// This looks very tempting but does NOT set request parameters
// but just HttpClient configuration parameters:
// HttpParams params = new BasicHttpParams();
// params.setParameter("q", query);
// request.setParams(params);
HttpResponse response = defaultClient.execute(request);
String json = EntityUtils.toString(response.getEntity());
Outside of Android, your best bet is building the query string manually (with all the encoding hassles) or finding something similar to Android's Uri.Builder.
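As a rough illustration of the manual route, the sketch below encodes a single query parameter with the JDK's URLEncoder (the Charset overload requires Java 10 or later). The class and method names are made up for the example, and it assumes the base URL has no query string yet. Note that URLEncoder produces form encoding, so a space becomes + rather than %20.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryStringSketch {

    // Appends one UTF-8 encoded query parameter to a base URL.
    // Assumption: baseUrl does not already contain a query string.
    static String withQueryParam(String baseUrl, String name, String value) {
        return baseUrl + "?"
                + URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "="
                + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The resulting string should be passed on unchanged, for example to new HttpGet(String). Encoding it a second time is what turns %C3%A4 into %25C3%25A4.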
I have to upload a file to a server which only exposes a JSF web page with a file upload button (over HTTP). I have to automate a process (run as a standalone Java process) which generates a file and uploads it to the server. Sadly, the server to which the file has to be uploaded does not provide FTP or SFTP. Is there a way to do this?
Thanks,
Richie
When programmatically submitting a JSF-generated form, you need to make sure that you take the following 3 things into account:
Maintain the HTTP session (certainly if website has JSF server side state saving turned on).
Send the name-value pair of the javax.faces.ViewState hidden field.
Send the name-value pair of the button which is virtually to be pressed.
Otherwise the action will possibly not be invoked at all. For the rest, it's no different from "regular" forms. The flow is basically as follows:
Send a GET request on the page with the form.
Extract the JSESSIONID cookie.
Extract the value of the javax.faces.ViewState hidden field from the response. If necessary (for sure if it has a dynamically generated name and thus possibly changes every request), extract the name of the file input field and the submit button as well. Dynamically generated IDs/names are recognizable by the j_id prefix.
Prepare a multipart/form-data POST request.
Set the JSESSIONID cookie (if not null) on that request.
Set the name-value pair of javax.faces.ViewState hidden field and the button.
Set the file to be uploaded.
You can use any HTTP client library to perform the task. The standard Java SE API offers java.net.URLConnection for this, which is pretty low level. To end up with less verbose code, you could use Apache HttpClient to do the HTTP requests and manage the cookies and Jsoup to extract data from the HTML.
Here's a kickoff example, assuming that the page has only one <form> (otherwise you need to include a unique identifier of that form in Jsoup's CSS selectors):
String url = "http://localhost:8088/playground/test.xhtml";
String viewStateName = "javax.faces.ViewState";
String submitButtonValue = "Upload"; // Value of upload submit button.
HttpClient httpClient = new DefaultHttpClient();
HttpContext httpContext = new BasicHttpContext();
httpContext.setAttribute(ClientContext.COOKIE_STORE, new BasicCookieStore());
HttpGet httpGet = new HttpGet(url);
HttpResponse getResponse = httpClient.execute(httpGet, httpContext);
Document document = Jsoup.parse(EntityUtils.toString(getResponse.getEntity()));
String viewStateValue = document.select("input[type=hidden][name=" + viewStateName + "]").val();
String uploadFieldName = document.select("input[type=file]").attr("name");
String submitButtonName = document.select("input[type=submit][value=" + submitButtonValue + "]").attr("name");
File file = new File("/path/to/file/you/want/to/upload.ext");
InputStream fileContent = new FileInputStream(file);
String fileContentType = "application/octet-stream"; // Or whatever specific.
String fileName = file.getName();
HttpPost httpPost = new HttpPost(url);
MultipartEntity entity = new MultipartEntity();
entity.addPart(uploadFieldName, new InputStreamBody(fileContent, fileContentType, fileName));
entity.addPart(viewStateName, new StringBody(viewStateValue));
entity.addPart(submitButtonName, new StringBody(submitButtonValue));
httpPost.setEntity(entity);
HttpResponse postResponse = httpClient.execute(httpPost, httpContext);
// ...
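The kickoff example above relies on Apache HttpClient's MultipartEntity. If only the JDK is available, the "prepare a multipart/form-data POST request" step could be approximated by assembling the body by hand, roughly as sketched below. The class name, method name and parameters are invented for illustration, the sketch needs Java 11+ for ByteArrayOutputStream.writeBytes, and it does not escape quotes in field or file names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class MultipartSketch {

    private static final String CRLF = "\r\n";

    // Assembles a multipart/form-data body: plain text fields first
    // (for example javax.faces.ViewState and the submit button),
    // followed by a single file part and the closing boundary.
    static byte[] build(String boundary, Map<String, String> fields,
                        String fileField, String fileName,
                        String contentType, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String part = "--" + boundary + CRLF
                    + "Content-Disposition: form-data; name=\"" + field.getKey() + "\"" + CRLF
                    + CRLF
                    + field.getValue() + CRLF;
            out.writeBytes(part.getBytes(StandardCharsets.UTF_8));
        }
        String fileHeader = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fileField
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: " + contentType + CRLF
                + CRLF;
        out.writeBytes(fileHeader.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(fileBytes);
        out.writeBytes(CRLF.getBytes(StandardCharsets.UTF_8));
        // The closing delimiter carries two extra dashes at the end
        out.writeBytes(("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }
}
```

The request carrying this body would need the header Content-Type: multipart/form-data; boundary=<the same boundary>, along with the JSESSIONID cookie obtained from the initial GET.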
Try using HttpClient. Here's an article that I think describes what you want; towards the bottom there's a section titled "Using HttpClient-Based FileUpload".
Hope this helps.
Probably that webpage just sends a POST request to the server with the contents of the form. You can easily send such a POST request yourself from Java, without using that page. For example this article shows an example of sending POST requests from Java
What you'll need to do is to examine the HTML on the page and work out what parameters are needed to post the form. It'll probably look something like this:
<form action="/RequestURL">
<input type=file name=file1>
<input type=textbox name=value1>
</form>
Based on that you can write some code to do a POST request to the url:
String data = URLEncoder.encode("value1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
data += "&" + URLEncoder.encode("file1", "UTF-8") + "=" + URLEncoder.encode(FileData, "UTF-8");
// Send data
URL url = new URL("http://servername.com/RequestURL");
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
wr.close();
Remember that the person who wrote the page might do some checks to make sure the POST request came from the same site. In that case you might be in trouble, and you might need to set the user agent correctly.
You could try to use HtmlUnit for this. It provides a very simple API for simulating browser actions. I have already used this approach for similar requirements. It's very easy. You should give it a try.
So this is currently how my app is set up:
1.) Login Activity.
2.) Once logged in, other activities may be fired up that use PHP scripts that require the cookies sent from logging in.
I am using one HttpClient across my app to ensure that the same cookies are used, but my problem is that I am getting 2 of the 3 cookies rejected. I do not care about the validity of the cookies, but I do need them to be accepted. I tried setting the CookiePolicy, but that hasn't worked either. This is what logcat is saying:
11-26 10:33:57.613: WARN/ResponseProcessCookies(271): Cookie rejected: "[version: 0] [name: cookie_user_id][value: 1][domain: www.trackallthethings.com][path: trackallthethings][expiry: Sun Nov 25 11:33:00 CST 2012]". Illegal path attribute "trackallthethings". Path of origin: "/mobile-api/login.php"
11-26 10:33:57.593: WARN/ResponseProcessCookies(271): Cookie rejected: "[version: 0][name: cookie_session_id][value: 1985208971][domain: www.trackallthethings.com][path: trackallthethings][expiry: Sun Nov 25 11:33:00 CST 2012]". Illegal path attribute "trackallthethings". Path of origin: "/mobile-api/login.php"
I am sure that my actual code is correct (my app still logs in correctly, just doesn't accept the aforementioned cookies), but here it is anyway:
HttpGet httpget = new HttpGet(/* MY URL */);
HttpResponse response;
response = Main.httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
InputStream in = entity.getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder sb = new StringBuilder();
From here I use the StringBuilder to simply get the String of the response. Nothing fancy.
I understand that the reason my cookies are being rejected is because of an "Illegal path attribute" (I am running a script at /mobile-api/login.php whereas the cookie will return with a path of just "/" for trackallthethings), but I would like to accept the cookies anyhow. Is there a way to do this?
The issue you are facing seems to be by design, for privacy/security purposes. In general, a resource is not allowed to set a cookie that it would not be able to receive itself. Here you are trying to set a cookie with the path trackallthethings from the resource /mobile-api/login.php, which is why it is rejected.
You have the following two options:
Set the cookie with a path which is accessible to both resources (this may be the root '/'), OR
Define a custom cookie policy and register your own cookie support. Here is related documentation and example.
Hope this helps.
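To make the rejection in the log above more concrete, here is a deliberately simplified model of the path check: a cookie is only accepted when the path of the requesting resource starts with the cookie's path attribute. This is an illustration of the rule, not HttpClient's actual implementation, and the class and method names are made up.

```java
final class CookiePathSketch {

    // Simplified prefix rule: the request path must start with the cookie path.
    static boolean pathAccepted(String cookiePath, String requestPath) {
        return requestPath.startsWith(cookiePath);
    }

    public static void main(String[] args) {
        String origin = "/mobile-api/login.php";
        // The path from the log has no leading slash and is not a prefix of the origin
        System.out.println(pathAccepted("trackallthethings", origin));
        // The root path is a prefix of every request path
        System.out.println(pathAccepted("/", origin));
    }
}
```

Under this model, changing the cookie path on the server to "/" (the first option above) is enough for the cookie to pass validation.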
Since the API of HttpClient seems to change very fast, here is some working example code for HttpClient 4.5.1 to allow all (malformed) cookies:
class EasyCookieSpec extends DefaultCookieSpec {
    @Override
    public void validate(Cookie arg0, CookieOrigin arg1) throws MalformedCookieException {
        // allow all cookies
    }
}
class EasySpecProvider implements CookieSpecProvider {
    @Override
    public CookieSpec create(HttpContext context) {
        return new EasyCookieSpec();
    }
}
// Register the permissive spec under the name "easy"
Registry<CookieSpecProvider> r = RegistryBuilder.<CookieSpecProvider>create()
        .register("easy", new EasySpecProvider())
        .build();
CookieStore cookieStore = new BasicCookieStore();
// Select the "easy" spec for all requests made by this client
RequestConfig requestConfig = RequestConfig.custom()
        .setCookieSpec("easy")
        .build();
CloseableHttpClient httpclient = HttpClients.custom()
        .setDefaultCookieStore(cookieStore)
        .setDefaultCookieSpecRegistry(r)
        .setDefaultRequestConfig(requestConfig)
        .build();
I've successfully managed to logon to a site using httpclient and print out the cookies that enable that logon.
However, I am now stuck because I want to display subsequent pages in a JEditorPane using the setPage(url) method. When I do that and analyse my GET request using Wireshark, I see that the user agent is not my httpclient but the following:
User-Agent: Java/1.6.0_17
The GET request (which is coded somewhere inside JEditorPane's setPage(URL url) method) does not have the cookies that were retrieved using the httpclient. My question is: how can I transfer the cookies received with httpclient so that my JEditorPane can display URLs from the site?
I'm beginning to think it's not possible and that I should try to log on using the normal Java URLConnection etc., but I would rather stick with httpclient as it's more flexible (I think). Presumably I would still have a problem with the cookies?
I had thought of extending the JEditorPane class and overriding setPage(), but I don't know what code I should put in it, as I can't seem to find out how setPage() actually works.
Any help/suggestions would be greatly appreciated.
Dave
As I mentioned in the comment, HttpClient and the URLConnection used by the JEditorPane to fetch the URL content don't talk to each other. So, any cookies that HttpClient may have fetched won't transfer over to the URLConnection. However, you can subclass JEditorPane like so:
final HttpClient httpClient = new DefaultHttpClient();
/* initialize httpClient and fetch your login page to get the cookies */
JEditorPane myPane = new JEditorPane() {
protected InputStream getStream(URL url) throws IOException {
HttpGet httpget = new HttpGet(url.toExternalForm());
HttpResponse response = httpClient.execute(httpget);
HttpEntity entity = response.getEntity();
// important! by overriding getStream you're responsible for setting content type!
setContentType(entity.getContentType().getValue());
// another thing that you're now responsible for... this will be used to resolve
// the images and other relative references. also beware whether it needs to be a url or string
getDocument().putProperty(Document.StreamDescriptionProperty, url);
// using commons-io here to take care of some of the more annoying aspects of InputStream
InputStream content = entity.getContent();
try {
return new ByteArrayInputStream(IOUtils.toByteArray(content));
}
catch(RuntimeException e) {
httpget.abort(); // per example in HttpClient, abort needs to be called on unexpected exceptions
throw e;
}
finally {
IOUtils.closeQuietly(content);
}
}
};
// now you can do this!
myPane.setPage(new URL("http://www.google.com/"));
By making this change, you'll be using HttpClient to fetch the URL content for your JEditorPane. Be sure to read the JavaDoc here http://download.oracle.com/javase/1.4.2/docs/api/javax/swing/JEditorPane.html#getStream(java.net.URL) to make sure that you catch all the corner cases. I think I've got most of them sorted, but I'm not an expert.
Of course, you can change around the HttpClient part of the code to avoid loading the response into memory first, but this is the most concise way. And since you're going to be loading it up into an editor, it will all be in memory at some point. ;)
Under Java 5 & 6, there is a default cookie manager which "automatically" supports HttpURLConnection, the type of connection JEditorPane uses by default.
Based on this blog entry, writing something like
CookieManager manager = new CookieManager();
// ACCEPT_NONE would reject every cookie; ACCEPT_ALL is needed to keep them
manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
CookieHandler.setDefault(manager);
seems to be enough to support cookies in JEditorPane.
Make sure to add this code before any internet communication with JEditorPane takes place.
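Building on the default cookie manager idea, a cookie obtained elsewhere (for instance one read from HttpClient's cookie store after logging in) could be copied into the manager before calling setPage. The sketch below shows this for a single cookie. The class name, method name and the way name and value are passed in are assumptions made for illustration, and it assumes the cookie should apply to the whole host.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;

final class CookieBridgeSketch {

    // Installs a default cookie manager that already contains one cookie,
    // so that URLConnection-based code (such as JEditorPane) sends it.
    static CookieManager installWithCookie(URI site, String name, String value) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setDomain(site.getHost());
        cookie.setPath("/");
        cookie.setVersion(0); // plain name=value format in the Cookie header
        manager.getCookieStore().add(site, cookie);
        CookieHandler.setDefault(manager);
        return manager;
    }
}
```

A call such as CookieBridgeSketch.installWithCookie(URI.create("http://example.com/"), "session", "abc") would have to run before the first setPage call, for the same reason given above.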