How to retrieve specific information from an HTTP header in Java?

I spent a fair bit of time looking for a solution but couldn't find one. I am sending an HTTP request and need to retrieve certain information from the HTTP header. I'm doing this manually with the socket library. How do I omit the other information and display only the header fields I want? Is there a way, or a library, to format the HTTP header?
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.*;
import java.text.SimpleDateFormat;
public class socketv1 {
public static void main(String[] args) throws Exception {
InetAddress addr = InetAddress.getByName("www.google.com");
Socket socket = new Socket(addr, 80);
//socket.bind (new InetSocketAddress (socket.getLocalAddress().getHostAddress(), 0));
//socket.connect (new InetSocketAddress (socket.getInetAddress().getHostAddress(), 80), 1000);
boolean autoflush = true;
System.out.println("URL requested: " + socket.getInetAddress().getHostName());
System.out.println("Client: " + socket.getLocalAddress().getHostAddress() + " " + socket.getLocalPort());
System.out.println("Server: " + socket.getInetAddress().getHostAddress() + " " + socket.getPort());
System.out.println("");
PrintWriter out = new PrintWriter(socket.getOutputStream(), autoflush);
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
// send an HTTP request to the web server
out.println("GET / HTTP/1.1");
out.println("Host: www.google.com:80");
out.print ("Date Accessed: date" + "\r\n");
out.println("Connection: Close");
out.println();
// read the response
boolean loop = true;
StringBuilder sb = new StringBuilder(8096);
while (loop) {
if (in.ready()) {
int i = 0;
while (i != -1) {
i = in.read();
sb.append((char) i);
}
loop = false;
}
}
System.out.println(sb.toString());
socket.close();
}
}
Expected:
URL requested: www.google.com
Client: 192.168.1.110 53954
Server: 172.217.167.68 80
Date Accessed: 24/03/2019 14:53:59 AEST
Actual:
URL requested: www.google.com
Client: 192.168.1.110 53954
Server: 172.217.167.68 80
HTTP/1.1 200 OK
Date: Sun, 24 Mar 2019 12:52:16 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2019-03-24-12; expires=Tue, 23-Apr-2019 12:52:16 GMT; path=/; domain=.google.com
Set-Cookie: NID=179=E-IxZRjdPtqBWrSM-bdqfdYDdzPlEaC7gkdFKxYoGRJpBIdD__1ZQiVFPrSuoEqme-yBucdcczqMw_EOJaUpfuXYy1auuQWd1-AZQ6WKmQR_pz8kFZqemdm4Bc-yH0P1Zc7ODKWEmtHKpE3nT2kqIhwfp7pLZrYd3YGMrZFUwZs; expires=Mon, 23-Sep-2019 12:52:16 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding
Connection: close

Do you have to work with sockets? Could you not do something like this instead:
import java.net.URL;
import java.net.URLConnection;
public class socketv1 {
public static void main(String[] args) throws Exception {
URL url = new URL("http://www.google.com");
URLConnection c = url.openConnection();
System.out.println(c.getHeaderField("Content-Type"));
}
}
That is, work with a URLConnection from the start?
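If you do need to stay with the raw socket, one option is to parse the header section of the response text yourself and print only the fields you care about. The following is a minimal sketch, not a full HTTP parser: the class name HeaderPicker and its two helper methods are made up for illustration, repeated headers (such as Set-Cookie) keep only their last value, and the sample response is a shortened copy of the output in the question.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class HeaderPicker {

    // Parses the header section of a raw HTTP response into a map keyed by
    // lower-cased header name. Repeated headers keep only the last value.
    static Map<String, String> parseHeaders(String rawResponse) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = rawResponse.split("\r?\n");
        // lines[0] is the status line; the first empty line ends the headers.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, lines[i].substring(colon + 1).trim());
            }
        }
        return headers;
    }

    // Converts an HTTP "Date" header value into dd/MM/yyyy HH:mm:ss plus a
    // zone name, shifted to the given time zone.
    static String formatDate(String httpDate, ZoneId zone) {
        ZonedDateTime parsed = ZonedDateTime.parse(httpDate, DateTimeFormatter.RFC_1123_DATE_TIME);
        DateTimeFormatter out = DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss z", Locale.ENGLISH);
        return parsed.withZoneSameInstant(zone).format(out);
    }

    public static void main(String[] args) {
        // Shortened sample taken from the question's "Actual" output.
        String raw = "HTTP/1.1 200 OK\r\n"
                + "Date: Sun, 24 Mar 2019 12:52:16 GMT\r\n"
                + "Content-Type: text/html; charset=ISO-8859-1\r\n"
                + "Server: gws\r\n"
                + "\r\n"
                + "<html>...</html>";
        Map<String, String> headers = parseHeaders(raw);
        // Print only the field of interest; everything else is ignored.
        System.out.println("Date Accessed: "
                + formatDate(headers.get("date"), ZoneId.of("Australia/Sydney")));
    }
}
```

In the socket program from the question, the string passed to parseHeaders would be sb.toString(). Note that the Date header is the server's timestamp for the response, which is only an approximation of when you accessed the page.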

Related

Why does SparkJava not process the second request on the same connection?

I have written a small server with a REST API using SparkJava, and I am trying to query that API with an Apache HttpClient. With this client, I open a connection, send a first request to the server, and receive a response. Then I reuse the same connection to send a second request. The request is transmitted, but the server does not process it. Does anyone know what I am doing wrong?
Here is a minimal working example:
Maven dependencies:
<dependency>
<groupId>com.sparkjava</groupId>
<artifactId>spark-core</artifactId>
<version>2.9.3</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents.client5</groupId>
<artifactId>httpclient5</artifactId>
<version>5.0.3</version>
</dependency>
Server class:
package minimal;
import spark.Spark;
public class Server {
public static void main(String[] args) {
Spark.post("/a", (req, resp) -> {
resp.status(204);
return "";
});
Spark.post("/b", (req, resp) -> {
resp.status(204);
return "";
});
Spark.before((req, res) -> {
System.out.println("Before: Request from " + req.ip() + " received " + req.pathInfo());
});
Spark.after((req, res) -> {
System.out.println("After: Request from " + req.ip() + " received " + req.pathInfo());
});
}
}
Client class:
package minimal;
import java.io.IOException;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
public class Client {
public static void main(String[] args) throws IOException {
try (CloseableHttpClient httpclient = HttpClients.createDefault()) {
HttpPost httpPost1 = new HttpPost("http://localhost:4567/a");
try (CloseableHttpResponse response1 = httpclient.execute(httpPost1)) {
System.out.println(response1.getCode() + " " + response1.getReasonPhrase());
}
HttpPost httpPost2 = new HttpPost("http://localhost:4567/b");
try (CloseableHttpResponse response2 = httpclient.execute(httpPost2)) {
System.out.println(response2.getCode() + " " + response2.getReasonPhrase());
}
}
}
}
The server output on the console:
Before: Request from 127.0.0.1 received /a
After: Request from 127.0.0.1 received /a
Here is the shortened tcpdump output:
14:52:15.210468 IP localhost.44020 > localhost.4567:
POST /a HTTP/1.1
Accept-Encoding: gzip, x-gzip, deflate
Host: localhost:4567
Connection: keep-alive
User-Agent: Apache-HttpClient/5.0.3 (Java/1.8.0_282)
14:52:15.271563 IP localhost.4567 > localhost.44020:
HTTP/1.1 204 No Content
Date: Tue, 27 Apr 2021 12:52:15 GMT
Content-Type: text/html;charset=utf-8
Server: Jetty(9.4.26.v20200117)
14:52:15.277376 IP localhost.44020 > localhost.4567:
POST /b HTTP/1.1
Accept-Encoding: gzip, x-gzip, deflate
Host: localhost:4567
Connection: keep-alive
User-Agent: Apache-HttpClient/5.0.3 (Java/1.8.0_282)
After that, no further response from the server was recorded.
Here's a client sample. Please try it out and see if it works for you.
I tested it and it worked fine.
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import java.io.IOException;
public class T1 {
static void runPost(CloseableHttpClient c,String s)
{
HttpPost httpPost1 = new HttpPost(s);
try(CloseableHttpResponse response1 = c.execute(httpPost1)) {
System.out.println(Thread.currentThread().getName() + ": " +
response1.getStatusLine().getStatusCode() + " " +
response1.getStatusLine().getReasonPhrase());
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws IOException {
try(CloseableHttpClient httpclient = HttpClients.createDefault()) {
T1.runPost(httpclient, "http://localhost:4567/a");
T1.runPost(httpclient, "http://localhost:4567/b");
}
System.exit(0);
}
}
The SparkJava server failed to process the second request because of the following additional Maven dependency in my project:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.0.7.1.1.0-565</version>
</dependency>
After removing this dependency, the SparkJava server works as expected.
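The answer does not say which classes clashed. When a second dependency is suspected of shadowing classes a server needs, one way to investigate is to print where a given class was actually loaded from. The following is a small diagnostic sketch; the class name WhereFrom is made up, and which class names are worth checking (for example servlet or Jetty classes) is an assumption, not something confirmed above.

```java
import java.security.CodeSource;

public class WhereFrom {

    // Returns the jar or directory a class was loaded from, or a marker
    // string when the runtime reports no code source (typical for JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(no code source, probably a JDK class)"
                              : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // The default is only a placeholder; pass a fully qualified class
        // name that you suspect is provided by two different artifacts.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

Run it with the same classpath as the server. If the printed jar is not the one the server expects, that points to the conflicting dependency.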

REST API - HTTP Fileupload with Status Code 415

Hi, I'm building a REST API to upload files.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import org.apache.http.HttpEntity;
@Path("/api")
public class RestAPI {
private final String UPLOADED_FILE_PATH = "C:/ProgramData/XXXX/";
@GET
public String getFile() {
return "Loading File...";
}
@POST
@Path("/image-upload")
@Consumes(MediaType.MULTIPART_FORM_DATA)
public Response uploadFile(HttpEntity input) throws IOException {
// Do stuff
return Response.status(200).entity("Uploaded file name : " + "").build();
}
}
Uploader Class:
import java.io.File;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.entity.mime.content.StringBody;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
public class DemoFileUploader {
public static void main(String args[]) throws Exception {
DemoFileUploader fileUpload = new DemoFileUploader();
File file = new File("C:/Users/tdr/Desktop/TestFile.txt");
// Upload the file
fileUpload.executeMultiPartRequest("http://localhost:8080/MediaHandler/mediahandler/api/image-upload",
file, file.getName(), "File Uploaded :: TestFile.txt");
}
public void executeMultiPartRequest(String urlString, File file, String fileName, String fileDescription)
throws Exception {
// default client builder
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
HttpPost postRequest = new HttpPost(urlString);
try {
FileBody fileBody = new FileBody(file, ContentType.DEFAULT_BINARY);
// Set various attributes
HttpEntity multiPartEntity = MultipartEntityBuilder.create()
.addPart("fileDescription",
new StringBody(fileDescription != null ? fileDescription : "",
ContentType.MULTIPART_FORM_DATA))
.addPart("fileName", new StringBody(fileName != null ? fileName : file.getName(),
ContentType.MULTIPART_FORM_DATA))
.addPart("attachment", fileBody).build();
// Set to request body
postRequest.setEntity(multiPartEntity);
System.out.println("Sending Request....");
System.out.println("Request: " + postRequest);
System.out.println("Request Entity: " + postRequest.getEntity().getContentType());
// Send request
CloseableHttpResponse response = httpClient.execute(postRequest);
System.out.println("Request executed.");
// Verify response if any
if (response != null) {
System.out.println("Response Status Code: " + response.getStatusLine().getStatusCode());
System.out.println("Response: " + response);
System.out.println("Response Entity: " + response);
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
and I get the following output:
Sending Request....
Request: POST http://localhost:8080/MediaHandler/mediahandler/api/image-upload HTTP/1.1
Request Entity: Content-Type: multipart/form-data; boundary=eINJSk3iptTJP7wf-cXlS-uznnnGMl99FyFmlet
Request executed.
Response Status Code: 415
Response: HTTP/1.1 415 [Content-Type: text/html;charset=utf-8, Content-Language: de, Content-Length: 785, Date: Wed, 31 Mar 2021 12:19:35 GMT, Keep-Alive: timeout=20, Connection: keep-alive]
Response Entity: HTTP/1.1 415 [Content-Type: text/html;charset=utf-8, Content-Language: de, Content-Length: 785, Date: Wed, 31 Mar 2021 12:19:35 GMT, Keep-Alive: timeout=20, Connection: keep-alive]
I tried to follow all the examples I found, but all of them are similar to my code. Can you tell me where the bug is?
I'm sending multipart/form-data and my REST API is expecting multipart/form-data...
Can you remove this annotation:
@Consumes(MediaType.MULTIPART_FORM_DATA)
Otherwise, the request header must contain Content-Type: multipart/form-data (the value of MediaType.MULTIPART_FORM_DATA).

Apache HttpClient 4.5.5 - wrong headers in response after a GET request

When I load a web page with a GET request using Apache HttpClient, the response headers differ from the ones I see when loading the same page in a browser. Here is an example page: http://empoweredfoundation.org/wp-login.php?action=register
In browser I have the following headers:
Status code: 302
Content-Type: text/html; charset=UTF-8
X-Port: port_10210
X-Cacheable: YES:Forced
Location: http://empoweredfoundation.org/register/
Content-Encoding: gzip
Transfer-Encoding: chunked
Date: Thu, 22 Feb 2018 04:04:35 GMT
Age: 0
Vary: User-Agent
X-Cache: uncached
X-Cache-Hit: MISS
X-Backend: all_requests
When I use HttpClient in my application I have these headers in response:
Status code: 200
Content-Type: text/html; charset=UTF-8
X-Port: port_10210
X-Cacheable: YES:Forced
Transfer-Encoding: chunked
Date: Thu, 22 Feb 2018 04:44:58 GMT
Age: 28224
Vary: Accept-Encoding, User-Agent
X-Cache: cached
X-Cache-Hit: HIT
X-Backend: all_requests
Server: nginx/1.12.1
Date: Thu, 22 Feb 2018 04:45:24 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.45
So I should get a 302 status code, but I get 200. Other headers also differ from the ones in the browser. I can't figure out why, or what I should do to fix it.
Here is the code:
HttpClient httpclient = null;
HttpClientBuilder builder = HttpClients.custom();
Builder requestConfigBuilder = RequestConfig.custom();
// here goes the cookie store creation, ssl configuration etc
builder.setDefaultRequestConfig(requestConfigBuilder.build());
httpclient = builder.build();
HttpResponse response = null;
HttpGet httpget = null;
Escaper escaper = UrlEscapers.urlFragmentEscaper();
httpget = new HttpGet(escaper.escape(url));
httpget.getParams().setParameter("http.socket.timeout", new Integer(socketTimeout));
httpget.getParams().setParameter("http.connection.timeout", new Integer(connectTimeout));
httpget.addHeader("Accept", "text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1");
httpget.addHeader("Accept-Language", "en-US,en;q=0.9");
httpget.addHeader("Accept-Encoding", "identity, *;q=0");
response = httpclient.execute(httpget);
I also tried CloseableHttpClient and had the same result.
I resolved this issue; this solution works: https://memorynotfound.com/apache-httpclient-redirect-handling-example/
I still get a 200 status code, not 302, but now I can handle 302 redirects (even when response.getStatusLine() shows 200).
Here is the code from the article:
package com.memorynotfound.httpclient;
import org.apache.http.HttpHost;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.client.utils.URIUtils;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.client.LaxRedirectStrategy;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;
/**
* This example demonstrates the use of {@link HttpGet} request method.
* and handling redirect strategy with {@link LaxRedirectStrategy}
*/
public class HttpClientRedirectHandlingExample {
public static void main(String... args) throws IOException, URISyntaxException {
CloseableHttpClient httpclient = HttpClients.custom()
.setRedirectStrategy(new LaxRedirectStrategy())
.build();
try {
HttpClientContext context = HttpClientContext.create();
HttpGet httpGet = new HttpGet("http://httpbin.org/redirect/3");
System.out.println("Executing request " + httpGet.getRequestLine());
System.out.println("----------------------------------------");
httpclient.execute(httpGet, context);
HttpHost target = context.getTargetHost();
List<URI> redirectLocations = context.getRedirectLocations();
URI location = URIUtils.resolve(httpGet.getURI(), target, redirectLocations);
System.out.println("Final HTTP location: " + location.toASCIIString());
} finally {
httpclient.close();
}
}
}
I also added builder.setRedirectStrategy(new LaxRedirectStrategy()); when creating the HttpClient object.
If you know of a way to get the correct status code (which should be 302), please tell me.
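A likely reason for seeing 200 instead of 302 is that the client follows the redirect automatically, so the status line you read belongs to the final response rather than the first one. In HttpClient 4.x, automatic following can be switched off with HttpClientBuilder.disableRedirectHandling(). The sketch below illustrates the same effect using only the JDK, so it is a stand-in for the HttpClient code rather than the library's own API: the class RedirectStatusDemo, the throwaway local server, and its /old and /new paths are all made up for the demonstration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

public class RedirectStatusDemo {

    // Requests the URL and returns the status code the caller ends up with.
    // With follow == false this is the status of the very first response.
    static int status(String url, boolean follow) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(follow);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Starts a local test server on a free port:
    // /old answers 302 with Location: /new, and /new answers 200.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/old", exchange -> {
            exchange.getResponseHeaders().add("Location", "/new");
            exchange.sendResponseHeaders(302, -1); // -1 means no response body
            exchange.close();
        });
        server.createContext("/new", exchange -> {
            exchange.sendResponseHeaders(200, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/old";
            System.out.println("following redirects:     " + status(url, true));
            System.out.println("not following redirects: " + status(url, false));
        } finally {
            server.stop(0);
        }
    }
}
```

With redirects disabled, the Location header of the 302 response is still available, so you can decide yourself whether to request the target.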

How to add multiple "Set-Cookie" header in servlet response?

As per the RFC (https://www.rfc-editor.org/rfc/rfc6265#page-7), it is allowed to have two headers with the same "Set-Cookie" key. The example provided in the RFC is:
Set-Cookie: SID=31d4d96e407aad42; Path=/; Secure; HttpOnly
Set-Cookie: lang=en-US; Path=/; Domain=example.com
How do I achieve the same with Jetty (or any other servlet container)? When I call httpServletResponse.addHeader this way:
httpServletResponse.addHeader("Set-Cookie", "SID=31d4d96e407aad42; Path=/; Secure; HttpOnly");
httpServletResponse.addHeader("Set-Cookie", "lang=en-US; Path=/; Domain=example.com");
I see that the second addHeader() doesn't add a new header. According to the Javadoc for this method:
Adds a response header with the given name and value. This method
allows response headers to have multiple values.
So it seems that multiple values are allowed, but I am not sure how to get multiple "Set-Cookie" headers into a servlet response.
Setting Cookies directly like that is a bit awkward, considering that the Servlet API has methods specifically for working with Cookies.
Anyway, tested on Jetty 9.3.0.v20150612 and it works as expected.
Example: SetCookieTest.java
package jetty;
import static org.hamcrest.Matchers.*;
import static org.junit.Assert.*;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.servlet.ServletException;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
public class SetCookieTest
{
@SuppressWarnings("serial")
public static class SetCookieAddHeaderServlet extends HttpServlet
{
@Override
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException
{
resp.setContentType("text/plain");
resp.addHeader("Set-Cookie","SID=31d4d96e407aad42; Path=/; Secure; HttpOnly");
resp.addHeader("Set-Cookie","lang=en-US; Path=/; Domain=example.com");
PrintWriter out = resp.getWriter();
out.println("Hello From: " + this.getClass().getName());
}
}
@SuppressWarnings("serial")
public static class SetCookieAddCookieServlet extends HttpServlet
{
@Override
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException
{
resp.setContentType("text/plain");
// Set-Cookie: SID=31d4d96e407aad42; Path=/; Secure; HttpOnly
Cookie sidCookie = new Cookie("SID","31d4d96e407aad42");
sidCookie.setPath("/");
sidCookie.setSecure(true);
sidCookie.setHttpOnly(true);
resp.addCookie(sidCookie);
// Set-Cookie: lang=en-US; Path=/; Domain=example.com
Cookie langCookie = new Cookie("lang","en-US");
langCookie.setPath("/");
langCookie.setDomain("example.com");
resp.addCookie(langCookie);
PrintWriter out = resp.getWriter();
out.println("Hello From: " + this.getClass().getName());
}
}
private static Server server;
@BeforeClass
public static void startServer() throws Exception
{
server = new Server(9090);
ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
context.addServlet(SetCookieAddHeaderServlet.class,"/test-add-header");
context.addServlet(SetCookieAddCookieServlet.class,"/test-add-cookie");
server.setHandler(context);
server.start();
}
@AfterClass
public static void stopServer() throws Exception
{
server.stop();
}
/**
* Issue simple GET request, returning entire response (including payload)
*
* @param path
* the path to request
* @return the response
*/
private String issueSimpleHttpGetRequest(String path) throws IOException
{
StringBuilder req = new StringBuilder();
req.append("GET ").append(path).append(" HTTP/1.1\r\n");
req.append("Host: localhost\r\n");
req.append("Connection: close\r\n");
req.append("\r\n");
// Connect
try (Socket socket = new Socket("localhost",9090))
{
try (OutputStream out = socket.getOutputStream())
{
// Issue Request
byte rawReq[] = req.toString().getBytes(StandardCharsets.UTF_8);
out.write(rawReq);
out.flush();
// Read Response
StringBuilder resp = new StringBuilder();
try (InputStream stream = socket.getInputStream();
InputStreamReader reader = new InputStreamReader(stream);
BufferedReader buf = new BufferedReader(reader))
{
String line;
while ((line = buf.readLine()) != null)
{
resp.append(line).append(System.lineSeparator());
}
}
// Return Response
return resp.toString();
}
}
}
@Test
public void testAddHeader() throws Exception
{
String response = issueSimpleHttpGetRequest("/test-add-header");
System.out.println(response);
assertThat("response", response, containsString("Set-Cookie: SID=31d"));
assertThat("response", response, containsString("Set-Cookie: lang=en-US"));
}
@Test
public void testAddCookie() throws Exception
{
String response = issueSimpleHttpGetRequest("/test-add-cookie");
System.out.println(response);
assertThat("response", response, containsString("Set-Cookie: SID=31d"));
assertThat("response", response, containsString("Set-Cookie: lang=en-US"));
}
}
Console Output
2015-06-25 14:18:19.186:INFO::main: Logging initialized @167ms
2015-06-25 14:18:19.241:INFO:oejs.Server:main: jetty-9.3.0.v20150612
2015-06-25 14:18:19.276:INFO:oejsh.ContextHandler:main: Started o.e.j.s.ServletContextHandler@56cbfb61{/,null,AVAILABLE}
2015-06-25 14:18:19.288:INFO:oejs.ServerConnector:main: Started ServerConnector@1ef05443{HTTP/1.1,[http/1.1]}{0.0.0.0:9090}
2015-06-25 14:18:19.289:INFO:oejs.Server:main: Started @270ms
HTTP/1.1 200 OK
Date: Thu, 25 Jun 2015 21:18:19 GMT
Content-Type: text/plain;charset=iso-8859-1
Set-Cookie: SID=31d4d96e407aad42;Path=/;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: lang=en-US;Path=/;Domain=example.com
Connection: close
Server: Jetty(9.3.0.v20150612)
Hello From: jetty.SetCookieTest$SetCookieAddCookieServlet
HTTP/1.1 200 OK
Date: Thu, 25 Jun 2015 21:18:19 GMT
Content-Type: text/plain;charset=iso-8859-1
Set-Cookie: SID=31d4d96e407aad42; Path=/; Secure; HttpOnly
Set-Cookie: lang=en-US; Path=/; Domain=example.com
Connection: close
Server: Jetty(9.3.0.v20150612)
Hello From: jetty.SetCookieTest$SetCookieAddHeaderServlet
2015-06-25 14:18:19.405:INFO:oejs.ServerConnector:main: Stopped ServerConnector@1ef05443{HTTP/1.1,[http/1.1]}{0.0.0.0:9090}
2015-06-25 14:18:19.407:INFO:oejsh.ContextHandler:main: Stopped o.e.j.s.ServletContextHandler@56cbfb61{/,null,UNAVAILABLE}
It's probably not the answer you're looking for but I just tried this myself and it worked right away:
Set-Cookie:SID=31d4d96e407aad42; Path=/; Secure; HttpOnly
Set-Cookie:lang=en-US; Path=/; Domain=example.com
Set-Cookie:JSESSIONID=76A68D96ED044DDFF0CC266810F52DDA; Path=/; HttpOnly
That's what the response looked like. Maybe the problem lies in your particular web container or in your implementation.
Try to debug the application (using remote debugging facility) to figure out where the header gets lost.

Download a cookie to make new GET request

I am trying to do a PHP GET request to a website:
The problem is that this website will only process my request if I attach Cookie information to the header of the request.
Or in picture terms, if I disable cookies in my browser, I get this:
Which means the website recognises that it's my first time 'visiting' the site.
Problem is, that if I now use the search bar on the top right, it will not process this request:
it will just show the same (general) screen.
E.g.: if I have cookies disabled and I search for "AAPL", it will not show any results.
Now if I have cookies enabled, the request is handled just fine:
And so the "AAPL" results are shown.
You can try this yourself as well:
With cookies enabled, visit http://www.pennystocktweets.com/user_posts/feeds?cat=search&lptyp=prep&usrstk=AAPL
With cookies disabled, visit the link again: http://www.pennystocktweets.com/user_posts/feeds?cat=search&lptyp=prep&usrstk=AAPL
Now compare the responses, only the first one is correct.
This means that the website only works after the client has downloaded a cookie, and then has made another (new) GET request to the server with this Cookie information attached.
(Does this imply that the website needs a session-cookie to function correctly?)
Now what I'm trying to do is imitate the request with Apache HttpClient like so:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Date;
import java.util.List;
import java.util.StringTokenizer;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
public class downloadTweets {
private String cookies;
private HttpClient client = new DefaultHttpClient();
private final String USER_AGENT = "Mozilla/5.0";
public static void main(String[] args) throws Exception {
String ticker = "AAPL";
String lptyp = "prep";
int opid = 0;
int lpid = 0;
downloadTweets test = new downloadTweets();
String url = test.constructURL(ticker, lptyp, opid, lpid);
// make sure cookies is turn on
CookieHandler.setDefault(new CookieManager());
downloadTweets http = new downloadTweets();
String page = http.GetPageContent(url, ticker);
System.out.println(page);
}
public String constructURL(String ticker, String lptyp, int opid, int lpid)
{
String link = "http://www.pennystocktweets.com/user_posts/feeds?cat=search" +
"&lptyp=" + lptyp +
"&usrstk=" + ticker;
if (opid != 0)
{
link = link +
"&opid=" + opid +
"&lpid=" + lpid;
}
return link;
}
private String GetPageContent(String url, String ticker) throws Exception {
HttpGet request = new HttpGet(url);
String RefererLink = "http://www.pennystocktweets.com/search/post/" + ticker.toUpperCase();
request.setHeader("Host", "www.pennystocktweets.com");
request.setHeader("Connection", "Keep-alive");
request.setHeader("Accept", "*/*");
request.setHeader("X-Requested-With", "XMLHttpRequest");
request.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36");
request.setHeader("Referer", RefererLink);
request.setHeader("Accept-Language", "nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4,fr;q=0.2");
HttpResponse response = client.execute(request);
int responseCode = response.getStatusLine().getStatusCode();
System.out.println("\nSending 'GET' request to URL : " + url);
System.out.println("Response Code : " + responseCode);
BufferedReader rd = new BufferedReader(
new InputStreamReader(response.getEntity().getContent()));
StringBuffer result = new StringBuffer();
String line = "";
while ((line = rd.readLine()) != null) {
result.append(line);
}
// set cookies
setCookies(response.getFirstHeader("Set-Cookie") == null ? "" :
response.getFirstHeader("Set-Cookie").toString());
return result.toString();
}
public String getCookies() {
return cookies;
}
public void setCookies(String cookies) {
this.cookies = cookies;
}
}
Now, the same thing holds: if I attach (my) cookie information, the response works just fine, and if I don't the response doesn't work.
But I don't know how to get the cookie information and then use it in a new GET request.
So my question is:
How can I make 2 requests to a website such that:
On the first GET request, I get cookie information from the website and store this in my Java program
On the second GET request, I use the stored cookie information (as a Header) to make a new request.
Note: I don't know if the cookie is a normal cookie or a session cookie but I suspect it's a session-cookie!
All help is greatly appreciated!
As the Apache Commons HttpClient documentation states in its cookie handling section:
HttpClient supports automatic management of cookies, including allowing the server to set cookies and automatically return them to the server when required. It is also possible to manually set cookies to be sent to the server.
Whenever the http client receives cookies they are persisted into HttpState and added automatically to the new request. This is the default behavior.
In the following example code, we can see the cookies returned by two GET requests. We can't see directly the cookies sent to the server, but we can use a tool such as a protocol/net sniffer or ngrep to see the data transmitted over the network:
import java.io.IOException;
import org.apache.commons.httpclient.Cookie;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.HttpState;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;
public class HttpTest {
public static void main(String[] args) throws HttpException, IOException {
String url = "http://www.whatarecookies.com/cookietest.asp";
HttpClient client = new HttpClient();
client.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
HttpMethod method = new GetMethod(url);
int res = client.executeMethod(method);
System.out.println("Result: " + res);
printCookies(client.getState());
method = new GetMethod(url);
res = client.executeMethod(method);
System.out.println("Result: " + res);
printCookies(client.getState());
}
public static void printCookies(HttpState state){
System.out.println("Cookies:");
Cookie[] cookies = state.getCookies();
for (Cookie cookie : cookies){
System.out.println(" " + cookie.getName() + ": " + cookie.getValue());
}
}
}
This is the output:
Result: 200
Cookies:
active_template::468: %2Fresponsive%2Fthree_column_inner_ad3b74de5a1c2f311bee7bca5c368aaa4e:b326b5062b2f0e69046810717534cb09
Result: 200
Cookies:
active_template::468: %2Fresponsive%2Fthree_column_inner_ad%2C+3b74de5a1c2f311bee7bca5c368aaa4e%3Db326b5062b2f0e69046810717534cb09
3b74de5a1c2f311bee7bca5c368aaa4e: b326b5062b2f0e69046810717534cb09
Here is an excerpt of ngrep:
MacBook$ sudo ngrep -W byline -d en0 "" host www.whatarecookies.com
interface: en0 (192.168.11.0/255.255.255.0)
filter: (ip) and ( dst host www.whatarecookies.com )
#####
T 192.168.11.70:56267 -> 54.228.218.117:80 [AP]
GET /cookietest.asp HTTP/1.1.
User-Agent: Jakarta Commons-HttpClient/3.1.
Host: www.whatarecookies.com.
.
####
T 54.228.218.117:80 -> 192.168.11.70:56267 [A]
HTTP/1.1 200 OK.
Server: nginx/1.4.0.
Date: Wed, 27 Nov 2013 10:22:14 GMT.
Content-Type: text/html; charset=iso-8859-1.
Content-Length: 36397.
Connection: keep-alive.
Vary: Accept-Encoding.
Vary: Cookie,Host,Accept-Encoding.
Set-Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad; expires=Fri, 29-Nov-2013 10:22:01 GMT; path=/; domain=whatarecookies.com; httponly.
Set-Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09; expires=Thu, 27-Nov-2014 10:22:01 GMT.
X-Middleton-Response: 200.
Cache-Control: max-age=0, no-cache.
X-Mod-Pagespeed: 1.7.30.1-3609.
.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
...
##
T 192.168.11.70:56267 -> 54.228.218.117:80 [AP]
GET /cookietest.asp HTTP/1.1.
User-Agent: Jakarta Commons-HttpClient/3.1.
Host: www.whatarecookies.com.
Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad.
Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09.
.
##
T 54.228.218.117:80 -> 192.168.11.70:56267 [A]
HTTP/1.1 200 OK.
Server: nginx/1.4.0.
Date: Wed, 27 Nov 2013 10:22:18 GMT.
Content-Type: text/html; charset=iso-8859-1.
Content-Length: 54474.
Connection: keep-alive.
Vary: Accept-Encoding.
Vary: Cookie,Host,Accept-Encoding.
Set-Cookie: active_template::468=%2Fresponsive%2Fthree_column_inner_ad%2C+3b74de5a1c2f311bee7bca5c368aaa4e%3Db326b5062b2f0e69046810717534cb09; expires=Fri, 29-Nov-2013 10:22:05 GMT; path=/; domain=whatarecookies.com; httponly.
Set-Cookie: 3b74de5a1c2f311bee7bca5c368aaa4e=b326b5062b2f0e69046810717534cb09; expires=Thu, 27-Nov-2014 10:22:05 GMT.
X-Middleton-Response: 200.
Cache-Control: max-age=0, no-cache.
X-Mod-Pagespeed: 1.7.30.1-3609.
.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
...
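If you would rather store the cookie yourself and attach it as a header, as the question describes, the two-step flow can be sketched with the JDK alone. This is a minimal illustration under stated assumptions: the class CookieRoundTrip, the local test server, the cookie name sid, and the 403/200 status codes are all invented for the demonstration, and cookie attributes such as Path or Expires are simply discarded instead of being honoured.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CookieRoundTrip {

    // Collects the "name=value" part of every Set-Cookie header in a response.
    static List<String> readCookies(HttpURLConnection conn) {
        List<String> cookies = new ArrayList<>();
        for (Map.Entry<String, List<String>> header : conn.getHeaderFields().entrySet()) {
            // The key is null for the status line, so keep the literal on the left.
            if ("Set-Cookie".equalsIgnoreCase(header.getKey())) {
                for (String value : header.getValue()) {
                    cookies.add(value.split(";", 2)[0].trim()); // drop Path, Expires, ...
                }
            }
        }
        return cookies;
    }

    // Sends a GET and attaches the stored cookies, if any, as a Cookie header.
    static HttpURLConnection get(String url, List<String> cookies) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        if (!cookies.isEmpty()) {
            conn.setRequestProperty("Cookie", String.join("; ", cookies));
        }
        conn.getResponseCode(); // forces the request to be sent
        return conn;
    }

    // Hypothetical test server: answers 200 when the sid cookie is present,
    // otherwise answers 403 and hands out the cookie.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String cookie = exchange.getRequestHeaders().getFirst("Cookie");
            if (cookie != null && cookie.contains("sid=abc123")) {
                exchange.sendResponseHeaders(200, -1);
            } else {
                exchange.getResponseHeaders().add("Set-Cookie", "sid=abc123; Path=/");
                exchange.sendResponseHeaders(403, -1);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            HttpURLConnection first = get(url, new ArrayList<>());
            List<String> cookies = readCookies(first);
            System.out.println("first:  " + first.getResponseCode() + ", cookies " + cookies);
            HttpURLConnection second = get(url, cookies);
            System.out.println("second: " + second.getResponseCode());
        } finally {
            server.stop(0);
        }
    }
}
```

For real sites, letting the HTTP client manage cookies (as described above) is usually less error-prone, because it also handles expiry, paths, and domains.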
