How can I get the underlying Socket from an httpclient connection? - java

This question is worded the same way as another question on SO (HttpClient: how do I obtain the underlying socket from an existing connection?), but that question is actually about tunneling other protocols through HTTP(S) and thus the answer is quite a bit different.
What I'm trying to do is make an HTTPS connection, then find out the details of that connection. Java's SSLSocket class will give me what I need, but I need to be able to get a hold of the Socket itself in order to interrogate it.
Is there a way to get to the underlying Socket? httpclient/httpcore has become a maze of factories and private/protected implementations, so it's really difficult to poke around the API to figure out how to actually get at things once they have been configured.

HttpClient intentionally makes it difficult to get hold of the underlying connection object and the socket it is bound to, primarily to ensure the connection state stays consistent and that persistent connections in the connection pool are safe to be re-used by another transaction.
However, one can get hold of the underlying connection from a response interceptor.
CloseableHttpClient httpclient = HttpClients.custom()
        .addInterceptorLast(new HttpResponseInterceptor() {
            @Override
            public void process(
                    final HttpResponse response,
                    final HttpContext context) throws HttpException, IOException {
                HttpClientContext clientContext = HttpClientContext.adapt(context);
                ManagedHttpClientConnection connection = clientContext.getConnection(ManagedHttpClientConnection.class);
                // Can be null if the response encloses no content
                if (connection != null) {
                    Socket socket = connection.getSocket();
                    System.out.println(socket);
                }
            }
        })
        .build();
try (CloseableHttpResponse response = httpclient.execute(new HttpGet("http://www.google.com/"))) {
    System.out.println(response.getStatusLine());
    EntityUtils.consume(response.getEntity());
}

I ended up using a somewhat different technique, but @oleg got me on the right track. Here's my one-time code:
HttpClientContext ctx = HttpClientContext.create();
HttpResponse response = getHttpClient().execute(method, ctx);
if (log.isDebugEnabled())
{
    ManagedHttpClientConnection connection = ctx.getConnection(ManagedHttpClientConnection.class);
    // Can be null if the response encloses no content
    if (null != connection)
    {
        Socket socket = connection.getSocket();
        if (socket instanceof SSLSocket)
        {
            SSLSocket sslSock = (SSLSocket) socket;
            log.debug("Connected to " + getEndpointURL()
                    + " using " + sslSock.getSession().getProtocol()
                    + " and suite " + sslSock.getSession().getCipherSuite());
        }
    }
}
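Side note: if all that's needed is the TLS protocol and cipher suite, ManagedHttpClientConnection (HttpClient 4.3+) also exposes a getSSLSession() method, which avoids the cast to SSLSocket. A minimal sketch under that assumption, reusing the same ctx and log as above:
ManagedHttpClientConnection connection = ctx.getConnection(ManagedHttpClientConnection.class);
if (connection != null) {
    SSLSession session = connection.getSSLSession();   // null for plain HTTP connections
    if (session != null) {
        log.debug("Connected using " + session.getProtocol()
                + " and suite " + session.getCipherSuite());
    }
}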

Related

Read from URL if content is image [duplicate]

I am trying to write a java program that will automatically download and name some of my favorite web comics. Since I will be requesting multiple objects from the same domain, I wanted to have a persistent http connection that I could keep open until all the comics have been downloaded. Below is my work-in-progress. How do I make another request from the same domain but different path without opening a new http connection?
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ComicDownloader
{
    public static void main(String[] args)
    {
        URL url = null;
        HttpURLConnection httpc = null;
        BufferedReader input = null;
        try
        {
            url = new URL("http://www.cad-comic.com/cad/archive/2002");
            httpc = (HttpURLConnection) url.openConnection();
            input = new BufferedReader(new InputStreamReader(httpc.getInputStream()));
            String inputLine;
            while ((inputLine = input.readLine()) != null)
            {
                System.out.println(inputLine);
            }
            input.close();
            httpc.disconnect();
        }
        catch (IOException ex)
        {
            System.out.println(ex);
        }
    }
}
According to the documentation here, HTTP persistence is handled transparently in Java, although it also gives you options to control it via the http.keepAlive and http.maxConnections system properties.
However,
The current implementation doesn't buffer the response body. Which means that the application has to finish reading the response body or call close() to abandon the rest of the response body, in order for that connection to be reused. Furthermore, current implementation will not try block-reading when cleaning up the connection, meaning if the whole response body is not available, the connection will not be reused.
Take a look at the link and see if it really helps you.
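Applied to the comic-downloader code above, a minimal sketch (mine, not from the linked documentation; the second archive path is made up) of relying on that transparent keep-alive handling: set the system properties up front, fully read each response body, and skip disconnect() between requests so the socket can be reused.
// Tuning the JDK's transparent keep-alive support via system properties
System.setProperty("http.keepAlive", "true");    // on by default
System.setProperty("http.maxConnections", "5");  // idle connections kept per destination

String[] paths = { "/cad/archive/2002", "/cad/archive/2003" };   // second path is hypothetical
for (String path : paths) {
    HttpURLConnection httpc =
            (HttpURLConnection) new URL("http://www.cad-comic.com" + path).openConnection();
    try (InputStream in = httpc.getInputStream()) {
        while (in.read() != -1) {
            // read (or save) the whole body so the connection goes back to the keep-alive cache
        }
    }
    // no httpc.disconnect() here: disconnecting may close the underlying socket
}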
According to this link http://docs.oracle.com/javase/6/docs/technotes/guides/net/http-keepalive.html, HTTP connection reuse is enabled by default. You can use Wireshark to check the interactions between your client and server: the first request contains the TCP and SSL handshakes (if your request is HTTPS), while subsequent requests fired within the keep-alive window contain no TCP or SSL handshakes, just application data transfers.
Even though HttpURLConnection enables keep-alive by default, it is not guaranteed that HttpURLConnection uses the same TCP connection for multiple HTTP requests. I faced the same kind of issue when writing an HTTPS client application, and solved it by using a single instance of SSLContext, SSLSocketFactory and HttpsURLConnection.
public class MyHTTPClient {
    private SSLContext mSSLContext = null;
    private SSLSocketFactory mSSLSocketFactory = null;
    private HttpsURLConnection mConnection = null;

    public void init() {
        // Setup SSL context and socket factory here
    }

    public void sendRequest() throws IOException {
        URL url = new URL("https://example.com/request_receiver");
        mConnection = (HttpsURLConnection) url.openConnection();
        mConnection.setSSLSocketFactory(mSSLSocketFactory);
        // Setup request property and send request
        // Open input stream to read response
        // Close output, input streams
        mConnection.disconnect();
    }
}
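The init() above is just a stub; a minimal sketch of what it might contain, assuming the default trust store is acceptable (any key- or trust-manager specifics are application dependent):
public void init() throws NoSuchAlgorithmException, KeyManagementException {
    // One shared SSLContext/SSLSocketFactory so TLS sessions (and the underlying
    // keep-alive connections) can be reused across requests
    mSSLContext = SSLContext.getInstance("TLS");
    mSSLContext.init(null, null, null);          // default key managers, trust managers and RNG
    mSSLSocketFactory = mSSLContext.getSocketFactory();
}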

HttpClient connection reusing with 4.3.x

I'm trying to use HttpClient and am having trouble deciphering the meaning of section 1.1.5, "Ensuring release of low level resources".
Are these how closing the content stream and closing the response are interpreted?
Closing the content stream: (keeps the underlying connection alive)
CloseableHttpClient httpclient = HttpClients.createDefault();
try {
    HttpGet httpget = new HttpGet("http://localhost/");
    // do multiple times on the same connection
    for (...) {
        HttpResponse response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            try {
                // do something useful
            } finally {
                EntityUtils.consume(entity); // <-- ensures reuse
            }
        }
    }
} finally {
    httpclient.close();
}
Closing the response: (immediately shuts down and discards the connection)
CloseableHttpClient httpclient = HttpClients.createDefault();
try {
    HttpGet httpget = new HttpGet("http://localhost/");
    // do multiple times on different connections
    for (...) {
        CloseableHttpResponse response = httpclient.execute(httpget);
        try {
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                // do something useful
            }
        } finally {
            response.close(); // <-- ensures reconnect
        }
    }
} finally {
    httpclient.close();
}
EntityUtils.consume() closes the stream for you...
if (entity.isStreaming()) {
    final InputStream instream = entity.getContent();
    if (instream != null) {
        instream.close();
    }
}
You just 'release' your client back to the pool...
Then, you should wrap your HttpClient in a runnable...
public void run() {
    handler.sendMessage(Message.obtain(handler, HttpConnection.DID_START));
    CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(YourConnectionMgr.getInstance())
            .addInterceptorLast(new HttpRequestInterceptor() {
                public void process(
                        final HttpRequest request,
                        final HttpContext context) throws HttpException, IOException {
                }
            })
            .build();
} // end runnable
At the end of the runnable, the client just gets released back to the connection pool and you don't have to worry about resources or cleanup.
Use a manager that extends PoolingClientConnectionManager
instance = new MyConnectionManager(schemeRegistry);
instance.setMaxTotal(15);
instance.setDefaultMaxPerRoute(15);
HttpHost localhost = new HttpHost("api.parse.com", 443);
instance.setMaxPerRoute(new HttpRoute(localhost), 10);
Then, at the end, I think you do need to shut down the pool.
YourConnectionMgr.getInstance().shutdown();
YourConnectionMgr.reset();
More details here
In general, once you're done with the entity you want to discard it so that system resources aren't tied up with objects that are no longer meaningful. In my opinion, the only distinction here is use. That chapter on fundamentals is basically describing that point. However you implement it, make sure that you use resources only for as long as you need them. The low level resource is the InputStream in the entity, the high level resource is the connection. If you're implementing something that doesn't need to read the full InputStream in order to make a determination, for example, just terminate the response and the cleanup will be handled for you efficiently.
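For example (my own sketch, not from the tutorial): if a status check is all you need, close the response without draining the body and let HttpClient discard that connection; if you do read the entity fully, the connection goes back to the pool instead.
CloseableHttpResponse response = httpclient.execute(new HttpGet("http://localhost/"));
try {
    if (response.getStatusLine().getStatusCode() != 200) {
        return;  // body never read; close() in the finally block discards the connection
    }
    EntityUtils.consume(response.getEntity());  // fully consumed; connection can be reused
} finally {
    response.close();
}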

Java Proxy: How to extract Destination Host and Port from the HttpRequest?

I am working on a Http Proxy in java. Basically I have 3 applications:
a client application, where I just submit a request to a server VIA a proxy
a proxy that captures the request, modifies it and then forwards it to the web server
the web server
Here is my code for the Client (taken from the apache httpcore examples, but works well):
public class ClientExecuteProxy {
    public static void main(String[] args) throws Exception {
        HttpHost proxy = new HttpHost("127.0.0.1", 8080, "http");
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
            HttpHost target = new HttpHost("issues.apache.org", 443, "https");
            HttpGet req = new HttpGet("/");
            System.out.println("executing request to " + target + " via " + proxy);
            HttpResponse rsp = httpclient.execute(target, req);
            HttpEntity entity = rsp.getEntity();
            System.out.println("----------------------------------------");
            System.out.println(rsp.getStatusLine());
            Header[] headers = rsp.getAllHeaders();
            for (int i = 0; i < headers.length; i++) {
                System.out.println(headers[i]);
            }
            System.out.println("----------------------------------------");
            if (entity != null) {
                System.out.println(EntityUtils.toString(entity));
            }
        } finally {
            // When the HttpClient instance is no longer needed,
            // shut down the connection manager to ensure
            // immediate deallocation of all system resources
            httpclient.getConnectionManager().shutdown();
        }
    }
}
If I execute the request directly against the server (i.e., if I comment out the line "httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);"), it works without any problem. But if I leave it in, the request goes through the proxy. Here is the part that I do not know how to handle for the proxy:
The proxy listens for requests, reads their content and verifies whether they respect certain policies. If OK, it forwards the request to the server; otherwise it drops the request and sends back an HttpResponse with an error. The problem is when the request is OK and needs to be forwarded: how does the proxy know what address to forward it to? My question is: how do I get the information that was entered at the line "HttpHost target = new HttpHost("issues.apache.org", 443, "https");"?
I've googled for a couple of hours but found nothing. Can anybody help me please?
When you define an HTTP proxy to an application or browser, either:
There will be a preceding CONNECT request to form a tunnel, which tells you the target host:port, or
The entire target URL is placed in the GET/POST/... request line. Normally, without a proxy, this is just a relative URL, relative to the host:port of the TCP connection. Either way, the target can be parsed out of the request line, as sketched below.
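A minimal sketch of extracting the target from either form of request line (my own code; requestLine is assumed to be the first line the proxy read from the client socket):
// e.g. "CONNECT issues.apache.org:443 HTTP/1.1"  or  "GET http://issues.apache.org/ HTTP/1.1"
String[] parts = requestLine.split(" ");
String method = parts[0];
String target = parts[1];

String host;
int port;
if ("CONNECT".equalsIgnoreCase(method)) {
    // Tunnel request (typical for HTTPS): target is host:port
    int colon = target.lastIndexOf(':');
    host = target.substring(0, colon);
    port = Integer.parseInt(target.substring(colon + 1));
} else {
    // Absolute URI on the request line: http://host[:port]/path
    java.net.URI uri = java.net.URI.create(target);
    host = uri.getHost();
    port = uri.getPort() != -1 ? uri.getPort()
            : ("https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80);
}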

How to use a custom socketfactory in Apache HttpComponents

I have been trying to use a custom SocketFactory in the httpclient library from the Apache HttpComponents project, so far without luck. I was expecting that I could just set a socket factory for an HttpClient instance, but it is obviously not that easy.
The documentation for HttpComponents at http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html does mention socket factories, but does not say how to use them.
Does anybody know how this is done?
oleg's answer is of course correct, I just wanted to put the information directly here in case the link goes bad. In the code that creates an HttpClient, I use this to let it use my socket factory:
CustomSocketFactory socketFactory = new CustomSocketFactory();
Scheme scheme = new Scheme("http", 80, socketFactory);
httpclient.getConnectionManager().getSchemeRegistry().register(scheme);
CustomSocketFactory is my own socket factory, and I want to use it for normal HTTP traffic, that's why I use "http" and 80 as parameters.
My CustomSchemeSocketFactory looks similar to this:
public class CustomSchemeSocketFactory implements SchemeSocketFactory {

    @Override
    public Socket connectSocket(Socket socket, InetSocketAddress remoteAddress, InetSocketAddress localAddress, HttpParams params) throws IOException, UnknownHostException, ConnectTimeoutException {
        if (localAddress != null) {
            socket.setReuseAddress(HttpConnectionParams.getSoReuseaddr(params));
            socket.bind(localAddress);
        }
        int connTimeout = HttpConnectionParams.getConnectionTimeout(params);
        int soTimeout = HttpConnectionParams.getSoTimeout(params);
        try {
            socket.setSoTimeout(soTimeout);
            socket.connect(remoteAddress, connTimeout);
        } catch (SocketTimeoutException ex) {
            throw new ConnectTimeoutException("Connect to " + remoteAddress + " timed out");
        }
        return socket;
    }

    @Override
    public Socket createSocket(HttpParams params) throws IOException {
        // create my own socket and return it
    }

    @Override
    public boolean isSecure(Socket socket) throws IllegalArgumentException {
        return false;
    }
}
We use a custom socket factory to allow HttpClient connections to connect to HTTPS URLs with untrusted certificates.
Here is how we did it:
We adapted implementations of both the 'EasySSLProtocolSocketFactory' and 'EasyX509TrustManager' classes from the examples source directory referenced by Oleg.
In our HttpClient startup code, we do the following to enable the new socket factory:
HttpClient httpClient = new HttpClient();
Protocol easyhttps = new Protocol("https", new EasySSLProtocolSocketFactory(), 443);
Protocol.registerProtocol("https", easyhttps);
That way, any time we request an https:// URL, this socket factory is used.
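For reference, the core of an "accept everything" trust manager like EasyX509TrustManager boils down to something like this (a sketch using only the standard javax.net.ssl API; the class name is mine, and it disables certificate validation, so use it for testing only):
import java.security.cert.X509Certificate;
import javax.net.ssl.X509TrustManager;

public class TrustAllX509TrustManager implements X509TrustManager {
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) {
        // accept any client certificate
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) {
        // accept any server certificate, including self-signed and untrusted ones
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return new X509Certificate[0];
    }
}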

Java Httpurlconnection DNS resolution with multiple IP addresses

I'm using Java's HttpUrlConnection to hit foo.com
foo.com has multiple A-Records that point to different IP addresses (1.1.1.1 and 1.1.1.2)
If my first connect call resolves to 1.1.1.1, but then that machine goes down, will a subsequent connect call recognize this and try to connect on 1.1.1.2 instead?
Or do I need to implement this sort of logic myself using the INetAddress api?
I was able to resolve this by using Apache Commons HttpClient, see the code snippet below.
Like I feared, the URLConnection provided by java.net is a very simplistic implementation and will only try the first IP address from the resolved list. If you really are not allowed to use another library, you will have to write your own error handling. It's kinda messy, since you will need to resolve all the IPs beforehand using InetAddress and connect to each IP yourself, passing the "Host: domain.name" header to the HTTP stack, until one of them responds.
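Roughly what that manual fallback looks like (my own sketch; remotehost.com is the same placeholder host used in the snippet further down):
// Resolve every A record up front and try each address until one accepts a connection
InetAddress[] addresses = InetAddress.getAllByName("remotehost.com");
Socket socket = null;
for (InetAddress address : addresses) {
    try {
        Socket candidate = new Socket();
        candidate.connect(new InetSocketAddress(address, 80), 2000);
        socket = candidate;
        break;                       // connected; stop trying the remaining IPs
    } catch (IOException e) {
        // this IP is down or unreachable; fall through and try the next one
    }
}
if (socket != null) {
    // Talk HTTP by hand: the Host header must still carry the original name
    OutputStream out = socket.getOutputStream();
    out.write(("GET / HTTP/1.1\r\nHost: remotehost.com\r\nConnection: close\r\n\r\n")
            .getBytes(StandardCharsets.US_ASCII));
    out.flush();
}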
The Apache library is considerably more robust and allows for a great deal of customization. You can control how many times it will retry and, most importantly, it will automatically try all IP addresses resolved to the same name until one of them responds successfully.
HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() {
    @Override
    public boolean retryRequest(IOException exception, int count, HttpContext context) {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
        }
        return count < 30;
    }
};

ConnectionKeepAliveStrategy keepAlive = new ConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        return 500;
    }
};

DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.getParams().setParameter("http.socket.timeout", new Integer(2000));
httpclient.getParams().setParameter("http.connection.timeout", new Integer(2000));
httpclient.setHttpRequestRetryHandler(myRetryHandler);
httpclient.setKeepAliveStrategy(keepAlive);

HttpGet httpget = new HttpGet("http://remotehost.com");
HttpResponse httpres = httpclient.execute(httpget);
InputStream is = httpres.getEntity().getContent();
I hope this helps!
