Setting setChunkedStreamingMode in HttpURLConnection fails to deliver data to server - java

My server version is as follows on my dev machine:
Apache/2.2.21 (Win32) mod_fcgid/2.3.6
I have been testing HttpURLConnection as my project requires easy streaming capabilities. I have read a great synopsis from #BalusC on how to use the class.
Using java.net.URLConnection to fire and handle HTTP requests
The trouble I am currently having is with setChunkedStreamingMode. Regardless of the value I set, my stream doesn't seem to make it to the server: the data stream is empty when my server API method is called. However, if I remove the call, it works fine.
I have seen another person with a similar issue:
Java/Android HttpURLConnection setChunkedStreamingMode not working with all PHP servers
But with no real resolution. I cannot use setFixedLengthStreamingMode because the content (JSON) varies in length.
Leaving streaming mode off is NOT OK, because the request body is then buffered in memory. I will potentially be transferring very large quantities of data and hence cannot have the data stored in memory.
My question is, how can I get setChunkedStreamingMode to play nice? Is it a server setup issue or can it be fixed in code?
EDIT
I have now tested my code on my production server and it works no problem. I would however still like to know why my Apache server on my local machine fails. Any help is still much appreciated.
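For reference, here is a minimal sketch of the kind of chunked POST being discussed. It is not the asker's code: the class name, the /upload path, and the throwaway local server (the JDK's built-in com.sun.net.httpserver.HttpServer) are assumptions made so the example can run on its own. It only shows that, with setChunkedStreamingMode set, the client adds the Transfer-Encoding header itself and the body arrives intact when the server supports chunked requests.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

final class ChunkedPostDemo {

    /**
     * Posts the body in chunked streaming mode to a throwaway local server.
     * Returns { Transfer-Encoding header seen by the server, body seen by the server }.
     */
    static String[] postChunked(String body) throws Exception {
        AtomicReference<String> seenEncoding = new AtomicReference<>();
        AtomicReference<String> seenBody = new AtomicReference<>();

        // Hypothetical stand-in for the real server; binds to any free local port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            seenEncoding.set(exchange.getRequestHeaders().getFirst("Transfer-Encoding"));
            try (InputStream in = exchange.getRequestBody()) {
                seenBody.set(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/upload");
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            // 0 (or any non-positive value) selects the default chunk length.
            conn.setChunkedStreamingMode(0);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            conn.getResponseCode(); // waits for the server's reply
            conn.disconnect();
        } finally {
            server.stop(0);
        }
        return new String[] { seenEncoding.get(), seenBody.get() };
    }

    public static void main(String[] args) throws Exception {
        String[] seen = postChunked("{\"items\":[1,2,3]}");
        System.out.println("Transfer-Encoding seen by server: " + seen[0]);
        System.out.println("Body seen by server: " + seen[1]);
    }
}
```

If the same client code yields an empty body against a particular server, that points at the server side (or something in between) not handling chunked request bodies, rather than at the client code.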

Try adding this HTTP header:
urlConnection.setRequestProperty("Transfer-Encoding", "chunked");
I had a problem like this: although I had set chunked HTTP streaming mode (urlConnection.setChunkedStreamingMode(0)), it did not work, but after adding the HTTP header above it works fine.

I had a similar issue. In my case it was the client system that had a virus scanner installed. Those scanners sometimes have identity theft modules that intercept POSTs, scan the body and then pass it on.
In my case BitDefender cached about 5MB before passing it on.
If the whole payload was smaller than that, the POST was delivered as a non-chunked, fixed-length request.

I had a similar problem using HttpURLConnection. Just add:
conn.setRequestProperty("connection", "close"); // disables Keep Alive
to your connection or disable it for all connections:
System.setProperty("http.keepAlive", "false");
From the API about disconnect():
Releases this connection so that its resources may be either reused or closed.
Unlike other Java implementations, this will not necessarily close socket connections that can be reused. You can disable all connection reuse by setting the http.keepAlive system property to false before issuing any HTTP requests.

Related

How to handle a full duplex HTTP POST using Jersey 1.x or another HTTP client library?

I need some assistance on a project I am working on. It's a library itself using Jersey 1.x (1.19.1) aiming at HTTP posting a JSON document and getting the corresponding JSON response from a server.
I am facing a problem when the response from the server is "big". The JSON document that is posted by my client application contains several jobs that must be executed by the server, and
the JSON document sent back by the server is made of the outputs of these jobs. The jobs can be considered independent from each other. The server works in streaming mode, which means it
starts to process the jobs before it receives the entire JSON document posted by the client. And it starts to send the outputs of the jobs as soon as they are finished. So the server
starts to reply to my client application while it is still posting the request. Here is my problem. When the request gets big so gets the response (more jobs to do), my application freezes
and at some point terminates.
I spent a lot of time trying to figure out what's happening and here is what I found and what I inferred.
For handling HTTP communication, Jersey uses a class from the JDK (in rt.jar). I forgot the exact name and don't have access to my work right now, but let's call it HttpConnection.
In this class there is a method checkError() that is invoked and throws an IOException with only a message saying it was impossible to write to the server.
By debugging I was able to understand that an attribute of this class named trouble was set to true because a write() method had caught an IOException earlier. checkError() throws an
IOException based on that trouble boolean flag. It's not easy to see the cause IOException because the classes of the JRE are compiled without debugging symbols, but
I managed to see that this IOException was a "connection reset by peer" problem.
Then I tried to understand why the server resets the connection. I used a HTTP proxy that captures the HTTP traffic between my client application and the server but this gave me no more clues,
it even seems that the proxy is unable to handle properly the connection with the server as well!
So I tried to use Wireshark to capture the traffic and see what's wrong. Here is what I found.
On client side, packets corresponding to the post of the request JSON document are sent and the server starts to reply shortly after, as explained above. The server side sends
more and more packets and I noticed that the buffer of the TCP layer (called TCP window in Wireshark) on the client side shrinks more and more as the server sends packets,
until it becomes full (size: 0 bytes). So the TCP layer on the server side cannot send data to the TCP layer on the client side anymore and thus becomes full too. The conversation, in the end, is
only about retrying to send data, on both sides, failing again and again. Ultimately the server decides to send a reset packet. I believe this corresponds to the cause IOException I mentioned
above.
My understanding is: as long as the server does not start to stream the response everything is fine. When the server starts to send the response, the TCP buffer on client side starts to
get filled. But as the client application does not read the response yet, the content of this buffer is not consumed. When the server has sent enough data to fill this buffer it cannot
send anymore data and the buffer of its TCP layer gets full too because the server continues to push data. As a result, the client application cannot finish to send the request JSON
document. The communication is blocked on both sides and the server decides to reset the connection.
My conclusion is: the code, as currently written, does not support such full duplex communication, because the response from the server is not consumed as it is received. Indeed, walking
through the Jersey code that is executed by my library, by debugging, it is clear that the pattern is:
first: connection.getOutputStream().write()
and then: response.getInputStream().read()
In my opinion, the root cause of the problem is that the library I am working on uses Jersey in this synchronous manner which does not fit well the way the server works (streaming the
response while the request is still being sent to it).
I searched the Internet a lot for a solution that keeps Jersey 1.19.1, so that I can improve the library with as little impact as possible, but I failed. This is the reason why I am asking for help
here now ;)
So basically my question is: is it possible to do what I need to do while keeping Jersey client library 1.19.1 and, if yes, how? If not, what HTTP client library should I use for my library (to
write a post request and read the corresponding response at the same time)? If you could give me a basic example so I can get on track quickly, it would be much appreciated.
One last thing: curl works just fine, I can fully post the exact same JSON document and get the response using it, so there is no problem on server side as I suspected at the very
beginning of my investigation. And it scales fine (I tried to send huge JSON documents). Of course I made sure the HTTP header of the post is the same in the case of my library and in the
curl case.
Thanks a lot for reading me and for your answers.
Best regards,
Loïc
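The deadlock described above, and the pattern that avoids it, can be sketched with plain sockets. This is not Jersey and not even HTTP: the class name, the echo peer standing in for the streaming server, and the buffer sizes are all assumptions made for illustration. The point is only that one thread keeps writing the request while a second thread drains the response, so neither TCP window fills up.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class FullDuplexDemo {

    /** Sends `total` bytes to an echoing peer while a second thread drains the echo; returns bytes received. */
    static long exchange(int total) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try (ServerSocket listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            // Stand-in for the streaming server: replies while it is still receiving.
            Future<?> server = pool.submit(() -> {
                try (Socket s = listener.accept()) {
                    s.getInputStream().transferTo(s.getOutputStream());
                }
                return null;
            });

            try (Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {
                // Reader thread: consumes the response as it arrives.
                Future<Long> reader = pool.submit(() -> {
                    InputStream in = client.getInputStream();
                    byte[] buf = new byte[8192];
                    long received = 0;
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        received += n;
                    }
                    return received;
                });

                // Calling thread: writes the whole request without waiting for the response.
                OutputStream out = client.getOutputStream();
                byte[] chunk = new byte[8192];
                for (int sent = 0; sent < total; sent += chunk.length) {
                    out.write(chunk, 0, Math.min(chunk.length, total - sent));
                }
                client.shutdownOutput(); // signals end of request, keeps the read side open

                long received = reader.get(30, TimeUnit.SECONDS);
                server.get(30, TimeUnit.SECONDS);
                return received;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        int total = 4 * 1024 * 1024; // large enough to exceed typical socket buffers
        System.out.println("sent " + total + " bytes, received " + exchange(total));
    }
}
```

If the reader task is removed and the response is only read after the write loop, a large enough payload stalls in the same way as described in the question, because nobody empties the client's receive buffer.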

Force Embedded Jetty to disconnect at once

I'm creating a small utility which receives a lot of HTTP requests. It is written in java and uses embedded-jetty to handle requests via https.
I have a load-testing tool for it, but when it is being run for some time it starts to throw exceptions:
java.net.BindException: Address already in use: connect
(note, this is on sender's side, not in my project)
As I understand this means no more free sockets were found in system when another connect was called. Throughput is about 1000 requests per second, and failures start to appear somewhere after 20000 to 50000 requests.
However, when I use the same load-testing tool with another program (a kind of simple consumer, written in Scala using Netty by a colleague, which simply receives all requests and returns an empty OK response), there is no such problem with sockets (though typical speed is 1.5-2 times slower).
I wonder if this could be fixed by telling Jetty somehow to close connections immediately after response was sent. Anyway each new request is sent via new connection. I tried to play with Connector#setIdleTimeout - it seems to be 30000 by default but have not succeeded.
What can I do to fix this - or at least to research the matter deeper to find its cause (if I am wrong in my suggestions)?
UPD Thanks for suggestions, I think I am not allowed to post the source, but I get the idea that I should study client's code (this will make me busy for some time since it is written in scala).
I found that there really was a problem with the client: it sends requests with Connection: Keep-Alive in the header, although it creates a new HttpURLConnection for each request and calls the disconnect() method after it.
To solve this trouble on the server side it was sufficient to send Connection: close in the response header, since I am not allowed to change the testing utility.
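The fix above was applied in Jetty, whose code is not shown here. As a rough illustration of the same idea, the sketch below sets the Connection: close response header using the JDK's built-in com.sun.net.httpserver.HttpServer instead. The class name, the handler, and the response body are assumptions; in a servlet-based handler the equivalent step would be setting the same header on the response.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class ConnectionCloseDemo {

    /** Starts a throwaway server that answers with "Connection: close" and returns the header value the client sees. */
    static String fetchConnectionHeader() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            // Ask the client not to reuse this connection for further requests.
            exchange.getResponseHeaders().set("Connection", "close");
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            try (InputStream in = conn.getInputStream()) {
                in.readAllBytes(); // drain the response body
            }
            return conn.getHeaderField("Connection");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Connection header seen by client: " + fetchConnectionHeader());
    }
}
```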

How to diagnose leaked http connections (org.apache.http.impl.conn.tsccm.ConnPoolByRoute)

I have a multithreaded java program that runs on Amazon's EC2. It queries and fetches data items from a vendor via HttpPost and HttpGet, using a org.apache.http.impl.client.DefaultHttpClient. Concurrently, it pushes the retrieved data items into S3 using AWS's Java SDK.
After a few days of running, I get the symptoms that normally come with http connection leaks:
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection
at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:417)
at org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:300)
at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:224)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:391)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
Since both AWS and my requests to the data vendor use Http connections, I am not quite sure where exactly I forget to HttpEntity.consume(), or S3ObjectInputStream.close() (unless it is yet something else...).
So here is my question: are there ways to monitor org.apache.http.impl.conn.tsccm.ConnPoolByRoute so that at least I can detect when I am starting to leak connections/entities not properly consumed/http streams not closed? (I have a feeling it happens only under certain conditions, e.g. when certain exceptions are being thrown, by-passing the logic in my code that consumes HttpEntities, closes streams, etc.) Any idea on how to diagnose what eventually causes all my http connections to fail with that ConnectionPoolTimeoutException would be most welcome. I don't feel like waiting 4+ days between attempts to fix the root cause of the problem.
If you're using the PoolingClientConnectionManager note there are the methods getTotalStats() and getStats(final HttpRoute route) which will give you a PoolStats object with the data you're looking to monitor.
Just fetch the ConnectionManager from your httpclient:
PoolingClientConnectionManager poolManager = (PoolingClientConnectionManager) httpClient.getConnectionManager();
If you can access the org.apache.http.impl.conn.tsccm.ConnPoolByRoute, then set its connTTL to a low enough value so that its WaitingThreadAborter will eventually terminate a connection. It will show a nice stack trace there. The other option is to use CGLIB or some other bytecode-manipulating framework to create a proxy class wrapping org.apache.http.impl.conn.tsccm.ConnPoolByRoute. Depending on your environment it might not be that easy to set up, but it's a rather valuable tool for debugging issues like yours. (And yes, if you happen to use Spring or just plain aspects, the setup will be super easy :) )

CXF increase connection pool size without changing http.maxConnections

I have recently been asked to configure CXF to the same parameters as our older XFire service.
One of those parameters was Keep-Alive: timeout=60, max=20.
However, I did some research and it appears that CXF uses the JVM HttpURLConnection object under the hood. From what I see, there have been some attempts to provide configuration for it, but nothing has been committed for now.
I would prefer not to change the http.maxConnections parameter as it would impact the whole server instead of the CXF web services only.
I found this interesting forum thread talking about it, where Daniel Kulp says:
BTW: there is a way to control the connection pooling, but it's a
SERVER side thing. Basically, if the server sends back a header of:
Keep-Alive: timeout=60, max=5
then the Java client will respect those values. Right now in CXF,
you would probably need to write an interceptor to set those values.
I just made a commit to trunk that expands the http configuration to
include a setting to control this from the config file.
I could write an interceptor that modifies the headers to do so. However, my question is: how will the server react to this kind of change? Wouldn't it be a problem if the server expects 5 connections max and the client opens more?
According to what I read here, the keep-alive parameters can be controlled either by system properties or directly in the HTTP headers:
The support for HTTP keep-Alive is done transparently. However, it can
be controlled by system properties http.keepAlive, and
http.maxConnections, as well as by HTTP/1.1 specified request and
response headers.
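To make the header format mentioned above concrete, here is a small sketch that splits a Keep-Alive header value such as timeout=60, max=20 into its parameters. The class and method names are hypothetical; this is not CXF's or the JDK's own parsing code, only an illustration of what the two values in the header are.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class KeepAliveHeader {

    /** Parses a Keep-Alive header value like "timeout=60, max=20" into name/value pairs. */
    static Map<String, Integer> parse(String headerValue) {
        Map<String, Integer> params = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length != 2) {
                continue; // not a name=value pair, skip it
            }
            try {
                params.put(kv[0].trim().toLowerCase(Locale.ROOT), Integer.parseInt(kv[1].trim()));
            } catch (NumberFormatException ignored) {
                // non-numeric value, skip it
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, Integer> params = parse("timeout=60, max=20");
        System.out.println("timeout (seconds): " + params.get("timeout"));
        System.out.println("max: " + params.get("max"));
    }
}
```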

How to close a HTTP connection from the HttpServlet

I'm running a servlet in Tomcat 6.0.26. The servlet accepts file upload from the client by HTTP POST. I'd like to stop the file uploading from the HttpServlet side. I tried the following methods with no luck:
1. close the request input stream
2. send error code HttpServletResponse.SC_REQUEST_ENTITY_TOO_LARGE and flush the response
3. do 1 and 2 in a Filter
I googled but found no direct answers. Please advise solutions.
Thanks.
This is not possible using either the standard Servlet API or the Commons FileUpload API. Basically, to be able to abort the connection immediately, you would have to grab the underlying socket and close it. However, this socket is controlled by the web server. See also this related question: How to explicitly terminate http connection from server with no response header.
Little tests have however confirmed that Commons FileUpload doesn't buffer up the entire file in memory when its size exceeds the limit. It will read the input stream, but just ignore and throw away the read bytes (also the ones which are already read). So memory efficiency isn't necessarily the problem here.
To fix the real problem, you'd basically want to validate the file size on the client side rather than the server side. This is possible with a Java Applet or a Flash application, for example JumpLoader and SWFUpload respectively.
This is not possible using standard APIs. And you're violating some protocol standards/RFC if you do. So why would you do that?
Instead, send a "Connection: close" response (http header) with no http body.
Here is a crazy workaround: you can write (or find somewhere) a standalone, socket-based firewall application that handles the HTTP request and parses the headers. If the request matches your custom conditions, the firewall forwards it to Tomcat; otherwise it returns an HTTP error response. Or you can try to tune Apache <-> Tomcat with some Apache rules.
