HttpClient: disabling chunked encoding - java

I am using the Apache Commons HttpClient along with Restlet to call a restful web service. Unfortunately, my server (based on Ruby on Rails) does not like the Transfer-Encoding: chunked that HttpClient is using by default.
Is there any way to disable the usage of chunked encoding for POSTs from the client?

As a general rule, for a request not to be chunked you need to specify the exact size of the POST body, which for dynamically generated data means buffering the entire request body in memory, measuring its size, and only then sending it.
Apache client documentation seems to confirm this: AbstractHttpEntity.setChunked() states
Note that the chunked setting is a hint only. If using HTTP/1.0, chunking is never performed. Otherwise, even if chunked is false, HttpClient must use chunk coding if the entity content length is unknown (-1).
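The buffering step described above can be sketched with the JDK alone: drain the dynamically generated stream into memory so the exact body size is known before anything is sent. The names BodyBuffer and readFully are made up for illustration and are not part of HttpClient.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BodyBuffer {
    // Drains the stream into memory; the returned array's length is the exact body size.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] block = new byte[8192];
        int n;
        while ((n = in.read(block)) != -1) {
            buffer.write(block, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

The resulting array can then be wrapped in an entity with a known length (for example HttpClient's ByteArrayEntity), at the cost of holding the whole body in memory.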

As noted on the Restlet mailing list, in Restlet 2.1 you can set the ClientResource#entityBuffering property to true to buffer the content in memory and prevent chunked encoding.

The most reliable way, as @Slartibartfast hinted in his answer, is to explicitly switch HttpPost to the HTTP 1.0 protocol.
Set the Apache HttpPost request to the HTTP 1.0 protocol (the same applies to HttpGet, if you need it):
HttpPost httpPost = new HttpPost(someUrl);
httpPost.setProtocolVersion(HttpVersion.HTTP_1_0); // Since v.4.3 of Apache HttpClient
When creating a multipart POST request, provide the attachment not as an InputStream (which, under HTTP 1.1, causes chunked encoding) but as a byte array that you create from the same stream beforehand. That way the content length is known. See org.apache.http.entity.mime.MultipartEntityBuilder.addBinaryBody(String, byte[], ContentType, String).
I tested this for Android development, which required slightly different class names (see https://github.com/andstatus/andstatus/issues/249).

Related

HttpServletRequest.getInputStream() does not unwrap chunked HTTP request

I am in the process of sending a HTTP chunked request to an internal system. I've confirmed other factors are not at play by ensuring that I can send small messages without chunk encoding.
My process was basically to change the Transfer-Encoding header to be chunked and I've removed the Content-Length header. Additionally, I am utilising an in-house ChunkedOutputStream which has been around for quite some time.
I am able to connect, obtain an output stream and send the data. The recipient then returns a 200 response so it seems the request was received and successfully handled. The endpoint receives the HTTP Request, and streams the data straight into a table (using HttpServletRequest.getInputStream()).
On inspecting the streamed data I can see that the chunk encoding information in the stream has not been unwrapped/decoded by the Tomcat container automatically. I've been trawling the Tomcat HTTPConnector documentation and can't find anything that alludes to the chunked encoding w.r.t how a chunk encoded message should be handled within a HttpServlet. I can't see other StackOverflow questions querying this so I suspect I am missing something basic.
My question boils down to:
Should Tomcat automatically decode the chunked encoding from my request and give me a "clean" InputStream when I call HttpServletRequest.getInputStream()?
If yes, is there configuration that needs to be updated to enable this functionality? Am I sending something wrong in the headers that is causing it to return the non-decoded stream?
If no, is it common practice to wrap the input stream in a ChunkedInputStream or something similar when the Transfer-Encoding header is present?
This is solved. As expected it was basic in my case.
The legacy system I was using provided hand-rolled methods to simplify the process of opening an HTTP connection, sending headers and then using an OutputStream to send the content via a POST. I didn't realise it, because the code was in a rather obscure location, but the behind-the-scenes helpers were detecting that I had not specified a Content-Length, so they added the TRANSFER_ENCODING=chunked header and wrapped the OutputStream in a ChunkedOutputStream. As a result I was double-encoding the contents, hence my endpoint's (seeming) inability to decode it.
Case closed.
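To illustrate why double encoding looks like "undecoded" data, here is a minimal sketch of chunk framing written against the JDK only. ChunkedDemo and its methods are hypothetical stand-ins, not the in-house ChunkedOutputStream or anything from Tomcat, and the decoder ignores chunk extensions and trailers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedDemo {
    // Frames the payload as a single chunk followed by the terminating zero-length chunk.
    static byte[] encode(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (payload.length > 0) {
            out.write((Integer.toHexString(payload.length) + "\r\n").getBytes(StandardCharsets.US_ASCII));
            out.write(payload);
            out.write("\r\n".getBytes(StandardCharsets.US_ASCII));
        }
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        return out.toByteArray();
    }

    // Removes exactly one layer of chunk framing.
    static byte[] decode(byte[] framed) throws IOException {
        InputStream in = new ByteArrayInputStream(framed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            int size = Integer.parseInt(readLine(in), 16);
            if (size == 0) {
                break; // terminating chunk reached
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new IOException("truncated chunk");
                }
                off += n;
            }
            out.write(chunk);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        return out.toByteArray();
    }

    // Reads up to the next line feed and returns the line without CR/LF.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.append((char) b);
            }
        }
        return line.toString();
    }
}
```

With this sketch, decode(encode(encode(body))) returns encode(body) rather than body: a receiver that strips one layer of framing still sees chunk sizes and CRLFs in the data, which matches the symptom described in the question.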

HTTPClient never leaves socketRead() when executing GET on stream - workaround?

I am using Apache HttpClient (from Apache HTTP Components 4.3) in order to execute a GET against a ShoutCast stream:
CloseableHttpClient client = HttpClients.createDefault();
HttpGet request = new HttpGet("http://relay3.181.fm:8062/");
CloseableHttpResponse response = client.execute(request);
The call to client.execute() never returns, and according to the debugger it is a nested invocation to java.net.SocketInputStream#socketRead0() which is the last node in the call stack. From profiling the code, my only conclusion (based on a steadily rising number of char[] allocations) is that it simply "latches on" to the stream and keeps pulling bytes from the socket indefinitely.
What I would like is for the client to simply work normally and give me a HTTPResponse which I can use to pull what I want from the stream. As a matter of fact, I have been able to do so with other ShoutCast streams, but not this one.
Is there any way to work around this? Could I for example tell the client to break off after a certain number of bytes?
That site is very particular. If you don't specify a supported User-Agent (like Mozilla), the server keeps streaming bytes. I don't know what these bytes are meant to represent, audio perhaps.
If you print out the bytes that you receive, you will see
ICY 200 OK
icy-notice1:<BR>This stream requires Winamp<BR>
icy-notice2:SHOUTcast Distributed Network Audio Server/Linux v1.9.8<BR>
icy-name:181.FM - The Beatles Channel
icy-genre:Oldies
icy-url:http://www.181.fm
content-type:audio/mpeg
icy-pub:1
icy-br:128
which indicates that the response is not a valid HTTP response. It is an ICY response from the ICY protocol.
Now the default HttpClient you are using uses a DefaultHttpResponseParser which is a
Lenient HTTP response parser implementation that can skip malformed
data until a valid HTTP response message head is encountered.
In other words, it keeps reading the bytes the server is sending until it finds a valid HTTP response header, which will never happen, thus the infinite read.
I don't think you will be able to accomplish what you want with the HttpComponents library. Either look for an ICY client implementation in Java or roll your own.
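If you do write your own, a first step is recognizing the ICY status line and splitting the header lines shown above. The following is only a sketch of that step, assuming the header block has already been read from a raw socket as text. IcyHeaderSketch and its methods are hypothetical names, and this is not a complete ICY client.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class IcyHeaderSketch {
    // True when the first line of the response is an ICY status line rather than an HTTP one.
    static boolean isIcy(String statusLine) {
        return statusLine.startsWith("ICY ");
    }

    // Splits "name:value" lines into a map; the status line and lines without a colon are skipped.
    static Map<String, String> parseHeaders(String head) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = head.split("\r?\n");
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                String value = lines[i].substring(colon + 1).trim();
                headers.put(name, value);
            }
        }
        return headers;
    }
}
```

Only the first colon on each line is treated as the separator, so values such as icy-url:http://www.181.fm keep their own colons.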

How do I send an HTTP response without Transfer Encoding: chunked?

I have a Java Servlet that responds to the Twilio API. It appears that Twilio does not support the chunked transfer that my responses are using. How can I avoid using Transfer-Encoding: chunked?
Here is my code:
// response is HttpServletResponse
// xml is a String with XML in it
response.getWriter().write(xml);
response.getWriter().flush();
I am using Jetty as the Servlet container.
I believe that Jetty will use chunked responses when it doesn't know the response content length and/or it is using persistent connections. To avoid chunking you either need to set the response content length or to avoid persistent connections by setting "Connection":"close" header on the response.
Try setting the Content-Length before writing to the stream. Don't forget to calculate the number of bytes according to the correct encoding, e.g.:
final byte[] content = xml.getBytes("UTF-8");
response.setContentLength(content.length);
response.setContentType("text/xml"); // or "text/xml; charset=UTF-8"
response.setCharacterEncoding("UTF-8");
final OutputStream out = response.getOutputStream();
out.write(content);
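The reason for measuring the encoded bytes rather than the string can be shown in isolation. This is a small sketch with a hypothetical helper name (ContentLengthSketch.utf8Length), independent of the servlet API.

```java
import java.nio.charset.StandardCharsets;

final class ContentLengthSketch {
    // Number of bytes the text occupies on the wire when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For "<name>Zo\u00eb</name>", String.length() reports 16 characters while utf8Length reports 17 bytes, so a Content-Length taken from length() would be one byte short.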
The container decides for itself whether to use Content-Length or Transfer-Encoding, based on the size of the data written through the Writer or OutputStream. If the data is larger than HttpServletResponse.getBufferSize(), the response will be chunked. If not, Content-Length will be used.
In your case, simply removing the second line (the flush() call) should solve your problem.

Java HttpGet doesn't accept gzip

I am making an HttpGet to a URL and I do not want the server to send the data gzipped. What header should I include in my HttpGet?
With the default headers, the server sends gzipped data from time to time. I don't want this to happen. Thanks.
You want the Accept-Encoding HTTP request header.
Update: per @Selvin's comment, leave it empty or set it to "identity".
Update: The web application has to cooperate properly to be HTTP compliant, of course. If it's not honoring Accept-Encoding, look at its Content-Encoding HTTP response header. If it's "gzip", just read the response body with Java's GZIPInputStream. Then add "gzip" to your Accept-Encoding request header, since your client now handles GZIP. If the web application doesn't set the Content-Encoding header properly, that's another story altogether.
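A minimal sketch of that fallback, assuming the response body and the Content-Encoding value have already been obtained from whatever HTTP client is in use. BodyDecoder is a hypothetical helper, not an HttpClient class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

final class BodyDecoder {
    // Returns the body unchanged unless the Content-Encoding value says it is gzip-compressed.
    static byte[] decode(byte[] body, String contentEncoding) throws IOException {
        if (contentEncoding == null || !contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return body;
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

In real code you would usually wrap the response stream directly in a GZIPInputStream instead of buffering the whole body; the byte-array form is used here only to keep the sketch self-contained.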
You should set the Accept-Encoding header to identity.
You could try changing the Accept-Encoding header by removing the gzip|deflate value. If this doesn't work, consider that the server may not check whether the client supports gzipped content (which is a bug and should be fixed).

Java client throwing Unsupported Media Type Exception

I am developing an application that uses a RESTful API. A Java client sending a request to a standalone server is throwing an Unsupported Media Type exception.
The client code is as follows
StringBuilder xml = new StringBuilder();
xml.append("<?xml version=\"1.0\" encoding=\"${encoding}\"?>").append("\n");
xml.append("<root>").append("\n");
xml.append("<user>").append("\n");
xml.append("<username>"+username+"</username>");
xml.append("\n");
xml.append("<password>"+pass+"</password>");
xml.append("\n");
xml.append("</user>");
xml.append("</root>");
Representation representation = new StringRepresentation(xml.toString());
new ClientResource("http://localhost:7777/Auth").post(representation);
Server code is as follows
new Server(Protocol.HTTP,7777,TestServer.class).start();
String username = (String) getRequest().getAttributes().get("username");
String password=(String) getRequest().getAttributes().get("password");
StringRepresentation representation = null;
You are not passing the Content-Type header; I strongly recommend using an API like Apache Commons HttpClient to produce such requests (and maybe read the contents from a file).
@Riccardo is correct: the Restlet Resource on the server is checking the Content-Type header of the client's request to make sure the entity you're POSTing to the server has a type it can support. Here's a Restlet 1.1 example. You'll notice that the Resource is set up to expect XML:
// Declare the kind of representations supported by this resource.
getVariants().add(new Variant(MediaType.TEXT_XML));
So maybe your server side doesn't declare the representations it can handle, or it does and Restlet's automatic media type negotiation is detecting that your request doesn't have Content-Type: text/xml (or application/xml) set.
So as @Riccardo suggests, use Apache HttpClient and call HttpRequest.setHeader("Content-Type", "text/xml"), or use Restlet's client library API to do this (it adds another abstraction layer on top of an HTTP client connector like Apache HttpClient).
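For comparison, the same header can be set with the JDK's own HttpURLConnection. This is a swapped-in illustration that uses neither Restlet nor Apache HttpClient; XmlPostSketch and prepare are hypothetical names, and the URL is the one from the question.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class XmlPostSketch {
    // Prepares (but does not send) a POST whose Content-Type declares an XML body.
    static HttpURLConnection prepare(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // a request body will be written
        conn.setRequestProperty("Content-Type", "text/xml");
        return conn;
    }
}
```

To actually send the request, write the XML bytes to conn.getOutputStream() and then call conn.getResponseCode(). Nothing goes over the network until one of those calls is made.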
