Encoding of Response is incorrect using Apache HttpClient - java

I am calling a restful service that returns JSON using the Apache HttpClient.
The problem is I am getting different results in the encoding of the response when I run the code on different platforms.
Here is my code:
GetMethod get = new GetMethod("http://urltomyrestservice");
get.addRequestHeader("Content-Type", "text/html; charset=UTF-8");
...
HttpResponse response = httpexecutor.execute(request, conn, context);
response.setParams(params);
httpexecutor.postProcess(response, httpproc, context);
StringWriter writer = new StringWriter();
IOUtils.copy(response.getEntity().getContent(), writer);
When I run this on OSX, Asian characters in the response come back fine, e.g. 張惠妹. But when I run the same code on a Linux server, the characters are displayed as ???
The linux server is an Amazon EC2 instance running Java 1.6.0_26-b03
My local OSX is running 1.6.0_29-b11
Any ideas really appreciated!!!!!

If you look at the javadoc of org.apache.commons.io.IOUtils.copy(InputStream, Writer):
Copy bytes from an InputStream to chars on a Writer using the default
character encoding of the platform.
So that will give different answers depending on the client (which is what you're seeing)
Also, Content-Type is usually a response header (unless you're using POST or PUT). The server is likely to ignore it (though you might have more luck with the Accept-Charset request header).
You need to parse the charset parameter of the response's Content-Type header and use it to convert the response into a String (if a String is what you're actually after). I expect Commons HTTP has code that will do that automatically for you. If it doesn't, Spring's RestTemplate definitely does.
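As a minimal sketch of that advice using only the JDK: pull the charset out of the Content-Type value and decode the stream with it instead of the platform default. The class and method names (ResponseCharset, charsetOf, readBody) are made up for illustration and are not part of HttpClient or Commons IO. If you stay with the libraries from the question, IOUtils.copy(InputStream, Writer, String) and EntityUtils.toString(entity, "UTF-8") also take an explicit encoding.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper, not part of any library mentioned above.
final class ResponseCharset {

    // Returns the charset named in a Content-Type value such as
    // "application/json; charset=UTF-8", or the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback;
    }

    // Decodes the whole stream with an explicit charset, never the platform default.
    static String readBody(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

With the question's code this would be called roughly as readBody(response.getEntity().getContent(), charsetOf(contentTypeHeaderValue, StandardCharsets.UTF_8)), where contentTypeHeaderValue is whatever the response's Content-Type header contained.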

I believe that the problem is not in the HTTP encoding but elsewhere (e.g. while reading or forming the answer). Where do you get the content from and how? Is this stored in a DB or file?

Related

Downloading binary file from url

I am using this code to download files from a url:
FileUtils.copyURLToFile(url, new File("C:/Songs/newsong.mp3"));
When I create the url using for instance,
"https://mjcdn.cc/2/282676442/MjUgU2FhbCAtIFZlZXQgQmFsaml0Lm1wMw==",
this works just fine and the mp3 is downloaded.
However,
if I use another url:
"https://dl.jatt.link/hd.jatt.link/a0339e7c772ed44a770a3fe29e3921a8/uttzv/Hummer-(Mr-Jatt.com).mp3",
the file is 0kb.
I am able to download files from both these urls from within a web browser.
What's wrong here, and how can I fix it?
I noticed a difference between your 2 URLs:
The first one just gives back the file without redirection.
But the second one responds with a redirect (HTTP/1.1 302 Moved Temporarily). It's also a special case, because it's a redirect from HTTPS to HTTP protocol.
Browsers can follow redirects, but your program - for some reason (see below) - can't.
I suggest using an HTTP client library (e.g. Apache HTTP client or Jsoup) and configuring it to follow redirects (if it doesn't do so by default).
For example, with Jsoup, you would need a code like this:
// needs: org.jsoup.Jsoup, org.jsoup.Connection, java.io.FileOutputStream, java.io.OutputStream
String url = "https://dl.jatt.link/hd.jatt.link/a0339e7c772ed44a770a3fe29e3921a8/uttzv/Hummer-(Mr-Jatt.com).mp3";
String filename = "C:/Songs/newsong.mp3";
Connection.Response r = Jsoup.connect(url)
        //.followRedirects(true) // follow redirects (it's the default)
        .ignoreContentType(true) // accept not just HTML
        .maxBodySize(10*1000*1000) // accept 10M bytes (default is 1M), or set to 0 for unlimited
        .execute(); // send GET request
// try-with-resources closes the file even if the write fails
try (OutputStream out = new FileOutputStream(filename)) {
    out.write(r.bodyAsBytes());
}
Update on @EJP's comment:
I looked up Apache Commons IO's FileUtils class on GitHub. It calls openStream() of the received URL object.
openStream() is shorthand for openConnection().getInputStream().
openConnection() returns a URLConnection object. If there is an appropriate subclass for the protocol used by the URL, it will return an instance of that subclass. In this case that's an HttpsURLConnection, which is a subclass of HttpURLConnection.
The followRedirects option is defined in HttpURLConnection and it's indeed true by default:
Sets whether HTTP redirects (requests with response code 3xx) should be automatically followed by this class. True by default.
So OP's approach would normally work with redirects too, but it seems that redirection from HTTPS to HTTP is not handled (properly) by HttpsURLConnection. This is the case that @VGR mentioned in the comments below.
It's possible to handle redirects manually by reading the Location header with HttpsURLConnection, then use it in a new HttpURLConnection. (Example) (I wouldn't be surprised if Jsoup did the same.)
I suggested Jsoup because it already implements a way to handle HTTPS to HTTP redirections correctly and also provides tons of useful features.
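For completeness, here is a rough sketch of the manual approach with only the JDK: turn automatic redirects off, read the Location header yourself, and open a fresh connection to the new address, whatever its protocol. The class name, method names and the limit of 5 hops are assumptions chosen for this example, not something taken from FileUtils or Jsoup.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names and the redirect limit are illustrative only.
final class RedirectFollowingDownload {
    private static final int MAX_REDIRECTS = 5; // arbitrary guard against redirect loops

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Location may be absolute (possibly with another scheme) or relative.
    // URI.create rejects illegal characters such as spaces.
    static String resolveLocation(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    // Assumes an http or https address; other schemes would fail the cast below.
    static void download(String address, Path target) throws IOException {
        String current = address;
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // we follow redirects ourselves
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                try (InputStream in = conn.getInputStream()) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                return;
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without a Location header");
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("More than " + MAX_REDIRECTS + " redirects");
    }
}
```

A call would look like RedirectFollowingDownload.download(url, Paths.get("C:/Songs/newsong.mp3")). Whether silently following an HTTPS to HTTP downgrade is acceptable is a security decision for the caller.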

Using InputStreamEntity for building an HTTP PUT request with HttpClient won't work unless I pass the content length explicitly

I am trying to do a http request in scala using httpclient from org.apache.httpcomponents version 4.23. In particular I want to do a put using an InputStreamEntity to build the request in order to avoid copying over a large (~100Mb) byte array in memory. Here is the snippet:
val req = new HttpPut(url)
req setEntity new InputStreamEntity(contentStream, -1/*contentlength*/)
val client = new DefaultHttpClient(connManager, httpParams)
val resp = client execute req
In the code, url, connManager and httpParams are defined elsewhere. The result is that a file is created at the desired location with NO content. I am testing with a contentStream that has 3 bytes. If I create the InputStreamEntity with the content length explicitly set to 3, the file is created correctly. For good reasons, in production I won't know the length of the stream, so I want to pass a negative number to make sure the entire stream is sent until, as advertised by the InputStreamEntity API, the end of the stream is reached.
What am I doing wrong? Why am I getting an empty file when I don't explicitly set the content length?
If the content length is not set, HttpClient switches to chunked transfer encoding.
For this to work, the HTTP server you are sending to must be HTTP/1.1 compliant. Is it?
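To make concrete what such a server has to understand, the sketch below frames a body the way chunked transfer encoding puts it on the wire: a hexadecimal size line, the chunk data, a CRLF, and finally a zero-sized chunk. It is written in Java rather than Scala and is only an illustration of the format. It is not HttpClient's implementation, and the names ChunkedFraming and encode are invented here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustration of the chunked wire format; not taken from HttpClient.
final class ChunkedFraming {
    private static final byte[] CRLF = {'\r', '\n'};
    private static final byte[] LAST_CHUNK = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);

    // Splits body into chunks of at most chunkSize bytes and frames each one.
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            byte[] sizeLine = Integer.toHexString(len).getBytes(StandardCharsets.US_ASCII);
            out.write(sizeLine, 0, sizeLine.length); // chunk size in hex
            out.write(CRLF, 0, CRLF.length);
            out.write(body, off, len);               // chunk data
            out.write(CRLF, 0, CRLF.length);
        }
        out.write(LAST_CHUNK, 0, LAST_CHUNK.length); // zero-sized chunk ends the body
        return out.toByteArray();
    }
}
```

For the 3-byte test stream from the question, a body of "abc" becomes 3\r\nabc\r\n0\r\n\r\n. A server that only looks at Content-Length sees no length at all, which would be consistent with the empty file.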

How do I get Rest Assured to return the text (non-encrypted or streamed) value in my REST response?

I recently moved over to Java and am attempting to write some REST tests against the netflix REST service.
I'm having an issue in that my response using rest assured either wants to send a gzip encoded response or "InputStream", neither of which provide the actual XML text in the content of the response. I discovered the "Accept-Encoding" header yet making that blank doesn't seem to be the solution. With .Net I never had to mess with this and I can't seem to find the proper means of returning a human readable response.
My code:
RestAssured.baseURI = "http://api-public.netflix.com";
RestAssured.port = 80;
Response myResponse = given()
        .header("Accept-Encoding", "")
        .given()
        .auth().oauth(consumerKey, consumerSecret, accessToken, secretToken)
        .param("term", "star wars")
        .get("/catalog/titles/autocomplete");
My response object has a "content" value with nothing but references to buffers, wrapped streams etc. Trying to get a ToString() of the response doesn't work. None of the examples I've seen seem to work in my case.
Any suggestions on what I'm doing wrong here?
This has worked for me:
given().config(RestAssured.config().decoderConfig(DecoderConfig.decoderConfig().noContentDecoders())).get(url)
I guess in Java land everything is returned as an input stream. Using a stream reader grabbed me the data I needed.
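For anyone wondering what "using a stream reader" amounts to when the body is still gzip-compressed, here is a small JDK-only sketch: wrap the stream in a GZIPInputStream and decode the bytes with an explicit charset. The class GzipBody and its methods are hypothetical helpers for this example, not REST Assured API; the gzip method exists only so the round trip can be tried without a server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for turning a gzip-compressed stream into text.
final class GzipBody {

    // Decompresses the stream and decodes it with the given charset.
    static String gunzipToString(InputStream compressed, Charset charset) {
        try (InputStream in = new GZIPInputStream(compressed)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses text; used here only to produce sample input.
    static byte[] gzip(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

If the library has already decoded the content for you, only the charset-aware reading part is needed.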
Until version 1.9.0, Rest-assured sent the header "Accept-Encoding:gzip,deflate" with every request by default, with no way of changing it.
See
https://code.google.com/p/rest-assured/issues/detail?id=154
It works for me:
String responseJson = get("/languages/").asString();

Chunked http decoding in java?

I am decoding HTTP packets and have run into a problem with chunked bodies.
When I get an HTTP packet, it has a header and a body.
When Transfer-Encoding is chunked, I don't know what to do.
Is there a useful API or class for dechunking the data in Java?
If someone experienced with HTTP decoding could show me how to do this, I would appreciate it.
Use a full-fledged HTTP client like Apache HttpComponents Client or just the Java SE provided java.net.URLConnection (mini tutorial here). Both handle it fully transparently and give you a "normal" InputStream back. HttpClient in turn also comes with a ChunkedInputStream which you just have to decorate your InputStream with.
If you really insist on homegrowing a library for this, then I'd suggest creating a class like ChunkedInputStream extends InputStream and writing the logic accordingly. You can find more detail on how to parse it in this Wikipedia article.
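If you do go the homegrown route, the core parsing logic looks roughly like the sketch below. It works on a complete byte array holding only the chunked body (the part after the HTTP headers), which is an assumption about how the captured packets are stored. It drops chunk extensions and ignores any trailer headers after the last chunk. The class name Dechunker is invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative decoder for a chunked message body held fully in memory.
final class Dechunker {

    static byte[] dechunk(byte[] raw) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int eol = indexOfCrlf(raw, pos);
            if (eol < 0) {
                throw new IllegalArgumentException("Missing chunk-size line");
            }
            String sizeLine = new String(raw, pos, eol - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';');      // drop chunk extensions, if any
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // size is hexadecimal
            pos = eol + 2;
            if (size == 0) {
                return body.toByteArray();         // last chunk; trailers are ignored
            }
            if (size < 0 || size > raw.length - pos - 2) {
                throw new IllegalArgumentException("Truncated or invalid chunk");
            }
            body.write(raw, pos, size);
            pos += size;
            if (raw[pos] != '\r' || raw[pos + 1] != '\n') {
                throw new IllegalArgumentException("Chunk data not followed by CRLF");
            }
            pos += 2;
        }
    }

    private static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

A streaming version would keep the same state machine (size line, data, CRLF) but read from an InputStream instead of indexing into an array.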
Apache HttpComponents
Oh, and if we are talking about the client side, HttpURLConnection does this as well.
If you are looking for a simple API try Jodd Http library (http://jodd.org/doc/http.html).
It handles Chunked transfer encoding for you and you get the whole body as a string back.
From the docs:
HttpRequest httpRequest = HttpRequest.get("http://jodd.org");
HttpResponse response = httpRequest.send();
System.out.println(response);
Here is quick-and-dirty alternative that requires no dependency except Oracle JRE:
private static byte[] unchunk(byte[] content) throws IOException {
    ByteArrayInputStream bais = new ByteArrayInputStream(content);
    ChunkedInputStream cis = new ChunkedInputStream(bais, new HttpClient() {}, null);
    return readFully(cis);
}
It uses the same sun.net.www.http.ChunkedInputStream that java.net.HttpURLConnection uses behind the scenes.
This implementation doesn't provide detailed exceptions (line numbers) on wrongly formatted content.
It works with Java 8 but could fail with the next release. You've been warned.
Could be useful for prototyping though.
You can choose any readFully implementation from Convert InputStream to byte array in Java.
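One possible readFully, as a plain JDK sketch. The class name Streams is made up, and wrapping IOException in UncheckedIOException is just a choice made for this example; declaring throws IOException instead would match the unchunk signature above equally well. On Java 9 and later, InputStream.readAllBytes() does the same job.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical utility class for the readFully call used above.
final class Streams {

    // Reads the stream until end-of-stream and returns everything as one array.
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```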

How to gzip ajax requests with Struts 2?

How to gzip an ajax response with Struts2? I tried to create a filter but it didn't work. At client-side I'm using jQuery and the ajax response I'm expecting is in json.
This is the code I used on server:
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gz = new GZIPOutputStream(out);
gz.write(json.getBytes());
gz.close();
I'm redirecting the response to dummy jsp page defined at struts.xml.
The reason why I want to gzip the data back is because there's a situation where I must send a relatively big sized json back to the client.
Any reference provided will be appreciated.
Thanks.
You shouldn't gzip responses unconditionally. You can only gzip the response when the client has told the server that it accepts (understands) gzipped responses. You can check this by determining whether the Accept-Encoding request header contains gzip. If it does, you can safely wrap the OutputStream of the response in a GZIPOutputStream. You only need to add the Content-Encoding header beforehand with a value of gzip to tell the client what encoding the content is being sent in, so that the client knows that it needs to ungzip it.
In a nutshell:
response.setContentType("application/json");
response.setCharacterEncoding("UTF-8");
OutputStream output = response.getOutputStream();
String acceptEncoding = request.getHeader("Accept-Encoding");
if (acceptEncoding != null && acceptEncoding.contains("gzip")) {
    response.setHeader("Content-Encoding", "gzip");
    output = new GZIPOutputStream(output);
}
output.write(json.getBytes("UTF-8"));
output.close(); // finishes the gzip stream; without this the compressed body can be truncated
(note that you would like to set the content type and character encoding as well, this is taken into account in the example)
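One caveat about the contains("gzip") test: it also matches a header such as gzip;q=0, which in HTTP means the client explicitly refuses gzip. A somewhat stricter check could look like the sketch below. The class and method names are invented for this example, and the wildcard form (*) is deliberately not handled.

```java
import java.util.Locale;

// Hypothetical helper; a stricter alternative to acceptEncoding.contains("gzip").
final class AcceptEncoding {

    // True if the header lists gzip (or x-gzip) with a quality value above zero.
    // Note: the "*" wildcard is not interpreted in this sketch.
    static boolean acceptsGzip(String header) {
        if (header == null) {
            return false;
        }
        for (String item : header.split(",")) {
            String[] parts = item.trim().split(";");
            String coding = parts[0].trim().toLowerCase(Locale.ROOT);
            if (!coding.equals("gzip") && !coding.equals("x-gzip")) {
                continue;
            }
            double q = 1.0; // default weight when no q parameter is given
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim().toLowerCase(Locale.ROOT);
                if (p.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(p.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat an unparsable weight as "not acceptable"
                    }
                }
            }
            return q > 0.0;
        }
        return false;
    }
}
```

In the servlet snippet this would replace the condition with AcceptEncoding.acceptsGzip(request.getHeader("Accept-Encoding")).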
You could also configure this at appserver level. Since it's unclear which one you're using, here's just a Tomcat-targeted example: check the compression and compressableMimeType attributes of the <Connector> element in /conf/server.xml: HTTP connector reference. This way you can just write to the response without worrying about gzipping it.
If your response is JSON I would recommend using the struts2-json plugin http://struts.apache.org/2.1.8/docs/json-plugin.html and setting the enableGZIP param to true.
