How to fix encoding while reading a JSON response in Android?

I am calling an API which returns a JSON response. While reading the JSON response there are some places where data has some special characters. I want to exclude these special characters while reading the response in an object. The JSON response looks like this:
{"data":[{"title":"PSY - GANGNAM STYLE (\uac15\ub0a8\uc2a4\ud0c0\uc77c) M\/V","content":"All rights reserved."}]}
The Java code is this:
BufferedReader reader = new BufferedReader(new InputStreamReader(
is, "ISO-8859-1"), 8);
When I read the title key from the response, I get these special characters as well, which I don't want. How do I get rid of them? Do I need to specify some other encoding?
Data Source :http://pipes.yahoo.com/pipes/pipe.run?_id=920adeb2e95c15877e29dc678aa78dd7&_render=json&n=1

This isn't a character-encoding issue (such as UTF-8 versus ISO-8859-1); it's a matter of JSON string escape syntax, which JSON inherits from JavaScript. The "\uac15", for example, is an escape sequence that represents the Unicode character U+AC15, which is "강". Together, those escaped characters are the name of the song written in Hangul (Korean): "강남스타일".
It's normal and OK for your Java string to contain the backslash escape sequences. When you run that string through a JSON reader, you should get a JSON object containing the actual Hangul characters.
In response to your comment about getting wrong output from a JSON reader, that depends on what JSON library you're using (and how you're using it), which you didn't specify in the question. Here's an example that works for me using Jackson 2.1.0:
import java.io.IOException;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class JsonTest {
    public static void main(final String[] args) {
        // A JSON string value, including its surrounding double quotes.
        final String json = "\"PSY - GANGNAM STYLE (\\uac15\\ub0a8\\uc2a4\\ud0c0\\uc77c) M\\/V\"";
        System.out.println("JSON: " + json);
        try {
            // ObjectMapper is from Jackson 2.1 databind library.
            final ObjectMapper mapper = new ObjectMapper();
            // Parsing resolves the \uXXXX and \/ escapes into real characters.
            final String decoded = mapper.readValue(json, String.class);
            System.out.println("Decoded: " + decoded);
        }
        catch (final IOException e) {
            e.printStackTrace();
        }
    }
}
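For readers who want to see what the JSON reader does with these escapes without adding a library, here is a hand-rolled sketch. The class name UnicodeEscapeSketch and its unescape method are hypothetical names chosen for illustration, and the method only handles \uXXXX, \/, \\ and \" (it does not validate its input). It is meant to show the idea, not to replace a real JSON parser.

```java
final class UnicodeEscapeSketch {

    // Hypothetical helper: resolves a small subset of JSON string escapes.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(i + 1);
                if (next == 'u' && i + 6 <= s.length()) {
                    // Four hex digits name one UTF-16 code unit.
                    out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                    i += 6;
                    continue;
                }
                if (next == '/' || next == '\\' || next == '"') {
                    // These escapes simply stand for the character itself.
                    out.append(next);
                    i += 2;
                    continue;
                }
            }
            // Anything else is copied through unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "PSY - GANGNAM STYLE (\\uac15\\ub0a8\\uc2a4\\ud0c0\\uc77c) M\\/V";
        System.out.println(unescape(raw));
    }
}
```

Because the escaped response consists of ASCII characters only, the charset passed to InputStreamReader does not change how these particular escapes are read; the conversion to Hangul happens in the JSON parsing step.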

Related

How string escaping works in Jackson's ObjectMapper

I want to generate a JSON string from a given string, but with a single backslash character before each single quote character, like this: \'. For example, I have the string "you are the 'great'" and want output like this: "you are the \'great\'". I am using the Jackson ObjectMapper class, and the following is the code:
String str = "you are the 'great'";
String jsonStr = "";
System.out.println(str);//Line-1
ObjectMapper mapper = new ObjectMapper();
try
{
jsonStr = mapper.writeValueAsString(str);
}
catch (IOException e)
{
e.printStackTrace();
}
System.out.println(jsonStr);//Line-2
Based on that, I ran the following test cases:
Input string: "you are the \'great\'"
Output of Line-1: you are the 'great'
Output of Line-2: you are the 'great'
Input string: "you are the \\'great\\'"
Output of Line-1: you are the \'great\'
Output of Line-2: you are the \\'great\\'
But I am not able to get the expected output. Please suggest a solution.
Note: To explain the problem I have used a plain string as input, but in reality I have some string properties in an object and want the JSON string of that object.

BufferedWriter doesn't write JSON String correctly

I wrote some code that gets JSON text from a website and formats it so it is easier to read. My problem is with this code:
public static void gsonFile(){
try {
re = new BufferedReader(new FileReader(dateiname));
Gson gson = new GsonBuilder().setPrettyPrinting().create();
JsonParser jp = new JsonParser();
String uglyJSONString ="";
uglyJSONString = re.readLine();
JsonElement je = jp.parse(uglyJSONString);
String prettyJsonString = gson.toJson(je);
System.out.println(prettyJsonString);
wr = new BufferedWriter(new FileWriter(dateiname));
wr.write(prettyJsonString);
wr.close();
re.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
It correctly prints it into the console : http://imgur.com/B8MTlYW.png
But in my txt file it looks like this: http://imgur.com/N8iN7dv.png
What can I do so it correctly prints it into the file? (separated by new lines)
Gson uses \n as line delimiter (as can be seen in the newline method here).
Since Notepad does not understand \n you can either open your result file with another file editor (Wordpad, Notepad++, Atom, Sublime Text, etc.) or replace the \n by \r\n before writing it:
prettyJsonString = prettyJsonString.replace("\n", "\r\n");
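One caveat with a plain replace is that text which already contains \r\n would end up with \r\r\n. A minimal sketch that avoids this by normalizing to \n first is shown below; the class name LineEndingSketch and the method toCrlf are hypothetical names used only for illustration.

```java
final class LineEndingSketch {

    // Converts every line break to CRLF without doubling existing CRLF pairs.
    static String toCrlf(String text) {
        return text.replace("\r\n", "\n")   // normalize to LF first
                   .replace("\n", "\r\n");  // then expand every LF to CRLF
    }

    public static void main(String[] args) {
        String pretty = "{\n  \"a\": 1\n}";
        // Print with escapes made visible so the result can be inspected.
        System.out.println(toCrlf(pretty).replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```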
FileReader and FileWriter are old utility classes that use the platform encoding. This gives non-portable files. And for JSON one ordinarily uses UTF-8.
Path datei = Paths.get(dateiname);
re = Files.newBufferedReader(datei, StandardCharsets.UTF_8);
Or
List<String> lines = Files.readAllLines(datei, StandardCharsets.UTF_8);
// Without line endings as usual.
Or
String text = new String(Files.readAllBytes(datei), StandardCharsets.UTF_8);
And later:
Files.write(datei, text.getBytes(StandardCharsets.UTF_8));
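Putting the read and write calls together, a round trip through a UTF-8 file might look like the following sketch. It writes to a temporary file rather than to dateiname, and the class name Utf8FileSketch and method roundTrip are hypothetical; checked exceptions are wrapped only to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileSketch {

    // Writes the text as UTF-8 bytes, reads the bytes back, and decodes them as UTF-8.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-sketch", ".json");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "{\"title\":\"\uac15\ub0a8\uc2a4\ud0c0\uc77c\"}";
        // true when the non-ASCII characters survive the file round trip
        System.out.println(json.equals(roundTrip(json)));
    }
}
```

Naming the charset on both sides is what makes the file portable; with FileReader and FileWriter the result depends on the platform default.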
The problem is with your text editor, not with the text: it processes the newline character incorrectly.
I suppose it expects CR LF (the Windows convention), while Gson generates only LF (the Unix convention).
After a quick search, this topic might come in handy:
Strings written to file do not preserve line breaks
Also, opening the file in another editor, as the others said, will help too.

String object vs. string literal in OutputStreamWriter in Java

I'm making requests to a HTTP server sending JSON string. I used Gson for serializing and deserializing JSON objects. Today I observed this pretty weird behavior that I don't understand.
I have:
String jsonAsString = gson.toJson(jsonAsObject).replace("\"", "\\\"");
System.out.println(jsonAsString);
That outputs exactly this:
{\"content\":\"Literal\",\"pos\":{\"left\":20,\"top\":20}}
Now I'm using an OutputStreamWriter obtained from HttpURLConnection to make an HTTP PUT request with a JSON payload. The following request works fine:
os.write("{\"content\":\"Literal\",\"pos\":{\"left\":20,\"top\":20}}");
However, when I say:
os.write(jsonAsString);
...the request doesn't work (the server doesn't return any errors, but I can see that when writing JSON as a string object it doesn't do what it should). Is there a difference between using a string literal and a string object? Am I doing something wrong?
Here is the snippet:
public static void EditWidget(SurfaceWidget sw, String widgetId) {
Gson gson = new Gson();
String jsonWidget = gson.toJson(sw).replace("\"", "\\\"");
System.out.println(jsonWidget);
try {
HttpURLConnection hurl = getConnectionObject("PUT", "http://fltspc.itu.dk/widget/5162b1a0f835c1585e00009e/");
hurl.connect();
OutputStreamWriter os = new OutputStreamWriter(hurl.getOutputStream());
//os.write("{\"content\":\"Literal\",\"pos\":{\"left\":20,\"top\":20}}");
os.write(jsonWidget);
os.flush();
os.close();
System.out.println(hurl.getResponseCode());
} catch (IOException e) {
e.printStackTrace();
}
}
Remove the .replace("\"", "\\\"") instruction. It's unnecessary.
You're forced to add slashes before double quotes when you send a JSON String literal because in a Java String literal, double quotes must be escaped (otherwise, they would mark the end of the String instead of being a double quote inside the String).
But the actual String, in the bytecode, doesn't contain these backslashes. They're only used in the source code.
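A small sketch can make the difference visible. The class name LiteralEscapeSketch and its two methods are hypothetical; the JSON text is shortened from the question.

```java
final class LiteralEscapeSketch {

    // The backslashes below exist only in source code; the runtime string has none.
    static String literal() {
        return "{\"content\":\"Literal\"}";
    }

    // What the question's replace call does: it inserts real backslash characters.
    static String overEscaped() {
        return literal().replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        System.out.println(literal());      // {"content":"Literal"}
        System.out.println(overEscaped());  // {\"content\":\"Literal\"}
    }
}
```

The first string is what the working os.write call with the literal actually sends. The second one carries extra backslash characters in the payload, which is presumably why the server does not treat it as the expected JSON.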

Jackson JSON parser invalid utf-8 start byte

I'm trying to parse the following JSON and I keep getting a JsonParseException:
{
"episodes":{
"description":"Episode 3 – Oprah's Surprise Patrol from 1\/20\/04\nTake a trip down memory lane and hear all your favorite episodes of The Oprah Winfrey Show from the last 25 seasons -- everyday on your radio!"
}
}
also fails on this JSON
{
"episodes":{
"description":"After 20 years in sports talk…he’s still the top dog! Catch Christopher “Mad Dog” Russo weekday afternoons on Mad Dog Radio as he tells it like it is…Give the Doggie a call at 888-623-3646."
}
}
Exception:
org.codehaus.jackson.JsonParseException: Invalid UTF-8 start byte 0x96
at [Source: C:\Json Test Files\episodes.txt; line: 3, column: 33]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.Utf8StreamParser._reportInvalidInitial(Utf8StreamParser.java:2236)
at org.codehaus.jackson.impl.Utf8StreamParser._reportInvalidChar(Utf8StreamParser.java:2230)
at org.codehaus.jackson.impl.Utf8StreamParser._finishString2(Utf8StreamParser.java:1467)
at org.codehaus.jackson.impl.Utf8StreamParser._finishString(Utf8StreamParser.java:1394)
at org.codehaus.jackson.impl.Utf8StreamParser.getText(Utf8StreamParser.java:113)
at com.niveus.jackson.Main.parseEpisodes(Main.java:37)
at com.niveus.jackson.Main.main(Main.java:13)
Code:
public static void main(String [] args) {
parseEpisodes("C:\\Json Test Files\\episodes.txt");
}
public static void parseEpisodes(String filename) {
JsonFactory factory = new JsonFactory();
JsonParser parser = null;
String nameField = null;
try {
parser = factory.createJsonParser(new File(filename));
parser.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
parser.configure(JsonParser.Feature.ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER, true);
JsonToken token = parser.nextToken();
nameField = parser.getText();
String desc = null;
while (token != JsonToken.END_OBJECT) {
if (nameField.equals("episodes")) {
while (token != JsonToken.END_OBJECT) {
if (nameField.equals("description")) {
parser.nextToken();
desc = parser.getText();
}
token = parser.nextToken();
nameField = parser.getText();
}
}
token = parser.nextToken();
nameField = parser.getText();
}
System.out.println(desc);
} catch (JsonParseException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
The character at column 33 is –, and it shows up as the byte 0x96 because the file is physically encoded as Windows-1252. You need to save the file as UTF-8; Windows-1252 is not a valid encoding for JSON. How to do this depends on which text editor you are using.
See JSON RFC:
Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
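If re-saving the file in an editor is not convenient, the same conversion can be sketched in code. The example below assumes the input really is Windows-1252, which is a guess based on the byte 0x96; the class name Cp1252Sketch and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Cp1252Sketch {

    // Strict check: returns false if the bytes are not well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Decodes the bytes as Windows-1252 (assumed) and re-encodes them as UTF-8.
    static byte[] cp1252ToUtf8(byte[] bytes) {
        String text = new String(bytes, Charset.forName("windows-1252"));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "3 – O" as Windows-1252 bytes; 0x96 is the en dash in that code page.
        byte[] raw = {'3', ' ', (byte) 0x96, ' ', 'O'};
        System.out.println(isValidUtf8(raw));
        System.out.println(isValidUtf8(cp1252ToUtf8(raw)));
    }
}
```

In UTF-8, 0x96 can only appear as a continuation byte, never as the first byte of a character, which matches the "Invalid UTF-8 start byte" message in the stack trace.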
I have also faced a similar issue. Open your JSON file in Notepad++, select UTF-8 in the Encoding drop-down, and save the text to another file. Doing this resolved the issue.
I tried everything mentioned here and none of it solved my issue, so I typed the payload manually, and that solved it.
I know this question is old, but I would like to share something that worked for me. It is possible to ignore the malformed bytes in the following way.
Define a charset decoder:
CharsetDecoder CHARSET_DECODER = StandardCharsets.UTF_8.newDecoder().onMalformedInput(CodingErrorAction.IGNORE);
Use it to read the InputStream:
InputStreamReader stream = new InputStreamReader(resource.getInputStream(), CHARSET_DECODER);
Use the Jackson CSV mapper to read the content:
new CsvMapper().readerFor(Map.class).readValues(stream);
The key element here is the charset decoder with the IGNORE option for malformed input.
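To show the effect of the IGNORE option in isolation, here is a sketch that uses only the JDK and reads from an in-memory byte array instead of a resource. The class name LenientUtf8Sketch and the method readIgnoringMalformed are hypothetical. Note that this approach silently drops the offending bytes, so the en dash from the question would disappear rather than be repaired.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class LenientUtf8Sketch {

    // Decodes the bytes as UTF-8 and skips any byte sequence that is malformed.
    static String readIgnoringMalformed(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE);
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), decoder)) {
            StringBuilder out = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 0x96 is not valid UTF-8 on its own, so it is dropped from the result.
        byte[] raw = {'a', (byte) 0x96, 'b'};
        System.out.println(readIgnoringMalformed(raw));
    }
}
```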

Sending Non-latin query string in URL in JavaME

I want to make an HTTP GET request from my J2ME application using the HttpConnection class.
The problem is that I cannot send Russian text in the query string.
Here is the example of how I'm sending the request
c = (HttpConnection)Connector.open("http://127.0.0.1:1418/zp.ashx?тест");
InputStream s = c.openInputStream();
The receiving ASP.NET script receives the query part of the URL as %3f%3f%3f%3f
That is four identical codes. Definitely not what I'm sending.
So how can I send non-Latin text in an HTTP query in J2ME?
Thank you in advance
Your code
Connector.open("http://127.0.0.1:1418/zp.ashx?тест");
is processed by a java.nio.CharsetDecoder for the ASCII character set, and this decoder replaces all unknown characters with its replacement.
To get the behavior you want, you have to encode the URL before sending it. For example, when your server expects the URLs to be UTF8-encoded:
String encodedParameter = URLEncoder.encode("тест", "UTF-8");
Connector.open("http://127.0.0.1:1418/zp.ashx?" + encodedParameter);
Note that if you have multiple parameters, you have to encode both the parameter names and the parameter values individually, before putting them together with "=" and concatenating them with "&". If you need to encode multiple parameters, this class may be helpful to you:
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlParamGenerator {
    private final String encoding;
    private final StringBuilder sb = new StringBuilder();
    private String separator = "?";

    public UrlParamGenerator(String charset) {
        this.encoding = charset;
    }

    public void add(String key, String value) throws UnsupportedEncodingException {
        sb.append(separator);
        sb.append(URLEncoder.encode(key, encoding));
        sb.append("=");
        sb.append(URLEncoder.encode(value, encoding));
        separator = "&";
    }

    @Override
    public String toString() {
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        UrlParamGenerator gen = new UrlParamGenerator("UTF-8");
        gen.add("test", "\u0442\u0435\u0441\u0442");
        gen.add("x", "0");
        System.out.println(gen.toString());
    }
}
You might need to explicitly set a character set in the HTTP header that supports the cyrillic alphabet. You could either use UTF-8 or another charset, such as windows-1251 (although UTF-8 should be the preferred choice).
c = (HttpConnection)Connector.open("http://127.0.0.1:1418/zp.ashx?тест");
c.setRequestProperty("Content-type", "application/x-www-form-urlencoded;charset=utf-8");
If you use an appropriate charset, the server should be able to properly handle the cyrillic request parameter - provided it too supports this charset.
URL can contain only ASCII chars and a few punctuation chars. For other chars, you must %-encode them before adding them in the URL. Use URLEncoder.encode("тест", enc) where the enc parameter is the encoding scheme that the server expects.
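The following Java SE sketch illustrates both halves of this: why four question marks (%3f is the code for "?") can appear when Cyrillic text is forced through ASCII, and what the percent-encoded form looks like for UTF-8. The class name QueryEncodingSketch and its methods are hypothetical, and whether a given J2ME runtime degrades the text in exactly this way is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryEncodingSketch {

    // Forces the text through US-ASCII; unmappable characters become '?'.
    static String asAsciiOnly(String s) {
        return new String(s.getBytes(StandardCharsets.US_ASCII), StandardCharsets.US_ASCII);
    }

    // Percent-encodes the text using its UTF-8 bytes.
    static String percentEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on Java SE, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String word = "\u0442\u0435\u0441\u0442"; // "тест"
        System.out.println(asAsciiOnly(word));    // ????
        System.out.println(percentEncode(word));  // %D1%82%D0%B5%D1%81%D1%82
    }
}
```

The server then has to decode the query string with the same charset that was used for encoding, UTF-8 in this sketch.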
