My API response contains a Ctrl-P control character (code 16). The Jackson parser fails on this character and throws an error:
com.fasterxml.jackson.core.JsonParseException: Illegal unquoted
character ((CTRL-CHAR, code 16)): has to be escaped using backslash to
be included in string value
I have investigated and found that the Jackson library explicitly checks for control characters and rejects them.
Can anyone suggest solutions or work around for this? Thanks in advance.
I was able to fix a similar problem by setting Feature.ALLOW_UNQUOTED_CONTROL_CHARS (documentation) on the JsonParser.
In my case the code looks like this:
parser.setFeatureMask(parser.getFeatureMask() | JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS.getMask());
As stated by others, such JSON is invalid, but if you have no way to change the JSON, this should help.
Have you tried to configure the mapper to force escape non-ASCII?
This might be enough:
mapper.configure(JsonGenerator.Feature.ESCAPE_NON_ASCII, true);
see documentation
But I agree with StaxMan: the JSON response should be well formatted.
The content you get is not valid JSON: as per the JSON specification, control characters MUST be escaped within String values, and CANNOT exist outside of them. So I would recommend getting the input data fixed; it is corrupt, and whoever is sending it is not doing a good job of cleansing it or properly escaping it.
Barring that, you can write a Reader (or even InputStream) that filters out or converts said control characters.
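A filtering Reader along those lines could be sketched as follows. This is a minimal sketch, not a definitive implementation: the class name ControlCharFilterReader and the helper filter() are made up for illustration, and the choice to drop every character below 0x20 except tab, line feed and carriage return is an assumption. Note that dropping characters changes the data, so it only makes sense when the control characters carry no meaning.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: wraps another Reader and silently drops control characters.
final class ControlCharFilterReader extends FilterReader {

    ControlCharFilterReader(Reader in) {
        super(in);
    }

    // Assumption: tab, LF and CR are kept because they are legal JSON whitespace.
    private static boolean isDropped(int c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && isDropped(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the kept characters towards the start of the requested range.
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (!isDropped(buf[i])) {
                    buf[kept++] = buf[i];
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // Everything in this chunk was filtered out: read the next chunk.
        }
    }

    // Convenience method for trying the filter on a String.
    static String filter(String raw) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new ControlCharFilterReader(new StringReader(raw))) {
            char[] buf = new char[4];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

The wrapped reader can then be handed to the parser in place of the original one, for example through an ObjectMapper.readValue overload that accepts a Reader.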
While looking for a reliable character for splitting strings, I found an earlier post about using "((char)007)" as a split character, so I decided to use that for a request/response project I'm building.
But when I send data with "((char)007)" between the data parts that need to be separated, the data arrives at the other end of the socket like this instead: "teq□weq□1231□21231".
So splitting this data properly is unsuccessful at the moment. Any ideas about why this happens, what approach I might follow to fix it, or what else I could use for splitting would be appreciated, thanks.
If you are printing control characters (BELL) then your console may not print it out properly.
In any case, consider just sending a structure like a serialized object (be careful with deserializing user-supplied content) or perhaps JSON. Any structure with a standardized format will do better in the long term than arbitrary splitting on a magic character.
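To check whether the separator really survives the trip, it can help to split on the character itself rather than judge by what the console shows. The following is only a sketch in Java (the question does not say which language is used); the class BellSplitDemo and its methods are invented for illustration, and the UTF-8 byte round trip merely stands in for the socket.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: join and split on the BEL character (code 7).
final class BellSplitDemo {

    static final char SEP = '\u0007';

    static String join(String... parts) {
        return String.join(String.valueOf(SEP), parts);
    }

    static String[] split(String message) {
        // BEL is not a regex metacharacter, so it can be used as the pattern directly.
        // The -1 limit keeps trailing empty fields.
        return message.split(String.valueOf(SEP), -1);
    }

    // Simulates sending the text over a byte stream and decoding it again.
    static String roundTrip(String message) {
        byte[] wire = message.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.UTF_8);
    }
}
```

If split() returns the expected parts on the receiving side, the box in the output is only how the console renders BEL, and the data itself is intact.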
I need to add a URL typically in the format http:\somewebsite.com\somepage.asp.
When I create a string with the above URL and add it to JSON object json
using
json.put("url",urlstring);
it's appending an extra "\" and when I check the output it's like http:\\\\somewebsite.com\\somepage.asp
When I give the URL as http://somewebsite.com/somepage.asp
the json output is http:\/\/somewebsite.com\/somepage.asp
Can you help me to retrieve the URL as it is, please?
Thanks
Your JSON library automatically escapes characters like slashes. On the receiving end, you'll have to remove those backslashes by using a function like replace().
Here's an example:
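// The received text contains a backslash before each slash; in Java source each backslash is written twice.
String receivedUrlString = "http:\\/\\/somewebsite.com\\/somepage.asp";
// replace(CharSequence, CharSequence) is a literal replacement, not a regex.
String cleanedUrlString = receivedUrlString.replace("\\", "");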
cleanedUrlString should be "http://somewebsite.com/somepage.asp".
Hope this helps.
Reference: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replace(char,%20char)
Tichodroma's answer has nailed it. You can solve the "problem" by storing valid URLs.
In addition, the JSON format requires that backslashes in strings are escaped with a second backslash. If the 2nd backslash is left out, the result is invalid JSON. Refer to the JSON syntax diagrams at http://www.json.org
The fact that the double backslashes are giving you problems actually means that the software that is reading the files is broken. A properly written JSON parser will automatically de-escape the strings. The site I linked to above lists many JSON parser libraries written in many languages. You should use one of these rather than trying to write the JSON parsing code yourself.
In my android application, my JSON date is returned as this:
\/Date(1323752400000)\/
Is there a simple way to remove the escape characters? (This is being sent from a WCF service to an Android application.) I am already using StringEscapeUtils.unescapeHtml4 to decode the entire serialized object.
Actually, replaceAll("\\","") does not work: it throws java.util.regex.PatternSyntaxException, because the backslash must also be escaped for the regex. Use this instead:
myJsonString=myJsonString.replaceAll("\\\\","");
It works fine.
On the receiving end, if you really want to, you could just do myJsonString = myJsonString.replace("\\", "");
But do note that those escape characters in no way make the JSON invalid or otherwise semantically different -- the '/' character can be optionally escaped with '\' in JSON.
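If the backslashes do have to go, a narrower variant is to undo only the escaped slash and leave any other backslash alone. This is a small sketch under that assumption; the class and method names are made up for illustration.

```java
// Hypothetical helper: undoes only the optional JSON escape of the '/' character.
final class EscapedSlashDemo {

    static String unescapeSlashes(String s) {
        // Literal replacement (no regex): backslash-slash becomes a plain slash.
        return s.replace("\\/", "/");
    }

    // Regex-based variant that removes every backslash; the regex needs four
    // backslashes in Java source to match one backslash in the text.
    static String removeAllBackslashes(String s) {
        return s.replaceAll("\\\\", "");
    }
}
```

For the date value from the question both methods give the same result, but the first one would not damage other escapes such as an escaped quote.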
You can use Apache Commons lang:
StringEscapeUtils.unescapeJava(stringToUnEscape);
Class Ref : https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
I'm trying to decode JSON output of a Java program (jackson) and having some issues.
The cause of the problem is the following snippet:
{
"description": "... lives\uMOVE™ OFFERS ",
}
Which causes ValueError: Invalid \uXXXX escape.
Any ideas on how to fix this?
EDIT: The output is from an Avro file, the Avro package uses jackson to emit records as JSON.
EDIT2: After poking about in the source files, it might be the case that the JSON is constructed manually (sorry jackson).
What's the original string supposed to look like? \uXXXX is a Unicode escape sequence, so the parser tries to interpret \uMOVE as a single character, but MOVE is not a valid hexadecimal code. JSON is always assumed to be Unicode, so you'll likely need to fix the string in the originating app.
Try escaping the \u like this:
{
"description": "... lives\\uMOVE™ OFFERS ",
}
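If the producer cannot be changed, the same escaping could be applied as a pre-processing step before the text reaches the JSON parser. The sketch below is written in Java with only the standard library; the class name is hypothetical, and the rule it applies is an assumption: a backslash-u that is not preceded by another backslash and not followed by four hex digits gets its backslash doubled. It does not handle longer runs of backslashes correctly, so treat it as a stopgap.

```java
import java.util.regex.Pattern;

// Hypothetical helper: doubles the backslash of malformed unicode escapes.
final class InvalidUnicodeEscapeFixer {

    // Matches backslash-u when it is not already escaped and is not followed
    // by exactly the four hex digits a valid escape requires.
    private static final Pattern BAD_U =
            Pattern.compile("(?<!\\\\)\\\\u(?![0-9A-Fa-f]{4})");

    static String fix(String json) {
        // The replacement is two literal backslashes followed by 'u'.
        return BAD_U.matcher(json).replaceAll("\\\\\\\\u");
    }
}
```

Valid escapes are left untouched, so the output only differs from the input where the original would have been rejected.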
Basically the input isn't valid json.
The spec on http://www.json.org/ defines how strings should be encoded. You will have to fix the JSON output from the other application.
This is a known bug in Avro versions < 1.6.0. See AVRO-851 for more details.
Jackson does not currently have a configuration feature to allow accepting such input. (Was it generated with Jackson?)
You could modify the stream parser to handle it. Follow the stack trace to the method(s) that would need changing.
You could submit a change request at http://jira.codehaus.org/browse/JACKSON for Jackson to be enhanced to provide such a feature, though I'm not sure how popular the request would be, and whether it would ever be implemented.
We have a Java application that pulls data from SAP, parses it, and renders it for the users.
The data is pulled using JCO connector.
Recently we were thrown an exception:
org.xml.sax.SAXParseException: Character reference "�" is an invalid XML character.
So, we are planning to write a new level of indirection where ALL special/illegal characters are replaced BEFORE parsing the XML.
My questions here are :
Is there any existing (open source) utility that does this job of replacing illegal characters in XML?
Or if I had to write such a utility, how should I handle them?
Why is the above exception thrown?
Thank You.
From my point of view, the source (SAP) should do the replacement. Otherwise, what it transmits to your program may look like XML, but is not.
While replacing '&' with '&amp;' can be done with a simple String.replaceAll(...) on the string returned by the toXML() call, other characters can be harder to replace ('<' and '>', for example).
regards
Guillaume
It sounds like a bug in their escaping. Depending on context you might be best off just writing your own version of their XMLWriter class that uses a real XML library rather than trying to write your own XML utilities like the SAP developers did.
Alternatively, looking at the character code, �, you might be able to get away with a replace all on it with the empty string:
String goodXml = badXml.replaceAll("", "");
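A more general form of the same replace-all idea is to strip every numeric character reference whose code point XML 1.0 does not allow. The sketch below uses only the standard library; the class and method names are invented for illustration, and it assumes the references are well-formed (with a terminating semicolon). The allowed ranges are those of the Char production in the XML 1.0 specification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: removes numeric character references that XML 1.0 forbids.
final class XmlCharRefCleaner {

    // Decimal (&#NN;) or hexadecimal (&#xHH;) character reference.
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(x[0-9A-Fa-f]+|[0-9]+);");

    // Char production of XML 1.0.
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String stripIllegalCharRefs(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            int cp;
            try {
                cp = body.startsWith("x")
                        ? Integer.parseInt(body.substring(1), 16)
                        : Integer.parseInt(body);
            } catch (NumberFormatException e) {
                cp = -1; // too large to be a code point: treat as illegal
            }
            // Keep legal references as they are, drop the illegal ones.
            String replacement = isLegalXml10(cp) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Running the raw text through stripIllegalCharRefs before handing it to the XML parser removes the offending references; replacing them with a space instead of the empty string is an equally simple change.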
I've had a related, but opposite problem, where I was trying to insert character 1 into the output of an XSLT transformation. I considered post-processing to replace a marker with the zero, but instead chose to use an xsl:param.
If I was in your situation, I'd either come up with a bespoke encoding, replacing the characters which are invalid in XML, and handling them as special cases in your parsing, or if possible, replace them with whitespace.
I don't have experience with JCO, so can't advise on how or where I'd replace the invalid characters.
You can encode/decode non-ASCII characters in XML by using the escapeXml method of the Apache Commons Lang class StringEscapeUtils. See:
http://commons.apache.org/lang/api-2.4/index.html
To read about how XML character references work, search for "numeric character references" on Wikipedia.