How to make JSONWriter XML safe? [duplicate] - java

The reason for this "escapes" me.
JSON escapes the forward slash, so a hash {a: "a/b/c"} is serialized as {"a":"a\/b\/c"} instead of {"a":"a/b/c"}.
Why?

JSON doesn't require you to do that, it allows you to do that. It also allows you to use "\u0061" for "A", but it's not required, like Harold L points out:
The JSON spec says you CAN escape forward slash, but you don't have to.
Harold L answered Oct 16 '09 at 21:59
Allowing \/ helps when embedding JSON in a <script> tag, which doesn't allow </ inside strings, like Seb points out:
This is because HTML does not allow a string inside a <script> tag to contain </, so in case that substring's there, you should escape every forward slash.
Seb answered Oct 16 '09 at 22:00 (#1580667)
Some of Microsoft's ASP.NET Ajax/JSON API's use this loophole to add extra information, e.g., a datetime will be sent as "\/Date(milliseconds)\/". (Yuck)

The JSON spec says you CAN escape forward slash, but you don't have to.

I asked the same question some time ago and had to answer it myself. Here's what I came up with:
It seems, my first thought [that it comes from its JavaScript
roots] was correct.
'\/' === '/' in JavaScript, and JSON is valid JavaScript. However,
why are the other ignored escapes (like \z) not allowed in JSON?
The key for this was reading
http://www.cs.tut.fi/~jkorpela/www/revsol.html, followed by
http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2. The feature of
the slash escape allows JSON to be embedded in HTML (as SGML) and XML.

PHP escapes forward slashes by default which is probably why this appears so commonly. I suspect it's because embedding the string "</script>" inside a <script> tag is considered unsafe.
Example:
<script>
var searchData = <?= json_encode(['searchTerm' => $_GET['search'], ...]) ?>;
// Do something else with the data...
</script>
Based on this code, an attacker could append this to the page's URL:
?search=</script> <some attack code here>
Which, if PHP's protection was not in place, would produce the following HTML:
<script>
var searchData = {"searchTerm":"</script> <some attack code here>"};
...
</script>
Even though the closing script tag is inside a string, it will cause many (most?) browsers to exit the script tag and interpret the items following as valid HTML.
With PHP's protection in place, it will appear instead like this, which will NOT break out of the script tag:
<script>
var searchData = {"searchTerm":"<\/script> <some attack code here>"};
...
</script>
This functionality can be disabled by passing in the JSON_UNESCAPED_SLASHES flag but most developers will not use this since the original result is already valid JSON.

Yes, some JSON utiltiy libraries do it for various good but mostly legacy reasons. But then they should also offer something like setEscapeForwardSlashAlways method to set this behaviour OFF.
In Java, org.codehaus.jettison.json.JSONObject does offer a method called
setEscapeForwardSlashAlways(boolean escapeForwardSlashAlways)
to switch this default behaviour off.

Related

How to insert HTML code in a context variable with thyme leaf [duplicate]

I'm using Thymeleaf to process html templates, I understood how to append inline strings from my controller, but now I want to append a fragment of HTML code into the page.
For example, lets stay that I have this in my Java application:
String n="<span><i class=\"icon-leaf\"></i>"+str+"</span> \n";
final WebContext ctx = new WebContext(request, response,
servletContext, request.getLocale());
ctx.setVariable("n", n);
What do I need to write in the HTML page so that it would be replaced by the value of the n variable and be processed as HTML code instead of it being encoded as text?
You can use th:utext attribute that stands for unescaped text (see documentation). Use this with caution and avoid user input in th:utext as it can cause security problems.
<div th:remove="tag" th:utext="${n}"></div>
If you want short-hand syntax you can use following:
[(${variable})]
Escaped short-hand syntax is
[[${variable}]]
but if you change inner square brackets [ with regular ( ones HTML is not escaped.
Example within tags:
<div>
[(${variable})]
</div>

Sending xml data in hidden field.Is it safe?

I have an html form:
<form>
<input type="hidden" id="hiddenField"/>
...Other form fields
</form>
In this form I want to set a hidden field with xml data.
Can anyone suggest if it is fine to set the hidden field directly with xml data.
i.e. in my javascript function is it safe to directly set the hidden field with xml like: $(#hiddenFiled).val(xml); and get the xml in my java servlet?Please suggest.
No you can't keep xml without encoding
You can opt either
var stringValue=escape(xml);
var xmlValue= unescape (stringValue)
in javascript
Though these methods has been depreciated in newer versions so you could find it in another library like http://underscorejs.org/#escapeUnderScoreJs
Also don't keep XML in hidden field if it holds andy sensitive information.
Hidden form fields are not for session tracking.
We have two mechanism for session tracking, they are cookies and URL rewriting, the latest for the people that doesn't have cookies enabled in their browsers, I could only understand sending a session id in a hidden field when you have your own session tracker and are not using the one that is already with your server container (HttpSession and all), but why re-invent the wheel?
Hidden fields are for passing information between pages, sometimes I use a and I clearly don't want that information displayed to the user
Posting XML without javascript or browser plugins is impossible. You should not send it directly as a form parameter. See this answer for more info:.
Use a library that would encode them while sending to server, and decode them at the server side.
Underscore.js provides such functionality. See the documentation:
escape_.escape(string)
Escapes a string for insertion into HTML, replacing &, <, >, ", `, and ' characters.
_.escape('Curly, Larry & Moe');
=> "Curly, Larry & Moe"
unescape_.unescape(string)
The opposite of escape, replaces &, <, >, ", ` and ' with their unescaped counterparts.
_.unescape('Curly, Larry & Moe');
=> "Curly, Larry & Moe"
However, do keep in mind that usually browsers have limits over the amount of data that you can send through GET request (around 255 bytes). Hence it's always a good option to use POST instead of GET even when sending encoded XML.

Thymeleaf print JSON string as JSON object into a javascript variable

In Specific
I need a way to print JSON representation of a string value into the html page via thymeleaf.
In detail
I'm having a model attribute which contains a string which is actually a string representation of the JSON
My thymeleaf code
<script th:inline="javascript">
var value = [[${data.scriptValue}]];
</script>
print the variable as below
var value = '[[\"asd\",\"3\"],[\"asd\",\"1\"],[\"asdasd\",\"1\"]]';
But I want something like this as a javascript/JSON array
var value = [["asd","3"],["asd","1"],["asdasd","1"]];
How to do this in thymeleaf?
Note: I know I can do this from JSON.Parse but i need a way to do this from thymeleaf :)
Update - 2015/12/24
This feature is available in Thymeleaf 3
Refer The Thymeleaf textual syntax in https://github.com/thymeleaf/thymeleaf/issues/395
// Unescaped (actual answer)
var value = [(${data.scriptValue})];
//or
var value = [# th:utext="${data.scriptValue}"/];
// Escaped
var value = [[${data.scriptValue}]];
//or
var value = [# th:text="${data.scriptValue}"/];
It's not possible at Thymeleaf 2. As Patric LC mentioned, there are two issues reported for this.
unescaped inline for scripts/css #12
Use Jackson for Javascript inlining of JSON #81
#Faraj, new version of Thymeleaf provides this functionality. They implement features for the issues that you mentioned. You can look here:
http://www.thymeleaf.org/doc/articles/thymeleaf3migration.html
The main features:
Three textual template modes: TEXT, JAVASCRIPT and CSS.
New syntax for elements in textual template modes: [# ...] ... [/].
Inlined output expressions allowed, both escaped ([[...]]) and unescaped ([(...)]).
Intelligent escaping of JavaScript (as literals) and CSS (as identifiers).
Parser-level (/*[- ... -]*/) and prototype-only (/*[+ ... +]*/) comment blocks.
Natural templates applied to JAVASCRIPT scripts and CSS style sheets by means of wrapping elements and/or output expressions inside comments (/*[# ...]*/).

freemarker display ${..} as html rather than string

I get html code from server to build freemarker.ftl.
Example:
Server return:
String htmlCode="<h1>Hello</h1>";
freemarker.ftl
${htmlCode}
except:Hello
actually: <h1>Hello</h1>
what can i do?
By default FreeMarker has no auto-escaping on, so it should print that value as HTML. But as it doesn't as you say, I can imagine two possibilities:
You are inside <#escape x as x?html>...</#escape>, or that was added to the template by a custom TemplateLoader. In that case, in 2.3.x you have to write <#noescape>${htmlCode}</#noescape>. (In 2.4 it will be much less verbose if everything goes as planned.)
That value was escaped before it reaches FreeMarker. So the template already gets <h1>Hello</h1> as the string.

Part after # is missing from the request parameter value

I did a hello world web application in Java on Tomcat container. I have a query string
code=askdfjlskdfslsjdflksfjl#_=_
with underscores on both sides of = in the URL. When I tried to retrieve the query string in the servlet by request.getParameter("code"), I get only askdfjlskdfslsjdflksfjl. The part after # is missing.
How is this caused and how can I solve it?
That's because the part of the url after # is not a part of the query.
Section 3.4 of approprate RFC says:
The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
The # is only interpreted by the browser, not the server. If you want to pass the # character to the server, you must URLEncode it.
Example:
URLEncoder.encode("code=askdfjlskdfslsjdflksfjl#=", "UTF-8");
Please read the percent encoding on Wikipedia. The # and = are reserved characters in URLs. Only unreserved characters can be used plain in URLs, all other characters are supposed to be URL-encoded. The URL-encoded value of a # is %23 and = is %3D. So this should do:
code=askdfjlskdfslsjdflksfjl%23_%3D_
If this actually originates from a HTML <a> link in some JSP like so:
some link
then you should actually have changed it to use JSTL's <c:url>:
<c:url var="servletUrlWithParam" value="servletUrl">
<c:param name="code" value="askdfjlskdfslsjdflksfjl#_=_" />
</c:url>
some link
so that it get generated as
some link
Note that this is not related to Java/Servlets per-se, this applies to every web application.

Categories