My Java project is based on the WebView component.
Now I want to call a JS function with a single String argument.
To do this, I'm using this simple code:
webEngine.executeScript("myFunc('" + str + "');");
*The str text comes from a textarea.
This solution works, but it is not safe enough.
Sometimes we get netscape.javascript.JSException: SyntaxError: Unexpected EOF
So, how should I handle str to avoid the exception?
Letfar's answer will work in most cases, but not all, and if you're doing this for security reasons, it's not sufficient. First, backslashes need to be escaped as well. Second, the line.separator property is the server side's EOL, which will only coincidentally be the same as the client side's, and you're already escaping the two possibilities, so the second line isn't necessary.
That all being said, there's no guarantee that some other control or non-ASCII character won't give some browser problems (for example, see the current Chrome nul in a URL bug), and browsers that don't recognize JavaScript (think things like screenreaders and other accessibility tools) might try to interpret HTML special characters as well, so I normally escape [^ -~] and [\\'"&<>] (those are regular expression character classes meaning all characters not between space and tilde inclusive; and backslash, single quote, double quote, ampersand, less than, greater than). Paranoid? A bit, but if str is a user-entered string (or is calculated from a user-entered string), you need to be a bit paranoid to avoid a security vulnerability.
Of course the real answer is to use some open source package to do the escaping, written by someone who knows security, or to use a framework that does it for you.
I have found this quick fix:
str = str.replace("'", "\\'");
str = str.replace(System.getProperty("line.separator"), "\\n");
str = str.replace("\n", "\\n");
str = str.replace("\r", "\\n");
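The points raised in the answers above (escape backslashes too, treat \r and \n independently of line.separator, and escape everything outside the printable ASCII range) can be combined into a single pass. The following is only a minimal sketch: the class name JsStringEscaper and the method escape are made up for illustration, and the character policy is the [^ -~] plus [\\'"&<>] rule suggested above, not a vetted security library.

```java
final class JsStringEscaper {
    private JsStringEscaper() {}

    /** Escapes text so it can be placed between quotes in a JavaScript string literal. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash must be escaped first-class
                case '\'': out.append("\\'");  break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                default:
                    // Anything outside space..tilde, plus HTML-special characters,
                    // is written as a \\uXXXX escape (assumed policy, see lead-in).
                    if (c < ' ' || c > '~' || c == '&' || c == '<' || c == '>') {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "it's a\nmulti-line \\ value";
        // The resulting script text is what would be passed to webEngine.executeScript(...)
        System.out.println("myFunc('" + escape(str) + "');");
    }
}
```

With this, the call from the question would become webEngine.executeScript("myFunc('" + JsStringEscaper.escape(str) + "');"). As the answer above says, a maintained escaping library is still the safer choice where one is available.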
I have a web service running, and Android devices read data from it. The data I want to send is escaped with backslashes by the server, to avoid hacking issues. Once it's escaped, it is saved into the database.
But when I read this data back, it is returned like this:
"Baba O\'Riley" instead of "Baba O'Riley".
I think that's pretty much "correct", and what I have to do is strip the backslashes from the string I get, with a function like stripslashes in PHP.
http://es1.php.net/manual/es/function.stripslashes.php
However, I couldn't find any function to do this in Java.
Any idea?
You can use the String.replace() method. See "String replace a Backslash" and "How to replace backward slash to forward slash using java?"
String replacedStr = stringname.replace("\\", "");
I'm having trouble finding a genuinely suitable Java function for this, but this is at least better than the currently accepted answer:
String replacedStr = stringname.replace("\\\\", "!~!").replace("\\", "").replace("!~!", "\\");
(Replace !~! with some sequence of characters that's sufficiently unlikely to appear in the string)
This method works by replacing double backslashes with a sufficiently uncommon marker for safekeeping, stripping all backslashes, then changing the uncommon marker back to a single backslash. It's slower than the state machine that PHP uses, since it makes three passes, but that's unlikely to make a noticeable difference.
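The same result can be had in one pass, without a placeholder that might collide with real data. This is a minimal sketch of the state-machine idea mentioned above; the class and method names are invented for illustration, and it has not been checked against every edge case of PHP's stripslashes.

```java
final class SlashStripper {
    private SlashStripper() {}

    /** Removes one level of backslash escaping: \' becomes ' and \\ becomes \. */
    static String stripSlashes(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean escaped = false; // true when the previous char was an unconsumed backslash
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!escaped && c == '\\') {
                escaped = true;      // drop this backslash, keep whatever follows it
            } else {
                out.append(c);       // ordinary char, or the char that was escaped
                escaped = false;
            }
        }
        return out.toString();       // a lone trailing backslash is dropped
    }

    public static void main(String[] args) {
        System.out.println(stripSlashes("Baba O\\'Riley"));
    }
}
```

Because each backslash only "consumes" the character right after it, a doubled backslash collapses to a single one without any marker string.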
\n,\r, \t,\v,\f,\e....
When dealing with escape sequences like those listed above, the Java call
String replacedStr = stringname.replace("\\", "");
behaves differently from the PHP function.
I have a string like this
"\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL"
When I put it in the browser console, it automatically becomes something else:
"\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL"
"'ö,úìHL"
If I call charAt(x) on this string, I get:
"\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL".charAt(0)
"'"
"\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL".charAt(1)
""
"\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL".charAt(2)
"ö"
which IS what I want.
Now I want to implement a Java program that reads the string the same way as in browser.
The problem is, Java does not recognize the way this string is encoded. Instead, it treats it as a normal string:
"\\x27\\x18\\xf6,\\x03\\x12\\x8e\\xfa\\xec\\x11\\x0dHL".charAt(0) == '\'
"\\x27\\x18\\xf6,\\x03\\x12\\x8e\\xfa\\xec\\x11\\x0dHL".charAt(1) == 'x'
"\\x27\\x18\\xf6,\\x03\\x12\\x8e\\xfa\\xec\\x11\\x0dHL".charAt(2) == '2'
What kind of encoding is this? Which encoding uses the \x prefix?
Is there a way to read it properly (get the same result as in browser)?
Update: I found a solution. I guess it is not the best, but it works for me:
StringEscapeUtils.unescapeJava("\\x27\\x18\\xf6,\\x03\\x12\\x8e\\xfa\\xec\\x11\\x0dHL".replace("\\x", "\\u00"))
thank you all for your replies :)
especially Ricardo Cacheira
Thank you
\x03 is a character written as its hexadecimal ASCII value,
so "\x30\x31" is the same as "01".
See http://www.asciitable.com
Another thing: when you copy your string without the quotation marks and paste it, your IDE converts every \ to \\.
Java strings use Unicode escapes, so "\x30\x31" in Java is "\u0030\u0031".
You can't use the escape sequences \u000a and \u000d in a Java string literal; convert them to \n and \r respectively.
So this "\u0027\u0018\u00f6,\u0003\u0012\u008e\u00fa\u00ec\u0011\rHL" is the conversion for Java of this: "\x27\x18\xf6,\x03\x12\x8e\xfa\xec\x11\x0dHL"
Apache Commons provides a helper for this:
StringEscapeUtils.unescapeJava(...)
Unescapes any Java literals found in the String. For example, it will turn a sequence of '\' and 'n' into a newline character, unless the '\' is preceded by another '\'.
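If pulling in a library is not an option, the \xHH sequences can also be decoded directly with java.util.regex. This is a rough sketch under some assumptions: the class name HexEscapeDecoder is hypothetical, only two-digit \x escapes are handled, and an escaped backslash in front of an x (as in \\x41) is not treated specially.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HexEscapeDecoder {
    // A literal backslash, an 'x', then exactly two hex digits.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\x([0-9A-Fa-f]{2})");

    private HexEscapeDecoder() {}

    /** Replaces every \xHH sequence with the character whose code is HH (hex). */
    static String decode(String input) {
        Matcher m = HEX_ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());                    // text before the escape
            out.append((char) Integer.parseInt(m.group(1), 16));   // the decoded character
            last = m.end();
        }
        out.append(input, last, input.length());                   // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "\\x27\\x18\\xf6,\\x03\\x12\\x8e\\xfa\\xec\\x11\\x0dHL";
        String decoded = decode(raw);
        System.out.println(decoded.charAt(0)); // the apostrophe, as in the browser console
    }
}
```

The decoded string then behaves like the one in the browser console: charAt(0) is the apostrophe and charAt(2) is ö.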
Good morning. I realize there are a ton of questions out there regarding replace() and replaceAll(), but I haven't seen this one.
What I'm looking to do is parse a string (which contains valid HTML up to a point); then, after I see the second instance of <p> in the string, I want to remove everything that starts with & and ends with ; until I see the next </p>.
To do the second part I was hoping to use something along the lines of s.replaceAll("&*;","")
That doesn't work, but hopefully it gets my point across: I am looking to replace anything that starts with & and ends with ;
You should probably leave the parsing to a DOM parser (see this question). I can almost guarantee you'll have to do this to find text within the <p> tags.
For the replacement logic, String.replaceAll uses regular expressions, which can do the matching you want.
The "wildcard" in regular expressions that you want is the .* expression. Using your example:
String ampStr = "This &escape;String";
String removed = ampStr.replaceAll("&.*;", "");
System.out.println(removed);
This outputs This String. This is because the . represents any character, and the * means "this character 0 or more times." So .* basically means "any number of characters." However, feeding it:
"This &escape;String &anotherescape;Extended"
will probably not do what you want, and it will output This Extended. To fix this, you specify exactly what you want to look for instead of the . character. This is done using [^;], which means "any character that's not a semicolon":
String removed = ampStr.replaceAll("&[^;]*;", "");
This has performance benefits over &.*?; for non-matching strings, so I highly recommend using this version, especially since not all HTML files will contain a &abc; token and the &.*?; version can have huge performance bottle-necks as a result.
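The difference between the greedy pattern and the negated character class is easy to see side by side. A small sketch (the class and method names are made up for illustration) using the two-entity input from above:

```java
final class EntityStripper {
    private EntityStripper() {}

    /** Greedy: matches from the first '&' to the LAST ';' in the string. */
    static String stripGreedy(String s) {
        return s.replaceAll("&.*;", "");
    }

    /** Negated class: each match stops at the first ';' after its '&'. */
    static String stripEntities(String s) {
        return s.replaceAll("&[^;]*;", "");
    }

    public static void main(String[] args) {
        String input = "This &escape;String &anotherescape;Extended";
        System.out.println(stripGreedy(input));   // prints "This Extended"
        System.out.println(stripEntities(input)); // prints "This String Extended"
    }
}
```

Note that this only covers the replacement step; finding the text between the second <p> and the next </p> is still better left to a parser, as suggested above.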
The expression you want is:
s.replaceAll("&.*?;","");
But do you really want to be parsing HTML this way? You may be better off using an XML parser.
I need a simple way to implement the contains function using matches. I believe this is my starting point:
xxx.matches("'.*yyy.*'");
But I need to make it a universal method and pre-process whatever I search for to be accepted by matches! This must be done using only the escape '\' character!
Imagine a string SEARCH_FOR that can contain some special characters that must be "regex escaped"...
String SEARCH_FOR="*.\\";
xxx.matches("'.*" + SEARCH_FOR + ".*'");
Are there any catches? Special situations? Any other "special" chars that should be taken into account?
Are you looking for Pattern.quote(String) ?
This escapes special characters for you.
EDIT:
After reading the comments, I really hope you try Pattern.quote(yourString.toLowerCase()) as it sounds like you've been using Pattern.quote(yourString).toLowerCase(). If DataNucleus is applying the regex then there should be no problems with using the \Q and \E escape sequence.
Since you have really asked for it, ".\\".replaceAll("(\\.|\\$|\\+|\\*|\\\\)", "\\\\$1") outputs \.\\
This will escape .'s, $'s, +'s, *'s and \'s. Note that the security of this is now all upon you. If you don't escape something you needed to, or you escape it incorrectly, you will either allow people to use regex inside the search term when you weren't expecting it, or it won't return the results that you were expecting.
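For the plain-Java case, a "contains via matches" helper built on Pattern.quote might look like the sketch below. The names are invented for illustration, the surrounding single quotes from the question's query syntax are left out, and the (?s) flag is an added assumption so that line breaks in the searched text don't stop .* from matching.

```java
import java.util.regex.Pattern;

final class LiteralContains {
    private LiteralContains() {}

    /** Returns true if haystack contains needle, treating needle as literal text. */
    static boolean containsViaMatches(String haystack, String needle) {
        // Pattern.quote wraps the needle in \Q...\E so no character in it is special.
        return haystack.matches("(?s).*" + Pattern.quote(needle) + ".*");
    }

    public static void main(String[] args) {
        String searchFor = "*.\\";
        System.out.println(containsViaMatches("abc*.\\def", searchFor)); // prints true
        System.out.println(containsViaMatches("abcdef", "."));            // prints false
    }
}
```

If the regex engine on the other side does not understand \Q and \E, the hand-rolled escaping shown above is the fallback, with the caveats already mentioned.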
I am using Matcher.appendReplacement() and it worked great until my replacement string had a $2 in it:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
Is there a convenience method somewhere that will escape all backslashes \ and dollar signs $ with a backslash? Or do I have to write one myself? It sounds like it's not that hard, just would be nice if they gave you one >:(
edit: since they do give you one, I need to replace(">:(", ":-)");
Use Matcher.quoteReplacement on the replacement string.
Unfortunately "ease of use" in this case conflicts with strong typing. [Explanation: An object of Java static type java.lang.String is any immutable sequence of chars. It doesn't tell you the format of that raw data. In this scenario we have text probably meaningful to the user, text encoded in a mini-language for replacement and text encoded in a mini-language for the pattern. The Java type system has no way of distinguishing these (although you can do fun things with annotation-based type checkers, often to avoid XSS or SQL/command injection vulnerabilities). For the pattern mini-language you can do a form of conversion with Pattern.compile, although that is a specific use and most API methods ignore it (for ease of use). An equivalent ReplacementText.compile could be written. Further, you could ignore the mini-languages and go for libraries as "DSLs". But all this doesn't help casual ease of use.]
Here's another option:
matcher.appendReplacement(stringbuffer, "");
stringbuffer.append(replacement);
appendReplacement() handles the job of copying over the text between the matches, then StringBuffer#append() adds your replacement text sans adulterations. This is especially handy if you're generating the replacement text dynamically, as in Elliott Hughes' Rewriter.
I got it to work with the following, but I like Tom Hawtin's solution better :-)
private static Pattern escapePattern = Pattern.compile("\\$|\\\\");
replacement = escapePattern.matcher(replacement).replaceAll("\\\\$0");
matcher.appendReplacement(stringbuffer, replacement);
Tom's solution:
matcher.appendReplacement(stringbuffer, Matcher.quoteReplacement(replacement));
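Putting the two library-based approaches from this thread into a complete find loop gives something like the following sketch. The class and method names are hypothetical; both methods insert the replacement text literally, so a $2 or a backslash in it is copied as-is.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralReplacement {
    private LiteralReplacement() {}

    /** Variant 1: quote the replacement so '$' and '\' lose their special meaning. */
    static String replaceQuoted(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    /** Variant 2: append an empty replacement, then add the raw text ourselves. */
    static String replaceAppended(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(sb, ""); // copies the text between matches only
            sb.append(replacement);            // appended verbatim, nothing is interpreted
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceQuoted("price: X", "X", "$2"));   // prints "price: $2"
        System.out.println(replaceAppended("price: X", "X", "$2")); // prints "price: $2"
    }
}
```

Without either precaution, the "$2" replacement would be read as a reference to capture group 2, which this pattern does not have, and appendReplacement would throw.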