I have a web service running, with Android devices reading data from it. The data I want to send is slash-escaped by the server to avoid hacking issues. Once it's escaped, it's saved into the database.
But when I'm reading this data again, it's being returned like this:
"Baba O\'Riley" instead of "Baba O'Riley".
I think that's technically "correct", and that what I have to do is strip the backslashes from the string I get, with a function like stripslashes in PHP.
http://es1.php.net/manual/es/function.stripslashes.php
However, I couldn't find any function to do this in Java.
Any idea?
You can use the String.replace() method. See "String replace a Backslash" and "How to replace backward slash to forward slash using java?"
String replacedStr = stringname.replace("\\", "");
I'm having trouble finding a genuinely suitable Java function for this, but this is at least better than the currently accepted answer:
String replacedStr = stringname.replace("\\\\", "!~!").replace("\\", "").replace("!~!", "\\");
(Replace !~! with some sequence of characters that's sufficiently unlikely to appear in the string)
This method works by replacing double backslashes with a sufficiently uncommon marker for safekeeping, stripping all backslashes, then changing the uncommon marker back to a single backslash. It's slower than the state machine that PHP uses, since it makes three passes, but that's unlikely to make a noticeable difference.
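For comparison, the single-pass approach mentioned above could be sketched roughly as follows. This is only an illustration: the names SlashStripper and stripSlashes are made up for this example, and it only applies the basic rule (drop a backslash and keep whatever character follows it), so it is not guaranteed to match PHP's stripslashes in every corner case.

```java
final class SlashStripper {
    /** Removes one level of backslash escaping in a single pass. */
    static String stripSlashes(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                // Drop the backslash and copy the escaped character verbatim,
                // so a doubled backslash collapses to a single one.
                if (i + 1 < s.length()) {
                    out.append(s.charAt(++i));
                }
                // A lone trailing backslash is simply dropped.
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the escaped character is consumed together with its backslash, no placeholder sequence is needed to protect doubled backslashes.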
\n,\r, \t,\v,\f,\e....
When dealing with the characters listed above, the Java call
String replacedStr = stringname.replace("\\", "");
behaves differently from the PHP function.
My Java project is based on the WebView component.
Now I want to call a JS function with a single String argument.
To do this, I'm using this simple code:
webEngine.executeScript("myFunc('" + str + "');");
(str is the text taken from a textarea.)
This solution works, but it's not safe enough.
Sometimes we get netscape.javascript.JSException: SyntaxError: Unexpected EOF
So how should I handle str to avoid the exception?
Letfar's answer will work in most cases, but not all, and if you're doing this for security reasons, it's not sufficient. First, backslashes need to be escaped as well. Second, the line.separator property is the server side's EOL, which will only coincidentally be the same as the client side's, and you're already escaping the two possibilities, so the second line isn't necessary.
That all being said, there's no guarantee that some other control or non-ASCII character won't give some browser problems (for example, see the current Chrome nul in a URL bug), and browsers that don't recognize JavaScript (think things like screenreaders and other accessibility tools) might try to interpret HTML special characters as well, so I normally escape [^ -~] and [\'"&<>] (those are regular expression character ranges meaning all characters not between space and tilde inclusive; and backslash, single quote, double quote, ampersand, less than, greater than). Paranoid? A bit, but if str is a user entered string (or is calculated from a user entered string), you need to be a bit paranoid to avoid a security vulnerability.
Of course the real answer is to use some open source package to do the escaping, written by someone who knows security, or to use a framework that does it for you.
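If you do end up rolling it yourself anyway, the rule described above (escape everything outside space..tilde, plus backslash, quotes, ampersand and angle brackets) could be sketched like this. The class and method names are hypothetical, and this is an illustration of the rule rather than a vetted security library:

```java
final class JsStringEscaper {
    /**
     * Escapes a value for use inside a quoted JavaScript string literal.
     * Every char outside the printable ASCII range, and each of the
     * characters backslash, quote, double quote, ampersand, less-than and
     * greater-than, is written as a four-digit JavaScript unicode escape.
     */
    static String escapeForJsLiteral(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean printableAscii = c >= ' ' && c <= '~';
            boolean special = c == '\\' || c == '\'' || c == '"'
                    || c == '&' || c == '<' || c == '>';
            if (!printableAscii || special) {
                // Works per UTF-16 code unit, so surrogate pairs become two escapes.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With that in place, the call from the question would become something like webEngine.executeScript("myFunc('" + JsStringEscaper.escapeForJsLiteral(str) + "');"), and newlines or quotes in str can no longer terminate the literal early.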
I have found this quick fix:
str = str.replace("'", "\\'");
str = str.replace(System.getProperty("line.separator"), "\\n");
str = str.replace("\n", "\\n");
str = str.replace("\r", "\\n");
I am building a web application and would like to have SEO-friendly links such as the following:
http://somesite.org/user-entered-title
The above user-entered-title is extracted from user-created records that have a field called title.
I am wondering whether there is any Java library for cleaning up such user-entered text (remove spaces, for example) before displaying it in a URL.
My target text is something such as "stackoverflow-is-great" after cleanup from user-entered "stackoverflow is great".
I am able to write code to replace spaces in a string with dashes, but I'm not sure what other rules, ideas, or best practices exist for making text part of a URL.
Please note that user-entered-title may be in different languages, not just English.
Thanks for any input and pointers!
Regards.
What you want is some kind of "slugifying" of the phrase into a URL, so that it is SEO-friendly.
When I had that problem, I ended up using a solution provided at maddemcode.com. Below you'll find its adapted code.
The trick is to properly use the JDK's Normalizer class with a little additional cleanup. The usage is simple:
// casingchange-aeiouaeiou-takesexcess-spaces
System.out.println(slugify("CaSiNgChAnGe áéíóúâêîôû takesexcess spaces "));
// these-are-good-special-characters-sic
System.out.println(slugify("These are good Special Characters šíč"));
// some-exceptions-123-aeiou
System.out.println(slugify(" some exceptions ¥123 ã~e~iõ~u!##$%¨&*() "));
// gonna-accomplish-yadda
System.out.println(slugify("gonna accomplish, yadda, 완수하다, 소양양)이 있는 "));
Function code:
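import java.text.Normalizer;
import java.util.Locale;

public static String slugify(String input) {
    return Normalizer.normalize(input, Normalizer.Form.NFD)       // split accented letters into letter + mark
            .replaceAll("[^\\p{ASCII}]", "")                      // drop the marks and all other non-ASCII
            .replaceAll("[^ \\w]", "").trim()                     // keep only word characters and spaces
            .replaceAll("\\s+", "-").toLowerCase(Locale.ENGLISH); // runs of whitespace become one dash
}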
In the source page (http://maddemcode.com/java/seo-friendly-urls-using-slugify-in-java/) you can take a look at where this comes from. The small snippet above, though, works the same.
As you can see, there are some exceptional chars that aren't converted. To my knowledge, everyone who translates them uses some kind of map, like Django's urlify. If you need them, I believe your best bet is making one.
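A minimal sketch of what such a map could look like, applied before the normalization step. The four entries are illustrative assumptions picked for this example (they are not taken from Django's table), and MappedSlugifier is a made-up name:

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class MappedSlugifier {
    // Hypothetical, deliberately tiny replacement table; extend it as needed.
    private static final Map<String, String> EXTRA = new LinkedHashMap<>();
    static {
        EXTRA.put("\u00df", "ss"); // sharp s
        EXTRA.put("\u00e6", "ae"); // ae ligature
        EXTRA.put("\u00f8", "o");  // o with stroke
        EXTRA.put("\u0111", "d");  // d with stroke
    }

    static String slugify(String input) {
        // Lowercase first so the table only needs lowercase keys.
        String s = input.toLowerCase(Locale.ENGLISH);
        for (Map.Entry<String, String> e : EXTRA.entrySet()) {
            s = s.replace(e.getKey(), e.getValue());
        }
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "")
                .replaceAll("[^ \\w]", "").trim()
                .replaceAll("\\s+", "-");
    }
}
```

These characters have no canonical decomposition, which is why Normalizer alone drops them instead of turning them into a base letter.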
It seems you want to URL-encode a string. It's possible in core Java, without using external libraries. URLEncoder is the class you need.
Languages other than English shouldn't be a problem as the class allows you to specify the character encoding, which takes care of special characters like accents, etc.
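A small sketch of that, with a made-up wrapper class (TitleEncoder). Note that URLEncoder produces form-style encoding, so spaces come out as "+" and non-ASCII characters as percent-escaped UTF-8 bytes; it keeps the text intact rather than turning it into a dash-separated slug:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class TitleEncoder {
    /** URL-encodes a user-entered title as UTF-8. */
    static String encode(String title) {
        try {
            return URLEncoder.encode(title, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should not happen: every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```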
Is there any real way to represent a URL (which more than likely will also have a query string) as a filename in Java without obscuring the original URL completely?
My first approach was to simply escape invalid characters with arbitrary replacements (for example, replacing "/" with "_", etc).
The problem with replacements such as underscores is that a URL like "app/my_app" would become "app_my_app", which makes it impossible to recover the original URL.
I have also attempted to encode all the special characters, however again, seeing crazy %3e %20 etc is really not clear.
Thank you for any suggestions.
Well, you should know what you want here, exactly. Keep in mind that the restrictions on file names vary between systems. On a Unix system you probably only need to escape the virgule somehow, whereas on Windows you need to take care of the colon and the question mark as well.
I guess the safest thing would be to percent-encode anything that could potentially clash (everything non-alphanumeric would be a good candidate, although you might adapt this to the platform). It's still somewhat readable and you're guaranteed to get the original URL back.
Why? URL-encoding is already defined in an RFC: there's not much point in reinventing it. Basically you must have an escape character such as %, otherwise you can't tell whether a character represents itself or an escape. E.g. in your example app_my_app could represent app/my/app. You therefore also need a double-escape convention so you can represent the escape character itself. It is not simple.
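To make that concrete, here is a rough sketch of percent-encoding a URL into a file name and back. The class name and the choice of "safe" characters (ASCII letters, digits, '-', '_' and '.') are assumptions for this example; adjust the safe set to your platform:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class UrlFileNames {
    /** Percent-encodes every byte that is not in the small safe set. */
    static String toFileName(String url) {
        StringBuilder sb = new StringBuilder();
        for (byte b : url.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            boolean safe = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '-' || v == '_' || v == '.';
            if (safe) {
                sb.append((char) v);
            } else {
                // '%' itself is not in the safe set, so it is escaped too.
                sb.append('%').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    /** Reverses toFileName. */
    static String toUrl(String fileName) {
        try {
            // A literal '+' never appears in our file names (it is written as %2B),
            // so URLDecoder's '+'-to-space rule does not interfere.
            return URLDecoder.decode(fileName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here "app/my_app" becomes "app%2Fmy_app" while "app_my_app" stays as it is, so the two no longer collide.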
Is there any way in Java to use a special delimiter at the start and the end of a String to avoid having to backslash all of the quotes within that String?
i.e. not have to do this:
String s = "Quote marks like this \" are just the best, here are a few more \" \" \"";
No, there is no such option. Sorry.
No - there's nothing like C#'s verbatim string literals or Groovy's slashy strings, for example.
On the other hand, it's the kind of feature which may be included in the future. It's not like it would require any fundamental changes in the type system. I'd be hugely surprised for it to make it into Java 7 this late in the day though, and I haven't seen any suggestions that it'll be in Java 8... so you're in for a long wait :(
The only way to achieve this is to put your strings in some other file and read them from Java, for instance from a resource bundle.
It's not possible as of now, and may not be in the future either.
If you can tell us what you are looking for and why you need this kind of feature, we can definitely suggest some more alternatives.
I'm writing a small app that reads some input and does something based on that input.
Currently I'm looking for a line that ends with, say, "magic", so I would use String's endsWith method. It's pretty clear to whoever reads my code what's going on.
Another way to do it is to create a Pattern and try to match a line that ends with "magic". This is also clear, but I personally think it's overkill because the pattern I'm looking for is not complex at all.
When do you think it's worth using regex in Java? If it's a matter of complexity, how would you personally define what's complex enough?
Also, are there times when using a Pattern is actually faster than string manipulation?
EDIT: I'm using Java 6.
Basically: if there is a non-regex operation that does what you want in one step, always go for that.
This is not so much about performance, but about a) readability and b) compile-time-safety. Specialized non-regex versions are usually a lot easier to read than regex-versions. And a typo in one of these specialized methods will not compile, while a typo in a Regex will fail miserably at runtime.
Comparing regex-based solutions to non-regex-based solutions:
String s = "Magic_Carpet_Ride";
s.startsWith("Magic"); // non-regex
s.matches("Magic.*"); // regex
s.contains("Carpet"); // non-regex
s.matches(".*Carpet.*"); // regex
s.endsWith("Ride"); // non-regex
s.matches(".*Ride"); // regex
In all these cases it's a No-brainer: use the non-regex version.
But when things get a bit more complicated, it depends. I guess I'd still stick with non-regex in the following case, but many wouldn't:
// Test whether a string ends with "magic" in any case,
// followed by optional white space
s.toLowerCase().trim().endsWith("magic"); // non-regex, 3 calls
s.matches(".*(?i:magic)\\s*"); // regex, 1 call, but ugly
And in response to RegexesCanCertainlyBeEasierToReadThanMultipleFunctionCallsToDoTheSameThing:
I still think the non-regex version is more readable, but I would write it like this:
s.toLowerCase()
 .trim()
 .endsWith("magic");
Makes the whole difference, doesn't it?
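If you want to try both variants side by side, a small sketch could look like this (class and method names are made up). Two details differ from the snippets above and are my own choices: the pattern is compiled once, because String.matches recompiles the expression on every call, and Locale.ROOT is passed to toLowerCase so the result doesn't depend on the default locale:

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class EndsWithMagic {
    // Compiled once and reused across calls.
    private static final Pattern MAGIC_AT_END = Pattern.compile(".*(?i:magic)\\s*");

    /** Non-regex variant: three chained String calls. */
    static boolean plain(String s) {
        return s.toLowerCase(Locale.ROOT)
                .trim()
                .endsWith("magic");
    }

    /** Regex variant: one match against the precompiled pattern. */
    static boolean regex(String s) {
        return MAGIC_AT_END.matcher(s).matches();
    }
}
```

The two agree on ordinary single-line input, but they are not strictly equivalent: "." does not match line terminators by default, so multi-line strings can give different answers.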
You would use Regex when the normal manipulations on the String class are not enough to elegantly get what you need from the String.
A good indicator that this is the case is when you start splitting, then splitting those results, then splitting those results. The code is getting unwieldy. Two lines of Pattern/Regex code can clean this up, neatly wrapped in a method that is unit tested....
Anything that can be done with regex can also be hand-coded.
Use regex if:
Doing it manually is going to take more effort without much benefit.
You can easily come up with a regex for your task.
Don't use regex if:
It's very easy to do it otherwise, as in your example.
The string you're parsing does not lend itself to regex. (it is customary to link to this question)
I think you are best with using endsWith. Unless your requirements change, it's simpler and easier to understand. Might perform faster too.
If there were a bit more complexity, such as wanting to match "magic" or "majik", but not "Magic" or "Majik"; or wanting to match "magic" followed by a space and then one word, such as "... magic spoon" but not "...magic soup spoon", then I think regex would be a better way to go.
Any complex parsing where you are generating a lot of Objects would be better done with RegEx when you factor in both computing power, and brainpower it takes to generate the code for that purpose. If you have a RegEx guru handy, it's almost always worthwhile as the patterns can easily be tweaked to accommodate for business rule changes without major loop refactoring which would likely be needed if you used pure java to do some of the complex things RegEx does.
If your basic line ending is the same every time, such as with "magic", then you are better off using endsWith.
However, if you have a line that has the same base, but can have multiple values, such as:
<string> <number> <string> <string> <number>
where the strings and numbers can be anything, you're better off using regex.
Your lines are always ending with a string, but you don't know what that string is.
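As a rough illustration of the line format above, assuming the fields are separated by single spaces, strings contain no whitespace, and numbers are plain digits (the class name is made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FiveFieldLine {
    // <string> <number> <string> <string> <number>, separated by single spaces.
    private static final Pattern LINE =
            Pattern.compile("(\\S+) (\\d+) (\\S+) (\\S+) (\\d+)");

    /** Returns the five fields, or null if the line does not have that shape. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[5];
        for (int i = 0; i < 5; i++) {
            fields[i] = m.group(i + 1); // capture groups are numbered from 1
        }
        return fields;
    }
}
```

One pattern both validates the shape of the line and extracts the values, which would otherwise take a split plus several manual checks.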
If it's as simple as endsWith, startsWith or contains, then you should use these functions. If you are processing more "complex" strings and you want to extract information from these strings, then regexp/matchers can be used.
If you have something like "commandToRetrieve someNumericArgs someStringArgs someOptionalArgs" then regexp will ease your task a lot :)
I'd never use regexes in Java if I have an easier way to do it, like the endsWith method in this case. Regexes in Java are as ugly as they get, probably with the only exception of the matches method on String.
Usually, avoiding regexes makes your code more readable and easier for other programmers. Conversely, complex regexes might confuse even the most experienced hackers out there.
As for performance concerns: just profile. Especially in Java.
If you are familiar with how regexp works you will soon find that a lot of problems are easily solved by using regexp.
Personally I look to using java String operations if that is easy, but if you start splitting strings and doing substring on those again, I'd start thinking in regular expressions.
And again, if you use regular expressions, why stop at lines? By configuring your regexp you can easily read entire files with one regular expression (pass Pattern.DOTALL as a parameter to Pattern.compile and your regexp won't stop at newlines). I'd combine this with the Apache Commons IOUtils.toString() methods and you've got something very powerful to do quick stuff with.
I would even bring out a regular expression to parse some xml if needed. (For instance in a unit test, where I want to check that some elements are present in the xml).
For instance, from some unit test of mine:
Pattern pattern = Pattern.compile(
"<Monitor caption=\"(.+?)\".*?category=\"(.+?)\".*?>"
+ ".*?<Summary.*?>.+?</Summary>"
+ ".*?<Configuration.*?>(.+?)</Configuration>"
+ ".*?<CfgData.*?>(.+?)</CfgData>", Pattern.DOTALL);
which will match all segments in this xml and pick out some segments that I want to do some sub matching on.
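A stripped-down sketch of the same idea, using a made-up element name (note) rather than the real format from that test. With Pattern.DOTALL the dot also matches line terminators, so a single find loop can pick out elements that span several lines:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultiLineMatch {
    // Lazy quantifiers keep each match from running into the next element.
    private static final Pattern NOTE =
            Pattern.compile("<note id=\"(.+?)\">(.*?)</note>", Pattern.DOTALL);

    /** Returns "id=body" for every note element found in the text. */
    static List<String> notes(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = NOTE.matcher(text);
        while (m.find()) {
            result.add(m.group(1) + "=" + m.group(2).trim());
        }
        return result;
    }
}
```

This is fine for quick checks in a test; it is not a general XML parser.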
I would suggest using a regular expression when you know the format of an input but you are not necessarily sure of the value (or possible values) of the formatted input.
What I'm saying is: if all your input ends with, in your case, "magic", then String.endsWith() works fine (since you know that every possible input value will end with "magic").
If you have a format such as the RFC 5322 message format, you cannot assume that every email address ends with .com, so you can create a regular expression that conforms to the RFC 5322 standard for verification.
In a nutshell, if you know the format of your input data but don't know exactly what values (or possible values) you can receive, use regular expressions for validation.
There's a saying that goes:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
For a simple test, I'd proceed exactly like you've done. If you find that it's getting more complicated, then I'd consider Regular Expressions only if there isn't another way.