I have a simple SQL query where I check whether the search query matches any of my fields, using a LIKE statement. One of my fields can contain special characters, and so can the search query, so I need a way to put an escape character "\" in front of each special character.
query = "hello+Search}query"
I need the above to change to
query = "hello\+Search\}query"
Is there a simple way of doing this other than searching for each special character separately and adding the "\"? If I don't escape the characters, I get this error message:
java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 0
Thanks in advance
Decide which special characters you want to escape and just call
query.replace("}", "\\}")
You may keep all special characters you allow in some array then iterate it and replace the occurrences as exemplified.
This method replaces all regex meta characters.
public String escapeMetaCharacters(String inputString){
final String[] metaCharacters = {"\\","^","$","{","}","[","]","(",")",".","*","+","?","|","<",">","-","&","%"};
for (int i = 0 ; i < metaCharacters.length ; i++){
if(inputString.contains(metaCharacters[i])){
inputString = inputString.replace(metaCharacters[i],"\\"+metaCharacters[i]);
}
}
return inputString;
}
You could use it as query=escapeMetaCharacters(query);
Don't think that any library you would find would do anything more than that. At best it defines a complete list of specialCharacters.
There is actually a better way of doing this in a sleek manner.
String REGEX = "[\\[+\\]+:{}^~?\\\\/()><=\"!]";
StringUtils.replaceAll(inputString, REGEX, "\\\\$0");
You need to use \\ to introduce a \ into a string literal; that is you need to escape the \. (A single backslash is used to introduce special characters into a string: e.g. \t is a tab.)
query = "hello\\+Search\\}query" is what you need.
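To make the point about string literals concrete, here is a small sketch (the class and method names are made up for illustration). It shows that the doubled backslashes exist only in the source code, while the string itself holds a single backslash before each special character.

```java
public class LiteralBackslashDemo {
    // In source code each backslash is written twice; the resulting string holds one.
    static String escapedQuery() {
        return "hello\\+Search\\}query";
    }

    public static void main(String[] args) {
        String query = escapedQuery();
        System.out.println(query);          // the printed text contains single backslashes
        System.out.println(query.length()); // each backslash is counted once
        System.out.println("\t".length());  // \t is one character, a tab
    }
}
```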
I had to do the same thing in JavaScript and came up with the solution below. I think it might help someone.
function escapeSpecialCharacters(s){
let arr = s.split('');
arr = arr.map(function(d){
return d.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\'+d)
});
let reg = new RegExp(arr.join(''));
return reg;
}
let newstring = escapeSpecialCharacters("hello+Search}query");
If you want to use Java 8+ and Streams, you could do something like:
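// Needs java.util.Arrays, java.util.List and java.util.stream.Collectors.
private String escapeSpecialCharacters(String input) {
    // Arrays.asList is used here instead of Guava's Lists.newArrayList, so no extra dependency is needed.
    List<String> specialCharacters = Arrays.asList("\\","^","$","{","}","[","]","(",")",".","*","+","?","|","<",">","-","&","%");
    // split("") yields one string per character; prefix the special ones with a backslash.
    return Arrays.stream(input.split(""))
        .map(c -> specialCharacters.contains(c) ? "\\" + c : c)
        .collect(Collectors.joining());
}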
The simple version ( without deprecated StringUtils.replaceAll ):
String regex = "[\\[+\\]+:{}^~?\\\\/()><=\"!]";
String query = "hello+Search}query";
String replaceAll = query.replaceAll(regex, "\\\\$0");
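If the goal is only to have the whole query treated as literal text by the regex engine, the JDK's own Pattern.quote may be enough. The sketch below compares it with the backslash approach; the class and method names are hypothetical, and the list of characters in backslashEscape is an assumption about what counts as "special" here.

```java
import java.util.regex.Pattern;

public class QuoteQueryDemo {
    // Puts a backslash in front of each regex metacharacter (assumed list).
    static String backslashEscape(String input) {
        return input.replaceAll("[\\\\^$.|?*+()\\[\\]{}]", "\\\\$0");
    }

    // Wraps the whole input in \Q...\E so the regex engine reads it literally.
    static String quote(String input) {
        return Pattern.quote(input);
    }

    public static void main(String[] args) {
        String query = "hello+Search}query";
        System.out.println(backslashEscape(query));
        System.out.println(quote(query));
        // Both escaped forms match the original text exactly.
        System.out.println(Pattern.matches(backslashEscape(query), query));
        System.out.println(Pattern.matches(quote(query), query));
    }
}
```

Pattern.quote does not produce the hello\+Search\}query form asked for in the question, so it only fits when the result is handed straight to a regex API.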
Related
I need help making a delimiter for multiple characters
I need a String delimiter for
these characters
( ) " ; : , ? ! .
I've tried:
private String delimiter = "()\":;,?!.";
private String delimiter = "[()\":;,?!.]";
private String delimiter = "\\(\\)\"\\:\\;\\,\\?\\!\\.";
Seems I can only make them work one at a time..
Any insight is greatly appreciated.
If it matters, this is how it's going into the array:
foo = line.split(delim);
If you want to split on any of those characters, separate them with the alternation operator |. Otherwise the pattern is read as a sequence, and the string is only split where all of those characters appear together in that order.
String delimiter = "\\(|\\)|\"|\\:|\\;|\\,|\\?|\\!|\\.";
Also, you're unnecessarily escaping a few characters, this would also work:
String delimiter = "\\(|\\)|\"|:|;|,|\\?|!|\\.";
Almost there with nr. 3
@Test
public void delim() {
String delimiter = "[\\(\\)\"\\:\\;\\,\\?\\!\\.]";
String[] split = "Hello(World)How:are;You;doing,today?You!sir.I mean"
.split(delimiter);
System.out.println(Arrays.toString(split));
}
Output
[Hello, World, How, are, You, doing, today, You, sir, I mean]
You missed the square brackets.
To avoid all the quoting you may use Pattern#quote
String delimiter = "[" + Pattern.quote("()\":;,?!.") + "]";
Returns a literal pattern String for the specified String.
This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.
Metacharacters or escape sequences in the input sequence will be given no special meaning.
| is required between:
delimiter = "\\(|\\)|\"|:|;|,|\\?|!|\\."
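Whichever form is used, alternation or a character class, each delimiter is matched on its own, so two adjacent delimiters leave an empty string between them. A minimal sketch of the difference a trailing + makes (class and method names are hypothetical, and the input string is made up):

```java
import java.util.Arrays;

public class DelimiterSplitDemo {
    // Character class: split on any single one of ( ) " ; : , ? ! .
    static String[] splitEach(String line) {
        return line.split("[()\":;,?!.]");
    }

    // Same class followed by +, so a run of adjacent delimiters counts as one separator.
    static String[] splitRuns(String line) {
        return line.split("[()\":;,?!.]+");
    }

    public static void main(String[] args) {
        String line = "one(two);three";
        System.out.println(Arrays.toString(splitEach(line))); // has an empty entry between ")" and ";"
        System.out.println(Arrays.toString(splitRuns(line)));
    }
}
```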
public String getPriceString() {
String priceString = "45.0";
String[] priceStringArray = priceString.split(".");
return priceStringArray.length + "";
}
Why does this give me a 0, zero? Shouldn't this be 2?
The argument to split() is a regular expression, and dot has a special meaning in regular expressions (it matches any character).
Try priceString.split("[.]");
You need to escape . like that
String[] priceStringArray = priceString.split("\\.");
split takes regular expression as a parameter and . means any character.
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#sum
Escape . with a backslash, written as \\. in a Java string. The dot is a regex metacharacter that matches any character, so you have to escape it to have it treated as a literal dot.
String priceString = "45.0";
String[] priceStringArray = priceString.split("\\.");
String.split takes a regular expression pattern. You're passing in . which means you want to split on any character.
You could use "\\." as the pattern to split on - but personally I'd use Guava instead:
private static final Splitter DOT_SPLITTER = Splitter.on('.');
...
(If you're not already using Guava, you'll find loads of goodies in there.)
You need to escape . as \\. because . has special meaning in regex.
String priceString = "45.0";
String[] priceStringArray = priceString.split("\\.");
return priceStringArray.length + "";
Use String[] priceStringArray = priceString.split("\\.");
You will have to use escape sequence.
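For anyone wondering why the result is exactly 0 rather than an error: with an unescaped dot every character is a delimiter, so all the pieces are empty strings, and split drops trailing empty strings by default. A small sketch (hypothetical class and method names) that makes this visible by also passing a negative limit, which keeps the empty strings:

```java
public class DotSplitDemo {
    // Number of pieces produced for a given regex and limit.
    static int parts(String s, String regex, int limit) {
        return s.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String price = "45.0";
        System.out.println(parts(price, ".", 0));    // unescaped dot, trailing empties removed
        System.out.println(parts(price, ".", -1));   // unescaped dot, empties kept
        System.out.println(parts(price, "\\.", 0));  // literal dot
        System.out.println(parts(price, "[.]", 0));  // literal dot via a character class
    }
}
```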
How do I split a string in Java on Windows?
I used, e.g.:
String directory="C:\home\public\folder";
String [] dir=direct.split("\");
I want to know how to split the string in the example.
In Java, if I use split("\"), there is a syntax error.
Thanks
split() function in Java accepts regular expressions. So, what you exactly need to do is to escape the backslash character twice:
String[] dir=direct.split("\\\\");
One for Java, and one for regular expressions.
The syntax error occurs because a single backslash is the escape character in Java string literals.
In a regex, '\' is also an escape character, which is why you need to escape it there as well.
So the final result should look like this: "\\\\".
But you should use java.io.File.separator as the split character in a path.
String[] dirs = direct.split(Pattern.quote(File.separator));
thx to John
You need to escape the backslash:
direct.split("\\\\");
Once for a java string and once for the regex.
You need to escape it.
String [] dir=direct.split("\\\\");
Edit: or Use Pattern.quote method.
String [] dir=direct.split(Pattern.quote("\\"))
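Both suggestions can be checked side by side. This is only a sketch with made-up class and method names; the path is the one from the question, written as a valid Java literal.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class BackslashSplitDemo {
    // Four backslashes in source: two for the Java literal, two for the regex.
    static String[] splitWithEscapedRegex(String path) {
        return path.split("\\\\");
    }

    // Pattern.quote takes care of the regex-level escaping.
    static String[] splitWithQuote(String path) {
        return path.split(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        String directory = "C:\\home\\public\\folder";
        System.out.println(Arrays.toString(splitWithEscapedRegex(directory)));
        System.out.println(Arrays.toString(splitWithQuote(directory)));
    }
}
```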
Please, don't split using file separators.
It's highly recommended that you create a File for the directory and walk up through its parents to get the paths. It will work every time, regardless of the operating system you are working with.
Try this:
String yourDir = "C:\\home\\public\\folder";
File f = new File(yourDir);
System.out.println(f.getAbsolutePath());
while ((f = f.getParentFile()) != null) {
System.out.println(f.getAbsolutePath());
}
I guess you can use the StringTokenizer class:
String directory = "C:\\home\\public\\folder";
// The delimiter here is a plain string, not a regex, so one escaped backslash is enough.
StringTokenizer token = new StringTokenizer(directory, "\\");
while (token.hasMoreTokens()) {
    String s = token.nextToken();
}
Hopefully this will help.
final String dir = System.getProperty("user.dir");
String[] array = dir.split("[\\\\/]",-1) ;
String arrval="";
for (int i=0 ;i<array.length;i++)
{
arrval=arrval+array[i];
}
System.out.println(arrval);
It's because of the backslash. A backslash is the escape character both in Java string literals and in regular expressions, so it has to be escaped twice. Use
split("\\\\")
to split by a backslash.
String a1 = "abc bcd";
String[] separate = a1.split(" ");
String finalValue = separate[0];
System.out.println("Final string is :" + finalValue);
This will give the result as abc
split("\\\\") A backslash is used to escape, and it must be escaped once for the Java string and once for the regex.
I need to split a string based on the delimiters - and . Below is my desired output.
AA.BB-CC-DD.zip ->
AA
BB
CC
DD
zip
but my following code does not work.
private void getId(String pdfName){
String[]tokens = pdfName.split("-\\.");
}
I think you need to include the regex OR operator:
String[]tokens = pdfName.split("-|\\.");
What you have will match:
[DASH followed by DOT together] -.
not
[DASH or DOT any of them] - or .
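The difference between the two readings is easy to see by running the patterns on the file name from the question. A small sketch, with hypothetical class and method names:

```java
import java.util.Arrays;

public class DashDotSplitDemo {
    // Sequence: only splits where a dash is immediately followed by a dot.
    static String[] splitSequence(String name) {
        return name.split("-\\.");
    }

    // Alternation: splits on a dash or on a dot.
    static String[] splitEither(String name) {
        return name.split("-|\\.");
    }

    // Character class: same result as the alternation, written more compactly.
    static String[] splitClass(String name) {
        return name.split("[-.]");
    }

    public static void main(String[] args) {
        String pdfName = "AA.BB-CC-DD.zip";
        System.out.println(Arrays.toString(splitSequence(pdfName))); // the name comes back unsplit
        System.out.println(Arrays.toString(splitEither(pdfName)));
        System.out.println(Arrays.toString(splitClass(pdfName)));
    }
}
```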
Try this regex "[-.]+". The + after treats consecutive delimiter chars as one. Remove plus if you do not want this.
You can use the regex "\W". This matches any non-word character. The required line would be:
String[] tokens=pdfName.split("\\W");
The string you give split is the string form of a regular expression, so:
private void getId(String pdfName){
String[]tokens = pdfName.split("[\\-.]");
}
That means to split on any character in the [] (we have to escape - with a backslash because it's special inside []; and of course we have to escape the backslash because this is a string). (Conversely, . is normally special but isn't special inside [].)
Using Guava you could do this:
Iterable<String> tokens = Splitter.on(CharMatcher.anyOf("-.")).split(pdfName);
For two multi-character delimiters such as "AND" and "OR", alternation works too. Use word boundaries (\\b), because a bare "AND|OR" would also split inside YORK, and don't forget to trim the results.
String text ="ISTANBUL AND NEW YORK AND PARIS OR TOKYO AND MOSCOW";
String[] cities = text.split("\\bAND\\b|\\bOR\\b");
Result : cities = {"ISTANBUL ", " NEW YORK ", " PARIS ", " TOKYO ", " MOSCOW"}
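Word delimiters have one pitfall worth checking: a pattern that is just AND|OR also matches those letters inside longer words, such as the OR in YORK. The sketch below (hypothetical class and method names) compares that with a variant that requires whitespace around the keyword and consumes it, so no trimming is needed afterwards.

```java
import java.util.Arrays;

public class WordDelimiterDemo {
    // Matches the letters anywhere, including inside other words.
    static String[] naive(String text) {
        return text.split("AND|OR");
    }

    // Requires whitespace on both sides of the keyword and removes it with the delimiter.
    static String[] wholeWords(String text) {
        return text.split("\\s+(?:AND|OR)\\s+");
    }

    public static void main(String[] args) {
        String text = "ISTANBUL AND NEW YORK AND PARIS OR TOKYO AND MOSCOW";
        System.out.println(Arrays.toString(naive(text)));      // NEW YORK is cut in two
        System.out.println(Arrays.toString(wholeWords(text)));
    }
}
```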
pdfName.split("[.-]+");
[.-] -> any one of the . or - can be used as delimiter
+ sign signifies that if the aforementioned delimiters occur consecutively we should treat it as one.
I'd use Apache Commons:
import org.apache.commons.lang3.StringUtils;
private void getId(String pdfName){
String[] tokens = StringUtils.split(pdfName, "-.");
}
It'll split on any of the specified separators, as opposed to StringUtils.splitByWholeSeparator(str, separator) which uses the complete string as a separator
String[] token=s.split("[.-]");
It's better to use something like this:
s.split("[\\s\\-\\.\\'\\?\\,\\_\\#]+");
I have added a few other characters as a sample. Escaping every character inside the class is the safe choice, because you do not have to remember which ones (such as . or -) are treated specially.
In JavaScript, try this code:
var string = 'AA.BB-CC-DD.zip';
var array = string.split(/[-.]/);
You may also specify a regular expression as the argument to the split() method. See the example below:
private void getId(String pdfName){
String[]tokens = pdfName.split("-|\\.");
}
s.trim().split("[\\W]+")
should work.
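Note that split does not accept several delimiters as separate arguments: String.split takes a single regex (plus an optional int limit), so a call like pdfName.split("-", ".") does not compile. Combine the delimiters in one regex instead:
String[]tokens = pdfName.split("[-.]");
You can put as many delimiter characters inside the brackets as you want.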
If you know the string will always be in the same format, first split it on . (escaped as "\\.", because split takes a regex), which gives AA, BB-CC-DD and zip. Keep the first and last elements, then split the middle element on - to get BB, CC and DD.
Refer to the following snippet:
String[] tmp = pdfName.split("\\.");
String val1 = tmp[0];
tmp = tmp[1].split("-");
String val2 = tmp[0];
...
Is there a nice way to extract tokens that start with a pre-defined string and end with a pre-defined string?
For example, let's say the starting string is "[" and the ending string is "]". If I have the following string:
"hello[world]this[[is]me"
The output should be:
token[0] = "world"
token[1] = "[is"
(Note: the second token has a 'start' string in it)
I think you can use the Apache Commons Lang feature that exists in StringUtils:
substringsBetween(java.lang.String str,
java.lang.String open,
java.lang.String close)
The API docs say it:
Searches a String for substrings
delimited by a start and end tag,
returning all matching substrings in
an array.
The Commons Lang substringsBetween API can be found here:
http://commons.apache.org/lang/apidocs/org/apache/commons/lang/StringUtils.html#substringsBetween(java.lang.String,%20java.lang.String,%20java.lang.String)
Here is the way I would go to avoid dependency on commons lang.
public static String escapeRegexp(String regexp){
String specChars = "\\$.*+?|()[]{}^";
String result = regexp;
for (int i=0;i<specChars.length();i++){
Character curChar = specChars.charAt(i);
result = result.replaceAll(
"\\"+curChar,
"\\\\" + (i<2?"\\":"") + curChar); // \ and $ must have special treatment
}
return result;
}
public static List<String> findGroup(String content, String pattern, int group) {
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(content);
List<String> result = new ArrayList<String>();
while (m.find()) {
result.add(m.group(group));
}
return result;
}
public static List<String> tokenize(String content, String firstToken, String lastToken){
String regexp = lastToken.length()>1
?escapeRegexp(firstToken) + "(.*?)"+ escapeRegexp(lastToken)
:escapeRegexp(firstToken) + "([^"+lastToken+"]*)"+ escapeRegexp(lastToken);
return findGroup(content, regexp, 1);
}
Use it like this :
String content = "hello[world]this[[is]me";
List<String> tokens = tokenize(content,"[","]");
StringTokenizer? Set the search string to "[]" and the "include tokens" flag to false and I think you're set.
A normal string tokenizer won't work for this requirement; you have to tweak it or write your own.
There's one way you can do this. It isn't particularly pretty. What it involves is going through the string character by character. When you reach a "[", you start putting the characters into a new token. When you reach a "]", you stop. This would be best done using a data structure not an array since arrays are of static length.
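The character-by-character idea can be sketched as follows. This is one possible reading of the description, not the answerer's code: it assumes single-character start and end markers, uses a List instead of an array, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class BracketScanner {
    // Collects the text between an opening character and the next closing character.
    static List<String> scan(String text, char open, char close) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = null; // null means we are currently outside a token
        for (char c : text.toCharArray()) {
            if (current == null) {
                if (c == open) {
                    current = new StringBuilder(); // start a new token
                }
            } else if (c == close) {
                tokens.add(current.toString());    // token finished
                current = null;
            } else {
                current.append(c);                 // includes a nested opening character
            }
        }
        return tokens; // an unterminated token at the end is dropped
    }

    public static void main(String[] args) {
        System.out.println(scan("hello[world]this[[is]me", '[', ']'));
    }
}
```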
Another solution, which may be possible, is to use regexes with String's split method. The only problem I have is coming up with a regex which would split the way you want it to. What I can come up with is (]string of characters[) XOR (string of characters[) XOR (]string of characters). Each set of parentheses denotes a different regex. You should evaluate them in this order so you don't accidentally remove anything you want. I'm not familiar with regexes in Java, so I used "string of characters" to denote that there are characters in between the brackets.
Try a regular expression like:
(.*?\[(.*?)\])
The second capture should contain all of the information between the set of []. This will however not work properly if the string contains nested [].
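A reluctant capture group of this kind can be driven with Matcher.find in Java. The sketch below is a simplified variant with a single capture group rather than the exact expression above; it quotes the delimiters with Pattern.quote so that "[" and "]" are taken literally, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketRegexDemo {
    // Returns the text between each open delimiter and the first close delimiter after it.
    static List<String> between(String text, String open, String close) {
        Pattern p = Pattern.compile(Pattern.quote(open) + "(.*?)" + Pattern.quote(close));
        Matcher m = p.matcher(text);
        List<String> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(m.group(1)); // group 1 is the reluctant (.*?) part
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(between("hello[world]this[[is]me", "[", "]"));
    }
}
```

Since . does not match line terminators by default, tokens that span several lines would need the DOTALL flag.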
StringTokenizer won't cut it for the specified behavior. You'll need your own method. Something like:
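// Needs java.util.ArrayList and java.util.List.
public List<String> extractTokens(String txt, String str, String end) {
    int so = 0, eo;
    List<String> lst = new ArrayList<>();
    // Find each start marker, then the first end marker that follows it.
    while (so < txt.length() && (so = txt.indexOf(str, so)) != -1) {
        so += str.length();
        if (so < txt.length() && (eo = txt.indexOf(end, so)) != -1) {
            lst.add(txt.substring(so, eo));
            so = eo + end.length();
        }
    }
    return lst;
}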
The regular expression \\[[\\[\\w]+\\] gives us
[world] and
[[is]