I'm trying to create a regular expression to match any word ( \w+ ) except true or false.
This is what I have so far: \w+\s*=\s*[^true|^false]\w+
class Ntnf {
    public static void main ( String ... args ) {
        System.out.println( args[0].matches("\\w+\\s*=\\s*[^true|^false]\\w+") );
    }
}
But it is not working for:
a = b
a = true
a = false
It always matches.
How can I match any word ( \w+ ) except true or false?
EDIT
I'm trying to spot this pattern:
a = b
x = y
name = someothername
etc = xyz
x = truea
n = falsea
But avoid matching
a = true
etc = false
name = true
You can use:
^(?!(true|false)$)
^ - beginning of string
?! - negative lookahead
$ - end of string
So it matches as long as the whole string isn't just "true" or "false". Note that it can still start with one of those.
However, it may be more straightforward to use regular string comparisons.
EDIT:
The whole regex (without escaping) for your situation is:
^\w+\s*=\s*(?!(true|false)$)\w+$
It's the same idea, except that we're putting it in the equation form.
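A minimal sketch of how this pattern behaves on the examples from the question. The class name AssignmentFilter and the helper isNonBooleanAssignment are made up for illustration; matches() already requires the whole input to match, so the ^ and $ anchors are redundant here but harmless.

```java
import java.util.regex.Pattern;

class AssignmentFilter {
    // The pattern from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern NON_BOOLEAN_ASSIGNMENT =
            Pattern.compile("^\\w+\\s*=\\s*(?!(true|false)$)\\w+$");

    // Hypothetical helper: true for "name = word" unless the word is exactly true or false.
    static boolean isNonBooleanAssignment(String line) {
        return NON_BOOLEAN_ASSIGNMENT.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a = b", "x = truea", "n = falsea", "a = true", "etc = false"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isNonBooleanAssignment(sample));
        }
    }
}
```

The first three samples should be accepted and the last two rejected, because the lookahead only fails when true or false is followed directly by the end of the string.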
[^true] is a character class. It only matches one character. [^true] means: "Match this character only if it is not one of t, r, u or e". This is not what you need, right?
Regex is not a good idea for this task. It will be quite complicated to do it in regex. Just use string comparison.
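For comparison, a rough sketch of the plain string comparison approach. The class and method names are hypothetical, and it assumes each input has the form name = value; unlike the \w+ pattern, it does not check which characters the name and value contain.

```java
class PlainComparison {
    // Hypothetical helper: splits on the first '=' and compares the value directly.
    static boolean isNonBooleanAssignment(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            return false; // not an assignment at all
        }
        String name = line.substring(0, eq).trim();
        String value = line.substring(eq + 1).trim();
        if (name.isEmpty() || value.isEmpty()) {
            return false; // one side of the assignment is missing
        }
        return !value.equals("true") && !value.equals("false");
    }

    public static void main(String[] args) {
        String[] samples = {"a = b", "x = truea", "a = true", "etc = false"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isNonBooleanAssignment(sample));
        }
    }
}
```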
Square brackets match a list of possible characters, or reject a list of possible characters (not necessarily in the order you specify), so [^true] is not the way to go.
When I'm trying not to match a certain word, I usually do the following:
([^t]|t[^r]|tr[^u]|tru[^e])
Related
I have a string (VIN) like this:
String vin = "XTC53229R71133923";
I can use OR to see if there are characters Q,O,I:
String regExp = ".*[QOI].*";
This works.
However, I cannot check that none of these 3 letters is in the string.
It means: (NOT Q) AND (NOT O) AND (NOT I).
I tried negative lookahead:
String regExp = "(?!.*[QOI].*)";
This doesn't work. For "XTC5Q3229R71133923" it returns true.
The main issue - I have 2 conditions:
Number of characters (A-Z0-9) in the string should be 17.
The string should not have Q,O,I.
I can check this with 2 regexps:
String regExp = "^([A-Z0-9]{17})$"; //should be true
String regExp = ".*[QOI].*"; //should be false
But is there a way to combine these 2 checks in one regular expression?
How about just using a custom range that doesn't include the characters you do not want?
String regexp = "^([A-HJ-NPR-Z0-9]{17})$";
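A small sketch comparing the custom range with an anchored negative lookahead, which is one way to keep the original [A-Z0-9]{17} check and add the "no Q, O, I" condition in the same expression. The lookahead variant, the class name VinCheck and both helper names are assumptions for illustration, not part of the answer.

```java
import java.util.regex.Pattern;

class VinCheck {
    // Custom ranges A-H, J-N, P, R-Z skip the letters I, O and Q.
    private static final Pattern CUSTOM_RANGE =
            Pattern.compile("^[A-HJ-NPR-Z0-9]{17}$");

    // Assumed alternative: the lookahead at the start rejects Q, O or I anywhere in the string.
    private static final Pattern LOOKAHEAD =
            Pattern.compile("^(?!.*[QOI])[A-Z0-9]{17}$");

    static boolean isValidByRange(String vin) {
        return CUSTOM_RANGE.matcher(vin).matches();
    }

    static boolean isValidByLookahead(String vin) {
        return LOOKAHEAD.matcher(vin).matches();
    }

    public static void main(String[] args) {
        String good = "XTC53229R71133923"; // 17 characters, no Q, O or I
        String bad = "XTC5Q229R71133923";  // 17 characters, contains Q
        System.out.println(isValidByRange(good) + " " + isValidByLookahead(good)); // true true
        System.out.println(isValidByRange(bad) + " " + isValidByLookahead(bad));   // false false
    }
}
```

The lookahead has to sit right after ^ so that it inspects the whole string once before the 17 characters are consumed.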
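Here you go: ^[^QOI]{17}$. Starting a character class with ^ means "do not match any of these characters". Note that this accepts any 17 characters other than Q, O and I, including lowercase letters and punctuation, so on its own it does not restrict the input to A-Z0-9.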
I have never done regex before, and I have seen they are very useful for working with strings. I saw a few tutorials (for example) but I still cannot understand how to make a simple Java regex check for hexadecimal characters in a string.
The user will type something like 0123456789ABCDEF into the text box, and I would like to know that the input was correct; for something like XTYSPG456789ABCDEF the check should return false.
Is it possible to do that with a regex or did I misunderstand how they work?
Yes, you can do that with a regular expression:
^[0-9A-F]+$
Explanation:
^ Start of line.
[0-9A-F] Character class: Any character in 0 to 9, or in A to F.
+ Quantifier: One or more of the above.
$ End of line.
To use this regular expression in Java you can for example call the matches method on a String:
boolean isHex = s.matches("[0-9A-F]+");
Note that matches only succeeds when the entire string matches the pattern, so you don't need the start and end of line anchors in this case. See it working online: ideone
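To illustrate why the anchors are optional with matches but not with a substring search, here is a short sketch contrasting String.matches with Matcher.find. The class and helper names are invented for this example.

```java
import java.util.regex.Pattern;

class MatchesVersusFind {
    private static final Pattern UNANCHORED = Pattern.compile("[0-9A-F]+");
    private static final Pattern ANCHORED = Pattern.compile("^[0-9A-F]+$");

    // matches() must consume the entire input.
    static boolean wholeMatch(String s) {
        return s.matches("[0-9A-F]+");
    }

    // find() succeeds if any substring matches.
    static boolean unanchoredFind(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // With ^ and $, find() behaves like a whole-string check for single-line input.
    static boolean anchoredFind(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "XTYSPG456789ABCDEF";
        System.out.println(wholeMatch(input));     // false
        System.out.println(unanchoredFind(input)); // true, "456789ABCDEF" is found
        System.out.println(anchoredFind(input));   // false
    }
}
```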
You may also want to allow both upper and lowercase A-F, in which case you can use this regular expression:
^[0-9A-Fa-f]+$
Maybe you want to use the POSIX character class \p{XDigit}, so:
^\p{XDigit}+$
Additionally, if you plan to use the regular expression very often, it is recommended to compile it once into a constant in order to avoid recompiling it each time, e.g.:
private static final Pattern REGEX_PATTERN =
        Pattern.compile("^\\p{XDigit}+$");

public static void main(String[] args) {
    String input = "0123456789ABCDEF";
    System.out.println(
        REGEX_PATTERN.matcher(input).matches()
    ); // prints "true"
}
Actually, the given answer is not totally correct. The problem arises because the digits 0-9 are also decimal digits. Part of what you have to do is to test for pairs such as 00-99 instead of just 0-9, so that short values are not mistaken for decimal numbers. Like so:
^([0-9A-Fa-f]{2})+$
That is, the digits have to come in pairs! Otherwise, the string is something else! :-)
Example:
(Pick one)
var a = "1e5";
var a = "10";
var a = "314159265";
If I used the accepted answer in a regular expression it would return TRUE.
var re1 = new RegExp( /^[0-9A-Fa-f]+$/ );
var re2 = new RegExp( /^([0-9A-Fa-f]{2})+$/ );
if( re1.test(a) ){ alert("#1 = This is a hex value!"); }
if( re2.test(a) ){ alert("#2 = This IS a hex string!"); }
else { alert("#2 = This is NOT a hex string!"); }
Note that "10" returns TRUE in both cases. If an incoming string only has 0-9, you can NOT easily tell whether it is a hex value or a decimal value UNLESS there is a missing zero in front of odd-length strings (hex values always come in pairs, i.e. low byte/high byte). But values like "34" are perfectly valid as both decimal and hexadecimal numbers. They just mean two different things.
Also note that "3.14159265" is not a hex value no matter which test you do because of the period. But with the addition of the "{2}" you at least ensure it really is a hex string rather than something that LOOKS like a hex string.
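Since the question is about Java, here is a rough Java equivalent of the two JavaScript tests above. The class name HexPairs and both helper names are assumptions made for this sketch.

```java
class HexPairs {
    // One or more hex digits, any length.
    static boolean isHexDigits(String s) {
        return s.matches("[0-9A-Fa-f]+");
    }

    // Hex digits in pairs only, so the length must be even.
    static boolean isHexBytePairs(String s) {
        return s.matches("([0-9A-Fa-f]{2})+");
    }

    public static void main(String[] args) {
        String[] samples = {"1e5", "10", "314159265", "3.14159265"};
        for (String sample : samples) {
            System.out.println(sample + ": digits=" + isHexDigits(sample)
                    + ", pairs=" + isHexBytePairs(sample));
        }
    }
}
```

Only "10" passes both checks; "1e5" and "314159265" have an odd number of digits, and "3.14159265" fails both because of the period.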
I need to write a regex that matches strings containing not only digits [0-9]. How can I do that without explicitly specifying all possible characters in a group? Is it possible to do it through lookahead/lookbehind? Examples:
034987694 - doesn't match
23984576s9879 - match
rtfsdbhkjdfg - match
=-0io[-09uhidkbf - match
9347659837564983467 - doesn't match
^(?!\\d+$).*$
This should do it for you. See demo.
https://regex101.com/r/fM9lY3/1
The negative lookahead checks that the string does not consist solely of digits from start to end. You need the $ to make sure the check runs to the end of the string; otherwise it would only check the start.
If you just need to detect whether the string is not numbers-only, then you can simply test for /\D/ - "succeed if there is a non-digit anywhere".
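In Java, the same "is there a non-digit anywhere" test could be sketched with Matcher.find, which searches for a match anywhere in the input. The class name NotOnlyDigits and the helper hasNonDigit are hypothetical.

```java
import java.util.regex.Pattern;

class NotOnlyDigits {
    // \D matches any single character that is not a digit.
    private static final Pattern NON_DIGIT = Pattern.compile("\\D");

    // find() succeeds as soon as one non-digit is seen anywhere in the input.
    static boolean hasNonDigit(String s) {
        return NON_DIGIT.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"034987694", "23984576s9879", "rtfsdbhkjdfg",
                "=-0io[-09uhidkbf", "9347659837564983467"};
        for (String sample : samples) {
            System.out.println(sample + " = " + hasNonDigit(sample));
        }
    }
}
```

Note that an empty string has no non-digit, so this check reports false for it.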
Why not check whether it contains only digits? If it doesn't, it matches.
String[] strings = {"034987694", "23984576s9879",
        "rtfsdbhkjdfg",
        "=-0io[-09uhidkbf",
        "9347659837564983467"};
for (String s : strings) {
    System.out.printf("%s = %s%n", s, !s.matches("\\d*"));
}
output
034987694 = false
23984576s9879 = true
rtfsdbhkjdfg = true
=-0io[-09uhidkbf = true
9347659837564983467 = false
You may try the following:
string.matches(".*\\D.*");
This requires at least 1 non-digit character.
I want to remove these characters from a String:
+ - ! ( ) { } [ ] ^ ~ : \
I also want to remove these:
/*
*/
&&
||
I mean that I will not remove a single & or |; I will remove them only when the second character follows the first one (/* */ && ||).
How can I do that efficiently and fast in Java?
Example:
a:b+c1|x||c*(?)
will be:
abc1|xc*?
This can be done via a long, but actually very simple regex.
String aString = "a:b+c1|x||c*(?)";
String sanitizedString = aString.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(sanitizedString);
I think that the java.lang.String.replaceAll(String regex, String replacement) is all you need:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll(java.lang.String, java.lang.String).
There are two ways to do that:
1)
import java.util.Arrays;
import java.util.List;

// Literal tokens to strip; a lone "/", "&" or "|" is deliberately not listed.
List<String> tokens = Arrays.asList(
        "+", "-", "!", "(", ")", "{", "}", "[", "]", "^", "~", ":", "\\",
        "/*", "*/", "&&", "||");
String string = "a:b+c1|x||c*(?)";
for (String token : tokens) {
    // String.replace treats its argument literally, so no regex escaping is needed.
    string = string.replace(token, "");
}
System.out.println(string);
2)
String string = "a:b+c1|x||c*(?)";
string = string.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(string);
Thomas wrote on How to remove special characters from a string?:
That depends on what you define as special characters, but try
replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since
you'd then either have to escape it or it would mean "any but these
characters".
Another note: the - character needs to be the first or last one on the
list, otherwise you'd have to escape it or it would define a range (
e.g. :-, would mean "all characters in the range : to ,").
So, in order to keep consistency and not depend on character
positioning, you might want to escape all those characters that have a
special meaning in regular expressions (the following list is not
complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
If you want to get rid of all punctuation and symbols, try this regex:
[\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape
back slashes: "[\\p{P}\\p{S}]").
A third way could be something like this, if you can exactly define
what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
Here's a less restrictive alternative to the "define allowed characters"
approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and
not a separator (whitespace, linebreak etc.). Note that you can't use
[\P{L}\P{Z}] (upper case P means not having that property), since that
would mean "everything that is not a letter or not whitespace", which
almost matches everything, since letters are not whitespace and vice
versa.
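A short sketch of the difference between the two character classes discussed in the quote. The class and method names are made up for illustration, and the sample string is an arbitrary choice.

```java
class LetterAndSeparatorFilter {
    // Negated class: removes everything that is neither a letter nor a separator.
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    // Union of two negated properties: every character is either "not a letter"
    // or "not a separator", so everything gets removed.
    static String removeNonLetterOrNonSeparator(String s) {
        return s.replaceAll("[\\P{L}\\P{Z}]", "");
    }

    public static void main(String[] args) {
        String input = "Hello, world 42!";
        System.out.println("[" + keepLettersAndSeparators(input) + "]");      // [Hello world ]
        System.out.println("[" + removeNonLetterOrNonSeparator(input) + "]"); // []
    }
}
```

The trailing space in the first result is the separator that preceded 42 in the input.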