regex in java string - java

While trying some Java coding on codingbat.com, I repeatedly ran into a question about how regular expressions work in Java strings.
I know there are Java methods like matches() or find(), as well as replace() and so on, but this isn't where I wanted to go.
Take a quick look at the example:
boolean doubleX(String str) {
if(str.equals("xx")){
return true;
} else {
return false;
}
}
I wonder whether I could use regular expressions in the string to add a quantifier, for example:
if(str.equals("x\[x.*]")){ // add regex here
Would you be so kind as to explain how I could use regex in strings? From what I understood, I thought it would be possible even without using the Java regex methods, because the escape character \ makes them usable even in plain code. Did I get this wrong?

Use String#matches(String)
if (str.matches(regex)) {
// ...
}
This only tells you whether the string matches the regex, though; it does not count matches.
What I suggest is that you specify the quantifier in your regex instead of counting the number of matches, like so:
public boolean isX(String str, int count) {
return str.matches("^x{" + count + "}$");
}

Some methods accept a regex as input and some do not. In general, you can't use a regex in a plain String comparison such as equals(), because there it is treated as just a plain string. But some methods, your own or a framework's, can support regex internally through Pattern or other approaches.
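To make this concrete, here is a minimal sketch of the difference. The class name RegexVsEquals and its method names are made up for illustration. It shows that equals() compares the pattern text character by character, matches() hands it to the regex engine and requires the whole string to match, and find() accepts a match anywhere in the string.

```java
import java.util.regex.Pattern;

class RegexVsEquals {
    // equals() compares characters literally; "x{2}" is just four characters here.
    static boolean literalCompare(String str) {
        return str.equals("x{2}");
    }

    // matches() treats its argument as a regex and must match the entire string.
    static boolean doubleX(String str) {
        return str.matches("x{2}");
    }

    // find() succeeds if the regex matches anywhere inside the string.
    static boolean containsDoubleX(String str) {
        return Pattern.compile("x{2}").matcher(str).find();
    }

    public static void main(String[] args) {
        System.out.println(literalCompare("xx"));     // false
        System.out.println(doubleX("xx"));            // true
        System.out.println(doubleX("axxb"));          // false
        System.out.println(containsDoubleX("axxb"));  // true
    }
}
```

So the backslash escape alone does not turn a string into a regex; what matters is whether the method receiving the string interprets it as one.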

You can use the Pattern and the Matcher class
// A backslash must be doubled inside a Java string literal, otherwise the code does not compile.
private final Pattern PATTERN = Pattern.compile("x\\[x.*]");
and then
Matcher matcher = PATTERN.matcher(str);
if (matcher.find())
doSomething();

Related

How can I provide an OR operator in regular expressions?

I want to match my string to one sequence or another, and it has to match at least one of them.
For AND, I learned it can be done with:
(?=one)(?=other)
Is there something like this for OR?
I am using Java, Matcher and Pattern classes.
Generally speaking about regexes, you definitely should begin your journey into Regex wonderland here: Regex tutorial
What you currently need is the | (pipe character)
To match the strings one OR other, use:
(one|other)
or if you don't want to store the matches, just simply
one|other
To be Java specific, this article is very good at explaining the subject
You will have to use your patterns this way:
//Pattern and Matcher
Pattern compiledPattern = Pattern.compile(myPatternString);
Matcher matcher = compiledPattern.matcher(myStringToMatch);
boolean isNextMatch = matcher.find(); //find the next match; true if one exists
if(isNextMatch) {
String matchedString = myStringToMatch.substring(matcher.start(), matcher.end());
}
Please note, there are many more possibilities regarding Matcher than what I displayed here...
//String functions
boolean didItMatch = myString.matches(myPatternString); //same as Pattern.matches(myPatternString, myString)
String allReplacedString = myString.replaceAll(myPatternString, replacement);
String firstReplacedString = myString.replaceFirst(myPatternString, replacement);
String[] splitParts = myString.split(myPatternString, howManyPartsAtMost);
Also, I'd highly recommend using online regex checkers such as Regexplanet (Java) or refiddle (which doesn't have a Java-specific checker); they make your life a lot easier!
The "or" operator is spelled |, for example one|other.
All the operators are listed in the documentation.
You can separate with a pipe thus:
Pattern.compile("regexp1|regexp2");
See here for a couple of simple examples.
Use the | character for OR
Pattern pat = Pattern.compile("exp1|exp2");
Matcher mat = pat.matcher("Input_data");
The answers are already given, use the pipe '|' operator. In addition to that, it might be useful to test your regexp in a regexp tester without having to run your application, for example:
http://www.regexplanet.com/advanced/java/index.html
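One detail worth keeping in mind with | is that it has the lowest precedence of all regex operators, so anchors or surrounding text attach to only one alternative unless you group them. The following is a small sketch of that behaviour; the class name AlternationDemo and its methods are hypothetical, chosen only for this example.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Non-capturing group, so that ^ and $ apply to both alternatives.
    private static final Pattern GROUPED = Pattern.compile("^(?:one|other)$");

    // Without the group, ^ binds only to "one" and $ only to "other".
    private static final Pattern UNGROUPED = Pattern.compile("^one|other$");

    static boolean isOneOrOther(String s) {
        return GROUPED.matcher(s).find();
    }

    static boolean ungroupedFind(String s) {
        return UNGROUPED.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isOneOrOther("one"));       // true
        System.out.println(isOneOrOther("other"));     // true
        System.out.println(isOneOrOther("oneself"));   // false
        System.out.println(ungroupedFind("oneself"));  // true, because ^one matches at the start
    }
}
```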

Extract set of repeated pattern from String literal in Java

What would be a convenient and reliable way to extract all the "{...}" tags from a given string? (Using Java).
So, to give an example:
Say I have: http://www.something.com/{tag1}/path/{tag2}/else/{tag3}.html
I want to get all the "{}" tags; I was thinking about using Java's String.split() method, but I'm not sure what the correct regex would be for this.
Note also: tags can be called anything, not just tagX!
I would use regular expressions to match this. Something like this could work for your expression:
String regex = "\\{.*?\\}";
This will "reluctantly" match any substring that has { and } surrounding it. The .*? makes it find any characters between the { and }, but reluctantly, so it doesn't match the bigger String:
{tag1}/path/{tag2}/else/{tag3}
which would be a "greedy" match. Note that the curly braces in the regex need to be escaped with double backslashes since curly braces have a separate meaning inside a regular expression, and if you want to indicate the curly brace String, you need to escape it.
e.g.,
public static void main(String[] args) {
String test = "http://www.something.com/{tag1}/path/{tag2}/else/{tag3}.html";
String regex = "\\{.*?\\}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(test);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
With an output of:
{tag1}
{tag2}
{tag3}
You can read more about regular expressions here:
Oracle Regular Expressions Tutorial
and for greater detail, here:
www.regular-expressions.info/tutorial
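If only the tag names are needed, without the surrounding braces, a capturing group can pull them out directly. This is a sketch under that assumption; the class TagExtractor and the method tagNames are invented for illustration, and the pattern uses a negated character class instead of the reluctant quantifier, which is an alternative choice rather than the approach from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExtractor {
    // [^{}]+ matches one or more characters that are not braces; group 1 is the tag name.
    private static final Pattern TAG = Pattern.compile("\\{([^{}]+)\\}");

    static List<String> tagNames(String input) {
        List<String> names = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        while (matcher.find()) {
            names.add(matcher.group(1)); // the text between the braces
        }
        return names;
    }

    public static void main(String[] args) {
        String url = "http://www.something.com/{tag1}/path/{tag2}/else/{tag3}.html";
        System.out.println(tagNames(url)); // [tag1, tag2, tag3]
    }
}
```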

getting matching regular expressions in java

String.split(String regex) splits the string around matches of a given regular expression and returns a String array. But I am interested in the regex matches themselves and would like them returned as a String array, instead of the strings around them.
For example,
In the case of a trivial regex like ":" it probably wouldn't matter. But there are regexes which would match a particular date in a paragraph, and I would like to get all these dates, which may be different each time. I checked the JDK API but couldn't find any such method. Is there any method that I can make use of? Any help would be much appreciated.
Take a look at java.util.regex package Matcher and Pattern classes:
http://download.oracle.com/javase/6/docs/api/java/util/regex/package-summary.html
Just use the Java regular expression API
Pattern pat = Pattern.compile("\\d");
Matcher mat= pat.matcher("Foo99Bar66Baz");
while(mat.find()) {
System.out.println(mat.group());
}
You can find simple but quite comprehensive examples for startup in the following link
http://www.vogella.de/articles/JavaRegularExpressions/article.html
Also Pattern and Matcher usage example in:
http://www.vogella.de/articles/JavaRegularExpressions/article.html#regexjava
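As a counterpart to String.split(), the find() loop can be wrapped in a small helper that returns the matches as a String array. This is only a sketch: the class MatchCollector, the method allMatches, and the ISO-style date pattern in the usage example are assumptions made for illustration, not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCollector {
    // Returns every non-overlapping match of regex in input, in order of appearance.
    static String[] allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "Released 2011-03-14, patched 2011-04-02.";
        String[] dates = allMatches("\\d{4}-\\d{2}-\\d{2}", text);
        System.out.println(Arrays.toString(dates)); // [2011-03-14, 2011-04-02]
    }
}
```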

java regex for US State Validation

I wrote this Java method to validate with a regex, but I am missing something because it fails for all conditions. I am new to regex and unable to figure out what's causing it to fail for everything. Can some expert help me?
public static boolean isStateValid(String state){
String expression = "/^(?:A[KLRZ]|C[AOT]|D[CE]|FL|GA|HI|I[ADLN]|K[SY]|LA|M[ADEINOST]|N[CDEHJMVY]|O[HKR]|PA|RI|S[CD]|T[NX]|UT|V[AT]|W[AIVY])*$/";
CharSequence inputStr = state;
Pattern pattern = Pattern.compile(expression);
Matcher matcher = pattern.matcher(inputStr);
if (matcher.matches()) {
return true;
}else{
return false;
}
}
I changed it to this after reading the comments, and it still isn't working:
public static boolean isStateValid(String state) {
CharSequence inputStr = state;
Pattern pattern = Pattern
.compile("AL|AK|AR|AZ|CA|CO|CT|DC|DE|FL|GA|HI|IA|ID|IL|IN|KS|KY|LA|MA|MD|ME|MI|MN|MO|MS|MT|NC|ND|NE|NH|NJ|NM|NV|NY|OH|OK|OR|PA|RI|SC|SD|TN|TX|UT|VA|VT|WA|WI|WV|WY|al|ak|ar|az|ca|co|ct|dc|de|fl|ga|hi|ia|id|il|in|ks|ky|la|ma|md|me|mi|mn|mo|ms|mt|nc|nd|ne|nh|nj|nm|nv|ny|oh|ok|or|pa|ri|sc|sd|tn|tx|ut|va|vt|wa|wi|wv|wy");
Matcher matcher = pattern.matcher(inputStr);
if (matcher.matches()) {
return true;
} else {
return false;
}
}
A lot of things.
First, this is not Perl. Remove the leading and trailing slashes.
Second, why a non-capturing group? I mean (?: You do not need a group at all here.
Third, why so complicated? Just say something like
Pattern.compile("AL|AK|AR|AZ|CA");
etc., for all states. Your optimization does not have any benefit. It just makes the regex more complicated.
It doesn't like the / characters at the beginning and end of the pattern. When I remove those, it works. Also, I don't think you want the * repetition character at the end.
This one is case-sensitive and includes the US territories:
^(?-i:A[LKSZRAEP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY])$
However, I think using regex is over the top. Use a list or array of some kind.
Is this an exercise in practicing regular expressions, or something that will be used in a production environment?
If it's the latter, then using a lookup list will be better. In this case a regex obfuscates and overcomplicates what you're trying to do.
I haven't used this particular tool, though I have used a similar commercial one, but I think you would benefit from a tool like: http://sourceforge.net/projects/regexevaluator/
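The lookup-list approach suggested above could look roughly like this. It is a sketch, not a drop-in replacement: the class name StateValidator is made up, the abbreviations are the 50 states plus DC taken from the question's second attempt, and upper-casing the input (so that mixed case such as "Ca" is also accepted) is an assumption about the desired behaviour.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class StateValidator {
    // The 50 states plus DC, as listed in the question.
    private static final Set<String> STATES = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
        "AL", "AK", "AR", "AZ", "CA", "CO", "CT", "DC", "DE", "FL", "GA", "HI", "IA",
        "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", "ME", "MI", "MN", "MO", "MS",
        "MT", "NC", "ND", "NE", "NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA",
        "RI", "SC", "SD", "TN", "TX", "UT", "VA", "VT", "WA", "WI", "WV", "WY")));

    // Case-insensitive lookup; null is treated as invalid.
    static boolean isStateValid(String state) {
        return state != null && STATES.contains(state.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isStateValid("CA")); // true
        System.out.println(isStateValid("ca")); // true
        System.out.println(isStateValid("ZZ")); // false
    }
}
```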

Java regex return after first match

How do I return after the first match of a regular expression? (Does the Matcher.find() method do that?)
Say I have a string "abcdefgeee". I want the regex engine to stop searching immediately after it finds the first match of "e", for example. I am writing a method that returns true/false depending on whether the pattern is found, and I don't want to scan the whole string for "e". (I am looking for a regex solution.)
Another question: sometimes when I use matches(), it doesn't return the result I expect. For example, if I compile my pattern as "[a-z]" and then use matches(), it doesn't match. But when I compile the pattern as ".*[a-z].*", it matches. Is that the behaviour of the matches() method of the Matcher class?
Edit: here's what I actually want to do. For example, I want to search for a $ sign AND a # sign in a string. So I would define 2 compiled patterns (since I can't find any logical AND for regex, as I only know the basics).
pattern1 = Pattern.compile("\\$"); // $ is a regex metacharacter (end of input), so it must be escaped
pattern2 = Pattern.compile("#");
then i would just use
if ( match1.find() && match2.find() ){
return true;
}
in my method.
I only want the matchers to search the string for first occurrence and return.
thanks
For your second question, matches() does work correctly; your example uses two different regular expressions.
.*[a-z].* will match a String that contains at least one lowercase letter a-z anywhere in it. [a-z] will only match a String that is exactly one lowercase letter a-z. I think you might mean to use something like [a-z]+
Another question: sometimes when I use matches(), it doesn't return the result I expect. For example, if I compile my pattern as "[a-z]" and then use matches(), it doesn't match. But when I compile the pattern as ".*[a-z].*", it matches. Is that the behaviour of the matches() method of the Matcher class?
Yes, matches(...) tests the entire target string.
... here's what I actually want to do. For example, I want to search for a $ sign AND a # sign in a string. So I would define 2 compiled patterns (since I can't find any logical AND for regex, as I only know the basics).
I know you said you wanted to use regex, but all your examples seem to suggest you have no need for it: those are all single characters that can be handled with a couple of indexOf(...) calls.
Anyway, using regex, you could do it like this:
public static boolean containsAll(String text, String... patterns) {
for(String p : patterns) {
Matcher m = Pattern.compile(p).matcher(text);
if(!m.find()) return false;
}
return true;
}
But, again: indexOf(...) would do the trick as well:
public static boolean containsAll(String text, String... subStrings) {
for(String s : subStrings) {
if(text.indexOf(s) < 0) return false;
}
return true;
}
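One caveat when passing literal characters such as $ to a regex-based check: $ is a metacharacter that matches at the end of the input, so an unescaped "$" pattern is found in every string. A hedged sketch of one way around this uses Pattern.quote to treat the search text literally. The class FirstMatchDemo and its method names are invented for this example.

```java
import java.util.regex.Pattern;

class FirstMatchDemo {
    // find() returns as soon as the first match is located.
    // Pattern.quote() makes every character of the search text literal.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    static boolean containsDollarAndHash(String text) {
        return containsLiteral(text, "$") && containsLiteral(text, "#");
    }

    public static void main(String[] args) {
        // Unescaped $ matches the (empty) end of input, so this is true for any string.
        System.out.println(Pattern.compile("$").matcher("abc").find()); // true
        System.out.println(containsLiteral("abc", "$"));                // false
        System.out.println(containsDollarAndHash("cost: $5 #sale"));    // true
        System.out.println(containsDollarAndHash("cost: $5"));          // false
    }
}
```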
