I have a problem using Matcher to find a symbol from a group in a regular expression: it does not recognize the second group. Maybe the code below makes it clear:
public void set(String n){
String pat = "(\\d+)[!##$%^&*()_+-=}]";
Pattern r;
r = Pattern.compile(pat);
System.out.println(r);
Matcher m;
m = r.matcher(n);
if (m.find()) {
JOptionPane.showMessageDialog(null,
"Not a correct form", "ERROR_NAME_MATCH", 0);
}else{
name = n;
}
}
After running the code, the first group is recognized but the second one, [!##$%^&*()_+-=}], is not. I'm sure the expression is correct; I checked it with RegexBuddy. There must be a problem with concatenating two or more groups in one line.
Thank you for your help.
Your regex - (\d+)[!##$%^&*()_+=}-] - matches a sequence of 1+ digits followed by a symbol from the specified set.
You want to test a string and return true if a single character from the specified set is present in the string.
So move \d into the character class, and move the - to the end of the class so that it is a literal hyphen rather than a range operator (in your original pattern, +-= is a range from + to =):
String pat = "[\\d!##$%^&*()_+=}-]";
^^^
If you need to match a digit or special char, use
String pat = "\\d|[!##$%^&*()_+=}-]";
If you need both irrespective of the order:
String pat = "^(?=\\D*\\d)(?=[^!##$%^&*()_+=}-]*[!##$%^&*()_+=}-])";
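The difference between the "either" and "both" variants can be checked with a small sketch. The class and method names here (NameCheck, containsEither, containsBoth) are made up for illustration and are not part of the question; the symbol set is copied as it appears in the answer.

```java
import java.util.regex.Pattern;

final class NameCheck {
    // Symbol set as written in the answer; '-' is last, so it is a literal.
    private static final String SYMBOLS = "!##$%^&*()_+=}-";

    // A single digit or listed symbol anywhere in the input.
    private static final Pattern EITHER =
        Pattern.compile("[\\d" + SYMBOLS + "]");

    // At least one digit AND at least one listed symbol, in any order.
    private static final Pattern BOTH = Pattern.compile(
        "^(?=\\D*\\d)(?=[^" + SYMBOLS + "]*[" + SYMBOLS + "])");

    static boolean containsEither(String s) {
        return EITHER.matcher(s).find();
    }

    static boolean containsBoth(String s) {
        return BOTH.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsEither("john"));   // false
        System.out.println(containsEither("jo-hn"));  // true
        System.out.println(containsBoth("john7"));    // false
        System.out.println(containsBoth("john#7"));   // true
    }
}
```

In set(), the error dialog would then be shown when the chosen check returns true.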
Consider:
String str = "XYhaku(ABH1235-123548)";
From the above string, I need only "ABH1235-123548" and so far I created a regular expression:
Pattern.compile("ABH\\d+")
But it returns false. So what is the correct regular expression for it?
I would just grab whatever is in the parentheses:
Pattern p = Pattern.compile("\\((?<data>[A-Z\\d]+\\-\\d+)\\)");
Or, if you want to be even more open (anything between parentheses):
Pattern p = Pattern.compile("\\((?<data>.+)\\)");
Then just nab it:
String s = /* some input */;
Matcher m = p.matcher(s);
if (m.find()) { //just find first
String tag = m.group("data"); //ABH1235-123548
}
\d only matches digits. To include other characters, use a character class:
Pattern.compile("ABH[\\d-]+")
Note that the - must be placed first or last in the character class, because otherwise it will be treated as a range indicator ([A-Z] matching every letter between A and Z, for example). Another way to avoid that would be to escape it, but that adds two more backslashes to your string...
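Both suggestions can be tried side by side. This is only a sketch; the class TagExtractor and its method names are hypothetical, and named groups assume Java 7 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagExtractor {
    // Named group "data" captures the text between the parentheses.
    private static final Pattern IN_PARENS =
        Pattern.compile("\\((?<data>[A-Z\\d]+-\\d+)\\)");

    // '-' is last in the class, so it is a literal hyphen, not a range.
    private static final Pattern BY_PREFIX = Pattern.compile("ABH[\\d-]+");

    static String fromParens(String input) {
        Matcher m = IN_PARENS.matcher(input);
        return m.find() ? m.group("data") : null;
    }

    static String fromPrefix(String input) {
        Matcher m = BY_PREFIX.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "XYhaku(ABH1235-123548)";
        System.out.println(fromParens(str)); // ABH1235-123548
        System.out.println(fromPrefix(str)); // ABH1235-123548
    }
}
```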
I have a sample string that is repeated in a file, and I want to retrieve a double value from it. The string is "(AIC)|234.654 |" and I want to retrieve 234.654 from it. The "(AIC)|" part is always fixed, but the number changes on other occasions, so I am using a regular expression as follows. However, there is no match with the expression below. Any help would be appreciated.
String contents="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\u0029{1}\\u007C{1}\\d+u002E{1}\\d+");
Matcher m = p.matcher(contents);
boolean b = m.find();
String t=m.group();
The above expression doesn't find any match and throws an exception.
Thanks for any help
Your code has several typos. Beyond that, you say you need the number, but you are retrieving the whole match with .group(). You need a capturing group so that you can access the number with .group(1).
Here is the fixed code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\)\\|(\\d+\\.\\d+)");
Matcher m = p.matcher(content);
if (m.find())
{
System.out.println(m.group(1));
}
See IDEONE demo
If the number can be an integer, just use an optional non-capturing group around the decimal part: Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");
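Since the question mentions that the string is repeated in a file and that a double is wanted, here is a sketch that collects every value as a double. The class name AicReader and the two-line sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AicReader {
    // Group 1 is the number; the decimal part is optional.
    private static final Pattern AIC =
        Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");

    static List<Double> readAll(String contents) {
        List<Double> values = new ArrayList<>();
        Matcher m = AIC.matcher(contents);
        while (m.find()) {                       // one iteration per occurrence
            values.add(Double.parseDouble(m.group(1)));
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical file contents with two occurrences.
        String contents = "(AIC)|234.654 |\n(AIC)|17 |";
        System.out.println(readAll(contents));   // [234.654, 17.0]
    }
}
```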
I think this regex should do the work:
(?<=\|)[\d\.]*(?=\s*\|)
It will only match digits and dots after a | and before an optional space and another |
And the complete code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("(?<=\\|)[\\d\\.]*(?=\\s*\\|)");
Matcher m = p.matcher(content);
if (m.find()) { // call group() only after a successful find()
String t = m.group(); // 234.654
}
I want to check whether a text starts with what or who and is a question, so I wrote the following code:
private static void startWithQOrIf(String commentstr){
String urlPattern = "(|who|what).*\\?.*$";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.find()) {
System.out.println("yes");
}
}
Everything works well, but for example when I try:
whooooooooo is the follower?
it matches as well, but it should not, because I am looking for who, not whooooooooo.
Any ideas?
You can ensure a whole word using a word boundary \b:
(|who|what)\\b.*\\?.*$
^^
If the words in the alternation group are supposed to appear at the start of the string, you can just use matches and remove the $ anchor:
String urlPattern = "(|who|what)\\b.*\\?.*";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.matches()) { // < - Here, matches is used
System.out.println("yes");
}
Note that (|who|what) matches either an empty string, or who, or what. If you do not plan to allow empty string, use just (who|what).
You must use word boundaries.
String urlPattern = "\\b(who|what)\\b.*\\?.*$";
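A quick way to see the effect of the word boundary is to run the pattern against a few inputs. This sketch assumes the word must be at the start of the text, so it uses matches(); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class QuestionCheck {
    // \b after the alternation rejects "whooooooooo" and "whatever".
    private static final Pattern WH_QUESTION = Pattern.compile(
        "(who|what)\\b.*\\?.*", Pattern.CASE_INSENSITIVE);

    static boolean startsWithWhoOrWhat(String comment) {
        // matches() anchors the pattern at both ends of the input.
        return WH_QUESTION.matcher(comment).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithWhoOrWhat("Who is the follower?"));         // true
        System.out.println(startsWithWhoOrWhat("whooooooooo is the follower?")); // false
        System.out.println(startsWithWhoOrWhat("who is there"));                 // false
    }
}
```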
Can you help with this code?
It seems easy, but always fails.
@Test
public void normalizeString(){
StringBuilder ret = new StringBuilder();
//Matcher matches = Pattern.compile( "([A-Z0-9])" ).matcher("P-12345678-P");
Matcher matches = Pattern.compile( "([\\w])" ).matcher("P-12345678-P");
for (int i = 1; i < matches.groupCount(); i++)
ret.append(matches.group(i));
assertEquals("P12345678P", ret.toString());
}
Constructing a Matcher does not automatically perform any matching. That's in part because Matcher supports two distinct matching behaviors, differing in whether the match is implicitly anchored to the beginning of the Matcher's region. It appears that you could achieve your desired result like so:
@Test
public void normalizeString(){
StringBuilder ret = new StringBuilder();
Matcher matches = Pattern.compile( "[A-Z0-9]+" ).matcher("P-12345678-P");
while (matches.find()) {
ret.append(matches.group());
}
assertEquals("P12345678P", ret.toString());
}
Note in particular the invocation of Matcher.find(), which was a key omission from your version. Also, the nullary Matcher.group() returns the substring matched by the last find().
Furthermore, although your use of Matcher.groupCount() isn't exactly wrong, it does lead me to suspect that you have the wrong idea about what it does. In particular, in your code it will always return 1 -- it inquires about the pattern, not about matches to it.
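Both points can be observed directly. The following is a small sketch (GroupCountDemo is a made-up name): groupCount() reports the number of capturing groups in the pattern even before any match attempt, while group() throws IllegalStateException until a match has succeeded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    // groupCount() depends only on the pattern, not on the input.
    static int groupsIn(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // group() is only valid after a successful find(), matches() or lookingAt().
    static boolean groupThrowsBeforeFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        try {
            m.group();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(groupsIn("([\\w])"));    // 1
        System.out.println(groupsIn("[A-Z0-9]+"));  // 0
        System.out.println(groupThrowsBeforeFind("([\\w])", "P-12345678-P")); // true
    }
}
```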
First of all, you don't need to add any group, because the entire match can always be accessed as group 0, so instead of
(regex) and group(1)
you can use
regex and group(0)
The next thing is that \\w is already a character class, so you don't need to surround it with another [ ]; that would be like writing [[a-z]], which is the same as [a-z].
Now in your
for (int i = 1; i < matches.groupCount(); i++)
ret.append(matches.group(i));
you iterate over all groups starting from 1, but you exclude the last group: groups are indexed from 1 to n, so i<n does not include n. You would need to use i <= matches.groupCount() instead.
Also, it looks like you are confusing two things. This loop does not find all matches of the regex in the input. Such a loop is used to iterate over the groups of the regex after a match has been found.
So if the regex were something like (\w(\w))c and the match were abc, then
for (int i = 1; i <= matches.groupCount(); i++)
System.out.println(matches.group(i));
would print
ab
b
because
the first group, (\w(\w)), contains the two characters before c
the second group is the one nested inside the first, and it holds the second character.
But to print them you would first need to let the regex engine iterate over your input and find() a match, or check whether the entire input matches() the regex. Otherwise you would get an IllegalStateException, because the regex engine can't know which match you want to get your groups from (there can be many matches of the regex in the input).
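The nested-group example can be run as follows. This is a sketch with a hypothetical helper, groupsOfFirstMatch, which calls find() first and then walks the groups with <= so the last one is included.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestedGroups {
    static List<String> groupsOfFirstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {                               // a match is required first
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));               // groups are numbered 1..n
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(groupsOfFirstMatch("(\\w(\\w))c", "abc")); // [ab, b]
    }
}
```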
So what you may want to use is something like
StringBuilder ret = new StringBuilder();
Matcher matches = Pattern.compile( "[A-Z0-9]" ).matcher("P-12345678-P");
while (matches.find()){//find next match
ret.append(matches.group(0));
}
assertEquals("P12345678P", ret.toString());
Another (and probably simpler) approach is to remove all the characters you don't want from the input. You can use replaceAll with a negated character class [^...], like
String input = "P-12345678-P";
String result = input.replaceAll("[^A-Z0-9]+", "");
which will produce a new string in which all characters other than A-Z and 0-9 are removed (replaced with "").
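To confirm that the find() loop and the replaceAll approach give the same result, the two can be put next to each other. Normalizer and its method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Normalizer {
    private static final Pattern KEEP = Pattern.compile("[A-Z0-9]+");

    // Collect every run of allowed characters.
    static String byFind(String input) {
        StringBuilder ret = new StringBuilder();
        Matcher m = KEEP.matcher(input);
        while (m.find()) {
            ret.append(m.group());
        }
        return ret.toString();
    }

    // Delete every run of characters that are not allowed.
    static String byReplace(String input) {
        return input.replaceAll("[^A-Z0-9]+", "");
    }

    public static void main(String[] args) {
        System.out.println(byFind("P-12345678-P"));    // P12345678P
        System.out.println(byReplace("P-12345678-P")); // P12345678P
    }
}
```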
I have this code to find the pattern 201409250200131738007947036000 - 1 inside a text:
final String patternStr = "(\\d{30} - \\d{1})";
final Pattern p = Pattern.compile(patternStr);
final Matcher m = p.matcher(page);
if (m.matches()) {
System.out.println("SUCCESS");
}
But for some strange reason it doesn't work in Java. Can somebody help me find the error, please?
The reason is that the matches method checks for the entire given string to match the regex.
For example, if your string is 123456123412345612341234561234 - 8 it will match; if it is "my number 123456123412345612341234561234 - 8 is inside other text", it won't.
Use the find method to accomplish your task:
if (m.find()) {
System.out.println("SUCCESS");
}
It will search inside the given string instead of attempting to match the entire string.
From the documentation for Matcher, matches:
Attempts to match the entire region against the pattern.
As opposed to find which:
Attempts to find the next subsequence of the input sequence that matches the pattern.
So use matches to match an entire String against a pattern, use find to locate a pattern inside a String.
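The difference is easy to demonstrate with the pattern from the question. In this sketch the class name and the surrounding text "id: ... end" are invented for the example.

```java
import java.util.regex.Pattern;

final class MatchesVsFind {
    private static final Pattern CODE = Pattern.compile("\\d{30} - \\d");

    // True only if the whole input is the code.
    static boolean isExactly(String s) {
        return CODE.matcher(s).matches();
    }

    // True if the code occurs anywhere in the input.
    static boolean contains(String s) {
        return CODE.matcher(s).find();
    }

    public static void main(String[] args) {
        String code = "201409250200131738007947036000 - 1";
        String page = "id: " + code + " end";
        System.out.println(isExactly(code)); // true
        System.out.println(isExactly(page)); // false
        System.out.println(contains(page));  // true
    }
}
```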
Try:
final String patternStr = "\\d{30}+\\s-\\s\\d";
final Pattern p = Pattern.compile(patternStr);
final Matcher m = p.matcher(page);
while (m.find()) {
System.out.printf("FOUND A MATCH: %s%n", m.group());
}
I edited your pattern slightly to make it more robust. This will print each match that it finds.