How to use multiple different patterns? - java

How can I check a string against several regex patterns instead of a single one? Matching one pattern works, but my attempt at matching several does not.
When I run the code with only one of the patterns (time or price), I get the corresponding value from the string, but when I combine them I get no output.
Thanks for your help.
Here is my code:
String line = "This order was places for QT 30.00$ ! OK? and time is 2:45";
String pattern = "\\d+[.,]\\d+.[$]"+"\\d:\\d\\d";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find( )) {
System.out.println("Found value: " + m.group(0) );
} else {
System.out.println("NO MATCH");
}

The "+" operator does not separate patterns - it concatenates strings.
What you can do is provide a pattern that accepts characters in between the two groups.
String pattern = "(\\d+[.,]\\d+.[$]).*(\\d:\\d\\d)";
The parentheses above are optional. If you include them, you can get the matched price and time as separate strings:
if (m.find( )) {
System.out.println("Found value: " + m.group(1) + " with time: " + m.group(2));
}
EDIT:
Just noticed your comment that you're looking for OR, not AND.
You can do that with an expression of the form X | Y:
String pattern = "\\d+[.,]\\d+.[$]|\\d:\\d\\d";
This will match either a price or a time, whichever occurs first. You can get the match with m.group(0).
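If you want both the price and the time rather than only the first hit, one option is to keep calling find() in a loop. Below is a minimal sketch that reuses the pattern above unchanged; the class and method names (PriceOrTime, findAll) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceOrTime {
    // Same alternation as above: a price OR a time.
    private static final Pattern PRICE_OR_TIME =
            Pattern.compile("\\d+[.,]\\d+.[$]|\\d:\\d\\d");

    // Hypothetical helper: collects every match in order of appearance.
    static List<String> findAll(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = PRICE_OR_TIME.matcher(line);
        while (m.find()) {            // each call resumes after the previous match
            found.add(m.group());     // group() is the same as group(0)
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT 30.00$ ! OK? and time is 2:45";
        System.out.println(findAll(line)); // prints [30.00$, 2:45]
    }
}
```

Each find() call continues where the previous match ended, so the loop stops on its own once neither alternative matches any more.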

Related

Java Regex expression not working

I have a problem with a regex that is not working, and I don't know what I am doing wrong. My code:
String test = "timetable:xxxxxtimetable:; timetable: fullihhghtO;";
Pattern p = Pattern.compile("\\btimetable:(.*);");
//also tried "timetable:(.*);" and "(\\btimetable:)(.*)(;)"
Matcher m = p.matcher(test);
while(m.find()) {
System.out.println("S:" + m.start() + ", E:" + m.end());
System.out.println("x: "+ test.substring(m.start(), m.end()));
}
Expected result:
(1) "timetable:xxxxxtimetable:"
(2) "timetable: fullihhghtO"
Thanks for any help.
A non-capturing group could be handy in our case:
String test = "timetable:xxxxxtimetable:; timetable: fullihhghtO;";
Pattern p = Pattern.compile("(?:\\btimetable:(.*?);)+"); // <-- here
Matcher m = p.matcher(test);
int i = 1;
while (m.find()) {
System.out.println(i + ") "+ m.group(1));
i++;
}
OUTPUT
1) xxxxxtimetable:
2) fullihhghtO
Regex explained:
(?:\\btimetable:(.*?);)+ : the non-capturing group (?:...) lets us consume "timetable:" and the trailing ; without capturing them, while the capturing group (.*?), which is group 1, captures what we want: everything between \btimetable: and ;. Pay special attention to the lazy (non-greedy) quantifier .*?, which consumes the minimum number of characters needed to reach a ;. Without it, the regex uses the default greedy mode and consumes everything up to the last ; in the string!
Now, all that is relevant if you wanted to catch only the unique part, but if you wanted to catch the whole thing:
1) timetable:xxxxxtimetable:;
2) timetable: fullihhghtO;
It can be done easily by modifying the line with the regex to:
Pattern p = Pattern.compile("\\b(timetable:.*?;)+");
which is even simpler: only one capturing group (see that we still have to use the non-greedy mode!).
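To see the greedy/lazy difference for yourself, here is a small sketch that runs both variants over the string from the question. It uses the plain pattern without the outer (?:...)+ wrapper, which is an assumption on my part that the wrapper is not needed for this input; GreedyVsLazy and captures are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns group 1 of every match of regex in input.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String test = "timetable:xxxxxtimetable:; timetable: fullihhghtO;";
        // Lazy: stops at the first ';', so there are two matches.
        // Captures "xxxxxtimetable:" and " fullihhghtO" (note the leading space).
        System.out.println(captures("\\btimetable:(.*?);", test));
        // Greedy: runs to the last ';', so there is a single match.
        System.out.println(captures("\\btimetable:(.*);", test));
    }
}
```

Call trim() on the captured text if the leading space in the second value is unwanted.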
You don't need a regex; a simple split would do it:
public static void main(String[] args) {
String test = "timetable:xxxxxtimetable:; timetable: fullihhghtO;";
String[] array = test.split(";");
String str1 = array[0].trim();
String str2 = array[1].trim();
System.out.println(str1 + "\n" + str2); //timetable:xxxxxtimetable:
//timetable: fullihhghtO
}

Why last digit in a string is not matched by a regex group?

String line = "This order was placed for QT3000! OK?";
String pattern = "(.*)(\\d+)(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find( )) {
System.out.println("Found value: " + m.group(0) );
System.out.println("Found value: " + m.group(1) );
System.out.println("Found value: " + m.group(2) );
System.out.println("Found value: " + m.group(3) );
}else {
System.out.println("NO MATCH");
}
Output:
Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?
Question: I don't understand why group 2 doesn't have 300 and just 0.
Because the .* is greedy, meaning that it tries to match as much as it can.
So the first group initially matches the whole string, but then \\d+ fails to match at the end of the line. So the regex engine backtracks and tries to match one character less. It keeps doing that until
This order was placed for QT300
is matched, and then \\d+ matches the "0" that comes next. Finally, the last group matches to the end of the string.
If you want to extract only the number, use \\d+.
It is because of the greedy .* before \d+. The .* first matches as many characters as possible, then backtracks only as far as needed for \d+ to match, which means that just a single digit is captured in the 2nd group.
Also, you don't need 3 groups to capture the number. Just use this regex:
\d+
to capture a number.
Code:
String line = "This order was placed for QT3000! OK?";
String pattern = "\\d+";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find( )) {
System.out.println("Found value: " + m.group(0) );
}else {
System.out.println("NO MATCH");
}
Output:
Found value: 3000
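If you do want to keep the three groups, another option (not mentioned above, so treat it as a suggestion) is to make the first .* lazy. A minimal sketch comparing the two; LazyDigits and groups are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDigits {
    // Hypothetical helper: returns groups 1..n of the first match, or an empty list.
    static List<String> groups(String regex, String line) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(line);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Greedy first group: group 2 ends up with only the last digit, "0".
        System.out.println(groups("(.*)(\\d+)(.*)", line));
        // Lazy first group: group 2 gets the whole number, "3000".
        System.out.println(groups("(.*?)(\\d+)(.*)", line));
    }
}
```

With (.*?) the first group gives up as early as possible, so \d+ starts at the first digit and, being greedy itself, takes all four.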

Regex not capturing matching in expected groups

I have been working on a requirement for which I need to create a regex for the following string:
startDate:[2016-10-12T12:23:23Z:2016-10-12T12:23:23Z]
There can be many variations of this string as follows:
startDate:[*;2016-10-12T12:23:23Z]
startDate:[2016-10-12T12:23:23Z;*]
startDate:[*;*]
startDate in the above expression is a key name, which can be anything (endDate, updateDate, etc.), so we can't hardcode it in the expression. The key name can be matched as any word, though: [a-zA-Z_0-9]*
I am using the following compiled pattern
Pattern.compile("([[a-zA-Z_0-9]*):(\\[[[\\*]|[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}[Z]];[[\\*]|[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}[Z]]\\]])");
The pattern matches but the groups created are not what I expect. I want the group surrounded by parenthesis below:
(startDate):([*:2016-10-12T12:23:23Z])
group1 = "startDate"
group2 = "[*;2016-10-12T12:23:23Z]"
Could you please help me with correct expression in Java and groups?
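You are using [ rather than ( to wrap the alternatives (i.e. the parts separated by |). Square brackets define a character class, so the | inside them is treated as a literal character rather than as alternation.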
For example, the following code works for me:
Pattern pattern = Pattern.compile("(\\w+):(\\[(\\*|\\d{4}):\\*\\])");
Matcher matcher = pattern.matcher(text);
if (matcher.matches()) {
for (int i = 0; i < matcher.groupCount() + 1; i++) {
System.out.println(i + ":" + matcher.group(i));
}
} else {
System.out.println("no match");
}
To simplify things I just use the year but I'm sure it'll work with the full timestamp string.
This expression captures more than you need in groups but you can make them 'non-capturing' using the (?: ) construct.
Notice in this that I simplified some of your regexp using the predefined character classes. See http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html for more details.
Here is a solution which uses your original regex, modified so that it actually returns the groups you want:
String content = "startDate:[2016-10-12T12:23:23Z:2016-10-12T12:23:23Z]";
Pattern pattern = Pattern.compile("([a-zA-Z_0-9]*):(\\[(?:\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z|\\*):(?:\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z|\\*)\\])");
Matcher matcher = pattern.matcher(content);
// find() must succeed before any group can be accessed
if (matcher.find()) {
System.out.println("group1 = " + matcher.group(1));
System.out.println("group2 = " + matcher.group(2));
}
Output:
group1 = startDate
group2 = [2016-10-12T12:23:23Z:2016-10-12T12:23:23Z]
This code has been tested on IntelliJ and appears to be working correctly.
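Since the question shows the two values separated by ":" in one place and by ";" in the others, here is a sketch that accepts either separator and covers the * variations. That both separators are valid is my assumption, and DateRangeParser and parse are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateRangeParser {
    // Timestamp such as 2016-10-12T12:23:23Z
    private static final String TS = "\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z";

    // Group 1: key name. Group 2: the bracketed range.
    // (?:...) groups the alternatives without creating extra capture groups.
    private static final Pattern RANGE = Pattern.compile(
            "(\\w+):(\\[(?:\\*|" + TS + ")[;:](?:\\*|" + TS + ")\\])");

    // Hypothetical helper: returns {key, range}, or null if the input does not match.
    static String[] parse(String input) {
        Matcher m = RANGE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("startDate:[*;2016-10-12T12:23:23Z]");
        System.out.println("group1 = " + parts[0]); // group1 = startDate
        System.out.println("group2 = " + parts[1]); // group2 = [*;2016-10-12T12:23:23Z]
    }
}
```

matches() is used instead of find() so that trailing garbage after the closing bracket is rejected rather than ignored.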

How to split a string which contains multiple key value pairs

I have a string:
Single line : Some text
Multi1: multi (Va1) Multi2 : multi (Va2) Multi3 : multi (Val3)
Dots....20/12/2013 (EOY)
and I am trying to retrieve all the key value pairs. My first attempt
(Single line|Multi[0-9]{1}|Dots)( *:? [.] *| *:? )(.)
seems to work but does not handle multiple key value pairs on one line. Is there any way to achieve this?
Try this:
String text = "Single line : Some text\r\n" +
"Multi1: multi (Va1) Multi2 : multi (Va2) Multi3 : multi (Val3)\r\n" +
"Dots....20/12/2013 (EOY)";
Pattern pattern = Pattern.compile("(\\p{Alnum}[\\p{Alnum}\\s/]+?)\\s?(:|\\.+)\\s?(\\p{Alnum}[\\p{Alnum}\\s/]+?)(?=($|\\()|(\\s\\())", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(1) + "-->" + matcher.group(3));
}
Output:
Single line-->Some text
Multi1-->multi
Multi2-->multi
Multi3-->multi
Dots-->20/12/2013
Explanation:
I am limiting the keys and values to "starts with alphanumeric",
"contains any number of alphanumerics, spaces or slashes".
I am limiting the separator to "optional space, :, optional space" or
"optional space, any number of consecutive dots, optional space".
I am using groups 1 and 3 to define the key and value in the
Pattern.
Group 2 is used to provide alternate separators as above.
Finally, the Pattern is delimited at the end, either with a new
line, or with an open round bracket, or, with a space followed by an
open round bracket.
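Note that the restriction on quantifiers applies to lookbehind groups (Java needs a bounded match length there), not to lookahead groups, so the repetition in the lookahead is not strictly necessary.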
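As far as java.util.regex is concerned, a lookahead may contain quantifiers. A quick sketch to check this, finding every word that is followed by optional whitespace and an opening bracket; LookaheadDemo and wordsBeforeParen are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // \s* inside the lookahead is a quantifier; the lookahead itself consumes nothing.
    private static final Pattern WORD_BEFORE_PAREN = Pattern.compile("\\w+(?=\\s*\\()");

    // Hypothetical helper: words directly followed by optional whitespace and '('.
    static List<String> wordsBeforeParen(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD_BEFORE_PAREN.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(wordsBeforeParen("Multi1: multi (Va1) Multi2 : multi (Va2)"));
        // prints [multi, multi]
    }
}
```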
You can use this pattern:
public static void main(String[] args) {
String s = "Single line : Some text\n"
+ "Multi1: multi (Va1) Multi2 : multi (Va2) "
+ "Multi3 : multi (Val3)\n"
+ "Dots....20/12/2013 (EOY)";
String wd = "[^\\s.:]+(?:[^\\S\\n]+[^\\s.:]+)*";
Pattern p = Pattern.compile("(?<key>" + wd + ")"
+ "\\s*(?::|\\.+)\\s*"
+ "(?<value>" + wd + "(?:\\s*\\([^)]+\\))?)"
+ "(?!\\s*:)(?=\\s|$)");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group("key")+"->"+m.group("value"));
}
}
I don't recall the exact syntax, but I think it's something like this:
while (matcher.find()) {
String match = matcher.group();
}
The goal here is to iterate over the current line and tell the matcher "while you are still finding stuff, return to me the string on this line that matched." Since you have multiple matches on the same line, it should keep pulling out findings for you. See the JavaDoc for Matcher for reference.
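For completeness, the same loop can be written as a stream on Java 9 or later, where Matcher.results() is available. A minimal sketch; AllMatches and allMatches are hypothetical names, and the pattern Multi\d is only a stand-in for whatever you are actually searching for:

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class AllMatches {
    // Hypothetical helper: every match of regex in line, in order (requires Java 9+).
    static List<String> allMatches(String regex, String line) {
        return Pattern.compile(regex)
                .matcher(line)
                .results()                    // Stream<MatchResult>, one per find()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String line = "Multi1: multi (Va1) Multi2 : multi (Va2) Multi3 : multi (Val3)";
        System.out.println(allMatches("Multi\\d", line)); // prints [Multi1, Multi2, Multi3]
    }
}
```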
This is sadly another reason why Java is really not well-suited for this sort of thing, and before anyone downmods me understand I say that as a criticism of the Java APIs here, not the language.

Retrieving Regex matched pattern

I need to retrieve the strings that match a regex pattern from a given input.
Let's say the pattern I need to match looks like this:
"http://mysite.com/<somerandomvalues>/images/<againsomerandomvalues>.jpg"
I created the following regex pattern for it:
http:\/\/.*\.mysite\.com\/.*\/images\/.*\.jpg
Can anybody illustrate how to retrieve all the matches for this regex using Java?
In Java you don't need to escape slashes, but you do need to escape literal dots:
String regex = "http://(.*)\\.mysite\\.com/(.*)/images/(.*)\\.jpg";
String url = "http://www.mysite.com/work/images/cat.jpg";
Pattern pattern = Pattern.compile (regex);
Matcher matcher = pattern.matcher (url);
if (matcher.matches ())
{
int n = matcher.groupCount ();
for (int i = 0; i <= n; ++i)
System.out.println (matcher.group (i));
}
Result (group 0 is the whole match):
http://www.mysite.com/work/images/cat.jpg
www
work
cat
A simple Java example (string_to_be_matched is the input text):
String my_regex = "http://.*\\.mysite\\.com/.*/images/.*\\.jpg";
Pattern pattern = Pattern.compile(my_regex);
Matcher matcher = pattern.matcher(string_to_be_matched);
// Iterate over all occurrences
while (matcher.find()) {
System.out.print("Start index: " + matcher.start());
System.out.print(" End index: " + matcher.end() + " ");
System.out.println(matcher.group());
}
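When the input contains several URLs, a greedy .* can run from the first URL into the last one. Here is a sketch that uses negated character classes instead, so that each match stays within one URL. It assumes the host is exactly mysite.com, as in the quoted example rather than the regex in the question; ImageUrls and findImageUrls are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageUrls {
    // [^/\s]+ matches one path segment: no slashes, no whitespace.
    private static final Pattern IMAGE_URL =
            Pattern.compile("http://mysite\\.com/[^/\\s]+/images/[^/\\s]+\\.jpg");

    // Hypothetical helper: every matching URL found in the text, in order.
    static List<String> findImageUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMAGE_URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "see http://mysite.com/abc/images/cat.jpg and "
                + "http://mysite.com/xyz/images/dog.jpg for details";
        findImageUrls(text).forEach(System.out::println);
    }
}
```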
In fact, it is not clear whether you want the whole matching string or only the groups.
Bogdan Emil Mariesan's answer can be reduced to
if ( matcher.matches () ) System.out.println(string_to_be_matched);
because you know the string matched and there are no groups.
IMHO, user unknown's answer is correct if you want to get the matched groups.
I just want to add (for others) that if you need a matched group, you can use the replaceFirst() method too. Note that the pattern has to cover the whole input, otherwise the part it doesn't match is left in the result:
String firstGroup = string.replaceFirst( "http://mysite\\.com/(.*)/images/.*", "$1" );
But the Pattern.compile approach performs better if there are two or more groups or if you need to do this multiple times (on the other hand, in programming contests, for example, replaceFirst() is faster to write).
