Pattern to match any character until boundary character - java

I have the string:
String myStr = "Operation=myMethod\nDataIn=A;B;C;D\nDataOut=X;Y;Z\n";
and I want to match DataIn.
I have the following code:
Pattern pattern = Pattern.compile("Operation=myMethod.*DataIn=(.*)?\n", Pattern.DOTALL);
Matcher matcher = pattern.matcher(myStr);
if (matcher.find()) {
return matcher.group(1);
}
The problem is that it is returning: "A;B;C;D\nDataOut=X;Y;Z\n"
I tried with the pattern: "Operation=myMethod.*DataIn=(.*?\n)"
It then returns "A;B;C;D\n". I don't want the final "\n" to be returned.

Replace (.*) in your regex with ([^\n]*) to match up to the line break. Note that \b is a zero-width word-boundary assertion rather than a character, so ([^\b]*) is not a way to match "until any boundary character".
Pattern pattern = Pattern.compile("Operation=myMethod.*DataIn=([^\\n]*)?\n", Pattern.DOTALL);
Matcher matcher = pattern.matcher(myStr);
if (matcher.find()) {
return matcher.group(1);
}
The [^...] construct is a negated character class: it matches any single character that isn't in the set.

You can use:
Pattern pattern = Pattern.compile("Operation=myMethod.*?DataIn=([^\\n]*)", Pattern.DOTALL);
This captures zero or more characters in group 1 and stops at the first \n, which is not included in the group.
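To make the negated-character-class approach easy to try, here is a minimal, self-contained sketch. The class name DataInExtractor and the method extractDataIn are made up for illustration and are not part of the original question; only java.util.regex is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class DataInExtractor {
    // .*? lazily skips ahead to the first "DataIn="; [^\n]* then captures
    // everything up to, but not including, the next line break.
    private static final Pattern DATA_IN =
            Pattern.compile("Operation=myMethod.*?DataIn=([^\\n]*)", Pattern.DOTALL);

    // Returns the DataIn value, or null when the input has no matching entry.
    static String extractDataIn(String input) {
        Matcher matcher = DATA_IN.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String myStr = "Operation=myMethod\nDataIn=A;B;C;D\nDataOut=X;Y;Z\n";
        System.out.println(extractDataIn(myStr)); // prints A;B;C;D
    }
}
```

Returning null for "no match" is just one possible choice; an Optional<String> would work equally well.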

Try using this:
PATTERN
(?<=DataIn=)(.+?)(?=\\n)
CODE
Pattern pattern = Pattern.compile("(?<=DataIn=)(.+?)(?=\\n)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(myStr);
if (matcher.find())
{
return matcher.group(1);
}
INPUT
Operation=myMethod\nDataIn=A;B;C;D\nDataOut=X;Y;Z\n
OUTPUT
A;B;C;D

Regex
(?s)(?<=DataIn=)(.+?)(?=\n)

Related

How To Match Repeating Sub-Patterns

Let's say I have a string:
String sentence = "My nieces are Cara:8 Sarah:9 Tara:10";
And I would like to find all their respective names and ages with the following pattern matcher:
String regex = "My\\s+nieces\\s+are((\\s+(\\S+):(\\d+))*)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(sentence);
I understand something like
matcher.find(0); // resets "pointer"
String niece = matcher.group(2);
String nieceName = matcher.group(3);
String nieceAge = matcher.group(4);
would give me my last niece (" Tara:10", "Tara", "10").
How would I collect all of my nieces instead of only the last, using only one regex/pattern?
I would like to avoid using split string.
Another idea is to use the \G anchor that matches where the previous match ended (or at start).
String regex = "(?:\\G(?!\\A)|My\\s+nieces\\s+are)\\s+(\\S+):(\\d+)";
Once My\s+nieces\s+are has matched, \G chains each following match from the end of the previous one.
(?!\A) is a negative lookahead that prevents \G from matching at the very start of the input (\A).
\s+(\S+):(\d+) uses two capturing groups to extract the name and the age.
Matcher m = Pattern.compile(regex).matcher(sentence);
while (m.find()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
}
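The \G-based loop above can be wrapped into a small runnable sketch that collects every niece rather than printing them. The class name NieceParser and the choice of a LinkedHashMap (to keep the nieces in input order) are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the \G pattern from the answer above.
class NieceParser {
    // Either start at the "My nieces are" prefix, or continue exactly where
    // the previous match ended (\G), but never at the start of the input.
    private static final Pattern NIECE =
            Pattern.compile("(?:\\G(?!\\A)|My\\s+nieces\\s+are)\\s+(\\S+):(\\d+)");

    // Collects name -> age pairs in the order they appear in the sentence.
    static Map<String, Integer> parse(String sentence) {
        Map<String, Integer> ages = new LinkedHashMap<>();
        Matcher m = NIECE.matcher(sentence);
        while (m.find()) {
            ages.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return ages;
    }

    public static void main(String[] args) {
        System.out.println(parse("My nieces are Cara:8 Sarah:9 Tara:10"));
        // prints {Cara=8, Sarah=9, Tara=10}
    }
}
```

Because the pairs must follow the prefix back-to-back, a Name:age pair in a sentence without the "My nieces are" prefix is not picked up.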
You can't iterate over repeating groups, but you can match each group individually, calling find() in a loop to get the details of each one. If they need to be back-to-back, you can iteratively bound your matcher to the last index, like this:
Matcher matcher = Pattern.compile("My\\s+nieces\\s+are").matcher(sentence);
if (matcher.find()) {
int boundary = matcher.end();
matcher = Pattern.compile("^\\s+(\\S+):(\\d+)").matcher(sentence);
while (matcher.region(boundary, sentence.length()).find()) {
System.out.println(matcher.group());
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
boundary = matcher.end();
}
}

Regex pattern not working properly on matcher

I have a string like this:
ben[0]='zc5u5';
icb[0]='M';
bild[0]='b1_413134.jpg';
ort[0]='Köln';kmm[0]=0.00074758603074103;alt[0]='18';
jti[0]=413134;
upd[0]='u41313486729.js';
jon[0]=0;
jco[0]=0;
jch[0]=0;
ben[1]='Oukg5';
icb[1]='M';
bild[1]='mannse.jpg';
jti[1]=412425;
upd[1]='u41242570092.js';
jon[1]=0;
jco[1]=0;
jch[1]=0;
ben[2]='Tester356';
icb[2]='M';
bild[2]='b1_247967.jpg';
I want to get the names from ben[]; for example, the first one would be zc5u5.
I do currently have this code:
Pattern pattern = Pattern.compile("(ben\\[\\d+\\]=').+?'");
Matcher matcher = pattern.matcher(string);
LinkedList<String> list = new LinkedList<String>();
// Loop through and find all matches and store them into the List
while(matcher.find()) {
list.add(matcher.group());
}
Unfortunately, the pattern matches the whole assignment instead of just the value, e.g. zc5u5. What am I doing wrong?
You need two groups if you want to capture the index and the value, and I would add support for optional white-space around the assignment (\\s*). Something like,
Pattern pattern = Pattern.compile("ben\\[(\\d+)\\]\\s*=\\s*'(.+)';");
Matcher matcher = pattern.matcher(string);
// Use find() in a loop: matches() only succeeds if the entire input matches the pattern.
while (matcher.find()) {
System.out.printf("index %s = %s%n", matcher.group(1), matcher.group(2));
}
You can use a regex like this:
ben\b.+'(.*?)'
Pattern pattern = Pattern.compile("ben\\b.*'(.*?)'");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) { // find() scans for each occurrence; matches() would require the whole input to match
System.out.println(matcher.group(1));
}
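For the original goal of collecting only the quoted values of ben[...], a capturing group plus group(1) is enough. The following is a minimal sketch under a few assumptions: the class name BenNames is invented, the values are assumed not to contain single quotes, and the sample input is shortened from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class BenNames {
    // \b keeps "ben" from matching inside a longer identifier;
    // [^']* keeps the capture inside a single pair of quotes.
    private static final Pattern BEN =
            Pattern.compile("\\bben\\[\\d+\\]\\s*=\\s*'([^']*)'");

    // Returns every value assigned to ben[...], in order of appearance.
    static List<String> extract(String source) {
        List<String> names = new ArrayList<>();
        Matcher matcher = BEN.matcher(source);
        while (matcher.find()) {
            names.add(matcher.group(1)); // group(1) is the value without the quotes
        }
        return names;
    }

    public static void main(String[] args) {
        String source = "ben[0]='zc5u5';\nicb[0]='M';\nben[1]='Oukg5';\n"
                + "bild[1]='mannse.jpg';\nben[2]='Tester356';\n";
        System.out.println(extract(source)); // prints [zc5u5, Oukg5, Tester356]
    }
}
```

The code in the question adds matcher.group(), which is the whole match including ben[0]='; switching to group(1) is the key change.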

Java Regex group matches spaces

I have this regex and my output seems to be matching each single space but the capturing group is only alpha chars. I must be missing something.
String regexstring = new String("1234567 Mike Peloso ");
Pattern pattern = Pattern.compile("[A-Za-z]*");
Matcher matcher = pattern.matcher(regexstring);
while(matcher.find())
{
System.out.println(Integer.toString(matcher.start()));
String someNumberStr = matcher.group();
System.out.println(someNumberStr);
}
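There is no capturing group in your pattern; the issue is the quantifier. The * quantifier matches the preceding element zero or more times, so [A-Za-z]* also succeeds with an empty match at every position that does not start a run of letters (each digit and each space), and you get output for all of them. Use the + quantifier (one or more times) instead: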
Pattern pattern = Pattern.compile("[A-Za-z]+");
And then print the match result:
while (matcher.find()) {
System.out.println(matcher.start());
System.out.println(matcher.group());
}
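The difference between the two quantifiers can be checked with a short sketch that collects every match, including empty ones. The class name QuantifierDemo and the helper allMatches are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing * and + on the string from the question.
class QuantifierDemo {
    // Collects every match that find() reports, empty matches included.
    static List<String> allMatches(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "1234567 Mike Peloso ";
        // With +, only runs of at least one letter are reported.
        System.out.println(allMatches("[A-Za-z]+", input)); // prints [Mike, Peloso]
        // With *, zero-length matches at non-letter positions are reported too,
        // so the list is much longer and mostly consists of empty strings.
        System.out.println(allMatches("[A-Za-z]*", input).size());
    }
}
```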

Extract value between brackets with RegExp

I am trying to extract the value between the brackets ( and ). So far I have only managed to check whether a value is present. Please help me extract it.
Pattern pattern;
pattern = Pattern.compile("\\b(.*\\b)");
Matcher matcher = pattern.matcher(node.toString());
if (matcher.find()){
System.out.println();// here I need to print value that I find between brackets
}
Escape the brackets in your regex:
Pattern pattern = Pattern.compile("\\((.*?)\\)");
Then you can do:
Matcher matcher = pattern.matcher(node.toString());
if (matcher.find()){
System.out.println( matcher.group(1) );
}
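If the input can contain more than one bracketed value, the same pattern can be applied in a loop. This is a minimal sketch: the class name ParenExtractor and the sample string are made up, since the question does not show what node.toString() returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class ParenExtractor {
    // \( and \) match literal brackets; (.*?) lazily captures what is between them.
    private static final Pattern PAREN = Pattern.compile("\\((.*?)\\)");

    // Returns the text of every "(...)" group, in order of appearance.
    static List<String> extract(String text) {
        List<String> values = new ArrayList<>();
        Matcher matcher = PAREN.matcher(text);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up input standing in for node.toString().
        System.out.println(extract("call(foo) and (bar baz)")); // prints [foo, bar baz]
    }
}
```

The lazy .*? matters here: a greedy .* would run from the first opening bracket to the last closing one.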

Java Regular expression to extract text from square bracket

How can I extract the text within square brackets if it contains only dots and no other special characters?
For example I want to extract "com.package.file" from
"ERR|appLogger|[Manager|Request]RequestFailed[com.package.file]uploading[com.file_upload]"
String s = "ERR|appLogger|[Manager|Request]RequestFailed[com.package.file]uploading[com.file_upload]";
Pattern pattern = Pattern.compile("\\[([A-Za-z0-9.]+)\\]");
Matcher m = pattern.matcher(s);
if (m.find()) {
System.out.println(m.group(1)); // com.package.file
}
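To see that the character class really rejects the other bracket groups, the pattern above can be run in a loop over the full line from the question. This is a sketch only; the class name DottedBracketExtractor is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class DottedBracketExtractor {
    // Only letters, digits and dots are allowed between the square brackets.
    private static final Pattern DOTTED = Pattern.compile("\\[([A-Za-z0-9.]+)\\]");

    // Returns every bracketed group that consists solely of letters, digits and dots.
    static List<String> extract(String line) {
        List<String> values = new ArrayList<>();
        Matcher m = DOTTED.matcher(line);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String s = "ERR|appLogger|[Manager|Request]RequestFailed"
                + "[com.package.file]uploading[com.file_upload]";
        // [Manager|Request] is skipped because of '|', [com.file_upload] because of '_'.
        System.out.println(extract(s)); // prints [com.package.file]
    }
}
```

Note that this class also accepts a group with no dot at all, such as [abc]; if at least one dot is required, the pattern would need to be tightened.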
Something along the lines of:
^\w+\|\w+\|\[\w+\|\w+\]\w+\[([\w\.]+)\]\w+\[[\w\.\_]+\]$
Would allow you to capture that.
Pattern pattern = Pattern.compile("^\\w+\\|\\w+\\|\\[\\w+\\|\\w+\\]\\w+\\[([\\w\\.]+)\\]\\w+\\[[\\w\\.\\_]+\\]$", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher("ERR|appLogger|[Manager|Request]RequestFailed[com.package.file]uploading[com.file_upload]");
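// group(1) is only available after a successful match; calling it first throws IllegalStateException.
if (matcher.matches()) {
System.out.println(matcher.group(1)); // com.package.file
}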
