RegExpr output incorrect - java

I am trying to get all of the output from a string that I want to match a pattern using matcher, however, I am not sure that either the string or my pattern isn't correct. I am trying to get (Server: switch) as the first pattern and so on and so forth after the newline, however, I am only getting the last three pattern as my output shows. My output is the following with the code following
found_m: Message: Mess
found_m: Token: null
found_m: Response: OK
Here is my code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatches {
public static void main( String args[] ) {
// String to be scanned to find the pattern.
String line = "Server: Switch\nMessage: Mess\nToken: null\nResponse: OK";
String pattern = "([\\w]+): ([^\\n]+)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find( )) {
while(m.find()) {
System.out.println("found_m: " + m.group());
}
}else {
System.out.println("NO MATCH");
}
}
}
Is my string line incorrect or my string pattern that I am not doing regexpr wrong?
Thanks in advance.

Your regex is almost correct.
Problem is that you're calling find twice: 1st time in if condition and then again in while.
You can use do-while loop instead:
if (m.find( )) {
do {
System.out.println("found_m: " + m.group());
} while(m.find());
} else {
System.out.println("NO MATCH");
}
For regex part you can use this with minor correction:
final String pattern = "(\\w+): ([^\\n]+)";
or if you don't need 2 capturing groups then use:
final String pattern = "\\w+: [^\\n]+";
As there is no need to use character class around \\w+

I'm not familiar with Java, but this regex pattern should work to capture every group and match.
([\w]+): (\w+)(?:(?:[\\][n])|$)
It basically states capture the word followed by the colon and space, then capture the next word before either the \n or the end of string.
Good luck.

Related

Print out the last match of a regex

I have this code:
String responseData = "http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
"http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
String pattern = ^(https://.*\.54325)$;
Pattern pr = Pattern.compile(pattern);
Matcher math = pr.matcher(responseData);
if (math.find()) {
// print the url
}
else {
System.out.println("No Math");
}
I want to print out the last string that starts with http and ends with .m3u8. How do I do this? I'm stuck. All help is appreciated.
The problem I have now is that when I find a math and what to print out the string, I get everything from responseData.
In case you need to get some substring at the end that is preceded by similar substrings, you need to make sure the regex engine has already consumed as many characters before your required match as possible.
Also, you have a ^ in your pattern that means beginning of a string. Thus, it starts matching from the very beginning.
You can achieve what you want with just lastIndexOf and substring:
System.out.println(str.substring(str.lastIndexOf("http://")));
Or, if you need a regex, you'll need to use
String pattern = ".*(http://.*?\\.m3u8)$";
and use math.group(1) to print the value.
Sample code:
import java.util.regex.*;
public class HelloWorld{
public static void main(String []args){
String str = "http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_0_av.m3u8" +
"EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2795000,RESOLUTION=1280x720,CODECS=avc1.64001f, mp4a.40.2" +
"http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8";
String rx = ".*(http://.*?\\.m3u8)$";
Pattern ptrn = Pattern.compile(rx);
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
Output:
http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8
Also tested on RegexPlanet

Regex to exclude word from matches java code

Maybe someone could help me. I'm trying to include within a java code a regex to match all strings except the ZZ78. I'd like to know what it's missing in the regex I have.
The input string is str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78"
and I'm trying with this regex (?:(?![ZZF8]).)* but if you test in http://regexpal.com/
this regex against the string, you'll see that is not working completely.
str = new String ("ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78");
Pattern pattern = Pattern.compile("(?:(?![ZZ78]).)*");
the matched strings should be
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Update:
Hello Avinash Raj and Chthonic Project. Thanks so much for your help and solutions provided.
I originally thougth in split method, but I was trying to avoid get empty strings as result
when for example the delimiter string is at the beginning or at the end of the main string.
Then, I thought that a regex could help me to extract all except "ZZ78", avoiding in this way
empty results in the output.
Below I show the code using split method (Chthonic´s) and regex (Avinash´s) both produce empty
string if the commented "if()" conditions are not used.
Does the use of those "if()" are the only way to not print empty strings? or could be the regex
tweaked a little bit to match not empty strings?
This is the code I have tested so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
System.out.println("########### Matches with Split ###########");
String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
for (String s : str.split("ZZ78")) {
//if ( !s.isEmpty() ) {
System.out.println("This is a match <<" + s + ">>");
//}
}
System.out.println("##########################################");
System.out.println("########### Matches with Regex ###########");
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
while(matcher.find()){
//if ( !matcher.group(1).isEmpty() ) {
System.out.println("This is a match <<" + matcher.group(1) + ">>");
//}
}
}
}
**and the output (without use the "if()´s"):**
########### Matches with Split ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
##########################################
########### Matches with Regex ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
This is a match <<>>
Thanks for help so far.
Thanks in advance
Update #2:
Excellent both of your answers and solutions. Now it works very nice. This is the final code I've tested with both solutions.
Many thanks again.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
System.out.println("########### Matches with Split ###########");
String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Arrays.stream(str.split("ZZ78")).filter(s -> !s.isEmpty()).forEach(System.out::println);
System.out.println("##########################################");
System.out.println("########### Matches with Regex ###########");
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
ArrayList<String> allMatches = new ArrayList<String>();
ArrayList<String> list = new ArrayList<String>();
while(matcher.find()){
allMatches.add(matcher.group(1));
}
for (String s1 : allMatches)
if (!s1.equals(""))
list.add(s1);
System.out.println(list);
}
}
And output:
########### Matches with Split ###########
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
##########################################
########### Matches with Regex ###########
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]
The easiest way to do this is as follows:
public static void main(String[] args) {
String str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
for (String s : str.split("ZZ78"))
System.out.println(s);
}
The output, as expected, is:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
If the pattern used to split the string is at the beginning (i.e. "ZZ78" in your example code), the first element returned will be an empty string, as you have already noted. To avoid that, all you need to do is filter the array. This is essentially the same as putting an if, but you can avoid the extra condition line this way. I would do this as follows (in Java 8):
String test_str = ...; // whatever string you want to test it with
Arrays.stream(str.split("ZZ78")).filter(s -> !s.isEmpty()).foreach(System.out::println);
You must need to remove the character class since [ZZ78] matches a single charcater from the given list. (?:(?!ZZ78).)* alone won't give the match you want. Consider this ab57cdZZ78 as an input string. At first this (?:(?!ZZ78).)* matches the string ab57cd, next it tries to match the following Z and check the condition (?!ZZ78) which means match any character but not of ZZ78. So it failes to match the following Z, next the regex engine moves on to the next character Z and checks this (?!ZZ78) condition. Because of the second Z isn't followed by Z78, this Z got matched by the regex engine.
String s = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
while(matcher.find()){
System.out.println(matcher.group(1));
}
Output:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Explanation:
((?:(?!ZZ78).)*) Capture any character but not of ZZ78 zero or more times.
(ZZ78|$) And also capture the following ZZ78 or the end of the line anchor into group 2.
Group index 1 contains single or group of characters other than ZZ78
Update:
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
ArrayList<String> allMatches = new ArrayList<String>();
ArrayList<String> list = new ArrayList<String>();
while(matcher.find()){
allMatches.add(matcher.group(1));
}
for (String s1 : allMatches)
if (!s1.equals(""))
list.add(s1);
System.out.println(list);
Output:
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]

regular expression java

I am trying to Take the content between Input, my pattern is not doing the right thing please help.
below is the sudocode:
s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("Input(.*?)");
Matcher m = pat.matcher(s);
if m.matches():
print m.group(..)
Required Output:
one
Two
Three
Use a lookahead for Input and use find in a loop, instead of matches:
Pattern pattern = Pattern.compile("Input(.*?)(?=Input|$)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
See it working online: ideone
But it's better to use split here:
String[] result = s.split("Input");
// You need to ignore the first element in result, because it is empty.
See it working online: ideone
this does not work, because m.matches is true if and only if the whole string is matched by the expression. You could go two ways:
Use s.split("Input") instead, it gives you an array of the substrings between occurences of "Input"
Use Matcher.find() and Matcher.group(int). But be aware that your current expression will match everything after the first occurence of "Input", so you should change your expression.
Greetings,
Jost
import java.util.regex.*;
public class Regex {
public static void main(String[] args) {
String s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("(Input) (\\w+)");
Matcher m = pat.matcher(s);
while( m.find() ) {
System.out.println( m.group(2) );
}
}
}

How do I process following PHP regex in Java?

How do I process following PHP regex in Java:
if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }
This line is part of following function, by the way:
function getheaders($req){
$r=$h=$o=null;
if(preg_match("/GET (.*) HTTP/" ,$req,$match)){ $r=$match[1]; }
if(preg_match("/Host: (.*)\r\n/" ,$req,$match)){ $h=$match[1]; }
if(preg_match("/Origin: (.*)\r\n/",$req,$match)){ $o=$match[1]; }
if(preg_match("/Sec-WebSocket-Key2: (.*)\r\n/",$req,$match)){ $key2=$match[1]; }
if(preg_match("/Sec-WebSocket-Key1: (.*)\r\n/",$req,$match)){ $key1=$match[1]; }
if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }
return array($r,$h,$o,$key1,$key2,$data);
}
Thanks in advance!
So far I have:
Matcher matcher = Pattern.compile("\r\n(.*?)\\$").matcher(req);
while(matcher.find()){
data = matcher.group(1);
}
I am sure, however, that this is wrong.
Ok guys, thanks for your answers, but they did not help yet. May I ask you to tell me, however, what this regex means:
if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }
I know, that if it does find a match with /\r\n(.*?)\$/ in the string $req, it will save the different kinds of mathces into the array $match. BUT: what is being matched here? And what's the difference between $match[0] and $match[1]? Maybe, if I understand this, I will be able to reconstruct the way to produce equal results in Java.
Thanks Jaroslav, but:
The string I am trying to process however (the last line of the handshake sent to me by Google Chrome, is:
Cookie: 34ad04df964553fb6017b93d35dccd5f=%7C34%7C36%7C37%7C40%7C41%7C42%7C43%7C44%7C45%7C46%7C47%7C48%7C49%7C50%7C52%7C53%7C54%7C55%7C56%7C57%7C58%7C59%7C60%7C61%7C62%7C63%7C64%7C65%7C66%7C67%7C68%7C69%7C70%7C71%7C72%7C73%7C74%7C75%7C76%7C77%7C78%7C79%7C80%7C81%7C82%7C83%7C84%7C85%7C86%7C87%7C88%7C89%7C90%7C91%7C92%7C93%7C94%7C95%7C96%7C97%7C98%7C99%7C100%7C101%7C102%7C103%7C104%7C105%7C106%7C107%7C108%7C109%7C110%7C111%7C112%7C113%7C114%7C115%7C116%7C117%7C118%7C119%7C120%7C121%7C122%7C123%7C124%7C125%7C126%7C127%7C128%7C129%7C130%7C131%7C132%7C133%7C134%7C135%7C136%7C137%7C138%7C139%7C%3B%7C%3B%7C%3B%7C%3B1%3B2%3B3%3B4%3B5%3B6%3B7%3B8%3B9%3B10%3B11%3B14%3B15%3B18%3B23%3B24%3B25%3B26%3B28%3B29%3B30%3B31%3B32%3B33%3B%7C
Hey guys, I just now realize what I have been asking was irrelevant :( But one answer has been right.
Use java.util.regex.Pattern look here for instructions for this class. Here is Regular Expressions Tutorial. And here is the example:
String p = "Host: (.*)\\r\\n";
String input = "Host: example.com\r\n";
Pattern pattern = Pattern.compile(p);
Matcher matcher = pattern.matcher(input);
if(matcher.matches()) {
String output = matcher.group(1);
System.out.println(output);
} else {
System.out.println("not found");
}
Note: Matcher.find matches subsequences, Matcher.matches matches entire region.
IMHO in your example \\$ at the end may cause a problem when your input is multiline and you parse it at once.
In Java there are more convenient methods for accessing headers. At client side this is HttpURLConnection.getHeaderField. At the server side there is HttpServletRequest.getHeader.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class SplitDemo2 {
private static final String REGEX = "/\\r\\n(.*?)\\$/";
private static final String INPUT = "/GET (.*) HTTP/";
public static void main(String[] args) {
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(INPUT); // get a matcher object
int count = 0;
while(m.find()) {
count++;
System.out.println("Match number "+count);
System.out.println("start(): "+m.start());
System.out.println("end(): "+m.end());
}
}
}
More info on regex http://download.oracle.com/javase/tutorial/essential/regex/matcher.html
Explanation of your regex
\r\n(.*?)\$
\r Carriage return character.
\n Line break character.
(.*?) A numbered capture group
\$ Matches a $ character.

getting a string with quotes

I have a string "Hello" hello (including the quotes) and i just want to get the Hello that has the quotes but without the quotes
i tried using regular expression but it never finds the quotes im guessing
String s = new String("string");
Pattern p = Pattern.compile("\"([^\"])\"");
Matcher m = p.matcher(n);
while (m.find()) {
s = m.group(1);
}
the while loop never gets executed, suggestions?
-- Moved the star inside the parenthesis for proper grouping ---
"\"([^\"]*)\""
Tested successfully with the code
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = new String("\"Hello\" hello");
Pattern p = Pattern.compile("\"([^\"]*)\"");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
which produced the expected output
Hello
-- Original post follows --
You don't match anything because your regex is written to only match quoted one character strings.
"\"([^\"])*\""
is closer to what you need. Note the star, it means zero or more of the preceeding expression. In this case the preceeding expression is "anything that lacks a double quote".
I suggest you try a String which has quotes in it if you want to find any. ;)
Try
String s = "start \"string\" end";
or
String s = "\"Hello\" hello";
You can simply use indexOf("\"") in this case.

Categories