regular expression java - java

I am trying to Take the content between Input, my pattern is not doing the right thing please help.
below is the sudocode:
s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("Input(.*?)");
Matcher m = pat.matcher(s);
if m.matches():
print m.group(..)
Required Output:
one
Two
Three

Use a lookahead for Input and use find in a loop, instead of matches:
Pattern pattern = Pattern.compile("Input(.*?)(?=Input|$)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
See it working online: ideone
But it's better to use split here:
String[] result = s.split("Input");
// You need to ignore the first element in result, because it is empty.
See it working online: ideone

this does not work, because m.matches is true if and only if the whole string is matched by the expression. You could go two ways:
Use s.split("Input") instead, it gives you an array of the substrings between occurences of "Input"
Use Matcher.find() and Matcher.group(int). But be aware that your current expression will match everything after the first occurence of "Input", so you should change your expression.
Greetings,
Jost

import java.util.regex.*;
public class Regex {
public static void main(String[] args) {
String s="Input one Input Two Input Three";
Pattern pat = Pattern.compile("(Input) (\\w+)");
Matcher m = pat.matcher(s);
while( m.find() ) {
System.out.println( m.group(2) );
}
}
}

Related

Java replaceAll but the specified regex

Can't get my head around this for quite some time already. I have this piece of code:
getStringFromDom(doc).replaceAll("contract=\"\\d*\"|name=\"\\p{L}*\"", "");
Basically I need it to work literally the opposite way - to replace everything BUT the specified regex. I've been trying to do it with the negative lookahead to no avail.
For your particular task, I think
getStringFromDom(doc).replaceAll(".*?(contract=\"\\d*\"|name=\"\\p{L}*\").*", "$1");
should do what you need.
You want to remove everything that does not match the pattern. This is the same as simply filtering the pattern matches. Use the regex to find matches for that pattern, then collect the matches in a stringbuilder.
Matcher m = Pattern.compile(your pattern).matcher(your input);
StringBuilder sb = new StringBuilder();
while (m.find()) sb.append (m.group()).append('\n');
String result = sb.toString();
I also think that removing what your are not looking for is a double negative. Concentrate on what you are looking for and use a pattern matching for that. This example searches your document for any name attributes:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String input = "<AnotherDoc accNum=\"1111\" docDate=\"2017-09-26\" docNum=\"2222\" name=\"foo\"> <anotherTag>some date</anotherTag>";
Pattern pattern = Pattern.compile("name=\"[^\\\"]*\""); // value are all characters but "
Matcher matcher = pattern.matcher(input);
while (matcher.find())
System.out.println(matcher.group());
}
}
This prints:
name="foo"

How to figure out exact reason why Regex is failing in java

I have a Regex Pattern that i am using to match screen.
When i use it to test in Sublime Text, the same is working just fine.
but in Java execution, the code is failing
System.out.println(Pattern.matches("(B+)?|(R+)?", "RRBRR"));//false
System.out.println(Pattern.matches("(B+)?|(R+)?", "RRRRR"));//true
The above code should be coming as true in both cases, whereas in java it is coming as false.
my basic requirement is to identify groups of unique character in sequence...
meaning if String is
RRRRBBBRRBBBRBBBRRR
Then it should identify as
RRRR BBB RR BBB R BBB RRR
Please help...Thanks in advance
Try this:
String value = "RRRRBBBRRBBBRBBBRRR";
Pattern pattern = Pattern.compile("B+|R+");
Matcher matcher = pattern.matcher(value);
while (matcher.find()) {
System.out.println(matcher.group());
}
The fact that the first expression returns false is due to the fact that you have a B in a middle of several R so you don't have an exact match since your regular expression expect only Rs or Bs
matches adds an implicit ^ at the start & $ at the end which means substring matches wont work. find() will look for substring.
Matcher is best suited for this:
public static void main (String[] args) throws java.lang.Exception
{
String regex = "(B+)?|(R+)?";
Pattern pat = Pattern.compile(regex);
Matcher matcher = pat.matcher("RRBRR");
System.out.println(matcher.find());
int count = 0;
while(matcher.find()){
System.out.println(matcher.group());
count++;
}
System.out.println("Count:"+count);
}

Print out the last match of a regex

I have this code:
String responseData = "http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
"http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
String pattern = ^(https://.*\.54325)$;
Pattern pr = Pattern.compile(pattern);
Matcher math = pr.matcher(responseData);
if (math.find()) {
// print the url
}
else {
System.out.println("No Math");
}
I want to print out the last string that starts with http and ends with .m3u8. How do I do this? I'm stuck. All help is appreciated.
The problem I have now is that when I find a math and what to print out the string, I get everything from responseData.
In case you need to get some substring at the end that is preceded by similar substrings, you need to make sure the regex engine has already consumed as many characters before your required match as possible.
Also, you have a ^ in your pattern that means beginning of a string. Thus, it starts matching from the very beginning.
You can achieve what you want with just lastIndexOf and substring:
System.out.println(str.substring(str.lastIndexOf("http://")));
Or, if you need a regex, you'll need to use
String pattern = ".*(http://.*?\\.m3u8)$";
and use math.group(1) to print the value.
Sample code:
import java.util.regex.*;
public class HelloWorld{
public static void main(String []args){
String str = "http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_0_av.m3u8" +
"EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2795000,RESOLUTION=1280x720,CODECS=avc1.64001f, mp4a.40.2" +
"http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8";
String rx = ".*(http://.*?\\.m3u8)$";
Pattern ptrn = Pattern.compile(rx);
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
Output:
http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8
Also tested on RegexPlanet

Regex to replace a repeating string pattern

I need to replace a repeated pattern within a word with each basic construct unit. For example
I have the string "TATATATA" and I want to replace it with "TA". Also I would probably replace more than 2 repetitions to avoid replacing normal words.
I am trying to do it in Java with replaceAll method.
I think you want this (works for any length of the repeated string):
String result = source.replaceAll("(.+)\\1+", "$1")
Or alternatively, to prioritize shorter matches:
String result = source.replaceAll("(.+?)\\1+", "$1")
It matches first a group of letters, and then it again (using back-reference within the match pattern itself). I tried it and it seems to do the trick.
Example
String source = "HEY HEY duuuuuuude what'''s up? Trololololo yeye .0.0.0";
System.out.println(source.replaceAll("(.+?)\\1+", "$1"));
// HEY dude what's up? Trolo ye .0
You had better use a Pattern here than .replaceAll(). For instance:
private static final Pattern PATTERN
= Pattern.compile("\\b([A-Z]{2,}?)\\1+\\b");
//...
final Matcher m = PATTERN.matcher(input);
ret = m.replaceAll("$1");
edit: example:
public static void main(final String... args)
{
System.out.println("TATATA GHRGHRGHRGHR"
.replaceAll("\\b([A-Za-z]{2,}?)\\1+\\b", "$1"));
}
This prints:
TA GHR
Since you asked for a regex solution:
(\\w)(\\w)(\\1\\2){2,};
(\w)(\w): matches every pair of consecutive word characters ((.)(.) will catch every consecutive pair of characters of any type), storing them in capturing groups 1 and 2. (\\1\\2) matches anytime the characters in those groups are repeated again immediately afterward, and {2,} matches when it repeats two or more times ({2,10} would match when it repeats more than one but less than ten times).
String s = "hello TATATATA world";
Pattern p = Pattern.compile("(\\w)(\\w)(\\1\\2){2,}");
Matcher m = p.matcher(s);
while (m.find()) System.out.println(m.group());
//prints "TATATATA"

Java RegExp can't get the result ater evaluating pattern

Hi I have been trying to learn RegExpresions using Java I am still at the begining and I wanted to start a little program that is given a string and outputs the syllabels split.This is what I got so far:
String mama = "mama";
Pattern vcv = Pattern.compile("([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
if(matcher){
// the result of this should be ma - ma
}
What I am trying to do is create a pattern that checks the letters of the given word and if it finds a pattern that contains a vocale/consonant/vocale it will add a "-" like this v-cv .How can I achive this.
In the following example i matched the first vowel and used positive lookahead for the next consonant-vowel group. This is so i can split again if i have a vcvcv group.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
new Test().run();
}
private void run() {
String mama = "mama";
Pattern vcv =
Pattern.compile("([aeiou])(?=[bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
System.out.println(matcher.replaceAll("$1-"));
String mamama = "mamama";
matcher = vcv.matcher(mamama);
System.out.println(matcher.replaceAll("$1-"));
}
}
Output:
ma-ma
ma-ma-ma
try
mama.replaceAll('([aeiou])([....][aeiou])', '\1-\2');
replaceAll is a regular expression method
Your pattern only matches if the String starts with a vocal. If you want to find a substring, ignoring the beginning, use
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
If you like to ignore the end too:
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou]).*");

Categories