Splitting string by new line with a condition - java

I am trying to split a String by \n only when it's not in my "action block".
Here is an example of a text:
message\n [testing](hover: actions!\nnew line!) more\nmessage
I want to split whenever the \n is not inside the [](this \n should be ignored) part. I made a regex for it, which you can see here: https://regex101.com/r/RpaQ2h/1/. In that example it seems to work correctly, so I followed up with an implementation in Java:
final List<String> lines = new ArrayList<>();
final Matcher matcher = NEW_LINE_ACTION.matcher(message);
String rest = message;
int start = 0;
while (matcher.find()) {
if (matcher.group("action") != null) continue;
final String before = message.substring(start, matcher.start());
if (!before.isEmpty()) lines.add(before.trim());
start = matcher.end();
rest = message.substring(start);
}
if (!rest.isEmpty()) lines.add(rest.trim());
return lines;
This should ignore any \n inside the pattern shown above. However, it never matches the "action" group: once the regex is moved to Java and a \n is present, the group never matches. I am confused as to why, since it worked perfectly on regex101.

Instead of checking whether the group is action, you can simply use a regex replacement with $1 (the first capture group): every action block is put back unchanged, and every \n outside an action block is replaced with nothing.
I also changed your regex to (?<action>\[[^\]]*]\([^)]*\))|(?<break>\\n): the negated class [^\]]* stops at the closing bracket without the character-by-character expansion of .*?, so it takes fewer steps. I did the same with [^)]*.
Working code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
final String regex = "(?<action>\\[[^\\]]*\\]\\([^)]*\\))|(?<break>\\\\n)";
final String string = "message\\n [testing test](hover: actions!\\nnew line!) more\\nmessage";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll("$1");
System.out.println(result);
}
}
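The replacement above removes the breaks rather than splitting on them. If a list of lines is what is needed, the same pattern can drive the loop from the question. The following is only a sketch: the class name SplitOutsideActions and the method split are made up for illustration, and it assumes, like the test string above, that the message contains a literal backslash followed by n. For real line feeds the break group would be written "\\n" in Java source instead of "\\\\n".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SplitOutsideActions {
    // Same alternation as in the answer: an action block, or a literal backslash-n.
    private static final Pattern NEW_LINE_ACTION =
            Pattern.compile("(?<action>\\[[^\\]]*\\]\\([^)]*\\))|(?<break>\\\\n)");

    static List<String> split(String message) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = NEW_LINE_ACTION.matcher(message);
        int start = 0;
        while (matcher.find()) {
            // An action block was matched: skip it as a whole, including any \n inside.
            if (matcher.group("break") == null) continue;
            String before = message.substring(start, matcher.start()).trim();
            if (!before.isEmpty()) lines.add(before);
            start = matcher.end();
        }
        // Whatever follows the last break is the final line.
        String rest = message.substring(start).trim();
        if (!rest.isEmpty()) lines.add(rest);
        return lines;
    }

    public static void main(String[] args) {
        String message = "message\\n [testing test](hover: actions!\\nnew line!) more\\nmessage";
        for (String line : split(message)) {
            System.out.println(line);
        }
    }
}
```

Because the action alternative comes first and consumes the whole block, a \n inside the parentheses is never seen by the break alternative.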

Related

matches.find() with replaceAll()

I am new to Java and I found a loop in existing code that seems like it should be an infinite loop (or otherwise have highly undesirable behavior) which actually works.
Can you explain what I'm missing? The reason I think it should be infinite is that according to the documentation here (https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#replaceAll-java.lang.String-) a call to replaceAll will reset the matcher (This method first resets this matcher. It then scans the input sequence...). So I thought the below code would do its replacement and then call find() again, which would start over at the beginning. And it would keep finding the same string, since as you can see the string is just getting wrapped in a tag.
In case it's not obvious, Pattern and Matcher are the classes in java.util.regex.
String aTagName = getSomeTagName();
String text = getSomeText();
Pattern pattern = getSomePattern();
Matcher matches = pattern.matcher(text);
while (matches.find()) {
text = matches.replaceAll(String.format("<%1$s> %2$s </%1$s>", aTagName, matches.group()));
}
Why is that not the case?
I share your suspicion that this code very likely does not do what was intended. replaceAll changes the matcher's state: it resets the matcher and then scans the whole string. The result is that the loop body runs only once, and the group found by that first find() is used as the replacement for every match.
String text = "abcdEfg";
Pattern pattern = Pattern.compile("[a-z]");
Matcher matches = pattern.matcher(text);
while (matches.find()) {
System.out.println(text); // abcdEfg
text = matches.replaceAll(matches.group());
System.out.println(text); // aaaaEaa
}
replaceAll does reset the matcher, but it then scans the string all the way to the end. The next find() resumes from the current state (which is the end, not the start), finds nothing, and the loop terminates.
One of the correct ways to iterate and replace for each group appropriately may be to use appendReplacement:
String text = "abcdEfg";
Pattern pattern = Pattern.compile("[a-z]");
Matcher matches = pattern.matcher(text);
StringBuffer sb = new StringBuffer();
while (matches.find()) {
matches.appendReplacement(sb, matches.group().toUpperCase());
System.out.println(sb); // a growing prefix of ABCDEFG
}
matches.appendTail(sb);
System.out.println(sb); // ABCDEFG
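Applied to the tag-wrapping loop from the question, the same appendReplacement technique could look like the sketch below. The class name TagWrapper and the method wrap are made up for illustration. Matcher.quoteReplacement is used so that a $ or \ in the matched text is not interpreted as a group reference, which is an extra precaution the original loop did not take.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating per-match replacement.
class TagWrapper {
    static String wrap(String text, String tag, Pattern pattern) {
        Matcher m = pattern.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Build the replacement from the current match, not from the first one.
            String replacement = String.format("<%1$s> %2$s </%1$s>", tag, m.group());
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap("is this", "b", Pattern.compile("is")));
    }
}
```

Unlike replaceAll inside a find() loop, each match here is wrapped using its own text, so patterns that can match different strings are handled correctly.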
The example below shows there is no reason to use a while loop if you are calling replaceAll. In both cases the output is
<b> is </b> th<b> is </b> a summer ? Th<b> is </b> <b> is </b> very hot summer. <b> is </b>n't it?
import java.util.regex.*;
public class Test {
public static void main(String[] args) {
String text = "is this a summer ? This is very hot summer. isn't it?";
String tag = "b";
String pattern = "is";
System.out.println(question(text,tag,pattern));
System.out.println(alt(text,tag,pattern));
}
public static String question(String text, String tag, String p) {
Pattern pattern = Pattern.compile(p);
Matcher matcher= pattern.matcher(text);
while (matcher.find()) {
text = matcher.replaceAll(
String.format("<%1$s> %2$s </%1$s>",
tag, matcher.group()));
}
return text;
}
public static String alt(String text, String tag, String p) {
Pattern pattern = Pattern.compile(p);
Matcher matcher= pattern.matcher(text);
if(matcher.find())
return matcher.replaceAll(
String.format("<%1$s> %2$s </%1$s>",
tag, matcher.group()));
else
return text;
}
}

regex to get two different words from a string in java

I will be getting a string such as app1(down) and app2(up).
The words in the brackets indicate the status of each app; they may be up or down.
Now I need to use a regex to get the statuses of the apps as a comma-separated string.
Example: I'll get app1(UP) and app2(DOWN)
Required result: UP,DOWN
It's easy using RegEx like this:
\\((.*?)\\)
String x = "app1(UP) and app2(DOWN)";
Matcher m = Pattern.compile("\\((.*?)\\)").matcher(x);
String tmp = "";
while(m.find()) {
tmp+=(m.group(1))+",";
}
System.out.println(tmp);
Output:
UP,DOWN,
Java 8: using StringJoiner
String x = "app1(UP) and app2(DOWN)";
Matcher m = Pattern.compile("\\((.*?)\\)").matcher(x);
StringJoiner sj = new StringJoiner(",");
while(m.find()) {
sj.add((m.group(1)));
}
System.out.print(sj.toString());
Output:
UP,DOWN
(Last , is removed)
import java.util.ArrayList;
import java.util.List;
import java.util.regex.*;
public class ValidateDemo
{
public static void main(String[] args)
{
String input = "ill get app1(UP) and app2(DOWN)";
Pattern p = Pattern.compile("app[0-9]+\\(([A-Z]+)\\)");
Matcher m = p.matcher(input);
List<String> found = new ArrayList<String>();
while (m.find())
{
found.add(m.group(1));
}
System.out.println(found.toString());
}
}
my first java script, have mercy
Consider this code:
private static final Pattern RX_MATCH_APP_STATUS = Pattern.compile("\\s*(?<name>[^(\\s]+)\\((?<status>[^(\\s]+)\\)");
final String input = "app1(UP) or app2(down) let's have also app-3(DOWN)";
final Matcher m = RX_MATCH_APP_STATUS.matcher(input);
while (m.find()) {
final String name = m.group("name");
final String status = m.group("status");
System.out.printf("%s:%s\n", name, status);
}
This plucks as many app status entries from the input line as are actually there and puts each app name and its status into its own variable. It's then up to you how to handle them (print them or whatever).
Plus: this still works if states other than UP and DOWN (like UNKNOWN) show up later.
Minus: it also matches bracketed text prefixed by a word that is not actually an app name, where the content of the brackets is not an app state.
Use this as the regex and test it on http://regexr.com/
UP|DOWN
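If only the two known states should be accepted, and lower-case input such as app1(down) should still produce upper-case output as in the question's required result, one possible sketch is the following. The class name AppStatus and the method statuses are made up for illustration, and restricting the match to UP and DOWN is an assumption about what counts as a valid state.

```java
import java.util.Locale;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, assuming UP and DOWN are the only valid states.
class AppStatus {
    // Alternation inside a group; CASE_INSENSITIVE also accepts "up" and "down".
    private static final Pattern STATUS =
            Pattern.compile("\\((UP|DOWN)\\)", Pattern.CASE_INSENSITIVE);

    static String statuses(String input) {
        StringJoiner sj = new StringJoiner(",");
        Matcher m = STATUS.matcher(input);
        while (m.find()) {
            sj.add(m.group(1).toUpperCase(Locale.ROOT));
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        System.out.println(statuses("app1(down) and app2(up)")); // DOWN,UP
    }
}
```

Bracketed text that is not a state, such as (see note), is simply skipped by this pattern.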

Print out the last match of a regex

I have this code:
String responseData = "http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
String pattern = "^(https://.*\\.54325)$";
Pattern pr = Pattern.compile(pattern);
Matcher math = pr.matcher(responseData);
if (math.find()) {
// print the url
}
else {
System.out.println("No match");
}
I want to print out the last string that starts with http and ends with .m3u8. How do I do this? I'm stuck. All help is appreciated.
The problem I have now is that when I find a match and want to print out the string, I get everything from responseData.
To get a substring at the end that is preceded by similar substrings, you need to make the regex engine consume as many characters as possible before the required match, for example with a leading greedy .* in the pattern.
Also, you have a ^ in your pattern that means beginning of a string. Thus, it starts matching from the very beginning.
You can achieve what you want with just lastIndexOf and substring:
System.out.println(str.substring(str.lastIndexOf("http://")));
Or, if you need a regex, you'll need to use
String pattern = ".*(http://.*?\\.m3u8)$";
and use math.group(1) to print the value.
Sample code:
import java.util.regex.*;
public class HelloWorld{
public static void main(String []args){
String str = "http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_0_av.m3u8" +
"EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2795000,RESOLUTION=1280x720,CODECS=avc1.64001f, mp4a.40.2" +
"http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8";
String rx = ".*(http://.*?\\.m3u8)$";
Pattern ptrn = Pattern.compile(rx);
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
Output:
http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8
Also tested on RegexPlanet
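Another way to reach the last match, sketched here under the assumption that every URL starts with http:// and ends with .m3u8, is to loop over all matches with find() and keep only the most recent one. The class name LastMatch, the method lastUrl, and the shortened host name in main are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: keeps the last of several matches.
class LastMatch {
    // Lazy quantifier so each match stops at the first ".m3u8" after "http://".
    private static final Pattern URL = Pattern.compile("http://.*?\\.m3u8");

    static String lastUrl(String data) {
        Matcher m = URL.matcher(data);
        String last = null;
        while (m.find()) {
            last = m.group(); // overwritten until the final match remains
        }
        return last; // null if nothing matched
    }

    public static void main(String[] args) {
        String data = "http://host/a/index_0_av.m3u8"
                + "EXT-X-STREAM-INF:PROGRAM-ID=1"
                + "http://host/a/index_6_av.m3u8";
        System.out.println(lastUrl(data)); // http://host/a/index_6_av.m3u8
    }
}
```

This avoids the leading .* and the end anchor, at the cost of visiting every match.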

Java RegExp can't get the result after evaluating pattern

Hi, I have been trying to learn regular expressions using Java. I am still at the beginning, and I wanted to start a little program that is given a string and outputs it split into syllables. This is what I have so far:
String mama = "mama";
Pattern vcv = Pattern.compile("([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
if (matcher.find()) {
// the result of this should be ma - ma
}
What I am trying to do is create a pattern that checks the letters of the given word and, if it finds a vowel/consonant/vowel sequence, adds a "-" like this: v-cv. How can I achieve this?
In the following example I matched the first vowel and used a positive lookahead for the next consonant-vowel group. Because the lookahead does not consume the consonant-vowel pair, I can split again if I have a vcvcv group.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
new Test().run();
}
private void run() {
String mama = "mama";
Pattern vcv =
Pattern.compile("([aeiou])(?=[bcdfghjklmnpqrstvwxyz][aeiou])");
Matcher matcher = vcv.matcher(mama);
System.out.println(matcher.replaceAll("$1-"));
String mamama = "mamama";
matcher = vcv.matcher(mamama);
System.out.println(matcher.replaceAll("$1-"));
}
}
Output:
ma-ma
ma-ma-ma
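Try
mama.replaceAll("([aeiou])([....][aeiou])", "$1-$2");
where [....] stands for the consonant class. String.replaceAll takes a regular expression, and in Java the replacement refers to groups as $1 and $2, not \1 and \2.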
With find(), your pattern can match anywhere in the string. With matches(), the pattern has to cover the whole string, so a bare vowel-consonant-vowel pattern only succeeds on a string that starts with a vowel. If you want matches() to ignore the beginning, use
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou])");
If you want to ignore the end too:
Pattern vcv = Pattern.compile (".*([aeiou][bcdfghjklmnpqrstvwxyz][aeiou]).*");
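Whether the leading and trailing .* are needed depends on which Matcher method is called. The sketch below, with the made-up class name AnchoringDemo, compares matches(), lookingAt(), and find() on the unmodified pattern from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the three matching methods.
class AnchoringDemo {
    static final Pattern VCV =
            Pattern.compile("[aeiou][bcdfghjklmnpqrstvwxyz][aeiou]");

    public static void main(String[] args) {
        // matches(): the whole input must be vowel-consonant-vowel.
        System.out.println(VCV.matcher("mama").matches());   // false
        // lookingAt(): the match must begin at the start of the input.
        System.out.println(VCV.matcher("mama").lookingAt()); // false
        // find(): the match may begin anywhere ("ama" is found here).
        System.out.println(VCV.matcher("mama").find());      // true
        System.out.println(VCV.matcher("ama").matches());    // true
    }
}
```

With find() the pattern needs no .* wrapping at all.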

getting a string with quotes

I have a string "Hello" hello (including the quotes) and I just want to get the Hello that is inside the quotes, without the quotes.
I tried using a regular expression, but I'm guessing it never finds the quotes.
String s = new String("string");
Pattern p = Pattern.compile("\"([^\"])\"");
Matcher m = p.matcher(n);
while (m.find()) {
s = m.group(1);
}
The while loop never gets executed. Any suggestions?
-- Moved the star inside the parenthesis for proper grouping ---
"\"([^\"]*)\""
Tested successfully with the code
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = new String("\"Hello\" hello");
Pattern p = Pattern.compile("\"([^\"]*)\"");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
which produced the expected output
Hello
-- Original post follows --
You don't match anything because your regex is written to only match quoted one character strings.
"\"([^\"])*\""
is closer to what you need. Note the star: it means zero or more of the preceding expression. In this case the preceding expression is "anything that is not a double quote".
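Where the star sits changes what group 1 holds. The sketch below, with the made-up class name StarPlacement and helper firstGroup, runs the three variants discussed in this answer against the same input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the three patterns.
class StarPlacement {
    // Returns group 1 of the first match, or null if there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "\"Hello\" hello";
        // No star: only a single quoted character can match.
        System.out.println(firstGroup("\"([^\"])\"", input));  // null
        // Star outside the group: the group keeps only its last repetition.
        System.out.println(firstGroup("\"([^\"])*\"", input)); // o
        // Star inside the group: the group captures the whole quoted text.
        System.out.println(firstGroup("\"([^\"]*)\"", input)); // Hello
    }
}
```

This is why the star was moved inside the parentheses in the edited version.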
I suggest you try a String which has quotes in it if you want to find any. ;)
Try
String s = "start \"string\" end";
or
String s = "\"Hello\" hello";
You can simply use indexOf("\"") in this case.
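A minimal sketch of the indexOf approach, assuming only the first quoted section is wanted. The class name QuotedText and the method firstQuoted are made up for illustration, and returning null when there is no complete pair of quotes is an arbitrary choice.

```java
// Hypothetical helper class: extracts the first quoted section without a regex.
class QuotedText {
    static String firstQuoted(String s) {
        int open = s.indexOf('"');
        if (open < 0) return null;            // no opening quote
        int close = s.indexOf('"', open + 1); // search after the opening quote
        if (close < 0) return null;           // no closing quote
        return s.substring(open + 1, close);
    }

    public static void main(String[] args) {
        System.out.println(firstQuoted("\"Hello\" hello")); // Hello
    }
}
```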
