Regex match a string till " and not stop at \"


I have a String to be checked for regex :
"field":"Testing, for something \"and something\""
which I want to pattern match and replace with :
"field":"SAFE"
For this, I am trying to match and capture everything up to the last inverted comma. I have tried the following regex, but it's not matching:
Pattern p = Pattern.compile("\"field\":\".*?(?!\\\")\"");
New to regex, can anyone suggest what I might be doing wrong? Thanks!
EDIT :
I guess the question was not clear. Apologies. The above is not the end of the string. It can contain more fields in succession :
"field":"Testing, for something \"and something\"", "new_field":"blahblah", ...
output should be :
"field":"SAFE", "new_field":"blahblah", ...

You can do it as follows:
public class Testing {
    public static void main(String[] args) {
        String str = "\"field\":\"Testing, for something \\\"and something\\\"\"";
        str = str.replaceAll("(\"field\":).*", "$1\"SAFE\"");
        System.out.println(str);
    }
}
Output:
"field":"SAFE"
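Explanation:
(\"field\":) is the first capturing group; it matches the literal text "field":
.* matches every remaining character up to the end of the line
$1 in the replacement refers to the text matched by the first capturing group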
Update:
Writing this update based on the clarification from OP.
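You can use a positive lookahead for the comma, as shown below. Note that .* is greedy, so it runs up to the last comma on the line, which is the right one for this two-field sample: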
public class Testing {
    public static void main(String[] args) {
        String str = "\"field\":\"Testing, for something \\\"and something\\\"\", \"new_field\":\"blahblah\"";
        str = str.replaceAll("(\"field\":).*(?=,)", "$1\"SAFE\"");
        System.out.println(str);
    }
}
Output:
"field":"SAFE", "new_field":"blahblah"

Here is an example.
$str = '"field":"Testing, for something \"and something\""';
echo preg_replace('/(\"field\":\")(.*)(\")/i', "$1SAFE$3", $str);
Regex is tested: here.
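Both approaches above rely on a greedy .*, so the result depends on what else follows the field on the same line. Here is a minimal sketch of an alternative, assuming the value is a double-quoted string in which a backslash escapes the next character. The names EscapedQuoteDemo and mask are made up for illustration. The idea is to consume the value one unit at a time, where a unit is either a character that is neither a quote nor a backslash, or a backslash followed by any character. The match then ends at the first unescaped quote, however many fields follow.

```java
import java.util.regex.Pattern;

public class EscapedQuoteDemo {
    // Regex as the engine sees it:  "field":"(?:[^"\\]|\\.)*"
    //   [^"\\]  one character that is neither a quote nor a backslash
    //   \\.     a backslash followed by any character (an escaped character)
    private static final Pattern FIELD =
            Pattern.compile("\"field\":\"(?:[^\"\\\\]|\\\\.)*\"");

    // Replaces the whole quoted value of "field" with SAFE (hypothetical helper).
    static String mask(String input) {
        return FIELD.matcher(input).replaceAll("\"field\":\"SAFE\"");
    }

    public static void main(String[] args) {
        String str = "\"field\":\"Testing, for something \\\"and something\\\"\", "
                + "\"new_field\":\"blahblah\", \"other\":\"a, b\"";
        System.out.println(mask(str));
    }
}
```

Because the pattern begins with the opening quote of "field", a key such as "new_field" is left alone.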

Related

Extract regex internal values in Java

Given this text:
$F{abc} and $F{def}
I need to get
abc and def
For that, I would use the regex \$F\{\w*\} to find the values, but I need to get the part matched by \w*:
str.replaceAll("\\$F\\{\\w*\\}", "??" );
Is this doable with a Java function, or do I need to write the routine myself?

You can capture the text in a group:
str = str.replaceAll("\\$F\\{(\\w*)}", "$1");

Update
I didn't read your question completely. Thanks to The fourth bird for pointing it out. Given below is the code for the expected output, abc and def
public class Main {
    public static void main(String[] args) {
        String str = "$F{abc} and $F{def}";
        System.out.println(str.replaceAll("\\$F\\{(\\w*)\\}", "$1"));
    }
}
Output:
abc and def
All you need to do is to replace the given string with the capturing group(1). The explanation in the original answer is still valid for this update.
Original answer:
You can use the regex, (\$F\{)(\w*)(\}) and replace the capturing group(2) with ?? and preserve the capturing group(1) and the capturing group(3) as shown below:
public class Main {
    public static void main(String[] args) {
        String str = "$F{abc} and $F{def}";
        System.out.println(str.replaceAll("(\\$F\\{)(\\w*)(\\})", "$1??$3"));
    }
}
Output:
$F{??} and $F{??}
Check this for an illustration of all the capturing groups in the regex, (\$F\{)(\w*)(\}).

One way is to use the regex (\\$F\\{)|(\\}) with replaceAll() to remove both the "$F{" and "}" parts:
String str = "$F{abc} and $F{def}";
str = str.replaceAll("(\\$F\\{)|(\\})", "");
About regex:
(\\$F\\{) : group 1, matches "$F{"
| : OR
(\\}) : group 2, matches "}"
Another way is to use a capture group reference in the replacement parameter. $1 stands for the capture group (\\w*), which matches abc and def:
String str = "$F{abc} and $F{def}";
str = str.replaceAll("\\$F\\{(\\w*)\\}", "$1" );
System.out.println(str);
Output:
abc and def
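If the values are needed individually rather than as one replaced string, the same capture group can be read with Matcher.find(). The sketch below assumes that goal; the class name PlaceholderNames and the method extract are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderNames {
    // Group 1 is the \w* part between "$F{" and "}".
    private static final Pattern FIELD = Pattern.compile("\\$F\\{(\\w*)\\}");

    // Collects the text of group 1 for every $F{...} occurrence, in order.
    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(extract("$F{abc} and $F{def}"));
    }
}
```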

Split a string using split method

I have tried to split a string using the split method, but I'm not getting the result I expect.
String str="1-DRYBEANS,2-PLAINRICE,3-COLDCEREAL,4-HOTCEREAL,51-ASSORTEDETHNIC,GOURMET&SPECIALTY";
List<String> zoneArray = new ArrayList<>(Arrays.asList(str.split(",")));
Actual output :
zoneArray = {"1-DRYBEANS","2-PLAINRICE","3-COLDCEREAL","4-HOTCEREAL","51-ASSORTEDETHNIC","GOURMET&SPECIALTY"}
Expected output :
zoneArray = {"1-DRYBEANS","2-PLAINRICE","3-COLDCEREAL","4-HOTCEREAL","51-ASSORTEDETHNIC,GOURMET&SPECIALTY"}
Any help would be appreciated.

Use split(",(?=[0-9])")
You are not just splitting by comma, but splitting by comma only if it is followed by a digit from 0-9. This is also known as positive lookahead (?=).
Take a look at this code snippet for example:
public class Main {
    public static void main(String[] args) {
        String str = "1-DRYBEANS,2-PLAINRICE,3-COLDCEREAL,4-HOTCEREAL,51-ASSORTEDETHNIC,GOURMET&SPECIALTY";
        // Split on a comma only when the next character is a digit
        String[] array1 = str.split(",(?=[0-9])");
        for (String temp : array1) {
            System.out.println(temp);
        }
    }
}

Use a look-ahead in your regex: match a comma (outside the look-ahead) that is followed by a number (inside the look-ahead). \\d+ will suffice for the number. The regex can look like:
String regex = ",(?=\\d+)";
For example:
public class Foo {
    public static void main(String[] args) {
        String str = "1-DRYBEANS,2-PLAINRICE,3-COLDCEREAL,4-HOTCEREAL,51-ASSORTEDETHNIC,GOURMET&SPECIALTY";
        String regex = ",(?=\\d+)";
        String[] tokens = str.split(regex);
        for (String item : tokens) {
            System.out.println(item);
        }
    }
}
This splits on a comma that is followed by digits but does not remove those digits from the output, since they are only part of the look-ahead.
For more on look-ahead, look-behind and look-around, please check out this relevant tutorial page.
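To illustrate why the look-ahead matters, here is a small sketch comparing it with a pattern that consumes the digit. The class and method names are made up for this example, and the input is a shortened stand-in for the question's string.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {
    // The digit is only inspected, so it stays in the following token.
    static String[] splitKeepingDigit(String s) {
        return s.split(",(?=\\d)");
    }

    // The digit is part of the delimiter, so it is removed from the output.
    static String[] splitConsumingDigit(String s) {
        return s.split(",\\d");
    }

    public static void main(String[] args) {
        String str = "1-A,2-B,31-C,D";
        System.out.println(Arrays.toString(splitKeepingDigit(str)));
        System.out.println(Arrays.toString(splitConsumingDigit(str)));
    }
}
```

With the look-ahead the tokens are 1-A, 2-B and 31-C,D. Without it, the first digit after each split comma is lost, giving 1-A, -B and 1-C,D.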

Get Sub-string from String with specific Pattern in JAVA

I have the following input:
8=FIX.4.2|9=00394|35=8|49=FIRST|8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|
Now I want the strings that start with "8=???" and end with "10=???|". You can see above that there are exactly two such strings. I have written a program for this.
Below is my code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    static Pattern r = Pattern.compile("(.*?)(8=\\w\\w\\w)[\\s\\S]*?(10=\\w\\w\\w)");
    public static void main(String[] args) {
        String str = "8=FIX.4.2|9=00394|35=8|49=FIRST|8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|";
        match(str);
    }
    public static void match(String message) { //send to OMS
        Matcher m = r.matcher(message);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
When I run this I get the wrong output:
8=FIX.4.2|9=00394|35=8|49=FIRST|8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|
8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|
You can see that the first string in the output contains "8=???" twice, but the output needs to be:
8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|
8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|
I also want the unmatched strings separately, as there is further work to do with them. How can I get that? So, the total output needs to be:
Matched : 8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|
Matched : 8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|
UnMatched : 8=FIX.4.2|9=00394|35=8|49=FIRST|

You need to use a tempered greedy token to match the shortest window possible between 2 strings. That will solve the first problem. To get unmatched strings, just split the string with the pattern.
Use
\b8=\w{3}(?:(?!8=\w{3})[\s\S])*?10=\w{3}\|
Details
\b - a word boundary
8= - a literal substring
\w{3} - 3 word chars
(?:(?!8=\w{3})[\s\S])*? - a tempered greedy token that matches any char ([\s\S]), zero or more times, as few as possible, as long as that char does not start an 8= followed by 3 word chars
10= - a literal substring
\w{3} - 3 word chars
\| - a literal |.
Java code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static Pattern r = Pattern.compile("\\b8=\\w{3}(?:(?!8=\\w{3})[\\s\\S])*?10=\\w{3}\\|");
    public static void main(String[] args) {
        String str = "8=FIX.4.2|9=00394|35=8|49=FIRST|8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|";
        match(str);
    }
    public static void match(String message) { //send to OMS
        Matcher m = r.matcher(message);
        System.out.println("MATCHED:");
        while (m.find()) {
            System.out.println(m.group());
        }
        System.out.println("UNMATCHED:");
        // Splitting on the same pattern leaves only the text between matches
        String[] unm = r.split(message);
        for (String s : unm) {
            System.out.println(s);
        }
    }
}
Results:
MATCHED:
8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|
8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|
UNMATCHED:
8=FIX.4.2|9=00394|35=8|49=FIRST|
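Pattern.split returns the unmatched text as array elements, and in general it can also yield empty strings where two matches sit directly next to each other in the middle of the input. Here is a sketch of a single-pass alternative that uses the same regex. The names FixSegments and partition are hypothetical. It walks the matches with Matcher.find() and treats any gap between the end of one match and the start of the next as unmatched text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixSegments {
    private static final Pattern MESSAGE =
            Pattern.compile("\\b8=\\w{3}(?:(?!8=\\w{3})[\\s\\S])*?10=\\w{3}\\|");

    // Fills 'matched' with complete messages and 'unmatched' with the non-empty gaps.
    static void partition(String input, List<String> matched, List<String> unmatched) {
        Matcher m = MESSAGE.matcher(input);
        int last = 0; // end index of the previous match
        while (m.find()) {
            if (m.start() > last) {
                unmatched.add(input.substring(last, m.start()));
            }
            matched.add(m.group());
            last = m.end();
        }
        if (last < input.length()) {
            unmatched.add(input.substring(last)); // trailing text after the last match
        }
    }

    public static void main(String[] args) {
        String str = "8=FIX.4.2|9=00394|35=8|49=FIRST|8=FIX.4.2|9=00394|35=8|56=MIDDLE|10=245|"
                + "8=FIX.4.2|9=00394|35=8|49=LAST|56=HEMADTS|10=024|";
        List<String> matched = new ArrayList<>();
        List<String> unmatched = new ArrayList<>();
        partition(str, matched, unmatched);
        matched.forEach(s -> System.out.println("Matched : " + s));
        unmatched.forEach(s -> System.out.println("UnMatched : " + s));
    }
}
```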

How to split the string after dot and print it to next line

String string = "This is a example.just to verify.please help me.";
if (string.matches("(.*).(.*)")) {
    System.out.println(true);
    String[] parts = string.split("\\r?\\n");
    for (String part : parts) {
        System.out.println(part);
    }
}
I want to split the string after every dot and print each part on its own line. Can anyone help me with this? Thanks in advance.

Use the regex "\\." as the split pattern:
public static void main(String[] args) {
    String string = "This is a example.just to verify.please help me.";
    if (string.matches("(.*).(.*)")) {
        System.out.println(true);
        String[] parts = string.split("\\.");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
Output:
true
This is a example
just to verify
please help me

Use a positive lookbehind. Also, in the matches call you need to escape the dot, as in string.matches(".*\\..*"), since an unescaped dot is a regex special character that matches any character.
String[] parts = string.split("(?<=\\.)");
Or, if you don't want to split after the last dot:
String[] parts = string.split("(?<=\\.)(?!$)");
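The difference between splitting on the dot and splitting after it can be sketched as follows. The class and method names are invented for this example: splitting on "\\." consumes the dots, while the zero-width lookbehind keeps each dot at the end of its part.

```java
public class DotSplitDemo {
    // The dot is the delimiter, so it disappears from the parts.
    static String[] dropDots(String s) {
        return s.split("\\.");
    }

    // Zero-width split point right after each dot, so the dots are kept.
    static String[] keepDots(String s) {
        return s.split("(?<=\\.)");
    }

    public static void main(String[] args) {
        String string = "This is a example.just to verify.please help me.";
        // Print each sentence, dot included, on its own line
        System.out.println(String.join("\n", keepDots(string)));
    }
}
```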

Regex to exclude word from matches java code

Maybe someone could help me. I'm trying to include a regex in Java code that matches all strings except ZZ78. I'd like to know what is missing in the regex I have.
The input string is str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78"
and I'm trying this regex, (?:(?![ZZ78]).)*, but if you test it against the string at http://regexpal.com/
you'll see that it does not work completely.
str = new String ("ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78");
Pattern pattern = Pattern.compile("(?:(?![ZZ78]).)*");
the matched strings should be
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Update:
Hello Avinash Raj and Chthonic Project. Thanks so much for your help and the solutions provided.
I originally thought of the split method, but I was trying to avoid getting empty strings in the result when, for example, the delimiter string is at the beginning or at the end of the main string. Then I thought that a regex could help me extract everything except "ZZ78" and so avoid empty results in the output.
Below I show the code using the split method (Chthonic's) and the regex (Avinash's). Both produce empty strings if the commented-out "if()" conditions are not used.
Is using those "if()" conditions the only way to avoid printing empty strings, or could the regex be tweaked a little to match only non-empty strings?
This is the code I have tested so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
    public static void main(String[] args) {
        System.out.println("########### Matches with Split ###########");
        String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        for (String s : str.split("ZZ78")) {
            //if ( !s.isEmpty() ) {
            System.out.println("This is a match <<" + s + ">>");
            //}
        }
        System.out.println("##########################################");
        System.out.println("########### Matches with Regex ###########");
        String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
        Matcher matcher = regex.matcher(s);
        while (matcher.find()) {
            //if ( !matcher.group(1).isEmpty() ) {
            System.out.println("This is a match <<" + matcher.group(1) + ">>");
            //}
        }
    }
}
And the output (without using the "if()" conditions):
########### Matches with Split ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
##########################################
########### Matches with Regex ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
This is a match <<>>
Thanks for help so far.
Thanks in advance
Update #2:
Excellent both of your answers and solutions. Now it works very nice. This is the final code I've tested with both solutions.
Many thanks again.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
    public static void main(String[] args) {
        System.out.println("########### Matches with Split ###########");
        String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        Arrays.stream(str.split("ZZ78")).filter(s -> !s.isEmpty()).forEach(System.out::println);
        System.out.println("##########################################");
        System.out.println("########### Matches with Regex ###########");
        String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
        Matcher matcher = regex.matcher(s);
        ArrayList<String> allMatches = new ArrayList<String>();
        ArrayList<String> list = new ArrayList<String>();
        while (matcher.find()) {
            allMatches.add(matcher.group(1));
        }
        for (String s1 : allMatches)
            if (!s1.equals(""))
                list.add(s1);
        System.out.println(list);
    }
}
And output:
########### Matches with Split ###########
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
##########################################
########### Matches with Regex ###########
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]

The easiest way to do this is as follows:
public static void main(String[] args) {
    String str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
    for (String s : str.split("ZZ78"))
        System.out.println(s);
}
The output, as expected, is:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
If the pattern used to split the string is at the beginning (i.e. "ZZ78" in your example code), the first element returned will be an empty string, as you have already noted. To avoid that, all you need to do is filter the array. This is essentially the same as adding an if, but it avoids the extra condition line. I would do this as follows (in Java 8):
String str = ...; // whatever string you want to test it with
Arrays.stream(str.split("ZZ78")).filter(s -> !s.isEmpty()).forEach(System.out::println);

You need to remove the character class, since [ZZ78] matches a single character from the given list. (?:(?!ZZ78).)* alone won't give the matches you want. Consider ab57cdZZ78 as the input string. First, (?:(?!ZZ78).)* matches ab57cd. It then tries to match the following Z, but the condition (?!ZZ78), which asserts that the text at the current position is not ZZ78, fails there, so it cannot match that Z. The regex engine then moves on to the next character, the second Z, and checks (?!ZZ78) again. Because the second Z is not followed by Z78, this Z is matched by the regex engine, which is not what you want.
String s = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Explanation:
((?:(?!ZZ78).)*) captures, zero or more times, any character that is not the start of ZZ78.
(ZZ78|$) captures the following ZZ78, or matches the end-of-line anchor, as group 2.
Group 1 therefore contains the run of characters between delimiters (possibly empty).
Update:
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
ArrayList<String> allMatches = new ArrayList<String>();
ArrayList<String> list = new ArrayList<String>();
while (matcher.find()) {
    allMatches.add(matcher.group(1));
}
for (String s1 : allMatches)
    if (!s1.equals(""))
        list.add(s1);
System.out.println(list);
Output:
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]
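On the follow-up question of whether the regex itself can skip empty strings, one possible tweak is sketched below. It is not taken from the answers above, and the names NonEmptySegments and segments are hypothetical. It requires at least one character (+ instead of *) and adds a lookbehind so that a match can only begin at the start of the input or right after a ZZ78 delimiter. Without that lookbehind, a match could start in the middle of a delimiter. The sketch assumes single-line input, since . does not match line terminators by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonEmptySegments {
    // (?<=^|ZZ78)    the match must begin at the input start or just after ZZ78
    // (?:(?!ZZ78).)+ one or more characters, none of which begins ZZ78
    private static final Pattern SEGMENT = Pattern.compile("(?<=^|ZZ78)(?:(?!ZZ78).)+");

    // Returns the non-empty runs of text between ZZ78 delimiters.
    static List<String> segments(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = SEGMENT.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        System.out.println(segments(s));
    }
}
```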
