I need a regular expression to replace 3rd matching substring - java

Example
input: abc def abc abc pqr
I want to replace the third occurrence of abc with xyz.
output: abc def abc xyz pqr
Thanks in advance

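One way to do this is to split the string on spaces, overwrite the token you want, and join the pieces again:
String[] mySplitStrings = input.split(" ");
mySplitStrings[3] = "xyz"; // index 3 happens to be the third "abc" in this input
String result = String.join(" ", mySplitStrings);
It's not the best way to do it, because the index is hard-coded for this input, but it works. You could put the whole process into a function like:
String replaceStringInstance(String input, String separator, int index, String replacement)
{
    // Note: split() treats the separator as a regular expression
    String[] parts = input.split(separator);
    parts[index] = replacement;
    return String.join(separator, parts);
}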

Group the three segments, that are the part before the replaced string, the replaced string and the rest and assemble the prefix, the replacement and the suffix:
String pattern = String.format("^(.*?%1$s.*?%1$s.*?)(%1$s)(.*)$", "abc");
String result = input.replaceAll(pattern, "$1xyz$3");
This solution assumes that the whole input is one line. If you have multiline input you'll have to replace the dots as they don't match line separators.
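If the target can contain regex metacharacters, or you need an occurrence other than the third, a loop over Matcher.find() may be easier to adapt than one big pattern. The following is only a sketch: the class name ReplaceNth and the method replaceNth are made up for illustration and are not part of the answer above. It quotes the target with Pattern.quote so it is matched literally, and it returns the input unchanged when there are fewer than n occurrences.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceNth {
    // Hypothetical helper: replaces only the n-th (1-based) occurrence of a
    // literal target; returns the input unchanged if there are fewer than n.
    static String replaceNth(String input, String target, int n, String replacement) {
        Matcher m = Pattern.compile(Pattern.quote(target)).matcher(input);
        int seen = 0;
        while (m.find()) {
            if (++seen == n) {
                // Splice the replacement in place of the n-th match
                return input.substring(0, m.start()) + replacement + input.substring(m.end());
            }
        }
        return input;
    }

    public static void main(String[] args) {
        System.out.println(replaceNth("abc def abc abc pqr", "abc", 3, "xyz"));
        // prints: abc def abc xyz pqr
    }
}
```

Because the string is spliced with substring rather than replaceAll, a $ or \ in the replacement is inserted as-is.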

There's plenty of ways to do this, but here's one. It assumes that the groups of letters will be separated by spaces, and looks for the 3rd 'abc' block. It then does a single replace to replace that with 'xyz'.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class main {
    private static String INPUT = "abc def abc abc pqr";
    private static String REGEX = "((?:abc\\ ).*?(?:abc\\ ).*?)(abc\\ )";
    private static String REPLACE = "$1xyz ";
    public static void main(String[] args) {
        System.out.println("Input: " + INPUT);
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get a matcher object
        INPUT = m.replaceFirst(REPLACE);
        System.out.println("Output: " + INPUT);
    }
}

Related

Regex to find all dollar signs and parentheses and commas

I want a regex to remove all instances of dollar signs, commas, and opening and closing parentheses so that the String can be parsed to a Double.
Examples are:
($108.34)
$39.60
1,388.80
The code:
@Parsed
@Replace(expression = "", replacement = "")
public Double extdPrice;
This may help. It deletes every character in this list: , $ ( )
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Example {
public static void main(String[] args) {
final String regex = "[(),$]";
final String string = "($108.34)\n"
+ "$39.60\n"
+ "1,388.80";
final String subst = "";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
}
}
\d{1,3}(\,\d\d\d)*(\.\d+)?
matches every number in your examples, but it cannot match a number without thousands separators, such as 123456.
The matched results are:
108.34
39.60
1,388.80
so you still need to remove the comma before parsing.
Regex Expression = [^0-9\\.]
is what you are looking for. It matches anything other than the digits 0-9 and the . character.
So this regex removes all extra characters such as ( , $ and USD. Note that it also removes a minus sign, if there is one.
Example: System.out.println("($123.89)".replaceAll("[^0-9\\.]", "")); prints 123.89
Test output:
($108.34) => 108.34
$39.60 => 39.60
1,388.80 => 1388.80
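To go all the way to a number, the cleaned string can be handed to Double.parseDouble. A minimal sketch follows; PriceParser and parsePrice are hypothetical names, and it assumes the parentheses are simply dropped. If ($108.34) is meant as a negative amount in your data, you would have to handle that separately.

```java
class PriceParser {
    // Hypothetical helper: removes $ , ( ) and parses what is left.
    // Assumption: parentheses are discarded, not treated as a minus sign.
    static double parsePrice(String raw) {
        return Double.parseDouble(raw.replaceAll("[(),$]", ""));
    }

    public static void main(String[] args) {
        String[] samples = { "($108.34)", "$39.60", "1,388.80" };
        for (String s : samples) {
            System.out.println(s + " => " + parsePrice(s));
        }
    }
}
```

Double.parseDouble throws NumberFormatException if anything other than a plain number is left after the replacement, so unexpected input fails loudly instead of being silently mangled.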

Java regex convert string to valid json string

I have a pretty long string that looks something like
{abc:\"def\", ghi:\"jkl\"}
I want to convert this to a valid json string like
{\"abc\":\"def\", \"ghi\":\"jkl\"}
I started looking at the replaceAll(String regex, String replacement) method on the string object but i'm struggling to find the correct regex for it.
Can someone please help me with this.
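In this particular case the regex should look for a word that is preceded by {, a space, or a comma and is not followed by ". The preceding character is checked with a lookbehind, so that it is not consumed and then dropped by the replacement:
String str = "{abc:\"def\", ghi:\"jkl\"}";
String regex = "(?<=[{ ,])(\\w+)(?!\")";
System.out.println(str.replaceAll(regex, "\\\"$1\\\""));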
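A slightly stricter variant is to treat a word as a key only when it is followed by a colon. This is just a sketch: JsonKeyQuoter and quoteKeys are hypothetical names, and it assumes that keys consist of word characters only and that no quoted value contains text that looks like ", key:".

```java
class JsonKeyQuoter {
    // Hypothetical helper: wraps bare keys in double quotes.
    // A key is a run of word characters preceded by {, a comma or whitespace
    // and followed by optional whitespace and a colon.
    static String quoteKeys(String s) {
        return s.replaceAll("(?<=[{,\\s])(\\w+)(?=\\s*:)", "\"$1\"");
    }

    public static void main(String[] args) {
        System.out.println(quoteKeys("{abc:\"def\", ghi:\"jkl\"}"));
        // prints: {"abc":"def", "ghi":"jkl"}
    }
}
```

For anything beyond simple flat input like this, a lenient JSON parser is likely to be more reliable than a regex.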
I have to make an assumption that the "key" and "value" consist of only
"word characters" (\w) and there are no spaces in them.
Here is my program. Please also see the comments in-line:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexJson {
public static void main(String[] args) {
/*
* Note that the input string, when written in a Java program, needs escapes
* for the backslash (\) and the double quote ("). If you read it directly
* from a file, these escapes are not needed
*/
String input = "{abc:\\\"def\\\", ghi:\\\"jkl\\\"}";
// regex for one pair of key-value pair. Eg: abc:\"edf\"
String keyValueRegex = "(?<key>\\w+):(?<value>\\\\\\\"\\w+\\\\\\\")";
// regex for a list of key-value pair, separated by a comma (,) and a space ( )
String pairsRegex = "(?<pairs>(,*\\s*"+keyValueRegex+")+)";
// regex include the open and closing braces ({})
String regex = "\\{"+pairsRegex+"\\}";
StringBuilder sb = new StringBuilder();
sb.append("{");
Pattern p1 = Pattern.compile(regex);
Matcher m1 = p1.matcher(input);
while (m1.find()) {
String pairs = m1.group("pairs");
Pattern p2 = Pattern.compile(keyValueRegex);
Matcher m2 = p2.matcher(pairs);
String comma = ""; // first time special
while (m2.find()) {
String key = m2.group("key");
String value = m2.group("value");
sb.append(String.format(comma + "\\\"%s\\\":%s", key, value));
comma = ", "; // second time and onwards
}
}
sb.append("}");
System.out.println("input is: " + input);
System.out.println(sb.toString());
}
}
The print out of this program is:
input is: {abc:\"def\", ghi:\"jkl\"}
{\"abc\":\"def\", \"ghi\":\"jkl\"}

Regex to exclude word from matches java code

Maybe someone could help me. I'm trying to write a regex in Java code that matches all substrings except ZZ78. I'd like to know what is missing in the regex I have.
The input string is str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78"
and I'm trying this regex (?:(?![ZZ78]).)* but if you test it at http://regexpal.com/
against the string, you'll see that it does not work completely.
str = new String ("ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78");
Pattern pattern = Pattern.compile("(?:(?![ZZ78]).)*");
the matched strings should be
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Update:
Hello Avinash Raj and Chthonic Project. Thanks so much for your help and the solutions provided.
I originally thought of the split method, but I was trying to avoid getting empty strings in the result
when, for example, the delimiter string is at the beginning or at the end of the main string.
Then I thought that a regex could help me extract everything except "ZZ78", avoiding
empty results in the output.
Below I show the code using the split method (Chthonic's) and the regex (Avinash's). Both produce empty
strings if the commented-out "if()" conditions are not used.
Is using those "if()" checks the only way to avoid printing empty strings, or could the regex
be tweaked a little to match only non-empty strings?
This is the code I have tested so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
System.out.println("########### Matches with Split ###########");
String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
for (String s : str.split("ZZ78")) {
//if ( !s.isEmpty() ) {
System.out.println("This is a match <<" + s + ">>");
//}
}
System.out.println("##########################################");
System.out.println("########### Matches with Regex ###########");
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
while(matcher.find()){
//if ( !matcher.group(1).isEmpty() ) {
System.out.println("This is a match <<" + matcher.group(1) + ">>");
//}
}
}
}
And the output (without using the "if()" checks):
########### Matches with Split ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
##########################################
########### Matches with Regex ###########
This is a match <<>>
This is a match <<ab57cd>>
This is a match <<efghZZ7ij#klm>>
This is a match <<noCODpqr>>
This is a match <<stuvw27z#xyz>>
This is a match <<>>
Thanks for help so far.
Thanks in advance
Update #2:
Excellent both of your answers and solutions. Now it works very nice. This is the final code I've tested with both solutions.
Many thanks again.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
System.out.println("########### Matches with Split ###########");
String str = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Arrays.stream(str.split("ZZ78")).filter(s -> !s.isEmpty()).forEach(System.out::println);
System.out.println("##########################################");
System.out.println("########### Matches with Regex ###########");
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
ArrayList<String> allMatches = new ArrayList<String>();
ArrayList<String> list = new ArrayList<String>();
while(matcher.find()){
allMatches.add(matcher.group(1));
}
for (String s1 : allMatches)
if (!s1.equals(""))
list.add(s1);
System.out.println(list);
}
}
And output:
########### Matches with Split ###########
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
##########################################
########### Matches with Regex ###########
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]
The easiest way to do this is as follows:
public static void main(String[] args) {
String str = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
for (String s : str.split("ZZ78"))
System.out.println(s);
}
The output, as expected, is:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
If the pattern used to split the string is at the beginning (i.e. "ZZ78" in your example code), the first element returned will be an empty string, as you have already noted. To avoid that, all you need to do is filter the array. This is essentially the same as putting an if, but you can avoid the extra condition line this way. I would do this as follows (in Java 8):
String test_str = ...; // whatever string you want to test it with
Arrays.stream(test_str.split("ZZ78")).filter(s -> !s.isEmpty()).forEach(System.out::println);
You need to remove the character class, since [ZZ78] matches a single character from the given list rather than the sequence ZZ78. However, (?:(?!ZZ78).)* alone won't give the matches you want. Consider ab57cdZZ78 as the input string. First, (?:(?!ZZ78).)* matches ab57cd. At the following Z, the condition (?!ZZ78), which means "the text starting here is not ZZ78", fails, so the match stops there. The regex engine then moves on to the next character, the second Z, and checks (?!ZZ78) again. Because the text starting at the second Z is Z78 rather than ZZ78, the lookahead succeeds and this Z is matched, so the next match wrongly comes out as Z78. Consuming the delimiter as part of each match avoids this:
String s = "ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
while(matcher.find()){
System.out.println(matcher.group(1));
}
Output:
ab57cd
efghZZ7ij#klm
noCODpqr
stuvw27z#xyz
Explanation:
((?:(?!ZZ78).)*) Captures zero or more characters, stopping before the next ZZ78.
(ZZ78|$) Captures the following ZZ78, or matches at the end of the line, as group 2.
Group index 1 contains the run of characters between two ZZ78 delimiters.
Update:
String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
Pattern regex = Pattern.compile("((?:(?!ZZ78).)*)(ZZ78|$)");
Matcher matcher = regex.matcher(s);
ArrayList<String> allMatches = new ArrayList<String>();
ArrayList<String> list = new ArrayList<String>();
while(matcher.find()){
allMatches.add(matcher.group(1));
}
for (String s1 : allMatches)
if (!s1.equals(""))
list.add(s1);
System.out.println(list);
Output:
[ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]
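As for tweaking the regex so that it never yields an empty string: one possibility is to skip the delimiters at the start of each match and require at least one character with + instead of *. The sketch below is not from the answer above; the class name NonEmptySegments and the method segments are made up for illustration, and the delimiter ZZ78 is hard-coded. The \G anchor forces each match to begin where the previous one ended, which keeps the engine from starting a match in the middle of a delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonEmptySegments {
    // \G          : start exactly where the previous match ended
    // (?:ZZ78)*   : skip any delimiters found at that position
    // (...)+      : capture one or more characters, stopping before the next ZZ78
    private static final Pattern SEGMENT =
            Pattern.compile("\\G(?:ZZ78)*((?:(?!ZZ78).)+)");

    // Hypothetical helper: returns the non-empty pieces between ZZ78 delimiters.
    static List<String> segments(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = SEGMENT.matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "ZZ78ab57cdZZ78efghZZ7ij#klmZZ78noCODpqrZZ78stuvw27z#xyzZZ78";
        System.out.println(segments(s));
        // prints: [ab57cd, efghZZ7ij#klm, noCODpqr, stuvw27z#xyz]
    }
}
```

Like the original pattern, the dot does not match line terminators, so this assumes single-line input.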

Matching the pattern of a string in java

I have been trying to figure out how to match the pattern of my input string with this kind of string:
"xyz 123456789"
In general, whenever the input starts with 3 letters (uppercase or lowercase) and ends with 9 digits (any combination), the input string should be accepted.
So if the input string is "Abc 234646593" it should be a match (one or two whitespace characters allowed). It would also be great if "Abc" and "234646593" could be stored in separate strings.
I have seen a lot of regexes but do not fully understand them.
Here's a working Java solution:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Regex {
public static void main(String[] args) {
String input = "Abc 234646593";
// you could use \\s+ rather than \\s{1,2} if you only care that
// at least one whitespace char occurs
Pattern p = Pattern.compile("([a-zA-Z]{3})\\s{1,2}([0-9]{9})");
Matcher m = p.matcher(input);
String firstPart = null;
String secondPart = null;
if (m.matches()) {
firstPart = m.group(1); // grab first remembered match ([a-zA-Z]{3})
secondPart = m.group(2); // grab second remembered match ([0-9]{9})
System.out.println("First part: " + firstPart);
System.out.println("Second part: " + secondPart);
}
}
}
Prints out:
First part: Abc
Second part: 234646593

How to extract uppercase substrings from a String in Java?

I need a piece of code with which I can extract the substrings that are in uppercase from a string in Java.
For example:
"a:[AAAA|0.1;BBBBBBB|-1.90824;CC|0.0]"
I need to extract CC BBBBBBB and AAAA
You can do it with String[] split(String regex). The only problem can be with empty strings, but it's easy to filter them out:
String str = "a:[AAAA|0.1;BBBBBBB|-1.90824;CC|0.0]";
String[] substrings = str.split("[^A-Z]+");
for (String s : substrings)
{
if (!s.isEmpty())
{
System.out.println(s);
}
}
Output:
AAAA
BBBBBBB
CC
This should demonstrate the proper syntax and method. More details can be found here http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Pattern.html and http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Matcher.html
String myStr = "a:[AAAA|0.1;BBBBBBB|-1.90824;CC|0.0]";
Pattern upperCase = Pattern.compile("[A-Z]+");
Matcher matcher = upperCase.matcher(myStr);
List<String> results = new ArrayList<String>();
while (matcher.find()) {
results.add(matcher.group());
}
for (String s : results) {
System.out.println(s);
}
The [A-Z]+ part is the regular expression which does most of the work. There are a lot of strong regular expression tutorials if you want to look more into it.
If you just want to extract every run of uppercase letters, use [A-Z]+. If you only want substrings that are entirely uppercase, so that lowercase letters disqualify a word (HELLO is ok but Hello is not), then use \b[A-Z]+\b
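The difference between the two patterns only shows up on mixed-case words, so here is a small sketch with an input chosen for that purpose. The class name UppercaseWords, the helper findAll and the sample string are all assumptions for illustration, not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UppercaseWords {
    // Hypothetical helper: collects every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "HELLO Hello WORLD";
        // Without word boundaries the H of "Hello" is picked up as well
        System.out.println(findAll("[A-Z]+", input));       // [HELLO, H, WORLD]
        // With \b on both sides only all-uppercase words match
        System.out.println(findAll("\\b[A-Z]+\\b", input)); // [HELLO, WORLD]
    }
}
```

On the string from the question both patterns give the same three results, because every uppercase run there is surrounded by non-word characters.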
I think you should use replaceAll with a regular expression to turn the characters you don't want into a delimiter, perhaps something like this:
str.replaceAll("[^A-Z]+", " ")
Trim any leading or trailing spaces.
Then, if you wish, you can call str.split(" ")
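Put together, the three steps could look like the sketch below. SplitOnNonUppercase and uppercaseRuns are hypothetical names chosen for this example.

```java
import java.util.Arrays;

class SplitOnNonUppercase {
    // Hypothetical helper: replace every run of non-uppercase characters with
    // one space, trim the ends, then split on that space.
    static String[] uppercaseRuns(String str) {
        return str.replaceAll("[^A-Z]+", " ").trim().split(" ");
    }

    public static void main(String[] args) {
        String str = "a:[AAAA|0.1;BBBBBBB|-1.90824;CC|0.0]";
        System.out.println(Arrays.toString(uppercaseRuns(str)));
        // prints: [AAAA, BBBBBBB, CC]
    }
}
```

One thing to watch for: if the input contains no uppercase letters at all, the trimmed string is empty and split returns an array holding a single empty string, so you may want to check for that case.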
This is probably what you're looking for:
import java.util.List;
import java.util.Vector;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatcherDemo {
    private static final String REGEX = "[A-Z]+";
    private static final String INPUT = "a:[AAAA|0.1;BBBBBBB|-1.90824;CC|0.0]";
    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        // get a matcher object
        Matcher m = p.matcher(INPUT);
        List<String> sequences = new Vector<String>();
        while (m.find()) {
            // same text that m.group() would return
            sequences.add(INPUT.substring(m.start(), m.end()));
        }
        // print the collected uppercase substrings
        System.out.println(sequences);
    }
}
