How to replace a character in a string using regex in Java?

I want to replace every x that is at the end of a line or string and comes after any letter other than a, i, u, e, o with nya.
Expected input and output:
Input: bapakx
Output: bapaknya
I've tried this one:
String myString = "bapakx";
String regex = "[^aiueo]x(\\s|$)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(myString);
if(m.find()){
myString = m.replaceAll("nya");
}
But the output is not bapaknya but bapanya. The k character is also replaced. How can I solve this?

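To keep the consonant, use zero-width lookarounds in your regex:
String regex = "(?<=[^aiueo])x(?=\\s|$)";
Here (?<=[^aiueo]) only asserts that a character other than a, i, u, e, o precedes x; it does not consume that character. Likewise, (?=\\s|$) only asserts that whitespace or the end of the input follows x.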
Alternatively you can use capture groups:
String regex = "([^aiueo])x(\\s|$)";
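and put the captured text back in the replacement ($1 is the preceding character, $2 is the trailing whitespace, if any):
myString = m.replaceAll("$1nya$2");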
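Both variants can be compared side by side. The following is a minimal sketch, not the answerer's exact code: the class name NyaSuffix and the method names withLookaround and withGroups are made up for illustration. The capture-group variant puts back both groups ($1 and $2) so that neither the preceding character nor any trailing whitespace is swallowed.

```java
import java.util.regex.Pattern;

class NyaSuffix {
    // Lookarounds only assert the context around x; they consume nothing.
    private static final Pattern LOOKAROUND = Pattern.compile("(?<=[^aiueo])x(?=\\s|$)");
    // Capture groups consume the context, so the replacement must restore it.
    private static final Pattern GROUPS = Pattern.compile("([^aiueo])x(\\s|$)");

    static String withLookaround(String s) {
        return LOOKAROUND.matcher(s).replaceAll("nya");
    }

    static String withGroups(String s) {
        return GROUPS.matcher(s).replaceAll("$1nya$2");
    }

    public static void main(String[] args) {
        System.out.println(withLookaround("bapakx"));      // bapaknya
        System.out.println(withGroups("bapakx"));          // bapaknya
        // The x after the vowel u is left alone.
        System.out.println(withLookaround("bapakx ibux")); // bapaknya ibux
        System.out.println(withGroups("bapakx ibux"));     // bapaknya ibux
    }
}
```

Note that replaceAll resets the matcher itself, so a preceding m.find() check is not needed.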

Related

How to replace multiple consecutive occurrences of a character with a maximum allowed number of occurrences?

CharSequence content = new StringBuffer("aaabbbccaaa");
String pattern = "([a-zA-Z])\\1\\1+";
String replace = "-";
Pattern patt = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher matcher = patt.matcher(content);
boolean isMatch = matcher.find();
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < content.length(); i++) {
while (matcher.find()) {
matcher.appendReplacement(buffer, replace);
}
}
matcher.appendTail(buffer);
System.out.println(buffer.toString());
In the above code, content is the input string.
I am trying to find runs of a repeated character in the string and trim each run to a maximum allowed number of occurrences.
For example:
input - ("abaaadccc", 2)
output - "abaadcc"
Here aaa and ccc are replaced by aa and cc, because the maximum allowed repetition is 2.
The code above finds such runs and replaces them with -, which works. But how can I get the matched character and replace the run with the allowed number of occurrences,
i.e. if aaa is found, it is replaced by aa?
Or is there an alternative method that doesn't use regex?
You can nest a second group inside the first one in the regex and use the first group as the replacement:
String result = "aaabbbccaaa".replaceAll("(([a-zA-Z])\\2)\\2+", "$1");
Here's how it works:
( first group - a character repeated two times
([a-zA-Z]) second group - a character
\2 a character repeated once
)
\2+ a character repeated at least once more
Thus, the first group captures a replacement string.
It isn't hard to extrapolate this solution for a different maximum value of allowed repeats:
String input = "aaaaabbcccccaaa";
int maxRepeats = 4;
String pattern = String.format("(([a-zA-Z])\\2{%s})\\2+", maxRepeats-1);
String result = input.replaceAll(pattern, "$1");
System.out.println(result); //aaaabbccccaaa
Since you defined a group in your regex, you can get the matching characters of this group by calling matcher.group(1). In your case it contains the first character from the repeating group so by appending it twice you get your expected result.
CharSequence content = new StringBuffer("aaabbbccaaa");
String pattern = "([a-zA-Z])\\1\\1+";
Pattern patt = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher matcher = patt.matcher(content);
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
System.out.println("found : "+matcher.start()+","+matcher.end()+":"+matcher.group(1));
matcher.appendReplacement(buffer, matcher.group(1)+matcher.group(1));
}
matcher.appendTail(buffer);
System.out.println(buffer.toString());
Output:
found : 0,3:a
found : 3,6:b
found : 8,11:a
aabbccaa
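The question also asks for an alternative without regex. One possible sketch is a single pass that counts the length of the current run; the class name RepeatLimiter and the method limitRepeats are hypothetical names chosen for this example. Unlike the regex answers above, this version is case-sensitive and applies to every character, not only to letters.

```java
class RepeatLimiter {
    // Keeps at most maxRepeats consecutive copies of each character.
    static String limitRepeats(String input, int maxRepeats) {
        StringBuilder out = new StringBuilder();
        int run = 0; // length of the current run of identical characters
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Extend the run if c equals the previous character, otherwise start a new run.
            run = (i > 0 && c == input.charAt(i - 1)) ? run + 1 : 1;
            if (run <= maxRepeats) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(limitRepeats("abaaadccc", 2));   // abaadcc
        System.out.println(limitRepeats("aaabbbccaaa", 2)); // aabbccaa
    }
}
```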

JAVA split with regex doesn't work

I have the following String 46MTS007 and I have to split the numbers from the letters, so as a result I should get an array like {"46", "MTS", "007"}
String s = "46MTS007";
String[] spl = s.split("\\d+|\\D+");
But spl remains empty, what's wrong with the regex? I've tested in regex101 and it's working like expected (with global flag)
If you want to use split you can use this lookaround based regex:
(?<=\d)(?=\D)|(?<=\D)(?=\d)
This splits at every position where the previous character is a digit and the next one is a non-digit, OR where the previous character is a non-digit and the next one is a digit.
In Java:
String s = "46MTS007";
String[] spl = s.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
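As a quick check, the lookaround split can be wrapped in a small runnable program. This is only a sketch; the class name SplitDigitsAndLetters and the helper splitRuns are made up for the example.

```java
import java.util.Arrays;

class SplitDigitsAndLetters {
    // Splits at every zero-width boundary between a digit and a non-digit, in either order.
    static String[] splitRuns(String s) {
        return s.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitRuns("46MTS007"))); // [46, MTS, 007]
    }
}
```

Because the delimiter is zero-width, no characters are removed from the input; the string is only cut at the boundaries.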
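The regex you're using will not split the string the way you expect. split() treats the regex as the delimiter, but \\d+|\\D+ matches every part of the string, so only empty strings are left between the delimiters, and split() discards trailing empty strings. That is why the array is empty. You can use Pattern and Matcher to find the runs instead: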
public static void main(String[] args) {
String line = "46MTS007";
String regex = "\\D+|\\d+";
Pattern pattern = Pattern.compile(regex);
Matcher m = pattern.matcher(line);
while(m.find())
System.out.println(m.group());
}
Output:
46
MTS
007
Note: m.find() has to be called again before each m.group() call; otherwise the matcher does not move on to the next match.
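To see why the original split() call comes back empty, it can help to pass a negative limit, which keeps the trailing empty strings that split() normally drops. The sketch below assumes nothing beyond the standard library; the class name SplitVersusFind and the helper findRuns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVersusFind {
    // Collects every run of digits or non-digits with Matcher.find().
    static List<String> findRuns(String s) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+|\\D+").matcher(s);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String s = "46MTS007";
        // With limit -1 the empty strings around the three delimiters are kept: four of them.
        System.out.println(Arrays.toString(s.split("\\d+|\\D+", -1)));
        // With the default limit the trailing empty strings are removed, leaving nothing.
        System.out.println(s.split("\\d+|\\D+").length); // 0
        System.out.println(findRuns(s));                 // [46, MTS, 007]
    }
}
```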

Extract only the numbers from String

I need a regex that, given one of the following Strings: "12.123.123/1234-11", "12.123123123411" or "1123123/1234-11",
extracts only the digits (12123123123411).
Pattern padrao = Pattern.compile("\\d+");
Matcher matcher = padrao.matcher("12.123.123/1234-11");
while (matcher.find()) {
System.out.println(matcher.group());
}
//output: 12,123,123,1234,11,
//I need: 12123123123411
Can anyone help me?
A better way would be to use the String#replaceAll(regex, replacement) method to remove every character that is not a digit (as you can see, the method takes a regex as its first argument):
String str = "12.123.123/1234-11";
String digits = str.replaceAll("\\D", "");
\\D matches any non-digit character; it is equivalent to [^0-9].
Note that the backslash has to be escaped inside a Java string literal, which is why \D is written as \\D.
If you are required to use the Matcher#group() method, you would have to build up a StringBuilder, appending the digits every time they are found:
String str = "12.123.123/1234-11";
StringBuilder digits = new StringBuilder();
Matcher matcher = Pattern.compile("\\d+").matcher(str);
while (matcher.find()) {
digits.append(matcher.group());
}
System.out.println(digits);
You could simply remove all the non-digit characters through replaceAll:
String out = string.replaceAll("\\D+", "");
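Applied to the three sample strings from the question, the replaceAll approach can be sketched as follows. The class name DigitsOnly and the helper digitsOnly are illustrative names, not part of any answer above.

```java
class DigitsOnly {
    // Removes every run of non-digit characters, leaving only the digits.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D+", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("12.123.123/1234-11")); // 12123123123411
        System.out.println(digitsOnly("12.123123123411"));    // 12123123123411
        System.out.println(digitsOnly("1123123/1234-11"));    // 1123123123411
    }
}
```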

java find() always returning true

I am trying to find a pattern in a string in Java. Below is the code:
String line = "10011011001;0110,1001,1001,0,10,11";
String regex ="[A-Za-z]?"; //[A-Za-z2-9\W]?
//create a pattern obj
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(line);
boolean a = m.find();
System.out.println("The value of a is::"+a +" asdsd "+m.group(0));
I am expecting the boolean value to be false, but instead it is always returning as true. Any input or idea where I am going wrong.?
The ? makes the entire character class optional. So your regex essentially means "find any letter* ... or not". And the "or not" part means it matches the empty string.
* not really "any" letter, just the ASCII letters A-Z and a-z.
[A-Za-z]? means "zero or one letters". It will always match somewhere in the string; even if there aren't any letters, it will match zero of them.
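The difference is easy to observe on the input from the question. This is a small sketch with made-up names (OptionalMatchDemo, containsLetter): the optional pattern succeeds with an empty match at index 0, while the pattern without ? finds nothing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalMatchDemo {
    // True only if the string contains at least one ASCII letter.
    static boolean containsLetter(String s) {
        return Pattern.compile("[A-Za-z]").matcher(s).find();
    }

    public static void main(String[] args) {
        String line = "10011011001;0110,1001,1001,0,10,11";

        // "Zero or one letter" succeeds immediately by matching zero letters.
        Matcher optional = Pattern.compile("[A-Za-z]?").matcher(line);
        boolean found = optional.find();
        System.out.println(found + ", match length " + optional.group().length()
                + ", at index " + optional.start()); // true, match length 0, at index 0

        // Without the ?, a letter is required, and the input has none.
        System.out.println(containsLetter(line)); // false
    }
}
```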
[A-Za-z]? -----> once or not at all, so it also matches the empty string.
The regexes below, without the ?, should work.
Reference :
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Pattern.html
String line = "10011011001;0110,1001,1001,0,10,11";
String regex = "[A-Za-z]"; // to find a letter
// String regex = "[A-Za-z]+$"; // to find a run of letters at the end of the string
// String regex = "[^0-9,;]"; // to find anything that is not a digit, ',' or ';'
// create a pattern object
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(line);
boolean a = m.find();
// m.group(0) throws IllegalStateException when find() returned false, so only call it on a match
System.out.println("The value of a is::" + a + (a ? " asdsd " + m.group(0) : ""));

RegEX: how to match string which is not surrounded

I have a String "REC/LESS FEES/CODE/AU013423".
What regex would match "REC" and "AU013423" (anything that is not surrounded by slashes /)?
I am using /^>*/, which works and matches the string between the slashes, i.e. using this I am able to find "/LESS FEES/CODE/", but I want to negate it to find the reverse, i.e. REC and AU013423.
Need help on this. Thanks
If you know that you're only looking for alphanumeric data, you can use the regex ([A-Z0-9]+)/.*/([A-Z0-9]+). If this matches, you will have two groups which contain the first and the final text strings.
This code prints RECAU013423
final String s = "REC/LESS FEES/CODE/AU013423";
final Pattern regex = Pattern.compile("([A-Z0-9]+)/.*/([A-Z0-9]+)", Pattern.CASE_INSENSITIVE);
final Matcher matcher = regex.matcher(s);
if (matcher.matches()) {
System.out.println(matcher.group(1) + matcher.group(2));
}
You can tweak the regex groups as necessary to cover valid characters
Here's another option:
String s = "REC/LESS FEES/CODE/AU013423";
String[] results = s.split("/.*/");
System.out.println(Arrays.toString(results));
// [REC, AU013423]
^[^/]+|[^/]+$
matches anything that occurs before the first or after the last slash in the string (or the entire string if there is no slash present).
To iterate over all matches in a string in Java:
Pattern regex = Pattern.compile("^[^/]+|[^/]+$");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
// matched text: regexMatcher.group()
// match start: regexMatcher.start()
// match end: regexMatcher.end()
}
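Applied to the string from the question, the loop above could look like the following sketch. The class name OuterSegments and the helper outerSegments are hypothetical; the pattern is the one given in the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OuterSegments {
    // Text before the first slash, or text after the last slash.
    private static final Pattern OUTER = Pattern.compile("^[^/]+|[^/]+$");

    static List<String> outerSegments(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = OUTER.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(outerSegments("REC/LESS FEES/CODE/AU013423")); // [REC, AU013423]
        // Without any slash the whole string is a single match.
        System.out.println(outerSegments("NOSLASH"));                     // [NOSLASH]
    }
}
```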
