Regex replacing specific characters in java - java

I'm trying to replace certain letter pairs in a generic term (called tampon here).
Rules:
I want to replace "AM" with "AN", "EM" with "AN", "IM" with "IN", "OM" with "ON", "UM" with "UN" and "YM" with "IN".
I want to replace them only if they are followed by a consonant other than "M" or "N".
I also need to replace them when they stand alone or are at the end of the string.
I've tried some regexes, but some of my tests still fail (5 of 18).
For example, with "UMUMMUM" the test expects "UMUMMUM" but I get "UMUMMUN".
Here is my current code:
public class Phonom {
static String[] consonnant={"B","C","D","F","G","H","J","K","L","P","Q","R","S","T","V","W","X","Z",""};
public static String phonom1(final String tampon){
if (tampon == null){
return "";
}
if (tampon.isEmpty()){
return "";
}
int pos=tampon.indexOf("EM");
int pos1=tampon.indexOf("AM");
int pos2=tampon.indexOf("IM");
int pos3=tampon.indexOf("OM");
int pos4=tampon.indexOf("UM");
int pos5=tampon.indexOf("YM");
if(pos==tampon.length()-2 ||pos1==tampon.length()-2|pos2==tampon.length()-2
||pos3==tampon.length()-2||pos4==tampon.length()-2||pos5==tampon.length()-2){
String temp=tampon.replaceAll("AM","AN");
String temp1=temp.replaceAll("EM","AN");
String temp2=temp1.replaceAll("IM","IN");
String temp3=temp2.replaceAll("OM","ON");
String temp4=temp3.replaceAll("UM","UN");
String result=temp4.replaceAll("YM","IN");
return result;
}
String temp=tampon.replaceAll("AM[^AEIOUMNY]","AN");
String temp1=temp.replaceAll("EM[^AEIOUMNY]","AN");
String temp2=temp1.replaceAll("IM[^AEIOUMNY]","IN");
String temp3=temp2.replaceAll("OM[^AEIOUMNY]","ON");
String temp4=temp3.replaceAll("UM[^AEIOUMNY]","UN");
String result=temp4.replaceAll("YM[^AEIOUMNY]","IN");
return result;
}
}

You could have done this in one line if YM was replaced with YN not IN.
tampon.replaceAll("(?<=[AEIOUY])(M)(?![AEIOUYMN])", "N");
Because of the YM to IN rule you will need to use appendReplacement and appendTail instead. The below code uses a negative look ahead to ensure possible replacements aren't followed by a vowel, M or N. If the first group is a Y we replace the match with IN. If not we use a back reference to the character in group 1 and follow it with an N.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Phonom {
private static final Pattern PATTERN = Pattern.compile("([AEIOUY])(M)(?![AEIOUYMN])");
public static String phonom1(String tampon) {
Matcher m = PATTERN.matcher(tampon);
StringBuffer sb = new StringBuffer();
while (m.find()) {
if ("Y".equals(m.group(1))) {
m.appendReplacement(sb, "IN");
} else {
m.appendReplacement(sb, "$1N"); // $1 refers to group 1; "\1" would be an octal escape in a Java string
}
}
m.appendTail(sb);
return sb.toString();
}
}
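The rules in the question also map "EM" to "AN", which a plain group reference does not cover. A minimal sketch of one way to handle every pair in a single pass, assuming a lookup table keyed by the leading vowel (the class name PhonomSketch and the REPLACEMENTS map are hypothetical, and Map.of needs Java 9 or later). It treats the end of the string as a valid replacement position, so it does not try to resolve the "UMUMMUM" expectation from the question.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhonomSketch {
    // A vowel followed by M, where the M is not followed by a vowel, M or N.
    // The negative lookahead also succeeds at the end of the string.
    private static final Pattern PATTERN = Pattern.compile("([AEIOUY])M(?![AEIOUYMN])");

    // Replacement for each leading vowel, taken from the rules in the question.
    private static final Map<String, String> REPLACEMENTS = Map.of(
            "A", "AN", "E", "AN", "I", "IN", "O", "ON", "U", "UN", "Y", "IN");

    static String phonom1(String tampon) {
        if (tampon == null) {
            return "";
        }
        Matcher m = PATTERN.matcher(tampon);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Look up the replacement by the vowel captured in group 1.
            m.appendReplacement(sb, REPLACEMENTS.get(m.group(1)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(phonom1("TAMPON"));  // TANPON
        System.out.println(phonom1("SYMBOLE")); // SINBOLE
        System.out.println(phonom1("POMME"));   // POMME (OM is followed by M)
    }
}
```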

Related

How can I make the following regex match my censors? Java

I am trying to censor specific strings, and patterns within my application but my matcher doesn't seem to be finding any results when searching for the Pattern.
public String censorString(String s) {
System.out.println("Censoring... "+ s);
if (findPatterns(s)) {
System.out.println("Found pattern");
for (String censor : foundPatterns) {
for (int i = 0; i < censor.length(); i++)
s.replace(censor.charAt(i), (char)42);
}
}
return s;
}
public boolean findPatterns(String s) {
for (String censor : censoredWords) {
Pattern p = Pattern.compile("(.*)["+censor+"](.*)");//regex
Matcher m = p.matcher(s);
while (m.find()) {
foundPatterns.add(censor);
return true;
}
}
return false;
}
At the moment I'm focusing on just the one pattern, if the censor is found in the string. I've tried many combinations and none of them seem to return "true".
"(.*)["+censor+"](.*)"
"(.*)["+censor+"]"
"["+censor+"]"
"["+censor+"]+"
Any help would be appreciated.
Usage: My censored words are "hello", "goodbye"
String s = "hello there, today is a fine day."
System.out.println(censorString(s));
is supposed to print " ***** today is a fine day. "
Your regex does find a match. The problem is here:
s.replace(censor.charAt(i), (char)42);
Strings in Java are immutable: replace returns a new String and leaves s unchanged, so this line does not rewrite the censored parts of your string unless you assign the result. Please check the Javadoc for String.
Below is a program that does what you intend. I removed your findPatterns method and used replaceAll from the String API instead. Hope this helps.
public class Regex_SO {
private String[] censoredWords = new String[]{"hello"};
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
Regex_SO regex_SO = new Regex_SO();
regex_SO.censorString("hello there, today is a fine day. hello again");
}
public String censorString(String s) {
System.out.println("Censoring... "+ s);
for(String censoredWord : censoredWords){
String replaceStr = "";
for(int index = 0; index < censoredWord.length();index++){
replaceStr = replaceStr + "*";
}
s = s.replaceAll(censoredWord, replaceStr);
}
System.out.println("Censored String is .. " + s);
return s;
}
}
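The point about String.replace can be seen in isolation. The following is a small illustration only, with a hypothetical class name and a hypothetical helper mask; it shows that the result of replace has to be assigned or returned.

```java
class ImmutableStringDemo {
    // Hypothetical helper: returns a copy of s with every target character masked.
    static String mask(String s, char target) {
        // replace returns a new String, so the result must be returned (or assigned).
        return s.replace(target, '*');
    }

    public static void main(String[] args) {
        String s = "hello";
        s.replace('l', '*');      // result discarded, s is still "hello"
        System.out.println(s);    // hello
        s = s.replace('l', '*');  // result kept
        System.out.println(s);    // he**o
    }
}
```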
Since this seems like homework I can't give you working code, but here are a few pointers:
consider using the regex \\b(word1|word2|word3)\\b to find specific words
to create a char representing * you can write '*'. Don't use (char)42; that avoids magic numbers
to create a new string with the same length as the old one but filled with a single character you can use String newString = oldString.replaceAll(".","*")
to replace each match on the fly with a new value you can use the appendReplacement and appendTail methods of the Matcher class. Here is how code using them should look:
StringBuffer sb = new StringBuffer();//buffer for string with replaced values
Pattern p = Pattern.compile(yourRegex);
Matcher m = p.matcher(yourText);
while (m.find()){
String match = m.group(); //this will represent current match
String newValue = ...; //here you need to decide how to replace it
m.appendReplacement(sb, newValue);
}
m.appendTail(sb);
String censoredString = sb.toString();
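The pointers above can be put together roughly as follows. This is a sketch under stated assumptions, not the answerer's code: the class name CensorSketch and the method censor are hypothetical, the words are quoted so they match literally, and List.of needs Java 9 or later.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CensorSketch {
    static String censor(String text, List<String> censoredWords) {
        if (censoredWords.isEmpty()) {
            return text; // nothing to censor
        }
        // Build \b(word1|word2)\b, quoting each word so it is matched literally.
        String alternation = censoredWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern p = Pattern.compile("\\b(" + alternation + ")\\b");
        Matcher m = p.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // One '*' per character of the matched word.
            String stars = m.group().replaceAll(".", "*");
            m.appendReplacement(sb, Matcher.quoteReplacement(stars));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(censor("hello there, today is a fine day.",
                List.of("hello", "goodbye")));
        // ***** there, today is a fine day.
    }
}
```

Note that only the censored word itself is masked, so "there," stays in the output.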

How to get a substring of a certain character followed by a number?

In Java, how would I get a substring of a certain character followed by a number?
The string looks like this:
To be, or not to be. (That is the question.) (243)
I want the substring up until the (243), where the number inside the parenthesis is always changing every time I call.
Use a regular expression:
newstr = str.replaceFirst("\\(\\d+\\)", "");
This finds a substring beginning with (, followed by one or more digits, and then the character ), and replaces that substring with the empty string, "". The backslashes must be doubled inside a Java string literal.
Reference: java.lang.String.replaceFirst()
You could match it with a regex and get the index of the match, then use that index to take the substring.
An example of that is in the question "Can Java String.indexOf() handle a regular expression as a parameter?":
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
if(matcher.find()){
System.out.println(matcher.start());//this will give you index
}
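Applied to the string from the question, the index approach might look like the sketch below. The class name PrefixBeforeNumber and the method textBeforeNumber are hypothetical, and the pattern assumes the number is written as "(" digits ")".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixBeforeNumber {
    // "(" followed by one or more digits and ")".
    private static final Pattern NUMBER_IN_PARENS = Pattern.compile("\\(\\d+\\)");

    static String textBeforeNumber(String input) {
        Matcher matcher = NUMBER_IN_PARENS.matcher(input);
        if (matcher.find()) {
            // Keep everything before the first "(digits)" group.
            return input.substring(0, matcher.start());
        }
        return input; // no such group: return the input unchanged
    }
}
```

The result keeps the space that precedes the parenthesis; call trim() on it if that is not wanted.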
You can use String.replaceAll():
String s = "To be, or not to be. (That is the question.) (243)";
String newString = s.replaceAll("\\(\\d+\\).*", "");
I think you can actually just do something like:
mystring.substring(0, mystring.lastIndexOf("("))
assuming that the last thing on the line will be the number in parentheses.
You could use a for loop and add the characters before the number to a separate string
public class SubSentence {
public static void main(String[] args) {
String sentence = "To be, or not to be. (That is the question.) (243)";
String subSentence = getSubsentence(sentence);
System.out.println(subSentence);
}
public static String getSubsentence(String sentence) {
String subSentence = "";
boolean checkForNum = false;
for (int i = 0; i < sentence.length(); i++) {
String current = sentence.substring(i, i + 1);
if (checkForNum) {
// the previous character was "(": stop if a digit follows it
if (isInteger(current)) return subSentence;
// otherwise the "(" was ordinary text, so keep it
subSentence += "(";
checkForNum = false;
}
if (current.equals("(")) checkForNum = true;
else subSentence += current;
}
// keep a "(" that was the very last character
if (checkForNum) subSentence += "(";
return subSentence;
}
public static boolean isInteger(String s) {
try {
Integer.parseInt(s);
} catch(NumberFormatException e) {
return false;
}
return true;
}
}
Using a regex, this can be solved as follows:
public class RegExParser {
public String getTextPart(String s) {
String pattern = "^(\\D+)(\\s\\(\\d+\\))$";
String part = s.replaceAll(pattern, "$1");
return part;
}
}
It is simple and performs well.

How to remove the commas from numeric string result?

I have result, which is either a text or numeric value, such as:
String result;
result = "avsds";
result = "123";
result = "345.45";
Sometimes the results also contain commas like:
result = "abc,def";
result = "1,234";
I want to remove the commas from result only if it is a numeric value, and not if it is simple text.
What is the best way of going about this?
Here is your answer:
String regex = "(?<=[\\d])(,)(?=[\\d])";
Pattern p = Pattern.compile(regex);
String str = "Your input";
Matcher m = p.matcher(str);
str = m.replaceAll("");
System.out.println(str);
This removes only commas that sit between two digits, so text such as "abc,def" is left untouched, as you asked.
Try adding that in your main method. Or try this one, it receives input:
String regex = "(?<=[\\d])(,)(?=[\\d])";
Pattern p = Pattern.compile(regex);
System.out.println("Value?: ");
Scanner scanIn = new Scanner(System.in);
String str = scanIn.next();
Matcher m = p.matcher(str);
str = m.replaceAll("");
System.out.println(str);
The easiest way is to use two regexes: the first to make sure the value is numeric (something along the lines of [0-9.,]*), and the second to clean it (result.replaceAll(",", "")).
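A minimal sketch of that two-step idea, assuming "numeric" means the value consists only of digits, dots and commas and contains at least one digit (the class name CommaCleaner and the method removeCommasIfNumeric are hypothetical):

```java
class CommaCleaner {
    static String removeCommasIfNumeric(String result) {
        // Step 1: only digits, dots and commas, with at least one digit.
        if (result.matches("[0-9.,]*[0-9][0-9.,]*")) {
            // Step 2: drop the commas (plain replace, no regex needed).
            return result.replace(",", "");
        }
        return result; // not numeric: leave the text as it is
    }

    public static void main(String[] args) {
        System.out.println(removeCommasIfNumeric("1,234"));   // 1234
        System.out.println(removeCommasIfNumeric("abc,def")); // abc,def
    }
}
```

This check is deliberately loose: a value such as "1.2.3" also passes, so tighten the first regex if stricter validation is needed.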
You could try to parse the string with one of the numeric classes (Integer, Double etc.) after removing the unwanted characters. If the parsing succeeds, the value is numeric and you can remove the unwanted characters from the original string.
Here I have used BigInteger since I am not sure about the precision you require. Note that BigInteger rejects values with a decimal point such as "1,234.56"; BigDecimal would accept them.
public static String removeIfNumeric(final String s, final String toRemove) {
final String result;
if (isNumeric(s, toRemove)) {
result = s.replaceAll(toRemove, "");
} else {
result = s;
}
return result;
}
public static boolean isNumeric(final String s, final String toRemoveRegex) {
try {
new BigInteger(s.replaceAll(toRemoveRegex, ""));
return true;
} catch (NumberFormatException e) {
return false;
}
}

java replaceLast() [duplicate]

Is there replaceLast() in Java? I saw there is replaceFirst().
EDIT: If there is not in the SDK, what would be a good implementation?
It could (of course) be done with regex:
public class Test {
public static String replaceLast(String text, String regex, String replacement) {
return text.replaceFirst("(?s)"+regex+"(?!.*?"+regex+")", replacement);
}
public static void main(String[] args) {
System.out.println(replaceLast("foo AB bar AB done", "AB", "--"));
}
}
This is a bit CPU-hungry because of the look-ahead, but that will only be an issue when working with very large strings (and many occurrences of the regex being searched for).
A short explanation (in case of the regex being AB):
(?s) # enable dot-all option
A # match the character 'A'
B # match the character 'B'
(?! # start negative look ahead
.*? # match any character and repeat it zero or more times, reluctantly
A # match the character 'A'
B # match the character 'B'
) # end negative look ahead
EDIT
Sorry to wake up an old post. But this is only for non-overlapping instances.
For example .replaceLast("aaabbb", "bb", "xx"); returns "aaaxxb", not "aaabxx"
True, that could be fixed as follows:
public class Test {
public static String replaceLast(String text, String regex, String replacement) {
return text.replaceFirst("(?s)(.*)" + regex, "$1" + replacement);
}
public static void main(String[] args) {
System.out.println(replaceLast("aaabbb", "bb", "xx"));
}
}
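Both regex versions above interpret the search text as a regex and the replacement as a replacement pattern, so characters such as . or $ are treated specially. A sketch of a literal variant of the greedy approach, assuming the caller wants plain text on both sides (the class name ReplaceLastLiteral is hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceLastLiteral {
    static String replaceLast(String text, String literal, String replacement) {
        // (?s) lets "." match line terminators; the greedy (.*) pushes the
        // match for the literal as far to the right as possible.
        String regex = "(?s)(.*)" + Pattern.quote(literal);
        // quoteReplacement keeps "$" and "\" in the replacement from being interpreted.
        return text.replaceFirst(regex, "$1" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceLast("aaabbb", "bb", "xx")); // aaabxx
        System.out.println(replaceLast("a.b.c", ".", "$"));    // a.b$c
    }
}
```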
If you don't need regex, here's a substring alternative.
public static String replaceLast(String string, String toReplace, String replacement) {
int pos = string.lastIndexOf(toReplace);
if (pos > -1) {
return string.substring(0, pos)
+ replacement
+ string.substring(pos + toReplace.length());
} else {
return string;
}
}
Testcase:
public static void main(String[] args) throws Exception {
System.out.println(replaceLast("foobarfoobar", "foo", "bar")); // foobarbarbar
System.out.println(replaceLast("foobarbarbar", "foo", "bar")); // barbarbarbar
System.out.println(replaceLast("foobarfoobar", "faa", "bar")); // foobarfoobar
}
If the occurrence you want to replace is at the very end of the string, use replaceAll and add a dollar sign right after your pattern:
replaceAll("pattern$", replacement);
You can combine StringUtils.reverse() with String.replaceFirst()
See for yourself: the String Javadoc.
Or is your question actually "How do I implement a replaceLast()?"
Let me attempt an implementation (this should behave pretty much like replaceFirst(), so it should support regexes and backreferences in the replacement String):
public static String replaceLast(String input, String regex, String replacement) {
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (!matcher.find()) {
return input;
}
int lastMatchStart=0;
do {
lastMatchStart=matcher.start();
} while (matcher.find());
matcher.find(lastMatchStart);
StringBuffer sb = new StringBuffer(input.length());
matcher.appendReplacement(sb, replacement);
matcher.appendTail(sb);
return sb.toString();
}
Use StringUtils from apache:
org.apache.commons.lang.StringUtils.chomp(value, ignoreChar);
No.
You could do reverse / replaceFirst / reverse, but it's a bit expensive.
If the inspected string satisfies
myString.endsWith(substringToReplace) == true
you can also do
myString = myString.replaceFirst("(.*)" + substringToReplace + "$", "$1" + replacement);
It is slow, but it works:
import org.apache.commons.lang.StringUtils;
public static String replaceLast(String str, String oldValue, String newValue) {
str = StringUtils.reverse(str);
str = str.replaceFirst(StringUtils.reverse(oldValue), StringUtils.reverse(newValue));
str = StringUtils.reverse(str);
return str;
}
split the haystack by your needle using a lookahead regex and replace the last element of the array, then join them back together :D
String haystack = "haystack haystack haystack";
String lookFor = "hay";
String replaceWith = "wood";
String[] matches = haystack.split("(?=" + lookFor + ")");
matches[matches.length - 1] = matches[matches.length - 1].replace(lookFor, replaceWith);
String brandNew = StringUtils.join(matches);
I also have encountered such a problem, but I use this method:
public static String replaceLast2(String text,String regex,String replacement){
int i = text.length();
int j = regex.length();
if(i<j){
return text;
}
while (i>j&&!(text.substring(i-j, i).equals(regex))) {
i--;
}
if(i<=j&&!(text.substring(i-j, i).equals(regex))){
return text;
}
StringBuilder sb = new StringBuilder();
sb.append(text.substring(0, i-j));
sb.append(replacement);
sb.append(text.substring(i));
return sb.toString();
}
This works well. s is the string in which you want to replace; in place of "he" put the substring you want to replace, and in place of "mt" put the substring you want in the new string.
import java.util.Scanner;
public class FindSubStr
{
public static void main(String str[])
{
Scanner on=new Scanner(System.in);
String s=on.nextLine().toLowerCase();
String st1=s.substring(0, s.lastIndexOf("he"));
String st2=s.substring(s.lastIndexOf("he"));
String n=st2.replace("he","mt");
System.out.println(st1+n);
}
}

Check if a String contains a special character

How do you check if a String contains a special character like:
[,],{,},(,),*,|,:,>,
Pattern p = Pattern.compile("[^a-z0-9 ]", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("I am a string");
boolean b = m.find();
if (b)
System.out.println("There is a special character in my string");
If you want LETTERS, SPECIAL CHARACTERS and NUMBERS in your password, with a length of at least 8 characters, then use this code:
public static boolean Password_Validation(String password)
{
if(password.length()>=8)
{
Pattern letter = Pattern.compile("[a-zA-Z]");
Pattern digit = Pattern.compile("[0-9]");
Pattern special = Pattern.compile ("[!@#$%&*()_+=|<>?{}\\[\\]~-]");
//Pattern eight = Pattern.compile (".{8}");
Matcher hasLetter = letter.matcher(password);
Matcher hasDigit = digit.matcher(password);
Matcher hasSpecial = special.matcher(password);
return hasLetter.find() && hasDigit.find() && hasSpecial.find();
}
else
return false;
}
You can use the following code to detect special character from string.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class DetectSpecial{
public int getSpecialCharacterCount(String s) {
if (s == null || s.trim().isEmpty()) {
System.out.println("Incorrect format of string");
return 0;
}
Pattern p = Pattern.compile("[^A-Za-z0-9]");
Matcher m = p.matcher(s);
// boolean b = m.matches();
boolean b = m.find();
if (b)
System.out.println("There is a special character in my string ");
else
System.out.println("There is no special char.");
return 0;
}
}
If it matches the regex [a-zA-Z0-9 ]* then there are no special characters in it.
What exactly do you call "special character"? If you mean something like "anything that is not alphanumeric" you can use the org.apache.commons.lang.StringUtils class (methods isAlpha/isNumeric/isWhitespace/isAsciiPrintable).
If it is not so trivial, you can use a regex that defines the exact character list you accept and match the string against it.
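As an illustration of matching against an accepted character list, here is a short sketch. The whitelist (ASCII letters, digits and the space character) is only an assumption, and the class name AllowedCharacters is hypothetical.

```java
import java.util.regex.Pattern;

class AllowedCharacters {
    // Assumed whitelist: ASCII letters, digits and the space character.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9 ]*");

    static boolean containsSpecialCharacter(String s) {
        // matches() must consume the whole string, so any character
        // outside the whitelist makes it fail.
        return !ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsSpecialCharacter("I am a string")); // false
        System.out.println(containsSpecialCharacter("a[b]"));          // true
    }
}
```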
This is tested in android 7.0 up to android 10.0 and it works
Use this code to check if string contains special character and numbers:
name = firstname.getText().toString(); //name is the variable that holds the string value
Pattern special= Pattern.compile("[^a-z0-9 ]", Pattern.CASE_INSENSITIVE);
Pattern number = Pattern.compile("[0-9]", Pattern.CASE_INSENSITIVE);
Matcher matcher = special.matcher(name);
Matcher matcherNumber = number.matcher(name);
boolean containsSymbols = matcher.find();
boolean containsNumber = matcherNumber.find();
if(containsSymbols){
//string contains special symbol/character
}
else if(containsNumber){
//string contains numbers
}
else{
//string doesn't contain special characters or numbers
}
All depends on exactly what you mean by "special". In a regex you can specify
\W to mean non-word characters (anything other than letters, digits and the underscore)
\p{Punct} to mean punctuation characters
I suspect that the latter is what you mean. But if not use a [] list to specify exactly what you want.
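The two classes differ on a few characters, which a small sketch can make visible (the class name SpecialClasses and both method names are hypothetical):

```java
import java.util.regex.Pattern;

class SpecialClasses {
    private static final Pattern NON_WORD = Pattern.compile("\\W");
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    // True if s contains any character other than a letter, digit or underscore.
    static boolean hasNonWord(String s) {
        return NON_WORD.matcher(s).find();
    }

    // True if s contains any ASCII punctuation character.
    static boolean hasPunct(String s) {
        return PUNCT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNonWord("a b") + " " + hasPunct("a b")); // true false
        System.out.println(hasNonWord("a_b") + " " + hasPunct("a_b")); // false true
    }
}
```

A space counts as a non-word character but not as punctuation, while the underscore is punctuation but also a word character.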
Have a look at the java.lang.Character class. It has some test methods and you may find one that fits your needs.
Examples: Character.isSpaceChar(c) or !Character.isJavaLetter(c)
This worked for me:
String s = "string";
if (Pattern.matches("[a-zA-Z]+", s)) {
System.out.println("clear");
} else {
System.out.println("buzz");
}
First you have to exhaustively identify the special characters that you want to check.
Then you can write a regular expression and use
public boolean matches(String regex)
//without using regular expression........
String specialCharacters=" !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
String name="3_ saroj#";
String str2[]=name.split("");
for (int i=0;i<str2.length;i++)
{
if (specialCharacters.contains(str2[i]))
{
System.out.println("true");
//break;
}
else
System.out.println("false");
}
Pattern p = Pattern.compile("\\p{Punct}");
Matcher m = p.matcher("Afsff%esfsf098");
// find() succeeds if a punctuation character occurs anywhere in the string
boolean b = m.find();
if (b)
System.out.println("There is a sp. character in my string");
else
System.out.println("There is no sp. char.");
//this is an updated version of the code that I posted
/*
The isValidName method checks that the name passed as argument does not contain:
1. a null value or a space
2. any special character
3. digits (0-9)
Explanation:
str2 is a String array that stores the characters of name, split into one-character strings
count counts the number of special characters that occur
The method returns true if all the conditions are satisfied
*/
public boolean isValidName(String name)
{
if (name == null) return false; // guard before split, which would throw on null
String specialCharacters=" !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
String str2[]=name.split("");
int count=0;
for (int i=0;i<str2.length;i++)
{
if (specialCharacters.contains(str2[i]))
{
count++;
}
}
if (name!=null && count==0 )
{
return true;
}
else
{
return false;
}
}
Visit each character in the string to see if that character is in a blacklist of special characters; this is O(n*m).
The pseudo-code is:
for each char in string:
if char in blacklist:
...
The complexity can be slightly improved by sorting the blacklist so that you can early-exit each check. However, the string find function is probably native code, so this optimisation - which would be in Java byte-code - could well be slower.
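The pseudo-code can be written out in Java as below. Note that this sketch swaps in a hash set for the blacklist lookup rather than the sorted blacklist mentioned above, which brings the scan to roughly O(n + m); the class name BlacklistCheck and the method containsAny are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class BlacklistCheck {
    static boolean containsAny(String s, String blacklist) {
        // Put the blacklist in a hash set so each lookup is O(1) on average.
        Set<Character> blocked = new HashSet<>();
        for (char c : blacklist.toCharArray()) {
            blocked.add(c);
        }
        for (int i = 0; i < s.length(); i++) {
            if (blocked.contains(s.charAt(i))) {
                return true; // early exit on the first blacklisted character
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAny("Aditya", "[]{}()*|:>")); // false
        System.out.println(containsAny("a|b", "[]{}()*|:>"));    // true
    }
}
```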
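The line String str2[]=name.split(""); gives an extra element in the array on Java 7 and earlier.
Let me explain by example:
"Aditya".split("") would return [, A, d, i, t, y, a], so you will have an extra empty string at the start of your array. (Since Java 8 the leading empty string is no longer produced.)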
The "Aditya".split("") call does not work as saroj routray expects: you get an extra empty string in the array => [, A, d, i, t, y, a].
I have modified it; the code below works as expected:
public static boolean isValidName(String inputString) {
if (inputString == null) return false; // guard before length(), which would throw on null
String specialCharacters = " !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
String[] strlCharactersArray = new String[inputString.length()];
for (int i = 0; i < inputString.length(); i++) {
strlCharactersArray[i] = Character
.toString(inputString.charAt(i));
}
//now strlCharactersArray[i]=[A, d, i, t, y, a]
int count = 0;
for (int i = 0; i < strlCharactersArray.length; i++) {
if (specialCharacters.contains( strlCharactersArray[i])) {
count++;
}
}
if (inputString != null && count == 0) {
return true;
} else {
return false;
}
}
Convert the string into char array with all the letters in lower case:
char c[] = str.toLowerCase().toCharArray();
Then you can use Character.isLetterOrDigit(c[index]) to find out which index has special characters.
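A sketch of that loop, collecting the positions of every character that is neither a letter nor a digit (the class name SpecialIndexes and the method specialIndexes are hypothetical; lower-casing is left out here because it does not change the result of isLetterOrDigit for these inputs):

```java
import java.util.ArrayList;
import java.util.List;

class SpecialIndexes {
    static List<Integer> specialIndexes(String str) {
        List<Integer> indexes = new ArrayList<>();
        char[] c = str.toCharArray();
        for (int index = 0; index < c.length; index++) {
            // Anything that is not a letter or digit counts as special here.
            if (!Character.isLetterOrDigit(c[index])) {
                indexes.add(index);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(specialIndexes("a_b c!")); // [1, 3, 5]
    }
}
```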
Use the static method matches(regex, input) of the java.util.regex.Pattern class.
regex: letters in lower and upper case and the digits 0-9
input: the String you want to check for special characters
It returns true if the string contains only letters and digits, otherwise false.
Example:
String isin = "12GBIU34RT12";
if (Pattern.matches("[a-zA-Z0-9]+", isin)) {
System.out.println("Valid isin");
} else {
System.out.println("Invalid isin");
}
