Java Regex : String Formatting - java

After running this
Names.replaceAll("^(\\w)\\w+", "$1.")
I have a string like
Names = F.DA, ABC, EFG
I want a string formatted like
F.DA, A.BC & E.FG
How do I do that?
Update:
If I had names like
Robert Filip, Robert Morris, Cirstian Jed
I want
R.Filip, R.Morris & C.Jed
I would also be happy if you could suggest a good resource on Java regex.

You need to re-assign the result back to names. Strings are immutable, so the replaceAll method does not replace in place; it returns a new String:
names = names.replaceAll(", (?=[^,]*$)", " & ")
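As a minimal sketch of the point above: the class name LastCommaDemo and the helper joinLast are made up for illustration, and the input is assumed to already have its initials dotted. It shows that the result of replaceAll has to be captured, because the original string is left unchanged.

```java
public class LastCommaDemo {
    // Hypothetical helper: replaces only the last ", " with " & ".
    static String joinLast(String names) {
        // The lookahead (?=[^,]*$) succeeds only when no further comma follows.
        return names.replaceAll(", (?=[^,]*$)", " & ");
    }

    public static void main(String[] args) {
        String names = "F.DA, A.BC, E.FG";

        // The return value is discarded here, so names is unchanged.
        names.replaceAll(", (?=[^,]*$)", " & ");
        System.out.println(names); // F.DA, A.BC, E.FG

        // Re-assigning the result keeps the change.
        names = joinLast(names);
        System.out.println(names); // F.DA, A.BC & E.FG
    }
}
```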

Following should work for you:
String names = "Robert Filip, Robert Morris, Cirstian Jed, S.Smith";
String repl = names.replaceAll("((?:^|[^A-Z.])[A-Z])[a-z]*\\s(?=[A-Z])", "$1.")
.replaceAll(", (?=[^,]*$)", " & ");
System.out.println(repl); //=> R.Filip, R.Morris, C.Jed & S.Smith
Explanation:
The 1st replaceAll call matches either the start of the string or a character that is neither a capital letter nor a dot, followed by a capital letter (both captured in group #1), then 0 or more lowercase letters, then a whitespace character that must be followed by a capital letter. It replaces the match with group #1 followed by a dot ($1.).
The 2nd replaceAll call matches a comma and space with no other comma anywhere after them (that is, the last comma) and replaces them with the literal string " & ".

Try this
String names = "Amal.PM , Rakesh.KR , Ajith.N";
names = names.replaceAll(" , (?=[^,]*$)", " & ");
System.out.println("New String : "+names);

Related

Java - split a string to different fields using regex

Does anyone have an idea how I can split a string returned by a web service (WS) into different strings?
String wsResult = "SONACOM RC, RUE DES ETOILES N. 20, 75250 PARIS (MI)";
I'm trying to split it into:
String name = "SONACOM RC";
String adress = "RUE DES ETOILES N. 20";
String postalCode = "75250";
String city = "PARIS";
N.B.: the WS response always has this shape; only the values of these fields change.
Thank you in advance for your help
You could capture your data in 4 capturing groups. Your provided example uses uppercase characters, which you can match with [A-Z].
If you also want to match lowercase characters, digits and underscores, you could replace [A-Z] or [A-Z\d] with \w.
You can go about this in multiple ways. An approach could be:
([A-Z ]+), +([A-Z\d .]+), +(\d+) +([A-Z\d() ]+)
Explanation
Group 1: match one or more uppercase characters or a whitespace ([A-Z ]+)
Match a comma and one or more whitespaces , +
Group 2: match one or more uppercase characters or digit or whitespace or dot ([A-Z\d .]+)
Match a comma and one or more whitespaces , +
Group 3: match one or more digits (\d+)
Match one or more whitespaces +
Group 4: match one or more uppercase characters or digit or open/close parenthesis or whitespace ([A-Z\d() ]+)
Output in Java
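A minimal sketch of how the pattern above could be applied in Java. The class name AddressParser and the helper parse are assumptions for illustration; only the regular expression and the sample string come from the text. Backslashes are doubled because the pattern is written as a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressParser {
    // The four-group pattern from the answer, escaped for a Java string literal.
    private static final Pattern ADDRESS =
            Pattern.compile("([A-Z ]+), +([A-Z\\d .]+), +(\\d+) +([A-Z\\d() ]+)");

    // Hypothetical helper: returns the four groups, or null when the input does not match.
    static String[] parse(String wsResult) {
        Matcher m = ADDRESS.matcher(wsResult);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] parts = parse("SONACOM RC, RUE DES ETOILES N. 20, 75250 PARIS (MI)");
        System.out.println(parts[0]); // SONACOM RC
        System.out.println(parts[1]); // RUE DES ETOILES N. 20
        System.out.println(parts[2]); // 75250
        System.out.println(parts[3]); // PARIS (MI)
    }
}
```

Note that group 4 still contains "(MI)", because the character class of that group allows parentheses.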
One easy way to split it as you'd like is using wsResult.split(","). However, you'll have to add a comma between 75250 and Paris:
String wsResult = "SONACOM RC, RUE DES ETOILES N. 20, 75250, PARIS (MI)";
String[] temp = wsResult.split(",");
// trim() removes the space left over after each comma
String name = temp[0].trim();
String adress = temp[1].trim();
String postalCode = temp[2].trim();
String city = temp[3].trim();
Using that you will get the output you're looking for.
EDIT
Another way to get your output without adding a comma would be to do this (using the code above too):
for (int i = 1; i < postalCode.length(); i++) {
    if (postalCode.charAt(i) == ' ') {
        // everything after the space is the city, everything before it is the postal code
        city = postalCode.substring(i + 1);
        postalCode = postalCode.substring(0, i);
        break;
    }
}
For more information, check the String class in the Java API documentation and this Stack Overflow question.

Replacing certain combination of characters

I'm trying to remove the unwanted leading characters (capital letter + dot + space) from these lines:
A. Shipping Length of Unit
C. OVERALL HEIGHT
Overall Weigth
X. Max Cutting Height
I tried something like this, but it doesn't work:
string.replaceAll("[A-Z]+". ", "");
The result should look like this:
Shipping Length of Unit
OVERALL HEIGHT
Overall Weigth
Max Cutting Height
This should work:
string.replaceAll("^[A-Z]\\. ", "")
Examples
"A. Shipping Length of Unit".replaceAll("^[A-Z]\\. ", "")
// => "Shipping Length of Unit"
"Overall Weigth".replaceAll("^[A-Z]\\. ", "")
// => "Overall Weigth"
input.replaceAll("[A-Z]\\.\\s", "");
[A-Z] matches an upper case character from A to Z
\. matches the dot character
\s matches any white space character
However, this will replace every character sequence that matches the pattern.
For matching a sequence at the beginning you should use
input.replaceAll("^[A-Z]\\.\\s", "");
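To illustrate the difference between the two patterns above, here is a small sketch. The class name PrefixStripper, both method names and the input string "A. Plan B. Details" are made up for this example.

```java
public class PrefixStripper {
    // Removes every "capital letter + dot + whitespace" sequence, wherever it occurs.
    static String stripAll(String input) {
        return input.replaceAll("[A-Z]\\.\\s", "");
    }

    // Removes that sequence only when it sits at the very start of the string.
    static String stripLeading(String input) {
        return input.replaceAll("^[A-Z]\\.\\s", "");
    }

    public static void main(String[] args) {
        String input = "A. Plan B. Details"; // hypothetical input with two matching sequences
        System.out.println(stripAll(input));     // Plan Details
        System.out.println(stripLeading(input)); // Plan B. Details
    }
}
```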
Without seeing your code it is hard to pinpoint the problem, but in my experience this is a common mistake when starting out:
String string = "A. Test String";
string.replaceAll("^[A-Z]\\. ", "");
System.out.println(string);
String is an immutable class in Java, which means that once an object has been created it cannot be changed. So when we call replaceAll on an existing String, it simply creates a new String object, which you need to assign to a new variable or use to overwrite the existing one, as below:
String string = "A. Test String";
string = string.replaceAll("^[A-Z]\\. ", "");
System.out.println(string);
Try this :
myString.replaceAll("([A-Z]\\.\\s)","")
[A-Z] : match a single character in the range between A and Z.
\. : match the dot character.
\s : match any whitespace character.

How to replace last letter to another letter in java using regular expression

I have seen "," replaced with "." by using ".$"|",$", but this logic does not work with letters.
I need to replace the last letter of every word in the string EXAMPLE_TEST with another letter, using Java.
This is my code:
Pattern replace = Pattern.compile("n$");//here got the real problem
matcher2 = replace.matcher(EXAMPLE_TEST);
EXAMPLE_TEST=matcher2.replaceAll("k");
I also tried "//n$", "\n$", etc.
Please help me find a solution.
Input text => njan ayman
Output text => njak aymak
Instead of the end-of-string anchor $, use a word boundary \b:
String s = "njan ayman";
s = s.replaceAll("n\\b", "k");
System.out.println(s); //=> "njak aymak"
You can use lookahead and group matching:
String EXAMPLE_TEST = "njan ayman";
String s = EXAMPLE_TEST.replaceAll("(n)(?=\\s|$)", "k");
System.out.println("s = " + s); // prints: s = njak aymak
Explanation:
(n) - the matched word character
(?=\\s|$) - which is followed by a space or at the end of the line (lookahead)
The above is only an example. If you want to replace a comma at the end of a word with a period instead, the middle line should be changed to:
s = s.replaceAll("(,)(?=\\s|$)", "\\.");
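The lookahead (?=\\s|$) and a word boundary \b give the same result on the sample input but differ once punctuation follows a word. A short sketch of that difference; the class name LastLetterDemo, the method names and the second input string are assumptions for illustration.

```java
public class LastLetterDemo {
    // Replaces an "n" that is followed by a word boundary (space, punctuation or end of input).
    static String withWordBoundary(String s) {
        return s.replaceAll("n\\b", "k");
    }

    // Replaces an "n" only when whitespace or the end of the input follows.
    static String withLookahead(String s) {
        return s.replaceAll("n(?=\\s|$)", "k");
    }

    public static void main(String[] args) {
        System.out.println(withWordBoundary("njan ayman"));   // njak aymak
        System.out.println(withLookahead("njan ayman"));      // njak aymak

        // With punctuation after the words, only the word-boundary version matches.
        System.out.println(withWordBoundary("njan, ayman.")); // njak, aymak.
        System.out.println(withLookahead("njan, ayman."));    // njan, ayman.
    }
}
```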
Here's how I would set it up:
(?=.\b)\w
Which in Java would need to be escaped as following:
(?=.\\b)\\w
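It translates to something like "a word character (\w) at a position where the lookahead (?=...) sees a single character (.) immediately followed by a word boundary (\b)", that is, the last character of each word.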
String s = "njan ayman aowkdwo wdonwan. wadawd,.. wadwdawd;";
s = s.replaceAll("(?=.\\b)\\w", "");
System.out.println(s); //nja ayma aowkdw wdonwa. wadaw,.. wadwdaw;
This removes the last character of all words, but leaves following non-alphanumeric characters. You can specify only specific characters to remove/replace by changing the . to something else.
However, the other answers are perfectly good and might achieve exactly what you are looking for.
// word, oldLetter and newLetter are assumed to be declared elsewhere (a String and two chars)
if (word.endsWith(String.valueOf(oldLetter))) {
    word = word.substring(0, word.length() - 1) + newLetter;
}

Parse and remove special characters in java regex

So we were looking at some of the other regex posts and we are having trouble removing a special case in one instance; the special character is in the beginning of the word.
We have the following line in our code:
String k = s.replaceAll("([a-z]+)[()?:!.,;]*", "$1");
where s is a singular word. For example, when parsing the sentence "(hi hi hi)" by tokenizing it, and then performing the replaceAll function on each token, we get an output of:
(hi
hi
hi
What are we missing in our regex?
You can use an easier approach - replace the characters that you do not want with spaces:
String k = s.replaceAll("[()?:!.,;]+", " ");
Position matters, so you also need to match the excluded characters before the capturing group:
String k = s.replaceAll("[()?:!.,;]*([a-z]+)[()?:!.,;]*", "$1");
Your replace only removed the "special chars" after the [a-z]+; that's why the ( before hi is left there.
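A minimal sketch of the tokenize-then-clean flow described in the question, using the pattern that strips the special characters on both sides of the word. The class name TokenCleaner and the method cleanTokens are hypothetical, and splitting on whitespace is an assumption, since the question does not show how the sentence is tokenized.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenCleaner {
    // Splits on whitespace (an assumption) and strips the listed characters around each word.
    static List<String> cleanTokens(String sentence) {
        List<String> cleaned = new ArrayList<>();
        for (String token : sentence.split("\\s+")) {
            // The character class appears before and after the capturing group.
            cleaned.add(token.replaceAll("[()?:!.,;]*([a-z]+)[()?:!.,;]*", "$1"));
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(cleanTokens("(hi hi hi)")); // [hi, hi, hi]
    }
}
```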
If you know s is a single word, you could use either:
String k = s.replaceAll("\\W*(\\w+)\\W*", "$1");
or
String k = s.replaceAll("\\W*", "");
This can be simpler. Try this:
String oldString = "Hi There ##$ What is %#your name?##$##$ 0123$$";
System.out.println(oldString.replaceAll("[\\p{Punct}\\s]+", " "));
Output:
Hi There What is your name 0123
So it keeps digits as well.
.replaceAll("[\\p{Punct}\\s]+", " ");
replaces every run of punctuation and whitespace, which covers almost all special characters, with a single space. Adding \\d to the character class would remove the digits too.

How do I find a group of words using Reg-ex?

Here is the code:
String Str ="Animals \n" +
"Dog \n" +
"Cat \n" +
"Fruits \n" +
"Apple \n" +
"Banana \n" +
"Watermelon \n" +
"Sports \n" +
"Soccer \n" +
"Volleyball \n";
Str basically has 3 categories (Animals, Fruits, Sports), each on a separate line. Using a regular expression, how do I find the contents of Fruits, so that I get output like this:
Apple
Banana
Watermelon
I would like an explanation to go with your answer as well, so that I get a better understanding of this problem.
Thanks. :)
Assuming that you want to extract the text between the word "Fruits" and the word "Sports" you could use a regular expression with a capturing group. This way, if a string matches then you still have to extract the group that contains the text that you want.
For example:
Pattern p = Pattern.compile("Fruits(.*?)Sports", Pattern.DOTALL);
// Fruits         : the string "Fruits"
// (.*?)          : capture everything in between
// Sports         : the string "Sports"
// Pattern.DOTALL : tells the regex to treat newlines like normal characters
Alternatively, you can use a more advanced regular expression with a positive lookbehind and a positive lookahead. The regular expression still looks for text between the words "Fruits" and "Sports" but does not consider those strings themselves part of the match.
Pattern p = Pattern.compile("(?<=Fruits).*?(?=Sports)", Pattern.DOTALL);
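A sketch of how the capturing-group pattern could be used end to end on the string from the question. The class name FruitsExtractor and the method fruits are made up for illustration; trimming each line and skipping blank ones is an added assumption, made because every line in the sample ends with a space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FruitsExtractor {
    // DOTALL lets "." match newlines, so the group can span several lines.
    private static final Pattern FRUITS =
            Pattern.compile("Fruits(.*?)Sports", Pattern.DOTALL);

    // Hypothetical helper: returns the non-empty, trimmed lines between "Fruits" and "Sports".
    static List<String> fruits(String text) {
        List<String> items = new ArrayList<>();
        Matcher m = FRUITS.matcher(text);
        if (m.find()) {
            for (String line : m.group(1).split("\n")) {
                String item = line.trim();
                if (!item.isEmpty()) {
                    items.add(item);
                }
            }
        }
        return items;
    }

    public static void main(String[] args) {
        String str = "Animals \nDog \nCat \nFruits \nApple \nBanana \nWatermelon \n"
                + "Sports \nSoccer \nVolleyball \n";
        for (String item : fruits(str)) {
            System.out.println(item); // Apple, Banana, Watermelon, one per line
        }
    }
}
```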
I would start by splitting the string into an array of lines (String[] words = Str.split("\n");), then loop through the words array, adding elements to their proper categories as you go along and switching categories whenever you see a heading.
