Java regex pattern replacing space followed by special character - java

I have below 2 strings
1. Economy / Coach
2. First Class
I want a regex pattern that satisfies these conditions:
Conditions:
If a space is followed by a special character (/), I need to remove the space, i.e. replace it with "". (Example: Economy / Coach becomes Economy/Coach)
If a space is not followed by any special character, the string should stay as it is. (Example: First Class stays First Class)
Expected output:
1. Economy/Coach
2. First Class
Can anybody please help me to write this regex pattern?
Thanks in advance

The best way is to use the String#replaceAll method, but you have to assign the result back to a variable explicitly because strings are immutable. replaceAll returns a new instance and does not affect the original string.
Example :
String str = "Economy / Coach";
str = str.replaceAll("\\s+/\\s+", "/");
Note that the pattern "\\s+" matches each run of one or more whitespace characters (spaces, tabs, etc.).
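As a rough sketch of the same idea, the snippet below uses \\s* instead of \\s+ so that a slash with whitespace on only one side (or none at all) is handled too. That variation, the class name SlashSpacing, and the helper tighten are assumptions made for illustration; they are not part of the original answer.

```java
class SlashSpacing {
    // Hypothetical helper: removes any whitespace directly before or after a '/'.
    static String tighten(String s) {
        return s.replaceAll("\\s*/\\s*", "/");
    }

    public static void main(String[] args) {
        System.out.println(tighten("Economy / Coach")); // prints: Economy/Coach
        System.out.println(tighten("First Class"));     // prints: First Class
    }
}
```

Strings without a slash pass through unchanged, so no contains("/") check is needed beforehand.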

Simply replace occurrences of " / " with "/":
String replaced = string.replace(" / ", "/");
Note that this doesn't use regex (directly; it does internally, but that's just an implementation detail).

First check whether "/" is in the string.
If it is, remove the whitespace.
String str1 = testString;
if (testString.contains("/")) {
    // Note: this removes every whitespace run in the string, not only those next to "/".
    str1 = testString.replaceAll("\\s+", "");
}

Related

Can't get rid of the empty first line when I use regex for polynomials

Here is my code:
String a = "X^5+2X^2+3X^3+4X^4";
String exp[]=a.split("(|\\+\\d)[xX]\\^");
for(int i=0;i<exp.length;i++) {
System.out.println("exp: "+exp[i]+" ");
}
I'm trying to get the output 5, 2, 3, 4,
but instead I got this:
exp:
exp:5
exp:2
exp:3
exp:4
I don't know where the empty first line comes from, and I can't find a way to get rid of it. I tried other regexes and also Pattern.compile, but the first line is still there. With the new string "X+X^5+2X^2+3X^3+4X^4", the first line shows exp:X.
I also tried my pattern in an online regex tester, and it gives 5, 2, 3, 4, but Eclipse gives an empty line and then 5, 2, 3, 4. I need help figuring this out.
Try using a regex with a Matcher, e.g.:
String input = "X^5+2X^2+3X^3+4X^4";
Pattern pattern = Pattern.compile("\\^([0-9]+)");
Matcher matcher = pattern.matcher(input);
// find() advances to the next match; group(1) holds the digits after the ^
while (matcher.find()) {
    System.out.println("exp: " + matcher.group(1));
}
It gives output:
exp: 5
exp: 2
exp: 3
exp: 4
How it works:
Pattern used: \^([0-9]+)
It matches a ^ followed by one or more digits (note the + sign). The caret (^) is prefixed with a backslash (\) because it has a special meaning in regular expressions (beginning of a string), but in your case you just want a literal ^ character.
We wrap the digits in a group so that we can refer to them later, during matching. That means marking them with parentheses ( and ).
Then we want to put the pattern into a Java String. In a String literal, the \ character has a special meaning: it introduces an escape sequence, e.g. "\n" represents a new line. So when we put the pattern into a String literal, we need to escape the \, and the pattern becomes "\\^([0-9]+)". Note the double \.
Next we iterate through all matches, getting group 1, which is the number. Note that the ^ character is not included in group 1 even though it is part of the pattern, because the parentheses mark only the part we are after, which in our case is the digits.
That happens because you are using the split method, which looks for occurrences of the regex and splits the string at those positions. Your string starts with X^, which matches your regex, so the first token is the empty string before that match.
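To make the leading empty token visible, here is a small sketch that prints the raw split result and then filters out empty tokens. The class name LeadingEmptyToken and the helper exponents are made up for this illustration, and skipping empty tokens is just one possible workaround.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LeadingEmptyToken {
    // Hypothetical helper: splits with the question's regex and drops empty tokens.
    static List<String> exponents(String polynomial) {
        List<String> result = new ArrayList<>();
        for (String token : polynomial.split("(|\\+\\d)[xX]\\^")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String a = "X^5+2X^2+3X^3+4X^4";
        // The first delimiter match ("X^") sits at index 0, so the first token is "".
        System.out.println(Arrays.toString(a.split("(|\\+\\d)[xX]\\^"))); // prints: [, 5, 2, 3, 4]
        System.out.println(exponents(a));                                 // prints: [5, 2, 3, 4]
    }
}
```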

Split String 2 times but with different splits ";" and "."

Original String: "12312123;www.qwerty.com"
With this Model.getList().get(0).split(";")[1]
I get: "www.qwerty.com"
I tried doing this: Model.getList().get(0).split(";")[1].split(".")[1]
But it didn't work; I get an exception. How can I solve this?
I want only "qwerty"
Try this, to achieve "qwerty":
Model.getList().get(0).split(";")[1].split("\\.")[1]
You need to escape the dot symbol.
Try to use split(";|\\.") like this:
for (String string : "12312123;www.qwerty.com".split(";|\\.")) {
System.out.println(string);
}
Output:
12312123
www
qwerty
com
You can split a string which has multiple delimiters. Example below:
String abc = "11;xyz.test.com";
String[] tokens = abc.split(";|\\.");
System.out.println(tokens[tokens.length-2]);
The array index 1 part is what fails here: it throws an ArrayIndexOutOfBoundsException.
This is because splitting on "." doesn't work the way you want it to. You need to escape the period by writing "\\." instead. In a regular expression, an unescaped "." means something completely different.
You'd need to escape the ., i.e. "\\.". Period is a special character in regular expressions, meaning "any character".
What your current split means is "split on any character"; it splits the string into a number of empty strings, since there is nothing between consecutive occurrences of "any character".
There is a subtle gotcha in the behaviour of the String.split method, which is that it discards trailing empty strings from the token array (unless you pass a negative number as the second parameter).
Since your entire token array consists of empty strings, all of these are discarded, so the result of the split is a zero-length array, hence the exception when you try to access one of its elements.
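The behaviour described above can be checked with a short sketch. The class name DotSplit and the helper secondLabel are hypothetical; Pattern.quote is used here as an alternative to writing "\\." by hand.

```java
import java.util.regex.Pattern;

class DotSplit {
    // Hypothetical helper: returns the second dot-separated label of a host name.
    static String secondLabel(String host) {
        // Pattern.quote turns "." into a literal, equivalent to writing "\\.".
        return host.split(Pattern.quote("."))[1];
    }

    public static void main(String[] args) {
        String host = "www.qwerty.com";
        System.out.println(host.split(".").length);     // prints: 0 (only empty tokens, all discarded)
        System.out.println(host.split(".", -1).length); // a negative limit keeps the empty tokens
        System.out.println(host.split("\\.").length);   // prints: 3
        System.out.println(secondLabel(host));          // prints: qwerty
    }
}
```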
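Don't use split; use a regular expression directly. It is safer, since there is no array index that can go out of bounds, and it can be faster because no intermediate arrays are created.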
String input = "12312123;www.qwerty.com";
String regex = "([^.;]+)\\.[^.;]+$";
Matcher m = Pattern.compile(regex).matcher(input);
if (m.find()) {
System.out.println(m.group(1)); // prints: qwerty
}

Replace empty space wherever Regex matches in a string

I have been trying to solve this problem. I have a string which contains a pattern, e.g.
CW1234 has been despatched to CW334545
i.e. the string can contain tokens starting with CW followed by any number of digits (at most 16).
I want to replace all these tokens with an empty string, so that the string will look like
has been despatched to
I have tried the following, but it replaces only the CW and the first digit after it. I'm pretty new to Java. Any insights would be of great help.
if(Pattern.matches(".*[C][W][0-9].*", str1)) {
Matcher m = Pattern.compile(".*[C][W][0-9].*").matcher(str1);
while(m.find()) {
str1 = str1.replaceAll("[C][W][0-9]", "");
}
}
System.out.println(str1);
You need a {n,m} quantifier on your digits to enforce the maximum number of digits. Also, for replacement you don't need to check beforehand whether the pattern is present: replaceAll replaces only where the pattern matches and otherwise leaves the string as it is.
So, remove all the Pattern and Matcher code, and change your regex to:
str1 = str1.replaceAll("CW\\d{0,16}", "");
If you want at least 1 digit, then make it {1,16}. No need to put C and W in different character classes. A character class with single character is as good as that character itself (given that it's not a special character). Also, you can use \\d instead of [0-9].
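A minimal sketch of the {1,16} variant follows. The class name StripCwCodes and the helper strip are invented for the example, and the trailing trim() is an extra assumption about how the leftover spaces should be handled.

```java
class StripCwCodes {
    // Hypothetical helper: removes "CW" followed by 1 to 16 digits, then trims the ends.
    static String strip(String s) {
        return s.replaceAll("CW\\d{1,16}", "").trim();
    }

    public static void main(String[] args) {
        System.out.println(strip("CW1234 has been despatched to CW334545")); // prints: has been despatched to
        System.out.println(strip("nothing to remove here"));                 // prints: nothing to remove here
    }
}
```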
You're needlessly constructing the pattern and matching the string several times.
str1 = str1.replaceAll("CW\\d+", "");
This is sufficient. All other code is redundant.
You can also opt to do the replace by hand if performance is a problem.
Your replaceAll is missing a +:
str1 = str1.replaceAll("[C][W][0-9]+", "");
The + will make the regex match any number of digits directly following CW.
Your regex is wrong. Try:
String str1 = "CW1234";
str1 = str1.replaceAll("\\bCW\\d{0,16}\\b", "");
if "CW1234" is a single token within a string, or
String str1 = "CW1234";
str1 = str1.replaceAll("^CW\\d{0,16}$", "");
if "CW1234" is the full string.
str1.replaceAll("CW[0-9\\s]*", "") does what you need, and it also removes the whitespace after the number.
On another note, the whole point of Pattern.compile() is that you compile the expression once in the application and then use matchers to find occurrences. So I think your usage is inappropriate (rather than incorrect).
Pattern pattern = Pattern.compile("CW[0-9\\s]*"); should occur only once in the code, and you then reuse it as
Matcher matcher = pattern.matcher(stringToMatch);
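One way to arrange the compile-once, reuse-many-times idea is sketched below. The class name CwCleaner, the constant CW_CODE, and the helper clean are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class CwCleaner {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern CW_CODE = Pattern.compile("CW[0-9\\s]*");

    // Hypothetical helper: a fresh Matcher is created per call, the Pattern is reused.
    static String clean(String s) {
        return CW_CODE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String[] messages = {
            "CW1234 has been despatched to CW334545",
            "CW77 has arrived"
        };
        for (String message : messages) {
            // Brackets make any remaining leading or trailing whitespace visible.
            System.out.println("[" + clean(message) + "]");
        }
    }
}
```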

Splitting a Java String with '.'

I have
1. This is a test message
I want to print
This is a test message
I am trying
String delimiter=".";
String[] parts = line.split(delimiter);
int gg=parts.length;
Then I want to print the array
for (int k = 0; k < gg; k++)
    System.out.println(parts[k]);
But my gg is always 0.
Am I missing anything?
All I need is to remove the number, the ".", and the whitespace.
The number can have from 1 to 5 digits.
You are using "." as a delimiter; you need to escape the . character to remove its special meaning.
The . character in regex means "any character", so your split is splitting on every character, which is obviously not what you are after.
Use "\\." as the delimiter.
For more information on pre-defined character classes you can have a look at the tutorial.
For more information on regex on general (includes the above) you can try this tutorial
EDIT:
P.S. What you are up to (removing the number) can be achieved with a one-liner, using the String.replaceAll() method.
System.out.println(line.replaceAll("[0-9]+\\.\\s+", ""));
will provide output
This is a test message
For your input example.
The idea is: [0-9] is any digit, and the + indicates there can be any number of them greater than zero. The \\. is a literal dot (escaped as mentioned above), and the \\s+ is at least one whitespace character.
It is all replaced with an empty string.
Note however, for strings like: "1. this is a 2. test" - it will provide "this is a test", and remove the "2. " as well, so think carefully if that is indeed what you are after.
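If only the leading number should go, one option is to anchor the pattern at the start of the line and use replaceFirst. This is a sketch under that assumption; the class name StripListNumber and the helper strip are hypothetical.

```java
class StripListNumber {
    // Hypothetical helper: removes one leading "<digits>. " prefix and nothing else.
    static String strip(String line) {
        return line.replaceFirst("^\\s*[0-9]+\\.\\s*", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("1. This is a test message")); // prints: This is a test message
        System.out.println(strip("1. this is a 2. test"));      // prints: this is a 2. test
    }
}
```

Because of the ^ anchor, a later "2. " inside the text is left alone.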
Use the following code:
String delimiter = "\\."; // escaped, because . has a special meaning in regex
String[] parts = line.split(delimiter);
And for your for loop:
for (int k = 0; k < parts.length; k++)
    System.out.println(parts[k]);
Try this:
String delimiter = "\\.";
"." has a special meaning in a regular expression.
If you are just trying to remove the prefix number, you can do it in one line without regex. I'm not sure whether you actually want to split on multiple dots; if it is just the prefix, take the substring after the first dot and trim it:
String s = "1. with single digit";
String s2 = "999. with multiple digits";
String s3 = "999. with multiple digits . and . dots";
assertEquals("with single digit", (s.substring(s.indexOf(".") + 1).trim()));
assertEquals("with multiple digits", (s2.substring(s2.indexOf(".") + 1).trim()));
assertEquals("with multiple digits . and . dots", (s3.substring(s3.indexOf(".") + 1).trim()));

java replaceAll and '+' match

I have some code set up to remove extra spaces between the words of a title
String formattedString = unformattedString.replaceAll(" +"," ");
My understanding of this type of regex is that it will match as many spaces as possible before stopping. However, my strings that are coming out are not changing in any way. Is it possible that it's only matching one space at a time, and then replacing that with a space? Is there something to the replaceAll method, since it's doing multiple matches, that would alter the way this type of match would work here?
A better approach might be to use "\\s+" to match runs of all possible whitespace characters.
EDIT
Another approach might be to extract all matches for "\\b([A-Za-z0-9]+)\\b" and then join them using a space which would allow you to remove everything except for valid words and numbers.
If you need to preserve punctuation, use "(\\S+)" which will capture all runs of non-whitespace characters.
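The extract-and-join approach could look roughly like this, using the "\\S+" variant so punctuation survives. The class name WordJoiner and the helper normalize are made up for the sketch, and String.join assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordJoiner {
    private static final Pattern NON_SPACE_RUN = Pattern.compile("\\S+");

    // Hypothetical helper: collects runs of non-whitespace and joins them with single spaces.
    static String normalize(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = NON_SPACE_RUN.matcher(s);
        while (m.find()) {
            parts.add(m.group());
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  The   quick\tbrown  fox ")); // prints: The quick brown fox
        System.out.println(normalize("Hello,   world!"));            // prints: Hello, world!
    }
}
```

Unlike a plain replaceAll, this also drops leading and trailing whitespace, since only the matched runs are kept.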
Are you sure your string contains spaces and not tabs? The following is a bit more "aggressive" on whitespace.
String formattedString = unformattedString.replaceAll("\\s+"," ");
All the responses should work.
Both:
String formattedString = unformattedString.replaceAll(" +"," ");
or
String formattedString = unformattedString.replaceAll("\\s+"," ");
Maybe your unformattedString is a multiline expression. In that case you can instantiate a Pattern object:
String unformattedString = " Hello \n\r\n\r\n\r World";
Pattern manySpacesPattern = Pattern.compile("\\s+",Pattern.MULTILINE);
Matcher formatMatcher = manySpacesPattern.matcher(unformattedString);
String formattedString = formatMatcher.replaceAll(" ");
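System.out.println(formattedString);
Or maybe unformattedString contains special characters; in that case you can experiment with the Pattern flags passed to the compile method. (Keep in mind that MULTILINE and UNIX_LINES only change how ^, $ and . behave, and UNICODE_CASE only matters for case-insensitive matching, so none of them changes what \\s+ matches.)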
Examples:
Pattern.compile("\\s+",Pattern.MULTILINE|Pattern.UNIX_LINES);
or
Pattern.compile("\\s+",Pattern.MULTILINE|Pattern.UNICODE_CASE);
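One flag that does change what \\s matches is Pattern.UNICODE_CHARACTER_CLASS (available since Java 7). The sketch below assumes, purely as a guess about the input, that the title contains non-breaking spaces (U+00A0), which the default \\s does not match. The class name UnicodeWhitespace and the helper collapse are hypothetical.

```java
import java.util.regex.Pattern;

class UnicodeWhitespace {
    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space property.
    private static final Pattern UNICODE_SPACES =
            Pattern.compile("\\s+", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: collapses runs of Unicode whitespace into one plain space.
    static String collapse(String s) {
        return UNICODE_SPACES.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        String title = "Hello\u00A0\u00A0World"; // two non-breaking spaces (assumed input)
        System.out.println(title.replaceAll("\\s+", " ").equals(title)); // prints: true (nothing matched)
        System.out.println(collapse(title).equals("Hello World"));       // prints: true
    }
}
```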
