Replacing several characters with the same one - Java

I have a question about the String class in Java.
I want to remove every punctuation mark. Right now I use the replace() method and replace each mark with "".
But can I do this more smoothly? At the moment I replace every character separately:
String line1 = line.replace(".", "");
String line2 = line1.replace("?", "");
String line3 = line2.replace("!", "");
String line4 = line3.replace("\n", "");

OK, I found a nice and helpful solution:
String line11 = line.replaceAll("[\\p{Punct}]", "");

Use replaceAll with a regex character class ([]):
String str = "hellol,lol/,=o/l.o?ll\no,llol";
str = str.replaceAll("[,=/\\n\\?\\.]", "");
System.out.println(str);

If we want to remove every punctuation mark, we can use the replaceAll() method in Java. With replaceAll("[^a-zA-Z ]", ""), the regex engine replaces every character that is not an ASCII letter (lowercase or uppercase) or a space with "", i.e. the empty string. This removes every punctuation mark from the string; note that it also removes digits and line breaks.
public class HelloWorld {
public static void main(String[] args) {
String line="Th##is i*s a Ex()ample St!#ing!#";
System.out.println(line.replaceAll("[^a-zA-Z ]", ""));
}
}
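The two character classes used in the answers above do not remove the same things, so a side-by-side comparison may help. This is only a small sketch; the class and method names (PunctuationDemo, stripPunct, keepLettersAndSpaces) are made up for the illustration and do not come from any of the answers.

```java
class PunctuationDemo {
    // Removes only ASCII punctuation; letters, digits and whitespace stay.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // Keeps only ASCII letters and spaces; digits and line breaks go as well.
    static String keepLettersAndSpaces(String s) {
        return s.replaceAll("[^a-zA-Z ]", "");
    }

    public static void main(String[] args) {
        String line = "Room 42: ready? Yes!";
        System.out.println(stripPunct(line));           // Room 42 ready Yes
        System.out.println(keepLettersAndSpaces(line)); // Room  ready Yes
    }
}
```

If digits or line breaks matter for your input, \\p{Punct} is probably the safer of the two; the original question also removed "\n", which would need its own replacement.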

Related

How to replace part of a String in Java?

I'm trying to replace part of a String based on a certain phrase being present within it. Consider the string "Hello my Dg6Us9k. I am alive.".
I want to search for the phrase "my" and remove it together with the 8 characters to its right, which removes the hash code. This gives the string "Hello. I am alive." How can I do this in Java?
You could achieve this with the string.replaceAll method.
string.replaceAll("\\bmy.{8}", "");
Add \\b if necessary. \\b is a word boundary, which matches between a word character and a non-word character. .{8} matches exactly the following 8 characters.
To also remove the space before "my":
System.out.println("Hello my Dg6Us9k. I am alive.".replaceAll("\\smy.{8}", ""));
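To see what the word boundary and the leading \\s actually change, here is a small sketch that runs the patterns from this answer next to a variant without \\b. The class name WordBoundaryDemo, its method names, and the "army" sample string are assumptions made up for the illustration.

```java
class WordBoundaryDemo {
    // "my" must start at a word boundary, then 8 more characters are removed.
    static String withBoundary(String s) {
        return s.replaceAll("\\bmy.{8}", "");
    }

    // Also consumes the single whitespace character in front of "my".
    static String withLeadingSpace(String s) {
        return s.replaceAll("\\smy.{8}", "");
    }

    // No boundary: "my" inside a longer word (e.g. "army") matches too.
    static String withoutBoundary(String s) {
        return s.replaceAll("my.{8}", "");
    }

    public static void main(String[] args) {
        String s = "Hello my Dg6Us9k. I am alive.";
        System.out.println(withBoundary(s));     // Hello . I am alive.
        System.out.println(withLeadingSpace(s)); // Hello. I am alive.

        String t = "The army Dg6Us9k. x";
        System.out.println(withoutBoundary(t));  // The ar. x
        System.out.println(withBoundary(t));     // The army Dg6Us9k. x
    }
}
```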
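This should do it:
String s = "Hello my Dg6Us9k. I am alive";
int start = s.indexOf("my") - 1; // start at the space before "my"
s = s.replace(s.substring(start, start + 11), ""); // strings are immutable, so keep the result
This replaces the 11-character substring " my Dg6Us9k", which starts at the space before "my", with nothing.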
Use a regex like this:
public static void main(String[] args) {
String s = "Hello my Dg6Us9k. I am alive";
String newString=s.replaceFirst("\\smy\\s\\w{7}", "");
System.out.println(newString);
}
Output:
Hello. I am alive
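Java strings are immutable, so you cannot change the string in place. You have to create a new string. So, find the index i of "my". Then concatenate the substring before it (0...i) and the substring after "my" and the 8 characters that follow it (i+2+8...).
int i = s.indexOf("my");
if (i == -1) { return s; } // no "my" in there!
String ret = s.substring(0, i); // keeps the space before "my"; use i - 1 to drop it
ret = ret.concat(s.substring(i + 2 + 8)); // concat returns a new string, so keep the result
return ret;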
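The index-based idea above can be wrapped in a small helper that also guards the edge cases. This is a sketch under assumptions: the class MarkerRemover and the method removeMarkerAndNext are hypothetical names, and dropping one space in front of the marker is my reading of the expected output "Hello. I am alive.".

```java
class MarkerRemover {
    // Removes the first occurrence of marker plus the `extra` characters after it.
    static String removeMarkerAndNext(String s, String marker, int extra) {
        int i = s.indexOf(marker);
        if (i == -1) {
            return s; // marker not present: nothing to remove
        }
        // Also drop one space directly in front of the marker, if there is one.
        int start = (i > 0 && s.charAt(i - 1) == ' ') ? i - 1 : i;
        // Clamp the end so a short tail cannot cause an exception.
        int end = Math.min(i + marker.length() + extra, s.length());
        return s.substring(0, start) + s.substring(end);
    }

    public static void main(String[] args) {
        String s = "Hello my Dg6Us9k. I am alive.";
        System.out.println(removeMarkerAndNext(s, "my", 8)); // Hello. I am alive.
    }
}
```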
If you want to be flexible about the hash code length, use the following regex:
String foo="Hello my Dg6Us9k. I am alive.";
String bar = foo.replaceFirst("\\smy.*?\\.", ".");
System.out.println(bar);

Java - Split string

I have a string whose parts are separated by ".". When I try to split it on the dot, it does not get split.
Here is the exact code I have. Please let me know what could cause the string not to split.
public class TestStringSplit {
public static void main(String[] args) {
String testStr = "[Lcom.hexgen.ro.request.CreateRequisitionRO;";
String test[] = testStr.split(".");
for (String string : test) {
System.out.println("test : " + string);
}
System.out.println("Str Length : " + test.length);
}
}
I have to split the above string and get only the last part. In the above case it is CreateRequisitionRO, not CreateRequisitionRO; (with the semicolon). Please help me get this.
You can split this string with StringTokenizer and get each token between the dots:
StringTokenizer tokenizer = new StringTokenizer(string, ".");
String firstToken = tokenizer.nextToken();
String secondToken = tokenizer.nextToken();
Since you are looking for the last word, CreateRequisitionRO, you can also use:
String testStr = "[Lcom.hexgen.ro.request.CreateRequisitionRO;";
String yourString = testStr.substring(testStr.lastIndexOf('.')+1, testStr.length()-1);
String testStr = "[Lcom.hexgen.ro.request.CreateRequisitionRO;";
String test[] = testStr.split("\\.");
for (String string : test) {
System.out.println("test : " + string);
}
System.out.println("Str Length : " + test.length);
The "." is a regular expression wildcard, so you need to escape it.
Change String test[] = testStr.split("."); to String test[] = testStr.split("\\.");.
Because String.split takes a regex argument, you need to escape the dot character (which is a wildcard in regex).
Note that String.split takes in a regular expression, and . has special meaning in regular expression (which matches any character except for line separator), so you need to escape it:
String test[] = testStr.split("\\.");
Note that you escape the . at the level of regular expression once: \., and to specify \. in a string literal, \ needs to be escaped again. So the string to pass to String.split is "\\.".
Another way is to put it inside a character class, where . loses its special meaning:
String test[] = testStr.split("[.]");
You need to escape the . as it is a special character, a full list of these is available. Your split line needs to be:
String test[] = testStr.split("\\.");
Split takes a regular expression as a parameter. If you want to split by the literal ".", you need to escape the dot because that is a special character in a regular expression. Try putting 2 backslashes before your dot ("\\.") - hopefully that does what you are looking for.
String test[] = testStr.split("\\.");
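To tie the answers together, here is a small sketch that shows what the unescaped split actually returns and one way to get the last part without the trailing semicolon. Pattern.quote is used here as an alternative to writing "\\." by hand; the class name SplitOnDotDemo and the helper lastSegment are made up for the example.

```java
import java.util.regex.Pattern;

class SplitOnDotDemo {
    // Returns the text after the last dot, without a trailing ';'.
    static String lastSegment(String s) {
        // Pattern.quote turns "." into a literal, so no manual escaping is needed.
        String[] parts = s.split(Pattern.quote("."));
        if (parts.length == 0) {
            return ""; // input consisted only of dots
        }
        String last = parts[parts.length - 1];
        return last.endsWith(";") ? last.substring(0, last.length() - 1) : last;
    }

    public static void main(String[] args) {
        String testStr = "[Lcom.hexgen.ro.request.CreateRequisitionRO;";
        // Unescaped: every character is a delimiter, and the resulting
        // trailing empty strings are discarded, so the array is empty.
        System.out.println(testStr.split(".").length);   // 0
        System.out.println(testStr.split("\\.").length); // 5
        System.out.println(lastSegment(testStr));        // CreateRequisitionRO
    }
}
```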

How to get only the letters from a given alphanumeric word in Java?

Sorry if this is a silly question, but I need to know this.
If I have a word containing letters, digits, and special characters, I need to extract the letters only; I don't need the digits and special characters. Is there a built-in function in Java to pick out the letters only?
e.g. String word = "te123##st";
I need "test" only.
This solution works with accented/non-ASCII characters:
"te123##st\néàø_".replaceAll("[\\p{Digit}\\p{Punct}\\p{Space}]", "");
Try this: word.replaceAll("[^a-zA-Z]", "");
This will remove all non-alphabetic characters, but it will also remove accented characters.
String word = "te123##st";
word = word.replaceAll("[^\\p{Alpha}]", "");
// or word = word.replaceAll("[\\P{Alpha}]", "");
See apidoc reference.
try
word = word.replaceAll("\\P{Alpha}", "");
String word = "te123##st";
word = word.replaceAll("[\\W\\d._]", "");
try this:
word = word.replaceAll("[\\d##_]", "");
- I won't make this complicated with regex, but will use built-in Java functionality to answer this.
- Use the toCharArray() method to break the String into char elements, then use the Character class's isAlphabetic() method to check whether each one is a letter.
public class T1 {
public static void main(String[] args){
String s = "te123##st";
// Collect the letters in a StringBuilder instead of concatenating Strings in a loop.
StringBuilder letters = new StringBuilder();
for (char c : s.toCharArray()) {
if (Character.isAlphabetic(c)) {
System.out.println(c + " is a letter");
letters.append(c);
} else {
System.out.println(c + " is not a letter");
}
}
System.out.println("The extracted String is: " + letters);
}
}
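Several answers in this thread differ in how they treat accented letters, so a short comparison may be useful. The sketch below uses the Unicode category \\p{L} ("any letter"), which is not in any of the answers above and is offered as one possible alternative; the names LettersOnlyDemo, asciiLettersOnly and unicodeLettersOnly are made up.

```java
class LettersOnlyDemo {
    // Keeps a-z and A-Z only; accented letters are removed as well.
    static String asciiLettersOnly(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    // Keeps every character in the Unicode "letter" category.
    static String unicodeLettersOnly(String s) {
        return s.replaceAll("\\P{L}", "");
    }

    public static void main(String[] args) {
        String word = "te123##st";
        System.out.println(asciiLettersOnly(word));   // test
        System.out.println(unicodeLettersOnly(word)); // test
    }
}
```

For a word such as "\u00e9t\u00e9" (two accented letters around a "t"), asciiLettersOnly would keep only the "t", while unicodeLettersOnly would keep all three characters.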

Java replaceAll method with a variable string and escaped dot

I'm having a hard time figuring this one out, so I ask for your help. Here's the deal:
String str = "02-EST-WHATEVER-099-00.dwg";
String newStr = str.replaceAll("([^-_\\.]+-[^-_\\.]+-[^-_\\.]+-[^-_\\.]+-)[^-_\\.]+(\\.[^-_\\.]+)", "$1$2");
The block of code above results in 02-EST-WHATEVER-099-.dwg (removed the last "00", just before the extension). Great, that's what I need!
But the RegEx I use above has to be created on the fly (the field I'm removing can be in a different position). So I used some code to create the RegEx string (here's what the result would look like if I just declared it):
String regexRemoveRev = "([^-_\\.]+-[^-_\\.]+-[^-_\\.]+-[^-_\\.]+-)[^-_\\.]+(\\.[^-_\\.]+)";
Now, if I out.print(regexRemoveRev), I get ([^-_\.]+-[^-_\.]+-[^-_\.]+-[^-_\.]+-)[^-_\.]+(\.[^-_\.]+) (notice the single backslashes).
And when I try the replaceAll again, it doesn't work:
String str = "02-EST-WHATEVER-099-00.dwg";
String newStr = str.replaceAll(regexRemoveRev, "$1$2");
So I thought it could be because of the single backslashes, and I tried declaring regexRemoveRev with 4 of them, instead of just 2:
String regexRemoveRev = "([^-_\\\\.]+-[^-_\\\\.]+-[^-_\\\\.]+-[^-_\\\\.]+-)[^-_\\\\.]+(\\\\.[^-_\\\\.]+)";
The output of out.print(regexRemoveRev) is the double backslash version of the RegEx, as expected:
([^-_\\.]+-[^-_\\.]+-[^-_\\.]+-[^-_\\.]+-)[^-_\\.]+(\\.[^-_\\.]+)
But the replace still doesn't work!
How do I get this to do what I want?
I have just written a short program, and it works in both cases. Here it is:
public class StringTest
{
public static void main(String[] args)
{
String str = "02-EST-WHATEVER-099-00.dwg";
String newStr = str.replaceAll("([^-_\\.]+-[^-_\\.]+-[^-_\\.]+-[^-_\\.]+-)[^-_\\.]+(\\.[^-_\\.]+)", "$1$2");
String regexRemoveRev = "([^-_\\.]+-[^-_\\.]+-[^-_\\.]+-[^-_\\.]+-)[^-_\\.]+(\\.[^-_\\.]+)";
String newStr1 = str.replaceAll(regexRemoveRev, "$1$2");
System.out.println("newStr: "+newStr);
System.out.println("regexRemoveRev: "+regexRemoveRev);
System.out.println("newStr: "+newStr1);
}
}
The output from the above:
newStr: 02-EST-WHATEVER-099-.dwg
regexRemoveRev: ([^-_\.]+-[^-_\.]+-[^-_\.]+-[^-_\.]+-)[^-_\.]+(\.[^-_\.]+)
newStr: 02-EST-WHATEVER-099-.dwg
I am not sure why it is not working for you. Or are you asking something else that I got wrong?
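Since the question says the regex has to be created on the fly because the field to remove can be in a different position, here is one way that construction could look. This is a sketch, not the asker's actual code: the class FieldRemover, its methods, and the assumption that the name has a known number of dash-separated fields followed by an extension are all mine.

```java
class FieldRemover {
    // One field: a run of characters other than '-', '_' and '.'.
    // In Java source the backslash is doubled; the regex engine sees [^-_\.]+
    private static final String FIELD = "[^-_\\.]+";

    // Builds a pattern with one capture group on each side of the field to drop.
    static String buildRegex(int fieldIndex, int fieldCount) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < fieldIndex; i++) {
            sb.append(FIELD).append('-');
        }
        sb.append(')').append(FIELD).append('(');
        for (int i = fieldIndex + 1; i < fieldCount; i++) {
            sb.append('-').append(FIELD);
        }
        sb.append("\\.").append(FIELD).append(')');
        return sb.toString();
    }

    // Removes the zero-based field but keeps the separators around it.
    static String removeField(String name, int fieldIndex, int fieldCount) {
        return name.replaceAll(buildRegex(fieldIndex, fieldCount), "$1$2");
    }

    public static void main(String[] args) {
        String str = "02-EST-WHATEVER-099-00.dwg";
        System.out.println(buildRegex(4, 5));       // prints single backslashes
        System.out.println(removeField(str, 4, 5)); // 02-EST-WHATEVER-099-.dwg
        System.out.println(removeField(str, 1, 5)); // 02--WHATEVER-099-00.dwg
    }
}
```

Printing the pattern shows single backslashes, which is expected: the doubled backslashes exist only in Java source code. A pattern assembled at run time from such literals already has the right escaping and needs no extra backslashes.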

Remove end of line characters from Java string

I have string like this
"hello
java
book"
I want to remove \r and \n from the String (hello\r\njava\r\nbook). I want the result to be "hellojavabook". How can I do this?
Regex with replaceAll.
public class Main
{
public static void main(final String[] argv)
{
String str;
str = "hello\r\njava\r\nbook";
str = str.replaceAll("(\\r|\\n)", "");
System.out.println(str);
}
}
If you only want to remove \r\n when they are pairs (the above code removes either \r or \n) do this instead:
str = str.replaceAll("\\r\\n", "");
If you want to avoid the regex, or must target an earlier JVM, String.replace() will do:
str=str.replace("\r","").replace("\n","");
And to remove a CRLF pair:
str=str.replace("\r\n","");
The latter is more efficient than building a regex to do the same thing. But I think the former will be faster as a regex since the string is only parsed once.
public static void main(final String[] argv)
{
String str;
str = "hello\r\n\tjava\r\nbook";
str = str.replaceAll("(\\r|\\n|\\t)", "");
System.out.println(str);
}
It would be useful to add the tab character to the regex too.
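Given a String str in which \r and \n appear as literal text (a backslash followed by a letter) rather than as real line-break characters:
str = str.replaceAll("\\\\r", "");
str = str.replaceAll("\\\\n", "");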
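The four-backslash pattern above and the two-backslash pattern from the earlier answers match different things, which is easy to mix up. The sketch below contrasts them; the class EscapeLevelDemo and its method names are made up for the illustration.

```java
class EscapeLevelDemo {
    // Regex \r|\n: removes real carriage-return and line-feed characters.
    static String removeRealBreaks(String s) {
        return s.replaceAll("\\r|\\n", "");
    }

    // Regex \\r|\\n: removes the two-character texts "\r" and "\n"
    // (a backslash followed by the letter r or n).
    static String removeLiteralEscapes(String s) {
        return s.replaceAll("\\\\r|\\\\n", "");
    }

    public static void main(String[] args) {
        String real = "hello\r\njava";      // contains CR and LF characters
        String literal = "hello\\r\\njava"; // contains backslash-r, backslash-n as text
        System.out.println(removeRealBreaks(real));        // hellojava
        System.out.println(removeLiteralEscapes(literal)); // hellojava
    }
}
```

Applying the wrong variant leaves the string unchanged, which may be why some snippets in this thread appear not to work on a given input.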
You can pass the line terminator, e.g. \n, directly if you know which one is used (Windows, Mac or UNIX). Alternatively, you can use the following code to replace line breaks from any of the three major operating systems.
str = str.replaceAll("\\r\\n|\\r|\\n", " ");
The above line replaces line breaks with a space for Mac, Windows and Linux line endings.
You can also use the line separator. System.lineSeparator() returns the separator of the platform the code is running on, so this removes only line breaks in that platform's style. Below is the code snippet for the line separator.
String lineSeparator=System.lineSeparator();
String newString=yourString.replace(lineSeparator, "");
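As a possible alternative to listing the terminators by hand, Java 8 and later support the \R linebreak matcher in regular expressions, which covers \r\n, \n and \r (plus a few rarer Unicode line breaks). The sketch below assumes Java 8+; the class LineBreakDemo and its method names are made up for the example.

```java
class LineBreakDemo {
    // \R matches any line break sequence, whichever platform produced it.
    static String removeLineBreaks(String s) {
        return s.replaceAll("\\R", "");
    }

    // Removes only the separator of the platform this code is running on.
    static String removePlatformBreaks(String s) {
        return s.replace(System.lineSeparator(), "");
    }

    public static void main(String[] args) {
        String mixed = "hello\r\njava\nbook\rend";
        System.out.println(removeLineBreaks(mixed)); // hellojavabookend
        // The result of the next call depends on the platform, so no
        // expected output is given here.
        System.out.println(removePlatformBreaks(mixed).length());
    }
}
```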
Have you tried using the replaceAll method to replace any occurrence of \n or \r with the empty String?
static byte[] discardWhitespace(byte[] data) {
byte groomedData[] = new byte[data.length];
int bytesCopied = 0;
for (int i = 0; i < data.length; i++) {
switch (data[i]) {
case (byte) '\n' :
case (byte) '\r' :
break;
default:
groomedData[bytesCopied++] = data[i];
}
}
byte packedData[] = new byte[bytesCopied];
System.arraycopy(groomedData, 0, packedData, 0, bytesCopied);
return packedData;
}
Code found in the commons-codec project.
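You can use unescapeJava from org.apache.commons.text.StringEscapeUtils as shown below. It returns a new string, so the result must be assigned. Note that unescapeJava converts escape sequences written as text (such as a backslash followed by n) into the real characters; it does not remove line breaks by itself.
str = "hello\r\njava\r\nbook";
str = StringEscapeUtils.unescapeJava(str);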
We can also use this regex solution:
String chomp = StringUtils.normalizeSpace(sentence.replaceAll("[\\r\\n]"," "));
I went with \\s+ and it removed \r and \n chars for me.
\s+ will match one or more whitespace characters
final String stringWithWhitespaceChars = "Bart\n\r";
final String stringWithoutEscapeChars = stringWithWhitespaceChars.replaceAll("\\s+","");
Refer to "Regex expressions in Java, \\s vs. \\s+" for more details.
Try below code. It worked for me.
str = str.replaceAll("\\r", "");
str = str.replaceAll("\\n", "");
Did you try
string.trim();
This is meant to trim all leading and trailing whitespace in the string. Hope this helps.
Edit: (I was not explicit enough)
So, when you call string.split(), you will have a String[]. For each of the strings in the array, call trim() and then append the result.
String[] tokens = yourString.split(" ");
StringBuffer buff = new StringBuffer();
for (String token : tokens)
{
buff.append(token.trim());
}
Use StringBuffer/StringBuilder instead of appending to the same String.
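The last few answers (trim(), \\s+ and the line-break patterns) remove different amounts of whitespace, so a small comparison may be helpful. This is a sketch only; the class WhitespaceDemo and its method names are made up for the illustration.

```java
class WhitespaceDemo {
    // trim() strips whitespace at both ends only; inner line breaks stay.
    static String trimEnds(String s) {
        return s.trim();
    }

    // \s+ removes all whitespace, including ordinary spaces between words.
    static String removeAllWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    // Removes only \r and \n and leaves spaces alone.
    static String removeOnlyLineBreaks(String s) {
        return s.replaceAll("[\\r\\n]+", "");
    }

    public static void main(String[] args) {
        String s = " hello world\r\njava book \r\n";
        System.out.println(trimEnds(s).contains("\n"));          // true
        System.out.println(removeAllWhitespace(s));              // helloworldjavabook
        System.out.println("[" + removeOnlyLineBreaks(s) + "]"); // [ hello worldjava book ]
    }
}
```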
