How can Java remove leading whitespace? - java

I have been trying to figure out why this Java code won't remove the leading whitespace from my string. I have tried the stripLeading() method, the trim() method, and various other methods with the same functionality, but still haven't gotten the result I want. Code:
public static String message(String logLine) {
    logLine = (String) logLine.subSequence(logLine.indexOf(" ") + 1, logLine.length());
    return logLine;
}
public static void main(String[] args) {
    System.out.println(message("[WARNING]: \tTimezone not set \r\n"));
}
The result is close to what I want, which is just the words "Timezone not set". However, I want the program to ignore leading whitespace completely, and for some reason it doesn't. Thank you for any help.

Possible solutions
Use String::replaceFirst to keep only the main part that follows the prefix ([WARNING]:) and the whitespace after it:
public static String message(String logLine) {
    // Group 1 captures the words between the leading and trailing whitespace
    return logLine.replaceFirst("^\\S*\\s+(\\S+(\\s+\\S+)*)\\s*$", "$1");
}
As the prefix ends with ':', the solution offered in a comment, using String::substring + String::trim, works too:
public static String message(String logLine) {
    return logLine.substring(logLine.indexOf(":") + 1).trim();
}
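A minimal sketch of the difference between the stripping methods, assuming Java 11+ and a prefix that ends at the first colon. The class name LeadingWhitespaceDemo is made up for illustration. Strings are immutable, so the stripped copy has to be returned or reassigned; calling stripLeading() or trim() without using the result changes nothing, which is one common reason such calls appear to have no effect (it may or may not be the cause here).

```java
final class LeadingWhitespaceDemo {
    static String message(String logLine) {
        // Everything after the first colon; the whole line if there is no colon
        String rest = logLine.substring(logLine.indexOf(':') + 1);
        // stripLeading() returns a new String; the original is never modified
        return rest.stripLeading();
    }

    public static void main(String[] args) {
        String result = message("[WARNING]: \tTimezone not set \r\n");
        // Brackets make the remaining trailing whitespace visible
        System.out.println("[" + result + "]");
    }
}
```

Here only the leading whitespace is removed; replacing stripLeading() with strip() or trim() would also drop the trailing " \r\n".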

Related

Regex to validate that every digit is different from each other

I have to validate strings with specific conditions using a regex statement. The condition is that every digit is different from each other. So, 123 works but not 112 or 131.
So, I wrote a statement which filters a string according to the condition and prints true once a string fulfills everything. However, it only seems to print "true", although some strings do not meet the condition.
public class MyClass {
    public static void main(String args[]) {
        String[] value = {"123","951","121","355","110"};
        for (String s : value){
            System.out.println("\"" + s + "\"" + " -> " + validate(s));
        }
    }
    public static boolean validate(String s){
        return s.matches("([0-9])(?!\1)[0-9](?!\1)[0-9]");
    }
}
@Vinz's answer is perfect, but if you insist on using regex, then you can use:
public static boolean validate(String s) {
    return s.matches("(?!.*(.).*\\1)[0-9]+");
}
You don't need to use regex for that. You can simply count the number of unique characters in the String and compare it to the length like so:
public static boolean validate(String s) {
    return s.chars().distinct().count() == s.length();
}
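A small sketch putting the two answers side by side. The class and method names are made up for illustration. Note that the distinct-count version on its own accepts any string of distinct characters (for example "abc"), so this sketch adds a digit check, which is an assumption about what the validation should require.

```java
final class DistinctDigits {
    // The negative lookahead rejects any character that occurs again later;
    // [0-9]+ then restricts the input to one or more digits.
    static boolean validateWithRegex(String s) {
        return s.matches("(?!.*(.).*\\1)[0-9]+");
    }

    // Stream variant: non-empty, digits only, and no character repeated.
    static boolean validateWithStream(String s) {
        return !s.isEmpty()
                && s.chars().allMatch(c -> c >= '0' && c <= '9')
                && s.chars().distinct().count() == s.length();
    }
}
```

In the question's code, "\1" inside a Java string literal is an octal character escape rather than a backreference; the regex engine only sees a backreference when the backslash is doubled, as in "\\1".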

Remove part of string after or before a specific word in java

Is there a command in Java to remove the rest of a string after or before a certain word?
Example:
Remove substring before the word "taken"
before:
"I need this words removed taken please"
after:
"taken please"
Strings are immutable; you can, however, find the word and create a substring:
public static String removeTillWord(String input, String word) {
    int index = input.indexOf(word);
    // Return the input unchanged when the word is not found
    return index >= 0 ? input.substring(index) : input;
}
removeTillWord("I need this words removed taken please", "taken");
The Apache Commons Lang class StringUtils contains exactly what you want, e.g.:
public static String substringBefore(String str, String separator)
public static String foo(String str, String remove) {
    int index = str.indexOf(remove);
    // Return the input unchanged when the word is not found
    return index >= 0 ? str.substring(index) : str;
}
A clean way to safely remove everything up to a string:
// StringUtils comes from Apache Commons Lang
String input = "I need this words removed taken please";
String token = "taken";
String result = input.contains(token)
        ? token + StringUtils.substringAfter(input, token)
        : input;
Apache StringUtils functions are null-safe, empty-safe, and no-match-safe.
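For readers without the Commons Lang dependency, here is a plain-JDK sketch of the same idea. It mirrors my understanding of the documented not-found behaviour of StringUtils.substringBefore (returns the input) and StringUtils.substringAfter (returns an empty string). Null handling is deliberately left out, and the class and method names are hypothetical.

```java
final class SubstringSketch {
    // Text before the first occurrence of separator; the whole input when it is absent
    static String before(String input, String separator) {
        int index = input.indexOf(separator);
        return index >= 0 ? input.substring(0, index) : input;
    }

    // Text after the first occurrence of separator; empty when it is absent
    static String after(String input, String separator) {
        int index = input.indexOf(separator);
        return index >= 0 ? input.substring(index + separator.length()) : "";
    }
}
```

Because both methods search with indexOf, the separator is always treated as literal text, never as a regular expression.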
Since OP provided clear requirements
Remove the rest of the string after or before a certain word
and nobody has fulfilled those yet, here is my approach to the problem. There are certain rules to the implementation, but overall it should satisfy OP's needs, if he or she comes to revisit the question.
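// Requires java.util.Objects and java.util.regex.Pattern
public static String remove(String input, String separator, boolean before) {
    Objects.requireNonNull(input);
    Objects.requireNonNull(separator);
    if (input.trim().equals(separator)) {
        return separator;
    }
    if (separator.isEmpty() || input.trim().isEmpty()) {
        return input;
    }
    // Quote the separator so it is matched literally; limit 2 splits at the first occurrence only
    String[] tokens = input.split(Pattern.quote(separator), 2);
    if (tokens.length < 2) {
        // Separator not found: nothing to remove
        return input;
    }
    if (before) {
        // Drop the text in front of the first separator
        return input.substring(tokens[0].length());
    }
    // Drop the text behind the first separator
    return input.substring(0, input.length() - tokens[1].length());
}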

alternate method for using substring on a String

I have a string which contains an underscore as shown below:
123445_Lisick
I want to remove all the characters from the String after the underscore. I have tried the code below, it's working, but is there any other way to do this, as I need to put this logic inside a for loop to extract elements from an ArrayList.
public class Test {
    public static void main(String args[]) throws Exception {
        String str = "123445_Lisick";
        int a = str.indexOf("_");
        String modfiedstr = str.substring(0, a);
        System.out.println(modfiedstr);
    }
}
Another way is to use the split method.
String str = "123445_Lisick";
String[] parts = str.split("_");
String modfiedstr = parts[0];
I don't think that really buys you anything though. There's really nothing wrong with the method you're using.
Your method is fine. Though not explicitly stated in the API documentation, I feel it's safe to assume that indexOf(char) will run in O(n) time. Since your string is unordered and you don't know the location of the underscore a priori, you cannot avoid this linear search time. Once you have completed the search, you need to extract the substring for further processing. It's generally safe to assume that, for simple operations like this in a reasonably mature language, the library functions have been optimized.
Note, however, that you are making two implicit assumptions:
an underscore will exist within the String
if there is more than one underscore in the string, only the text before the first one should be kept
If either of these assumptions will not always hold, you will need to make adjustments to handle those situations. In either case, you should at least defensively check for a -1 returned from indexOf(char), indicating that '_' is not in the string. Assuming the entire String is desired in this situation, you could use something like this:
public static String stringBefore(String source, char delim) {
    if (source == null) return null;
    int index = source.indexOf(delim);
    // Keep the text before the delimiter, or the whole string if there is none
    return (index >= 0) ? source.substring(0, index) : source;
}
You could also use something like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        String str = "123445_Lisick";
        Pattern pattern = Pattern.compile("^([^_]*).*");
        Matcher matcher = pattern.matcher(str);
        String modfiedstr = null;
        if (matcher.find()) {
            modfiedstr = matcher.group(1);
        }
        System.out.println(modfiedstr);
    }
}
The regex captures everything from the start of the input string up to, but not including, the first underscore.
However, as @Bill the Lizard wrote, I don't think there is anything wrong with the way you are doing it now. I would do it the same way.
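Since the question mentions running this inside a loop over an ArrayList, here is a minimal sketch of that step. It assumes that entries without an underscore should be kept unchanged; the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

final class UnderscorePrefix {
    // Text before the first underscore, or the whole string if there is none
    static String beforeUnderscore(String source) {
        int index = source.indexOf('_');
        return index >= 0 ? source.substring(0, index) : source;
    }

    // Applies beforeUnderscore to every element and collects the results into a new list
    static List<String> beforeUnderscoreAll(List<String> items) {
        return items.stream()
                .map(UnderscorePrefix::beforeUnderscore)
                .collect(Collectors.toList());
    }
}
```

A plain for loop that calls beforeUnderscore on each element would work just as well; the stream form only saves the explicit result list.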

Removing accents from String

Recently I found a very helpful method in the StringUtils library:
StringUtils.stripAccents(String s)
I found it really helpful for removing special characters and converting them to an ASCII "equivalent", for instance ç=c etc.
Now I am working for a German customer who really needs this, but only for non-German characters. Any umlauts should stay untouched. I realised that stripAccents won't be useful in that case.
Does anyone have experience with this?
Are there any useful tools/libraries/classes or maybe regular expressions?
I tried to write a class that parses and replaces such characters, but it can be very difficult to build such a map for all languages...
Any suggestions appreciated...
It is best to build a custom function. It can look like the following. If you want to avoid converting a particular character, remove its pair from the two strings (the constants).
private static final String UNICODE =
        "ÀàÈèÌìÒòÙùÁáÉéÍíÓóÚúÝýÂâÊêÎîÔôÛûŶŷÃãÕõÑñÄäËëÏïÖöÜüŸÿÅåÇçŐőŰű";
private static final String PLAIN_ASCII =
        "AaEeIiOoUuAaEeIiOoUuYyAaEeIiOoUuYyAaOoNnAaEeIiOoUuYyAaCcOoUu";
public static String toAsciiString(String str) {
    if (str == null) {
        return null;
    }
    StringBuilder sb = new StringBuilder();
    for (int index = 0; index < str.length(); index++) {
        char c = str.charAt(index);
        int pos = UNICODE.indexOf(c);
        if (pos > -1) {
            sb.append(PLAIN_ASCII.charAt(pos));
        } else {
            sb.append(c);
        }
    }
    return sb.toString();
}
public static void main(String[] args) {
    System.out.println(toAsciiString("Höchstalemannisch"));
}
My gut feeling tells me the easiest way to do this would be to just list allowed characters and strip accents from everything else. This would be something like
import java.util.regex.*;
import java.text.*;
public class Replacement {
    public static void main(String args[]) {
        String from = "aoeåöäìé";
        String result = stripAccentsFromNonGermanCharacters(from);
        System.out.println("Result: " + result);
    }
    private static String patternContainingAllValidGermanCharacters =
            "a-zA-Z0-9äÄöÖéÉüÜß";
    private static Pattern nonGermanCharactersPattern =
            Pattern.compile("([^" + patternContainingAllValidGermanCharacters + "])");
    public static String stripAccentsFromNonGermanCharacters(
            String from) {
        return stripAccentsFromCharactersMatching(
                from, nonGermanCharactersPattern);
    }
    public static String stripAccentsFromCharactersMatching(
            String target, Pattern myPattern) {
        StringBuffer myStringBuffer = new StringBuffer();
        Matcher myMatcher = myPattern.matcher(target);
        while (myMatcher.find()) {
            myMatcher.appendReplacement(myStringBuffer,
                    stripAccents(myMatcher.group(1)));
        }
        myMatcher.appendTail(myStringBuffer);
        return myStringBuffer.toString();
    }
    // pretty much the same thing as StringUtils.stripAccents(String s)
    // used here so I can demonstrate the code without StringUtils dependency
    public static String stripAccents(String text) {
        return Normalizer.normalize(text,
                Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
}
(I realize the pattern probably doesn't contain all the characters needed; add whatever is missing.)
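The allow-list idea can also be sketched without a regex pass over the whole string, working one code point at a time. This is only a sketch under assumptions: the KEEP list (the three umlauts in both cases plus the sharp s, written as Unicode escapes so the source file encoding does not matter) is my guess at what "German characters" should mean, and the class and method names are hypothetical.

```java
import java.text.Normalizer;

final class GermanAccents {
    // Assumed allow-list: a/o/u umlauts in lower and upper case, plus sharp s
    private static final String KEEP = "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df";

    static String stripNonGermanAccents(String text) {
        // Compose first, so a base letter plus a combining diaeresis becomes one umlaut character
        String composed = Normalizer.normalize(text, Normalizer.Form.NFC);
        StringBuilder out = new StringBuilder(composed.length());
        composed.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (KEEP.contains(ch)) {
                // Allowed character: copy it unchanged
                out.append(ch);
            } else {
                // Decompose, then drop the combining marks
                out.append(Normalizer.normalize(ch, Normalizer.Form.NFD)
                        .replaceAll("\\p{M}+", ""));
            }
        });
        return out.toString();
    }
}
```

Characters that have no canonical decomposition, such as ø or ł, pass through unchanged with this approach, so a lookup table may still be needed for those.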
This might give you a workaround. Here you can detect the language and get only the language-specific text.
EDIT:
You can take the raw string as input and set the language detection to German; it will then detect the German characters and discard the rest.

Can I use the replaceFirst method to look for a string pattern and replace it?

I have a question involving the replace method. I saw a similar question on here and tried replaceFirst, but it didn't work for me. Is there any way I can use the replace method to change a string that comes out as "Helle, Werld!" into "Hello, World!"? Is there a way to use the replaceFirst method to search for the sequence "le" and replace it with "lo", and also change "We" to "Wo"? Please see my code below:
public class Printer
{
    /**
     * Description: Replacement class
     */
    public static void main(String[] args)
    {
        String test1Expected = "Hello, World!";
        String newString1;
        String test1 = "Holle, Werld!";
        newString1 = test1.replace('o', 'e');
        // Could I do: newString1.replaceFirst("le", "lo");
        System.out.println("newString1 = " + newString1);
        // Output comes out to "Helle, Werld!"
    }
}
You can apply two regular-expression replacements separately, one after the other. Please try the following:
newString1 = newString1.replaceAll("le", "lo").replaceAll("We", "Wo");
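A short sketch of the whole chain, including the replaceFirst call the question asks about. The class and method names are made up for illustration. Each of these methods returns a new String, so the result has to be assigned or returned; calling newString1.replaceFirst("le", "lo") on its own line leaves newString1 unchanged.

```java
final class ReplaceDemo {
    static String fix(String input) {
        return input
                .replace('o', 'e')         // "Holle, Werld!" becomes "Helle, Werld!"
                .replaceFirst("le", "lo")  // regex-based, replaces the first match only
                .replace("We", "Wo");      // literal text, replaces every occurrence
    }

    public static void main(String[] args) {
        System.out.println(fix("Holle, Werld!"));
    }
}
```

replace(CharSequence, CharSequence) treats its arguments as literal text, while replaceFirst and replaceAll interpret the first argument as a regular expression; for plain strings like "le" the two behave the same.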
