How can I remove whitespaces around the first occurrence of specific char? - java

How can I remove the whitespace before and after a specific char? I only want to remove the whitespace around the first occurrence of that char. In the examples below, I want to remove the whitespace before and after the first occurrence of =.
For example for those strings:
something = is equal to = something
something = is equal to = something
something =is equal to = something
I need to have this result:
something=is equal to = something
Is there any regular expression that I can use or should I check for the index of the first occurrence of the char =?

private String removeLeadingAndTrailingWhitespaceOfFirstEqualsSign(String s1) {
return s1.replaceFirst("\\s*=\\s*", "=");
}
Notice this matches all whitespace including tabs and new lines, not just space.
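As a quick check, here is a small sketch that runs the one-liner above against inputs like those in the question. The class name, the method name tighten and the tab/newline input are illustrative additions, not part of the original answer.

```java
class FirstEqualsDemo {
    // Collapses any whitespace (spaces, tabs, newlines) around the first '=' only.
    static String tighten(String s) {
        return s.replaceFirst("\\s*=\\s*", "=");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "something = is equal to = something",
            "something   =   is equal to = something",
            "something =is equal to = something",
            "something\t=\nis equal to = something" // hypothetical input with a tab and a newline
        };
        for (String input : inputs) {
            // Each line prints: something=is equal to = something
            System.out.println(tighten(input));
        }
    }
}
```

The second = keeps its surrounding spaces because replaceFirst stops after the first match.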

You can use the regular expression \w*\s*=\s* to get all matches. From there call trim on the first index in the array of matches.

Yes - you can create a regex that matches optional whitespace, followed by your pattern, followed by optional whitespace, and then replace the first instance.
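import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String replaceFirst(final String toMatch, final String forIP) {
    // Quote toMatch so that regex metacharacters in it are treated literally
    final String quoted = Pattern.quote(toMatch);
    // Optional whitespace on either side of the literal
    final Pattern patt = Pattern.compile("\\s*" + quoted + "\\s*");
    final Matcher match = patt.matcher(forIP);
    // quoteReplacement keeps '$' and '\' in toMatch from being read as group references
    return match.replaceFirst(Matcher.quoteReplacement(toMatch));
}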
For your inputs this gives the expected result, assuming toMatch is =. It also works with longer strings: for example, passing "is equal to" instead gives
something =is equal to= something
For the simple case you can skip the quoting; for an arbitrary string it helps (although, as many contributors have pointed out before, Pattern.quote isn't right for every case).
The simple case thus becomes
return forIP.replaceFirst("\\s*" + toMatch + "\\s*", toMatch);
OR
return forIP.replaceFirst("\\s*=\\s*", "=");
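The question also asks whether to look up the index of the first = instead. A minimal sketch of that regex-free route follows; the class and method names are hypothetical, and it assumes Character.isWhitespace is an acceptable definition of whitespace (it differs slightly from the regex \s class).

```java
class FirstEqualsByIndex {
    // Removes whitespace on both sides of the first occurrence of c.
    // Returns the input unchanged when c does not occur.
    static String tightenAroundFirst(String s, char c) {
        int idx = s.indexOf(c);
        if (idx < 0) {
            return s;
        }
        // Walk left over the whitespace that precedes the character
        int left = idx;
        while (left > 0 && Character.isWhitespace(s.charAt(left - 1))) {
            left--;
        }
        // Walk right over the whitespace that follows it
        int right = idx + 1;
        while (right < s.length() && Character.isWhitespace(s.charAt(right))) {
            right++;
        }
        return s.substring(0, left) + c + s.substring(right);
    }

    public static void main(String[] args) {
        // prints: something=is equal to = something
        System.out.println(tightenAroundFirst("something = is equal to = something", '='));
    }
}
```

This avoids regex compilation entirely, at the cost of a few more lines than replaceFirst.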

Related

How to add a space after certain characters using regex Java

I have a string consisting of 18 characters, e.g. 'abcdefghijklmnopqr'. I need to add a blank space after the 5th character, then after the 9th character and after the 15th character, making it look like 'abcde fghi jklmno pqr'. Can I achieve this using a regular expression?
Regular expressions are not my cup of tea, so I need help from the regex gurus out here. Any help is appreciated.
Thanks in advance
Regex finds a match in a string and can't perform a replacement. You could however use regex to find a certain matching substring and replace that, but you would still need a separate method for replacement (making it a two-step algorithm).
Since you're not looking for a pattern in your string, but rather just the n-th char, regex wouldn't be of much use; it would make the solution unnecessarily complex.
Here are some ideas on how you could implement a solution:
Use an array of characters to avoid creating redundant strings: create a character array, copy the characters before the given position, put the new character at the position, copy the following characters up to the next position, and so on until you reach the end of the string. Then construct the final string from that array.
Use the substring() method: concatenate the substring before the position, the new character, the substring between that position and the next one, and so on until you reach the end of the original string.
Use a StringBuilder and its insert() method.
Note that:
The first idea listed might not be a suitable solution for very large strings: it needs an auxiliary array, which uses additional space.
The second idea creates redundant strings. Strings are immutable in Java, so each substring and concatenation allocates a new object; creating temporary strings should be avoided where possible.
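The StringBuilder.insert() idea could be sketched as follows. The class and method names are hypothetical, and the sketch assumes the positions are given in ascending order and refer to indexes in the original string.

```java
class InsertSpaces {
    // Inserts a space before each given index of the ORIGINAL string.
    // Positions are processed right to left so that an insertion never
    // shifts the offsets that are still waiting to be processed.
    static String insertSpaces(String s, int... positions) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = positions.length - 1; i >= 0; i--) {
            sb.insert(positions[i], ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: abcde fghi jklmno pqr
        System.out.println(insertSpaces("abcdefghijklmnopqr", 5, 9, 15));
    }
}
```

Going left to right would also work, but each position would then have to be increased by the number of spaces already inserted.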
Yes, you can use regex groups to achieve that. Something like this:
final Pattern pattern = Pattern.compile("([a-z]{5})([a-z]{4})([a-z]{6})([a-z]{3})");
final Matcher matcher = pattern.matcher("abcdefghijklmnopqr");
if (matcher.matches()) {
    // group(0) is the whole match; capturing groups are numbered from 1
    String first = matcher.group(1);
    String second = matcher.group(2);
    String third = matcher.group(3);
    String fourth = matcher.group(4);
    return first + " " + second + " " + third + " " + fourth;
} else {
    throw new SomeException();
}
Note that the pattern should be a constant; I used a local variable here to make it easier to read.
Compared to substrings, which would also achieve the desired result, a regex also lets you validate the format of your input data. In the provided example you check that the input is an 18-character string composed only of lowercase letters.
If you had a more interesting example, with a mix of letters and digits for instance, you could use the regex to check that each group contains the correct type of data.
You can also do a simpler version where you just replace with:
"abcdefghijklmnopqr".replaceAll("([a-z]{5})([a-z]{4})([a-z]{6})([a-z]{3})", "$1 $2 $3 $4")
But you don't get the benefit of validation, because a string that doesn't match the format is simply returned unchanged, and this is less efficient than substrings.
Here is an example solution without regex, which would be more efficient if you don't care about checking:
final Set<Integer> breaks = Set.of(5, 9, 15);
final String str = "abcdefghijklmnopqr";
final StringBuilder stringBuilder = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
if (breaks.contains(i)) {
stringBuilder.append(' ');
}
stringBuilder.append(str.charAt(i));
}
return stringBuilder.toString();

How to remove all characters before a specific character in Java?

I have a string whose value comes from an HTML form, so it arrives as part of a URL. I want to remove all the characters before a specific character, which is =, and I also want to remove that character itself. I only want to keep the value that comes after =, because I need to fetch that value from the variable.
EDIT : I need to remove the = too since I'm trying to get the characters/value in string after it...
You can use .substring():
String s = "the text=text";
String s1 = s.substring(s.indexOf("=") + 1);
s1 = s1.trim(); // trim() returns a new string, so the result must be assigned
then s1 contains everything after = in the original string, without surrounding whitespace.
.trim() removes whitespace before the first non-whitespace character of a string (leading spaces) and also removes whitespace after the last non-whitespace character (trailing spaces).
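One detail the snippet above leaves open is what happens when the string has no = at all: indexOf returns -1, so substring(0) returns the whole input. A small sketch that makes that case explicit is shown here; the class and method names are hypothetical, and returning an empty string for a missing = is an assumption, not something the original answer specifies.

```java
class ValueAfterEquals {
    // Returns the trimmed text after the first '=',
    // or "" when the input contains no '=' (an assumed convention).
    static String valueAfterFirstEquals(String s) {
        int idx = s.indexOf('=');
        if (idx < 0) {
            return "";
        }
        // Strings are immutable: substring() and trim() each return a new string
        return s.substring(idx + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(valueAfterFirstEquals("the text=text"));  // prints: text
        System.out.println(valueAfterFirstEquals("key = value "));   // prints: value
    }
}
```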
While there are many answers, here is a regex example:
String test = "eo21jüdjüqw=realString";
test = test.replaceAll(".+=", "");
System.out.println(test);
// prints realString
Explanation:
. matches any character (except for line terminators)
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy), so the match extends to the last = in the string
= matches the character = literally (case sensitive)
This is also a shady copy paste from https://regex101.com/ where you can try regex out.
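Because .+ is greedy, the pattern above strips everything up to the last =, which matters when the value itself contains =. The sketch below contrasts it with a reluctant, anchored variant that stops at the first =; the class name, method names and the second pattern are illustrative additions rather than part of the original answer.

```java
class GreedyVsReluctant {
    // Greedy: removes everything up to and including the LAST '='.
    static String afterLastEquals(String s) {
        return s.replaceAll(".+=", "");
    }

    // Reluctant and anchored: removes everything up to and including the FIRST '='.
    static String afterFirstEquals(String s) {
        return s.replaceFirst("^.*?=", "");
    }

    public static void main(String[] args) {
        System.out.println(afterLastEquals("name=a=b"));   // prints: b
        System.out.println(afterFirstEquals("name=a=b"));  // prints: a=b
    }
}
```

Note also that .+ needs at least one character before the =, so an input such as "=x" is left untouched by the greedy version.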
You can split the string on = into an array and take the second element, which holds the text after the = sign.
For example:
String currentString = "Fruit = they taste good";
String[] separated = currentString.split("=");
String key = separated[0].trim();   // "Fruit"
String value = separated[1].trim(); // "they taste good"
then separated[1] holds the text after = in the original string (up to the next =, if there is one); trim() strips the surrounding spaces.
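If the value may itself contain =, the two-argument split with a limit of 2 keeps everything after the first = in one piece. A hedged sketch, with hypothetical class and method names:

```java
import java.util.Arrays;

class SplitOnce {
    // Splits on the first '=' only and trims both parts.
    // The result has length 1 when the input contains no '='.
    static String[] splitOnce(String s) {
        String[] parts = s.split("=", 2);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnce("Fruit = they taste good"))); // prints: [Fruit, they taste good]
        System.out.println(Arrays.toString(splitOnce("a=b=c")));                   // prints: [a, b=c]
    }
}
```

Callers should check the array length before reading index 1, since a string without = yields a single element.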
I know this is asked about Java but this seems to also be the first search result for Kotlin so you should know that Kotlin has the String.substringAfter(delimiter: String, missingDelimiterValue: String = this) extension for this case.
Its implementation is:
val index = indexOf(delimiter)
return if (index == -1)
missingDelimiterValue
else
substring(index + delimiter.length, length)
Maybe locate the first occurrence of the character in the URL String. For Example:
String URL = "http://test.net/demo_form.asp?name1=stringTest";
int index = URL.indexOf("=");
Then, split the String based on an index
String Result = URL.substring(index+1); //index+1 to skip =
String Result now contains the value: stringTest
If you use the Apache Commons Lang3 library, you can also use the substringAfter method of the StringUtils utility class.
Official documentation is here.
Examples:
String value = StringUtils.substringAfter("key=value", "="); // "value"
// when there are spaces around the value (e.g. read from a file instead of query params)
String trimmed = StringUtils.trimToEmpty(StringUtils.substringAfter("key = value", "=")); // "value"
It handles the case where your values can contain the '=' character, as it splits at the first occurrence.
If your keys can also contain the '=' character it will not work (but neither will the other methods); in URL query params, such a character should be escaped anyway.

How do I extract the second occurrence of a character?

How do I extract '1358751074-6824' from this
http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg
and it also needs to extract '13587510746824' from this
http://api.discogs.com/images/R-1169056-13587510746824.jpeg
So I thought I could do it by taking the substring from the second - of the last path component up to the final dot, but how do I work out the position of the second -?
Depending on the allowed variations of the string, you could do something like:
String extract = s.replaceAll(".*?-.*?-([\\d-]+).*", "$1");
.*?- skips everything up to the first hyphen
.*?- skips everything up to the second hyphen
([\\d-]+) is the part you want to keep: digits and hyphens
.* skips the rest of the string
You can work out the position of the second dash without regular expressions - by finding the position of the first dash, and working from there:
int pos = str.indexOf('-', str.indexOf('-')+1);
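Putting the second-dash lookup together with the extraction the question asks for might look like the sketch below. The class and method names are hypothetical, and it assumes the last path component always has the form seen in the two sample URLs (at least two dashes followed by a file extension).

```java
class SecondDash {
    // Returns the text between the second '-' of the last path component
    // and the final '.', e.g. "1358751074-6824".
    static String afterSecondDash(String url) {
        // Work on the file name only, so dashes elsewhere in the URL are ignored
        String name = url.substring(url.lastIndexOf('/') + 1);
        int first = name.indexOf('-');
        int second = first < 0 ? -1 : name.indexOf('-', first + 1);
        int dot = name.lastIndexOf('.');
        if (second < 0 || dot < second) {
            throw new IllegalArgumentException("Unexpected file name: " + name);
        }
        return name.substring(second + 1, dot);
    }

    public static void main(String[] args) {
        // prints: 1358751074-6824
        System.out.println(afterSecondDash("http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg"));
        // prints: 13587510746824
        System.out.println(afterSecondDash("http://api.discogs.com/images/R-1169056-13587510746824.jpeg"));
    }
}
```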
You can try something like this:
// Your original String
String str = "http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg";
// identify the one-before-last-dash
int i=str.lastIndexOf("-", str.lastIndexOf("-")-1);
// Extract the value you want
String newStr = str.substring(i+1, str.lastIndexOf("."));
// Keep digits only (strips the remaining dash)
String strNums = newStr.replaceAll("[^0-9]+", "");

having trouble with arrays and maybe split

String realstring = "&&&.&&&&";
Double value = 555.55555;
String[] arraystring = realstring.split(".");
String stringvalue = String.valueOf(value);
String [] valuearrayed = stringvalue.split(".");
System.out.println(arraystring[0]);
Sorry if it looks bad. Rewrote on my phone. I keep getting ArrayIndexOutOfBoundsException: 0 at the System.out.println. I have looked and can't figure it out. Thanks for the help.
split() takes a regexp as argument, not a literal string. You have to escape the dot:
string.split("\\.");
or
string.split(Pattern.quote("."));
Or you could also simply use indexOf('.') and substring() to get the two parts of your string.
And if the goal is to get the integer part of a double, you could also simply use
long truncated = (long) doubleValue;
split uses a regex as its parameter, and in regex . means "any character except line separators", so you could expect that "a.bc".split(".") would create an array of empty strings like ["","","","",""]. The only reason this does not happen is (from the split javadoc):
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
So because all the strings are empty, you get an empty array (which is why you see ArrayIndexOutOfBoundsException).
To turn off this removal mechanism you would have to use the split(regex, limit) version with a negative limit.
To split on a literal . you need to escape it as \. (which in Java needs to be written as "\\." because \ is also a metacharacter in String literals), or use [.] or another regex quoting mechanism.
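The behaviour described above is easy to confirm with a few calls. The helper below is a hypothetical wrapper around String.split(regex, limit) that just reports how many pieces come back.

```java
class SplitDotDemo {
    // Number of pieces produced by input.split(regex, limit).
    // A limit of 0 behaves like the one-argument split(regex).
    static int pieces(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        // Unescaped dot matches every character; trailing empty strings are dropped
        System.out.println(pieces("a.bc", ".", 0));      // prints: 0
        // Negative limit keeps the empty strings
        System.out.println(pieces("a.bc", ".", -1));     // prints: 5
        // Escaped dot splits on the literal character
        System.out.println(pieces("a.bc", "\\.", 0));    // prints: 2
        System.out.println(pieces("&&&.&&&&", "[.]", 0)); // prints: 2
    }
}
```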
Dot (.) is a special character so you need to escape it.
String realstring = "&&&.&&&&";
String[] partsOfString = realstring.split("\\.");
String part1 = partsOfString[0];
String part2 = partsOfString[1];
System.out.println(part1);
this will print expected result of
&&&
It's also handy to test whether the given string contains this character. You can do this with:
if (string.contains(".")) {
// Split it.
} else {
throw new IllegalArgumentException("String " + string + " does not contain .");
}

Remove Special Characters For A Pattern Java

I want to remove these characters from a String:
+ - ! ( ) { } [ ] ^ ~ : \
I also want to remove these:
/*
*/
&&
||
I mean that I will not remove a single & or |; I will remove them only when the second character follows the first one (/* */ && ||).
How can I do that efficiently and fast in Java?
Example:
a:b+c1|x||c*(?)
will be:
abc1|xc*?
This can be done via a long, but actually very simple regex.
String aString = "a:b+c1|x||c*(?)";
String sanitizedString = aString.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(sanitizedString);
I think that the java.lang.String.replaceAll(String regex, String replacement) is all you need:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll(java.lang.String, java.lang.String).
There are two ways to do that:
1)
ArrayList<String> arrayList = new ArrayList<String>();
arrayList.add("+");
arrayList.add("-");
arrayList.add("||");
arrayList.add("&&");
arrayList.add("(");
arrayList.add(")");
arrayList.add("{");
arrayList.add("}");
arrayList.add("[");
arrayList.add("]");
arrayList.add("~");
arrayList.add("^");
arrayList.add(":");
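arrayList.add("!");
arrayList.add("\\"); // a single backslash; a lone "/" is not in the question's list and would break "/*" and "*/"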
arrayList.add("/*");
arrayList.add("*/");
String string = "a:b+c1|x||c*(?)";
for (int i = 0; i < arrayList.size(); i++) {
// replace() leaves the string unchanged when the token is absent, so no contains() check is needed
string = string.replace(arrayList.get(i), "");
}
System.out.println(string);
2)
String string = "a:b+c1|x||c*(?)";
string = string.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(string);
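A middle ground between the two approaches is to keep the tokens in a readable list and build the regex from it with Pattern.quote, which avoids hand-escaping. This is only a sketch: the class name, constant names and method name are hypothetical, and it assumes Java 9+ for List.of.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TokenStripper {
    // Tokens from the question: single characters plus the two-character sequences
    static final List<String> TOKENS = List.of(
        "+", "-", "!", "(", ")", "{", "}", "[", "]", "^", "~", ":", "\\",
        "/*", "*/", "&&", "||");

    // Each token is quoted so it is matched literally, then joined with '|'.
    // Compiled once and reused, since Pattern instances are immutable.
    static final Pattern PATTERN = Pattern.compile(
        TOKENS.stream().map(Pattern::quote).collect(Collectors.joining("|")));

    static String strip(String input) {
        return PATTERN.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: abc1|xc*?
        System.out.println(strip("a:b+c1|x||c*(?)"));
    }
}
```

A single | or & survives because only the doubled forms are in the token list.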
Thomas wrote on How to remove special characters from a string?:
That depends on what you define as special characters, but try
replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since
you'd then either have to escape it or it would mean "any but these
characters".
Another note: the - character needs to be the first or last one on the
list, otherwise you'd have to escape it or it would define a range (
e.g. :-, would mean "all characters in the range : to ,).
So, in order to keep consistency and not depend on character
positioning, you might want to escape all those characters that have a
special meaning in regular expressions (the following list is not
complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
If you want to get rid of all punctuation and symbols, try this regex:
[\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape
back slashes: "[\\p{P}\\p{S}]").
A third way could be something like this, if you can exactly define
what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
Here's less restrictive alternative to the "define allowed characters"
approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and
not a separator (whitespace, linebreak etc.). Note that you can't use
[\P{L}\P{Z}] (upper case P means not having that property), since that
would mean "everything that is not a letter or not whitespace", which
almost matches everything, since letters are not whitespace and vice
versa.
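To see how the last two patterns differ, the sketch below applies both to text containing accented letters and digits. The class and method names are hypothetical. Without the UNICODE_CHARACTER_CLASS flag, \w covers only ASCII letters, digits and underscore, while \p{L} covers letters in any script but not digits; note also that \p{Z} covers space and line/paragraph separators, not tab or newline.

```java
class UnicodeStripDemo {
    // Keeps ASCII word characters ([a-zA-Z_0-9]) and whitespace
    static String keepWordCharsAndWhitespace(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // Keeps letters of any script and Unicode separators
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo, w\u00f6rld!"; // "héllo, wörld!" written with escapes
        System.out.println(keepWordCharsAndWhitespace(text)); // prints: hllo wrld
        System.out.println(keepLettersAndSeparators(text));   // accented letters survive
        System.out.println(keepLettersAndSeparators("a1 b2")); // prints: a b
    }
}
```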
