Java equivalent for Python's str.strip() - java

Suppose I would like to remove all " surrounding a string. In Python, I would:
>>> s='"Don\'t need the quotes"'
>>> print s
"Don't need the quotes"
>>> print s.strip('"')
Don't need the quotes
And if I want to remove multiple characters, e.g. " and parentheses:
>>> s='"(Don\'t need quotes and parens)"'
>>> print s
"(Don't need quotes and parens)"
>>> print s.strip('"()')
Don't need quotes and parens
What's the elegant way to strip a string in Java?

Suppose I would like to remove all " surrounding a string
The closest equivalent to the Python code is:
s = s.replaceAll("^\"+", "").replaceAll("\"+$", "");
And if I want to remove multiple characters, e.g. " and parentheses:
s = s.replaceAll("^[\"()]+", "").replaceAll("[\"()]+$", "");
If you can use Apache Commons Lang, there's StringUtils.strip().
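If neither regex nor an extra library appeals, the same behaviour can be sketched with two index scans. This is only an illustration: the class name StripChars and the method strip are hypothetical helpers, not part of the JDK or of Commons Lang.

```java
final class StripChars {
    /**
     * Removes leading and trailing characters that occur in {@code chars},
     * similar in spirit to Python's str.strip(chars). Hypothetical helper.
     */
    static String strip(String s, String chars) {
        int start = 0;
        int end = s.length();
        // Advance past leading characters that are in the strip set.
        while (start < end && chars.indexOf(s.charAt(start)) >= 0) {
            start++;
        }
        // Back up over trailing characters that are in the strip set.
        while (end > start && chars.indexOf(s.charAt(end - 1)) >= 0) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(strip("\"Don't need the quotes\"", "\""));
        System.out.println(strip("\"(Don't need quotes and parens)\"", "\"()"));
    }
}
```

Characters in the middle of the string are left alone, just as with the anchored regex versions.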

The Guava library has a handy utility for it. The library contains CharMatcher.trimFrom(), which does what you want. You just need to create a CharMatcher which matches the characters you want to remove.
Code:
CharMatcher matcher = CharMatcher.is('"');
System.out.println(matcher.trimFrom(s));
CharMatcher matcher2 = CharMatcher.anyOf("\"()");
System.out.println(matcher2.trimFrom(s));
Internally, this creates no intermediate strings: it scans inward from both ends and calls s.subSequence() once. Since it also needs no regular expressions, I'd guess it's the fastest solution (and surely the cleanest and easiest to understand).

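In Java, you can do it like this (note that it removes every single and double quote, not just the surrounding ones):
s = s.replaceAll("\"", "").replaceAll("'", "");
If you only want to remove the leading and trailing quotes, use replaceAll with anchors (replace() takes a literal string, so ^ and $ would not work there):
s = s.replaceAll("^'", "").replaceAll("'$", "").replaceAll("^\"", "").replaceAll("\"$", "");
Or, more simply:
s = s.replaceAll("^\"|\"$", "").replaceAll("^'|'$", "");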

This removes " and ( at the beginning of a string, and " and ) at the end:
String str = "\"te\"st\"";
str = str.replaceAll("^[\"\\(]+|[\"\\)]+$", "");
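To see why the anchors matter, here is a small sketch comparing an anchored pattern with an unanchored removal on the same input. The class and method names are made up for the illustration.

```java
final class AnchoredStripDemo {
    // Strips runs of " and ( from the start, and runs of " and ) from the end.
    static String stripEnds(String s) {
        return s.replaceAll("^[\"(]+|[\")]+$", "");
    }

    // Removes every double quote, wherever it occurs.
    static String removeAllQuotes(String s) {
        return s.replace("\"", "");
    }

    public static void main(String[] args) {
        String input = "\"te\"st\"";
        System.out.println(stripEnds(input));       // te"st
        System.out.println(removeAllQuotes(input)); // test
    }
}
```

Only the anchored version keeps the quote in the middle of the string, which is what Python's strip() does.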

Try this:
String newS = s.replaceAll("\"", "");
This replaces every double quote in the string (not just the surrounding ones) with the empty string.

Related

Java - Remove only the first backslash

Small Java question regarding how to remove only the first backslash please.
I have a string which looks like this:
String s = "\\u6df1\\u5733";
Please note, there are two backslashes, and multiple occurrences.
Hence, when this is displayed, the visual result is:
\深\圳
I would like to just remove any extra backslashes, having a result like this:
深圳
So far, I have tried this:
String s = "\\u6df1\\u5733";
String ss = s.replaceAll("\\", "");
But it is still not working.
What is the correct way to get 深圳 from "\\u6df1\\u5733"?
Thank you
Try this.
String s = "\\u6df1\\u5733";
Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u[0-9a-f]+", Pattern.CASE_INSENSITIVE);
String ss = UNICODE_ESCAPE.matcher(s).results()
.map(x -> new String(Character.toChars(Integer.parseInt(x.group().substring(2), 16))))
.collect(Collectors.joining());
System.out.println(ss);
UNICODE_ESCAPE.matcher(s).results() returns a stream of MatchResult.
x.group().substring(2) extracts the hexadecimal part "xxxx" from "\\uxxxx".
Integer.parseInt(..., 16) converts it to an integer value that is a code point.
Character.toChars() converts it to an array of char.
new String(...) converts it to a String, and .collect(Collectors.joining()) concatenates all of them.
output:
深圳
Going by this output:
\深\圳
you actually have two unicode characters each preceded by one backslash.
In a Java string literal, that would look like this:
String s = "\\\u6df1\\\u5733";
If you want to remove the backslashes (\\) and leave the unicode character codes (e.g. \u6df1), then you just need replace.
String ss = s.replace("\\", "");
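replaceAll("\\", "") won't work for this, because its first argument is a regular expression, and a lone backslash is not a valid one (it throws PatternSyntaxException). With replaceAll you would have to write "\\\\" instead.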
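The difference between the two methods can be checked with a short sketch. It assumes the string really contains one backslash before each character, as in the literal above. The class and method names are hypothetical.

```java
import java.util.regex.PatternSyntaxException;

final class BackslashDemo {
    // replace() treats its argument as a literal: "\\" is one backslash.
    static String viaReplace(String s) {
        return s.replace("\\", "");
    }

    // replaceAll() takes a regex, so the backslash must be escaped again.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\\\", "");
    }

    // A regex consisting of a single backslash does not compile.
    static boolean loneBackslashIsInvalidRegex() {
        try {
            "x".replaceAll("\\", "");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "\\\u6df1\\\u5733"; // a backslash before each of the two characters
        System.out.println(viaReplace(s).equals(viaReplaceAll(s)));
        System.out.println(loneBackslashIsInvalidRegex());
    }
}
```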

How to insert quotes inside parenthesis using a regex

I have a string of SVG markup that contains multiples of these:
url(#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100)
and I need them to be like this:
url('#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100')
with quotes inside the parenthesis.
These will be mixed inside a long string containing lots of different markup, so needs to be very accurate.
You can use a regex like this:
\((.*?)\)
With the replacement string ('$1')
The idea is to capture everything within the parentheses and wrap it in single quotes.
So you can use code like this:
String str = "url(#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100)";
str = str.replaceAll("\\((.*?)\\)", "('$1')");
// Output: url('#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100')
If you want a better-performing regex, you can use:
str = str.replaceAll("\\(([^)]*)\\)", "('$1')");
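Since the markup is mixed with other content, it may be safer to match only url(...) rather than every pair of parentheses. The following is a sketch under that assumption. The UrlQuoter class is hypothetical, and the pattern deliberately skips values that already contain a quote.

```java
final class UrlQuoter {
    /**
     * Wraps the argument of unquoted url(...) references in single quotes.
     * Other parenthesised text, such as translate(10), is left untouched.
     */
    static String quoteUrls(String markup) {
        // [^'")]* matches the argument only if it has no quote and no ')'.
        return markup.replaceAll("url\\(([^'\")]*)\\)", "url('$1')");
    }

    public static void main(String[] args) {
        System.out.println(quoteUrls("fill=\"url(#g1)\" transform=\"translate(10)\""));
        System.out.println(quoteUrls("url('#g1')")); // already quoted, unchanged
    }
}
```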
replaceAll removes part of the string and puts a fixed replacement in its place.
Because the replacement differs on each side (a quote after the opening parenthesis, a quote before the closing one), a simple approach with replaceAll is to do it in two passes:
String Str = "url(#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100)";
Str = Str.replaceAll("\\(", "('"); // replace left parenthesis
Str = Str.replaceAll("\\)", "')"); // replace right parenthesis
System.out.print("Return Value: " + Str);
// Return Value: url('#586-xr___83_193_101__rgba_243_156_18_1__0-rgba_243_156_18_1__100')

Remove parenthesis from String using java regex

I want to remove parentheses using a Java regular expression, but I get the error "No group 1". Please see my code and help me.
public String find_parenthesis(String Expr){
String s;
String ss;
Pattern p = Pattern.compile("\\(.+?\\)");
Matcher m = p.matcher(Expr);
if(m.find()){
s = m.group(1);
ss = "("+s+")";
Expr = Expr.replaceAll(ss, s);
return find_parenthesis(Expr);
}
else
return Expr;
}
And this is my main:
public static void main(String args[]){
Calculator c1 = new Calculator();
String s = "(4+5)+6";
System.out.println(s);
s = c1.find_parenthesis(s);
System.out.println(s);
}
The simplest method is to just remove all parentheses from the string, regardless of whether they are balanced or not.
String replaced = "(4+5)+6".replaceAll("[()]", "");
Correctly handling the balancing requires parsing (or truly ugly REs that only match to a limited depth, or “cleverness” with repeated regular expression substitutions). For most cases, such complexity is overkill; the simplest thing that could possibly work is good enough.
What you want is this: s = s.replaceAll("[()]","");
For more on regex, visit regex tutorial.
You're getting the error because your regex doesn't have any groups, but I suggest you use this much simpler, one-line approach:
expr = expr.replaceAll("\\((.+?)\\)", "$1");
You can't do this with a regex at all. It won't remove the matching parentheses, just the first left and the first right, and then you won't be able to get the correct result from the expression. You need a parser for expressions. Have a look around for recursive descent expression parsers, the Dijkstra shunting-yard algorithm, etc.
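As a rough idea of what such a parser looks like, here is a minimal recursive descent sketch. It is an assumption-laden illustration, not the asker's Calculator: it handles only non-negative integers, the operators + - * /, and parentheses, and the ExprParser class is hypothetical.

```java
final class ExprParser {
    private final String src;
    private int pos;

    private ExprParser(String src) {
        this.src = src.replace(" ", ""); // ignore spaces
    }

    /** Evaluates an expression such as "(4+5)+6". */
    static int eval(String expression) {
        ExprParser p = new ExprParser(expression);
        int value = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (pos < src.length() && (src.charAt(pos) == '+' || src.charAt(pos) == '-')) {
            char op = src.charAt(pos++);
            int rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (pos < src.length() && (src.charAt(pos) == '*' || src.charAt(pos) == '/')) {
            char op = src.charAt(pos++);
            int rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := NUMBER | '(' expr ')'
    private int factor() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++; // consume '('
            int value = expr();
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("Missing ')' at " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(eval("(4+5)+6")); // 15
        System.out.println(eval("2*(3+4)")); // 14
    }
}
```

Each grammar rule becomes one method, and nesting is handled by factor() calling back into expr(), so unbalanced input is rejected instead of silently mangled.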
The regular expression defines a character class consisting of any whitespace character (\s, which is written as \\s because we're passing it in a Java String), a dash (escaped because a dash has a special meaning inside character classes), and parentheses. Try this working code:
phoneNumber.replaceAll("[\\s\\-()]", "");
I know I'm very late here. But, just in case you're still looking for a better answer. If you want to remove both open and close parenthesis from a string, you can use a very simple method like this:
String s = "(4+5)+6";
s=s.replaceAll("\\(", "").replaceAll("\\)","");
If you are using this:
s=s.replaceAll("()", "");
you are instructing the code to look for the regex (), an empty group that matches only the empty string, so no parenthesis is removed. Instead, remove the two parentheses separately, each one escaped.
To explain in detail, consider the below code:
String s = "(4+5)+6";
String s1=s.replaceAll("\\(", "").replaceAll("\\)","");
System.out.println(s1);
String s2 = s.replaceAll("()", "");
System.out.println(s2);
The output for this code will be:
4+5+6
(4+5)+6
Also, use replaceAll only if you are in need of a regex. In other cases, replace works just fine. See below:
String s = "(4+5)+6";
String s1=s.replace("(", "").replace(")","");
Output:
4+5+6
Hope this helps!

How to split a string in Java using "%*%" as separator, including the separator in the result list of strings?

I'm looking for the simplest way of tokenizing strings such as
INPUT OUTPUT
"hello %my% world" -> "hello ", "%my%", " world"
in Java. Is it possible to accomplish this with regex? I am basically looking for a String.split() that takes as separator something of the form "%*%" but that won't ignore it, as it seems to generally do.
Thanks
No, you can't do this the way you explained it. The reason is--it's ambiguous!
You give the example:
"hello %my% world" -> "hello ", "%my%", " world"
Should the % be attached to the string before it or after it?
Should the output be
"hello ", "%my", "% world"
Or, perhaps the output should be
"hello %", "my%", " world"
In your example you don't follow either of these rules. You come up with %my% which attaches the delimiter first to the string after it appears and then to the string before it appears.
Do you see the ambiguity?
So, you first need to come up with a clear set of rules about where you want the delimiter to be attached. Once you do this, one simple (although not particularly efficient since Strings are immutable) way of achieving what you want is to:
Use String.split() to split the strings in the normal way
Follow your rule set to re-add the delimiter to where it should be in the string.
A simpler solution would be to just split the string by %s. That way, every other subsequence would have been between %s. All you have to do afterwards is iterate over the results, toggling a flag to know if the result is a regular string or one between %s.
Pay special attention to how the split implementation handles empty subsequences. Some implementations discard empty subsequences at the beginning or end of the input, others discard all of them, and others discard none. (Java's String.split() drops trailing empty strings unless you pass a negative limit.)
This would not result in the exact output that you want, since the %s would be gone. However you can easily add those back if there is an actual need for them (and I presume there isn't).
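The split-and-toggle idea could look roughly like the sketch below. It assumes the % signs are balanced. The PercentTokenizer class is a hypothetical name, and here the % signs are added back around the inner pieces.

```java
import java.util.ArrayList;
import java.util.List;

final class PercentTokenizer {
    static List<String> tokenize(String input) {
        // Limit -1 keeps trailing empty strings, so the inside/outside
        // parity stays correct even when the input ends with '%'.
        String[] parts = input.split("%", -1);
        List<String> tokens = new ArrayList<>();
        boolean inside = false; // true while we are between two % signs
        for (String part : parts) {
            if (inside) {
                tokens.add("%" + part + "%");
            } else if (!part.isEmpty()) {
                tokens.add(part);
            }
            inside = !inside;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello %my% world"));
    }
}
```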
Why not split on the spaces between your words? In that case you will get "hello", "%my%", "world".
If possible, use a simpler delimiter. And I'm okay with jury-rigging "%" as your delimiter, just so you can get String.split() instead of regexps. But if that's not possible...
Regexps! You can parse this using a Matcher. If you know there's one delimiter per line, you specify a pattern that eats the whole line:
String singleDelimRegexp = "(.*)(%[^%]*%)(.*)";
Pattern singleDelimPattern = Pattern.compile(singleDelimRegexp);
Matcher singleDelimMatcher = singleDelimPattern.matcher(input);
if (singleDelimMatcher.matches()) {
String before = singleDelimMatcher.group(1);
String delim = singleDelimMatcher.group(2);
String after = singleDelimMatcher.group(3);
System.out.println(before + "//" + delim + "//" + after);
}
If the input is long and you need a chain of results, you use Matcher in a loop:
String multiDelimRegexp = "%[^%]*%";
Pattern multiDelimPattern = Pattern.compile(multiDelimRegexp);
Matcher multiDelimMatcher = multiDelimPattern.matcher(input);
int lastEnd = 0;
while (multiDelimMatcher.find()) {
String data = input.substring(lastEnd, multiDelimMatcher.start());
String delim = multiDelimMatcher.group();
lastEnd = multiDelimMatcher.end();
System.out.println(data);
System.out.println(delim);
}
String lastData = input.substring(lastEnd);
System.out.println(lastData);
Add those to a data structure as you go, and you'll build the whole parsed input.
Running on input: http://ideone.com/s8FzeW
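A more compact variant of the same loop, offered only as a sketch: one pattern matches either a %...% token or a run of non-% text, so every find() yields the next piece in order. The RegexTokenizer class is hypothetical, and a stray unmatched % is silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTokenizer {
    // Either a delimited token (%...%) or a run of characters without '%'.
    private static final Pattern TOKEN = Pattern.compile("%[^%]*%|[^%]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello %my% world"));
    }
}
```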

How to find and replace a substring?

For example, I have a string in which I must find and replace multiple substrings, all of which start with #, contain 6 symbols, end with ' and should not contain ) ... what do you think would be the best way of achieving that?
Thanks!
Edit:
Just one more thing I forgot: to make the replacement, I need that substring, i.e. it gets replaced by a string generated from the substring being replaced.
yourNewText=yourOldText.replaceAll("#[^)]{6}'", "");
Or programmatically:
Matcher matcher = Pattern.compile("#[^)]{6}'").matcher(yourOldText);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    // implement your custom logic here; matcher.group() is the matched String
    matcher.appendReplacement(sb, someReplacement(matcher.group()));
}
matcher.appendTail(sb);
String yourNewString = sb.toString();
Assuming you just know the substrings are formatted like you explained above, but not exactly which 6 characters, try the following:
String result = input.replaceAll("#[^\\)]{6}'", "replacement"); //pattern to replace is #+6 characters not being ) + '
You must use replaceAll with the right regular expression:
myString.replaceAll("#[^)]{6}'", "something")
If you need to replace with an extract of the matched string, use a capturing group, like this:
myString.replaceAll("#([^)]{6})'", "blah $1 blah")
The $1 in the second String refers to the first parenthesized group in the first String.
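When the replacement has to be computed from the match rather than just rearranged, Java 9 and later also let you pass a function to Matcher.replaceAll. The sketch below assumes Java 9+. The ComputedReplacement class and the upper-casing logic are placeholders for whatever transformation is actually needed.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ComputedReplacement {
    // '#', then six characters other than ')', then a single quote.
    private static final Pattern TOKEN = Pattern.compile("#([^)]{6})'");

    static String upperCaseTokens(String text) {
        return TOKEN.matcher(text).replaceAll(m ->
                // quoteReplacement keeps any '$' or '\' in the result literal.
                Matcher.quoteReplacement("[" + m.group(1).toUpperCase(Locale.ROOT) + "]"));
    }

    public static void main(String[] args) {
        System.out.println(upperCaseTokens("x #abcdef' y")); // x [ABCDEF] y
    }
}
```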
this might not be the best way to do it but...
youstring = youstring.replace("#something'", "new stringx");
youstring = youstring.replace("#something2'", "new stringy");
youstring = youstring.replace("#something3'", "new stringz");
//edited after reading comments, thanks
