Replacing strings in java question - java

Say I have a java source file saved into a String variable like so.
String contents = Utils.getTextFromFile(new File(fileName));
And say there is a line of text in the source file like so
String x = "Hello World\n";
Notice the newline character at the end.
In my code, I know of the existence of Hello World, but not Hello World\n
so therefore a call to
String search = "Hello World";
contents = contents.replaceAll(search, "Something Else");
will fail because of that newline character. How can I make it so it will match in the case of one or many newline characters? Could this be a regular expression I add to the end search variable?
EDIT:
I am replacing string literals with variables. I know the literals, but I don't know whether they end with a newline character or not. Here is an example of the code before my replacement. For the replacement, I know that "application running at a time." exists, but not "application running at a time.\n\n"
int option = JOptionPane.showConfirmDialog(null,"There is another application running. There can only be one application\n" +
"application running at a time.\n\n" +
"Press OK to close the other application\n" +
"Press Cancel to close this application",
"Multiple Instances of weh detected",
JOptionPane.OK_CANCEL_OPTION, JOptionPane.ERROR_MESSAGE);
And here is an example after my replacement
int option = JOptionPane.showConfirmDialog(null,"There is another application running. There can only be one application\n" +
"application running at a time.\n\n" +
"Press OK to close the other application\n" +
"PRESS_CANCEL_TO_CLOSE",
"MULTIPLE_INSTANCES_OF",
JOptionPane.OK_CANCEL_OPTION,
JOptionPane.ERROR_MESSAGE);
Notice that all of the literals without newlines get replaced (for example, "Multiple Instances of weh detected" is now "MULTIPLE_INSTANCES_OF"), but the ones with newlines do not. I am thinking that there is some regular expression I can add on to handle one or many newline characters when it does the replaceAll.

Well, since the first argument is actually a regular expression, you could try something like this as the first argument:
String search = "[Hh]ello [Ww]orld\\n*";
which will match hello world whether or not it is followed by one or more \n characters.
It's a bit redundant, and as stated it should not matter, but apparently it does.
Try it (I made it match a capital as well as a lowercase first letter, which is probably overdoing it).

If you're only replacing string literals, and you're ok with replacing every occurrence of the literal (not just the first one) then you should use the replace method instead of replaceAll.
Your first example should change to this:
String search = "Hello World";
contents = contents.replace(search, "Something Else");
The replaceAll does a regular expression replacement instead of a string literal replacement. This is generally slower, and is not strictly necessary for your use case.
Note that this answer assumes that the trailing newline characters can be left in the string (which you've said you are ok with in the comments).
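A minimal sketch of the literal replacement described above. The class name, method name, and sample source line are made up for illustration, and the sketch assumes the file contents are already in memory. Because replace treats the search text literally, characters such as . or ( in a literal need no escaping, and any \n escape sequence that follows the literal in the source is simply left in place.

```java
public class LiteralReplaceDemo {
    // Replaces every literal occurrence of search; nothing is interpreted as a regex.
    static String replaceLiteral(String contents, String search, String replacement) {
        return contents.replace(search, replacement);
    }

    public static void main(String[] args) {
        // Hypothetical source line. "\\n" is a backslash followed by 'n',
        // which is how the escape sequence appears in a .java file read from disk.
        String contents = "String x = \"Hello World\\n\";";
        System.out.println(replaceLiteral(contents, "Hello World", "Something Else"));
    }
}
```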

String search = "Hello World\n\n\n";
String result = search.replaceAll("Hello World(\\n*)", "Guten Morgen$1");
$1 refers to the first capturing group, marked by (...), counting from the opening parenthesis. \n is the newline, but the backslash needs to be escaped in a Java string literal, leading to two backslashes. * means 0 to n, so it catches 0, 1, 2, ... newlines. Since strings are immutable, the result of replaceAll has to be assigned to a variable.

As the commenters noted, the \n should not mess this up. However, if you are OK with removing the newlines, you can try this:
contents = contents.replaceAll(search + "\\n*", "Something Else");
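If the text being searched is Java source code, a trailing \n will usually be present as the two characters backslash and n rather than as a real line break. The following sketch handles both forms under that assumption. The class and method names are hypothetical. Pattern.quote keeps regex metacharacters in the search text (such as the final .) from being interpreted, and Matcher.quoteReplacement does the same for $ and \ in the replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingNewlineReplace {
    // Replaces search plus any run of trailing newlines, whether they appear as
    // real line breaks or as the two-character escape sequence backslash + n.
    static String replaceWithTrailingNewlines(String contents, String search, String replacement) {
        // Regex seen by the engine: \Q...\E(?:\\n|\n)*
        String regex = Pattern.quote(search) + "(?:\\\\n|\\n)*";
        return contents.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Hypothetical line of Java source text held in memory.
        String contents = "\"application running at a time.\\n\\n\" +";
        System.out.println(replaceWithTrailingNewlines(
                contents, "application running at a time.", "APP_RUNNING"));
    }
}
```

Whether the trailing newlines should be dropped or kept depends on the use case; to keep them, wrap the newline part in a capturing group and append $1 to the replacement instead of quoting it.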

As per your question, the method you need is
public String replaceAll(String regex, String replacement)
The two parameters are:
regex – the regular expression to match
replacement – the string to be substituted for every match
Some other related methods are:
replace(char oldChar, char newChar)
replace(CharSequence target, CharSequence replacement)
replaceFirst(String regex, String replacement)
An example:
public class StringReplaceAllExample {
    public static void main(String[] args) {
        String str = "Introduction 1231 to 124 basic 1243 programming 34563 concepts 5455";
        // Remove every run of digits.
        String result = str.replaceAll("[0-9]+", "");
        System.out.println(result);
        // Replace every run of letters with "Java".
        result = str.replaceAll("[a-zA-Z]+", "Java");
        System.out.println(result);
    }
}

Related

Java escape characters in strings - string contains \r (need to keep "r")

I have a string "EAD\rgonzalez" which is passed to me.
I need to pull out "rgonzalez" from it.
I am running into problems with the "\" character.
I cannot find the index of it, I cannot replace it, etc.
Any help on pulling the data after the "\" would be appreciated.
The string that i receive is in the format of domain\username; the data can vary.
Another example would be US\ngross where \n would be interpreted as a newline character.
To clarify, I am not adding a '\', i am trying to split a string on a '\'
This string contains '\r' which in itself is a character, a special one.
I need a way to make \r contained within my string two separate characters, a '\' and an 'r'.
You haven't provided any code, but I'm assuming what you're doing is something like this:
String user = request.getParameter("user"); // user = "EAD\rgonzalez"
If you were to declare a static string in your application, you would have to escape the backslash because it is a special character for Java strings:
String user = "EAD\\rgonzalez";
To split that string on the backslash you must escape it twice in the regex that you pass to the split method: once because backslash is a special character in Java string literals, and again because backslash is a special character in regular expressions. So instead of one backslash you have four: the single backslash is escaped for the regex, giving two, and then each of those is escaped again for the Java literal.
String[] parts = user.split("\\\\");
Now you have split the string:
System.out.println(parts[0]); // "EAD"
System.out.println(parts[1]); // "rgonzalez"
The string that i receive is in the format of domain\username... the data can vary
The data shouldn't vary if that is the input your program expects.
where \n would be interpreted as a newline character
I'm not sure how you'd get newlines from a single-line input form. If you do, then your input is invalid because it does not follow the format you've specified and are expecting. If you did interpret newlines and other whitespace characters, you would end up treating the whole thing as either the domain or the username, potentially breaking your program logic. You have stated the requirement of domain\username, and I don't think that requires you to handle any other form of input.
I am collecting this string from the header data from the request object in a webapp.
In that case, the raw value should not contain an escape character and is actually represented as the form "domain\\username" as a Java string. When you print the value, the escape characters aren't shown
I cannot find the index of it,
With the correct representation, indexOf("\\") will work...
pulling the data after the "\"
Since you would have the value as domain\\username, you need to escape both of the backslashes within the method of split(String pattern) since that is a regular expression.
For example,
public static void main (String[] args) throws java.lang.Exception
{
String in = "EAD\\rgonzalez";
System.out.println(in.indexOf("\\")); // find the index of '\'
String[] parts = in.split("\\\\"); // split on '\\'
System.out.println(Arrays.toString(parts));
}
Again, the string "EAD\rgonzalez" is not in the form of domain\username, as demonstrated here (note the four backslashes needed to match one literal backslash):
System.out.print("EAD\rgonzalez".matches("[A-Z]+\\\\[a-z]+")); // false
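As an alternative to split, here is a small sketch that avoids regular expressions entirely by using indexOf and substring. It assumes the runtime value really contains a backslash character, as discussed above. The class and method names are hypothetical.

```java
public class DomainUserSplit {
    // Returns the text before the first backslash, or "" if there is none.
    static String domain(String qualified) {
        int slash = qualified.indexOf('\\');
        return slash < 0 ? "" : qualified.substring(0, slash);
    }

    // Returns the text after the first backslash, or the whole input if there is none.
    static String userName(String qualified) {
        int slash = qualified.indexOf('\\');
        return slash < 0 ? qualified : qualified.substring(slash + 1);
    }

    public static void main(String[] args) {
        // The backslash is escaped in the literal; the runtime value is EAD\rgonzalez.
        String in = "EAD\\rgonzalez";
        System.out.println("domain: " + domain(in));
        System.out.println("username: " + userName(in));
    }
}
```

Because no regex is involved, the backslash only needs the single level of escaping required by the Java char literal.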
The magic you need is in org.apache.commons.lang.StringEscapeUtils
Here is a demo:
package ignoreescapeseq2;
import org.apache.commons.lang.StringEscapeUtils;
/*
* @author Charles Knell
*/
public class IgnoreEscapeSeq2 {
public static void main(String[] args) {
String string = "EAD\rgonzalez"; // REQUIRED INPUT STRING
String eString = StringEscapeUtils.escapeJava(string);
String [] sArray = eString.split("\\\\");
System.out.println("domain: " + sArray[0]);
System.out.println("username: " + sArray[1]);
}
}
Here is the output:
Although this may answer the question, there still seems to be a problem if you must define the string in Java source. As you said, "EAD\xgonzalez" isn't a valid Java string literal because \x isn't a valid escape sequence. The solution above only works if the input string never has to be written out explicitly as a literal, the way it is in the demo.

Empty Strings within a non empty String [duplicate]

I'm confused with a code
public class StringReplaceWithEmptyString
{
public static void main(String[] args)
{
String s1 = "asdfgh";
System.out.println(s1);
s1 = s1.replace("", "1");
System.out.println(s1);
}
}
And the output is:
asdfgh
1a1s1d1f1g1h1
So my first thought was that every character in a String has an empty String "" on both sides. But if that were the case, there should be two '1's after 'a' in the second line of output (one for the end of 'a' and one for the start of 's').
Then I checked whether a String is represented as a char[] in these links: "In Java, is a String an array of chars?" and "String representation in Java". The answer I got was yes.
So I tried to assign an empty character '' to a char variable, but it gives me a compiler error:
Invalid character constant
The same thing gives a compiler error when I try it in a char[]:
char[] c = {'','a','','s'}; // CTE
So I'm confused about three things.
How an empty String is represented by char[] ?
Why I'm getting that output for the above code?
How the String s1 is represented in char[] when it is initialized first time?
Sorry if I'm wrong at any part of my question.
Just adding some more explanation to Tim Biegeleisen's answer.
As of Java 8, the code of the replace method in the java.lang.String class is
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Here you can clearly see that the replacement is done by a regex Pattern matcher. In a regex, "" produces a zero-length match, and there is exactly one such match at each position in the string: before the first character, between each pair of adjacent characters, and after the last character.
So, behind the scenes, your code is executed as follows:
Pattern.compile("".toString(), Pattern.LITERAL).matcher("asdfgh").replaceAll(Matcher.quoteReplacement("1".toString()));
The output then becomes
1a1s1d1f1g1h1
Going with Andy Turner's great comment, your call to String#replace() is actually implemented using Matcher#replaceAll() on a pattern compiled with Pattern.LITERAL. As such, there is a regex replacement happening here. The matches occur before the first character, in between each pair of characters in the string, and after the last character.
^|a|s|d|f|g|h|$
^ this and every pipe matches to empty string ""
The match you are making is a zero length match. In Java's regex implementation used in String.replaceAll(), this behaves as the example above shows, namely matching each inter-character position and the positions before the first and after the last characters.
Here is a reference which discusses zero length matches in more detail: http://www.regexguru.com/2008/04/watch-out-for-zero-length-matches/
A zero-width or zero-length match is a regular expression match that does not match any characters. It matches only a position in the string. E.g. the regex \b matches between the 1 and , in 1,2.
This is because it does a regex match of the pattern/replacement you pass to the replace().
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Replaces each substring of this string that matches the literal target
sequence with the specified literal replacement sequence. The
replacement proceeds from the beginning of the string to the end, for
example, replacing "aa" with "b" in the string "aaa" will result in
"ba" rather than "ab".
Parameters:
target The sequence of char values
to be replaced
replacement The replacement sequence of char values
Returns: The resulting string
Throws: NullPointerException if target
or replacement is null.
Since:
1.5
Please read more at the link below ... (Also browse through the source code).
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java#String.replace%28java.lang.CharSequence%2Cjava.lang.CharSequence%29
A regex such as "" would match every possible empty string in a string. In this case it happens to be every empty space at the start and end and after every character in the string.
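To see why there is only one "1" between 'a' and 's', it can help to list where the empty matches occur. The sketch below (class and method names are hypothetical) compiles the empty pattern the same way the quoted replace source does and records the start index of every match. Each position in the string is matched exactly once, so a string of length n yields n + 1 matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchPositions {
    // Collects the start index of every match of the empty pattern in input.
    static List<Integer> emptyMatchStarts(String input) {
        Matcher m = Pattern.compile("", Pattern.LITERAL).matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // One match per position: before 'a', between each pair, and after 'h'.
        System.out.println(emptyMatchStarts("asdfgh"));
        System.out.println("asdfgh".replace("", "1"));
    }
}
```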

How can i put a string in one line

I'm doing some conversion from hex to ASCII. When I convert the string, I get the following:
F23C040100C1
100D200000000000
0000
I know the string comes out like this because of the base 16, but I want to put it on just one line, like this:
F23C040100C1100D2000000000000000
How can i do that?
I have tried:
mensagem.replaceAll("\r\n", " ");
There are multiple problems you could be running into, so I'll cover all of them in this answer.
First, any methods on String which appear to modify it actually return a new instance of String. That means if you do this:
String something = "Hello";
something.replaceAll("l", "");
System.out.println(something); //"Hello"
You'll want to do
something = something.replaceAll("l", "");
Or in your case
mensagem = mensagem.replaceAll("\r\n", " ");
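Secondly, there might not be any \r in the line break, only a \n, or vice versa. Because of that, you want to say that
if \r exists, remove it. if \n exists, also remove it
You can do so like this:
mensagem = mensagem.replaceAll("[\r\n]+", "");
The character class [\r\n] matches either character, and the + operator says to match one or more of the preceding element. A pattern such as \r*\n* would not work here: it also matches the empty string at every position, so the replacement would be inserted between all characters. The empty replacement joins the pieces with nothing in between, as in your expected output.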
mensagem = mensagem.replaceAll("\r\n|\r|\n", "");
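A self-contained sketch of the line-joining step, using the sample data from the question. The class and method names are hypothetical, and the sketch assumes the line breaks may be \r\n, \r, or \n. The pattern requires at least one line-break character, so it never matches the empty string.

```java
public class JoinLines {
    // Removes every run of carriage returns and line feeds.
    static String joinLines(String text) {
        return text.replaceAll("[\\r\\n]+", "");
    }

    public static void main(String[] args) {
        // Mixed line endings, to show that both kinds are handled.
        String mensagem = "F23C040100C1\r\n100D200000000000\n0000";
        mensagem = joinLines(mensagem); // assign the result; Strings are immutable
        System.out.println(mensagem);
    }
}
```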

Java String.replace/replaceAll not working

So, I'm trying to parse a String input in Java that contains (opening) square brackets. I have str.replace("\\[", ""), but this does absolutely nothing. I've tried replaceAll also, with more than one different regex, but the output is always unchanged. Part of me wonders if this is possibly caused by the fact that all my back-slash characters appear as yen symbols (ever since I added Japanese to my languages), but it's been that way for over a year and hasn't caused me any issues like this before.
Any idea what I might be doing wrong here?
Strings are immutable in Java. Make sure you re-assign the return value to the same String variable:
str = str.replaceAll("\\[", "");
For the normal replace method, you don't need to escape the bracket:
str = str.replace("[", "");
public String replaceAll(String regex, String replacement)
As the signature above shows, the replaceAll method expects a regular expression as its first argument, so you need to escape characters like "(" and ")" (with a backslash, written "\\" in a Java string literal) if they occur in the text that is to be replaced. For example:
String oldString = "This is (stringTobeReplaced) with brackets.";
String newString = oldString.replaceAll("\\(stringTobeReplaced\\)", "");
System.out.println(newString); // will output "This is  with brackets." (both surrounding spaces remain)
Another way of doing this is to use Pattern.quote("str") :
String newString = oldString.replaceAll(Pattern.quote("(stringTobeReplaced)"), "");
This will consider the string as literal to be replaced.
As always, the problem is not that "xxx doesn't work", it is that you don't know how to use it.
First things first:
a String is immutable; if you read the javadoc of .replace() and .replaceAll(), you will see that both specify that a new String instance is returned;
replace() accepts a string literal as its first argument, not a regex literal.
Which means that you probably meant to do:
str = str.replace("[", "");
If you only ever do:
str.replace("[", "");
then the new instance will be created but you ignore it...
In addition, and this is a common trap with String (the other being that .matches() is misnamed), in spite of their respective names, .replace() does replace all occurrences of its first argument with its second argument; the only difference is that .replaceAll() accepts a regex as a first argument, and a "regex aware" expression as its second argument; for more details, see the javadoc of Matcher's .replaceAll().
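The points above can be sketched in a few lines. The class and method names are hypothetical. The last method shows one way to get literal behavior out of replaceAll by quoting both arguments, using Pattern.quote and Matcher.quoteReplacement from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceVariants {
    // replace: the target is a literal, so "[" needs no escaping.
    static String stripBrackets(String s) {
        return s.replace("[", "");
    }

    // replaceAll: the first argument is a regex, so "[" must be escaped.
    static String stripBracketsRegex(String s) {
        return s.replaceAll("\\[", "");
    }

    // replaceAll with both arguments quoted behaves like a literal replacement.
    static String literalReplaceAll(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String str = "[a][b]";
        str.replace("[", "");          // result discarded: str is unchanged
        System.out.println(str);
        str = str.replace("[", "");    // result assigned: all "[" are removed
        System.out.println(str);
    }
}
```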
For it to work it has to be inside a method, and the value returned by replaceAll has to be used.
For example:
public class AnyClass {
    String str = "gtrg4\r\n" + "grtgy\r\n" + "grtht\r\n" + "htrjt\r\n" + "jtyjr\r\n" + "kytht";
    public String getStringModified() {
        // replaceAll returns a new String; str itself is not modified.
        return str.replaceAll("\r\n", "");
    }
}

String manipulation in java using replaceAll()

I have string of the following form:
भन्‍‌ने [-0.4531954191090929, 0.7931147934270654, -0.3875088408737827, -0.09427394940704822, 0.10065554475134718, -0.22044284832864797, 0.3532556916833505, -1.8256229909222224, 0.8036832111904731, 0.3395868096795993]
Whereever [ or ] or , char are present , I just want to remove them and i want each of the word and float separated by a space. It is follows:
भन्‍‌ने -0.4531954191090929 0.7931147934270654 -0.3875088408737827 -0.09427394940704822 0.10065554475134718 -0.22044284832864797 0.3532556916833505 -1.8256229909222224 0.8036832111904731 0.3395868096795993
I am representing each of these string as line. i did following:
line.replaceAll("([|]|,)$", " ");
But it didn't work for me. There was nothing change in the input line. Any help is really appreciated.
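Strings are immutable, so the result has to be assigned back. The pattern also needs fixing: [|] is a character class that only matches a literal |, and the trailing $ anchors the match to the end of the line. Try
line = line.replaceAll("[\\[\\],]", "");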
Or to be a bit more verbose, but avoiding regular expressions:
char subst = ' ';
line = line.replace('[', subst).replace(']', subst).replace(',', subst);
In Java, strings are immutable, meaning that the contents of a string never change. So, calling
line.replaceAll("([|]|,)$", " ");
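won't change the contents of line, but will return a new string. You need to assign the result of the method call to a variable. For instance, if you don't care about the original line, you can write
line = line.replaceAll("([|]|,)$", " ");
so that the result is not lost. To get the effect you originally expected, the pattern also has to be corrected: [ and ] must be escaped, and the $ restricts matches to the end of the line.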
[ and ] are special characters in a regular expression. replaceAll expects a regular expression as its first argument, so you have to escape them.
String result = line.replaceAll("[\\[\\],]", " ");
Cannot tell what you were trying to do with your original regex, why you had the $ there etc, would need to understand what you were expecting the things you put there to do.
Try
line = "asdf [foo, bar, baz]".replaceAll("(\\[|\\]|,)", "");
The regex syntax uses [] to define character classes like [a-z], so you have to escape them.
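Putting the pieces together, a possible sketch for the whole task follows. The class and method names, and the shortened sample line, are made up for illustration. It removes the three characters with a character class, then collapses any leftover runs of whitespace so that each token is separated by exactly one space.

```java
public class CleanVectorLine {
    // Removes '[', ']' and ',' and normalizes whitespace to single spaces.
    static String clean(String line) {
        return line.replaceAll("[\\[\\],]", "").trim().replaceAll("\\s+", " ");
    }

    // Splits the cleaned line into the word followed by the numbers.
    static String[] tokens(String line) {
        return clean(line).split(" ");
    }

    public static void main(String[] args) {
        // Shortened, hypothetical input in the same shape as the question's line.
        String line = "word [-0.45, 0.79, -0.38]";
        line = clean(line); // assign the result; Strings are immutable
        System.out.println(line);
    }
}
```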
