How does StringBuilder.replace() work in Java?

I wrote a Java program using the StringBuilder class:
class StringHandling2
{
public static void main(String args[])
{
StringBuilder sb=new StringBuilder("Welcom");
sb.replace(1,1,"JAVA");
System.out.println(sb);
}
}
The output is WJAVAelcom.
Then I changed the index values passed to replace() in this program:
class StringHandling2
{
public static void main(String args[])
{
StringBuilder sb=new StringBuilder("Welcom");
sb.replace(2,1,"JAVA");
System.out.println(sb);
}
}
Now the JVM throws a runtime error: StringIndexOutOfBoundsException.
How does replace(int start, int end, String str) work in these programs?

public StringBuilder replace(int start,
int end,
String str)
Replaces the characters in a substring of this sequence with
characters in the specified String. The substring begins at the
specified start and extends to the character at index end - 1 or to
the end of the sequence if no such character exists. First the
characters in the substring are removed and then the specified String
is inserted at start. (This sequence will be lengthened to accommodate
the specified String if necessary.)
Parameters:
start - The beginning index, inclusive.
end - The ending index, exclusive.
str - String that will replace previous contents.
Returns:
This object.
Throws:
StringIndexOutOfBoundsException - **if start is negative, greater
than length(), or greater than end**.
(source)
sb.replace(2,1,"JAVA");
2 > 1, hence the exception.
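To make the three cases concrete, here is a minimal sketch (the class name and the replaceIn helper are made up for illustration; only StringBuilder.replace itself is the real API). It shows that start == end is legal and simply inserts, while start > end is rejected:

```java
class ReplaceBoundsDemo {
    // Hypothetical helper: applies StringBuilder.replace to a fresh builder.
    static String replaceIn(String text, int start, int end, String str) {
        return new StringBuilder(text).replace(start, end, str).toString();
    }

    public static void main(String[] args) {
        // start == end: nothing is removed, so "JAVA" is inserted at index 1.
        System.out.println(replaceIn("Welcom", 1, 1, "JAVA")); // WJAVAelcom
        // start < end: the character at index 1 ('e') is replaced.
        System.out.println(replaceIn("Welcom", 1, 2, "JAVA")); // WJAVAlcom
        try {
            replaceIn("Welcom", 2, 1, "JAVA"); // start > end
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("rejected: start is greater than end");
        }
    }
}
```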

It replaces the characters between the start index (inclusive) and the end index (exclusive) with the string given in the third argument.
The error in this case comes from the fact that your start index can't be greater than your end index.

This is a very common question. According to the Javadoc of the replace() method:
public StringBuilder replace(int start,
int end,
String str)
Replaces the characters in a substring of this sequence with characters in the specified String. The substring begins at the specified start and extends to the character at index end - 1 or to the end of the sequence if no such character exists. First the characters in the substring are removed and then the specified String is inserted at start. (This sequence will be lengthened to accommodate the specified String if necessary.)
Parameters:
start - The beginning index, inclusive.
end - The ending index, exclusive.
str - String that will replace previous contents.
Returns:
This object.
Throws:
StringIndexOutOfBoundsException - if start is negative, greater than length(), or greater than end.
In your case the start index is greater than the end index, which throws a StringIndexOutOfBoundsException.
Edit: if you are curious why, the replacement is carried out by the AbstractStringBuilder class, which works on an internal array of characters that is hidden from us.

Good question.
StringBuilder.replace(int start, int end, String s) replaces the characters from index start up to index end - 1 with the String s.
For example (each result assumes a fresh "hello world" builder, because replace() modifies the builder in place):
StringBuilder str = new StringBuilder("hello world");
str.replace(0,1,"welcome"); // o/p - welcomeello world
str.replace(2,5,"welcome"); // o/p on a fresh builder - hewelcome world
I guess it internally works on a char[] and replaces char[start] to char[end-1] with the string argument.
So end must never be less than start (start == end is allowed and simply inserts the string).
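One detail worth checking when running examples like the one above: replace() mutates the builder, so a second call sees the result of the first. A small sketch (class and method names are illustrative only) comparing the chained result with a fresh builder:

```java
class ReplaceSequenceDemo {
    // Runs both replacements on the same builder, one after the other.
    static String chained() {
        StringBuilder str = new StringBuilder("hello world");
        str.replace(0, 1, "welcome"); // builder is now "welcomeello world"
        str.replace(2, 5, "welcome"); // indexes 2..4 of the modified text
        return str.toString();
    }

    // Runs only the second replacement, on an untouched builder.
    static String fresh() {
        return new StringBuilder("hello world").replace(2, 5, "welcome").toString();
    }

    public static void main(String[] args) {
        System.out.println(chained()); // wewelcomemeello world
        System.out.println(fresh());   // hewelcome world
    }
}
```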

Related

Does the trim() method also remove CRLF characters?

I just noticed that the trim() method also removes CRLF (new line) characters:
String s = "str\r\n";
s = s.trim();
System.out.println("--");
System.out.print(s);
System.out.println("--");
Is it intended to do so?
Yes, see the doc:
Otherwise, let k be the index of the first character in the string
whose code is greater than '\u0020', and let m be the index of the
last character in the string whose code is greater than '\u0020'. A
new String object is created, representing the substring of this
string that begins with the character at index k and ends with the
character at index m-that is, the result of this.substring(k, m+1).
CR+LF is CR (U+000D) followed by LF (U+000A); both codes are less than U+0020, so trim() removes them.
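A quick way to confirm this behaviour is a sketch like the following (the class name and the show helper are made up for the example). It also illustrates that trim() only touches the two ends of the string, so a CRLF in the middle survives:

```java
class TrimDemo {
    // Hypothetical helper: brackets make leading/trailing whitespace visible.
    static String show(String s) {
        return "[" + s.trim() + "]";
    }

    public static void main(String[] args) {
        System.out.println(show("str\r\n"));       // [str]
        System.out.println(show(" \t str \r\n"));  // [str]
        // Characters inside the string are never removed by trim().
        System.out.println("a\r\nb".trim().equals("a\r\nb")); // true
    }
}
```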

Empty Strings within a non-empty String

I'm confused by some code:
public class StringReplaceWithEmptyString
{
public static void main(String[] args)
{
String s1 = "asdfgh";
System.out.println(s1);
s1 = s1.replace("", "1");
System.out.println(s1);
}
}
And the output is:
asdfgh
1a1s1d1f1g1h1
My first thought was that every character in a String has an empty String "" on both sides. But if that were the case, there should be two '1's after 'a' in the second line of output (one for the end of 'a' and one for the start of 's').
I then checked whether a String is represented as a char[] (see "In Java, is a String an array of chars?" and "String representation in Java"), and the answer was yes.
So I tried to assign an empty character '' to a char variable, but it gives me a compiler error:
Invalid character constant
The same compiler error occurs when I try it in a char[]:
char[] c = {'','a','','s'}; // CTE
So I'm confused about three things:
How is an empty String represented by a char[]?
Why am I getting that output for the above code?
How is the String s1 represented as a char[] when it is first initialized?
Sorry if I'm wrong in any part of my question.
Just adding some more explanation to Tim Biegeleisen's answer.
As of Java 8, the code of the replace method in the java.lang.String class is:
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Here you can see that the replacement is done by a regex Pattern and Matcher. An empty pattern "" produces a zero-length match at every position in the input: before the first character, between each pair of characters, and after the last one.
So, behind the scenes, your code is executed as follows:
Pattern.compile("".toString(), Pattern.LITERAL).matcher("asdfgh").replaceAll(Matcher.quoteReplacement("1".toString()));
The output becomes
1a1s1d1f1g1h1
Going with Andy Turner's great comment, your call to String#replace() is actually implemented (in Java 8) using Matcher#replaceAll(). As such, there is a regex replacement happening here. The matches occur before the first character, between each pair of characters in the string, and after the last character.
^|a|s|d|f|g|h|$
(^ and $ mark the start and end of the string; each pipe is a position where the empty string "" matches)
The match you are making is a zero length match. In Java's regex implementation used in String.replaceAll(), this behaves as the example above shows, namely matching each inter-character position and the positions before the first and after the last characters.
Here is a reference which discusses zero length matches in more detail: http://www.regexguru.com/2008/04/watch-out-for-zero-length-matches/
A zero-width or zero-length match is a regular expression match that does not match any characters. It matches only a position in the string. E.g. the regex \b matches between the 1 and , in 1,2.
This is because replace() performs a regex match, with the target you pass in compiled as a literal pattern:
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Replaces each substring of this string that matches the literal target
sequence with the specified literal replacement sequence. The
replacement proceeds from the beginning of the string to the end, for
example, replacing "aa" with "b" in the string "aaa" will result in
"ba" rather than "ab".
Parameters:
target - The sequence of char values to be replaced
replacement - The replacement sequence of char values
Returns:
The resulting string
Throws:
NullPointerException - if target or replacement is null.
Since:
1.5
Please read more at the link below ... (Also browse through the source code).
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java#String.replace%28java.lang.CharSequence%2Cjava.lang.CharSequence%29
A regex such as "" matches every possible empty string within a string. In this case that means the position before the first character and the position after every character in the string.
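The zero-length matches described in these answers can be listed directly. The following is only a sketch (the class name and the matchPositions helper are invented for illustration) that compiles the target as a literal pattern, the way the Java 8 source quoted above does, and records where each match starts:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Hypothetical helper: start index of every match of a literal target.
    static List<Integer> matchPositions(String input, String target) {
        Matcher m = Pattern.compile(target, Pattern.LITERAL).matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Seven positions for six characters: 0 (before 'a') through 6 (after 'h').
        System.out.println(matchPositions("asdfgh", "")); // [0, 1, 2, 3, 4, 5, 6]
        // One replacement is inserted at each of those positions.
        System.out.println("asdfgh".replace("", "1"));    // 1a1s1d1f1g1h1
    }
}
```

Seven match positions for six characters explains why there is exactly one 1 between neighbouring letters rather than two.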

How can I select the substring represented by the original string except the two initial characters?

I know that in Java I can extract a substring from a String object doing something like:
String string= "Hello World";
String subString = string.substring(5);
And in this way the subString variable will contain only the Hello string,
and I know that I can also specify two indexes to select a substring, something like:
String subString = string.substring(6, 11);
That will select the World string.
But what can I do if, given a string, I want to select the substring consisting of the original string without its first two characters?
So for example I have:
String value = "12345";
and my substring has to be 345
How can I do it?
String subString = string.substring(5); doesn't do what you think it does: it returns " World" (everything from index 5 onward), not Hello.
string.substring(2) returns a String containing all the characters of the original String except the first two.
When you want a sub string starting at the beginning of the input String, you use the two parameters version - for example string.substring(0,5) for the first 5 characters.
From the Java docs,
Returns a new string that is a substring of this string. The substring
begins with the character at the specified index and extends to the
end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Parameters:
beginIndex - the beginning index, inclusive.
Returns:
the specified substring.
Throws:
IndexOutOfBoundsException - if beginIndex is negative or larger than the length of this String object.
public static void main(String[] args) {
String sb = "12345";
String s = sb.substring(2);
System.out.println(s);
}
output
345
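If the input might be shorter than two characters, substring(2) throws, so a guarded variant can be useful. This is a sketch only: dropFirst is a hypothetical helper, and returning an empty string for short inputs is an assumption about what the caller wants.

```java
class DropPrefixDemo {
    // Hypothetical helper: drops the first n characters, or returns ""
    // when the string has n or fewer characters (an assumed policy).
    static String dropFirst(String s, int n) {
        return s.length() <= n ? "" : s.substring(n);
    }

    public static void main(String[] args) {
        System.out.println(dropFirst("12345", 2)); // 345
        System.out.println(dropFirst("1", 2).isEmpty()); // true
        // For comparison: substring(5) keeps everything from index 5 onward.
        System.out.println("[" + "Hello World".substring(5) + "]"); // [ World]
    }
}
```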

Java: remove fixed number of characters at the start and the end from a string

I'm dealing with strings that differ in size but I know that there will always be the same number of characters at the beginning and end. e.g.
String i = "id3-jfhd3udj-endid";
String i = "id7-fdl3-endid";
String i = "id1-lkjf348hosjsldf-endid";
Is there a way (like a method in the String class) that would allow me to parse the string every time, removing the front characters and the back characters?
Also, what if the string contains two '-'?:
String i = "id3-t-jfhd3udj-t-endid";
Thanks
I know that there will always be the same number of characters at the
beginning and end
Just use the substring(int beginIndex, int endIndex) method:
Returns a new string that is a substring of this string. The substring
begins at the specified beginIndex and extends to the character at
index endIndex - 1.
i.substring(4, i.length()-6)
Alternatively, if you know that the part you want is always between the two '-', you can use :
i.substring(i.indexOf('-')+1, i.lastIndexOf('-'))
The last solution will always work whatever the number of characters at the
beginning and end are, while the first one will only work for your specific case.
You can use the substring() method to extract the required string.
Syntax: substring(int startIndex,int endIndex);
int index1= i.indexOf("-");
int index2= i.lastIndexOf("-");
i=i.substring(index1+1,index2);
Then use the String.substring method:
string.substring(string.indexOf("-")+1, string.lastIndexOf("-"))
Also, what if the string contains two '-'?:
String i = "id3-t-jfhd3udj-t-endid";
As long as the string contains an even number of dashes, and the substring that you want is in the middle, this code will work.
public String extractString(String s) {
String[] parts = s.split("-");
return parts[parts.length / 2];
}
If it's always in that exact format (with the '-'s), the easiest [probably not the most efficient though] way to get the middle section of the string is probably i.split("-")[1].
If not, you could use i.substring(4, i.length() - 6).
How about this approach
String i = "id1-lkjf348hosjsldf-endid";
System.out.println(i);
if (i.contains("-")) { // check for '-'
i = i.substring(1+i.indexOf('-'), i.lastIndexOf('-')); // first dash to last dash.
}
System.out.println(i);
If the string is always in the form of <prefix>-<content>-<suffix> (contains two '-') then you can try this:
String content = string.split("-")[1];
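The approaches suggested in these answers agree on the simple inputs but differ once the content itself contains dashes. A small comparison sketch (class and method names are illustrative; the prefix and suffix lengths 4 and 6 are taken from the sample strings in the question):

```java
class MiddlePartDemo {
    // Removes a fixed number of characters from both ends.
    static String byLength(String s, int prefixLen, int suffixLen) {
        return s.substring(prefixLen, s.length() - suffixLen);
    }

    // Keeps everything between the first and the last dash.
    // Throws if the string contains fewer than two dashes.
    static String betweenDashes(String s) {
        return s.substring(s.indexOf('-') + 1, s.lastIndexOf('-'));
    }

    public static void main(String[] args) {
        String simple = "id3-jfhd3udj-endid";
        String nested = "id3-t-jfhd3udj-t-endid";
        System.out.println(byLength(simple, 4, 6));   // jfhd3udj
        System.out.println(betweenDashes(simple));    // jfhd3udj
        System.out.println(byLength(nested, 4, 6));   // t-jfhd3udj-t
        System.out.println(betweenDashes(nested));    // t-jfhd3udj-t
        // split("-")[1] only returns the text up to the second dash.
        System.out.println(nested.split("-")[1]);     // t
    }
}
```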

Java: Parsing a string based on delimiter

I have to design an interface that fetches data from a machine and then plots it. I have already written the fetch part, which returns a string of the format A&B#.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$
The first five characters, A&B#., are the identifier. Please note that the fifth character is a line feed, i.e. ASCII 0xA.
The function I have written -
public static boolean checkStart(String str, String startStr) {
    // The identifier is the first five characters of the input.
    String initials = str.substring(0, 5);
    System.out.println("Here is start: " + initials);
    return startStr.equals(initials);
}
shows Here is start: A&B#. which is correct.
Question 1:
Why do we need str.substring(0,5)? When I use str.substring(0,4) it shows only Here is start: A&B#, i.e. the line feed is missing. Why does the line feed make this difference?
Further, to extract the remaining string I have to use s.substring(5,s.length()) instead of s.substring(6,s.length()),
i.e.
s.substring(6,s.length()) produces 3409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$ i.e. it is missing the first character after the identifier A&B#.
Question 2:
My parsing function is:
public static String[] StringParser(String str,String del){
String[] sParsed = str.split(del);
for (int i=0; i<sParsed.length; i++) {
System.out.println(sParsed[i]);
}
return sParsed;
}
It parses correctly for a String such as String s = "A&B#.13409/13400/13400/13386/13418/13427/13406/13383/13406/13412/13419/00000/00000/"; when the function is called as String[] tokens = StringParser(rightChannelString,"/");
But for a String such as String s = "A&B#.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$", the call String[] tokens = StringParser(rightChannelString,"$"); does not split the string at all.
I am not able to figure out why it behaves this way. Can anyone please tell me the solution?
Thanks
Regarding question 1, the java API says that the substring method takes 2 parameters:
beginIndex the begin index, inclusive.
endIndex the end index, exclusive.
So in your example
String: A&B#.134
Index: 01234567
substring(0,4) covers indexes 0 to 3, i.e. A&B#; that's why you have to pass 5 as the second parameter to include the line feed.
Regarding question 2, the split method takes a regular expression as its parameter, and $ is a special character there. To match a literal dollar sign you have to escape it with the \ character (and since \ is itself a special character in Java string literals, it must be escaped too):
String[] tokens = StringParser(rightChannelString,"\\$");
Q1: review the description of substring in the documentation:
Returns a new string that is a substring of this string.
The substring begins at the specified beginIndex and extends to the
character at index endIndex - 1. Thus the length of the substring
is endIndex-beginIndex.
Q2: the split method takes a regular expression for the separator. $ is a special character in regular expressions: it matches the end of the line, so it must be escaped to match a literal dollar sign.
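The difference between the unescaped and escaped delimiter can be seen in a short sketch. The class name, the splitLiteral helper, and the shortened sample data are assumptions for illustration; Pattern.quote is a standard alternative to escaping by hand when the delimiter arrives as a parameter.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Hypothetical helper: treats the delimiter as plain text, not as a regex.
    static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String data = "13409$13400$00000$"; // shortened sample data
        // Unescaped "$" is the end-of-input anchor, so nothing is split off.
        System.out.println(data.split("$").length);                  // 1
        // Escaped "\\$" matches the literal dollar sign.
        System.out.println(Arrays.toString(data.split("\\$")));      // [13409, 13400, 00000]
        System.out.println(Arrays.toString(splitLiteral(data, "$"))); // [13409, 13400, 00000]
    }
}
```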
