indexOf() of StringBuilder doesn't return anything - java

StringBuilder builder = new StringBuilder();
builder.setLength(10);
builder.append("d");
System.out.println(builder.length() + "\t" + builder.toString() + "\t" + builder.indexOf("d"));
Output:
11
Problem:
Why doesn't indexOf() return anything?
My Understanding:
As per my understanding, it should return 10, since StringBuilder counts the "d" as part of its length in length().
But if "d" is not part of the string held by the StringBuilder, then it should return -1 and the length should be 10.

If you look at the docs of StringBuilder#setLength(int newLength)
If the newLength argument is greater than or equal to the current length, sufficient null characters ('\u0000') are appended so that length becomes the newLength argument.
That is why when you append "d" after setting the length, it is placed after the 10 null characters.
Q. Why indexOf() doesn't return anything.
It does return a value and that is 10, since the indexing is 0-based. This is the output of your code.
11 d 10 // the 10 represents the index of d
^-length ^-10 null chars followed by d
The reason you're not seeing the full output may be that your console does not handle null characters: when it encounters \u0000, it may simply stop printing. Try a console that can display such characters, for example the one in Eclipse.
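To check this without depending on the console, here is a minimal sketch that rebuilds the same StringBuilder and swaps the null characters for a visible placeholder before printing. The class name NullPaddingDemo, the helper names, and the '.' placeholder are only illustrative choices, not part of the original code.

```java
class NullPaddingDemo {
    // Builds the same content as in the question: ten null chars followed by "d".
    static StringBuilder build() {
        StringBuilder builder = new StringBuilder();
        builder.setLength(10);   // pads with ten '\u0000' characters
        builder.append("d");     // appended after the padding, at index 10
        return builder;
    }

    // Replaces each null character with '.' so the padding shows up on any console.
    static String visible(CharSequence text) {
        return text.toString().replace('\u0000', '.');
    }

    public static void main(String[] args) {
        StringBuilder builder = build();
        System.out.println(builder.length());      // 11
        System.out.println(builder.indexOf("d"));  // 10
        System.out.println(visible(builder));      // ..........d
    }
}
```

If the last line shows ten dots followed by d, the missing output in the original run was a display issue and not a problem with indexOf().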

Related

What will be the output of String.substring(String.length)?

public class Str {
public static void main(String[] args) {
String str = "abcde";
String s = str.substring(str.length());
System.out.println(s);
}
}
The index of character 'e' is 4, but I am passing the full length of the string, which is 5. If I execute the code above, why doesn't it throw an IndexOutOfBoundsException?
The JavaDoc for String.substring() states:
[throws] IndexOutOfBoundsException - if beginIndex is negative or larger than the length of this String object.
Since your beginIndex is equal to the length of the string it is a valid value and substring() returns an empty string.
The empty String ("" with length 0) is a valid String. So that is what is returned by your code.
In other words str.substring(str.length()-1); returns the string "e", and str.substring(str.length()); returns the empty string. Perfectly valid.
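The boundary cases above can be tried out with a small sketch like the following. The class name SubstringBoundsDemo and the helper throwsFor are made up for this illustration.

```java
class SubstringBoundsDemo {
    // Returns true if s.substring(beginIndex) throws an IndexOutOfBoundsException.
    static boolean throwsFor(String s, int beginIndex) {
        try {
            s.substring(beginIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String str = "abcde";
        System.out.println("[" + str.substring(str.length() - 1) + "]"); // [e]
        System.out.println("[" + str.substring(str.length()) + "]");     // []
        System.out.println(throwsFor(str, str.length()));                // false
        System.out.println(throwsFor(str, str.length() + 1));            // true
        System.out.println(throwsFor(str, -1));                          // true
    }
}
```

Only a beginIndex that is negative or strictly greater than the length throws; a beginIndex equal to the length is the last valid value.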
Assume you got a String:
Hello World
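this is what the indices look like (index 5 is the space):
H e l l o   W o r l d
0 1 2 3 4 5 6 7 8 9 10
"Hello World" has a length of 11, so str.length() would be equal to 11 in this case.
There is no character at index 11, since 10 is the last index. Even so, substring(11) is allowed: beginIndex may be equal to the length, and the result is the empty string. You only receive an IndexOutOfBoundsException when beginIndex is negative or greater than the length.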
Additionally str.substring(number) returns a substring STARTING from the specified number all the way to the end of the string.
so str.substring(4) in this case would return
o World
Just thought i should put that in here

indexOf method asking for a char that appears multiple times?

String str = "Aardvark";
str.indexOf('a');
I was wondering what index str would return if it asked for a certain character and the string contained multiple of it. For example, aardvark: would the method return index 0, for the first instance it saw the char? There are 3 'a' chars in the word, so which would it return?
One additional question (couldn't fit it in the original question)
What is the difference between
str.indexOf('a');
and
str.indexOf("a");
I know the first is a char and the second is a String, but if str = "Aardvark", wouldn't the second statement return -1 or some sort of error, because "a" refers to a single-character String, not one char of a string?
I'm very sorry if this was unclear, I couldn't really think of a better way to pose my question. Thanks in advance!
indexOf() will return the index of the first occurrence of the string/char
like you say, one looks for a char and the other on a sub string. "a" will be found, as "a" is a substring of "Aardvark"
It would return the first occurrence.
To get the second occurrence you would have to use
indexOf(int ch, int fromIndex);
indexOf can also take those two parameters instead of just the char; the search then starts at fromIndex.
Link to API Doc:
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#indexOf(java.lang.String,%20int)
Here is a simple example:
String text = "abcd_a";
System.out.println("Index of a: "+ text.indexOf('a')); // Index of a: 0
System.out.println("Index of a: "+ text.indexOf("a")); // Index of a: 0
System.out.println("Index of b: "+ text.indexOf('b')); // Index of b: 1
System.out.println("Index of c: "+ text.indexOf('c')); // Index of c: 2
System.out.println("Index of z: "+ text.indexOf('z')); // Index of z: -1
simple index of:
indexOf(char/string) will always return the first index of the occurrence.
from index:
There is also indexOf(char/string, int fromIndex) - which will search from a given position in your string.
last index:
There is a lastIndexOf(char/string) - which will search last occurrence.
Regarding char vs String, I would use the char version if I only need to look up a single character. The char lookup can be faster than the String lookup, although for most code the difference is unlikely to matter.
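Applied to the "Aardvark" string from the question, the methods above can be tried with a sketch like this one. Note that indexOf is case-sensitive, so the capital 'A' at index 0 does not match 'a'. The class name IndexOfDemo and the helper allIndexesOf are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    // Collects every index at which ch occurs in s by repeatedly calling indexOf(ch, fromIndex).
    static List<Integer> allIndexesOf(String s, char ch) {
        List<Integer> result = new ArrayList<>();
        int i = s.indexOf(ch);
        while (i != -1) {
            result.add(i);
            i = s.indexOf(ch, i + 1); // continue searching just after the last hit
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "Aardvark";
        System.out.println(str.indexOf('a'));        // 1 ('A' at index 0 is a different char)
        System.out.println(str.indexOf("a"));        // 1
        System.out.println(str.indexOf('a', 2));     // 5
        System.out.println(str.lastIndexOf('a'));    // 5
        System.out.println(allIndexesOf(str, 'a'));  // [1, 5]
    }
}
```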
Java String Spec

Strange behavior of Java String split() method

I have a method which takes a string parameter, splits the string by #, and then prints the length of the resulting array along with the array elements. Below is my code
public void StringSplitTesting(String inputString) {
String tokenArray[] = inputString.split("#");
System.out.println("tokenArray length is " + tokenArray.length
+ " and array elements are " + Arrays.toString(tokenArray));
}
Case I : Now when my input is abc# the output is tokenArray length is 1 and array elements are [abc]
Case II : But when my input is #abc the output is tokenArray length is 2 and array elements are [, abc]
But I was expecting the same output for both cases. What is the reason behind this implementation? Why does the split() method behave like this? Could someone give me a proper explanation?
One aspect of the behavior of the one-argument split method can be surprising: trailing empty strings are discarded from the returned array. As the documentation puts it:
Trailing empty strings are therefore not included in the resulting array.
To get a length of 2 for each case, you can pass in a negative second argument to the two-argument split method, which means that the length is unrestricted and no trailing empty strings are discarded.
Just take a look in the documentation:
Trailing empty strings are therefore not included in the resulting array.
So in case 1, the output would be {"abc", ""} but Java cuts the trailing empty String.
If you don't want the trailing empty String to be discarded, you have to use split("#", -1).
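Here is a small sketch comparing both inputs with and without the negative limit. The class name SplitDemo and the helper show are just names picked for this example.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits input on "#" with the given limit and renders the resulting array as text.
    static String show(String input, int limit) {
        return Arrays.toString(input.split("#", limit));
    }

    public static void main(String[] args) {
        // limit 0 is what the one-argument split("#") uses: trailing empty strings are dropped
        System.out.println(show("abc#", 0));   // [abc]
        System.out.println(show("#abc", 0));   // [, abc]
        // a negative limit keeps trailing empty strings
        System.out.println(show("abc#", -1));  // [abc, ]
        System.out.println(show("#abc", -1));  // [, abc]
    }
}
```

With the negative limit both inputs produce an array of length 2, which is the symmetric result the question expected.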
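The observed behavior is asymmetric because split() treats leading and trailing empty strings differently: the loop collects every piece with substring(), including empty ones, and afterwards, when no limit is given, empty strings are removed only from the end of the list.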
This is the core of the implementation of split():
while ((next = indexOf(ch, off)) != -1) {
if (!limited || list.size() < limit - 1) {
list.add(substring(off, next));
off = next + 1;
} else { // last one
//assert (list.size() == limit - 1);
list.add(substring(off, value.length));
off = value.length;
break;
}
}
The key to understanding the behavior of the above code is to understand the behavior of the substring() method:
From the Javadocs:
String java.lang.String.substring(int beginIndex, int endIndex)
Returns a new string that is a substring of this string. The substring
begins at the specified beginIndex and extends to the character at index
endIndex - 1. Thus the length of the substring is endIndex-beginIndex.
Examples:
"hamburger".substring(4, 8) returns "urge" (not "urger")
"smiles".substring(1, 5) returns "mile" (not "miles")
Hope this helps.

length of a String with surrogate characters in it - java

I am having trouble counting the length of my String, which has some surrogate characters in it.
My String is,
String val1 = "\u5B66\uD8F0\uDE30";
The problem is that \uD8F0\uDE30 is one character, not two, so the length of the String should be 2.
But when I calculate the length of my String with val1.length(), it gives 3 as output, which is not what I want. How can I fix the problem and get the actual length of the String?
You can use codePointCount(beginIndex, endIndex) to count the number of code points in your String instead of using length().
val1.codePointCount(0, val1.length())
See the following example,
String val1 = "\u5B66\uD8F0\uDE30";
System.out.println("character count: " + val1.length());
System.out.println("code points: "+ val1.codePointCount(0, val1.length()));
output
character count: 3
code points: 2
FYI, charAt() returns a single UTF-16 code unit, so it cannot give you a supplementary character as a whole; you only get one half of the surrogate pair.
To print each code point of a String individually, use codePointAt and offsetByCodePoints(index, codePointOffset), like this,
for (int i = 0; i < val1.codePointCount(0, val1.length()); i++) {
    System.out.println("character at " + i + ": " + val1.codePointAt(val1.offsetByCodePoints(0, i)));
}
gives,
character at 0: 23398
character at 1: 311856
for Java 8
You can use val1.codePoints(), which returns an IntStream of all code points in the sequence.
Since you are interested in length of your String, use,
val1.codePoints().count();
to print code points,
val1.codePoints().forEach(a -> System.out.println(a));
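If streams are not an option, a plain loop that advances by Character.charCount(...) visits each code point once. This is only a sketch of that idiom; the class name CodePointDemo and the method countCodePoints are made up here, and the result should match codePointCount(0, length()).

```java
class CodePointDemo {
    // Counts code points by stepping over one or two chars per code point.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            i += Character.charCount(codePoint); // 2 for a surrogate pair, otherwise 1
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String val1 = "\u5B66\uD8F0\uDE30";
        System.out.println(val1.length());                           // 3
        System.out.println(countCodePoints(val1));                   // 2
        System.out.println(val1.codePointCount(0, val1.length()));   // 2
    }
}
```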

getChars() using StringBuffer

I am new to Java.
I executed the below program successfully but I don't understand the output.
This is the program.
public class StringBufferCharAt {
public static void main(String[] args)
{
StringBuffer sb = new StringBuffer("abcdefghijklmnopqrstuvwxyz");
System.out.println("Length of sb : " + sb.length());
int start = 0;
int end = 10;
char arr[] = new char[end - start];
sb.getChars(start, end, arr, 0);
System.out.println("After altering : "+ arr.toString());
}
}
After executing this program: I got the following output:
Length of sb : 26
After altering : [C@21a722ef
My Questions:
1. Why did it print 11 characters in the output instead of 10?
2. Why did I get some other characters instead of the original characters "abcdefghij" which are inside sb?
The arr.toString() in your last statement gives you the default String value of your Object, which here is an array. What you were probably trying to achieve was something like Arrays.toString(arr), which will print the content of your array.
Why is the output 11 characters, and what do those characters mean?
It just so happens that the string representation of the char[], as specified by Object#toString(), is 11 characters: the [C indicates that it's a char[], the @ is a separator, and the following 8 hex digits are the object's hash code. As the JavaDoc states,
This method returns a string equal to the value of:
getClass().getName() + '@' + Integer.toHexString(hashCode())
Class#getName() returns "[C" for a char array, and the default hashCode() implementation is, according to the JavaDoc, typically derived from the object's address in memory:
This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.
How should this be solved?
If you want to print the contents of an array, use Arrays.toString():
// Instead of this:
System.out.println("After altering : "+ arr.toString());
// Use this:
System.out.println("After altering : "+ Arrays.toString(arr));
You need to have final print statement like this:
System.out.println("After altering : "+ new String(arr));
OR
System.out.println("After altering : "+ java.util.Arrays.toString(arr));
OUTPUT
For 1st case: abcdefghij
For 2nd case: [a, b, c, d, e, f, g, h, i, j]
Note: arr.toString() doesn't print the content of the array, which is why you need to construct a new String object from the char array, as in my answer above, or call Arrays.toString(arr).
Arrays in Java do not have any built-in 'human readable' toString() implementation. What you see is just the default output, built from the array's type name and its hash code.
The easiest way to turn a char[] into something printable is to just build a string out of it.
System.out.println("After altering : " + String.valueOf(arr));
The expression arr.toString() does not convert a char[] to a String using the contents of the char[]; it uses the default Object.toString() method, which is to print a representation of the object (in this case an array object). What you want is either to convert the characters to a String using new String(arr) or else an array representation of the characters using Arrays.toString(arr).
On question #1: the start and end variables are indexes, but getChars(start, end, ...) treats end as exclusive, so indexes 0 through 9 are copied, which is exactly 10 characters. The 11 characters you saw are the length of the default toString() text, not the number of characters copied.
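Putting the answers together, here is a minimal sketch showing that getChars copies exactly 10 characters and how the array can be printed. The class name GetCharsDemo and the helper copyRange are hypothetical names used only for this example.

```java
import java.util.Arrays;

class GetCharsDemo {
    // Copies the characters at indexes start (inclusive) to end (exclusive) into a new array.
    static char[] copyRange(StringBuffer sb, int start, int end) {
        char[] arr = new char[end - start];
        sb.getChars(start, end, arr, 0);
        return arr;
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("abcdefghijklmnopqrstuvwxyz");
        char[] arr = copyRange(sb, 0, 10);
        System.out.println(arr.length);             // 10
        System.out.println(new String(arr));        // abcdefghij
        System.out.println(String.valueOf(arr));    // abcdefghij
        System.out.println(Arrays.toString(arr));   // [a, b, c, d, e, f, g, h, i, j]
    }
}
```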
