Source code of a String Class in Java

I can't figure out what these lines do:
this.value = "".value;
and
int len2 = anotherString.value.length;
There is an array called "value" at the beginning of the class. It's something like a String array (an array of chars).
How does it work?

The value field holds the characters of a String instance, so this.value = "".value; makes the no-argument constructor reuse the value array of the (usually interned) empty string constant. Likewise, anotherString.value.length reads the length of another String's character array.
This way, all empty strings created with the no-argument constructor share the same array instance. Strings are immutable, so they can, and usually do, share the underlying char array to save memory.
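The sharing described above can be sketched with a stripped-down class. TinyString and all of its members are hypothetical names used for illustration, not the JDK source; the sketch only assumes what the answer states, namely that the no-argument constructor copies the array reference from a shared empty instance.

```java
// Hypothetical, simplified stand-in for java.lang.String (not the JDK source).
final class TinyString {
    // One shared empty instance, playing the role of the "" constant.
    private static final TinyString EMPTY = new TinyString(new char[0]);

    // Holds the characters; never modified after construction.
    private final char[] value;

    private TinyString(char[] value) {
        this.value = value;
    }

    // Mirrors `this.value = "".value;`: reuse the empty instance's array.
    TinyString() {
        this.value = EMPTY.value;
    }

    // Mirrors `anotherString.value.length`: reads another instance's array length.
    static int lengthOf(TinyString anotherString) {
        return anotherString.value.length;
    }

    // True when both instances point at the very same array object.
    boolean sharesArrayWith(TinyString other) {
        return this.value == other.value;
    }

    public static void main(String[] args) {
        TinyString a = new TinyString();
        TinyString b = new TinyString();
        System.out.println(a.sharesArrayWith(b)); // true
        System.out.println(lengthOf(a));          // 0
    }
}
```

Because the array is never written to after construction, handing the same reference to many instances is safe.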

Related

Does string pool store literals or objects?

Stack Overflow is full of questions about the different kinds of String initialization. I understand how String s = "word" differs from String s = new String("word"), so there is no need to touch that topic.
I noticed that different people say the String pool stores constants, objects, or literals.
Constants are understandable: they are final, so they always 'stay' there. Also, duplicates aren't stored in the SCP (string constant pool).
But I can't understand whether the SCP stores objects or literals. They are totally different concepts. An object is an entity, while a literal is just a value. So what is the correct answer? Does the SCP store objects or literals? I know it can't be both :)

A literal is a chunk of source code; a string literal is one delimited by double quotes ("). For example, in the following line of source code:
String s = "Hello World";
"Hello World" is a string literal.
Objects are a useful abstraction for meaningful bits of memory holding data that, when grouped together, represent something, whether it be a Car, Person, or String.
The string pool stores String objects rather than String literals, simply because the string pool does not store source code.
You might hear people say "the string pool stores string literals". They (probably) don't mean that the string pool somehow has the source code "Hello World" in it. They (probably) mean that all the Strings represented by string literals in your source code will get put into the string pool. In fact, the Strings produced by constant expressions in your source code also get added to the string pool automatically.
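A small sketch of which String objects end up in the pool, using reference comparison (==) only to reveal object identity. The class and method names are made up for this example.

```java
class StringPoolDemo {
    // A constant variable: static final and initialised with a literal.
    static final String GREETING = "Hello ";

    // Two occurrences of the same literal refer to one pooled object.
    static boolean sameLiteralTwice() {
        String a = "Hello World";
        String b = "Hello World";
        return a == b;
    }

    // A constant expression is evaluated at compile time and pooled as well.
    static boolean constantExpression() {
        return (GREETING + "World") == "Hello World";
    }

    // y is not final, so this concatenation happens at run time
    // and produces a new object outside the pool.
    static boolean runtimeConcatenation() {
        String y = "World";
        return ("Hello " + y) == "Hello World";
    }

    // new String(...) always creates a distinct object.
    static boolean explicitNew() {
        return new String("Hello World") == "Hello World";
    }

    // intern() returns the pooled object with equal contents.
    static boolean afterIntern() {
        String y = "World";
        return ("Hello " + y).intern() == "Hello World";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralTwice());     // true
        System.out.println(constantExpression());   // true
        System.out.println(runtimeConcatenation()); // false
        System.out.println(explicitNew());          // false
        System.out.println(afterIntern());          // true
    }
}
```

In ordinary code, compare string contents with equals(); == is used here purely to show which references point at the same object.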

Strictly speaking, a "literal" is not a value; it is a syntactic form. A String literal in Java is a double quote followed by some non-double-quote (or escaped double quote) characters, ending in another double quote. A "literal value" is a value that is created from a source-code literal, as opposed to an evaluated value such as a.concat(b). The core difference is that the literal value can be identified at compilation time, while an evaluated value can only be known during execution. This allows the compiler to store the literal values inside the compiled bytecode. (Since constants initialised by literal values are also known by the compiler at compile time, evaluations that only use constants can also be computed at compile time.)
In colloquial speech one can refer to a literal value as a "literal", but that may be the source of your confusion - a value is a value, whether its origin is a literal, or an evaluation.
"I know it can't be both"
The distinction between a literal value and an evaluated value is separate from a distinction between an object value and a primitive value. "foo" is a literal String value (and since Strings are objects, it is also an object). 3 is a literal primitive (integer) value. If x is currently 7, then 18 - x evaluates to a non-literal primitive value of 11. If y is currently "world!", then "Hello, " + y evaluates to a non-literal, non-primitive value "Hello, world!".

Nice question. The answer can be found by looking at how String::intern() is implemented. From the Javadoc:
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
So the String pool stores String objects.
We can open the source code to confirm the answer. String::intern() is a native method; in HotSpot (JDK 8) it is implemented by StringTable::intern(), defined in symbolTable.cpp:
oop StringTable::intern(Handle string_or_null, jchar* name,
                        int len, TRAPS) {
  unsigned int hashValue = hash_string(name, len);
  int index = the_table()->hash_to_index(hashValue);
  oop found_string = the_table()->lookup(index, name, len, hashValue);
  // Found
  if (found_string != NULL) {
    ensure_string_alive(found_string);
    return found_string;
  }
  ... ...
  Handle string;
  // try to reuse the string if possible
  if (!string_or_null.is_null()) {
    string = string_or_null;
  } else {
    string = java_lang_String::create_from_unicode(name, len, CHECK_NULL);
  }
  ... ...
  // Grab the StringTable_lock before getting the_table() because it could
  // change at safepoint.
  oop added_or_found;
  {
    MutexLocker ml(StringTable_lock, THREAD);
    // Otherwise, add to symbol to table
    added_or_found = the_table()->basic_add(index, string, name, len,
                                            hashValue, CHECK_NULL);
  }
  ensure_string_alive(added_or_found);
  return added_or_found;
}
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/f3108e56b502/src/share/vm/classfile/symbolTable.cpp
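The lookup-then-add logic of the native code above can be mirrored in plain Java. This is only a sketch: InternSketch is a hypothetical class backed by a HashMap, whereas the real pool is HotSpot's native StringTable, and a synchronized method stands in for the StringTable_lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of an intern pool; not how HotSpot stores its table.
class InternSketch {
    // Maps each pooled string to itself so the stored object can be returned.
    private final Map<String, String> table = new HashMap<>();

    synchronized String intern(String candidate) {
        // Lookup by content (hashCode/equals), like the_table()->lookup(...).
        String found = table.get(candidate);
        if (found != null) {
            return found;              // an equal string is already pooled
        }
        table.put(candidate, candidate); // otherwise pool this very object
        return candidate;
    }

    public static void main(String[] args) {
        InternSketch pool = new InternSketch();
        String first = new String("word");
        String second = new String("word");   // equal content, different object

        System.out.println(pool.intern(first) == first);  // true
        System.out.println(pool.intern(second) == first); // true
    }
}
```

Either way, what the table holds is an object reference, which matches the Javadoc wording that the String object itself is added to the pool.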

Does java keep the String in a form of char array?

I couldn't find it on the api or by searching the web..
I know the JVM keeps every String object it has in a string pool in order to optimize memory usage, but I can't figure out how it stores them under the hood. Since a String is an immutable object, will toCharArray() give me a copy of the internal array that is stored in the String object in the pool? (If so, every operation that gets the string's contents as a char array is O(n).) Also, does charAt(i) use the internal array of the string, or does it copy it to a new array and return the char at position i of the copy?

Until Java 8, Strings were internally represented as an array of characters – char[], encoded in UTF-16, so that every character uses two bytes of memory.
When we create a String via the new operator, the JVM creates a new object and stores it on the heap. For example:
String str1= new String("Hello");
When we assign a string literal to a String variable, the JVM searches the pool for a String of equal value. If one is found, a reference to it is returned without allocating additional memory. If not, the string is added to the pool and its reference is returned.
String str2= "Hello";
toCharArray() creates a new char[] and copies the string's characters into it, so it is O(n).
charAt(int index) reads the character at the specified index directly from the internal array; nothing is copied.
Java 9 introduced a new representation called Compact Strings. A String now stores its characters in a byte[] and chooses between the Latin-1 and UTF-16 encodings depending on the content. Since UTF-16 is used only when necessary, heap usage is significantly lower for strings that fit in Latin-1, which in turn reduces garbage collector overhead.
Source:
http://www.baeldung.com/java-string-pool
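The copying behaviour of toCharArray() described in this answer can be observed from the outside without looking at the internal field. The class and method names below are made up for this example.

```java
class CharArrayCopyDemo {
    // Modifies the array returned by toCharArray() and builds a new String from it.
    static String mutateCopy(String s) {
        char[] copy = s.toCharArray(); // fresh array, O(n) copy
        copy[0] = 'J';                 // changes only the copy
        return new String(copy);
    }

    // Each call hands out a different array object.
    static boolean returnsDistinctArrays(String s) {
        return s.toCharArray() != s.toCharArray();
    }

    public static void main(String[] args) {
        String s = "Hello";
        System.out.println(mutateCopy(s));            // Jello
        System.out.println(s);                        // Hello
        System.out.println(returnsDistinctArrays(s)); // true
        System.out.println(s.charAt(1));              // e
    }
}
```

If you only need individual characters, calling charAt(i) in a loop avoids the extra array allocation.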

String equality comparison failure in Java

I'm taking one string value from an object in an object list.
transitionName = transitionList.get(m).getTransitionName().toString();
And another string value from an object retrieved by an EJB query.
changeItem = changeItemFacade.getChangeItem(changeGroupList.get(1));
char tempNewString[] = changeItem.getNewstring();
newString=new String(tempNewString);
The char[] to String conversion is needed because the Oracle table that contains the changeItem defines the column NewString as a CLOB,
and the EJB entity declares the variable 'NewString' as a char[] array.
So I have to convert it to a String before doing the comparison.
The problem is that the condition in this if statement always evaluates to false, so the body never executes.
if(transitionName.equalsIgnoreCase(newString)){}
When I log the values (Logger.Debug), the server instance log shows two apparently equal string values.
Is there something wrong with the way I convert the char[] array?
I tried changing the type of the entity class variable to String (and of course the getter and setter methods too), but that doesn't work either.

Try trim() on strings before comparing. Also, look for character encoding differences. – Singularity
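One way to act on this suggestion is to print the code points of both strings, since trailing spaces or NUL padding are invisible in a log. The padded array below is an assumed example value, not data from the question, and the class and method names are made up.

```java
class CompareDebug {
    // Renders every char as U+XXXX so invisible characters show up.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // trim() strips leading and trailing chars up to U+0020, including NUL.
    static boolean equalsTrimmedIgnoreCase(String a, String b) {
        return a.trim().equalsIgnoreCase(b.trim());
    }

    public static void main(String[] args) {
        // Assumed example: a value padded with a space and a NUL character.
        char[] tempNewString = {'D', 'o', 'n', 'e', ' ', '\0'};
        String newString = new String(tempNewString);
        String transitionName = "done";

        System.out.println(transitionName.equalsIgnoreCase(newString));
        // false
        System.out.println(codePoints(newString));
        // U+0044 U+006F U+006E U+0065 U+0020 U+0000
        System.out.println(equalsTrimmedIgnoreCase(transitionName, newString));
        // true
    }
}
```

If the dump shows unexpected code points inside the text rather than at the ends, the cause is more likely an encoding difference than padding.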

Convert code with pointers in C to Java code

I am having some difficulty in understanding how to write the below piece of code using String or char[] in Java.
void xyz(char *a, int startIndex, int endIndex)
{
    int j;
    for (j = startIndex; j <= endIndex; j++)
    {
        doThis((a+startIndex), (a+j));
        xyz(a, startIndex+1, endIndex);
    }
}
Here char *a points to the starting location of the char name[]
The above are just some random functions, but I just want the logic of how to use char* and character index char[] in Java

Based on the rephrased question from the comment thread:
You cannot change the characters of a Java String. If you need to modify a sequence of characters, use StringBuilder, which supports setCharAt(int, char), insert(int, char), and append(char). You can use new StringBuilder(myString) to convert a String to a StringBuilder, and stringBuilder.toString() to convert back.
This is perfectly legit Java code -- it's not code smelly, it's just the way you work with mutable character sequences.

A char* in C is, as you noted, pointing to the start of your character array (which is how C manages Strings).
In C the size of a char is one byte, and pointers always point to the start of a byte. Your C String is an array of characters, so adding 1 to a pointer moves the start of your string right by one character.
That means that the C code:
char *a;
// Set the String here
a = a + 1;
translates in Java to something like:
String a;
// Set the String here
a = a.substring(1);
or if you are using a char array:
char[] a;
// Set the array contents here
// The copy is one element shorter because the first character is dropped
char[] copyTo = new char[a.length - 1];
System.arraycopy(a, 1, copyTo, 0, a.length - 1);
a = copyTo;
Java is a bit more careful about protecting you than C, though. For instance, if you have a zero-length string, the C code has the potential to either segfault (crashing the application) or give you a gibberish string full of memory junk (then, eventually, crash the application), whereas the Java code will throw an exception (normally an IndexOutOfBoundsException) which you can, hopefully, handle cleanly.
Remember, though, that Strings in Java are immutable. You cannot change them; you can only create new Strings. Fortunately, String has several built-in methods that cover a lot of the standard actions, such as replacing part of the String with another and returning the result. A character array is mutable, and you can change the characters within it, but you will lose a lot of the nice benefits you get from using the proper String class.
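A runnable sketch of the two "advance by one character" translations, plus the exception raised for an empty string. The class and method names are made up, and Arrays.copyOfRange is used here as a shorter alternative to System.arraycopy.

```java
import java.util.Arrays;

class AdvanceByOne {
    // String version of `a = a + 1`: a new String without the first character.
    static String advance(String a) {
        return a.substring(1);
    }

    // char[] version: a new, shorter array holding elements 1..length-1.
    static char[] advance(char[] a) {
        return Arrays.copyOfRange(a, 1, a.length);
    }

    public static void main(String[] args) {
        System.out.println(advance("name"));                           // ame
        System.out.println(new String(advance("name".toCharArray()))); // ame

        try {
            advance(""); // nothing to skip in an empty string
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("empty string rejected");
        }
    }
}
```

Unlike the C pointer, both results are copies, so later changes to the original array are not visible through them.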

Simple Answer:
You can't do exactly that. Java is pass by value only (object references are themselves passed by value). You don't have access to memory location information, so you can't do arithmetic with it.
Longer Answer:
It looks like you are passing in a string for manipulation. You have several options to simulate that.
You can convert the string to an array of characters and then pass in a char[]. If your manipulations are completely custom rather than any sort of standard string operation, this is probably what you need to do. Keep in mind that you can't change the size of the array passed in, nor can you make the caller's variable point at a new array after the function completes (again, pass by value only). Only the values of the existing elements of the array can be modified.
You can pass in the String and use the String methods, such as substring() (which your begin and end indexes seem to suggest), but this may not meet your needs. Note that strings are immutable, however, so you can only get a result out via the return statement.
If you really need to modify the contents of the object passed in, you can pass a StringBuilder, StringBuffer or CharBuffer object and modify away.
There's a hack that can also be used to circumvent pass by value, but it's poor style except in special situations. Pass in an array of whatever you need to modify; in this case an array of char arrays would allow you to set a new sub-array and effectively achieve pass by reference, but try not to do this :)
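The options above can be compared side by side in a short sketch. The class and method names are made up for this example.

```java
class PassByValueDemo {
    // Reassigning the parameter changes only the method's local copy of the reference.
    static void reassign(char[] a) {
        a = new char[] {'x'};
    }

    // Writing to an element changes the array the caller also sees.
    static void modify(char[] a) {
        a[0] = 'X';
    }

    // The "array of arrays" hack: the caller can observe the swapped-in sub-array.
    static void replaceViaHolder(char[][] holder) {
        holder[0] = new char[] {'n', 'e', 'w'};
    }

    public static void main(String[] args) {
        char[] a = "abc".toCharArray();

        reassign(a);
        System.out.println(new String(a)); // abc

        modify(a);
        System.out.println(new String(a)); // Xbc

        char[][] holder = {a};
        replaceViaHolder(holder);
        System.out.println(new String(holder[0])); // new
    }
}
```

Returning the new array from the method is usually a cleaner way to get the same effect as the holder hack.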

If your method modifies the values, you can't use String, as it is immutable; you can use StringBuilder instead.
If your methods already rely on char arrays and you need the offsets you can use a CharBuffer to wrap an array. It does not support String operations but supports views for sub ranges, which seems to be what you use in the doThis() method.
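A possible translation of the C function keeps the array and passes indices instead of pointers. The question does not say what doThis does, so the swap below is purely an assumed placeholder; viewFrom is a hypothetical helper showing how a CharBuffer can stand in for the pointer expression a + offset.

```java
import java.nio.CharBuffer;

class XyzTranslation {
    // Assumed placeholder for doThis(a + startIndex, a + j): swaps two characters.
    static void doThis(char[] a, int first, int second) {
        char tmp = a[first];
        a[first] = a[second];
        a[second] = tmp;
    }

    // Same control flow as the C function; indices replace pointer arithmetic.
    static void xyz(char[] a, int startIndex, int endIndex) {
        for (int j = startIndex; j <= endIndex; j++) {
            doThis(a, startIndex, j);
            xyz(a, startIndex + 1, endIndex);
        }
    }

    // Rough equivalent of (a + offset): a view onto the same array, not a copy.
    static CharBuffer viewFrom(char[] a, int offset) {
        return CharBuffer.wrap(a, offset, a.length - offset).slice();
    }

    public static void main(String[] args) {
        char[] name = "abc".toCharArray();
        xyz(name, 0, name.length - 1);
        System.out.println(new String(name)); // some rearrangement of a, b and c

        CharBuffer view = viewFrom(name, 1);
        view.put(0, 'Z');            // writes through to name[1]
        System.out.println(name[1]); // Z
    }
}
```

The recursion inside the loop makes the number of doThis calls grow very quickly with the range length, exactly as in the C original, so try it on short inputs first.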

String can't change. But int, char can change

I've read that in Java an object of type String can't change. But int and char variables can. Why is it? Can you give me an example?
Thank you.
(I am a newer -_- )

As bzabhi said, strings are immutable in Java. This means that a string object will never change. This does not mean you cannot change string variables, just that you cannot change the underlying memory representation of the string. For example:
String str = "Hello";
str += " World!";
Following the execution of these lines, str will point to a new string in memory. The original "Hello" string still exists in memory, but most likely it will not be there for long. Assuming that there are no extenuating circumstances, nothing will be pointing at the original string, so it will be garbage collected.
I guess the best way to put this would be to say that when line 2 of the example executes, a new string is created in memory from the concatenation of the original string and the string being added to it. The str variable, which is just a reference to a memory location, is then changed to point at the new string that was just created.
I am not particularly knowledgeable on the point, but, as I understand it, this is what happens with all "non-primitive" values. Anything that at some point derives from Object follows these rules. Primitive values, such as ints, booleans, chars, floats and doubles, allow the actual value in memory to be changed. So, from this:
int num = 5;
num += 2;
the actual value in memory changes. Rather than creating a new object and changing the reference, this code sample will simply change the value in memory for the num variable.
As for why this is true, it is simply a design decision by the makers of Java. I'm sure someone will comment on why this was made, but that isn't something I know.
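The reassignment described for str can be made visible by keeping a second reference to the original object. The class and method names are made up for this example.

```java
class ReassignDemo {
    // Returns {original, updated} after appending to a local variable.
    static String[] appendWorld(String str) {
        String original = str;  // second reference to the same object
        str += " World!";       // str now refers to a newly created String
        return new String[] {original, str};
    }

    public static void main(String[] args) {
        String[] result = appendWorld("Hello");
        System.out.println(result[0]);              // Hello
        System.out.println(result[1]);              // Hello World!
        System.out.println(result[0] == result[1]); // false

        int num = 5;
        int copy = num; // copies the value 5
        num += 2;       // only the variable num now holds 7
        System.out.println(num);  // 7
        System.out.println(copy); // 5
    }
}
```

The "Hello" object is untouched throughout; only the variable str was pointed at something else.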

int and char can't change either. As with strings, you can put a different value into the same variable, but an integer itself doesn't change. 3 will always be 3; you can't modify it to be 4.

String is an immutable type (the value inside of it cannot change). The same is true for all primitive types (boolean, byte, char, short, int, long, float, and double).
int x;
String s;
x = 1;
x = 2;
s = "hello";
s = "world";
x++; // x = x + 1;
x--; // x = x - 1;
As you can see, in no case can you alter the constant value (1, 2, "hello", "world") but you can alter where they are pointing (if you warp your mind a bit and say that an int variable points at a constant int value).

I'm not sure that it is possible to show (by example) that Strings cannot change. But you can confirm this by reading the description section of Javadoc for the String class, then reading the methods section and noting that there are no methods that can change a String.
EDIT: There are many reasons why Strings are designed to be immutable in Java. The most important reason is that immutable Strings are easier to use correctly than mutable ones. And if you do need the mutable equivalent of a String for some reason, you can use the StringBuilder (or StringBuffer) class.
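A small sketch of what "no methods that can change a String" looks like in practice: methods that sound like mutators return a new String and leave the receiver as it was. The class and method names are made up for this example.

```java
class ImmutableMethodsDemo {
    // Calls several "modifying" methods, discards the results, returns the receiver.
    static String tryToChange(String s) {
        s.toUpperCase();
        s.concat(" world");
        s.replace('l', 'L');
        return s;
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(tryToChange(s)); // hello
        System.out.println(s.toUpperCase()); // HELLO
        System.out.println(s);               // hello

        StringBuilder sb = new StringBuilder("hello");
        sb.setCharAt(0, 'j');   // StringBuilder really is modified in place
        System.out.println(sb); // jello
    }
}
```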

It's also worth noting that, since strings are immutable, a method that receives a string can't modify it in a way that is visible outside the method.
public void changeIt(String s) {
    // I can't do anything to s here that changes the value of the
    // original string object passed into this method
}
public void changeIt(SomeObject o) {
    // if SomeObject is mutable, I can do things to it that will
    // be visible outside of this method call
}
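A runnable version of the two methods above, with StringBuilder standing in for the mutable SomeObject (an assumption made for this sketch; the class name is also made up).

```java
class ChangeItDemo {
    // Rebinding s affects only the local variable; the caller's String is untouched.
    static void changeIt(String s) {
        s = s + " changed";
    }

    // StringBuilder is mutable, so the caller sees the appended text.
    static void changeIt(StringBuilder sb) {
        sb.append(" changed");
    }

    public static void main(String[] args) {
        String text = "value";
        changeIt(text);
        System.out.println(text); // value

        StringBuilder builder = new StringBuilder("value");
        changeIt(builder);
        System.out.println(builder); // value changed
    }
}
```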

This little article can probably explain it better than I can: http://www.jchq.net/tutorial/09_02Tut.htm

Strings are immutable in java. Nevertheless, you can still append or prepend values to strings. By values, I mean primitive data types or other strings.
However, a StringBuffer is mutable, i.e. it can be changed in memory (a new memory block doesn't have to be allocated), which makes it quite efficient. Also, consider the following example:
StringBuffer mystringbuffer = new StringBuffer(5000);
for (int i = 0; i < 1000; i++)
{
    // Chained appends avoid building a temporary String on each iteration;
    // text needs double quotes, single quotes are only for a single char
    mystringbuffer.append("Number ").append(i).append('\n');
}
System.out.print(mystringbuffer);
Rather than creating one thousand strings, we append to a single object (mystringbuffer), which can expand in length. We can also set a recommended starting capacity (in this case, 5000 chars), which means that the buffer doesn't have to request more memory until that capacity is exceeded.
While a StringBuffer won't improve efficiency in every situation, if your application uses strings that grow in length, it would be efficient. Code can also be clearer with StringBuffers, because the append method saves you from having to use long assignment statements.
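The same loop can be written with StringBuilder, the unsynchronized counterpart of StringBuffer, which is the usual choice when only one thread touches the buffer. This is a sketch; the class and method names are made up.

```java
class BuilderLoopDemo {
    // Builds "Number 0\nNumber 1\n..." for the given number of lines.
    static String build(int count) {
        StringBuilder sb = new StringBuilder(5000); // initial capacity in chars
        for (int i = 0; i < count; i++) {
            sb.append("Number ").append(i).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = build(1000);
        // 1000 lines of "Number " (7 chars) + newline, plus 2890 digits in total.
        System.out.println(text.length()); // 10890
    }
}
```

The result is longer than the initial capacity of 5000, so the builder grows on its own; the capacity argument is only a starting hint.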
