Time of Strings Creation in Java

Time of Strings Creation in Java - java

I am writing an app for J2ME devices and pretty much care about unnecessary String creation.
As working with Strings is built-in, i.e. it is not necessary to create them explicitly, I am not sure if I understand it right.
For instance returning a String (just by using the double quotes) creates the string when it is being returned, i.e. if I have several return statements returning different Strings, only one of the would be created. Is that right?
Also when using Strings for printing messages with Exceptions, these Strings never get created, if the Exception doesn't get thrown, right?
Sorry to bother you with such a newbie question.

I'm not at all sure about the answers you've received so far. If you're just returning a string literal, e.g.
return "foo";
then those values are embedded into the class file. The JVM makes sure that only one instance of that string is ever created (from the literal) - but I don't think there's any guarantee that it won't create strings for all the constants in a class when the class itself is loaded.
Then again, because the string will only be created once (per string constant) it's unlikely to be an issue.
Now, if your code is actually more like this:
return "foo" + new Date();
then that is dynamically creating a string - but it will only be created if the return statement is actually hit.

The answer is a bit more complicated than some of the other answers suggest.
The String value that corresponds to a String literal in your source code gets created once, when the class is loaded. Furthermore, these Strings are automatically "interned", which means that if the same literal appears at more than one place in any class, only one copy of the String will be kept. (Any other copies, if they were created will get garbage collected.)
When the application calls new String(...), when it evaluates a String concatenation expression (i.e. the + operator), or when it calls one of the many library methods that create Strings under the hood, a new String will be created.
So to answer your questions:
1 - The following will not actually create a String at all. Rather it will return a String that was created when the class was loaded:
return "some string";
2 - The following will create two Strings. First it will call someObject.toString(), then it will concatenate that with the literal to give another String.
return "The answer is :" + someObject;
3 - The following will not create a String; see 1.
throw new Exception("Some message");
4 - The following will create two Strings when it is executed; see 2.
throw new Exception(someObject + "is wrong");
(Actually, 3 and 4 gloss over the fact that creating an exception causes details of the current thread's call stack to be captured, and those details include a method name, class name and file name for each frame. It is not specified when the Strings that represent these things actually get created, or whether they are interned. So it is possible that the act of creating an Exception object triggers creation of a number of Strings.)

Right
Right

Yes. If an expression that creates an instance either by invoking a constructor or if (for Strings) evaluating a literal is not executed, then nothing is created.
And furtheron, compilers do some optimization. String objects based on String literals will be created only once and interned afterwards. like here:
public String getHello() {
return "Hello";
}
public void test() {
String s = getHello(); // String object created
String t = getHello(); // no new String object created
}
The JVMS has a different 'opinion':
A new class instance may be implicitly created in the following situations:
Loading of a class or interface that contains a String literal may create a new String object (§2.4.8) to represent that literal. This may not occur if the a String object has already been created to represent a previous occurrence of that literal, or if the String.intern method has been invoked on a String object representing the same string as the literal.
So the above example is incorrect (I'm learning something new every day). The VM checks during classloading if "Hello" has to be created or not (and will create a String instance if not existing yet).
So now it is my understanding, that the VM creates a String instance for each unique String literal (on byte code level!), regardless whether it is used or not.
(on byte code level because compilers may optimize concatenations of String literals to one literal, like: the expression "one"+"two" will be compiled to "onetwo")

Strings (and other objects) only get created when you call them.
So to answer your questions:
Yes, only the String that gets returned will get created
Yes, only when the exception is thrown, this String will get created

The main concept here is that the heap allocation(what you say creation) for any String or in more general terms for any object doesn't take place until your logic-flow makes the interpreter executes that line of compiled code.

Related

Why String Deduplication when we have String Pool

String De-duplication:
Strings consume a lot of memory in any application.Whenever the garbage collector visits String objects it takes note of the char arrays. It takes their hash value and stores it alongside with a weak reference to the array. As soon as it finds another String which has the same hash code it compares them char by char.If they match as well, one String will be modified and point to the char array of the second String. The first char array then is no longer referenced anymore and can be garbage collected.
String Pool:
All strings used by the java program are stored here. If two variables are initialized to the same string value. Two strings are not created in the memory, there will be only one copy stored in memory and both will point to the same memory location.
So java already takes care of not creating duplicate strings in the heap by checking if the string exists in the string pool. Then what is the purpose of string de-duplication?
If there is a code as follows
String myString_1 = new String("Hello World");
String myString_2 = new String("Hello World");
two strings are created in memory even though they are same. I cannot think of any scenario other than this where string de-duplication is useful. Obviously I must be missing something. What I am I missing?
Thanks In Advance

The string pool applies only to strings added to it explicitly, or used as constants in the application. It does not apply to strings created dynamically during the lifetime of the application. String deduplication, however, applies to all strings.

Compile time vs run time
String pool refers to string constants that are known at compile time.
String deduplication would help you if you happen to retrieve (or construct) the same string a million times at run time, e.g. reading it from a file, a HTTP request or any other way.

String de-duplication enjoys the extra level of indirection built into String:
With a string pool, you are limited to returning the same object for two identical strings
String de-duplication lets you have multiple distinct String objects sharing the same content.
This translates into removing a limitation of de-duplicating on creation: your application could keep creating new String objects with identical content while using very little extra memory, because the content of the strings would be shared. This process can be done on a completely unrelated schedule - for example, in the background, while your application does not need much of the CPU resources. Since the identity of the String object does not change, de-duplication can be completely hidden form your application.

Just to add to the answers above, on older VM's the string pool is not garbage collected (this has changed now, but don't rely on that). It contains strings which are used as constants in the application, and so will always be needed. If you continually put all your strings in the string pool, you might quickly run out of memory. On top of that, de-duplication is a relatively expensive process, if you know you only need the string for a very short period of time, and you have enough memory.
For these reasons, strings are not put in the string pool automatically. You have to do it explicitly by calling string.intern().

From documentation :
"Initializes a newly created String object so that it represents
the same sequence of characters as the argument; in other words, the
newly created string is a copy of the argument string. Unless an
explicit copy of original is needed, use of this constructor is
unnecessary since Strings are immutable."
So my sense says, this constructor in String class is not needed normally like you have used above. I guess that constructor is provided merely for the sake of completeness or if you do not want to share that copy (kind of unnecessary now, refer here what I am talking about) but still other constructors are useful like getting an String object from char array and so on..

I cannot think of any scenario other than this where string de-duplication is useful.
Well one other (much more) frequent scenario is the use of StringBuilders. In the toString() method of the StringBuilder class, it clearly creates a new instance in memory:
public final class StringBuilder extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
...
#Override
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
...
}
Same thing for its thread-safe version StringBuffer:
public final class StringBuffer extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
...
#Override
public synchronized String toString() {
if (toStringCache == null) {
toStringCache = Arrays.copyOfRange(value, 0, count);
}
return new String(toStringCache, true);
}
...
}
In applications that rely heavily on this, string de-duplication may reduce memory usage.

How many objects are created in StringBuilder?

Based on my understanding on String objects, that every string literal is being added/created on the String Constant Pool.
String a = new String("hello world");
two objects are being created, one is the "hello world" being added on the constant pool and the other object is the new instance of String. Does the same principle is being applied when it comes to StringBuilder?
StringBuilder b = new StringBuilder("java is fun");
b.append(" and good");
in the StringBuilder example does it mean that there are 3 objects being created? The first one is the new instance of StringBuilder and the other 2 objects are the string literals "java is fun" and " and good"?

Yes, your understanding is correct. (But see below.) String literals go in the constant pool, while the new String(...) and new StringBuilder(...) create additional objects. (The StringBuilder also creates an internal character array object, so that there are at least four objects involved in the second example.) Note that the call to b.append(...) may internally create another object and some garbage, but only if the internal character array used by b needs to expand.
EDIT: As #fdreger points out in a comment, the string objects corresponding to the string literals are created not at run time, but rather at the time the class is created during the creation and loading phase of the class life cycle (as described in the Java Language Specification; see also the section on the runtime constant pool).

Yes, you're right.
"java is fun" and " and good" are going to be stored in the constant pool. I would call them objects, but yes, it's more like it.
b in this case is going to be an object with a limited lifespan. It may actually store its own copy of both strings in its buffer to work with, so that originals won't be modified.

The strings are created when the class is loaded, not when the actual code is executed. You may think of them as part of your code, exactly as any other literal value.
So in your second example:
two strings are created when the code is loaded for the first time
no strings are created during the execution of the code.

Is "new String()" immutable as well?

I've been studying Java String for a while. The following questions are based on the below posts
Java String is special
Immutability of String in java
Immutability:
Now, going by the immutability, the String class has been designed so that the values in the common pool can be reused in other places/variables. This holds good if the String was created as
String a = "Hello World!";
However, if I create String like
String b = new String("Hello World!");
why is this immutable as well? (or is it?). Since this has a dedicated heap memory, I should be able to modify this without affecting any other variable. So by design, was there any other reason why String as a whole is considered immutable? Or is my above assumption wrong?
Second thing I wanted to ask was about the common string pool. If I create a string object as
String c = "";
is an empty entry created in the pool?
Is there any post already on these? If so, could someone share the link?

new String() is an expression that produces a String ... and a String is immutable, no matter how it is produced.
(Asking if new String() is mutable or not is nonsensical. It is program code, not a value. But I take it that that is not what you really meant.)
If I create a string object as String c = ""; is an empty entry created in the pool?
Yes; that is, an entry is created for the empty string. There is nothing special about an empty String.
(To be pedantic, the pool entry for "" gets created long before your code is executed. In fact, it is created when your code is loaded ... or possibly even earlier than that.)
So, I was wanted to know whether the new heap object is immutable as well, ...
Yes it is. But the immutability is a fundamental property of String objects. All String objects.
You see, the String API simply does not provide any methods for changing a String. So (apart from some dangerous and foolish1 tricks using reflection), you can't mutate a String.
and if so what was the purpose?.
The primary reason that Java String is designed as an immutable class is simplicity. It makes it easier to write correct programs, and read / reason about other people's code if the core string class provides an immutable interface.
An important second reason is that the immutability of String has fundamental implications for the Java security model. But I don't think this was a driver in the original language design ... in Java 1.0 and earlier.
Going by the answer, I gather that other references to the same variable is one of the reasons. Please let me know if I am right in understanding this.
No. It is more fundamental than that. Simply, all String objects are immutable. There is no complicated special case reasoning required to understand this. It just >>is<<.
For the record, if you want a mutable "string-like" object in Java, you can use StringBuilder or StringBuffer. But these are different types to String.
1 - The reason these tricks are (IMO) dangerous and foolish is that they affect the values of strings that are potentially shared by other parts of your application via the string pool. This can cause chaos ... in ways that the next guy maintaining your code has little chance of tracking down.

String is immutable irrespective of how it is instantiated
1) Short answer is yes, new String() is immutable too.
Because every possible mutable operation (like replace,toLowerCase etcetra) that you perform on String does not affect the original String instance and returns you a new instance.
You may check this in Javadoc for String. Each public method of String that is exposed returns a new String instance and does not alter the present instance on which you called the method.
This is very helpful in Multi-threaded environment as you don't have to think about mutability (someone will change the value) every time you pass or share the String around. String can easily be the most used data type, so the designers have blessed us all to not think about mutability everytime and saved us a lot of pain.
Immutability allowed String pool or caching
It is because of immutability property that the internal pool of string was possible, as when same String value is required at some other place then that immutable reference is returned. If String would have been mutable then it would not have been possible to share Strings like this to save memory.
String immutablity was not because of pooling, but immutability has more benefits attached to it.
String interning or pooling is an example of Flyweight Design pattern
2) Yes it will be interned like any other String as a blank String is also as much a String as other String instances.
References:
Immutability benefits of String

The Java libraries are heavily optimized around the constraint that any String object is immutable, regardless of how that object is constructed. Even if you create your b using new, other code that you pass that instance to will treat the value as immutable. This is an example of the Value Object pattern, and all of the advantages (thread-safety, no need to make private copies) apply.
The empty string "" is a legitimate String object just like anything else, it just happens to have no internal contents, and since all compile-time constant strings are interned, I'll virtually guarantee that some runtime library has already caused it to be added to the pool.

1) The immutable part isn't because of the pool; it just makes the pool possible in the first place. Strings are often passed as arguments to other functions or even shared with other threads; Making Strings immutable was a design decision to make reasoning in such situations easier. So yes - Strings in Java are always immutable, no matter how you create them (note that it's possible to have mutable strings in java - just not with the String class).
2) Yes. Probably. I'm not actually 100% sure, but that should be the case.

This isn't strictly an answer to your question, but if behind your question is a wish to have mutable strings that you can manipulate, you should check out the StringBuilder class, which implements many of the exact same methods that String has but also adds methods to change the current contents.
Once you've built your string in such a way that you're content with it, you simply call toString() on it in order to convert it to an ordinary String that you can pass to library routines and other functions that only take Strings.
Also, both StringBuilder and String implements the CharSequence interface, so if you want to write functions in your own code that can use both mutable and immutable strings, you can declare them to take any CharSequence object.

From the Java Oracle documentation:
Strings are constant; their values cannot be changed after they are
created.
And again:
String buffers support mutable strings. Because String
objects are immutable they can be shared.
Generally speaking: "all primitive" (or related) object are immutable (please, accept my lack of formalism).
Related post on Stack Overflow:
Immutability of Strings in Java
Is Java String immutable?
Is Java "pass-by-reference" or "pass-by-value"?
Is a Java string really immutable?
String is immutable. What exactly is the meaning?
what the reason behind making string immutable in java?
About the object pool: object pool is a java optimization which is NOT related to immutable as well.

String is immutable means that you cannot change the object itself no matter how you created it.And as for the second question: yes it will create an entry.

It's the other way around, actually.
[...] the String class has been designed so that the values in the common pool can be reused in other places/variables.
No, the String class is immutable so that you can safely reference an instance of it, without worrying about it being modified from another part of your program. This is why pooling is possible in the first place.
So, consider this:
// this string literal is interned and referenced by 'a'
String a = "Hello World!";
// creates a new instance by copying characters from 'a'
String b = new String(a);
Now, what happens if you simply create a reference to your newly created b variable?
// 'c' now points to the same instance as 'b'
String c = b;
Imagine that you pass c (or, more specifically, the object it is referencing) to a method on a different thread, and continue working with the same instance on your main thread. And now imagine what would happen if strings were mutable.
Why is this the case anyway?
If nothing else, it's because immutable objects make multithreading much simpler, and usually even faster. If you share a mutable object (that would be any stateful object, with mutable private/public fields or properties) between different threads, you need to take special care to ensure synchronized access (mutexes, semaphores). Even with this, you need special care to ensure atomicity in all your operations. Multithreading is hard.
Regarding the performance implications, note that quite often copying the entire string into a new instance, in order to change even a single character, is actually faster than inducing an expensive context switch due to synchronization constructs needed to ensure thread-safe access. And as you've mentioned, immutability also offers interning possibilities, which means it can actually help reduce memory usage.
It's generally a pretty good idea to make as much stuff immutable as your can.

1) Immutability: String will be immutable if you create it using new or other way for security reasons
2) Yes there will be a empty entry in the string pool.
You can better understand the concept using the code
String s1 = new String("Test");
String s2 = new String("Test");
String s3 = "Test";
String s4 = "Test";
System.out.println(s1==s2);//false
System.out.println(s1==s3);//false
System.out.println(s2==s3);//false
System.out.println(s4==s3);//true
Hope this will help your query. You can always check the source code for String class in case of better understanding Link : http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java

1-String is immutable.See this:
Is a Java string really immutable?
So you can create it with many way.
2-Short answer: Yes will be empty.

Strings created will always be immutable regardless of how they are created.
Answers to your questions:
The only difference is:
When string is created like -- {String a = "Hello World!";} then only one object gets created.
And when it is created like -- {String b = new String("Hello World!");} then two objects get created. The first one, because you have used 'new' keyword and the second one because of the String property.
Yes, for sure. There will be an empty entry created in the pool.

String is immutable because it does not provide you with a mean to modify it. It is design to avoid any tampering (it is final, the underlying array is not supposed to be touched ...).
Identically, Integer is immutable, because there is no way to modify it.
It does not matter how you create it.

Immutability is not a feature of new, it is a feature of the class String. It has no mutator methods, so it immutable.

String A = "Test"
String B = "Test"
Now String B called"Test".toUpperCase()which change the same object into"TEST", soAwill also be"TEST"` which is not desirable.

Note that in your example, the reference is changed and not the object it refers to, i.e. b as a reference may be changed and refer to a new object. But that new object is immutable, which means its contents will not be changed "after" calling the constructor.
You can change a string using b=b+"x"; or b=new String(b);, and the content of variable a seem to change, but don't confuse immutability of a reference (here variable b) and the object it is referring to (think of pointers in C). The object that the reference is pointing to will remain unchanged after its creation.
If you need to change a string by changing the contents of the object (instead of changing the reference), you can use StringBuffer, which is a mutable version of String.

String POOL in java

Java has string pool, due to which objects of string class are immutable.
But my question stands -
What was the need to make String POOL?
Why string class was not kept like other class to hold its own values?
Is internally JVM need some strings or is this a performance benefit. If yes how?

A pool is possible because the strings are immutable. But the immutability of the String hasn't been decided only because of this pool. Immutability has numerous other benefits. BTW, a Double is also immutable, and there is no pool of Doubles.
The need for the String pool is to reduce the memory needed to hold all the String literals (and the interned Strings) a program uses, since these literals have a good chance of being used many times, in many places of the program. Instead of having thousands of copies of the same String literal, you just have thousand references to the same String, which reduces the memory usage.
Note that the String class is not different from other classes: it holds its own char array. It may also share it with other String instances, though, when substring is called.

When we compiler see's that a new String literal has to be created,it first check's the pool for an identical string,if found no new String literal is created,the existing String is referred.

the benifit of making string as immutable was for the security feature. Read below
Why String has been made immutable in Java?
Though, performance is also a reason (assuming you are already aware of the internal String pool maintained for making sure that the same String object is used more than once without having to create/re-claim it those many times), but the main reason why String has been made immutable in Java is 'Security'. Surprised? Let's understand why.
Suppose you need to open a secure file which requires the users to authenticate themselves. Let's say there are two users named 'user1' and 'user2' and they have their own password files 'password1' and 'password2', respectively. Obviously 'user2' should not have access to 'password1' file.
As we know the filenames in Java are specified by using Strings. Even if you create a 'File' object, you pass the name of the file as a String only and that String is maintained inside the File object as one of its members.
Had String been mutable, 'user1' could have logged into using his credentials and then somehow could have managed to change the name of his password filename (a String object) from 'password1' to 'password2' before JVM actually places the native OS system call to open the file. This would have allowed 'user1' to open user2's password file. Understandably it would have resulted into a big security flaw in Java. I understand there are so many 'could have's here, but you would certainly agree that it would have opened a door to allow developers messing up the security of many resources either intentionally or un-intentionally.
With Strings being immutable, JVM can be sure that the filename instance member of the corresponding File object would keep pointing to same unchanged "filename" String object. The 'filename' instance member being a 'final' in the File class can anyway not be modified to point to any other String object specifying any other file than the intended one (i.e., the one which was used to create the File object).

What was the need to make String POOL?
When created, a String object is stored in heap, and the String literal, that is sent in the constructor, is stored in SP. Thats why using String objects is not a good practice. Becused it creates two objects.
String str = new String("stackoverflow");
Above str is saved in heap with the reference str, and String literal from the constructor -"stackoverflow" - is stored in String Pool. And that is bad for performance. Two objects are created.
The flow: Creating a String literal -> JVM looks for the value in the String Pool as to find whether same value exists or not (no object to be returned) -> The value is not find -> The String literal is created as a new object (internally with the new keyword) -> But now is not sent to the heap , it is send instead in String Pool.
The difference consist where the object is created using new keyword. If it is created by the programmer, it send the object in the heap, directly, without delay. If it is created internally it is sent to String Poll. This is done by the method intern(). intern() is invoke internally when declaring a String literal. And this method is searching SP for identical value as to return the reference of an existing String object or/and to send the object to the SP.
When creating a String obj with new, intern() is not invoked and the object is stored in heap. But you can call intern() on String obj's: String str = new String().intern(); now the str object will be stored in SP.
ex:
String s1 = new String("hello").intern();
String s2 = "hello";
System.out.println(s1 == s2); // true , because now s1 is in SP

new String(); in java

I have an assert statement to check the equal method in a class. below is the 2 statements
Assert.assertTrue(a.equals(new VideoObj(title, year, director)));
Assert.assertTrue(a.equals(new VideoObj(new String(title), year, director)));
what is the difference in the two statements? and why does the 2nd assert statement have new String(title), can someone tell me what is the difference in doing so?
And also could you provide with a tutorial for me to learn writing test class for Java?
thanks

new String() explicitly creates a new object of String.
And:
Assert.assertTrue(a.equals(new VideoObj(title, year, director)));
reuses existing String object (which can give you different results depending on equals method implementation in VideoObj), or uses some other object like byte[] as there is a constructor for new String(byte[]). Of course there are many others (see Java API).
...edit
Check this out:
String s1=new String("abcxyz");
String s2=new String("abc"+"xyz");
String s3=s1+s2;
String s4="abcxyz"; // string literal will go to string pool
String s5="abc"+"xyz"; // string literal will go to string pool
System.out.println(s1==s2);
System.out.println(s2==s3);
System.out.println(s3==s4);
System.out.println(s4==s5); // TRUE (the rest are false)
System.out.println(s2==s4);
System.out.println(s2==s5);
System.out.println(s3==s1);
System.out.println(s3==s5);

The new String() statement creates a new String object with the content of the original String. That gives you two String objects (the original one and the one that was newly created with the new String(originalString) statement) that have equal content, but are two different Java objects. If you compare these two objects with the == operator, this comparison will return false, because it's two different objects. Whereas a comparison with the String's equals() method will still return true, because these two strings have the same text stored inside them.
So I assume the intention of this assertion in the test is to make sure that the equals() method of "a" really uses the equals() method of the title string to check for equality, and does not just do a simple == comparison of the two title strings. Without that assertion, such an invalid implementation of a's equals() method could remain unnoticed because the two title strings in the compared object could be references to the same String object, which would then also give you the expected result when using the (wrong) == operator to test them for equality.

The difference is this:
If in VideoObj's the equals() method has been overridden and is using the String that is passed in via the constructor as title, it's checking to make sure it works if the String isn't a literal
If in equals() someone did this:
...
if (otherObj.getTitle() == this.title) {
...
The first assertion would succeed while the second would fail if title was a string literal. String literals in Java are interned into a pool meaning that you end up with multiple references to the same literal if it's used multiple times;
String a = "This is a string";
String b = "This is a string";
// a and b will both be the same reference
String c = new String("This is a string");
// c will *not* be the same reference
Edit: Note this is assuming that title in your first assertion is indeed a String

If title is not a String, then the second assert statement might be invoking a different VideoObj constructor. Without more info about title and about the VideoObj class, it's hard to say more.
Do a web search for junit tutorial to get lots of resources for learning how to write test classes.
EDIT From your comment to the answer by #BrianRoach, I gather that your VideoObj.equals(Object) method is implemented as follows:
public boolean equals(Object thatObject) {
if (!(thatObject instanceof VideoObj)) return false;
VideoObj that = (VideoObj) thatObject;
return ((_title.equals(that.title()))
|| (_director.equals(that.director()))
|| (_year == that.year()));
}
If that's right, then the problem is that your return statement should be using && instead of || operators. That would explain why your asserts are not complaining.

Java does some optimization. If you have two strings with the same value it is most probably the same object. This can be done because strings are immutable. But if you explicitly create a new String with a constructor call, a new object is created.

There should be no difference between the 2 statements, provided that 'title' is a String.

title can either be a String, a StringBuffer, a StringBuilder, a char[] or a byte[]. If it's a String, it'll run the same constructor as the other call, but in all other cases it may run a totally different, overloaded, constructor.
Without seeing the type of title, it's impossible to tell which.
Edit: Since the type is String, the reason for the new String is most likely to make sure that the underlying code does not compare strings using == (reference-equals) by mistake. If it did, the first test may pass (string constants with the same value may indeed reference the same String object and the test may use constants for setup and test) while the second one would break since == would no longer match with a freshly created String.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.