Java string references different with new, same without new - java

Having never worked with Java much before, I was teaching myself generics syntax. I was testing out the simple generic function with some strings, and noticed something a little strange:
public class Main {
public static <T> boolean areSameReference(T lhs, T rhs) {
return lhs == rhs;
}
public static void main(String[] args) {
String s = new String("test1");
String t = s;
String u = new String("test1");
System.out.println(areSameReference(s, t)); //true
System.out.println(areSameReference(s, u)); //false
String v = "test2";
String w = "test2";
System.out.println(areSameReference(v, w)); //true
}
}
How come [s] and [u] are different references, but [v] and [w] are the same reference? I would have thought that with or without "new" the string literal would have caused them to be the same or different consistently in both cases.
Am I missing something else going on here?

How come [s] and [u] are different references,
Because you told the compiler you wanted new strings, didn't you?

As per JLS 3.10.5
a string literal always refers to the same instance of class String.
This is because string literals - or, more generally, strings that are
the values of constant expressions (§15.28) - are "interned" so as to
share unique instances, using the method String.intern.
String v = "test2";
    String w = "test2";
Will be considered as String literals.
When you use new operator to construct a String object it allocates new String object.

With new, you invoke the memory management system and get a new object.
Without new, the compiler optimizes the string contents into one entry in the '.class' constant table, resulting in two accesses to the same entry in the '.class' constant table.
The = comparison operator in Java is a reference comparison operation, so you will see differences between the two techniques. You might be able to hide those differences with enough use of String.intern(otherString).
The lesson to take home here is that except in extreme circumstances, always use .equals(...) to compare to objects.

The JVM keeps a pool of strings so that it doesn't waste memory with duplicate strings. Therefore, v and w should be the same. I'm guessing that the behavior that you're experiencing is due to the contract implied in the new String(String original) constructor, which indicates that it creates a new object (see http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#String(java.lang.String)).

String objects are immutable, where string references are mutable. When you define "s" and "u", new objects for each are created. Value here does not matter because you invoke constructor, but when you assign same value to different String objects, they become references to same "test2" object in memory.
I would recommend reading more about immutable objects in Java, a good article is here:
http://javarevisited.blogspot.com/2010/10/why-string-is-immutable-in-java.html

Related

Why does String not require a constructor in Java? [duplicate]

I'm a C++ guy learning Java. I'm reading Effective Java and something confused me. It says never to write code like this:
String s = new String("silly");
Because it creates unnecessary String objects. But instead it should be written like this:
String s = "No longer silly";
Ok fine so far...However, given this class:
public final class CaseInsensitiveString {
private String s;
public CaseInsensitiveString(String s) {
if (s == null) {
throw new NullPointerException();
}
this.s = s;
}
:
:
}
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
String s = "polish";
Why is the first statement ok? Shouldn't it be
CaseInsensitiveString cis = "Polish";
How do I make CaseInsensitiveString behave like String so the above statement is OK (with and without extending String)? What is it about String that makes it OK to just be able to pass it a literal like that? From my understanding there is no "copy constructor" concept in Java?
String is a special built-in class of the language. It is for the String class only in which you should avoid saying
String s = new String("Polish");
Because the literal "Polish" is already of type String, and you're creating an extra unnecessary object. For any other class, saying
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
is the correct (and only, in this case) thing to do.
I believe the main benefit of using the literal form (ie, "foo" rather than new String("foo")) is that all String literals are 'interned' by the VM. In other words it is added to a pool such that any other code that creates the same string will use the pooled String rather than creating a new instance.
To illustrate, the following code will print true for the first line, but false for the second:
System.out.println("foo" == "foo");
System.out.println(new String("bar") == new String("bar"));
Strings are treated a bit specially in java, they're immutable so it's safe for them to be handled by reference counting.
If you write
String s = "Polish";
String t = "Polish";
then s and t actually refer to the same object, and s==t will return true, for "==" for objects read "is the same object" (or can, anyway, I"m not sure if this is part of the actual language spec or simply a detail of the compiler implementation-so maybe it's not safe to rely on this) .
If you write
String s = new String("Polish");
String t = new String("Polish");
then s!=t (because you've explicitly created a new string) although s.equals(t) will return true (because string adds this behavior to equals).
The thing you want to write,
CaseInsensitiveString cis = "Polish";
can't work because you're thinking that the quotations are some sort of short-circuit constructor for your object, when in fact this only works for plain old java.lang.Strings.
String s1="foo";
literal will go in pool and s1 will refer.
String s2="foo";
this time it will check "foo" literal is already available in StringPool or not as now it exist so s2 will refer the same literal.
String s3=new String("foo");
"foo" literal will be created in StringPool first then through string arg constructor String Object will be created i.e "foo" in the heap due to object creation through new operator then s3 will refer it.
String s4=new String("foo");
same as s3
so System.out.println(s1==s2);// **true** due to literal comparison.
and System.out.println(s3==s4);// **false** due to object
comparison(s3 and s4 is created at different places in heap)
Strings are special in Java - they're immutable, and string constants are automatically turned into String objects.
There's no way for your SomeStringClass cis = "value" example to apply to any other class.
Nor can you extend String, because it's declared as final, meaning no sub-classing is allowed.
Java strings are interesting. It looks like the responses have covered some of the interesting points. Here are my two cents.
strings are immutable (you can never change them)
String x = "x";
x = "Y";
The first line will create a variable x which will contain the string value "x". The JVM will look in its pool of string values and see if "x" exists, if it does, it will point the variable x to it, if it does not exist, it will create it and then do the assignment
The second line will remove the reference to "x" and see if "Y" exists in the pool of string values. If it does exist, it will assign it, if it does not, it will create it first then assignment. As the string values are used or not, the memory space in the pool of string values will be reclaimed.
string comparisons are contingent on what you are comparing
String a1 = new String("A");
String a2 = new String("A");
a1 does not equal a2
a1 and a2 are object references
When string is explicitly declared, new instances are created and their references will not be the same.
I think you're on the wrong path with trying to use the caseinsensitive class. Leave the strings alone. What you really care about is how you display or compare the values. Use another class to format the string or to make comparisons.
i.e.
TextUtility.compare(string 1, string 2)
TextUtility.compareIgnoreCase(string 1, string 2)
TextUtility.camelHump(string 1)
Since you are making up the class, you can make the compares do what you want - compare the text values.
The best way to answer your question would be to make you familiar with the "String constant pool". In java string objects are immutable (i.e their values cannot be changed once they are initialized), so when editing a string object you end up creating a new edited string object wherease the old object just floats around in a special memory ares called the "string constant pool". creating a new string object by
String s = "Hello";
will only create a string object in the pool and the reference s will refer to it, but by using
String s = new String("Hello");
you create two string objects: one in the pool and the other in the heap. the reference will refer to the object in the heap.
You can't. Things in double-quotes in Java are specially recognised by the compiler as Strings, and unfortunately you can't override this (or extend java.lang.String - it's declared final).
- How do i make CaseInsensitiveString behave like String so the above statement is ok (with and w/out extending String)? What is it about String that makes it ok to just be able to pass it a literal like that? From my understanding there is no "copy constructor" concept in Java right?
Enough has been said from the first point. "Polish" is an string literal and cannot be assigned to the CaseInsentiviveString class.
Now about the second point
Although you can't create new literals, you can follow the first item of that book for a "similar" approach so the following statements are true:
// Lets test the insensitiveness
CaseInsensitiveString cis5 = CaseInsensitiveString.valueOf("sOmEtHiNg");
CaseInsensitiveString cis6 = CaseInsensitiveString.valueOf("SoMeThInG");
assert cis5 == cis6;
assert cis5.equals(cis6);
Here's the code.
C:\oreyes\samples\java\insensitive>type CaseInsensitiveString.java
import java.util.Map;
import java.util.HashMap;
public final class CaseInsensitiveString {
private static final Map<String,CaseInsensitiveString> innerPool
= new HashMap<String,CaseInsensitiveString>();
private final String s;
// Effective Java Item 1: Consider providing static factory methods instead of constructors
public static CaseInsensitiveString valueOf( String s ) {
if ( s == null ) {
return null;
}
String value = s.toLowerCase();
if ( !CaseInsensitiveString.innerPool.containsKey( value ) ) {
CaseInsensitiveString.innerPool.put( value , new CaseInsensitiveString( value ) );
}
return CaseInsensitiveString.innerPool.get( value );
}
// Class constructor: This creates a new instance each time it is invoked.
public CaseInsensitiveString(String s){
if (s == null) {
throw new NullPointerException();
}
this.s = s.toLowerCase();
}
public boolean equals( Object other ) {
if ( other instanceof CaseInsensitiveString ) {
CaseInsensitiveString otherInstance = ( CaseInsensitiveString ) other;
return this.s.equals( otherInstance.s );
}
return false;
}
public int hashCode(){
return this.s.hashCode();
}
// Test the class using the "assert" keyword
public static void main( String [] args ) {
// Creating two different objects as in new String("Polish") == new String("Polish") is false
CaseInsensitiveString cis1 = new CaseInsensitiveString("Polish");
CaseInsensitiveString cis2 = new CaseInsensitiveString("Polish");
// references cis1 and cis2 points to differents objects.
// so the following is true
assert cis1 != cis2; // Yes they're different
assert cis1.equals(cis2); // Yes they're equals thanks to the equals method
// Now let's try the valueOf idiom
CaseInsensitiveString cis3 = CaseInsensitiveString.valueOf("Polish");
CaseInsensitiveString cis4 = CaseInsensitiveString.valueOf("Polish");
// References cis3 and cis4 points to same object.
// so the following is true
assert cis3 == cis4; // Yes they point to the same object
assert cis3.equals(cis4); // and still equals.
// Lets test the insensitiveness
CaseInsensitiveString cis5 = CaseInsensitiveString.valueOf("sOmEtHiNg");
CaseInsensitiveString cis6 = CaseInsensitiveString.valueOf("SoMeThInG");
assert cis5 == cis6;
assert cis5.equals(cis6);
// Futhermore
CaseInsensitiveString cis7 = CaseInsensitiveString.valueOf("SomethinG");
CaseInsensitiveString cis8 = CaseInsensitiveString.valueOf("someThing");
assert cis8 == cis5 && cis7 == cis6;
assert cis7.equals(cis5) && cis6.equals(cis8);
}
}
C:\oreyes\samples\java\insensitive>javac CaseInsensitiveString.java
C:\oreyes\samples\java\insensitive>java -ea CaseInsensitiveString
C:\oreyes\samples\java\insensitive>
That is, create an internal pool of CaseInsensitiveString objects, and return the corrensponding instance from there.
This way the "==" operator returns true for two objects references representing the same value.
This is useful when similar objects are used very frequently and creating cost is expensive.
The string class documentation states that the class uses an internal pool
The class is not complete, some interesting issues arises when we try to walk the contents of the object at implementing the CharSequence interface, but this code is good enough to show how that item in the Book could be applied.
It is important to notice that by using the internalPool object, the references are not released and thus not garbage collectible, and that may become an issue if a lot of objects are created.
It works for the String class because it is used intensively and the pool is constituted of "interned" object only.
It works well for the Boolean class too, because there are only two possible values.
And finally that's also the reason why valueOf(int) in class Integer is limited to -128 to 127 int values.
In your first example, you are creating a String, "silly" and then passing it as a parameter to another String's copy constructor, which makes a second String which is identical to the first. Since Java Strings are immutable (something that frequently stings people who are used to C strings), this is a needless waste of resources. You should instead use the second example because it skips several needless steps.
However, the String literal is not a CaseInsensitiveString so there you cannot do what you want in your last example. Furthermore, there is no way to overload a casting operator like you can in C++ so there is literally no way to do what you want. You must instead pass it in as a parameter to your class's constructor. Of course, I'd probably just use String.toLowerCase() and be done with it.
Also, your CaseInsensitiveString should implement the CharSequence interface as well as probably the Serializable and Comparable interfaces. Of course, if you implement Comparable, you should override equals() and hashCode() as well.
CaseInsensitiveString is not a String although it contains a String. A String literal e.g "example" can be only assigned to a String.
CaseInsensitiveString and String are different objects. You can't do:
CaseInsensitiveString cis = "Polish";
because "Polish" is a String, not a CaseInsensitiveString. If String extended CaseInsensitiveString String then you'd be OK, but obviously it doesn't.
And don't worry about the construction here, you won't be making unecessary objects. If you look at the code of the constructor, all it's doing is storing a reference to the string you passed in. Nothing extra is being created.
In the String s = new String("foobar") case it's doing something different. You are first creating the literal string "foobar", then creating a copy of it by constructing a new string out of it. There's no need to create that copy.
Just because you have the word String in your class, does not mean you get all the special features of the built-in String class.
when they say to write
String s = "Silly";
instead of
String s = new String("Silly");
they mean it when creating a String object because both of the above statements create a String object but the new String() version creates two String objects: one in heap and the other in string constant pool. Hence using more memory.
But when you write
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
you are not creating a String instead you are creating an object of class CaseInsensitiveString. Hence you need to use the new operator.
If I understood it correctly, your question means why we cannot create an object by directly assigning it a value, lets not restrict it to a Wrapper of String class in java.
To answer that I would just say, purely Object Oriented Programming languages have some constructs and it says, that all the literals when written alone can be directly transformed into an object of the given type.
That precisely means, if the interpreter sees 3 it will be converted into an Integer object because integer is the type defined for such literals.
If the interpreter sees any thing in single quotes like 'a' it will directly create an object of type character, you do not need to specify it as the language defines the default object of type character for it.
Similarly if the interpreter sees something in "" it will be considered as an object of its default type i.e. string. This is some native code working in the background.
Thanks to MIT video lecture course 6.00 where I got the hint for this answer.
In Java the syntax "text" creates an instance of class java.lang.String. The assignment:
String foo = "text";
is a simple assignment, with no copy constructor necessary.
MyString bar = "text";
Is illegal whatever you do because the MyString class isn't either java.lang.String or a superclass of java.lang.String.
First, you can't make a class that extends from String, because String is a final class. And java manage Strings differently from other classes so only with String you can do
String s = "Polish";
But whit your class you have to invoke the constructor. So, that code is fine.
I would just add that Java has Copy constructors...
Well, that's an ordinary constructor with an object of same type as argument.
String is one of the special classes in which you can create them without the new Sring part
it's the same as
int x = y;
or
char c;
String str1 = "foo";
String str2 = "foo";
Both str1 and str2 belongs to tha same String object, "foo", b'coz Java manages Strings in StringPool, so if a new variable refers to the same String, it doesn't create another one rather assign the same alerady present in StringPool.
String str1 = new String("foo");
String str2 = new String("foo");
Here both str1 and str2 belongs to different Objects, b'coz new String() forcefully create a new String Object.
It is a basic law that Strings in java are immutable and case sensitive.
Java creates a String object for each string literal you use in your code. Any time "" is used, it is the same as calling new String().
Strings are complex data that just "act" like primitive data. String literals are actually objects even though we pretend they're primitive literals, like 6, 6.0, 'c', etc. So the String "literal" "text" returns a new String object with value char[] value = {'t','e','x','t}. Therefore, calling
new String("text");
is actually akin to calling
new String(new String(new char[]{'t','e','x','t'}));
Hopefully from here, you can see why your textbook considers this redundant.
For reference, here is the implementation of String: http://www.docjar.com/html/api/java/lang/String.java.html
It's a fun read and might inspire some insight. It's also great for beginners to read and try to understand, as the code demonstrates very professional and convention-compliant code.
Another good reference is the Java tutorial on Strings:
http://docs.oracle.com/javase/tutorial/java/data/strings.html
In most versions of the JDK the two versions will be the same:
String s = new String("silly");
String s = "No longer silly";
Because strings are immutable the compiler maintains a list of string constants and if you try to make a new one will first check to see if the string is already defined. If it is then a reference to the existing immutable string is returned.
To clarify - when you say "String s = " you are defining a new variable which takes up space on the stack - then whether you say "No longer silly" or new String("silly") exactly the same thing happens - a new constant string is compiled into your application and the reference points to that.
I dont see the distinction here. However for your own class, which is not immutable, this behaviour is irrelevant and you must call your constructor.
UPDATE: I was wrong!
I tested this and realise that my understanding is wrong - new String("Silly") does indeed create a new string rather than reuse the existing one. I am unclear why this would be (what is the benefit?) but code speaks louder than words!

Comparing two identical strings with == returns false [duplicate]

This question already has answers here:
How do I compare strings in Java?
(23 answers)
Closed 9 years ago.
I am making an archive for my family. There are no syntax errors, however whenever I type in "Maaz", it evaluates realName == "Maaz" to false and goes to the else statement.
import java.util.Scanner;
public class MainFamily {
public static void main (String [] args) {
System.out.println("Enter you're name here");
Scanner name = new Scanner(System.in);//Scanner variable = name
String realName;
realName = name.nextLine();//String variable = user input
System.out.println("Name: "+ realName);
if (realName == "Maaz") {
System.out.println("Name: Maaz");
} else {
System.out.println("This person is not in the database");
}
}
}
TL;DR
You wrote (this doesn't work):
realName == "Maaz"
You meant this:
realname.equals("Maaz")
or this:
realname.equalsIgnoreCase("Maaz")
Explanation
In Java (and many other Object-Oriented programming languages), an object is not the same as a data-type. Data-types are recognized by the runtime as a data-type.
Examples of data-types include: int, float, short.
There are no methods or properties associated with a data-type. For example, this would throw an error, because data-types aren't objects:
int x = 5;
int y = 5;
if (x.equals(y)) {
System.out.println("Equal");
}
A reference is basically a chunk of memory that explicitly tells the runtime environment what that data-block is. The runtime doesn't know how to interpret this; it assumes that the programmer does.
For example, if we used Integer instead of int in the previous example, this would work:
Integer x = new Integer(5);
Integer y = new Integer(5);
if (x.equals(y)) {
System.out.println("Equal");
}
Whereas this would not give the expected result (the if condition would evaluate to false):
Integer x = new Integer(5);
Integer y = new Integer(5);
if (x == y) {
System.out.println("Equal");
}
This is because the two Integer objects have the same value, but they are not the same object. The double equals basically checks to see if the two Objects are the same reference (which has its uses).
In your code, you are comparing an Object with a String literal (also an object), which is not the same as comparing the values of both.
Let's look at another example:
String s = "Some string";
if (s == "Some string") {
System.out.println("Equal");
}
In this instance, the if block will probably evaluate to true. Why is this?
The compiler is optimized to use as little extra memory as is reasonable, although what that means depends on the implementation (and possibly runtime environment).
The String literal, "Some string", in the first line will probably be recognized as equivalent to the String literal in the second line, and will use the same place in memory for each. In simple terms, it will create a String object and plug it into both instances of "Some string". This cannot be relied upon, so using String.equals is always a better method of checking equivalence if you're only concerned with the values.
do this instead
if (realName.equals("Maaz"))
equals() should be used on all non-primitive objects, such as String in this case
'==' should only be used when doing primitive comparisons, such as int and long
use
if(realName.equals("Maaz"))
use == with primitive data type like int boolean .... etc
but if you want to compare object in java you should use the equals method
You have to compare objects with realName.equals ("Maaze"), not with ==.
It is best practice to compare Strings using str.equals(str2) and not str == str2. As you observed, the second form doesn't work a lot of the time. By contrast, the first form always works.
The only cases where the == approach will always work are when the strings are being compared are:
string literals or references to string literals, or
strings that have been "interned" by application-level code calling str = str.intern();.
(And no, strings are not interned by default.)
Since it is generally tricky to write programs that guarantee these preconditions for all strings, it is best practice to use equals unless there is a performance-related imperative to intern your strings and use ==.
Before that you decide that interning is a good idea, you need to compare the benefits of interning with the costs. Those costs include the cost of looking up the string in the string pool's hash table and the space and GC overheads of maintaining the string pool. These are non-trivial compared with the typical costs of just using a regular string and comparing using equals.
You can also use
realname.equalsIgnoreCase("Maaz")
This way you can accept Maaz, maaz, maaZ, mAaZ, etc.
== tests shallow equality. It checks if two objects reference the same location in memory.
Intriguing. Although, as others have stated, the correct way is to use the .equals(...) method, I always thought strings were pooled (irrespective of their creation). It seems this is only true of string literals.
final String str1 = new String("Maaz");
final String str2 = new String("Maaz");
System.out.println(str1 == str2); // Prints false
final String str3 = "Laaz";
final String str4 = "Laaz";
System.out.println(str3 == str4); // Prints true
Since you are working on strings, you should use equals to equalsIngnorecase method of String class. "==" will only compare if the both objects points to same memory location, in your case, both object are different and will not be equal as they dont point to same location. On the other hand, equals method of String class perform a comparison on the basis of the value which objects contains. Hence, if you will use equals method, your if condition will be satisfied.
== compares object references or primitive types (int, char, float ...)
equals(), you can override this method to compare how both objects are equal.
for String class, its method equal() will compare the content inside if they are the same or not.
If your examples, both strings do not have the same object references, so they return false, == are not comparing the characters on both Strings.
It seems nobody yet pointed out that the best practice for comparing an object with a constant in Java is calling the equals method of the constant, not the variable object:
if ("Maaz".equals (realName)) {}
This way you don't need to additionally check if the variable realName is null.
if(realName.compareTo("Maaz") == 0) {
// I dont think theres a better way do to do this.
}

Java == for String objects ceased to work?

public class Comparison {
public static void main(String[] args) {
String s = "prova";
String s2 = "prova";
System.out.println(s == s2);
System.out.println(s.equals(s2));
}
}
outputs:
true
true
on my machine. Why? Shouldn't be == compare object references equality?
Because String instances are immutable, the Java language is able to make some optimizations whereby String literals (or more generally, String whose values are compile time constants) are interned and actually refer to the same (i.e. ==) object.
JLS 3.10.5 String Literals
Each string literal is a reference to an instance of class String. String objects have a constant value. String literals-or, more generally, strings that are the values of constant expressions -are "interned" so as to share unique instances, using the method String.intern.
This is why you get the following:
System.out.println("yes" == "yes"); // true
System.out.println(99 + "bottles" == "99bottles"); // true
System.out.println("7" + "11" == "" + '7' + '1' + (char) (50-1)); // true
System.out.println("trueLove" == (true + "Love")); // true
System.out.println("MGD64" == "MGD" + Long.SIZE);
That said it needs to be said that you should NOT rely on == for String comparison in general, and should use equals for non-null instanceof String. In particular, do not be tempted to intern() all your String just so you can use == without knowing how string interning works.
Related questions
Java String.equals versus ==
difference between string object and string literal
what is the advantage of string object as compared to string literal
Is it good practice to use java.lang.String.intern()?
On new String(...)
If for some peculiar reason you need to create two String objects (which are thus not == by definition), and yet be equals, then you can, among other things, use this constructor:
public String(String original) : Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.
Thus, you can have:
System.out.println("x" == new String("x")); // false
The new operator always create a new object, thus the above is guaranteed to print false. That said, this is not generally something that you actually need to do. Whenever possible, you should just use string literals instead of explicitly creating a new String for it.
Related questions
Java Strings: “String s = new String(”silly“);”
What is the purpose of the expression “new String(…)” in Java?
JLS, 3.10.5 => It is guaranteed that a literal string object will be reused by any other code running in the same virtual machine that happens to contain the same string literal
If you explicitly create new objects, == returns false:
String s1 = new String("prova");
String s2 = new String("prova");
System.out.println(s1 == s2); // returns false.
Otherwise the JVM can use the same object, hence s1 == s2 will return true.
It does. But String literals are pooled, so "prova" returns the same instance.
String s = "prova";
String s2 = "prova";
s and s2 are literal strings which are pointing the same object in String Pool of JVM, so that the comparison returns true.
Yes, "prova" is stored in the java inner string pool, so its the same reference.
Source code literals are part of a constant pool, so if the same literal appears multiple times, it will be the same object at runtime.
The JVM may optimize the String usage so that there is only one instance of the "equal" String in memory. In this case also the == operator will return true. But don't count on it, though.
You must understand that "==" compares references and "equals" compares values. Both s and s1 are pointing to the same string literal, so their references are the same.
When you put a literal string in java code, the string is automatically interned by the compiler, that is one static global instance of it is created. Or more specifically, it is put into a table of interned strings. Any other quoted string that is exactly the same, content-wise, will reference the same interned string.
So in your code s and s2 are the same string
Ideally it should not happen ever. Because java specification guarantees this. So I think it may be the bug in JVM, you should report to the sun microsystems.

Strings are objects in Java, so why don't we use 'new' to create them?

We normally create objects using the new keyword, like:
Object obj = new Object();
Strings are objects, yet we do not use new to create them:
String str = "Hello World";
Why is this? Can I make a String with new?
In addition to what was already said, String literals [ie, Strings like "abcd" but not like new String("abcd")] in Java are interned - this means that every time you refer to "abcd", you get a reference to a single String instance, rather than a new one each time. So you will have:
String a = "abcd";
String b = "abcd";
a == b; //True
but if you had
String a = new String("abcd");
String b = new String("abcd");
then it's possible to have
a == b; // False
(and in case anyone needs reminding, always use .equals() to compare Strings; == tests for physical equality).
Interning String literals is good because they are often used more than once. For example, consider the (contrived) code:
for (int i = 0; i < 10; i++) {
System.out.println("Next iteration");
}
If we didn't have interning of Strings, "Next iteration" would need to be instantiated 10 times, whereas now it will only be instantiated once.
Strings are "special" objects in Java. The Java designers wisely decided that Strings are used so often that they needed their own syntax as well as a caching strategy. When you declare a string by saying:
String myString = "something";
myString is a reference to String object with a value of "something". If you later declare:
String myOtherString = "something";
Java is smart enough to work out that myString and myOtherString are the same and will store them in a global String table as the same object. It relies on the fact that you can't modify Strings to do this. This lowers the amount of memory required and can make comparisons faster.
If, instead, you write
String myOtherString = new String("something");
Java will create a brand new object for you, distinct from the myString object.
String a = "abc"; // 1 Object: "abc" added to pool
String b = "abc"; // 0 Object: because it is already in the pool
String c = new String("abc"); // 1 Object
String d = new String("def"); // 1 Object + "def" is added to the Pool
String e = d.intern(); // (e==d) is "false" because e refers to the String in pool
String f = e.intern(); // (f==e) is "true"
//Total Objects: 4 ("abc", c, d, "def").
Hope this clears a few doubts. :)
We usually use String literals to avoid creating unnecessary objects. If we use new operator to create String object , then it will create new object everytime .
Example:
String s1=“Hello“;
String s2=“Hello“;
String s3= new String(“Hello“);
String s4= new String(“Hello“);
For the above code in memory :
It's a shortcut. It wasn't originally like that, but Java changed it.
This FAQ talks about it briefly. The Java Specification guide talks about it also. But I can't find it online.
String is subject to a couple of optimisations (for want of a better phrase). Note that String also has operator overloading (for the + operator) - unlike other objects. So it's very much a special case.
In Java, Strings are a special case, with many rules that apply only to Strings. The double quotes causes the compiler to create a String object. Since String objects are immutable, this allows the compiler to intern multiple strings, and build a larger string pool. Two identical String constants will always have the same object reference. If you don't want this to be the case, then you can use new String(""), and that will create a String object at runtime. The intern() method used to be common, to cause dynamically created strings to be checked against the string lookup table. Once a string in interned, the object reference will point to the canonical String instance.
String a = "foo";
String b = "foo";
System.out.println(a == b); // true
String c = new String(a);
System.out.println(a == c); // false
c = c.intern();
System.out.println(a == c); // true
When the classloader loads a class, all String constants are added to the String pool.
Well the StringPool is implemented using The Hashmap in java. If we are creating always with a new keyword its not searching in String Pool and creating a new memory for it which might be needed later if we have a memory intensive operation running and if we are creating all the strings with new keyword that would affect performance of our application. So its advisable to not to use new keywords for creating string because then only it will go to String pool which in turn is a Hashmap ,(memory saved , imagine if we have lots of strings created with new keyword ) here it will be stored and if the string already exists the reference of it(which would usually reside in Stack memory) would be returned to the newly created string.
So its done to improve performance .
Syntactic sugar. The
String s = new String("ABC");
syntax is still available.
You can still use new String("string"), but it would be harder to create new strings without string literals ... you would have to use character arrays or bytes :-) String literals have one additional property: all same string literals from any class point to same string instance (they are interned).
There's almost no need to new a string as the literal (the characters in quotes) is already a String object created when the host class is loaded. It is perfectly legal to invoke methods on a literal and don, the main distinction is the convenience provided by literals. It would be a major pain and waste of tine if we had to create an array of chars and fill it char by char and them doing a new String(char array).
Feel free to create a new String with
String s = new String("I'm a new String");
The usual notation s = "new String"; is more or less a convenient shortcut - which should be used for performance reasons except for those pretty rare cases, where you really need Strings that qualify for the equation
(string1.equals(string2)) && !(string1 == string2)
EDIT
In response to the comment: this was not intended to be an advise but just an only a direct response to the questioners thesis, that we do not use the 'new' keyword for Strings, which simply isn't true. Hope this edit (including the above) clarifies this a bit. BTW - there's a couple of good and much better answers to the above question on SO.
The literal pool contains any Strings that were created without using the keyword new.
There is a difference : String without new reference is stored in String literal pool and String with new says that they are in heap memory.
String with new are elsewhere in memory just like any other object.
Because String is an immutable class in java.
Now why it is immutable?
As String is immutable so it can be shared between multiple threads and we dont need to synchronize String operation externally.
As String is also used in class loading mechanism. So if String was mutable then java.io.writer could have been changed to abc.xyz.mywriter
TString obj1 = new TString("Jan Peter");
TString obj2 = new TString("Jan Peter");
if (obj1.Name == obj2.Name)
System.out.println("True");
else
System.out.println("False");
Output:
True
I created two separate objects, both have a field(ref) 'Name'. So even in this case "Jan Peter" is shared, if I understand the way java deals..

Java Strings: "String s = new String("silly");"

I'm a C++ guy learning Java. I'm reading Effective Java and something confused me. It says never to write code like this:
String s = new String("silly");
Because it creates unnecessary String objects. But instead it should be written like this:
String s = "No longer silly";
Ok fine so far...However, given this class:
public final class CaseInsensitiveString {
private String s;
public CaseInsensitiveString(String s) {
if (s == null) {
throw new NullPointerException();
}
this.s = s;
}
:
:
}
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
String s = "polish";
Why is the first statement ok? Shouldn't it be
CaseInsensitiveString cis = "Polish";
How do I make CaseInsensitiveString behave like String so the above statement is OK (with and without extending String)? What is it about String that makes it OK to just be able to pass it a literal like that? From my understanding there is no "copy constructor" concept in Java?
String is a special built-in class of the language. It is for the String class only in which you should avoid saying
String s = new String("Polish");
Because the literal "Polish" is already of type String, and you're creating an extra unnecessary object. For any other class, saying
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
is the correct (and only, in this case) thing to do.
I believe the main benefit of using the literal form (ie, "foo" rather than new String("foo")) is that all String literals are 'interned' by the VM. In other words it is added to a pool such that any other code that creates the same string will use the pooled String rather than creating a new instance.
To illustrate, the following code will print true for the first line, but false for the second:
System.out.println("foo" == "foo");
System.out.println(new String("bar") == new String("bar"));
Strings are treated a bit specially in java, they're immutable so it's safe for them to be handled by reference counting.
If you write
String s = "Polish";
String t = "Polish";
then s and t actually refer to the same object, and s==t will return true, for "==" for objects read "is the same object" (or can, anyway, I"m not sure if this is part of the actual language spec or simply a detail of the compiler implementation-so maybe it's not safe to rely on this) .
If you write
String s = new String("Polish");
String t = new String("Polish");
then s!=t (because you've explicitly created a new string) although s.equals(t) will return true (because string adds this behavior to equals).
The thing you want to write,
CaseInsensitiveString cis = "Polish";
can't work because you're thinking that the quotations are some sort of short-circuit constructor for your object, when in fact this only works for plain old java.lang.Strings.
String s1="foo";
literal will go in pool and s1 will refer.
String s2="foo";
this time it will check "foo" literal is already available in StringPool or not as now it exist so s2 will refer the same literal.
String s3=new String("foo");
"foo" literal will be created in StringPool first then through string arg constructor String Object will be created i.e "foo" in the heap due to object creation through new operator then s3 will refer it.
String s4=new String("foo");
same as s3
so System.out.println(s1==s2);// **true** due to literal comparison.
and System.out.println(s3==s4);// **false** due to object
comparison(s3 and s4 is created at different places in heap)
Strings are special in Java - they're immutable, and string constants are automatically turned into String objects.
There's no way for your SomeStringClass cis = "value" example to apply to any other class.
Nor can you extend String, because it's declared as final, meaning no sub-classing is allowed.
Java strings are interesting. It looks like the responses have covered some of the interesting points. Here are my two cents.
strings are immutable (you can never change them)
String x = "x";
x = "Y";
The first line will create a variable x which will contain the string value "x". The JVM will look in its pool of string values and see if "x" exists, if it does, it will point the variable x to it, if it does not exist, it will create it and then do the assignment
The second line will remove the reference to "x" and see if "Y" exists in the pool of string values. If it does exist, it will assign it, if it does not, it will create it first then assignment. As the string values are used or not, the memory space in the pool of string values will be reclaimed.
string comparisons are contingent on what you are comparing
String a1 = new String("A");
String a2 = new String("A");
a1 does not equal a2
a1 and a2 are object references
When string is explicitly declared, new instances are created and their references will not be the same.
I think you're on the wrong path with trying to use the caseinsensitive class. Leave the strings alone. What you really care about is how you display or compare the values. Use another class to format the string or to make comparisons.
i.e.
TextUtility.compare(string 1, string 2)
TextUtility.compareIgnoreCase(string 1, string 2)
TextUtility.camelHump(string 1)
Since you are making up the class, you can make the compares do what you want - compare the text values.
The best way to answer your question would be to make you familiar with the "String constant pool". In java string objects are immutable (i.e their values cannot be changed once they are initialized), so when editing a string object you end up creating a new edited string object wherease the old object just floats around in a special memory ares called the "string constant pool". creating a new string object by
String s = "Hello";
will only create a string object in the pool and the reference s will refer to it, but by using
String s = new String("Hello");
you create two string objects: one in the pool and the other in the heap. the reference will refer to the object in the heap.
You can't. Things in double-quotes in Java are specially recognised by the compiler as Strings, and unfortunately you can't override this (or extend java.lang.String - it's declared final).
- How do i make CaseInsensitiveString behave like String so the above statement is ok (with and w/out extending String)? What is it about String that makes it ok to just be able to pass it a literal like that? From my understanding there is no "copy constructor" concept in Java right?
Enough has been said from the first point. "Polish" is an string literal and cannot be assigned to the CaseInsentiviveString class.
Now about the second point
Although you can't create new literals, you can follow the first item of that book for a "similar" approach so the following statements are true:
// Lets test the insensitiveness
CaseInsensitiveString cis5 = CaseInsensitiveString.valueOf("sOmEtHiNg");
CaseInsensitiveString cis6 = CaseInsensitiveString.valueOf("SoMeThInG");
assert cis5 == cis6;
assert cis5.equals(cis6);
Here's the code.
C:\oreyes\samples\java\insensitive>type CaseInsensitiveString.java
import java.util.Map;
import java.util.HashMap;
public final class CaseInsensitiveString {
private static final Map<String,CaseInsensitiveString> innerPool
= new HashMap<String,CaseInsensitiveString>();
private final String s;
// Effective Java Item 1: Consider providing static factory methods instead of constructors
public static CaseInsensitiveString valueOf( String s ) {
if ( s == null ) {
return null;
}
String value = s.toLowerCase();
if ( !CaseInsensitiveString.innerPool.containsKey( value ) ) {
CaseInsensitiveString.innerPool.put( value , new CaseInsensitiveString( value ) );
}
return CaseInsensitiveString.innerPool.get( value );
}
// Class constructor: This creates a new instance each time it is invoked.
public CaseInsensitiveString(String s){
if (s == null) {
throw new NullPointerException();
}
this.s = s.toLowerCase();
}
public boolean equals( Object other ) {
if ( other instanceof CaseInsensitiveString ) {
CaseInsensitiveString otherInstance = ( CaseInsensitiveString ) other;
return this.s.equals( otherInstance.s );
}
return false;
}
public int hashCode(){
return this.s.hashCode();
}
// Test the class using the "assert" keyword
public static void main( String [] args ) {
// Creating two different objects as in new String("Polish") == new String("Polish") is false
CaseInsensitiveString cis1 = new CaseInsensitiveString("Polish");
CaseInsensitiveString cis2 = new CaseInsensitiveString("Polish");
// references cis1 and cis2 points to differents objects.
// so the following is true
assert cis1 != cis2; // Yes they're different
assert cis1.equals(cis2); // Yes they're equals thanks to the equals method
// Now let's try the valueOf idiom
CaseInsensitiveString cis3 = CaseInsensitiveString.valueOf("Polish");
CaseInsensitiveString cis4 = CaseInsensitiveString.valueOf("Polish");
// References cis3 and cis4 points to same object.
// so the following is true
assert cis3 == cis4; // Yes they point to the same object
assert cis3.equals(cis4); // and still equals.
// Lets test the insensitiveness
CaseInsensitiveString cis5 = CaseInsensitiveString.valueOf("sOmEtHiNg");
CaseInsensitiveString cis6 = CaseInsensitiveString.valueOf("SoMeThInG");
assert cis5 == cis6;
assert cis5.equals(cis6);
// Futhermore
CaseInsensitiveString cis7 = CaseInsensitiveString.valueOf("SomethinG");
CaseInsensitiveString cis8 = CaseInsensitiveString.valueOf("someThing");
assert cis8 == cis5 && cis7 == cis6;
assert cis7.equals(cis5) && cis6.equals(cis8);
}
}
C:\oreyes\samples\java\insensitive>javac CaseInsensitiveString.java
C:\oreyes\samples\java\insensitive>java -ea CaseInsensitiveString
C:\oreyes\samples\java\insensitive>
That is, create an internal pool of CaseInsensitiveString objects, and return the corrensponding instance from there.
This way the "==" operator returns true for two objects references representing the same value.
This is useful when similar objects are used very frequently and creating cost is expensive.
The string class documentation states that the class uses an internal pool
The class is not complete, some interesting issues arises when we try to walk the contents of the object at implementing the CharSequence interface, but this code is good enough to show how that item in the Book could be applied.
It is important to notice that by using the internalPool object, the references are not released and thus not garbage collectible, and that may become an issue if a lot of objects are created.
It works for the String class because it is used intensively and the pool is constituted of "interned" object only.
It works well for the Boolean class too, because there are only two possible values.
And finally that's also the reason why valueOf(int) in class Integer is limited to -128 to 127 int values.
In your first example, you are creating a String, "silly" and then passing it as a parameter to another String's copy constructor, which makes a second String which is identical to the first. Since Java Strings are immutable (something that frequently stings people who are used to C strings), this is a needless waste of resources. You should instead use the second example because it skips several needless steps.
However, the String literal is not a CaseInsensitiveString so there you cannot do what you want in your last example. Furthermore, there is no way to overload a casting operator like you can in C++ so there is literally no way to do what you want. You must instead pass it in as a parameter to your class's constructor. Of course, I'd probably just use String.toLowerCase() and be done with it.
Also, your CaseInsensitiveString should implement the CharSequence interface as well as probably the Serializable and Comparable interfaces. Of course, if you implement Comparable, you should override equals() and hashCode() as well.
CaseInsensitiveString is not a String although it contains a String. A String literal e.g "example" can be only assigned to a String.
CaseInsensitiveString and String are different objects. You can't do:
CaseInsensitiveString cis = "Polish";
because "Polish" is a String, not a CaseInsensitiveString. If String extended CaseInsensitiveString String then you'd be OK, but obviously it doesn't.
And don't worry about the construction here, you won't be making unecessary objects. If you look at the code of the constructor, all it's doing is storing a reference to the string you passed in. Nothing extra is being created.
In the String s = new String("foobar") case it's doing something different. You are first creating the literal string "foobar", then creating a copy of it by constructing a new string out of it. There's no need to create that copy.
Just because you have the word String in your class, does not mean you get all the special features of the built-in String class.
when they say to write
String s = "Silly";
instead of
String s = new String("Silly");
they mean it when creating a String object because both of the above statements create a String object but the new String() version creates two String objects: one in heap and the other in string constant pool. Hence using more memory.
But when you write
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
you are not creating a String instead you are creating an object of class CaseInsensitiveString. Hence you need to use the new operator.
If I understood it correctly, your question means why we cannot create an object by directly assigning it a value, lets not restrict it to a Wrapper of String class in java.
To answer that I would just say, purely Object Oriented Programming languages have some constructs and it says, that all the literals when written alone can be directly transformed into an object of the given type.
That precisely means, if the interpreter sees 3 it will be converted into an Integer object because integer is the type defined for such literals.
If the interpreter sees any thing in single quotes like 'a' it will directly create an object of type character, you do not need to specify it as the language defines the default object of type character for it.
Similarly if the interpreter sees something in "" it will be considered as an object of its default type i.e. string. This is some native code working in the background.
Thanks to MIT video lecture course 6.00 where I got the hint for this answer.
In Java the syntax "text" creates an instance of class java.lang.String. The assignment:
String foo = "text";
is a simple assignment, with no copy constructor necessary.
MyString bar = "text";
Is illegal whatever you do because the MyString class isn't either java.lang.String or a superclass of java.lang.String.
First, you can't make a class that extends from String, because String is a final class. And java manage Strings differently from other classes so only with String you can do
String s = "Polish";
But whit your class you have to invoke the constructor. So, that code is fine.
I would just add that Java has Copy constructors...
Well, that's an ordinary constructor with an object of same type as argument.
String is one of the special classes in which you can create them without the new Sring part
it's the same as
int x = y;
or
char c;
String str1 = "foo";
String str2 = "foo";
Both str1 and str2 belongs to tha same String object, "foo", b'coz Java manages Strings in StringPool, so if a new variable refers to the same String, it doesn't create another one rather assign the same alerady present in StringPool.
String str1 = new String("foo");
String str2 = new String("foo");
Here both str1 and str2 belongs to different Objects, b'coz new String() forcefully create a new String Object.
It is a basic law that Strings in java are immutable and case sensitive.
Java creates a String object for each string literal you use in your code. Any time "" is used, it is the same as calling new String().
Strings are complex data that just "act" like primitive data. String literals are actually objects even though we pretend they're primitive literals, like 6, 6.0, 'c', etc. So the String "literal" "text" returns a new String object with value char[] value = {'t','e','x','t}. Therefore, calling
new String("text");
is actually akin to calling
new String(new String(new char[]{'t','e','x','t'}));
Hopefully from here, you can see why your textbook considers this redundant.
For reference, here is the implementation of String: http://www.docjar.com/html/api/java/lang/String.java.html
It's a fun read and might inspire some insight. It's also great for beginners to read and try to understand, as the code demonstrates very professional and convention-compliant code.
Another good reference is the Java tutorial on Strings:
http://docs.oracle.com/javase/tutorial/java/data/strings.html
In most versions of the JDK the two versions will be the same:
String s = new String("silly");
String s = "No longer silly";
Because strings are immutable the compiler maintains a list of string constants and if you try to make a new one will first check to see if the string is already defined. If it is then a reference to the existing immutable string is returned.
To clarify - when you say "String s = " you are defining a new variable which takes up space on the stack - then whether you say "No longer silly" or new String("silly") exactly the same thing happens - a new constant string is compiled into your application and the reference points to that.
I dont see the distinction here. However for your own class, which is not immutable, this behaviour is irrelevant and you must call your constructor.
UPDATE: I was wrong!
I tested this and realise that my understanding is wrong - new String("Silly") does indeed create a new string rather than reuse the existing one. I am unclear why this would be (what is the benefit?) but code speaks louder than words!

Categories