Is Boolean interned in Java?

The following code for Integer uses object interning:
Integer.valueOf("1")
It is not clear from the API documentation whether this code for Boolean also returns an interned object:
Boolean.valueOf("true")
Obviously, it may. But does it have to?
UPDATE
I agree that the source code can explain what actually happens (BTW, thanks for the answers). To make the question less trivial: is there any part of the Java API spec or the JLS which tells what MUST happen?
It was natural to ask the question against the code like this:
String str = "true";
if (Boolean.valueOf(str) == Boolean.TRUE) { ... }
The outcome depends on whether "object interning" is guaranteed or not. It's better to avoid this code altogether and use true instead of Boolean.TRUE (rather than looking up details in any specs or sources), but it is a valid reason to ask the question.
NOTE: In fact, I didn't see guarantees of object interning for Integer in any googled specs. So, it may all be just an implementation detail nobody should rely on.

The JLS guarantees that:
Integer i = 1;
Boolean b = true;
will use interning (at least between -128 and 127 for Integers, and for true and false for Booleans).
The relevant javadocs also guarantee that:
Integer i = Integer.valueOf(1);
Boolean b = Boolean.valueOf(true);
will return interned objects.
However, there are no such explicit guarantees for valueOf(String): although interning is the case in the specific implementation you are using, it may not be the case with a different JVM or in future releases. In fact, an implementation that returned new Boolean(Boolean.parseBoolean(input)) would be valid.
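For example, a minimal sketch of the same check written so that it does not depend on any interning guarantee (comparing the primitive value instead of references):
String str = "true";
// Risky: relies on valueOf(String) returning the cached instance, which the javadoc does not promise.
if (Boolean.valueOf(str) == Boolean.TRUE) {
    // ...
}
// Safe regardless of interning: work with the primitive value.
if (Boolean.parseBoolean(str)) {
    // ...
}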

Based on the source code, the boolean is parsed as:
public static final Boolean FALSE = new Boolean(false);
public static final Boolean TRUE = new Boolean(true);
public static Boolean valueOf(String s) {
    return toBoolean(s) ? TRUE : FALSE;
}
Where TRUE and FALSE are static (immutable) objects. So yes, parsed booleans are interned.
However, I agree with @JBNizet's comment that one should not derive code contracts from the source code: as long as a feature is not documented, the developers of Java can change their minds. You'd better use Boolean's equals() method to check whether two objects are equivalent.

Here's the source code:
public static Boolean valueOf(String s) {
    return parseBoolean(s) ? TRUE : FALSE;
}
Where TRUE is:
public static final Boolean TRUE = new Boolean(true);
If you're still not sure, define two variables:
Boolean a = Boolean.valueOf("true");
Boolean b = Boolean.valueOf("true");
and check whatever you want for yourself.
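For instance, a check along these lines (the first result is implementation-specific, not promised by the javadoc of valueOf(String)):
System.out.println(a == b);      // true on current OpenJDK builds, but not guaranteed
System.out.println(a.equals(b)); // true in any case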

Related

What is the role of the valueOffset field in AtomicReference's getAndSet and other such methods?

I am studying non-blocking algorithms. I realize that understanding areas like volatile and the Atomic* classes is crucial for this. So here is a question.
In AtomicReference, compareAndSet uses unsafe.compareAndSwapObject(this, valueOffset, expect, update).
Another one, the infamous lazySet uses unsafe.putOrderedObject(this, valueOffset, newValue);.
I assume that somewhere in these methods a comparison of the expect and update objects is done using "equals". I wonder how that is done atomically; I guess it ultimately uses CAS in some way, and that valueOffset plays a role there.
Unfortunately, the source code of Unsafe is not available, and it seems rather crucial for understanding what valueOffset is and what it does. Does anyone have an idea? Thanks!
Why are you even trying to figure out how something that is so implementation specific works?
Anyway, did you look at where valueOffset was assigned?
private static final Unsafe unsafe = Unsafe.getUnsafe();
private static final long valueOffset;

static {
    try {
        valueOffset = unsafe.objectFieldOffset
            (AtomicReference.class.getDeclaredField("value"));
    } catch (Exception ex) { throw new Error(ex); }
}

private volatile V value;

public final boolean compareAndSet(V expect, V update) {
    return unsafe.compareAndSwapObject(this, valueOffset, expect, update);
}
It seems like valueOffset is the byte offset of field value from the beginning of the memory block pointed to by reference this.
As for whether you need to implement equals() in the value class, you don't.
AtomicReference is about the reference, not the object. The compareAndSet() method is about reference equality (==), not about the logical object equality implemented by equals().
For example, the following code prints false, because the two 1 values are different object instances, even though they have the same logical "value".
AtomicReference r = new AtomicReference(new Integer(1));
boolean b = r.compareAndSet(new Integer(1), new Integer(2));
System.out.println(b);
The value of r did not get updated by compareAndSet().
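By contrast, here is a small sketch showing that compareAndSet() succeeds when the expected value is the exact same reference that is currently stored:
Integer one = new Integer(1);
AtomicReference<Integer> r = new AtomicReference<>(one);
// expect is the same reference as the stored value, so the CAS succeeds
boolean b = r.compareAndSet(one, new Integer(2));
System.out.println(b);       // true
System.out.println(r.get()); // 2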

Correct way of initializing a Boolean

How does a Boolean instance have to be initialized?
Is it
Boolean b = null;
or
Boolean b = new Boolean(null);
Which one is the correct coding practice?
The first one is correct if you want a null Boolean.
Personally I don't like having null values and prefer to use boolean, which cannot be null and is false by default.
In order to understand what the second statement does, you need to understand Java's primitive wrappers. A Boolean is simply an object wrapper around a boolean; when you write directly:
Boolean b = false;
There is some autoboxing going on and this is essentially equivalent to writing
Boolean b = Boolean.FALSE;
If you declare a new Boolean then you create a new and separate Boolean object rather than allowing the compiler to (possibly) reuse the existing reference.
It rarely (if ever) makes sense to use the constructor of the primitive wrapper types.
There is absolutely no need to create a new object for Boolean.
This is what the javadoc says:
Note: It is rarely appropriate to use this constructor. Unless a new instance is required, the static factory valueOf(boolean) is generally a better choice. It is likely to yield significantly better space and time performance.
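A small sketch of the difference the javadoc describes: valueOf(boolean) hands back the cached constants, while the constructor always allocates a fresh instance:
Boolean cached = Boolean.valueOf(true);
Boolean fresh = new Boolean(true); // always a new object; discouraged

System.out.println(cached == Boolean.TRUE);     // true, guaranteed by the javadoc of valueOf(boolean)
System.out.println(fresh == Boolean.TRUE);      // false, a distinct instance
System.out.println(fresh.equals(Boolean.TRUE)); // true, same logical value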
Boolean b = new Boolean(null); uses the Boolean(String) constructor and sets b's internal boolean value to false, which is different from setting the reference b to null.
Boolean b = null;
System.out.println(b.booleanValue()); // throws a NullPointerException
but
Boolean b = new Boolean(null);
System.out.println(b.booleanValue()); // prints false
If you need only a two-state value, use a primitive boolean; if you need a three-state value (null, true, false), use a Boolean object and set the reference, as in the first example, to null.
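For example, a three-state Boolean has to be null-checked before it is unboxed (a minimal sketch):
Boolean answer = null; // "not decided yet"

if (answer == null) {
    System.out.println("no value yet");
} else if (answer) { // unboxing is safe here because answer is known to be non-null
    System.out.println("yes");
} else {
    System.out.println("no");
}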
Both are correct declarations.
Boolean b = null;
This only declares a reference; no object is created at all.
Boolean b = new Boolean(null);
This creates a new Boolean object on the heap. Use the .equals() method to compare two Boolean objects; == only compares references.

Java .equals() instanceof subclass? Why not call superclass equals instead of making it final?

It is stated in Object's .equals(Object) javadoc:
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
Almost everywhere in example code I see overridden .equals(Object) method which uses instanceof as one of the first tests, for example here: What issues / pitfalls must be considered when overriding equals and hashCode?
public class Person {
private String name;
private int age;
public boolean equals(Object obj) {
if (obj == null)
return false;
if (obj == this)
return true;
if (!(obj instanceof Person))
return false;
...
}
}
Now with class SpecialPerson extends Person having in equals:
if (!(obj instanceof SpecialPerson))
return false;
we cannot guarantee that .equals() is symmetric.
It has been discussed for example here: any-reason-to-prefer-getclass-over-instanceof-when-generating-equals
Person a = new Person(), b = new SpecialPerson();
a.equals(b); //sometimes true, since b instanceof Person
b.equals(a); //always false
Maybe I should add, at the beginning of SpecialPerson's equals, a direct call to super?
public boolean equals(Object obj) {
    if (!(obj instanceof SpecialPerson))
        return super.equals(obj);
    ...
    /* more equality tests here */
}
A lot of the examples use instanceof for two reasons: a) it folds the null check and type check into one or b) the example is for Hibernate or some other code-rewriting framework.
The "correct" (as per the JavaDoc) solution is to use this.getClass() == obj.getClass(). This works for Java because classes are singletons and the VM guarantees this. If you're paranoid, you can use this.getClass().equals(obj.getClass()) but the two are really equivalent.
This works most of the time. But sometimes, Java frameworks need to do "clever" things with the byte code. This usually means they create a subtype automatically. Since the subtype should be considered equal to the original type, equals() must be implemented in the "wrong" way but this doesn't matter since at runtime, the subtypes will all follow certain patterns. For example, they will do additional stuff before a setter is being called. This has no effect on the "equalness".
As you noticed, things start to get ugly when you have both cases: You really extend the base types and you mix that with automatic subtype generation. If you do that, you must make sure that you never use non-leaf types.
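A minimal sketch of the getClass()-based approach mentioned above, for the Person class from the question (field names taken from the earlier snippet):
@Override
public boolean equals(Object obj) {
    if (obj == this)
        return true;
    if (obj == null || getClass() != obj.getClass())
        return false;
    Person other = (Person) obj;
    return age == other.age
            && (name == null ? other.name == null : name.equals(other.name));
}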
You are missing something here. I will try to highlight this:
Suppose you have Person person = new Person() and Person personSpecial = new SpecialPerson(); then I am sure you would not like these two objects to be equal. So it's really working as required: equals must return false.
Moreover, symmetry requires that the equals() methods in both classes obey it at the same time. If one equals returns true and the other returns false, then I would say the flaw is in the equals override.
Your attempt at solving the problem is not correct. Suppose you have two subclasses, SpecialPerson and BizarrePerson. With this implementation, BizarrePerson instances could be equal to SpecialPerson instances. You generally don't want that.
Don't use instanceof; use this.getClass() == obj.getClass() instead. Then you are checking for the exact class.
When working with equals you should always override hashCode too!
the hashCode method for Person could look like this:
@Override
public int hashCode()
{
    final int prime = 31;
    int result = 1;
    result = prime * result + age;
    result = prime * result + ((name == null) ? 0 : name.hashCode());
    return result;
}
and use it like this in your equals method:
if (this.hashCode() != obj.hashCode())
{
return false;
}
A type should not consider itself equal to an object of any other type--even a subtype--unless both objects derive from a common class whose contract specifies how descendants of different types should check for equality.
For example, an abstract class StringyThing could encapsulate strings, and provide methods to do things like convert to a string or extract substrings, but not impose any requirements on the backing format. One possible subtype of StringyThing, for example, might contain an array of StringyThing and encapsulate the value of the concatenation of all those strings. Two instances of StringyThing would be defined as equal if conversion to strings would yield identical results, and comparison between two otherwise-indistinguishable StringyThing instances whose types knew nothing about each other may have to fall back on that, but StringyThing-derived types could include code to optimize various cases. For example, if one StringyThing represents "M repetitions of character ch" and another represents "N repetitions of the string St", and the latter type knows about the first, it could check whether St contains nothing but M/N repetitions of the character ch. Such a check would indicate whether or not the strings are equal, without having to "expand out" either one of them.
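A rough sketch of that idea (the class and method names here are made up purely to make the contract concrete):
abstract class StringyThing {
    // The canonical expanded value; equality is defined purely in terms of it.
    abstract String asString();

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof StringyThing)) return false;
        // Subtypes that know about each other may short-circuit with cheaper checks,
        // but the common fallback is: equal iff the expanded strings are equal.
        return asString().equals(((StringyThing) obj).asString());
    }

    @Override
    public int hashCode() {
        return asString().hashCode();
    }
}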

LinkedHashSet: hashCode() and equals() match, but contains() doesn't

How is the following possible:
void contains(LinkedHashSet data, Object arg) {
System.out.println(data.getClass()); // java.util.LinkedHashSet
System.out.println(arg.hashCode() == data.iterator().next().hashCode()); // true
System.out.println(arg.equals(data.iterator().next())); // true
System.out.println(new ArrayList(data).contains(arg)); // true
System.out.println(new HashSet(data).contains(arg)); // true
System.out.println(new LinkedHashSet(data).contains(arg)); // true (!)
System.out.println(data.contains(arg)); // false
}
Am I doing something wrong?
Obviously, it doesn't always happen (if you create a trivial set of Objects, you won't reproduce it). But it does always happen in my case with a more complicated class for arg.
EDIT: The main reason why I don't define arg here is that it's a fairly big class, with an Eclipse-generated hashCode that spans 20 lines and an equals twice as long. And I don't think it's relevant, as long as they're equal for the two objects.
When you build your own objects, and plan to use them in a collection you should always override the following methods:
boolean equals(Object o);
int hashCode();
The default implementation of equals checks whether the objects point to the same object, while you'd probably want to redefine it to check the contents.
As much as is reasonably practical, the hashCode method defined by class Object returns distinct integers for distinct objects. To respect the contract, the hashCode of an object that is equal to another must be the same, so you also have to redefine hashCode.
EDIT: I was expecting a faulty hashCode or equals implementation, but since your answer, you revealed that you're mutating the keys after they are added to a HashSet or HashMap.
When you add an Object to a hash-based collection, its hashCode is computed and used to map it to a physical location in the Collection.
If some fields used to compute the hashCode are changed, the hashCode itself will change, so the HashSet implementation will become confused. When it tries to get the Object it will look at another physical location, and won't find the Object. The Object will still be present if you enumerate the set, though.
For this reason, always make HashMap and HashSet keys immutable.
Got it. Once you know it, the answer is so obvious you can only blush in embarrassment.
static class MyObj {
    String s = "";

    @Override
    public int hashCode() {
        return s.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        return ((MyObj) obj).s.equals(s);
    }
}

public static void main(String[] args) {
    LinkedHashSet set = new LinkedHashSet();
    MyObj obj = new MyObj();
    set.add(obj);
    obj.s = "a-ha!"; // mutates a field used by hashCode() after insertion
    contains(set, obj);
}
That is enough to reliably reproduce it.
Explanation: Thou Shalt Never Mutate Fields Used For hashCode()!
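If the field really must change, one workaround (a sketch based on the snippet above) is to take the object out of the set before mutating it and re-insert it afterwards, so the set re-hashes it:
set.remove(obj); // remove while obj still hashes to its old bucket
obj.s = "a-ha!";
set.add(obj);    // re-inserted under the new hash code
System.out.println(set.contains(obj)); // true again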
There seems to be something missing from your question. I have made some guesses:
private void testContains() {
LinkedHashSet set = new LinkedHashSet();
String hello = "Hello!";
set.add(hello);
contains(set, hello);
}
void contains(LinkedHashSet data, Object arg) {
System.out.println(data.getClass()); // java.util.LinkedHashSet
System.out.println(arg.hashCode() == data.iterator().next().hashCode()); // true
System.out.println(arg.equals(data.iterator().next())); // true
System.out.println(new ArrayList(data).contains(arg)); // true
System.out.println(new HashSet(data).contains(arg)); // true
System.out.println(new LinkedHashSet(data).contains(arg)); // true (!)
System.out.println(data.contains(arg)); // true (!!)
}
EDITED: To keep track of changing question!
I still get "true" for ALL but the first output. Please be more specific about the type of the "arg" parameter.

In JVM heap can there be more than one object with the same hash code?

As per the title, can there be more than one object on the heap with the same hash code?
Yes.
public class MyObject {
    @Override
    public int hashCode() {
        return 42;
    }

    public static void main(String[] args) {
        MyObject obj1 = new MyObject();
        MyObject obj2 = new MyObject(); // Ta-da!
    }
}
For a less flippant answer, consider the hashCode Javadocs:
The general contract of hashCode is:
... (snip) ...
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Yes, since you can have as many objects with the same hashCode as you want. For example, the following code (using new String to sidestep string interning) shows this:
String foo = new String("dfa");
String bar = new String("dfa");
assert foo != bar;                       // passes: two distinct objects (references)
assert foo.hashCode() == bar.hashCode(); // passes: equal contents give equal hash codes
Trivial proof:
hashCode returns a 32-bit integer.
Allocate 2^32 + 1 Objects. (Probably going to need a 64-bit VM and a lot of RAM for this! ;-) )
Now, by the pigeonhole principle, your hashCode method, no matter how clever, must produce a collision.
About hash codes: yes, they are nearly, but not really, unique. :) How nearly unique they are depends on the implementation.
But if we talk about the JVM, we must first of all say which kind of hash code we mean.
If you mean the result of the hashCode() method, which is used e.g. by HashMap, then the answer is: it depends on your implementation and on the number of objects in your JVM. It is up to your own hashCode() implementation, and the collection using it, to resolve such collisions.
If you mean the result of System.identityHashCode(obj), then it's a little bit different. That implementation doesn't call your hashCode() method. And it isn't unique either, only nearly unique, like many other hash functions. :)
public class MyObject {
    @Override
    public int hashCode() {
        return 42;
    }

    public static void main(String[] args) {
        MyObject obj1 = new MyObject();
        MyObject obj2 = new MyObject(); // Ta-da!
        final int obj1Hash = System.identityHashCode(obj1);
        final int obj2Hash = System.identityHashCode(obj2);
        if (obj1Hash == obj2Hash) throw new IllegalStateException();
    }
}
In this example you will get different identity hashes; in most cases they are different, but they are not guaranteed to be unique...
Best regards!
Of course, and obviously you can write:
class MyClass{
...
public int hashCode(){
return 1;
}
...
}
in which case all instances of MyClass will have the same hash code.
Yes. hashCode is a standard mechanism that tries to avoid duplicates ("collisions") but doesn't guarantee it.
Moreover, it's overridable, so you could write your own implementation yielding the same hash code for every object; as to why you would want to do that, I have no answer. :-)
You can, but it's not generally a good idea. The example mentioned several times above:
public int hashCode(){
return 1;
}
is perfectly valid under the specification for hashCode(). However, doing this effectively turns a HashMap into a linked list, which significantly degrades performance. So you generally want to implement hashCode to return values that are as unique as you can get them.
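For example, here is a quick sketch (class name made up) where every key lands in the same bucket, so each lookup degrades to a scan of that one bucket:
import java.util.HashMap;
import java.util.Map;

class ConstantHash {
    private final int id;
    ConstantHash(int id) { this.id = id; }

    @Override
    public int hashCode() { return 1; } // every instance collides

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHash && ((ConstantHash) o).id == id;
    }

    public static void main(String[] args) {
        Map<ConstantHash, String> map = new HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            map.put(new ConstantHash(i), "value"); // all entries share one bucket
        }
        // every get() now has to search that single bucket instead of jumping straight to an entry
        System.out.println(map.get(new ConstantHash(50_000)));
    }
}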
As a practical matter, though, collisions can occur with many implementations. Take this for example:
public class OrderedPair {
    private int x;
    private int y;

    @Override
    public int hashCode() {
        int prime = 31;
        int result = x;
        result = result * prime + y;
        return result;
    }

    @Override
    public boolean equals(Object o) {...}
}
This is a pretty standard implementation of hashCode() (in fact, it is pretty close to what IDEA and Eclipse generate automatically), but it can have many collisions: x=1,y=0 and x=0,y=31 collide, for starters. The idea of a well-written hashCode() implementation is to have few enough collisions that your performance is not unduly affected.
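A quick check of one such collision (assuming an (x, y) constructor for OrderedPair):
OrderedPair a = new OrderedPair(1, 0);  // 31*1 + 0 = 31
OrderedPair b = new OrderedPair(0, 31); // 31*0 + 31 = 31
System.out.println(a.hashCode() == b.hashCode()); // true, even though a.equals(b) is false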
Yes, you certainly can have more than one object with the same hash code. However, this usually doesn't cause problems, because the java.util.* data structures that use the object's hash code use it to pick a "bucket" that stores all objects with the same hash.
Object a = new Integer(1);
Object b = new Integer(1);
System.out.printf(" Is the same object? = %s\n",(a == b ));
System.out.printf(" Have the same hashCode? = %s\n",( a.hashCode() == b.hashCode() ));
Prints:
Is the same object? = false
Have the same hashCode? = true
In a 32-bit environment, I doubt any JVM would return the same 'identity hash code' for different objects, but in a 64-bit one that is certainly a possibility; the likelihood of collision is still very small given the limited memory we have now.
