mutable fields for objects in a Java Set - java

Am I correct in assuming that if you have an object that is contained inside a Java Set<> (or as a key in a Map<> for that matter), any fields that are used to determine identity or relation (via hashCode(), equals(), compareTo() etc.) cannot be changed without causing unspecified behavior for operations on the collection? (edit: as alluded to in this other question)
(In other words, these fields should either be immutable, or you should require the object to be removed from the collection, then changed, then reinserted.)
The reason I ask is that I was reading the Hibernate Annotations reference guide and it has an example where there is a HashSet<Toy> but the Toy class has fields name and serial that are mutable and are also used in the hashCode() calculation... a red flag went off in my head and I just wanted to make sure I understood the implications of it.

The javadoc for Set says
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
This simply means you can use mutable objects in a set, and even change them. You just should make sure the change doesn't impact the way the Set finds the items. For HashSet, that would require not changing the fields used for calculating hashCode().

That is correct: it can cause problems locating the entry. Officially the behavior is unspecified, so once an object has been added to a HashSet or used as a key in a HashMap, you should not change the fields that feed equals() or hashCode().
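The "remove, change, reinsert" approach mentioned in the question could look roughly like the sketch below. The Toy class here is a cut-down, hypothetical stand-in (only a name field), and rename is an illustrative helper, not part of any library.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class SafeUpdateDemo {
    // Hypothetical element type: equals() and hashCode() both depend on the mutable 'name'.
    static final class Toy {
        private String name;

        Toy(String name) { this.name = name; }

        void setName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Toy && Objects.equals(name, ((Toy) o).name);
        }

        @Override
        public int hashCode() { return Objects.hashCode(name); }
    }

    // Removes the element while its old hash code is still valid,
    // mutates it, then adds it back under the new hash code.
    static void rename(Set<Toy> toys, Toy toy, String newName) {
        boolean wasPresent = toys.remove(toy);
        toy.setName(newName);
        if (wasPresent) {
            toys.add(toy);
        }
    }

    public static void main(String[] args) {
        Set<Toy> toys = new HashSet<>();
        Toy toy = new Toy("Fire engine");
        toys.add(toy);
        rename(toys, toy, "Fast truck");
        System.out.println(toys.contains(toy)); // true
        System.out.println(toys.size());        // 1
    }
}
```

One thing to watch: if the set already holds another element equal to the renamed one, add() returns false and the renamed object is silently left out, so a real helper may want to check that return value.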

Yes, that will cause bad things to happen.
// Given that the Toy class has a mutable field called 'name' which is used
// in equals() and hashCode():
Set<Toy> toys = new HashSet<Toy>();
Toy toy = new Toy("Fire engine", ToyType.WHEELED_VEHICLE, Color.RED);
toys.add(toy);
System.out.println(toys.contains(toy)); // true
toy.setName("Fast truck");
System.out.println(toys.contains(toy)); // false

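In a HashSet/HashMap, you could mutate a contained object in a way that changes the result of its compareTo() method -- relative comparison isn't normally used to locate objects there. (One caveat: since Java 8, HashMap may use compareTo() to order Comparable keys inside heavily colliding buckets, so even this is not entirely safe.) But it would be fatal inside a TreeSet/TreeMap.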
You can also mutate objects that are inside an IdentityHashMap -- nothing other than object identity is used to locate contents.
Even though you can do these things with these qualifications, they make your code more fragile. What if someone wants to change to a TreeSet later, or add that mutable field to the hashCode/equality test?
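To make the TreeSet and IdentityHashMap points concrete, here is a small sketch. Item and its rank field are made up for illustration; rank drives the ordering as well as equals() and hashCode(). The TreeSet result depends on where the element sits in the tree, so the outcome shown is for this particular insertion order.

```java
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.TreeSet;

class SortedMutationDemo {
    // Hypothetical element type: 'rank' is mutable and is used for ordering, equals() and hashCode().
    static final class Item {
        int rank;

        Item(int rank) { this.rank = rank; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ((Item) o).rank == rank;
        }

        @Override
        public int hashCode() { return Integer.hashCode(rank); }
    }

    static boolean treeSetFindsAfterMutation() {
        TreeSet<Item> items = new TreeSet<>(Comparator.comparingInt((Item i) -> i.rank));
        Item a = new Item(1);
        Item b = new Item(2);
        Item c = new Item(3);
        items.add(b);
        items.add(a);
        items.add(c);
        a.rank = 10; // 'a' now sorts after 'c' but still sits in its old tree position
        return items.contains(a); // the search walks to the right and never reaches 'a'
    }

    static boolean identityMapFindsAfterMutation() {
        Map<Item, String> labels = new IdentityHashMap<>();
        Item a = new Item(1);
        labels.put(a, "first");
        a.rank = 10; // irrelevant to IdentityHashMap, which only looks at the reference
        return labels.containsKey(a);
    }

    public static void main(String[] args) {
        System.out.println(treeSetFindsAfterMutation());     // false
        System.out.println(identityMapFindsAfterMutation()); // true
    }
}
```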

Related

Altering hashCode of object inside of HashSet / HashMap

I am relatively new to Java and am puzzled about the following: I usually add objects to an ArrayList before setting their content, i.e.,
List<Bla> list = new ArrayList<>();
Bla bla = new Bla();
list.add(bla);
bla.setContent(); // content influences hashCode
This approach works great. I am concerned whether it will give me trouble when used with HashSets or HashMaps. The position in the internal hash table is determined at the time the object is added. What will happen if setContent() gets called after the object was added to a HashSet or HashMap (and its hashCode changes)?
Should I fully set the (hashCode influencing) content before adding / putting into HashSets or HashMaps? Is it generally recommended to finish building objects before adding them?
Thank you very much for your insights.
What will happen if setContent() gets called after the object was added to HashSet or HashMap (and its hashCode changes)?
Disaster.
Should I fully set the (hashCode influencing) content before adding / putting into HashSets or HashMaps? Is it generally recommended to finish building objects before adding them?
Yes.
The relevant line of documentation is on java.util.Set:
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
Generally speaking, this sort of error will manifest itself with elements being both "in" and "not in" your collection, with different methods disagreeing. You may get lucky and your elements may appear to still be in the collection, or they may not; this may happen essentially at random.
This is one of the many, many reasons why it's excellent practice for most of your objects to be immutable -- completely impossible to modify in the first place after construction.
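A minimal sketch of that "in and not in" effect follows. Bla is a stand-in with a single int field (the original setContent() takes no arguments, so the parameter is an assumption). The behavior is officially unspecified; the comments show what a typical OpenJDK HashSet does with these particular values.

```java
import java.util.HashSet;
import java.util.Set;

class StaleHashDemo {
    // Hypothetical stand-in for the question's Bla: 'content' feeds equals() and hashCode().
    static final class Bla {
        private int content;

        void setContent(int content) { this.content = content; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Bla && ((Bla) o).content == content;
        }

        @Override
        public int hashCode() { return Integer.hashCode(content); }
    }

    public static void main(String[] args) {
        Set<Bla> set = new HashSet<>();
        Bla bla = new Bla();
        set.add(bla);       // stored under the hash code for content == 0
        bla.setContent(42); // hash code changes while the object is in the set

        System.out.println(set.contains(bla));            // false: lookup uses the new hash code
        System.out.println(set.iterator().next() == bla); // true: iteration still reaches it
        System.out.println(set.remove(bla));              // false: it cannot be found to remove it
        System.out.println(set.add(bla));                 // true: the same object is added again
        System.out.println(set.size());                   // 2
    }
}
```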

Can/should one write a Comparator consistent with Object's equals method

I have an object, Foo which inherits the default equals method from Object, and I don't want to override this because reference equality is the identity relation that I would like to use.
I now have a specific situation in which I would like to compare these objects according to a specific field. I'd like to write a comparator, FooValueComparator, to perform this comparison. However, if my FooValueComparator returns 0 whenever two objects have the same value for this particular field, then it is incompatible with the equals method inherited from Object, along with all the problems that entails.
What I would like to do would be to have FooValueComparator compare the two objects first on their field value, and then on their references. Is this possible? What pitfalls might that entail (eg. memory locations being changed causing the relative order of Foo objects to change)?
The reason I would like my comparator to be compatible with equals is because I would like to have the option of applying it to SortedSet collections of Foo objects. I don't want a SortedSet to reject a Foo that I try to add just because it already contains a different object having the same value.
This is described in the documentation of Comparator:
The ordering imposed by a comparator c on a set of elements S is said to be consistent with equals if and only if c.compare(e1, e2)==0 has the same boolean value as e1.equals(e2) for every e1 and e2 in S.
Caution should be exercised when using a comparator capable of imposing an ordering inconsistent with equals to order a sorted set (or sorted map). Suppose a sorted set (or sorted map) with an explicit comparator c is used with elements (or keys) drawn from a set S. If the ordering imposed by c on S is inconsistent with equals, the sorted set (or sorted map) will behave "strangely." In particular the sorted set (or sorted map) will violate the general contract for set (or map), which is defined in terms of equals.
In short, if the implementation of Comparator is not consistent with the equals method, then you should know what you're doing and you're responsible for the side effects of this design, but there is no obligation to make the implementation consistent with Object#equals. Still, take into account that consistency is preferable, so as not to confuse future coders who will maintain the system. A similar concept applies when implementing Comparable.
An example of this in the JDK may be found in BigDecimal#compareTo, which explicitly states in javadoc that this method is not consistent with BigDecimal#equals.
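The BigDecimal case is easy to reproduce. The sketch below (class and method names are illustrative) adds 1.0 and 1.00 to a HashSet, which relies on equals(), and to a TreeSet, which relies on compareTo():

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalOrderingDemo {
    // Adds two numerically equal values with different scales and reports how many the set kept.
    static int distinctCount(Set<BigDecimal> target) {
        target.add(new BigDecimal("1.0"));
        target.add(new BigDecimal("1.00"));
        return target.size();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));                    // false: scales differ
        System.out.println(a.compareTo(b));                 // 0: numerically equal
        System.out.println(distinctCount(new HashSet<>())); // 2
        System.out.println(distinctCount(new TreeSet<>())); // 1
    }
}
```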
If your intention is to use a SortedSet<YourClass>, then you're probably using the wrong approach. I would recommend using a SortedMap<TypeOfYourField, Collection<YourClass>> (or SortedMap<TypeOfYourField, YourClass>, in case there are no equal elements for the same key) instead. It may be more work, but it gives you more control over the data stored in and retrieved from the structure.
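A rough sketch of the SortedMap suggestion, assuming a Foo with an int field named value (both the field and the group helper are made up for illustration). Foo keeps the equals() it inherits from Object, and objects sharing a value are all retained in the same bucket:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class GroupedByValueDemo {
    // Hypothetical Foo: equals() is inherited from Object, so identity is by reference.
    static final class Foo {
        final int value;

        Foo(int value) { this.value = value; }
    }

    // Groups the objects by field value; keys are kept in ascending order by the TreeMap.
    static SortedMap<Integer, List<Foo>> group(List<Foo> foos) {
        SortedMap<Integer, List<Foo>> byValue = new TreeMap<>();
        for (Foo foo : foos) {
            byValue.computeIfAbsent(foo.value, k -> new ArrayList<>()).add(foo);
        }
        return byValue;
    }

    public static void main(String[] args) {
        SortedMap<Integer, List<Foo>> byValue =
                group(Arrays.asList(new Foo(7), new Foo(3), new Foo(7)));
        System.out.println(byValue.firstKey());    // 3
        System.out.println(byValue.get(7).size()); // 2: both objects with value 7 are kept
    }
}
```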
You may have several comparators for a given class, e.g. one per field. In that case they cannot all agree with equals, so the answer is: not necessarily. You should make them consistent, however, if your collection is a sorted structure (a sorted map or sorted set) and the comparator is used to determine an element's position in that collection.
See documentation for details.

java hashing objects

I'd like to be able to determine whether I've encountered an object before - I have a graph implementation and I want to see if I've created a cycle, probably by iterating through the Node objects with Floyd's tortoise-and-hare algorithm.
But I want to avoid a linear search through my list of "seen" nodes each time. This would be great if I had a hash table for just keys. Can I somehow hash an object? Aren't java objects just references to places in memory anyway? I wonder how much of a problem collisions would be if so..
The simple answer is to create a HashSet and add each node to the set the first time you encounter it.
The only case where this won't work is if you've overridden hashCode() and equals(Object) for the node class to implement equality based on node contents (or whatever). Then you'll need to:
use the IdentityHashMap class, which uses == and System.identityHashCode rather than equals(Object) and hashCode(), or
build a hash table yourself using your own flavour of object identity.
Aren't java objects just references to places in memory anyway?
Yes and no. Yes, the reference is represented by a memory address (on most JVMs). The problem is that 1) you can't get hold of the address, and 2) it can change when the GC relocates the object. This means that you can't use the object address as a hashcode.
The identityHashCode method deals with this by returning a value that is initially based on the memory address. If you then call identityHashCode again for the same object, you are guaranteed to get the same value as before ... even if the object has been relocated.
I wonder how much of a problem collisions would be if so..
The hash values produced by the identityHashCode method can collide. (That is, two distinct objects can have the same identity hashcode value.) Anything that uses these values has to deal with this. (The standard HashSet and IdentityHashMap classes take care of these collisions ... if you choose to use them.)
I'd like to be able to determine whether I've encountered an object before
Use an IdentityHashMap. It is ideal for this job, since it compares keys with == rather than equals().
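As a sketch of how the "seen" set might look, the JDK's Collections.newSetFromMap can turn an IdentityHashMap into an identity-based Set. The Node class below is hypothetical (one outgoing edge, and an equals() based on its label, to show that content equality is ignored); a real graph would walk all neighbours.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class SeenNodesDemo {
    // Hypothetical node with a single outgoing edge and content-based equality.
    static final class Node {
        final String label;
        Node next;

        Node(String label) { this.label = label; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).label.equals(label);
        }

        @Override
        public int hashCode() { return label.hashCode(); }
    }

    // Follows 'next' links and reports whether the same node object is reached twice.
    static boolean hasCycle(Node start) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Node n = start; n != null; n = n.next) {
            if (!seen.add(n)) {
                return true; // add() returns false when this exact object was already seen
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node a = new Node("x");
        Node b = new Node("x"); // equal to 'a' by label, but a different object
        a.next = b;
        System.out.println(hasCycle(a)); // false
        b.next = a;
        System.out.println(hasCycle(a)); // true
    }
}
```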
Take a look at HashSet. Note that in order for objects to work with HashSet, they need to provide correct implementations of hashCode and equals methods of the java.lang.Object class.
You'll need to implement a hash function for your objects. This is done by overriding hashCode() defined in java.lang.Object. This method is used by HashMap, HashSet etc to store objects. In hashCode() it's up to you to calculate a hash for the object. Don't forget to also implement the equals()-method!
Take a look at Java collection framework (http://docs.oracle.com/javase/tutorial/collections/)

Is there a Java Collection (or similar) that behaves like an auto-id SQL table?

Note that I'm not actually doing anything with a database here, so ORM tools are probably not what I'm looking for.
I want to have some containers that each hold a number of objects, with all objects in one container being of the same class. The container should show some of the behaviour of a database table, namely:
allow one of the object's fields to be used as a unique key, i. e. other objects that have the same value in that field are not added to the container.
upon accepting a new object, the container should issue a numeric id that is returned to the caller of the insertion method.
Instead of throwing an error when a "duplicate entry" is being requested, the container should just skip insertion and return the key of the already existing object.
Now, I would write a generic container class that accepts objects which implement an interface to get the value of the key field and use a HashMap keyed with those values as the actual storage class. Is there a better approach using existing built-in classes? I was looking through HashSet and the like, but they didn't seem to fit.
None of the Collections classes will do what you need. You'll have to write your own!
P.S. You'll also need to decide whether your class will be thread-safe or not.
P.P.S. ConcurrentHashMap is close, but not exactly the same. If you can subclass or wrap it or wrap the objects that enter your map such that you're relying only on that class for thread-safety, you'll have an efficient and thread-safe implementation.
You can simulate this behavior with a HashSet. If the objects you're adding to the collection have a field that you can use as a unique ID, just have that field returned by the object's hashCode() method (or use a calculated hash code value, either way should work).
HashSet won't throw an error when you add a duplicate entry, it just returns false. You could wrap (or extend) HashSet so that your add method returns the unique ID that you want as a return value.
I was thinking you could do it with ArrayList, using the current location in the array as the "id", but that doesn't prevent you from making an insert at an existing location, plus when you insert at that location, it will move everything up. But you might base your own class on ArrayList, returning the current value of .size() after a .add.
Is there a reason why the object's hash code couldn't be used as a "numeric id"?
If not, then all you'd need to do is wrap the call into a ConcurrentHashMap, return the object's hashCode and use the putIfAbsent(K key, V value) method to ensure you don't add duplicates.
putIfAbsent also returns the existing value, so you could get its hashCode to return to your user.
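For comparison, here is a sketch of the hand-written container the question describes. It swaps in a list index as the numeric id instead of hashCode(), since two different objects can share a hash code. AutoIdTable, insert and get are made-up names, and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class AutoIdTable<K, V> {
    private final Function<V, K> keyOf;                       // extracts the unique key field
    private final Map<K, Integer> idsByKey = new HashMap<>(); // key -> issued id
    private final List<V> rows = new ArrayList<>();           // id -> stored object

    AutoIdTable(Function<V, K> keyOf) {
        this.keyOf = keyOf;
    }

    // Returns the id of the stored object: a fresh id for a new key,
    // or the existing id when an object with the same key is already present.
    int insert(V value) {
        K key = keyOf.apply(value);
        Integer existing = idsByKey.get(key);
        if (existing != null) {
            return existing;
        }
        rows.add(value);
        int id = rows.size() - 1;
        idsByKey.put(key, id);
        return id;
    }

    V get(int id) {
        return rows.get(id);
    }

    public static void main(String[] args) {
        // Illustrative use: names are unique ignoring case.
        AutoIdTable<String, String> names =
                new AutoIdTable<>(s -> s.toLowerCase(Locale.ROOT));
        System.out.println(names.insert("Alice")); // 0
        System.out.println(names.insert("Bob"));   // 1
        System.out.println(names.insert("ALICE")); // 0: duplicate key, existing id returned
        System.out.println(names.get(0));          // Alice
    }
}
```

Ids are never reused here because nothing is ever removed; supporting deletion would need a separate counter rather than the list size.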
See ConcurrentHashMap
