There have been some discussions here about JPA entities and which hashCode()/equals() implementation should be used for JPA entity classes. Most (if not all) of them depend on Hibernate, but I'd like to discuss them JPA-implementation-neutrally (I am using EclipseLink, by the way).
All possible implementations have their own advantages and disadvantages regarding:
hashCode()/equals() contract conformity (immutability) for List/Set operations
Whether identical objects (e.g. from different sessions, dynamic proxies from lazily-loaded data structures) can be detected
Whether entities behave correctly in detached (or non-persisted) state
As far as I can see, there are three options:
Do not override them; rely on Object.equals() and Object.hashCode()
hashCode()/equals() work
cannot identify identical objects, problems with dynamic proxies
no problems with detached entities
Override them, based on the primary key
hashCode()/equals() are broken
correct identity (for all managed entities)
problems with detached entities
Override them, based on the Business-Id (non-primary key fields; what about foreign keys?)
hashCode()/equals() are broken
correct identity (for all managed entities)
no problems with detached entities
My questions are:
Did I miss an option and/or pro/con point?
What option did you choose and why?
UPDATE 1:
By "hashCode()/equals() are broken", I mean that successive hashCode() invocations may return differing values, which is (when correctly implemented) not broken in the sense of the Object API documentation, but which causes problems when trying to retrieve a changed entity from a Map, Set or other hash-based Collection. Consequently, JPA implementations (at least EclipseLink) will not work correctly in some cases.
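A minimal sketch of the problem described in UPDATE 1, in plain Java without a persistence provider. The Customer class and the manual setId call (standing in for the provider assigning a generated key) are illustrative assumptions, not code from any JPA implementation:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for an entity whose hashCode() depends on a mutable field.
class Customer {
    private Long id; // null until "persisted"

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return id != null && id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, different once id is set
    }
}

class MutableHashDemo {
    // Reports whether the set still finds the entity after its id has changed.
    static boolean foundAfterIdAssigned() {
        Set<Customer> customers = new HashSet<>();
        Customer c = new Customer();
        customers.add(c);  // filed under the hash code computed from the null id
        c.setId(42L);      // simulates the provider assigning a generated key
        return customers.contains(c);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned());
    }
}
```

The lookup fails even though the very same object is in the set, because HashSet searches under the new hash code while the element is still stored under the old one.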
UPDATE 2:
Thank you for your answers -- most of them have remarkable quality.
Unfortunately, I am still unsure which approach will be the best for a real-life application, or how to determine the best approach for my application. So, I'll keep the question open and hope for some more discussions and/or opinions.
Read this very nice article on the subject: Don't Let Hibernate Steal Your Identity.
The conclusion of the article goes like this:
Object identity is deceptively hard to implement correctly when
objects are persisted to a database. However, the problems stem
entirely from allowing objects to exist without an id before they are
saved. We can solve these problems by taking the responsibility of
assigning object IDs away from object-relational mapping frameworks
such as Hibernate. Instead, object IDs can be assigned as soon as the
object is instantiated. This makes object identity simple and
error-free, and reduces the amount of code needed in the domain model.
I always override equals/hashcode and implement it based on the business id. Seems the most reasonable solution for me. See the following link.
To sum all this stuff up, here is a listing of what will work or won't work with the different ways to handle equals/hashCode:
EDIT:
To explain why this works for me:
I don't usually use hash-based collections (HashMap/HashSet) in my JPA applications. If I must, I prefer to create a UniqueList solution.
I think changing the business id at runtime is not a best practice for any database application. In the rare cases where there is no other solution, I'd give it special treatment, such as removing the element and putting it back into the hash-based collection.
For my model, I set the business id in the constructor and don't provide setters for it. I let the JPA implementation change the field instead of the property.
The UUID solution seems to be overkill. Why use a UUID if you have a natural business id? I would, after all, enforce the uniqueness of the business id in the database. Why have THREE indexes for each table in the database then?
I have personally used all three of these strategies in different projects, and I must say that option 1 is, in my opinion, the most practicable in a real-life app. In my experience, breaking hashCode()/equals() conformity leads to many crazy bugs, because you will repeatedly end up in situations where the result of equality changes after an entity has been added to a collection.
But there are further options (also with their pros and cons):
a) hashCode/equals based on a set of immutable, non-null, constructor-assigned fields
(+) all three criteria are guaranteed
(-) field values must be available to create a new instance
(-) complicates handling if you must change one of them
b) hashCode/equals based on a primary key that is assigned by the application (in the constructor) instead of JPA
(+) all three criteria are guaranteed
(-) you cannot take advantage of simple, reliable ID generation strategies like DB sequences
(-) complicated if new entities are created in a distributed environment (client/server) or app server cluster
c) hashCode/equals based on a UUID assigned by the constructor of the entity
(+) all three criteria are guaranteed
(-) overhead of UUID generation
(-) there may be a small risk that the same UUID is used twice, depending on the algorithm used (this can be detected by a unique index in the DB)
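As a rough illustration of option (a), here is a sketch in plain Java without JPA annotations. The Country class and its isoCode field are hypothetical. A real entity would additionally need a no-argument constructor for the provider, and since the JPA specification does not allow persistent fields to be final, immutability is approximated here by simply not offering a setter:

```java
import java.util.Objects;

// Hypothetical entity keyed by a non-null, constructor-assigned field that never changes.
class Country {
    private String isoCode;     // no setter: treated as immutable after construction
    private String displayName; // mutable, deliberately excluded from equality

    Country(String isoCode, String displayName) {
        this.isoCode = Objects.requireNonNull(isoCode, "isoCode");
        this.displayName = displayName;
    }

    String getIsoCode() { return isoCode; }
    String getDisplayName() { return displayName; }
    void setDisplayName(String displayName) { this.displayName = displayName; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Country)) return false;
        return isoCode.equals(((Country) o).isoCode);
    }

    @Override
    public int hashCode() {
        return isoCode.hashCode(); // stable for the whole lifetime of the object
    }
}
```

Changing displayName has no effect on equality or the hash code, so an instance stays findable in a HashSet in transient, managed, and detached state alike.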
If you want to use equals()/hashCode() for your Sets, in the sense that the same entity can only be in there once, then there is only one option: Option 2. That's because a primary key for an entity by definition never changes (if somebody indeed updates it, it's not the same entity anymore)
You should take that literally: Since your equals()/hashCode() are based on the primary key, you must not use these methods, until the primary key is set. So you shouldn't put entities in the set, until they're assigned a primary key. (Yes, UUIDs and similar concepts may help to assign primary keys early.)
Now, it's theoretically also possible to achieve that with Option 3, even though so-called "business-keys" have the nasty drawback that they can change: "All you'll have to do is delete the already inserted entities from the set(s), and re-insert them." That is true - but it also means, that in a distributed system, you'll have to make sure, that this is done absolutely everywhere the data has been inserted to (and you'll have to make sure, that the update is performed, before other things occur). You'll need a sophisticated update mechanism, especially if some remote systems aren't currently reachable...
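The "delete and re-insert" step quoted above could look roughly like this for a single local set. This is only a sketch: Product, its sku business key, and the updateInSet helper are hypothetical names, and it does nothing about the distributed case, which is the hard part:

```java
import java.util.Objects;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical entity with a mutable business key.
class Product {
    private String sku;

    Product(String sku) { this.sku = sku; }

    void setSku(String sku) { this.sku = sku; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Product && Objects.equals(sku, ((Product) o).sku);
    }

    @Override
    public int hashCode() { return Objects.hashCode(sku); }
}

class Rehash {
    // Removes the element while its old hash code is still valid, applies the
    // change, then re-inserts it so it is filed under its new hash code.
    static <T> void updateInSet(Set<T> set, T element, Consumer<T> change) {
        boolean wasPresent = set.remove(element);
        change.accept(element);
        if (wasPresent) {
            set.add(element);
        }
    }
}
```

A call such as Rehash.updateInSet(products, p, x -> x.setSku("B-2")) keeps one set consistent, but every other set holding p would need the same treatment.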
Option 1 can only be used, if all the objects in your sets are from the same Hibernate session. The Hibernate documentation makes this very clear in chapter 13.1.3. Considering object identity:
Within a Session the application can safely use == to compare objects.
However, an application that uses == outside of a Session might produce unexpected results. This might occur even in some unexpected places. For example, if you put two detached instances into the same Set, both might have the same database identity (i.e., they represent the same row). JVM identity, however, is by definition not guaranteed for instances in a detached state. The developer has to override the equals() and hashCode() methods in persistent classes and implement their own notion of object equality.
It continues to argue in favor of Option 3:
There is one caveat: never use the database identifier to implement equality. Use a business key that is a combination of unique, usually immutable, attributes. The database identifier will change if a transient object is made persistent. If the transient instance (usually together with detached instances) is held in a Set, changing the hashcode breaks the contract of the Set.
This is true, if you
cannot assign the id early (e.g. by using UUIDs)
and yet you absolutely want to put your objects in sets while they're in transient state.
Otherwise, you're free to choose Option 2.
Then it mentions the need for a relative stability:
Attributes for business keys do not have to be as stable as database primary keys; you only have to guarantee stability as long as the objects are in the same Set.
This is correct. The practical problem I see with this is: If you can't guarantee absolute stability, how will you be able to guarantee stability "as long as the objects are in the same Set". I can imagine some special cases (like using sets only for a conversation and then throwing it away), but I would question the general practicability of this.
Short version:
Option 1 can only be used with objects within a single session.
If you can, use Option 2. (Assign PK as early as possible, because you can't use the objects in sets until the PK is assigned.)
If you can guarantee relative stability, you can use Option 3. But be careful with this.
We usually have two IDs in our entities:
One is for the persistence layer only (so that the persistence provider and database can figure out relationships between objects).
The other is for our application needs (equals() and hashCode() in particular).
Take a look:
@Entity
public class User {
@Id
private int id; // Persistence ID
private UUID uuid; // Business ID
// assuming all fields are subject to change
// If we forbid users change their email or screenName we can use these
// fields for business ID instead, but generally that's not the case
private String screenName;
private String email;
// I don't put UUID generation in constructor for performance reasons.
// I call setUuid() when I create a new entity
public User() {
}
// This method is only called when a brand new entity is added to
// persistence context - I add it as a safety net only but it might work
// for you. In some cases (say, when I add this entity to some set before
// calling em.persist()) setting a UUID might be too late. If I get a log
// output it means that I forgot to call setUuid() somewhere.
@PrePersist
public void ensureUuid() {
if (getUuid() == null) {
log.warn(format("User's UUID wasn't set on time. "
+ "uuid: %s, name: %s, email: %s",
getUuid(), getScreenName(), getEmail()));
setUuid(UUID.randomUUID());
}
}
// equals() and hashCode() rely on non-changing data only. Thus we
// guarantee that no matter how field values are changed we won't
// lose our entity in hash-based Sets.
@Override
public int hashCode() {
return getUuid().hashCode();
}
// Note that I don't use direct field access inside my entity classes and
// call getters instead. That's because Persistence provider (PP) might
// want to load entity data lazily. And I don't use
// this.getClass() == other.getClass()
// for the same reason. In order to support laziness PP might need to wrap
// my entity object in some kind of proxy, i.e. subclassing it.
@Override
public boolean equals(final Object obj) {
if (this == obj)
return true;
if (!(obj instanceof User))
return false;
return getUuid().equals(((User) obj).getUuid());
}
// Getters and setters follow
}
EDIT: to clarify my point regarding calls to setUuid() method. Here's a typical scenario:
User user = new User();
// user.setUuid(UUID.randomUUID()); // I should have called it here
user.setScreenName("Master Yoda");
user.setEmail("yoda@jedicouncil.org");
jediSet.add(user); // here's bug - we forgot to set UUID and
//we won't find Yoda in Jedi set
em.persist(user); // ensureUuid() was called and printed the log for me.
jediCouncilSet.add(user); // Ok, we got a UUID now
When I run my tests and see the log output I fix the problem:
User user = new User();
user.setUuid(UUID.randomUUID());
Alternatively, one can provide a separate constructor:
@Entity
public class User {
@Id
private int id; // Persistence ID
private UUID uuid; // Business ID
... // fields
// Constructor for Persistence provider to use
public User() {
}
// Constructor I use when creating new entities
public User(UUID uuid) {
setUuid(uuid);
}
... // rest of the entity.
}
So my example would look like this:
User user = new User(UUID.randomUUID());
...
jediSet.add(user); // no bug this time
em.persist(user); // and no log output
I use a default constructor and a setter, but you may find two-constructors approach more suitable for you.
If you have a business key, then you should use that for equals and hashCode.
If you don't have a business key, you should not leave it with the default Object equals and hashCode implementations, because that does not work after you merge an entity.
You can use the entity identifier in the equals method only if the hashCode implementation returns a constant value, like this:
@Entity
public class Book implements Identifiable<Long> {
@Id
@GeneratedValue
private Long id;
private String title;
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Book)) return false;
Book book = (Book) o;
return getId() != null && Objects.equals(getId(), book.getId());
}
@Override
public int hashCode() {
return getClass().hashCode();
}
//Getters and setters omitted for brevity
}
Check out this test case on GitHub that proves this solution works like a charm.
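To see why a constant hashCode() keeps the entity findable, here is a minimal sketch in plain Java. There is no JPA involved; the Article class and the setId call standing in for persist are assumptions for illustration only:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity: equality by id, hash code fixed per class.
class Article {
    private Long id; // null until the simulated persist

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Article)) return false;
        Article other = (Article) o;
        return id != null && Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        return getClass().hashCode(); // does not depend on id, so it never changes
    }
}

class ConstantHashDemo {
    // Reports whether the set still finds the entity after its id was assigned.
    static boolean foundAfterIdAssigned() {
        Set<Article> articles = new HashSet<>();
        Article a = new Article();
        articles.add(a);
        a.setId(1L); // simulates the provider assigning the generated id
        return articles.contains(a);
    }
}
```

The trade-off is that all instances of the class share one hash code, so they all land in the same bucket and lookups in large hash-based collections get slower.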
Although using a business key (option 3) is the most commonly recommended approach (Hibernate community wiki, "Java Persistence with Hibernate" p. 398), and this is what we mostly use, there's a Hibernate bug which breaks this for eager-fetched sets: HHH-3799. In this case, Hibernate can add an entity to a set before its fields are initialized. I'm not sure why this bug hasn't gotten more attention, as it really makes the recommended business-key approach problematic.
I think the heart of the matter is that equals and hashCode should be based on immutable state (reference Odersky et al.), and a Hibernate entity with Hibernate-managed primary key has no such immutable state. The primary key is modified by Hibernate when a transient object becomes persistent. The business key is also modified by Hibernate, when it hydrates an object in the process of being initialized.
That leaves only option 1, inheriting the java.lang.Object implementations based on object identity, or using an application-managed primary key as suggested by James Brundege in "Don't Let Hibernate Steal Your Identity" (already referenced by Stijn Geukens's answer) and by Lance Arlaus in "Object Generation: A Better Approach to Hibernate Integration".
The biggest problem with option 1 is that detached instances can't be compared with persistent instances using .equals(). But that's OK; the contract of equals and hashCode leaves it up to the developer to decide what equality means for each class. So just let equals and hashCode inherit from Object. If you need to compare a detached instance to a persistent instance, you can create a new method explicitly for that purpose, perhaps boolean sameEntity or boolean dbEquivalent or boolean businessEquals.
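A sketch of what such an explicit comparison method might look like. The Account class is hypothetical and has no JPA annotations; equals() and hashCode() are intentionally left as inherited from Object:

```java
import java.util.Objects;

// Hypothetical entity that keeps Object's identity-based equals()/hashCode().
class Account {
    private Long id; // null while the entity is transient

    Account(Long id) { this.id = id; }

    Long getId() { return id; }

    // Explicit "same database row" check, kept separate from equals().
    boolean sameEntity(Account other) {
        return other != null
                && id != null
                && Objects.equals(id, other.getId());
    }
}
```

Two instances loaded in different sessions are then unequal for collections, while sameEntity() still answers the question of whether they represent the same row.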
I agree with Andrew's answer. We do the same thing in our application but instead of storing UUIDs as VARCHAR/CHAR, we split it into two long values. See UUID.getLeastSignificantBits() and UUID.getMostSignificantBits().
One more thing to consider is that calls to UUID.randomUUID() are pretty slow, so you might want to look into lazily generating the UUID only when needed, such as during persistence or on calls to equals()/hashCode():
@MappedSuperclass
public abstract class AbstractJpaEntity extends AbstractMutable implements Identifiable, Modifiable {
private static final long serialVersionUID = 1L;
@Version
@Column(name = "version", nullable = false)
private int version = 0;
@Column(name = "uuid_least_sig_bits")
private long uuidLeastSigBits = 0;
@Column(name = "uuid_most_sig_bits")
private long uuidMostSigBits = 0;
private transient int hashCode = 0;
public AbstractJpaEntity() {
//
}
public abstract Integer getId();
public abstract void setId(final Integer id);
public boolean isPersisted() {
return getId() != null;
}
public int getVersion() {
return version;
}
//calling UUID.randomUUID() is pretty expensive,
//so this is to lazily initialize uuid bits.
private void initUUID() {
final UUID uuid = UUID.randomUUID();
uuidLeastSigBits = uuid.getLeastSignificantBits();
uuidMostSigBits = uuid.getMostSignificantBits();
}
public long getUuidLeastSigBits() {
//its safe to assume uuidMostSigBits of a valid UUID is never zero
if (uuidMostSigBits == 0) {
initUUID();
}
return uuidLeastSigBits;
}
public long getUuidMostSigBits() {
//its safe to assume uuidMostSigBits of a valid UUID is never zero
if (uuidMostSigBits == 0) {
initUUID();
}
return uuidMostSigBits;
}
public UUID getUuid() {
return new UUID(getUuidMostSigBits(), getUuidLeastSigBits());
}
@Override
public int hashCode() {
if (hashCode == 0) {
hashCode = (int) (getUuidMostSigBits() >> 32 ^ getUuidMostSigBits() ^ getUuidLeastSigBits() >> 32 ^ getUuidLeastSigBits());
}
return hashCode;
}
@Override
public boolean equals(final Object obj) {
if (obj == null) {
return false;
}
if (!(obj instanceof AbstractJpaEntity)) {
return false;
}
//UUID guarantees a pretty good uniqueness factor across distributed systems, so we can safely
//dismiss getClass().equals(obj.getClass()) here since the chance of two different objects (even
//if they have different types) having the same UUID is astronomical
final AbstractJpaEntity entity = (AbstractJpaEntity) obj;
return getUuidMostSigBits() == entity.getUuidMostSigBits() && getUuidLeastSigBits() == entity.getUuidLeastSigBits();
}
@PrePersist
public void prePersist() {
// make sure the uuid is set before persisting
getUuidLeastSigBits();
}
}
Jakarta Persistence 3.0, section 4.12 writes:
Two entities of the same abstract schema type are equal if and only if they have the same primary key value.
I see no reason why Java code should behave differently.
If the entity class is in a so called "transient" state, i.e. it's not yet persisted and it has no identifier, then the hashCode/equals methods can not return a value, they ought to blow up, ideally implicitly with a NullPointerException when the method attempts to traverse the ID. Either way, this will effectively stop application code from putting a non-managed entity into a hash-based data structure. In fact, why not go one step further and blow up if the class and identifier are equal, but other important attributes such as the version are unequal (IllegalStateException)! Fail-fast in a deterministic way is always the preferred option.
Word of caution: Also document the blowing-up behavior. Documentation is important in and of itself, but it will hopefully also stop junior developers from doing something stupid with your code in the future (they have this tendency to suppress the NullPointerException where it happened, and side effects are the last thing on their mind lol).
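A sketch of the fail-fast idea in plain Java. The Invoice class is hypothetical, and as a variation on the implicit NullPointerException mentioned above it throws an explicit IllegalStateException with a message, which is an assumption of this example rather than a requirement:

```java
// Hypothetical entity whose equals()/hashCode() refuse to work without an id.
class Invoice {
    private final Long id;     // null while the entity is transient
    private final int version; // stands in for an optimistic-locking version

    Invoice(Long id, int version) {
        this.id = id;
        this.version = version;
    }

    private Long requireId() {
        if (id == null) {
            throw new IllegalStateException(
                    "Invoice has no id yet; do not use it in equals()/hashCode()");
        }
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Invoice other = (Invoice) o;
        if (!requireId().equals(other.requireId())) return false;
        if (version != other.version) {
            // Same row, different versions: one of the copies is stale.
            throw new IllegalStateException("Same id but different versions: " + id);
        }
        return true;
    }

    @Override
    public int hashCode() {
        return requireId().hashCode();
    }
}
```

Adding a transient Invoice to a HashSet then fails immediately at the add() call instead of silently misbehaving later.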
Oh, and always use getClass() instead of instanceof. The equals-method requires symmetry. If b is equal to a, then a must be equal to b. With subclasses, instanceof breaks this relationship (a is not instance of b).
Although I personally always use getClass() even when implementing non-entity classes (the type is state, and so a subclass adds state even if the subclass is empty or only contains behavior), instanceof would've been fine only if the class is final. But entity classes must not be final (§2.1) so we're really out of options here.
Some folks may not like getClass(), because of the persistence provider's proxy wrapping the object. This might have been a problem in the past, but it really shouldn't be. A provider not returning different proxy classes for different entities, well, I'd say that's not a very smart provider lol. Generally, we shouldn't solve a problem until there is a problem. And, it seems like Hibernate's own documentation doesn't even see it worthwhile mentioning. In fact, they elegantly use getClass() in their own examples (see this).
Lastly, if one has an entity subclass that is an entity, and the inheritance mapping strategy used is not the default ("single table"), but configured to be a "joined subtype", then the primary key in that subclass table will be the same as in the superclass table. If the mapping strategy is "table per concrete class", then the primary key may be the same as in the superclass. An entity subclass is very likely to be adding state and is therefore just as likely to be logically a different thing. But an equals implementation using instanceof cannot then rely on the ID alone, because, as we saw, the ID may be the same for different entities.
In my opinion, instanceof has no place at all in a non-final Java class, ever. This is especially true for persistent entities.
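One common way the symmetry problem shows up, sketched with two hypothetical classes. Note the assumption: the asymmetry appears here because the subclass overrides equals() with its own instanceof check. If the subclass simply inherited equals() unchanged, both directions would give the same answer.

```java
import java.util.Objects;

// Hypothetical base class using instanceof in equals().
class Person {
    protected final Long id;

    Person(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false; // accepts any subclass
        return Objects.equals(id, ((Person) o).id);
    }

    @Override
    public int hashCode() { return Objects.hashCode(id); }
}

// Hypothetical subclass that adds state and tightens equals().
class Employee extends Person {
    private final String badge;

    Employee(Long id, String badge) {
        super(id);
        this.badge = badge;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false; // rejects a plain Person
        return super.equals(o) && Objects.equals(badge, ((Employee) o).badge);
    }
}
```

With Person p = new Person(1L) and Employee e = new Employee(1L, "B-7"), p.equals(e) is true while e.equals(p) is false. A getClass() comparison in both classes would make both calls return false.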
There are obviously already very informative answers here but I will tell you what we do.
We do nothing (ie do not override).
If we do need equals/hashcode to work for collections we use UUIDs.
You just create the UUID in the constructor. We use http://wiki.fasterxml.com/JugHome for UUID. UUID is a little more expensive CPU wise but is cheap compared to serialization and db access.
Please consider the following approach based on predefined type identifier and the ID.
The specific assumptions for JPA:
entities of the same "type" and the same non-null ID are considered equal
non-persisted entities (assuming no ID) are never equal to other entities
The abstract entity:
@MappedSuperclass
public abstract class AbstractPersistable<K extends Serializable> {
@Id @GeneratedValue
private K id;
@Transient
private final String kind;
public AbstractPersistable(final String kind) {
this.kind = requireNonNull(kind, "Entity kind cannot be null");
}
@Override
public final boolean equals(final Object obj) {
if (this == obj) return true;
if (!(obj instanceof AbstractPersistable)) return false;
final AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
return null != this.id
&& Objects.equals(this.id, that.id)
&& Objects.equals(this.kind, that.kind);
}
@Override
public final int hashCode() {
return Objects.hash(kind, id);
}
public K getId() {
return id;
}
protected void setId(final K id) {
this.id = id;
}
}
Concrete entity example:
static class Foo extends AbstractPersistable<Long> {
public Foo() {
super("Foo");
}
}
Test example:
@Test
public void test_EqualsAndHashcode_GivenSubclass() {
// Check contract
EqualsVerifier.forClass(Foo.class)
.suppress(Warning.NONFINAL_FIELDS, Warning.TRANSIENT_FIELDS)
.withOnlyTheseFields("id", "kind")
.withNonnullFields("id", "kind")
.verify();
// Ensure new objects are not equal
assertNotEquals(new Foo(), new Foo());
}
Main advantages here:
simplicity
ensures subclasses provide type identity
predicted behavior with proxied classes
Disadvantages:
Requires each entity to call super()
Notes:
Needs attention when using inheritance. E.g. instance equality of class A and class B extends A may depend on concrete details of the application.
Ideally, use a business key as the ID
Looking forward to your comments.
I have always used option 1 in the past because I was aware of these discussions and thought it was better to do nothing until I knew the right thing to do. Those systems are all still running successfully.
However, next time I may try option 2 - using the database generated Id.
Hashcode and equals will throw IllegalStateException if the id is not set.
This will prevent subtle errors involving unsaved entities from appearing unexpectedly.
What do people think of this approach?
The business-key approach doesn't suit us. We use a DB-generated ID plus a temporary transient tempId, and override equals()/hashCode() to solve the dilemma. All entities are descendants of Entity. Pros:
No extra fields in the DB
No extra coding in descendant entities; one approach for all
No performance issues (like with UUID); DB Id generation
No problems with HashMaps (no need to keep the use of equals() etc. in mind)
The hashCode of a new entity doesn't change over time, even after persisting
Cons:
There may be problems with serializing and deserializing non-persisted entities
The hashCode of a saved entity may change after reloading from the DB
Non-persisted objects are always considered different (maybe this is right?)
What else?
Look at our code:
@MappedSuperclass
abstract public class Entity implements Serializable {
@Id
@GeneratedValue
@Column(nullable = false, updatable = false)
protected Long id;
@Transient
private Long tempId;
public void setId(Long id) {
this.id = id;
}
public Long getId() {
return id;
}
private void setTempId(Long tempId) {
this.tempId = tempId;
}
// Fix Id on first call from equal() or hashCode()
private Long getTempId() {
if (tempId == null)
// if we have id already, use it, else use 0
setTempId(getId() == null ? 0 : getId());
return tempId;
}
@Override
public boolean equals(Object obj) {
if (super.equals(obj))
return true;
// take proxied object into account
if (obj == null || !Hibernate.getClass(obj).equals(this.getClass()))
return false;
Entity o = (Entity) obj;
return getTempId() != 0 && o.getTempId() != 0 && getTempId().equals(o.getTempId());
}
// hash doesn't change in time
@Override
public int hashCode() {
return getTempId() == 0 ? super.hashCode() : getTempId().hashCode();
}
}
IMO you have 3 options for implementing equals/hashCode
Use an application generated identity i.e. a UUID
Implement it based on a business key
Implement it based on the primary key
Using an application generated identity is the easiest approach, but comes with a few downsides
Joins are slower when using it as PK because 128 bits is simply bigger than 32 or 64 bits
"Debugging is harder" because checking with your own eyes whether some data is correct is pretty hard
If you can work with these downsides, just use this approach.
To overcome the join issue one could be using the UUID as natural key and a sequence value as primary key, but then you might still run into the equals/hashCode implementation problems in compositional child entities that have embedded ids since you will want to join based on the primary key. Using the natural key in child entities id and the primary key for referring to the parent is a good compromise.
@Entity class Parent {
@Id @GeneratedValue Long id;
@NaturalId UUID uuid;
@OneToMany(mappedBy = "parent") Set<Child> children;
// equals/hashCode based on uuid
}
@Entity class Child {
@EmbeddedId ChildId id;
@ManyToOne Parent parent;
@Embeddable class ChildId {
UUID parentUuid;
UUID childUuid;
// equals/hashCode based on parentUuid and childUuid
}
// equals/hashCode based on id
}
IMO this is the cleanest approach as it will avoid all downsides and at the same time provide you a value(the UUID) that you can share with external systems without exposing system internals.
Implementing it based on a business key, if you can expect one from the user, is a nice idea, but it comes with a few downsides as well
Most of the time this business key will be some kind of code that the user provides and less often a composite of multiple attributes.
Joins are slower because joining based on variable length text is simply slow. Some DBMS might even have problems creating an index if the key exceeds a certain length.
In my experience, business keys tend to change which will require cascading updates to objects referring to it. This is impossible if external systems refer to it
IMO you shouldn't implement or work with a business key exclusively. It's a nice add-on i.e. users can quickly search by that business key, but the system shouldn't rely on it for operating.
Implementing it based on the primary key has its problems, but maybe it's not such a big deal
If you need to expose ids to external system, use the UUID approach I suggested. If you don't, you could still use the UUID approach but you don't have to.
The problem of using a DBMS generated id in equals/hashCode stems from the fact that the object might have been added to hash based collections before assigning the id.
The obvious way to get around this is to simply not add the object to hash based collections before assigning the id. I understand that this is not always possible because you might want deduplication before assigning the id already. To still be able to use the hash based collections, you simply have to rebuild the collections after assigning the id.
You could do something like this:
@Entity class Parent {
@Id @GeneratedValue Long id;
@OneToMany(mappedBy = "parent") Set<Child> children;
// equals/hashCode based on id
}
@Entity class Child {
@EmbeddedId ChildId id;
@ManyToOne Parent parent;
@PrePersist void prePersist() {
parent.children.remove(this);
}
@PostPersist void postPersist() {
parent.children.add(this);
}
@Embeddable class ChildId {
Long parentId;
@GeneratedValue Long childId;
// equals/hashCode based on parentId and childId
}
// equals/hashCode based on id
}
I haven't tested the exact approach myself, so I'm not sure how changing collections in pre- and post-persist events works but the idea is:
Temporarily Remove the object from hash based collections
Persist it
Re-add the object to the hash based collections
Another way of solving this is to simply rebuild all your hash based models after an update/persist.
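Rebuilding could be as simple as the following sketch for a plain java.util.HashSet. The HashRebuild helper and the Item class are hypothetical, and I have not checked how provider-managed collection wrappers react to this:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity with id-based equals()/hashCode().
class Item {
    private Long id; // null until the simulated persist

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Item)) return false;
        return id != null && Objects.equals(id, ((Item) o).id);
    }

    @Override
    public int hashCode() { return Objects.hashCode(id); }
}

class HashRebuild {
    // Re-inserts every element so that each one is filed under its current hash code.
    static <T> void rebuild(Set<T> set) {
        Set<T> copy = new HashSet<>(set); // copying iterates, it does not look elements up
        set.clear();
        set.addAll(copy);
    }
}
```

Calling HashRebuild.rebuild(parent.children) after the ids have been assigned makes contains() and remove() work again, at the cost of rehashing the whole set.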
In the end, it's up to you. I personally use the sequence based approach most of the time and only use the UUID approach if I need to expose an identifier to external systems.
I tried to answer this question myself and was never totally happy with the solutions I found until I read this post, and especially Drew's answer. I liked the way he lazily created the UUID and stored it optimally.
But I wanted to add even more flexibility, i.e. lazily create the UUID ONLY when hashCode()/equals() is accessed before the entity is first persisted, while keeping each solution's advantages:
equals() means "object refers to the same logical entity"
use the database ID as much as possible, because why would I do the work twice (performance concern)
prevent problems when accessing hashCode()/equals() on a not-yet-persisted entity, and keep the same behaviour after it is indeed persisted
I would really appreciate feedback on my mixed solution below
public class MyEntity {
@Id()
@Column(name = "ID", length = 20, nullable = false, unique = true)
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id = null;
@Transient private UUID uuid = null;
@Column(name = "UUID_MOST", nullable = true, unique = false, updatable = false)
private Long uuidMostSignificantBits = null;
@Column(name = "UUID_LEAST", nullable = true, unique = false, updatable = false)
private Long uuidLeastSignificantBits = null;
@Override
public final int hashCode() {
return this.getUuid().hashCode();
}
@Override
public final boolean equals(Object toBeCompared) {
if(this == toBeCompared) {
return true;
}
if(toBeCompared == null) {
return false;
}
if(!this.getClass().isInstance(toBeCompared)) {
return false;
}
return this.getUuid().equals(((MyEntity)toBeCompared).getUuid());
}
public final UUID getUuid() {
// UUID already accessed on this physical object
if(this.uuid != null) {
return this.uuid;
}
// UUID one day generated on this entity before it was persisted
if(this.uuidMostSignificantBits != null) {
this.uuid = new UUID(this.uuidMostSignificantBits, this.uuidLeastSignificantBits);
// UUID never generated on this entity before it was persisted
} else if(this.getId() != null) {
this.uuid = new UUID(this.getId(), this.getId());
// UUID never accessed on this not yet persisted entity
} else {
this.setUuid(UUID.randomUUID());
}
return this.uuid;
}
private void setUuid(UUID uuid) {
if(uuid == null) {
return;
}
// For the one hypothetical case where a generated UUID could collide with a UUID built from IDs
if(uuid.getMostSignificantBits() == uuid.getLeastSignificantBits()) {
// Unchecked exception, because this method declares no throws clause
throw new IllegalArgumentException("UUID: " + uuid + " format is only for internal use");
}
this.uuidMostSignificantBits = uuid.getMostSignificantBits();
this.uuidLeastSignificantBits = uuid.getLeastSignificantBits();
this.uuid = uuid;
}
}
This is a common problem in every IT system that uses Java and JPA. The pain point extends beyond implementing equals() and hashCode(); it affects how an organization refers to an entity and how its clients refer to the same entity. I've seen enough pain from not having a business key that I wrote my own blog post to express my view.
In short: use a short, human readable, sequential ID with meaningful prefixes as business key that's generated without any dependency on any storage other than RAM. Twitter's Snowflake is a very good example.
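To make "prefixed, sequential, RAM-only" a bit more concrete, here is a deliberately naive sketch. PrefixedIdGenerator is a hypothetical name, and this is not Snowflake's algorithm: Snowflake combines a timestamp, a worker ID, and a sequence number so that IDs stay unique across machines and restarts, whereas this counter is only unique within one running JVM.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical generator for short, human-readable business keys such as "ORD-1".
class PrefixedIdGenerator {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong(); // lives in RAM only

    PrefixedIdGenerator(String prefix) {
        this.prefix = Objects.requireNonNull(prefix, "prefix");
    }

    // Thread-safe: each call returns the next number in the sequence.
    String next() {
        return prefix + "-" + counter.incrementAndGet();
    }
}
```

An entity could take such a key in its constructor and base equals()/hashCode() on it, which gives it a stable identity before it is ever persisted.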
I use a class EntityBase that all my JPA entities inherit from, and this works very well for me.
/**
* @author marcos.oliveira
*/
@MappedSuperclass
public abstract class EntityBase<TId extends Serializable> implements Serializable{
/**
*
*/
private static final long serialVersionUID = 1L;
@Id
@Column(name = "id", unique = true, nullable = false)
@GeneratedValue(strategy = GenerationType.IDENTITY)
protected TId id;
public TId getId() {
return this.id;
}
public void setId(TId id) {
this.id = id;
}
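@Override
public int hashCode() {
// Depends only on the id so that it stays consistent with equals(); mixing in super.hashCode()
// (the identity hash) would give two equal entities different hash codes.
return Objects.hashCode(getId());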
}
@Override
public String toString() {
return super.toString() + " [Id=" + id + "]";
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null || getClass() != obj.getClass()) {
return false;
}
EntityBase<?> entity = (EntityBase<?>) obj;
if (entity.id == null || id == null) {
return false;
}
return Objects.equals(id, entity.id);
}
}
Reference: https://thorben-janssen.com/ultimate-guide-to-implementing-equals-and-hashcode-with-hibernate/
If UUID is the answer for many people, why don't we just use factory methods from business layer to create the entities and assign primary key at creation time?
for example:
@ManagedBean
public class MyCarFacade {
@PersistenceContext
private EntityManager em; // assumed to be injected by the container
public Car createCar(){
Car car = new Car();
em.persist(car);
return car;
}
}
This way we would get a default primary key for the entity from the persistence provider, and our hashCode() and equals() functions could rely on that.
We could also declare the Car's constructors protected and then use reflection in our business method to access them. This way developers would not be tempted to instantiate Car with new, but would go through the factory method.
How about that?
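A variation on this idea, sketched here without any JPA dependency: the factory assigns a UUID itself instead of waiting for the persistence provider, so the key exists before the entity is ever persisted. The class below is a hypothetical stand-in for Car, and the private constructor replaces the protected-plus-reflection trick.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical entity: the identifier is assigned in the factory, before any persistence call.
final class Car {
    private final UUID id;

    private Car(UUID id) {              // not reachable with `new` from outside this class
        this.id = Objects.requireNonNull(id);
    }

    static Car create() {               // stands in for MyCarFacade.createCar()
        return new Car(UUID.randomUUID());
    }

    UUID getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Car)) return false;
        return id.equals(((Car) o).id); // id is never null, so no transient special case
    }

    @Override
    public int hashCode() {
        return id.hashCode();           // stable for the whole lifetime of the object
    }
}
```

Since the id never changes after construction, a Car can be put into a HashSet before it is persisted and still be found afterwards.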
In practice, it seems that Option 2 (primary key) is the most frequently used.
A natural and IMMUTABLE business key is rare, and creating and maintaining synthetic keys is too heavy a solution for situations that will probably never happen.
Have a look at the spring-data-jpa AbstractPersistable implementation (the only caveat: for Hibernate, use Hibernate.getClass).
public boolean equals(Object obj) {
if (null == obj) {
return false;
}
if (this == obj) {
return true;
}
if (!getClass().equals(ClassUtils.getUserClass(obj))) {
return false;
}
AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
return null == this.getId() ? false : this.getId().equals(that.getId());
}
@Override
public int hashCode() {
int hashCode = 17;
hashCode += null == getId() ? 0 : getId().hashCode() * 31;
return hashCode;
}
Just be careful when manipulating new objects in a HashSet/HashMap.
In contrast, Option 1 (keeping the Object implementation) breaks right after a merge, which is a very common situation.
If you have no business key and have a REAL need to manipulate new entities in a hash structure, override hashCode to return a constant, as Vlad Mihalcea advises below.
Below is a simple (and tested) solution for Scala.
Note that this solution does not fit into any of the 3 categories
given in the question.
All my Entities are subclasses of the UUIDEntity so I follow the
don't-repeat-yourself (DRY) principle.
If needed the UUID generation can be made more precise (by using more
pseudo-random numbers).
Scala Code:
import javax.persistence._
import scala.util.Random
@Entity
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
abstract class UUIDEntity {
@Id @GeneratedValue(strategy = GenerationType.TABLE)
var id:java.lang.Long=null
var uuid:java.lang.Long=Random.nextLong()
override def equals(o:Any):Boolean=
o match{
case o : UUIDEntity => o.uuid==uuid
case _ => false
}
override def hashCode() = uuid.hashCode()
}
I am really too confused with the equals() and hashCode() methods after reading lots of documentation and articles. Mainly, there are different kind of examples and usages that makes me too confused.
So, could you clarify me about the following points?
1. If there is not any unique field in an entity (except from id field) then should we use getClass() method or only id field in the equals() method as shown below?
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (getClass() != o.getClass()) return false;
// code omitted
}
2. If there is a unique key e.g. private String isbn;, then should we use only this field? Or should we combine it with getClass() as shown below?
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (getClass() != o.getClass()) return false;
Book book = (Book) o;
return Objects.equals(isbn, book.isbn); // == would compare String references, not contents
}
3. What about NaturalId? As far as I understood, it is used for unique fields e.g. private String isbn;. What is the purpose of its usage? Is it related to equals() and hashCode() methods?
It all boils down to what your class actually represents, what is its identity and when should the JVM consider two objects as actually the same. The context in which the class is used determines its behavior (in this case - equality to another object).
By default Java considers two given objects "the same" only if they are actually the same instance of a class (comparison using ==). While it makes sense in case of strictly technical verification, Java applications are usually used to represent a business domain, where multiple objects may be constructed, but they should still be considered the same. An example of that could be a book (as in your question). But what does it mean that a book is the same as another?
See - it depends.
When you ask someone if they read a certain book, you give them a title and the author, they try to "match" it against the books they've read and see if any of them is equal to the criteria you provided. So equals in this case would be checking if the title and the author of a given book are the same as the other's. Simple.
Now imagine that you're a Tolkien fan. If you were Polish (like me), you could have multiple "Lord of the Rings" translations available to read, but (as a fan) you would know about some translators that went a bit too far and you would like to avoid them. The title and the author are not enough, you're looking for a book with a certain ISBN identifier that will let you find a certain edition of the book. Since ISBN also contains information about the title and the author, it's not required to use them in the equals method in this case.
The third (and final) book-related example is related to a library. Both situations described above could easily happen at a library, but from the librarian's point of view books are also another thing: an "item". Each book in the library (it's just an assumption, I've never worked with such a system) has its own identifier, which can be completely separate from the ISBN (but could also be an ISBN plus something extra). When you return a book in the library it's the library identifier that matters and it should be used in this case.
To sum up: a Book as an abstraction does not have a single "equality definition". It depends on the context. Let's say we create such set of classes (most likely in more than one context):
Book
BookEdition
BookItem
BookOrder (not yet in the library)
Book and BookEdition are more of a value object, while BookItem and BookOrder are entities. Value objects are represented only by their values and even though they do not have an identifier, they can be equal to other ones. Entities on the other hand can include values or can even consist of value objects (e.g. BookItem could contain a BookEdition field next to its libraryId field), but they have an identifier which defines whether they are the same as another (even if their values change). Books are not a good example here (unless we imagine reassigning a library identifier to another book), but a user that changed their username is still the same user - identified by their ID.
In regard to checking the class of the object passed to the equals method - it is highly advised (yet not enforced by the compiler in any way) to verify if the object is of given type before casting it to avoid a ClassCastException. To do that instanceof or getClass() should be used. If the object fulfills the requirement of being of an expected type you can cast it (e.g. Book other = (Book) object;) and only then can you access the properties of the book (libraryId, isbn, title, author) - an object of type Object doesn't have such fields or accessors to them.
You're not explicitly asking about that in your question, but using instanceof and getClass() can be similarly unclear. A rule of thumb would be: use getClass() as it helps to avoid problems with symmetry.
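The symmetry problem mentioned above can be reproduced with a small, self-contained sketch. Point and ColorPoint are hypothetical classes used only for illustration; both use instanceof, and the subclass adds a field to its comparison.

```java
// Hypothetical base class whose equals() accepts any subclass instance.
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;      // a ColorPoint passes this check
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical subclass that also compares its own field.
class ColorPoint extends Point {
    final String color;

    ColorPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint)) return false; // a plain Point fails this check
        return super.equals(o) && color.equals(((ColorPoint) o).color);
    }
}
```

With `Point p = new Point(1, 2)` and `ColorPoint cp = new ColorPoint(1, 2, "red")`, `p.equals(cp)` is true while `cp.equals(p)` is false, which breaks symmetry. Comparing `getClass()` in both classes makes both calls return false.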
Natural IDs can vary depending on a context. In case of a BookEdition an ISBN is a natural ID, but in case of just a Book it would be a pair of the title and the author (as a separate class). You can read more about the concept of natural ID in Hibernate in the docs.
It is important to understand that if you have a table in the database, it can be mapped to different types of objects in a more complex domain. ORM tools should help us with management and mapping of data, but the objects defined as data representation are (or rather: usually should be) a different layer of abstraction than the domain model.
Yet if you were forced to use, for example, the BookItem as your data-modeling class, libraryId could probably be an ID in the database context, but isbn would not be a natural ID, since it does not uniquely identify the BookItem. If BookEdition was the data-modeling class, it could contain an ID autogenerated by the database (ID in the database context) and an ISBN, which in this case would be the natural ID as it uniquely identifies a BookEdition in the book editions context.
To avoid such problems and make the code more flexible and descriptive, I'd suggest treating data as data and domain as domain, which is related to domain-driven design. A natural ID (as a concept) is present only on the domain level of the code as it can vary and evolve and you can still use the same database table to map the data into those various objects, depending on the business context.
Here's a code snippet with the classes described above and a class representing a table row from the database.
Data model (might be managed by an ORM like Hibernate):
// database table representation (represents data, is not a domain object)
// getters and hashCode() omitted in all classes for simplicity
class BookRow {
private long id;
private String isbn;
private String title;
// author should be a separate table joined by FK - done this way for simplification
private String authorName;
private String authorSurname;
// could have other fields as well - e.g. date of addition to the library
private Timestamp addedDate;
@Override
public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
BookRow book = (BookRow) object;
// id identifies the ORM entity (a row in the database table represented as a Java object)
return id == book.id;
}
}
Domain model:
// getters and hashCode() omitted in all classes for simplicity
class Book {
private String title;
private String author;
@Override
public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
Book book = (Book) object;
// title and author identify the book
return title.equals(book.title)
&& author.equals(book.author);
}
static Book fromDatabaseRow(BookRow bookRow) {
var book = new Book();
// BookRow's fields are private, so go through its (omitted) getters
book.title = bookRow.getTitle();
book.author = bookRow.getAuthorName() + " " + bookRow.getAuthorSurname();
return book;
}
}
class BookEdition {
private String title;
private String author;
private String isbn;
@Override
public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
BookEdition book = (BookEdition) object;
// isbn identifies the book edition
return isbn.equals(book.isbn);
}
static BookEdition fromDatabaseRow(BookRow bookRow) {
var edition = new BookEdition();
// BookRow's fields are private, so go through its (omitted) getters
edition.title = bookRow.getTitle();
edition.author = bookRow.getAuthorName() + " " + bookRow.getAuthorSurname();
edition.isbn = bookRow.getIsbn();
return edition;
}
}
class BookItem {
private long libraryId;
private String title;
private String author;
private String isbn;
@Override
public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
BookItem book = (BookItem) object;
// libraryId identifies the book item in the library system
return libraryId == book.libraryId;
}
static BookItem fromDatabaseRow(BookRow bookRow) {
var item = new BookItem();
// BookRow's fields are private, so go through its (omitted) getters
item.libraryId = bookRow.getId();
item.title = bookRow.getTitle();
item.author = bookRow.getAuthorName() + " " + bookRow.getAuthorSurname();
item.isbn = bookRow.getIsbn();
return item;
}
}
If there is not any unique field in an entity (except from id field) then should we use getClass() method or only id field in the
equals() method as shown below?
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (getClass() != o.getClass()) return false;
// code omitted
}
Comparing classes in an #equals implementation achieves two goals:
it makes sure that we do not compare apples with oranges (although such a comparison could be legitimate in some designs)
the code you omitted must cast Object o to some known class, otherwise we would be unable to extract the required information from it; the class check makes #equals safe, since nobody expects a ClassCastException when calling Set#add, for example. Using instanceof here is not a good idea when subclasses may override equals, because it can then violate the symmetry and transitivity requirements of the equals contract.
It is also worth noting that calling o.getClass() can cause unexpected behaviour when Object o is a proxy; some people prefer to call Hibernate.getClass(o) instead or use other tricks.
I am really too confused with the equals() and hashCode() methods
after reading lots of documentation and articles. Mainly, there are
different kind of examples and usages that makes me too confused
If there is a unique key e.g. private String isbn;, then should we use only this field? Or should we combine it with getClass() as shown below?
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (getClass() != o.getClass()) return false;
Book book = (Book) o;
return Objects.equals(isbn, book.isbn); // == would compare String references, not contents
}
That is a very controversial topic; below are some thoughts on the problem:
it is a good idea to maintain PK column for each DB table - it costs almost nothing, but simplifies a lot of things - imagine someone asked you to delete some rows and instead of delete from tbl where id=... you need to write delete from tbl where field1=... and field2=... and ...
PK's should not be composite, otherwise you might get surprised with queries like select count(distinct field1, field2) from tbl
the argument that entities get their IDs only when they are stored in the DB, and that this is why we can't rely on surrogate ids in equals and hashCode, is just wrong; yes, it is common behaviour in most JPA projects, but you always have the option to generate and assign IDs manually, some examples below:
EclipseLink UserGuide: "By default, the entities Id must be set by the application, normally before the persist is called. A @GeneratedValue can be used to have EclipseLink generate the Id value." - I believe it is clear enough that @GeneratedValue is just an extra feature and nobody prevents you from creating your own object factory.
Hibernate User Guide: "Values for simple identifiers can be assigned, which simply means that the application itself will assign the value to the identifier attribute prior to persisting the entity."
some popular persistent stores (Cassandra, MongoDB) do not have out-of-the-box auto-increment functionality, yet nobody would say those stores prevent you from implementing high-level ideas like DDD, etc.
in such discussions examples make sense, but book/author/isbn is not a good one; here is something more practical: my db contains about 1000 tables, and just 3 of them contain something similar to a natural id, so please give me a reason why I should not use surrogate ids there
it is not always possible to use natural ids even when they exist, some examples:
bank card PAN - it seems to be unique, however you must not even store it in DB (I believe SSN, VIN are also security sensitive)
no matter what anyone says, thinking that natural ids never change is too naive, surrogate ids never change
they may have a bad format: too long, case-insensitive, containing unsafe symbols, etc
it is not possible to implement soft deletes feature when we are using natural ids
PS. Vlad Mihalcea has provided an amusing implementation of hashCode:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Book))
return false;
Book other = (Book) o;
return id != null &&
id.equals(other.getId());
}
@Override
public int hashCode() {
return getClass().hashCode();
}
In regard to HBN documentation, the problem is their synthetic cases have nothing in common with the real world. Let's consider their dummy author/book model and try to extend it... Imagine I'm a publisher and I want to keep records of my authors, their books and drafts. What is the difference between book and draft? Book has isbn assigned, draft has not, but draft may one time become a book (or may not). How to keep java equals/hashCode contracts for drafts in such case?
getClass()
In regard to the usage of getClass() everything is straightforward.
Method equals() expects an argument of type Object.
It's important to ensure that you're dealing with an instance of the same class before casting and comparing attributes, otherwise you can end up with a ClassCastException. getClass() can be used for that purpose: if the objects do not belong to the same class, they are clearly not equal.
Natural Id vs Surrogate Id
When you're talking about "NaturalId" like ISBN-number of a book versus "id", I guess you refer to a natural key of a persistence entity versus surrogate key which is used in a relational database.
There are different opinions on that point, the general recommended approach (see a link to the Hibernate user-guide and other references below) is to use natural id (a set of unique properties, also called business keys) in your application and ID which entity obtains after being persisted only in the database.
You can encounter hashCode() and equals() that are implemented based on surrogate id, and making a defensive null-check to guard against the case when an entity is in transient state and its id is null. According to such implementations, a transient entity would not be equal to the entity in persistent state, having the same properties (apart from non-null id). Personally, I don't think this approach is correct.
The following code-sample has been taken from the most recent official Hibernate 6.1 User-Guide
Example 142. Natural Id equals/hashCode
@Entity(name = "Book")
public static class Book {
@Id
@GeneratedValue
private Long id;
private String title;
private String author;
@NaturalId
private String isbn;
//Getters and setters are omitted for brevity
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Book book = (Book) o;
return Objects.equals(isbn, book.isbn);
}
@Override
public int hashCode() {
return Objects.hash(isbn);
}
}
The code provided above that makes use of business-keys is denoted in the guide as a final approach in contrast to implementation based on the surrogate keys, which is called a naive implementation (see Example 139 and further).
The same reasoning for the choice ID vs Natural key has been described here:
You have to override the equals() and hashCode() methods if you
intend to put instances of persistent classes in a Set (the recommended way to represent many-valued associations) and
intend to use reattachment of detached instances
Hibernate guarantees equivalence of persistent identity (database row)
and Java identity only inside a particular session scope. So as soon
as we mix instances retrieved in different sessions, we must implement
equals() and hashCode() if we wish to have meaningful semantics for
Sets.
The most obvious way is to implement equals()/hashCode() by comparing
the identifier value of both objects. If the value is the same, both
must be the same database row, they are therefore equal (if both are
added to a Set, we will only have one element in the Set).
Unfortunately, we can't use that approach with generated identifiers!
Hibernate will only assign identifier values to objects that are
persistent, a newly created instance will not have any identifier
value! Furthermore, if an instance is unsaved and currently in a Set,
saving it will assign an identifier value to the object. If equals()
and hashCode() are based on the identifier value, the hash code would
change, breaking the contract of the Set. See the Hibernate website
for a full discussion of this problem. Note that this is not a
Hibernate issue, but normal Java semantics of object identity and
equality.
We recommend implementing equals() and hashCode() using Business key
equality.
For more information, have a look at this recent (Sep 15, 2021) article by @Vlad Mihalcea on how to improve caching query results with natural keys, The best way to map a @NaturalId business key with JPA and Hibernate, and these questions:
The JPA hashCode() / equals() dilemma
Should the id field of a JPA entity be considered in equals and hashCode?
As in the subject: can I mark an object field as unique in Java? I want to have a unique id in the class.
You can create unique IDs with a static variable which you increment on each object creation and assign to your ID variable:
private static int globalID = 0;
private int ID;
public obj()
{
globalID++;
this.ID = globalID;
}
No, that is not possible. You have to manage that in a Controller class.
Your question was a bit vague, but I am going to attempt to answer it based on the limited information.
I want to have unique id in the class.
If you are looking to specify a Unique ID for the class for persistence through a well-known framework such as Hibernate, you can do so through the HBM mapping or @Id annotation.
If you are looking to ensure that an instance of a particular Class is unique (from a runtime perspective), then you should override the .equals method and do the comparison in .equals based on the field that you are using as the "unique id of the class". If you wanted some sort of development-time indicator that particular field is your "unique ID", you could always create a custom annotation.
Here is an excellent answer from a different StackOverflow post regarding how to override .equals and .hashCode using the Apache Commons Lang library. You can use this and just modify the .equals override to compare on whatever field you are using as your "Unique ID".
When you don't care about the content of your unique field except for the fact that they have to be unique, you can use the solution by nukebauer.
But when you want to set the value manually and throw an exception when the value is already used by another instance of the class, you can keep track of the assigned values in a static set:
class MyObject {
private Integer id;
private static Set<Integer> assignedIds = new HashSet<Integer>();
public void setId(int id) throws NotUniqueException {
if (this.id == null || !this.id.equals(id)) { // id is null until first assigned
if (assignedIds.contains(id)) {
throw new NotUniqueException();
}
assignedIds.add(id);
this.id = id;
}
}
}
By the way: Note that both options are not thread-safe! That means that when two threads create an instance / set an ID at exactly the same time, they might still get the same ID. When you want to use this in a multi-threading environment, you should wrap the code in a synchronized block which synchronizes on the static variable.
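As an alternative to an explicit synchronized block, both variants can lean on java.util.concurrent. The following is a minimal sketch, not the code from either answer: the UniqueIds class and its method names are made up for illustration, and IllegalStateException stands in for the NotUniqueException used above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper covering both the auto-generated and the manually assigned case.
final class UniqueIds {
    private static final AtomicInteger NEXT = new AtomicInteger();
    private static final Set<Integer> ASSIGNED = ConcurrentHashMap.newKeySet();

    private UniqueIds() {
    }

    // Auto-generated variant: incrementAndGet() is atomic, so two threads never get the same value.
    static int next() {
        return NEXT.incrementAndGet();
    }

    // Manual variant: add() on a concurrent set is atomic and returns false if the id is taken,
    // which replaces the separate contains()/add() pair that could race.
    static void claim(int id) {
        if (!ASSIGNED.add(id)) {
            throw new IllegalStateException("id already in use: " + id);
        }
    }
}
```

Note that the two registries in this sketch are independent of each other; a real class would pick one strategy.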
I'm trying to figure out what's wrong with this approach, given my particular usage patterns:
@Entity
public class DomainObject {
@Id // + sequence generator
private Long id;
@Override
public boolean equals(Object o) {
// bunch of other checks omitted for clarity
if (id != null) {
return id.equals(((DomainObject) o).id); // type check is among the omitted checks
}
return super.equals(o);
}
@Override
public int hashCode() {
if (id != null) {
return id.hashCode();
}
return super.hashCode();
}
I've read several posts on the subject and it sounds like you don't want to use a DB-generated sequence value in equals/hashCode because they won't be set until the objects are persisted and you don't want disparate transient instances to all be equal, or the persistence layer itself might break.
But is there anything wrong with falling back to the default Object equals/hashCode (instance equality) for transient objects and then using the generated @Id when you have it?
The worst thing I can think of is, a transient object can't ever be equal to a persistent object, which is fine in my use case - the only time I'm ever putting objects in collections and want contains to work, all the objects are already persistent and all have IDs.
However, I feel like there's something else wrong in a really subtle, non-obvious way deep in the persistence layer but I can't quite figure out what.
The other options don't seem that appealing either:
doing nothing and living with instance equality (default Object.equals): works great for most of my entities, but getting tired of workarounds for the handful of cases when I want a collection with a mix of detached entities (e.g., session scope) and "live" ones from current transaction
using a business key: I have clear natural keys but they're mutable, and this would have
some of the same problems as above (hashCode stability if object changes)
using a UUID - I know that will work, but feels wrong to pollute the DB with artifacts to support java.util collections.
See also:
The JPA hashCode() / equals() dilemma
Should the id field of a JPA entity be considered in equals and hashCode?.
The javadoc of Map writes:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
Whenever an object is persisted, your implementation changes the meaning of equals. As such, any Collections containing that object need no longer work right. In particular, changing the hashcode of an object used as key in a HashMap (or contained in HashSet) is likely to result in future lookups on that Map (Set) not finding the object, and adding that object again to the Map (Set) is likely to succeed, even though under ordinary circumstances, a Map may contain at most one mapping for every given key, and a Set contain every object at most once.
Since it is common to store entities in collections (to express ToMany-associations), that flaw is likely to result in actual hard-to-find bugs.
I therefore strongly recommend against implementing hashcode based on database-generated identifiers.
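The failure mode can be shown without any persistence provider. In this minimal sketch, IdHashedEntity is a hypothetical stand-in for an entity with a generated id (simplified to hash to 0 while the id is null), and assigning the field simulates what happens on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for an entity whose id is assigned when it is persisted.
class IdHashedEntity {
    Long id; // null until "persisted"

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdHashedEntity)) return false;
        IdHashedEntity other = (IdHashedEntity) o;
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, id.hashCode() afterwards
    }
}

class LostInSetDemo {
    // Returns whether the entity can still be found after its id (and hash code) changed.
    static boolean stillFoundAfterPersist() {
        Set<IdHashedEntity> set = new HashSet<>();
        IdHashedEntity entity = new IdHashedEntity();
        set.add(entity);             // stored under hash code 0
        entity.id = 42L;             // simulates the provider assigning the generated id
        return set.contains(entity); // looked up under hash code 42
    }
}
```

Here `LostInSetDemo.stillFoundAfterPersist()` returns false even though the very same instance is still inside the set.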
If you're sure you don't ever need to add an unpersisted entity to a Set or Map key, you can use the ID to test for equality and as the hash code. But if you do this, you could enforce it by throwing an Exception for an unpersisted object:
@Entity
public class DomainObject {
@Id // + sequence generator
private Long id;
@Override
public boolean equals(Object that) {
// bunch of other checks omitted for clarity
if (id == null) {
throw new IllegalStateException("equals() before persisting");
}
if (this == that) {
return true;
}
if (that instanceof DomainObject) {
return id.equals(((DomainObject)that).id);
}
return false;
}
@Override
public int hashCode() {
if (id == null) {
throw new IllegalStateException("hashCode() before persisting");
}
return id.hashCode();
}
}
If you do this, you may see surprise exceptions, where you didn't realize you were relying on these methods on unpersisted objects. You may find this helpful in debugging. You might also find it makes your existing code unusable. Either way, you'll be clearer about how your code works.
One thing you should never do is return a constant for the hash code.
public int hashCode() { return 5; } // Don't ever do this!
Technically, it fulfills the contract, but it's a terrible implementation. Just read the javadocs for Object.hashCode(): …producing distinct integer results for unequal objects may improve the performance of hash tables. (The word "may" here is a serious understatement.)
Yes, you can! But you have to be careful that the hashCode implementation always returns the same constant value as explained in this post:
@Entity
public class Book implements Identifiable<Long> {
@Id
@GeneratedValue
private Long id;
private String title;
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Book)) return false;
Book book = (Book) o;
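// the null check keeps two distinct unsaved books (both with a null id) from being equal
return getId() != null && getId().equals(book.getId());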
}
@Override
public int hashCode() {
return getClass().hashCode();
}
//Getters and setters omitted for brevity
}
This is the only way you can ensure that equals and hashCode are consistent across all entity state transitions.
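A minimal sketch of why the constant hash code survives the transition from transient to persistent, using a hypothetical StableBook class without JPA annotations and a plain field assignment in place of persist:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity: equality by id, hash code constant per class.
class StableBook {
    Long id; // null until "persisted"

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StableBook)) return false;
        return id != null && id.equals(((StableBook) o).id);
    }

    @Override
    public int hashCode() {
        return getClass().hashCode(); // does not depend on the id
    }
}

class StableHashDemo {
    // Returns whether the book can still be found after its id was assigned.
    static boolean stillFoundAfterPersist() {
        Set<StableBook> set = new HashSet<>();
        StableBook book = new StableBook();
        set.add(book);
        book.id = 42L;             // simulates the provider assigning the generated id
        return set.contains(book); // same bucket as before, so the lookup succeeds
    }
}
```

The price is that all instances of the class land in the same hash bucket, so lookups degrade as the collection grows.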
I have a class for objects ... let's say apples.
Each apple object must have a unique identifier (id)... how do I ensure (elegantly and efficiently) that a newly created one has a unique id?
Thanks
have a static int nextId in your Apple class and increment it in your constructor.
It would probably be prudent to ensure that your incrementing code is atomic, so you can do something like this (using AtomicInteger). This will guarantee that if two objects are created at exactly the same time, they do not share the same Id.
import java.util.concurrent.atomic.AtomicInteger;
public class Apple {
static AtomicInteger nextId = new AtomicInteger();
private int id;
public Apple() {
id = nextId.incrementAndGet();
}
}
Use java.util.UUID.randomUUID()
It is not an int, but it is unique for all practical purposes (a collision between two random UUIDs is extremely unlikely):
A class that represents an immutable universally unique identifier (UUID).
If your objects are somehow managed (for example by some persistence mechanism), it is often the case that the manager generates the IDs - taking the next id from the database, for example.
Related: Jeff Atwood's article on GUIDs (UUIDs). It is database-related, though, but it's not clear from your question whether you want your objects to be persisted or not.
Have you thought about using the UUID class? You can call the randomUUID() function to create a new id every time.
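A minimal sketch of that suggestion; UuidApple is a hypothetical name chosen to avoid clashing with the Apple classes in the other answers.

```java
import java.util.UUID;

// Hypothetical variant of Apple whose id is a random (version 4) UUID.
class UuidApple {
    private final UUID id = UUID.randomUUID(); // assigned once, at construction

    UUID getId() {
        return id;
    }
}
```

No shared counter is involved, so nothing needs to be synchronized between threads.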
There is another way to get unique ID's. Instead of using an int or other data type, just make a class:
final class ID
{
@Override
public boolean equals(Object o)
{
return this==o;
}
}
public class Apple
{
final private ID id=new ID();
}
Thread safe without synchronizing!
I have a domain object called User. Properties of user include ssoId, name, email, createdBy, createdDate and userRole. Of these, ssoId must be unique in the sense no two users can have the same sso id. So my equals method checks for the sso id and returns either true or false.
@Override public boolean equals(Object o) {
if (!(o instanceof User))
return false;
return getSsoId().equals(((User) o).getSsoId());
}
What I feel is that this is an incorrect implementation, though it is correct as far as the business rules are concerned. The above implementation will return true for two objects with same sso id but with different values for say name or email or both. Should I change my equals contract to check the equality of all fields? What is your suggestion?
This is (almost) correct for "technical equality", but not for "natural equality". To achieve top technical equality, you should also test the reflexive o == this. It may happen that the object isn't persisted in DB yet and thus doesn't have a technical ID yet. E.g.
public class User {
private Long id;
@Override
public boolean equals(Object object) {
return (object instanceof User) && (id != null)
? id.equals(((User) object).id)
: (object == this);
}
@Override
public int hashCode() {
return (id != null)
? (User.class.hashCode() + id.hashCode())
: super.hashCode();
}
}
For "natural equality" you should rather compare all non-technical properties. For "real world entities" this is after all more robust (but also more expensive) than technical equality.
import java.util.Date;
import java.util.Objects;
public class User {
private String name;
private Date birth;
private int housenumber;
private long phonenumber;
@Override
public boolean equals(Object object) {
// Basic checks.
if (object == this) return true;
if (!(object instanceof User)) return false;
// Property checks.
User other = (User) object;
return Objects.equals(name, other.name)
&& Objects.equals(birth, other.birth)
&& (housenumber == other.housenumber)
&& (phonenumber == other.phonenumber);
}
@Override
public int hashCode() {
return Objects.hash(name, birth, housenumber, phonenumber);
}
}
True, that's a lot of code when there are many properties. Any decent IDE (Eclipse, NetBeans, etc.) can autogenerate equals() and hashCode() (and also toString(), getters and setters) for you. Take advantage of it. In Eclipse, right-click in the code and look at the Source (Alt+Shift+S) menu option.
See also:
JBoss: Equals and HashCode (in view of persistence)
Hibernate: Persistent Classes - implementing equals() and hashCode()
Related SO question: Overriding equals and hashCode in Java
If ssoId must be unique in your model, that implies that the values of the other fields should not differ between two User instances with the same ssoId. If you want to validate that assumption, you could do so with assertions within the equals method, provided the overhead is not an issue.
What you're doing seems fine, and you're not violating any of the rules that equals must follow.
You may still want to check other fields, not to change equals's semantics, but to detect an inconsistency in your business logic, and possibly trigger an assertion/exception.
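One way this could look, as a sketch only: equality stays based on ssoId, and an assert flags two instances that share an ssoId but disagree on another field. The reduced field list (just ssoId and email) and the constructor are assumptions made to keep the example short.

```java
import java.util.Objects;

// Simplified, hypothetical version of the User class from the question.
final class User {
    private final String ssoId;
    private final String email;

    User(String ssoId, String email) {
        this.ssoId = Objects.requireNonNull(ssoId);
        this.email = email;
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof User)) return false;
        User other = (User) o;
        boolean same = ssoId.equals(other.ssoId);
        // Consistency check only; evaluated when assertions are enabled (java -ea).
        assert !same || Objects.equals(email, other.email)
                : "same ssoId but different email: " + ssoId;
        return same;
    }

    @Override
    public int hashCode() {
        // Must use the same field as equals().
        return ssoId.hashCode();
    }
}
```

Because assertions are disabled by default, the check costs nothing in production unless the JVM is started with -ea; throwing an explicit exception instead would make it unconditional.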
This is a tricky decision to make.
This is a spot I got into when considering hashing a few months ago. I would suggest you read up on what a hash is, because it is highly relevant to your question... I suspect that you are looking to implement some kind of hash and test its equality.
There are different kinds of equality... there is the equality of the identity of the object, the equality of the data of the object, the equality of the entire object... you could also include audit information in there.
The fact is that 'equal' has many possible meanings.
I resolved this by implementing equals as a strict equality across all fields, simply because, after asking around, that seems to be the intuitive meaning of equals. I then wrote methods for the other kinds of equality I required and defined an interface to wrap them.
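The answer does not show that interface, so the following is only a guess at its shape. The names EqualityKinds, sameIdentity and Account are hypothetical; equals() compares all fields, while the extra method compares only the unique key.

```java
import java.util.Objects;

// Hypothetical interface: one method per additional notion of equality.
interface EqualityKinds<T> {
    // True when both objects denote the same entity (same unique key).
    boolean sameIdentity(T other);
}

final class Account implements EqualityKinds<Account> {
    private final String ssoId;
    private final String name;

    Account(String ssoId, String name) {
        this.ssoId = ssoId;
        this.name = name;
    }

    // Strict equality across all fields.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        return Objects.equals(ssoId, other.ssoId)
                && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(ssoId, name);
    }

    // Looser equality: only the unique key is compared.
    @Override
    public boolean sameIdentity(Account other) {
        return other != null && Objects.equals(ssoId, other.ssoId);
    }
}
```

Callers then pick the comparison they mean explicitly, instead of overloading equals() with several meanings.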
I wouldn't test equality with object == this alone, because often you are comparing two different objects with the same data, which in my book are equal despite referring to different memory addresses.