DDD: Entity and its identifier

I have an entity in my system, which naturally needs an identifier so that it can be uniquely identified. Assuming the database is used for generating the identifier with Hibernate, using the native strategy, then obviously the application code is free of this responsibility of assigning identifiers.
Now, can an instance of that entity be considered valid before it is persisted and gets its identifier?
Or should I use some other strategy to assign my entities their identifiers so that it gets its identifier when its constructor is called?

That's an extensive topic, but here are two possibilities:
1. Define your hashCode() and equals(..) contracts based on business keys. For example, for a User entity, this would be the username rather than the auto-generated id. You will then be able to use the entity in collections before it is persisted.
2. Use a UUID as the primary key, and handle the generation yourself. See this article by Jeff Atwood and this article demonstrating a way to use it with Hibernate.
(Since you mention DDD and hibernate, take a look at this article of mine)
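
A minimal sketch of the business-key option, assuming a User whose username is unique and never changes. The class, its accessors and the demo class are illustrative, and the JPA annotations are left out so that the snippet compiles without a persistence provider on the classpath:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; in a real mapping the class would carry @Entity and
// the id field would carry @Id and @GeneratedValue.
final class User {
    private Long id;               // stays null until the database assigns it
    private final String username; // business key, assumed unique and immutable

    User(String username) {
        this.username = Objects.requireNonNull(username, "username");
    }

    Long getId() { return id; }
    String getUsername() { return username; }

    // Equality depends only on the business key, never on the generated id.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof User)) return false;
        return username.equals(((User) o).username);
    }

    @Override
    public int hashCode() {
        return username.hashCode();
    }
}

final class BusinessKeyDemo {
    public static void main(String[] args) {
        Set<User> users = new HashSet<>();
        users.add(new User("alice"));
        users.add(new User("alice")); // same business key, so not added again
        System.out.println(users.size());
    }
}
```

Because the id plays no part in equals() or hashCode(), the hash of an instance does not change when the database later assigns the identifier, so the instance stays findable in any HashSet it was added to before being persisted.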

Related

Is it good practice to put the @Id annotation on attributes like username or user email

What complications might arise when you put the @Id annotation on a field other than the id? For instance:
@GeneratedValue(strategy=GenerationType.AUTO)
@Column(name="user_id")
private Integer user_id;
@Id
@Size(min=6,message="email is too short")
@Pattern(regexp = "^.+@.+\\..+$",message="invalid email")
@Column(name="user_email")
private String user_email;
In my user POJO class: if I don't put the @Id annotation on user_id but on user_email instead, will that create any complications in the long run?

It is bad practice.
Emails may change, but primary keys (which is what @Id annotates) should never change. Usernames may change too, such as on this very site.

First let me say: That's a good question! The short answer to it is:
No, it is not good practice to annotate fields such as user_email or username with @Id.
For a deeper look into the Why, let us first remember how a primary key (PK) is defined. In most SQL-oriented database systems, the PK definition follows at least two assumptions under which a PK always is:
(i) NOT NULL
(ii) UNIQUE
The implications with respect to your idea/question:
(a):
A username or user_email value would have to be set on a User instance before you could persist it to the database. If it were missing, or known only at a later point in time: Persistence = impossible. Conversely, it could never be set to an unknown value (null), e.g. if a user simply cannot provide an email address (even though that is a rare case in 2018).
(b):
Actual values for username/user_email tend to be unique in most applications. Nevertheless, the application's business logic would have to ensure that no other user enters an already existing username/user_email. If the field were annotated with @Id and a duplicate slipped through, again: Persistence = impossible. Still, allowing duplicates on one of these fields would most likely violate its semantics anyway.
If you annotate user_id as the @Id field instead: no pain in either case.
In an ORM context, the JPA specification document expands this definition and states on page 31:
The value of its primary key uniquely identifies an entity instance within a persistence context and to EntityManager operations as described in Chapter 3, “Entity Operations”. The application must not change the value of the primary key. The behavior is undefined if this occurs.
This yields the next implication:
(c):
By definition, you are not allowed to change the values of @Id fields programmatically / at runtime. Control of such fields should remain exclusively with the object-relational mapper implementing the JPA specification.
As pointed out in the answer by Bohemian, a username or user_email might be subject to change over the lifetime of your application. The reason: data are fluid from the perspective of an application user. Users might want to change these properties at some time in the future, e.g. if they get married or change their email provider.
Nevertheless, it is generally possible to annotate fields of type String (and others) with @Id, see p. 31:
A simple primary key or a field or property of a composite primary key should be one of the following types: any Java primitive type; any primitive wrapper type; java.lang.String; java.util.Date; java.sql.Date; java.math.BigDecimal; java.math.BigInteger.
So one might get tempted and say: Well, let's go with that! However, one should step back and reflect what (a) the domain or (b) the intended use cases of the application impose on the intended application's database schema. As discussed above, this is especially important in your scenario.
With these "real world" considerations in mind, you'll be better off putting @Id only on a surrogate key, such as user_id in your example°. This one will not change throughout the lifetime of your database. Question: Why won't it change? Answer: Because it carries no business meaning, no business case defined by application users will ever require changing it.
Hope this helps.
°Hint
Better use Long instead of Integer right from the start; you'll thus have enough primary key values way beyond your lifetime as a SW-developer (True for >99.5% of all applications).
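
To illustrate the point about surrogate keys, here is a small annotation-free sketch. UserRecord and SurrogateKeyDemo are hypothetical stand-ins for the user POJO from the question; in the JPA mapping, @Id and @GeneratedValue would sit on the id field, while the email would only carry a unique constraint and the validation annotations:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the user POJO; no persistence provider required.
final class UserRecord {
    private final Long userId; // surrogate key: assigned once, never changes
    private String userEmail;  // natural attribute: may change over time

    UserRecord(Long userId, String userEmail) {
        this.userId = Objects.requireNonNull(userId, "userId");
        this.userEmail = Objects.requireNonNull(userEmail, "userEmail");
    }

    Long getUserId() { return userId; }
    String getUserEmail() { return userEmail; }

    void changeEmail(String newEmail) {
        this.userEmail = Objects.requireNonNull(newEmail, "newEmail");
    }
}

final class SurrogateKeyDemo {
    public static void main(String[] args) {
        // The map plays the role of a table keyed by the primary key.
        Map<Long, UserRecord> byId = new HashMap<>();
        UserRecord user = new UserRecord(1L, "old@example.com");
        byId.put(user.getUserId(), user);

        user.changeEmail("new@example.com"); // identity is unaffected

        System.out.println(byId.get(1L).getUserEmail());
    }
}
```

Had the email been the key, every reference to the user (foreign keys in the database, map keys in memory) would have to be rewritten whenever the address changes.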

Best approach for linking diverse entity types in JPA

Short version for the hasty:
There are various tables/entities in my domain model which have the same field (a UUID). There is a table where I need to link rows/instances of such entities to other JPA-managed entities. In other words, the type of the instance referenced by that link table won't be known up front. The two approaches I can think of are:
use an abstract entity and a TABLE_PER_CLASS strategy, or
use a @MappedSuperclass and store the class name of the instance in the link table as well, or something similar that lets me define logic for getting the actual instance from the right table.
Both have advantages and disadvantages in terms of complexity and performance. Which do you believe to be best, is there maybe a third option, or have you tried something like this in the past and would advise or strongly warn against it?
Long version in case you want more background:
I have a database/object model wherein many types have a common field: a universally unique identifier (UUID). The reason for this is that instances of these types can be subject to changes. The changes follow the command model and their data can be encapsulated and itself persisted. Let's call such a change a "mutation". It must be possible to find out which mutations exist in the database for any given entity, and vice-versa, on which entity a stored mutation operates.
Take the following entities with UUIDs as an (extremely simplified) example:
To store the "mutations", we use a table/entity called MutationHolder. To link a mutation to its target entity, there's a MutationEntityLink. The only reason this data isn't directly on the MutationHolder is because there can be direct or indirect links, but that's of little importance here so I left it out:
The question comes down to how I can model the entity field in MutationEntityLink. There are two approaches I can think of.
The first is to make an abstract @Entity annotated class with the UUID field. Customer, Contract and Address would extend it. So it is a TABLE_PER_CLASS strategy. I assume that I could use this as a type for the entity field, although I'm not certain. However, I fear this might have a serious performance penalty since JPA would need to query many tables to find the actual instance.
The second is to simply use @MappedSuperclass and just store the UUID for an entity in the entity field of MutationEntityLink. In order to get the actual entity with that UUID, I'd have to solve it programmatically. Adding an additional column with the class name of the entity, or something else that allows me to identify it or paste it into a JPQL query, would do. This requires more work but seems more efficient. I'm not averse to coding some utility classes or doing some reflection/custom annotation work if needed.
My question is which of these approaches seems best? Alternatively, you might have a better suggestion, or notice I'm missing something; for example, maybe there's a way to add a type column even with TABLE_PER_CLASS inheritance to point JPA to the right table? Perhaps you've tried something like this and want to warn me about numerous issues that would arise.
Some additional info:
We create the database schema, so we can add whatever we want.
A single table inheritance strategy isn't an option. The tables must remain distinct. For the same reason, joined inheritance doesn't seem a good fit either.
The JPA provider is Hibernate and using things that are not part of the JPA standard isn't an issue.

If the entities don't have anything in common besides having a uuid, I'd use the second approach you describe: use @MappedSuperclass. Making the common superclass an entity would prevent you from using a different inheritance strategy if needed, would require a table for that super entity even if no instances exist, and from a business point of view it's just wrong.
The link itself could be implemented in multiple ways, e.g. you could subclass MutationEntityLink for each entity to map (e.g. CustomerMutationEntityLink etc.) or do as you described, i.e. only store the uuid as well as some discriminator/type information and resolve programmatically (we're using that approach for something similar, by the way).
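
The "store the uuid plus a discriminator and resolve programmatically" variant could look roughly like the sketch below. MutationEntityLink is the name from the question; the entityType field and the LinkedEntityResolver helper are assumptions made up for this illustration, and the finder functions stand in for whatever query each entity type would need:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.function.Function;

// Link row as it might be stored: only the target's UUID plus a type discriminator.
final class MutationEntityLink {
    private final UUID entityUuid;
    private final String entityType; // hypothetical discriminator column, e.g. "Customer"

    MutationEntityLink(UUID entityUuid, String entityType) {
        this.entityUuid = entityUuid;
        this.entityType = entityType;
    }

    UUID getEntityUuid() { return entityUuid; }
    String getEntityType() { return entityType; }
}

// Hypothetical helper that maps a discriminator value to a lookup function.
// In a real application each function would query that type's own table.
final class LinkedEntityResolver {
    private final Map<String, Function<UUID, Optional<?>>> finders = new HashMap<>();

    void register(String entityType, Function<UUID, Optional<?>> finder) {
        finders.put(entityType, finder);
    }

    Optional<?> resolve(MutationEntityLink link) {
        Function<UUID, Optional<?>> finder = finders.get(link.getEntityType());
        if (finder == null) {
            throw new IllegalArgumentException("Unknown entity type: " + link.getEntityType());
        }
        return finder.apply(link.getEntityUuid());
    }
}
```

Only one table is queried per lookup, which addresses the performance worry about TABLE_PER_CLASS, at the cost of losing a typed association on the link.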

Use @MappedSuperclass when you only want to inherit associations/methods/properties, whereas TABLE_PER_CLASS is generally used when you have an entity and sub-entities. If there are entities in the model that have an association with the base class, then use TABLE_PER_CLASS, since the base class behaves like an entity. Otherwise, when the base class only holds properties/attributes and methods that are common to entities which are not related to each other, @MappedSuperclass is the better idea.
Example 1: You need to set alarms for some different activities like "take medicine", "call mom", "go to doctor" etc. The content of the alarm message does not matter; you just need a reminder. So use TABLE_PER_CLASS, since the alarm message, which is your base class, is like an entity here.
Example 2: Assume the base class AbstractDomainObject enables you to create login ID, loginName, creation/modification date for each object where no entity has an association with the base class, you will need to specify the association for the sake of clearing later, like "Company", "University" etc. In this situation, using @MappedSuperclass would be better.

hashCode() method for related entities

I read that when using JPA you should implement hashCode()/equals() for your entities.
So Eclipse for example has this nice feature to generate those methods for the classes.
But which fields do I have to choose?
I read that choosing the Long id; field of your entity is not a good idea. (Right? Why?)
One should use a business key (some fields of the entity which can be used to identify the entity, right?) in the hashCode()/equals() methods.
Considering following scenario:
1:n relation between A and B...
Is it a good idea to use those references in the hashCode() method?
If I do so, I sometimes run into java.util.ConcurrentModificationException or StackOverflowError.
What about collection variables? I think I should not use those in my hashCode() method...
Can somebody give me hints?

Consider using the fields (as few as possible) that will uniquely identify the object. If it were a Person, it might be first, middle and last name. Or better still, the Social Security Number for a US person. I don't see any issue with using a DB ID so long as the table cannot contain duplicate entities. In general, the identity of an object should not require checking the identities of its associated objects (the 1:n relationship) but just the local fields.
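
A sketch of the "local fields only" advice for the 1:n case. The key fields (code, name) are assumptions, since the original A and B classes are not shown; each is taken to be unique and immutable on its own:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical parent side of the 1:n relation.
final class A {
    private final String code;                           // local business key
    private final List<B> children = new ArrayList<>();  // not part of identity

    A(String code) { this.code = Objects.requireNonNull(code, "code"); }

    void addChild(B child) { children.add(child); }
    List<B> getChildren() { return children; }

    @Override
    public boolean equals(Object o) {
        return o instanceof A && code.equals(((A) o).code);
    }

    // Only the local key is hashed. Hashing 'children' would call B.hashCode(),
    // and if B hashed its parent in turn, the calls would recurse until the stack overflows.
    @Override
    public int hashCode() { return code.hashCode(); }
}

// Hypothetical child side.
final class B {
    private final String name; // local business key
    private final A parent;    // back-reference, deliberately left out of equals/hashCode

    B(String name, A parent) {
        this.name = Objects.requireNonNull(name, "name");
        this.parent = Objects.requireNonNull(parent, "parent");
    }

    A getParent() { return parent; }

    @Override
    public boolean equals(Object o) {
        return o instanceof B && name.equals(((B) o).name);
    }

    @Override
    public int hashCode() { return name.hashCode(); }
}
```

Leaving the collection out also keeps the hash stable while children are added or removed, which matters as soon as an A instance is stored in a HashSet or used as a HashMap key.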

The equals() and hashCode() methods should always be implemented either on the primary key or on your business key. This is necessary if you want to adhere to the requirements of your persistence manager. Check here

You can implement your own logic in hashCode() to get a well-distributed number. For example, you can do some combination of ^-ing (XOR-ing) a class's instance variables (in other words, twiddling their bits), along with perhaps multiplying them by a prime number.
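
As a rough sketch of the multiply-by-a-prime variant (PersonName and its fields are made up for the example): plain XOR is symmetric, so swapping two field values would give the same hash, whereas multiplying the running result by a prime before adding the next field makes the order matter.

```java
import java.util.Objects;

// Hypothetical value class used only to demonstrate the hashing scheme.
final class PersonName {
    private final String first;
    private final String last;
    private final int birthYear;

    PersonName(String first, String last, int birthYear) {
        this.first = Objects.requireNonNull(first, "first");
        this.last = Objects.requireNonNull(last, "last");
        this.birthYear = birthYear;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PersonName)) return false;
        PersonName other = (PersonName) o;
        return first.equals(other.first)
                && last.equals(other.last)
                && birthYear == other.birthYear;
    }

    // Combines exactly the fields used in equals(); 31 is the conventional prime.
    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + first.hashCode();
        result = 31 * result + last.hashCode();
        result = 31 * result + birthYear;
        return result;
    }
}
```

Note that a hash code does not have to be unique; it only has to be equal for objects that equals() considers equal. java.util.Objects.hash(first, last, birthYear) applies the same prime-multiplication idea if you prefer not to write it by hand.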

Entities in domain driven design

I am reading Eric Evans' book about DDD and I have a question about the following quote. How do you write your equals() method when you should not use the attributes? I am using JPA and I have an id attribute which is unique, but it is not set until you actually persist the entity. So what do you do? I have implemented the equals method based on the attributes, and I understand why you shouldn't, because it failed in my project.
Section about entities:
When an object is distinguished by its identity, rather than its
attributes, make this primary to its definition in the model. Keep the
class definition simple and focused on life cycle continuity and
identity. Define a means of distinguishing each object regardless of
its form or history. Be alert to requirements that call for matching
objects by attributes. Define an operation that is guaranteed to
produce a unique result for each object, possibly by attaching a
symbol that is guaranteed unique. This means of identification may
come from the outside, or it may be an arbitrary identifier created by
and for the system, but it must correspond to the identity
distinctions in the model. The model must define what it means to be
the same thing.
http://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215

A couple of approaches are possible:
Use a business key. This is the most 'DDD compliant' approach. Look closely at the domain and the business requirements. How does your business identify Customers, for example? Do they use a Social Security Number or a phone number? How would your business solve this problem if it were paper-based (no computers)? If there is no natural business key, create a surrogate. Choose a business key that is final and use it in equals(). There is a section in the DDD book dedicated to this specific problem.
For cases where there is no natural business key, you can generate a UUID. This also has an advantage in a distributed system, because you don't need to rely on a centralized (and potentially unavailable) resource such as a database to generate a new id.
There is also the option to just rely on the default equals() for entity classes. It compares object references, which is enough in most cases because the Unit of Work (the Hibernate Session) holds on to all the entities (this ORM pattern is called Identity Map). However, this is not reliable: it breaks as soon as you use entities outside the scope of a single Hibernate Session (think threads, detached entities, etc.).
Interestingly enough, the 'official' DDD sample uses a very lightweight framework where every entity class is derived from an Entity interface with one method:
boolean sameIdentityAs(T other)
// Entities compare by identity, not by attributes.
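
A sketch of how an entity might implement that interface. The interface shape follows the line quoted above; Account, accountNumber and ownerName are hypothetical names chosen for the example, not taken from the DDD sample:

```java
import java.util.Objects;

interface Entity<T> {
    // Entities compare by identity, not by attributes.
    boolean sameIdentityAs(T other);
}

// Hypothetical entity; accountNumber stands in for whatever identifier the domain defines.
final class Account implements Entity<Account> {
    private final String accountNumber; // identity, fixed for the whole life cycle
    private String ownerName;           // ordinary attribute, free to change

    Account(String accountNumber, String ownerName) {
        this.accountNumber = Objects.requireNonNull(accountNumber, "accountNumber");
        this.ownerName = ownerName;
    }

    void rename(String newOwnerName) { this.ownerName = newOwnerName; }
    String getOwnerName() { return ownerName; }

    @Override
    public boolean sameIdentityAs(Account other) {
        return other != null && accountNumber.equals(other.accountNumber);
    }

    // equals() and hashCode() delegate to the identity so collections behave consistently.
    @Override
    public boolean equals(Object o) {
        return o instanceof Account && sameIdentityAs((Account) o);
    }

    @Override
    public int hashCode() { return accountNumber.hashCode(); }
}
```

Two Account instances with the same number but different owner names are the same entity under this definition, which is the behaviour the quoted passage from the book asks for.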

If the object is not persistent yet, then is there any harm in comparing 2 objects based on their attributes?
I am not sure why this failed in your project, but in my experience, comparison based on attributes is almost always a slippery slope if your attributes are not final. It means that two objects that are equal now may not be equal some time later. This is very bad.
Given that most Java classes are written along with their accessors, an equals() that compares attributes is said to be a bad idea.
However, I would probably first check whether the ID field is null. If it is null, I would fall back to attribute comparison. If it is not null, I would just use it and not do anything else. Does this make sense?

Given a Person class with the attributes name and surname: when a Person changes their name at the age of 21, is it still the same Person (does equals return true)?
If you write equals() based on attributes, it would not be the same person. So in my opinion the best approach is to test the equality of entities based on their business identifier (unique and immutable over the whole entity life cycle).

Another solution could be to use a UUID field in your entity.
In this case, you could use the UUID as the primary key or just for equals().
import java.util.UUID;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class YourEntity {
    @Id
    private String uuid = UUID.randomUUID().toString();

    // getter only...
}
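
A possible way to complete the idea with equals() and hashCode(), shown without the JPA annotations so that it runs on its own. UuidIdentified is a hypothetical class name; in the mapped entity, the persistence provider would overwrite the uuid field with the stored value when loading a row:

```java
import java.util.UUID;

// Annotation-free sketch; in the JPA mapping, @Entity and @Id stay as in the class above.
final class UuidIdentified {
    // Assigned at construction time, so the identity exists before the entity is persisted.
    private final String uuid = UUID.randomUUID().toString();

    String getUuid() { return uuid; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UuidIdentified)) return false;
        return uuid.equals(((UuidIdentified) o).uuid);
    }

    @Override
    public int hashCode() { return uuid.hashCode(); }
}
```

Since the identifier never changes after construction, an instance can be put into a HashSet before it is saved and will still be found there afterwards.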

How hibernate uses equals() and hashCode()?

If you load an entity from the db and modify it somehow, will Hibernate use equals/hashCode to compare the current state of the entity with its snapshot to determine whether an SQL update needs to be performed?
If it does make such comparisons, I have another question: if equals returns true, will Hibernate think that the entity has not changed, or will it attempt to use its default comparison (to be sure)?

Please see Equals and HashCode from the JBoss Community website. From there:
To avoid this problem we recommend using the "semi"-unique attributes
of your persistent class to implement equals() (and hashCode()).
Basically you should think of your database identifier as not having
business meaning at all (remember, surrogate identifier attributes and
automatically generated vales are recommended anyway). The database
identifier property should only be an object identifier, and basically
should be used by Hibernate only. Of course, you may also use the
database identifier as a convenient read-only handle, e.g. to build
links in web applications.
In other words, Hibernate uses equals and hashCode for identity, not to see if an object has been modified. It uses attribute by attribute comparisons for that.
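
To make the distinction concrete, here is a purely conceptual sketch of an attribute-by-attribute check against a snapshot taken at load time. It is not Hibernate's API or its actual implementation; SnapshotDirtyCheck and isDirty are invented names, and property values are simply passed in as arrays:

```java
import java.util.Objects;

// Conceptual illustration only; not how Hibernate is implemented internally.
final class SnapshotDirtyCheck {
    private SnapshotDirtyCheck() { }

    // Returns true if any property value differs from the value captured at load time.
    // The entity's own equals()/hashCode() are never consulted.
    static boolean isDirty(Object[] loadedSnapshot, Object[] currentState) {
        if (loadedSnapshot.length != currentState.length) {
            throw new IllegalArgumentException("Snapshot and state must have the same length");
        }
        for (int i = 0; i < loadedSnapshot.length; i++) {
            if (!Objects.equals(loadedSnapshot[i], currentState[i])) {
                return true;
            }
        }
        return false;
    }
}
```

Under this model, an entity whose equals() is based on a business key can still be detected as modified, because the check looks at each property value rather than at the entity's notion of equality.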

Not a Hibernate expert, but you may find this section of the manual enlightening.
