What is the difference between Unidirectional and Bidirectional associations?
Since the tables generated in the database are identical, the only difference I found is that each side of a bidirectional association holds a reference to the other, while a unidirectional association does not.
This is a unidirectional association:
public class User {
    private int id;
    private String name;
    @ManyToOne
    @JoinColumn(name = "groupId")
    private Group group;
}
public class Group {
    private int id;
    private String name;
}
This is the bidirectional association:
public class User {
    private int id;
    private String name;
    @ManyToOne
    @JoinColumn(name = "groupId")
    private Group group;
}
public class Group {
    private int id;
    private String name;
    @OneToMany(mappedBy="group")
    private List<User> users;
}
The difference is whether the group holds a reference to its users.
So I wonder: is this the only difference, and which one is recommended?
The main difference is that a bidirectional relationship provides navigational access in both directions, so that you can access the other side without explicit queries. It also allows you to apply cascading options in both directions.
Note that navigational access is not always good, especially for "one-to-very-many" and "many-to-very-many" relationships. Imagine a Group that contains thousands of Users:
How would you access them? With so many Users, you usually need to apply some filtering and/or pagination, so you need to execute a query anyway (unless you use collection filtering, which looks like a hack to me). Some developers may tend to filter in memory in such cases, which is obviously bad for performance. Note that having such a relationship can encourage these developers to use it without considering the performance implications.
How would you add new Users to the Group? Fortunately, Hibernate looks at the owning side of the relationship when persisting it, so it is enough to set User.group. However, if you want to keep the objects in memory consistent, you also need to add the User to Group.users. But that would make Hibernate fetch all elements of Group.users from the database!
So, I can't agree with the recommendation from the Best Practices. You need to design bidirectional relationships carefully, considering use cases (do you need navigational access in both directions?) and possible performance implications.
See also:
Deterring “ToMany” Relationships in JPA models
Hibernate mapped collections performance problems
There are two main differences.
Accessing the association sides
The first one is related to how you will access the relationship. For a unidirectional association, you can navigate the association from one end only.
So, for a unidirectional @ManyToOne association, it means you can only access the relationship from the child side where the foreign key resides.
If you have a unidirectional @OneToMany association, it means you can only access the relationship from the parent side which manages the foreign key.
For the bidirectional @OneToMany association, you can navigate the association in both ways, either from the parent or from the child side.
You also need to use add/remove utility methods for bidirectional associations to make sure that both sides are properly synchronized.
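As a rough illustration of such add/remove utility methods, here is a minimal plain-Java sketch that reuses the User and Group names from the question. The JPA annotations are left out so it compiles on its own, and the helper names addUser and removeUser are my own assumptions, not something required by JPA or Hibernate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain Java stand-ins for the mapped classes; JPA annotations omitted on purpose.
class User {
    private final String name;
    private Group group; // owning side in the mapping (holds the foreign key)

    User(String name) { this.name = name; }

    Group getGroup() { return group; }
    void setGroup(Group group) { this.group = group; }
}

class Group {
    private final String name;
    private final List<User> users = new ArrayList<>(); // inverse side (mappedBy = "group")

    Group(String name) { this.name = name; }

    // Hypothetical helper: updates both ends of the association in one call.
    void addUser(User user) {
        Group previous = user.getGroup();
        if (previous == this) {
            return; // already linked, nothing to do
        }
        if (previous != null) {
            previous.users.remove(user); // detach from the old group first
        }
        users.add(user);
        user.setGroup(this);
    }

    // Hypothetical helper: clears both ends of the association.
    void removeUser(User user) {
        if (users.remove(user)) {
            user.setGroup(null);
        }
    }

    // Read-only view, so callers have to go through the helpers above.
    List<User> getUsers() { return Collections.unmodifiableList(users); }
}

class SyncDemo {
    public static void main(String[] args) {
        Group admins = new Group("admins");
        User alice = new User("alice");
        admins.addUser(alice);
        System.out.println(alice.getGroup() == admins); // true
        admins.removeUser(alice);
        System.out.println(alice.getGroup());           // null
    }
}
```

Exposing only a read-only view of the collection is one way to make sure nobody updates a single side by accident.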
Performance
The second aspect is related to performance.
For @OneToMany, unidirectional associations don't perform as well as bidirectional ones.
For @OneToOne, a bidirectional association will cause the parent to be fetched eagerly if Hibernate cannot tell whether the Proxy should be assigned or a null value.
For @ManyToMany, the collection type makes quite a difference as Sets perform better than Lists.
I'm not 100% sure this is the only difference, but it is the main difference. It is also recommended to have bi-directional associations by the Hibernate docs:
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/best-practices.html
Specifically:
Prefer bidirectional associations:
Unidirectional associations are more difficult to query. In a large application, almost all associations must be navigable in both directions in queries.
I personally have a slight problem with this blanket recommendation -- it seems to me there are cases where a child doesn't have any practical reason to know about its parent (e.g., why does an order item need to know about the order it is associated with?), but I do see value in it a reasonable portion of the time as well. And since the bi-directionality doesn't really hurt anything, I don't find it too objectionable to adhere to.
In terms of coding, a bidirectional relationship is more complex to implement because the application is responsible for keeping both sides in sync according to JPA specification 5 (on page 42). Unfortunately, the example given in the specification is not detailed enough to give an idea of the level of complexity.
When not using a second-level cache, it is usually not a problem if the relationship methods are not implemented correctly, because the instances get discarded at the end of the transaction.
When using second level cache, if anything gets corrupted because of wrongly implemented relationship handling methods, this means that other transactions will also see the corrupted elements (the second level cache is global).
A correctly implemented bi-directional relationship can make queries and the code simpler, but should not be used if it does not really make sense in terms of business logic.
Related
After reading a bit on the topic, I am a bit lost on the Hibernate/JPA requirements for @Entity equality. Do I really still have to adjust my @EqualsAndHashCode in 2020 to make my entities equal based on database uniqueness? What's the point of the @Id annotation then?
I need to be able to compare my entities at the object level, so for now I have implemented my @EqualsAndHashCode over all fields except @Id.
What exactly are the problems I can face if I keep this approach? Won't the database throw an exception anyway if for some reason Hibernate tries to store or mix two entities that have the same @Id but are not equal under my implementation? Is it really a risk? I am pretty sure I've seen a lot of projects with proper tests where nobody defined any particular @EqualsAndHashCode, so by default the instances were compared by identity, and those projects passed all kinds of CRUD tests and had no bugs in production.
Basically, you'd run into problems when you have bidirectional relationships between entities. For example, suppose Entity1 has @OneToMany access to Entity2, Entity2 has @ManyToOne access to Entity1, and both entities have @EqualsAndHashCode without specifying fields (i.e., equals and hashCode are generated for all fields, including those for relations). In this case you'd have a circular reference, hence a StackOverflowError.
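To make the circular reference concrete, here is a small plain-Java sketch. Parent and Child are hypothetical stand-ins for the two entities, and the hand-written hashCode methods only mimic what an all-fields generated implementation does; no Lombok or JPA is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class Parent {
    String name = "parent";
    List<Child> children = new ArrayList<>();

    // Mimics a generated hashCode that includes every field, relations included.
    @Override
    public int hashCode() {
        return Objects.hash(name, children);
    }
}

class Child {
    String name = "child";
    Parent parent;

    // Same idea on the other side: the back-reference is part of the hash.
    @Override
    public int hashCode() {
        return Objects.hash(name, parent);
    }
}

class CycleDemo {
    // Returns true if hashing a linked Parent/Child pair blows the stack.
    static boolean overflows() {
        Parent parent = new Parent();
        Child child = new Child();
        child.parent = parent;
        parent.children.add(child);
        try {
            parent.hashCode(); // Parent -> children -> Child -> parent -> Parent -> ...
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows()); // true
    }
}
```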
To avoid that, you can rely only on the @Id field for constructing equals and hashCode (there are some examples of this approach in the Hibernate docs). But then you'd get another kind of problem: e.g., if you store transient entities with auto-generated ids in a set (as child entities of some parent), it won't work correctly because the id field is still null at that point. You'd probably need to use some other fields in equals and hashCode in this case.
So, there is no correct answer to this question. You need to make a decision every time you construct your entities.
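One compromise that is often used, shown here only as a sketch and not as the answer to the question, is to compare by id when it is present and to keep hashCode constant so that it does not change when the id gets assigned. Item is a hypothetical class, and setId stands in for the id being generated on persist.

```java
import java.util.HashSet;
import java.util.Set;

class Item {
    private Long id; // stays null until the database assigns it

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Item)) {
            return false;
        }
        Item other = (Item) o;
        // Two distinct transient objects (null ids) are never considered equal.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Constant per class, so the hash does not move when the id is assigned.
        return Item.class.hashCode();
    }
}

class EqualityDemo {
    public static void main(String[] args) {
        Item item = new Item();
        Set<Item> items = new HashSet<>();
        items.add(item);   // added while still transient
        item.setId(42L);   // simulates id generation on persist
        System.out.println(items.contains(item)); // true
    }
}
```

The price is that all instances land in the same hash bucket, so large hash-based collections of such objects get slower. Whether that matters depends on how big your collections are.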
So I have some entities that are used as the basis for a coordinate system; for the purpose of this post we'll call them A, B, C and D. Each of these entities has multiple @OneToMany relationships, and I want to cascade deletes, i.e. when some A is deleted, all entities in each of the @OneToMany relationships are deleted too. Fairly standard stuff.
However, I don't see the point in having these entities explicitly tracking these relationships when all I want to do is cascade a delete. I don't see the point in loading all these entities (potentially millions!) into memory each time a new entity is added to the @OneToMany relationship (i.e. lazy loading only loads the collection when it's accessed, but it is of course accessed when a new entity is added to the relationship).
Let's add a little example:
@Entity
public class A {
    @Id
    private long id;
    // ... other fields ...
    @OneToMany
    private Collection<SomeClass> collection;
}
@Entity
public class SomeClass {
    @Id
    private long id;
    // ... other fields ...
    @ManyToOne
    A a;
    @ManyToOne
    B b;
    // ... likewise for C, D ...
}
There can be multiple classes similar to SomeClass, and so multiple @OneToMany relationships in A (and B, C, D) that require tracking. This gets tedious FAST. Also, every time a new instance of SomeClass is added, I'd need to load the entire collection and this seems exceedingly inefficient (I'd pretty much end up with my entire database loaded into memory just to cascade a delete!!!).
How can I achieve what I want without modifying the underlying database (e.g. specifying ON DELETE CASCADE in the definition)? Surely the designers of JPA have considered such a use case? Maybe I'm incorrect that I'd need to load the entire collection when adding an entity to the relationship (if so, please explain why :) ).
A similar question was asked here: JPA: unidirectional many-to-one and cascading delete but it doesn't have a satisfactory solution, and it doesn't discuss whether or not the entire relationship gets loaded into memory.
To achieve a multi-level cascade without initializing all the entities you can only use a DB cascade.
There's no other way! That's why you couldn't find a satisfactory solution.
As for the:
Also, every time a new instance of SomeClass is added, I'd need to
load the entire collection and this seems exceedingly inefficient (I'd
pretty much end up with my entire database loaded into memory just to
cascade a delete!!!).
You need to understand the unidirectional Collections taxonomy:
Adding one element to a Set requires the whole collection to be initialized, to enforce the Set's uniqueness contract.
A java.util.Collection or an unindexed List means you have a Bag, which is very inefficient in the unidirectional use case. For inverse collections Bags are fine, but that's outside your current context.
An indexed List (where the order is materialized in the database) is what you might be looking for:
@OrderColumn(name="orders_index")
public List<Order> getOrders() { return orders; }
The indexed list will use the index key for add/remove/update operations. As opposed to a Bag, which simply deletes all elements and recreates the collection with the remaining elements, an indexed List will use the index key to remove only the elements that no longer belong to the List.
Reading a wiki page about Hibernate, I drew some conclusions that perplex me:
1) Bidirectionality is recommended in one-to-many
2) Bidirectionality is optional in many-to-one
3) Bidirectionality is normally present in many-to-many
4) Unidirectionality is recommended in one-to-one relationships, using as owner class the one with the primary key of the relation (not the foreign key).
Are these statements true? Do you have any example to explain why unidirectionality is recommended in some cases and bidirectionality in others?
Here's the wiki page (read under "concepts"):
http://wiki.elvanor.net/index.php/Hibernate
Note that "bidirectionality" in the context of Hibernate means that in your Java classes, both sides of the relationship maintain a link to the other side. It has no impact on the underlying database schema (except in the case of indexed collections, see below), it's just whether or not you want the Java side to reflect that.
For all of your conclusions, "recommended" actually translates to "it usually ends up making sense, given your business logic, that you'd do it this way".
You really want to read through chapters 7 and 8 of the Hibernate Core Reference Manual.
1) It's recommended if you need it. A lot of convenience comes from specifying a bidirectional relationship; particularly it becomes possible to navigate the relationship from both ends in your business logic. However, if you don't actually need to do this, there's nothing to gain. Use whatever is most appropriate for the situation. In practice I've found that I want to specify both ends of the relationship to Hibernate more often than not -- but it is not a rule, rather, it reflects what I want to accomplish.
2) This is true. In a many-to-one (or one-to-many) relationship, it is optional. Consider the following schema:
table: users
fields: userId, userName
table: forumPosts
fields: postId, userId, content
Where forumPosts.userId is a foreign key into users. Your DAO classes might be (getters/setters omitted for brevity):
public class User {
private long userId;
private String userName;
}
public class ForumPost {
private long postId;
private User user;
private String content;
}
As you can see, this is a unidirectional many-to-one relationship (ForumPost-to-User). The ForumPost links to the user, but the User does not contain a list of ForumPosts.
You could then add a one-to-many mapping to User to make it have a list of ForumPosts. If you use a non-indexed collection like a set, this has no impact on the database schema. Merely by specifying both sides to Hibernate, you have made it bidirectional (using exactly the same schema as above), e.g.:
public class User {
private long userId;
private String userName;
private Set<ForumPost> forumPosts;
}
public class ForumPost {
private long postId;
private User user;
private String content;
}
Hibernate will now populate User.forumPosts when necessary (essentially with SELECT * FROM forumPosts WHERE userId = ?). The only difference between bidirectional and unidirectional here is that in one case Hibernate fills a set of ForumPosts in User, and in the other case it doesn't. If you ever have to get a collection of any given user's posts, you will want to use a bidirectional relationship like this rather than explicitly constructing an HQL query. Depending on your inverse/insert/update/cascade options in your relationship, you can also add and remove posts by modifying the User's set of posts, which may be a more accurate reflection of your business logic (or not!).
The reason I specified that non-indexed collections don't impact the underlying schema is because if you want to use an ordered, indexed collection like a list, you do have to add an extra list index field to the forumPosts table (although you do not have to add it to the ForumPost DAO class).
3) This is true, but it is not a requirement and it's deeper than that. Same as above. Bidirectionality is usually present in many-to-many. Many-to-many relationships are implemented with a third join table. You specify the details of this table on both sides of the relationship. You can simply not specify the relationship on one side, and now it's a unidirectional relationship. Again, whether or not you tell Hibernate about the mapping is what determines if it's unidirectional or bidirectional (in the context of Hibernate). In this case it also has no impact on the underlying schema unless you are using an ordered, indexed collection. In fact, the many-to-many example in the Hibernate reference manual is a unidirectional setup.
In reality, it would be odd to have a unidirectional many-to-many relationship, unless perhaps you are working with an existing database schema and your particular application's business logic has no need for one of the sides of the relationship. Usually, though, when you've decided you need a many-to-many relationship, you've decided that because you need to maintain a collection of references on both sides of the relationship, and your DAO classes would reflect that need.
So the correct conclusion here is not merely that "bidirectionality is normally present in many-to-many", but instead "if you've designed a database with a join table, but your business logic only uses a unidirectional relationship, you should question whether or not your schema is appropriate for your application (and it very well may be)".
4) This is not true. Exactly the same as all the points above. If you need to navigate the one-to-one relationship from both sides, then you'd want to make it bidirectional (specify both sides of the mapping to Hibernate). If not, then you make it unidirectional (don't specify both sides of the mapping to Hibernate). This again comes down to what makes sense in your business layer.
I hope that helps. I left a lot of intricacies out. You really should read through the Hibernate documentation - it is not organized particularly well but Chapter 7 and 8 will tell you everything you need to know about collection mapping.
When I'm designing an application and a database from scratch, personally, I try to forget about Hibernate and the database entirely. I set up my DAOs in a way that makes sense for my business requirements, design a database schema to match, then set up the Hibernate mappings, making any final tweaks to the schema (e.g. adding index fields for ordered collections) at that point if necessary.
I have a @ManyToMany relationship between two entities. When I perform an update on the owning side, it appears that JPA deletes all the linked records from my database and re-inserts them. For me this is a problem because I have a MySQL trigger that fires before a record is deleted. Any ideas on how to get around this problem?
@Entity
public class User {
    @Id
    @Column(name="username")
    private String username;
    ...
    @ManyToMany
    @JoinTable(name="groups",
        joinColumns=@JoinColumn(name="username", referencedColumnName="username"),
        inverseJoinColumns=@JoinColumn(name="groupname", referencedColumnName="type_id"))
    private List<UserType> types;
    ...
}
@Entity
public class UserType {
    @Id
    @Column(name="type_id")
    private String id;
    @ManyToMany(mappedBy="types")
    private List<User> users;
    ...
}
Using a Set instead of a List solved the problem, but I have no idea why it works.
Another solution provided by Hibernate is to split the @ManyToMany association into two bidirectional @OneToMany relationships. See the Hibernate 5.2 documentation for an example.
If a bidirectional @OneToMany association performs better when
removing or changing the order of child elements, the @ManyToMany
relationship cannot benefit from such an optimization because the
foreign key side is not in control. To overcome this limitation, the
link table must be directly exposed and the @ManyToMany association
split into two bidirectional @OneToMany relationships.
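A rough picture of what "exposing the link table" means in the object model, written as plain Java so it compiles without a persistence provider. The class names Member, MemberType and MemberTypeLink are made up for this sketch; in a real mapping the link class would be an entity mapped to the join table, and the comments mark where I assume the annotations would go.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for one row of the join table.
// Assumption: in JPA this would be an @Entity with two @ManyToOne fields.
class MemberTypeLink {
    final Member member;   // would carry @ManyToOne
    final MemberType type; // would carry @ManyToOne

    MemberTypeLink(Member member, MemberType type) {
        this.member = member;
        this.type = type;
    }
}

class MemberType {
    final String id;
    // would carry @OneToMany(mappedBy = "type")
    final List<MemberTypeLink> links = new ArrayList<>();

    MemberType(String id) { this.id = id; }
}

class Member {
    final String username;
    // would carry @OneToMany(mappedBy = "member")
    final List<MemberTypeLink> links = new ArrayList<>();

    Member(String username) { this.username = username; }

    // Creates one link object and registers it on both sides.
    void addType(MemberType type) {
        MemberTypeLink link = new MemberTypeLink(this, type);
        links.add(link);
        type.links.add(link);
    }

    // Removes only the link(s) to the given type; other links stay untouched.
    void removeType(MemberType type) {
        links.removeIf(link -> {
            if (link.type != type) {
                return false;
            }
            type.links.remove(link);
            return true;
        });
    }
}
```

With this shape, removing one type touches one link object instead of the whole collection, which is the effect the quoted passage is after.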
Try this one:
1) change declaration to:
private List<UserType> types = new Vector<UserType>();
2) never call
user.setTypes(newTypesList)
3) only call
user.getTypes().add(...);
user.getTypes().remove(...);
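Steps 2 and 3 boil down to keeping the same collection instance and changing its contents in place. A minimal plain-Java sketch of that idea follows; Profile and syncTypes are hypothetical names, and strings stand in for UserType. Whether this actually stops your provider from deleting and re-inserting rows depends on the provider and the collection type, so treat it as an illustration of the steps only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class Profile {
    // Created once and never replaced, so the provider keeps seeing the same instance.
    private final List<String> types = new ArrayList<>();

    List<String> getTypes() { return types; }

    // Hypothetical replacement for setTypes(newTypesList): mutate in place.
    void syncTypes(Collection<String> desired) {
        types.retainAll(desired); // drop entries that are no longer wanted
        for (String type : desired) {
            if (!types.contains(type)) {
                types.add(type);  // add only what is missing
            }
        }
    }
}
```

Usage would be profile.syncTypes(newTypesList) wherever the code used to call the setter.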
It's probably related to this question. You have to ensure you have appropriately defined hashCode and equals methods in your mapped object so that EclipseLink can determine equality and thus determine that the existing objects map to existing objects in the DB. Otherwise it has no choice but to recreate the child objects every time.
Alternatively, I've read that this kind of join can only support efficient adding and removing of list items if you use an index column, but that's going to be EclipseLink specific, since the JPA annotations don't seem to support such a thing. I know there is an equivalent Hibernate annotation, but I don't know what it would be in Eclipselink, if such a thing exists.
It appears my problem was that I was not merging the entity.
Imagine 2 tables in a relational database, e.g. Person and Billing. There is a (non-mandatory) OneToOne association defined between these entities, and they share the Person primary key (i.e. PERSON_ID is defined in both Person and Billing, and it is a foreign key in the latter).
When doing a select on Person via a named query such as:
from Person p where p.id = :id
Hibernate/JPA generates two select queries, one on the Person table and another on the Billing table.
The example above is very simple and would not cause any performance issues, given the query returns only one result. Now, imagine that Person has n OneToOne relationships (all non-mandatory) with other entities (all sharing the Person primary key).
Correct me if I'm wrong, but running a select query on Person, returning r rows, would result in (n+1)*r selects being generated by Hibernate, even if the associations are lazy.
Is there a workaround for this potential performance disaster (other than not using a shared primary key at all)? Thank you for all your ideas.
Imagine 2 tables in a relational database, e.g. Person and Billing. There is a (non-mandatory) OneToOne association defined between these entities,
Lazy fetching is conceptually not possible for a non-mandatory OneToOne by default: Hibernate has to hit the database to know whether the association is null or not. More details from this old wiki page:
Some explanations on lazy loading (one-to-one)
[...]
Now consider our class B has one-to-one association to C
class B {
private C cee;
public C getCee() {
return cee;
}
public void setCee(C cee) {
this.cee = cee;
}
}
class C {
// Not important really
}
Right after loading B, you may call getCee() to obtain C. But look, getCee() is a method of YOUR class and Hibernate has no control over it. Hibernate does not know when someone is going to call getCee(). That means Hibernate must put an appropriate value into the "cee" property at the moment it loads B from the database. If proxy is enabled for C, Hibernate can put a C-proxy object which is not loaded yet, but will be loaded when someone uses it. This gives lazy loading for one-to-one.
But now imagine your B object may or may not have an associated C (constrained="false"). What should getCee() return when a specific B does not have a C? Null. But remember, Hibernate must set the correct value of "cee" at the moment it sets B (because it does not know when someone will call getCee()). A proxy does not help here because the proxy itself is already a non-null object.
So, to sum up: if your B->C mapping is mandatory (constrained=true), Hibernate will use a proxy for C, resulting in lazy initialization. But if you allow B without C, Hibernate just HAS TO check the presence of C at the moment it loads B. But a SELECT to check presence is just inefficient because the same SELECT may not just check presence, but load the entire object. So lazy loading goes away.
So, not possible... by default.
Is there a workaround for this potential performance disaster (other than not using a shared primary key at all)? Thank you for all your ideas.
The problem is not the shared primary key; you'll get it with or without a shared primary key. The problem is the nullable OneToOne.
First option: use bytecode instrumentation (see references to the documentation below) and no-proxy fetching:
@OneToOne( fetch = FetchType.LAZY )
@org.hibernate.annotations.LazyToOne(org.hibernate.annotations.LazyToOneOption.NO_PROXY)
Second option: Use a fake ManyToOne(fetch=FetchType.LAZY). That's probably the simplest solution (and to my knowledge, the recommended one). I didn't test this with a shared PK, though.
Third option: Eager load the Billing using a join fetch.
Related question
Making a OneToOne-relation lazy
References
Hibernate Reference Guide
19.1.3. Single-ended association proxies
19.1.7. Using lazy property fetching
Old Hibernate FAQ
How do I set up a 1-to-1 relationship as lazy?
Hibernate Wiki
Some explanations on lazy loading (one-to-one)
This is a common performance issue with Hibernate (just search for "Hibernate n+1"). There are three options for avoiding n+1 queries:
Batch size
Subselect
Do a LEFT JOIN in your query
These are covered in the Hibernate FAQs here and here
Stay away from Hibernate's OneToOne mapping
It is very broken and dangerous. You are one minor bug away from a database corruption problem.
http://opensource.atlassian.com/projects/hibernate/browse/HHH-2128
You could try "blind-guess optimization", which is good for "n+1 select problems".
Annotate your field (or getter) like this:
@org.hibernate.annotations.BatchSize(size = 10)
java.util.Set<Billing> bills = new HashSet<Billing>();
That "n+1" problem will only occur if you specify the relationship as lazy or you explicitly indicate that you want Hibernate to run a separate query.
Hibernate can fetch the relationship to Billing with an outer join on the select of Person, obviating the n+1 problem altogether. I think it is the fetch="XXX" indication in your hbm files.
Check out A Short Primer On Fetching Strategies
Use optional = true with a one-to-one relationship like this to avoid the n+1 issue:
@OneToOne(fetch = FetchType.LAZY, optional=true)
@PrimaryKeyJoinColumn