Hibernate Many to Many Relations Set Or List? - java

I have a many-to-many relationship in my Java beans. When I use a List to define the field, like this:
@Entity
@Table(name="ScD")
public class Group extends Nameable {
    @ManyToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE}, fetch = FetchType.EAGER)
    @JoinColumn(name="b_fk")
    private List<R> r;
    // or
    private Set<R> r;
I get this error:
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.dao.annotation.PersistenceExceptionTranslationPostProcessor#0'
...
When I use Set, everything seems to work well.
My question is: for many-to-many relationships, which is the right choice conceptually, List or Set (a List may contain duplicates and a Set may not), and what about performance and other issues?

From a relational database perspective this is a set. Databases do not preserve order, so using a List is meaningless: the order of its elements is unspecified (unless you use so-called indexed collections).
Using a Set also has significant performance implications. When a List is used, Hibernate uses a PersistentBag collection underneath, which has some terrible characteristics. For example, if you add a new relationship, it will first delete all existing ones and then insert them back along with the new one. With a Set it just inserts the new record.
Third, you cannot eagerly fetch multiple Lists in one entity, as you will get the infamous "cannot simultaneously fetch multiple bags" exception.
See also:
19.5. Understanding Collection performance
Why Hibernate does "delete all then re-insert" - its not so strange

How about the uniqueness requirement from Set? Doesn't this force Hibernate to retrieve all objects each time one is added to the collection to make sure a newly added one is unique? A List wouldn't have this limitation.

I know the question was made years ago but I wanted to comment on this topic, just in case someone is doubtful about the set vs list issue.
Regarding lazy fetching, I think a bag (list without index) would be a better option due to the fact that you avoid retrieving all objects each time one is added to the collection to:
make sure a newly added one is unique, in case you are using a set
preserve order, in case you are using a list (with index)
Please correct me if I'm mistaken.

Related

Hibernate: Force lazy-loading on eager field

One of the model objects in our application has many fields configured to be eagerly fetched, like so:
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "field")
public Field getField() {
    return this.field;
}
However, I sometimes do not need this information, and fetching it slows down my queries for nothing. I cannot change the behaviour and use FetchType.LAZY instead, as I have no idea what the impact on the whole application would be (legacy...). Is there a way to simply tell Hibernate to fetch nothing except what is specified in the query?
Last time I checked, Hibernate provided no proper solution for this, so I ended up with the following:
Configured the problematic references as LAZY.
Gave all affected service methods (those that use these models) an overloaded version with a boolean forceEager parameter.
Refactored all existing functions to call the new ones with forceEager=true by default.
And here comes the trick: as a means of "forcing the eager fetching" I found nothing better than actually accessing the proxied (lazily fetched) objects. For example, for a lazily referenced list, calling list.size() forces Hibernate to load the full list, so the service returns a fully fetched object.
If more than one layer of your object structure is affected, you need to traverse the whole hierarchy and access every lazily loaded object from top to bottom.
This solution is somewhat error-prone, so you need to handle it with care.
If it's possible to switch to Criteria for this query, you could use FetchMode.SELECT for the field property:
crit.setFetchMode("field", FetchMode.SELECT);

Object Relational mapping and performance

I am currently working on a product that works with Hibernate (HQL) and another one that works with JPQL. As much as I like the concept of the mapping from a relational structure (database) to an object (Java class), I am not convinced of the performance.
EXAMPLE:
Java:
public class Person {
    private String name;
    private int age;
    private char sex;
    private List<Person> children;
    //...
}
I want to get the age attribute of a certain Person, one with 10 children (he has been very busy). With Hibernate or JPQL you would retrieve the person as an object.
HQL:
SELECT p
FROM my.package.Person as p
WHERE p.name = 'Hazaart'
Not only will I be retrieving the other attributes of the person that I don't need, it will also retrieve all the children of that person and their attributes. And they might have children as well and so on... This would mean more tables would be accessed on database level than needed.
Conclusion:
I understand the advantages of Object Relational Mapping. However it would seem that in a lot of cases you will not need every attribute of a certain object. Especially in a complex system. It would seem like the advantages do not nearly justify the performance loss. I've always learned performance should be the main concern.
Can anyone please share their opinion? Maybe I am looking at it the wrong way, maybe I am using it the wrong way...
I'm not familiar with JPQL, but if you set up Hibernate correctly, it will not automatically fetch the children. Instead it will return a proxy list, which fetches the missing data transparently when it is accessed.
This also works for simple references to other persistent objects. Hibernate will create a proxy object containing only the ID and load the actual data only if it is accessed ("lazy loading").
This of course has some limitations (for example with persistent class hierarchies), but overall it works pretty well.
BTW, you should use List<Person> to reference the children. I'm not sure that Hibernate can use a proxy List if you specify a specific implementation.
Update:
In the example above, Hibernate will load the attributes name, age and sex, and will create a List<Person> proxy object that initially contains no data.
Once the application calls any method of the List that requires knowledge of the data, like children.size(), or iterates over the list, the proxy will call Hibernate to read the children objects and populate the List. The children objects, being instances of Person, will also contain a proxy List<Person> of their own children.
There are some optimizations Hibernate might perform in the background, like loading the children of other Person objects in this session at the same time, since it is querying the database anyway. But whether this is done, and to what extent, is configurable per attribute.
You can also tell Hibernate to never use lazy loading for certain references or classes, if you are sure you'll need them later, or if you continue to use the persistent object once the session is closed.
Be aware that lazy loading will of course fail if the session is no longer active. If for example you load a Person object, don't access the children List, and close the session, a later call to children.size() will fail.
IIRC Hibernate has a method to populate not-yet-loaded references in a persistent object, if needed (Hibernate.initialize() forces initialization of a given proxy or collection).
Best read the hibernate documentation on how to configure all this.
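To make the "proxy list" idea above concrete, here is a minimal plain-Java sketch of a list that defers loading until it is first touched. This is not Hibernate's implementation; LazyList, its loader argument and isLoaded() are hypothetical names used only for illustration, and the Supplier stands in for the database query.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a lazy collection proxy (not Hibernate code).
class LazyList<T> extends AbstractList<T> {
    private final Supplier<List<T>> loader; // stands in for the database query
    private List<T> loaded;                 // null until the first access

    LazyList(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    boolean isLoaded() {
        return loaded != null;
    }

    // Runs the loader once, on the first call that needs the data.
    private List<T> data() {
        if (loaded == null) {
            loaded = loader.get();
        }
        return loaded;
    }

    @Override
    public T get(int index) {
        return data().get(index);
    }

    @Override
    public int size() {
        return data().size();
    }
}

class LazyListDemo {
    public static void main(String[] args) {
        LazyList<String> children = new LazyList<>(() -> Arrays.asList("Ann", "Bob"));
        System.out.println(children.isLoaded()); // false: nothing fetched yet
        System.out.println(children.size());     // 2: this call triggers the load
        System.out.println(children.isLoaded()); // true
    }
}
```

In the real thing the loader needs an open session, which is why touching the collection after the session is closed fails.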

List vs Set on JPA 2 - Pros / Cons / Convenience

I have tried searching on Stack Overflow and at other websites the pros, cons and conveniences about using Sets vs Lists but I really couldn't find a DEFINITE answer for when to use this or that.
From Hibernate's documentation, they state that non-duplicate records should go into Sets and, from there, you should implement your hashCode() and equals() for every single entity that could be wrapped into a Set. But then it comes to the price of convenience and ease of use as there are some articles that recommend the use of business-keys as every entity's id and, from there, hashCode() and equals() could then be perfectly implemented for every situation regardless of the object's state (managed, detached, etc).
It's all fine, all fine... until I come across lots of situations where the use of Sets is just not doable, such as ordering (though Hibernate gives you the idea of SortedSet), the convenience of collectionObj.get(index) and collectionObj.remove(int location || Object obj), Android's architecture of ListView/ExpandableListView (GroupIds, ChildIds) and so on... My point is: Sets are just really bad (imho) to manipulate and make work 100%.
I am tempted to change every single collection in my project to List as they work very well. The IDs for all my entities are generated through MySQL's auto-generated sequence (@GeneratedValue(strategy = GenerationType.IDENTITY)).
Is there anyone out there who could definitively clear up all these little details mentioned above?
Also, is it feasible to use Eclipse's auto-generated hashCode() and equals() on the ID field for every entity? Will it be effective in every situation?
Thank you very much,
Renato
List versus Set
Duplicates allowed
Lists allow duplicates and Sets do not allow duplicates. For some this will be the main reason for them choosing List or Set.
Multiple bags exception - multiple eager fetches in the same query
One notable difference in how Hibernate handles them is that you can't fetch two different lists in a single query.
It will throw a "cannot simultaneously fetch multiple bags" exception. Sets have no such issue.
A list with no index column specified is simply handled as a bag by Hibernate (no specific ordering). If you need a defined order, @OrderBy sorts the list when it is loaded:
@OneToMany
@OrderBy("lastname ASC")
public List<Rating> ratings;
For example, if you have a Person entity with a list of contacts and a list of addresses, you won't be able to use a single query to load persons with all their contacts and all their addresses. The solution in this case is to make two queries (which avoids the cartesian product), or to use a Set instead of a List for at least one of the collections.
It's often hard to use Sets with Hibernate when you have to define equals and hashCode on the entities and don't have an immutable functional key in the entity.
Furthermore, I suggest this link.
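The difficulty with equals and hashCode on entities without an immutable key can be shown without any persistence library. The sketch below uses a hypothetical Rating class whose equals() and hashCode() depend only on a generated id, and simulates the id being assigned after the object is already in a HashSet (as would happen with an IDENTITY-generated id on persist). It is an illustration under those assumptions, not a recommendation for a specific mapping.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity-like class; equality is based only on the generated id.
class Rating {
    Long id; // null until the database assigns it

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Rating)) return false;
        return Objects.equals(id, ((Rating) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }
}

class IdHashCodeDemo {
    // Adds an object to a HashSet, then changes its id, then looks it up.
    static boolean foundAfterIdAssigned() {
        Set<Rating> ratings = new HashSet<>();
        Rating r = new Rating();
        ratings.add(r);  // stored under the hash of a null id
        r.id = 42L;      // simulates the id being set when the row is inserted
        return ratings.contains(r); // looked up under the new hash
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // false
    }
}
```

The same class also treats any two not-yet-saved instances as equal (both ids are null), so a Set would keep only one of them. This is why ID-only equals()/hashCode() is not safe in every situation, and why an immutable business key is often suggested instead.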

Order of items in List relationship

This may be super easy, but I can't find any clear statement about it in the JPA specs. If I have a List-based relationship without an @OrderBy annotation, e.g.:
@OneToMany
List<Child> children;
then what will the order of the list's elements be in Java? It seems reasonable that it will be the order of the corresponding records in the Child table, or of the entries in the intermediate table if it's many-to-many, but is that guaranteed behavior across JPA providers?
Order is not guaranteed by the specification as far as I know. @OrderBy is the way to go if you depend on the order.
EDIT: Quote from JPA 1.0 spec:
Portable applications should not expect the order of lists to be maintained across persistence contexts unless the OrderBy construct is used and the modifications to the list observe the specified ordering. The order is not otherwise persistent.
(Page 19, Footnote [4])
Also, when your entity has no children, the JPA specs don't specify whether you get an empty list or null when you retrieve your List, so be sure to check for null to avoid a NullPointerException.
There is no guaranteed behavior; the order can be different each time. It happened to me with Hibernate: the order changed when I refreshed the page.
You are able to order the search result when you search, not in the definition of the class and its properties.

Is it valid for Hibernate list() to return duplicates?

Is anyone aware of the validity of Hibernate's Criteria.list() and Query.list() methods returning multiple occurrences of the same entity?
Occasionally I find when using the Criteria API, that changing the default fetch strategy in my class mapping definition (from "select" to "join") can sometimes affect how many references to the same entity can appear in the resulting output of list(), and I'm unsure whether to treat this as a bug or not. The javadoc does not define it, it simply says "The list of matched query results." (thanks guys).
If this is expected and normal behaviour, then I can de-dup the list myself, that's not a problem, but if it's a bug, then I would prefer to avoid it, rather than de-dup the results and try to ignore it.
Anyone got any experience of this?
Yes, getting duplicates is perfectly possible if you construct your queries so that this can happen. See for example Hibernate CollectionOfElements EAGER fetch duplicates elements
I also started noticing this behavior in my Java API as it started to grow. Glad there is an easy way to prevent it. As a habit, I've started appending:
.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)
To all of my criteria that return a list. For example:
List<PaymentTypeAccountEntity> paymentTypeAccounts = criteria()
    .setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)
    .list();
If you have an object which has a list of sub objects on it, and your criteria joins the two tables together, you could potentially get duplicates of the main object.
One way to ensure that you don't get duplicates is to use a DistinctRootEntityResultTransformer. The main drawback to this is if you are using result set buffering/row counting. The two don't work together.
I had the exact same issue with Criteria API. The simple solution for me was to set distinct to true on the query like
CriteriaQuery<Foo> query = criteriaBuilder.createQuery(Foo.class);
query.distinct(true);
Another option that came to my mind earlier would be to simply pass the resulting list to a Set, which by definition holds only a single instance of each object.
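The "pass the list to a Set" idea can be sketched in plain Java as follows. A LinkedHashSet is used here (my choice, not something stated in the answers) so that the first-seen order of the query result is kept; DedupExample and distinctInOrder are hypothetical names, and the strings stand in for entity instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class DedupExample {
    // Removes repeated elements while keeping the order of first occurrence.
    static <T> List<T> distinctInOrder(List<T> results) {
        return new ArrayList<>(new LinkedHashSet<>(results));
    }

    public static void main(String[] args) {
        // Stand-in for a list() result where a join repeated the root entity.
        List<String> results = Arrays.asList("order-1", "order-1", "order-2", "order-1");
        System.out.println(distinctInOrder(results)); // [order-1, order-2]
    }
}
```

This relies on the elements' equals() and hashCode(). With entities that do not override them, it only removes repeated references to the same instance, which is typically what a joined query within one session returns.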
