JPA: When do eager fetching result in StackOverflowError?

JPA: When do eager fetching result in StackOverflowError? - java

Do we have some set of rules when we should abstain of using FetchType.EAGER?
I have heard that some of JPA frameworks (eg. Hibernate) should resolve cyclic dependencies.
Do we have some advice when it comes risky to relay on frameworks?
Do StackOverflowError along with FetchType.EAGER is always about bug in framework(I mean if I have like 3 rows in two tables)?

One good reason to avoid FetchType.EAGER is that you can always enable eager fetching manually in JPQL (with JOIN FETCH), but you can't enable lazy fetching if you've set FetchType.EAGER.

EAGER:
Convenient, but slow
Use Eager when your parent class always need associate class.
LAZY:
More coding, but much more efficient
Basically lazy loading has more benefits than the eager alternative (performance, use of resources) . you should generally not configure it for eager fetching, unless you experience certain issues.
Sharing a domain instance accross different hibernate sessions (eg. when putting a domain class instance into the http session scope and accessing properties from it - such as a User)
When you're sure, you will access a certain relation property everytime (or most of the time) when an instance is fetched, it would also make sense to configure this relation for eager fetching.
eg: Relation Person and Address
Here, its not necessary that when i load Person entity i need to fetch Address all the time so you can keep lazy .
Conclusion
The EAGER fetching strategy is a code smell. Most often it’s used for simplicity sake without considering the long-term performance penalties. The fetching strategy should never be the entity mapping responsibility. Each business use case has different entity load requirements and therefore the fetching strategy should be delegated to each individual query.
The global fetch plan should only define LAZY associations, which are fetched on a per query basis. Combined with the always check generated queries strategy, the query based fetch plans can improve application performance and reduce maintaining costs.

Related

How can I set the default fetch type for Hibernate?

I have a crufty old Java Enterprise program that I'm trying to bring into the current decade thanks to security audits flagging half the code it imports as being a security risk. As some of you know, somewhere in the mid 2010's Hibernate changed its defaults from LAZY loading for everything, to EAGER loading for almost everything.
Problem is, this breaks this program bad. It was annotated assuming everything was lazy loading, so the only annotations are for things that it wanted to fetch EAGER, which, uhm, wasn't much. The end result is a huge increase in the number of joins, which has caused performance to plummet disastrously because most of the joined entities aren't used in a typical batch operation. For example, the User field is necessary in order to query by user. But the user for a report is already fetched at the top of the loop that processes the user's records, so adding a JOIN to the query to eager fetch the user for each and every record just makes the report slower.
Most of my relationships aren't annotated for lazy fetching, and there's a lot of them. I could manually go in and, laboriously, one by one, annotate them for lazy fetching. Or I could change Hibernate's defaults back to what they were back when this program was written. I'd obviously much prefer the latter, for obvious reasons -- I really don't want to be spending any more time updating this antique code base than I have to, since we're in the process of writing its replacement.

There is a default fetch plan based on the mapping you provide while defining entities.
By default, the JPA #ManyToOne and #OneToOne annotations are fetched EAGERly, while the #OneToMany and #ManyToMany relationships are considered LAZY.
This is the default strategy, and Hibernate doesn’t magically optimize your object retrieval, it only does what is instructed to do

As pointed out in this answer https://stackoverflow.com/a/57586036/19351325 unfortunately there's no way to change the default fetch type, it will always be:
OneToMany: LAZY
ManyToOne: EAGER
ManyToMany: LAZY
OneToOne: EAGER
But you can use simple regex to automatically add Lazy fetching. In Intellij Idea use these combinations: ctrl+shft+r or cmd+shft+r, and the following regex: #ManyToOne$ or #OneToOne$ , for replace-section use: #ManyToOne(fetch = FetchType.LAZY) and #OneToOne(fetch = FetchType.LAZY) accordingly.
If you already have in these annotations some properties added you can just modify your search regex. Anyway it will be faster then doing it all manually

JPA+Hibernate force JPA not to use Proxies on Lazy loading

We are using JPA + Hibernate.
I have some Many-to-one mappings which are lazy loaded.
In Service, I Initiallize the Many-to-one objects by calling their getter method. but proxy gets assigned to parent VO and not actual VO Object.
My Question is, Is there any way in JPA to force to use no proxy Strategy.
My limitation here is i cant use Hibernate Objects or annotaions like #LazytoOne etc.
thanks in advance.

You cannot prevent Hibernate from using proxy objects there due to the fact that somehow it has to guarantee it's a lazy relation.
You have multiple choices:
Trigger the initialization Hibernate.initialize(parent.getChild()). Note that this is not the best way to do it and this also requires an active transaction.
Fetch the relation when fetching the entity itself. This can be done with the Fetch Joins. JPQL/HQL/Criteria API are capable of doing this.
Use read-only projections which contains only the data you need. For this particular case you can use Spring Data JPA as it comes with such a feature.
I suggest you to go with either option 2 or 3 as they are the most effective ways to do this.
Furher reading about lazy-loading here.

Efficiently load large batch of OneToMany collections at once with Hibernate

I have a Parent entity with a #OneToMany relationship with a Child entity. Most of the time, when I need to work with a Parent’s Child entities, I’m working with a single parent, so lazy fetching (FetchMode.SELECT) is appropriate.
However, I have a situation where I’m querying a large number of Parents (sometimes hundreds or even thousands), and I need to work with their Child entities. FetchMode.SELECT gives me a serious N+1 problem, so I need to do something different in this scenario. If I were doing this via JDBC, it’d be single query for the Parent records, then another query for all the Child records using an IN statement (where child.parentid in (?,?,?....)). I need live Hibernate entities, because Hibernate Search is going to call getChildren() as part of its indexing process.
The options I’ve considered are:
Criteria.setFetchMode(“children”, FetchMode.JOIN) (or join fetch in HQL) - this would give me a cartesian product, though, which is brutal with that many entities.
Adding #BatchSize to Parent.getChildren() - this would help for my big batch scenario, but it isn’t really the strategy I want to use for normal operations. It’d be perfect if I could set a batch size for the fetch in my Criteria/HQL, but I can’t find a way to do so.
Using FetchMode.SUBSELECT in Parent.getChildren() - much like #BatchSize, this would be great for my big batch scenario, but isn’t appropriate for normal operations, and I can’t find a way to use it with Criteria/HQL (Criteria and the entity annotations use different FetchMode enums, despite the duplicate name).
tldr; I have a one-to-many relationship with a lazy fetch mode, but sometimes I want to be able to efficiently load the relationship for many entities at once.

Hibernate FetchType.LAZY without session

I have many issues with LazyLoadingException in a Spring web application wherever I try to access fields that are annotated with FetchType.LAZY
There is no session configured in Spring because the requirement is that the API should be stateless.
All the service layer methods have #Transactional annotations properly set.
However when I try to access the Lazy fields on any domain object, I get the famous LazyInitializationException (...) could not initialize proxy - no Session
I thought that Hibernate would automatically load the lazy fields when needed when I'm in a #Transactional method but it appears it doesn't.
I have spent several days looking for answers but nothing fits my needs. I found that Spring could be configured with openSessionInViewFilter but it appears to cause many issues, moreover I don't have any session.
How to automatically load lazy fields in #Transactionalannotated service methods with such a stateless API ?
I'm sure I'm missing something obvious here, but I'm not very familiar with Spring and Hibernate.
Please tell me if there are missing information in my question I should give you.

LazyInitializationExceptions are a code smell in a same way EAGER fetching is too.
First of all, the fetching policy should be query-based on a business case basis. Tha DAO layer is solely responsible for fetching the right associations, so:
You should use the FETCH directive for all many-to-one associations and at most one one-to-many association. If you try to JOIN FETCH more than one one-to-many associations, you'll get a Cartesian Product and your application performance will be affected.
If you need to fetch multiple collections, then a multi-level fetching is more appropriate.
You should ask yourself why you want to return entities from the DAO layer. Using DTOs is a much better alternative, as it reduces both the amount of data that's fetched from the DB and it doesn't leak the Entity abstraction into the UI layer.

Strategies for performance optimizations on an inherited EJB3 application

I was asked to have a look at a legacy EJB3 application with significant performance problems. The original author is not available anymore so all I've got is the source code and some user comments regarding the unacceptable performance. My personal EJB3 skill are pretty basic, I can read and understand the annotated code but that's all until know.
The server has a database, several EJB3 beans (JPA) and a few stateless beans just to allow CRUD on 4..5 domain objects for remote clients. The client itself is a java application. Just a few are connected to the server in parallel. From the user comments I learned that
the client/server app performed well in a LAN
the app was practically unusable on a WAN (1MBit or more) because read and update operations took much too long (up to several minutes)
I've seen one potential problem - on all EJB, all relations have been defined with the fetching strategy FetchType.EAGER. Would that explain the performance issues for read operations, is it advisable to start tuning with the fetching strategies?
But that would not explain performance issues on update operations, or would it? Update is handled by an EntityManager, the client just passes the domain object to the manager bean and persisting is done with nothing but manager.persist(obj). Maybe the domain objects that are sent to the server are just too big (maybe a side effect of the EAGER strategy).
So my actual theory is that too many bytes are sent over a rather slow network and I should look at reducing the size of result sets.
From your experience, what are the typical and most common coding errors that lead to performance issues on CRUD operations, where should I start investigating/optimizing?

On all EJB, all relations have been defined with the fetching strategy FetchType.EAGER. Would that explain the performance issues for read operations?
Depending on the relations betweens classes, you might be fetching much more (the whole database?) than actually wanted when retrieving entities?
is it advisable to start tuning with the fetching strategies?
I can't say that making all relations EAGER is a very standard approach. To my experience, you usually keep them lazy and use "Fetch Joins" (a type of join allowing to fetch an association) when you want to eager load an association for a given use case.
But that would not explain performance issues on update operations, or would it?
It could. I mean, if the app is retrieving a big fat object graph when reading and then sending the same fat object graph back to update just the root entity, there might be a performance penalty. But it's kinda weird that the code is using em.persist(Object) to update entities.
From your experience, what are the typical and most common coding errors that lead to performance issues on CRUD operations, where should I start investigating/optimizing?
The obvious ones include:
Retrieving more data than required
N+1 requests problems (bad fetching strategy)
Poorly written JPQL queries
Non appropriate inheritance strategies
Unnecessary database hits (i.e. lack of caching)
I would start with writing some integration tests or functional tests before touching anything to guarantee you won't change the functional behavior. Then, I would activate SQL logging and start to look at the generated SQL for the major use cases and work on the above points.

From DBA position.
From your experience, what are the typical and most common coding errors that lead to performance issues on CRUD operations, where should I start investigating/optimizing?
Turn off caching
Enable sql logging Ejb3/Hibernate generates by default a lots of extremely stupid queries.
Now You see what I mean.
Change FetchType.EAGER to FetchType.LAZY
Say "no" for big business logic between em.find em.persist
Use ehcache http://ehcache.org/
Turn on entity cache
If You can, make primary keys immutable ( #Column(updatable = false, ...)
Turn on query cache
Never ever use Hibernate if You want big performance:
http://www.google.com/search?q=hibernate+sucks

I my case a similar performance problem wasn't depending on the fetch strategy. Or lets say it was not really possible to change the business logic in the existing fetch strategies. In my case the solution was simply adding indices.
When your JPA Object model have a lot of relationsships (OneToOne, OneToMany, ...) you will typical use JPQL statements with a lot of joins. This can result in complex SQL translations. When you take a look at the datamodel (generated by the JPA) you will recognize that there are no indices for any of your table rows.
For example if you have a Customer and a Address object with an oneToOne relationship everything will work well on the first look. Customer and Address have an foreign key. But if you do selections like this
Select c from Customer as c where c.address.zip='8888'
you should take care about your table column 'zip' in the table ADDRESS. JPA will not create such an index for you during deployment. So in my case I was able to speed up the database performance by simply adding indices.
An SQL Statement in your database looks like this:
ALTER TABLE `mydatabase`.`ADDRESS` ADD INDEX `zip_index`(`IZIP`);

In the question, and in the other answers, I'm hearing a lot of "might"s and "maybe"s.
First find out what's going on. If you haven't done that, we're all just poking in the dark.
I'm no expert on this kind of system, but this method works on any language or OS.
When you find out what's making it take too long, why don't you summarize it here?
I'm especially interested to know if it was something that might have been guessed.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.