Exception while persisting a large object graph using JPA/Hibernate

We are creating a new web application backed by JPA to replace an old web application. As part of the migration we are converting the old application's database to a new, more sophisticated, JPA-managed database.
So I've written a 'script' that converts the old database to a set of JPA entities and subsequently saves them. It works like this:
1. Create an order of conversion based on the dependencies of the domain models
2. For each entity:
   a. Execute database query to legacy DB
   b. Store new object for each obtained table row in a list in memory
3. Iterate over generated lists in the same order as the conversion, and persist each entity.
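The first step (deriving a conversion order from the model dependencies) can be sketched as a topological sort. This is only an illustration, not the asker's actual script: the class name ConversionOrder and the string-based dependency map are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: orders entity types so each comes after the types it references. */
final class ConversionOrder {

    private ConversionOrder() {}

    /**
     * @param dependsOn maps an entity type name to the types it references,
     *                  e.g. "Chapter" -> ["Book"]
     * @return type names in an order that is safe for converting and persisting
     * @throws IllegalStateException if the dependencies contain a cycle
     */
    static List<String> order(Map<String, List<String>> dependsOn) {
        // Count unresolved dependencies per type and record who depends on whom.
        Map<String, Integer> remaining = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            remaining.putIfAbsent(entry.getKey(), 0);
            for (String dependency : entry.getValue()) {
                remaining.putIfAbsent(dependency, 0);
                remaining.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>())
                          .add(entry.getKey());
            }
        }

        // Kahn's algorithm: start with the types that reference nothing.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String type = ready.remove();
            ordered.add(type);
            for (String dependent : dependents.getOrDefault(type, Collections.<String>emptyList())) {
                if (remaining.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }

        if (ordered.size() != remaining.size()) {
            throw new IllegalStateException("Cyclic dependency between entity types");
        }
        return ordered;
    }
}
```

With Chapter depending on Book and Book depending on a (hypothetical) Publisher, the resulting order is Publisher, Book, Chapter. A cycle is reported up front instead of surfacing later as a transient-instance error.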
Now, the first two steps work well. Upon persisting, however, I get an exception. The exception occurs when one entity has a relation to another entity. For example, suppose one of our entities is a Book and another is a Chapter that defines a @ManyToOne(optional=false) relation to Book. Upon persisting the Chapter, Hibernate throws java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: Not-null property references a transient value - transient instance must be saved before current operation: models.Chapter.book -> models.Book.
Of course, this indicates that something is wrong with the state of the book: it seems it is either not set or has not yet been persisted. However, I can verify that the Book is set properly in the conversion of the Chapter, and I can also verify that all entities of type Book are persisted by the EntityManager before the entities of type Chapter get persisted. Obviously, my JPA provider does not behave as expected and does not truly persist my Book objects for some reason.
What solution would allow me to save the entire graph of objects that I have converted to the database? I use Hibernate as my JPA provider and I also use Spring 3.1 for injection of dependencies and EntityManagers.
EDIT 1: Some additional info: I've again verified that entityManager.persist() is called on each of the book objects before entityManager.persist() is called on the chapters. However, the id of the book object remains null, meaning it is not properly persisted. The database also remains empty, despite not using transactions.
EDIT 2: Because I don't think it's clear from the text above: the Book and Chapter story is just an example. It happens for any entity that references another entity. This makes it seem as if I'm not using JPA/Hibernate properly as opposed to not setting the values of my entities properly.
EDIT 3: The core issue seems to be that despite persisting Book properly, having all the right annotations, book.getId() remains null. Basically, Hibernate is not setting the ids on my entities after persisting them, leading to problems when I need to use those entities later.

I once battled with such an error from Hibernate myself. It turned out that a combination of a cycle in the object graph and the cascade settings caused the problem.
It has been a while, so the following might not be 100% accurate, but maybe it is enough information to track down your problem:
1. Hibernate wants to insert the chapter. Realizes it needs to insert the book first.
2. Wants to insert the book. Realizes it needs to insert another entity first (e.g. publisher).
3. Inserts publisher and performs cascades defined on publisher (e.g. authors).
4. Author has e.g. a reference to his latestBook. Because Hibernate internally already marked the book as processed (in step 2), you would now get an exception stating that author.book references a transient instance.
To find out whether this is your problem, you can enable full Hibernate debug logging and follow the path Hibernate takes through your object graph.

I've found the answer thanks to the discussion I've had with user1888440.
The solution to this problem was that the Spring @Transactional annotation was nonfunctional in my application. This meant that nothing Hibernate did occurred in the context of a transaction, so Hibernate did not set ids after persisting, and all conversions broke down as a result.
The reason why @Transactional did not work is probably a fact I did not mention: this script is part of a Play 2.0 (actually 2.1) app and is thus built using SBT. SBT doesn't use a normal Java setup to build an application, but instead uses the Scala compiler to compile Java as well. My guess is that the Scala compiler did not work well with the AspectJ that Spring requires to make @Transactional work.
Instead, I performed all of the database work involved in this conversion within a programmatically defined Spring transaction (section 11.6). Now everything behaves as expected.

Check the unsaved-value setting for your primary key/object ID in your hbm files. If Hibernate generates the IDs automatically and you also set the ID somewhere yourself, it would throw this error. By default the unsaved-value is 0, so if you set the ID to 0 you would see this error.

Sounds like you are forgetting to assign a Book to each Chapter before persisting it. Even if you have persisted the Book, it needs to be assigned to the book property of the Chapter instance before you can persist the Chapter. This is because you have specified the relationship as non-optional: book can never be null.

Related

update() and merge behave differently in case of updating an item in OneToMany collection

I have a class like the one below:
@Entity
@Table(name="work")
public class Work {
    @Id
    @Column(name="id")
    private String id;

    @OneToMany(orphanRemoval=true, mappedBy="work", cascade=CascadeType.ALL, fetch=FetchType.EAGER)
    private List<PersonRole> personRoleList;
}
As mine is a web application, when I update a personRoleList item (the change comes from the client) and call:
session.update(work); // `work` is in detached state
it does not update the existing personRoleList item; it actually adds a new one.
Some other people are also having the same problem. REF:
using-saveorupdate-in-hibernate-creates-new-records-instead-of-updating-existi
jpa-onetomany-not-deleting-child
I tried all the suggested solutions, but none of them worked for me.
But then I just tried:
session.merge(work); // replacing session.update(work)
And it works as expected!
This is where I get confused, because I can't find any explanation for this difference in behavior in the case of a OneToMany relationship (or maybe I missed it). I read some threads to understand the differences between update() and merge() and went through the docs. REF:
what-are-the-differences-between-the-different-saving-methods-in-hibernate
differences-among-save-update-saveorupdate-merge-methods-in-session
But it is still not clear to me: what behavioral pattern, logic, or steps create this difference?

Merge attempts to associate a currently transient object with a persistent object currently under management by the session by 'merging' them into one entity. Its intended use is when you have a detached object and an attached object and wish to resolve them.
In a merge(), Hibernate will read the entity from the database if there isn't already a managed instance in the session. In your example, this will result in Hibernate eagerly loading the collection (due to fetch=FetchType.EAGER). Then when your session ends, Hibernate will check for changes in the collection (due to cascade=CascadeType.ALL) and will perform the appropriate UPDATE in the database.
This differs from the update() scenario because in an update Hibernate always (by default) assumes the object is dirty and schedules an UPDATE. This update is likely what's causing creation of a new element in your collection - Hibernate hasn't looked in the database to bring the collection into session before issuing the UPDATE.
I'd bet you can get the desired behavior of update() by setting
select-before-update="true"
in your class mapping or by using the lock method to re-attach your object to the session before making changes.
From Chapter 9 of Java Persistence with Hibernate:
It doesn’t matter if the item object is modified before or after it’s passed to update(). The important thing here is that the call to update() is reattaching the detached instance to the new Session (and persistence context). Hibernate always treats the object as dirty and schedules an SQL UPDATE, which will be executed during flush. You can see the same unit of work in figure 9.8.
You may be surprised and probably hoped that Hibernate could know that you modified the detached item’s description (or that Hibernate should know you did not modify anything). However, the new Session and its fresh persistence context don’t have this information. Neither does the detached object contain some internal list of all the modifications you’ve made. Hibernate has to assume that an UPDATE in the database is needed. One way to avoid this UPDATE statement is to configure the class mapping of Item with the select-before-update="true" attribute. Hibernate then determines whether the object is dirty by executing a SELECT statement and comparing the object’s current state to the current database state.

What is the correct CascadeType in @ManyToMany Hibernate annotation?

I am trying to model a transient operations solution schema in Hibernate and I am unsure how to get the object graph and behavior I want from the model.
The table structure uses a correlation table (many-to-many) to create lists of users for the operation:
Operation    OperationUsers    Users
op_id        op_id             user_id
...          user_id           ...
In modeling the persistent class Operation.java using hibernate annotations, I created:
@ManyToMany(fetch=FetchType.LAZY)
@JoinColumn(name="op_id")
public List<User> users() { return userlist; }
So far, I have the following questions:
1. When a user is removed from the list, how do I avoid Hibernate deleting the user from the Users table? It should just be removed from the correlation table, not the Users table. I cannot see a valid CascadeType to accomplish this.
2. Do I need to put anything more in the method body?
3. Do I need to add more annotation arguments?
I am expecting to do this without futzing with the User class. Please tell me that I do not have to mess with User.java!
It's possible I'm overthinking this, but that's the nature of learning... Thanks in advance for any help you can offer!

From the documentation:
Hibernate defines and supports the following object states:
*Transient - an object is transient if it has just been instantiated using the new operator, and it is not associated with a Hibernate Session. It has no persistent representation in the database and no identifier value has been assigned. Transient instances will be destroyed by the garbage collector if the application does not hold a reference anymore. Use the Hibernate Session to make an object persistent (and let Hibernate take care of the SQL statements that need to be executed for this transition).
*Persistent - a persistent instance has a representation in the database and an identifier value. It might just have been saved or loaded, however, it is by definition in the scope of a Session. Hibernate will detect any changes made to an object in persistent state and synchronize the state with the database when the unit of work completes. Developers do not execute manual UPDATE statements, or DELETE statements when an object should be made transient.
*Detached - a detached instance is an object that has been persistent, but its Session has been closed. The reference to the object is still valid, of course, and the detached instance might even be modified in this state. A detached instance can be reattached to a new Session at a later point in time, making it (and all the modifications) persistent again. This feature enables a programming model for long running units of work that require user think-time. We call them application transactions, i.e., a unit of work from the point of view of the user.
As explained in this answer, you can detach your entity using Session.evict() to prevent hibernate from updating the database or simply clone it and make the needed changes on the copy.

It turns out that the specific answer to my primary question (#1 and the main topic) is: "Do not specify any CascadeType on the property."
The answer is mentioned sorta sideways in the answer to this question.

Should Hibernate Session#merge do an insert when receiving an entity with an ID?

This seems like it would come up often, but I've Googled to no avail.
Suppose you have a Hibernate entity User. You have one User in your DB with id 1.
You have two threads running, A and B. They do the following:
1. A gets user 1 and closes its Session
2. B gets user 1 and deletes it
3. A changes a field on user 1
4. A gets a new Session and merges user 1
All my testing indicates that the merge attempts to find user 1 in the DB (it can't, obviously), so it inserts a new user with id 2.
My expectation, on the other hand, would be that Hibernate would see that the user being merged was not new (because it has an ID). It would try to find the user in the DB, which would fail, so it would not attempt an insert or an update. Ideally it would throw some kind of concurrency exception.
Note that I am using optimistic locking through @Version, and that does not help matters.
So, questions:
1. Is my observed Hibernate behaviour the intended behaviour?
2. If so, is it the same behaviour when calling merge on a JPA EntityManager instead of a Hibernate Session?
3. If the answer to 2. is yes, why is nobody complaining about it?

Please see the text from hibernate documentation below.
Copy the state of the given object onto the persistent object with the same identifier. If there is no persistent instance currently associated with the session, it will be loaded. Return the persistent instance. If the given instance is unsaved, save a copy of and return it as a newly persistent instance.
It clearly states that merge copies the state (data) of the given object onto the persistent object loaded from the database; if the object is not there, it saves a copy of that data. When it saves a copy, Hibernate always creates a record with a new identifier.
Hibernate's merge function works roughly as follows:
1. It checks the status (attached to or detached from the session) of the entity and finds it detached.
2. Then it tries to load the entity by its identifier, but does not find it in the database.
3. As the entity is not found, it treats the entity as transient.
4. A transient entity always creates a new database record with a new identifier.
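The steps above can be mirrored with a toy in-memory model. This is not Hibernate code and says nothing about Hibernate's internals: ToyUser, ToyStore, and mergeOrFail are hypothetical names, and the map merely stands in for the database. It reproduces the decision sequence described here (a missing row leads to an insert with a fresh id) next to a stricter variant that fails when an instance carrying an id no longer has a row.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical entity: a null id means "never saved". */
final class ToyUser {
    Long id;
    String name;

    ToyUser(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Toy stand-in for a session plus database; NOT the Hibernate implementation. */
final class ToyStore {
    private final Map<Long, ToyUser> rows = new HashMap<>();
    private long nextId = 1;

    /** Mirrors the described merge: copy state if the row exists, otherwise insert as new. */
    ToyUser merge(ToyUser detached) {
        ToyUser managed = detached.id == null ? null : rows.get(detached.id);
        if (managed == null) {
            // Row not found: the instance is treated as transient and gets a fresh id.
            managed = new ToyUser(nextId++, detached.name);
            rows.put(managed.id, managed);
        } else {
            // Row found: copy the detached state onto the managed instance.
            managed.name = detached.name;
        }
        return managed;
    }

    /** Stricter variant: an instance that carries an id must still have a row. */
    ToyUser mergeOrFail(ToyUser detached) {
        if (detached.id != null && !rows.containsKey(detached.id)) {
            throw new IllegalStateException("Row " + detached.id + " was deleted concurrently");
        }
        return merge(detached);
    }

    void delete(long id) {
        rows.remove(id);
    }
}
```

In this model, saving a user (id 1), deleting it, and then merging a detached copy that still carries id 1 yields a new row with id 2, which matches the behaviour reported in the question. mergeOrFail raises an error in the same situation instead.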
Locking is always applied to attached entities. If an entity is detached, Hibernate will always load it, and the version value gets updated.
Locking is used to control concurrency problems. This is not a concurrency issue.

I've been looking at JSR-220, from which Session#merge claims to get its semantics. The JSR is sadly ambiguous, I have found.
It does say:
Optimistic locking is a technique that is used to insure that updates
to the database data corresponding to the state of an entity are made
only when no intervening transaction has updated that data since the
entity state was read.
If you take "updates" to include general mutation of the database data, including deletes, and not just a SQL UPDATE, which I do, I think you can make an argument that the observed behaviour is not compliant with optimistic locking.
Many people agree, given the comments on my question and the subsequent discovery of this bug.
From a purely practical point of view, the behaviour, compliant or not, could lead to quite a few bugs, because it is contrary to many developers' expectations. There does not seem to be an easy fix for it. In fact, Spring Data JPA seems to ignore this issue completely by blindly using EM#merge. Maybe other JPA providers handle this differently, but with Hibernate this could cause issues.
I'm actually working around this by using Session#update currently. It's really ugly, and requires code to handle the case when you try to update an entity that is detached, and there's a managed copy of it already. But, it won't lead to spurious inserts either.

1.Is my observed Hibernate behaviour the intended behaviour?
The behavior is correct. You are just trying to do operations that are not protected against concurrent data modification :) If you have to split the operation across two sessions, find the object to update again and check whether it is still there; throw an exception if it is not. If it is there, lock it by using em.find(class, primary key, LockModeType), or use @Version or @Entity(optimisticLock=OptimisticLockType.ALL/DIRTY/VERSION) to protect the object until the end of the transaction.
2.If so, is it the same behaviour when calling merge on a JPA EntityManager instead of a Hibernate Session?
Probably: yes
3.If the answer to 2. is yes, why is nobody complaining about it?
Because if you protect your operations using pessimistic or optimistic locking the problem will disappear:)
The problem you are trying to solve is called: Non-repeatable read

JPA merge in a RESTful web application with DTOs and Optimistic Locking?

My question is this: Is there ever a role for JPA merge in a stateless web application?
There is a lot of discussion on SO about the merge operation in JPA. There is also a great article on the subject which contrasts JPA merge with a more manual Do-It-Yourself process (where you find the entity via the entity manager and make your changes).
My application has a rich domain model (à la domain-driven design) that uses the @Version annotation in order to make use of optimistic locking. We have also created DTOs to send over the wire as part of our RESTful web services. The creation of this DTO layer also allows us to send to the client everything it needs and nothing it doesn't.
So far, I understand this is a fairly typical architecture. My question is about the service methods that need to UPDATE (i.e. HTTP PUT) existing objects. In this case we have these two approaches 1) JPA Merge, and 2) DIY.
What I don't understand is how JPA merge can even be considered an option for handling updates. Here's my thinking and I am wondering if there is something I don't understand:
1) In order to properly create a detached JPA entity from a wire DTO, the version number must be set correctly...else an OptimisticLockException is thrown. But the JPA spec says:
An entity may access the state of its version field or property or
export a method for use by the application to access the version, but
must not modify the version value[30]. Only the persistence provider
is permitted to set or update the value of the version attribute in
the object.
2) Merge doesn't handle bi-directional relationships ... the back-pointing fields always end up as null.
3) If any fields or data is missing from the DTO (due to a partial update), then the JPA merge will delete those relationships or null-out those fields. Hibernate can handle partial updates, but not JPA merge. DIY can handle partial updates.
4) The first thing the merge method will do is query the database for the entity ID, so there is no performance benefit over DIY to be had.
5) In a DIY update, we load the entity and make the changes according to the DTO -- there is no call to merge or to persist for that matter because the JPA context implements the unit-of-work pattern out of the box.
Do I have this straight?
Edit:
6) Merge behavior with regards to lazy loaded relationships can differ amongst providers.

Using Merge does require you to either send and receive a complete representation of the entity, or maintain server side state. For trivial CRUD-y type operations, it is easy and convenient. I have used it plenty in stateless web apps where there is no meaningful security hazard to letting the client see the entire entity.
However, if you've already reduced operations to only passing the immediately relevant information, then you need to also manually write the corresponding services.
Just remember that when doing your 'DIY' update you still need to pass a Version number around on the DTO and manually compare it to the one that comes out of the database. Otherwise you don't get the Optimistic Locking that spans 'user think-time' that you would have if you were using the simpler approach with merge.
You can't change the version on an entity created by the provider, but when you have made your own instance of the entity class with the new keyword it is fine and expected to set the version on it.
It will make the persistent representation match the in-memory representation you provide, this can include making things null. Remember when an object is merged that object is supposed to be discarded and replaced with the one returned by merge. You are not supposed to merge an object and then continue using it. Its state is not defined by the spec.
True.
Most likely, as long as your DIY solution is also using the entity ID and not an arbitrary query. (There are other benefits to using the 'find' method over a query.)
True.
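The DIY update with a version number carried on the DTO, as described above, might look roughly like the following. This is a plain-Java sketch under stated assumptions: Account, AccountDto, AccountService, and StaleUpdateException are hypothetical names, and the in-memory map stands in for loading the entity by id through the entity manager. With a real JPA provider, the version field would be annotated with @Version and incremented by the provider at flush time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical entity; "version" stands in for an @Version field. */
final class Account {
    final long id;
    long version;
    String email;
    String displayName;

    Account(long id, long version, String email, String displayName) {
        this.id = id;
        this.version = version;
        this.email = email;
        this.displayName = displayName;
    }
}

/** Hypothetical wire DTO; a null field means "not part of this partial update". */
final class AccountDto {
    long id;
    long version;
    String email;
    String displayName;
}

/** Signals that the client worked on data that has since changed or disappeared. */
final class StaleUpdateException extends RuntimeException {
    StaleUpdateException(String message) {
        super(message);
    }
}

final class AccountService {
    // Stand-in for the database; a real service would load the entity by id instead.
    private final Map<Long, Account> table = new HashMap<>();

    void save(Account account) {
        table.put(account.id, account);
    }

    Account update(AccountDto dto) {
        Account current = table.get(dto.id);
        if (current == null) {
            // Update-vs-delete conflict: fail instead of silently inserting a new row.
            throw new StaleUpdateException("Account " + dto.id + " no longer exists");
        }
        if (current.version != dto.version) {
            // Manual comparison that spans the user's think-time.
            throw new StaleUpdateException("Account " + dto.id + " was modified concurrently");
        }
        // Partial update: only fields present on the DTO are applied.
        if (dto.email != null) {
            current.email = dto.email;
        }
        if (dto.displayName != null) {
            current.displayName = dto.displayName;
        }
        current.version++; // a JPA provider would do this itself at flush
        return current;
    }
}
```

The sketch leaves fields that are missing from the DTO untouched, and it rejects both a stale version and a row that has been deleted in the meantime.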

I would add:
7) Merge translates to insert or to update depending on the existence of the record on DB, hence it does not deal correctly with update-vs-delete optimistic concurrency. That is, if another user concurrently deletes the record and you update it, it must (1) throw a concurrency exception... but it does not, it just inserts the record as new one.
(1) At least, in most cases, in my opinion, it should. I can imagine some cases where I would want this use case to trigger a new insert, but they are far from usual. At least, I would like the developer to think twice about it, not just accept that "merge() == updateWithConcurrencyControl()", because it is not.

merging / re-attaching IN JPA / Hibernate without updating the DB

Working with JPA / Hibernate in an OSIV Web environment is driving me mad ;)
Following scenario: I have an entity A that is loaded via JPA and has a collection of B entities. Those B entities have a required field.
When the user adds a new B to A by pressing a link in the webapp, that required field is not set (since there is no sensible default value).
Upon the next HTTP request, the OSIV filter tries to merge the A entity, but this fails as Hibernate complains that the new B has a required field that is not set.
javax.persistence.PersistenceException: org.hibernate.PropertyValueException: not-null property references a null or transient value
Reading the JPA spec, I see no sign that those checks are required in the merge phase (I have no transaction active).
I can't keep the collection of B's outside of A and only add them to A when the user presses 'save' (aka entitymanager.persist()), as the place where the save button is does not know about the B's, only about A.
Also, A and B are only examples; I have similar stuff all over the place.
Any ideas? Do other JPA implementations behave the same here?
Thanks in advance.

I did a lot of reading and testing. The problem comes from my misunderstanding of JPA / Hibernate. merge() always does a hit on the DB and also schedules an update for the entity. I did not find any mention of this in the JPA spec, but the 'Java Persistence with Hibernate' book does mention it.
Looking through the EntityManager (and Session as fallback) API, it looks as if there is no means of just assigning an entity to the current persistence context WITHOUT scheduling an update. After all, what I want is to navigate the object graph, changing properties as needed, and trigger an update (with version check if needed) later on. Something I think every webapp out there using ORM must do?
The basic workflow I'm looking for:
1. load an entity from the DB (or create a new one)
2. let the entity (and all its associations) become detached (as the EntityManager closes at the end of an HTTP request)
3. when the next HTTP request comes in, work again with those objects, navigating the tree without fear of LazyInitExceptions
4. call a method that persists all changes made during 1-3
With the OSIV filter from Spring in conjunction with an IModel implementation from Wicket, I thought I had achieved this.
I basically see 2 possible ways out of it:
a) load the entity and all the associations needed when entering a certain page (use case), letting them become detached, adding/changing them as needed in the course of several HTTP requests. Then reattach them when the user initiates a save (validators will ensure a valid state) and submit them to the database.
b) use the current setup, but make sure that all newly added entities have all their required fields set (probably using some wizard components). I would still have all the updates to the database for every merge(), but hopefully the database admin won't realize ;)
How do other people work with JPA in a web environment? Any other options for me?
