Correct use of flush() in JPA/Hibernate - java

I was gathering information about the flush() method, but I'm not quite clear when to use it and how to use it correctly. From what I read, my understanding is that the contents of the persistence context will be synchronized with the database, i. e. issuing outstanding statements or refreshing entity data.
Now I got following scenario with two entities A and B (in a one-to-one relationship, but not enforced or modelled by JPA). A has a composite PK, which is manually set, and also has an auto-generated IDENTITY field recordId. This recordId should be written to entity B as a foreign-key to A. I'm saving A and B in a single transaction. The problem is that the auto-generated value A.recordId is not available within the transaction, unless I make an explicit call of em.flush() after calling em.persist() on A. (If I have an auto-generated IDENTITY PK then the value is directly updated in the entity, but that's not the case here.)
Can em.flush() cause any harm when using it within a transaction?

Probably the exact details of em.flush() are implementation-dependent.
In general anyway, JPA providers like Hibernate can cache the SQL instructions they are supposed to send to the database, often until you actually commit the transaction.
For example, you call em.persist(), Hibernate remembers it has to make a database INSERT, but does not actually execute the instruction until you commit the transaction. Afaik, this is mainly done for performance reasons.
In some cases anyway you want the SQL instructions to be executed immediately; generally when you need the result of some side effects, like an autogenerated key, or a database trigger.
What em.flush() does is to empty the internal SQL instructions cache, and execute it immediately to the database.
Bottom line: no harm is done, only you could have a (minor) performance hit since you are overriding the JPA provider decisions as regards the best timing to send SQL instructions to the database.

Can em.flush() cause any harm when using it within a transaction?
Yes, it may hold locks in the database for a longer duration than necessary.
Generally, When using JPA you delegates the transaction management to the container (a.k.a CMT - using #Transactional annotation on business methods) which means that a transaction is automatically started when entering the method and commited / rolled back at the end. If you let the EntityManager handle the database synchronization, sql statements execution will be only triggered just before the commit, leading to short lived locks in database. Otherwise your manually flushed write operations may retain locks between the manual flush and the automatic commit which can be long according to remaining method execution time.
Notes that some operation automatically triggers a flush : executing a native query against the same session (EM state must be flushed to be reachable by the SQL query), inserting entities using native generated id (generated by the database, so the insert statement must be triggered thus the EM is able to retrieve the generated id and properly manage relationships)

Actually, em.flush(), do more than just sends the cached SQL commands. It tries to synchronize the persistence context to the underlying database. It can cause a lot of time consumption on your processes if your cache contains collections to be synchronized.
Caution on using it.

Related

JPA flush vs commit

in JPA, if we call EntityTransaction.commit(), does it automatically call EntityManager.flush()? or should we call them both? what is the difference? because i have problem with JPA, when i insert an entity to database, i call persist(). in the database, the data has been inserted (can be fetched), but that data doesn't show up in my app (i fetch it using findAll()). and on another entity, it showed up. is there something i don't know? i'm using standard Spring CRUD, JPA resource_local, and postgresql. sorry for my english, thanks in advance
if we call EntityTransaction.commit(), does it automatically call
EntityManager.flush()?
Yes
what is the difference?
In flush() the changes to the data are reflected in database after encountering flush, but it is still in transaction.flush() MUST be enclosed in a transaction context and you don't have to do it explicitly unless needed (in rare cases), when EntityTransaction.commit() does that for you.
Source
em.flush() - It saves the entity immediately to the database with in a transaction to be used further and it can be rolled back.
em.getTransaction().commit - It marks the end of transaction and saves all the changes with in the transaction into the database and it can't be rolled back.
Refer https://prismoskills.appspot.com/lessons/Hibernate/Chapter_14_-_Flush_vs_Commit.jsp
If you have a #Version annotated column in your entity and call entityManager.flush(), then you will either (immediately!) get an OptimisticLockException, or the database will lock this row (or table). In the later case you can still call setRollbackOnly(), and the lock will later be released without a DB change.
Or from a different perspective, with flush() you can create a (pessimistic) lock on that database row. The others will still see the old entry, but if they try to update they will be blocked, until the lock is released.
All this is also true for CMT (container managed transactions). Instead of waiting for the moment, where the service method is finished and the CMT commit is performed, you can call flush() (even several times) in your service method and handle the OptimisticLockException(s) immediately.

JPA transaction combination of readonly and read uncommitted

Using Spring transactions, JPA with Hibernate implementation.
I have marked a method as:
#Transactional(readOnly=true, isolation=Isolation.READ_UNCOMMITTED)
Does this combination of readOnly and read uncommitted valid? I am using this on a method who's native sql is a select statement for a report page. First, I am marking it as read uncommitted so that it does not get stuck waiting on the table which gets frequently updated, as user wants to generate report even during processing time (a warning is shown if this happens). Second, I am marking as readonly to tell JPA not to keep the entities in the persistent context.
Is this a correct understanding?
Since you need intermediate data during the processing time, READ_UNCOMMITTED isolation level makes sense.
ReadOnly transaction will make sure that all the sql statements in that particular transaction are Select statements and if it finds any Insert/Update sql statements, it will immediately throw an error.
I think ReadOnly is just an additional check to make sure that your are not updating any data during a particular transaction.

What does EntityManager.flush do and why do I need to use it?

I have an EJB where I am saving an object to the database. In an example I have seen, once this data is saved (EntityManager.persist) there is a call to EntityManager.flush(); Why do I need to do this? The object I am saving is not attached and not used later in the method. In fact, once saved the method returns and I would expect the resources to be released. (The example code does this on a remove call as well.)
if (somecondition) {
entityManager.persist(unAttachedEntity);
} else {
attachedEntityObject.setId(unAttachedEntity.getId());
}
entityManager.flush();
A call to EntityManager.flush(); will force the data to be persisted in the database immediately as EntityManager.persist() will not (depending on how the EntityManager is configured: FlushModeType (AUTO or COMMIT) by default is set to AUTO and a flush will be done automatically. But if it's set to COMMIT the persistence of the data to the underlying database will be delayed until the transaction is committed.
EntityManager.persist() makes an entity persistent whereas EntityManager.flush() actually runs the query on your database.
So, when you call EntityManager.flush(), queries for inserting/updating/deleting associated entities are executed in the database. Any constraint failures (column width, data types, foreign key) will be known at this time.
The concrete behaviour depends on whether flush-mode is AUTO or COMMIT.
So when you call EntityManager.persist(), it only makes the entity get managed by the EntityManager and adds it (entity instance) to the Persistence Context. An Explicit flush() will make the entity now residing in the Persistence Context to be moved to the database (using a SQL).
Without flush(), this (moving of entity from Persistence Context to the database) will happen when the Transaction to which this Persistence Context is associated is committed.
The EntityManager.flush() operation can be used the write all changes to the database before the transaction is committed. By default JPA does not normally write changes to the database until the transaction is committed. This is normally desirable as it avoids database access, resources and locks until required. It also allows database writes to be ordered, and batched for optimal database access, and to maintain integrity constraints and avoid deadlocks. This means that when you call persist, merge, or remove the database DML INSERT, UPDATE, DELETE is not executed, until commit, or until a flush is triggered.
EntityManager.flush() sends actual SQL commands to DB.
Persistence framework usually manages transactions behind the scene. So there is no guaranty that flushed queries will be successfully committed.
EntityManager.flush() is always called just before the transaction commit by persistence framework automatically behind the scene.
EntityManager.persist() only registers an entity to persistence context without sending any SQL statements to DB. That means you won't get auto generated IDs after persist(). You just pass persisted object and eventually after later flush() it gets ID. You can call flush() yourself to get those IDs earlier. But again it is not the end of transaction so it can be rolled back and even more: changes might be seen (via phantom reads) by other transactions/thread/processes/servers and then disappear depending on DB engine and isolation level of current and foreign transaction!

Does a Hibernate transaction rollback delete "session.flush()"ed entities?

I have been confused about transaction.rollback. Here is example pseudocode:
transaction = session.beginTransaction()
EntityA a = new EntityA();
session.save(a);
session.flush();
transaction.rollback();
What happens when this code works? Do I have the entity in the database or not?
Short answer: No, you won't have entity in the database.
Longer answer: hibernate is smart enough not to send insert/updates to the DB until it knows if the transaction is going to be committed or rolled back (although this behavior can be changed by setting a different FlushMode), in your case by calling flush you are forcing the SQL to be sent to the DB but you still have the DB transaction to protect you, when you call rollback the DB transaction will be rolled back removing the changes performed inside itself and hence nothing will be actually saved. Note that depending on your configured transaction isolation level perhaps other transactions will be able to see in some way the EntityA you saved for the short while between the save and the rollback.
Also note that flush is called automatically when you try to read from DB, in 99% of the cases calling it explicitly is not necessary. One exception that comes to mind is when unit testing with auto rolling back tests.
When you call session.save(a) Hibernate basically remembers somewhere inside session that this object has to be saved. It can decide if he wants to issue INSERT INTO... immediately, some time later or on commit. This is a performance improvement, allowing Hibernate to batch inserts or avoid them if transaction is rolled back.
When you call session.flush(), Hibernate is forced to issue INSERT INTO... against the database. The entity is stored in the database, but not yet commited. Depending on transaction isolation level it won't be seen by other running transactions. But now the database knows about the record.
When you call transaction.rollback(), Hibernate rolls-back the database transaction. Database handles rollback, thus removing newly created object.
Now consider the scenario without flush(). First of all, you never touch the database so the performance is better and rollback is basically a no-op. On the other hand if transaction isolation level is READ UNCOMMITTED, other transactions can see inserted record even before commit/rollback. Without flush() this won't happen, unless Hibernate does not decide to flush() implicitly.
I think you get confused with flush and commit.
flush() synchronizes the state with the database, but it is not doing a commit. The state is still visible by transaction, so that you can call rollback to rollback.
So the answer to your question is: no, you don't have the entity (a) in the database.

How long are Entities persited?

I am trying to understand how JPA works. From what I know, if you persist an Entity, that object will remain in the memory until the application is closed. This means, that when I look for a previously persisted entity, there will be no query made on the database. Assuming that no insert, update or delete is made, if the application runs long enough, all the information in it might become persistent. Does this mean that at some point, I will no longer need the database?
Edit
My problem is not with the database. I am sure that the database can not be modified from outside the application. I am managing transactions by myself, so the data gets stored in the database as soon as I commit. My question is: What happens with the entities after I commit? Are they kept in the memory and act like a cache? If so, how long are they kept there? After I commit a persist, I make a select query. This select should return the object I persisted before. Will that object be brought from memory, or will the application query the database?
Not really. Think about it.
Your application probably isn't the only thing that will use the database. If an entity was persisted once and stored in memory, how can you be sure that, let's say, one hour later, it won't be changed by some other means? If that happens, you will have stale data that can harm logic of your application.
Storing data in memory and hoping that everything will be alright won't bring any benefits. That's why data stored in database is your primary source of information, and you should query it every time, unless you are absolutely sure that a subset of data won't change.
When you persist an entity an entity this will add it to the persistence context which acts like a first level cache (this is in-memory). When the actual persisting happens depends on whether you use container managed transactions or deal with transactions yourself. The entity instance will live in memory as long as the transaction is not commited, and when it is it will be persisted to the database or XML etc.
JPA can't work with only the persistence context (L1 cache) or the explicit cache (L2 cache). It always needs to be combined with a datasource, and this datasource typically points to a database that persists to stable storage.
So, the entity is in memory only as long as the transaction (which is required for JPA persist operations) isn't committed. After that it's send to the datasource.
If the transaction manager is transaction scoped (the 'normal' case) then the L1 cache (the persistence context) is closed and the entities do not longer exist there. If the L1 cache somehow bothers you, you can manage it a bit explicitly. There are operations to clear it and you could separate your read operations (which don't need transactions) from write operations. If there's no transaction active when reading, there's no persistence context, an entity becomes never attached and is thus never put into this L1 cache.
The L2 cache however is not cleared when the transaction commits and entities inside it remain available for the entire application. This L2 cache must be explicitly configured and you as an application developer must indicate which entities should be cached in it. Via vendor specific mechanisms (e.g. JBoss Cache, Infinispan) you can put a max on the number of entities being cached and set/define so-called eviction policies.
Of course, nothing prevents you from letting the datasource point to an in-memmory embedded DB, but this is outside the knowledge of JPA.
Persistence means in short terms: you can shut down your app, and the data is not lost.
To achieve that you need a database or some sort of saving data in a way that it's not lost when you shut down the app.
To "persist" an entity means to actually save it in the data base. Sure, JPA maintains some entity information in memory in the persistence context (and this is highly dependent on configuration and programming practices), but at certain point information will be stored in the data base - for instance, when a transaction commits, or likely (but not necessarily) after flush() or merge() operations.
If you want to keep your entities after committing and for a select query, you need to use the query cache. Just Google around on that term and it should be clear to you.

Categories