When calling EntityManager.flush(), will it flush the second-level cache too?

When calling EntityManager.flush(), will it flush the second level cache too? I tried Googling and I also tried flushing it and it seems like it does, but it would be good to have it confirmed.
Edit: Now it does not seem like it flushes the second-level cache.

JPA has no notion of a second-level cache (it isn't part of the spec). So the behavior of the second-level cache depends entirely upon the JPA provider. What are you using: Hibernate, EclipseLink, OpenJPA?
Update: I stand partially corrected; JPA 2.0 introduces a few options to control second-level cache usage (like @Cacheable).

The L2 cache ought (by default, in any sensible JPA implementation) to be updated at commit, not flush, but this is not mandated in the JPA 2 spec, so you're down to implementation specifics. DataNucleus certainly only updates it at commit. If an L2 cache were updated at flush and those objects' changes were then rolled back, you could end up reading invalid/non-persistent data. Some implementations may allow that as an option.
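For what it's worth, JPA 2.0 exposes the shared cache via javax.persistence.Cache on the EntityManagerFactory, which makes it easy to observe your provider's behaviour around flush and commit yourself. A minimal sketch, assuming a persistence unit named "my-unit" and a hypothetical MyEntity class:

```java
import javax.persistence.Cache;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class SecondLevelCacheProbe {
    public static void main(String[] args) {
        // "my-unit" and MyEntity are placeholders for your own persistence unit and entity
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");
        Cache cache = emf.getCache(); // JPA 2.0 view of the shared (second-level) cache

        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();
        MyEntity entity = em.find(MyEntity.class, 1L);
        entity.setName("changed");    // setName is a hypothetical setter
        em.flush();                   // provider-specific whether the L2 cache sees this now

        System.out.println("in L2 after flush?  " + cache.contains(MyEntity.class, 1L));
        em.getTransaction().commit();
        System.out.println("in L2 after commit? " + cache.contains(MyEntity.class, 1L));

        cache.evict(MyEntity.class, 1L); // or cache.evictAll() to clear the whole shared cache
        em.close();
        emf.close();
    }
}
```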

Related

Spring Transaction Isolation Level

Most of us might be using Spring and Hibernate for data access.
I am trying to understand a few of the internals of the Spring Transaction Manager.
According to Spring API, it supports different Isolation Level - doc
But I couldn't find clear-cut information on which occasions these are really helpful for gaining performance improvements.
I am aware that the readOnly parameter of Spring transactions can help us use different TxManagers for read-only data and so get good performance. But it locks the table to get the data, to avoid dirty reads / non-committed reads - doc.
Assume that on a few occasions we might want to blindly insert records into a table and retrieve the information without locking the table, a case where we never update the table data; we just insert and read [append-only]. Can we use a better isolation level to gain any performance?
As you see from one of the reference links, do we really need to implement/write our own CustomJPADialect?
What's the better Isolation for my requirement?
Read-only allows certain optimizations, like disabling dirty checking, and you should totally use it when you don't plan on changing an entity.
Each isolation level defines how much locking a database has to impose to prevent data anomalies.
Most databases use MVCC (Oracle, PostgreSQL, MySQL), so readers don't block writers and writers don't block readers; only writers block writers.
REPEATABLE_READ doesn't have to hold a lock to prevent a concurrent transaction from modifying the rows your current transaction has loaded. The MVCC engine allows other transactions to read the committed state of a row even if your current transaction has changed it but hasn't yet committed (MVCC uses the undo logs to recover the previous version of a pending changed row).
In your use case you should use READ_COMMITTED, as it scales better than the stricter isolation levels, and you should use optimistic locking to prevent lost updates in long conversations.
Update
Setting @Transactional(isolation = Isolation.SERIALIZABLE) on a Spring bean has a different behaviour depending on the current transaction type:
For RESOURCE_LOCAL transactions, the JpaTransactionManager can apply the specific isolation level for the current running transaction.
For JTA resources, the transaction-scoped isolation level doesn't propagate to the underlying database connection, as this is the default JTA transaction manager behavior. You could override this, following the example of the WebLogicJtaTransactionManager.
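As a rough sketch of the RESOURCE_LOCAL case above, the isolation level can be declared per method on a Spring bean, and a @Version field gives you the optimistic locking mentioned earlier; the LogEntry entity and LogService bean are made up for illustration:

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;
import javax.persistence.Version;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

@Entity
class LogEntry {                 // made-up append-only entity
    @Id Long id;
    String message;
    @Version Long version;       // optimistic locking guards against lost updates
}

@Service
class LogService {

    @PersistenceContext
    private EntityManager em;

    // With JpaTransactionManager (RESOURCE_LOCAL) this isolation level is applied
    // to the underlying JDBC connection for the duration of the transaction.
    @Transactional(isolation = Isolation.READ_COMMITTED)
    public void append(LogEntry entry) {
        em.persist(entry);
    }
}
```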
Actually readOnly=true doesn't cause any lock contention on the database table, because simply no locking is required - the database is able to revert back to previous versions of the records, ignoring all new changes.
With readOnly set to true, you will have the flush mode as FlushMode.NEVER in the current Hibernate Session, preventing the session from committing the transaction. In addition, setReadOnly(true) will be called on the JDBC Connection, which is also a hint to the underlying database not to commit changes.
So readOnly=true is exactly what you are looking for (rather than, say, a SERIALIZABLE isolation level).
Here is a good explanation.
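A hedged illustration of how readOnly=true is typically used; behind the scenes Spring marks the Hibernate session so nothing is flushed and hints the JDBC connection as read-only (the ReportService bean and the Event entity in the query are invented):

```java
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
class ReportService {

    @PersistenceContext
    private EntityManager em;

    // readOnly = true: Hibernate's flush mode is set so nothing is flushed,
    // dirty checking can be skipped, and Connection.setReadOnly(true) is a
    // hint to the driver/database that no changes will be written.
    @Transactional(readOnly = true)
    public List<Object[]> dailyTotals() { // made-up report query
        return em.createQuery("select e.day, count(e) from Event e group by e.day", Object[].class)
                 .getResultList();
    }
}
```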

Is there a stateless version of the JPA EntityManager?

Hibernate has a stateless version of its Session: does something similar exist for the JPA EntityManager? I.e. an EntityManager that does not use the first-level cache?
From the JPA point of view:
javax.persistence.EntityManager stands for the 1st-level cache (persistence context, transactional cache)
javax.persistence.EntityManagerFactory stands for the 2nd-level cache (shared cache)
A given persistence provider may implement additional caching layers. Additionally, the JDBC driver API may be treated as a low-level cache for storing columns/tables and caching connections/statements; this is, however, transparent to JPA.
Both javax.persistence.EntityManager and org.hibernate.StatelessSession offer similar APIs.
You cannot disable the 1st-level cache with EntityManager because these two things are equivalent. You can however:
skip the 1st-level cache by using createQuery, createNamedQuery, createNativeQuery for querying and for bulk updates/deletes (the persistence context is not updated to reflect their results). Such queries should be executed in their own transaction, thus invalidating any cached entities, if any. A transaction-scoped (i.e. stateless) entity manager should be used as well (see the sketch after this list).
disable the 2nd-level cache by setting <shared-cache-mode>NONE</shared-cache-mode> in persistence.xml or the javax.persistence.sharedCache.mode property
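Here is a small sketch of the bypass route from the first point, assuming a hypothetical Customer entity and a persistence unit named "my-unit"; the JPQL bulk update goes straight to the database and is never reflected in the persistence context:

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class BulkUpdateExample {
    public static void main(String[] args) {
        // "my-unit" and Customer are placeholders for your own unit/entity
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");
        EntityManager em = emf.createEntityManager();

        em.getTransaction().begin();
        // Bulk update: executed directly against the database; the 1st-level
        // cache (persistence context) is NOT updated with the new state.
        int updated = em.createQuery(
                "update Customer c set c.status = 'ARCHIVED' where c.status = :old")
            .setParameter("old", "INACTIVE")
            .executeUpdate();
        em.getTransaction().commit();
        System.out.println("rows updated: " + updated);

        // Any Customer instances already managed by `em` are now stale;
        // clear the context (or refresh individual entities) before reusing it.
        em.clear();

        em.close();
        emf.close();
    }
}
```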
Not part of the JPA API or spec. Individual implementations may allow disabling the L1 cache. DataNucleus JPA, the one I have used, does allow this
From an interface point of view, RDBMSs usually respect the ACID constraints, so a stateless option would be very implementation-specific. I guess this is the reason why Hibernate proposes this feature but the specification does not.
To disable the cache, there are implementation-specific configurations (here is the doc for EclipseLink). The @Cacheable annotation (JPA 2.0) at the entity level is standard.
But if you would like to perform bulk operations, this would not do the job. Anyway, such a behavior would be implementation-specific.
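Since the question mentions Hibernate's StatelessSession, here is a rough sketch of what using it looks like when you unwrap the native SessionFactory from JPA; the AuditRecord entity is invented and the unwrap step assumes Hibernate is the provider:

```java
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;
import org.hibernate.Transaction;

public class StatelessInsertExample {
    public static void main(String[] args) {
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");
        // Unwrapping to the native SessionFactory only works with Hibernate as the provider
        SessionFactory sf = emf.unwrap(SessionFactory.class);

        // A StatelessSession has no persistence context (no 1st-level cache),
        // no dirty checking and no cascading; every operation goes straight to the DB.
        StatelessSession ss = sf.openStatelessSession();
        Transaction tx = ss.beginTransaction();
        for (int i = 0; i < 10_000; i++) {
            AuditRecord r = new AuditRecord(); // made-up entity
            r.setMessage("row " + i);          // hypothetical setter
            ss.insert(r);                      // immediate INSERT, nothing is cached
        }
        tx.commit();
        ss.close();
        emf.close();
    }
}
```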

What's the difference between L1 and L2 caches in web-applications with Hibernate as ORM mechanism?

I just want some general info about standard purpose of using L1 cache and L2 cache.
I'm curious because I'm investigating a system with Terracotta as the 2nd-level cache and I've found that it also has a 1st-level cache.
L1 Cache is the cache that exists per Hibernate session, and this cache is not shared among threads. This cache makes use of Hibernate's own caching.
L2 Cache is a cache that survives beyond a Hibernate session and can be shared among threads. For this cache you can use either a caching implementation that comes with Hibernate, like EHCache, or something else, like JBoss Cache 2.
In JPA/Hibernate (and other similar ORM tools), the L1 cache is the transactional cache i.e. the entities stored from when you open a transaction to when you close it. This is almost never a shared cache (other threads can't make use of it). In JPA, this would usually be held by the EntityManager.
The L2 cache is a full (typically) shared cache. If you have multiple threads/queries pulling in data, then they can make use of entities that have already been retrieved by other threads that are still live in the cache. In JPA, this would usually be held by the EntityManagerFactory.
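A small illustration of that split, assuming a hypothetical Product entity, a persistence unit named "my-unit", and a provider with the shared cache enabled; whether the second EntityManager is actually served from the L2 cache depends on your provider and configuration:

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class CacheLevels {
    public static void main(String[] args) {
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");

        EntityManager em1 = emf.createEntityManager();
        Product p1 = em1.find(Product.class, 1L); // hits DB (or L2), stored in em1's L1 cache
        Product p2 = em1.find(Product.class, 1L); // same persistence context: no query, same instance
        System.out.println(p1 == p2);             // true: the L1 cache is an identity map

        EntityManager em2 = emf.createEntityManager();
        Product p3 = em2.find(Product.class, 1L); // different L1 cache; may still be served
                                                  // from the shared L2 cache without a query
        System.out.println(p1 == p3);             // false: different persistence contexts

        // The shared (second-level) cache is owned by the factory:
        System.out.println(emf.getCache().contains(Product.class, 1L));

        em1.close();
        em2.close();
        emf.close();
    }
}
```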
GaryF is not wrong, but is not technically right :-) Anton is more correct on this, but to complement his answer:
First Level Cache: this is a "cache" which stores all the entities known by a specific session. So, if you have 3 transactions inside this session, it'll hold all entities touched by all three transactions. It gets cleared when you close the session or when you perform the "clear" method.
Second Level Cache: this is a "real" cache and is delegated to an external provider, such as Infinispan. In this cache, you have full control over the contents of the cache, meaning that you are able to specify which entries should be evicted, which ones should be retained longer and so on.
If Hibernate is anything similar to NHibernate (which it is, except the other way round), the Session is the first-level cache. Except that it is not a cache in the general sense, but rather an identity map.
L1 is enabled by default; for L2 you have to add a third-party library like EhCache or Redis.
You can't disable L1 in Hibernate.
L1: The first-level cache is a per-Hibernate-Session cache, a mandatory cache through which all requests must pass, and it is not shared among threads.
L2: The second-level cache can be configured on a per-class and per-collection basis and is mainly responsible for caching objects across sessions. The L2 cache survives beyond a Hibernate session and can be shared among threads.
As mentioned in this article, these are important differences:
First-level cache vs. second-level cache in Hibernate: now that we have some basic understanding of the first-level and second-level cache, here are some differences between them:
The primary difference is that the first-level cache is maintained at the Session level, while the second-level cache is maintained at the SessionFactory level.
The data stored in the first-level cache is accessible only to the Session that maintains it, while the second-level cache is accessible to all.
The first-level cache is enabled by default, while the second-level cache is disabled by default.
A couple of things to know about Hibernate's first-level cache:
You can use Session.evict() to remove a loaded entity from the first-level cache, the refresh() method to refresh the cached state from the database, and the clear() method to remove all entities from the cache (see the sketch below).
You cannot disable the first-level cache; it is always enabled.
Hibernate entities or database rows remain in the cache only while the Session is open; once the Session is closed, all associated cached data is lost.
Read more: https://www.java67.com/2017/10/difference-between-first-level-and-second-level-cache-in-Hibernate.html
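A quick sketch of those first-level cache operations on a native Hibernate Session; the Employee entity and the HibernateUtil helper are assumed, not part of any of the answers above:

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class FirstLevelCacheDemo {
    public static void main(String[] args) {
        SessionFactory sessionFactory = HibernateUtil.getSessionFactory(); // assumed helper
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();

        Employee e = session.get(Employee.class, 1L);     // loaded into the first-level cache
        Employee again = session.get(Employee.class, 1L); // no SQL: served from the session cache

        session.refresh(e);   // re-reads the row from the database into the cached instance
        session.evict(e);     // removes this one entity from the first-level cache
        session.clear();      // removes every entity from the first-level cache

        tx.commit();
        session.close();      // closing the session discards its cache entirely
    }
}
```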

Difference between FlushMode.AUTO and FlushMode.ALWAYS in Hibernate?

I have gone through the Hibernate API specification for FlushMode but didn't get the exact difference, so please help.
If the flush mode is AUTO, before firing any query Hibernate will check whether there are any tables to be updated (i.e. pending changes that could affect that query). If so, a flush will be done; otherwise not. If the flush mode is ALWAYS, a flush will happen before every query, even if there are no tables to be updated.
Check the source of org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(AutoFlushEvent).
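For reference, a minimal sketch of the two modes on a native Hibernate Session; the Item and Invoice entities are made up, and the older setFlushMode(FlushMode) signature is assumed (newer Hibernate versions also offer setHibernateFlushMode):

```java
import org.hibernate.FlushMode;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class FlushModeDemo {
    static void demo(SessionFactory sessionFactory) {
        Session session = sessionFactory.openSession();
        session.beginTransaction();

        // AUTO (the default): Hibernate flushes before a query only if it detects
        // pending changes against the tables that query touches.
        session.setFlushMode(FlushMode.AUTO);

        Item item = (Item) session.get(Item.class, 1L); // Item is a made-up entity
        item.setPrice(item.getPrice() + 1);             // hypothetical getter/setter

        // With AUTO, this query on Item triggers a flush so it sees the new price.
        session.createQuery("from Item").list();

        // ALWAYS: flush before every query, whether or not Hibernate thinks the
        // pending changes are relevant to it.
        session.setFlushMode(FlushMode.ALWAYS);
        session.createQuery("from Invoice").list(); // would flush first regardless

        session.getTransaction().commit();
        session.close();
    }
}
```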
ALWAYS means that before any query is run (on a collection or otherwise), pending changes are flushed to the database. With AUTO, I am assuming there is some "magic" under the hood that knows most data doesn't change that often, so you don't always have to flush. It also affects how often a flush might happen during a transaction. I say might because some sources say setting the flush mode is only a hint to Hibernate - but see this thread for some discussion...
http://forum.springsource.org/archive/index.php/t-14044.html

When does Hibernate read from second-level cache and when from DB?

As far as I know, Hibernate lets you configure entities and collections to be stored in a second-level cache.
When does Hibernate try to read these cached entities from the second-level cache, and when does it hit the DB? Does Hibernate only read from the second-level cache when loading entities via Session.get() and when initializing proxies (including collections)? Does Hibernate ever hit the second-level cache when executing HQL or Criteria queries?
Examples?
The 2nd-level cache contains only entities by their ids, so the cache may be accessed when retrieving an entity by id (i.e. get, load, or implicitly resolving a proxy). Any other queries (HQL, Criteria) will bypass the cache and hit the DB - at least as long as no query cache is used as well.
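A sketch of that distinction, assuming a cacheable Country entity and Hibernate as the provider; the query-cache hint name is Hibernate-specific and only has an effect if hibernate.cache.use_query_cache is enabled:

```java
import javax.persistence.EntityManager;

public class SecondLevelReads {
    static void demo(EntityManager em) {
        // Lookup by id: eligible for the 2nd-level cache, so this may not hit the DB at all.
        Country byId = em.find(Country.class, 1L); // Country is a made-up @Cacheable entity

        // JPQL/Criteria query: goes to the database, even if the entities it
        // returns are already sitting in the 2nd-level cache...
        em.createQuery("select c from Country c where c.code = :code", Country.class)
          .setParameter("code", "SE")
          .getResultList();

        // ...unless the query cache is enabled and the query is marked cacheable
        // via this Hibernate-specific hint.
        em.createQuery("select c from Country c where c.code = :code", Country.class)
          .setParameter("code", "SE")
          .setHint("org.hibernate.cacheable", Boolean.TRUE)
          .getResultList();
    }
}
```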
(Note: the easiest way to answer this type of question is to turn show_sql on and see what queries Hibernate generates.)
Sometimes a query returns only the PKs of the records (e.g. for iteration queries), and then Hibernate can use the cache.
The cache can also be used when retrieving linked objects.
I can't give you the precise rule here, though. I also suspect the answer depends on the capabilities of the dialect used.
