How do I force a reread of DB data (without evicting the second-level cache before reading) and then put it into the cache?
The use case is as follows:
There is a service which uses dictionary data. The service experiences high load, so performance is a must.
I would like dictionary data to be always available in the second level cache for the service.
The dictionary data can be externally modified, so the cache needs to be refreshed periodically.
Is there a way to 'refresh' the second-level cache such that other clients of the SessionFactory will not cause a hit to the DB (they will get the old items while the updated data is being read)?
Ehcache is used as the cache provider, but it could reasonably be changed.
Every now and again, to cause a read through to the db, you can set the CacheMode on the session to REFRESH - the data will be fetched from the db for that one client, and any updates added to the second-level cache. Other clients querying using the normal cache mode will read from the second-level cache without hitting the db.
See CacheMode
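For illustration, here is a minimal sketch (not from the original answer) of a periodic refresh job; the DictionaryEntry entity name is made up, and the job is assumed to run on whatever schedule suits the dictionary data:

import org.hibernate.CacheMode;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class DictionaryCacheRefresher {

    private final SessionFactory sessionFactory;

    public DictionaryCacheRefresher(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // Re-reads all dictionary rows from the DB and pushes the fresh state into
    // the second-level cache; other sessions keep reading the existing entries
    // from the cache while this runs.
    public void refresh() {
        Session session = sessionFactory.openSession();
        try {
            session.setCacheMode(CacheMode.REFRESH);
            // "DictionaryEntry" is a hypothetical entity mapped to the L2 cache.
            session.createQuery("from DictionaryEntry").list();
        } finally {
            session.close();
        }
    }
}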
I have a common database that is used by two different applications (different technologies, different deployment servers, they just use the same database).
Let's call them application #1 and application #2.
Suppose we have the following scenario:
the database contains a table called items (its content doesn't matter)
application #2 is developed in Spring Boot and it is mainly used just for reading data from the database
application #2 retrieves an item from the database
application #1 changes that item
application #2 retrieves the same item again, but the changes are not visible
What I understood by reading a lot of articles:
when application #2 retrieves the item, Hibernate stores it in the first level cache
the changes that are done to the item by application #1 are external changes and Hibernate is unaware of them, and thus the cache is not updated (the same happens when you make a manual change in the database)
you cannot disable Hibernate's first level cache.
So, my question is: can you force Hibernate to refresh the entities every time they are read (or make it go to the database) without explicitly calling em.refresh(entity)? The problem is that the business logic module is pulled in as a dependency, so I can only call service methods (i.e. I don't have access to the EntityManager or Session references).
Hibernate's L1 cache is roughly equivalent to a DB transaction running at repeatable-read isolation. Basically, if you read/write some data, the next time you query in the context of the same session you will get the same data. Further, within the same process, sessions run independently of each other, which means two sessions may be looking at different data in their L1 caches.
If you use repeatable read or less, then you shouldn't really be concerned about the L1 cache, as you might run into this scenario regardless of the ORM (or no ORM).
I think you only need to think about the L2 cache here. The L2 cache stores data across sessions and assumes only Hibernate is accessing the DB, which means that if some change happens in the DB, Hibernate might not know about it. If you just disable the L2 cache, you are sorted.
Further reading - Short description of hibernate cache levels
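If you go that route, here is a minimal sketch (my addition, not from the answer) of turning the L2 cache off through standard Hibernate properties; the persistence unit name "app" is made up:

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class NoL2CacheBootstrap {

    public static EntityManagerFactory build() {
        Map<String, Object> props = new HashMap<>();
        // Disable the second-level cache and the query cache so every
        // cross-session read goes to the database.
        props.put("hibernate.cache.use_second_level_cache", "false");
        props.put("hibernate.cache.use_query_cache", "false");
        return Persistence.createEntityManagerFactory("app", props);
    }
}

In a Spring Boot application the same properties can be set under the spring.jpa.properties.* prefix in application.properties.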
Well, if you cannot access the Hibernate session, you are left with nothing. Any operation you want to do requires session access. For instance, you can remove an entity from the cache after reading it like this:
session.evict(entity);
or this
session.clear();
but first and foremost you need a session. Since you are only calling services, you need to create service endpoints that clear the session cache after serving a request, or modify the existing endpoints to do that.
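A minimal sketch (my addition) of what such an endpoint could look like in a Spring service; the Item entity is hypothetical:

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ItemService {

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional(readOnly = true)
    public Item getItem(Long id) {    // Item is a made-up entity
        Item item = entityManager.find(Item.class, id);
        // Clear the persistence context after serving the request so that
        // nothing stale stays in the first-level cache for later calls.
        entityManager.clear();
        return item;
    }
}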
You can also try using a StatelessSession (see the sketch after the links below), but you will lose cascading and other features.
https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#_statelesssession
https://stackoverflow.com/a/48978736/3405171
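As a rough sketch of that alternative (my addition, again with a hypothetical Item entity), a StatelessSession has no first-level cache and bypasses the second-level cache entirely:

import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;

public class StatelessItemReader {

    private final SessionFactory sessionFactory;

    public StatelessItemReader(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public Item read(Long id) {
        StatelessSession statelessSession = sessionFactory.openStatelessSession();
        try {
            // Every get() goes straight to the database: no persistence context,
            // but also no cascading, no dirty checking and no lazy loading.
            return (Item) statelessSession.get(Item.class, id);
        } finally {
            statelessSession.close();
        }
    }
}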
You can force a new transaction to start; that way Hibernate will not read from the cache and will redo the read from the DB.
You can annotate your method like this:
@Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
By requesting a new transaction, the system will open a new Hibernate session, so the data will not be in the first-level cache.
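Put together, a minimal sketch (my addition, with a hypothetical Item entity) of a service method that always runs in its own transaction and therefore in a fresh persistence context:

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ItemReadService {

    @PersistenceContext
    private EntityManager entityManager;

    // REQUIRES_NEW suspends any calling transaction and opens a new one,
    // which also means a new persistence context (first-level cache).
    @Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
    public Item loadCurrent(Long id) {
        return entityManager.find(Item.class, id);
    }
}

Note that this only guarantees a fresh first-level cache; if a second-level cache is enabled for the entity, the new session may still be served from it.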
I'm designing an application that has to consume live data from several sources and periodically report on it. Consumed data will be added to an Ehcache cache and reports will query it. Once the live data is consumed it needs to be persisted for recovery purposes only. If the application restarts it will prime the cache with historical data from the DB before connecting to the live data sources (which queue new data).
I'm leaning toward implementing it as cache-aside with plain JDBC:
1. Receive data from source
2. Persist to DB
3. Add to cache
4. Confirm receipt with source
with steps 2-4 wrapped in a JTA transaction.
I also looked into Hibernate with Ehcache as a 2nd level cache, but that doesn't seem appropriate.
I'm relatively new to Ehcache so would like some advice on the right design.
For persistence, rather than doing cache-aside, you would probably want to configure your caches to use read-through and some cache writer (either write-through or write-behind). You can read about these here: http://ehcache.org/documentation/user-guide/concepts#cache-as-sor
Now I'd avoid JTA, as I fear the overhead might be overkill (unless you really need XA transaction recovery), and rather opt for a fault-tolerant approach. If you opt for asynchronous persistence (write-behind), clustering your cache with Terracotta (the write-behind queue would automatically be persistent, recoverable and even HA if multiple nodes are available) is one approach to ensuring every element gets written out to the underlying SoR... All depending on your needs, I guess.
Ehcache would let you start with a single-node, unclustered approach, simply using read- and write-through caches, which you could grow and fine-tune to meet your SLA. As data grows, you'd then be able to move to clustered caches and asynchronous writers (should writes become the issue) or grow your cache sizes (if reads remain the issue). Obviously, you should measure (or at least know what bottlenecks you foresee) and choose accordingly. But putting a cache in front of your RDBMS is a common and well understood pattern for scaling read (and write) access to these "slower" stores...
If you just want to have data in a cache, Hibernate looks like overkill. All you need is JDBC, both to implement a cache loader for cache initialization and to save the data to the database periodically. Or just set up your cache to persist to disk.
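For example, a minimal sketch (my addition) of priming an Ehcache cache from the database with plain JDBC at startup; the cache name "liveData", the table and the column names are all made up, and the cache itself is assumed to be declared in ehcache.xml:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class CachePrimer {

    public static void primeFromDatabase(String jdbcUrl) throws Exception {
        Cache cache = CacheManager.getInstance().getCache("liveData");

        try (Connection connection = DriverManager.getConnection(jdbcUrl);
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery("SELECT id, payload FROM live_data")) {
            while (rows.next()) {
                // Key by primary key, store the payload; no ORM involved.
                cache.put(new Element(rows.getLong("id"), rows.getString("payload")));
            }
        }
    }
}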
Then Ehcache + Hibernate is not the solution. What you are describing here is an asynchronous event-processing system in which one of the listeners waits for an "event processed successfully" signal before persisting.
NoSQL databases are a far better option in this case, unless you strictly need to rely on a relational database.
I use Hibernate + Ehcache to read a workflow engine's database.
Hibernate does not write anything to that database.
If I set the timeToLive setting on the cache, the cache won't reflect any database changes until timeToLive expires.
Database changes are made through the workflow engine API, so there is no way to use Hibernate to write to the database.
Shouldn't Ehcache know the cache entry has expired and do the update for me?
Is there any clean way to solve this stale-cache problem?
the cache won't reflect any database changes until timeToLive expires.
That's the intended functionality! These second-level caches do nothing but store data in hash maps; they know nothing about external changes unless you tell them, or until it is time to evict the objects from the cache and reread them.
The way to solve this is to not use caches for volatile objects.
If I set the timeToLive setting on the cache, the cache won't reflect any database changes until timeToLive expires.
So that means you are not using it.
Database changes are made through the workflow engine API, so there is no way to use Hibernate to write to the database.
So as an alternative (to timeToLive), you need to set the cache concurrency strategy to read-write or nonstrict-read-write (see the sketch after this list). If it is not reflecting the changes, I am assuming two things:
Your workflow engine is using Hibernate
And your cache concurrency strategy is read-only
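If the workflow engine really does write through Hibernate, the strategy can be set per entity; a minimal sketch (my addition, with a made-up WorkflowTask entity):

import javax.persistence.Cacheable;
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// READ_WRITE (or NONSTRICT_READ_WRITE) keeps the L2 entry in step with
// updates made through Hibernate; it still cannot see changes made
// directly in the database by another application.
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class WorkflowTask {

    @Id
    private Long id;

    private String state;

    // getters and setters omitted
}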
I have a scenario where I am displaying data from the database that changes frequently (changed by an outside application) on a web page using Spring MVC, somewhat similar to a stock-monitoring application. Currently I am using a daemon thread that fires on web container startup, queries the database every 45 seconds and stores the data in an application-wide HashMap. The web application reads the data from the HashMap (instead of the database) when displaying it.
I have read about third-party caching APIs like Ehcache and OSCache. I have read the Ehcache documentation, and it seems like I can use Hibernate query caching instead of a daemon thread.
Now my question: if I use Hibernate, enable query caching and set timeToIdle to 45 seconds, will the data in the cache be refreshed automatically to reflect the latest data in the database, or do I need to force-refresh the cache (query the database again and repopulate it)? Also, can you explain what a self-populating cache is?
In the Ehcache docs a SelfPopulatingCache is described as:
A selfpopulating decorator for Ehcache that creates entries on demand.
That means that when you ask the SelfPopulatingCache for a value and that value is not in the cache, it will create the value for you. This blog article gives a lot of details and also code (including auto-updating).
To me, it sounds like an Ehcache SelfPopulatingCache would fit your needs best, so I'd recommend taking a closer look at it.
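A minimal sketch (my addition) of decorating an existing cache as a SelfPopulatingCache; the cache name "stocks" and the lookup method are made up:

import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.constructs.blocking.CacheEntryFactory;
import net.sf.ehcache.constructs.blocking.SelfPopulatingCache;

public class StockCacheFactory {

    public static SelfPopulatingCache decorate(CacheManager cacheManager) {
        // "stocks" is a made-up cache name; it must be declared in ehcache.xml,
        // e.g. with timeToLiveSeconds="45".
        Ehcache underlying = cacheManager.getEhcache("stocks");

        CacheEntryFactory factory = new CacheEntryFactory() {
            @Override
            public Object createEntry(Object key) throws Exception {
                // Called only on a cache miss or after the entry has expired;
                // this is where the real database lookup would go.
                return loadQuoteFromDatabase((String) key);
            }
        };

        return new SelfPopulatingCache(underlying, factory);
    }

    // Stand-in for a real JDBC/DAO lookup.
    private static String loadQuoteFromDatabase(String symbol) {
        return "quote-for-" + symbol;
    }
}

Calling get(key) on the decorated cache returns the cached element while it is still live; once timeToLive/timeToIdle has expired it, the next get() transparently recreates it through createEntry().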
A Hibernate second-level cache would surely help increase system performance, but it would not solve your problem as I understand it. It's true that when using Ehcache and setting timeToIdleSeconds the cache entry expires after that time, but it is not refreshed automatically.
Take a look at what Hibernate docs write about query cache:
The query cache does not cache the state of the actual entities in the cache; it caches only identifier values and results of value type. For this reason, the query cache should always be used in conjunction with the second-level cache for those entities expected to be cached as part of a query result cache (just as with collection caching).
Finally, OSCache is outdated.
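To make the quoted point concrete, here is a minimal sketch (my addition) of a query-cache read, assuming a hypothetical Quote entity that is itself mapped to the second-level cache and hibernate.cache.use_query_cache enabled:

import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class CachedQueryExample {

    public static List<?> cachedQuotes(SessionFactory sessionFactory) {
        Session session = sessionFactory.openSession();
        try {
            return session.createQuery("from Quote")
                    // The query cache stores only identifiers and value-typed
                    // results; the entities themselves come from the L2 cache.
                    .setCacheable(true)
                    .list();
        } finally {
            session.close();
        }
    }
}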
I just want some general info about the standard purpose of the L1 cache and the L2 cache.
I'm curious because I'm investigating a system with Terracotta as the second-level cache, and I've found that it also has a first-level cache.
L1 Cache is the cache that exists per Hibernate session, and this cache is not shared among threads. This cache makes use of Hibernate's own caching.
L2 Cache is a cache that survives beyond a Hibernate session, and can be shared among threads. For this cache you can use a caching implementation that integrates with Hibernate, like EHCache, or something else like JBossCache2.
In JPA/Hibernate (and other similar ORM tools), the L1 cache is the transactional cache i.e. the entities stored from when you open a transaction to when you close it. This is almost never a shared cache (other threads can't make use of it). In JPA, this would usually be held by the EntityManager.
The L2 cache is a full (typically) shared cache. If you have multiple threads/queries pulling in data, then they can make use of entities that have already been retrieved by other threads that are still live in the cache. In JPA, this would usually be held by the EntityManagerFactory.
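A minimal sketch (my addition) of the difference in behaviour; the persistence unit name "app" and the Item entity are made up:

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class CacheLevelsDemo {

    public static void main(String[] args) {
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("app");

        EntityManager em1 = emf.createEntityManager();
        em1.find(Item.class, 1L);   // hits the database
        em1.find(Item.class, 1L);   // served from em1's L1 cache, no SQL issued
        em1.close();

        EntityManager em2 = emf.createEntityManager();
        em2.find(Item.class, 1L);   // new persistence context: goes to the database again,
                                    // unless Item is held in an enabled L2 cache
        em2.close();

        emf.close();
    }
}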
GaryF is not wrong, but is not technically right :-) Anton is more correct on this, but to complement his answer:
First Level Cache: this is a "cache" which stores all the entities known by a specific session. So, if you have 3 transactions inside this session, it'll hold all entities touched by all three transactions. It gets cleared when you close the session or when you call the clear() method.
Second Level Cache: this is a "real" cache and is delegated to an external provider, such as Infinispan. In this cache, you have full control over the contents of the cache, meaning that you are able to specify which entries should be evicted, which ones should be retained longer and so on.
If Hibernate is anything like NHibernate (which it is, except the other way round), the Session is the first-level cache. Except that it is not a cache in the general sense, but rather an identity map.
L1 is enabled by default; for L2 you have to add a third-party provider like Ehcache or Redis.
You can't disable L1 in Hibernate.
L1: The first-level cache is a per-Session cache and is a mandatory cache through which all requests must pass; it is not shared among threads.
L2: The second-level cache can be configured on a per-class and per-collection basis and is mainly responsible for caching objects across sessions. The L2 cache survives beyond a Hibernate session and can be shared among threads.
As mentioned in this article, these are the important differences:
First Level Cache vs. Second Level Cache in Hibernate: now that we have got some basic understanding of the first-level and second-level cache, here are some differences between them:
The primary difference is that the first-level cache is maintained at the Session level, while the second-level cache is maintained at the SessionFactory level.
The data stored in the first-level cache is accessible only to the Session that maintains it, while the second-level cache is accessible to all sessions.
The first-level cache is enabled by default, while the second-level cache is disabled by default.
A couple of things to know about Hibernate's first level cache:
You can use Session.evict() to remove a loaded entity from the first-level cache, the refresh() method to re-read an entity's state from the database, and the clear() method to remove all entities from the cache (see the sketch after this list).
You cannot disable the first-level cache; it is always enabled.
Hibernate entities or database rows remain in the cache only while the Session is open; once the Session is closed, all associated cached data is lost.
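A minimal sketch (my addition, with a hypothetical Item entity) showing those three operations on a Session:

import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class FirstLevelCacheDemo {

    public static void demo(SessionFactory sessionFactory, Long id) {
        Session session = sessionFactory.openSession();
        try {
            Item item = session.get(Item.class, id);  // loaded into the first-level cache

            session.refresh(item);   // re-reads the row from the database into this instance
            session.evict(item);     // removes just this entity from the first-level cache
            session.clear();         // removes every entity from the first-level cache
        } finally {
            session.close();         // closing the session discards the cache entirely
        }
    }
}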
Read more: https://www.java67.com/2017/10/difference-between-first-level-and-second-level-cache-in-Hibernate.html