Ehcache and Hibernate - java

I have a scenario where I am displaying data from a database that changes frequently (it is changed by an outside application) on a web page using Spring MVC, somewhat similar to a stock-monitoring application. Currently I use a daemon thread that fires on web-container startup, queries the database every 45 seconds, and stores the data in an application-wide HashMap. The web application reads the data from the HashMap (instead of the database) when displaying it.
I have read about third-party caching APIs like Ehcache and OSCache. Having read the Ehcache documentation, it seems I could use Hibernate's query-caching technique instead of a daemon thread.
Now my question: if I use Hibernate, enable query caching, and set timeToIdle to 45 seconds, will the data in the cache be refreshed automatically to reflect the latest data in the database, or do I need to force-refresh the cache (query the database again and repopulate it)? Also, can you explain what a self-populating cache is?

In the Ehcache docs a SelfPopulatingCache is described as:
A self-populating decorator for Ehcache that creates entries on demand.
That means that when you ask the SelfPopulatingCache for a value that is not in the cache, it will create the value for you. This blog article gives a lot of detail and also code (including auto-updating).
To me, it sounds like an Ehcache SelfPopulatingCache is what would fit your needs best, so I'd recommend taking a closer look at it.
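The core idea can be sketched in plain Java (this is not the Ehcache API; the class and loader here are hypothetical stand-ins): on a cache miss, the cache invokes a loader to create the entry on demand instead of returning nothing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the self-populating idea: on a miss, the cache builds the
// entry itself via a loader, instead of returning null to the caller.
class SelfPopulatingMapCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a database query by key

    SelfPopulatingMapCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent creates the entry on demand, at most once per key
        return store.computeIfAbsent(key, loader);
    }
}
```

Ehcache's SelfPopulatingCache plays the same role by decorating an Ehcache instance with a CacheEntryFactory that knows how to build missing entries.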
A Hibernate second-level cache would surely help increase system performance, but it would not solve your problem as I understand it. It's true that when using Ehcache and setting timeToIdleSeconds the cache expires after that time, but it is not refreshed automatically.
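To illustrate that distinction, here is a plain-Java sketch (not Ehcache itself; the names are made up) of timeToIdle semantics: expiry only evicts the stale entry, and it is the caller who must go back to the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of timeToIdle semantics: an entry idle longer than the limit is
// dropped on the next read; nothing reloads it automatically.
class IdleExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        long lastAccess;
        Entry(V value, long now) { this.value = value; this.lastAccess = now; }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long timeToIdleMillis;

    IdleExpiringCache(long timeToIdleMillis) { this.timeToIdleMillis = timeToIdleMillis; }

    void put(K key, V value) { store.put(key, new Entry<>(value, System.currentTimeMillis())); }

    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        long now = System.currentTimeMillis();
        if (now - e.lastAccess > timeToIdleMillis) {
            store.remove(key); // expired: evict, but do NOT refetch
            return null;       // the caller must reload from the database
        }
        e.lastAccess = now;
        return e.value;
    }
}
```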
Take a look at what the Hibernate docs write about the query cache:
The query cache does not cache the state of the actual entities in the cache; it caches only identifier values and results of value type. For this reason, the query cache should always be used in conjunction with the second-level cache for those entities expected to be cached as part of a query result cache (just as with collection caching).
Finally, OSCache is outdated.

Related

How to force Hibernate read external database changes

I have a common database that is used by two different applications (different technologies, different deployment servers, they just use the same database).
Let's call them application #1 and application #2.
Suppose we have the following scenario:
the database contains a table called items (doesn't matter its content)
application #2 is developed in Spring Boot and it is mainly used just for reading data from the database
application #2 retrieves an item from the database
application #1 changes that item
application #2 retrieves the same item again, but the changes are not visible
What I understood by reading a lot of articles:
when application #2 retrieves the item, Hibernate stores it in the first level cache
the changes that are done to the item by application #1 are external changes and Hibernate is unaware of them, and thus, the cache is not updated (same happens when you do a manual change in the database)
you cannot disable Hibernate's first level cache.
So, my question is: can you force Hibernate to refresh entities every time they are read (or make it go to the database) without explicitly calling em.refresh(entity)? The problem is that the business-logic module is used by the application as an external dependency, so I can only call service methods (i.e. I don't have access to the EntityManager or Session references).
Hibernate's L1 cache is roughly equivalent to a DB transaction running at repeatable-read isolation. Basically, if you read or write some data, the next time you query in the context of the same session you will get the same data. Further, within the same process, sessions run independently of each other, which means two sessions may be looking at different data in their L1 caches.
If you use repeatable read or less, then you shouldn't really be concerned about the L1 cache, as you might run into this scenario regardless of the ORM (or no ORM).
I think you only need to think about the L2 cache here. The L2 cache stores data on the assumption that only Hibernate is accessing the DB, which means that if some change happens in the DB directly, Hibernate might not know about it. If you just disable the L2 cache, you are sorted.
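For reference, disabling the second-level cache (and the query cache) is a configuration switch; a sketch for a typical hibernate.properties (these are standard Hibernate property names):

```
hibernate.cache.use_second_level_cache=false
hibernate.cache.use_query_cache=false
```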
Further reading - Short description of hibernate cache levels
Well, if you cannot access the Hibernate session, you are left with nothing: any operation you want to do requires session access. For instance, you can remove an entity from the cache after reading it like this:
session.evict(entity);
or this
session.clear();
but first and foremost you need a session. Since you are calling only services, you need to create service endpoints that clear the session cache after serving a request, or modify the existing endpoints to do that.
You can try to use StatelessSession, but you will lose cascading and other things.
https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#_statelesssession
https://stackoverflow.com/a/48978736/3405171
You can force a new transaction to start; that way Hibernate will not read from the cache and will redo the read from the DB.
You can annotate your method like this:
@Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
By requesting a new transaction, the system will generate a new Hibernate session, so the data will not come from the cache.
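As a toy model of why this works (plain Java, not the Spring or Hibernate API; the class names are made up): each session keeps its own identity map (the first-level cache), and a new transaction gets a new, empty one, so it re-reads the database.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: a "session" with its own first-level cache (identity map).
// A repeated read in the SAME session returns the cached copy, even if
// the underlying "database" row changed; a new session sees the change.
class ToySession {
    private final Map<Long, String> firstLevelCache = new HashMap<>();
    private final Map<Long, String> database; // shared "database" table

    ToySession(Map<Long, String> database) { this.database = database; }

    String find(long id) {
        // only hits the "database" on the first read of each id
        return firstLevelCache.computeIfAbsent(id, database::get);
    }
}
```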

How hibernate ensures second level cache is updated with latest data in database

I have read that using Hibernate's second-level cache can improve application performance by requiring fewer database hits for data/object retrieval.
However, how does Hibernate ensure that the second-level cache is up to date with the data in the database?
For, example:
Suppose the class below is an entity persisted to the DB.
@Entity
class User {
    @Id
    private int id;
    private String str;
}
Now, if we have enabled second level cache, I understand that if we open different sessions then each session will hit the second level cache for retrieving object value.
Now, if data in the database gets changed (e.g. the row with id=1), say by some independent process or by manually changing the values, and we try to access the value, how does Hibernate detect whether the cache holds the latest value (for id = 1)?
In general, how does hibernate ensure that data in second level cache is consistent with the db values.
Thanks for your help.
Hibernate manages the cache itself, so when you update some entity through a Hibernate Session it will invalidate the cache entry associated with that entity - so the cache is always fresh.
If another process (or even a second JVM running the same Hibernate application) updates a record in the database, Hibernate on the first JVM is unaware of this and has a stale object in its cache.
However, you can use any cache implementation (cache provider) you want. There are many production-ready cache providers that let you configure how long a given entity will be stored in the cache. For example, you can configure your cache to invalidate all entities after 30 seconds, and so on.
If you use the EhCache cache provider you can supply a configuration such as:
<cache name="com.my.company.Entity"
       maxElementsInMemory="1000"
       eternal="false"
       timeToIdleSeconds="7200"
       timeToLiveSeconds="7200"
       overflowToDisk="false"
       memoryStoreEvictionPolicy="LRU"/>
You can find more information about the L2 cache here:
http://www.tutorialspoint.com/hibernate/hibernate_caching.htm
There are, however, a lot of other useful tutorials about this.
It doesn't.
If you change data in the database without going through Hibernate, it won't know about the change, and your cache and the database will get out of sync.

JDBC Query Caching and Precaching

Scenario:
I need to cache the results of database queries in my web service. There are about 30 tables queried during the cycle of a service call. I am confident data in a certain date range will be accessed frequently by the service, and I would like to pre-cache that data. This would mean caching around 800,000 rows at application startup; the data is read-only. The data does not need to be dynamically refreshed; it is reference data. The cache can't be loaded on each service call - there's simply too much data for that. Data outside of this 'frequently used' window is not time-critical and can be lazy-loaded. Most queries would return one row, and none of the tables have a parent/child relationship to each other, though there will be a few joins. There is no need for dynamic SQL support.
Options:
I intended to use myBatis, but there isn't a good way to warm up the cache: myBatis can't understand that the service query select * from table where key = ? is already covered by the startup pre-cache query select * from table.
As far as I understand it (documentation overload), Hibernate has the same problem. Additionally, these tables were designed with composite keys and no primary key, which is an extra hassle for Hibernate.
Question:
Preferred: Is there a myBatis solution for this problem ? I'd very much like to use it. (Familiarity, simplicity, performance, funny name, etc)
Alternatively: Is there an ORM or DB-friendly cache that offers what I'm looking for ?
You can use a distributed caching solution like NCache or TayzGrid, which provide indexing and query features along with a cache startup loader.
You can configure indexes on attributes of your entities in the cache. A cache startup loader can be configured to load all data from the database into the cache at cache startup. While loading data, the cache will create indexes for all entities in memory.
The Object Query Language (OQL) feature, which provides queries similar to SQL, can then be used to query the in-memory data.
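Whichever product you pick, the startup-loader idea itself is simple; a plain-Java sketch (hypothetical names, standing in for the real loader API): run one select-all per table at boot, index the rows by their lookup key, and serve per-key queries from memory.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch of a startup cache loader: rows come from a single
// "select * from table" at application startup, indexed by lookup key.
class PreloadedTableCache<K, R> {
    private final Map<K, R> byKey = new HashMap<>();

    PreloadedTableCache(List<R> rows, Function<R, K> keyExtractor) {
        for (R row : rows) {
            byKey.put(keyExtractor.apply(row), row);
        }
    }

    // in-memory equivalent of "select * from table where key = ?"
    R findByKey(K key) {
        return byKey.get(key);
    }
}
```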
The variety of options for third-party products (free and paid) is too broad and too dependent on your particular requirements and operational capabilities to try to "answer" here.
However, I will suggest an alternative to an explicit cache of your read-only data.
You clearly believe that the memory footprint of your dataset will fit into RAM on a reasonably sized server. My suggestion is that you use your database engine directly (no additional external cache), but configure the database with an internal cache large enough to hold your whole dataset. If all of your data resides in the database server's RAM, it will be accessed very quickly.
I have used this technique successfully with MySQL, but I expect the same applies to all major database engines. If you cannot figure out how to configure your chosen database appropriately, I suggest that you ask a separate, detailed question.
You can warm the cache by executing representative queries when you start your system. These queries will be relatively slow because they have to actually do the disk I/O to pull the relevant blocks of data into the cache. Subsequent queries that access the same blocks of data will be much faster.
This approach should give you a huge performance boost with no additional complexity in your code or your operational environment.
Sormula may do what you want. You would need to annotate each POJO to be cached, like:
@Cached(type=ReadOnlyCache.class)
public class SomePojo {
    ...
}
Pre-populate the cache by invoking the selectAll method for each:
Database db = new Database(/* one of the JNDI constructors */);
Table<SomePojo> t = db.getTable(SomePojo.class);
t.selectAll();
The key is that the cache is stored in the Table object, t. So you would need to keep a reference to t and use it for subsequent queries. Or in the case of many tables, keep reference to database object, db, and use db.getTable(...) to get tables to query.
See javadoc and tests in org.sormula.tests.cache.readonly package.

Replace single entity in query cache from JPA/Hibernate/EclipseLink?

We need to cache the results of a query that returns a whole table of the database. The problem is that the database is changed externally by another application; the Java application making the query is notified with the exact primary key of the changed row.
Is it possible to replace only the changed element from the query cache, not the whole list?
Is this the 1st level cache (from EntityManager) 2nd level Cache (from EntityManagerFactory) or a different cache?
If it is possible can this be done from JPA?
entityManager.refresh(entity);
or is this query cache the 2nd level JPA cache:
entityManagerFactory.getCache().evict(cls, primaryKey);
or only possible through Hibernate/EclipseLink API?
If is not possible, in order to achieve this, would calling entityManager.find() on all elements do it?
I haven't found anything useful in either the Hibernate documentation or the EclipseLink documentation. Hibernate supports regions and refreshing whole regions only, but we need entity-level refresh granularity.
Later edit to clarify my findings.
Following the link posted by @Chris in the comment, I found out that what I wanted is actually supported by EclipseLink, but only for Oracle Database (it is possible to implement your own handler for other vendors, but the call from the database is not standardized and differs from vendor to vendor). I have not found out whether Hibernate supports this or not.
Anyhow, the query cache from EclipseLink had very poor performance compared with Spring Cache (based on ConcurrentMap) or with a custom cache, so I will stay with Spring Cache or a custom cache over Spring JDBC.
EntityManager.refresh() is what you want - it refreshes the entity from what is in the database. This should also update the entity in the shared cache if you are not in a transaction; otherwise you may need to use entityManagerFactory.getCache().evict(cls, primaryKey); as well, to clear the second-level shared cache so the fresh state can be read into it later on.

Ehcache cached item is wrong

I use Hibernate + Ehcache to read a workflow engine's database.
Hibernate does not write anything to that database.
If I set the timeToLive setting in the cache, the cache won't reflect any database changes until timeToLive elapses.
Database changes are made by the workflow engine's API, so there is no way to use Hibernate to write to the database.
Shouldn't Ehcache know the cache is expired and do the updates for me?
Is there any clean way to solve this stale-cache problem?
the cache won't reflect any database changes until timeToLive elapses.
That's the intended behaviour! These second-level caches do nothing but store data in hash maps; they know nothing about changes unless you tell them, or until it is time to evict the objects from the cache and reread them.
The way to solve this is to not use caches on volatile objects.
If I set the timeToLive setting in the cache, the cache won't reflect any database changes until timeToLive elapses.
So that means you are not using it.
database changes is done by the workflow engine API, so there is no way to use hibernate to write the database.
So as an alternative (to timeToLive), you need to set the cache concurrency strategy to read-write or nonstrict-read-write (check the name; it's something like that). If it's not reflecting the changes, I am assuming two things:
Your workflow engine is using Hibernate
And your cache setting is read-only
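For reference, in Hibernate the concurrency strategy is set per entity; a sketch using Hibernate's @Cache annotation (the entity name here is hypothetical):

```
@Entity
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class WorkflowItem { ... }
```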
