hibernate distributed 2nd level cache options - java

Not really a question but I'm looking for comments/suggestions from anyone who has experiences using one or more of the following:
EhCache with RMI
EhCache with JGroups
EhCache with Terracotta
Gigaspaces Data Grid
A bit of background: our application is read-only for the most part, but there is some user data that is read-write and some that is only written (and can also be reasonably inaccurate). In addition, it would be nice to have tools that let us flush and fill the cache at intervals or by admin intervention.
Regarding the first option - are there any concerns about the overhead of RMI and performance of Java serialization?

I've been working with EhCache for Hibernate and for application-level caching for the past 3 years.
We use it with RMI for cache invalidation and it works really well. If you use the cache for replication instead, take care with the object graph: it can become very heavy when relations have high cardinality.
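To get a feel for how heavy a replicated object graph can get, you can measure its size with plain Java serialization. This is only a rough sketch: the Customer and Order classes are made up for illustration, and it measures standard Java serialization of an object graph, not what EhCache actually puts on the wire.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SerializedSizeSketch {

    // Hypothetical entity on the "many" side of a one-to-many relation.
    static class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String description;

        Order(long id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    // Hypothetical entity that drags its whole collection along when serialized.
    static class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<Order> orders = new ArrayList<>();

        Customer(String name) {
            this.name = name;
        }
    }

    // Serializes the value with standard Java serialization and returns the byte count.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    static Customer customerWithOrders(int orderCount) {
        Customer customer = new Customer("customer");
        for (int i = 0; i < orderCount; i++) {
            customer.orders.add(new Order(i, "order " + i));
        }
        return customer;
    }

    public static void main(String[] args) {
        System.out.println("10 orders:     " + serializedSize(customerWithOrders(10)) + " bytes");
        System.out.println("10,000 orders: " + serializedSize(customerWithOrders(10_000)) + " bytes");
    }
}
```

Every replicated put of such a Customer ships the whole collection, which is one reason invalidation (sending only the key) tends to be cheaper than copying values.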
If you use EhCache for Hibernate you can also use it for the query cache (a good improvement for read-only tables); if the table is modified, the cached query results are invalidated automatically.
Using EhCache to cache collections is a good idea too, to avoid joins and sub-selects.
To clean the caches at time intervals you could implement an EhCache cache extension that clears them on a schedule. We did this and it works well.
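The idea of a scheduled flush can be sketched with the JDK alone. Note this is not the EhCache extension API: PeriodicFlushSketch is a hypothetical class wrapping a plain ConcurrentHashMap, just to show the shape of "clear on a timer, or on demand from an admin action".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PeriodicFlushSketch implements AutoCloseable {

    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    // Single daemon thread so the flusher never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "cache-flusher");
                thread.setDaemon(true);
                return thread;
            });

    PeriodicFlushSketch(long period, TimeUnit unit) {
        // First flush happens after one full period, then repeats at that rate.
        scheduler.scheduleAtFixedRate(this::flush, period, period, unit);
    }

    void put(String key, Object value) {
        cache.put(key, value);
    }

    Object get(String key) {
        return cache.get(key);
    }

    int size() {
        return cache.size();
    }

    // Can also be called directly, e.g. from an admin screen or a JMX operation.
    void flush() {
        cache.clear();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With EhCache itself you would call removeAll() on the real cache inside flush() instead of clearing a map.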

Also check out Hazelcast, Coherence and GemStone. These are distributed caching solutions with Query support. They also have ready-to-go second level cache plug-in for Hibernate. Hazelcast is open source.

Related

Putting a cache in front of a distributed Redis cache

I have a java enterprise application that does a lot of fetching of cached data.
The data is stored in a 3 server redis cluster and is accessed by 5 backend api nodes.
I am seeing that we are putting a lot of stress on the Redis caches, which is why I am wondering whether it is dumb to put an in-memory cache such as Ehcache in front of Redis. With this solution I would set a very short TTL in Ehcache.
Is this a common solution or is it more reasonable to look into expanding the redis cluster?
What you are describing is called a near cache. It's an absolutely legitimate solution in some cases. It trades freshness of the values for performance, so you can only consider it if seeing slightly stale values is tolerable in your case. Just FYI, Apache Ignite supports this feature out of the box.
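A near cache with a short TTL can be sketched in a few lines of plain Java. This is an illustration only: NearCacheSketch is a hypothetical class, the remoteLookup function stands in for whatever Redis client call you use, and the clock is injected so expiry is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

class NearCacheSketch<K, V> {

    // A cached value together with the time at which it stops being served locally.
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> local = new ConcurrentHashMap<>();
    private final Function<K, V> remoteLookup; // assumption: wraps the Redis GET
    private final long ttlMillis;
    private final LongSupplier clock;

    NearCacheSketch(Function<K, V> remoteLookup, long ttlMillis, LongSupplier clock) {
        this.remoteLookup = remoteLookup;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = local.get(key);
        if (entry != null && now < entry.expiresAtMillis) {
            return entry.value; // local hit, possibly up to ttlMillis stale
        }
        // Miss or expired: go to the remote cache and remember the answer briefly.
        V fresh = remoteLookup.apply(key);
        local.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }

    public static void main(String[] args) {
        NearCacheSketch<String, String> cache =
                new NearCacheSketch<>(key -> "value-of-" + key, 2_000L, System::currentTimeMillis);
        System.out.println(cache.get("user:1")); // remote lookup
        System.out.println(cache.get("user:1")); // served locally for the next 2 seconds
    }
}
```

The TTL is the maximum staleness you accept; each API node keeps its own copy, so a write to Redis is not visible on a node until that node's entry expires.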

Java Large number of transaction object caching

I am looking for the best solution for caching a large number of simple transactional POJOs in memory. Transactions happen in an Oracle database on 3-4 tables and are made by an external application. Another application is a Business Intelligence type of application which, based on the transactions in the database, evaluates the updated POJOs (mapped to the tables) and applies various business rules.
The Hibernate solution relies on transactions happening on the same server, whereas in our case the transactions happen somewhere else, and I'm not sure the cached objects can be queried.
Questions:
1. Is there an Oracle JDBC API that would trigger an update event on the Java side?
2. Which caching solution would support #1?
3. Can cached objects be queried?
Oracle databases support Java triggers, so in theory you could implement something like this yourself, see this guide. In theory, your Java trigger could invoke the client library of whichever distributed caching solution you are using, to update or evict stale entries.
Oracle also has a caching solution of its own, known as Coherence. It might have integration like this built in, or at least it might be worth checking out. Search for "java distributed cache" for some alternatives.
As far as I know Hibernate does not support queries on objects stored in its cache.
However if you cache an entire collection of objects separately, then there are some libraries which will allow you to perform SQL-like queries on those collections:
LambdaJ - supports advanced queries, not as fast
CQEngine - supports typical queries, extremely fast
BTW I am the author of CQEngine. I like both of those libraries. But please excuse my slight bias for my own one :)
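To illustrate what querying a cached collection means, here is a minimal sketch using only Java streams. It does not use the LambdaJ or CQEngine APIs, and the Txn class and its fields are hypothetical. It also does a full scan on every query; avoiding that with indexes is the kind of thing a dedicated library is for.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CachedCollectionQuerySketch {

    // Hypothetical POJO mapped to one of the transaction tables.
    static final class Txn {
        final long id;
        final String account;
        final double amount;

        Txn(long id, String account, double amount) {
            this.id = id;
            this.account = account;
            this.amount = amount;
        }
    }

    // Roughly: SELECT * FROM cached WHERE account = ? AND amount >= ? ORDER BY amount DESC
    static List<Txn> largeTransactionsFor(List<Txn> cached, String account, double minAmount) {
        return cached.stream()
                .filter(t -> t.account.equals(account))
                .filter(t -> t.amount >= minAmount)
                .sorted(Comparator.comparingDouble((Txn t) -> t.amount).reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Txn> cached = Arrays.asList(
                new Txn(1, "A", 50.0),
                new Txn(2, "A", 500.0),
                new Txn(3, "B", 900.0),
                new Txn(4, "A", 250.0));
        for (Txn t : largeTransactionsFor(cached, "A", 100.0)) {
            System.out.println(t.id + " " + t.amount);
        }
    }
}
```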

Memory cache options for postgres and java

I'm running Jersey applications against a db server and I need to start thinking about memory caching.
The results of the majority of the db queries are only updated once a day. There is an opportunity to cache these queries at the server level.
What options do I have? I know quite a few large applications use memcached. Others??
Any of the Java memcached libraries is probably your best bet.
Spymemcached
Memcached-Java-Client
Xmemcached
Memcached is a good default. Redis can be used too. It offers richer functionality should you choose to use it, but if your use case will always be what memcached offers then there's no particular advantage.
Note that PostgreSQL has an internal cache (the buffer cache) and uses the kernel's disk cache. So tuning the PostgreSQL config for your needs may be a good idea.
In addition to this you could use materialized views for some queries.

Shared cache between Tomcat web apps

I'm looking for a solution to share a cache between two Tomcat web apps running on different hosts. The cache is being used for data synchronization, so it must be guaranteed to be up-to-date at all times between the two Tomcat instances. (Sorry, I'm not 100% sure whether the correct terminology for this requirement is "consistency" or something more specific like having ACID properties.) Another requirement is of course that access to the cache should be fast, with about equal numbers of writes and reads. I do have access to a shared filesystem, so that is a consideration.
I've looked at something like ehcache, but in order to get a shared cache between the webapps I would need either to build on top of a Terracotta environment or to use the new ehcache cache server. The former (Terracotta) seems like overkill for this, while the cache web server seems like it wouldn't provide the fast performance that I want.
Another solution I've looked at is building something simple on top of a fast key-value store like Redis or memcachedb. Redis is in-memory but can easily be configured to be a centralized cache, while memcachedb is a disk-based persistent cache which could work because I have a shared filesystem.
I'm looking for suggestions on how to best solve this problem. The solution needs to be a relatively mature technology as it will be used in a production environment.
Thanks in advance!
I'm quite sure that you don't require terracotta or ehcache server if you need a distributed cache. Ehcache with one of the four replication mechanisms would do.
However, based on what you've written I guess that you're looking for more than just a cache. Memcached/Ehcache are examples of what you might call a caching layer for your application - nothing more.
If you find yourself using words like 'guaranteed', 'up-to-date' and 'ACID', you're better off using an in-memory DB like Oracle TimesTen, MySQL Cluster or Redis, with disk-based persistent storage.
You can use memcached (not memcachedb) for fast and efficient caching. Redis or memcachedb could be overkill unless you want persistent caching. Memcached can be clustered very easily and you can use the spymemcached Java client to access it. Memcached is very mature and is running on several hundred thousand, if not millions, of production servers. It can be monitored through Nagios and Munin when in production.

Ehcache performance on a large cluster

I would like to use Ehcache replicated cache, first as the backend to Hibernate second level cache, second as a cache for any data.
I know how a distributed cache like memcached works, and I know it can scale to large clusters, but I cannot find out how Ehcache replication behaves on large clusters.
Does anyone have a pointer to some information or some kind of benchmark?
I found that many replication strategies can be used, like RMI, JGroups, JMS or Terracotta, and RMI and Terracotta seem the most popular.
How do they compare on large clusters?
Will replication kill my performance as I add many nodes (like several dozen)?
A fully replicated cache will only work if your application is read-mostly. A replicated cache cannot scale for writes: passing every update to all the other nodes will kill your performance. You need a partitioned cache with backup replicas. Partitioned caches scale linearly even for write-intensive applications.
Try Hazelcast! It is an open source (Apache license), transactional, partitioned caching solution for Java. It comes with a Hibernate second level cache plugin.
Several dozens? No problem. Hazelcast 100 node cluster demo can be found here.
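To see why partitioning with backups scales where full replication does not, here is a toy sketch of key placement. It is not how Hazelcast assigns partitions; PartitionSketch and its modulo scheme are assumptions made for illustration. The point is that a write touches the owner and one backup, no matter how many nodes the cluster has.

```java
class PartitionSketch {

    // The node that owns the key: a stable function of the key's hash.
    static int ownerIndex(Object key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // The backup lives on the next node in the ring, so losing the owner loses no data.
    static int backupIndex(Object key, int nodeCount) {
        return (ownerIndex(key, nodeCount) + 1) % nodeCount;
    }

    public static void main(String[] args) {
        int nodeCount = 36; // "several dozen" nodes
        for (String key : new String[] {"user:1", "user:2", "order:77"}) {
            System.out.println(key + " -> owner node " + ownerIndex(key, nodeCount)
                    + ", backup node " + backupIndex(key, nodeCount));
        }
        // A fully replicated cache would send each of these writes to all 36 nodes.
    }
}
```

A plain modulo like this reshuffles most keys whenever a node joins or leaves, so treat it only as a picture of the write fan-out, not as a placement strategy to copy.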
A good solution to the cluster scaling problem is the notion of "buddy replication", where data is only replicated to each node's neighbours (however you define that), rather than to all nodes. You get failover without the scaling issue.
To my knowledge, ehcache doesn't do this. However, JBossCache does, and that also integrates with Hibernate in the same way that ehcache does.
Have you read the section in the manual about Distributed Caching with ehcache?
There are further chapters on:
RMI Distributed Caching
Distributed Caching using JGroups
Distributed Caching using JMS
Distributed Caching via Terracotta
