How to maintain cache for Java/J2EE web application?


I am developing an application that has to connect to a service on every request. I want to save each search result in a cache for later use. Is there a way to do that?
I heard about Memcached, but I couldn't find a good reference site for it. Can we use Ehcache the way it is used with Hibernate?

Here is a good article about caching: http://www.javaworld.com/javaworld/jw-05-2004/jw-0531-cache.html?page=1

There are various caching solutions in Java. Among them are Infinispan, EhCache and OSCache.
All of them can also be used standalone, i.e. none of them was built exclusively to function as a Hibernate caching provider.
Functionality differs a little between caches. Infinispan, for instance, provides first-class support for transactions (meaning an item won't be inserted into the cache if the transaction in which the item was inserted rolls back). EhCache has great support for really large in-process but off-heap cache storage.
OSCache comes with very handy tags to cache content on JSP pages (but that doesn't work with e.g. JSF).
Most of them can spill over to disk and have some mechanism to invalidate stale entries. Infinispan, for instance, has eviction policies, and those really remove stale entries from the cache (saving memory). OSCache, in turn, never really removes an entry from the cache, but marks it as stale. When the entry is accessed, the caller is alerted to refresh the entry (this used to be an exception, but might be different now).
Those things are typically what sets a "cache" apart from a simple concurrent hashmap. If your requirements are modest, don't overlook this simple solution though. A cache can be somewhat hard to configure and a concurrent map in application scope may also suffice for you.
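For the original question (saving each search so the service is not called again), the concurrent map approach could look roughly like the sketch below. The class name SearchCache and the Function standing in for the remote service are assumptions made for illustration. Note that a plain map has no expiry and no size limit, which is exactly what the cache libraries above add.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical application-scoped cache for search results.
class SearchCache {
    private final ConcurrentMap<String, String> results = new ConcurrentHashMap<>();
    // Stand-in for the call to the remote service (an assumption for this sketch).
    private final Function<String, String> remoteService;

    SearchCache(Function<String, String> remoteService) {
        this.remoteService = remoteService;
    }

    // Returns the cached result and calls the service only on a cache miss.
    String search(String query) {
        return results.computeIfAbsent(query, remoteService);
    }

    // Drops one entry, e.g. when the underlying data is known to have changed.
    void invalidate(String query) {
        results.remove(query);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        SearchCache cache = new SearchCache(q -> {
            calls.incrementAndGet();
            return "result for " + q;
        });
        cache.search("java");
        cache.search("java"); // second lookup is served from the map
        System.out.println("service calls: " + calls.get());
    }
}
```

In a web application, a single instance of such a class would typically be kept in application scope (for example as a ServletContext attribute) so that all requests share it.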

You can cache data on a per-user basis (i.e. per session) very easily with OSCache's JSP tags. For example, imagine a web application where a particular user's "worklist" hasn't changed: you can keep serving the cached (i.e. already generated) JSP until the list changes (via a flush cache call somewhere else in the application).
Wrapping code on the JSP layer with a cache tag as follows:
<cache:cache key="foobar" scope="session">
<%= myBean.getData() %>
</cache:cache>
means the Java code myBean.getData() will only be called once per session (unless the cache is flushed).

Related

Cache in a distributed web application - complex queries use case

We are developing a distributed web application (3 tomcats with a load balancer).
Currently we are looking for a cache solution. This solution should be cluster safe, of course.
We are using Spring and JPA (MySQL).
We thought about the following solution :
Create a cache server that runs a simple cache, and delegate all DB operations from each Tomcat to it (the DAO layer in the web app will communicate with that server instead of accessing the DB itself). This is appealing since the cache configuration on the cache server can be minimal.
What we are wondering about right now is:
If a complex query is passed to the cache server (i.e. a select with multiple joins and where clauses), how exactly can a standard map-based cache handle it? Does it mean we have to hand-implement a lookup for each complex query and adapt it to a map search instead of a DB query?
P.S - there is a possibility that this architecture is flawed in its base and therefore a weird question like this was raised, if that's the case please suggest an alternative.
Best,

MySQL already comes with a query cache; see http://dev.mysql.com/doc/refman/5.1/en/query-cache-operation.html

If I understand correctly, you are trying to implement a method cache, using the arguments of your DAO methods as the key and the resulting object or list as the value.
This should work, but your concern about complex queries is valid: you will end up with a lot of entries in your cache. A complex query hits the cache only if it is executed with exactly the same arguments as an entry already in the cache. You will have to figure out whether it is useful to cache those complex queries, i.e. whether there is a realistic chance they will be hit. That really depends on the application's business logic.
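A method cache of this kind can be sketched as follows. The class DaoMethodCache, the method name "findOrders" and its arguments are hypothetical and only illustrate how the key is built; this is not any particular library's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical method-level cache: key = method name + argument values.
class DaoMethodCache {
    private final Map<List<Object>, Object> entries = new ConcurrentHashMap<>();

    // Lists compare element by element, so two calls with equal arguments
    // produce equal keys.
    private static List<Object> key(String method, Object... args) {
        return Arrays.asList(method, Arrays.asList(args));
    }

    // Runs the loader (the real DAO call) only if no entry exists for this key.
    @SuppressWarnings("unchecked")
    <T> T get(String method, Supplier<T> loader, Object... args) {
        return (T) entries.computeIfAbsent(key(method, args), k -> loader.get());
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        DaoMethodCache cache = new DaoMethodCache();
        // The lambdas stand in for a real DAO method findOrders(customerId, status).
        String first = cache.get("findOrders", () -> "orders from DB", 42, "OPEN");
        String again = cache.get("findOrders", () -> "never loaded", 42, "OPEN");
        // Any change in the arguments creates a separate entry.
        cache.get("findOrders", () -> "other orders", 42, "CLOSED");
        System.out.println(first.equals(again) + ", entries: " + cache.size());
    }
}
```

The last call shows the concern raised above: every distinct argument combination is its own entry, so queries with many free parameters fill the cache without being hit again.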
Another option would be to implement a cache with multiple levels: a second-level cache and a query cache, using Ehcache and BigMemory. You might find this useful:
http://ehcache.org/documentation/integrations/hibernate

Spring + Hibernate web application - When to use cache?

I'm building a Java web application which may generate a lot of traffic in the future.
All in all it uses some quite simple queries to the database, but some kind of cache may still be necessary to keep latency low and to prevent a high database access rate.
Should I bother with a cache from the start? Is it a necessity?
Is it hard to implement, or to apply an open source solution to existing queries, and how will such a cache know when the database state has changed?

It all depends on how much traffic you expect. Do you have an estimate of the maximum volume or the number of users?
Most of the time you don't need to worry about the cache from the beginning and can add a Hibernate second-level cache later on.
You can start development without a cache configured, then add it later by choosing a cache provider and plugging it in as the second-level cache provider. EHCache is a frequent choice.
You can then annotate your entities with @Cache using different strategies, for example read-only.

How is Serializable related to caching a Java command class?

I am working on IBM WebSphere Commerce (WCS). In this framework we have an option to cache our command classes, which are basically just Java classes. While creating a new cache entry, I learned that these Java classes must be serializable (implement the java.io.Serializable interface). Why is that?
Is caching basically saving the output of some execution? In that case, does it save the sequence of bytes generated by serialization and, whenever a request for the cached object comes in, just deserialize and return the object without executing the actual program? Can anyone please share knowledge about this?
Thanks in advance,
Santosh

For caching the result of a method execution and returning it for subsequent calls, serialization is not needed.
The most likely reason it needs to be Serializable is that when you cache data in a clustered environment, changes made to the cached data on one node have to be replicated to the other nodes of the cluster. For this replication the data needs to be serialized and sent across to another node using some remoting API.
The other reason for requiring the class to be serializable is that the cache implementation might overflow the data to disk. In this case, too, the objects in the cache need to be converted to a form that can be stored on disk and recreated.
The following is a passage from the Ehcache documentation that explains the overflow scenario in more detail.
When an element is added to a cache and it goes beyond its maximum
memory size, an existing element is either deleted, if overflowToDisk
is false, or evaluated for spooling to disk, if overflowToDisk is
true.
In the latter case, a check for expiry is carried out. If it is
expired it is deleted; if not it is spooled. The eviction of an item
from the memory store is based on the 'MemoryStoreEvictionPolicy'
setting specified in the configuration file.
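Both reasons given above, replication and overflow to disk, come down to turning a cached object into bytes and rebuilding it later. A minimal sketch of that round trip with plain Java serialization is shown below; the class PriceResult and its fields are hypothetical, and a real cache performs these steps internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical cached command result; name and fields are illustrative only.
class PriceResult implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    final long cents;

    PriceResult(String sku, long cents) {
        this.sku = sku;
        this.cents = cents;
    }
}

class SerializationRoundTrip {
    // Roughly what happens before an entry is written to disk or sent to another node.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Roughly what happens when the entry is read back.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = toBytes(new PriceResult("A-1", 1999));
        PriceResult copy = (PriceResult) fromBytes(stored);
        System.out.println(copy.sku + " " + copy.cents);
    }
}
```

If PriceResult did not implement Serializable, writeObject would throw a java.io.NotSerializableException, which is why the framework insists on the interface up front.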

Serialization saves the actual object itself, in its current state.

The reason is WebSphere Commerce's use of WebSphere Application Server's Dynacache feature. WAS Dynacache is an in-memory Java cache that is very similar to a built-in Memcached. Out of the box, the starter store uses Dynacache to cache JSPs, servlets, controllers, commands, command tasks and other Java objects. There is also caching done on the DB side. This is why in performance tests, IBM scales much better at high volumes than other software.

Why does Hibernate attempt to "cache" and how does this work in a clustered environment?

Say you have a 4-node J2EE application server cluster, all running instances of a Hibernate application. How does caching work in this situation? Does it do any good at all? Should it simply be turned off?
It seems to me that data on one particular node would quickly become stale, as other users hitting other nodes make changes to database data. In such a situation, how could Hibernate ever trust that its cache is up to date?

First of all, you should clarify what cache you're talking about, Hibernate has 3 of them (the first-level cache aka session cache, the second-level cache aka global cache and the query cache that relies on the second-level cache). I guess the question is about the second-level cache so this is what I'm going to cover.
How does caching work in this situation?
If you want to cache read only data, there is no particular problem.
If you want to cache read/write data, you need a cluster-safe cache implementation (via invalidation or replication).
Does it do any good at all?
It depends on a lot of things: the cache implementation, the frequency of updates, the granularity of cache regions, etc.
Should it simply be turned off?
Second-level caching is actually disabled by default. Turn it on if you want to use it.
It seems to me that data on one particular node would become stale quickly as other users hitting other nodes make changes to database data.
Which is why you need a cluster-safe cache implementation.
In such a situation, how could Hibernate ever trust that its cache is up to date?
Simple: Hibernate trusts the cache implementation which has to offer a mechanism to guarantee that the cache of a given node is not out of date. The most common mechanism is synchronous invalidation: when an entity is updated, the updated cache sends a notification to the other members of the cluster telling them that the entity has been modified. Upon receipt of this message, the other nodes will remove this data from their local cache, if it is stored there.
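The invalidation mechanism described above can be modelled in a few lines. This is an in-process sketch only: the class NodeCache and its methods are hypothetical, the "nodes" are plain objects, and a real cluster-safe cache provider would send the notification over the network.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical model of one cluster node's local cache.
class NodeCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final List<NodeCache> peers = new CopyOnWriteArrayList<>();

    // Registers two nodes with each other.
    void join(NodeCache other) {
        peers.add(other);
        other.peers.add(this);
    }

    void put(String key, String value) {
        local.put(key, value);
    }

    String get(String key) {
        return local.get(key);
    }

    // Called on the node that performed the update.
    void update(String key, String newValue) {
        local.put(key, newValue);
        for (NodeCache peer : peers) {
            // Synchronous: update() returns only after every peer has evicted.
            peer.onInvalidate(key);
        }
    }

    // Called on the other nodes when the notification arrives.
    void onInvalidate(String key) {
        local.remove(key);
    }

    public static void main(String[] args) {
        NodeCache nodeA = new NodeCache();
        NodeCache nodeB = new NodeCache();
        nodeA.join(nodeB);
        nodeA.put("customer:1", "old");
        nodeB.put("customer:1", "old");
        nodeA.update("customer:1", "new");
        // nodeB no longer holds the stale value and will reload from the database.
        System.out.println(nodeA.get("customer:1") + " / " + nodeB.get("customer:1"));
    }
}
```

With invalidation the other nodes only drop the entry and reload it from the database on the next access; with replication the new value itself would be shipped to them.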

First of all, there are 2 caches in Hibernate.
There is the first-level cache, which you cannot remove and which is the Hibernate session itself. Then there is the second-level cache, which is optional and pluggable (e.g. Ehcache). It works across many requests and is most probably the cache you are referring to.
If you work in a clustered environment, you need a second-level cache which can replicate changes across the members of the cluster. Ehcache can do that. Caching is a hard topic and you need a deep understanding of it in order to use it without introducing other problems. Caching in a clustered environment is more difficult still.

Sharing NHibernate and Hibernate 2nd level cache

Is it possible to share the 2nd level cache between a Hibernate and an NHibernate solution? I have an environment where there are servers running .NET and servers running Java, and both access the same database.
There is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
If this is not possible, what are some of the solutions others have come up with?

There is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
This would require (and this is very likely oversimplified):
Being able to access a cache from Java and .NET.
Having cache provider implementations for both Hibernate and NHibernate.
Being able to read/write data in a format compatible with both languages (or there is no point in sharing the cache).
This sounds feasible but:
I'm not aware of an existing ready-to-use solution implementing this (my first idea was Memcache but AFAIK Memcache stores a serialized version of the data so this doesn't meet the requirement #3 which is the most important).
I wonder if using a language neutral format to store data would not generate too much overhead (and somehow defeat the purpose of using a cache).
If this is not possible, what are some of the solutions others have come up with?
I never had to do this, but if we're talking about a read-write cache and you use two separate caches, you'll have to invalidate a given Java cache region from the .NET side and vice versa. You'll have to write the code to handle that.

As Pascal said, it's improbable that sharing the 2nd level cache is technically possible.
However, you can think about this from a different perspective.
It's unlikely that both applications read and write the same data. So, instead of sharing the cache, what you could implement is a cache invalidation service (using the communications stack of your choice).
Example:
Application A mostly reads Customer data and writes Invoice data
Application B mostly reads Invoice data and writes Customer data
Therefore, Application A caches Customer data and Application B caches Invoice data
When Application A, for example, modifies an invoice, it sends a message to Application B and tells it to evict the invoice from the cache.
You can also evict whole entity types, collections and regions.
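The receiving side of such an invalidation service could be sketched like this. The class EvictionReceiver and the message format ("Invoice:42" for one entity, "Invoice:*" for a whole entity type) are assumptions made for illustration; how the message travels between the two applications (HTTP, a message queue, and so on) is left open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical receiver in Application B; Application A sends the messages.
class EvictionReceiver {
    // entity type -> (id -> cached value)
    private final Map<String, Map<Long, Object>> regions = new ConcurrentHashMap<>();

    void cache(String type, long id, Object value) {
        regions.computeIfAbsent(type, t -> new ConcurrentHashMap<>()).put(id, value);
    }

    Object lookup(String type, long id) {
        Map<Long, Object> region = regions.get(type);
        return region == null ? null : region.get(id);
    }

    // Assumed format: "<type>:<id>" evicts one entity, "<type>:*" the whole type.
    void onMessage(String message) {
        String[] parts = message.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("unexpected message: " + message);
        }
        if (parts[1].equals("*")) {
            regions.remove(parts[0]);
        } else {
            Map<Long, Object> region = regions.get(parts[0]);
            if (region != null) {
                region.remove(Long.parseLong(parts[1]));
            }
        }
    }

    public static void main(String[] args) {
        EvictionReceiver appB = new EvictionReceiver();
        appB.cache("Invoice", 42L, "invoice 42");
        appB.cache("Invoice", 43L, "invoice 43");
        appB.onMessage("Invoice:42"); // Application A modified invoice 42
        System.out.println(appB.lookup("Invoice", 42L) + " / " + appB.lookup("Invoice", 43L));
    }
}
```

In a Hibernate or NHibernate application the body of onMessage would call the ORM's own second-level cache eviction instead of touching a map directly.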
