Java cache and update dynamically

I need to "preload" some data from a database on servlet startup.
So I thought of creating a cache, e.g. using a HashMap or a similar synchronized variant.
I also need to update the cache whenever the database changes.
So I thought of adding some kind of "listener".
My question is: is this already available somewhere, or do I have to implement it myself?
If I have to implement it, which design pattern would be the best approach here?
Update:
No JPA or ORM used. But Spring is available

Yes, of course you can implement that.
I'll sketch a small architecture and then explain it.
First of all, read up on the Data Mapper and Table Data Gateway (TDG) patterns.
A mapper has a method called cacheAll() which delegates to the TDG's cacheAll() method, whose job is to fetch all rows of a table from the database (the rows you want to hold in the cache object).
So, first create a listener that implements ServletContextListener, i.e. a listener for the whole servlet context.
Inside its contextInitialized method, call mp.fill(Mapper.cacheAll()). It looks something like this (this is rough code; tidy it up and optimize it for real use):
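import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class myServletContextListener implements ServletContextListener {
    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // mp is the cache object, defined elsewhere
        mp.fill(Mapper.cacheAll());
    }
    @Override
    public void contextDestroyed(ServletContextEvent sce) {
    }
}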
Don't forget to register your listener in web.xml:
<listener>
    <listener-class>myServletContextListener</listener-class>
</listener>
On server startup, this caches all records in a HashMap mp inside a cache object.
To update the cache when the database changes, you will have to use the observer pattern.
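A minimal sketch of that observer idea in plain Java follows. All names (RecordListener, RecordDao, RecordCache) are hypothetical, the "table" is an in-memory stand-in for the database, and the sketch assumes every write goes through the DAO; changes made directly in the database by another application would not be noticed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface: called after a record has been written.
interface RecordListener {
    void recordChanged(String key, String value);
}

// Hypothetical DAO. Assumption: every write to the table goes through save().
class RecordDao {
    private final List<RecordListener> listeners = new CopyOnWriteArrayList<>();
    // Stand-in for the real table; replace with JDBC or Spring JdbcTemplate calls.
    private final Map<String, String> table = new ConcurrentHashMap<>();

    void addListener(RecordListener listener) {
        listeners.add(listener);
    }

    Map<String, String> findAll() {
        return new HashMap<>(table);
    }

    void save(String key, String value) {
        table.put(key, value);                       // write to the "database"
        for (RecordListener listener : listeners) {  // then notify the observers
            listener.recordChanged(key, value);
        }
    }
}

// The cache is the observer: it preloads once and then follows the DAO's updates.
class RecordCache implements RecordListener {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    static RecordCache preloadFrom(RecordDao dao) {
        RecordCache cache = new RecordCache();
        dao.addListener(cache);                            // subscribe before loading
        dao.findAll().forEach(cache.entries::putIfAbsent); // do not overwrite notified values
        return cache;
    }

    @Override
    public void recordChanged(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }
}
```

In a servlet application, RecordCache.preloadFrom(dao) could be called from contextInitialized, and the resulting cache stored wherever the rest of the app can reach it.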
UPDATE
I forgot to mention the cache object itself. I assume you want it to be accessible to all users of your app, so you should code it as a singleton (singleton pattern), like this:
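import java.util.Map;

public class CacheObject {
    private static CacheObject instance;
    // Instance field: it belongs to the single instance created below.
    private final Map cMap;

    private CacheObject() {
        cMap = Mapper.cacheAll(); // load all rows once
    }

    // synchronized so that two threads cannot both create an instance
    public static synchronized CacheObject getInstance() {
        if (instance == null) {
            instance = new CacheObject();
        }
        return instance;
    }
}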
Also, if the data you want to cache can be changed by users, make it a ThreadLocal singleton.

You may find your needs served best by Guava here. The wiki article on Caches is probably most relevant to you, but the exact approach here would depend heavily on the conditions for database update changes. If you want to refresh the whole cache on database update changes -- or at least invalidate old entries -- you might just call Cache.invalidateAll() whenever a database update occurs. If you're willing to let the cache be only slightly behind the times, using CacheBuilder.refreshAfterWrite(long, TimeUnit) might work well for you.
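To illustrate the invalidate-on-update idea without tying it to a particular Guava version, here is a plain-JDK sketch. It is not Guava's API, only a small analogue: values are loaded on first access through a loader function, and invalidateAll() drops everything so that the next read reloads. SimpleLoadingCache and its loader are hypothetical names, and the loader is assumed to wrap the database lookup and never return null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class SimpleLoadingCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumption: wraps the database lookup

    SimpleLoadingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Loads the value on first access; later calls return the cached value.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Call this whenever a database update occurs.
    void invalidateAll() {
        entries.clear();
    }
}
```

Dropping the whole cache is the bluntest option; it is reasonable when updates are rare and reloading is cheap enough.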

HashMap and its thread-safe variant ConcurrentHashMap are already available.
There are also ready-made caching solutions such as ehcache, which provide advanced features like eviction policies and much more.
As for the design pattern, read into the Observer design pattern.

I actually had a production-level project where I needed to do something like this. My solution (and this is just MY solution) was to load the object (your "data") into memory at servlet startup. I made this decision because the object was large enough that pulling it from the database made client requests sluggish, AND I had a small number of concurrent users. Any request that changed that object's data in the database also changed the in-memory object. You would of course need to use a synchronized object to do this if you are working with a lot of users. If the amount of data is not huge, you could always pull it from the database every time a user requests it.
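A rough sketch of that "change the database and the in-memory object together" approach, assuming a hypothetical SettingsStore interface in front of the database; InMemorySettingsStore is only a stand-in so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistence interface; a real one would use JDBC or similar.
interface SettingsStore {
    Map<String, String> loadAll();
    void update(String key, String value);
}

// Stand-in for the database so the sketch is self-contained.
class InMemorySettingsStore implements SettingsStore {
    private final Map<String, String> rows = new HashMap<>();

    @Override
    public synchronized Map<String, String> loadAll() {
        return new HashMap<>(rows);
    }

    @Override
    public synchronized void update(String key, String value) {
        rows.put(key, value);
    }
}

class WriteThroughSettings {
    private final SettingsStore store;
    private final Map<String, String> inMemory = new HashMap<>();

    WriteThroughSettings(SettingsStore store) {
        this.store = store;
        inMemory.putAll(store.loadAll()); // load once at startup
    }

    synchronized String get(String key) {
        return inMemory.get(key);
    }

    // Database first, then memory, so a failed write leaves the cache untouched.
    synchronized void put(String key, String value) {
        store.update(key, value);
        inMemory.put(key, value);
    }
}
```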
Good Luck.

Related

Coherence EntryProcessor query

I'm trying to implement a business functionality which uses Coherence transient caches.
One of the features I was planning to depend upon is auto-eviction of cache entries, when providing a (configurable) time-to-live at the time of putting an item in the cache. The interface NamedCache provides an API to achieve this (http://download.oracle.com/otn_hosted_doc/coherence/330/com/tangosol/net/NamedCache.html#put(java.lang.Object, java.lang.Object, long)).
However, I'm also planning to use Entry-Processors to ensure effective concurrency across the cluster. I'm stuck at a point now where, within the scope of the processor, I'm supposed to work with InvocableMap.Entry to get/set values with a key in the cache. Unfortunately, there is no setValue method which lets me specify the time-to-live value.
I'm assuming here that interfacing directly with the NamedCache reference inside the EntryProcessor's process method will not be a good idea, and will compromise the concurrency guarantees which EntryProcessor provides.
Can you please share your thoughts on what could be the best way to get an entry evicted after a certain amount of time (which is dynamically decided), while ensuring optimal concurrency across a cluster of nodes?
I'm not completely hung up on using the auto-eviction functionality. However, if I were to abandon that, I may have to rely upon a timer-based programmatic removal of the entry, which works reliably across a cluster. Again, I'm falling short of ideas on this one. Ideally, I would want Coherence to deal with this.
Many thanks in advance.
Best regards,
- Aditya

You can try the following:
Cast the entry in the EntryProcessor to BinaryEntry and set the expiration time.
For example:
public class MyEntryProcessor extends AbstractProcessor implements PortableObject {
    @Override
    public Object process(Entry myEntry) {
        // expire this entry 100 milliseconds from now
        ((BinaryEntry) myEntry).expire(100);
        return myEntry;
    }
    // readExternal/writeExternal required by PortableObject omitted here
}
http://docs.oracle.com/middleware/1212/coherence/COHJR/com/tangosol/util/BinaryEntry.html

In EhCache, is it possible to enable Statistics by default on all caches?

I'm currently writing some monitoring code for an app that's composed of lots of different little modules, many of which make use of EhCache. My goal is to gather statistics about hit ratios, cache contents, etc. from each cache in the app. However, I'm running into some trouble implementing this feature because enabling statistics is an opt-in feature in EhCache. I'm looking for a way to have statistics enabled for all caches in an automatic way so that developers maintaining the different modules don't have to always remember to enable them.
The closest thing I could find in the JavaDocs (which still doesn't work) is:
cacheManager.getDefaultCacheConfiguration().setStatisticsEnabled(true);
That method call enables statistics on the default cache only whereas the rest of the caches will not be affected.
Another thought I had was to wrap the CacheManager so as to intercept calls that create caches and automatically opt them in to statistics. Unfortunately, CacheManager is a class and not an interface, so such a solution would require lots of code and would be brittle--every time a public method gets added/removed as EhCache evolves, I'd have to update my subclass.
Has anyone out there run into a similar problem? If so, how did you go about solving it? Many thanks...

At some point once your caches are created you could do something like this:
for (CacheManager manager : CacheManager.ALL_CACHE_MANAGERS) {
    for (String name : manager.getCacheNames()) {
        manager.getCache(name).getCacheConfiguration().setStatistics(true);
    }
}
Of course you'll want to add error checking.
If you have caches that are created dynamically, you can use a Cache Manager Event Listener (see the documentation). Basically you have to create a factory by extending CacheManagerEventListenerFactory, and then create the actual listener by implementing CacheManagerEventListener. The listener could look like this:
public class StatisticsEnabledCacheManagerListener implements CacheManagerEventListener {
    public void notifyCacheAdded(String cacheName) {
        CacheManager.getInstance().getCache(cacheName).getCacheConfiguration().setStatistics(true);
    }

    public void notifyCacheRemoved(String cacheName) {}

    // Remaining CacheManagerEventListener methods omitted.
}
To register the factory with Ehcache you add this to ehcache.xml:
<cacheManagerEventListenerFactory class="com.example.cache.MyListenerFactory" properties=""/>
It could be important to note that if you set your default cache to have statistics enabled, then any cache you create dynamically will have statistics enabled by default unless whatever is creating the cache specifically turns it off.

Struts2 static data storage / access

I am trying to find the usual design/approach for "static/global" data access/storage in a web app; I'm using Struts 2. Background: I have a number of tables I want to display in my web app.
Problem 1.
The tables will only change and be updated once a day on the server, so I don't want to access a database or load a file on every request to view a table.
I would prefer to load the tables into some global memory/cache once (a day) and have each request get the table from there, rather than access a database.
I imagine this is a common scenario and that there is an established approach, but I can't find it at the moment.
For Struts 2, is the ActionContext the right place for this data?
If so, any link to a tutorial would be really appreciated.
Problem 2.
The tables were stored in an XML file that I unmarshalled with JAXB to get the table objects, and from those the lists for the tables.
For a small application this was OK, but for the web app I think it's hacky to store the XML as a resource, read the file in via the servlet context, and parse it. Or is it?
I realise I may be told to store the tables in a database, access it with a DAO, and use Hibernate to get the objects.
I am just curious what the usual approach is for data already stored in XML files, given that I will have new XML files daily.
Apologies if the questions are basic. I have a large amount of books/reference material, but it's taking me time to get to the higher-level design answers.

Not having really looked at the caching options, I would fetch the data from the DB myself, but only after an interval has passed.
Usually you work within the Action scope; the next level up is the Session, and the most global is the Application. A simple way to test this is to create an Action class which implements ApplicationAware. Then you can get the values put there from any JSP/action, i.e. anywhere you can get to the ActionContext (which is almost anywhere). See: http://struts.apache.org/2.0.14/docs/what-is-the-actioncontext.html
Anyway, I would implement a basic interceptor that checks whether new data should be available and has not been loaded yet, and then loads it. The user triggering this interceptor may not need the new data, so doing the load in a new thread would be a good idea.
This method increases complexity, as you are responsible for managing some data structures and making them co-operate with the ORM.
I've done this to load data from tables which will never need to be loaded again, where the data stands on its own (I don't need to find relationships between it and other tables). This is quick and dirty; Steven's solution is far more robust and would probably pay you back at a later date when further performance is a requirement.
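The "only after an interval has passed" part can be sketched as a small holder that reloads its value once it is older than a given age. ExpiringValue is a hypothetical name, the loader is assumed to perform the actual database or XML read, and the clock is injected only to make the behaviour testable. Unlike the suggestion above, this simple version reloads on the calling thread rather than in a new one.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class ExpiringValue<T> {
    private final Supplier<T> loader;   // assumption: performs the real load
    private final long maxAgeMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis

    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringValue(Supplier<T> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading it first if it is missing or too old.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= maxAgeMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

For tables that change once a day, an instance with a maximum age of 24 hours and System::currentTimeMillis as the clock could be stored in the application scope.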

This isn't really specific to Struts2 at all. You definitely do not want to try storing this information in the ActionContext -- that's a per-request object.
You should look into a caching framework like EHCache or something similar. If you use Hibernate for your persistence, Hibernate has options for caching data so that it does not need to hit the database on every request. (Hibernate can also use EHCache for its second-level cache).

As mentioned earlier, the best approach would be using EHCache or some other trusted cache manager.
Another approach is to use a factory to access the information. For instance, something to the effect of:
public class MyCache {
    private static final MyCache cache = new MyCache();

    public static MyCache getCache() {
        return cache;
    }

    // (data members)

    private MyCache() {
        // (update data members)
    }

    public synchronized SomeType getXXX() {
        ...
    }

    public synchronized void setXXX(SomeType data) {
        ...
    }
}
You need to make sure you synchronize all your reads and writes to make sure you don't have race conditions while updating the cache.
synchronized (MyCache.getCache()) {
    MyCache.getCache().getXXX();
    MyCache.getCache().getTwo();
    ...
}
etc
Again, better to use EHCache or something else turn-key since this is likely to be fickle without good understanding of the mechanisms. This sort of cache also has performance issues since it only allows ONE thread to read/write to the cache at a time. (Possible ways to speed up are to use thread locals and read/write locks - but that sort of thing is already built into many of the established cache managers)

The best place to store large data retrieved by a java servlet (Tomcat)

I have a Java servlet that retrieves data from a MySQL database. In order to minimize round trips to the database, the data is retrieved only once, in the init() method, and placed in a HashMap<> (i.e. cached in memory).
For now, this HashMap is a member of the servlet class. I need not only to store this data but also to update some values (counters, in fact) in the cached objects held as the HashMap's values. There is also a Timer (or cron task) that schedules dumping these counters to the DB.
So, after googling, I found 3 options for storing the cached data:
1) as now, as a member of the servlet class (but servlets can be taken out of service and put back into service by the container at will, and then the data will be lost)
2) in the ServletContext (am I right that it is recommended to store only small amounts of data here?)
3) in a JNDI resource.
What is the most preferred way?

Put it in the ServletContext, but use a ConcurrentHashMap to avoid concurrency issues.

From those 3 options, the best is to store it in the application scope. I.e. use ServletContext#setAttribute(). You'd like to use a ServletContextListener for this. In normal servlets you can access the ServletContext by the inherited getServletContext() method. In JSP you can access it by ${attributename}.
If the data grows so large that it eats too much of Java's memory, then you should consider a 4th option: use a cache manager.

The most obvious way would be to use something like ehcache and store the data in that. ehcache is a cache manager that works much like a hash map, except that the cache manager can be tweaked to hold things in memory, move them to disk, flush them, even write them into a database via a plugin, etc. It depends on whether the objects are serializable and whether your app can cope without the data (i.e. make another round trip if necessary), but I would trust a cache manager to do a better job of it than a hand-rolled solution.

If your cache can become large enough and you access it often it'll be reasonable to utilize some caching solution. For example ehcache is a good candidate and easily integrated with Spring applications, too. Documentation is here.
Also check this overview of open-source caching solutions for Java.

How to cache information in a DAO in a threadsafe manner

I often need to implement DAOs for reference data that doesn't change very often. I sometimes cache this in a collection field on the DAO, so that it is only loaded once and explicitly updated when required.
However, this brings in many concurrency issues: what if another thread attempts to access the data while it is loading or being updated?
Obviously this can be handled by making both the getters and setters of the data synchronised - but for a large web application this is quite an overhead.
I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this.
public class LocationDAOImpl implements LocationDAO {
    private List<Location> locations = null;

    public List<Location> getAllLocations() {
        if (locations == null) {
            loadAllLocations();
        }
        return locations;
    }
}
For further information I'm using Hibernate and Spring but this requirement would apply across many technologies.
Some further thoughts:
Should this not be handled in code at all - instead let ehcache or similar handle it?
Is there a common pattern for this that I'm missing?
There are obviously many ways this can be achieved but I've never found a pattern that is simple and maintainable.
Thanks in advance!

The simplest and safest way is to include the ehcache library in your project and use that to set up a cache. These people have solved all the issues you can encounter, and they have made the library as fast as possible.

In situations where I've rolled my own reference data cache, I've typically used a ReadWriteLock to reduce thread contention. Each of my accessors then takes the form:
public PersistedUser getUser(String userName) throws MissingReferenceDataException {
    PersistedUser ret;
    rwLock.readLock().lock();
    try {
        ret = usersByName.get(userName);
        if (ret == null) {
            throw new MissingReferenceDataException(String.format("Invalid user name: %s.", userName));
        }
    } finally {
        rwLock.readLock().unlock();
    }
    return ret;
}
The only method to take out the write lock is refresh(), which I typically expose via an MBean:
public void refresh() {
    logger.info("Refreshing reference data.");
    rwLock.writeLock().lock();
    try {
        usersById.clear();
        usersByName.clear();
        // Refresh data from underlying data source.
    } finally {
        rwLock.writeLock().unlock();
    }
}
Incidentally, I opted for implementing my own cache because:
My reference data collections are small so I can always store them all in memory.
My app needs to be simple / fast; I want as few dependencies on external libraries as possible.
The data is rarely updated and when it is the call to refresh() is fairly quick. Hence I eagerly initialise my caches (unlike in your straw man example), which means accessors never need to take out the write lock.

If you just want a quick roll-your own caching solution, have a look at this article on JavaSpecialist, which is a review of the book Java Concurrency in Practice by Brian Goetz.
It talks about implementing a basic thread safe cache using a FutureTask and a ConcurrentHashMap.
The way this is done ensures that only one concurrent thread triggers the long running computation (in your case, your database calls in your DAO).
You'd have to modify this solution to add cache expiry if you need it.
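A sketch of that FutureTask plus ConcurrentHashMap idea, written in the spirit of the memoizer described in Java Concurrency in Practice; the class and parameter names here are my own, not taken from the article, and the loader is assumed to be the expensive DAO call. It has no expiry.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumption: the expensive database call

    Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        while (true) {
            Future<V> future = cache.get(key);
            if (future == null) {
                FutureTask<V> task = new FutureTask<>(() -> loader.apply(key));
                future = cache.putIfAbsent(key, task);
                if (future == null) {
                    // This thread won the race, so it runs the load;
                    // every other thread waits on the same Future.
                    future = task;
                    task.run();
                }
            }
            try {
                return future.get();
            } catch (CancellationException e) {
                cache.remove(key, future); // retry with a fresh task
            } catch (ExecutionException e) {
                cache.remove(key, future); // do not cache failures
                throw new IllegalStateException("Loading failed for key " + key, e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while loading key " + key, e);
            }
        }
    }
}
```

The map stores the Future rather than the value, which is what lets late arrivals wait for a load already in progress instead of starting a second one.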
Another consideration when caching it yourself is garbage collection. Unless you use something like a WeakHashMap for your cache, the GC cannot release the memory used by the cache when needed. If you are caching infrequently accessed data (but data that is still worth caching because it is hard to compute), you might want to help out the garbage collector when running low on memory by using a WeakHashMap. Bear in mind that a WeakHashMap holds its keys weakly, so an entry only becomes collectable once nothing else strongly references its key.

If your reference data is immutable the second level cache of hibernate could be a reasonable solution.

"Obviously this can be handled by making both the getters and setters of the data synchronised - but for a large web application this is quite an overhead.
I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this."
While this might be somewhat true, you should take note that the sample code you've provided certainly needs to be synchronized to avoid any concurrency issues when lazy-loading the locations. If that accessor is not synchronized, then you will have:
Multiple threads access the loadAllLocations() method at the same time
Some threads may enter loadAllLocations() even after another thread has completed the method and assigned the result to locations - under the Java Memory Model there is no guarantee that other threads will see the change in the variable without synchronization.
Be careful when using lazy loading/initialization: it seems like a simple performance boost, but it can cause lots of nasty threading issues.
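As a minimal illustration of the synchronized accessor described above (not the asker's actual DAO): LazyLocations is a hypothetical class, the load is a placeholder for the real query, and plain strings stand in for Location objects so that the example is self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LazyLocations {
    private List<String> locations; // guarded by the lock on "this"
    private int loadCount;          // only here to show how often the load ran

    // synchronized: only one thread can run the load, and all threads
    // are guaranteed to see the assigned list afterwards.
    synchronized List<String> getAllLocations() {
        if (locations == null) {
            locations = loadAllLocations();
        }
        return locations;
    }

    synchronized int getLoadCount() {
        return loadCount;
    }

    // Placeholder for the real database query (assumption).
    private List<String> loadAllLocations() {
        loadCount++;
        return Collections.unmodifiableList(Arrays.asList("London", "Paris"));
    }
}
```

The returned list is unmodifiable so that callers cannot change the shared cached data behind the DAO's back.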

I think it's best to not do it yourself, because getting it right is a very difficult thing. Using EhCache or OSCache with Hibernate and Spring is a far better idea.
Besides, it makes your DAOs stateful, which might be problematic. You should have no state at all, besides the connection, factory, or template objects that Spring manages for you.
UPDATE: If your reference data isn't too large, and truly never changes, perhaps an alternative design would be to create enumerations and dispense with the database altogether. No cache, no Hibernate, no worries. Perhaps oxbow_lakes' point is worth considering: perhaps it could be a very simple system.
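For reference data that truly never changes, the enumeration idea could look like the sketch below. The constants and fields are made up for illustration and are not taken from the question.

```java
// Hypothetical reference data expressed as an enum instead of a table.
enum Location {
    LONDON("London", "GB"),
    PARIS("Paris", "FR");

    private final String displayName;
    private final String countryCode;

    Location(String displayName, String countryCode) {
        this.displayName = displayName;
        this.countryCode = countryCode;
    }

    String getDisplayName() {
        return displayName;
    }

    String getCountryCode() {
        return countryCode;
    }

    // Lookup by display name, replacing what would otherwise be a query.
    static Location fromDisplayName(String name) {
        for (Location location : values()) {
            if (location.displayName.equals(name)) {
                return location;
            }
        }
        throw new IllegalArgumentException("Unknown location: " + name);
    }
}
```

Enum constants are created once when the class is loaded and are immutable here, so no locking or cache refresh is needed; the trade-off is that changing the data means redeploying the code.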
