Update cached data in a hashtable - java

In order to minimize the number of database queries I need some sort of cache to store pairs of data. My approach now is a hashtable (with Strings as keys, Integers as value). But I want to be able to detect updates in the database and replace the values in my "cache". What I'm looking for is something that makes my stored pairs invalid after a preset timespan, perhaps 10-15 minutes. How would I implement that? Is there something in the standard Java package I can use?

I would use some existing solution(there are many cache frameworks).
ehcache is great, it can reset the values on given timespan and i bet it can do much more(i only used that)

You can either use existing solutions (see previous reply)
Or if you want a challenge, make your own easy cache class (not recommended for production project, but it's a great learning experience.
You will need at least 3 members
A cache data stored as hashtable object,
Next cache expiration date
Cache expiration interval set via constructor.
Then simply have public data getter methods, which verify cache expiration status:
if not expired, call hastable's accessors;
if expired, first call "data load" method that is also called in the constructor to pre-populate and then call hashtable accessors.
For an even cooler cache class (I have implemented it in Perl at my job), you can have additional functionality you can implement:
Individual per-key cache expiration (coupled with overall total cache expiration)
Auto, semi-auto, and single-shot data reload (e.g., reload entire cache at once; reload a batch of data defined either by some predefined query, or reload individual data elements piecemail). The latter approach is very useful when your cache has many hits on the same exact keys - that way you don't need to reload universe every time 3 kets that are always accessed expire.

You could use a caching framework like OSCache, EHCache, JBoss Cache, JCS... If you're looking for something that follows a "standard", choose a framework that supports the JCache standard interface (javax.cache) aka JSR-107.
For simple needs like what you are describing, I'd look at EHCache or OSCache (I'm not saying they are basic, but they are simple to start with), they both support expiration based on time.
If I had to choose one solution, I'd recommend Ehcache which has my preference, especially now that it has joined Terracotta. And just for the record, Ehcache provides a preview implementation of JSR107 via the net.sf.cache.jcache package.

Related

Implementing cache using Guava

I want to implement a cache using Guava's caching mechanism.
I have a DB query which returns a map, I want to cache the entire map but let it expire after a certain amount of time.
I realize Guava caches works as a per-item bases. We provide a key, the Cache will either returns the corresponding value from the cache or get it.
Is there a way to use Guava to get everything, cache it but timeout it after a certain time period of time and get everything again.
Many thanks
You can create an instance of Supplier<Map<K,V>> that fetches the entire map from the database, and then use Suppliers.memoizeWithExpiration to cache it.
Related:
Google Guava Supplier Example
http://google.github.io/guava/releases/snapshot/api/docs/com/google/common/base/Supplier.html
http://google.github.io/guava/releases/snapshot/api/docs/com/google/common/base/Suppliers.html

Java - load data into a Singleton when it is accessed, but only after time has passed

I have a singleton that contains some data I want to cache from a database. I'd like it to call the database and refresh its data every time it is accessed AND a certain amount of time has passed. The singleton pattern I used was the enum one from Effective Java. What do you think is the best way to accomplish this? Ideally, I would override some method that is called every time MySingleton.INSTANCE is called...but I don't know if such a method exists. Another idea is to include a call to the refresh method within every method that can be accessed by the client code...but this seems clumsy to me.
Instead of accessing the INSTANCE directly you can use
DataCache.getInstance().method();
The getInstance() can check if the data needs to be refreshed.
For such a scenario, a robust, performant and very well tested solution is a Caching Library like EHCache: http://ehcache.org/
It seems that what you need is a time-based eviction cache.
Take a look at Google Guava Cache library:
CacheBuilder provides two approaches to timed eviction:
expireAfterAccess(long duration, TimeUnit unit) Only expire entries after the specified duration has passed since the entry was last accessed by a read or a write. Note that the order in which entries are evicted will be similar to that of size-based eviction.
expireAfterWrite(long duration, TimeUnit unit) Expire entries after the specified duration has passed since the entry was created, or the most recent replacement of the value. This could be desirable if cached data grows stale after a certain amount of time.
So which approach do you need? You said:
I'd like it to call the database and refresh its data every time it is accessed AND a certain amount of time has passed.
A certain amount of time has passed since when?
Since entry was last read from your cache? It doesn't matter how much time has passed since entry was loaded from database? Then use expireAfterAccess()
Since entry was last loaded from database into your cache? Then use expireAfterWrite()
You can even use them both at the same time.
Here is such example: http://mkorytin.blogspot.com/2012/02/caching-objects-with-guava.html

Struts2 static data storage / access

I am trying to find what is the usual design/approach for "static/global"! data access/storage in a web app, I'm using struts 2. Background, I have a number of tables I want to display in my web app.
Problem 1.
The tables will only change and be updated once a day on the server, I don't want to access a database/or loading a file for every request to view a table.
I would prefer to load the tables to some global memory/cache once (a day), and each request get the table from there, rather than access a database.
I imagine this is a common scenario and there is an established approach? But I cant find it at the moment.
For struts 2, Is the ActionContext the right place for this data.
If so, any link to a tutorial would be really appreciated.
Problem 2.
The tables were stored in a XML file I unmarshalled with JAXB to get the table objects, and so the lists for the tables.
For a small application this was OK, but I think for the web app, its hacky to store the xml as resources and read in the file as servlet context and parse, or is it?
I realise I may be told to store the tables to a database accessing with a dao, and use hibernate to get the objects.
I am just curious as to what is the usual approach with data already stored in XML file? Given I will have new XML files daily.
Apologies if the questions are basic, I have a large amount of books/reference material, but its just taking me time to get the higher level design answers.
Not having really looked at the caching options I would fetch the data from the DB my self but only after an interval has passed.
Usually you work within the Action scope, the next level up is the Session and the most global is the Application. A simple way to test this is to create an Action class which implements ApplicationAware. Then you can get the values put there from any jsp/action... anywhere you can get to the ActionContext (which is most anyplace) see: http://struts.apache.org/2.0.14/docs/what-is-the-actioncontext.html
Anyways, I would implement a basic interceptor which would check if new data should be available and I have not looked it up already, then load the new data (the user triggering this interceptor may not need this new data, so doing this in a new thread would be a good idea).
This method increases the complexity, as you are responsible for managing some data structures and making them co-operate with the ORM.
I've done this to load data from tables which will never need to be loaded again, and that data stands on it's own (I don't need to find relationships between it and other tables). This is quick and dirty, Stevens solution is far more robust and probably would pay you back at a later date when further performance is a requirement.
This isn't really specific to Struts2 at all. You definitely do not want to try storing this information in the ActionContext -- that's a per-request object.
You should look into a caching framework like EHCache or something similar. If you use Hibernate for your persistence, Hibernate has options for caching data so that it does not need to hit the database on every request. (Hibernate can also use EHCache for its second-level cache).
As mentioned earlier, the best approach would be using EHCache or some other trusted cache manager.
Another approach is to use a factory to access the information. For instance, something to the effect of:
public class MyCache {
private static MyCache cache = new MyCache();
public static MyCache getCache() {
return cache;
}
(data members)
private MyCache() {
(update data members)
}
public synchronized getXXX() {
...
}
public synchronized setXXX(SomeType data) {
...
}
}
You need to make sure you synchronize all your reads and writes to make sure you don't have race conditions while updating the cache.
synchronized (MyCache.getCahce()) {
MyCahce.getCache().getXXX();
MyCache.getCache().getTwo();
...
}
etc
Again, better to use EHCache or something else turn-key since this is likely to be fickle without good understanding of the mechanisms. This sort of cache also has performance issues since it only allows ONE thread to read/write to the cache at a time. (Possible ways to speed up are to use thread locals and read/write locks - but that sort of thing is already built into many of the established cache managers)

Is it safe to cache DataSource lookups in Java EE?

I'm developing a simple Java EE 5 "routing" application. Different messages from a MQ queue are first transformed and then, according to the value of a certain field, stored in different datasources (stored procedures in different ds need to be called).
For example valueX -> dataSource1, valueY -> dataSource2. All datasources are setup in the application server with different jndi entries. Since the routing info usually won't change while the app is running, is it save to cache the datasource lookups? For example I would implement a singleton, which holds a hashmap where I store valueX->DataSource1. When a certain entry is not in the list, I would do the resource lookup and store the result in the map. Do I gain any performance with the cache or are these resource lookups fast enough?
In general, what's the best way to build this kind of cache? I could use a cache for some other db lookups too. For example the mapping valueX -> resource name is defined in a simple table in a DB. Is it better too lookup the values on demand and save the result in a map, do a lookup all the time or even read and save all entries on startup? Do I need to synchronize the access? Can I just create a "enum" singleton implementation?
It is safe from operational/change management point of view, but not safe from programmer's one.
From programmer's PoV, DataSource configuration can be changed at runtime, and therefore one should always repeat the lookup.
But this is not how things are happening in real life.
When a change to a Datasource is to be implemented, this is done via a Change Management procedure. There is a c/r record, and that record states that the application will have a downtime. In other words, operational folks executing the c/r will bring the application down, do the change and bring it back up. Nobody does the changes like this on a live AS -- for safety reasons. As the result, you shouldn't take into account a possibility that DS changes at runtime.
So any permanent synchronized shared cache is good in the case.
Will you get a performance boost? This depends on the AS implementation. It likely to have a cache of its own, but that cache may be more generic and so slower and in fact you cannot count on its presence at all.
Do you need to build a cache? The answer usually comes from performance tests. If there is no problem, why waste time and introduce risks?
Resume: yes, build a simple cache and use it -- if it is justified by the performance increase.
Specifics of implementation depend on your preferences. I usually have a cache that does lookups on demand, and has a synchronized map of jndi->object inside. For high-concurrency cache I'd use Read/Write locks instead of naive synchronized -- i.e. many reads can go in parallel, while adding a new entry gets an exclusive access. But those are details much depending on the application details.

The best place to store large data retrieved by a java servlet (Tomcat)

I have the java servlet that retrieves data from a mysql database. In order to minimize roundtrips to the database, it is retrieved only once in init() method, and is placed to a HashMap<> (i.e. cached in memory).
For now, this HashMap is a member of the servlet class. I need not only store this data but also update some values (counters in fact) in the cached objects of underlying hashmap value class. And there is a Timer (or Cron task) to schedule dumping these counters to DB.
So, after googling i found 3 options of storing the cached data:
1) as now, as a member of servlet class (but servlets can be taken out of service and put back into service by the container at will. Then the data will be lost)
2) in ServletContext (am i right that it is recommended to store small amounts of data here?)
3) in a JNDI resource.
What is the most preferred way?
Put it in ServletContext But use ConcurrentHashMap to avoid concurrency issues.
From those 3 options, the best is to store it in the application scope. I.e. use ServletContext#setAttribute(). You'd like to use a ServletContextListener for this. In normal servlets you can access the ServletContext by the inherited getServletContext() method. In JSP you can access it by ${attributename}.
If the data is getting excessive large that it eats too much of Java's memory, then you should consider a 4th option: use a cache manager.
The most obvious way would be use something like ehcache and store the data in that. ehcache is a cache manager that works much like a hash map except the cache manager can be tweaked to hold things in memory, move them to disk, flush them, even write them into a database via a plugin etc. Depends if the objects are serializable, and whether your app can cope without data (i.e. make another round trip if necessary) but I would trust a cache manager to do a better job of it than a hand rolled solution.
If your cache can become large enough and you access it often it'll be reasonable to utilize some caching solution. For example ehcache is a good candidate and easily integrated with Spring applications, too. Documentation is here.
Also check this overview of open-source caching solutions for Java.

Categories