Coherence EntryProcessor query - java

I'm trying to implement a business functionality which uses Coherence transient caches.
One of the features I was planning to depend upon is auto-eviction of cache entries, when providing a (configurable) time-to-live at the time of putting an item in the cache. The interface NamedCache provides an API to achieve this (http://download.oracle.com/otn_hosted_doc/coherence/330/com/tangosol/net/NamedCache.html#put(java.lang.Object, java.lang.Object, long)).
However, I'm also planning to use Entry-Processors to ensure effective concurrency across the cluster. I'm stuck at a point now where, within the scope of the processor, I'm supposed to work with InvocableMap.Entry to get/set values with a key in the cache. Unfortunately, there is no setValue method which lets me specify the time-to-live value.
I'm assuming here that interfacing directly with the NamedCache reference inside the EntryProcessor's process method will not be a good idea, and will compromise the concurrency guarantees which EntryProcessor provides.
Can you please share your thoughts on what could be the best way to get an entry evicted after a certain amount of time (which is dynamically decided), while ensuring optimal concurrency across a cluster of nodes?
I'm not completely hung up on using the auto-eviction functionality. However, if I were to abandon that, I may have to rely upon a timer-based programmatic removal of the entry, which works reliably across a cluster. Again, I'm falling short of ideas on this one. Ideally, I would want Coherence to deal with this.
Many thanks in advance.
Best regards,
- Aditya

You can try the following: cast the entry in the EntryProcessor to BinaryEntry and set the expiration time.
For example:
public class MyEntryProcessor extends AbstractProcessor implements PortableObject {

    @Override
    public Object process(InvocableMap.Entry myEntry) {
        ((BinaryEntry) myEntry).expire(100);
        return myEntry;
    }
}
http://docs.oracle.com/middleware/1212/coherence/COHJR/com/tangosol/util/BinaryEntry.html

Related

hashmap cache in servlet

I am trying to implement a servlet for GPS monitoring and want to create a simple cache, because I think it will be faster than an SQL request for every HTTP request. The simple scheme:
In the init() method, I read one point for each vehicle into a HashMap (vehicle id = key, location in JSON = value). After that, some requests try to read these points and some requests try to update them (one vehicle updates one item). Of course I want to minimize synchronization, so I read the javadoc:
http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)
If I am right, there is no need for any synchronization in my task, because I only do "not a structural modification == changing the value associated with a key that an instance already contains". Is that a correct statement?
Use ConcurrentHashMap: it achieves thread safety with fine-grained locking and atomic operations instead of synchronizing the whole map.
Wrong. Adding an item to the hash map is a structural modification (and to implement a cache you must add items at some point).
Use java.util.concurrent.ConcurrentHashMap.
If all the entries are read into the HashMap in init() and afterwards only read/modified, then yes, in theory the other threads do not need to sync; however, problems might still arise from threads caching values (visibility), so ConcurrentHashMap would be better.
Perhaps, rather than implementing a cache yourself, use the simple implementation found in the Guava library.
Caching is not an easy problem - but it is a known one. Before starting, I would carefully measure whether you really do have a performance problem, and whether caching actually solves it. You may think it should, and you may be right. You may also be horrendously wrong depending on the situation ("Premature optimization is the root of all evil"), so measure.
That said, do not implement a cache yourself; use a library that does it for you. I personally have good experience with Ehcache.
If I understand correctly, you have two types of request:
Read from cache
Write to cache (to update the value)
In this case, you may potentially try to write to the same map twice at the same time, which is what the docs are referring to.
If all requests go through the same piece of code (e.g. an update method which can only be called from one thread) you will not need synchronisation.
If your system is multi-threaded and you have more than one thread or piece of code that writes to the map, you will need to externally synchronise your map or use a ConcurrentHashMap.
For clarity, the reason you need synchronisation is that if you have two threads both trying to update the JSON value for the same key, who wins? This is either left up to chance, or it causes exceptions or, worse, buggy behaviour.
Any time you modify the same element from two threads, you need to synchronise on that code or, better still, use a thread-safe version of the data structure if that is applicable.
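To make the advice above concrete, here is a minimal sketch of such a cache backed by ConcurrentHashMap; the class and method names are illustrative, not from the original question:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the cache discussed above: vehicle id -> latest JSON location.
// ConcurrentHashMap makes both value updates and new insertions thread-safe
// without external synchronization, and guarantees visibility across threads.
class LocationCache {
    private final ConcurrentHashMap<Integer, String> lastLocation = new ConcurrentHashMap<>();

    // Called by update requests: atomically replaces (or inserts) the value.
    public void update(int vehicleId, String json) {
        lastLocation.put(vehicleId, json);
    }

    // Called by read requests: returns the latest value, or null if unknown.
    public String read(int vehicleId) {
        return lastLocation.get(vehicleId);
    }
}
```

With this in place neither reads nor writes need a synchronized block, even though inserting a new vehicle is a structural modification.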

Java cache and update dynamically

I need to "preload" some data from a database on servlet startup.
So I thought to create some cache e.g. using a HashMap or some similar synchronized version.
I also need to update the cache on database update changes.
So I thought to add some kind of "listener".
My question is: is this somehow available or do I have to actually implement it?
If yes what design pattern would be the best approach here?
Update:
No JPA or ORM used. But Spring is available
Yes, of course you can implement that.
I'll sketch a small architecture and then explain it:
First of all, you can read up on the Mapper and TDG (Table Data Gateway) patterns.
A mapper has a method called cacheAll() which delegates to the TDG's cacheAll() method, whose job is to get all rows of a table from the db (the rows you want to cache in the cache object).
So basically, you first have to create a listener implementing ServletContextListener, which makes it a listener for the whole servlet context; inside its contextInitialized you call mp.fill(Mapper.cacheAll()). It is something like this (this is general code; of course write it better and optimize it):
public class myServletContextListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        mp.fill(Mapper.cacheAll());
    }

    // ...
}
Don't forget to add your listener in web.xml:
<listener>
    <listener-class>myServletContextListener</listener-class>
</listener>
What this will do is, on server startup, cache all records into a HashMap mp inside a cache object.
As for updating the cache based on database changes, you will have to use the Observer pattern.
UPDATE
I forgot to mention: as for the cache object, I assume you want it accessible to all users of your app, so you should code it as a singleton (Singleton pattern), like this:
public class CacheObject {

    private static Map cMap;
    private static CacheObject cObject;

    private CacheObject() {
        cMap = Mapper.cacheAll();
    }

    public static synchronized CacheObject getInstance() {
        if (cObject == null) {
            cObject = new CacheObject();
        }
        return cObject;
    }
}
Also, if the data that you want to cache can be changed by individual users, consider making it a ThreadLocal singleton.
You may find your needs served best by Guava here. The wiki article on Caches is probably most relevant to you, but the exact approach here would depend heavily on the conditions for database update changes. If you want to refresh the whole cache on database update changes -- or at least invalidate old entries -- you might just call Cache.invalidateAll() whenever a database update occurs. If you're willing to let the cache be only slightly behind the times, using CacheBuilder.refreshAfterWrite(long, TimeUnit) might work well for you.
HashMap and its thread-safe variant ConcurrentHashMap are already available.
There are also ready-made caching solutions, like Ehcache, which provide advanced support such as eviction policies and many more things.
As for the design pattern, read into the Observer design pattern.
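As a rough illustration of that Observer approach (all class names here are hypothetical, not from the answers above), the DAO acts as the subject and pushes each database write to the cache:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Observer interface: anything that wants to hear about DB changes.
interface CacheListener {
    void onUpdate(String key, Object newValue);
}

// Subject: the DAO notifies registered listeners after every write,
// so the cache never goes stale for writes made through this DAO.
class ObservableDao {
    private final List<CacheListener> listeners = new CopyOnWriteArrayList<>();

    public void addListener(CacheListener l) { listeners.add(l); }

    public void save(String key, Object value) {
        // ... write to the database here ...
        for (CacheListener l : listeners) {
            l.onUpdate(key, value); // push the change to every observer
        }
    }
}

// Observer: the in-memory cache keeps itself in sync.
class SimpleCache implements CacheListener {
    private final Map<String, Object> map = new ConcurrentHashMap<>();

    @Override
    public void onUpdate(String key, Object newValue) { map.put(key, newValue); }

    public Object get(String key) { return map.get(key); }
}
```

Note this only catches changes made through the application; changes made directly in the database would still need polling or a database-level notification mechanism.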
I actually had a production-level project where I needed to do something like this. My solution (and this is just my solution) was to load the object (your "data") into memory at servlet startup. This decision was made because the object was large enough that pulling it from the database made client requests sluggish, AND I had a small number of concurrent users. Any request that changed that object's data in the database would also change the in-memory object. You would of course need to use a synchronized object to do this if you are working with a lot of users. If the amount of data is not huge, then you could always pull from the database every time the user requests information about the data.
Good Luck.

Is it safe to cache DataSource lookups in Java EE?

I'm developing a simple Java EE 5 "routing" application. Different messages from a MQ queue are first transformed and then, according to the value of a certain field, stored in different datasources (stored procedures in different ds need to be called).
For example valueX -> dataSource1, valueY -> dataSource2. All datasources are set up in the application server with different JNDI entries. Since the routing info usually won't change while the app is running, is it safe to cache the datasource lookups? For example, I would implement a singleton which holds a hashmap where I store valueX -> DataSource1. When a certain entry is not in the map, I would do the resource lookup and store the result. Do I gain any performance with the cache, or are these resource lookups fast enough?
In general, what's the best way to build this kind of cache? I could use a cache for some other db lookups too. For example, the mapping valueX -> resource name is defined in a simple table in a DB. Is it better to look up the values on demand and save the result in a map, to do a lookup every time, or even to read and save all entries on startup? Do I need to synchronize the access? Could I just create an "enum" singleton implementation?
It is safe from operational/change management point of view, but not safe from programmer's one.
From programmer's PoV, DataSource configuration can be changed at runtime, and therefore one should always repeat the lookup.
But this is not how things are happening in real life.
When a change to a Datasource is to be implemented, this is done via a Change Management procedure. There is a c/r record, and that record states that the application will have a downtime. In other words, operational folks executing the c/r will bring the application down, do the change and bring it back up. Nobody does the changes like this on a live AS -- for safety reasons. As the result, you shouldn't take into account a possibility that DS changes at runtime.
So any permanent synchronized shared cache is good in the case.
Will you get a performance boost? This depends on the AS implementation. It is likely to have a cache of its own, but that cache may be more generic and therefore slower, and in fact you cannot count on its presence at all.
Do you need to build a cache? The answer usually comes from performance tests. If there is no problem, why waste time and introduce risks?
Resume: yes, build a simple cache and use it -- if it is justified by the performance increase.
Specifics of implementation depend on your preferences. I usually have a cache that does lookups on demand, and has a synchronized map of jndi->object inside. For high-concurrency cache I'd use Read/Write locks instead of naive synchronized -- i.e. many reads can go in parallel, while adding a new entry gets an exclusive access. But those are details much depending on the application details.
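A minimal sketch of that lookup-on-demand cache, with the expensive JNDI lookup abstracted into a Function so the example stays self-contained (the names are illustrative); ConcurrentHashMap.computeIfAbsent is used here in place of the naive synchronized map, so reads proceed in parallel while each lookup runs at most once per key:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// On-demand lookup cache: the first get() for a key runs the lookup
// (a JNDI DataSource lookup in the real app) and stores the result;
// later calls for the same key return the cached object.
class LookupCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> lookup;

    LookupCache(Function<K, V> lookup) { this.lookup = lookup; }

    V get(K key) {
        // computeIfAbsent guarantees the lookup runs at most once per key,
        // even when many threads request the same key concurrently.
        return cache.computeIfAbsent(key, lookup);
    }
}
```

In the DataSource case the Function would wrap `new InitialContext().lookup(jndiName)`; since the cache is permanent, it matches the change-management assumption above that datasources do not change at runtime.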

Update cached data in a hashtable

In order to minimize the number of database queries I need some sort of cache to store pairs of data. My approach now is a hashtable (with Strings as keys, Integers as value). But I want to be able to detect updates in the database and replace the values in my "cache". What I'm looking for is something that makes my stored pairs invalid after a preset timespan, perhaps 10-15 minutes. How would I implement that? Is there something in the standard Java package I can use?
I would use an existing solution (there are many cache frameworks).
Ehcache is great: it can expire values after a given timespan, and I bet it can do much more (it is the only one I have used).
You can either use existing solutions (see previous reply)
Or, if you want a challenge, make your own simple cache class (not recommended for a production project, but it's a great learning experience).
You will need at least 3 members:
A cache data stored as hashtable object,
Next cache expiration date
Cache expiration interval set via constructor.
Then simply have public data getter methods, which verify cache expiration status:
if not expired, call hastable's accessors;
if expired, first call "data load" method that is also called in the constructor to pre-populate and then call hashtable accessors.
For an even cooler cache class (I have implemented it in Perl at my job), you can have additional functionality you can implement:
Individual per-key cache expiration (coupled with overall total cache expiration)
Auto, semi-auto, and single-shot data reload (e.g., reload the entire cache at once; reload a batch of data defined by some predefined query; or reload individual data elements piecemeal). The latter approach is very useful when your cache has many hits on the same exact keys - that way you don't need to reload the universe every time the 3 keys that are always accessed expire.
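A rough sketch of such a roll-your-own cache, using the per-key expiration variant described above (all names are illustrative; a real version would also need the "data load" reload hook):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Expiring cache: each entry carries its own deadline. A read past the
// deadline counts as a miss and removes the stale entry, so the caller
// knows to reload the value from the database.
class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis; // expiration interval set via constructor

    ExpiringCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() > e.expiresAtMillis) {
            map.remove(key); // drop stale entry; caller reloads from the DB
            return null;
        }
        return e.value;
    }
}
```

A 10-15 minute timespan as in the question would just be `new ExpiringCache<>(15 * 60 * 1000L)`.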
You could use a caching framework like OSCache, EHCache, JBoss Cache, JCS... If you're looking for something that follows a "standard", choose a framework that supports the JCache standard interface (javax.cache) aka JSR-107.
For simple needs like what you are describing, I'd look at EHCache or OSCache (I'm not saying they are basic, but they are simple to start with), they both support expiration based on time.
If I had to choose one solution, I'd recommend Ehcache which has my preference, especially now that it has joined Terracotta. And just for the record, Ehcache provides a preview implementation of JSR107 via the net.sf.cache.jcache package.

How to cache information in a DAO in a threadsafe manner

I often need to implement DAO's for some reference data that doesn't change very often. I sometimes cache this in collection field on the DAO - so that it is only loaded once and explicitly updated when required.
However this brings in many concurrency issues - what if another thread attempts to access the data while it is loading or being updated.
Obviously this can be handled by making both the getters and setters of the data synchronised - but for a large web application this is quite an overhead.
I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this.
public class LocationDAOImpl implements LocationDAO {

    private List<Location> locations = null;

    public List<Location> getAllLocations() {
        if (locations == null) {
            loadAllLocations();
        }
        return locations;
    }

    // loadAllLocations() omitted
}
For further information I'm using Hibernate and Spring but this requirement would apply across many technologies.
Some further thoughts:
Should this not be handled in code at all - instead let ehcache or similar handle it?
Is there a common pattern for this that I'm missing?
There are obviously many ways this can be achieved but I've never found a pattern that is simple and maintainable.
Thanks in advance!
The most simple and safe way is to include the ehcache library in your project and use that to setup a cache. These people have solved all the issues you can encounter and they have made the library as fast as possible.
In situations where I've rolled my own reference data cache, I've typically used a ReadWriteLock to reduce thread contention. Each of my accessors then takes the form:
public PersistedUser getUser(String userName) throws MissingReferenceDataException {
    PersistedUser ret;
    rwLock.readLock().lock();
    try {
        ret = usersByName.get(userName);
        if (ret == null) {
            throw new MissingReferenceDataException(String.format("Invalid user name: %s.", userName));
        }
    } finally {
        rwLock.readLock().unlock();
    }
    return ret;
}
The only method to take out the write lock is refresh(), which I typically expose via an MBean:
public void refresh() {
    logger.info("Refreshing reference data.");
    rwLock.writeLock().lock();
    try {
        usersById.clear();
        usersByName.clear();
        // Refresh data from underlying data source.
    } finally {
        rwLock.writeLock().unlock();
    }
}
Incidentally, I opted for implementing my own cache because:
My reference data collections are small so I can always store them all in memory.
My app needs to be simple / fast; I want as few dependencies on external libraries as possible.
The data is rarely updated and when it is the call to refresh() is fairly quick. Hence I eagerly initialise my caches (unlike in your straw man example), which means accessors never need to take out the write lock.
If you just want a quick roll-your own caching solution, have a look at this article on JavaSpecialist, which is a review of the book Java Concurrency in Practice by Brian Goetz.
It talks about implementing a basic thread safe cache using a FutureTask and a ConcurrentHashMap.
The way this is done ensures that only one concurrent thread triggers the long running computation (in your case, your database calls in your DAO).
You'd have to modify this solution to add cache expiry if you need it.
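A condensed sketch of that FutureTask + ConcurrentHashMap memoizer, adapted from the idiom in Java Concurrency in Practice (the Function-based loader stands in for your DAO's database call, and cache expiry is omitted as noted):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

// Memoizer: installs a FutureTask atomically with putIfAbsent, so only
// the winning thread runs the expensive computation for a given key;
// all other threads block on the same Future instead of recomputing.
class Memoizer<K, V> {
    private final ConcurrentHashMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> compute;

    Memoizer(Function<K, V> compute) { this.compute = compute; }

    V get(K key) throws InterruptedException, ExecutionException {
        Future<V> f = cache.get(key);
        if (f == null) {
            FutureTask<V> task = new FutureTask<>(() -> compute.apply(key));
            f = cache.putIfAbsent(key, task);
            if (f == null) {   // we won the race: run the computation once
                f = task;
                task.run();
            }
        }
        return f.get();        // everyone waits on the same result
    }
}
```

For a DAO, `compute` would be the method that loads the reference data; the key point is that a cache miss never triggers duplicate database calls under concurrency.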
The other thought about caching it yourself is garbage collection. Without using a WeakHashMap for your cache, then the GC wouldn't be able to release the memory used by the cache if needed. If you are caching infrequently accessed data (but data that was still worth caching since it is hard to compute), then you might want to help out the garbage collector when running low on memory by using a WeakHashMap.
If your reference data is immutable the second level cache of hibernate could be a reasonable solution.
Obviously this can be handled by making both the getters and setters of the data synchronised - but for a large web application this is quite an overhead.
I've included a trivial flawed example of what I need as a strawman. Please suggest alternative ways to implement this.
While this might be somewhat true, you should take note that the sample code you've provided certainly needs to be synchronized to avoid any concurrency issues when lazy-loading the locations. If that accessor is not synchronized, then you will have:
Multiple threads access the loadAllLocations() method at the same time
Some threads may enter loadAllLocations() even after another thread has completed the method and assigned the result to locations - under the Java Memory Model there is no guarantee that other threads will see the change in the variable without synchronization.
Be careful when using lazy loading/initialization, it seems like a simple performance boost but it can cause lots of nasty threading issues.
I think it's best to not do it yourself, because getting it right is a very difficult thing. Using EhCache or OSCache with Hibernate and Spring is a far better idea.
Besides, it makes your DAOs stateful, which might be problematic. You should have no state at all, besides the connection, factory, or template objects that Spring manages for you.
UPDATE: If your reference data isn't too large, and truly never changes, perhaps an alternative design would be to create enumerations and dispense with the database altogether. No cache, no Hibernate, no worries. Perhaps oxbow_lakes' point is worth considering: perhaps it could be a very simple system.
