Persist guava cache on shutdown - java

I use the following guava cache to store messages for a specific time waiting for a possible response. So I use the cache more like a timeout for messages:
Cache cache = CacheBuilder.newBuilder().expireAfterWrite(7, TimeUnit.DAYS).build();
cache.put(id,message);
...
cache.getIfPresent(id);
In the end, I need to persist the messages together with their current 'timeout' information on shutdown and restore them on startup, with the time that has already elapsed per entry preserved. I couldn't find any methods that give me access to the time information so that I could handle it myself.
The Guava wiki says:
Your application will not need to store more data than what would fit in RAM. (Guava caches are local to a single run of your application. They do not store data in files, or on outside servers. If this does not fit your needs, consider a tool like Memcached.)
Do you think this restriction also applies to a 'timeout' map that is persisted on shutdown?

I don't believe there's any way to recreate the cache with per-entry expiration values -- even if you do use reflection. You might be able to simulate it by using a DelayQueue in a separate thread that explicitly invalidates entries that should have expired, but that's the best I think you can do.
That said, if you're just interested in peeking at the expiration information, I would recommend wrapping your cache values in a class that remembers the expiration time, so you can look up the expiration time for an entry just by looking up its value and calling a getExpirationTime() method or what have you.
That approach, at least, should not break with new Guava releases.
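The wrapper idea can be sketched with the JDK alone. This is a minimal illustration, not Guava API: TimedValue, TimedValues and liveEntries are hypothetical names, and the absolute expiration time is assumed to be computed by the caller when the message is first stored.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: the value carries its own absolute expiration time,
// so the timeout survives serialization independently of the cache.
final class TimedValue<V> implements Serializable {
    private static final long serialVersionUID = 1L;

    private final V value;
    private final long expirationTimeMillis;

    TimedValue(V value, long expirationTimeMillis) {
        this.value = value;
        this.expirationTimeMillis = expirationTimeMillis;
    }

    V getValue() { return value; }

    long getExpirationTime() { return expirationTimeMillis; }

    boolean isExpired(long nowMillis) { return nowMillis >= expirationTimeMillis; }
}

// Hypothetical helper used on startup after the persisted entries are read back.
final class TimedValues {
    private TimedValues() {}

    // Copies only the entries that are still live at nowMillis.
    static <K, V> Map<K, TimedValue<V>> liveEntries(Map<K, TimedValue<V>> entries, long nowMillis) {
        Map<K, TimedValue<V>> live = new HashMap<>();
        for (Map.Entry<K, TimedValue<V>> e : entries.entrySet()) {
            if (!e.getValue().isExpired(nowMillis)) {
                live.put(e.getKey(), e.getValue());
            }
        }
        return live;
    }
}
```

On shutdown, the entries to persist could be taken from cache.asMap(). Note that putting restored entries back into a cache built with expireAfterWrite restarts Guava's own timer, so readers would still need to call isExpired() on the wrapper to honour the original deadline.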

Well, unfortunately Guava doesn't seem to expose this functionality, but if you feel adventurous and absolutely must have it, you could always use reflection. Just look at the sources and see which methods you need. As always, take care: your code might break when Guava's internal implementation changes. The code below seems to work with Guava 10.0.1:
Cache<Integer, String> cache = CacheBuilder.newBuilder().expireAfterWrite(7, TimeUnit.DAYS).build(new CacheLoader<Integer, String>() {
    @Override
    public String load(Integer key) throws Exception {
        return "The value is " + key.toString();
    }
});
Integer key_1 = Integer.valueOf(1);
Integer key_2 = Integer.valueOf(2);
System.out.println(cache.get(key_1));
System.out.println(cache.get(key_2));
ConcurrentMap<Integer, String> map = cache.asMap();
// getEntry and getExpirationTime are Guava internals, not public API
Method m = map.getClass().getDeclaredMethod("getEntry", Object.class);
m.setAccessible(true);
for (Integer key : map.keySet()) {
    Object value = m.invoke(map, key);
    Method m2 = value.getClass().getDeclaredMethod("getExpirationTime");
    m2.setAccessible(true);
    Long expirationTime = (Long) m2.invoke(value);
    System.out.println(key + " expiration time is " + expirationTime);
}

Related

Scheduled in-memory cache invalidation based on predicate

I have List<CompletableFuture> stored in Map
private final Map<UUID, List<CompletableFuture>> hydrationProcesses = new ConcurrentHashMap<>();
Currently there is a daemon thread which runs every 30 seconds and removes all Futures that have already been completed.
There are lots of TTL invalidation implementations, but I am looking for invalidation based on some predicate. I want to get rid of this daemon thread.
Is there any out-of-the-box solution for scheduled cache invalidation based on some custom logic? Maybe something in Spring/Guava that I missed?
Maybe something similar to Guava's
CacheBuilder.newBuilder()
.expireAfterAccess(2,TimeUnit.MILLISECONDS)
.build(loader);
But rather than marking everything as expired after access, I need to check whether the Future has already completed and then remove it from the cache.
I doubt there is a solution based on Guava. Since you don't specifically ask for a "Guava-only" solution, I'd like to give an idea how to solve the problem with cache2k.
Here is what the setup could look like:
final long RECHECK_INTERVAL_MILLIS = 30000;
Cache<UUID, CompletableFuture<Void>> cache =
    new Cache2kBuilder<UUID, CompletableFuture<Void>>(){}
        .loader(new AdvancedCacheLoader<UUID, CompletableFuture<Void>>() {
            @Override
            public CompletableFuture<Void> load(UUID key,
                                                long currentTime,
                                                CacheEntry<UUID, CompletableFuture<Void>> currentEntry) {
                return currentEntry != null ? currentEntry.getValue() : null;
            }
        })
        .expiryPolicy(new ExpiryPolicy<UUID, CompletableFuture<Void>>() {
            @Override
            public long calculateExpiryTime(UUID key,
                                            CompletableFuture<Void> value,
                                            long loadTime,
                                            CacheEntry<UUID, CompletableFuture<Void>> oldEntry) {
                return value.isDone() ? NOW : loadTime + RECHECK_INTERVAL_MILLIS;
            }
        })
        .refreshAhead(true)
        .build();
Actually, cache2k has capabilities similar to those of Guava and other caches. However, there are small extensions that allow more sophisticated setups.
The trick used here is to configure the cache for read-through operation, but make the loader return the current cache value. When an entry expires, the loader is called because of refreshAhead(true), but the current value is preserved and the expiry policy is evaluated again. The predicate that you want checked goes into the expiry policy.
Other caches have read through and custom expiry as well, but lack the concept of a "smarter loader" (AdvancedCacheLoader) that can act more efficiently based on the existing cache value.
We use setups similar to this in production.
There is a downside, too. cache2k uses one timing thread per cache, if custom expiry is used. This means your extra thread will not go away. cache2k will be enhanced in the future to have a global set of threads for timing that are shared among all caches. Update: New versions of cache2k use a common thread pool for timers.
Disclaimer: I am the author of cache2k, so I cannot speak with absolute certainty that there is no possible solution based on Guava.

Pattern for Java ConcurrentHashMap of Sets

A data structure that I use commonly in multi-threaded applications is a ConcurrentHashMap where I want to save a group of items that all share the same key. The problem occurs when installing the first item for a particular key value.
The pattern that I have been using is:
final ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new ConcurrentHashMap<KEYTYPE, Set<VALUETYPE>>();
// ...
Set<VALUETYPE> newSet = new HashSet<VALUETYPE>();
final Set<VALUETYPE> set = hashMap.putIfAbsent(key, newSet);
if (set != null) {
    newSet = set;
}
synchronized (newSet) {
    if (!newSet.contains(value)) {
        newSet.add(value);
    }
}
Is there a better pattern for doing this operation? Is this even thread-safe? Is there a better class to use for the inner Set than java.util.HashSet?
I strongly recommend using the Google Guava libraries for this, specifically an implementation of Multimap. HashMultimap would be your best bet, though if you need concurrent update operations you would need to wrap it using Multimaps.synchronizedSetMultimap().
Another option is to use a computing map (also from Guava): a map that creates the value on the spot when a call to get(key) finds no existing entry. Computing maps are created using MapMaker.
The code from your question would be roughly:
ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new MapMaker()
    .makeComputingMap(
        new Function<KEYTYPE, Set<VALUETYPE>>() {
            public Set<VALUETYPE> apply(KEYTYPE key) {
                return new HashSet<VALUETYPE>();
            }
        });
The Function would only be called when a call to get() for a specific key would otherwise return null. This means that you can then do this:
hashMap.get(key).add(value);
safely knowing that the HashSet<VALUETYPE> is created if it doesn't already exist.
MapMaker is also relevant because of the control it gives you over the tuning of the returned Map, letting you specify, for example, the concurrency level using the method concurrencyLevel(). You may find that useful:
Guides the allowed concurrency among update operations. Used as a hint for internal sizing. The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because assignment of entries to these partitions is not necessarily uniform, the actual concurrency observed may vary.
I think using java.util.concurrent.ConcurrentSkipListMap and java.util.concurrent.ConcurrentSkipListSet could help you resolve the concurrency concerns.
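As a rough sketch of that last suggestion, the skip-list collections can replace both the outer map and the inner HashSet, which removes the need for the synchronized block. ConcurrentSetMap and its methods are hypothetical names, and the sketch assumes that keys and values are Comparable and that sets are never removed from the map once created.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical helper: a concurrent map of concurrent sets. Keys and values
// must be Comparable because the skip-list collections are sorted.
final class ConcurrentSetMap<K extends Comparable<K>, V extends Comparable<V>> {
    private final ConcurrentMap<K, Set<V>> map = new ConcurrentSkipListMap<>();

    // Adds value to the set for key, creating the set on first use.
    // Returns true if the value was not already present.
    boolean add(K key, V value) {
        Set<V> set = map.get(key);
        if (set == null) {
            Set<V> fresh = new ConcurrentSkipListSet<>();
            // putIfAbsent returns the other thread's set if it won the race.
            Set<V> existing = map.putIfAbsent(key, fresh);
            set = (existing != null) ? existing : fresh;
        }
        // The inner set is thread-safe, so no external lock is needed here.
        return set.add(value);
    }

    // Returns the current set for key, or an empty set if there is none.
    Set<V> get(K key) {
        Set<V> set = map.get(key);
        return (set != null) ? set : Collections.<V>emptySet();
    }
}
```

The putIfAbsent step is the same as in the question; the difference is only that the inner set handles its own concurrency.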

reliably forcing Guava map eviction to take place

EDIT: I've reorganized this question to reflect the new information that since became available.
This question is based on the responses to a question by Viliam concerning Guava Maps' use of lazy eviction: Laziness of eviction in Guava's maps
Please read this question and its response first, but essentially the conclusion is that Guava maps do not asynchronously calculate and enforce eviction. Given the following map:
ConcurrentMap<String, MyObject> cache = new MapMaker()
.expireAfterAccess(10, TimeUnit.MINUTES)
.makeMap();
Once ten minutes have passed following access to an entry, it will still not be evicted until the map is "touched" again. Known ways to do this include the usual accessors - get() and put() and containsKey().
The first part of my question [solved]: what other calls cause the map to be "touched"? Specifically, does anyone know if size() falls into this category?
The reason for wondering this is that I've implemented a scheduled task to occasionally nudge the Guava map I'm using for caching, using this simple method:
public static void nudgeEviction() {
cache.containsKey("");
}
However I'm also using cache.size() to programmatically report the number of objects contained in the map, as a way to confirm this strategy is working. But I haven't been able to see a difference from these reports, and now I'm wondering if size() also causes eviction to take place.
Answer: So Mark has pointed out that in release 9, eviction is invoked only by the get(), put(), and replace() methods, which would explain why I wasn't seeing an effect for containsKey(). This will apparently change with the next version of guava which is set for release soon, but unfortunately my project's release is set sooner.
This puts me in an interesting predicament. Normally I could still touch the map by calling get(""), but I'm actually using a computing map:
ConcurrentMap<String, MyObject> cache = new MapMaker()
.expireAfterAccess(10, TimeUnit.MINUTES)
.makeComputingMap(loadFunction);
where loadFunction loads the MyObject corresponding to the key from a database. It's starting to look like I have no easy way of forcing eviction until r10. But even being able to reliably force eviction is put into doubt by the second part of my question:
The second part of my question [solved]: In reaction to one of the responses to the linked question, does touching the map reliably evict all expired entries? In the linked answer, Niraj Tolia indicates otherwise, saying eviction is potentially only processed in batches, which would mean multiple calls to touch the map might be needed to ensure all expired objects were evicted. He did not elaborate, however this seems related to the map being split into segments based on concurrency level. Assuming I used r10, in which a containsKey("") does invoke eviction, would this then be for the entire map, or only for one of the segments?
Answer: maaartinus has addressed this part of the question:
Beware that containsKey and other reading methods only run postReadCleanup, which does nothing but on each 64th invocation (see DRAIN_THRESHOLD). Moreover, it looks like all cleanup methods work with single Segment only.
So it looks like calling containsKey("") wouldn't be a viable fix, even in r10. This reduces my question to the title: How can I reliably force eviction to occur?
Note: Part of the reason my web app is noticeably affected by this issue is that when I implemented caching I decided to use multiple maps - one for each class of my data objects. So with this issue there is the possibility that one area of code is executed, causing a bunch of Foo objects to be cached, and then the Foo cache isn't touched again for a long time so it doesn't evict anything. Meanwhile Bar and Baz objects are being cached from other areas of code, and memory is being eaten. I'm setting a maximum size on these maps, but this is a flimsy safeguard at best (I'm assuming its effect is immediate - still need to confirm this).
UPDATE 1: Thanks to Darren for linking the relevant issues - they now have my votes. So it looks like a resolution is in the pipeline, but seems unlikely to be in r10. In the meantime, my question remains.
UPDATE 2: At this point I'm just waiting for a Guava team member to give feedback on the hack maaartinus and I put together (see answers below).
LAST UPDATE: feedback received!
I just added the method Cache.cleanUp() to Guava. Once you migrate from MapMaker to CacheBuilder you can use that to force eviction.
I was wondering about the same issue you described in the first part of your question. From what I can tell from looking at the source code for Guava's CustomConcurrentHashMap (release 9), it appears that entries are evicted on the get(), put(), and replace() methods. The containsKey() method does not appear to invoke eviction. I'm not 100% sure because I only took a quick pass at the code.
Update:
I also found a more recent version of the CustomConcurrentHashMap in Guava's git repository and it looks like containsKey() has been updated to invoke eviction.
Both release 9 and the latest version I just found do not invoke eviction when size() is called.
Update 2:
I recently noticed that Guava r10 (yet to be released) has a new class called CacheBuilder. Basically this class is a forked version of the MapMaker but with caching in mind. The documentation suggests that it will support some of the eviction requirements you are looking for.
I reviewed the updated code in r10's version of the CustomConcurrentHashMap and found what looks like a scheduled map cleaner. Unfortunately, that code appears unfinished at this point but r10 looks more and more promising each day.
Beware that containsKey and other reading methods only run postReadCleanup, which does nothing but on each 64th invocation (see DRAIN_THRESHOLD). Moreover, it looks like all cleanup methods work with single Segment only.
The easiest way to enforce eviction seems to be to put some dummy object into each segment. For this to work, you'd need to analyze CustomConcurrentHashMap.hash(Object), which is surely no good idea, as this method may change anytime. Moreover, depending on the key class it may be hard to find a key with a hashCode ensuring it lands in a given segment.
You could use reads instead, but would have to repeat them 64 times per segment. Here, it'd be easy to find a key with an appropriate hashCode, since any object is allowed as an argument.
Maybe you could hack into the CustomConcurrentHashMap source code instead, it could be as trivial as
public void runCleanup() {
final Segment<K, V>[] segments = this.segments;
for (int i = 0; i < segments.length; ++i) {
segments[i].runCleanup();
}
}
but I wouldn't do it without a lot of testing and/or an OK by a guava team member.
Yep, we've gone back and forth a few times on whether these cleanup tasks should be done on a background thread (or pool), or should be done on user threads. If they were done on a background thread, this would eventually happen automatically; as it is, it'll only happen as each segment gets used. We're still trying to come up with the right approach here - I wouldn't be surprised to see this change in some future release, but I also can't promise anything or even make a credible guess as to how it will change. Still, you've presented a reasonable use case for some kind of background or user-triggered cleanup.
Your hack is reasonable, as long as you keep in mind that it's a hack, and liable to break (possibly in subtle ways) in future releases. As you can see in the source, Segment.runCleanup() calls runLockedCleanup and runUnlockedCleanup: runLockedCleanup() will have no effect if it can't lock the segment, but if it can't lock the segment it's because some other thread has the segment locked, and that other thread can be expected to call runLockedCleanup as part of its operation.
Also, in r10, there's CacheBuilder/Cache, analogous to MapMaker/Map. Cache is the preferred approach for many current users of makeComputingMap. It uses a separate CustomConcurrentHashMap, in the common.cache package; depending on your needs, you may want your GuavaEvictionHacker to work with both. (The mechanism is the same, but they're different Classes and therefore different Methods.)
I'm not a big fan of hacking into or forking external code until absolutely necessary. This problem occurs in part due to an early decision for MapMaker to fork ConcurrentHashMap, thereby dragging in a lot of complexity that could have been deferred until after the algorithms were worked out. By patching above MapMaker, the code is robust to library changes so that you can remove your workaround on your own schedule.
An easy approach is to use a priority queue of weak reference tasks and a dedicated thread. This has the drawback of creating many stale no-op tasks, which can become excessive due to the O(lg n) insertion penalty. It works reasonably well for small, less frequently used caches. It was the original approach taken by MapMaker, and it's simple to write your own decorator.
A more robust choice is to mirror the lock amortization model with a single expiration queue. The head of the queue can be volatile so that a read can always peek to determine if it has expired. This allows all reads to trigger an expiration and an optional clean-up thread to check regularly.
By far the simplest is to use #concurrencyLevel(1) to force MapMaker to use a single segment. This reduces the write concurrency, but most caches are read heavy so the loss is minimal. The original hack to nudge the map with a dummy key would then work fine. This would be my preferred approach, but the other two options are okay if you have high write loads.
I don't know if it is appropriate for your use case, but your main concern about the lack of background cache eviction seems to be memory consumption. In that case, using softValues() on the MapMaker, which allows the garbage collector to reclaim entries from the cache when memory runs low, could easily be the solution for you. I have used this on a subscription server (ATOM) where entries are served through a Guava cache using SoftReferences for values.
Based on maaartinus's answer, I came up with the following code which uses reflection rather than directly modifying the source (If you find this useful please upvote his answer!). While it will come at a performance penalty for using reflection, the difference should be negligible since I'll run it about once every 20 minutes for each caching Map (I'm also caching the dynamic lookups in the static block which will help). I have done some initial testing and it appears to work as intended:
public class GuavaEvictionHacker {
//Class objects necessary for reflection on Guava classes - see Guava docs for info
private static final Class<?> computingMapAdapterClass;
private static final Class<?> nullConcurrentMapClass;
private static final Class<?> nullComputingConcurrentMapClass;
private static final Class<?> customConcurrentHashMapClass;
private static final Class<?> computingConcurrentHashMapClass;
private static final Class<?> segmentClass;
//MapMaker$ComputingMapAdapter#cache points to the wrapped CustomConcurrentHashMap
private static final Field cacheField;
//CustomConcurrentHashMap#segments points to the array of Segments (map partitions)
private static final Field segmentsField;
//CustomConcurrentHashMap$Segment#runCleanup() enforces eviction on the calling Segment
private static final Method runCleanupMethod;
static {
try {
//look up Classes
computingMapAdapterClass = Class.forName("com.google.common.collect.MapMaker$ComputingMapAdapter");
nullConcurrentMapClass = Class.forName("com.google.common.collect.MapMaker$NullConcurrentMap");
nullComputingConcurrentMapClass = Class.forName("com.google.common.collect.MapMaker$NullComputingConcurrentMap");
customConcurrentHashMapClass = Class.forName("com.google.common.collect.CustomConcurrentHashMap");
computingConcurrentHashMapClass = Class.forName("com.google.common.collect.ComputingConcurrentHashMap");
segmentClass = Class.forName("com.google.common.collect.CustomConcurrentHashMap$Segment");
//look up Fields and set accessible
cacheField = computingMapAdapterClass.getDeclaredField("cache");
segmentsField = customConcurrentHashMapClass.getDeclaredField("segments");
cacheField.setAccessible(true);
segmentsField.setAccessible(true);
//look up the cleanup Method and set accessible
runCleanupMethod = segmentClass.getDeclaredMethod("runCleanup");
runCleanupMethod.setAccessible(true);
}
catch (ClassNotFoundException cnfe) {
throw new RuntimeException("ClassNotFoundException thrown in GuavaEvictionHacker static initialization block.", cnfe);
}
catch (NoSuchFieldException nsfe) {
throw new RuntimeException("NoSuchFieldException thrown in GuavaEvictionHacker static initialization block.", nsfe);
}
catch (NoSuchMethodException nsme) {
throw new RuntimeException("NoSuchMethodException thrown in GuavaEvictionHacker static initialization block.", nsme);
}
}
/**
* Forces eviction to take place on the provided Guava Map. The Map must be an instance
* of either {@code CustomConcurrentHashMap} or {@code MapMaker$ComputingMapAdapter}.
*
* @param guavaMap the Guava Map to force eviction on.
*/
public static void forceEvictionOnGuavaMap(ConcurrentMap<?, ?> guavaMap) {
try {
//we need to get the CustomConcurrentHashMap instance
Object customConcurrentHashMap;
//get the type of what was passed in
Class<?> guavaMapClass = guavaMap.getClass();
//if it's a CustomConcurrentHashMap we have what we need
if (guavaMapClass == customConcurrentHashMapClass) {
customConcurrentHashMap = guavaMap;
}
//if it's a NullConcurrentMap (auto-evictor), return early
else if (guavaMapClass == nullConcurrentMapClass) {
return;
}
//if it's a computing map we need to pull the instance from the adapter's "cache" field
else if (guavaMapClass == computingMapAdapterClass) {
customConcurrentHashMap = cacheField.get(guavaMap);
//get the type of what we pulled out
Class<?> innerCacheClass = customConcurrentHashMap.getClass();
//if it's a NullComputingConcurrentMap (auto-evictor), return early
if (innerCacheClass == nullComputingConcurrentMapClass) {
return;
}
//otherwise make sure it's a ComputingConcurrentHashMap - error if it isn't
else if (innerCacheClass != computingConcurrentHashMapClass) {
throw new IllegalArgumentException("Provided ComputingMapAdapter's inner cache was an unexpected type: " + innerCacheClass);
}
}
//error for anything else passed in
else {
throw new IllegalArgumentException("Provided ConcurrentMap was not an expected Guava Map: " + guavaMapClass);
}
//pull the array of Segments out of the CustomConcurrentHashMap instance
Object[] segments = (Object[])segmentsField.get(customConcurrentHashMap);
//loop over them and invoke the cleanup method on each one
for (Object segment : segments) {
runCleanupMethod.invoke(segment);
}
}
catch (IllegalAccessException iae) {
throw new RuntimeException(iae);
}
catch (InvocationTargetException ite) {
throw new RuntimeException(ite.getCause());
}
}
}
I'm looking for feedback on whether this approach is advisable as a stopgap until the issue is resolved in a Guava release, particularly from members of the Guava team when they get a minute.
EDIT: updated the solution to allow for auto-evicting maps (NullConcurrentMap or NullComputingConcurrentMap residing in a ComputingMapAdapter). This turned out to be necessary in my case, since I'm calling this method on all of my maps and a few of them are auto-evictors.

How to make use of element versioning in an EHCache instance?

I am caching objects that are being sent to my component in an asynchronous way. In other words, the order in which these objects arrive is unpredictable. To avoid any issues, I have added a version attribute to my objects (which is basically a timestamp). The idea is that any object that arrives with a version older than the one already cached can be discarded.
The "Element" class of EHCache (which wraps objects in an EHCache) seems to facilitate this: apart from a key and value, the constructor can take a (long-based) version. I cannot make this work in the way I'd expect it to work though. The following code snippet demonstrates my problem (Using EHCache 2.1.1):
public static void main(String[] args) {
final CacheManager manager = CacheManager.create();
final Cache testCache = new Cache(new CacheConfiguration("test", 40));
manager.addCache(testCache);
final String key = "key";
final Element elNew = new Element(key, "NEW", 2L);
testCache.put(elNew);
final Element elOld = new Element(key, "OLD", 1L);
testCache.put(elOld);
System.out.println("Cache content:");
for (Object k : testCache.getKeys()) {
System.out.println(testCache.get(k));
}
}
I'd expect the code above to cause the cached value to be "NEW"; instead, "OLD" is printed. If you play a bit with the order in which elements are inserted, you'll find that the last one inserted is the one that remains in the cache. Versioning seems to be ignored.
Am I not using the versioning-feature properly, or is it perhaps not intended to be used for this purpose? Can anyone recommend alternatives?
EhCache apparently ignores the value of the version field — its meaning is defined by the user. So EhCache overwrites your version 2L with version 1L without knowing what the version numbers mean.
See 1) http://jira.terracotta.org/jira/browse/EHC-765
it was decided that providing an internal versioning scheme would
cause unnecessary overhead for all users. Instead we now leave the
version value untouched so that it is entirely within the control of
the user.
And 2) http://jira.terracotta.org/jira/browse/EHC-666
[...] I would much prefer the solution proposed by Marek, that we grant the
user complete control over the version attribute and to not mutate it
at all internally. This prevents there being any performance impact
for the bulk of users, and allows the user the flexibility to use it
as they see fit. [...]
As agreed with Greg via email I fixed this as
per my last comment.
I suppose using the version field might lead to race conditions, resulting in one thread overwriting the up-to-date version of a cache item with a somewhat older version. Therefore, in my application, I have a counter that keeps track of the most recent version of the database, and when I load a cached value with a version field different from the most-recent-database-version-value, I know that the cached value might be stale and ignore it.
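Since the cache leaves the meaning of the version to the user, the "discard older versions" rule has to be enforced by the caller. Below is a minimal JDK-only sketch of one way to do that; it does not use the Ehcache API, and Versioned and VersionedCache are hypothetical names introduced for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value wrapper carrying a caller-defined version (e.g. a timestamp).
final class Versioned<V> {
    final V value;
    final long version;

    Versioned(V value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical cache front end that refuses to replace a newer entry with an older one.
final class VersionedCache<K, V> {
    private final ConcurrentHashMap<K, Versioned<V>> entries = new ConcurrentHashMap<>();

    // ConcurrentHashMap.merge applies the comparison atomically per key, so two
    // racing writers cannot leave the older version in place.
    void put(K key, V value, long version) {
        entries.merge(key, new Versioned<>(value, version),
                (current, incoming) -> incoming.version > current.version ? incoming : current);
    }

    // Returns the cached value, or null if the key is absent.
    V get(K key) {
        Versioned<V> v = entries.get(key);
        return (v != null) ? v.value : null;
    }
}
```

With this wrapper, inserting version 2 followed by version 1 for the same key leaves the version 2 value in place, which is the behaviour the question expected.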

How to use ReadWriteLock?

I'm in the following situation.
At web application startup I need to load a Map which is thereafter used by multiple incoming threads. That is, requests come in and the Map is used to find out whether it contains a particular key, and if so the value (the object) is retrieved and associated with another object.
Now, at times the content of the Map changes. I don't want to restart my application to reload the new situation. Instead I want to do this dynamically.
However, at the time the Map is re-loading (removing all items and replacing them with the new ones), concurrent read requests on that Map still arrive.
What should I do to prevent all read threads from accessing that Map while it's being reloaded? How can I do this in the most performant way, given that I only need this while the Map is reloading, which will only occur sporadically (once every x weeks)?
If the above is not an option (blocking) how can I make sure that while reloading my read request won't suffer from unexpected exceptions (because a key is no longer there, or a value is no longer present or being reloaded) ?
I was given the advice that a ReadWriteLock might help me out. Can someone provide an example of how I should use this ReadWriteLock with my readers and my writer?
Thanks,
E
I suggest handling this as follows:
Have your map accessible at a central place (could be a Spring singleton, a static ...).
When starting to reload, let the instance as is, work in a different Map instance.
When that new map is filled, replace the old map with this new one (that's an atomic operation).
Sample code:
static volatile Map<U, V> map = ....;
// **************************
Map<U, V> tempMap = new ...;
load(tempMap);
map = tempMap;
Concurrency effects :
volatile helps with visibility of the variable to other threads.
While reloading the map, all other threads see the old value undisturbed, so they suffer no penalty whatsoever.
Any thread that retrieves the map the instant before it is changed will work with the old values.
It can ask several gets to the same old map instance, which is great for data consistency (not loading the first value from the older map, and others from the newer).
It will finish processing its request with the old map, but the next request will ask the map again, and will receive the newer values.
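The steps and effects above can be put together in one small class. This is only a sketch: ReloadableLookup, reload and snapshot are hypothetical names, and the fresh data is assumed to arrive as an already loaded Map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for a lookup table that is replaced wholesale on reload.
final class ReloadableLookup<K, V> {
    // volatile: a reader that sees the new reference also sees the fully populated map.
    private volatile Map<K, V> current = Collections.emptyMap();

    // Builds the replacement off to the side, then publishes it with a single write.
    void reload(Map<K, V> freshData) {
        Map<K, V> next = new HashMap<>(freshData);
        current = Collections.unmodifiableMap(next);
    }

    // Readers take one snapshot and use it for the whole request.
    Map<K, V> snapshot() {
        return current;
    }
}
```

A request that calls snapshot() once and keeps the returned reference sees either the old contents or the new contents, never a mixture, and readers never block.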
If the client threads do not modify the map, i.e. the contents of the map is solely dependent on the source from where it is loaded, you can simply load a new map and replace the reference to the map your client threads are using once the new map is loaded.
Other than using twice the memory for a short time, no performance penalty is incurred.
In case the map uses too much memory to have 2 of them, you can use the same tactic per object in the map; iterate over the map, construct a new mapped-to object and replace the original mapping once the object is loaded.
Note that changing the reference as suggested by others could cause problems if you rely on the map being unchanged for a while (e.g. if (map.containsKey(key)) { V value = map.get(key); ... }). If you need that, you should keep a local reference to the map:
static Map<U,V> map = ...;
void process() {
    Map<U,V> local = map;
    if (local.containsKey(key)) {
        V value = local.get(key);
        ...
    }
}
EDIT:
The assumption is that you don't want costly synchronization for your client threads. As a trade-off, you allow client threads to finish the work they had already begun before your map changed, ignoring any changes to the map that happen while they are running. This way, you can safely make some assumptions about your map, e.g. that a key is present and always mapped to the same value for the duration of a single request. In the example above, if the reloading thread replaced the map just after a client called map.containsKey(key), the client might get null from map.get(key), and you'd almost certainly end this request with a NullPointerException. So if you're doing multiple reads on the map and need to make assumptions like the one mentioned before, it's easiest to keep a local reference to the (maybe obsolete) map.
The volatile keyword isn't strictly necessary here. It would just make sure that the new map is used by other threads as soon as you change the reference (map = newMap). Without volatile, a subsequent read (local = map) could still return the old reference for some time (we're talking about less than a nanosecond though), especially on multicore systems if I remember correctly. I wouldn't care about it, but if you feel a need for that extra bit of multi-threading beauty, you're free to use it of course ;)
I like the volatile Map solution from KLE a lot and would go with that. Another idea that someone might find interesting is to use the map equivalent of a CopyOnWriteArrayList, basically a CopyOnWriteMap. We built one of these internally and it is non-trivial but you might be able to find a COWMap out in the wild:
http://old.nabble.com/CopyOnWriteMap-implementation-td13018855.html
This is the answer from the JDK javadocs for the ReentrantReadWriteLock implementation of ReadWriteLock. A few years late but still valid, especially if you don't want to rely only on volatile:
class RWDictionary {
private final Map<String, Data> m = new TreeMap<String, Data>();
private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
private final Lock r = rwl.readLock();
private final Lock w = rwl.writeLock();
public Data get(String key) {
r.lock();
try { return m.get(key); }
finally { r.unlock(); }
}
public String[] allKeys() {
r.lock();
try { return m.keySet().toArray(new String[0]); }
finally { r.unlock(); }
}
public Data put(String key, Data value) {
w.lock();
try { return m.put(key, value); }
finally { w.unlock(); }
}
public void clear() {
w.lock();
try { m.clear(); }
finally { w.unlock(); }
}
}
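Applied to the reload scenario from the question, the same locking pattern might look like the sketch below. LockedLookup and reload are hypothetical names; the whole reload runs under the write lock, so readers block for its duration and never observe a half-filled map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical lookup table guarded by a read/write lock.
final class LockedLookup<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: readers wait until the reload has finished.
    void reload(Map<K, V> freshData) {
        lock.writeLock().lock();
        try {
            map.clear();
            map.putAll(freshData);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

To keep the time spent under the write lock short, freshData should be fully loaded from its source before reload() is called.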
