Lock or wait cache load - java

We need to lock a method responsible for loading database data into a HashMap-based cache.
A possible situation is that a second thread tries to access the method while the first thread is still loading the cache.
We consider the second thread's effort in this case to be superfluous. We would therefore like to have that second thread wait until the first thread is finished, and then return (without loading the cache again).
What I have works, but it seems quite inelegant. Are there better solutions?
private static final ReentrantLock cacheLock = new ReentrantLock();
private void loadCachemap() {
if (cacheLock.tryLock()) {
try {
this.cachemap = retrieveParamCacheMap();
} finally {
cacheLock.unlock();
}
} else {
try {
cacheLock.lock(); // wait until thread doing the load is finished
} finally {
try {
cacheLock.unlock();
} catch (IllegalMonitorStateException e) {
logger.error("loadCachemap() finally {}",e);
}
}
}
}

I prefer a more resilient approach using read locks AND write locks. Something like:
private static final ReadWriteLock cacheLock = new ReentrantReadWriteLock();
private static final Lock cacheReadLock = cacheLock.readLock();
private static final Lock cacheWriteLock = cacheLock.writeLock();
private void loadCache() throws Exception {
// Expiry.
while (storeCache.expired(CachePill)) {
/**
* Allow only one in - all others will wait for 5 seconds before checking again.
*
* Eventually the one that got in will finish loading, refresh the Cache pill and let all the waiting ones out.
*
* Also waits until all read locks have been released - not sure if that might cause problems under busy conditions.
*/
if (cacheWriteLock.tryLock(5, TimeUnit.SECONDS)) {
try {
// Got a lock! Start the rebuild if still out of date.
if (storeCache.expired(CachePill)) {
rebuildCache();
}
} finally {
cacheWriteLock.unlock();
}
}
}
}
Note that storeCache.expired(CachePill) detects a stale cache, which may be more than you want, but the concept is the same: establish a write lock before updating the cache, which denies all read attempts until the rebuild is done. Also, either manage multiple write attempts in a loop of some sort, or just drop out and let the read lock wait for access.
A read from the cache now looks like this:
public Object load(String id) throws Exception {
    Store store = null;
    // Make sure cache is fresh.
    loadCache();
    // Establish a read lock so we do not attempt a read while the cache is being updated.
    // Acquire it before the try block so that finally only releases a lock we actually hold.
    cacheReadLock.lock();
    try {
        store = storeCache.get(id);
    } finally {
        // Make sure the lock is cleared.
        cacheReadLock.unlock();
    }
    return store;
}
The primary benefit of this form is that read access does not block other read access but everything stops cleanly during a rebuild - even other rebuilds.

You didn't say how complicated your structure is and how much concurrency / congestion you need. There are many ways to address your need.
If your data is simple, use a ConcurrentHashMap or similar to hold your data. Then just read and write from any thread without further locking.
Another alternative is to use the actor model and put reads and writes on the same queue.
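A minimal sketch of the ConcurrentHashMap suggestion, assuming entries can be loaded one key at a time. The class name ParamCache and the loader function are hypothetical stand-ins for the asker's database call. computeIfAbsent applies the loader at most once per key, and a second thread asking for the same key waits for that load instead of repeating it:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache: ParamCache and the loader are illustrative names,
// not part of the original question.
class ParamCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ParamCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Loads the value on first access. Concurrent callers for the same key
    // block until the single load has finished, then see the cached value.
    V get(K key) {
        return map.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        ParamCache<String, String> cache = new ParamCache<>(key -> {
            loads.incrementAndGet();          // stands in for a database read
            return "value-of-" + key;
        });
        System.out.println(cache.get("a"));
        System.out.println(cache.get("a"));   // served from the map, no second load
        System.out.println("loads = " + loads.get());
    }
}
```

This fits lazy per-key loading only. If the whole map has to be replaced at once, one of the locking approaches in this thread is a better match.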

If all you need is to fill a read-only map which is initialized from the database when first requested, you could use any form of double-checked locking, which can be implemented in a number of ways. The simplest variant would be the following:
private volatile Map<T, V> cacheMap;
public void loadCacheMap() {
if (cacheMap == null) {
synchronized (this) {
if (cacheMap == null) {
cacheMap = retrieveParamCacheMap();
}
}
}
}
But I would personally prefer to avoid any form of synchronization here and just make sure that the initialization is done before any other thread can access the map (for example, in an init method in a DI container). In this case you would even avoid the overhead of volatile.
EDIT: This answer works only when a single initial load is expected. In case of multiple updates, you could try to replace the tryLock with some other form of test and test-and-set, for example using something like this:
private final AtomicReference<CountDownLatch> sync =
        new AtomicReference<>(new CountDownLatch(0));
private void loadCacheMap() throws InterruptedException {
    CountDownLatch oldSync = sync.get();
    if (oldSync.getCount() == 0) { // if nobody updating now
        CountDownLatch newSync = new CountDownLatch(1);
        if (sync.compareAndSet(oldSync, newSync)) {
            try {
                cacheMap = retrieveParamCacheMap();
            } finally {
                // release the waiting threads even if the load throws
                newSync.countDown();
            }
            return;
        }
    }
    // another thread is loading: wait for that load instead of repeating it
    sync.get().await();
}

Related

Java possible race condition of Collection in -- spring framework

I am using the Spring framework under Tomcat to write a service that can handle multiple concurrent requests. There is a static variable declared in my service class to be shared across all the threads and protected by a read-write lock. This static collection is read and written to periodically. Each time I update it I acquire the write lock, and when I read it I acquire the read lock.
The data stored in the collection is entries from a database table. So it is a collection of type List. Basically I have a table that gets updated rarely and so I am caching it in the process memory.
Now there are times when I need to log this data. Can I log the returned object without acquiring the lock in the method, or will that cause a race condition? The logging is for the purpose of debugging.
Also, the returned collection is read-only and is not modified by any method. No one outside of the service object uses this Collection.
I feel this should work because when a new collection is allocated the old collection will only go away if all references to it have gone away. And while updating no new references to old or new object are allowed till the write lock is unlocked.
The code looks as follows:
class ObjService {
private static Collection<OtherObj> _staticCollection;
private static ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock(true);
private Collection<OtherObj> getCollection () {
Collection<OtherObj> retVal = null;
rwlock.readLock().lock();
if (_staticCollection != null) {
retVal = _staticCollection;
rwlock.readLock().unlock();
log (retVal);
}
else {
rwlock.readLock().unlock();
ReloadCollectionFromDB ();
rwlock.readLock().lock();
retVal = _staticCollection;
rwlock.readLock().unlock();
}
return retVal;
}
private void ReloadCollectionFromDB () {
Collection<OtherObj> otherObjCol = null;
try {
otherObjCol = objRepo.findAll ();
}
catch (Exception ex) {
// log exception
return;
}
rwlock.writeLock().lock();
_staticCollection = otherObjCol;
rwlock.writeLock().unlock();
}
// periodically get data from DB
@Scheduled(initialDelayString = "120000", fixedDelayString = "540000")
void readLoadCache () {
ReloadCollectionFromDB ();
}
}
If there are better ways of doing this, I would appreciate some guidance.
Many thanks,
~Ash

Using Java Queued Locks

I'm looking for a recommendation on how to make this code thread-safe with locks in Java. I know there are a lot of gotchas with locks; obscure problems, race-conditions, etc that can pop up. Here is the basic idea of what I'm trying to achieve, implemented rather naïvely:
public class MultipleThreadWriter {
boolean isUpgrading=false;
boolean isWriting=false;
public void writeData(String uniqueId) {
if (isUpgrading)
//block until isUpgrading is false
isWriting = true;
{
//do write stuff
}
isWriting = false;
}
public void upgradeSystem() {
if (isWriting)
//block until isWriting is false
isUpgrading = true;
{
//do updates
}
isUpgrading = false;
}
}
The basic idea is that multiple threads are allowed to write data simultaneously. It doesn't matter, since no two threads will ever be writing to data pertaining to the same uniqueId. However, the "system upgrade" manipulates data for all uniqueIds, so it must block (wait in line) until no data is being written before it can start, at which point it blocks all writes until it is finished. (It is definitely not a consumer/producer pattern going on here- upgrading occurs arbitrarily, i.e. has no relation to the data being written.)
This sounds like a good application for a readers-writer lock.
However, in this case your "readers" are the small update tasks that can all run concurrently, and your "writer" is the system upgrade task.
There's an implementation of this in the Java standard library:
java.util.concurrent.locks.ReentrantReadWriteLock
The lock has fair and non-fair modes. If you want the system upgrade to run as soon as possible after it's scheduled, then use the fair mode of the lock. If you want the upgrade to be applied during idle time (i.e., wait until there are no small updates going on), then you can use the non-fair mode instead.
Since this is a bit of an unorthodox application of the readers-writer lock (your readers are actually writing too!), make sure to comment this well in your code. You might even consider writing a wrapper around the ReentrantReadWriteLock class that provides localUpdateLock vs globalUpdateLock methods, which delegate to the readLock and writeLock, respectively.
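The wrapper idea could look roughly like the sketch below. The class name UpdateLock and the Runnable-based methods are assumptions made for illustration; only the mapping of local updates to the read lock and global updates to the write lock comes from the answer above. The constructor flag selects the fair or non-fair mode:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper: local updates share the read lock, the global
// update takes the write lock. Names are illustrative only.
class UpdateLock {
    private final ReentrantReadWriteLock rwl;

    UpdateLock(boolean fair) {
        this.rwl = new ReentrantReadWriteLock(fair);
    }

    // Shared mode: many local updates may run at the same time.
    void localUpdate(Runnable task) {
        rwl.readLock().lock();
        try {
            task.run();
        } finally {
            rwl.readLock().unlock();
        }
    }

    // Exclusive mode: waits for running local updates and blocks new ones.
    void globalUpdate(Runnable task) {
        rwl.writeLock().lock();
        try {
            task.run();
        } finally {
            rwl.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        UpdateLock lock = new UpdateLock(true);
        lock.localUpdate(() -> System.out.println("writing data for one id"));
        lock.globalUpdate(() -> System.out.println("upgrading the whole system"));
    }
}
```

One caveat: ReentrantReadWriteLock does not support upgrading a read lock to a write lock, so calling globalUpdate from inside a localUpdate task on the same thread would block forever.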
Based on the answer from @DaoWen, this is my untested solution.
public class MultipleThreadWriter {
private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
private final Lock r = rwl.readLock();
private final Lock w = rwl.writeLock();
public void writeData() {
r.lock();
try {
//do write stuff
} finally {
r.unlock();
}
}
public void upgradeSystem() {
w.lock();
try {
//do updates
} finally {
w.unlock();
}
}
}

Synchronizing on cached items

I'm using something like
Cache<Integer, Item> cache;
where the Items are independent of each other and look like
private static class Item {
private final int id;
... some mutable data
synchronized doSomething() {...}
synchronized doSomethingElse() {...}
}
The idea is to obtain the item from the cache and call a synchronized method on it. In case of a miss, the item can be recreated, that's fine.
A problem occurs when an item gets evicted from the cache and recreated while a thread runs a synchronized method. A new thread obtains a new item and synchronizes on it... so for a single id, there are two threads inside the synchronized method. FAIL.
Is there an easy way around it? It's Guava Cache, if it helps.
I think the suggestion from Louis, using the keys for locking, is the simplest and most practical one. Here is a code snippet that, without the help of Guava libraries, illustrates the idea:
static final Lock[] locks = new Lock[ ... ];
static { /* initialize lock array */ }
int id;
void doSomething() {
    // floorMod keeps the index non-negative even if id is negative
    final Lock lock = locks[Math.floorMod(id, locks.length)];
    lock.lock();
    try {
        /* protected code */
    } finally {
        lock.unlock();
    }
}
The size of the lock array limits the maximum amount of parallelism you get. If your code is only using CPU, you can size the array to the number of available processors and this is the perfect solution. If your code waits for I/O, you might need an arbitrarily large array of locks, or you need to limit the number of threads that can run the critical section. In this case another approach might be better.
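A self-contained version of the lock array might look like this. It is a sketch, not the answerer's code: the class name StripedLocks and the runLocked helper are made up for illustration, and the stripe count is taken from the number of available processors as suggested for CPU-bound work:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: a fixed array of locks, selected by item id.
class StripedLocks {
    private final Lock[] locks;

    StripedLocks(int stripes) {
        locks = new Lock[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    // The same id always maps to the same lock, no matter which Item
    // instance is currently in the cache. floorMod handles negative ids.
    Lock lockFor(int id) {
        return locks[Math.floorMod(id, locks.length)];
    }

    void runLocked(int id, Runnable task) {
        Lock lock = lockFor(id);
        lock.lock();
        try {
            task.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        int stripes = Runtime.getRuntime().availableProcessors();
        StripedLocks locks = new StripedLocks(stripes);
        locks.runLocked(42, () -> System.out.println("working on item 42"));
    }
}
```

Because the lock belongs to the id and not to the Item object, an evicted and recreated item is still guarded by the same lock. Two different ids may share a stripe, which costs some parallelism but does not affect correctness.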
Comments on a more conceptual level:
If you want to prevent the item from being evicted, you need a mechanism called pinning. Internally this is used by most cache implementations, e.g. for blocking during I/O operations. Some caches may expose a way for applications to do it.
In a JCache compatible cache, there is the concept of an EntryProcessor. The EntryProcessor allows you to run a piece of code on an entry in an atomic way. This means the cache is doing all the locking for you. Depending on the scope of the problem, this may have an advantage, since it also works in clustered scenarios, which means the locking is cluster-wide.
Another idea which comes to my mind is the vetoable eviction. This is a concept EHCache 3 is implementing. By specifying a vetoable eviction policy you can implement a pinning mechanism on your own.
I'm sure that there are multiple solutions for your issue.
I wrote down one of them, using a unique lock for each itemId:
public class LockManager {
private Map<Integer, Lock> lockMap = new ConcurrentHashMap<>();
public synchronized Lock getOrCreateLockForId(Integer itemId) {
Lock lock;
if (lockMap.containsKey(itemId)) {
System.out.println("Get lock");
lock = lockMap.get(itemId);
} else {
System.out.println("Create lock");
lock = new ReentrantLock();
lockMap.put(itemId, lock);
}
return lock;
}
public synchronized Lock getLockForId(Integer itemId) {
Lock lock;
if (lockMap.containsKey(itemId)) {
System.out.println("get lock");
return lockMap.get(itemId);
} else {
throw new IllegalStateException("First lock, then unlock");
}
}
}
So, instead of using synchronised methods in class Item, use LockManager to get the Lock by itemId and call lock.lock() once it has been retrieved.
Also note that LockManager should have singleton scope and the same instance should be shared across all usages.
Below you can see an example of using LockManager:
// Acquire the lock before entering try, so that finally only releases a lock we hold.
lockManager.getOrCreateLockForId(itemId).lock();
try {
    System.out.println("start doing something" + num);
    try {
        Thread.sleep(5000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    System.out.println("completed doing something" + num);
} finally {
    lockManager.getLockForId(itemId).unlock();
}

Better solution than while (reentrantLock.isLocked()) for waiting

I have a service bean which provides access to a Map. From time to time I need to rebuild the content of the Map, which takes several seconds, and I want to block access to the map while it's rebuilding, because it can be accessed from different threads.
@Service
public class MyService {
private Map<Key,Value> cache = null;
private ReentrantLock reentrantLock = new ReentrantLock();
public void rebuildCache(){
try {
reentrantLock.lock();
cache = new ConcurrentHashMap<>();
... //processing time consuming stuff and building up the cache
}finally {
reentrantLock.unlock();
}
}
public Value getValue(Key key){
while (reentrantLock.isLocked()){}
return cache.get(key);
}
...
}
As you can see I use
while (reentrantLock.isLocked()){}
to check if the lock is locked and wait until it's unlocked. This solution seems to be quite dirty. Is there a better solution?
Use a ReentrantReadWriteLock instead.
In your write method:
theLock.writeLock().lock();
try {
// update the map
} finally {
theLock.writeLock().unlock();
}
In the read method, use the .readLock() instead.
This has the problem, however, that all readers will be blocked during the update of the map. Another solution would be to build the updated map separately and use a plain lock, or a plain old synchronized, only to replace the reference to the old map with the new one.
More importantly though, your use of locks is incorrect. You should do:
theLock.lock();
try {
// whatever
} finally {
theLock.unlock();
}
Imagine what happens if the locking fails with your current code: you'll still try to unlock, and you'll end up with an IllegalMonitorStateException.
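The reference-swap alternative mentioned in this answer can be sketched as follows. This is one possible reading, not the answerer's code: the class name SwappingCache and the loader supplier are hypothetical, and publishing the new map through a volatile field is an assumption added here so that readers need no lock at all. Rebuilds are serialized with a plain synchronized method:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical cache that is rebuilt off to the side and then swapped in.
class SwappingCache<K, V> {
    private final Supplier<Map<K, V>> loader;
    // volatile so that readers always see the most recently published map
    private volatile Map<K, V> current = Collections.emptyMap();

    SwappingCache(Supplier<Map<K, V>> loader) {
        this.loader = loader;
    }

    // Only one rebuild at a time. Readers keep using the old map until the
    // single assignment at the end publishes the new one.
    synchronized void rebuild() {
        Map<K, V> fresh = new HashMap<>(loader.get()); // the slow part
        current = Collections.unmodifiableMap(fresh);
    }

    // Never blocks, even while a rebuild is running.
    V get(K key) {
        return current.get(key);
    }

    public static void main(String[] args) {
        SwappingCache<String, String> cache =
                new SwappingCache<>(() -> Collections.singletonMap("greeting", "hello"));
        System.out.println(cache.get("greeting")); // nothing loaded yet
        cache.rebuild();
        System.out.println(cache.get("greeting"));
    }
}
```

The trade-off is that readers may see slightly stale data during a rebuild, and both the old and the new map are held in memory for a short time.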
I would propose a ReadWriteLock.
With it, any number of threads can read at the same time, as long as the write lock is not held.
@Service
public class MyService {
    private Map<Key,Value> cache = null;
    // must be a ReentrantReadWriteLock: a plain ReentrantLock has no read/write locks
    private final ReentrantReadWriteLock reentrantLock = new ReentrantReadWriteLock();
    public void rebuildCache(){
        reentrantLock.writeLock().lock();
        try {
            cache = new ConcurrentHashMap<>();
            ... //processing time consuming stuff and building up the cache
        } finally {
            reentrantLock.writeLock().unlock();
        }
    }
    public Value getValue(Key key){
        // blocks only while a rebuild holds the write lock
        reentrantLock.readLock().lock();
        try {
            return cache.get(key);
        } finally {
            reentrantLock.readLock().unlock();
        }
    }
    ...
}

Using ReentrantReadWriteLock and a boolean flag

I have a cache that gets loaded upfront with a large amount of data (by a background thread) and is unusable until full (it will also get reloaded every so often and be unusable during that load). I want the classes that use it to check a flag isLoaded() before accesses. I use a ReentrantReadWriteLock (I omit this in the code for simplicity) for access control like this:
public class Cache {
private volatile boolean loaded = false; //starts false
private static String[] cache;
private static Lock readLock;
private static Lock writeLock;
public Object get(Object key) {
if (!readLock.tryLock()) throw new IllegalStateException(...);
try {
... do some work
} finally {
readLock.unlock();
}
}
// called by background thread
private void loadFull() {
loaded = false;
writeLock.lock();
try {
cache = new String[];
... fill cache
} finally {
writeLock.unlock();
loaded = true;
}
}
....
}
Now in my other class I have a block like this:
if (cache.isLoaded()) {
try {
Object x = cache.get(y);
} catch (IllegalStateException iex) {
// goto database for object
}
} else {
// goto database for object
}
Do I really need the try/catch? Is it ever possible that the flag will be set to false and the readLock try() will fail? Should I even bother with the flag, or just catch the Exception (since I basically run the same code if the Exception is thrown as if the flag is false)? I just feel like I am doing something slightly wrong but I can't put my finger on it. Thanks.
"Do I really need the try/catch? Is it ever possible that the flag will be set to false and the readLock try() will fail?"
Yes, you need it. Between the calls to cache.isLoaded() and cache.get(), a writer can come in and take the write lock. In that case cache.isLoaded() has already returned true, but cache.get() will throw the exception.
"Should I even bother with the flag and just catch the Exception (since I basically do the same code if the Exception is thrown as if the flag is false)."
From the code you have shown, the exception is thrown only in cases where the get fails to acquire the read lock. Acquisition of the read lock fails only if there is a concurrent writer at the time. isLoaded also returns false in precisely this scenario. So just relying on the exception would suffice. Also, consider creating a specialized CacheStaleException.
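A rough sketch of relying on the exception alone, with a specialized exception type. CacheStaleException is the name suggested above; GuardedCache, getOrFallback and the database function are hypothetical names, and the String-to-String map stands in for the asker's real data:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;
import java.util.function.Supplier;

// Signals that the cache cannot be read right now because it is reloading.
class CacheStaleException extends RuntimeException {
    CacheStaleException(String message) {
        super(message);
    }
}

// Hypothetical cache without the isLoaded() flag.
class GuardedCache {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Map<String, String> data = new HashMap<>();

    String get(String key) {
        // tryLock fails only while another thread holds the write lock
        if (!rwl.readLock().tryLock()) {
            throw new CacheStaleException("cache is being reloaded");
        }
        try {
            return data.get(key);
        } finally {
            rwl.readLock().unlock();
        }
    }

    // Called by the background thread; the source is read under the write lock.
    void reload(Supplier<Map<String, String>> source) {
        rwl.writeLock().lock();
        try {
            data.clear();
            data.putAll(source.get());
        } finally {
            rwl.writeLock().unlock();
        }
    }

    // Caller side: a single catch replaces the isLoaded() check.
    String getOrFallback(String key, Function<String, String> database) {
        try {
            return get(key);
        } catch (CacheStaleException e) {
            return database.apply(key);
        }
    }

    public static void main(String[] args) {
        GuardedCache cache = new GuardedCache();
        Map<String, String> rows = new HashMap<>();
        rows.put("k", "cached");
        cache.reload(() -> rows);
        System.out.println(cache.getOrFallback("k", key -> "from database"));
    }
}
```

With this shape the flag and its race window disappear, since the lock itself is the only source of truth about whether the cache is readable.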
The tryLock will fail if some other thread has already acquired that lock. This typically means that an exception would be thrown if a client fails to acquire a lock due to high contention (multiple clients accessing the same cache). Is there any fallback strategy you have implemented in your client layer which deals with such situations?
Also, why static locks? I think that even though your cache is typically used in the application as a singleton, there is no need to limit its usability by making Locks static.
No, but to be honest your paradigm is confusing. Presumably it is expensive to go to the actual database and that is the purpose of the cache. In the case that the cache is being reloaded, is it not better to just wait until it is?
Assuming you really do want to go to the database if the read lock is not immediately available, I would do this:
public Object get(Object key) {
Object returnValue;
if (readLock.tryLock()) {
try {
... do some work
returnValue = ...
} finally {
readLock.unlock();
}
} else {
//go to database
returnValue = ...
}
return returnValue;
}
