I have a cache that gets loaded upfront with a large amount of data (by a background thread) and is unusable until full (it will also get reloaded every so often and be unusable during that load). I want the classes that use it to check a flag, isLoaded(), before accessing it. I use a ReentrantReadWriteLock (whose creation I omit in the code for simplicity) for access control, like this:
public class Cache {
    private volatile boolean loaded = false; // starts false
    private static String[] cache;
    private static Lock readLock;
    private static Lock writeLock;

    public Object get(Object key) {
        if (!readLock.tryLock()) throw new IllegalStateException(...);
        try {
            ... do some work
        } finally {
            readLock.unlock();
        }
    }

    // called by background thread
    private void loadFull() {
        loaded = false;
        writeLock.lock();
        try {
            cache = new String[...];
            ... fill cache
        } finally {
            writeLock.unlock();
            loaded = true;
        }
    }
    ....
}
Now in my other class I have a block like this:
if (cache.isLoaded()) {
    try {
        Object x = cache.get(y);
    } catch (IllegalStateException iex) {
        // go to database for object
    }
} else {
    // go to database for object
}
Do I really need the try/catch? Is it ever possible that the flag will be set to false and the readLock tryLock() will fail? Should I even bother with the flag and just catch the exception (since I basically do the same thing if the exception is thrown as if the flag is false)? I just feel like I am doing something slightly wrong but I can't put my finger on it. Thanks.
Do I really need the try/catch? Is it ever possible that the flag will be set to false and the readLock tryLock() will fail?
Yes, you need it. Between the calls to cache.isLoaded() and cache.get(), a writer can come in and grab the write lock - in which case isLoaded() will have returned true, but get() will throw the exception.
Should I even bother with the flag and just catch the Exception (since I basically do the same code if the Exception is thrown as if the flag is false)?
From the code you have shown, the exception is thrown only when get() fails to acquire the read lock. Acquisition of the read lock fails only if there is a concurrent writer at the time, and isLoaded() also returns false in precisely this scenario. So just relying on the exception would suffice. Also, consider creating a specialized CacheStaleException instead of reusing IllegalStateException.
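If you go that route, the dedicated exception can be as simple as this (the name and the choice of RuntimeException as the base class are just suggestions):
public class CacheStaleException extends RuntimeException {
    public CacheStaleException(String message) {
        super(message);
    }
}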
tryLock() on the read lock will fail if another thread already holds a conflicting lock - for a read lock, that means a writer is holding the write lock at that moment (concurrent readers do not block each other). In other words, an exception would be thrown for every client that calls get() while a reload is in progress. Is there any fallback strategy implemented in your client layer which deals with such situations?
Also, why static locks? Even though your cache is typically used as a singleton in the application, there is no need to limit its usability by making the Locks static.
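For illustration, a sketch of the same cache with instance-level locks (field names are just placeholders; the rest of the class stays as in the question):
public class Cache {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Lock readLock = rwLock.readLock();
    private final Lock writeLock = rwLock.writeLock();
    private volatile boolean loaded = false;
    private String[] cache; // no longer static either
    ...
}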
No, but to be honest your paradigm is confusing. Presumably it is expensive to go to the actual database, and that is the purpose of the cache. In the case that the cache is being reloaded, is it not better to just wait until the reload is finished?
Assuming you really do want to go to the database if the read lock is not immediately available, I would do this:
public Object get(Object key) {
    Object returnValue;
    if (readLock.tryLock()) {
        try {
            ... do some work
            returnValue = ...
        } finally {
            readLock.unlock();
        }
    } else {
        // go to database
        returnValue = ...
    }
    return returnValue;
}
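If you would rather have readers simply wait for a reload to finish (the first suggestion above), a minimal sketch would block on the read lock instead of trying it:
public Object get(Object key) {
    readLock.lock(); // blocks while a reload holds the write lock
    try {
        ... do some work
        return ...;
    } finally {
        readLock.unlock();
    }
}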
Related
I'm looking for a recommendation on how to make this code thread-safe with locks in Java. I know there are a lot of gotchas with locks; obscure problems, race-conditions, etc that can pop up. Here is the basic idea of what I'm trying to achieve, implemented rather naïvely:
public class MultipleThreadWriter {
    boolean isUpgrading = false;
    boolean isWriting = false;

    public void writeData(String uniqueId) {
        if (isUpgrading)
            //block until isUpgrading is false
        isWriting = true;
        {
            //do write stuff
        }
        isWriting = false;
    }

    public void upgradeSystem() {
        if (isWriting)
            //block until isWriting is false
        isUpgrading = true;
        {
            //do updates
        }
        isUpgrading = false;
    }
}
The basic idea is that multiple threads are allowed to write data simultaneously. It doesn't matter, since no two threads will ever be writing to data pertaining to the same uniqueId. However, the "system upgrade" manipulates data for all uniqueIds, so it must block (wait in line) until no data is being written before it can start, at which point it blocks all writes until it is finished. (There is definitely no consumer/producer pattern going on here - upgrading occurs arbitrarily, i.e. it has no relation to the data being written.)
This sounds like a good application for a readers-writer lock.
However, in this case your "readers" are the small update tasks that can all run concurrently, and your "writer" is the system upgrade task.
There's an implementation of this in the Java standard library:
java.util.concurrent.locks.ReentrantReadWriteLock
The lock has fair and non-fair modes. If you want the system upgrade to run as soon as possible after it's scheduled, then use the fair mode of the lock. If you want the upgrade to be applied during idle time (i.e., wait until there are no small updates going on), then you can use the non-fair mode instead.
Since this is a bit of an unorthodox application of the readers-writer lock (your readers are actually writing too!), make sure to comment this well in your code. You might even consider writing a wrapper around the ReentrantReadWriteLock class that provides localUpdateLock vs globalUpdateLock methods, which delegate to the readLock and writeLock, respectively.
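As a rough sketch of that wrapper idea (class and method names are just suggestions):
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SystemUpgradeLock {
    // 'true' selects the fair mode discussed above
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock(true);

    /** Held by the small per-uniqueId writes; many of these can run concurrently. */
    public void localUpdateLock()   { rwl.readLock().lock(); }
    public void localUpdateUnlock() { rwl.readLock().unlock(); }

    /** Held by the system upgrade; excludes all local updates. */
    public void globalUpdateLock()   { rwl.writeLock().lock(); }
    public void globalUpdateUnlock() { rwl.writeLock().unlock(); }
}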
Based on the answer from @DaoWen, this is my untested solution.
public class MultipleThreadWriter {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Lock r = rwl.readLock();  // shared by the concurrent per-id writes
    private final Lock w = rwl.writeLock(); // exclusive, for the system upgrade

    public void writeData() {
        r.lock();
        try {
            //do write stuff
        } finally {
            r.unlock();
        }
    }

    public void upgradeSystem() {
        w.lock();
        try {
            //do updates
        } finally {
            w.unlock();
        }
    }
}
I'm using something like
Cache<Integer, Item> cache;
where the Items are independent of each other and look like
private static class Item {
    private final int id;
    ... some mutable data

    synchronized void doSomething() {...}
    synchronized void doSomethingElse() {...}
}
The idea is to obtain the item from the cache and call a synchronized method on it. In case of a miss, the item can be recreated, that's fine.
A problem occurs when an item gets evicted from the cache and recreated while a thread runs a synchronized method. A new thread obtains a new item and synchronizes on it... so for a single id, there are two threads inside the synchronized method. FAIL.
Is there an easy way around it? It's Guava Cache, if it helps.
I think the suggestion from Louis, using the keys for locking, is the simplest and most practical one. Here is a code snippet that, without the help of the Guava libraries, illustrates the idea:
static final Lock[] locks = new Lock[ ... ];
static { /* initialize lock array, e.g. with new ReentrantLock() instances */ }

int id;

void doSomething() {
    final Lock lock = locks[id % locks.length];
    lock.lock();
    try {
        /* protected code */
    } finally {
        lock.unlock();
    }
}
The size of the lock array limits the maximum amount of parallelism you get. If your code is only using CPU, you can initialize it with the number of available processors and this is the perfect solution. If your code waits for I/O, you might need an arbitrarily large array of locks, or you limit the number of threads that can run the critical section. In that case another approach might be better.
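For example, the elided initialization above could be sized by the number of available processors; a sketch:
static final Lock[] locks = new Lock[Runtime.getRuntime().availableProcessors()];
static {
    for (int i = 0; i < locks.length; i++) {
        locks[i] = new ReentrantLock();
    }
}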
Comments on a more conceptual level:
If you want to prevent the item from being evicted, you need a mechanism called pinning. Internally this is used by most cache implementations, e.g. for blocking during I/O operations. Some caches may expose a way for the application to do it.
In a JCache-compatible cache, there is the concept of an EntryProcessor. The EntryProcessor allows you to run a piece of code on an entry in an atomic way. This means the cache does all the locking for you. Depending on the scope of the problem, this may have an advantage, since it also works in clustered scenarios, which means the locking is cluster-wide.
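A rough sketch of what that looks like with the JCache API (javax.cache.Cache and javax.cache.processor.EntryProcessor; the Item type is from the question, and the work inside process() is a placeholder):
Cache<Integer, Item> cache = ...; // obtained from a JCache CacheManager

cache.invoke(id, new EntryProcessor<Integer, Item, Void>() {
    @Override
    public Void process(MutableEntry<Integer, Item> entry, Object... args) {
        Item item = entry.getValue();
        // ... mutate the item; the cache guarantees this runs atomically per key
        entry.setValue(item);
        return null;
    }
});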
Another idea that comes to mind is vetoable eviction. This is a concept EHCache 3 is implementing. By specifying a vetoable eviction policy, you can implement a pinning mechanism on your own.
I'm sure that there are multiple solutions to your issue.
I wrote down one of them, using a unique lock for each itemId:
public class LockManager {
    private Map<Integer, Lock> lockMap = new ConcurrentHashMap<>();

    public synchronized Lock getOrCreateLockForId(Integer itemId) {
        Lock lock;
        if (lockMap.containsKey(itemId)) {
            System.out.println("Get lock");
            lock = lockMap.get(itemId);
        } else {
            System.out.println("Create lock");
            lock = new ReentrantLock();
            lockMap.put(itemId, lock);
        }
        return lock;
    }

    public synchronized Lock getLockForId(Integer itemId) {
        if (lockMap.containsKey(itemId)) {
            System.out.println("Get lock");
            return lockMap.get(itemId);
        } else {
            throw new IllegalStateException("First lock, then unlock");
        }
    }
}
So, instead of using synchronized methods in class Item, use LockManager to get the Lock for an itemId and call lock.lock() after it has been retrieved.
Also note that LockManager should have singleton scope and the same instance should be shared across all usages.
Below you can see an example of using LockManager:
lockManager.getOrCreateLockForId(itemId).lock();
try {
    System.out.println("start doing something" + num);
    try {
        Thread.sleep(5000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    System.out.println("completed doing something" + num);
} finally {
    lockManager.getLockForId(itemId).unlock();
}
We need to lock a method responsible for loading database data into a HashMap-based cache.
A possible situation is that a second thread tries to access the method while the first method is still loading cache.
We consider the second thread's effort in this case to be superfluous. We would therefore like to have that second thread wait until the first thread is finished, and then return (without loading the cache again).
What I have works, but it seems quite inelegant. Are there better solutions?
private static final ReentrantLock cacheLock = new ReentrantLock();

private void loadCachemap() {
    if (cacheLock.tryLock()) {
        try {
            this.cachemap = retrieveParamCacheMap();
        } finally {
            cacheLock.unlock();
        }
    } else {
        try {
            cacheLock.lock(); // wait until the thread doing the load is finished
        } finally {
            try {
                cacheLock.unlock();
            } catch (IllegalMonitorStateException e) {
                logger.error("loadCachemap() finally {}", e);
            }
        }
    }
}
I prefer a more resilient approach using read locks AND write locks. Something like:
private static final ReadWriteLock cacheLock = new ReentrantReadWriteLock();
private static final Lock cacheReadLock = cacheLock.readLock();
private static final Lock cacheWriteLock = cacheLock.writeLock();

private void loadCache() throws Exception {
    // Expiry.
    while (storeCache.expired(CachePill)) {
        /**
         * Allow only one in - all others will wait for 5 seconds before checking again.
         *
         * Eventually the one that got in will finish loading, refresh the Cache pill and let all the waiting ones out.
         *
         * Also waits until all read locks have been released - not sure if that might cause problems under busy conditions.
         */
        if (cacheWriteLock.tryLock(5, TimeUnit.SECONDS)) {
            try {
                // Got a lock! Start the rebuild if still out of date.
                if (storeCache.expired(CachePill)) {
                    rebuildCache();
                }
            } finally {
                cacheWriteLock.unlock();
            }
        }
    }
}
Note that storeCache.expired(CachePill) detects a stale cache, which may be more than you are wanting, but the concept here is the same: establish a write lock before updating the cache, which will deny all read attempts until the rebuild is done. Also, manage multiple attempts at writing in a loop of some sort, or just drop out and let the read lock wait for access.
A read from the cache now looks like this:
public Object load(String storeId) throws Exception {
    Store store = null;
    // Make sure cache is fresh.
    loadCache();
    // Establish a read lock so we do not attempt a read while the cache is being updated.
    cacheReadLock.lock();
    try {
        store = storeCache.get(storeId);
    } finally {
        // Make sure the lock is cleared.
        cacheReadLock.unlock();
    }
    return store;
}
The primary benefit of this form is that read access does not block other read access but everything stops cleanly during a rebuild - even other rebuilds.
You didn't say how complicated your structure is and how much concurrency / congestion you need. There are many ways to address your need.
If your data is simple, use a ConcurrentHashMap or similar to hold your data. Then just read and write from your threads without any further coordination.
Another alternative is to use the actor model and put reads/writes on the same queue.
If all you need is to fill a read-only map which is initialized from the database once requested, you could use any form of double-checked locking, which may be implemented in a number of ways. The easiest variant would be the following:
private volatile Map<T, V> cacheMap;

public void loadCacheMap() {
    if (cacheMap == null) {
        synchronized (this) {
            if (cacheMap == null) {
                cacheMap = retrieveParamCacheMap();
            }
        }
    }
}
But I would personally prefer to avoid any form of synchronization here and just make sure that the initialization is done before any other thread can access it (for example in the form of an init method in a DI container). In this case you would even avoid the overhead of volatile.
EDIT: The answer above works only when a single initial load is expected. In case of multiple updates, you could try to replace the tryLock by some other form of test and test-and-set, for example using something like this:
private final AtomicReference<CountDownLatch> sync =
        new AtomicReference<>(new CountDownLatch(0));

private void loadCacheMap() throws InterruptedException {
    CountDownLatch oldSync = sync.get();
    if (oldSync.getCount() == 0) { // if nobody is updating now
        CountDownLatch newSync = new CountDownLatch(1);
        if (sync.compareAndSet(oldSync, newSync)) {
            cacheMap = retrieveParamCacheMap();
            newSync.countDown();
            return;
        }
    }
    sync.get().await();
}
I have a class that holds a "Card" object. This class keeps checking to see whether the object is no longer null. Only one other thread can update this object. Should I just do it like the code below? Use volatile? synchronized? A lock (which I don't really know how to use)? What do you recommend as the easiest solution?
class A {
    public Card myCard = null;

    public void keepCheck() {
        while (myCard == null) {
            Thread.sleep(100);
        }
        // value updated
        callAnotherMethod();
    }
}
Another thread has the following:
public void run(){
a.myCard = new Card(5);
}
What do you suggest?
You should use a proper wait event (see the Guarded Block tutorial), otherwise you run the risk of the "watching" thread seeing the reference before it sees completely initialized member fields of the Card. Also wait() will allow the thread to sleep instead of sucking up CPU in a tight while loop.
For example:
class A {
    private final Object cardMonitor = new Object();
    private volatile Card myCard;

    public void keepCheck() {
        synchronized (cardMonitor) {
            while (myCard == null) {
                try {
                    cardMonitor.wait();
                } catch (InterruptedException x) {
                    // either abort or ignore, your choice
                }
            }
        }
        callAnotherMethod();
    }

    public void run() {
        synchronized (cardMonitor) {
            myCard = new Card(5);
            cardMonitor.notifyAll();
        }
    }
}
I made myCard private in the above example. I do recommend avoiding lots of public fields in a case like this, as the code could end up getting messy fast.
Also note that you do not need cardMonitor -- you could use the A itself, but having a separate monitor object lets you have finer control over synchronization.
Beware, with the above implementation, if run() is called while callAnotherMethod() is executing, it will change myCard which may break callAnotherMethod() (which you do not show). Moving callAnotherMethod() inside the synchronized block is one possible solution, but you have to decide what the appropriate strategy is there given your requirements.
The variable needs to be volatile when it is modified from a different thread and you intend to poll for it, but a better solution is to use wait()/notify() or even a Semaphore to keep your other thread sleeping until the myCard variable is initialized.
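A minimal sketch of the Semaphore variant (setCard is just an illustrative name for whatever the other thread calls instead of assigning the field directly):
import java.util.concurrent.Semaphore;

class A {
    private final Semaphore cardReady = new Semaphore(0); // no permits until the card is set
    private volatile Card myCard;

    public void keepCheck() throws InterruptedException {
        cardReady.acquire(); // sleeps here until release() is called
        callAnotherMethod();
    }

    public void setCard(Card card) { // called by the other thread
        myCard = card;
        cardReady.release();
    }
}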
Looks like you have a classic producer/consumer case.
You can handle this case using wait()/notify() methods. See here for an example: How to use wait and notify in Java?
Or here, for more examples: http://www.programcreek.com/2009/02/notify-and-wait-example/
I have a common interface for a number of singleton implementations. The interface defines an initialization method which can throw a checked exception.
I need a factory which will return cached singleton implementations on demand, and I wonder if the following approach is thread-safe.
UPDATE1: Please don't suggest any 3rd party libraries, as this will require obtaining legal clearance due to possible licensing issues :-)
UPDATE2: this code will likely be used in an EJB environment, so it's preferable not to spawn additional threads or use stuff like that.
interface Singleton
{
    void init() throws SingletonException;
}

public class SingletonFactory
{
    private static ConcurrentMap<String, AtomicReference<? extends Singleton>> CACHE =
        new ConcurrentHashMap<String, AtomicReference<? extends Singleton>>();

    public static <T extends Singleton> T getSingletonInstance(Class<T> clazz)
        throws SingletonException
    {
        String key = clazz.getName();
        if (CACHE.containsKey(key))
        {
            return readEventually(key);
        }

        AtomicReference<T> ref = new AtomicReference<T>(null);
        if (CACHE.putIfAbsent(key, ref) == null)
        {
            try
            {
                T instance = clazz.newInstance();
                instance.init();
                ref.set(instance); // ----- (1) -----
                return instance;
            }
            catch (Exception e)
            {
                throw new SingletonException(e);
            }
        }

        return readEventually(key);
    }

    @SuppressWarnings("unchecked")
    private static <T extends Singleton> T readEventually(String key)
    {
        T instance = null;
        AtomicReference<T> ref = (AtomicReference<T>) CACHE.get(key);
        do
        {
            instance = ref.get(); // ----- (2) -----
        }
        while (instance == null);
        return instance;
    }
}
I'm not entirely sure about lines (1) and (2). I know that the referenced object is declared as a volatile field in AtomicReference, and hence changes made at line (1) should become immediately visible at line (2) - but I still have some doubts...
Other than that - I think the use of ConcurrentHashMap addresses the atomicity of putting a new key into the cache.
Do you guys see any concerns with this approach? Thanks!
P.S.: I know about the static holder class idiom - and I don't use it due to ExceptionInInitializerError (which any exception thrown during singleton instantiation is wrapped into) and the subsequent NoClassDefFoundError, which are not something I want to catch. Instead, I'd like to leverage the advantage of a dedicated checked exception by catching it and handling it gracefully rather than parsing the stack trace of EIIE or NCDFE.
You have gone to a lot of work to avoid synchronization, and I assume the reason for doing this is for performance concerns. Have you tested to see if this actually improves performance vs a synchronized solution?
The reason I ask is that the Concurrent classes tend to be slower than the non-concurrent ones, not to mention the additional level of redirection with the atomic reference. Depending on your thread contention, a naive synchronized solution may actually be faster (and easier to verify for correctness).
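For comparison, a minimal sketch of such a naive synchronized factory (same Singleton interface and SingletonException as in the question):
import java.util.HashMap;
import java.util.Map;

public class SynchronizedSingletonFactory
{
    private static final Map<String, Singleton> CACHE = new HashMap<String, Singleton>();

    public static synchronized <T extends Singleton> T getSingletonInstance(Class<T> clazz)
        throws SingletonException
    {
        Singleton instance = CACHE.get(clazz.getName());
        if (instance == null)
        {
            try
            {
                instance = clazz.newInstance();
                instance.init();
            }
            catch (Exception e)
            {
                throw new SingletonException(e);
            }
            CACHE.put(clazz.getName(), instance);
        }
        return clazz.cast(instance);
    }
}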
Additionally, I think that you can possibly end up with an infinite loop when a SingletonException is thrown during a call to instance.init(). The reason being that a concurrent thread waiting in readEventually will never end up finding its instance (since an exception was thrown while another thread was initializing the instance). Maybe this is the correct behaviour for your case, or maybe you want to set some special value to the instance to trigger an exception to be thrown to the waiting thread.
Having all of these concurrent/atomic things would cause more lock issues than just putting
synchronized(clazz){}
blocks around the getter. Atomic references are for references that are UPDATED and where you don't want collisions. Here you have a single writer, so you do not care about that.
You could optimize it further by having a hashmap, and only if there is a miss, use the synchronized block:
public static <T> T get(Class<T> cls) throws Exception {
    // No lock try
    T ref = cache.get(cls);
    if (ref != null) {
        return ref;
    }
    // Miss, so use create lock
    synchronized (cls) {       // singletons are double created otherwise
        synchronized (cache) { // Prevent table rebuild/transfer contentions -- RARE
            // Double check create if lock backed up
            ref = cache.get(cls);
            if (ref == null) {
                ref = cls.newInstance();
                cache.put(cls, ref);
            }
            return ref;
        }
    }
}
Consider using Guava's CacheBuilder. For example:
private static final LoadingCache<Class<? extends Singleton>, Singleton> singletons =
    CacheBuilder.newBuilder().build(
        new CacheLoader<Class<? extends Singleton>, Singleton>() {
            public Singleton load(Class<? extends Singleton> key) throws SingletonException {
                try {
                    Singleton singleton = key.newInstance();
                    singleton.init();
                    return singleton;
                }
                catch (SingletonException se) {
                    throw se;
                }
                catch (Exception e) {
                    throw new SingletonException(e);
                }
            }
        });

public static <T extends Singleton> T getSingletonInstance(Class<T> clazz)
    throws SingletonException {
    try {
        return clazz.cast(singletons.get(clazz));
    }
    catch (ExecutionException e) {
        // the loader's SingletonException comes back wrapped in an ExecutionException
        throw new SingletonException(e.getCause());
    }
}
Note: this example is untested and uncompiled.
Guava's underlying Cache implementation will handle all caching and concurrency logic for you.
This looks like it would work, although I might consider some sort of sleep, even a nanosecond or so, when testing for the reference to be set. The spin test loop is going to be extremely expensive.
Also, I would consider improving the code by passing the AtomicReference to readEventually() so you can avoid the containsKey() and then putIfAbsent() race condition. So the code would be:
AtomicReference<T> ref = (AtomicReference<T>) CACHE.get(key);
if (ref != null) {
    return readEventually(ref);
}

AtomicReference<T> newRef = new AtomicReference<T>(null);
AtomicReference<T> oldRef = (AtomicReference<T>) CACHE.putIfAbsent(key, newRef);
if (oldRef != null) {
    return readEventually(oldRef);
}
...
The code is not generally thread safe because there is a gap between the CACHE.containsKey(key) check and the CACHE.putIfAbsent(key, ref) call. It is possible for two threads to call simultaneously into the method (especially on multi-core/processor systems) and both perform the containsKey() check, then both attempt to do the put and creation operations.
I would protect the execution of the getSingletonInstance() method using either a lock or by synchronizing on a monitor of some sort.
google "Memoizer". basically, instead of AtomicReference, use Future.