(Please note that I cannot use external libraries for the cache.)
--- Maybe it can be done using the stream API? ---
I need to implement a cache. It has one key property:
If the cache is asked for a key that it doesn't contain, it should fetch the data using an externally provided function that reads the data from another source (a database or similar).
I've started with some basic skeleton code:
public interface ICache<K,V> {
}
interface IDataSource<K,V> {
void put(K key, V value);
V get(K key);
}
public class Cache<K,V> implements ICache<K,V> {
Map<K,V> cache = new HashMap<>();
IDataSource<K,V> dataSource;
public Cache(IDataSource<K,V> dataSrc) {
dataSource = dataSrc;
}
// Could this return a Future instead? How can that be done?
public V getAsync(K key) {
if (cache.containsKey(key)) {
return cache.get(key);
}
else {
//do some async op
}
}
}
Can you advise?
Do you think it needs more features?
In reality, what you are writing is a lazy evaluator. You provide a Supplier for the value without computing it up front. The moment someone asks for the value, you compute and return it, memoizing (caching) it for future use.
Have a look at Vavr's Lazy class, it is doing precisely this (but for a single value). You can take some ideas from what that is doing, and also some extra utility methods like checking if it was already computed.
https://github.com/vavr-io/vavr/blob/master/vavr/src/main/java/io/vavr/Lazy.java
Another option is to simply use ConcurrentHashMap. It provides methods to safely (atomically) update values if they are not in the map.
If you want it to be asynchronous you need to introduce some ExecutorService or use CompletableFuture (with your own ExecutorService or the default thread pool that is used by parallel streams etc.). For example:
public class Cache<K,V> implements ICache<K,V> {
Map<K,V> cache = new ConcurrentHashMap<>();
IDataSource<K,V> dataSource;
public Cache(IDataSource<K,V> dataSrc) {
dataSource = dataSrc;
}
// async non-blocking call
public CompletableFuture<V> getAsync(K key) {
return CompletableFuture.supplyAsync(() -> get(key));
}
// blocking call
public V get(K key) {
//computeIfAbsent is atomic and threadsafe, in case multiple CompletableFutures try this in parallel
return cache.computeIfAbsent(key, (k) -> dataSource.get(k));
}
}
If you also wanted to have an async direct cache and datasource update you could do something like:
public CompletableFuture<Void> putAsync(K key, V value) {
return CompletableFuture.runAsync(() -> {
synchronized (cache) {
dataSource.put(key, value);
cache.put(key, value);
}
});
}
Although honestly I would avoid having 2 entrypoints to update the dataSource (the cache and the dataSource directly). Also it is difficult to make this completely thread safe without having the synchronized (which blocks concurrent cache puts completely from happening in parallel, even if the key is different).
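As a variation on the getAsync approach above, the map can hold the futures themselves, so that concurrent callers asking for the same missing key share a single load and nobody blocks inside the map. This is only a sketch under assumptions: FutureCache is a hypothetical name, and the loader function stands in for IDataSource.get.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical class; "loader" plays the role of IDataSource.get from the question.
class FutureCache<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    FutureCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns immediately. computeIfAbsent only creates and stores the future;
    // the slow load itself runs on the default pool, outside the map operation.
    CompletableFuture<V> getAsync(K key) {
        return cache.computeIfAbsent(key,
                k -> CompletableFuture.supplyAsync(() -> loader.apply(k)));
    }
}
```

Note that in this sketch a failed load stays in the map as a failed future; evicting it so that a later call can retry is left out for brevity.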
I have the concurrent hashmap in some service class:
class MyClass implements Flushable {
private volatile ConcurrentHashMap<Integer, Object> hashMap = ...
public void add(int id, Object value) {
hashMap.put(id, value);
}
@Override
public void flush() throws IOException {
hashMap.forEach((k, v) -> ...)
hashMap.clear();
}
}
Do I need to do some additional locking to be sure that:
1. flush will process all map entries (what if add is invoked between foreach and clear?)
2. clear will not remove entries which were inserted/updated after foreach
From javadoc, there is a guarantee that update happens before read. So as far as I understand clear will block put invocations, however to reach what I want I need some additional locks.
In your case you do need extra locking here: both operations should be locked. The clear method also does its locking only at the segment level.
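One possible alternative to an extra lock, sketched here under assumptions, is to drain the map entry by entry instead of calling clear(). DrainingBuffer and the sink parameter are hypothetical names; the sink stands in for the elided body of flush.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical stand-in for MyClass from the question.
class DrainingBuffer {
    private final ConcurrentHashMap<Integer, Object> map = new ConcurrentHashMap<>();

    void add(int id, Object value) {
        map.put(id, value);
    }

    // Removes entries one at a time. A value is passed to the sink only if this
    // call actually removed it, so an entry added or updated concurrently is
    // either flushed now or stays in the map for the next flush; it is never
    // silently discarded the way clear() could discard it.
    void flush(BiConsumer<Integer, Object> sink) {
        for (Integer key : map.keySet()) {
            Object value = map.remove(key);
            if (value != null) {
                sink.accept(key, value);
            }
        }
    }

    int size() {
        return map.size();
    }
}
```

This does not make flush a snapshot of a single instant; it only guarantees that nothing is removed without being processed.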
I have a service bean which provides access to a Map. From time to time I need to rebuild the content of the Map, which takes several seconds, and I want to block access to the map while it's rebuilding, because it can be accessed from different threads.
@Service
public class MyService {
private Map<Key,Value> cache = null;
private ReentrantLock reentrantLock = new ReentrantLock();
public void rebuildCache(){
try {
reentrantLock.lock();
cache = new ConcurrentHashMap<>();
... //processing time consuming stuff and building up the cache
}finally {
reentrantLock.unlock();
}
}
public Value getValue(Key key){
while (reentrantLock.isLocked()){}
return cache.get(key);
}
...
}
As you can see I use
while (reentrantLock.isLocked()){}
to check if the lock is locked and wait until it's unlocked. This solution seems to be quite dirty. Is there a better solution?
Use a ReentrantReadWriteLock instead.
In your write method:
theLock.writeLock().lock();
try {
// update the map
} finally {
theLock.writeLock().unlock();
}
In the read method, use the .readLock() instead.
This has the problem, however, that all readers will be blocked while the map is being updated. Another solution would be to build a new, updated map and then replace the reference to the old map with it, guarding only that replacement with a plain lock or a plain old synchronized.
More importantly though, your use of locks is incorrect. You should do:
theLock.lock();
try {
// whatever
} finally {
theLock.unlock();
}
Imagine what happens if the locking fails with your current code: you'll always try to unlock, and you'll end up with an IllegalMonitorStateException.
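The reference-replacement alternative mentioned above could look roughly like the following. It is a sketch, not a drop-in fix: SwappingCacheService is a hypothetical name, String keys and values stand in for Key and Value, and the loader parameter represents the time-consuming rebuild. Unlike the read/write lock, readers here are never blocked; they keep seeing the old map until the new one is published.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical service; String stands in for the question's Key and Value types.
class SwappingCacheService {
    // volatile: a reader sees either the old map or the completely built new one
    private volatile Map<String, String> cache = new HashMap<>();

    // synchronized only keeps two rebuilds from overlapping; readers never take this lock
    synchronized void rebuildCache(Supplier<Map<String, String>> loader) {
        // The slow part happens on a private copy that no reader can see yet.
        Map<String, String> fresh = new HashMap<>(loader.get());
        // Publishing is a single reference write; the map is never modified afterwards.
        cache = fresh;
    }

    String getValue(String key) {
        return cache.get(key);
    }
}
```

Whether serving slightly stale data during a rebuild is acceptable depends on the use case; if readers really must wait, the read/write lock is the better fit.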
I would propose a ReadWriteLock.
With it you can read as many times as you want, as long as the write lock is not held.
@Service
public class MyService {
private Map<Key,Value> cache = null;
private final ReentrantReadWriteLock reentrantLock = new ReentrantReadWriteLock();
public void rebuildCache(){
// acquire the lock before the try block, so unlock() only runs if lock() succeeded
reentrantLock.writeLock().lock();
try {
cache = new ConcurrentHashMap<>();
... //processing time consuming stuff and building up the cache
}finally {
reentrantLock.writeLock().unlock();
}
}
public Value getValue(Key key){
// many readers may hold the read lock at once; they wait only while a writer holds the write lock
reentrantLock.readLock().lock();
try {
return cache.get(key);
}finally{
reentrantLock.readLock().unlock();
}
}
...
}
I want to synchronize method calls on the basis of some id, i.e. something like a concurrency Decorator of a given object instance.
For example:
All threads which call the method with param "id1", should execute serially to one another.
All of the rest, which call the method with different argument, say "id2", should execute in parallel to the threads which call the method with param "id1", but again serially to each other.
So in my mind this can be implemented by having a lock (http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReentrantLock.html) instance per such method param.
Each time the method is called with the param, the lock instance corresponding to the specific param value (e.g. "id1") would be looked up and the current thread would try to obtain the lock.
Speaking in code:
public class ConcurrentPolicyWrapperImpl implements Foo {
private Foo delegate;
/**
* Holds the monitor objects used for synchronization.
*/
private Map<String, Lock> concurrentPolicyMap = Collections.synchronizedMap(new HashMap<String, Lock>());
/**
* Here we decorate the call to the wrapped instance with a synchronization policy.
*/
@Override
public Object callFooDelegateMethod (String id) {
Lock lock = getLock(id);
lock.lock();
try {
return delegate.delegateMethod(id);
} finally {
lock.unlock();
}
}
protected Lock getLock(String id) {
Lock lock = concurrentPolicyMap.get(id);
if (lock == null) {
lock = createLock();
concurrentPolicyMap.put(id, lock);
}
return lock;
}
protected Lock createLock() {
return new ReentrantLock();
}
}
It seems that this works - I did some performance testing with jmeter and so on.
Still, as we all know concurrency in Java is a tricky thing, I decided to ask for your opinion here.
I can't stop thinking that there could be a better way to accomplish this. For example by using one of the BlockingQueue implementations. What do you think?
I also can't really decide for sure if there is a potential synchronization problem with getting the lock i.e. the protected Lock getLock(String id) method. I am using a synchronized collection, but is that enough? I.e. shouldn't it be something like the following instead of what I currently have:
protected Lock getLock(String id) {
synchronized(concurrentPolicyMap) {
Lock lock = concurrentPolicyMap.get(id);
if (lock == null) {
lock = createLock();
concurrentPolicyMap.put(id, lock);
}
return lock;
}
}
So what do you guys think?
Lock creation issues aside, the pattern is OK except that you may have an unbounded number of locks. Generally people avoid this by creating/using a Striped lock. There is a good/simple implementation in the guava library.
Application area of lock-striping
How to acquire a lock by a key
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Striped.html
Example code using guava implementation:
private static final Striped<Lock> STRIPED_LOCK = Striped.lock(64);
public static void doActualWork(int id) throws InterruptedException {
// look the lock up once and acquire it before the try block
Lock lock = STRIPED_LOCK.get(id);
lock.lock();
try {
...
} finally {
lock.unlock();
}
}
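On the lock-creation question itself: a get followed by a put on a synchronized map is a check-then-act sequence, so two threads can each create a different lock for the same id. Without Guava, one way to close that gap is ConcurrentHashMap.computeIfAbsent. The following is a minimal sketch; KeyedLocks and callLocked are hypothetical names, and the map still grows without bound, which is the problem striping addresses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper, not one of the question's classes.
class KeyedLocks {
    private final ConcurrentMap<String, Lock> locks = new ConcurrentHashMap<>();

    // computeIfAbsent is atomic per key: every caller with the same id gets the same lock.
    Lock getLock(String id) {
        return locks.computeIfAbsent(id, k -> new ReentrantLock());
    }

    // Runs the action while holding the lock for this id; calls with other ids are not blocked.
    <T> T callLocked(String id, Supplier<T> action) {
        Lock lock = getLock(id);
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}
```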
Though I would personally prefer Guava's Striped<Lock> approach suggested by Keith, just for discussion & completeness, I'd like to point out that using a Dynamic Proxy, or the more generic AOP (Aspect Oriented Programming), is one approach.
So we would define an IStripedConcurrencyAware interface that would serve as the "something like a concurrency Decorator" that you desire, and the Dynamic Proxy / AOP method hijacking based on this interface would de-multiplex the method call into the appropriate Executor / Thread.
I personally dislike AOP (or most of Spring, for that matter) because it breaks the what-you-see-is-what-you-get simplicity of Core Java, but YMMV.
After looking at this question, I think I want to wrap ThreadLocal to add a reset behavior.
I want to have something similar to a ThreadLocal, with a method I can call from any thread to set all the values back to the same value. So far I have this:
public class ThreadLocalFlag {
private ThreadLocal<Boolean> flag;
private List<Boolean> allValues = new ArrayList<Boolean>();
public ThreadLocalFlag() {
flag = new ThreadLocal<Boolean>() {
@Override protected Boolean initialValue() {
Boolean value = false;
allValues.add(value);
return value;
}
};
}
public boolean get() {
return flag.get();
}
public void set(Boolean value) {
flag.set(value);
}
public void setAll(Boolean value) {
for (Boolean tlValue : allValues) {
tlValue = value;
}
}
}
I'm worried that the autoboxing of the primitive may mean the copies I've stored in the list will not reference the same variables referenced by the ThreadLocal if I try to set them. I've not yet tested this code, and with something tricky like this I'm looking for some expert advice before I continue down this path.
Someone will ask "Why are you doing this?". I'm working in a framework where there are other threads that callback into my code, and I don't have references to them. Periodically I want to update the value in a ThreadLocal variable they use, so performing that update requires that the thread which uses the variable do the updating. I just need a way to notify all these threads that their ThreadLocal variable is stale.
I'm flattered that there is new criticism recently regarding this three year old question, though I feel the tone of it is a little less than professional. The solution I provided has worked without incident in production during that time. However, there are bound to be better ways to achieve the goal that prompted this question, and I invite the critics to supply an answer that is clearly better. To that end, I will try to be more clear about the problem I was trying to solve.
As I mentioned earlier, I was using a framework where multiple threads are using my code, outside my control. That framework was QuickFIX/J, and I was implementing the Application interface. That interface defines hooks for handling FIX messages, and in my usage the framework was configured to be multithreaded, so that each FIX connection to the application could be handled simultaneously.
However, the QuickFIX/J framework only uses a single instance of my implementation of that interface for all the threads. I'm not in control of how the threads get started, and each is servicing a different connection with different configuration details and other state. It was natural to let some of that state, which is frequently accessed but seldom updated, live in various ThreadLocals that load their initial value once the framework has started the thread.
Elsewhere in the organization, we had library code to allow us to register for callbacks for notification of configuration details that change at runtime. I wanted to register for that callback, and when I received it, I wanted to let all the threads know that it's time to reload the values of those ThreadLocals, as they may have changed. That callback comes from a thread I don't control, just like the QuickFIX/J threads.
My solution below uses ThreadLocalFlag (a wrapped ThreadLocal<AtomicBoolean>) solely to signal the other threads that it may be time to update their values. The callback calls setAll(true), and the QuickFIX/J threads call set(false) when they begin their update. I have downplayed the concurrency issues of the ArrayList because the only time the list is added to is during startup, and my use case was smaller than the default size of the list.
I imagine the same task could be done with other interthread communication techniques, but for what it's doing, this seemed more practical. I welcome other solutions.
Interacting with objects in a ThreadLocal across threads
I'll say up front that this is a bad idea. ThreadLocal is a special class which offers speed and thread-safety benefits if used correctly. Attempting to communicate across threads with a ThreadLocal defeats the purpose of using the class in the first place.
If you need access to an object across multiple threads there are tools designed for this purpose, notably the thread-safe collections in java.util.concurrent such as ConcurrentHashMap, which you can use to replicate a ThreadLocal by using Thread objects as keys, like so:
ConcurrentHashMap<Thread, AtomicBoolean> map = new ConcurrentHashMap<>();
// pass map to threads, let them do work, using Thread.currentThread() as the key
// Update all known thread's flags
for(AtomicBoolean b : map.values()) {
b.set(true);
}
Clearer, more concise, and avoids using ThreadLocal in a way it's simply not designed for.
Notifying threads that their data is stale
I just need a way to notify all these threads that their ThreadLocal variable is stale.
If your goal is simply to notify other threads that something has changed you don't need a ThreadLocal at all. Simply use a single AtomicBoolean and share it with all your tasks, just like you would your ThreadLocal<AtomicBoolean>. As the name implies updates to an AtomicBoolean are atomic and visible cross-threads. Even better would be to use a real synchronization aid such as CyclicBarrier or Phaser, but for simple use cases there's no harm in just using an AtomicBoolean.
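When several threads each need to notice the change independently, a single boolean is awkward to reset, so a shared version counter may be easier. The sketch below is a swapped-in variation on the shared-AtomicBoolean idea, not the code from this answer; ReloadableThreadLocal, invalidateAll and the loader parameter are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical helper: each thread caches its own value and reloads it
// the next time it calls get() after the shared version has advanced.
class ReloadableThreadLocal<T> {
    private static final class Holder<V> {
        long seenVersion = -1; // guarantees a load on the first get()
        V value;
    }

    private final AtomicLong version = new AtomicLong();
    private final Supplier<T> loader;
    private final ThreadLocal<Holder<T>> local = ThreadLocal.withInitial(Holder::new);

    ReloadableThreadLocal(Supplier<T> loader) {
        this.loader = loader;
    }

    // Called by the owning thread; only that thread ever touches its Holder.
    T get() {
        Holder<T> h = local.get();
        long current = version.get(); // read before loading, so a later bump forces another reload
        if (h.seenVersion != current) {
            h.value = loader.get();
            h.seenVersion = current;
        }
        return h.value;
    }

    // May be called from any thread, for example from a configuration callback.
    void invalidateAll() {
        version.incrementAndGet();
    }
}
```

The only state shared between threads is the AtomicLong, so no thread reaches into another thread's ThreadLocal value.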
Creating an updatable "ThreadLocal"
All of that said, if you really want to implement a globally update-able ThreadLocal your implementation is broken. The fact that you haven't run into issues with it is only a coincidence and future refactoring may well introduce hard-to-diagnose bugs or crashes. That it "has worked without incident" only means your tests are incomplete.
First and foremost, an ArrayList is not thread-safe. You simply cannot use it (without external synchronization) when multiple threads may interact with it, even if they will do so at different times. That you aren't seeing any issues now is just a coincidence.
Storing the objects as a List prevents us from removing stale values. If you call ThreadLocal.set() it will append to your list without removing the previous value, which introduces both a memory leak and the potential for unexpected side-effects if you anticipated these objects becoming unreachable once the thread terminated, as is usually the case with ThreadLocal instances. Your use case avoids this issue by coincidence, but there's still no need to use a List.
Here is an implementation of an IterableThreadLocal which safely stores and updates all existing instances of the ThreadLocal's values, and works for any type you choose to use:
import java.util.Iterator;
import java.util.concurrent.ConcurrentMap;
import com.google.common.collect.MapMaker;
/**
* Class extends ThreadLocal to enable user to iterate over all objects
* held by the ThreadLocal instance. Note that this is inherently not
* thread-safe, and violates both the contract of ThreadLocal and much
* of the benefit of using a ThreadLocal object. This class incurs all
* the overhead of a ConcurrentHashMap, perhaps you would prefer to
* simply use a ConcurrentHashMap directly instead?
*
* If you do really want to use this class, be wary of its iterator.
* While it is as threadsafe as ConcurrentHashMap's iterator, it cannot
* guarantee that all existing objects in the ThreadLocal are available
* to the iterator, and it cannot prevent you from doing dangerous
* things with the returned values. If the returned values are not
* properly thread-safe, you will introduce issues.
*/
public class IterableThreadLocal<T> extends ThreadLocal<T>
implements Iterable<T> {
private final ConcurrentMap<Thread,T> map;
public IterableThreadLocal() {
map = new MapMaker().weakKeys().makeMap();
}
@Override
public T get() {
T val = super.get();
map.putIfAbsent(Thread.currentThread(), val);
return val;
}
@Override
public void set(T value) {
map.put(Thread.currentThread(), value);
super.set(value);
}
/**
* Note that this method fundamentally violates the contract of
* ThreadLocal, and exposes all objects to the calling thread.
* Use with extreme caution, and preferably only when you know
* no other threads will be modifying / using their ThreadLocal
* references anymore.
*/
@Override
public Iterator<T> iterator() {
return map.values().iterator();
}
}
As you can hopefully see this is little more than a wrapper around a ConcurrentHashMap, and incurs all the same overhead as using one directly, but hidden in the implementation of a ThreadLocal, which users generally expect to be fast and thread-safe. I implemented it for demonstration purposes, but I really cannot recommend using it in any setting.
It won't be a good idea to do that since the whole point of thread local storage is, well, thread locality of the value it contains - i.e. that you can be sure that no other thread than your own thread can touch the value. If other threads could touch your thread local value, it won't be "thread local" anymore and that will break the memory model contract of thread local storage.
Either you have to use something other than ThreadLocal (e.g. a ConcurrentHashMap) to store the value, or you need to find a way to schedule an update on the threads in question.
You could use google guava's map maker to create a static final ConcurrentWeakReferenceIdentityHashmap with the following type: Map<Thread, Map<String, Object>> where the second map is a ConcurrentHashMap. That way you'd be pretty close to ThreadLocal except that you can iterate through the map.
I'm disappointed in the quality of the answers received for this question; I have found my own solution.
I wrote my test case today, and found the only issue with the code in my question is the Boolean. Boolean is not mutable, so my list of references wasn't doing me any good. I had a look at this question, and changed my code to use AtomicBoolean, and now everything works as expected.
public class ThreadLocalFlag {
private ThreadLocal<AtomicBoolean> flag;
private List<AtomicBoolean> allValues = new ArrayList<AtomicBoolean>();
public ThreadLocalFlag() {
flag = new ThreadLocal<AtomicBoolean>() {
@Override protected AtomicBoolean initialValue() {
AtomicBoolean value = new AtomicBoolean();
allValues.add(value);
return value;
}
};
}
public boolean get() {
return flag.get().get();
}
public void set(boolean value) {
flag.get().set(value);
}
public void setAll(boolean value) {
for (AtomicBoolean tlValue : allValues) {
tlValue.set(value);
}
}
}
Test case:
public class ThreadLocalFlagTest {
private static ThreadLocalFlag flag = new ThreadLocalFlag();
private static boolean runThread = true;
@AfterClass
public static void tearDownOnce() throws Exception {
runThread = false;
flag = null;
}
/**
* @throws Exception if there is any issue with the test
*/
@Test
public void testSetAll() throws Exception {
startThread("ThreadLocalFlagTest-1", false);
try {
Thread.sleep(1000L);
} catch (InterruptedException e) {
//ignore
}
startThread("ThreadLocalFlagTest-2", true);
try {
Thread.sleep(1000L);
} catch (InterruptedException e) {
//ignore
}
startThread("ThreadLocalFlagTest-3", false);
try {
Thread.sleep(1000L);
} catch (InterruptedException e) {
//ignore
}
startThread("ThreadLocalFlagTest-4", true);
try {
Thread.sleep(8000L); //watch the alternating values
} catch (InterruptedException e) {
//ignore
}
flag.setAll(true);
try {
Thread.sleep(8000L); //watch the true values
} catch (InterruptedException e) {
//ignore
}
flag.setAll(false);
try {
Thread.sleep(8000L); //watch the false values
} catch (InterruptedException e) {
//ignore
}
}
private void startThread(String name, boolean value) {
Thread t = new Thread(new RunnableCode(value));
t.setName(name);
t.start();
}
class RunnableCode implements Runnable {
private boolean initialValue;
RunnableCode(boolean value) {
initialValue = value;
}
@Override
public void run() {
flag.set(initialValue);
while (runThread) {
System.out.println(Thread.currentThread().getName() + ": " + flag.get());
try {
Thread.sleep(4000L);
} catch (InterruptedException e) {
//ignore
}
}
}
}
}
I have a common interface for a number of singleton implementations. Interface defines initialization method which can throw checked exception.
I need a factory which will return cached singleton implementations on demand, and wonder if following approach is thread-safe?
UPDATE1: Please don't suggest any 3rd party libraries, as using them would require obtaining legal clearance due to possible licensing issues :-)
UPDATE2: this code is likely to be used in an EJB environment, so it's preferable not to spawn additional threads or use stuff like that.
interface Singleton
{
void init() throws SingletonException;
}
public class SingletonFactory
{
private static ConcurrentMap<String, AtomicReference<? extends Singleton>> CACHE =
new ConcurrentHashMap<String, AtomicReference<? extends Singleton>>();
public static <T extends Singleton> T getSingletonInstance(Class<T> clazz)
throws SingletonException
{
String key = clazz.getName();
if (CACHE.containsKey(key))
{
return readEventually(key);
}
AtomicReference<T> ref = new AtomicReference<T>(null);
if (CACHE.putIfAbsent(key, ref) == null)
{
try
{
T instance = clazz.newInstance();
instance.init();
ref.set(instance); // ----- (1) -----
return instance;
}
catch (Exception e)
{
throw new SingletonException(e);
}
}
return readEventually(key);
}
@SuppressWarnings("unchecked")
private static <T extends Singleton> T readEventually(String key)
{
T instance = null;
AtomicReference<T> ref = (AtomicReference<T>) CACHE.get(key);
do
{
instance = ref.get(); // ----- (2) -----
}
while (instance == null);
return instance;
}
}
I'm not entirely sure about lines (1) and (2). I know that referenced object is declared as volatile field in AtomicReference, and hence changes made at line (1) should become immediately visible at line (2) - but still have some doubts...
Other than that - I think use of ConcurrentHashMap addresses atomicity of putting new key into a cache.
Do you guys see any concerns with this approach? Thanks!
P.S.: I know about the static holder class idiom, and I don't use it because of ExceptionInInitializerError (which any exception thrown during singleton instantiation is wrapped into) and the subsequent NoClassDefFoundError, which are not something I want to catch. Instead, I'd like to take advantage of a dedicated checked exception by catching it and handling it gracefully, rather than parsing the stack trace of an EIIE or NCDFE.
You have gone to a lot of work to avoid synchronization, and I assume the reason for doing this is for performance concerns. Have you tested to see if this actually improves performance vs a synchronized solution?
The reason I ask is that the Concurrent classes tend to be slower than the non-concurrent ones, not to mention the additional level of redirection with the atomic reference. Depending on your thread contention, a naive synchronized solution may actually be faster (and easier to verify for correctness).
Additionally, I think that you can possibly end up with an infinite loop when a SingletonException is thrown during a call to instance.init(). The reason being that a concurrent thread waiting in readEventually will never end up finding its instance (since an exception was thrown while another thread was initializing the instance). Maybe this is the correct behaviour for your case, or maybe you want to set some special value to the instance to trigger an exception to be thrown to the waiting thread.
Having all of these concurrent/atomic things would cause more lock issues than just putting
synchronized(clazz){}
blocks around the getter. Atomic references are for references that are UPDATED and you don't want collision. Here you have a single writer, so you do not care about that.
You could optimize it further by having a hashmap, and only if there is a miss, use the synchronized block:
public static <T> T get(Class<T> cls){
// No lock try
T ref = cache.get(cls);
if(ref != null){
return ref;
}
// Miss, so use create lock
synchronized(cls){ // singletons are double created
synchronized(cache){ // Prevent table rebuild/transfer contentions -- RARE
// Double check create if lock backed up
ref = cache.get(cls);
if(ref == null){
ref = cls.newInstance();
cache.put(cls,ref);
}
return ref;
}
}
}
Consider using Guava's CacheBuilder. For example:
private static LoadingCache<Class<? extends Singleton>, Singleton> singletons = CacheBuilder.newBuilder()
.build(
new CacheLoader<Class<? extends Singleton>, Singleton>() {
public Singleton load(Class<? extends Singleton> key) throws SingletonException {
try {
Singleton singleton = key.newInstance();
singleton.init();
return singleton;
}
catch (SingletonException se) {
throw se;
}
catch (Exception e) {
throw new SingletonException(e);
}
}
});
public static <T extends Singleton> T getSingletonInstance(Class<T> clazz) throws SingletonException {
try {
// LoadingCache.get wraps a checked exception thrown by load() in an ExecutionException
return clazz.cast(singletons.get(clazz));
}
catch (ExecutionException e) {
throw new SingletonException(e);
}
}
Note: this example is untested and uncompiled.
Guava's underlying Cache implementation will handle all caching and concurrency logic for you.
This looks like it would work, although I would consider adding some sort of sleep, even if only for a nanosecond or so, while waiting for the reference to be set. The spin loop is going to be extremely expensive.
Also, I would consider improving the code by passing the AtomicReference to readEventually() so you can avoid the containsKey() and then putIfAbsent() race condition. So the code would be:
AtomicReference<T> ref = (AtomicReference<T>) CACHE.get(key);
if (ref != null) {
return readEventually(ref);
}
AtomicReference<T> newRef = new AtomicReference<T>(null);
AtomicReference<T> oldRef = CACHE.putIfAbsent(key, newRef);
if (oldRef != null) {
return readEventually(oldRef);
}
...
The code is not generally thread safe because there is a gap between the CACHE.containsKey(key) check and the CACHE.putIfAbsent(key, ref) call. It is possible for two threads to call simultaneously into the method (especially on multi-core/processor systems) and both perform the containsKey() check, then both attempt to do the put and creation operations.
I would protect the execution of the getSingletonInstance() method using either a lock or by synchronizing on a monitor of some sort.
Google "Memoizer". Basically, instead of AtomicReference, use Future.
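A rough sketch of that idea with only JDK classes follows. The Memoizer class below is a hypothetical, simplified version written for illustration: the factory Callable stands in for newInstance() plus init(), and the task runs in the calling thread, so no extra thread is spawned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical generic memoizer; not taken from any library.
class Memoizer<K, V> {
    private final ConcurrentMap<K, FutureTask<V>> cache = new ConcurrentHashMap<>();

    V get(K key, Callable<V> factory) throws Exception {
        FutureTask<V> task = cache.get(key);
        if (task == null) {
            FutureTask<V> created = new FutureTask<>(factory);
            task = cache.putIfAbsent(key, created);
            if (task == null) {
                // This thread won the race, so it runs the factory itself.
                task = created;
                created.run();
            }
        }
        try {
            // Losing threads block here until the winner finishes; there is no spinning.
            return task.get();
        } catch (ExecutionException e) {
            // A failed initialisation is removed so that a later call can retry,
            // and waiting threads receive the failure instead of looping forever.
            cache.remove(key, task);
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }
}
```

In the singleton factory, the key would be the class name and the caller would translate the rethrown exception into SingletonException.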