I'm dealing with some third-party library code that involves creating expensive objects and caching them in a Map. The existing implementation is something like
lock.lock();
try {
    Foo result = cache.get(key);
    if (result == null) {
        result = createFooExpensively(key);
        cache.put(key, result);
    }
    return result;
} finally {
    lock.unlock();
}
Obviously this is not the best design when Foos for different keys can be created independently.
My current hack is to use a Map of Futures:
lock.lock();
Future<Foo> future;
try {
    future = allFutures.get(key);
    if (future == null) {
        future = executorService.submit(new Callable<Foo>() {
            public Foo call() {
                return createFooExpensively(key);
            }
        });
        allFutures.put(key, future);
    }
} finally {
    lock.unlock();
}

try {
    return future.get();
} catch (InterruptedException e) {
    throw new MyRuntimeException(e);
} catch (ExecutionException e) {
    throw new MyRuntimeException(e);
}
But this seems... a little hacky, for two reasons:
The work is done on an arbitrary pooled thread. I'd be happy to have the work
done on the first thread that tries to get that particular key, especially since
it's going to be blocked anyway.
Even when the Map is fully populated, we still go through Future.get() to get
the results. I expect this is pretty cheap, but it's ugly.
What I'd like is to replace cache with a Map that will block gets for a given key until that key has a value, but allow other gets meanwhile. Does any such thing exist? Or does someone have a cleaner alternative to the Map of Futures?
Creating a lock per key sounds tempting, but it may not be what you want, especially when the number of keys is large.
As you would probably need to create a dedicated (read-write) lock for each key, this has an impact on your memory usage. Also, that fine granularity may hit a point of diminishing returns given a finite number of cores if concurrency is truly high.
ConcurrentHashMap is oftentimes a good enough solution in a situation like this. It provides full reader concurrency (readers normally do not block), and updates can be concurrent up to the desired concurrency level. This gives you pretty good scalability. The above code may be expressed with ConcurrentHashMap like the following:
ConcurrentMap<Key, Foo> cache = new ConcurrentHashMap<>();
...
Foo result = cache.get(key);
if (result == null) {
    result = createFooExpensively(key);
    Foo old = cache.putIfAbsent(key, result);
    if (old != null) {
        result = old;
    }
}
The straightforward use of ConcurrentHashMap does have one drawback, which is that multiple threads may find that the key is not cached, and each may invoke createFooExpensively(). As a result, some threads may do throw-away work. To avoid this, you would want to use the memoizer pattern that's mentioned in "Java Concurrency in Practice".
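Since Java 8, ConcurrentHashMap.computeIfAbsent gives you a built-in, lighter-weight form of that memoization: the mapping function runs at most once per key, and only callers of the same key block while it runs. A minimal sketch, reusing the names above:

Foo result = cache.computeIfAbsent(key, k -> createFooExpensively(k));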
But then again, the nice folks at Google already solved these problems for you in the form of CacheBuilder:
LoadingCache<Key, Foo> cache = CacheBuilder.newBuilder()
    .concurrencyLevel(32)
    .build(new CacheLoader<Key, Foo>() {
        public Foo load(Key key) {
            return createFooExpensively(key);
        }
    });
...
Foo result = cache.get(key);
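One caveat, in case it matters: LoadingCache.get(key) throws the checked ExecutionException, so the line above needs handling; when load() cannot throw a checked exception, getUnchecked is tidier. A sketch against the cache declared above:

Foo result;
try {
    result = cache.get(key);               // loader failures surface as ExecutionException
} catch (ExecutionException e) {
    throw new MyRuntimeException(e.getCause());
}
// or, when load() throws no checked exceptions:
Foo result2 = cache.getUnchecked(key);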
You can use funtom-java-utils - PerKeySynchronizedExecutor.
It will create a lock for each key, but will clear it for you immediately when it becomes unused.
It will also guarantee memory visibility between invocations with the same key, and is designed to be very fast and to minimize contention between invocations of different keys.
Declare it in your class:
final PerKeySynchronizedExecutor<KEY_CLASS> executor = new PerKeySynchronizedExecutor<>();
Use it:
Foo foo = executor.execute(key, () -> createFooExpensively(key));
public class Cache {

    private static final Set<String> lockedKeys = new HashSet<>();

    private void lock(String key) {
        synchronized (lockedKeys) {
            while (!lockedKeys.add(key)) {
                try {
                    lockedKeys.wait();
                } catch (InterruptedException e) {
                    log.error("...");
                    throw new RuntimeException(e);
                }
            }
        }
    }

    private void unlock(String key) {
        synchronized (lockedKeys) {
            lockedKeys.remove(key);
            lockedKeys.notifyAll();
        }
    }

    public Foo getFromCache(String key) {
        try {
            lock(key);

            Foo result = cache.get(key);
            if (result == null) {
                result = createFooExpensively(key);
                cache.put(key, result);
            }
            return result;
            // For different keys this runs in parallel.
            // For the same key it runs synchronously.
        } finally {
            unlock(key);
        }
    }
}
The key need not be a String; it can be any class with correctly overridden equals and hashCode methods.
The try-finally is very important: you must guarantee that waiting threads are unblocked after your operation, even if the operation threw an exception.
It will not work if your back-end is distributed across multiple servers/JVMs.
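If you like this per-key locking approach but would rather not maintain it yourself, Guava's Striped is one hedge against the memory cost of a lock per key: it maps keys onto a fixed pool of locks. A sketch, assuming Guava is on the classpath (the stripe count of 64 is an arbitrary placeholder):

private static final Striped<Lock> stripes = Striped.lock(64);

public Foo getFromCache(String key) {
    Lock lock = stripes.get(key);   // same key -> same lock; different keys usually get different locks
    lock.lock();
    try {
        Foo result = cache.get(key);
        if (result == null) {
            result = createFooExpensively(key);
            cache.put(key, result);
        }
        return result;
    } finally {
        lock.unlock();
    }
}

The trade-off is that distinct keys can occasionally share a stripe and serialize against each other.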
I have a Java REST application where one endpoint always deals with a ConcurrentMap. I am doing load tests, and performance degrades badly as the load increases.
What strategies can I implement in order to improve the efficiency of the application?
Should I play around with Jetty threads, as it is the server I'm using? Or is it mainly code? Or both?
The method that becomes the bottleneck is the one below.
Basically, I need to read a given line from a file. I can't store it in a DB, so I came up with this Map-based processing. However, I'm aware that for large files it will take long just to reach the line, and I risk the Map consuming a lot of memory when it has many entries...
dict is the ConcurrentMap.
public String getLine(int lineNr) throws IllegalArgumentException {
    if (lineNr > nrLines) {
        throw new IllegalArgumentException();
    }
    if (dict.containsKey(lineNr)) {
        return dict.get(lineNr);
    }
    synchronized (this) {
        try (Stream<String> st = Files.lines(doc.toPath())) {
            Optional<String> optionalLine = st.skip(lineNr - 1).findFirst();
            if (optionalLine.isPresent()) {
                dict.put(lineNr, optionalLine.get());
            } else {
                nrLines = nrLines > lineNr ? lineNr : nrLines;
                throw new IllegalArgumentException();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return dict.get(lineNr);
    }
}
Mixing up ConcurrentMap with synchronized(this) is probably not the right approach. The classes in the java.util.concurrent package are designed for specific use cases and optimize synchronization internally.
Instead, I'd suggest first trying a well-designed caching library and seeing whether the performance is good enough. One example is Caffeine. As per the Population docs, it gives you a way to declare how to load the data, even asynchronously:
AsyncLoadingCache<Key, Graph> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    // Either: build with a synchronous computation that is wrapped as asynchronous
    .buildAsync(key -> createExpensiveGraph(key));
    // Or: build with an asynchronous computation that returns a future
    .buildAsync((key, executor) -> createExpensiveGraphAsync(key, executor));
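For the line-cache use case here, the synchronous LoadingCache flavor may be the closer fit; unlike Guava, Caffeine's get does not throw a checked exception. A sketch, with readLine standing in for your Files.lines(...) lookup:

LoadingCache<Integer, String> lines = Caffeine.newBuilder()
    .maximumSize(10_000)
    .build(lineNr -> readLine(lineNr));   // readLine is a placeholder for your file access

String line = lines.get(lineNr);          // loads at most once per key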
This solution is based on ConcurrentHashMap#computeIfAbsent, with two assumptions:
Multiple threads reading the same file is not a problem.
While the documentation says the computation should be short and simple because of blocking, I believe this is only a problem for same-key (or same bucket/stripe) access, and only for updates, not reads? In this scenario it is not a problem, as we either successfully compute the value or throw IllegalArgumentException.
Using this, we open the file at most once per key, by making that the computation required to put the key.
public String getLine(int lineNr) throws IllegalArgumentException {
    if (lineNr > nrLines) {
        throw new IllegalArgumentException();
    }
    return cache.computeIfAbsent(lineNr, (l) -> {
        try (Stream<String> st = Files.lines(path)) {
            Optional<String> optionalLine = st.skip(lineNr - 1).findFirst();
            if (optionalLine.isPresent()) {
                return optionalLine.get();
            } else {
                nrLines = nrLines > lineNr ? lineNr : nrLines;
                throw new IllegalArgumentException();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    });
}
I "verified" the second assumption by spawning 3 threads, where:
Thread1 computes key 0 by looping infinitely (blocks forever).
Thread2 attempts to put at key 0, but never does because Thread1 blocks.
Thread3 attempts to put at key 1, and does so immediately.
Try it out, maybe it works or maybe assumptions are wrong and it sucks. The Map uses buckets internally, so the computation may become a bottleneck even with different keys, as it locks the bucket/stripe.
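For reference, a minimal sketch of that experiment (my own reconstruction; it assumes keys 0 and 1 land in different bins):

public static void main(String[] args) throws InterruptedException {
    ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
    new Thread(() -> map.computeIfAbsent(0, k -> { while (true) { } })).start(); // Thread1: loops forever, holds key 0's bin
    Thread.sleep(100);                                                           // let Thread1 claim the bin
    new Thread(() -> map.computeIfAbsent(0, k -> "x")).start();                  // Thread2: stuck behind Thread1
    new Thread(() -> map.computeIfAbsent(1, k -> "y")).start();                  // Thread3: completes immediately
}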
I already have a topic with the same code:
public abstract class Digest {

    private Map<String, byte[]> cache = new HashMap<>();

    public byte[] digest(String input) {
        byte[] result = cache.get(input);
        if (result == null) {
            synchronized (cache) {
                result = cache.get(input);
                if (result == null) {
                    result = doDigest(input);
                    cache.put(input, result);
                }
            }
        }
        return result;
    }

    protected abstract byte[] doDigest(String input);
}
In the previous topic it was proven that this code is not thread-safe.
In this topic I want to provide the solutions I have in mind, and I ask you to review them:
Solution #1, through ReadWriteLock:
public abstract class Digest {

    private final ReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Lock readLock = rwl.readLock();
    private final Lock writeLock = rwl.writeLock();

    private Map<String, byte[]> cache = new HashMap<>(); // I still don't know whether I should use volatile or not

    public byte[] digest(String input) {
        byte[] result = null;
        readLock.lock();
        try {
            result = cache.get(input);
        } finally {
            readLock.unlock();
        }
        if (result == null) {
            writeLock.lock();
            try {
                result = cache.get(input);
                if (result == null) {
                    result = doDigest(input);
                    cache.put(input, result);
                }
            } finally {
                writeLock.unlock();
            }
        }
        return result;
    }

    protected abstract byte[] doDigest(String input);
}
Solution #2, through ConcurrentHashMap:
public abstract class Digest {

    private Map<String, byte[]> cache = new ConcurrentHashMap<>(); // should this be volatile?

    public byte[] digest(String input) {
        return cache.computeIfAbsent(input, this::doDigest);
    }

    protected abstract byte[] doDigest(String input);
}
Please review the correctness of both solutions. This is not a question about which solution is better; I understand that the ConcurrentHashMap one is. Please just review the correctness of each implementation.
Unlike the clusterfudge we got into in the last question, this is better.
As was shown in the previous question's duplicate, the original code is not thread-safe, since HashMap is not thread-safe and the initial get() can be called while the put() is being executed inside the synchronized block. This can break all sorts of things, so it's definitely not thread-safe.
The ReadWriteLock solution is thread-safe, since all accesses to cache are done in guarded code. The initial get() is protected by the read lock, and the put() is done inside the write lock, guaranteeing that threads can't read the cache while it's being written to, but are free to read it at the same time as other reading threads. No concurrency issues, no visibility issues, no chance of deadlocks. Everything's fine.
The last is of course the most elegant one. Since computeIfAbsent() is an atomic operation, it guarantees that the value is either directly returned or computed at most once, from the javadoc:
If the specified key is not already associated with a value, attempts
to compute its value using the given mapping function and enters it
into this map unless null. The entire method invocation is
performed atomically, so the function is applied at most once per key.
Some attempted update operations on this map by other threads may be
blocked while computation is in progress, so the computation should be
short and simple, and must not attempt to update any other mappings of
this map.
The map in question shouldn't be volatile, but it should be final. If it's not final, it could (at least in theory) be reassigned, and it would be possible for two threads to work on different map objects, which is not what you want.
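In other words, a declaration along these lines (a trivial sketch):

private final Map<String, byte[]> cache = new ConcurrentHashMap<>(); // final: a single, safely published map instance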
I have the following issue after trying to run my web application on a Linux server.
When running on Windows, everything works perfectly (simplified version: call the send() method, wait for the JMS response on a synchronizer object, send the response to the client)...
When started on the Linux server (same JVM version, 1.7; bytecode compiled for Java 1.5), I get a response only for the first message, and the following error in the log for the rest of the messages:
synchronizer is null /*my_generated_message_id*/
It looks like the JMS message listener thread cannot see new entries (created in the JMS sender thread) in the synchronizers map, but I don't understand why...
Synchronizers Map definition:
public final Map<String, ReqRespSynchro<Map>> synchronizers
= Collections.synchronizedMap(new HashMap<String, ReqRespSynchro<Map>>());
Sending a JMS request and actively awaiting the response:
@Override
public Map send(Map<String, Object> params) {
    String msgIdent = "" /*my_generated_message_id*/;
    Map response = null;

    ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
    if (synchronizer == null) {
        synchronizer = new ReqRespSynchro<Map>();
        synchronizers.put(msgIdent, synchronizer);
    }

    synchronized (synchronizer) {
        try {
            sender.send(params);
        } catch (Exception ex) {
            log.error("send error", ex);
        }
        synchronizer.initSendSequence();

        int iter = 1;
        try {
            while (!synchronizer.isSet() && iter > 0) {
                synchronizer.wait(this.waitTimeout);
                iter--;
            }
        } catch (Exception ex) {
            log.error("send error 2", ex);
            return null;
        } finally {
            response = (synchronizers.remove(msgIdent)).getRespObject();
        }
    }
    return response;
}
JMS onMessage response processing (separate thread):
public void onMessage(Message msg) {
    ObjectMessage om = (ObjectMessage) msg; // assuming the message is an ObjectMessage
    Map<String, Object> response = (Map<String, Object>) om.getObject();
    String msgIdent = response.getMyMsgID(); /*my_generated_message_id*/
    ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
    if (synchronizer != null) {
        synchronized (synchronizer) {
            synchronizer.setRespObject(response);
            synchronizer.notify();
        }
    } else {
        log.error("synchronizer is null " + msgIdent);
    }
}
Synchronizer class:
public class ReqRespSynchro<E> {

    private E obj = null;

    public synchronized void setRespObject(E obj) {
        this.obj = obj;
    }

    public synchronized void initSendSequence() {
        this.obj = null;
    }

    public synchronized boolean isSet() {
        return this.obj != null;
    }

    public synchronized E getRespObject() {
        return obj;
    }
}
Your code bears the “check-then-act” anti-pattern.
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
    synchronizer = new ReqRespSynchro<Map>();
    synchronizers.put(msgIdent, synchronizer);
}
Here, you first check whether synchronizers contains a particular mapping, then you act by putting a new mapping when one is not present. But by the time you act, there is no guarantee that the condition you checked still holds.
While the map returned by Collections.synchronizedMap guarantees thread-safe put and get methods, it does not (and cannot) guarantee that there won't be an update between subsequent invocations of get and put.
So if two threads execute the code above, one thread may put a new value while the other has already performed the get but not yet the put, and will therefore proceed to put a new value of its own, overwriting the existing one. The two threads will then use different ReqRespSynchro instances, and other threads may get either of them from the map.
The correct use would be to synchronize the entire compound operation:
synchronized (synchronizers) {
    ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
    if (synchronizer == null) {
        synchronizer = new ReqRespSynchro<Map>();
        synchronizers.put(msgIdent, synchronizer);
    }
}
It's a common mistake to think that by wrapping a map or collection into a synchronized one, every thread-safety issue is solved. But you still have to think about access patterns and guard compound operations manually, so sometimes you're better off using manual locking only and resisting the temptation of easy-to-use synchronized wrappers.
But note that ConcurrentMap was added to the Java API to address this use pattern (amongst others).
Change the map declaration to
public final ConcurrentHashMap<String, ReqRespSynchro<Map>> synchronizers
= new ConcurrentHashMap<>();
This map provides thread-safe put and get methods, but also methods that let you avoid the "check-then-act" anti-pattern for updates.
Using the ConcurrentMap under Java 8 is especially easy:
ReqRespSynchro<Map> synchronizer = synchronizers
.computeIfAbsent(msgIdent, key -> new ReqRespSynchro<>());
The invocation of computeIfAbsent will get the existing ReqRespSynchro<Map> if there is one; otherwise the provided function is executed to compute a value, which is then stored, all with an atomicity guarantee. The places where you simply get an existing instance need no change.
The pre-Java 8 code is a bit more convoluted:
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
    synchronizer = new ReqRespSynchro<>();
    ReqRespSynchro<Map> concurrent = synchronizers.putIfAbsent(msgIdent, synchronizer);
    if (concurrent != null) {
        synchronizer = concurrent;
    }
}
Here, we can’t perform the operation atomically, but we are able to detect if a concurrent update happened in-between. In this case, putIfAbsent will not modify the map but return the value already contained in the map. So if we encounter such a situation, all we have to do is to use that existing one instead of the one we attempted to put.
This could happen if the waitTimeout in your send() method is too short. You only allow one iteration of the waiting cycle, so the msgIdent entry may be removed from the map in send()'s finally block before it can be read in onMessage(): the wait timeout expires, the iteration counter is decremented, and the thread exits the cycle and removes the entry from the map.
Even if waitTimeout is long enough you may experience a so-called spurious wakeup:
A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied.
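A common way to guard against both problems is to loop on the condition with an explicit deadline rather than a fixed iteration count. A sketch, reusing the field names from the question (InterruptedException handling elided, as the question's code catches it anyway):

long deadline = System.currentTimeMillis() + waitTimeout;
synchronized (synchronizer) {
    while (!synchronizer.isSet()) {
        long remaining = deadline - System.currentTimeMillis();
        if (remaining <= 0) {
            break;                      // a real timeout; give up
        }
        synchronizer.wait(remaining);   // may wake spuriously; the loop re-checks the condition
    }
}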
By the way why don't you send response back via JMS without some cryptic synchronization? Here is an example for ActiveMQ message broker: How should I implement request response with JMS?
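For completeness, a bare-bones version of that request/reply pattern with javax.jms (a sketch; the session, producer, and JMSException handling are assumed):

// request side: attach a reply queue and a correlation ID
Message request = session.createObjectMessage((Serializable) params);
Destination replyQueue = session.createTemporaryQueue();
request.setJMSReplyTo(replyQueue);
request.setJMSCorrelationID(msgIdent);
producer.send(request);

// then block for the reply on the temporary queue instead of a shared synchronizer map
MessageConsumer replyConsumer = session.createConsumer(replyQueue);
Message reply = replyConsumer.receive(waitTimeout);   // null if the timeout expires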
I'm creating a memoization cache with the following characteristics:
a cache miss will result in computing and storing an entry
this computation is very expensive
this computation is idempotent
unbounded (entries never removed) since:
the inputs would result in at most 500 entries
each stored entry is very small
cache is relatively short-lived (typically less than an hour)
overall, memory usage isn't an issue
there will be thousands of reads - over the cache's lifetime, I expect 99.9%+ cache hits
must be thread-safe
What would have superior performance, or under what conditions would one solution be favored over the other?
ThreadLocal HashMap:
class MyCache<K, V> {

    private class LocalMyCache { // non-static, so it can use MyCache's type parameters
        final Map<K, V> map = new HashMap<K, V>();

        V get(K key) {
            V val = map.get(key);
            if (val == null) {
                val = computeVal(key);
                map.put(key, val);
            }
            return val;
        }
    }

    private final ThreadLocal<LocalMyCache> localCaches = new ThreadLocal<LocalMyCache>() {
        protected LocalMyCache initialValue() {
            return new LocalMyCache();
        }
    };

    public V get(K key) {
        return localCaches.get().get(key);
    }
}
ConcurrentHashMap:
class MyCache<K, V> {

    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();

    public V get(K key) {
        V val = map.get(key);
        if (val == null) {
            val = computeVal(key);
            map.put(key, val);
        }
        return val;
    }
}
I figure the ThreadLocal solution would initially be slower if there are a lot of threads, because of all the cache misses per thread, but over thousands of reads the amortized cost would be lower than with the ConcurrentHashMap solution. Is my intuition correct?
Or is there an even better solution?
Using a ThreadLocal as a cache is not good practice.
In most containers, threads are reused via thread pools and are thus never GC'd; this would lead to something weird.
If you use a ConcurrentHashMap, you have to manage it in order to prevent a memory leak.
If you insist, I suggest using weak or soft references and evicting entries after reaching a max size.
If you are looking for an in-memory cache solution (do not reinvent the wheel), try Guava cache:
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
this computation is very expensive
I assume this is the reason you created the cache, and it should be your primary concern.
While the speed of the solutions might differ slightly (by well under 100 ns), I suspect it is more important that you be able to share results between threads; i.e., ConcurrentHashMap is likely to be best for your application, as it is likely to save you more CPU time in the long run.
In short, the speed of your solution is likely to be tiny compared to the cost of computing the same thing multiple times (for multiple threads).
Note that your ConcurrentHashMap implementation does not prevent duplicate work and could lead to one item being computed twice. It is actually quite complicated to get this right if you store the results directly without explicit locking, which you certainly want to avoid if performance is a concern.
It is worth noting that ConcurrentHashMap is highly scalable and works well under high contention. I don't know if ThreadLocal would perform better.
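If you do go with the shared-map version, one way to close that compute-twice window without explicit locking is computeIfAbsent (a sketch against the question's MyCache):

public V get(K key) {
    // runs computeVal at most once per key; concurrent callers for the same key block briefly
    return map.computeIfAbsent(key, k -> computeVal(k));
}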
Apart from using a library, you could take some inspiration from Java Concurrency in Practice Listing 5.19. The idea is to save a Future<V> in your map instead of a V. That helps a lot in making the whole method thread safe while staying efficient (lock-free). I paste the implementation below for reference but the chapter is worth reading to understand that every detail counts.
public interface Computable<K, V> {
    V compute(K arg) throws InterruptedException;
}

public class Memoizer<K, V> implements Computable<K, V> {

    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<K, Future<V>>();
    private final Computable<K, V> c;

    public Memoizer(Computable<K, V> c) {
        this.c = c;
    }

    public V compute(final K arg) throws InterruptedException {
        while (true) {
            Future<V> f = cache.get(arg);
            if (f == null) {
                Callable<V> eval = new Callable<V>() {
                    public V call() throws InterruptedException {
                        return c.compute(arg);
                    }
                };
                FutureTask<V> ft = new FutureTask<V>(eval);
                f = cache.putIfAbsent(arg, ft);
                if (f == null) {
                    f = ft;
                    ft.run();
                }
            }
            try {
                return f.get();
            } catch (CancellationException e) {
                cache.remove(arg, f);
            } catch (ExecutionException e) {
                throw new RuntimeException(e.getCause());
            }
        }
    }
}
Given that it's relatively easy to implement both of these, I would suggest you try them both and test at a steady-state load to see which one performs best for your application.
My guess is that the ConcurrentHashMap will be a little faster, since it does not have to make native calls to Thread.currentThread() the way a ThreadLocal does. However, this may depend on the objects you are storing and how efficient their hash coding is.
It may also be worthwhile trying to tune the concurrent map's concurrencyLevel to the number of threads you need. It defaults to 16.
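For reference, the tuning constructor looks like this (a sketch; the numbers are placeholders):

ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>(
    16,     // initial capacity
    0.75f,  // load factor
    32);    // concurrencyLevel: estimated number of concurrently updating threads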
The lookup speed is probably similar in both solutions. If there are no other concerns, I'd prefer ThreadLocal, since the best solution to multi-threading problems is single-threading.
However, your main problem is you don't want concurrent calculations for the same key; so there should be a lock per key; such locks can usually be implemented by ConcurrentHashMap.
So my solution would be
class LazyValue<K, V> {

    private final K key;
    private volatile V value;           // volatile: safe publication for the double-check

    LazyValue(K key) { this.key = key; }

    V getValue() {
        V result = value;               // first check, no locking
        if (result == null) {
            synchronized (this) {       // double-checked locking
                result = value;
                if (result == null) {
                    value = result = computeVal(key);   // the expensive computation
                }
            }
        }
        return result;
    }
}

static ConcurrentHashMap<K, LazyValue<K, V>> centralMap = ...;

static {
    // for every key:
    //     centralMap.put(key, new LazyValue<>(key));
}

static V lookup(K key) {
    V value = localMap.get(key);        // localMap: this thread's private (e.g. ThreadLocal) HashMap
    if (value == null) {
        localMap.put(key, value = centralMap.get(key).getValue());
    }
    return value;
}
The performance question is irrelevant, as the solutions are not equivalent.
The ThreadLocal hash map isn't shared between threads, so the question of thread safety doesn't even arise, but it also doesn't meet your specification, which doesn't say anything about each thread having its own cache.
The requirement for thread safety implies that a single cache is shared among all threads, which rules out ThreadLocal completely.
In the JCIP book, Listing 5.19, Final Implementation of Memorizer, my questions are:
1) Is the endless while loop there because of the atomic putIfAbsent()?
2) Should the while loop be inside the implementation of putIfAbsent() instead of in client code?
3) Should the while loop have a smaller scope, just wrapping putIfAbsent()?
4) The while loop looks bad for readability.
Code:
public class Memorizer<A, V> implements Computable<A, V> {

    private final ConcurrentMap<A, Future<V>> cache
            = new ConcurrentHashMap<A, Future<V>>();
    private final Computable<A, V> c;

    public Memorizer(Computable<A, V> c) { this.c = c; }

    public V compute(final A arg) throws InterruptedException {
        while (true) { // <==== WHY?
            Future<V> f = cache.get(arg);
            if (f == null) {
                Callable<V> eval = new Callable<V>() {
                    public V call() throws InterruptedException {
                        return c.compute(arg);
                    }
                };
                FutureTask<V> ft = new FutureTask<V>(eval);
                f = cache.putIfAbsent(arg, ft);
                if (f == null) { f = ft; ft.run(); }
            }
            try {
                return f.get();
            } catch (CancellationException e) {
                cache.remove(arg, f);
            } catch (ExecutionException e) {
                throw launderThrowable(e.getCause());
            }
        }
    }
}
1) Is the endless while loop there because of the atomic putIfAbsent()?
The while loop is there to repeat the computation when a previous computation was cancelled (the first catch in the try).
2) Should the while loop be inside the implementation of putIfAbsent() instead of in client code?
No; please read what putIfAbsent does. It just tries to put an object, once only.
3) Should the while loop have a smaller scope, just wrapping putIfAbsent()?
No, it shouldn't. See #1.
4) The while loop looks bad for readability.
You are free to offer something better. In fact, this construction suits perfectly a situation where you have to keep trying something until it succeeds.
No, you cannot reduce the scope of the while loop. You want to call f.get() on the value that is in the cache. If there was no value for arg in the map, you want to call get() on your own result; otherwise you want to fetch the existing value for arg from the map and get() that one.
The problem is that there are no locks in this implementation, so between checking whether there is a value and trying to insert one, another thread could have inserted its own value. Equally, between the insertion failing and the retrieval, the value could have been removed from the cache (due to a CancellationException). Because of these failure cases, you spin in the while(true) until either you can get the canonical value out of the map, or you insert a new value into the map (making your value the canonical one).
It might seem that you could move the f.get() out of the loop, but it is kept inside because of the risk of a CancellationException, where you want to keep trying.