I'm creating a memoization cache with the following characteristics:
a cache miss will result in computing and storing an entry
this computation is very expensive
this computation is idempotent
unbounded (entries never removed) since:
the inputs would result in at most 500 entries
each stored entry is very small
cache is relatively short-lived (typically less than an hour)
overall, memory usage isn't an issue
there will be thousands of reads - over the cache's lifetime, I expect 99.9%+ cache hits
must be thread-safe
What would have superior performance, or under what conditions would one solution be favored over the other?
ThreadLocal HashMap:
class MyCache {
private static class LocalMyCache {
final Map<K,V> map = new HashMap<K,V>();
V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
private final ThreadLocal<LocalMyCache> localCaches = new ThreadLocal<LocalMyCache>() {
protected LocalMyCache initialValue() {
return new LocalMyCache();
}
};
public V get(K key) {
return localCaches.get().get(key);
}
}
ConcurrentHashMap:
class MyCache {
private final ConcurrentHashMap<K,V> map = new ConcurrentHashMap<K,V>();
public V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
I figure the ThreadLocal solution would initially be slower if there are a lot of threads, because each thread takes its own cache misses, but over thousands of reads the amortized cost would be lower than with the ConcurrentHashMap solution. Is my intuition correct?
Or is there an even better solution?
Using a ThreadLocal as a cache is not good practice.
In most containers, threads are reused via thread pools and are therefore never garbage-collected, which can lead to weird behavior.
If you use a ConcurrentHashMap, you have to manage it yourself in order to prevent memory leaks.
If you insist, I suggest using weak or soft references and evicting entries once a maximum size is reached.
If you are looking for an in-memory cache solution, do not reinvent the wheel:
try the Guava cache.
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
this computation is very expensive
I assume this is the reason you created the cache and this should be your primary concern.
While the speed of the two solutions might differ slightly (by much less than 100 ns per lookup), I suspect it is more important that you are able to share results between threads; i.e. ConcurrentHashMap is likely to be the best for your application, as it is likely to save you more CPU time in the long run.
In short, the speed of your solution is likely to matter very little compared to the cost of computing the same thing multiple times (once per thread).
Note that the get-then-put sequence in your ConcurrentHashMap implementation is not atomic: two threads that miss on the same key at the same time will both compute the item. It is actually quite complicated to get this right if you store the results directly without using explicit locking, which you certainly want to avoid if performance is a concern.
It is worth noting that ConcurrentHashMap is highly scalable and works well under high contention. I don't know if ThreadLocal would perform better.
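For illustration only: on Java 8 or later, one way to close the "computed twice" window is ConcurrentHashMap.computeIfAbsent, which applies the mapping function at most once per key. The class name ComputeIfAbsentCache and the Function parameter standing in for computeVal are assumptions made for this sketch, not part of the question's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: a shared memoization cache built on computeIfAbsent.
class ComputeIfAbsentCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> computeVal; // stands in for the expensive computation

    ComputeIfAbsentCache(Function<K, V> computeVal) {
        this.computeVal = computeVal;
    }

    V get(K key) {
        // Fast path: a plain get() takes no lock on a hit (the 99.9% case).
        V val = map.get(key);
        if (val != null) {
            return val;
        }
        // Slow path: the function is applied at most once per key; other
        // threads asking for the same key block until the value is stored.
        return map.computeIfAbsent(key, computeVal);
    }
}
```

The function must not return null for a value that should be cached and must not modify the map itself. Threads that miss on the same key wait for the first computation rather than repeating it, which is the trade-off being made here.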
Apart from using a library, you could take some inspiration from Java Concurrency in Practice Listing 5.19. The idea is to save a Future<V> in your map instead of a V. That helps a lot in making the whole method thread safe while staying efficient (lock-free). I paste the implementation below for reference but the chapter is worth reading to understand that every detail counts.
public interface Computable<K, V> {
V compute(K arg) throws InterruptedException;
}
public class Memoizer<K, V> implements Computable<K, V> {
private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<K, Future<V>>();
private final Computable<K, V> c;
public Memoizer(Computable<K, V> c) {
this.c = c;
}
public V compute(final K arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = cache.putIfAbsent(arg, ft);
if (f == null) {
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
throw new RuntimeException(e.getCause());
}
}
}
}
Given that it's relatively easy to implement both of these, I would suggest you try them both and test at steady state load to see which one performs the best for your application.
My guess is that the ConcurrentHashMap will be a little faster since it does not have to make native calls to Thread.currentThread() like a ThreadLocal does. However, this may depend on the objects you are storing and how efficient their hash coding is.
It may also be worthwhile to try tuning the concurrent map's concurrencyLevel to the number of threads you need. It defaults to 16.
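As a rough sketch of what that tuning looks like: ConcurrentHashMap has a three-argument constructor taking initialCapacity, loadFactor and concurrencyLevel. The factory class and its parameter names below are hypothetical. Note that from Java 8 onward the implementation treats concurrencyLevel only as a sizing hint, so the effect of this knob depends on the JDK version.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical factory showing where concurrencyLevel is passed in.
class TunedMapFactory {
    static <K, V> ConcurrentHashMap<K, V> create(int expectedEntries, int expectedThreads) {
        float loadFactor = 0.75f; // the usual default load factor
        // Arguments: initialCapacity, loadFactor, concurrencyLevel
        return new ConcurrentHashMap<>(expectedEntries, loadFactor, expectedThreads);
    }
}
```

For the cache in the question, something like `TunedMapFactory.create(500, 8)` would size the map for its 500 entries up front; the thread count of 8 is just an example value.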
The lookup speed is probably similar in both solutions. If there are no other concerns, I'd prefer ThreadLocal, since the best solution to multi-threading problems is single-threading.
However, your main problem is you don't want concurrent calculations for the same key; so there should be a lock per key; such locks can usually be implemented by ConcurrentHashMap.
So my solution would be
class LazyValue
{
K key;
volatile V value;
V getValue() { lazy calculation, doubled-checked locking }
}
static ConcurrentHashMap<K, LazyValue> centralMap = ...;
static
{
for every key
centralMap.put( key, new LazyValue(key) );
}
static V lookup(K key)
{
V value = localMap.get(key);
if(value==null)
localMap.put(key, value=centralMap.get(key).getValue())
return value;
}
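The pseudocode above could be fleshed out roughly as follows. This is a sketch under assumptions: the class name TwoLevelCache, the Function parameter that stands in for the expensive calculation, and the ThreadLocal-backed localMap are all filled in for illustration, and the per-key holders are created lazily instead of being pre-populated in a static initializer. It needs Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class TwoLevelCache<K, V> {
    // Per-key holder: the holder object doubles as the lock for its key.
    private static final class LazyValue<K, V> {
        private final K key;
        private volatile V value; // volatile is what makes double-checked locking safe

        LazyValue(K key) { this.key = key; }

        V getValue(Function<K, V> compute) {
            V v = value;                 // first check, no lock
            if (v == null) {
                synchronized (this) {
                    v = value;           // second check, under the lock
                    if (v == null) {
                        value = v = compute.apply(key);
                    }
                }
            }
            return v;
        }
    }

    private final ConcurrentHashMap<K, LazyValue<K, V>> centralMap = new ConcurrentHashMap<>();
    private final ThreadLocal<Map<K, V>> localMap = ThreadLocal.withInitial(HashMap::new);
    private final Function<K, V> compute;

    TwoLevelCache(Function<K, V> compute) { this.compute = compute; }

    V lookup(K key) {
        Map<K, V> local = localMap.get();
        V value = local.get(key);
        if (value == null) {
            // Creating a holder is cheap; the expensive call happens in getValue().
            LazyValue<K, V> holder = centralMap.computeIfAbsent(key, LazyValue::new);
            value = holder.getValue(compute);
            local.put(key, value);
        }
        return value;
    }
}
```

Each key is computed once across all threads, while repeated lookups from the same thread are served from that thread's private HashMap without touching shared state.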
The performance question is irrelevant, as the solutions are not equivalent.
The ThreadLocal hash map isn't shared between threads, so the question of thread safety doesn't even arise, but it also doesn't meet your specification, which doesn't say anything about each thread having its own cache.
The requirement for thread safety implies that a single cache is shared among all threads, which rules out ThreadLocal completely.
I have a piece of code that can be executed by multiple threads that needs to perform an I/O-bound operation in order to initialize a shared resource that is stored in a ConcurrentMap. I need to make this code thread safe and avoid unnecessary calls to initialize the shared resource. Here's the buggy code:
private ConcurrentMap<String, Resource> map;
// .....
String key = "somekey";
Resource resource;
if (map.containsKey(key)) {
resource = map.get(key);
} else {
resource = getResource(key); // I/O-bound, expensive operation
map.put(key, resource);
}
With the above code, multiple threads may check the ConcurrentMap and see that the resource isn't there, and all attempt to call getResource() which is expensive. In order to ensure only a single initialization of the shared resource and to make the code efficient once the resource has been initialized, I want to do something like this:
String key = "somekey";
Resource resource;
if (!map.containsKey(key)) {
synchronized (map) {
if (!map.containsKey(key)) {
resource = getResource(key);
map.put(key, resource);
}
}
}
Is this a safe version of double checked locking? It seems to me that since the checks are called on ConcurrentMap, it behaves like a shared resource that is declared to be volatile and thus prevents any of the "partial initialization" problems that may happen.
If you can use external libraries, take a look at Guava's MapMaker.makeComputingMap(). It's tailor-made for what you're trying to do.
Yes, it's safe.
If map.containsKey(key) is true, according to doc, map.put(key, resource) happens before it. Therefore getResource(key) happens before resource = map.get(key), everything is safe and sound.
Why not use the putIfAbsent() method on ConcurrentMap?
if(!map.containsKey(key)){
map.putIfAbsent(key, getResource(key));
}
Conceivably you might call getResource() more than once, but it won't happen a bunch of times. Simpler code is less likely to bite you.
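A small sketch of this approach that also uses the return value of putIfAbsent, so that every caller ends up with the instance that actually won the race. The helper name getOrCreate and the Function-typed loader (standing in for getResource) are assumptions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class PutIfAbsentLookup {
    // Hypothetical helper: loader may run in more than one thread for the
    // same key, but only the first stored value is ever handed out.
    static <K, V> V getOrCreate(ConcurrentMap<K, V> map, K key, Function<K, V> loader) {
        V existing = map.get(key);
        if (existing != null) {
            return existing;
        }
        V created = loader.apply(key);
        V raced = map.putIfAbsent(key, created); // null means our value was stored
        return raced != null ? raced : created;
    }
}
```

This keeps the code lock-free at the price of an occasional duplicate load, which matches the trade-off described above.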
In general, double-checked locking is safe if the variable you're checking is marked volatile. But you're better off synchronizing the entire function:
public synchronized Resource getResource(String key) {
Resource resource = map.get(key);
if (resource == null) {
resource = expensiveGetResourceOperation(key);
map.put(key, resource);
}
return resource;
}
The performance hit will be tiny, and you'll be certain that there will be no sync problems.
Edit:
This is actually faster than the alternatives, because you won't have to do two calls to the map in most cases. The only extra operation is the null check, and the cost of that is close to zero.
Second edit:
Also, you don't have to use ConcurrentMap. A regular HashMap will do it. Faster still.
No need for that: ConcurrentMap already supports this with its atomic putIfAbsent method.
Don't reinvent the wheel: Always use the API where possible.
The verdict is in. I timed 3 different solutions in nanosecond accuracy, since after all the initial question was about performance:
Fully synching the function on a regular HashMap:
synchronized (map) {
Object result = map.get(key);
if (result == null) {
result = new Object();
map.put(key, result);
}
return result;
}
first invocation: 15,000 nanoseconds, subsequent invocations: 700 nanoseconds
Using the double check lock with a ConcurrentHashMap:
if (!map.containsKey(key)) {
synchronized (map) {
if (!map.containsKey(key)) {
map.put(key, new Object());
}
}
}
return map.get(key);
first invocation: 15,000 nanoseconds, subsequent invocations: 1500 nanoseconds
A different flavor of double checked ConcurrentHashMap:
Object result = map.get(key);
if (result == null) {
synchronized (map) {
if (!map.containsKey(key)) {
result = new Object();
map.put(key, result);
} else {
result = map.get(key);
}
}
}
return result;
first invocation: 15,000 nanoseconds, subsequent invocations: 1000 nanoseconds
You can see that the biggest cost was on the first invocation, but it was similar for all 3. Subsequent invocations were fastest on the regular HashMap with method sync, as user237815 suggested, but only by 300 nanoseconds. And after all, we are talking about nanoseconds here, i.e. billionths of a second.
Here is the code in one of my classes:
class SomeClass {
private Map<Integer, Integer> map = new ConcurrentHashMap<>();
private volatile int counter = 0;
final AtomicInteger sum = new AtomicInteger(0); // will be used in other classes/threads too
private ReentrantLock l = new ReentrantLock();
public void put(String some) {
l.lock();
try {
int tmp = Integer.parseInt(some);
map.put(counter++, tmp);
sum.getAndAdd(tmp);
} finally {
l.unlock();
}
}
public Double get() {
l.lock();
try {
//... perform some map resizing operation ...
// some calculations including sum field ...
} finally {
l.unlock();
}
}
}
You can assume that this class will be used in concurrent environment.
The question is: do you think the locks are necessary? How does this code smell? :)
Let's look at the operations inside public void put(String some).
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter is a volatile variable. So it only provides memory visibility but not atomicity. Since counter++ is a compound operation, you need a lock to achieve atomicity.
map.put(key, value) is atomic since it is a ConcurrentHashMap.
sum.getAndAdd(tmp) is atomic since it is an AtomicInteger.
As you can see, except counter++ every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++ is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
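To illustrate the first half of that point, here is a hedged sketch in which the volatile int is replaced by an AtomicInteger, so that the increment itself becomes a single atomic read-modify-write. The class name LockFreeRecorder and the size() accessor are made up for this example. It deliberately does not make the three steps atomic as a group: a reader can still observe the new map entry before sum has been updated, so if get() needs a consistent view, the lock stays.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class LockFreeRecorder {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    final AtomicInteger sum = new AtomicInteger();

    void put(String some) {
        int tmp = Integer.parseInt(some);
        // getAndIncrement() is atomic, so every call gets a distinct key.
        map.put(counter.getAndIncrement(), tmp);
        // Each step is atomic on its own; the sequence as a whole is not.
        sum.getAndAdd(tmp);
    }

    int size() {
        return map.size();
    }
}
```

With a plain volatile counter, two threads could read the same value and overwrite each other's entry; with getAndIncrement that particular lost update cannot happen.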
Since you always increment counter when you use it as a key to put into this map:
map.put(counter++, tmp);
when you come to read it again:
return sum / map.get(counter);
map.get(counter) will be null, so this results in a NPE (unless you put more than 2^32 things into the map, ofc). (I'm assuming you mean sum.get(), otherwise it won't compile).
As such, you can have equivalent functionality without any locks:
class SomeClass {
public void put(String some) { /* do nothing */ }
public Double get() {
throw new NullPointerException();
}
}
You've not really fixed the problem with your edit. divisor will still be null, so the equivalent functionality without locks would be:
class SomeClass {
private final AtomicInteger sum = new AtomicInteger(0);
public void put(String some) {
sum.getAndAdd(Integer.parseInt(some));
}
public Double get() {
return (double) sum.get(); // cast needed: an int cannot be boxed directly to Double
}
}
I ran into a performance issue when implementing a duplicate-free concurrent ArrayList (or ConcurrentLinkedQueue).
public class NonDuplicateList implements Outputable {
private Map<Term, Integer> map;
private List<Term> terms;
public NonDuplicateList() {
this.map = new HashMap<>();
this.terms = new ArrayList<>();
}
public synchronized int addTerm(Term term) { //bad performance :(
Integer index = map.get(term);
if (index == null) {
index = terms.size();
terms.add(term);
map.put(term, index);
}
return index;
}
@Override
public void output(DataOutputStream out) throws IOException {
out.writeInt(terms.size());
for (Term term : terms) {
term.output(out);
}
}
}
Note that Term and NonDuplicateList both implement Outputable interface to output.
In order to keep NonDuplicateList thread-safe, I use synchronized to guard the method addTerm(Term), and the performance is as bad as expected when addTerm is invoked concurrently.
It seems that ConcurrentHashMap isn't suitable for this case, since it doesn't keep strong data consistency. Any idea how to improve the performance of addTerm without losing its thread-safety?
EDIT:
The output method, i.e. iteration through NonDuplicateList, need not be thread-safe, since only one thread will call it, and only after the concurrent addTerm calls have finished; but addTerm must return the index value as soon as a term is added to the NonDuplicateList.
There is a possibility to reuse ConcurrentHashMap in your implementation if you can sacrifice addTerm return type. Instead of returning actual index you can return boolean which indicates whether addition was successful or produced duplicate. This will also allow you to remove method synchronization and improve performance:
private ConcurrentMap<Term, Boolean> map;
private List<Term> terms;
public boolean addTerm(Term term) {
Boolean previousValue = map.putIfAbsent(term, Boolean.TRUE);
if (previousValue == null) {
terms.add(term);
return true;
}
return false;
}
I am afraid you will not get a much faster solution here. The point is to avoid synchronization when you don't need it. If you don't mind weak consistency, using the ConcurrentHashMap iterator can be significantly cheaper than either preventing other threads from adding items while you're iterating or taking a consistent snapshot when the iterator is created.
On the other hand, when you need synchronization and a consistent iterator, you'll need an alternative for ConcurrentHashMap. One that comes to my mind is java.util.Collections#synchronizedMap, but it's using synchronization at Object level, so every read/write operation needs to acquire lock, which is a performance overhead.
Take a look at ConcurrentSkipListMap, which guarantees average O(log(n)) performance on a wide variety of operations. It also has a number of operations that ConcurrentHashMap doesn't: ceilingEntry/Key, floorEntry/Key, etc. It also maintains a sort order, which would otherwise have to be calculated (at notable expense) if you were using a ConcurrentHashMap. Maybe it would be possible to get rid of list+map and use ConcurrentSkipListMap instead. Index of element might be computed using ConcurrentSkipListMap api.
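One more option, offered as a sketch and not taken from any of the answers above: if addTerm must keep returning the index, the index can be handed out by an AtomicInteger inside ConcurrentHashMap.computeIfAbsent (Java 8+), which runs the mapping function at most once per key. The class name ConcurrentIndexer and the method inIndexOrder are hypothetical; the latter relies on the stated assumption that output happens only after all adds have finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentIndexer<T> {
    private final ConcurrentHashMap<T, Integer> indexOf = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    // Returns the existing index of term, or assigns the next free one.
    int add(T term) {
        Integer idx = indexOf.get(term); // lock-free on a hit
        if (idx != null) {
            return idx;
        }
        // The function runs at most once per key, so indices stay dense and unique.
        return indexOf.computeIfAbsent(term, t -> next.getAndIncrement());
    }

    // Call only after all add() calls have finished (single-threaded phase).
    List<T> inIndexOrder() {
        int n = indexOf.size();
        List<T> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(null);
        }
        for (Map.Entry<T, Integer> e : indexOf.entrySet()) {
            out.set(e.getValue(), e.getKey());
        }
        return out;
    }
}
```

The list is rebuilt once at output time instead of being maintained on every add, which removes the need to keep a map and a list consistent under concurrency.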
I understand the overall concepts of multi-threading and synchronization but am new to writing thread-safe code. I currently have the following code snippet:
synchronized(compiledStylesheets) {
if(compiledStylesheets.containsKey(xslt)) {
exec = compiledStylesheets.get(xslt);
} else {
exec = compile(s, imports);
compiledStylesheets.put(xslt, exec);
}
}
where compiledStylesheets is a HashMap (private, final). I have a few questions.
The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative. Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct? This is the only code that hits this object other than initialization/instantiation.
Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill. The putIfAbsent() method will not be usable in this instance because it doesn't allow me to skip the compile() method call. I also don't know if it will solve the "modified after containsKey() but before put()" problem, or if that's even really a concern in this case.
For tasks of this nature, I highly recommend Guava caching support.
If you can't use that library, here is a compact implementation of a Multiton. Use of the FutureTask was a tip from assylias, here, via OldCurmudgeon.
public abstract class Cache<K, V>
{
private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
public final V get(K key)
throws InterruptedException, ExecutionException
{
Future<V> ref = cache.get(key);
if (ref == null) {
FutureTask<V> task = new FutureTask<>(new Factory(key));
ref = cache.putIfAbsent(key, task);
if (ref == null) {
task.run();
ref = task;
}
}
return ref.get();
}
protected abstract V create(K key)
throws Exception;
private final class Factory
implements Callable<V>
{
private final K key;
Factory(K key)
{
this.key = key;
}
@Override
public V call()
throws Exception
{
return create(key);
}
}
}
I think you are looking for a Multiton.
There's a very good Java one here that @assylias posted some time ago.
You can loosen the lock at the risk of an occasional doubly compiled stylesheet in race condition.
Object y;
// lock here if needed
y = map.get(x);
if(y == null) {
y = compileNewY();
// lock here if needed
map.put(x, y); // this may happen twice, if put is t.s. one will be ignored
y = map.get(x); // essential because other thread's y may have been put
}
This requires get and put to be atomic, which is true in the case of ConcurrentHashMap and you can achieve by wrapping individual calls to get and put with a lock in your class. (As I tried to explain with "lock here if needed" comments - the point being you only need to wrap individual calls, not have one big lock).
This is a standard thread safe pattern to use even with ConcurrentHashMap (and putIfAbsent) to minimize the cost of compiling twice. It still needs to be acceptable to compile twice sometimes, but it should be okay even if expensive.
By the way, you can solve that problem. Usually the above pattern isn't used with a heavy function like compileNewY but a lightweight constructor new Y(). e.g. do this:
class PrecompiledY {
public volatile Y y;
private final AtomicBoolean compiled = new AtomicBoolean(false);
public void compile() {
if(!compiled.getAndSet(true)) {
y = compileNewY(); // the expensive call; runs at most once per holder
}
}
}
// ...
ConcurrentMap<X, PrecompiledY> map; // alternatively use proper locking
py = map.get(x);
if(py == null) {
py = new PrecompiledY(); // much cheaper than compiling
map.putIfAbsent(x, py); // may be attempted twice; only the first insert wins
py = map.get(x); // essential because another thread's py may have been put
py.compile(); // object that didn't get inserted never gets compiled
}
// note: py.y can still be null here while another thread is compiling
Also:
Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill.
Given that your code is heavily locking, ConcurrentHashMap is almost certainly far faster, so not overkill. (And much more likely to be bug-free. Concurrency bugs are not fun to fix.)
Please see Erickson's comment below. Using double-checked locking with a plain HashMap is not very smart.
The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative.
You can use double-checked locking, and note that you don't need any lock before get since you never remove anything from the map.
if(compiledStylesheets.containsKey(xslt)) {
exec = compiledStylesheets.get(xslt);
} else {
synchronized(compiledStylesheets) {
if(compiledStylesheets.containsKey(xslt)) {
// another thread might have created it while
// this thread was waiting for lock
exec = compiledStylesheets.get(xslt);
} else {
exec = compile(s, imports);
compiledStylesheets.put(xslt, exec);
}
}
}
Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct?
Correct
This is the only code that hits this object other than initialization/instantiation.
First of all, the code as you posted it is race-condition-free because containsKey() result will never change while compile() method is running.
Collections.synchronizedMap() is useless for your case as stated above because it wraps all map methods into a synchronized block using either this as a mutex or another object you provided (for two-argument version).
IMO using ConcurrentHashMap is also not an option because it stripes locks based on the key's hashCode() result; its concurrent iterators are also useless here.
If you really want compile() out of the synchronized block, you may pre-calculate it before checking containsKey(). This may hurt overall performance, but may be better than calling it inside the synchronized block. To decide, I would personally consider how often a key "miss" happens, and thus which option is preferable: holding the lock for longer, or always computing the result.
I was going to implement some code which needs a synchronized data structure. I came up with Hashtable and Collections.synchronizedMap(new HashMap()). I wouldn't need ConcurrentHashMap for this. I was wondering which one of the two would be better.
PS: I will be calling a lot of getters on this object, and they would not be called at the same time. So there is no problem with concurrency either.
ConcurrentHashMap is much more scalable: http://www.javamex.com/tutorials/concurrenthashmap_scalability.shtml
Hashtable and Collections.synchronizedMap(new HashMap()) provide the same performance, but they are only conditionally thread-safe (i.e. each individual call is safe, but compound actions still need external synchronization).
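A short sketch of what "conditionally thread-safe" means in practice, using a hypothetical counter class: each get and put on the synchronized wrapper is safe by itself, but a read-modify-write made of the two has to hold the map's monitor for the whole sequence. Collections.synchronizedMap documents that the returned map is the object to synchronize on.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the names are made up for illustration.
class SynchronizedMapCompound {
    private final Map<String, Integer> counts =
            Collections.synchronizedMap(new HashMap<>());

    void increment(String key) {
        // Without this block, two threads could both read the same old value
        // and one of the increments would be lost.
        synchronized (counts) { // same monitor the wrapper uses internally
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
    }

    int get(String key) {
        Integer v = counts.get(key); // a single call needs no extra locking
        return v == null ? 0 : v;
    }
}
```

The same rule applies to iteration over a synchronized map or a Hashtable: the caller has to hold the lock for the duration of the traversal.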
If there are a lot read operations, I would recommend to wrap it with read-write locks:
public class MyHashMap<K, V> extends HashMap<K, V> {
private final ReadWriteLock lock = new ReentrantReadWriteLock();
@Override
public V put(K key, V value) {
final Lock w = lock.writeLock();
w.lock();
try {
return super.put(key, value);
} finally {
w.unlock();
}
}
@Override
public V get(Object key) {
final Lock r = lock.readLock();
r.lock();
try {
return super.get(key);
} finally {
r.unlock();
}
}
.... // the same approach distinguishing read and write operations
}
UPDATE:
I will be calling a lot of getter of this object and they would not be at the same time
It doesn't guarantee that you don't need synchronization.
Unless you need to acquire a lock on the whole map for some reason (unlikely) you should go for ConcurrentHashMap which gives much better scalability.
Hashtable and the synchronized wrapper (Collections.synchronizedMap(new HashMap())) use one lock, whereas ConcurrentHashMap (up to Java 7) partitions the map into 16 segments by default, each with its own lock, which gives much better concurrent access.
Although Hashtable is thread-safe, that doesn't guarantee that your whole code is thread-safe. Hashtable also has some performance issues. So you should use HashMap, but then you have to manage all thread safety yourself.