Why below class is not thread safe ?
public class UnsafeCachingFactorizer implements Servlet {
private final AtomicReference<BigInteger> lastNumber = new AtomicReference<>();
private final AtomicReference<BigInteger[]> lastFactors = new AtomicReference<>();
public void service(ServletRequest req, ServletResponse resp) {
BigInteger i = extractFromRequest(req);
if i.equals(lastNumber.get())) {
encodeIntoResponse(resp, lastFactors.get());
}
else {
BigInteger[] factors = factor(i);
lastNumber.set(i);
lastFactors.set(factors);
encodeIntoResponse(resp, factors);
}
}
}
Instance variables are thread safe, then why the whole class is not thread safe ?
It's not thread safe because you don't always get the right answer when multiple threads call the code.
Let's say that lastNumber=1 and lastFactors=factors(1). In the one-thread case, where the thread calls with i=1:
T1: if (lastNumber.get().equals(1)) { // true
T1: encodeIntoResponse(resp, lastFactors.get());
Fine, this is the expected result. But consider a multi-threaded case, where the actions within each thread takes place in the same order, but can arbitrarily interleave. One such interleaving is (where i=1 and i=2 for the two threads respectively):
T1: if (lastNumber.get().equals(1)) { // true
T2: if (lastNumber.get().equals(2)) { // false
T2: } else {
T2: lastNumber.set(2);
T2: lastFactors.set(factors(2));
T1: encodeIntoResponse(resp, lastFactors.get()); // oops! You wrote the factors(2), not factors(1).
The problem is that you're not getting and setting the AtomicReferences atomically: that is, there is nothing to stop another thread sneaking in and changing the values (of one or either) between the get and the set.
In general, whilst individual calls to methods on an AtomicReference are atomic, multiple calls are not (and they definitely aren't atomic between instances of AtomicReference). So, if you ever find yourself writing code like:
if (/* some condition with ref.get() */) {
/* some statement with ref.set() */
}
then you probably aren't using AtomicReference correctly (or, at least, it's not thread-safe).
To fix this, you need something that can be read and set atomically. For example, create a simple class to hold both:
class Holder {
// Add a ctor to initialize these.
final BigInteger number;
final BigInteger[] factors;
}
Then store this in a single AtomicReference, and use updateAndGet:
BigInteger[] factors = holderRef.updateAndGet(h -> {
if (h != null && h.number.equals(i)) {
return h;
}
return new Holder(i, factor(i));
}).factors;
encodeIntoResponse(resp, factors);
Upon reflection, updateAndGet isn't necessarily the right way to do this. If factors sometimes takes a long time to compute, then a long-time computation might get done many times, because lots of other shorter-time computations preempt it, so the update function keeps having to be called.
Instead, you can just always set the reference if you had to recompute it:
Holder h = holderRef.get();
if (h == null || !h.number.equals(i)) {
h = new Holder(i, factors(i));
holderRef.set(h);
}
return h.factors;
This may seem to violate what I said previously, in that separate calls to holderRef are not atomic, and thus not thread-safe.
It's a bit more nuanced, however: my first paragraph states that the lack of thread safety in the original code stems from the fact that you might get the factors for the wrong input. This problem doesn't occur here: you either get the holder for the right number (and hence the factors for the right number), or you compute the factors for the input.
The issue arises in what this holder is actually meant to be storing: the "last" number/factors is rather hard to define in terms of multithreading. When are you measuring "last-ness" from? The most recent call to start? The most recent call to finish? Other?
This code simply stores "a" previously computed value, without attempting to nail down this ambiguity.
Related
Based on our Crashlytics logs it seems that we're running into the following exception from time to time:
Fatal Exception: java.lang.IllegalArgumentException
Illegal initial capacity: -1
...
java.util.HashMap.<init> (HashMap.java:448)
java.util.LinkedHashMap.<init> (LinkedHashMap.java:371)
java.util.HashSet.<init> (HashSet.java:161)
java.util.LinkedHashSet.<init> (LinkedHashSet.java:146)
kotlin.collections.CollectionsKt___CollectionsKt.toSet (CollectionsKt___CollectionsKt.java:1316)
But we're not sure when it is possible that this exception is actually thrown. The relevant code for this statement looks something like this:
private val markersMap = mutableMapOf<Any, Marker>()
...
synchronized(markersMap) {
val currentMarkers = markersMap.values.toSet() //it crashes here
// performing some operation on the markers
}
Right now we're suspecting multithreading to cause the issue as the markersMap is modified in multiple places, but as the map is already initialized by default we're not really sure how it can end up in less than an empty state. We also took a look at the toSet implementation:
if (this is Collection) {
return when (size) {
0 -> emptySet()
1 -> setOf(if (this is List) this[0] else iterator().next())
else -> toCollection(LinkedHashSet<T>(mapCapacity(size)))
}
}
Based on this, we'd assume that mapCapacity(size) returns -1, but we weren't able to find the actual implementation of mapCapacity to verify when this can happen.
Does anybody know when -1 is returned here, which in turn causes the constructor to fail?
Java collections are not synchronized and if you need to access a Map or any collection from multiple threads then you are required to take care of synchonization. as stated in LinkedHashMap's header
Note that this implementation is not synchronized.If multiple threads
access a linked hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
My guess is that you are probably performing structural modifications(mix of put and remove) on the Map without synchronization, which can cause this issue. for example
fun main(){
val markersMap = mutableMapOf<Any, Any>()
(1..1000).forEach { markersMap.put(it, "$it") }
val t1 = Thread{
(1..1000).forEach { markersMap.remove(it)
if(markersMap.size < 0){
print("SIZE IS ${markersMap.size}")
}
}
}
val t2 = Thread{
(1..1000).forEach {
markersMap.remove(it)
if(markersMap.size < 0){
print("SIZE IS ${markersMap.size}")
}
}
}
t1.start()
t2.start()
}
On my machine this code prints SIZE IS -128, SIZE IS -127 and lot many other negative values and when I added markersMap.values.toSet() inside one of the if blocks, this happened
A bit late, I have a Christmas special for you. There is a Santa class with an ArrayList of presents and a Map to keep track which children already have got their presents. Children modeled as threads constantly asking Santa for presents at the same time. For simplicity, each child receives exactly one (random) present.
Here is the method in the Santa class occasionally yielding a IllegalArgumentException because presents.size() is negative.
public Present givePresent(Child child) {
if(gotPresent.containsKey(child) && !gotPresent.get(child)) {
synchronized(this) {
gotPresent.put(child, true);
Random random = new Random();
int randomIndex = random.nextInt(presents.size());
Present present = presents.get(randomIndex);
presents.remove(present);
return present;
}
}
return null;
}
However, making the whole method synchronized works just fine. I don't really understand the problem with the smaller sized synchronized block shown before. From my point of view, it should still assure that a present isn't assigned to a kid multiple times and there shouldn't be concurrent writes (and also reads) on the presents ArrayList. Could you please tell me why my assumption is wrong?
That happens because the code contains a race condition. Let us use the following example to illustrate that race condition.
Imagine that Thread 1 reads
`if(gotPresent.containsKey(child) && !gotPresent.get(child))`
and it evaluates as true. While Thread 1 enters the synchronized block, another thread (i.e., Thread 2) also reads
if(gotPresent.containsKey(child) && !gotPresent.get(child))
before Thread 1 has had the time to do gotPresent.put(child, true);. Consequently, the aforementioned if also evaluates as true for Thread 2.
Thread 1 is inside the synchronized(this) and removes the present from the list of presents (i.e., presents.remove(present);). Now the size of the present list is 0. Thread 1 exits the synchronized block, while Thread 2 just enters it, and eventually calls
int randomIndex = random.nextInt(presents.size());
since presents.size() will return 0, and the random.nextInt implementation is as follows:
public int nextInt(int bound) {
if (bound <= 0)
throw new IllegalArgumentException(BadBound);
...
}
you get the IllegalArgumentException exception.
However, making the whole method synchronized works just fine.
Yes, because with
synchronized(this) {
if(gotPresent.containsKey(child) && !gotPresent.get(child)) {
gotPresent.put(child, true);
Random random = new Random();
int randomIndex = random.nextInt(presents.size());
Present present = presents.get(randomIndex);
presents.remove(present);
return present;
}
}
in the aforementioned race-condition example Thread 2 would have been waiting before the
if(gotPresent.containsKey(child) && !gotPresent.get(child))
and because Thread 1, before exiting the synchronized block, would have done
gotPresent.put(child, true);
by the time Thread 2 would have entered the synchronized block the following statement
!gotPresent.get(child)
would have evaluated as false, and consequently Thread 2 would have exit immediately without calling int randomIndex = random.nextInt(presents.size()); with a list of size 0.
Since the method that you have shown is being executed in parallel by multiple threads you should ensure mutual exclusion of the shared data structure among threads, namely gotPresent and presents. Which implies, for instance, that operations like containsKey, get, and put should be performed within the same synchronized block.
I am unsure of how lambdas work in practice, and I am concerned since under certain circumstances, lambdas can result in errors such as ConcurrentModificationExceptions if you use them incorrectly, which seems to be indicative of a race condition.
Consider the code below.
private class deltaCalculator{
Double valueA;
Double valueB;
//Init delta
volatile Double valueDelta = null;
private void calculateMinimum(List<T> dataSource){
dataSource.forEach((entry -> {
valueA = entry.getA();
valueB = entry.getB();
Double dummyDelta;
dummyDelta = Math.abs(valueA - valueB);
if(valueDelta == null){
setDelta(dummyDelta);
}else {
setDelta((valueDelta > dummyDelta) ? dummyDelta : valueDelta);
}
}));
}
private void setDelta(Double d){
this.valueDelta = d;
}
}
How does the forEach loop operate? Do different calls get passed to different threads where the JVM considers it appropriate, opening up the possibility of a race condition that could lead to incorrect minimum calculation?
If not, why can a forEach lambda throw a ConcurrentModificationException?
You'll get a ConcurrentModificationException if you try to modify the collection that you're iterating over while the for each loop runs. This could be done in a separate thread entirely, but much more commonly occurs when you try to modify the collection in the loop body.
Do different calls get passed to different threads where the JVM considers it appropriate, opening up the possibility of a race condition that could lead to incorrect minimum calculation?
No. No multithreading is taking place in your example above.
I understand the overall concepts of multi-threading and synchronization but am new to writing thread-safe code. I currently have the following code snippet:
synchronized(compiledStylesheets) {
if(compiledStylesheets.containsKey(xslt)) {
exec = compiledStylesheets.get(xslt);
} else {
exec = compile(s, imports);
compiledStylesheets.put(xslt, exec);
}
}
where compiledStylesheets is a HashMap (private, final). I have a few questions.
The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative. Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct? This is the only code that hits this object other than initialization/instantiation.
Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill. The putIfAbsent() method will not be usable in this instance because it doesn't allow me to skip the compile() method call. I also don't know if it will solve the "modified after containsKey() but before put()" problem, or if that's even really a concern in this case.
Edit: Spelling
For tasks of this nature, I highly recommend Guava caching support.
If you can't use that library, here is a compact implementation of a Multiton. Use of the FutureTask was a tip from assylias, here, via OldCurmudgeon.
public abstract class Cache<K, V>
{
private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
public final V get(K key)
throws InterruptedException, ExecutionException
{
Future<V> ref = cache.get(key);
if (ref == null) {
FutureTask<V> task = new FutureTask<>(new Factory(key));
ref = cache.putIfAbsent(key, task);
if (ref == null) {
task.run();
ref = task;
}
}
return ref.get();
}
protected abstract V create(K key)
throws Exception;
private final class Factory
implements Callable<V>
{
private final K key;
Factory(K key)
{
this.key = key;
}
#Override
public V call()
throws Exception
{
return create(key);
}
}
}
I think you are looking for a Multiton.
There's a very good Java one here that #assylas posted some time ago.
You can loosen the lock at the risk of an occasional doubly compiled stylesheet in race condition.
Object y;
// lock here if needed
y = map.get(x);
if(y == null) {
y = compileNewY();
// lock here if needed
map.put(x, y); // this may happen twice, if put is t.s. one will be ignored
y = map.get(x); // essential because other thread's y may have been put
}
This requires get and put to be atomic, which is true in the case of ConcurrentHashMap and you can achieve by wrapping individual calls to get and put with a lock in your class. (As I tried to explain with "lock here if needed" comments - the point being you only need to wrap individual calls, not have one big lock).
This is a standard thread safe pattern to use even with ConcurrentHashMap (and putIfAbsent) to minimize the cost of compiling twice. It still needs to be acceptable to compile twice sometimes, but it should be okay even if expensive.
By the way, you can solve that problem. Usually the above pattern isn't used with a heavy function like compileNewY but a lightweight constructor new Y(). e.g. do this:
class PrecompiledY {
public volatile Y y;
private final AtomicBoolean compiled = new AtomicBoolean(false);
public void compile() {
if(!compiled.getAndSet(true)) {
y = compile();
}
}
}
// ...
ConcurrentMap<X, PrecompiledY> myMap; // alternatively use proper locking
py = map.get(x);
if(py == null) {
py = new PrecompiledY(); // much cheaper than compiling
map.put(x, y); // this may happen twice, if put is t.s. one will be ignored
y = map.get(x); // essential because other thread's y may have been put
y.compile(); // object that didn't get inserted never gets compiled
}
Also:
Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill.
Given that your code is heavily locking, ConcurrentHashMap is almost certainly far faster, so not overkill. (And much more likely to be bug-free. Concurrency bugs are not fun to fix.)
Please see Erickson's comment below. Using double-checked locking with Hashmaps is not very smart
The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative.
You can use double-checked locking, and note that you don't need any lock before get since you never remove anything from the map.
if(compiledStylesheets.containsKey(xslt)) {
exec = compiledStylesheets.get(xslt);
} else {
synchronized(compiledStylesheets) {
if(compiledStylesheets.containsKey(xslt)) {
// another thread might have created it while
// this thread was waiting for lock
exec = compiledStylesheets.get(xslt);
} else {
exec = compile(s, imports);
compiledStylesheets.put(xslt, exec);
}
}
}
}
Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct?
Correct
This is the only code that hits this object other than initialization/instantiation.
First of all, the code as you posted it is race-condition-free because containsKey() result will never change while compile() method is running.
Collections.synchronizedMap() is useless for your case as stated above because it wraps all map methods into a synchronized block using either this as a mutex or another object you provided (for two-argument version).
IMO using ConcurrentHashMap is also not an option because it stripes locks based on key hashCode() result; its concurrent iterators is also useless here.
If you really want compile() out of synchronized block, you may pre-calculate if before checking containsKey(). This may draw the overall performance back, but may be better than calling it in synchronized block. To make a decision, personally I would consider how often key "miss" is happening and so, which option is preferrable - keep the lock for longer times or calculate your stuff always.
I have read article concerning atomic operation in Java but still have some doubts needing to be clarified:
int volatile num;
public void doSomething() {
num = 10; // write operation
System.out.println(num) // read
num = 20; // write
System.out.println(num); // read
}
So i have done w-r-w-r 4 operations on 1 method, are they atomic operations? What will happen if multiple threads invoke doSomething() method simultaneously ?
An operation is atomic if no thread will see an intermediary state, i.e. the operation will either have completed fully, or not at all.
Reading an int field is an atomic operation, i.e. all 32 bits are read at once. Writing an int field is also atomic, the field will either have been written fully, or not at all.
However, the method doSomething() is not atomic; a thread may yield the CPU to another thread while the method is being executing, and that thread may see that some, but not all, operations have been executed.
That is, if threads T1 and T2 both execute doSomething(), the following may happen:
T1: num = 10;
T2: num = 10;
T1: System.out.println(num); // prints 10
T1: num = 20;
T1: System.out.println(num); // prints 20
T2: System.out.println(num); // prints 20
T2: num = 20;
T2: System.out.println(num); // prints 20
If doSomething() were synchronized, its atomicity would be guaranteed, and the above scenario impossible.
volatile ensures that if you have a thread A and a thread B, that any change to that variable will be seen by both. So if it at some point thread A changes this value, thread B could in the future look at it.
Atomic operations ensure that the execution of the said operation happens "in one step." This is somewhat confusion because looking at the code 'x = 10;' may appear to be "one step", but actually requires several steps on the CPU. An atomic operation can be formed in a variety of ways, one of which is by locking using synchronized:
What the volatile keyword promises.
The lock of an object (or the Class in the case of static methods) is acquired, and no two objects can access it at the same time.
As you asked in a comment earlier, even if you had three separate atomic steps that thread A was executing at some point, there's a chance that thread B could begin executing in the middle of those three steps. To ensure the thread safety of the object, all three steps would have to be grouped together to act like a single step. This is part of the reason locks are used.
A very important thing to note is that if you want to ensure that your object can never be accessed by two threads at the same time, all of your methods must be synchronized. You could create a non-synchronized method on the object that would access the values stored in the object, but that would compromise the thread safety of the class.
You may be interested in the java.util.concurrent.atomic library. I'm also no expert on these matters, so I would suggest a book that was recommended to me: Java Concurrency in Practice
Each individual read and write to a volatile variable is atomic. This means that a thread won't see the value of num changing while it's reading it, but it can still change in between each statement. So a thread running doSomething while other threads are doing the same, will print a 10 or 20 followed by another 10 or 20. After all threads have finished calling doSomething, the value of num will be 20.
My answer modified according to Brian Roach's comment.
It's atomic because it is integer in this case.
Volatile can only ganrentee visibility among threads, but not atomic. volatile can make you see the change of the integer, but cannot ganrentee the integration in changes.
For example, long and double can cause unexpected intermediate state.
Atomic Operations and Synchronization:
Atomic executions are performed in a single unit of task without getting affected from other executions. Atomic operations are required in multi-threaded environment to avoid data irregularity.
If we are reading/writing an int value then it is an atomic operation. But generally if it is inside a method then if the method is not synchronized many threads can access it which can lead to inconsistent values. However, int++ is not an atomic operation. So by the time one threads read it’s value and increment it by one, other thread has read the older value leading to wrong result.
To solve data inconsistency, we will have to make sure that increment operation on count is atomic, we can do that using Synchronization but Java 5 java.util.concurrent.atomic provides wrapper classes for int and long that can be used to achieve this atomically without usage of Synchronization.
Using int might create data data inconsistencies as shown below:
public class AtomicClass {
public static void main(String[] args) throws InterruptedException {
ThreardProcesing pt = new ThreardProcesing();
Thread thread_1 = new Thread(pt, "thread_1");
thread_1.start();
Thread thread_2 = new Thread(pt, "thread_2");
thread_2.start();
thread_1.join();
thread_2.join();
System.out.println("Processing count=" + pt.getCount());
}
}
class ThreardProcesing implements Runnable {
private int count;
#Override
public void run() {
for (int i = 1; i < 5; i++) {
processSomething(i);
count++;
}
}
public int getCount() {
return this.count;
}
private void processSomething(int i) {
// processing some job
try {
Thread.sleep(i * 1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
OUTPUT: count value varies between 5,6,7,8
We can resolve this using java.util.concurrent.atomic that will always output count value as 8 because AtomicInteger method incrementAndGet() atomically increments the current value by one. shown below:
public class AtomicClass {
public static void main(String[] args) throws InterruptedException {
ThreardProcesing pt = new ThreardProcesing();
Thread thread_1 = new Thread(pt, "thread_1");
thread_1.start();
Thread thread_2 = new Thread(pt, "thread_2");
thread_2.start();
thread_1.join();
thread_2.join();
System.out.println("Processing count=" + pt.getCount());
}
}
class ThreardProcesing implements Runnable {
private AtomicInteger count = new AtomicInteger();
#Override
public void run() {
for (int i = 1; i < 5; i++) {
processSomething(i);
count.incrementAndGet();
}
}
public int getCount() {
return this.count.get();
}
private void processSomething(int i) {
// processing some job
try {
Thread.sleep(i * 1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
Source: Atomic Operations in java