I am going to parallelize some code that uses some global variables.
I am going to use ReentrantReadWriteLock.
Did I understand it right that I need a separate instance of ReentrantReadWriteLock per variable I want to make thread-safe?
I mean, when I have two lists where every thread can append an item and all threads sometimes read items from those lists.
In that case I would implement something like:
private static String[] globalVariables = null;
private static String[] processedItems = null;
private final ReentrantReadWriteLock globalVariablesLock = new ReentrantReadWriteLock();
private final Lock globalVariablesReadLock = globalVariablesLock.readLock();
private final Lock globalVariablesWriteLock = globalVariablesLock.writeLock();
private final ReentrantReadWriteLock processedItemsLock = new ReentrantReadWriteLock();
private final Lock processedItemsLockReadLock = processedItemsLock.readLock();
private final Lock processedItemsLockWriteLock = processedItemsLock.writeLock();
What if I have many more variables, like database connection (pool)s, loggers, further lists, etc.?
Do I need to create a new ReentrantReadWriteLock for each one, or am I missing something?
Samples on the internet only handle one variable.
Thanks in advance.
What are you trying to protect?
Don't think of locking variables. The purpose of a lock is to protect an invariant. An invariant is some assertion that you can make about the state of your program that must always be true. An example might be, "the sum of variables A, B, and C will always be zero."
In that case, it doesn't do you any good to have separate locks for A, B, and C. You want one lock that protects that particular invariant. Any thread that wants to change A, B, or C must lock that lock, and any thread that depends on their sum being zero must lock that same lock.
Often it is not possible for a thread to make progress without temporarily breaking some invariant. E.g.,
A += 1; //breaks the invariant
B -= 1; //fixes it again.
Without synchronization, some other thread could examine A, B, and C in-between those two statements, and find the invariant broken.
With synchronization:
private final Object zeroSumLock = new Object();

void bumpA() {
    synchronized (zeroSumLock) {
        A += 1;
        B -= 1;
    }
}

boolean verifySum() {
    synchronized (zeroSumLock) {
        return (A + B + C) == 0;
    }
}
Yes, you should have one Lock per thread-safe variable (arrays in your case). However, consider using
List<String> syncList = Collections.synchronizedList(new ArrayList<String>());
instead of arrays. It is usually much better to delegate to the library (in this case, not only the synchronization but also the resizing of the array). Of course, before doing so, check that the library does exactly what you expect (in this case, as @SashaSalauyou pointed out, you'd lose the ability to read concurrently).
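To illustrate that caveat, here is a minimal sketch (the class and method names are illustrative). Collections.synchronizedList makes each individual call such as add() thread-safe, but iteration is a compound action and, per the Collections Javadoc, still has to be synchronized manually on the list itself:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SyncListSketch {
    private static final List<String> syncList =
            Collections.synchronizedList(new ArrayList<String>());

    static void addItem(String item) {
        syncList.add(item); // safe: the wrapper synchronizes each individual call
    }

    static void printAll() {
        synchronized (syncList) { // required: iteration spans many calls
            for (String s : syncList) {
                System.out.println(s);
            }
        }
    }
}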
One solution is to create an immutable Map in which you put locks for all the items you need:
final static Map<String, ReadWriteLock> locks = Collections.unmodifiableMap(
    new HashMap<String, ReadWriteLock>() {{
        put("globalVariables", new ReentrantReadWriteLock());
        put("processedItems", new ReentrantReadWriteLock());
        // rest of the items
    }}
);
Because the HashMap is wrapped by Collections.unmodifiableMap() and is never modified after construction, it is safe to share between threads.
Then, in code:
Lock lo = locks.get("globalVariables").readLock();
lo.lock();
try {
    // ...
} catch (Exception e) {
    // ...
} finally {
    lo.unlock();
}
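The write path follows the same pattern with the write lock (a minimal sketch, reusing the "globalVariables" key from above):
Lock wl = locks.get("globalVariables").writeLock();
wl.lock();
try {
    // mutate globalVariables here
} finally {
    wl.unlock();
}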
Can someone explain the output of the following program:
import java.util.ArrayList;
import java.util.Random;

public class DataRace extends Thread {
    static ArrayList<Integer> arr = new ArrayList<>();

    public void run() {
        Random random = new Random();
        int local = random.nextInt(10) + 1;
        arr.add(local);
    }

    public static void main(String[] args) {
        DataRace t1 = new DataRace();
        DataRace t2 = new DataRace();
        DataRace t3 = new DataRace();
        DataRace t4 = new DataRace();
        t1.start();
        t2.start();
        t3.start();
        t4.start();
        try {
            t1.join();
            t2.join();
            t3.join();
            t4.join();
        } catch (InterruptedException e) {
            System.out.println("interrupted");
        }
        System.out.println(DataRace.arr);
    }
}
Output (from different runs):
[8, 5]
[9, 2, 2, 8]
[2]
I am having trouble understanding the varying number of values in my output. I would expect the main thread either to wait until all threads have finished execution, since I am joining them in the try-catch block, and then output four values (one from each thread), or to print "interrupted" to the console in case of an interruption. Neither is really happening here.
If this is due to a data race in multithreading, how does that come into play here?
The main problem is that multiple threads are adding to the same shared ArrayList concurrently. ArrayList is not thread-safe. From its documentation one can read:
Note that this implementation is not synchronized.
If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list.
In your code, every time you call
arr.add(local);
the add method implementation updates, among other things, a field that keeps track of the size of the list. The relevant part of ArrayList's add method is shown below:
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1; // <-- unsynchronized write to the size field
}
where the size field is declared as:
/**
 * The size of the ArrayList (the number of elements it contains).
 *
 * @serial
 */
private int size;
Notice that the add method is not synchronized, nor is the size field marked volatile. Hence, it is susceptible to race conditions.
Therefore, because you did not ensure mutual exclusion on accesses to the ArrayList (e.g., by surrounding the calls with a synchronized block), and because ArrayList does not update the size field atomically, threads might see outdated values of size and add elements into positions that other threads have already written to. In the extreme, all threads may end up adding an element at the same position (as in your output [2]).
The aforementioned race condition leads to undefined behavior, which is why:
System.out.println(DataRace.arr);
outputs a different number of elements in different executions of your code.
To make the ArrayList thread-safe, or for alternatives, have a look at the following SO thread: How do I make my ArrayList Thread-Safe?, which showcases the use of Collections.synchronizedList() and CopyOnWriteArrayList, among others.
An example of ensuring mutual exclusion of the accesses to the arr structure:
public void run() {
    Random random = new Random();
    int local = random.nextInt(10) + 1;
    synchronized (arr) {
        arr.add(local);
    }
}
or:
static final List<Integer> arr = Collections.synchronizedList(new ArrayList<Integer>());

public void run() {
    Random random = new Random();
    int local = random.nextInt(10) + 1;
    arr.add(local);
}
TL;DR
ArrayList is not thread-safe. Therefore its behaviour under a race condition is undefined. Use synchronized or CopyOnWriteArrayList instead.
Longer answer
ArrayList.add ultimately calls this private method:
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}
When two threads reach this point at the "same" time, they will see the same size (s), and both will try to add an element at the same position and update the size to s + 1, so the result of the second one will likely win.
If the capacity of the ArrayList is reached and it has to grow(), a new, bigger array is created and the contents are copied, likely causing any changes made concurrently to be lost (it is possible that multiple threads will be trying to grow at the same time).
Alternatives here are to use a monitor (a.k.a. synchronized), or to use thread-safe alternatives such as CopyOnWriteArrayList.
I think there are a lot of similar or closely related questions; for example, see this one.
Basically, the reason for this "unexpected" behaviour is that ArrayList is not thread-safe. You can try List<Integer> arr = new CopyOnWriteArrayList<>() and it will work as expected. This data structure is recommended when read operations are frequent and write operations are relatively rare. For a good explanation see What is CopyOnWriteArrayList in Java - Example Tutorial.
Another option is to use List<Integer> arr = Collections.synchronizedList(new ArrayList<>()).
You can also use Vector but it is not recommended (see here).
This article also will be useful - Vector vs ArrayList in Java.
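For illustration, a minimal sketch of the example rewritten with CopyOnWriteArrayList (DataRaceFixed is just an illustrative name; the rest of the program stays as in the question). Each add() copies the backing array under an internal lock, so concurrent writes cannot lose updates, at the cost of more expensive writes:
import java.util.List;
import java.util.Random;
import java.util.concurrent.CopyOnWriteArrayList;

public class DataRaceFixed extends Thread {
    static final List<Integer> arr = new CopyOnWriteArrayList<>();

    public void run() {
        Random random = new Random();
        int local = random.nextInt(10) + 1;
        arr.add(local); // safe: each add atomically swaps in a new backing array
    }
}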
I was thinking about how to solve a race condition between two threads that try to write to the same variable by using immutable objects, without the help of any keywords such as synchronized (lock) / volatile in Java.
But I couldn't figure it out. Is it possible to solve this problem with such a solution at all?
public class Test {

    private static IAmSoImmutable iAmSoImmutable;

    private static final Runnable increment1000Times = () -> {
        for (int i = 0; i < 1000; i++) {
            iAmSoImmutable.increment();
        }
    };

    public static void main(String... args) throws Exception {
        for (int i = 0; i < 10; i++) {
            iAmSoImmutable = new IAmSoImmutable(0);

            Thread t1 = new Thread(increment1000Times);
            Thread t2 = new Thread(increment1000Times);
            t1.start();
            t2.start();
            t1.join();
            t2.join();

            // Prints a different result every time -- why? :
            System.out.println(iAmSoImmutable.value);
        }
    }

    public static class IAmSoImmutable {
        private int value;

        public IAmSoImmutable(int value) {
            this.value = value;
        }

        public IAmSoImmutable increment() {
            return new IAmSoImmutable(++value);
        }
    }
}
If you run this code you'll get different answers every time, which mean a race condition is happening.
You cannot solve a race condition without using any of the existing synchronisation (or volatile) techniques. That is what they were designed for. If it were possible, there would be no need for them.
More particularly, your code seems to be broken. This method:
public IAmSoImmutable increment() {
    return new IAmSoImmutable(++value);
}
is nonsense for two reasons:
1) It breaks the immutability of the class, because it changes the object's value field.
2) Its result - a new instance of IAmSoImmutable - is never used.
The fundamental problem here is that you've misunderstood what "immutability" means.
"Immutability" means — no writes. Values are created, but are never modified.
Immutability ensures that there are no race conditions, because race conditions are always caused by writes: either two threads performing writes that aren't consistent with each other, or one thread performing writes and another thread performing reads that give inconsistent results, or similar.
(Caveat: even an immutable object is effectively mutable during construction — Java creates the object, then populates its fields — so in addition to being immutable in general, you need to use the final keyword appropriately and take care with what you do in the constructor. But, those are minor details.)
With that understanding, we can go back to your initial sentence:
I was thinking about how to solve a race condition between two threads that try to write to the same variable by using immutable objects, without the help of any keywords such as synchronized (lock) / volatile in Java.
The problem here is that you actually aren't using immutable objects: your entire goal is to perform writes, and the entire concept of immutability is that no writes happen. These are not compatible.
That said, immutability certainly has its place. You can have immutable IAmSoImmutable objects, with the only writes being that you swap these objects out for each other. That helps simplify the problem, by reducing the scope of writes that you have to worry about: there's only one kind of write. But even that one kind of write will require synchronization.
The best approach here is probably to use an AtomicReference<IAmSoImmutable>. This provides a non-blocking way to swap out your IAmSoImmutable-s, while guaranteeing that no write gets silently dropped.
(In fact, in the special case that your value is just an integer, the JDK provides AtomicInteger that handles the necessary compare-and-swap loops and so on for threadsafe incrementation.)
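A minimal sketch of that approach (Counter here is a hypothetical stand-in for a truly immutable IAmSoImmutable; AtomicReference.updateAndGet retries a compare-and-swap internally until it succeeds):
import java.util.concurrent.atomic.AtomicReference;

public class AtomicImmutableSketch {

    // A truly immutable value object: the field is final and never changes.
    static final class Counter {
        final int value;
        Counter(int value) { this.value = value; }
        Counter increment() { return new Counter(value + 1); }
    }

    // The only mutable state is this reference; swapping it is atomic.
    private static final AtomicReference<Counter> ref = new AtomicReference<>(new Counter(0));

    static void increment() {
        ref.updateAndGet(Counter::increment); // retried until the CAS succeeds
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable increment1000Times = () -> {
            for (int i = 0; i < 1000; i++) {
                increment();
            }
        };
        Thread t1 = new Thread(increment1000Times);
        Thread t2 = new Thread(increment1000Times);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(ref.get().value); // always 2000
    }
}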
Even if the problems are resolved by:
Avoiding the change of IAmSoImmutable.value
Reassigning the new object created within increment() back into the iAmSoImmutable reference.
There are still pieces of your code that are not atomic and need some sort of synchronization.
A solution would, of course, be to use a synchronized method:
public synchronized static void increment() {
    iAmSoImmutable = iAmSoImmutable.increment();
}

Thread t1 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});

Thread t2 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});
I am confused about sharing arrays safely between threads in Java, specifically memory fences and the keyword synchronized.
This Q&A is helpful, but does not answer all of my questions: Java arrays: synchronized + Atomic*, or synchronized suffices?
Assume there is a pool of worker threads that populates the SharedTable via the addEntry(...) method. After all worker threads are done, a final thread reads and saves the data.
Sample code to demonstrate the issue:
public final class SharedTable {

    // Column-oriented data entries
    private final String[] data1Arr;
    private final int[] data2Arr;
    private final long[] data3Arr;
    private final AtomicInteger nextIndex;

    public SharedTable(int size) {
        this.data1Arr = new String[size];
        this.data2Arr = new int[size];
        this.data3Arr = new long[size];
        this.nextIndex = new AtomicInteger(0);
    }

    // Thread-safe: Called by worker threads
    public void addEntry(String data1, int data2, long data3) {
        final int index = nextIndex.getAndIncrement();
        data1Arr[index] = data1;
        data2Arr[index] = data2;
        data3Arr[index] = data3;
    }

    // Not thread-safe: Called by clean-up/joiner/collator thread...
    // after worker threads are complete
    public void save() {
        // Does this induce a full memory fence to ensure thread-safe reading of the arrays?
        synchronized (this) {
            final int usedSize = nextIndex.get();
            for (int i = 0; i < usedSize; ++i) {
                final String data1 = data1Arr[i];
                final int data2 = data2Arr[i];
                final long data3 = data3Arr[i];
                // TODO: Save data here
            }
        }
    }
}
The sample code above could also be implemented using Atomic*Array, which acts as an "array of volatile values/references".
public final class SharedTable2 {

    // Column-oriented data entries
    private final AtomicReferenceArray<String> data1Arr;
    private final AtomicIntegerArray data2Arr;
    private final AtomicLongArray data3Arr;
    private final AtomicInteger nextIndex;

    public SharedTable2(int size) { ... }

    // Thread-safe: Called by worker threads
    public void addEntry(String data1, int data2, long data3) {
        final int index = nextIndex.getAndIncrement();
        data1Arr.set(index, data1);
        ...
    }

    // Not thread-safe: Called by clean-up/joiner/collator thread...
    // after worker threads are complete
    public void save() {
        final int usedSize = nextIndex.get();
        for (int i = 0; i < usedSize; ++i) {
            final String data1 = data1Arr.get(i);
            final int data2 = data2Arr.get(i);
            final long data3 = data3Arr.get(i);
            // TODO: Save data here
        }
    }
}
Is SharedTable thread-safe (and cache coherent)?
Is SharedTable (much?) more efficient as only a single memory fence is required, whereas SharedTable2 invokes a memory fence for each call to Atomic*Array.set(...)?
If it helps, I am using Java 8 on 64-bit x86 hardware (Windows and Linux).
No, SharedTable is not thread-safe. A happens-before relationship is only guaranteed if you read, from a synchronized block, something that was written from a synchronized block using the same lock.
Since the writes are made outside a synchronized block, the JMM doesn't guarantee that they will be visible to the reader thread.
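One way to obtain that guarantee (a sketch, not the only option, and it does introduce contention between workers) is to perform the writes under the same lock that save() synchronizes on:
// A minimal sketch, assuming the SharedTable fields from the question.
// Writes and reads now use the same monitor, so the monitor lock rule
// establishes the required happens-before edge.
public void addEntry(String data1, int data2, long data3) {
    synchronized (this) {
        final int index = nextIndex.getAndIncrement();
        data1Arr[index] = data1;
        data2Arr[index] = data2;
        data3Arr[index] = data3;
    }
}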
Mutating an object that is exchanged between threads can be done outside of a synchronized block.
Let's first introduce a very practical example. Imagine you have 2 threads; one produces jobs and the other consumes jobs. These threads communicate with each other using a queue. Let's assume a BlockingQueue. Then the producer thread can use simple POJO objects that do not have any internal synchronization and safely exchange these POJOs with the consumer thread. This is exactly how Java Executors work. In the documentation, you will find something about the memory consistency effects.
Why does it work?
There needs to be a happens-before edge between writing of the fields of the job and reading the fields of the job.
class Job { int a; }

queue = new SomeBlockingQueue();

thread1:
    job = new Job();
    job.a = 1;          (1)
    queue.put(job);     (2)

thread2:
    job = queue.take(); (3)
    r1 = job.a;         (4)
There is a happens-before edge between (1) and (2) due to program order rule.
There is a happens-before edge between (2) and (3) due to either the monitor lock rule or volatile variable rule.
There is a happens-before edge between (3) and (4) due to the program order rule.
Because the happens-before relation is transitive, there is a happens-before edge between (1) and (4) and hence there is no data race.
So the above code will work fine. But if the producer modifies the Job after it has put it on the queue, then there could be a data race. So you need to make sure your code doesn't suffer from that problem.
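A minimal runnable sketch of the same idea (the class and variable names are illustrative; ArrayBlockingQueue is just one BlockingQueue implementation):
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoff {

    // Plain POJO with no internal synchronization.
    static final class Job { int a; }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Job> queue = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            Job job = new Job();
            job.a = 1;                    // (1) write before publishing
            try {
                queue.put(job);           // (2) publish via the queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                Job job = queue.take();   // (3) receive via the queue
                int r1 = job.a;           // (4) guaranteed to see the write from (1)
                System.out.println(r1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}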
Is this as safe as using an AtomicReference?
private volatile String myMember;

public void setMyMember(String s) {
    myMember = s;
}
vs.
private final AtomicReference<String> myMember = new AtomicReference<>();

public void setMyMember(String s) {
    while (true) {
        String current = myMember.get();
        if (myMember.compareAndSet(current, s))
            break;
    }
}
Your code is "safe" but doesn't do the same thing as the AtomicReference code. Typically, the AtomicReference loop with compareAndSet is used when someone is trying to add something to a list or object and they want to protect against the race conditions with multiple threads.
For example:
private final AtomicReference<List<String>> listRef = new AtomicReference<>();
...
while (true) {
    List<String> currentList = listRef.get();
    List<String> newList = new ArrayList<String>(currentList);
    newList.add(stringToAdd);
    // if we update the list reference, make sure we don't overwrite another one
    if (listRef.compareAndSet(currentList, newList))
        break;
}
In your case, since you are using a simple String object, just making it volatile will be fine. There is no point in doing the compareAndSet. If you still want to use AtomicReference, then just call myMember.set(...).
Your first code snippet is completely thread-safe and is enough, because String is thread-safe and assigning to a volatile variable is atomic.
The second one doesn't make much sense; such a construct is used internally, e.g. in AtomicInteger, to avoid losing updates in a concurrent environment. volatile is fine in your case.
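For reference, the simpler form the first answer mentions (just calling set(...) on the AtomicReference) might look like this sketch:
private final AtomicReference<String> myMember = new AtomicReference<>();

public void setMyMember(String s) {
    myMember.set(s); // a plain volatile-style write; no compare-and-set loop needed
}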
I have code similar to the following:
public class Cache {

    private final Object lock = new Object();
    private HashMap<Integer, TreeMap<Long, Integer>> cache =
            new HashMap<Integer, TreeMap<Long, Integer>>();
    private AtomicLong FREESPACE = new AtomicLong(102400);

    private void putInCache(TreeMap<Long, Integer> tempMap, int fileNr) {
        int length; // holds the length of data in tempMap
        synchronized (lock) {
            if (checkFreeSpace(length)) {
                cache.get(fileNr).putAll(tempMap);
                FREESPACE.getAndAdd(-length);
            }
        }
    }

    private boolean checkFreeSpace(int length) {
        while (FREESPACE.get() < length && thereIsSomethingToDelete()) {
            // deleteSomething returns the length of deleted data or 0 if
            // it could not delete anything
            FREESPACE.getAndAdd(deleteSomething(length));
        }
        if (FREESPACE.get() < length) return true;
        return false;
    }
}
putInCache is called by about 139 threads per second. Can I be sure that these two methods will synchronize on both cache and FREESPACE? Also, is checkFreeSpace() multithread-safe, i.e., can I be sure that there will be only one invocation of this method at a time? Can the "multithread-safety" of this code be improved?
To have your question answered fully, you would need to show the implementations of the thereIsSomethingToDelete() and deleteSomething() methods.
Given that checkFreeSpace is a public method (does it really need to be?), and is unsynchronized, it is possible it could be called by another thread while the synchronized block in the putInCache() method is running. This by itself might not break anything, since it appears that the checkFreeSpace method can only increase the amount of free space, not reduce it.
What would be more serious (and the code sample doesn't allow us to determine this) is if the thereIsSomethingToDelete() and deleteSomething() methods don't properly synchronize their access to the cache object, using the same Object lock as used by putInCache().
You don't usually synchronize on the fields you want to control access to directly.
The fields that you want to synchronize access to must only be accessed from within synchronized blocks (on the same object) to be considered thread safe. You are already doing this in putInCache().
Therefore, because checkFreeSpace() accesses shared state in an unsynchronized fashion, it is not thread safe.
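A minimal sketch of one way to apply that advice, assuming the fields and the thereIsSomethingToDelete()/deleteSomething() helpers from the question (the return condition is also flipped here, on the assumption that checkFreeSpace is meant to return true when there is enough space, which looks like a typo in the original):
// Synchronizing on the same `lock` object makes every access to FREESPACE
// (and anything the helpers touch) happen inside a synchronized block.
// Intrinsic locks are reentrant, so the existing call from inside
// putInCache()'s synchronized block still works.
private boolean checkFreeSpace(int length) {
    synchronized (lock) {
        while (FREESPACE.get() < length && thereIsSomethingToDelete()) {
            FREESPACE.getAndAdd(deleteSomething(length));
        }
        return FREESPACE.get() >= length;
    }
}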