Java concurrency: safe publication of array

My question is extremely basic: once I have written some array values from one or more threads (phase 1), how can I 'publish' the array so that all the changes become visible to other threads (phase 2)?
I have code that does all the array writing, then all the array reading, then all the writing again, then all the reading again, and so on. I'd like to do this with multiple threads: several threads would first do the array-writing phase, then several threads would do the array-reading phase, and so on.
My concern is how to safely publish the array writes after each writing phase.
Consider the following simplified, thread-unsafe code that performs just one writing phase with a single thread and then just one reading phase with multiple threads:
ExecutorService executor = Executors.newFixedThreadPool(5);
double[] arr = new double[5];
for (int i = 0; i < 5; ++i) {
    arr[i] = 1 + Math.random();
}
for (int i = 0; i < 5; ++i) {
    final int j = i;
    executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr[j])));
}
The code normally prints non-zero values, but I understand that it might occasionally print zeros as well, because the array is not properly published by the writing thread, so some writes might not be visible to the reading threads.
I'd like to fix this problem and write the above code properly, in a thread-safe manner, i.e. make sure that all my writes are visible to the reading threads.
1. Could you advise on the best way to do so?
The concurrent collections and AtomicXxxArray are not an option for me because of performance (and also code clarity), as I have 2D arrays etc.
2. I can think of the following possible solutions, but I am not 100% sure they would work. Could you also advise on the solutions below?
Solution 1: assignment to a final array
Justification: I expect a final field to always be properly initialized with the latest writes, including everything reachable through it.
for (int i = 0; i < 5; ++i) {
    arr[i] = 1 + Math.random();
}
final double[] arr2 = arr; // <---- safe publication?
for (int i = 0; i < 5; ++i) {
    final int j = i;
    executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr2[j])));
}
Solution 2: a latch
Justification: I expect the latch to establish a perfect happens-before relationship between the writing thread(s) and the reading threads.
CountDownLatch latch = new CountDownLatch(1); // 1 = the number of writing threads
for (int i = 0; i < 5; ++i) {
    arr[i] = Math.random();
}
latch.countDown(); // <- writing is done
for (int i = 0; i < 5; ++i) {
    final int j = i;
    executor.submit(() -> {
        try { latch.await(); } catch (InterruptedException e) { ... } // happens-before(writes, reads) guarantee?
        System.out.println(String.format("arr[%s]=%s", j, arr[j]));
    });
}
Update: this answer https://stackoverflow.com/a/5173805/1847482 suggests the following solution:
volatile int guard = 0;
...
// after the writing is done: write some new value to the volatile variable
guard = guard + 1;
// just before the reading: read the volatile variable, e.g.
guard = guard + 1; // includes a read of guard
... // do the reading
This solution relies on the following rule: if thread A writes some non-volatile data and then writes to a volatile variable, thread B is guaranteed to see the non-volatile changes as well, provided it reads that volatile variable first and observes the new value.
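For illustration, here is a minimal sketch of that volatile-guard idiom applied to the array example above. The class and field names are invented for this sketch, and note the caveat from the rule: the reading side only gets the guarantee if its read of guard actually observes the writer's increment.

class Phases {
    final double[] arr = new double[5];
    volatile int guard = 0; // the volatile "guard" variable

    void writePhase() {
        for (int i = 0; i < arr.length; ++i) {
            arr[i] = 1 + Math.random();   // plain (non-volatile) writes
        }
        guard++;                          // volatile write: publishes the array writes above
    }

    void readPhase(int j) {
        int g = guard;                    // volatile read: if it observes the increment, the array writes are visible
        System.out.println("arr[" + j + "]=" + arr[j] + " (guard=" + g + ")");
    }
}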

Your first example is perfectly safe, because the tasks originate from the writer thread. As the docs say:
Actions in a thread prior to the submission of a Runnable to an Executor happen-before its execution begins.

Related

Java: Fill a matrix multi-threaded, currently not thread-safe

I'm super new to multi-threading, but I think I've got the overall idea more or less. I'm trying to fill a matrix with multiple threads, but my code is clearly not thread-safe: I get duplicate columns in my matrix, which is not the case when the matrix is filled sequentially. Below is an example block of code. Note that reader is a Scanner object and someOperationOnText(someText) returns an int[100].
int[][] mat = new int[100][100];
ExecutorService threadPool = Executors.newFixedThreadPool(8);
for (int i = 0; i < 100; i++) {
    Set<Integer> someText = new HashSet<>(reader.next());
    int lineIndex = i;
    threadPool.submit(() -> mat[lineIndex] = someOperationOnText(someText));
}
Do you see any reason why this is not thread-safe? I can't seem to get my head around it: since the reading is done outside the thread pool, I didn't think it would be at risk.
Thanks a lot for any debugging tips!
Grts
There is a happens-before between the submit calls and the execution of the lambda by a thread in the executor's thread pool. (See javadoc: "Memory consistency effects"). So that means the lambda will see the correct values for someText, mat and lineIndex.
The only thing that is not thread-safe about this is the (implied) code that uses the values in mat in the main thread. Calling shutdown() on the executor should be sufficient ... though the javadocs don't talk about the memory consistency effects of shutdown() and awaitTermination().
(By my reading of the code for ThreadPoolExecutor, the awaitTermination() method provides a happens-before between the pool threads (after completion of all tasks) and the method's return. The happens-before is due to the use of the executor's main lock to synchronize the pool shutdown. It is hard to see how they could implement shutdown correctly without this (or something equivalent), so it is more than an "implementation artefact" ... IMO.)
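For completeness, here is a minimal sketch (not from the original answer) of how the main thread could wait before reading mat, using shutdown() plus awaitTermination() as discussed above; the timeout value is arbitrary for the sketch.

// After submitting the 100 row-filling tasks exactly as in the question:
threadPool.shutdown();                              // stop accepting new tasks
try {
    if (!threadPool.awaitTermination(1, java.util.concurrent.TimeUnit.MINUTES)) {
        throw new IllegalStateException("tasks did not finish in time");
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
// Only now read mat; the happens-before discussed above makes the rows
// written by the pool threads visible to the main thread.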
Well, accessing an element of the matrix is a very fast operation, but the computation is not.
I think making access to the matrix synchronized, while letting the computation run concurrently, is the right approach for you.
int[][] mat = new int[100][100];
Object lock = new Object();
ExecutorService threadPool = Executors.newFixedThreadPool(8);
for (int i = 0; i < 100; i++) {
    Set<Integer> someText = new HashSet<>(reader.next());
    int lineIndex = i;
    threadPool.submit(() -> {
        int[] result = someOperationOnText(someText);
        synchronized (lock) {
            mat[lineIndex] = result;
        }
    });
}

How could I replace this CyclicBarrier with a lower level alternative?

I'm wondering how I can replace this CyclicBarrier with something else, like the usual joining of all threads in an array. This is the snippet (try-catch blocks etc. omitted for clarity):
protected void execute(int nGens)
{
    CyclicBarrier thread_barrier = new CyclicBarrier(n_threads + 1);
    ExecutorService thread_pool = Executors.newFixedThreadPool(n_threads);
    for (int i = 0; i < thread_array.length; i++) // all threads are in this array
        thread_pool.execute(thread_array[i]);
    for (int i = 0; i < total_generations; i++)
        thread_barrier.await();
    thread_pool.shutdown();
    while (!thread_pool.isTerminated()) {}
}
and this is the code executed by the threads
public void run()
{
    for (int i = 0; i < total_generations; i++)
    {
        next_generation(); // parallel computation (aka the 'task')
        thread_barrier.await();
    }
}
As you can see, all threads are launched at startup and then perform a certain task a number of times. Each time a thread finishes its task, it waits until all the other threads have finished theirs, and then it performs the task again. Are there any lower-level ways to achieve this kind of synchronization?
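Not part of the original post: one lower-level alternative is a hand-rolled generation barrier built only on intrinsic locks (synchronized/wait/notifyAll). The sketch below is illustrative only; the class and method names are made up for the example.

// A minimal cyclic barrier built on synchronized/wait/notifyAll.
class SimpleBarrier {
    private final int parties;
    private int waiting = 0;
    private long generation = 0; // distinguishes successive barrier "rounds"

    SimpleBarrier(int parties) {
        this.parties = parties;
    }

    synchronized void await() throws InterruptedException {
        long gen = generation;
        waiting++;
        if (waiting == parties) {        // the last thread to arrive trips the barrier
            waiting = 0;
            generation++;
            notifyAll();
        } else {
            while (gen == generation) {  // guard against spurious wake-ups
                wait();
            }
        }
    }
}

Each worker would call await() at the end of next_generation(), and the coordinating thread would await once per generation, just like the CyclicBarrier version.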

Using volatile to ensure visibility of shared (but not concurrent) data in Java

I'm trying to implement a fast version of LZ77 and I have a question to ask you about concurrent programming.
For now I have a final byte[] buffer and a final int[] resultHolder, both of the same length. The program does the following:
1. The main thread writes the whole buffer, then notifies the worker threads and waits for them to complete.
2. Each worker thread processes its own portion of the buffer, saving the results in the same portion of the result holder; each worker's portion is exclusive. After that the main thread is notified and the worker pauses.
3. When all the workers have paused, the main thread reads the data in resultHolder and updates the buffer, then (if needed) the process begins again from point 1.
Important things in manager (main Thread) are declared as follow:
final byte[] buffer = new byte[SIZE];
final MemoryHelper memoryHelper = new MemoryHelper();
final ArrayBlockingQueue<Object> waitBuffer = new ArrayBlockingQueue<Object>(TOT_WORKERS);
final ArrayBlockingQueue<Object> waitResult = new ArrayBlockingQueue<Object>(TOT_WORKERS);
final int[] resultHolder = new int[SIZE];
MemoryHelper simply wraps a volatile field and provides two methods: one for reading it and one for writing to it.
Worker's run() code:
public void run() {
    try {
        // Wait for the main thread
        while (manager.waitBuffer.take() != SHUTDOWN) {
            // Load new buffer values
            manager.memoryHelper.readVolatile();
            // Do something
            for (int i = a; i <= b; i++) {
                manager.resultHolder[i] = manager.buffer[i] + 10;
            }
            // Flush new values of resultHolder
            manager.memoryHelper.writeVolatile();
            // Signal job done
            manager.waitResult.add(Object.class);
        }
    } catch (InterruptedException e) { }
}
Finally, the important part of main Thread:
for (int i = 0; i < 100_000; i++) {
    // Start workers
    for (int j = 0; j < TOT_WORKERS; j++)
        waitBuffer.add(Object.class);
    // Wait for workers
    for (int j = 0; j < TOT_WORKERS; j++)
        waitResult.take();
    // Load results
    memoryHelper.readVolatile();
    // Do something
    processResult();
    setBuffer();
    // Store buffer
    memoryHelper.writeVolatile();
}
Synchronization on the ArrayBlockingQueues works well. My doubt concerns the use of readVolatile() and writeVolatile(). I've been told that writing to a volatile field flushes to memory all the data changed before it, and that reading that field from another thread then makes those changes visible.
So is that enough in this case to ensure correct visibility? There is never truly concurrent access to the same memory areas, so a volatile field should be a lot cheaper than a ReadWriteLock.
You don't even need volatile here, because BlockingQueues already provide necessary memory visibility guarantees:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a BlockingQueue happen-before actions subsequent to the access or removal of that element from the BlockingQueue in another thread.
In general, if you already have some kind of synchronization, you probably don't need to do anything special to ensure memory visibility, because it's already guaranteed by synchronization primitives you use.
However, volatile reads and writes can be used to ensure memory visibility when you don't have explicit synchronization (e.g. in lock-free algorithms).
P.S. Also, it looks like you could use a CyclicBarrier instead of your solution with queues; it is specifically designed for scenarios like this.
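Not from the original answer: a minimal, self-contained sketch of what the CyclicBarrier variant could look like. The class name, the worker count, the round count and the "+ 10" operation are placeholders mirroring the question's code, not a prescribed design.

import java.util.concurrent.CyclicBarrier;

class BarrierDemo {
    static final int TOT_WORKERS = 4, SIZE = 8;
    static final byte[] buffer = new byte[SIZE];
    static final int[] resultHolder = new int[SIZE];
    // TOT_WORKERS workers + 1 main thread meet at the barrier twice per round.
    static final CyclicBarrier barrier = new CyclicBarrier(TOT_WORKERS + 1);

    public static void main(String[] args) throws Exception {
        int chunk = SIZE / TOT_WORKERS;
        for (int w = 0; w < TOT_WORKERS; w++) {
            final int a = w * chunk, b = a + chunk - 1;      // this worker's exclusive slice
            new Thread(() -> {
                try {
                    for (int round = 0; round < 3; round++) {
                        barrier.await();                     // wait until the buffer is ready
                        for (int i = a; i <= b; i++) {
                            resultHolder[i] = buffer[i] + 10;
                        }
                        barrier.await();                     // report this slice as done
                    }
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
        for (int round = 0; round < 3; round++) {
            for (int i = 0; i < SIZE; i++) buffer[i] = (byte) (round + i); // setBuffer()
            barrier.await();                                 // release the workers
            barrier.await();                                 // wait until all slices are done
            System.out.println(java.util.Arrays.toString(resultHolder));   // processResult()
        }
    }
}

The await() calls provide the necessary happens-before edges (as documented for CyclicBarrier), so the readVolatile()/writeVolatile() pair becomes unnecessary.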

Multithreading the work done within a for-loop by using a thread pool

Suppose I have the following code, which I want to optimize by spreading the workload over the multiple CPU cores of my PC:
double[] largeArray = getMyLargeArray();
double result = 0;
for (double d : largeArray)
    result += d;
System.out.println(result);
In this example I could distribute the work done within the for-loop over multiple threads and verify that the threads have all terminated before proceeding to printing the result. I therefore came up with something that looks like this:
final double[] largeArray = getMyLargeArray();
int nThreads = 5;
final double[] intermediateResults = new double[nThreads];
Thread[] threads = new Thread[nThreads];
final int nItemsPerThread = largeArray.length / nThreads;
for (int t = 0; t < nThreads; t++) {
    final int t2 = t;
    threads[t] = new Thread() {
        @Override public void run() {
            for (int d = t2 * nItemsPerThread; d < (t2 + 1) * nItemsPerThread; d++)
                intermediateResults[t2] += largeArray[d];
        }
    };
}
for (Thread t : threads)
    t.start();
for (Thread t : threads)
    try {
        t.join();
    } catch (InterruptedException e) { }
double result = 0;
for (double d : intermediateResults)
    result += d;
System.out.println(result);
Assume that the length of largeArray is divisible by nThreads. This solution works correctly.
However, I am encountering the problem that this kind of threaded for-loop occurs a lot in my program, which causes a lot of overhead due to the creation and garbage collection of threads. I am therefore looking at modifying my code to use a ThreadPoolExecutor. The threads producing the intermediate results would then be reused for the next execution (the summation, in this example).
Since I store my intermediate results in an array of a size which has to be known beforehand, I was thinking of using a thread pool of fixed size.
I am having trouble, however, with letting a thread know at which place in the array it should store its result.
Should I define my own ThreadFactory?
Or am I better off using an array of ExecutorServices created by the method Executors.newSingleThreadExecutor(ThreadFactory myNumberedThreadFactory)?
Note that in my actual program it is very hard to replace the double[] intermediateResults with something of another type. I would prefer a solution which is confined to creating the right kind of thread pool.
I am having trouble, however, with letting a thread know at which place in the array it should store its result. Should I define my own ThreadFactory?
No need for that. The interfaces used by executors (Runnable and Callable) are run by threads, and you can pass whatever arguments you like to your implementations (for instance, an array, a begin index and an end index).
A ThreadPoolExecutor is indeed a good solution. Also look at FutureTask if you have runnables bearing results.
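To illustrate the point above about passing an array plus begin/end indices, here is a small sketch; the class name SumChunk is made up for the example.

import java.util.concurrent.Callable;

// A task that sums one chunk of the array; the chunk bounds are plain constructor arguments.
class SumChunk implements Callable<Double> {
    private final double[] array;
    private final int begin; // inclusive
    private final int end;   // exclusive

    SumChunk(double[] array, int begin, int end) {
        this.array = array;
        this.begin = begin;
        this.end = end;
    }

    @Override
    public Double call() {
        double sum = 0;
        for (int i = begin; i < end; i++) {
            sum += array[i];
        }
        return sum;
    }
}

Each SumChunk can then be submitted to a fixed-size pool exactly as shown in the next snippet, so no numbered ThreadFactory is needed.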
ExecutorService provides an API for getting results back from the thread pool via the Future interface:
Future<Double> futureResult = executorService.submit(new Callable<Double>() {
    public Double call() {
        double totalForChunk = 0.0;
        // do the calculation here
        return totalForChunk;
    }
});
Now all you need to do is to submit tasks (Callable instances) and wait for result to be available:
List<Future<Double>> results = new ArrayList<>();
for (int i = 0; i < nChunks; i++) {
    results.add(executorService.submit(callableTask));
}
Or even simpler:
List<Future<Double>> results = executorService.invokeAll(callableTaskList);
The rest is easy, iterate over results and collect total:
double total = 0.0;
for (Future<Double> result : results) {
    total += result.get(); // this will block until the task is completed by the executor service
}
That said, you do not need to care how many threads the executor service has. You just submit tasks and collect the results when they are available.
You would be better off creating "worker" threads that take information about work to be performed from a queue. Your process would then be to create an initially empty WorkQueue and then create and start the worker threads. Each worker thread would pick up its work from the queue, do the work, and put the work on a "finished" queue for the master to pick up and handle.
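A minimal sketch of that worker/queue idea using BlockingQueue; all names here are invented for the illustration (and the record syntax requires Java 16+), so treat it as one possible shape, not the answer's exact design.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class WorkQueueDemo {
    // A "work item": sum the chunk [begin, end) of the shared array.
    record Chunk(double[] array, int begin, int end) {}

    public static void main(String[] args) throws InterruptedException {
        double[] data = new double[1_000];
        java.util.Arrays.fill(data, 1.0);

        BlockingQueue<Chunk> work = new ArrayBlockingQueue<>(16);
        BlockingQueue<Double> finished = new ArrayBlockingQueue<>(16);
        int nWorkers = 4, nChunks = 10, chunkSize = data.length / nChunks;

        for (int w = 0; w < nWorkers; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Chunk c = work.take();              // block until work arrives
                        double sum = 0;
                        for (int i = c.begin(); i < c.end(); i++) sum += c.array()[i];
                        finished.put(sum);                  // hand the result back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();     // poison-pill handling omitted
                }
            });
            worker.setDaemon(true);                         // let the JVM exit when main is done
            worker.start();
        }

        for (int i = 0; i < nChunks; i++) {
            work.put(new Chunk(data, i * chunkSize, (i + 1) * chunkSize));
        }
        double total = 0;
        for (int i = 0; i < nChunks; i++) total += finished.take();
        System.out.println(total);                          // prints 1000.0
    }
}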

Passing a global I/O parameter to Java thread

I have a Java program which creates a number of threads that all execute the same code (the same run() method).
My main method looks like this:
{
    // Create threads
    GameOfLifeThread[][] threads = new GameOfLifeThread[vSplit][hSplit];
    for (int i = 0; i < vSplit; i++) {
        for (int j = 0; j < hSplit; j++) {
            threads[i][j] = new GameOfLifeThread(initalField, ...);
        }
    }
    // Run threads
    for (int i = 0; i < vSplit; i++) {
        for (int j = 0; j < hSplit; j++) {
            // threads[i][j].run();
            (new Thread(threads[i][j])).start();
        }
    }
    return ...;
}
initalField is a global 2D array. Each thread is supposed to make some changes to it.
The problem is that after the threads have run, the array stays unchanged, even when there is only a single worker thread. However, when I run
threads[i][j].run();
instead of
(new Thread(threads[i][j])).start();
with a single worker thread (i.e. a purely serial execution by the main thread), initalField changes as it should.
What could be the problem? It looks as if the array's elements were passed by value, but that cannot be the case.
Thank you in advance.
Just a shot in the dark:
Your initalField may need to be volatile; otherwise its contents may be cached by the threads, so changes made by one thread won't be seen by the others, because data can be cached thread-locally.
This and this answer may explain it a bit better.
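Not part of the original answer, but relevant to the visibility theme of this page: if the main thread reads the field right after starting the workers, it also has to wait for them, e.g. with Thread.join(), which both waits for completion and establishes a happens-before for the workers' writes. A minimal sketch, reusing the question's threads array:

// Keep references to the started threads so the main thread can wait for them.
java.util.List<Thread> started = new java.util.ArrayList<>();
for (int i = 0; i < vSplit; i++) {
    for (int j = 0; j < hSplit; j++) {
        Thread t = new Thread(threads[i][j]);
        started.add(t);
        t.start();
    }
}
for (Thread t : started) {
    try {
        t.join(); // waits for the worker AND makes its writes to initalField visible here
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
// Only now read initalField in the main thread.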
