I'm wondering how I can replace this CyclicBarrier with something else, like the usual joining of all threads in an array. This is the snippet (try-catch blocks etc. omitted for clarity):
protected void execute(int nGens)
{
CyclicBarrier thread_barrier = new CyclicBarrier(n_threads+1);
ExecutorService thread_pool = Executors.newFixedThreadPool(n_threads);
for (int i=0; i<thread_array.length; i++) // all threads are in this array
thread_pool.execute(thread_array[i]);
for (int i=0; i<total_generations; i++)
thread_barrier.await();
thread_pool.shutdown();
while(!thread_pool.isTerminated()){}
}
And this is the code executed by the threads:
public void run()
{
for (int i=0; i<total_generations; i++)
{
next_generation(); // parallel computation (aka the 'task')
thread_barrier.await();
}
}
As you can see, all threads are launched on startup and then perform a certain task a number of times. Each time they finish a task, they wait until all other threads have finished theirs, and then they perform it again. Are there any lower-level ways to achieve this kind of synchronization?
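One lower-level option is a hand-written barrier built on synchronized, wait() and notifyAll(). The following is only a minimal sketch, not a drop-in replacement: the class name SimpleBarrier and the runRounds helper are made up for illustration, and unlike CyclicBarrier it has no timeout or broken-barrier handling.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical reusable barrier built only on intrinsic locks.
class SimpleBarrier {
    private final int parties;
    private int waiting = 0;     // threads that have arrived in the current round
    private long generation = 0; // incremented each time the barrier trips

    SimpleBarrier(int parties) {
        this.parties = parties;
    }

    synchronized void await() throws InterruptedException {
        long arrivedIn = generation;
        waiting++;
        if (waiting == parties) {
            // Last arrival: reset for the next round and release everyone.
            waiting = 0;
            generation++;
            notifyAll();
            return;
        }
        // Loop guards against spurious wakeups.
        while (arrivedIn == generation) {
            wait();
        }
    }

    // Runs `workers` threads for `rounds` rounds; returns true if no thread
    // ever passed the barrier before all threads had arrived in that round.
    static boolean runRounds(int workers, int rounds) throws InterruptedException {
        SimpleBarrier barrier = new SimpleBarrier(workers);
        AtomicIntegerArray arrivals = new AtomicIntegerArray(rounds);
        AtomicBoolean ok = new AtomicBoolean(true);
        Thread[] threads = new Thread[workers];
        for (int t = 0; t < workers; t++) {
            threads[t] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        arrivals.incrementAndGet(r); // stand-in for next_generation()
                        barrier.await();
                        if (arrivals.get(r) != workers) {
                            ok.set(false);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    ok.set(false);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return ok.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runRounds(4, 100));
    }
}
```

The generation counter is what makes the barrier reusable: a fast thread that re-enters await() for the next round records the new generation, so it cannot be confused with threads still waking up from the previous one.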
I want the final count to always be 10000, but even though I have used synchronized here, I'm getting values other than 10000. I'm a Java concurrency newbie.
public class test1 {
static int count = 0;
public static void main(String[] args) throws InterruptedException {
int numThreads = 10;
Thread[] threads = new Thread[numThreads];
for(int i=0;i<numThreads;i++){
threads[i] = new Thread(new Runnable() {
@Override
public void run() {
synchronized (this) {
for (int i = 0; i < 1000; i++) {
count++;
}
}
}
});
}
for(int i=0;i<numThreads;i++){
threads[i].start();
}
for (int i=0;i<numThreads;i++)
threads[i].join();
System.out.println(count);
}
}
Boris told you how to make your program print the right answer, but the reason it prints the right answer is that your program is effectively single-threaded.
If you implemented Boris's suggestion, then your run() method probably looks like this:
public void run() {
synchronized (test1.class) {
for (int i = 0; i < 1000; i++) {
count++;
}
}
}
No two threads can ever be synchronized on the same object at the same time, and there's only one test1.class in your program. That's good because there's also only one count. You always want the number of lock objects and their lifetimes to match the number and lifetimes of the data that they are supposed to protect.
The problem is, you have synchronized the entire body of the run() method. That means, no two threads can run() at the same time. The synchronized block ensures that they all will have to execute in sequence—just as if you had simply called them one-by-one instead of running them in separate threads.
This would be better:
public void run() {
for (int i = 0; i < 1000; i++) {
synchronized (test1.class) {
count++;
}
}
}
If each thread releases the lock after each increment operation, then that gives other threads a chance to run concurrently.
On the other hand, all that locking and unlocking is expensive. The multi-threaded version will almost certainly take a lot longer to count to 10000 than a single-threaded program would. There's not much you can do about that. Using multiple CPUs to gain speed only works when there are big computations that each CPU can do independently of the others.
For your simple example, you can use AtomicInteger instead of static int and synchronized.
final AtomicInteger count = new AtomicInteger(0);
And inside the Runnable you need only this one line:
count.incrementAndGet();
Synchronizing on the class blocks every other thread that needs the same lock, which becomes a problem in more complex code where many methods are used from multiple threads.
This code doesn't run faster, because incrementing the same counter one by one is always a single operation that cannot be performed by more than one thread at a time.
So if you want it to run close to 10 times faster, each thread should count with its own counter, and the results should be summed at the end. You can do this with a thread pool, using an ExecutorService and Future tasks, which can return a result for you.
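A minimal sketch of that per-thread counting idea, assuming a fixed pool with one task per thread. The class name PerThreadCounters and the method countInParallel are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerThreadCounters {
    // Each task counts into its own local variable, so no locking is needed.
    static int countInParallel(int numThreads, int incrementsPerThread)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Integer>> partials = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                partials.add(pool.submit(() -> {
                    int local = 0;
                    for (int i = 0; i < incrementsPerThread; i++) {
                        local++;
                    }
                    return local;
                }));
            }
            // Summing happens in the calling thread; get() waits for each task.
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countInParallel(10, 1000)); // prints 10000
    }
}
```

For a loop this small the pool setup probably costs more than it saves; the pattern only pays off when each task does substantial independent work.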
My question is extremely basic: once I have written some array values by one or more threads (phase 1), how can I 'publish' my array to make all the changes visible to other threads (phase 2)?
I have code that does all the array writing, then all the array reading, then again all the writing, then again all the reading etc. I'd like to do it in multiple threads, so multiple threads first would do the array writing phase, then multiple threads would do the array reading phase etc.
My concern is how to safely publish the array writes after each writing phase.
Consider the following simplified thread-unsafe code, that does just one writing phase with just one thread and then just one reading phase with multiple threads:
ExecutorService executor = Executors.newFixedThreadPool(5);
double[] arr = new double[5];
for (int i=0; i<5; ++i) {
arr[i] = 1 + Math.random();
}
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr[j])));
}
The code normally prints non-zero values, but I understand that it might occasionally print zeros as well, as the array is not properly published by the writing thread, so some writes might not be visible to other threads.
I'd like to fix this problem and write the above code properly, in a thread-safe manner, i.e. to make sure that all my writes will be visible to the reading threads.
1. Could you advise on the best way to do so?
The concurrent collections and AtomicXxxArray are not an option for me because of performance (and also code clarity), as I have 2D arrays etc.
2. I can think of the following possible solutions, but I am not 100% sure they would work. Could you also advise on the solutions below?
Solution 1: assignment to a final array
Justification: I expect a final field to be always properly initialized with the latest writes, including all its recursive dependencies.
for (int i=0; i<5; ++i) {
arr[i] = 1 + Math.random();
}
final double[] arr2 = arr; //<---- safe publication?
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr2[j])));
}
Solution 2: a latch
Justification: I expect the latch to establish a perfect happens-before relationship between the writing thread(s) and the reading threads.
CountDownLatch latch = new CountDownLatch(1); //1 = the number of writing threads
for (int i=0; i<5; ++i) {
arr[i] = Math.random();
}
latch.countDown(); //<- writing is done
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> {
try {latch.await();} catch (InterruptedException e) {...} //happens-before(writings, reading) guarantee?
System.out.println(String.format("arr[%s]=%s", j, arr[j]));
});
}
Update: this answer https://stackoverflow.com/a/5173805/1847482 suggests the following solution:
volatile int guard = 0;
...
//after the writing is done:
guard = guard + 1; //write some new value
//just before the reading: read the volatile variable, e.g.
guard = guard + 1; //includes reading
... //do the reading
This solution uses the following rule: "if thread A writes some non-volatile stuff and a volatile variable after that, thread B is guaranteed to see the changes of the non-volatile stuff as well if it reads the volatile variable first".
Your first example is perfectly safe, because the tasks originate from the writer thread. As the docs say:
Actions in a thread prior to the submission of a Runnable to an Executor happen-before its execution begins.
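If the writing phase itself runs on several threads, a similar guarantee is available from Future.get(): the java.util.concurrent memory-consistency notes state that actions taken by the asynchronous computation behind a Future happen-before actions following the retrieval of its result via get() in another thread. Below is a minimal sketch, assuming every writer task touches only its own array slot. The names PhasedArray and writeThenRead are hypothetical, and the arithmetic is just a stand-in for real work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PhasedArray {
    static double[] writeThenRead(int n) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            double[] arr = new double[n];

            // Phase 1: several tasks write disjoint slots of the plain array.
            List<Future<?>> writes = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int j = i;
                writes.add(executor.submit(() -> { arr[j] = 1 + j; }));
            }
            // get() on every write: task -> get() is a happens-before edge.
            for (Future<?> write : writes) {
                write.get();
            }

            // Phase 2: submitted after the get() calls, so the readers are
            // ordered after all writes (get() -> submit -> task execution).
            List<Future<Double>> reads = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int j = i;
                reads.add(executor.submit(() -> arr[j] * 2));
            }
            double[] doubled = new double[n];
            for (int i = 0; i < n; i++) {
                doubled[i] = reads.get(i).get();
            }
            return doubled;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(java.util.Arrays.toString(writeThenRead(5)));
    }
}
```

The same write/get/submit cycle can be repeated for each subsequent phase without making the array itself volatile or atomic.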
Let's say I have 5 threads that must make a combined total of 1,000,000 function calls for a parallel Monte Carlo method program. I assigned 1,000,000 / 5 function calls to each of the 5 threads. However, after many tests (some ranging up to 1 trillion iterations) I realized that some threads were finishing much faster than others. So instead I would like to dynamically assign workload to each of these threads. My first approach involved an AtomicLong variable that was set to an initial value of, let's say, 1 billion. After each function call, I would decrement the AtomicLong by 1. Before every function call the program would check to see if the AtomicLong was greater than 0, like this:
AtomicLong remainingIterations = new AtomicLong(1000000000);
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 5; i++) {//create 5 threads
threadPool.submit(new Runnable() {
public void run() {
while (remainingIterations.get() > 0) {//do a function call if necessary
remainingIterations.decrementAndGet();//decrement # of remaining calls needed
doOneFunctionCall();//perform a function call
}
}
});
}//more unrelated code is not shown (thread shutdown, etc.)
This approach seemed to be extremely slow. Am I using AtomicLong correctly? Is there a better approach?
am I using AtomicLong correctly?
Not quite. The way you are using it, two threads could each check remainingIterations, each see 1, then each decrement it, putting you at -1 total.
As for your slowness issue: if doOneFunctionCall() completes quickly, your app may be bogged down by contention on the AtomicLong, since every thread has to update the same variable before each call.
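One way to close that race, sketched below, is to claim an iteration with a single getAndDecrement() call and test the value it returns, so the check and the decrement can no longer be separated. The class name SharedBudget is made up, and the LongAdder is only a stand-in for doOneFunctionCall():

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class SharedBudget {
    // Returns how many "function calls" were actually performed.
    static long runWorkers(int numThreads, long totalCalls) throws InterruptedException {
        AtomicLong remaining = new AtomicLong(totalCalls);
        LongAdder performed = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (int t = 0; t < numThreads; t++) {
            pool.submit(() -> {
                // getAndDecrement() returns the previous value atomically, so
                // each value from totalCalls down to 1 is handed out exactly once.
                while (remaining.getAndDecrement() > 0) {
                    performed.increment(); // stand-in for doOneFunctionCall()
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return performed.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(5, 1_000_000));
    }
}
```

This fixes the over-run but not the contention: all threads still hit the same counter once per call, so claiming iterations in larger chunks would likely be needed for very cheap function calls.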
The nice thing about an ExecutorService is that it logically decouples the work being done from the threads that are doing it. You can submit more jobs than you have threads, and the ExecutorService will execute them as soon as it is able:
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 1000000; i++) {
threadPool.submit(new Runnable() {
public void run() {
doOneFunctionCall();
}
});
}
This might be balancing your work a bit too much in the other direction: creating too many short-lived Runnable objects. You can experiment to see what gives you the best balance between distributing the work and performing the work quickly:
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 1000; i++) {
threadPool.submit(new Runnable() {
public void run() {
for (int j = 0; j < 1000; j++) {
doOneFunctionCall();
}
}
});
}
Look at ForkJoinPool. What you are attempting is called divide-and-conquer. In F/J you set the number of threads to 5. Each thread has a queue of pending Tasks. You can evenly set the number of Tasks for each thread/queue and when a thread runs out of work it work-steals from another thread's queue. This way you don't need the AtomicLong.
There are many examples of using this class. If you need more info, let me know.
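For illustration, here is a minimal sketch of the fork/join idea using RecursiveTask. The pi estimate is only a stand-in for the real Monte Carlo function, and MonteCarloTask, THRESHOLD and estimatePi are hypothetical names:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.ThreadLocalRandom;

class MonteCarloTask extends RecursiveTask<Long> {
    // Assumed chunk size; below this a task runs its iterations directly.
    private static final long THRESHOLD = 10_000;
    private final long iterations;

    MonteCarloTask(long iterations) {
        this.iterations = iterations;
    }

    @Override
    protected Long compute() {
        if (iterations <= THRESHOLD) {
            long hits = 0;
            ThreadLocalRandom random = ThreadLocalRandom.current();
            for (long i = 0; i < iterations; i++) {
                double x = random.nextDouble();
                double y = random.nextDouble();
                if (x * x + y * y <= 1.0) {
                    hits++;
                }
            }
            return hits;
        }
        // Split in two; the forked half can be stolen by an idle worker.
        long half = iterations / 2;
        MonteCarloTask left = new MonteCarloTask(half);
        MonteCarloTask right = new MonteCarloTask(iterations - half);
        left.fork();
        return right.compute() + left.join();
    }

    static double estimatePi(long iterations) {
        ForkJoinPool pool = new ForkJoinPool(5); // 5 worker threads, as in the question
        try {
            long hits = pool.invoke(new MonteCarloTask(iterations));
            return 4.0 * hits / iterations;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(1_000_000));
    }
}
```

Because the work is split into many small chunks, a thread that finishes early simply steals another chunk, which addresses the uneven finishing times without any shared counter.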
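An elegant approach to avoid the creation of 1B tasks is to use a synchronous queue and a ThreadPoolExecutor. Note that with the default rejection policy, submit does not block when all threads are busy; it throws a RejectedExecutionException. To make submit block until a thread becomes available, give the executor a rejection handler that puts the task onto the queue.
I didn't test actual performance though.
BlockingQueue<Runnable> queue = new SynchronousQueue<>();
ExecutorService threadPool = new ThreadPoolExecutor(5, 5,
        0L, TimeUnit.MILLISECONDS,
        queue,
        (task, executor) -> {
            try {
                // blocks the submitting thread until a worker takes the task
                executor.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        });
for (int i = 0; i < 1000000000; i++) {
    threadPool.submit(new Runnable() {
        public void run() {
            doOneFunctionCall();
        }
    });
}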
What happens if the same thread is executed more than once at the same time? Let's say I have a thread like this:
private Runnable mySampleThread() {
return new Runnable() {
@Override
public void run() {
//something is going on here.
}
};
}
And I created an ExecutorService with a fixed thread pool of 10. What happens if I execute mySampleThread 10 times in this ExecutorService?
Something like below,
ExecutorService mySampleExecutor = Executors.newFixedThreadPool(10);
while (i <= 10) {
mySampleExecutor.execute(mySampleThread);
i++;
}
The answer is very simple. The Executor will execute a Runnable object (it's not a Thread object), as described in the documentation of the Executor interface:
Executes the given command at some time in the future. The command may execute in a new thread, in a pooled thread, or in the calling thread, at the discretion of the Executor implementation.
Basically, the Executor will pick one thread from its internal pool (ThreadPoolExecutor), assign the Runnable to it, and execute its run() method.
First, please elaborate on your problem or query.
Nevertheless, assuming that you are actually calling the method mySampleThread() with the brackets that are missing in your snippet: this method returns a new Runnable object every time, so you are passing a new Runnable to the executor all 10 times. That means you are submitting 10 different tasks to the executor. So if the executor creates a different thread for every task (that depends on its implementation), then whatever you code inside run() will be executed 10 times in 10 different threads.
And as described in other answers, the runnable object being passed to executor is not a thread.
Hope it clarifies.
By the way, you may try running the program.
As other answers clearly state, there will be as many new threads as there are calls (possibly fewer, depending on the executor used; I'm focusing on Runnable reuse here, and limiting the number of threads with an executor is well explained in other answers). All of them are created with a single Runnable object.
What's worth mentioning, and I have personally made use of this quite a few times: this is one of the ways to share data between multiple threads, as all of these threads share the Runnable that was used for their creation. Synchronization issues come into play at this point, but that's another story.
Here's code to show the typical usage and the aforementioned synchronization problem.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
class MyThread implements Runnable {
public int counter = 0;
@Override
public void run() {
for (int i = 0; i < 10000; i++) {
counter++;
}
}
}
class MySynchronizedThread implements Runnable {
public int counter = 0;
@Override
public void run() {
for (int i = 0; i < 10000; i++) {
synchronized (this) {
counter++;
}
}
}
}
public class RunnableTest {
public static void main(String[] args) throws InterruptedException {
MyThread runnableObject = new MyThread();
ExecutorService ex = Executors.newFixedThreadPool(5);
for (int i = 0; i < 5; i++) {
ex.execute(runnableObject);
}
ex.shutdown();
ex.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
System.out
.println("Without synchronization: " + runnableObject.counter);
MySynchronizedThread runnableSynchronizedObject = new MySynchronizedThread();
ExecutorService ex2 = Executors.newFixedThreadPool(5);
for (int i = 0; i < 5; i++) {
ex2.execute(runnableSynchronizedObject);
}
ex2.shutdown();
ex2.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
System.out.println("With synchronization: "
+ runnableSynchronizedObject.counter);
}
}
There will be no difference in mySampleExecutor.execute(mySampleThread);, because the mySampleThread method returns a new Runnable object on each call. Every thread will have its own stack frames.
In Java, how do I pass objects back to the main thread from worker threads? Take the following code as an example:
main(String[] args) {
String[] inputs;
Result[] results;
Thread[] workers = new WorkerThread[numThreads];
for (int i = 0; i < numThreads; i++) {
workers[i] = new WorkerThread(i, inputs[i], results[i]);
workers[i].start();
}
....
}
....
class WorkerThread extends Thread {
String input;
int name;
Result result;
WorkerThread(int name, String input, Result result) {
super(name+"");
this.name = name;
this.input = input;
this.result = result;
}
public void run() {
result = Processor.process(input);
}
}
How do I pass the result back to main's results[i]?
How about passing this to WorkerThread,
workers[i] = new WorkerThread(i, inputs[i], results[i], this);
so that it could
mainThread.results[i] = Processor.process(inputs[i]);
Why don't you use Callables and an ExecutorService?
main(String[] args) {
String[] inputs;
Future<Result>[] results;
for (int i = 0; i < inputs.length; i++) {
results[i] = executor.submit(new Worker(inputs[i]));
}
for (int i = 0; i < inputs.length; i++) {
Result r = results[i].get();
// do something with the result
}
}
@Thilo's and @Erickson's answers are the best ones. There are existing APIs that do this kind of thing simply and reliably.
But if you want to persist with your current approach of doing it by hand, then the following change to your code may be sufficient:
for (int i = 0; i < numThreads; i++) {
results[i] = new Result();
...
workers[i] = new WorkerThread(i, inputs[i], results[i]);
workers[i].start();
}
...
public void run() {
Result tmp = Processor.process(input);
this.result.updateFrom(tmp);
// ... where the updateFrom method copies the state of tmp into
// the Result object that was passed from the main thread.
}
Another approach is to replace Result[] in the main program with Result[][] and pass a one-element Result[] to the child thread, which the thread can update with the result object (a lightweight holder).
However, there is an important gotcha when you implement this at a low level: the main thread needs to call Thread.join on all of the child threads before attempting to retrieve the results. If you don't, there is a risk that the main thread will occasionally see stale values in the Result objects. The join also ensures that the main thread doesn't try to access a Result before the corresponding child thread has completed it.
The main thread will need to wait for the worker threads to complete before getting the results. One way to do this is for the main thread to wait for each worker thread to terminate before attempting to read the result. A thread terminates when its run() method completes.
For example:
for (int i = 0; i < workers.length; i++) {
workers[i].join(); // wait for worker thread to terminate
Result result = results[i]; // get the worker thread's result
// process the result here...
}
You still have to arrange for the worker thread's result to be inserted into the results[] array somehow. As one possibility, you could do this by passing the array and an index into each worker thread and having the worker thread assign the result before terminating.
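A minimal sketch of that array-plus-index approach, with String.length() standing in for Processor.process and the names JoinWorkers and computeLengths made up for illustration:

```java
class JoinWorkers {
    static int[] computeLengths(String[] inputs) throws InterruptedException {
        int[] results = new int[inputs.length];
        Thread[] workers = new Thread[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            final int index = i;
            // Each worker writes only its own slot of the shared array.
            workers[i] = new Thread(() -> results[index] = inputs[index].length());
            workers[i].start();
        }
        // join() both waits for the worker and makes its writes visible here.
        for (Thread worker : workers) {
            worker.join();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] lengths = computeLengths(new String[] {"a", "bb", "ccc"});
        System.out.println(java.util.Arrays.toString(lengths)); // prints [1, 2, 3]
    }
}
```

Since every slot has exactly one writer and the main thread reads only after join(), no further synchronization is needed.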
Some typical solutions would be:
Hold the result in the worker thread's instance (be it Runnable or Thread). This is similar to the use of the Future interface.
Construct the worker threads with a BlockingQueue into which they can place their results.
Simply use the ExecutorService and Callable interfaces to get a Future, which can be asked for the result.
It looks like your goal is to perform the computation in parallel, then once all results are available to the main thread, it can continue and use them.
If that's the case, implement your parallel computation as a Callable rather than a thread. Pass this collection of tasks to the invokeAll() method of an ExecutorService. This method will block until all the tasks have been completed, and then your main thread can continue.
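A short sketch of the invokeAll() approach, with toUpperCase() standing in for the real computation and InvokeAllExample and processAll as hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllExample {
    static List<String> processAll(List<String> inputs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String input : inputs) {
                tasks.add(() -> input.toUpperCase()); // stand-in for Processor.process(input)
            }
            // Blocks until every task has completed; the returned futures are
            // in the same order as the submitted tasks.
            List<Future<String>> futures = executor.invokeAll(tasks);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(java.util.Arrays.asList("a", "b", "c"))); // prints [A, B, C]
    }
}
```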
I think I have a better solution: why don't you pass a LinkedBlockingQueue to your worker threads and have them put their results into it when they are done? Your main function then picks the results up from the queue like this:
while(true){linkedListBlockingQueue.take();
//todo: fill in the task you want it to do
//if a specific kind of object is returned/countdownlatch is finished exit
}
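A more complete sketch of the queue idea, assuming the main thread knows how many results to expect so it can stop after that many take() calls. The class QueueResults and the squaring "work" are made up for illustration:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueResults {
    static int sumOfSquares(int numWorkers) throws InterruptedException {
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();
        for (int i = 1; i <= numWorkers; i++) {
            final int n = i;
            // Each worker puts its result on the shared queue when it is done.
            new Thread(() -> results.add(n * n)).start();
        }
        int sum = 0;
        // take() blocks until a result is available; results arrive in
        // completion order, not submission order.
        for (int received = 0; received < numWorkers; received++) {
            sum += results.take();
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumOfSquares(4)); // prints 30
    }
}
```

Counting the expected results replaces the while(true) loop's open-ended exit condition; a sentinel object or a CountDownLatch would be alternatives when the count is not known in advance.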