Executors distribution of task per threads

I am fairly new to Java executors, so this may be an easy question.
ExecutorService executorService = Executors.newFixedThreadPool(NumberOfThreads - 1);
do_work();
for (int i = 1; i < NumberOfThreads; i++) {
    executorService.execute(new Runnable() {
        public void run() {
            do_work();
        }
    });
}
My question is:
If I create a fixed thread pool with N threads and execute N tasks, as in the code above, am I guaranteed that each thread will execute only one task (do_work())?

No. It's a pool, and it makes no guarantee about which thread runs which task.
For example, imagine your do_work() method completes immediately. By the time you submit your 2nd Runnable, the thread that ran the first one is idle again, and any idle thread in the pool is a candidate for your job.
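A minimal sketch for observing this, assuming trivial tasks and a hypothetical helper named threadsUsed (neither is from the original question). It records which worker threads actually ran the tasks. One nuance: the ThreadPoolExecutor documentation states that while fewer than corePoolSize threads are running, a new thread is created for each submitted task even if other workers are idle, so a fresh fixed pool given exactly N tasks tends to use N threads. Once the pool is warm, or when tasks outnumber threads, workers are reused and the distribution is not guaranteed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolAssignmentDemo {
    // Runs `tasks` trivial jobs on a fixed pool and returns the names of
    // the worker threads that actually executed at least one of them.
    static Set<String> threadsUsed(int poolSize, int tasks) throws InterruptedException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 tasks on 4 threads: each worker necessarily runs many tasks,
        // and how many each one gets varies from run to run.
        Set<String> used = threadsUsed(4, 100);
        System.out.println(used.size() + " distinct threads ran 100 tasks: " + used);
    }
}
```

If each thread must run exactly one unit of work, starting N explicit Thread objects (or giving each task its own single-thread executor) expresses that requirement directly.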

Related

Is re-starting a Thread better than creating a new one?

I'm wondering whether there is any advantage to keeping the same threads over the course of the execution of an object, rather than instantiating new Thread objects every time. I have an object for which a single (frequently used) method is parallelized using local Thread variables, such that every time the method is called, new Threads (and Runnables) are instantiated. Because the method is called so frequently, a single execution may instantiate upwards of a hundred thousand Thread objects, even though there are never more than a few (~4-6) active at any given time.
The following is a cut-down example of how this method is currently implemented, to give a sense of what I mean. For reference, n is the pre-determined number of threads to use, and this.dataStructure is a (thread-safe) Map which serves as the input to the computation and is also modified by it. There are other inputs involved, but as they are not relevant to this question, I've omitted their usage. I've also omitted exception handling for the same reason.
Runnable[] tasks = new Runnable[n];
Thread[] threads = new Thread[n];
ArrayBlockingQueue<MyObject> inputs = new ArrayBlockingQueue<>(this.dataStructure.size());
inputs.addAll(this.dataStructure.values());
for (int i = 0; i < n; i++) {
    tasks[i] = () -> {
        while (true) {
            MyObject input = inputs.poll(1L, TimeUnit.MICROSECONDS);
            if (input == null) return;
            // run computations over this.dataStructure
        }
    };
    threads[i] = new Thread(tasks[i]);
    threads[i].start();
}
for (int i = 0; i < n; i++)
    threads[i].join();
Because these Threads (and their runnables) always execute the same way using a single ArrayBlockingQueue as input, an alternative to this would be to just "refill the queue" every time the method is called and just re-start the same Threads. This is easily implemented, but I'm unsure as to whether it would make any difference one way or the other. I'm not too familiar with concurrency, so any help is appreciated.
PS.: If there is a more elegant way to handle the polling, that would also be helpful.

It is not possible to start a Thread more than once, but conceptually, the answer to your question is yes.
This is normally accomplished with a thread pool. A thread pool is a set of Threads which rarely actually terminate. Instead, an application passes its task to the thread pool, which picks a Thread in which to run it. The thread pool then decides whether the Thread should be terminated or reused after the task completes.
Java has some classes which make using thread pools quite easy: ExecutorService and CompletableFuture.
ExecutorService usage typically looks like this:
ExecutorService executor = Executors.newCachedThreadPool();
for (int i = 0; i < n; i++) {
    tasks[i] = () -> {
        while (true) {
            MyObject input = inputs.poll(1L, TimeUnit.MICROSECONDS);
            if (input == null) return;
            // run computations over this.dataStructure
        }
    };
    executor.submit(tasks[i]);
}
// Doesn't interrupt or halt any tasks. Will wait for them all to finish
// before terminating its threads.
executor.shutdown();
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
Executors has other methods which can create thread pools, like newFixedThreadPool() and newWorkStealingPool(). You can decide for yourself which one best suits your needs.
CompletableFuture use might look like this:
Runnable[] tasks = new Runnable[n];
CompletableFuture<?>[] futures = new CompletableFuture<?>[n];
for (int i = 0; i < n; i++) {
    tasks[i] = () -> {
        while (true) {
            MyObject input = inputs.poll(1L, TimeUnit.MICROSECONDS);
            if (input == null) return;
            // run computations over this.dataStructure
        }
    };
    futures[i] = CompletableFuture.runAsync(tasks[i]);
}
CompletableFuture.allOf(futures).get();
The disadvantage of CompletableFuture is that the tasks cannot be canceled or interrupted. (Calling cancel will mark the task as completing with an exception instead of completing successfully, but the task will not be interrupted.)
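The cancellation behaviour described above can be checked with a small sketch. The class and method names here are hypothetical, and the latches exist only to make the timing deterministic. The task blocks on a latch; if cancel(true) interrupted the worker, the await() call would throw InterruptedException, which the sketch records.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelWithoutInterruptDemo {
    // Returns true if the future reports cancellation while the task
    // itself was never interrupted and ran to its normal end.
    static boolean taskSurvivesCancel() throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            started.countDown();
            try {
                release.await(); // would throw if the worker were interrupted
            } catch (InterruptedException e) {
                interrupted.set(true);
            } finally {
                done.countDown();
            }
        });

        started.await();                         // the task is now running
        boolean cancelled = future.cancel(true); // the flag has no effect here
        release.countDown();                     // let the task finish on its own
        done.await();
        return cancelled && future.isCancelled() && !interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cancelled but not interrupted: " + taskSurvivesCancel());
    }
}
```

When interruption matters, submitting to an ExecutorService and calling cancel(true) on the returned Future is the usual alternative, since that Future does interrupt the running thread.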

By definition, you cannot restart a thread. According to the documentation:
It is never legal to start a thread more than once. In particular, a thread may not be restarted once it has completed execution.
Nevertheless a thread is a valuable resource, and there are implementations to reuse threads. Have a look at the Java Tutorial about Executors.
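The rule quoted from the documentation is easy to see in a small sketch (the class and method names are made up for illustration): a second call to start() on a finished Thread throws IllegalThreadStateException.

```java
class RestartDemo {
    // Returns true if starting an already-finished thread is rejected.
    static boolean secondStartFails() throws InterruptedException {
        Thread t = new Thread(() -> { /* nothing to do */ });
        t.start();
        t.join(); // the thread has now completed execution
        try {
            t.start(); // illegal: a Thread can be started only once
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second start() rejected: " + secondStartFails());
    }
}
```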

Dynamically distributing workload to multiple threads in Java

Let's say I have 5 threads that must make a combined total of 1,000,000 function calls for a parallel Monte Carlo method program. I assigned 1,000,000 / 5 function calls to each of the 5 threads. However, after many tests (some ranging up to 1 trillion iterations) I realized that some threads were finishing much faster than others, so I would instead like to assign the workload to the threads dynamically. My first approach involved an AtomicLong variable that was set to an initial value of, let's say, 1 billion. After each function call, I would decrement the AtomicLong by 1. Before every function call, the program would check whether the AtomicLong was greater than 0, like this:
AtomicLong remainingIterations = new AtomicLong(1000000000);
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 5; i++) { // create 5 worker tasks
    threadPool.submit(new Runnable() {
        public void run() {
            while (remainingIterations.get() > 0) { // do a function call if necessary
                remainingIterations.decrementAndGet(); // decrement # of remaining calls needed
                doOneFunctionCall(); // perform a function call
            }
        }
    });
} // more unrelated code is not shown (thread shutdown, etc.)
This approach seemed to be extremely slow, am I using AtomicLong correctly? Is there a better approach?

am I using AtomicLong correctly?
Not quite. The way you are using it, two threads could each check remainingIterations, each see 1, then each decrement it, putting you at -1 total.
As for your slowness issue, it is possible that, if doOneFunctionCall() completes quickly, your app is being bogged down by contention among the threads that are all updating the same AtomicLong.
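One way to close the check-then-decrement gap is to claim an iteration with a single atomic call. This is a minimal sketch under assumptions: the class name is hypothetical, a LongAdder stands in for doOneFunctionCall() so the number of calls can be counted, and the iteration count is reduced so it runs quickly.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class ClaimCounterDemo {
    // Performs exactly `total` simulated calls across `threads` workers
    // and returns how many calls were actually made.
    static long run(long total, int threads) throws InterruptedException {
        AtomicLong remaining = new AtomicLong(total);
        LongAdder performed = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // getAndDecrement() returns the value before the decrement,
                // so the check and the claim happen in one atomic step.
                while (remaining.getAndDecrement() > 0) {
                    performed.increment(); // stand-in for doOneFunctionCall()
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return performed.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000, 5)); // prints 100000
    }
}
```

The counter itself may end slightly below zero, since each worker makes one final failed claim, but no extra calls are performed. This fixes correctness only; every call still touches the shared counter, so claiming iterations in batches is the usual way to reduce contention.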
The nice thing about an ExecutorService is that it logically decouples the work being done from the threads that are doing it. You can submit more jobs than you have threads, and the ExecutorService will execute them as soon as it is able:
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 1000000; i++) {
    threadPool.submit(new Runnable() {
        public void run() {
            doOneFunctionCall();
        }
    });
}
This might be balancing your work a bit too much in the other direction: creating too many short-lived Runnable objects. You can experiment to see what gives you the best balance between distributing the work and performing the work quickly:
ExecutorService threadPool = Executors.newFixedThreadPool(5);
for (int i = 0; i < 1000; i++) {
    threadPool.submit(new Runnable() {
        public void run() {
            for (int j = 0; j < 1000; j++) {
                doOneFunctionCall();
            }
        }
    });
}

Look at ForkJoinPool. What you are attempting is called divide-and-conquer. In fork/join you set the number of threads to 5. Each thread has a queue of pending tasks. You can divide the tasks evenly among the threads' queues, and when a thread runs out of work it steals work from another thread's queue. This way you don't need the AtomicLong.
There are many examples of using this class. If you need more info, let me know.
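A minimal fork/join sketch of this idea, under stated assumptions: the Monte Carlo computation is taken to be a pi estimate (the question does not say what doOneFunctionCall() does), and the class name, the estimate helper, and the 10,000-iteration threshold are all illustrative choices. The task splits its iteration count in half until a chunk is small enough to run directly, and idle workers steal pending halves.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.ThreadLocalRandom;

class PiEstimate extends RecursiveTask<Long> {
    private static final long THRESHOLD = 10_000; // assumed chunk size
    private final long iterations;

    PiEstimate(long iterations) {
        this.iterations = iterations;
    }

    @Override
    protected Long compute() {
        if (iterations <= THRESHOLD) {
            // Small enough: count random points that fall inside the unit circle.
            long hits = 0;
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            for (long i = 0; i < iterations; i++) {
                double x = rnd.nextDouble();
                double y = rnd.nextDouble();
                if (x * x + y * y <= 1.0) hits++;
            }
            return hits;
        }
        // Too large: split in two. The forked half can be stolen by an idle worker.
        long half = iterations / 2;
        PiEstimate left = new PiEstimate(half);
        PiEstimate right = new PiEstimate(iterations - half);
        left.fork();
        long rightHits = right.compute();
        return left.join() + rightHits;
    }

    static double estimate(long iterations, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            long hits = pool.invoke(new PiEstimate(iterations));
            return 4.0 * hits / iterations;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(estimate(1_000_000, 5));
    }
}
```

The threshold plays the same role as the batch size in the chunked ExecutorService approach: large enough to amortize task overhead, small enough that there is always something left to steal.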

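An elegant approach to avoid the creation of 1B queued tasks is to use a synchronous queue and a ThreadPoolExecutor. Note that with the default rejection policy, submit does not block when all five threads are busy; it throws RejectedExecutionException. Passing a handler such as ThreadPoolExecutor.CallerRunsPolicy makes the submitting thread run the task itself instead, which throttles submission until a worker becomes available.
I didn't test actual performance though.
BlockingQueue<Runnable> queue = new SynchronousQueue<>();
ExecutorService threadPool = new ThreadPoolExecutor(5, 5,
        0L, TimeUnit.MILLISECONDS,
        queue,
        new ThreadPoolExecutor.CallerRunsPolicy());
for (int i = 0; i < 1000000000; i++) {
    threadPool.submit(new Runnable() {
        public void run() {
            doOneFunctionCall();
        }
    });
}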

Create and add Runnable only when one/more of the worker Thread is available..?

I am executing millions of iterations and I want to parallelize them, so I decided to submit each iteration as a task to a thread pool.
Now, if I add all the iterations to the thread pool at once, it might throw an OutOfMemoryError. I want to handle that gracefully, so is there any way to know whether a worker thread in the thread pool is available?
Once one is available, I would add the next Runnable for that worker thread.
for (long i = 0; i < 10000000000L; i++) { // the bound does not fit in an int
    executor.submit(new Task(i));
}
Each of those tasks takes only about 1 second to complete.

Why don't you set a limit on how many tasks can be in flight at once? For example:
Set<Future<?>> futures = new HashSet<>();
int maxConcurrentTasks = 1000;
int totalTasks = 100000000;
int next = 0;
while (next < totalTasks || !futures.isEmpty()) {
    // Top up until the limit of in-flight tasks is reached.
    while (futures.size() < maxConcurrentTasks && next < totalTasks) {
        futures.add(executor.submit(new Task(next++)));
    }
    // Drop finished tasks to free slots for new submissions.
    futures.removeIf(Future::isDone);
}

You'll want to use something like this:
ArrayBlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(MAX_PENDING_TASKS);
// Declared as ExecutorService because the plain Executor interface has no submit() method.
ExecutorService executor = new ThreadPoolExecutor(MIN_THREADS, MAX_THREADS, IDLE_TIMEOUT, TimeUnit.SECONDS, queue, new ThreadPoolExecutor.CallerRunsPolicy());
for (long i = 0; i < 10000000000L; i++) {
    executor.submit(new Task(i));
}
Basically you create a thread pool with min/max threads and an array backed queue. When you hit the limit of pending tasks, the "caller runs policy" kicks in and your main thread ends up running the next task (giving time for your other tasks to complete and open slots in the queue).
Since you've stated that your tasks are short lived, this seems like an optimal strategy.
The values for MAX_PENDING_TASKS and MIN_THREADS are something you can fiddle with to figure out what the optimal values are for your workload, but MAX_PENDING_TASKS should be at least twice MIN_THREADS and probably more like 10 to 100 times.
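To make the effect of the caller-runs policy visible, here is a small self-contained sketch. The pool size of 2, the queue capacity of 8, and the class name are arbitrary assumptions for illustration. It counts how many tasks ended up running on the submitting thread because the queue was full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CallerRunsDemo {
    // Returns {tasks completed, tasks that ran on the submitting thread}.
    static int[] run(int tasks) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        AtomicInteger ranOnCaller = new AtomicInteger();
        Thread caller = Thread.currentThread();

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(8), // at most 8 tasks pending
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                if (Thread.currentThread() == caller) {
                    ranOnCaller.incrementAndGet(); // queue was full at submit time
                }
                completed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new int[] { completed.get(), ranOnCaller.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(10_000);
        System.out.println(result[0] + " completed, " + result[1] + " on the caller");
    }
}
```

No task is dropped and memory stays bounded by the queue capacity. The trade-off is that while the submitting thread runs a task, it is not feeding the pool, which matters more for long tasks than for short ones.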

You should use java.lang.Runtime
The biggest memory issue is probably going to be your Object creation, not in adding them to your Executor, so that's where you should be calling Runtime.getRuntime().freeMemory().

Java Concurrency in Practice: race condition in BoundedExecutor?

There's something odd about the implementation of the BoundedExecutor in the book Java Concurrency in Practice.
It's supposed to throttle task submission to the Executor by blocking the submitting thread when there are enough tasks either queued or running in the Executor.
This is the implementation (after adding the missing rethrow in the catch clause):
public class BoundedExecutor {
    private final Executor exec;
    private final Semaphore semaphore;
    public BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }
    public void submitTask(final Runnable command) throws InterruptedException, RejectedExecutionException {
        semaphore.acquire();
        try {
            exec.execute(new Runnable() {
                @Override public void run() {
                    try {
                        command.run();
                    } finally {
                        semaphore.release();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            semaphore.release();
            throw e;
        }
    }
}
When I instantiate the BoundedExecutor with an Executors.newCachedThreadPool() and a bound of 4, I would expect the number of threads instantiated by the cached thread pool to never exceed 4. In practice, however, it does. I've gotten this little test program to create as many as 11 threads:
public static void main(String[] args) throws Exception {
    class CountingThreadFactory implements ThreadFactory {
        int count;
        @Override public Thread newThread(Runnable r) {
            ++count;
            return new Thread(r);
        }
    }
    List<Integer> counts = new ArrayList<Integer>();
    for (int n = 0; n < 100; ++n) {
        CountingThreadFactory countingThreadFactory = new CountingThreadFactory();
        ExecutorService exec = Executors.newCachedThreadPool(countingThreadFactory);
        try {
            BoundedExecutor be = new BoundedExecutor(exec, 4);
            for (int i = 0; i < 20000; ++i) {
                be.submitTask(new Runnable() {
                    @Override public void run() {}
                });
            }
        } finally {
            exec.shutdown();
        }
        counts.add(countingThreadFactory.count);
    }
    System.out.println(Collections.max(counts));
}
I think there's a tiny time frame between the release of the semaphore and the task ending, in which another thread can acquire a permit and submit a task while the releasing thread hasn't finished yet. In other words, it has a race condition.
Can someone confirm this?

BoundedExecutor was indeed intended as an illustration of how to throttle task submission, not as a way to place a bound on thread pool size. There are more direct ways to achieve the latter, as at least one comment pointed out.
But the other answers don't mention the text in the book that says to use an unbounded queue and to
set the bound on the semaphore to be equal to the pool size plus the
number of queued tasks you want to allow, since the semaphore is
bounding the number of tasks both currently executing and awaiting
execution. [JCiP, end of section 8.3.3]
By mentioning unbounded queues and pool size, we were implying (apparently not very clearly) the use of a thread pool of bounded size.
What has always bothered me about BoundedExecutor, however, is that it doesn't implement the ExecutorService interface. A modern way to achieve similar functionality and still implement the standard interfaces would be to use Guava's listeningDecorator method and ForwardingListeningExecutorService class.
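The book's advice quoted above can be sketched as follows: pair the semaphore with a pool of bounded size and set the number of permits to the pool size plus the number of queued tasks to allow. This is an illustration under assumptions, not the book's code: the class and method names are hypothetical, the throttling logic is inlined rather than wrapped in a BoundedExecutor, and the tasks are empty as in the question's test program.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedFixedPoolDemo {
    // Submits `tasks` empty tasks through a semaphore-throttled fixed pool
    // and returns how many threads the pool created.
    static int threadsCreated(int poolSize, int queued, int tasks) throws InterruptedException {
        AtomicInteger created = new AtomicInteger();
        ThreadFactory counting = r -> {
            created.incrementAndGet();
            return new Thread(r);
        };
        // The pool bounds the threads; the semaphore bounds running plus queued tasks.
        ExecutorService exec = Executors.newFixedThreadPool(poolSize, counting);
        Semaphore permits = new Semaphore(poolSize + queued);
        try {
            for (int i = 0; i < tasks; i++) {
                permits.acquire(); // blocks the submitter when the bound is reached
                try {
                    exec.execute(() -> {
                        try {
                            // empty task, as in the question's test program
                        } finally {
                            permits.release();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    permits.release();
                    throw e;
                }
            }
        } finally {
            exec.shutdown();
        }
        exec.awaitTermination(1, TimeUnit.MINUTES);
        return created.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(threadsCreated(4, 16, 20_000) + " threads created");
    }
}
```

The early release() inside the task can still let a new submission arrive before the worker is fully idle, but with a fixed pool that submission simply waits in the queue instead of triggering a new thread.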

You are correct in your analysis of the race condition. There are no synchronization guarantees between the ExecutorService and the Semaphore.
However, I do not know whether throttling the number of threads is what the BoundedExecutor is meant for. I think it is meant more for throttling the number of tasks submitted to the service. Imagine you have 5 million tasks to submit, and submitting more than 10,000 of them at once makes you run out of memory.
If only 4 threads will ever be running at any given time, why would you want to queue up all 5 million tasks? You can use a construct similar to this one to throttle the number of tasks queued up at any given time. What you should get out of this is that at any given time only 4 tasks are running.
Obviously, the way to bound the number of threads is to use an Executors.newFixedThreadPool(4).

I see as many as 9 threads created at once. I suspect there is a race condition that causes more threads to be created than required.
This could be because there is work to be done before and after running the task. This means that even though only 4 threads are inside your block of code at once, other threads may be finishing a previous task or getting ready to start a new one.
That is, the thread calls release() while it is still running. Even though it's the last thing you do, it's not the last thing the thread does before acquiring a new task.

Java: ExecutorService less efficient than manual Thread executions?

I've got a multi-threaded application. When using Thread.start() to manually start threads every concurrent thread uses exactly 25% CPU (or exactly one core - this is on a quad core machine). So if I run two threads CPU usage is exactly 50%.
When using ExecutorService to run threads however, there seems to be one "ghost" thread consuming CPU resources! One Thread uses 50% instead of 25%, two thread use 75%, etc.
Could this be some kind of windows task manager artefact?
The ExecutorService code is:
ExecutorService executor = Executors.newFixedThreadPool(threadAmount);
for (int i = 1; i < 50; i++) {
    Runnable worker = new ActualThread(i);
    executor.execute(worker);
}
executor.shutdown();
while (!executor.isTerminated()) {
}
System.out.println("Finished all threads");
and Thread.start() code is:
ActualThread one= new ActualThread(2,3);
ActualThread two= new ActualThread(3,4);
...
Thread threadOne = new Thread(one);
Thread threadTwo = new Thread(two);
...
threadOne.start();
threadTwo.start();
...

Here's your problem:
while (!executor.isTerminated()) {
}
Your "main" method is spinning the CPU doing nothing. Use invokeAll() instead, and your thread will block without a busy wait.
final ExecutorService executor = Executors.newFixedThreadPool(threadAmount);
final List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
for (int i = 1; i < 50; i++) {
    tasks.add(Executors.callable(new ActualThread(i)));
}
executor.invokeAll(tasks); // blocks until every task has completed
executor.shutdown(); // lets the pool's worker threads exit; otherwise they can keep the JVM alive
System.out.println("Finished all threads");
Since invokeAll() takes a collection of Callable, note the use of the helper method Executors.callable(). invokeAll() also returns a list of Futures for the tasks, which is useful if the tasks actually produce something you want as output.
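As a sketch of that last point, here is invokeAll() used with tasks that return values. The squaring tasks and the class name are made-up stand-ins for real work; the loop bounds mirror the question's 1 to 49.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllDemo {
    // Squares the numbers 1..49 in parallel and sums the results.
    static long sumOfSquares(int threads) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i < 50; i++) {
                final long n = i;
                tasks.add(() -> n * n);
            }
            long sum = 0;
            // invokeAll() blocks until all tasks are done, so get() returns immediately.
            for (Future<Long> future : executor.invokeAll(tasks)) {
                sum += future.get();
            }
            return sum;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(4)); // prints 40425
    }
}
```

The returned list has one Future per task, in the same order as the input collection, so results can be matched back to their tasks by index.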
