I know that Thread and Task are in different abstraction-level.But anyway,I'm still confused that what's the relationship of them.And,by the way,I think that the Task tells how to do a job and the Thread actually excute the job according to a Task instance.Is my understanding correct?thank u^
I assume by Task you mean Runnable and Callable. The relationship is simple:
Thread might be used to execute multiple tasks
might - because you don't need a separate thread to execute tasks (well, technically, everything runs inside a thread - you don't need a separate one)
multiple - thread can be reused; it can run multiple tasks from a collection like queue
Typically one thread executes one Runnable passed to Thread constructor or multiple Callables passed to ExecutorService (wrapping thread pool in most cases).
If by Task you mean something like this, then the difference is that the task is used to run some thread-like code execution, but has additional properties, such as when to run it, how many times, and the option to cancel its execution, whereas a thread will just go ahead and run once immediately.
A task is rather abstract it can be implemented as a process or a thread.
Your understanding is correct.
We can do the analogy with workflow patterns where tasks are something that needs to be done in a process and threads are resources used to process or execute them.
Related
In my Java application I have a Runnable such as:
this.runner = new Runnable({
#Override
public void run() {
// do something that takes roughly 5 seconds.
}
});
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that the thread will completes its computation. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed length (bounded) task queue and to cause submitting threads to execute submitted tasks itself themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPool.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks even if you do not do those specific tasks to ensure submitted tasks are executed.
I recommend you to look at the Concurrency API. There are numerous pre-defined methods for general use. By using ExecutorService you can call the shutdown method after submitting tasks to the executor which stops accepting new tasks, waits for previously submitted tasks to execute, and then terminates the executor.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
In my Java application I have a Runnable such as:
this.runner = new Runnable({
#Override
public void run() {
// do something that takes roughly 5 seconds.
}
});
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that the thread will completes its computation. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed length (bounded) task queue and to cause submitting threads to execute submitted tasks itself themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPool.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks even if you do not do those specific tasks to ensure submitted tasks are executed.
I recommend you to look at the Concurrency API. There are numerous pre-defined methods for general use. By using ExecutorService you can call the shutdown method after submitting tasks to the executor which stops accepting new tasks, waits for previously submitted tasks to execute, and then terminates the executor.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can get the sub tasks from other tasks, so it's able to use less threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it running on another thread. But with normal ExecutorService, I have to pass the pool to the task, or make it a static, so inside the task, I can call pool.submit(newTask)
Do I miss something?
(You can view the living code from https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptionally different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In opposite, the fork/join framework reuses the thread in the meantime to execute other tasks, so it won't deadlock although the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.
I have 2 threads T1 and T2 ,both have different jobs so usually we prefer to accomplish this task by thread Joins.
But we can do this with out using join(). We can add T2 thread's code inside T1 thread.
What difference does this make ?
Joining a thread means that one waits for the other to end, so that you can safely access its result or continue after both have finished their jobs.
Example: if you start a new thread in the main thread and both do some work, you'd join the main thread on the newly created one, causing the main thread to wait for the second thread to finish. Thus you can do some work in parallel until you reach the join.
If you split a job into two parts which are executed by different threads you may get a performance improvement, if
the threads can run independently, i.e. if they don't rely on each other's data, otherwise you'd have to synchronize which costs performance
the JVM is able to execute multiple threads in parallel, i.e. you have a hyperthreading/multicore machine and the JVM utilizes that
usually we prefer to accomplish this task by thread Joins.
No we don't. We accomplish this task by starting two threads. There is no obligation to use join() so there is no 'should' about it. If you want to pause the current thread while another thread completes, do so. If you don't, don't.
If you call T1.join(); from T2 it will wait for T1 to die (finish). It is a form of thread synchronization, but from what you describe you can simply fire of two thread and simply do not use join. If you use two threads then the work will be done in parallel, if you put the code only in one thread then the work will be done sequentially.
Here is the reason to use join: You use it when final result depends on result of two tasks which could run at the same time.
Example1:
After user clicks submit button the program has to call two external webservices to update their respective systems. It can be done at the same time that is why we would create a separate thread for one of webservices.
The user will sit before the screen and wait for a notification: Your submission is OK! The screen should say OK only after both threads finished.
Two things.
Join is used only when one thread must wait for the open to finish (lets say thread A prepares a file and thread B cannot continue until the file is ready). There are instance where threads are independent and no join is needed (for example most of daemon threads).
With threading you get several things:
- mainly, independence in the order of execution. Lets say that you have a program that when you push a button does some heavy processing. If you do that processing in the main thread, you GUI will freeze until the task is finished. If you do the processing in another thread, then the GUI thread is "freed" and the GUI keeps working.
- in some (most) of modern computers, creating several threads could allow the OS to use the different cores to serve different threads, improving performance.
The drawback is bigger complexity, as you need information of other threads execution state.
You could use something like a java.util.concurrent.CountDownLatch, eg:
CountDownLatch doneSignal = new CountDownLatch(2);
and have each thread countDown() when they're done, so a main thread knows when both threads have completed.
using Join also like we can add the T2 thread's code inside T1 thread
join() like the method name implies waits for the thread to die and joins it at the end of execution. You can add one thread's code inside another but that would destroy the purpose of using 2 separate threads to run your jobs concurrently. Placing one code after the other would run your statements in sequence. There is no concurrency.
When in doubt, consult the javadocs - http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html#join%28%29
If T1 and T2 do different tasks which do not depend on state changes caused by each other - you should not join them to reap advantages of parallel execution. In case there are state dependenices you should synchronize both threads using mechanisms like wait/notify or even .Join() depending on your use case.
And as for, combining the run() methods of both threads, it's entirely left to you. I mean, you should understand why both threads are of different "types" (as they have different run() body) in the first place . It's a design aspect and not a performance aspect.
All the parallel threads typically needs to join at some point in the code thereby allowing them to wait until all threads terminate. After this point typically the serial processing continues. This ensures proper synchronisation between peer threads so that subsequent serial code does not begin abruptly before all parallel threads complete the collective activity.
The main difference is when we join T2 thread with T1 ,the time T2 is executing the job can be utilised by T1 also ,that means they will do different job parllely.But this cann't happen when you include the T2 thread code inside T1 thread.
I'm looking for a java thread-pool, that won't run more threads simultaneously than there are cores in the system. This service is normally provided by a ThreadPoolExecutor using a BlockingQueue.
However, if a new thread is scheduled to execute, I want the new thread to pre-empt one of the already running threads, and add the the pre-empted thread (in a suspended state) to a task queue, so it can be resumed as soon as the new thread is finished.
Any suggestions?
I would make a subclass of ThreadPoolExecutor.
When you setup your ThreadPoolExecutor you want to set the corePoolSize and the maximumPoolSize to Runtime.getRuntime().availableProcessors() (Look at Executors.newFixedThreadPool() to see why this works).
Next you want to make sure that your Queue also implements Deque. LinkedBlockingDeque is an example but you should shop around to see which one will work best for you. A Deque allows you to get stack like LIFO behavior which is exactly what you want.
Since everything (submit(), invokeAll()) funnels through execute() you will want to override this method. Basically do what you described above:
Check if all threads are running. If not simply start the new runnable on an available thread. If all the threads are already running then you need to find the one running the oldest runnable, stop the runnable, re-queue the runnable somewhere (maybe at the beginning?), and then start your new runnable.
The idea of a ThreadPoolExecutor is to avoid all of the expensive actions related to creating and destroying a thread. If you absolutely insist on preempting the running tasks, then you won't get that from the default API.
If you are willing to allow the running tasks to complete and instead only preempt the tasks which have not begun execution, then you can use a BlockingQueue implementation which works like a Stack (LIFO).
You can also have tasks 'preempt' other tasks by using different executors with different thread priorities. Essentially, if the OS supports time-slicing, then the higher priority executor gets the time-slice.
Otherwise, you need a custom implementation which manages execution. You could use a SynchronousQueue and have P worker threads waiting on it. If a client calls execute and SynchronousQueue.offer fails, then you would have to create a special worker Thread which grabs one of the other Threads and flags them to halt before executing and again flags them to resume after executing.