I'm almost new to Java. I know multithreading is the action of separating program into several tasks so that they can run concurrently. I have two problems with this concepts.
First of all, it's been said that application server creates a thread per each request. I can't associate this per-request-thread with program's threads. Suppose a program in which there are 5 threads to do things concurrently. How that single thread per request is going to deal with the 5 threads of that program?
Secondly, I have problem grasping the idea of thread pool. Is it about the threads that application server creates per request or it's regarding programs's threads that do tasks concurrently?
I have problem grasping the idea of thread pool.
A simple thread pool is a collection of running threads (a.k.a., worker threads) in which, each thread continually tries to take a task object from a BlockingQueue, and when it gets one, it executes the task, and then goes back to the queue to wait for another one.
A task is an object with some well-known method that the worker calls in order to "execute" the task. E.g., in the thread pools defined by the Java standard library, task objects either are Runnable instances or Callable instances, and the worker executes a task by calling task.run() or by task.call().
Related
In my Java application I have a Runnable such as:
this.runner = new Runnable({
#Override
public void run() {
// do something that takes roughly 5 seconds.
}
});
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that the thread will completes its computation. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed length (bounded) task queue and to cause submitting threads to execute submitted tasks itself themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPool.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks even if you do not do those specific tasks to ensure submitted tasks are executed.
I recommend you to look at the Concurrency API. There are numerous pre-defined methods for general use. By using ExecutorService you can call the shutdown method after submitting tasks to the executor which stops accepting new tasks, waits for previously submitted tasks to execute, and then terminates the executor.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
In my Java application I have a Runnable such as:
this.runner = new Runnable({
#Override
public void run() {
// do something that takes roughly 5 seconds.
}
});
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that the thread will completes its computation. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed length (bounded) task queue and to cause submitting threads to execute submitted tasks itself themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPool.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks even if you do not do those specific tasks to ensure submitted tasks are executed.
I recommend you to look at the Concurrency API. There are numerous pre-defined methods for general use. By using ExecutorService you can call the shutdown method after submitting tasks to the executor which stops accepting new tasks, waits for previously submitted tasks to execute, and then terminates the executor.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can get the sub tasks from other tasks, so it's able to use less threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it running on another thread. But with normal ExecutorService, I have to pass the pool to the task, or make it a static, so inside the task, I can call pool.submit(newTask)
Do I miss something?
(You can view the living code from https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptionally different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In opposite, the fork/join framework reuses the thread in the meantime to execute other tasks, so it won't deadlock although the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.
As we create a Thread pool using Java's Executor service and submit threads to this thread pool, what is the order in which those threads get executed?
I want to ensure that threads submitted first, execute first.
For example, in the code below, I want first 5 threads to get executed first, followed by the next 5 threads and so on...
// Create a thread pool of 5 threads.
ScheduledExecutorService exService = Executors.newScheduledThreadPool(5, new ModifiedThreadFactory("ReadThreadPool"));
// Create 100 threads.
MyThread[] threads = createMyThreads(100);
// Submit these 100 threads to thread pool for execution.
for(MyThread thread : threads) {
exService.submit(thread);
}
Does Java's Thread Pool provide any API for this purpose, or do we need to implement a FIFO queue at our end to achieve this.
If Java's thread pool does not provide any such functionality, I am really interested to understand the reason behind the non-existence of this functionality as it appears like a very common use-case to me.
Is it technically not possible (which I think is quite unlikely), or is it just a miss?
That's the default behavior. ScheduledThreadExecutor (that you're using although you're not scheduling anything) extends from ThreadPoolExecutor. Tasks submitted to a ThreadPoolExecutor are stored in a BlockingQueue until one thread is available to take them and execute them. And queues are FIFO.
This is decscribed in details in the javadoc.
Threads do not get executed. Threads are the entities running taska like Runnable and Callable . Submiting such a task to a executor service will put it in it's inner BlockingQueue until it gets picked up by a thread from it's thread pool. This will still tell you nothing about the order of execution as different classes can do different things while implementing Runnable
I have 2 threads T1 and T2 ,both have different jobs so usually we prefer to accomplish this task by thread Joins.
But we can do this with out using join(). We can add T2 thread's code inside T1 thread.
What difference does this make ?
Joining a thread means that one waits for the other to end, so that you can safely access its result or continue after both have finished their jobs.
Example: if you start a new thread in the main thread and both do some work, you'd join the main thread on the newly created one, causing the main thread to wait for the second thread to finish. Thus you can do some work in parallel until you reach the join.
If you split a job into two parts which are executed by different threads you may get a performance improvement, if
the threads can run independently, i.e. if they don't rely on each other's data, otherwise you'd have to synchronize which costs performance
the JVM is able to execute multiple threads in parallel, i.e. you have a hyperthreading/multicore machine and the JVM utilizes that
usually we prefer to accomplish this task by thread Joins.
No we don't. We accomplish this task by starting two threads. There is no obligation to use join() so there is no 'should' about it. If you want to pause the current thread while another thread completes, do so. If you don't, don't.
If you call T1.join(); from T2 it will wait for T1 to die (finish). It is a form of thread synchronization, but from what you describe you can simply fire of two thread and simply do not use join. If you use two threads then the work will be done in parallel, if you put the code only in one thread then the work will be done sequentially.
Here is the reason to use join: You use it when final result depends on result of two tasks which could run at the same time.
Example1:
After user clicks submit button the program has to call two external webservices to update their respective systems. It can be done at the same time that is why we would create a separate thread for one of webservices.
The user will sit before the screen and wait for a notification: Your submission is OK! The screen should say OK only after both threads finished.
Two things.
Join is used only when one thread must wait for the open to finish (lets say thread A prepares a file and thread B cannot continue until the file is ready). There are instance where threads are independent and no join is needed (for example most of daemon threads).
With threading you get several things:
- mainly, independence in the order of execution. Lets say that you have a program that when you push a button does some heavy processing. If you do that processing in the main thread, you GUI will freeze until the task is finished. If you do the processing in another thread, then the GUI thread is "freed" and the GUI keeps working.
- in some (most) of modern computers, creating several threads could allow the OS to use the different cores to serve different threads, improving performance.
The drawback is bigger complexity, as you need information of other threads execution state.
You could use something like a java.util.concurrent.CountDownLatch, eg:
CountDownLatch doneSignal = new CountDownLatch(2);
and have each thread countDown() when they're done, so a main thread knows when both threads have completed.
using Join also like we can add the T2 thread's code inside T1 thread
join() like the method name implies waits for the thread to die and joins it at the end of execution. You can add one thread's code inside another but that would destroy the purpose of using 2 separate threads to run your jobs concurrently. Placing one code after the other would run your statements in sequence. There is no concurrency.
When in doubt, consult the javadocs - http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html#join%28%29
If T1 and T2 do different tasks which do not depend on state changes caused by each other - you should not join them to reap advantages of parallel execution. In case there are state dependenices you should synchronize both threads using mechanisms like wait/notify or even .Join() depending on your use case.
And as for, combining the run() methods of both threads, it's entirely left to you. I mean, you should understand why both threads are of different "types" (as they have different run() body) in the first place . It's a design aspect and not a performance aspect.
All the parallel threads typically needs to join at some point in the code thereby allowing them to wait until all threads terminate. After this point typically the serial processing continues. This ensures proper synchronisation between peer threads so that subsequent serial code does not begin abruptly before all parallel threads complete the collective activity.
The main difference is when we join T2 thread with T1 ,the time T2 is executing the job can be utilised by T1 also ,that means they will do different job parllely.But this cann't happen when you include the T2 thread code inside T1 thread.