Does multithreading cause each task to take longer? - java

I'm new to multithreading... I understand multithreading is used to improve performance, but how can that be so if the processor is already working as fast as it can for a single thread?
To explain:
In a single-threaded environment, User A starts a task which takes 1 second - the task is completed in 1 second.
User B starts the same task a fraction of a second later and has to wait for User A, so User B's task completes in almost 2 seconds. Now, if this were a multithreaded environment, wouldn't both tasks run simultaneously, causing both to take 2 seconds? ...e.g. part of task A done, then part of task B, then part of task A again, then part of task B, ... until eventually both tasks complete in around two seconds?
Is it only faster if there is more than one processor? Or should multithreading be reserved for when there is a big task being dealt with and smaller tasks need to pop in and out of existence during this time?

If the task is 100% CPU bound, and you have only one CPU core, then multithreading will make things slower. If you have more than one CPU core, clearly you can allocate as many threads as you have cores to improve performance. If the tasks interact with anything external (I/O) then multithreading can increase performance to a point. While one thread is waiting for I/O to complete, other threads can be performing CPU-based processing.
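As a rough illustration of that I/O point, here is a minimal sketch; the one-second sleep stands in for an I/O wait, and the timings are illustrative. With two threads, the two simulated waits overlap, so both tasks finish in roughly the time of one.

import java.util.concurrent.*;

public class IoOverlapDemo {
    public static void main(String[] args) throws Exception {
        Callable<Long> task = () -> {
            Thread.sleep(1000); // simulated I/O wait
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i; // a small CPU-bound portion
            }
            return sum;
        };
        ExecutorService pool = Executors.newFixedThreadPool(2);
        long start = System.nanoTime();
        Future<Long> a = pool.submit(task);
        Future<Long> b = pool.submit(task);
        a.get();
        b.get();
        System.out.printf("both tasks done in %d ms%n",
                (System.nanoTime() - start) / 1_000_000);
        pool.shutdown();
    }
}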
The classic example is a program that does computation and also displays a GUI. You update the GUI on one thread (the event thread) and do all processing on other "background" threads. The event thread does nothing but handle user interactions. If you don't do this then when a user-requested operation takes appreciable time the GUI stops responding. In this situation, you would run multithreaded even on a single-core system.
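A minimal Swing sketch of that pattern; the frame contents are illustrative. The long task runs on a background thread and only touches the GUI through SwingUtilities.invokeLater, so the event thread stays responsive.

import javax.swing.*;

public class BackgroundWorkDemo {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Demo");
            JLabel label = new JLabel("Idle");
            JButton button = new JButton("Start");
            button.addActionListener(e -> {
                label.setText("Working...");
                new Thread(() -> { // background thread for the slow work
                    try {
                        Thread.sleep(3000); // stands in for a long computation
                    } catch (InterruptedException ex) {
                        Thread.currentThread().interrupt();
                    }
                    SwingUtilities.invokeLater(() -> label.setText("Done")); // back on the event thread
                }).start();
            });
            frame.add(label, java.awt.BorderLayout.NORTH);
            frame.add(button, java.awt.BorderLayout.SOUTH);
            frame.pack();
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true);
        });
    }
}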
How to tune an application's use of threads depends on a lot of factors and could take up an entire textbook.

Now consider that your Task A needs a particular resource (e.g. a network file or user input) to complete its work, and that this resource is not currently available. In a single-threaded environment, Task A holds the CPU and simply waits for the resource to become available, so the CPU sits idle: we waste one important resource (CPU time) while waiting on another. In a multithreaded environment, when Task A waits for the resource, its thread is suspended until the resource becomes available, and the CPU time can be used efficiently to execute other tasks. Hope this helps :)

Related

Executing continuously running threads/tasks in a thread pool executor with context switching

I want to run n tasks continuously; however, those tasks are memory intensive, so I want only x of them to be active at a time. But ultimately all n tasks should run, by context switching between them.
In short, I want another implementation of FixedThreadPool, where extra tasks should also run with context switching.
Do we have a Thread Pool variant which achieves the same? Or any other way to implement it?
UPDATE: After reading a bit and reading the answer below, I decided to "divide and conquer", i.e. break the continuously running task into units of small, short-lived tasks and submit those again and again to a FixedThreadPool, as sketched below.
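A minimal sketch of that divide-and-conquer approach; the class and method names are hypothetical. Each unit of work resubmits itself to the pool, so n logical tasks take turns on x threads.

import java.util.concurrent.*;

public class ResubmittingTask implements Runnable {
    private final ExecutorService pool;
    private int remainingUnits;

    ResubmittingTask(ExecutorService pool, int units) {
        this.pool = pool;
        this.remainingUnits = units;
    }

    @Override
    public void run() {
        doOneUnitOfWork(); // a short, bounded slice of the task
        if (--remainingUnits > 0) {
            pool.submit(this); // yield the thread and queue up again
        }
    }

    private void doOneUnitOfWork() { /* ... */ }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // x tasks active at a time
        for (int i = 0; i < 20; i++) { // n logical tasks
            pool.submit(new ResubmittingTask(pool, 100));
        }
        // shutdown omitted: in the question's scenario the tasks run continuously
    }
}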
One could write a paper on the subject, but let us keep it simple and concise.
In short, I want another implementation of FixedThreadPool, where extra tasks should also run with context switching.
To achieve that, one would need a Thread Pool that allows you to explicitly set affinity between a Thread and a core. And (as far as I know) such a Thread Pool is not officially provided with Java. Which makes sense, since one of the goals of abstractions such as Thread Pools (in Java) is to raise the level of abstraction, even to the point of abstracting away the concept of a thread itself (i.e., Executor). Therefore, it does not come as a surprise that such a low-level feature (mapping threads to cores) is not provided out of the box.
Do we have a Thread Pool variant which achieves the same? Or any other way to implement it?
Unless you are running your code on a Non-Uniform Memory Access (NUMA) architecture, I don't see the benefit of such a low-level feature in the context of your program.
My use case is I have to run n tasks continuously. But as those tasks are memory intensive I want only x of them to be active at a time. But ultimately all those n tasks should run by context switching between them.
If you run n tasks on n Threads, and the hardware where the code is running has c cores, where n >> c, then inevitably the OS will map multiple threads to the same core. Hence, you get your context switching for free.
Finally, before actually opting to run more Threads than cores, profile your code accordingly. For instance, run the code with the same number of threads as cores, then keep doubling the number of threads until throughput stops scaling. It may well turn out that your code scales even with more threads than cores.
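For instance, here is a rough profiling sketch along those lines; runBatch is a hypothetical stand-in for your real workload, and the pool sizes probed are only a starting point.

import java.util.concurrent.*;

public class PoolSizeProbe {
    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        // probe pool sizes from the core count upward, doubling each time
        for (int threads = cores; threads <= cores * 8; threads *= 2) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            runBatch(pool); // submit the tasks and wait for them all to finish
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            System.out.printf("%d threads: %d ms%n", threads,
                    (System.nanoTime() - start) / 1_000_000);
        }
    }

    // hypothetical workload: replace the loop body with your real tasks
    static void runBatch(ExecutorService pool) throws Exception {
        Future<?>[] futures = new Future<?>[100];
        for (int i = 0; i < futures.length; i++) {
            futures[i] = pool.submit(() -> { /* one unit of work */ });
        }
        for (Future<?> f : futures) {
            f.get(); // wait for completion
        }
    }
}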

Why does a thread pool take tasks independently and not concurrently?

I am trying to get the basics of thread pools very solid. I learnt that a thread pool internally uses a blocking queue to 'steal' tasks and run them on the given threads in the pool. Meaning that if I had 10 tasks and 5 threads, it could run only 5 tasks at the same time, until 1 finishes ENTIRELY.
Question is: Why not concurrently? Why not just time-slice those 10 tasks?
What is the reason of this implementation?
Why not concurrently? Why not just time-slice those 10 tasks?
You can have a thread pool that is able to perform ten concurrent tasks. You just need to configure it to have at least ten worker threads. "Time-slicing" tasks is what threads do. What thread pools do is:
Allow your program to control the number of threads that it uses to perform "background" tasks, and
Allow your program to re-use threads, which can be much more efficient than creating a new thread for each new task, and then destroying the thread when the task is complete.
In order to "time-slice 10 tasks", those tasks need to be in 10 separate threads that run concurrently.
The time-slicing scheduling algorithm is implemented by the operating system, not by Java. Time slicing applies to threads in Java because Java threads are implemented as native operating system threads: every Java thread has a native thread of its own, and these threads are scheduled by the operating system as it sees fit.
There is no difference between "thread pool threads" and "raw threads" here. If you give an instance of Runnable to a thread (whether it's part of a thread pool or not) it will run from beginning to end, subject to the time slicing scheduling algorithm of the operating system.
So why not use thousands of threads, why even bother with thread pools? It turns out that operating system threads are a relatively expensive and scarce resource, and therefore so are Java threads.
Since operating system threads are so expensive, Project Loom is investigating adding lightweight user-space threads to Java. Some of the details in this answer may change when/if Loom gets merged into mainstream Java.
Some good answers but I thought I'd respond to your questions specifically.
I learnt that a thread pool internally uses a blocking queue to 'steal' tasks and run them on the given threads in the pool. Meaning that if I had 10 tasks and 5 threads, it could run only 5 tasks at the same time, until 1 finishes ENTIRELY.
If you configure your thread pool to have 5 threads (Executors.newFixedThreadPool(5)) then it will start 5 threads to run your jobs. Initially 5 jobs are given to the 5 threads to run concurrently (if your server has 5 CPUs available). Once one of the 5 jobs finishes, a 6th job will be immediately started on the idle thread. This continues until all 10 jobs have been run.
Question is: Why not concurrently? Why not just time-slice those 10 tasks? What is the reason of this implementation?
You can instead use a cached thread pool (Executors.newCachedThreadPool()) if you want, which will start a thread for each of the 10 jobs that you submit concurrently. This might work fine for 10 jobs but won't work well with 100,000 jobs – you would not want to start 100,000 threads. We use a fixed thread pool when we want to limit the number of jobs that run concurrently. It might seem like running 5 jobs concurrently would always be slower than running all 10 at once, but this isn't necessarily the case. There is a cost when the OS time-slices between the jobs, and the overall job throughput may be higher with 5 threads than with 10, depending on how many processors your hardware has. Limiting the number of concurrent jobs also does not stress your server as much and should make your application work better with other running applications.
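A minimal sketch contrasting the two pool types just mentioned (the sleep merely simulates work): the fixed pool runs at most 5 of the 10 jobs at once and queues the rest, while the cached pool starts a thread for every pending job.

import java.util.concurrent.*;

public class PoolComparison {
    public static void main(String[] args) {
        Runnable job = () -> {
            System.out.println(Thread.currentThread().getName() + " running");
            try {
                Thread.sleep(1000); // simulated work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        ExecutorService fixed = Executors.newFixedThreadPool(5);  // 5 run, 5 wait in the queue
        ExecutorService cached = Executors.newCachedThreadPool(); // one thread per pending job
        for (int i = 0; i < 10; i++) {
            fixed.submit(job);
            cached.submit(job);
        }
        fixed.shutdown();
        cached.shutdown();
    }
}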
See my answer here about scaling threads: Concept behind putting wait(),notify() methods in Object class

How does Thread.sleep(delay) react to actual parallel threads?

The link to the documentation says: Thread.sleep causes the current thread to suspend execution for a specified period
What does the term current thread mean? I mean, if the processor has only one core then it makes sense to coin one of the threads as the current thread, but if all the threads (say 4 of them) are running individually on separate cores, then which one is the current thread?
The "current thread" is the thread which calls Thread.sleep(delay).
Also if a thread sleeps, it does not block the entire CPU core. Some other thread can be run on the same CPU core while your thread is asleep.
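A minimal sketch of that behavior: the worker thread sleeps while the main thread keeps running, showing that Thread.sleep suspends only the thread that calls it.

public class SleepDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                System.out.println("worker: going to sleep");
                Thread.sleep(2000); // suspends only this (the calling) thread
                System.out.println("worker: awake");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        for (int i = 0; i < 4; i++) {
            System.out.println("main: still running"); // keeps printing while the worker sleeps
            Thread.sleep(500);
        }
        worker.join();
    }
}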
Every single command and method call you make has to be executed by some thread. From that thread's perspective, it is itself the current thread. In other words: Thread.sleep(delay) pauses the thread that executes the Thread.sleep() call.
Also, keep in mind that multi-threading and multiple cores only have a very distant relationship.
Even before multi-core CPUs were commonplace, pretty much every operating system supported heavily multi-threaded (or multi-tasking, which is basically the same thing for the purpose of this discussion) operation.
In modern OS this is done with a technique called preemptive multitasking. This basically means that the OS can forcibly pause the currently running process and allow another one to run for a short time, providing the illusion of actual parallel processing.
And since a lot of the time in a given process is often spent waiting for some external I/O (network, disk, ...), this even means you can use the CPU more efficiently: during the time one process would spend waiting for I/O, another process can do actual computation.
As an example at the time of writing this, my laptop has 1311 threads (most of which are probably sleeping and only a handful will actually run and/or wait to run), even though it has only 4 cores.
tl;dr: while multiple cores allow more than one thread to execute at the exact same time, you can have multi-threading even with a single core, and there's very little noticeable difference if you do (besides raw performance, obviously).
The name, "Current thread," was chosen for the convenience of the authors of the operating system, not for the authors of applications that have to run under the operating system.
In the source code for an operating system, it makes sense to have a variable current_thread[cpu_id] that points to a data structure that describes the thread that is running on that cpu_id at any given moment.
From the point-of-view of an application programmer, any system call that is supposed to do something to the "current thread," is going to do it to the thread that makes the call. If a thread that is running on CPU 3 calls Thread.sleep(n), the OS will look up current_thread[3] (i.e., the thread that made the call) and put that thread to sleep.
From the application point-of-view, Thread.sleep(n) is a function that appears to do nothing, and always takes at least n milliseconds to do it.
In general, you should substitute "the caller" or "the calling thread" any time you see "current thread" in any API document.
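A small sketch of that point: the same method reports a different current thread depending on which thread calls it. The class and thread names are illustrative.

public class CurrentThreadDemo {
    static void whoAmI() {
        System.out.println("current thread: " + Thread.currentThread().getName());
    }

    public static void main(String[] args) throws InterruptedException {
        whoAmI(); // prints "current thread: main"
        Thread t = new Thread(CurrentThreadDemo::whoAmI, "worker-1");
        t.start(); // prints "current thread: worker-1"
        t.join();
    }
}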

How to figure out how many threads I need in a parallel execution context?

I'm working on a redelivery system. The system attempts to execute an action; if the action fails, it tries to execute it again two times with an interval of five minutes between attempts. So I use an ExecutorService implementation to perform the first execution and a ScheduledExecutorService to schedule the other ones, depending on the result (failure).
What should I consider to figure out the number of threads I need? At the moment I use only a single-thread model (created by the newSingleThreadScheduledExecutor method).
Without knowing the details of the load your system is under, the environment it is running in, and how long it takes to process one message, it is hard to say how many threads you need. However, you can think of the following base principles:
Having many threads is bad, because you'll spend a significant amount of time on context switches, and the chance of starvation and wasted system resources is higher.
Each thread consumes some space in memory for its stack. On x64 it is typically 1MB per thread (adjustable with the JVM's -Xss option).
I would probably create 2 thread pools (one scheduled, one non-scheduled) for sending and redelivery, and test them under high load, varying the number of threads from 2 to 10 to see which number suits best.
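A minimal sketch of that two-pool setup; the pool sizes (4 and 2) and the names RedeliveryService and attempt are illustrative assumptions, not tuned or prescribed values.

import java.util.concurrent.*;

class RedeliveryService {
    private final ExecutorService sender = Executors.newFixedThreadPool(4);               // first attempts
    private final ScheduledExecutorService retrier = Executors.newScheduledThreadPool(2); // delayed retries

    void deliver(Runnable action) {
        sender.submit(() -> attempt(action, 0));
    }

    private void attempt(Runnable action, int retriesSoFar) {
        try {
            action.run();
        } catch (RuntimeException e) {
            if (retriesSoFar < 2) {
                // retry in five minutes, at most two more times in total
                retrier.schedule(() -> attempt(action, retriesSoFar + 1), 5, TimeUnit.MINUTES);
            }
        }
    }
}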
You should only need one thread, as only one action runs at a time. You could use a CachedThreadPool and not worry about it.

What happens to a thread as soon as it has completed its assigned task in Java?

I have been working on a project in which my program creates around 500 threads during execution. I find that my PC starts taking a huge load as soon as I execute the program, and it continues showing load after 75% of the threads have completed their job.
I want to know whether the threads whose work has finished were killed or not, and how Java deals with threads which have finished their job. Any help...
I find that my PC starts taking a huge load as soon as I execute the program. And it continues showing load after 75% of the threads have completed their job.
If 75% of the 500 threads have completed their job then that leaves 100+ threads that continue to run. 100 threads, if using a good amount of CPU, can more than swamp the processors on a box which I assume does not have 100s of cores. So your application may continue to show 100% CPU utilization until the number of running threads drops below the number of cores.
You should consider using a fixed-size thread pool instead of creating 500 concurrent threads. You then submit 500 tasks to the thread pool. This allows you to choose an appropriate number of threads to be running concurrently. The appropriate number of threads is highly dependent on the tasks being submitted. More CPU-bound tasks should use fewer threads while IO-bound tasks can use more. Doing some test runs with your application while adjusting the number of threads is the best way to optimize the value. I tend to start with 2 times the number of cores and then optimize from there.
// create a thread pool with 10 worker threads
ExecutorService threadPool = Executors.newFixedThreadPool(10);
// define your jobs somehow
for (MyRunnable job : jobsToDo) {
    threadPool.submit(job);
}
// once we have submitted all jobs, shut the pool down;
// already-submitted jobs still run to completion
threadPool.shutdown();
For more details, check out the ExecutorService tutorial.
I want to know whether the threads whose work has finished were killed or not, and how Java deals with threads which have finished their job. Any help...
The threads have most likely finished. After a thread leaves the run() method (either because it returned normally or threw an exception) it will no longer consume CPU, and its underlying native thread can be reaped or reused. If there are no references to the Thread object, its memory will eventually be reclaimed by the garbage collector.
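A small sketch of that lifecycle: after run() returns, the thread is TERMINATED and no longer consumes CPU; only the Thread object remains until it is garbage collected.

public class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> System.out.println("working"));
        System.out.println(t.getState()); // NEW
        t.start();
        t.join();                         // wait for run() to return
        System.out.println(t.getState()); // TERMINATED
        System.out.println(t.isAlive());  // false: the thread is dead, only the object remains
    }
}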
The JVM will garbage collect the Thread object, as long as there are no references to it and its run method has returned. A thread is dead once its run method returns; the object might still be in the heap, but it no longer has its own stack and does not do anything.
The only way your threads could still be alive is if they are still doing something, or if you forgot to clean up references to your Thread objects (but the latter is only a memory concern).
If you allocated your threads through a thread pool, they are returned to the pool after the execution of the task. In this case, they might not be released after the completion of the task.
We should not create many threads to accomplish our task; it can cause issues such as OutOfMemoryError. Creating a thread is also a costly operation, so we should think of a thread pool, i.e. ExecutorService, in which we reuse the same threads again and again.
But anyway, to answer your question: after threads finish their work they die automatically, and their Thread objects will be garbage collected; you don't need to do anything. Java initially provided methods like stop() and destroy(), but these are deprecated for good reason.
You can read about a thread's lifecycle. If the run method is over, then the thread should not be consuming your CPU.
Threads that have finished their jobs die. They won't consume any more CPU time.
You can use jstack to check how many threads are actively running in your Java process.
