While studying operating system concepts, I learnt that there are two types of threads: kernel level and user level.
I also learnt that kernel-level threads or processes can be executed in multiprocessor environments too.
I have a basic question regarding Java threads (being user-level threads):
can we use them to execute in a multiprocessor environment?
First, the answer is yes. You can make use of your multi-core processor's full power by creating multiple threads in Java.
As far as I know, the JVM employs a mixed thread model including both kernel threads and user threads. It has a strategy to decide when to create which type of thread. I believe that when system resources are abundant, it will tend to create a kernel thread and assign the Java thread object to run on it.
Sounds to me like you are talking about the good old Java 1.1 “green threads”. This was a kludge for operating systems that had no native thread support, or at least no stable support. This feature no longer exists in current JVM implementations, at least in the reference implementation from Oracle. Java threads are always kernel threads on these JVMs.
So the answer is, yes, Java threads will benefit from SMP aka Multicore CPUs. It requires the operating system to have a native thread implementation, but without it the entire SMP machine would not make much sense. And the JVM must be able to use it, which is the case in all common systems.
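To make this concrete, here is a minimal sketch (the class and method names are made up for this example, not taken from any library) that splits a CPU-bound sum across one thread per core reported by Runtime.availableProcessors(). Whether the threads really land on separate cores is up to the OS scheduler; the code only makes that possible.

```java
import java.util.ArrayList;
import java.util.List;

class MultiCoreSum {
    // Sums the integers in [from, to) on the calling thread.
    static long sumRange(long from, long to) {
        long total = 0;
        for (long i = from; i < to; i++) {
            total += i;
        }
        return total;
    }

    // Splits [0, n) into one chunk per worker thread and adds up the partial sums.
    static long parallelSum(long n, int workers) {
        long[] partial = new long[workers];
        List<Thread> threads = new ArrayList<>();
        long chunk = n / workers;
        for (int w = 0; w < workers; w++) {
            final int index = w;
            final long from = w * chunk;
            // The last worker also takes the remainder of the range.
            final long to = (w == workers - 1) ? n : from + chunk;
            Thread t = new Thread(() -> partial[index] = sumRange(from, to));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // after join(), the worker's write to partial[] is visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Processors visible to the JVM: " + cores);
        System.out.println("Sum: " + parallelSum(1_000_000L, cores));
    }
}
```

On a multi-core machine you would typically see several cores busy while this runs (for a larger n), but nothing in the Java code pins a thread to a particular core.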
Suppose we have a very complex task. I know that if we use one thread then in practice we will use one core, but if I divide the task into as many threads as there are processor cores, does the program necessarily run on all the cores?
Or is there no correlation between the number of threads and the number of cores used, and the JVM 'decides'?
Actually, it is typically the operating system that decides how many cores a Java application gets to use. The scheduling of native threads to cores is handled by the operating system [1]. The JVM has little (if any) say in thread scheduling.
But yes, a Java application won't necessarily get access to all of the cores. It will depend on what other applications, services, etc. on the system are doing. Indeed, the OS may provide ways for an administrator to externally limit the number of cores that may be used by a given application (Java or not), or to give one application priority over another.
... or is there no correlation between the number of threads and the number of cores used
There is a correlation, but not one that is particularly useful. A JVM won't (cannot) use more cores than there are native threads in existence (including the JVM's internal and GC threads).
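As a small illustration of how many threads exist even in a trivial program, here is a sketch (class and method names are invented for this example) that lists the threads visible at the Java level via Thread.getAllStackTraces(). Note the assumption: on HotSpot this usually shows helper threads such as the reference handler, but purely native threads such as GC workers generally do not appear in this list, so the real native thread count is typically higher.

```java
import java.util.Set;
import java.util.TreeSet;

class LiveThreads {
    // Returns the names of all live threads that the JVM exposes at the Java level.
    // Threads that share a name are collapsed into a single entry.
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Processors visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        // Even a program that never creates a thread usually shows more than "main".
        for (String name : liveThreadNames()) {
            System.out.println("live thread: " + name);
        }
    }
}
```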
It is also worth noting that the OS (typically) doesn't assign a core to a native thread that is not currently runnable.
Basil Bourque notes in his answer that Project Loom will bring significant improvements to threading in Java, if and when it is incorporated into the standard releases. However, Loom won't alter the fact that the number of physical cores assigned to an application's JVM at any given time is controlled / limited by the OS.
[1] Or the operating system's hypervisor. Things can get a bit complicated in a cloud computing environment, for instance.
The Answer by Stephen C is true today, where Java threads are typically implemented as native threads provided by the host operating systems. But things change with Project Loom technology being developed for a future version of Java.
Project Loom brings virtual threads (fibers) to the concurrency toolbox of Java. Many virtual threads will be mapped to each of the few platform/kernel threads. A virtual thread when blocked (waiting on a call to slow resources such as file I/O, network I/O, database access, etc.) will be “parked” (set aside) allowing some other virtual thread to execute for a while on the “real” platform/kernel thread.
This parked/unparked switching will be very fast, and take little memory (using a flexible growing/shrinking stack). This makes threads “cheap”, so cheap that you might reasonably be able to run millions of threads at a time.
Returning to your question, these virtual threads will be managed within the JVM rather than by the host OS. But underneath our virtual threads, we rely on the same platform/kernel threads used today, and those are ultimately controlled by the host OS rather than the JVM.
By default, when setting up an executor service backed by virtual threads via Executors.newVirtualThreadExecutor, we do not specify the number of platform/kernel threads to be used. That's handled by the implementation of your JVM.
Experimental builds of Project Loom are available now, based on early-access Java 17. The Loom team seeks feedback.
Is there a Java thread pool object which automatically load balances threads across the available cores or is this done for you by the JVM?
Since (most) JVMs use native threads, the scheduling of the threads is the responsibility of the operating system. There may well still be "green thread" implementations of the JVM (or, at least, options for them, especially on older JVMs), but since "green threads" are implemented by the JVM itself, they tend not to scale across cores. A primary goal of using native threads was multi-processor compatibility. The JVM doesn't typically run at a low enough level within the operating environment to have control over a resource like the CPUs of the machine.
I keep qualifying which JVM because while the vast majority of folks use the Oracle/OpenJDK JVM, there are other JVMs, older JVMs, JVMs on embedded hardware that do not behave as the Oracle/OpenJDK JVM does.
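For the thread pool part of the question, a common pattern is a fixed pool sized to the number of processors the JVM reports. The sketch below is only an illustration under that assumption (the class and method names are hypothetical): the pool hands tasks to its worker threads, and spreading those worker threads over cores is still left to the OS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolPerCore {
    // Submits the jobs 1*1, 2*2, ..., tasks*tasks to a fixed pool and sums the results.
    static int sumOfSquares(int tasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                final int n = i;
                jobs.add(() -> n * n);
            }
            int total = 0;
            // invokeAll() blocks until every job has finished.
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e);
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Pool size: " + cores);
        System.out.println("Sum of squares 1..10: " + sumOfSquares(10, cores));
    }
}
```

Sizing the pool to the core count is a reasonable starting point for CPU-bound work; for tasks that mostly wait on I/O a larger pool is often used.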
I am really curious about how the JVM works with threads!
In my searches on the internet, I found some material about RTSJ, but I don't know if it's the right directions for my answers.
Can someone give me directions, material, articles or suggestions about the JVM scheduling algorithm?
I am also looking for information about the default configuration of Java threads in the scheduler, like how long does it take for every thread in case of time-slicing.
I appreciate any help, thank you!
There is no single Java Virtual Machine; JVM is a specification, and there are multiple implementations of it, including the OpenJDK version and the Sun version of it, among others. I don't know for certain, but I would guess that any reasonable JVM would simply use the underlying threading mechanism provided by the OS, which would imply POSIX Threads (pthreads) on UNIX (Mac OS X, Linux, etc.) and would imply WIN32 threads on Windows. Typically, those systems use a round-robin strategy by default.
It doesn't. The JVM uses operating system native threads, so the OS does the scheduling, not the JVM.
A while ago I wrote some articles on thread scheduling from the point of view of Java. However, on mainstream platforms, threading behaviour essentially depends on underlying OS threading.
Have a look in particular at my page on what is Java thread priority, which explains how Java's priority levels map to underlying OS threading priorities, and how in practice this makes threads of different priorities behave on Linux vs Windows. A major difference discussed is that under Linux there's more of a relationship between thread priority and the proportion of CPU allocated to a thread, whereas under Windows this isn't directly the case (see the graphs).
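For reference, this is roughly what setting a priority looks like from the Java side. It is a minimal sketch with made-up names; the priority is only a hint that the JVM passes on to the OS, and as described above its practical effect differs between platforms.

```java
class PriorityHint {
    // Creates (but does not start) a thread with the given name and priority.
    static Thread newWorker(String name, int priority, Runnable body) {
        Thread t = new Thread(body, name);
        // Valid values run from Thread.MIN_PRIORITY (1) to Thread.MAX_PRIORITY (10).
        t.setPriority(priority);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> System.out.println(
                Thread.currentThread().getName()
                        + " runs with priority " + Thread.currentThread().getPriority());
        Thread low = newWorker("low", Thread.MIN_PRIORITY, work);
        Thread high = newWorker("high", Thread.MAX_PRIORITY, work);
        low.start();
        high.start();
        low.join();
        high.join();
    }
}
```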
I don't have commenting rights, so I am writing this here...
The JVM invokes pthreads (the generally used threading mechanism; other variants exist) for each corresponding request. But the scheduling here is done entirely by the host OS.
This is only the preferred approach, though; it is also possible for the JVM to schedule these threads itself. For example, Jikes RVM has options to override the OS's scheduling decisions. In it, threads are represented as RVMThread and can be scheduled/manipulated using classes in the org.jikesrvm.scheduler package.
Could someone please explain how a multi-threaded Java program (e.g. the Tomcat servlet container) is able to use all cores of the CPU when the JVM is only a single process on Linux? Is there any good in-depth article that describes the subject in detail?
EDIT #1: I'm not looking for advice on how to implement a multi-threaded program in Java. I'm looking for an explanation of how the JVM internally manages to use multiple cores on Linux/Windows while still being a single process on the OS.
EDIT #2: The best explanation I managed to find is that HotSpot (the Sun/Oracle JVM) implements threads as native threads on Linux using NPTL. So, more or less, each thread in Java is a lightweight process (native thread) on Linux. This is clearly visible using the ps -eLf command, which prints not only the process id (PID) but also the native thread id (LWP).
More details can be also found here:
http://www.velocityreviews.com/forums/t499841-java-5-threads-in-linux.html
Distinguishing between Java threads and OS threads?
EDIT #3: Wikipedia has short but nice entry on NPTL with some further references http://en.wikipedia.org/wiki/Native_POSIX_Thread_Library
The Linux kernel supports threads as first-class citizens. In fact, to the kernel a thread isn't much different from a process, except that it shares an address space with another thread/process.
Some old versions of ps even showed a separate process for each thread by default and newer versions can enable this behavior using the -m flag.
The JVM is a single process with many threads, and each thread can be scheduled on a different CPU core.
When Java software running inside the JVM asks for another thread the JVM starts another thread.
That is how the JVM manages to use multiple cores.
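You can observe the "one process, many threads" arrangement from inside Java. The following is a minimal sketch (names invented for this example; it assumes Java 9 or later for ProcessHandle) in which every worker thread reports the process id it sees. All of them report the same pid, because they are threads of the single JVM process.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class OneProcessManyThreads {
    // Starts the given number of threads and collects the process id each one observes.
    static Set<Long> pidsSeenBy(int threads) {
        Set<Long> pids = ConcurrentHashMap.newKeySet(); // thread-safe set
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> pids.add(ProcessHandle.current().pid()));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return pids;
    }

    public static void main(String[] args) {
        // Prints a set containing exactly one pid, however many threads ran.
        System.out.println("Process ids seen by 8 threads: " + pidsSeenBy(8));
    }
}
```

While such a program is running on Linux, the ps -eLf command mentioned earlier should show one PID with several LWP entries.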
If you use the concurrency library and split up your work as much as you can, the JVM should handle the rest.
Take a look at this http://embarcaderos.net/2011/01/23/parallel-processing-and-multi-core-utilization-with-java/
I would start by reading the Concurrency Tutorial.
In particular, it explains the differences (and relationship) between processes and threads.
On the architectures that I'm familiar with, the threads (including JVM-created threads) are managed by the OS. The JVM simply uses the threading facilities provided by the operating system.
When I create a multi-threaded program and use methods such as Wait or Signal to control threads, among other things, does the JVM control all the thread state changes, or does the underlying OS have anything to do with it?
It depends on the implementation of the JVM. Most modern JVMs (Sun's HotSpot, Oracle's JRockit, IBM's VMs) will use the operating system's threading model, as this gives the best performance.
Early implementations used green threads: the VM modelled the threads itself. This was typically used when the platform or operating system it was running on didn't support threading. For example, in Java 1.1, green threads were used on Solaris. At the time, the common pattern for using multiple cores/CPUs in Solaris was to use multiple processes; only later were threads added to the operating system.
The Java Language Specification does not specify how threads must be implemented, but in general, if the OS has threading support, modern JVMs will use the OS implementation. When there is no support in the OS, for example on low-end mobile phones or in a Java Card implementation, green threads will be used by the runtime.
In general, Java threads will map to OS threads and Java will make use of OS synchronisation primitives to implement synchronized/wait/notify/..., but the mapping is not as straightforward as you might think. In fact, the JVM uses some clever tricks to improve performance and implements as much of the synchronisation code as possible itself (at least for the uncontended case).
If you are really interested in the details, have a look at the JVM source code or at cmeerw.org/notes/java_sync.html which provides some overview of how Java's synchronisation primitives are implemented on Linux and Solaris.
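For readers less familiar with the Java side of this, here is a minimal sketch of the primitives being discussed (the Handoff class is invented for this example). The code is the same whichever mechanism the JVM uses underneath; how the monitor is implemented is not visible from Java.

```java
class Handoff {
    private final Object lock = new Object();
    private String message; // guarded by lock; null means "nothing to take yet"

    // Stores a message and wakes up any thread waiting in take().
    void put(String m) {
        synchronized (lock) {
            message = m;
            lock.notifyAll();
        }
    }

    // Blocks until a message is available, then removes and returns it.
    String take() {
        synchronized (lock) {
            // Loop to guard against spurious wakeups.
            while (message == null) {
                try {
                    lock.wait(); // releases the lock while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            String m = message;
            message = null;
            return m;
        }
    }

    // Hands one message from a producer thread to the calling thread.
    static String demo() {
        Handoff handoff = new Handoff();
        Thread producer = new Thread(() -> handoff.put("hello"));
        producer.start();
        return handoff.take();
    }

    public static void main(String[] args) {
        System.out.println("Received: " + demo());
    }
}
```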
In the early days of Linux 2.4, at least the IBM JVM used separate processes to implement Java threads. This made switching between threads slow, as the system needed to activate a completely different process each time.