JVM + Linux + Intel's Hyperthreading =

I noticed that JVM threads run as processes under Linux for some reason (correct me if I'm wrong). Furthermore, it is a fact that Intel's Hyperthreading provides additional parallelization only for two threads belonging to the same process.
Does that mean that a single multithreaded JVM program would not profit from Hyperthreading under Linux, because its threads are not threads from the CPU's point of view?

Processes and threads are not treated differently by the scheduler in Linux. There is a range of resources that can be shared between processes, as defined by the clone system call. "Threads" and "processes", as the terms are typically used, are just names for commonly used recipes.
If you're observing threads as processes in the JVM, this is just a mixing of nomenclature. By the usual definition, if processes share a virtual address space, then they are "threads" within a process.
Any workload with more than one runnable thread can benefit from hyper-threading, regardless of the terminology used. To be completely fair, though, hyper-threading does not add a second core: the two hardware threads on a core share its execution units, so the gain comes from filling cycles that would otherwise sit idle, not from doubling throughput.

"JVM threads run as processes under Linux" -- No, they run as LWPs (lightweight processes).
Java threads are implemented internally as native threads, i.e. LWPs on Linux, and you can see them using ps -eLf. Mapping a native thread to a Java thread is difficult, though. The only thread that can be mapped easily is the main thread, since its ID is the same as the process ID.
The JVM will definitely profit from HT.
From an article on HT in Java:
SMT holds the promise of significantly increasing Java's server-side
performance by more completely utilizing existing processor cycles in
multithreaded applications.
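To see the LWPs for yourself, something like the following rough sketch may help. It starts a few named threads, holds them alive for a while, and prints the process ID so you can run ps -L -p <pid> in another terminal. The class name, method name, and thread names are made up for illustration, and ProcessHandle assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class NativeThreadDemo {
    // Starts `count` threads that block on a latch, keeps them alive for
    // `holdMillis`, then releases them. Returns how many were alive while held.
    static int startAndCount(int count, long holdMillis) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await(); // block until the latch is released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "demo-worker-" + i);
            t.start();
            threads.add(t);
        }
        int alive = 0;
        for (Thread t : threads) {
            if (t.isAlive()) {
                alive++;
            }
        }
        try {
            Thread.sleep(holdMillis); // window in which to inspect the process with ps
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        release.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("PID " + ProcessHandle.current().pid()
                + ": run `ps -L -p <pid>` in another terminal during the next 30 s");
        System.out.println("Workers alive: " + startAndCount(4, 30_000));
    }
}
```

Expect ps to list more LWPs than the four workers, because the JVM also runs its own background threads (GC, JIT compiler and so on).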

Related

Relationship between core and threads [duplicate]

Say I have a processor like this, which lists # cores = 4, # threads = 4, and has no Hyper-threading support.
Does that mean I can run 4 programs/processes simultaneously (since a core is capable of running only one thread)?
Or does that mean I can run 4 x 4 = 16 programs/processes simultaneously?
From my digging, without Hyper-threading there will be only 1 thread (process) per core. Correct me if I am wrong.
A thread differs from a process. A process can have many threads. A thread is a sequence of instructions in a certain order. A logical core can execute one such sequence at a time. The operating system distributes all the threads across all the available logical cores, and if there are more threads than cores, the threads wait in a queue and each core switches from one to another very quickly.
It will look as if all the threads run simultaneously, when actually the OS distributes CPU time among them.
Having multiple cores has the advantage that fewer concurrent threads are placed on any single core: less switching between threads = greater speed.
Hyper-threading exposes 2 logical cores on 1 physical core, so two threads can share that core's execution resources without a software context switch between them.
That's basically correct, with the obvious qualifier that most operating systems let you run far more tasks at once than there are cores or hardware threads, which they accomplish by interleaving the execution of instructions.
A system with hyperthreading generally has twice as many hardware threads as physical cores.
The term thread is generally used to describe an operating system concept that has the potential to execute independently of other threads. Whether it does so depends on whether it is stuck waiting for some event (disk or screen I/O, message queue), or whether there are enough physical CPUs (hyperthreaded or not) to allow it to run alongside other non-waiting threads.
Hyperthreading is a CPU vendor term for a single core that can multiplex its attention between two computations. The easy way to think about a hyperthreaded core is as if you had two real CPUs, both slightly slower than what the manufacturer says the core can actually do.
Basically this is up to the OS. A thread is a high-level construct holding an instruction pointer, and the OS places a thread's execution on a suitable logical processor. So with 4 cores you can basically execute 4 instruction streams in parallel, whereas a thread simply contains information about which instructions to execute and where those instructions are placed in memory.
An application normally uses a single process during execution, and the OS switches between processes to give all processes "equal" processor time. When an application deploys multiple threads, the process is allocated more than one slot for execution, but memory is shared between the threads.
Normally you distinguish between concurrent and parallel execution. Parallel execution is when you physically execute instructions on more than one logical processor, and concurrent execution is the frequent switching of a single logical processor, giving the appearance of parallel execution.

How does the JVM spread threads between CPU cores?

Can somebody help me understand how the JVM spreads threads across the available CPU cores? Here is my understanding of how it works; please correct me where I am wrong.
From the beginning: when the computer starts, the bootstrap thread (usually thread 0 in core 0 of processor 0) starts fetching code from address 0xfffffff0. All the other CPUs/cores are in a special sleep state called Wait-for-SIPI (WFS).
Then, once the OS is loaded, it starts managing processes and scheduling them across CPUs/cores by sending a special inter-processor interrupt (IPI), called a SIPI (Startup IPI), over the Advanced Programmable Interrupt Controller (APIC) to each thread that is in WFS. The SIPI contains the address from which that thread should start fetching code.
So, for example, the OS starts the JVM by loading the JVM code into memory and pointing one of the CPU cores to its address (using the mechanism described above). After that, the JVM, which runs as a separate OS process with its own virtual memory area, can start several threads.
So the question is: how?
Does the JVM use the same mechanism as the OS, so that during the time slice the OS gave it, it can send a SIPI to other cores and point them to the address of the task that should run in a separate thread? If so, how is the original program that the OS may have been running on that core restored?
I assume this picture is not correct, since involving other CPUs/cores should be managed by the OS. Otherwise we could interrupt OS processes running in parallel on other cores. So if the JVM wants to start a new thread on another CPU/core, it makes some OS call and passes the address of the task to be executed to the OS. The OS schedules it like any other program, with the difference that it must run in the same process so that it can access the same address space as the rest of the JVM threads.
How is it done? Can somebody describe it in more detail?
The OS manages and schedules threads by default. The JVM makes the right calls to the OS to make this happen, but doesn't get involved.
Does JVM use the same mechanism as OS
The JVM uses the OS, it has no idea what actually happens.
Each process has its own virtual address space, again managed by the OS.
I have a library which uses JNA to wrap setaffinity on Linux and Windows. You need to do this as thread scheduling is controlled by the OS not the JVM.
https://github.com/OpenHFT/Java-Thread-Affinity
Note: in most cases, using affinity either a) doesn't help or b) doesn't help as much as you might think.
We use it to reduce jitter of around 40 - 100 microseconds which doesn't happen often, but often enough to impact our performance profile. If you want your 99%ile latencies to be as low as possible, in the micro-second range, thread affinity is essential. If you are ok with 1 in 100 requests taking 1 ms longer, I wouldn't bother.

Java Multi Threading in multi cpu cores

Do Java threads run in parallel on a multi-core processor, i.e. do multiple threads run at the same time?
[Parallel processing with Java Threads]
volatile is useful when you want to prevent threads from working with a stale cached copy of a shared variable.
Multiple threads can run on a single CPU (though only one at a time) and can share resources, so volatile is still useful.
The JVM does not decide the number of processors to be used; that is the job of the OS. The JVM can create multiple threads and hand them to the OS for scheduling.
volatile guarantees that a read sees the most recent write to the variable, rather than a stale cached value, during concurrent access.
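A small sketch of the visibility point, in case it helps: one thread spins on a flag and another thread clears it. With the flag declared volatile, the Java memory model guarantees the spinning thread eventually sees the write; without volatile there is no such guarantee and the loop may never exit. The class and method names are invented for the example, and Thread.onSpinWait() assumes Java 9 or later.

```java
final class VolatileFlagDemo {
    // volatile: a write by one thread is guaranteed to become visible to other threads.
    private static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker stopped.
    static boolean stopsAfterSignal() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                Thread.onSpinWait(); // hint to the CPU that this is a busy-wait loop
            }
        });
        worker.start();
        running = false; // this write must reach the worker for the loop to end
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + stopsAfterSignal());
    }
}
```

Note that volatile only covers visibility and ordering. It does not make compound operations such as count++ atomic.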
First of all, it is the JVM that spawns the threads, but the JVM depends on the hardware. If the hardware has multiple cores, the JVM's threads can run at the same time to extract maximum performance.
It is then up to the user (you) to decide to what extent you want to exploit the CPU resources, and you do this through thread pools (by defining the maximum number of threads that can run in parallel). Again, though, you are limited by your hardware configuration.
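As a rough illustration of sizing a pool to the hardware, the sketch below creates a fixed thread pool with one thread per processor reported by Runtime.availableProcessors() and splits a CPU-bound sum across it. The class name and the striped way of dividing the work are just choices made for the example, not a recommendation for every workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PoolSizedToCores {
    // Computes 1^2 + 2^2 + ... + n^2 using one pool thread per available processor.
    static long sumOfSquares(int n) {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int start = w + 1;
                // Each task handles start, start + workers, start + 2 * workers, ...
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (long i = start; i <= n; i += workers) {
                        sum += i * i;
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that partial sum is ready
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

Whether the tasks actually run on different cores is still up to the OS scheduler; the pool size only sets an upper bound on how many can run at once.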

How many CPU will a multithreaded application take, if runs in multicore processor

A multi-core processor is a single computing component with two or more independent actual central processing units (called "cores"), which are the units that read and execute program instructions.
If a multithreaded application runs on a multi-core processor, how many CPUs will it use? For example, if the machine is capable of dual-core execution, then 2 CPUs will be used, if my understanding is correct. Within these two CPUs, multiple threads will be executed with context switching between them.
The JVM really doesn't deal directly with processors. It uses the native thread capabilities of the operating system, which in turn uses the processors exposed by the operating system and hardware. In Java there is a Runtime.availableProcessors() method, but it is used in only a few places by the JVM code.
To the JVM or any other application running on a computer, the multiple cores typically seem the same as multiple processors if that's how the OS exposes them. This means that the distinction between physical processors versus multiple cores in a single processor is completely hidden from the Java programmer.
There are single core CPUs then there are CPUs with multiple cores which share certain internal components but the OS sees them and schedules them as multiple processors. Multiple cores are most likely seen to the OS as multiple CPUs -- there is no distinction. Then there are the virtual processors often called hyperthreading which share the same processor core (and the associated processing circuitry) but have multiple execution pipelines. These are also (usually) seen by the OS as multiple processors.
Specifically, in the OP's example, you have a single processor with two cores. On Linux, cat'ing /proc/cpuinfo will show 2 processors, and in Java Runtime.availableProcessors() will return 2. It will also return 2 if you have 2 physical processors, and most likely also if you have a single core with dual hyperthreading pipelines, depending on the OS kernel.
As to how many processors the JVM will actually be using, this again depends on the native thread code. That said, if the JVM is running on a single CPU with two cores, and the cores are not in use by other applications or the OS, then the JVM's threads will most likely be scheduled to run on them concurrently.
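If you want to compare the two numbers yourself, a minimal sketch could look like this. It reads the JVM's view via Runtime.availableProcessors() and, where /proc/cpuinfo is readable, counts the lines starting with "processor". The class and method names are made up, and the "processor" line format is an assumption that holds on common x86 Linux systems but not necessarily on every architecture.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

final class ProcessorCount {
    // Number of processors the JVM believes it may use.
    static int jvmProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Number of "processor" entries in /proc/cpuinfo, or -1 if it cannot be read
    // (for example on a non-Linux system).
    static long procCpuinfoProcessors() {
        Path cpuinfo = Paths.get("/proc/cpuinfo");
        if (!Files.isReadable(cpuinfo)) {
            return -1;
        }
        try (Stream<String> lines = Files.lines(cpuinfo)) {
            return lines.filter(line -> line.startsWith("processor")).count();
        } catch (IOException | UncheckedIOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("Runtime.availableProcessors(): " + jvmProcessors());
        System.out.println("/proc/cpuinfo processors:      " + procCpuinfoProcessors());
    }
}
```

The two values can differ: if the process is restricted to a subset of CPUs (for instance by an affinity mask or container limits), the JVM may report fewer processors than /proc/cpuinfo lists.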
By default you can utilize all processors. One processor can run virtually as many threads as possible at the same time (virtually means that physically there's always just one thread which is running). How many is possible depends on the operating system resource limitations and the used threading framework.
It doesn't matter from software point of view, if the cores are on one die, and there's one CPU socket with a multi-core CPU, or there are more CPU sockets. The OS and JVM will see the collection of the cores. (This brings in an interesting aspect though: data exchange between such cores which are on the same die and those which are in different sockets are not uniform).
Thread schedulers (both the OS's and the virtual machine's) often tend to shuffle threads from one core to another over time. That can hurt performance, and there are techniques for tying a thread to a certain core (thread affinity).
How much CPU your application (let's assume a long-running task) will really consume depends on what fraction of its time it actually needs the CPU. An application can be network, memory, hard disk or CPU bound, among others.
If the cpu has to wait for any other resource such as memory or network it will remain idle or be assigned to other threads.
Example:
If your application is only CPU bound (it won't consume much memory) and you run a long task with as many threads as cores (physical, or virtual with hyperthreading), you will get almost 100% usage of the free resources that are not used by other running threads (OS, programs).
Depending on the program you can tell in which state your application is from the cpu/memory/network consumption and you can analyse the performance.
It will get use of at most as many CPUs as you have simultaneously busy threads, and possibly as few as one.
From programmer's point of view, a core is a processor. Method Runtime.availableProcessors() shows the number of cores. However, from manufacturer's point of view, multi-core processor is similar to ordinary processor, so they decided to leave the name "processor", probably making a marketing mistake.

Need clarification in Parallel Processing

For example, if I use a dual-core processor and write a Java program without using threads, does that mean the program executes sequentially and uses only one of the two cores?
And if I use a dual-core processor and write a Java program using threads and synchronisation, does that mean the program executes in parallel and uses all available cores (in this case two)?
If my reasoning is totally wrong, then what is the relation between threading, cores and parallelism?
First, the JVM has a number of background threads that will use multiple CPUs and cores even if the user code never forks another thread. The garbage collector for example will run concurrently in another CPU if possible regardless of the user code.
If your user code never forks another thread, the JVM will never run your code concurrently on multiple CPUs. If you do write your program with multiple threads, there is no guarantee that it will run on multiple CPUs, but it is certainly more likely. It depends a lot on what else is running on the OS and how blocked your threads are. If your threads are consuming a lot of CPU cycles and run for any length of time on a modern OS then yes, your program will use both CPUs.
You can verify this on a Linux OS (and other Unixen) by watching to see if your process consumes more than 100% of CPU at any one time. You can also use ps options to show the underlying threads and their CPU usage. See my answer here: Concurrency of posix threads in multiprocessor machine
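You can also get a rough picture from inside the JVM. The sketch below runs a few CPU-bound threads and compares elapsed wall-clock time with the summed per-thread CPU time reported by ThreadMXBean. If the total CPU time clearly exceeds the wall time, the threads must have been running on more than one CPU at once. The class name and the busy loop are invented for the example, and CPU time measurement is an optional JVM feature, so the code checks for support first.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.atomic.AtomicLong;

final class CpuTimeVsWallTime {
    // Runs `threadCount` busy threads and returns {wallNanos, totalCpuNanos}.
    // totalCpuNanos stays 0 if the JVM cannot measure per-thread CPU time.
    static long[] measure(int threadCount, long iterations) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean measurable = mx.isCurrentThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        AtomicLong totalCpu = new AtomicLong();
        AtomicLong sink = new AtomicLong(); // consumes results so the loop is not optimized away
        Thread[] threads = new Thread[threadCount];
        long wallStart = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                long x = 0;
                for (long k = 0; k < iterations; k++) {
                    x += k ^ (x >>> 3); // arbitrary CPU-bound work
                }
                sink.addAndGet(x);
                if (measurable) {
                    long cpu = mx.getCurrentThreadCpuTime();
                    if (cpu > 0) {
                        totalCpu.addAndGet(cpu);
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        long wall = System.nanoTime() - wallStart;
        return new long[] {wall, totalCpu.get()};
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        long[] r = measure(n, 500_000_000L);
        System.out.printf("threads=%d wall=%d ms cpu=%d ms%n",
                n, r[0] / 1_000_000, r[1] / 1_000_000);
    }
}
```

The numbers are only indicative: JIT warm-up, other load on the machine, and hyperthreading all affect the ratio.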
