My application uses loads of Java threads. I am looking for a reliable understanding how the JVM (version 5 and 6) maps the Java threads to underlying Windows threads. I know there is a document for mapping to Solaris threads, but not Windows.
Why doesn't Sun publish this information?
I want to know if there's a 1:1 mapping, or if it varies by JVM, by -server option, by workload, etc, etc.
I know I am not "supposed" to care, I should write properly synchronisd code, but I am inheriting a large body of code...
Also, does anyone know how to give names to Windows threads?
Don't have a document for you, but from the Threads column in the task-manager you can pretty reliably guess that it maps 1:1 to native threads (you need to enable the Threads column in the task manager first).
Oh, almost forgot, you can download the jdk src here and look yourself.
The mapping is platform-dependent, however I found an interesting comparison between platform threads for the vm (although probably a bit old). The bottom line is: you don't need to know. What you probably are more interested is to know about green threads (if you don't know already).
As for the naming question: Doesn't the constructor allow you to name a thread? Or do you mean name them and view their name on some windows thread browser?
How to name a Win32 thread
Unfortunately, this seems like it's impossible or at least very hard to do inside the Windows JVM.
JVM specification doesn't say anything strictly in this regard. Its left upto the JVM implementors to map Java theads to platform theads( Windows, Linux etc). Also its hard to believe that there will be one to one mapping between Java threads and OS threads.
Related
I am really curious about how the JVM works with threads!
In my searches on the internet, I found some material about RTSJ, but I don't know if it's the right directions for my answers.
Can someone give me directions, material, articles or suggestions about the JVM scheduling algorithm?
I am also looking for information about the default configuration of Java threads in the scheduler, like how long does it take for every thread in case of time-slicing.
I appreciate any help, thank you!
There is no single Java Virtual Machine; JVM is a specification, and there are multiple implementations of it, including the OpenJDK version and the Sun version of it, among others. I don't know for certain, but I would guess that any reasonable JVM would simply use the underlying threading mechanism provided by the OS, which would imply POSIX Threads (pthreads) on UNIX (Mac OS X, Linux, etc.) and would imply WIN32 threads on Windows. Typically, those systems use a round-robin strategy by default.
It doesn't. The JVM uses operating system native threads, so the OS does the scheduling, not the JVM.
A while ago I wrote some articles on thread scheduling from the point of view of Java. However, on mainstream platforms, threading behaviour essentially depends on underlying OS threading.
Have a look in particular at my page on what is Java thread priority, which explains how Java's priority levels map to underlying OS threading priorities, and how in practice this makes threads of different priorities behave on Linux vs Windows. A major difference discussed is that under Linux there's more of a relationship between thread priority and the proportion of CPU allocated to a thread, whereas under Windows this isn't directly the case (see the graphs).
I don't have commenting rights so writing is here...
JVM invokes pthreads(generally used threading mechanism,other variants are there) for each corresponding request. But the scheduling here is done entirely by OS acting as host.
But it is a preferred approach and it is possible to schedule these threads by JVM. For example in Jikes RVM there are options to override this approach of OS decision. For example, in it threads are referred as RVMThread and they can be scheduled/manipulated using org.jikesrvm.schedular package classes.
For more reference
Does anybody know of a way to lock down individual threads within a Java process to specific CPU cores (on Linux)? I've done this in C, but can't find how to do this in Java. My instincts are that this will require a JNI call, but I was hoping someone here might have some insight or might have done it before.
Thanks!
You can't do this in pure java. But if you really need it -- you can use JNI to call native code which do the job. This is the place to start with:
http://ovatman.blogspot.com/2010/02/using-java-jni-to-set-thread-affinity.html
http://blog.toadhead.net/index.php/2011/01/22/cputhread-affinity-in-java/
UPD: After some thinking, I've decided to create my own class for this: ThreadAffinity.java It's JNA-based, and very simple -- so, if you want to use it in production, may be you should spent some time making it more stable, but for benchmarking and testing it works well as is.
UPD 2: There is another library for working with thread affinity in java. It uses same method as previously noted, but has another interface
I know it's been a while, but if anyone comes across this thread, here's how I solved this problem. I wrote a script that would do the following:
"jstack -l "
Take the results, find the "nid"'s of the threads I want to manually lock down to cores.
Taskset those threads.
You might want to take a look at https://github.com/peter-lawrey/Java-Thread-Affinity/blob/master/src/test/java/com/higherfrequencytrading/affinity/AffinityLockBindMain.java
IMO, this will not be possible unless you use native calls. JVM is supposed to be platform independent, any system calls done to achieve this will not result in a portable code.
It's not possible (at least with plain Java).
You can use thread pools to limit the amount of threads (and therefore cores) used for different types of work, but there is no way to specify a core to use.
There is even the (small) possibility that your Java runtime doesn't support native threading for your OS or hardware. In this case, green threads are used and only one core will be used for the whole JVM.
I am really curious about how the JVM works with threads!
In my searches on the internet, I found some material about RTSJ, but I don't know if it's the right directions for my answers.
Can someone give me directions, material, articles or suggestions about the JVM scheduling algorithm?
I am also looking for information about the default configuration of Java threads in the scheduler, like how long does it take for every thread in case of time-slicing.
I appreciate any help, thank you!
There is no single Java Virtual Machine; JVM is a specification, and there are multiple implementations of it, including the OpenJDK version and the Sun version of it, among others. I don't know for certain, but I would guess that any reasonable JVM would simply use the underlying threading mechanism provided by the OS, which would imply POSIX Threads (pthreads) on UNIX (Mac OS X, Linux, etc.) and would imply WIN32 threads on Windows. Typically, those systems use a round-robin strategy by default.
It doesn't. The JVM uses operating system native threads, so the OS does the scheduling, not the JVM.
A while ago I wrote some articles on thread scheduling from the point of view of Java. However, on mainstream platforms, threading behaviour essentially depends on underlying OS threading.
Have a look in particular at my page on what is Java thread priority, which explains how Java's priority levels map to underlying OS threading priorities, and how in practice this makes threads of different priorities behave on Linux vs Windows. A major difference discussed is that under Linux there's more of a relationship between thread priority and the proportion of CPU allocated to a thread, whereas under Windows this isn't directly the case (see the graphs).
I don't have commenting rights so writing is here...
JVM invokes pthreads(generally used threading mechanism,other variants are there) for each corresponding request. But the scheduling here is done entirely by OS acting as host.
But it is a preferred approach and it is possible to schedule these threads by JVM. For example in Jikes RVM there are options to override this approach of OS decision. For example, in it threads are referred as RVMThread and they can be scheduled/manipulated using org.jikesrvm.schedular package classes.
For more reference
I need to create daemon that will monitor certain directory and will process every file that's written to that particular path.
My choice is either java or python.
Did you guys have any experience using both technology? what is the best one?
EDIT 1: files that will be processed is simple text file (one line with tab separated fields).
I just need to move it to buffer and send to further to my php file.
EDIT 2: It's for freebsd server
Performance-wise, for an I/O - syscall bound task such as you're mentioning, it's going to be a wash, most likely, depending a bit on the platform. Java tends to have better CPU usage (partly because a JVM can effectively use multiple cores on a multicore CPU on different threads, with CPython having problems with that; partly because of strong JIT abilities), but typically pays for them with higher RAM footprints (no big deal if you have 64GB of RAM laying around and not much else to do on the machine, say, but often an issue in other circumstances).
If you specify the platform (Linux vs Windows vs ...), we might be able to offer more help.
Edit: with processing required as light as the OP's mentioned in the Q's edit, there's really nothing either way in the CPU-load part of the task. Unfortunately I don't know what freebsd offers for "directory watching" (like Linux's inotify, etc).
Is there a way to find out which processor (either on a single system or mutliple systems) your thread is running on, using Java native threads? If not, is there any library which could help?
The JVM's thread scheduler is JVM-specific, so there is no 'universal' solution. As far as I know there is nothing available out-of-the-box, but perhaps using:
the Sun JVM;
Solaris - or Mac, as Tom Hawtin - tackline points out;
DTrace.
you might have some luck:
trace a thread-start probe, which has as args[3] the "The native/OS thread ID. This is the ID assigned by the host operating system "
map the native/OS thread ID to a CPU, using Solaris-specific utilities.
I've never heard of such a call, and I highly doubt there is one, as it's not really necessary, and would require extra platform-specific code.
As far as I know the standard JDK does not support it (at least up to JDK 6). If that's what you really need, you'll probably need to execute some native calls using JNI. A nice example could be found here (while it's not exactly what you need, I believe it's a good start).
There is a lot of other information you can get from the JDK, by the way, using the ThreadMXBean class (like CPU usage per thread), and maybe you can find what you're looking for there.
The OS will schedule threads on different processors at different times. So even if you get a snapshot of where each thread is running at any given moment, it could be out of date within milli-seconds.
What is the problem you are trying to solve? Perhaps to can do what you want without having to know this.