Will JVM threads always maintain their mapping to OS threads?

I'm writing a service that uses JNA to delegate work from Java to a native C++ library. The C++ library makes an async call for a computationally expensive task, and then gets a callback (on a different OS thread) when that task is complete. I would like to route the result of this work back to the correct thread in JVM.
What I'm wondering is: can I be guaranteed that a JVM thread will always have a one-to-one mapping with a native thread ID? That is, if I record the thread ID in C++ via
std::this_thread::get_id()
then kick off some expensive work and block on a condition variable, can I rely on the thread still being there once the work is complete, so that I can return results to the JVM correctly? Could any behind-the-scenes JVM work such as JIT compilation, GC, or stop-the-world collections be a cause for concern with this pattern?

The answer is not specified in the JLS, the JVM spec or the Javadocs.
Indeed, it is possibly platform specific. For example, in JVMs for Solaris it is (or was) possible to do N:M mapping of user-space threads to kernel-space threads; see this document. It is not clear what that means / meant for the native thread id.
So will a thread's native thread_id be constant for the JVM that you are using?
There is only one way to be sure: download the JVM source code and check.
Warning: it is fearsomely complicated!
(And you should probably take that as a hint that you shouldn't be doing this kind of thing ... if you have to resort to asking on StackOverflow if it will work!)

It seems like a bad design to me, since it requires you to know how the JVM works internally.
If you are marshalling some Java data to the C++ layer anyway, why not marshal a callback plus a context as well? When the C++ thread finishes processing the data, it calls the Java callback with the provided context, and in the Java layer you push the result back to the right Java thread.
The C++ layer shouldn't know anything about how Java threads work. All it has to do is call a callback and let the callback deal with the implementation details.
I've actually done this a few times in the past, but in C# with P/Invoke, which easily allows you to marshal a C# function as a C function pointer. I'm sure it's possible with JNI as well.
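As a rough illustration of the callback-plus-context idea, here is a minimal Java sketch. Everything in it is hypothetical: NativeCallbackBridge, CompletionCallback and fakeNativeAsyncCall are made-up names, and the "native" work is simulated by a plain Java thread so the example runs without a C++ library. With JNA, the callback interface would presumably extend com.sun.jna.Callback and the fake call would be the real library function.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bridge: routes native completions back to Java callers by context id. */
final class NativeCallbackBridge {
    /** Shape of the callback handed to the native side (assumed, not a real library API). */
    interface CompletionCallback {
        void onComplete(long contextId, int result);
    }

    // One pending future per in-flight request, keyed by the context id.
    private final Map<Long, CompletableFuture<Integer>> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Runs on whatever thread the native library chooses; it only looks up the
    // waiting future by id, so no thread identity has to be tracked.
    private final CompletionCallback callback = (contextId, result) -> {
        CompletableFuture<Integer> future = pending.remove(contextId);
        if (future != null) {
            future.complete(result);
        }
    };

    /** Starts the asynchronous work and returns a future the caller can wait on. */
    CompletableFuture<Integer> submit(int input) {
        long id = nextId.incrementAndGet();
        CompletableFuture<Integer> future = new CompletableFuture<>();
        pending.put(id, future);
        fakeNativeAsyncCall(input, id, callback);
        return future;
    }

    // Stand-in for the real native function: does the "work" on another thread
    // and then invokes the callback with the context it was given.
    private static void fakeNativeAsyncCall(int input, long contextId, CompletionCallback cb) {
        Thread worker = new Thread(() -> cb.onComplete(contextId, input * 2), "fake-native-worker");
        worker.setDaemon(true);
        worker.start();
    }
}
```

A caller would do something like bridge.submit(21).join() on its own thread. The callback is kept in a field on purpose: with JNA, a callback object that gets garbage collected while native code still holds its pointer can crash the process, so it needs to stay strongly reachable.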

Related

Why are there JVM instructions `monitorenter/monitorexit` but no `wait/notifyAll` (they are native calls)?

When we write synchronized(some_object){} we can see two JVM instructions, monitorenter/monitorexit, emitted in the bytecode.
When we write synchronized(some_object){some_object.wait()} I would expect to see special JVM instructions like wait, but there are none. Instead, wait/notify are implemented as native C functions.
Why is there such an inconsistency (rather than having them all as JNI calls or all as Java bytecode)? Was there a particular (historical) reason, or is it just a matter of taste?
Context: I am interested in this because having all of monitorenter/monitorexit/wait/notify in the bytecode would allow 'Java bytecode program correctness verifiers that do not handle JNI' to verify concurrent Java programs that do not use JNI. Currently, such a hypothetical tool has to work around wait/notify.
“I would expect to see special JVM instructions like wait”
I wouldn't. That would be inconsistent, in my view - in the source code, you're just calling a method, so it makes sense that you're just calling a method in the bytecode as well. Otherwise the compiler would have to have special knowledge of those methods, where it doesn't at the moment.
Arguably it would make more sense for monitorenter and monitorexit to be implemented via method calls as well (as they are in .NET, for example). Certain methods will always be native and deeply tied to the JVM itself - I don't see anything unreasonable about that, and I wouldn't want each of those to be implemented via a separate bytecode operation. However, I don't have too much issue with synchronized having special bytecode supporting it, given that it's a language construct (like try/catch/finally) rather than just a regular method call.
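To make the difference concrete, here is a small sketch (the class name MonitorDemo is made up for illustration). Compiling it and running javap -c MonitorDemo should show monitorenter/monitorexit around the synchronized blocks, while wait() and notifyAll() appear as ordinary invokevirtual calls on java/lang/Object.

```java
final class MonitorDemo {
    private final Object lock = new Object();
    private boolean ready;

    void awaitReady() throws InterruptedException {
        synchronized (lock) {        // bytecode: monitorenter ... monitorexit
            while (!ready) {
                lock.wait();         // bytecode: invokevirtual java/lang/Object.wait:()V
            }
        }
    }

    void signalReady() {
        synchronized (lock) {        // bytecode: monitorenter ... monitorexit
            ready = true;
            lock.notifyAll();        // bytecode: invokevirtual java/lang/Object.notifyAll:()V
        }
    }

    /** Small self-check: a waiting thread is released once signalReady() has been called. */
    static boolean demo() {
        MonitorDemo d = new MonitorDemo();
        Thread waiter = new Thread(() -> {
            try {
                d.awaitReady();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        d.signalReady();
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }
}
```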
There is no need for a verification program to deal with JNI, as the semantics of the wait and notify calls are well specified. That is no different from dedicated bytecode instructions. The same applies to how the HotSpot optimizer deals with a lot of well-known method invocations, which may include wait and notify. It does not necessarily generate a costly JNI invocation but may instead generate code performing these low-level operations directly. Methods handled this way are called intrinsic methods (see also here or here).
There are so many, that you couldn’t call it bytecode anymore if you tried to reserve an opcode for each of them. Further, which methods are handled this way, depends on the actual JVM implementation and the hardware architecture on which it runs. It might also change between versions so there is no sense to carve it in stone by defining bytecode instructions for them.
You wrote “Currently, such hypothetical tool has to workaround wait/notify”. In fact, handling these special methods is not a workaround. It is what such an audit tool has to do anyway for many methods, such as those declared in Lock and Condition, which have similar threading-related semantics. There are also a lot of other well-known concurrency tools nowadays which have to be handled.
The exact decision to create monitorenter and monitorexit instructions but make wait and notify methods on Object is historical (it dates back over 20 years). Today, the decision might look different if the developers had to make it again. But I guess it would rather go in the direction of making even monitorenter and monitorexit special methods that are invoked under the hood rather than bytecode instructions. First, they are not the only thread synchronization tool anymore. Second, that is how most of the new features were added in recent JVMs: preferably as methods, even if they are expected to be intrinsified by most, if not all, implementations.

What does a JVM have to do when calling a native method?

What are the usual steps that the JVM runtime has to perform when calling a Java method that is declared as native?
How does a HotSpot 1.8.0 JVM implement a JNI function call? What checking steps are involved (e.g. unhandled exceptions after return?), what bookkeeping has the JVM to perform (e.g. a local reference registry?), and where does the control go after the call of the native Java method? I would also appreciate it if someone could provide the entry point or important methods from the native HotSpot 1.8.0 code.
Disclaimer: I know that I can read the code myself but a prior explanation helps in quickly finding my way through the code. Additionally, I found this question worthwhile to be Google searchable. ;)
Calling a JNI method from Java is rather expensive compared to a simple C function call.
HotSpot typically performs most of the following steps to invoke a JNI method:
Create a stack frame.
Move arguments to proper register or stack locations according to ABI.
Wrap object references to JNI handles.
Obtain JNIEnv* and jclass for static methods and pass them as additional arguments.
Check if should call method_entry trace function.
Lock an object monitor if the method is synchronized.
Check if the native function is linked already. Function lookup and linking is performed lazily.
Switch thread from in_java to in_native state.
Call the native function.
Check if safepoint is needed.
Return thread to in_java state.
Unlock monitor if locked.
Notify method_exit.
Unwrap object result and reset JNI handles block.
Handle JNI exceptions.
Remove the stack frame.
The source code for this procedure can be found at SharedRuntime::generate_native_wrapper.
As you can see, the overhead may be significant. But in many cases most of the above steps are not necessary, for example if a native method just performs some encoding/decoding on a byte array and neither throws exceptions nor calls other JNI functions. For these cases HotSpot has a non-standard (and not widely known) convention called Critical Natives, discussed here.

.NET GC stuck on JNI call from finalizer()

I have a .NET application that uses JNI to call Java code. In the .NET finalizer we make a JNI call to clean up the connected resource on the Java side. But from time to time this JNI call gets stuck.
As expected, this hangs the whole .NET process, which never recovers.
Below you can see the thread dump we got from .NET:
.NET Call Stack
Function
.JNIEnv_.NewByteArray(JNIEnv_*, Int32)
Bridge.NetToJava.JVMBridge.ExecutePBSCommand(Byte[], Int32, Byte[])
Bridge.Core.Internal.Pbs.Commands.PbsDispatcher.Execute(Bridge.Core.Internal.Pbs.PbsOutputStream, Bridge.Core.Internal.DispatcherObjectProxy)
Bridge.Core.Internal.Pbs.Commands.PbsCommandsBundle.ExecuteGenericDestructCommand(Byte, Int64, Boolean)
Bridge.Core.Internal.DispatcherObjectProxy.Dispose(Boolean)
Bridge.Core.Internal.Transaction.Dispose(Boolean)
Bridge.Core.Internal.DispatcherObjectProxy.Finalize()
Full Call Stack
Function
ntdll!KiFastSystemCallRet
ntdll!NtWaitForSingleObject+c
kernel32!WaitForSingleObjectEx+ac
kernel32!WaitForSingleObject+12
jvm!JVM_FindSignal+5cc49
jvm!JVM_FindSignal+4d0be
jvm!JVM_FindSignal+4d5fa
jvm!JVM_FindSignal+beb8e
jvm+115b
jvm!JNI_GetCreatedJavaVMs+1d26
Bridge_NetToJava+1220
clr!MethodTable::SetObjCreateDelegate+bd
clr!MethodTable::CallFinalizer+ca
clr!SVR::CallFinalizer+a7
clr!WKS::GCHeap::TraceGCSegments+239
clr!WKS::GCHeap::TraceGCSegments+415
clr!WKS::GCHeap::FinalizerThreadWorker+cd
clr!Thread::DoExtraWorkForFinalizer+114
clr!Thread::ShouldChangeAbortToUnload+101
clr!Thread::ShouldChangeAbortToUnload+399
clr!ManagedThreadBase_NoADTransition+35
clr!ManagedThreadBase::FinalizerBase+f
clr!WKS::GCHeap::FinalizerThreadStart+10c
clr!Thread::intermediateThreadProc+4b
kernel32!BaseThreadStart+34
I have no idea whether .NET finalizers are as bad an idea as Java finalizers, but calling potentially (dead)locking code (I see a Win32 wait call at the very bottom) from anything like a finalizer is definitely a bad idea, regardless of the platform. You need to clean your native code of any potential locking, or have an emergency-brake timeout at the .NET level.
As I didn't find a question, I won't post a formal answer here but rather tell a story about something similar I went through some time ago:
We created C objects via JNI that were backed by Java objects, and we decided to clean up the C objects within the finalize method. However, we envisioned deadlocks, as finalize is called from a non-application thread, the garbage collector. As the entire world is stopped while collecting the garbage, whenever the finalizer meets a lock it is immediately a deadlock. Thus we decided to use a Java mechanism called phantom references. It's possible to bind a number (the C pointer) to each of these 'references', and when the VM removes a referenced object, it puts such a reference into a queue. One can then pull this data whenever appropriate and remove the C object.
I think at least your problem is the same.

Thread safety of SocketOutputStream

I know that thread safety of Java sockets has been discussed in several threads here on Stack Overflow, but I haven't been able to find a clear answer to this question: is it, in practice, safe to have multiple threads concurrently write to the same SocketOutputStream, or is there a risk that the data sent from one thread gets mixed up with the data from another thread? (For example, the receiver on the other end first receives the first half of one thread's message, then some data from another thread's message, and then the rest of the first thread's message.)
The reason I said "in practice" is that I know the Socket class isn't documented as thread-safe, but if it actually is safe in current implementations, then that's good enough for me. The specific implementation I'm most curious about is Hotspot running on Linux.
When looking at the Java layer of HotSpot's implementation, more specifically the implementation of socketWrite() in SocketOutputStream, it looks like it should be thread safe as long as the native implementation of socketWrite0() is safe. However, when looking at the implementation of that method (j2se/src/solaris/native/java/net/SocketOutputStream.c), it seems to split the data to be sent into chunks of 64 or 128 KB (depending on whether it's a 64-bit JVM) and then send the chunks in separate writes.
So - to me, it looks like sending more than 64kb from different threads is not safe, but if it's less than 64kb it should be safe... but I could very well be missing something important here. Has anyone else here looked at this and come to a different conclusion?
I think it's a really bad idea to depend so heavily on the implementation details of something that can change beyond your control. If you do something like this, you will have to very carefully control the versions of everything you use to make sure it's what you expect, and that's very difficult to do. You will also need a very robust test suite to verify that the multithreaded operation functions correctly, since you are depending on code inspection and rumors from randoms on Stack Overflow for your solution.
Why can't you just wrap the SocketOutputStream into another passthrough OutputStream and then add the necessary synchronization at that level? It's much safer to do it that way and you are far less likely to have unexpected problems down the road.
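A minimal sketch of such a pass-through wrapper might look like the following. The class name SynchronizedOutputStream and the demo() self-check are made up for illustration, and the sketch assumes each message is sent with a single write(byte[]) call, since only one call at a time is atomic with respect to other writers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical pass-through stream that serializes all calls on one lock. */
final class SynchronizedOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final Object lock = new Object();

    SynchronizedOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override public void write(int b) throws IOException {
        synchronized (lock) { delegate.write(b); }
    }

    // write(byte[]) in OutputStream forwards here, so whole-array writes are covered too.
    @Override public void write(byte[] buf, int off, int len) throws IOException {
        synchronized (lock) { delegate.write(buf, off, len); }
    }

    @Override public void flush() throws IOException {
        synchronized (lock) { delegate.flush(); }
    }

    @Override public void close() throws IOException {
        synchronized (lock) { delegate.close(); }
    }

    /** Self-check: two threads write 8-byte messages; no message may be interleaved. */
    static boolean demo() {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        // A sink that receives data one byte at a time, so unsynchronized writers could interleave.
        OutputStream byteAtATime = new OutputStream() {
            @Override public void write(int b) { raw.write(b); }
        };
        OutputStream out = new SynchronizedOutputStream(byteAtATime);
        Thread[] threads = new Thread[2];
        for (int t = 0; t < threads.length; t++) {
            byte[] message = new byte[8];
            Arrays.fill(message, (byte) ('A' + t));
            threads[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < 1000; i++) out.write(message);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            threads[t].start();
        }
        try {
            for (Thread th : threads) th.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        byte[] all = raw.toByteArray();
        if (all.length != 2 * 1000 * 8) return false;
        for (int i = 0; i < all.length; i += 8) {
            for (int j = 1; j < 8; j++) {
                if (all[i + j] != all[i]) return false; // a message was split
            }
        }
        return true;
    }
}
```

In real use the delegate would be the stream returned by socket.getOutputStream(), and every thread would have to write through the wrapper rather than through the raw stream.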
According to this documentation, http://www.docjar.com/docs/api/java/net/SocketOutputStream.html, the class does not claim to be thread safe, so assume it is not. It inherits from FileOutputStream, and file I/O is normally not inherently thread safe.
My advice is that if a class is related to hardware or communications, assume it is not thread safe or "blocking". The reason is that thread-safe operations take more time, which you may not want. My background is not in Java, but other libraries are similar in philosophy.
I notice you tested the class extensively, but you may test it all day for many days and it may still not prove anything. Just my two cents.
Good luck & have fun with it.
Tommy Kwee

Is it possible to make GC manage native object's lifetime?

With C++ and C# experience and some little Java knowledge I'm now starting a Java+JNI (C++) project (Android, if that matters).
I have a native method that creates an instance of some C++ class and returns a pointer to it as a Java long value (say, a handle). Other native methods, called from Java code here and there, then take the handle as a parameter to do some native operations on this object. The C++ side does not own the object; the Java side does. But in the current architecture design it's hard to define who exactly owns the object and when to delete it. So it would probably be nice to have the Java VM garbage collector manage the object's lifetime somehow. The C++ class does not consume any resources except a small piece of memory, so it's OK if several such objects are never destroyed.
In C# I would probably wrap the native IntPtr handle in some managed wrapper class and override its finalizer to call the native object's destructor when the managed wrapper is garbage collected. SafeHandle, AddMemoryPressure, etc. might also be of help here.
This is a different story with Java's finalize. The second thing you know after 'Hello world' in Java, is that using finalize is bad. Are there any other ways to accomplish this in Java? Maybe using PhantomReference?
Well, let's consider the reason WHY finalize and co. are problematic: as you know, there's no guarantee that finalize will be called before the VM is shut down, which means that special cleanup code won't necessarily run (in my opinion a bad decision, as I don't see any problem with running through the finalize queue at shutdown, but that's how it is). This is exactly the same situation in C#, by the way.
Now, your objects only consume memory, which will be cleaned up by the OS anyhow when the VM is destroyed, so the only case where finalize is problematic won't matter for you. So yes, you can indeed use this variant and it will work perfectly fine, but it may not exactly be considered a great architectural design, and as soon as you add resources to your C++ code for which the OS doesn't handle the cleanup correctly, you will run into problems.
Also note that implementing a finalizer results in some additional overhead for the GC and means it takes two cycles to clean up one of these objects (and whatever you do, don't ever save a reference to the object in the finalize method).
If you understand why you should avoid using Java's finalize method, you will also understand how to use it correctly. Using finalize for closing system resources (files and handles) is bad because you don't actually know when those resources will be closed and released. Using complex finalize logic is bad as your object reference can leak out and get pinned in memory again.
For your scenario, it is perfectly fine to use finalize.
Using a wrapper with a finalizer is a decent solution here.
But if you really don't want to do that, you can use a PhantomReference with a ReferenceQueue to clean it up (though you will need a separate thread to poll the queue).
So how can we achieve this using a phantom reference?
Create a wrapper object for your native intPtr object.
Create a phantom reference (with a reference queue) on the wrapper object.
Create and maintain a map of phantom reference to intPtr.
Create a thread that will monitor the reference queue for finalized wrapper object instances.
This thread will get the phantom reference from the reference queue, look up the intPtr using the phantom reference, and call the destructor on the native object referenced by the intPtr.
While all this is happening, you can go about happily using the wrapper object in your Java code.
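The steps above can be sketched roughly as follows. All names here (NativeHandleCleaner, Wrapper, the destructor parameter) are hypothetical, and the destructor is passed in as a LongConsumer so that the example runs without any native code. In a real project it would call the native method that deletes the C++ object.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongConsumer;

/** Hypothetical helper that frees native objects once their Java wrappers are collected. */
final class NativeHandleCleaner {
    /** Java-side wrapper around the native pointer (the "handle"). */
    static final class Wrapper {
        final long handle;
        Wrapper(long handle) { this.handle = handle; }
    }

    private final ReferenceQueue<Wrapper> queue = new ReferenceQueue<>();
    // Keeps each phantom reference strongly reachable and maps it to its native handle.
    private final Map<Reference<? extends Wrapper>, Long> handles = new ConcurrentHashMap<>();
    private final LongConsumer destructor;

    NativeHandleCleaner(LongConsumer destructor) {
        this.destructor = destructor;
        Thread t = new Thread(this::drain, "native-handle-cleaner");
        t.setDaemon(true); // do not keep the JVM alive just for cleanup
        t.start();
    }

    /** Creates the wrapper and registers a phantom reference for it. */
    Wrapper wrap(long handle) {
        Wrapper w = new Wrapper(handle);
        handles.put(new PhantomReference<Wrapper>(w, queue), handle);
        return w;
    }

    // Waits for references whose wrappers have been collected and destroys the native side.
    private void drain() {
        try {
            while (true) {
                Reference<? extends Wrapper> ref = queue.remove(); // blocks until one is enqueued
                Long handle = handles.remove(ref);
                if (handle != null) {
                    destructor.accept(handle);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Note that the map is keyed by the reference and stores only the long handle, never the wrapper itself, since a strong reference to the wrapper would keep it from ever being collected. On Java 9 and later, java.lang.ref.Cleaner packages essentially the same pattern.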
