I understand that Java can load/execute DLL code, but I'm wondering if there are any security checks to prevent untrusted code from the system being called by a JVM. Couldn't this destroy the system -- are there any OS features that prevent this? Or can someone just write in Java itself some method that prevents untrusted code from being loaded? Thanks for your help.
No. Once you call out to native code (via JNI) then that native code is free to do anything (subject to the OS itself giving permission). There's no concept of sandboxing the native code invoked from the JVM.
Note that this is a particular headache with JNI code. Badly coded native code can take down the JVM (as opposed to simply throwing an exception) and the consequent debugging/resolution is particularly hard.
The loading of native code can itself be prevented. Applets, for example, typically run in a security context that does not allow them to load native libraries. However, if the JVM lets your Java code call into untrusted native code, all bets are off.
It is not conceptually clear to me as to when Java uses JNI. The literature 1,2 seems to suggest using JNI is optional - it is a useful feature for my own, existing native C applications, but it is good practice to avoid using it when possible:
Liang indicates "Remember that once an application uses the JNI, it risks losing two benefits [of portability and security]".
However, I was looking at Oracle's API implementation in the SDK, and I see public static native void arraycopy in java/lang/System.java. Questions:
Don't methods marked native use JNI?
Doesn't Java make use of JNI when making system calls?
System calls are required for any Java API implementation, so if I'm correct, it seems there is no avoiding interfacing with native code.
1: Horstmann, Core Java Volume 2
2: Liang, The Java™ Native Interface Programmer’s Guide and Specification
You list two drawbacks of JNI - portability and security. Actually, there is another one, which is more important in everyday life: JNI calls bear a significant performance cost, because they affect the global JVM state and lock some JVM features, including GC.
As detailed in the other answer, JVM does rely on JNI. But this is not a complete answer.
Your JVM may support fast JNI methods (e.g. Android ART does). These methods are guaranteed to be fast and non-blocking, and they may be performed without a state change; see e.g. @FastNative. The Java SDK native methods use such improvements a lot, so they don't suffer the performance costs of conventional JNI.
These native methods do not rely on LoadLibrary() at run time. This removes another significant performance cost of JNI: loading and 'binding' the native methods at runtime. The worst risk with runtime binding is that it happens at an arbitrary time, determined by the classloader, and may clash with some other urgent thing that your JVM or app must do at that time. This is irrelevant for the system native methods.
Also, portability concerns are irrelevant: the Java runtime is carefully crafted for each supported platform, and in itself it is not 'portable', only the Java apps running on top of it are.
Finally, security risks of JNI are twofold: JNI is not limited by private declarations, and the native code can do dangerous things that compromise any class or app running in the same JVM. And, being loaded from a third-party library, JNI code may be hacked (e.g. it's enough to change the OS environment to cause System.loadLibrary() to load a fraudster's version of the lib). The system native methods are immune to such attacks.
In a nutshell, even though the JVM does use JNI, this is not an excuse to indiscriminately use JNI for your own classes.
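One way to narrow the library-substitution risk described above is to load the library by absolute path and only after checking its digest against a known value. The following is a minimal sketch, not a sandbox: `VerifiedLoader`, `sha256Hex`, `loadIfTrusted` and the allowlist are hypothetical names introduced for illustration. It also assumes nobody can swap the file between the check and the load, and it does nothing to restrain the code once it is loaded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;

/** Hypothetical helper: loads a native library only if its SHA-256 digest is allowlisted. */
final class VerifiedLoader {

    /** Returns the lowercase hex SHA-256 digest of the file's contents. */
    static String sha256Hex(Path file) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Loads the library via System.load (absolute path, so java.library.path and
     * OS search paths play no role) if its digest is in trustedDigests;
     * otherwise throws SecurityException without loading anything.
     */
    static void loadIfTrusted(Path library, Set<String> trustedDigests) {
        Path absolute = library.toAbsolutePath().normalize();
        String digest = sha256Hex(absolute);
        if (!trustedDigests.contains(digest)) {
            throw new SecurityException(
                "Refusing to load " + absolute + " (sha256=" + digest + ")");
        }
        System.load(absolute.toString());
    }
}
```

A caller would ship the expected digests with the application and call `VerifiedLoader.loadIfTrusted(path, digests)` instead of `System.loadLibrary(...)`. This only controls which file gets loaded; it cannot make untrusted native code safe.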
Don't methods marked native use JNI?
Yes, that's what it means.
Doesn't Java make use of JNI when making system calls?
Same question really. Only native methods can call system calls, so Java code can only call system calls via native methods.
Using JNI is optional for applications. It's essential for the JVM.
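The native modifier mentioned in the question can be observed from Java itself through reflection. The sketch below checks `System.arraycopy` and, for contrast, `String.length`; the class name `NativeFlagCheck` is hypothetical, and the result for `arraycopy` assumes an OpenJDK-style class library where the method is declared native, as in the source quoted in the question.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical demo: asks reflection whether a method carries the native modifier. */
final class NativeFlagCheck {

    /** Returns true if the named public method of the given class is declared native. */
    static boolean isNative(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method m = owner.getMethod(name, parameterTypes);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        boolean arraycopy = isNative(System.class, "arraycopy",
                Object.class, int.class, Object.class, int.class, int.class);
        boolean length = isNative(String.class, "length");
        System.out.println("System.arraycopy native: " + arraycopy);
        System.out.println("String.length native:    " + length);
    }
}
```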
I basically understand the idea of managed and native code and their difference. But how is it technically possible for them to communicate with each other? Imagine the following example:
I have a static or dynamic C++ library which is compiled for a specific platform. Now I write a Java program. Inside this code I call the library functions using the 'native' keyword. I build a jar file with the bytecode, and the C++ library files stay separate. The result will no longer be platform-independent.
But how does the Java program know if the called native methods exist?
How is the whole program code executed at runtime? I know that the bytecode will be interpreted or compiled with JIT.
How does this all fit in the sandboxing paradigm? Is the native code also executed inside the sandbox?
Does it work because both kinds of code (Java and C++) are machine code in the end?
Maybe this is a dumb question. But I was always wondering...
EDIT: I got 3 good answers. I really can't decide which helped me the most. But I will mark this question as answered to close this topic from my side.
It doesn't know until you call the method. The native code resides in a .DLL or .so; the Java runtime looks for specific entry points that correspond to the native methods you created (if you're using JNI, there's a tool, javah, or javac -h in newer JDKs, that can parse the methods and create function stubs that'll result in those entry points when compiled). If the wanted entry point is not there, an UnsatisfiedLinkError will be thrown.
The code generated by the JIT is not entirely self-sufficient; it has to call external native code (either low-level runtime routines or OS services) from time to time. The same mechanism is used to invoke the code for your native methods.
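The "doesn't know until you call" behaviour is easy to reproduce: declare a native method, never provide a library for it, and call it. The sketch below uses a hypothetical class `MissingNativeDemo`. The class compiles and loads without complaint, and the failure only appears at the call site.

```java
/** Hypothetical demo: a native method with no library behind it fails only when called. */
final class MissingNativeDemo {

    // Declared native but deliberately never backed by any library.
    // If this class lives in the default package, the JVM would look for an
    // exported C symbol named Java_MissingNativeDemo_answer.
    static native int answer();

    /** Calls the native method and reports what happened instead of crashing. */
    static String callAndDescribe() {
        try {
            return "returned " + answer();
        } catch (UnsatisfiedLinkError e) {
            // Thrown because no loaded library exports the expected entry point.
            return "UnsatisfiedLinkError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(callAndDescribe());
    }
}
```

Note that `UnsatisfiedLinkError` is an `Error`, not an `Exception`, so a plain `catch (Exception e)` will not intercept it.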
No. You can do everything you'd do in a pure C/C++ program there. The only things that'll stop it from doing any damage are external security measures you have (login privilege restrictions, other OS protections, security software, etc.) But the VM won't protect you.
No, JNI existed even before JIT appeared. The mechanism is the same: if the bytecode is being run by an interpreter, and you want this interpreter to invoke native code, you just need some logic in it to determine that a given method is "external" and should be called as native code. This information is contained in the compiled .class file, and when the interpreter or JIT loads it, it creates an in-memory representation that makes it easy to direct the call upon a method lookup.
The JVM will check the libraries you defined and see if the method is there.
Bytecode will be interpreted or JITted and a call to native code is added. This may include boxing/unboxing values and other things needed to convert the data into a suitable format. The libraries have a certain interface which is explained to the Java compiler, and it will produce the required interface logic.
Depends on the sandbox. By default native code is native code. It doesn't call Java APIs so the JVM cannot govern it in any way. But there may be other limitations, for example the JVM could run the native code with libraries that provide sandboxing, or the operating system might have a way of sandboxing.
It depends on what you mean. In the end anything the computer does is machine code, but it doesn't really matter in this case. What matters is the translation and execution part. That is the glue that makes everything work.
Think of the system as people. Person A only speaks Japanese, but wants to reserve a hotel in Paris. The receptionist B only speaks French. Person A can get a translator that will translate their commands to French, command receptionist B and in return translate what B produced into a form person A understands. This is the JNI part.
It depends on the platform. On Linux, Solaris, etc., the JRE uses dlopen and dlsym. On Windows, it uses LoadLibraryEx and GetProcAddress. Either way, the result is the address of the native function. If the JRE is running in interpreted mode, the interpreter calls that function; in compiled mode, the JIT compiles the Java bytecode into native code that calls that function.
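The platform dependence is visible from the Java side as well: `System.loadLibrary` takes a logical name, and the runtime maps it to a platform-specific file name before searching `java.library.path`. A small sketch, where the class name `LibraryNameDemo` and the library name `legacywrapper` are made up for illustration:

```java
/** Hypothetical demo: shows how a logical library name maps to a platform file name. */
final class LibraryNameDemo {

    /** Returns the file name the runtime would look for, e.g. with a .dll or .so suffix. */
    static String platformFileName(String logicalName) {
        return System.mapLibraryName(logicalName);
    }

    public static void main(String[] args) {
        // "legacywrapper" is an invented name; nothing is actually loaded here.
        System.out.println("File name searched for: " + platformFileName("legacywrapper"));
        System.out.println("Directories searched:   " + System.getProperty("java.library.path"));
    }
}
```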
On all JREs I'm familiar with, you can't call a native function in a static library directly; only one in a dynamic library.
Native code doesn't have to be limited to a single platform; if it's standard C, you can probably compile it with a cross-compiler for every platform on which a JRE is available.
I have a Java application that calls lots of different native methods of a legacy application through JNI. But the JVM crashes with a stack dump at random places, outside any JNI call. Sometimes it crashes during GC, sometimes during class loading, and sometimes in other places. I suspect that one or more native methods are corrupting the JVM heap or some other data structure. I need to know which call this is, so I can fix the native implementation.
The legacy application is a 3rd party DLL for which I don't have sources nor symbol information. To make it callable from Java, I built a wrapper DLL that uses JNI calling conventions.
The perfect solution would be an extended JVM option that forces JVM to automatically check integrity of heap and its other data structures after each JNI call.
Do you know of something that can help?
P.S. Please don't tell me to build a socket or pipe layer between JVM and the legacy application, because our requirements disallow that. This is about bug detection, not architecture design.
Because I ran out of answers and couldn't find a ready solution by myself, I ended up building a sandbox process in pure C++ just to identify the problem. My Java app starts the sandbox process using ProcessBuilder and then communicates with it using stdin and stdout. Instead of the JVM, it's the sandbox that actually loads and calls the legacy DLL. Then I monitored the sandbox process using Microsoft's Application Verifier, which found a memory corruption problem: there was a call passing a buffer smaller than expected. After this was identified, I just increased the length of the byte[] used as the buffer in the Java app, and now the JVM can make direct calls to the DLL without the sandbox.
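The Java side of such a setup could look roughly like the sketch below. It is not the code from this answer: `SandboxClient`, `callOnce` and the one-request-per-process protocol are assumptions made for illustration, and the sandbox executable itself (whatever loads the legacy DLL) is supplied by the caller as a command line.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical client: sends one request to a helper process and reads its reply. */
final class SandboxClient {

    /**
     * Starts the given command, writes the request to its stdin, closes stdin,
     * and returns everything the process printed (stdout and stderr merged).
     * Suitable for small payloads only: a large reply could fill the pipe
     * while this method is still writing the request.
     */
    static String callOnce(List<String> command, String request) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(request.getBytes(StandardCharsets.UTF_8));
            }
            byte[] reply = process.getInputStream().readAllBytes();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                // A crash in the native code surfaces here, not as a JVM crash.
                throw new IllegalStateException("sandbox exited with code " + exitCode);
            }
            return new String(reply, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: SandboxClient <sandbox command> [args...]");
            return;
        }
        // The request format is invented; a real sandbox defines its own protocol.
        System.out.print(callOnce(List.of(args), "ping\n"));
    }
}
```

The point of the design is isolation for debugging: memory corruption in the DLL can only damage the helper process, where native tools such as Application Verifier can observe it.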
Overall, I lost almost 10 days just because JVM doesn't have an option to verify heap after each JNI call. But at least now if someone finds a crash we can quickly debug it using the sandbox.
Is there any chance of the Java security policy being violated through the Java Native Interface?
What are the main areas where we have to use JNI?
Java's Security policies simply do not apply to native code called via JNI, so obviously the native code can violate them at will.
As for what JNI is used for, these days it's mainly to call OS-specific APIs or interface with existing non-Java code. Improving performance used to be a frequently-cited reason, but considering the state of VMs and JIT compilers today, that almost never makes sense.
Yes, once you invoke native code through JNI it can do pretty much anything the current user is allowed to do - e.g. delete all their files. The Java system cannot police anything that native code does.
You don't have to use JNI for anything - it's typically used for e.g. low-level access (e.g. critical error handling for a removable drive) or to access a C API which doesn't have a pure-Java equivalent.
I have a Java application which uses JNI in some parts to do some work. It follows the usual pattern of loading a DLL and then calling its native methods. Is there any way we can restrict what native methods can do from the Java application? For example, can we prevent DLLs from opening any files or sockets even if they have the code to do it? The application could simply forbid the DLLs it loads from doing certain things, maybe by logging something or throwing an exception.
No you can't. The DLL gets loaded as a whole and then the Java side has no control on what the native code is doing.
One solution might be a kind of man-in-the-middle approach. This would involve coding a "shell" DLL that has the same interface as the original DLL. You tell Java to load the "shell" DLL, for instance by putting it in a specific location and using the java.library.path property. Then the role of the "shell" DLL is to load the "true" DLL, sandboxing it and redirecting standard functions. This sounds like a lot of pain, and it is something that would happen on the native side of things, not from Java.
Edit 2021: today it's also relevant to point out that the sandbox to run Java in would likely be a virtual machine, in the cloud, Docker or what have you, in a locked down configuration.
I liked Gregory Pakosz' answer a lot. However, what you could do is sandbox the Java instance itself. Start the Java application itself in a restricted context.
In Windows or Unix you can create a user which is limited to a certain directory and only has access to some DLLs. Thus the DLL called from JNI can do whatever it wants, but it will not get very far, because the user the Java runs as can not do very much.
If your Java program needs to do privileged things, the Java side of it will have to talk to another program (Java or not) to do its privileged things for it.
Just keep in mind, that if you can not trust the DLL, you can no longer trust the Java code either, since the DLL might have "hacked" the Java machine. On the other hand, no nasty stuff should be able to break out of the limits of the user they run as. (Barring misconfiguration or a bug in the OS.)
Normally you would run your application under the Java SecurityManager, but I don't believe it has any effect on code running through JNI.
You could implement some kind of setting that your JNI code could check. For example, on a UNIX system, you could create groups for special types of privileges, and check if the current user has the required privileges, else just return 0 or something.