I am trying to set up a Groovy Shell sandbox that can execute untrusted code. This untrusted code is provided by the end users (developers) as behaviour configuration, e.g. how to determine whether a person is high net worth, so the scripts really are part of the main program. I need to make sure that I am not vulnerable to bad code (e.g. an infinite loop) or hacks.
I understand that there are two things at play here:
The Java VM that provides the runtime.
The Groovy Shell that interprets and executes the code.
Are there best practices to sandbox a Groovy Shell?
Thanks
I ended up creating a Policy file. Something like this:
grant codeBase "file:/your jar file" {
    permission java.security.AllPermission;
};

grant codeBase "file:/groovy/shell" {
};

grant codeBase "file:/groovy/script" {
};
When Groovy is executed in interpreted mode, the codeBase is either file:/groovy/shell or file:/groovy/script. You can grant specific permissions for either context. These permissions (or the lack thereof) are independent of what you grant to your main program.
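For example, the script context could be limited to a single harmless permission while everything else is denied. The grant below is only an illustration of the policy syntax, not a recommended policy; the permission chosen is arbitrary:

```
grant codeBase "file:/groovy/script" {
    permission java.util.PropertyPermission "user.timezone", "read";
};
```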
In addition to the policy file, there are many other considerations too.
What do you put into the evaluation context? If you expose a 3rd-party library, it may not even have the proper permission checks in place.
Some system calls, say System.out.println(), don't have a permission check either. So maybe you also need a source code checker (Jenkins does that).
To limit CPU, you may need to run the Groovy script in a separate thread.
You probably want to limit what a Groovy script can import too. This can be achieved with Groovy compilation customizers (an ImportCustomizer, or a SecureASTCustomizer with an import whitelist).
I wrote an article: Secure Groovy Script Execution in a Sandbox to summarize my findings. I hope it will help others too.
Just relying on the Security Manager does not solve all the issues. There are multiple reasons why the Security Manager is unsuited. Most importantly, it relies on the assumption that critical methods already perform the necessary permission checks. As already pointed out, this is not always the case. In the Java SE API alone there are plenty of examples; expanding the scope to 3rd-party libraries, you will rarely see implementations performing permission checks at all. This cannot be retrofitted. The Security Manager was designed to protect a desktop system; on a multi-user system, in contrast, you need additional protection against attacks on other users.
A working approach is what Jenkins does: intercept all method calls and member accesses and check against a configured whitelist. This is implemented in https://github.com/jenkinsci/script-security-plugin
A crucial thing here is how to maintain a usable whitelist. The Jenkins approach works with extensive lists of method signatures (plus blacklists). While this is a conservative approach, it's probably tedious to maintain. You should start with the smallest possible API subset that you want to offer. At best, there is a dedicated API for script users that hides all implementation details.
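As a minimal illustration of the whitelist idea (this is not Jenkins' actual implementation), an interceptor can look up each reflective call in a set of allowed signatures before dispatching it. The class name and whitelist entries below are hypothetical:

```java
import java.lang.reflect.Method;
import java.util.Set;

public class WhitelistInvoker {
    // Hypothetical whitelist of allowed method signatures ("class#method").
    private static final Set<String> WHITELIST = Set.of(
            "java.lang.Math#abs",
            "java.lang.String#toUpperCase"
    );

    // Dispatches a method call only if its signature is whitelisted.
    public static Object invoke(Object target, Class<?> owner, String name,
                                Class<?>[] paramTypes, Object... args) throws Exception {
        String key = owner.getName() + "#" + name;
        if (!WHITELIST.contains(key)) {
            throw new SecurityException("Method not whitelisted: " + key);
        }
        Method m = owner.getMethod(name, paramTypes);
        return m.invoke(target, args);
    }
}
```

A script runtime would route every method call and member access through a gate like this one; the hard part, as noted above, is curating the list.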
There are still some corner cases, in particular regarding resource exhaustion. Think for instance about the following script:
for( int x = 7; true; x *= x );
This infinite loop will keep the CPU busy while there is no method call or field assignment to be intercepted. IMHO, the best way to handle this is to execute the script in a separate thread and stop it after a given timeout. Yes, I mean calling Thread.stop(), because interrupts will just be ignored. Finally, scripts may raise any Throwable, even checked ones, without declaring them; you should always wrap script execution in a catch (Throwable).
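A sketch of the separate-thread-plus-timeout idea using only the standard library. Note that Thread.stop() is deprecated (and disabled in recent JDKs), so this sketch merely abandons the runaway thread after the timeout; marking it as a daemon at least keeps it from holding the JVM open. All names here are illustrative:

```java
import java.util.concurrent.*;

public class ScriptRunner {
    // Runs an untrusted task with a wall-clock timeout on a daemon thread,
    // so a runaway loop cannot keep the JVM alive after we give up on it.
    public static Object run(Callable<Object> script, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "untrusted-script");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<Object> f = pool.submit(script);
            return f.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TIMEOUT";
        } catch (Throwable t) {
            // Scripts may raise anything, including undeclared checked throwables.
            return "ERROR: " + t.getCause();
        } finally {
            pool.shutdownNow(); // best effort: interrupts are likely ignored anyway
        }
    }
}
```

For hard CPU limits, running the script in a separate process (or container) remains the only reliable option, since an abandoned thread still burns a core.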
A bit more context would have been helpful, but given what you described:
I'd strongly recommend running groovyshell inside a container - one container per user/instance.
You can tightly control disk access, and also set a cap for CPU and memory usage of the container.
And if you want to go to the extreme, you can easily run each container in a network of its own with no other nodes and no internet access.
With that, bad code is largely limited to exploiting Docker vulnerabilities that are reachable from a JVM program.
I am curious about what automatic methods could be used to determine whether a Java app running on a Windows PC is malware. (I don't really even know what exploits are available to such an app. Is there somewhere I can learn about the risks?) If I have the source code, are there specific packages or classes that are more readily abused than others? Perhaps their presence could suggest malware?
Update: Thanks for the replies. I was interested in knowing if this would be possible, and it basically sounds totally infeasible. Good to know.
If it's not even possible to automatically determine whether a program terminates, I don't think you'll get much leverage in automatically determining whether an app does "naughty stuff".
Part of the problem of course is defining what constitutes malware, but the majority is simply that deducing proofs about the behaviour of other programs is surprisingly difficult/impossible. You may have some luck spotting particular patterns, but on the whole you can't be confident (and I suspect it's provably impossible) that you've caught all possible attack vectors.
And in the general sphere, catching 95% of vectors isn't really worthwhile when the attackers simply concentrate on the remaining 5%.
Well, there's always the fundamental philosophical question: what is a malware? It's code that was intended to do damage, or at least code that doesn't do what it claims to. How do you plan to judge intent based on libraries it uses?
Having said that, if you at least roughly know what the program is supposed to do, you can indeed find suspicious packages, things the program wouldn't normally need to access. Like network connections when the program is meant to run as a desktop app. But then the network connection could just be part of an autoupdate feature. (Is autoupdate itself a malware? Sometimes it feels like it is.)
Another indicator is a program that ostensibly doesn't need any special privileges yet refuses to run in a sandbox. And the biggest threat is if it tries to load a native library when it shouldn't need one.
But all these only make sense if you know what the code is supposed to do. An antivirus package might use very similar techniques to viruses, the only difference is what's on the label.
Here is a general outline for how you can bound the possible actions your Java application can take. Basically you are testing whether the Java application is 'inert' (can't take harmful actions) and thus probably not malware.
This won't necessarily tell you whether it is malware, as others have pointed out. The app could still do annoying things like pop up windows. Perhaps the best indication is whether the application is digitally signed by an author you trust; if not -- be afraid.
You can disassemble the class files to determine which Java APIs the application uses; you are looking for the points where the Java app touches the OS. Since Java runs on a virtual machine, there are well-defined points where a Java application could take potentially harmful actions -- these are the 'gateways' to various OS calls (for example opening a socket or reading a file).
It's difficult to enumerate all the APIs; different functions which execute the same OS action should require the same Permission, but Java's docs don't provide an exhaustive list.
Does the Java app use any native libraries? If so, that's a big red flag.
The JVM does not offer the ability to run arbitrary code or use native system APIs; in particular it does not offer the ability to modify the registry (a typical action of PC malware). The only way a Java application can do this is via native libraries. Typically there is no need for a normal application written in Java to use native code (unless it needs to talk to devices).
Check for System.loadLibrary() or System.load() or Runtime.loadLibrary() or Runtime.load(). This is how the VM loads native libraries.
Does it use the network or file system?
Look for use of java.io, java.net.
Does it make system calls (via Runtime.exec())?
You can check for the use of java.lang.Runtime.exec() or ProcessBuilder.start().
Does it try to control the keyboard / mouse?
You could also run the application in a JVM with a restricted policy (the instructions/tools for doing this are not as simple as they should be) and see what fails (see Oracle's security tutorial) -- note that disassembly is the only way to be sure; just because the app doesn't do anything harmful once doesn't mean it won't in the future.
This definitely is not easy, and I was surprised to find how many places one needs to look at (for example several java functions load native libraries, not just one).
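A very crude version of this kind of check can be done by scanning decompiled source (or constant-pool strings) for the API names listed above. The deny-list below is only illustrative, and a scan like this is trivially bypassed by reflection or obfuscation, so treat it as a red-flag detector, not a proof of safety:

```java
import java.util.ArrayList;
import java.util.List;

public class ApiScanner {
    // Hypothetical deny-list, following the checklist above:
    // native library loading, process execution, network and file access.
    private static final List<String> SUSPICIOUS = List.of(
            "System.loadLibrary", "System.load",
            "Runtime.getRuntime", "ProcessBuilder",
            "java.net.Socket", "java.io.File"
    );

    // Returns every suspicious API name found in the given source text.
    public static List<String> scan(String source) {
        List<String> hits = new ArrayList<>();
        for (String api : SUSPICIOUS) {
            if (source.contains(api)) {
                hits.add(api);
            }
        }
        return hits;
    }
}
```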
I am working on a servlet (running on Tomcat) which receives requests containing JavaScript code and, using the Java Scripting API framework, evaluates/runs the code and returns the answer to the user.
Since we are dealing with user-generated code, the code can be good or bad. An example of bad code is while(true);, which will loop endlessly on the server, consuming resources unnecessarily.
My questions:
1) How can I discover bad code?
2) Once it is identified as bad/malicious code, what is the best way to stop the run?
Thanks
My question to you: what counts as bad code?
If you cannot come up with a formal definition of what counts as bad code, you cannot hope to be able to detect it. And since this is probably what your question really meant, I'll put forward my answer - there's no way to do it.
Even a seemingly trivial property such as whether a program will terminate cannot be determined ahead of time, and I'd expect any definition of bad code to include programs that don't terminate.
Thus to my mind you have one major option: trust your users (or alternatively don't trust them and don't run anything).
Something that might work otherwise is to run the script in a strict sandbox, and terminate it after an appropriate amount of time if it hasn't already finished running. It very much depends on your circumstances as to what is acceptable.
You are really jumping down the rabbit hole on this one. There is no way to determine in advance whether code is resource-intensive or has malicious intent. Even humans have a hard time with that. Having said that, there are some things you can do to defend yourself:
Use Rhino instead of Java 6's built-in JS scripting engine as it gives you more options.
Implement a custom context that monitors the instruction count. This gives you an opportunity to interrupt scripts that loop infinitely. See Rhino's ContextFactory class.
Run your scripts in a separate thread so that you can interrupt scripts stuck in wait states that don't trigger the Context's instruction count.
Implement a security manager: see Overview, API. This will allow you to restrict the script to just those objects it should be interacting with.
I have implemented 1,2, and 3 in Myna and you are welcome to steal code
There's already a tool that identifies 'bad' JavaScript, JSLint. Obviously the definition of bad code is highly subjective, but JSLint provides a wide range of options, so you should be able to configure it to conform fairly closely to your definition of bad.
You can submit code (and configuration options) to JSLint via the web form linked to above. It should also be possible to submit code (and options) to JSLint programmatically, but you should get the author's permission if you plan to do this regularly.
How does Google App Engine sandbox work?
What would I have to do to create my own such sandbox (to safely allow my clients to run their apps on my engine without giving them the ability to format my disk drive)? Is it just classloader magic, bytecode manipulation or something?
You would probably need a combination of a restrictive classloader and a thorough understanding of the Java Security Architecture. You would probably run your JVM with a very strict SecurityManager specified.
In the Java case, I think it's mostly done by restricting the available libraries. Since Java has no pointer concept, and you can't upload natively compiled code (only JVM bytecode), you can't break out of the sandbox. Add some tight process scheduling, and you're done!
I guess the hardest part is picking the libraries, so that the sandbox is useful while staying safe.
In the Python case, they had to modify the VM itself, because it wasn't designed with safety in mind. Fortunately, they have Guido himself to do it.
to safely allow my clients to run their apps on my engine without giving them the ability to format my disk drive
This can easily be achieved using the Java Security Manager. Refer to this answer for an example.
Let's say I have an existing application written in Java which I wish to add scripting support to. This is fairly trivial with Groovy (and just as trivial in .Net with any of the Iron range of dynamic languages).
As trivial as adding the support is, it raises a whole host of questions about script execution and security - and how to implement that security.
Has anyone come across any interesting articles/papers or have any insights into this that they'd like to share? In particular I'd be very interested in architectural things like execution contexts, script authentication, script signing and things along those lines... you know, the kind of stuff that prevents users from running arbitrary scripts they just happened to download which managed to screw up their entire application, whilst still allowing scripts to be useful and flexible.
Quick Edit
I've mentioned signing and authentication as different entities because I see them as different aspects of security.
For example, as the developer/supplier of the application I distribute a signed script. This script is legitimately "data destructive" by design. Given its nature, it should only be run by administrators, and therefore the vast majority of users/system processes should not be able to run it. This being the case, it seems to me that some sort of security/authentication context is required, based on the actions the script performs and on who is running the script.
script signing
Conceptually, that's just reaching into your cryptographic toolbox and using the tools that exist. Have people sign their code, validate the signatures on download, and check that the originator of the signature is trusted.
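The sign-and-validate step itself is standard JCA machinery. A minimal sketch (the class and method names are mine, not from any particular product; key distribution and trust decisions are the hard part left out here):

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class ScriptSigning {
    // Signs the raw bytes of a script with the author's private key.
    public static byte[] sign(byte[] script, PrivateKey key) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initSign(key);
        s.update(script);
        return s.sign();
    }

    // Verifies the script against a detached signature before executing it.
    public static boolean verify(byte[] script, byte[] sig, PublicKey key)
            throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initVerify(key);
        s.update(script);
        return s.verify(sig);
    }
}
```

A loader would refuse to execute any script whose signature fails to verify, or whose signing key is not in the locally trusted set.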
The hard(er) question is what makes people worthy of trust, and who gets to choose that. Unless your users are techies, they don't want to think about that, and so they won't set good policy. On the other hand, you'll probably have to let the users introduce trust [unless you want to go for an iPhone App Store-style walled garden, which it doesn't sound like you do].
script authentication
How's this different from signing? Or, is your question just that: "what are the concepts that I should be thinking about here"?
The real question is of course: what do you want your users to be secured against? Is it only in-transit modification, or do you also want to make guarantees about the behavior of the scripts? Some examples might be: scripts can't access the file system, scripts can only access a particular subtree of the file system, scripts can't access the network, etc.
I think it might be possible [with a modest amount of work] to provide some guarantees about how scripts access the outside world if they were written in Haskell and the IO monad was broken into smaller pieces. Or, possibly, if you had the scripts run in their own opaque monad with access to a limited part of IO. Be sure to remove access to unsafe functions.
You may also want to look at the considerations that went into the Java (applet) security model.
I hope this gives you something to think about.
You may have a look at the documentation on security:
http://groovy.codehaus.org/Security
Basically, you can use the usual Java Security Manager.
And beyond this, there's a sample application showing how you can introspect the code by analyzing its AST (Abstract Syntax Tree) at compile-time, so that you can allow / disallow certain code constructs, etc. Here's an example of an arithmetic shell showing this in action:
http://svn.groovy.codehaus.org/browse/groovy/trunk/groovy/groovy-core/src/examples/groovyShell
I have a program that runs a basic RMISecurityManager in all its threads. But I would like to apply tighter control to several threads and set another SecurityManager specifically for those threads.
How can I do that, if it is possible at all?
Thank you in advance.
Edit : I have found my solution. See here for more details.
It doesn't make a great deal of sense. What if code (malicious or not) causes execution on a different thread? This can even happen within the Java library, with the security context transferred (which may use java.security.AccessController.getContext/doPrivileged).
Applets do use a slightly difficult system involving ThreadGroups, but I wouldn't recommend it. JAAS allows a Subject to be added to the AccessControlContext, but personally I'd suggest not using this style of programming.
Give downloaded code (if any) appropriate permissions, and don't give sensitive objects to code you don't trust with them.
The SecurityManager performs checks based on the security context of the running thread; perhaps you want to make your SecurityManager behave differently based on whatever it finds in that context?
Or maybe, you want to implement your SecurityManager using the strategy pattern.
yc
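Since the SecurityManager API is deprecated in recent JDKs, the per-thread strategy idea above can be sketched with a plain interface selected through a ThreadLocal; all names here are hypothetical, and a real implementation would route its checkPermission calls through such a strategy:

```java
public class ThreadPolicy {
    // Hypothetical per-thread check strategy, selected via a ThreadLocal.
    public interface CheckStrategy {
        void check(String action); // throws SecurityException when denied
    }

    public static final CheckStrategy PERMISSIVE = action -> { };
    public static final CheckStrategy RESTRICTIVE = action -> {
        throw new SecurityException("Denied for this thread: " + action);
    };

    private static final ThreadLocal<CheckStrategy> STRATEGY =
            ThreadLocal.withInitial(() -> PERMISSIVE);

    // Each thread picks its own strategy; checks consult the current thread's choice.
    public static void setStrategy(CheckStrategy s) {
        STRATEGY.set(s);
    }

    public static void check(String action) {
        STRATEGY.get().check(action);
    }
}
```

As the answer above warns, this breaks down as soon as work hops to another thread, so treat per-thread policy as a convenience, not a security boundary.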