Adding scripting security to an application - java

Let's say I have an existing application written in Java which I wish to add scripting support to. This is fairly trivial with Groovy (and just as trivial in .Net with any of the Iron range of dynamic languages).
As trivial as adding the support is, it raises a whole host of questions about script execution and security - and how to implement that security.
Has anyone come across any interesting articles/papers or have any insights into this that they'd like to share? In particular I'd be very interested in architectural things like execution contexts, script authentication, script signing and things along those lines... you know, the kind of stuff that prevents users from running arbitrary scripts they just happened to download which managed to screw up their entire application, whilst still allowing scripts to be useful and flexible.
Quick Edit
I've mentioned signing and authentication as different entities because I see them as different aspects of security.
For example, as the developer/supplier of the application I distribute a signed script. This script is legitimately "data destructive" by design. Given its nature, it should only be run by administrators, and therefore the vast majority of users/system processes should not be able to run it. This being the case, it seems to me that some sort of security/authentication context is required based on the actions the script performs and who is running that script.

script signing
Conceptually, that's just reaching into your cryptographic toolbox and using the tools that exist. Have people sign their code, validate the signatures on download, and check that the originator of the signature is trusted.
The hard(er) question is what makes people worthy of trust, and who gets to choose that. Unless your users are techies, they don't want to think about that, and so they won't set good policy. On the other hand, you'll probably have to let the users introduce trust [unless you want to go for an iPhone AppStore-style walled garden, which it doesn't sound like].
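The cryptographic toolbox mentioned above amounts to a few JDK primitives. Here is a minimal sketch of supplier-side signing and client-side verification using java.security.Signature; the key size, algorithm, and sample script content are illustrative choices, not requirements:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class ScriptSigning {
    public static void main(String[] args) throws Exception {
        // Supplier side: generate a key pair once; ship the public key with the app.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        byte[] script = "println 'destructive admin task'".getBytes(StandardCharsets.UTF_8);

        // Supplier side: sign the script bytes before distribution.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(kp.getPrivate());
        signer.update(script);
        byte[] sig = signer.sign();

        // Client side: verify before execution; refuse to run on mismatch.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(kp.getPublic());
        verifier.update(script);
        System.out.println("signature valid: " + verifier.verify(sig));

        // Any in-transit modification must break verification.
        script[0] ^= 1;
        verifier.initVerify(kp.getPublic());
        verifier.update(script);
        System.out.println("tampered valid: " + verifier.verify(sig));
    }
}
```

In practice the application would bundle only the public key and refuse to execute any script whose signature does not verify; jarsigner automates the same idea for whole JARs.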
script authentication
How's this different from signing? Or, is your question just that: "what are the concepts that I should be thinking about here"?
The real question is of course: what do you want your users to be secured against? Is it only in-transit modification, or do you also want to make guarantees about the behavior of the scripts? Some examples might be: scripts can't access the file system, scripts can only access a particular subtree of the file system, scripts can't access the network, etc.
I think it might be possible [with a modest amount of work] to provide some guarantees about how scripts access the outside world if they were written in Haskell and the IO monad was broken into smaller pieces. Or, possibly, if you had the scripts run in their own opaque monad with access to a limited part of IO. Be sure to remove access to unsafe functions.
You may also want to look at the considerations that went into the Java (applet) security model.
I hope this gives you something to think about.

You may have a look at the documentation on security:
http://groovy.codehaus.org/Security
Basically, you can use the usual Java Security Manager.
And beyond this, there's a sample application showing how you can introspect the code by analyzing its AST (Abstract Syntax Tree) at compile-time, so that you can allow / disallow certain code constructs, etc. Here's an example of an arithmetic shell showing this in action:
http://svn.groovy.codehaus.org/browse/groovy/trunk/groovy/groovy-core/src/examples/groovyShell

Related

Groovy Shell Sandboxing Best Practices

I am trying to set up a Groovy Shell sandbox that can execute untrusted code. This untrusted code is provided by end users (developers) as behaviour configurations, e.g. how to determine if a person is high-net-worth, so it really is part of the main program. I need to make sure that I am not vulnerable to bad code (e.g. an infinite loop) or hacks.
I understand that there are two things at play here:
The Java VM that provides the runtime.
The Groovy Shell that interprets and executes the code.
Are there best practices to sandbox a Groovy Shell?
Thanks
I ended up creating a Policy file. Something like this:
grant codeBase "file:/your jar file" {
    permission java.security.AllPermission;
};
grant codeBase "file:/groovy/shell" {
};
grant codeBase "file:/groovy/script" {
};
When Groovy is executed in interpreted mode, the codeBase is either file:/groovy/shell or file:/groovy/script. You can grant specific permissions to either context. These permissions (or lack thereof) are independent of what you grant to your main program.
In addition to the policy file, there are many other considerations too.
What do you put into the evaluation context? If you put in a 3rd-party library, it may not even have the proper permission checks in place.
Some system calls, say System.out.println(), don't have a permission check either, so you may also need a source-code checker (Jenkins does that).
To limit CPU, you may need to run the Groovy script in a separate thread.
You probably want to limit what a Groovy script can import too. This can be achieved with a Groovy ImportCustomizer.
I wrote an article: Secure Groovy Script Execution in a Sandbox to summarize my findings. I hope it will help others too.
Just relying on the Security Manager does not solve all the issues. There are multiple reasons why Security Manager is unsuited. Most importantly, it relies on the assumption that critical methods already incorporate the due permission checks. As already pointed out, this is not always the case. In the Java SE API alone, there are plenty of examples; expanding the scope to further 3rd party libraries, you will rarely see good implementations performing permission checks. This cannot be retrofitted. Security Manager was designed for protecting a desktop system; in contrast, on a multi-user system, you need additional protection against attacks on other users.
A working approach is what Jenkins does: intercept all method calls and member accesses and check against a configured whitelist. This is implemented in https://github.com/jenkinsci/script-security-plugin
A crucial thing here is how to maintain a usable whitelist. The Jenkins approach works with extensive lists of method signatures (plus blacklists). While this is a conservative approach, it's probably tedious to maintain. You should start with the smallest possible API subset that you want to offer. At best, there is a dedicated API for script users that hides all implementation details.
There are still some corner cases, in particular regarding resource exhaustion. Think for instance about the following script:
for( int x = 7; true; x *= x );
This infinite loop will keep the CPU busy while there is no method call or field assignment to be intercepted. IMHO, the best way to handle this is to execute the script in a separate thread and stop it after a given timeout. Yes, I mean calling Thread.stop(), because interrupts will just be ignored. Finally, scripts may raise any Throwable, even checked ones, without declaring them; you should always wrap script execution in a catch for Throwable.
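A minimal sketch of the separate-thread-plus-timeout idea. It uses Future.cancel for interruption rather than Thread.stop(), since Thread.stop() is deprecated for removal and throws UnsupportedOperationException on recent JDKs; a loop that ignores interrupts, like the hostile one below, ultimately needs a daemon thread or process-level isolation:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ScriptTimeout {
    public static void main(String[] args) throws Exception {
        // Daemon threads cannot keep the JVM alive once main exits.
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "script-runner");
            t.setDaemon(true);
            return t;
        });

        // Stand-in for the evaluated script: an infinite loop that ignores interrupts.
        Runnable hostileScript = () -> { long x = 7; while (true) x *= x; };

        Future<?> result = pool.submit(hostileScript);
        try {
            result.get(500, TimeUnit.MILLISECONDS); // give the script its time budget
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt; a loop like this one will ignore it
            System.out.println("script timed out");
        }
        pool.shutdownNow();
    }
}
```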
A bit more context would have been helpful, but given what you described:
I'd strongly recommend running GroovyShell inside a container - one container per user/instance.
You can tightly control disk access, and also set a cap for CPU and memory usage of the container.
And if you want to go to the extreme, you can easily run each container in a network of its own with no other nodes and no internet access.
With that, bad code is restricted to Docker vulnerabilities that can be exploited from a JVM program.

Encrypting a JAR where source protection is a priority

I have a dilemma. Basically, I've given a group of people I'm friends with a program that utilizes source code that I don't want anyone outside the group knowing of. We all know Java is absolutely horrible at doing any level of obfuscation, as most obfuscation tools only rename objects, scramble code, etc. I've used such tools, but to be honest I'd like to go as far as possible with the security of the program.
Since the application requires a username, password, and other identifiers to log in to the server it uses, I was beginning to wonder if a unique AES key could be generated for the user to secure the JAR.
Basically, upon running a launcher of sorts to log in, the launcher app may request an AES key from the server, and use it to decrypt a secured JAR it's downloaded from the server already. The key would be completely unique to each user, which would mean the server would have to encrypt the JAR differently for each user.
Now, I know how crazy this sounds. But since this is such a low-level thing, I need to know if there is a way you can somehow both decrypt and run a JAR from any type of stream. Or, if that isn't possible, would it be reasonable to decrypt the file, run it, then re-encrypt it?
Of course you can decrypt and run Java bytecode on the fly - bytecode manipulation libraries such as ASM even go as far as creating new classes dynamically.
But, quite honestly, if something actually runs on a computer then its code is definitely going to be available to anyone with the knowledge. Java, especially, is even more convenient since it allows far better access to the bytecode of a class that is loaded by the JVM than any natively compiled language.
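A hypothetical sketch of the decrypt-and-define approach via a custom ClassLoader; XOR stands in for AES purely to keep the example short, and the class and key here are made up. It also demonstrates the answer's point: the decrypted bytes exist in memory, where anyone with a debugger or agent can dump them.

```java
import java.io.InputStream;
import java.lang.reflect.Constructor;

// A tiny class whose bytecode we "encrypt" and then reload through a fresh loader.
class Greeter {
    @Override public String toString() { return "hello"; }
}

// Hypothetical loader: decrypts class bytes in memory before defining the class.
class DecryptingClassLoader extends ClassLoader {
    private final byte[] key;
    DecryptingClassLoader(byte[] key) { super(null); this.key = key; } // null parent: force our copy
    Class<?> loadEncrypted(String name, byte[] encrypted) {
        byte[] plain = encrypted.clone();
        for (int i = 0; i < plain.length; i++) plain[i] ^= key[i % key.length];
        return defineClass(name, plain, 0, plain.length); // decrypted bytes never touch disk
    }
}

public class EncryptedLoadDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = {0x5A, 0x3C, 0x7E};
        byte[] raw;
        try (InputStream in = EncryptedLoadDemo.class.getResourceAsStream("Greeter.class")) {
            raw = in.readAllBytes();
        }
        // Simulate the server-side per-user encryption.
        byte[] encrypted = raw.clone();
        for (int i = 0; i < encrypted.length; i++) encrypted[i] ^= key[i % key.length];

        Class<?> cls = new DecryptingClassLoader(key).loadEncrypted("Greeter", encrypted);
        Constructor<?> ctor = cls.getDeclaredConstructor();
        ctor.setAccessible(true); // a different loader means a different runtime package
        System.out.println(ctor.newInstance());
    }
}
```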
You could theoretically take your obfuscation a bit further by using JNA/JNI and a native shared library or two. But, in the hands of a determined attacker no measure will protect your code completely - it would just take more time for them to figure out how your algorithms work. And if you are concerned about piracy, well, we are in the era of virtualization; you can actually clone whole computer systems from top to bottom with a couple of key presses - you figure out the rest...
The only potentially viable solution would be to offer your software as a service, with all the issues entailed by that approach - and you would still not have absolute security.
If you are that concerned about protecting your intellectual property, then get a lawyer and consider publishing your algorithms in some form - obscurity will only go so far. It will not stop someone from doing black-box analysis on your system and quite often just knowing that something is possible is enough.
Please stop trying to find technical solutions to a problem that is so obviously not of a technical nature...
My answer would be to keep the server information outside of the jar entirely. Use a parameter or configuration file to point to where to get that information. Then the jar file has no secrets in it. Only the server where the code runs has that information. You can then do things like make the configuration file readable only by the user that can run the code in the jar.

Determining if a Java app is malware

I am curious about what automatic methods may be used to determine if a Java app running on a Windows PC is malware. (I don't really even know what exploits are available to such an app. Is there someplace I can learn about the risks?) If I have the source code, are there specific packages or classes that could be used more harmfully than others? Perhaps their use could suggest malware?
Update: Thanks for the replies. I was interested in knowing if this would be possible, and it basically sounds totally infeasible. Good to know.
If it's not even possible to automatically determine whether a program terminates, I don't think you'll get much leverage in automatically determining whether an app does "naughty stuff".
Part of the problem of course is defining what constitutes malware, but the majority is simply that deducing proofs about the behaviour of other programs is surprisingly difficult/impossible. You may have some luck spotting particular patterns, but on the whole you can't be confident (and I suspect it's provably impossible) that you've caught all possible attack vectors.
And in the general sphere, catching 95% of vectors isn't really worthwhile when the attackers simply concentrate on the remaining 5%.
Well, there's always the fundamental philosophical question: what is a malware? It's code that was intended to do damage, or at least code that doesn't do what it claims to. How do you plan to judge intent based on libraries it uses?
Having said that, if you at least roughly know what the program is supposed to do, you can indeed find suspicious packages, things the program wouldn't normally need to access. Like network connections when the program is meant to run as a desktop app. But then the network connection could just be part of an autoupdate feature. (Is autoupdate itself a malware? Sometimes it feels like it is.)
Another indicator is if a program that ostensibly doesn't need any special privileges, refuses to run in a sandbox. And the biggest threat is if it tries to load a native library when it shouldn't need one.
But all these only make sense if you know what the code is supposed to do. An antivirus package might use very similar techniques to viruses, the only difference is what's on the label.
Here is a general outline for how you can bound the possible actions your Java application can take. Basically you are testing to see if the Java application is 'inert' (can't take harmful actions) and thus probably not malware.
This won't necessarily tell you whether it is malware or not, as others have pointed out. The app could still do annoying things like pop up windows. Perhaps the best indication is to see if the application is digitally signed by an author you trust; if not - be afraid.
You can disassemble the class files to determine which Java APIs the application uses; you are looking for points where the java app uses the OS. Since java uses a virtual machine, there are well defined points where a java application could take potentially harmful actions -- these are the 'gateways' to various OS calls (for example opening a socket or reading a file).
It's difficult to enumerate all the APIs; different functions which execute the same OS action should require the same Permission, but Java's docs don't provide an exhaustive list.
Does the Java app use any native libraries? If so, it's a big red flag.
The JVM does not offer the ability to run arbitrary code or use native system APIs; in particular it does not offer the ability to modify the registry (a typical action of PC malware). The only way a Java application can do this is via native libraries. Typically there is no need for a normal application written in Java to use native code (unless it needs to use devices).
Check for System.loadLibrary() or System.load() or Runtime.loadLibrary() or Runtime.load(). This is how the VM loads native libraries.
Does it use the network or file system?
Look for use of java.io, java.net.
Does it make system calls (via Runtime.exec())
You can check for the use of java.lang.Runtime.exec() or ProcessBuilder.start().
Does it try to control the keyboard / mouse?
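A crude illustration of the 'gateway' check described above: class references live as UTF-8 strings in a class file's constant pool, so even a naive byte scan spots them. A real tool should parse the constant pool properly or use a bytecode library; the Suspicious class and the gateway list here are made up for the sketch:

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Never invoked; compiling it just puts a Runtime reference in its constant pool.
class Suspicious {
    void run() throws Exception { Runtime.getRuntime().exec("echo hi"); }
}

public class GatewayScan {
    // Illustrative, far-from-exhaustive list of "gateway" classes to the OS.
    static final List<String> GATEWAYS = List.of(
            "java/lang/Runtime", "java/lang/ProcessBuilder",
            "java/net/Socket", "java/io/FileOutputStream");

    public static void main(String[] args) throws Exception {
        try (InputStream in = GatewayScan.class.getResourceAsStream("Suspicious.class")) {
            // ISO-8859-1 maps bytes 1:1, so ASCII class names in the pool survive intact.
            String pool = new String(in.readAllBytes(), StandardCharsets.ISO_8859_1);
            for (String gateway : GATEWAYS)
                if (pool.contains(gateway)) System.out.println("uses " + gateway);
        }
    }
}
```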
You could also run the application in a restricted policy JVM (the instructions/tools for doing this are not as simple as they should be) and see what fails (see Oracle's security tutorial) -- note that disassembly is the only way to be sure, just because the app doesn't do anything harmful once, doesn't mean it won't in the future.
This definitely is not easy, and I was surprised to find how many places one needs to look at (for example several java functions load native libraries, not just one).

What are common Java vulnerabilities?

What are common Java vulnerabilities that can be exploited to gain some sort of access to a system? I have been thinking about it recently and haven't been able to come up with much of anything. Integer overflow, maybe? A race condition - what does it give you?
I am not looking for things like "sql injection in a web app". I am looking for something analogous to buffer overflow in C/C++.
Any security experts out there that can help out? Thanks.
Malicious Code injection.
Because Java (or any language using a runtime interpreter) performs linkage at runtime, it is possible to replace the expected JARs (the equivalent of DLLs and SOs) with malicious ones at runtime.
This is a vulnerability which has been combated since the first release of Java, using various mechanisms.
There are protections in place in the classloaders to ensure that java.* classes cannot be loaded from outside rt.jar (the runtime jar).
Additionally, security policies can be put in place to ensure that classes loaded from different sources are restricted to performing only a certain set of actions - the most obvious example is that of applets. Applets are constrained by the Java security policy model from reading or writing the file system etc; signed applets can request for certain permissions.
JARs can also be signed, and these signatures can be verified at runtime when they're loaded.
Packages can also be sealed to ensure that they come from the same codesource. This prevents an attacker from placing classes into your package that are capable of performing 'malicious' operations.
If you want to know why all of this is important, imagine a JDBC driver injected into the classpath that is capable of transmitting all SQL statements and their results to a remote third party. Well, I assume you get the picture now.
After reading most of the responses I think your question has been answered in an indirect way. I just wanted to point this out directly. Java doesn't suffer from the same problems you see in C/C++ because it protects the developer from these types of memory attacks (buffer overflow, heap overflow, etc). Those things can't happen. Because there is this fundamental protection in the language, security vulnerabilities have moved up the stack.
They're now occurring at a higher level: SQL injection, XSS, DoS, etc. You could figure out a way to get Java to remotely load malicious code, but to do that you'd need to exploit some other vulnerability at the services layer to remotely push code into a directory, then trigger Java to load it through a classloader. Remote attacks are theoretically possible, but with Java it's more complicated to exploit. And often, if you can exploit some other vulnerability, why not just attack that directly and cut Java out of the loop? World-writable directories that Java code is loaded from could be used against you, but at that point is it really Java that's the problem, or your sysadmin, or the vendor of some other exploitable service?
The only vulnerabilities that pose remote code potential I've seen in Java over the years have been from native code the VM loads. The libzip vulnerability, the gif file parsing, etc. And that's only been a handful of problems. Maybe one every 2-3 years. And again the vuln is native code loaded by the JVM not in Java code.
As a language Java is very secure. Even the issues I discussed that can theoretically be attacked have hooks in the platform to prevent them. Signing code thwarts most of this. However, very few Java programs run with a Security Manager installed - partly because of performance and usability, but mainly because these vulnerabilities are very limited in scope. Remote code loading in Java never rose to the epidemic levels that buffer overflows did in the late 90s/2000s for C/C++.
Java isn't bullet proof as a platform, but it's harder to exploit than the other fruit on the tree. And hackers are opportunistic and go for that low hanging fruit.
I'm not a security expert, but there are some modules in our company that we can't code in Java because it is so easy to decompile Java bytecode. We looked at obfuscation, but real obfuscation comes with a lot of problems (performance hit/loss of debug information).
One could steal our logic, replace the module with a modified version that returns incorrect results, etc.
So compared to C/C++, I guess this is one "vulnerability" that stands out.
We also have a software license mechanism built into our Java modules, but this can also be easily defeated by decompiling and modifying the code.
Including third-party class files and calling upon them basically means you are running insecure code. That code can do anything it wants if you don't have security turned on.

Security with Java Scripting (JRuby, Jython, Groovy, BeanShell, etc)

I'm looking to run some un-verified scripts (written in a yet-to-be-determined language, but needs to be Java-based, so JRuby, Groovy, Jython, BeanShell, etc are all candidates). I want these scripts to be able to do some things and restricted from doing other things.
Normally, I'd just go use Java's SecurityManager and be done with it. That's pretty simple and lets me restrict file and network access, the ability to shutdown the JVM, etc. And that will work well for the high level stuff I want to block off.
But there is some stuff I want to allow, but only via my custom API/library that I've providing. For example, I don't want to allow direct network access to open up a URLConnection to yahoo.com, but I am OK if it is done with MyURLConnection. That is - there is a set of methods/classes that I want to allow and then everything else I want to be off limits.
I don't believe this type of security can be done with the standard Java security model, but perhaps it can. I don't have a specific requirement for performance or flexibility in the scripting language itself (the scripts will be simple procedural calls to my API with basic looping/branching). So even a "large" overhead that performs a security check on every reflection call is fine by me.
Suggestions?
Disclaimer: I am not an expert on Java Security APIs, so there may be a better way to do this.
I work for Alfresco, a Java-based open-source enterprise CMS, and we implemented something similar to what you describe. We wanted to allow scripting, but only to expose a subset of our Java APIs to the scripting engine.
We chose the Rhino engine for JavaScript scripting. It allows you to control which APIs are exposed to JavaScript, which lets us choose which classes are available and which are not. The overhead, according to our engineers, is on the order of 10% - not too bad.
In addition to this, and this may be relevant to you as well, on the Java side we use Acegi (now Spring Security) and use AOP to give role-based control over which methods a certain user can call. That works pretty well for authorization. So in effect, a user accessing our app through JavaScript has a restricted API available to him in the first place, and that API can then be restricted even further based on authorization. You could use the AOP techniques to further restrict which methods can be called, thus allowing you to expose this in other scripting languages, such as Groovy. We are in the process of adding those as well, having the confidence that our underlying Java APIs protect users from unauthorized access.
You may be able to use a custom class loader that vets linking to classes before delegating to its parent.
You can create your own permissions, check for those in your security sensitive APIs and then use AccessController.doPrivileged to restore appropriate privileges whilst calling the underlying API.
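A sketch of that pattern with a custom Permission; the ApiPermission class and openConnection method are hypothetical names. Worth noting: on modern JDKs the Security Manager is deprecated for removal (JEP 411), and installing one at runtime requires -Djava.security.manager=allow, so for new code the whitelist/AOP approaches described elsewhere in this thread age better:

```java
import java.security.BasicPermission;

// Hypothetical permission guarding one of "my" API methods; the name is made up.
class ApiPermission extends BasicPermission {
    ApiPermission(String name) { super(name); }
}

public class SecureApi {
    // The only sanctioned way for scripts to open a connection.
    public static void openConnection(String host) {
        SecurityManager sm = System.getSecurityManager();
        if (sm != null) sm.checkPermission(new ApiPermission("myapp.network.open"));
        System.out.println("opening " + host); // the real, privileged work would go here
    }

    public static void main(String[] args) {
        // No SecurityManager installed here, so the check is skipped.
        openConnection("example.com");
    }
}
```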
You need to make sure that the scripting engine itself is secure. The version of Rhino in the Sun JDK should be okay, but no guarantees. Obviously you need to make sure everything available to the script is secure.
In Groovy, you can do exactly what you mentioned. Actually very easy. You can easily limit permissions of untrusted scripts running in a trusted environment, allow usage of your own api, which in turn can do untrusted things.
http://www.chrismoos.com/2010/03/24/groovy-scripts-and-jvm-security/
