Related
I would like your opinion on whether this idea sounds good to you, and if not, what you would do instead.
The goal of my project is to make a downloadable application that lets the user input a text file of experimental data, then performs calculations on the data to find statistical values such as the mean, standard deviation, and slope and intercept of the linear regression line. These are presented on the screen, as well as a scatter plot or histogram of the data.
For now, my plan has been to code the interface that the user interacts with in Java using the Swing library, and the part that performs the calculations in C. My reasons for doing this are that Java is good for GUIs that can be used on any machine, and C is faster at performing big calculations. One critical step in my project is to parallelize the code using the MPICH library so that my program can do things like make many sets of randomized data and analyze them. The Java and C code would communicate with each other by inputting and outputting text files, and I have been told that I need to do some shell scripting to bridge the two together. By doing this, I would hope that the Java code would give the C code the text file of the original data, the C code would do the calculations and report the statistical values in the form of a text file, and then the Java code would read this text file to present the results of data analysis to the user.
The important characteristics of this downloadable application are:
Has a very clear, easy-to-use interface
Can be downloaded and used easily, ideally on all kinds of computers (Windows, Mac, Linux)
Takes advantage of parallelization to do big calculations faster
I am not very knowledgeable about these languages or environments, and I am having a few doubts about my plan.
I know that Java programs are easily downloadable in a jar file, but if I use Java and C, will my program still be easily downloadable and able to be used on all machines with the shell scripting?
Would it be best to do all my coding in one language and still preserve the important characteristics listed above? If so, what would I be losing by doing so compared to using two languages?
I appreciate your help!
Please read again quotes from your own post:
First
I am not very knowledgeable about these languages
But:
Java is good for GUIs
And
Java is good for GUIs
So, you do not really have knowledge in Java and C but you can state that Java is good for GUI and C is faster. But both statements are probably wrong.
For the last 15 years people tend to avoid implementing SWING UI and desktop applications in java. They typically try to move the calculations to server and control the process using web based UI. (This probably is not applicable for your use-case if you have indeed large input data sets, e.g. tens or hundreds of GB).
During the same period JVM was significantly improved in terms of performance, so assumption that C code runs faster than the same code written in java could be incorrect.
So, probably you should implement all in java.
If you cannot move the calculations to server you can implement java application and run it using JNLP.
However before you start, may I recommend you to ask other design/architecture question that will contain more details about the amount of your input/output data and the nature of your calculations?
Let's address your characteristics first.
This is debatable, but I would make the claim that while certain languages make clean UI's easier, it is possible to make a good UI in any popular language with proper library support.
Portability - If you are planning to distribute in binary form and not in source form, then Java wins hands down. If you are planning to distribute in source form then C is possible but you will have to provide a means of building the application on each platform, which will usually be different for each platform.
Performance through parallel computation. For your kind of application (CPU Bound), C will likely be faster, but the difference may not matter enough for you to care.
Now let's talk about your doubts.
This is addressed in (2) above.
A single-language program is generally easier to maintain and distribute. You will lose a lot of headache by not dealing with two languages. You might also lose some performance if you choose only Java, but you will lose no portability if you choose only C vs C and Java, since the C part will already be non-portable.
I want to protect some algorithms, from being reversed engineered. I know there is always a risk, but I want to make the work as complicated as possible. I know in Java there are ProGuard and other obfuscator. But the most knowledge isn't in the structure of the application, but in the numerical details of the algorithm. And reading about it, made me doubt on the protection of the algorithm.
Simple renaming some variables, wouldn't make it hard enough to reverse engineer the algorithms. Perhaps you can tell me, which methods would be more appropriate for algorithms and which of obfuscator may do the best work on algorithms.
At the moment I'm thinking about a bit handwork and to combine it with a tool.
Assuming that your algorithm should be executed as Java bytecode, on arbitrary JVMs. Then people can hack their JVM to dump the bytecode somewhere, no matter how much you obfuscate the class loading process. Once you have the bytecode, you can do control flow analysis, i.e. decide what information gets passed from where to where.
You can confuse the order of the individual instructions, but that won't change the computation. For someone who simply wants to run your algorithm unmodified, this doesn't change anything. How much a reordering will prevent people from modifying your algorithm very much depends on the algorithm and the complexity of the control flow.
You might be able to confuse the control flow using reflection in some obscure way, or by implementing your own interpreter and using that to run the algorithm. But both these approaches will likely come at a severe penalty to the performance of the algorithm.
In other languages (like native x86 code) you might be able to confuse the disassembler by introducing ambiguity about how the bytes should be split into instructions, using some bytes as tail part of an instruction in one case, but as a distinct instruction in other cases. But in Java there is no such option, the meaning of bytecode is too well defined.
One way you might be able to obfuscate things somewhat is by closely intermixing the algorithm with other steps of the program. For a straight-line program, this might make things a wee bit harder to track, in particular if you pass numbers through invisible GUI objects or similar bizarre stuff. But once you require loops or similar, getting the loop bounds lined up seems very hard, so I doubt that this approach has much potential either. And I doubt there is a ready-to-use obfuscator for this, so you'd have to do things by hand.
In my exeperience you can use .so file I.e. native implementation with java implementation and it is really hard to track with obfsucated code but only disadvantage is you will have to use JNI for that.
I'm facing a project with audio streaming, as a client and a server. Would Java be a good choice for the server app?
I've read in other questions that because of performance C++ is the best choice for this kind of app.
If you are more comfortable with C++ or Java, I would use that. You can write a low pause server in either language.
A streaming server is mostly about passing lots of data from A to B i.e. the I/O matters. Unless you plan to compress the stream on the fly, the CPU performance is unlikely to be important.
Even if you are doing on the fly compression and Java isn't fast enough for that, you can call a library (preferably one already written/tested) to do this via JNI and still write most of the server in Java.
shrug It's not a bad choice. While audio streaming does have a performance component, the algorithms/optimizations you make are going to have a much larger effect than the language you choose.
Not to mention the famous Knuth quote "Premature optimization is the root of all evil". Write in whatever you are most comfortable with and check if it's a problem later.
The biggest issue with Java performance, I think, is garbage collection. Without careful consideration to what you're doing, it's easy in Java to write code that needs to pause every so often to clean up. C++ doesn't have that problem. On the other hand, without consideration of what you're doing, it's easy to write C++ code that leaks heap memory (when you forget to delete something from the heap). This is really bad for a long-running process like a server. It's possible to leak memory in Java, but it's related to keeping references around too long, not to anything built into the language.
Although C++ tends to be faster, with modern just-in-time compilers for Java, the performance difference tends to be overstated. Overall Java is probably just as fine as C++ for a streaming audio server. If you find that there's a bottleneck in some compute-intensive section, you can always drop down to C++ using Java Native Interface. But that should only be after identifying a problem with profiling.
If you did want to use java, here would be a great place to find some uses and files for using media in java...
Hi guys does anyone know why the programming language C++ is used more widely in biometric security applications compared to the programming language Java? The answers that I have collected so far are 1) Virtual Compilers 2) OpenCV Library provided by C++. Can anyone help with this question??
Maybe it's the hardware support: I wrote an app that uses a fingerprint sensor. The library support for the device is C++, so I wrote the app in C++. Now they have a .NET version, so my next app will be C#.
I don't know specifically about biometric applications, but in general when security is important Java can be a stumbling block. Depending on how the security requirements are written, they can cover things that one must do manually in C++, but which are done automatically by Java. This poses a problem because one would need to demonstrate that Java properly (and in a timely manner!) satisfies the requirement. It is a lot easier to show that these requirements are met in C++ code, because the code the meets the requirement is part of the program in question.
If the security person/requirements/customer make it clear that relying on Java for some security features is acceptable, then this is no big deal. We could go round-and-round about whether or not it is reasonable to rely on/trust Java to satisfy security requirements, it really just depends on the specific security needs.
I am willing to put money on the reason being simply that the access api's for the hardware are written in c++. Most of the modern/higher-level languages are not going to easily communicate with hardware originaly exposed through a C/C++ api.
On a somewhat related note, Vala has all the languages features expected of a modern\high-level language(and then some), but compiles to C binary and source, and can easily make use of any library written in C (not sure about c++). Check it out, I havnt used it much, but its pretty cool.
Implementing a library in C++ provide a lot over java. Once written, C++ library can run on almost any platform (including embedded ones), and can be made available as a native import to a variety of other languages through tools like SWIG. Java can only run on something with enough speed and memory to run a JVM, and the only other Java programs can include the code as a native import. For biometric applications especially I think running on embedded systems would be a large concern, since you could build this into a small sensor.
The more glib answer would be no one wants to wait for your garbage collection cycle to launch the friggen missiles.
You could replace Java with any other language there. Probably it has more to do with the APIs and hardware.
Also, Java is more suited for Web Applications. Its not the best choice for desktop applications.
For some biometric applications, execution speed is crucial.
For instance, let's say you're doing facial recognition for a checkpoint, and Java takes twice the time to run the algorithm that a compiled language like C++ does. That means if you go with Java, either:
The checkpoint lines will be twice as long,
You'll have to pay to staff twice as many checkpoints, or
Your system will do half as good a job at recognizing faces
None of those are usually acceptable options, which makes using Java a non-starter.
How do I lock compiled Java classes to prevent decompilation?
I know this must be very well discussed topic on the Internet, but I could not come to any conclusion after referring them.
Many people do suggest obfuscator, but they just do renaming of classes, methods, and fields with tough-to-remember character sequences but what about sensitive constant values?
For example, you have developed the encryption and decryption component based on a password based encryption technique. Now in this case, any average Java person can use JAD to decompile the class file and easily retrieve the password value (defined as constant) as well as salt and in turn can decrypt the data by writing small independent program!
Or should such sensitive components be built in native code (for example, VC++) and call them via JNI?
Some of the more advanced Java bytecode obfuscators do much more than just class name mangling. Zelix KlassMaster, for example, can also scramble your code flow in a way that makes it really hard to follow and works as an excellent code optimizer...
Also many of the obfuscators are also able to scramble your string constants and remove unused code.
Another possible solution (not necessarily excluding the obfuscation) is to use encrypted JAR files and a custom classloader that does the decryption (preferably using native runtime library).
Third (and possibly offering the strongest protection) is to use native ahead of time compilers like GCC or Excelsior JET, for example, that compile your Java code directly to a platform specific native binary.
In any case You've got to remember that as the saying goes in Estonian "Locks are for animals". Meaning that every bit of code is available (loaded into memory) during the runtime and given enough skill, determination and motivation, people can and will decompile, unscramble and hack your code... Your job is simply to make the process as uncomfortable as you can and still keep the thing working...
As long as they have access to both the encrypted data and the software that decrypts it, there is basically no way you can make this completely secure. Ways this has been solved before is to use some form of external black box to handle encryption/decryption, like dongles, remote authentication servers, etc. But even then, given that the user has full access to their own system, this only makes things difficult, not impossible -unless you can tie your product directly to the functionality stored in the "black box", as, say, online gaming servers.
Disclaimer: I am not a security expert.
This sounds like a bad idea: You are letting someone encrypt stuff with a 'hidden' key that you give him. I don't think this can be made secure.
Maybe asymmetrical keys could work:
deploy an encrypted license with a public key to decrypt
let the customer create a new license and send it to you for encryption
send a new license back to the client.
I'm not sure, but I believe the client can actually encrypt the license key with the public key you gave him. You can then decrypt it with your private key and re-encrypt as well.
You could keep a separate public/private key pair per customer to make sure you actually are getting stuff from the right customer - now you are responsible for the keys...
No matter what you do, it can be 'decompiled'. Heck, you can just disassemble it. Or look at a memory dump to find your constants. You see, the computer needs to know them, so your code will need to too.
What to do about this?
Try not to ship the key as a hardcoded constant in your code: Keep it as a per-user setting. Make the user responsible for looking after that key.
#jatanp: or better yet, they can decompile, remove the licensing code, and recompile. With Java, I don't really think there is a proper, hack-proof solution to this problem. Not even an evil little dongle could prevent this with Java.
My own biz managers worry about this, and I think too much. But then again, we sell our application into large corporates who tend to abide by licensing conditions--generally a safe environment thanks to the bean counters and lawyers. The act of decompiling itself can be illegal if your license is written correctly.
So, I have to ask, do you really need hardened protection like you are seeking for your application? What does your customer base look like? (Corporates? Or the teenage gamer masses, where this would be more of an issue?)
If you're looking for a licensing solution, you can check out the TrueLicense API. It's based on the use of asymmetrical keys. However, it doesn't mean your application cannot be cracked. Every application can be cracked with enough effort. What really important is, as Stu answered, figuring out how strong protection you need.
You can use byte-code encryption with no fear.
The fact is that the cited above paper “Cracking Java byte-code encryption” contains a logic fallacy. The main claim of the paper is before running all classes must be decrypted and passed to the ClassLoader.defineClass(...) method. But this is not true.
The assumption missed here is provided that they are running in authentic, or standard, java run-time environment. Nothing can oblige the protected java app not only to launch these classes but even decrypt and pass them to ClassLoader. In other words, if you are in standard JRE you can't intercept defineClass(...) method because the standard java has no API for this purpose, and if you use modified JRE with patched ClassLoader or any other “hacker trick” you can't do it because protected java app will not work at all, and therefore you will have nothing to intercept. And absolutely doesn't matter which “patch finder” is used or which trick is used by hackers. These technical details are a quite different story.
I don't think there exists any effective offline antipiracy method. The videogame industry has tried to find that many times and their programs has always been cracked. The only solution is that the program must be run online connected with your servers, so that you can verify the lincense key, and that there is only one active connecion by the licensee at a time. This is how World of Warcraft or Diablo works. Even tough there are private servers developed for them to bypass the security.
Having said that, I don't believe that mid/large corporations use illegal copied software, because the cost of the license for them is minimal (perhaps, I don't know how much you are goig to charge for your program) compared to the cost of a trial version.
Q: If I encrypt my .class files and use a custom classloader to load and decrypt them on the fly, will this prevent decompilation?
A: The problem of preventing Java byte-code decompilation is almost as old the language itself. Despite a range of obfuscation tools available on the market, novice Java programmers continue to think of new and clever ways to protect their intellectual property. In this Java Q&A installment, I dispel some myths around an idea frequently rehashed in discussion forums.
The extreme ease with which Java .class files can be reconstructed into Java sources that closely resemble the originals has a lot to do with Java byte-code design goals and trade-offs. Among other things, Java byte code was designed for compactness, platform independence, network mobility, and ease of analysis by byte-code interpreters and JIT (just-in-time)/HotSpot dynamic compilers. Arguably, the compiled .class files express the programmer's intent so clearly they could be easier to analyze than the original source code.
Several things can be done, if not to prevent decompilation completely, at least to make it more difficult. For example, as a post-compilation step you could massage the .class data to make the byte code either harder to read when decompiled or harder to decompile into valid Java code (or both). Techniques like performing extreme method name overloading work well for the former, and manipulating control flow to create control structures not possible to represent through Java syntax work well for the latter. The more successful commercial obfuscators use a mix of these and other techniques.
Unfortunately, both approaches must actually change the code the JVM will run, and many users are afraid (rightfully so) that this transformation may add new bugs to their applications. Furthermore, method and field renaming can cause reflection calls to stop working. Changing actual class and package names can break several other Java APIs (JNDI (Java Naming and Directory Interface), URL providers, etc.). In addition to altered names, if the association between class byte-code offsets and source line numbers is altered, recovering the original exception stack traces could become difficult.
Then there is the option of obfuscating the original Java source code. But fundamentally this causes a similar set of problems.
Encrypt, not obfuscate?
Perhaps the above has made you think, "Well, what if instead of manipulating byte code I encrypt all my classes after compilation and decrypt them on the fly inside the JVM (which can be done with a custom classloader)? Then the JVM executes my original byte code and yet there is nothing to decompile or reverse engineer, right?"
Unfortunately, you would be wrong, both in thinking that you were the first to come up with this idea and in thinking that it actually works. And the reason has nothing to do with the strength of your encryption scheme.