Some modern network cards support Direct Memory Access for improved performance. How can I utilize this feature from Java?
Does the JVM provide this automatically, or do I need to do an allocateDirect on the ByteBuffers that I am using to talk to that NIC?
Does anyone have documentation that discusses this?
It is the operating systems task to use the DMA feature of the network card. The JVM does not really care how the OS does it, and simply uses the operating system's functions for talking to "network interfaces".
You cannot do this from inside Java in the typical desktop/server JVMs, as this is operating system area which requires you to reach out into C code. Go have a look on JNI or JNA to see how to do this. Please note that this may make your application brittle if you do not get this exactly right.
Yeah - ankon's answer is right. Java operates in a sandbox - a virtual machine (hence the, "VM" in JVM; Sun actually built ONE physical version -- it's on display somewhere).
Java was never designed (intentionally) to reach outside the sandbox, unlike ActiveX, which can go just about anywhere on a PC.
Just think of all the bad things ActiveX has done over the years via a browser. You wouldn't want that to happen with Java, would you?
Although...
you might be able to instantiate an object in Java that does have access to the hardware (like one of those ActiveX controls, or some DLL, for example - which you'd have to write, too).
The problem I see is the throughput. With 100MB or 1000MB cards, would a JVM (remember, this is a VM running on an OS, so you're a couple of layers removed from the hardware) have the speed to handle what's coming in under load? Would you want a Java program holding up data in your NIC while it tinkered with it (think of the impact to the rest of the system)?
At this point, you're probably better off writing the hard-working guts of your solution in C. And, if you still need Java to play with that data, put it in a place where Java can get to it.
If you're not getting the network throughput you need in java, then you're going to need to write a C wrapper in order to access it.
Have you benchmarked your code to find where your performance issues really are? If you let us know that we can likely help you out without resorting to JNI.
Related
I'm facing a project with audio streaming, as a client and a server. Would Java be a good choice for the server app?
I've read in other questions that because of performance C++ is the best choice for this kind of app.
If you are more comfortable with C++ or Java, I would use that. You can write a low pause server in either language.
A streaming server is mostly about passing lots of data from A to B i.e. the I/O matters. Unless you plan to compress the stream on the fly, the CPU performance is unlikely to be important.
Even if you are doing on the fly compression and Java isn't fast enough for that, you can call a library (preferably one already written/tested) to do this via JNI and still write most of the server in Java.
shrug It's not a bad choice. While audio streaming does have a performance component, the algorithms/optimizations you make are going to have a much larger effect than the language you choose.
Not to mention the famous Knuth quote "Premature optimization is the root of all evil". Write in whatever you are most comfortable with and check if it's a problem later.
The biggest issue with Java performance, I think, is garbage collection. Without careful consideration to what you're doing, it's easy in Java to write code that needs to pause every so often to clean up. C++ doesn't have that problem. On the other hand, without consideration of what you're doing, it's easy to write C++ code that leaks heap memory (when you forget to delete something from the heap). This is really bad for a long-running process like a server. It's possible to leak memory in Java, but it's related to keeping references around too long, not to anything built into the language.
Although C++ tends to be faster, with modern just-in-time compilers for Java, the performance difference tends to be overstated. Overall Java is probably just as fine as C++ for a streaming audio server. If you find that there's a bottleneck in some compute-intensive section, you can always drop down to C++ using Java Native Interface. But that should only be after identifying a problem with profiling.
If you did want to use java, here would be a great place to find some uses and files for using media in java...
I am curious about what automatic methods may be used to determine if a Java app running on a Windows or PC is malware. (I don't really even know what exploits are available to such an app. Is there someplace I can learn about the risks?) If I have the source code, are there specific packages or classes that could be used more harmfully than others? Perhaps they could suggest malware?
Update: Thanks for the replies. I was interested in knowing if this would be possible, and it basically sounds totally infeasible. Good to know.
If it's not even possible to automatically determine whether a program terminates, I don't think you'll get much leverage in automatically determining whether an app does "naughty stuff".
Part of the problem of course is defining what constitutes malware, but the majority is simply that deducing proofs about the behaviour of other programs is surprisingly difficult/impossible. You may have some luck spotting particular patterns, but on the whole you can't be confident (and I suspect it's provably impossible) that you've caught all possible attack vectors.
And in the general sphere, catching 95% of vectors isn't really worthwhile when the attackers simply concentrate on the remaining 5%.
Well, there's always the fundamental philosophical question: what is a malware? It's code that was intended to do damage, or at least code that doesn't do what it claims to. How do you plan to judge intent based on libraries it uses?
Having said that, if you at least roughly know what the program is supposed to do, you can indeed find suspicious packages, things the program wouldn't normally need to access. Like network connections when the program is meant to run as a desktop app. But then the network connection could just be part of an autoupdate feature. (Is autoupdate itself a malware? Sometimes it feels like it is.)
Another indicator is if a program that ostensibly doesn't need any special privileges, refuses to run in a sandbox. And the biggest threat is if it tries to load a native library when it shouldn't need one.
But all these only make sense if you know what the code is supposed to do. An antivirus package might use very similar techniques to viruses, the only difference is what's on the label.
Here is a general outline for how you can bound the possible actions your java application can take. Basically you are testing to see if the java application is 'inert' (can't take harmful actions) and thus it probably not mallware.
This won't necessarily tell you mallware or not, as others have pointed out. The app could still do annoying things like pop-up windows. Perhaps the best indication, is to see if the application is digitally signed by an author you trust; if not -- be afraid.
You can disassemble the class files to determine which Java APIs the application uses; you are looking for points where the java app uses the OS. Since java uses a virtual machine, there are well defined points where a java application could take potentially harmful actions -- these are the 'gateways' to various OS calls (for example opening a socket or reading a file).
Its difficult to enumerate all the APIs, different functions which execute the same OS action should require the same Permission. But java's docs don't provide an exhaustive list.
Does the java app use any native libraries -- if so its a big red flag.
The JVM does not offer the ability to run arbitrary code, or use native system APIs; in particular it does not offer the ability to modify the registry (a typical action of PC mallware). The only way a java application can do this is via native libraries. Typically there is no need for a normal application written in java to use native code (unless it needs to use devices).
Check for System.loadLibrary() or System.load() or Runtime.loadLibrary() or Runtime.load(). This is how the VM loads native libraries.
Does it use the network or file system?
Look for use of java.io, java.net.
Does it make system calls (via Runtime.exec())
You can check for the use of java.lang.Runtime.exec() or ProcessBuilder.exec().
Does it try to control the keyboard / mouse?
You could also run the application in a restricted policy JVM (the instructions/tools for doing this are not as simple as they should be) and see what fails (see Oracle's security tutorial) -- note that disassembly is the only way to be sure, just because the app doesn't do anything harmful once, doesn't mean it won't in the future.
This definitely is not easy, and I was surprised to find how many places one needs to look at (for example several java functions load native libraries, not just one).
I'd like to learn how, or if its possible at all to programmaticly interact with a black-box java application(by reading its data). Has there been any previous research/work on doing this sort of thing?
I'd imagine that running on a JVM significantly complicates things.
#anon: Doing this with any JVM is relevant. Do you have to know or control the specifics of how the JVM allocates memory to extract data from a java application?
You could look into java.lang.instrument. As long as you understand the class structure of the application, it will let you modify the methods in an already-running JVM and you may be able to concoct a way that allows you to extract or insert data enough to communicate (depends on the methods available, of course).
The Sable group at McGill University has contributed a lot of research to the Java world.
Much of the work is getting rather dated, but you might find some help in their EVolve project which has the goal of visualizing object-oriented programs. Some of their projects appear to be actively maintained (such as Soot, their Java optimization framework), so you might find luck contacting them directly
It is easily possible with, for example, StackTrace. It can attach to a java process and let you inspect and change almost everything with BeanShell.
I believe what you're looking for is what the Eclipse MAT does. You might want to take a look at the source code...
The HotSpot JVM allows you to hook up an agentlib from a profiler (see Open Source Java Profilers or commercials like Your Kit), in the profiler you can then inspect the memory/cpu/threads etc.
If you want very specific stuff you might want to make your own agentlib that sends you information about the jvm that you need.
How does one secure the Java environment when running on a machine you don't control? What is to stop someone from creating a java agent or native JVMTI agent and dumping bytecode or re-writing classes to bypass licensing and/or other security checks? Is there any way to detect if any agents are running from Java code? From JNI? From a JVMTI agent?
If you don't control the environment, then I'm sorry - you're really stuck. Yes, you could look for trivial JVMTI agents via some sort of cmdline sniffing, but that's the least of your worries. Think about java/lang/Classloader.defineClass() being compromised directly. That's easy to do if you own the box - just replace the .class file in rt.jar. In fact, until JVMTI came around, that was a typical way that profilers and monitoring tools instrumented Java code.
Going back to JVMTI - the "Late attach" feature also allows for JVMTI agents to be loaded on the fly. That might not have happened when you scanned the first time around.
Bottom line - if someone can change the bytes of the JRE on disk, they can do anything they want. Is it ethical, no? Can they get caught? Possibly, but you'll never win the war.
It looks like I can go with a combination of checks inside some custom JNI native code.
1.) cmd line sniffing to search for agents.
2.) Ensure that the cmd-line parameter -XX:+DisableAttachMechanism exists. (this will prevent people from attaching to my running VM)
I remember I once made almost a silent Java Agent. I guess you better look for port scanners or something around that.
Java 2 security, signing of jars etc, gives some level of control over what gets loaded into your application.
However in the end if a malicious person has access to a machine such that they can write to disk then in all probability they have plenty of capacity to do harm without resorting to clever Java hacks.
Turn this round, in any language what can you do to detect Trojans?
Careful access control to the machines you care about is non-trivial but essential if you are serious about such issues. Security specialists may seem paranoid, but that often means that they really understand the risks.
If you can't control the platform, you can't control the software upon it.
Even if you could shut down all the avenues of inspection you've listed, Java is open source. They could just take the source code and recompile it with the necessary changes built-in.
Also, try to remember that while it is your code, it's their machine. They have a right to inspect your code to verify that running it on their machine does what they expect it to do, and doesn't perform "extra" actions which they might find undesirable. Less trustworthy companies in the past have scanned for non-relevant files, copied sensitive information back to their home servers, etc.
I would look at the command line and see, if there are any "-agent" parameters. All profilers, debuggers and other code modificators use this for introspection. You could also check for unusual jars on the bootclasspath, since those might also provide a threat (but be aware that you then also must deliver a custom JVM, since some software like Quicktime adds itself to the bootclasspath of ALL java apps running... (I couldn't belive my eyes when I saw that...))
Basically this is a loosing battle.
Have a look at how visualvm in the Sun JDK works and how it can attach to a running process and redefine whatever it pleases. It is extremely hard to detect that in a portable way, and unless you can do so, you might as well give up on this approach.
The question is, what is it you want to avoid?
I need an alternative to Java, because I am working on a genetics-calculation project.
It takes a lot of memory and the most of the cpu time. And therefore it won´t work when I deploy it on a server, because many people use the program at the same time.
Does anybody know another language that is not running in a virtual machine and is similar to Java (object-oriented, using exceptions and type-safety)?
Best regards,
Jonathan
To answer the direct question: there are dozens of languages that fit your explicit requirements. AmmoQ listed a few; Wikipedia has many more.
And I think that you'll be disappointed with every one of them.
Despite what Java haters want you to think, Java's performance is not much different than any other compiled language. Just changing languages won't improve performance much.
You'll probably do better by getting a profiler, and looking at the algorithms that you used.
Good luck!
If your apps is consuming most of the CPU and memory on a single-user workstation, I'm skeptical that translating it into some non-VM language is going to help much. With Java, you're depending on the VM for things like memory management; you're going to have to re-implement their equivalents in your non-VM language. Also, Java's memory management is pretty good. Your application probably isn't real-time sensitive, so having it pause once in a while isn't a problem. Besides, you're going to be running this on a multi-user system anyway, right?
Memory usage will have more to do with your underlying data structures and algorithms rather than something magical about the language. Unless you've got a really great memory allocator library for your chosen language, you may find you uses just as much memory (if not more) due to bugs in your program.
Since your app is compute-intensive, some other language is unlikely to make it less so, unless you insert some strategic sleep() calls throughout the code to deliberately make it yield the CPU more often. This will slow it down, but will be nicer to the other users.
Try running your app with Java's -server option. That will engage a VM designed for long-running programs and includes a JIT that will compile your Java into native code. It may make your program run a bit faster, but it will still be CPU and memory bound.
If you don't like C++, you might consider D, ObjectiveC or the new Go language from google.
You may try C++, it satisfies all your requirements.
Use Python along with numpy, scipy and matplotlib packages. numpy is a Python package which has all the number crunching code implemented in C. Hence runtime performance (bcoz of Python Virtual Machine) won't be an issue.
If you want compiled, statically typed language only, have a look at Haskell.
Can your algorithms be parallelised?
No matter what language you use you may come up against limitations at some point if you use a single process. Using something like Hadoop will mean you can retain Java and ease of use but you can run in parallel across many machines.
On the same theme as #Barry Brown's answer:
If your application is compute / memory intensive in Java, it will probably be compute / memory intensive in C++ or any other "more efficient" language. You might get some extra leeway ... but you'll soon run into the same performance wall.
IMO, you need to do the following things:
You need profile your application, and look for any major performance bottlenecks. You might find some real surprises.
In the light of the previous step, review the design and algorithms, paying attention to space and time complexity issues. Do some research to see if someone has discovered better algorithms for doing the computations that are problematic from a performance perspective.
If the previous steps don't get you ahead of the curve, see if you can upgrade your platform; get a bigger machine with more processors, more memory, etc.
If you are still stuck, your only other option is a scale-out design. Assuming that individual user requests are processed in a single-threaded, re-architect your system so that you can run "workers" across multiple servers, with a load balancer on the front. If you have a persistent back-end, look into how you can replicate that. And so on.
Figure out if the key algorithms can be parallelized / distributed so that the resource intensive parts of a user request execute in parallel on multiple processors / multiple servers; e.g. using a "map-reduce" framework.
OK, so there is no easy answer. But simply changing programming languages is NOT a good answer.
Regardless of language your program will need to share with others when running in multiple instances on a single machine. That is simply the way computers work.
The best way to allow your current program to scale to use the available hardware resources is to chop your amount of work into small, independent pieces, and make them implement the Callable interface. These can then be executed by a suitable Executor which can then be chosen according to the available hardware. See the Executors class for many preconfigured versions. THis is what I would recommend you to do here.
If you want to switch language then Mac OS X 10.6 allows for programming in the way described above with C and ObjectiveC and if you do it properly OS X can distribute the code over all available computing resources (both CPU and GPU and what have we).
If none of the above is interesting to you, then consider one of the Grid frameworks. Terracotta may be a good place to start.
F# or ruby, or python, they are very good for calculations, and many other things
NASA uses python
Well.. I think you are looking for C#.
C# is Object Oriented and has excellent support for Generics. You can use it do write both WinForm and server-side applications.
You can read more about C# generics here: http://msdn.microsoft.com/en-us/library/ms379564(VS.80).aspx
Edit:
My mistake, geneTIcs, not geneRIcs. It does not change the fact C# will do the job, and using generics will reduce load significantly.
You might find the computer language shootout here interesting.
For example, here's Java vs C++.
You might find Ocaml (from which F# is derived) worth a look; it meets your requirements for OO, exceptions, static types and it has a native compiler, however according to the shootout you may be trading less memory for lower speed.