I've used Concurrent Pascal, a tool which helps debug concurrent algorithms because when it runs your code, it randomizes which thread to swap to at every possible step, trying out as many paths as possible.
Is there a JVM that can do this?
Take a look at Java PathFinder (from NASA, no less, and it's free). I think it should do what you need almost out of the box, that is, try different interleavings (some assembly may be required).
Of course, you still need to specify the verification property on your data that you're interested in, like an invariant. Otherwise, by default it would probably only tell you if there was a deadlock. Take a look at the section "Explore Execution Alternatives".
There are no commercial JVMs I'm aware of that do this, but I suggest you look at tools like ConTest that try to help you in your problem domain:
ConTest on developerWorks
ConTest on research site
In general, because most commercial JVMs rely on the OS to do thread scheduling, it's not a natural thing for JVMs themselves to do. There might be something out there for the green-threads versions of Jikes RVM (which might be the older ones).
Related
I was wondering if there is any framework or application out there that can analyze the concurrency of any Java code?
If the tool knows all the implementations of the classes and methods shipped with the JRE, then it comes down to a simple analysis of synchronized blocks and methods and their call hierarchies. From there it can create a Petri net and tell you for sure whether you could ever experience a deadlock.
Am I missing something, or is this really so easy? Then there must be some cool tool doing that kind of stuff? Or would such a tool report too many possible deadlocks that are completely safe because of some underlying program/business logic? Petri nets should be powerful enough to handle these situations?
This would save so many man-hours of searching for bugs that might or might not be related to deadlock issues.
Although (many) concurrency related bugs can be found using static code analysis, it doesn't apply to every type of bug. Some bugs only appear at runtime under certain conditions.
IBM has a tool called ConTest that "schedules the execution of program threads such that program scenarios that are likely to contain race conditions, deadlocks, and other intermittent bugs (collectively called synchronization problems) are forced to appear with high frequency".
This requires running (unit) tests against an instrumented version of your app. More background info in this developerWorks article.
This paper describes a tool that performs static analysis of a library and determines if deadlock is possible.
Some more :
klocwork
CheckThread
There is a specification of Java memory model.
And I want to dive into the source code to actually investigate how those mechanisms are implemented. (e.g., synchronized, volatile, ..., etc.)
But the codebase is so huge, I have no idea where to start with.
(http://www.java2s.com/Open-Source/Java-Document/CatalogJava-Document.htm)
Could anyone give me some clues?
Thanks a lot!
You might start by looking at the synchronizer.cpp file in the current version of the JDK. Prepare yourself a strong pot of coffee: you've picked one of the most complex areas of the JVM to start delving into the source code.
If you haven't already done so, I would also suggest that you take a look at Bill Pugh's page on the Java Memory Model and Doug Lea's recommendations for compiler writers on implementing the Java memory model.
You may also glean something from running the debug JVM with the option to print the JIT-compiled assembly turned on, which you can then inspect. (This won't tell you everything, but if nothing else, some of what it prints should give you terms to search for in the JDK source code.)
I have made a tool that exposes an API to the outside world, but I am not sure whether it is thread safe, and users may want to use it in a multi-threaded environment. Is there any way or tool that I can use to verify whether my API is thread safe in Java?
No. There is no such tool. Proving that a complex program is thread safe is very hard.
You have to analyze your program very carefully to ensure that it is thread safe. Consider buying "Java Concurrency in Practice" (a very good explanation of concurrency in Java).
Stress tests, or static analysis tools like PMD and FindBugs, can uncover some concurrency bugs in your code. So these can show that your code is not thread-safe. However, they can never prove that it is thread-safe.
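A stress test of the kind mentioned above can be sketched roughly as follows. This is only an illustration: the class name StressHarness and its hammer method are made up for this example, and a passing run shows only that no failure was observed, not that the code under test is thread-safe.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper: runs the same action from many threads at once. */
final class StressHarness {
    /** Releases `threads` threads together; each runs `action` `iterations` times. */
    static void hammer(int threads, int iterations, Runnable action) {
        CountDownLatch start = new CountDownLatch(1);       // common starting gate
        CountDownLatch done = new CountDownLatch(threads);  // counts finished workers
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    start.await(); // wait until every worker has been created
                    for (int i = 0; i < iterations; i++) {
                        action.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown(); // open the gate for all workers at once
        try {
            done.await();  // block until every worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

You would pass in a call to your own API, for example hammer(8, 10_000, counter::increment), and then check an invariant such as the final count. A lost update makes the check fail on some runs, which is the most such a test can tell you.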
The most effective method is a thorough code review by developer(s) experienced in concurrency.
You can always stress-test it with tools like jmeter.
But the main problem with threads is that they're mostly unpredictable, so even with stress-tests etc. you can't be 100% sure that it will be totally thread safe.
Resources :
Wikipedia - Thread-safety
This is a variant (a so-called "reduction") of the Halting Problem. Therefore it is provably unsolvable for all non-trivial cases. (Yes, that's an edit)
That means you can find errors by any usual means (statistics, logic) but you can never completely prove that there are none.
I suppose those people saying proving an arbitrary multithreaded program is thread-safe is impossible are, in a way, correct. An arbitrary multithreaded program, coded without following strict guidelines, simply will have threading bugs, and you can't validly prove something that isn't true.
The trick is not to write an arbitrary program, but one with threading logic simple enough to possibly be correct. This then can be unambiguously validated by a tool.
The best such tool I'm aware of is CheckThread. It works on the basis of either annotations or XML config files. If you mark a method as '@ThreadSafe' and it isn't, then you get a compile-time error. This is checked by looking at the byte code for thread-unsafe operations, e.g. read/write sequences on unsynchronised data fields.
It also handles those APIs that require methods to be called on specific threads, e.g. Swing.
It doesn't actually handle deadlocks, but those can be statically eliminated without even requiring annotation, by using a tool such as Jlint. You just need to follow some minimal standards like ensuring locks are acquired according to a DAG, not willy-nilly.
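To illustrate the lock-ordering rule above, here is a minimal sketch in which two locks are always taken in ascending id order, so the "acquired before" relation can never form a cycle. The Account and OrderedTransfer classes are hypothetical and have nothing to do with CheckThread or Jlint.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical account; the id field defines the global lock order. */
final class Account {
    final long id;
    long balance;
    final ReentrantLock lock = new ReentrantLock();

    Account(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

final class OrderedTransfer {
    /** Moves amount between two distinct accounts, always locking the lower id first. */
    static void transfer(Account from, Account to, long amount) {
        if (from.id == to.id) {
            throw new IllegalArgumentException("accounts must have distinct ids");
        }
        // Pick the lock order from the ids, not from the argument order.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

Because transfer(a, b, x) and transfer(b, a, y) both lock the lower-id account first, two threads running them concurrently cannot each hold one lock while waiting for the other.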
You cannot and never will be able to automatically prove that a program is threadsafe, any more than you can prove that a program is correct (unless you think you solved the halting problem, which you didn't).
So, no, you cannot verify that an API is threadsafe.
However, in quite a few cases you can prove that it is broken, which is great!
You may also be interested in automatic deadlock detection, which in quite a few cases simply "just works". I'm shipping a Java program on hundreds of desktops with such a deadlock detector installed and it is a wonderful tool. For example:
http://www.javaspecialists.eu/archive/Issue130.html
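As a rough idea of what such a detector can look like, the standard java.lang.management API can report threads that are currently deadlocked. The sketch below is not the code from the linked article; the class name DeadlockCheck is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper around the JVM's built-in deadlock detection. */
final class DeadlockCheck {
    /** Returns the ids of currently deadlocked threads, or an empty array if there are none. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() also covers java.util.concurrent locks, but it
        // needs synchronizer monitoring support; otherwise fall back to monitors only.
        long[] ids = bean.isSynchronizerUsageSupported()
                ? bean.findDeadlockedThreads()
                : bean.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids; // both methods return null when nothing is deadlocked
    }
}
```

A background thread could call deadlockedThreadIds() every few seconds and log the result of ThreadMXBean.getThreadInfo for any ids it returns. This only detects a deadlock that has already happened; it does not prevent one.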
You can also stress test your application in various ways.
Broken multi-threaded programs tend not to work very well when the system is under high load.
Here's a question I asked about how to easily create a high CPU load on a Un*x system, for example:
Bash: easy way to put a configurable load on a system?
I'd like to learn how, or if it's possible at all, to programmatically interact with a black-box Java application (by reading its data). Has there been any previous research/work on doing this sort of thing?
I'd imagine that running on a JVM significantly complicates things.
@anon: Doing this with any JVM is relevant. Do you have to know or control the specifics of how the JVM allocates memory to extract data from a Java application?
You could look into java.lang.instrument. As long as you understand the class structure of the application, it will let you modify the methods in an already-running JVM and you may be able to concoct a way that allows you to extract or insert data enough to communicate (depends on the methods available, of course).
The Sable group at McGill University has contributed a lot of research to the Java world.
Much of the work is getting rather dated, but you might find some help in their EVolve project, which has the goal of visualizing object-oriented programs. Some of their projects appear to be actively maintained (such as Soot, their Java optimization framework), so you might have luck contacting them directly.
It is easily possible with, for example, StackTrace. It can attach to a Java process and let you inspect and change almost everything with BeanShell.
I believe what you're looking for is what the Eclipse MAT does. You might want to take a look at the source code...
The HotSpot JVM allows you to hook up an agentlib from a profiler (see Open Source Java Profilers, or commercial ones like YourKit); in the profiler you can then inspect the memory, CPU, threads, etc.
If you want very specific stuff, you might want to make your own agentlib that sends you the information you need about the JVM.
I need an alternative to Java, because I am working on a genetics-calculation project.
It takes a lot of memory and most of the CPU time. Therefore it won't work when I deploy it on a server, because many people use the program at the same time.
Does anybody know another language that is not running in a virtual machine and is similar to Java (object-oriented, using exceptions and type-safety)?
Best regards,
Jonathan
To answer the direct question: there are dozens of languages that fit your explicit requirements. AmmoQ listed a few; Wikipedia has many more.
And I think that you'll be disappointed with every one of them.
Despite what Java haters want you to think, Java's performance is not much different than any other compiled language. Just changing languages won't improve performance much.
You'll probably do better by getting a profiler, and looking at the algorithms that you used.
Good luck!
If your app is consuming most of the CPU and memory on a single-user workstation, I'm skeptical that translating it into some non-VM language is going to help much. With Java, you're depending on the VM for things like memory management; you're going to have to re-implement their equivalents in your non-VM language. Also, Java's memory management is pretty good. Your application probably isn't real-time sensitive, so having it pause once in a while isn't a problem. Besides, you're going to be running this on a multi-user system anyway, right?
Memory usage will have more to do with your underlying data structures and algorithms than with something magical about the language. Unless you've got a really great memory allocator library for your chosen language, you may find you use just as much memory (if not more) due to bugs in your program.
Since your app is compute-intensive, some other language is unlikely to make it less so, unless you insert some strategic sleep() calls throughout the code to deliberately make it yield the CPU more often. This will slow it down, but will be nicer to the other users.
Try running your app with Java's -server option. That will engage a VM designed for long-running programs and includes a JIT that will compile your Java into native code. It may make your program run a bit faster, but it will still be CPU and memory bound.
If you don't like C++, you might consider D, Objective-C or the new Go language from Google.
You may try C++, it satisfies all your requirements.
Use Python along with the numpy, scipy and matplotlib packages. numpy is a Python package which has all the number-crunching code implemented in C. Hence runtime performance (a concern because of the Python virtual machine) won't be an issue.
If you want compiled, statically typed language only, have a look at Haskell.
Can your algorithms be parallelised?
No matter what language you use you may come up against limitations at some point if you use a single process. Using something like Hadoop will mean you can retain Java and ease of use but you can run in parallel across many machines.
On the same theme as @Barry Brown's answer:
If your application is compute / memory intensive in Java, it will probably be compute / memory intensive in C++ or any other "more efficient" language. You might get some extra leeway ... but you'll soon run into the same performance wall.
IMO, you need to do the following things:
You need to profile your application, and look for any major performance bottlenecks. You might find some real surprises.
In the light of the previous step, review the design and algorithms, paying attention to space and time complexity issues. Do some research to see if someone has discovered better algorithms for doing the computations that are problematic from a performance perspective.
If the previous steps don't get you ahead of the curve, see if you can upgrade your platform; get a bigger machine with more processors, more memory, etc.
If you are still stuck, your only other option is a scale-out design. Assuming that individual user requests are processed in a single thread, re-architect your system so that you can run "workers" across multiple servers, with a load balancer on the front. If you have a persistent back-end, look into how you can replicate that. And so on.
Figure out if the key algorithms can be parallelized / distributed so that the resource intensive parts of a user request execute in parallel on multiple processors / multiple servers; e.g. using a "map-reduce" framework.
OK, so there is no easy answer. But simply changing programming languages is NOT a good answer.
Regardless of language your program will need to share with others when running in multiple instances on a single machine. That is simply the way computers work.
The best way to allow your current program to scale to use the available hardware resources is to chop your work into small, independent pieces, and make them implement the Callable interface. These can then be executed by a suitable Executor, which can be chosen according to the available hardware. See the Executors class for many preconfigured versions. This is what I would recommend you do here.
If you want to switch language, then Mac OS X 10.6 allows for programming in the way described above with C and Objective-C, and if you do it properly OS X can distribute the code over all available computing resources (both CPU and GPU and what have you).
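The Callable/Executor approach described above might look roughly like this. It is only a sketch: the ChunkedSum class and the sum-of-squares workload are placeholders for whatever independent pieces your genetics calculation breaks into, and the pool size is one plausible choice among many.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example: splits a computation into independent Callable chunks. */
final class ChunkedSum {
    /** Sums the squares of 0..n-1, computing each chunk of the range as a separate task. */
    static long sumOfSquares(int n, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // One worker per available core; other Executors factory methods fit other hardware.
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int start = 0; start < n; start += chunkSize) {
                final int lo = start;
                final int hi = Math.min(n, start + chunkSize);
                tasks.add(() -> {
                    long sum = 0;
                    for (int i = lo; i < hi; i++) {
                        sum += (long) i * i;
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each task touches only its own slice of the range and returns its result through a Future, no locking is needed, and the same tasks can be handed to a different Executor without changing them.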
If none of the above is interesting to you, then consider one of the Grid frameworks. Terracotta may be a good place to start.
F#, Ruby, or Python: they are very good for calculations, and many other things.
NASA uses Python.
Well.. I think you are looking for C#.
C# is object-oriented and has excellent support for generics. You can use it to write both WinForms and server-side applications.
You can read more about C# generics here: http://msdn.microsoft.com/en-us/library/ms379564(VS.80).aspx
Edit:
My mistake, geneTIcs, not geneRIcs. It does not change the fact C# will do the job, and using generics will reduce load significantly.
You might find the computer language shootout here interesting.
For example, here's Java vs C++.
You might find Ocaml (from which F# is derived) worth a look; it meets your requirements for OO, exceptions, static types and it has a native compiler, however according to the shootout you may be trading less memory for lower speed.