I'm attempting to profile a Java web-search program called Nutch from source. As far as I understand it, to profile I need to enable profiling in the compiler so it generates a profile file that can be opened in a program such as gprof. How do I do this if all I do to compile the software is run Ant within the source root directory?
If you're running a recent JDK (1.6 update 7 or later), you don't need to do anything to prepare your Java process for profiling. Simply use JVisualVM (which ships with the JDK) to attach to your process, and click the Profile button.
You say in response to Charlie's answer that ideally you would like information about how the program spends its time.
There's another viewpoint - you need to know why the program spends its time.
The reason each cycle is spent is a chain of reasons, where each link is a line of code on the call stack. The chain is no stronger than its weakest link.
Unless the program is as fast as possible, you've got "bottlenecks".
For example, if a "bottleneck" is wasting 20% of the time, then it consists of an optimizable line of code (i.e. one whose cost is poorly justified) that is on the stack 20% of the time. All you have to do is find it.
If 10,000 samples of the stack are taken, it will be on about 2,000 of them. If 10 samples are taken, it will be on 2 of them, on average.
In fact, if you randomly pause the program several times and study the call stack each time, and you see an optimizable line of code on as few as 2 samples, you've found a "bottleneck".
You can fix it, get a nice speedup, and repeat the whole process.
That is the basis of this technique.
Regardless, thinking in terms of gprof concepts will not serve you well.
You're really asking an Ant question here. You can add command-line flags for the compiler in the Ant build file for the compile target; see the nested <compilerarg> element of the <javac> task in the Ant manual.
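For example, a hedged sketch of what that could look like in the build file (the target name and the srcdir/destdir paths are placeholders; check Nutch's actual build.xml):

    <target name="compile">
        <javac srcdir="src/java" destdir="build/classes" debug="true">
            <!-- extra flags are passed straight through to javac -->
            <compilerarg value="-Xlint:all"/>
        </javac>
    </target>

That said, as noted above, you don't need any special compiler flags to profile Java; the bytecode already carries everything a JVM profiler needs.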
There are a lot of good Java profiling tools, by the way; a quick web search for "Java profiler" will turn up plenty.
I've been developing a Reporting Engine (RE, which generates PDF reports) in C++ on Linux. If a PDF report must contain charts, I need to build them while building the report. The chart builder is written in Java (with JFreeChart or Java TeeChart; it does not matter which) and is packed into a .jar file. While RE is building a report, it invokes ChartBuilder API functions via JNI to build each chart step by step.

The problem is that it takes a lot of time to build the first chart, that is, to execute each ChartBuilder API function for the first time in the process lifetime. More specifically, it takes about 1.5 seconds to build the first chart; if there are several charts to create, the rest are built in about 0.05 to 0.1 seconds each, roughly 30 times faster than the first one! It's worth noting that the first chart is the same as the rest of them (except for the data). The problem seems to be fundamental to Java (and I'm not very experienced with this platform).
(The original question included a picture illustrating these timings.)
I wonder if there is a way to speed up the first execution. It would be great to understand how to avoid the first-execution overhead altogether, because right now it hampers the overall performance of RE.
In addition, I'd like to describe how it works: somebody invokes the C++ RE::CreateReport function with all the needed parameters. This function creates a JVM if one is needed and makes requests to it via JNI. When the report is done, the JVM is destroyed.
Thanks in advance!
Just-in-time compilation. The first call pays for class loading and JIT compilation of everything on the chart-building path; keep your JVM alive as a service instead of destroying it after every report, so that cost is paid once per process rather than once per report.
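If destroying the JVM after each report is unavoidable, a fallback is to pay the cost once, right after JVM creation, before the first real chart is requested. A minimal warm-up sketch, assuming JFreeChart (the dummy pie chart merely forces class loading and JIT compilation of the rendering path):

    import org.jfree.chart.ChartFactory;
    import org.jfree.chart.JFreeChart;
    import org.jfree.data.general.DefaultPieDataset;

    public final class ChartWarmup {
        // Call once via JNI right after the JVM is created.
        public static void warmUp() {
            DefaultPieDataset dataset = new DefaultPieDataset();
            dataset.setValue("warmup", 1.0);
            JFreeChart chart = ChartFactory.createPieChart(
                    "warmup", dataset, false, false, false);
            chart.createBufferedImage(100, 100); // pulls in the rendering classes
        }
    }

Note this doesn't remove the ~1.5 s cost; it only moves it ahead of the first user-visible chart.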
I think it is likely a combination of things, as people have pointed out in the comments and the other answer: JVM start-up, class loading, the fact that Java "interprets" your code when it first runs it, and so on.
Most fall into the category of "first-time start-up" overhead, hence the higher performance on subsequent runs.
I would personally be inclined to agree with Thomas (in the comments to your question) that the highest overhead is possibly the class loader.
There are tools you can use to profile the Java JVM to get a feel for what is taking the most time within the JVM itself - such as:
visualvm (http://visualvm.java.net)
JVM monitor (http://jvmmonitor.org)
You have to be careful using these tools and interpret the results with some thought. You may want to measure first runs and subsequent runs separately, and you may also want to add your own timings into the C++ code that wraps the JNI calls, to get a better picture of the end-to-end times. With performance monitoring, multiple test runs are important, to allow for individual runs being slow or fast for one reason or another (e.g. other load on the computer, even on a non-shared laptop).
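For the in-code timings, something as simple as System.nanoTime around each call is enough. A sketch, where buildChart stands in for whichever ChartBuilder entry point you are measuring:

    long t0 = System.nanoTime();
    buildChart(data); // hypothetical ChartBuilder entry point
    long elapsedMs = (System.nanoTime() - t0) / 1000000L;
    System.err.println("buildChart: " + elapsedMs + " ms");

Log first calls and subsequent calls separately so the one-time start-up cost doesn't skew the averages.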
As LeffeBrune mentions, if you can have the chart builder already running as a service, it will likely speed up the first run. You will probably need to experiment, though, to see how much difference it makes if the service hasn't actually been scheduled on a processor for a while, for example.
What I want to do is generate a call tree with CPU timing information for a Java application as it goes through a scripted task. The idea is to see how much time is spent in each part of the code, and how this changes when I change the code or the task, but to do so in a consistently repeatable way.
In Java VisualVM I can do this interactively by clicking to start and stop profiling, but I would like to automate the process so I can get more consistent results (and not get so bored). Can VisualVM do this, or is there another profiler that can?
If I were a profiler vendor I would have to be concerned about providing people what they think they want, even if what they think they want does not solve the problem they have.
The thing is, only some problems can be found by knowing how long routines typically take; if you ignore the ones you can't find that way, they will come to dominate your program's running time.
An example of what I mean is this recent example:
A program spends 50% of its wall-clock time reading .dll files to look up string resources to get the names of files so that the strings can be displayed on a splash screen so the user can see that something is happening during application startup. That means, if there were some other way to provide eye-candy to the user, the app could start up twice as fast.
During this process, the call stack is typically 15-20 functions deep, so it's really hard to tell what's going on just by having timing numbers for the functions.
What makes the problem difficult is that it is semantic. No particular routine is "hot" in a way that it could be sped up.
The only "hot" thing is the general description, overall, of what the program is doing, and no tool can isolate it for you.
Only you can recognize it.
However, if you simply interrupted the program and examined the call stack during startup, the probability is 50% that you would see the entire explanation for the time being spent.
If you do that several times, you're using the random-pausing technique: some programmers rely on it because it finds every problem profilers can find, and more; others look down on it because it isn't a tool.
You can do it interactively, or extract a small number of stack samples with something analogous to pstack.
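On the JVM, the pstack analogue is jstack <pid>; you can also grab a snapshot from inside the process. A minimal sketch:

    import java.util.Map;

    public final class StackSnapshot {
        // One "pause": print every live thread's current call stack.
        public static void dump() {
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : Thread.getAllStackTraces().entrySet()) {
                System.err.println(e.getKey().getName());
                for (StackTraceElement frame : e.getValue()) {
                    System.err.println("    at " + frame);
                }
            }
        }
    }

A handful of such snapshots taken while the program is busy is all the technique needs.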
External tools are giving me trouble. Is there a way to get simple CPU usage/time-spent-per-function figures without some external GUI tool?
I've been trying to profile my Java program with VisualVM, but I'm having terrible, soul-crushing, ambition-killing results. It will only display heap usage; what I'm interested in is CPU usage, but that panel simply says "Not supported for this JVM." It doesn't tell me which JVM to use, by the way. I've downloaded JDK 6 and launched VisualVM with it, and made sure my program targets the same VM, but nothing! Still the same unhelpful error message.
My needs are pretty simple. I just want to find out where the program is spending its time. Python has an excellent built-in profiler that prints out where time was spent in each function, in both per-call and total-time formats. That's really the extent of what I'm looking for right now. Anyone have any suggestions?
It's not pretty, but you could use the built in hprof profiling mechanism, by adding a switch to the command line.
-Xrunhprof:cpu=times
There are many options available; see the Oracle documentation page for HPROF for more information.
So, for example, if you had an executable jar you wanted to profile, you could type:
java -Xrunhprof:cpu=times -jar Hello.jar
When the run completes, you'll have a (large) text file called "java.hprof.txt".
That file will contain a pile of interesting data, but the part you're looking for is the part which starts:
    CPU TIME (ms) BEGIN (total = 500) Wed Feb 27 16:03:18 2013
    rank   self  accum   count  trace  method
       1  8.00%  8.00%    2000 301837  sun.nio.cs.UTF_8$Encoder.encodeArrayLoop
       2  5.40% 13.40%    2000 301863  sun.nio.cs.StreamEncoder.writeBytes
       3  4.20% 17.60%    2000 301844  sun.nio.cs.StreamEncoder.implWrite
       4  3.40% 21.00%    2000 301836  sun.nio.cs.UTF_8.updatePositions
Alternatively, if you've not already done so, I would try installing the VisualVM-Extensions, VisualGC, Threads Inspector, and at least the Swing, JVM, Monitor, and Jvmstat Tracer Probes.
Go to Tools->Plugins to install them. If you need more details, comment, and I'll extend this answer further.
We have a Java ERP-type application. Communication between server and client is via RMI. In peak hours there can be up to 250 users logged in, with about 20 of them working at the same time. This means about 20 threads are live at any given time during peak hours.
The server can run for hours without any problems, but all of a sudden response times climb higher and higher, sometimes into the minutes.
We are running on Windows 2008 R2 with Sun's JDK 1.6.0_16. We have been using perfmon and Process Explorer to see what is going on. The only thing we find odd is that when the server starts to slow down, the number of handles the java.exe process has open is around 3,500. I'm not saying this is the actual problem.
I'm just curious whether there are guidelines I should follow to be able to pinpoint the problem. What tools should I use?
Can you access the log configuration of this application?
If you can, change the log level to DEBUG; tracing the DEBUG logs of a request could give you useful information about the contention point.
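For example, if the application happens to use log4j, the change is a single line in log4j.properties (the package name here is a placeholder for the application's own):

    log4j.logger.com.yourcompany.erp=DEBUG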
If you can't, profiling tools can help you:
VisualVM (Free, and good product)
Eclipse TPTP (Free, but more complicated than VisualVM)
JProbe (not free but very powerful; it is my favorite Java profiler, though expensive)
If the application has been developed with JMX control points, you can plug in a JMX viewer to get information...
If you want to stress the application to trigger the problem (to verify whether it is a load problem), you can use a stress tool like JMeter.
Sounds like the garbage collector cannot keep up and starts "stop-the-world" collections for some reason.
Attach with jvisualvm (in the JDK) when the server starts, and have a look at the collected data when the performance drops.
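To confirm the GC theory, you can also start the server with HotSpot's GC logging switched on and check whether the slowdowns coincide with long collections (replace ... with your usual server command line):

    java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log ...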
The problem you're describing is quite typical, but general as well. Causes can range from memory leaks and resource contention to bad GC policies and heap/PermGen-space sizing. To pin down the exact problems in your application, you need to profile it (I am aware of tools like YourKit and JProfiler). Even if you profile wisely, only some application cycles will reveal the problems; profiling isn't easy in itself.
In a similar situation, I coded a simple profiling aid myself. Basically I used a ThreadLocal holding a "StopWatch" (based on a LinkedHashMap), and I insert code like this at various points in the application: watch.time("OperationX");
Then, after the thread finishes a task, I call watch.logTime(), and the class writes a log line that looks like this: [DEBUG] StopWatch time:Stuff=0, AnotherEvent=102, OperationX=150
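A minimal sketch of what such a class might look like, using the names above (illustrative, not my actual code):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public final class StopWatch {
        private static final ThreadLocal<StopWatch> CURRENT =
                new ThreadLocal<StopWatch>() {
                    @Override protected StopWatch initialValue() {
                        return new StopWatch();
                    }
                };

        // LinkedHashMap keeps events in the order they were recorded.
        private final Map<String, Long> times = new LinkedHashMap<String, Long>();
        private long last = System.currentTimeMillis();

        public static StopWatch get() { return CURRENT.get(); }

        // Record the time elapsed since the previous event under this label.
        public void time(String label) {
            long now = System.currentTimeMillis();
            times.put(label, now - last);
            last = now;
        }

        public void logTime() {
            StringBuilder sb = new StringBuilder("[DEBUG] StopWatch time:");
            boolean first = true;
            for (Map.Entry<String, Long> e : times.entrySet()) {
                if (!first) sb.append(", ");
                sb.append(e.getKey()).append('=').append(e.getValue());
                first = false;
            }
            System.err.println(sb);
            times.clear();
            last = System.currentTimeMillis();
        }
    }

Checkpoints then look like StopWatch.get().time("OperationX"), and the task ends with StopWatch.get().logTime().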
After this I wrote a simple parser that generates CSV from the log (per code path). The best thing you can do is create a histogram (easily done in Excel); averages, medians and even modes can fool you, so I highly recommend the histogram.
Together with the histogram, you can plot line graphs using the average/median/mode, whichever represents the data best (you can determine that from the histogram).
This way you can be sure exactly which operation is taking the time. If you can't determine the culprit, binary search is your friend: make the events finer-grained.
It might sound primitive, but it works. And if you make a library out of it, you can use it in any project; it's also easy to turn on in production.
Aside from the GC that others have mentioned, try taking thread dumps every 5-10 seconds for about 30 seconds during the slowdown. There could be a case where a DB call, web service, or some other dependency becomes slow. If you look at the thread dumps, you will be able to see threads which don't appear to move, and you can narrow down your culprit that way.
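If attaching a tool is awkward in production, here is a hedged in-process sketch of the same idea, run on a background thread when the slowdown begins (the class name is mine):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public final class PeriodicThreadDumper implements Runnable {
        public void run() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            try {
                for (int i = 0; i < 6; i++) { // six dumps, 5 s apart
                    for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
                        System.err.println(info.getThreadName()
                                + " " + info.getThreadState());
                        for (StackTraceElement frame : info.getStackTrace()) {
                            System.err.println("    at " + frame);
                        }
                    }
                    Thread.sleep(5000);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

Start it with new Thread(new PeriodicThreadDumper()).start(); threads stuck on the same frame across dumps are your suspects.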
From the GC standpoint, do you monitor your CPU usage during these episodes? If the GC is running frequently, you will see a jump in overall CPU usage.
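jstat from the JDK can show GC activity on a live process with no set-up at all; for example, sampling every 5 seconds:

    jstat -gcutil <pid> 5000

If the FGC/FGCT columns (full-collection count and time) climb rapidly during a slowdown, the GC is your problem.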
If only this was a Solaris box, prstat would be your friend.
For acute issues like this a quick jstack <pid> should quickly point out the problem area. Probably no need to get all fancy on it.
If I had to guess, I'd say HotSpot jumped in and tightly optimised some badly written code. NetBeans grinds to a halt where it uses a WeakHashMap with newly created objects as keys to cache file data: once the code is optimised, the entries can be removed from the map straight after being added, and if the cache is being relied upon, much file activity follows. You probably won't see the drive light up, because it'll all be cached by the OS.
I've noticed recently that a few Java libraries (the JDK, Joda-Time, iText) are compiled without some or all of the debugging information: either the local-variable info is missing, or both the local-variable info and the line numbers are missing.
Is there any reason for this? I realise debugging information makes compiled code larger, but I don't believe that's a particularly large consideration. Or are they just building with the default compile options?
Thanks.
By default, javac includes only partial debugging information (line numbers and the source-file name, but no local variables); you must specifically tell the compiler to include everything. There are several reasons why most people omit it:
Some libraries are used in embedded systems (like mobile phones). Until recently, every bit counted. Today, most mobiles come with more memory than all computers in 1985 had together ;)
When compiled with debugging active, the code runs 5% slower. Not much but again, in some cases every cycle counts.
Today's Senior Developers were born in a time when 64KB of RAM was enormous. Yesterday, I added another 2TB drive to my server in the cellar. That's 7 orders of magnitude in 25 years. Humans need more time to adjust.
[EDIT] As John pointed out, Java bytecode isn't optimized (much) anymore today, so the class files will be the same in both cases (except that the one with debug information is bigger). The code is optimized by the JIT at runtime, which lets the runtime tailor the code to the CPU, the memory (amount and layout), etc.
The 5% penalty mentioned above applies when you run the code with the command-line options that allow a remote debugger to attach to the process. If you don't enable remote debugging, there is no penalty (except for class loading, but that happens only once).
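For reference, the javac switches involved (standard options, shown here for a hypothetical Foo.java):

    javac -g Foo.java               (everything: source file, line numbers, local variables)
    javac -g:lines,source Foo.java  (equivalent to the default)
    javac -g:none Foo.java          (no debug information at all)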
Probably installation size. Debug information adds bulk to the JAR files, which Sun probably didn't like.
I had to investigate a Java Web Start issue recently with no debug information available; adding full tracing to the Java console and downloading the source code helped some, but the code is rather tangled, so I'd just like a debug build.
The JDK should be compiled with FULL debug information everywhere!