I am looking for a Java code profiler which I can use to profile my application (its a service which runs in backend) on production (so means low over head, and it must not slow down my application). Primarily I want calling tree profiling, that is if a() calls b() and then b() calls c(), then how much time a() b() and c() took, both inclusively and exclusively.
Have seen jvisualvm and jprofiler, but this is not what I am looking for, because I cannot tie my production application to them as it will cause a major performance issue.
Also, I did go through metrics (https://github.com/dropwizard/metrics), but it does not give me a functionality to profile the calling tree.
Callgrind (http://valgrind.org/docs/manual/cl-manual.html) type library is what I need, as it gives calling tree profiling functionality and advanced options like avoiding calling cycles (recursion). But I am not sure that Callgrind can be used on production as it dumps data when program terminates.
Can anyone suggest good calling tree profiler for java that can be used on production without compromising the performance?
Take a look at Java Mission Control in conjunction with Flight Recorder. Starting with the release of Oracle JDK 7 Update 40 (7u40), Java Mission Control is bundled with the HotSpot JVM, so it is highly integrated and purports to have small effects on run-time performance. I have only just started looking at it, and I do see some call tree functionality.
In general you don't (or I won't recommend) profilers that instrument your application. Instrumenting always means an uncontrollable production overhead.
What you can use is a sampling profiler. A sampling profiler makes a snapshot of the stack traces at a controllable interval. What you don't get is call counts, but, after some runtime, you get a good overview where you have hotspots. Since you can adjust the sample interval of the profiler you can influence the overhead of it.
A usable sampling profiler is shipped with the JDK, see the hprof page in the java 7 documentation.
In former times there existed some graphical analysis tools for the hprof cpu traces (not the heap traces). Now they are gone. However, you can already work with the generated text file.
I took a quick look on the Java Mission Control stuff mentioned above. I think it is quite mighty and will satisfy a lot of needs, in the white paper they say it has only 2% overhead. However, it is not totally what I personally need or want. For my applications it is better to have a "light" profiling enabled all the time.
Intel Amplifier XE http://software.intel.com/en-us/intel-vtune-amplifier-xe has got low overhead if any noticeable. It uses stack sampling technology to minimize the impact and it can attach and detach to running non-stop processes in production. You even do not need to have sources during profiling, you can dive into sources later after offline performance results browsing.
I don't know of any tool which can do profiling without an impact on performance.
You could add logging to the methods that you're interested in. Make sure you include the time stamp in the log; then you can do the timing. You should also configure the logging framework to log asynchronously to reduce the performance loss.
A load time weaver like AspectJ is able to add these calls at runtime, which would allow you to easily select the methods you want to monitor without changing the source code all the time.
Using an around aspect, you can even add timing logging, so you don't have to parse the logs and try to find matching log entries. See this blog post for details.
Have a look at perfspy (tutorial), it might already do out of the box what you need.
Related:
How to use AOP with AspectJ for logging?
http://mathewjhall.wordpress.com/2011/03/31/tracing-java-method-execution-with-aspectj/
http://www.jroller.com/holy/entry/injecting_timing_aspect_into_junit
Related
I am looking for a Java code profiler which I can use to profile my application (its a service which runs in backend) on production (so means low over head, and it must not slow down my application). Primarily I want calling tree profiling, that is if a() calls b() and then b() calls c(), then how much time a() b() and c() took, both inclusively and exclusively.
Have seen jvisualvm and jprofiler, but this is not what I am looking for, because I cannot tie my production application to them as it will cause a major performance issue.
Also, I did go through metrics (https://github.com/dropwizard/metrics), but it does not give me a functionality to profile the calling tree.
Callgrind (http://valgrind.org/docs/manual/cl-manual.html) type library is what I need, as it gives calling tree profiling functionality and advanced options like avoiding calling cycles (recursion). But I am not sure that Callgrind can be used on production as it dumps data when program terminates.
Can anyone suggest good calling tree profiler for java that can be used on production without compromising the performance?
Take a look at Java Mission Control in conjunction with Flight Recorder. Starting with the release of Oracle JDK 7 Update 40 (7u40), Java Mission Control is bundled with the HotSpot JVM, so it is highly integrated and purports to have small effects on run-time performance. I have only just started looking at it, and I do see some call tree functionality.
In general you don't (or I won't recommend) profilers that instrument your application. Instrumenting always means an uncontrollable production overhead.
What you can use is a sampling profiler. A sampling profiler makes a snapshot of the stack traces at a controllable interval. What you don't get is call counts, but, after some runtime, you get a good overview where you have hotspots. Since you can adjust the sample interval of the profiler you can influence the overhead of it.
A usable sampling profiler is shipped with the JDK, see the hprof page in the java 7 documentation.
In former times there existed some graphical analysis tools for the hprof cpu traces (not the heap traces). Now they are gone. However, you can already work with the generated text file.
I took a quick look on the Java Mission Control stuff mentioned above. I think it is quite mighty and will satisfy a lot of needs, in the white paper they say it has only 2% overhead. However, it is not totally what I personally need or want. For my applications it is better to have a "light" profiling enabled all the time.
Intel Amplifier XE http://software.intel.com/en-us/intel-vtune-amplifier-xe has got low overhead if any noticeable. It uses stack sampling technology to minimize the impact and it can attach and detach to running non-stop processes in production. You even do not need to have sources during profiling, you can dive into sources later after offline performance results browsing.
I don't know of any tool which can do profiling without an impact on performance.
You could add logging to the methods that you're interested in. Make sure you include the time stamp in the log; then you can do the timing. You should also configure the logging framework to log asynchronously to reduce the performance loss.
A load time weaver like AspectJ is able to add these calls at runtime, which would allow you to easily select the methods you want to monitor without changing the source code all the time.
Using an around aspect, you can even add timing logging, so you don't have to parse the logs and try to find matching log entries. See this blog post for details.
Have a look at perfspy (tutorial), it might already do out of the box what you need.
Related:
How to use AOP with AspectJ for logging?
http://mathewjhall.wordpress.com/2011/03/31/tracing-java-method-execution-with-aspectj/
http://www.jroller.com/holy/entry/injecting_timing_aspect_into_junit
The (relatively) new built-in performance monitor/profiler for Java is Mission Control. The Oracle docs advertise that they can be used in production without incurring performance hits (less than 2%):
The tool chain [Mission Control + Flight Recorder] enables developers and administrators to collect and analyze data from Java applications running locally or deployed in production environments.
I have used jvisualvm (VisualVM) for many years now, but never in a production environment due to the putative admonishment that it does incur performance overhead.
So I ask: What is so different between Mission Control (and its Flight Recorder) and VisualVM that allows MC/FR to not hinder performance? Or do they not include certain features/capabilities that VisualVM delivers?
The main performance difference in the method profiling is that MC/JFR uses sampling, and only samples a few threads per sampling interval. It uses a similar approach to AsyncGetCallTrace (see for example http://psy-lob-saw.blogspot.com/2016/06/the-pros-and-cons-of-agct.html)
Since I work with MC/JFR, I'm not as familiar with how VisualVM does it's sampling profiling, but I believe it's not using the same method.
MC/JFR has it's data gathering engine deeply integrated into the HotSpot JVM, VisualVM uses external APIs/MXBeans. This also helps JFR to lower the performance overhead.
Generally, JFR is designed to find the hot spots, rather than gathering data that 100% correct but might slow your application down and affect the actual behavior. This goes for both the method and allocation sampling, as well as other information about latency events (wait/sleep/block), where only the events above a certain threshold are recorded. I'm less familiar with how this compares for VisualVM.
Other than that, the two tools have different feature sets, none of which is a superset of the other.
I had an old application, a JAR file, that went through some enhancements. Basically some parts of the code had to be modified along with modifying some of the logic.
Comparing the OLD version against the NEW version, the NEW version is about 2X slower than the old one.
I'm trying to narrow down whats causing the slow down, but I'm finding myself measuring the time for certain for-loops using System.println with System.currentTimeMillis(). This is really getting very tedious.
Is there a Java performance tool that will help me in figuring out why the NEW JAR is about 2X slower than the old one?
Thanks in advance.
JProfiler has the capability to compare CPU snapshots. Record the execution for the old and the new JAR file and save snapshots (if the JVM exits at the end, configure a "JVM exit" trigger that saves a snapshot).
Then open the snapshot comparison window with "Session->Compare Snapshots in New Window" and add the two snapshot. A hot spots comparison will look like this (a view filter is set in this case):
It will immediately show you which methods are responsible for the increase in execution time.
Another way to analyze the differences in execution time is the call tree comparison which will look like this:
Disclaimer: My company develops JProfiler.
You should use a profiler. This will show you which methods are taking the most time (and what is calling them), without you having to guess which ones to measure.
Java comes with a built-in profiler called hprof, but see also:
https://stackoverflow.com/questions/14762/please-recommend-a-java-profiler
5 things you didn't know about ... Java performance monitoring
The JConsole and VisualVM tools
Depending on how long-running the process is, I'd think about Visual VM 1.3.3. If you download all the plugins, you'll be able to see heap, threads, objects, etc. That ought to help, and it won't cost a dime.
I believe it assumes the Oracle/Sun JVM.
A profiler tool like YourKit or something to measure performance reliably like Hyperic's Sigar is a good canditate for your case. Have a look at those tools.
The former will find bottlenecks in your code and/or memory leaks (not all of them) while as the latter is an API that you can measure performance reliably since Oracle's JVM & OpenJDK have no way of getting perfomance metrics reliably/consistently/accurately (like CPU wall clock time or CPU time spent from the application, memory usage, application threads, etc).
By default, Java provides packages for these things.
For example:
java.lang.management.ManagementFactory
java.lang.management.ThreadMXBean
but depending on your case they may or may not be adequate (keep in mind they are OK for most cases unless we are talking about something critical).
For the profiler which I implement using JVMTI I would like to start measuring the execution time of all Java methods. The JVMTI offers the events:
MethodEntry
MethodExit
So this would be quite easy to implement, however I came across this note in the API:
Enabling method entry or exit events will significantly degrade performance on many platforms and is thus not advised for performance critical usage (such as profiling). Bytecode instrumentation should be used in these cases.
But my profiling agent works headless, which means the collected data is serialized and sent via socket to a server application displaying the results. How should I realize this using byte code instrumentation. I am kind of confused how to go on from here. Could someone explain to me, if I have to switch the strategy or how can I approach this problem?
I don't know about the Sun JVM but the IBM JVM goes into what we call FullSpeedDebug mode when you request the MethodEntry/Exit events.... FSD slows down execution quite a bit.
As you say you can use BCI as my profiler does but unless you are selective about which methods you instrument you will also see a slow down. For example my profiler inserts a if(profiling) callProfilerHook() on every entry and all of the possible exits in a method all object creates and some other areas as well.... These additional checks can slow down execution by over 50%...
As for how to BCI... well I wrote my own C library to do it... it's technically not hard (hint just delete the StackMapTable) but I may take you a while.. Alternatively you can use ASM et. al.
Finally... you callBackHook will add overhead and on small methods render the reported CPU/Clock time meaningless unless you perform some sophisticated overhead calculation... even if you do this your callback code affects the shape of the processor L1 caches and the Java code becomes less efficient because it has less room..
My profiler basically ignores the reported times as I visualize the execution in an interesting way... I'm looking to understand the flow of all of the code, in fact in most cases what code is running (most Java projects have no idea of the millions on lines of third-party code running in their app)
I am calling a vendor's Java API, and on some servers it appears that the JVM goes into a low priority polling loop after logging into the API (CPU at 100% usage). The same app on other servers does not exhibit this behavior. This happens on WebSphere and Tomcat. The environment is tricky to set up so it is difficult to try to do something like profiling within Eclipse.
Is there a way to profile (or some other method of inspecting) an existing Java app running in Tomcat to find out what methods are being executed while it's in this spinwait kind of state? The app is only executing one method when it gets in this state (vendor's method). Vendor can't replicate the behavior (of course).
Update:
Using JConsole I was able to determine who was running and what they were doing. It took me a few hours to then figure out why it was doing it. The problem ended up being that the vendor's API jar that was being used did not match exactly to the the database configuration that it was using. It was defaulting to having tracing and performance monitoring enabled on the servers that had the slight mis-match in configuration. I used a different jar and all is well.
So thanks, Joshua, for your answer. JConsole was extremely easy to setup and use to monitor an existing application.
#Cringe - I did some experimenting with some of the options you suggested. I had some problems with getting JProfiler set up, it looks good (but pricey). Going forward I went ahead and added the Eclipse Profiler plugin and I'll be looking over the different open source profilers to compare functionality.
If you are using Java 5 or later, you can connect to your application using jconsole to view all running threads. jstack also will do a stack dump. I think this should still work even inside a container like Tomcat.
Both of these tools are included with JDK5 and later (I assume the process needs to be at least Java 5, though I could be wrong)
Update:
It's also worth noting that starting with JDK 1.6 update 7 there is now a bundled profiler called VisualVM which can be launched with 'jvisualvm'. It looks like it is a java.net project, so additional info may be available at that page. I haven't used this yet but it looks useful for more serious analysis.
Hope that helps
Facing the same problem I used YourKit profiler. It's loader doesn't activate unless you actually connect to it (though it does open a port to listen for connections). The profiler itself has a nice "get amount of time spent in each method" while working in it's less obtrusive mode.
Another way is to detect CPU load (via JNI, so you'd need an external library for this) in a "watchdog" thread with highest priority and start logging all threads when the CPU is high enough for a long enough time. You might find this article enlightining.
If it's for professional purpose and you have some money to spend, try to get your hands on JProfiler. If you just want to get some insights, try out the Eclipse Profiler Plugin. I used it several times, but I don't know the current state.
A new(?) project from the eclipse project itself is available too: http://www.eclipse.org/tptp/ (See this article). Never used it, so I can't tell if it is worth the effort.
There's also a very good list of open source profilers available at http://www.manageability.org/blog/stuff/open-source-profilers-for-java
If JConsole can't be used you can
press CTRL+BREAK under Windows
send kill -3 <process id> under Linux
to get a full Thread Dump. This doesn't affect performance and can always be run in production.
JRockit Mission Control Latency Analyzer.
The Latency Analyzer that comes with JRockit shows you what the JVM is "doing" when it's not doing anything. In the latest version you can see latencies for:
Java wait/blocked/sleep/parked.
File I/O
Network I/O
Memory allocation
GC pauses
JVM latencies, e.g code generation and class loading
Thread suspension
The tool will give you the stack trace when the latency occurred. You can view the latency data in many different ways (aggregated traces, as a histogram, in a thread graph etc.). The tool also allows you to see transitions between threads, for instance when one thread notifies another.
latency analyzer http://blogs.oracle.com/hirt/WindowsLiveWriter/The.0LatencyAnalyserMigratedfromtheoldBE_7246/latency_graph_2.png
The overhead is negligible and unlike many other tools it can be used in a production environment.
This blog post gives you a brief introduction and the program can be downloaded here.
It's free to use for development!
Use a profiler. Yes they cost money, and using them can occasionally be a bit awkward, but they do provide you with a great deal more real evidence rather than guesswork.
Human beings are universally bad at guessing where performance bottlenecks are. It just seems to be something our brains aren't build to do very well. It may seem obvious, you may have great ideas about what the problem is, but the real world often turns out to be doing something different. And optimising the wrong part of code means, at best, lots of work for minimal benefit. More often it makes things slower, and sometimes it breaks things entirely. So before you make any changes for the sake of optimisation, you should always have real evidence from a profiler or other accurate tool.
As mentioned, both JProfiler and YourKit are both fairly good and not prohibitively expensive. Last time I looked, they both had free demos too.
For completeness sake: even though my company more or less standardizes on Eclipse we use Netbeans (6 and up) with its included, free profiler on a daily basis. It works better than the Eclipse TPTP plugin (last checked 3 months ago) and for us it removes any need for a commercial profiler such as JProfiler, which is excellent, but fast becoming unnecessary.
VisualVM should be the profiler from netbeans as standalone. I tried the TPTP for eclipse but visualVm seems as a much nicer option!