I have refactored and tuned my Java application.
I now want to compare the performance of the newer and older versions of the application in terms of their individual CPU and heap memory usage.
I am using VisualVM and JDK 1.7. I run each version individually and monitor it with VisualVM. At the end, all I have is two sets of graphs, which makes it difficult to decide which one is better.
Is there a metric that VisualVM provides that would make it easier to decide which version performs better?
Something like average CPU used or average heap used (if that is an accurate way of measuring performance). Thanks!
The Platform MBeans provide access to the CPU, memory and garbage collection data, so it's possible to collect data from there and run statistical analysis against it:
CPU and Memory: http://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html
Garbage Collection:
http://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/GarbageCollectorMXBean.html
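As a rough sketch of that idea, the class below polls the platform MBeans from inside the running JVM and reduces the samples to two averages. The class name UsageSampler, the method sample and the sampling parameters are made up for illustration, and the cast to com.sun.management.OperatingSystemMXBean assumes a HotSpot-based JVM. To watch a separate process you would read the same MBeans over a JMX connection instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper (not part of VisualVM or the JDK): samples the platform
// MBeans of the current JVM at a fixed interval and reports simple averages.
class UsageSampler {

    /** Returns {average process CPU load in [0,1], average heap used in bytes}. */
    static double[] sample(int samples, long intervalMillis) {
        // Assumption: HotSpot-based JVM. The cast fails on JVMs without this extension.
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        double cpuSum = 0;
        int cpuCount = 0;
        double heapSum = 0;
        int heapCount = 0;
        for (int i = 0; i < samples; i++) {
            double load = os.getProcessCpuLoad(); // negative while no value is available yet
            if (load >= 0) {
                cpuSum += load;
                cpuCount++;
            }
            heapSum += memory.getHeapMemoryUsage().getUsed();
            heapCount++;
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop sampling early, keep what we have
                break;
            }
        }
        double avgCpu = cpuCount == 0 ? Double.NaN : cpuSum / cpuCount;
        double avgHeap = heapCount == 0 ? Double.NaN : heapSum / heapCount;
        return new double[] {avgCpu, avgHeap};
    }

    public static void main(String[] args) {
        double[] avg = sample(20, 500);
        System.out.printf("avg process CPU load: %.1f%%, avg heap used: %.1f MB%n",
                avg[0] * 100, avg[1] / (1024.0 * 1024.0));
    }
}
```

Running the same workload against both versions and comparing the two averages gives a single number per metric, although averages hide spikes, so a maximum or a percentile may be worth recording as well.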
Suppose my environment is Java 1.8 and my application is a batch application with no latency requirement. Should I choose Parallel GC or G1 GC?
I understand that Parallel GC is optimized for throughput and is more suitable for batch applications like mine, but all the Java applications around me use the G1 garbage collector. So I am not sure whether G1 makes Parallel GC unnecessary, or whether Parallel GC is still the better choice when throughput is the goal.
I went through the first chapter of the Java® Performance Companion book and there is a passage in it that describes this:
As of this writing, G1 primarily targets the use case of large Java heaps with reasonably low pauses, and also those applications that are using CMS GC. There are plans to use G1 to also target the throughput use case, but for applications looking for high throughput that can tolerate longer GC pauses, Parallel GC is currently the better choice.
This exactly answers my question: if a project exists purely to run scheduled tasks or to consume from a message queue (MQ), it usually has no pause-time requirement, so Parallel GC is the more appropriate choice.
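One way to check this for a concrete batch job is to measure it rather than reason about it. The sketch below reads the standard GarbageCollectorMXBeans at the end of a run and estimates the share of uptime spent outside collections. GcThroughput and its methods are hypothetical names, and the collection times reported by the JVM are approximate, so treat the result as an indication only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports the collectors of the running JVM and an
// approximate "throughput" (share of uptime not spent in garbage collection).
class GcThroughput {

    /** Approximate accumulated collection time in milliseconds over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of uptime not spent in GC, e.g. 0.98 means roughly 2% GC overhead. */
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("approximate throughput: %.2f%%%n",
                100 * throughput(totalGcMillis(), uptime));
    }
}
```

Calling the same reporting code at the end of the batch job, once started with -XX:+UseParallelGC and once with -XX:+UseG1GC, lets you compare total run time and GC overhead for your own workload.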
The (relatively) new built-in performance monitor/profiler for Java is Mission Control. The Oracle docs advertise that it can be used in production with minimal performance overhead (less than 2%):
The tool chain [Mission Control + Flight Recorder] enables developers and administrators to collect and analyze data from Java applications running locally or deployed in production environments.
I have used jvisualvm (VisualVM) for many years now, but never in a production environment due to the putative admonishment that it does incur performance overhead.
So I ask: What is so different between Mission Control (and its Flight Recorder) and VisualVM that allows MC/FR to not hinder performance? Or do they not include certain features/capabilities that VisualVM delivers?
The main performance difference in the method profiling is that MC/JFR uses sampling, and only samples a few threads per sampling interval. It uses a similar approach to AsyncGetCallTrace (see for example http://psy-lob-saw.blogspot.com/2016/06/the-pros-and-cons-of-agct.html)
Since I work with MC/JFR, I'm not as familiar with how VisualVM does its sampling profiling, but I believe it doesn't use the same method.
MC/JFR has its data-gathering engine deeply integrated into the HotSpot JVM, whereas VisualVM uses external APIs/MXBeans. This also helps JFR keep the performance overhead low.
Generally, JFR is designed to find the hot spots, rather than to gather data that is 100% correct but might slow your application down and affect its actual behavior. This goes for both the method and allocation sampling, as well as other information about latency events (wait/sleep/block), where only the events above a certain threshold are recorded. I'm less familiar with how this compares for VisualVM.
Other than that, the two tools have different feature sets, neither of which is a superset of the other.
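To make the sampling idea concrete, here is a deliberately naive sampler written against the public Thread API. It is only an illustration of the principle (look at the stacks every few milliseconds and count what is on top) and is not how JFR is implemented: JFR samples inside the JVM and is much cheaper than Thread.getAllStackTraces(). All names (NaiveSampler, busy, startWorker) are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a naive sampling profiler built on Thread.getAllStackTraces().
class NaiveSampler {
    static volatile boolean running;
    static volatile long sink;

    /** CPU-bound loop that the sampler should identify as the hot spot. */
    static void busy() {
        while (running) {
            sink = sink * 31 + 7;
        }
    }

    static Thread startWorker() {
        running = true;
        Thread worker = new Thread(new Runnable() {
            public void run() {
                busy();
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    /** Counts how often each method is the top frame of a runnable thread. */
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<String, Integer>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                Thread thread = e.getKey();
                StackTraceElement[] stack = e.getValue();
                if (thread == Thread.currentThread()) {
                    continue; // do not profile the sampler itself
                }
                if (thread.getState() != Thread.State.RUNNABLE || stack.length == 0) {
                    continue; // only threads that are actually on a CPU are interesting
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                Integer old = hits.get(top);
                hits.put(top, old == null ? 1 : old + 1);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        startWorker();
        Map<String, Integer> hits = sample(100, 10);
        running = false;
        System.out.println(hits);
    }
}
```

The counts are statistical rather than exact, which is the trade-off described above: the hot method shows up clearly, while short or rare calls may never be sampled at all.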
How does Java deal with GC and Heap Allocation on multi-processor machines?
In the reading I've done, there doesn't seem to be any difference in the algorithms used between single and multi-processor systems. The art and science of GC tuning in Java seems fairly mature, yet I can't find anything related to this in any of the common JVM implementations.
As a data point, in .Net, the algorithm changes significantly: There's a heap affinitized to each processor, and each processor is responsible for that heap. This is documented in a number of places such as MSDN:
Scalable Collections: On a multiprocessor system running the server version of the execution engine (MSCorSvr.dll), the managed heap is split into several sections, one per CPU. When a collection is initiated, the collector has one thread per CPU; all threads collect their own sections simultaneously. The workstation version of the execution engine (MSCorWks.dll) doesn't support this feature.
Any insight that can be offered into Java GC tuning specifically for multi processor systems is also of interest to me.
Indeed, in the HotSpot JVM, the way that the heap is used does not depend on the heap size or number of cores.
Each thread (not processor) has a Thread Local Allocation Buffer (TLAB) so that object allocation is cheap and thread-safe. This memory space is roughly analogous to the heap-processor affinity you are mentioning.
You can also enable Non-Uniform Memory Access (NUMA) support. The idea behind NUMA is to prefer the RAM that is close to a CPU chip when storing objects, instead of treating the entire heap as a uniform space.
Finally, the garbage collectors are multi-threaded and scale with the number of cores, so they take advantage of your hardware.
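A small experiment can make the per-thread view visible. The sketch below lets several threads allocate concurrently without any locking in user code and asks HotSpot how many bytes it attributes to each thread. It assumes a HotSpot-based JVM (the com.sun.management.ThreadMXBean extension) with thread allocation measurement enabled, which is the default there. The class and method names are hypothetical, and the example does not show the TLABs themselves, only that allocation is tracked per thread.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo: each thread allocates on its own and reports its own byte count.
class PerThreadAllocation {
    static volatile Object sink; // publishing the arrays keeps the JIT from removing the allocations

    /** Allocates count small arrays on the calling thread; returns the bytes attributed to it. */
    static long allocateOnThisThread(int count) {
        // Assumption: HotSpot-based JVM that provides the com.sun.management extension.
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        for (int i = 0; i < count; i++) {
            sink = new byte[64];
        }
        return bean.getThreadAllocatedBytes(id) - before;
    }

    /** Runs the allocation loop on several threads at once, one result per thread. */
    static List<Long> allocateInParallel(int threads, final int countPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        return allocateOnThisThread(countPerThread);
                    }
                }));
            }
            List<Long> result = new ArrayList<Long>();
            for (Future<Long> f : futures) {
                result.add(f.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores: " + cores);
        System.out.println("bytes allocated per thread: " + allocateInParallel(cores, 1000000));
    }
}
```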
Garbage collection is an implementation-specific concept. Different JVMs (IBM, Oracle, OpenJDK, etc.) have different implementations, and different mechanisms are available in different versions too. Often you can select which mechanism you want to use when you start your Java program.
These details are often given in the documentation for the command-line options for your JRE:
IBM JDK Here
Oracle JRE Options here
I had an old application, a JAR file, that went through some enhancements. Basically some parts of the code had to be modified along with modifying some of the logic.
Comparing the OLD version against the NEW version, the NEW version is about 2X slower than the old one.
I'm trying to narrow down what's causing the slowdown, but I find myself measuring the time for certain for-loops using System.out.println with System.currentTimeMillis(). This is getting very tedious.
Is there a Java performance tool that will help me in figuring out why the NEW JAR is about 2X slower than the old one?
Thanks in advance.
JProfiler has the capability to compare CPU snapshots. Record the execution for the old and the new JAR file and save snapshots (if the JVM exits at the end, configure a "JVM exit" trigger that saves a snapshot).
Then open the snapshot comparison window with "Session->Compare Snapshots in New Window" and add the two snapshots. The hot spots comparison (optionally with a view filter set) will immediately show you which methods are responsible for the increase in execution time.
Another way to analyze the differences in execution time is the call tree comparison.
Disclaimer: My company develops JProfiler.
You should use a profiler. This will show you which methods are taking the most time (and what is calling them), without you having to guess which ones to measure.
Java comes with a built-in profiler called hprof, but see also:
https://stackoverflow.com/questions/14762/please-recommend-a-java-profiler
5 things you didn't know about ... Java performance monitoring
The JConsole and VisualVM tools
Depending on how long-running the process is, I'd think about Visual VM 1.3.3. If you download all the plugins, you'll be able to see heap, threads, objects, etc. That ought to help, and it won't cost a dime.
I believe it assumes the Oracle/Sun JVM.
A profiler tool like YourKit, or something to measure performance reliably like Hyperic's Sigar, is a good candidate for your case. Have a look at those tools.
The former will find bottlenecks in your code and/or memory leaks (though not all of them), while the latter is an API with which you can measure performance reliably, since Oracle's JVM and OpenJDK have no way of getting performance metrics (such as wall-clock time, CPU time spent by the application, memory usage, or application threads) reliably, consistently and accurately.
By default, Java provides packages for these things.
For example:
java.lang.management.ManagementFactory
java.lang.management.ThreadMXBean
but depending on your case they may or may not be adequate (keep in mind they are OK for most cases unless we are talking about something critical).
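For instance, the standard ThreadMXBean can report the CPU time consumed by the current thread, which can be compared with wall-clock time for a section of code. The following is a minimal sketch with a made-up helper named CpuTimer. CPU time measurement is optional in the API, so the code checks for support first and reports -1 otherwise.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: measures wall-clock and CPU time of a task on the current thread.
class CpuTimer {

    /** Runs the task and returns {wall-clock nanos, CPU nanos of this thread or -1}. */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();
        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        task.run();
        long cpuEnd = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        long wall = System.nanoTime() - wallStart;
        // getCurrentThreadCpuTime() returns -1 when measurement is disabled.
        long cpu = (cpuStart >= 0 && cpuEnd >= 0) ? cpuEnd - cpuStart : -1;
        return new long[] {wall, cpu};
    }

    public static void main(String[] args) {
        long[] t = measure(new Runnable() {
            public void run() {
                double x = 0;
                for (int i = 0; i < 50000000; i++) {
                    x += Math.sqrt(i);
                }
                System.out.println("result: " + x);
            }
        });
        System.out.printf("wall: %.1f ms, cpu: %.1f ms%n", t[0] / 1e6, t[1] / 1e6);
    }
}
```

A large gap between the two numbers suggests the code spends its time waiting (I/O, locks, sleep) rather than computing, which is something System.currentTimeMillis() alone cannot tell you.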
I've been told by my company's support team that some versions of Java have a significant performance impact when we turn on -verbose:gc. However, I can't figure out if this is the case or not.
Was this logging slow(ish) at some point, and when did it stop?
The reason I ask is that there's some hesitation about applying this to a production environment to investigate potential memory leaks (and whether we can stop doing periodic restarts of the system...).
Specifically, I'm talking about Java 1.4.2, which I think introduced the argument, and I'd like to know up to which service pack the problem applies.
I know you asked about the impact of verbose:gc (Amir is correct), but based on the comments I see you are investigating a memory leak.
Is it possible for you to get a histogram of your environment? verbose GC will only show you that there is a memory leak, not where the memory is sitting.
You mention Java 1.4.2; is that your current version? If you are using 1.5 or higher, you can use
jmap -histo <pid> > file.txt
This will give you a breakdown of all the objects in memory. It will freeze your JVM for a time that depends on the amount of memory in the system (2 GB can freeze it for a minute or so, even on good hardware), so test this on a development system first. I know you don't want to impact your production environment, but this is a necessary evil to find the source of the problem. Do a capture right before the periodic restart to lessen the impact.
I suggest that you do the following:
Write some benchmark that is likely to stress the garbage collector. (Create large linked data structures with weak references, and so on.)
Install a copy of the same version of the JVM as you are using in production on some test box.
Run the benchmark with various GC logging settings, including the settings that you want to run in production, measuring the performance impact on the benchmark.
If you do this right, it will give you some solid evidence about what the likely performance impact will be for your production server.
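As a starting point for the first step, here is a rough sketch of such a benchmark. It builds and discards linked lists and holds them only through weak references, then reports elapsed time and the number of collections. The names and sizes are invented, and the code uses generics and MXBeans, so it assumes Java 5 or later; on 1.4.2 you would have to drop those parts and rely on timing alone.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical GC stress benchmark: allocates many short-lived linked lists.
class GcStressBenchmark {

    static final class Node {
        final byte[] payload = new byte[128];
        Node next;
    }

    /** Builds and discards rounds linked lists; returns the elapsed time in milliseconds. */
    static long run(int rounds, int nodesPerRound) {
        List<WeakReference<Node>> weak = new ArrayList<WeakReference<Node>>();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            Node head = null;
            for (int i = 0; i < nodesPerRound; i++) {
                Node n = new Node();
                n.next = head;
                head = n;
            }
            // Only a weak reference survives the round, so the whole list becomes garbage.
            weak.add(new WeakReference<Node>(head));
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    /** Total number of collections reported by all collectors so far. */
    static long gcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when the collector does not report it
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = gcCount();
        long millis = run(200, 100000);
        System.out.println("elapsed: " + millis + " ms, collections: " + (gcCount() - before));
    }
}
```

Run it several times with and without -verbose:gc on the test box and compare the elapsed times. A single run is too noisy to draw conclusions from.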