Profiling number of garbage-collected object instances per class - java

I am looking for a tool that can provide VisualVM-like profiling of live objects, but in non-GUI mode.
The VisualVM functionality I am referring to is accessed by going to the "Profiler" tab and clicking "Memory", then setting a profiling preset of "Profile object allocations and GC" for every 1 object (i.e. all objects). This gives me exactly what I need in an auto-refreshing view, which I can filter for the class that interests me.
However, I want to be able to export the table of "live objects" to a text file for every snapshot that is taken (VisualVM refreshes every second). Obviously, pointing and clicking cannot possibly be a solution...
Anyone know of such a "command-line" profiler?
I have been looking at jmap, which provides heap dumps, but it is too costly (the dump takes too long; I am just interested in the number of objects).
There is a commercial tool called YourKit, but I don't know whether it can do what I need (it also seems rather expensive for the kind of "one-off" usage I have in mind).
If I could use VisualVM as-is, but have it append the output to a file (instead of refreshing its GUI) it'd be perfect...

I think class histograms are what you are looking for. You could collect histograms at regular intervals; each one shows the number of objects of each class and the space they occupy. You can then parse the text output yourself in order to:
compare two histograms to see instance allocation/deallocation
filter by a class name
monitor space occupation of class instances over time
Collect a class histogram with jmap -histo $pid; a minimal collector loop is sketched below.
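For example, here is a hedged sketch of such a collector in Java (it assumes Java 9+ for InputStream.readAllBytes, jmap on the PATH, and an arbitrary output file name histograms.txt), appending a timestamped histogram every second to mirror VisualVM's refresh rate:

import java.io.IOException;
import java.nio.file.*;
import java.time.LocalDateTime;

public class HistogramCollector {
    public static void main(String[] args) throws IOException, InterruptedException {
        String pid = args[0];                       // PID of the target JVM
        Path out = Paths.get("histograms.txt");     // arbitrary output file
        while (true) {
            // Run `jmap -histo <pid>` and capture its output
            Process p = new ProcessBuilder("jmap", "-histo", pid)
                    .redirectErrorStream(true)
                    .start();
            byte[] histogram = p.getInputStream().readAllBytes();
            p.waitFor();
            // Append a timestamp header followed by the snapshot
            Files.write(out, ("=== " + LocalDateTime.now() + " ===\n").getBytes(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            Files.write(out, histogram, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            Thread.sleep(1000);
        }
    }
}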

Related

Resize internal table and entrySet fields of many HashMaps in Java 8, after creation, to take less space in heap

We have some legacy code written in the early 2000s that was running fine until now; when we added more users to the system, it started having daily OutOfMemory exceptions.
We are going to redo the project from the ground up, but we want to make the best use of the existing code base.
The code has thousands (yes, not the best design) of HashMaps that are only modified in a loop on creation and after that remain read-only.
When I run Memory Analyzer, it says that 15% of the heap consists of unused space in HashMaps.
For these instances, if we use reflection to make the entry set accessible and resize it so the size equals what is actually needed, will iterators fail?
After initial construction these maps are read-only caches. We are running the code on Java 6, with a few instances on Java 8. We won't be upgrading the JVM version; in another 10-12 months the new code will be in production and will replace the current one.
I know we can give a better initial size when creating the hash map, but that is a lot of work compared to one function that resizes a map based on its current size. Also, most of the maps are part of libraries (whose source code I do not have access to) that do not expose a constructor accepting a size; they only call the default constructor of their internal HashMap.
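For concreteness, a hedged sketch of the kind of reflective swap being asked about (assuming the OpenJDK layout, where the backing array is a package-private field named table in both the Java 6 and Java 8 HashMap; this only illustrates the question, it is not a claim that doing so is safe):

import java.lang.reflect.Field;
import java.util.HashMap;

public class MapCompactor {
    // Rebuilds a right-sized table via the copy constructor (which sizes the
    // new table to fit the current contents) and swaps it into the original
    // map. The original map's threshold field is left untouched, which only
    // works because these maps are never written to again.
    static <K, V> void compactInPlace(HashMap<K, V> map)
            throws NoSuchFieldException, IllegalAccessException {
        HashMap<K, V> resized = new HashMap<>(map);
        Field table = HashMap.class.getDeclaredField("table");
        table.setAccessible(true);
        table.set(map, table.get(resized));
    }
}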

How to find culprit class/object by looking at memory profiler result in VisualVM

I am profiling my Java application using VisualVM
and I have gone through
profiling_with_visualvm_part_1
profiling_with_visualvm_part_2
When I look at the memory profiling result, I see millions of Object[], char[], String and other such fundamental objects created, and they take up all the memory. I want to know which of my classes (or which part of my code) is actually responsible for creating those Object[] and String instances; so far I couldn't find it. Once I know the culprit class, I can dive into the code and fix it.
I put in a filter of com.mypackage.*, but I see all of those classes are many times smaller (sometimes 0 bytes) compared to the total size of the Object[], char[] and String objects.
I believe there should be a way to find the culprit code; otherwise, the profiler wouldn't be of much use.
Let me know if my question is not clear, I will try to clarify further.
If you want to see which code allocates those instances, go to 'Memory settings' and enable 'Record allocations stack traces'. This option is explained in 'Profiling with VisualVM part 2'. Once you have turned it on, profile your application and take a snapshot of the profiling results. In the snapshot, right-click on the particular class and invoke 'Show allocation stacktraces'.

Size of the specific object's subtree in the JVM memory

Is it possible to check programmatically how much memory some object takes (with its whole subtree in the JVM memory)? I would like to say (from Java code):
'tell me how much memory in the current JVM the JPanel takes, with its whole
reference subtree, when we assume that the mentioned JPanel is the root
of this tree'.
I wonder if I could this way compare how much memory two JPanels take (or JFrames, or whatever), and which takes more, without analyzing a dump. And if the answer is 'yes', I wonder how precise this value would be.
As stated in the comments to your question, the sizeOf problem in Java isn't easy to solve, not only because the object you're trying to size isn't really the root of a memory graph, but also because there are issues with counting the size of static fields etc. (they belong to the whole class, not to any specific instance).
However, there are ways to get some meaningful data.
The first approach is to use a Java agent attached to the JVM, which in turn calls a size-estimation function that Sun/Oracle added starting with Java 6. See this page for instructions; a minimal sketch of such an agent follows.
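(A hedged sketch of the agent approach; the class and jar names are placeholders. The jar's manifest must declare Premain-Class: SizeOfAgent, and the target JVM is started with -javaagent:sizeof.jar. Note that Instrumentation.getObjectSize reports the shallow size of a single object; walking the reference graph for a deep size is left to the caller.)

import java.lang.instrument.Instrumentation;

public class SizeOfAgent {
    private static volatile Instrumentation inst;

    // Called by the JVM before main() because of the -javaagent flag
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        inst = instrumentation;
    }

    // Shallow size of one object, as estimated by the JVM
    public static long shallowSizeOf(Object o) {
        return inst.getObjectSize(o);
    }
}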
The second approach is to estimate the size of an object tree based on theoretical calculations. There is a library that does this for you here.
You can check out JAMM, which is a Java agent for measuring object size. You can find a tutorial here on how to use it.
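A brief usage sketch (it assumes JAMM's jar is on the classpath and attached with -javaagent:jamm.jar; the plain MemoryMeter constructor shown matches older JAMM releases, while newer versions use a builder):

import org.github.jamm.MemoryMeter;
import javax.swing.JPanel;

public class DeepSizeExample {
    public static void main(String[] args) {
        MemoryMeter meter = new MemoryMeter();
        JPanel panel = new JPanel();
        // measure() is the shallow size; measureDeep() walks the reference
        // graph from the given root, which is what the question asks for
        System.out.println("shallow: " + meter.measure(panel) + " bytes");
        System.out.println("deep:    " + meter.measureDeep(panel) + " bytes");
    }
}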

Is it possible to see how much heap an object is using?

If I have a List<Object>, would it be possible to run some method on each Object to see how much memory each is consuming? I know nothing about each Object; it may be an entire video file loaded onto the heap, or just a two-byte string. I ultimately would like to know which objects to drop first before running out of memory.
I think Runtime.totalMemory() shows the memory currently used by the JVM, but I want to see the memory used by a single object.
SoftReference looks kind of like what you need. Create a list of soft references to your objects, and if those objects are not referenced anywhere else and you run out of memory, the JVM will delete some of them. I don't know how smart the algorithm for choosing what to delete is, but it could well remove those that free the most memory.
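A minimal sketch of that pattern (the byte[] here is just a stand-in for a large cached object such as a video file):

import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

public class SoftCache {
    public static void main(String[] args) {
        List<SoftReference<byte[]>> cache = new ArrayList<>();
        cache.add(new SoftReference<>(new byte[10_000_000])); // stand-in for a large object

        // Under memory pressure the GC may clear the referent, so always
        // check for null and re-create the value on demand.
        byte[] value = cache.get(0).get();
        if (value == null) {
            value = new byte[10_000_000]; // re-load on demand
        }
        System.out.println("cached bytes: " + value.length);
    }
}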
If you are in a container you can use JConsole: http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html
The JDK since 1.5 comes with heap dump utilities... Are you in a container or in Eclipse? Also, why do you have a List of Objects?
There is no clean way to do it. You can create a dummy OutputStream that does nothing but count the number of bytes written; by serializing your object graph to such a stream, you can make some estimation of its size.
I would not advise doing it in a production system. I, personally, did it once for experimenting and making estimations.
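A sketch of that counting-stream estimate (it requires the object graph to be Serializable, and the serialized size only loosely approximates the heap size):

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

public class SizeEstimator {
    // Discards all bytes, keeping only a running count
    static class CountingOutputStream extends OutputStream {
        long count;
        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }
    }

    public static long estimate(Object obj) throws IOException {
        CountingOutputStream counter = new CountingOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(counter)) {
            out.writeObject(obj);
        }
        return counter.count; // includes a few bytes of stream header
    }
}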
Actually, another possible tactic is just to make a crap-load of instances of the class you want to check (like a million, in an array).
The sheer number of objects should negate the overhead (as in, the overhead of other stuff will be much smaller than your crap-load of objects).
You will want to run this in isolation, of course (i.e. in a bare public static void main()).
I will admit you will need lots of memory for this test.
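A sketch of that bulk-allocation estimate (java.awt.Point is a hypothetical stand-in for the class under test; run it alone with a large heap, and keep in mind System.gc() is only a hint):

public class BulkSizeEstimate {
    public static void main(String[] args) {
        final int COUNT = 1_000_000;
        Runtime rt = Runtime.getRuntime();
        Object[] instances = new Object[COUNT];
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();
        for (int i = 0; i < COUNT; i++) {
            instances[i] = new java.awt.Point(i, i); // class under test
        }
        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.println("approx bytes/instance: " + (after - before) / COUNT);
    }
}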
Something you could do is make a Map<Object, Long> that maps each object to its memory size.
Then, to measure the size of a particular object, you have to do it at instantiation time: measure the JVM memory use before building the object and again after building it (e.g. via Runtime.totalMemory() minus Runtime.freeMemory()), and take the difference between the two; that is roughly the size of the object in memory. Then add the Object and the Long to your map. From there you should be able to loop through all of the keys in the map and find the object using the largest amount of space.
I am not sure there is a way to do it per object after you already have your List<Object>... I hope this is helpful!
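A hedged sketch of that bookkeeping (the names are illustrative, and the measurement is rough, since GC activity and other threads can move the numbers between the two readings):

import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class SizeLedger {
    private final Map<Object, Long> sizes = new IdentityHashMap<>();

    // Builds the object via the supplied factory and records the change
    // in used heap around the construction as its approximate size.
    public <T> T record(Supplier<T> factory) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();
        T obj = factory.get();
        long after = rt.totalMemory() - rt.freeMemory();
        sizes.put(obj, Math.max(0, after - before));
        return obj;
    }

    // The recorded object with the largest approximate size
    public Object largest() {
        return sizes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}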

Method for finding memory leak in large Java heap dumps

I have to find a memory leak in a Java application. I have some experience with this, but I would like advice on a methodology/strategy for it. Any reference and advice is welcome.
About our situation:
Heap dumps are larger than 1 GB
We have heap dumps from 5 occasions.
We don't have any test case to provoke this. It only happens in the (massive) system test environment, after at least a week's usage.
The system is built on an internally developed legacy framework with so many design flaws that it is impossible to count them all.
Nobody understands the framework in depth. It has been transferred to one guy in India who barely keeps up with answering e-mails.
We have done snapshot heap dumps over time and concluded that there is not a single component increasing over time; it is everything that grows slowly.
The above points us in the direction of the framework's homegrown ORM system increasing its usage without limit. (This system maps objects to files?! So not really an ORM.)
Question: What is the methodology that helped you succeed in hunting down leaks in an enterprise-scale application?
It's almost impossible without some understanding of the underlying code. If you understand the underlying code, then you can better sort the wheat from the chaff among the zillion bits of information you are getting in your heap dumps.
Also, you can't know if something is a leak or not without knowing why the class is there in the first place.
I just spent the past couple of weeks doing exactly this, and I used an iterative process.
First, I found the heap profilers basically useless. They can't analyze the enormous heaps efficiently.
Rather, I relied almost solely on jmap histograms.
I imagine you're familiar with these, but for those not:
jmap -histo:live <pid> > histogram.out
creates a histogram of the live heap. In a nutshell, it tells you the class names, and how many instances of each class are in the heap.
I was dumping out the heap regularly, every 5 minutes, 24 hours a day. That may well be too granular for you, but the gist is the same.
I ran several different analyses on this data.
I wrote a script to take two histograms and dump out the difference between them. So, if java.lang.String was 10 in the first dump and 15 in the second, my script would spit out "5 java.lang.String", telling me it went up by 5. If it had gone down, the number would be negative.
I would then take several of these differences, strip out all classes that went down from run to run, and take a union of the result. At the end, I'd have a list of classes that continually grew over a specific time span. Obviously, these are prime candidates for leaking classes. (A sketch of such a diff script is shown below.)
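For anyone who wants a starting point, here is a hedged sketch of such a diff script in Java (the column layout is assumed to match the jmap -histo output shown later in this thread):

import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.regex.*;

public class HistoDiff {
    // Matches rows like: "   1:    4632416    392305928  [C"
    private static final Pattern ROW =
            Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+)\\s*$");

    static Map<String, Long> parseInstanceCounts(Path file) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        for (String line : Files.readAllLines(file)) {
            Matcher m = ROW.matcher(line);
            if (m.matches()) {
                counts.put(m.group(3), Long.parseLong(m.group(1)));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> first = parseInstanceCounts(Paths.get(args[0]));
        Map<String, Long> second = parseInstanceCounts(Paths.get(args[1]));
        // Print the per-class instance-count delta between the two snapshots
        second.forEach((cls, count) -> {
            long delta = count - first.getOrDefault(cls, 0L);
            if (delta != 0) System.out.println(delta + " " + cls);
        });
    }
}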
However, some classes retain some instances while others are GC'd. These classes can easily go up and down in overall count, yet still leak, so they can fall out of the "always rising" category of classes.
To find these, I converted the data into a time series and loaded it into a database, Postgres specifically. Postgres is handy because it offers statistical aggregate functions, so you can do simple linear-regression analysis on the data and find classes that trend up, even if they aren't always at the top of the charts. I used the regr_slope function, looking for classes with a positive slope.
I found this process very successful, and really efficient. The histogram files aren't insanely large, and it was easy to download them from the hosts. They weren't super expensive to run on the production system (they do force a full GC, and may block the VM for a bit). I was running this on a system with a 2 GB Java heap.
Now, all this can do is identify potentially leaking classes.
This is where understanding how the classes are used, and whether they should or should not be there, comes into play.
For example, you may find that you have a lot of Map.Entry classes, or some other system class.
Unless you're simply caching String, the fact is these system classes, while perhaps the "offenders", are not the "problem". If you're caching some application class, THAT class is a better indicator of where your problem lies. If you don't cache com.app.yourbean, then you won't have the associated Map.Entry tied to it.
Once you have some classes, you can start crawling the code base looking for instances and references. Since you have your own ORM layer (for good or ill), you can at least readily look at its source code. If your ORM is caching stuff, it's likely caching ORM classes wrapping your application classes.
Finally, another thing you can do is, once you know the classes, start up a local instance of the server with a much smaller heap and smaller dataset, and use one of the profilers against that.
In this case, you can do a unit test that affects only one (or a small number) of the things you think may be leaking. For example, you could start up the server, run a histogram, perform a single action, and run the histogram again. Your leaking class should have increased by 1 (or whatever your unit of work is).
A profiler may be able to help you track the owners of that "now leaked" class.
But, in the end, you're going to have to have some understanding of your code base to better understand what's a leak, and what's not, and why an object exists in the heap at all, much less why it may be being retained as a leak in your heap.
Take a look at Eclipse Memory Analyzer. It's a great tool (and self-contained; it does not require Eclipse itself to be installed) which 1) can open up very large heaps very fast and 2) has some pretty good automatic detection tools. The latter isn't perfect, but EMA provides a lot of really nice ways to navigate through and query the objects in the dump to find any possible leaks.
I've used it in the past to help hunt down suspicious leaks.
This answer expands upon @Will Hartung's. I applied the same process to diagnose one of my memory leaks and thought that sharing the details would save other people time.
The idea is to have Postgres 'plot' time vs. memory usage of each class, draw a line that summarizes the growth, and identify the objects that are growing the fastest:
^
|
s | Legend:
i | * - data point
z | -- - trend
e |
( |
b | *
y | --
t | --
e | * -- *
s | --
) | *-- *
| -- *
| -- *
--------------------------------------->
time
Convert your heap dumps (you need multiple) into a format that is convenient for consumption by Postgres. From the heap dump format:
num #instances #bytes class name
----------------------------------------------
1: 4632416 392305928 [C
2: 6509258 208296256 java.util.HashMap$Node
3: 4615599 110774376 java.lang.String
5: 16856 68812488 [B
6: 278914 67329632 [Ljava.util.HashMap$Node;
7: 1297968 62302464
...
to a CSV file with the datetime of each heap dump:
2016.09.20 17:33:40,[C,4632416,392305928
2016.09.20 17:33:40,java.util.HashMap$Node,6509258,208296256
2016.09.20 17:33:40,java.lang.String,4615599,110774376
2016.09.20 17:33:40,[B,16856,68812488
...
Using this script:
# Example invocation: convert.heap.hist.to.csv.pl -f heap.2016.09.20.17.33.40.txt -dt "2016.09.20 17:33:40" >> heap.csv
use strict;
use warnings;
use Getopt::Long;

sub usage { die "$_[0]\n"; }

my $file;
my $dt;
GetOptions (
    "f=s"  => \$file,
    "dt=s" => \$dt
) or usage("Error in command line arguments");

open my $fh, '<', $file or die $!;
while (not eof($fh)) {
    my $line = <$fh>;
    $line =~ s/\R//g; # remove newlines
    #   1:    4442084    369475664  [C
    my ($instances, $size, $class) = ($line =~ /^\s*\d+:\s+(\d+)\s+(\d+)\s+([\$\[\w\.]+)\s*$/);
    if ($instances) {
        print "$dt,$class,$instances,$size\n";
    }
}
close($fh);
Create a table to put the data in
CREATE TABLE heap_histogram (
histwhen timestamp without time zone NOT NULL,
class character varying NOT NULL,
instances integer NOT NULL,
bytes integer NOT NULL
);
Copy the data into your new table
\COPY heap_histogram FROM 'heap.csv' WITH DELIMITER ',' CSV ;
Run the slope query against size (number of bytes):
SELECT class, REGR_SLOPE(bytes,extract(epoch from histwhen)) as slope
FROM public.heap_histogram
GROUP BY class
HAVING REGR_SLOPE(bytes,extract(epoch from histwhen)) > 0
ORDER BY slope DESC
;
Interpret the results:
class | slope
---------------------------+----------------------
java.util.ArrayList | 71.7993806279174
java.util.HashMap | 49.0324576155785
java.lang.String | 31.7770770326123
joe.schmoe.BusinessObject | 23.2036817108056
java.lang.ThreadLocal | 20.9013528767851
The slope is bytes added per second (since the unit of epoch is seconds). If you use instances instead of size, that's the number of instances added per second.
One of my lines of code creating this joe.schmoe.BusinessObject was responsible for the memory leak: it was creating the object and appending it to an array without checking whether it already existed. The other objects were also created along with the BusinessObject near the leaking code.
Can you accelerate time? I.e., can you write a dummy test client that forces the application to do a week's worth of calls/requests etc. in a few minutes or hours? Such a client is your biggest friend, and if you don't have one, write one.
We used NetBeans a while ago to analyse heap dumps. It can be a bit slow, but it was effective. Eclipse just crashed, and the 32-bit Windows tools did as well.
If you have access to a 64-bit system, or a Linux system with 3 GB or more, you will find it easier to analyse the heap dumps.
Do you have access to change logs and incident reports? Large scale enterprises will normally have change management and incident management teams and this may be useful in tracking down when problems started happening.
When did it start going wrong? Talk to people and try and get some history. You may get someone saying, "Yeah, it was after they fixed XYZ in patch 6.43 that we got weird stuff happening".
I've had success with IBM Heap Analyzer. It offers several views of the heap, including largest drop-off in object size, most frequently occurring objects, and objects sorted by size.
There are great tools like Eclipse MAT and Heap Hero to analyze heap dumps. However, you need to provide these tools with heap dumps captured in the correct format and at the correct point in time.
This article gives you multiple options for capturing heap dumps. In my opinion, the first three are the most effective options to use; the others are good to be aware of.
1. jmap
2. HeapDumpOnOutOfMemoryError
3. jcmd
4. JVisualVM
5. JMX
6. Programmatic Approach
7. IBM Administrative Console
7 Options to capture Java Heap dumps
If it's happening after a week's usage, and your application is as byzantine as you describe, perhaps you're better off restarting it every week?
I know it's not fixing the problem, but it may be a time-effective solution. Are there time windows when you can have outages? Can you load balance and fail over one instance whilst keeping the second up? Perhaps you can trigger a restart when memory consumption breaches a certain limit (perhaps monitoring via JMX or similar; a small watchdog sketch follows).
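A hedged sketch of that JMX-style monitoring (the 90% threshold and the polling interval are arbitrary; what to do when the threshold is crossed, e.g. cycling the instance, is left to an external script):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapWatchdog {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            // getMax() can be -1 if undefined, but is normally set for the heap
            double used = (double) heap.getUsed() / heap.getMax();
            if (used > 0.9) {
                System.err.println("Heap above 90% of max: time to fail over or restart");
            }
            Thread.sleep(60_000);
        }
    }
}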
I've used jhat; it's a bit crude, but it depends on the kind of framework you have.
