How can I measure memory usage of a Java method [closed]

Sounds strange, but still: imagine I have a Java method
boolean myMethod(Object param1, Object param2) {
    // Here there could be some simple calculations, some service invocations,
    // or calls to the DB; it doesn't matter.
    boolean someValue = param1.equals(param2);   // placeholder for the real work
    return someValue;
}
How can I measure how much memory the JVM uses to execute this method? How much memory does this method actually use for its execution?
Are there any tools for this, or can I maybe add some extra code to my method to see the memory used?
I am using JConsole to monitor the JVM, and I also know that we can use java.lang.Runtime to see freeMemory, but I can't figure out how much memory my method uses with these tools.

Measuring the true size of each invocation of a method is not as simple as it sounds, and the JVM does not support it. The complexity comes about (a) because the JVM was designed to hide this low-level detail and (b) because the JVM may have more than one variation of the same method in flight at any point in time, which sounds nuts but is also true. Sometimes the method will have been inlined, or optimised more heavily, or even recompiled in a special way to support hot swapping of the method while it is running (!), which can radically change its runtime size.
Measuring the amount of heap used by a method call
Given that method invocations do not use the heap, it is tempting to say 0 bytes. However, one may want to know the size of any new objects that were allocated by a method call. This can be answered, but objects are shared across method calls and threads, so one has to be very clear about what to measure and take care not to count it twice. There are libraries that will give the size of an object in Java; they usually work either by reflection plus a heuristic, or by wrapping sun.misc.Unsafe to inspect pointers to the object and its data. For example http://code.google.com/p/javabi-sizeof/.
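If what you are really after is "how many bytes does one call allocate", HotSpot also exposes a per-thread allocation counter through com.sun.management.ThreadMXBean (the com.sun.management variant, not the standard interface). A minimal sketch, assuming the method from the question runs on the current thread and nothing else allocates on that thread in between:

import java.lang.management.ManagementFactory;

com.sun.management.ThreadMXBean threadBean =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
long threadId = Thread.currentThread().getId();

long before = threadBean.getThreadAllocatedBytes(threadId);
boolean result = myMethod(param1, param2);   // the method from the question
long after = threadBean.getThreadAllocatedBytes(threadId);
System.out.println("Bytes allocated by the call: " + (after - before));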
Measuring the allocation rate of a method
YourKit has an excellent tool described here for tracking every object allocation at its source point and reporting back their total size and rates. Very useful for finding out who is causing GC churn.
Measuring the amount of stack used per method call
The JVM does not publish this information, and as mentioned before it will vary depending on the optimizations that are currently active and which runtime compiler is being used.
The alternative is to use a heuristic, something like counting how many variables are used within the method and multiplying the count by 4 to give an answer in bytes (which assumes that every variable is 4 bytes in size, i.e. an int). Obviously this heuristic is flawed and could be improved, but it is simple enough to give a quick and fairly representative answer.

I used the following in the past and found it useful:
http://www.javamex.com/classmexer/

Related

Count number of gc's that occur during a unit test run [closed]

I am currently writing a unit test to see the performance impact of a given method. In practice we have observed that lots of GCs occur during the execution of this method. I was wondering whether it is possible to see, from Java, how many GCs occurred during the method run.
You can use GarbageCollectorMXBean to get the count of garbage collections.
You can do something like this:
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

List<GarbageCollectorMXBean> garbageCollectors = ManagementFactory.getGarbageCollectorMXBeans();
for (GarbageCollectorMXBean garbageCollectorMBean : garbageCollectors) {
    String gcName = garbageCollectorMBean.getName();            // e.g. "G1 Young Generation"
    long gcCount = garbageCollectorMBean.getCollectionCount();  // collections so far, or -1 if unavailable
}
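To answer the original question directly, you can snapshot the summed count before and after the method under test and take the difference. A minimal sketch (myMethod stands in for the method being measured):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sums the collection counts of all collectors (young and old generation).
static long totalGcCount() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
        total += Math.max(0, gc.getCollectionCount());   // -1 means the count is unavailable
    }
    return total;
}

// ... then, around the method under test:
long before = totalGcCount();
boolean result = myMethod(param1, param2);
System.out.println("GCs during the call: " + (totalGcCount() - before));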
Can't you simply run the method with GC logging enabled and count the collections yourself?
java -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails TestFunc
Point 1
If GCs run during your method, you will get skewed results. Every GC has stop-the-world phases for some parts of the algorithm, meaning your threads are put on hold for that time. How long depends on the application you are running, the heap you give to the JVM, and the GC algorithm you choose for the young and old generations. Note that young-generation collections are much cheaper than old-generation ones.
Point 2
Benchmarking in Java is complicated, mainly because of the JIT. You should use a library for your tests if you can; a very good one is JMH. If you can't use one, then let the JVM warm up a bit before running the actual benchmark, for example by running the same method around 10,000 times before you start measuring.
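For reference, a minimal JMH benchmark for a method like the one in the question could look as follows; the warm-up iterations take care of the JIT heat-up mentioned above (the stand-in myMethod is only there so the sketch compiles):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5)        // gives the JIT time to compile and optimise the hot path
@Measurement(iterations = 5)
@Fork(1)
@State(Scope.Thread)
public class MyMethodBenchmark {

    @Benchmark
    public boolean measureMyMethod() {
        return myMethod("a", "b");          // replace with the real method under test
    }

    // Trivial stand-in so the sketch compiles on its own.
    private boolean myMethod(Object param1, Object param2) {
        return param1.equals(param2);
    }
}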

NDK vs JAVA performance [closed]

Does anybody have an estimate of how much faster C code via the NDK would be than the same calculations done in Java code (if at all)?
Let's say I am doing X calculations (the same calculations) in Y seconds in Java code.
How many X calculations could I do in the same Y seconds with C code through the NDK? 1.2x? 2.7x? Any guess?
Let's say the calculation is B = L/A + C/D (the same one for all X calculations).
EDIT:
Why am I asking this?
Because I am considering moving my Java code that processes camera frames to C, to open up the possibility of handling bigger resolutions.
Since no one else wants to touch this topic, because it's not considered serious to try to answer it, I'll have a go:
Java compiles to bytecode, and the bytecode is compiled to native code by the JIT.
C compiles directly to native code.
The difference is really the extra compile step, and in theory Java should do a better job than your C compiler, and here's why:
Java can insert statistics calculations into the generated native code and then, after a while, regenerate it to optimize it against the current runtime paths in your code!
That last point sounds awesome; Java does, however, come with some tradeoffs:
It needs GC runs to clean out memory
It may not JIT code at all
The GC copies live objects and throws away all dead ones. Since the GC does not need to do anything for the dead objects, only for the live ones, GC is in theory faster than the usual malloc/free loop for objects.
However, one thing is forgotten by most Java advocates, and that is that nothing says you have to malloc/free every object instance when coding C. You can reuse memory; you can malloc memory blocks and free blocks containing thousands of temporary objects in one go.
With big heaps in Java, GC time increases, adding stall time. In some software, stalls during a GC cleanup cycle are totally OK; in others they cause fatal errors. Try keeping your software responding within a defined number of milliseconds while a GC happens, and you will see what I'm talking about.
In some extreme cases, the JIT may also choose not to compile the code at all. This happens when a method would be too big once JITed, 8K if I remember correctly. A non-JITed method has a runtime penalty in the range of 20000% (200 times slower, that is; at least it was at our customer). The JIT is also turned off when the JVM's code cache starts to get full (this can happen if you keep loading new classes into the JVM over and over again; it also happened at a customer site).
At one point JIT statistics also reduced concurrency on one 128 core machine to basically single core performance.
In Java the JIT has a limited amount of time to compile the bytecode to native code; it is not OK to spend all CPU resources on the JIT, since it runs in parallel with the code doing the actual work of your program. In C the compiler can run as long as it needs to spit out what it thinks is the most optimized code it can produce. It has no impact on execution time, whereas in Java it does.
What I'm saying is really this:
Java gives you more, but it's not always up to you how it performs.
C gives you less, but it's up to you how it performs.
So to answer your question:
No, selecting C over Java will not by itself make your program faster.
If you only stick to simple math over a preallocated buffer, both the Java and C compilers should spit out about the same code.
You will probably not get a clear answer from anyone; the question is far more complex than it looks.
It is no problem to push the same number of polygons in OpenGL, be it with the NDK or the SDK. After all, they are the same OpenGL calls. The time to render the polygons (in a batch) exceeds the function call overhead by orders of magnitude, so it is usually completely negligible.
But as soon as an application gets more complex and performs some serious calculations (AI, scene graph management, culling, image processing, number crunching, etc.), the native version will usually be much faster.
And there is another thing besides the fundamental problem that there is currently no JIT compilation:
the current Dalvik VM and its compiler seem to be very basic, without doing any optimizations, not even the most basic ones!
There is this (very good) video: Google I/O 2009 - Writing Real-Time Games for Android
After I had seen it, it was clear to me that I would definitely use C++ with the NDK.
For example, he talks about the overhead of function calls: "Don't use function calls."
So yeah, we are back before 1970, talking about the cost of structured programming and the performance advantage of using only global variables and gotos.
Garbage collection is a real problem for games, so you will spend a lot of your time thinking about how you can avoid it. Even formatting a string will create new objects. Hence tips like: don't show the FPS!
Seriously, if you know C++ it is probably easier to manage your memory with new and delete than to tweak your architecture to reduce/avoid garbage collections.
It seems that if you want to program a non-trivial real-time game, you lose all the advantages of Java: don't use getters and setters, don't use function calls, avoid any abstraction, and so on. SERIOUSLY?
But back to your question: The performance advantage of NDK vs SDK can be anything from 0-10000%. It all depends.

Best practice for creating millions of small temporary objects

What are the "best practices" for creating (and releasing) millions of small objects?
I am writing a chess program in Java and the search algorithm generates a single "Move" object for each possible move, and a nominal search can easily generate over a million move objects per second. The JVM GC has been able to handle the load on my development system, but I'm interested in exploring alternative approaches that would:
Minimize the overhead of garbage collection, and
reduce the peak memory footprint for lower-end systems.
The vast majority of the objects are very short-lived, but about 1% of the moves generated are persisted and returned as the persisted value, so any pooling or caching technique would have to provide the ability to exclude specific objects from being reused.
I don't expect fully-fleshed out example code, but I would appreciate suggestions for further reading/research, or open source examples of a similar nature.
Run the application with verbose garbage collection:
java -verbose:gc
And it will tell you when it collects. There would be two types of sweeps, a fast and a full sweep.
[GC 325407K->83000K(776768K), 0.2300771 secs]
[GC 325816K->83372K(776768K), 0.2454258 secs]
[Full GC 267628K->83769K(776768K), 1.8479984 secs]
The arrow shows the size before and after each collection.
As long as it is just doing GC and not a Full GC you are home safe. The regular GC is a copying collector in the 'young generation', so objects that are no longer referenced are simply forgotten about, which is exactly what you want.
Reading Java SE 6 HotSpot Virtual Machine Garbage Collection Tuning is probably helpful.
Since version 6, the server mode of the JVM employs an escape analysis technique. Using it, the JVM can eliminate the allocation of objects that never escape the method, avoiding GC for them altogether.
Well, there are several questions in one here!
1 - How are short-lived objects managed?
As previously stated, the JVM can perfectly well deal with a huge number of short-lived objects, since it relies on the weak generational hypothesis.
Note that we are speaking of objects that reached the main memory (heap). That is not always the case: a lot of the values you create do not even leave a CPU register. For instance, consider this for-loop:
for (int i = 0; i < max; i++) {
    // stuff that uses i
}
Let's not think about loop unrolling (an optimisation that the JVM performs heavily on your code). If max is equal to Integer.MAX_VALUE, your loop might take some time to execute. However, the i variable never escapes the loop block. Therefore the JVM will keep that variable in a CPU register, regularly increment it, but never write it back to main memory.
So, creating millions of objects is not a big deal if they are used only locally. They will be dead before being stored in Eden, so the GC won't even notice them.
2 - Is it useful to reduce the overhead of the GC?
As usual, it depends.
First, you should enable GC logging to get a clear view of what is going on. You can enable it with -Xloggc:gc.log -XX:+PrintGCDetails.
If your application is spending a lot of time in GC cycles then, yes, tune the GC; otherwise it might not really be worth it.
For instance, if you have a young GC every 100 ms that takes 10 ms, you spend 10% of your time in GC and you have 10 collections per second (which is huge). In such a case I would not spend any time on GC tuning, since those 10 GCs per second would still be there; the fix is to reduce the allocation rate instead.
3 - Some experience
I had a similar problem with an application that was creating a huge number of instances of a given class. In the GC logs, I noticed that the allocation rate of the application was around 3 GB/s, which is way too much (come on... 3 gigabytes of data every second?!).
The problem: too frequent GCs caused by too many objects being created.
In my case, I attached a memory profiler and noticed that one class represented a huge percentage of all my objects. I tracked down the instantiations to find out that this class was basically a pair of booleans wrapped in an object. In that case, two solutions were available:
Rework the algorithm so that I do not return a pair of booleans but instead I have two methods that return each boolean separately
Cache the objects, knowing that there were only 4 different instances
I chose the second one, as it had the least impact on the application and was easy to introduce. It took me minutes to put in place a factory with a non-thread-safe cache (I did not need thread safety since I would eventually have only 4 different instances).
The allocation rate went down to 1 GB/s, and so did the frequency of young GC (divided by 3).
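For illustration, a cached factory for such a boolean pair could look roughly like this (a sketch; the class and method names are made up, not the ones from the actual application):

final class BoolPair {
    private static final BoolPair[] CACHE = {
        new BoolPair(false, false), new BoolPair(false, true),
        new BoolPair(true, false),  new BoolPair(true, true)
    };
    final boolean first, second;

    private BoolPair(boolean first, boolean second) {
        this.first = first;
        this.second = second;
    }

    // Always returns one of the 4 cached instances, so no new allocation per call.
    static BoolPair of(boolean first, boolean second) {
        return CACHE[(first ? 2 : 0) + (second ? 1 : 0)];
    }
}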
Hope that helps!
If you have just value objects (that is, no references to other objects) and really, I mean really, tons and tons of them, you can use direct ByteBuffers with native byte ordering (the latter is important), and you need a few hundred lines of code to allocate/reuse them plus the getters/setters. Getters look similar to long getQuantity(int tupleIndex) { return buffer.getLong(tupleIndex + QUANTITY_OFFSET); }
That would solve the GC problem almost entirely, as long as you allocate only once, that is, one huge chunk, and then manage the objects yourself. Instead of references you'd have only an index (that is, an int) into the ByteBuffer that has to be passed along. You may need to do the memory alignment yourself as well.
The technique would feel like using C and void*, but with some wrapping it's bearable. A performance downside could be bounds checking, if the compiler fails to eliminate it. A major upside is locality if you process the tuples like vectors; the lack of an object header reduces the memory footprint as well.
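A rough sketch of what such a flyweight over a direct ByteBuffer could look like (field layout and names are purely illustrative):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MoveStore {
    private static final int MOVE_SIZE = 16;       // two long fields per "move"
    private static final int FROM_OFFSET = 0;
    private static final int SCORE_OFFSET = 8;
    private final ByteBuffer buffer;

    MoveStore(int capacity) {
        buffer = ByteBuffer.allocateDirect(capacity * MOVE_SIZE)
                           .order(ByteOrder.nativeOrder());   // native ordering, as suggested above
    }
    long getFrom(int index)          { return buffer.getLong(index * MOVE_SIZE + FROM_OFFSET); }
    void setFrom(int index, long v)  { buffer.putLong(index * MOVE_SIZE + FROM_OFFSET, v); }
    long getScore(int index)         { return buffer.getLong(index * MOVE_SIZE + SCORE_OFFSET); }
    void setScore(int index, long v) { buffer.putLong(index * MOVE_SIZE + SCORE_OFFSET, v); }
}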
Other than that, it's likely you won't need such an approach, as the young generation of virtually all JVMs dies trivially and the allocation cost is just a pointer bump. Allocation cost can be a bit higher if you use final fields, as they require a memory fence on some platforms (namely ARM/POWER); on x86 it is free, though.
Assuming you find that GC is an issue (as others point out, it might not be), you will be implementing your own memory management for your special case, i.e. a class which suffers massive churn. Give object pooling a go; I've seen cases where it works quite well. Implementing object pools is a well-trodden path, so no need to revisit it here, but look out for:
multi-threading: using thread-local pools might work for your case
backing data structure: consider using ArrayDeque, as it performs well on remove and has no allocation overhead (a sketch follows this list)
limit the size of your pool :)
measure before/after, etc.
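A minimal sketch combining those points, assuming a hypothetical Move class as the pooled type:

import java.util.ArrayDeque;

class Move { int from, to; }                          // stand-in for the class that churns

final class MovePool {
    private static final int MAX_POOLED = 10_000;     // limit the size of the pool
    private static final ThreadLocal<ArrayDeque<Move>> POOL =
            ThreadLocal.withInitial(ArrayDeque::new); // one pool per thread, no locking needed

    static Move acquire() {
        Move m = POOL.get().pollFirst();
        return (m != null) ? m : new Move();          // allocate only when the pool is empty
    }

    static void release(Move m) {
        ArrayDeque<Move> pool = POOL.get();
        if (pool.size() < MAX_POOLED) {
            pool.addFirst(m);
        }
    }
}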
I've run into a similar problem. First of all, try to reduce the size of the small objects. We introduced some shared default field values and referenced them from each object instance.
For example, MouseEvent has a reference to the Point class. We cached Points and referenced them instead of creating new instances. The same goes for, e.g., empty strings.
Another source was multiple booleans, which were replaced with one int, using just one bit of the int for each boolean.
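For example, packing several flags into one int with bit masks might look like this (the names are illustrative, not from the original code):

final class Flags {
    private static final int VISIBLE  = 1 << 0;
    private static final int ENABLED  = 1 << 1;
    private static final int SELECTED = 1 << 2;
    private int bits;                               // holds up to 32 boolean flags

    boolean isEnabled()            { return (bits & ENABLED) != 0; }
    void setEnabled(boolean value) { bits = value ? (bits | ENABLED) : (bits & ~ENABLED); }
}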
I dealt with this scenario in some XML processing code a while ago. I found myself creating millions of XML tag objects which were very small (usually just a string) and extremely short-lived (failure of an XPath check meant no match, so discard).
I did some serious testing and came to the conclusion that I could only achieve about a 7% improvement in speed by using a list of discarded tags instead of making new ones. However, once implemented, I found that the free queue needed a mechanism added to prune it if it got too big; this completely nullified my optimisation, so I made it an option.
In summary: probably not worth it, but I'm glad to see you are thinking about it; it shows you care.
Given that you are writing a chess program, there are some special techniques you can use for decent performance. One simple approach is to create a large array of longs (or bytes) and treat it as a stack. Each time your move generator creates moves, it pushes a couple of numbers onto the stack, e.g. the from-square and the to-square. As you evaluate the search tree you pop moves off and update a board representation.
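A minimal sketch of such a primitive move stack, with a made-up encoding of two square indices per long:

final class MoveStack {
    private final long[] moves = new long[1024];   // preallocated once, no per-move allocation
    private int top;

    void push(int fromSquare, int toSquare) {      // no bounds check, for brevity
        moves[top++] = ((long) fromSquare << 32) | (toSquare & 0xFFFFFFFFL);
    }
    long pop()              { return moves[--top]; }
    static int from(long m) { return (int) (m >>> 32); }
    static int to(long m)   { return (int) m; }
}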
If you want expressive power use objects. If you want speed (in this case) go native.
One solution I've used for such search algorithms is to create just one Move object, mutate it with each new move, and then undo the move before leaving the scope. You are probably analyzing just one move at a time and then just storing the best move somewhere.
If that's not feasible for some reason, and you want to decrease peak memory usage, a good article about memory efficiency is here: http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/memory-efficient-java-tutorial.pdf
Just create your millions of objects and write your code in the proper way: don't keep unnecessary references to these objects. GC will do the dirty job for you. You can play around with verbose GC as mentioned to see if they are really GC'd. Java IS about creating and releasing objects. :)
I think you should read about stack allocation in Java and escape analysis.
Because if you go deeper into this topic you may find that your objects are not even allocated on the heap, and they are not collected by GC the way that objects on the heap are.
There is a Wikipedia explanation of escape analysis, with an example of how it works in Java:
http://en.wikipedia.org/wiki/Escape_analysis
I am not a big fan of GC, so I always try to find ways around it. In this case I would suggest using the Object Pool pattern:
The idea is to avoid creating new objects by storing them in a stack so you can reuse them later.
import java.util.LinkedList;

class MyPool {
    private final LinkedList<Object> stack = new LinkedList<>();

    Object getObject() {             // takes from the stack; if it's empty, creates a new one
        return stack.isEmpty() ? new Object() : stack.pop();
    }
    void returnObject(Object o) {    // adds back to the stack for later reuse
        stack.push(o);
    }
}
Object pools can provide tremendous (sometimes 10x) improvements over object allocation on the heap. But the above implementation using a linked list is both naive and wrong! The linked list creates node objects to manage its internal structure, nullifying the effort.
A ring buffer using an array of objects works well. In the example given (a chess program managing moves), the ring buffer should be wrapped in a holder object for the list of all computed moves. Only references to the move-holder object would then be passed around.
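One way to read that suggestion, sketched with a hypothetical Move class: preallocate every Move once in an array inside the holder and reuse the slots by index, so the structure itself produces no garbage per move.

final class MoveList {
    static final class Move { int from, to; }      // stand-in for the real move class

    private final Move[] moves;
    private int size;

    MoveList(int capacity) {
        moves = new Move[capacity];
        for (int i = 0; i < capacity; i++) {
            moves[i] = new Move();                 // allocated once, up front
        }
    }
    Move next()         { return moves[size++]; }  // hand out the next preallocated slot to fill
    Move get(int index) { return moves[index]; }
    int size()          { return size; }
    void clear()        { size = 0; }              // "frees" every move without touching the GC
}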

Debugging a slow Java method

VisualVM is showing me that a particular method is taking a long time to execute.
Are there any widely used strategies for looking at the performance (in regards to time) of a Java method?
My gut feeling is that the sluggish response time comes from a method that is somewhere down the call hierarchy from the one VisualVM is reporting, but I think getting some hard numbers is better than fishing around in the code based on an assumption when it comes to performance.
VisualVM should be showing you the methods which use the most CPU. If the biggest user is your method, it means the time is not spent in a method you are calling, unless you are calling many methods which individually look small but add up to more in total.
I suggest you take the difference between the total for your method and the methods it calls. That is how much your method itself adds while being profiled. Note: how much it adds when not profiled could be less, as the profiler has an overhead.
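If you want a hard number for one suspect call without the profiler attached, a crude System.nanoTime measurement (my sketch, not part of the answer above) can sanity-check what VisualVM reports:

long start = System.nanoTime();
boolean result = suspectMethod(param1, param2);            // hypothetical method under suspicion
long elapsedMicros = (System.nanoTime() - start) / 1_000;
System.out.println("suspectMethod took " + elapsedMicros + " µs");
// Repeat in a loop and average; a single measurement is very noisy.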
You need to use tools like JProfiler, YourKit, etc. You can profile your code in depth and catch exactly which method is taking the most time. You can go as deep into the call hierarchy as you want with these tools.

How to memory profile in Java?

I'm still learning the ropes of Java, so sorry if there's an obvious answer to this. I have a program that is taking a ton of memory and I want to figure out a way to reduce its usage, but after reading many SO questions I have the idea that I need to prove where the problem is before I start optimizing.
So here's what I did: I added a breakpoint at the start of my program and ran it, then I started VisualVM and had it profile the memory (I also did the same thing in NetBeans just to compare the results, and they are the same). My problem is I don't know how to read the results: the biggest area just says char[], and I can't see any code or anything (which makes sense because VisualVM connects to the JVM and can't see my source, but NetBeans also does not show me the source as it does when doing CPU profiling).
Basically what I want to know is in which variable (and hopefully more details, like in which method) all the memory is being used, so I can focus my work there. Is there an easy way to do this? Right now I am using Eclipse and Java to develop (and installed VisualVM and NetBeans specifically for profiling, but I am willing to install anything else that you feel gets this job done).
EDIT: Ideally, I'm looking for something that will take all my objects and sort them by size (so I can see which ones are hogging memory). Currently it returns generic information such as String[] or int[], but I want to know which objects it is referring to so I can work on optimizing their size.
Strings are problematic
Basically, in Java, String references (things that use char[] behind the scenes) dominate most business applications memory-wise. How they are created determines how much memory they consume in the JVM.
That is simply because they are so fundamental to most business applications as a data type, and they are one of the most memory-hungry as well. This isn't just a Java thing: string data types take up lots of memory in pretty much every language and runtime library, because at best they are arrays of 1 byte per character and at worst (Unicode) they are arrays of multiple bytes per character.
Once, when profiling CPU usage on a web app that also had an Oracle JDBC dependency, I discovered that StringBuffer.append() dominated the CPU cycles by many orders of magnitude over all other method calls combined, let alone any other single method call. The JDBC driver did lots and lots of String manipulation, kind of the trade-off of using PreparedStatements for everything.
What you are concerned about you can't control, not directly anyway
What you should focus on is what is in your control, which is making sure you don't hold on to references longer than you need to, and that you are not duplicating things unnecessarily. The garbage collection routines in Java are highly optimized, and if you learn how their algorithms work, you can make sure your program behaves in the optimal way for those algorithms.
Java Heap Memory isn't like manually managed memory in other languages, those rules don't apply
What are considered memory leaks in other languages aren't the same thing/root cause as in Java with its garbage collection system.
Most likely in Java the memory isn't consumed by one single uber-object that is leaking (a dangling reference in other environments).
It is most likely lots of smaller allocations because of StringBuffer/StringBuilder objects not being sized appropriately on first instantiation and then having to automatically grow their char[] arrays to hold subsequent append() calls.
These intermediate objects may be held around longer than expected by the garbage collector because of the scope they are in and lots of other things that can vary at run time.
EXAMPLE: the garbage collector may decide that there are candidates, but because it considers that there is plenty of memory still to be had and that it might be too expensive time-wise to flush them out at that point, it will wait until memory pressure gets higher.
The garbage collector is really good now, but it isn't magic; if you are doing degenerate things, it will not work optimally. There is lots of documentation on the internet about the garbage collector settings for all versions of the JVM.
These unreferenced objects may simply not have reached the point at which the garbage collector decides it needs to expunge them from memory, or there could be references to them held by some other object (a List, for example) that you don't realize still points to them. This is what is most commonly referred to as a leak in Java, which is more specifically a reference leak.
EXAMPLE: If you know you need to build a 4K String using a StringBuilder, create it with new StringBuilder(4096), not the default constructor, which starts at a capacity of 16 and will immediately start creating garbage that can amount to many times what you think the object should be, size-wise.
You can discover how many objects of which types are instantiated with VisualVM; this will tell you what you need to know. There isn't going to be one big flashing light that points at a single instance of a single class and says, "This is the big memory consumer!"; that is, unless there is only one instance of some char[] that you are reading some massive file into, and this is not really possible either, because lots of other classes use char[] internally; and then you would pretty much have known that already.
I don't see any mention of OutOfMemoryError
You probably don't have a problem in your code; the garbage collection system just might not be under enough pressure to kick in and deallocate objects that you think it should be cleaning up. What you think is a problem probably isn't, not unless your program is crashing with OutOfMemoryError. This isn't C, C++, Objective-C, or any other manual memory management language/runtime. You don't get to decide what is in memory or not at the level of detail you are expecting to.
In JProfiler, you can go to the heap walker and activate the biggest-objects view. You will see the objects that retain the most memory. "Retained" memory is the memory that would be freed by the garbage collector if you removed the object.
You can then open the object nodes to see the reference tree of the retained objects.
Disclaimer: My company develops JProfiler
I would recommend capturing heap dumps and using a tool like Eclipse MAT that lets you analyze them. There are many tutorials available. It provides a view of the dominator tree to provide insight into the relationships between the objects on the heap. Specifically for what you mentioned, the "path to GC roots" feature of MAT will tell you where the majority of those char[], String[] and int[] objects are being referenced. JVisualVM can also be useful in identifying leaks and allocations, particularly by using snapshots with allocation stack traces. There are quite a few walk-throughs of the process of getting the snapshots and comparing them to find the allocation point.
The Java JDK comes with JVisualVM under the bin folder. Once your application (an application server, for example) is running, you can start VisualVM and connect it to the local process, which will show you memory allocation and let you perform a heap dump.
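If you prefer the command line, the JDK's jmap tool can capture a heap dump that you can then open in VisualVM or Eclipse MAT (the pid and file name here are placeholders):

jmap -dump:live,format=b,file=heap.hprof <pid>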
If you use VisualVM to check your memory usage, it focuses on the data, not the methods. Maybe your big char[] data is caused by many String values? Unless you are using recursion, the data will not be from local variables. So you can focus on the methods that insert elements into large data structures. To find out which precise statements cause your "memory leakage", I suggest you additionally
read Josh Bloch's Effective Java, Item 6 (Eliminate obsolete object references), and
use a logging framework and log instance creations at the highest verbosity level.
There are generally two distinct approaches to analyse Java code to gain an understanding of its memory allocation profile. If you're trying to measure the impact of a specific, small section of code – say you want to compare two alternative implementations in order to decide which one gives better runtime performance – you would use a microbenchmarking tool such as JMH.
While you can pause the running program, the JVM is a sophisticated runtime that performs a variety of housekeeping tasks and it's really hard to get a "point in time" snapshot and an accurate reading of the "level of memory usage". It might allocate/free memory at a rate that does not directly reflect the behaviour of the running Java program. Similarly, performing a Java object heap dump does not fully capture the low-level machine specific memory layout that dictates the actual memory footprint, as this could depend on the machine architecture, JVM version, and other runtime factors.
Tools like JMH get around this by repeatedly running a small section of code, and observing a long-running average of memory allocations across a number of invocations. E.g. in the GC profiling sample JMH benchmark the derived *·gc.alloc.rate.norm metric gives a reasonably accurate per-invocation normalised memory cost.
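For completeness, the GC profiler can be attached when launching JMH, either with -prof gc on the command line or programmatically; a minimal sketch of the programmatic form (the benchmark class name is a placeholder):

import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class RunBenchmarks {
    public static void main(String[] args) throws Exception {
        Options options = new OptionsBuilder()
                .include("MyMethodBenchmark")        // placeholder benchmark class name
                .addProfiler(GCProfiler.class)       // reports gc.alloc.rate.norm and friends
                .build();
        new Runner(options).run();
    }
}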
In the more general case, you can attach a profiler to a running application and get JVM-level metrics, or perform a heap dump for offline analysis. Some commonly used tools for profiling full applications are Async Profiler and the newly open-sourced Java Flight Recorder in conjunction with Java Mission Control to visualise results.
