Java GC Stop and copy - java

This question is a follow up to my previous Java GC question: Java Garbage Collector clarification
This question is also referring to the same article.
I'm a little confused about why the stop-and-copy method for defragmenting object heap allocation is so commonly used. Yes, it defragments the heap, but it seems to carry a lot of overhead because you effectively cut the usable heap size in half. You also need to copy ALL the live objects whenever one half runs out of space.
Other than defragmentation is there any other fundamental reason why 'stop and copy' is better than say 'mark and sweep'?

Actually, fragmentation is fundamental, and the ability of some GC to defeat it is a considerable asset.
The stop-and-copy algorithm used to be popular in GC implementations because:
it is simple to implement;
it automatically defragments memory;
its running time is proportional to the amount of live objects, which makes it asymptotically very efficient.
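To make the last point concrete, here is a toy sketch of a copying collection. It is only a model built on ordinary Java objects, not how any real JVM lays out memory, and the names (SemispaceSketch, Obj, collect) are made up for illustration. The thing to notice is that the loop only ever touches objects reachable from the roots; garbage is never visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a stop-and-copy (semispace) collection. Hypothetical names throughout.
final class SemispaceSketch {
    /** A toy heap object: a name plus outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Copies everything reachable from roots into a fresh to-space and returns it. */
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();            // survivors end up packed together
        Map<Obj, Obj> forward = new IdentityHashMap<>();  // old object -> its copy
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forward));
        }
        // Cheney-style scan: the to-space itself serves as the work queue.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), toSpace, forward));
            }
        }
        return toSpace;
    }

    private static Obj copy(Obj old, List<Obj> toSpace, Map<Obj, Obj> forward) {
        Obj existing = forward.get(old);
        if (existing != null) return existing;  // already moved: reuse the forwarding entry
        Obj fresh = new Obj(old.name);
        fresh.refs.addAll(old.refs);            // still old pointers; fixed up during the scan
        forward.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }

    /** Builds five objects, two of them garbage, and returns how many were copied. */
    static int demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d"), e = new Obj("e");
        a.refs.add(b);
        b.refs.add(a);   // cycle a <-> b, handled by the forwarding map
        d.refs.add(a);   // d points at live data but is itself unreachable
        List<Obj> roots = new ArrayList<>(Arrays.asList(a, e));
        return collect(roots).size();
    }

    public static void main(String[] args) {
        System.out.println("objects copied: " + demo());
    }
}
```

In the demo only a, e and b are copied; c and d cost nothing, which is why the work scales with live data rather than with heap size.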
More modern GCs, including those used in Java, use much more complex strategies because they want to keep pauses short (rather than minimizing total GC time, they prefer never to stop the application for a long time, because pauses are bad for interactivity), to interact more cleanly with caches and virtual memory, and to benefit from systems with multiple CPUs.
The Jones and Lins book is a must-read for whoever wants to understand garbage collection.

A great tutorial on the garbage collector is Tuning Garbage Collection (unfortunately the new Oracle website has messed its formatting up quite a lot).
Your question is handled in chapter V. This basically explains which types of strategies you can use in the Java garbage collector and which are default. Most desktop applications will be interested in a stop that is as small as possible, because this is what the user might notice.
Note that your question is not about defragmentation. Both will eventually compress the memory space.

Related

how to predict jvm garbage collection

I'm working on a critical application written in java and it should avoid 'stop the world garbage collection' effects.
I'm looking for a solution that can predict long pauses due to full gc. is it possible?
The best thing you can do is to reduce allocations and/or use a pauseless GC like Azul's. This will make GCs easier to manage.
If you reduce allocations enough in key sections (identified using metrics e.g. a profiler like JMC/JFR) you can run all day without a full collection, or in extreme cases, all day without a minor collection.
You can monitor how full the tenured space is and see if it is filling up (there are other causes of full GC but this is the most common)
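As a rough sketch of that monitoring, the standard java.lang.management API exposes per-pool usage. The pool-name matching below is only a heuristic I am assuming (names such as "PS Old Gen", "G1 Old Gen" or "Tenured Gen" vary by collector and JVM), and the class and method names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Prints how full the old/tenured generation is. Hypothetical helper class.
final class OldGenWatch {
    /** Heuristic only: pool names differ between collectors and JVM versions. */
    static boolean looksLikeOldGen(String poolName) {
        String n = poolName.toLowerCase();
        return n.contains("old") || n.contains("tenured");
    }

    /** Returns used/max in [0, 1], or -1 when the pool reports no defined maximum. */
    static double fillRatio(MemoryUsage usage) {
        long max = usage.getMax();
        return max <= 0 ? -1.0 : (double) usage.getUsed() / max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !looksLikeOldGen(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            double ratio = fillRatio(usage);
            if (ratio < 0) {
                System.out.println(pool.getName() + ": max size undefined");
            } else {
                System.out.printf("%s: %.1f%% full%n", pool.getName(), 100 * ratio);
            }
        }
    }
}
```

Polling this periodically (or from a JMX client) shows whether the tenured space is creeping towards its limit between collections.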
Well, that's not possible at all. I think the best way to avoid that "stop the world" garbage collection is minimizing the lifetime of objects. Small runs.
BTW, you need to try different solutions and profile them.
Why not force GC regularly, with a frequency depending on memory usage, for example?

track down allocations of int[]

When viewing my remote application in JVisualVM over JMX, I see a saw-tooth of memory usage while idle:
Taking a heap dump and analysing it with JVisualVM, I see a large chunk of memory is in a few big int[] arrays which have no references and by comparing heap dumps I can see that it seems to be these that are taking the memory and being reclaimed by a GC periodically.
I am curious to track these down since it piqued my interest that my own code never knowingly allocates any int[] arrays.
I do use a lot of libs like netty so the culprit could be elsewhere. I do have other servers with much the same mix of frameworks but don't see this sawtooth there.
How can I discover who is allocating them?
Take a heap dump and find out what objects are holding them. Once you know what objects are holding the arrays, you should have an easy time figuring out what is allocating them.
It doesn't answer your question, but my question is:
Why do you care?
You've told the jvm garbage collector (GC) it can use up to 1GB of memory. Java is using less than 250M.
The GC tries to be smart about when it garbage collects and also how hard it works at garbage collection. In your graph, there is no demand for memory. The jvm isn't anywhere near that 1GB limit you set. I see no reason the GC should try very hard at all. Not sure why you would care either.
It's a good thing for the garbage collector to be lazy. The less the GC works, the more resources there are available for your application.
Have you tried triggering GC via the JVisualVM "Perform GC" button? That button should trigger a "stop the world" garbage collection operation. Try it when the graph is in the middle of one of those saw tooth ramp ups - I predict that the usage will drop to the base of the saw tooth or below. If it does, that proves that the memory saw tooth is just garbage accumulation and GC is doing the right thing.
Here is a screenshot of memory usage for a Java Swing application I use:
Notice the sawtooth pattern.
You said you are worried about int[]. When I start the memory profiler and have it profile everything I can see the allocations of int[]
Basically all allocations come from an ObjectOutputStream$HandleTable.growEntries method. It looks like the thread the allocations were made on was spun up to handle a network message.
I suspect it's caused by JMX itself. Possibly by RMI (do you use RMI?). Or the debugger (do you have a debugger connected?).
I just thought I'd add to this question that the sawtooth pattern is very much normal and has nothing necessarily to do with your int[] arrays. It happens because new allocations happen in the Eden-gen, and an ephemeral collection only is triggered once it has filled up, leaving the old-gen be. So as long as your program does any allocations at all, the Eden gen will fill up and then empty repeatedly. Especially, then, when you have a regular amount of allocations per unit of time, you'll see a very regular sawtooth pattern.
There are tons of articles on the web detailing how Hotspot's GC works, so there's no need for me to expand on that here. If you don't know at all how ephemeral collection works, you may want to check out Wikipedia's article on the subject (see the "Generational GC" section; "generational" and "ephemeral" are synonymous in this context).
As for the int[] arrays, however, they are a bit mysterious. I'm seeing those as well, and there's another question here on SO on the subject of them without any real answer. It's not actually normal for objects with no references to show up in a heap dump, because a heap dump normally only contains live objects (because Hotspot always performs a stop-the-world collection before actually dumping the heap). My personal guess is that they are allocated as part of some kind of internal JVM data-structure (and therefore only have references from the C++ part of Hotspot rather than from the Java heap), but that's really just a pure guess.

Long GC pauses in application

I am currently running an application which requires a maximum heap size of 16GB.
Currently I use the following flags to handle garbage collection.
-XX\:+UseParNewGC, -XX\:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=50, -XX\:+DisableExplicitGC, -XX\:+PrintGCDateStamps, -XX\:+PrintGCDetails, -Xloggc\:/home/user/logs/gc.log
However, I have noticed that during some garbage collections, the application locks up for a few seconds and then carries on - This is completely unacceptable as it's a game server.
An excerpt from my garbage collection logs can be found here.
Any advice on what I should change in order to reduce these long pauses would be greatly appreciated.
The chances are that the CMS GC cannot keep up with the amount of garbage your system is generating. But the work that the GC has to perform is actually more closely related to the amount of NON-garbage that your system is retaining.
So ...
Try to reduce the actual memory usage of your application; e.g. by not caching so much stuff, or reducing the size of your "world".
Try to reduce the rate at which your application generates garbage.
Upgrade to a machine with more cores so that there are more cores available to run the parallel GC threads when necessary.
To Mysticial:
Yes, in hindsight, it might have been better to implement the server in C++. However, we don't know anything about "the game". If it involves a complicated world model with complicated heterogeneous data structures, then implementing it in C++ could mean that you replace the "GC pause" problem with the problem that the server crashes all the time due to problems with the way it manages its data structures.
Looking at your logs, I don't see any long pauses. But young GC is very frequent. The promotion rate is very low, though (most garbage is cleared by young GC, as it should be). At the same time, your old space utilization is low.
BTW, are we talking about a Minecraft server?
To reduce frequency of young GC you should increase its size. I would suggest start with -XX:NewSize=8G -XX:MaxNewSize=8G
For such large young space, you should also reduce survivor space size -XX:SurvivorRatio=512
GC tuning is a path of trial and error, so you may need some more iterations and tweaking.
You can find a couple of useful articles on my blog:
HotSpot JVM GC options cheatsheet
Understanding young GC pauses in HotSpot JVM
I'm not an expert on Java garbage collection, but it looks like you're doing the right thing by using the concurrent collector (the UseConcMarkSweepGC flag), assuming the server has multiple processors. Follow the suggestions for troubleshooting at http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms. If you already have, let us know what happened when you tried them.
Which version of Java are you using? http://docs.oracle.com/javase/7/docs/technotes/guides/vm/G1.html
For better performance, try to minimize the use of instance variables in a class. It is better to work with local variables than with instance variables: this helps performance and avoids synchronization problems. If you do use instance variables, always reset them at the end of the operation, before the program exits, and set them again only when required; this helps performance further. Besides, newer versions of Java implement better garbage collection policies, so it would be worth moving to a newer version if that is feasible.
You can also monitor the garbage collector's pause time via VisualVM to get a better idea of when it is collecting most heavily.
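If you would rather read similar numbers from inside the process, the standard GarbageCollectorMXBean gives a collection count and accumulated collection time per collector. This is only a sketch with made-up class and method names, and note that for concurrent collectors the accumulated time is not the same thing as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Prints collection count and accumulated time per collector. Hypothetical helper class.
final class GcTimeReport {
    /** Average milliseconds per collection; 0 when there were none or the value is undefined (-1). */
    static double averageMillis(long count, long totalMillis) {
        return (count <= 0 || totalMillis < 0) ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long time = gc.getCollectionTime();   // accumulated elapsed ms, -1 if undefined
            System.out.printf("%s: %d collections, %d ms total, %.2f ms average%n",
                    gc.getName(), count, time, averageMillis(count, time));
        }
    }
}
```

Sampling this at intervals and diffing the values shows which collector is busy and roughly how long each cycle takes.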

How to memory profile in Java?

I'm still learning the ropes of Java, so sorry if there's an obvious answer to this. I have a program that is taking a ton of memory and I want to figure out a way to reduce its usage, but after reading many SO questions I have the idea that I need to prove where the problem is before I start optimizing it.
So here's what I did: I added a breakpoint at the start of my program and ran it, then I started VisualVM and had it profile the memory (I also did the same thing in NetBeans just to compare the results, and they are the same). My problem is that I don't know how to read them. The highest area just says char[] and I can't see any code or anything (which makes sense because VisualVM is connecting to the JVM and can't see my source, but NetBeans also does not show me the source as it does when doing CPU profiling).
Basically, what I want to know is which variable (and hopefully more details, like which method) is using all the memory, so I can focus on working there. Is there an easy way to do this? Right now I am using Eclipse and Java to develop (I installed VisualVM and NetBeans specifically for profiling, but am willing to install anything else that you feel gets this job done).
EDIT: Ideally, I'm looking for something that will take all my objects and sort them by size (so I can see which one is hogging memory). Currently it returns generic information such as String[] or int[], but I want to know which object it's referring to so I can work on getting its size more optimized.
Strings are problematic
Basically in Java, String references ( things that use char[] behind the scenes ) will dominate most business applications memory wise. How they are created determines how much memory they consume in the JVM.
This is because they are so fundamental to most business applications as a data type, and they are one of the most memory-hungry as well. This isn't just a Java thing: String data types take up lots of memory in pretty much every language and runtime library, because at the least they are arrays of 1 byte per character, or at worst (Unicode) they are arrays of multiple bytes per character.
Once when profiling CPU usage on a web app that also had an Oracle JDBC dependency I discovered that StringBuffer.append() dominated the CPU cycles by many orders of magnitude over all other method calls combined, much less any other single method call. The JDBC driver did lots and lots of String manipulation, kind of the trade off of using PreparedStatements for everything.
What you are concerned about you can't control, not directly anyway
What you should focus on is what is in your control, which is making sure you don't hold on to references longer than you need to, and that you are not duplicating things unnecessarily. The garbage collection routines in Java are highly optimized, and if you learn how their algorithms work, you can make sure your program behaves in the optimal way for those algorithms to work.
Java Heap Memory isn't like manually managed memory in other languages, those rules don't apply
What are considered memory leaks in other languages aren't the same thing/root cause as in Java with its garbage collection system.
Most likely in Java memory isn't consumed by one single uber-object that is leaking ( dangling reference in other environments ).
It is most likely lots of smaller allocations because of StringBuffer/StringBuilder objects not being sized appropriately on first instantiation and then having to automatically grow their char[] arrays to hold subsequent append() calls.
These intermediate objects may be held around longer than expected by the garbage collector because of the scope they are in and lots of other things that can vary at run time.
EXAMPLE: the garbage collector may decide that there are candidates, but because it considers that there is plenty of memory still to be had that it might be too expensive time wise to flush them out at that point in time, and it will wait until memory pressure gets higher.
The garbage collector is really good now, but it isn't magic, if you are doing degenerate things, it will cause it to not work optimally. There is lots of documentation on the internet about the garbage collector settings for all the versions of the JVMs.
These un-referenced objects may just have not reached the time that the garbage collector thinks it needs them to for them to be expunged from memory, or there could be references to them held by some other object ( List ) for example that you don't realize still points to that object. This is what is most commonly referred to as a leak in Java, which is a reference leak more specifically.
EXAMPLE: If you know you need to build a 4K String using a StringBuilder, create it with new StringBuilder(4096); not the default, which has a capacity of 16 characters and will immediately start creating garbage that can represent many times what you think the object should be size-wise.
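A small sketch makes the difference visible. It counts how often capacity() changes while appending, which on typical JDKs corresponds to the internal array being reallocated. The class and method names are made up for illustration.

```java
// Counts internal buffer growths for a default-sized vs. pre-sized StringBuilder.
// Hypothetical demo class; capacity() and append() are the standard StringBuilder API.
final class BuilderCapacity {
    /** Appends n characters and returns how many times the builder's capacity changed. */
    static int growths(StringBuilder sb, int n) {
        int growths = 0;
        int capacity = sb.capacity();
        for (int i = 0; i < n; i++) {
            sb.append('x');
            if (sb.capacity() != capacity) { // buffer was replaced; the old one is now garbage
                growths++;
                capacity = sb.capacity();
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        System.out.println("default capacity: " + new StringBuilder().capacity());
        System.out.println("growths, default builder: " + growths(new StringBuilder(), 4096));
        System.out.println("growths, pre-sized builder: " + growths(new StringBuilder(4096), 4096));
    }
}
```

The pre-sized builder never has to grow, while the default one discards a series of progressively larger intermediate arrays on the way to 4096 characters.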
You can discover how many of what types of objects are instantiated with VisualVM, this will tell you what you need to know. There isn't going to be one big flashing light that points at a single instance of a single class that says, "This is the big memory consumer!", that is unless there is only one instance of some char[] that you are reading some massive file into, and this is not possible either, because lots of other classes use char[] internally; and then you pretty much knew that already.
I don't see any mention of OutOfMemoryError
You probably don't have a problem in your code, the garbage collection system just might not be getting put under enough pressure to kick in and deallocate objects that you think it should be cleaning up. What you think is a problem probably isn't, not unless your program is crashing with OutOfMemoryError. This isn't C, C++, Objective-C, or any other manual memory management language / runtime. You don't get to decide what is in memory or not at the detail level you are expecting you should be able to.
In JProfiler, you can go to the heap walker and activate the biggest objects view. You will see the objects that retain the most memory. "Retained" memory is the memory that would be freed by the garbage collector if you removed the object.
You can then open the object nodes to see the reference tree of the retained objects. Here's a screen shot of the biggest object view:
Disclaimer: My company develops JProfiler
I would recommend capturing heap dumps and using a tool like Eclipse MAT that lets you analyze them. There are many tutorials available. It provides a view of the dominator tree to provide insight into the relationships between the objects on the heap. Specifically for what you mentioned, the "path to GC roots" feature of MAT will tell you where the majority of those char[], String[] and int[] objects are being referenced. JVisualVM can also be useful in identifying leaks and allocations, particularly by using snapshots with allocation stack traces. There are quite a few walk-throughs of the process of getting the snapshots and comparing them to find the allocation point.
The JDK comes with JVisualVM in its bin folder. Once your application (for example, an application server) is running, you can start VisualVM and connect it to your localhost; it will show you memory allocation and let you take a heap dump.
If you use visualVM to check your memory usage, it focuses on the data, not the methods. Maybe your big char[] data is caused by many String values? Unless you are using recursion, the data will not be from local variables. So you can focus on the methods that insert elements into large data structures. To find out what precise statements cause your "memory leakage", I suggest you additionally
read Josh Bloch's Effective Java Item 6: (Eliminate obsolete object references)
use a logging framework and log instance creations at the highest verbosity level.
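For readers who don't have the book to hand, the kind of problem that "eliminate obsolete object references" addresses can be sketched with a simple array-backed stack. This is my own minimal version written for illustration, not the book's listing, and the retainedSlots() helper exists only for the demonstration.

```java
import java.util.Arrays;

// Array-backed stack that clears slots on pop so popped objects can be collected.
// Hypothetical illustration class.
final class SimpleStack {
    private Object[] elements = new Object[4];
    private int size;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size);
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("empty stack");
        }
        Object result = elements[--size];
        elements[size] = null; // without this line the array keeps the popped object reachable
        return result;
    }

    /** Number of array slots still holding a reference (demonstration only). */
    int retainedSlots() {
        int count = 0;
        for (Object o : elements) {
            if (o != null) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        for (int i = 0; i < 10; i++) stack.push(new byte[1024]);
        for (int i = 0; i < 10; i++) stack.pop();
        System.out.println("slots still holding references: " + stack.retainedSlots());
    }
}
```

If the null assignment in pop() is removed, the stack looks empty to its callers while its backing array still pins every object that was ever pushed.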
There are generally two distinct approaches to analyse Java code to gain an understanding of its memory allocation profile. If you're trying to measure the impact of a specific, small section of code – say you want to compare two alternative implementations in order to decide which one gives better runtime performance – you would use a microbenchmarking tool such as JMH.
While you can pause the running program, the JVM is a sophisticated runtime that performs a variety of housekeeping tasks and it's really hard to get a "point in time" snapshot and an accurate reading of the "level of memory usage". It might allocate/free memory at a rate that does not directly reflect the behaviour of the running Java program. Similarly, performing a Java object heap dump does not fully capture the low-level machine specific memory layout that dictates the actual memory footprint, as this could depend on the machine architecture, JVM version, and other runtime factors.
Tools like JMH get around this by repeatedly running a small section of code, and observing a long-running average of memory allocations across a number of invocations. E.g. in the GC profiling sample JMH benchmark the derived *·gc.alloc.rate.norm metric gives a reasonably accurate per-invocation normalised memory cost.
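A real JMH benchmark needs the JMH dependency and its build setup, so I won't reproduce one here. As a much cruder stand-in, and explicitly not what JMH does internally, HotSpot-based JVMs expose a per-thread allocation counter through com.sun.management.ThreadMXBean. The sketch below assumes that extension is available and returns -1 when it is not; the class and method names are made up.

```java
import java.lang.management.ManagementFactory;

// Measures bytes allocated by the current thread while running a task.
// Relies on the HotSpot-specific com.sun.management.ThreadMXBean extension (an assumption).
final class AllocationProbe {
    /** Keeps the allocation reachable so the JIT cannot optimise it away. */
    static Object sink;

    /** Returns bytes allocated by the current thread during task, or -1 if unsupported. */
    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        task.run();
        return bean.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBytes(() -> sink = new byte[1 << 20]);
        System.out.println("allocated during task: " + bytes + " bytes");
    }
}
```

A single reading like this is noisy, which is exactly why JMH averages over many invocations; treat the number as a rough indication only.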
In the more general case, you can attach a profiler to a running application and get JVM-level metrics, or perform a heap dump for offline analysis. Some commonly used tools for profiling full applications are Async Profiler and the newly open-sourced Java Flight Recorder in conjunction with Java Mission Control to visualise results.

Garbage Collection in android (Done manually)

I have a strange doubt. I know the garbage collector has its own limitations, and if allocation is bad then it can cause the application to respond in unusual ways.
So my question is: is it a good programming habit to forcefully call the garbage collector (System.gc()) at the end of each activity?
Update
Everyone is saying that calling System.gc() is not beneficial at all. Then I am wondering why it's present here. The DVM will decide when to run the garbage collector. Then what is the need for that method?
Update 2
Thanks to the community for helping me out. But honestly, I learned about garbage collection's real behaviour from this link: Java Performance Optimization
It isn't a good programming habit to forcefully call the garbage collector (System.gc()) at the end of each activity.
It is useless, because only the DVM decides when collection should actually run, even if you have called it...
System.gc(), which the VM sometimes ignores at whim, is mostly useful in two cases:
you're gobbling up memory like there's no tomorrow (usually with bitmaps).
you suspect a memory leak (such as accidentally holding onto an old Context), and want to put the VM memory in a quiescent state to see if the memory usage is creeping up, for debugging.
Under nominal circumstances, one should not use it.
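For the second (debugging) case, a rough sketch of "settle the heap, then look at the number" could be as simple as the following. The class and method names are made up, and System.gc() is still only a hint, so the reading is approximate.

```java
// Reads approximate heap usage before and after a GC hint, for debugging only.
// Hypothetical helper class built on the standard Runtime API.
final class HeapSnapshot {
    /** Bytes of heap currently in use, as reported by the runtime. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection (the VM may ignore it) and then reads heap usage. */
    static long usedBytesAfterGcHint() {
        System.gc();
        return usedBytes();
    }

    public static void main(String[] args) {
        System.out.println("used before hint: " + usedBytes() + " bytes");
        System.out.println("used after hint:  " + usedBytesAfterGcHint() + " bytes");
    }
}
```

Logging the "after hint" value at the same point in repeated runs of a screen or activity shows whether usage keeps creeping up, which is the symptom of a leak worth chasing with a real profiler.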
I really think it depends on your situation.
Because the heap is generational, the GC may not get rid of certain large objects or bitmaps on its first pass, and its heuristics may not indicate that additional garbage collection is necessary, but there are definitely scenarios where the heuristic could be wrong, and we as the developers have knowledge of a pattern, or can predict usage that the GC cannot, and therefore calling system.gc() will benefit us.
I have seen this before in specific scenarios such as dealing with map tiling or other graphics-intensive behaviors, where the native GC in Android (even on 3.0+ devices) doesn't get it right, resulting in Out of Memory errors. However, by adding a few GC calls, the Out of Memory errors are prevented, and the system continues to process, albeit at a slower rate (due to garbage collection). In graphics-intensive operations, this is usually the desired state (a little lag) rather than the application crashing because it cannot load additional resources into memory.
My only explanation for why this happens in certain scenarios appears to be timing. If user operations are slow, then the native Android GC seems to do great. However, if your user is scrolling fast, or zooming quickly, this is where I have seen the Android GC lag behind, and a few well thought out System.gc() have resulted in my applications not crashing.
Calling System.gc() doesn't do any harm, but you can't be sure that it will be of any use, because you can only ask the DVM to do garbage collection; you can't command it. It depends totally on the DVM, which collects when memory is running out, or possibly at any other time.
I tried putting System.gc() on the line before the lines where I created my bitmap in my Android app. The garbage collector freed up several megabytes in some cases and put an end to my OutOfMemoryError conditions. It did not interfere with the normal garbage collection one bit, but it did make my app run faster.
No; if the system needs memory, it will call GC on its own.
Any memory used by an instance, that isn't referenced anywhere else, will become eligible for GC when the instance goes away.
Memory used by the instance itself, if no longer referenced, is also eligible for GC. You can do a code review or profiling to see if you're holding on to memory unnecessarily, but that's a different issue.
Calling GC manually is a bad coding habit...
The Developer docs on RAM usage state:
...
GC_EXPLICIT
An explicit GC, such as when you call gc() (which you should avoid calling and instead trust the GC to run when needed).
...
I've highlighted the most important and relevant part here in bold.
It is possible to ask the Android JVM to run the garbage collector by calling System.gc(). As the documentation states:
Calling the gc() method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
Emphasis added!
Some care is needed in interpreting "best effort" in the final sentence:
The "best effort" might be to ignore the "suggestion" entirely. Some JVMs have a configuration option to totally ignore System.gc() calls.
The "best effort" may or may not amount to a full garbage collection. That is an implementation detail.
But the bottom line is that you cannot force the GC to run.
Calling System.gc() is generally a bad idea. It makes your application inefficient, and it may introduce unwanted and unnecessary GC pauses.
The inefficiency issue comes down to the way that modern garbage collectors behave. A garbage collector's work has two parts [1]:
Finding the objects that are reachable.
Dealing with the objects that are not reachable.
The first part involves traversing reference chains and marking the graph of objects starting at the GC roots. This work is proportional to the number of reachable objects.
The second part can be handled in a couple of ways, but it will typically be proportional to the size of the reachable objects.
Thus the overall cost of a GC run (in CPU time) depends mostly on the amount of non-garbage. But the benefit of the work performed is the amount of space that you managed to reclaim.
To maximize efficiency, you need to run the GC when the benefit of running the GC is at its highest; i.e. when the heap is close to full. But the problem is that if you call System.gc() you may be requesting a garbage collection when there is lots of free space.
Every one is saying that calling system.gc() not beneficial at all. Then I am wondering why its present here. DVM will decide when to run garbage collector. Then what is need of that method?
It is there for largely historical reasons. The method was present in the System class in Java 1.0. Removing it now would break a lot of legacy code. As for why gc() was included in the first place, the decision was made a long, long time ago, and we were not "in the room" when it was made. My guess is that the decision makers (in ~1995):
were a bit too optimistic about how GC technology would develop,
didn't anticipate that naive programmers would try to use gc() calls to solve memory leaks and other bugs, and / or
were simply too rushed to think too hard about it.
There are also a couple of scenarios where calling System.gc() is beneficial. One such scenario is when your application is about to start a "phase" where unscheduled GC pauses are going to give a particularly bad user experience. By running System.gc() you can take the "performance hit" at a point in time where it matters less; e.g. during a user initiated pause or while switching levels in a game.
But I don't think the above scenario corresponds to your "at the end of every activity".
The final thing to note is that calling System.gc() manually does not prevent normal OOMEs. A normal OOME is typically thrown when the JVM decides there is not enough free heap space to continue. This decision is made immediately after running a (full) GC. Running System.gc() manually won't make any difference to the decision making.
Furthermore, calling System.gc() will not cure normal [2] memory leaks. If your application has a memory leak, you actually have a situation where a bunch of objects are reachable when they shouldn't be. But since they are reachable, the GC won't delete them.
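A tiny sketch of why the GC cannot help with such a leak: the object below stays strongly reachable through a static list (standing in for a forgotten cache or listener list), so an explicit GC request leaves it alone. The class and field names are made up for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Shows that a reachable ("leaked") object survives an explicit GC request.
// Hypothetical demo class.
final class ReachableLeak {
    /** Stand-in for a forgotten cache or listener list that is never cleared. */
    static final List<Object> FORGOTTEN = new ArrayList<>();

    /** Returns true if the leaked object is still present after System.gc(). */
    static boolean survivesExplicitGc() {
        Object payload = new byte[1024];
        FORGOTTEN.add(payload);                                // the accidental strong reference
        WeakReference<Object> probe = new WeakReference<>(payload);
        payload = null;                                        // the local reference is gone...
        System.gc();                                           // ...and a collection is requested
        return probe.get() != null;                            // still reachable via FORGOTTEN
    }

    public static void main(String[] args) {
        System.out.println("leaked object survived GC: " + survivesExplicitGc());
    }
}
```

The only fix is to remove the reference (here, clearing or trimming FORGOTTEN); no amount of calling System.gc() will do it for you.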
The cure for OOMEs is one or more of the following:
Find the memory leaks and fix them. There are tools to help you do this.
Modify the application to use memory more efficiently; e.g. don't keep so much data in memory, or represent it in a more compact form.
Increase the application's heap size.
1 - This is a simplification, but the full story is way too complicated for this posting. I recommend you buy and read an up-to-date book on garbage collection if you want (or need) a deeper understanding.
2 - There are cases involving non-heap memory where manually running the GC might help as a band-aid for certain kinds of OOME. But a better solution is to find a better way to reduce non-heap memory usage and/or free up non-heap resources in a more timely fashion.
