I read that garbage collection can lead to memory fragmentation problem at run-time. To solve this problem, compacting is done by the JVM where it takes all the active objects and assigns them contiguous memory.
This means that the object addresses must change from time to time? Also, if this happens,
Are the references to these objects also re-assigned?
Won't this cause significant performance issues? How does Java cope with it?
I read that garbage collection can lead to memory fragmentation problem at run-time.
This is not an exclusive problem of garbage collected heaps. When you have a manually managed heap and free memory in a different order than the preceding allocations, you may get a fragmented heap as well. And being able to have different lifetimes than the last-in-first-out order of automatic storage aka stack memory, is one of the main motivations to use the heap memory.
To solve this problem, compacting is done by the JVM where it takes all the active objects and assigns them contiguous memory.
Not necessarily all objects. Typical implementation strategies will divide the memory into logical regions and only move objects from a specific region to another, but not all existing objects at a time. These strategies may incorporate the age of the objects, like generational collectors moving objects of the young generation from the Eden space to a Survivor space, or the distribution of the remaining objects, like the “Garbage First” collector which will, like the name suggests, evacuate the fragment with the highest garbage ratio first, which implies the smallest work to get a free contiguous memory block.
This means that the object addresses must change from time to time?
Of course, yes.
Also, if this happens,
Are the references to these objects also re-assigned?
The specification does not mandate how object references are implemented. An indirect pointer may eliminate the need to adapt all references, see also this Q&A. However, for JVMs using direct pointers, this does indeed imply that these pointers need to get adapted.
Won't this cause significant performance issues? How does Java cope with it?
First, we have to consider what we gain from that. To “eliminate fragmentation” is not an end in itself. If we don’t do it, we have to scan the reachable objects for gaps between them and create a data structure maintaining this information, which we would call “free memory” then. We would also need to implement memory allocations as a search for matching chunks in this data structure, or split chunks if no exact match has been found. This is a rather expensive operation compared to an allocation from a contiguous free memory block, where we only have to bump the pointer to the next free byte by the required size.
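To make the cost difference concrete, here is a minimal sketch of both allocation styles over plain integer offsets. The class and method names (`AllocationSketch`, `BumpAllocator`, `FreeListAllocator`) are made up for illustration, and a real JVM allocator is considerably more involved (thread-local buffers, alignment, size classes).

```java
import java.util.ArrayList;
import java.util.List;

public class AllocationSketch {

    /** Contiguous free space: allocation is a bounds check plus a pointer bump. */
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0; // offset of the next free byte

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        /** Returns the offset of the new block, or -1 when a collection would be needed. */
        int allocate(int size) {
            if (size > capacity - top) {
                return -1;
            }
            int address = top;
            top += size;
            return address;
        }
    }

    /** Fragmented free space: allocation searches a list of gaps (first fit) and splits. */
    static final class FreeListAllocator {
        private final List<int[]> gaps = new ArrayList<>(); // each entry is {offset, length}

        void addGap(int offset, int length) {
            gaps.add(new int[] {offset, length});
        }

        /** Returns the offset of the new block, or -1 when no gap is large enough. */
        int allocate(int size) {
            for (int i = 0; i < gaps.size(); i++) {
                int[] gap = gaps.get(i);
                if (gap[1] >= size) {
                    int address = gap[0];
                    if (gap[1] == size) {
                        gaps.remove(i);   // exact match: the gap disappears
                    } else {
                        gap[0] += size;   // split: keep the remainder as a smaller gap
                        gap[1] -= size;
                    }
                    return address;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(64);
        System.out.println(bump.allocate(24)); // 0
        System.out.println(bump.allocate(24)); // 24

        FreeListAllocator list = new FreeListAllocator();
        list.addGap(0, 8);
        list.addGap(32, 16);
        list.addGap(100, 40);
        System.out.println(list.allocate(24)); // 100, after skipping two gaps that are too small
    }
}
```

The bump allocator does constant work per allocation, while the free-list version does work proportional to the number of gaps it has to inspect.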
Given that allocations happen much more often than garbage collection, which only runs when the memory is full (or a threshold has been crossed), this does already justify more expensive copy operations. It also implies that just using a larger heap can solve performance issues, as it reduces the number of required garbage collector runs, whereas the number of survivor objects will not scale with the memory (unreachable objects stay unreachable, regardless of how long you defer the collection). In fact, deferring the collection raises the chances that more objects became unreachable in the meanwhile. Compare also with this answer.
The costs of adapting references are not much higher than the costs of traversing references in the marking phase. In fact, non-concurrent collectors could even combine these two steps, transferring an object on first encounter and adapting subsequently encountered references, instead of marking the object. The actual copying is the more expensive aspect, but as explained above, it is reduced by not copying all objects but using certain strategies based on typical application behavior, like generational approaches or the “garbage first” strategy, to minimize the required work.
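The idea of combining traversal, copying and reference adaptation can be sketched with a toy object graph. This is only an illustration under simplifying assumptions: `Obj`, `evacuate` and the forwarding map are hypothetical names, ordinary Java objects stand in for raw memory, and a real collector would typically store the forwarding pointer in the old object's header rather than in a side table.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class CopyingSketch {

    /** A toy heap object: a name plus outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    /** Copies everything reachable from root into toSpace and returns the new root. */
    static Obj evacuate(Obj root, List<Obj> toSpace) {
        Map<Obj, Obj> forwarding = new IdentityHashMap<>(); // old object -> its copy
        Deque<Obj> pending = new ArrayDeque<>();            // copies that still hold old references
        Obj newRoot = forward(root, forwarding, toSpace, pending);
        while (!pending.isEmpty()) {
            Obj copy = pending.poll();
            for (int i = 0; i < copy.refs.size(); i++) {
                // Adapt the reference; this copies the target on first encounter.
                copy.refs.set(i, forward(copy.refs.get(i), forwarding, toSpace, pending));
            }
        }
        return newRoot;
    }

    /** Returns the copy of old, creating it on first encounter instead of marking. */
    private static Obj forward(Obj old, Map<Obj, Obj> forwarding,
                               List<Obj> toSpace, Deque<Obj> pending) {
        Obj copy = forwarding.get(old);
        if (copy == null) {
            copy = new Obj(old.name);
            copy.refs.addAll(old.refs); // still pointing at old objects, fixed up later
            forwarding.put(old, copy);
            toSpace.add(copy);
            pending.add(copy);
        }
        return copy;
    }
}
```

Unreachable objects are never visited at all, so the work is proportional to the survivors and not to the amount of garbage.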
If you move an object around the memory, its address will change. Therefore, references pointing to it will need to be updated. Memory fragmentation occurs when an object in a contiguous (in memory) sequence of objects gets deleted. This creates a hole in the memory space, which is generally bad because contiguous chunks of memory have faster access times and a higher probability of fitting in cache lines, among other things. It should be noted that the use of indirection tables can prevent reference updates up to the maximum level of indirection used.
Garbage collection has a moderate performance overhead, not just in Java but in other languages as well, such as C#. As for how Java copes with this, the strategies for performing garbage collection and minimizing its impact on performance depend on the particular JVM being used, since each JVM can implement garbage collection however it pleases; the only requirement is that it meets the JVM specification.
However, as a programmer, there are some best practices you should follow to make the best out of garbage collection and to minimize its performance hit on your application. See this, also this, this, this blog post, and this other blog post. You might want to check the JVM specs but it's a bit dense.
Related
I read the http://www.cubrid.org/blog/tags/Garbage%20Collection/ article, which gives a high-level picture of GC in Java. It says:
The compaction task is to remove memory fragmentation by compacting memory in order to remove the empty space between allocated memory areas.
Should objects be moved to other places in order to fill the holes?
I think that objects are moved. If so, that means the addresses change, so the references to those objects should also be updated?
It seems like too complicated a task to find all back references and update them...
Yes, arbitrary objects are moved arbitrarily through memory, and yes this requires updating the references to those objects. One could also use indirection but that has various downsides and I'm not aware of any high performance GC doing it.
It's indeed somewhat complicated, but as far as GC optimizations go it's rather benign. Basic mark-compact works rather well and it basically just goes over all objects in address order, moves them to the smallest available address, and builds a "break table" as it goes which contains the necessary information (start address -> displacement) for quickly fixing up references, which it then does in a second pass. None of this requires information or bookkeeping beyond what any mark-sweep collector already needs (object types, locations of references, etc).
And when you move objects out of the nursery in a generational setting, you also know (roughly) where the old references are. You needed to know that to do a minor collection.
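The two-pass scheme described above can be sketched over a toy heap of blocks identified by integer addresses. Everything here is a simplified assumption for illustration: `Block`, `compact` and the map-based break table are hypothetical, references always point at block starts, and liveness is assumed to have been computed by an earlier marking phase.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MarkCompactSketch {

    /** A toy heap block; refs holds the start addresses of the blocks it references. */
    static final class Block {
        int address;
        final int size;
        final boolean live;
        final int[] refs;

        Block(int address, int size, boolean live, int... refs) {
            this.address = address;
            this.size = size;
            this.live = live;
            this.refs = refs;
        }
    }

    /**
     * Slides live blocks toward address 0 and fixes up their references.
     * Expects a mutable list sorted by address; returns the first free address.
     */
    static int compact(List<Block> heapInAddressOrder) {
        Map<Integer, Integer> breakTable = new HashMap<>(); // old start address -> displacement
        int free = 0;

        // Pass 1: walk in address order and record how far each live block will move.
        for (Block b : heapInAddressOrder) {
            if (b.live) {
                breakTable.put(b.address, b.address - free);
                free += b.size;
            }
        }

        // Pass 2: fix up references (still old addresses), then move the block itself.
        for (Block b : heapInAddressOrder) {
            if (b.live) {
                for (int i = 0; i < b.refs.length; i++) {
                    b.refs[i] -= breakTable.get(b.refs[i]);
                }
                b.address -= breakTable.get(b.address);
            }
        }

        heapInAddressOrder.removeIf(b -> !b.live); // dead blocks are simply dropped
        return free;
    }
}
```

With live blocks at 0, 48 and 64 and dead blocks in between, the survivors end up packed at 0, 16 and 24, and every reference to the block formerly at 48 now reads 16.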
Is there a way to predict how much memory my Java program is going to take? I come from a C++ background where I implemented methods such as "size_in_bytes()" on classes and I could fairly accurately predict the runtime memory footprint of my app. Now I'm in a Java world, and that is not so easy... There are references, pools, immutable objects that are shared... but I'd still like to be able to predict my memory footprint before I look at the process size in top.
You can inspect the size of objects if you use the instrumentation API. It is a bit tricky to use -- it requires a "premain" method and extra VM parameters -- but there are plenty of examples on the web. "java instrumentation size" should find you these.
Note that the default methods will only give you a shallow size. And unless you avoid any object construction outside of the constructor (which is next to impossible), there will be dead objects around waiting to be garbage collected.
But in general, you could use these to estimate the memory requirements of your application, if you have a good control on the amount of objects generated.
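A minimal agent along those lines might look like the following. `Instrumentation.getObjectSize` and the `premain` signature are the standard `java.lang.instrument` API; the class name `SizeAgent`, the helper `shallowSizeOf` and the jar name mentioned below are placeholders chosen for this sketch.

```java
import java.lang.instrument.Instrumentation;

public class SizeAgent {

    private static volatile Instrumentation instrumentation;

    /** Called by the JVM before main() when the agent jar is passed via -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    /** Shallow size only: referenced objects are not included in the result. */
    public static long shallowSizeOf(Object o) {
        if (instrumentation == null) {
            throw new IllegalStateException("agent not loaded; start the JVM with -javaagent");
        }
        return instrumentation.getObjectSize(o);
    }
}
```

To use it, the class has to be packaged in a jar whose manifest contains `Premain-Class: SizeAgent`, and the application is then started with something like `java -javaagent:sizeagent.jar ...` (the jar name is an assumption). Without the agent, `shallowSizeOf` deliberately fails instead of returning a guess.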
You can't predict the amount of memory a program is going to take. However, you can predict how much an object will take. Edit: it turns out I'm almost completely wrong; this document describes the memory usage of objects better: http://www.javamex.com/tutorials/memory/object_memory_usage.shtml
In general, you can predict fairly closely what a given object will require. There's some overhead that is relatively fixed, plus the instance fields in the object, plus a modest amount of padding. But then object size is rounded up to an alignment boundary (typically 8 or 16 bytes, depending on the JVM and its settings), and some JVMs round up some object sizes to larger boundaries (to allow the use of standard sized pre-allocated object frames). But all this is relatively fixed for a given JVM.
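As a rough worked example of that arithmetic, the sketch below adds a header to the field sizes and rounds up to an alignment. The header size and alignment are parameters because they differ between JVMs and settings; the value 12 used in `main` is only an assumed header size for illustration, and field reordering or padding between fields is ignored.

```java
public class ObjectSizeEstimate {

    /** Rounds size up to the next multiple of alignment. */
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    /** Header plus instance fields, rounded up to the JVM's alignment boundary. */
    static long estimate(long headerBytes, long fieldBytes, long alignment) {
        return alignUp(headerBytes + fieldBytes, alignment);
    }

    public static void main(String[] args) {
        long fields = 4 + 8; // one int field and one long field
        System.out.println(estimate(12, fields, 8));  // 24
        System.out.println(estimate(12, fields, 16)); // 32
        System.out.println(estimate(12, 0, 8));       // 16, an object without fields
    }
}
```

Multiplying such per-object estimates by the expected number of instances gives a lower bound on the live data, not the process size, since garbage awaiting collection and the collector's own headroom come on top.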
What varies, of course, is the overhead required for garbage collection. A naive garbage collector requires 100% overhead (at least one free byte for every allocated byte), though certain varieties of "generational" collectors can improve on this to a degree. But how much space is required for GC is highly dependent on the workload (on most JVMs).
The other problem is that when you're running at a relatively low level of allocation (where you're only using maybe 10% of max available heap) then garbage will accumulate. It isn't actively referenced, but the bits of garbage are interspersed with your active objects, so it takes up working set. As a result, your working set tends to be roughly equal to your current overall garbage-collected heap size (plus other system overhead).
You can, of course, "throttle" the heap size so that you run at a higher % utilization, but that increases the frequency of garbage collection (and the overall cost of GC to a lesser degree).
You can use profilers to understand the constant set of objects that are always in memory. Then you should execute all the code paths to check for memory leaks. JProfiler is a good one to start with.
Is it possible to mark java objects non-collectable from gc perspective to save on gc-sweep time?
Something along the lines of http://wwwasd.web.cern.ch/wwwasd/lhc++/Objectivity/V5.2/Java/guide/jgdStorage.fm.html and specifically non-garbage-collectible containers there (non-garbage-collectable?).
The problem is that I have lots of ordinary temporary objects, but I have even bigger (several Gigs) of objects that are stored for Cache purposes. For no reason should the Java GC traverse all those Cache gigabytes trying to find anything to collect, because they contain cached data which have their own timeouts.
This way I could partition my data in a custom way into infinite-lived and normal-lived objects, and hopefully GC would be quite fast because normal objects don't live so long and amount to smaller amounts.
There are some workarounds to this problem, such as Apache DirectMemory and Commercial Terracotta BigMemory(http://terracotta.org/products/bigmemory), but a java-native solution would be nicer (I mean free and probably more reliable?). Also I want to avoid serialization overhead which means it should happen within same jvm. To my understanding DirectMemory and BigMemory operate mainly off heap which means that the objects must be serialized/deserialized to/from memory outside jvm. Simply marking non-gc regions within the jvm would seem a better solution. Using Files for cache is not an option either, it has the same unaffordable serialization/deserialization overhead - use case is a HA server with lots of data used in random (human) order and low latency needed.
Any memory the JVM manages is also garbage-collected by the JVM. And any “live” objects which are directly available to Java methods without deserialization have to live in JVM memory. Therefore in my understanding you cannot have live objects which are immune to garbage collection.
On the other hand, the usage you describe should make the generational approach to garbage collection quite efficient. If your big objects stay around for a while, they will be checked for reclamation less often. So I doubt there is much to be gained from avoiding those checks.
Is it possible to mark java objects non-collectable from gc perspective to save on gc-sweep time?
No it is not possible.
You can prevent objects from being garbage collected by keeping them reachable, but the GC will still need to trace them to check reachability on each full GC (at least).
It is simply my assumption that when the JVM is starving it begins scanning all those unnecessary objects too.
Yes. That is correct. However, unless you've got LOTS of objects that you want to be treated this way, the overhead is likely to be insignificant. (And anyway, a better idea is to give the JVM more memory ... if that is possible.)
Quite simply, for you to be able to do this, the garbage collection algorithm would need to be aware of such a flag, and take it into account when doing its work.
I'm not aware of any of the standard GC algorithms having such a flag, so for this to work you would need to write your own GC algorithm (after deciding on some feasible way to communicate this information to it).
In principle, in fact, you've already started down this track - you're deciding how garbage collection should be done rather than being happy to leave it to the JVM's GC algorithm. Is the situation you describe a measurable problem for you; something for which the existing garbage collection is insufficient, but your plan would work? Garbage collectors are extremely well-tuned, so I wouldn't be surprised if the "inefficient" default strategy is actually faster than your naively-optimal one.
(Doing manual memory management is tricky and error-prone at the best of times; managing some memory yourself while using a stock garbage collector to handle the rest seems even worse. I expect you'd run into a lot of edge cases where the GC assumes it "knows" what's happening with the whole heap, which would no longer be true. Steer clear if you can...)
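One way to check whether collection is a measurable problem in the first place is to read the counters the JVM already exposes through the standard `java.lang.management` API. The class name `GcStats` and its helper methods are made up for this sketch; `getCollectionCount()` and `getCollectionTime()` return -1 when a collector does not report the value, which is why negative values are skipped.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {

    /** Total number of collections reported by all collectors of this JVM so far. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Sampling these values before and after a representative workload shows how much time is actually spent collecting, which is a better basis for a decision than an assumption about what the collector must be doing.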
The recommended approaches would be to use either a commercial RTSJ implementation to avoid GC, or to use off-heap memory. One could also look into soft references for caches as well (they do get collected).
This is not recommended:
If for some reason you do not believe these options are sufficient, you could look into direct memory access, which is UNSAFE (part of sun.misc.Unsafe). You can use the 'theUnsafe' field to get the 'Unsafe' instance. Unsafe allows you to allocate and deallocate memory via 'allocateMemory' and 'freeMemory'. This memory is not under GC control, nor is it limited by the JVM heap size. The impact on GC/application, once you go down this route, is not guaranteed - which is why using byte buffers might be the way to go (if you're not using an RTSJ-like implementation).
Hope this helps.
Living Java objects will always be part of the GC life cycle. Said another way, marking an object as non-GC would carry the same order of overhead as having your object referenced by a root reference (a static final map, for instance).
But thinking a bit further, data put in a cache are most likely to be temporary, and would eventually be evicted. At that point you will start again to like the JVM and the GC.
If you have 100's of GBs of permanent data, you may want to rethink the architecture of your application, and try to shard and distribute your data (horizontal scalability).
Last but not least, lots of work has been done around serialization, and the overhead of serialization (I'm not speaking about the poor reputation of ObjectInputStream and ObjectOutputStream) is not that big.
More than that, if your data is mainly composed of primitive types (including byte arrays), there are efficient ways to readInt() or readBytes() from off-heap buffers (for instance netty.io's ChannelBuffer). This could be a way to go.
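As an illustration of reading primitives straight from off-heap memory, the sketch below uses the JDK's own `java.nio.ByteBuffer.allocateDirect` rather than netty's ChannelBuffer. The record layout and the names `OffHeapRecord`, `write`, `readId`, `readTimestamp` and `readPayload` are invented for this example.

```java
import java.nio.ByteBuffer;

public class OffHeapRecord {

    // Assumed record layout: int id | long timestamp | int payloadLength | payload bytes
    private static final int ID = 0;
    private static final int TIMESTAMP = 4;
    private static final int LENGTH = 12;
    private static final int PAYLOAD = 16;

    /** Writes one record at offset and returns the offset just past it. */
    static int write(ByteBuffer buf, int offset, int id, long timestamp, byte[] payload) {
        buf.putInt(offset + ID, id);
        buf.putLong(offset + TIMESTAMP, timestamp);
        buf.putInt(offset + LENGTH, payload.length);
        for (int i = 0; i < payload.length; i++) {
            buf.put(offset + PAYLOAD + i, payload[i]);
        }
        return offset + PAYLOAD + payload.length;
    }

    /** Reads a single field without materializing the whole record as an object. */
    static int readId(ByteBuffer buf, int offset) {
        return buf.getInt(offset + ID);
    }

    static long readTimestamp(ByteBuffer buf, int offset) {
        return buf.getLong(offset + TIMESTAMP);
    }

    /** Copies the payload back onto the heap; only needed when the bytes are actually used. */
    static byte[] readPayload(ByteBuffer buf, int offset) {
        byte[] payload = new byte[buf.getInt(offset + LENGTH)];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = buf.get(offset + PAYLOAD + i);
        }
        return payload;
    }

    public static void main(String[] args) {
        ByteBuffer cache = ByteBuffer.allocateDirect(1024); // memory outside the Java heap
        int next = write(cache, 0, 7, 123L, new byte[] {1, 2, 3});
        System.out.println(readId(cache, 0) + " " + readTimestamp(cache, 0) + " " + next);
    }
}
```

The collector only sees the small `ByteBuffer` object, not the cached records, and individual fields can be read with absolute `getInt`/`getLong` calls without a general-purpose serialization framework.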
I have always wondered why the garbage collector in Java activates whenever it feels like it rather than do:
if(obj.refCount == 0)
{
delete obj;
}
Are there any big advantages to how Java does it that I overlooked?
Thanks
Each JVM is different, but the HotSpot JVM does not primarily rely on reference counting as a means for garbage collection. Reference counting has the advantage of being simple to implement, but it is inherently error-prone. In particular, if you have a reference cycle (a set of objects that all refer to one another in a cycle), then reference counting will not correctly reclaim those objects because they all have nonzero reference counts. This forces you to use an auxiliary garbage collector from time to time, which tends to be slower (Mozilla Firefox had this exact problem, and IIRC their solution was to add in a garbage collector on top of reference counting). This is why, for example, languages like C++ tend to have a combination of shared_ptrs that use reference counting and weak_ptrs that don't contribute to the reference count and can therefore be used to break cycles.
Additionally, associating a reference count with each object makes the cost of assigning a reference greater than normal, because of the extra bookkeeping involved in adjusting the reference count (which only gets worse in the presence of multithreading). Furthermore, using reference counting precludes the use of certain types of fast memory allocators, which can be a problem. It also tends to lead to heap fragmentation in its naive form, since objects are scattered through memory rather than tightly packed, increasing allocation times and causing poor locality.
The HotSpot JVM uses a variety of different techniques to do garbage collection, but its primary garbage collector is called a stop-and-copy collector. This collector works by allocating objects contiguously in memory next to one another, and allows for extremely fast (one or two assembly instructions) allocation of new objects. When space runs out, all of the new objects are GC'ed simultaneously, which usually kills off most of the new objects that were constructed. As a result, the GC is much faster than a typical reference-counting implementation, and ends up having better locality and better performance.
For a comparison of techniques in garbage collecting, along with a quick overview of how the GC in HotSpot works, you may want to check out these lecture slides from a compilers course that I taught last summer. You may also want to look at the HotSpot garbage collection white paper that goes into much more detail about how the garbage collector works, including ways of tuning the collector on an application-by-application basis.
Hope this helps!
Reference counting has the following limitations:
It is VERY bad for multithreading performance (basically, every assignment of an object reference must be protected).
You cannot free cycles automatically
Because it doesn't work strictly based on reference counting.
Consider circular references which are no longer reachable from the "root" of the application.
For example:
APP has a reference to SOME_SCREEN
SOME_SCREEN has a reference to SOME_CHILD
SOME_CHILD has a reference to SOME_SCREEN
now, APP drops its reference to SOME_SCREEN.
In this case, SOME_SCREEN still has a reference to SOME_CHILD, and SOME_CHILD still has a reference to SOME_SCREEN - so, in this case, your example doesn't work.
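The APP / SOME_SCREEN / SOME_CHILD scenario can be replayed with a toy model that keeps explicit counters. Java itself has no such counters; `Node`, `addRef`, `dropRef` and `reachableFrom` are hypothetical helpers written only to contrast counting with tracing from a root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class RefCountCycle {

    /** A toy object with outgoing references and a manually maintained counter. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        int refCount = 0;

        Node(String name) {
            this.name = name;
        }
    }

    static void addRef(Node from, Node to) {
        from.refs.add(to);
        to.refCount++;
    }

    static void dropRef(Node from, Node to) {
        if (from.refs.remove(to)) {
            to.refCount--;
        }
    }

    /** What a tracing collector computes: everything reachable from the root. */
    static Set<Node> reachableFrom(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (seen.add(n)) {
                for (Node r : n.refs) {
                    stack.push(r);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node app = new Node("APP");
        Node screen = new Node("SOME_SCREEN");
        Node child = new Node("SOME_CHILD");
        addRef(app, screen);
        addRef(screen, child);
        addRef(child, screen);

        dropRef(app, screen);

        // Both counters stay at 1, so "delete when refCount == 0" never fires ...
        System.out.println(screen.refCount + " " + child.refCount); // 1 1
        // ... although tracing from APP no longer finds either object.
        System.out.println(reachableFrom(app).size()); // 1
    }
}
```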
Now, others (Apple with ARC, Microsoft with COM, many others) have solutions for this and work more similarly to how you describe it.
With ARC you have to annotate your references with keywords like strong and weak to let ARC know how to deal with these references (and avoid circular references)... (don't read too far into my specific example with ARC because ARC handles these things ahead-of-time during the compilation process and doesn't require a specific runtime per-se) so it can definitely be done similarly to how you describe it, but it's just not workable with some of the features of Java. I also believe COM works more similarly to how you describe... but again, that's not free of some amount of consideration on the developer's part.
In fact, no "simple" reference counting scheme would ever be workable without some amount of thought by the application developer (to avoid circular references, etc)
Because the garbage collector in modern JVMs no longer tracks reference counts. That algorithm is used to teach how GC works, but it was both resource-consuming and error-prone (e.g. cyclic dependencies).
Because the garbage collector in Java is based on a copying collector for 'young generation' objects, and mark and sweep for 'tenured generation' objects.
Resources from: http://java.sun.com/docs/hotspot/gc1.4.2/faq.html
My app is allocating a ton of objects (>1mln per second; most objects are byte arrays of size ~80-100 and strings of the same size) and I think it might be the source of its poor performance.
The app's working set is only tens of megabytes. Profiling the app shows that GC time is negligibly small.
However, I suspect that perhaps the allocation procedure depends on which GC is being used, and some settings might make allocation faster or perhaps make a positive influence on cache hit rate, etc.
Is that so? Or is allocation performance independent of GC settings under the assumption that garbage collection itself takes little time?
Of course your performance depends on the allocator used. But you have profiled GC and saw that it is not much of an issue. Also, one of the strengths of the GC is fast allocation at the expense of slower collection.
I think you are having issues with the resulting fragmentation, which makes the memory access pattern problematic for the CPU, since it may need to invalidate its cache too often. Most GC algorithms don't reclaim space in an optimum way.
Since your working set is limited and predictable, you might want to use an object pool which is allocated beforehand. You may also want to use reference counting to avoid much of the manual memory management. Technically it is still GC but not in the common sense of the GC.
Still, I don't think the performance is much affected by how you manage memory but how you actually use, access it. Most likely your profiler has the definite answer.
There are two distinct aspects to object allocation. The first is finding a suitable area of memory - with today's generational garbage collectors, this is usually very fast (in the order of a few tens of machine cycles).
The second is the initialization of the objects you allocate. Since everything you allocate in Java is initialized, the cost for initialization can easily outweigh the cost of allocation (except for the most simple, smallest objects). There is more. Since initialization requires writing the entire memory area the new object occupies (if you allocate a "new byte[1<<20]" for example, the entire megabyte needs to be set to zeros), this also usually pulls that memory into the CPU's cache, evicting other, older cache lines (which may or may not belong to your current "hot" working set).
If you do comparatively little processing on each of your arrays, those effects can severely affect the performance of your code. This can be partially avoided by re-using the same arrays over and over, but it usually makes the program logic more complex. It is also often not easy to determine if cache thrashing is really the culprit. It's impossible to say from what little information is given in your question.
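A small sketch of the array re-use idea follows. The class `BufferReuse` and its methods are made up for illustration, the "processing" is a trivial checksum, and whether re-use actually helps in a given application has to be measured.

```java
public class BufferReuse {

    /** Stand-in for filling the array with real data for round r. */
    static void fill(byte[] buf, int r) {
        for (int i = 0; i < buf.length; i++) {
            buf[i] = (byte) (r + i);
        }
    }

    /** Stand-in for the comparatively little processing done on each array. */
    static long checksum(byte[] buf) {
        long sum = 0;
        for (byte b : buf) {
            sum += b;
        }
        return sum;
    }

    /** Allocates (and therefore zeroes) a fresh array in every round. */
    static long sumFresh(int rounds, int size) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            byte[] buf = new byte[size];
            fill(buf, r);
            total += checksum(buf);
        }
        return total;
    }

    /** Allocates once and overwrites the same array, which stays in the same memory. */
    static long sumReused(int rounds, int size) {
        long total = 0;
        byte[] buf = new byte[size];
        for (int r = 0; r < rounds; r++) {
            fill(buf, r); // must overwrite every byte, since nothing zeroes the array again
            total += checksum(buf);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumFresh(1_000, 100) == sumReused(1_000, 100)); // true
    }
}
```

The added complexity mentioned above shows up in the comment inside `sumReused`: a re-used buffer still contains the previous round's data, so the code must guarantee that stale bytes are never read.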
Does your VM try to pool strings? I heard once that IBM's VM did something like string interning but dynamically (no idea if it's true). Perhaps your VM is doing extra work to build an internal data structure of String internals.
Are you doing something like byte b[] = new byte[100]; String s = new String(b); by any chance? You might try not to allocate the String objects, and instead allocate some random object which has a reference to the byte[] (for comparison).