This is a GC diagram from VisualVM for a simple application that listens for an incoming stream of data through a websocket... At the start it creates a lot of garbage, but as you can see it gets better over time... Is this the JIT somehow figuring out how to avoid creating objects?
There are some very specific cases where the JIT can remove allocations and therefore reduce the pressure on the GC, mainly through escape analysis. Basically, if an object lives only within one method and never leaves it, it can be allocated on the stack instead of the heap (or eliminated entirely), reducing the work of the garbage collector.
If you want to know for sure, you can disable escape analysis with the command-line argument -XX:-DoEscapeAnalysis and see if the graph changes.
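For intuition, this is the shape of code where escape analysis applies (a hypothetical Point class, not from your application; HotSpot actually implements this as scalar replacement rather than literal stack allocation):

final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// 'p' never leaves distSq(), so the JIT can elide the heap allocation.
static int distSq(int x, int y) {
    Point p = new Point(x, y);
    return p.x * p.x + p.y * p.y;
}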
However, there are many other self-tuning mechanisms at work. For example, the runtime notices that you don't need as much memory and therefore starts to reduce the heap size. Your graph would match that: as most of the memory can always be freed, the memory system shrinks the heap, resulting in more frequent but smaller GCs.
Related
The Task
Allocate X=4..8MB as a byte array (on the heap), e.g. using ByteBuffer.allocate(), such that it will not cause an OutOfMemoryError. It is not allowed to split the array and process it in smaller portions. Note that the allocation happens on the heap; this is not a direct ByteBuffer.
The Challenges
Memory can be fragmented, so even if there is enough free memory in total (greater than X), a contiguous region of X bytes may still be unavailable for the array (any API to find out whether a contiguous region of X bytes is available would probably help).
Heap memory is divided into regions that keep objects of different generations, and an object cannot span two or more regions of the heap; see Huge arrays throws out of memory despite enough memory available and Large Array allocation across young and tenured portions of java Heap.
Large objects are immediately allocated in a tenured region, but it is tricky to reliably reason about which region exactly, even using ManagementFactory.getMemoryPoolMXBeans() (see how can I know size of each generation in java heap with jmx). Some JVMs also dynamically adjust their large object areas (LOAs): https://www.ibm.com/docs/en/sdk-java-technology/8?topic=SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/mm_allocation_loa.html
Question
Is there a way in Java to write code like the following?
if (<I can reliably allocate an array sized X bytes on heap right now>) {
    ByteBuffer.allocate(X);
}
There’s a fundamental problem with the idea to do
if (<I can reliably allocate an array sized X bytes on heap right now>) {
    ByteBuffer.allocate(X);
}
known as the “check-then-act” anti-pattern. Regardless of how the check in the if’s condition is supposed to work, you need to ensure that its result doesn’t change between the check and the subsequent action, i.e. the allocation.
To ensure that the result doesn’t change, you’d not only need to stop all other threads of the same JVM from performing allocations (or concurrent garbage collection from completing), but also prevent all other processes on the same machine from allocating memory, as it is possible that the operating system did not reserve memory for your JVM exclusively and still allows other processes to take it right at this point.
The condition itself has the challenges already named in your question, and, as you said yourself, all this fiddling with implementation-specific memory regions might be moot when the JVM is capable of reconfiguring them on the fly. Since this reconfiguration is usually done in response to the result of a garbage collection, you’d need to perform a full garbage collection first to determine the resulting situation, and even then you could only be sure that another GC won’t change the situation if you were able to stop all other threads and processes from allocating.
And on some JVMs, the only way to reliably trigger a garbage collection is to perform an actual allocation.
So you need a way to atomically perform the check, followed by an actual allocation that ensures that the memory stays available to you no matter what happens in the environment, or an answer that the memory is not available. This mechanism does exist: just call ByteBuffer.allocate(X); if it completes normally, the returned reference ensures that the memory stays available as long as you keep it, and otherwise the thrown OutOfMemoryError signals the unavailability of the memory. Since this mechanism exists, there is no reason to provide a second one with the same outcome.
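In other words (tryAllocate is my name for it, not an existing API), the whole thing collapses to a sketch like:

import java.nio.ByteBuffer;

// "Check and act" in one atomic step: the allocation itself is the check.
static ByteBuffer tryAllocate(int x) {
    try {
        return ByteBuffer.allocate(x);   // heap-backed byte[] of size x
    } catch (OutOfMemoryError e) {
        return null;                     // the memory is not available right now
    }
}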
No, there is no reliable way to do this in Java.
There are several ways to get estimates or best-effort guesses for the available memory, but nothing reliable. Also note that even if there were such a thing, another thread could change the available amount between the condition and the call to allocate.
This related answer contains a way to get such an estimate, and also explains some of the reasons why it cannot be reliable.
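For reference, this is the kind of best-effort estimate meant here; every value can be stale the instant it is read, so treat it as a hint, never a guarantee:

long X = 8 * 1024 * 1024;                       // the X from the question
Runtime rt = Runtime.getRuntime();
long used = rt.totalMemory() - rt.freeMemory(); // committed minus free
long headroom = rt.maxMemory() - used;          // an upper bound, not a promise
boolean mightFit = headroom > X;                // still racy, as noted above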
I have troubles with Java memory consumption.
I'd like to say to Java something like this: "you have 8GB of memory, please use it, and only it. Only if you really can't put all your resources in this memory pool, then fail with OOM".
I know there are standard parameters like -Xmx, but they limit only the heap. There are also plenty of other parameters, I know. The problems with these parameters are:
They aren't relevant. I don't want to limit the heap size to 6GB (and trust that native memory won't take more than 2GB). I do want to limit all the memory (heap, native, whatever), and do that effectively, not just by saying "-Xmx1GB" to be safe.
There are too many different parameters related to memory, and I don't know how to configure all of them to achieve the goal.
So, I don't want to go there and worry about heap, perm, and whatever other types of memory. My high-level expectation is: since there is only 8GB, and some static memory is needed, take the static memory from the 8GB and carefully split the remaining memory between the other dynamic memory consumers.
Also, ulimit and similar things don't work: I don't want to kill the Java process once it consumes more memory than expected. I want Java to do its best not to reach the limit in the first place, and to kill the process only if it really, really can't stay under it.
And I'm OK to define even 100 java parameters, why not. :) But then I need assistance with the full list of needed parameters (for, say, Java 8).
Have you tried -XX:MaxMetaspaceSize?
Is this what you need?
Please, read this article: http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/
Keep in mind that this only applies to Java 8 and later (earlier versions used PermGen instead of Metaspace).
AFAIK, there is no java command line parameter or set of parameters that will do that.
Your best bet (IMO) is to set the max heap size and the max metaspace size and hope that other things are going to be pretty static / predictable for your application. (It won't cover the size of the JVM binary and it probably won't cover native libraries, memory mapped files, stacks and so on.)
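For illustration (the numbers are arbitrary, myapp.jar is a placeholder, and this assumes a HotSpot JVM on Java 8 or later), that approach amounts to something like:

java -Xmx6g \
     -XX:MaxMetaspaceSize=256m \
     -XX:MaxDirectMemorySize=512m \
     -Xss1m \
     -jar myapp.jar

Total stack usage is roughly -Xss times the number of live threads, and none of these flags bounds what native libraries allocate via JNI, which is why the sum still isn't a hard cap.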
In a comment you said:
So I'm forced to have a significant amount of memory unused to be safe.
I think you are worrying about the wrong thing here. Assuming that you are not constrained by address space or swap space limitations, memory that is never used doesn't matter.
If a page of your address space is not used, the OS will (in the long term) swap it out, and give the physical RAM page to something else.
Pages in the heap won't be in that situation in a typical Java application. (Address space pages will cycle between in-use and free as the GC moves objects within and between "spaces".)
However, the flip-side is that a GC needs the total heap size to be significantly larger than the sum of the live objects. If too much of the heap is occupied with reachable objects, the interval between garbage collection runs decreases, and your GC ergonomics suffer. In the worst case, a JVM can grind to a halt as the time spent in the GC tends to 100%. Ugly. The GC overhead limit mechanism prevents this, but that just means that your JVM gets an OOME sooner.
So, in the normal heap case, a better way to think about it is that you need to keep a portion of memory "unused" so that the GC can operate efficiently.
I have a repeating process that:
gets some data from the database
builds some objects in memory, adding to a Collection
writes the data from the Collection to a file
All of the objects/Collections go out of scope or are set to null after each iteration. (The Collection is reused for each iteration.)
Using Java VisualVM, I see a graph that looks like this, which seems very odd considering that it's a repeating process. Yes, the data coming back from the database is different, but it's generally the same amount.
Why does the heap size decrease at first?
Why does the used heap get so close to the heap size in the middle?
(the ~30-second blip at 1:43 was just when VisualVM froze momentarily)
I'm not as big an expert on GC as some are, but the general idea is that when you start the program, you give it the initial heap size, the max heap size and other relevant parameters, and then it's go time.
However, the GC has plenty of intelligence, and different algorithms are optimized for different kinds of tasks. A naive implementation would just keep the heap size static and collect the garbage when the heap fills up. That's known as a "stop the world" collection, because the collector needs to stop everything so it can perform a little (or big) cleanup.
A modern GC doesn't just impose long pauses on a running program whenever it needs to clean up, so there's always a little cleanup going on, as seen in the sawtooth. But when you start a program, the GC has no idea what the program is going to do or how it will use memory. Therefore it has to observe what's happening, analyze the memory usage, and then decide how much memory it needs to keep available for immediate use and whether it should grow or shrink the current heap.
Depending on the behaviour of your program and the GC algorithm being used you can see a lot of different patterns. As long as you're not experiencing linear growth that ends up in an OutOfMemoryError, you should be relatively safe. Unless of course you want to optimize what's happening to increase throughput, responsiveness etc., but that's a more advanced subject and is more relevant when you've gotten your code working the way you want it.
When viewing my remote application in JVisualVM over JMX, I see a saw-tooth of memory usage while idle:
Taking a heap dump and analysing it with JVisualVM, I see that a large chunk of memory is held in a few big int[] arrays which have no references, and by comparing heap dumps I can see that it is these that are taking the memory and being reclaimed by a GC periodically.
I am curious to track these down since it piqued my interest that my own code never knowingly allocates any int[] arrays.
I do use a lot of libs like Netty, so the culprit could be elsewhere. I do have other servers with much the same mix of frameworks, but I don't see this sawtooth there.
How can I discover who is allocating them?
Take a heap dump and find out what objects are holding them. Once you know what objects are holding the arrays, you should have an easier time figuring out what is allocating them.
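If you'd rather do it outside VisualVM, a live-objects-only dump can be taken from the command line with jmap, which ships with the JDK (the pid and file name are placeholders):

jmap -dump:live,format=b,file=heap.hprof <pid>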
It doesn't answer your question, but my question is:
Why do you care?
You've told the JVM garbage collector (GC) that it can use up to 1GB of memory. Java is using less than 250MB.
The GC tries to be smart about when it garbage collects and also how hard it works at garbage collection. In your graph, there is no demand for memory. The JVM isn't anywhere near that 1GB limit you set, so I see no reason the GC should try very hard at all. Not sure why you would care either.
It's a good thing for the garbage collector to be lazy. The less the GC works, the more resources are available for your application.
Have you tried triggering a GC via the JVisualVM "Perform GC" button? That button should trigger a "stop the world" garbage collection operation. Try it when the graph is in the middle of one of those sawtooth ramp-ups; I predict that the usage will drop to the base of the sawtooth or below. If it does, that proves the memory sawtooth is just garbage accumulating, and the GC is doing the right thing.
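If you want to script that experiment instead of clicking the button, jcmd (also shipped with the JDK) can trigger the same full collection; <pid> is a placeholder:

jcmd <pid> GC.run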
Here is a screenshot of memory usage for a Java Swing application I use:
Notice the sawtooth pattern.
You said you are worried about int[]. When I start the memory profiler and have it profile everything, I can see the allocations of int[].
Basically all allocations come from an ObjectOutputStream$HandleTable.growEntries method. It looks like the thread the allocations were made on was spun up to handle a network message.
I suspect it's caused by JMX itself. Possibly by RMI (do you use RMI?). Or by the debugger (do you have a debugger connected?).
I just thought I'd add to this question that the sawtooth pattern is very much normal and has nothing necessarily to do with your int[] arrays. It happens because new allocations go into the Eden gen, and an ephemeral collection is only triggered once it has filled up, leaving the old gen alone. So as long as your program does any allocations at all, the Eden gen will fill up and then empty repeatedly. In particular, when you have a regular amount of allocation per unit of time, you'll see a very regular sawtooth pattern.
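You can reproduce the pattern with nothing but a steady allocation loop (a minimal sketch; the volatile sink merely stops the JIT from eliminating the allocation):

public class SawtoothDemo {
    static volatile Object sink;

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            sink = new byte[1024 * 1024]; // ~100 MB/s of short-lived garbage
            Thread.sleep(10);             // all of it dies young, in Eden
        }
    }
}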
There are tons of articles on the web detailing how Hotspot's GC works, so there's no need for me to expand on that here. If you don't know at all how ephemeral collection works, you may want to check out Wikipedia's article on the subject (see the "Generational GC" section; "generational" and "ephemeral" are synonymous in this context).
As for the int[] arrays, however, they are a bit mysterious. I'm seeing those as well, and there's another question here on SO on the subject of them without any real answer. It's not actually normal for objects with no references to show up in a heap dump, because a heap dump normally only contains live objects (because Hotspot always performs a stop-the-world collection before actually dumping the heap). My personal guess is that they are allocated as part of some kind of internal JVM data-structure (and therefore only have references from the C++ part of Hotspot rather than from the Java heap), but that's really just a pure guess.
I have developed a J2ME web browser application, and it is working fine. I am testing its memory consumption, and it seems to me that it has a memory leak: the green curve that represents consumed memory in the memory monitor (of the Wireless Toolkit) reaches the maximum allocated memory (687768 bytes) every 7 requests made by the browser (i.e. when the end user navigates from one page to another for 7 pages); after that, the garbage collector runs and frees the allocated memory.
My question is:
is it a memory leak when the garbage collector runs automatically every 7 page navigations?
Do I need to run the garbage collector (System.gc()) manually once per request to prevent the maximum allocated memory from being reached?
Please guide me, thanks
To determine if it is a memory leak, you would need to observe it more.
From your description, i.e. that once the maximum memory is reached, the GC kicks in and is able to free memory for your application to run, it does not sound like there is a leak.
Also, you should not call the GC yourself, since:
System.gc() is only a hint to the JVM
it could potentially interfere with the underlying algorithm and hurt its performance
You should instead focus on why your application needs so much memory in such a short period.
My question is: is it a memory leak when the garbage collector runs automatically every 7 page navigations?
Not necessarily. It could also be that:
your heap is too small for the size of problem you are trying to solve, or
your application is generating (collectable) garbage at a high rate.
In fact, given the numbers you have presented, I'm inclined to think that this is primarily a heap size issue. If the interval between GC runs decreased over time, then THAT would be evidence that pointed to a memory leak, but if the rate stays steady on average, then it would suggest that the rate of memory usage and reclamation are in balance; i.e. no leak.
Do I need to run the garbage collector (System.gc()) manually once per request to prevent the maximum allocated memory from being reached?
No. No. No.
Calling System.gc() won't cure a memory leak. If it is a real memory leak, then calling System.gc() will not reclaim the leaked memory. In fact, all you will do is make your application RUN A LOT SLOWER ... assuming that the JVM doesn't ignore the call entirely.
Direct and indirect evidence that the default behaviour of HotSpot JVMs is to honour System.gc() calls:
"For example, the default setting for the DisableExplicitGC option causes JVM to honor Explicit garbage collection requests." - http://pic.dhe.ibm.com/infocenter/wasinfo/v7r0/topic/com.ibm.websphere.express.doc/info/exp/ae/rprf_hotspot_parms.html
"When JMX is enabled in this way, some JVMs (such as Sun's) that do distributed garbage collection will periodically invoke System.gc, causing a Full GC." - http://static.springsource.com/projects/tc-server/2.0/getting-started/html/ch11s07.html
"It is best to disable explicit GC by using the flag -XX:+DisableExplicitGC." - http://docs.oracle.com/cd/E19396-01/819-0084/pt_tuningjava.html
And from the Java 7 source code:
./openjdk/hotspot/src/share/vm/runtime/globals.hpp
product(bool, DisableExplicitGC, false,                        \
        "Tells whether calling System.gc() does a full GC")    \
where false is the default value of the option. (And note that this is in the OS- and machine-independent part of the code tree.)
I wrote a library that makes a good effort to force the GC. As mentioned before, System.gc() is asynchronous and won't do anything by itself. You may want to use this library to profile your application and find the spots where too much garbage is being produced. You can read more about it in this article where I describe the GC problem in detail.
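I can't speak for the library's internals here, but a common best-effort way to "force" a GC looks roughly like this sketch (not the library's actual code): allocate a weakly referenced marker and loop until the reference is observed cleared, which implies a collection actually ran.

import java.lang.ref.WeakReference;

static void bestEffortGc() throws InterruptedException {
    Object marker = new Object();
    WeakReference<Object> ref = new WeakReference<>(marker);
    marker = null;                 // drop the only strong reference
    while (ref.get() != null) {    // cleared only after a collection ran
        System.gc();               // just a hint, hence the loop
        Thread.sleep(10);
    }
}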
That is (semi) normal behavior. Available (unreferenced) storage is not collected until the size of the heap reaches some threshold, triggering a collection cycle.
You can reduce the frequency of GC cycles by being a bit more "heap aware". E.g., a common error in many programs is to parse a string by using substring to not only parse off the left-most word, but also shorten the remaining string by substringing to the right. Creating a new String for each word is not easily avoided, but one can easily avoid repeatedly substringing the "tail" of the original string.
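For instance (process() is a hypothetical stand-in for whatever you do with each word):

// Heap-unfriendly: every pass allocates a brand new "tail" string.
static void parseWasteful(String s) {
    while (!s.isEmpty()) {
        int sp = s.indexOf(' ');
        process(sp < 0 ? s : s.substring(0, sp));
        s = sp < 0 ? "" : s.substring(sp + 1); // avoidable allocation
    }
}

// Heap-aware: walk an index through the original string instead.
static void parseFrugal(String s) {
    for (int pos = 0; pos < s.length(); ) {
        int sp = s.indexOf(' ', pos);
        int end = sp < 0 ? s.length() : sp;
        process(s.substring(pos, end)); // one String per word is unavoidable
        pos = end + 1;
    }
}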
Running System.gc() will accomplish little; the JVM treats it as no more than a hint (and it can be disabled outright), since it's so commonly abused.
Note that (outside of brain-dead Android) you can't have a true "memory leak" in Java (unless there's a serious JVM bug). What's commonly referred to as a "leak" in Java is the failure to remove all references to objects that will never be used again. E.g., you might keep putting data into a chain and never clear the pointers to the stuff at the far end of the chain, even though it is no longer going to be used. The resulting symptom is that the MINIMUM heap used (i.e., the size immediately after a GC runs) keeps rising each cycle.
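A textbook example of such a "leak", using a hypothetical cache class:

import java.util.ArrayList;
import java.util.List;

class LeakyCache {
    private static final List<byte[]> ENTRIES = new ArrayList<>();

    static void remember(byte[] payload) {
        ENTRIES.add(payload); // never evicted, so the GC can never reclaim it;
                              // the post-GC heap floor rises a little each cycle
    }
}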
Adding to the other excellent answers:
It looks like you are confusing a memory leak with garbage collection.
A memory leak is when unused memory cannot be garbage collected because the program still holds references to it somewhere (even though they're not used for anything).
Garbage collection is when a piece of software (the garbage collector) frees unreferenced memory automatically.
You should not call the garbage collector manually because that would affect its performance.