How to get the memory address of the Java heap? - java

How can I determine the address in memory of the Java heap for a JVM running in the current process? That is, get a void* pointer or equivalent to the contiguous area of memory that the JVM has allocated for the heap, using Java, C, or other calls?
Matlab has a JVM embedded in its process. The memory the JVM allocates is unavailable for Matlab arrays, and of this, the heap is important, because it takes a big contiguous chunk of memory and never shrinks, and Matlab also needs contiguous memory for its arrays. If the heap is reallocated during expansion, that could cause fragmentation.
I'd like to instrument my process to examine the interaction between the Java heap and Matlab's view of memory, and to find out when it moves due to resizing, preferably all from within the process. This needs the address of the heap. It's easy to find the heap size from java.lang.Runtime, but not its address in memory. How can this be done?
I'm running Sun's JRE 1.6.0_04 in a Matlab R2008b process on Windows XP and Server 2003. I realize this probably needs to be a vendor-specific technique. The process runs code we've written, so we can use custom Java, Matlab, JNI, and C/C++ code. Java method calls or supported hooks in the JVM would be preferred to low-level hackery.
EDIT: The goal of this is to examine the interaction between the JVM's GC and Matlab's GC. I have no need to see into the Java heap and won't be reading anything from that memory; I just want to see where it is in the context of the overall virtual memory space that Matlab's GC is also trying to fit data into.
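For reference, the size side of this (which the question says is the easy part) can be read with java.lang.Runtime roughly as below. This is only a minimal sketch; the class name HeapSizeReport is made up for illustration, and none of these calls expose the heap's address.

```java
class HeapSizeReport {
    /** Heap memory the JVM currently holds from the OS, in bytes. */
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Upper bound the heap may grow to, in bytes (Long.MAX_VALUE if unlimited). */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Part of the committed heap currently occupied by objects, in bytes. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.printf("used=%d committed=%d max=%d%n",
                usedBytes(), committedBytes(), maxBytes());
    }
}
```

Polling these values over time would show when the heap grows or shrinks, though not where it sits in the address space.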

A quick 'n dirty way to get the actual heap address of the JVM is to jump into WinDbg, attach to the JVM process, and issue a single !address command. Somewhere around 0x2??????? (it differs between JVM versions but remains static for a given version) there will be a large VAD marked PAGE_EXECUTE_READWRITE; this is your JVM's heap in the process's memory.
To confirm, you can set a breakpoint on kernel32!VirtualAlloc; during JVM initialization in the module JVM.DLL you will hit the call to VirtualAlloc in which the JVM allocates its heap. If you check out the code around this call you can see how the address is calculated.

Stepping back a bit... Could you go with a fixed-size Java heap? At that point concerns about reallocation and fragmentation go away.
On a stand-alone Java invocation, that would involve specifying something like -Xmx500m and -Xms500m for a 500 MB heap. You'd have to translate that into what Matlab wants.
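To check from inside the process whether the embedded JVM was actually started with matching -Xms and -Xmx values, something like the following could work. It is a rough sketch: the class HeapFlags and its helper methods are hypothetical, the size parser only understands the k/m/g suffixes, and it only sees flags reported by RuntimeMXBean.getInputArguments().

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class HeapFlags {
    /** Parses a size such as "500m" or "1g" into bytes (simplified: k/m/g only). */
    static long parseSize(String text) {
        String s = text.toLowerCase();
        long multiplier = 1;
        char unit = s.charAt(s.length() - 1);
        if (unit == 'k') multiplier = 1024L;
        else if (unit == 'm') multiplier = 1024L * 1024;
        else if (unit == 'g') multiplier = 1024L * 1024 * 1024;
        String digits = (multiplier == 1) ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    /** True if both -Xms and -Xmx are present and name the same size. */
    static boolean isFixedHeap(List<String> jvmArgs) {
        long xms = -1;
        long xmx = -1;
        for (String arg : jvmArgs) {
            if (arg.length() <= 4) continue; // flag without a value
            if (arg.startsWith("-Xms")) xms = parseSize(arg.substring(4));
            else if (arg.startsWith("-Xmx")) xmx = parseSize(arg.substring(4));
        }
        return xms > 0 && xms == xmx;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Fixed-size heap requested: " + isFixedHeap(jvmArgs));
    }
}
```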

If you just want to get the JVM heap to shrink you could try playing with the gc parameters such as -XX:MaxHeapFreeRatio (see http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp)
I don't think you can get a pointer to the Java heap with JNI. However, the Java heap is just memory allocated to the process by Windows from one of the process heaps.
You can get at the process heaps from your C++ code using the GetProcessHeaps function (http://msdn.microsoft.com/en-us/library/aa366571(VS.85).aspx) and then start walking through them with the HeapWalk function. There's a good example at http://www.abstraction.net/content/articles/analyzing%20the%20heaps%20of%20a%20win32%20process.htm. You might be able to spot which allocated blocks are used by the Java heap by looking for certain patterns of bytes (the JVM source code might give you some clues as to what to look for, but good luck figuring that out!)

The GC can move data at any time to compact the memory. So there is no fixed point for objects in the Java Heap. You can use a ByteBuffer.allocateDirect() which allocates memory in the "C" space rather than the heap and it is fixed in memory.
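A minimal sketch of the direct-buffer approach, for illustration only (the class name DirectBufferDemo is made up):

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    /** Allocates a buffer whose storage lives outside the garbage-collected heap. */
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocateOffHeap(1024);
        ByteBuffer onHeap = ByteBuffer.allocate(1024);

        System.out.println("direct.isDirect() = " + direct.isDirect());
        System.out.println("onHeap.isDirect() = " + onHeap.isDirect());

        // Reads and writes work the same way for both kinds of buffer.
        direct.putInt(0, 42);
        System.out.println("value read back: " + direct.getInt(0));
    }
}
```

From JNI, the native address of such a buffer can be obtained with GetDirectBufferAddress, but that is the address of the buffer, not of the Java heap.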

I don't think that there's a way to do what you want without using a customized JVM. You could theoretically use OpenJDK and patch it or enable some tracing (not sure whether there's one that fits your needs). I think an external monitoring tool such as Process Explorer could solve your problem.

Related

Do two languages use the same stack and heap in RAM?

I was reading about memory allocation in Python and was wondering: if I have Java and Python installed on the same computer system, do Java and Python use the same stack and heap, or do they have different stacks and heaps allocated for them in RAM?
Can anyone please help me to clear my doubt.
Thanks in Advance.
Even if it is an implementation detail, each thread on a system has its own stack. The heap is an image from the 70's segmented model and makes no sense for any process except Java ones on a modern OS: when a process requires more memory it just asks the kernel for it and has no preallocated heap. In a sense all processes in a system (except for Java ones) share the same available memory pool, but it is not what was called a heap.
Java is different, because a Java program executes in a JVM, and the JVM has its allocated memory and the program cannot request memory from the system directly. So in a JVM, the whole process has a heap, and each thread has a stack. And if you launch 2 independent Java programs, you will have 2 independent JVMs, each with its own heap.
Two different programs never use the same heap. They each get a portion of virtual address space; the address ranges can overlap numerically, but they are independent. When a program actually needs pages to be in RAM, it allocates them in a lazy fashion, usually in 4KB units (might be bigger). This is done via a page table, which maps virtual address space to physical memory (and potentially to swap too).
You probably need to understand that you do not require everything, all the time in RAM. Your code can do a little portion, then give that space in RAM to someone else, or even swap out.

Java: why does it use a fixed amount of memory? or how does it manage the memory?

It seems that the JVM uses some fixed amount of memory. At least I have often seen parameters -Xmx (for the maximum size) and -Xms (for the initial size) which suggest that.
I got the feeling that Java applications don't handle memory very well. Some things I have noticed:
Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
The Java library seem to require much more memory than similar powerful libraries like Qt for example. Why is this? (To compare, start some Qt applications and look at their memory usage and start some Java apps.)
Why doesn't it just use the underlying system techniques like malloc and free? Or if they don't like the libc implementation, they could use jemalloc (like in FreeBSD and Firefox) which seems to be quite good. I am quite sure that this would perform better than the JVM memory pool. And not only perform better, but also require less memory, esp. for small applications.
Addition: Has somebody tried that already? I would be very interested in an LLVM-based JIT compiler for Java which just uses malloc/free for memory handling.
Or maybe this also differs from JVM implementation to implementation? I have used mostly the Sun JVM.
(Also note: I'm not directly speaking about the GC here. The GC is only responsible to calculate what objects can be deleted and to initialize the memory freeing but the actual freeing is a different subsystem. Afaik, it is some own memory pool implementation, not just a call to free.)
Edit: A very related question: Why does the (Sun) JVM have a fixed upper limit for memory usage? Or to put it differently: Why does JVM handle memory allocations differently than native applications?
You need to keep in mind that the Garbage Collector does a lot more than just collecting unreachable objects. It also optimizes the heap space and keeps track of exactly where there is memory available to allocate for the creation of new objects.
Knowing immediately where there is free memory makes the allocation of new objects into the young generation efficient, and prevents the need to run back and forth to the underlying OS. The JIT compiler also optimizes such allocations away from the JVM layer, according to Sun's Jon Masamitsu:
"Fast-path allocation does not call into the JVM to allocate an object. The JIT compilers know how to allocate out of the young generation and code for an allocation is generated in-line for object allocation. The interpreter also knows how to do the allocation without making a call to the VM."
Note that the JVM goes to great lengths to try to get large contiguous memory blocks as well, which likely have their own performance benefits (See "The Cost of Missing the Cache"). I imagine calls to malloc (or the alternatives) have a limited likelihood of providing contiguous memory across calls, but maybe I missed something there.
Additionally, by maintaining the memory itself, the Garbage Collector can make allocation optimizations based on usage and access patterns. Now, I have no idea to what extent it does this, but given that there's a registered Sun patent for this concept, I imagine they've done something with it.
Keeping these memory blocks allocated also provides a safeguard for the Java program. Since the garbage collection is hidden from the programmer, they can't tell the JVM "No, keep that memory; I'm done with these objects, but I'll need the space for new ones." By keeping the memory, the GC doesn't risk giving up memory it won't be able to get back. Naturally, you can always get an OutOfMemoryError either way, but it seems more reasonable not to needlessly give memory back to the operating system every time you're done with an object, since you already went to the trouble to get it for yourself.
All of that aside, I'll try to directly address a few of your comments:
"Often, they consume more and more memory over runtime."
Assuming that this isn't just what the program is doing (for whatever reason, maybe it has a leak, maybe it has to keep track of an increasing amount of data), I imagine that it has to do with the free heap space ratio defaults set by the (Sun/Oracle) JVM. The default value for -XX:MinHeapFreeRatio is 40%, while -XX:MaxHeapFreeRatio is 70%. This means that any time less than 40% of the heap space is free after a GC, the heap will be resized by claiming more memory from the operating system (provided that this won't exceed -Xmx). Conversely, it will only* free heap memory back to the operating system if the free space exceeds 70%.
Consider what happens if I run a memory-intensive operation in Eclipse; profiling, for example. My memory consumption will shoot up, resizing the heap (likely multiple times) along the way. Once I'm done, the memory requirement falls back down, but it likely won't drop so far that 70% of the heap is free. That means that there's now a lot of underutilized space allocated that the JVM has no intention of releasing. This is a major drawback, but you might be able to work around it by customizing the percentages to your situation. To get a better picture of this, you really should profile your application so you can see the utilized versus allocated heap space. I personally use YourKit, but there are many good alternatives to choose from.
*I don't know if this is actually the only time and how this is observed from the perspective of the OS, but the documentation says it's the "maximum percentage of heap free after GC to avoid shrinking," which seems to suggest that.
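To see where a running application sits relative to those two thresholds, the free percentage can be computed from the standard MemoryMXBean, roughly as sketched below. The class name HeapFreeRatio is made up, and the 40/70 figures are just the defaults quoted above; the JVM applies its ratios per generation after a collection, so this whole-heap number is only an approximation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapFreeRatio {
    /** Percentage of the committed heap that is not occupied by objects. */
    static double freePercent(long usedBytes, long committedBytes) {
        if (committedBytes <= 0) {
            throw new IllegalArgumentException("committed size must be positive");
        }
        return 100.0 * (committedBytes - usedBytes) / committedBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        double free = freePercent(heap.getUsed(), heap.getCommitted());
        System.out.printf("heap free: %.1f%% of %d committed bytes%n",
                free, heap.getCommitted());

        // Defaults quoted in the answer above; adjust if the JVM was started with other values.
        if (free < 40.0) {
            System.out.println("below the MinHeapFreeRatio default: the heap may be expanded");
        } else if (free > 70.0) {
            System.out.println("above the MaxHeapFreeRatio default: the heap may be shrunk");
        } else {
            System.out.println("between the defaults: no resize expected");
        }
    }
}
```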
"Even some very small sample demo applications load huge amounts of memory."
I guess this depends on what kind of applications they are. I feel that Java GUI applications run memory-heavy, but I don't have any evidence one way or another. Did you have a specific example that we could look at?
"But why is it needed to load the library for each Java instance?"
Well, how would you handle loading multiple Java applications if not creating new JVM processes? The isolation of the processes is a good thing, which means independent loading. I don't think that's so uncommon for processes in general, though.
As a final note, the slow start times you asked about in another question likely come from several initial heap reallocations necessary to get to the baseline application memory requirement (due to -Xms and -XX:MinHeapFreeRatio), depending on what the default values are with your JVM.
Java runs inside a Virtual Machine, which constrains many parts of its behavior. Note the term "Virtual Machine." It is literally running as though the machine is a separate entity, and the underlying machine/OS are simply resources. The -Xmx value is defining the maximum amount of memory that the VM will have, while the -Xms defines the starting memory available to the application.
The VM is a product of the binary being system agnostic - this was a solution used to allow the byte code to execute wherever. This is similar to an emulator - say for old gaming systems. It is emulating the "machine" that the game runs on.
The reason why you run into an OutOfMemoryError is that the Virtual Machine has hit the -Xmx limit - it has literally run out of memory.
As far as smaller programs go, they will often require a larger percentage of their memory for the VM. Also, Java has a default starting -Xmx and -Xms (I don't remember what they are right now) that it will always start with. The overhead of the VM and the libraries becomes much less noticeable when you start to build and run "real" applications.
The memory argument related to Qt and the like is true, but is not the whole story. While Java uses more memory than some of those, those are compiled for specific architectures. It has been a while since I have used Qt or similar libraries, but I remember the memory management not being very robust, and memory leaks are still common today in C/C++ programs. The nice thing about Garbage Collection is that it removes many of the common "gotchas" that cause memory leaks. (Note: Not all of them. It is still very possible to leak memory in Java, just a bit harder).
Hope this helps clear up some of the confusion you may have been having.
To answer a portion of your question;
Java at start-up allocates a "heap" of memory, or a fixed size block (the -Xms parameter). It doesn't actually use all this memory right off the bat, but it tells the OS "I want this much memory". Then as you create objects and do work in the Java environment, it puts the created objects into this heap of pre-allocated memory. If that block of memory gets full then it will request a little more memory from the OS, up until the "max heap size" (the -Xmx parameter) is reached.
Once that max size is reached, Java will no longer request more RAM from the OS, even if there is a lot free. If you try to create more objects, there is no heap space left, and you will get an OutOfMemoryError.
Now if you are looking at Windows Task Manager or something like that, you'll see "java.exe" using X megs of memory. That sort-of corresponds to the amount of memory that it has requested for the heap, not really the amount of memory inside the heap that's used.
In other words, I could write the application:
class myfirstjavaprog
{
    public static void main(String args[])
    {
        System.out.println("Hello World!");
    }
}
Which would basically take very little memory. But if I ran it with the cmd line:
java.exe -Xms1024M myfirstjavaprog
then on startup Java will immediately ask the OS for 1,024 MB of RAM, and that's what will show in Windows Task Manager. In actuality, that RAM isn't being used, but Java reserved it for later use.
Conversely, if I had an app that tried to create a 10,000 byte large array:
class myfirstjavaprog
{
    public static void main(String args[])
    {
        byte[] myArray = new byte[10000];
    }
}
but ran it with the command line:
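java.exe -Xms100 -Xmx100 myfirstjavaprog
Then Java could only allocate up to 100 bytes of memory. Since a 10,000 byte array won't fit into a 100 byte heap, that would throw an OutOfMemoryError, even though the OS has plenty of RAM. (A real JVM typically enforces a minimum heap size and would refuse to start with values this small, but the principle is the same.)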
I hope that makes sense...
Edit:
Going back to "why Java uses so much memory"; why do you think it's using a lot of memory? If you are looking at what the OS reports, then that isn't what it's actually using, it's only what it has reserved for use. If you want to know what Java has actually used, then you can do a heap dump and explore every object in the heap and see how much memory it's using.
To answer "why doesn't it just let the OS handle it?", well I guess that is just a fundamental Java question for those that designed it. The way I look at it; Java runs in the JVM, which is a virtual machine. If you create a VMWare instance or just about any other "virtualization" of a system, you usually have to specify how much memory that virtual system will/can consume. I consider the JVM to be similar. Also, this abstracted memory model lets the JVM's for different OSes all act in a similar way. So for example Linux and Windows have different RAM allocation models, but the JVM can abstract that away and follow the same memory usage for the different OSes.
Java does use malloc and free, or at least the implementations of the JVM may. But since Java tracks allocations and garbage collects unreachable objects, they are definitely not enough.
As for the rest of your text, I'm not sure if there's a question there.
Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
That's likely due to the overhead of starting and running the JVM
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
I'm not entirely sure what you mean by "often crash," as I don't think this has happened to me in quite a long time. If it is, it's likely due to the "maximum size" setting you mentioned earlier.
Your main question asking why Java doesn't use malloc and free comes down to a matter of target market. Java was designed to eliminate the headache of memory management from the developer. Java's garbage collector does a reasonably good job of freeing up memory when it can be freed, but Java isn't meant to rival C++ in situations with memory restrictions. Java does what it was intended to do (remove developer level memory management) well, and the JVM picks up the responsibility well enough that it's good enough for most applications.
The limits are a deliberate design decision from Sun. I've seen at least two other JVMs which do not have this design: the Microsoft one and the IBM one for their non-PC AS/400 systems. Both grow as needed, using as much memory as needed.
Java doesn't use a fixed size of memory; it is always in the range from -Xms to -Xmx.
If Eclipse crashes with OutOfMemoryError, then it required more memory than granted by -Xmx (a configuration issue).
Java must not use malloc/free (for object creation) since its memory handling is much different due to garbage collection (GC). GC automatically removes unused objects, which is a benefit compared to being responsible for memory management yourself.
For details on this complex topic see Tuning Garbage Collection

Java memory usage increases when App is used, but doesn't decrease when not being used

I have a Java application that uses a lot of memory when used, but when the program is not being used, the memory usage doesn't go down.
Is there a way to force Java to release this memory? This memory is not needed at that time. I can understand reserving a small amount of memory, but Java just reserves all the memory it ever uses. It also reuses this memory later, but there must be a way to force Java to release it when it's not needed.
System.gc is not working.
As pointed out in the comments, it's not certain that, while the garbage collector disposes objects, it gives back memory to the system.
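The difference between "objects were collected" and "memory was given back" can be observed by comparing the used and committed heap sizes before and after a collection. The following is only an illustrative sketch (the class GcReleaseDemo is made up, and the 32 MB figure is arbitrary); what it prints depends on the JVM, the collector and the heap settings.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class GcReleaseDemo {
    /** Returns {used, committed} heap bytes as reported by the JVM. */
    static long[] usedAndCommitted() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted() };
    }

    static void report(String label) {
        long[] sizes = usedAndCommitted();
        System.out.printf("%-22s used=%d committed=%d%n", label, sizes[0], sizes[1]);
    }

    public static void main(String[] args) {
        report("at start");

        // Allocate roughly 32 MB in 1 MB blocks.
        byte[][] blocks = new byte[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        report("after allocating");

        // Drop the references and ask for a collection (System.gc() is only a hint).
        blocks = null;
        System.gc();
        report("after System.gc()");
    }
}
```

If "used" drops while "committed" stays high, the objects were collected but the JVM kept the memory, which is the behavior the question describes.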
Perhaps Tuning Garbage Collection Outline provides the solution to your problem:
By default the JVM grows or shrinks the heap at each GC to keep the ratio of free space to live objects at each collection within a specified range.
-XX:MinHeapFreeRatio - when the percentage of free space in a generation falls below this value the generation will be expanded to meet this percentage. Default is 40
-XX:MaxHeapFreeRatio - when the percentage of free space in a generation exceeds this value the generation will shrink to meet this value. Default is 70
Otherwise, if you suspect that you're leaking references, you can figure out how, what and where objects are leaked by monitoring the heap in JVisualVM (a tool bundled with the standard SDK). Through this program you can perform a heap dump and get a histogram of object memory consumption.
What memory do you mean? If it is RAM (as opposed to the amount of used heap space of the Java VM itself) then this might be normal. It is a relatively expensive operation to allocate memory so once the JVM got some it is quite reluctant to give it back even if it is not needed at the time.
Have you considered using a memory profiler? If you don't have access to one, you can start with capturing a bunch of jmap -histo <pid> and writing a script to figure the differences.
System.gc makes no guarantee that any memory will be freed when it is run. See Why is it bad practice to call System.gc()?
Try tweaking the Xmx JVM arg down if it is set to a large value and take a look in JConsole to see what's going on with memory usage and GC activity. Normally you'd see a saw tooth pattern.
You might also want to use a profiler to see where the memory is being used and to identify any leaks.
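If attaching JConsole is not convenient, a rough view of GC activity is also available from inside the application through the standard GarbageCollectorMXBean. A minimal sketch, with a made-up class name GcActivity:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcActivity {
    /** Sums the collection counts of all collectors that report one. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "undefined" for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, total time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("collections so far: " + totalCollections());
    }
}
```

Logging these numbers periodically alongside the used heap size gives a crude version of the saw-tooth picture.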
One of two things is happening:
1) Your application is leaking references. Are you sure that you aren't hanging on to objects when you'll no longer need them? If you do, Java must maintain them in memory.
2) Java's working just fine. You get no benefit from memory that you aren't using.

Java memory usage with native processes

What is the best way to tune a server application written in Java that uses a native C++ library?
The environment is a 32-bit Windows machine with 4GB of RAM. The JDK is Sun 1.5.0_12.
The Java process is given 1024MB of memory (-Xmx) at startup but I often see OutOfMemoryErrors due to lack of heap space. If the memory is increased to 1200MB, the OutOfMemoryErrors occur due to lack of swap space. How is the memory shared between the JVM and the native process?
Does the Windows /3GB switch have any effect with native processes and Sun JVM?
I had lots of trouble with that setting (Java on 32-bit systems - msw and others) and they were all solved by reserving just *under 1GB of RAM to the JVM.
Otherwise, as stated, the actual occupied memory in the system for that process would be over 2GB; at that point I was having 'silent deaths' of the process - no errors, no warnings, just the process terminating very quietly.
I got more stability and performance running several JVM (each with under 1GB RAM) on the same system.
I found some info on JNI memory management here, and here's the JVM JNI section on memory management.
Well, having a 3GB user space over a 2GB user space should help, but if you're having problems running out of swap space at 2GB, I think 3GB is just going to make it worse. How big is your pagefile? Is it maxed out?
You can get a better idea of your heap allocation by hooking up jconsole to your JVM.
How is the memory shared between the JVM and the native process?
Sun's JVM's garbage collector is mark-and-sweep, with options to enable concurrent and incremental GC.
Well, more accurately, it's staged, and the above only applies to tenured (long-lived) objects. For young objects, GC is still done with a stop-and-copy collector, which is much better for working with short-lived objects (and all typical Java programs create many short-lived objects).
A copying collector walks over all elements in the heap, copying them to a new heap if they are referenced, and then discards the former heap. Thus 1M of live objects requires up to 2M of real memory: if every object is alive, there will be two copies of everything during garbage collection.
So the JVM requires far more system memory than is available to the code running within the VM, because there is a substantial overhead to management and garbage collection.
Does the Windows /3GB switch have any effect with native processes and Sun JVM?
The /3GB allows user virtual memory address space to be 3GB, but only for executables whose headers are marked with IMAGE_FILE_LARGE_ADDRESS_AWARE. As far as I am aware, Sun's java.exe is not. I don't have a Windows system here, so I can't verify.
You haven't explained your problem well enough, unfortunately. The real question is --- why is the Java process growing so much. Do you have a memory leak? Do you have a real reason to have that much data in the JVM?
Is the C++ library allocating its own memory from the C stack, or is it allocating memory from the Java object space, or is it doing something else entirely?

Memory footprint issues with JAVA, JNI, and C application

I have a piece of an application that is written in C; it spawns a JVM and uses JNI to interact with a Java application. My memory footprint via Process Explorer gets up to 1GB and the process runs out of memory. As far as I know it should be able to get up to 2GB. One thing I believe is that the memory the JVM is using isn't visible in Process Explorer. My -Xmx is set to 256 MB. I added some statements to watch the Java-side memory; it is peaking at 256 MB and GC is doing its job, so it is all good on that side. So my question is, where is the other 700+ MB being consumed? Anyone out there a Java/JNI/C memory expert?
There could be a leak in the JNI code.
Remember to use (*jni)->DeleteLocalRef() for any object references you get once you are done with them. If you use any native C buffers to create new Java objects, make sure you free them off once the object is created. Check the JNI Specification for further guidelines.
Depending on the VM you are using you might be able to turn on JNI checking. For example, on the IBM JDK you can specify "-Xcheck:jni".
Try a test app in C that doesn't spawn the JVM but instead tries to allocate more and more memory. See whether the test app can reach the 2 GB barrier.
The C and JNI code can allocate memory as well (malloc/free/new/etc), which is outside of the VM's 256m. The -Xmx setting only restricts what the VM will allocate for its own heap. Depending on what you're allocating in the C code, and what other things are loaded in memory, you may or may not be able to get up to 2GB.
If you say that it's the Windows process that runs out of memory as opposed to the JVM, then my initial guess is that you probably invoke some (your own) native methods from the JVM and those native methods leak memory. So, I concur with @John Gardner here.
Well, thanks to all of your help, especially @alexander, I have discovered that all the extra memory that isn't visible via Process Explorer is being used by the Java heap. In fact, other tests that I have run show that the JVM's memory consumption is included in what I see in Process Explorer. So the heap is taking large amounts of memory; I will have to do some more research about that and maybe ask a separate question.
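One way to see memory that lives outside the -Xmx limit from the Java side is the BufferPoolMXBean, which reports direct (off-heap) buffer usage. This is only a partial illustration: it covers direct ByteBuffers, not memory that C code obtains through malloc or new, and the class name OffHeapUsage is made up.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class OffHeapUsage {
    /** Bytes currently held by direct buffers, or -1 if no "direct" pool is reported. */
    static long directBytesInUse() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesInUse();

        // 16 MB allocated outside the Java heap; it does not count against -Xmx.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        buffer.put(0, (byte) 1); // touch the buffer so it is clearly still in use

        long after = directBytesInUse();
        System.out.println("direct bytes before: " + before);
        System.out.println("direct bytes after:  " + after);
    }
}
```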
Write a C test harness and use valgrind/alleyoop to check for leakage in your C code, and similarly use the java jvisualvm tool.
