Java JAR memory usage VS class file memory usage - java

I recently changed my large Java application to be delivered in JARs instead of individual class files. I have 405 JARS which hold 5000 class files. My problem is that when I run my program(s) as JARs (classpath is a wildcard to get all JARs) Java will continually use more and more memory. I have seen the memory go > 2GB and it seems like Java is not doing stop-the-world garbage collections to keep the memory lower. If I run the exact same program against the exploded JARs (only class files), Java's memory usage stays much lower (< 256MB) and stays there. This is happening in Oracle's Java 8 on Windows 7 (x64) and Windows Server (x64). Why would packaging my application as JARs change the memory profile? Also I have run the program for a long time as JARs with the memory maximum limited to 128MB with no problems so I don't have a memory leak.
[Image: JConsole memory graph with JAR files in classpath]
[Image: JConsole memory graph with class files in classpath]
Edit: I accepted the answer from @K Erlandsson because I think it is the best explanation and this is just an ugly quirk of Java. Thanks everyone (and especially @K Erlandsson) for your help.

The first thing to note is that the total amount of heap in use at any given moment is not very interesting, since much of it can be garbage that will be cleared by the next GC.
It is how much heap that is used by live objects that you need to be concerned about. You write in a comment:
I don't know if this matters, but if I use jvisualvm.exe to force a GC
(mark sweep) the heap memory usage will drop clearing almost all the
heap memory.
This matters. A lot. This means that when you see a higher heap usage when you use your jars, you see more garbage, not more memory consumed by live objects. The garbage is cleared when you do a GC and all is well.
Loading classes from jar files will consume more memory, temporarily, than loading them from class files. The jar files need to be opened, seeked, and read from. This requires more operations and more temporary data than simply opening a specific .class file and reading it.
Since most of the heap usage is cleared by a GC, this additional memory consumption is not something you need to be very concerned about.
You also write:
Java will continually use more and more memory. I have seen the memory
go > 2GB and it seems like Java is not doing stop-the-world garbage
collections to keep the memory lower.
This is typical behavior. The GC only runs when the JVM thinks it is necessary. The JVM will tune this depending on memory behavior.
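To see how often the JVM has actually decided to collect, the standard java.lang.management API exposes per-collector counters. The following is only a minimal sketch; the class name GcCounters is made up for illustration, and the collector names it prints depend on the JVM and the GC in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounters {
    /** Sums the collection counts of all collectors; a collector reporting -1 (undefined) is skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector (for example a young and an old generation collector).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections so far: " + totalCollections());
    }
}
```

Calling this periodically from the application should show that collections happen in bursts when the JVM considers them necessary, not on a fixed schedule.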
Edit: Now that we see your JConsole images, we see a difference in committed heap memory (250 MB vs 680 MB). Committed heap is the actual size of the heap. This will vary (up to what you set with -Xmx) depending on what the JVM thinks will yield the best performance for your application. However, it will mostly increase and almost never decrease.
For the jar case the JVM has assigned a bigger heap to your application, probably because more memory was required during the initial class loading. The JVM then decided that a bigger heap would be faster.
When you have a bigger heap (more committed memory), there is more memory to use before running a GC. That is why you see the difference in memory usage between the two cases.
Bottom line: All the extra usage you see is garbage, not live objects. The memory will be reclaimed on the next GC, so you do not need to be concerned about this behavior unless you have an actual problem.
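The difference between used and committed heap can also be observed from inside the application. This is a rough sketch using the standard MemoryMXBean; the class name HeapSnapshot and the amount of garbage allocated are arbitrary choices for illustration, and System.gc() is only a request that the JVM may ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapSnapshot {
    /** Returns {used, committed, max} heap sizes in bytes; max is -1 if the JVM reports it as undefined. */
    static long[] heapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted(), heap.getMax() };
    }

    static String format(long[] h) {
        return String.format("used=%d MB, committed=%d MB, max=%d MB",
                h[0] >> 20, h[1] >> 20, h[2] >> 20);
    }

    public static void main(String[] args) {
        // Create short-lived garbage: nothing keeps a reference to these arrays.
        for (int i = 0; i < 1000; i++) {
            byte[] garbage = new byte[64 * 1024];
        }
        System.out.println("before GC: " + format(heapBytes()));
        System.gc(); // a request only; the JVM decides whether to collect
        System.out.println("after GC:  " + format(heapBytes()));
    }
}
```

If the used figure drops sharply after the GC request while the committed figure stays put, the extra usage was garbage and the heap was simply sized generously.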

It's quite common to load resources from the classpath. When a resource comes from a jar file, the URL object will keep a reference to the jar file entry. This might add some memory consumption. It's possible to disable this caching by disabling default URL caching.
The API for disabling default URL caching is quite awkward:
// requires java.net.URL, java.net.URLConnection, java.net.MalformedURLException, java.io.IOException
public static void disableUrlConnectionCaching() {
    // sun.net.www.protocol.jar.JarURLConnection leaves the JarFile instance open if URLConnection caching is enabled.
    try {
        URL url = new URL("jar:file://valid_jar_url_syntax.jar!/");
        URLConnection urlConnection = url.openConnection();
        urlConnection.setDefaultUseCaches(false);
    } catch (MalformedURLException e) {
        // ignore
    } catch (IOException e) {
        // ignore
    }
}
Disable default URL caching in the startup of your application.
Tomcat already disables URL caching by default because it also causes file locking issues and prevents updating jar files in a running application.
https://github.com/apache/tomcat/blob/5bbbcb1f8ca224efeb8e8308089817e30e4011aa/java/org/apache/catalina/core/JreMemoryLeakPreventionListener.java#L408-L423

Related

growing permsize when loading lots of data into memory

I always thought that the PermGen space of a JVM is filled by loading classes during JVM startup, and probably also by things like JNI at runtime. But in general it should not grow "significantly" during runtime.
Now I have noticed that when I load a lot of data (20GB) into the heap, whose maximum is 32GB (ArrayLists of data), I get an 'OutOfMemoryError: PermGen space'.
Is there any correlation, or is it just a coincidence?
I know how to increase the PermGen size. This is not the question.
With Tomcat, I have set the following to increase the PermGen space:
set "JAVA_OPTS=-XX:MaxPermSize=256m"
You may want to do something similar.
I set the size in MB (256m); I am not sure how to set it in GB.
Hope this helps.
The PermGen memory space is not part of the heap (this sometimes causes confusion). It is where certain kinds of objects are allocated, such as Class objects, Method objects, and the pool of interned String objects. Despite what the name suggests, this space is also garbage collected (during a full GC), but it is a frequent source of headaches, notably the well-known OutOfMemoryError.
PermGen exhaustion is hard to diagnose precisely because the space is not filled by application objects. In most cases the problem comes from an excessive number of classes being loaded into memory. A well-known example was running Eclipse with many plugins (WTP) under default JVM settings: so many classes were loaded that PermGen overflowed.
Another PermGen problem is hot deployment in application servers. For various reasons, the server cannot release the context's classes when the application is destroyed. A new version of the application is then loaded, but the old classes remain, so PermGen usage grows.
That is why we sometimes need to restart the whole container because of PermGen.
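One way to check whether class loading is what fills the space is to watch the class counters the JVM exposes. This is a small sketch using the standard ClassLoadingMXBean; the class name LoadedClassReport is invented for illustration.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

class LoadedClassReport {
    /** Number of classes currently loaded in this JVM. */
    static int loadedNow() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    static String report() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        return "currently loaded=" + cl.getLoadedClassCount()
                + ", total loaded=" + cl.getTotalLoadedClassCount()
                + ", unloaded=" + cl.getUnloadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If the currently loaded count keeps climbing after each redeploy and the unloaded count stays flat, old classes are probably not being released. If the count is stable while a large data set is loaded, the classes themselves are unlikely to be the cause.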

PermGen Out of Memory reasons

I constantly get OOM errors in PermGen in my environment:
java 6
jboss-4.2.3
Not a big web-application
I know about the String.intern() problem, but I don't use it to any significant extent.
Increasing the maximum PermGen size (from 128 MB to 256 MB) had no effect.
What other reasons could cause an OOM in PermGen?
What is the best way to investigate such a situation (strategy, tools, etc.)?
Thanks for any help
See this note
Put JDBC driver in common/lib (as tomcat documentation says) and not in WEB-INF/lib
Don't put commons-logging into WEB-INF/lib since tomcat already bootstraps it
new class objects get placed into the PermGen and thus occupy an ever increasing amount of space. Regardless of how large you make the PermGen space, it will inevitably top out after enough deployments. What you need to do is take measures to flush the PermGen so that you can stabilize its size. There are two JVM flags which handle this cleaning:
-XX:+CMSPermGenSweepingEnabled
This setting includes the PermGen in a garbage collection run. By default, the PermGen space is never included in garbage collection (and thus grows without bounds).
-XX:+CMSClassUnloadingEnabled
This setting tells the PermGen garbage collection sweep to take action on class objects. By default, class objects get an exemption, even when the PermGen space is being visited during a garbage collection.
You typically get this error when redeploying an application while having a classloader leak, because it means all your classes are loaded again while the old versions stay around.
There are two solutions:
Restart the app server instead of redeploying the application - easy, but annoying
Investigate and fix the leak using a profiler. Unfortunately, classloader leaks can be very hard to pinpoint.

OutOfMemoryError when calling the main method of a jar

I have a Java app that imports another jar as a library and calls its main method as shown below. But someApp is a very large process and constantly throws an OutOfMemoryError. No matter what I set my Java app's heap size to, someApp does not seem to share the allocated memory.
try {
    someApp.main(args);
} catch (Exception ex) {
    // exceptions are swallowed here; note that OutOfMemoryError is an Error, so it is not caught
}
How do I get someApp to allocate more heap space? Can I use processBuilder? What do I do?
Thanks.
As it stands at the moment, you're merely calling a class from another application within your own Java process. This is exactly the same as you'd do for calling a "library method" (the term has no technical difference, you're simply invoking a method on an object of a class that can be resolved by your classloader).
So right now, someApp is running in the same JVM as your own application, and will share its maximum heap size. This can be increased with the JVM argument -Xmx (e.g. -Xmx2048m for a 2GB max heap), though it sounds like you're doing this already without success.
It would be possible to launch someApp in a separate Java process, which would allow you to configure separate JVM arguments and thus give it a separate heap size.
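Launching a separate process could look roughly like the sketch below, which uses ProcessBuilder to start a second JVM with its own -Xmx. The class name ChildJvmLauncher and the main class com.example.SomeApp are placeholders, and the sketch assumes the child can run with the same classpath as the parent.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChildJvmLauncher {
    /** Builds the command line for a child JVM that reuses this JVM's java binary and classpath. */
    static List<String> buildCommand(String maxHeap, String mainClass, String... appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(System.getProperty("java.home") + File.separator + "bin" + File.separator + "java");
        cmd.add("-Xmx" + maxHeap);                       // heap limit of the child only
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));  // assumption: same classpath as the parent
        cmd.add(mainClass);
        cmd.addAll(Arrays.asList(appArgs));
        return cmd;
    }

    /** Starts the child JVM, forwards its output to this console, and returns its exit code. */
    static int launch(String maxHeap, String mainClass, String... appArgs)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(buildCommand(maxHeap, mainClass, appArgs))
                .inheritIO()
                .start();
        return child.waitFor();
    }

    public static void main(String[] args) {
        // "com.example.SomeApp" is a hypothetical main class; only the command is printed here.
        System.out.println(String.join(" ", buildCommand("2g", "com.example.SomeApp", args)));
    }
}
```

Calling launch("2g", "com.example.SomeApp") would then run the library's main method in its own process with a 2GB heap limit.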
However, I don't think this is going to help much. If you're unable to get this application to run in the same JVM, regardless of your heap limit, there's nothing that would suggest it would work in a different JVM. For example, if you're running with a 2.5GB heap and still running out of memory, running your own app with a 0.5GB heap and spawning a separate JVM with 2GB heap will not solve the problem, as something is still running out of memory. (In fact, separate memory pools make an OOME slightly more likely since there are two distinct chunks of free space, whereas in the former case both applications can benefit from the same pool of free space).
I suggest you verify that your heap sizes really are being picked up (connecting via JMX using JConsole or JVisualVM will quickly let you see how big the max heap size is). If you're really still running out of memory with large heaps, it sounds like someApp has a memory leak (or a requirement for an even larger heap). Capturing a heap dump in this case, with the JVM argument -XX:+HeapDumpOnOutOfMemoryError, will allow you to inspect the heap with an external tool and determine what's filling the memory.
Hopefully you've simply failed to increase the heap size correctly, as if the application really is failing with a large heap there are no simple solutions.
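A quick way to verify from code that the heap setting was picked up is to print the maximum heap and the arguments the JVM was started with. This is a minimal sketch using Runtime and the standard RuntimeMXBean; the class name HeapSettingsCheck is made up.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class HeapSettingsCheck {
    /** Maximum heap the JVM will attempt to use, in bytes (Long.MAX_VALUE if there is no limit). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Arguments passed to the JVM itself (such as -Xmx), not the arguments of main. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + (maxHeapBytes() >> 20) + " MB");
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```

Calling these two methods right before someApp.main(args) shows which limit the library actually runs under. If the expected -Xmx value is missing from the argument list, the option is not reaching the JVM.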
Unless someApp itself is building a new process, this will already be in the same process as your calling code, so it should be affected by whatever heap configuration you've set when starting up the JVM.
Have you kept track of how much memory the process is actually taking?
This doesn't make sense, unless you're running into OS limitations on how much memory you can allocate to a single Java process on your OS (see Java maximum memory on Windows XP)
Short of that, the way you're invoking someApp, it acts as a regular library. The main method acts like any other method.
Have you tried debugging the OutOfMemoryError? There may be something obscure that the app doesn't like about being invoked from your application...
If the jar you are importing is authored by you and could be more efficient, then modify it. It sounds like the problem you are having is loading a shoddy package. If this is a 3rd party package and you are allowed to modify it, poke around in the code and find where there might be limitations, change it, and rebuild it.

Java: why does it use a fixed amount of memory? or how does it manage the memory?

It seems that the JVM uses some fixed amount of memory. At least I have often seen parameters -Xmx (for the maximum size) and -Xms (for the initial size) which suggest that.
I got the feeling that Java applications don't handle memory very well. Some things I have noticed:
Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
The Java library seems to require much more memory than similarly powerful libraries like Qt, for example. Why is this? (To compare, start some Qt applications and look at their memory usage and start some Java apps.)
Why doesn't it just use the underlying system techniques like malloc and free? Or if they don't like the libc implementation, they could use jemalloc (like in FreeBSD and Firefox) which seems to be quite good. I am quite sure that this would perform better than the JVM memory pool. And not only perform better, but also require less memory, esp. for small applications.
Addition: Has anybody tried that already? I would be very interested in an LLVM based JIT-compiler for Java which just uses malloc/free for memory handling.
Or maybe this also differs from JVM implementation to implementation? I have used mostly the Sun JVM.
(Also note: I'm not directly speaking about the GC here. The GC is only responsible to calculate what objects can be deleted and to initialize the memory freeing but the actual freeing is a different subsystem. Afaik, it is some own memory pool implementation, not just a call to free.)
Edit: A very related question: Why does the (Sun) JVM have a fixed upper limit for memory usage? Or to put it differently: Why does JVM handle memory allocations differently than native applications?
You need to keep in mind that the Garbage Collector does a lot more than just collecting unreachable objects. It also optimizes the heap space and keeps track of exactly where there is memory available to allocate for the creation of new objects.
Knowing immediately where there is free memory makes the allocation of new objects into the young generation efficient, and prevents the need to run back and forth to the underlying OS. The JIT compiler also optimizes such allocations away from the JVM layer, according to Sun's Jon Masamitsu:
Fast-path allocation does not call
into the JVM to allocate an object.
The JIT compilers know how to allocate
out of the young generation and code
for an allocation is generated in-line
for object allocation. The interpreter
also knows how to do the allocation
without making a call to the VM.
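The idea behind such fast-path allocation can be pictured as "bump the pointer": if the free space is one contiguous region, allocating is just a bounds check plus an addition. The following is a toy model only, not HotSpot code; the class BumpArena and its methods are invented for illustration.

```java
class BumpArena {
    private final int capacity;
    private int top = 0; // offset of the next free byte

    BumpArena(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Reserves size bytes and returns their start offset,
     * or -1 if the arena is full (the point where a real collector would run).
     */
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            return -1;
        }
        int offset = top;
        top += size; // the whole allocation is this single addition
        return offset;
    }

    /** Frees everything at once, a crude stand-in for clearing the young generation. */
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpArena arena = new BumpArena(16);
        System.out.println(arena.allocate(8)); // first block starts at offset 0
        System.out.println(arena.allocate(8)); // second block starts right behind it
        System.out.println(arena.allocate(1)); // no room left
    }
}
```

A general-purpose malloc has to search for a suitable free block and track individual frees, which is part of why a managed, compacted heap can allocate so cheaply.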
Note that the JVM goes to great lengths to try to get large contiguous memory blocks as well, which likely have their own performance benefits (See "The Cost of Missing the Cache"). I imagine calls to malloc (or the alternatives) have a limited likelihood of providing contiguous memory across calls, but maybe I missed something there.
Additionally, by maintaining the memory itself, the Garbage Collector can make allocation optimizations based on usage and access patterns. Now, I have no idea to what extent it does this, but given that there's a registered Sun patent for this concept, I imagine they've done something with it.
Keeping these memory blocks allocated also provides a safeguard for the Java program. Since the garbage collection is hidden from the programmer, they can't tell the JVM "No, keep that memory; I'm done with these objects, but I'll need the space for new ones." By keeping the memory, the GC doesn't risk giving up memory it won't be able to get back. Naturally, you can always get an OutOfMemoryError either way, but it seems more reasonable not to needlessly give memory back to the operating system every time you're done with an object, since you already went to the trouble to get it for yourself.
All of that aside, I'll try to directly address a few of your comments:
Often, they consume more and more
memory over runtime.
Assuming that this isn't just what the program is doing (for whatever reason, maybe it has a leak, maybe it has to keep track of an increasing amount of data), I imagine that it has to do with the free heap space ratio defaults set by the (Sun/Oracle) JVM. The default value for -XX:MinHeapFreeRatio is 40%, while -XX:MaxHeapFreeRatio is 70%. This means that any time less than 40% of the heap is free after a GC, the heap will be grown by claiming more memory from the operating system (provided that this won't exceed -Xmx). Conversely, it will only* free heap memory back to the operating system if the free space exceeds 70%.
Consider what happens if I run a memory-intensive operation in Eclipse; profiling, for example. My memory consumption will shoot up, resizing the heap (likely multiple times) along the way. Once I'm done, the memory requirement falls back down, but it likely won't drop so far that 70% of the heap is free. That means that there's now a lot of underutilized space allocated that the JVM has no intention of releasing. This is a major drawback, but you might be able to work around it by customizing the percentages to your situation. To get a better picture of this, you really should profile your application so you can see the utilized versus allocated heap space. I personally use YourKit, but there are many good alternatives to choose from.
*I don't know if this is actually the only time and how this is observed from the perspective of the OS, but the documentation says it's the "maximum percentage of heap free after GC to avoid shrinking," which seems to suggest that.
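The two thresholds can be made concrete with a few lines of arithmetic. The sketch below is a simplified model of the rule as described above, not the JVM's actual resizing algorithm; the class HeapFreeRatio and the method resizeHint are hypothetical names.

```java
class HeapFreeRatio {
    /** Fraction of the committed heap that is currently free. */
    static double freeRatio(long committedBytes, long freeBytes) {
        return (double) freeBytes / committedBytes;
    }

    /** Simplified decision: grow below the minimum free ratio, shrink above the maximum. */
    static String resizeHint(double freeRatio, double minFree, double maxFree) {
        if (freeRatio < minFree) {
            return "grow";
        }
        if (freeRatio > maxFree) {
            return "shrink";
        }
        return "keep";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        double ratio = freeRatio(rt.totalMemory(), rt.freeMemory());
        // 0.40 and 0.70 mirror the defaults quoted above.
        System.out.printf("free ratio %.2f -> %s%n", ratio, resizeHint(ratio, 0.40, 0.70));
    }
}
```

With these thresholds, a heap that is 50% free after a peak is neither grown nor shrunk, which matches the "underutilized space that is not released" situation described above.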
Even some very small sample demo
applications load huge amounts of
memory.
I guess this depends on what kind of applications they are. I feel that Java GUI applications run memory-heavy, but I don't have any evidence one way or another. Did you have a specific example that we could look at?
But why is it needed to load the
library for each Java instance?
Well, how would you handle loading multiple Java applications if not creating new JVM processes? The isolation of the processes is a good thing, which means independent loading. I don't think that's so uncommon for processes in general, though.
As a final note, the slow start times you asked about in another question likely come from several initial heap reallocations necessary to get to the baseline application memory requirement (due to -Xms and -XX:MinHeapFreeRatio), depending on what the default values are with your JVM.
Java runs inside a Virtual Machine, which constrains many parts of its behavior. Note the term "Virtual Machine." It is literally running as though the machine is a separate entity, and the underlying machine/OS are simply resources. The -Xmx value is defining the maximum amount of memory that the VM will have, while the -Xms defines the starting memory available to the application.
The VM is a product of the binary being system agnostic - this was a solution used to allow the byte code to execute wherever. This is similar to an emulator - say for old gaming systems. It is emulating the "machine" that the game runs on.
The reason why you run into an OutOfMemoryError is because the Virtual Machine has hit the -Xmx limit - it has literally run out of memory.
As far as smaller programs go, they will often require a larger percentage of their memory for the VM. Also, Java has a default starting -Xmx and -Xms (I don't remember what they are right now) that it will always start with. The overhead of the VM and the libraries becomes much less noticeable when you start to build and run "real" applications.
The memory argument related to Qt and the like is true, but is not the whole story. While Java uses more memory than some of those, those are compiled for specific architectures. It has been a while since I have used Qt or similar libraries, but I remember the memory management not being very robust, and memory leaks are still common today in C/C++ programs. The nice thing about Garbage Collection is that it removes many of the common "gotchas" that cause memory leaks. (Note: Not all of them. It is still very possible to leak memory in Java, just a bit harder).
Hope this helps clear up some of the confusion you may have been having.
To answer a portion of your question;
Java at start-up allocates a "heap" of memory, or a fixed size block (the -Xms parameter). It doesn't actually use all this memory right off the bat, but it tells the OS "I want this much memory". Then as you create objects and do work in the Java environment, it puts the created objects into this heap of pre-allocated memory. If that block of memory gets full then it will request a little more memory from the OS, up until the "max heap size" (the -Xmx parameter) is reached.
Once that max size is reached, Java will no longer request more RAM from the OS, even if there is a lot free. If you try to create more objects, there is no heap space left, and you will get an OutOfMemoryError.
Now if you are looking at Windows Task Manager or something like that, you'll see "java.exe" using X megs of memory. That sort-of corresponds to the amount of memory that it has requested for the heap, not really the amount of memory inside the heap that's used.
In other words, I could write the application:
class myfirstjavaprog
{
    public static void main(String args[])
    {
        System.out.println("Hello World!");
    }
}
Which would basically take very little memory. But if I ran it with the cmd line:
java.exe -Xms1024M myfirstjavaprog
then on startup java will immediately ask the OS for 1,024 MB of RAM, and that's what will show in Windows Task Manager. In actuality, that RAM isn't being used, but java reserved it for later use.
Conversely, if I had an app that tried to create a 10,000 byte large array:
class myfirstjavaprog
{
    public static void main(String args[])
    {
        byte[] myArray = new byte[10000];
    }
}
but ran it with the command line:
java.exe -Xms100 -Xmx100 myfirstjavaprog
Then Java could only allocate up to 100 bytes of memory. Since a 10,000 byte array won't fit into a 100 byte heap, that would throw an OutOfMemoryError, even though the OS has plenty of RAM. (Real JVMs enforce a minimum heap size and reject a value this small at startup, so treat the numbers as illustrative.)
I hope that makes sense...
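The same effect can be reproduced with realistic numbers. The sketch below tries one allocation and reports whether it fit; the class name AllocationProbe and the 64 MB request are arbitrary choices for illustration.

```java
class AllocationProbe {
    /** Tries to allocate a byte array of the given size; returns false if the heap cannot hold it. */
    static boolean tryAllocate(int bytes) {
        try {
            byte[] block = new byte[bytes];
            return block.length == bytes;
        } catch (OutOfMemoryError e) {
            // The limit that was hit is the JVM's heap limit, not the machine's free RAM.
            return false;
        }
    }

    public static void main(String[] args) {
        int request = 64 * 1024 * 1024; // 64 MB
        System.out.println("max heap: " + (Runtime.getRuntime().maxMemory() >> 20) + " MB");
        System.out.println("64 MB allocation succeeded: " + tryAllocate(request));
    }
}
```

Started with a small limit such as java -Xmx16m AllocationProbe, the allocation should fail no matter how much RAM the machine has. With a default or larger heap it should succeed.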
Edit:
Going back to "why Java uses so much memory"; why do you think it's using a lot of memory? If you are looking at what the OS reports, then that isn't what it's actually using, it's only what it has reserved for use. If you want to know what java has actually used, then you can do a heap dump and explore every object in the heap and see how much memory it's using.
To answer "why doesn't it just let the OS handle it?", well I guess that is just a fundamental Java question for those that designed it. The way I look at it; Java runs in the JVM, which is a virtual machine. If you create a VMWare instance or just about any other "virtualization" of a system, you usually have to specify how much memory that virtual system will/can consume. I consider the JVM to be similar. Also, this abstracted memory model lets the JVMs for different OSes all act in a similar way. So for example Linux and Windows have different RAM allocation models, but the JVM can abstract that away and follow the same memory usage for the different OSes.
Java does use malloc and free, or at least the implementations of the JVM may. But since Java tracks allocations and garbage collects unreachable objects, they are definitely not enough.
As for the rest of your text, I'm not sure if there's a question there.
Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
That's likely due to the overhead of starting and running the JVM
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
I'm not entirely sure what you mean by "often crash," as I don't think this has happened to me in quite a long time. If it is, it's likely due to the "maximum size" setting you mentioned earlier.
Your main question asking why Java doesn't use malloc and free comes down to a matter of target market. Java was designed to eliminate the headache of memory management from the developer. Java's garbage collector does a reasonably good job of freeing up memory when it can be freed, but Java isn't meant to rival C++ in situations with memory restrictions. Java does what it was intended to do (remove developer level memory management) well, and the JVM picks up the responsibility well enough that it's good enough for most applications.
The limits are a deliberate design decision from Sun. I've seen at least two other JVMs which do not have this design: the Microsoft one and the IBM one for their non-PC AS/400 systems. Both grow as needed, using as much memory as needed.
Java doesn't use a fixed amount of memory; it is always in the range from -Xms to -Xmx.
If Eclipse crashes with OutOfMemoryError, then it required more memory than granted by -Xmx (a configuration issue).
Java cannot simply use malloc/free (for object creation) since its memory handling is quite different due to garbage collection (GC). The GC automatically removes unused objects, which is a benefit compared to being responsible for memory management yourself.
For details on this complex topic see Tuning Garbage Collection

What memory types does Tomcat use and how can it be controlled

I know there are several memory types that Tomcat uses when running.
The only one I have ever used is the Java heap. It can be controlled through the JAVA_OPTS environment variable with something like '-Xmx128M -Xms64M'.
I have found that there are also -XX:MaxPermSize, -XX:MaxNewSize, etc.
The reason I'm asking is that I'm trying to launch Tomcat 5.5 with 200 MB of RAM (it is a VPS). I have set the Java heap size with '-Xmx128M -Xms64M', but it seems that right from startup it consumes more than that (if it starts at all; sometimes startup fails right off the bat with an OutOfMemoryError), with no applications deployed.
Notably, if I launch Maven's Tomcat plugin, it works just fine. Only standalone Tomcat fails with memory errors.
Thanks in advance for your ideas.
As you say, heap memory is just one of the JVM's memory pools, there are others.
Read this to get an idea of what they are, how to control them, and how to monitor them:
http://java.sun.com/j2se/1.5.0/docs/guide/management/jconsole.html
Heap and Non-heap Memory
The JVM manages two kinds of memory:
heap and non-heap memory, both created
when it starts.
Heap memory is the runtime data area
from which the JVM allocates memory
for all class instances and arrays.
The heap may be of a fixed or variable
size. The garbage collector is an
automatic memory management system
that reclaims heap memory for objects.
Non-heap memory includes a method area
shared among all threads and memory
required for the internal processing
or optimization for the JVM. It stores
per-class structures such as a runtime
constant pool, field and method data,
and the code for methods and
constructors. The method area is
logically part of the heap but,
depending on implementation, a JVM may
not garbage collect or compact it.
Like the heap, the method area may be
of fixed or variable size. The memory
for the method area does not need to
be contiguous.
In addition to the method area, a JVM
implementation may require memory for
internal processing or optimization
which also belongs to non-heap memory.
For example, the JIT compiler requires
memory for storing the native machine
code translated from the JVM code for
high performance.
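The heap and non-heap pools described in this excerpt can be listed at runtime through the standard MemoryPoolMXBean interface. A minimal sketch follows; the class name MemoryPoolList is made up, and the pool names printed depend on the JVM version and the garbage collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class MemoryPoolList {
    /** Counts the memory pools of the given type (HEAP or NON_HEAP). */
    static int countPools(MemoryType type) {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            // A max of -1 means the JVM reports no defined limit for this pool.
            System.out.println(pool.getType() + " | " + pool.getName()
                    + " | used=" + (usage.getUsed() >> 10) + " KB"
                    + " | max=" + usage.getMax());
        }
    }
}
```

Running this inside the Tomcat JVM shows which pools sit outside the -Xmx limit and how much each one uses, which helps explain why the process needs more memory than the configured heap.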
Read here for some tips on setting the Java heap size. It is quite strange that Tomcat is giving you OutOfMemoryErrors even without any applications deployed. Perhaps there is something wrong with your configuration (what OS are you using, how do you start Tomcat?).
