I just found out that there are some libraries to compute the shallow size of a Java object, so I thought I could also write this in a very simple way. Here is what I tried.
Start the program with some Xmx, say A.
Create objects of the type whose size you want to calculate (say type T) and store them in a list so that the GC can't clean them up.
When we hit an OOM, let the code handle it and empty the list.
Now check the number of objects of type T we allocated. Let this be n.
Do a binary search to find the delta in order to successfully allocate n+1 objects.
Here is the code I tried:
import java.util.ArrayList;

public class test {
    public static void main(String[] a) {
        ArrayList<Integer> l = new ArrayList<>();
        int i = 0;
        try {
            while (true) {
                l.add(new Integer(1));
                i++;
            }
        } catch (Throwable e) {
        } finally {
            l.clear();
            System.out.println(i + "");
        }
    }
}
But I noticed that the number of objects allocated in each run with the same Xmx varies. Why is this? Is there anything inside the JVM that is randomized?
But I noticed that the number of objects allocated in each run with the same Xmx varies. Why is this?
Some events in the JVM are non-deterministic, and this can affect garbage collector behavior.
But there are other factors in play that result in a variable number of (your) objects being created before the heap fills up. These include:
Not all of the objects in the heap will be the ArrayList and Integer objects that you are explicitly creating. There will be Object[] objects that get created when you resize the ArrayList, various objects generated by your println calls ... and other things that happen under the hood.
Heap resizing behavior. The heap is not immediately sized to -Xmx size. The JVM typically starts with a smaller heap size and expands it on demand. By the time you get an OOME, the JVM has most likely expanded the heap to the max permitted, but the sequence of expansions is potentially sensitive to ... various factors including some that may be non-deterministic.
Heap generations. A typical Java GC uses an old space and a new space. The old space contains long-lived objects. New objects are allocated into the new space ... unless they are very large. The actual distribution of the objects can affect when the GC runs occur, and when the JVM decides that the heap is full.
JIT compilation. At certain points in the execution of your application, the JVM will (typically) decide to JIT compile your code. When this happens, extra objects get allocated.
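The first factor can be made concrete. OpenJDK's ArrayList grows its backing Object[] by roughly 1.5x (an implementation detail, not a guarantee), so a long run of add calls quietly allocates and abandons a whole series of arrays. A rough sketch of that growth schedule, under the stated assumptions:

```java
public class GrowthSketch {
    // Count backing-array reallocations for n add() calls, assuming the
    // OpenJDK growth rule newCap = oldCap + (oldCap >> 1) and a default
    // initial capacity of 10 (both are implementation details).
    static int resizesFor(int n) {
        int capacity = 10;
        int resizes = 0;
        while (capacity < n) {
            capacity += capacity >> 1;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        // Each resize allocates a fresh Object[] and abandons the old one,
        // so the loop in the question creates far more than n objects.
        System.out.println("adds: 1_000_000, backing-array reallocations: "
                + resizesFor(1_000_000));
    }
}
```

Each abandoned array becomes garbage whose collection timing varies run to run, which feeds directly into the variability described above.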
Is there anything inside the JVM that is randomized?
It is unlikely to be explicit randomization affecting this benchmark. There is sufficient non-determinism at various levels (i.e. in the hardware, the OS and the JVM) to explain the inconsistent results you are seeing.
In short: I wouldn't expect your benchmark to give consistent results for the number of objects that can be created.
Related
An OutOfMemoryError occurs when the heap does not have enough memory to create new objects. If that is the case, where is the OutOfMemoryError object itself created? I am trying to understand this; please advise.
Of course, this is implementation-dependent behavior. HotSpot reserves some heap memory, inaccessible for ordinary allocations, that the JVM can use to construct an OutOfMemoryError in. However, since Java allows an arbitrary number of threads, an arbitrary number of them may hit the wall at the same time, so there is no guarantee that this memory is enough to construct a distinct OutOfMemoryError instance for each of them.
Therefore, an emergency OutOfMemoryError instance is created at JVM startup and persists throughout the entire session, to ensure that the error can be thrown even when there really is no memory left. Since this instance is shared by all threads that encounter the error while there is truly no memory left, you can recognize this condition by the fact that the error has no stack trace.
The following program
import java.util.ArrayList;
import java.util.Collections;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ConcurrentHashMap<OutOfMemoryError, Integer> instances = new ConcurrentHashMap<>();
ExecutorService executor = Executors.newCachedThreadPool();
executor.invokeAll(Collections.nCopies(1000, () -> {
    ArrayList<Object> list = new ArrayList<>();
    for (;;) try {
        list.add(new int[10_000_000]);
    } catch (OutOfMemoryError err) {
        instances.merge(err, 1, Integer::sum);
        return err;
    }
}));
executor.shutdown();
System.out.println(instances.size() + " distinct errors created");
instances.forEach((err, count) -> {
    StackTraceElement[] trace = err.getStackTrace();
    System.out.println(err.getClass().getName() + "#" + Integer.toHexString(err.hashCode())
        + (trace != null && trace.length != 0 ? " has" : " has no") + " stacktrace, used " + count + 'x');
});
running under jdk1.8.0_65 with -Xmx100M and waiting half a minute gave me
5 distinct errors created
java.lang.OutOfMemoryError#c447d22 has no stacktrace, used 996x
java.lang.OutOfMemoryError#fe0b0b7 has stacktrace, used 1x
java.lang.OutOfMemoryError#1e264651 has stacktrace, used 1x
java.lang.OutOfMemoryError#56eccd20 has stacktrace, used 1x
java.lang.OutOfMemoryError#70ab58d7 has stacktrace, used 1x
showing that the reserved memory could serve the construction of four distinct OutOfMemoryError instances (including the memory needed to record their stack traces), while all other threads had to fall back to the shared reserved instance.
Of course, numbers may vary between different environments.
It's generated natively by the JVM, which isn't limited by -Xmx or other parameters. It is the heap reserved for your program that is exhausted, not the memory available to the JVM itself.
I dump all memory (jmap -histo) after I load data from memcached, and then I load the same data again (the data is loaded into another instance), but the used memory doesn't change.
No GC was done (I allocated a 2 GB new-size heap and checked with jstat and other tools that no GC happened).
The second heap contains 2 instances of the type that I load from memcached.
I compared the heap after the second load to the heap after the first load.
The second heap has more instances of every class type than the first heap, except for [I, of which the second heap has fewer than the first. The sum of the gaps (in bytes) for the other classes is the same as the gap for [I.
I looked at the bytecode and didn't see anything suspicious. Any ideas?
public static void main(String[] args) throws Throwable {
    Object a1 = mc.get("userClient");
    // -- get dump
    Object a11 = mc.get("userClient");
    // -- get dump
    mc.shutdown();
}
If you wish to see the size of an object on the heap, turn off TLABs (-XX:-UseTLAB) and monitor the change in Runtime.freeMemory() while creating an object (a heap dump is better, because freeMemory() is an approximation). This technique only works for objects allocated on the heap, so local variables stored on the stack won't show up.
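A minimal sketch of that technique follows; the class and method names are mine, not from any library. Note that without -XX:-UseTLAB the delta for small objects will usually read as 0, so this demo allocates a large array, which HotSpot allocates outside the TLAB anyway:

```java
import java.util.function.Supplier;

public class ShallowSizeProbe {
    // Approximate heap usage; only meaningful for small objects with -XX:-UseTLAB.
    static long memTaken() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Estimate bytes consumed by allocating via the supplied factory.
    static long estimate(Supplier<Object> factory) {
        long before = memTaken();
        Object obj = factory.get();          // keep a strong reference across the measurement
        long after = memTaken();
        if (obj == null) throw new IllegalStateException();
        return after - before;
    }

    public static void main(String[] args) {
        // A large array is allocated directly in the heap (outside any TLAB),
        // so its cost shows up even without -XX:-UseTLAB.
        long bytes = estimate(() -> new long[1_000_000]);
        System.out.println("~" + bytes + " bytes for long[1_000_000]");
    }
}
```

For small objects such as a single Float, run the same probe with -XX:-UseTLAB; otherwise the allocation is served from a preallocated TLAB chunk and the delta reads as zero.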
I'm trying to generate classes and load them at run time.
I'm using a ClassLoader object to load the classes. Since I don't want to run out of PermGen memory, from time to time I un-reference the class loader and create a new one to load the new classes to be used. This seems to work fine and I don't get a PermGen out of memory.
The problem is that when I do that, after a while I get the following error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
So my question is: when should I un-reference the class loader to avoid both errors? Should I monitor the PermGen usage in my code, so that I un-reference the class loader and call System.gc() when PermGen usage is close to the limit?
Or should I follow a different approach? Thanks
There is no single correct answer to this.
On the one hand, if unlinking the classloader is solving your permgen leakage problems, then you should continue to do that.
On the other hand, a "GC overhead limit exceeded" error means that your application is spending too much time in garbage collection. In most circumstances, this means that the heap is too full. That can mean one of two things:
The heap is too small for your application's requirements.
Your application has a memory leak.
You could assume that the problem is the former one and just increase the heap size. But if the real problem is the latter one, then increasing the heap size is just postponing the inevitable ... and the correct thing to do would be to find and fix the memory leak.
Don't call System.gc(). It won't help.
Are you loading the same class multiple times?
If so, you should cache the loaded class.
If not, how many classes are you loading?
If there are many, you may have to set a limit on the number of loaded classes (based either on heap size or on how much memory a loaded class takes) and discard the least-recently-used one when loading the next.
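A sketch of such a limit using a LinkedHashMap in access order; the ClassCache name and the eviction policy are illustrative assumptions, not an established API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache capping the number of classes kept loaded.
public class ClassCache extends LinkedHashMap<String, Class<?>> {
    private final int maxEntries;

    public ClassCache(int maxEntries) {
        super(16, 0.75f, true);   // access-order iteration gives LRU behavior
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Class<?>> eldest) {
        // Dropping the entry releases the cache's reference; the class can be
        // unloaded once its ClassLoader becomes otherwise unreachable.
        return size() > maxEntries;
    }
}
```

Usage: look a class up by name before generating it, and put it into the cache after loading; the least-recently-used entry is dropped automatically when the cap is exceeded.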
I had a somewhat similar situation with class unloading.
I'm using several class loaders to simulate multiple JVMs inside a JUnit test (this is usually used to work with an Oracle Coherence cluster, but I have also successfully used this technique to start a multi-node HBase/Hadoop cluster inside one JVM).
For various reasons, tests may require a restart of such a "virtual" JVM, which means forfeiting the old ClassLoader and creating a new one.
Sometimes the JVM delays class unloading even if you force a Full GC, which leads to various problems later.
One technique I found useful for forcing the JVM to collect PermSpace is the following.
public static void forcePermSpaceGC(double factor) {
    if (PERM_SPACE_MBEAN == null) {
        // probably not a HotSpot JVM
        return;
    }
    else {
        double f = ((double) getPermSpaceUsage()) / getPermSpaceLimit();
        if (f > factor) {
            List<String> bloat = new ArrayList<String>();
            int spree = 0;
            int n = 0;
            while (spree < 5) {
                try {
                    byte[] b = new byte[1 << 20];
                    Arrays.fill(b, (byte) ('A' + ++n));
                    bloat.add(new String(b).intern());
                    spree = 0;
                }
                catch (OutOfMemoryError e) {
                    ++spree;
                    System.gc();
                }
            }
            return;
        }
    }
}
Full sourcecode
I fill PermSpace with Strings using intern() until the JVM collects them.
But
I'm using that technique only for testing.
Various combinations of hardware and JVM versions may require different thresholds, so it is often quicker to restart the whole JVM instead of forcing it to properly collect all garbage.
Java programmers know that the JVM runs a Garbage Collector, and that System.gc() is just a suggestion to the JVM to run it. Using System.gc() does not necessarily make the GC run immediately. Please correct me if I misunderstand Java's Garbage Collector.
Is there any other way of doing memory management besides relying on Java's Garbage Collector? If you intend to answer the question with some programming practice that helps manage memory, please do so.
The most important thing to remember about Java memory management is to "nullify" your references.
Only objects that are not referenced get garbage collected.
For example, the objects in the following code never get collected, and your memory fills up just to do nothing.
List objs = new ArrayList();
for (int i = 0; i < Integer.MAX_VALUE; i++) objs.add(new Object());
But if you don't keep references to those objects ... you can loop as much as you like without memory problems.
List objs = new ArrayList();
for (int i = 0; i < Integer.MAX_VALUE; i++) new Object();
So whatever you do, make sure you remove references to objects that are no longer used (set the reference to null or clear the collection).
When the garbage collector runs is best left to the JVM to decide. The exception is when your program is about to start doing something memory-intensive and speed-critical: then you may suggest that the JVM run the GC beforehand, since you will likely get the garbage collected and have extra memory to go on with. Otherwise, I personally see no reason to call System.gc().
Hope this helps.
Below is a little summary I wrote back in the day (I took it from some blog, but I can't remember where from, so no reference, sorry).
There is no manual way of doing garbage collection in Java.
The Java heap is divided into three generations for the sake of garbage collection: the young generation, the tenured or old generation, and the Perm area.
New objects are created in the young generation and subsequently moved to the old generation.
The String pool is created in the Perm area of the heap. Garbage collection can occur in perm space, but this depends on the JVM.
Minor garbage collection moves objects from the Eden space to the Survivor 1 and Survivor 2 spaces, and major collection moves objects from the young to the tenured generation.
Whenever a major garbage collection occurs, application threads are stopped during that period, which reduces the application's performance and throughput.
A few performance improvements have been applied to garbage collection in Java 6, and we usually use JRE 1.6.20 for running our applications.
The JVM command-line options -Xms and -Xmx are used to set the starting and maximum size of the Java heap. Based on my experience, the ideal ratio of these parameters is either 1:1 or 1:1.5; for example, you could have both -Xms and -Xmx at 1 GB, or -Xms at 1.2 GB and -Xmx at 1.8 GB.
Command-line options: -Xms<min size> -Xmx<max size>
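The sizes chosen with those flags can be checked from inside a running program; this is a small sketch, and the exact numbers depend on your flags and JVM:

```java
public class HeapSizes {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() starts near -Xms and grows on demand;
        // maxMemory() reflects the -Xmx ceiling.
        System.out.println("current heap size: " + rt.totalMemory() + " bytes");
        System.out.println("max heap size:     " + rt.maxMemory() + " bytes");
    }
}
```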
Just to add to the discussion: Garbage Collection is not the only form of Memory Management in Java.
In the past, there have been efforts to avoid GC in Java when implementing memory management (see the Real-time Specification for Java (RTSJ)). These efforts were mainly dedicated to real-time and embedded programming in Java, for which GC was not suitable due to performance overhead or GC-introduced latency.
The RTSJ characteristics
Immortal and Scoped Memory Management - see below for examples.
GC and Immortal/Scoped Memory can coexist within one application.
RTSJ requires a specially modified JVM.
RTSJ advantages:
low latency, no GC pauses
delivers predictable performance that is able to meet real-time system requirements
Why RTSJ failed/Did not make a big impact:
The Scoped Memory concept is hard to program with, error-prone, and difficult to learn.
Advances in real-time GC algorithms reduced GC pause times to the point that real-time GCs replaced the RTSJ in most real-time apps. However, Scoped Memory is still used in places where no latency can be tolerated.
Scoped Memory code example (taken from An Example of Scoped Memory Usage):
import javax.realtime.*;

public class ScopedMemoryExample {
    private LTMemory myMem;

    public ScopedMemoryExample(int size) {
        // initialize memory
        myMem = new LTMemory(1000, 5000);
    }

    public void periodicTask() {
        while (true) {
            myMem.enter(new Runnable() {
                public void run() {
                    // do some work in the SCOPED MEMORY
                    new Object();
                    ...
                    // at the end of the enter() method, the scoped memory is emptied
                }
            });
        }
    }
}
Here, a ScopedMemory implementation called LTMemory is preallocated. A thread then enters the scoped memory and allocates the temporary data that are needed only during the computation. After the computation ends, the thread leaves the scoped memory, which immediately empties the entire content of that specific ScopedMemory. No latency is introduced; this happens in constant, i.e. predictable, time, and no GC is triggered.
From my experience, in Java you should rely on the memory management provided by the JVM itself.
The point I'd focus on in this topic is to configure it in a way acceptable for your use case. Maybe checking/understanding JVM tuning options would be useful: http://docs.oracle.com/cd/E15523_01/web.1111/e13814/jvm_tuning.htm
You cannot avoid garbage collection if you use Java. Maybe there are some obscure JVM implementations that do, but I don't know of any.
A properly tuned JVM shouldn't require any System.gc() hints to operate smoothly. The exact tuning you would need depends heavily on what your application does, but in my experience, I always turn on the concurrent-mark-and-sweep option with the following flag: -XX:+UseConcMarkSweepGC. This flag allows the JVM to take advantage of the extra cores in your CPU to clean up dead memory on a background thread. It helps to drastically reduce the amount of time your program is forcefully paused when doing garbage collections.
Well, the GC is always there -- you can't create objects that are outside its grasp (unless you use native calls or allocate a direct byte buffer, but in the latter case you don't really have an object, just a bunch of bytes). That said, it's definitely possible to circumvent the GC by reusing objects. For instance, if you need a bunch of ArrayList objects, you could just create each one as you need it and let the GC handle memory management; or you could call list.clear() on each one after you finish with it, and put it onto some queue where somebody else can use it.
Standard best practices are to not do that sort of reuse unless you have good reason to (ie, you've profiled and seen that the allocations + GC are a problem, and that reusing objects fixes that problem). It leads to more complicated code, and if you get it wrong it can actually make the GC's job harder (because of how the GC tracks objects).
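A minimal sketch of the reuse pattern described above; the ListPool name is mine, the pool is not thread-safe, and this is only worth doing after profiling shows allocation pressure:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;

// Hypothetical single-threaded pool of reusable ArrayList instances.
public class ListPool {
    private final ArrayDeque<ArrayList<Object>> free = new ArrayDeque<>();

    public ArrayList<Object> borrow() {
        ArrayList<Object> list = free.poll();
        return list != null ? list : new ArrayList<>();
    }

    public void release(ArrayList<Object> list) {
        list.clear();          // drop element references so pooled lists don't pin objects
        free.push(list);
    }
}
```

The clear() in release() matters: a pooled list that still references its old elements keeps them reachable, which is exactly the kind of GC-tracking burden the paragraph above warns about.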
Basically, the idea in Java is that you should not deal with memory except by using "new" to allocate objects and ensuring that no references are left to objects when you are done with them.
All the rest is deliberately left to the Java Runtime and is - also deliberately - defined as vaguely as possible to allow the JVM designers the most freedom in doing so efficiently.
To use an analogy: your operating system manages named areas of hard disk space (called "files") for you, including deleting and reusing areas you do not want to use any more. You do not circumvent that mechanism but leave it to the operating system.
You should focus on writing clear, simple code and ensure that your objects are properly done with. This will give the JVM the best possible working conditions.
You are correct in saying that System.gc() is a request to the JVM and not a command. But using the program below you can make reasonably sure that it happens.
import java.lang.ref.WeakReference;

public class GCRun {
    public static void main(String[] args) {
        String str = new String("TEMP");
        WeakReference<String> wr = new WeakReference<String>(str);
        str = null;
        String temp = wr.get();
        System.out.println("temp -- " + temp);
        temp = null;                  // drop the strong reference, or the loop below never terminates
        while (wr.get() != null) {
            System.gc();
        }
    }
}
I would suggest taking a look at the following tutorials and their contents.
This is a four part tutorial series to know about the basics of garbage collection in Java:
Java Garbage Collection Introduction
How Java Garbage Collection Works?
Types of Java Garbage Collectors
Monitoring and Analyzing Java Garbage Collection
I found this tutorial very helpful.
"Nullify"ing the reference when not required is the best way to make an object eligible for Garbage collection.
There are 4 ways in which an object can be Garbage collected.
Point the reference to null, once it is no longer required.
String s = new String("Java");
Once this String is not required, you can point it to null.
s = null;
Hence, the object s referred to will be eligible for garbage collection.
Point one reference to another object, so that both references point to the same object and one of the objects becomes eligible for GC.
String s1 = new String("Java");
String s2 = new String("C++");
If in the future s1 needs to point to the same object as s2, then:
s1 = s2;
Then the object having "Java" will be eligible for GC.
All the objects created within a method become eligible for GC once the method completes (provided no references escape). Once the method's frame is removed from the thread's stack, the objects referenced only from that frame can be destroyed.
Island of Isolation is another concept, where objects with internal links but no external references are eligible for garbage collection.
"Island of isolation" of Garbage Collection
Examples:
Below is a method of the Camera class in Android. See how the developer has pointed mCameraSource to null once it is no longer required. This is expert-level code.
public void release() {
if (mCameraSource != null) {
mCameraSource.release();
mCameraSource = null;
}
}
How does the Garbage Collector work?
Garbage collection is performed by a daemon thread called the Garbage Collector. When there is sufficient memory available, this daemon thread has low priority and runs in the background. But when the JVM finds that the heap is full and wants to reclaim some memory, it raises the priority of the Garbage Collector thread and calls the Runtime.getRuntime().gc() method, which searches for all the objects that have no references (or only null references) and destroys those objects.
I've searched for the size of a Java object for a long time, and there are a lot of answers like this one; everyone tells me the size of the overhead of a Java object and how to calculate the actual size. But how do they know that? I did not find any evidence in the official Oracle documents. What is the evidence for that conclusion? Or did the data just come from guesses based on some experiments?
Another thing: it is mentioned in the official documentation that there is an 'approximate' way to measure an object, the Instrumentation way. Can anybody explain to me what 'approximate' means? When is it accurate, and when is it not? Better with evidence.
how to calculate the actual size. But how do they know that?
From experience.
I did not find any evidence from the official oracle documents.
It's up to the JVM. For OpenJDK-based JVMs, a 32-bit JVM has a different header size from a 64-bit JVM, as the header contains a reference. Other JVMs could be different again.
Or the data was just come from some guesses based on some experiments?
Essentially, yes.
Can anybody explain to me what 'approximate' means?
When you measure the size of an object, it can mean many different things:
How big is the object? (Shallow depth)
How much memory does it use? (Object allocation is 8-byte aligned i.e. always a multiple of 8)
How much space does the object and all the objects referenced use? (Deep depth)
How much space might be freed if it were discarded? (how many objects are shared and does it appear in the middle of two fragments of freed memory)
Do you count the space used on the stack, or space used in off heap memory?
Given that you can get many different answers depending on what you need to know, it is useful to have one number which is approximately close to all of them and which you can use for calculations.
A problem with using Runtime is that TLABs are allocated as large blocks, which individual threads then allocate from. The downside is that you don't get accurate memory-usage information.
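The alignment point in (2) can be shown with plain arithmetic: the allocated footprint is the raw size rounded up to the next multiple of 8 (on typical HotSpot settings; the granularity is configurable with -XX:ObjectAlignmentInBytes):

```java
public class Align {
    // Round a raw object size up to the 8-byte allocation granularity.
    static long aligned(long rawSize) {
        return (rawSize + 7) & ~7L;
    }

    public static void main(String[] args) {
        System.out.println(aligned(12));  // a 12-byte object still occupies 16 bytes
        System.out.println(aligned(16));  // an exact multiple stays as-is
    }
}
```

This is one reason "size" answers differ: a field-by-field sum and the actual heap footprint can disagree by up to 7 bytes per object.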
static long memTaken() {
    final Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
}

public static void main(String... args) {
    long used1 = memTaken();
    Float i = new Float(0);
    long used2 = memTaken();
    System.out.println("new Float(0) used " + (used2 - used1) + " bytes.");
}
run without options
new Float(0) used 0 bytes.
Turn off the TLAB and you see with -XX:-UseTLAB
new Float(0) used 336 bytes.
This is much higher than you might expect, because the class itself had to be loaded. If you create one instance of a Float first, by adding the following to the start:
Float j = new Float(1);
you get
new Float(0) used 16 bytes
I think the answer can only be empirical, as it is implementation-dependent:
The Java virtual machine does not mandate any particular internal structure for objects.
For me the best method is the most down-to-earth one, using
long memTaken() {
    final Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
}
Remember the initial memTaken();
Make an array of a million objects (adapt this number to fit your heap);
Subtract the remembered value from the new memTaken().
There may be some transient allocation in the process, so you also need to run the GC. There are no guarantees provided by System.gc(), but this approach has always worked for me:
for (int i = 0; i < 3; i++) { System.gc(); Thread.sleep(50); }
You must be careful to ensure this gives stable results. For example, an approach is to calculate memory load for several different object counts and compare results. They should all match on the memory/instance answer they give.
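Putting those steps together, a sketch might look as follows. The numbers it prints depend on the JVM, architecture, and compressed-oops settings, and the estimate includes a few bytes per element for the holding array itself:

```java
public class BulkSize {
    static long memTaken() {
        final Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Run the GC a few times to clear transient allocations before measuring.
    static void settleHeap() {
        for (int i = 0; i < 3; i++) {
            System.gc();
            try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
    }

    // Estimate bytes per instance by bulk allocation; the result includes the
    // array slot (4 to 8 bytes per element) holding each reference.
    static long estimatePerObject(int n) {
        settleHeap();
        long before = memTaken();
        Object[] objects = new Object[n];
        for (int i = 0; i < n; i++) objects[i] = new Object();
        long after = memTaken();
        if (objects[n - 1] == null) throw new IllegalStateException(); // keep the array live past the measurement
        return (after - before) / n;
    }

    public static void main(String[] args) {
        System.out.println("~" + estimatePerObject(1_000_000)
                + " bytes per java.lang.Object (array slot included)");
    }
}
```

To check stability, run estimatePerObject with several different counts and compare; as noted above, the per-instance figures should agree.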
I have advised this before, been put down for it as a lame solution, and been recommended to use proper tooling, etc. So I used tooling, for example jvisualvm, and it gave wrong results. This method has never given me wrong results.
There are two different sizes for Java objects: the shallow size and the retained size.
The shallow size of an object is the sum of the sizes of all its fields: straightforward for primitive members, and the size of a pointer for every non-primitive member (this varies between 32- and 64-bit architectures). You can find the shallow size of an object at runtime using instrumentation. Here is a nice tutorial.
The retained size is defined by the heap space that will be freed when the object is garbage collected.
Finding the retained size of an object is not trivial, mainly because objects are usually composed of other objects. For example:
public class Clazz {
    public byte[] member;

    public static void main(String[] args) {
        byte[] bytes = new byte[128];
        Clazz a = new Clazz();
        Clazz b = new Clazz();
        a.member = bytes;
        b.member = bytes;
    }
}
What is the retained size of a? And of b?
Use a profiler to compute retained size (most profilers use static heap analysis for that).