We're trying to tweak some Oracle JVM garbage collection options and one developer tried to use -XX:PretenureSizeThreshold to make sure a large array of objects was put in Tenured right away. I'm pretty sure the assumption was that the array size equals or exceeds the total size of all the objects in it.
But in Java, aren't arrays of objects just arrays of references? I.e. each object in the array, as well as the array object itself, is separate in memory and treated as separate by the garbage collector? I think the array object can still get fairly large if there are millions of entries, but it shouldn't be anywhere near the total size of the objects it "contains" if each object is much bigger than a reference.
I think there's confusion because AFAIK, in C:
1. It's possible to have an array of structs that really does store the structs.
2. It's also possible to have an array of pointers to structs.
I'm pretty sure Java always uses 1. for arrays of primitive types and always uses 2. for arrays of objects, while C can use either for any type...?
What if I use an ArrayList with frequent append()s (as we are in the case at hand)? Is only the array copied, and not the objects in the array? Also, when the array is copied, even if the old array was in Tenured the new one starts in Eden, right?
But in Java, aren't arrays of objects just arrays of references?
Just references. All objects are allocated on the heap, never in arrays or on the stack (at least officially, the optimizer may use stack allocation if possible, but this is transparent).
it shouldn't be anywhere near the total size of the objects it "contains" if each object is much bigger than a reference.
Yes, in Java whenever you say "assign/store an object", you mean the reference (pointer in C terminology).
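A minimal sketch of what "storing an object" in an array amounts to. The Payload class and the buildSharedSlots method are hypothetical names made up for this illustration:

```java
final class ReferenceArrayDemo {
    // Hypothetical payload type, used only for illustration.
    static final class Payload {
        final byte[] data = new byte[1024]; // far bigger than a single reference
        int version;
    }

    /** Stores the same object in two slots and mutates it through one of them. */
    static Payload[] buildSharedSlots() {
        Payload shared = new Payload();
        Payload[] slots = new Payload[3]; // three references, all null at first
        slots[0] = shared;
        slots[1] = shared;                // copies the reference, not the object
        slots[0].version = 7;             // visible through slots[1] as well
        return slots;
    }

    public static void main(String[] args) {
        Payload[] slots = buildSharedSlots();
        System.out.println(slots[0] == slots[1]); // same object behind both slots
        System.out.println(slots[1].version);
        System.out.println(slots[2]);             // slot never assigned
    }
}
```

Both slots refer to one Payload, so the array itself stays small no matter how large each referenced object is.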
What if I use an ArrayList with frequent append()s (as we are in the case at hand)? Is only the array copied, and not the objects in the array?
The array is copied only when it needs to be resized, which happens rarely; the amortized cost is proportional to the number of inserts. The referenced objects are never copied.
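A small sketch of that behaviour. The class and method names are hypothetical, and the grow helper only imitates a resize with Arrays.copyOf; it is not the actual ArrayList implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GrowthDemo {
    /** Adds many elements and checks that the first one is still the very same object. */
    static boolean elementsSurviveGrowth(int count) {
        List<Object> list = new ArrayList<>(1); // tiny initial capacity forces resizes
        Object first = new Object();
        list.add(first);
        for (int i = 1; i < count; i++) {
            list.add(new Object());
        }
        return list.get(0) == first; // identity comparison, not equals()
    }

    /** Imitates a resize: a new, larger array that receives copies of the references. */
    static Object[] grow(Object[] old) {
        return Arrays.copyOf(old, old.length * 2);
    }

    public static void main(String[] args) {
        System.out.println(elementsSurviveGrowth(1000));
        Object[] old = { new Object(), new Object() };
        Object[] grown = grow(old);
        System.out.println(grown[0] == old[0]); // same object, new backing array
    }
}
```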
Also, when the array is copied, even if the old array was in Tenured the new one starts in Eden, right?
Yes!
Using -XX:PretenureSizeThreshold for tuning is unlikely to help you.
This parameter applies only to allocations made directly in Eden, while most allocations happen in a TLAB (Thread Local Allocation Buffer), where -XX:PretenureSizeThreshold is ignored.
A TLAB can be quite large (a few megabytes) for a thread that is actively allocating memory.
You can tweak TLAB sizing to reduce this effect, but that would probably do more harm than good.
But in Java, aren't arrays of objects just arrays of references? I.e.
each object in the array, as well as the array object itself, is
separate in memory and treated as separate by the garbage collector?
Yes.
I think there's confusion because AFAIK, in C:
1. It's possible to have an array of structs that really does store the structs.
2. It's also possible to have an array of pointers to structs.
I'm pretty sure Java always uses 1. for arrays of primitive types and
always uses 2. for arrays of objects, while C can use either for any
type...?
Java, like C, typically stores arrays of primitive types as actual arrays with elements of those types. So an int[] array with 10 elements is typically going to reserve 10×4 bytes for the array, plus overhead for the entire array object.
Arrays of objects, however, are, as you say, arrays of references. So an Object[] of 10 elements is typically going to take up 10×4 bytes (or perhaps 10×8 bytes on 64-bit CPUs) for the array, plus overhead, plus space for each object that each non-null element references. This corresponds in C to an array of pointers.
(I use the term "typically", because even though that's how most JVMs do it, they are not required to allocate memory in any particular fashion.)
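The arithmetic above can be sketched as a rough estimator. The 16-byte array header and the 8-byte alignment used here are assumptions about a typical 64-bit HotSpot configuration, not guarantees, and ArrayFootprint is a hypothetical helper:

```java
final class ArrayFootprint {
    /**
     * Rough shallow size of an array: header plus elements, rounded up to 8 bytes.
     * headerBytes and the 8-byte alignment are assumptions, not spec guarantees.
     */
    static long shallowBytes(long length, int bytesPerElement, int headerBytes) {
        long raw = headerBytes + length * bytesPerElement;
        return (raw + 7) / 8 * 8; // round up to the next multiple of 8
    }

    public static void main(String[] args) {
        // int[10]: 10 x 4 bytes of elements plus the assumed 16-byte header
        System.out.println(shallowBytes(10, 4, 16));
        // One million references at 4 bytes each (assumed compressed references)
        System.out.println(shallowBytes(1_000_000, 4, 16));
        // The same array with 8-byte references
        System.out.println(shallowBytes(1_000_000, 8, 16));
    }
}
```

Under these assumptions a million-element reference array is on the order of 4 to 8 MB, regardless of how large the referenced objects are.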
Also be aware that Java does not have true multi-dimensional arrays like C (or C#). An int[][] in Java is actually a one-dimensional array, where each element is a reference to its own int[] subarray. In C, an int[][] really is a two-dimensional array of integers (where the lengths of all but the first dimension must be known at compile time).
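A short sketch of what this means in practice: because each row is its own int[] object, rows can have different lengths. JaggedDemo and triangle are hypothetical names:

```java
final class JaggedDemo {
    /** Builds a "triangular" int[][] where row r has r + 1 elements. */
    static int[][] triangle(int rows) {
        int[][] t = new int[rows][];   // only the outer array of references exists so far
        for (int r = 0; r < rows; r++) {
            t[r] = new int[r + 1];     // each row is a separate int[] object
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] t = triangle(3);
        for (int[] row : t) {
            System.out.println(row.length);
        }
    }
}
```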
Addendum
Also note that, like you say, C can have true arrays of structs, which are neither primitive types nor pointers. Java does not have this capability.
In Java, objects are created on the heap, and as a result objects have a dynamic capacity. What I don't understand is why the capacity of an array in Java has to be specified when it is initialized.
Thanks.
Your question is a bit confusing, but the size of an array is not defined in the array declaration. It is defined when the array is created, that is, when you assign a newly created array to your declared array variable.
You have to do this because Java arrays are of fixed size and not dynamically resized. You can use ArrayList instead of array if you don't want the size to be fixed.
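A minimal sketch of the difference between declaring and creating an array, and of the ArrayList alternative. All class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

final class ArraySizeDemo {
    static int[] create(int n) {
        int[] values;         // declaration: no size, just a reference variable
        values = new int[n];  // creation: the length is chosen here and stays fixed
        return values;
    }

    /** Returns true if writing one element past the end is rejected. */
    static boolean writePastEndFails(int[] a) {
        try {
            a[a.length] = 1;
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    /** An ArrayList needs no size up front and grows as elements are added. */
    static List<Integer> growable(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        int[] a = create(5);
        System.out.println(a.length);
        System.out.println(writePastEndFails(a));
        System.out.println(growable(8).size());
    }
}
```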
This is not specific to Java. Allocating memory for an array in Java is almost the same as using malloc in C.
I was reading about data locality and want to use it to improve my game engine that I'm writing.
Let's say that I have created five objects at different times that are now all in different places in memory, not next to each other. If I add them all to an array, will that array only hold pointers to those objects, leaving them in the same place in memory, or will adding them all to an array rearrange them and make them contiguous?
I ask this because I thought that using arrays would be a good way to make them contiguous, but I don't know if an array will fix my problem!
tl;dr
Manipulating an array of references to objects has no effect on the objects, and has no effect on the objects’ location in memory.
Objects
An array of objects is really an array of references (pointers) to objects. A pointer is an address to another location in memory.
We speak of the array as holding objects, but that is not technically accurate. Because Java does not expose pointers themselves to us as programmers, we are generally unaware of their presence. When we access an element in the array, we are actually retrieving a pointer, but Java immediately follows that pointer to locate the object elsewhere in memory.
This automatic look-up, following the pointer to the object, makes the array of pointers feel like an array of objects. The Java programmer thinks of her array as holding her objects when in reality the objects are a hop-skip-and-a-jump away.
Arrays in Java are implemented as contiguous blocks of memory. For an array of objects, the pointers to those objects are being stored in contiguous memory. But when we access the elements, we are jumping to another location in memory to access the actual object that we want.
Adding elements may be “cheap” in that if memory happens to be available next door in memory, it can be allocated to the array to make room for more elements. In practice this is unlikely. Chances are a new array must be built elsewhere in memory, with all the pointers being copied over to the new array and then discarding the original array.
Such a new-array-and-copy-over is “expensive”. When feasible, we want to avoid this operation. If you know the likely maximum size of your array, specify that size when creating the array. The entire block of contiguous memory is claimed immediately, with every element empty (null) until you later assign a reference to it.
Inserting into the middle of an array is also expensive. Either a new array is built and elements copied over, or all the elements after the insertion point must be moved down into their neighboring position.
None of these operations to the array affect the objects. The objects are floating around in the ether of memory. The objects know nothing of the array. Operations on the array do not affect the objects nor their position in memory. The only relationship is that if the reference held in the array is the last reference still pointing to the object, then when that array element is cleared or deleted, the object becomes a candidate for garbage-collection.
Primitives
In Java, the eight primitive types (byte, short, int, long, float, double, boolean, and char) are not objects and have no class, so they sit outside the object-oriented part of the language. One advantage is that they are fast and take little memory compared to objects.
An array of primitives holds the values within the array itself. So these values are stored next to one another, contiguous in memory. No references/pointers. No jumping around in memory.
As for adding or inserting, the same behavior discussed above applies. Except that instead of pointers being shuffled around, the actual primitive values are being shuffled around.
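A sketch of the new-array-and-copy-over approach to inserting into the middle of a primitive array. InsertDemo and insert are hypothetical names:

```java
import java.util.Arrays;

final class InsertDemo {
    /** Returns a new array with value inserted at index; src is left untouched. */
    static int[] insert(int[] src, int index, int value) {
        int[] dst = new int[src.length + 1];
        System.arraycopy(src, 0, dst, 0, index);                          // elements before the gap
        dst[index] = value;
        System.arraycopy(src, index, dst, index + 1, src.length - index); // elements after it
        return dst;
    }

    public static void main(String[] args) {
        int[] result = insert(new int[] {1, 2, 4}, 2, 3);
        System.out.println(Arrays.toString(result));
    }
}
```

For an array of objects the same code shape applies, except that the values being moved are references.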
Tips
In business apps, it is generally best to use objects.
That means using the wrapper classes instead of primitives. For example, Integer instead of int. The auto-boxing facility in Java makes this easier by automatically converting between primitive values and their object wrapper.
And preferring objects means using a Collection instead of arrays, usually a List, specifically an ArrayList. Or, for immutable use, a List implementation returned from the List.of method (new in Java 9).
In contrast to business apps, in extreme situations where speed and memory usage are paramount, such as your game engine, then make the most of arrays and primitives.
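One way to follow that advice in a game engine is to keep each field in its own primitive array instead of keeping an array of objects. The sketch below assumes a hypothetical particle system; the Particles class and its fields are invented for illustration:

```java
final class Particles {
    // One primitive array per field: the values of each field are stored inline and contiguously.
    final float[] x;
    final float[] y;
    final float[] vx;
    final float[] vy;

    Particles(int count) {
        x = new float[count];
        y = new float[count];
        vx = new float[count];
        vy = new float[count];
    }

    /** Advances every particle by dt seconds, walking each array front to back. */
    void step(float dt) {
        for (int i = 0; i < x.length; i++) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
        }
    }

    public static void main(String[] args) {
        Particles p = new Particles(2);
        p.vx[0] = 2f;
        p.vy[1] = -4f;
        p.step(0.5f);
        System.out.println(p.x[0] + " " + p.y[1]);
    }
}
```

The trade-off is that a "particle" is no longer a single object, only an index shared by several arrays.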
In the future, the distinction between objects and primitives may blur if the work done in Project Valhalla comes to fruition.
The data values are stored in objects, and those values are retrieved through references to the objects. One more thing to make clear: arrays in Java are themselves objects. So an array object stores its values and is accessed through a reference variable, like any other object. Hope that helps.
Java deals with references to objects only. As such, there's no guarantee that the elements of an array will be contiguous in memory.
Edit: Guess this answer wasn't that clear. My bad. I meant that there's no guarantee that the objects themselves will be contiguous, in spite of the fact that the references will be, as 1-D arrays are stored contiguously. Still, Basil Bourque's answer perfectly explains how this works.
This is the usual way to declare a Java array:
int[] arr = new int[100];
But this array uses heap space. Is there a way to declare an array that uses stack space, as in C++?
Arrays are objects regardless of whether they hold a primitive type or an object type, so like any other object an array is allocated on the heap.
Starting with Java 6u23, however, the HotSpot JVM gained escape analysis, which is enabled by default in Java 7.
Escape analysis is about the scope of an object: when an object is created inside a method and never stored anywhere that outlives the call, the JVM knows that it cannot escape this limited method scope and can apply various optimizations to it.
One of them is that it can, in effect, allocate such an object on the stack of the thread that is executing the method.
In a word, no.
The only variables that are stored on the stack are primitives and object references. In your example, the arr reference is stored on the stack, but it references data that is on the heap.
If you're asking this question coming from C++ because you want to be sure your memory is cleaned up, read about garbage collection. In short, Java automatically takes care of cleaning up memory in the heap as well as memory on the stack.
Arrays are dynamically allocated so they go on the heap.
I mean, what happens when you do this:
int[] arr = new int[4];
arr = new int[5];
If the first allocation was done on the stack, how would we garbage collect it? The reference arr is stored on the stack, but the actual array of data must be on the heap.
It's not yet supported as a language feature, because that would require value types since passing on-stack data by reference would not be safe.
But as an optimization (escape analysis) the JVM may already do that for local variables containing small, fixed-size arrays iff it can prove that it does not escape the local/callee scope. That said, it's just a runtime optimization and not some spec guarantee, so relying on it is difficult.
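A sketch of the kind of code where such an optimization is at least possible: the array below is small, has a fixed size, and is never stored, returned, or passed to another method. Whether a given JVM actually keeps it off the heap is an assumption and cannot be observed from the program itself. EscapeDemo is a hypothetical name:

```java
final class EscapeDemo {
    /** The scratch array never leaves this method, so it does not escape. */
    static int sumOfSquares(int a, int b, int c) {
        int[] scratch = {a, b, c}; // a candidate for the optimization, no guarantee
        int total = 0;
        for (int v : scratch) {
            total += v * v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1, 2, 3));
    }
}
```

The program behaves the same either way, which is why this cannot be relied on as a language feature.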
Since interfaces only specify methods and not instance variables, how is storage allotted to something like:
Comparable[] aux = new Comparable[20];
How much per location storage (i.e. not counting array overhead) will be allocated?
The array only allocates enough contiguous memory for the pointers to the objects; it doesn't need to allocate memory for the actual objects themselves.
We can sometimes forget that Java still uses "pointers" (aka references); it just doesn't provide the same level of access to those pointers that other languages do.
Objects are reference types, so every subtype of Object (including Comparable and every other interface) is a reference type. This means that the size of every array element is the size of an object reference. It makes no difference what kind of object it is.
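A small sketch of the array from the question. ComparableSlotsDemo is a hypothetical name, and the wildcard form Comparable<?>[] is used only to avoid raw-type warnings:

```java
final class ComparableSlotsDemo {
    /** Creates 20 reference slots and fills two of them with unrelated implementations. */
    static Comparable<?>[] create() {
        Comparable<?>[] aux = new Comparable<?>[20]; // 20 null references, no objects yet
        aux[0] = "text";                             // a String
        aux[1] = Integer.valueOf(42);                // an Integer
        return aux;
    }

    public static void main(String[] args) {
        Comparable<?>[] aux = create();
        System.out.println(aux.length);
        System.out.println(aux[0].getClass().getSimpleName());
        System.out.println(aux[1].getClass().getSimpleName());
        System.out.println(aux[2]); // never assigned
    }
}
```

Every slot has the same size because it holds only a reference, whatever class the referenced object turns out to be.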
Programming languages like C and C++ can keep array values on the stack rather than on the heap. Why is it necessary in Java to keep array values on the heap?
In Java, arrays (just like all other objects) are passed around by reference: when you pass an array to a method, the method gets a reference pointing to the same location in memory, and no copy is made. This means that the array needs to remain "alive" after the method that created it returns, and so it cannot be stored in the stack frame for that method. It needs to be managed by the garbage collector, just like all other objects.
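A minimal sketch of both points: a callee writing into the caller's array, and an array outliving the method that created it. All names are hypothetical:

```java
final class ArraySharingDemo {
    /** Writes into the caller's array; no copy is made when the array is passed in. */
    static void fill(int[] target, int value) {
        for (int i = 0; i < target.length; i++) {
            target[i] = value;
        }
    }

    /** The array created here must outlive this method's stack frame. */
    static int[] makeCounter() {
        int[] counter = new int[1];
        counter[0] = 42;
        return counter;
    }

    public static void main(String[] args) {
        int[] data = new int[3];
        fill(data, 9);
        System.out.println(data[2]);       // the caller sees the callee's writes
        System.out.println(makeCounter()[0]);
    }
}
```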
There is some research going on into optimizing JVM memory allocation using "escape analysis": if an object (such as an array) can be guaranteed never to leave the current scope, it becomes possible to in fact allocate it on the stack, which is more efficient.
A short answer is that an array in Java is a reference type, and reference types live on the heap. It's worth noting that in C#, one can switch to unsafe mode and initialise arrays with stackalloc which will create the array on the stack. It's therefore quite probable that the VM would allow you to make an array on the stack, and it's merely an implementation detail that means arrays all live on the heap.
int[]a={10,20,30};
One might expect this array to be stored on the stack, but an array initializer is only shorthand for new int[] {10, 20, 30}, so it is allocated on the heap just like the following array:
`int[] num = new int[2];` // here we create the array object, and objects are always located on the heap
Always treat arrays as Java objects, and do not be misled by a declaration like the first one, which has no explicit new.