I want to improve my understanding of the memory model of programming languages (particularly Java), so I have a question.
Here is some very simple code:
// Allocating memory in heap for SimpleObject's instance
// Creating reference to this object with name so1
SimpleObject so1 = new SimpleObject();
// Allocating memory in heap for array of 10 references to SimpleObject's objects
// Now I know, that array stores only references to the objects
// (Previously I thought that array stores objects)
// Then we create reference to this array with name soArray
SimpleObject[] soArray = new SimpleObject[10];
Now the question:
// What is going on here?
soArray[0] = so1;
// Has the so1 object really been moved into the memory area of soArray,
// and has the so1 reference been updated to the new memory address?
// Or have we just assigned the so1 object's reference to the soArray[0] element?
// Or has the so1 object been copied into soArray[0],
// with the original so1 object then deleted and all links to it updated?
If you know how this works in other languages (such as C, C++ or C#), please answer as well; I will be glad to know.
Everybody knows that ArrayList can be faster than LinkedList because the elements of an array can sit in the CPU cache, whereas with a LinkedList the CPU has to fetch the next object from RAM each time.
So how can that work if I first created the object on the heap and only then put it into the array?
UPD: Thank you, guys. Now I understand how arrays work, but how does caching an array in the CPU cache work in that case?
Arrays store references to objects, not the objects themselves. Assigning to soArray[0] therefore just replaces the reference at position 0. The objects themselves can be moved within the heap, but this is usually due to GC, not assignments.
If the objects themselves were stored directly in the array, you could not have instances of subclasses with more instance fields in your array. They would not fit into the allocated space and would therefore be cut down to instances of the base class. This is what actually happens in C++ (object slicing) when you assign class instances by value, for example ones stored on the stack.
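To illustrate the point about subclasses, here is a minimal Java sketch. The class names Base, Derived and SubclassInArrayDemo are hypothetical and exist only for this example. It shows that an array declared with the base type can hold an instance of a larger subclass, because each slot holds only a reference.

```java
// Hypothetical base class for illustration.
class Base {
    int a = 1;
}

// Hypothetical subclass with an extra field, so its instances need more memory.
class Derived extends Base {
    long extra = 99L;
}

class SubclassInArrayDemo {
    // Stores a Derived in a Base[] and reads the extra field back through the array.
    static long readExtraThroughArray() {
        Base[] items = new Base[2];   // two reference slots, no objects yet
        items[0] = new Base();
        items[1] = new Derived();     // fits, because the slot only holds a reference

        // The object keeps its full runtime type; nothing was cut off.
        if (items[1] instanceof Derived) {
            return ((Derived) items[1]).extra;
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println(readExtraThroughArray()); // prints 99
    }
}
```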
In Java, arrays store references to objects. In C++ parlance they store pointers to objects.
SimpleObject[] soArray = new SimpleObject[10]; //Java
SimpleObject* cppArray[10]; // C++ equivalent
soArray[0] = so1; puts a reference to so1 in soArray[0] in the same way that cppArray[0] = &so1 stores a pointer to so1. The original object remains unchanged; no additional memory is allocated or deallocated.
In C++ you can store the objects directly in an array.
SimpleObject soArray[10]; // An array that stores Simple Objects in place
SimpleObject so1; // A new object
soArray[0] = so1; // This *copies* so1 into soArray[0]
We assign the reference to the object pointed to by so1 to the array element.
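A small sketch makes this observable. SimpleObject here is a hypothetical stand-in for the class in the question (the original does not show its fields), and ReferenceAssignmentDemo is an invented name. After the assignment, the variable and the array slot refer to the same object, so a change made through one is visible through the other.

```java
// Hypothetical stand-in for the SimpleObject class from the question.
class SimpleObject {
    int value;
}

class ReferenceAssignmentDemo {
    // Returns true if the array slot and the variable refer to the same object.
    static boolean sameObjectAfterAssignment() {
        SimpleObject so1 = new SimpleObject();
        SimpleObject[] soArray = new SimpleObject[10];

        soArray[0] = so1;   // copies the reference, not the object
        so1.value = 42;     // mutate through the original variable

        // == on references compares identity; the change is visible via the array.
        return soArray[0] == so1 && soArray[0].value == 42;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAfterAssignment()); // prints true
    }
}
```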
Here's an example using Python Tutor (there's no equivalent tool for Java that I know of, but the memory model is similar, except that in Python the class is itself an object, so ignore that):
Example in C++
Employee emp=new Employee();
Here x bytes are allocated to store the Employee, and y bytes are required to store the reference. Hence x+y bytes are required.
Since there is no garbage collection in C++, it is the programmer's duty to destroy the object.
Employee emp2=emp;
Question 1: Does this allocate another (x+y) bytes for emp2?
In Java
It just points to the object in heap.
Question 2: So does this mean that if the same object is added to an ArrayList in Java, let's say 100 times, the memory used is only for storing the references to the object on the heap, i.e. only 100*y+x bytes will be used?
So does this mean that if the same object is added to an ArrayList in Java, let's say 100 times, the memory used is only for storing the references to the object on the heap, i.e. only 100*y+x bytes will be used? - Yes. Collections store only references to the actual objects (which are almost always on the heap).
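As a side note, the size of a Java reference depends on the JVM. It is 4 bytes on a 32-bit JVM. On a 64-bit HotSpot JVM it is typically also 4 bytes, because compressed object pointers are enabled by default for heaps below roughly 32 GB; otherwise it is 8 bytes.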
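The "100 references, one object" answer can be checked with a short sketch. The class name SharedElementDemo is hypothetical, and StringBuilder is used only because it is a convenient mutable object. It cannot measure bytes, but it does show that every slot in the list refers to the same single object.

```java
import java.util.ArrayList;
import java.util.List;

class SharedElementDemo {
    // Adds one object to a list 100 times and checks every slot refers to it.
    static boolean allSlotsShareOneObject() {
        StringBuilder shared = new StringBuilder("x");
        List<StringBuilder> list = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            list.add(shared);   // stores another reference, not another copy
        }

        shared.append("y");     // one mutation, visible through all 100 slots

        for (StringBuilder sb : list) {
            if (sb != shared || !sb.toString().equals("xy")) {
                return false;
            }
        }
        return list.size() == 100;
    }

    public static void main(String[] args) {
        System.out.println(allSlotsShareOneObject()); // prints true
    }
}
```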
The local variables of function calls live in stack memory; it is a stack because function calls are nested.
This is the memory for variables. In Java, such a variable holds a reference to an object; the object itself is stored on the garbage-collected heap. An array is also an object in Java. Java does not have C-style structs on the stack. This was a historical design decision to keep everything simple, Java being a successor to the "complex" C++.
Now in C++ you have an immediate struct or array on the stack (without new/malloc). You then need a copy constructor, that shovels the data from one space to the other (stack or heap).
The effective difference is that in C one can have a linked list where every node is fat, with the data inside.
In Java, in say a LinkedList<T>, every node (a heap object) contains an additional indirection: a reference to a T data object.
The data may be shared in Java, and are possibly copied in C.
From this perspective you can do your own calculations of memory usage. I feel the need to mention that Java's good garbage collection in general is better than malloc/free of C. So Java certainly is not that bad.
E[] arr = (E[])new Object[INITIAL_ARRAY_LENGTH];
The code above was taken from the following post:
Here E is a generic class type. How does the compiler/JVM know how much memory it needs to allocate when we use type Object to instantiate the array? My understanding is that type casting only changes the reference type, not the underlying object structure.
An array of a given length whose element type is a reference type takes the same amount of memory no matter what type of objects it holds. This is because the array's memory holds only the references (pointers); the memory for the items themselves is allocated when those objects are created. The heap then holds the new objects as they are created and assigned to the array.
So, the following arrays will all take up the same size:
new Integer[10]
new BigInteger[10]
new String[10]
new Object[10]
Note that to the compiler, an array of a non-constrained generic type translates to an array of Object.
Also note that arrays of primitives likely have a different memory footprint.
.....
Again, this is just the memory for the array itself, not the items it references -- and this is a very important point in all of this, probably the most important point for understanding this.
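As a hedged illustration of the point about generics, the sketch below wraps the cast from the question in a small class. The names GenericArrayDemo and runtimeArrayType are hypothetical and not taken from the linked post. At run time the backing array is a plain Object[] of reference slots, whatever E is.

```java
class GenericArrayDemo<E> {
    private final E[] arr;

    @SuppressWarnings("unchecked")
    GenericArrayDemo(int length) {
        // Same pattern as in the question: at run time this is simply an
        // Object[] with `length` reference slots, independent of E.
        arr = (E[]) new Object[length];
    }

    void set(int index, E element) {
        arr[index] = element;   // stores a reference to the element
    }

    E get(int index) {
        return arr[index];
    }

    // Reports the actual runtime type of the backing array.
    String runtimeArrayType() {
        return arr.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        GenericArrayDemo<String> demo = new GenericArrayDemo<>(4);
        demo.set(0, "hello");
        System.out.println(demo.runtimeArrayType()); // prints Object[]
        System.out.println(demo.get(0));             // prints hello
    }
}
```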
Suppose I have the following code in java
Object object = new Object();
mylist.add(object);
mylist2.add(object);
As far as I understand, I have created one object in memory, and both mylist and mylist2 hold some kind of reference to this object. Is that correct?
In that case, how much more memory does the program above use compared to if I had just done
Object object = new Object();
mylist.add(object);
I'm wondering because I sometimes feel it would be useful to have two different data structures holding the same information for different purposes.
Ex:
A binary tree and a hash map such that you can easily search for objects in constant time and easily iterate through an ordered list of the objects.
It depends on what type of list you use. If you use an ArrayList, then there is no overhead for each entry except the reference itself (4 bytes on a 32 bit machine, ignoring the empty space in this kind of list ;)). If you use for example a LinkedList then there is a wrapper object around it, which additionally holds a reference to the previous/next element in the list.
On most VMs the size of a reference is the native pointer size (from Jon Skeet).
So on a 32-bit VM, for example, it will be 4 bytes.
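The use case from the question (a hash map plus an ordered structure over the same data) could be sketched as follows. The class name TwoIndexesDemo and the int[] payload are assumptions made for illustration. Both maps end up holding references to the same value object, so the second structure costs its own bookkeeping plus one reference per entry, not a second copy of the data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TwoIndexesDemo {
    // Puts the same value object into two different structures and checks
    // that both see the same object.
    static boolean bothStructuresShareValues() {
        Map<String, int[]> byHash = new HashMap<>();   // fast lookup by key
        Map<String, int[]> sorted = new TreeMap<>();   // ordered iteration

        int[] payload = {1, 2, 3};
        byHash.put("k", payload);
        sorted.put("k", payload);   // second reference to the same array

        payload[0] = 100;           // change is visible through both maps

        return byHash.get("k") == sorted.get("k") && sorted.get("k")[0] == 100;
    }

    public static void main(String[] args) {
        System.out.println(bothStructuresShareValues()); // prints true
    }
}
```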
I'm writing an array-backed hashtable in Java, where the type of key and value are Object; no other guarantee.
The easiest way for me code-wise is to create an object to hold them:
public class Pair {
public Object key;
public Object value;
}
And then create an array
public Pair[] storage = new Pair[8];
But how does the jvm treat that in memory? Which is to say, will the array actually:
be an array of pointers to Pair() objects sitting elsewhere, or
contain the actual data?
edit
Since the objects are instantiated later with new Pair(), they're placed at arbitrary locations in the heap. Is there any good way to ensure they're sequential in the heap? Would I need to do some trickery with sun.misc.Unsafe to make that work?
To explain my motivation: I want to try to ensure that sequential items are in the same page of memory. Is there any way to do this in Java?
The array will be an object on the heap containing pointers to the Pair objects which will also be on the heap (but separate from the array itself).
No, the storage array will only contain pointers to the actual Pair objects existing somewhere else on the heap. Yet, remember to instantiate 8 Pair objects and make each element of the array point to these objects. You need to have something like this after the code that you have written:
for (int i = 0; i < storage.length; i++)
    storage[i] = new Pair();
Only then will the Pair objects be created and correctly referred to by the storage array.
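A runnable sketch of the same idea, reusing the Pair class from the question; the wrapper class PairStorageDemo and its helper methods are hypothetical. It shows that new Pair[8] creates only eight null reference slots, and that the Pair objects exist only after they are instantiated one by one.

```java
// Same shape as the Pair class in the question.
class Pair {
    public Object key;
    public Object value;
}

class PairStorageDemo {
    // Counts how many slots currently refer to a Pair object.
    static int countNonNull(Pair[] storage) {
        int count = 0;
        for (Pair p : storage) {
            if (p != null) {
                count++;
            }
        }
        return count;
    }

    // Creates one Pair per slot, as in the loop above.
    static void fill(Pair[] storage) {
        for (int i = 0; i < storage.length; i++) {
            storage[i] = new Pair();
        }
    }

    public static void main(String[] args) {
        Pair[] storage = new Pair[8];               // eight null references
        System.out.println(countNonNull(storage));  // prints 0
        fill(storage);
        System.out.println(countNonNull(storage));  // prints 8
    }
}
```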
Programming languages like C and C++ can keep array values on the stack rather than on the heap. Why is it necessary in Java to keep array values on the heap?
In Java, arrays (just like all other objects) are passed around by reference: when you pass an array to a method, the method gets a reference pointing to the same location in memory, and no copy is made. This means that the array may need to remain "alive" after the method that created it has returned, and so cannot be stored in the stack frame of that method. It needs to be managed by the garbage collector, just like all other objects.
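A brief sketch of both effects described here, using hypothetical names (ArrayPassingDemo, makeArray, zeroFirst): an array created inside a method is still usable after that method returns, and a method that receives the array modifies the caller's array rather than a copy.

```java
class ArrayPassingDemo {
    // The array is created here but outlives this method's stack frame.
    static int[] makeArray() {
        int[] local = {1, 2, 3};
        return local;   // returns the reference; the array stays on the heap
    }

    // Receives a reference to the caller's array, not a copy of it.
    static void zeroFirst(int[] data) {
        data[0] = 0;
    }

    public static void main(String[] args) {
        int[] numbers = makeArray();
        zeroFirst(numbers);
        System.out.println(numbers[0]); // prints 0
        System.out.println(numbers[1]); // prints 2
    }
}
```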
There is some research going into optimizing JVM memory allocation using "escape analysis": if an object (such as an array) can be guaranteed never to leave the current scope, it becomes possible to allocate it on the stack instead, which is more efficient.
A short answer is that an array in Java is a reference type, and reference types live on the heap. It's worth noting that in C#, one can switch to unsafe mode and initialise arrays with stackalloc which will create the array on the stack. It's therefore quite probable that the VM would allow you to make an array on the stack, and it's merely an implementation detail that means arrays all live on the heap.
int[] a = {10, 20, 30};
This array is not stored on the stack. The initializer is shorthand for new int[]{10, 20, 30}, so it creates an array object on the heap, just like the following:
`int[] num = new int[2];` // here we build the array object; objects are located on the heap
Always treat arrays as Java objects, so do not be confused by the fact that the first declaration has no new: only the reference variable (a or num) lives on the stack.
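Java code cannot directly observe whether memory is on the stack or the heap, so the sketch below (with the hypothetical names ArrayFormsDemo and describe) only shows the part that can be checked: arrays created with an initializer and arrays created with new are the same kind of object with the same runtime class.

```java
class ArrayFormsDemo {
    // Treats the array as a plain Object and reports its runtime class and length.
    static String describe(int[] array) {
        Object asObject = array;   // legal because every array is an Object
        return asObject.getClass().getSimpleName() + " of length " + array.length;
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30};    // initializer form
        int[] num = new int[2];    // explicit new form

        System.out.println(describe(a));   // prints int[] of length 3
        System.out.println(describe(num)); // prints int[] of length 2
    }
}
```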