Suppose I have the following code in Java:
Object object = new Object();
mylist.add(object);
mylist2.add(object);
As far as I understand, I have created one object in memory, and both mylist and mylist2 hold some kind of reference to this object. Is that correct?
If so, how much more memory does the program above use compared to if I had just done this:
Object object = new Object();
mylist.add(object);
I'm wondering because I sometimes feel it would be useful to have two different data structures holding the same information for different purposes.
Example: a binary tree and a hash map, so that you can look up objects in constant time and also iterate through them in sorted order.
It depends on what type of list you use. If you use an ArrayList, there is no overhead for each entry except the reference itself (4 bytes on a 32-bit machine, ignoring the unused capacity in this kind of list ;)). If you use, for example, a LinkedList, then each element is wrapped in a node object, which additionally holds references to the previous and next elements in the list.
On most VMs the size of a reference is the native pointer size (from Jon Skeet).
So on a 32-bit VM, for example, it will be 4 bytes.
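To make the "one object, two references" point concrete, here is a minimal sketch. The class and method names are made up for illustration, and a StringBuilder stands in for the object only because it is mutable, so a change made through one reference can be observed through the others.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedReferenceDemo {
    /** Adds one object to two lists and reports whether both lists see the same instance. */
    static boolean bothListsShareOneObject() {
        StringBuilder shared = new StringBuilder("a"); // one object on the heap
        List<StringBuilder> myList = new ArrayList<>();
        List<StringBuilder> myList2 = new ArrayList<>();
        myList.add(shared);  // stores a reference, not a copy
        myList2.add(shared); // stores a second reference to the same object

        shared.append("b");  // mutate the object through the original variable

        // Identity check (==) plus a check that the mutation is visible through both lists.
        return myList.get(0) == myList2.get(0)
                && myList.get(0).toString().equals("ab")
                && myList2.get(0).toString().equals("ab");
    }

    public static void main(String[] args) {
        System.out.println(bothListsShareOneObject()); // prints true
    }
}
```

So the extra cost of the second add is one more reference slot in the second list (plus a node object if that list is a LinkedList), not a second copy of the object.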
Related
I was reading about data locality and want to use it to improve my game engine that I'm writing.
Let's say that I have created five objects at different times, and they are now all in different places in memory, not next to each other. If I add them all to an array, will that array only hold pointers to those objects, leaving them in the same place in memory, or will adding them all to an array rearrange them and make them contiguous?
I ask this because I thought that using arrays would be a good way to make them contiguous, but I don't know if an array will fix my problem!
tl;dr
Manipulating an array of references to objects has no effect on the objects, and has no effect on the objects’ location in memory.
Objects
An array of objects is really an array of references (pointers) to objects. A pointer is an address to another location in memory.
We speak of the array as holding objects, but that is not technically accurate. Because Java does not expose pointers themselves to us as programmers, we are generally unaware of their presence. When we access an element in the array, we are actually retrieving a pointer, but Java immediately follows that pointer to locate the object elsewhere in memory.
This automatic look-up, following the pointer to the object, makes the array of pointers feel like an array of objects. The Java programmer thinks of her array as holding her objects when in reality the objects are a hop-skip-and-a-jump away.
Arrays in Java are implemented as contiguous blocks of memory. For an array of objects, the pointers to those objects are being stored in contiguous memory. But when we access the elements, we are jumping to another location in memory to access the actual object that we want.
A Java array has a fixed length once it is created, so it cannot be extended in place. To make room for more elements, a new, larger array must be built elsewhere in memory, with all the pointers being copied over to the new array and the original array then being discarded.
Such a new-array-and-copy-over is “expensive”. When feasible, we want to avoid this operation. If you know the likely maximum size of your array, specify that size when creating the array. The entire block of contiguous memory is claimed immediately, with every element null until you later assign a reference to it.
Inserting into the middle of an array is also expensive. Either a new array is built and elements copied over, or all the elements after the insertion point must be moved down into their neighboring position.
None of these operations to the array affect the objects. The objects are floating around in the ether of memory. The objects know nothing of the array. Operations on the array do not affect the objects nor their position in memory. The only relationship is that if the reference held in the array is the last reference still pointing to the object, then when that array element is cleared or deleted, the object becomes a candidate for garbage-collection.
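A small sketch of the new-array-and-copy-over described above, using the standard Arrays.copyOf. The class and method names are hypothetical. It only shows that the references are copied while the objects themselves are left alone; it says nothing about where the JVM places them.

```java
import java.util.Arrays;

final class ArrayCopyIdentityDemo {
    /** Grows an array by copying it and reports whether the element objects are untouched. */
    static boolean copyKeepsSameObjects() {
        Object first = new Object();
        Object second = new Object();
        Object[] original = { first, second };

        // "Growing" an array: allocate a bigger one and copy the references across.
        Object[] bigger = Arrays.copyOf(original, 4);

        // Only references were copied; the two extra slots hold null.
        return bigger != original
                && bigger[0] == first
                && bigger[1] == second
                && bigger[2] == null
                && bigger[3] == null;
    }

    public static void main(String[] args) {
        System.out.println(copyKeepsSameObjects()); // prints true
    }
}
```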
Primitives
In Java, the eight primitive types (byte, short, int, long, float, double, boolean, and char) are not objects and have no class of their own; they sit outside the object-oriented part of the language. One advantage is that they are fast and take little memory compared to objects.
An array of primitives holds the values within the array itself. So these values are stored next to one another, contiguous in memory. No references/pointers. No jumping around in memory.
As for adding or inserting, the same behavior discussed above applies. Except that instead of pointers being shuffled around, the actual primitive values are being shuffled around.
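For primitives, inserting into the middle looks roughly like the sketch below. The insertAt helper is a made-up name, not a library method. It builds a new array and uses System.arraycopy to move the actual int values, so no pointers are involved at any step.

```java
import java.util.Arrays;

final class PrimitiveInsertDemo {
    /** Returns a new array with value inserted at index; later values shift right by one. */
    static int[] insertAt(int[] source, int index, int value) {
        int[] result = new int[source.length + 1];
        // Copy the values that sit before the insertion point.
        System.arraycopy(source, 0, result, 0, index);
        result[index] = value;
        // Copy the remaining values one position further along.
        System.arraycopy(source, index, result, index + 1, source.length - index);
        return result;
    }

    public static void main(String[] args) {
        int[] grown = insertAt(new int[] {1, 2, 4}, 2, 3);
        System.out.println(Arrays.toString(grown)); // prints [1, 2, 3, 4]
    }
}
```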
Tips
In business apps, it is generally best to use objects.
That means using the wrapper classes instead of primitives. For example, Integer instead of int. The auto-boxing facility in Java makes this easier by automatically converting between primitive values and their object wrapper.
And preferring objects means using a Collection instead of arrays, usually a List, specifically an ArrayList. Or for immutable use, a List implementation returned from the List.of method (Java 9 and later).
In contrast to business apps, in extreme situations where speed and memory usage are paramount, such as your game engine, make the most of arrays and primitives.
In the future, the distinction between objects and primitives may blur if the work done in Project Valhalla comes to fruition.
The data, or values, are stored in the objects, and those values are retrieved through references to the objects. One more thing to be clear about: arrays in Java are themselves objects. So there is no doubt that objects store the values and that they are accessed through a reference variable pointing to that particular object. Hope that helps.
Java deals with references to objects only. As such, there's no guarantee that the elements of an array will be contiguous in memory.
Edit: Guess this answer wasn't that clear. My bad. I meant that there's no guarantee that the objects themselves will be contiguous, in spite of the fact that the references will be, as 1-D arrays are stored contiguously. Still, Basil Bourque's answer perfectly explains how this works.
I am trying to figure out which of the two requires more memory.
Following this reference, I now see that a null costs 4 to 8 bytes: "Java - Does null variable require space in memory".
But I have no idea how much an empty ArrayList of String would cost. Does it cost the same in terms of memory?
Any idea about it?
If, by empty list, you mean:
List<Object> empty = new ArrayList<> ();
Then that will take a lot more space than using null.
But if you plan to use:
List<Object> empty = Collections.emptyList(); // I'm a singleton
Then because all your empty lists will refer to the same object, you will end up with the same memory consumption as if you were using null.
You should use
Collections.emptyList()
Internally this points to a public static final List EMPTY_LIST = new EmptyList<>(); so it won't consume any extra memory. It gives you an empty list just as creating one yourself would, but without the overhead.
Explicitly constructing a new empty list will use more memory. Most (if not all) implementations of List carry some bookkeeping overhead (size, head, tail, etc.). For example, LinkedList has a head and a tail reference, which use up some more memory when it is instantiated.
Ultimately null will only use 4-8 bytes depending on whether you're running a 32 or 64 bit platform.
As mentioned in other answers, if you only ever need an empty list you should use Collections.emptyList(). However, if you need a mutable list you won't be able to use this, as the instance you get back is immutable.
To answer the initial question about ArrayList, the implementation holds little beyond the backing array of elements and a size field. So although it won't use as little space as null, the overhead is minuscule.
TL;DR: null will only use the space needed to store a 32- or 64-bit pointer; an actual instantiation of a list implementation will consume more memory because of the state held by the implementation.
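The immutability caveat is easy to trip over, so here is a short sketch of it. The class and method names are hypothetical; Collections.emptyList() and UnsupportedOperationException are the standard library ones. Whether every call returns the very same instance is an implementation detail, so the sketch only relies on the documented immutability.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class EmptyListDemo {
    /** Returns true if adding to Collections.emptyList() is rejected. */
    static boolean sharedEmptyListIsImmutable() {
        List<String> empty = Collections.emptyList();
        try {
            empty.add("x"); // the shared empty list cannot be modified
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    /** Returns a fresh, mutable empty list for callers that need to add elements later. */
    static List<String> mutableEmpty() {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        System.out.println(sharedEmptyListIsImmutable()); // prints true
    }
}
```

If the list might be filled later, paying for a new ArrayList is the price of mutability.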
The reference takes the same memory (4 bytes on 32-bit systems or 8 bytes on 64-bit systems).
The "object" takes more space in case of an empty list, because null itself does not consume any space.
This was an interview question: a List contains 5 objects such as Organisation, Employee, Address, Country, etc. How would you find out which object is the heaviest without running through a Java agent? One condition is that none of the objects inside the ArrayList are serializable. Basically, the interviewer wants to know how to write code that determines the size of the objects inside the ArrayList, so that you can tell which particular object is heavier. Please help. Let me state the conditions once again:
You cannot use any profiler tool.
None of the objects are serializable.
You cannot run through a Java agent.
You have to write code to test and run as a normal Java program.
You can use the Instrumentation interface.
http://www.javapractices.com/topic/TopicAction.do?Id=83
You can't practically do this. Bear in mind that Java deals in references, and your list will simply contain a reference to the given object. Consider:
MyBigObject obj = new MyBigObject();
List<MyBigObject> list1 = new ArrayList<MyBigObject>();
list1.add(obj);
So your list contains a reference to your object. Now if I do this:
List<MyBigObject> list2 = new ArrayList<MyBigObject>();
list2.add(obj);
my second list contains a reference to the same object. To say that list2 actually is the size of the 'contained' object is meaningless.
When you construct objects, they consist of primitives and references. You can account for the size of the primitives (since they're copied by value) but you can't do this for the referenced objects, since the fields are simply pointers to other objects. You can say an object is a certain size and made up of references (which may be 32 or 64 bit), but that's another matter.
You can see how much space is needed to allocate an object by passing -XX:-UseTLAB on the command line and using this method:
public static long memoryUsed() {
return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
}
long before = memoryUsed();
new Object();
long used = memoryUsed() - before; // 16 bytes.
You can also use reflection to scan through the fields of each object. You can use Unsafe to get the offset of each of the fields and estimate the end of the object (including object alignment).
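The reflection idea can be sketched roughly as follows. This is only an estimate under loudly stated assumptions: the header size and reference size are passed in by the caller because they vary by JVM and settings, objects are assumed to be padded to 8 bytes, and field packing is ignored. It does not use Unsafe, and ShallowSizeEstimator and Sample are hypothetical names. It counts the shallow size only, so whatever a reference field points to is not included.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class ShallowSizeEstimator {
    /**
     * Rough shallow-size estimate for instances of a class.
     * headerBytes and referenceBytes are ASSUMED values chosen by the caller.
     */
    static long estimate(Class<?> type, int headerBytes, int referenceBytes) {
        long size = headerBytes;
        // Walk the class and all of its superclasses to see every instance field.
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue; // static fields are not stored per instance
                }
                size += fieldBytes(f.getType(), referenceBytes);
            }
        }
        return (size + 7) / 8 * 8; // ASSUMPTION: objects are aligned to 8 bytes
    }

    private static int fieldBytes(Class<?> t, int referenceBytes) {
        if (t == long.class || t == double.class) return 8;
        if (t == int.class || t == float.class) return 4;
        if (t == short.class || t == char.class) return 2;
        if (t == byte.class || t == boolean.class) return 1;
        return referenceBytes; // any non-primitive field is just a reference
    }

    /** Hypothetical sample class used only for illustration. */
    static class Sample {
        int id;
        long timestamp;
        String name;
    }
}
```

For example, estimate(Sample.class, 12, 4) adds 12 + 4 + 8 + 4 = 28 and rounds up to 32. Comparing such estimates across the objects in the list gives a relative ranking of their shallow sizes, not exact figures.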
I'm writing an array-backed hashtable in Java, where the type of key and value are Object; no other guarantee.
The easiest way for me code-wise is to create an object to hold them:
public class Pair {
public Object key;
public Object value;
}
And then create an array
public Pair[] storage = new Pair[8];
But how does the jvm treat that in memory? Which is to say, will the array actually:
be an array of pointers to Pair() objects sitting elsewhere, or
contain the actual data?
edit
Since the objects are instantiated later as new Pair(), they're placed wherever the allocator puts them in the heap. Is there any good way to ensure they're sequential in the heap? Would I need to do some trickery with sun.misc.Unsafe to make that work?
Explaining my motivation, if I want to try and ensure that sequential items are in the same page of memory, is there any way to do this in Java?
The array will be an object on the heap containing pointers to the Pair objects which will also be on the heap (but separate from the array itself).
No, the storage array will only contain pointers to the actual Pair objects existing somewhere else on the heap. Yet, remember to instantiate 8 Pair objects and make each element of the array point to these objects. You need to have something like this after the code that you have written:
for (int i = 0; i < storage.length; i++) {
    storage[i] = new Pair();
}
Only then will the Pair objects be created and correctly referred to by the storage array.
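A quick way to see this for yourself is the sketch below. It uses a simplified copy of the Pair class from the question, and the countNullSlots helper is a made-up name. A freshly created Pair[8] holds eight null references until each slot is assigned.

```java
final class PairArrayDemo {
    /** Simplified stand-in for the Pair class from the question. */
    static class Pair {
        Object key;
        Object value;
    }

    /** Counts how many slots of the array currently reference no object. */
    static int countNullSlots(Pair[] storage) {
        int nulls = 0;
        for (Pair p : storage) {
            if (p == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        Pair[] storage = new Pair[8];                // eight reference slots, no Pair objects yet
        System.out.println(countNullSlots(storage)); // prints 8
        storage[0] = new Pair();                     // now one slot points at a heap object
        System.out.println(countNullSlots(storage)); // prints 7
    }
}
```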
In Java, we can always use an array to store object references. Then we have ArrayList or Hashtable, which expand automatically to store objects. But does anyone know a native way to have an auto-expandable array of object references?
Edit: What I mean is I want to know if the Java API has some class with the ability to store references to objects (but not storing the actual object like XXXList or Hashtable do) AND the ability of auto-expansion.
Java arrays are, by their definition, fixed size. If you need auto-growth, you use XXXList classes.
EDIT - question has been clarified a bit
When I was first starting to learn Java (coming from a C and C++ background), this was probably one of the first things that tripped me up. Hopefully I can shed some light.
Unlike C++, Object arrays in Java do not store objects. They store object references.
In C++, if you declared something similar to:
String myStrings[10];
You would get 10 String objects. At this point, it would be perfectly legal to do something like println(myStrings[5].length); - you'd get '0' - the default constructor for String creates an empty string with length 0.
In Java, when you construct a new array, you get an empty container that can hold 10 String references. So the call:
String[] myStrings = new String[10];
System.out.println(myStrings[5].length());
would throw a NullPointerException, because you haven't actually placed a String reference into the array yet.
If you are coming from a C++ background, think of new String[10] as being equivalent to new String*[10] from C++.
So, with that in mind, it should be fairly clear why ArrayList is the solution for an auto expanding array of objects (and in fact, ArrayList is implemented using simple arrays, with a growth algorithm built in that allocates new expanded arrays as needed and copies the content from the old to the new).
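To illustrate the allocate-and-copy growth described above, here is a deliberately simplified sketch. It is not the real ArrayList source. GrowableArray is a hypothetical name, and the initial capacity of 4 and the doubling factor are arbitrary choices made for the example.

```java
import java.util.Arrays;

final class GrowableArray<T> {
    private Object[] elements = new Object[4]; // ASSUMPTION: arbitrary initial capacity
    private int size;

    /** Appends a reference, growing the backing array first if it is full. */
    void add(T item) {
        if (size == elements.length) {
            // Full: allocate a larger array and copy the references over.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return (T) elements[index];
    }

    int size() {
        return size;
    }

    /** Exposed only so the growth step can be observed. */
    int capacity() {
        return elements.length;
    }
}
```

Adding a fifth element to a new GrowableArray triggers one copy and leaves the capacity at 8. In both the old and the new backing array, only references are stored.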
In practice, there are actually relatively few situations where we use arrays. If you are writing a container (something akin to ArrayList, or a BTree), then they are useful, or if you are doing a lot of low level byte manipulation - but at the level that most development occurs, using one of the Collections classes is by far the preferred technique.
All the classes implementing Collection are expandable and store only references: you don't store objects in them. You create the objects somewhere on the heap and only manipulate references to them, until no reference to them remains and they become eligible for garbage collection.
You can put a reference to an object in two or more Collections. That's how you can have sorted hash tables and such...
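As a sketch of that idea, the hypothetical DualIndex class below keeps the same value references in a HashMap for fast lookup and in a TreeMap for ordered iteration. String keys are an assumption made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class DualIndex<V> {
    private final Map<String, V> byKey = new HashMap<>();   // fast lookup by key
    private final Map<String, V> inOrder = new TreeMap<>(); // iteration in sorted key order

    /** Stores the same reference in both structures; the value is never copied. */
    void put(String key, V value) {
        byKey.put(key, value);
        inOrder.put(key, value);
    }

    V lookup(String key) {
        return byKey.get(key);
    }

    Iterable<String> keysInOrder() {
        return inOrder.keySet();
    }

    /** True when both maps hold a reference to the very same object for this key. */
    boolean sameInstance(String key) {
        return byKey.get(key) == inOrder.get(key);
    }
}
```

The cost of the second structure is its own bookkeeping plus one extra reference per entry. The trade-off is that both maps must be updated together on every insert and removal.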
What do you mean by "native" way? If you want an expandable list of objects then you can use the ArrayList. With List collections you have the get(index) method that allows you to access objects in the list by index, which gives you similar functionality to an array. Internally the ArrayList is implemented with an array, and the ArrayList handles expanding it automatically for you.
Straight from the Array Java Tutorials on the sun webpage:
-> An array is a container object that holds a fixed number of values of a single type.
Because the size of the array is declared when it is created, there is actually no way to expand it afterwards. The whole purpose of declaring an array of a certain size is to only allocate as much memory as will likely be used when the program is executed. What you could do is declare a second array that is a function based on the size of the original, copy all of the original elements into it, and then add the necessary new elements (although this isn't very 'automatic' :) ). Otherwise, as you and a few others have mentioned, the List Collections is the most efficient way to go.
In Java, all object variables are references. So
Foo myFoo = new Foo();
Foo anotherFoo = myFoo;
means that both variables are referring to the same object, not to two separate copies. Likewise, when you put an object in a Collection, you are only storing a reference to the object. Therefore using ArrayList or similar is the correct way to have an automatically expanding piece of storage.
As far as I'm aware, there's no first-class language construct that does that, if that's what you're looking for.
It's not very efficient, but if you're just appending to an array, you can use Apache Commons ArrayUtils.add(). It returns a copy of the original array with the additional element in it.
If you can write your code in JavaScript, yes, you can do that. JavaScript arrays are sparse arrays, so they will expand whichever way you want.
You can write:
a[0] = 4;
a[1000] = 434;
a[888] = "a string";