In Java, does it cost memory to declare a class level instance variable without initializing it?
For example: Does int i; use any memory if I don't initialize it with i = 5;?
Details:
I have a huge super-class that many different (not different enough to have their own super classes) sub-classes extend. Some sub-classes don't use every single primitive declared by the super-class. Can I simply keep such primitives as uninitialized and only initialize them in necessary sub-classes to save memory?
All members defined in your classes have default values, even if you don't initialize them explicitly, so they do use memory.
For example, every int will be initialized by default to 0, and will occupy 4 bytes.
For class members :
int i;
is the same as :
int i = 0;
Here's what the JLS says about instance variables :
If a class T has a field a that is an instance variable, then a new instance variable a is created and initialized to a default value (§4.12.5) as part of each newly created object of class T or of any class that is a subclass of T (§8.1.4). The instance variable effectively ceases to exist when the object of which it is a field is no longer referenced, after any necessary finalization of the object (§12.6) has been completed.
Yes, memory allocates though you are not assigning any value to it.
int i;
That takes 32 bit memory (allocation). No matter you are using it or not.
Some sub-classes don't use every single primitive declared by the super-Class. Can I simply keep such primitives as uninitialized and only initialize them in necessary sub-classes to save memory?
Again, no matter where you initialized, the memory allocates.
Only thing you need to take care is, just find the unused primitives and remove them.
Edit:
Adding one more point that unlike primitive's references by default value is null, which carries a a memory of
4 bytes(32-bit)
8 bytes on (64-bit)
The original question talks about class level variables and the answer is that they do use space, but it's interesting to look at method scoped ones too.
Let's take a small example:
public class MemTest {
public void doSomething() {
long i = 0; // Line 3
if(System.currentTimeMillis() > 0) {
i = System.currentTimeMillis();
System.out.println(i);
}
System.out.println(i);
}
}
If we look at the bytecode generated:
L0
LINENUMBER 3 L0
LCONST_0
LSTORE 1
Ok, as expected we assign a value at line 3 in the code, now if we change line 3 to (and remove the second println due to a compiler error):
long i; // Line 3
... and check the bytecode then nothing is generated for line 3. So, the answer is that no memory is used at this point. In fact, the LSTORE occurs only on line 5 when we assign to the variable. So, declaring an unassigned method variable does not use any memory and, in fact, doesn't generate any bytecode. It's equivalent to making the declaration where you first assign to it.
Yes. In your class level variables will assign to its default value even if you don't initialize them.
In this case you int variables will assign to 0 and will occupied 4 bytes per each.
Neither the Java Language Specification nor the Java Virtual Machine Specification specifies the answer to this because it's an implementation detail. In fact, JVMS §2.7 specifically says:
Representation of Objects
The Java Virtual Machine does not mandate any particular internal structure for objects.
In theory, a conformant virtual machine could implement objects which have a lot of fields using set of bit flags to mark which fields have been set to non-default values. Initially no fields would be allocated, the flag bits would be all 0, and the object would be small. When a field is first set, the corresponding flag bit would be set to 1 and the object would be resized to make space for it. [The garbage collector already provides the necessary machinery for momentarily pausing running code in order to relocate live objects around the heap, which would be necessary for resizing them.]
In practice, this is not a good idea because even if it saves memory it is complicated and slow. Access to fields would require temporarily locking the object to prevent corruption due to multithreading; then reading the current flag bits; and if the field exists then counting the set bits to calculate the current offset of the wanted field relative to the base of the object; then reading the field; and finally unlocking the object.
So, no general-purpose Java virtual machine does anything like this. Some objects with an exorbitant number of fields might benefit from it, but even they couldn't rely on it, because they might need to run on the common virtual machines which don't do that.
A flat layout which allocates space for all fields when an object is first instantiated is simple and fast, so that is the standard. Programmers assume that objects are allocated that way and thus design their programs accordingly to best take advantage of it. Likewise, virtual machine designers optimize to make that usage fast.
Ultimately the flat layout of fields is a convention, not a rule, although you can rely on it anyway.
In Java, when you declare a class attribute such as String str;, you are declaring a reference to an object, but it is not pointing yet to any object unless you affect a value to it str=value;. But as you may guess, the reference, even without pointing to a memory place, consumes itself some memory.
Related
My teacher gave me a question:
"What occurs when objects are created in Java".
To the best of my knowledge, memory allocation, variable initialization and constructor method invocation happen when an object is created.
But my teacher said that I was almost right. The 2 later things are right, except memory heap. Instead, he said the memory allocation occurs. I think that object is stored in heap, so my teacher is wrong. Do you think so?
As always, the best place to find a solution for these kinds of questions is in the Java Language Specification.
Specifically, from the section on new instance creation, it can be understood that this is the sequence when a new object is created, as long as no exceptions occur:
Memory is allocated.
Fields are initialized to their default values.
The "first line" of the chosen constructor is invoked, unless it's an Object. By first line I mean either explicit call to super() or this(), or an implicit call to super().
The instance initializer is executed and the fields are initialized to their requested values (actually field initialization is usually compiled as an inline part of the instance initializer).
The rest of the constructor code is executed.
Now, it is possible that your teacher is talking about memory allocation as an actual operating system call - and in that case he's right in the sense that the JVM manages its own heap and thus a Java memory allocation does not necessarily translate to an OS memory allocation call (although it may).
I'll answer that using a simple example.
Say you have a class Car. Now you write:
Car car;
car = new Car();
The first statement creates a reference with car in the stack.
In the second statement, the Car class will be loaded to the main memory, then it will allocate memory for the members of Car in the heap. When this happens, the members will be initialized with values provided by the JVM.
While the JVM is running the program, whenever a new object is created, the JVM reserves as portion of the Heap for that object (where the object will be stored). The amount of Heap that gets reserved is based on the size of the object.
The JVM maps out this segment in the Heap to represent all of the attributes of the object being stored. A reference (address in Heap) to the object is kept by the JVM and stored in a table that allows the JVM to keep track of all the objects that have been allocated on the Heap. The JVM uses these references to access the objects later (when the program accesses the object).
On top of what other people have said, if this is the first use of the object then its Class must be initialised -as described in the JLS (the section before the one on new instance creation!).
This basically involves loading into memory the necessary information about the class i.e. creating a Klass object for the static variables and method table to live. This may also involve loading super classes and interfaces. This is all carried out by the ClassLoader.
When object is created in java then these 6 step will be happens one by one---
1.JVM allocates 8 bytes of memory for the reference variable & assign default value as null.
JVM will verify whether class loading is done or not,if class is already loaded then it will ignore or else it will perform class loading.
At the time of class loading ,if any static variable are there then it will allocating memory.
By using new operator,object memory will e created inside heap memory.
At the time of object creation,if any instance variables are there then those will allocate memory inside object Memory.
It will assign object memory address to the reference variable which is created first.
There are cases when one needs a memory efficient to store lots of objects. To do that in Java you are forced to use several primitive arrays (see below why) or a big byte array which produces a bit CPU overhead for converting.
Example: you have a class Point { float x; float y;}. Now you want to store N points in an array which would take at least N * 8 bytes for the floats and N * 4 bytes for the reference on a 32bit JVM. So at least 1/3 is garbage (not counting in the normal object overhead here). But if you would store this in two float arrays all would be fine.
My question: Why does Java not optimize the memory usage for arrays of references? I mean why not directly embed the object in the array like it is done in C++?
E.g. marking the class Point final should be sufficient for the JVM to see the maximum length of the data for the Point class. Or where would this be against the specification? Also this would save a lot of memory when handling large n-dimensional matrices etc
Update:
I would like to know wether the JVM could theoretically optimize it (e.g. behind the scene) and under which conditions - not wether I can force the JVM somehow. I think the second point of the conclusion is the reason it cannot be done easily if at all.
Conclusions what the JVM would need to know:
The class needs to be final to let the JVM guess the length of one array entry
The array needs to be read only. Of course you can change the values like Point p = arr[i]; p.setX(i) but you cannot write to the array via inlineArr[i] = new Point(). Or the JVM would have to introduce copy semantics which would be against the "Java way". See aroth's answer
How to initialize the array (calling default constructor or leaving the members intialized to their default values)
Java doesn't provide a way to do this because it's not a language-level choice to make. C, C++, and the like expose ways to do this because they are system-level programming languages where you are expected to know system-level features and make decisions based on the specific architecture that you are using.
In Java, you are targeting the JVM. The JVM doesn't specify whether or not this is permissible (I'm making an assumption that this is true; I haven't combed the JLS thoroughly to prove that I'm right here). The idea is that when you write Java code, you trust the JIT to make intelligent decisions. That is where the reference types could be folded into an array or the like. So the "Java way" here would be that you cannot specify if it happens or not, but if the JIT can make that optimization and improve performance it could and should.
I am not sure whether this optimization in particular is implemented, but I do know that similar ones are: for example, objects allocated with new are conceptually on the "heap", but if the JVM notices (through a technique called escape analysis) that the object is method-local it can allocate the fields of the object on the stack or even directly in CPU registers, removing the "heap allocation" overhead entirely with no language change.
Update for updated question
If the question is "can this be done at all", I think the answer is yes. There are a few corner cases (such as null pointers) but you should be able to work around them. For null references, the JVM could convince itself that there will never be null elements, or keep a bit vector as mentioned previously. Both of these techniques would likely be predicated on escape analysis showing that the array reference never leaves the method, as I can see the bookkeeping becoming tricky if you try to e.g. store it in an object field.
The scenario you describe might save on memory (though in practice I'm not sure it would even do that), but it probably would add a fair bit of computational overhead when actually placing an object into an array. Consider that when you do new Point() the object you create is dynamically allocated on the heap. So if you allocate 100 Point instances by calling new Point() there is no guarantee that their locations will be contiguous in memory (and in fact they will most likely not be allocated to a contiguous block of memory).
So how would a Point instance actually make it into the "compressed" array? It seems to me that Java would have to explicitly copy every field in Point into the contiguous block of memory that was allocated for the array. That could become costly for object types that have many fields. Not only that, but the original Point instance is still taking up space on the heap, as well as inside of the array. So unless it gets immediately garbage-collected (I suppose any references could be rewritten to point at the copy that was placed in the array, thereby theoretically allowing immediate garbage-collection of the original instance) you're actually using more storage than you would be if you had just stored the reference in the array.
Moreover, what if you have multiple "compressed" arrays and a mutable object type? Inserting an object into an array necessarily copies that object's fields into the array. So if you do something like:
Point p = new Point(0, 0);
Point[] compressedA = {p}; //assuming 'p' is "optimally" stored as {0,0}
Point[] compressedB = {p}; //assuming 'p' is "optimally" stored as {0,0}
compressedA[0].setX(5)
compressedB[0].setX(1)
System.out.println(p.x);
System.out.println(compressedA[0].x);
System.out.println(compressedB[0].x);
...you would get:
0
5
1
...even though logically there should only be a single instance of Point. Storing references avoids this kind of problem, and also means that in any case where a nontrivial object is being shared between multiple arrays your total storage usage is probably lower than it would be if each array stored a copy of all of that object's fields.
Isn't this tantamount to providing trivial classes such as the following?
class Fixed {
float hiddenArr[];
Point pointArray(int position) {
return new Point(hiddenArr[position*2], hiddenArr[position*2+1]);
}
}
Also, it's possible to implement this without making the programmer explicitly state that they'd like it; the JVM is already aware of "value types" (POD types in C++); ones with only other plain-old-data types inside them. I believe HotSpot uses this information during stack elision, no reason it couldn't do it for arrays too?
I have a basic doubt regarding object creation in java.Suppose i have two classes as follows
Class B{
public int value=100;
}
Class A{
public B getB(){
return new B();
}
public void accessValue(){
//accessing the value without storing object B
System.out.println("value is :"+getB().value);
//accessing the value by storing object B in variable b
B b=getB();
System.out.println("value is :"+b.value);
}
}
My question is,does storing the object and accessing the value make any difference in terms of memory or both are same?
They are both equivalent, since you are instantiating B both times. The first way is just a shorter version of the second.
Following piece of code is using an anonymous object. which can't be reused later in code.
//accessing the value without storing object B
System.out.println("value is :"+getB().value);
Below code uses the object by assigning it to a reference.
//accessing the value by storing object B in variable b
B b=getB();
System.out.println("value is :"+b.value);
Memory and performance wise it's NOT much difference except that in later version stack frame has an extra pointer.
It is the same. This way: B b=getB(); just keeps your code more readable. Keep in mind, that object must be stored somewhere in memory anyway.
If you never reuse the B-object after this part, the first option with an anonymous object is probably neater:
the second option would need an additional store/load command (as Hot Licks mentioned) if it isn't optimized by the compiler
possibly first storing the object in a variable creates slight overhead for the garbage collector as opposed to an anonymous object, but that's more of a "look into that" than a definitive statement of me
If you do want to access a B a second time, storing one in its own variable is faster.
EDIT: ah, both points already mentioned above while I was typing.
You will not be able to say the difference without looking at the generated machine code. It could be that the JIT puts the local variable "b" onto the stack. More likely however that the JIT will optimize b away. Depends on the JRE and JIT you are using. In any case, the difference is minor and only significant in extremely special cases.
Actually there is no difference in the second instance you are just giving the new object reference to b.
So code wise you cannot achieve the println if you use version 1, as you dont have any reference as you have in the second case unless you keep creating new object for every method call.
In that case the difference, if any, would not be worth mentioning. In the second case an extra bytecode or two would probably be generated, if the compiler didn't optimize them away, but any decent JIT would almost certainly optimize the two cases to the identical machine code.
And, in any event, the cost of an extra store/load would be inconsequential for 99.9% of applications (and swamped in this case by the new operation).
Details: If you look at the bytecodes, in the first case the getB method is called and returns a value on the top of the stack. Then a getfield references value and places that on the top of the stack. Then a StringBuilder append is done to begin building the println parameter list.
In the second case there is an extra astore and aload (pointer store/load) after the getB invocation, and the setup for the StringBuilder is stuck between the two, so that side-effects occur in the order specified in the source. If there were no side-effects to worry about the compiler might have chosen to do the very slightly more efficient dupe and astore sequence. In any case, a decent JIT would recognize that b is never used again and optimize away the store.
When I have a class like this:
class Test {
private int _id = 0 ; // 4 bytes
private int _age = 0 ; // 4 bytes
}
I'm sure that each instance of it will consume more than 8 bytes in memory because of the 2 integers.
But what about methods? If I have a class with a million methods, and 2 instances of it, are the methods going to take twice as much place in memory ?
How does it work?
Thank you.
No. Methods occur only once in memory1. They don't vary on a per instance basis, so they don't need storage on a per-instance basis.
An object in Java basically consists of some fixed-size "housekeeping" (a pointer to the type information including the vtable), potentially GC-related bits (think mark and sweep), information about the monitor for the instance etc - and then the fields.
1 This is a bit of a simplification. There may be various representations, such as bytecode, native code etc - but that's regardless of separate instances.
Having two instances of the same class does not duplicate the amount of space needed for the method code. That is to say, the methods reside in one place in memory and then each instance of the class has a pointer pointing to that place in memory. This is because otherwise memory would be wasted. The code that needs to be executed for each method is the same, regardless of which instance of the class calls it, so it would make no sense to duplicate it.
But to execute the methods like instance.method(), the local copy of the method will be made in stack per instance, where as the instance will be in the heap.
Is it correct to assume a Java object only takes up the 8 bytes for the object reference as long as all it's members are set to null or does the definition of members already use up space in the instance for some reason?
In other words, if I have a large collection of objects that I want to be space efficient, can I count on leaving unused members set to null for reducing memory footprint?
No, you need either 4 or 8 bytes ( depending whether it's a 32 or 64 bit system ) for each null you are storing in a field. How would the object know its field was null if there wasn't something stored somewhere to tell it so?
No, null is also information and has also to be stored.
The object references themselves will still take up some memory, so the only real memory savings will be from the heap objects that would have been referred to.
So, if you have
MyClass {
public Object obj;
MyClass(Object o) { obj = o; }
}
MyClass a = new MyClass( new HashMap() );
MyClass b = new MyClass( null );
the two objects a and b themselves take up the same amount of memory (they both hold one reference), but the Object obj referred to by a takes up memory on the heap, so this memory could have been saved by setting the reference to null.
The Java Language Specification and the Java Virtual Machine Specification do not say anything about the size of Java objects in memory - it is deliberately left undefined, because it depends on the implementation of the JVM. This means that you cannot make any assumptions on how much memory a Java object uses in general or what the memory layout of a Java object in memory is.
As others have already said: A reference (variable) takes up memory, whether the reference is null or not. Your objects will not shrink and take up less memory when you set member variables to null.
What about class, object and method names? As you can call, access or instantiate them using Java reflect, the names have to be stored somewhere too, right? But CMIIW.
Kage
When a Java object has all its members are null then to it will consume memory because
It also has some additional memory requirement for headers, referencing and housekeeping.
The heap memory used by a Java object includes
memory for primitive fields, according to their size.
memory for reference fields (4 bytes each).
an object header, consisting of a few bytes of "housekeeping" information.
Objects in java also requires some "housekeeping" information, such as recording an object's class, ID and status flags such as whether the object is currently reachable, currently synchronization-locked etc.
Java object header size varies on 32 and 64 bit jvm.
Although these are the main memory consumers jvm also requires additional fields sometimes like for alignment of the code e.t.c.
So this is the reason when your Java object has all its members are null then to it will consume memory.