I was wondering what happens at memory level when an object is defined, but not initialized.
For example:
public class MainClass {
public static void main (String[] args){
Object foo;
}
}
Is foo pointing to a memory space?
Does this behaviour change between different programming languages?
Thanks in advance.
Edit: I know the object will point to null when used, but I am interested to know what happens just after the object has been defined, and not instantiated yet.
Is there a reference to the memory in this case?
I was wondering what happens at memory level when an object is defined, but not initialized.
Let's assume that we are talking about Java here.
First I must correct your incorrect description. (For reasons that will become apparent ...)
This is not defining an object. Rather, it is declaring a variable. The identifier foo denotes a variable. Since the type of the variable is (in this case) Object which is a reference type, the variable may contain either a reference to a Java object or null.
Is foo pointing to a memory space?
The answer is slightly complicated.
If the variable is initialized, it will either point to some object or it will contain null.
If the variable is NOT initialized, then it depends on what type of variable we are talking about:
For a field of a class (static or instance), a variable that is not explicitly initialized is default initialized to null.
For a variable that is a parameter or a catch variable, the Java language semantics ensure that the variable is always initialized ... so this is moot.
For a local variable, the JLS doesn't say what it contains before a value is assigned to it. You could say that the value is indeterminate. However the JLS (and at runtime, the JVM's classfile verifier) ensure that a program cannot use a local variable that is in an indeterminate state. (It is a compilation error in Java code to read a variable that has not been definitely assigned.) So it really makes no difference what the variable actually contains.
Note that in pure Java [1] it is not possible to access a variable that contains a value that wasn't set by either assignment or initialization. The Java Language Specification doesn't allow it and neither does the JVM Specification. A variable can never be observed to contain a random memory address.
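As a rough illustration of the three cases above, here is a minimal sketch. The class VariableKinds and its members are invented for this example and do not come from the question. The field reads as null without any assignment, the parameter always arrives with a value, and the local variable can only be read after it has been assigned.

```java
// Hypothetical class, used only to illustrate the rules described above.
class VariableKinds {
    static Object field;   // never assigned explicitly: default-initialized to null

    static String describe(String param) {   // a parameter is always initialized by the caller
        Object local;                         // declared, but holds no usable value yet
        // String early = "" + local;         // would not compile: local might not have been initialized
        local = param.length();               // first assignment; the int is boxed to an Integer
        return "field=" + field + ", param=" + param + ", local=" + local;
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));  // prints: field=null, param=abc, local=3
    }
}
```

Uncommenting the marked line is the quickest way to see the compiler reject a read of an unassigned local.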
Does this behavior change between different programming languages?
Err ... yes. For example, in C and C++, a program may use the value of a pointer variable that has not been initialized. The behavior that ensues is undefined.
[1] - If you use native code or Unsafe, it is possible for the code to corrupt a variable to contain anything. But don't do this deliberately, as it is liable to hard crash the JVM. Code that uses native methods or Unsafe is not pure Java.
In Java, foo will hold null when it is declared as a field of a class,
and foo will hold no usable value when it is declared as a local variable inside a method.
In Java you can think of object variables as pointers. By default they point to nothing; only the pointer itself is allocated (e.g. 8 bytes on the stack).
You can have it point to an actual instance of an object by allocating that object and assigning to the variable:
Object foo; // points to nothing (and may not be used)
foo.toString(); // compile error: The local variable foo may not have been initialized
foo = new Object(); // points to an instance of a new Object
foo = null; // again points to nothing, but is now initialized
foo.toString(); // will compile, but throw NullPointerException at run time
This is fundamentally different from C or C++, where Object foo; would actually be a local object allocated on the stack. Java never allocates objects on the stack, only primitive types or pointers.
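The snippet above deliberately contains a line that does not compile. A compilable variant, sketched here with an invented class name ReferenceStates, leaves that line out and catches the run-time failure so the states can be observed:

```java
// Hypothetical demo class; it mirrors the snippet above minus the non-compiling line.
class ReferenceStates {
    static String run() {
        StringBuilder log = new StringBuilder();
        Object foo;                    // declared, not yet assigned (may not be read here)
        foo = new Object();            // now refers to a new instance
        log.append(foo != null);       // appends "true"
        foo = null;                    // initialized again, but refers to no object
        try {
            foo.toString();            // compiles, but fails at run time
        } catch (NullPointerException e) {
            log.append(" NPE");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());     // prints: true NPE
    }
}
```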
Consider the following code snippet in Java. It won't compile.
package temppkg;
final public class Main
{
private String x;
private int y;
private void show()
{
String z;
int a;
System.out.println(x.toString()); // Causes a NullPointerException but doesn't issue a compiler error.
System.out.println(y); // Works fine displaying its default value which is zero.
System.out.println(z.toString()); // Causes a compile-time error - variable z might not have been initialized.
System.out.println(a); // Causes a compile-time error - variable a might not have been initialized.
}
public static void main(String []args)
{
new Main().show();
}
}
Why do the class members (x and y) declared in the above code snippet not issue any compile-time error even though they are not explicitly initialized and only local variables are required to be initialized?
When in doubt, check the Java Language Specification (JLS).
In the introduction you'll find:
Chapter 16 describes the precise way in which the language ensures
that local variables are definitely set before use. While all other
variables are automatically initialized to a default value, the Java
programming language does not automatically initialize local variables
in order to avoid masking programming errors.
The first paragraph of chapter 16 states,
Each local variable and every blank final field must have a definitely
assigned value when any access of its value occurs....A Java compiler
must carry out a specific conservative flow analysis to make sure
that, for every access of a local variable or blank final field f, f
is definitely assigned before the access; otherwise a compile-time
error must occur.
The default values themselves are in section 4.12.5. The section opens with:
Each class variable, instance variable, or array component is
initialized with a default value when it is created.
...and then goes on to list all the default values.
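To see those defaults without opening the specification, a small sketch along these lines can help. The class DefaultValues is made up for the example. It reads fields and array components that were never explicitly assigned.

```java
// Hypothetical class that reads fields and array components nobody assigned.
class DefaultValues {
    static int i;
    static long l;
    static double d;
    static boolean z;
    static char c;
    static Object ref;

    static String describe() {
        int[] ints = new int[1];            // array components get defaults too
        String[] strings = new String[1];
        return "int=" + i + " long=" + l + " double=" + d + " boolean=" + z
                + " char=" + (int) c + " ref=" + ref
                + " int[0]=" + ints[0] + " String[0]=" + strings[0];
    }

    public static void main(String[] args) {
        // prints: int=0 long=0 double=0.0 boolean=false char=0 ref=null int[0]=0 String[0]=null
        System.out.println(describe());
    }
}
```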
The JLS is really not that hard to understand and I've found myself using it more and more to understand why Java does what it does...after all, it's the Java bible!
Why would they issue a compile-time error? As instance variables, the String gets a default value of null and the int gets a default value of 0.
The compiler has no way to know that x.toString() will cause a runtime exception, because the null value is not actually set until runtime.
In general the compiler couldn't know for sure if a class member has or has not been initialized before. For example, you could have a setter method that sets the value for a class member, and another method which accesses that member. The compiler can't issue a warning when accessing that variable because it can't know whether the setter has been called before or not.
I agree that in this case (the member is private and no method writes the variable) it seems the compiler could raise a warning. In fact, though, you still cannot be sure that the variable has not been initialized, since it could have been set via reflection.
Things are much easier with local variables, since they can't be accessed from outside the method, not even via reflection (afaik, please correct me if wrong), so the compiler can be a bit more helpful and warn us of uninitialized variables.
I hope this answer helps you :)
Class members could have been initialized elsewhere in your code, so the compiler can't see if they were initialized at compilation time.
Member variables are automatically initialized to their default values when you construct (instantiate) an object. That holds true even when you have manually initialized them, they will be initialized to default values first and then to the values you supplied.
This is a slightly lengthy article, but it explains it: Object initialization in Java
Local variables (ones that are declared inside a method), on the other hand, are not initialized automatically, which means you have to do it manually, even if you want them to have their default values.
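The "defaults first, then your values" order can be observed directly. In this sketch (the class InitOrder and its field names are invented for illustration), an instance initializer block that appears before a field's initializer reads that field through this and sees the default:

```java
// Hypothetical class; initializers run in textual order when the object is created.
class InitOrder {
    int seenBeforeInitializer;

    // Runs before the initializer of 'count' below, so 'count' still holds its default.
    {
        seenBeforeInitializer = this.count;
    }

    int count = 42;   // the explicit value is applied only after the default 0 was in place

    public static void main(String[] args) {
        InitOrder o = new InitOrder();
        System.out.println(o.seenBeforeInitializer + " then " + o.count);   // prints: 0 then 42
    }
}
```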
You can see what the default values are for variables with different data types here.
The default value for the reference type variables is null. That's why it's throwing NullPointerException on the following:
System.out.println(x.toString()); // Causes a NullPointerException but doesn't issue a compiler error.
In the following case, the compiler is smart enough to know that the variable is not initialized yet (because it's local and you haven't initialized it), which is why compilation fails:
System.out.println(z.toString()); // Causes a compile-time error - variable z might not have been initialized.
Does it have a value?
I am trying to understand what is the state of a declared but not-initialized variable/object in Java.
I cannot actually test it, because I keep getting the "Not Initialized" compile-error and I cannot seem to be able to suppress it.
Though for example, I would guess that if the variable would be an integer it could be equal to 0.
But what if the variable were a String? Would it be equal to null, or would isEmpty() return true?
Is the value the same for all non-initialized variables? or every declaration (meaning, int, string, double etc) has a different value when not explicitly initialized?
UPDATE
So as I see now, it makes a big difference if the variable is declared locally or in the Class, though I seem to be unable to understand why when declaring as static in the class it gives no error, but when declaring in the main it produces the "Not Initialized" error.
How exactly a JVM does this is entirely up to the JVM and shouldn't matter for a programmer, since the compiler ensures that you do not read uninitialized local variables.
Fields however are different. They need not be assigned before reading them (unless they are final) and the value of a field that has not been assigned is null for reference types or the 0 value of the appropriate primitive type, if the field has a primitive type.
Using s.isEmpty() for a field String s; that has not been assigned results in a NullPointerException.
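A quick way to check this for yourself is a sketch like the following, where EmptyVersusNull is a made-up class name. The field is never assigned, so it is null rather than an empty string.

```java
// Hypothetical class showing that an unassigned String field is null, not "".
class EmptyVersusNull {
    static String s;   // field, never assigned: null

    static String check() {
        try {
            return "isEmpty=" + s.isEmpty();   // dereferences null
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(s == null);   // prints: true
        System.out.println(check());     // prints: NullPointerException
    }
}
```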
So as I see now, it makes a big difference if the variable is declared locally or in the Class, though I seem to be unable to understand why when declaring in the class it gives no error, but when declaring in the main it produces the "Not Initialized" error.
In general it's undesirable to work with variables that do not have a value. For this reason the language designers had 2 choices:
a) define a default value for variables not yet initialized
b) prevent the programmers from accessing the variable before writing to them.
b) is hard to achieve for fields and therefore option a) was chosen for fields. (There could be multiple methods reading/writing that could be valid or invalid depending on the order of calls, which could only be determined at runtime).
For local variables option b) is viable, since all possible paths of the execution of the method can be checked for assignment statements. This option was chosen during the language design for local variables, since it can help to find many easy mistakes.
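Option b) in practice looks roughly like this. The sketch uses an invented class DefiniteAssignmentPaths. In both methods the local variable has no initializer, and the compiler accepts the final read only because every path that reaches it has assigned a value.

```java
// Hypothetical class; each method compiles only because all paths assign the local.
class DefiniteAssignmentPaths {
    static String classify(int n) {
        String label;               // no initializer
        if (n < 0) {
            label = "negative";
        } else if (n == 0) {
            label = "zero";
        } else {
            label = "positive";
        }
        return label;               // OK: assigned on every branch
    }

    static int parseOrThrow(String s) {
        int value;                  // no initializer
        try {
            value = Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // This path throws, so it never reaches the read below.
            throw new IllegalArgumentException("not a number: " + s, e);
        }
        return value;               // OK: only the successful path gets here
    }

    public static void main(String[] args) {
        System.out.println(classify(-5) + " " + parseOrThrow("7"));   // prints: negative 7
    }
}
```

Removing the final else branch in classify is enough to make the compiler report that label might not have been initialized.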
Fabian already provided a very clear answer; I will just add the relevant passages from the official documentation for reference.
Fields in Class
It's not always necessary to assign a value when a field is declared. Fields that are declared but not initialized will be set to a reasonable default by the compiler. Generally speaking, this default will be zero or null, depending on the data type. Relying on such default values, however, is generally considered bad programming style.
For fields, leaving out the initial value is only considered bad style; for local variables it is not merely a matter of style.
Local Variables
Local variables are slightly different; the compiler never assigns a default value to an uninitialized local variable. If you cannot initialize your local variable where it is declared, make sure to assign it a value before you attempt to use it. Accessing an uninitialized local variable will result in a compile-time error.
The default value depends on the data type and on where the uninitialized variable is declared. See the link below for the defaults of the primitive types.
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
final Object o;
List l = new ArrayList(){{
// closure over o, in lexical scope
this.add(o);
}};
Why must o be declared final? Why don't other JVM languages with mutable vars have this requirement?
This is not JVM-deep, it all happens at syntactic-sugar level. The reason is that exporting a non-final var via a closure makes it vulnerable to datarace issues and, since Java was designed to be a "blue-collar" language, such a surprising change in the behavior of an otherwise tame and safe local var was deemed way too "advanced".
It's not hard to deduce logically why it has to be final.
In Java, when a local variable is captured into an anonymous class, it is copied by value. The reason for this is that the object may live longer than the current function call (e.g. it may be returned, etc.), but local variables only live as long as the current function call. So it is not possible to simply "reference" the variable because it may not exist by then. Some languages like Python, Ruby, JavaScript, do allow you to reference variables after the scope is gone, by keeping a reference to the environment in the heap or something. But this is hard to do with the JVM because local variables are allocated on the function's stack frame, which is destroyed when the function call is done.
Now, since it is copied, there are two copies of the variable (and more, if there are more closures capturing this variable). If they were assignable, then you can change one of them without changing the other. For example, hypothetically:
Object o;
Object x = new Object(){
public String toString() {
return o.toString();
}
};
o = somethingElse;
System.out.println(x.toString()); // prints the original object, not the re-assigned one
// even though "o" now refers to the re-assigned one
Since there is only one o variable in the scope, you would expect them to refer to the same thing. In the example above, after you assign to o, you would expect a later access of o from the object to refer to the new value; but it doesn't. This would be surprising and unexpected to the programmer, and violates the principle that uses of the same variable refer to the same thing.
So to avoid this surprise, they mandate that you cannot assign to it anywhere; i.e. it has to be final.
Now, of course, you can still initialize the final variable from a non-final variable. And inside the closure, you can still assign the final variable to something else non-final.
Object a = new Object(); // non-final
final Object o = a;
Object x = new Object(){
Object m = o; // non-final
public String toString() {
return m.toString();
}
};
But then this is all good since you are explicitly using different variables, so there is no surprise about what it does.
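Here is a runnable sketch of that pattern, with an invented class CaptureDemo and a Supplier standing in for the anonymous Object subclass. The closure captures an explicit final copy, so reassigning the original variable afterwards is legal and has no effect on what the closure sees.

```java
import java.util.function.Supplier;

// Hypothetical demo of copying a non-final local into a final one before capturing it.
class CaptureDemo {
    static String run() {
        String current = "first";                 // non-final, reassigned below
        final String captured = current;          // explicit final copy for the closure
        Supplier<String> reader = new Supplier<String>() {
            @Override
            public String get() {
                return captured;                  // reads the copy, not 'current'
            }
        };
        current = "second";                       // allowed: 'current' itself is not captured
        return reader.get() + "/" + current;
    }

    public static void main(String[] args) {
        System.out.println(run());                // prints: first/second
    }
}
```

Since Java 8 the final keyword can be omitted when the captured variable is never reassigned (it is then "effectively final"), but the copy-by-value behaviour is the same.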
I have this code:
MyClass object;
.... some code here where object may or may not be initialised...
if (object.getId > 0) {
....
}
Which results in a compile error: object may not have been initialised, which is fair enough.
Now I change my code to this:
MyClass object;
.... some conditional code here where object may or may not be initialised...
if (object != null && object.getId > 0) {
....
}
I get the same compile error! I have to initialise object to null:
MyClass object = null;
So what's the difference between not initialising an object, and initialising to null? If I declare an object without initialisation isn't it null anyway?
Thanks
fields (member-variables) are initialized to null (or to a default primitive value, if they are primitives)
local variables are not initialized and you are responsible for setting the initial value.
It's a language-definition thing.
The language states that variables of METHOD-scope MUST be manually initialized -- if you want them to start out as NULL, you must explicitly say so -- if you fail to do so, they are basically in an undefined state.
Contrarily, the language states that variables of CLASS-scope do not need to be manually initialized -- failure to initialize them results in them automatically getting initialized to NULL -- so you don't have to worry about it.
As far as the difference between the two states (null vs. undefined), yes they are basically the same -- but the language dictates that you need to initialize a variable (whether that's done automatically for you or not, depending on the variable's scope).
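A compilable version of the pattern from the question might look like the sketch below. MyClass and its getId() method are stand-ins invented here, since the question does not show them. With the explicit null, the variable is definitely assigned on every path, so the null check compiles.

```java
// Hypothetical stand-ins for the classes in the question.
class NullInitialised {
    static final class MyClass {
        int getId() {
            return 7;
        }
    }

    static boolean hasPositiveId(boolean create) {
        MyClass object = null;          // explicit initial value; without it the read below is rejected
        if (create) {
            object = new MyClass();     // "may or may not be initialised"
        }
        return object != null && object.getId() > 0;
    }

    public static void main(String[] args) {
        System.out.println(hasPositiveId(true) + " " + hasPositiveId(false));   // prints: true false
    }
}
```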
Your declaration of object is really a declaration of a pointer, or reference, to an instance of MyClass on the heap. In a language such as C++, an uninitialized pointer essentially points somewhere random; Java does not let you observe such a value and instead refuses to compile a read of an unassigned local variable. By explicitly initializing the reference to null you give it a definite value that is known not to refer to any object.
Extra confusion is introduced in Java because it implicitly initialises member variables to NULL for you.
It makes a bit more sense if you've used lower level languages like C++.
Consider this:
public class TestClass {
private String a;
private String b;
public TestClass()
{
a = "initialized";
}
public void doSomething()
{
String c;
a.notify(); // This is fine
b.notify(); // This is fine - but will end in an exception
c.notify(); // "Local variable c may not have been initialised"
}
}
I don't get it. "b" is never initialized but will give the same run-time error as "c", which is a compile-time error. Why the difference between local variables and members?
Edit: making the members private was my initial intention, and the question still stands...
The language defines it this way.
Instance variables of object type default to being initialized to null.
Local variables of object type are not initialized by default and it's a compile time error to access an undefined variable.
See section 4.12.5 for SE7 (same section still as of SE14)
http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.5
Here's the deal. When you call
TestClass tc = new TestClass();
the new command performs four important tasks:
Allocates memory on the heap for the new object.
Initializes the object's fields to their default values (numerics to 0, boolean to false, objects to null).
Calls the constructor (which may re-initialize the fields, or may not).
Returns a reference to the new object.
So your fields 'a' and 'b' are both initialized to null, and 'a' is re-initialized in the constructor. This process does not apply to method calls, so local variable 'c' is never initialized.
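One way to make the second and third steps visible is sketched below. The Base class and the describe() method are additions for this illustration and are not part of the question's code. The superclass constructor runs before the TestClass constructor body, so a method it calls sees both fields still at their defaults.

```java
// Hypothetical wrapper; Base exists only so something can look at the fields early.
class NewSteps {
    static class Base {
        final String seenInBaseConstructor;

        Base() {
            // Runs before the TestClass constructor body has assigned 'a'.
            seenInBaseConstructor = describe();
        }

        String describe() {
            return "base";
        }
    }

    static class TestClass extends Base {
        private String a;
        private String b;

        TestClass() {
            a = "initialized";
        }

        @Override
        String describe() {
            return "a=" + a + ", b=" + b;
        }
    }

    static String run() {
        TestClass tc = new TestClass();
        return tc.seenInBaseConstructor + " | " + tc.describe();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints: a=null, b=null | a=initialized, b=null
    }
}
```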
For the gravely insomniac, read this.
The rules for definite assignment are quite difficult (read chapter 16 of JLS 3rd Ed). It's not practical to enforce definite assignment on fields. As it stands, it's even possible to observe final fields before they are initialised.
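As a small sketch of that last point (class and member names are invented here), a constructor can call a method that reads a final field before the constructor has assigned it. The compiler does not track the read inside the method, so the default value leaks out:

```java
// Hypothetical class showing a final field observed at its default value.
class FinalObserved {
    final int value;
    final int seenEarly;

    FinalObserved() {
        seenEarly = peek();   // peek() reads 'value' before the assignment below
        value = 42;
    }

    private int peek() {
        return value;         // legal: definite assignment is not checked across method calls
    }

    public static void main(String[] args) {
        FinalObserved f = new FinalObserved();
        System.out.println(f.seenEarly + " then " + f.value);   // prints: 0 then 42
    }
}
```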
The compiler can figure out that c will never be set. The b variable could be set by someone else after the constructor is called, but before doSomething(). Make b private and the compiler may be able to help.
The compiler can tell from the code for doSomething() that c is declared there and never initialized. Because it is local, there is no possibility that it is initialized elsewhere.
It can't tell when or where you are going to call doSomething(). b is a public member. It is entirely possible that you would initialize it in other code before calling the method.
Member-variables are initialized to null or to their default primitive values, if they are primitives.
Local variables are UNDEFINED and are not initialized and you are responsible for setting the initial value. The compiler prevents you from using them.
Therefore, b is initialized when the class TestClass is instantiated while c is undefined.
Note: null is different from undefined.
You've actually identified one of the bigger holes in Java's system of generally attempting to find errors at edit/compile time rather than run time because--as the accepted answer said--it's difficult to tell if b is initialized or not.
There are a few patterns to work around this flaw. First is "Final by default". If your members were final, you would have to fill them in with the constructor--and it would use path-analysis to ensure that every possible path fills in the finals (You could still assign it "Null" which would defeat the purpose but at least you would be forced to recognize that you were doing it intentionally).
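A minimal sketch of the "final by default" pattern follows. The class name FinalByDefault is invented, and the Objects.requireNonNull calls are an extra assumption that goes slightly beyond what is described above: they reject null at construction time instead of merely making it explicit.

```java
import java.util.Objects;

// Hypothetical class; both fields are final, so the compiler insists the constructor assigns them.
class FinalByDefault {
    private final String a;
    private final String b;

    FinalByDefault(String a, String b) {
        // Leaving out either assignment is a compile-time error for a final field.
        this.a = Objects.requireNonNull(a, "a");
        this.b = Objects.requireNonNull(b, "b");
    }

    String joined() {
        return a + "-" + b;   // neither field can be null here
    }

    public static void main(String[] args) {
        System.out.println(new FinalByDefault("x", "y").joined());   // prints: x-y
    }
}
```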
A second approach is strict null checking. You can turn it on in the Eclipse settings, either per project or in the default properties. I believe it would force you to null-check b before you call b.notify(). This can quickly get out of hand, so it tends to go with a set of annotations to make things simpler.
The annotations might have different names, but in concept, once you turn on strict null checking and the annotations, the types of variables are "Nullable" and "NotNull". If you try to place a Nullable into a not-null variable you must check it for null first. Parameters and return types are also annotated so you don't have to check for null every single time you assign to a not-null variable.
There is also a "NotNullByDefault" package level annotation that will make the editor assume that no variable can ever have a null value unless you tag it Nullable.
These annotations mostly apply at the editor level (you can turn them on within Eclipse and probably other editors), which is why they aren't necessarily standardized. (At least as of the last time I checked; Java 8 might have some annotations I haven't found yet.)