scopes of variables in Java

scopes of variables in Java - java

In Java, you don't have declare a method physically before you use it. The same thing doesn't apply for variables.
Why is this the case? Is it just for "legacy" reason (ie., the Java's creators didn't feel like doing it), or is it just not possible?
Eg.,
public class Test
{
// It is OK for meth1 to invoke meth2
public void meth1() { meth2(); }
public void meth2() { }
// But why is it NOT ok for field1 to reference field2
private int field1 = field2;
private int field2 = 3;
}
If I wanted my Java compiler to support this kind of forward-reference, what is the general idea as to how to do it?
I understand there'd be issue about circular depencies, which we'd need to be careful about. But other than that, I really don't see why it should not be possible.
[Edit]Ok, here's my initial thought as to how to do this.
While analysing the code, the compiler would build a graph of dependencies for the variables in the given scope. And if it sees a loop (ie., int a = b; int b = a), then it would throw an error. If there is no loops, then there must be some optimal way to re-arrange the statements (behind the scence) such that a field will only reference fields declared before it, and so it shouuld try to figure out the order. I haven't worked out the exact algorithm, but I think it is possible. Unless someone can scientifically prove me wrong.
Recap the question:
Say I'm trying to build my own dialect of Java, which supports this sort of scoping. My main question is, could you give me some ideas as to how to do this
Thanks

According to the JLS, Section 12.4.1, initialization of class variables proceeds from top to bottom, in "textual order":
The static initializers and class variable initializers are executed in textual order, and may not refer to class variables declared in the class whose declarations appear textually after the use, even though these class variables are in scope (§8.3.2.3). This restriction is designed to detect, at compile time, most circular or otherwise malformed initializations.
So, if you make your own compiler to recognize forward class variable declarations, then it is violating the Java Language Specification.

I will give you a simple piece of code:
public class Test
{
private int foo = bar;
private int bar = foo;
}
What would you expect this to do?
I presume that designers of Java has done it so because assignment of values to instance variables must be executed in some order. In the case of Java, they are executed downwards (from up to bottom).
EDIT
What about this?
public class Test {
private int foo = quu++;
private int bar = quu++;
private int quu = 1;
}
What values would foo and bar have? Which quu++ statement would be executed first?
My point is, Java designers must have thought that it is counter intuitive to do it as you did describe in your question, i.e. unordered execution with compile time code analysis.
FINAL EDIT
Let's complicate things:
class Test {
private James james = new James(anInt);
private Jesse jesse = new Jesse(anInt);
private IntWrapper anInt = new IntWrapper();
}
class James {
public James(IntWrapper anInt) {
if(--anInt.value != 0) {
new Jesse(anInt);
}
else {
anInt.isJames = true;
}
}
}
class Jesse {
public Jesse(IntWrapper anInt) {
if(--anInt.value != 0) {
new James(anInt);
}
else {
anInt.isJames = false;
}
}
}
class IntWrapper {
public int value = 99;
public boolean isJames;
}
I am not sure what it proves regarding your question because I am not sure about your point.
There is no circular dependency here but value of an instance variable of IntWrapper, isJames, depends on the execution order and it might be difficult to detect this kind of stuff with a lexical/semantic analyzer.

How will field1 know value of field2 before it has been defined and assigned a value?

It has to do with the order of initialization. Fields are initialized top to bottom. In your example, when field1 tries to reference field2 in the initializer, the latter is yet to be initialized itself.
The rule about forward references is aimed at catching the most obvious cases of this problem. It does not catch all; for example, you can still do:
private int field1 = getField2();
private int field2 = 3;
private int getField2() { return field2; }
and get field1 intialized to zero where you might be expecting 3.

Related

In Java, is there a difference between initializing an object in constructor or in declaration? [duplicate]

Is there any advantage for either approach?
Example 1:
class A {
B b = new B();
}
Example 2:
class A {
B b;
A() {
b = new B();
}
}

There is no difference - the instance variable initialization is actually put in the constructor(s) by the compiler.
The first variant is more readable.
You can't have exception handling with the first variant.
There is additionally the initialization block, which is as well put in the constructor(s) by the compiler:
{
a = new A();
}
Check Sun's explanation and advice
From this tutorial:
Field declarations, however, are not part of any method, so they cannot be executed as statements are. Instead, the Java compiler generates instance-field initialization code automatically and puts it in the constructor or constructors for the class. The initialization code is inserted into a constructor in the order it appears in the source code, which means that a field initializer can use the initial values of fields declared before it.
Additionally, you might want to lazily initialize your field. In cases when initializing a field is an expensive operation, you may initialize it as soon as it is needed:
ExpensiveObject o;
public ExpensiveObject getExpensiveObject() {
if (o == null) {
o = new ExpensiveObject();
}
return o;
}
And ultimately (as pointed out by Bill), for the sake of dependency management, it is better to avoid using the new operator anywhere within your class. Instead, using Dependency Injection is preferable - i.e. letting someone else (another class/framework) instantiate and inject the dependencies in your class.

Another option would be to use Dependency Injection.
class A{
B b;
A(B b) {
this.b = b;
}
}
This removes the responsibility of creating the B object from the constructor of A. This will make your code more testable and easier to maintain in the long run. The idea is to reduce the coupling between the two classes A and B. A benefit that this gives you is that you can now pass any object that extends B (or implements B if it is an interface) to A's constructor and it will work. One disadvantage is that you give up encapsulation of the B object, so it is exposed to the caller of the A constructor. You'll have to consider if the benefits are worth this trade-off, but in many cases they are.

I got burned in an interesting way today:
class MyClass extends FooClass {
String a = null;
public MyClass() {
super(); // Superclass calls init();
}
#Override
protected void init() {
super.init();
if (something)
a = getStringYadaYada();
}
}
See the mistake? It turns out that the a = null initializer gets called after the superclass constructor is called. Since the superclass constructor calls init(), the initialization of a is followed by the a = null initialization.

my personal "rule" (hardly ever broken) is to:
declare all variables at the start of
a block
make all variables final unless they
cannot be
declare one variable per line
never initialize a variable where
declared
only initialize something in a
constructor when it needs data from
the constructor to do the
initialization
So I would have code like:
public class X
{
public static final int USED_AS_A_CASE_LABEL = 1; // only exception - the compiler makes me
private static final int A;
private final int b;
private int c;
static
{
A = 42;
}
{
b = 7;
}
public X(final int val)
{
c = val;
}
public void foo(final boolean f)
{
final int d;
final int e;
d = 7;
// I will eat my own eyes before using ?: - personal taste.
if(f)
{
e = 1;
}
else
{
e = 2;
}
}
}
This way I am always 100% certain where to look for variables declarations (at the start of a block), and their assignments (as soon as it makes sense after the declaration). This winds up potentially being more efficient as well since you never initialize a variable with a value that is not used (for example declare and init vars and then throw an exception before half of those vars needed to have a value). You also do not wind up doing pointless initialization (like int i = 0; and then later on, before "i" is used, do i = 5;.
I value consistency very much, so following this "rule" is something I do all the time, and it makes it much easier to work with the code since you don't have to hunt around to find things.
Your mileage may vary.

Example 2 is less flexible. If you add another constructor, you need to remember to instantiate the field in that constructor as well. Just instantiate the field directly, or introduce lazy loading somewhere in a getter.
If instantiation requires more than just a simple new, use an initializer block. This will be run regardless of the constructor used. E.g.
public class A {
private Properties properties;
{
try {
properties = new Properties();
properties.load(Thread.currentThread().getContextClassLoader().getResourceAsStream("file.properties"));
} catch (IOException e) {
throw new ConfigurationException("Failed to load properties file.", e); // It's a subclass of RuntimeException.
}
}
// ...
}

Using either dependency injection or lazy initialization is always preferable, as already explained thoroughly in other answers.
When you don't want or can't use those patterns, and for primitive data types, there are three compelling reasons that I can think of why it's preferable to initialize the class attributes outside the constructor:
avoided repetition = if you have more than one constructor, or when you will need to add more, you won't have to repeat the initialization over and over in all the constructors bodies;
improved readability = you can easily tell with a glance which variables will have to be initialized from outside the class;
reduced lines of code = for every initialization done at the declaration there will be a line less in the constructor.

I take it is almost just a matter of taste, as long as initialization is simple and doesn't need any logic.
The constructor approach is a bit more fragile if you don't use an initializer block, because if you later on add a second constructor and forget to initialize b there, you'll get a null b only when using that last constructor.
See http://java.sun.com/docs/books/tutorial/java/javaOO/initial.html for more details about initialization in Java (and for explanations on initalizer blocks and other not well known initialization features).

I've not seen the following in the replies:
A possible advantage of having the initialisation at the time of declaration might be with nowadays IDE's where you can very easily jump to the declaration of a variable (mostly
Ctrl-<hover_over_the_variable>-<left_mouse_click>) from anywhere in your code. You then immediately see the value of that variable. Otherwise, you have to "search" for the place where the initialisation is done (mostly: constructor).
This advantage is of course secondary to all other logical reasonings, but for some people that "feature" might be more important.

Both of the methods are acceptable. Note that in the latter case b=new B() may not get initialized if there is another constructor present. Think of initializer code outside constructor as a common constructor and the code is executed.

I think Example 2 is preferable. I think the best practice is to declare outside the constructor and initialize in the constructor.

The second is an example of lazy initialization. First one is more simple initialization, they are essentially same.

There is one more subtle reason to initialize outside the constructor that no one has mentioned before (very specific I must say). If you are using UML tools to generate class diagrams from the code (reverse engineering), most of the tools I believe will note the initialization of Example 1 and will transfer it to a diagram (if you prefer it to show the initial values, like I do). They will not take these initial values from Example 2. Again, this is a very specific reason - if you are working with UML tools, but once I learned that, I am trying to take all my default values outside of constructor unless, as was mentioned before, there is an issue of possible exception throwing or complicated logic.

The second option is preferable as allows to use different logic in ctors for class instantiation and use ctors chaining. E.g.
class A {
int b;
// secondary ctor
A(String b) {
this(Integer.valueOf(b));
}
// primary ctor
A(int b) {
this.b = b;
}
}
So the second options is more flexible.

It's quite different actually:
The declaration happens before construction. So say if one has initialized the variable (b in this case) at both the places, the constructor's initialization will replace the one done at the class level.
So declare variables at the class level, initialize them in the constructor.

class MyClass extends FooClass {
String a = null;
public MyClass() {
super(); // Superclass calls init();
}
#Override
protected void init() {
super.init();
if (something)
a = getStringYadaYada();
}
}
Regarding the above,
String a = null;
null init could be avoided since anyway it's the default.
However, if you were to need another default value,
then, because of the uncontrolled initialization order,
I would fix as follow:
class MyClass extends FooClass
{
String a;
{
if( a==null ) a="my custom default value";
}
...

Access to final field via getter before its initializing

First of all, this question has forward relation to these nice questions:
1) Use of uninitialized final field - with/without 'this.' qualifier
2) Why isn't a qualified static final variable allowed in a static initialization block?
But I will ask it in slightly another angle of view. Just to remember: mentioned questions above were asked about access to final fields using keyword this in Java 7.
In my question there is something similar, but not the same. Well, consider the following code:
public class TestInit {
final int a;
{
// System.out.println(a); -- compile error
System.out.println(getA());
// a = a; -- compile error
a = getA();
System.out.println(getA());
}
private int getA() {
return a;
}
public static void main(String[] args) {
new TestInit();
}
}
And output is:
0
0
As you can see there are two unclear things here:
There is another legal way to access non-initialized final field: using its getter.
Should we consider that an assigment a = getA(); as legal for blank final field and that it will always assign to it its default value like for non-final field according to JLS? In other words, should it be considered as expected behaviour?

What you're really running into is the inference ability of the compiler.
For your first compiler error, that fails because the compiler definitely knows a is not assigned. Same with the second a=a (you can't assign from a because it's definitely not assigned at that point).
The line that works a=getA() is a first, single, definite assignment of a, so that's fine (regardless where the value comes from; the implementation of getA() doesn't matter and isn't assessed at this point).
The method getA() is valid and can return a because the instance initializer has definitely assigned a value to a.
It makes sense to a human looking at it, but it's not an inference built into the compiler. Each block, independently evaluated, is valid.

how to declare a helper method in java, to be used locally being able to reference variables

I have a class with several methods. Now I would like to define a helper method that should be only visible to method A, like good old "sub-functions" .
public class MyClass {
public methodA() {
int visibleVariable=10;
int result;
//here somehow declare the helperMethod which can access the visibleVariable and just
//adds the passed in parameter
result = helperMethod(1);
result = helperMethod(2);
}
}
The helperMethod is only used by MethodA and should access MethodA's declared variables - avoiding passing in explicitly many parameters which are already declared within methodA.
Is that possible?
EDIT:
The helper mehod is just used to avoid repeating some 20 lines of code which differ in only 1 place. And this 1 place could easily be parameterized while all the other variables in methodA remain unchanged in these 2 cases

Well you could declare a local class and put the method in there:
public class Test {
public static void main(String[] args) {
final int x = 10;
class Local {
int addToX(int value) {
return x + value;
}
}
Local local = new Local();
int result1 = local.addToX(1);
int result2 = local.addToX(2);
System.out.println(result1);
System.out.println(result2);
}
}
But that would be a very unusual code. Usually this suggests that you need to take a step back and look at your design again. Do you actually have a different type that you should be creating?
(If another type (or interface) already provided the right signature, you could use an anonymous inner class instead. That wouldn't be much better...)

Given the variables you declare at the top of your method can be marked as final (meaning they don't change after being initialized) You can define your helper method inside a helper class like below. All the variables at the top could be passed via the constructor.
public class HelperClass() {
private final int value1;
private final int value2;
public HelperClass(int value1, int value2) {
this.value1 = value1;
this.value2 = value2;
}
public int helperMethod(int valuex) {
int result = -1;
// do calculation
return result;
}
}
you can create an instance of HelperClass and use it inside the method

It is not possible. It is also not good design. Violating the rules of variable scope is a sure-fire way to make your code buggy, unreadable and unreliable. If you really have so many related variables, consider putting them into their own class and giving a method to that class.
If what you mean is more akin to a lambda expression, then no, this is not possible in Java at this time (but hopefully in Java 8).

No, it is not possible.
I would advise you create a private method in your class that does the work. As you are author of the code, you are in control of which other methods access the private method. Moreover, private methods will not be accessible from the outside.
In my experience, methods should not declare a load of variables. If they do, there is a good chance that your design is flawed. Think about constants and if you couldn't declare some of those as private final variables in your class. Alternatively, thinking OO, you could be missing an object to carry those variables and offer you some functionality related to the processing of those variables.

methodA() is not a method, it's missing a return type.
You can't access variables declared in a method from another method directly.
You either has to pass them as arguments or declare methodA in its own class together with the helpermethods.
This is probably the best way to do it:
public class MyClass {
public void methodA() {
int visibleVariable=10;
int result;
result = helperMethod(1, visibleVariable);
result = helperMethod(2, visibleVariable);
}
public int helperMethod(int index, int visibleVariable) {
// do something with visibleVariable
return 0;
}
}

compiler optimizes away a final field added to a serialized class

This is a strange one, am wondering if this is by design or a compiler bug. Am using Sun Java 6 on a PC, but also seen with Sun Java 5 on a linux box.
Suppose I have a file on disk containing a serialized class. I then add a final field to the class and declare and initialize the field in one statement:
public final int newField = 2;
If I read in an object that was serialized before this field was added, this new field is given a default value of 0, which is correct. However, the compiler replaces occurrences of the field by the constant "2". OTOH, if I instead initialize the new field inside of a constructor:
Data (int d) {
this.oldField = d;
this.newField = 2;
}
then the field does not get optimized away and everything works as expected.
More explicitly, here is my Main class:
import java.io.*;
public class Main {
public static void main(String[] args) {
try {
FileInputStream fis = new FileInputStream ("Data.txt");
ObjectInputStream ois = new ObjectInputStream (fis);
Data d = (Data)ois.readObject();
ois.close();
fis.close();
d.print();
}
catch (Exception e) {
e.printStackTrace();
}
}
}
and here is the Data class:
import java.io.*;
public class Data implements Serializable {
private static final long serialVersionUID = 1;
private int oldField;
private final int newField = 2;
Data(int d) {
this.oldField= d;
}
public void print() { {
System.out.println(this.oldField + " " + this.newField);
}
}
So my Data.txt file has a serialized instance of the original class with just "oldField = 1". So the expected output of running this is
1 0
but instead I get
1 2
whereas if I move the initialization into the constructor the output is correct.
Is this expected behaviour? My feeling is that the compiler should not be optimizing away fields in serialized classes, precisely for this reason. Perhaps that is asking too much. Thanks!

This is by design. If a variable is declared as final, has a primitive type and an initializer which is a constant expressions, then it is a "constant variable". This alters the semantics of initialization among other things. Specifically, the expression value is evaluated at compile time, and then embedded into the code.
Refer to JLS 4.12.4 for details.
When you put the initialization into the constructor, the variable is no longer a "constant variable" and normal initialization occurs.
By the way, I would have thought that a final variable having a different value than what the initializer said was a bad thing, not a good thing. To me, that would be highly unintuitive. At any rate, relying on this kind of behaviour does not strike me as good practice.

It's not clear why you think this is a problem. The compiler is doing exactly what it should do. The class definition says that the final field's value is 2, and that is exactly what you're getting.
The underlying reason is that when the final variable's value is declared in the initializer, the compiler knows its value immediately, so it is able to substitute the actual value as a literal wherever you use the variable. When you initialized it in constructor code, it couldn't see the value so easily so it doesn't do that optimization. It is still free to do so if it can.
In the language of the JLS #4.12.4, this is a 'constant variable', and JLS #13.1 says 'References to fields that are constant variables (§4.12.4) are resolved at compile time to the constant value that is denoted', which is exactly what is happening here.
My feeling is that the compiler should not be optimizing away fields
in serialized classes
Why should the compiler have different rules for serialized classes?

The usage of final variable always provokes compiler optimization. Its doing what it supposed to do.

Using "this" with methods (in Java)

what about using "this" with methods in Java? Is it optional or there are situations when one needs to use it obligatory?
The only situation I have encountered is when in the class you invoke a method within a method. But it is optional. Here is a silly example just to show what I mean:
public class Test {
String s;
private String hey() {
return s;
}
public String getS(){
String sm = this.hey();
// here I could just write hey(); without this
return sm;
}
}

Three obvious situations where you need it:
Calling another constructor in the same class as the first part of your constructor
Differentiating between a local variable and an instance variable (whether in the constructor or any other method)
Passing a reference to the current object to another method
Here's an example of all three:
public class Test
{
int x;
public Test(int x)
{
this.x = x;
}
public Test()
{
this(10);
}
public void foo()
{
Helper.doSomethingWith(this);
}
public void setX(int x)
{
this.x = x;
}
}
I believe there are also some weird situations using inner classes where you need super.this.x but they should be avoided as hugely obscure, IMO :)
EDIT: I can't think of any examples why you'd want it for a straight this.foo() method call.
EDIT: saua contributed this on the matter of obscure inner class examples:
I think the obscure case is: OuterClass.this.foo() when accessing foo() of the outer
class from the code in an Inner class that has a foo() method as well.

I use "this" to clarify code, often as a hint that I'm calling an instance method rather than accessing a class-level method or a field.
But no. Unless disambiguation is required due to scope naming collision, you don't actually need "this."

For most general programing, the this keyword is optional and generally used to avoid confusion. However, there are a few places where it is needed.
class Foo {
int val;
public Foo(int val) {
this(val, 0); //this MUST be here to refer to another constructor
}
public Foo(int val, int another) {
val = val; //this will work, but it generally not recommended.
this.val = val; //both are the same, but this is more useful.
method1(); //in a Foo instance, it will refer to this.method1()
this.method1(); //but in a Foo2 instance, you must use this to do the same
}
public void method1() {}
}
class Foo2 extends Foo {
public Foo2(int val) {
this(val); //this will refer to the other Foo2 constructor
}
public Foo2(int val, int another) {
super(val, another);
super.method1(); //this will refer to Foo.method1()
}
#Override
public void method1() {}//overridden method
}
These are not all the cases, but some of the more general ones. I hope this helps you better understand the this and super keywords and how/when to use them.

The only reason to prepend this in front of a method invocation is to indicate that you're calling a non-static method. I can't think of any other valid reason to do this (correct me I'm wrong). I don't recommend this convention as it doesn't add much value. If not applied consistently then it could be misleading (as a this-less method invocation could still be a non-static method). How often does one care if the method being invoked is static or not? Furthermore, most IDEs will highlight static methods differently.
I have heard of conventions where this indicates calling the subclass's method while an absence of this is calling the super class's method. But this is just silly as the convention could be the other way around.
Edit: As mmyers points out (see comment), this works with static methods. With that, I see absolutely no reason to prepend with this as it doesn't make any difference.

The only time it is really required is when you have a parameter to a method with the same name as a member variable. Personally, I try to always use it to make the scope of the variable/method explicit. For example you could have a static method or an instance method. When reading the code it can be helpful to know which is which.

Not an answer (so feel free to vote it down), but I couldn't fit this into a comment where someone was asking.
A lot of people use "this.x" to visually differentiate instance variables from local variables and parameters.
So they would do this:
private int sum;
public int storeSquare (int b) {
int c=b*b;
this.sum+=c; // Makes sum "pop" I guess
return c;
}
Personally I think it's a bad habit: any usable editor will put instance and local variables in a different color for you reliably--it doesn't require any human-fallible patterns.
Doing it with "this." is only 50% safe. Sure the compiler will catch it if you try to put this.x when x is a local variable, but there is nothing that is going to stop you from "Forgetting" to tag an instance variable with this., and if you forget to tag just one (or if someone else works on your code) and you are relying on the pattern, then the pattern may be more damaging than good
Personally I'm fairly sure the pattern stems from programmers (rightful) discomfort with the fact that in this case:
public void setMe(int me) {
this.me=me;
}
the fact that you need "this." in front of the me is determined by the name of the parameter--I agree it just feels sloppy. You want to be consistent--if you need this. in front of the me there, why not always use it?
Although I understand the discomfort, typing this. every single place that an instance variable is used is just pedantic, pointless, ugly and unreliable. If it really bothers you and you absolutely need to use a pattern to solve it, try the habit of putting "p" in front of your parameters. As a side effect, it should even make it more constant because the parameter case will now match the method case..
public void setMe( int pMe)

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html
You absolutely need this if your method needs to return the object's instance.
public class StringBuildable {
public StringBuildable append(String text) {
// Code to insert the string -- previously this.internalAppend(text);
return this;
}
}
This allows you to chain methods together in the following fashion:
String string = new StringBuildable()
.append("hello")
.append(' ')
.append.("World")
.toString()
;

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

scopes of variables in Java - java

How will field1 know value of field2 before it has been defined and assigned a value?

Related

In Java, is there a difference between initializing an object in constructor or in declaration? [duplicate]

Access to final field via getter before its initializing

how to declare a helper method in java, to be used locally being able to reference variables

compiler optimizes away a final field added to a serialized class

Using "this" with methods (in Java)

Categories

Resources