Allowing the this reference to escape

I would appreciate help in understanding the following from 'Java Concurrency in Practice':
Calling an overrideable instance method (one that is neither
private nor final) from the constructor can also allow the
this reference to escape.
Does 'escape' here simply mean that we may be calling an instance method before the instance is fully constructed?
I do not see 'this' escaping the scope of the instance in any other way.
How does 'final' prevent this from happening? Is there some aspect of 'final' in instance creation that I am missing?

It means calling code outside the class, and passing this.
That code will assume that the instance is fully initialized, and may break if it isn't.
Similarly, your class might assume that some methods will only be called after the instance is fully initialized, but the external code is likely to break those assumptions.
final methods cannot be overridden, so you can trust them to not pass this around.
If you call any non-final method in the constructor for a non-final class, a derived class might override that method and pass this anywhere.
Even when you call final methods, you still need to make sure that they are safely written: that they do not pass this anywhere, and that they themselves don't call any non-final methods.

"Escape" means that a reference to the partially-constructed this object might be passed to some other object in the system. Consider this scenario:
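public class Foo {
    public Foo() {
        setup(); // overridable call from the constructor
    }
    protected void setup() {
        // do stuff
    }
}
public class Bar extends Foo implements SomeListener {
    @Override protected void setup() {
        // 'this' is handed out before Bar's construction has completed
        otherObject.addListener(this);
    }
}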
The problem is that the new Bar object is being registered with otherObject before its construction is completed. Now if otherObject starts calling methods on barObject, fields might not have been initialized, or barObject might otherwise be in an inconsistent state. A reference to the barObject (this to itself) has "escaped" into the rest of the system before it's ready.
Instead, if the setup() method is final on Foo, the Bar class can't put code in there that will make the object visible before the Foo constructor finishes.

I believe the example is something like
public class Foo {
public Foo() {
doSomething();
}
public void doSomething() {
System.out.println("do something acceptable");
}
}
public class Bar extends Foo {
public void doSomething() {
System.out.println("yolo");
Zoom zoom = new Zoom(this); // at this point 'this' might not be fully initialized
}
}
Because the super constructor is always called first (either implicitly or explicitly), the doSomething will always get called for a child class. Because the above method is neither final nor private, you can override it in a child class and do whatever you want, which may conflict with what Foo#doSomething() was meant to do.
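To make the effect described above observable, here is a minimal sketch. The names EscapeBase, EscapeDerived and their members are hypothetical and not taken from the book; the only point is that an overridden method invoked from the superclass constructor runs before the subclass's field initializers have executed.

```java
import java.util.ArrayList;
import java.util.List;

class EscapeBase {
    EscapeBase() {
        init(); // overridable call: dispatches to the subclass override
    }

    void init() {
        // nothing to do in the base class
    }
}

class EscapeDerived extends EscapeBase {
    // Field initializers run only after EscapeBase's constructor has returned.
    private final List<String> items = new ArrayList<>();

    // Deliberately has no initializer, so the value recorded in init() survives.
    boolean itemsWasNullDuringInit;

    @Override
    void init() {
        // 'this' is already usable here, but 'items' has not been assigned yet.
        itemsWasNullDuringInit = (items == null);
    }

    boolean itemsIsNullNow() {
        return items == null;
    }
}

class EscapeDemo {
    public static void main(String[] args) {
        EscapeDerived d = new EscapeDerived();
        System.out.println("items was null inside init(): " + d.itemsWasNullDuringInit);
        System.out.println("items is null after construction: " + d.itemsIsNullNow());
    }
}
```

If init() had passed this to other code instead of merely inspecting a field, that code would have seen the same half-built object.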

Per the CERT Oracle Secure Coding Standard for Java:
Example BAD code:
final class Publisher {
public static volatile Publisher published;
int num;
Publisher(int number) {
published = this;
// Initialization
this.num = number;
// ...
}
}
If an object's initialization (and consequently, its construction) depends on a security check within the constructor, the security check can be bypassed when an untrusted caller obtains the partially initialized instance. See rule OBJ11-J. Be wary of letting constructors throw exceptions for more information.
final class Publisher {
public static Publisher published;
int num;
Publisher(int number) {
// Initialization
this.num = number;
// ...
published = this;
}
}
Because the field is nonvolatile and nonfinal, the statements within
the constructor can be reordered by the compiler in such a way that
the this reference is published before the initialization statements
have executed.
Correct code:
final class Publisher {
static volatile Publisher published;
int num;
Publisher(int number) {
// Initialization
this.num = number;
// ...
published = this;
}
}
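As a rough illustration of why the volatile variant is considered safe, the sketch below has a second thread wait for the published reference and then read num. SafePublisher mirrors the class above; SafePublisherDemo and readFromAnotherThread are hypothetical names added for the demonstration. It relies on the assumption that the volatile write is the last statement of the constructor of a final class.

```java
final class SafePublisher {
    static volatile SafePublisher published;
    int num;

    SafePublisher(int number) {
        this.num = number;   // initialization first
        published = this;    // volatile write as the last statement
    }
}

class SafePublisherDemo {
    // Starts a reader thread, publishes an instance, and returns what the reader saw.
    static int readFromAnotherThread() {
        SafePublisher.published = null;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            SafePublisher p;
            // Spin until the volatile field becomes visible.
            while ((p = SafePublisher.published) == null) {
                Thread.yield();
            }
            // The volatile read above happens-after the volatile write,
            // so the earlier write to num is visible here.
            seen[0] = p.num;
        });
        reader.start();
        new SafePublisher(42);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readFromAnotherThread());
    }
}
```

A passing run of the non-volatile variant would prove nothing, since reordering is permitted but not guaranteed to happen; that is why only the safe variant is exercised here.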
The this reference is said to have escaped when it is made available
beyond its current scope. Following are common ways by which the this
reference can escape:
Returning this from a non-private, overridable method that is invoked from the constructor of a class whose object is being
constructed. (For more information, see rule MET05-J. Ensure that
constructors do not call overridable methods.)
Returning this from a nonprivate method of a mutable class, which allows the caller to manipulate the object's state indirectly. This
commonly occurs in method-chaining implementations; see rule VNA04-J.
Ensure that calls to chained methods are atomic for more information.
Passing this as an argument to an alien method invoked from the constructor of a class whose object is being constructed.
Using inner classes. An inner class implicitly holds a reference to the instance of its outer class unless the inner class is declared
static.
Publishing by assigning this to a public static variable from the constructor of a class whose object is being constructed.
Throwing an exception from a constructor. Doing so may cause code to be vulnerable to a finalizer attack; see rule OBJ11-J. Be wary of
letting constructors throw exceptions for more information.
Passing internal object state to an alien method. This enables the method to retrieve the this reference of the internal member object.
This rule describes the potential consequences of allowing the this
reference to escape during object construction, including race
conditions and improper initialization. For example, declaring a field
final ordinarily ensures that all threads see the field in a fully
initialized state; however, allowing the this reference to escape
during object construction can expose the field to other threads in an
uninitialized or partially initialized state. Rule TSM03-J. Do not
publish partially initialized objects, which describes the guarantees
provided by various mechanisms for safe publication, relies on
conformance to this rule. Consequently, programs must not allow the
this reference to escape during object construction.
In general, it is important to detect cases in which the this
reference can leak out beyond the scope of the current context. In
particular, public variables and methods should be carefully
scrutinized.

create inner class instance in constructor

I am reading the book Java Concurrency in Practice. In section 3.2, it gives the following code example to illustrate implicitly allowing the this reference to escape (don't do this, especially in a constructor):
public class ThisEscape {
public ThisEscape(EventSource source) {
source.registerListener (
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
}
);
}
}
The book then says:
When ThisEscape publishes the EventListener, it implicitly
publishes the enclosing ThisEscape instance as well, because inner
class instances contain a hidden reference to the enclosing instance.
I understand the above words from Java's perspective, but I can't come up with an example of how the EventListener leaking the enclosing this reference could be harmful. In what way?
For example, if I create a new instance of ThisEscape:
ThisEscape myEscape = new ThisEscape(mySource);
Then what? How is it harmful now?
Could someone please use the above code as the base and explain to me how it is harmful?
======= MORE ======
The book seems to be saying that the anonymous EventListener holds a hidden reference to the containing class instance, which is not yet fully constructed. I want to know how this not-fully-constructed reference could be misused, and I would prefer to see a code example of it.
The book gives the right way of doing things, which is to use a static factory method as below:
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener (safe.listener);
return safe;
}
I just don't get the point of the whole thing.

Problem 1: Operating on non-fully-constructed object
Consider this slightly modified example:
public class ThisEscape {
private String prefixText = null;
private void doSomething(Event e) {
System.out.println(prefixText.toUpperCase() + e.toString());
}
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e); // hidden reference to `ThisEscape` is used
}
}
);
// What if an event is fired at this point from another thread?
// prefixText is not yet assigned,
// and doSomething() relies on it being not-null
prefixText = "Received event: ";
}
}
This would introduce a subtle and very hard-to-find bug, for example in multithreaded applications.
Suppose the event source fires an event after source.registerListener(...) has completed, but before prefixText has been assigned. This could happen in a different thread.
In this case, doSomething() would access the not-yet-initialized prefixText field, which would result in a NullPointerException. In other scenarios, the result could be invalid behavior or wrong calculation results, which would be even worse than an exception. This kind of error is extremely hard to find in real-world applications, mostly because it happens sporadically.
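One way to avoid the race in Problem 1 is the factory-method idea quoted in the question: finish construction first, register the listener afterwards. The sketch below is only an illustration under assumptions. Event, EventListener and EventSource are hypothetical minimal stand-ins for the types the question leaves undefined, and the received list exists only so the result can be inspected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-ins for the types the question leaves undefined.
interface Event { }

interface EventListener {
    void onEvent(Event e);
}

class EventSource {
    private final List<EventListener> listeners = new ArrayList<>();

    void registerListener(EventListener listener) {
        listeners.add(listener);
    }

    void fire(Event e) {
        for (EventListener listener : listeners) {
            listener.onEvent(e);
        }
    }
}

class SafeListener {
    private final String prefixText;
    private final List<String> received = new ArrayList<>();
    private final EventListener listener;

    // Private constructor: no other code can see the object
    // until newInstance() hands it out.
    private SafeListener() {
        prefixText = "Received event: ";
        listener = new EventListener() {
            @Override
            public void onEvent(Event e) {
                doSomething(e);
            }
        };
    }

    // Registration happens only after the constructor has returned.
    static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.registerListener(safe.listener);
        return safe;
    }

    private void doSomething(Event e) {
        received.add(prefixText + e);
    }

    List<String> received() {
        return received;
    }
}
```

The inner class still holds a reference to the enclosing SafeListener, but by the time the event source can call it, prefixText has been assigned. The sketch is single-threaded; a real listener that may be called from several threads would also need a thread-safe collection.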
Problem 2: Garbage collection
The hidden reference to the enclosing instance would hinder the garbage collector from cleaning up the "enclosing instance" in certain cases.
This would happen if the enclosing instance isn't needed anymore by the program logic, but the instance of the inner class it produced is still needed.
If the "enclosing instance" in turn holds references to a lot of other objects which aren't needed by the program logic, then it would result in a massive memory leak.
A code example.
Given a slightly modified ThisEscape class from the question:
public class ThisEscape {
private long[] aVeryBigArray = new long[4711 * 815];
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
private void doSomething(Event e) {
System.out.println(e.toString());
}
}
);
}
}
Please note that the inner anonymous class (which extends/implements EventListener) is non-static and thus contains a hidden reference to the instance of the containing class (ThisEscape).
Also note that the anonymous class doesn't actually use this hidden reference: no non-static methods or fields from the containing class are used from within the anonymous class.
Now this could be a possible usage:
// Register an event listener to print the event to System.out
new ThisEscape(myEventSource);
With this code we wanted to register an event listener with myEventSource. We do not need the instance of ThisEscape anymore.
But assuming that the EventSource.registerListener(EventListener) method stores a reference to the event listener created within ThisEscape, and the anonymous event listener holds a hidden reference to the containing class instance, the instance of ThisEscape can't be garbage-collected.
I've intentionally put a big non-static long array into ThisEscape, to demonstrate that the ThisEscape class instance could actually hold a lot of data (directly or indirectly), so the memory leak can be significant.
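A common way to sidestep the retention described above is to make the listener a static nested class, which has no enclosing instance at all. The sketch below assumes hypothetical stand-ins for Event, EventListener and EventSource, and the names NoHiddenReference and PrintingListener are invented for the illustration. Whether a compiler keeps an unused hidden reference in an anonymous class may vary between javac versions, so declaring the class static removes the doubt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-ins for the types the question leaves undefined.
interface Event { }

interface EventListener {
    void onEvent(Event e);
}

class EventSource {
    private final List<EventListener> listeners = new ArrayList<>();

    void registerListener(EventListener listener) {
        listeners.add(listener);
    }

    List<EventListener> listeners() {
        return listeners;
    }
}

class NoHiddenReference {
    private final long[] aVeryBigArray = new long[4711 * 815];

    NoHiddenReference(EventSource source) {
        // The listener is published, but it carries no link back to 'this'.
        source.registerListener(new PrintingListener());
    }

    // Static nested class: instances have no enclosing NoHiddenReference.
    private static final class PrintingListener implements EventListener {
        @Override
        public void onEvent(Event e) {
            System.out.println(e);
        }
    }
}
```

With this shape, the event source keeps only the small listener alive; the NoHiddenReference instance and its array become unreachable as soon as the caller drops them.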

The issue with publishing an object mid-construction, in a multithreaded context, is that the object may be used before construction completes (or after the constructor raised an exception).
Even if the publishing happens as the last explicit step in the constructor, there are three things to keep in mind:
The order of side effects within a thread does not determine the order in which those side effects become visible to other threads. So even if the constructor is written in such a way that it fully populates the object before it publishes a reference to it, there's no guarantee that other threads will see the fully populated object when they read the reference.
A final field normally has special concurrency properties, but these properties depend on reaching the end of the constructor before the object becomes visible to other threads. If other threads perceive the object before it's fully constructed, then they may not even see the correct values of final fields.
Superclass constructors are called before any initialization happens in a subclass. So, for example, if a subclass contains the field String foo = "foo", then during the superclass constructor, the field will still be null, which will affect the results of virtual methods that use it. So if a reference to the object is published during the superclass constructor, other threads can act on the object while it's in an incomplete (and bizarre) state.

Shared instance variable vs local variable

Is there a reason to prefer using shared instance variable in class vs. local variable and have methods return the instance to it? Or is either one a bad practice?
import package.AClass;
public class foo {
private AClass aVar = new AClass();
// ... Constructor
public AClass returnAClassSetted() {
doStuff(aVar);
return aVar;
}
private void doStuff(AClass a) {
aVar = a.setSomething("");
}
}
vs.
import package.AClass;
public class foo {
// ... Constructor
public AClass returnAClassSetted() {
AClass aVar = new AClass();
aVar = doStuff();
return aVar;
}
private AClass doStuff() {
AClass aVar1 = new AClass();
aVar1.setSomething("");
return aVar1;
}
}
First one makes more sense to me in so many ways but I often see code that does the second. Thanks!

Instance variables are shared by all methods in the class. When one method changes the data, another method can be affected by it. It means that you can't understand any one method on its own since it is affected by the code in the other methods in the class. The order in which methods are called can affect the outcome. The methods may not be reentrant. That means that if the method is called, again, before it finishes its execution (say it calls a method that then calls it, or fires an event which then a listener calls the method) then it may fail or behave incorrectly since the data is shared. If that wasn't enough potential problems, when you have multithreading, the data could be changed while you are using it causing inconsistent and hard to reproduce bugs (race conditions).
Using local variables keeps the scope minimized to the smallest amount of code that needs it. This makes it easier to understand, and to debug. It avoids race conditions. It is easier to ensure the method is reentrant. It is a good practice to minimize the scope of data.
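The re-entrancy point can be shown without any threads. In this hypothetical sketch (BracketWrapper and its methods are invented for the illustration), the same recursive logic is written once with a shared field and once with a local variable; the nested call clobbers the shared buffer while the outer call is still using it.

```java
class BracketWrapper {
    // Shared scratch state: every call, including nested ones, uses this buffer.
    private final StringBuilder sharedBuffer = new StringBuilder();

    // Wraps s in depth + 1 pairs of brackets, or at least tries to.
    String wrapShared(String s, int depth) {
        sharedBuffer.setLength(0);
        sharedBuffer.append('[');
        // The nested call resets and refills sharedBuffer underneath this call.
        sharedBuffer.append(depth > 0 ? wrapShared(s, depth - 1) : s);
        sharedBuffer.append(']');
        return sharedBuffer.toString();
    }

    // Same logic, but each call owns its buffer.
    String wrapLocal(String s, int depth) {
        StringBuilder buffer = new StringBuilder();
        buffer.append('[');
        buffer.append(depth > 0 ? wrapLocal(s, depth - 1) : s);
        buffer.append(']');
        return buffer.toString();
    }

    public static void main(String[] args) {
        BracketWrapper w = new BracketWrapper();
        System.out.println(w.wrapLocal("x", 1));  // [[x]]
        System.out.println(w.wrapShared("x", 1)); // [x][x]]
    }
}
```

The local version cannot be disturbed by a nested or concurrent call, which is the argument the answer makes for minimizing scope.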

Your class name should have been Foo.
The two versions you have are not the same, and it should depend on your use case.
The first version returns the same AClass object when different callers call the returnAClassSetted() method using the same Foo object. If one of them changes the state of the returned AClass object, all of them will see the change. Your Foo class is effectively a Singleton.
The second version returns a new AClass object every time a caller calls the returnAClassSetted() method, whether on the same or a different Foo object. Your Foo class is effectively a Builder.
Also, if you want the second version, remove AClass aVar = new AClass(); and just use AClass aVar = doStuff();, because you are throwing away the first AClass object created by new AClass().
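The difference between the two versions comes down to object identity, which a small sketch can make visible. AClass here is a hypothetical stand-in with a void setter, and FooWithField and FooWithLocal are invented names for the two variants from the question.

```java
// Hypothetical stand-in for the AClass used in the question.
class AClass {
    private String something;

    void setSomething(String value) {
        something = value;
    }

    String getSomething() {
        return something;
    }
}

class FooWithField {
    private final AClass aVar = new AClass();

    AClass returnAClassSetted() {
        aVar.setSomething("");
        return aVar; // the same object on every call
    }
}

class FooWithLocal {
    AClass returnAClassSetted() {
        AClass aVar = new AClass();
        aVar.setSomething("");
        return aVar; // a fresh object on every call
    }
}
```

With FooWithField, a change made through one returned reference is visible through every other one; with FooWithLocal, each caller gets an independent object.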

It's not a yes/no question; it depends on the situation and your needs. Declaring a variable in the smallest scope possible is considered best practice. However, there may be cases (like this one) where, depending on the task, it is better to declare it outside the methods. If you declare it as a field there will be one AClass instance, whereas the local-variable version creates two.

Instance properties represent the state of a specific instance of that Class. It might make more sense to think about a concrete example. If the class is Engine, one of the properties that might represent the state of the Engine might be
private boolean running;
... so given an instance of Engine, you could call engine.isRunning() to check the state.
If a given property is not part of the state (or composition) of your Class, then it might be best suited to be a local variable within a method, as implementation detail.

Instance variables are given default values: null for an object reference, 0 for an int.
Local variables don't get default values, so they need to be explicitly initialized; the compiler generates an error if you read one before assigning it.
Further,
local variables are only visible in the method or block in which they are declared, whereas an instance variable can be seen by all methods in the class.
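A short sketch of the default-value rule; the class and member names are hypothetical. The commented-out line marks the statement the compiler would reject.

```java
class DefaultValues {
    Object reference; // fields get defaults: null for references
    int number;       // 0 for int
    boolean flag;     // false for boolean

    static int localMustBeAssigned() {
        int local;        // locals have no default value
        // return local;  // would not compile: variable might not have been initialized
        local = 7;
        return local;
    }
}
```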

Why is a subclass' static initializer not invoked when a static method declared in its superclass is invoked on the subclass?

Given the following classes:
public abstract class Super {
protected static Object staticVar;
protected static void staticMethod() {
System.out.println( staticVar );
}
}
public class Sub extends Super {
static {
staticVar = new Object();
}
// Declaring a method with the same signature here,
// thus hiding Super.staticMethod(), avoids staticVar being null
/*
public static void staticMethod() {
Super.staticMethod();
}
*/
}
public class UserClass {
public static void main( String[] args ) {
new UserClass().method();
}
void method() {
Sub.staticMethod(); // prints "null"
}
}
I'm not looking for answers like "Because it's specified like this in the JLS." I know it is, since JLS 12.4.1, When Initialization Occurs, reads just:
A class or interface type T will be initialized immediately before the first occurrence of any one of the following:
...
T is a class and a static method declared by T is invoked.
...
I'm interested in whether there is a good reason why there is not a sentence like:
T is a subclass of S and a static method declared by S is invoked on T.

Be careful with your title: static fields and methods are NOT inherited. This means that when you comment out staticMethod() in Sub, Sub.staticMethod() actually calls Super.staticMethod(), so the static initializer of Sub is not executed.
However, the question is more interesting than I thought at first sight: in my view, this shouldn't compile without a warning, just like when one calls a static method on an instance of the class.
EDIT: As @GeroldBroser pointed out, the first statement of this answer is wrong. Static methods are inherited as well, but never overridden, simply hidden. I'm leaving the answer as is for history.

I think it has to do with this part of the jvm spec:
Each frame (§2.6) contains a reference to the run-time constant pool (§2.5.5) for the type of the current method to support dynamic linking of the method code. The class file code for a method refers to methods to be invoked and variables to be accessed via symbolic references. Dynamic linking translates these symbolic method references into concrete method references, loading classes as necessary to resolve as-yet-undefined symbols, and translates variable accesses into appropriate offsets in storage structures associated with the run-time location of these variables.
This late binding of the methods and variables makes changes in other classes that a method uses less likely to break this code.
In chapter 5 in the jvm spec they also mention:
A class or interface C may be initialized, among other things, as a result of:
The execution of any one of the Java Virtual Machine instructions new, getstatic, putstatic, or invokestatic that references C (§new, §getstatic, §putstatic, §invokestatic). These instructions reference a class or interface directly or indirectly through either a field reference or a method reference.
...
Upon execution of a getstatic, putstatic, or invokestatic instruction, the class or interface that declared the resolved field or method is initialized if it has not been initialized already.
It seems to me the first bit of documentation states that any symbolic reference is simply resolved and invoked without regard as to where it came from. This documentation about method resolution has the following to say about that:
[M]ethod resolution attempts to locate the referenced method in C and its superclasses:
If C declares exactly one method with the name specified by the method reference, and the declaration is a signature polymorphic method (§2.9), then method lookup succeeds. All the class names mentioned in the descriptor are resolved (§5.4.3.1).
The resolved method is the signature polymorphic method declaration. It is not necessary for C to declare a method with the descriptor specified by the method reference.
Otherwise, if C declares a method with the name and descriptor specified by the method reference, method lookup succeeds.
Otherwise, if C has a superclass, step 2 of method resolution is recursively invoked on the direct superclass of C.
So the fact that it's called from a subclass seems to simply be ignored. Why do it this way? In the documentation you provided they say:
The intent is that a class or interface type has a set of initializers that put it in a consistent state, and that this state is the first state that is observed by other classes.
In your example, you alter the state of Super when Sub is statically initialized. If initialization happened when you called Sub.staticMethod you would get different behavior for what the jvm considers the same method. This might be the inconsistency they were talking about avoiding.
Also, here's some of the decompiled class file code that executes staticMethod, showing use of invokestatic:
Constant pool:
...
#2 = Methodref #18.#19 // Sub.staticMethod:()V
...
Code:
stack=0, locals=1, args_size=1
0: invokestatic #2 // Method Sub.staticMethod:()V
3: return

The JLS specifically allows the JVM to avoid initializing the Sub class; it's in the section quoted in the question:
A reference to a static field (§8.3.1.1) causes initialization of only the class or interface that actually declares it, even though it might be referred to through the name of a subclass, a subinterface, or a class that implements an interface.
The reason is to avoid having the JVM load classes unnecessarily. Initializing static variables is not an issue because they are not getting referenced anyway.
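The rule can be observed directly by letting each static initializer record that it ran. This is a sketch with hypothetical class names (InitLog, SuperInit, SubInit), not the classes from the question.

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    // Records the order in which the classes below are initialized.
    static final List<String> events = new ArrayList<>();
}

class SuperInit {
    static {
        InitLog.events.add("SuperInit");
    }

    static String hello() {
        return "hello";
    }
}

class SubInit extends SuperInit {
    static {
        InitLog.events.add("SubInit");
    }

    static String own() {
        return "own";
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        SubInit.hello(); // declared by SuperInit: only SuperInit is initialized
        System.out.println(InitLog.events);
        SubInit.own();   // declared by SubInit: now SubInit is initialized too
        System.out.println(InitLog.events);
    }
}
```

Calling the inherited method through the subclass name initializes only the declaring class; the subclass initializer runs when a member it declares itself is first used.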

The reason is quite simple: so that the JVM does not do extra work prematurely (Java is lazy by nature).
Whether you write Super.staticMethod() or Sub.staticMethod(), the same implementation is called. And this parent's implementation typically does not depend on subclasses. Static methods of Super are not supposed to access members of Sub, so what's the point in initializing Sub then?
Your example seems to be artificial and not well-designed.
Making a subclass rewrite static fields of its superclass does not sound like a good idea. In this case the outcome of Super's methods will depend on which class is touched first. This also makes it hard to have multiple children of Super with their own behavior. In short, static members are not for polymorphism; that's what OOP principles say.

According to this article, when you call a static method or use a static field of a class, only that class will be initialized.

For some reason the JVM decides that the static block is not needed, and it is not executed.
I believe this is because you are not using any member declared by the subclass, so the JVM sees no reason to initialize the class itself. The method call is statically bound to the parent's method; there is no late binding for static methods.
http://ideone.com/pUyVj4
static {
System.out.println("init");
staticVar = new Object();
}
Add some other method to Sub and call it before the call through Sub:
Sub.someOtherMethod();
new UserClass().method();
or force initialization explicitly with Class.forName("Sub"):
Class.forName("Sub");
new UserClass().method();
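The Class.forName workaround can be sketched as follows. All names here (ForcedInitLog, ParentType, ChildType, ForcedInitDemo) are hypothetical. The class name is obtained from a class literal so the example does not depend on the package it is placed in; evaluating the literal alone does not initialize the class.

```java
import java.util.ArrayList;
import java.util.List;

class ForcedInitLog {
    static final List<String> events = new ArrayList<>();
}

class ParentType {
    static String inherited() {
        return "inherited";
    }
}

class ChildType extends ParentType {
    static {
        ForcedInitLog.events.add("ChildType initialized");
    }
}

class ForcedInitDemo {
    // Forces initialization of ChildType without touching any of its members.
    static void forceInit() {
        try {
            // The one-argument Class.forName initializes the class it loads.
            Class.forName(ChildType.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ChildType.inherited();
        System.out.println(ForcedInitLog.events); // still empty
        forceInit();
        System.out.println(ForcedInitLog.events);
    }
}
```

Relying on this is fragile, since every caller has to remember to force the initialization; it shows the mechanism rather than a recommended design.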

When a static block is executed (Static Initializers):
A static initializer declared in a class is executed when the class is initialized.
When you call Sub.staticMethod(), the Sub class is not initialized; you are only referring to the method through it.
When a class is initialized:
After class loading, initialization of the class takes place, which means initializing all static members of the class. A class is initialized in Java when:
1) an instance of the class is created using either the new keyword or reflection via Class.forName(), which may throw ClassNotFoundException.
2) a static method of the class is invoked.
3) a static field of the class is assigned.
4) a static field of the class is used which is not a constant variable.
5) the class is a top-level class and an assert statement lexically nested within the class is executed.
When a class is loaded and initialized in JVM - Java
That's why you're getting null (the default value of an unassigned reference field).
public class Sub extends Super {
static {
staticVar = new Object();
}
public static void staticMethod() {
Super.staticMethod();
}
}
In this case the class is initialized and the new Object is printed (its class name and hash code). If you do not declare staticMethod() in Sub, you are referring to the superclass method
and the Sub class is not initialized.

Does self-reference in the constructor count as "escaping"?

Reading this article about JSR-133, it says:
all of the writes to final fields (and to variables reachable
indirectly through those final fields) become "frozen," ...
If an object's reference is not allowed to escape during construction,
then once a constructor has completed and a thread publishes a
reference to an object, that object's final fields are guaranteed to
be visible ...
The one caveat with initialization safety is that the object's
reference must not "escape" its constructor -- the constructor should
not publish, directly or indirectly, a reference to the object being
constructed.
My question is about what is considered escaping. More specifically, I want to know if this (somewhat artificial and strange) code results in a safely-publishable Child object:
class Parent {
/** NOT final. */
private int answer;
public int getAnswer() {
return answer;
}
public void setAnswer(final int _answer) {
answer = _answer;
}
}
public class Child extends Parent {
private final Object self;
public Child() {
super.setAnswer(42);
self = this;
}
@Override
public void setAnswer(final int _answer) {
throw new UnsupportedOperationException();
}
}
Firstly, while Parent is clearly mutable, Child is "effectively immutable", since the parent setter that would allow mutability is not reachable anymore.
The reference to "this" in the constructor is not visible to anyone (there is no getter, and it is not passed to any other object). So, does this count as "escaping"?
But the object as a whole is being referenced by a final field (self), and so in theory, its whole content should then be "frozen". OTOH, the final field is itself not reachable, so maybe it doesn't count; I could very well imagine the JIT just completely optimizing it away.
If "self" was made accessible through a getter, but the getter is not called in the constructor, does it then count as escaping (assuming it didn't before)? This would prevent the JIT from optimizing it away, so that it must then "count", maybe?
So, is Child "safely-publishable", and if not, why, and would a getter for "self" change the answer?
In case the purpose of the question isn't clear, I think that if this works, it would allow one to easily make a mutable class "safely-publishable", by just extending it as shown above.

You may be misunderstanding the meaning of escaping. The point is that the value of this must not reach any code foreign to the constructor. I think a few examples would explain it better:
setting a private field to this doesn't count as escaping;
calling a private method, which in turn doesn't call any further methods, and doesn't assign this to a foreign object's variable, doesn't count as escaping;
calling a public, overridable method belonging to this does count as escaping unless the class is final. Therefore your code lets this escape when you call setAnswer, not when you assign this to self. Why? Because a subclass may override this method and publish this to any foreign code.
A note on your reasoning about self: self is reachable from this and this doesn't depend on the fact that a foreign caller cannot get its value. It is enough that a method may internally dereference it. Anyway, the rules about freezing do not take into account the access level of variables. For example, everything is reachable via reflection.

method local innerclasses accessing the local variables of the method

Hi, I was going through the SCJP book's section on inner classes and found this statement, which goes something like this:
A method-local class can only refer to local variables that are marked final.
The explanation given is about the scope and lifetime of the local class object on the heap and of the local variables, but I am unable to understand it. Am I missing anything here about final?

The reason is, when the method local class instance is created, all the method local variables it refers to are actually copied into it by the compiler. That is why only final variables can be accessed. A final variable or reference is immutable, so it stays in sync with its copy within the method local object. Were it not so, the original value / reference could be changed after the creation of the method local class, giving way to confusing behaviour and subtle bugs.
Consider this example from the JavaSpecialist newsletter no. 25:
public class Access1 {
public void f() {
final int i = 3;
Runnable runnable = new Runnable() {
public void run() {
System.out.println(i);
}
};
}
}
The compiler turns the inner class into this:
class Access1$1 implements Runnable {
Access1$1(Access1 access1) {
this$0 = access1;
}
public void run() {
System.out.println(3);
}
private final Access1 this$0;
}
Since the value of i is final, the compiler can "inline" it into the inner class.

As I see it, accessing local variables from method-local-classes (e.g. anonymous class) is a risky thing. It is allowed by the compiler, but it requires good understanding of what is going on.
When the inner class is instantiated, all the references to local variables it uses are copied, and passed as implicit constructor parameters (check the bytecode). Actually the compiler could have allowed making the references non-final, but it would be confusing, since it would not be clear what happens if the method alters the references after the instantiation.
However, making the reference final does not eliminate all problems. While the reference is immutable, the object behind the reference may still be mutable. Any mutations of the object done between the instantiation of the inner class until its activation will be seen by the inner class, and sometimes this is not the intention of the programmer.
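The last point, that a final reference does not freeze the object behind it, can be sketched like this. CaptureDemo and observedSize are hypothetical names; the anonymous class captures a copy of the reference, not a snapshot of the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class CaptureDemo {
    static int observedSize() {
        final List<String> names = new ArrayList<>();

        // The anonymous class receives a copy of the reference 'names'.
        Supplier<Integer> size = new Supplier<Integer>() {
            @Override
            public Integer get() {
                return names.size();
            }
        };

        // The reference cannot change, but the list it points to can.
        names.add("added after the anonymous class was created");

        return size.get(); // sees the later mutation
    }

    public static void main(String[] args) {
        System.out.println(observedSize()); // 1
    }
}
```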
