create inner class instance in constructor - java

I am reading the book Java Concurrency in Practice. In section 3.2, it gives the following code example to illustrate implicitly allowing the this reference to escape (don't do this, especially in a constructor):
public class ThisEscape {
public ThisEscape(EventSource source) {
source.registerListener (
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
}
);
}
}
The book then says :
When ThisEscape publishes the EventListener, it implicitly
publishes the enclosing ThisEscape instance as well, because inner
class instances contain a hidden reference to the enclosing instance.
I understand the above words from Java's perspective, but I can't come up with an example of how letting the enclosing this reference escape through the EventListener in the above code could be harmful. In what way?
For example, if I create a new instance of ThisEscape:
ThisEscape myEscape = new ThisEscape(mySource);
Then what? How is it harmful now? In which way is it harmful?
Could someone please use above code as the base and explain to me how it is harmful?
======= MORE ======
The book is trying to say that the anonymous EventListener holds a hidden reference to the containing class instance, which is not yet fully constructed. I want to know how this not-yet-fully-constructed reference could be misused, and I would prefer to see a code example of this point.
The book gives a right way of doing things, that's to use a static factory method as below:
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener (safe.listener);
return safe;
}
I just don't get the point of the whole thing.

Problem 1: Operating on non-fully-constructed object
Consider this slightly modified example:
public class ThisEscape {
private String prefixText = null;
private void doSomething(Event e) {
System.out.println(prefixText.toUpperCase() + e.toString());
}
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e); // hidden reference to `ThisEscape` is used
}
}
);
// What if an event is fired at this point from another thread?
// prefixText is not yet assigned,
// and doSomething() relies on it being not-null
prefixText = "Received event: ";
}
}
This would introduce a subtle and very hard-to-find bug, for example in multithreaded applications.
Consider that the event source fires an event after source.registerListener(...) has completed, but before prefixText was assigned. This could happen in a different thread.
In this case, doSomething() would access the not-yet-initialized prefixText field, which would result in a NullPointerException. In other scenarios, the result could be invalid behavior or wrong calculation results, which would be even worse than an exception. And this kind of error is extremely hard to find in real-world applications, mostly because it happens sporadically.
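The race above is hard to reproduce on demand, but the same failure can be shown deterministically. The following is a minimal sketch, not the book's code: Event, EventListener and EagerEventSource are hypothetical stand-ins, and EagerEventSource is assumed to deliver one event synchronously inside registerListener, which plays the role of another thread firing an event before the constructor has finished.

```java
import java.util.ArrayList;
import java.util.List;

class ThisEscapeDemo {
    // Hypothetical minimal stand-ins for the book's Event and EventListener types.
    static class Event {
        @Override public String toString() { return "Event"; }
    }

    interface EventListener {
        void onEvent(Event e);
    }

    // Assumption: this source fires one event synchronously during registration,
    // simulating an event that arrives before the ThisEscape constructor has finished.
    static class EagerEventSource {
        private final List<EventListener> listeners = new ArrayList<>();

        void registerListener(EventListener listener) {
            listeners.add(listener);
            listener.onEvent(new Event());
        }
    }

    static class ThisEscape {
        private String prefixText = null;

        ThisEscape(EagerEventSource source) {
            source.registerListener(new EventListener() {
                @Override public void onEvent(Event e) {
                    doSomething(e); // uses the hidden reference to the enclosing ThisEscape
                }
            });
            prefixText = "Received event: "; // too late: the event was already delivered
        }

        private void doSomething(Event e) {
            System.out.println(prefixText.toUpperCase() + e);
        }
    }

    // Returns true if constructing ThisEscape failed because prefixText was still null.
    static boolean constructionFails() {
        try {
            new ThisEscape(new EagerEventSource());
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NullPointerException during construction: " + constructionFails());
    }
}
```

In a real multithreaded program the listener would only occasionally run in that window, which is what makes the bug sporadic.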
Problem 2: Garbage collection
The hidden reference to the enclosing instance would hinder the garbage collector from cleaning up the "enclosing instance" in certain cases.
This would happen if the enclosing instance isn't needed anymore by the program logic, but the instance of the inner class it produced is still needed.
If the "enclosing instance" in turn holds references to a lot of other objects which aren't needed by the program logic, then it would result in a massive memory leak.
A code example.
Given a slightly modified ThisEscape class from the question:
public class ThisEscape {
private long[] aVeryBigArray = new long[4711 * 815];
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
private void doSomething(Event e) {
System.out.println(e.toString());
}
}
);
}
}
Please note that the inner anonymous class (which extends/implements EventListener) is non-static and thus contains a hidden reference to the instance of the containing class (ThisEscape).
Also note that the anonymous class doesn't actually use this hidden reference: no non-static methods or fields from the containing class are used from within the anonymous class.
Now this could be a possible usage:
// Register an event listener to print the event to System.out
new ThisEscape(myEventSource);
With this code we wanted to register an event listener with myEventSource. We do not need the instance of ThisEscape anymore.
But assuming that the EventSource.registerListener(EventListener) method stores a reference to the event listener created within ThisEscape, and the anonymous event listener holds a hidden reference to the containing class instance, the instance of ThisEscape can't be garbage-collected.
I've intentionally put a big non-static long array into ThisEscape, to demonstrate that the ThisEscape class instance could actually hold a lot of data (directly or indirectly), so the memory leak can be significant.
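One way to avoid the hidden reference is to make the listener a static nested class. The following is a minimal sketch under stated assumptions: Event, EventListener and EventSource are hypothetical stand-ins, NoEscape and PrintingListener are illustrative names that do not appear in the question, and a smaller array stands in for the big one. The reflection check only confirms that the listener class declares no field of the enclosing type.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class StaticListenerDemo {
    // Hypothetical minimal stand-ins for the question's types.
    static class Event { }

    interface EventListener {
        void onEvent(Event e);
    }

    // Assumption: the source stores every registered listener, keeping it reachable.
    static class EventSource {
        final List<EventListener> listeners = new ArrayList<>();

        void registerListener(EventListener listener) {
            listeners.add(listener);
        }
    }

    static class NoEscape {
        // Stand-in for the large state held by the enclosing instance.
        private final long[] aVeryBigArray = new long[1_000_000];

        // A static nested class has no hidden reference to a NoEscape instance.
        private static class PrintingListener implements EventListener {
            @Override public void onEvent(Event e) {
                System.out.println(e);
            }
        }

        NoEscape(EventSource source) {
            source.registerListener(new PrintingListener());
        }
    }

    // Returns true if the listener's class declares a field of type NoEscape.
    static boolean listenerReferencesOuter(EventListener listener) {
        for (Field field : listener.getClass().getDeclaredFields()) {
            if (field.getType() == NoEscape.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        EventSource source = new EventSource();
        new NoEscape(source);
        System.out.println("listener references NoEscape: "
                + listenerReferencesOuter(source.listeners.get(0)));
    }
}
```

Because the registered listener holds no reference to the NoEscape instance, that instance and its array become eligible for garbage collection as soon as the constructor call returns.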

The issue with publishing an object mid-construction, in a multithreaded context, is that the object may be used before construction completes (or after the constructor raised an exception).
Even if the publishing happens as the last explicit step in the constructor, there are three things to keep in mind:
The order of side effects within a thread does not determine the order in which those side effects become visible to other threads. So even if the constructor is written in such a way that it fully populates the object before it publishes a reference to it, there's no guarantee that other threads will see the fully populated object when they read the reference.
A final field normally has special concurrency properties, but these properties depend on reaching the end of the constructor before the object becomes visible to other threads. If other threads perceive the object before it's fully constructed, then they may not even see the correct values of final fields.
Superclass constructors are called before any initialization happens in a subclass. So, for example, if a subclass contains the field String foo = "foo", then during the superclass constructor, the field will still be null, which will affect the results of virtual methods that use it. So if a reference to the object is published during the superclass constructor, other threads can act on the object while it's in an incomplete (and bizarre) state.
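The third point can be observed even in a single thread. A minimal sketch, with hypothetical class names Base and Derived: the superclass constructor calls an overridable method, and the subclass override reads a field whose initializer has not run yet.

```java
class SuperConstructorDemo {
    static class Base {
        final String seenDuringConstruction;

        Base() {
            // Calls the overridable method before the subclass's field initializers have run.
            seenDuringConstruction = describe();
        }

        String describe() {
            return "base";
        }
    }

    static class Derived extends Base {
        private String foo = "foo"; // assigned only after Base() has returned

        @Override String describe() {
            return "foo=" + foo;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringConstruction); // prints "foo=null"
        System.out.println(d.describe());             // prints "foo=foo"
    }
}
```

If Base published this instead of merely calling describe(), another thread could observe the same half-initialized state.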

Related

Thread safety, static methods and some weird code

I recently stumbled upon a piece of code similar to the one below. This code reeks right off. It looks like a singleton, but it's not, because there is no private constructor. I am sure this is going to have thread safety issues given a big enough load, especially given the class instance. Can someone please point out the thread safety issues with this code?
public class AClass extends AnotherClass {
public static final AClass instance = new AClass();
public static SomeObject doSomethingThatCallsAService(Params params) {
return methodThatCallsService(params, instance);
}
public static SomeObject methodThatCallsService(Params params, AClass instance) {
// ... call service here ...
instance.doSomethingElse();
}
private void doSomethingElse() {
// ... do some trivial work ...
}
}
Given that the object does not carry state, there's no concern for thread safety, regardless of the number of threads calling methods on or having a reference to the singleton object.
All the methods in the class, including static ones, don't use any shared data. So whether they call methods on the singleton object or they pass the instance around, there's no need for access to anything to be synchronized.
As the code is, the only data that could possibly need synchronization is params in the argument to methodThatCallsService, and that's only if this method modifies the data and multiple threads hold a reference to the same Params object.
But as far as this class is concerned, it's thread-safe, even if the singleton implementation is vulnerable.

Shared instance variable vs local variable

Is there a reason to prefer using shared instance variable in class vs. local variable and have methods return the instance to it? Or is either one a bad practice?
import package.AClass;
public class foo {
private AClass aVar = new AClass();
// ... Constructor
public AClass returnAClassSetted() {
doStuff(aVar);
return aVar;
}
private void doStuff(AClass a) {
aVar = a.setSomething("");
}
}
vs.
import package.AClass;
public class foo {
// ... Constructor
public AClass returnAClassSetted() {
AClass aVar = new AClass();
aVar = doStuff();
return aVar;
}
private AClass doStuff() {
AClass aVar1 = new AClass();
aVar1.setSomething("");
return aVar1;
}
}
First one makes more sense to me in so many ways but I often see code that does the second. Thanks!
Instance variables are shared by all methods in the class. When one method changes the data, another method can be affected by it. It means that you can't understand any one method on its own since it is affected by the code in the other methods in the class. The order in which methods are called can affect the outcome. The methods may not be reentrant. That means that if the method is called, again, before it finishes its execution (say it calls a method that then calls it, or fires an event which then a listener calls the method) then it may fail or behave incorrectly since the data is shared. If that wasn't enough potential problems, when you have multithreading, the data could be changed while you are using it causing inconsistent and hard to reproduce bugs (race conditions).
Using local variables keeps the scope minimized to the smallest amount of code that needs it. This makes it easier to understand, and to debug. It avoids race conditions. It is easier to ensure the method is reentrant. It is a good practice to minimize the scope of data.
Your class name should have been Foo.
The two versions you have are not the same, and it should depend on your use case.
The first version returns the same AClass object when different callers call the returnAClassSetted() method on the same Foo object. If one of them changes the state of the returned AClass object, all of them will see the change. Your Foo class is effectively a Singleton.
The second version returns a new AClass object every time a caller calls returnAClassSetted() method using either the same or different Foo object. Your Foo class is effectively a Builder.
Also, if you want the second version, remove the AClass aVar = new AClass(); and just use AClass aVar = doStuff();. Because you are throwing away the first AClass object created by new AClass();
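The difference between the two versions can be checked with a reference comparison. This is a minimal sketch: AClass here is a hypothetical stand-in with a single field, and SharedHolder and FreshFactory are illustrative names for the first and second version of the question's class.

```java
class SharedVsFreshDemo {
    // Hypothetical stand-in for the question's AClass.
    static class AClass {
        String something = "";

        void setSomething(String value) {
            something = value;
        }
    }

    // First version: one AClass per holder, handed out to every caller.
    static class SharedHolder {
        private final AClass aVar = new AClass();

        AClass returnAClassSetted() {
            aVar.setSomething("");
            return aVar;
        }
    }

    // Second version: a new AClass for every call.
    static class FreshFactory {
        AClass returnAClassSetted() {
            AClass aVar = new AClass();
            aVar.setSomething("");
            return aVar;
        }
    }

    public static void main(String[] args) {
        SharedHolder shared = new SharedHolder();
        AClass a = shared.returnAClassSetted();
        AClass b = shared.returnAClassSetted();
        a.something = "changed by caller A";
        System.out.println(a == b);      // prints "true"
        System.out.println(b.something); // prints "changed by caller A"

        FreshFactory fresh = new FreshFactory();
        System.out.println(fresh.returnAClassSetted() == fresh.returnAClassSetted()); // prints "false"
    }
}
```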
It's not a yes/no question. It depends on the situation and your needs. Declaring a variable in the smallest possible scope is considered best practice. However, there may be cases (like this one) where, depending on the task, it's better to declare it outside the methods rather than inside them. If you declare it outside, there will be one instance; with the second version, there will be two.
Instance properties represent the state of a specific instance of that Class. It might make more sense to think about a concrete example. If the class is Engine, one of the properties that might represent the state of the Engine might be
private boolean running;
... so given an instance of Engine, you could call engine.isRunning() to check the state.
If a given property is not part of the state (or composition) of your Class, then it might be best suited to be a local variable within a method, as implementation detail.
Instance variables are given default values: null if it's an object reference, 0 if it's an int.
Local variables usually don't get default values, and therefore need to be explicitly initialized and the compiler generates an error if you fail to do so.
Further,
Local variables are only visible in the method or block in which they are declared whereas the instance variable can be seen by all methods in the class.
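The difference in default values can be seen in a short example. A minimal sketch, reusing the Engine idea from above; the rpm and model fields are hypothetical additions for illustration.

```java
class DefaultValuesDemo {
    static class Engine {
        boolean running; // defaults to false
        int rpm;         // defaults to 0
        String model;    // defaults to null
    }

    public static void main(String[] args) {
        Engine engine = new Engine();
        System.out.println(engine.running); // prints "false"
        System.out.println(engine.rpm);     // prints "0"
        System.out.println(engine.model);   // prints "null"

        int local;
        // System.out.println(local); // would not compile: local might not have been initialized
        local = 1;
        System.out.println(local);          // prints "1"
    }
}
```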

Does self-reference in the constructor counts as "escaping"?

Reading this article about JSR-133, it says:
all of the writes to final fields (and to variables reachable
indirectly through those final fields) become "frozen," ...
If an object's reference is not allowed to escape during construction,
then once a constructor has completed and a thread publishes a
reference to an object, that object's final fields are guaranteed to
be visible ...
The one caveat with initialization safety is that the object's
reference must not "escape" its constructor -- the constructor should
not publish, directly or indirectly, a reference to the object being
constructed.
My question is about what is considered escaping. More specifically, I want to know if this (somewhat artificial and strange) code results in a safely-publishable Child object:
class Parent {
/** NOT final. */
private int answer;
public int getAnswer() {
return answer;
}
public void setAnswer(final int _answer) {
answer = _answer;
}
}
public class Child extends Parent {
private final Object self;
public Child() {
super.setAnswer(42);
self = this;
}
@Override
public void setAnswer(final int _answer) {
throw new UnsupportedOperationException();
}
}
Firstly, while Parent is clearly mutable, Child is "effectively immutable", since the parent setter that would allow mutability is not reachable anymore.
The reference to "this" in the constructor is not visible to anyone (not getter, and not passed to any other object). So, does this count as "escaping"?
But the object as a whole is being referenced by a final field (self), and so in theory, its whole content should then be "frozen". OTOH, the final field is itself not reachable, so maybe it doesn't count; I could very well imagine the JIT just completely optimizing it away.
If "self" was made accessible through a getter, but the getter is not called in the constructor, does it then count as escaping (assuming it didn't before)? This would prevent the JIT from optimizing it away, so that it must then "count", maybe?
So, is Child "safely-publishable", and if not, why, and would a getter for "self" change the answer?
In case the purpose of the question isn't clear, I think that if this works, it would allow one to easily make a mutable class "safely-publishable", by just extending it as shown above.
You may be misunderstanding the meaning of escaping. The point is that the value of this must not reach any code foreign to the constructor. I think a few examples would explain it better:
setting a private field to this doesn't count as escaping;
calling a private method, which in turn doesn't call any further methods, and doesn't assign this to a foreign object's variable, doesn't count as escaping;
calling a public, overridable method belonging to this does count as escaping unless the class is final. Therefore your code lets this escape when you call setAnswer, not when you assign this to self. Why? Because a subclass may override this method and publish this to any foreign code.
A note on your reasoning about self: self is reachable from this and this doesn't depend on the fact that a foreign caller cannot get its value. It is enough that a method may internally dereference it. Anyway, the rules about freezing do not take into account the access level of variables. For example, everything is reachable via reflection.

Allowing the this reference to escape

I would appreciate help in understanding the following from 'Java Concurrency in Practice':
Calling an overrideable instance method(one that is neither
private nor final) from the constructor can also allow the
this reference to escape.
Does 'escape' here simply mean that we may be calling an instance method before the instance is fully constructed?
I do not see 'this' escaping the scope of the instance in any other way.
How does 'final' prevent this from happening? Is there some aspect of 'final' in instance creation that I am missing?
It means calling code outside the class, and passing this.
That code will assume that the instance is fully initialized, and may break if it isn't.
Similarly, your class might assume that some methods will only be called after the instance is fully initialized, but the external code is likely to break those assumptions.
final methods cannot be overridden, so you can trust them to not pass this around.
If you call any non-final method in the constructor for a non-final class, a derived class might override that method and pass this anywhere.
Even when you call final methods, you still need to make sure that they are safely written: that they do not pass this anywhere, and that they themselves don't call any non-final methods.
"Escape" means that a reference to the partially-constructed this object might be passed to some other object in the system. Consider this scenario:
public class Foo {
public Foo() {
setup();
}
protected void setup() {
// do stuff
}
}
public class Bar extends Foo implements SomeListener {
@Override protected void setup() {
otherObject.addListener(this);
}
}
The problem is that the new Bar object is being registered with otherObject before its construction is completed. Now if otherObject starts calling methods on barObject, fields might not have been initialized, or barObject might otherwise be in an inconsistent state. A reference to the barObject (this to itself) has "escaped" into the rest of the system before it's ready.
Instead, if the setup() method is final on Foo, the Bar class can't put code in there that will make the object visible before the Foo constructor finishes.
I believe the example is something like
public class Foo {
public Foo() {
doSomething();
}
public void doSomething() {
System.out.println("do something acceptable");
}
}
public class Bar extends Foo {
public void doSomething() {
System.out.println("yolo");
Zoom zoom = new Zoom(this); // at this point 'this' might not be fully initialized
}
}
Because the super constructor is always called first (either implicitly or explicitly), the doSomething will always get called for a child class. Because the above method is neither final nor private, you can override it in a child class and do whatever you want, which may conflict with what Foo#doSomething() was meant to do.
Per secure coding
Example BAD code:
final class Publisher {
public static volatile Publisher published;
int num;
Publisher(int number) {
published = this;
// Initialization
this.num = number;
// ...
}
}
If an object's initialization (and consequently, its construction) depends on a security check within the constructor, the security check can be bypassed when an untrusted caller obtains the partially initialized instance. See rule OBJ11-J. Be wary of letting constructors throw exceptions for more information.
final class Publisher {
public static Publisher published;
int num;
Publisher(int number) {
// Initialization
this.num = number;
// ...
published = this;
}
}
Because the field is nonvolatile and nonfinal, the statements within
the constructor can be reordered by the compiler in such a way that
the this reference is published before the initialization statements
have executed.
Correct code:
final class Publisher {
static volatile Publisher published;
int num;
Publisher(int number) {
// Initialization
this.num = number;
// ...
published = this;
}
}
The this reference is said to have escaped when it is made available
beyond its current scope. Following are common ways by which the this
reference can escape:
Returning this from a non-private, overridable method that is invoked from the constructor of a class whose object is being
constructed. (For more information, see rule MET05-J. Ensure that
constructors do not call overridable methods.)
Returning this from a nonprivate method of a mutable class, which allows the caller to manipulate the object's state indirectly. This
commonly occurs in method-chaining implementations; see rule VNA04-J.
Ensure that calls to chained methods are atomic for more information.
Passing this as an argument to an alien method invoked from the constructor of a class whose object is being constructed.
Using inner classes. An inner class implicitly holds a reference to the instance of its outer class unless the inner class is declared
static.
Publishing by assigning this to a public static variable from the constructor of a class whose object is being constructed.
Throwing an exception from a constructor. Doing so may cause code to be vulnerable to a finalizer attack; see rule OBJ11-J. Be wary of
letting constructors throw exceptions for more information.
Passing internal object state to an alien method. This enables the method to retrieve the this reference of the internal member object.
This rule describes the potential consequences of allowing the this
reference to escape during object construction, including race
conditions and improper initialization. For example, declaring a field
final ordinarily ensures that all threads see the field in a fully
initialized state; however, allowing the this reference to escape
during object construction can expose the field to other threads in an
uninitialized or partially initialized state. Rule TSM03-J. Do not
publish partially initialized objects, which describes the guarantees
provided by various mechanisms for safe publication, relies on
conformance to this rule. Consequently, programs must not allow the
this reference to escape during object construction.
In general, it is important to detect cases in which the this
reference can leak out beyond the scope of the current context. In
particular, public variables and methods should be carefully
scrutinized.

How do JVM's implicit memory barriers behave when chaining constructors?

Referring to my earlier question on incompletely constructed objects, I have a second question. As Jon Skeet pointed out, there's an implicit memory barrier at the end of a constructor that makes sure that final fields are visible to all threads. But what if a constructor calls another constructor: is there such a memory barrier at the end of each of them, or only at the end of the one that was called in the first place? That is, when the "wrong" solution is:
public class ThisEscape {
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
});
}
}
And the correct one would be a factory method version:
public class SafeListener {
private final EventListener listener;
private SafeListener() {
listener = new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
};
}
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener(safe.listener);
return safe;
}
}
Would the following work too, or not?
public class MyListener {
private final EventListener listener;
private MyListener() {
listener = new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
};
}
public MyListener(EventSource source) {
this();
source.registerListener(listener);
}
}
Update: The essential question is whether this() is guaranteed to actually call the private constructor above (in which case there would be a barrier where intended and everything would be safe), or whether the private constructor may get inlined into the public one as an optimization to save one memory barrier (in which case there wouldn't be a barrier until the end of the public constructor).
Are the rules of this() defined precisely somewhere? If not, then I think we must assume that inlining chained constructors is allowed, and probably some JVMs or maybe even javacs are doing it.
I think it is safe, as the Java Memory Model states that:
Let o be an object, and c be a constructor for o in which a final
field f is written. A freeze action on final field f of o takes place
when c exits, either normally or abruptly. Note that if one
constructor invokes another constructor, and the invoked constructor
sets a final field, the freeze for the final field takes place at the
end of the invoked constructor.
An object is considered to be completely initialized when its constructor finishes.
This applies also for chained constructors.
If you have to register in the constructor define the listener as a static inner class. This is safe.
Your second version is not correct, because it is allowing the 'this' reference to escape from the construction process. Having 'this' escape invalidates the initialization safety guarantees that give final fields their safety.
To address the implicit question, the barrier at the end of construction only happens at the very end of object construction. The intuition one reader offered about inlining is a useful one; from the perspective of the Java Memory Model, method boundaries do not exist.
EDIT: After the comment that suggested the compiler might inline the private constructor (I had not thought of that optimization), chances are that the code will be unsafe. And the worst part of unsafe multithreaded code is that it seems to work, so you are better off avoiding it completely. If you want to play different tricks (if you really do want to avoid the factory for some reason), consider adding a wrapper to guarantee the coherence of data in the internal implementation object and register in the external object.
My guess is that it will be fragile but ok. The compiler cannot know whether the internal constructor will be called only from within other constructors or not, so it has to make sure that the result would be correct for code calling only the internal constructor, so whatever mechanism it uses (memory barrier?) has to be in place there.
I would guess that the compiler would add the memory barrier at the end of each and every constructor. The problem is still there: you are passing the this reference to other code (possibly other threads) before it is fully constructed, which is bad. But if the only 'construction' that is left is registering the listener, then the object state is as stable as it will ever be.
The solution is fragile in that some other day, you or some other programmer may need to add another member to the object and may forget that the chained constructors is a concurrency trick and may decide to initialize the field in the public constructor, and in doing so will add a hard to detect potential data race in your application, so I would try to avoid that construct.
BTW: The guessed safety may be wrong. I don't know how complex/smart the compiler is, and whether the memory barrier (or the like) is something it could try to optimize away... since the constructor is private the compiler does have enough information to know that it is only called from other constructors, and that is enough information to determine that the synchronization mechanism is not necessary in the internal constructor...
Escaping object reference in c-tor can publish an incompletely constructed object. This is true even if the publication is the last statement in the constructor.
Your SafeListener might not behave ok in a concurrent environment, even if c-tor inlining is performed (which I think it's not - think about creating objects using reflection by accessing private c-tor).
