Java, lazily initialized field without synchronization


Sometimes when I need a lazily initialized field, I use the following design pattern.
class DictionaryHolder {
private volatile Dictionary dict; // some heavy object
public Dictionary getDictionary() {
Dictionary d = this.dict;
if (d == null) {
d = loadDictionary(); // costly operation
this.dict = d;
}
return d;
}
}
It looks like the double-checked locking idiom, but not exactly. There is no synchronization, and it is possible for the loadDictionary method to be called several times.
I use this pattern when the concurrency is pretty low. I also bear in mind the following assumptions when using this pattern:
loadDictionary method always returns the same data.
loadDictionary method is thread-safe.
My questions:
Is this pattern correct? In other words, is it possible for getDictionary() to return invalid data?
Is it possible to make dict field non-volatile for more efficiency?
Is there any better solution?

I personally feel that the Initialization on demand holder idiom is a good fit for this case. From the wiki:
public class Something {
private Something() {}
private static class LazyHolder {
private static final Something INSTANCE = new Something();
}
public static final Something getInstance() {
return LazyHolder.INSTANCE;
}
}
Though this might look like a pattern intended purely for singleton control, you can do many more cool things with it. For e.g. the holder class can invoke a method which in turn populates some kind of data.
Also, it seems that in your case if multiple threads queue on the loadDictionary call (which is synchronized), you might end up loading the same thing multiple times.
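As a rough sketch of the point above that the holder class can invoke a method which populates data: the names WordList, WordListHolder and the LOADS counter below are hypothetical stand-ins for the question's Dictionary and loadDictionary(), introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the "heavy" Dictionary from the question.
final class WordList {
    // Counts how often the costly load actually runs (illustration only).
    static final AtomicInteger LOADS = new AtomicInteger();

    private final List<String> words;

    private WordList(List<String> words) {
        this.words = words;
    }

    // Stand-in for the costly loadDictionary() call.
    static WordList load() {
        LOADS.incrementAndGet();
        return new WordList(Arrays.asList("alpha", "beta", "gamma"));
    }

    boolean contains(String word) {
        return words.contains(word);
    }
}

final class WordListHolder {
    private WordListHolder() { }

    // The nested class is initialized on first access to INSTANCE.
    // Class initialization runs once, even when several threads race.
    private static class Lazy {
        static final WordList INSTANCE = WordList.load();
    }

    static WordList get() {
        return Lazy.INSTANCE;
    }
}

class HolderDemo {
    public static void main(String[] args) {
        WordList first = WordListHolder.get();
        WordList second = WordListHolder.get();
        System.out.println(first == second);          // same instance both times
        System.out.println(first.contains("beta"));
        System.out.println(WordList.LOADS.get());     // number of loads performed
    }
}
```

Unlike the pattern in the question, the load here cannot run twice, at the price of tying the data to a static field.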

The simplest solution is to rely on the fact that a class is not loaded until it is needed. i.e. it is lazy loaded anyway. This way you can avoid having to do those checks yourself.
public enum Dictionary {
INSTANCE;
private Dictionary() {
// load dictionary
}
}
There shouldn't be a need to make it any more complex, certainly you won't make it more efficient.
EDIT: If Dictionary needs to implement List or Map, you can do this:
public enum Dictionary implements List<String> { }
OR a better approach is to use a field.
public enum Dictionary {
INSTANCE;
public final List<String> list = new ArrayList<String>();
}
OR use a static field initializer
public class Dictionary extends ArrayList<String> {
public static final Dictionary INSTANCE = new Dictionary();
private Dictionary() { }
}

Your code is correct. To avoid loading more than once, synchronized{} would be nice.
You can remove volatile, if Dictionary is immutable.
private Dictionary dict; // not volatile; assume Dictionary immutable
public Dictionary getDict() {
    if (dict == null)
        dict = load();
    return dict;
}
If we add double checked locking, it's perfect
public Dictionary getDict() {
    if (dict == null)
        synchronized (this) {
            if (dict == null)
                dict = load();
        }
    return dict;
}
Double checked locking works great for immutable objects, without need of volatile.
Unfortunately the above 2 getDict() methods aren't theoretically bulletproof. The weak Java memory model will allow some spooky actions, in theory. To be 100% correct by the book, we must add a local variable, which clutters our code:
public Dictionary getDict() {
    Dictionary local = dict;
    if (local == null)
        synchronized (this) {
            local = dict;
            if (local == null)
                local = dict = load();
        }
    return local;
}

1.Is this pattern correct? In other words, is it possible for getDictionary() to return invalid data?
Yes, if it's okay that loadDictionary() can be called by several threads simultaneously and thus different calls to getDictionary() can return different objects. Otherwise you need a solution with synchronization.
2.Is it possible to make dict field non-volatile for more efficiency?
No, it can cause memory visibility problems.
3.Is there any better solution?
As long as you want a solution without synchronization (either explicit or implicit), no (as far as I understand). Otherwise, there are a lot of idioms such as using an enum or an inner holder class (but they use implicit synchronization).
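One lock-free variation that is not mentioned in this thread, offered only as a hedged sketch: an AtomicReference with compareAndSet still lets the loader run more than once under contention, but every caller ends up with the one instance that was installed first. The class name LazyRef is hypothetical, and the sketch assumes the loader never returns null.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Lock-free lazy holder (hypothetical name). The loader may run more than
// once under contention, but all callers receive the same winning instance.
final class LazyRef<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();
    private final Supplier<T> loader; // assumed to never return null

    LazyRef(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T value = ref.get();
        if (value == null) {
            T candidate = loader.get(); // possibly redundant work
            // Only the first thread to arrive installs its candidate;
            // the others discard theirs and read the installed value.
            value = ref.compareAndSet(null, candidate) ? candidate : ref.get();
        }
        return value;
    }
}

class LazyRefDemo {
    public static void main(String[] args) {
        LazyRef<Object> lazy = new LazyRef<>(Object::new);
        System.out.println(lazy.get() == lazy.get());
    }
}
```

Whether this is preferable to a plain volatile field depends on whether callers must agree on a single object; the redundant loads remain either way.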

Just a quick stab at this but what about...
class DictionaryHolder {
private volatile Dictionary dict; // some heavy object
public Dictionary getDictionary() {
Dictionary d = this.dict;
if (d == null) {
synchronized (this) {
d = this.dict;
if (d == null) { // gated test for null
this.dict = d = loadDictionary(); // costly operation
}
}
}
return d;
}
}

Is it possible to make dict field non-volatile for more efficiency?
No. That would hurt visibility, i.e. when one thread initializes dict, other threads may not see the updated reference in time (or at all). This in turn would result in multiple heavy initializations, thus lots of useless work, not to mention returning references to multiple distinct objects.
Anyway, when dealing with concurrency, micro-optimizations for efficiency would be my last thought.

Initialize-on-demand holder class idiom
This method relies on the JVM initializing a class's static members only upon first use of that class. In this case, we have a nested class that is only referenced within the getDictionary() method. This means DictionaryLazyHolder will get initialized on the first call to getDictionary().
public class DictionaryHolder {
private DictionaryHolder ()
{
}
public static Dictionary getDictionary()
{
return DictionaryLazyHolder.instance;
}
private static class DictionaryLazyHolder
{
static final Dictionary instance = new Dictionary(); // the costly load happens here, once
}
}

Related

A rare usage of WeakReference?

I have a class whose instances are initialized and used by the underlying platform.
class MyAttributeConverter implements AttributeConverter<XX, YY> {
public YY convertToDatabaseColumn(XX attribute) { return null; }
public XX convertToEntityAttribute(YY dbData) { return null; }
}
Nothing's wrong so far, and I thought I needed to add some static methods to be used as method references.
private static MyAttributeConverter instance;
// just a lazy-initialization;
// no synchronization is required;
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
if (instance == null) {
instance = new MyAttributeConverter();
}
return instance;
}
// do as MyAttributeConverter::toDatabaseColumn(xx)
public static YY toDatabaseColumn(XX attribute) {
return instance().convertToDatabaseColumn(attribute);
}
public static XX toEntityAttribute(YY dbData) {
return instance().convertToEntityAttribute(dbData);
}
Still nothing seems wrong (I believe), but I don't like the instance being kept alive with the class, and that's why I'm trying to do this.
private static WeakReference<MyAttributeConverter> reference;
public static <R> R applyInstance(Function<? super MyAttributeConverter, ? extends R> function) {
MyAttributeConverter referent;
if (reference == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
return applyInstance(function);
}
referent = reference.get();
if (referent == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
return applyInstance(function);
}
return function.apply(referent); // ##?
}
I basically don't even know how to test this code. And I'm sorry for my questions which each might be somewhat vague.
Is this a (right/wrong) approach?
Is there any chance that reference.get() inside the function.apply idiom may be null?
Is there any chance that there may be some problems such as memory-leak?
Should I rely on SoftReference rather than WeakReference?
Thank you.

Note that a method like
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
if (instance == null) {
instance = new MyAttributeConverter();
}
return instance;
}
is not thread safe, as it bears two reads of the instance field; each of them may perceive updates made by other threads or not. This implies that the first read in instance == null may perceive a newer value written by another thread whereas the second in return instance; could evaluate to the previous value, i.e. null. So this method could return null when more than one thread is executing it concurrently. This is a rare corner case, still, this method is not safe. You’d need a local variable to ensure that the test and the return statement use the same value.
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
MyAttributeConverter current = instance;
if (current == null) {
instance = current = new MyAttributeConverter();
}
return current;
}
This still is only safe when MyAttributeConverter is immutable using only final fields. Otherwise, a thread may return an instance created by another thread in an incompletely constructed state.
You can use the simple way to make it safe without those constraints:
private static final MyAttributeConverter instance = new MyAttributeConverter();
private static MyAttributeConverter instance() {
return instance;
}
This still is lazy as class initialization only happens on one of the specified triggers, i.e. the first invocation of the method instance().
Your usage of WeakReference is subject to the same problems. Further, it’s not clear why you resort to a recursive invocation of your method at two points where you already have the required argument in a local variable.
A correct implementation can be far simpler:
private static WeakReference<MyAttributeConverter> reference;
public static <R> R applyInstance(
Function<? super MyAttributeConverter, ? extends R> function) {
WeakReference<MyAttributeConverter> r = reference;
MyAttributeConverter referent = r != null? r.get(): null;
if (referent == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
}
return function.apply(referent);
}
But before you use it, you should reconsider whether the complicated code is worth the effort. The fact that you are accepting the need to reconstruct the object when it has been garbage collected, even potentially constructing multiple instances on concurrent invocations, suggests that you know that the construction will be cheap. When the construction is cheap, you probably don’t need to cache an instance of it at all.
Just consider
public static <R> R applyInstance(
Function<? super MyAttributeConverter, ? extends R> function) {
return function.apply(new MyAttributeConverter());
}
It’s at least worth trying, measuring the application’s performance and comparing it with the other approaches.
On the other hand, it doesn’t look like the instance occupies a significant amount of memory or holds non-memory resources. Otherwise, you would be more worried about the possibility of multiple instances flying around. So the other variant worth trying and comparing is the one shown above using a static final field with lazy class initialization and no opportunity to garbage collect that small object.
One last clarification. You asked
Is there any chance that reference.get() inside the function.apply idiom may be null?
Since there is no reference.get() invocation inside the evaluation of function.apply, there is no chance that such an invocation may evaluate to null at this point. The function receives a strong reference and since the calling code ensured that this strong reference is not null, it will never become null during the invocation of the apply method.
Generally, the garbage collector will never alter the application state in a way that code using strong references will notice a difference (letting the availability of more memory aside).
But since you asked specifically about reference.get(), a garbage collector may collect an object after its last use, regardless of method executions or local scopes. So the referent could get collected during the execution of the apply method when this method does not use the object anymore. Runtime optimizations may allow this to happen earlier than you might guess by looking at the source code, because what may look like an object use (e.g. a field read) may not use the object at runtime (e.g. because that value is already held in a CPU register, eliminating the need to access the object’s memory). As said, all without altering the method’s behavior.
So a hypothetical reference.get() during the execution of the apply method could in principle evaluate to null, but there is no reason for concern, as said, the behavior of the apply method does not change. The JVM will retain the object’s memory as long as needed for ensuring this correct method execution.
But that explanation was just for completeness. As said, you should not use weak nor soft references for objects not holding expensive resources.
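To make the point about strong references concrete, here is a minimal sketch (the class name WeakReferenceDemo is hypothetical). While a strong reference is still used after a garbage collection request, the WeakReference keeps returning the same object. System.gc() is only a hint, so the sketch does not try to show the opposite case of a cleared reference.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    // Returns true if the weak reference still yields the referent while
    // a strong reference to it is in use.
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint to the JVM
        // 'strong' is used after the gc request, so the referent is still
        // strongly reachable at that point and cannot have been cleared.
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileStronglyHeld()); // prints true
    }
}
```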

Ensuring safe publication and thread safety in java by means of static factories

The class below is meant to be immutable (but see edit):
public final class Position extends Data {
double latitude;
double longitude;
String provider;
private Position() {}
private static enum LocationFields implements
Fields<Location, Position, List<Byte>> {
LAT {
@Override
public List<byte[]> getData(Location loc, final Position out) {
final double lat = loc.getLatitude();
out.latitude = lat;
// return an arrayList
}
@Override
public void parse(List<Byte> list, final Position pos)
throws ParserException {
try {
pos.latitude = listToDouble(list);
} catch (NumberFormatException e) {
throw new ParserException("Malformed file", e);
}
}
}/* , LONG, PROVIDER, TIME (field from Data superclass)*/;
}
// ========================================================================
// Static API (factories essentially)
// ========================================================================
public static Position saveData(Context ctx, Location data)
throws IOException {
final Position out = new Position();
final List<byte[]> listByteArrays = new ArrayList<byte[]>();
for (LocationFields bs : LocationFields.values()) {
listByteArrays.add(bs.getData(data, out).get(0));
}
Persist.saveData(ctx, FILE_PREFIX, listByteArrays);
return out;
}
public static List<Position> parse(File f) throws IOException,
ParserException {
List<EnumMap<LocationFields, List<Byte>>> entries;
// populate entries from f
final List<Position> data = new ArrayList<Position>();
for (EnumMap<LocationFields, List<Byte>> enumMap : entries) {
Position p = new Position();
for (LocationFields field : enumMap.keySet()) {
field.parse(enumMap.get(field), p);
}
data.add(p);
}
return data;
}
/**
* Constructs a Position instance from the given string. Complete copy
* paste just to get the picture
*/
public static Position fromString(String s) {
if (s == null || s.trim().equals("")) return null;
final Position p = new Position();
String[] split = s.split(N);
p.time = Long.valueOf(split[0]);
int i = 0;
p.longitude = Double.valueOf(split[++i].split(IS)[1].trim());
p.latitude = Double.valueOf(split[++i].split(IS)[1].trim());
p.provider = split[++i].split(IS)[1].trim();
return p;
}
}
Being immutable it is also thread safe and all that. As you see the only way to construct instances of this class - except reflection which is another question really - is by using the static factories provided.
Questions :
Is there any case an object of this class might be unsafely published ?
Is there a case the objects as returned are thread unsafe ?
EDIT : please do not comment on the fields not being private - I realize this is not an immutable class by the dictionary, but the package is under my control and I won't ever change the value of a field manually (after construction ofc). No mutators are provided.
The fields not being final on the other hand is the gist of the question. Of course I realize that if they were final the class would be truly immutable and thread safe (at least after Java5). I would appreciate providing an example of bad use in this case though.
Finally - I do not mean to say that the factories being static has anything to do with thread safety as some of the comments seem(ed) to imply. What is important is that the only way to create instances of this class is through those (static of course) factories.

Yes, instances of this class can be published unsafely. This class is not immutable, so if the instantiating thread makes an instance available to other threads without a memory barrier, those threads may see the instance in a partially constructed or otherwise inconsistent state.
The term you are looking for is effectively immutable: the instance fields could be modified after initialization, but in fact they are not.
Such objects can be used safely by multiple threads, but it all depends on how other threads get access to the instance (i.e., how they are published). If you put these objects on a concurrent queue to be consumed by another thread—no problem. If you assign them to a field visible to another thread in a synchronized block, and notify() a wait()-ing thread which reads them—no problem. If you create all the instances in one thread which then starts new threads that use them—no problem!
But if you just assign them to a non-volatile field and sometime "later" another thread happens to read that field, that's a problem! Both the writing thread and the reading thread need synchronization points so that the write truly can be said to have happened before the read.
Your code doesn't do any publication, so I can't say if you are doing it safely. You could ask the same question about this object:
class Option {
private boolean value;
Option(boolean value) { this.value = value; }
boolean get() { return value; }
}
If you are doing something "extra" in your code that you think would make a difference to the safe publication of your objects, please point it out.
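As a hedged sketch of one of the safe hand-offs listed above (a concurrent queue), the following uses a hypothetical effectively immutable class Reading, modeled on Option, and an ArrayBlockingQueue. The queue's put and take provide the happens-before edge, so the consumer sees the field value written in the constructor.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Effectively immutable (hypothetical class): the field is not final,
// but it is never written after construction.
class Reading {
    private double value;
    Reading(double value) { this.value = value; }
    double get() { return value; }
}

class SafePublicationDemo {
    // Publishes a Reading to another thread through a BlockingQueue and
    // returns the value that thread observed.
    static double handOff(double v) {
        BlockingQueue<Reading> queue = new ArrayBlockingQueue<>(1);
        double[] seen = new double[1];
        Thread consumer = new Thread(() -> {
            try {
                // Actions before put() happen-before actions after take().
                seen[0] = queue.take().get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            queue.put(new Reading(v));
            consumer.join(); // join() makes the consumer's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff(42.5)); // prints 42.5
    }
}
```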

Position is not immutable, the fields have package visibility and are not final, see definition of immutable classes here: http://www.javapractices.com/topic/TopicAction.do?Id=29.
Furthermore Position is not safely published because the fields are not final and there is no other mechanism in place to ensure safe publication. The concept of safe publication is explained in many places, but this one seems particularly relevant: http://www.ibm.com/developerworks/java/library/j-jtp0618/
There are also relevant sources on SO.
In a nutshell, safe publication is about what happens when you give the reference to your constructed instance to another thread: will that thread see the field values as intended? The answer here is no, because the Java compiler and the JIT compiler are free to reorder the field initialization with the reference publication, so a half-baked state can become visible to other threads.
This last point is crucial, from the OP comment to one of the answers below he appears to believe static methods somehow work differently from other methods, that is not the case. A static method can get inlined much like any other method, and the same is true for constructors (the exception being final fields in constructors post Java 1.5). To be clear, while the JMM doesn't guarantee the construction is safe, it may well work fine on certain or even all JVMs. For ample discussion, examples and industry expert opinions see this discussion on the concurrency-interest mailing list: http://jsr166-concurrency.10961.n7.nabble.com/Volatile-stores-in-constructors-disallowed-to-see-the-default-value-td10275.html
The bottom line is, it may work, but it is not safe publishing according to JMM. If you can't prove it is safe, it isn't.

The fields of the Position class are not final, so I believe that their values are not safely published by the constructor. The constructor is therefore not thread-safe, so no code that uses it (such as your factory methods) produces thread-safe objects.
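For comparison, a simplified sketch of what a variant with final fields could look like. ImmutablePosition and its factory of(...) are hypothetical and deliberately leave out the Data superclass and the enum-based parsing from the question. The idea is that the factory parses into local values first and only then calls the constructor, so every field can be final.

```java
// Hypothetical simplified variant of Position, not the original class.
final class ImmutablePosition {
    private final double latitude;
    private final double longitude;
    private final String provider;

    private ImmutablePosition(double latitude, double longitude, String provider) {
        this.latitude = latitude;
        this.longitude = longitude;
        this.provider = provider;
    }

    // Static factory: parse everything first, then construct in one step,
    // so no field is ever assigned after the constructor has finished.
    static ImmutablePosition of(String latitude, String longitude, String provider) {
        return new ImmutablePosition(
                Double.parseDouble(latitude.trim()),
                Double.parseDouble(longitude.trim()),
                provider.trim());
    }

    double latitude() { return latitude; }
    double longitude() { return longitude; }
    String provider() { return provider; }
}

class ImmutablePositionDemo {
    public static void main(String[] args) {
        ImmutablePosition p = ImmutablePosition.of("48.85", "2.35", "gps");
        System.out.println(p.latitude() + ", " + p.longitude() + " via " + p.provider());
    }
}
```

With final fields, the Java memory model guarantees that any thread obtaining a reference to the instance sees the values assigned in the constructor, provided the reference does not escape during construction.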

Singleton pattern interview

I was recently asked a Java question in an interview, built around the following code. Since I am very new to Java and barely code in it, I really have no idea what the code does.
The question was
Select the option that describes the worst thing with the following code:
public class Bolton {
private static Bolton INST = null;
public static Bolton getInstance()
{
if ( INST == null )
{
INST = new Bolton();
}
return INST;
}
private Bolton() {
}
}
Here are the options for this question
More than one instance of Bolton can be created
A Bolton will never be created
The constructor is private and can't be called
Value can be garbage collected, and the call to getInstance may return garbage data
Which of the above options is correct? And Why?

This is a Singleton Pattern
The idea of a Singleton Pattern is to only have one available instance of a class. Therefore the constructor is set to private and the class maintains, in this case, a getInstance() method that either calls an existing instance variable, INST in this class, or creates a new one for the executing program. The answer is probably 1, because it's not thread safe. It may be confused for 3, which I had put down earlier, but that is by design, technically, so not actually a flaw.
Here's an example of Lazy Initialization, thread-safe singleton pattern from Wikipedia:
public class SingletonDemo {
private static volatile SingletonDemo instance = null;
private SingletonDemo() { }
public static SingletonDemo getInstance() {
if (instance == null) {
synchronized (SingletonDemo.class){
if (instance == null) {
instance = new SingletonDemo();
}
}
}
return instance;
}
}
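Declaring the instance variable volatile guarantees that a write to it by one thread is visible to later reads by other threads, and that the object is fully constructed before its reference becomes visible.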
Synchronized statements or methods help with concurrency.
Read more about double checked locking which is what happens for a "lazy initialization" singleton

More than one instance of Bolton can be created
This option is correct due to lack of synchronization in the above code.
Suppose two threads check for null concurrently: both will find that the value is null, and both will call the constructor (which defeats the singleton).
There is another catch: even if two threads don't check for null at the same moment, suppose one thread calls the constructor. INST is not assigned until the constructor returns (assume the object is costly to create and takes time), so until then another thread may check the null condition and still find INST null.
This kind of scenario, in which an action depends on a previously checked precondition, is called check-then-act, and it is the culprit behind the race.
For singleton there are two standards that are being used:
Double Checked locking
Enum based singleton pattern
UPDATE:
Here is another great article which discusses the double checked locking

The interviewer basically wants to check your knowledge of the Singleton pattern. Can the pattern be broken? The answer is yes. Check this or google "when singleton is not a singleton".
The best course is to use an enum-based singleton, as suggested by Joshua Bloch.
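A minimal sketch of the enum-based approach, using the hypothetical name BoltonSingleton and an illustrative counter method that is not part of the interview code:

```java
// Enum-based singleton: the JVM creates INSTANCE exactly once, during
// class initialization, and enums cannot be instantiated reflectively.
enum BoltonSingleton {
    INSTANCE;

    private int counter;

    // Illustrative state; synchronized because the single instance is shared.
    synchronized int next() {
        return ++counter;
    }
}

class BoltonSingletonDemo {
    public static void main(String[] args) {
        BoltonSingleton a = BoltonSingleton.INSTANCE;
        BoltonSingleton b = BoltonSingleton.valueOf("INSTANCE");
        System.out.println(a == b); // prints true
        System.out.println(BoltonSingleton.values().length); // prints 1
    }
}
```

Enum constants are also handled specially by Java serialization, so deserializing one does not create a second instance.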

The getInstance() method should be synchronized, otherwise many instances could be created if multiple threads call getInstance() at the same time. So I would select option 1.

We use the Singleton pattern when we want to have only one object of a class that will be used everywhere. To prevent the class from creating many objects, we make its constructor private and provide one public function that returns the single object of the class.
public class MethodWin {
private int isLoggedOn=0;
private static MethodWin objectMethodWin = new MethodWin();
private MethodWin() { }
public static MethodWin getInstance() {
return objectMethodWin;
}
public void setIsLoggedOn(int value) {
this.isLoggedOn=value;
}
public int getIsLoggedOn() {
return this.isLoggedOn;
}
}
So when we need this object, we should write:
MethodWin meth = MethodWin.getInstance();

Original Answer is that only one instance of Bolton is created.

Through reflection we can create many objects even if the constructor is private. In multi-threaded environment there are chances to create more than one instance. Through serialization there are chances to create more than one object.

simple answer is 2) A Bolton will never be created because when you create instance the constructor will call inside constructor initialization when call getInstance method then answer will be single instance will be created.

How to deal with one time execution of "code piece" or method?

I come across this kind of a situation a lot.
class A{
public static boolean flag = true;
public void method(){
// calls method in class B
B b = new B();
while(someCondition){
b.method();
}
}
}
.
class B{
public void method(){
if(A.flag){
// Read all data from a flat file and store it in HashMap/ArrayList etc
//only for the first time
A.flag = false;
}
// Manipulate the data
}
}
I seem to be running into this type of situation quite often in completely different situations.
Is this how it is normally dealt with? I feel a bit silly and unnatural using static variables and if statements to resolve the issue.
In this case, I don't want to cause an overhead by reading data every time the method is executed.

It looks like you need the Singleton Pattern. Figure out what data B needs to load upon its first use, and package that into a separate class that gets used as a singleton instance. See this link for more information on how to implement the singleton pattern in Java.
Following this pattern, you can avoid the need for checking a flag every time your method is called, and you can simultaneously avoid any threading issues (if there are any).

As John B pointed out, a simple check for null should be enough rather than using a flag. If thread safety becomes an issue, you might also look into Guava's CacheBuilder and Suppliers.memoize() for these types of situations.
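If pulling in Guava is not an option, the idea behind a memoizing supplier can be sketched with the JDK alone. This is a simplified illustration, not Guava's implementation. The class name Memo is hypothetical, and the sketch assumes the wrapped supplier does not throw.

```java
import java.util.function.Supplier;

// Hypothetical memoizing wrapper: runs the delegate at most once and
// caches the result for all later calls, including concurrent ones.
final class Memo<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile boolean initialized;
    private T value; // written before 'initialized' is set, read after it

    Memo(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        if (!initialized) {
            synchronized (this) {
                if (!initialized) {
                    value = delegate.get();
                    initialized = true; // volatile write publishes 'value'
                }
            }
        }
        return value;
    }
}

class MemoDemo {
    public static void main(String[] args) {
        Memo<String> data = new Memo<>(() -> {
            System.out.println("loading..."); // printed only once
            return "contents of the flat file";
        });
        System.out.println(data.get());
        System.out.println(data.get());
    }
}
```

In the question's terms, class B would hold a Memo around the file-reading code and call get() inside method(), which removes the static flag in class A.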

Rather than reading an external flag to determine if the data has already been stored, why not check the data store to see if it is populated? Or if the data store is expensive (DB) use a local static variable rather than one in a different class.

You could simply check if the list/map that stores your data has been initialised by a previous call
class A {
public void method() {
B b = new B();
while (someCondition) {
b.method();
}
}
}
class B {
private List myList;
public void method() {
if (myList == null) {
// Read all data from a flat file and store it in myList
}
// manipulate the data
}
}

Multithreading headache. Java. Return values

Hi guys I have the following.
class a extends Thread
{
public synchronized BigInteger getUniqueID()
{
BigInteger aUniqueID = new BigInteger(getUniqueKeyFromDatabase());
return aUniqueID;
}
}
class b extends a
{
public void run()
{
BigInteger uniqueID = getUniqueID();
// store UniqueID in another database table with other stuff
}
}
And what I'm getting is duplicate unique IDs stored in the database table. I'm assuming this is because uniqueID is being changed in this multi-threaded environment.
I'm obviously going horribly horribly wrong somewhere, I'm guessing I shouldn't be returning the value in this way. Or should be defining uniqueID as new BigInteger based on the response from the getUniqueID method.
Any help would be greatly appreciated, as my fragile mind has been warped right now!
Cheers
Alan

BigInteger is an (from the JavaDocs)
Immutable arbitrary-precision integer
So that rules out anyone mutating the BigInteger object. I'd look into getUniqueKeyFromDatabase.

Your getUniqueKeyFromDatabase() has to be a method which will not return the same value twice. Everything else doesn't matter.
Each thread has its own copy of local variables, and they are not shared.
BTW: don't extend Thread; it's bad practice which often leads to confusion.

Your problem is because you're not really synchronizing anything. The getUniqueID() method in class A is synchronized on its own implicit monitor. But that means each time you create a new thread, you're synchronizing each one on itself. Does that make sense ?
You need to synchronize on some shared variable. Here is a quick fix to illustrate the point (but really don't use this in practice). In the example below, all your threads synchronize on the same object (a shared static).
class A extends Thread {
static Object shared = new Object();
public BigInteger getUniqueID()
{
synchronized (shared) {
BigInteger aUniqueID = new BigInteger(getUniqueKeyFromDatabase());
return aUniqueID;
}
}
}

Chances are, that the synchronized modifier for getUniqueID() is pointless, as you don't modify any state there. It does not protect the getUniqueKeyFromDatabase() either, because it synchronizes on the instance. This means that every Thread runs without synchronizing with the others.
You could check whether the following works better for you:
public BigInteger getUniqueID() {
synchronized (a.class) {
BigInteger aUniqueID = new BigInteger(getUniqueKeyFromDatabase());
return aUniqueID;
}
}
If it does, you should think about your database design (or whatever happens in getUniqueKeyFromDatabase). Synchronization should really be done by the database, not in client code.

You must have a problem with the method which returns the unique id. To ensure uniqueness of your ids for each object, use something like the code below, for example in your class a.
PS: Class names should start with capital letters. Also, as suggested by @Peter Lawrey, implement Runnable instead of extending Thread.
private static int nextId = 0;
protected int id;
public a(){
this.id = getNextId();
}
private static int getNextId(){
return nextId++;
}
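Note that nextId++ on a plain static int is not atomic, so two threads could still receive the same id. A hedged sketch of a thread-safe variant using AtomicLong follows. IdGenerator is a hypothetical name, and the ids are unique only within one JVM, so this does not replace a database-generated key.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-process id generator. Unique only within this JVM.
final class IdGenerator {
    private static final AtomicLong NEXT = new AtomicLong();

    private IdGenerator() { }

    // getAndIncrement() is atomic, so concurrent callers never share an id.
    static long nextId() {
        return NEXT.getAndIncrement();
    }
}

class IdGeneratorDemo {
    public static void main(String[] args) {
        long first = IdGenerator.nextId();
        long second = IdGenerator.nextId();
        System.out.println(second - first); // prints 1
    }
}
```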
