Lazy-loaded singleton: Double-checked locking vs Initialization on demand holder idiom - java

I have a requirement to lazy-load resources in a concurrent environment. The code to load the resources should be executed only once.
Both double-checked locking (using JRE 5+ and the volatile keyword) and the initialization-on-demand holder idiom seem to fit the job well.
Just by looking at the code, the initialization-on-demand holder idiom seems cleaner and more efficient (but hey, I'm guessing here). Still, I will have to take care to document the pattern at every one of my singletons. At least to me, it would be hard to understand on the spot why the code was written like this...
My question here is: Which approach is better? And why?
If your answer is none: how would you tackle this requirement in a Java SE environment?
Alternatives
Could I use CDI for this without imposing its use on my entire project? Any articles out there?
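For concreteness, the double-checked-locking variant in question can be sketched as follows (class name illustrative; requires the JRE 5+ memory model):

```java
public class DclSingleton {
    // volatile is essential here: under the JRE 5+ memory model it guarantees
    // that a fully constructed instance is visible to all threads.
    private static volatile DclSingleton instance;

    private DclSingleton() { }

    public static DclSingleton getInstance() {
        DclSingleton result = instance;        // one volatile read on the fast path
        if (result == null) {
            synchronized (DclSingleton.class) {
                result = instance;             // re-check under the lock
                if (result == null) {
                    instance = result = new DclSingleton();
                }
            }
        }
        return result;
    }
}
```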

To add another, perhaps cleaner, option, I suggest the enum variation:
What is the best approach for using an Enum as a singleton in Java?
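For completeness, a minimal sketch of the enum form (the class name and its contents are illustrative):

```java
public enum ResourceLoader {
    INSTANCE;

    // The JVM creates INSTANCE exactly once, lazily, when the enum class is
    // first initialized; this also holds across serialization and reflection.
    private final String resource = expensiveInit();

    private static String expensiveInit() {
        return "loaded";   // placeholder for the real loading work
    }

    public String resource() {
        return resource;
    }
}
```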

As far as readability goes, I would go with the initialization-on-demand holder. Double-checked locking is, I feel, a dated and ugly implementation.
Technically speaking, by choosing double-checked locking you always incur a volatile read on the field, whereas the initialization-on-demand holder idiom allows plain reads.

Initialisation-on-demand holder only works for a singleton; you can't have per-instance lazily loaded elements. Double-checked locking imposes a cognitive burden on everyone who has to look at the class, as it is easy to get wrong in subtle ways. We used to have all sorts of trouble with this until we encapsulated the pattern into a utility class in our concurrency library.
We have the following options:
Supplier<ExpensiveThing> t1 = new LazyReference<ExpensiveThing>() {
    protected ExpensiveThing create() {
        … // expensive initialisation
    }
};

Supplier<ExpensiveThing> t2 = Lazy.supplier(new Supplier<ExpensiveThing>() {
    public ExpensiveThing get() {
        … // expensive initialisation
    }
});
Both have identical semantics as far as usage is concerned. The second form makes any references used by the inner supplier available to GC after initialisation, and it also supports timeouts with TTL/TTI strategies.
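LazyReference and Lazy.supplier above come from a third-party concurrency library, so their internals aren't shown; a minimal home-grown equivalent of the first form (illustrative only, not the library's actual code) could look like:

```java
import java.util.function.Supplier;

// Illustrative sketch; the real library class adds extras such as TTL/TTI
// timeouts. Assumes create() never returns null.
public abstract class LazyReference<T> implements Supplier<T> {
    private volatile T value;

    /** Called at most once, on the first get(). */
    protected abstract T create();

    @Override
    public T get() {
        T result = value;                      // fast path: a single volatile read
        if (result == null) {
            synchronized (this) {
                result = value;                // re-check under the lock
                if (result == null) {
                    value = result = create(); // runs exactly once
                }
            }
        }
        return result;
    }
}
```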

Initialization-on-demand holder is widely considered best practice for implementing the singleton pattern. It exploits the following features of the JVM very well:
Static nested classes are loaded only when first referenced.
The class-loading mechanism is concurrency-protected by default, so when a thread initializes a class, other threads wait for it to complete.
You also avoid the synchronized keyword on the accessor, which would otherwise add locking overhead to every single call.
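The idiom itself is only a few lines; a minimal sketch (names illustrative):

```java
public class HolderSingleton {

    private HolderSingleton() { }

    // The nested class is initialized only on the first call to getInstance();
    // the JVM's class-initialization lock makes that step thread-safe for free.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    public static HolderSingleton getInstance() {
        return Holder.INSTANCE;   // afterwards: a plain, unsynchronized read
    }
}
```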

I suspect that the initialization-on-demand holder is marginally faster than double-checked locking (using a volatile). The reason is that the former has no synchronization overhead once the instance has been created, whereas the latter involves reading a volatile, which can inhibit certain optimizations and may entail a memory-barrier cost on some architectures.
If performance is not a significant concern, then the synchronized getInstance() approach is the simplest.
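That simplest variant, for comparison (a minimal sketch; class name illustrative):

```java
public class SyncSingleton {
    private static SyncSingleton instance;

    private SyncSingleton() { }

    // Correct and easy to read; the cost is taking the class's monitor on
    // every call, which only matters if getInstance() is on a hot path.
    public static synchronized SyncSingleton getInstance() {
        if (instance == null) {
            instance = new SyncSingleton();
        }
        return instance;
    }
}
```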

Related

Singleton access performance

Whenever I want to use the singleton pattern in my app code, I use code similar to this:
public class Singleton {
    private static volatile Singleton INSTANCE;

    public static synchronized Singleton singleton() {
        if (INSTANCE == null) {
            INSTANCE = new Singleton();
        }
        return INSTANCE;
    }
}
I think it's quite a common solution, but now I'm wondering what's the best way to use that singleton instance: whether to call it inline:
singleton().doSomething();
singleton().doSomethingElse();
or create singleton field in every class which uses it:
private Singleton mSingleton = singleton(); // Or pass it in constructor
mSingleton.doSomething();
mSingleton.doSomethingElse();
Question
In which case performance is better?
Do not use the Singleton pattern!
A Singleton is merely a global variable and makes your code inflexible, hard to reuse, and hard to test.
Make your Singletons ordinary objects and ensure that there is only one instance when constructing the object tree, either by passing around the one and only instance of the logical singleton or by using a dependency injection framework.
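A sketch of that approach, with illustrative names (Config standing in for the logical singleton):

```java
// An ordinary class with no static state: nothing enforces "only one";
// the composition root simply creates it once.
class Config {
    private final String dbUrl;

    Config(String dbUrl) {
        this.dbUrl = dbUrl;
    }

    String dbUrl() {
        return dbUrl;
    }
}

class Repository {
    private final Config config;   // handed in, easy to replace in tests

    Repository(Config config) {
        this.config = config;
    }

    String describe() {
        return "repo@" + config.dbUrl();
    }
}

public class Main {
    public static void main(String[] args) {
        Config config = new Config("jdbc:h2:mem:test");   // the one instance
        System.out.println(new Repository(config).describe());
    }
}
```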
You went off on a tangent and forgot to answer the question. – shmosel
Do not choose certain syntax for performance!
There is no point in choosing a certain syntax for performance reasons, especially in a managed language like Java.
The JVM will modify and optimize the resulting bytecode when executing it, and may even recompile it to machine code. This makes it impossible to judge the performance impact of a particular syntax.
Even if the chosen syntax had any performance effect, would it be recognizable? Changes in syntax shift performance only by fractions, but to recognize a difference it must be more than 50% (20% when running both versions side by side).
Both aspects make any consideration of the performance effects of syntax useless. Don't think about performance unless you have *experienced* a problem and proven by measurement that the code in question really is the bottleneck. Anything else is premature optimization.
Choose your syntax so that it expresses your intent and makes your code as readable as possible.
In the case of a singleton, performance can be better because creating the same object many times to perform the same task consumes heap memory each time. So if you can do the task with only one object, why create many?

Double-checked locking as an anti-pattern [duplicate]

This question already has answers here:
Java double checked locking
(11 answers)
Closed 8 years ago.
There's a common belief and multiple sources (including wiki) that claim this idiom to be an anti-pattern.
What are the arguments against using it in production code, given that a correct implementation is used (for example, using volatile)?
What are the appropriate alternatives for implementing lazy initialization in a multithreaded environment? Locking the whole method may become a bottleneck; even though modern synchronization is relatively cheap, it is still much slower, especially under contention. The static holder seems to be a language-specific and somewhat ugly hack (at least to me). Atomics-based implementations do not seem so different from traditional DCL, yet they either allow the value to be computed multiple times or require more complicated code. For example, Scala still uses DCL to implement lazy values, while the proposed alternative seems much more complicated.
Don't use double-checked locking. Ever. It does not work. Don't try to find a hack to make it work, because it may break on a later JRE.
As far as I know, there is no other safe way to do lazy initialization than locking the whole object / synchronizing.
synchronized (lock) {
    // lookup
    // lazy init
}
For singletons the static holder (as @trashgod mentioned) is nice, but it will not remain single if you have multiple classloaders.
If you require a lazy singleton in a multi-classloader environment, use the ServiceLoader.
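A sketch of the ServiceLoader approach (interface and class names illustrative; the provider class must additionally be listed under META-INF/services/ or in a module declaration, which is omitted here):

```java
import java.util.Optional;
import java.util.ServiceLoader;

// The service interface that providers implement.
interface ResourceService {
    String load();
}

public final class ResourceServiceLocator {

    private ResourceServiceLocator() { }

    // ServiceLoader discovers implementations registered under
    // META-INF/services/ResourceService; findFirst() (Java 9+) returns
    // the first provider found, or empty if none is registered.
    public static Optional<ResourceService> lookup() {
        return ServiceLoader.load(ResourceService.class).findFirst();
    }
}
```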

Using synchronized/locks in future code

We are building a web app with Scala, Play framework, and MongoDB (with ReactiveMongo as our driver). The application architecture is non-blocking end to end.
In some parts of our code, we need to access some non-thread-safe libraries such as Scala's parser combinators, Scala's reflection etc. We are currently enclosing such calls in synchronized blocks. I have two questions:
Are there any gotchas to look out for when using synchronized with future-y code?
Is it better to use locks (such as ReentrantLock) rather than synchronized, from both performance and usability standpoint?
This is an old question :) See, for example, using-actors-instead-of-synchronized. In short, it would be more advisable to use actors instead of locks:
class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who) ⇒ log.info("Hello " + who)
  }
}
Only one message will be processed at any given time, so you can put any non-thread-safe code you want in place of log.info and everything will work fine. By the way, using the ask pattern you can seamlessly integrate your actors into existing code that requires futures.
For me the main problem you will face is that any call to a synchronized or locked section of code may block, and thus stall the threads of the execution context. To avoid this, you can wrap any call to a potentially blocking method in scala.concurrent.blocking:
import scala.concurrent._
import ExecutionContext.Implicits.global

def synchronizedMethod(s: String) = synchronized { s.size }

val f = Future {
  println("Hello")
  val i = blocking { // adjust the execution context's behaviour
    synchronizedMethod("Hello")
  }
  println(i)
  i
}
Of course, it may be better to consider alternatives like thread-local variables or wrapping invocations of the serial code inside an actor.
Finally, I suggest using synchronized instead of locks. For most applications (especially if the critical sections are large), the performance difference is not noticeable.
The examples you mention, i.e. reflection and parsing, should be reasonably immutable, and you shouldn't need to lock; but if you're going to use locks, then a synchronized block will do. I don't think there's much of a performance difference between synchronized and Lock.
Well, I think the easiest and safest way would be (if at all you can) thread confinement:
i.e. each thread creates its own instance of the parser combinators etc. and then uses it.
And in case you do need synchronization (which should be avoided, since under traffic it will be the killer), synchronized or ReentrantLock will give almost the same performance. It again depends on which objects need to be guarded by which locks, etc. In a web application, it is discouraged unless absolutely necessary.
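A sketch of that thread-confinement idea in Java (NonThreadSafeParser is a stand-in for something like a parser-combinator instance):

```java
public class ParserPool {

    // Stand-in for a real non-thread-safe component.
    static class NonThreadSafeParser {
        int parse(String s) {
            return s.length();   // placeholder for real parsing work
        }
    }

    // Each thread lazily gets its own parser instance, so no locking is
    // needed; the trade-off is one instance per live thread.
    private static final ThreadLocal<NonThreadSafeParser> PARSER =
        ThreadLocal.withInitial(NonThreadSafeParser::new);

    public static int parse(String input) {
        return PARSER.get().parse(input);   // always this thread's own instance
    }
}
```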

Real-life use and explanation of the AtomicLongFieldUpdater class

Is anybody aware of any real-life use of the class AtomicLongFieldUpdater?
I have read the description but I have not quite grasped the meaning of it.
Why do I want to know that? Curiosity and for OCPJP preparation.
Thanks in advance.
You can think of a cost ladder for the following:
ordinary long: cheap, but unsafe for multi-threaded access
volatile long: more expensive, safe for multi-threaded access, atomic operations not possible
AtomicLong: most expensive, safe for multi-threaded access, atomic operations possible
(When I say 'unsafe' or 'not possible' I mean 'without an external mechanism like synchronization' of course.)
In the case where multi-threaded access is needed but most operations are simple reads or writes, with only a few atomic operations needed, you can create one static instance of AtomicLongFieldUpdater and use it when atomic updates are needed. The memory/runtime overhead is then similar to that of a simple volatile variable, except for the atomic operations, which are of the order of (or slightly more expensive than) the corresponding AtomicLong operations.
Here is a nice little tutorial.
The reason why you would use e.g. AtomicLongFieldUpdater over AtomicLong is simply to reduce the heap cost. Internally both work pretty much the same at the compareAndSet level, and both end up using sun.misc.Unsafe.
Consider a class that is instantiated 1000k times. With AtomicLong you'd create 1000k AtomicLong objects. With AtomicLongFieldUpdater, on the other hand, you'd create one constant AtomicLongFieldUpdater and 1000k long primitives, which of course needs far less heap space.
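A sketch of that layout (class and field names illustrative):

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

public class Counter {

    // One shared updater for the whole class (created once, reflectively)...
    private static final AtomicLongFieldUpdater<Counter> HITS =
        AtomicLongFieldUpdater.newUpdater(Counter.class, "hits");

    // ...and a plain volatile long per instance: 8 bytes instead of a
    // whole AtomicLong object for every instance.
    private volatile long hits;

    public long recordHit() {
        return HITS.incrementAndGet(this);   // atomic read-modify-write
    }

    public long hits() {
        return hits;                         // ordinary volatile read
    }
}
```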
Is anybody aware of any real-life use of the AtomicLongFieldUpdater class?
I've never used this class myself but in doing a get usage on my workspace I see a couple "real life" instances of its use:
com.google.common.util.concurrent.AtomicDouble uses it to atomically modify its internal volatile long field, which stores the bits of a double via Double.doubleToRawLongBits(...). Pretty cool.
net.sf.ehcache.Element uses it to atomically update the hitCount field.
I have read the description but I have not quite grasped the meaning of it.
It basically provides the same functionality as AtomicLong, but on a field of another class. The memory load of AtomicLongFieldUpdater is lower than that of AtomicLong in that you configure one updater instance per field, giving lower memory overhead but somewhat more CPU overhead (albeit maybe small) from the reflective field access.
The javadocs say:
This class is designed for use in atomic data structures in which several fields of the same node are independently subject to atomic updates.
Sure, but then I'd just use multiple Atomic* fields. Just about the only reason I'd use the class is if there was an existing class that I could not change but wanted to increment atomically.
Of course. I have been reading Alibaba Druid recently and found that AtomicLongFieldUpdater is used widely in this project.
// stats
private volatile long recycleErrorCount = 0L;
private volatile long connectErrorCount = 0L;

protected static final AtomicLongFieldUpdater<DruidDataSource> recycleErrorCountUpdater =
        AtomicLongFieldUpdater.newUpdater(DruidDataSource.class, "recycleErrorCount");
protected static final AtomicLongFieldUpdater<DruidDataSource> connectErrorCountUpdater =
        AtomicLongFieldUpdater.newUpdater(DruidDataSource.class, "connectErrorCount");
As defined above, the properties recycleErrorCount and connectErrorCount are used to count error occurrences.
Quite a lot of DataSource objects (the class that holds the properties above) may be created during an application's lifetime, in which case using AtomicLongFieldUpdater noticeably reduces heap consumption compared with AtomicLong.
Atomics are usually used in parallel programming.
Under the work-stealing mode, it only supports async, finish, forasync, isolated, and atomic variables.
You can view atomics as a safe protection from data races and the other problems you need to be concerned about in parallel programming.

In a class that has many instances, is it better to use synchronization, or an atomic variable for fields?

I am writing a class of which quite a few instances will be created. Multiple threads will be using these instances, so the getters and setters of the fields of the class have to be concurrent. The fields are mainly floats. The thing is, I don't know which is more resource-hungry: using a synchronized section, or making the variable something like an AtomicInteger?
You should favor atomic primitives when it is possible to do so. On many architectures, atomic primitives can perform a bit better because the instructions to update them can be executed entirely in user space; I think that synchronized blocks and Locks generally need some support from the operating system kernel to work.
Note my caveat: "when it is possible to do so". You can't use atomic primitives if your classes have operations that need to atomically update more than one field at a time. For example, if a class has to modify a collection and update a counter (for example), that can't be accomplished using atomic primitives alone, so you'd have to use synchronized or some Lock.
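To make that caveat concrete, a sketch mixing both styles (names illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

public class Stats {

    // A lone counter: an atomic primitive is enough, no lock needed.
    private final AtomicLong total = new AtomicLong();

    // Two fields that must change together: atomics alone cannot keep
    // them consistent, so both are guarded by the instance's monitor.
    private long sum;
    private long count;

    public void recordTotal(long v) {
        total.addAndGet(v);          // lock-free update
    }

    public synchronized void recordSample(long v) {
        sum += v;                    // this pair of updates must be
        count++;                     // atomic as a unit
    }

    public synchronized double average() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public long total() {
        return total.get();
    }
}
```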
The question already has an accepted answer, but as I'm not allowed to write comments yet, here we go. My answer is that it depends. If this is critical, measure. The JVM is quite good at optimizing synchronized access when there is no (or little) contention, making it much cheaper than if a real kernel mutex had to be used every time. Atomics basically use CAS retry loops: they try to make an atomic change and, if they fail, try again and again until they succeed. This can eat quite a bit of CPU if the resource is heavily contended by many threads.
With low contention atomics may well be the way to go, but in order to be sure try both and measure for your intended application.
I would probably start out with synchronized methods in order to keep the code simple; then measure and make the change to atomics if it makes a difference.
It is very important to construct the instances properly before they are used by multiple threads. Otherwise those threads may see incomplete or wrong data from partially constructed instances. My personal preference would be to use a synchronized block.
Or you can also follow the "lazy initialization holder class idiom" outlined by Brian Goetz in his book "Java Concurrency in Practice":
@ThreadSafe
public class ResourceFactory {
    private static class ResourceHolder {
        public static Resource resource = new Resource();
    }

    public static Resource getResource() {
        return ResourceHolder.resource;
    }
}
Here the JVM defers initializing the ResourceHolder class until it is actually used. And because Resource is initialized in a static initializer, no additional synchronization is needed.
Note: Statically initialized objects require no explicit synchronization either during construction or when being referenced. But if the object is mutable, synchronization is still required by both readers and writers to make subsequent modifications visible and also to avoid data corruption.
