Java - Thread Synchronization in a web app


I have a web app where I load components lazily. There is a lot of
static Bla bla;
...
if(bla == null)
bla = new Bla();
spread throughout the code. What do I need to do to make sure this is thread safe? Should I just wrap each of these initializations in a synchronized block? Is there any problem with doing that?

The best solution for lazy loading on a static field, as described in Effective Java [2nd edition, Item 71, p. 283] and Java Concurrency in Practice [p. 348], is the Initialization on demand holder idiom:
public class Something {
private Something() {
}
private static class LazyHolder {
private static final Something something = new Something();
}
public static Something getInstance() {
return LazyHolder.something;
}
}
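To make the laziness of this idiom visible, here is a minimal sketch. The class name HolderDemo and the CONSTRUCTED counter are assumptions added purely for illustration; they are not part of the idiom as described in the cited books.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: counts how often the instance is constructed.
final class HolderDemo {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    private HolderDemo() {
        CONSTRUCTED.incrementAndGet();
    }

    // The nested class is initialized only when getInstance() first touches it;
    // the JVM's class-initialization locking makes that step thread safe.
    private static final class LazyHolder {
        static final HolderDemo INSTANCE = new HolderDemo();
    }

    static HolderDemo getInstance() {
        return LazyHolder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println("constructed before first call: " + CONSTRUCTED.get());
        HolderDemo a = getInstance();
        HolderDemo b = getInstance();
        System.out.println("constructed after two calls: " + CONSTRUCTED.get());
        System.out.println("same instance: " + (a == b));
    }
}
```

Touching HolderDemo itself (for example reading CONSTRUCTED) does not initialize LazyHolder; only the first call to getInstance() does.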

Using a volatile variable correctly is tricky.
The pitfalls are described here:
http://www.ibm.com/developerworks/java/library/j-dcl.html
Example cited from the above link:
class Singleton
{
private Vector v;
private boolean inUse;
private static Singleton instance = new Singleton();
private Singleton()
{
v = new Vector();
inUse = true;
//...
}
public static Singleton getInstance()
{
return instance;
}
}
This will always work and is much clearer (to read and to understand) than double-checked locking and other approaches.

Assuming you are using Java 1.5 or later, you can do this:
private static volatile Helper helper = null;
public static Helper getHelper() {
if (helper == null) {
synchronized(Helper.class) {
if (helper == null)
helper = new Helper();
}
}
return helper;
}
That is guaranteed to be thread safe.
I recommend you read this to understand why the variable HAS to be volatile, and why the double check for null is actually needed: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

The lazy instantiation is only really a part of the problem. What about accessing these fields?
Typically in a J2EE application you avoid doing this kind of thing as much as you can so that you can isolate your code from any threading issues.
Perhaps if you expand on what kind of global state you want to keep, there are better ways to solve the problem.
That being said, to answer your question directly, you need to ensure that access to these fields is done synchronized, both reading and writing. Java 5 has better options than using synchronized in some cases. I suggest reading Java Concurrency in Practice to understand these issues.
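As one illustration of the Java 5 alternatives mentioned above, here is a minimal sketch built on java.util.concurrent.atomic.AtomicReference. The class name LazyRef and the Supplier-based factory are hypothetical choices for this example (Supplier requires Java 8), and the factory is assumed never to return null. Note the trade-off: under contention the factory may run more than once, but only one result is ever published.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper: lock-free lazy holder built on compareAndSet.
final class LazyRef<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();
    private final Supplier<T> factory;

    LazyRef(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T value = ref.get();
        if (value == null) {
            T candidate = factory.get();
            // Only the first thread wins; a losing thread discards its candidate
            // and returns the value that was published instead.
            if (ref.compareAndSet(null, candidate)) {
                value = candidate;
            } else {
                value = ref.get();
            }
        }
        return value;
    }
}
```

If the factory is expensive or has side effects, a lock-based approach or the holder idiom may be a better fit, since this sketch does not guarantee a single invocation.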

The best way is indeed to enclose everything in a synchronized block, and declare the variable volatile, as in:
private static volatile Bla bla;
// a static field needs a lock shared by all callers, e.g. the class object
synchronized (Bla.class) {
if (bla == null) bla = new Bla();
}
If you really need exactly one instance assigned to bla for the whole time your web application is running, keep in mind that the static keyword on a variable declaration only ensures there will be one per classloader that loads the class defining it.

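Because bla is static, it can be accessed from different instances of the containing class,
and code like
synchronized(this){...} does not defend against this, because each instance then locks on a different object. You must obtain a lock on the same object in all cases. Note that synchronized(bla){...} cannot serve here, since bla is still null before initialization and synchronizing on null throws a NullPointerException; lock on something that always exists, for example synchronized(Bla.class){...}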
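The difference between locking on an instance and locking on one shared object can be checked with Thread.holdsLock. This is only a sketch; LockIdentityDemo and SHARED_LOCK are hypothetical names introduced for the example.

```java
// Hypothetical class illustrating which monitor each form of synchronized acquires.
final class LockIdentityDemo {
    private static final Object SHARED_LOCK = new Object();

    // synchronized (this) acquires the monitor of this one instance only.
    boolean holdsOtherInstanceLock(LockIdentityDemo other) {
        synchronized (this) {
            return Thread.holdsLock(other);
        }
    }

    // Every instance acquires the very same monitor here.
    boolean holdsSharedLock() {
        synchronized (SHARED_LOCK) {
            return Thread.holdsLock(SHARED_LOCK);
        }
    }

    public static void main(String[] args) {
        LockIdentityDemo a = new LockIdentityDemo();
        LockIdentityDemo b = new LockIdentityDemo();
        System.out.println("a's lock covers b: " + a.holdsOtherInstanceLock(b));
        System.out.println("shared lock held: " + b.holdsSharedLock());
    }
}
```

Holding the monitor of instance a says nothing about instance b, so two threads working through different instances would not exclude each other when touching a static field.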

I'd ask why you think it's necessary to load these lazily. If it's a web app, and you know you need these objects, why would you not want to load them eagerly once the app started up?
Please explain the benefit that lazy loading is providing. If it's static, is there ever a possibility that you won't initialize these objects? If the answer is no, I'd challenge the design and recommend that you load eagerly on start-up.

Related

Is there a functional difference between initializing singleton in a getInstance() method, or in the instance variable definition

Is there any functional difference between these two ways of implementing a Singleton?
public class MySingleton {
private static MySingleton instance;
public static MySingleton getInstance() {
if (instance == null) {
instance = new MySingleton();
}
return instance;
}
}
public class MySingleton {
private static final MySingleton instance = new MySingleton();
public static MySingleton getInstance() {
return instance;
}
}
That is, besides the fact that the first way would allow for some sort of clearInstance() method (though you could just make instance non-final in the second version).
Does the first method technically perform better because it is only initialized the first time it is needed instead of when the program starts?

The first one is lazy loading and the second is eager loading. Your application may never call the singleton, so if creating the instance is a heavy, resource-consuming action, lazy loading is better because the instance is only created once it is needed.

The first method you use is not thread safe. I would consider it to be a bug.
The second method is simpler, thread safe, fast and, if you make sure the constructor won't throw silly exceptions, correct.
If you absolutely need more logic you can go with the first method, but you must make sure you protect it with a mutex. Something like:
public class MySingleton {
private static final Object mylock = new Object();
private static MySingleton instance;
public static MySingleton getInstance() {
synchronized(mylock) {
if (instance == null) {
instance = new MySingleton();
}
return instance;
}
}
}
Clearly the code is more complex, uses more memory, it's slower, you can't declare the variable as final...
Both methods will initialize the Singleton lazily. In Java, all static variable initializers and static initializer blocks are invoked when the class is first used, not at the start of the program. If your code path never invokes getInstance, the Singleton will never get initialized.
Personally, I avoid singletons, but when I use them it is always with an immediate allocation in the variable declaration.
Correction
I ran a few experiments, and it turned out that class initialization happened in parallel with the execution of the main thread. It didn't wait, as I believed it would. At least in a very simplified test scenario the initialization is eager, but asynchronous.
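For what it is worth, the Java Language Specification (section 12.4.1) specifies that a class is initialized on its first active use, by the thread that triggers that use, so the result of the experiment above may be an artifact of that particular test. A small single-threaded sketch for checking the order of events follows; InitOrderDemo, EagerSingleton and EVENTS are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical experiment: records when the eagerly declared instance is built.
final class InitOrderDemo {
    // Events are appended in the order they actually happen.
    static final List<String> EVENTS = new ArrayList<>();

    static final class EagerSingleton {
        private static final EagerSingleton INSTANCE = new EagerSingleton();

        private EagerSingleton() {
            EVENTS.add("constructed");
        }

        static EagerSingleton getInstance() {
            return INSTANCE;
        }
    }

    static List<String> run() {
        EVENTS.add("start");
        // First use of EagerSingleton: its static initializer runs here.
        EagerSingleton.getInstance();
        EVENTS.add("after getInstance");
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If initialization happened at program start, "constructed" would be recorded before "start"; with initialization on first use it is recorded after it.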

Is there any functional difference between these two ways of implementing a Singleton?
Yes. If you use an initializer in the variable declaration, then the instance is created when the class is initialized, even if the instance is never accessed. If you initialize it in the getInstance() method then the instance is only created if it is accessed. That has thread safety implications. It does not otherwise make much difference if initializing an instance is cheap and without lasting external side effects, but that may not always be the case.
Does the first method technically perform better because it is only
initialized the first time it is needed instead of when the program
starts?
If you are going to use an instance in any case then you are going to pay the cost of initializing it at some point no matter what, so there is no performance difference in that sense. However, a thread-safe version of the first method will be slightly more expensive than the second method on the first invocation, and you will pay that extra overhead again on every subsequent invocation.

It's about lazy initialization vs eager initialization. The difference is that in the first one the instance is not created until you call the getInstance() method, while in the second one it has already been created before you call getInstance().

From the unit testing point of view I prefer lazy instantiation. Given that the singleton's initialization has further side effects (which are irrelevant to the actual test), and you want to test a class which needs the singleton (maybe just one particular method), it's easier to mock the singleton and inject it into the instance variable while preparing the test. Using a mock for your singleton instance gives you easier control over what the singleton's methods return to your class under test.
The overhead of the thread safe instantiation can be minimized by the double checked lock pattern:
private static volatile MySingleton instance;
public static MySingleton getInstance() {
if (instance == null) {
synchronized ( MySingleton.class ) {
if (instance == null) {
instance = new MySingleton();
}
}
}
return instance;
}
Thus only in the rare situation where two (or more) threads access the singleton for the first time (and at the same time) may they enter the lock state. Afterwards the first "if null" check will return false and you never enter the lock state again.
Important: the member has to be declared volatile for this pattern to work reliably.
Note: the "double-checked lock" pattern has been shown to be unreliable when the field is not volatile or when running on a JVM older than Java 5. See especially Brian Goetz's article on the subject.

About Singleton pattern in java, instead of assigning static variable to new local variable in method, why not use directly static variable?

I don't understand the purpose of the line DataProvider instance = sInstance; in the method below. Can anyone explain it in detail? Why not use sInstance directly?
private static volatile DataProvider sInstance = null;
public static DataProvider getInstance() {
DataProvider instance = sInstance;
if (instance == null) {
synchronized (DataProvider.class) {
instance = sInstance;
if (instance == null) {
instance = sInstance = new DataProvider();
}
}
}
return instance;
}

It is used for lazy initialization (i.e. the singleton instance is only created when needed). The pattern is easy to get wrong: without the volatile modifier, or on a JVM older than Java 5, it is broken, because despite the synchronized block another thread can observe a partially constructed instance (a race condition). If you want to be safe, prefer a simpler approach.
Alternatives:
Using a direct assignment (like you suggested):
private static volatile DataProvider sInstance = new DataProvider();
Or using an enum (as suggested by @MadProgrammer):
public enum DataProvider
{
INSTANCE;
// singleton content
}

According to Effective Java, 2nd Edition (Joshua Bloch, Prentice Hall, May 2008):
In particular, the need for the local variable result may be unclear.
What this variable does is to ensure that field is read only once in
the common case where it’s already initialized. While not strictly
necessary, this may improve performance and is more elegant by the
standards applied to low-level concurrent programming. On my machine,
the method above is about 25 percent faster than the obvious version
without a local variable.

The main reason is volatile. As @Hien Nguyen's answer notes, the local variable improves performance by about 25%. A read of a volatile field must always observe the latest written value, so it cannot be kept in a register and is slower than a read of a local variable. Assigning instance = sInstance avoids reading the volatile field multiple times.
Without the temporary variable, sInstance is read up to three times, so using the temporary variable improves performance.
See this topic to understand why accessing a volatile is slow: Why access volatile variable is about 100 slower than member?
Your question may be the same as this topic: Java: using a local variable in double check idiom

Is there a way to explicitly enable/disable Java compiler to reorder instructions?

I am learning Java concurrency and know that the following singleton is not completely thread safe. A thread may get the instance before it is fully initialized because of instruction reordering. A correct way to prevent this potential problem is to use the volatile keyword.
public class DoubleCheckedLocking {
private static Instance instance;
public static Instance getInstance() {
if (instance == null) {
synchronized (DoubleCheckedLocking.class) {
if (instance == null)
instance = new Instance();
}
}
return instance;
}
}
I tried to reproduce the potential problem without volatile keyword and wrote a demo to show that using the above code may cause a NullPointerException in multithreading environment. But I failed to find a way to explicitly let the Java compiler perform instructions reordering and my demo with the above singleton always works pretty well without any problems.
So my question is how to explicitly enable/disable Java compiler to reorder instructions or how to reproduce the problem without using volatile keyword in a double-checked locking singleton?

The dangerous thing here is not necessarily that other threads may receive null as an answer from getInstance. The dangerous thing is that they may observe an instance which is not (yet) properly initialized.
To check this, add a few fields to your singleton, say:
class Singleton {
private List<Object> members;
private Singleton() {
members = new ArrayList<>();
members.addAll(queryMembers());
}
private Collection<Object> queryMembers() {
return Arrays.asList("Hello", 1, 2L, "world", new Object());
}
public int size() {
return members.size();
}
private static Singleton instance = null;
public static Singleton getInstance() {
if (instance == null) {
synchronized (Singleton.class) {
if (instance == null)
instance = new Singleton();
}
}
return instance;
}
}
This is called "unsafe publication". Other threads may see the singleton instance partially initialized (i.e., the members field may still be null, or the list may be empty, or only partially filled, or worse: in an inconsistent state due to an object just being added).
In the example code above, no external caller of size should ever see a value different from 5, right? I didn't try it, but I wouldn't be surprised if callers could observe different values when the timing is unlucky.
The reason for this is that the compiler is allowed to translate
instance = new Singleton();
into something along the lines of
instance = allocate_instance(Singleton.class); // pseudo-code
instance.<init>();
and thus, we have a window, in which instance is no longer null, but the actual object is not yet properly initialized.
The "Double-Checked Locking is Broken" Declaration gives an in-depth explanation of this.

This is an excerpt from the Java Concurrency in Practice book:
Debugging tip: For server applications, be sure to always specify the
-server JVM command line switch when invoking the JVM, even for development and testing. The server JVM performs more optimization
than the client JVM, such as hoisting variables out of a loop that are
not modified in the loop; code that might appear to work in the
development environment (client JVM) can break in the deployment
environment (server JVM). For example, had we "forgotten" to declare
the variable asleep as volatile in Listing 3.4, the server JVM could
hoist the test out of the loop (turning it into an infinite loop), but
the client JVM would not. An infinite loop that shows up in
development is far less costly than one that only shows up in
production.
So you can give it a try. But there is no 100% sure way of enabling reordering.

How to use synchronized blocks across classes?

I want to know how to use synchronized blocks across classes. What I mean is, I want to have synchronized blocks in more than one class but they're all synchronizing on the same object. The only way that I've thought of how to do this is like this:
//class 1
public static Object obj = new Object();
someMethod(){
synchronized(obj){
//code
}
}
//class 2
someMethod(){
synchronized(firstClass.obj){
//code
}
}
In this example I created an arbitrary Object to synchronize on in the first class, and in the second class also synchronized on it by statically referring to it. However, this seems like bad coding to me.
Is there a better way to achieve this?

Having a static object that is used as a lock typically is not desirable because only one thread at a time in the whole application can make progress. When you have multiple classes all sharing the same lock that's even worse, you can end up with a program that has little to no actual concurrency.
The reason Java has intrinsic locks on every object is so that objects can use synchronization to protect their own data. Threads call methods on the object, if the object needs to be protected from concurrent changes then you can add the synchronized keyword to the object's methods so that each calling thread must acquire the lock on that object before it can execute a method on it. That way calls to unrelated objects don't require the same lock and you have a better chance of having code actually run concurrently.
Locking shouldn't necessarily be your first go-to technique for concurrency. Actually there are a number of techniques you can use. In order of descending preference:
1) eliminate mutable state wherever possible; immutable objects and stateless functions are ideal because there's no state to protect and no locking required.
2) use thread-confinement where you can; if you can limit state to a single thread then you can avoid data races and memory visibility issues, and minimize the amount of locking.
3) use concurrency libraries and frameworks in preference to rolling your own objects with locking. Get acquainted with the classes in java.util.concurrent. These are a lot better written than anything an application developer can manage to throw together.
Once you've done as much as you can with 1, 2, and 3 above, then you can think about using locking (where locking includes options like ReentrantLock as well as intrinsic locking). Associating the lock with the object being protected minimizes the scope of the lock so that a thread doesn't hold the lock longer than it needs to.
Also if the locks aren't on the data being locked then if at some point you decide to use different locks rather than having everything lock on the same thing, then avoiding deadlocks may be challenging. Locking on the data structures that need protecting makes the locking behavior easier to reason about.
Advice to avoid intrinsic locks altogether may be premature optimization. First make sure you're locking on the right things no more than necessary.
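As an illustration of technique 3 above, here is a minimal sketch that leans on java.util.concurrent instead of hand-written locking. HitCounter and its method names are hypothetical, and merge and getOrDefault require Java 8.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical hit counter: no explicit locks, the map handles atomicity per key.
final class HitCounter {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    void record(String page) {
        // ConcurrentHashMap.merge performs the read-modify-write atomically for the key.
        hits.merge(page, 1, Integer::sum);
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }
}
```

Threads recording hits for different pages do not contend on a single global lock, which is the concurrency benefit the answer describes.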

OPTION 1:
A simpler way would be to create a separate singleton object using an enum or a static inner class, and then lock on it in both classes. It looks elegant:
// use any singleton object; at its simplest this could even be a unique string literal
public enum LockObj {
INSTANCE;
}
public class Class1 {
public void someMethod() {
synchronized (LockObj.INSTANCE) {
// some code
}
}
}
public class Class2 {
public void someMethod() {
synchronized (LockObj.INSTANCE) {
// some code
}
}
}
OPTION 2:
You can use any string literal, because the JVM interns literals so that each distinct one exists only once per JVM. The string must be unique to make sure no other code locks on the same literal. Don't use this option at all; it is shown only to clarify the concept.
public class Class1 {
public void someMethod() {
synchronized ("MyUniqueString") {
// some code
}
}
}
public class Class2 {
public void someMethod() {
synchronized ("MyUniqueString") {
// some code
}
}
}
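Why locking on a literal is fragile can be seen from a short sketch: any other class that happens to use the same literal shares the lock. The class names below are hypothetical and exist only for this illustration.

```java
// Two unrelated (hypothetical) classes that each appear to own a private lock.
final class LiteralLockA {
    static Object lock() {
        return "MyUniqueString";
    }
}

final class LiteralLockB {
    static Object lock() {
        return "MyUniqueString";
    }
}

final class LiteralLockDemo {
    public static void main(String[] args) {
        // String literals are interned, so both methods return the same object.
        System.out.println("same lock object: " + (LiteralLockA.lock() == LiteralLockB.lock()));
    }
}
```

A dedicated lock object, such as the enum in OPTION 1 or a private static final Object, avoids this accidental sharing.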

Your code seems valid to me, even if it does not look that nice. But please make the Object you are synchronizing on final.
However, there could be other considerations depending on your actual context.
In any case, you should clearly state in the Javadocs what you want to achieve.
Another approach is to sync on FirstClass, e.g.
synchronized (FirstClass.class) {
// do what you have to do
}
However, every static synchronized method in FirstClass locks on the same object as the synchronized block above, so they all exclude one another. Depending on the context, this may be better.
Under other circumstances, you might prefer a BlockingQueue implementation, for instance if what you really want is to serialize DB access or similar.
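One way to read the BlockingQueue suggestion is to hand all work to a single worker thread, so that callers never need a shared lock. The following is only a sketch under that assumption; SerialWorker, its methods and the STOP marker are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: callers enqueue tasks; one worker thread runs them serially.
final class SerialWorker {
    // Marker task that tells the worker to stop.
    private static final Runnable STOP = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain, "serial-worker");

    void start() {
        worker.start();
    }

    void submit(Runnable task) {
        queue.add(task);
    }

    // Enqueues the stop marker and waits until all earlier tasks have run.
    void shutdown() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void drain() {
        try {
            for (Runnable task = queue.take(); task != STOP; task = queue.take()) {
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because only the worker thread touches the protected resource, tasks run one at a time in submission order. In practice Executors.newSingleThreadExecutor() offers the same behavior ready-made.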

I think what you want to do is this: you have two worker classes that perform some operations on the same context object, and you want both worker classes to lock on that context object. If so, the following code will work for you.
public class Worker1 {
private final Context context;
public Worker1(Context context) {
this.context = context;
}
public void someMethod(){
synchronized (this.context){
// do your work here
}
}
}
public class Worker2 {
private final Context context;
public Worker2(Context context) {
this.context = context;
}
public void someMethod(){
synchronized (this.context){
// do your work here
}
}
}
public class Context {
public static void main(String[] args) {
Context context = new Context();
Worker1 worker1 = new Worker1(context);
Worker2 worker2 = new Worker2(context);
worker1.someMethod();
worker2.someMethod();
}
}

I think you are going the wrong way, using synchronized blocks at all. Since Java 1.5 there is the package java.util.concurrent which gives you high level control over synchronization issues.
There is for example the Semaphore class, which does some of the groundwork where you need only simple synchronization:
Semaphore s = new Semaphore(1);
s.acquire();
try {
// critical section
} finally {
s.release();
}
Even this simple class gives you a lot more than synchronized, for example the possibility of a tryAcquire(), which returns immediately whether or not a lock was obtained and leaves you the option to do non-critical work until the lock becomes available.
Using these classes also makes it clearer what purpose your objects have. While a generic monitor object might be misunderstood, a Semaphore is by default something associated with threading.
If you peek further into the concurrent package, you will find more specific synchronization classes like the ReentrantReadWriteLock, which lets you specify that there may be many concurrent read operations, while only write operations are actually synchronized against other reads and writes. You will find a Phaser, which allows you to synchronize threads such that specific tasks will be performed synchronously (sort of the opposite of synchronized), and also lots of data structures which might make synchronization unnecessary altogether in certain situations.
All in all: don't use plain synchronized at all unless you know exactly why or you are stuck with Java 1.4. It is hard to read and understand, and most probably you are implementing at least parts of the higher functions of Semaphore or Lock.
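The read/write split described above might look like the following minimal sketch. The Settings class and its methods are hypothetical; only ReentrantReadWriteLock and its readLock()/writeLock() methods come from java.util.concurrent.locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical settings store: many concurrent readers, exclusive writers.
final class Settings {
    private final Map<String, String> values = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    String get(String key) {
        // Any number of threads may hold the read lock at the same time.
        lock.readLock().lock();
        try {
            return values.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(String key, String value) {
        // The write lock excludes both readers and other writers.
        lock.writeLock().lock();
        try {
            values.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This pays off mainly when reads greatly outnumber writes; for a simple map, ConcurrentHashMap is usually the more direct choice.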

For your scenario, I suggest writing a helper class which returns the monitor object via a specific method. The method name itself defines the logical name of the lock object, which helps the readability of your code.
public class LockingSupport {
private static final LockingSupport INSTANCE = new LockingSupport();
private Object printLock = new Object();
// you may have different lock
private Object galaxyLock = new Object();
public static LockingSupport get() {
return INSTANCE;
}
public Object getPrintLock() {
return printLock;
}
public Object getGalaxyLock() {
return galaxyLock;
}
}
In your methods where you want to enforce the synchronization, you may ask the support to return the appropriate lock object as shown below.
public static void unsafeOperation() {
Object lock = LockingSupport.get().getPrintLock();
synchronized (lock) {
// perform your operation
}
}
public void unsafeOperation2() { //notice static modifier does not matter
Object lock = LockingSupport.get().getPrintLock();
synchronized (lock) {
// perform your operation
}
}
Below are a few advantages:
With this approach, you can use the method references to find all places where the shared lock is being used.
You can write more advanced logic to return different lock objects (e.g. based on the caller's class package, returning the same lock object for all classes of one package but a different lock object for classes of another package).
You can gradually upgrade the lock implementation to use the java.util.concurrent.locks.Lock APIs, as shown below.
E.g. (changing the lock object's type will not break existing code, though it is not a good idea to use a Lock object in synchronized (lock)):
public static void unsafeOperation2() {
Lock lock = LockingSupport.get().getGalaxyLock();
lock.lock();
try {
// perform your operation
} finally {
lock.unlock();
}
}
Hope it helps.

First of all, here are the issues with your current approach:
The lock object is not called lock or similar. (Yes ... a nitpick)
The variable is not final. If something accidentally (or deliberately) changes obj, your synchronization will break.
The variable is public. That means other code could cause problems by acquiring the lock.
I imagine that some of these effects are at the root of your critique: "this seems like bad coding to me".
To my mind, there are two fundamental problems here:
You have a leaky abstraction. Publishing the lock object outside of "class 1" in any way (as a public or package private variable OR via a getter) is exposing the locking mechanism. That should be avoided.
Using a single "global" lock means that you have a concurrency bottleneck.
The first problem can be addressed by abstracting out the locking. For example:
someMethod() {
Class1.doWithLock(() -> { /* code */ });
}
where doWithLock() is a static method that takes a Runnable or Callable or similar, and then runs it with an appropriate lock. The implementation of doWithLock() can use its own private static final Object lock ... or some other locking mechanism according to its specification.
The second problem is harder. Getting rid of a "global lock" typically requires either a re-think of the application architecture, or changing to different data structures that don't require an external lock.
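A doWithLock() along the lines described above could be sketched as follows. The field names, the counter and the second method callWithLock are assumptions for illustration; the value-returning variant has a different name because overloading doWithLock for both Runnable and Supplier would make expression lambdas ambiguous.

```java
import java.util.function.Supplier;

// Hypothetical version of "class 1": the lock object never leaves this class.
final class Class1 {
    private static final Object LOCK = new Object();
    private static int counter; // example state guarded by LOCK

    private Class1() { }

    // Runs the action while holding the private lock.
    static void doWithLock(Runnable action) {
        synchronized (LOCK) {
            action.run();
        }
    }

    // Same idea for actions that produce a result.
    static <T> T callWithLock(Supplier<T> action) {
        synchronized (LOCK) {
            return action.get();
        }
    }

    static void increment() {
        doWithLock(() -> { counter++; });
    }

    static int current() {
        return callWithLock(() -> counter);
    }
}
```

Callers in other classes write Class1.doWithLock(() -> { /* code */ }); and cannot acquire or replace the lock themselves, so the locking mechanism can later change without touching them.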

Ways for lazy "run once" initialization in Java with override from unit tests

I'm looking for a piece of code which behaves a bit like a singleton but isn't (because singletons are bad :) What I'm looking for must meet these goals:
Thread safe
Simple (Understand & use, i.e. few lines of code. Library calls are OK)
Fast
Not a singleton; for tests, it must be possible to overwrite the value (and reset it after the test).
Local (all necessary information must be in one place)
Lazy (run only when the value is actually needed).
Run once (code on RHS must be executed once and only once)
Example code:
private int i = runOnce(5); // Set i to 5
// Create the connection once and cache the result
private Connection db = runOnce(createDBConnection("DB_NAME"));
public void m() {
String greet = runOnce("World");
System.out.println("Hello, "+greet+"!");
}
Note that the fields are not static; only the RHS (right hand side) of the expression is ... well "static" to some degree. A test should be able to inject new values for i and greet temporarily.
Also note that this piece of code outlines how I intend to use this new code. Feel free to replace runOnce() with anything or move it to some other place (the constructor, maybe, or an init() method or a getter). But the less LOC, the better.
Some background information:
I'm not looking for Spring, I'm looking for a piece of code which can be used for the most common case: You need to implement an interface and there won't ever be a second implementation except for tests where you want to pass in mock objects. Also, Spring fails #2, #3 and #5: You need to learn the config language, you must set up the app context somewhere, it needs an XML parser and it's not local (information is spread all over the place).
A global config object or factory doesn't fit the bill because of #5.
static final is out because of #4 (can't change final). static smells because of classloader issues but you'll probably need it inside runOnce(). I'd just prefer to be able to avoid it in the LHS of the expression.
One possible solution might be to use ehcache with a default setup which would return the same object. Since I can put things in the cache, this would also allow to override the value at any time. But maybe there is a more compact/simple solution than ehcache (which again needs an XML config file, etc).
[EDIT] I'm wondering why so many people downvote this. It's a valid question and the use case is pretty common (at least in my code). So if you don't understand the question (or the reason behind it) or if you have no answer or you don't care, why downvote? :/
[EDIT2] If you look at the app context of Spring, you'll find that more than 99% of all beans have just a single implementation. You could have more but in practice, you simply don't. So instead of separating interface, implementation and configuration, I'm looking at something which has only an implementation (in the most simple case), a current() method and one or two lines of clever code to initialize the result for current() once (when it is called for the first time) but at the same times allows to override the result (thread safe, if possible). Think of it as an atomic "if(o==null) o = new O(); return o" where you can override the o. Maybe an AtomicRunOnceReference class is the solution.
Right now, I just feel that what we all have and use daily is not the optimum, that there is a baffling solution which will make us all slap our heads and say "that's it". Just as we felt when Spring came around a few years ago and we realized where all our singleton problems came from and how to solve them.

The best idiom for threadsafe initialization code (imho) is the lazy inner class. The classic version is
class Outer {
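// must be a static nested class: before Java 16, non-static inner classes cannot declare static members
private static class Inner {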
private final static SomeInterface SINGLETON;
static {
// create the SINGLETON
}
}
public SomeInterface getMyObject() {
return Inner.SINGLETON;
}
}
because it's threadsafe, lazy-loading and (imho) elegant.
Now you want testability and replaceability. It's hard to advise without knowing what exactly it is but the most obvious solution would be to use dependency injection, particularly if you're using Spring and have an application context anyway.
That way your "singleton"'s behaviour is represented by an interface and you simply inject one of those into your relevant classes (or a factory to produce one) and then you can of course replace it with whatever you like for testing purposes.

Here is a solution that fulfills all my requirements:
/** Lazy initialization of a field value based on the (correct)
* double checked locking idiom by Joshua Bloch
*
* <p>See "Effective Java, Second Edition", p. 283
*/
public abstract class LazyInit<T>
{
private volatile T field;
/** Return the value.
*
* <p>If the value is still <code>null</code>, the method will block and
* invoke <code>computeValue()</code>. Calls from other threads will wait
* until the call from the first thread completes.
*/
@edu.umd.cs.findbugs.annotations.SuppressWarnings("UG_SYNC_SET_UNSYNC_GET")
public T get ()
{
T result = field;
if (result == null) // First check (no locking)
{
synchronized (this)
{
result = field;
if (result == null) // Second check (with locking)
{
field = result = computeValue ();
}
}
}
return result;
}
protected abstract T computeValue ();
/** Setter for tests */
public synchronized void set (T value)
{
field = value;
}
public boolean hasValue()
{
return field != null;
}
}

You can use IoC techniques even if you don't use an IoC framework (Spring/Guice/...). And it is in my opinion the only clean way to avoid a Singleton.
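A minimal sketch of IoC by hand, using plain constructor injection, is shown below. All names (TimeSource, SystemTimeSource, Greeter) are hypothetical and chosen only for this example.

```java
// Hypothetical collaborator with a single production implementation.
interface TimeSource {
    long now();
}

final class SystemTimeSource implements TimeSource {
    @Override
    public long now() {
        return System.currentTimeMillis();
    }
}

final class Greeter {
    private final TimeSource timeSource;

    // Production code uses the default collaborator...
    Greeter() {
        this(new SystemTimeSource());
    }

    // ...while tests inject a replacement through this constructor.
    Greeter(TimeSource timeSource) {
        this.timeSource = timeSource;
    }

    String greet() {
        return "Hello at " + timeSource.now();
    }
}
```

A test can then write new Greeter(() -> 42L) and get a deterministic result, with no static state to reset afterwards.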

I think one solution you could use is providing a protected method which can be overridden in the test (a solution I've used before for testing legacy code).
So something like:
private SomeObject object;
protected SomeObject getObject() {
if (object == null) {
object = new SomeObject();
}
return object;
}
Then in your test class you can do:
public void setUp() {
MyClassUnderTest cut = new MyClassUnderTest() {
@Override
protected SomeObject getObject() {
return mockSomeObject;
}
};
}
I have to say I'm not overly keen on this pattern as it exposes a protected method where you might not really want one, but it's useful for getting you out of some situations where injection isn't an option.
