I was reading Effective Java and came across a passage where Joshua Bloch recommends something like:
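import java.util.Comparator;

// Comparator is an interface, so it is implemented rather than extended.
class MyComparator implements Comparator<String> {
    private MyComparator() {}
    // Public so that callers can reach the single shared instance.
    public static final MyComparator INSTANCE = new MyComparator();
    @Override
    public int compare(String s1, String s2) {
        // Omitted
    }
}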
MyComparator is stateless: it has no fields. Hence all instances of the class are functionally equivalent. Thus it should be a singleton to save on unnecessary object creation.
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields? Wouldn't this cause multithreading issues when compare is called from two threads in parallel? Or have I misunderstood something basic? Is it that every thread has autonomy of execution if no fields are shared?
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
I would dare to say yes. Having no fields makes a class stateless and, thus, immutable, which is always desirable in a multithreading environment.
Stateless objects are always thread-safe.
Immutable objects are always thread-safe.
An excerpt from Java Concurrency In Practice:
Since the actions of a thread accessing a stateless object cannot affect the correctness of operations in other threads, stateless objects are thread-safe.
Stateless objects are always thread-safe.
The fact that most servlets can be implemented with no state greatly reduces the burden of making servlets thread-safe. It is only when servlets want to remember things from one request to another that the thread-safety requirement becomes an issue.
...
An immutable object is one whose state cannot be changed after construction. Immutable objects are inherently
thread-safe; their invariants are established by the constructor, and if their state cannot be changed, these invariants
always hold.
Immutable objects are always thread-safe.
Immutable objects are simple. They can only be in one state, which is carefully controlled by the constructor. One of the
most difficult elements of program design is reasoning about the possible states of complex objects. Reasoning about
the state of immutable objects, on the other hand, is trivial.
Wouldn't this cause multithreading issues when compare is called from two threads in parallel?
No. Each thread has its own stack, where local variables (including method parameters) are stored. A thread's stack isn't shared, so there is no way for parallel calls to interfere with it.
Another good example would be a stateless servlet. One more extract from that great book.
@ThreadSafe
public class StatelessFactorizer implements Servlet {
public void service(ServletRequest req, ServletResponse resp) {
BigInteger i = extractFromRequest(req);
BigInteger[] factors = factor(i);
encodeIntoResponse(resp, factors);
}
}
StatelessFactorizer is, like most servlets, stateless: it has no fields and references no fields from other classes. The
transient state for a particular computation exists solely in local variables that are stored on the thread's stack and are
accessible only to the executing thread. One thread accessing a StatelessFactorizer cannot influence the result of
another thread accessing the same StatelessFactorizer; because the two threads do not share state, it is as if they
were accessing different instances.
Is it that every thread has autonomy of execution if no fields are shared?
Each thread has its own program counter, stack, and local variables. There is a term "thread confinement" and one of its forms is called "stack confinement".
Stack confinement is a special case of thread confinement in which an object can only be reached through local variables. Just as encapsulation can make it easier to preserve invariants, local variables can make it easier to confine objects to a thread. Local variables are intrinsically confined to the executing thread; they exist on the executing thread's stack, which is not accessible to other threads.
To read:
Java Concurrency In Practice
Thread Confinement
Stack Confinement using local object reference
Multithreading issues are caused by unwanted changes in state. If there is no state that is changed, there are no such issues. That is also why immutable objects are very convenient in a multithreaded environment.
In this particular case, the method only operates on the input parameters s1 and s2 and no state is kept.
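As a rough illustration of the point above, the sketch below shares one stateless comparator between several threads. LengthComparator and SharedComparatorDemo are hypothetical names, and ordering by length is only an assumption, since the original compare body was omitted. Each thread sorts its own copy of the list, so the only shared object is the comparator itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stateless comparator: no fields, so each call works only on its arguments.
final class LengthComparator implements Comparator<String> {
    static final LengthComparator INSTANCE = new LengthComparator();

    private LengthComparator() {}

    @Override
    public int compare(String s1, String s2) {
        return Integer.compare(s1.length(), s2.length());
    }
}

final class SharedComparatorDemo {
    // Sorts a private copy of the input on each thread, all using the same comparator instance.
    static List<List<String>> sortConcurrently(List<String> input, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    List<String> copy = new ArrayList<>(input); // data confined to this task
                    copy.sort(LengthComparator.INSTANCE);       // shared, but stateless
                    return copy;
                }));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sortConcurrently(Arrays.asList("ccc", "a", "bb"), 4));
    }
}
```

Every task should produce the same ordering, because nothing inside compare can be changed by another thread.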
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
"Always" is too strong a claim. It's easy to construct an artificial class where instances are not thread-safe despite having no fields:
public class NotThreadSafe {
private static final class MapHolder {
private static final Map<NotThreadSafe, StringBuilder> map =
// use ConcurrentHashMap so that different instances don't
// interfere with each other:
new ConcurrentHashMap<>();
}
private StringBuilder getMyStringBuilder() {
return MapHolder.map.computeIfAbsent(this, k -> new StringBuilder());
}
public void append(final Object s) {
getMyStringBuilder().append(s);
}
public String get() {
return getMyStringBuilder().toString();
}
}
... but that code is not realistic. If your instances don't have any mutable state, then they'll naturally be thread-safe; and in normal Java code, mutable state means instance fields.
MyComparator is stateless: it has no fields. Hence all instances of the class are functionally equivalent. Thus it should be a singleton to save on unnecessary object creation.
From that point of view, the "current day" answer is probably: make MyComparator an enum. The JVM guarantees that MyComparatorEnum.INSTANCE will be a true singleton, and you don't have to worry about the subtle details that you have to consider when building singletons "yourself".
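A minimal sketch of the enum approach might look like this. The ordering rule (by string length) and the EnumComparatorDemo helper are assumptions made for illustration; only the name MyComparatorEnum comes from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Single-element enum used as a singleton comparator.
// The comparison rule (by length) is an illustrative assumption.
enum MyComparatorEnum implements Comparator<String> {
    INSTANCE;

    @Override
    public int compare(String s1, String s2) {
        return Integer.compare(s1.length(), s2.length());
    }
}

final class EnumComparatorDemo {
    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sortedByLength(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(MyComparatorEnum.INSTANCE);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb")));
    }
}
```

The enum has no constructor to guard, and serialization and reflection cannot produce a second instance, which is the main reason this form is often preferred over a hand-written singleton.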
Explanation
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
It depends. Multithreading issues can only occur when one thread changes something while another thread is using it at the same time, since the other thread might not be aware of the change due to caching and other effects. They can also come from a pure logic bug, where the author did not consider that a thread can be interrupted in the middle of an operation.
So when a class is stateless, which you have here, it is absolutely safe to be used in a multi-threaded environment. Since there is nothing for any thread to change in the first place.
Note that this also means that a class is not allowed to use not-thread-safe stuff from elsewhere. So for example changing a field in some other class while another thread is using it.
Example
Here is a pretty classic example:
public class Value {
private int value;
public int getValue() {
return value;
}
public void increment() {
int current = value; // or just value++
value = current + 1;
}
}
Now, let's assume two threads call value.increment(). One thread gets interrupted after:
int current = value; // is 0
Then the other starts and fully executes increment. So
int current = value; // is 0
value = current + 1; // is 1
So value is now 1. Now the first thread continues, the expected outcome would be 2, but we get:
value = current + 1; // is 1
Its current was already computed before the second thread ran through, so it is still 0.
We also say that an operation (or method in this case) is not atomic. So it can be interrupted by the scheduler.
This issue can of course only happen because Value has a field value, so it has a changeable state.
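For comparison, here is one possible way to make the increment atomic, sketched with java.util.concurrent.atomic.AtomicInteger. AtomicValue and AtomicValueDemo are hypothetical names, and marking increment() as synchronized would be another valid repair.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Variant of Value whose increment is a single atomic read-modify-write,
// so no update can be lost between the read and the write.
final class AtomicValue {
    private final AtomicInteger value = new AtomicInteger();

    int getValue() {
        return value.get();
    }

    void increment() {
        value.incrementAndGet();
    }
}

final class AtomicValueDemo {
    // Two threads each call increment() perThread times on the same object.
    static int incrementFromTwoThreads(int perThread) throws InterruptedException {
        AtomicValue v = new AtomicValue();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                v.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join(); // join() also makes the threads' writes visible to the caller
        t2.join();
        return v.getValue();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(incrementFromTwoThreads(100_000));
    }
}
```

With the plain Value class from the example, the same experiment can return less than twice perThread, because some increments overwrite each other.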
YES. It is safe to create a static final object of a class if it has no fields. Here, the Comparator provides functionality only, through its compare(String, String) method.
In the multithreaded case, the compare method deals with local variables only (because it belongs to a stateless class), and local variables are not shared between threads; that is, each thread has its own copies of the two String references, so the threads do not interfere with each other.
Calling the compare method from two threads in parallel is safe (stack confinement). The parameters you pass to the method are stored on that thread's stack, which no other thread can access.
An immutable singleton is always recommended. Abstain from creating mutable singletons, as they introduce global state into your application, which is bad.
Edit: If the params passed are mutable object references, then you have to take special care to ensure thread safety.
Given:
A lazily initialized singleton class implemented with the double-checked locking pattern, with all the relevant volatile and synchronized stuff in getInstance. This singleton launches asynchronous operations via an ExecutorService,
There are seven types of tasks, each one identified by a unique key,
When a task is launched, it is stored in a cache based on ConcurrentHashMap,
When a client asks for a task: if the task in the cache is done, a new one is launched and cached; if it is running, the task is retrieved from the cache and passed to the client.
Here is an excerpt of the code:
private static volatile TaskLauncher instance;
private ExecutorService threadPool;
private ConcurrentHashMap<String, Future<Object>> tasksCache;
private TaskLauncher() {
threadPool = Executors.newFixedThreadPool(7);
tasksCache = new ConcurrentHashMap<String, Future<Object>>();
}
public static TaskLauncher getInstance() {
if (instance == null) {
synchronized (TaskLauncher.class) {
if (instance == null) {
instance = new TaskLauncher();
}
}
}
return instance;
}
public Future<Object> getTask(String key) {
Future<Object> expectedTask = tasksCache.get(key);
if (expectedTask == null || expectedTask.isDone()) {
synchronized (tasksCache) {
if (expectedTask == null || expectedTask.isDone()) {
// Make some stuff to create a new task
expectedTask = [...];
threadPool.execute(expectedTask);
tasksCache.put(key, expectedTask);
}
}
}
return expectedTask;
}
I got one major question, and another minor one:
Do I need to perform double-checked locking in my getTask method? I know ConcurrentHashMap is thread-safe for read operations, so my get(key) is thread-safe and may not need double-checked locking (but I am still quite unsure of this…). But what about the isDone() method of Future?
How do you choose the right lock object in a synchronized block? I know it must not be null, so I first use the TaskLauncher.class object in getInstance() and then the tasksCache, already initialized, in the getTask(String key) method. Does this choice actually matter?
Do I need to perform double-check locking control in my getTask method?
You don't need to do double-checked locking (DCL) here. (In fact, it is very rare that you need to use DCL. In 99.9% of cases, regular locking is just fine. Regular locking on a modern JVM is fast enough that the performance benefit of DCL is usually too small to make a noticeable difference.)
However, synchronization is necessary unless you declared tasksCache to be final. And if tasksCache is not final, then simple locking should be just fine.
I know ConcurrentHashMap is thread-safe for read operations ...
That's not the issue. The issue is whether reading the value of the tasksCache reference is going to give you the right value if the TaskLauncher is created and used on different threads. The thread-safety of fetching a reference from a variable is not affected one way or another by the thread-safety of the referenced object.
But what about the isDone() method of Future?
Again ... that has no bearing on whether or not you need to use DCL or other synchronization.
For the record, the memory semantics "contract" for Future is specified in the javadoc:
"Memory consistency effects: Actions taken by the asynchronous computation happen-before actions following the corresponding Future.get() in another thread."
In other words, no extra synchronization is required when you call get() on a (properly implemented) Future.
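The guarantee quoted above can be sketched as follows. FutureVisibilityDemo is a hypothetical name, and the shared array is deliberately a plain, non-volatile one so that the only ordering guarantee comes from Future.get().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FutureVisibilityDemo {
    static int computeAndRead() throws Exception {
        int[] box = new int[1]; // plain shared storage, no volatile and no lock
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The task writes to the array on the pool thread.
            Future<?> done = pool.submit(() -> { box[0] = 42; });
            // get() returns only after the task has completed, and the task's
            // writes happen-before whatever this thread does after get().
            done.get();
            return box[0];
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(computeAndRead());
    }
}
```

Note that the guarantee is attached to get(); reading box[0] without first calling get() would not be covered by it.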
How do you choose the right lock object in a synchronized block?
The locking serves to synchronize access to the variables read and written by different threads while holding the lock.
In theory, you could write your entire application to use just one lock. But if you did that, you would get the situation where one thread waits for another, despite the first thread not needing to use the variables that were used by the other one. So normal practice is to use a lock that is associated with the variables.
The other thing you need to ensure is that when two threads need to access the same set of variables, they use the same object (or objects) as locks. If they use different locks, then they don't achieve proper synchronization ...
(There are also issues about whether to lock on this or on a private lock, and about the order in which locks should be acquired. But these are beyond the scope of the question you asked.)
Those are the general "rules". To decide in a specific case, you need to understand precisely what you are trying to protect, and choose the lock accordingly.
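One way to avoid the hand-written check-then-act in getTask altogether, offered as a sketch and not as what this answer prescribes, is to let ConcurrentHashMap.compute perform the check and the replacement atomically per key. SimpleTaskLauncher and the work parameter are hypothetical; work stands in for the task-creation code that the question elides. Both fields are final, which addresses the visibility concern raised above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical simplified launcher, not the original TaskLauncher.
final class SimpleTaskLauncher {
    private final ExecutorService threadPool;
    private final ConcurrentHashMap<String, Future<Object>> tasksCache = new ConcurrentHashMap<>();

    SimpleTaskLauncher(ExecutorService threadPool) {
        this.threadPool = threadPool;
    }

    // Returns the cached task while it is still running; otherwise submits
    // `work` as a new task and caches it under the same key.
    Future<Object> getTask(String key, Callable<Object> work) {
        // compute() applies the function atomically for this key, so two
        // callers cannot both decide to launch a replacement task.
        return tasksCache.compute(key, (k, cached) ->
                (cached == null || cached.isDone()) ? threadPool.submit(work) : cached);
    }
}
```

The function passed to compute should stay short and must not modify the map itself; submitting a task to the executor satisfies both conditions as long as the pool accepts the task without blocking.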
The AbstractQueuedSynchronizer used inside FutureTask keeps the task's state in a volatile (thread-safe) variable, so you need not worry about the isDone() method.
private volatile int state;
The choice of lock object depends on the instance type and the situation.
Let's say you have multiple objects and they have synchronized blocks on TaskLauncher.class; then all the methods in all the instances will be synchronized by this single lock (use this if you want to share a single piece of shared memory across all the instances).
If each instance has its own memory shared between threads and methods, use this. Using this also saves you one extra lock object.
In your case you can use TaskLauncher.class, tasksCache, or this; they are all the same in terms of synchronization, because the class is a singleton.
I was reading a blog post that discusses when a singleton is not a singleton.
In one of the cases, the author shows how double-checked locking can also fail when used to implement a singleton.
// Double-checked locking -- don't use
public static MySingleton getInstance() {
if (_instance==null) {
synchronized (MySingleton.class) {
if (_instance==null) {
_instance = new MySingleton();
}
}
}
}
For the above code block, the author says:
"In this situation, we intend to avoid the expense of grabbing the lock of the singleton class every time the method is called. The lock is grabbed only if the singleton instance does not exist, and then the existence of the instance is checked again in case another thread passed the first check an instant before the current thread."
Can someone help me understand what exactly this means?
I'll try to talk it through.
The synchronized block takes time to enter, as it requires cross-thread coordination. We'll try to avoid entering it unless needed.
Now, if we are working with multiple threads and the object already exists, let's just return it, as its methods will synchronize themselves against race conditions internally. We can do this before entering the synchronized block: if it was created, it was created. Note that this unsynchronized read is only safe if _instance is declared volatile (Java 5 and later); otherwise another thread may see the reference before it sees the fully initialized object.
If the singleton object doesn't exist yet, we need to create one. But what if another thread created it while we were checking? We use synchronized to ensure that no other thread holds the lock. Once we enter, we check again. If the singleton was created by another thread, we return it, since it already exists. If we didn't do this, a thread could get its singleton and do something to it, and we'd just steamroller over its changes and effects.
If not, we create a new one while holding the lock and return it. By holding the lock, we also protect the singleton from the other side: another thread waits for the lock and, noticing that the instance has been created (per the inner null comparison), returns the existing one. If we didn't acquire the lock, both threads would steamroller over each other's changes.
Please note that the code block in your post is incomplete. It would need to return _instance if any of the null checks returned false, using else blocks.
Now, if we were in a single-threaded environment this would not have been important. We could just use:
public static MySingleton getInstance() {
    if (_instance == null) {
        _instance = new MySingleton();
    }
    // Return on every path, including the one that just created the instance.
    return _instance;
}
In newer versions, Java uses this behavior in many cases as part of its libraries, checking whether a lock is needed before taking the time to acquire it. Before, it either failed to acquire the lock (bad, data loss) or acquired it immediately (bad, more potential for slowdown and deadlock).
You should still implement this yourself in your own classes for thread safety.
He doesn't explain how it can fail in that quote. He is just explaining double-checked locking. He probably refers elsewhere to the fact that double-checked locking itself didn't work prior to Java 1.5. But that's a long time ago.
I found the best explanation of the different Singleton implementations, their flaws, and which one is best on Wikipedia. Follow this link:
http://en.wikipedia.org/wiki/Singleton_pattern
Hope it helps!
In the examples of out-of-order writes in double-checked locking scenarios (ref:
IBM article & Wikipedia article),
I could not understand the simple reason why Thread1 would come out of the synchronized block before the constructor has fully initialized the object. As per my understanding, creating the object with "new" and calling the constructor should execute in sequence, and the synchronized lock should not be released until all the work is completed.
Please let me know what I am missing here.
The constructor can have completed - but that doesn't mean that all the writes involved within that constructor have been made visible to other threads. The nasty situation is when the reference becomes visible to other threads (so they start using it) before the contents of the object become visible.
You might find Bill Pugh's article on it helps shed a little light, too.
Personally I just avoid double-checked locking like the plague, rather than trying to make it all work.
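For readers who also prefer to avoid double-checked locking, one commonly used alternative is the initialization-on-demand holder idiom. It is a different technique from the one discussed in the question, sketched here with the hypothetical name HolderSingleton. It relies on the JVM's guarantee that class initialization is thread-safe and happens only on first use.

```java
// Lazy singleton without explicit locking or volatile.
final class HolderSingleton {
    private HolderSingleton() {}

    // Holder is not initialized until getInstance() first touches it; the JVM
    // runs that initialization exactly once and publishes the result safely.
    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

Callers use HolderSingleton.getInstance() exactly as they would with the double-checked version.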
The code in question is here:
public static Singleton getInstance()
{
if (instance == null)
{
synchronized(Singleton.class) { //1
if (instance == null) //2
instance = new Singleton(); //3
}
}
return instance;
}
Now the problem with this cannot be understood as long as you keep thinking that the code executes in the order it is written. Even if it does, there is the issue of cache synchronization across multiple processors (or cores) in a Symmetrical Multiprocessing architecture, which is the mainstream today.
Thread1 could for example publish the instance reference to the main memory, but fail to publish any other data inside the Singleton object that was created. Thread2 will observe the object in an inconsistent state.
As long as Thread2 doesn't enter the synchronized block, the cache synchronization doesn't have to happen, so Thread2 can go on indefinitely without ever observing the Singleton in a consistent state.
Thread 2 checks to see if the instance is null when Thread 1 is at //3 .
public static Singleton getInstance()
{
if (instance == null)
{
synchronized(Singleton.class) { //1
if (instance == null) //2
instance = new Singleton(); //3
}
}
return instance;//4
}
At this point the memory for instance has been allocated from the heap and the pointer to it is stored in the instance reference, so the "if statement" executed by Thread 2 returns "false".
Note that because instance is not null when Thread 2 checks it, Thread 2 does not enter the synchronized block and instead returns a reference to a "fully constructed, but partially initialized, Singleton object."
There's a general problem with code not being executed in the order it's written. In Java, a thread is only obligated to be consistent with itself. An instance created on one line with new has to be ready to go on the next. There's no such obligation to other threads. For instance, if fieldA is 1 and fieldB is 2 going into this code on thread 1:
fieldA = 5;
fieldB = 10;
and thread 2 runs this code:
int x = fieldA;
int y = fieldB;
(x, y) values of (1, 2), (5, 2), and (5, 10) are all to be expected, but (1, 10), where fieldB was set and/or picked up before fieldA, is perfectly legal, and likely, as well. So double-checked locking is a special case of a more general problem, and if you work with multiple threads you need to be aware of it, particularly if they all access the same fields.
One simple solution from Java 1.5 that should be mentioned: fields marked volatile are guaranteed to be read from main memory immediately before being referenced and written immediately after. If fieldA and fieldB above were declared volatile, an x y value of 1 10 would not be possible. If instance is volatile, double-checked locking works. There's a cost to using volatile fields, but it's less than synchronizing, so the double-checked locking becomes a pretty good idea. It's an even better idea because it avoids having a bunch of threads waiting to synch while CPU cores are sitting idle.
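A minimal sketch of double-checked locking with a volatile field, valid from Java 5 on, could look like this. LazySingleton and its createdAt field are hypothetical; the field is there only to give the object some state that must be fully visible to every reader.

```java
final class LazySingleton {
    // volatile is what makes the unsynchronized first check safe.
    private static volatile LazySingleton instance;

    private final long createdAt; // example state set in the constructor

    private LazySingleton() {
        createdAt = System.nanoTime();
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;      // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;            // re-check while holding the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;        // volatile write publishes the finished object
                }
            }
        }
        return local;
    }

    long createdAt() {
        return createdAt;
    }
}
```

The local variable is an optional refinement that avoids a second volatile read when the instance already exists; the pattern is also correct if the method reads the field directly each time.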
But you do want to understand this (if you can't be talked out of multithreading). On the one hand you need to avoid timing problems and on the other avoid bringing your program to a halt with all the threads waiting to get into synch blocks. And it's very difficult to understand.
Consider this class, with no instance variables and only non-synchronized methods. Can we infer from this information that the class is thread-safe?
public class test{
public void test1{
// do something
}
public void test2{
// do something
}
public void test3{
// do something
}
}
It depends entirely on what state the methods mutate. If they mutate no shared state, they're thread safe. If they mutate only local state, they're thread-safe. If they only call methods that are thread-safe, they're thread-safe.
Not being thread safe means that if multiple threads try to access the object at the same time, something might change from one access to the next, and cause issues. Consider the following:
int incrementCount() {
this.count++;
// ... Do some other stuff
return this.count;
}
would not be thread safe. Why is it not? Imagine thread 1 accesses it, count is increased, then some processing occurs. While going through the function, another thread accesses it, increasing count again. The first thread, which had it go from, say, 1 to 2, would now have it go from 1 to 3 when it returns. Thread 2 would see it go from 1 to 3 as well, so what happened to 2?
In this case, you would want something like this (keeping in mind that this isn't any language-specific code, but closest to Java, one of only 2 I've done threading in)
int incrementCount() synchronized {
this.count++;
// ... Do some other stuff
return this.count;
}
The synchronized keyword here would make sure that as long as one thread is accessing it, no other threads could. This would mean that thread 1 hits it, count goes from 1 to 2, as expected. Thread 2 hits it while 1 is processing, so it has to wait until thread 1 is done. When it's done, thread 1 gets a return of 2, then thread 2 goes through and gets the expected 3.
Now, an example, similar to what you have there, that would be entirely thread-safe, no matter what:
int incrementCount(int count) {
count++;
// ... Do some other stuff
return count;
}
As the only variables being touched here are fully local to the function, there is no case where two threads accessing it at the same time could try working with data changed from the other. This would make it thread safe.
So, to answer the question, assuming that the functions don't modify anything outside of the specific called function, then yes, the class could be deemed to be thread-safe.
Consider the following quote from an article about thread safety ("Java theory and practice: Characterizing thread safety"):
In reality, any definition of thread safety is going to have a certain degree of circularity, as it must appeal to the class's specification -- which is an informal, prose description of what the class does, its side effects, which states are valid or invalid, invariants, preconditions, postconditions, and so on. (Constraints on an object's state imposed by the specification apply only to the externally visible state -- that which can be observed by calling its public methods and accessing its public fields -- rather than its internal state, which is what is actually represented in its private fields.)
Thread safety
For a class to be thread-safe, it first must behave correctly in a single-threaded environment. If a class is correctly implemented, which is another way of saying that it conforms to its specification, no sequence of operations (reads or writes of public fields and calls to public methods) on objects of that class should be able to put the object into an invalid state, observe the object to be in an invalid state, or violate any of the class's invariants, preconditions, or postconditions.
Furthermore, for a class to be thread-safe, it must continue to behave correctly, in the sense described above, when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, without any additional synchronization on the part of the calling code. The effect is that operations on a thread-safe object will appear to all threads to occur in a fixed, globally consistent order.
So your class itself is thread-safe, as long as it doesn't have any side effects. As soon as the methods mutate any external objects (e.g. some singletons, as already mentioned by others) it's not any longer thread-safe.
Depends on what happens inside those methods. If they manipulate / call any method parameters or global variables / singletons which are not themselves thread safe, the class is not thread safe either.
(Yes, I see that the methods as shown here have no parameters, but no brackets either, so this is obviously not full working code; it wouldn't even compile as is.)
Yes, as long as there are no instance variables. Method calls using only input parameters and local variables are inherently thread-safe. You might consider making the methods static too, to reflect this.
If it has no mutable state - it's thread safe. If you have no state - you're thread safe by association.
No, I don't think so.
For example, one of the methods could obtain a (non-thread-safe) singleton object from another class and mutate that object.
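A small sketch of that situation, with hypothetical names SharedLog and FieldlessWorker: the worker class has no fields at all, yet its unsafe method is not thread-safe, because it mutates a shared object that is not thread-safe itself. The guarded method shows one possible repair, which works only if every caller uses the same lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical non-thread-safe singleton living in another class.
final class SharedLog {
    static final SharedLog INSTANCE = new SharedLog();

    private final List<String> lines = new ArrayList<>(); // ArrayList is not thread-safe

    private SharedLog() {}

    void add(String line) {
        lines.add(line);
    }

    int size() {
        return lines.size();
    }
}

// No fields here, but the class still reaches shared mutable state.
final class FieldlessWorker {
    // Not thread-safe: concurrent calls race on SharedLog's internal list.
    void unsafe(String msg) {
        SharedLog.INSTANCE.add(msg);
    }

    // One possible repair: lock on the shared object before touching it.
    void guarded(String msg) {
        synchronized (SharedLog.INSTANCE) {
            SharedLog.INSTANCE.add(msg);
        }
    }
}
```

So "no fields" is a useful hint, but the real question is whether any state reachable from the methods is shared and mutable.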
Yes - this class is thread safe but this does not mean that your application is.
An application is thread safe if the threads in it cannot concurrently access heap state. All objects in Java (and therefore all of their fields) are created on the heap. So, if there are no fields in an object then it is thread safe.
In any practical application, objects will have state. If you can guarantee that these objects are not accessed concurrently then you have a thread safe application.
There are ways of optimizing access to shared state, e.g. atomic variables or careful use of the volatile keyword, but I think this is going beyond what you've asked.
I hope this helps.