Is it reasonable to synchronize on a local variable? - java

From the Java memory model, we know that every thread has its own thread stack, and that local variables are placed in each thread's own thread stack.
And that other threads can't access these local variables.
So in which case should we synchronize on local variables?

You are talking about the below case:
public class MyClass {
public void myMethod() {
//Assume Customer is a Class
Customer customer = getMyCustomer();
synchronized(customer) {
//only one thread at a time (whichever holds the lock on this customer object) can execute this block
}
}
}
In the above code, customer is a local reference variable, but you are still using a synchronized block to restrict access to the object customer is pointing to (by a single thread at a time).
In the Java memory model, objects live on the heap (even though the references to them may be local variables that live on a thread's stack), and synchronization restricts access to an object on the heap to one thread at a time, provided every thread that touches the object synchronizes on the same monitor.
In short, when you say local variable (non-primitive), only the reference is local, not the object itself: it refers to an object on the heap that may be reachable by many other threads. Because of this, you need to synchronize on the object so that only one thread at a time can access it.
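To make this concrete, here is a minimal sketch in which several threads each hold their own local reference to one shared object and synchronize on it. The Customer class, its addOrder method, and the SharedCustomerDemo wrapper are assumptions made up for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Customer class, assumed here only for illustration.
class Customer {
    private int orders = 0;

    void addOrder() { orders++; }      // read-modify-write, not atomic by itself
    int getOrders() { return orders; }
}

class SharedCustomerDemo {
    // One heap object, reachable by every thread that calls getMyCustomer().
    private final Customer shared = new Customer();

    Customer getMyCustomer() { return shared; }

    int run(int threads, int ordersPerThread) {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                Customer customer = getMyCustomer();   // local reference, shared object
                for (int j = 0; j < ordersPerThread; j++) {
                    synchronized (customer) {          // same monitor in every thread
                        customer.addOrder();
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();                              // join also makes the updates visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.getOrders();
    }
}
```

Calling new SharedCustomerDemo().run(4, 1000) should count every order, because each increment happens while holding the monitor of the one shared Customer. Without the synchronized block, updates could be lost.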

There are two situations:
The local variable is of a primitive type like int or double.
The local variable is of a reference type like ArrayList.
In the first situation, you can't synchronize, as you can only synchronize on Objects (which are pointed to by reference-type variables).
In the second situation, it all depends on what the local variable points to. If it points to an object that other threads (can) also point to, then you need to make sure that your code is properly synchronized.
Examples: you assigned the local variable from a static or instance field, or you got the object from a shared collection.
If, however, the object was created in your thread and only assigned to that local variable, and you never give out a reference to it from your thread to another thread, and the object's implementation itself also doesn't give out references, then you don't need to worry about synchronization.
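The last case (an object created in the method and never handed out) might look like the sketch below. ConfinedListDemo and sumOfSquares are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ConfinedListDemo {
    // The list is created here, used here, and never escapes, so only the
    // calling thread can ever reach it. No synchronization is needed even
    // though ArrayList itself is not thread-safe.
    static int sumOfSquares(int n) {
        List<Integer> squares = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            squares.add(i * i);
        }
        int sum = 0;
        for (int s : squares) {
            sum += s;
        }
        return sum;   // only a primitive value leaves the method
    }
}
```

Any number of threads can call sumOfSquares at the same time; each call works on its own list.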

The point is: synchronization is done for a purpose. You use it to ensure that exactly one thread can do some special protection-worthy activity at any given time.
Thus: if you need synchronization, it is always about more than one thread. And of course, then you need to lock on something that all those threads have access to.
Or in other words: there is no point in you locking the door in order to prevent yourself from entering the building.
But, as the other answer points out: it actually depends on the definition of "local" variable. Let's say you have:
void foo() {
final Object lock = new Object();
Thread a = new Thread(() -> { synchronized (lock) { /* uses lock */ } });
Thread b = new Thread(() -> { synchronized (lock) { /* uses lock */ } });
}
then sure, that "local" variable can be used as lock for those two threads. And beyond that: that example works because synchronization happens on the monitor of a specific object. And objects reside on the heap. All of them.
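A runnable version of that idea could look like the following sketch. The counter array and the loop count are assumptions added so that the effect of the lock can be observed.

```java
class LocalLockDemo {
    static int foo() {
        final Object lock = new Object();   // local variable, but the object lives on the heap
        final int[] counter = {0};          // shared mutable state captured by both threads

        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                synchronized (lock) {       // both threads contend for the same monitor
                    counter[0]++;
                }
            }
        };

        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

Here the lock is local to foo(), yet both threads reach it through the lambda, so locking on it is meaningful.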

Yes, it does make sense when the local variable is used to synchronize access to a block of code from threads that are defined and created in the same method as the local variable.

Related

Thread's copy of local variable

Threads have separate copy of local variables. I have a method in which a hashtable object is created. Will there be two different copies of hashtable objects for two different threads ?. the hashtable object is then passed to other method.
example method:
public void exampleMethod(String a,String b, String c)
{
final Hashtable<String,String> parameterMap=new Hashtable<String,String>();
parameterMap.put("key1",a);
parameterMap.put("key2",b);
parameterMap.put("key3",c);
pqrsObject.takeRequest(parameterMap);
}
The hashtable in your example is a local variable and will be created for every call to the method.
that is because every time you call your function the line
final Hashtable<String,String> parameterMap=new Hashtable<String,String>();
is executed, creating a new Hashtable and assigning it to parameterMap. I don't know what you need the Hashtable for, but if you need it outside the method you might want to create it as a field of the class or in some other way.
now to your question:
Threads have separate copy of local variables
Well, not only threads. A local variable has the scope of its declaration. When you leave that scope (in this case the function), the local variable is gone. The next call will create new variables, meaning every call to the function gets its own local variables, even if their values are the same.
Will there be two different copies of hashtable objects for two
different threads ?
I guess the answer is clear to you now: yes, there will be a different Hashtable for each thread, because every call to the method creates its own.
parameterMap, being local, will be on the stack, while new Hashtable<String,String>() creates the Hashtable object on the heap, pointed to by your local variable parameterMap. Threads run as lightweight processes in the same process address space: they share global data, but each has a separate stack and therefore separate local variables.
To share it, you may declare parameterMap within your class as an instance variable (if the instance is shared across threads) or as a static class variable.
Will there be two different copies of hashtable objects for two different threads ?.
Yes.
Each thread has its own stack created when you create the thread. That stack is not shared with other threads.
Each time you invoke that method, a local variable is created for that specific thread (the Hashtable itself is created on the heap, and the reference to it is kept on the stack).
For instance, if you call this method from 2 different threads, you'll end up with 2 Hashtables on the heap and 2 references, one on each thread's stack.
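This can be checked with a small sketch. The received queue is a hypothetical stand-in for pqrsObject.takeRequest(...), which the question does not show; it only records the maps it is handed so they can be compared afterwards.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class PerCallHashtableDemo {
    // Hypothetical stand-in for pqrsObject.takeRequest(...): records each map it is given.
    private final Queue<Map<String, String>> received = new ConcurrentLinkedQueue<>();

    public void exampleMethod(String a, String b, String c) {
        final Hashtable<String, String> parameterMap = new Hashtable<>();
        parameterMap.put("key1", a);
        parameterMap.put("key2", b);
        parameterMap.put("key3", c);
        received.add(parameterMap);
    }

    static boolean distinctMapsPerThread() {
        PerCallHashtableDemo demo = new PerCallHashtableDemo();
        Thread t1 = new Thread(() -> demo.exampleMethod("a", "b", "c"));
        Thread t2 = new Thread(() -> demo.exampleMethod("a", "b", "c"));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Map<String, String> first = demo.received.poll();
        Map<String, String> second = demo.received.poll();
        // Equal contents, but two separate objects on the heap.
        return first != null && second != null && first.equals(second) && first != second;
    }
}
```

Note that once a map is handed to another object, whether it stays private to the thread depends on what that object does with it.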

Thread Safe - final local method variable passed on to threads?

Will the following code cause same problems, if variable 'commonSet' of this method was instead a class level field. If it was a class level field, I'll have to wrap adding to set operation within a synchronized block as HashSet is not thread safe. Should I do the same in following code, since multiple threads are adding on to the set or even the current thread may go on to mutate the set.
public void threadCreatorFunction(final String[] args) {
final Set<String> commonSet = new HashSet<String>();
final Runnable runnable = new Runnable() {
@Override
public void run() {
while (true) {
commonSet.add(newValue());
}
}
};
new Thread(runnable, "T_A").start();
new Thread(runnable, "T_B").start();
}
The reference to 'commonSet' is 'locked' by using final. But multiple threads operating on it can still corrupt the values in the set (it may contain duplicates?). Secondly, my confusion is that since 'commonSet' is a method-level variable, the same reference will be on the stack memory of the calling method (threadCreatorFunction) and the stack memory of the run methods. Is this correct?
There are quite a few questions related to this:
Why do variables passed to runnable need to be final?
Why are only final variables accessible in anonymous class?
But, I cannot see them stressing on thread safe part of such sharing/passing of mutables.
No, this is absolutely not thread-safe. Having it in a final variable means that both threads will see the same reference, which is fine, but it doesn't make the object any more thread-safe.
Either you need to synchronize access, or use ConcurrentSkipListSet.
An interesting example.
The reference commonSet is thread safe and immutable. It is on the stack for the first thread and a field of your anonymous Runnable class as well. (You can see this in a debugger)
The set commonSet refers to is mutable and not thread safe. You need to use synchronized, or a Lock to make it thread safe. (Or use a thread safe collection instead)
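As one possible fix, the sketch below swaps the HashSet for a concurrent set obtained from ConcurrentHashMap.newKeySet() (Java 8 and later). The bounded loop and the Integer element type are assumptions made so that the example terminates and can be checked; the question's newValue() is not shown, so it is not used here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetDemo {
    static int fill(int valuesPerThread) {
        // Still a final local reference, but now the set itself is thread-safe.
        final Set<Integer> commonSet = ConcurrentHashMap.newKeySet();

        final Runnable runnable = () -> {
            for (int i = 0; i < valuesPerThread; i++) {
                commonSet.add(i);   // both threads add the same values concurrently
            }
        };

        Thread a = new Thread(runnable, "T_A");
        Thread b = new Thread(runnable, "T_B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return commonSet.size();   // each distinct value is stored exactly once
    }
}
```

Wrapping the original HashSet with Collections.synchronizedSet, or guarding every access with a synchronized block on one lock object, would be alternative approaches.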
I think you're missing a word in your first sentence:
Will the following code cause same problems if variable 'commonSet' of this method was a ??? instead a class level field.
I think you're a little bit confused though. The concurrency issues have nothing to do with whether or not the reference to your mutable data structure is declared final. You need to declare the reference as final because you're closing over it inside the anonymous inner class declaration for your Runnable. If you're actually going to have multiple threads reading/writing the data structure then you need to either use locks (synchronize) or use a concurrent data structure like java.util.concurrent.ConcurrentHashMap.
The commonSet is shared among two Threads. You have declared it as final and thus you made the reference immutable (you can not re-assign it), but the actual data inside the Set is still mutable. Suppose that one Thread puts some data in and some other Thread reads some data out. Whenever the first thread puts data in, you most probably want to lock that Set so that no other Thread could read until that data is written. Does that happen with a HashSet? Not really.
As others have already commented, you are mistaking some concepts, like final and synchronized.
I think that if you explain what you want to accomplish with your code, it would be much easier to help you. I've got the impression that this code snippet is more an example than the actual code.
Some questions: Why is the set defined inside the function? Should it be shared among threads? Something that puzzles me is that you create two threads with the same instance of the runnable
new Thread(runnable, "T_A").start();
new Thread(runnable, "T_B").start();
Whether commonSet is used by a single thread or by several, final makes only the reference immutable (i.e., once assigned, it cannot be pointed at another object); you can still modify the contents of the referenced object through that reference.
If it were not final, one thread could have initialized it again and changed the reference:
commonSet = new HashSet<String>();
commonSet.add(newValue());
in which case the two threads might end up using two different sets, which is probably not what you want.

ThreadLocal vs local variable in Runnable

Which one among ThreadLocal or a local variable in Runnable will be preferred? For performance reasons. I hope using a local variable will give more chances for cpu caching, etc.
Which one among ThreadLocal or a local variable in Runnable will be preferred.
If you have a variable that is declared inside the thread's class (or the Runnable) then a local variable will work and you don't need the ThreadLocal.
new Thread(new Runnable() {
// no need to make this a thread local because each thread already
// has their own copy of it
private SimpleDateFormat format = new SimpleDateFormat(...);
public void run() {
...
// this is allocated per thread so no thread-local
format.parse(...);
...
}
}).start();
On the other hand, ThreadLocals are used to save state on a per thread basis when you are executing common code. For example, the SimpleDateFormat is (unfortunately) not thread-safe so if you want to use it in code executed by multiple threads you would need to store one in a ThreadLocal so that each thread gets its own version of the format.
private final ThreadLocal<SimpleDateFormat> localFormat =
new ThreadLocal<SimpleDateFormat>() {
@Override
protected SimpleDateFormat initialValue() {
return new SimpleDateFormat(...);
}
};
...
// if a number of threads run this common code
SimpleDateFormat format = localFormat.get();
// now we are using the per-thread format (but we should be using Joda Time :-)
format.parse(...);
An example of when this is necessary is a web request handler. The threads are allocated up in Jetty land (for example) in some sort of pool that is outside of our control. A web request comes in which matches your path so Jetty calls your handler. You need to have a SimpleDateFormat object but because of its limitations, you have to create one per thread. That's when you need a ThreadLocal. When you are writing reentrant code that may be called by multiple threads and you want to store something per-thread.
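A compact, self-contained sketch of that per-thread formatter is shown below, using ThreadLocal.withInitial (available since Java 8). The "yyyy-MM-dd" pattern, Locale.ROOT, and the DateParsing class name are assumptions picked for the example, since the original elides the constructor arguments.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class DateParsing {
    // Each thread lazily gets its own SimpleDateFormat on first use.
    // The pattern and locale are assumptions for this sketch.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT));

    // Common code that may be called from many pool threads at once.
    static Date parse(String text) {
        try {
            return FORMAT.get().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + text, e);
        }
    }

    // Parses and re-formats with the calling thread's own formatter.
    static String roundTrip(String text) {
        return FORMAT.get().format(parse(text));
    }
}
```

On current Java versions, the immutable and thread-safe java.time.format.DateTimeFormatter would avoid the need for a ThreadLocal in this particular case.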
Instead, if you want to pass in arguments to your Runnable then you should create your own class and then you can access the constructor and pass in arguments.
new Thread(new MyRunnable("some important string")).start();
...
private static class MyRunnable implements Runnable {
private final String someImportantString;
public MyRunnable(String someImportantString) {
this.someImportantString = someImportantString;
}
// run by the thread
public void run() {
// use the someImportantString string here
...
}
}
Whenever your program could correctly use either of the two (ThreadLocal or local variable), choose the local variable: it will be more performant.
ThreadLocal is for storing per-thread state past the execution scope of a method. Obviously local variables can't persist past the scope of their declaration. If you needed them to, that's when you might start using a ThreadLocal.
Another option is using synchronized to manage access to a shared member variable. This is a complicated topic and I won't bother to go into it here as it's been explained and documented by more articulate people than me in other places. Obviously this is not a variant of "local" storage -- you'd be sharing access to a single resource in a thread-safe way.
I was also confused about why I need ThreadLocal when I can just use local variables, since they both maintain their state inside a thread. But after a lot of searching and experimenting I see why ThreadLocal is needed.
I found two uses so far -
Saving thread specific values inside the same shared object
Alternative to passing variables as parameters through N-layers of code
1:
If you have two threads operating on the same object and both threads modify this object - then both threads keep losing their modifications to each other.
To make this object have two separate states for each thread, we declare this object or part of it ThreadLocal.
Of course, ThreadLocal is only beneficial here because both threads are sharing the same object. If they are using different objects, there's no need for the objects to be ThreadLocal.
2:
The second benefit of ThreadLocal seems to be a side effect of how it's implemented.
A ThreadLocal variable can be .set() by a thread, and then be .get() anywhere else. .get() will retrieve the same value that this thread had set anywhere else. We'll need a globally available wrapper to do a .get() and .set(), to actually write down the code.
When we do a threadLocalVar.set(), it's as if the value is put inside some global "map", where the current thread is the key.
As if -
someGlobalMap.put(Thread.currentThread(),threadLocalVar);
So ten layers down, when we do threadLocalVar.get() - we get the value that this thread had set ten layers up.
threadLocalVar = someGlobalMap.get(Thread.currentThread());
So the function at tenth level does not have to lug around this variable as parameter, and can access it with a .get() without worrying about if it is from the right thread.
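The "set at the top, get many layers down" pattern can be sketched as follows. RequestContext, LayeredCallDemo, and the layer methods are hypothetical names invented for this example; the static holder plays the role of the globally available wrapper mentioned above.

```java
// Globally available wrapper around a ThreadLocal (hypothetical name).
class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static void set(String user) { CURRENT_USER.set(user); }
    static String get() { return CURRENT_USER.get(); }
    static void clear() { CURRENT_USER.remove(); }
}

class LayeredCallDemo {
    // Top layer: stores the value for the current thread, then calls down.
    static String handle(String user) {
        RequestContext.set(user);
        try {
            return layerOne();
        } finally {
            RequestContext.clear();   // avoid leaking the value into reused pool threads
        }
    }

    // Intermediate layer: takes no user parameter at all.
    private static String layerOne() {
        return layerTwo();
    }

    // Bottom layer: reads the value that this same thread stored above.
    private static String layerTwo() {
        return "hello " + RequestContext.get();
    }
}
```

Clearing the value in a finally block is a design choice that matters mostly when threads are pooled and reused.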
Lastly, since each thread gets its own copy of a ThreadLocal variable, it of course does not need synchronization. I earlier misunderstood ThreadLocal as an alternative to synchronization, which it is not. It is just a side effect that we don't need to synchronize access to this variable now.
Hope this has helped.
This question is answered by the simple rule that a variable should be declared in the smallest possible enclosing scope. A ThreadLocal is the largest possible enclosing scope so you should only use it for data that is needed across many lexical scopes. If it can be a local variable, it should be.

to ensure a java method is thread safe

Is it enough to use only local variables and no instance variables, thus only using memory on the stack (per thread)?
But what happens when you create a new MyObject that is local to the method? Doesn't the new object get created on the heap? Is it thread-safe because the reference to it is local (thread-safe)?
It is thread safe because if it is only referenced by variables in that particular method (it is, as you said, a local variable), then no other threads can possibly have a reference to the object, and therefore cannot change it.
Imagine you and I are pirates (threads). You go and bury your booty (the object) on an island (the heap), keeping a map to it (the reference). I happen to use the same island for burying my booty, but unless you give me your map, or I go digging all over the island (which isn't allowed on the island of Java), I can't mess with your stash.
Your new MyObject is thread-safe because each call to the method will create its own local instance on the heap. None of the calls refer to a common instance; if there are N calls, that means N instances of MyObject on the heap. When the method exits, each instance is eligible for GC as long as you don't return it to the caller.
Well, let me ask you a question: does limiting your method to local variables mean your method can't share a resource with another thread? If not, then obviously this isn't sufficient for thread safety in general.
If you're worried about whether another thread can modify an object you created in another thread, then the only thing you need to worry about is never leaking a reference to that object out of the thread. If you achieve that, your object will be in the heap, but no other thread will be able to reference it so it doesn't matter.
Edit
Regarding my first statement, here's a method with no instance variables:
public void methodA() {
File f = new File("/tmp/file");
//...
}
This doesn't mean there can't be a shared resource between two threads :-).
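The difference between keeping a locally created object confined and leaking it can be sketched like this. EscapeDemo and its REGISTRY list are hypothetical, introduced only to have somewhere for a reference to escape to.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class EscapeDemo {
    // Hypothetical shared registry: anything added here is reachable from every thread.
    static final List<StringBuilder> REGISTRY = new CopyOnWriteArrayList<>();

    // Confined: the builder is never handed out, only an immutable String is.
    static String confined() {
        StringBuilder sb = new StringBuilder();
        sb.append("only this thread sees me");
        return sb.toString();
    }

    // Leaked: the method still uses only local variables, but the object
    // they refer to is now visible to other threads through REGISTRY.
    static StringBuilder leaked() {
        StringBuilder sb = new StringBuilder("shared now");
        REGISTRY.add(sb);
        return sb;
    }
}
```

After leaked() returns, any thread that reads REGISTRY can mutate the StringBuilder, so access to it would need synchronization.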
There's no way for other threads to access such an object reference. But if that object is not thread-safe, then the overall thread-safety is compromised.
Consider for example that MyObject is a HashMap.
The argument that if it's in the heap, it's not thread-safe, is not valid. The heap is not accessible via pointer arithmetic, so it doesn't matter where the object is actually stored (besides ThreadLocals).

Use of Volatile variables for safe publication of Immutable objects

I came across this statement:
In properly constructed objects, all threads will see correct values of final fields, regardless of how the object is published.
Then why is a volatile variable used to safely publish an immutable object?
I'm really confused. Can anybody make it clear with a suitable example?
In this case, the volatility would only ensure visibility of the new object; any other threads that happened to get hold of your object via a non-volatile field would indeed see the correct values of final fields as per JSR-133's initialization safety guarantees.
Still, making the variable volatile doesn't hurt; is correct from a memory management perspective anyway; and would be necessary for non-final fields initialised in a constructor (although there shouldn't be any of these in an immutable object). If you wish to share variables between threads, you'll need to ensure adequate synchronization to give visibility anyway; though in this case you're right, that there's no danger to the atomicity of the constructor.
Thanks to Tom Hawtin for pointing out I'd completely overlooked the JMM guarantees on final fields; previous incorrect answer is given below.
The reason for the volatile variable is that it establishes a happens-before relationship (according to the Java Memory Model) between the construction of the object, and the assignment of the variable. This achieves two things:
Subsequent reads of that variable from different threads are guaranteed to see the new value. Without marking the variable as volatile, these threads could see stale values of the reference.
The happens-before relationship places limits on what reorderings the compiler can do. Without a volatile variable, the assignment to the variable could happen before the object's constructor runs - hence other threads could get a reference to the object before it was fully constructed.
Since one of the fundamental rules of immutable objects is that you don't publish references during the constructor, it's this second point that is likely being referenced here. In a multithreaded environment without proper concurrent handling, it is possible for a reference to the object to be "published" before that object has been constructed. Thus another thread could get that object, see that one of its fields is null, and then later see that this "immutable" object has changed.
Note that you don't have to use volatile fields to achieve this if you have other appropriate synchronization primitives - for example, if the assignment (and all later reads) are done in a synchronized block on a given monitor - but in a "standalone" sense, marking the variable as volatile is the easiest way to tell the JVM "this might be read by multiple threads, please make the assignment safe in that context."
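A minimal sketch of publishing an immutable object through a volatile field is given below. Endpoint and EndpointHolder are hypothetical classes made up for illustration.

```java
// Immutable value: all fields are final and set only in the constructor.
final class Endpoint {
    private final String host;
    private final int port;

    Endpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String host() { return host; }
    int port() { return port; }
}

class EndpointHolder {
    // A volatile write happens-before every later read of the same field,
    // so readers see the newest reference rather than a stale one.
    private volatile Endpoint current;

    void publish(Endpoint endpoint) {
        current = endpoint;
    }

    Endpoint current() {
        return current;
    }
}
```

The final fields guarantee that a reader never sees a half-built Endpoint; volatile adds the guarantee that a reader sees the most recently published one.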
A volatile reference to an immutable object could be useful. This would allow you to swap one object for another to make the new data available to other threads.
I would suggest you look at using AtomicReference first, however.
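Swapping one immutable object for another through an AtomicReference might look like this sketch. Range and RangeHolder are hypothetical names; updateAndGet is available since Java 8.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable pair of bounds; a change means creating a new Range.
final class Range {
    final int low;
    final int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }
}

class RangeHolder {
    private final AtomicReference<Range> ref = new AtomicReference<>(new Range(0, 10));

    // Atomically replaces the current Range with one that has a new lower bound.
    // The update function may be retried under contention, so it must be side-effect free.
    void setLow(int low) {
        ref.updateAndGet(r -> new Range(low, r.high));
    }

    Range get() {
        return ref.get();
    }
}
```

Compared with a plain volatile field, the AtomicReference also makes the read-then-replace step atomic, so two concurrent updates cannot overwrite each other based on a stale Range.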
If you need final volatile fields you have a problem. All fields, including final ones are available to other threads as soon as the constructor returns. So if you pass an object to another thread in the constructor, it is possible for the other thread to see an inconsistent state. IMHO you should consider a different solution so you don't have to do this.
You can't really see the difference in an immutable class. See the example below, in Myclass:
public static Foo getInstance(){
if(INSTANCE == null){
INSTANCE = new Foo();
}
return INSTANCE;
}
In the above code, if Foo is declared final (final Foo INSTANCE;), it guarantees that it won't publish references during the constructor call. Partial object construction is not possible.
Consider this: if this Myclass is immutable, its state is not going to change after object construction, making the volatile keyword (volatile final Foo INSTANCE;) redundant. But if this class allows its object state to be changed (not immutable), multiple threads CAN actually update the object and some updates may not be visible to other threads; hence the volatile keyword ensures safe publication of objects in a non-immutable class.
