Thread's copy of local variable - java

Threads have separate copies of local variables. I have a method in which a Hashtable object is created. Will there be two different Hashtable objects for two different threads? The Hashtable object is then passed to another method.
example method:
public void exampleMethod(String a, String b, String c)
{
    final Hashtable<String,String> parameterMap = new Hashtable<String,String>();
    parameterMap.put("key1", a);
    parameterMap.put("key2", b);
    parameterMap.put("key3", c);
    pqrsObject.takeRequest(parameterMap);
}

In your example, parameterMap is a local variable, and a new Hashtable is created on every call to the method.
That is because every time you call your method, the line
final Hashtable<String,String> parameterMap=new Hashtable<String,String>();
is executed, creating a new Hashtable and assigning it to parameterMap. I don't know what you need the Hashtable for, but if you need it outside the method you might want to declare it as a field of the class or provide it in some other way.
Now to your question:
Threads have separate copy of local variables
Well, not only threads. A local variable is scoped to its declaration. When execution leaves that scope (in this case, the method), the local variable ceases to exist. The next call creates new variables, meaning every call to the method has its own local variables, even if their values are the same.
Will there be two different copies of hashtable objects for two different threads ?
I guess the answer is clear to you now: yes, there will be a different Hashtable for each thread.

parameterMap, being local, lives on the stack, while new Hashtable<String,String>() creates the Hashtable object on the heap, referenced by your local variable parameterMap. Threads are lightweight processes that run in the same process address space and share global data, but each has its own stack and therefore its own local variables.
To share the map, declare parameterMap in your class as an instance variable (if the instance is shared across threads) or as a static class variable.

Will there be two different copies of hashtable objects for two different threads ?
Yes.
Each thread has its own stack created when you create the thread. That stack is not shared with other threads.
Each time you invoke that method, a local variable is created for that specific thread (the Hashtable itself is created on the heap, and the reference to it is kept on the stack).
For instance, if you call this method from 2 different threads, you'll end up with 2 Hashtable objects on the heap and 2 references, one on each thread's stack.
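A minimal sketch of that claim, under some assumptions: the class name LocalMapDemo, the CREATED registry and the helper twoThreadsGetDistinctMaps are made up for illustration, and the registry exists only so the two objects can be compared afterwards (real code would not publish its local map like this).

```java
import java.util.Hashtable;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class LocalMapDemo {
    // Demo-only registry so object identities can be compared afterwards.
    // Publishing the map like this is exactly what real code should avoid.
    static final Queue<Hashtable<String, String>> CREATED = new ConcurrentLinkedQueue<>();

    static void exampleMethod(String a, String b, String c) {
        // A fresh Hashtable is allocated on the heap for every call.
        final Hashtable<String, String> parameterMap = new Hashtable<>();
        parameterMap.put("key1", a);
        parameterMap.put("key2", b);
        parameterMap.put("key3", c);
        CREATED.add(parameterMap);
    }

    // Calls exampleMethod from two threads and reports whether they got different objects.
    static boolean twoThreadsGetDistinctMaps() {
        CREATED.clear();
        Thread t1 = new Thread(() -> exampleMethod("a1", "b1", "c1"));
        Thread t2 = new Thread(() -> exampleMethod("a2", "b2", "c2"));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Hashtable<String, String> first = CREATED.poll();
        Hashtable<String, String> second = CREATED.poll();
        // Reference comparison: two calls must have produced two separate objects.
        return first != null && second != null && first != second;
    }

    public static void main(String[] args) {
        System.out.println("distinct maps: " + twoThreadsGetDistinctMaps());
    }
}
```

The same would hold if both calls were made from a single thread: the object is per call, not per thread.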

Related

Is it reasonable to synchronize on a local variable?

From the Java memory model, we know that every thread has its own thread stack, and that local variables are placed in each thread's own thread stack.
And that other threads can't access these local variables.
So in which case should we synchronize on local variables?
You are talking about the below case:
public class MyClass {
    public void myMethod() {
        // Assume Customer is a class
        Customer customer = getMyCustomer();
        synchronized (customer) {
            // only one thread at a time (whichever holds the lock)
            // can access the customer object here
        }
    }
}
In the above code, customer is a local reference variable, but you are still using a synchronized block to restrict access to the object customer is pointing to (by a single thread at a time).
In the Java memory model, objects live on the heap (even though the references to them are local to a thread and live on its stack), and synchronization is all about restricting access to an object on the heap to exactly one thread at a time.
In short, when you say local variable (non-primitive), only the reference is local, not the object itself: the variable refers to an object on the heap that may be accessed by many other threads. Because of this, you need to synchronize on the object so that only a single thread can access it at a time.
There are two situations:
The local variable is of a primitive type like int or double.
The local variable is of a reference type like ArrayList.
In the first situation, you can't synchronize, as you can only synchronize on Objects (which are pointed to by reference-type variables).
In the second situation, it all depends on what the local variable points to. If it points to an object that other threads (can) also point to, then you need to make sure that your code is properly synchronized.
Examples: you assigned the local variable from a static or instance field, or you got the object from a shared collection.
If, however, the object was created in your thread and only assigned to that local variable, and you never give out a reference to it from your thread to another thread, and the object's implementation itself also doesn't give out references, then you don't need to worry about synchronization.
The point is: synchronization is done for a purpose. You use it to ensure that exactly one thread can do some special protection-worthy activity at any given time.
Thus: if you need synchronization, it is always about more than one thread. And of course, then you need to lock on something that all those threads have access to.
Or in other words: there is no point in you locking the door in order to prevent yourself from entering the building.
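But, as the other answer points out: it actually depends on the definition of "local" variable. Let's say you have:
void foo() {
    final Object lock = new Object();
    // both threads capture the same lock object
    Thread a = new Thread(() -> { synchronized (lock) { /* uses lock */ } });
    Thread b = new Thread(() -> { synchronized (lock) { /* uses lock */ } });
    a.start();
    b.start();
}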
then sure, that "local" variable can be used as lock for those two threads. And beyond that: that example works because synchronization happens on the monitor of a specific object. And objects reside on the heap. All of them.
Yes, it does make sense when the local variable is used to synchronize access to a block of code from threads that are defined and created in the same method as the local variable.
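To make that last case concrete, here is a small sketch, assuming a made-up class LocalLockDemo and helper countWithLocalLock. The lock is a local variable, but the object it refers to lives on the heap and is captured by both threads, so synchronizing on it does real work.

```java
class LocalLockDemo {
    // Two threads increment a shared counter, guarded by a lock that is
    // declared as a local variable of this method.
    static int countWithLocalLock(int perThread) {
        final Object lock = new Object();   // local reference, heap object
        final int[] counter = {0};          // shared mutable state captured by both threads

        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                synchronized (lock) {       // same monitor for both threads
                    counter[0]++;
                }
            }
        };

        Thread a = new Thread(work, "a");
        Thread b = new Thread(work, "b");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // join() makes the final counter value visible to the calling thread.
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLocalLock(10_000));
    }
}
```

Without the synchronized block the increments could interleave and the total might come out lower than twice perThread.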

non static, private variable value is somehow shared/overwritten in a multi thread run

See if you guys could solve this. It is driving me insane.
I have 2 instances of a Class which has private instance File variables (NOT static, NOT volatile)
private File tmpF;
Each instance was then executed in a different thread of the same pool.
Instances 1 and 2 both create a temp file and assign it to their own File variable (NOT static). I called
tmpF = File.createTempFile("myTempFile" + unique_Id, null)
Right before the temp file creation, I debugged using IntelliJ IDEA and verified that each thread has a different unique_Id.
Here is what is driving me insane. When the later thread created a temp file and assigned it to its own tmpF variable, the earlier thread's tmpF value changed to the later thread's tmpF value. How is this possible when tmpF is NOT static ???
When I tried changing the variable into a local variable of the method, the problem disappeared... so it definitely has something to do with the fact that it is a class field. Interestingly, adding synchronized doesn't help either.
The problem sounds like you are sharing mutable data between threads, which ought to be avoided in concurrent environments, as per Brian Goetz's book, Java Concurrency in Practice. You have a few different options, depending on your restrictions.
If your class instances are really meant to be local to a single thread, try refactoring your field so that it is final (i.e. private final File tmpF;), ensuring that it is assigned exactly once. The file could be injected from a factory class.
If your class has a single instance that is shared between threads and you really need each thread to use its own file, try using Java's ThreadLocal class.
Hope that helps.
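A rough sketch of the first option, assuming a made-up class TempFileTask (the original class is not shown in the question, so this is not a reconstruction of it). Each instance gets its file once, in the constructor, and the field can never be reassigned afterwards.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class TempFileTask implements Runnable {
    private final File tmpF;        // assigned exactly once, per instance
    private String pathSeenByRun;   // what run() observed, for the demo check only

    TempFileTask(String uniqueId) {
        File f;
        try {
            f = File.createTempFile("myTempFile" + uniqueId, ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        f.deleteOnExit();
        this.tmpF = f;
    }

    @Override
    public void run() {
        // Each instance only ever sees its own file.
        pathSeenByRun = tmpF.getAbsolutePath();
    }

    // Runs two separate instances on two threads and checks that the files stay distinct.
    static boolean twoInstancesKeepSeparateFiles() {
        TempFileTask first = new TempFileTask("1");
        TempFileTask second = new TempFileTask("2");
        Thread t1 = new Thread(first);
        Thread t2 = new Thread(second);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return first.pathSeenByRun.equals(first.tmpF.getAbsolutePath())
                && second.pathSeenByRun.equals(second.tmpF.getAbsolutePath())
                && !first.pathSeenByRun.equals(second.pathSeenByRun);
    }

    public static void main(String[] args) {
        System.out.println("separate files: " + twoInstancesKeepSeparateFiles());
    }
}
```

If the symptom persists with a final field, the two threads are most likely running the same instance rather than two different ones.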

Thread Safe - final local method variable passed on to threads?

Will the following code cause same problems, if variable 'commonSet' of this method was instead a class level field. If it were a class-level field, I would have to wrap the add operation in a synchronized block, since HashSet is not thread-safe. Should I do the same in the following code, since multiple threads add to the set and even the current thread may go on to mutate it?
public void threadCreatorFunction(final String[] args) {
    final Set<String> commonSet = new HashSet<String>();
    final Runnable runnable = new Runnable() {
        @Override
        public void run() {
            while (true) {
                commonSet.add(newValue());
            }
        }
    };
    new Thread(runnable, "T_A").start();
    new Thread(runnable, "T_B").start();
}
The reference to 'commonSet' is 'locked' by using final. But multiple threads operating on it can still corrupt the values in the set (it may contain duplicates?). Secondly, my confusion is this: since 'commonSet' is a method-level variable, the same reference will be on the stack of the calling method (threadCreatorFunction) and on the stacks of the run methods. Is this correct?
There are quite a few questions related to this:
Why do variables passed to runnable need to be final?
Why are only final variables accessible in anonymous class?
But I cannot see them stressing the thread-safety aspect of sharing or passing mutable objects this way.
No, this is absolutely not thread-safe. Having it in a final variable just means that both threads will see the same reference, which is fine, but it doesn't make the object any more thread-safe.
Either you need to synchronize access, or use ConcurrentSkipListSet.
An interesting example.
The reference commonSet is thread safe and immutable. It is on the stack for the first thread and a field of your anonymous Runnable class as well. (You can see this in a debugger)
The set commonSet refers to is mutable and not thread safe. You need to use synchronized, or a Lock to make it thread safe. (Or use a thread safe collection instead)
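One way the "thread-safe collection" option might look, as a sketch only: the class SharedSetDemo and the helper fillFromTwoThreads are invented for this example, and ConcurrentHashMap.newKeySet() (Java 8+) is just one possible choice next to Collections.synchronizedSet or ConcurrentSkipListSet.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SharedSetDemo {
    // Two threads add distinct values to one shared, thread-safe set.
    static int fillFromTwoThreads(int perThread) {
        // final only fixes the reference; thread safety comes from the set implementation.
        final Set<String> commonSet = ConcurrentHashMap.newKeySet();

        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                commonSet.add("A-" + i);
            }
        }, "T_A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                commonSet.add("B-" + i);
            }
        }, "T_B");

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return commonSet.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(5_000));
    }
}
```

With a plain HashSet in place of the concurrent set, the same code could lose entries or corrupt the set's internal state.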
I think you're missing a word in your first sentence:
Will the following code cause same problems if variable 'commonSet' of this method was a ??? instead a class level field.
I think you're a little bit confused though. The concurrency issues have nothing to do with whether or not the reference to your mutable data structure is declared final. You need to declare the reference as final because you're closing over it inside the anonymous inner class declaration for your Runnable. If you're actually going to have multiple threads reading and writing the data structure, then you need to either use locks (synchronized) or use a concurrent data structure like java.util.concurrent.ConcurrentHashMap.
The commonSet is shared among two Threads. You have declared it as final and thus you made the reference immutable (you can not re-assign it), but the actual data inside the Set is still mutable. Suppose that one Thread puts some data in and some other Thread reads some data out. Whenever the first thread puts data in, you most probably want to lock that Set so that no other Thread could read until that data is written. Does that happen with a HashSet? Not really.
As others have already commented, you are mistaking some concepts, like final and synchronized.
I think that if you explained what you want to accomplish with your code, it would be much easier to help you. I've got the impression that this code snippet is more of an example than the actual code.
Some questions: Why is the set defined inside the function? Should it be shared among threads? Something that puzzles me is that you create two threads with the same instance of the runnable
new Thread(runnable, "T_A").start();
new Thread(runnable, "T_B").start();
Whether commonSet is used by a single thread or by several, final makes only the reference immutable (once assigned, it cannot be pointed at another object); you can still modify the contents of the object through that reference.
If it were not final, one thread could have reassigned it and changed the reference:
commonSet = new HashSet<String>();
commonSet.add(newValue());
in which case the two threads may end up using two different sets, which is probably not what you want.

ThreadLocal vs local variable in Runnable

Which one among ThreadLocal or a local variable in Runnable will be preferred? For performance reasons. I hope using a local variable will give more chances for cpu caching, etc.
Which one among ThreadLocal or a local variable in Runnable will be preferred.
If you have a variable that is declared inside the thread's class (or the Runnable) then a local variable will work and you don't need the ThreadLocal.
new Thread(new Runnable() {
    // no need to make this a thread local because each thread already
    // has its own copy of it
    private SimpleDateFormat format = new SimpleDateFormat(...);
    public void run() {
        ...
        // this is allocated per thread so no thread-local
        format.parse(...);
        ...
    }
}).start();
On the other hand, ThreadLocals are used to save state on a per-thread basis when you are executing common code. For example, SimpleDateFormat is (unfortunately) not thread-safe, so if you want to use it in code executed by multiple threads you would need to store one in a ThreadLocal so that each thread gets its own version of the format.
private final ThreadLocal<SimpleDateFormat> localFormat =
    new ThreadLocal<SimpleDateFormat>() {
        @Override
        protected SimpleDateFormat initialValue() {
            return new SimpleDateFormat(...);
        }
    };
...
// if a number of threads run this common code
SimpleDateFormat format = localFormat.get();
// now we are using the per-thread format (but we should be using Joda Time :-)
format.parse(...);
An example of when this is necessary is a web request handler. The threads are allocated up in Jetty land (for example), in some sort of pool that is outside of our control. A web request comes in that matches your path, so Jetty calls your handler. You need a SimpleDateFormat object, but because of its limitations you have to create one per thread. That's when you need a ThreadLocal: when you are writing reentrant code that may be called by multiple threads and you want to store something per thread.
If instead you want to pass arguments to your Runnable, then you should create your own class so that you can pass the arguments in through its constructor.
new Thread(new MyRunnable("some important string")).start();
...
private static class MyRunnable implements Runnable {
    private final String someImportantString;
    public MyRunnable(String someImportantString) {
        this.someImportantString = someImportantString;
    }
    // run by the thread
    @Override
    public void run() {
        // use the someImportantString string here
        ...
    }
}
Whenever your program could correctly use either of the two (ThreadLocal or local variable), choose the local variable: it will be more performant.
ThreadLocal is for storing per-thread state past the execution scope of a method. Obviously local variables can't persist past the scope of their declaration. If you needed them to, that's when you might start using a ThreadLocal.
Another option is using synchronized to manage access to a shared member variable. This is a complicated topic and I won't bother to go into it here as it's been explained and documented by more articulate people than me in other places. Obviously this is not a variant of "local" storage -- you'd be sharing access to a single resource in a thread-safe way.
I was also confused about why I would need ThreadLocal when I can just use local variables, since they both maintain their state inside a thread. But after a lot of searching and experimenting, I see why ThreadLocal is needed.
I found two uses so far -
Saving thread specific values inside the same shared object
Alternative to passing variables as parameters through N-layers of code
1:
If you have two threads operating on the same object and both threads modify this object - then both threads keep losing their modifications to each other.
To make this object have two separate states for each thread, we declare this object or part of it ThreadLocal.
Of course, ThreadLocal is only beneficial here because both threads are sharing the same object. If they are using different objects, there's no need for the objects to be ThreadLocal.
2:
The second benefit of ThreadLocal seems to be a side effect of how it's implemented.
A ThreadLocal variable can be .set() by a thread and then be .get() anywhere else. .get() will retrieve the same value that this thread set elsewhere. To actually write the code, we need a globally available wrapper through which to call .get() and .set().
When we do a threadLocalVar.set(), it's as if the value is put inside some global "map" where the current thread is the key.
As if -
someGlobalMap.put(Thread.currentThread(),threadLocalVar);
So ten layers down, when we do threadLocalVar.get() - we get the value that this thread had set ten layers up.
threadLocalVar = someGlobalMap.get(Thread.currentThread());
So the function at tenth level does not have to lug around this variable as parameter, and can access it with a .get() without worrying about if it is from the right thread.
Lastly, since a ThreadLocal variable holds a separate copy for each thread, it of course does not need synchronization. I earlier misunderstood ThreadLocal as an alternative to synchronization, which it is not. Not having to synchronize access to this variable is just a side effect.
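Here is a small sketch of that second use, with invented names (ThreadLocalContextDemo, CURRENT_USER, handleRequest and the layer methods are all hypothetical). The value is set at the top of the call chain and read at the bottom without being passed as a parameter, and each thread sees only its own value.

```java
class ThreadLocalContextDemo {
    // Globally reachable holder; each thread sees only the value it set itself.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Top layer: stores the per-thread value, then calls down the chain.
    static String handleRequest(String user) {
        CURRENT_USER.set(user);
        try {
            return middleLayer();
        } finally {
            // Clean up so pooled threads do not keep stale values.
            CURRENT_USER.remove();
        }
    }

    // Intermediate layer: knows nothing about the user.
    static String middleLayer() {
        return deepestLayer();
    }

    // Bottom layer: reads the value without it being passed as a parameter.
    static String deepestLayer() {
        return "handled for " + CURRENT_USER.get();
    }

    // Runs two requests on two threads; each must see only its own user.
    static String[] handleOnTwoThreads() {
        String[] results = new String[2];
        Thread a = new Thread(() -> results[0] = handleRequest("alice"));
        Thread b = new Thread(() -> results[1] = handleRequest("bob"));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        for (String line : handleOnTwoThreads()) {
            System.out.println(line);
        }
    }
}
```

The remove() call in the finally block matters mostly with thread pools, where a thread outlives the request it served.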
Hope this has helped.
This question is answered by the simple rule that a variable should be declared in the smallest possible enclosing scope. A ThreadLocal is the largest possible enclosing scope so you should only use it for data that is needed across many lexical scopes. If it can be a local variable, it should be.

to ensure a java method is thread safe

Is it enough to use only local variables and no instance variables, and thus only use memory on the stack (per thread)?
But what happens when you create a new MyObject that is local to the method? Doesn't the new object get created on the heap? Is it thread-safe because the reference to it is local (thread-safe)?
It is thread safe because if it is only referenced by variables in that particular method (it is, as you said, a local variable), then no other threads can possibly have a reference to the object, and therefore cannot change it.
Imagine you and I are pirates (threads). You go and bury your booty (the object) on an island (the heap), keeping a map to it (the reference). I happen to use the same island for burying my booty, but unless you give me your map, or I go digging all over the island (which isn't allowed on the island of Java), I can't mess with your stash.
Your new MyObject is thread-safe because each call to the method will create its own local instance on the heap. None of the calls refer to a common instance; if there are N calls, that means N instances of MyObject on the heap. When the method exits, each instance is eligible for GC as long as you don't return it to the caller.
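As an illustration of that point, here is a sketch with invented names (ConfinedObjectDemo, buildLocally). StringBuilder stands in for MyObject: it is not thread-safe, yet the method is safe to call from several threads because every call works on its own instance and only an immutable String escapes.

```java
class ConfinedObjectDemo {
    // StringBuilder is not thread-safe, but this instance never leaves the call,
    // so no other thread can ever reach it.
    static String buildLocally(char tag, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(tag);
        }
        return sb.toString();   // only an immutable String escapes
    }

    // Calls buildLocally concurrently and checks that neither result was disturbed.
    static boolean runOnTwoThreads(int n) {
        String[] out = new String[2];
        Thread a = new Thread(() -> out[0] = buildLocally('a', n));
        Thread b = new Thread(() -> out[1] = buildLocally('b', n));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out[0].length() == n && out[0].chars().allMatch(ch -> ch == 'a')
                && out[1].length() == n && out[1].chars().allMatch(ch -> ch == 'b');
    }

    public static void main(String[] args) {
        System.out.println("results intact: " + runOnTwoThreads(1_000));
    }
}
```

The guarantee disappears as soon as the builder is stored in a field or handed to another thread, which is the "leaking a reference" case.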
Well, let me ask you a question: does limiting your method to local variables mean your method can't share a resource with another thread? If not, then obviously this isn't sufficient for thread safety in general.
If you're worried about whether another thread can modify an object you created in another thread, then the only thing you need to worry about is never leaking a reference to that object out of the thread. If you achieve that, your object will be in the heap, but no other thread will be able to reference it so it doesn't matter.
Edit
Regarding my first statement, here's a method with no instance variables:
public void methodA() {
File f = new File("/tmp/file");
//...
}
This doesn't mean there can't be a shared resource between two threads :-).
There's no way for other threads to access such an object reference. But if that object is not thread-safe, then the overall thread-safety is compromised.
Consider for example that MyObject is a HashMap.
The argument that if it's in the heap, it's not thread-safe, is not valid. The heap is not accessible via pointer arithmetic, so it doesn't matter where the object is actually stored (ThreadLocals aside).
