Below is the example code
class Abc {
void method1(){
ExecutorService threadPool = Executors.newFixedThreadPool(10);
for(int i=0;i<100;i++){
threadPool.execute(new Runnable() {
@Override
public void run() {
doSomeThing(param); // param: whatever argument the caller has in scope
}
});
}
threadPool.shutdown();
}
void doSomeThing(Param param){
Object ref1,ref2,ref3,ref4;
}
}
Here we execute the method doSomeThing() on multiple threads, and doSomeThing() has many object references.
My question is: if any thread changes the state of an object reference, will this change be visible to other threads?
If so, what do I need to do so that each thread has its own state? I know we can fix this by creating a new instance of the class when passing it to execute(), but I am trying to fix the problem while keeping this style.
Each call to doSomeThing will get its own set of variables, whether they're in the same thread or not.
The variables will be equal to whatever you set them to in each call.
My question is: if any thread changes the state of an object reference, will this change be visible to other threads?
And the simple answer is yes. However, this is far too simple to be helpful.
What you are asking is fundamental to the multithreading concept. Essentially, if you pass the same object to several threads at once then either the changes each thread makes to the object must be choreographed carefully or you must live with unpredictable results.
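To make the distinction concrete, here is a minimal sketch, not the asker's actual code. The class name LocalVsShared, the shared counter, and the int parameter are all assumptions made for illustration. Each call gets its own local variable, while the single shared object is updated only through an atomic operation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LocalVsShared {
    // Hypothetical shared state: a single object that every task can reach.
    static final AtomicInteger shared = new AtomicInteger();

    static int doSomeThing(int param) {
        // 'local' is created anew for every call, so no other thread can see it.
        int local = 0;
        for (int i = 0; i < 1000; i++) {
            local += param;
        }
        // The shared object is touched only through an atomic operation.
        shared.addAndGet(local);
        return local;
    }

    static int runAll() {
        shared.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> doSomeThing(1));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll()); // 100 tasks * 1000 = 100000
    }
}
```

If shared were a plain int field updated with +=, the total could come out lower, because concurrent read-modify-write steps can overwrite each other.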
Related
I'm wondering if the following class is thread safe:
class Example {
private Thing thing;
public void setThing(Thing thing) {
this.thing = thing;
}
public void use() {
thing.function();
}
}
Specifically, what happens if one thread calls setThing while another thread is in Thing::function via Example::use?
For example:
Example example = new Example();
example.setThing(new Thing());
createThread(example); // create first thread
createThread(example); // create second thread
//Thread1
while(true) {
example.use();
}
//Thread2
while(true) {
sleep(3600000); //yes, i know to use a scheduled thread executor
example.setThing(new Thing());
}
Specifically, I want to know: when setThing is called while use() is executing, will use() continue with the old object successfully, or could updating the reference to the object somehow cause a problem?
There are 2 points when reasoning about the thread safety of a particular class:
1. Visibility of shared state between threads.
2. Safety (preserving class invariants) when an object of the class is used by multiple threads through its methods.
The shared state of the Example class consists of only one Thing object.
The class isn't thread safe from the visibility perspective. The result of setThing by one thread isn't guaranteed to be seen by other threads, so they can work with stale data. An NPE is also possible, because the value of thing is null until setThing has been called.
Without its source code, it's not possible to say whether it's safe to access the Thing class from multiple threads through the use method. However, Example calls thing.function() without any synchronization, so Thing itself must be thread safe; otherwise Example isn't thread safe.
As a result, Example isn't thread safe. To fix point 1, you can either add volatile to the thing field if you really need the setter, or mark it as final and initialize it in the constructor. The easiest way to ensure that point 2 is met is to mark use as synchronized. If you mark setThing as synchronized as well, you don't need volatile anymore. However, there are lots of other, more sophisticated techniques to meet point 2. This great book describes everything written here in more detail.
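A minimal sketch of the volatile variant, under assumptions: NamedThing is a hypothetical, immutable stand-in for Thing (so point 2 holds trivially), and the constructor argument is added here only so the field is never null. It is one possible shape, not the only fix.

```java
import java.util.Objects;

// Hypothetical stand-in for Thing; immutable, so function() is safe to call concurrently.
final class NamedThing {
    private final String name;

    NamedThing(String name) {
        this.name = name;
    }

    String function() {
        return name;
    }
}

class VolatileExample {
    // volatile: a write by one thread is visible to subsequent reads by other threads.
    private volatile NamedThing thing;

    VolatileExample(NamedThing initial) {
        // The field is never null, so use() cannot throw an NPE.
        this.thing = Objects.requireNonNull(initial);
    }

    void setThing(NamedThing thing) {
        this.thing = Objects.requireNonNull(thing);
    }

    String use() {
        // Read the field once; a concurrent setThing cannot change 'current'.
        NamedThing current = thing;
        return current.function();
    }

    public static void main(String[] args) {
        VolatileExample example = new VolatileExample(new NamedThing("first"));
        System.out.println(example.use());
        example.setThing(new NamedThing("second"));
        System.out.println(example.use());
    }
}
```

A call to use() that is already running keeps working with the object it read, even if another thread calls setThing in the meantime.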
If the method shares resources and the threads are not synchronized, then they will collide, and several scenarios can occur, including overwriting data that was computed by another thread and stored in a shared variable.
If the method has only local variables, then you can call it from multiple threads without worrying about races. However, non-helper classes usually manipulate member variables in their methods, so it's recommended to make such methods synchronized or, if you know exactly where the problem might occur, to lock (also called synchronize) a sub-scope of the method on a final lock object.
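As a sketch of locking only a sub-scope on a final lock object: the class LockedCounter and its fields are hypothetical names chosen for this example.

```java
class LockedCounter {
    private final Object lock = new Object(); // final: the lock object never changes
    private int count;                        // shared member variable, guarded by 'lock'

    void increment() {
        int step = 1;          // local variable: private to this call, needs no lock
        synchronized (lock) {  // only the update of shared state is locked
            count += step;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    static int runDemo(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000)); // 4 * 10000 = 40000
    }
}
```

Without the synchronized blocks, count += step could lose updates when two threads interleave.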
In this example, is it sufficient to declare the parameter obj as final to safely use it in the thread, below?
public void doSomethingAsync (final Object obj)
{
Thread thread = new Thread ()
{
@Override public void run () { ... do something with obj ... }
};
thread.start ();
}
At first glance it may seem fine. A caller invokes doSomethingAsync and obj gets cached until needed in the thread.
But what happens if there is a burst of calls to doSomethingAsync such that they complete before the threads have done anything with obj?
If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value. Or, does the compiler generate a queue or some dimensioned storage for obj so that each thread gets the proper value?
At first glance it may seem fine. A caller invokes doSomethingAsync and obj gets cached until needed in the thread.
The object is not "cached"; the variable simply cannot be reassigned to refer to another object. The final keyword only prevents the variable from being reassigned. It does not prevent the object being referenced from being mutated.
But what happens if there is a burst of calls to doSomethingAsync such that they complete before the threads have done anything with obj?
If the threads modify the referenced object, the results are unpredictable: they would be competing for the object, and each thread may see "old" values because access to the object was not synchronized between the threads. If the object is immutable, its state cannot change after construction, so it is inherently thread safe.
If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value. Or, does the compiler generate a queue or some dimensioned storage for obj so that each thread gets the proper value?
The compiler does not guarantee that the threads are executed in order; threads run concurrently. This is why the synchronized keyword exists: it lets you guarantee that when you reference the object, you see the same state of the object that all of the other threads see. This comes at a cost to performance, so it is recommended to pass only immutable objects into threads, so that you don't have to synchronize every time you do something with the object.
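A minimal sketch of passing an immutable object to several threads. Endpoint and ImmutableSharing are hypothetical names, not part of the question. All fields are final and there are no setters, so the threads can read the object without any locking.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical immutable value class: final class, final fields, no setters.
final class Endpoint {
    private final String host;
    private final int port;

    Endpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String describe() {
        return host + ":" + port;
    }

    // "Changing" an immutable object means creating a new one.
    Endpoint withPort(int newPort) {
        return new Endpoint(host, newPort);
    }
}

class ImmutableSharing {
    static Queue<String> readFromThreads(Endpoint shared, int threads) {
        Queue<String> seen = new ConcurrentLinkedQueue<>(); // thread-safe collector
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> seen.add(shared.describe()));
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(readFromThreads(new Endpoint("localhost", 8080), 8));
    }
}
```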
Large edit here, based on a conversation the Original Poster and I had in chat.
It seems Peri's real question was about the way Java stores local variables like "obj" for use by the Thread. These are called "captured variables" if you want to google it yourself. There is a nice discussion here.
Basically, what happens is that the local variables your local class uses (the ones stored on the stack), plus the "this" pointer, get copied into the local class (Thread in this case) when the local class is instantiated.
Original answer follows for the sake of the comments. But it is now obsolete.
Each time you call doSomethingAsync you are creating a new thread. If you call doSomethingAsync just once with a particular object, and then you modify that same object in the calling thread, then you have no idea what the asynchronous thread will do. It might "do something with the object" before you modify it in the calling thread, after you modify it in the calling thread, or even WHILE you are concurrently modifying it in the calling thread. Unless the Object itself is thread safe, this will cause problems.
Similarly, if you call doSomethingAsync twice with the same object, then you have no idea which asynchronous thread will modify the object first, and no guarantee that they will not act concurrently on the same object.
Finally, if you call doSomethingAsync twice with 2 different objects, then you don't know which asynchronous thread will act on its own object first, but you don't care, because they can't conflict with each other unless the objects have static mutable variables (class variables) that are being modified.
If you require that one task get completed before another task and in the order submitted, then a single threaded ExecutorService is your answer.
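A short sketch of that ordering guarantee, using Executors.newSingleThreadExecutor(), which runs tasks sequentially in submission order. The class name OrderedTasks and the integer task ids are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedTasks {
    static List<Integer> runInOrder(int n) {
        // Synchronized wrapper so the caller can safely read what the worker wrote.
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService single = Executors.newSingleThreadExecutor();
        for (int i = 0; i < n; i++) {
            final int id = i; // each task captures its own value of id
            single.execute(() -> order.add(id));
        }
        single.shutdown();
        try {
            if (!single.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // [0, 1, 2, 3, 4]
    }
}
```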
If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value
No, this will not happen. The subsequent call to doSomethingAsync cannot overwrite the obj captured by previous invocations of doSomethingAsync. This stands even if you remove the final keyword (assume java let you do it for just this time).
I think your question ultimately is about how closure works/is implemented in java. However, your code is not demonstrating the complication in the proper way because the code is not even trying to modify the variable obj in the same lexical scope.
In a way, Java is not really capturing the variable obj, but its value. You could write your code in a different way, and the overall effect would be the same:
class YourThread extends Thread {
private Object param;
public YourThread (Object obj){
param = obj;
}
@Override
public void run(){
//do something with your param
}
}
and you no longer need the final keyword:
public void doSomethingAsync (Object obj){
Thread t = new YourThread (obj);
t.start();
}
Now, say you have two instances of YourThread created, how could the second instance modify what has been passed as parameter to the first instance?
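The same point can be checked with a burst of calls. In this sketch, the static set seen and the helper burst are additions made for the demonstration and are not part of the original question. Each thread records the obj it captured, and no call overwrites the value captured by an earlier one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CaptureDemo {
    // Thread-safe set that records which obj each thread actually saw.
    static final Set<Object> seen = ConcurrentHashMap.newKeySet();

    static Thread doSomethingAsync(final Object obj) {
        Thread thread = new Thread() {
            @Override
            public void run() {
                seen.add(obj); // reads the copy of obj stored in this Thread object
            }
        };
        thread.start();
        return thread;
    }

    static Set<Object> burst(int calls) {
        seen.clear();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            threads.add(doSomethingAsync("value-" + i));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(burst(50).size()); // 50 distinct values, one per call
    }
}
```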
Closure in Other Languages
In other languages, magical things can indeed happen, but to show it you need to write the code slightly differently:
public void doSomethingAsync (Object obj){
//Here let's assume obj is not null
Thread thread = new Thread (){
@Override
public void run () { ... /*do something with obj*/ ... }
};
thread.start ();
obj = null;
}
This is not valid Java code, but in certain languages code like that is allowed. And the thread, when its run method is executed, might see obj as null.
Similarly, in the below code (again, not valid in Java), thread2 could potentially impact thread1 if thread2 executes first and changes obj in its run method:
public void doSomethingAsync (Object obj){
Thread thread1 = new Thread (){
@Override
public void run () { ... /*do something with obj*/ ... }
};
thread1.start ();
Thread thread2 = new Thread (){
@Override
public void run () { ... /*do something with obj*/ ... }
};
thread2.start ();
}
Back to Java
The reason Java forces you to put final on obj (since Java 8, an effectively final variable is enough) is that although Java's syntax looks extremely similar to the closure syntax used in other languages, it does not have the same closure semantics. Knowing that obj is final, Java does not need to create a capturing object (and thus an additional heap allocation); it can use something similar to YourThread behind the scenes. See this link for more details
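For comparison, here is a sketch of how the "shared variable" behaviour of other languages is usually emulated in Java: the captured local stays final, but it refers to a mutable holder object. Using AtomicReference as the holder and the name HolderCapture are choices made for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class HolderCapture {
    static String readThroughHolder() {
        // The lambda captures the value of 'holder' (a reference), which never changes.
        final AtomicReference<String> holder = new AtomicReference<>("initial");
        final AtomicReference<String> observed = new AtomicReference<>();
        Thread reader = new Thread(() -> observed.set(holder.get()));

        holder.set("changed"); // mutates the shared object, not the captured variable
        reader.start();        // everything before start() is visible to the new thread
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(readThroughHolder()); // changed
    }
}
```

The thread sees "changed" because it shares the holder object with the caller; the variable holder itself was still copied by value.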
I have seen this NullPointerException on synchronized statement.
code:
synchronized(a){
a = new A();
}
So according to the above answer, I understand that it is not possible to use the synchronized keyword on a null reference.
So I changed my code to this:
synchronized(a = new A()){}
But I am not sure whether this is equivalent to my original code.
update:
What I want to achieve is to lock the creation of a ( a = new A() )
Synchronized requires an object that provides the locking mechanism. It can be any non-null object (a synchronized instance method, for example, locks on this). The Java API also provides classes dedicated to locking, for example ReentrantLock, which are used through lock() and unlock() rather than with synchronized.
In the code you provided, every call to the function containing the synchronized block uses a different object for locking, effectively making the synchronization useless.
Edit:
Since you updated your post with what you are actually trying to accomplish I can help you more.
public class Creator {
private A a;
public void createA() {
synchronized(this) {
a = new A();
}
}
}
I don't know if this fits your design, since the code sample you provided is very small, but you should get the idea. Here the instance of the Creator class is used to synchronize the creation of A. If you share it across multiple threads, each one of them calling createA(), you can be sure that one instantiation will be finished before another one begins.
synchronized(a = new A()){}
This creates a new object of class A and uses it as the lock. Put simply, every thread can enter the synchronized block at any time, because each thread gets a brand-new lock that no other thread is using. The outcome is no synchronization at all.
For Example
class TestClass {
SomeClass someVariable;
public void myMethod () {
synchronized (someVariable) {
...
}
}
public void myOtherMethod() {
synchronized (someVariable) {
...
}
}
}
Here the two blocks are protected against simultaneous execution by two different threads, as long as someVariable is not reassigned. In other words, the two blocks are synchronized on the object that someVariable refers to.
But in your case there will always be a new object, so there will be no synchronization.
These two code snippets are not equivalent!
In the first code snippet you synchronize on some object referenced by a, and afterwards you change the reference which will not change the synchronization object.
In the second snippet you first assign a newly created object to reference a and then synchronize on it. So the synchronization object will be the new one.
Generally, it is a very bad idea to change the reference that is used in the synchronized statement, regardless of whether this is done inside the block (first code) or directly in the synchronized statement (second code). Make it final! Oh, and it mustn't be null, either.
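A minimal sketch of that advice, assuming the goal is to create the object at most once. Widget is a hypothetical stand-in for A, and the creation counter exists only so the effect can be observed. The lock is a separate final, non-null object, and the reference being created is never used as the lock.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for A that counts how many instances were created.
class Widget {
    static final AtomicInteger created = new AtomicInteger();

    Widget() {
        created.incrementAndGet();
    }
}

class WidgetHolder {
    private final Object lock = new Object(); // final and never null
    private Widget widget;                    // guarded by 'lock'

    Widget getOrCreate() {
        synchronized (lock) {
            if (widget == null) {
                widget = new Widget(); // reassigning 'widget' does not affect the lock
            }
            return widget;
        }
    }

    static int createFromManyThreads(int threads) {
        Widget.created.set(0);
        WidgetHolder holder = new WidgetHolder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(holder::getOrCreate);
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return Widget.created.get();
    }

    public static void main(String[] args) {
        System.out.println(createFromManyThreads(8)); // 1
    }
}
```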
I'm new to Java, so pls excuse if answer to below simple case is obvious.
class A{
public void foo(Customer cust){
cust.setName(cust.getFirstName() + " " + cust.getLastName());
cust.setAddress(new Address("Rome"));
}
}
I've a Singleton object (objectA) created for class A.
Given I don't have any class variable, is it thread safe if I call objectA.foo(new Customer()) from different threads?
What if I change foo to static and call A.foo(new Customer()) from different threads?
is it still thread safe?
Given I don't have any class variable, is it thread safe if I call
objectA.foo(new Customer()) from different threads?
Of course it is. Your foo() method doesn't change any state of the A object (since it doesn't have any) and the object you pass, new Customer(), as an argument to the method is not available to any other thread.
What if I change foo to static and call A.foo(new Customer()) from
different threads? is it still thread safe?
As long as you don't have any mutable static state, you're still good.
Yes, it will be thread-safe IF you call foo(new Customer()) from different threads. But this is only because each time you call new Customer() you are making a new (and therefore different) Customer object, and all that foo does is alter the state of the Customer that is passed to it. Thus these threads will not collide, because even though they are calling the same method, they will be manipulating different customers.
However, if you were to create a customer variable first
Customer bob = new Customer();
and then call foo(bob) from two different threads, it would not be thread safe. The first thread could be changing the address while the second thread is changing the name, causing inconsistent behavior and / or corrupt data.
If you want calls to this method on the same A instance (such as your singleton) to run one at a time, declare the method synchronized. Note that this only protects the Customer against other calls to foo, not against code that modifies it elsewhere:
public synchronized void foo(Customer cust) {...}
Thread safety is required where a function accesses a shared static variable. For example, if a function updates a shared document and two threads update it in parallel, the changes made by one thread can be lost. The same applies to a static variable that is shared across the application, or to a singleton object that holds state.
Those are some situations where thread safety is required. In your case you are not updating any shared resource, so this is thread safe.
Directly from this web site, I came across the following description about creating object thread safety.
Warning: When constructing an object that will be shared between
threads, be very careful that a reference to the object does not
"leak" prematurely. For example, suppose you want to maintain a List
called instances containing every instance of class. You might be
tempted to add the following line to your constructor:
instances.add(this);
But then other threads can use instances to access the object before
construction of the object is complete.
Is anybody able to express the same concept with other words or another more graspable example?
Thanks in advance.
Let us assume, you have such class:
class Sync {
public Sync(List<Sync> list) {
list.add(this);
// switch
// instance initialization code
}
public void bang() { }
}
and you have two threads (thread #1 and thread #2), both of which have a reference to the same List<Sync> list instance.
Now thread #1 creates a new Sync instance and as an argument provides a reference to the list instance:
new Sync(list);
While executing line // switch in the Sync constructor there is a context switch and now thread #2 is working.
Thread #2 executes such code:
for(Sync elem : list)
elem.bang();
Thread #2 calls bang() on the instance that thread #1 is still creating, but this instance is not ready to be used yet, because its constructor has not finished.
Therefore,
you have to be very careful when calling a constructor and passing a reference to the object shared between a few threads
when implementing a constructor you have to keep in mind that the provided instance can be shared between a few threads
Thread A is creating object A. In the middle of the creation of object A (on the first line of its constructor) there is a context switch. Now thread B is running, and thread B can look into object A (it already has a reference). However, object A is not yet fully constructed, because thread A has not had time to finish it.
Here is your clear example :
Let's say, there is class named House
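class House {
private static List<House> listOfHouse = new ArrayList<>();
private String name;
// other properties
public House(){
listOfHouse.add(this); // "this" escapes before the constructor has finished
this.name = "dummy house";
//do other things
}
public static List<House> getListOfHouse(){ return listOfHouse; }
public String getName(){ return name; }
// other methods
}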
And Village:
class Village {
public static void printsHouses(){
for(House house : House.getListOfHouse()){
System.out.println(house.getName());
}
}
}
Now suppose you are creating a House in a thread, "X", and the executing thread has just finished the line below,
listOfHouse.add(this);
and the context is switched (the reference to this object has already been added to the list listOfHouse, while the object creation is not finished yet) to another thread, "Y", which runs
printsHouses();
Then printsHouses() will see an object that is not yet fully created (getName() may still return null). This type of inconsistency is known as a leak.
Lot of good data here but I thought I'd add some more information.
When constructing an object that will be shared between threads, be very careful that a reference to the object does not "leak" prematurely.
While you are constructing the object, you need to make sure that there is no way for other threads to access this object before it has been fully constructed. This means that in a constructor you should not, for example:
Assign the object to a static field on the class that is accessible by other threads.
Start a thread on the object in the constructor, which may start using fields from the object before they are fully initialized.
Publish the object into a collection or via any other mechanism that allows other threads to see the object before it has been fully constructed.
You might be tempted to add the following line to your constructor:
instances.add(this);
So something like the following is improper:
public class Foo {
// multiple threads can use this
public static List<Foo> instances = new ArrayList<Foo>();
public Foo() {
...
// this "leaks" this, publishing it to other threads
instances.add(this);
...
// other initialization stuff
}
...
One additional bit of complexity is that the Java compiler/optimizer is allowed to reorder the instructions inside the constructor so that they take effect at a later time. This means that even if you do instances.add(this); as the last line of the constructor, this is not enough to ensure that other threads see a fully constructed object.
If multiple threads are going to be accessing this published object, access to it must be synchronized. The only fields you don't need to worry about are final fields, which are guaranteed to be fully initialized when the constructor finishes (provided this has not leaked during construction). volatile fields carry their own visibility guarantees, so you don't have to worry about them either.
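One common way to avoid the leak is a private constructor plus a static factory that publishes the object only after construction has finished. This is a sketch under assumptions: the class name Registered, the name field, and the choice of CopyOnWriteArrayList as the thread-safe list are illustrative.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class Registered {
    // Thread-safe list standing in for the "instances" list from the quoted warning.
    static final List<Registered> instances = new CopyOnWriteArrayList<>();

    private final String name; // final field: fully initialized once the constructor returns

    private Registered(String name) {
        this.name = name; // no reference to 'this' escapes from the constructor
    }

    static Registered create(String name) {
        Registered registered = new Registered(name); // construction finishes first
        instances.add(registered);                    // publication happens afterwards
        return registered;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        Registered r = create("example");
        System.out.println(instances.contains(r) + " " + r.getName());
    }
}
```

Any thread that finds the object in instances sees it fully constructed, because the add happens only after the constructor has returned.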
I think that the following example illustrates what the authors wanted to say:
public class MyClass {
public MyClass(List<?> list) {
// some stuff
list.add(this); // self registration
// other stuff
}
}
MyClass registers itself in a list that can be used by other threads, but it runs "other stuff" after the registration. This means that if another thread starts using the object before its constructor has finished, the object is probably not fully created yet.
It's describing the following situation:
Thread1:
//we add a reference to this thread
object.add(thread1Id,this);
//we start to initialize this thread, but suppose before reaching the next line we switch threads
this.initialize();
Thread2:
//we are able to get th1, but its not initialized properly so its in an invalid state
//and hence th1 is not valid
Object th1 = object.get(thread1Id);
As the thread scheduler can stop execution of a thread at any time (even half-way through a high level instruction like instances.push_back(this)) and switch to executing a different thread, unexpected behaviour can happen if you don't synchronize parallel access to objects.
Look at the code below:
#include <vector>
#include <thread>
#include <memory>
#include <iostream>
struct A {
std::vector<A*> instances;
A() { instances.push_back(this); }
void printSize() { std::cout << instances.size() << std::endl; }
};
int main() {
std::unique_ptr<A> a; // Initialized to nullptr.
std::thread t1([&a] { a.reset(new A()); }); // Construct new A.
std::thread t2([&a] { a->printSize(); }); // Use A. This will fail if t1 has not finished first.
t1.join();
t2.join();
}
As the access to a in the main() function is not synchronized, execution will fail every once in a while.
This happens when execution of thread t1 is halted before it has finished constructing the object A and thread t2 is executed instead. This results in thread t2 trying to access a unique_ptr<A> that contains a nullptr.
You just have to make sure that no thread accesses the object (and gets a NullPointerException) before another thread has finished initializing it.
In this case, it would happen in the constructor (I suppose): another thread could access that very object between the moment it is added to the list and the end of the constructor.