Just to make sure I understand the concepts presented in Java Concurrency in Practice.
Let's say I have the following program:
public class Stuff{
private int x;
public Stuff(int x){
this.x=x;
}
public int getX(){return x;}
}
public class UseStuff {
private Stuff s;
public void makeStuff(int x){
s=new Stuff(x);
}
public int useStuff(){
return s.getX();
}
}
If I let multiple threads play with this code, then I'm in trouble not only because s might point to different instances if two or more threads enter makeStuff, but also because, even if just one thread creates a new Stuff, another thread that has just entered useStuff can return either the value 0 (the default int value) or the value assigned to "x" by the constructor.
That all depends on whether the constructor has finished initializing x.
So at this point, to make it thread safe I must do one thing and then I can choose from two different ways.
First, I must make makeStuff() atomic, so "s" will point to one object at a time.
Then I either make useStuff synchronized as well, which ensures that I get back the Stuff object's x only after its constructor has finished building it, OR I can make Stuff's x final, and by this the JMM makes sure that x's value will only be visible after it has been initialized.
Do I understand the importance of final fields in the context of concurrency and JMM?
Not quite. The spec writes:
final fields also allow programmers to implement thread-safe immutable objects without synchronization. A thread-safe immutable object is seen as immutable by all threads, even if a data race is used to pass references to the immutable object between threads. This can provide safety guarantees against misuse of an immutable class by incorrect or malicious code
If you make x final, this guarantees that every thread that obtains a reference to a Stuff instance will observe x to have been assigned. It does not guarantee that any thread will obtain such a reference.
That is, in the absence of a synchronization action in useStuff(), the runtime is permitted to satisfy a read of s from a register, which might return a stale value.
The cheapest correctly synchronized variant of this code is declaring s volatile, which ensures that writes to s happen-before (and are therefore visible to) subsequent reads of s. If you do that, you need not even make x final (because the write to x happens-before the write of s, the read of s happens-before the read of x, and happens-before is transitive).
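The volatile variant described above could look roughly like the following sketch. The class names StuffSketch and VolatileUseStuff are made up for illustration, and the null check is an added assumption (the original useStuff() would simply throw a NullPointerException if no Stuff had been made yet).

```java
// Hypothetical names; mirrors Stuff/UseStuff from the question.
final class StuffSketch {
    private int x; // deliberately non-final: visibility comes from the volatile write below

    StuffSketch(int x) {
        this.x = x;
    }

    int getX() {
        return x;
    }
}

final class VolatileUseStuff {
    // volatile: a write to s happens-before every subsequent read of s
    private volatile StuffSketch s;

    void makeStuff(int x) {
        // The constructor's write to x precedes this volatile write in program order.
        s = new StuffSketch(x);
    }

    int useStuff() {
        StuffSketch local = s; // read the volatile field exactly once
        if (local == null) {
            throw new IllegalStateException("makeStuff has not been called yet");
        }
        return local.getX();
    }
}
```

Note that this only addresses visibility. Two threads calling makeStuff concurrently can still overwrite each other's instance.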
Some answers claim that s can only refer to one object at a time. This is wrong; because there is no memory barrier, different threads can have their own notion about the value of s. In order for all threads to see a consistent value assigned to s, you need to declare s as volatile, or use some other memory barrier.
If you do this, you won't need to declare x as final for the correct value to be visible to all threads (but you might still want to; fields shouldn't be mutable without a reason). That's because the initialization of x happens-before the assignment of s in "source code order," and the write of the volatile field s happens-before any other thread's read of that value from s. If you subsequently modified the value of a non-final field x, however, you could run into trouble because the modification isn't guaranteed to be visible to other threads. Making Stuff immutable would eliminate that possibility.
Of course, there's nothing to stop threads from clobbering the value assigned to s, so different threads could still see different values for x. This isn't really a threading issue though. Even a single thread could write and then read different values of x over time. But preventing this behavior in a multi-threaded environment requires atomicity, that is, checking to see whether s has a value and assigning one if not should appear as one indivisible action to other threads. An AtomicReference would be the best solution, but the synchronized keyword would work too.
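As a rough sketch of the "check whether s has a value and assign one if not" step, here is one way to do it with AtomicReference.compareAndSet. The class names StuffValue and SetOnceUseStuff are hypothetical, and the "first writer wins" policy is an assumption about what the caller wants.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable stand-in for Stuff.
final class StuffValue {
    private final int x;

    StuffValue(int x) {
        this.x = x;
    }

    int getX() {
        return x;
    }
}

final class SetOnceUseStuff {
    private final AtomicReference<StuffValue> s = new AtomicReference<>();

    /** Returns true if this call installed the value, false if one was already present. */
    boolean makeStuff(int x) {
        // The null check and the assignment happen as one indivisible step.
        return s.compareAndSet(null, new StuffValue(x));
    }

    int useStuff() {
        StuffValue local = s.get();
        if (local == null) {
            throw new IllegalStateException("makeStuff has not been called yet");
        }
        return local.getX();
    }
}
```

With this shape, a second makeStuff call leaves the first value in place instead of clobbering it.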
What are you trying to protect by making things synchronized? Are you concerned that thread A will call makeStuff and then thread B will call useStuff afterwards and the value won't be there? I'm not sure how synchronizing any of this will help that. Depending on what problem you are trying to avoid, it might be as simple as marking s as volatile.
I'm not sure what you're doing there. Why are you trying to create an object and then assign it to a field? Why save it if it can be overwritten by another call to makeStuff? It seems like you use UseStuff both as a proxy and as a factory for your actual Stuff model object. You'd better separate the two:
public class StuffFactory {
public static Stuff createStuff(int value) {
return new StuffProxy(value);
}
}
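public class StuffProxy extends Stuff {
// Needed because Stuff has no no-argument constructor
public StuffProxy(int value) {
super(value);
}
// Replacement for useStuff from your original UseStuff class
@Override
public int getX() {
//Put custom logic here
return super.getX();
}
}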
The logic here is that each thread is responsible for creating its own Stuff objects (using the factory), so concurrent access is no longer an issue.
Related
I have the below code:
public class Foo {
private volatile Map<String, String> map;
public Foo() {
refresh();
}
public void refresh() {
map = getData();
}
public boolean isPresent(String id) {
return map.containsKey(id);
}
public String getName(String id) {
return map.get(id);
}
private Map<String, String> getData() {
// logic
}
}
Is the above code thread safe or do I need to add synchronized or mutexes in there? If it's not thread safe, please clarify why.
Also, I've read that one should use AtomicReference instead of this, but in the source of the AtomicReference class, I can see that the field used to hold the value is volatile (along with a few convenience methods).
Is there a specific reason to use AtomicReference instead?
I've read multiple answer related to this but the concept of volatile still confuses me. Thanks in advance.
If you're not modifying the contents of map (except inside of refresh() when creating it), then there are no visibility issues in the code.
It's still possible to do isPresent(), refresh(), getName() (if no outside synchronization is used) and end up with isPresent()==true and getName()==null.
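One way to sidestep the isPresent()/getName() mismatch is to answer both questions from a single read of the volatile field. The following is only a sketch: FooSnapshot and findName are hypothetical names, the maps are passed in rather than loaded by getData(), and Map.copyOf assumes Java 10 or later.

```java
import java.util.Map;
import java.util.Optional;

final class FooSnapshot {
    // Only the reference is ever replaced; the map itself is never mutated.
    private volatile Map<String, String> map;

    FooSnapshot(Map<String, String> initial) {
        this.map = Map.copyOf(initial);
    }

    void refresh(Map<String, String> fresh) {
        map = Map.copyOf(fresh); // swap in a new immutable map
    }

    /** Presence check and lookup against the same snapshot, in one call. */
    Optional<String> findName(String id) {
        Map<String, String> snapshot = map; // single volatile read
        return Optional.ofNullable(snapshot.get(id));
    }
}
```

A caller then gets either a name or an empty Optional, and a concurrent refresh() cannot slip in between the check and the lookup.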
A class is "thread safe" if it does the right thing when it is used by multiple threads at the same time. There is no way to tell whether a class is thread safe unless you can say what "the right thing" means, and especially, what "the right thing when used by multiple threads" means.
What is the right thing if thread A calls foo.isPresent("X") and it returns true, and then thread B calls foo.refresh(), and then thread A calls foo.getName("X")?
If you are going to claim "thread safety", then you must be very explicit about what the caller should expect in cases like that.
Volatile is only useful in this scenario to update the value immediately. It doesn't really make the code by itself thread-safe.
But because, as you've stated in your comment, you only update the reference, and because the reference switch is atomic, your code will be thread-safe (judging from the given code).
If I understood your question correctly and your comments - your class Foo holds a Map in which only the reference should be updated e.g. a whole new Map added instead of mutating it. If this is the premise:
As far as atomicity is concerned, it does not make any difference whether you declare it volatile or not. Reads and writes of references are always atomic in Java (only non-volatile long and double values may be split), so you will never see a half-written reference. See JLS 17.7:
17.7. Non-Atomic Treatment of double and long
For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
Writes and reads of volatile long and double values are always atomic.
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
Some implementations may find it convenient to divide a single write action on a 64-bit long or double value into two write actions on adjacent 32-bit values. For efficiency's sake, this behavior is implementation-specific; an implementation of the Java Virtual Machine is free to perform writes to long and double values atomically or in two parts.
Implementations of the Java Virtual Machine are encouraged to avoid splitting 64-bit values where possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications.
EDIT: Although the statement at the top still stands as far as atomicity goes, thread safety also requires volatile so that an update of the reference becomes visible to other threads. Without it, a thread may keep working with a locally cached copy of the reference; with volatile, the write happens-before subsequent reads, in other words all threads will see the same state of the Map.
There is an article about using volatile on IBM's site, and the explanation confused me. Below is a sample from this article and its explanation:
public class BackgroundFloobleLoader {
public volatile Flooble theFlooble;
public void initInBackground() {
// do lots of stuff
theFlooble = new Flooble(); // this is the only write to theFlooble
}
}
public class SomeOtherClass {
public void doWork() {
while (true) {
// do some stuff...
// use the Flooble, but only if it is ready
if (floobleLoader.theFlooble != null)
doSomething(floobleLoader.theFlooble);
}
}
}
Without the theFlooble reference being volatile, the code in doWork() would be at risk for seeing a partially constructed Flooble as it dereferences the theFlooble reference.
How should I understand this? Why, without volatile, might we use a partially constructed Flooble object? Thanks!
Without the volatile you could see a partially constructed object. E.g. consider this Flooble object.
public class Flooble {
public int x;
public int y;
public Flooble() {
x = 5;
y = 1;
}
}
public class SomeOtherClass {
public void doWork() {
while (true) {
// do some stuff...
// use the Flooble, but only if it is ready
if (floobleLoader.theFlooble != null)
doSomething(floobleLoader.theFlooble);
}
}
public void doSomething(Flooble flooble) {
System.out.println(flooble.x / flooble.y);
}
}
Without volatile the method doSomething is not guaranteed to see the values 5 and 1 for x and y. It could see for instance x == 5 but y == 0, leading to division by zero.
When you execute this operation theFlooble = new Flooble(), three writes occur:
1. tmpFlooble.x = 5
2. tmpFlooble.y = 1
3. theFlooble = tmpFlooble
If these writes happen in this order everything is ok. But without the volatile the compiler is free to reorder these writes and perform them as it wishes. E.g. first point 3 and then points 1 and 2.
This actually happens all the time. The compiler really does reorder the writes. This is done to increase performance.
The error can easily happen in the following way:
Thread A executes initInBackground() method from class BackgroundFloobleLoader. The compiler reorders the writes so before executing the body of Flooble() (where x and y are set), the thread A first executes theFlooble = new Flooble(). Now, theFlooble points to a flooble instance, whose x and y are 0. Before thread A continues, some other thread B executes method doWork() of class SomeOtherClass. This method calls method doSomething(floobleLoader.theFlooble) with the current value of theFlooble. In this method theFlooble.x is divided by theFlooble.y resulting in division by zero. Thread B finishes due to uncaught exception. Thread A continues and sets theFlooble.x = 5 and theFlooble.y = 1.
This scenario of course won't happen on every run, but according to the rules of Java, can happen.
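A sketch of how the division-by-zero scenario can be ruled out: publish through a volatile field and, as an extra safeguard, make the fields final. SafeFlooble, SafeFloobleLoader and divideIfReady are hypothetical names, and returning -1 for "not ready yet" is an arbitrary choice for this example.

```java
final class SafeFlooble {
    // final fields: any thread that sees the reference sees these values
    final int x;
    final int y;

    SafeFlooble() {
        x = 5;
        y = 1;
    }
}

final class SafeFloobleLoader {
    // volatile: the constructor's writes cannot appear to happen after this write
    private volatile SafeFlooble theFlooble;

    void initInBackground() {
        theFlooble = new SafeFlooble();
    }

    /** Returns x / y once the Flooble is published, or -1 if it is not ready yet. */
    int divideIfReady() {
        SafeFlooble f = theFlooble; // read once, then work on the local copy
        return (f != null) ? f.x / f.y : -1;
    }
}
```

Either measure alone would be enough for this particular example; using both keeps the class safe even if someone later hands the reference around without synchronization.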
When different threads access your code, any thread can perform modifications on the state of your object, which means that when other threads access it, the state may not be as it should.
From the Oracle documentation:
The Java programming language allows threads to access shared
variables. As a rule, to ensure that shared variables are
consistently and reliably updated, a thread should ensure that it has
exclusive use of such variables by obtaining a lock that,
conventionally, enforces mutual exclusion for those shared variables.
The Java programming language provides a second mechanism, volatile
fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable.
source
Informally, this means the value of this variable is never cached thread-locally; all reads and writes behave as if they go straight to "main memory".
For example, picture thread1 and thread2 accessing the object:
Thread1 accesses the object and stores it in its local cache
Thread2 modifies the object
Thread1 accesses the object again, but since it is still in its cache, it doesn't see the state updated by thread2.
Look at it from the point of view of the code that does this:
if (floobleLoader.theFlooble != null)
doSomething(floobleLoader.theFlooble);
Clearly, you need a guarantee that all of the writes performed by new Flooble() are visible to this code before theFlooble could possibly test as != null. Nothing in the code without volatile provides this guarantee. So you need a guarantee you don't have. Fail.
Java provides several ways to get the guarantee you need. One is by use of a volatile variable:
... any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change. -- Docs
So putting a write to a volatile in one thread and a read to a volatile in the other establishes precisely the happens-before relationship we need.
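The happens-before chain can be made concrete with a small sketch. HappensBeforeDemo, payload and ready are hypothetical names, and Thread.onSpinWait assumes Java 9 or later.

```java
final class HappensBeforeDemo {
    private int payload;            // plain field, written before the flag
    private volatile boolean ready; // volatile flag that publishes the payload

    /** Starts a writer thread, waits for the flag, and returns the payload it observes. */
    int runOnce() {
        Thread writer = new Thread(() -> {
            payload = 42; // (1) ordinary write
            ready = true; // (2) volatile write
        });
        writer.start();
        while (!ready) {  // (3) volatile read; the loop ends once it sees the write from (2)
            Thread.onSpinWait();
        }
        // (4) (1) happens-before (2), (2) happens-before the read in (3) that saw true,
        //     and (3) happens-before this read, so the write in (1) is visible here.
        return payload;
    }
}
```

If ready were not volatile, neither the termination of the loop nor the value read in step (4) would be guaranteed.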
I doubt there is such a thing as partially constructed objects in Java. Volatile guarantees that every thread will see a constructed object. Since volatile works like a tiny synchronized block on the referenced object you would end up with an NPE if theFlooble == null. Maybe that is what they mean.
Objects encapsulate a lot of things: variables, methods, etc., and these take time to come into existence inside a computer. In Java, if any variable is declared volatile then all reads and writes to it are atomic. So if a variable referencing an object is declared volatile then access to its members is allowed only when it fully loads in your system (how do you read or write to something that isn't there at all?)
I am trying to wrap my head around thread safety in java (or in general). I have this class (which I hope complies with the definition of a POJO) which also needs to be compatible with JPA providers:
public class SomeClass {
private Object timestampLock = new Object();
// are "volatile"s necessary?
private volatile java.sql.Timestamp timestamp;
private volatile String timestampTimeZoneName;
private volatile BigDecimal someValue;
public ZonedDateTime getTimestamp() {
// is synchronisation necessary here? is this the correct usage?
synchronized (timestampLock) {
return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
}
}
public void setTimestamp(ZonedDateTime dateTime) {
// is this the correct usage?
synchronized (timestampLock) {
this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
this.timestampTimeZoneName = dateTime.getZone().getId();
}
}
// is synchronisation required?
public BigDecimal getSomeValue() {
return someValue;
}
// is synchronisation required?
public void setSomeValue(BigDecimal val) {
someValue = val;
}
}
As stated in the commented rows in the code, is it necessary to define timestamp and timestampTimeZoneName as volatile and are the synchronized blocks used as they should be? Or should I use only the synchronized blocks and not define timestamp and timestampTimeZoneName as volatile? A timestampTimeZoneName of a timestamp should not be erroneously matched with another timestamp's.
This link says
Reads and writes are atomic for all variables declared volatile
(including long and double variables)
Should I understand that accesses to someValue in this code through the setter/getter are thread safe thanks to volatile definitions? If so, is there a better (I do not know what "better" might mean here) way to accomplish this?
To determine if you need synchronized, try to imagine a place where you can have a context switch that would break your code.
In this case, if the context switch happens where I put the comment, then in getTimestamp() you're going to be reading different values from each timestamp type.
Also, although assignments are atomic, this expression java.sql.Timestamp.from(dateTime.toInstant()); certainly isn't, so you can get a context switch in between dateTime.toInstant() and the call to from. In short, you definitely need the synchronized blocks.
synchronized (timestampLock) {
this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
//CONTEXT SWITCH HERE
this.timestampTimeZoneName = dateTime.getZone().getId();
}
synchronized (timestampLock) {
return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
}
In terms of volatile, I'm pretty sure they're required. You have to guarantee that each thread definitely is getting the most updated version of a variable.
This is the contract of volatile. And although it may be covered by the synchronized block, and volatile not actually necessary here, it's good to write anyway. If the synchronized block does the job of volatile already, the VM won't do the guarantee twice. This means volatile won't cost you any more, and it's a very good flashing light that says to the programmer: "I'M USED IN MULTIPLE THREADS".
For someValue: If there's no synchronized block here, then volatile is definitely necessary. If you call a set in one thread, the other thread has no cue that tells it the value may have been updated outside of this thread. So it may use an old and cached value. The JIT can do a lot of funny optimizations if it assumes a single thread. Ones that can simply break your program.
Now I'm not entirely certain if synchronized is required here. My guess is no. I would add it anyway to be safe though. Or you can let java worry about the synchronization and use http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
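Since someValue is a BigDecimal, AtomicInteger does not fit directly; the closest analogue is probably AtomicReference. The sketch below uses a hypothetical class name SomeValueHolder and an add method that is not in the original class, included only to show what the atomic wrapper offers beyond a volatile field. Whether a JPA provider can map such a field is a separate question that this sketch does not address.

```java
import java.math.BigDecimal;
import java.util.concurrent.atomic.AtomicReference;

final class SomeValueHolder {
    private final AtomicReference<BigDecimal> someValue =
            new AtomicReference<>(BigDecimal.ZERO);

    BigDecimal getSomeValue() {
        return someValue.get(); // same visibility guarantee as a volatile read
    }

    void setSomeValue(BigDecimal val) {
        someValue.set(val); // same visibility guarantee as a volatile write
    }

    /** Atomic read-modify-write, which a plain volatile field cannot provide. */
    BigDecimal add(BigDecimal delta) {
        return someValue.accumulateAndGet(delta, BigDecimal::add);
    }
}
```

For a plain getter and setter, the volatile field from the question gives the same guarantees; the wrapper only pays off once compound updates such as add are needed.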
Nothing new here, this is just a more explicit version of something @Cruncher already said:
You need synchronized whenever it is important for two or more fields in your program to be consistent with one another. Suppose you have two parallel lists, and your code depends on them both being the same length. That's called an invariant, as in: the two lists are invariably the same length.
How can you write a method, append(x,y), that adds a new pair of values to the lists without temporarily breaking the invariant? You can't. The method must add one item to the first list, breaking the invariant, and then add the other item to the second list, fixing it again. There's no other way.
In a single-threaded program, that temporary broken state is no problem because no other method can possibly use the lists while append(x,y) is running. That's no longer true in a multithreaded program. In the worst case, append(x,y) could add x to the x list, and then the scheduler could suspend the thread at that exact moment to allow other threads to run. The CPUs could execute millions of instructions before append(x,y) gets to finish the job and make the lists right again. During all of that time, other threads would see the broken invariant, and possibly corrupt your data or crash the program as a result.
The fix is for append(x,y) to be synchronized on some object, and (this is the important part), for every other method that uses the lists to be synchronized on the same object. Since only one thread can be synchronized on a given object at a given time, it will not be possible for any other thread to see the lists in an inconsistent state.
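A minimal sketch of that fix, assuming a hypothetical class PairedLists with a private lock object; the fillConcurrently helper exists only to exercise the class from several threads.

```java
import java.util.ArrayList;
import java.util.List;

final class PairedLists {
    private final Object lock = new Object();
    private final List<Integer> xs = new ArrayList<>();
    private final List<Integer> ys = new ArrayList<>();

    void append(int x, int y) {
        synchronized (lock) {
            xs.add(x); // invariant temporarily broken here ...
            ys.add(y); // ... and restored before the lock is released
        }
    }

    // Every reader takes the same lock, so it never observes the broken state.
    boolean sameLength() {
        synchronized (lock) {
            return xs.size() == ys.size();
        }
    }

    int size() {
        synchronized (lock) {
            return xs.size();
        }
    }

    /** Appends perThread pairs from each of the given number of threads, then returns the lists. */
    static PairedLists fillConcurrently(int threads, int perThread) {
        PairedLists lists = new PairedLists();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lists.append(i, -i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return lists;
    }
}
```

Using a private lock object rather than synchronizing on this keeps outside code from taking the same lock by accident.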
So, if thread A calls append(x,y), and thread B tries to look at the lists "at the same time", will thread B see what the lists looked like before or after thread A did its work? That's called a race condition. And with only the synchronization that I have described so far, there's no way to know which thread will win. All we've done so far is to guarantee one particular invariant.
If it matters which thread wins the race, then that means that there is some higher-level invariant that also needs protection. You will have to add more synchronization to protect that one too. "Thread safety" -- two little words to name a subject that is both broad and deep.
Good Luck, and Have Fun!
// is synchronisation required?
public BigDecimal getSomeValue() {
return someValue;
}
// is synchronisation required?
public void setSomeValue(BigDecimal val) {
someValue = val;
}
I think yes, you are required to add the synchronization block. Consider an example in which one thread is setting the value and, at the same time, another thread is trying to read it through the getter method; that is why the example above uses a synchronization block. So if you access your variable inside the method, you need the synchronization block.
I want to make sure that I correctly understand the 'Effectively Immutable Objects' behavior according to Java Memory Model.
Let's say we have a mutable class which we want to publish as an effectively immutable:
class Outworld {
// This MAY be accessed by multiple threads
public static volatile MutableLong published;
}
// This class is mutable
class MutableLong {
private long value;
public MutableLong(long value) {
this.value = value;
}
public void increment() {
value++;
}
public long get() {
return value;
}
}
We do the following:
// Create a mutable object and modify it
MutableLong val = new MutableLong(1);
val.increment();
val.increment();
// No more modifications
// UPDATED: Let's say for this example we are completely sure
// that no one will ever call increment() from now on
// Publish it safely and consider Effectively Immutable
Outworld.published = val;
The question is:
Does Java Memory Model guarantee that all threads MUST have Outworld.published.get() == 3 ?
According to Java Concurrency In Practice this should be true, but please correct me if I'm wrong.
3.5.3. Safe Publication Idioms
To publish an object safely, both the reference to the object and the
object's state must be made visible to other threads at the same time.
A properly constructed object can be safely published by:
- Initializing an object reference from a static initializer;
- Storing a reference to it into a volatile field or AtomicReference;
- Storing a reference to it into a final field of a properly constructed object; or
- Storing a reference to it into a field that is properly guarded by a lock.
3.5.4. Effectively Immutable Objects
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.
Yes. The write operations on the MutableLong are followed by a happens-before relationship (on the volatile) before the read.
(It is possible that a thread reads Outworld.published and passes it on to another thread unsafely. In theory, that could see earlier state. In practice, I don't see it happening.)
There are a couple of conditions which must be met for the Java Memory Model to guarantee that Outworld.published.get() == 3:
- The snippet of code you posted, which creates and increments the MutableLong and then sets the Outworld.published field, must happen with visibility between the steps. One way to achieve this trivially is to have all that code running in a single thread, guaranteeing "as-if-serial" semantics. I assume that's what you intended, but thought it worth pointing out.
- Reads of Outworld.published must happen after the assignment, in the happens-before sense. An example of this could be having the same thread execute Outworld.published = val; and then launch the other threads which read the value. This would guarantee "as-if-serial" semantics, preventing reordering of the reads before the assignment.
If you are able to provide those guarantees, then the JMM will guarantee all threads see Outworld.published.get() == 3.
However, if you're interested in general program design advice in this area, read on.
For the guarantee that no other threads ever see a different value for Outworld.published.get(), you (the developer) have to guarantee that your program does not modify the value in any way, either by subsequently executing Outworld.published = differentVal; or by calling Outworld.published.increment();. While that is possible to guarantee, it can be so much easier if you design your code to avoid both the mutable object and the use of a static non-final field as a global point of access for multiple threads:
- Instead of publishing MutableLong, copy the relevant values into a new instance of a different class, whose state cannot be modified. E.g.: introduce the class ImmutableLong, which assigns value to a final field on construction, and doesn't have an increment() method.
- Instead of multiple threads accessing a static non-final field, pass the object as a parameter to your Callable/Runnable implementations. This will prevent the possibility of one rogue thread reassigning the value and interfering with the others, and is easier to reason about than static field reassignment. (Admittedly, if you're dealing with legacy code, this is easier said than done.)
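Both suggestions can be sketched together. ImmutableLong is the class name proposed above; ReportTask is a hypothetical Callable invented for this example.

```java
import java.util.concurrent.Callable;

final class ImmutableLong {
    private final long value; // final: assigned once, on construction

    ImmutableLong(long value) {
        this.value = value;
    }

    long get() {
        return value;
    }
    // No increment(): there is no way to change the value after construction.
}

final class ReportTask implements Callable<Long> {
    // Handed in through the constructor instead of being fetched from a static field.
    private final ImmutableLong input;

    ReportTask(ImmutableLong input) {
        this.input = input;
    }

    @Override
    public Long call() {
        return input.get();
    }
}
```

A task built this way can be handed to an ExecutorService, and no other thread can swap the value out from under it.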
The question is: Does Java Memory Model guarantee that all threads
MUST have Outworld.published.get() == 3 ?
The short answer is no, because other threads might access Outworld.published before it has been assigned.
Once Outworld.published = val; has been performed, and provided no other modifications are made to val, then yes, it will always be 3.
But if any thread calls val.increment(), then its value might be different for other threads.
Consider this class, with no instance variables and only non-synchronized methods. Can we infer from this information that the class is thread-safe?
public class test{
public void test1{
// do something
}
public void test2{
// do something
}
public void test3{
// do something
}
}
It depends entirely on what state the methods mutate. If they mutate no shared state, they're thread safe. If they mutate only local state, they're thread-safe. If they only call methods that are thread-safe, they're thread-safe.
Not being thread safe means that if multiple threads try to access the object at the same time, something might change from one access to the next, and cause issues. Consider the following:
int incrementCount() {
this.count++;
// ... Do some other stuff
return this.count;
}
would not be thread safe. Why is it not? Imagine thread 1 accesses it, count is increased, then some processing occurs. While going through the function, another thread accesses it, increasing count again. The first thread, which had it go from, say, 1 to 2, would now have it go from 1 to 3 when it returns. Thread 2 would see it go from 1 to 3 as well, so what happened to 2?
In this case, you would want something like this (keeping in mind that this isn't any language-specific code, but closest to Java, one of only 2 I've done threading in)
synchronized int incrementCount() {
this.count++;
// ... Do some other stuff
return this.count;
}
The synchronized keyword here would make sure that as long as one thread is accessing it, no other threads could. This would mean that thread 1 hits it, count goes from 1 to 2, as expected. Thread 2 hits it while 1 is processing, it has to wait until thread 1 is done. When it's done, thread 1 gets a return of 2, then thread 2 goes through, and gets the expected 3.
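For reference, a compilable Java version of the synchronized counter might look like the sketch below (in Java the synchronized modifier goes before the return type). SyncCounter, current and countWithThreads are hypothetical names; the helper only exists to drive the counter from several threads.

```java
import java.util.ArrayList;
import java.util.List;

final class SyncCounter {
    private int count;

    synchronized int incrementCount() {
        count++;
        // ... other work would go here, still under the lock
        return count; // the value produced by this call, not by a later one
    }

    synchronized int current() {
        return count;
    }

    /** Increments one shared counter from several threads and returns the final count. */
    static int countWithThreads(int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementCount();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.current();
    }
}
```

Because every increment runs under the same lock, no update is lost: the final count equals threads multiplied by perThread.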
Now, an example, similar to what you have there, that would be entirely thread-safe, no matter what:
int incrementCount(int count) {
count++;
// ... Do some other stuff
return count;
}
As the only variables being touched here are fully local to the function, there is no case where two threads accessing it at the same time could try working with data changed from the other. This would make it thread safe.
So, to answer the question, assuming that the functions don't modify anything outside of the specific called function, then yes, the class could be deemed to be thread-safe.
Consider the following quote from an article about thread safety ("Java theory and practice: Characterizing thread safety"):
In reality, any definition of thread safety is going to have a certain degree of circularity, as it must appeal to the class's specification -- which is an informal, prose description of what the class does, its side effects, which states are valid or invalid, invariants, preconditions, postconditions, and so on. (Constraints on an object's state imposed by the specification apply only to the externally visible state -- that which can be observed by calling its public methods and accessing its public fields -- rather than its internal state, which is what is actually represented in its private fields.)
Thread safety
For a class to be thread-safe, it first must behave correctly in a single-threaded environment. If a class is correctly implemented, which is another way of saying that it conforms to its specification, no sequence of operations (reads or writes of public fields and calls to public methods) on objects of that class should be able to put the object into an invalid state, observe the object to be in an invalid state, or violate any of the class's invariants, preconditions, or postconditions.
Furthermore, for a class to be thread-safe, it must continue to behave correctly, in the sense described above, when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, without any additional synchronization on the part of the calling code. The effect is that operations on a thread-safe object will appear to all threads to occur in a fixed, globally consistent order.
So your class itself is thread-safe, as long as it doesn't have any side effects. As soon as the methods mutate any external objects (e.g. some singletons, as already mentioned by others) it's not any longer thread-safe.
Depends on what happens inside those methods. If they manipulate / call any method parameters or global variables / singletons which are not themselves thread safe, the class is not thread safe either.
(Yes, I see that the methods as shown here have no parameters, but no brackets either, so this is obviously not full working code - it wouldn't even compile as is.)
Yes, as long as there are no instance variables. Method calls using only input parameters and local variables are inherently thread-safe. You might consider making the methods static too, to reflect this.
If it has no mutable state - it's thread safe. If you have no state - you're thread safe by association.
No, I don't think so.
For example, one of the methods could obtain a (non-thread-safe) singleton object from another class and mutate that object.
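A sketch of that situation, with hypothetical names throughout: NoFieldsButShared has no fields of its own, yet one of its methods updates shared state that lives elsewhere.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shared state living outside the class under discussion.
final class HitRegistry {
    static long unsafeHits;                              // plain field: ++ is a non-atomic read-modify-write
    static final AtomicLong safeHits = new AtomicLong(); // atomic alternative

    private HitRegistry() {
    }
}

// No instance variables, no synchronized methods, yet not automatically thread-safe.
final class NoFieldsButShared {
    void recordUnsafe() {
        HitRegistry.unsafeHits++; // concurrent callers can lose updates here
    }

    long recordSafe() {
        return HitRegistry.safeHits.incrementAndGet(); // safe under concurrent calls
    }
}
```

The absence of fields says nothing about recordUnsafe(); what matters is the state the method reaches, not where that state is declared.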
Yes - this class is thread safe but this does not mean that your application is.
An application is thread safe if the threads in it cannot concurrently access heap state. All objects in Java (and therefore all of their fields) are created on the heap. So, if there are no fields in an object then it is thread safe.
In any practical application, objects will have state. If you can guarantee that these objects are not accessed concurrently then you have a thread safe application.
There are ways of optimizing access to shared state, e.g. atomic variables or careful use of the volatile keyword, but I think this is going beyond what you've asked.
I hope this helps.