java threads synchronization - java

In the class below, is the method getIt() thread safe and why?
public class X {
private long myVar;
public void setIt(long var){
myVar = var;
}
public long getIt() {
return myVar;
}
}

It is not thread-safe. Variables of type long and double in Java are treated as two separate 32-bit variables. One thread could be writing and have written half the value when another thread reads both halves. In this situation, the reader would see a value that was never supposed to exist.
To make this thread-safe you can either declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
Note that even if myVar was a 32-bit int you could still run into threading issues where one thread could be reading an out of date value that another thread has changed. This could occur because the value has been cached by the CPU. To resolve this, you again need to declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
It's also worth noting that if you are using the result of getIt in a subsequent setIt call, e.g. x.setIt(x.getIt() * 2), then you probably want to synchronize across both calls:
synchronized(x)
{
x.setIt(x.getIt() * 2);
}
Without the extra synchronization, another thread could change the value in between the getIt and setIt calls causing the other thread's value to be lost.

This is not thread-safe. Even if your platform guarantees atomic writes of long, the lack of synchronized makes it possible that one thread calls setIt() and even after this call has finished it is possible that another thread can call getIt() and this call could return the old value of myVar.
The synchronized keyword does more than an exclusive access of one thread to a block or a method. It also guarantees that the second thread is informed about a change of a variable.
So you either have to mark both methods as synchronized or mark the member myVar as volatile.
There's a very good explanation about synchronization here:
Atomic actions cannot be interleaved, so they can be used without fear of thread interference. However, this does not eliminate all need to synchronize atomic actions, because memory consistency errors are still possible. Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change.

No, it's not. At least, not on platforms that lack atomic 64-bit memory accesses.
Suppose that Thread A calls setIt, copies 32 bits into memory where the backing value is, and is then pre-empted before it can copy the other 32 bits.
Then Thread B calls getIt.

No it is not, because longs are not atomic in java, so one thread could have written 32 bits of the long in the setIt method, and then the getIt could read the value, and then setIt could set the other 32 bits.
So the end result is that getIt returns a value that was never valid.

It ought to be, and generally is, but is not guaranteed to be thread safe. There could be issues with different cores having different versions in CPU cache, or the store/retrieve not being atomic for all architectures. Use the AtomicLong class.

The getter is not thread safe because it’s not guarded by any mechanism that guarantees the most up-to-date visibility. Your choices are:
making myVar final (but then you can’t mutate it)
making myVar volatile
use synchronized to accessing myVar

AFAIK, Modern JVMs no longer split long and double operations. I don't know of any reference which states this is still a problem. For example, see AtomicLong which doesn't use synchronization in Sun's JVM.
Assuming you want to be sure it is not a problem then you can use synchronize both get() and set(). However, if you are performing an operation like add, i.e. set(get()+1) then this synchronization doesn't buy you much, you still have to synchronize the object for the whole operation. (A better way around this is to use a single operation for add(n) which is synchronized)
However, a better solution is to use an AtomicLong. This supports atomic operations like get, set and add and DOESN'T use synchronization.

Since it is a read only method. You should synchronize the set method.
EDIT : I see why the get method needs to be synchronized as well. Good job explaining Phil Ross.

Related

Volatile keyword for thread safety

I have the below code:
public class Foo {
private volatile Map<String, String> map;
public Foo() {
refresh();
}
public void refresh() {
map = getData();
}
public boolean isPresent(String id) {
return map.containsKey(id);
}
public String getName(String id) {
return map.get(id);
}
private Map<String, String> getData() {
// logic
}
}
Is the above code thread safe or do I need to add synchronized or mutexes in there? If it's not thread safe, please clarify why.
Also, I've read that one should use AtomicReference instead of this, but in the source of the AtomicReference class, I can see that the field used to hold the value is volatile (along with a few convenience methods).
Is there a specific reason to use AtomicReference instead?
I've read multiple answer related to this but the concept of volatile still confuses me. Thanks in advance.
If you're not modifying the contents of map (except inside of refresh() when creating it), then there are no visibility issues in the code.
It's still possible to do isPresent(), refresh(), getName() (if no outside synchronization is used) and end up with isPresent()==true and getName()==null.
A class is "thread safe" if it does the right thing when it is used by multiple threads at the same time. There is no way to tell whether a class is thread safe unless you can say what "the right thing" means, and especially, what "the right thing when used by multiple threads" means.
What is the right thing if thread A calls foo.isPresent("X") and it returns true, and then thread B calls foo.refresh(), and then thread A calls foo.getName("X")?
If you are going to claim "thread safety", then you must be very explicit about what the caller should expect in cases like that.
Volatile is only useful in this scenario to update the value immediately. It doesn't really make the code by itself thread-safe.
But because you've stated in your comment, you only update the reference and because the reference-switch is atomic, your code will be thread-safe.(from the given code)
If I understood your question correctly and your comments - your class Foo holds a Map in which only the reference should be updated e.g. a whole new Map added instead of mutating it. If this is the premise:
It does not make any difference if you declare it as volatile or not. Every read/write operation in Java is atomic itself. You will never see a half transaction on these operations. See the JLS 17.7
17.7. Non-Atomic Treatment of double and long
For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
Writes and reads of volatile long and double values are always atomic.
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
Some implementations may find it convenient to divide a single write action on a 64-bit long or double value into two write actions on adjacent 32-bit values. For efficiency's sake, this behavior is implementation-specific; an implementation of the Java Virtual Machine is free to perform writes to long and double values atomically or in two parts.
Implementations of the Java Virtual Machine are encouraged to avoid splitting 64-bit values where possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications.
EDIT: Although the top statement still stands as it is - for thread safety it's necessary to add volatile to reflect the immediate update on different Threads to reflect the reference update. The behavior of a Thread is to make local copy of it while with volatile it will do a happens-before relationship in other words the Threads will have the same state of the Map.

Volatile Vs Atomic [duplicate]

This question already has answers here:
What is the difference between atomic / volatile / synchronized?
(7 answers)
Closed 9 years ago.
I read somewhere below line.
Java volatile keyword doesn't means atomic, its common misconception
that after declaring volatile, ++ operation will be atomic, to make
the operation atomic you still need to ensure exclusive access using
synchronized method or block in Java.
So what will happen if two threads attack a volatile primitive variable at same time?
Does this mean that whosoever takes lock on it, that will be setting its value first. And if in meantime, some other thread comes up and read old value while first thread was changing its value, then doesn't new thread will read its old value?
What is the difference between Atomic and volatile keyword?
The effect of the volatile keyword is approximately that each individual read or write operation on that variable is made atomically visible to all threads.
Notably, however, an operation that requires more than one read/write -- such as i++, which is equivalent to i = i + 1, which does one read and one write -- is not atomic, since another thread may write to i between the read and the write.
The Atomic classes, like AtomicInteger and AtomicReference, provide a wider variety of operations atomically, specifically including increment for AtomicInteger.
Volatile and Atomic are two different concepts. Volatile ensures, that a certain, expected (memory) state is true across different threads, while Atomics ensure that operation on variables are performed atomically.
Take the following example of two threads in Java:
Thread A:
value = 1;
done = true;
Thread B:
if (done)
System.out.println(value);
Starting with value = 0 and done = false the rule of threading tells us, that it is undefined whether or not Thread B will print value. Furthermore value is undefined at that point as well! To explain this you need to know a bit about Java memory management (which can be complex), in short: Threads may create local copies of variables, and the JVM can reorder code to optimize it, therefore there is no guarantee that the above code is run in exactly that order. Setting done to true and then setting value to 1 could be a possible outcome of the JIT optimizations.
volatile only ensures, that at the moment of access of such a variable, the new value will be immediately visible to all other threads and the order of execution ensures, that the code is at the state you would expect it to be. So in case of the code above, defining done as volatile will ensure that whenever Thread B checks the variable, it is either false, or true, and if it is true, then value has been set to 1 as well.
As a side-effect of volatile, the value of such a variable is set thread-wide atomically (at a very minor cost of execution speed). This is however only important on 32-bit systems that i.E. use long (64-bit) variables (or similar), in most other cases setting/reading a variable is atomic anyways. But there is an important difference between an atomic access and an atomic operation. Volatile only ensures that the access is atomically, while Atomics ensure that the operation is atomically.
Take the following example:
i = i + 1;
No matter how you define i, a different Thread reading the value just when the above line is executed might get i, or i + 1, because the operation is not atomically. If the other thread sets i to a different value, in worst case i could be set back to whatever it was before by thread A, because it was just in the middle of calculating i + 1 based on the old value, and then set i again to that old value + 1. Explanation:
Assume i = 0
Thread A reads i, calculates i+1, which is 1
Thread B sets i to 1000 and returns
Thread A now sets i to the result of the operation, which is i = 1
Atomics like AtomicInteger ensure, that such operations happen atomically. So the above issue cannot happen, i would either be 1000 or 1001 once both threads are finished.
There are two important concepts in multithreading environment:
atomicity
visibility
The volatile keyword eradicates visibility problems, but it does not deal with atomicity. volatile will prevent the compiler from reordering instructions which involve a write and a subsequent read of a volatile variable; e.g. k++.
Here, k++ is not a single machine instruction, but three:
copy the value to a register;
increment the value;
place it back.
So, even if you declare a variable as volatile, this will not make this operation atomic; this means another thread can see a intermediate result which is a stale or unwanted value for the other thread.
On the other hand, AtomicInteger, AtomicReference are based on the Compare and swap instruction. CAS has three operands: a memory location V on which to operate, the expected old value A, and the new value B. CAS atomically updates V to the new value B, but only if the value in V matches the expected old value A; otherwise, it does nothing. In either case, it returns the value currently in V. The compareAndSet() methods of AtomicInteger and AtomicReference take advantage of this functionality, if it is supported by the underlying processor; if it is not, then the JVM implements it via spin lock.
As Trying as indicated, volatile deals only with visibility.
Consider this snippet in a concurrent environment:
boolean isStopped = false;
:
:
while (!isStopped) {
// do some kind of work
}
The idea here is that some thread could change the value of isStopped from false to true in order to indicate to the subsequent loop that it is time to stop looping.
Intuitively, there is no problem. Logically if another thread makes isStopped equal to true, then the loop must terminate. The reality is that the loop will likely never terminate even if another thread makes isStopped equal to true.
The reason for this is not intuitive, but consider that modern processors have multiple cores and that each core has multiple registers and multiple levels of cache memory that are not accessible to other processors. In other words, values that are cached in one processor's local memory are not visisble to threads executing on a different processor. Herein lies one of the central problems with concurrency: visibility.
The Java Memory Model makes no guarantees whatsoever about when changes that are made to a variable in one thread may become visible to other threads. In order to guarantee that updates are visisble as soon as they are made, you must synchronize.
The volatile keyword is a weak form of synchronization. While it does nothing for mutual exclusion or atomicity, it does provide a guarantee that changes made to a variable in one thread will become visible to other threads as soon as it is made. Because individual reads and writes to variables that are not 8-bytes are atomic in Java, declaring variables volatile provides an easy mechanism for providing visibility in situations where there are no other atomicity or mutual exclusion requirements.
The volatile keyword is used:
to make non atomic 64-bit operations atomic: long and double. (all other, primitive accesses are already guaranteed to be atomic!)
to make variable updates guaranteed to be seen by other threads + visibility effects: after writing to a volatile variable, all the variables that where visible before writing that variable become visible to another thread after reading the same volatile variable (happen-before ordering).
The java.util.concurrent.atomic.* classes are, according to the java docs:
A small toolkit of classes that support lock-free thread-safe
programming on single variables. In essence, the classes in this
package extend the notion of volatile values, fields, and array
elements to those that also provide an atomic conditional update
operation of the form:
boolean compareAndSet(expectedValue, updateValue);
The atomic classes are built around the atomic compareAndSet(...) function that maps to an atomic CPU instruction. The atomic classes introduce the happen-before ordering as the volatile variables do. (with one exception: weakCompareAndSet(...)).
From the java docs:
When a thread sees an update to an atomic variable caused by a
weakCompareAndSet, it does not necessarily see updates to any other
variables that occurred before the weakCompareAndSet.
To your question:
Does this mean that whosoever takes lock on it, that will be setting
its value first. And in if meantime, some other thread comes up and
read old value while first thread was changing its value, then doesn't
new thread will read its old value?
You don't lock anything, what you are describing is a typical race condition that will happen eventually if threads access shared data without proper synchronization. As already mentioned declaring a variable volatile in this case will only ensure that other threads will see the change of the variable (the value will not be cached in a register of some cache that is only seen by one thread).
What is the difference between AtomicInteger and volatile int?
AtomicInteger provides atomic operations on an int with proper synchronization (eg. incrementAndGet(), getAndAdd(...), ...), volatile int will just ensure the visibility of the int to other threads.
So what will happen if two threads attack a volatile primitive variable at same time?
Usually each one can increment the value. However sometime, both will update the value at the same time and instead of incrementing by 2 total, both thread increment by 1 and only 1 is added.
Does this mean that whosoever takes lock on it, that will be setting its value first.
There is no lock. That is what synchronized is for.
And in if meantime, some other thread comes up and read old value while first thread was changing its value, then doesn't new thread will read its old value?
Yes,
What is the difference between Atomic and volatile keyword?
AtomicXxxx wraps a volatile so they are basically same, the difference is that it provides higher level operations such as CompareAndSwap which is used to implement increment.
AtomicXxxx also supports lazySet. This is like a volatile set, but doesn't stall the pipeline waiting for the write to complete. It can mean that if you read a value you just write you might see the old value, but you shouldn't be doing that anyway. The difference is that setting a volatile takes about 5 ns, bit lazySet takes about 0.5 ns.

Should getters and setters be synchronized?

private double value;
public synchronized void setValue(double value) {
this.value = value;
}
public double getValue() {
return this.value;
}
In the above example is there any point in making the getter synchronized?
I think its best to cite Java Concurrency in Practice here:
It is a common mistake to assume that synchronization needs to be used only when writing to shared variables; this is simply not true.
For each mutable state variable that may be accessed by more than one
thread, all accesses to that variable must be performed with the same
lock held. In this case, we say that the variable is guarded by that
lock.
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions "must" happen in insufflciently synchronized multithreaded programs will almost certainly be incorrect.
Normally, you don't have to be so careful with primitives, so if this would be an int or a boolean it might be that:
When a thread reads a variable without synchronization, it may see a
stale value, but at least it sees a value that was actually placed
there by some thread rather than some random value.
This, however, is not true for 64-bit operations, for instance on long or double if they are not declared volatile:
The Java Memory Model requires fetch and
store operations to be atomic, but for nonvolatile long and double
variables, the JVM is permitted to treat a 64-bit read or write as two
separate 32-bit operations. If the reads and writes occur in different
threads, it is therefore possible to read a nonvolatile long and get
back the high 32 bits of one value and the low 32 bits of another.
Thus, even if you don't care about stale values, it is not safe to use
shared mutable long and double variables in multithreaded programs
unless they are declared volatile or guarded by a lock.
Let me show you by example what is a legal way for a JIT to compile your code. You write:
while (myBean.getValue() > 1.0) {
// perform some action
Thread.sleep(1);
}
JIT compiles:
if (myBean.getValue() > 1.0)
while (true) {
// perform some action
Thread.sleep(1);
}
In just slightly different scenarios even the Java compiler could prouduce similar bytecode (it would only have to eliminate the possibility of dynamic dispatch to a different getValue). This is a textbook example of hoisting.
Why is this legal? The compiler has the right to assume that the result of myBean.getValue() can never change while executing above code. Without synchronized it is allowed to ignore any actions by other threads.
The reason here is to guard against any other thread updating the value when a thread is reading and thus avoid performing any action on stale value.
Here get method will acquire intrinsic lock on "this" and thus any other thread which might attempt to set/update using setter method will have to wait to acquire lock on "this" to enter the setter method which is already acquired by thread performing get.
This is why its recommended to follow the practice of using same lock when performing any operation on a mutable state.
Making the field volatile will work here as there are no compound statements.
It is important to note that synchronized methods use intrinsic lock which is "this". So get and set both being synchronized means any thread entering the method will have to acquire lock on this.
When performing non atomic 64 bit operations special consideration should be taken. Excerpts from Java Concurrency In Practice could be of help here to understand the situation -
"The Java Memory Model requires fetch and store operations to be atomic, but for non-volatile long and double variables, the JVM is permitted to treat a 64 bit read or write as two separate 32
bit operations. If the reads and writes occur in different threads, it is therefore possible to read a non-volatile long and get back the high 32 bits of one value and the low 32 bits of another. Thus, even if you don't care about stale values, it
is not safe to use shared mutable long and double variables in multi-threaded programs unless they are declared
volatile or guarded by a lock."
Maybe for someone this code looks awful, but it works very well.
private Double value;
public void setValue(Double value){
updateValue(value, true);
}
public Double getValue(){
return updateValue(value, false);
}
private double updateValue(Double value,boolean set){
synchronized(MyClass.class){
if(set)
this.value = value;
return value;
}
}

Multithreaded access and variable cache of threads

I could find the answer if I read a complete chapter/book about multithreading, but I'd like a quicker answer. (I know this stackoverflow question is similar, but not sufficiently.)
Assume there is this class:
public class TestClass {
private int someValue;
public int getSomeValue() { return someValue; }
public void setSomeValue(int value) { someValue = value; }
}
There are two threads (A and B) that access the instance of this class. Consider the following sequence:
A: getSomeValue()
B: setSomeValue()
A: getSomeValue()
If I'm right, someValue must be volatile, otherwise the 3rd step might not return the up-to-date value (because A may have a cached value). Is this correct?
Second scenario:
B: setSomeValue()
A: getSomeValue()
In this case, A will always get the correct value, because this is its first access so he can't have a cached value yet. Is this right?
If a class is accessed only in the second way, there is no need for volatile/synchronization, or is it?
Note that this example was simplified, and actually I'm wondering about particular member variables and methods in a complex class, and not about whole classes (i.e. which variables should be volatile or have synced access). The main point is: if more threads access certain data, is synchronized access needed by all means, or does it depend on the way (e.g. order) they access it?
After reading the comments, I try to present the source of my confusion with another example:
From UI thread: threadA.start()
threadA calls getSomeValue(), and informs the UI thread
UI thread gets the message (in its message queue), so it calls: threadB.start()
threadB calls setSomeValue(), and informs the UI thread
UI thread gets the message, and informs threadA (in some way, e.g. message queue)
threadA calls getSomeValue()
This is a totally synchronized structure, but why does this imply that threadA will get the most up-to-date value in step 6? (if someValue is not volatile, or not put into a monitor when accessed from anywhere)
If two threads are calling the same methods, you can't make any guarantees about the order that said methods are called. Consequently, your original premise, which depends on calling order, is invalid.
It's not about the order in which the methods are called; it's about synchronization. It's about using some mechanism to make one thread wait while the other fully completes its write operation. Once you've made the decision to have more than one thread, you must provide that synchronization mechanism to avoid data corruption.
As we all know, that its the crucial state of the data that we need to protect, and the atomic statements which govern the crucial state of the data must be Synchronized.
I had this example, where is used volatile, and then i used 2 threads which used to increment the value of a counter by 1 each time till 10000. So it must be a total of 20000. but to my surprise it didnt happened always.
Then i used synchronized keyword to make it work.
Synchronization makes sure that when a thread is accessing the synchronized method, no other thread is allowed to access this or any other synchronized method of that object, making sure that data corruption is not done.
Thread-Safe class means that it will maintain its correctness in the presence of the scheduling and interleaving of the underlining Runtime environment, without any thread-safe mechanism from the Client side, which access that class.
Let's look at the book.
A field may be declared volatile, in which case the Java memory model (§17) ensures that all threads see a consistent value for the variable.
So volatile is a guarantee that the declared variable won't be copied into thread local storage, which is otherwise allowed. It's further explained that this is an intentional alternative to locking for very simple kinds of synchronized access to shared storage.
Also see this earlier article, which explains that int access is necessarily atomic (but not double or long).
These together mean that if your int field is declared volatile then no locks are necessary to guarantee atomicity: you will always see a value that was last written to the memory location, not some confused value resulting from a half-complete write (as is possible with double or long).
However you seem to imply that your getters and setters themselves are atomic. This is not guaranteed. The JVM can interrupt execution at intermediate points of during the call or return sequence. In this example, this has no consequences. But if the calls had side effects, e.g. setSomeValue(++val), then you would have a different story.
The issue is that java is simply a specification. There are many JVM implementations and examples of physical operating environments. On any given combination an an action may be safe or unsafe. For instance On single processor systems the volatile keyword in your example is probably completely unnecessary. Since the writers of the memory and language specifications can't reasonably account for possible sets of operating conditions, they choose to white-list certain patterns that are guaranteed to work on all compliant implementations. Adhering to to these guidelines ensures both that your code will work on your target system and that it will be reasonably portable.
In this case "caching" typically refers to activity that is going on at the hardware level. There are certain events that occur in java that cause cores on a multi processor systems to "Synchronize" their caches. Accesses to volatile variables are an example of this, synchronized blocks are another. Imagine a scenario where these two threads X and Y are scheduled to run on different processors.
X starts and is scheduled on proc 1
y starts and is scheduled on proc 2
.. now you have two threads executing simultaneously
to speed things up the processors check local caches
before going to main memory because its expensive.
x calls setSomeValue('x-value') //assuming proc 1's cache is empty the cache is set
//this value is dropped on the bus to be flushed
//to main memory
//now all get's will retrieve from cache instead
//of engaging the memory bus to go to main memory
y calls setSomeValue('y-value') //same thing happens for proc 2
//Now in this situation depending on to order in which things are scheduled and
//what thread you are calling from calls to getSomeValue() may return 'x-value' or
//'y-value. The results are completely unpredictable.
The point is that volatile(on compliant implementations) ensures that ordered writes will always be flushed to main memory and that other processor's caches will be flagged as 'dirty' before the next access regardless of the thread from which that access occurs.
disclaimer: volatile DOES NOT LOCK. This is important especially in the following case:
volatile int counter;
public incrementSomeValue(){
counter++; // Bad thread juju - this is at least three instructions
// read - increment - write
// there is no guarantee that this operation is atomic
}
this could be relevant to your question if your intent is that setSomeValue must always be called before getSomeValue
If the intent is that getSomeValue() must always reflect the most recent call to setSomeValue() then this is a good place for the use of the volatile keyword. Just remember that without it there is no guarantee that getSomeValue() will reflect to most recent call to setSomeValue() even if setSomeValue() was scheduled first.
If I'm right, someValue must be volatile, otherwise the 3rd step might not return the up-to-date value (because A may have a cached
value). Is this correct?
If thread B calls setSomeValue(), you need some sort of synchronization to ensure that thread A can read that value. volatile won't accomplish this on its own, and neither will making the methods synchronized. The code that does this is ultimately whatever synchronization code you added that made sure that A: getSomeValue() happens after B: setSomeValue(). If, as you suggest, you used a message queue to synchronize threads, this happens because the memory changes made by thread A became visible to thread B once thread B acquired the lock on your message queue.
If a class is accessed only in the second way, there is no need for
volatile/synchronization, or is it?
If you are really doing your own synchronization then it doesn't sound like you care whether these classes are thread-safe. Be sure that you aren't accessing them from more than one thread at the same time though; otherwise, any methods that aren't atomic (assiging an int is) may lead to you to be in an unpredictable state. One common pattern is to put the shared state into an immutable object so that you are sure that the receiving thread isn't calling any setters.
If you do have a class that you want to be updated and read from multiple threads, I'd probably do the simplest thing to start, which is often to synchronize all public methods. If you really believe this to be a bottleneck, you could look into some of the more complex locking mechanisms in Java.
So what does volatile guarantee?
For the exact semantics, you might have to go read tutorials, but one way to summarize it is that 1) any memory changes made by the last thread to access the volatile will be visible to the current thread accessing the volatile, and 2) that accessing the volatile is atomic (it won't be a partially constructed object, or a partially assigned double or long).
Synchronized blocks have analogous properties: 1) any memory changes made by the last thread to access to the lock will be visible to this thread, and 2) the changes made within the block are performed atomically with respect to other synchronized blocks
(1) means any memory changes, not just changes to the volatile (we're talking post JDK 1.5) or within the synchronized block. This is what people mean when they refer to ordering, and this is accomplished in different ways on different chip architectures, often by using memory barriers.
Also, in the case of synchronous blocks (2) only guarantees that you won't see inconsistent values if you are within another block synchronized on the same lock. It's usually a good idea to synchronize all access to shared variables, unless you really know what you are doing.

Is unsynchronized read of integer threadsafe in java?

I see this code quite frequently in some OSS unit tests, but is it thread safe ? Is the while loop guaranteed to see the correct value of invoc ?
If no; nerd points to whoever also knows which CPU architecture this may fail on.
private int invoc = 0;
private synchronized void increment() {
invoc++;
}
public void isItThreadSafe() throws InterruptedException {
for (int i = 0; i < TOTAL_THREADS; i++) {
new Thread(new Runnable() {
public void run() {
// do some stuff
increment();
}
}).start();
}
while (invoc != TOTAL_THREADS) {
Thread.sleep(250);
}
}
No, it's not threadsafe. invoc needs to be declared volatile, or accessed while synchronizing on the same lock, or changed to use AtomicInteger. Just using the synchronized method to increment invoc, but not synchronizing to read it, isn't good enough.
The JVM does a lot of optimizations, including CPU-specific caching and instruction reordering. It uses the volatile keyword and locking to decide when it can optimize freely and when it has to have an up-to-date value available for other threads to read. So when the reader doesn't use the lock the JVM can't know not to give it a stale value.
This quote from Java Concurrency in Practice (section 3.1.3) discusses how both writes and reads need to be synchronized:
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
The next section (3.1.4) covers using volatile:
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Back when we all had single-CPU machines on our desktops we'd write code and never have a problem until it ran on a multiprocessor box, usually in production. Some of the factors that give rise to the visiblity problems, things like CPU-local caches and instruction reordering, are things you would expect from any multiprocessor machine. Elimination of apparently unneeded instructions could happen for any machine, though. There's nothing forcing the JVM to ever make the reader see the up-to-date value of the variable, you're at the mercy of the JVM implementors. So it seems to me this code would not be a good bet for any CPU architecture.
Well!
private volatile int invoc = 0;
Will do the trick.
And see Are java primitive ints atomic by design or by accident? which sites some of the relevant java definitions. Apparently int is fine, but double & long might not be.
edit, add-on. The question asks, "see the correct value of invoc ?". What is "the correct value"? As in the timespace continuum, simultaneity doesn't really exist between threads. One of the above posts notes that the value will eventually get flushed, and the other thread will get it. Is the code "thread safe"? I would say "yes", because it won't "misbehave" based on the vagaries of sequencing, in this case.
Theoretically, it is possible that the read is cached. Nothing in Java memory model prevents that.
Practically, that is extremely unlikely to happen (in your particular example). The question is, whether JVM can optimize across a method call.
read #1
method();
read #2
For JVM to reason that read#2 can reuse the result of read#1 (which can be stored in a CPU register), it must know for sure that method() contains no synchronization actions. This is generally impossible - unless, method() is inlined, and JVM can see from the flatted code that there's no sync/volatile or other synchronization actions between read#1 and read#2; then it can safely eliminate read#2.
Now in your example, the method is Thread.sleep(). One way to implement it is to busy loop for certain times, depending on CPU frequency. Then JVM may inline it, and then eliminate read#2.
But of course such implementation of sleep() is unrealistic. It is usually implemented as a native method that calls OS kernel. The question is, can JVM optimize across such a native method.
Even if JVM has knowledge of internal workings of some native methods, therefore can optimize across them, it's improbable that sleep() is treated that way. sleep(1ms) takes millions of CPU cycles to return, there is really no point optimizing around it to save a few reads.
--
This discussion reveals the biggest problem of data races - it takes too much effort to reason about it. A program is not necessarily wrong, if it is not "correctly synchronized", however to prove it's not wrong is not an easy task. Life is much simpler, if a program is correctly synchronized and contains no data race.
As far as I understand the code it should be safe. The bytecode can be reordered, yes. But eventually invoc should be in sync with the main thread again. Synchronize guarantees that invoc is incremented correctly so there is a consistent representation of invoc in some register. At some time this value will be flushed and the little test succeeds.
It is certainly not nice and I would go with the answer I voted for and would fix code like this because it smells. But thinking about it I would consider it safe.
If you're not required to use "int", I would suggest AtomicInteger as an thread-safe alternative.

Categories