Java exposes the CAS operation through its atomic classes, e.g.
AtomicInteger.compareAndSet(expected,update)
When expected == update, are these calls a no-op or do they still have the memory consistency effects of a volatile read+write (as is the case when expected != update)?
I went through the native code and there does not appear to be any difference when the values are equal. In particular, the comparison is on the int value itself, not on a reference, so there is no special-casing for equal values.
It will run through the same logic, with the same memory consistency effects, as in the case where expected != update.
One note is that there will always be at least a volatile load on the field's location, so you at the very least will have a volatile read of the backing int field.
AFAIK there is no check for expected == update and no change in behaviour. To do so could add cycles if the hardware didn't do it already, which I suspect it doesn't.
I wouldn't write code which depends on side effects of calling CAS in any case.
You could add the check yourself if you think it is likely of course.
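For illustration, a sketch of such a guard (the helper name casSkippingNoOp is invented). Note that skipping changes the memory effects: the skip path performs only a volatile read of the backing field, with no volatile write.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasSkip {
    // Hypothetical helper: skip the CAS when expected == update.
    // On the skip path only a volatile read happens, no volatile write,
    // so the memory effects are weaker than a real compareAndSet.
    static boolean casSkippingNoOp(AtomicInteger a, int expected, int update) {
        if (expected == update) {
            return a.get() == expected;   // volatile read only
        }
        return a.compareAndSet(expected, update);
    }
}
```

Whether this saves anything in practice would have to be measured; the branch itself costs cycles too.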
Related
I keep fighting to understand what VarHandle::setOpaque and VarHandle::getOpaque are really doing. It has not been easy so far. There are some things I think I get (but I will not present them in the question itself, so as not to muddy the waters), but overall this is misleading at best for me.
The documentation:
Returns the value of a variable, accessed in program order...
Well in my understanding if I have:
int xx = x; // read x
int yy = y; // read y
These reads can be re-ordered. On the other hand, if I have:
// simplified code, does not compile, but reads happen on the same "this" for example
int xx = VarHandle_X.getOpaque(x);
int yy = VarHandle_Y.getOpaque(y);
This time re-orderings are not possible? And is this what "program order" means? Are we talking about insertions of barriers here for this re-ordering to be prohibited? If so, since these are two loads, would the same be achieved via:
int xx = x;
VarHandle.loadLoadFence()
int yy = y;
But it gets a lot trickier:
... but with no assurance of memory ordering effects with respect to other threads.
I could not come up with an example to even pretend I understand this part.
It seems to me that this documentation is targeted at people who know exactly what they are doing (and I am definitely not one)... So can someone shed some light here?
Well in my understanding if I have:
int xx = x; // read x
int yy = y; // read y
These reads can be re-ordered.
These reads may not only be reordered, they may not happen at all. The thread may use an old, previously read value for x and/or y, or values it previously wrote to these variables even though the write may not actually have been performed yet. So the "reading thread" may use values that no other thread knows of and that are not in the heap memory at that time (and probably never will be).
On the other hand, if I have:
// simplified code, does not compile, but reads happen on the same "this" for example
int xx = VarHandle_X.getOpaque(x);
int yy = VarHandle_Y.getOpaque(y);
This time re-orderings are not possible? And is this what "program order" means?
Simply said, the main feature of opaque reads and writes is that they will actually happen. This implies that they cannot be reordered with respect to other memory accesses of at least the same strength, but that has no impact on ordinary reads and writes.
The term program order is defined by the JLS:
… the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.
That’s the evaluation order specified for expressions and statements. The order in which we perceive the effects, as long as only a single thread is involved.
Are we talking about insertions of barriers here for this re-ordering to be prohibited?
No, there is no barrier involved, which might be the intention behind the phrase “…but with no assurance of memory ordering effects with respect to other threads”.
Perhaps, we could say that opaque access works a bit like volatile was before Java 5, enforcing read access to see the most recent heap memory value (which makes only sense if the writing end also uses opaque or an even stronger mode), but with no effect on other reads or writes.
So what can you do with it?
A typical use case would be a cancellation or interruption flag that is not supposed to establish a happens-before relationship. Often, the stopped background task has no interest in perceiving actions made by the stopping task prior to signalling, but will just end its own activity. So writing and reading the flag with opaque mode would be sufficient to ensure that the signal is eventually noticed (unlike the normal access mode), but without any additional negative impact on the performance.
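A minimal sketch of such a cancellation flag (class and method names are invented; the VarHandle setup is the standard lookup boilerplate):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class Worker {
    private boolean cancelled;                 // accessed only in opaque mode

    private static final VarHandle CANCELLED;
    static {
        try {
            CANCELLED = MethodHandles.lookup()
                    .findVarHandle(Worker.class, "cancelled", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void cancel() {                            // called by the stopping thread
        CANCELLED.setOpaque(this, true);
    }

    // returns true if the cancellation signal was noticed
    boolean run(int iterations) {
        for (int i = 0; i < iterations; i++) {
            if ((boolean) CANCELLED.getOpaque(this)) {
                return true;                   // no happens-before needed, just stop
            }
            // ... one unit of work ...
        }
        return false;
    }
}
```

The opaque write guarantees the store actually happens, and the opaque reads guarantee the loop actually re-reads the field, without paying for any cross-variable ordering.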
Likewise, a background task could write progress updates, like a percentage number, which the reporting (UI) thread is supposed to notice timely, while no happens-before relationship is required before the publication of the final result.
It’s also useful if you just want atomic access for long and double, without any other impact.
Since truly immutable objects using final fields are immune to data races, you can use opaque modes for timely publishing immutable objects, without the broader effect of release/acquire mode publishing.
A special case would be periodically checking a status for an expected value update and, once available, querying the value with a stronger mode (or executing the matching fence instruction explicitly). In principle, a happens-before relationship can only be established between the write and its subsequent read anyway, but since optimizers usually don't have the horizon to identify such an inter-thread use case, performance critical code can use opaque access to optimize such a scenario.
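A sketch of that polling pattern, assuming a simple status-word protocol (0 = pending, 1 = published; all names are invented):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class StatusPoller {
    private int status;                        // 0 = pending, 1 = published (assumed)
    private int result;

    private static final VarHandle STATUS;
    static {
        try {
            STATUS = MethodHandles.lookup()
                    .findVarHandle(StatusPoller.class, "status", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int r) {
        result = r;                            // plain write, ordered by the release below
        STATUS.setRelease(this, 1);
    }

    int await() {
        // cheap opaque polling; the ordering cost is paid only once, below
        while ((int) STATUS.getOpaque(this) != 1) {
            Thread.onSpinWait();
        }
        VarHandle.acquireFence();              // orders the subsequent plain read of result
        return result;
    }
}
```

The loop body stays cheap because each iteration is only an opaque load; the acquire fence is executed exactly once, after the expected value has been seen.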
Opaque means that the thread executing an opaque operation is guaranteed to observe its own actions in program order, but that's it.
Other threads are free to observe that thread's actions in any order. This is common on x86, since it has a "write ordered with store-buffer forwarding" memory model: even if the thread does a store before a load, the store can be cached in the store buffer, and a thread executing on another core may observe the actions in the reverse order, load-store instead of store-load. So an opaque operation is done on x86 for free (on x86 we actually also get acquire for free; see this extremely exhaustive answer for details on some other architectures and their memory models: https://stackoverflow.com/a/55741922/8990329)
Why is it useful? Well, I could speculate that if some thread observes a value stored with opaque memory semantics, then a subsequent read will observe "at least this or a later" value (plain memory access does not provide such guarantees, does it?).
Also, since Java 9 VarHandles are somewhat related to the acquire/release/consume semantics in C and C++, I think it is worth noting that opaque access is similar to memory_order_relaxed, which is defined in the Standard as follows:
For memory_order_relaxed, no operation orders memory.
with some examples provided.
I have been struggling with opaque myself and the documentation is certainly not easy to understand.
From the above link:
Opaque operations are bitwise atomic and coherently ordered.
The bitwise atomic part is obvious. Coherently ordered means that loads/stores to a single address have some total order, each read sees the most recent write before it in that order, and the order is consistent with the program order. For some coherence examples, see the following JCStress test.
Coherence doesn't provide any ordering guarantees between loads/stores to different addresses so it doesn't need to provide any fences so that loads/stores to different addresses are ordered.
With opaque, the compiler will emit the loads/stores as it sees them. But the underlying hardware is still allowed to reorder load/stores to different addresses.
I upgraded your example to the message-passing litmus test:
thread1:
X.setOpaque(1);
Y.setOpaque(1);
thread2:
ry = Y.getOpaque();
rx = X.getOpaque();
if (ry == 1 && rx == 0) println("Oh shit");
The above could fail on a platform that allows the 2 stores or the 2 loads to be reordered (e.g. ARM or PowerPC). Opaque is not required to provide causality. JCStress has a good example of that as well.
Also, the following IRIW example can fail:
thread1:
X.setOpaque(1);
thread2:
Y.setOpaque(1);
thread3:
rx_thread3 = X.getOpaque();
[LoadLoad]
ry_thread3 = Y.getOpaque();
thread4:
ry_thread4 = Y.getOpaque();
[LoadLoad]
rx_thread4 = X.getOpaque();
Can it be that we end up with rx_thread3=1,ry_thread3=0,ry_thread4=1 and rx_thread4 is 0?
With opaque this can happen. Even though the loads are prevented from being reordered, opaque accesses do not require multi-copy-atomicity (stores to different addresses issued by different CPUs can be seen in different orders).
Release/acquire is stronger than opaque: since this outcome is allowed even with release/acquire, it is allowed with opaque as well. So opaque is not required to provide consensus.
I have a class that contains a boolean field like this one:
public class MyClass
{
    private bool boolVal;

    public bool BoolVal
    {
        get { return boolVal; }
        set { boolVal = value; }
    }
}
The field can be read and written from many threads using the property. My question is whether I should fence the getter and setter with a lock statement. Or should I simply use the volatile keyword and save the locking? Or should I totally ignore multithreading, since getting and setting boolean values is atomic?
regards,
There are several issues here.
The simple one first. Yes, reading and writing a boolean variable are atomic operations. (Clarification: a read operation by itself, or a write operation by itself, is atomic for booleans; a read followed by a write is of course two operations, which together are not atomic.)
However, unless you take extra steps, the compiler might optimize away such reading and writing, or move the operations around, which could make your code operate differently from what you intend.
Marking the field as volatile means that the operations will not be optimized away, the directive basically says that the compiler should never assume the value in this field is the same as the previous one, even if it just read it in the previous instruction.
However, on multicore and multicpu machines, different cores and cpus might have a different value for the field in their cache, and thus you add a lock { } clause, or anything else that forces a memory barrier. This will ensure that the field value is consistent across cores. Additionally, reads and writes will not move past a memory barrier in the code, which means you have predictability in where the operations happen.
So if you suspect, or know, that this field will be written to and read from multiple threads, I would definitely add locking and volatile to the mix.
Note that I'm no expert in multithreading; I'm able to hold my own, but I usually program defensively. It might be (I would assume it is highly likely) that you can implement something that doesn't use a lock (there are many lock-free constructs), but sadly I'm not experienced enough in this topic to handle those things. Thus my advice is to add both a lock clause and a volatile directive.
volatile alone is not enough and serves a different purpose; lock should be fine, but in the end it depends on whether anyone is going to set boolVal in MyClass itself (who knows, you may have a worker thread spinning in there). It also depends on how you are using boolVal internally. You may also need protection elsewhere. If you ask me: if you are not DEAD SURE you are going to use MyClass in more than one thread, then it's not worth even thinking about it.
P.S. you may also want to read this section
Quick question? Is this line atomic in C++ and Java?
class foo {
    bool test() {
        // Is this line atomic?
        return a == 1 ? 1 : 0;
    }
    int a;
};
If there are multiple threads accessing that line, we could end up doing the check a==1 first, then a is updated, then the return, right?
Added: I didn't complete the class and of course, there are other parts which update a...
No, for both C++ and Java.
In Java, you need to make your method synchronized and protect other uses of a in the same way. Make sure you're synchronizing on the same object in all cases.
In C++, you need to use std::mutex to protect a, probably using std::lock_guard to make sure you properly unlock the mutex at the end of your function.
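For the Java half, a minimal sketch of what "make your method synchronized and protect other uses of a in the same way" can look like (the setter is an assumed addition, since the original snippet never shows how a is updated):

```java
class Foo {
    private int a;

    // all accesses to `a` are guarded by the monitor of `this`
    synchronized boolean test() {
        return a == 1;
    }

    // hypothetical updater, synchronized on the same object
    synchronized void setA(int value) {
        a = value;
    }
}
```

The key point is that both the reader and every writer synchronize on the same object; synchronizing only one side gives no guarantee.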
return a==1 ? 1 : 0;
is a simple way of writing
if (a == 1)
    return 1;
else
    return 0;
I don't see any code for updating a. But I think you could figure it out.
Regardless of whether there is a write, reading the value of a non-atomic type in C++ is not an atomic operation. If there are no writes then you might not care whether it's atomic; if some other thread might be modifying the value then you certainly do care.
The correct way of putting it is simply: No! (both for Java and C++)
A less correct, but more practical answer is: Technically this is not atomic, but on most mainstream architectures, it is at least for C++.
Nothing is being modified in the code you posted; the variable is only tested. The code will thus usually result in a single TEST (or similar) instruction accessing that memory location, and that is, incidentally, atomic. The instruction will read a cache line, and there will be one well-defined value in the respective location, whatever it may be.
However, this is incidental/accidental, not something you can rely on.
It will usually even work (again, incidentally/accidentally) when a single other thread writes to the value. For this, the CPU fetches a cache line, overwrites the location for the respective address within the cache line, and writes back the entire cache line to RAM. When you test the variable, you fetch a cache line which contains either the old or the new value (nothing in between). There are no happens-before guarantees of any kind, but you can still consider this "atomic".
It is much more complicated when several threads modify that variable concurrently (not part of the question). For this to work properly, you need to use something from C++11 <atomic>, or use an atomic intrinsic, or something similar. Otherwise it is very much unclear what happens, and what the result of an operation may be -- one thread might read the value, increment it and write it back, but another one might read the original value before the modified value is written back.
This is more or less guaranteed to end badly, on all current platforms.
No, it is not atomic (in general), although it can be on some architectures (in C++, for example, on Intel, if the integer is aligned, which it will be unless you force it not to be).
Consider these three threads:
// thread one:          // thread two:          // thread three:
while (true)            while (true)            while (a) ;
    a = 0xFFFF0000;         a = 0x0000FFFF;
Suppose the write to a is not atomic (for example, on Intel if a is unaligned, say with 16 bits in each of two consecutive cache lines for the sake of discussion). While it seems that the third thread can never come out of the loop (the two possible values of a are both non-zero), the fact is that the assignments are not atomic: thread two could update the higher 16 bits to 0, and thread three could read the lower 16 bits as 0 before thread two gets time to complete the update, and come out of the loop.
The whole conditional is irrelevant to the question, since the returned value is local to the thread.
No, it is still a test followed by a set and then a return.
Yes, multithreadedness will be a problem.
It's just syntactic sugar.
Your question can be rephrased as: is statement:
a == 1
atomic or not? No, it is not atomic; you should use std::atomic for a, or check that condition under a lock of some sort. Whether the whole ternary operator is atomic or not does not matter in this context, as it does not change anything. If what you are asking is whether, in this code:
bool flag = somefoo.test();
flag will be consistent with a == 1: it definitely will not be, and it is irrelevant whether the whole ternary operator in your question is atomic.
There are a lot of good answers here, but none of them mention the need in Java to mark a as volatile.
This is especially important if no other synchronization method is employed, but other threads could be updating a. Otherwise, you could be reading a stale value of a.
Consider the following code:
bool done = false;

void Thread1() {
    while (!done) {
        do_something_useful_in_a_loop_1();
    }
    do_thread1_cleanup();
}

void Thread2() {
    do_something_useful_2();
    done = true;
    do_thread2_cleanup();
}
The synchronization between these two threads is done using a boolean variable done. This is a wrong way to synchronize two threads.
On x86, the biggest issue is the compile-time optimizations.
Part of the code of do_something_useful_2() can be moved below "done = true" by the compiler.
Part of the code of do_thread2_cleanup() can be moved above "done = true" by the compiler.
If do_something_useful_in_a_loop_1() doesn't modify "done", the compiler may re-write Thread1 in the following way:
if (!done) {
    while (true) {
        do_something_useful_in_a_loop_1();
    }
}
do_thread1_cleanup();
so Thread1 will never exit.
On architectures other than x86, the cache effects or out-of-order instruction execution may lead to other subtle problems.
Most race detectors will detect such a race.
Also, most dynamic race detectors will report data races on the memory accesses that were intended to be synchronized with this bool (i.e. between do_something_useful_2() and do_thread1_cleanup()).
To fix such a race you need to use compiler and/or memory barriers (if you are not an expert -- simply use locks).
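In Java, the cheapest correct fix for this particular pattern is usually a volatile flag. A sketch of the analog (the useful-work calls from the snippet above are stubbed out as comments):

```java
class Tasks {
    // volatile forbids hoisting the check out of the loop and
    // establishes a happens-before edge from the writer to the reader
    private volatile boolean done = false;

    void thread1() {
        while (!done) {
            // do_something_useful_in_a_loop_1();
        }
        // do_thread1_cleanup();
    }

    void thread2() {
        // do_something_useful_2();
        done = true;
        // do_thread2_cleanup();
    }
}
```

With the flag volatile, everything thread2 did before `done = true` is visible to thread1 after it reads `done` as true.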
I was just wondering if someone could explain the meaning of this:
Operations like increment and
decrement (e.g. ++ and --) can't be
used on a volatile variable because
these operations are syntactic sugar
for a load, change and a store.
I think increment and decrement should just work fine for a volatile variable, the only difference would be every time you read or write you would be accessing from/writing to main memory rather than from cache.
A volatile variable only ensures visibility. It does not ensure atomicity. I guess that is how the statement should be interpreted.
I think you're taking the quote out of context.
Of course ++ and -- can be applied to volatile variables. They just won't be atomic.
And since volatile often implies that they must be handled in an atomic manner, this is counter to the goal.
The problem with ++ and -- is that they might feel like they are atomic, when indeed they are not.
Doing a = a + 1 makes it (somewhat) explicit that it is not an atomic operation, but one might (wrongly) think that a++ is atomic.
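A small demonstration of that trap (illustrative only; how many updates get lost varies from run to run): two threads increment a volatile int and an AtomicInteger the same number of times.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LostUpdateDemo {
    static volatile int plainCounter = 0;            // visible, but ++ is not atomic
    static final AtomicInteger atomicCounter = new AtomicInteger();

    static void run(int perThread) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                plainCounter++;                      // load + add + store: can interleave
                atomicCounter.incrementAndGet();     // one atomic read-modify-write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // atomicCounter always ends at 2 * perThread;
        // plainCounter is typically lower, because increments get lost
    }
}
```

The atomic counter is exact because incrementAndGet performs the whole read-modify-write as one atomic operation; the volatile counter only guarantees that each read sees some recently written value.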
The Java Language Specification does not have atomic operations for the ++ and -- operators. In other words, when you write code in the following manner:
a++;
the Java compiler actually emits code that is similar to the set of steps below (the actual instructions will vary depending on the nature of the variable):
Load the operand onto the stack using one of the operations for loading data.
Duplicate the value of the operand on the stack (for the purpose of returning it later). This is usually accomplished using a dup operation.
Increment the value on the stack. This is usually accomplished using the iadd operation in the VM.
Return the value (obtained in step 2).
As you can observe, there are multiple operations in the VM for what is commonly thought to be an atomic operation. The VM can ensure atomicity only up to the level of an individual operation. Any further requirement can be achieved only via synchronization or other techniques.
Using the volatile keyword allows other threads to obtain the most recent value of a variable; all read operations on a variable will return the most recently updated value on a per-instruction basis. For example, if the variable a were volatile in the previous example, a thread reading a would see different values depending on whether it read a after instruction 2 or after instruction 3. Use of volatile does not protect against this scenario; it protects against the scenario where multiple threads see different values for a after instruction 2 (for instance).
Volatile does not guarantee atomicity in an operation that involves multiple steps.
Look at it this way: if I am reading a value and that is all I am doing, the read operation is atomic. It is a single step, and hence the use of volatile here will be fine. If, however, I am reading that value and changing it before writing it back, that is a multi-step operation, and volatile does not manage atomicity for this.
The increment and decrement operations are multi-stepped, and hence the use of the volatile modifier is not sufficient.
Nope -- you use "volatile" to indicate that the variable can be changed by an external entity.
This would typically be some JNI C code, or a special register linked to some hardware such as a thermometer. Java cannot guarantee that all JVMs on all architectures will be capable of incrementing these values in a single machine cycle, so it doesn't let you do it anywhere.
This may seem a very silly question.
Consider this:
I have a simple Boolean object with a getter and a setter. Now both of the methods are called from a lot of threads very frequently.
Do I need to have a synchronization for this boolean?
Also are Boolean assignments atomic operations?
[UPDATE]:
I know about Atomic Boolean already. I already have a lot of varied solutions, But I was specifically looking for answers and justification of answers for the above 2 question.
No, Boolean access is NOT atomic (on the level of machine code), although it does take "only 1 operation in Java".
Therefore, yes, you do need synchronization for Boolean.
Please see slides 4-6 of this presentation for code examples.
On a related note, you should not synchronize on a Boolean
Yes. But if it's a flag that is written from one thread only and you want to ensure visibility to all threads, then a cheap alternative is using volatile.
Yes. Even if the internal representation of the object (i.e. the actual boolean flag inside the Boolean wrapper) were 64 bits and could therefore get "split" in a concurrent situation, a boolean can only have one of two values, right? So plain assignments (get or set) are atomic, but if you're doing anything else (like check-then-act, for instance x = !x), then it's of course not atomic unless synchronized.
From a technical perspective, synchronization is not required for writes in one thread to be perceived in another thread. What you do need is a happens-before edge. It is likely that either volatile or synchronized will be used to achieve the happens-before edge. Both of those techniques result in a synchronized-with edge. So, in practice, you will probably use synchronization to manage the state of your boolean.
Yes. Note that you are not changing the state of the Boolean object. You are only modifying the reference to the Boolean object. Section 17.7 of the language specification states that "writes to and reads of references are always atomic."
Update: Let me expound upon the need for a happens-before edge. Without a happens-before edge, then the changes that one thread makes to the variable are not guaranteed to ever be perceived by the other threads. It is not simply that the change may be perceived at a bad time such as in between a read and a write. The change may never be perceived.
Let's say that we have a boolean variable that we initialize to false. Then we start two threads. The first thread sets the variable to true and stops. The second thread continually checks the variable until it is true, after which it stops. There is no guarantee that the second thread will ever see the variable as true.
Use AtomicBoolean.
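For the common check-then-act case, a minimal sketch of why AtomicBoolean helps (class and method names are invented):

```java
import java.util.concurrent.atomic.AtomicBoolean;

class OneShot {
    private final AtomicBoolean started = new AtomicBoolean(false);

    // atomic check-then-act: exactly one caller ever wins the transition
    boolean tryStart() {
        return started.compareAndSet(false, true);
    }
}
```

compareAndSet combines the test and the update into a single atomic operation, which a plain or volatile boolean cannot do.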
No, you don't. But declare the variable volatile so that the values are reflected in all threads that are accessing the boolean. If you look at AtomicBoolean's set(..) method, it doesn't have any synchronization either.
Yes, practically speaking, assignment is atomic. Just setting the value does not need synchronization. However, if you want to do something like:
if (!bool) {
    bool = true;
}
then you need synchronization (or AtomicBoolean, which is more efficient than synchronization)
http://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
says that read and write of primitive variables is atomic.
Hence it is possible to enforce strict alternation or happens-after relationships using a boolean (or a volatile boolean if cache effects are a concern).
Even if it were atomic, there would still be synchronization issues, since you will probably check the value at some point, e.g.:
if (boolVar == true)
    // <- other thread takes control here
    do_something();
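One way to close that window, sketched in Java (names invented): perform the test and the action under the same lock.

```java
class Guarded {
    private boolean boolVar;
    private int runs;                    // stands in for the side effect

    synchronized void set(boolean v) {
        boolVar = v;
    }

    synchronized void doIfSet() {
        // the test and the action happen under one lock, so no other
        // thread can flip boolVar between the check and do_something()
        if (boolVar) {
            runs++;
        }
    }

    synchronized int runs() {
        return runs;
    }
}
```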