Memory effects of synchronization in Java

The JSR-133 FAQ says:

But there is more to synchronization than mutual exclusion. Synchronization ensures that memory writes by a thread before or during a synchronized block are made visible in a predictable manner to other threads which synchronize on the same monitor. After we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory, so that writes made by this thread can be visible to other threads. Before we can enter a synchronized block, we acquire the monitor, which has the effect of invalidating the local processor cache so that variables will be reloaded from main memory. We will then be able to see all of the writes made visible by the previous release.
I also remember reading that on modern Sun VMs uncontended synchronizations are cheap. I am a little confused by this claim. Consider code like:
class Foo {
    int x = 1;
    int y = 1;
    ..
    synchronized (aLock) {
        x = x + 1;
    }
}
Updates to x need the synchronization, but does acquiring the lock also evict the value of y from the cache? I can't imagine that to be the case, because if it were true, techniques like lock striping might not help. Alternatively, can the JVM reliably analyze the code to ensure that y is not modified in another synchronized block using the same lock, and hence not evict the value of y from the cache when entering the synchronized block?
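To illustrate what I mean by lock striping, here is a minimal, hypothetical sketch (all names are made up): each stripe of the data is guarded by its own lock, so acquiring one stripe's lock should say nothing about data guarded by the others.

class StripedCounters {
    private final Object[] locks = new Object[8];
    private final int[] counts = new int[8];

    StripedCounters() {
        for (int i = 0; i < locks.length; i++) {
            locks[i] = new Object();
        }
    }

    void increment(int key) {
        int stripe = (key & 0x7fffffff) % locks.length; // pick a stripe
        synchronized (locks[stripe]) { // locks only that stripe
            counts[stripe]++;
        }
    }
}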

The short answer is that the JSR-133 FAQ goes too far in its explanation. This isn't a serious issue because the FAQ is a non-normative document which isn't part of the language or JVM standards. Rather, it is only a document which explains one possible strategy that is sufficient for implementing the memory model, but isn't in general necessary. On top of that, the comment about "cache flushing" is basically totally out of place, since essentially zero architectures would implement the Java memory model by doing any type of "cache flushing" (and many architectures don't even have such instructions).
The Java memory model is formally defined in terms of things like visibility, atomicity, and happens-before relationships, which specify exactly which writes threads must see, which actions must occur before other actions, and other relationships, using a precisely (mathematically) defined model. Behavior which isn't formally defined could be random, or well-defined in practice on some hardware and JVM implementation - but of course you should never rely on this, as it might change in the future, and you could never really be sure that it was well-defined in the first place unless you wrote the JVM and were well aware of the hardware semantics.
So the text that you quoted is not formally describing what Java guarantees, but rather is describing how some hypothetical architecture which had very weak memory ordering and visibility guarantees could satisfy the Java memory model requirements using cache flushing. Any actual discussion of cache flushing, main memory and so on is clearly not generally applicable to Java as these concepts don't exist in the abstract language and memory model spec.
In practice, the guarantees offered by the memory model are much weaker than a full flush - having every atomic, concurrency-related or lock operation flush the entire cache would be prohibitively expensive - and this is almost never done in practice. Rather, special atomic CPU operations are used, sometimes in combination with memory barrier instructions, which help ensure memory visibility and ordering. So the apparent inconsistency between cheap uncontended synchronization and "fully flushing the cache" is resolved by noting that the first is true and the second is not - no full flush is required by the Java memory model (and no flush occurs in practice).
If the formal memory model is a bit too heavy to digest (you wouldn't be alone), you can also dive deeper into this topic by taking a look at Doug Lea's cookbook, which is in fact linked in the JSR-133 FAQ but comes at the issue from a concrete hardware perspective, since it is intended for compiler writers. It discusses exactly what barriers are needed for particular operations, including synchronization, and the barriers discussed there can fairly easily be mapped to actual hardware; much of the actual mapping is discussed right in the cookbook.

BeeOnRope is right: the text you quote delves more into typical implementation details than into what the Java Memory Model actually guarantees. In practice, you may often see that y is actually purged from CPU caches when you synchronize on x (and also if x in your example were a volatile variable, in which case explicit synchronization is not necessary to trigger the effect). This is because on most CPUs (note that this is a hardware effect, not something the JMM describes), the cache works on units called cache lines, which are usually longer than a machine word (for example, 64 bytes wide). Since only complete lines can be loaded or invalidated in the cache, there is a good chance that x and y will fall into the same line, and that flushing one of them will also flush the other.
It is possible to write a benchmark which shows this effect. Make a class with just two volatile int fields and let two threads perform some operations (e.g. incrementing in a long loop), one on one of the fields and one on the other. Time the operation. Then insert 16 int fields between the two original fields and repeat the test (16 * 4 = 64 bytes). Note that an array is just a reference, so an array of 16 elements won't do the trick. You may see a significant improvement in performance, because operations on one field will no longer influence the other. Whether this works for you will depend on the JVM implementation and processor architecture. I have seen this in practice on a Sun JVM and a typical x64 laptop; the difference in performance was a factor of several times.
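A minimal sketch of such a benchmark, with made-up names. Two caveats: the JVM is free to reorder fields, so the commented-out padding is not guaranteed to end up between a and b, and timings will vary by JVM and CPU.

public class FalseSharingBench {
    volatile int a;
    // Uncomment to (hopefully) push b onto another cache line:
    // 16 * 4 bytes = 64 bytes, a common cache-line size.
    // int p01, p02, p03, p04, p05, p06, p07, p08,
    //     p09, p10, p11, p12, p13, p14, p15, p16;
    volatile int b;

    public static void main(String[] args) throws InterruptedException {
        FalseSharingBench s = new FalseSharingBench();
        final int n = 100_000_000;
        Thread t1 = new Thread(() -> { for (int i = 0; i < n; i++) s.a++; });
        Thread t2 = new Thread(() -> { for (int i = 0; i < n; i++) s.b++; });
        long start = System.nanoTime();
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.printf("took %d ms%n", (System.nanoTime() - start) / 1_000_000);
    }
}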

Updates to x need the synchronization, but does the acquisition of the lock clear the value of y also from the cache? I can't imagine that to be the case, because if it were true, techniques like lock striping might not help.
I'm not sure, but I think the answer may be "yes". Consider this:
class Foo {
    int x = 1;
    int y = 1;
    ..
    void bar() {
        synchronized (aLock) {
            x = x + 1;
        }
        y = y + 1;
    }
}
Now this code is unsafe, depending on what happens in the rest of the program. However, I think that the memory model means that the value of y seen by bar should not be older than the "real" value at the time the lock was acquired. That would imply the cache must be invalidated for y as well as for x.
Also can the JVM reliably analyze the code to ensure that y is not modified in another synchronized block using the same lock?
If the lock is this, this analysis looks like it would be feasible as a global optimization once all classes have been preloaded. (I'm not saying that it would be easy, or worthwhile ...)
In more general cases, the problem of proving that a given lock is only ever used in connection with a given "owning" instance is probably intractable.

We are Java developers, we only know virtual machines, not real machines!
Let me theorize about what is happening - but I must say I don't know what I'm talking about.
Say thread A is running on CPU A with cache A, and thread B is running on CPU B with cache B.
1. Thread A reads y; CPU A fetches y from main memory and saves the value in cache A.
2. Thread B assigns a new value to y. The VM doesn't have to update main memory at this point; as far as thread B is concerned, it can be reading/writing a local image of y; maybe y is nothing but a CPU register.
3. Thread B exits a sync block and releases a monitor. (When and where it entered the block doesn't matter.) Thread B has updated quite a few variables up to this point, including y. All those updates must be written to main memory now.
4. CPU B writes the new y value to the place of y in main memory. (I imagine that) almost INSTANTLY, the information "main-memory y is updated" is wired to cache A, and cache A invalidates its own copy of y. That must happen really FAST on the hardware.
5. Thread A acquires a monitor and enters a sync block. At this point it doesn't have to do anything regarding cache A: y is already gone from cache A. When thread A reads y again, it is fetched fresh from main memory with the new value assigned by B.
Consider another variable z, which was also cached by A in step (1) but is not updated by thread B in step (2). It can survive in cache A all the way to step (5); access to z is not slowed down because of the synchronization.
If the above statements make sense, then indeed the cost isn't very high.
Addition to step (5): thread A may have its own cache which is even faster than cache A - it can use a register for variable y, for example. That will not be invalidated by step (4); therefore in step (5), thread A must discard its own cached copies upon entering the sync block. That's not a huge penalty, though.

You might want to check the JDK 6.0 documentation:
http://java.sun.com/javase/6/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility
Memory Consistency Properties
Chapter 17 of the Java Language Specification defines the happens-before relation on memory operations such as reads and writes of shared variables. The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation. The synchronized and volatile constructs, as well as the Thread.start() and Thread.join() methods, can form happens-before relationships. In particular:
Each action in a thread happens-before every action in that thread that comes later in the program's order.
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
A write to a volatile field happens-before every subsequent read of that same field. Writes and reads of volatile fields have similar memory consistency effects as entering and exiting monitors, but do not entail mutual exclusion locking.
A call to start on a thread happens-before any action in the started thread.
All actions in a thread happen-before any other thread successfully returns from a join on that thread.
So, as stated in the highlighted point above: all the changes that happen before an unlock on a monitor are visible to all threads (within their own synchronized blocks) which subsequently lock on the same monitor. This is in accordance with Java's happens-before semantics.
Therefore, all changes made to y would also be made visible when some other thread acquires the monitor on aLock.

synchronized guarantees that only one thread can enter a block of code. But it doesn't guarantee that variable modifications done within the synchronized section will be visible to all other threads. Only threads that subsequently enter a synchronized block on the same lock are guaranteed to see the changes.
The memory effects of synchronization in Java can be compared with the problem of Double-Checked Locking with respect to C++ and Java:
Double-Checked Locking is widely cited and used as an efficient method for implementing lazy initialization in a multi-threaded environment. Unfortunately, it will not work reliably in a platform independent way when implemented in Java, without additional synchronization. When implemented in other languages, such as C++, it depends on the memory model of the processor, the re-orderings performed by the compiler and the interaction between the compiler and the synchronization library. Since none of these are specified in a language such as C++, little can be said about the situations in which it will work. Explicit memory barriers can be used to make it work in C++, but these barriers are not available in Java.
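For illustration, here is the usual shape of the idiom in Java 5 and later, where declaring the field volatile makes it safe (the volatile write/read pair establishes the required happens-before edge); the class names here are made up:

class Helper {
    // ... expensive-to-construct state ...
}

class LazyHolder {
    private volatile Helper helper;

    Helper getHelper() {
        Helper h = helper; // one volatile read on the fast path
        if (h == null) {
            synchronized (this) {
                h = helper; // re-check under the lock
                if (h == null) {
                    helper = h = new Helper();
                }
            }
        }
        return h;
    }
}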

Related

visibility guarantees of synchronized and volatile

I read this in an upvoted comment on StackOverflow:
But if you want to be safe, you can add a simple synchronized(this) {} at the end of your @PostConstruct [method]

[note that the variables were NOT volatile]
I was thinking that happens-before is forced only if both the write and the read are executed in synchronized blocks, or at least the read is volatile.
Is the quoted sentence correct? Does an empty synchronized(this) {} block flush all variables changed in the current method to "generally visible" memory?
Please consider some scenarios:
What if the second thread never synchronizes on this? (Suppose the second thread reads in other methods.) Remember that the question is about flushing changes to other threads, not about giving other threads a (synchronized) way to poll changes made by the original thread. Also, no synchronization in the other methods is very likely in a Spring @PostConstruct context, as the original comment says.
Is memory visibility of changes forced only in the second and subsequent calls by another thread? (Remember that this synchronized block is the last call in our method.) That would mark this way of synchronization as very bad practice (stale values in the first call).
Much of what's written about this on SO, including many of the answers/comments in this thread, is, sadly, wrong.
The key rule in the Java Memory Model that applies here is: an unlock operation on a given monitor happens-before a subsequent lock operation on that same monitor. If only one thread ever acquires the lock, it has no meaning. If the VM can prove that the lock object is thread-confined, it can elide any fences it might otherwise emit.
The quote you highlight assumes that releasing a lock acts as a full fence. And sometimes that might be true, but you can't count on it. So your skeptical questions are well-founded.
See Java Concurrency in Practice, Ch 16 for more on the Java Memory Model.
All writes that occur prior to a monitor exit are visible to all threads after a subsequent monitor enter on the same monitor.
A synchronized(this){} can be turned into bytecode like
monitorenter
monitorexit
So if you have a bunch of writes prior to the synchronized(this){} they would have occurred before the monitorexit.
This brings us to the next point of my first sentence.
visible to all threads after a subsequent monitor enter on the same monitor
So now, in order for a thread to ensure the writes occurred, it must execute the same synchronization, i.e. synchronized(this){}. This will issue at the very least a monitorenter and establish your happens-before ordering.
So to answer your question
Does an empty synchronized(this) {} block flush all variables changed in the current method to "generally visible" memory?
Yes, as long as you maintain the same synchronization when you want to read those non-volatile variables.
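To make that concrete, here is a minimal sketch (illustrative names). The happens-before edge exists only if the reader's lock of the monitor comes after the writer's unlock of the same monitor:

class Example {
    private int state; // deliberately not volatile
    private final Object monitor = new Object();

    void writer() {
        state = 42;                // plain write...
        synchronized (monitor) { } // ...published by this unlock
    }

    void reader() {
        synchronized (monitor) { } // if this lock comes after the writer's
                                   // unlock, the write to state is visible
        System.out.println(state); // otherwise it may print a stale 0
    }
}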
To address your other questions
What if the second thread never synchronizes on this? (Suppose the second thread reads in other methods.) Remember that the question is about flushing changes to other threads, not about giving other threads a (synchronized) way to poll changes made by the original thread. Also, no synchronization in the other methods is very likely in a Spring @PostConstruct context.
Well, in this case, using synchronized(this) without any other context is relatively useless: there is no happens-before relationship, and in theory it is just as useful as not including it.
Is memory visibility of changes forced only in the second and subsequent calls by another thread? (Remember that this synchronized block is the last call in our method.) That would mark this way of synchronization as very bad practice (stale values in the first call).
Memory visibility is forced by the first thread calling synchronized(this), in that it will write directly to memory. Now, this doesn't necessarily mean each thread needs to read directly from memory; they can still read from their own processor caches. Having a thread call synchronized(this) ensures it pulls the values of the fields from memory and retrieves the most up-to-date values.

What's the relationship/difference between Visibility and Ordering?

So far as I know, visibility deals with the conditions under which a thread can observe/see an update to shared variable(s) made by another thread.
Even a single-processor system can suffer from visibility issues.
Ordering, meanwhile, deals with the sequence in which a thread sees memory operations performed by another thread running on another CPU.
A single-processor system does NOT suffer from ordering issues.
But I feel that sometimes the so-called ordering issue can be interpreted through the concept of visibility, e.g.:
// Thread1 runs
int data;
boolean ready;

void method1() {
    data = 1;
    ready = true;
}

// Thread2 runs
void method2() {
    if (ready) {
        System.out.print(data);
    }
}
If the output of the above program was "0" (rather than "1"), we could say there was an ordering issue (i.e. reordering): the write to ready appeared to occur before the write to data.
However, I think we can also interpret this output as a result of visibility: Thread2 first saw the update to ready by Thread1, and only later the update to data, possibly due to a store-buffer flush to the CPU cache; if the print(data) was executed before Thread2 saw the update to data, we would get the output "0".
Taking this into account, I just wonder: what is the difference/relationship between visibility and ordering?
Yes, ordering and visibility are related issues.
Visibility is about whether/when one thread sees the results of memory writes performed by another thread.
Ordering is about whether the order in which updates are seen by the second thread matches the (program) order in which the first thread wrote them.
The Java memory model doesn't directly address the ordering of writes. Any constraints on order are (as you hypothesize) a consequence of the visibility rules: these are specified in the JLS. This includes the case where you use volatile variables to effectively inhibit reordering.
It is also worth noting that the reordering of writes (from the perspective of a second thread) can happen whenever the JLS memory model does not require visibility. Correct synchronization will guarantee visibility at the point of synchronization.
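For completeness, the standard fix for the data/ready example above is to make ready volatile; the volatile write/read pair then provides both the visibility and the ordering guarantee (safe publication). A sketch, with illustrative names:

class Publication {
    int data;               // plain field
    volatile boolean ready; // the happens-before edge hangs off this field

    void method1() { // Thread1
        data = 1;    // ordered before the volatile write below
        ready = true;
    }

    void method2() {                // Thread2
        if (ready) {                // volatile read
            System.out.print(data); // must print 1, never 0
        }
    }
}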

Do I need to put flags into synchronized blocks?

I had code like this:

if (!flag) {
    synchronized (lock) {
        lock.wait(1000);
    }
}
if (!flag) { print("Error flag not set!"); }
And:
void f() {
    flag = true;
    synchronized (lock) {
        lock.notify();
    }
}
A friend of mine told me I should put flag = true inside the synchronized block:
synchronized (lock) {
    flag = true;
    lock.notify();
}
I do not understand why. Is it some classic example? Could someone, please, explain?
If I declare my flag volatile, do I then not need to put it into the synchronized block?
As the flag variable is used by multiple threads, some mechanism to ensure visibility of changes must be used. This is indeed a common pattern in multithreading in general. The Java memory model does not otherwise guarantee that the other thread will ever see the new value of flag.
This is to allow optimizations employed by modern multiprocessor systems, where maintaining cache coherency at all times may be too costly. Memory access is usually orders of magnitude slower than other "usual" CPU operations, so modern processors go to really great lengths to avoid it as much as possible. Instead, frequently accessed locations are kept in small, fast, local processor memory: a cache. Changes are only made to the cache and flushed to main memory at certain points. This works fine for one processor, as the memory content is not being changed by other parties, so we are guaranteed that the cache content reflects the memory content. (Well, that's an oversimplification, but from a high-level programming point of view an irrelevant one, I believe.) The problem is that as soon as we add another processor independently changing the memory contents, this guarantee is lost. To mitigate this problem, various (sometimes elaborate - see e.g. here) cache-coherency protocols were devised. Unsurprisingly, they require some bookkeeping and inter-processor communication overhead, though.
Another, somewhat related issue is the atomicity of write operations. Basically, even if a change is seen by other threads, it may be seen partially. This is not usually much of a problem in Java, as the language specification guarantees atomicity of all writes. Still, writes to 64-bit primitives (long and double) are explicitly said to be treated as two separate, 32-bit writes:
For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write. (JLS 17.7)
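Where tearing of a long or double could matter, the standard remedy is to declare the field volatile, which guarantees that reads and writes of the 64-bit value are atomic. A one-line sketch:

class Sample {
    volatile long value; // the write is atomic; note that value++ is still
                         // a non-atomic read-modify-write sequence
}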
Back to the code in question... synchronization is required, and the synchronized block satisfies the need. Still, I find making flags like that volatile a more pleasant solution. The net effect is the same - a visibility guarantee and atomic writes - but it doesn't clutter the code with small synchronized blocks.
Main memory is slow. Really slow. The internal caches in your CPU today are orders of magnitude faster. For this reason, modern code tries to keep as much data as possible in the CPU's cache.
One of the reasons why main memory is so slow is that it's shared: when you update main memory, all CPU cores have to be notified of the change. Caches, on the other hand, are per core. That means that when thread A updates the flag, it may just update its own cache; other threads might or might not see the change.
There are two ways to ensure the flag is written to main memory:
Put it in a synchronized block
Declare it volatile
volatile has the advantage that every access to the flag is then guaranteed to see the most recent value written to it. Use this when you use the flag in many places.
In your case, you already have the synchronized block. But the first if might be reading a stale value (i.e. the thread might wait() even though the flag is already true), so you still need volatile.
If you're checking and modifying the flag from different threads, it needs to be at least declared volatile for the threads to see the changes.
Putting the checks in synchronized blocks would work too.
And yes, it is a very basic thing in concurrency, so you should make sure you read up on the memory model, happens-before and other related subjects.
First of all: lock.wait(1000) will return after a second even if the other thread did not call notify().
Secondly: your friend is right. In this case you have shared data accessed by different threads, so access to it should be guarded with a lock, as in your code.
Thirdly: mark your flag variable as volatile so that different threads are guaranteed to see the last written value.
And finally: I would also put the if (!flag) check in a synchronized block - it also accesses the flag variable.
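Putting those suggestions together, a sketch of the conventional guarded-wait form of the original code (with a loop around wait() to handle spurious wakeups; names are illustrative):

private final Object lock = new Object();
private boolean flag; // no volatile needed: only accessed while holding lock

void await() throws InterruptedException {
    synchronized (lock) {
        long deadline = System.currentTimeMillis() + 1000;
        while (!flag) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break; // timed out
            }
            lock.wait(remaining); // releases lock while waiting
        }
        if (!flag) {
            System.out.println("Error flag not set!");
        }
    }
}

void f() {
    synchronized (lock) {
        flag = true;
        lock.notify(); // wake the waiter while holding the same lock
    }
}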

Simultaneous reading and changing the variable by different threads

I am interested in the situation where one thread is waiting for change of a variable in the while loop:
while (myFlag == false) {
    // do smth
}
The loop can repeat indefinitely.
In the meantime, another thread has changed the value of this variable:
myFlag = true;
Can the reader-thread see the result of changing the value of the variable in the other thread if this variable is NOT volatile? In general, as I understand it, this will never happen. Or am I wrong? Then when, and under what circumstances, can the first thread see the change in the variable and exit the loop? Is this possible without using the volatile keyword? Does the size of the processor's cache play a role in this situation?
Please explain and help me understand! Thank you in advance!!
Can the reader-thread see the result of changing the value of the variable in the other thread if this variable is NOT volatile?
It may be able to, yes. It's just that it won't definitely see the change.
In general, as I understand it will never happen.
No, that's not the case.
You're writing to a variable and then reading from it in a different thread. Whether or not you see it will depend on the exact processor and memory architecture involved. Without any memory barriers involved, you aren't guaranteed to see the new value - but you're certainly not guaranteed not to see it either.
Can the reader-thread see the result of changing the value of the variable in the other thread if this variable is NOT volatile?
I'd like to expand a bit on @Jon's excellent answer.
The Java memory model says that all memory in a particular thread will be updated when it crosses any memory barrier. Read barriers cause all memory cached by a particular thread to be updated from central memory, and write barriers cause local thread changes to be written out to central memory.
So if the thread that writes your flag also writes to another volatile field or enters a synchronized block, that will cause your flag to be updated in central memory. If the reading thread reads from another volatile field or enters a synchronized block in the // do smth section after the update has happened, it will see the update. You just can't rely on when this will happen, or on whether the write and read happen in the appropriate order. If your threads have no other memory synchronization points, it may never happen.
Edit:
Given the discussion below, which I've had a couple of times now in various different questions, I thought I might expand on my answer. There is a big difference between the guarantees provided by the Java language and its memory model and the reality of JVM implementations. The JLS and JMM define memory barriers and talk about "happens-before" guarantees only between volatile reads and writes on the same field and synchronized locks on the same object.
However, on all architectures that I've heard of, the implementation of the memory barriers that enforce memory synchronization is not field- or object-specific. When a volatile field is read and the read barrier is crossed on a specific thread, the thread's view is updated from all of central memory, not just the particular volatile field in question. The same goes for volatile writes: after a write is made to a volatile field, all updates from the local thread are written to central memory, not just the field. What the JLS does guarantee is that instructions cannot be reordered past the volatile access.
So, if thread-A has written to a volatile field, then all updates, even those not marked as volatile, will have been written to central memory. After this operation has completed, if thread-B then reads from a different volatile field, it will see all of thread-A's updates, even those not marked as volatile. Again, there are no guarantees around the timing of these events, but if they happen in that order then the two threads will be updated.

Thread safety within Java

So, while working on something that was having locking issues, a question came to me. Do objects that can only be accessed from a single thread require locks or synchronization at all?
For example, given Thread1, Thread2, and Thread3, along with Buffer1, Buffer2, and Buffer3, where each buffer is instantiated as its thread is created: Thread1 will only ever access Buffer1, and the same goes for Thread2 and Buffer2, and for Thread3 and Buffer3. Thread1 will never touch Buffer2 or Buffer3. While adding/removing/modifying bytes in the stream, are locks needed?
No, you won't need any locks in this case. Locking and synchronization are only required when a resource is shared between multiple threads.
If you go ahead and add synchronization on the private instance of that buffer, it still won't make any difference, as there will be no thread waiting to acquire the lock; the only one locking and releasing the buffer will be the owner thread.
1. When more than one thread tries to access an object, locking becomes necessary.
2. Moreover, classes should be developed to be thread safe if concurrent access by threads is possible.
3. A class is said to be thread safe if it behaves correctly in the presence of interleaving and scheduling by the underlying OS, without any synchronization mechanism from the client.
4. Locking resources can cause overhead, prevents concurrent access, and can create bottleneck situations.
Only when two or more threads need to access a shared object you need to worry about locking.
No. This strategy for ensuring thread safety is generally referred to as confinement.
Confinement relies on encapsulation techniques to ensure that multiple threads cannot access an object. "Concurrent Programming in Java" by Doug Lea has a good chapter on the details of confinement and its strengths and weaknesses compared to other exclusion techniques.
Paraphrasing from Lea, in general there are four conditions needed for confinement of a reference r, to an object x, within a method m (a small sketch follows the list):
m cannot pass r as an argument to another method.
m cannot pass r as a return value.
m cannot record r in a field (instance or static) that is accessible from another thread.
m cannot let any other references escape (via 1-3) that may be traversed to reach r.
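A minimal sketch of confinement in the spirit of these rules (names are illustrative): the buffer is created inside the thread and never escapes it, so no locking is needed.

public class ConfinedBufferDemo {
    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            new Thread(() -> {
                // Created inside run() and never handed out: confined.
                StringBuilder buffer = new StringBuilder();
                for (int j = 0; j < 5; j++) {
                    buffer.append(j); // unsynchronized, but safe
                }
                // Printing publishes only an immutable String snapshot.
                System.out.println(Thread.currentThread().getName()
                        + " -> " + buffer);
            }).start();
        }
    }
}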
From what I remember from my studies, if you are using a private buffer for every thread, you should not worry about locking to avoid concurrent access, since you don't have any.
If no one reads the buffer apart from its creator, the creator can do whatever it wants with it without worrying that someone else is reading or writing it, so you should be fine.
But you have to remember that a thread can be interrupted at any time, so your internal buffer can be in an inconsistent state. (This shouldn't be a problem here, since you are accessing it only sequentially from the same thread.)
Locks are not needed unless threads are concurrently using the same data structure.
Hence if different data structures are used by each thread, your code is guaranteed to be thread safe.
Incidentally, this is one of the main reasons why key Java collection classes like java.util.ArrayList are not thread safe: making them thread safe would add a performance overhead which you shouldn't have to pay for if you don't need it, and in a lot of cases you don't need it, because you can ensure in some other way that only one thread accesses the ArrayList at once.
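A small sketch of the choice this implies (standard java.util API; the names are illustrative): keep a plain ArrayList when it is confined to one thread, and wrap it only when it is actually shared.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListSharingDemo {
    public static void main(String[] args) {
        // Confined to this thread: a plain ArrayList is safe and fast.
        List<String> confined = new ArrayList<>();
        confined.add("no lock needed");

        // Shared between threads: wrap it so each call is synchronized.
        // (Compound actions, e.g. iterate-then-modify, still need
        // external locking on the wrapper.)
        List<String> shared = Collections.synchronizedList(new ArrayList<>());
        shared.add("each call is synchronized");
    }
}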
