Why using "volatile" does not show any difference here? - java

I am learning the usage of volatile in Java. Here is a sample code I read from many articles:
static volatile boolean shutdownRequested = false;
...
public void shutdown() { shutdownRequested = true; }
public void doWork() {
while (!shutdownRequested) {
// do stuff
}
}
I tried this on my machine with and without "volatile", but I see no difference: both versions shut down.
So what's wrong? Is there anything wrong with my code, or does it depend on the version of the Java compiler?
Addition: many articles say this program will not shut down successfully without "volatile", because the loop while (!shutdownRequested) will be optimized to while(true) by the Java compiler if the value of the variable shutdownRequested is not changed inside the loop. But the result of my experiment does not support that.

I assume you mean you have a setup something like this:
final Worker theWorker = new Worker(); // the object you show code for
new Thread(new Runnable() {
public void run() {
theWorker.doWork();
}
}).start();
try {
Thread.sleep(1000L);
} catch(InterruptedException ie) {}
theWorker.shutdown();
And what you found is that the shutdown works even without volatile.
That is typical: a non-volatile write is usually seen by other threads eventually. The important thing is that nothing guarantees this, so you can't rely on it. In practice you may also find there is a small but noticeable delay without volatile.
Volatile guarantees that once a write has happened, every later read of that variable, from any thread, sees it.
Here's some code that might reproduce the HotSpot optimization we discussed in the comments:
public class HotSpotTest {
static long count;
static boolean shouldContinue = true;
public static void main(String[] args) {
Thread t = new Thread(new Runnable() {
public void run() {
while(shouldContinue) {
count++;
}
}
});
t.start();
do {
try {
Thread.sleep(1000L);
} catch(InterruptedException ie) {}
} while(count < 999999L);
shouldContinue = false;
System.out.println(
"stopping at " + count + " iterations"
);
try {
t.join();
} catch(InterruptedException ie) {}
}
}
Here's a quick review if you don't know what HotSpot is: HotSpot is the standard Oracle/OpenJDK JVM, and the name is commonly used for its just-in-time compiler. After some fragment of code has run a certain number of times (the classic -XX:CompileThreshold defaults are 1500 for the client compiler and 10000 for the server compiler; tiered compilation uses different thresholds), HotSpot takes the Java bytecode, optimizes it, and compiles it to native machine code. HotSpot is one of the reasons Java is so fast. In my experience, code compiled by HotSpot can easily be 10x faster than interpreted code. HotSpot is also much more aggressive about optimization than a regular Java compiler (like javac or the ones made by IDE vendors).
So what I found is the join just hangs forever if you let the loop run long enough first. Note that count is not volatile by design. Making count volatile seems to foil the optimization.
From the perspective of the Java memory model it makes sense that as long as there is absolutely no memory synchronization HotSpot is allowed to do this. HotSpot knows there's no reason the update needs to be seen so it doesn't bother checking.
I didn't print the HotSpot assembly since that requires some JDK software I don't have installed but I'm sure if you did, you'd find the same thing the link you provided recalls. HotSpot does indeed seem to optimize while(shouldContinue) to while(true). Running the program with the -Xint option to turn HotSpot off results in the update being seen as well which also points to HotSpot as the culprit.
So, again, it just goes to show you can't rely on a non-volatile read.
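For comparison, here is a minimal sketch of the same experiment with the flag declared volatile. The names VolatileStopSketch and runAndStop are made up for illustration, and the worker is marked as a daemon thread only so that a failed run cannot keep the JVM alive. With volatile the worker has to observe the write, so the join should return.

```java
class VolatileStopSketch {
    // volatile: the worker must re-read the flag on every iteration
    static volatile boolean shouldContinue = true;
    // plain field: written once by the worker before it exits, read by main after join()
    static long count;

    // Lets the worker spin for spinMillis, asks it to stop, and reports
    // whether it actually terminated within joinTimeoutMillis.
    static boolean runAndStop(long spinMillis, long joinTimeoutMillis) {
        shouldContinue = true;
        Thread t = new Thread(() -> {
            long local = 0;
            while (shouldContinue) {
                local++;
            }
            count = local;
        });
        t.setDaemon(true); // assumption for the sketch: never block JVM exit
        t.start();
        try {
            Thread.sleep(spinMillis);
            shouldContinue = false;     // volatile write
            t.join(joinTimeoutMillis);  // wait for the worker to notice
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        boolean stopped = runAndStop(1000L, 5000L);
        System.out.println("worker stopped: " + stopped + ", iterations: " + count);
    }
}
```

The counter is kept in a local variable and published once at the end, so the only cross-thread communication while the loop runs is the volatile flag.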

Volatile is for threading. It basically tells the threads that the variable can change at any time, so whenever a thread wants the variable it can't rely on a cached copy: it must re-read it, and write the update back after changing it.

Volatile in many senses exists because of the local caching that a processor can do on a per-thread basis.
For example, let's say we have a processor with 4 threads running your Java program (a massively simplified example, since a processor would do WAY more than just this). Let's also assume that each of those 4 threads has access to a local cache (not to be confused with a main processor cache). So, if you just made that variable static, and all 4 threads were reading from that variable, they could all potentially put that variable in their local cache. Alright, great, access time is improved, and everything is faster. So, at the moment we have the following situation:
Thread 1: has a local copy of the variable
Thread 2: has a local copy of the variable
Thread 3: has a local copy of the variable
Thread 4: has a local copy of the variable
Alright, now, let's say that Thread 1 goes in and changes the ACTUAL variable, not just the copy. Thread 1 knows about the change immediately, but threads 2-4 could still be working on the value of the old, cached version of the variable since they haven't checked for any updates yet.
Now, to fix this type of situation, you can attach the 'volatile' keyword to the variable, which essentially tells it to broadcast its new value to all of the threads in the program IMMEDIATELY so that any operations on all of the threads will have the exact same value. Of course, this does incur some overhead, so something that is volatile will be a touch slower if it is modified often. Since your program is not multi-threaded (that I can tell) you'll see little to no difference using volatile or not. It's simply a trivial (and pointless) extra 'command' for the variable in single-threaded environments.

Related

Trying to understand shared variables in java threads

I have the following code :
class thread_creation extends Thread{
int t;
thread_creation(int x){
t=x;
}
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t++;
System.out.println(t);
}
}
}
public class test {
public static void main(String[] args) {
int i =0;
thread_creation t1 = new thread_creation(i);
thread_creation t2 = new thread_creation(i);
t1.start();
try {
Thread.sleep(500);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
t2.start();
}
}
When I run it , I get :
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
Why am I getting this output? According to my understanding, the variable i is a shared variable between the two threads created. So according to the code, the first thread will execute and increment i 10 times, and hence i will be equal to 10. The second thread will start after the first one because of the sleep statement, and since i is shared, the second thread will start with i=10 and will increment it 10 times to reach i = 20. But this is not the case in the output, so why is that?
You seem to think that int t; in thread_creation is a shared variable. I'm afraid you are mistaken. Each thread_creation instance has its own t variable. So the two threads are updating distinct counters.
The output you are seeing reflects that.
This is the nub of your question:
How do I pass a shared variable then ?
Actually, you can't [1]. Strictly, a shared variable is actually a variable belonging to a shared object. You cannot pass a variable per se. Java does not allow passing of variables. This is what "Java does not support call-by-reference" really means. You can't pass or return a variable or the address of a variable in any method call. (Or in any other way.)
In Java you pass and return values: either primitives, or references to objects. The values may be read from a variable by the call's parameter expression or assigned to a variable after the call's return. But you are not passing the variable. A variable and its value / contents are different things.
So the only way to implement a shared counter is to implement it as a shared counter object.
Note that "variable" and "object" mean different things, both in Java and in other programming languages. You should NOT use the two terms interchangeably. For example, when I declare this in Java:
String s = "Hello";
the s variable is not a String object. It is a variable that contains a reference to the String object. Other variables may contain references to the same String object as well. The distinction is even more stark when the objects are mutable. (String is not mutable ... in Java.)
Here are the two (IMO) best ways to implement a shared counter object.
You could create a custom Java Counter class with a count variable, a get method, and methods for incrementing and decrementing the counter. The class needs to implement these methods as thread-safe and atomic; e.g. by using synchronized methods or blocks [2].
You could just use an AtomicInteger instance. That takes care of atomicity and thread-safety ... to the extent that it is possible with this kind of API.
The latter approach is simpler and likely more efficient ... unless you need to do something special each time the counter changes.
(It is conceivable that you could implement a shared counter other ways, but that is too much detail for this answer.)
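A minimal sketch of both approaches follows. The Counter class, the helper names, and the use of join() to wait for the threads are my own choices for illustration, not something from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounterSketch {
    // Approach 1: a hand-written counter object. synchronized makes each
    // method atomic and gives the happens-before edges between threads.
    static final class Counter {
        private int count;
        synchronized void increment() { count++; }
        synchronized int get() { return count; }
    }

    // Two threads increment the SAME Counter object perThread times each.
    static int runWithCounter(int perThread) {
        Counter shared = new Counter();
        runInTwoThreads(() -> {
            for (int i = 0; i < perThread; i++) {
                shared.increment();
            }
        });
        return shared.get();
    }

    // Approach 2: the same experiment with an AtomicInteger.
    static int runWithAtomic(int perThread) {
        AtomicInteger shared = new AtomicInteger();
        runInTwoThreads(() -> {
            for (int i = 0; i < perThread; i++) {
                shared.incrementAndGet();
            }
        });
        return shared.get();
    }

    // Starts two threads running the same task and waits for both to finish.
    private static void runInTwoThreads(Runnable task) {
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithCounter(10)); // prints 20
        System.out.println(runWithAtomic(10));  // prints 20
    }
}
```

In both cases what the threads share is a reference to one object; the counter state lives inside that object rather than in a variable that was "passed".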
1 - I realize that I just said the same thing more than 3 times. But as the Bellman says in "The Hunting of the Snark": "What I tell you three times is true."
2 - If the counter is not implemented using synchronized or an equivalent mutual exclusion mechanism with the appropriate happens before semantics, you are liable to see Heisenbugs; e.g. race conditions and memory visibility problems.
Two crucial things you're missing. Both individually explain this behaviour - you can 'fix' either one and you'll still see this, you'd have to fix both to see 1-20:
Java is pass-by-value
When you pass i, you pass a copy of it. In fact, in java, all parameters to methods are always copies. Hence, when the thread does t++, it has absolutely no effect whatsoever on your i. You can trivially test this, and you don't need to mess with threads to see it:
public static void main(String[] args) {
int i = 0;
add5(i);
System.out.println(i); // prints 0!!
}
static void add5(int i) {
i = i + 5;
}
Note that all non-primitives are references. That means: A copy of the reference is passed. It's like passing the address of a house and not the house itself. If I have an address book, and I hand you a scanned copy of a page that contains the address to my summer home, you can still drive over there and toss a brick through the window, and I'll 'see' that when I go follow my copy of the address. So, when you pass e.g. a list and the method you passed the list to runs list.add("foo"), you DO see that. You may think: AHA! That means java does not pass a copy, it passed the real list! Not so. Java passed a copy of a street address (A reference). The method I handed that copy to decided to drive over there and act - that you can see.
In other words, =, ++, that sort of thing? That is done to the copy. . is java for 'drive to the address and enter the house'. Anything you 'do' with . is visible to the caller, = and ++ and such are not.
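The address-book analogy can be sketched in a few lines. The method names addFoo and replace are hypothetical; the point is only the difference between using . on the copied reference and assigning to it.

```java
import java.util.ArrayList;
import java.util.List;

class PassByValueSketch {
    // "Drives to the address": mutates the object the copied reference
    // points to, so the caller sees the change.
    static void addFoo(List<String> list) {
        list.add("foo");
    }

    // Overwrites the local copy of the reference. The caller's variable
    // still points to the original list and never sees "bar".
    static void replace(List<String> list) {
        list = new ArrayList<>();
        list.add("bar");
    }

    public static void main(String[] args) {
        List<String> mine = new ArrayList<>();
        addFoo(mine);
        replace(mine);
        System.out.println(mine); // prints [foo]
    }
}
```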
Fixing the code to avoid the pass-by-value problem
Change your code to:
class thread_creation extends Thread {
static int t; // now it's shared by all instances!
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t++;
// System.out.println(t);
}
}
}
public class test {
public static void main(String[] args) throws Exception {
thread_creation t1 = new thread_creation();
thread_creation t2 = new thread_creation();
t1.start();
Thread.sleep(500);
t2.start();
Thread.sleep(500);
System.out.println(thread_creation.t);
}
}
Note that I commented out the print line. I did that intentionally - see below. If you run the above code, you'd expect to see 20, but depending on your hardware, the OS, the song playing on your mp3 playing app, which websites you have open, and the phase of the moon, it may be less than 20. So what's going on there? Enter the...
The evil coin.
The relevant spec here is the JMM (The Java Memory Model). This spec explains precisely what a JVM must do, and therefore, what a JVM is free not to do, especially when it comes to how memory is actually managed.
The crucial aspect is the following:
Any effects (updates to fields, such as that t field) may or may not be observable, JVM's choice. There's no guarantee that anything you do is visible to anything else... unless there exists a Happens-Before/Happens-After relationship: Any 2 statements with such a relationship have the property that the JVM guarantees that you cannot observe the lack of the update done by the HB line from the HA line.
HB/HA can be established in various ways:
The 'natural' way: Anything that is 'before' something else and runs in the same thread has an HB/HA relationship. In other words, if you do in one thread x++; System.out.println(x); then you can't observe that the x++ hasn't happened yet. It's stated like this so that if you're not observing, you get no guarantees, which gives the JVM the freedom to optimize. For example, given x++;y++; and that's all you do, the JVM is free to re-order that and increment y before x. Or not. There are no guarantees, a JVM can do whatever it wants.
synchronized. The moment of 'exiting' a synchronized (x) {} block has HB to the HA of another thread 'entering' the top of any synchronized block on the same object, if it enters later.
volatile - but note that with volatile it's basically impossible to know which access came first. But one of them did, and a write to a volatile field is HB relative to another thread reading the same field later.
thread starting. thread.start() is HB relative to the first line of the run() of that thread.
thread joining. thread.join() returning is HA relative to the last line of that thread. (Thread.yield(), by contrast, has no synchronization semantics at all.)
There are a few more exotic ways to establish HB/HA but that's pretty much it.
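As a small illustration of the thread start and thread join edges, here is a sketch that passes data through plain (non-volatile) fields and is still correct, because every cross-thread read is ordered by start() or join(). The class and field names are invented for the example.

```java
class StartJoinSketch {
    static int input;  // plain field, written before start()
    static int result; // plain field, written by the worker, read after join()

    static int compute(int value) {
        input = value; // HB relative to everything the worker does, via start()
        Thread worker = new Thread(() -> result = input * 2);
        worker.start();
        try {
            worker.join(); // everything the worker did is HB relative to join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(compute(21)); // prints 42
    }
}
```

Remove the join() and the read of result has no HB/HA relationship with the worker's write any more, so the coin gets flipped.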
Crucially, in your code there is no HB/HA between any of the statements that modify or print t!
In other words, the JVM is free to run it all in such a way that the effects of various t++ statements run by one thread aren't observed by another thread.
What the.. WHY????
Because of efficiency. Your memory banks on your CPU are, relative to how fast CPUs are, oceans away from the CPU core. Fetching or writing to core memory from a CPU takes an incredibly long time - your CPU is twiddling its thumbs for a very long time while it waits for the memory controller to get the job done. It could be running hundreds of instructions in that time.
So, CPU cores do not write to memory AT ALL. Instead they work with caches: They have an on-core cache page, and the only interaction with your main memory banks (which are shared by CPU cores) is 'load in an entire cache page' and 'write an entire cache page'. That cache page is then effectively a 'local copy' that only that core can see and interact with (but can do so very very quickly, as that IS very close to the core, unlike the main memory banks), and then once the algorithm is done it can flush that page back to main memory.
The JVM needs to be free to use this. Had the JVM actually worked like you want (that anything any thread does is instantly observable by all others), then anything that any line does must first wait 500 cycles to load the relevant page, then wait another 500 cycles to write it back. All java apps would literally be 1000x slower than they could be.
This in passing also explains that actual synchronizing is really slow. Nothing java can do about that, it is a fundamental limitation of our modern multi-core CPUs.
So, evil coin?
Note that the JVM does not guarantee that the CPU must necessarily work with this cache stuff, nor does it make any promises about when cache pages are flushed. It merely limits the guarantees so that JVMs can be efficiently written on CPUs that work like that.
That means that any read or write to any field any java code ever does can best be thought of as follows:
The JVM first flips a coin. On heads, it uses a local cached copy. On tails, it copies over the value from some other thread's cached copy instead.
The coin is evil: It is not reliably a 50/50 arrangement. It is entirely plausible that throughout developing a feature and testing it, the coin lands tails every time it is flipped. It remains flipping tails 100% of the time for the first week that you deployed it. And then just when that big potential customer comes in and you're demoing your app, the coin, being an evil, evil coin, starts flipping heads a few times and breaking your app.
The correct conclusion is that the coin will mess with you and that you cannot unit test against it. The only way to win the game is to ensure that the coin is never flipped.
You do this by never touching a field from multiple threads unless it is constant (final, or simply never changes), or if all access to it (both reads and writes) has clearly established HB/HA between all threads.
This is hard to do. That's why the vast majority of apps don't do it at all. Instead, they:
Talk between threads using a database, which has vastly more advanced synchronization primitives: Transactions.
Talk using a message bus such as RabbitMQ or similar.
Use stuff from the java.util.concurrent package such as CountDownLatch, ForkJoinPool, ConcurrentMap, or AtomicInteger. These are easier to use (specifically: with these abstractions it is a lot harder, though of course not impossible, to write buggy code whose bug cannot be observed or tested for on the developer's machine and only blows up much later in production).
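To give a flavour of the java.util.concurrent option, here is a rough sketch that combines a ConcurrentHashMap with a CountDownLatch. The scenario (counting hits for a key called "page") and all names are assumptions made up for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;

class ConcurrentUtilitiesSketch {
    // Several threads bump the same map entry; the latch tells the caller
    // when all of them are done.
    static int countHits(int threads, int hitsPerThread) {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(threads);
        for (int n = 0; n < threads; n++) {
            new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    // atomic read-modify-write on ConcurrentHashMap
                    hits.merge("page", 1, Integer::sum);
                }
                done.countDown();
            }).start();
        }
        try {
            // actions before countDown() are HB relative to await() returning
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return hits.getOrDefault("page", 0);
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 1000)); // prints 4000
    }
}
```

No field is touched from two threads without one of these classes in between, so there is no place left for the coin to be flipped.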
Let's fix it!
volatile doesn't 'fix' ++. x++; is 'read x, increment by 1, write result to x' and volatile doesn't make that atomic, so we cannot use this. We can either replace t++ with:
synchronized(thread_creation.class) {
t++;
}
Which works fine but is really slow (and you shouldn't lock on publicly visible stuff if you can help it, so make a custom object to lock on, but you get the gist hopefully), or, better, dig into that j.u.c package for something that seems useful. And so there is! AtomicInteger!
import java.util.concurrent.atomic.AtomicInteger;
class thread_creation extends Thread {
static AtomicInteger t = new AtomicInteger();
public void run() {
increment();
}
public void increment() {
for(int i =0 ; i<10 ; i++) {
t.incrementAndGet();
}
}
}
public class test {
public static void main(String[] args) throws Exception {
thread_creation t1 = new thread_creation();
thread_creation t2 = new thread_creation();
t1.start();
Thread.sleep(500);
t2.start();
Thread.sleep(500);
System.out.println(thread_creation.t.get());
}
}
That code will print 20. Every time (unless those threads take longer than 500msec which technically could be, but is rather unlikely of course).
Why did you comment out the print statement?
That HB/HA stuff can sneak up on you: When you call code you did not write, such as System.out.println, who knows what kind of HB/HA relationships are in that code? Javadoc isn't that kind of specific, they won't tell you. Turns out that on most OSes and JVM implementations, interaction with standard out, such as System.out.println, causes synchronization; either the JVM does it, or the OS does. Thus, introducing print statements 'to test stuff' doesn't work - that makes it impossible to observe the race conditions your code does have. Similarly, involving debuggers is a great way to make that coin really go evil on you and flip juuust so that you can't tell your code is buggy.
That is why I commented it out, because with it in, I bet on almost all hardware you end up seeing 20 even though the JVM doesn't guarantee it and that first version is broken, even if on your particular machine, on this day, with this phase of the moon, it seems to reliably print 20 every single time you run it.

Why is this code not going into an infinite loop as suggested by JSR133?

In JSR-133 section 3.1, which discusses the visibility of actions between threads - it is mentioned that the code example below, which does not utilise the volatile keyword for the boolean field, can become an infinite loop if two threads are running it. Here is the code from the JSR:
class LoopMayNeverEnd {
boolean done = false;
void work() {
while (!done) {
// do work
}
}
void stopWork() {
done = true;
}
}
Here is a quote of the important bit in that section that I'm interested in:
... Now imagine that two threads are created, and that one
thread calls work(), and at some point, the other thread calls stopWork(). Because there is
no happens-before relationship between the two threads, the thread in the loop may never
see the update to done performed by the other thread ...
And here is my own Java code I wrote just so I can see it loop:
public class VolatileTest {
private boolean done = false;
public static void main(String[] args) {
VolatileTest volatileTest = new VolatileTest();
volatileTest.runTest();
}
private void runTest() {
Thread t1 = new Thread(() -> work());
Thread t2 = new Thread(() -> stopWork());
t1.start();
t2.start();
}
private void stopWork() {
done = true;
System.out.println("stopped work");
}
private void work() {
while(!done){
System.out.println("started work");
}
}
}
Although the results from consecutive executions are different - as expected - I don't see it ever going into an infinite loop. I'm trying to understand how I can simulate the infinite loop that the documentation suggests, what am I missing? How does declaring the boolean volatile, remove the infinite loop?
The actual behavior is OS and JVM specific. For example, by default, Java runs in client mode on 32-bit Windows and in server mode on the Mac. In client mode the work method will terminate, but will not terminate in server mode.
This happens because of an optimization in the Java server JIT compiler. The JIT compiler may optimize the while loop, because it does not see the variable done changing within the context of the thread. Another reason for the infinite loop might be that one thread ends up reading the value of the flag from its registers or cache instead of going to memory. As a result, it may never see the change made by another thread to this flag.
Essentially, by adding volatile you stop the thread that reads the done flag from caching it. Thus, the boolean value is stored in common memory, which guarantees visibility. Also, by using volatile you disable the JIT optimization that can inline the flag value.
Basically if you want to reproduce infinite loop - just run your program in server mode:
java -server VolatileTest
By default, every Java field is implicitly non-volatile, which allows the JIT compiler to "hoist" reads of such fields out of loops so that they are only read 'once'. This is allowed once a trace of the execution paths can safely establish that the methods called inside such a loop never cause entry back into the class's methods where it might mutate these non-volatile values.
The System.out.println() invocation goes to native code which keeps the JIT from resolving that 'done' is never modified. Thus the hoist does not happen when the System.out.println() is there and as you found out, the infinite loop is only happening with it removed where the JIT can resolve that there is no write to 'done'.
The ultimate problem is that this reference hoisting is conditional on "reachability" of a mutation of the value. Thus, you may have moments where there is no reach to a mutation of the value, during development, and thus the hoist happens and suddenly you can't exit the loop. A later change to the loop might use some function that makes it impossible to discern that the value cannot be written by the logic in the loop, and the hoist disappears and the loop works again.
This hoist is a big problem for many people who don't see it coming. A fairly widespread belief now is that safe Java has class level variables either declared as volatile or final. If you really need a variable to be "optimizable", then don't use a class level variable and instead make it a parameter, or copy it into a local variable for the optimizer to go after. Doing this with read only access helps manage "dynamic" changes in a value that disrupt predictable execution paths too.
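The "copy it into a local variable" advice might look like the sketch below. The class, the scale field, and the sumScaled method are hypothetical; the idea is that the field stays volatile for safety while the hot loop works on a local snapshot that the optimizer is free to keep in a register.

```java
class LocalCopySketch {
    // Shared and volatile: another thread may change it at any time.
    private volatile int scale = 2;

    void setScale(int newScale) {
        scale = newScale;
    }

    // One volatile read up front, then the loop only touches the local copy,
    // so the whole sum is computed against one consistent value of scale.
    long sumScaled(int[] values) {
        final int localScale = scale;
        long sum = 0;
        for (int v : values) {
            sum += (long) v * localScale;
        }
        return sum;
    }

    public static void main(String[] args) {
        LocalCopySketch sketch = new LocalCopySketch();
        System.out.println(sketch.sumScaled(new int[] {1, 2, 3})); // prints 12
    }
}
```

This is the hoist done by hand and on purpose: the code itself says that a change to scale during the loop is not meant to be picked up.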
There has been recurring discussion on the java concurrency mailing list about this issue. They don't seem to believe that this is a problem for Java developers and hold that this "optimization" of references is far more valuable to performance than problematic to development.

How does Thread.yield prevent a print statement from executing in a while loop given that the main thread changes the 'while' condition [duplicate]

I'm looking at a code sample from "Java Concurrency in Practice" by Brian Goetz. He says that it is possible that this code will stay in an infinite loop because "the value of 'ready' might never become visible to the reader thread". I don't understand how this can happen...
public class NoVisibility {
private static boolean ready;
private static int number;
private static class ReaderThread extends Thread {
public void run() {
while (!ready)
Thread.yield();
System.out.println(number);
}
}
public static void main(String[] args) {
new ReaderThread().start();
number = 42;
ready = true;
}
}
Because ready isn't marked as volatile, its value may be cached at the start of the while loop, since it isn't changed within the while loop. It's one of the ways the jitter optimizes the code.
So it's possible that the thread starts before ready = true, reads ready = false, caches that thread-locally and never reads it again.
Check out the volatile keyword.
The reason is explained in the section following the one with the code sample.
3.1.1 Stale data
NoVisibility demonstrated one of the ways that insufficiently synchronized programs can cause surprising results: stale data. When the reader thread examines ready, it may see an out-of-date value. Unless synchronization is used every time a variable is accessed, it is possible to see a stale value for that variable.
The Java Memory Model allows the JVM to optimize reference accesses and such as if it is a single threaded application, unless the field is marked as volatile or the accesses happen with a lock being held (the story gets a bit complicated with locks actually).
In the example you provided, the JVM could infer that the ready field is not modified within the current thread, so it could replace !ready with true, causing an infinite loop. Marking the field as volatile would cause the JVM to check the field value every time (or at least ensure that ready changes propagate to the running thread).
The problem is rooted in the hardware -- each CPU has different behavior with respect to cache coherence, memory visibility, and reordering of operations. Java is in better shape here than C++ because it defines a cross-platform memory model that all programmers can count on. When Java runs on a system whose memory model is weaker than that required by the Java Memory Model, the JVM has to make up the difference.
Languages like C "inherit" the memory model of the underlying hardware. There is work afoot to give C++ a formal memory model so that C++ programs can mean the same thing on different platforms.
private static boolean ready;
private static int number;
The way the memory model can work is that each thread could be reading and writing to its own copy of these variables (the problem affects non-static member variables too). This is a consequence of the way the underlying architecture can work.
Jeremy Manson and Brian Goetz:
In multiprocessor systems, processors generally have one or more layers of memory cache, which improves performance both by speeding access to data (because the data is closer to the processor) and reducing traffic on the shared memory bus (because many memory operations can be satisfied by local caches.) Memory caches can improve performance tremendously, but they present a host of new challenges. What, for example, happens when two processors examine the same memory location at the same time? Under what conditions will they see the same value?
So, in your example, the two threads might run on different processors, each with a copy of ready in their own, separate caches. The Java language provides the volatile and synchronized mechanisms for ensuring that the values seen by the threads are in sync.
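As an illustration of the synchronized route, here is a sketch of NoVisibility in which every access to ready goes through a synchronized method on the same monitor. The method names (publish, isReady, reset, publishAndRead) are made up for this example and are not from the book.

```java
class SynchronizedVisibilitySketch {
    private static boolean ready;
    private static int number;

    // All three methods lock the same monitor (the class object), so the
    // unlock at the end of publish() happens-before a later lock in isReady().
    private static synchronized void publish(int value) {
        number = value;
        ready = true;
    }

    private static synchronized boolean isReady() {
        return ready;
    }

    private static synchronized void reset() {
        ready = false;
        number = 0;
    }

    // Starts a reader that spins until it sees ready, then returns what it read.
    static int publishAndRead(int value) {
        reset();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!isReady()) {
                Thread.yield();
            }
            seen[0] = number; // ordered after the isReady() call that returned true
        });
        reader.start();
        publish(value);
        try {
            reader.join(); // makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead(42)); // prints 42
    }
}
```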
public class NoVisibility {
private static boolean ready = false;
private static int number;
private static class ReaderThread extends Thread {
@Override
public void run() {
while (!ready) {
Thread.yield();
}
System.out.println(number);
}
}
public static void main(String[] args) throws InterruptedException {
new ReaderThread().start();
number = 42;
Thread.sleep(20000);
ready = true;
}
}
Add a Thread.sleep() call of 20 seconds: the JIT will kick in during those 20 seconds and optimize the check, caching the value or removing the condition altogether. And so the code will fail on visibility.
To stop that from happening you MUST use volatile.
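A sketch of that fix, wrapped in a helper method (publishAndRead, a name I made up) so it can be called more than once. Only ready is volatile; number stays a plain field, and it is still guaranteed to be seen as 42 because it is written before the volatile write and read after the volatile read.

```java
class VolatilePublishSketch {
    private static volatile boolean ready; // the one change that matters
    private static int number;             // plain field, published through ready

    static int publishAndRead(int value) {
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read on every iteration
                Thread.yield();
            }
            seen[0] = number;      // sees the value written before ready = true
        });
        reader.start();
        number = value; // ordinary write, ordered before the volatile write below
        ready = true;   // volatile write
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead(42)); // prints 42
    }
}
```

Swapping the two writes in publishAndRead would bring back the other bug from the book's example: the reader could then print a stale number.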

Thread.sleep makes compiler read value every time

In java specification 17.3. Sleep and Yield, it says
It is important to note that neither Thread.sleep nor Thread.yield have any synchronization semantics.
This sentence is the point. If I replace Thread.sleep(100) with System.out.println("") in my test code below, the compiler always reads iv.stop every time, because System.out.println("") acquires a lock (check this question). The Java specification says Thread.sleep does not have any synchronization semantics, so I wonder what makes the compiler treat Thread.sleep(100) the same as System.out.println("").
My test code:
public class InfiniteLoop {
boolean stop = false;
public static void main(String[] args) throws InterruptedException {
final InfiniteLoop iv = new InfiniteLoop();
Thread t1 = new Thread(() -> {
while (!iv.stop) {
//uncomment this block of code, loop broken
// try {
// Thread.sleep(100);
// } catch (InterruptedException e) {
// e.printStackTrace();
// }
}
System.out.println("done");
});
Thread t2 = new Thread(() -> {
try {
Thread.sleep(100);
} catch (InterruptedException e) {
e.printStackTrace();
}
iv.stop = true;
});
t1.start();
t2.start();
}
}
As the comment above says, Thread.sleep() breaks the loop, this is different from the description of the Java specification: why?
Let's see what the docs actually say:
The compiler is free to read the field this.done just once, and reuse the cached value in each execution of the loop. This would mean that the loop would never terminate, even if another thread changed the value of this.done.
See the word "free"? "free" means that the compiler can either read this.done once, or not. It's the compiler's choice. That's what "free" means. Your loop breaks because the compiler sees your code and decides "I am going to read iv.stop every time, even though I can read it just once."
In other words, it is not guaranteed to always break the loop. Your code behaves exactly as the docs say.
so I wonder what makes compiler treat Thread.sleep(100) as same as System.out.println("").
Well there is certainly nothing in the language definition that says that they are at all the same. Thread.sleep(...) does not cross any memory barriers while System.out.println(...) does. What you may be seeing is an artifact of how your threaded application is running on your architecture. Maybe the thread gets swapped out because of CPU contention which forces the cache memory to be flushed. If you ran this on a different OS or on hardware with more cores you would most likely not see sleep(...) do anything.
The difference here may also be a compiler optimization. The while loop with nothing in it might not even be checking the value of the stop field, since the compiler knows that nothing is updating it inside the loop and it's not volatile. As soon as you add something that does thread state manipulation, it changes the generated code so that the field is then actually read.
Ultimately, the problem is around the publishing of the boolean stop field between threads. The field should be marked as volatile to ensure it is properly shared. As you mentioned, when you call System.out.println(...) this goes in and out of a synchronized block which crosses memory barriers which effectively update the stop field.
Though the question is already answered, I don't feel the other answers address the confusion in the question.
If we simplify the code to
while (!iv.stop) {
// do something ....
}
Then the compiler is free (as others have said) to read iv.stop only once. The important points are:
To force the compiler to re-read iv.stop on every iteration, declare it volatile.
Without volatile, changing the loop contents ("do something...") may or may not change whether the compiler re-reads iv.stop, and that cannot be reliably predicted.
You can't infer anything special in this context from the fact that sleep doesn't use locking semantics.
(The third point references what I believe to be the confusion in the question)
So with regard to the issue of println() vs sleep(): the fact that sleep doesn't use locking semantics is irrelevant; println doesn't use locking semantics either.
The println implementation may use locking to ensure its own thread-safety, but that fact is not visible in the scope of the calling code (i.e. your code). (As a side note, the sleep implementation will ultimately use some sort of locking deep down in its native code.)
From the API perspective, both sleep and println are ordinary method calls that take one parameter (sleep is static; println is an instance method invoked on System.out), so the compiler is likely to be affected in the same way by them with regard to how it optimises the surrounding code, but as I said, you can't rely on that.

Why does an empty while in Java not break when condition is set by other thread?

While trying to unit test a threaded class, I decided to use active waiting to control the behavior of the tested class. Using empty while statements for this failed to do what I intended. So my question is:
Why does the first code not complete, but the second does?
There is a similar question, but it doesn't have a real answer nor an MCVE and is far more specific.
Doesn't complete:
public class ThreadWhileTesting {
private static boolean wait = true;
private static final Runnable runnable = () -> {
try {Thread.sleep(50);} catch (InterruptedException ignored) {}
wait = false;
};
public static void main(String[] args) {
wait = true;
new Thread(runnable).start();
while (wait); // THIS LINE IS IMPORTANT
}
}
Does complete:
public class ThreadWhileTesting {
private static boolean wait = true;
private static final Runnable runnable = () -> {
try {Thread.sleep(50);} catch (InterruptedException ignored) {}
wait = false;
};
public static void main(String[] args) {
wait = true;
new Thread(runnable).start();
while (wait) {
System.out.println(wait); // THIS LINE IS IMPORTANT
}
}
}
I suspect that the empty while gets optimized by the Java compiler, but I am not sure. If this behavior is intended, how can I achieve what I want? (Yes, active waiting is intended since I cannot use locks for this test.)
wait isn't volatile and the loop body is empty, so the thread has no reason to believe it will change. It is JIT'd to
if (wait) while (true);
which never completes if wait is initially true.
The simple solution is just to make wait volatile, which prevents JIT making this optimization.
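A minimal sketch of that fix, reusing the question's class with the field declared volatile. The helper finishesWithin and its timeout are assumptions added here so that the busy-wait can be observed from outside; they are not part of the original code.

```java
public class ThreadWhileTesting {
    // volatile: every evaluation of the loop condition must re-read the field,
    // so the JIT is not allowed to hoist the read out of the loop.
    private static volatile boolean wait = true;

    private static final Runnable runnable = () -> {
        try { Thread.sleep(50); } catch (InterruptedException ignored) {}
        wait = false;
    };

    // Hypothetical helper (not in the question): runs the empty busy-wait in
    // its own thread and reports whether it ended within timeoutMillis.
    public static boolean finishesWithin(long timeoutMillis) throws InterruptedException {
        wait = true;
        Thread waiter = new Thread(() -> {
            while (wait); // empty body, as in the question
        });
        waiter.setDaemon(true); // do not keep the JVM alive if the loop hangs
        waiter.start();
        new Thread(runnable).start();
        waiter.join(timeoutMillis);
        return !waiter.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("loop finished: " + finishesWithin(2000));
    }
}
```

Removing the volatile keyword from this sketch may or may not make the loop hang on a given JVM, which is exactly the unpredictability discussed here.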
As to why the second version works: System.out.println is internally synchronized; as described in the JSR133 FAQ:
Before we can enter a synchronized block, we acquire the monitor, which has the effect of invalidating the local processor cache so that variables will be reloaded from main memory.
so the wait variable will be re-read from main memory next time around the loop.
However, you don't actually guarantee that the write of the wait variable in the other thread is committed to main memory; so, as @assylias notes above, it might not work in all conditions. (Making the variable volatile fixes this also).
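To illustrate why synchronization on the reading side alone is not a guarantee, here is a sketch in which both the write and the read go through the same monitor, which is what the memory model actually requires. It does use a lock, so it is shown only for comparison with the println case, not as a replacement for volatile in the question's test. The names SyncFlag, set, get and waiterFinishesWithin are hypothetical.

```java
public class SyncFlag {
    private boolean wait = true; // plain field, guarded by "this"

    // Releasing the monitor after this write happens-before any later
    // acquisition of the same monitor by another thread.
    public synchronized void set(boolean value) {
        wait = value;
    }

    // Acquiring the same monitor makes the latest guarded write visible.
    public synchronized boolean get() {
        return wait;
    }

    // Hypothetical helper: reports whether the busy-waiting thread ended in time.
    public static boolean waiterFinishesWithin(long timeoutMillis) throws InterruptedException {
        SyncFlag flag = new SyncFlag();
        Thread waiter = new Thread(() -> {
            while (flag.get()) {
                // busy-wait; each get() enters and leaves the monitor
            }
        });
        waiter.setDaemon(true); // do not keep the JVM alive if the loop hangs
        waiter.start();
        Thread.sleep(50);
        flag.set(false); // the writer synchronizes on the same object as the reader
        waiter.join(timeoutMillis);
        return !waiter.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter finished: " + waiterFinishesWithin(2000));
    }
}
```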
The short answer is that both of those examples are incorrect, but the second works because of an implementation artifact of the System.out stream.
A deeper explanation is that according to the JLS Memory Model, those two examples have a number of legal execution traces which give unexpected (to you) behavior. The JLS explains it like this (JLS 17.4):
A memory model describes, given a program and an execution trace of that program, whether the execution trace is a legal execution of the program. The Java programming language memory model works by examining each read in an execution trace and checking that the write observed by that read is valid according to certain rules.
The memory model describes possible behaviors of a program. An implementation is free to produce any code it likes, as long as all resulting executions of a program produce a result that can be predicted by the memory model.
This provides a great deal of freedom for the implementor to perform a myriad of code transformations, including the reordering of actions and removal of unnecessary synchronization.
In your first example, you have one thread updating a variable and a second thread reading it, with no form of synchronization between the two threads. To cut a (very) long story short, this means that the JLS does not guarantee that the memory update made by the writing thread will ever be visible to the reading thread. Indeed, the JLS text I quoted above means that the compiler is entitled to assume that the variable is never changed. If you perform an analysis using the rules set out in JLS 17.4, an execution trace where the reading thread never sees the change is legal.
In the second example, the println() call is (probably) causing some serendipitous flushing of memory caches. The result is that you are getting a different (but equally legal) execution trace, and the code "works".
The simple fix to make your examples both work is to declare the wait flag as volatile. This means that there is a happens-before relationship between a write of the variable in one thread and a subsequent read in another thread. That in turn means that in all legal execution traces, the result of the write will be visible to the reading thread.
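The happens-before edge created by a volatile variable also covers ordinary writes made before the volatile write. A small sketch of that consequence, assuming hypothetical names (HappensBeforeDemo, payload, ready, readPayload) that do not appear in the question:

```java
public class HappensBeforeDemo {
    private static int payload = 0;                 // plain, non-volatile field
    private static volatile boolean ready = false;  // volatile flag

    // Returns the payload value observed by the reader thread.
    public static int readPayload() throws InterruptedException {
        payload = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // busy-wait; each iteration performs a volatile read
            }
            // This read comes after a volatile read that saw "true", so it is
            // ordered after the writer's earlier plain write to payload.
            seen[0] = payload;
        });
        reader.start();
        payload = 42;   // ordinary write
        ready = true;   // volatile write: publishes everything written before it
        reader.join();  // join() makes the reader's write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readPayload()); // prints 42
    }
}
```

If ready were not volatile, the loop might never end, and even if it did, the reader would have no guarantee of seeing 42.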
This is a drastically simplified version of what the JLS actually says. If you really want to understand the technical details, they are all in the spec. But be prepared for some hard work understanding the details.
