Brian Goetz's improper publication

Brian Goetz's improper publication - java

The question has been posted before but no real example was provided that works. So Brian mentions that under certain conditions the AssertionError can occur in the following code:
public class Holder {
private int n;
public Holder(int n) { this.n = n; }
public void assertSanity() {
if (n!=n)
throw new AssertionError("This statement is false");
}
}
When holder is improperly published like this:
class someClass {
public Holder holder;
public void initialize() {
holder = new Holder(42);
}
}
I understand that this would occur when the reference to holder is made visible before the instance variable of the object holder is made visible to another thread. So I made the following example to provoke this behavior and thus the AssertionError with the following class:
public class Publish {
public Holder holder;
public void initialize() {
holder = new Holder(42);
}
public static void main(String[] args) {
Publish publish = new Publish();
Thread t1 = new Thread(new Runnable() {
public void run() {
for(int i = 0; i < Integer.MAX_VALUE; i++) {
publish.initialize();
}
System.out.println("initialize thread finished");
}
});
Thread t2 = new Thread(new Runnable() {
public void run() {
int nullPointerHits = 0;
int assertionErrors = 0;
while(t1.isAlive()) {
try {
publish.holder.assertSanity();
} catch(NullPointerException exc) {
nullPointerHits++;
} catch(AssertionError err) {
assertionErrors ++;
}
}
System.out.println("Nullpointerhits: " + nullPointerHits);
System.out.println("Assertion errors: " + assertionErrors);
}
});
t1.start();
t2.start();
}
}
No matter how many times I run the code, the AssertionError never occurs. So for me there are several options:
The jvm implementation (in my case Oracle's 1.8.0.20) enforces that the invariants set during construction of an object are visible to all threads.
The book is wrong, which I would doubt as the author is Brian Goetz ... nuf said
I'm doing something wrong in my code above
So the questions I have:
- Did someone ever provoke this kind of AssertionError successfully? With what code then?
- Why isn't my code provoking the AssertionError?

Your program is not properly synchronized, as that term is defined by the Java Memory Model.
That does not, however, mean that any particular run will exhibit the assertion failure you are looking for, nor that you necessarily can expect ever to see that failure. It may be that your particular VM just happens to handle that particular program in a way that turns out never to expose that synchronization failure. Or it may turn out the although susceptible to failure, the likelihood is remote.
And no, your test does not provide any justification for writing code that fails to be properly synchronized in this particular way. You cannot generalize from these observations.

You are looking for a very rare condition. Even if the code reads an unintialized n, it may read the same default value on the next read so the race you are looking for requires an update right in between these two adjacent reads.
The problem is that every optimizer will coerce the two reads in your code into one, once it starts processing your code, so after that you will never get an AssertionError even if that single read evaluates to the default value.
Further, since the access to Publish.holder is unsynchronized, the optimizer is allowed to read its value exactly once and assume unchanged during all subsequent iterations. So an optimized second thread would always process the same object which will never turn back to the uninitialized state. Even worse, an optimistic optimizer may go as far as to assume that n is always 42 as you never initialize it to something else in this runtime and it will not consider the case that you want a race condition. So both loops may get optimized to no-ops.
In other words: if your code doesn’t fail on the first access, the likeliness of spotting the error in subsequent iterations dramatically drops down, possibly to zero. This is the opposite of your idea to let the code run inside a long loop hoping that you will eventually encounter the error.
The best chances for getting a data race are on the first, non-optimized, interpreted execution of your code. But keep in mind, the chance for that specific data race are still extremely low, even when running the entire test code in pure interpreted mode.

Related

Nested spin-lock vs volatile check

I was about to write something about this, but maybe it is better to have a second opinion before appearing like a fool...
So the idea in the next piece of code (android's room package v2.4.1, RoomTrackingLiveData), is that the winner thread is kept alive, and is forced to check for contention that may have entered the process (coming from losing threads) while computing.
While fail CAS operations performed by these losing threads keep them out from entering and executing code, preventing repeating signals (mComputeFunction.call() OR postValue()).
final Runnable mRefreshRunnable = new Runnable() {
#WorkerThread
#Override
public void run() {
if (mRegisteredObserver.compareAndSet(false, true)) {
mDatabase.getInvalidationTracker().addWeakObserver(mObserver);
}
boolean computed;
do {
computed = false;
if (mComputing.compareAndSet(false, true)) {
try {
T value = null;
while (mInvalid.compareAndSet(true, false)) {
computed = true;
try {
value = mComputeFunction.call();
} catch (Exception e) {
throw new RuntimeException("Exception while computing database"
+ " live data.", e);
}
}
if (computed) {
postValue(value);
}
} finally {
mComputing.set(false);
}
}
} while (computed && mInvalid.get());
}
};
final Runnable mInvalidationRunnable = new Runnable() {
#MainThread
#Override
public void run() {
boolean isActive = hasActiveObservers();
if (mInvalid.compareAndSet(false, true)) {
if (isActive) {
getQueryExecutor().execute(mRefreshRunnable);
}
}
}
};
The most obvious thing here is that atomics are being used for everything they are not good at:
Identifying losers and ignoring winners (what reactive patterns need).
AND a happens once behavior, performed by the loser thread.
So this is completely counter intuitive to what atomics are able to achieve, since they are extremely good at defining winners, AND anything that requires a "happens once" becomes impossible to ensure state consistency (the last one is suitable to start a philosophical debate about concurrency, and I will definitely agree with any conclusion).
If atomics are used as: "Contention checkers" and "Contention blockers" then we can implement the exact principle with a volatile check of an atomic reference after a successful CAS.
And checking this volatile against the snapshot/witness during every other step of the process.
private final AtomicInteger invalidationCount = new AtomicInteger();
private final IntFunction<Runnable> invalidationRunnableFun = invalidationVersion -> (Runnable) () -> {
if (invalidationVersion != invalidationCount.get()) return;
try {
T value = computeFunction.call();
if (invalidationVersion != invalidationCount.get()) return; //In case computation takes too long...
postValue(value);
} catch (Exception e) {
e.printStackTrace();
}
};
getQueryExecutor().execute(invalidationRunnableFun.apply(invalidationCount.incrementAndGet()));
In this case, each thread is left with the individual responsibility of checking their position in the contention lane, if their position moved and is not at the front anymore, it means that a new thread entered the process, and they should stop further processing.
This alternative is so laughably simple that my first question is:
Why didn't they do it like this?
Maybe my solution has a flaw... but the thing about the first alternative (the nested spin-lock) is that it follows the idea that an atomic CAS operation cannot be verified a second time, and that a verification can only be achieved with a cmpxchg process.... which is... false.
It also follows the common (but wrong) believe that what you define after a successful CAS is the sacred word of GOD... as I've seen code seldom check for concurrency issues once they enter the if body.
if (mInvalid.compareAndSet(false, true)) {
// Ummm... yes... mInvalid is still true...
// Let's use a second atomicReference just in case...
}
It also follows common code conventions that involve "double-<enter something>" in concurrency scenarios.
So only because the first code follows those ideas, is that I am inclined to believe that my solution is a valid and better alternative.
Even though there is an argument in favor of the "nested spin-lock" option, but does not hold up much:
The first alternative is "safer" precisely because it is SLOWER, so it has MORE time to identify contention at the end of the current of incoming threads.
BUT is not even 100% safe because of the "happens once" thing that is impossible to ensure.
There is also a behavior with the code, that, when it reaches the end of a continuos flow of incoming threads, 2 signals are dispatched one after the other, the second to last one, and then the last one.
But IF it is safer because it is slower, wouldn't that defeat the goal of using atomics, since their usage is supposed to be with the aim of being a better performance alternative in the first place?

Unsafe access from multiple threads to set a single value to a single variable

I understood that reading and writing data from multiple threads need to have a good locking mechanism to avoid data race. However, one situation is: If multiple threads try to write to a single variable with a single value, can this be a problem.
For example, here my sample code:
public class Main {
public static void main(String[] args) {
final int[] a = {1};
while(true) {
new Thread(new Runnable() {
#Override
public void run() {
a[0] = 1;
assert a[0] == 1;
}
}).start();
}
}
}
I have run this program for a long time, and look like everything is fine. If this code can cause the problem, how can I reproduce that?

Your test case does not cover the actual problem. You test the variable's value in the same thread - but that thread already copied the initial state of the variable and when it changes within the thread, the changes are visible to that thread, just like in any single-threaded applications. The real issue with write operations is how and when is the updated value used in the other threads.
For example, if you were to write a counter, where each thread increments the value of the number, you would run into issues. An other problem is that your test operation take way less time than creating a thread, therefore the execution is pretty much linear. If you had longer code in the threads, it would be possible for multiple threads to access the variable at the same time. I wrote this test using Thread.sleep(), which is known to be unreliable (which is what we need):
int[] a = new int[]{0};
for(int i = 0; i < 100; i++) {
final int k = i;
new Thread(new Runnable() {
#Override
public void run() {
try {
Thread.sleep(20);
} catch(InterruptedException e) {
e.printStackTrace();
}
a[0]++;
System.out.println(a[0]);
}
}).start();
}
If you execute this code, you will see how unreliable the output is. The order of the numbers change (they are not in ascending order), there are duplicates and missing numbers as well. This is because the variable is copied to the CPU memory multiple times (once for each thread), and is pasted back to the shared ram after the operation is complete. (This does not happen right after it is completed to save time in case it is needed later).
There also might be some other mechanics in the JVM that copy the values within the RAM for threads, but I'm unaware of them.
The thing is, even locking doesn't prevent these issues. It prevents threads from accessing the variable at the same time, but it generally doesn't make sure that the value of the variable is updated before the next thread accesses it.

Real world example of Memory Consistency Errors in multi-threading?

In the tutorial of java multi-threading, it gives an exmaple of Memory Consistency Errors. But I can not reproduce it. Is there any other method to simulate Memory Consistency Errors?
The example provided in the tutorial:
Suppose a simple int field is defined and initialized:
int counter = 0;
The counter field is shared between two threads, A and B. Suppose thread A increments counter:
counter++;
Then, shortly afterwards, thread B prints out counter:
System.out.println(counter);
If the two statements had been executed in the same thread, it would be safe to assume that the value printed out would be "1". But if the two statements are executed in separate threads, the value printed out might well be "0", because there's no guarantee that thread A's change to counter will be visible to thread B — unless the programmer has established a happens-before relationship between these two statements.

I answered a question a while ago about a bug in Java 5. Why doesn't volatile in java 5+ ensure visibility from another thread?
Given this piece of code:
public class Test {
volatile static private int a;
static private int b;
public static void main(String [] args) throws Exception {
for (int i = 0; i < 100; i++) {
new Thread() {
#Override
public void run() {
int tt = b; // makes the jvm cache the value of b
while (a==0) {
}
if (b == 0) {
System.out.println("error");
}
}
}.start();
}
b = 1;
a = 1;
}
}
The volatile store of a happens after the normal store of b. So when the thread runs and sees a != 0, because of the rules defined in the JMM, we must see b == 1.
The bug in the JRE allowed the thread to make it to the error line and was subsequently resolved. This definitely would fail if you don't have a defined as volatile.

This might reproduce the problem, at least on my computer, I can reproduce it after some loops.
Suppose you have a Counter class:
class Holder {
boolean flag = false;
long modifyTime = Long.MAX_VALUE;
}
Let thread_A set flag as true, and save the time into
modifyTime.
Let another thread, let's say thread_B, read the Counter's flag. If thread_B still get false even when it is later than modifyTime, then we can say we have reproduced the problem.
Example code
class Holder {
boolean flag = false;
long modifyTime = Long.MAX_VALUE;
}
public class App {
public static void main(String[] args) {
while (!test());
}
private static boolean test() {
final Holder holder = new Holder();
new Thread(new Runnable() {
#Override
public void run() {
try {
Thread.sleep(10);
holder.flag = true;
holder.modifyTime = System.currentTimeMillis();
} catch (Exception e) {
e.printStackTrace();
}
}
}).start();
long lastCheckStartTime = 0L;
long lastCheckFailTime = 0L;
while (true) {
lastCheckStartTime = System.currentTimeMillis();
if (holder.flag) {
break;
} else {
lastCheckFailTime = System.currentTimeMillis();
System.out.println(lastCheckFailTime);
}
}
if (lastCheckFailTime > holder.modifyTime
&& lastCheckStartTime > holder.modifyTime) {
System.out.println("last check fail time " + lastCheckFailTime);
System.out.println("modify time " + holder.modifyTime);
return true;
} else {
return false;
}
}
}
Result
last check time 1565285999497
modify time 1565285999494
This means thread_B get false from Counter's flag filed at time 1565285999497, even thread_A has set it as true at time 1565285999494(3 milli seconds ealier).

The example used is too bad to demonstrate the memory consistency issue. Making it work will require brittle reasoning and complicated coding. Yet you may not be able to see the results. Multi-threading issues occur due to unlucky timing. If someone wants to increase the chances of observing issue, we need to increase chances of unlucky timing.
Following program achieves it.
public class ConsistencyIssue {
static int counter = 0;
public static void main(String[] args) throws InterruptedException {
Thread thread1 = new Thread(new Increment(), "Thread-1");
Thread thread2 = new Thread(new Increment(), "Thread-2");
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println(counter);
}
private static class Increment implements Runnable{
#Override
public void run() {
for(int i = 1; i <= 10000; i++)
counter++;
}
}
}
Execution 1 output: 10963,
Execution 2 output: 14552
Final count should have been 20000, but it is less than that. Reason is count++ is multi step operation,
1. read count
2. increment count
3. store it
two threads may read say count 1 at once, increment it to 2. and write out 2. But if it was a serial execution it should have been 1++ -> 2++ -> 3.
We need a way to make all 3 steps atomic. i.e to be executed by only one thread at a time.
Solution 1: Synchronized
Surround the increment with Synchronized. Since counter is static variable you need to use class level synchronization
#Override
public void run() {
for (int i = 1; i <= 10000; i++)
synchronized (ConsistencyIssue.class) {
counter++;
}
}
Now it outputs: 20000
Solution 2: AtomicInteger
public class ConsistencyIssue {
static AtomicInteger counter = new AtomicInteger(0);
public static void main(String[] args) throws InterruptedException {
Thread thread1 = new Thread(new Increment(), "Thread-1");
Thread thread2 = new Thread(new Increment(), "Thread-2");
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println(counter.get());
}
private static class Increment implements Runnable {
#Override
public void run() {
for (int i = 1; i <= 10000; i++)
counter.incrementAndGet();
}
}
}
We can do with semaphores, explicit locking too. but for this simple code AtomicInteger is enough

Sometimes when I try to reproduce some real concurrency problems, I use the debugger.
Make a breakpoint on the print and a breakpoint on the increment and run the whole thing.
Releasing the breakpoints in different sequences gives different results.
Maybe to simple but it worked for me.

Please have another look at how the example is introduced in your source.
The key to avoiding memory consistency errors is understanding the happens-before relationship. This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement. To see this, consider the following example.
This example illustrates the fact that multi-threading is not deterministic, in the sense that you get no guarantee about the order in which operations of different threads will be executed, which might result in different observations across several runs. But it does not illustrate a memory consistency error!
To understand what a memory consistency error is, you need to first get an insight about memory consistency. The simplest model of memory consistency has been introduced by Lamport in 1979. Here is the original definition.
The result of any execution is the same as if the operations of all the processes were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program
Now, consider this example multi-threaded program, please have a look at this image from a more recent research paper about sequential consistency. It illustrates what a real memory consistency error might look like.
To finally answer your question, please note the following points:
A memory consistency error always depends on the underlying memory model (A particular programming languages may allow more behaviours for optimization purposes). What's the best memory model is still an open research question.
The example given above gives an example of sequential consistency violation, but there is no guarantee that you can observe it with your favorite programming language, for two reasons: it depends on the programming language exact memory model, and due to undeterminism, you have no way to force a particular incorrect execution.
Memory models are a wide topic. To get more information, you can for example have a look at Torsten Hoefler and Markus Püschel course at ETH Zürich, from which I understood most of these concepts.
Sources
Leslie Lamport. How to Make a Multiprocessor Computer That Correctly Executes Multiprocessor Programs, 1979
Wei-Yu Chen, Arvind Krishnamurthy, Katherine Yelick, Polynomial-Time Algorithms for Enforcing Sequential Consistency in SPMD Programs with Arrays, 2003
Design of Parallel and High-Performance Computing course, ETH Zürich

Java Thread seemingly skipping conditional statement [duplicate]

This question already has answers here:
Why doesnt this Java loop in a thread work?
(4 answers)
Closed 3 years ago.
For a recent library I'm writing, I wrote a thread which loops indefinitely. In this loop, I start with a conditional statement checking a property on the threaded object. However it seems that whatever initial value the property has, will be what it returns even after being updated.
Unless I do some kind of interruption such as Thread.sleep or a print statement.
I'm not really sure how to ask the question unfortunately. Otherwise I would be looking in the Java documentation. I have boiled down the code to a minimal example that explains the problem in simple terms.
public class App {
public static void main(String[] args) {
App app = new App();
}
class Test implements Runnable {
public boolean flag = false;
public void run() {
while(true) {
// try {
// Thread.sleep(1);
// } catch (InterruptedException e) {}
if (this.flag) {
System.out.println("True");
}
}
}
}
public App() {
Test t = new Test();
Thread thread = new Thread(t);
System.out.println("Starting thread");
thread.start();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {}
t.flag = true;
System.out.println("New flag value: " + t.flag);
}
}
Now, I would presume that after we change the value of the flag property on the running thread, we would immediately see the masses of 'True' spitting out to the terminal. However, we don't..
If I un-comment the Thread.sleep lines inside the thread loop, the program works as expected and we see the many lines of 'True' being printed after we change the value in the App object. As an addition, any print method in place of the Thread.sleep also works, but some simple assignment code does not. I assume this is because it is pulled out as un-used code at compile time.
So, my question is really: Why do I have to use some kind of interruption to get the thread to check conditions correctly?

So, my question is really: Why do I have to use some kind of interruption to get the thread to check conditions correctly?
Well you don't have to. There are at least two ways to implement this particular example without using "interruption".
If you declare flag to be volatile, then it will work.
It will also work if you declare flag to be private, write synchronized getter and setter methods, and use those for all accesses.
public class App {
public static void main(String[] args) {
App app = new App();
}
class Test implements Runnable {
private boolean flag = false;
public synchronized boolean getFlag() {
return this.flag;
}
public synchronized void setFlag(boolean flag) {
return this.flag = flag;
}
public void run() {
while(true) {
if (this.getFlag()) { // Must use the getter here too!
System.out.println("True");
}
}
}
}
public App() {
Test t = new Test();
Thread thread = new Thread(t);
System.out.println("Starting thread");
thread.start();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {}
t.setFlag(true);
System.out.println("New flag value: " + t.getFlag());
}
But why do you need to do this?
Because unless you use either a volatile or synchronized (and you use synchronized correctly) then one thread is not guaranteed to see memory changes made by another thread.
In your example, the child thread does not see the up-to-date value of flag. (It is not that the conditions themselves are incorrect or "don't work". They are actually getting stale inputs. This is "garbage in, garbage out".)
The Java Language Specification sets out precisely the conditions under which one thread is guaranteed to see (previous) writes made by another thread. This part of the spec is called the Java Memory Model, and it is in JLS 17.4. There is a more easy to understand explanation in Java Concurrency in Practice by Brian Goetz et al.
Note that the unexpected behavior could be due to the JIT deciding to keep the flag in a register. It could also be that the JIT compiler has decided it does not need force memory cache write-through, etcetera. (The JIT compiler doesn't want to force write-through on every memory write to every field. That would be a major performance hit on multi-core systems ... which most modern machines are.)
The Java interruption mechanism is yet another way to deal with this. You don't need any synchronization because the method calls that. In addition, interruption will work when the thread you are trying to interrupt is currently waiting or blocked on an interruptible operation; e.g. in an Object::wait call.

Because the variable is not modified in that thread, the JVM is free to effectively optimize the check away. To force an actual check, use the volatile keyword:
public volatile boolean flag = false;

Can we say that by synchronizing a block of code we are making the contained statements atomic?

I want to clear my understanding that if I surround a block of code with synchronized(this){} statement, does this mean that I am making those statements atomic?

No, it does not ensure your statements are atomic. For example, if you have two statements inside one synchronized block, the first may succeed, but the second may fail. Hence, the result is not "all or nothing". But regarding multiple threads, you ensure that no statement of two threads are interleaved. In other words: all statements of all threads are strictly serialized, even so, there is no guarantee, that all or none statements of a thread gets executed.
Have a look at how Atomicity is defined.
Here is an example showing that the reader is able to ready a corrupted state. Hence the synchronized block was not executed atomically (forgive me the nasty formatting):
public class Example {
public static void sleep() {
try { Thread.sleep(400); } catch (InterruptedException e) {};
}
public static void main(String[] args) {
final Example example = new Example(1);
ExecutorService executor = newFixedThreadPool(2);
try {
Future<?> reader = executor.submit(new Runnable() { #Override public void run() {
int value; do {
value = example.getSingleElement();
System.out.println("single value is: " + value);
} while (value != 10);
}});
Future<?> writer = executor.submit(new Runnable() { #Override public void run() {
for (int value = 2; value < 10; value++) example.failDoingAtomic(value);
}});
reader.get(); writer.get();
} catch (Exception e) { e.getCause().printStackTrace();
} finally { executor.shutdown(); }
}
private final Set<Integer> singleElementSet;
public Example(int singleIntValue) {
singleElementSet = new HashSet<>(Arrays.asList(singleIntValue));
}
public synchronized void failDoingAtomic(int replacement) {
singleElementSet.clear();
if (new Random().nextBoolean()) sleep();
else throw new RuntimeException("I failed badly before adding the new value :-(");
singleElementSet.add(replacement);
}
public int getSingleElement() {
return singleElementSet.iterator().next();
}
}

No, synchronization and atomicity are two different concepts.
Synchronization means that a code block can be executed by at most one thread at a time, but other threads (that execute some other code that uses the same data) can see intermediate results produced inside the "synchronized" block.
Atomicity means that other threads do not see intermediate results - they see either the initial or the final state of the data affected by the atomic operation.

It's unfortunate that java uses synchronized as a keyword. A synchronized block in Java is a "mutex" (short for "mutual exclusion"). It's a mechanism that insures only one thread at a time can enter the block.
Mutexes are just one of many tools that are used to achieve "synchronization" in a multi-threaded program: Broadly speaking, synchronization refers to all of the techniques that are used to insure that the threads will work in a coordinated fashion to achieve a desired outcome.
Atomicity is what Oleg Estekhin said, above. We usually hear about it in the context of "transactions." Mutual exclusion (i.e., Java's synchronized) guarantees something less than atomicity: Namely, it protects invariants.
An invariant is any assertion about the program's state that is supposed to be "always" true. E.g., in a game where players exchange virtual coins, the total number of coins in the game might be an invariant. But it's often impossible to advance the state of the program without temporarily breaking the invariant. The purpose of mutexes is to insure that only one thread---the one that is doing the work---can see the temporary "broken" state.

For code that use syncronized on that object - yes.
For code, that don't use syncronized keyword for that object - no.

Can we say that by synchronizing a block of code we are making the contained statements atomic?
You are taking a very big leap there. Atomicity means that the operation if atomic will complete in one CPU cycle or equivalent to one CPU cycle whereas Synchronizing a block means only one thread can access the critical region. It may take multiple CPU cycles for processing code in the critical region(which will make it non atomic).

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.