In the "Effective Java", the author mentioned that
while (!done) i++;
can be optimized by HotSpot into
if (!done) {
while (true) i++;
}
I am very confused about it. The variable done is usually not a constant, so why can the compiler optimize it that way?
The author assumes there that done is a plain, non-volatile variable; the Java Memory Model imposes no requirement to expose such a variable's value to other threads in the absence of synchronization primitives. Or said another way: as far as the compiler can tell, the value of done won't be changed or viewed by any code other than what's shown here.
In that case, since the loop doesn't change the value of done, its value can be effectively ignored, and the compiler can hoist the evaluation of that variable outside the loop, preventing it from being evaluated in the "hot" part of the loop. This makes the loop run faster because it has to do less work.
This works in more complicated expressions too, such as the length of an array:
int[] array = new int[10000];
Random random = new Random(); // Random.nextInt() is not static; an instance is needed
for (int i = 0; i < array.length; ++i) {
    array[i] = random.nextInt();
}
In this case, a naive implementation would evaluate the length of the array 10,000 times, but since the variable array is never reassigned and the length of an array can never change, the evaluation can be rewritten as:
int[] array = new int[10000];
Random random = new Random();
for (int i = 0, $l = array.length; i < $l; ++i) {
    array[i] = random.nextInt();
}
Other optimizations also apply here unrelated to hoisting.
Hope that helps.
Joshua Bloch's "Effective Java" explains why you must be careful when sharing variables between threads. If no explicit happens-before relationship exists between the threads, the HotSpot compiler is allowed to optimize the code for speed, as shown by dmide.
Most modern microprocessors offer various kinds of out-of-order execution strategies. This leads to a weak consistency model, which is also the basis for the Java Memory Model. The idea behind it is: as long as the programmer does not explicitly express the need for inter-thread coordination, the processor and the compiler are free to perform various optimizations.
The two keywords volatile (atomicity & visibility) and synchronized (atomicity & visibility & mutual exclusion) are used to express the visibility of changes to other threads. In addition, you must know the happens-before rules (see Goetz et al., "Java Concurrency in Practice", p. 341ff (JCP), and the Java Language Specification, §17).
So, what happens when System.out.println() is called?
First of all, you need two System.out.println() calls: one in the main method (after changing done) and one in the started thread (in the while loop). Now we must consider the program order rule and the monitor lock rule from JLS §17. The short version: you have a common lock object M, and everything that happens in a thread A before A unlocks M is visible to another thread B at the moment B locks M (see JCP).
In our case, the two threads share a common PrintStream object via System.out. If you look inside println(), you will see a synchronized (this) block.
Conclusion: both threads share a common lock M which is locked and unlocked, and System.out.println() thereby "flushes" the state change of the variable done.
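Here is a minimal sketch of that monitor lock rule with made-up names (this is not the actual PrintStream internals, and it only helps if B really locks M after A has unlocked it):
public class MonitorLockRule {
    private static final Object M = new Object(); // the common lock M
    private static boolean done; // plain, non-volatile field

    static void threadA() {
        done = true;         // everything A does before unlocking M ...
        synchronized (M) { } // ... is visible to whoever locks M next
    }

    static void threadB() {
        synchronized (M) { } // locking M after A's unlock ...
        if (done) {          // ... makes A's write to done visible here
            System.out.println("B saw the change");
        }
    }
}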
import java.util.concurrent.TimeUnit;

public class StopThread {
    private static boolean stopRequested;

    private static synchronized void requestStop() {
        stopRequested = true;
    }

    private static synchronized boolean stopRequested() {
        return stopRequested;
    }

    public static void main(String[] args)
            throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                // the synchronized accessor guarantees this thread
                // eventually sees the update made by requestStop()
                while (!stopRequested())
                    i++;
            }
        });
        backgroundThread.start();
        TimeUnit.SECONDS.sleep(1);
        requestStop();
    }
}
The above code is correct in "Effective Java"; it is equivalent to decorating stopRequested with volatile.
private static boolean stopRequested() {
return stopRequested;
}
If this method omits the synchronized keyword, the program does not work: the background thread may never stop. I think it is the hoisting described above that kicks in when the method omits the synchronized keyword.
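For reference, here is a minimal sketch of the volatile variant that Effective Java presents (same behavior as the synchronized accessors, without locking):
import java.util.concurrent.TimeUnit;

public class StopThread {
    // volatile guarantees the background thread sees the main thread's write
    private static volatile boolean stopRequested;

    public static void main(String[] args) throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                while (!stopRequested)
                    i++;
            }
        });
        backgroundThread.start();
        TimeUnit.SECONDS.sleep(1);
        stopRequested = true;
    }
}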
If you add System.out.println("i = " + i); inside the while loop, the hoisting no longer happens and the program stops as expected. Is it because the println method is internally synchronized that the JVM cannot optimize the code segment?
Related
I was thinking about how to solve a race condition between two threads that write to the same variable, using immutable objects and without the help of any keywords such as synchronized (locks) or volatile in Java.
But I couldn't figure it out. Is it possible to solve this problem with such a solution at all?
public class Test {

    private static IAmSoImmutable iAmSoImmutable;

    private static final Runnable increment1000Times = () -> {
        for (int i = 0; i < 1000; i++) {
            iAmSoImmutable.increment();
        }
    };

    public static void main(String... args) throws Exception {
        for (int i = 0; i < 10; i++) {
            iAmSoImmutable = new IAmSoImmutable(0);

            Thread t1 = new Thread(increment1000Times);
            Thread t2 = new Thread(increment1000Times);
            t1.start();
            t2.start();
            t1.join();
            t2.join();

            // Prints a different result every time -- why? :
            System.out.println(iAmSoImmutable.value);
        }
    }

    public static class IAmSoImmutable {
        private int value;

        public IAmSoImmutable(int value) {
            this.value = value;
        }

        public IAmSoImmutable increment() {
            return new IAmSoImmutable(++value);
        }
    }
}
If you run this code you'll get different answers every time, which means a race condition is happening.
You cannot solve a race condition without using one of the existing synchronization (or volatile) techniques; that is what they were designed for. If it were possible to do without them, there would be no need for them.
More particularly, your code is broken. This method:
public IAmSoImmutable increment() {
return new IAmSoImmutable(++value);
}
is flawed for two reasons:
1) It breaks the immutability of the class, because it mutates the object's value field.
2) Its result, a new instance of IAmSoImmutable, is never used.
The fundamental problem here is that you've misunderstood what "immutability" means.
"Immutability" means — no writes. Values are created, but are never modified.
Immutability ensures that there are no race conditions, because race conditions are always caused by writes: either two threads performing writes that aren't consistent with each other, or one thread performing writes and another thread performing reads that give inconsistent results, or similar.
(Caveat: even an immutable object is effectively mutable during construction, since Java creates the object and then populates its fields. So, in addition to being immutable in general, you need to use the final keyword appropriately and take care with what you do in the constructor. But those are minor details.)
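To make that concrete, here is a minimal sketch of a genuinely immutable version of the IAmSoImmutable class from the question (the accessor name is my own invention):
public final class IAmSoImmutable {
    private final int value; // final: assigned once, in the constructor

    public IAmSoImmutable(int value) {
        this.value = value;
    }

    public int value() {
        return value;
    }

    public IAmSoImmutable increment() {
        // no mutation: a fresh object carries the new value
        return new IAmSoImmutable(value + 1);
    }
}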
With that understanding, we can go back to your initial sentence:
I was thinking about how to solve a race condition between two threads that write to the same variable, using immutable objects and without the help of any keywords such as synchronized (locks) or volatile in Java.
The problem here is that you actually aren't using immutable objects: your entire goal is to perform writes, and the entire concept of immutability is that no writes happen. These are not compatible.
That said, immutability certainly has its place. You can have immutable IAmSoImmutable objects, with the only writes being that you swap these objects out for each other. That helps simplify the problem, by reducing the scope of writes that you have to worry about: there's only one kind of write. But even that one kind of write will require synchronization.
The best approach here is probably to use an AtomicReference<IAmSoImmutable>. This provides a non-blocking way to swap out your IAmSoImmutable instances, while guaranteeing that no write gets silently dropped.
(In fact, in the special case that your value is just an integer, the JDK provides AtomicInteger that handles the necessary compare-and-swap loops and so on for threadsafe incrementation.)
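As a sketch, building on the corrected immutable class above, the AtomicReference approach could look like this:
import java.util.concurrent.atomic.AtomicReference;

public class Counter {
    private static final AtomicReference<IAmSoImmutable> ref =
            new AtomicReference<>(new IAmSoImmutable(0));

    static void increment() {
        // classic compare-and-swap loop: retry until our swap wins
        IAmSoImmutable current;
        do {
            current = ref.get();
        } while (!ref.compareAndSet(current, current.increment()));
    }
}
On Java 8+, ref.updateAndGet(IAmSoImmutable::increment) performs the same retry loop for you.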
Even if the problems are resolved by:
Avoiding the mutation of IAmSoImmutable.value
Reassigning the new object created within increment() back into the iAmSoImmutable reference
There are still pieces of your code that are not atomic and need some sort of synchronization.
A solution would, of course, be to use a synchronized method:
public static synchronized void increment() {
    iAmSoImmutable = iAmSoImmutable.increment();
}

Thread t1 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});

Thread t2 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});
I want to clarify my understanding: if I surround a block of code with a synchronized (this) { } statement, does this mean that I am making those statements atomic?
No, it does not ensure your statements are atomic. For example, if you have two statements inside one synchronized block, the first may succeed but the second may fail, so the result is not "all or nothing". But with regard to multiple threads, you do ensure that no statements of two threads are interleaved. In other words: all statements of all threads are strictly serialized, even though there is no guarantee that all or none of a thread's statements get executed.
Have a look at how Atomicity is defined.
Here is an example showing that a reader is able to read a corrupted state; hence, the synchronized block was not executed atomically:
import static java.util.concurrent.Executors.newFixedThreadPool;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class Example {

    public static void sleep() {
        try { Thread.sleep(400); } catch (InterruptedException ignored) { }
    }

    public static void main(String[] args) {
        final Example example = new Example(1);
        ExecutorService executor = newFixedThreadPool(2);
        try {
            Future<?> reader = executor.submit(new Runnable() {
                @Override public void run() {
                    int value;
                    do {
                        value = example.getSingleElement();
                        System.out.println("single value is: " + value);
                    } while (value != 10);
                }
            });
            Future<?> writer = executor.submit(new Runnable() {
                @Override public void run() {
                    for (int value = 2; value < 10; value++) example.failDoingAtomic(value);
                }
            });
            reader.get();
            writer.get();
        } catch (Exception e) {
            e.getCause().printStackTrace();
        } finally {
            executor.shutdown();
        }
    }

    private final Set<Integer> singleElementSet;

    public Example(int singleIntValue) {
        singleElementSet = new HashSet<>(Arrays.asList(singleIntValue));
    }

    // synchronized, but NOT atomic: it can fail halfway, leaving the set empty
    public synchronized void failDoingAtomic(int replacement) {
        singleElementSet.clear();
        if (new Random().nextBoolean()) sleep();
        else throw new RuntimeException("I failed badly before adding the new value :-(");
        singleElementSet.add(replacement);
    }

    // unsynchronized read: it can observe the corrupted (empty) state
    public int getSingleElement() {
        return singleElementSet.iterator().next();
    }
}
No, synchronization and atomicity are two different concepts.
Synchronization means that a code block can be executed by at most one thread at a time, but other threads (that execute some other code that uses the same data) can see intermediate results produced inside the "synchronized" block.
Atomicity means that other threads do not see intermediate results - they see either the initial or the final state of the data affected by the atomic operation.
It's unfortunate that Java uses synchronized as a keyword. A synchronized block in Java is a "mutex" (short for "mutual exclusion"): a mechanism that ensures only one thread at a time can enter the block.
Mutexes are just one of many tools that are used to achieve "synchronization" in a multi-threaded program. Broadly speaking, synchronization refers to all of the techniques used to ensure that threads work in a coordinated fashion to achieve a desired outcome.
Atomicity is what Oleg Estekhin said, above. We usually hear about it in the context of "transactions". Mutual exclusion (i.e., Java's synchronized) guarantees something less than atomicity: namely, it protects invariants.
An invariant is any assertion about the program's state that is supposed to be "always" true. For example, in a game where players exchange virtual coins, the total number of coins in the game might be an invariant. But it is often impossible to advance the state of the program without temporarily breaking the invariant. The purpose of mutexes is to ensure that only one thread, the one doing the work, can see the temporary "broken" state.
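As a sketch of that coin example (all names invented):
public class CoinGame {
    private int alice = 50, bob = 50; // invariant: alice + bob == 100

    public synchronized void transfer(int coins) {
        alice -= coins; // the invariant is temporarily broken here ...
        bob += coins;   // ... and restored before the lock is released
    }

    public synchronized int totalCoins() {
        // readers take the same lock, so they never see the broken state
        return alice + bob;
    }
}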
For code that uses synchronized on that object: yes.
For code that doesn't use the synchronized keyword on that object: no.
Can we say that by synchronizing a block of code we are making the contained statements atomic?
You are taking a very big leap there. Atomicity means that the operation appears indivisible to other threads; it does not mean it completes in one CPU cycle. Synchronizing a block only means that at most one thread can be in the critical region at a time. Processing the code in the critical region may take many steps, and threads that do not acquire the same lock can still observe its intermediate states (which makes it non-atomic).
Suppose that there are many threads that call the method m(int i) and change the value of the array in position i. Is the following code correct, or is there a race condition?
import java.util.concurrent.Semaphore;

public class A {
    private static final int N = 10; // N was left unspecified in the question
    private int[] a = new int[N];
    private Semaphore[] s = new Semaphore[N];

    public A() {
        for (int i = 0; i < N; i++)
            s[i] = new Semaphore(1);
    }

    public void m(int i) throws InterruptedException { // acquire() can be interrupted
        s[i].acquire();
        a[i]++;
        s[i].release();
    }
}
The code is correct; I see no race condition, although both a and s should be made final. You should also use try/finally every time you acquire and release a lock:
s[i].acquire();
try {
    a[i]++;
} finally {
    s[i].release();
}
But for updating an array, individual locks per item are unnecessary. A single lock would be just as appropriate, since the major cost is the memory update and the underlying native synchronization itself. That said, if the actual operation is more than an int increment, then you are warranted in using a Semaphore or another Lock object.
But for simple operations, something like the following is fine:
// make sure it is final if you are synchronizing on it
private final int[] a = new int[N];
...
public void m(int i) {
    synchronized (a) {
        a[i]++;
    }
}
If you are really worried about the blocking then an array of AtomicInteger is another possibility but even this feels like overkill unless a profiler tells you otherwise.
private final AtomicInteger[] a = new AtomicInteger[N];
...

public A() {
    for (int i = 0; i < N; i++)
        a[i] = new AtomicInteger(0);
}

public void m(int i) {
    a[i].incrementAndGet();
}
Edit:
I just wrote a quick, stupid test program that compares synchronizing on a single lock, synchronizing on an array of lock objects, an AtomicInteger array, and a Semaphore array. Here are the results:
synchronized on the int[] 10617ms
synchronized on an array of Object[] 1827ms
AtomicInteger array 1414ms
Semaphore array 3211ms
But the kicker is that this is with 10 threads, each doing 10 million iterations. Sure, it is faster, but unless you are truly doing millions of iterations, you won't see any noticeable performance improvement in your application. This is the definition of "premature optimization": you will be paying in code complexity, increased likelihood of bugs, added debugging time, increased maintenance costs, etc. To quote Knuth:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Now, as the OP implies in the comments, i++ is not the real operation being protected. If the real increment is a lot more time consuming (i.e. if threads hold the lock for longer), then the array of locks will be required.
Consider the code snippet below:
package sync;
public class LockQuestion {

    private String mutable;

    public synchronized void setMutable(String mutable) {
        this.mutable = mutable;
    }

    public String getMutable() {
        return mutable;
    }
}
At time Time1, thread Thread1 updates the mutable variable. Synchronization is needed in the setter in order to flush the write from the thread's local cache to main memory.
At time Time2 (Time2 > Time1, no thread contention), thread Thread2 reads the value of mutable.
The question is: do I need to put synchronized on the getter as well? It looks like this won't cause any issues, since memory should be up to date and Thread2's local cache should have been invalidated and updated by Thread1, but I'm not sure.
Rather than wonder, why not just use the atomic references in java.util.concurrent?
(and for what it's worth, my reading of happens-before does not guarantee that Thread2 will see changes to mutable unless it also uses synchronized ... but I always get a headache from that part of the JLS, so use the atomic references)
It will be fine if you make mutable volatile; see the details in the "cheap read-write lock" pattern.
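A minimal sketch of that idiom applied to the class above (a volatile field for lock-free reads; the setter stays synchronized, which matters for compound updates):
public class LockQuestion {
    private volatile String mutable;

    public synchronized void setMutable(String mutable) {
        this.mutable = mutable;
    }

    public String getMutable() {
        return mutable; // volatile read: no lock needed for visibility
    }
}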
Are you absolutely sure that the getter will be called only after the setter has been called? If so, you don't need the getter to be synchronized, since concurrent reads do not need to be synchronized.
If there is a chance that get and set can be called concurrently then you definitely need to synchronize the two.
If you are worried about performance in the reading thread, then read the value once using proper synchronization, volatile, or atomic references, and assign it to a plain old variable.
The assignment to the plain variable is guaranteed to happen after the atomic read (how else could it get the value?), and if the value will never be written by another thread again, you are all set.
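A sketch of that pattern, using an AtomicReference for the one safe read (names made up):
import java.util.concurrent.atomic.AtomicReference;

public class SnapshotRead {
    private static final AtomicReference<String> shared = new AtomicReference<>("init");

    public static void main(String[] args) {
        // one safe, atomic read ...
        String snapshot = shared.get();
        // ... then use the plain local freely; no further synchronization is
        // needed, assuming no other thread will write the value again
        System.out.println(snapshot.length());
    }
}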
I think you should start with something which is correct and optimise later when you know you have an issue. I would just use AtomicReference unless a few nano-seconds is too long. ;)
import java.util.concurrent.atomic.AtomicReference;

public class AtomicReadCost {
    public static void main(String... args) {
        AtomicReference<String> ars = new AtomicReference<String>();
        ars.set("hello");
        long start = System.nanoTime();
        int runs = 1000 * 1000 * 1000;
        // keep the result so the JIT can't trivially optimize the loop away
        int length = test(ars, runs);
        long time = System.nanoTime() - start;
        System.out.println("get() costs " + 1000 * time / runs + " ps.");
    }

    private static int test(AtomicReference<String> ars, int runs) {
        int len = 0;
        for (int i = 0; i < runs; i++)
            len = ars.get().length();
        return len;
    }
}
Prints
get() costs 1219 ps.
ps is a picosecond, which is one millionth of a microsecond.
This probably will never result in incorrect behavior, but unless you also guarantee the order in which the threads start up, you cannot guarantee that the compiler didn't reorder the read in Thread2 before the write in Thread1. More specifically, the Java runtime only has to guarantee that each thread executes as if it were running serially. So, as long as each thread produces the same output it would produce when run serially, the entire language stack (compiler, hardware, language runtime) can do pretty much whatever it wants, including allowing Thread2 to cache the result of LockQuestion.getMutable().
In practice, I would be very surprised if that ever happened. If you want to guarantee that it doesn't, declare LockQuestion.mutable as final and initialize it in the constructor. Or use the following idiom:
private static class LazySomethingHolder {
    public static final Something something = new Something();
}

public static Something getInstance() {
    return LazySomethingHolder.something;
}
I am wondering if it is possible to avoid the lost update problem, where multiple threads are updating the same data, while avoiding synchronized(x) { }.
I will be doing numerous adds and increments:
val++;
ary[x] += y;
ary[z]++;
I do not know how Java compiles these statements into bytecode, or whether a thread could be interrupted in the middle of the bytecode for one of these statements. In other words, are those statements thread safe?
Also, I know that the Vector class is synchronized, but I am not sure what that means. Will the following code be thread safe, in that the value at position i will not change between the vec.get(i) and the vec.set(...)?
import java.util.Vector;

class MyClass {
    Vector<Integer> vec = new Vector<>();
    int value = 1; // hypothetical amount added to each element

    public void someMethod() {
        for (int i = 0; i < vec.size(); i++)
            vec.set(i, vec.get(i) + value);
    }
}
Thanks in advance.
For the purposes of threading, ++ and += are treated as two operations (four for double and long), so updates can clobber one another. And not just one update: a scheduler acting at the wrong moment could wipe out milliseconds' worth of updates.
java.util.concurrent.atomic is your friend.
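For example, a sketch of the three updates from the question expressed with the atomic classes (the array size and indices are made up):
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

public class AtomicUpdates {
    static final AtomicInteger val = new AtomicInteger();
    static final AtomicIntegerArray ary = new AtomicIntegerArray(16);

    public static void main(String[] args) {
        int x = 1, y = 5, z = 2;
        val.incrementAndGet();  // atomic equivalent of val++
        ary.addAndGet(x, y);    // atomic equivalent of ary[x] += y
        ary.getAndIncrement(z); // atomic equivalent of ary[z]++
    }
}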
Your code can be made safe, assuming you don't mind each element updating individually and you don't change the size(!), as:
for (int i = 0; i < vec.size(); i++) {
    synchronized (vec) {
        vec.set(i, vec.get(i) + value);
    }
}
If you want to add resizing of the Vector as well, you'll need to move the synchronized statement outside of the for loop, and then you might as well just use a plain ArrayList. There isn't actually a great deal of use for a synchronised list.
But you could use AtomicIntegerArray:
private final AtomicIntegerArray ints = new AtomicIntegerArray(KNOWN_SIZE);
[...]
int len = ints.length();
for (int i = 0; i < len; ++i) {
    ints.addAndGet(i, value);
}
That has the advantage of no locks(!) and no boxing. The implementation is quite fun too, and you would need to understand it to do more complex updates (random number generators, for instance).
vec.set() and vec.get() are thread safe in that they will not set and retrieve values in such a way as to lose other threads' sets and gets. That does not mean your get and your set will happen without interruption in between.
If you're really going to be writing code like in the examples above, you should probably lock on something. And synchronized(vec) { } is as good as any. You're asking here for two operations to happen in sync, not just one thread safe operation.
Even java.util.concurrent.atomic will only make a single operation (a get or a set) safe on its own; a separate get followed by a set is still two operations. You need to get-and-increment as one atomic operation.
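To illustrate the difference, a sketch with AtomicInteger:
import java.util.concurrent.atomic.AtomicInteger;

public class GetAndIncrement {
    static final AtomicInteger counter = new AtomicInteger();

    public static void main(String[] args) {
        // NOT safe as a unit: another thread can update between get() and set()
        counter.set(counter.get() + 1);

        // safe: the read-modify-write happens as one atomic operation
        counter.getAndIncrement();
    }
}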