Java -> volatile and final: Volatile as flushing-all-memory-content - java

Take a look at this answer (1):
https://stackoverflow.com/a/2964277/2182302 (Java Concurrency : Volatile vs final in "cascaded" variables?)
and at my old question (2):
one java memoryFlushing volatile: A good programdesign?
As I understand it (see (2)), I can use volatile variables as a memory barrier/flusher for ALL memory content, not only for the variable marked with the volatile keyword.
But the accepted answer in (1) says that it only flushes the memory of the variable the volatile keyword is attached to.
So which is correct? And if the flush-everything principle in (2) is correct, why can't I combine volatile with final on a variable?

Neither answer is correct, because you're thinking about it the wrong way. The concept of 'flush the memory' is simply made up. It's nowhere in the Java Virtual Machine Specification. It's just Not A Thing. Yes, many CPU/architectures do work that way, but the JVM does not.
You need to program to the JVM spec. Failure to do so means you write code that works perfectly fine on your machine, every time, and then you upload it to your server and it fails there. This is a horrible scenario: Buggy code, but bugs that cannot ever be triggered by tests. Yowza, those are bad.
So, what is in the JVM spec?
Not the concept of 'flushing'. What it does have, is the concept of HBHA: Happens-Before/Happens-After. Here's how it works:
There is a list of specific interactions which establish that some line of code is defined to 'happen before' (HB/HA = Happens before/Happens after) some other line. A rough version of this list is given below.
For any two lines which have an HB/HA relationship, it is impossible for the HA line to observe state that makes it look as if the HB line has not run yet. It's basically saying that HB lines occur before HA lines, except not quite that strong: You just cannot observe the opposite (i.e. if the HB line changes variable X and the HA line does not see this change to X, that would be observing the opposite, and that is impossible). It says nothing about timing, though. HB/HA does not actually mean that lines get executed earlier or later: If you have 2 lines with an HB/HA relationship which have no effect on each other (one writes variable X, the other reads a completely different variable Y), the JVM and CPU working together are free to reorder them as much as they want.
For any two lines with no defined HB/HA relationship, the JVM and CPU are free to do whatever they please. Including things that just cannot be explained with a simplistic 'flushing' model.
For example:
int a = 0, b = 0;

void thread1() {
    a = 10;
    b = 20;
}

void thread2() {
    System.out.println(b);
    System.out.println(a);
}
In the above, no HB/HA relationship has been established between thread 1 modifying the state of a/b, and thread 2 reading them.
Therefore, it is legal for a JVM to print 20 0, even though this cannot be explained with basic flushing notions: It is legal for the JVM to 'flush' b but not a.
You are somewhat unlikely to be able to write this code and actually observe that 20/0 print on any given JVM version or hardware, but the point is: It is allowed, and some day some exotic combo of JVM+hardware+OS version+state of the machine will make this happen (it probably already exists). So if your code breaks when this sequence of events occurs, you wrote a bug.
In effect, if one line mutates state, and another line reads it, and those 2 lines have no HB/HA, you messed up, and you need to fix your bug. Even (especially!) if you can't manage to write a test that actually proves it.
The trick here is that a volatile read does establish HB/HA with the preceding write to that same volatile variable, and as HB/HA is the only mechanism the spec has to sync stuff up, yes, this has the effect of guaranteeing that you 'see all changes' made before that write. But this is not, at all, a good idea. Especially because the JVMS also says that the hotspot compiler is free to eliminate lines that have no side-effect.
So now we're going to have to get into a debate on whether 'establishes HBHA' is a side-effect. It probably is, but now we get to the rule of optimizations:
Write idiomatic code.
Whenever Azul, the OpenJDK core dev team, etc. are looking at improving the considerable optimization chops of the hotspot compiler, they look at real life code. It's like a gigantic pattern matcher: They look for patterns in code and find ways to optimize them. They don't just write detectors for everything imaginable: They strongly prefer writing optimizers for patterns that commonly show up in real life Java code. After all, what possible point is there in spending time and effort optimizing a construction that almost no Java code actually contains?
This gets us to the fundamental issue with using throw-away volatile reads as a way to establish HB/HA: Nobody does it that way, so the odds are quite high that at some point the JVMS is updated, or that the conflicting rules are simply 'interpreted' as meaning: Yeah, hotspot can eliminate a pointless read, even if it did establish an HB/HA that is now no longer there. You're also far more likely to run into JVM bugs if you do things in unique ways. After all, if you do things in ways that are well trodden, the bug would have been reported and fixed ages ago.
How to establish HB/HA:
The natural rule: Within a single thread, code cannot be observed to run in any way except sequentially, i.e. within one thread, all lines have HB/HA with each other in the obvious fashion.
synchronized blocks: If thread A exits a sync block and then thread B enters one on the same reference, then the sync-block-exit in A Happens-Before the sync-block-enter in B.
volatile reads and writes: A write to a volatile field Happens-Before every subsequent read of that same field.
Some exotic stuff, such as: thread.start() happens-before the first line of that thread's run() method, or all code in a thread is guaranteed to HB before thread.join() on that thread returns. These tend to be obvious.
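As a minimal sketch of one of these rules, the a/b example from earlier can be made safe without touching the fields at all, by relying on the guarantees around Thread.start() and Thread.join() (all actions in a thread happen-before another thread returns from join() on it). The class and method names here are made up for illustration:

```java
public class HappensBeforeDemo {
    // Plain (non-volatile) fields, as in the a/b example.
    static int a = 0, b = 0;

    /** Starts a writer thread, joins it, then reads both fields. */
    static String writeThenRead() {
        Thread writer = new Thread(() -> {
            a = 10;
            b = 20;
        });
        writer.start(); // start() happens-before the first action in the writer
        try {
            writer.join(); // every action in the writer happens-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return b + " " + a; // both writes are guaranteed to be visible here
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead()); // prints "20 10"
    }
}
```

Without the join() call, the reading thread would have no HB/HA relationship with the writer, and any combination of old and new values would be a legal result.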
Thus, to answer the question, is it good programming design?
No, it is not.
Establish HB/HA in the proper ways: Find something appropriate in java.util.concurrent and use it. From a simple lock to a queue to a fork/join pool for the entire job. Alternatively, stop sharing state. Alternatively, share state with mechanisms that are designed for concurrent access in more natural ways than HB/HA is, such as a database (transactions), or a message queue.
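To illustrate the "find something appropriate in java.util.concurrent" advice, here is a hedged sketch that hands the two values from one thread to another through a BlockingQueue instead of shared fields. The QueueHandoff and Pair names are hypothetical; only the queue API comes from the JDK:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoff {
    /** Hypothetical immutable message carrying both values together. */
    static final class Pair {
        final int a, b;
        Pair(int a, int b) { this.a = a; this.b = b; }
    }

    /** A producer thread puts one Pair on the queue; the caller takes it. */
    static Pair produceAndConsume() {
        BlockingQueue<Pair> queue = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            try {
                queue.put(new Pair(10, 20)); // put() happens-before the matching take()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            return queue.take(); // blocks until the producer has published
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        Pair p = produceAndConsume();
        System.out.println(p.b + " " + p.a); // prints "20 10"
    }
}
```

The design choice is that no mutable state is shared at all: the consumer only ever sees a fully constructed message, so there is nothing to 'flush'.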

Related

Each action in a thread happens-before every action in that thread that comes later in the program's order

The first bullet point of Memory Consistency Properties is:
Each action in a thread happens-before every action in that thread that comes later in the program's order.
I guess this is a relatively recent addition to Java memory model because Jon Skeet didn't mention it in 2011.
What exactly does this bullet point mean in practice? I'm having a hard time making sense of it. Does it simply mean "There are no concurrency issues within a single thread"? Or is there more to it?
What exactly does this bullet point mean in practice?
Everything in a thread notionally occurs in the order the program executes (in reality, instructions can be reordered to make the program run faster).
I'm having a hard time making sense of it.
Most likely you are overthinking it. Imagine you are reading the lyrics of a song. The words in each line happen after all the words before it, and all the words after that line happen after it.
Does it simply mean "There are no concurrency issues within a single thread"?
Yes, there shouldn't be, but there can be, e.g. the Spectre and Meltdown security issues exploited this.

Biased locking design decision

I am trying to understand the rationale behind biased locking and making it the default. Since reading this blog post, namely:
"Since most objects are locked by at most one thread during their lifetime, we allow that thread to bias an object toward itself"
I am perplexed... Why would anyone design a synchronized set of methods to be accessed by one thread only? In most cases, people devise certain building blocks specifically for the multi-threaded use-case, and not a single-threaded one. In such cases, EVERY lock acquisition by a thread which is not the biased one comes at the cost of a safepoint, which is a huge overhead! Could someone please help me understand what I am missing in this picture?
The reason is probably that there are a decent number of libraries and classes that are designed to be thread safe but that are still useful outside of such circumstances. This is especially true of a number of classes that predate the Collections framework. Vector and its subclasses are a good example. If you also consider that most Java programs are not multi-threaded, a biased locking scheme is in most cases an overall improvement. This is especially true of legacy code, where the use of such classes is all too common.
You are correct in a way, but there are cases when this is needed, as Holger very correctly points out in his comment. There is a so-called grace period during which no biased locking is attempted at all, so it's not like this will happen all the time. When I last looked at the code, it was 5 seconds. To prove this you would need a library that can inspect a Java object's header (jol comes to mind), since the biased-locking state is held inside the mark word. So only after those 5 seconds will an object that was locked before be biased towards the thread that locked it.
EDIT
I wanted to write a test for this, but seems like there is one already! Here is the link for it

Mutate Non thread safe collections

Can anyone please explain to me the consequences of mutating a collection in java that is not thread-safe and is being used by multiple threads?
The results are undefined and somewhat random.
With JDK collections that are designed to fail fast, you might receive a ConcurrentModificationException. This is really the only consequence that is specific to thread safety with collections, as opposed to any other class.
Problems that occur generally with thread-unsafe classes may occur:
The internal state of the collection might be corrupted.
The mutation may appear to be successful, but the changes may not, in fact, be visible to other threads at any given time. They might be invisible at first and become visible later.
The changes might actually be successful under light load, but fail randomly under heavy load with lots of threads in contention.
Race conditions might occur, as was mentioned in a comment above.
There are lots of other possibilities, none of them pleasant. Worst of all, these things tend to most commonly reveal themselves in production, when the system is stressed.
In short, you probably don't want to do that.
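As a rough sketch of the difference, the following (hypothetical) SharedListDemo has four threads append to one list. With a synchronized wrapper the final size is deterministic; the comment at the end describes what the unsafe variant is allowed to do:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SharedListDemo {
    /** Four threads each add 10,000 elements to the given list; returns its final size. */
    static int fill(List<Integer> list) {
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for all writers before reading the size
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Thread-safe wrapper: every call is synchronized, so no update is lost.
        System.out.println(fill(Collections.synchronizedList(new ArrayList<>()))); // prints 40000
        // Passing a plain new ArrayList<>() instead is the broken case: it may report
        // fewer than 40000 elements, throw inside a worker, or happen to look fine.
    }
}
```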
The most common outcome is it looks like it works, but doesn't work all the time.
This can mean you have a problem where
the code works on one machine but doesn't on another.
the code works for a while, but then something apparently unrelated changes and your program breaks.
whenever you have a bug, you don't know whether it's a multi-threading issue or not, because you are not using thread-safe data structures.
What can happen is:
you rarely/randomly get an error or strange behaviour.
your code goes into an infinite loop and stops working (HashMap used to do this).
The only options are to:
limit the amount of state which is shared between threads, ideally to none at all.
be very careful about how data is updated.
not rely on unit tests; you have to understand what the code is doing and be confident it will behave correctly in all possible situations.
The invariants of the data structure will not be guaranteed.
For example:
If thread 2 does a read whilst thread 1 is adding to the data structure, thread 1 may consider the element added while thread 2 doesn't yet see that it has been added.
There are plenty of data structures that aren't thread-safe that will still appear to function (i.e. not throw) in a multi-threaded environment, and they might even perform correctly under certain circumstances (like if you aren't doing any writes to the data structure).
To fully understand this topic, exploring the different classes of bugs that occur in concurrent systems is recommended; this short document seems like a good start:
http://pages.cs.wisc.edu/~remzi/OSTEP/threads-bugs.pdf

Java multi-threading accessing same variable

I have a Java program which creates 2 threads. Inside these 2 threads, the code tries to update the global variable abc to different values, let's say integer 1 and integer 3.
Let's say they execute the code at the same time (in the same millisecond), for example:
public class MyThread implements Runnable {
    public void run() {
        while (true) {
            if (currentTime == specificTime) {
                abc = 1; // another thread updates abc to 3
            }
        }
    }
}
In this case, how can we determine the result of the variable abc? I am very curious how Operating System schedule the execution?
(I know Synchronize should be used, but I just want to know naturally how the system will handle this kind of conflict problem.)
The operating system has little involvement in this: at the time your threads are running, the memory allocated to abc is under the control of the JVM running your program, so it's your program that is in control.
When two threads access the same memory location, the last writer wins. Which particular thread gets to be the last writer, however, is non-deterministic, unless you use synchronization.
Moreover, without you taking special care of accessing the shared data, one thread may not even see the results of the other thread writing to the abc location.
To avoid synchronization issues, you should use synchronization or one of the java.util.concurrent.atomic classes.
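A minimal sketch of the java.util.concurrent.atomic suggestion, assuming abc is an int-valued variable as in the question (the LastWriterWins class name is made up). Each write is atomic and visible to other threads, but which thread writes last is still not deterministic:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LastWriterWins {
    // Stands in for the question's shared variable abc.
    static final AtomicInteger abc = new AtomicInteger(0);

    /** Two threads write different values; returns whatever was written last. */
    static int race() {
        Thread t1 = new Thread(() -> abc.set(1));
        Thread t2 = new Thread(() -> abc.set(3));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return abc.get(); // either 1 or 3, never 0 and never a torn value
    }

    public static void main(String[] args) {
        System.out.println(race());
    }
}
```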
From Java's perspective the situation is fairly simple if abc is not volatile or accessed with appropriate synchronisation.
Let's assume that abc is 0 originally. After your two threads have updated it to respectively 1 and 3, abc could be observed in three states: 0, 1 or 3. Which value you get is not deterministic and the result may vary from one run to the other.
Depends on the operating system, running environment etc.
Some environments will actually stop you from doing this - known as thread safety.
Otherwise the results are totally unpredictable which is why it is so dangerous to do this.
It mainly just depends on which thread updated it last for what the value will be. One thread will get CPU cycles before the other to do the atomic operation first.
Also, I don't think that operating systems go as far as to schedule threads, because in most operating systems it is the program that is responsible for them; and without explicit calls like synchronise, or a thread pool model, I think the order of execution is pretty hard to predict. It's a very environment-dependent thing.
From the system's perspective the result will depend on many software, hardware and run-time factors that cannot be known in advance. From this perspective there is no conflict nor a problem.
From the programmer's perspective the result is not deterministic and therefore a problem/conflict. The conflict needs to be resolved at design-time.
In this case, how can we determine the result of the variable abc? I
am very curious how Operating System schedule the execution?
The result will not be deterministic, as the value will be the last written one. You can not make any guarantee about the result. The execution is scheduled like any other one. As you demand no synchronization in your code the JVM will not enforce anything for you.
I know Synchronize should be used, but I just want to know naturally
how the system will handle this kind of conflict problem.
Simply put: it won't, as for the system there is no conflict. Only for you, the programmer, will problems occur, since you will eventually run into a data race and non-deterministic behavior. It is completely up to you.
Just add the volatile modifier to your variable; then updates to it will be visible to all threads, and a thread reading it will get its actual value. volatile means that the value will always be up to date for all threads accessing it.

Java bytecode manipulation to detect potential deadlocks

I've been caught by yet another deadlock in our Java application and started thinking about how to detect potential deadlocks in the future. I had an idea of how to do this, but it seems almost too simple.
I'd like to hear people's views on it.
I plan to run our application for several hours in our test environment, using a typical data set.
I think it would be possible to perform bytecode manipulation on our application such that, whenever it takes a lock (e.g. entering a synchronized block), details of the lock are added to a ThreadLocal list.
I could write an algorithm that, at some later point, compares the lists for all threads and checks if any contain the same pair of locks in opposite order - this would be reported as a deadlock possibility. Again, I would use bytecode manipulation to add this periodic check to my application.
So my question is this: is this idea (a) original and (b) viable?
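The recording-and-comparison idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the LockOrderRecorder class and its acquired/released hooks are hypothetical stand-ins for whatever the bytecode instrumentation would call, locks are identified by a plain name that does not contain "->", and only pairs taken in opposite order are detected (not longer cycles):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LockOrderRecorder {
    // Ordered pairs "outer->inner" observed across all threads.
    private static final Set<String> edges = ConcurrentHashMap.newKeySet();
    // Locks currently held by the calling thread, innermost last.
    private static final ThreadLocal<Deque<String>> held =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Hypothetical hook: call when the current thread takes a lock. */
    static void acquired(String lock) {
        for (String outer : held.get()) {
            if (!outer.equals(lock)) { // ignore reentrant acquisition
                edges.add(outer + "->" + lock);
            }
        }
        held.get().addLast(lock);
    }

    /** Hypothetical hook: call when the current thread releases a lock. */
    static void released(String lock) {
        held.get().removeLastOccurrence(lock);
    }

    /** True if some pair of locks was ever taken in both orders. */
    static boolean hasInversion() {
        for (String edge : edges) {
            String[] pair = edge.split("->");
            if (edges.contains(pair[1] + "->" + pair[0])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        acquired("A"); acquired("B"); released("B"); released("A");
        System.out.println(hasInversion()); // prints false
        acquired("B"); acquired("A"); released("A"); released("B");
        System.out.println(hasInversion()); // prints true
    }
}
```

Note that an inversion reported this way is only a possibility of deadlock: the two orders may never actually overlap in time.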
This is something that we talked about when I took a course in concurrency. I'm not sure if your implementation is original, but the concept of analysis to determine potential deadlock is not unique. There are dynamic analysis tools for Java, such as JCarder. There is also research into some analysis that can be done statically.
Admittedly, it's been a couple of years since I've looked around. I don't think JCarder was the specific tool we talked about (at least, the name doesn't sound familiar, but I couldn't find anything else). But the point is that analysis to detect deadlock isn't an original concept, and I'd start by looking at research that has produced usable tools as a starting point - I would suspect that the algorithms, if not the implementation, are generally available.
I have done something similar to this with Lock by supplying my own implementation.
These days I use the actor model, so there is little need to lock the data (as I have almost no shared mutable data)
In case you didn't know, you can use the Java MX bean to detect deadlocked threads programmatically. This doesn't help you in testing but it will help you at least better detect and recover in production.
ThreadMXBean threadMxBean = ManagementFactory.getThreadMXBean();
long[] deadLockedThreadIds = threadMxBean.findMonitorDeadlockedThreads();
// log the condition or even interrupt threads if necessary
...
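A slightly fuller, self-contained sketch of the same idea, with the null check that findMonitorDeadlockedThreads() requires (it returns null when no deadlock is found). The DeadlockCheck class and its method name are illustrative only:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
    /** Returns the names of threads deadlocked on object monitors (empty if none). */
    static String[] deadlockedThreadNames() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findMonitorDeadlockedThreads(); // null when there is no deadlock
        if (ids == null) {
            return new String[0];
        }
        ThreadInfo[] infos = bean.getThreadInfo(ids);
        String[] names = new String[infos.length];
        for (int i = 0; i < infos.length; i++) {
            // An entry is null if the thread died between the two calls.
            names[i] = infos[i] == null ? "<terminated>" : infos[i].getThreadName();
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : deadlockedThreadNames()) {
            System.out.println("Deadlocked: " + name);
        }
    }
}
```

Such a check could be run periodically from a monitoring thread; what to do with the result (log, alert, interrupt) is left open here.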
That way you can find some deadlocks, but never prove their absence. I'd rather develop a static checking tool, a kind of bytecode analyzer, fed with annotations for each synchronized method. The annotations should show the place of the annotated method in the resource graph. The task is then to find loops in the graph. Each loop means a deadlock.
