I was trying to understand the use of the volatile keyword in Java. I understand that it makes writes go to main memory rather than staying in a thread's local cache.
But is that really useful? I am using multithreading, and shouldn't I be using synchronized, since I don't want other threads to see dirty reads? So in what exact situations is volatile useful and most important to use?
Please give some examples.
synchronized is much more expensive than plain volatile.
volatile is useful when you just need to read/write a single variable and don't care about the atomicity of compound operations.
synchronized is useful when you need to perform complex operations, update several variables, or set one variable based on a comparison with another, and ensure the atomicity of such an operation. It is also used for higher-level synchronization such as conditions, i.e. synchronized/wait/notify in Java; Lock/Condition can be used for that too. A sketch of both cases follows below.
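For illustration, here is a minimal sketch of both cases (the class and field names are made up): a volatile flag is enough for a simple signal between threads, while a check-then-act over related variables needs synchronized.

class Worker {
    // volatile is enough: a single variable, written by one thread, read by another
    private volatile boolean shutdownRequested = false;

    void requestShutdown() { shutdownRequested = true; }

    void runLoop() {
        while (!shutdownRequested) {
            // do work; the loop is guaranteed to see the write above
        }
    }

    // synchronized is needed: a compound check-then-act over two related variables
    private int balance = 100;
    private int reserved = 0;

    synchronized boolean reserve(int amount) {
        if (balance - reserved >= amount) { // check ...
            reserved += amount;             // ... then act, as one atomic unit
            return true;
        }
        return false;
    }
}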
For an even better explanation of using volatile variables, see the following link with JB Nizet's answer. It complements the answer posted by Zbynek well and further explains the relation between volatile, atomic variables and complexity. Hope this helps.
Related
(All of this is in a multi-threaded environment)
I have a scenario in my code which requires me to update two static variables in a class. Based on the latest combination of the variables, the code might enter one flow or another. I've used synchronized to allow only one thread to update the variables at a time. Since I also want any other thread to see only the most recent values, I've declared the two variables as volatile too. I've been reading about the differences between the two keywords, and I believe that both of them have to be used to achieve what I want. The read accesses are not synchronized; only the part where I write to the variables is synchronized.
I am new to multithreading in Java and would like to know if this is good practice. Is there a better (more efficient) way of achieving this in Java?
You can use ReentrantReadWriteLock to handle this type of scenario.
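For example, a rough sketch of that idea (the class and field names are placeholders, not from the question): the single writer takes the write lock, readers take the read lock, and the lock provides both mutual exclusion and visibility, so volatile is no longer required.

import java.util.concurrent.locks.ReentrantReadWriteLock;

class SharedFlags {
    private static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();
    private static int modeA; // no volatile needed: the lock gives visibility
    private static int modeB;

    static void update(int a, int b) {
        LOCK.writeLock().lock();
        try {
            modeA = a;
            modeB = b; // both fields change as one consistent unit
        } finally {
            LOCK.writeLock().unlock();
        }
    }

    static int[] snapshot() {
        LOCK.readLock().lock();
        try {
            return new int[] { modeA, modeB }; // readers never see a half-updated pair
        } finally {
            LOCK.readLock().unlock();
        }
    }
}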
If we have a volatile variable, we are guaranteed that if two threads read it, they will get the value from main memory, and that if a write happens and then a read, the read will see the change. But what is guaranteed, and what is not guaranteed, when many threads are reading and writing the volatile variable?
NO. volatile won't do that.
Volatile: Almost Useless for Multi-Threaded Programming
There is a widespread notion that the keyword volatile is good for
multi-threaded programming. I've seen interfaces with volatile
qualifiers justified as "it might be used for multi-threaded
programming". I thought was useful until the last few weeks, when it
finally dawned on me (or if you prefer, got through my thick head)
that volatile is almost useless for multi-threaded programming. I'll
explain here why you should scrub most of it from your multi-threaded
code.
...
There may be languages where it may have such an effect, but in C, the answer is an emphatic NO.
EDIT:
Now that the language is specified as Java, the answer is different, since Java implements its own memory model, and the volatile keyword does have a significant impact. See Do you ever use the volatile keyword in Java? and many other questions.
From documentation page:
Package java.util.concurrent.atomic Description:
A small toolkit of classes that support lock-free thread-safe programming on single variables. In essence, the classes in this package extend the notion of volatile values, fields, and array elements to those that also provide an atomic conditional update operation of the form
boolean compareAndSet(expectedValue, updateValue);
With many options available in the atomic package, like
AtomicBoolean
AtomicInteger
AtomicLongArray
etc., can I use these AtomicXXX classes and gradually get rid of volatile variables in my legacy code?
EDIT:
Keep volatile for single-writer & multiple-reader operations in different threads (my conclusion after reading many articles), and for multi-writer, single-reader cases (as per #erickson's comments).
Use AtomicXXX for multiple updates & multiple reads among multiple threads to avoid synchronization; it adds atomicity on top of volatile semantics.
My thought process has changed with #erickson's comments. volatile supports multiple writers & a single reader, but can fail with multiple writers and multiple readers. I am confused about this concept.
Yes, an AtomicXXX instance provides the same visibility guarantees that you get from accessing a volatile field.
However, the AtomicXXX classes do more than volatile fields, and accordingly they are a bit more expensive to use. Specifically, they provide operations that are more like an optimized synchronized block than a volatile read or write: you increment-and-get, or compare-and-swap, which are multiple actions performed atomically. Volatile variables on their own don't provide any atomicity.
So, switching from volatile to AtomicXXX isn't necessarily a good move. Consider if it makes sense given how data are used, and perhaps do some profiling on a prototype to see what performance impact it will have.
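As a small illustration of that difference (the counter class is hypothetical): count++ on a volatile field is a read-modify-write and can lose updates under contention, whereas AtomicInteger makes the increment a single atomic step.

import java.util.concurrent.atomic.AtomicInteger;

class Counters {
    private volatile int volatileCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    void unsafeIncrement() {
        volatileCount++; // read, add, write: two threads can interleave here and lose an update
    }

    void safeIncrement() {
        atomicCount.incrementAndGet(); // one atomic read-modify-write
    }
}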
Is anybody aware of any real-life use of the AtomicLongFieldUpdater class?
I have read the description but I have not quite grasped the meaning of it.
Why do I want to know that? Curiosity and for OCPJP preparation.
Thanks in advance.
You can think of a cost ladder for the following:
ordinary long: cheap, but unsafe for multi-threaded access
volatile long: more expensive, safe for multi-threaded access, atomic operations not possible
AtomicLong: most expensive, safe for multi-threaded access, atomic operations possible
(When I say 'unsafe' or 'not possible' I mean 'without an external mechanism like synchronization' of course.)
In the case where multi-threaded access is needed, but most operations are simple reads or writes and only a few atomic operations are needed, you can create one static instance of AtomicLongFieldUpdater and use it whenever an atomic update is required. The memory/runtime overhead is then similar to a simple volatile variable, except for the atomic operations, which cost about the same as (or slightly more than) the ordinary AtomicLong operations.
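A minimal sketch of that pattern (the class and field names are invented for illustration): the field stays a plain volatile long for cheap reads and writes, and one shared updater is used only where an atomic update is required.

import java.util.concurrent.atomic.AtomicLongFieldUpdater;

class Stats {
    // plain volatile field: cheap reads and simple writes
    private volatile long hits = 0L;

    // one static updater shared by all Stats instances
    private static final AtomicLongFieldUpdater<Stats> HITS =
            AtomicLongFieldUpdater.newUpdater(Stats.class, "hits");

    long hits() { return hits; }                      // ordinary volatile read

    void recordHit() { HITS.incrementAndGet(this); }  // atomic update only where needed
}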
Here is a nice little tutorial.
The reason you would use e.g. AtomicLongFieldUpdater in favor of AtomicLong is simply to reduce the heap cost. Internally both work pretty much the same at the compareAndSet level, and both use sun.misc.Unsafe in the end.
Say you have a certain class that is instantiated 1000k times. With AtomicLong you'd create 1000k AtomicLong objects. With AtomicLongFieldUpdater, on the other hand, you'd create 1 constant AtomicLongFieldUpdater and 1000k long primitives, which of course needs much less heap space.
Is anybody aware of any real-life use of the AtomicLongFieldUpdater class?
I've never used this class myself, but doing a find-usages search on my workspace I see a couple of "real life" instances of its use:
com.google.common.util.concurrent.AtomicDouble uses it to atomically modify its internal volatile long field, which stores the bits of a double obtained via Double.doubleToRawLongBits(...). Pretty cool.
net.sf.ehcache.Element uses it to atomically update the hitCount field.
I have read the description but I have not quite grasped the meaning of it.
It basically provides the same functionality as AtomicLong, but on a volatile field that lives in another class. The memory footprint of the AtomicLongFieldUpdater is less than that of AtomicLong in that you configure one updater instance per field, shared by all objects of the class, so there is lower memory overhead but more CPU overhead (albeit maybe small) from the reflection.
The javadocs say:
This class is designed for use in atomic data structures in which several fields of the same node are independently subject to atomic updates.
Sure, but then I'd just use multiple Atomic* fields. Just about the only reason I'd use the class is if there were an existing class that I could not change whose field I wanted to increment atomically.
Of course. I have been reading the Alibaba Druid source recently, and I found AtomicLongFieldUpdater used widely in this project.
// stats
private volatile long recycleErrorCount = 0L;
private volatile long connectErrorCount = 0L;
protected static final AtomicLongFieldUpdater<DruidDataSource> recycleErrorCountUpdater
= AtomicLongFieldUpdater.newUpdater(DruidDataSource.class, "recycleErrorCount");
protected static final AtomicLongFieldUpdater<DruidDataSource> connectErrorCountUpdater
= AtomicLongFieldUpdater.newUpdater(DruidDataSource.class, "connectErrorCount");
As defined above, the properties recycleErrorCount and connectErrorCount are used to count how many times errors occur.
Quite a lot of DataSource objects (the class that holds the properties above) will be created during an application's lifetime, in which case using ALFU reduces heap consumption noticeably compared to using AtomicLong.
Atomics are usually used in parallel programming.
Under the work-stealing mode, it only supports async, finish, forasync, isolated, and atomic variables.
You can view atomic as a safe protection against data races and the other problems you need to be concerned about in parallel programming.
I know that by using the volatile keyword in Java we get a kind of weak synchronization (it allows visibility of updates but does not provide actual locking). Is there any situation where volatile should be preferred over actual locking when implementing concurrent programs? There is a somewhat similar question on SO which describes volatile as a synchronization mechanism, but that one is tagged C#.
If the shared state consists of a single field, and you don't use any get-and-set construct (like i++, for example) to assign it, then volatile is good enough. Most volatile usages can be replaced by AtomicXxx types, though (which provide atomic get-and-set operations).
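For instance, a sketch under those assumptions (the Config class is hypothetical): one thread publishes a new immutable object through a volatile reference and other threads simply read it, whereas an update that depends on the previous value calls for AtomicReference.compareAndSet.

import java.util.concurrent.atomic.AtomicReference;

final class Config {
    final int timeoutMs;
    Config(int timeoutMs) { this.timeoutMs = timeoutMs; }
}

class ConfigHolder {
    // single field, plain assignment: volatile is good enough
    private volatile Config current = new Config(1000);

    Config get() { return current; }
    void publish(Config c) { current = c; }

    // if the new value depends on the old one, use an atomic get-and-set instead
    private final AtomicReference<Config> ref = new AtomicReference<>(new Config(1000));

    void halveTimeout() {
        Config prev, next;
        do {
            prev = ref.get();
            next = new Config(prev.timeoutMs / 2);
        } while (!ref.compareAndSet(prev, next)); // retry if another thread won the race
    }
}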
In short, you should prefer to avoid locks wherever they are not necessary, since locks expose your program to deadlocks and hurt performance by excluding concurrency from critical parts of the code. So, whenever the situation permits, by all means rely on volatile; if all you additionally need are atomic compound operations like compare-and-swap, use AtomicReference. Fall back to synchronized only for the scenarios where it is the only option. For example, if you need to lazily initialize a heavy object, you'll need locks to prevent double initialization, but again, not to fetch the already initialized instance (the double-check idiom).
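As a sketch of that last point, the double-check idiom with a volatile field might look like this (HeavyObject is a placeholder name):

class Lazy {
    // volatile is required so other threads see a fully constructed instance
    private static volatile HeavyObject instance;

    static HeavyObject get() {
        HeavyObject local = instance;      // first check: no lock on the fast path
        if (local == null) {
            synchronized (Lazy.class) {
                local = instance;
                if (local == null) {       // second check: only one thread initializes
                    local = new HeavyObject();
                    instance = local;
                }
            }
        }
        return local;
    }
}

class HeavyObject {
    // stands in for something expensive to construct
}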
Volatile guarantees that all threads will see the last write of a variable by any other thread; that's it. There's no mutual exclusion involved. If you synchronize both the read and the write method of an instance variable, then you don't have to make that variable volatile (all threads will see the most recent write).
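As a small sketch of that last sentence (a hypothetical holder class): when both accessors synchronize on the same monitor, the field does not need to be volatile.

class Holder {
    private long value; // no volatile: the monitor provides the visibility

    synchronized long getValue() { return value; }

    synchronized void setValue(long v) { value = v; }
}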