I usually work with multithreading in Java.
I started with Peterson's and Dekker's mutual exclusion algorithms, using volatile to guarantee that variable values are not cached away from other threads, and everything was fine.
Then I tried semaphores, again with volatile variables, and that was fine too.
Nowadays I usually use synchronized methods and blocks, but when I only need a single variable to be accessed in mutual exclusion with "volatile" visibility, I use the java.util.concurrent.atomic classes such as AtomicIntegerArray or AtomicInteger.
Then, if you read about this in the Java API:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html
You will find this:
"The specifications of these methods enable implementations to employ efficient machine-level atomic instructions that are available on contemporary processors. However on some platforms, support may entail some form of internal locking. Thus the methods are not strictly guaranteed to be non-blocking -- a thread may block transiently before performing the operation"
This is something I find a little confusing.
Does it mean that it is not safe to use atomic objects?
Could this be the cause of unexpected behavior in a concurrent program?
Related
Are AtomicIntegers considered synchronization primitives, or is it just the methods provided by Java (wait(), notify(), etc.)?
I am confused about the definition of primitives, as AtomicIntegers can operate on an int and provide lock-free thread-safe programming, without the use of synchronized.
AtomicInteger is a class. Its methods are... well, methods. Neither one of those would be considered a synchronization primitive.
The compareAndSet method, which is also used by incrementAndGet and other such methods, uses Unsafe.compareAndSwapInt (on OpenJDK 7, which is what I have handy). That's a native method — so it could well be considered a primitive. And in fact, on modern CPUs, it translates to a CAS instruction, so it's a primitive all the way down to the hardware level.
The class also relies on volatile's memory visibility, which is also a synchronization primitive.
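As a rough sketch of what that buys you (my own simplified version, not the actual JDK source), an atomic increment can be expressed as a retry loop around compareAndSet:

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasIncrementSketch {
        private final AtomicInteger value = new AtomicInteger(0);

        // Simplified illustration of how incrementAndGet can be expressed in
        // terms of compareAndSet: read the current value, try to CAS in
        // current + 1, and retry if another thread won the race in between.
        public int incrementAndGet() {
            for (;;) {
                int current = value.get();
                int next = current + 1;
                if (value.compareAndSet(current, next)) {
                    return next;
                }
                // CAS failed: another thread updated the value; loop and retry.
            }
        }
    }

The real implementation delegates to the native CAS instruction, but the retry-until-CAS-succeeds shape is the same idea.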
I think this question is a bit "vague", but "language primitive" typically refers to language elements that are part of the core of the language.
In other words: the keywords and their associated semantics. In that sense, I would see the synchronized keyword (in its two meanings) and the volatile keyword as the only "primitives" regarding multithreading.
Of course, classes such as Object, and therefore all of its methods like wait() and notify(), are also an essential part of Java (one which you can't avoid in the first place). The same can be said about the Thread class.
Long story short: you can differentiate between concepts that exist as language keywords (and are thus handled by the compiler) and "on-top" concepts that come as "normal" classes. And as the answer from yshavit nicely describes, certain aspects of AtomicInteger map directly onto the "native" side of things. So the real answer is maybe that, as said, the term "primitive" doesn't provide much help in describing or differentiating concepts in Java multithreading.
Regarding your first query:
Are AtomicIntegers considered synchronization primitives, or is it just the methods provided by Java (wait(), notify(), etc.)?
No. AtomicInteger is neither a method nor a synchronization primitive.
AtomicInteger is a class with methods. Have a look at the Oracle documentation page on the atomic package:
A small toolkit of classes that support lock-free thread-safe programming on single variables. In essence, the classes in this package extend the notion of volatile values, fields, and array elements to those that also provide an atomic conditional update operation of the form:
boolean compareAndSet(expectedValue, updateValue);
The classes in this package also contain methods to get and unconditionally set values, as well as a weaker conditional atomic update operation weakCompareAndSet
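As a small illustration of such a conditional update (the class and field names here are just made up for the example):

    import java.util.concurrent.atomic.AtomicBoolean;

    public class OneTimeInit {
        private final AtomicBoolean initialized = new AtomicBoolean(false);

        // compareAndSet(false, true) succeeds for exactly one caller, so the
        // one-time setup below runs at most once even if several threads call
        // ensureInitialized() at the same time.
        public void ensureInitialized() {
            if (initialized.compareAndSet(false, true)) {
                // ... perform one-time setup here ...
            }
        }
    }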
Regarding your second query:
I am confused about the definition of primitives, as AtomicIntegers can operate on an int and provide lock-free thread-safe programming, without the use of synchronized.
One key note:
The scope of synchronized is broader than that of AtomicInteger or other AtomicXXX variables. With synchronized methods or blocks, you can protect a critical section of code which contains many statements.
The compareAndSet method is not a general replacement for locking. It applies only when critical updates for an object are confined to a single variable.
Atomic classes are not general purpose replacements for java.lang.Integer and related classes. However, AtomicInteger extends Number to allow uniform access by tools and utilities that deal with numerically-based classes.
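To make the difference in scope concrete, here is a sketch of my own (not from the documentation): a synchronized block can guard a critical section spanning several statements and variables, while an AtomicInteger only makes updates to that single variable atomic:

    import java.util.concurrent.atomic.AtomicInteger;

    public class ScopeComparison {
        private final Object lock = new Object();
        private int balanceA = 100;
        private int balanceB = 0;

        private final AtomicInteger hits = new AtomicInteger();

        // synchronized protects a multi-statement critical section: both
        // balances are updated together and no thread can observe the state
        // between the two assignments.
        public void transfer(int amount) {
            synchronized (lock) {
                balanceA -= amount;
                balanceB += amount;
            }
        }

        // AtomicInteger only covers updates confined to this one variable.
        public void recordHit() {
            hits.incrementAndGet();
        }
    }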
From the documentation page:
Package java.util.concurrent.atomic Description:
A small toolkit of classes that support lock-free thread-safe programming on single variables. In essence, the classes in this package extend the notion of volatile values, fields, and array elements to those that also provide an atomic conditional update operation of the form
boolean compareAndSet(expectedValue, updateValue);
With so many options available in the atomic package, like
AtomicBoolean
AtomicInteger
AtomicLongArray
etc., can I use these AtomicXXX classes and slowly get rid of volatile variables in my legacy code?
EDIT:
Keep volatile for single-writer & multiple-reader operations across threads (my conclusion after reading many articles), and for multi-writer, single-reader cases (as per @erickson's comments).
Use AtomicXXX for multiple writers & multiple readers among multiple threads to avoid synchronization; they add atomicity on top of volatile variables.
My thinking has changed after @erickson's comments: volatile supports multiple writers & a single reader, but can fail with multiple writers and multiple readers. I am confused about this concept.
Yes, an AtomicXXX instance provides the same visibility guarantees that you get from accessing a volatile field.
However, AtomicXXX do more than volatile fields, and accordingly, they are a bit more expensive to use. Specifically, they provide operations that are more like an optimized synchronized block than a volatile read or write. You increment-and-get, or compare-and-swap—multiple actions, atomically. Volatile variables don't provide any atomicity.
So, switching from volatile to AtomicXXX isn't necessarily a good move. Consider if it makes sense given how data are used, and perhaps do some profiling on a prototype to see what performance impact it will have.
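To illustrate the atomicity point (a demo of my own, with arbitrary counts): count++ on a volatile int is a separate read and write, so concurrent increments can be lost, while incrementAndGet performs the whole update atomically:

    import java.util.concurrent.atomic.AtomicInteger;

    public class LostUpdateDemo {
        private static volatile int volatileCount = 0;
        private static final AtomicInteger atomicCount = new AtomicInteger();

        public static void main(String[] args) throws InterruptedException {
            Runnable task = () -> {
                for (int i = 0; i < 100_000; i++) {
                    volatileCount++;               // read + write: not atomic, increments can be lost
                    atomicCount.incrementAndGet(); // one atomic read-modify-write
                }
            };
            Thread t1 = new Thread(task);
            Thread t2 = new Thread(task);
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            // volatileCount usually ends up below 200000; atomicCount is always 200000.
            System.out.println("volatile: " + volatileCount + ", atomic: " + atomicCount.get());
        }
    }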
Is it safe to use the :volatile-mutable qualifier with deftype in a single-threaded program? This is a follow up to this question, this one, and this one. (It's a Clojure question, but I added the "Java" tag because Java programmers are likely to have insights about it, too.)
I've found that I can get a significant performance boost in a program I'm working on by using :volatile-mutable fields in a deftype rather than atoms, but I'm worried because the docstring for deftype says:
Note well that mutable fields are extremely difficult to use
correctly, and are present only to facilitate the building of higher
level constructs, such as Clojure's reference types, in Clojure
itself. They are for experts only - if the semantics and implications
of :volatile-mutable or :unsynchronized-mutable are not immediately
apparent to you, you should not be using them.
In fact, the semantics and implications of :volatile-mutable are not immediately apparent to me.
However, chapter 6 of Clojure Programming, by Emerick, Carper, and Grand says:
"Volatile" here has the same meaning as the volatile field modifier in
Java: reads and writes are atomic and must be executed in
program order; i.e., they cannot be reordered by the JIT compiler or
by the CPU. Volatiles are thus unsurprising and thread-safe — but
uncoordinated and still entirely open to race conditions.
This seems to imply that as long as accesses to a single volatile-mutable deftype field all take place within a single thread, there is nothing special to worry about. (Nothing special, in that I still have to be careful about how I handle state if I might be using lazy sequences.) So if nothing introduces parallelism into my Clojure program, there should be no special danger in using deftype with :volatile-mutable.
Is that correct? What dangers am I not understanding?
That's correct, it's safe. You just have to be sure that your context is really single-threaded. Sometimes it's not that easy to guarantee that.
There's no risk in terms of thread-safety or atomicity when using a volatile mutable (or just mutable) field in a single-threaded context, because there's only one thread so there's no chance of two threads writing a new value to the field at the same time, or one thread writing a new value based on outdated values.
As others have pointed out in the comments, you might want to simply use an :unsynchronized-mutable field to avoid the cost introduced by volatile. That cost comes from the fact that every write must be committed to main memory instead of thread-local memory. See this answer for more info about this.
At the same time, you gain nothing by using volatile in a single-threaded context, because there's no chance of one thread writing a new value that will not be "seen" by another thread reading the same field.
That's what a volatile is intended for, but it's irrelevant in a single-thread context.
Also note that Clojure 1.7 introduced volatile!, intended to provide a "volatile box for managing state" as a faster alternative to atom, with a similar interface but without its compare-and-swap semantics. The only difference when using it is that you call vswap! and vreset! instead of swap! and reset!. I would use that instead of deftype with ^:volatile-mutable if I needed a volatile.
I'm new to threading in Java and I need to access a data structure from a few active threads. I've heard that java.util.concurrent.ConcurrentHashMap is thread-friendly. Do I need to use synchronized(map){} while accessing a ConcurrentHashMap, or will it handle locking itself?
It handles the locks itself, and in fact you have no access to them (there is no other option).
You can use synchronized in special cases for writes, but it is very rare that you should need to do this, e.g. if you need to implement your own putIfAbsent because the cost of creating the object is high.
Using synchronized for reads would defeat the purpose of using a concurrent collection.
ConcurrentHashMap is suited only to cases where you don't need any more atomicity than is provided out of the box. If, for example, you need to get a value, do something with it, and then set a new value, all as one atomic operation, this cannot be achieved without external locking.
In all such cases nothing can replace explicit locks in your code, and it is pure waste to use this implementation instead of a basic HashMap.
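For instance (a sketch of my own, assuming every writer goes through the same lock), a get-then-put update on a ConcurrentHashMap still needs external locking to be atomic as a whole:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CompoundUpdate {
        private final Map<String, Integer> counts = new ConcurrentHashMap<>();
        private final Object lock = new Object();

        // Each individual get/put is thread-safe on its own, but the sequence
        // is not; without the lock, two threads could read the same old value
        // and one increment would be lost.
        public void increment(String key) {
            synchronized (lock) {
                Integer current = counts.get(key);
                counts.put(key, current == null ? 1 : current + 1);
            }
        }
    }

(On Java 8 and later the same effect can be achieved with ConcurrentHashMap's own compute or merge methods, but the general point about compound actions stands.)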
Short answer: no you don't need to use synchronized(map).
Long answer:
All the operations provided by ConcurrentHashMap are thread-safe, and you can call them without worrying about locking.
However, if you need a sequence of operations to be atomic as a whole, you will still need some sort of locking on the client side.
No, you don't need to; but if your code relies on the map's synchronization details (for example, locking on the map object itself), you should use Collections.synchronizedMap instead. From the javadoc of ConcurrentHashMap:
This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.
Actually it won't synchronize on the whole data structure, only on parts of it (some buckets).
This implies that ConcurrentHashMap's iterators are weakly consistent and that the size of the map can be inaccurate. (But on the other hand, its put and get operations are still consistent and the throughput is higher.)
There is one more important feature of ConcurrentHashMap to note besides the concurrency it provides: its fail-safe (weakly consistent) iterator. People often use ConcurrentHashMap just because they want to put/remove entries while iterating over the map.
Collections.synchronizedMap(Map) is the other option, but in that case a ConcurrentModificationException may be thrown.
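A small demo of that difference (my own example): removing entries while iterating a ConcurrentHashMap is allowed, whereas the same pattern on a plain HashMap is fail-fast and typically throws ConcurrentModificationException:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class IterationDemo {
        public static void main(String[] args) {
            Map<String, Integer> chm = new ConcurrentHashMap<>();
            chm.put("a", 1);
            chm.put("b", 2);
            chm.put("c", 3);

            // Weakly consistent iterator: removing while iterating is fine.
            for (String key : chm.keySet()) {
                if ("b".equals(key)) {
                    chm.remove(key);
                }
            }
            System.out.println(chm); // {a=1, c=3}

            Map<String, Integer> plain = new HashMap<>(chm);
            try {
                // The plain HashMap iterator is fail-fast and typically throws
                // ConcurrentModificationException on the next iteration step.
                for (String key : plain.keySet()) {
                    plain.remove(key);
                }
            } catch (java.util.ConcurrentModificationException e) {
                System.out.println("HashMap iterator failed fast: " + e);
            }
        }
    }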
I know that using the volatile keyword in Java gives us a kind of weak synchronization (it guarantees visibility of updates but does not provide actual locking). Is there any situation where volatile should be preferred over actual locking when implementing concurrent programs? There is a somewhat similar question on SO that describes volatile as a synchronization mechanism, but it was tagged C#.
If the shared state consists of a single field, and you don't use any get-and-set construct (like i++, for example) to assign it, then volatile is good enough. Most volatile usages can be replaced by AtomicXxx types, though (which do provide atomic get-and-set operations).
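A typical case where volatile alone is enough (the names are just illustrative): a flag that one thread writes and another thread only reads:

    public class Worker implements Runnable {
        // Written by one thread, read by another; volatile guarantees the
        // reader sees the update. There is no read-modify-write here, so no
        // lock or AtomicBoolean is needed.
        private volatile boolean running = true;

        public void shutdown() {
            running = false;
        }

        @Override
        public void run() {
            while (running) {
                // ... do some work ...
            }
        }
    }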
In short, you should prefer to avoid locks wherever they are not necessary, since locks expose your program to deadlocks and hurt performance by excluding concurrency from critical parts of the code. So, whenever the situation permits, by all means rely on volatile; if you additionally need atomic compound operations like compare-and-swap, use AtomicReference. Fall back to synchronized only for the scenarios where it is the only option. For example, if you need to lazily initialize a heavy object, you'll need locking to prevent double initialization, but not to fetch the already-initialized instance (the double-check idiom).
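For the lazy-initialization case mentioned above, the double-check idiom combines a volatile field with a lock that is only taken on the slow path (a sketch with placeholder names):

    public class HeavyHolder {
        private static volatile Heavy instance;

        static Heavy getInstance() {
            Heavy result = instance;
            if (result == null) {                 // first check, no locking
                synchronized (HeavyHolder.class) {
                    result = instance;
                    if (result == null) {         // second check, under the lock
                        result = new Heavy();
                        instance = result;        // volatile write publishes the object safely
                    }
                }
            }
            return result;
        }

        // Placeholder for some expensive-to-construct object.
        static class Heavy { }
    }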
Volatile guarantees that all threads will see the latest write to a variable by any other thread; that's it. There's no locking involved. If you synchronize both the read and the write methods for an instance variable, then you don't have to make that variable volatile (all threads will see the most recent write).
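For example (my own sketch), a field whose reads and writes both go through synchronized methods does not also need to be volatile:

    public class SyncedValue {
        private int value; // no volatile needed: every access is synchronized

        public synchronized int get() {
            return value;
        }

        public synchronized void set(int newValue) {
            value = newValue;
        }
    }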