I started going through a Spring tutorial and it has me initialize an atomic number. I wasn't sure what an atomic number was, so I googled around and could not find a straightforward answer. What is an atomic number in Java?
Atomic means that update operations on that type are guaranteed to be done atomically (in one step, in one go). Atomic types are valuable in a concurrent context (as "better volatiles").
If more than one thread executes code like this, the counter can end up lower than it should be.
int count;

void increment() {
    int previous = count;
    count = previous + 1;
}
This is because it takes two steps to increment the counter, and two threads can both read the same count before either stores the new value (note that rewriting this into a one-liner doesn't change this fact; the JVM has to perform two steps regardless of how you write it). Forcing multiple steps (e.g. the read of the count and the storing of the new count) to always happen as one unit is called "making the operation atomic".
"Atomic" values are objects that wrap values and exposes methods that conveniently provide common atomic operations, such as AtomicInteger#increment().
Reference: Java Atomic Variables
Traditional multi-threading approaches use locks to protect shared resources. Synchronization objects like semaphores provide mechanisms for the programmer to write code that doesn't modify a shared resource concurrently. These synchronization approaches block other threads while one thread is modifying a shared resource. Obviously, blocked threads are not doing meaningful work while waiting for the lock to be released.
Atomic operations, by contrast, are based on non-blocking algorithms in which threads waiting for shared resources are not suspended. Atomic operations are implemented using hardware primitives like compare-and-swap (CAS), an atomic instruction used in multi-threading for synchronization.
Java provides atomic classes that support lock-free, thread-safe programming on single variables. These classes are defined in the java.util.concurrent.atomic package. Some of the key classes are AtomicBoolean, AtomicInteger, AtomicLong, AtomicIntegerArray, AtomicLongArray and AtomicReference.
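As a rough illustration of how CAS works (a sketch, not the actual JDK implementation), an increment can be written as a retry loop around compareAndSet:

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next; // our CAS won; no thread was ever blocked
            }
            // another thread changed the value in the meantime; loop and try again
        }
    }
}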
Related
I want to track getVariableAndLogAccess(RequestInfo requestInfo) using the code below. Will it be thread safe if only these two methods access variable?
What is the standard way to make it thread safe?
public class MyAccessLog {
    private int recordIndex = 0;
    private int variableWithAccessTracking = 42;
    private final Map<Integer, RequestInfo> requestsLog = new HashMap<>();

    public int getVariableAndLogAccess(RequestInfo requestInfo) {
        Integer myID = recordIndex++;
        int variableValue = variableWithAccessTracking;
        requestInfo.saveValue(variableValue);
        requestsLog.put(myID, requestInfo);
        return variableValue;
    }

    public void setValueAndLog(RequestInfo requestInfo, int newValue) {
        Integer myID = recordIndex++;
        variableWithAccessTracking = newValue;
        requestInfo.saveValue(newValue);
        requestsLog.put(myID, requestInfo);
    }

    /* other methods */
}
Will it be thread safe if only these two methods access variable?
No.
For instance, if two threads call setValueAndLog, they might end up with the same myID value.
What is the standard way to make it thread safe?
You should either replace your int with an AtomicInteger, or use a lock or a synchronized block to prevent concurrent modifications.
As a rule of thumb, using an atomic variable such as the previously mentioned AtomicInteger is better than using locks, since locks can involve the operating system. Calling the operating system is like bringing in the lawyers - both are best avoided for things you can solve yourself.
Note that if you use locks or synchronized blocks, both the setter and getter need to use the same lock. Otherwise the getter could be accessed while the setter is still updating the variable, leading to concurrency errors.
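For illustration, here is a minimal sketch of the "same lock" point (SharedValue and its methods are hypothetical names):

public class SharedValue {
    private final Object lock = new Object();
    private int value;

    public void setValue(int newValue) {
        synchronized (lock) {   // the setter...
            value = newValue;
        }
    }

    public int getValue() {
        synchronized (lock) {   // ...and the getter use the same lock object
            return value;
        }
    }
}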
Will it be thread safe if only these two methods access variable?
Nope.
Intuitively, there are two reasons:
An increment consists of a read followed by a write. The JLS does not guarantee that the two will be performed as an atomic operation, and indeed, Java implementations don't do that.
Modern multi-core systems implement memory access with fast local memory caches and slower main memory. This means that one thread is not guaranteed to see the results of another thread's memory writes ... unless there are appropriate "memory barrier" instructions to force main-memory writes / reads.
Java will only insert these instructions if the memory model says it is necessary. (Because ... they slow the code down!)
Technically, the JLS has a whole chapter describing the Java Memory Model, and it provides a set of rules that allow you to reason about whether memory is being used correctly. For the higher level stuff, you can reason based on the guarantees provided by AtomicInteger, etcetera.
What is the standard way to make it thread safe?
In this case, you could use either an AtomicInteger instance, or you could synchronize using primitive object locking (i.e. the synchronized keyword) or a Lock object.
@Malt is right. Your code is not even close to being thread safe.
You can use AtomicInteger for your counter, but LongAdder would be more suitable for your case, as it is optimized for cases where you need to count things and read the result less often than you update it. LongAdder also has the same thread-safety guarantees as AtomicInteger.
From the Javadoc on LongAdder:
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
This is a common approach to log in a thread safe way:
For the counter, use an AtomicInteger counter with the counter.addAndGet(1) method.
Add data using public synchronized void putRecord(Data data){ /**/}
If you only use recordIndex as a handle for the record, you can replace the map with a synchronized list: List<RequestInfo> list = Collections.synchronizedList(new LinkedList<>());
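Putting those pieces together, a minimal sketch might look like this (AccessLog and logAccess are illustrative names; RequestInfo is the type from the question):

import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class AccessLog {
    private final AtomicInteger counter = new AtomicInteger();
    private final List<RequestInfo> log =
            Collections.synchronizedList(new LinkedList<>());

    public int logAccess(RequestInfo requestInfo) {
        int id = counter.addAndGet(1); // thread-safe, unique id
        log.add(requestInfo);          // the synchronized list handles concurrent adds
        return id;
    }
}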
I am learning multi-thread programming from 'Java Concurrency in Practice'.
At one point, the book says that even an innocuous-looking increment operation is not thread safe, as it consists of three different operations: read, modify, and write.
class A {
    private int c;

    public void increment() {
        ++c;
    }
}
So increment statement is not atomic, hence not thread safe.
My question is: if an environment is really concurrent (i.e. multiple threads are able to execute their program statements at exactly the same time), then even a statement which is really atomic can't be thread safe, since multiple threads can read the same value.
So how can having an atomic statement help in achieving thread safety in a concurrent environment?
True concurrency does not exist when it comes to modifying state.
This post has some good descriptions of Concurrency and Parallelism.
As stated by #RitchieHindle in that post:
Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant. Eg. multitasking on a single-core machine.
As an example, the danger of non-atomic operations is that one thread might read the value, another might modify the value, and then the original thread might modify and write the value (thus negating the modification the second thread did).
Atomic operations do not allow other operations access to the state while in the middle of the atomic operation. If, for example, the increment operator were atomic, it would read, modify, and write without any other thread having access to that variable's state while those operations took place.
You can use AtomicInteger. The linked Javadoc says (in part) that it is an int value that may be updated atomically. AtomicInteger also provides addAndGet(int), which atomically adds the given value to the current value.
private AtomicInteger ai = new AtomicInteger(1); // <-- or another initial value

public int increment() {
    return ai.addAndGet(1); // <-- or another increment value
}
That can (for example) allow you to guarantee write-order consistency across multiple threads. Consider that ai might represent (or be part of) some static (or global) resource. If a value is thread-local, then you don't need to consider atomicity.
Mutexes are pretty common in many programming languages, like e.g. C/C++. I miss them in Java. However, there are multiple ways I could write my own class Mutex:
Using a simple synchronized keyword on Mutex.
Using a binary semaphore.
Using atomic variables, as discussed here.
...?
What is the fastest (best runtime) way? I think synchronized is most common, but what about performance?
Mutexes are pretty common in many programming languages, like e.g. C/C++. I miss them in Java.
Not sure I follow you (especially because you give the answer in your question).
public class SomeClass {

    private final Object mutex = new Object();

    public void someMethodThatNeedsAMutex() {
        synchronized (mutex) {
            // here you hold the mutex
        }
    }
}
Alternatively, you can simply make the whole method synchronized, which is equivalent to using this as the mutex object:
public class SomeClass {

    public synchronized void someMethodThatNeedsAMutex() {
        // here you hold the mutex
    }
}
What is the fastest (best runtime) way?
Acquiring / releasing a monitor is not going to be a significant performance issue per se (you can read this blog post to see an analysis of the impact). But if you have many threads fighting for the lock, it will create contention and degrade performance.
In that case, the best strategy is not to use mutexes at all: use "lock-free" algorithms if you are mostly reading data (as pointed out by Marko in the comments, lock-free code relies on CAS operations, which may involve retrying writes many times if you have lots of writing threads, eventually leading to worse performance), or, even better, avoid sharing too much stuff across threads.
The opposite is the case: Java designers solved it so well that you don't even recognize it: you don't need a first-class Mutex object, just the synchronized modifier.
If you have a special case where you want to juggle your mutexes in a non-nesting fashion, there's always the ReentrantLock and java.util.concurrent offers a cornucopia of synchronization tools that go way beyond the crude mutex.
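For instance, here is a minimal ReentrantLock sketch (LockedCounter is just an illustrative name):

import java.util.concurrent.locks.ReentrantLock;

public class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    public void increment() {
        lock.lock();          // acquire the mutex
        try {
            count++;          // critical section
        } finally {
            lock.unlock();    // always release, even if an exception is thrown
        }
    }
}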
In Java, each object can be used as a mutex.
These objects are typically named "lock" or "mutex".
You can create such an object yourself, which is the preferred variant because it avoids external access to that lock:
// usually a field in the class
private Object mutex = new Object();

// later in methods
synchronized (mutex) {
    // mutually exclusive section for all code that synchronizes
    // on this mutex object
}
Faster still is to avoid the mutex altogether, by thinking about what happens if another thread reads a stale value. In some situations this would produce wrong calculation results; in others it results only in a minimal delay (but is faster than synchronizing).
A detailed explanation can be found in the book Java Concurrency in Practice.
What is the fastest (best runtime) way?
That depends on many things. For example, ReentrantLock used to perform better under contention than using synchronized, but that changed when a new HotSpot version, optimizing synchronized locking, was released. So there's nothing inherent in any way of locking that favors one flavor of mutexes over the other (from a performance point of view) - in fact, the "best" solution can change with the data you're processing and the machine you're running on.
Also, why did the inventors of Java not solve this question for me?
They did - in several ways: synchronized, Locks, atomic variables, and a whole slew of other utilities in java.util.concurrent.
You can run micro-benchmarks of each variant: atomic, synchronized, and locked. As others have pointed out, it depends a lot on the machine and the number of threads in use. In my own experiments incrementing long integers, I found that with only one thread on a Xeon W3520, synchronized wins over atomic: Atomic/Sync/Lock: 8.4/6.2/21.8, in nanos per increment operation.
This is of course a borderline case, since there is never any contention. Of course, in that case, we can also look at an unsynchronized single-threaded long increment, which comes out six times faster than atomic.
With 4 threads I get 21.8/40.2/57.3. Note that these are all increments across all threads, so we actually see a slowdown. It gets a bit better for locks with 64 threads: 22.2/45.1/45.9.
Another test on a 4-way/64T machine using Xeon E7-4820 yields for 1 thread: 9.1/7.8/29.1, 4 threads: 18.2/29.1/55.2 and 64 Threads: 53.7/402/420.
One more data point, this time a dual Xeon X5560, 1T: 6.6/5.8/17.8, 4T: 29.7/81.5/121, 64T: 31.2/73.4/71.6.
So, on a multi-socket machine, there is a heavy cache coherency tax.
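For reference, here is a rough (and deliberately naive) sketch of such a micro-benchmark; a harness like JMH is preferable for serious measurements, and the class name, thread count, and iteration count are just illustrative:

import java.util.concurrent.atomic.AtomicLong;

public class IncrementBenchmark {
    static final AtomicLong atomicCounter = new AtomicLong();
    static long syncCounter = 0;
    static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        int threads = 4;
        int iterations = 10_000_000;
        System.out.println("atomic:       " + run(threads, iterations, atomicCounter::incrementAndGet) + " ns/op");
        System.out.println("synchronized: " + run(threads, iterations, () -> {
            synchronized (lock) { syncCounter++; }
        }) + " ns/op");
    }

    // Runs the operation on the given number of threads and reports
    // wall-clock time divided by the total number of increments.
    static long run(int threads, int iterations, Runnable op) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    op.run();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return (System.nanoTime() - start) / ((long) threads * iterations);
    }
}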
You can use java.util.concurrent.locks.Lock in the same way as a mutex, or java.util.concurrent.Semaphore. But using the synchronized keyword is a better way :-)
In a multi-threaded environment like Android, where a simple int variable may be manipulated by multiple threads, are there circumstances in which it is still justified to use an int as a data member?
An int as a local variable, limited to the scope of the method that has exclusive access to it (and thus start & finish of modifying it is always in the same thread), makes perfect sense performance-wise.
But as a data member, even if wrapped by an accessor, it can run into the well known concurrent interleaved modification problem.
So it looks like to "play it safe" one could just use AtomicInteger across the board. But this seems awfully inefficient.
Can you bring an example of thread-safe int data member usage?
Is there any justification not to ALWAYS use AtomicInteger as data members?
Yes, there are good reasons not to always use AtomicInteger. AtomicInteger can be at least an order of magnitude slower (probably more) than a local int, because of the volatile construct and the other Unsafe constructs used to set/get the underlying int value. volatile means that you cross a memory barrier every time you access an AtomicInteger, which causes a cache memory flush on the processor in question.
Also, just because you have made all of your fields AtomicInteger does not protect you against race conditions when multiple fields are being accessed. There is just no substitute for making good decisions about when to use volatile, synchronized, and the Atomic* classes.
For example, if you had two fields in a class that you wanted to access in a reliable manner in a thread program, then you'd do something like:
synchronized (someObject) {
    someObject.count++;
    someObject.total += someObject.count;
}
If both of those members were AtomicInteger, then you'd be accessing volatile twice, crossing two memory barriers instead of just one. Also, the assignments are faster than the Unsafe operations inside of AtomicInteger. And because of the data races between the two operations (as opposed to the synchronized block above), you might not get the right values for total.
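For contrast, here is a sketch of the same idea with two AtomicInteger fields (Totals is a hypothetical class name); each call is individually atomic, but the pair is not:

import java.util.concurrent.atomic.AtomicInteger;

public class Totals {
    private final AtomicInteger count = new AtomicInteger();
    private final AtomicInteger total = new AtomicInteger();

    public void add() {
        int c = count.incrementAndGet();
        total.addAndGet(c); // another thread can interleave between these two lines,
                            // so total can drift from what the synchronized version gives
    }
}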
Can you bring an example of thread-safe int data member usage?
Aside from making it final, there is no mechanism for a thread-safe int data member except for marking it volatile or using AtomicInteger. There is no magic way to paint thread-safety on all of your fields. If there was then thread programming would be easy. The challenge is to find the right places to put your synchronized blocks. To find the right fields that should be marked with volatile. To find the proper places to use AtomicInteger and friends.
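As one example of a thread-safe int data member, a volatile field works when a single writer thread publishes values that many reader threads consume (TemperatureSensor and its methods are hypothetical names):

public class TemperatureSensor {
    // Written by one updater thread, read by many reader threads.
    // volatile guarantees readers see the latest write, and a plain int
    // write is itself atomic, so this single-writer pattern needs no lock.
    private volatile int latestReading;

    public void publish(int reading) { // call only from the single updater thread
        latestReading = reading;
    }

    public int latest() {
        return latestReading;
    }
}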
If you have effectively immutable ints, you can get away with not ensuring synchronization, at the cost of possibly recalculating the value. An example is hashCode:
int hash = 0;

public int hashCode() {
    if (hash == 0) {
        hash = calculateHashCode(); // needs to always be the same for each Object
    }
    return hash;
}
The obvious tradeoff here is the possibility of multiple calculations of the same hash value, but if the alternative is a synchronized hashCode, the implications can be far worse.
This is technically thread-safe though redundant.
It depends on how it is used with respect to other data. A class encapsulates a behavior, so often a variable is almost meaningless without the others. In such cases it might be better to protect (*) data members that belong together (or the whole object), instead of just one integer. If you do this, then AtomicInteger is an unnecessary performance hit.
(*) using the common thread safety mechanisms: mutex, semaphore, monitor etc.
Thread safety is not only about atomic int assignments, you need to carefully design your locking patterns to get consistency in your code.
If you have an Account class with a public data member Balance, consider the following simple code.
Account a;
...
int withdrawal = 100;
if (a.Balance >= withdrawal) {
    // No atomic operations in the world can save you from another thread
    // withdrawing some balance here
    a.Balance -= withdrawal;
} else {
    // Handle error
}
To be really frank: in real life, atomic assignments are rarely enough to solve my concurrency issues.
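Here is a sketch of how the check and the update can be made to happen as one unit (a synchronized withdraw method on a hypothetical Account class):

public class Account {
    private int balance;

    public synchronized boolean withdraw(int amount) {
        // The test and the subtraction happen under the same lock, so no
        // other thread can change the balance between them.
        if (balance >= amount) {
            balance -= amount;
            return true;
        }
        return false;
    }
}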
I guess Google saw the OP and updated their documentation on the subject to be clearer:
"An AtomicInteger is used in applications such as atomically incremented counters, and cannot be used as a replacement for an Integer."
https://developer.android.com/reference/java/util/concurrent/atomic/AtomicInteger
When to use volatile keyword vs synchronization in multithreading?
Use volatile to guarantee that each read access to a variable will see the latest value written to that variable. Use synchronized whenever you need values to be stable for multiple instructions. (Note that this does not necessarily mean multiple statements; the single statement:
var++; // NOT thread safe!
is not thread-safe even if var is declared volatile.) You need to do this:
synchronized(LOCK_OBJECT){var++;}
See here for a nice summary of this issue.
Volatile only ensures the read operation always gives the latest state from memory across threads. However, it does not ensure any write safety / ordering of operations, i.e. two threads can update the volatile variable in any random order. Also it does not ensure that multiple operations on the variable are atomic.
However a synchronized block ensures latest state and write safety. Also the access and update to variable is atomic inside a synchronized block.
The above, however, is true only if all accesses and updates to the variable in question use the same lock object, so that multiple threads never access the variable at the same time.
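A small sketch that combines the two ideas (Worker is a hypothetical class): a volatile flag for a single-value read/write, and a synchronized block for the compound read-modify-write:

public class Worker {
    private volatile boolean running = true; // single reads/writes: volatile is enough
    private int processed;                   // compound ++: needs a lock

    public void stop() {
        running = false; // immediately visible to the loop below
    }

    public void process() {
        while (running) {
            synchronized (this) {
                processed++; // read + add + write performed as one unit
            }
        }
    }
}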
That's a pretty broad question. The best answer I can give is to use synchronized when performing multiple actions that must be seen by other threads as occurring atomically—either all or none of the steps have occurred.
For a single action, volatile may be sufficient; it acts as a memory barrier to ensure visibility of the change to other threads.