Is there any justification not to ALWAYS use AtomicInteger as data members? - java

In a multi-threaded environment like Android, where a simple int variable may be manipulated by multiple threads, are there circumstances in which it is still justified to use an int as a data member?
An int as a local variable, limited to the scope of the method that has exclusive access to it (and thus start & finish of modifying it is always in the same thread), makes perfect sense performance-wise.
But as a data member, even if wrapped by an accessor, it can run into the well known concurrent interleaved modification problem.
So it looks like to "play it safe" one could just use AtomicInteger across the board. But this seems awfully inefficient.
Can you bring an example of thread-safe int data member usage?

Is there any justification not to ALWAYS use AtomicInteger as data members?
Yes, there are good reasons not to always use AtomicInteger. AtomicInteger can be at least an order of magnitude slower (probably more) than a local int, because of the volatile construct and the other Unsafe constructs being used to set/get the underlying int value. volatile means that you cross a memory barrier every time you access an AtomicInteger, which causes a cache memory flush on the processor in question.
Also, just because you have made all of your fields to be AtomicInteger does not protect you against race conditions when multiple fields are being accessed. There is just no substitute for making good decisions about when to use volatile, synchronized, and the Atomic* classes.
For example, if you had two fields in a class that you wanted to access in a reliable manner in a threaded program, then you'd do something like:
synchronized (someObject) {
    someObject.count++;
    someObject.total += someObject.count;
}
If both of those members were AtomicInteger, then you'd be accessing volatile twice, so crossing 2 memory barriers instead of just 1. Also, the assignments are faster than the Unsafe operations inside of AtomicInteger. Also, because of the data race between the two operations (as opposed to the synchronized block above), you might not get the right values for total.
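To make the two-field point concrete, here is a minimal sketch. The class name CountAndTotal and the helper run() are made up for the illustration; the idea is only that one lock covers both updates, so the pair always changes together.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: both fields are guarded by the same intrinsic lock.
class CountAndTotal {
    private int count;
    private long total;

    // One lock acquisition covers both updates, so no other thread can
    // slip in between the increment and the addition.
    synchronized void update() {
        count++;
        total += count;
    }

    synchronized int count() { return count; }
    synchronized long total() { return total; }

    // Runs `threads` threads that each call update() `perThread` times.
    static CountAndTotal run(int threads, int perThread) {
        CountAndTotal c = new CountAndTotal();
        List<Thread> ts = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) c.update();
            });
            ts.add(t);
            t.start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return c;
    }
}
```

With N calls in total, total should end up as 1 + 2 + ... + N. Two separate AtomicInteger fields would not guarantee that, because another thread could increment count between the two steps.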
Can you bring an example of thread-safe int data member usage?
Aside from making it final, there is no mechanism for a thread-safe int data member except for marking it volatile or using AtomicInteger. There is no magic way to paint thread-safety on all of your fields. If there was then thread programming would be easy. The challenge is to find the right places to put your synchronized blocks. To find the right fields that should be marked with volatile. To find the proper places to use AtomicInteger and friends.

If you have effectively immutable ints you can get away with not ensuring synchronization, at the cost of possibly repeating the calculation. An example is hashCode:
int hash = 0;
public int hashCode() {
    if (hash == 0) {
        hash = calculateHashCode(); // needs to always be the same for each Object
    }
    return hash;
}
The obvious tradeoff here is the possibility of multiple calculations for the same hash value, but if the alternative is a synchronized hashCode that can have far worse implications.
This is technically thread-safe though redundant.
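A slightly more defensive variant of the same idiom reads the field once into a local and returns the local. java.lang.String caches its hash code in a similar way. The Point class below is hypothetical; the reason for the single read is that, as far as I understand the Java Memory Model, two unsynchronized reads of the same field are not guaranteed to see values in a consistent order.

```java
// Hypothetical immutable value class caching its hash without locking.
final class Point {
    private final int x, y;
    private int hash; // 0 means "not computed yet"; deliberately not volatile

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int hashCode() {
        int h = hash;          // read the field exactly once
        if (h == 0) {
            h = 31 * x + y;    // deterministic, so racing threads compute the same value
            hash = h;          // benign race: every writer stores the same value
        }
        return h;              // return the local, never re-read the field
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }
}
```

If the computed hash happens to be 0, it is simply recomputed on every call, which is correct but not cached.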

It depends on how it is used wrt. other data. A class encapsulates a behavior, so often a variable is almost meaningless without the others. In such cases it might be better to protect(*) data members that belong together (or the whole object), instead of just one integer. If you do this, then AtomicInteger is an unnecessary performance hit
(*) using the common thread safety mechanisms: mutex, semaphore, monitor etc.

Thread safety is not only about atomic int assignments, you need to carefully design your locking patterns to get consistency in your code.
If you have an Account class with a public data member Balance, consider the following simple code.
Account a;
...
int withdrawal = 100;
if (a.Balance >= withdrawal)
{
    // No atomic operations in the world can save you from another thread
    // withdrawing some balance here
    a.Balance -= withdrawal;
}
else
{
    // Handle error
}
To be really frank, in real life, having atomic assignments is rarely enough to solve my concurrency issues.
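One way to close that gap, sketched here with a hypothetical Account class (private field, made-up method names), is to put the check and the update under the same lock so they behave as one step:

```java
// Hypothetical Account: the check and the update happen under one lock.
class Account {
    private int balance;

    Account(int initial) {
        balance = initial;
    }

    // Returns true if the withdrawal succeeded, false if funds were insufficient.
    synchronized boolean withdraw(int amount) {
        if (balance >= amount) {
            balance -= amount;
            return true;
        }
        return false;
    }

    synchronized int balance() {
        return balance;
    }
}
```

Because the field is private, callers cannot do the check-then-act sequence outside the lock.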

I guess Google saw the OP and updated their documentation on the subject to be clearer:
"An AtomicInteger is used in applications such as atomically incremented counters, and cannot be used as a replacement for an Integer."
https://developer.android.com/reference/java/util/concurrent/atomic/AtomicInteger

Related

Is static int thread safe if it is incremented in a single method?

I want to track getVariableAndLogAccess(RequestInfo requestInfo) using the code below. Will it be thread safe if only these two methods access variable?
What is the standard way to make it thread safe?
public class MyAccessLog {
    private int recordIndex = 0;
    private int variableWithAccessTracking = 42;
    private final Map<Integer, RequestInfo> requestsLog = new HashMap<>();

    public int getVariableAndLogAccess(RequestInfo requestInfo) {
        Integer myID = recordIndex++;
        int variableValue = variableWithAccessTracking;
        requestInfo.saveValue(variableValue);
        requestsLog.put(myID, requestInfo);
        return variableValue;
    }

    public void setValueAndLog(RequestInfo requestInfo, int newValue) {
        Integer myID = recordIndex++;
        variableWithAccessTracking = newValue;
        requestInfo.saveValue(newValue);
        requestsLog.put(myID, requestInfo);
    }
    /* other methods */
}
Will it be thread safe if only these two methods access variable?
No.
For instance, if two threads call setValueAndLog, they might end up with the same myID value.
What is the standard way to make it thread safe?
You should either replace your int with an AtomicInteger, use a lock, or use a synchronized block to prevent concurrent modifications.
As a rule of thumb, using an atomic variable such as the previously mentioned AtomicInteger is better than using locks since locks involve the operating system. Calling the operating system is like bringing in the lawyers - both are best avoided for things you can solve yourself.
Note that if you use locks or synchronized blocks, both the setter and getter need to use the same lock. Otherwise the getter could be accessed while the setter is still updating the variable, leading to concurrency errors.
Will it be thread safe if only these two methods access variable?
Nope.
Intuitively, there are two reasons:
An increment consists of a read followed by a write. The JLS does not guarantee that the two will be performed as an atomic operation. And indeed, neither do Java implementations.
Modern multi-core systems implement memory access with fast local memory caches and slower main memory. This means that one thread is not guaranteed to see the results of another thread's memory writes ... unless there are appropriate "memory barrier" instructions to force main-memory writes / reads.
Java will only insert these instructions if the memory model says it is necessary. (Because ... they slow the code down!)
Technically, the JLS has a whole chapter describing the Java Memory Model, and it provides a set of rules that allow you to reason about whether memory is being used correctly. For the higher level stuff, you can reason based on the guarantees provided by AtomicInteger, etcetera.
What is the standard way to make it thread safe?
In this case, you could use either an AtomicInteger instance, or you could synchronize using primitive object locking (i.e. the synchronized keyword) or a Lock object.
@Malt is right. Your code is not even close to being thread safe.
You can use AtomicInteger for your counter, but LongAdder would be more suitable for your case, as it is optimized for cases where you count things and read the result less often than you update it. LongAdder also has the same thread safety assurance as AtomicInteger.
From java doc on LongAdder:
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
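For reference, a minimal LongAdder sketch (the HitStats class is made up). One caveat worth keeping in mind: increment() returns void, so a LongAdder suits statistics-style counting but cannot hand each caller a unique ID the way recordIndex in the question is meant to.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical statistics counter built on LongAdder.
class HitStats {
    private final LongAdder hits = new LongAdder();

    // increment() returns nothing, so there is no per-call value to use as an ID.
    void record() {
        hits.increment();
    }

    // sum() adds up the internal cells; it is meant to be read less often than updated.
    long total() {
        return hits.sum();
    }
}
```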
This is a common approach to log in a thread safe way:
For counter use AtomicInteger counter with counter.addAndGet(1) method.
Add data using public synchronized void putRecord(Data data){ /**/}
If you only use recordIndex as a handler for the record you can replace a map with a synchronized list: List list = Collections.synchronizedList(new LinkedList());
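Putting the suggestions together, one possible thread-safe version of the question's class could look like the sketch below. RequestInfo is a stand-in, since its real definition is not shown in the question, and logSize() is a hypothetical helper added only for checking.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the question's RequestInfo, whose definition is not shown.
class RequestInfo {
    private volatile int value;
    void saveValue(int v) { value = v; }
    int value() { return value; }
}

class MyAccessLog {
    private final AtomicInteger recordIndex = new AtomicInteger();
    private volatile int variableWithAccessTracking = 42;
    private final Map<Integer, RequestInfo> requestsLog = new ConcurrentHashMap<>();

    public int getVariableAndLogAccess(RequestInfo requestInfo) {
        int myID = recordIndex.getAndIncrement(); // unique per call, even under contention
        int variableValue = variableWithAccessTracking;
        requestInfo.saveValue(variableValue);
        requestsLog.put(myID, requestInfo);
        return variableValue;
    }

    public void setValueAndLog(RequestInfo requestInfo, int newValue) {
        int myID = recordIndex.getAndIncrement();
        variableWithAccessTracking = newValue;
        requestInfo.saveValue(newValue);
        requestsLog.put(myID, requestInfo);
    }

    // Hypothetical helper, only here so the sketch can be checked.
    int logSize() { return requestsLog.size(); }
}
```

This makes each individual step safe, but the ID order is not guaranteed to match the order in which the variable was actually read or written. If the log must reflect that order exactly, making both methods synchronized is the simpler route.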

Java: What is an atomic number?

I started going through Spring tutorial and it has me initialize an atomic number. I wasn't sure what an atomic number was, so I googled around and could not find a straight forward answer. What is an atomic number in Java?
Atomic means that the update operations done on that type are guaranteed to be done atomically (in one step, in one go). Atomic types are valuable to use in a concurrent context (as "better volatiles").
If more than one thread executes code like this, the counter can end up lower than it should be.
int count;
void increment() {
    int previous = count;
    count = previous + 1;
}
This is because it takes two steps to increment the counter, and a thread can read the count before another thread can store the new value (note that re-writing this into a one-liner doesn't change this fact; the JVM has to perform two steps regardless of how you write it). Forcing multiple steps to always happen in one unit (e.g. the read of the count and storing of the new count) is called "making the operation atomic".
"Atomic" values are objects that wrap values and expose methods that conveniently provide common atomic operations, such as AtomicInteger#incrementAndGet().
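As a small illustration, the sketch below increments one AtomicInteger from several threads. The class and method names are made up; with a plain int in place of the AtomicInteger, the final value could come out lower than expected.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    // Increments a shared AtomicInteger from several threads and returns the final value.
    static int countWith(int threads, int perThread) {
        AtomicInteger count = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    count.incrementAndGet(); // read-modify-write as one atomic step
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return count.get();
    }
}
```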
Reference: Java Atomic Variables
Traditional multi-threading approaches use locks to protect shared resources. Synchronization objects like Semaphores provide mechanisms for the programmer to write code that doesn't modify a shared resource concurrently. The synchronization approaches block other threads when one of the threads is modifying a shared resource. Obviously, blocked threads are not doing meaningful work while waiting for the lock to be released.
Atomic operations, in contrast, are based on non-blocking algorithms in which threads waiting for shared resources don't get postponed. Atomic operations are implemented using hardware primitives like compare and swap (CAS), which are atomic instructions used in multi-threading for synchronization.
Java supports atomic classes that support lock free, thread safe programming on single variables. These classes are defined in java.util.concurrent.atomic package. Some of the key classes include AtomicBoolean, AtomicInteger, AtomicLong, AtomicIntegerArray, AtomicLongArray and AtomicReference.
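To show what a CAS-based, non-blocking update looks like in practice, here is a minimal sketch of a retry loop that tracks a running maximum. MaxTracker is a made-up name; compareAndSet is the real AtomicInteger method.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free "running maximum" built on a CAS retry loop.
class MaxTracker {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    // Read the current value, decide on the new one, and try to swap it in.
    // If another thread changed the value in between, the swap fails and we retry.
    void offer(int candidate) {
        int current;
        do {
            current = max.get();
            if (candidate <= current) {
                return; // nothing to update
            }
        } while (!max.compareAndSet(current, candidate));
    }

    int max() {
        return max.get();
    }
}
```

No thread ever blocks here; a thread that loses the race simply loops and tries again with the fresh value.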

Thread safety static variables

I read "thread safety for static variables", and I understand it and agree with it.
But in the book for the Java SE 7 Programmer exam (804) there is a passage I don't follow.
Can someone explain it to me?
public void run() {
    synchronized(SharedCounter.class) {
        SharedCounter.count++;
    }
}
However, this code is inefficient since it acquires and releases the
lock every time just to increment the value of count.
can someone explain to me the above quote
The code is not particularly inefficient. It could be slightly more efficient. The main problem is that it is fragile: if any developer forgets to synchronize access to the global SharedCounter.count variable, you have a thread-safety issue. Indeed, since count++ is not an atomic operation, and since changing the value of a variable without synchronization doesn't make the variable's new value visible to other threads, every access to count must be done in a synchronized way.
The synchronization is thus not correctly encapsulated in a single class. Generally, accessing global public fields is bad design. It's even worse in a multi-threaded environment.
Using an AtomicInteger solves the encapsulation problem, and makes it slightly more efficient at the same time.
Synchronizing can be expensive, so it shouldn't be used carelessly. There are better ways such as using AtomicInteger.incrementAndGet(); which uses different mechanisms to handle the synchronization.
It's inefficient compared to using intrinsic CPU instructions which can do atomic increments without using a lock. See http://en.wikipedia.org/wiki/Fetch-and-add and http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
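A minimal sketch of the encapsulated alternative: the counter becomes a private AtomicInteger, so there is no public field that a caller could touch without synchronization. The method names are assumptions, not taken from the book.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: the counter is private, so callers cannot bypass the atomic operation.
final class SharedCounter {
    private static final AtomicInteger count = new AtomicInteger();

    private SharedCounter() {}

    // Atomically adds one and returns the new value, without taking a lock.
    static int increment() {
        return count.incrementAndGet();
    }

    static int get() {
        return count.get();
    }
}
```

A thread's run() method would then just call SharedCounter.increment().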

Is there a viable use case where Java synchronized keyword is better than Atomics?

I've not much experience with threading, but I wrote a nifty non-blocking sequential Id generator with Atomics.... It got me a very significant performance boost in testing. Now I wonder why anyone would use synchronized since it is so much slower... is there a reason on modern 64 bit multi-core hardware? Others are asking me about Atomics now. I would love to be able to tell them to never use that keyword unless they are deploying to ancient hardware.
Because you can't do multiple actions exclusively using only atomics (well, technically you can, because you can implement a "lock" using atomics, but I think that's beside the point). You also can't do a blocking wait using atomics (you can do a busy wait, but that's almost always a bad idea).
Here's an exercise for the OP: write a program which writes timestamped log messages using multiple threads to the same file, where the messages must show up in the file in timestamp order. Implement this using only atomics, but without re-inventing ReentrantLock/synchronized.
Now I wonder why anyone would use synchronized since it is so much slower...
Maybe because speed isn't everything.
In fact, if you looked objectively at the overall performance benefit of using your "nifty" generator on a real application, I suspect you will find that it is too small to matter. Profiling the application will tell you that.
Then there is the issue of whether your benchmarking is actually valid; i.e. whether you are doing the things that are needed to avoid misleading effects like JVM warmup anomalies, optimization anomalies, and so on. And whether you are (actually) measuring the contended and uncontended cases.
Is there a viable use case where Java synchronized keyword is better than Atomics?
That's easy.
Any situation where you require exclusive access to one or more data structures to perform a sequence of operations, or an operation that is not intrinsically thread-safe. The AtomicXxx types don't support this kind of thing.
I would love to be able to tell them to never use that keyword unless they are deploying to ancient hardware.
Don't tell them that. It is incorrect. In fact, if you are new to Java threads, I recommend that you read "Java Concurrency in Practice" by Goetz et al before you start advising people.
Depends on what you are doing - if you only need the functionality that an atomic provides you, then yes, there would be no need to do the same work yourself (using the synchronized keyword). However, many multithreaded applications do things a lot more complicated than just needing increment a number atomically.
For example, you might need a unit of work to be done where you modify several data structures in memory and all of that has to happen without interference - you could use a synchronized function or block for that.
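A small sketch of such a unit of work, using a hypothetical JobTracker that keeps every job in exactly one of two collections. Moving a job touches both structures, so the move is done under one lock; atomics on the individual structures would not keep the two in step.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical job tracker: a job is always in exactly one of the two collections.
class JobTracker {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Set<String> running = new HashSet<>();

    synchronized void submit(String job) {
        pending.addLast(job);
    }

    // Moves one job from pending to running as a single indivisible step.
    // Returns null if nothing is pending.
    synchronized String startNext() {
        String job = pending.pollFirst();
        if (job != null) {
            running.add(job);
        }
        return job;
    }

    synchronized int pendingCount() { return pending.size(); }
    synchronized int runningCount() { return running.size(); }
}
```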
From what I understand, the synchronized keyword is actually a moderately heavyweight recursive (re-entrant) lock.
For instance, the following (horrible) code would not deadlock:
public static Object lock = new Object();
int recurCount = 0;
public int fLocktorial(int n) {
    synchronized(lock) {
        recurCount++;
        if (n <= 0)
            return 1;
        return n * fLocktorial(n-1);
    }
}
Implementing this requires the maintenance of additional state and logic within the lock, which may contribute to its lower performance over atomics and other primitives. However, it does allow you to grab locks arbitrarily inside functions, without worrying if a caller has already obtained the lock. Locks implemented naively using Atomics would deadlock in this case.
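The extra state is easy to observe with ReentrantLock, which is reentrant in the same way as synchronized and exposes its hold count. The sketch below is only an illustration of reentrancy (class and method names are made up), not a claim about how the JVM implements monitors.

```java
import java.util.concurrent.locks.ReentrantLock;

class ReentrancyDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int maxHoldCount;

    int factorial(int n) {
        lock.lock(); // re-acquiring a lock this thread already holds just bumps a counter
        try {
            maxHoldCount = Math.max(maxHoldCount, lock.getHoldCount());
            return n <= 0 ? 1 : n * factorial(n - 1);
        } finally {
            lock.unlock(); // each lock() needs a matching unlock()
        }
    }

    int maxHoldCount() {
        return maxHoldCount;
    }
}
```

Calling factorial(5) nests six acquisitions (n = 5 down to 0) on the same thread without deadlocking.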
Additionally, synchronized may yield performance benefits if large amounts of processing are done within the lock. Getting a lock only has a performance hit once, while atomics force a core synchronization per operation. This flushes the processor pipeline, impacting performance.
Conceptually, a critical section protected by a lock transforms state from one valid state to another.
int x, y; // invariant: x==y
void inc() {
    synchronized(lock) {
        x++;
        y++;
    }
}
void dec()
    ...
We could encapsulate the state in an object, and atomically change the object.
class State {
    final int x, y;
    State(int x, int y) { ... }
}
volatile State state;
void inc() {
    State s, s2;
    do {
        s = state;
        s2 = new State(s.x+1, s.y+1);
    } while( ! compareAndSet( "state", s, s2) ); // use Unsafe or something
}
Is that better? not necessarily. It is appealing, and it's simpler when states get more complicated; but it's probably slower in most cases.
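For a runnable version of the same idea, AtomicReference can stand in for the "Unsafe or something" part. This is a sketch under that substitution; PairCounter and snapshot() are made-up names.

```java
import java.util.concurrent.atomic.AtomicReference;

class PairCounter {
    // Immutable snapshot of the state; invariant: x == y.
    static final class State {
        final int x, y;

        State(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    private final AtomicReference<State> state =
            new AtomicReference<>(new State(0, 0));

    void inc() {
        State s, s2;
        do {
            s = state.get();
            s2 = new State(s.x + 1, s.y + 1);     // build the next valid state
        } while (!state.compareAndSet(s, s2));    // retry if another thread got in first
    }

    State snapshot() {
        return state.get();
    }
}
```

Readers always see a State in which x and y match, because the whole object is swapped in one step. The cost is an allocation per update plus retries under contention.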

Java Thread - Synchronization issue

From Sun's tutorial:
Synchronized methods enable a simple strategy for preventing thread interference and memory consistency errors: if an object is visible to more than one thread, all reads or writes to that object's variables are done through synchronized methods. (An important exception: final fields, which cannot be modified after the object is constructed, can be safely read through non-synchronized methods, once the object is constructed) This strategy is effective, but can present problems with liveness, as we'll see later in this lesson.
Q1. Does the above statement mean that if an object of a class is going to be shared among multiple threads, then all instance methods of that class (except getters of final fields) should be made synchronized, since instance methods process instance variables?
In order to understand concurrency in Java, I recommend the invaluable Java Concurrency in Practice.
In response to your specific question, although synchronizing all methods is a quick-and-dirty way to accomplish thread safety, it does not scale well at all. Consider the much maligned Vector class. Every method is synchronized, and it works terribly, because iteration is still not thread safe.
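To illustrate the Vector point: each call is synchronized on its own, but an iteration is many calls, so the caller has to hold the lock across the whole loop. A minimal sketch (VectorSum is a made-up helper):

```java
import java.util.Vector;

class VectorSum {
    // Holds the Vector's own monitor for the whole traversal, so no other
    // thread can add or remove elements between iterator steps.
    static int sum(Vector<Integer> v) {
        synchronized (v) {
            int total = 0;
            for (int n : v) {
                total += n;
            }
            return total;
        }
    }
}
```

Without the outer synchronized block, another thread could modify the Vector mid-loop, and the iterator would typically fail with a ConcurrentModificationException.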
No. It means that synchronized methods are a way to achieve thread safety, but they're not the only way and, by themselves, they don't guarantee complete safety in all situations.
Not necessarily. You can synchronize (e.g. place a lock on dedicated object) part of the method where you access object's variables, for example. In other cases, you may delegate job to some inner object(s) which already handles synchronization issues.
There are lots of choices; it all depends on the algorithm you're implementing. That said, the synchronized keyword is usually the simplest one.
edit
There is no comprehensive tutorial on that, each situation is unique. Learning it is like learning a foreign language: never ends :)
But there are certainly helpful resources. In particular, there is a series of interesting articles on Heinz Kabutz's website.
http://www.javaspecialists.eu/archive/Issue152.html
(see the full list on the page)
If other people have any links I'd be interested to see also. I find the whole topic to be quite confusing (and, probably, most difficult part of core java), especially since new concurrency mechanisms were introduced in java 5.
Have fun!
In the most general form yes.
Immutable objects need not be synchronized.
Also, you can use individual monitors/locks for the mutable instance variables (or groups thereof), which will help with liveness. You can also synchronize only the portions where data is changed, rather than the entire method.
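A minimal sketch of the separate-locks idea, assuming two counters that never need to be updated together (ServerStats and its fields are hypothetical):

```java
// Hypothetical stats holder: two independent pieces of state, two locks.
class ServerStats {
    private final Object requestLock = new Object();
    private final Object errorLock = new Object();

    private long requests; // guarded by requestLock
    private long errors;   // guarded by errorLock

    void recordRequest() {
        synchronized (requestLock) { requests++; }
    }

    void recordError() {
        synchronized (errorLock) { errors++; }
    }

    long requests() {
        synchronized (requestLock) { return requests; }
    }

    long errors() {
        synchronized (errorLock) { return errors; }
    }
}
```

A thread recording an error never waits for a thread recording a request. This only works because nothing needs to read or update both counters as one consistent pair.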
synchronized methodName vs synchronized( object )
That's correct, and is one alternative. I think it would be more efficient to synchronize access to that object only, instead of synchronizing all its methods.
While the difference may be subtle, it would be useful if you use that same object in a single thread
ie ( using synchronized keyword on the method )
class SomeClass {
    private int clickCount = 0;
    public synchronized void click(){
        clickCount++;
    }
}
When a class is defined like this, only one thread at a time may invoke the click method.
What happens if this method is invoked too frequently in a single threaded app? You'll spend some extra time checking if that thread can get the object lock when it is not needed.
class Main {
    public static void main( String [] args ) {
        SomeClass someObject = new SomeClass();
        for( int i = 0 ; i < Integer.MAX_VALUE ; i++ ) {
            someObject.click();
        }
    }
}
In this case, the check to see if the thread can lock the object will be invoked unnecessarily Integer.MAX_VALUE ( 2 147 483 647 ) times.
So removing the synchronized keyword in this situation will run much faster.
So, how would you do that in a multithread application?
You just synchronize the object:
synchronized ( someObject ) {
    someObject.click();
}
Vector vs ArrayList
As an additional note, this usage ( synchronized methodName vs. synchronized( object ) ) is, by the way, one of the reasons why java.util.Vector is now replaced by java.util.ArrayList. Many of the Vector methods are synchronized.
Most of the time a list is used in a single-threaded app or piece of code (i.e. code inside jsp/servlets is executed in a single thread), and the extra synchronization of Vector doesn't help performance.
Same goes for Hashtable being replaced by HashMap
In fact, getters should be synchronized too, or the fields should be made volatile. That is because when you get some value, you're probably interested in the most recent version of the value. You see, synchronized block semantics provide not only atomicity of execution (e.g. they guarantee that only one thread executes this block at one time), but also visibility. It means that when a thread enters a synchronized block it invalidates its local cache, and when it leaves it dumps any variables that have been modified back to main memory. volatile variables have the same visibility semantics.
No. Even getters have to be synchronized, except when they access only final fields. The reason is that, for example, when accessing a long value, there is a tiny chance that another thread is currently writing it, and you read it when just the first 4 bytes have been written while the other 4 bytes still hold the old value.
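Both points (visibility and torn long reads) can be covered with volatile when a field is only written and read, with no read-modify-write. A minimal sketch with a hypothetical Worker class:

```java
class Worker implements Runnable {
    private volatile boolean stopRequested; // volatile: the write becomes visible to the loop
    private volatile long iterations;       // volatile long: reads and writes are never torn

    @Override
    public void run() {
        long n = 0;
        while (!stopRequested) {
            n++;
        }
        iterations = n;
    }

    void requestStop() {
        stopRequested = true;
    }

    long iterations() {
        return iterations;
    }

    // Starts a worker, asks it to stop, and reports whether it terminated in time.
    static boolean startAndStop() {
        Worker w = new Worker();
        Thread t = new Thread(w);
        t.start();
        w.requestStop();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return !t.isAlive();
    }
}
```

Without volatile on stopRequested, the loop would not be guaranteed to ever notice the write. Note that volatile does not make n++ style updates on a shared field atomic; that still needs a lock or an Atomic class.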
Yes, that's correct. All methods that modify data or access data that may be modified by a different thread need to be synchronized on the same monitor.
The easy way is to mark the methods as synchronized. If these are long-running methods, you may want to synchronize only the parts that do the reading/writing. In this case you would define the monitor, along with wait() and notify().
The simple answer is yes.
If an object of the class is going to be shared by multiple threads, you need to synchronize the getters and setters to prevent data inconsistency.
If each thread has its own separate copy of the object, then there is no need to synchronize the methods. If your instance methods are more than mere set and get, you must analyze the threat of threads waiting for a long-running getter/setter to finish.
You could use synchronized methods, synchronized blocks, concurrency tools such as Semaphore or if you really want to get down and dirty you could use Atomic References. Other options include declaring member variables as volatile and using classes like AtomicInteger instead of Integer.
It all depends on the situation, but there are a wide range of concurrency tools available - these are just some of them.
Synchronization can result in hold-wait deadlock where two threads each have the lock of an object, and are trying to acquire the lock of the other thread's object.
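A common way to avoid that particular deadlock is to always acquire the two locks in the same order. The sketch below assumes each account has a unique id that can be used for ordering; OrderedAccount and its fields are made up for the example.

```java
class OrderedAccount {
    private final int id;   // assumed unique; used only to order lock acquisition
    private int balance;

    OrderedAccount(int id, int balance) {
        this.id = id;
        this.balance = balance;
    }

    int balance() {
        synchronized (this) {
            return balance;
        }
    }

    static void transfer(OrderedAccount from, OrderedAccount to, int amount) {
        // Always lock the lower id first, so two opposite transfers cannot
        // each hold one lock while waiting for the other.
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

With naive ordering (lock from, then to), transfer(a, b) and transfer(b, a) running at the same time could each grab one lock and wait forever for the other.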
Synchronization must also be global for a class, and an easy mistake to make is to forget to synchronize a method. When a thread holds the lock for an object, other threads can still access non synchronized methods of that object.
