Thread safety of static variables - Java

I read about thread safety for static variables, and I understand and agree with it. But in the Java SE 7 Programmer Exam 804 book there is this example:
public void run() {
    synchronized(SharedCounter.class) {
        SharedCounter.count++;
    }
}
However, this code is inefficient since it acquires and releases the lock every time just to increment the value of count.
Can someone explain the above quote to me?

The code is not particularly inefficient; it could only be made slightly more efficient. The main problem is that it is fragile: if any developer forgets to synchronize access to the global SharedCounter.count variable, you have a thread-safety issue. Indeed, since count++ is not an atomic operation, and since changing the value of a variable without synchronization doesn't make the variable's new value visible to other threads, every access to count must be done in a synchronized way.
The synchronization is thus not correctly encapsulated in a single class. Generally, accessing global public fields is bad design. It's even worse in a multi-threaded environment.
Using an AtomicInteger solves the encapsulation problem, and makes it slightly more efficient at the same time.
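For illustration, here is a minimal sketch (the increment/get method names are my own) of how the counter could encapsulate its synchronization with AtomicInteger, so callers can no longer forget to lock:

import java.util.concurrent.atomic.AtomicInteger;

public class SharedCounter {
    private static final AtomicInteger count = new AtomicInteger();

    // Atomic read-modify-write: no explicit lock, and no way for a
    // caller to bypass the synchronization.
    public static int increment() {
        return count.incrementAndGet();
    }

    public static int get() {
        return count.get();
    }
}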

Synchronizing can be expensive, so it shouldn't be used carelessly. There are better ways, such as AtomicInteger.incrementAndGet(), which uses different mechanisms to handle the synchronization.

It's inefficient compared to using intrinsic CPU instructions which can do atomic increments without using a lock. See http://en.wikipedia.org/wiki/Fetch-and-add and http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html

Related

Fastest way to write mutexes in Java?

Mutexes are pretty common in many programming languages, e.g. C/C++. I miss them in Java. However, there are multiple ways I could write my own Mutex class:
Using a simple synchronized keyword on Mutex.
Using a binary semaphore.
Using atomic variables, as discussed here.
...?
What is the fastest (best runtime) way? I think synchronized is most common, but what about performance?
Mutexes are pretty common in many programming languages, e.g. C/C++. I miss them in Java.
Not sure I follow you (especially because you give the answer in your question).
public class SomeClass {
    private final Object mutex = new Object();

    public void someMethodThatNeedsAMutex() {
        synchronized(mutex) {
            // here you hold the mutex
        }
    }
}
Alternatively, you can simply make the whole method synchronized, which is equivalent to using this as the mutex object:
public class SomeClass {
    public synchronized void someMethodThatNeedsAMutex() {
        // here you hold the mutex
    }
}
What is the fastest (best runtime) way?
Acquiring / releasing a monitor is not going to be a significant performance issue per se (you can read this blog post to see an analysis of the impact). But if you have many threads fighting for the lock, it will create contention and degrade performance.
In that case, the best strategy is to avoid mutexes by using "lock-free" algorithms if you are mostly reading data (as pointed out by Marko in the comments, lock-free approaches use CAS operations, which may involve retrying writes many times if you have lots of writing threads, eventually leading to worse performance), or, even better, by avoiding sharing too much state across threads.
The opposite is the case: Java designers solved it so well that you don't even recognize it: you don't need a first-class Mutex object, just the synchronized modifier.
If you have a special case where you want to juggle your mutexes in a non-nesting fashion, there's always the ReentrantLock and java.util.concurrent offers a cornucopia of synchronization tools that go way beyond the crude mutex.
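As a minimal sketch of the ReentrantLock alternative (the class and field names here are illustrative):

import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    public void increment() {
        lock.lock();          // acquire the mutex explicitly
        try {
            value++;          // critical section
        } finally {
            lock.unlock();    // always release, even on exceptions
        }
    }
}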
In Java, each object can be used as a mutex.
Such objects are typically named "lock" or "mutex".
You can create that object yourself, which is the preferred variant because it avoids external access to the lock:
// usually a field in the class
private final Object mutex = new Object();

// later in methods
synchronized(mutex) {
    // mutually exclusive section for all code that synchronizes
    // on this mutex object
}
It is faster to avoid the mutex altogether, by thinking about what happens if another thread reads a stale value. In some situations this would produce wrong calculation results; in others it results only in a minimal delay (but is faster than synchronizing).
A detailed explanation can be found in the book Java Concurrency in Practice.
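As a sketch of that trade-off (the class name is illustrative): a single-writer statistics counter where readers tolerate a slightly stale value, using volatile for visibility instead of a lock:

public class ApproxCounter {
    private volatile long count; // visibility without locking

    // Must be called from a single writer thread only: count++ is a
    // read-modify-write and would lose updates with multiple writers.
    public void increment() {
        count++;
    }

    // Any thread may read; the value may be momentarily stale but is
    // never torn, since volatile long reads/writes are atomic.
    public long read() {
        return count;
    }
}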
What is the fastest (best runtime) way?
That depends on many things. For example, ReentrantLock used to perform better under contention than using synchronized, but that changed when a new HotSpot version, optimizing synchronized locking, was released. So there's nothing inherent in any way of locking that favors one flavor of mutexes over the other (from a performance point of view) - in fact, the "best" solution can change with the data you're processing and the machine you're running on.
Also, why did the inventors of Java not solve this question for me?
They did - in several ways: synchronized, Locks, atomic variables, and a whole slew of other utilities in java.util.concurrent.
You can run micro benchmarks of each variant, like atomic, synchronized, locked. As others have pointed out, it depends a lot on the machine and number of threads in use. In my own experiments incrementing long integers, I found that with only one thread on a Xeon W3520, synchronized wins over atomic: Atomic/Sync/Lock: 8.4/6.2/21.8, in nanos per increment operation.
This is of course a corner case, since there is never any contention. In that case, we can also look at an unsynchronized single-threaded long increment, which comes out six times faster than atomic.
With 4 threads I get 21.8/40.2/57.3. Note that these are all increments across all threads, so we actually see a slowdown. It gets a bit better for locks with 64 threads: 22.2/45.1/45.9.
Another test on a 4-way/64T machine using Xeon E7-4820 yields for 1 thread: 9.1/7.8/29.1, 4 threads: 18.2/29.1/55.2 and 64 Threads: 53.7/402/420.
One more data point, this time a dual Xeon X5560, 1T: 6.6/5.8/17.8, 4T: 29.7/81.5/121, 64T: 31.2/73.4/71.6.
So, on a multi-socket machine, there is a heavy cache coherency tax.
You can use java.util.concurrent.locks.Lock in the same way as a mutex, or java.util.concurrent.Semaphore. But using the synchronized keyword is a better way :-)
Regards
Andrej
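For reference, a minimal sketch of the binary-semaphore approach mentioned in the question (class and field names are illustrative). Note that unlike synchronized, a Semaphore is not reentrant, and any thread may release the permit:

import java.util.concurrent.Semaphore;

public class SemaphoreMutex {
    private final Semaphore mutex = new Semaphore(1); // one permit = mutex
    private int counter;

    public void increment() throws InterruptedException {
        mutex.acquire();      // blocks until the single permit is free
        try {
            counter++;        // critical section
        } finally {
            mutex.release();  // hand the permit back
        }
    }
}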

Is there any justification not to ALWAYS use AtomicInteger as data members?

In a multi-threaded environment like Android, where a simple int variable may be manipulated by multiple threads, are there circumstances in which it is still justified to use an int as a data member?
An int as a local variable, limited to the scope of the method that has exclusive access to it (and thus start & finish of modifying it is always in the same thread), makes perfect sense performance-wise.
But as a data member, even if wrapped by an accessor, it can run into the well known concurrent interleaved modification problem.
So it looks like to "play it safe" one could just use AtomicInteger across the board. But this seems awfully inefficient.
Can you bring an example of thread-safe int data member usage?
Is there any justification not to ALWAYS use AtomicInteger as data members?
Yes, there are good reasons not to always use AtomicInteger. An AtomicInteger can be at least an order of magnitude slower (probably more) than a plain int because of its volatile semantics and the other Unsafe constructs used to set/get the underlying int value. volatile means that you cross a memory barrier every time you access an AtomicInteger, which causes a cache flush on the processor in question.
Also, just because you have made all of your fields AtomicIntegers does not protect you against race conditions when multiple fields are being accessed. There is just no substitute for making good decisions about when to use volatile, synchronized, and the Atomic* classes.
For example, if you had two fields in a class that you wanted to access in a reliable manner in a threaded program, then you'd do something like:
synchronized (someObject) {
    someObject.count++;
    someObject.total += someObject.count;
}
If both of those members were AtomicIntegers, then you'd be accessing a volatile twice, crossing two memory barriers instead of just one. Also, plain assignments are faster than the Unsafe operations inside AtomicInteger. And because of the data races between the two operations (as opposed to the synchronized block above), you might not get the right values for total.
Can you bring an example of thread-safe int data member usage?
Aside from making it final, there is no mechanism for a thread-safe int data member except marking it volatile or using AtomicInteger. There is no magic way to paint thread-safety onto all of your fields; if there were, thread programming would be easy. The challenge is to find the right places to put your synchronized blocks, the right fields that should be marked volatile, and the proper places to use AtomicInteger and friends.
If you have effectively immutable ints, you can get away with not ensuring synchronization, at the cost of possibly computing the value more than once. An example is hashCode:
int hash = 0;

public int hashCode() {
    if (hash == 0) {
        hash = calculateHashCode(); // needs to always be the same for each Object
    }
    return hash;
}
The obvious tradeoff here is the possibility of multiple calculations for the same hash value, but if the alternative is a synchronized hashCode that can have far worse implications.
This is technically thread-safe though redundant.
It depends on how it is used with respect to other data. A class encapsulates behavior, so often a variable is almost meaningless without the others. In such cases it might be better to protect(*) the data members that belong together (or the whole object), instead of just one integer. If you do this, then AtomicInteger is an unnecessary performance hit.
(*) using the common thread safety mechanisms: mutex, semaphore, monitor etc.
Thread safety is not only about atomic int assignments; you need to carefully design your locking patterns to get consistency in your code.
If you have an Account class with a public data member balance, consider the following simple code.
Account a;
...
int withdrawal = 100;
if (a.balance >= withdrawal) {
    // No atomic operations in the world can save you from another thread
    // withdrawing some balance here
    a.balance -= withdrawal;
} else {
    // Handle error
}
To be really frank: in real life, having atomic assignments is rarely enough to solve my real concurrency issues.
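A minimal sketch of the conventional fix (the class and method names are illustrative): make the check and the update one atomic action by holding the account's lock across both:

public class Account {
    private int balance;

    public synchronized boolean withdraw(int amount) {
        if (balance >= amount) {
            balance -= amount; // no other thread can interleave here
            return true;
        }
        return false;          // insufficient funds
    }
}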
I guess Google saw the OP and updated their documentation on the subject to be clearer:
"An AtomicInteger is used in applications such as atomically incremented counters, and cannot be used as a replacement for an Integer."
https://developer.android.com/reference/java/util/concurrent/atomic/AtomicInteger

Is unsynchronized read of integer threadsafe in java?

I see this code quite frequently in some OSS unit tests, but is it thread-safe? Is the while loop guaranteed to see the correct value of invoc?
If not: nerd points to whoever also knows which CPU architecture this may fail on.
private int invoc = 0;

private synchronized void increment() {
    invoc++;
}

public void isItThreadSafe() throws InterruptedException {
    for (int i = 0; i < TOTAL_THREADS; i++) {
        new Thread(new Runnable() {
            public void run() {
                // do some stuff
                increment();
            }
        }).start();
    }
    while (invoc != TOTAL_THREADS) {
        Thread.sleep(250);
    }
}
No, it's not threadsafe. invoc needs to be declared volatile, or accessed while synchronizing on the same lock, or changed to use AtomicInteger. Just using the synchronized method to increment invoc, but not synchronizing to read it, isn't good enough.
The JVM does a lot of optimizations, including CPU-specific caching and instruction reordering. It uses the volatile keyword and locking to decide when it can optimize freely and when it has to have an up-to-date value available for other threads to read. So when the reader doesn't use the lock the JVM can't know not to give it a stale value.
This quote from Java Concurrency in Practice (section 3.1.3) discusses how both writes and reads need to be synchronized:
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
The next section (3.1.4) covers using volatile:
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Back when we all had single-CPU machines on our desktops, we'd write code and never have a problem until it ran on a multiprocessor box, usually in production. Some of the factors that give rise to the visibility problems, such as CPU-local caches and instruction reordering, are things you would expect from any multiprocessor machine. Elimination of apparently unneeded instructions could happen on any machine, though. There's nothing forcing the JVM to ever make the reader see the up-to-date value of the variable; you're at the mercy of the JVM implementors. So it seems to me this code would not be a good bet for any CPU architecture.
Well!
private volatile int invoc = 0;
Will do the trick.
And see Are java primitive ints atomic by design or by accident?, which cites some of the relevant Java definitions. Apparently int is fine, but double & long might not be.
edit, add-on. The question asks, "see the correct value of invoc ?". What is "the correct value"? As in the timespace continuum, simultaneity doesn't really exist between threads. One of the above posts notes that the value will eventually get flushed, and the other thread will get it. Is the code "thread safe"? I would say "yes", because it won't "misbehave" based on the vagaries of sequencing, in this case.
Theoretically, it is possible that the read is cached. Nothing in the Java memory model prevents that.
Practically, that is extremely unlikely to happen (in your particular example). The question is, whether JVM can optimize across a method call.
read #1
method();
read #2
For the JVM to reason that read #2 can reuse the result of read #1 (which can be stored in a CPU register), it must know for sure that method() contains no synchronization actions. This is generally impossible - unless method() is inlined, and the JVM can see from the flattened code that there are no sync/volatile or other synchronization actions between read #1 and read #2; then it can safely eliminate read #2.
Now in your example, the method is Thread.sleep(). One way to implement it would be to busy-loop for a certain number of iterations, depending on CPU frequency. Then the JVM could inline it, and then eliminate read #2.
But of course such implementation of sleep() is unrealistic. It is usually implemented as a native method that calls OS kernel. The question is, can JVM optimize across such a native method.
Even if JVM has knowledge of internal workings of some native methods, therefore can optimize across them, it's improbable that sleep() is treated that way. sleep(1ms) takes millions of CPU cycles to return, there is really no point optimizing around it to save a few reads.
--
This discussion reveals the biggest problem of data races - it takes too much effort to reason about them. A program is not necessarily wrong if it is not "correctly synchronized"; however, proving that it's not wrong is not an easy task. Life is much simpler if a program is correctly synchronized and contains no data races.
As far as I understand the code, it should be safe. The bytecode can be reordered, yes, but eventually invoc should be in sync with the main thread again. synchronized guarantees that invoc is incremented correctly, so there is a consistent representation of invoc in some register. At some point this value will be flushed and the little test succeeds.
It is certainly not nice, and I would go with the answer I voted for and fix code like this because it smells. But thinking about it, I would consider it safe.
If you're not required to use "int", I would suggest AtomicInteger as a thread-safe alternative.
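A hypothetical rewrite of the test using AtomicInteger (the enclosing class name and the TOTAL_THREADS value are my own for illustration): incrementAndGet() is atomic, and get() has volatile read semantics, so the polling loop is guaranteed to see the latest value:

import java.util.concurrent.atomic.AtomicInteger;

public class InvocationTest {
    private static final int TOTAL_THREADS = 8; // placeholder, as in the original test
    private final AtomicInteger invoc = new AtomicInteger();

    private void increment() {
        invoc.incrementAndGet(); // atomic, no synchronized needed
    }

    public void isItThreadSafe() throws InterruptedException {
        for (int i = 0; i < TOTAL_THREADS; i++) {
            new Thread(new Runnable() {
                public void run() {
                    // do some stuff
                    increment();
                }
            }).start();
        }
        while (invoc.get() != TOTAL_THREADS) { // volatile-read semantics
            Thread.sleep(250);
        }
    }
}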

Java Thread - Synchronization issue

From Sun's tutorial:
Synchronized methods enable a simple strategy for preventing thread interference and memory consistency errors: if an object is visible to more than one thread, all reads or writes to that object's variables are done through synchronized methods. (An important exception: final fields, which cannot be modified after the object is constructed, can be safely read through non-synchronized methods, once the object is constructed) This strategy is effective, but can present problems with liveness, as we'll see later in this lesson.
Q1. Do the above statements mean that if an object of a class is going to be shared among multiple threads, then all instance methods of that class (except getters of final fields) should be made synchronized, since instance methods process instance variables?
In order to understand concurrency in Java, I recommend the invaluable Java Concurrency in Practice.
In response to your specific question: although synchronizing all methods is a quick-and-dirty way to accomplish thread safety, it does not scale well at all. Consider the much-maligned Vector class. Every method is synchronized, and it works terribly, because iteration is still not thread-safe.
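A sketch of the Vector problem: each individual call is synchronized, but a compound action made of two calls is not atomic (the class and method names here are illustrative):

import java.util.Vector;

public class VectorRace {
    private final Vector<String> items = new Vector<>();

    public String firstOrNull() {
        if (!items.isEmpty()) {  // synchronized check...
            // ...but another thread may clear() right here...
            return items.get(0); // ...so this can still throw
        }
        return null;
    }
}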
No. It means that synchronized methods are a way to achieve thread safety, but they're not the only way and, by themselves, they don't guarantee complete safety in all situations.
Not necessarily. You can synchronize (e.g. place a lock on a dedicated object) only the part of the method where you access the object's variables, for example. In other cases, you may delegate the job to some inner object(s) that already handle synchronization issues.
There are lots of choices; it all depends on the algorithm you're implementing. That said, the 'synchronized' keyword is usually the simplest one.
edit
There is no comprehensive tutorial on that, each situation is unique. Learning it is like learning a foreign language: never ends :)
But there are certainly helpful resources. In particular, there is a series of interesting articles on Heinz Kabutz's website.
http://www.javaspecialists.eu/archive/Issue152.html
(see the full list on the page)
If other people have any links, I'd be interested to see them too. I find the whole topic quite confusing (and probably the most difficult part of core Java), especially since new concurrency mechanisms were introduced in Java 5.
Have fun!
In the most general form, yes.
Immutable objects need not be synchronized.
Also, you can use individual monitors/locks for the mutable instance variables (or groups thereof), which will help with liveness, as well as synchronizing only the portions where data is changed, rather than the entire method.
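A minimal sketch of that lock-splitting idea (all names are illustrative): independent groups of state get their own monitors, so threads touching one group don't block threads touching the other:

public class Stats {
    private final Object hitLock = new Object();
    private final Object missLock = new Object();
    private long hits;
    private long misses;

    public void recordHit() {
        synchronized (hitLock) { hits++; }    // doesn't block recordMiss()
    }

    public void recordMiss() {
        synchronized (missLock) { misses++; }
    }
}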
synchronized methodName vs synchronized( object )
That's correct, and it is one alternative. I think it would be more efficient to synchronize access to the object only where needed, instead of synchronizing all of its methods.
While the difference may be subtle, it matters if you also use that same object in a single thread.
For example, using the synchronized keyword on the method:
class SomeClass {
    private int clickCount = 0;

    public synchronized void click() {
        clickCount++;
    }
}
When a class is defined like this, only one thread at a time may invoke the click method.
What happens if this method is invoked very frequently in a single-threaded app? You'll spend extra time checking whether the thread can acquire the object lock when it isn't needed.
class Main {
    public static void main(String[] args) {
        SomeClass someObject = new SomeClass();
        for (int i = 0; i < Integer.MAX_VALUE; i++) {
            someObject.click();
        }
    }
}
In this case, the check to see if the thread can lock the object will be performed unnecessarily Integer.MAX_VALUE (2,147,483,647) times.
So removing the synchronized keyword in this situation will make the code run much faster.
So, how would you do that in a multithreaded application?
You just synchronize the object:
synchronized (someObject) {
    someObject.click();
}
Vector vs ArrayList
As an additional note, this distinction (synchronized methodName vs. synchronized(object)) is, by the way, one of the reasons why java.util.Vector has largely been replaced by java.util.ArrayList: many of the Vector methods are synchronized.
Most of the time a list is used in a single-threaded app or piece of code (e.g. code inside JSPs/servlets is executed in a single thread), and the extra synchronization of Vector doesn't help performance.
The same goes for Hashtable being replaced by HashMap.
In fact, getters should be synchronized too, or the fields should be made volatile. That is because when you get some value, you're probably interested in the most recent version of the value. You see, synchronized block semantics provide not only atomicity of execution (i.e. they guarantee that only one thread executes the block at a time), but also visibility. It means that when a thread enters a synchronized block it invalidates its local cache, and when it exits it flushes any variables that have been modified back to main memory. volatile variables have the same visibility semantics.
No. Even getters have to be synchronized, except when they access only final fields. The reason is that, for example, when accessing a long value, there is a tiny chance that another thread is currently writing it, and you read it when just the first 4 bytes have been written while the other 4 bytes remain the old value.
Yes, that's correct. All methods that modify data, or access data that may be modified by a different thread, need to be synchronized on the same monitor.
The easy way is to mark the methods as synchronized. If these are long-running methods, you may want to synchronize only the parts that do the reading/writing. In that case you would define the monitor yourself, along with wait() and notify().
The simple answer is yes.
If an object of the class is going to be shared by multiple threads, you need to synchronize the getters and setters to prevent data inconsistency.
If all the threads had separate copies of the object, then there would be no need to synchronize the methods. If your instance methods are more than mere set and get, you must analyze the threat of threads waiting for a long-running getter/setter to finish.
You could use synchronized methods, synchronized blocks, concurrency tools such as Semaphore or if you really want to get down and dirty you could use Atomic References. Other options include declaring member variables as volatile and using classes like AtomicInteger instead of Integer.
It all depends on the situation, but there are a wide range of concurrency tools available - these are just some of them.
Synchronization can result in hold-wait deadlock where two threads each have the lock of an object, and are trying to acquire the lock of the other thread's object.
Synchronization must also be applied consistently across a class, and an easy mistake to make is forgetting to synchronize a method. When a thread holds the lock for an object, other threads can still access non-synchronized methods of that object.
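A minimal sketch of the hold-and-wait deadlock described above (all names are illustrative): each thread holds one lock and blocks forever waiting for the other:

public class DeadlockDemo {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        new Thread(new Runnable() {
            public void run() {
                synchronized (lockA) {
                    pause(100);              // let the other thread take lockB
                    synchronized (lockB) { } // blocks forever: thread 2 holds lockB
                }
            }
        }).start();
        new Thread(new Runnable() {
            public void run() {
                synchronized (lockB) {
                    pause(100);
                    synchronized (lockA) { } // blocks forever: thread 1 holds lockA
                }
            }
        }).start();
    }

    private static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }
}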

Java threads synchronization

In the class below, is the method getIt() thread-safe, and why?
public class X {
    private long myVar;

    public void setIt(long var) {
        myVar = var;
    }

    public long getIt() {
        return myVar;
    }
}
It is not thread-safe. Variables of type long and double in Java are treated as two separate 32-bit variables. One thread could be writing and have written half the value when another thread reads both halves. In this situation, the reader would see a value that was never supposed to exist.
To make this thread-safe you can either declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
Note that even if myVar was a 32-bit int you could still run into threading issues where one thread could be reading an out of date value that another thread has changed. This could occur because the value has been cached by the CPU. To resolve this, you again need to declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
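For instance, a minimal sketch of the synchronized variant: both accessors use the same lock (this), which makes the 64-bit access atomic and guarantees visibility of the latest write:

public class X {
    private long myVar;

    public synchronized void setIt(long var) {
        myVar = var;
    }

    public synchronized long getIt() {
        return myVar;
    }
}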
It's also worth noting that if you are using the result of getIt in a subsequent setIt call, e.g. x.setIt(x.getIt() * 2), then you probably want to synchronize across both calls:
synchronized (x) {
    x.setIt(x.getIt() * 2);
}
Without the extra synchronization, another thread could change the value in between the getIt and setIt calls causing the other thread's value to be lost.
This is not thread-safe. Even if your platform guarantees atomic writes of long, the lack of synchronized makes it possible that one thread calls setIt(), and even after this call has finished, another thread calling getIt() could still see the old value of myVar.
The synchronized keyword does more than an exclusive access of one thread to a block or a method. It also guarantees that the second thread is informed about a change of a variable.
So you either have to mark both methods as synchronized or mark the member myVar as volatile.
There's a very good explanation about synchronization here:
Atomic actions cannot be interleaved, so they can be used without fear of thread interference. However, this does not eliminate all need to synchronize atomic actions, because memory consistency errors are still possible. Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up to the change.
No, it's not. At least, not on platforms that lack atomic 64-bit memory accesses.
Suppose that Thread A calls setIt, copies 32 bits into memory where the backing value is, and is then pre-empted before it can copy the other 32 bits.
Then Thread B calls getIt.
No it is not, because longs are not atomic in Java, so one thread could have written 32 bits of the long in the setIt method, then getIt could read the value, and then setIt could set the other 32 bits.
So the end result is that getIt returns a value that was never valid.
It ought to be, and generally is, but is not guaranteed to be thread safe. There could be issues with different cores having different versions in CPU cache, or the store/retrieve not being atomic for all architectures. Use the AtomicLong class.
The getter is not thread-safe because it's not guarded by any mechanism that guarantees the most up-to-date visibility. Your choices are:
making myVar final (but then you can't mutate it)
making myVar volatile
using synchronized when accessing myVar
AFAIK, modern JVMs no longer split long and double operations. I don't know of any reference which states this is still a problem. For example, see AtomicLong, which doesn't use synchronization in Sun's JVM.
Assuming you want to be sure it is not a problem, you can synchronize both get() and set(). However, if you are performing an operation like add, i.e. set(get() + 1), then this synchronization doesn't buy you much; you still have to synchronize on the object for the whole operation. (A better way around this is to provide a single synchronized add(n) operation.)
However, a better solution is to use an AtomicLong. This supports atomic operations like get, set and add, and DOESN'T use synchronization.
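A minimal sketch of that AtomicLong version (the addAndGetIt method is my own addition for the add case):

import java.util.concurrent.atomic.AtomicLong;

public class X {
    private final AtomicLong myVar = new AtomicLong();

    public void setIt(long var) {
        myVar.set(var);             // volatile-write semantics
    }

    public long getIt() {
        return myVar.get();         // volatile-read semantics, never torn
    }

    public long addAndGetIt(long n) {
        return myVar.addAndGet(n);  // single atomic read-modify-write
    }
}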
Since it is a read-only method, you should only need to synchronize the set method.
EDIT: I see why the get method needs to be synchronized as well. Good job explaining, Phil Ross.
