This may seem like pedantry but is really me questioning my fundamental assumptions.. :)
In the java documentation on synchronised methods, there is the following example:
public class SynchronizedCounter {
private int c = 0;
public synchronized void increment() {
c++;
}
public synchronized void decrement() {
c--;
}
public synchronized int value() {
return c;
}
}
Is the synchronized keyword really required on the value method? Surely this is atomic and whether the value is retrieved before or after any calls to related methods on other threads makes little difference? Would the following suffice:
public class SynchronizedCounter {
private int c = 0;
public synchronized void increment() {
c++;
}
public synchronized void decrement() {
c--;
}
public int value() {
return c;
}
}
I understand that in a more complex case, where multiple private variables were being accessed then yes, it would be essential - but in this simple case, is it safe to assume that this can be simplified?
Also, I suppose that there is a risk that future modifications may require the value method to be synchronised and this could be forgotten, leading to bugs, etc, so perhaps this counts somewhat as defensive programming, but I am ignoring that aspect here.. :)
Yes, synchronized is really required on value(). Otherwise a thread can call value() and get a stale answer.
Surely this is atomic
For ints I believe so, but if value was a long or double, it's not. It is even possible to see only some of the bits in the field updated!
value is retrieved before or after any calls to related methods on other threads makes little difference?
Depends on your use case. Often it does matter.
Some static analysis software such as FindBugs will flag this code as not having correct synchronization if value() isn't also synchronized.
synchronized is required for both reading and writing a variable from another thread. This guarantees
that values will be copied from cache or registers to RAM (granted this is important for writing not reading)
it establishes that writes will happen before reads if they appear so in code. Otherwise the compiler is free to rearrange lines of bytecode for optimization
Check Effective Java Item 66 for a more detailed analysis
Related
I've a class in multithreading application:
public class A {
private volatile int value = 0; // is volatile needed here?
synchronized public void increment() {
value++; // Atomic is better, agree
}
public int getValue() { // synchronized needed ?
return value;
}
}
The keyword volatile gives you the visibility aspects and without that you may read some stale value. A volatile read adds a memory barrier such that the compiler, hardware or the JVM can't reorder the memory operations in ways that would violate the visibility guarantees provided by the memory model. According to the memory model, a write to a volatile field happens-before every subsequent read of that same field, thus you are guaranteed to read the latest value.
The keyword synchronized is also needed since you are performing a compound action value++ which has to be done atomically. You read the value, increment it in the CPU and then write it back. All these actions has to be done atomically. However, you don't need to synchronize the read path since the keyword volatile guarantees the visibility. In fact, use of both volatile and synchronize on the read path would be confusing and would offer no performance or safety benefit.
The use of atomic variables is generally encouraged, since they use non blocking synchronization using CAS instructions built into the CPU which yields low lock contention and higher throughput. If it were written using the atomic variables, it would be something like this.
public class A {
private final LongAdder value = new LongAdder();
public void increment() {
value.add(1);
}
public int getValue() {
return value.intValue();
}
}
Given:
public class NamedCounter{
private final String name;
private int count;
public NamedCounter(String name) { this.name = name; }
public String getName() { return name; }
public void increment() { count++; }
public int getCount() { return count; }
public void reset() { count = 0; }
}
Which three changes should be made to adapt this class to be used safely by multiple threads? (Choose three.)
A. declare reset() using the synchronized keyword
B. declare getName() using the synchronized keyword
C. declare getCount() using the synchronized keyword
D. declare the constructor using the synchronized keyword
E. declare increment() using the synchronized keyword
Answers are A, C, E. But I don't understand why. According to synchronized keyword only used for manipulation operations. So why answer C?
This question comes from SCJP dumps.
The JVM uses the synchronized keyword to tell it when a variable's contents need to be made visible across threads. In the posted example not only is the increment method unsafe (the postincrement operator takes multiple steps to act, so that another thread can interfere with it while the increment is in progress) but all the methods that touch the count variable -- increment, reset, and getCount -- need the synchronized keyword to indicate that the count variable could be changed in another thread. Otherwise the JVM could perform optimizations like caching the value or eliminating the bytecode altogether, because it is free to reason about the code without taking into account the possibility of multithreaded access.
There are alternatives to synchronized here; for instance an AtomicInteger would work too, using it for count would make sure the value would change atomically and would be accessed in a way that is visible to all threads. But as the question is worded answers A, C, and E are all necessary.
Answer D is wrong because declaring a constructor as synchronized isn't valid syntax, and if it was valid it wouldn't accomplish anything for you. Synchronization protects against sharing across threads, but the constructor call doesn't share anything so there's nothing to protect.
Answer B is wrong because the name is an immutable object passed in as a constructor argument and assigned to a final instance member; since it is published safely and cannot change, synchronization is unnecessary.
I'm building a simple program to use in multi processes (Threads).
My question is more to understand - when I have to use a reserved word synchronized?
Do I need to use this word in any method that affects the bone variables?
I know I can put it on any method that is not static, but I want to understand more.
thank you!
here is the code:
public class Container {
// *** data members ***
public static final int INIT_SIZE=10; // the first (init) size of the set.
public static final int RESCALE=10; // the re-scale factor of this set.
private int _sp=0;
public Object[] _data;
/************ Constructors ************/
public Container(){
_sp=0;
_data = new Object[INIT_SIZE];
}
public Container(Container other) { // copy constructor
this();
for(int i=0;i<other.size();i++) this.add(other.at(i));
}
/** return true is this collection is empty, else return false. */
public synchronized boolean isEmpty() {return _sp==0;}
/** add an Object to this set */
public synchronized void add (Object p){
if (_sp==_data.length) rescale(RESCALE);
_data[_sp] = p; // shellow copy semantic.
_sp++;
}
/** returns the actual amount of Objects contained in this collection */
public synchronized int size() {return _sp;}
/** returns true if this container contains an element which is equals to ob */
public synchronized boolean isMember(Object ob) {
return get(ob)!=-1;
}
/** return the index of the first object which equals ob, if none returns -1 */
public synchronized int get(Object ob) {
int ans=-1;
for(int i=0;i<size();i=i+1)
if(at(i).equals(ob)) return i;
return ans;
}
/** returns the element located at the ind place in this container (null if out of range) */
public synchronized Object at(int p){
if (p>=0 && p<size()) return _data[p];
else return null;
}
Making a class safe for multi-threaded access is a complex subject. If you are not doing it in order to learn about threading, you should try to find a library that does it for you.
Having said that, a place to start is by imagining two separate threads executing a method line by line, in an alternating fashion, and see what would go wrong. For example, the add() method as written above is vulnerable to data destruction. Imagine thread1 and thread2 calling add() more or less at the same time. If thread1 runs line 2 and before it gets to line 3, thread2 runs line 2, then thread2 will overwrite thread1's value. Thus you need some way to prevent the threads from interleaving like that. On the other hand, the isEmpty() method does not need synchronization since there is just one instruction that compares a value to 0. Again, it is hard to get this stuff right.
You can check the following documentation about synchronized methods: http://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
By adding the synchronized keyword two things are guaranteed to happen:
First, it is not possible for two invocations of synchronized methods on the same object to interleave. When one thread is executing a synchronized method for an object, all other threads that invoke synchronized methods for the same object block (suspend execution) until the first thread is done with the object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.
So whenever you need to guarantee that only one thread accesses your variable at a time to read/write it to avoid consistency issues, one way is to make your method synchronized.
My advice to you is to first read Oracle's concurrency tutorial.
A few comments:
Having all your methods synchronized causes bottlenecks
Having _data variable public is a bad practice and will difficult concurrent programming.
It seems that you are reimplementing a collection, better use existing Java's concurrent collections.
Variable names would better not begin with _
Avoid adding comments to your code and try to have declarative method names.
+1 for everybody who said read a tutorial, but here's a summary anyway.
You need mutual exclusion (i.e., synchronized blocks) whenever it is possible for one thread to create a temporary situation that other threads must not be allowed to see. Suppose you have objects stored in a search tree. A method that adds a new object to the tree probably will have to reassign several object references, and until it finishes its work, the tree will be in an invalid state. If one thread is allowed to search the tree while another thread is in the add() method, then the search() function may return an incorrect result, or worse (maybe crash the program.)
One solution is to synchronize the add() method, and the search() method, and any other method that depends on the tree structure. All must be synchronized on the same object (the root node of the tree would be an obvious choice).
Java guarantees that no more than one thread can be synchronized on the same object at any given time. Therefore, no more than one thread will be able to see or change the internals of the tree at the same time, and the temporary invalid state created inside the add() method will be harmless.
My example above explains the principle of mutual exclusion, but it is a simplistic and inefficient solution to protecting a search tree. A more practical approach would use reader/writer locks, and synchronize only on interesting parts of the tree rather than on the whole thing. Practical synchronization of complex data structures is a hard problem, and whenever possible, you should let somebody else solve it for you. E.g., If you use the container classes in java.util.concurrent instead of creating your own data structures, you'll probably save yourself a lot of work (and maybe a whole lot of debugging).
You need to protect variables that form the object's state. If these variables are used in static method, you have to protect them as well. But, be careful, following example is wrong:
private static int stateVariable = 0;
//wrong!!!!
public static synchronized void increment() {
stateVariable++;
}
public synchronized int getValue() {
return stateVariable;
}
It seems that above is safe, but these methods operate on different locks. Above is more or less corresponds to following:
private static int stateVariable = 0;
//wrong!!!!
public static void increment() {
synchronized (YourClassName.class) {
stateVariable++;
}
}
public synchronized int getValue() {
synchronized (this) {
return stateVariable;
}
}
Notice that different locks are used when mixing static and object methods.
In Java, I understand that volatile keyword provides visibility to variables. The question is, if a variable is a reference to a mutable object, does volatile also provide visibility to the members inside that object?
In the example below, does it work correctly if multiple threads are accessing volatile Mutable m and changing the value?
example
class Mutable {
private int value;
public int get()
{
return a;
}
public int set(int value)
{
this.value = value;
}
}
class Test {
public volatile Mutable m;
}
This is sort of a side note explanation on some of the details of volatile. Writing this here because it is too much for an comment. I want to give some examples which show how volatile affects visibility, and how that changed in jdk 1.5.
Given the following example code:
public class MyClass
{
private int _n;
private volatile int _volN;
public void setN(int i) {
_n = i;
}
public void setVolN(int i) {
_volN = i;
}
public int getN() {
return _n;
}
public int getVolN() {
return _volN;
}
public static void main() {
final MyClass mc = new MyClass();
Thread t1 = new Thread() {
public void run() {
mc.setN(5);
mc.setVolN(5);
}
};
Thread t2 = new Thread() {
public void run() {
int volN = mc.getVolN();
int n = mc.getN();
System.out.println("Read: " + volN + ", " + n);
}
};
t1.start();
t2.start();
}
}
The behavior of this test code is well defined in jdk1.5+, but is not well defined pre-jdk1.5.
In the pre-jdk1.5 world, there was no defined relationship between volatile accesses and non-volatile accesses. therefore, the output of this program could be:
Read: 0, 0
Read: 0, 5
Read: 5, 0
Read: 5, 5
In the jdk1.5+ world, the semantics of volatile were changed so that volatile accesses affect non-volatile accesses in exactly the same way as synchronization. therefore, only certain outputs are possible in the jdk1.5+ world:
Read: 0, 0
Read: 0, 5
Read: 5, 0 <- not possible
Read: 5, 5
Output 3. is not possible because the reading of "5" from the volatile _volN establishes a synchronization point between the 2 threads, which means all actions from t1 taken before the assignment to _volN must be visible to t2.
Further reading:
Fixing the java memory model, part 1
Fixing the java memory model, part 2
In your example the volatile keyword only guarantees that the last reference written, by any thread, to 'm' will be visible to any thread reading 'm' subsequently.
It doesn't guarantee anything about your get().
So using the following sequence:
Thread-1: get() returns 2
Thread-2: set(3)
Thread-1: get()
it is totally legitimate for you to get back 2 and not 3. volatile doesn't change anything to that.
But if you change your Mutable class to this:
class Mutable {
private volatile int value;
public int get()
{
return a;
}
public int set(int value)
{
this.value = value;
}
}
Then it is guaranteed that the second get() from Thread-1 shall return 3.
Note however that volatile typically ain't the best synchronization method.
In you simple get/set example (I know it's just an example) a class like AtomicInteger, using proper synchronization and actually providing useful methods, would be better.
volatile only provides guarantees about the reference to the Object that is declared so. The members of that instance don't get synchronized.
According to the Wikipedia, you have:
(In all versions of Java) There is a global ordering on the reads and
writes to a volatile variable. This
implies that every thread accessing a
volatile field will read its current
value before continuing, instead of
(potentially) using a cached value.
(However, there is no guarantee about
the relative ordering of volatile
reads and writes with regular reads
and writes, meaning that it's
generally not a useful threading
construct.)
(In Java 5 or later) Volatile reads and writes establish a happens-before
relationship, much like acquiring and
releasing a mutex.
So basically what you have is that by declaring the field volatile, interacting with it creates a "point of synchronization", after which any change will be visible in other threads. But after that, using get() or set() is unsynched. The Java Spec has a more thorough explanation.
Use of volatile rather than a fully synchronized value is essentially an optimization. The optimization comes from the weaker guarantees provided for a volatile value compared with a synchronized access. Premature optimmization is the root of all evil; in this case, the evil could be hard to track down because it would be in the form of race conditions and such like. So if you need to ask, you probably ought not to use it.
volatile does not "provide visibility". Its only effect is to prevent processor caching of the variable, thus providing a happens-before relation on concurrent reads and writes. It does not affect the members of an object, nor does it provide any synchronisation synchronized locking.
As you haven't told us what the "correct" behaviour of your code is, the question cannot be answered.
When a class field is accessed via a getter method by multiple threads, how do you maintain thread safety? Is the synchronized keyword sufficient?
Is this safe:
public class SomeClass {
private int val;
public synchronized int getVal() {
return val;
}
private void setVal(int val) {
this.val = val;
}
}
or does the setter introduce further complications?
If you use 'synchronized' on the setter here too, this code is threadsafe. However it may not be sufficiently granular; if you have 20 getters and setters and they're all synchronized, you may be creating a synchronization bottleneck.
In this specific instance, with a single int variable, then eliminating the 'synchronized' and marking the int field 'volatile' will also ensure visibility (each thread will see the latest value of 'val' when calling the getter) but it may not be synchronized enough for your needs. For example, expecting
int old = someThing.getVal();
if (old == 1) {
someThing.setVal(2);
}
to set val to 2 if and only if it's already 1 is incorrect. For this you need an external lock, or some atomic compare-and-set method.
I strongly suggest you read Java Concurrency In Practice by Brian Goetz et al, it has the best coverage of Java's concurrency constructs.
In addition to Cowan's comment, you could do the following for a compare and store:
synchronized(someThing) {
int old = someThing.getVal();
if (old == 1) {
someThing.setVal(2);
}
}
This works because the lock defined via a synchronized method is implicitly the same as the object's lock (see java language spec).
From my understanding you should use synchronized on both the getter and the setter methods, and that is sufficient.
Edit: Here is a link to some more information on synchronization and what not.
If your class contains just one variable, then another way of achieving thread-safety is to use the existing AtomicInteger object.
public class ThreadSafeSomeClass {
private final AtomicInteger value = new AtomicInteger(0);
public void setValue(int x){
value.set(x);
}
public int getValue(){
return value.get();
}
}
However, if you add additional variables such that they are dependent (state of one variable depends upon the state of another), then AtomicInteger won't work.
Echoing the suggestion to read "Java Concurrency in Practice".
For simple objects this may suffice. In most cases you should avoid the synchronized keyword because you may run into a synchronization deadlock.
Example:
public class SomeClass {
private Object mutex = new Object();
private int val = -1; // TODO: Adjust initialization to a reasonable start
// value
public int getVal() {
synchronized ( mutex ) {
return val;
}
}
private void setVal( int val ) {
synchronized ( mutex ) {
this.val = val;
}
}
}
Assures that only one thread reads or writes to the local instance member.
Read the book "Concurrent Programming in Java(tm): Design Principles and Patterns (Java (Addison-Wesley))", maybe http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html is also helpful...
Synchronization exists to protect against thread interference and memory consistency errors. By synchronizing on the getVal(), the code is guaranteeing that other synchronized methods on SomeClass do not also execute at the same time. Since there are no other synchronized methods, it isn't providing much value. Also note that reads and writes on primitives have atomic access. That means with careful programming, one doesn't need to synchronize the access to the field.
Read Sychronization.
Not really sure why this was dropped to -3. I'm simply summarizing what the Synchronization tutorial from Sun says (as well as my own experience).
Using simple atomic variable access is
more efficient than accessing these
variables through synchronized code,
but requires more care by the
programmer to avoid memory consistency
errors. Whether the extra effort is
worthwhile depends on the size and
complexity of the application.