I have a utility class with one static method that modifies the values in an input ArrayList. This static method is invoked by a caller that processes web service requests. For each request (one per thread), the caller creates a new ArrayList and invokes the static method.
public class Caller{
public void callingMethod(){
// getClonedCriteria() clones a preset search criteria list that has placeholders for values and returns a new ArrayList copy of the original. The code for the clone is not included here.
ArrayList<Properties> clonedCriteria = getClonedCriteria();
CriteriaUpdater.update(clonedCriteria , "key1", "old_value1", "key1_new_value");
CriteriaUpdater.update(clonedCriteria , "key2", "old_value2", "key2_new_value");
// do something with the modified criteria ArrayList after these calls
}
}
public class CriteriaUpdater
{
//updates the criteria, in the form of array of property objects, by replacing the token with the new value passed in
public static void update(ArrayList<Properties> criteria, String key, String token, String newValue)
{
for (Properties sc: criteria)
{
String oldValue = sc.getProperty(key);
if ((oldValue != null) && (oldValue.equals(token)))
sc.setProperty(key, newValue);
}
}
}
This is how the criteria are cloned:
public synchronized static ArrayList<Properties> cloneSearchCriteria(ArrayList<Properties> criteria) {
if (criteria == null) return null;
ArrayList<Properties> criteriaClone = new ArrayList<Properties>();
for (Properties sc : criteria) {
Properties clone = new Properties();
Enumeration propertyNames = sc.propertyNames();
while (propertyNames.hasMoreElements()) {
String key = (String) propertyNames.nextElement();
clone.put(key, (String) sc.get(key));
}
criteriaClone.add(clone);
}
return criteriaClone;
}
Given the above definitions, would the static method still process concurrent calls correctly if it is not synchronized? My understanding is that I have to synchronize this method for concurrency, but I wanted to confirm.
I understand that each thread has its own stack, but a static method is common to all threads, so would it not cause a problem if we don't synchronize?
Appreciate suggestions and any corrections.
Thanks
You have a race condition if several threads update the same Properties object. The underlying Properties data structure will never be corrupted, because Properties is thread-safe, but it could end up holding an incorrect value. In particular, any number of threads could be in this section at the same time, so the final value could come from any of them.
String oldValue = sc.getProperty(key);
if ((oldValue != null) && (oldValue.equals(token)))
sc.setProperty(key, newValue);
I am assuming your List is never altered while it is being iterated, but if it is, you have to synchronize. You could lock on the class, but locking on the collection you are altering might be a better choice.
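A minimal sketch of locking on the collection rather than on the class, assuming every thread that touches a given list goes through this method. The class name SynchronizedCriteriaUpdater is made up for illustration; the loop body is the one from the question.

```java
import java.util.List;
import java.util.Properties;

class SynchronizedCriteriaUpdater {
    // Replaces token with newValue under key in every Properties entry of the list.
    static void update(List<Properties> criteria, String key, String token, String newValue) {
        // Lock on the list being altered, so threads working on
        // different lists do not block each other.
        synchronized (criteria) {
            for (Properties sc : criteria) {
                String oldValue = sc.getProperty(key);
                if (oldValue != null && oldValue.equals(token)) {
                    sc.setProperty(key, newValue);
                }
            }
        }
    }
}
```

Locking on the class instead would serialize all callers, even those whose lists are completely unrelated.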
It all depends on your getClonedCriteria() method. That's the method that is accessing shared state.
You are creating a "deep copy" of the criteria, so that every clone is independent from the original and from each other.
But there's a more subtle problem, which is that whatever initialization is performed on the prototype criteria must happen-before any thread that reads the criteria to clone it. Otherwise, the cloning thread may read an uninitialized version of the data structure.
One way to achieve this is to initialize the prototype criteria in a static initializer and assign it to a class member variable. Another is to initialize the criteria and then assign it to a volatile variable. Or, you could initialize and assign the prototype (in either order) to an ordinary class or instance member variable inside a synchronized block (or using a Lock), and then read the variable from another block synchronized on the same lock.
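The first option could look roughly like the sketch below. The class name CriteriaPrototype and the sample key and token are assumptions for illustration; the question does not show how the prototype is actually built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class CriteriaPrototype {
    // Built once during class initialization, which the JVM completes
    // before any thread can call a method of this class.
    private static final List<Properties> PROTOTYPE = new ArrayList<>();

    static {
        Properties p = new Properties();
        p.setProperty("key1", "old_value1"); // placeholder token (hypothetical)
        PROTOTYPE.add(p);
    }

    // Returns a deep copy; the prototype itself is never modified afterwards.
    static List<Properties> getClonedCriteria() {
        List<Properties> copy = new ArrayList<>();
        for (Properties original : PROTOTYPE) {
            Properties clone = new Properties();
            for (String name : original.stringPropertyNames()) {
                clone.setProperty(name, original.getProperty(name));
            }
            copy.add(clone);
        }
        return copy;
    }
}
```

Because the prototype is only read after initialization, the cloning method itself needs no lock in this arrangement.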
You are correct in that each thread has its own stack, so each thread will have its own copies of local variables and method arguments when it calls update(). When it runs it will save those local variables and method arguments to its stack.
However, the method argument criteria is a reference to a mutable object that will be stored on the heap where Java objects reside. If the threads can call update() on the same ArrayList, or the elements contained in the ArrayList could be contained in more than one ArrayList passed into different invocations of update() by different threads then synchronization errors could occur.
Related
I have a Java class, here's its code:
public class MyClass {
private AtomicInteger currentIndex;
private List<String> list;
MyClass(List<String> list) {
this.list = list; // list is initialized only one time in this constructor and is not modified anywhere in the class
this.currentIndex = new AtomicInteger(0);
}
public String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
}
Now my question:
Is this class really thread-safe thanks to using an AtomicInteger alone, or must there be an additional thread-safety mechanism (for example, locks)?
The use of currentIndex.getAndIncrement() is perfectly thread-safe. However, you need a change to your code to make it thread-safe in all circumstances.
The fields currentIndex and list need to be made final to achieve full thread-safety, even on unsafe publication of the reference to your MyClass object.
private final AtomicInteger currentIndex;
private final List<String> list;
In practice, if you always ensure that your MyClass object itself is safely published, for example if you create it on the main thread, before any of the threads that use it are started, then you don't need the fields to be final.
Safe publication means that the reference to the MyClass object itself is made visible to other threads in a way that has a guaranteed ordering under the Java Memory Model.
It could be that:
All threads that use the reference get it from a field that was initialized by the thread that started them, before their thread was started
All threads that use the reference get it from a method that was synchronized on the same object as the code that set the reference (you have a synchronized getter and setter for the field)
You make the field that contains the reference volatile
It was in a final field if that final field was initialized as described in section 17.5 of the JLS.
There are a few more cases that are not easily used to publish references.
I think your code contains two bugs.
First, normally when you receive an object from some unknown source like your constructor does, you make a defensive copy to be certain it is not modified outside of the class.
MyClass(List<String> list) {
this.list = new ArrayList<String>( list );
}
So if you do this, do you now need to mutate that list anywhere inside the class? If so, the method:
public String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
isn't atomic. A thread could call getAndIncrement() and evaluate list.size(), then be swapped out while another thread removes an item from the list. The modulus is then computed against a stale size, and the resulting index may no longer be valid.
I think there's nothing for it but to add synchronized to the whole method:
public synchronized String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
And the same with any other mutator.
(final as the other poster mentions is still required on the instance fields.)
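Putting both answers together, here is a sketch for the case where the list is never mutated after construction, in which case select() needs no lock. The class name RoundRobinSelector is illustrative. Math.floorMod is an addition that neither answer mentions: once the counter overflows past Integer.MAX_VALUE, the % operator would produce a negative index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSelector {
    // final fields are visible to other threads even if the object is published unsafely
    private final AtomicInteger currentIndex = new AtomicInteger(0);
    private final List<String> list;

    RoundRobinSelector(List<String> list) {
        // Defensive, read-only copy: later changes to the caller's list cannot affect this object.
        this.list = Collections.unmodifiableList(new ArrayList<>(list));
    }

    String select() {
        // floorMod keeps the index non-negative after the counter wraps around.
        return list.get(Math.floorMod(currentIndex.getAndIncrement(), list.size()));
    }
}
```

If the list does have to change after construction, the synchronized version above (or a concurrent collection) is still needed.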
Take this code:
public class MyClass {
private final Object _lock = new Object();
private final MyMutableClass _mutableObject = new MyMutableClass();
public void myMethod() {
synchronized(_lock) { // we are synchronizing on instance variable _lock
// do something with mutableVar
//(i.e. call a "set" method on _mutableObject)
}
}
}
now, imagine delegating the code inside myMethod() to some helper class where you pass the lock
public class HelperClass {
public void helperMethod(Object lockVar, MyMutableClass mutableVar) {
synchronized(lockVar) { // we are now synchronizing on a method param,
// each thread has own copy
// do something with mutableVar
// (i.e. call a "set" method on mutableVar)
}
}
}
can "myMethod" be re-written to use the HelperClass by passing its lock var, so that everything is still thread safe? i.e.,
public void myMethod() {
_helperObject.helperMethod(_lock, _mutableObject);
}
I am not sure about this, because Java will pass the lockVar by value, and every thread will get a separate copy of lockVar (even though each copy points to the same object on the heap). I guess the question comes down to how 'synchronized' keyword works -- does it lock on the variable, or the value on the heap that the variable references?
Synchronization is done upon objects, not variables.
Variables/members [sometimes] contain objects and it is the resulting object contained in [variable] x that is actually synchronized upon in synchronized(x).
There are a few other issues with thread-visibility of variables (e.g. might read a "stale" object from a variable), but that does not apply here: there is no re-assignment of _lock and the visibility of the initial ("final") assignment is guaranteed. Because of this it is guaranteed that, in this case, the method parameter will always contain the correct (same) object used for the synchronization.
If the lock object used (where presumably _lock is not final) changes, however, then that would require re-evaluation of the appropriate values/thread-visibility but otherwise does not differ from any cross-thread access.
Happy coding.
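A small way to check this for yourself, using Thread.holdsLock. The names LockHelper and LockOwner are made up for the sketch; the helper receives a copy of the reference, yet the monitor it acquires belongs to the same object the owner's field refers to.

```java
import java.util.function.BooleanSupplier;

class LockHelper {
    // Synchronizes on the object the parameter refers to, then runs the probe while holding it.
    static boolean helperMethod(Object lockVar, BooleanSupplier probe) {
        synchronized (lockVar) {
            return probe.getAsBoolean();
        }
    }
}

class LockOwner {
    private final Object lock = new Object();

    // Asks, from inside the helper's synchronized block, whether this thread holds the field's monitor.
    boolean helperHoldsMyLock() {
        return LockHelper.helperMethod(lock, () -> Thread.holdsLock(lock));
    }

    // Same question asked outside any synchronized block.
    boolean holdsLockOutsideHelper() {
        return Thread.holdsLock(lock);
    }
}
```

The first method reports that the lock is held and the second that it is not, which is what you would expect if synchronized works on the object rather than on the variable.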
I'm building a simple program that runs on multiple threads.
My question is mostly about understanding: when do I have to use the reserved word synchronized?
Do I need to use it on every method that affects the object's instance variables?
I know I can put it on any method that is not static, but I want to understand more.
Thank you!
Here is the code:
public class Container {
// *** data members ***
public static final int INIT_SIZE=10; // the first (init) size of the set.
public static final int RESCALE=10; // the re-scale factor of this set.
private int _sp=0;
public Object[] _data;
/************ Constructors ************/
public Container(){
_sp=0;
_data = new Object[INIT_SIZE];
}
public Container(Container other) { // copy constructor
this();
for(int i=0;i<other.size();i++) this.add(other.at(i));
}
/** return true if this collection is empty, else return false. */
public synchronized boolean isEmpty() {return _sp==0;}
/** add an Object to this set */
public synchronized void add (Object p){
if (_sp==_data.length) rescale(RESCALE);
_data[_sp] = p; // shallow copy semantics.
_sp++;
}
/** returns the actual amount of Objects contained in this collection */
public synchronized int size() {return _sp;}
/** returns true if this container contains an element which is equals to ob */
public synchronized boolean isMember(Object ob) {
return get(ob)!=-1;
}
/** return the index of the first object which equals ob, if none returns -1 */
public synchronized int get(Object ob) {
int ans=-1;
for(int i=0;i<size();i=i+1)
if(at(i).equals(ob)) return i;
return ans;
}
/** returns the element located at the ind place in this container (null if out of range) */
public synchronized Object at(int p){
if (p>=0 && p<size()) return _data[p];
else return null;
}
Making a class safe for multi-threaded access is a complex subject. If you are not doing it in order to learn about threading, you should try to find a library that does it for you.
Having said that, a place to start is to imagine two separate threads executing a method line by line, in alternating fashion, and see what would go wrong. For example, without synchronized the add() method above would be vulnerable to lost data. Imagine thread1 and thread2 calling add() at more or less the same time. If thread1 runs line 2 (_data[_sp] = p;) and, before it gets to line 3 (_sp++;), thread2 runs line 2, then thread2 will overwrite thread1's value. Thus you need some way to prevent the threads from interleaving like that. The isEmpty() method, on the other hand, cannot be interleaved in this way, since it is a single comparison of a value with 0; it still needs synchronized (or a volatile field) if it must be guaranteed to see the latest value written by another thread. Again, it is hard to get this stuff right.
You can check the following documentation about synchronized methods: http://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
By adding the synchronized keyword two things are guaranteed to happen:
First, it is not possible for two invocations of synchronized methods on the same object to interleave. When one thread is executing a synchronized method for an object, all other threads that invoke synchronized methods for the same object block (suspend execution) until the first thread is done with the object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.
So whenever you need to guarantee that only one thread accesses your variable at a time to read/write it to avoid consistency issues, one way is to make your method synchronized.
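As a rough illustration of both guarantees, the sketch below has two threads increment a shared counter through synchronized methods. The class and method names are invented for the example.

```java
class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time can run either synchronized method on the same instance.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each increment perThread times and returns the final count.
    static int runTwoThreads(int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

With synchronized in place the result is always twice perThread; if you remove the keyword from increment(), updates can be lost because count++ is a read, an add and a write.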
My advice to you is to first read Oracle's concurrency tutorial.
A few comments:
Having all your methods synchronized can cause bottlenecks.
Making the _data variable public is bad practice and makes concurrent programming harder.
It seems that you are reimplementing a collection; it is better to use Java's existing concurrent collections.
Variable names would better not begin with _
Avoid adding comments to your code and try to have declarative method names.
+1 for everybody who said read a tutorial, but here's a summary anyway.
You need mutual exclusion (i.e., synchronized blocks) whenever it is possible for one thread to create a temporary situation that other threads must not be allowed to see. Suppose you have objects stored in a search tree. A method that adds a new object to the tree probably will have to reassign several object references, and until it finishes its work, the tree will be in an invalid state. If one thread is allowed to search the tree while another thread is in the add() method, then the search() function may return an incorrect result, or worse (maybe crash the program.)
One solution is to synchronize the add() method, and the search() method, and any other method that depends on the tree structure. All must be synchronized on the same object (the root node of the tree would be an obvious choice).
Java guarantees that no more than one thread can be synchronized on the same object at any given time. Therefore, no more than one thread will be able to see or change the internals of the tree at the same time, and the temporary invalid state created inside the add() method will be harmless.
My example above explains the principle of mutual exclusion, but it is a simplistic and inefficient solution to protecting a search tree. A more practical approach would use reader/writer locks, and synchronize only on interesting parts of the tree rather than on the whole thing. Practical synchronization of complex data structures is a hard problem, and whenever possible, you should let somebody else solve it for you. E.g., If you use the container classes in java.util.concurrent instead of creating your own data structures, you'll probably save yourself a lot of work (and maybe a whole lot of debugging).
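A minimal sketch of the reader/writer idea, using a TreeSet as a stand-in for the search tree and a single ReentrantReadWriteLock for the whole structure (so it does not attempt the finer-grained per-subtree locking mentioned above). GuardedTree is a hypothetical name.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class GuardedTree {
    private final TreeSet<String> tree = new TreeSet<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers are exclusive: no reader can observe the tree mid-update.
    void add(String value) {
        lock.writeLock().lock();
        try {
            tree.add(value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Any number of readers may search at the same time, as long as no writer is active.
    boolean search(String value) {
        lock.readLock().lock();
        try {
            return tree.contains(value);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

For a sorted set specifically, java.util.concurrent.ConcurrentSkipListSet already provides thread-safe add and contains, so a wrapper like this is only worth writing when no ready-made class fits.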
You need to protect the variables that form the object's state. If these variables are used in a static method, you have to protect them there as well. But be careful: the following example is wrong:
private static int stateVariable = 0;
//wrong!!!!
public static synchronized void increment() {
stateVariable++;
}
public synchronized int getValue() {
return stateVariable;
}
It seems that the above is safe, but these methods operate on different locks. The above more or less corresponds to the following:
private static int stateVariable = 0;
//wrong!!!!
public static void increment() {
synchronized (YourClassName.class) {
stateVariable++;
}
}
public int getValue() {
synchronized (this) {
return stateVariable;
}
}
Notice that different locks are used when mixing static and object methods.
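One possible fix, sketched under the assumption that the static field really has to be read from an instance method: make both methods use the class object as the lock. SharedCounter is a placeholder name.

```java
class SharedCounter {
    private static int stateVariable = 0;

    // A static synchronized method locks SharedCounter.class.
    static synchronized void increment() {
        stateVariable++;
    }

    // The instance method explicitly takes the same lock instead of locking on this.
    int getValue() {
        synchronized (SharedCounter.class) {
            return stateVariable;
        }
    }
}
```

Making getValue() static and synchronized, or replacing the int with an AtomicInteger, would work equally well here.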
I have a Runnable class like:
class R1 implements Runnable {
private static final Log LOGGER = LogFactory.getLog(R1.class);
private final ObjectClass obj;
private final SomeService service;
public R1(ObjectClass obj, SomeService service) {
this.obj = obj;
this.service = service;
}
@Override
public void run() {
String value = this.obj.getSomeValue();
LOGGER.debug("Value is " + value);
// some actions, such as:
// service.someMethod(obj);
}
}
I use an ExecutorService object to execute R1 and put R1 in a queue.
But later, outside R1, I change the value in the ObjectClass instance that I passed to R1, so the actions in R1 after getSomeValue() don't behave as I expect. If I want to keep the value of the ObjectClass object in R1 unchanged, what can I do? Suppose the object is big and has a lot of get and set methods.
To make the problem clearer: I need to pass obj into a service class object, which is also passed as a parameter to the Runnable class. I have changed the original code accordingly.
As per comments, apparently my suggested solution has problems.
As such, follow the other suggestions about creating a new instance and copying the properties you require across. Or create a lightweight data object that holds the properties you require. Either way, I believe you need 2 instances to do what you want.
I suggest you could implement clone method that creates a new instance.
http://download.oracle.com/javase/1,5,0/docs/api/java/lang/Cloneable.html
The problem here is that you have passed the instance into your R1 class, but it is still the same single instance, so changes to it will affect everything else. Implementing a clone method will allow you to easily create a copy of your instance that can be used in your R1 class, while allowing you to make further changes to your original.
In your R1 class,
public R1(ObjectClass obj) {
//this.obj = obj;
this.obj = obj.clone();
}
P.S. you must implement this method yourself. It won't just automatically give you a deep copy.
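A sketch of what such a clone method might look like. CloneableObjectClass stands in for the question's ObjectClass, and its fields (someValue and a list of tags) are invented for the example, since the real class is not shown.

```java
import java.util.ArrayList;
import java.util.List;

class CloneableObjectClass implements Cloneable {
    private String someValue;                      // hypothetical field
    private List<String> tags = new ArrayList<>(); // hypothetical mutable field

    String getSomeValue() { return someValue; }
    void setSomeValue(String someValue) { this.someValue = someValue; }
    void addTag(String tag) { tags.add(tag); }
    List<String> getTags() { return new ArrayList<>(tags); }

    @Override
    public CloneableObjectClass clone() {
        try {
            // super.clone() copies fields one by one, so the list would still be shared...
            CloneableObjectClass copy = (CloneableObjectClass) super.clone();
            // ...which is why mutable fields are copied explicitly.
            copy.tags = new ArrayList<>(tags);
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class implements Cloneable
        }
    }
}
```

Strings are immutable, so someValue needs no special handling; any other mutable field would need the same treatment as tags.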
Depending on the nature of your program, there are a couple options.
You could "Override Clone Judiciously" (Item 11 in Effective Java) and clone the object before handing it to the runnable. If overriding clone doesn't work for you, it might be better to do one of the following:
Create a new instance of the object manually and copy the values from obj.
Pass only a subset of the data contained in obj. So instead of passing obj into the constructor, you would pass in someValue. I would advocate this method, so that you only supply R1 with the data it needs, and not the entire object.
Alternatively, if it doesn't matter that the data in obj changes before R1 is executed, then you only need to make sure that obj doesn't change while R1 is executing. In this case, you could add the synchronized keyword to the getSomeValue() method and to the methods that modify obj, and then have R1 synchronize on obj like so:
@Override
public void run() {
String value;
synchronized (obj) {
value = obj.getSomeValue();
}
// some actions.
}
Pass the object to the constructor, and don't keep a reference to it.
If the object is too big, maybe an immutable parameter object with just enough data and methods is better.
If possible, try making your ObjectClass immutable. (no state changes supported). In Java you have to "do this yourself"; there's no notion of 'const' object (as in C++)
Perhaps you can have your orig ObjectClass but create a new class ImmutableObjectClass which takes your orig in the ctor.
Assumption: You don't care if R1 operates on old data.
You can then change your code to:
public class R1 implements Runnable {
private final String value;
// Option 1: Pull out the String in the constructor.
public R1(ObjectClass obj) {
this.value = obj.getSomeValue(); // Now it is immutable
}
// Option 2: Pass the String directly into the constructor.
public R1(String value) {
this.value = value; // This constructor has no coupling
}
@Override public void run() {
// Do stuff with value
}
}
If you want R1 to operate on the latest data, as opposed to what the data was when you constructed it, then you will need some type of synchronisation between R1 and the data modification.
The "problem" here is that under the Java Memory Model, threads may (and do) cache field values. This means that if one thread updates a field (of the ObjectClass object), other threads are not guaranteed to "see" the change; they may still be looking at their cached (stale) value.
In order to make changes visible across threads you have two options:
Make the fields you'll be changing in ObjectClass volatile. The volatile keyword forces threads not to cache the field's value (i.e. they always read the latest value).
Synchronize access, both read and write, to the fields. All changes made within a synchronized block are visible to other threads that subsequently synchronize on the same lock object (if you synchronize methods, the this object is used as the lock).
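A small sketch of the first option. VolatileFlag and its members are illustrative names; a worker thread spins until it observes the update made by the calling thread.

```java
class VolatileFlag {
    // volatile: a write by one thread is visible to subsequent reads by other threads.
    private volatile boolean done = false;

    void finish() {
        done = true;
    }

    // Starts a worker that waits for done to become true and reports
    // whether it stopped within timeoutMillis.
    static boolean workerSeesUpdate(long timeoutMillis) {
        VolatileFlag flag = new VolatileFlag();
        Thread worker = new Thread(() -> {
            while (!flag.done) {
                Thread.yield();
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        flag.finish();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }
}
```

Without volatile the loop is allowed to keep reading a stale false, so the worker might never terminate. Note that volatile only addresses visibility; compound updates such as count++ still need synchronized or an atomic class.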