How to make this function thread safe? - java

public class Sol {
static Map<Integer, List<String>> emap;
static List<Integer> sortSalaries(List<List<String>> workers) {
List<Integer> res = new ArrayList<Integer>();
emap = new HashMap<>();
for (List<String> e: workers)
emap.put(Integer.parseInt(e.get(0)), e);
for(List<String> worker: workers )
{
//accessing workers
.....
}
Collections.sort(res);
return res;
}
public static int dfs(int eid) {
List<String> employee = emap.get(eid);
int salary=0;
String ans = employee.get(3);
for (int i=0;i<ans.length();i=i+2)
{
// accessing emap
......
}
return salary;
}
}
Do I have to use the synchronized keyword to make it thread safe? Do I have to use Vector and Hashtable if the method is synchronized?
Alternatively, what if I use Vector and Hashtable, move the emap variable into sortSalaries(), and pass it to dfs()? Is it okay not to use the synchronized keyword in that case?

I asked in a comment whether you understand why these methods are not thread-safe when called from multiple threads. You pointed me to a link without saying whether you understood it or why you think your class is not thread-safe, so I am providing a little background instead of answering the question directly.
A Bit of Short Discussion
Any class or its methods can become thread-unsafe once data is shared among the calling threads. A class is thread-safe by default if no data is shared among threads, so the easiest way to make your class thread-safe is to stop sharing data. In your case that means removing emap (it is class state used by the methods) and possibly List<List<String>> workers (I am not sure about this one: it is a reference passed in by the caller, so different calls may work on the same instance or on different instances), and replacing them with method-local variables.
Method-local variables are thread-safe by default, since each call gets its own variables, provided the objects they refer to are not handed to other threads.
If you can't do that, or it is not feasible, follow oleg.cherednik's answer and synchronize access to emap, either at block level or at method level. Remember that there are several ways to synchronize in Java, the synchronized keyword being the easiest.
Now for the method parameters, List<List<String>> workers and int eid: synchronization for eid is not needed, since it is a primitive that is passed by value and you only read it.
Synchronization of access to List<List<String>> workers is needed if you pass the same list instance to calls of this method from different threads. Refer to Gray's answer; this point is missed in oleg.cherednik's answer. You are the better judge of whether synchronization is needed for this reference.
It is easy to assume that list iteration is thread-safe (since you are not updating the list), but that might not always be true. Refer to this question and all its answers for a detailed discussion.
So the summary is this: start implementing thread safety for your class by analyzing whether any objects are shared among threads. If objects are shared, reads and writes to those objects need to be synchronized (to make them atomic, and provided those objects are not already thread-safe). If no objects are shared, the class is already thread-safe. Also, try to build your classes on data structures that are already thread-safe; that way you will have less work to do.
The java.lang.NullPointerException (NPE) point of oleg.cherednik's answer stands too.
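The "stop sharing data" advice above can be sketched as follows. This is only an illustration under assumptions: the original loop bodies are elided, so the record layout [id, name, salary] and the helper salaryOf (a stand-in for dfs) are hypothetical. What it shows is emap becoming a local variable that is passed as a parameter instead of living in a static field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LocalStateSol {
    // Assumed record layout, for illustration only: [id, name, salary].
    static List<Integer> sortSalaries(List<List<String>> workers) {
        // Local map: every call builds its own instance, so no state is shared between calls.
        Map<Integer, List<String>> emap = new HashMap<>();
        for (List<String> e : workers) {
            emap.put(Integer.parseInt(e.get(0)), e);
        }
        List<Integer> res = new ArrayList<>();
        for (List<String> worker : workers) {
            res.add(salaryOf(emap, Integer.parseInt(worker.get(0))));
        }
        Collections.sort(res);
        return res;
    }

    // Hypothetical stand-in for dfs(): receives the map as a parameter
    // instead of reading a static field.
    static int salaryOf(Map<Integer, List<String>> emap, int eid) {
        return Integer.parseInt(emap.get(eid).get(2));
    }
}
```

With this shape neither method needs synchronized, as long as no other thread modifies the workers list while a call is running.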

Protect emap from outside access (make it private)
Initialize emap up front to rule out the NPE
Example:
public final class Sol {
private static final Map<Integer, List<String>> emap = new HashMap<>();
static List<Integer> sortSalaries(List<List<String>> workers) {
synchronized (Sol.class) { // same monitor that the static synchronized dfs() uses
for (List<String> e : workers)
emap.put(Integer.parseInt(e.get(0)), e);
}
// do smth, not access emap
}
public static synchronized int dfs(int eid) {
// do smth with accessing emap
}
}
In sortSalaries you can limit the synchronized block to the for loop. In dfs you access emap in several places in the method, and therefore you have to synchronize the entire method.
Using either ConcurrentHashMap or Vector does not help here, because between getting and setting elements of the collection they could be changed, which is not OK for the dfs method: emap should be frozen while dfs is running.

Related

Thread Safety in Java Using Atomic Variables

I have a Java class, here's its code:
public class MyClass {
private AtomicInteger currentIndex;
private List<String> list;
MyClass(List<String> list) {
this.list = list; // list is initialized only one time in this constructor and is not modified anywhere in the class
this.currentIndex = new AtomicInteger(0);
}
public String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
}
Now my question:
Is this class really thread safe thanks to using an AtomicInteger only, or must there be an additional thread-safety mechanism (for example locks)?
The use of currentIndex.getAndIncrement() is perfectly thread-safe. However, you need a change to your code to make it thread-safe in all circumstances.
The fields currentIndex and list need to be made final to achieve full thread-safety, even on unsafe publication of the reference to your MyClass object.
private final AtomicInteger currentIndex;
private final List<String> list;
In practice, if you always ensure that your MyClass object itself is safely published, for example if you create it on the main thread, before any of the threads that use it are started, then you don't need the fields to be final.
Safe publication means that the reference to the MyClass object itself is made visible to other threads in a way that has a guaranteed multi-threaded ordering in the Java Memory Model.
It could be that:
All threads that use the reference get it from a field that was initialized by the thread that started them, before their thread was started
All threads that use the reference get it from a method that was synchronized on the same object as the code that set the reference (you have a synchronized getter and setter for the field)
You make the field that contains the reference volatile
It was in a final field if that final field was initialized as described in section 17.5 of the JLS.
A few more cases that are not easily used to publish references
I think your code contains two bugs.
First, normally when you receive an object from some unknown source like your constructor does, you make a defensive copy to be certain it is not modified outside of the class.
MyClass(List<String> list) {
this.list = new ArrayList<String>( list );
}
So if you do this, do you now need to mutate that list anywhere inside the class? If so, the method:
public String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
isn't atomic. What could happen is that a thread calls getAndIncrement() and evaluates list.size() for the modulus (%). If it is then swapped out and another thread removes an item from the list, the index computed from the old list.size() may no longer be valid.
I think there's nothing for it but to add synchronized to the whole method:
public synchronized String select() {
return list.get(currentIndex.getAndIncrement() % list.size());
}
And the same with any other mutator.
(final as the other poster mentions is still required on the instance fields.)
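Putting the points from both answers together, a selector whose list never changes could look roughly like the sketch below. The class name RoundRobinSelector is made up for illustration, and List.copyOf assumes Java 10 or later (it also rejects null elements). Math.floorMod is an addition not discussed above: it keeps the index non-negative once the counter overflows past Integer.MAX_VALUE, where the plain % operator would produce a negative index.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinSelector {
    // final fields: visible to all threads once the constructor has finished
    private final AtomicInteger currentIndex = new AtomicInteger(0);
    private final List<String> list;

    RoundRobinSelector(List<String> source) {
        if (source.isEmpty()) {
            throw new IllegalArgumentException("list must not be empty");
        }
        // Defensive, unmodifiable copy: later changes to 'source' cannot affect this object.
        this.list = List.copyOf(source);
    }

    String select() {
        // floorMod stays in [0, size) even after the counter wraps to negative values.
        int index = Math.floorMod(currentIndex.getAndIncrement(), list.size());
        return list.get(index);
    }
}
```

Because the list can no longer change, select() needs no lock here; if the class did mutate the list, the synchronized variant from the answer above would still be required.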

Is this static method thread safe or is synchronization needed

I have a utility class that has one static method to modify values of the input Array List. This static method is invoked by a caller. The caller is used to process web service requests. For each request(per thread), the caller creates a new ArrayList and invokes the static method.
public class Caller{
public void callingMethod(){
//Get Cloned criteria clones a preset search criteria that has place holders for values and returns a new ArrayList of the original criteria. Not included code for the clone
ArrayList<Properties> clonedCriteria = getClonedCriteria();
CriteriaUpdater.update(clonedCriteria , "key1", "old_value1", "key1_new_value");
CriteriaUpdater.update(clonedCriteria , "key2", "old_value2", "key2_new_value");
//do something after this call with the modified criteria arraylist
}
}
public class CriteriaUpdater
{
//updates the criteria, in the form of array of property objects, by replacing the token with the new value passed in
public static void update(ArrayList<Properties> criteria, String key, String token, String newValue)
{
for (Properties sc: criteria)
{
String oldValue = sc.getProperty(key);
if ((oldValue != null) && (oldValue.equals(token)))
sc.setProperty(key, newValue);
}
}
}
This is how the criteria are cloned:
public synchronized static ArrayList<Properties> cloneSearchCriteria(ArrayList<Properties> criteria) {
if (criteria == null) return null;
ArrayList<Properties> criteriaClone = new ArrayList<Properties>();
for (Properties sc : criteria) {
Properties clone = new Properties();
Enumeration propertyNames = sc.propertyNames();
while (propertyNames.hasMoreElements()) {
String key = (String) propertyNames.nextElement();
clone.put(key, (String) sc.get(key));
}
criteriaClone.add(clone);
}
return criteriaClone;
}
Given the above definitions, would the static method still process concurrent calls correctly without being synchronized? My understanding is that I have to synchronize this method for concurrency, but I wanted to confirm.
I understand each thread will have its own stack, but a static method is common to all threads, so if we don't synchronize, would it not cause a problem?
Appreciate suggestions and any corrections.
Thanks
You have a problem with a race condition. At least the underlying Properties data structure will never be corrupted but it could have an incorrect value. In particular, any number of threads could be in this section meaning the final value could be anything from any thread.
String oldValue = sc.getProperty(key);
if ((oldValue != null) && (oldValue.equals(token)))
sc.setProperty(key, newValue);
I am assuming your List is never altered, but if it is, you have to have synchronized. You could lock on the class, but locking on the collection you are altering might be a better choice.
It all depends on your getClonedCriteria() method. That's the method that is accessing shared state.
You are creating a "deep copy" of the criteria, so that every clone is independent from the original and from each other.
But there's a more subtle problem, which is that whatever initialization is performed on the prototype criteria must happen-before any thread that reads the criteria to clone it. Otherwise, the cloning thread may read an uninitialized version of the data structure.
One way to achieve this is to initialize the prototype criteria in a static initializer and assign it to a class member variable. Another is to initialize the criteria and then assign it to a volatile variable. Or, you could initialize and assign the prototype (in either order) to an ordinary class or instance member variable inside a synchronized block (or using a Lock), and then read the variable from another block synchronized on the same lock.
You are correct in that each thread has its own stack, so each thread will have its own copies of local variables and method arguments when it calls update(). When it runs it will save those local variables and method arguments to its stack.
However, the method argument criteria is a reference to a mutable object that will be stored on the heap where Java objects reside. If the threads can call update() on the same ArrayList, or the elements contained in the ArrayList could be contained in more than one ArrayList passed into different invocations of update() by different threads then synchronization errors could occur.
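If the same list really can reach update() from several threads, locking on the collection, as suggested above, might look like the following sketch. The class name CriteriaUpdaterSketch is hypothetical, and the approach only works if every thread that touches the list synchronizes on that same list object.

```java
import java.util.List;
import java.util.Properties;

final class CriteriaUpdaterSketch {
    static void update(List<Properties> criteria, String key, String token, String newValue) {
        // Lock on the list being altered, so the read-compare-write sequence
        // below cannot interleave with another thread holding the same lock.
        synchronized (criteria) {
            for (Properties sc : criteria) {
                String oldValue = sc.getProperty(key);
                if (oldValue != null && oldValue.equals(token)) {
                    sc.setProperty(key, newValue);
                }
            }
        }
    }
}
```

When each request works on its own deep copy, as in the question, the lock is never contended and is not strictly needed; it only matters once a list or its Properties elements are shared.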

How to avoid synchronization on a non-final field?

If we have 2 classes that operate on the same object under different threads and we want to avoid race conditions, we'll have to use synchronized blocks with the same monitor like in the example below:
class A {
private DataObject mData; // will be used as monitor
// thread 3
public void setObject(DataObject object) {
mData = object;
}
// thread 1
void operateOnData() {
synchronized(mData) {
mData.doSomething();
.....
mData.doSomethingElse();
}
}
}
class B {
private DataObject mData; // will be used as monitor
// thread 3
public void setObject(DataObject object) {
mData = object;
}
// thread 2
void processData() {
synchronized(mData) {
mData.foo();
....
mData.bar();
}
}
}
The object we'll operate on, will be set by calling setObject() and it will not change afterwards. We'll use the object as a monitor. However, intelliJ will warn about synchronization on a non-final field.
In this particular scenario, is synchronizing on the non-final field an acceptable solution?
Another problem with the above approach is that it is not guaranteed that the monitor (mData) will be observed by thread 1 or thread 2 after it is set by thread 3, because a "happens-before" relationship hasn't been established between setting and reading the monitor. It could be still observed as null by thread 1 for example. Is my speculation correct?
Regarding possible solutions, making the DataObject thread-safe is not an option. Setting the monitor in the constructor of the classes and declaring it final can work.
EDIT Semantically, the mutual exclusion needed is related to the DataObject. This is the reason that I don't want to have a secondary monitor. One solution would be to add lock() and unlock() methods on DataObject that need to be called before working on it. Internally they would use a Lock Object. So, the operateOnData() method becomes:
void operateOnData() {
mData.lock();
mData.doSomething();
.....
mData.doSomethingElse();
mData.unlock();
}
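A sketch of that lock()/unlock() idea, assuming the lock is a java.util.concurrent.locks.ReentrantLock held inside the data object. LockableData and DataWorker are hypothetical names, and the counter stands in for whatever state DataObject really has. The try/finally matters: without it, an exception between lock() and unlock() would leave the lock held forever.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for DataObject that exposes its own lock.
final class LockableData {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    void lock() { lock.lock(); }
    void unlock() { lock.unlock(); }

    // Callers are expected to hold the lock while calling this.
    void increment() { value++; }

    int value() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}

final class DataWorker {
    private final LockableData data;

    DataWorker(LockableData data) { this.data = data; }

    void operateOnData() {
        data.lock();
        try {
            data.increment();
            data.increment();
        } finally {
            data.unlock(); // always released, even if the body throws
        }
    }
}
```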
You may create a wrapper
class Wrapper
{
private DataObject mData;
synchronized public void setObject(DataObject mData)
{
if(this.mData!=null) throw new IllegalStateException("already set");
this.mData = mData;
}
synchronized public void doSomething()
{
if(mData==null) throw new IllegalStateException("not set");
mData.doSomething();
}
}
A wrapper object is created and passed to A and B
class A
{
private Wrapper wrapper; // set by constructor
// thread 1
void operateOnData()
{
wrapper.doSomething();
}
}
Thread 3 also has a reference to the wrapper; it calls setObject() when it's available.
Some platforms provide explicit memory-barrier primitives which ensure that if one thread writes to a field and then executes a write barrier, any thread which has never examined the object in question is guaranteed to see the effect of that write. Unfortunately, as of the last time I asked such a question (Cheapest way of establishing happens-before with non-final field), the only case in which Java could offer any guarantee of threading semantics without requiring special action on the part of the reading thread was the use of final fields. Java guarantees that any access made to an object through a final field will see any stores which were performed to final or non-final fields of that object before the reference was stored in the final field, but that relationship is not transitive. Thus, given
class c1 { public final c2 f;
public c1(c2 ff) { f=ff; }
}
class c2 { public int[] arr; }
class c3 { public static c1 r; public static c2 f; }
If the only thing that ever writes to c3 is a thread which performs the code:
c2 cc = new c2();
cc.arr = new int[1];
cc.arr[0] = 1234;
c3.r = new c1(cc);
c3.f = c3.r.f;
a second thread performs:
int i1=-1;
if (c3.r != null) i1=c3.r.f.arr[0];
and a third thread performs:
int i2=-1;
if (c3.f != null) i2=c3.f.arr[0];
The Java standard guarantees that the second thread will, if the if condition yields true, set i1 to 1234. The third thread, however, might see a non-null value for c3.f and yet see a null value for c3.f.arr or see zero in c3.f.arr[0]. Even though the value stored into c3.f was read from c3.r.f, and anything that reads through the final reference c3.r.f is required to see all changes made to the referenced object before c3.r.f was written, nothing in the Java standard would forbid the JIT from rearranging the first thread's code as:
c2 cc = new c2();
c3.f = cc;
cc.arr = new int[1];
cc.arr[0] = 1234;
c3.r = new c1(cc);
Such a rewrite wouldn't affect the second thread, but could wreak havoc with the third.
A simple solution is to just define a public static final object to use as the lock. Declare it like this:
/** Used to sync access to the {@link #mData} field */
public static final Object mDataLock = new Object();
Then in the program synchronize on mDataLock instead of mData.
This is very useful, because in the future someone may change mData so that its value does change, and then your code would have a slew of weird threading bugs.
This method of synchronization removes that possibility. It also is really low cost.
Also having the lock be static means that all instances of the class share a single lock. In this case, that seems like what you want.
Note that if you have many instances of these classes, this could become a bottleneck. Since all of the instances are now sharing a lock, only a single instance can change any mData at a single time. All other instances have to wait.
In general, I think something like a wrapper for the data you want to synchronize is a better approach, but I think this will work.
This is especially true if you have multiple concurrent instances of these classes.
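For comparison, the option mentioned in the question, setting the monitor in the constructor and declaring it final, could be sketched like this. SharedData, OperatorA and OperatorB are hypothetical stand-ins for DataObject, A and B. Unlike a static lock, each data object acts as its own monitor, so unrelated instances do not block each other.

```java
// Hypothetical stand-in for DataObject; not thread-safe by itself.
final class SharedData {
    private int counter;

    void step() { counter++; }

    // Callers should hold the monitor of this object when reading from several threads.
    int counter() { return counter; }
}

final class OperatorA {
    private final SharedData data; // final: the monitor can never be swapped out

    OperatorA(SharedData data) { this.data = data; }

    void operateOnData() {
        synchronized (data) {
            data.step();
            data.step();
        }
    }
}

final class OperatorB {
    private final SharedData data;

    OperatorB(SharedData data) { this.data = data; }

    void processData() {
        synchronized (data) { // same object, hence same monitor as OperatorA
            data.step();
        }
    }
}
```

This only works when the data object is available at construction time, which is the restriction the wrapper approach above avoids.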

Java synchronize on object

How to synchronize two different methods from the same class in order to lock on the same object? Here is an example:
public class MyClass extends Thread implements Observer{
public List<AnotherClass> myList = null;
public MyClass(List<AnotherClass> myList){
this.myList = myList;
}
public void run(){
while(true){
//Do some stuff
myList.add(NotImportantElement);
}
}
public void doJob(){
for(int i=0; i<myList.size(); i++){
AnotherClass x = myList.get(i);
//Do some more stuff
}
}
}
The question is: how can I stop run() from accessing myList while doJob() is executing, and vice versa?
Imagine this: I start the thread and start adding elements to my list. At a random moment I call doJob() from another class that holds a reference to my thread.
How should I do the lock? Thanks!
Later edit:
Ok, I understood the concept of the lock, but now I have another question.
Suppose I have a class with public static myList and only one instance of that class. From that instance I create n instances of Thread that take every element of that list and do some stuff with it.
Now, at a specific moment, myList is updated. What happens with those Threads that already were processing myList elements? How should I lock access on myList while updating it?
NOTE: This code assumes you only have one instance of MyClass. According to your post, that sounds like the case.
public class MyClass extends Thread implements Observer{
private final List<AnotherClass> myList;
private final Object lock = new Object();
public MyClass(List<AnotherClass> myList){
this.myList = new ArrayList<>(myList);
}
public void run(){
while(true){
//Do some stuff
synchronized(lock) {
myList.add(NotImportantElement);
}
}
}
public void doJob(){
synchronized(lock) {
for(int i=0; i<myList.size(); i++){
AnotherClass x = myList.get(i);
//Do some more stuff
}
}
}
}
EDIT: Added making a copy of List so that external entities could not change the list as per JB Nizet
EDIT 2: Made variables private so nobody else can access them
You can:
Declare both run and doJob synchronized. This will use this as lock;
Declare list as final and synchronize on it. This will use list as lock. Declaring the lock field as final is good practice. This way some methods of your class may synchronize on one object, while other methods can use another object for synchronization. This reduces lock contention but increases code complexity;
Introduce an explicit java.util.concurrent.locks.Lock variable and use its methods for synchronization. This will improve code flexibility, but will increase code complexity as well;
Don't do explicit synchronization altogether and instead employ some thread-safe data structure from JDK. For example, BlockingQueue or CopyOnWriteArrayList. This will reduce code complexity and ensure thread safety.
Employ synchronization by reads/writes to volatile field. See this SO post. This will ensure safety, but will increase complexity greatly. On the second thought, don't do this :)
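You can either add the synchronized keyword to both methods or use a synchronized(MyClass.class) { } block.
For static methods the former locks on the MyClass.class object as well; for instance methods such as run() and doJob() it locks on this. Either way it is not as fine-grained as a block, which can cover only part of a method.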
Declare both methods as synchronized, which locks on the current instance just like a synchronized(this){...} block, or synchronize only the code that touches the list, using the list itself as the lock:
synchronized(myList) {
// do stuff on myList
}
Specific documentation: Intrinsic Locks and Synchronization
Yet I encourage you to use a thread-safe concurrent data structure for what you want to achieve to avoid doing synchronizing yourself and to get (a lot) better performance: Concurrent package summary
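As a rough illustration of the "use a concurrent data structure" suggestion, here is a sketch with CopyOnWriteArrayList. The class JobList and its String elements are assumptions made for the example. Iteration works on a snapshot, so doJob() needs no lock and never throws ConcurrentModificationException while another thread is adding.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class JobList {
    // Thread-safe list: each write copies the backing array, reads need no lock.
    private final List<String> items = new CopyOnWriteArrayList<>();

    void add(String item) {
        items.add(item);
    }

    // Iterates over a snapshot taken when the loop starts;
    // elements added concurrently are simply not seen by this pass.
    int doJob() {
        int processed = 0;
        for (String item : items) {
            processed++; // placeholder for real work on 'item'
        }
        return processed;
    }
}
```

Since every add copies the whole array, this choice fits read-heavy use. For a producer that adds in a tight loop, as run() does in the question, a BlockingQueue or an explicit lock is probably the better trade-off.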

Thread with lists

I have an app that is a little bit slow. I thought it could be faster using threads.
So, here is my plan: my program has a list of objects of type X, and each object X has a very big list of Integers (let's consider Integer for the sake of simplicity).
I have a static method (called getSubsetOfX) that receives an object X from the list of X's and returns a list of Integers from that object. The returned list is a subset of all the Integers contained in X.
This method is called for every X contained in the list. Then I insert the returned list into a list of Integer lists.
This is the code I explained in a compact version:
// Class of object X
public class X{
public List<Integer> listX;
...
}
// Utility class
public class Util{
// Return a sub-set of Integer contained in X
public static List<Integer> getSubsetOfX(X x){...}
}
public class exec{
public static void main(String args[]){
// Let's suppose that lx is already filled with data!
List<X> lx = new ArrayList<X>();
// List of the subsets of integer
List<List<Integer>> li = new ArrayList<List<Integer>>();
for(X x : lx){
// I want to run this step in multiple threads
li.add(Util.getSubsetOfX(x));
}
}
}
I don't know if a List allows concurrent insertions, and I don't know how to apply threads to this either. I have read a bit about threads, but since the run() method doesn't return anything, how can I make the method getSubsetOfX(X x) run in parallel?
Can you help me doing this?
Just to be clear, getSubsetOfX() is the call that takes a long time, right?
For this sort of task, I'd suggest you look at Java's Executors. The first step would be to create a Callable that runs getSubsetOfX(x) on a given instance of X. Something like this:
public class SubsetCallable implements Callable<List<Integer>> {
X x;
public SubsetCallable(X x) {
this.x = x;
}
public List<Integer> call() {
return Util.getSubsetOfX(x);
}
}
Then you can create an ExecutorService using one of the methods in Executors. Which method to use depends on your available resources and your desired execution model - they're all described in the documentation. Once you create the ExecutorService, just create a SubsetCallable for each instance of X that you have and pass it off to the service to run it. I think it could go something like this:
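ExecutorService exec = ...;
List<SubsetCallable> callables = new LinkedList<SubsetCallable>();
for (X x : lx) {
callables.add(new SubsetCallable(x));
}
// invokeAll() blocks until all tasks are done; it and get() throw checked
// exceptions (InterruptedException, ExecutionException) that the caller must handle
List<Future<List<Integer>>> futures = exec.invokeAll(callables);
for (Future<List<Integer>> f : futures) {
li.add(f.get());
}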
This way you can delegate the intense computation to other threads, but you still only access the list of results in one thread, so you don't have to worry about synchronization. (As winsharp93 pointed out, ArrayList, like most of Java's standard collections, is unsynchronized and thus not safe for concurrent access.)
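A self-contained version of the approach above, as a sketch only. The method evens is a hypothetical stand-in for Util.getSubsetOfX, plain integer lists stand in for the X objects, and the pool size of 4 is arbitrary. The checked exceptions are wrapped so the method can be called without a throws clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSubsets {
    // Hypothetical stand-in for Util.getSubsetOfX: keeps the even numbers.
    static List<Integer> evens(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        for (Integer n : input) {
            if (n % 2 == 0) {
                out.add(n);
            }
        }
        return out;
    }

    static List<List<Integer>> computeAll(List<List<Integer>> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary pool size
        try {
            List<Callable<List<Integer>>> tasks = new ArrayList<>();
            for (List<Integer> input : inputs) {
                tasks.add(() -> evens(input));
            }
            // invokeAll returns the futures in the same order as the tasks,
            // and only the calling thread touches 'results'.
            List<List<Integer>> results = new ArrayList<>();
            for (Future<List<Integer>> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ParallelSubsets.computeAll with the lists [1, 2, 3, 4] and [5, 6] yields [[2, 4], [6]], in input order.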
I don't know if a List allow concurrent insertions.
See Class ArrayList:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new ArrayList(...));
But be careful: synchronization can come with a significant performance cost, which could offset the gain you get from using multiple threads (especially when the calculations are quick to do).
Thus, avoid accessing those synchronized collections wherever possible. Prefer thread-local lists instead, which you can then merge into your shared list using addAll.
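The "local list, then addAll" idea might look like the sketch below. Squaring numbers is a made-up workload, and LocalThenMerge and its parameters are hypothetical names. Each thread fills a private list without any locking and takes the lock on the shared list exactly once, for the merge.

```java
import java.util.ArrayList;
import java.util.List;

final class LocalThenMerge {
    static List<Integer> squaresInParallel(List<Integer> input, int threads) {
        List<Integer> shared = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            Thread worker = new Thread(() -> {
                // Private to this thread: no synchronization needed while filling it.
                List<Integer> local = new ArrayList<>();
                for (int i = offset; i < input.size(); i += threads) {
                    local.add(input.get(i) * input.get(i));
                }
                // One short critical section per thread instead of one per element.
                synchronized (shared) {
                    shared.addAll(local);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join() also makes the workers' writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared;
    }
}
```

The order of the merged results depends on which thread finishes first, so sort or index the results if order matters. The input list must not be modified while the workers are running.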
