I'm working on my lab lesson, Multithreading.
In multithreading I know that if we use the synchronized keyword, it never lets all the threads hit a method at the same time; instead it queues them and lets them access it one by one.
But my teacher said it's not good practice to use synchronized (I didn't get time to ask why, but will ask in the next class).
Here is my code:
import java.util.HashMap;
import java.util.Map;

public class Testmultithread {

    static String printMe(int inp) {
        return Integer.toString(inp);
    }

    public static void main(String[] args) {
        Map<String, Integer> listofval = new HashMap<String, Integer>();
        listofval.put("1", 1);
        listofval.put("2", 2);
        listofval.put("3", 3);
        listofval.put("4", 4);
        listofval.put("5", 5);

        for (Map.Entry<String, Integer> entry : listofval.entrySet()) {
            Testmultithread.printMe(entry.getValue());
        }
    }
}
May I know, please, how I can make the above code multithreaded (map entries accessing the printMe method from multiple threads) without using the synchronized keyword?
Suggestions please!
Thanks
Have a method that just displays the value at index N, where 'N' is a static field.
Have a Thread whose 'run' method loops and does the following:
1) Print the value at index 'N'.
2) Increment 'N'.
3) If 'N' > list size, then 'break'.
Let 'N' be an AtomicInteger.
In the main method, create two threads and just start them. Let your list have some 100 values in it so that you can see the two threads picking values.
However, note that a separate get followed by a separate increment is still not thread-safe: two threads can read the same value of 'N'. Use getAndIncrement() so each index is claimed atomically.
Good luck.
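The steps above might be sketched as follows. This is a minimal sketch with illustrative names (PrintWorkers, values, printedCount are all made up); the key point is that getAndIncrement() performs the read-and-bump as a single atomic step, so no index is ever processed twice:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class PrintWorkers {
    // Populated before the threads start, so concurrent reads are safe
    static final List<Integer> values = new ArrayList<>();
    // Shared index; getAndIncrement() claims each index atomically
    static final AtomicInteger n = new AtomicInteger(0);
    // Used only to verify every value was printed exactly once
    static final AtomicInteger printedCount = new AtomicInteger(0);

    static void printMe(int inp) {
        System.out.println(Thread.currentThread().getName() + " -> " + inp);
        printedCount.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 1; i <= 100; i++) {
            values.add(i);
        }
        Runnable worker = () -> {
            while (true) {
                int idx = n.getAndIncrement(); // atomic claim of the next index
                if (idx >= values.size()) {
                    break; // nothing left to print
                }
                printMe(values.get(idx));
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

With a list of 100 values you should see both thread names interleaved in the output, and each value printed exactly once.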
Can someone explain the output of the following program:
import java.util.ArrayList;
import java.util.Random;

public class DataRace extends Thread {
    static ArrayList<Integer> arr = new ArrayList<>();

    public void run() {
        Random random = new Random();
        int local = random.nextInt(10) + 1;
        arr.add(local);
    }

    public static void main(String[] args) {
        DataRace t1 = new DataRace();
        DataRace t2 = new DataRace();
        DataRace t3 = new DataRace();
        DataRace t4 = new DataRace();
        t1.start();
        t2.start();
        t3.start();
        t4.start();
        try {
            t1.join();
            t2.join();
            t3.join();
            t4.join();
        } catch (InterruptedException e) {
            System.out.println("interrupted");
        }
        System.out.println(DataRace.arr);
    }
}
Output (from different runs):
[8, 5]
[9, 2, 2, 8]
[2]
I am having trouble understanding the varying number of values in my output. I would expect the main thread either to wait until all threads have finished (since I join them in the try-catch block) and then output four values, one from each thread, or to print to the console in case of an interruption. Neither is really happening here.
If this is due to a data race between the threads, how exactly does that come into play here?
The main problem is that multiple threads are adding to the same shared ArrayList concurrently. ArrayList is not thread-safe. Its documentation states:
Note that this implementation is not synchronized.
If multiple threads
access an ArrayList instance concurrently, and at least one of the
threads modifies the list structurally, it must be synchronized
externally. (A structural modification is any operation that adds or
deletes one or more elements, or explicitly resizes the backing array;
merely setting the value of an element is not a structural
modification.) This is typically accomplished by synchronizing on some
object that naturally encapsulates the list. If no such object exists,
the list should be "wrapped" using the Collections.synchronizedList
method. This is best done at creation time, to prevent accidental
unsynchronized access to the list:
In your code, every time you call
arr.add(local);
the add implementation updates, among other things, a field that keeps track of the size of the array. Below is the relevant part of ArrayList's add method:
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1; // <--
}
where the variable field size is:
/**
 * The size of the ArrayList (the number of elements it contains).
 *
 * @serial
 */
private int size;
Notice that the add method is not synchronized, nor is the size field declared volatile. Hence it is susceptible to race conditions.
Therefore, because you did not ensure mutual exclusion on accesses to the ArrayList (e.g., by surrounding the calls with a synchronized block), and because ArrayList does not update the size field atomically, each thread may see a stale value of that field and add its element to a position that another thread has already written. In the extreme, all threads may end up adding their element to the same position (as in your output [2]).
This race condition leads to unpredictable results, which is why:
System.out.println(DataRace.arr);
outputs a different number of elements on different executions of your code.
To make the ArrayList thread-safe, or for alternatives, have a look at the following SO thread: How do I make my ArrayList Thread-Safe?, which showcases the use of Collections.synchronizedList() and CopyOnWriteArrayList, among others.
An example of ensuring mutual exclusion of the accesses to the arr structure:
public void run() {
    Random random = new Random();
    int local = random.nextInt(10) + 1;
    synchronized (arr) {
        arr.add(local);
    }
}
or:
static final List<Integer> arr = Collections.synchronizedList(new ArrayList<Integer>());

public void run() {
    Random random = new Random();
    int local = random.nextInt(10) + 1;
    arr.add(local);
}
TL;DR
ArrayList is not thread-safe, so its behaviour under a race condition is unpredictable. Use synchronized or a CopyOnWriteArrayList instead.
Longer answer
ArrayList.add ultimately calls this private method:
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}
When two threads reach this same point at the "same" time, they have the same size s; both will try to add an element at the same position and update the size to s + 1, so only one of the two writes is likely to survive.
If the capacity of the ArrayList is reached and it has to grow(), a new, bigger array is created and the contents copied, likely losing any changes made concurrently (it is possible that multiple threads will be trying to grow at once).
The alternatives here are to use a monitor, a.k.a. synchronized, or a thread-safe alternative like CopyOnWriteArrayList.
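For instance, the CopyOnWriteArrayList alternative can be sketched by adapting the question's program (the class name DataRaceFixed is made up for illustration). Each add copies the backing array atomically, so four started-and-joined threads always leave exactly four elements:

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.CopyOnWriteArrayList;

public class DataRaceFixed extends Thread {
    // CopyOnWriteArrayList copies the backing array on every add,
    // so concurrent adds cannot overwrite each other
    static final List<Integer> arr = new CopyOnWriteArrayList<>();

    @Override
    public void run() {
        Random random = new Random();
        arr.add(random.nextInt(10) + 1);
    }

    public static void main(String[] args) throws InterruptedException {
        DataRaceFixed[] threads = new DataRaceFixed[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new DataRaceFixed();
            threads[i].start();
        }
        for (DataRaceFixed t : threads) {
            t.join(); // wait for every add to complete
        }
        System.out.println(arr); // always four elements now
    }
}
```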
There are a lot of similar or closely related questions; for example, see this.
Basically, the reason for this "unexpected" behaviour is that ArrayList is not thread-safe. You can try List<Integer> arr = new CopyOnWriteArrayList<>() and it will work as expected. This data structure is recommended when read operations are frequent and write operations are relatively rare. For a good explanation, see What is CopyOnWriteArrayList in Java - Example Tutorial.
Another option is to use List<Integer> arr = Collections.synchronizedList(new ArrayList<>()).
You can also use Vector, but it is not recommended (see here).
This article will also be useful - Vector vs ArrayList in Java.
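One caveat worth knowing about the Collections.synchronizedList option: individual calls such as add are synchronized for you, but iteration is a compound operation and, as the wrapper's Javadoc requires, must be guarded by an explicit synchronized block on the list. A small sketch (class and field names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListIteration {
    static int sum = 0;

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1); // individual calls are thread-safe
        list.add(2);
        list.add(3);

        // Iteration spans many calls: hold the wrapper's lock for its whole
        // duration, otherwise a concurrent add could cause a
        // ConcurrentModificationException
        synchronized (list) {
            for (int value : list) {
                sum += value;
            }
        }
        System.out.println("sum = " + sum);
    }
}
```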
I have a scenario where I have to maintain a Map that can be populated by multiple threads, each modifying its own List (the unique key being the thread name). When the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private final ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private final ReentrantLock lock = new ReentrantLock();

public void addAll(List<T> entityList, String threadName) {
    lock.lock();
    try {
        List<T> instrumentList = instrumentMap.get(threadName);
        if (instrumentList == null) {
            instrumentList = new ArrayList<T>(batchSize);
            instrumentMap.put(threadName, instrumentList);
        }
        if (instrumentList.size() >= batchSize - 1) {
            instrumentList.addAll(entityList);
            recordSaver.persist(instrumentList);
            instrumentList.clear();
        } else {
            instrumentList.addAll(entityList);
        }
    } finally {
        lock.unlock();
    }
}
There is one more separate thread, running every 2 minutes (using the same lock), that persists all the records in the Map (to make sure something is persisted every 2 minutes and the map does not get too big):
if (//Some condition) {
    Thread.sleep(//2 minutes);
    aggregator.getLock().lock();
    try {
        List<T> instrumentList = instrumentMap.values().stream().flatMap(x -> x.stream()).collect(Collectors.toList());
        if (instrumentList.size() > 0) {
            saver.persist(instrumentList);
            instrumentMap.values().parallelStream().forEach(x -> x.clear());
        }
    } finally {
        aggregator.getLock().unlock();
    }
}
This solution works fine in almost every scenario we tested, except that sometimes some records go missing, i.e. they are never persisted at all, although they were added to the Map fine.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here? (No need, I think, as the ReentrantLock is already doing the same job.)
The answer provided by @Slaw in the comments did the trick. We were letting the instrumentList instance escape in a non-synchronized way, i.e. the list was accessed and operated on without any synchronization. Passing a copy to the downstream methods fixed it.
The following lines are where the issue was happening:
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we allow the instrumentList instance to escape unsynchronized: it is passed to another class (recordSaver.persist) to be acted on, but we also clear the list on the very next line (in the Aggregator class), all without synchronization. The list's state cannot be predicted inside the record saver... a really careless mistake.
We fixed the issue by passing a cloned copy of instrumentList to the recordSaver.persist(...) method. That way, instrumentList.clear() has no effect on the list available to recordSaver for further operations.
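The fix can be sketched like this (RecordSaver here is a made-up stand-in for the real class, which was not posted): the saver receives a defensive copy, so clearing the original list afterwards cannot change what gets persisted.

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopyDemo {
    // Stand-in for the real RecordSaver: it just remembers what it was given
    static class RecordSaver {
        List<Integer> persisted;
        void persist(List<Integer> records) {
            this.persisted = records;
        }
    }

    static final RecordSaver recordSaver = new RecordSaver();

    public static void main(String[] args) {
        List<Integer> instrumentList = new ArrayList<>(List.of(1, 2, 3));

        // Pass a copy so the saver's view is independent of this list
        recordSaver.persist(new ArrayList<>(instrumentList));
        instrumentList.clear(); // no longer affects the persisted records

        System.out.println(recordSaver.persisted); // prints [1, 2, 3]
    }
}
```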
I see that you are using ConcurrentHashMap's parallelStream within a lock. I am not deeply familiar with Java 8+ stream support, but quick searching shows that:
1) ConcurrentHashMap is a complex data structure that has had concurrency bugs in the past.
2) Parallel streams must abide by complex and poorly documented usage restrictions.
3) You are modifying your data within a parallel stream.
Based on that information (and my gut-driven concurrency-bug detector™), I would wager that removing the call to parallelStream might improve the robustness of your code. In addition, as mentioned by @Slaw, you should use an ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by the lock.
Of course, since you don't post the code of recordSaver, it is possible that it too has bugs (and not necessarily concurrency-related ones). In particular, you should make sure that the code that reads records back from persistent storage (the code you use to detect record loss) is correct and properly synchronized with the rest of your system, preferably by using a robust, industry-standard SQL database.
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concurrency concepts are used: synchronized, to ensure the shared list is properly updated, and final, to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;

public class Aggregator<T> implements Runnable {

    private final List<T> instruments = new ArrayList<>();
    private final RecordSaver recordSaver;
    private final int batchSize;

    public Aggregator(RecordSaver recordSaver, int batchSize) {
        super();
        this.recordSaver = recordSaver;
        this.batchSize = batchSize;
    }

    public synchronized void addAll(List<T> moreInstruments) {
        instruments.addAll(moreInstruments);
        if (instruments.size() >= batchSize) {
            storeInstruments();
        }
    }

    public synchronized void storeInstruments() {
        if (instruments.size() > 0) {
            // in case recordSaver works async
            // recordSaver.persist(new ArrayList<T>(instruments));
            // else just:
            recordSaver.persist(instruments);
            instruments.clear();
        }
    }

    @Override
    public void run() {
        while (true) {
            try { Thread.sleep(1L); } catch (Exception ignored) {
                break;
            }
            storeInstruments();
        }
    }

    class RecordSaver {
        void persist(List<?> l) {}
    }
}
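To see this simplified design in action, here is a self-contained sketch along the same lines (CountingSaver, AggregatorDemo, and all numbers are made up for illustration): two producer threads add 100 records each, batches are flushed at size 10, and a final flush picks up any remainder, so every record is persisted exactly once.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregatorDemo {
    // Made-up stand-in for RecordSaver: counts records handed to persist()
    static class CountingSaver {
        int persistedCount = 0;
        synchronized void persist(List<?> records) {
            persistedCount += records.size();
        }
    }

    static class Aggregator {
        private final List<Integer> instruments = new ArrayList<>();
        private final CountingSaver saver;
        private final int batchSize;

        Aggregator(CountingSaver saver, int batchSize) {
            this.saver = saver;
            this.batchSize = batchSize;
        }

        synchronized void addAll(List<Integer> more) {
            instruments.addAll(more);
            if (instruments.size() >= batchSize) {
                store();
            }
        }

        synchronized void store() {
            if (!instruments.isEmpty()) {
                saver.persist(new ArrayList<>(instruments)); // copy, in case saver is async
                instruments.clear();
            }
        }
    }

    static final CountingSaver saver = new CountingSaver();

    public static void main(String[] args) throws InterruptedException {
        Aggregator aggregator = new Aggregator(saver, 10);
        Runnable producer = () -> {
            for (int i = 0; i < 100; i++) {
                aggregator.addAll(List.of(i));
            }
        };
        Thread t1 = new Thread(producer);
        Thread t2 = new Thread(producer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        aggregator.store(); // flush the remainder, like the periodic thread would
        System.out.println("persisted " + saver.persistedCount + " records");
    }
}
```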
I have several computing threads that create result values (objects). After each thread finishes, the 'add' method of a result collector is called. This result collector is a singleton, so there is only one instance.
Inside the result collector is a list which holds result objects:
List<TestResult> results = Collections.synchronizedList(new ArrayList<>());
The add method adds the result of each thread to the list:
public void addResult(TestResult result){
System.out.println("added " + result.getpValue());
this.results.add(result);
}
It is called from within each thread after the computing work is done.
The big problem: after all threads are finished, the list of results is empty. As you can see in the addResult method, I added a print statement for the pValue, and the p-value of every result is printed.
So it looks like the threads work on different lists, despite the collector class being a singleton.
I was asked for the complete code of the result collector (Javadoc removed to trim size):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ResultCollector {

    private static ResultCollector resultCollector;
    private final List<TestResult> results;

    public static ResultCollector getInstance() {
        if (resultCollector == null) {
            resultCollector = new ResultCollector();
        }
        return resultCollector;
    }

    private ResultCollector() {
        results = Collections.synchronizedList(new ArrayList<>());
    }

    public void addResult(TestResult result) {
        System.out.println("added " + result.getpValue());
        this.results.add(result);
    }
}
I updated the add method to print the identity hash of the collector, to make sure every thread sees the same instance:
System.out.println("added to" + System.identityHashCode(this) + " count: " +results.size());
The printed hash code is the same for all threads, and the size increases to the expected value. The hash code is also the same when I call my toString method or the getter for the list outside the multithreaded environment.
Calling of the threads:
public IntersectMultithread(...) {
    Set<String> tracks = intervals.keySet();

    for (String track : tracks) {
        IntersectWrapper wrapper = new IntersectWrapper(...);
        wrappers.add(wrapper);
        exe.execute(wrapper);
    }
    exe.shutdown();
}
A synchronized list is just a wrapper over a list. You should really be using the concurrent collections for this purpose in modern Java; they implement smarter, more efficient locking and provide better performance.
Caveat: the only concurrent list implementation is one that copies on write, so if that is an issue (i.e. you are adding more than iterating), then your way is fine.
The error in your code is almost certainly in your singleton class, which you haven't shown. Either your class is not truly a singleton (did you use an enum? that's a good way to guarantee it), or your list creation is more confusing than you've let on.
If you post more code, I can update the answer with more info :).
EDIT: Based on your updated code, I think your problem is here:
exe.shutdown();
You need to wait for the executor to complete, using awaitTermination() with a timeout appropriate to the work you are doing.
Your threads just start and die off instantly right now :)
For example:
taskExecutor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
From here: https://stackoverflow.com/a/1250655/857994
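The shutdown-then-wait pattern can be sketched like this (task and collector names are illustrative): the main thread only reads the results after awaitTermination returns, so all worker writes are guaranteed to be visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AwaitTerminationDemo {
    static final List<Integer> results = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws InterruptedException {
        ExecutorService exe = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 8; i++) {
            final int task = i;
            exe.execute(() -> results.add(task));
        }
        exe.shutdown(); // stop accepting new tasks; queued tasks still run
        // Block until all submitted tasks have finished (with a sane timeout)
        if (!exe.awaitTermination(1, TimeUnit.MINUTES)) {
            System.err.println("timed out waiting for tasks");
        }
        System.out.println("collected " + results.size() + " results");
    }
}
```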
Addition to the correct answer above: yes, exe.shutdown() was the problem, but the threads do not die instantly; they actually run to completion. That is why the 'add' method printed everything correctly when extended with a print statement.
The issue was that my output was produced before the threads could finish their computation, so there were no values at that time; shortly afterwards the threads finish and the print works.
I run a program which contains the following classes (not only these, but these are the relevant ones for the question).
In the Results class I have a synchronized LinkedHashMap:
private static Map<Integer,Result> resultsHashMap=Collections.synchronizedMap(new LinkedHashMap<Integer, Result>());
and a getter method:
public static Map<Integer, Result> getResultsHashMap() {
    return resultsHashMap;
}
As well I have inside my Result class a constructor with this synchronized code:
public Result() {
    synchronized (Lock.lock) {
        uniqueIdResult++;
    }
}
and a synchronized getter method as such:
public static int getUniqueIdResult() {
    synchronized (Lock.lock) {
        return uniqueIdResult;
    }
}
the uniqueIdResult is defined as following:
private static int uniqueIdResult=0;
Also I have a Lock class consists this Object:
public static final Lock lock=new Lock();
Now, this is the important issue I'm after. In my program I have the following 2 lines, which create a Result and put it into the HashMap:
Result result = new Result();
Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
I try to run my program with different numbers of threads. When it runs with 1 thread, the output is as I expect (specifically, though not necessarily important: Results.resultsHashMap contains 433 keys, which is what it should, and the keys start from 1).
But when I run it with more threads, it gives different output each time. For example, running with 6 threads gives a different number of keys every run: sometimes 430, sometimes 428, sometimes 427, etc., and the starting key is not always related to the total number of keys (e.g. total_number_of_keys - starting_key_number + 1, which at first seemed to be a pattern, but I realized it's not).
The iteration is like this:
int counterOfResults = 0;

for (Integer key : Results.getResultsHashMap().keySet()) {
    System.out.println(key + " " + Results.getResultsHashMap().get(key));
    counterOfResults++;
}
System.out.println(counterOfResults);
Also, when I synchronize only the getter method for the hashMap, without synchronizing the Result creation and the insertion into the hashMap, the output with multiple threads is wrong.
Likewise, when I synchronize only one of the two lines (the creation of the Result or the put into the hashMap), the output is not coherent under multiple threads.
However when I synchronize both these lines (the creation of Result and putting into the map) like so:
Result result;
synchronized (Lock.lock) {
    result = new Result(currentLineTimeNationalityNameYearofbirth.getName(), currentLineTimeNationalityNameYearofbirth.getTime(), citycompetionwas, date, distance, stroke, gender, kindofpool);
    Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
}
the output is perfect, no matter how many Threads I use.
Also, I will note that the output is printed only after all threads have finished, using the join method on all created threads.
So my question is:
As far as I know, before I synchronized those 2 lines (creating the Result and putting it into the hashMap), all of my critical sections (changing and getting uniqueIdResult, and getting resultsHashMap, which, as mentioned, I also tried synchronizing) were synchronized on the same object. On top of that, I wrapped the map with Collections.synchronizedMap, which, as far as I know, should make the hashMap thread-safe.
Why, then, is the output not as I expect? Where is the safety problem?
There's no exclusion around these lines:
Result result = new Result();
Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
If you have 4 threads, they might all execute the first line (which will increment the uniqueIdResult variable four times), and then all execute the second line (at which point they will all see the same return value from getUniqueIdResult()). That explains how your keys could start at 4 when you have 4 (or more) threads.
Because you have multiple threads potentially (and unpredictably) storing to the same key, you also end up with a variable number of entries in your map.
You should probably remove the increment from the Result class constructor and instead do it in the getUniqueIdResult method:
public static int getUniqueIdResult() {
    synchronized (Lock.lock) {
        return ++uniqueIdResult;
    }
}
(Having done that, there is no longer any need to create instances of Result at all).
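An alternative sketch, assuming the id only needs to be unique and increasing (class and field names here are made up): an AtomicInteger removes the need for the explicit lock object entirely, and a ConcurrentHashMap makes the puts safe without external synchronization.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class UniqueIdDemo {
    static final AtomicInteger uniqueId = new AtomicInteger(0);
    static final Map<Integer, String> results = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        Runnable producer = () -> {
            for (int i = 0; i < 100; i++) {
                // incrementAndGet is a single atomic step, so no two
                // threads can ever observe the same id
                int id = uniqueId.incrementAndGet();
                results.put(id, "result-" + id);
            }
        };
        Thread t1 = new Thread(producer);
        Thread t2 = new Thread(producer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("stored " + results.size() + " distinct keys");
    }
}
```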
I have code similar to following:
import java.util.HashMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

public class Cache {

    private final Object lock = new Object();
    private HashMap<Integer, TreeMap<Long, Integer>> cache =
            new HashMap<Integer, TreeMap<Long, Integer>>();
    private AtomicLong FREESPACE = new AtomicLong(102400);

    private void putInCache(TreeMap<Long, Integer> tempMap, int fileNr) {
        int length; // holds the length of data in tempMap
        synchronized (lock) {
            if (checkFreeSpace(length)) {
                cache.get(fileNr).putAll(tempMap);
                FREESPACE.getAndAdd(-length);
            }
        }
    }

    private boolean checkFreeSpace(int length) {
        while (FREESPACE.get() < length && thereIsSomethingToDelete()) {
            // deleteSomething returns the length of deleted data or 0 if
            // it could not delete anything
            FREESPACE.getAndAdd(deleteSomething(length));
        }
        return FREESPACE.get() >= length;
    }
}
putInCache is called by about 139 threads per second. Can I be sure that these two methods will synchronize access to both cache and FREESPACE? Also, is checkFreeSpace() thread-safe, i.e. can I be sure that there will be only one invocation of this method at a time? Can the thread-safety of this code be improved?
To have your question answered fully, you would need to show the implementations of the thereIsSomethingToDelete() and deleteSomething() methods.
Given that checkFreeSpace is a public method (does it really need to be?), and is unsynchronized, it is possible it could be called by another thread while the synchronized block in the putInCache() method is running. This by itself might not break anything, since it appears that the checkFreeSpace method can only increase the amount of free space, not reduce it.
What would be more serious (and the code sample doesn't allow us to determine this) is if the thereIsSomethingToDelete() and deleteSomething() methods don't properly synchronize their access to the cache object, using the same Object lock as used by putInCache().
You don't usually synchronize on the fields you want to control access to directly.
The fields that you want to synchronize access to must only be accessed from within synchronized blocks (on the same object) to be considered thread safe. You are already doing this in putInCache().
Therefore, because checkFreeSpace() accesses shared state in an unsynchronized fashion, it is not thread safe.
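One way to apply this advice can be sketched as follows. This is a simplified illustration, not the poster's real code: the unposted thereIsSomethingToDelete()/deleteSomething() are replaced with a basic evict-oldest policy, and freeSpace becomes a plain long that is only ever touched while holding the one lock, so no AtomicLong is needed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CacheSketch {
    private final Object lock = new Object();
    // Each entry is {id, length}; oldest entries are evicted first
    private final Deque<int[]> entries = new ArrayDeque<>();
    // Guarded by lock: every read and write happens inside synchronized (lock)
    long freeSpace = 102400;

    boolean putInCache(int id, int length) {
        synchronized (lock) {
            // Evict oldest entries until the new one fits (simplified policy)
            while (freeSpace < length && !entries.isEmpty()) {
                freeSpace += entries.removeFirst()[1];
            }
            if (freeSpace < length) {
                return false; // cannot make room
            }
            entries.addLast(new int[] {id, length});
            freeSpace -= length;
            return true;
        }
    }
}
```

Because the space check, the eviction, and the insertion all happen under the same lock, no other thread can observe or change freeSpace between the check and the update.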