Java: collect results from different threads in a list

I have several computing threads that each create a result value (an object). After each thread finishes, the 'add' method of a result collector is called. This result collector is a singleton, so there is only one instance.
Inside the result collector is a list which holds result objects:
List<TestResult> results = Collections.synchronizedList(new ArrayList<>());
The add method adds the result of each thread to the list:
public void addResult(TestResult result){
    System.out.println("added " + result.getpValue());
    this.results.add(result);
}
It is called within the thread, after the computing stuff is done.
The big problem is: after all threads are finished, the list of results is empty. As you can see, in the addResult method I added a print statement for the pValue, and the p value of every result is printed out.
So it looks like the threads work on different lists, despite the fact that the collector class is a singleton.
I was asked for the complete code of the result collector (Javadoc removed to trim the size):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ResultCollector {

    private static ResultCollector resultCollector;
    private final List<TestResult> results;

    public static ResultCollector getInstance(){
        if(resultCollector == null){
            resultCollector = new ResultCollector();
        }
        return resultCollector;
    }

    private ResultCollector() {
        results = Collections.synchronizedList(new ArrayList<>());
    }

    public void addResult(TestResult result){
        System.out.println("added " + result.getpValue());
        this.results.add(result);
    }
}
I updated the add method to print out the identity hash of the current instance, to make sure every thread sees the same one:
System.out.println("added to" + System.identityHashCode(this) + " count: " +results.size());
The printed hash code is the same for all threads, and the size increases to the expected value. The hash code is also the same when I call my toString method or the getter for the list outside the multithreaded environment.
This is how the threads are started:
public IntersectMultithread(...) {
    Set<String> tracks = intervals.keySet();
    for(String track: tracks){
        IntersectWrapper wrapper = new IntersectWrapper(...);
        wrappers.add(wrapper);
        exe.execute(wrapper);
    }
    exe.shutdown();
}

A synchronized list is just a wrapper over a list. For this purpose you should really be using the concurrent collections in modern Java; they implement smarter, more efficient locking and provide better performance.
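For example, a collect-only workload like this one can use a lock-free queue instead; a minimal sketch (the class name is illustrative, not from the question):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ResultSink {
    // lock-free, thread-safe collection; fine when you mostly add and read once at the end
    private final Queue<TestResult> results = new ConcurrentLinkedQueue<>();

    public void addResult(TestResult result) {
        results.add(result); // no external synchronization needed
    }
}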
Caveat: the only concurrent List implementation is the one that copies on write (CopyOnWriteArrayList). So if that's an issue (i.e. you're adding more than iterating), then your way is fine.
The error in your code is almost certainly in your singleton class, which you haven't shown. Either your class is not truly a singleton (did you use an enum? that's a good way to guarantee it), or your list creation is more confusing than you let on.
If you post more code, I can update the answer with more info :).
EDIT: I think your problem is here based on your updated code:
exe.shutdown();
You need to wait for the executor to complete, using awaitTermination() with a timeout appropriate to the work you are doing.
Your threads just start and die off instantly right now :)
For example:
taskExecutor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
From here: https://stackoverflow.com/a/1250655/857994
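Put together with shutdown(), a minimal sketch (assuming exe is the ExecutorService from the question and java.util.concurrent.TimeUnit is imported):

exe.shutdown(); // stop accepting new tasks; already-submitted tasks keep running
try {
    // pick a timeout that fits your workload
    if (!exe.awaitTermination(10, TimeUnit.MINUTES)) {
        exe.shutdownNow(); // give up and interrupt whatever is still running
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
// only now is it safe to read the collected results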

Addition to the correct answer above:
Yes, exe.shutdown(); is the problem, but the threads do not die instantly; they actually run to completion. That is why the 'add' method printed everything correctly once extended with a print statement.
The issue was that my output was produced before the threads could finish their computation, so there were no values at that time; shortly afterwards the threads finish and the print works.

Runnable locked (park) using ExecutorService and BlockingQueue

Note: I understand the site rules, but I can't post all of the code (it is complex and large).
I put DIFFERENT code on GitHub (all the real code is too much and you don't need it here) that reproduces the problem (the main class is joseluisbz.mock.support.TestOptimalDSP and the switching class is joseluisbz.mock.support.runnable.ProcessorDSP), as in the video.
Please don't recommend another jar or external library for this code.
I wish I could be more specific, but I don't know which part to extract and show.
Before you close this question: obviously, I am willing to refine my question if someone tells me where to look (technical detail).
I made a video in order to show my issue.
Even to formulate the question, I made a diagram to show the situation.
My program has a JTree showing the relations between Workers.
I have a diagram of the interaction between the threads, whose lifecycle is controlled with ExecutorService executorService = Executors.newCachedThreadPool(); and List<Future<?>> listFuture = Collections.synchronizedList(new ArrayList<>());
Each Runnable is started in its constructor like this: listFuture().add(executorService().submit(this)); The queues are created like this: BlockingQueue<Custom> someBlockingQueue = new LinkedBlockingQueue<>();
My diagram shows who each Worker's parent is, if it has one.
It also shows the writing relationships between the BlockingQueues.
RunnableStopper stops the related runnables held as properties of the Worker.
RunnableDecrementer, RunnableIncrementer and RunnableFilter each operate with a loop that processes every Custom they receive from their BlockingQueue.
For each one they always create a RunnableProcessor (it has no loop, but because of its long processing, once the task is finished it should be collected by the GC).
Internally the RunnableIncrementer has a map: Map<Integer, List<Custom>> mapListDelayedCustom = new HashMap<>(); //Collections.synchronizedMap(new HashMap<>());
When some Custom arrives, I need to obtain the list of lastReceivedCustom: List<Custom> listDelayedCustom = mapListDelayedCustom.putIfAbsent(custom.getCode(), new ArrayList<>());
I'm controlling the size (it is not growing indefinitely).
My code stops working when I add the following lines:
if (listDelayedCustom.size() > SomeValue) {
    //No operation has yet been included in the if statement
}
But with these lines commented out, nothing blocks:
//if (listDelayedCustom.size() > SomeValue) {
//    //No operation has yet been included in the if statement
//}
What could be blocking my Runnable?
It makes no sense that adding the lines indicated above (evaluating the size of a list in an if statement) makes it stop working.
Any advice to further specify my question?
First, the way you set thread names is wrong. You use this pattern:
public class Test
{
    public static class Task implements Runnable
    {
        public Task()
        {
            Thread.currentThread().setName("Task");
        }

        @Override
        public void run()
        {
            System.out.println("Task: " + Thread.currentThread().getName());
        }
    }

    public static void main(String[] args)
    {
        new Thread(new Task()).start();
        System.out.println("Main: " + Thread.currentThread().getName());
    }
}
which gives the (undesired) result:
Main: Task
Task: Thread-0
It's incorrect because, in the Task constructor, the thread has not started yet, so you're changing the name of the calling thread, not that of the spawned thread. You should set the name in the run() method.
As a result, the thread names in your screenshot are wrong.
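For example, a corrected sketch of the same Task class, with the name set inside run():

public static class Task implements Runnable
{
    @Override
    public void run()
    {
        Thread.currentThread().setName("Task"); // now renames the spawned thread itself
        System.out.println("Task: " + Thread.currentThread().getName());
    }
}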
Now the real issue. In WorkerDSPIncrement, you have this line:
List<ChunkDTO> listDelayedChunkDTO = mapListDelayedChunkDTO.putIfAbsent(chunkDTO.getPitch(), new ArrayList<>());
The documentation for putIfAbsent() says:
If the specified key is not already associated with a value (or is mapped to null) associates it with the given value and returns null, else returns the current value.
Since the map is initially empty, the first time you call putIfAbsent(), it returns null and assigns it to listDelayedChunkDTO.
Then you create a ProcessorDSP object:
ProcessorDSP processorDSP = new ProcessorDSP(controlDSP, upNodeDSP, null,
dHnCoefficients, chunkDTO, listDelayedChunkDTO, Arrays.asList(parent.getParentBlockingQueue()));
It means you pass null as the listDelayedChunkDTO parameter. So when this line executes in ProcessorDSP:
if (listDelayedChunkDTO.size() > 2) {
it throws a NullPointerException and the runnable stops.
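A common way to avoid that null (a sketch, not necessarily how the project should be restructured) is computeIfAbsent(), which returns the existing or the newly created list rather than null:

// returns the list already mapped to the key, or the freshly created one
List<ChunkDTO> listDelayedChunkDTO =
        mapListDelayedChunkDTO.computeIfAbsent(chunkDTO.getPitch(), k -> new ArrayList<>());

If the map is touched by more than one thread, it should also be a ConcurrentHashMap rather than the plain HashMap shown earlier.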

Missing updates with locks and ConcurrentHashMap

I have a scenario where I have to maintain a Map which can be populated by multiple threads, each modifying their respective List (unique identifier/key being the thread name), and when the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReentrantLock lock;

public void addAll(List<T> entityList, String threadName) {
    try {
        lock.lock();
        List<T> instrumentList = instrumentMap.get(threadName);
        if (instrumentList == null) {
            instrumentList = new ArrayList<T>(batchSize);
            instrumentMap.put(threadName, instrumentList);
        }
        if (instrumentList.size() >= batchSize - 1) {
            instrumentList.addAll(entityList);
            recordSaver.persist(instrumentList);
            instrumentList.clear();
        } else {
            instrumentList.addAll(entityList);
        }
    } finally {
        lock.unlock();
    }
}
There is one more separate thread, running every 2 minutes (using the same lock), that persists all the records in the map (to make sure we have something persisted every 2 minutes and the map does not get too big):
if(//Some condition) {
    Thread.sleep(//2 minutes);
    aggregator.getLock().lock();
    List<T> instrumentList = instrumentMap.values().stream().flatMap(x -> x.stream()).collect(Collectors.toList());
    if (instrumentList.size() > 0) {
        saver.persist(instrumentList);
        instrumentMap.values().parallelStream().forEach(x -> x.clear());
        aggregator.getLock().unlock();
    }
}
This solution works fine in almost every scenario that we tested, except that sometimes we see some records go missing, i.e. they are not persisted at all, although they were added to the map without problems.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here (no need I think, as ReentrantLock is already doing the same job)?
The answer provided by @Slaw in the comments did the trick. We were letting the instrumentList instance escape in a non-synchronized way, i.e. access and operations were happening on the list without any synchronization. Passing a copy to the downstream methods fixed it.
The following lines of code are where the issue was happening:
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we are allowing the instrumentList instance to escape in a non-synchronized way: it is passed to another class (recordSaver.persist) where it is acted on, but we also clear the list on the very next line (in the Aggregator class), and all of this happens without synchronization. The list's state cannot be predicted inside the record saver... a really silly mistake.
We fixed the issue by passing a cloned copy of instrumentList to the recordSaver.persist(...) method. This way instrumentList.clear() has no effect on the list available to recordSaver for further operations.
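In code, the fix described above amounts to a sketch like this inside addAll:

if (instrumentList.size() >= batchSize - 1) {
    instrumentList.addAll(entityList);
    // hand a defensive copy to the (possibly asynchronous) saver,
    // so clearing our list below cannot affect what it persists
    recordSaver.persist(new ArrayList<>(instrumentList));
    instrumentList.clear();
} else {
    instrumentList.addAll(entityList);
}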
I see that you are using ConcurrentHashMap's parallelStream within a lock. I am not deeply familiar with Java 8+ stream support, but quick searching shows that:
ConcurrentHashMap is a complex data structure that used to have concurrency bugs in the past;
parallel streams must abide by complex and poorly documented usage restrictions;
you are modifying your data within a parallel stream.
Based on that information (and my gut-driven concurrency bug detector™), I wager a guess that removing the call to parallelStream might improve the robustness of your code. In addition, as mentioned by @Slaw, you should use an ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by the lock.
Of course, since you don't post the code of recordSaver, it is possible that it too has bugs (and not necessarily concurrency-related ones). In particular, you should make sure that the code that reads records from persistent storage (the one you are using to detect the loss of records) is safe, correct, and properly synchronized with the rest of your system (preferably by using a robust, industry-standard SQL database).
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concurrency concepts are used: synchronized, to ensure the shared list is properly updated, and final, to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;

public class Aggregator<T> implements Runnable {

    private final List<T> instruments = new ArrayList<>();
    private final RecordSaver recordSaver;
    private final int batchSize;

    public Aggregator(RecordSaver recordSaver, int batchSize) {
        super();
        this.recordSaver = recordSaver;
        this.batchSize = batchSize;
    }

    public synchronized void addAll(List<T> moreInstruments) {
        instruments.addAll(moreInstruments);
        if (instruments.size() >= batchSize) {
            storeInstruments();
        }
    }

    public synchronized void storeInstruments() {
        if (instruments.size() > 0) {
            // in case recordSaver works async
            // recordSaver.persist(new ArrayList<T>(instruments));
            // else just:
            recordSaver.persist(instruments);
            instruments.clear();
        }
    }

    @Override
    public void run() {
        while (true) {
            try {
                Thread.sleep(1L);
            } catch (Exception ignored) {
                break;
            }
            storeInstruments();
        }
    }

    class RecordSaver {
        void persist(List<?> l) {}
    }
}

synchronized on an Object seems like it's not synchronized

I run a program which contains the following classes (not only these, but these are the relevant ones for the question).
In the Results class I have a synchronized LinkedHashMap such as:
private static Map<Integer,Result> resultsHashMap=Collections.synchronizedMap(new LinkedHashMap<Integer, Result>());
and a getter method:
public static Map<Integer,Result> getResultsHashMap() {
    return resultsHashMap;
}
I also have, inside my Result class, a constructor with this synchronized code:
public Result(){
    synchronized (Lock.lock) {
        uniqueIdResult++;
    }
}
and a synchronized getter method as such:
public static int getUniqueIdResult() {
    synchronized (Lock.lock) {
        return uniqueIdResult;
    }
}
uniqueIdResult is defined as follows:
private static int uniqueIdResult=0;
I also have a Lock class consisting of this object:
public static final Lock lock=new Lock();
Now, this is the important part. In my program I have the following 2 lines, which create a Result and put it into the HashMap:
Result result = new Result();
Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
I try to run my program with different numbers of threads. When it runs with 1 thread the output is as I expect it to be (specifically, though not necessarily important: Results.resultsHashMap contains 433 keys, which is what it should be, and the keys start from 1).
But when I run it with a different number of threads, the output differs. For example, running with 6 threads gives a different number of keys each time, sometimes 430, sometimes 428, sometimes 427, etc., and the starting key is not always related to the total number of keys (e.g. total_number_of_keys - starting_key_number + 1, which at first seemed to me to be a pattern, but I realized it's not).
The iteration is like this:
int counterOfResults = 0;
for (Integer key : Results.getResultsHashMap().keySet()) {
    System.out.println(key + " " + Results.getResultsHashMap().get(key));
    counterOfResults++;
}
System.out.println(counterOfResults);
Also, when I synchronize only the getter method for the hashMap, without synchronizing the Result creation and the insertion into the hashMap, running with multiple threads still gives wrong output.
Likewise, when I synchronize only one of the two lines (either the creation of the Result or putting it into the hashMap), the output is not coherent under multiple threads.
However when I synchronize both these lines (the creation of Result and putting into the map) like so:
Result result;
synchronized (Lock.lock) {
    result = new Result(currentLineTimeNationalityNameYearofbirth.getName(), currentLineTimeNationalityNameYearofbirth.getTime(), citycompetionwas, date, distance, stroke, gender, kindofpool);
    Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
}
the output is perfect, no matter how many Threads I use.
Also, I will note that the output is printed only after all threads have finished, by using the join method on all created threads.
So my question is:
As far as I know, before synchronizing those 2 lines (creating the Result and putting it into the hashMap), all of my critical sections (changing and getting uniqueIdResult, and getting the resultsHashMap; as I mentioned, I also tried synchronizing that getter) were synchronized on the same object. On top of that I took the extra precaution of wrapping the map with Collections.synchronizedMap, which, as far as I know, should make the hashMap thread-safe.
Why then is the output not as I expect it to be? Where is the safety problem?
There's no exclusion around these lines:
Result result = new Result();
Results.getResultsHashMap().put(Result.getUniqueIdResult(), result);
If you have 4 threads, they might all execute the first line (which will increment the uniqueIdResult variable four times), and then all execute the second line (at which point they will all see the same return value from getUniqueIdResult()). That explains how your keys could start at 4 when you have 4 (or more) threads.
Because you have multiple threads potentially (and unpredictably) storing to the same key, you also end up with a variable number of entries in your map.
You should probably remove the increment from the Result class constructor and instead do it in the getUniqueIdResult method:
public static int getUniqueIdResult() {
    synchronized (Lock.lock) {
        return ++uniqueIdResult;
    }
}
(Having done that, there is no longer any need to create instances of Result at all).
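As a side note (not part of the answer above), the same unique id can be produced without the explicit lock by using java.util.concurrent.atomic.AtomicInteger; a minimal sketch:

private static final AtomicInteger uniqueIdResult = new AtomicInteger();

public static int getUniqueIdResult() {
    return uniqueIdResult.incrementAndGet(); // atomic increment-and-read, no lock needed
}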

Problems with adding string to a list by many threads

This is a follow-up post to "Do I need a concurrent collection for adding elements to a list by many threads?"
Everybody there focused on the expanding of the list. I understand how that can be a problem, but... what about adding elements? Is that thread-safe?
example code
static final Collection<String> FILES = new ArrayList<String>(1000000);
and I execute in many threads (I add less than 1000000 elements)
FILES.add(string)
Is that thread-safe? What are the possible problems with doing it that way?
ArrayList<String> is not synchronized by itself. Use Collections.synchronizedList(new ArrayList<String>()) instead.
Use Collections.synchronizedList(new ArrayList<String>(1000000)).
Even if the list doesn't expand, it still maintains internal state, such as the current size. More broadly, ArrayList is specified as a mutable object which is not thread-safe, therefore it is illegal to concurrently call any mutator methods on it.
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
This is the source code of ArrayList.add in 7u40-b43.
The method ensureCapacityInternal() is not thread-safe, and size++ is not either.
size++ is done in three steps: 1) read size, 2) add 1, 3) write the new value back.
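So two unsynchronized add() calls can read the same size, write to the same slot, and lose an element. A minimal safe variant of the example above (a sketch, assuming the usual java.util imports; note that iterating a synchronized list still requires an explicit synchronized block):

static final Collection<String> FILES =
        Collections.synchronizedList(new ArrayList<String>(1000000));

// FILES.add(string) from many threads is now safe;
// iterating still needs: synchronized (FILES) { for (String s : FILES) { ... } }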

Multiple threads editing an ArrayList and its objects

I am trying to run 2 concurrent threads, where one keeps adding objects to a list and the other updates these objects and may remove some of these objects from the list as well.
I have a whole project where I've used ArrayList for my methods and classes so it's difficult to change it now.
I've looked around and I found a few ways of doing this, but as I said it is difficult to change from ArrayList. I tried using synchronized and notify() for the method adding the objects to the list and wait() for the method changing these objects and potentially removing them if they meet certain criteria.
Now, I've figured out how to do this using a CopyOnWriteArrayList, but I would like to know if there's a possibility of using ArrayList itself to simulate this. so that I don't have to edit my entire code.
So, basically, I would like to do something like this, but with ArrayList:
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class ListExample {

    CopyOnWriteArrayList<MyObject> syncList;

    public ListExample() {
        syncList = new CopyOnWriteArrayList<MyObject>();

        Thread thread1 = new Thread() {
            public void run() {
                synchronized (syncList) {
                    for (int i = 0; i < 10; i++) {
                        syncList.add(new MyObject(i));
                    }
                }
            }
        };

        Thread thread2 = new Thread() {
            public void run() {
                synchronized (syncList) {
                    Iterator<MyObject> iterator = syncList.iterator();
                    while (iterator.hasNext()) {
                        MyObject temp = iterator.next();
                        // this is just a sample list manipulation
                        if (temp.getID() > 3)
                            syncList.remove(temp);
                        System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
                    }
                }
            }
        };

        thread1.start();
        thread2.start();
    }

    public static void main(String[] args) {
        new ListExample();
    }
}

class MyObject {

    private int ID;

    public MyObject(int ID) {
        this.ID = ID;
    }

    public int getID() {
        return ID;
    }

    public void setID(int ID) {
        this.ID = ID;
    }
}
class MyObject{
private int ID;
public MyObject(int ID){
this.ID = ID;
}
public int getID(){
return ID;
}
public void setID(int ID){
this.ID = ID;
}
}
I've also read about Collections.synchronizedList(new ArrayList()) but again, I believe this would require me to change my code as I have a substantial number of methods that take ArrayList as a parameter.
Any guidance would be appreciated, because I am out of ideas. Thank you.
You may be interested in the collections provided by the java.util.concurrent package. They are very useful for producer/consumer scenarios, where one or more threads add things to a queue and other threads take them. There are different methods depending on whether you want to block or to fail when the queue is full/empty.
About refactoring your methods: you should have used interfaces (e.g. List) instead of concrete implementation classes (such as ArrayList). That is the purpose of interfaces, and the Java API has a good supply of them.
As a quick solution you could extend ArrayList, make the modifying methods (add/remove) synchronized, and refactor the code to replace ArrayList with your custom ArrayList.
Use Vector instead of ArrayList. Remember to store it in a List reference as Vector contains deprecated methods. Vector, unlike ArrayList, synchronizes its internal operations, and unlike CopyOnWriteArrayList, does not copy the internal array each time a modification is made.
Of course you should be using the java.util.concurrent package. But let's look at what is happening, or could happen, with only ArrayList and synchronization.
In your code, if you have a plain ArrayList in place of CopyOnWriteArrayList, it should work, because you have full synchronization via synchronized (syncList) around whatever you are doing/manipulating in the threads. You do not require any wait()/notify() if the whole thing is synchronized (but that's not recommended; I will come to that).
However, that code will throw a ConcurrentModificationException, because once you are using the iterator from syncList.iterator() you must not remove elements from the list directly, otherwise iteration may give undesirable results; that's why it is designed to fail fast and throw the exception. To avoid this you can do something like:
Iterator<MyObject> iterator = syncList.iterator();
ArrayList<MyObject> toBeRemoved = new ArrayList<MyObject>();
while (iterator.hasNext()) {
    MyObject temp = iterator.next();
    // this is just a sample list manipulation
    if (temp.getID() > 3) {
        //syncList.remove(temp);
        toBeRemoved.add(temp);
    }
    System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
}
syncList.removeAll(toBeRemoved);
Now regarding synchronization: you should strive to minimize its scope, otherwise there will be unnecessary waiting between threads; that's why the java.util.concurrent package exists, to give high performance in multithreading (even using non-blocking algorithms). You can also use Collections.synchronizedList(new ArrayList()), but it is not as good as the concurrent classes.
If you want conditional synchronization, as in the producer/consumer problem, you can use the wait()/notify() mechanism on the same object (lock). But again, there are already classes that help with this, such as java.util.concurrent.LinkedBlockingQueue.
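For illustration, a minimal producer/consumer pair with LinkedBlockingQueue (a sketch with made-up names, not the asker's classes):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueExample {
    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    queue.put(i); // blocks if the queue is full (unbounded here, so it never does)
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    Integer value = queue.take(); // blocks until an element is available
                    System.out.println("consumed " + value);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}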
