I am trying to run 2 concurrent threads, where one keeps adding objects to a list and the other updates these objects and may remove some of these objects from the list as well.
I have a whole project where I've used ArrayList for my methods and classes so it's difficult to change it now.
I've looked around and I found a few ways of doing this, but as I said it is difficult to change from ArrayList. I tried using synchronized and notify() for the method adding the objects to the list and wait() for the method changing these objects and potentially removing them if they meet certain criteria.
Now, I've figured out how to do this using a CopyOnWriteArrayList, but I would like to know whether it is possible to get the same behaviour with ArrayList itself, so that I don't have to edit my entire code.
So, basically, I would like to do something like this, but with ArrayList:
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;
public class ListExample{
CopyOnWriteArrayList<MyObject> syncList;
public ListExample(){
syncList = new CopyOnWriteArrayList<MyObject>();
Thread thread1 = new Thread(){
public void run(){
synchronized (syncList){
for(int i = 0; i < 10; i++){
syncList.add(new MyObject(i));
}
}
}
};
Thread thread2 = new Thread(){
public void run(){
synchronized (syncList){
Iterator<MyObject> iterator = syncList.iterator();
while(iterator.hasNext()){
MyObject temp = iterator.next();
//this is just a sample list manipulation
if (temp.getID() > 3)
syncList.remove(temp);
System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
}
}
}
};
thread1.start();
thread2.start();
}
public static void main(String[] args){
new ListExample();
}
}
class MyObject{
private int ID;
public MyObject(int ID){
this.ID = ID;
}
public int getID(){
return ID;
}
public void setID(int ID){
this.ID = ID;
}
}
I've also read about Collections.synchronizedList(new ArrayList()) but again, I believe this would require me to change my code as I have a substantial number of methods that take ArrayList as a parameter.
Any guidance would be appreciated, because I am out of ideas. Thank you.
You may be interested in the collections provided by the java.util.concurrent package. They are very useful for producer/consumer scenarios, where one or more threads add things to a queue and other threads take them. There are different methods depending on whether you want to block or fail when the queue is full/empty.
About refactoring your methods: you should have used interfaces (e.g. List) instead of concrete implementation classes (such as ArrayList). That is the purpose of interfaces, and the Java API has a good supply of them.
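To illustrate the point about interfaces, here is a minimal sketch (the class and method names are made up for the example and do not come from the question). A method declared against List accepts a plain ArrayList and a Collections.synchronizedList wrapper alike, so the implementation can be swapped in one place:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class InterfaceParamDemo {
    // Declared against the List interface, so any implementation can be passed in.
    static int countEven(List<Integer> numbers) {
        int count = 0;
        for (int n : numbers) {
            if (n % 2 == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> synced = Collections.synchronizedList(new ArrayList<>(plain));
        System.out.println(countEven(plain));  // 2
        System.out.println(countEven(synced)); // 2
    }
}
```

Note that this demo is single-threaded. If other threads may modify the list, iterating over a synchronizedList still has to happen inside a synchronized block on that list.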
As a quick solution you could extend ArrayList, make the modifying methods (add/remove) synchronized, and refactor your code to replace ArrayList with your custom ArrayList subclass.
Use Vector instead of ArrayList. Remember to store it in a List reference, as Vector carries legacy methods (such as addElement and elementAt) that you should avoid. Vector, unlike ArrayList, synchronizes its internal operations and, unlike CopyOnWriteArrayList, does not copy the internal array each time a modification is made.
Of course you should be using the java.util.concurrent package. But let's look at what happens, or could happen, with only ArrayList and synchronization.
In your code, if you simply put ArrayList in place of CopyOnWriteArrayList, the locking itself is sufficient, because everything both threads do with the list happens inside synchronized (syncList). You do not need wait() or notify() if the whole thing is synchronized (but that is not recommended; I will come to that).
But with an ArrayList this code will throw ConcurrentModificationException, because you must not remove elements through the list itself while iterating over it with syncList.iterator(). Doing so could produce unpredictable results during iteration, which is why the iterator is designed to fail fast and throw. To avoid this, collect the elements to remove and delete them after the loop:
Iterator<MyObject> iterator = syncList.iterator();
ArrayList<MyObject> toBeRemoved = new ArrayList<MyObject>();
while(iterator.hasNext()){
MyObject temp = iterator.next();
//this is just a sample list manipulation
if (temp.getID() > 3)
{
//syncList.remove(temp);
toBeRemoved.add(temp);
}
System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
}
syncList.removeAll(toBeRemoved);
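Another option with a plain ArrayList is to remove through the iterator itself. The following is only a sketch with Integer IDs standing in for MyObject (the class and method names are hypothetical). Keep in mind that the iterator of CopyOnWriteArrayList does not support remove(), so this applies to ArrayList only:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {
    // Removes every ID greater than the threshold while iterating.
    static void removeGreaterThan(List<Integer> ids, int threshold) {
        synchronized (ids) { // same lock that the adding thread would use
            Iterator<Integer> it = ids.iterator();
            while (it.hasNext()) {
                if (it.next() > threshold) {
                    it.remove(); // removal goes through the iterator, so no fail-fast exception
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ids.add(i);
        }
        removeGreaterThan(ids, 3);
        System.out.println(ids); // [0, 1, 2, 3]
    }
}
```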
Now regarding synchronization: you should strive to minimize its scope, otherwise threads will wait on each other unnecessarily. That is why the java.util.concurrent package exists: it provides high performance in multithreaded code (some classes even use non-blocking algorithms). You can also use Collections.synchronizedList(new ArrayList()), but it is not as good as the concurrent classes.
If you want conditional synchronization, as in the producer/consumer problem, you can use the wait()/notify() mechanism on the same object (lock). But again, there are already classes that help with this, such as java.util.concurrent.LinkedBlockingQueue.
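As a rough sketch of the LinkedBlockingQueue approach (not the asker's code: Integer IDs stand in for MyObject, and the POISON end marker is an assumption made for the example), one thread puts IDs on the queue and the other takes them, keeping only those that pass the sample filter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BlockingQueueDemo {
    private static final int POISON = -1; // hypothetical end-of-stream marker

    // Runs one producer and one consumer; returns the IDs the consumer kept.
    static List<Integer> run() throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        List<Integer> kept = new ArrayList<>(); // touched only by the consumer thread

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    queue.put(i);
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks until an element is available
                for (int id = queue.take(); id != POISON; id = queue.take()) {
                    if (id <= 3) { // sample filter, mirrors "remove if ID > 3"
                        kept.add(id);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join(); // after join, the consumer's writes to kept are visible here
        return kept;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // [0, 1, 2, 3]
    }
}
```

No explicit synchronized, wait() or notify() is needed here, because the queue does the blocking and the hand-off between the two threads.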
Related
I have a scenario where I have to maintain a Map which can be populated by multiple threads, each modifying their respective List (unique identifier/key being the thread name), and when the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReentrantLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.unlock();
}
}
There is one more separate thread that runs every 2 minutes (using the same lock) to persist all the records in the Map, to make sure something is persisted every 2 minutes and the map does not get too big.
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().lock();
List<T> instrumentList = instrumentMap.values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().unlock();
}
}
This solution works fine in almost every scenario we tested, except that sometimes some of the records go missing, i.e. they are not persisted at all, although they were added to the Map without problems.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here (no need I think, as ReentrantLock is already doing the same job)?
The answer provided by @Slaw in the comments did the trick. We were letting the instrumentList instance escape in a non-synchronized way, i.e. the list was being accessed and operated on without any synchronization. Passing a copy to the downstream methods fixed it.
Following line of code is the one where this issue was happening
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we are allowing the instrumentList instance to escape in a non-synchronized way: it is passed to another class (recordSaver.persist) that is supposed to act on it, but we also clear the list on the very next line (in the Aggregator class), and all of this happens without synchronization. The state of the list inside the record saver can't be predicted... a really stupid mistake.
We fixed the issue by passing a cloned copy of instrumentList to the recordSaver.persist(...) method. This way instrumentList.clear() has no effect on the list that recordSaver holds for further operations.
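The effect can be reproduced in isolation. In this sketch the RecordSaver class is a hypothetical stand-in that simply keeps the list it was given (as a saver working asynchronously might), and String records replace the generic T:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DefensiveCopyDemo {
    // Hypothetical stand-in for recordSaver: it keeps the list reference
    // and would work on it later.
    static class RecordSaver {
        List<String> pending;

        void persist(List<String> records) {
            pending = records;
        }
    }

    // Returns what the saver still holds after the aggregator cleared its list.
    static List<String> persistThenClear(boolean passCopy) {
        RecordSaver saver = new RecordSaver();
        List<String> instrumentList = new ArrayList<>(Arrays.asList("a", "b"));
        saver.persist(passCopy ? new ArrayList<String>(instrumentList) : instrumentList);
        instrumentList.clear(); // only affects the saver if it shares the same instance
        return saver.pending;
    }

    public static void main(String[] args) {
        System.out.println(persistThenClear(false)); // []
        System.out.println(persistThenClear(true));  // [a, b]
    }
}
```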
I see that you are using ConcurrentHashMap's parallelStream within a lock. I am not knowledgeable about Java 8+ stream support, but a quick search shows that:
ConcurrentHashMap is a complex data structure that has had concurrency bugs in the past
Parallel streams must abide by complex and poorly documented usage restrictions
You are modifying your data within a parallel stream
Based on that information (and my gut-driven concurrency bugs detector™), I would wager that removing the call to parallelStream might improve the robustness of your code. In addition, as mentioned by @Slaw, you should use an ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by the lock.
Of course, since you don't post the code of recordSaver, it is possible that it has bugs too (and not necessarily concurrency-related ones). In particular, you should make sure that the code that reads records from persistent storage, the code you are using to detect the loss of records, is safe, correct, and properly synchronized with the rest of your system (preferably by using a robust, industry-standard SQL database).
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concepts for concurrency are used: synchronized to ensure a shared list is properly updated and final to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;
public class Aggregator<T> implements Runnable {
private final List<T> instruments = new ArrayList<>();
private final RecordSaver recordSaver;
private final int batchSize;
public Aggregator(RecordSaver recordSaver, int batchSize) {
super();
this.recordSaver = recordSaver;
this.batchSize = batchSize;
}
public synchronized void addAll(List<T> moreInstruments) {
instruments.addAll(moreInstruments);
if (instruments.size() >= batchSize) {
storeInstruments();
}
}
public synchronized void storeInstruments() {
if (instruments.size() > 0) {
// in case recordSaver works async
// recordSaver.persist(new ArrayList<T>(instruments));
// else just:
recordSaver.persist(instruments);
instruments.clear();
}
}
@Override
public void run() {
while (true) {
try { Thread.sleep(1L); } catch (Exception ignored) {
break;
}
storeInstruments();
}
}
class RecordSaver {
void persist(List<?> l) {}
}
}
I have several computing threads that create result values (Objects). After each thread is finished, an 'add' method of a result collector is called. This result collector is a singleton, so there is only one instance.
Inside the result collector is a list which holds result objects:
List<TestResult> results = Collections.synchronizedList(new ArrayList<>());
The add method adds the result of each thread to the list:
public void addResult(TestResult result){
System.out.println("added " + result.getpValue());
this.results.add(result);
}
It is called within the thread, after the computing stuff is done.
The big problem is: After all threads are finished the list of results is empty. As you can see in the addResult method I added a print statement for the pValue. The p value of all results is printed out.
So it looks like the threads work on different lists. Despite the fact that the collector class is singleton.
The complete code of the result collector was requested (Javadoc removed to trim size):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class ResultCollector {
private static ResultCollector resultCollector;
private final List<TestResult> results;
public static ResultCollector getInstance(){
if(resultCollector == null){
resultCollector = new ResultCollector();
}
return resultCollector;
}
private ResultCollector() {
results = Collections.synchronizedList(new ArrayList<>());
}
public void addResult(TestResult result){
System.out.println("added " + result.getpValue());
this.results.add(result);
}
}
I updated the add method to print out the hash of the current class to make sure every thread has the same with:
System.out.println("added to" + System.identityHashCode(this) + " count: " +results.size());
The output hash code is the same for all threads and the size increases to the expected value. Also the hash code is the same when I call my toString method or getter for the list outside the multithread environment.
Calling of the threads:
public IntersectMultithread(...) {
Set<String> tracks = intervals.keySet();
for(String track: tracks){
IntersectWrapper wrapper = new IntersectWrapper(...);
wrappers.add(wrapper);
exe.execute(wrapper);
}
exe.shutdown();
}
A synchronized list is just a wrapper over a list. In modern Java you should actually be using the concurrent collections for this purpose; they implement smarter and more efficient locking and provide better performance.
Caveat: the only concurrent list implementation is one that copies on write (CopyOnWriteArrayList). So if that is an issue (i.e. you are adding more than iterating), then your way is fine.
The error in your code is almost certainly in your singleton class which you haven't shown. Either your class is not truly a singleton (did you use an enum? that's a good way to guarantee it), or your list creation is more confusing than let on.
If you post more code, I can update the answer with more info :).
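For reference, an enum-based singleton could look roughly like the sketch below. The names are hypothetical, and String stands in for TestResult, whose definition is not shown in the question:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// The JVM guarantees that INSTANCE is created exactly once, even under concurrency.
enum ResultCollectorEnum {
    INSTANCE;

    private final List<String> results = Collections.synchronizedList(new ArrayList<>());

    void addResult(String result) {
        results.add(result);
    }

    int size() {
        return results.size();
    }
}

class EnumSingletonDemo {
    public static void main(String[] args) {
        ResultCollectorEnum.INSTANCE.addResult("p=0.05");
        ResultCollectorEnum.INSTANCE.addResult("p=0.01");
        System.out.println(ResultCollectorEnum.INSTANCE.size()); // 2
    }
}
```

Unlike a lazy getInstance() with an unsynchronized null check, this cannot end up creating two instances when several threads ask for the collector at the same time.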
EDIT: I think your problem is here based on your updated code:
exe.shutdown();
You need to wait for the executor to complete with awaitTermination() with a good timeout relevant to the work you are doing.
Your threads just start and die off instantly right now :)
For example:
taskExecutor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
From here: https://stackoverflow.com/a/1250655/857994
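Put together, the shutdown sequence might look like this sketch. The pool size, the one-minute timeout and the squared-integer "results" are assumptions for the example, not part of the original code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AwaitTerminationDemo {
    // Submits the given number of tasks and returns how many results were collected.
    static int collect(int tasks) throws InterruptedException {
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        ExecutorService exe = Executors.newFixedThreadPool(4);
        for (int i = 0; i < tasks; i++) {
            final int value = i;
            exe.execute(() -> results.add(value * value));
        }
        exe.shutdown(); // stops accepting new tasks; already submitted tasks still run
        if (!exe.awaitTermination(1, TimeUnit.MINUTES)) { // block until they are done
            throw new IllegalStateException("tasks did not finish in time");
        }
        return results.size(); // read only after all tasks have completed
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(collect(8)); // 8
    }
}
```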
Addition to the correct answer above
Yes, the exe.shutdown(); is the problem, but the threads do not die instantly; instead they run to completion. This is why the 'add' method printed everything correctly once it was extended with a print statement.
The issue was that my output was produced before the threads had finished their computation, so there were no values at that time. Shortly afterwards the threads finish, which is why the print inside addResult works.
The first thread is filling a collection continuously with objects. A second thread needs to iterate over these objects, but it will not change the collection.
Currently I use Collections.synchronizedList to make it thread-safe, but is there a faster way of doing it?
Update
It's simple: The first thread (ui) continuously writes the mouse position to the ArrayList, as long as the mousebutton is pressed down. The second thread (render) draws a line based on the list.
Use java.util.concurrent.ArrayBlockingQueue, an implementation of BlockingQueue.
It is well suited to producer-consumer cases, and yours is one.
You can also configure the access policy. The Javadoc explains the fair parameter like this:
fair: if true then queue accesses for threads blocked on insertion or removal, are processed in FIFO order; if false the access order is unspecified.
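A small single-threaded sketch of the API (the capacity of 3 and the Integer "positions" are made up for the example): the second constructor argument enables the fair policy, offer() returns false instead of blocking when the queue is full, and drainTo() hands a batch to the consumer in one call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedQueueDemo {
    // Offers five values to a queue of capacity 3, then drains it.
    static List<Integer> offerThenDrain() {
        BlockingQueue<Integer> positions = new ArrayBlockingQueue<>(3, true); // true = fair
        for (int x = 0; x < 5; x++) {
            boolean accepted = positions.offer(x); // non-blocking; false once the queue is full
            System.out.println("offer(" + x + ") -> " + accepted);
        }
        List<Integer> batch = new ArrayList<>();
        positions.drainTo(batch); // moves all queued elements into batch, in FIFO order
        return batch;
    }

    public static void main(String[] args) {
        System.out.println(offerThenDrain()); // [0, 1, 2]
    }
}
```

In the mouse-drawing case the UI thread would call put() or offer() and the render thread would call take() or drainTo(); whether dropping positions when the queue is full is acceptable depends on the application.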
Even if you synchronize the list, it's not necessarily thread-safe while iterating over it, so make sure you synchronize on it:
synchronized(synchronizedList) {
for (Object o : synchronizedList) {
doSomething();
}
}
Edit:
Here's a very clearly written article on the matter:
http://java67.blogspot.com/2014/12/how-to-synchronize-arraylist-in-java.html
As mentioned in comments, you need explicit synchronization on this list, because iteration is not atomic:
List<?> list = // ...
Thread 1:
synchronized(list) {
list.add(o);
}
Thread 2:
synchronized(list) {
for (Object o : list) {
// do actions on object
}
}
There are 3 options I can currently think of to handle concurrency with ArrayList:
1. Using Collections.synchronizedList(list) - this is what you are currently using.
2. CopyOnWriteArrayList - behaves much like the ArrayList class, except that when the list is modified, instead of modifying the underlying array, a new array is created and the old array is discarded. Writes will be slower than with option 1.
3. Creating a custom list class using ReentrantReadWriteLock. You can create a wrapper around the ArrayList class. Use the read lock when reading/iterating/looping and the write lock when adding elements to the list.
For example:
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
public class ReadWriteList<E> {
private final List<E> list;
private ReadWriteLock lock = new ReentrantReadWriteLock();
private final Lock r =lock.readLock();
private final Lock w =lock.writeLock();
public ReadWriteList(List<E> list){
this.list=list;
}
public boolean add(E e){
w.lock();
try{
return list.add(e);
}
finally{
w.unlock();
}
}
//Do the same for other modification methods
public E getElement(int index){
r.lock();
try{
return list.get(index);
}
finally{
r.unlock();
}
}
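public List<E> getList(){
r.lock();
try{
// return a snapshot: handing out the backing list itself would let callers bypass the lock
return new java.util.ArrayList<E>(list);
}
finally{
r.unlock();
}
}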
//Do the same for other read methods
}
If you're reading far more often than writing, you can use CopyOnWriteArrayList
Rather than a List will a Set suit your needs?
If so, you can use Collections.newSetFromMap(new ConcurrentHashMap<>())
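A quick sketch of that Set in use (four threads and the "item-N" strings are arbitrary choices for the example). Duplicates added by different threads collapse into one entry, and no external locking is needed for add():

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetDemo {
    // Four threads each add the same 100 strings; returns the final set size.
    static int fill() throws InterruptedException {
        Set<String> seen = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100; i++) {
                    seen.add("item-" + i); // thread-safe add backed by ConcurrentHashMap
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fill()); // 100
    }
}
```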
This is a follow-up post to "Do I need a concurrent collection for adding elements to a list by many threads?"
Everybody there focused on the expansion of the list. I understand how that can be a problem, but what about adding elements? Is that thread-safe?
example code
static final Collection<String> FILES = new ArrayList<String>(1000000);
and I execute in many threads (I add less than 1000000 elements)
FILES.add(string)
is that thread safe ? what are possible problems with doing it that way ?
ArrayList<String> is not synchronized by itself. Use Collections.synchronizedList(new ArrayList<String>()) instead.
Use Collections.synchronizedList(new ArrayList<String>(1000000))
Even if the list doesn't expand, it maintains internal state, such as the current size. In a wider sense, ArrayList is specified as a mutable object that is not thread-safe, so it is illegal to call any mutator methods on it concurrently.
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
This is the source code of ArrayList in 7u40-b43.
The method ensureCapacityInternal() is not thread-safe, and neither is size++.
size++ is performed in three steps: 1) read size, 2) add 1, 3) write the new value back.
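The lost updates can be observed with a small experiment. This is only a sketch: the thread count and element counts are arbitrary, and the list is pre-sized as in the question so that no resizing takes place. The result for the plain ArrayList varies from run to run, while the synchronized wrapper always ends up with every element:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ArrayListRaceDemo {
    // Starts the given number of threads, each adding perThread elements; returns the final size.
    static int fill(List<Integer> target, int threads, int perThread) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return target.size();
    }

    public static void main(String[] args) throws InterruptedException {
        int unsafe = fill(new ArrayList<>(1_000_000), 4, 10_000);
        int safe = fill(Collections.synchronizedList(new ArrayList<>(1_000_000)), 4, 10_000);
        System.out.println("plain ArrayList:  " + unsafe + " (may be less than 40000)");
        System.out.println("synchronizedList: " + safe);
    }
}
```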
I have two threads modifying the same objects. The objects are custom, non-synchronized objects in an ArrayList (not a vector). I want to make these two threads work nicely together, since they are called at the same time.
Here is the only important method in thread 1.
public void doThread1Action() {
//something...
for(myObject x : MyArrayList){
modify(x);
}
}
Here is a method in thread 2:
public void doThread2Action() {
//something...
for(myObject x : MyArrayList){
modifyAgain(x);
}
}
At the moment, when testing, I occasionally get ConcurrentModificationExceptions. (I think it depends on how fast thread 1 finishes its iterations, before thread 2 tries to modify the objects.)
Am I right in thinking that by simply appending synchronized to the beginning of these two methods, the threads will work together in a synchronized way and not try to access the ArrayList? Or should I change the ArrayList to a Vector?
A ConcurrentModificationException does not stem from modifying objects in a collection but from adding / removing from a collection while an iterator is active.
The shared resource is the collection, and there must be a third method that uses it and adds or removes elements. To get concurrency right you must synchronize access to the collection in all methods that access it.
To avoid overly long synchronized blocks a common pattern may be to copy the collection in a synchronized block and then iterate over it. If you do it this way, be aware the problem you are talking about in first place (concurrent modification of your object) is again in place - but this time you can lock on another resource.
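A minimal sketch of that copy-then-iterate pattern follows. The Counter element type and the method names are hypothetical. The list lock is held only while copying, and each element is then locked individually for the modification:

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotIterationDemo {
    static class Counter { // hypothetical element type
        int value;
    }

    private final List<Counter> items = new ArrayList<>();

    void add(Counter c) {
        synchronized (items) { // structural changes always take the list lock
            items.add(c);
        }
    }

    void incrementAll() {
        List<Counter> snapshot;
        synchronized (items) { // short critical section: copy only
            snapshot = new ArrayList<>(items);
        }
        for (Counter c : snapshot) { // iterate without holding the list lock
            synchronized (c) {       // a per-element lock guards the modification
                c.value++;
            }
        }
    }

    public static void main(String[] args) {
        SnapshotIterationDemo demo = new SnapshotIterationDemo();
        Counter first = new Counter();
        demo.add(first);
        demo.incrementAll();
        System.out.println(first.value); // 1
    }
}
```

Elements added after the snapshot was taken are simply not visited in that pass, which is usually acceptable but worth knowing.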
You do not need to synchronize access to the list as long as you don't modify it structurally, i.e. as long as you don't add or remove objects from the list. You also shouldn't see ConcurrentModificationExceptions, because these are only thrown when you structurally modify the list.
So, assuming that you only modify the objects contained in the list, but you do not add or remove or reorder objects on the list, it is possible to synchronize on the contained objects whenever you modify them, like so:
void modifyAgain(MyObject x) {
synchronized(x) {
// do the modification
}
}
I would not use the synchronized modifier on the modifyAgain() method, as that would not allow two distinct objects in the list to be modified concurrently.
The modify() method in the other thread must of course be implemented in the same way as modifyAgain().
You need to synchronize access to the collection on the same lock. Just putting the synchronized keyword on the methods (assuming they are in different classes) would lock on two different objects.
So here is an example of what you might need to do:
Object lock = new Object();
public void doThread1Action(){
//something...
synchronized(lock){
for(myObject x : MyArrayList){
modify(x);
}
}
}
public void doThread2Action(){
//something...
synchronized(lock){
for(myObject x : MyArrayList){
modifyAgain(x);
}
}
}
Also you could consider using a CopyOnWriteArrayList instead of Vector
I guess your problem is related to ConcurrentModificationException. The Javadoc for this class says:
/**
* This exception may be thrown by methods that have detected concurrent
* modification of an object when such modification is not permissible.
*/
In your case, the problem is that the list may be modified while an iterator over it is active. I guess the following implementation will solve your problem:
public void doThread1Action()
{
synchronized(MyArrayList) // for example: lock on the shared list
{
//something...
for(myObject x : MyArrayList)
{
modify(x);
}
}
}
and then:
public void doThread2Action()
{
synchronized(MyArrayList) // for example: lock on the shared list
{
//something...
for(myObject x : MyArrayList)
{
modifyAgain(x);
}
}
}
If anyone can improve on this solution, corrections are welcome.