Problems with adding string to a list by many threads - java

this is a follow-up post to Do I need a concurrent collection for adding elements to a list by many threads?
everybody there has focused on expaning of the list. I understand how that can be a problem, but .. what about adding elements ? is that thread-safe?
example code
static final Collection<String> FILES = new ArrayList<String>(1000000);
and I execute in many threads (I add less than 1000000 elements)
FILES.add(string)
is that thread safe ? what are possible problems with doing it that way ?

ArrayList<String> is not synchronized by itself. Use Collections.synchronizedList(new ArrayList<String>()) instead.

use Collections.syncronizedList(new ArrayList<String>(1000000))

Even if the list doesn't expand, it maintains its internal state, such as the current size. In a wider sense, ArrayList is specified a mutable object which is not thread-safe, therefore it is illegal to concurrently call any mutator methods on it.

public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
This is the source cod of the ArrayList in 7u40-b43.
The method ensureCapacityInternal() is not thread safe,and size++ is not either.
The size++ is done by three steps,1)get the size,2) size +1,3) write the new value back.

Related

Updating a Shared Resource in Multithreaded Program

Can someone explain the output of the following program:
public class DataRace extends Thread {
static ArrayList<Integer> arr = new ArrayList<>();
public void run() {
Random random = new Random();
int local = random.nextInt(10) + 1;
arr.add(local);
}
public static void main(String[] args) {
DataRace t1 = new DataRace();
DataRace t2 = new DataRace();
DataRace t3 = new DataRace();
DataRace t4 = new DataRace();
t1.start();
t2.start();
t3.start();
t4.start();
try {
t1.join();
t2.join();
t3.join();
t4.join();
} catch (InterruptedException e) {
System.out.println("interrupted");
}
System.out.println(DataRace.arr);
}
}
Output:
[8, 5]
[9, 2, 2, 8]
[2]
I am having trouble understanding the varying number of values in my output. I would expect the main thread to either wait until all threads have finished execution as I am joining them in the try-catch block and then output four values, one from each thread, or print to the console in case of an interruption. Neither of which is really happening here.
How does it come into play here if this is due to data race in multithreading?
The main problem is that multiple threads are adding to the same shared ArrayList concurrently. ArrayList is not thread-safe. From source one can read:
Note that this implementation is not synchronized.
If multiple threads
access an ArrayList instance concurrently, and at least one of the
threads modifies the list structurally, it must be synchronized
externally. (A structural modification is any operation that adds or
deletes one or more elements, or explicitly resizes the backing array;
merely setting the value of an element is not a structural
modification.) This is typically accomplished by synchronizing on some
object that naturally encapsulates the list. If no such object exists,
the list should be "wrapped" using the Collections.synchronizedList
method. This is best done at creation time, to prevent accidental
unsynchronized access to the list:
In your code every time you call
arr.add(local);
inside the add method implementation, among others, a variable that keeps track of the size of the array will be updated. Below is shown the relevant part of the add method of the ArrayList:
private void add(E e, Object[] elementData, int s) {
if (s == elementData.length)
elementData = grow();
elementData[s] = e;
size = s + 1; // <--
}
where the variable field size is:
/**
* The size of the ArrayList (the number of elements it contains).
*
* #serial
*/
private int size;
Notice that neither is the add method synchronized nor the variable size is marked with the volatile clause. Hence, suitable to race-conditions.
Therefore, because you did not ensure mutual exclusion on the accesses to that ArrayList (e.g., surrounding the calls to the ArrayList with the synchronized clause), and because the ArrayList does not ensure that the size variable is updated atomically, each thread might see (or not) the last updated value of that variable. Hence, threads might see outdated values of the size variable, and add elements into positions that already other threads have added before. In the extreme, all threads might end-up adding an element into the same position (e.g., as one of your outputs [2]).
The aforementioned race-condition leads to undefined behavior, hence the reason why:
System.out.println(DataRace.arr);
outputs different number of elements in different execution of your code.
To make the ArrayList thread-safe or for alternatives have a look at the following SO thread: How do I make my ArrayList Thread-Safe?, where it showcases the use of Collections.synchronizedList()., CopyOnWriteArrayList among others.
An example of ensuring mutual exclusion of the accesses to the arr structure:
public void run() {
Random random = new Random();
int local = random.nextInt(10) + 1;
synchronized (arr) {
arr.add(local);
}
}
or :
static final List<Integer> arr = Collections.synchronizedList(new ArrayList<Integer>());
public void run() {
Random random = new Random();
int local = random.nextInt(10) + 1;
arr.add(local);
}
TL;DR
ArrayList is not Thread-Safe. Therefore it's behaviour in a race-condition is undefined. Use synchronized or CopyOnWriteArrayList instead.
Longer answer
ArrayList.add ultimately calls this private method:
private void add(E e, Object[] elementData, int s) {
if (s == elementData.length)
elementData = grow();
elementData[s] = e;
size = s + 1;
}
When two Threads reach this same point at the "same" time, they would have the same size (s), and both will try add an element on the same position and update the size to s + 1, thus likely keeping the result of the second.
If the size limit of the ArrayList is reached, and it has to grow(), a new bigger array is created and the contents copied, likely causing any other changes made concurrently to be lost (is possible that multiple threads will be trying to grow).
Alternatives here are to use monitors - a.k.a. synchronized, or to use Thread-Safe alternatives like CopyOnWriteArrayList.
I think there is a lot of similar or closely related questions. For example see this.
Basically the reason of this "unexpected" behabiour is because ArrayList is not thread-safe. You can try List<Integer> arr = new CopyOnWriteArrayList<>() and it will work as expected. This data structure is recommended when we want to perform read operation frequently and the number of write operations is relatively rare. For good explanation see What is CopyOnWriteArrayList in Java - Example Tutorial.
Another option is to use List<Integer> arr = Collections.synchronizedList(new ArrayList<>()).
You can also use Vector but it is not recommended (see here).
This article also will be useful - Vector vs ArrayList in Java.

Synchronise one process and parallel group

I'm getting ConcurrentModificationException, so i decided to try something new.
I have an ArrayList of points, which is used in Recycler View for download all graphs in parallel threads.
Is it possible to lock this downloads, when i'm writing to ArrayList and not having a chain download later?
Because now i'm using this in my Fragment:
public static final Object lock = new Object();
synchronized (lock) { /*download and write to ArrayList*/ }
And in every Cell i want to use
synchronized (lock) { /*download graphs*/ }
And i think when one cell is loading graph, all will wait, then second will load and so on.
ConcurrentModificationException usually indicates that a collection was modified while an iterator was in progress. It's not necessarily in a different thread :
List<Object> someList = new ArrayList<>();
someList.add("an item");
for (Object item : someList) {
// throws ConcurrentModificationException
someList.add("a new item");
}
It's difficult to know what really happen in your case. What happen if you replace ArrayLists with CopyOnWriteArrayList ? They are much less efficient than ArrayLists but it probably doesn't matter, at least compared to network operations.

Is it safe to loop over the same List simultaneously?

I have a list
testList= new ArrayList<String>(); // gets initialized once at the very beginning init(). This list never changes. It's static and final.
I have a static method that checks if an input value is in this List :
public static boolean isInList (String var1)
throws InterruptedException {
boolean in = false;
for (String s : testList) {
if (s.equals(var1))
{
in = true;
break;
}
}
return in;
}
I have a lot of threads that use this method concurrently and check if a certain value is in this list. It seems to work fine. However, I'm not sure if this is safe. Am I doing this correctly? Is it thread safe?
It is thread-safe as long as no thread is modifying the list while other threads are reading it.
If you are using iterators over the list, they will "fail-fast" (as in throw ConcurrentModificationException) if the list is modified under them. Other methods of accessing (i.e. get(n)) won't throw an exception but may return unexpected results.
This is all covered in great detail in the Javadoc for List and ArrayList, which you should study carefully.
ArrayList is not a thread safe object. It may works for you now, but in general, when working with threads, you should make sure you're using thread-safe objects that will work with your threads as you expect.
You can use Collections.synchronizedList()
testList = Collections.synchronizedList(new ArrayList<String>());
As long as you can guarantee that no one is writing to the list, it's safe.
Note that even if the list is static and final, the code itself doesn't guarantee that the list is never modified. I recommend using Collections.unmodifiableList() instead, because it guarantees that no element is ever added to or removed from the list.
By the way, you can rewrite your code to this:
public static boolean isInList(String var1) {
for (String s : testList) {
if (Objects.equals(s, var1)) {
return true;
}
}
return false;
}
or just
testList.contains(var1);

Multiple threads editing an ArrayList and its objects

I am trying to run 2 concurrent threads, where one keeps adding objects to a list and the other updates these objects and may remove some of these objects from the list as well.
I have a whole project where I've used ArrayList for my methods and classes so it's difficult to change it now.
I've looked around and I found a few ways of doing this, but as I said it is difficult to change from ArrayList. I tried using synchronized and notify() for the method adding the objects to the list and wait() for the method changing these objects and potentially removing them if they meet certain criteria.
Now, I've figured out how to do this using a CopyOnWriteArrayList, but I would like to know if there's a possibility of using ArrayList itself to simulate this. so that I don't have to edit my entire code.
So, basically, I would like to do something like this, but with ArrayList:
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;
public class ListExample{
CopyOnWriteArrayList<MyObject> syncList;
public ListExample(){
syncList = new CopyOnWriteArrayList<MyObject>();
Thread thread1 = new Thread(){
public void run(){
synchronized (syncList){
for(int i = 0; i < 10; i++){
syncList.add(new MyObject(i));
}
}
}
};
Thread thread2 = new Thread(){
public void run(){
synchronized (syncList){
Iterator<MyObject> iterator = syncList.iterator();
while(iterator.hasNext()){
MyObject temp = iterator.next();
//this is just a sample list manipulation
if (temp.getID() > 3)
syncList.remove(temp);
System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
}
}
}
};
thread1.start();
thread2.start();
}
public static void main(String[] args){
new ListExample();
}
}
class MyObject{
private int ID;
public MyObject(int ID){
this.ID = ID;
}
public int getID(){
return ID;
}
public void setID(int ID){
this.ID = ID;
}
}
I've also read about Collections.synchronizedList(new ArrayList()) but again, I believe this would require me to change my code as I have a substantial number of methods that take ArrayList as a parameter.
Any guidance would be appreciated, because I am out of ideas. Thank you.
You may be interested on the collections provided by the java.util.concurrent package. They are very useful for producer/consumer scenarios, where one or more threads add things to a queue, and other threads take them. There are different methods depending on whether you want to block, or fail when the queue is full/empty.
About refactoring your methods, you should have used interfaces (e.g. List) instead of concrete implementation classes (such as ArrayList). That is the purpose of interfaces, and the Java API has a good suply of them.
As a quick solution u may extend ArrayList and make modifying methods (add/remove) synchronized. and re-factor code to replace ArrayList to your custom-ArrayList
Use Vector instead of ArrayList. Remember to store it in a List reference as Vector contains deprecated methods. Vector, unlike ArrayList, synchronizes its internal operations, and unlike CopyOnWriteArrayList, does not copy the internal array each time a modification is made.
Of course you should be using java.util.concurrent pakage. But let's look at what is happening/could happen with only ArrayList and synchronization.
In your code, if you have just ArrayList in place of CopyOnWriteArrayList, it should work as you have provided full synchronization synchronized (syncList) on whatever you are doing/manipulating in threads. You do not require any wait() notify() if whole thing is synchronized (But that's not recommended, will come to that).
But this code will give ConcurrentModificationException because once you are using iterator syncList.iterator() you should not remove element from that list, otherwise it may give undesirable results while iterating that's why it's designed to fail fast and give exception. To avoid this you can use like:
Iterator<MyObject> iterator = syncList.iterator();
ArrayList<MyObject> toBeRemoved = new ArrayList<MyObject>();
while(iterator.hasNext()){
MyObject temp = iterator.next();
//this is just a sample list manipulation
if (temp.getID() > 3)
{
//syncList.remove(temp);
toBeRemoved.add(temp);
}
System.out.println("Object ID: " + temp.getID() + " AND list size: " + syncList.size());
}
syncList.removeAll(toBeRemoved);
Now regarding synchronization, you should strive to minimize its scope otherwise there'll be unnecessary waiting between threads, thats why java.util.concurrent package is given to have high performance in multithreading (using even non blocking algorithms). Or you can also use Collections.synchronizedList(new ArrayList()) but they are not as good as concurrent classes.
If you want to use conditional synchronization like in producer/consumer problem, then you can use wait() notify() mechanism on same object (lock). But again there're already some classes to help like using java.util.concurrent.LinkedBlockingQueue.

Java and Synchronizing two threads

I have two threads modifying the same objects. The objects are custom, non-synchronized objects in an ArrayList (not a vector). I want to make these two threads work nicely together, since they are called at the same time.
Here is the only important method in thread 1.
public void doThread1Action() {
//something...
for(myObject x : MyArrayList){
modify(x);
}
}
Here is a method in thread 2:
public void doThread2Action() {
//something...
for(myObject x : MyArrayList){
modifyAgain(x);
}
}
At the moment, when testing, I occasionally get `ConcurrentModificationExceptions``. (I think it depends on how fast thread 1 finishes its iterations, before thread 2 tries to modify the objects.)
Am I right in thinking that by simply appending synchronized to the beginning of these two methods, the threads will work together in a synchronized way and not try to access the ArrayList? Or should I change the ArrayList to a Vector?
A ConcurrentModificationException does not stem from modifying objects in a collection but from adding / removing from a collection while an iterator is active.
The shared resources is the collection and there must be a third method using and add/remove. To get concurrency right you must synchronize access to the collection resource in all methods that access it.
To avoid overly long synchronized blocks a common pattern may be to copy the collection in a synchronized block and then iterate over it. If you do it this way, be aware the problem you are talking about in first place (concurrent modification of your object) is again in place - but this time you can lock on another resource.
You do not need to synchronize access to the list as long as you don't modify it structurally, i.e. as long as you don't add or remove objects from the list. You also shouldn't see ConcurrentModificationExceptions, because these are only thrown when you structurally modify the list.
So, assuming that you only modify the objects contained in the list, but you do not add or remove or reorder objects on the list, it is possible to synchronize on the contained objects whenever you modify them, like so:
void modifyAgain(MyObject x) {
synchronized(x) {
// do the modification
}
}
I would not use the synchronized modifier on the modifyAgain() method, as that would not allow two distinct objects in the list to be modified concurrently.
The modify() method in the other thread must of course be implemented in the same way as modifyAgain().
You need to sychronsize access to the collection on the same lock, so just using synchronized keyword on the methods (assuming they are in different classes) would be locking on two different objects.
so here is an example of what you might need to do:
Object lock = new Object();
public void doThread1Action(){
//something...
synchronized(lock){
for(myObject x : MyArrayList){
modify(x);
}
}
public void doThread2Action(){
//something...
synchronized(lock){
for(myObject x : MyArrayList){
modifyAgain(x);
}
}
Also you could consider using a CopyOnWriteArrayList instead of Vector
I guess your problem is related to ConcurrentModificationException. This class in its Java docs says:
/**
* This exception may be thrown by methods that have detected
concurrent
* modification of an object when such modification is not
permissible.
*/
In your case, problem is iterator in a list and may modified. I guess by following implementation your problem will sole:
public void doThread1Action()
{
synchronized(x //for sample)
{
//something...
for(myObject x : MyArrayList)
{
modify(x);
}
}
}
and then:
public void doThread2Action()
{
synchronized(x //for sample)
{
//something...
for(myObject x : MyArrayList)
{
modifyAgain(x);
}
}
}
For take better result I want anyone correct my solution.

Categories