Java Vector and thread safety


I'm wondering if this code will cause any trouble:
I have a vector that is shared among many threads. Every time a thread has to add/remove stuff from the vector I do it under a synchronized block. However, the main thread has a call:
System.out.println("the vector's size: "+ vec.size());
which isn't synchronized.
Should this cause trouble?

All Vector methods are synchronized themselves, so as long as you are only synchronizing around a single method, your own synchronization is not necessary. If you have several method calls, which depend on each other, e.g. something like vec.get(vec.size()-2) to get the second last element, you have to use your own synchronization since otherwise, the vector may change between vec.size() and vec.get().
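As a minimal sketch of the compound case described above: the class and method names below are hypothetical, and the only assumption is that every thread shares the same Vector instance. Because Vector's own methods lock the Vector object itself, holding that same monitor keeps size() and get() consistent with each other.

```java
import java.util.Vector;

final class VectorCompoundAccess {
    // Returns the second-to-last element, or null if there are fewer than two elements.
    // The block locks the same monitor that Vector's synchronized methods use,
    // so no other thread can add or remove between size() and get().
    static <T> T secondLast(Vector<T> vec) {
        synchronized (vec) {
            int size = vec.size();
            return size >= 2 ? vec.get(size - 2) : null;
        }
    }

    public static void main(String[] args) {
        Vector<String> vec = new Vector<>();
        vec.add("a");
        vec.add("b");
        vec.add("c");
        System.out.println(secondLast(vec)); // prints "b"
    }
}
```

A single call such as vec.size() for logging needs no extra block; the lock only matters once two calls have to agree with each other.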

I assume you are referring to java.util.Vector.
Actually Vector.size() is synchronized and will return a value consistent with the vector's state (when the thread calling size() enters the monitor.) If it returns 42, then at some point in time the vector contained exactly 42 elements.
If you're adding items in a loop in another thread then you cannot predict the exact size, but it should be fine for monitoring purposes.

Each of the methods of java.util.Vector is synchronized, so this won't cause any problems for something that is just logging the size.
To improve performance you may be better off replacing your Vector with an ArrayList. The methods of ArrayList aren't synchronized, so you would need to synchronize all access yourself.

Keep in mind that you can always obtain a synchronized view of a collection with the static method Collections.synchronizedCollection(Collection<T> c).

Related

Adding to ArrayList from separate thread

I have a program with 3 threads (excluding the main thread). The first thread moves an object across the window, the second thread checks for object collisions, and the third is supposed to add to the ArrayList of objects periodically. All three of these threads are manipulating the same list of objects (Though the first 2 are not actually changing the list, just the objects inside). However, when the thread meant to add to the list tries to add an object, I receive an error. Is it possible to manipulate an ArrayList from a different thread?

You can prevent the race conditions by placing the code that manipulates the array list inside synchronized(arrayList) { ... } blocks.
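A rough sketch of that suggestion, with hypothetical names: one thread appends to a plain ArrayList while the calling thread keeps reading it, and both sides use synchronized blocks on the same list object. It assumes nothing else touches the list without taking that lock.

```java
import java.util.ArrayList;
import java.util.List;

final class SynchronizedBlockDemo {
    // Starts one thread that appends `additions` items while the calling thread
    // repeatedly iterates over the list; returns the final size.
    static int run(int additions) {
        final List<Integer> objects = new ArrayList<>();

        Thread adder = new Thread(() -> {
            for (int i = 0; i < additions; i++) {
                synchronized (objects) {   // writer takes the list's monitor
                    objects.add(i);
                }
            }
        });
        adder.start();

        long checksum = 0;
        while (adder.isAlive()) {
            synchronized (objects) {       // reader holds the same monitor for the whole loop
                for (int value : objects) {
                    checksum += value;
                }
            }
        }

        try {
            adder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (objects) {
            return objects.size();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(10_000)); // prints 10000
    }
}
```

The important detail is that the reader's lock covers the entire iteration, not just a single get() call.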

There is nothing special about ArrayList which prevents it from being read and written from multiple threads. However, note the warning in the Javadoc:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new ArrayList(...));
It is also worth reading through the Synchronization Tutorial.

Yes, you can work with the ArrayList from multiple threads. You can read more in the Java documentation about using the synchronized keyword with objects.

First, if you have a multithreaded application, prefer something like Vector over ArrayList, since ArrayList is not thread safe.
Also, to handle concurrency, you can put the operations in a synchronized method or use a synchronized block.

ArrayList vs Vector performance in single-threaded application

I was looking for the answer to the question of why ArrayList is faster than Vector, and I found that ArrayList is faster because it is not synchronized.
So my questions are:
If ArrayList is not synchronized, why would we use it in a multithreaded environment and compare it with Vector?
If we are in a single-threaded environment, how does the performance of Vector decrease when there is no synchronization going on, since we are dealing with a single thread?
Why should we compare the performance at all, considering the above points?
Please guide me :)

a) Methods using ArrayList in a multithreaded program may be synchronized.
class X {
List l = new ArrayList();
synchronized void add(Object e) {
l.add(e);
}
...
b) We can use ArrayList without exposing it to other threads, this is when ArrayList is referenced only from local variables
void x() {
List l = new ArrayList(); // no other thread except current can access l
...
Even in a single-threaded environment, entering a synchronized method takes a lock; this is where we lose performance:
public synchronized boolean add(E e) { // current thread will take a lock here
modCount++;
...

You can use ArrayList in a multithread environment if the list is not shared between threads.
If the list is shared between threads you can synchronize the access to that list.
Otherwise you can use Collections.synchronizedList() to get a List that can be used in a thread-safe way.
Vector is an old synchronized List implementation that is rarely used anymore, because its internal implementation simply synchronizes every method. Generally you want to synchronize a sequence of operations instead. Otherwise you can get a ConcurrentModificationException when another thread modifies the list while you are iterating over it. In addition, synchronizing every method is not good from a performance point of view.
Furthermore, even in a single-threaded environment, calling a synchronized method has to do some extra work, so Vector is not a good choice in a single-threaded application either.

Just because a component is single threaded doesn't mean that it cannot be used in a thread safe context. Your application may have its own locking, in which case additional locking is redundant work.
Conversely, just because a component is thread safe, it doesn't mean that you cannot use it in an unsafe manner. Typically thread safety extends to a single operation. E.g. if you take an Iterator and call next() on a collection this is two operations and they are no longer thread safe when used in combination. You still have to use locking for Vector. Another simple example is
private Vector<Integer> vec =
vec.add(1);
int n = vec.remove(vec.size() - 1);
assert n == 1;
This is at least three operations; however, the number of things which can go wrong is much larger than you might suppose. This is why you end up doing your own locking and why the locking inside Vector might be redundant, even unwanted.
For your own interest:
vec can change at any point to another Vector or null
vec.add(2) can happen between any operation, changing the size and the last element.
vec.remove() can happen between any operation.
vec.add(null) can happen between any operation resulting in a possible NullPointerException
The vec can /* change */ in these places.
private Vector<Integer> vec =
vec.add(1); /* change*/
int n = vec.remove(vec.size() - 1 /* change*/);
assert n == 1;
In short, assuming that just because you used a thread safe collection your code is now thread safe is a big assumption.
A common pattern which breaks is
for(int n : vec) {
// do something.
}
It looks harmless enough, except that it is equivalent to
for(Iterator<Integer> iter = vec.iterator(); /* change */ iter.hasNext(); ) {
/* change */ int n = iter.next();
}
I have marked with /* change */ where another thread could change the collection, meaning this loop can get a ConcurrentModificationException (but might not).
You wrote "there is no Synchronization going on".
The JVM doesn't know there is no need for synchronization and so it still has to do something. It has an optimisation to reduce the cost of uncontended locks, but it still has to do work.
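For the iteration hazard described earlier in this answer, here is a small sketch of two ways to make the loop safe. The names are hypothetical. sumUnderLock holds the Vector's monitor for the whole loop; snapshot is an alternative the answer does not mention, copying the contents under the lock so the caller can iterate the copy without blocking writers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

final class VectorSnapshotDemo {
    // Option 1: hold the vector's monitor for the whole iteration.
    static long sumUnderLock(Vector<Integer> vec) {
        synchronized (vec) {
            long sum = 0;
            for (int n : vec) {
                sum += n;
            }
            return sum;
        }
    }

    // Option 2: copy under the lock, then iterate the private copy lock-free.
    static <T> List<T> snapshot(Vector<T> vec) {
        synchronized (vec) {
            return new ArrayList<>(vec);
        }
    }

    public static void main(String[] args) {
        Vector<Integer> vec = new Vector<>();
        vec.add(1);
        vec.add(2);
        vec.add(3);
        for (int n : snapshot(vec)) {
            vec.add(n * 10); // changing vec here cannot disturb the loop over the copy
        }
        System.out.println(vec); // prints [1, 2, 3, 10, 20, 30]
    }
}
```

The snapshot trades memory and possibly stale data for shorter lock hold times, so which option fits depends on how large the vector is and how fresh the data must be.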

You need to understand the basic concept to know answer for your above questions...
When we say ArrayList is not synchronized and Vector is, we mean that the methods (add(), get(), remove() etc.) are synchronized in the Vector class and not in the ArrayList class. These methods act upon the data being stored.
So the data stored in a Vector cannot be edited or read in parallel, as the add, get and remove methods are synchronized, while the same operations on an ArrayList can run in parallel because its methods are not synchronized...
This parallel activity makes ArrayList fast and Vector slow... This behavior remains the same whether you use them in a multithreaded or a single-threaded environment...
Hope this answers your question...

Can objects get lost if lots of threads add to and remove from a LinkedList quickly?

This may sound like a silly question; I have just started with Java concurrency.
I have a LinkedList that acts as a task queue and is accessed by multiple threads. Some threads removeFirst() a task and execute it, while other threads put more tasks in (.add()). A task can have the thread put it back into the queue.
I notice that when there are a lot of tasks and they are put back into the queue a lot, the number of tasks that come out is not the number I added initially: 1, or sometimes 2, are missing.
I checked everything, and I synchronized every critical section + notifyAll().
I have already marked the LinkedList as 'volatile'.
The exact numbers are 384 tasks, each put back 3072 times.
The problem doesn't occur if there is a small number of tasks and put-backs. Also, if I System.out.println() all the steps then it doesn't happen anymore, so I can't debug it.
Could it be that LinkedList.add() is not fast enough, so the threads somehow miss it?
Simplified code:
public void callByAllThreads() {
Task executedTask = null;
do
{
// access by multiple thread
synchronized(asyncQueue) {
executedTask = asyncQueue.poll();
if(executedTask == null) {
inProcessCount.incrementAndGet(); // mark that there is some processing going on
}
}
if(executedTask != null) {
executedTask.callMethod(); // subclass of task can override this method
synchronized(asyncQueue) {
inProcessCount.decrementAndGet();
asyncQueue.notifyAll();
}
}
}
while(executedTask != null);
}
The Task can override callMethod:
public void callMethodOverride() {
synchronized(getAsyncQueue()) {
getAsyncQueue().add(this);
getAsyncQueue().notifyAll();
}
}

From the docs for LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
i.e. you should synchronize access to the list. You say you are, but if you are seeing items get "lost" then you probably aren't synchronizing properly. Instead of trying to do that, you could use a framework class that does it for you ...
... If you are always removing the next available (first) item (effectively a producer/consumer implementation) then you could use a BlockingQueue implementation. This is guaranteed to be thread safe, and has the advantage of blocking the consumer until an item is available. An example is the ArrayBlockingQueue.
For non-blocking thread-safe queues you can look at ConcurrentLinkedQueue
Marking the list instance variable volatile has nothing to do with your list being synchronized for mutation methods like add or removeFirst. volatile is simply to do with ensuring that read/write for that instance variable is communicated correctly between, and ordered correctly within, threads. Note I said that variable, not the contents of that variable (see the Java Tutorials > Atomic Access)
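A minimal sketch of the BlockingQueue approach suggested above, not the asker's actual code: the Task class, the counters and the termination rule are all assumptions made for illustration. Each task asks to be re-queued until its run budget is used up, and the total number of executions is counted so that a lost task would show up as a wrong total.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

final class RequeueDemo {
    // Hypothetical task: runs, then wants to be re-queued until its budget is spent.
    static final class Task {
        private int remainingRuns;
        Task(int runs) { this.remainingRuns = runs; }
        // Returns true if the task should go back on the queue.
        boolean runOnce() { return --remainingRuns > 0; }
    }

    // Runs taskCount tasks, each runsPerTask (>= 1) times; returns total executions.
    static long run(int taskCount, int runsPerTask, int workerCount) {
        BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
        AtomicInteger unfinished = new AtomicInteger(taskCount);
        AtomicLong executions = new AtomicLong();
        for (int i = 0; i < taskCount; i++) {
            queue.add(new Task(runsPerTask));
        }

        Runnable worker = () -> {
            try {
                while (unfinished.get() > 0) {
                    // Timed poll so idle workers re-check the exit condition.
                    Task task = queue.poll(10, TimeUnit.MILLISECONDS);
                    if (task == null) {
                        continue;
                    }
                    executions.incrementAndGet();
                    if (task.runOnce()) {
                        queue.put(task);              // hand the task back safely
                    } else {
                        unfinished.decrementAndGet(); // this task is done for good
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(worker);
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println(run(384, 50, 4)); // prints 19200
    }
}
```

No explicit synchronized blocks, wait() or notifyAll() calls are needed here, because the queue does the locking and the hand-off between threads itself.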

LinkedList is definitely not thread safe; you cannot use it safely with multiple threads. It's not a question of "fast enough," it's a question of changes made by one thread being visible to other threads. Marking it volatile doesn't help; that only affects references to the LinkedList being changed, not changes to the contents of the LinkedList.
Consider ConcurrentLinkedQueue or ConcurrentLinkedDeque.

LinkedList is not thread safe, so yes, multiple threads accessing it simultaneously will lead to problems. Synchronizing critical sections can solve this, but as you are still having problems you probably made a mistake somewhere. Try wrapping it in a Collections.synchronizedList() to synchronize all method calls.

LinkedList is not thread safe. You can use ConcurrentLinkedQueue if it fits your needs, which seems likely.
As the documentation says:
An unbounded thread-safe queue based on linked nodes. This queue
orders elements FIFO (first-in-first-out). The head of the queue is
that element that has been on the queue the longest time. The tail of
the queue is that element that has been on the queue the shortest
time. New elements are inserted at the tail of the queue, and the
queue retrieval operations obtain elements at the head of the queue. A
ConcurrentLinkedQueue is an appropriate choice when many threads will
share access to a common collection. This queue does not permit null
elements.

You increment your inProcessCount when executedTask == null which is obviously the opposite of what you want to do. So it’s no wonder that it will have inconsistent values.
But there are other issues as well. You call notifyAll() at several places but as long as there is no one calling wait() that has no use.
Note further that if you access an integer variable consistently from inside synchronized blocks only throughout the code, there is no need to make it an AtomicInteger. On the other hand, if you use it, e.g. because it will be accessed at other places without additional synchronization, you can move the code updating the AtomicInteger outside the synchronized block.
Also, a method which calls a method like getAsyncQueue() three times looks suspicious to a reader. Just call it once and remember the result in a local variable, then everyone can be confident that it is the same reference on all three uses. Generally, you have to ensure that all code is using the same list, hence the appropriate modifier for the variable holding it is final, not volatile.

Understanding collections concurrency and Collections.synchronized*

I learned yesterday that I've been incorrectly using collections with concurrency for many, many years.
Whenever I create a collection that needs to be accessed by more than one thread I wrap it in one of the Collections.synchronized* methods. Then, whenever mutating the collection I also wrap it in a synchronized block (I don't know why I was doing this, I must have thought I read it somewhere).
However, after reading the API more closely, it seems you need the synchronized block when iterating the collection. From the API docs (for Map):
It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:
And here's a small example:
List<O> list = Collections.synchronizedList(new ArrayList<O>());
...
synchronized(list) {
for(O o: list) { ... }
}
So, given this, I have two questions:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
More importantly, what is this accomplishing? By putting the iteration in a synchronized block you are preventing multiple threads from iterating at the same time. But another thread could mutate the list while iterating so how does the synchronized block help there? Wouldn't mutating the list somewhere else screw with the iteration whether it's synchronized or not? What am I missing?
Thanks for the help!

Why is this even necessary? The only explanation I can think of is
they're using a default iterator instead of a managed thread-safe
iterator, but they could have created a thread-safe iterator and fixed
this mess, right?
Iterating works with one element at a time. For the Iterator to be thread-safe, they'd need to make a copy of the collection. Failing that, any changes to the underlying Collection would affect how you iterate with unpredictable or undefined results.
More importantly, what is this accomplishing? By putting the iteration
in a synchronized block you are preventing multiple threads from
iterating at the same time. But another thread could mutate the list
while iterating so how does the synchronized block help there?
Wouldn't mutating the list somewhere else screw with the iteration
whether it's synchronized or not? What am I missing?
The methods of the object returned by synchronizedList(List) work by synchronizing on the instance. So no other thread could be adding/removing from the same List while you are inside a synchronized block on the List.

The basic case
All of the methods of the object returned by Collections.synchronizedList() are synchronized to the list object itself. Whenever a method is called from one thread, every other thread calling any method of it is blocked until the first call finishes.
So far so good.
Iterare necesse est (iterating is necessary)
But that doesn't stop another thread from modifying the collection between your calls to next() on its Iterator. If that happens, your code will most likely fail with a ConcurrentModificationException. If you do the iteration in a synchronized block too, and you synchronize on the same object (i.e. the list), other threads can no longer call any mutator methods on the list; they have to wait until your iterating thread releases the monitor for the list object. The key is that the mutator methods are synchronized on the same object as your iteration block; this is what stops them.
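A short sketch of that rule, using hypothetical names: a writer thread calls add() on a Collections.synchronizedList wrapper with no extra locking, while the calling thread iterates inside synchronized (list). If the synchronized block were removed, the loop could throw a ConcurrentModificationException, which this sketch simply lets propagate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SynchronizedListIteration {
    // Iterates repeatedly while a writer thread adds `additions` elements;
    // returns the final size of the list.
    static int run(int additions) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());

        // The wrapper locks the list object inside every add() call.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < additions; i++) {
                list.add(i);
            }
        });
        writer.start();

        long checksum = 0;
        do {
            synchronized (list) {          // same monitor the wrapper's methods use
                for (int value : list) {
                    checksum += value;
                }
            }
        } while (writer.isAlive());

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000)); // prints 10000
    }
}
```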
We're not out of the woods yet...
Note though that while the above guarantees basic integrity, it doesn't guarantee correct behaviour at all times. You might have other parts of your code that make assumptions which don't hold up in a multi-threaded environment:
List<Object> list = Collections.synchronizedList( ... );
...
if (!list.contains( "foo" )) {
// there's nothing stopping another thread from adding "foo" here itself, resulting in two copies existing in the list
list.add( "foo" );
}
...
synchronized( list ) { //this block guarantees that "foo" will only be added once
if (!list.contains( "foo" )) {
list.add( "foo" );
}
}
Thread-safe Iterator?
As for the question about a thread-safe iterator, there is indeed a list implementation with one: CopyOnWriteArrayList. It is incredibly useful but, as indicated in the API doc, it is limited to a handful of use cases, specifically when your list is modified only very rarely but iterated over so frequently (and by so many threads) that synchronizing iterations would cause a serious bottleneck. If you use it inappropriately, it can vastly degrade the performance of your application, as each and every modification of the list creates an entire new copy.
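To illustrate the snapshot behaviour of CopyOnWriteArrayList, here is a small single-threaded sketch with hypothetical names. It modifies the list in the middle of a for-each loop, which would fail on an ArrayList, and records what the iterator actually visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class CopyOnWriteDemo {
    // Iterates a copy-on-write list while appending to it; returns the visited elements.
    static List<Integer> iterateAndModify(List<Integer> initial) {
        CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>(initial);
        List<Integer> seen = new ArrayList<>();
        for (int value : list) {     // the iterator reads the array as it was at this point
            seen.add(value);
            list.add(value * 10);    // every add copies the whole backing array
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(iterateAndModify(Arrays.asList(1, 2, 3))); // prints [1, 2, 3]
    }
}
```

The elements added during the loop end up in the list but are never visited by the running iteration, and each add pays for a full array copy, which is the cost the answer warns about.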

Synchronizing on the returned list is necessary, because internal operations synchronize on a mutex, and that mutex is this, i.e. the synchronized collection itself.
Here's some relevant code from Collections, constructors for SynchronizedCollection, the root of the synchronized collection hierarchy.
SynchronizedCollection(Collection<E> c) {
if (c==null)
throw new NullPointerException();
this.c = c;
mutex = this;
}
(There is another constructor that takes a mutex, used to initialize synchronized "view" collections from methods such as subList.)
If you synchronize on the synchronized list itself, then that does prevent another thread from mutating the list while you're iterating over it.
The imperative that you synchronize on the synchronized collection itself exists because if you synchronize on anything else, then what you have imagined could happen: another thread could mutate the collection while you're iterating over it, because the objects locked are different.

Sotirios Delimanolis answered your second question "What is this accomplishing?" effectively. I wanted to amplify his answer to your first question:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
There are several ways to approach making a "thread-safe" iterator. As is typical with software systems, there are multiple possibilities, and they offer different tradeoffs in terms of performance (liveness) and consistency. Off the top of my head I see three possibilities.
1. Lockout + Fail-fast
This is what's suggested by the API docs. If you lock the synchronized wrapper object while iterating it (and the rest of the code in the system is written correctly, so that mutation method calls also all go through the synchronized wrapper object), the iteration is guaranteed to see a consistent view of the contents of the collection. Each element will be traversed exactly once. The downside, of course, is that other threads are prevented from modifying or even reading the collection while it's being iterated.
A variation of this would use a reader-writer lock to allow reads but not writes during iteration. However, the iteration itself can mutate the collection, so this would spoil consistency for readers. You'd have to write your own wrapper to do this.
The fail-fast comes into play if the lock isn't taken around the iteration and somebody else modifies the collection, or if the lock is taken and somebody violates the locking policy. In this case if the iteration detects that the collection has been mutated out from under it, it throws ConcurrentModificationException.
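The fail-fast check can be seen without any threads at all. The sketch below (hypothetical names) modifies an ArrayList from inside its own for-each loop; the iterator notices the structural change on its next step. With real threads the detection is only best effort, so this single-threaded case is just the most reliable way to observe it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class FailFastDemo {
    // Returns true if the iterator detected the modification made during iteration.
    static boolean modificationDetected() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (int value : list) {
                list.add(value * 10); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected()); // prints true
    }
}
```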
2. Copy-on-write
This is the strategy employed by CopyOnWriteArrayList among others. An iterator on such a collection does not require locking, it will always show consistent results during iteration, and it will never throw ConcurrentModificationException. However, writes will always copy the entire array, which can be expensive. Perhaps more importantly, the notion of consistency is altered. The contents of the collection might have changed while you were iterating it -- more precisely, while you were iterating a snapshot of its state some time in the past -- so any decisions you might make now are potentially out of date.
3. Weakly Consistent
This strategy is employed by ConcurrentLinkedDeque and similar collections. The specification contains the definition of weakly consistent. This approach also doesn't require any locking, and iteration will never throw ConcurrentModificationException. But the consistency properties are extremely weak. For example, you might attempt to copy the contents of a ConcurrentLinkedDeque by iterating over it and adding each element encountered to a newly created List. But other threads might be modifying the deque while you're iterating it. In particular, if a thread removes an element "behind" where you've already iterated, and then adds an element "ahead" of where you're iterating, the iteration will probably observe both the removed element and the added element. The copy will thus have a "snapshot" that never actually existed at any point in time. Ya gotta admit that's a pretty weak notion of consistency.
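The "remove behind, add ahead" scenario can be sketched in a single thread, with hypothetical names. While the iteration is at element 2, element 1 (already visited) is removed and element 4 is appended. The iteration never throws, but whether it reports the late addition is not something the weakly consistent contract promises, so the sketch makes no claim about it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

final class WeaklyConsistentDemo {
    // Iterates the deque while mutating it; returns the elements the iteration visited.
    static List<Integer> iterateWhileMutating(ConcurrentLinkedDeque<Integer> deque) {
        List<Integer> seen = new ArrayList<>();
        for (int value : deque) {
            seen.add(value);
            if (value == 2) {
                deque.removeFirst(); // removes 1, which the iteration already passed
                deque.addLast(4);    // may or may not be observed by this iteration
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        ConcurrentLinkedDeque<Integer> deque =
                new ConcurrentLinkedDeque<>(Arrays.asList(1, 2, 3));
        List<Integer> seen = iterateWhileMutating(deque);
        System.out.println("visited: " + seen);
        System.out.println("deque:   " + deque); // prints "deque:   [2, 3, 4]"
    }
}
```

If the visited list contains both 1 and 4, it describes a state the deque never had at any single moment, which is exactly the weak notion of consistency described above.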
The bottom line is that there's no simple notion of making an iterator thread safe that would "fix this mess". There are several different ways -- possibly more than I've explained here -- and they all involve differing tradeoffs. It's unlikely that any one policy will "do the right thing" in all circumstances for all programs.

Synchronizing in a for each loop still throws ConcurrentModificationExceptions

I'm trying to iterate through a loop on one thread, like so:
for (UnitTask task : chain) {
g.drawLine((int) task.getLocation().getX(), (int) task.getLocation().getY(), (int) currentPos.getX(), (int) currentPos.getY());
g.fillOval((int) task.getLocation().getX() - 2, (int) task.getLocation().getY() - 2, 5, 5);
currentPos = task.getLocation();
}
However, I have another thread (the Swing event thread) which can add to this object. Hence, ConcurrentModificationException. I tried obtaining a lock by surrounding the code with synchronized (chain) { ... }, but I still get the errors.
As a bit of a Java synchronization newbie, I'm a little confused as to why. I would expect this to make the loop thread-safe, but evidently, it is not.
Interestingly, chain is an instance of a custom class, but it is only a thin wrapper around a LinkedList. The list itself is private, and there's no way for an external class to retrieve it directly (there are methods to explicitly add/remove objects), so I wouldn't expect this to affect the outcome.

The meaning of
synchronized (c) {
... code that uses c ...
}
is
wait for c to be unlocked
lock c
execute the body
unlock c
So if you synchronize in your thread, then your thread will wait for c to be unlocked and then dive in.
Now, if you do not synchronize the code on the other thread that modifies c, that code is going to just go ahead and modify c without waiting for a lock. Synchronizing a block in one thread does not make another thread wait for a lock. If the other thread has a line such as
c.add(someOtherTask)
that is not in a synchronized block, then it's going to do the add no matter what. This is the cause of your exception. It is also the reason why you saw the exception even though you put the code in your thread in a synchronized block: your code was "playing by the rules" but the other thread couldn't have cared less.
Be careful about synchronizing long-running code though. You are better off, as Stephen C says, to use a concurrent collection type.
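As a sketch of what "both sides play by the rules" could look like for a thin wrapper like the one in the question: TaskChain below is a hypothetical stand-in, not the asker's class. Its add() and forEach() are both synchronized on the wrapper instance, so a caller using synchronized (chain) { ... } would also be locking the same monitor as add().

```java
import java.util.LinkedList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the question's thin wrapper around a LinkedList.
final class TaskChain<T> {
    private final List<T> tasks = new LinkedList<>();

    // Writers and readers all lock the same monitor: this wrapper instance.
    synchronized void add(T task) {
        tasks.add(task);
    }

    synchronized int size() {
        return tasks.size();
    }

    // Holds the lock for the whole traversal, so add() has to wait until it ends.
    synchronized void forEach(Consumer<? super T> action) {
        for (T task : tasks) {
            action.accept(task);
        }
    }

    // One thread adds while the caller iterates; returns the final size.
    static int run(int additions) {
        TaskChain<Integer> chain = new TaskChain<>();
        Thread adder = new Thread(() -> {
            for (int i = 0; i < additions; i++) {
                chain.add(i);
            }
        });
        adder.start();

        long[] visited = {0};
        do {
            chain.forEach(task -> visited[0]++);
        } while (adder.isAlive());

        try {
            adder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return chain.size();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000)); // prints 10000
    }
}
```

One caveat with this design: the action passed to forEach() must not call add() on the same chain, because the lock is reentrant and the list would then be modified in the middle of its own iteration.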

Synchronization will not necessarily help.
Basically the problem is that you are using a collection type that does not allow the collection to be modified while an iteration is in progress (except via the iterator's remove method ... if supported). This is not a threading / synchronization issue per se. (And if you try to solve it simply by synchronization, you may introduce another problem.)
If you want to be able to iterate and modify at the same time, you will need to use a different collection type such as ConcurrentLinkedDeque instead of LinkedList.
If the iteration and writing are happening on separate threads, then shouldn't synchronizing block the writing until the iteration is finished? Or am I missing something?
The problem will be in how you have implemented the synchronization:
If you are not explicitly doing some kind of synchronization in your LinkedList version, then no synchronization is done for you.
If you use a synchronization wrapper created by one of the Collections.synchronizedXxx methods, then the javadocs for those methods clearly state that an Iterator object returned by the wrapper's iterator() method IS NOT synchronized.
If you are doing the synchronization by hand, then you have to make sure that everything is synchronizing on the same mutex. And the lock on that mutex has to be held for the duration of the iteration ... not just for the call to iterator().
And note that if you hold a lock for a long time (e.g. while you are iterating a long list), this can potentially block other threads that need to update the list for a long time. That kind of thing can be a concurrency bottleneck that can (in the worst case) reduce your system's performance to the speed of a single processor.
The ConcurrentXxx classes typically avoid this by relaxing the consistency guarantees for the sequences produced by the iterators. For instance, you may not see elements that were added to the collection after you started the iteration.
