ArrayList vs Vector performance in single-threaded application - java

I was just looking for the answer for the question why ArrayList is faster than Vector and i found ArrayList is faster as it is not synchronized.
so my doubt is:
If ArrayList is not synchronized why would we use it in multithreaded environment and compare it with Vector.
If we are in a single threaded environment then how the performance of the Vector decreases as there is no Synchronization going on as we are dealing with a single thread.
Why should we compare the performance considering the above points ?
Please guide me :)

a) Methods using ArrayList in a multithreaded program may be synchronized.
class X {
List l = new ArrayList();
synchronized void add(Object e) {
l.add(e);
}
...
b) We can use ArrayList without exposing it to other threads, this is when ArrayList is referenced only from local variables
void x() {
List l = new ArrayList(); // no other thread except current can access l
...
Even in a single threaded environment entering a synchronized method takes a lock, this is where we lose performance
public synchronized boolean add(E e) { // current thread will take a lock here
modCount++;
...

You can use ArrayList in a multithread environment if the list is not shared between threads.
If the list is shared between threads you can synchronize the access to that list.
Otherwise you can use Collections.synchronizedList() to get a List that can be used thread safely.
Vector is an old implementation of a synchronized List that is no longer used because the internal implementation basically synchronize every method. Generally you want to synchronize a sequence of operations. Otherwyse you can throw a ConcurrentModificationException when iterating the list another thread modify it. In addition synchronize every method is not good from a performance point of view.
In addition also in a single thread environment accessing a synchronized method needs to perform some operations, so also in a single thread application Vector is not a good solution.

Just because a component is single threaded doesn't mean that it cannot be used in a thread safe context. Your application may have it's own locking in which case additional locking is redundant work.
Conversely, just because a component is thread safe, it doesn't mean that you cannot use it in an unsafe manner. Typically thread safety extends to a single operation. E.g. if you take an Iterator and call next() on a collection this is two operations and they are no longer thread safe when used in combination. You still have to use locking for Vector. Another simple example is
private Vector<Integer> vec =
vec.add(1);
int n = vec.remove(vec.size());
assert n == 1;
This is atleast three operations however the number of things which can go wrong are much more than you might suppose. This is why you end up doing your own locking and why the locking inside Vector might be redundant, even unwanted.
For you own interest;
vec can change at any point t another Vector or null
vec.add(2) can happen between any operation, changing the size and the last element.
vec.remove() can happen between any operation.
vec.add(null) can happen between any operation resulting in a possible NullPointerException
The vec can /* change */ in these places.
private Vector<Integer> vec =
vec.add(1); /* change*/
int n = vec.remove(vec.size() /* change*/);
assert n == 1;
In short, assuming that just because you used a thread safe collection your code is now thread safe is a big assumption.
A common pattern which breaks is
for(int n : vec) {
// do something.
}
Look harmless enough except
for(Iterator iter = vec.iterator(); /* change */ vec.hasNext(); ) {
/* change */ int n = vec.next();
I have marked with /* change */ where another thread could change the collection meaning this loop can get a ConcurrentModificationException (but might not)
there is no Synchronization
The JVM doesn't know there is no need for synchronization and so it still has to do something. It has an optimisation to reduce the cost of uncontended locks, but it still has to do work.

You need to understand the basic concept to know answer for your above questions...
When you say array list is not syncronized and vector is, we mean that the methods in those classes (like add(), get(), remove() etc...) are synchronized in vector class and not in array list class. These methods will act upon tha data being stored .
So, the data saved in vector class cannot be edited / read parallely as add, get, remove metods are synchornized and the same in array list can be done parallely as these methods in array list are not synchronized...
This parallel activity makes array list fast and vector slow... This behavior remains same though you use them in either multithreaded (or) single threaded enviornment...
Hope this answers your question...

Related

Java ConcurrentModificationException problems with classes

I am essentially doing the following:
Creating an object (for example a weapon object), that automatically adds that object to a list off all of those types of objects (ArrayList<Weapons>).
JPanel paints every 10 seconds with an updater thread that iterates through the ArrayList<Weapons>. I am also sending 'questions' to a server on another machine, i.e. asking if that weapon is allowed. If it is not, the weapon object is modified by the client computer. However, whenever I modify it, I receive a ConcurrentModificationException. Instead of crashing, which I actually hope it would do at this point, since the method that changes the weapon object is on a different thread, the whole program just locks up.
I have more than 1000 lines of code in this program, and more than three threads running that access the list, so if you need any code please ask but I'd rather not post right now because in my mind this seems like a trivial question for an expert at threads.
Thanks!
(Object is made >> added to list of objects >> JPanel's "Updater" thread is constantly painting all objects every 10 ticks...
Server says that object isn't allowed >> A thread on the client computer removes that object (or toggles a boolean that says it is not visible) >> ConcurrentModificationException).
Quoting the Javadoc of ArrayList
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
You describe multiple threads accessing the list, and at least one of them modifying it. So, all accesses to the list must be done in mutually synchronized blocks.
e.g. to iterate the list:
synchronized (list) {
for (Weapons weapons : list) {
// ...
}
}
e.g. to remove an item from the list:
synchronized (list) {
list.remove(0);
}
I think you can use a synchronised version of collection or list, i.e. java.util.Collections.synchronizedCollection(Collection<>), java.util.Collections.synchronizedList(List<>). Also, if you iterate over the list and remove items, ensure that use an implementation that enable remove item from the iterator. A good candidate is java.util.ArrayList.
Other technique is use "monitors", you can declare an attribute like: private static Object monitor = new Object(); then when the code tries to access to the list, protect the code inside a synchronized(monitor) block. Using this technique you ensure that no other thread will be able to modify your list until no protected code is running.
Please, excuse my English :).

Synchronizing elements in an array

I am new to multi-threading in Java and don't quite understand what's going on.
From online tutorials and lecture notes, I know that the synchronized block, which must be applied to a non-null object, ensures that only one thread can execute that block of code. Since an array is an object in Java, synchronize can be applied to it. Further, if the array stores objects, I should be able to synchronize each element of the array too.
My program has several threads updated an array of numbers, hence I created an array of Long objects:
synchronized (grid[arrayIndex]){
grid[arrayIndex] += a.getNumber();
}
This code sits inside the run() method of the thread class which I have extended. The array, grid, is shared by all of my threads. However, this does not return the correct results while running the same program on one thread does.
This will not work. It is important to realize that grid[arrayIndex] += ... is actually replacing the element in the grid with a new object. This means that you are synchronizing on an object in the array and then immediately replacing the object with another in the array. This will cause other threads to lock on a different object so they won't block. You must lock on a constant object.
You can instead lock on the entire array object, if it is never replaced with another array object:
synchronized (grid) {
// this changes the object to another Long so can't be used to lock
grid[arrayIndex] += a.getNumber();
}
This is one of the reasons why it is a good pattern to lock on a final object. See this answer with more details:
Why is it not a good practice to synchronize on Boolean?
Another option would be to use an array of AtomicLong objects, and use their addAndGet() or getAndAdd() method. You wouldn't need synchronization to increment your objects, and multiple objects could be incremented concurrently.
The java class Long is immutable, you cannot change its value. So when you perform an action:
grid[arrayIndex] += a.getNumber();
it is not changing the value of grid[arrayIndex], which you are locking on, but is actually creating a new Long object and setting its value to the old value plus a.getNumber. So you will end up with different threads synchronizing on different objects, which leads to the results you are seeing
The synchronized block you have here is no good. When you synchronize on the array element, which is presumably a number, you're synchronizing only on that object. When you reassign the element of the array to a different object than the one you started with, the synchronization is no longer on the correct object and other threads will be able to access that index.
One of these two options would be more correct:
private final int[] grid = new int[10];
synchronized (grid) {
grid[arrayIndex] += a.getNumber();
}
If grid can't be final:
private final Object MUTEX = new Object();
synchronized (MUTEX) {
grid[arrayIndex] += a.getNumber();
}
If you use the second option and grid is not final, any assignment to grid should also be synchronized.
synchronized (MUTEX) {
grid = new int[20];
}
Always synchronize on something final, always synchronize on both access and modification, and once you have that down, you can start looking into other locking mechanisms, such as Lock, ReadWriteLock, and Semaphore. These can provide more complex locking mechanisms than synchronization that is better for scenarios where Java's default synchronization alone isn't enough, such as locking data in a high-throughput system (read/write locking) or locking in resource pools (counting semaphores).

what to use in multithreaded environment; Vector or ArrayList

I have this situation:
web application with cca 200 concurent requests (Threads) are in need to log something to local filesystem. I have one class to which all threads are placing their calls, and that class internally stores messages to one Array (Vector or ArrayList) which then in turn will be written to filesystem.
Idea is to return from thread's call ASAP so thread can do it's job as fast as possible, what thread wanted to log can be written to filesystem later, it is not so crucial.
So, that class in turn removes first element from that list and writes it to filesystem, while in real time there is 10 or 20 threads which are appending new logs at the end of that list.
I would like to use ArrayList since it is not synchronized and therefore thread's calls will last less, question is:
am I risking deadlocks / data loss? Is it better to use Vector since it is thread safe? Is it slower to use Vector?
Actually both ArrayList and Vector are very bad choices here, not because of synchronization (which you would definitely need), but because removing the first element is O(n).
The perfect data structure for your purspose is the ConcurrentLinkedQueue: it offers both thread safety (without using synchronization), and O(1) adding and removing.
Are you limitted to particular (old) java version? It not please consider using java.util.concurrent.LinkedBlockingQueue for this kind of stuff. It's really worth looking at java.util.concurrent.* package when dealing with concurrency.
Vector is worse than useless. Don't use it even when using multithreading. A trivial example of why it's bad is to consider two threads simultaneously iterating and removing elements on the list at the same time. The methods size(), get(), remove() might all be synchronized but the iteration loop is not atomic so - kaboom. One thread is bound to try removing something which is not there, or skip elements because the size() changes.
Instead use synchronized() blocks where you expect two threads to access the same data.
private ArrayList myList;
void removeElement(Object e)
{
synchronized (myList) {
myList.remove(e);
}
}
Java 5 provides explicit Lock objects which allow more finegrained control, such as being able to attempt to timeout if a resource is not available in some time period.
private final Lock lock = new ReentrantLock();
private ArrayList myList;
void removeElement(Object e) {
{
if (!lock.tryLock(1, TimeUnit.SECONDS)) {
// Timeout
throw new SomeException();
}
try {
myList.remove(e);
}
finally {
lock.unlock();
}
}
There actually is a marginal difference in performance between a sychronizedlist and a vector. (http://www.javacodegeeks.com/2010/08/java-best-practices-vector-arraylist.html)

How can CopyOnWriteArrayList be thread-safe?

I've taken a look into OpenJDK source code of CopyOnWriteArrayList and it seems that all write operations are protected by the same lock and read operations are not protected at all. As I understand, under JMM all accesses to a variable (both read and write) should be protected by lock or reordering effects may occur.
For example, set(int, E) method contains these lines (under lock):
/* 1 */ int len = elements.length;
/* 2 */ Object[] newElements = Arrays.copyOf(elements, len);
/* 3 */ newElements[index] = element;
/* 4 */ setArray(newElements);
The get(int) method, on the other hand, only does return get(getArray(), index);.
In my understanding of JMM, this means that get may observe the array in an inconsistent state if statements 1-4 are reordered like 1-2(new)-4-2(copyOf)-3.
Do I understand JMM incorrectly or is there any other explanations on why CopyOnWriteArrayList is thread-safe?
If you look at the underlying array reference you'll see it's marked as volatile. When a write operation occurs (such as in the above extract) this volatile reference is only updated in the final statement via setArray. Up until this point any read operations will return elements from the old copy of the array.
The important point is that the array update is an atomic operation and hence reads will always see the array in a consistent state.
The advantage of only taking out a lock for write operations is improved throughput for reads: This is because write operations for a CopyOnWriteArrayList can potentially be very slow as they involve copying the entire list.
Getting the array reference is an atomic operation. So, readers will either see the old array or the new array - either way the state is consistent. (set(int,E) computes the new array contents before setting the reference, so the array is consistent when the asignment is made.)
The array reference itself is marked as volatile so that readers do not need to use a lock to see changes to the referenced array. (EDIT: Also, volatile guarantees that the assignment is not re-ordered, which would lead to the assignment being done when the array is possibly in an inconsistent state.)
The write lock is required to prevent concurrent modification, which may result the array holding inconsistent data or changes being lost.
So according to Java 1.8, following are the declarations of array and lock in CopyOnWriteArrayList.
/** The array, accessed only via getArray/setArray. */
private transient volatile Object[] array;
/** The lock protecting all mutators */
final transient ReentrantLock lock = new ReentrantLock();
Following is definition of add method of CopyOnWriteArrayList
public boolean add(E e) {
final ReentrantLock lock = this.lock;
lock.lock();
try {
Object[] elements = getArray();
int len = elements.length;
Object[] newElements = Arrays.copyOf(elements, len + 1);
newElements[len] = e;
setArray(newElements);
return true;
} finally {
lock.unlock();
}
}
As #Adamski has already mentioned array is volatile and only updated via the setArray method . After that, if all the read only calls are made, and so they would be getting the updated value and hence array is always consistent here.
CopyOnWriteArrayList is a concurrent Collection class introduced in Java 5 Concurrency API along with its popular cousin ConcurrentHashMap in Java.
CopyOnWriteArrayList implements List interface like ArrayList, Vector and LinkedList but its a thread-safe collection and it achieves its thread-safety in a slightly different way than Vector or other thread-safe collection class.
As name suggest CopyOnWriteArrayList creates copy of underlying
ArrayList with every mutation operation e.g. add or set. Normally
CopyOnWriteArrayList is very expensive because it involves costly
Array copy with every write operation but its very efficient if you
have a List where Iteration outnumber mutation e.g. you mostly need to
iterate the ArrayList and don't modify it too often.
Iterator of CopyOnWriteArrayList is fail-safe and doesn't throw
ConcurrentModificationException even if underlying
CopyOnWriteArrayList is modified once Iteration begins because
Iterator is operating on separate copy of ArrayList. Consequently all
the updates made on CopyOnWriteArrayList is not available to Iterator.
To get the most updated version do a new read like list.iterator();
That being said, updating this collection alot will kill performance. If you tried to sort a CopyOnWriteArrayList you'll see the list throws an UnsupportedOperationException (the sort invokes set on the collection N times). You should only use this read when you are doing upwards of 90+% reads.

java Vector and thread safety

I'm wondering if this code will do any trouble:
I have a vector that is shared among many threads. Every time a thread has to add/remove stuff from the vector I do it under a synchronized block. However, the main thread has a call:
System.out.println("the vector's size: "+ vec.size());
which isn't synchronized.
Should this cause trouble?
All Vector methods are synchronized themselves, so as long as you are only synchronizing around a single method, your own synchronization is not necessary. If you have several method calls, which depend on each other, e.g. something like vec.get(vec.size()-2) to get the second last element, you have to use your own synchronization since otherwise, the vector may change between vec.size() and vec.get().
I assume you are referring to java.util.Vector.
Actually Vector.size() is synchronized and will return a value consistent with the vector's state (when the thread calling size() enters the monitor.) If it returns 42, then at some point in time the vector contained exactly 42 elements.
If you're adding items in a loop in another thread then you cannot predict the exact size, but it should be fine for monitoring purposes.
Each of the methods of java.util.Vector is synchronized, so this won't cause any problems for something that is just logging the size.
To improve performance you may be better off replacing your Vector with an ArrayList. The methods of ArrayList aren't synchronized, so you would need to synchronize all access yourself.
Mind that you can always obtain a synchronized version of a collection by using Collections.synchronizedCollection(Collection<T> c) static method..

Categories