I used concurrent hashmap for creating a matrix. It indices ranges to 100k. I have created 40 threads. Each of the thread access those elements of matrices and modifies to that and write it back of the matrix as:
ConcurrentHashMap<Integer, ArrayList<Double>> matrix =
new ConcurrentHashMap<Integer, ArrayList<Double>>(25);
for (Entry(Integer,ArrayList<Double>)) entry: matrix.entrySet())
upDateEntriesOfValue(entry.getValue());
I did not found it thread safe. Values are frequently returned as null and my program is getting crashed. Is there any other way to make it thread safe.Or this is thread safe and i have bug in some other places. One thing is my program does not crash in single threaded mode.
The iterator is indeed thread-safe for the ConcurrentHashMap.
But what is not thread-safe in your code is the ArrayList<Double> you seem to update! Your problems might come from this data structure.
You may want to use a concurrent data structure adapted to you needs.
Using a map for a matrix is really inefficient, and the way you have used it, it won't even support sparse arrays particularly well.
I suggest you use a double[][] where you lock each row (or column if that is better) If the matrix is small enough you may be better of using only one CPU as this can save you quite a bit of overhead.
I would suggest you create no more threads than you have cores. For CPU intensive tasks, using more thread can be slower, not faster.
Matrix is 100k*50 at max
EDIT: Depending on the operation you are performing, I would try to ensure you have the shorter dimension first so you can process each long dimension in a different thread efficiently.
e.g
double[][] matrix = new double[50][100*1000];
for(int i=0;i<matrix.length;i++) {
final double[] line = matrix[i];
executorService.submit(new Runnable() {
public void run() {
synchronized(line) {
processOneLine(line);
}
}
});
}
This allows all you thread to run concurrently because they don't share any data structures. They can also access each double efficiently because they are continuous in memory and stored as efficiently as possible. i.e. 100K doubles uses about 800KB, but List<Double> uses 2800KB and each value can be randomly arranged in memory which means your cache has to work much harder.
thanks but in fact i have 80 cores in total
To uses 80 core efficiently you might want to break the longer lines in two or four so you can keep all the cores busy, or find a way to perform more than one operation at a time.
TheConcurrentHashMap will be thread safe for accesses into the Map, but the Lists served out need to be thread-safe, if multiple threads can operate on the same List instances concurrently so use a thread-safe list while modifying.
In your case working on ConcurrentHashMap is tread-safe but when thread goes to ArrayList this is not synchronized and hence multiple threads can access it simultaneously which makes it non thread-safe. either you can use synchronized block where you are performing modification in list
Related
I was just looking for the answer for the question why ArrayList is faster than Vector and i found ArrayList is faster as it is not synchronized.
so my doubt is:
If ArrayList is not synchronized why would we use it in multithreaded environment and compare it with Vector.
If we are in a single threaded environment then how the performance of the Vector decreases as there is no Synchronization going on as we are dealing with a single thread.
Why should we compare the performance considering the above points ?
Please guide me :)
a) Methods using ArrayList in a multithreaded program may be synchronized.
class X {
List l = new ArrayList();
synchronized void add(Object e) {
l.add(e);
}
...
b) We can use ArrayList without exposing it to other threads, this is when ArrayList is referenced only from local variables
void x() {
List l = new ArrayList(); // no other thread except current can access l
...
Even in a single threaded environment entering a synchronized method takes a lock, this is where we lose performance
public synchronized boolean add(E e) { // current thread will take a lock here
modCount++;
...
You can use ArrayList in a multithread environment if the list is not shared between threads.
If the list is shared between threads you can synchronize the access to that list.
Otherwise you can use Collections.synchronizedList() to get a List that can be used thread safely.
Vector is an old implementation of a synchronized List that is no longer used because the internal implementation basically synchronize every method. Generally you want to synchronize a sequence of operations. Otherwyse you can throw a ConcurrentModificationException when iterating the list another thread modify it. In addition synchronize every method is not good from a performance point of view.
In addition also in a single thread environment accessing a synchronized method needs to perform some operations, so also in a single thread application Vector is not a good solution.
Just because a component is single threaded doesn't mean that it cannot be used in a thread safe context. Your application may have it's own locking in which case additional locking is redundant work.
Conversely, just because a component is thread safe, it doesn't mean that you cannot use it in an unsafe manner. Typically thread safety extends to a single operation. E.g. if you take an Iterator and call next() on a collection this is two operations and they are no longer thread safe when used in combination. You still have to use locking for Vector. Another simple example is
private Vector<Integer> vec =
vec.add(1);
int n = vec.remove(vec.size());
assert n == 1;
This is atleast three operations however the number of things which can go wrong are much more than you might suppose. This is why you end up doing your own locking and why the locking inside Vector might be redundant, even unwanted.
For you own interest;
vec can change at any point t another Vector or null
vec.add(2) can happen between any operation, changing the size and the last element.
vec.remove() can happen between any operation.
vec.add(null) can happen between any operation resulting in a possible NullPointerException
The vec can /* change */ in these places.
private Vector<Integer> vec =
vec.add(1); /* change*/
int n = vec.remove(vec.size() /* change*/);
assert n == 1;
In short, assuming that just because you used a thread safe collection your code is now thread safe is a big assumption.
A common pattern which breaks is
for(int n : vec) {
// do something.
}
Look harmless enough except
for(Iterator iter = vec.iterator(); /* change */ vec.hasNext(); ) {
/* change */ int n = vec.next();
I have marked with /* change */ where another thread could change the collection meaning this loop can get a ConcurrentModificationException (but might not)
there is no Synchronization
The JVM doesn't know there is no need for synchronization and so it still has to do something. It has an optimisation to reduce the cost of uncontended locks, but it still has to do work.
You need to understand the basic concept to know answer for your above questions...
When you say array list is not syncronized and vector is, we mean that the methods in those classes (like add(), get(), remove() etc...) are synchronized in vector class and not in array list class. These methods will act upon tha data being stored .
So, the data saved in vector class cannot be edited / read parallely as add, get, remove metods are synchornized and the same in array list can be done parallely as these methods in array list are not synchronized...
This parallel activity makes array list fast and vector slow... This behavior remains same though you use them in either multithreaded (or) single threaded enviornment...
Hope this answers your question...
I have the following code ,
import java.util.Arrays;
public class ParellelStream {
public static void main(String args[]){
Double dbl[] = new Double[1000000];
for(int i=0; i<dbl.length;i++){
dbl[i]=Math.random();
}
long start = System.currentTimeMillis();
Arrays.parallelSort(dbl);
System.out.println("time taken :"+((System.currentTimeMillis())-start));
}
}
When I run this code it takes time approx 700 to 800 ms, but when I replace the line Arrays.parallelSort to Arrays.sort it takes 500 to 600 ms. I read about the Arrays.parallelSort and Arrays.sort method which says that Arrays.parellelSort gives poor performance when dataset are small but here I am using array of 1000000 elements. what could be the reason for parallelSort poor performance ?? I am using java8.
The parallelSort function will use a thread for each cpu core you have on your machine. Specifically parallelSort runs tasks on the ForkJoin common thread pool. If you only have one core you would not see an improvement over single threaded sort.
If you only have multiple cores you are going to have some upfront cost associated with creating the new threads which will mean that for relatively small arrays you are not going to see linear performance gains.
The compare function for comparing doubles is not an expensive function. I think that in this case 1000000 elements can be safely considered small and the benefits of using multiple threads is outweighed by the upfront costs of creating those threads. Since the upfront costs will be fixed you should see a performance gain with larger arrays.
I read about the Arrays.parallelSort and Arrays.sort method which says
that Arrays.parellelSort gives poor performance when dataset are small
but here I am using array of 1000000 elements.
This is not the only thing to take in consideration. It depends a lot on your machine (how your CPU handle multi-threading etc).
Here a quote from the Parallelism part of The Java Tutorials
Note that parallelism is not automatically faster than performing
operations serially, although it can be if you have enough data and
processor cores [...] it is still your responsibility to determine if
your application is suitable for parallelism.
You might also want to have a look at the code of java.util.ArraysParallelSortHelpers for a better understanding of the algorithm.
Note that the parallelSort method use the ForkJoinPool introduced in Java 7 to take advantages of each processors of your computer as stated in the javadoc :
A ForkJoinPool is constructed with a given target parallelism level;
by default, equal to the number of available processors.
Note that if the length of the array is less then 1 << 13, the array will be sorted using the appropriate Arrays.sort method.
See also
Fork/Join
I am deciding what is the best way to achieve high performance gain while achieving thread safety (synchronization) for required point.
Consider the following case. There are two entry point in system and I want to make sure there is no two or more threads updates cashAccounts and itemStore at same time. So I created a Object call Lock and use it as follows.
public class ForwardPath {
public void fdWay(){
synchronized (Lock.class){
//here I am updating both cashAccount object and
//itemStore object
}
}
}
.
public class BackWardPath {
public void bwdWay(){
synchronized (Lock.class){
//here I am updating both cashAccount object and
//itemStore object
}
}
}
But this implementation will greatly decrease performance, If both ForwardPath and BackWardPath are triggered frequently.
But in this case it is some what difficult to lock only cashAccount and itemStore because both these objects get updates several times inside both paths.
Is there a good way to achieve both performance gain and thread safety in this scenario ?
The example is far too abstract, and the little you describe leaves no alternative to synchronization in the methods.
To obtain high scalability (thats not necessarily highest performance in all situations, mind you), work is usually subdivided into units of work that are completely independent of each other (these they can be processed without any synchronization).
Lets assume a simple example, summing up numbers (purely to demonstrate the principle):
The naive solution would be to have one accumulator for the sum, and walk the numbers adding to the accumulator. Obviously, if you wanted to use multiple threads, the accumulator would need to be synchronized and become the major point of contention).
To eliminate the contention, you can partition the numbers into multiple slices - separate units of work. Each unit of work can be summed independently (one thread per unit of work, for example). To get the final sum, add up the partial sums of each unit of work. The only point where synchronization is now needed is when combining the partial results. If you had for example 10 billion numbers, and divide them into 10 units of work, you need only synchronized 10 times - instead of 10 billion times in the naive solution.
The principle is always the same here: Make sure you can do as much work as possible without synchronization, then combine the partial results to obtain the final result. Thinking on the individual operation level is to fine a granularity to lend itself well to multi threading.
Performance-gain by using Threads is an architectural question, just adding some Threads and synchronized won't do the trick and usually just screws up your code while not working any faster than before. Therefore your code example is not enough to help you on the actual problem you seem to be facing, as each threaded solution is unique to your actual code.
Here I would focus on custom application where I got degradation (no need for general discussion about fastness of threads against processes).
I've got MPI application on Java which solve some problem using iteration method. The schematic view to application bellow lets call it MyProcess(n), where "n" is the number of processes:
double[] myArray = new double[M*K];
for(int iter = 0;iter<iterationCount;++iter)
{
//some communication between processes
//main loop
for(M)
for(K)
{
//linear sequence of arithmetical instructions
}
//some communication between processes
}
To improve performance I've decided to use Java threads (lets call it MyThreads(n)). The code is almost the same – myArray becomes matrix, where each row contains array for appropriate thread.
double[][] myArray = new double[threadNumber][M*K];
public void run()
{
for(int iter = 0;iter<iterationCount;++iter)
{
//some synchronization primitives
//main loop
for(M)
for(K)
{
//linear sequence of arithmetical instructions
counter++;
}
// some synchronization primitives
}
}
Threads created and started using Executors.newFixedThreadPool(threadNumber).
The problem is that while for MyProcess(n) we got adequate performance(n in [1,8]), in case of MyThreads(n) performance degrades essentially(on my system by factor of n).
Hardware: Intel(R) Xeon(R) CPU X5355(2 processors, 4 cores on each)
Java version: 1.5(using d32 option).
At first I thought that got different workloads on threads, but no, variable “counter” shows, that number of iterations between different run of MyThreads(n) (n in [1,8]) are identical.
And it isn’t synchronization fault, because I have temporary comment all synchronization primitives.
Any suggestions/ideas would be appreciated.
Thanks.
There are 2 issues I see in your piece of code.
Firstly caching problem. Since you try to do this in multi thread/process I'd assume your M * K results in a large number; then when you do
double[][] myArray = new double[threadNumber][M*K];
You are essentially creating an array of double pointer with size threadNumber; each pointing to a double array of size M*K. The interesting point here is that the threadNumber count of arrays are not necessarily allocated onto the same block of memory. They are just double pointers which can be allocated anywhere inside JVM heap. As a result, when multiple threads run, you might encounter a lot of cache miss and you end up reading memory many times, eventually slow down your program.
If the above is the root cause, you can try enlarge your JVM heap size, and then do
double[] myArray = new double[threadNumber * M * K];
And have the threads operating on different segment of the same array. You should be able to see performance better.
Secondly synchronization issue. Note that double (or any primitive) array is NOT volatile. Thus your result on 1 thread isn't guaranteed to be visible to other threads. If you are using synchronization block this resolves the issue, as a side effect of synchronization is make sure visibility across threads; If not, when you are reading and writing the array, please always make sure you use Unsafe.putXXXVolatile() and Unsafe.getXXXVolatile() so that you can do volatile operations on arrays.
To take this further, Unsafe can also be used to create a continuous segment of memory which you can used to hold your data structure and achieve better performance. In your case I think 1) already do the trick.
I have this situation:
web application with cca 200 concurent requests (Threads) are in need to log something to local filesystem. I have one class to which all threads are placing their calls, and that class internally stores messages to one Array (Vector or ArrayList) which then in turn will be written to filesystem.
Idea is to return from thread's call ASAP so thread can do it's job as fast as possible, what thread wanted to log can be written to filesystem later, it is not so crucial.
So, that class in turn removes first element from that list and writes it to filesystem, while in real time there is 10 or 20 threads which are appending new logs at the end of that list.
I would like to use ArrayList since it is not synchronized and therefore thread's calls will last less, question is:
am I risking deadlocks / data loss? Is it better to use Vector since it is thread safe? Is it slower to use Vector?
Actually both ArrayList and Vector are very bad choices here, not because of synchronization (which you would definitely need), but because removing the first element is O(n).
The perfect data structure for your purspose is the ConcurrentLinkedQueue: it offers both thread safety (without using synchronization), and O(1) adding and removing.
Are you limitted to particular (old) java version? It not please consider using java.util.concurrent.LinkedBlockingQueue for this kind of stuff. It's really worth looking at java.util.concurrent.* package when dealing with concurrency.
Vector is worse than useless. Don't use it even when using multithreading. A trivial example of why it's bad is to consider two threads simultaneously iterating and removing elements on the list at the same time. The methods size(), get(), remove() might all be synchronized but the iteration loop is not atomic so - kaboom. One thread is bound to try removing something which is not there, or skip elements because the size() changes.
Instead use synchronized() blocks where you expect two threads to access the same data.
private ArrayList myList;
void removeElement(Object e)
{
synchronized (myList) {
myList.remove(e);
}
}
Java 5 provides explicit Lock objects which allow more finegrained control, such as being able to attempt to timeout if a resource is not available in some time period.
private final Lock lock = new ReentrantLock();
private ArrayList myList;
void removeElement(Object e) {
{
if (!lock.tryLock(1, TimeUnit.SECONDS)) {
// Timeout
throw new SomeException();
}
try {
myList.remove(e);
}
finally {
lock.unlock();
}
}
There actually is a marginal difference in performance between a sychronizedlist and a vector. (http://www.javacodegeeks.com/2010/08/java-best-practices-vector-arraylist.html)