Multi-thread MergeSort

I would like to implement a merge sort with multithreading.
Here is my code:
public class MergeSort<E extends Comparable<T>> implements Runnable {
public void run() {
mergeSort(array);
}
public synchronized void mergeSort(List<E> array) {
int size = array.size();
if (size > 1){
int mid = size / 2;
List<T> l = array.subList(0,mid);
List<T> r = array.subList(mid,vec.size());
Thread t = new Thread(new MergeSort<E>(left));
Thread t2 = new Thread(new MergeSort<E>(right));
t.start();
t2.start();
merge(l, r, array);
}
}
I would like my MergeSort to run, create two new threads, and then call merge to finish its job.
I tried it without threads, just by calling mergeSort(left)... It worked, so my algorithm is correct, but when I try with threads, the List is not sorted.
So, how do I synchronize the threads? I know there will be too many threads, but I just want to know how to synchronize them to sort the list.

I can't tell exactly because some of the code is missing, but it does look like you're calling mergesort with "left" twice.

A couple of things to keep in mind:
Just because you have created and started a thread, don't assume that it starts running instantaneously, let alone that it has finished by the next statement.
You are passing left as the parameter to both of your threads instead of l and r.
To make it work, each pair of threads has to finish its task first: wait for both threads (for example with join()), merge their results, and only then proceed with the next two halves.
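The points above can be sketched as follows. This is a minimal sketch, not the asker's original class: the name ParallelMergeSortSketch, the THRESHOLD cutoff, and the choice to return a new sorted list instead of sorting in place are all assumptions made for illustration. The key step is that both worker threads are joined before merge is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class; a sketch, not the asker's original MergeSort.
class ParallelMergeSortSketch {

    // Assumed cutoff: smaller lists are sorted in the calling thread
    // so that the number of threads stays bounded.
    private static final int THRESHOLD = 1_000;

    static <E extends Comparable<E>> List<E> sort(List<E> input) {
        int size = input.size();
        if (size <= 1) {
            return new ArrayList<>(input);
        }
        int mid = size / 2;
        List<E> l = input.subList(0, mid);
        List<E> r = input.subList(mid, size);

        if (size < THRESHOLD) {
            return merge(sort(l), sort(r)); // plain recursion, no new threads
        }

        AtomicReference<List<E>> left = new AtomicReference<>();
        AtomicReference<List<E>> right = new AtomicReference<>();
        Thread t = new Thread(() -> left.set(sort(l)));
        Thread t2 = new Thread(() -> right.set(sort(r)));
        t.start();
        t2.start();
        try {
            // Wait for both halves; join() also makes their results visible here.
            t.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while sorting", e);
        }
        return merge(left.get(), right.get());
    }

    // Standard two-way merge of two already sorted lists.
    private static <E extends Comparable<E>> List<E> merge(List<E> a, List<E> b) {
        List<E> out = new ArrayList<>(a.size() + b.size());
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(a.get(i).compareTo(b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) {
            out.add(a.get(i++));
        }
        while (j < b.size()) {
            out.add(b.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            data.add(random.nextInt(1_000));
        }
        List<Integer> expected = new ArrayList<>(data);
        Collections.sort(expected);
        System.out.println("sorted correctly: " + sort(data).equals(expected));
    }
}
```

Without the two join() calls, merge would run while the halves are still being sorted, which matches the symptom described in the question.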

Related

MultiThreading in Java not Working as Expected

The first code does not always show the sum to be 1000, so I found a workaround to fix the problem, but why doesn't the first code work? The results are highly inconsistent. I know that using synchronized does not serve any purpose in this case, but I was just trying.
class Thread1 extends Thread{
int[] count;
int[] event;
Thread1(int[] event, int[] count){
this.event=event;
this.count=count;
}
public void run(){
for(int i=0; i<500; i++){
int x = event[i];
synchronized (count){
count[x]++;
}
}
}
}
class Thread2 extends Thread{
int[] count;
int[] event;
Thread2(int[] event, int[] count){
this.event=event;
this.count=count;
}
public void run(){
for(int i=500; i<1000; i++){
int x = event[i];
synchronized (count){
count[x]++;
}
}
}
}
public class Learning {
public static void main(String[] args) {
Random random = new Random();
int[] event = new int[1000];
for(int i=0; i<event.length; i++){
event[i] = random.nextInt(3);
}
Thread1 a = new Thread1(event, new int[3]);
Thread2 b = new Thread2(event, new int[3]);
a.start();
b.start();
int second = a.count[1]+b.count[1];
int third = a.count[2]+b.count[2];
int first = a.count[0]+b.count[0];
System.out.println(first);
System.out.println(second);
System.out.println(third);
System.out.println("SUM--> "+(first+second+third));
}
}
The code sometimes shows a total of 1000 entries, sometimes doesn't. I don't feel there is any need to synchronize as no common resource is being accessed.

The Thread1 and Thread2 classes use their respective count objects to synchronize.
The problem is that you instantiate them like this:
Thread1 a = new Thread1(event, new int[3]);
Thread2 b = new Thread2(event, new int[3]);
See?
You are passing different arrays to the two threads. If your two threads use different objects as their count, you do not get mutual exclusion or proper memory visibility.
On further examination, it looks like the synchronized block is probably unnecessary anyway. (You don't need mutual exclusion, and you get certain guarantees that the child threads will see properly initialized arrays because of the start() happens-before.)
However, it is clear that it is necessary to join the two child threads in the main thread for a couple of reasons:
If you don't join(), you cannot be sure that the child threads have completed before the main thread looks at the results.
If you don't join(), there are potential memory anomalies ... even if the child threads have both terminated before the main thread looks at the counts. (The join() call imposes a happens-before relationship between the child threads and the main thread.)
Your attempted solution using stop() is bogus for the following reasons:
The stop() method is deprecated because it is dangerous. It shouldn't be used.
The stop() method doesn't have a specified synchronizing effect.
Based on the documented semantics (such as they are) there is no logical reason that calling stop() should fix the problem.
As a general rule, "randomly trying things" is not a sound strategy for fixing concurrency bugs. There is a good chance that a random change will not fix a bug, but turn it from a bug that occurs frequently into one that occurs rarely ... or only on a different Java platform from the one you test on.
Why does it appear to work?
It looks like the child threads terminated before they were stopped. But it is just luck that this is happening. It is unlikely to happen if you scale up the amount of work that the child threads do.
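The join() advice above can be sketched like this. It is a minimal sketch under assumptions: the class name JoinBeforeReadSketch and the helper countEvents are invented for illustration, and lambdas replace the Thread1/Thread2 subclasses from the question. Each thread still gets its own count array, so no synchronized block is needed; the only change that matters is joining before reading.

```java
import java.util.Random;

// Hypothetical class; shows joining the workers before reading their counts.
class JoinBeforeReadSketch {

    // Counts how often 0, 1 and 2 occur in event, using two threads.
    static int[] countEvents(int[] event) {
        int half = event.length / 2;
        int[] countA = new int[3]; // written only by thread a
        int[] countB = new int[3]; // written only by thread b

        Thread a = new Thread(() -> {
            for (int i = 0; i < half; i++) {
                countA[event[i]]++;
            }
        });
        Thread b = new Thread(() -> {
            for (int i = half; i < event.length; i++) {
                countB[event[i]]++;
            }
        });
        a.start();
        b.start();
        try {
            // Wait until both threads are done; join() also guarantees that
            // their writes are visible to the thread that reads the counts.
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while counting", e);
        }
        return new int[] {
            countA[0] + countB[0], countA[1] + countB[1], countA[2] + countB[2]
        };
    }

    public static void main(String[] args) {
        Random random = new Random();
        int[] event = new int[1000];
        for (int i = 0; i < event.length; i++) {
            event[i] = random.nextInt(3);
        }
        int[] totals = countEvents(event);
        System.out.println("SUM--> " + (totals[0] + totals[1] + totals[2])); // prints SUM--> 1000
    }
}
```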

Adding a.stop(); b.stop(); after a.start(); b.start(); fixes the problem.
But I don't understand why.

How to ensure parallel processing in Java

I have a static array, arr, whose elements are squared and stored back again using the 'squarring' method. I want two threads to modify the array simultaneously. Each thread works on half of the array.
public class SimplerMultiExample {
private static int[] arr = new int[10];
public static void squarring(int start, int end)
{
for(int i=start; i<end; i++)
{
arr[i]*=arr[i];
System.out.println("here "+Thread.currentThread().getName());
}
}
private static Runnable doubleRun = new Runnable() {
@Override
public void run() {
if(Integer.parseInt(Thread.currentThread().getName())==1)
squarring(0,arr.length/2); //Thread named "1" is operating on
//the 1st half of the array.
else
squarring(arr.length/2,arr.length);
}
};
public static void main(String[] args){
Thread doubleOne = new Thread(doubleRun);
doubleOne.setName("1");
Thread doubleTwo = new Thread(doubleRun);
doubleTwo.setName("2");
doubleOne.start();
doubleTwo.start();
}
}
The sysout in the 'squarring' method tells me that the threads are going into the method serially, that is, one of the threads finishes before the other one enters it. As a result, one of the threads finishes early whereas the other one takes considerably longer to complete. I have tested the code with 1 million elements. Please advise on what I can do to ensure that the threads operate in parallel.
P.S - I am using a dual core system.

You don't have to program this from scratch:
Arrays.parallelSetAll(arr, i -> arr[i] * arr[i]);
parallelSetAll creates a parallel stream which does all the work for you. Note that the generator function receives the index of each element, not its value. From the Javadoc:
Set all elements of the specified array, in parallel, using the provided generator function to compute each element.
If you'd like to know how to control the number of threads used for parallel processing, check out this question.
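As a small runnable illustration of the call above (the class name ParallelSquareSketch and the helper squareAll are assumptions for this sketch), note that the lambda passed to Arrays.parallelSetAll is given the index i, so squaring in place has to read arr[i]:

```java
import java.util.Arrays;

// Hypothetical class; squares an int array in place using Arrays.parallelSetAll.
class ParallelSquareSketch {

    static int[] squareAll(int[] arr) {
        // The generator receives the index, so the current value is read via arr[i].
        // Each index reads and writes only its own slot, so the elements do not interfere.
        Arrays.parallelSetAll(arr, i -> arr[i] * arr[i]);
        return arr;
    }

    public static void main(String[] args) {
        int[] arr = {0, 1, 2, 3, 4};
        System.out.println(Arrays.toString(squareAll(arr))); // prints [0, 1, 4, 9, 16]
    }
}
```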

I recommend you add the following code to the end of your main method to ensure they each finish their work:
try {
doubleOne.join(); //Waits for this thread to die.
doubleTwo.join(); //Waits for this thread to die.
} catch (InterruptedException e) {
e.printStackTrace();
}
You are creating two threads that could be executed in parallel provided that the scheduler chooses to interleave them. However, the scheduler is not guaranteed to interleave the execution. Your code works in parallel on my system (...sometimes.. but other times, the threads execute in series).
Since you are not waiting for the threads to complete (using the join method), it is less likely that you would observe the interleaving (since you only observe part of the program's execution).

How can I make two threads to do two different loops or methods?

I have an algorithm to calculate something in a grid looking very roughly like this:
public class Main {
pass1 ;
pass2 ;
public static void main(String[] args) throws java.lang.Exception {
Function f = new Function();
f.solve(pass1, pass2);
}
}
public class Function {
public void solve(pass1, pass2) {
method1(pass1, pass2);
method2(pass1, pass2);
method3(pass1, pass2);
}
method1(pass1, pass2) {
//parse grid
for (row = 0; row < numofrows; row++) {
for (col = 0; col < numofcols; col++) {
method4(stuff in here to pass);
}
}
}
method2(pass1, pass2) {
//parse grid
for (row = 0; row < numofrows; row++) {
for (col = 0; col < numofcols; col++) {
method4(stuff in here to pass );
}
}
}
method3(pass1, pass2) {
//do stuff
}
method4(stuff) {
//add object to hashmap
}
}
I want to make the algorithm faster using Threads.
The idea that I have is to make one thread do method1 and/or method2 with an even increment counter, and another thread do it with an odd increment counter, making use of more CPU, because right now it's only using 25% (1 of 4 cores, I assume).
Is it possible to make a thread do a different loop or method if I were to make method2even() and method2odd()? If so, how would I implement this? I have been trying for hours and I can't wrap my head around it...

What you're suggesting is fine-grained parallelism, which can cause problems because of the memory hierarchy - if two threads are operating on alternating indices of the same array/matrix then they're going to have to essentially write directly to main memory (e.g. by flushing their caches after every operation) which is probably going to cause your multi-threaded program to run considerably slower than your single-threaded program. As much as possible, try to have your threads write to completely different segments of memory, e.g. entirely different arrays/matrices or at least different sections of the same array/matrix (e.g. thread1 writes to the first half of an array while thread2 writes to the second half - hopefully their array segments will be on different cache lines and they won't need to write to main memory to maintain coherency); if your threads are operating on the same memory segments then try to have them do so at different times, so that they can calculate their intermediate results in cache prior to flushing their final results to main memory.
So in your case, are method1, method2, and method3 independent of each other? If so, use a different thread for each method. If they're not independent, e.g. method1 must precede method2, which must precede method3, then you could use a pipeline approach: Thread1 executes method1 on the first N elements of the matrix; then Thread2 executes method2 on the first N elements while Thread1 executes method1 on the second N elements; then Thread3 executes method3 on the first N elements while Thread2 executes method2 on the second N elements and Thread1 executes method1 on the third N elements; and so on until all matrix elements have been processed.
If your threads need to talk to each other (e.g. to pass around matrix segments for pipelining) then I prefer to use something like a BlockingQueue: Method1 and Method2 would share a queue, with Method1 writing elements to it (via offer) and Method2 reading elements from it (via take). Method2 blocks with take until Method1 sends it a matrix segment to work on, then when Method2 is finished with the matrix segment it will send it on to Method3 via another BlockingQueue, then calls take again on the queue it shares with Method1.
Assuming that your methods are independent, some code to run them on separate threads would be as follows; this can be modified to accommodate pipelining instead. I'm omitting the MethodN constructors where you'll need to pass in the matrix etc. I'm using the Runnable interface, but as MadProgrammer said in the comments you can use Callable instead. The ExecutorService is responsible for assigning the Runnables to threads.
public class Method1 implements Runnable {
public void run() {
// execute method1
}
}
public class Method2 implements Runnable {
public void run() {
// execute method2
}
}
public class Method3 implements Runnable {
public void run() {
// execute method3
}
}
public class Function {
private ExecutorService executor = Executors.newFixedThreadPool(3);
public void solve(pass1, pass2) {
Method1 method1 = new Method1(pass1, pass2);
Method2 method2 = new Method2(pass1, pass2);
Method3 method3 = new Method3(pass1, pass2);
executor.submit(method1);
executor.submit(method2);
executor.submit(method3);
}
}
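The BlockingQueue hand-off described earlier in this answer could look roughly like the following two-stage pipeline. Everything concrete here is an assumption for illustration: the class name PipelineSketch, the "double every value" and "sum the segment" stages standing in for method1 and method2, and the POISON sentinel that marks the end of the stream. The sketch uses put rather than offer so that the producer blocks when the queue is full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical two-stage pipeline connected by a BlockingQueue.
class PipelineSketch {

    // Sentinel object telling the consumer that no more segments will arrive.
    private static final int[] POISON = new int[0];

    static List<Integer> run(int[][] segments) {
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(4);
        List<Integer> sums = new ArrayList<>(); // written only by stage2

        // Stage 1 (stand-in for method1): double every value of a segment.
        Thread stage1 = new Thread(() -> {
            try {
                for (int[] segment : segments) {
                    int[] doubled = new int[segment.length];
                    for (int i = 0; i < segment.length; i++) {
                        doubled[i] = segment[i] * 2;
                    }
                    queue.put(doubled); // blocks while the queue is full
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 2 (stand-in for method2): sum each segment as it arrives.
        Thread stage2 = new Thread(() -> {
            try {
                for (int[] seg = queue.take(); seg != POISON; seg = queue.take()) {
                    int sum = 0;
                    for (int v : seg) {
                        sum += v;
                    }
                    sums.add(sum);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        stage1.start();
        stage2.start();
        try {
            stage1.join();
            stage2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running the pipeline", e);
        }
        return sums;
    }

    public static void main(String[] args) {
        int[][] segments = {{1, 2}, {3}, {4, 5, 6}};
        System.out.println(run(segments)); // prints [6, 6, 30]
    }
}
```

A third stage would be wired in the same way, with a second queue between stage 2 and stage 3.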

10 threads write to single hash simultaneously

Sorry for the question; I'm just stuck at the end of the day.
For a test, I need 10 threads writing to the same hash (it is not really a hash but a very similar thing; I need to prove that its writes are synchronized).
Is this the right code?
Random rn = new Random();
Map<int,int> hash = new MyHashMap<int,int>();
for(int i = 0; i< 10; i++)
{
Thread th = new MyAddingThread();
th.Start();
}
public class MyAddingThread extends Thread{
public void run()
{
hash.Add(rn.nextInt,rn.nextInt);
}
}
It may be better to change 10 to 100, but I have no idea how to test that hash for synchronization.

HashMap is not thread-safe. Use a ConcurrentHashMap instead.
EDIT
If your real question is whether your code will allow you to test whether any given data structure is thread-safe or not, there really is no reliable way to do so. Multi-thread development can introduce any number of bugs that can be extremely difficult to detect. A data structure is either designed to be thread-safe, or it is not.

That won't work: for a meaningful test, all threads should try to add their items to the hash at the same time.
To do that, you need to use a CountDownLatch:
[[main]]
CountDownLatch startSignal = new CountDownLatch(1);
for (int i = 0; i < 10; i++)
{
    Thread th = new MyAddingThread();
    th.start();
}
startSignal.countDown(); // releases all waiting threads at once
[[on the thread]]
public class MyAddingThread extends Thread {
    public void run()
    {
        try {
            startSignal.await(); // blocks until the main thread counts down
        } catch (InterruptedException e) {
            return;
        }
        hash.put(rn.nextInt(), rn.nextInt());
    }
}
The javadoc of this class has a similar (but more complex) example.
Also, if you want to test this properly, create as many threads as your computer has cores, and have each thread loop and insert at least a few hundred items. If you only insert 10 elements, there is very little chance that you'll hit a concurrency problem.
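Putting the latch and the "many inserts per thread" advice together, a test harness might look like the sketch below. The names MapStressSketch and stress, and the scheme of giving every thread its own disjoint key range, are assumptions for illustration. With disjoint keys, a map that handles concurrent writes correctly must end up with exactly threads * perThread entries. The main method only exercises thread-safe maps; passing a plain HashMap instead is how you would look for lost entries, with the caveat that an unsynchronized map may also throw or misbehave in other ways.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical stress harness: all threads start writing at the same moment.
class MapStressSketch {

    // Returns the map size after every thread has inserted perThread distinct keys.
    static int stress(Map<Integer, Integer> map, int threads, int perThread) {
        CountDownLatch startSignal = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);

        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // disjoint key range per thread
            new Thread(() -> {
                try {
                    startSignal.await(); // wait for the common start
                    for (int i = 0; i < perThread; i++) {
                        map.put(base + i, i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (RuntimeException e) {
                    // A map that is not thread-safe may fail here; the size check reveals it.
                } finally {
                    done.countDown();
                }
            }).start();
        }

        startSignal.countDown(); // release all writers at once
        try {
            done.await(); // wait until every writer has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for writers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int perThread = 10_000;
        System.out.println("expected:         " + threads * perThread);
        System.out.println("ConcurrentHashMap " + stress(new ConcurrentHashMap<>(), threads, perThread));
        System.out.println("synchronizedMap   "
                + stress(Collections.synchronizedMap(new HashMap<>()), threads, perThread));
    }
}
```

A passing run does not prove that a data structure is thread-safe; it only fails to find a problem. A size mismatch, on the other hand, is a definite bug.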

A HashMap is not synchronized so you have to do it on your own!
You may synchronize the access:
public class MyAddingThread extends Thread {
public void run() {
synchronized(hash) {
hash.put(rn.nextInt(),rn.nextInt());
}
}
}
but then you lose most of the concurrent execution.
Another idea is to use a ConcurrentHashMap:
ConcurrentMap<Integer,Integer> hash = new ConcurrentHashMap<>();
public class MyAddingThread extends Thread {
public void run() {
hash.put(rn.nextInt(),rn.nextInt());
}
}
This would give better performance because there is less blocking.
The other problem is the use of Random here, which is thread-safe but will also perform poorly when many threads share one instance.
Which solution is best for your problem I cannot judge from your little piece of pseudocode.
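One way to address both points, sketched under assumptions: ThreadLocalRandom is swapped in for the shared Random (it is not mentioned in the answer above), and the class name ThreadLocalRandomSketch, the key range 0..9, and the use of merge to count how often each key was drawn are invented for illustration. Because ConcurrentHashMap.merge is atomic, the counts must add up to the total number of draws.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: per-thread random numbers feeding a ConcurrentHashMap.
class ThreadLocalRandomSketch {

    // Each task draws perThread keys in 0..9 and counts them in a shared map.
    static Map<Integer, Integer> fill(int threads, int perThread) {
        ConcurrentHashMap<Integer, Integer> hash = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                // current() must be called inside the task, on the worker thread.
                ThreadLocalRandom rn = ThreadLocalRandom.current();
                for (int i = 0; i < perThread; i++) {
                    hash.merge(rn.nextInt(10), 1, Integer::sum); // atomic increment
                }
            });
        }

        pool.shutdown(); // no new tasks; already submitted tasks still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }
        return hash;
    }

    public static void main(String[] args) {
        int total = 0;
        for (int count : fill(4, 100_000).values()) {
            total += count;
        }
        System.out.println("total draws counted: " + total); // prints total draws counted: 400000
    }
}
```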

Data race in Java ArrayList class

I was reading about CopyOnWriteArrayList and was wondering how can I demonstrate data race in ArrayList class. Basically I'm trying to simulate a situation where ArrayList fails so that it becomes necessary to use CopyOnWriteArrayList. Any suggestions on how to simulate this.

A race is when two (or more) threads try to operate on shared data, and the final output depends on the order in which the data is accessed (and that order is nondeterministic).
From Wikipedia:
A race condition or race hazard is a flaw in an electronic system or process whereby the output and/or result of the process is unexpectedly and critically dependent on the sequence or timing of other events. The term originates with the idea of two signals racing each other to influence the output first.
For example:
public class Test {
private static List<String> list = new CopyOnWriteArrayList<String>();
public static void main(String[] args) throws Exception {
ExecutorService e = Executors.newFixedThreadPool(5);
e.execute(new WriterTask());
e.execute(new WriterTask());
e.execute(new WriterTask());
e.execute(new WriterTask());
e.execute(new WriterTask());
e.awaitTermination(20, TimeUnit.SECONDS);
}
static class WriterTask implements Runnable {
@Override
public void run() {
for (int i = 0; i < 25000; i ++) {
list.add("a");
}
}
}
}
With CopyOnWriteArrayList this works. It can fail, however, when using ArrayList, with ArrayIndexOutOfBoundsException. That's because before insertion ensureCapacity(..) has to be called to make sure the internal array can hold the new data. Here's what happens:
the first thread calls add(..), which in turn calls ensureCapacity(currentSize + 1)
before the first thread has actually incremented the size, the second thread also calls ensureCapacity(currentSize + 1).
because both have read the initial value of currentSize, the new size of the internal array is currentSize + 1
the two threads perform the expensive operation of copying the old array into the extended one, with the new size (which cannot hold both additions)
Then each of them tries to assign the new element to array[size++]. The first one succeeds, the second one fails, because the internal array has not been expanded properly, due to the race condition.
A related outcome is a lost update: two threads add items to the same structure at the same time, and the addition of one of them overwrites the addition of the other (i.e. the first one is lost).
Another benefit of CopyOnWriteArrayList shows up in this scenario:
multiple threads write to the ArrayList
a thread iterates the ArrayList. It will very likely get a ConcurrentModificationException
Here's how to demonstrate it:
public class Test {
private static List<String> list = new ArrayList<String>();
public static void main(String[] args) throws Exception {
ExecutorService e = Executors.newFixedThreadPool(2);
e.execute(new WriterTask());
e.execute(new ReaderTask());
}
static class ReaderTask implements Runnable {
@Override
public void run() {
while (true) {
for (String s : list) {
System.out.println(s);
}
}
}
}
static class WriterTask implements Runnable {
@Override
public void run() {
while(true) {
list.add("a");
}
}
}
}
If you run this program multiple times, you will often get a ConcurrentModificationException before you get an OutOfMemoryError.
If you replace it with CopyOnWriteArrayList, you don't get the exception (but the program is very slow).
Note that this is just a demonstration - the benefit of CopyOnWriteArrayList is when the number of reads vastly outnumbers the number of writes.
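The difference in iterator behaviour can also be shown deterministically in a single thread, which avoids the endless loops of the demonstration above. This is a sketch under assumptions: the class name SnapshotIterationSketch and its helper are invented, and the modification comes from the iterating thread itself rather than from a second thread. The fail-fast check that fires is the same one; only its trigger is different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: fail-fast iteration versus snapshot iteration.
class SnapshotIterationSketch {

    // Adds to the list while iterating over it; reports whether that failed.
    static boolean failsWhenModifiedDuringIteration(List<String> list) {
        try {
            for (String s : list) {
                list.add(s + "!"); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> cow = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));

        // ArrayList's iterator notices the change on its next step and throws.
        System.out.println(failsWhenModifiedDuringIteration(plain)); // prints true

        // CopyOnWriteArrayList iterates over the snapshot taken when the loop
        // started, so it visits the 3 original elements and never throws.
        System.out.println(failsWhenModifiedDuringIteration(cow)); // prints false
        System.out.println(cow.size()); // prints 6
    }
}
```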

Example:
for (int i = 0; i < array.size(); ++i) {
Element elm = array.get(i);
doSomethingWith(elm);
}
If another thread calls array.clear() after this thread has compared i with array.size(), but before it calls array.get(i), the result is an IndexOutOfBoundsException.

Two threads, one adding elements to the ArrayList and one removing them. A data race could happen here.
