Threads stopping prematurely for certain values - java

Background
So I'm writing an application that performs Monte Carlo simulations to investigate graphs that can evolve via the Moran process (evolutionary graph theory). For undirected graphs this works perfectly, but for directed graphs the application has been exhibiting strange behaviour and I can't for the life of me figure out why. What seems to happen is that when the Boolean variable isDirected is set to true, the threads exit the for loop they run in before the loop condition is met, despite working properly when isDirected is false.
The graphs are represented by an adjacency matrix, so the only difference in the code when the graph is directed is that the adjacency matrix is non-symmetric, but I can't see any reason that would have an impact.
Code
The main relevant code is this section from the controller:
//Initialise a threadPool and an array of investigators to provide each thread with an Investigator runnable
long startTime = System.nanoTime();
int numThreads = 4;
Investigator[] invArray = new Investigator[numThreads];
ExecutorService threadPool = Executors.newFixedThreadPool(numThreads);
//Assign the tasks to the threads
for(int i = 0; i < numThreads; i++){
    invArray[i] = new Investigator(vertLimit, iterations, graphNumber/numThreads, isDirected, mutantFitness, vertFloor);
    threadPool.submit(invArray[i]);
}
threadPool.shutdown();
//Wait till all the threads are finished, note this could cause the application to hang for the user if the threads deadlock
try{
    threadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}catch(InterruptedException except){
    System.out.println("Thread interrupted");
}
//The next two blocks average the results of the different threads into 1 array
double[] meanArray = new double[vertLimit];
double[] meanError = new double[vertLimit];
double[] fixProbArray = new double[vertLimit];
double[] fixProbError = new double[vertLimit];
for(int x = 0; x < vertLimit; x++){
    for(Investigator i : invArray){
        meanArray[x] += i.getMeanArray()[x];
        meanError[x] += Math.pow(i.getMeanError()[x], 2);
        fixProbArray[x] += i.getFixProbArray()[x];
        fixProbError[x] += Math.pow(i.getFixProbError()[x], 2);
    }
    meanArray[x] = meanArray[x]/numThreads;
    fixProbArray[x] = fixProbArray[x]/numThreads;
    meanError[x] = Math.sqrt(meanError[x]);
    fixProbError[x] = Math.sqrt(fixProbError[x]);
}
long endTime = System.nanoTime();
//The remaining code is for printing and producing graphs of the results
As well as the Investigator class, the important parts of which are shown below:
public class Investigator implements Runnable{

    public Investigator(int vertLimit, int iterations, int graphNumber, Boolean isDirected, int mutantFitness, int... vertFloor){
        //Constructor just initialises all the class variables passed in
    }

    public void run(){
        GraphGenerator g = new GraphGenerator();
        Statistics stats = new Statistics();
        //The outer loop iterates through graphs with an increasing number of vertices; this is the problematic loop that exits too early
        for(int x = vertFloor > 2 ? vertFloor : 2; x < vertLimit; x++){
            System.out.println("Current vertex amount: " + x);
            double[] currentMean = new double[graphNumber];
            double[] currentMeanErr = new double[graphNumber];
            double[] currentFixProb = new double[graphNumber];
            double[] currentFixProbErr = new double[graphNumber];
            //This loop generates the required number of graphs of the given vertex number and performs a simulation on each one
            for(int y = 0; y < graphNumber; y++){
                Simulator s = new Simulator();
                matrix = g.randomGraph(x, isDirected, mutantFitness);
                s.moranSimulation(iterations, matrix);
                currentMean[y] = stats.freqMean(s.getFixationTimes());
                currentMeanErr[y] = stats.freqStandError(s.getFixationTimes());
                currentFixProb[y] = s.getFixationProb();
                currentFixProbErr[y] = stats.binomialStandardError(s.getFixationProb(), iterations);
            }
            meanArray[x] = Arrays.stream(currentMean).sum()/currentMean.length;
            meanError[x] = Math.sqrt(Arrays.stream(currentMeanErr).map(i -> i*i).sum());
            fixProbArray[x] = Arrays.stream(currentFixProb).sum()/currentFixProb.length;
            fixProbError[x] = Math.sqrt(Arrays.stream(currentFixProbErr).map(i -> i*i).sum());
        }
    }

    //A number of getter methods also provided here
}
Problem
I've put in some print statements to work out what's going on, and for some reason when I set isDirected to true the threads finish before x reaches vertLimit (which I've checked is indeed the value I specified). I've tried manually calling my GraphGenerator.randomGraph() method for a directed graph and it gives the correct output, and Simulator.moranSimulation() also works fine for directed graphs when called manually. I'm not getting a thread interruption caught by my catch block either, so that's not the issue.
For the same set of parameters the threads finish at different stages, seemingly at random: sometimes they are all on the same value of x when they stop, sometimes some of the threads will have gotten further than others, but that changes from run to run.
I'm completely stumped here and would really appreciate some help, thanks.

When tasks are being run by an ExecutorService, they can sometimes appear to end prematurely if an unhandled exception is thrown.
Each time you call .submit(Runnable) or .submit(Callable) you get a Future object back that represents the eventual completion of the task. The Future object has a .get() method that will return the result of the task when it is complete. Calling this method will block until that result is available. Also, if the task throws an exception that is not otherwise handled by your task code, the call to .get() will throw an ExecutionException which will wrap the actual thrown exception.
If your code is exiting prematurely due to an unhandled exception, call .get() on each Future object you get when you submit the task for execution (after you have submitted all the tasks you wish to) and catch any ExecutionExceptions that happen to be thrown to figure out what the actual underlying problem is.
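For example, a minimal sketch against the controller code from the question (collecting the Futures in a list is the only addition; the constructor arguments are copied from the question):
//Sketch: keep the Futures returned by submit() so exceptions can be surfaced later.
//Needs java.util.List, java.util.ArrayList, java.util.concurrent.Future and ExecutionException.
List<Future<?>> futures = new ArrayList<>();
for(int i = 0; i < numThreads; i++){
    invArray[i] = new Investigator(vertLimit, iterations, graphNumber/numThreads, isDirected, mutantFitness, vertFloor);
    futures.add(threadPool.submit(invArray[i]));
}
threadPool.shutdown();
for(Future<?> f : futures){
    try{
        f.get(); //blocks until this task completes
    }catch(ExecutionException e){
        //The cause is the exception that escaped Investigator.run()
        e.getCause().printStackTrace();
    }catch(InterruptedException e){
        Thread.currentThread().interrupt();
    }
}
If a thread is dying early, the stack trace printed here will point at the offending line; given the array indexing in run(), an unchecked exception such as an ArrayIndexOutOfBoundsException is a plausible candidate.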

It looks like you might be terminating the threads prematurely with threadPool.shutdown();
From the Docs:
This method does not wait for previously submitted tasks to complete execution. Use awaitTermination to do that.
The code invokes .shutdown before awaitTermination...

Related

Thread executes too many times and causes race condition even though I'm using locks

I'm working on a multithreaded application for an exercise, used to simulate a warehouse (similar to the producer-consumer problem); however, I'm running into some trouble where increasing the number of consumer threads makes the program behave in unexpected ways.
The code:
I'm creating producer threads called buyers, each of which has the goal of placing precisely 10 orders with the warehouse. To do this they have a shared object called warehouse on which a buyer can place an order; the order is then stored in a buffer in the shared object. After this the buyer sleeps for some time until it either tries again or all packs have been bought. The code to do this looks like this:
public void run() {
    //Run until the thread has bought 10 packages; this ensures the thread
    //will eventually stop execution automatically.
    while (this.packsBought < 10) {
        try {
            //Sleep for a random amount of time between 1 and 50
            //milliseconds.
            Thread.sleep(this.rand.nextInt(49) + 1);
        //Catch any InterruptedExceptions.
        } catch (InterruptedException ex) {
            //There is no problem if this exception is thrown; the thread
            //will just make an order earlier than planned. That being said,
            //there should be no manner in which this exception is thrown.
        }
        //Create a new order.
        Order order = new Order(this.rand.nextInt(3) + 1,
                                this,
                                this.isPrime);
        //Set the time at which the order was placed as now.
        order.setOrderTime(System.currentTimeMillis());
        //Place the newly created order in the warehouse.
        this.warehouse.placeOrder(order);
    }
    //Notify that the thread has finished execution.
    System.out.println("Thread: " + super.getName() + " has finished.");
}
As you can see, the function placeOrder(Order order) is used to place an order at the warehouse. This function is responsible for inserting the order into the queue based on some logic related to prime status. The function looks like this:
public void placeOrder(Order order) {
    try {
        //Halt until there are enough packs to handle an order.
        this.notFullBuffer.acquire();
        //Lock to signify the start of the critical section.
        this.mutexBuffer.lock();
        //Insert the order in the buffer depending on prime status.
        if (order.isPrime()) {
            //Prime order: insert behind all prime orders in the buffer.
            //Enumerate all non-prime orders in the list.
            for (int i = inPrime; i < sizeOrderList - 1; i++) {
                //Move the non-prime order back 1 position in the list.
                buffer[i + 1] = buffer[i];
            }
            //Insert the prime order.
            buffer[inPrime++] = order;
        } else {
            //Non-prime order: insert behind all orders in the buffer.
            buffer[inPrime + inNormal++] = order;
        }
        //Notify the DispatchWorkers that a new order has been placed.
        this.notEmptyBuffer.release();
    //Catch any InterruptedException that might occur.
    } catch (InterruptedException e) {
        //Even though this isn't expected behaviour, there is no reason to
        //notify the user of this event or to perform any other action, as
        //the thread will just return to the queue before placing another
        //order if it is still required to do so.
    } finally {
        //Unlock and finalise the critical section.
        mutexBuffer.unlock();
    }
}
The orders are consumed by workers, which act as the consumer threads. The thread itself contains very simple code that loops until all orders have been processed. In this loop a different function, handleOrder(), is called on the same warehouse object; it handles a single order from the buffer. It does so with the following code:
public void handleOrder() {
    //Create a variable to store the order being handled.
    Order toHandle = null;
    try {
        //Wait until there is an order to handle.
        this.notEmptyBuffer.acquire();
        //Lock to signify the start of the critical section.
        this.mutexBuffer.lock();
        //Obtain the first order to handle as the first element of the buffer.
        toHandle = buffer[0];
        //Move all buffer elements back by 1 position.
        for (int i = 1; i < sizeOrderList; i++) {
            buffer[i - 1] = buffer[i];
        }
        //Set the last element in the buffer to null.
        buffer[sizeOrderList - 1] = null;
        //We have obtained an order from the buffer and now we can handle it.
        if (toHandle != null) {
            int nPacks = toHandle.getnPacks();
            //Wait until the appropriate resources are available.
            this.hasBoxes.acquire(nPacks);
            this.hasTape.acquire(nPacks * 50);
            //Now we can handle the order (simulated by sleeping; although
            //in real life Amazon workers also have about 5ms of time per
            //package).
            Thread.sleep(5 * nPacks);
            //Calculate the total time this order took.
            long time = System.currentTimeMillis() -
                        toHandle.getOrderTime();
            //Update the total waiting time for the buyer.
            toHandle.getBuyer().setWaitingTime(time +
                        toHandle.getBuyer().getWaitingTime());
            //Check if the order to handle is prime or not.
            if (toHandle.isPrime()) {
                //Decrement the position at which prime orders are
                //inserted into the buffer.
                inPrime--;
            } else {
                //Decrement the position at which normal orders are
                //inserted into the buffer.
                inNormal--;
            }
            //Print a message informing the user a new order was completed.
            System.out.println("An order has been completed for: "
                        + toHandle.getBuyer().getName());
            //Notify the buyer he has successfully ordered a new package.
            toHandle.getBuyer().setPacksBought(
                        toHandle.getBuyer().getPacksBought() + 1);
        } else {
            //Notify the user there was a critical error obtaining the
            //order to handle. (There shouldn't exist a case where this
            //should happen, but you never know.)
            System.err.println("Something went wrong obtaining an order.");
        }
        //Notify the buyers that a new spot has been opened in the buffer.
        this.notFullBuffer.release();
    //Catch any InterruptedExceptions.
    } catch (InterruptedException e) {
        //This is expected behaviour, as it allows us to force the thread to
        //re-evaluate its main running loop when notifying it to finish
        //execution.
    } finally {
        //Check if the current thread is holding the buffer lock. This is
        //done because in the case of an interrupt we don't want to execute
        //this code if the interrupted thread doesn't hold the lock, as that
        //would result in an exception we don't want.
        if (mutexBuffer.isHeldByCurrentThread())
            //Unlock the buffer lock.
            mutexBuffer.unlock();
    }
}
The problem:
To verify the functionality of the program I use the output from the statement:
System.out.println("An order has been completed for: "
+ toHandle.getBuyer().getName());
from the handleOrder() function. I place the whole output in a text file, remove all the lines which weren't added by this println() statement, and count the number of lines to know how many orders have been handled. I expect this value to be equal to the number of threads times 10, however this is often not the case. Running tests I've noticed that sometimes it does work and there are no problems, but sometimes one or more buyer threads place more orders than they should: with 5 buyer threads there should be 50 outputs, but I get anywhere from 50 to 60 lines (orders placed).
Turning the number of threads up to 30 increases the problem, and now I can expect up to 50% more orders, with some threads placing up to 30 orders.
Doing some research, this is called a data race and is caused by two threads accessing the same data at the same time while one of them writes to it. This basically changes the data such that the other thread isn't working with the data it expects to be working with.
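(For a self-contained illustration of the concept, not code from the project: two threads doing unsynchronized increments on a shared counter will routinely lose updates.)
public class RaceDemo {
    //counter++ is a read-modify-write, so concurrent updates can be lost.
    static int counter = 0;
    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> { for (int i = 0; i < 100_000; i++) counter++; };
        Thread a = new Thread(r), b = new Thread(r);
        a.start(); b.start();
        a.join(); b.join();
        //Usually prints less than 200000 because increments were lost.
        System.out.println(counter);
    }
}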
My attempt:
I firmly believe ReentrantLocks are designed to handle situations like this, as they should stop any thread from entering a section of code while another thread hasn't left it. Both the placeOrder(Order order) and handleOrder() functions make use of this mechanic, so I'm assuming I didn't implement it correctly. Here is a version of the project which is compilable and executable from a single file called Test.java. Would anyone be able to take a look at that, or the code explained above, and tell me what I'm doing wrong?
EDIT
I noticed there was a way a buyer could place more than 10 orders so I changed the code to:
/*
 * The run method which is run once the thread is started.
 */
public void run() {
    //Run until the thread has bought 10 packages; this ensures the thread
    //will eventually stop execution automatically.
    for (packsBought = 0; packsBought < 10; packsBought++)
    {
        try {
            //Sleep for a random amount of time between 1 and 50
            //milliseconds.
            Thread.sleep(this.rand.nextInt(49) + 1);
        //Catch any InterruptedExceptions.
        } catch (InterruptedException ex) {
            //There is no problem if this exception is thrown; the thread
            //will just make an order earlier than planned. That being said,
            //there should be no manner in which this exception is thrown.
        }
        //Create a new order.
        Order order = new Order(this.rand.nextInt(3) + 1,
                                this,
                                this.isPrime);
        //Set the time at which the order was placed as now.
        order.setOrderTime(System.currentTimeMillis());
        //Place the newly created order in the warehouse.
        this.warehouse.placeOrder(order);
    }
    //Notify that the thread has finished execution.
    System.out.println("Thread: " + super.getName() + " has finished.");
}
in the buyer's run() function, yet I'm still getting some threads which place over 10 orders. I also removed the update of the number of packs bought in the handleOrder() function, as that is now unnecessary. Here is an updated version of Test.java (where all classes are together for easy execution). There seems to be a different problem here.
There are some concurrency issues with the code, but the main bug is not related to them: it's in the block starting at line 512, in placeOrder:
//Enumerate all non prime orders in the list.
for (int i = inPrime; i < sizeOrderList - 1; i++) {
    //Move the non prime order back 1 position in the list.
    buffer[i + 1] = buffer[i];
}
When there is only one normal order in the buffer, inPrime is 0, inNormal is 1, buffer[0] is the normal order, and the rest of the buffer is null.
The code that moves non-prime orders starts at index 0 and then does:
buffer[1] = buffer[0] //normal order in 0 gets copied to 1
buffer[2] = buffer[1] //now it's in 1, so it gets copied to 2
buffer[3] = buffer[2] //now it's in 2 too, so it gets copied to 3
....
So it moves the normal order to buffer[1], but then keeps copying that same order forward, filling the whole buffer with it.
To solve it you should copy the array in reverse order:
//Enumerate all non prime orders in the list.
for (int i = (sizeOrderList - 1); i > inPrime; i--) {
    //Move the non prime order back 1 position in the list.
    buffer[i] = buffer[i - 1];
}
As for the concurrency issues:
If you check a field on one thread that is updated by another thread, you should declare it as volatile. That's the case for the run field in DispatcherWorker and ResourceSupplier. See: https://stackoverflow.com/a/8063587/11751648
You start interrupting the dispatcher threads (line 183) while they are still processing packages. So if they are stopped at line 573, 574 or 579, they will throw an InterruptedException and not finish the processing (hence in the last code not all packages are always delivered). You could avoid this by checking that the buffer is empty before interrupting the dispatcher threads, calling warehouse.notFullBuffer.acquire(warehouse.sizeOrderList); at line 175.
When catching InterruptedException you should always call Thread.currentThread().interrupt(); to preserve the interrupted status of the thread. See: https://stackoverflow.com/a/3976377/11751648
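A minimal sketch of the volatile and interrupt-status points (the class and field names here are illustrative, not the project's):
class Worker implements Runnable {
    //volatile so that a write from another thread is visible to the loop below
    private volatile boolean running = true;

    public void stopRunning() { running = false; }

    public void run() {
        while (running) {
            try {
                Thread.sleep(10); //stand-in for real work
            } catch (InterruptedException e) {
                //Restore the interrupted status so callers and outer loops can see it.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}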
I believe you may be chasing ghosts. I'm not entirely sure why you're seeing more outputs than you're expecting, but the number of orders placed appears to be in order. Allow me to clarify:
I've added a Map<String,Integer> to the Warehouse class to map how many orders each thread places:
private Map<String, Integer> ordersPlaced = new TreeMap<>();

// Code omitted for brevity

public void placeOrder(Order order)
{
    try
    {
        //halt until there are enough packs to handle an order.
        this.notFullBuffer.acquire();
        //Lock to signify the start of the critical section.
        this.mutexBuffer.lock();
        ordersPlaced.merge(Thread.currentThread().getName(), 1, Integer::sum);
        // Rest of method
    }
}
I then added a for-loop to the main method to execute the code 100 times, and added the following code to the end of each iteration:
warehouse.ordersPlaced.forEach((thread, orders) -> System.out.printf(" %s - %d%n", thread, orders));
I placed a breakpoint inside the lambda expression, with condition orders != 10. This condition never triggered in the 100+ runs I executed. As far as I can tell, your code is working as intended. I've increased both nWorkers and nBuyers to 100 just to be sure.
I believe you're using ReentrantLock correctly, and I agree that it is probably the best choice for your use case.
Referring to your code on pastebin:
THE GENERIC PROBLEM:
In the function public void handleOrder() the sleep (line 582), Thread.sleep(5 * nPacks);, is inside the lock()/unlock() block.
With the sleep in this position it makes no sense to have many DispatchWorkers, because n-1 of them will wait at line 559, this.mutexBuffer.lock(), while one is sleeping at line 582.
THE BUG:
The bug is in line 173. You should remove it.
In your main() you join all buyers, and this is correct. Then you try to stop the workers. The workers at this time are still running to complete orders that will only be completed seconds later. You should only set worker.runThread(false); and then join the thread (possibly in two separate loops). This solution really waits for workers to complete orders. Interrupting a thread that is sleeping at line 582 will raise an InterruptedException and the following lines are skipped, in particular line 596 or 600, which update the inPrime and inNormal counters, generating unpredictable behaviour.
Moving line 582 after line 633 and removing line 173 will solve the problem.
HOW TO TEST:
My suggestion is to introduce a counter of all boxes generated by the supplier and a counter of all boxes ordered, and finally check that the generated boxes equal the ordered boxes plus those left in the warehouse.
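A rough sketch of that bookkeeping, assuming hypothetical hook points in the supplier and the workers (all names here are illustrative):
import java.util.concurrent.atomic.AtomicLong;

class BoxAccounting {
    //generated.incrementAndGet() would be called wherever the supplier creates a box;
    //ordered.addAndGet(nPacks) wherever a worker completes an order.
    static final AtomicLong generated = new AtomicLong();
    static final AtomicLong ordered = new AtomicLong();

    //Called after all threads are joined; leftInWarehouse read from the warehouse state.
    static void check(long leftInWarehouse) {
        if (generated.get() != ordered.get() + leftInWarehouse) {
            System.err.println("Invariant violated: generated=" + generated.get()
                    + ", ordered=" + ordered.get() + ", left=" + leftInWarehouse);
        }
    }
}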

What could cause a java process to get gradually decreasing share of CPU?

I have a very simple Java program that prints out 1 million random numbers. In Linux, I observed the %CPU that this program takes during its lifespan: it starts off at 98%, then gradually decreases to 2%, causing the program to be very slow. What are some of the factors that might cause the program to gradually get less CPU time?
I've tried running it with nice -20 but I still see the same results.
EDIT: running the program with /usr/bin/time -v I'm seeing an unusual amount of involuntary context switches (588 voluntary vs 16478 involuntary), which suggests that the OS is letting some other higher priority process run.
It boils down to two things:
I/O is expensive, and
depending on how you're storing the numbers as you go along, that can have an adverse effect on performance as well.
If you're mainly doing System.out.println(randInt) in a loop a million times, then that can get expensive. I/O isn't one of those things that comes for free, and writing to any output stream costs resources.
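As a rough sketch of how much of that cost can be avoided (not the poster's code; the class name is made up), buffer the output so a million println calls collapse into far fewer actual writes:
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.util.concurrent.ThreadLocalRandom;

public class BufferedPrint {
    public static void main(String[] args) {
        //Buffering batches the underlying write() system calls.
        PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out)));
        for (int i = 0; i < 1_000_000; i++) {
            out.println(ThreadLocalRandom.current().nextInt());
        }
        out.flush(); //push any remaining buffered output
    }
}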
I would start by profiling via JConsole or VisualVM to see what it's actually doing when it has a low CPU %. As mentioned in the comments, there's a high chance it's blocking, e.g. waiting for I/O (user input, a SQL query taking a long time, etc.).
This would be the case if your application is I/O bound: for example, waiting for responses from network calls, or for disk reads/writes.
If you want to try and balance everything, you should create a queue to hold numbers to print, then have one thread generate them (the producer) and the other read and print them (the consumer). This can easily be done with a LinkedBlockingQueue.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadLocalRandom;

public class PrintQueueExample {
    //Static so that both main() and the static nested PrinterThread can reach it.
    private static BlockingQueue<Integer> printQueue = new LinkedBlockingQueue<Integer>();

    public static void main(String[] args) throws InterruptedException {
        PrinterThread thread = new PrinterThread();
        thread.start();
        for (int i = 0; i < 1000000; i++) {
            //Generate the number to print (this expression was elided in the original).
            int toPrint = ThreadLocalRandom.current().nextInt();
            printQueue.put(Integer.valueOf(toPrint));
        }
        thread.interrupt();
        thread.join();
        System.out.println("Complete");
    }

    private static class PrinterThread extends Thread {
        @Override
        public void run() {
            try {
                while (true) {
                    Integer toPrint = printQueue.take();
                    System.out.println(toPrint);
                }
            } catch (InterruptedException e) {
                // Interruption comes from main, meaning processing numbers has stopped.
                // Finish remaining numbers and stop the thread.
                List<Integer> remainingNumbers = new ArrayList<Integer>();
                printQueue.drainTo(remainingNumbers);
                for (Integer toPrint : remainingNumbers)
                    System.out.println(toPrint);
            }
        }
    }
}
There may be a few problems with this code, but this is the gist of it.

Can Scheduler override join functionality?

I wrote a simple code that uses multiple threads to calculate number of primes from 1 to N.
public static void main(String[] args) throws InterruptedException
{
    Date start;
    start = new Date();
    long startms = start.getTime();
    int number_primes = 0, number_threads = 0;
    number_primes = Integer.parseInt(args[0]);
    number_threads = Integer.parseInt(args[1]);
    MakeThread[] mt = new MakeThread[number_threads];
    for (int i = 1; i <= number_threads; i++)
    {
        mt[i-1] = new MakeThread(i, (i-1)*(number_primes/number_threads), i*(number_primes/number_threads));
        mt[i-1].start();
    }
    for (int i = 1; i < number_threads; i++)
    {
        mt[i-1].join();
    }
    Date end = new Date();
    long endms = end.getTime();
    System.out.println("Time taken = " + (endms - startms));
}
}
As shown above, I want the final time taken to be displayed (just to measure performance for different inputs). However, I noticed that when I enter a really big value of N and assign only 1 or 2 threads, the scheduler seems to override the join functionality (i.e. the last print statement is displayed before the other threads end). Is the kernel allowed to do this? Or do I have some bug in my code?
P.S: I have only shown a part of my code. I have a similar System.out.println at the end of the function that the newly forked threads call.
Your loop is the problem.
for (int i = 1; i < number_threads; i++)
{
    mt[i-1].join();
}
Either you change the condition to <= or you make a less cryptic loop like this:
for (int i = 0; i < number_threads; i++) {
    mt[i].join();
}
Or a for-each loop:
for (MakeThread thread : mt)
    thread.join();
Provided you correct your loop so that it calls join on all threads, as shown below,
for (int i = 0; i < number_threads; i++)
{
    mt[i].join();
}
there is no way that the last print line can be invoked before all the threads (as specified in the loop) finish running and join the main thread. The scheduler cannot break these semantics. As pointed out by Thomas, the bug is in your code: it does not call join on the last thread, which therefore need not complete before the last print is called.

Cyclic barrier Java, How to verify?

I am preparing for interviews and just want to prepare some basic threading examples and structures so that I can use them during my whiteboard coding if I have to.
I was reading about CyclicBarrier and was just trying my hands at it, so I wrote a very simple code:
import java.util.concurrent.CyclicBarrier;

public class Threads
{
    /**
     * @param args
     */
    public static void main(String[] args)
    {
        // ******************************************************************
        // Using CyclicBarrier to make all threads wait at a point until all
        // threads reach there
        // ******************************************************************
        barrier = new CyclicBarrier(N);
        for (int i = 0; i < N; ++i)
        {
            new Thread(new CyclicBarrierWorker()).start();
        }
        // ******************************************************************
    }

    static class CyclicBarrierWorker implements Runnable
    {
        public void run()
        {
            try
            {
                long id = Thread.currentThread().getId();
                System.out.println("I am thread " + id + " and I am waiting for my friends to arrive");
                // Do Something in the Thread
                Thread.sleep(1000 * (int) (4 * Math.random() * 10));
                // Now wait till all the threads reach this point
                barrier.await();
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
            // Now do whatever else after all threads are released
            long id1 = Thread.currentThread().getId();
            System.out.println("Thread:" + id1 + " We all got released ..hurray!!");
            System.out.println("We all got released ..hurray!!");
        }
    }

    final static int N = 4;
    static CyclicBarrier barrier = null;
}
You can copy-paste it as-is and run it in your compiler.
What I want to verify is that indeed all threads wait at this point in code:
barrier.await();
I put in some sleep time, hoping I would see the 4 "waiting" statements appear one after another in sequential fashion on the console, followed by an "outburst" of "released..hurray" statements. But I am seeing an outburst of all the statements together, no matter what I select as the sleep.
Am I missing something here?
Thanks
P.S.: Is there an online editor like http://codepad.org/F01xIhLl where I can just put in Java code and hit a button to run throwaway code? I found some, but they require some configuration before I can run any code.
The code looks fine, but it might be more enlightening to write to System.out before the sleep. Consider this in run():
long id = Thread.currentThread().getId();
System.out.println("I am thread " + id + " and I am waiting for my friends to arrive");
// Do Something in the Thread
Thread.sleep(1000*8);
On my machine, I still see a burst, but it is clear that the threads are blocked on the barrier.
If you want to avoid the first burst, use a random duration in the sleep:
Thread.sleep(1000 * (int) (8 * Math.random()));
"I put some wait and was hoping that I would see 4 statements appear one after other in a sequential fashion on the console, followed by 'outburst' of "released..hurray" statement. But I am seeing outburst of all the statements together no matter what I select as the sleep."
The behavior I'm observing is that all the threads created sleep for approximately the same amount of time. Remember that other threads can perform their work in the interim, and will therefore get scheduled; since all the created threads sleep for the same amount of time, there is very little difference between the instants at which the System.out.println calls are invoked.
Edit: The other answer's suggestion of sleeping for a random amount of time will aid in understanding the concept of a barrier better, as it guarantees (to some extent) that multiple threads arrive at the barrier at different instants of time.
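Another way to make the barrier visible (a sketch; the barrier action is standard CyclicBarrier behaviour, the message text is made up) is to construct the barrier with a barrier action, which runs exactly once per trip, after the last thread arrives and before any thread is released. In the question's main this would replace the existing construction:
barrier = new CyclicBarrier(N, new Runnable() {
    public void run() {
        // Runs exactly once, after the Nth await() and before any thread is released.
        // Seeing this line printed exactly once, before any "released" messages,
        // verifies that all N threads were held at the barrier.
        System.out.println("All " + N + " threads have arrived at the barrier");
    }
});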

Code inside thread slower than outside thread..?

I'm trying to alter some code so it can work with multithreading. I stumbled upon a performance loss when putting a Runnable around some code.
For clarification: The original code, let's call it
//doSomething
got a Runnable around it like this:
Runnable r = new Runnable()
{
    public void run()
    {
        //doSomething
    }
};
Then I submit the Runnable to a CachedThreadPool ExecutorService. This is my first step towards multithreading this code, to see if the code runs as fast with one thread as the original code does.
However, this is not the case. Where //doSomething executes in about 2 seconds, the Runnable executes in about 2.5 seconds. I need to mention that some other code, say, //doSomethingElse, inside a Runnable had no performance loss compared to the original //doSomethingElse.
My guess is that //doSomething contains some operations that are not as fast when running in a thread, but I don't know what they could be, or how //doSomething differs from //doSomethingElse in that respect.
Could it be the use of final int[]/float[] arrays that makes a Runnable so much slower? The //doSomethingElse code also used some finals, but //doSomething uses more. This is the only thing I could think of.
Unfortunately, the //doSomething code is quite long and out of context, but I will post it here anyway. For those who know the mean shift segmentation algorithm, this is the part of the code where the mean shift vector is calculated for each pixel. The for-loop
for(int i=0; i<L; i++)
runs through each pixel.
timer.start(); // this is where I start the timer

// Initialize mode table used for basin of attraction
char[] modeTable = new char[L]; // (L is a class property and is about 100,000)
Arrays.fill(modeTable, (char) 0);
int[] pointList = new int[L];

// Allocate memory for yk (current vector)
double[] yk = new double[lN]; // (lN is a final int, defined earlier)
// Allocate memory for Mh (mean shift vector)
double[] Mh = new double[lN];

int idxs2 = 0; int idxd2 = 0;

for (int i = 0; i < L; i++) {
    // if a mode was already assigned to this data point
    // then skip this point, otherwise proceed to
    // find its mode by applying mean shift...
    if (modeTable[i] == 1) {
        continue;
    }
    // initialize point list...
    int pointCount = 0;
    // Assign window center (window centers are
    // initialized by createLattice to be the point
    // data[i])
    idxs2 = i * lN;
    for (int j = 0; j < lN; j++)
        yk[j] = sdata[idxs2 + j]; // (sdata is an earlier defined final float[] of about 100,000 items)
    // Calculate the mean shift vector using the lattice
    /*****************************************************/
    // Initialize mean shift vector
    for (int j = 0; j < lN; j++) {
        Mh[j] = 0;
    }
    double wsuml = 0;
    double weight;
    // find bucket of yk
    int cBucket1 = (int) yk[0] + 1;
    int cBucket2 = (int) yk[1] + 1;
    int cBucket3 = (int) (yk[2] - sMinsFinal) + 1;
    int cBucket = cBucket1 + nBuck1 * (cBucket2 + nBuck2 * cBucket3);
    for (int j = 0; j < 27; j++) {
        idxd2 = buckets[cBucket + bucNeigh[j]]; // (buckets is a final int[] of about 75,000 items)
        // list parse, crt point is cHeadList
        while (idxd2 >= 0) {
            idxs2 = lN * idxd2;
            // determine if inside search window
            double el = sdata[idxs2 + 0] - yk[0];
            double diff = el * el;
            el = sdata[idxs2 + 1] - yk[1];
            diff += el * el;
            //...
            idxd2 = slist[idxd2]; // (slist is a final int[] of about 100,000 items)
        }
    }
    //...
}
timer.end(); // this is where I stop the timer.
There is more code, but the last while loop was where I first noticed the difference in performance.
Could anyone think of a reason why this code runs slower inside a Runnable than original?
Thanks.
Edit: The measured time is inside the code, so it excludes thread startup.
All code always runs "inside a thread".
The slowdown you see is most likely caused by the overhead that multithreading adds. Try parallelizing different parts of your code - the tasks should neither be too large, nor too small. For example, you'd probably be better off running each of the outer loops as a separate task, rather than the innermost loops.
There is no single correct way to split up tasks, though, it all depends on how the data looks and what the target machine looks like (2 cores, 8 cores, 512 cores?).
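For instance, a rough sketch of splitting the question's outer pixel loop into chunks (the chunk and pool sizes are assumptions, and the per-iteration scratch arrays yk and Mh would have to become task-local for this to be correct):
//Needs java.util.* and java.util.concurrent.*; iterations must be independent of each other.
int cores = Runtime.getRuntime().availableProcessors();
ExecutorService pool = Executors.newFixedThreadPool(cores);
int chunk = (L + cores - 1) / cores;
List<Future<?>> parts = new ArrayList<>();
for (int t = 0; t < cores; t++) {
    final int from = t * chunk;
    final int to = Math.min(L, from + chunk);
    parts.add(pool.submit(() -> {
        for (int i = from; i < to; i++) {
            //... body of the outer loop for pixel i, using task-local buffers ...
        }
    }));
}
for (Future<?> f : parts) {
    try {
        f.get(); //wait for this chunk; also surfaces any exception thrown in the task
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException(e);
    }
}
pool.shutdown();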
Edit: What happens if you run the test repeatedly? E.g., if you do it like this:
Executor executor = ...;
for (int i = 0; i < 10; i++) {
    final int lap = i;
    Runnable r = new Runnable() {
        public void run() {
            long start = System.currentTimeMillis();
            //doSomething
            long duration = System.currentTimeMillis() - start;
            System.out.printf("Lap %d: %d ms%n", lap, duration);
        }
    };
    executor.execute(r);
}
Do you notice any difference in the results?
I personally do not see any reason for this. Any program has at least one thread, and all threads are equal; they are created by default with medium priority (5). So the code should show the same performance in the main application thread as in any other thread that you open.
Are you sure you are measuring the time of "do something" and not the overall time that your program runs? I believe you may be measuring the time of the operation together with the time required to create and start the thread.
When you create a new thread you always have an overhead. If you have a small piece of code, you may experience a performance loss.
Once you have more code (bigger tasks), you may get a performance improvement from your parallelization (the code on the thread will not necessarily run faster, but you are doing two things at once).
Just a detail: deciding how small a task can be while still being worth parallelizing is a well-known topic in parallel computation :)
You haven't explained exactly how you are measuring the time taken. Clearly there are thread start-up costs, but I infer that you are using some mechanism that ensures these costs don't distort your picture.
Generally speaking, when measuring performance it's easy to get misled when measuring small pieces of work. I would be looking to get a run at least 1,000 times longer, putting the whole thing in a loop or whatever.
Here the one difference between the "no thread" and "threaded" cases is that you have gone from one thread (as has been pointed out, you always have a thread) to two threads, so now the JVM has to mediate between them. For this kind of work I can't see why that should make a difference, but it is a difference.
I would want to use a good profiling tool to really dig into this.
