I am learning threading in Java in order to make some programs run in parallel. Designing programs with parallelism is something I never had a chance to learn back in my school programming classes. I know how to create threads and make them run, but I have no idea how to use them efficiently. After all, I know it is not actually using threads that makes a program fast, but a good parallel design. So I did some experiments to test my knowledge. However, my parallelized version actually runs slower than an unparallelized one. I am starting to doubt whether I really get the idea. If you could be so kind, would you mind having a look at my following program?
I made a program to fill an array in a divide-and-conquer fashion (I know Java has an Arrays.fill utility, but I just want to test my knowledge of multithreading):
public class ParallelFill
{
private static void fill(final double [] array,
final double value,
final int start,
final int size)
{
if (size > 1000)
{ // Each thread handles at most 1000 elements
Runnable task = new Runnable() { // Fork the task
public void run() {
fill(array, value, start, 1000); // Fill the first 1000 elements
}};
// Create the thread
Thread fork = new Thread(task);
fork.start();
// Fill the rest of the array
fill(array, value, start+1000, size-1000);
// Join the task
try {
fork.join();
}
catch (InterruptedException except)
{
System.err.println(except);
}
}
else
{ // The array is small enough, fill it via a normal loop
for (int i = start; i < start + size; ++i)
array[i] = value;
}
} // fill
public static void main(String [] args)
{
double [] bigArray = new double[1000*1000];
double value = 3;
fill(bigArray, value, 0, bigArray.length);
}
}
I tested this program, but it turns out to be even slower than just doing something like:
for (int i = 0; i < bigArray.length; ++i)
bigArray[i] = value;
My guess was that Java does some optimisation for filling an array using a loop, which makes it much faster than my threaded version. But beyond that, I feel more strongly that my way of handling threads/parallelism could be wrong. I have never designed anything using threads (I always relied on compiler optimisation or OpenMP in C). Could anyone help me explain why my parallelized version isn't faster? Was the program just badly designed in terms of parallelism?
Thanks,
Xing.
Unless you have multiple CPUs, or long-running tasks like I/O, I'm guessing that all you're doing is time-slicing between threads. If there's a single CPU with so much work to do, adding threads doesn't decrease the work that has to be done; all you end up doing is adding overhead due to context switching.
You ought to read "Java Concurrency In Practice". It's better to learn how to do things with the modern concurrency packages than with raw threads.
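For illustration only (and not a guaranteed speed-up, for the reasons above), here is a sketch of the same divide-and-conquer fill written against the fork/join framework in java.util.concurrent (Java 7+) instead of raw threads; the threshold and class name are my own choices, not from the question.

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinFill extends RecursiveAction {
    private static final int THRESHOLD = 1000; // below this, just loop
    private final double[] array;
    private final double value;
    private final int start;
    private final int end; // exclusive

    ForkJoinFill(double[] array, double value, int start, int end) {
        this.array = array;
        this.value = value;
        this.start = start;
        this.end = end;
    }

    @Override
    protected void compute() {
        if (end - start <= THRESHOLD) {
            for (int i = start; i < end; ++i) {
                array[i] = value;
            }
        } else {
            int mid = (start + end) >>> 1;
            // Run both halves as subtasks; invokeAll forks one, runs the other, then joins.
            invokeAll(new ForkJoinFill(array, value, start, mid),
                      new ForkJoinFill(array, value, mid, end));
        }
    }

    public static void main(String[] args) {
        double[] bigArray = new double[1000 * 1000];
        new ForkJoinPool().invoke(new ForkJoinFill(bigArray, 3, 0, bigArray.length));
    }
}

Even then, a plain array fill is mostly limited by memory bandwidth, so don't expect it to beat the simple loop by much.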
Related
I understand that reading and writing data from multiple threads needs a good locking mechanism to avoid data races. However, consider one situation: if multiple threads write a single, identical value to the same variable, can that be a problem?
For example, here is my sample code:
public class Main {
public static void main(String[] args) {
final int[] a = {1};
while(true) {
new Thread(new Runnable() {
@Override
public void run() {
a[0] = 1;
assert a[0] == 1;
}
}).start();
}
}
}
I have run this program for a long time, and it looks like everything is fine. If this code can cause a problem, how can I reproduce it?
Your test case does not cover the actual problem. You test the variable's value in the same thread, but that thread already copied the initial state of the variable, and when it changes within the thread, the change is visible to that thread, just like in any single-threaded application. The real issue with write operations is how and when the updated value is used in the other threads.
For example, if you were to write a counter, where each thread increments the number, you would run into issues. Another problem is that your test operation takes far less time than creating a thread, so the execution is pretty much linear. If you had longer code in the threads, it would be possible for multiple threads to access the variable at the same time. I wrote this test using Thread.sleep(), which is known to be unreliable (which is exactly what we need here):
final int[] a = new int[]{0};
for(int i = 0; i < 100; i++) {
final int k = i;
new Thread(new Runnable() {
@Override
public void run() {
try {
Thread.sleep(20);
} catch(InterruptedException e) {
e.printStackTrace();
}
a[0]++;
System.out.println(a[0]);
}
}).start();
}
If you execute this code, you will see how unreliable the output is. The order of the numbers changes (they are not in ascending order), and there are duplicates and missing numbers as well. This is because the variable is copied into CPU-local memory (once for each thread) and written back to the shared RAM after the operation is complete. (This does not necessarily happen right after the operation completes, to save time in case the value is needed again later.)
There also might be some other mechanics in the JVM that copy the values within the RAM for threads, but I'm unaware of them.
The thing is, even locking doesn't prevent these issues. It prevents threads from accessing the variable at the same time, but it generally doesn't make sure that the value of the variable is updated before the next thread accesses it.
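For what it's worth, here is a minimal sketch (my addition, not from the answer above) of the same counter built on java.util.concurrent.atomic.AtomicInteger: incrementAndGet makes the update atomic and its result visible across threads, so every value from 1 to 100 is printed exactly once, although the print order can still vary.

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    public static void main(String[] args) {
        final AtomicInteger a = new AtomicInteger(0);
        for (int i = 0; i < 100; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    // atomic read-modify-write: no lost updates
                    System.out.println(a.incrementAndGet());
                }
            }).start();
        }
    }
}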
I do not have much experience making multi-threaded applications but I feel like my program is at a point where it may benefit from having multiple threads. I am doing a larger scale project that involves using a classifier (as in machine learning) to classify roughly 32000 customers. I have debugged the program and discovered that it takes about a second to classify each user. So in other words this would take 8.8 hours to complete!
Is there any way that I can run 4 threads, each handling 8000 users? The first thread would handle 1-8000, the second 8001-16000, the third 16001-24000, and the fourth 24001-32000. Also, as of now, each classification is done by calling a static function from another class...
Then, when the threads other than the main one have finished, they should end. Is something like this feasible? If so, I would greatly appreciate it if someone could provide tips or steps on how to do this. I am familiar with the idea of critical sections (wait/signal) but have little experience with them.
Again, any help would be very much appreciated! Tips and suggestions on how to handle a situation like this are welcome! Not sure it matters, but I have a Core 2 Duo PC with a 2.53 GHz processor.
This is too lightweight for Apache Hadoop, which wants around 64MB chunks of data per server, but it's a perfect opportunity for Akka Actors, and Akka just happens to support Java!
http://doc.akka.io/docs/akka/2.1.4/java/untyped-actors.html
Basically, you can have 4 actors doing the work, and as each one finishes classifying a user (or, probably better, a batch of users), it either passes the result to a "receiver" actor that puts the info into a data structure or a file for output, or you can do concurrent I/O by having each actor write to its own file; the files can then be examined/combined when they're all done.
If you want to get even more fancy/powerful, you can put the actors on remote servers. It's still really easy to communicate with them, and you'd be leveraging the CPU/resources of multiple servers.
I wrote an article myself on Akka actors, but it's in Scala, so I'll spare you that. But if you google "akka actors", you'll get lots of hand-holding examples on how to use it. Be brave, dive right in and experiment. The "actor system" is such an easy concept to pick up. I know you can do it!
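Very rough sketch of the shape described above (Akka's Java API has changed across versions, so these names are from the later 2.x Java API and may differ slightly from the 2.1.4 docs linked earlier); the int[] batch message and the classify() call are placeholders for the asker's own types:

import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;

public class ClassifierWorker extends UntypedActor {
    @Override
    public void onReceive(Object message) {
        if (message instanceof int[]) {
            int[] customerIds = (int[]) message;
            // for (int id : customerIds) { classify(id); }   // the asker's classification step
            getSender().tell(customerIds.length, getSelf());  // report the batch size back
        } else {
            unhandled(message);
        }
    }
}

// Wiring it up (sketch):
//   ActorSystem system = ActorSystem.create("classification");
//   ActorRef worker = system.actorOf(Props.create(ClassifierWorker.class), "worker-1");
//   worker.tell(batchOfIds, receiverActorRef);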
Split the data up into objects that implement Runnable, then pass them to new threads.
Having more than four threads in this case won't kill you, but you cannot get more parallel work done than you have cores (as mentioned in the comments); if there are more threads than cores, the system has to decide who gets to run when.
If I had a class Customer, and I wanted to issue a thread to classify 8000 customers out of a greater collection, I might do something like this:
public class CustomerClassifier implements Runnable {
private Customer[] customers;
public CustomerClassifier(Customer[] customers) {
this.customers = customers;
}
@Override
public void run() {
for (int i = 0; i < customers.length; i++) {
classify(customers[i]); //critical that this classify function does not
//attempt to modify a resource outside this class
//unless it handles locking, or is talking to a database
//or something that won't throw fits about resource locking
}
}
}
then to issue these threads elsewhere
int jobSize = 8000;
// requires import java.util.Arrays
for (int start = 0; start < fullCustomerArray.length; start += jobSize) {
int end = Math.min(start + jobSize, fullCustomerArray.length);
Customer[] customers = Arrays.copyOfRange(fullCustomerArray, start, end);
new Thread(new CustomerClassifier(customers)).start(); // run() is invoked by the thread
}
If your classify method affects a shared resource somewhere, you will have to implement locking, which will also eat into the advantage gained to some degree.
Concurrency is extremely complicated and requires a lot of thought. I also recommend looking at the Oracle docs: http://docs.oracle.com/javase/tutorial/essential/concurrency/index.html
(I know links are bad, but hopefully the Oracle docs don't move around too much?)
Disclaimer: I am no expert in concurrent design or in multithreading (different topics).
If you split the input array into 4 equal subarrays for 4 threads, there is no guarantee that all the threads finish simultaneously. You would be better off putting all the data in a single queue and letting all the worker threads feed from that common queue. Use a thread-safe BlockingQueue implementation so that you don't have to write low-level synchronize/wait/notify code, as in the sketch below.
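A minimal sketch of that shared-queue idea (the classify(id) call is a placeholder for the asker's classification step): pre-load one queue and let four worker threads drain it until it is empty.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueFeedDemo {
    public static void main(String[] args) throws InterruptedException {
        final BlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>();
        for (int id = 1; id <= 32000; id++) {
            queue.add(id); // hypothetical customer IDs
        }
        Runnable worker = new Runnable() {
            public void run() {
                Integer id;
                while ((id = queue.poll()) != null) { // poll() returns null once the queue is drained
                    // classify(id);
                }
            }
        };
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(worker);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for all workers before using the results
        }
    }
}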
Since Java 5, the java.util.concurrent package has provided handy utilities for concurrency. You might want to consider using a thread pool for a cleaner implementation:
package com.threads;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class ParalleliseArrayConsumption {
private int[] itemsToBeProcessed ;
public ParalleliseArrayConsumption(int size){
itemsToBeProcessed = new int[size];
}
/**
* @param args
*/
public static void main(String[] args) {
(new ParalleliseArrayConsumption(32)).processUsers(4);
}
public void processUsers(int numOfWorkerThreads){
ExecutorService threadPool = Executors.newFixedThreadPool(numOfWorkerThreads);
int chunk = itemsToBeProcessed.length/numOfWorkerThreads;
int start = 0;
List<Future<Void>> tasks = new ArrayList<Future<Void>>();
for(int i=0;i<numOfWorkerThreads;i++){
tasks.add(threadPool.submit(new WorkerThread(start, start+chunk)));
start = start+chunk;
}
// join all worker threads to main thread
for(Future<Void> f:tasks){
try {
f.get();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ExecutionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
threadPool.shutdown();
try {
threadPool.awaitTermination(1, TimeUnit.MINUTES); // all tasks already completed via get(), so this returns promptly
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private class WorkerThread implements Callable<Void>{
private int startIndex;
private int endIndex;
public WorkerThread(int startIndex, int endIndex){
this.startIndex = startIndex;
this.endIndex = endIndex;
}
@Override
public Void call() throws Exception {
for(int currentUserIndex = startIndex;currentUserIndex<endIndex;currentUserIndex++){
// process the user. Add your logic here
System.out.println(currentUserIndex+" is the user being processed in thread " +Thread.currentThread().getName());
}
return null;
}
}
}
I want to do a task that I've already completed except this time using multithreading. I have to read a lot of data from a file (line by line), grab some information from each line, and then add it to a Map. The file is over a million lines long so I thought it may benefit from multithreading.
I'm not sure about my approach here since I have never used multithreading in Java before.
I want to have the main method do the reading, and then give each line that has been read to another thread, which will format a String, and then give it to yet another thread that puts it into a map.
public static void main(String[] args)
{
//Some information read from file
BufferedReader br = null;
String line = null;
try {
br = new BufferedReader(new FileReader("somefile.txt"));
while((line = br.readLine()) != null) {
// Pass line to another task
}
// Here I want to get a total from B, but I'm not sure how to go about doing that
} catch (IOException e) {
e.printStackTrace();
}
}
public class Parser extends Thread
{
private Mapper m1;
// Some reference to B
public Parser (Mapper m) {
m1 = m;
}
public void parse (String s, int i) {
// Do some work on s
String key = DoSomethingWithString(s);
m1.add(key, i);
}
}
public class Mapper extends Thread
{
private SortedMap<String, Integer> sm;
private String key;
private int value;
boolean hasNewItem;
public Mapper() {
sm = new TreeMap<String, Integer>();
hasNewItem = false;
}
public void add(String s, int i) {
hasNewItem = true;
key = s;
value = i;
}
public void run() {
while (!Thread.currentThread().isInterrupted()) {
try {
if (hasNewItem) {
// Find if street name exists in map
sm.put(key, value);
hasNewItem = false;
}
Thread.sleep(1); // brief sleep so the loop doesn't spin and the catch below is reachable
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
// I'm not sure how to give the Map back to main.
}
}
I'm not sure if I am taking the right approach. I also do not know how to terminate the Mapper thread and retrieve the map in the main method. I will have multiple Mapper threads, but I have only instantiated one in the code above.
I also just realized that my Parse class is not a thread but only another class, since it does not override the run() method, so I am thinking that the Parse class should be some sort of queue.
Any ideas? Thanks.
EDIT:
Thanks for all of the replies. It seems that since I/O will be the major bottleneck, there would be little efficiency benefit from parallelizing this. However, for demonstration purposes, am I on the right track? I'm still a bit bothered by not knowing how to use multithreading.
Why do you need multiple threads? You only have one disk and it can only go so fast. Multithreading almost certainly won't help in this case, and if it does, the difference will be minimal from a user's perspective. Multithreading isn't your problem; reading from a huge file is your bottleneck.
Frequently I/O will take much longer than the in-memory tasks. We refer to such work as I/O-bound. Parallelism may have a marginal improvement at best, and can actually make things worse.
You certainly don't need a different thread to put something into a map. Unless your parsing is unusually expensive, you don't need a different thread for it either.
If you had other threads for these tasks, they might spend most of their time sitting around waiting for the next line to be read.
Even parallelizing the I/O won't necessarily help, and may hurt. Even if your CPUs support parallel threads, your hard drive might not support parallel reads.
EDIT:
All of us who commented on this assumed the task was probably I/O-bound -- because that's frequently true. However, from the comments below, this case turned out to be an exception. A better answer would have included the fourth comment below:
Measure the time it takes to read all the lines in the file without processing them. Compare to the time it takes to both read and process them. That will give you a loose upper bound on how much time you could save. This may be decreased by a new cost for thread synchronization.
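If it helps, a rough harness for that read-only baseline measurement (the file name is a placeholder):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadOnlyTiming {
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        BufferedReader br = new BufferedReader(new FileReader("somefile.txt"));
        int lines = 0;
        while (br.readLine() != null) {
            lines++; // read every line, but do no processing
        }
        br.close();
        long elapsedMs = (System.nanoTime() - start) / 1000000;
        System.out.println(lines + " lines read in " + elapsedMs + " ms");
    }
}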
You may wish to read up on Amdahl's Law. Since the majority of your work is strictly serial (the I/O), you will get negligible improvement by multi-threading the remainder, and it is certainly not worth the cost of writing watertight multi-threaded code.
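To make that concrete: Amdahl's Law gives a best-case speedup of 1 / ((1 - p) + p / n), where p is the fraction of the run that can be parallelized and n is the number of threads. If, hypothetically, 90% of the time goes to the serial I/O (so p = 0.1), then even 4 threads yield at most 1 / (0.9 + 0.1/4) ≈ 1.08x.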
Perhaps you should look for a new toy-example to parallelise.
I have a Java method that performs two computations over an input set: an estimated and an accurate answer. The estimate can always be computed cheaply and in reliable time. The accurate answer can sometimes be computed in acceptable time and sometimes not (not known a priori ... have to try and see).
What I want to set up is some framework where if the accurate answer takes too long (a fixed timeout), the pre-computed estimate is used instead. I figured I'd use a thread for this. The main complication is that the code for computing the accurate answer relies on an external library, and hence I cannot "inject" Interrupt support.
A standalone test-case for this problem is here, demonstrating my problem:
package test;
import java.util.Random;
public class InterruptableProcess {
public static final int TIMEOUT = 1000;
public static void main(String[] args){
for(int i=0; i<10; i++){
getAnswer();
}
}
public static double getAnswer(){
long b4 = System.currentTimeMillis();
// have an estimate pre-computed
double estimate = Math.random();
//try to get accurate answer
//can take a long time
//if longer than TIMEOUT, use estimate instead
AccurateAnswerThread t = new AccurateAnswerThread();
t.start();
try{
t.join(TIMEOUT);
} catch(InterruptedException ie){
;
}
if(!t.isFinished()){
System.err.println("Returning estimate: "+estimate+" in "+(System.currentTimeMillis()-b4)+" ms");
return estimate;
} else{
System.err.println("Returning accurate answer: "+t.getAccurateAnswer()+" in "+(System.currentTimeMillis()-b4)+" ms");
return t.getAccurateAnswer();
}
}
public static class AccurateAnswerThread extends Thread{
// volatile so that updates made in run() are visible to the thread calling isFinished()/getAccurateAnswer()
private volatile boolean finished = false;
private volatile double answer = -1;
public void run(){
//call to external, non-modifiable code
answer = accurateAnswer();
finished = true;
}
public boolean isFinished(){
return finished;
}
public double getAccurateAnswer(){
return answer;
}
// not modifiable, emulate an expensive call
// in practice, from an external library
private double accurateAnswer(){
Random r = new Random();
long b4 = System.currentTimeMillis();
long wait = r.nextInt(TIMEOUT*2);
//don't want to use .wait() since
//external code doesn't support interruption
while(b4+wait>System.currentTimeMillis()){
;
}
return Math.random();
}
}
}
This works fine, outputting:
Returning estimate: 0.21007465651836377 in 1002 ms
Returning estimate: 0.5303547292361411 in 1001 ms
Returning accurate answer: 0.008838428149438915 in 355 ms
Returning estimate: 0.7981717302567681 in 1001 ms
Returning estimate: 0.9207406241557682 in 1000 ms
Returning accurate answer: 0.0893839926072787 in 175 ms
Returning estimate: 0.7310211480220586 in 1000 ms
Returning accurate answer: 0.7296754467596422 in 530 ms
Returning estimate: 0.5880164300851529 in 1000 ms
Returning estimate: 0.38605296260291233 in 1000 ms
However, I have a very large input set (on the order of billions of items) to run my analysis over, and I'm uncertain how to clean up the threads that do not finish (I do not want them running in the background).
I know that various methods to destroy threads are deprecated with good reason. I also know that the typical way to stop a thread is to use interrupts. However, in this case, I don't see that I can use an interrupt since the run() method passes a single call to an external library.
How can I kill/clean-up threads in this case?
If you know enough about the external library, such as:
never acquires any locks;
never opens any files/network connections;
never involves any I/O whatsoever, not even logging;
then it may be safe to use Thread#stop on it. You could try it and do extensive stress testing. Any resource leaks should manifest themselves soon enough.
I'd try it and see if it responds to a Thread.interrupt(). Reduce your data, of course, so it doesn't run forever; if it responds to interrupt(), then you're home free. If the library locks anything, performs a wait(), or sleep()s, the code has to handle the InterruptedException, and it's possible the author did what was right. They may swallow it and continue, but it's possible they didn't.
While technically you can call Thread.stop(), you'd need to know everything about that code to be sure it's safe and that you won't leak resources. However, doing that research will clue you in to how you could easily modify the code to look for interrupt() as well. You'll pretty much have to have the source code and audit it to know for sure, which means you could just as easily do the right thing and add the interrupt() checks there, without much more research than verifying that Thread.stop() is safe.
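For reference while auditing, interrupt-responsive code typically looks something like this generic sketch (not the library's actual code): it either polls the interrupt flag or lets InterruptedException escape from blocking calls.

Runnable cooperative = new Runnable() {
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            // do one chunk of work per iteration
        }
        // exits promptly once someone calls interrupt() on this thread
    }
};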
The other option is to cause a RuntimeException in the thread. Try nulling a reference it might use, or closing some I/O it depends on (a socket, file handle, etc.), or modifying the array of data it's walking over by changing the size or nulling out the data. If you can make it throw an exception that is not handled, the thread will shut down.
Extending on the answer by chubbsondubs, if the third-party library uses some well-defined API (such as java.util.List or some library-specific API) to access the input data set, you could wrap the input data set that you pass to the third-party code with a wrapper class that will throw exceptions, e.g. in the List.get method, after a cancel flag is set.
For instance, if you pass a List to your third-party library, then it might be possible to do something along the lines of:
class CancelList<T> implements List<T> {
private final List<T> wrappedList;
private volatile boolean canceled = false;
public CancelList(List<T> wrapped) { this.wrappedList = wrapped; }
public void cancel() { this.canceled = true; }
public T get(int index) {
if (canceled) { throw new RuntimeException("Canceled!"); }
return wrappedList.get(index);
}
// Other List method implementations here...
}
public double getAnswer(List<MyType> inputList) {
CancelList<MyType> cancelList = new CancelList<MyType>(inputList);
AccurateAnswerThread t = new AccurateAnswerThread(cancelList);
t.start();
try{
t.join(TIMEOUT);
} catch(InterruptedException ie){
}
if (t.isAlive()) {
cancelList.cancel(); // timed out: any further List access in the worker will now throw
}
// Get the result of your calculation here...
}
Of course, this approach depends on a few things:
You must know the third-party code well-enough to know what methods it calls that you can control through input parameters.
The third-party code would need to make frequent calls to these methods throughout the computation process (i.e. it won't work if it copies all the data at once into an internal structure and does its computation there).
Obviously this won't work if the library catches and handles runtime exceptions and continues processing.
I am new to multi-threading and I have to write a program using multiple threads to increase its efficiency. At my first attempt, what I wrote produced just the opposite result. Here is what I have written:
class ThreadImpl implements Callable<ArrayList<Integer>> {
//Bloom filter instance for one of the table
BloomFilter<Integer> bloomFilterInstance = null;
// Data member for complete data access.
ArrayList< ArrayList<UserBean> > data = null;
// Store the result of the testing
ArrayList<Integer> result = null;
int tableNo;
public ThreadImpl(BloomFilter<Integer> bloomFilterInstance,
ArrayList< ArrayList<UserBean> > data, int tableNo) {
this.bloomFilterInstance = bloomFilterInstance;
this.data = data;
result = new ArrayList<Integer>(this.data.size());
this.tableNo = tableNo;
}
public ArrayList<Integer> call() {
int[] tempResult = new int[this.data.size()];
for(int i=0; i<data.size() ;++i) {
tempResult[i] = 0;
}
ArrayList<UserBean> chkDataSet = null;
for(int i=0; i<this.data.size(); ++i) {
if(i==tableNo) {
//do nothing;
} else {
chkDataSet = new ArrayList<UserBean> (data.get(i));
for(UserBean toChk: chkDataSet) {
if(bloomFilterInstance.contains(toChk.getUserId())) {
++tempResult[i];
}
}
}
this.result.add(new Integer(tempResult[i]));
}
return result;
}
}
In the above class there are two data members, data and bloomFilterInstance, and they (the references) are passed in from the main program. So there is actually only one instance of data and of bloomFilterInstance, and all the threads access them simultaneously.
The class that launches the threads is (a few irrelevant details have been left out, so you can assume all variables etc. to be declared):
class MultithreadedVrsion {
public static void main(String[] args) {
if(args.length > 1) {
ExecutorService es = Executors.newFixedThreadPool(noOfTables);
List<Callable<ArrayList<Integer>>> threadedBloom = new ArrayList<Callable<ArrayList<Integer>>>(noOfTables);
for (int i=0; i<noOfTables; ++i) {
threadedBloom.add(new ThreadImpl(eval.bloomFilter.get(i),
eval.data, i));
}
try {
List<Future<ArrayList<Integer>>> answers = es.invokeAll(threadedBloom);
long endTime = System.currentTimeMillis();
System.out.println("using more than one thread for bloom filters: " + (endTime - startTime) + " milliseconds");
System.out.println("**Printing the results**");
for(Future<ArrayList<Integer>> element: answers) {
ArrayList<Integer> arrInt = element.get();
for(Integer i: arrInt) {
System.out.print(i.intValue());
System.out.print("\t");
}
System.out.println("");
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
I did the profiling with JProfiler, and here is a snapshot of the CPU threads view: http://tinypic.com/r/wh1v8p/6 (red shows blocked, green runnable, and yellow waiting). The problem is that the threads are running one at a time, and I do not know why.
Note: I know that this is not thread-safe, but I know that I will only be doing read operations throughout for now; I just want to analyse the raw performance gain that can be achieved, and later I will implement a better version.
Can anyone please tell me what I have missed?
One possibility is that the cost of creating threads is swamping any possible performance gains from doing the computations in parallel. We can't really tell if this is a real possibility because you haven't included the relevant code in the question.
Another possibility is that you only have one processor / core available. Threads only run when there is a processor to run them. So your expectation of a linear speedup with the number of threads can only (in theory) be achieved if there is a free processor for each thread.
Finally, there could be memory contention due to the threads all attempting to access a shared array. If you had proper synchronization, that would potentially add further contention. (Note: I haven't tried to understand the algorithm to figure out if contention is likely in your example.)
My initial advice would be to profile your code, and see if that offers any insights.
And take a look at the way you are measuring performance to make sure that you aren't just seeing some benchmarking artefact; e.g. JVM warmup effects.
That process looks CPU-bound (no I/O, database calls, network calls, etc.). I can think of two explanations:
How many CPUs does your machine have? How many is Java allowed to use? - if the threads are competing for the same CPU, you've added coordination work and placed more demand on the same resource.
How long does the whole method take to run? For very short times, the additional work of context switching between threads could overpower the actual work. The way to deal with this is to make the job longer. Also, run it many times in a loop, not counting the first few iterations (like a warm-up; they aren't representative.)
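For example, a hypothetical shape for that measurement, with doWork() standing in for the whole multi-threaded job being timed:

for (int run = 0; run < 10; run++) {
    long start = System.currentTimeMillis();
    doWork(); // the job under test (placeholder)
    long elapsed = System.currentTimeMillis() - start;
    if (run >= 3) { // discard the first few runs as JVM warm-up
        System.out.println("run " + run + ": " + elapsed + " ms");
    }
}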
Several possibilities come to mind:
There is some synchronization going on inside bloomFilterInstance's implementation (which is not given).
There is a lot of memory allocation going on, e.g., what appears to be an unnecessary copy of an ArrayList when chkDataSet is created, and the use of new Integer instead of Integer.valueOf. You may be running into overhead costs for memory allocation (see the sketch below).
You may be CPU-bound (if bloomFilterInstance#contains is expensive) and threads are simply blocking for CPU instead of executing.
A profiler may help reveal the actual problem.
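To illustrate the second point, here is the inner loop from the question with those two allocation points removed (assuming, as the question states, that the threads only read the shared lists):

for (int i = 0; i < data.size(); ++i) {
    int hits = 0;
    if (i != tableNo) {
        for (UserBean toChk : data.get(i)) { // iterate the shared list directly, no defensive copy
            if (bloomFilterInstance.contains(toChk.getUserId())) {
                ++hits;
            }
        }
    }
    result.add(Integer.valueOf(hits)); // avoids new Integer(...)
}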