Suppose I have a List of integers. Each int I have must be multiplied by 100. To do this with a for loop I'd construct something like the following:
for (int j = 0; j < numbers.size(); j++) {
    numbers.set(j, numbers.get(j) * 100);
}
But suppose for performance reasons I wanted to simultaneously spawn a thread for each number in numbers and perform a single multiplication on each thread returning the result to the same List. What would be the best way of doing such a thing?
My actual problem isn't as trivial as multiplying ints; rather, it's a task where each iteration of the loop takes a substantial amount of time, so I'd like to run them all at the same time in order to decrease execution time.
If you can use Java 7, the Fork/Join framework was created for precisely this problem. If not, the JSR 166 (fork/join proposal) source code is also available as a standalone download.
Essentially, you would create a task for each step (in your case, for each index in the array) and submit it to a service that can pool threads (the fork part). Then you wait for everything to complete and merge the results (the join part).
The reason to use a service, as opposed to launching your own threads, is that creating threads has overhead, and in some cases you may want to limit the number of threads. For example, on a four-CPU machine it wouldn't make much sense to have more than four threads running concurrently.
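The fork and join steps described above can be sketched for the original multiply-by-100 example. This is only an illustration: the class name and the threshold are my own, and a real threshold would be much larger.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Sketch: multiply every element of an array by 100 with Fork/Join.
public class MultiplyTask extends RecursiveAction {
    private static final int THRESHOLD = 2; // tiny on purpose for the demo
    private final int[] data;
    private final int from, to; // half-open range [from, to)

    MultiplyTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            // Small chunk: do the work directly.
            for (int i = from; i < to; i++) data[i] *= 100;
        } else {
            // Fork both halves and wait for them (the join part).
            int mid = (from + to) / 2;
            invokeAll(new MultiplyTask(data, from, mid),
                      new MultiplyTask(data, mid, to));
        }
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 5};
        new ForkJoinPool().invoke(new MultiplyTask(numbers, 0, numbers.length));
        System.out.println(Arrays.toString(numbers)); // [100, 200, 300, 400, 500]
    }
}
```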
If your tasks are independent of each other, you can use the Executors framework.
Note that you will gain the most speed if you create no more threads than you have CPU cores at your disposal.
Sample:
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkInstance {
    final int argument;
    final int result;

    WorkInstance(int argument, int result) {
        this.argument = argument;
        this.result = result;
    }

    public String toString() {
        return "WorkInstance{" +
                "argument=" + argument +
                ", result=" + result +
                '}';
    }
}

public class Main {
    public static void main(String[] args) throws IOException, ExecutionException, InterruptedException {
        int numOfCores = 4;
        final ExecutorService executor = Executors.newFixedThreadPool(numOfCores);
        List<Integer> toMultiplyBy100 = Arrays.asList(1, 3, 19);
        List<Future<WorkInstance>> tasks = new ArrayList<Future<WorkInstance>>(toMultiplyBy100.size());
        for (final Integer workInstance : toMultiplyBy100)
            tasks.add(executor.submit(new Callable<WorkInstance>() {
                public WorkInstance call() throws Exception {
                    return new WorkInstance(workInstance, workInstance * 100);
                }
            }));
        for (Future<WorkInstance> result : tasks)
            System.out.println("Result: " + result.get());
        executor.shutdown();
    }
}
Spawning a new thread for each number in numbers is not a good idea. However, using a fixed thread pool with a size matching the number of cores/CPUs might increase your performance slightly.
The quick and dirty way to get started is to use a thread pool, such as one returned by Executors.newCachedThreadPool(). Then create tasks that implement Runnable and submit() them to your thread pool. Also read up on the classes and interfaces linked by those Javadocs, lots of cool stuff you can try.
See the concurrency chapter in Effective Java, 2nd ed for a great introduction to multithreaded Java.
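A minimal sketch of that quick-and-dirty setup; the task bodies here are placeholders:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class QuickStart {
    public static void main(String[] args) throws InterruptedException {
        // Cached pool: reuses idle threads, creates new ones on demand.
        ExecutorService pool = Executors.newCachedThreadPool();
        for (int i = 0; i < 5; i++) {
            final int id = i;
            // Each Runnable is one unit of work; submit() queues it on the pool.
            pool.submit(() -> System.out.println(
                    "task " + id + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();                               // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES);    // wait for submitted tasks
    }
}
```

Note that the tasks may print in any order, since they run concurrently.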
Take a look at ThreadPoolExecutor and create a task for each iteration. A prerequisite is that those tasks are independent, though.
The use of a thread pool allows you to create a task per iteration but only run as many concurrently as there are threads. You want to limit the number of threads, for example to the number of cores or hardware threads available: creating a whole lot of threads would be counterproductive, since they would require a lot of context switching, which hurts performance.
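Rather than hard-coding the core count, the pool size can be derived from the hardware at runtime; a small sketch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        // Ask the JVM how many hardware threads are available
        // instead of hard-coding a number.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        System.out.println("Pool sized to " + cores + " threads");
        pool.shutdown();
    }
}
```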
I assume you are on a commodity PC. You will have at most N threads executing at the same time on your machine, where N is the number of cores in your CPUs, so most likely in the [1, 4] range. Plus, there is the contention on the shared list.
But even more importantly, the cost of spawning a new thread is much greater than the cost of doing a multiplication. One could have a thread pool... but in this specific case, it's not even worth talking about it. Really.
If it is the only application on a node, you should determine the number of threads that will finish the job most quickly (max_throughput). This depends on the processor you use and on how much the JIT can optimize your code, so there is no general advice: measure.
After that, you could distribute the jobs to a pool of worker threads by job number modulo max_throughput.
I have a String and ThreadPoolExecutor that changes the value of this String. Just check out my sample:
static String str_example = ""; // must be a field: an anonymous class cannot assign to a local variable

ThreadPoolExecutor poolExecutor = new ThreadPoolExecutor(10, 30, 10L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>());
for (int i = 0; i < 80; i++) {
    poolExecutor.submit(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep((long) (Math.random() * 1000));
                String temp = str_example + "1";
                str_example = temp;
                System.out.println(str_example);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    });
}
So after executing this, I get something like this:
1
11
111
1111
11111
.......
So the question is: I would expect a result like this only if my String object had the volatile modifier, but I get the same result both with the modifier and without it.
There are several reasons why you see "correct" execution.
First, CPU designers do as much as they can so that our programs run correctly even in the presence of data races. Cache coherence deals with cache lines and tries to minimize possible conflicts. For example, only one CPU can write to a cache line at any point in time; after the write is done, other CPUs must request that cache line before they can write to it. On top of that, the x86 architecture (which you are most likely using) is very strict compared to others.
Second, your program is slow and the threads sleep for random periods of time, so they do almost all of their work at different points in time.
How can you achieve inconsistent behavior? Try something with a for loop without any sleep. In that case the field value will most probably be cached in CPU registers and some updates will not be visible.
P.S. Updates of the field str_example are not atomic, so your program may produce the same string values even in the presence of the volatile keyword.
When you talk about concepts like thread caching, you're talking about the properties of a hypothetical machine that Java might be implemented on. The logic is something like "Java permits an implementation to cache things, so it requires you to tell it when such things would break your program". That does not mean that any actual machine does anything of the sort. In reality, most machines you are likely to use have completely different kinds of optimizations that don't involve the kind of caches that you're thinking of.
Java requires you to use volatile precisely so that you don't have to worry about what kinds of absurdly complex optimizations the actual machine you're working on might or might not have. And that's a really good thing.
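A small, standard illustration of the guarantee volatile does provide is a stop flag: the writer's update is guaranteed to become visible to the spinning reader. (Without volatile, the JIT may hoist the read and the loop might never terminate.)

```java
public class StopFlag {
    // volatile guarantees the writer's update becomes visible to the reader.
    static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // spin until the flag's new value becomes visible
            }
            System.out.println("stopped");
        });
        worker.start();
        Thread.sleep(100);
        stop = true;    // without volatile, this write might never be observed
        worker.join();  // with volatile, this join is guaranteed to return
    }
}
```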
Your code is unlikely to exhibit concurrency bugs because it executes with very low concurrency. You have 10 threads, each of which sleep on average 500 ms before doing a string concatenation. As a rough guess, String concatenation takes about 1ns per character, and because your string is only 80 characters long, this would mean that each thread spends about 80 out of 500000000 ns executing. The chance of two or more threads running at the same time is therefore vanishingly small.
If we change your program so that several threads are running concurrently all the time, we see quite different results:
static String s = "";

public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(5);
    for (int i = 0; i < 10_000; i++) {
        executor.submit(() -> {
            s += "1";
        });
    }
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.MINUTES);
    System.out.println(s.length());
}
In the absence of data races, this should print 10000. On my computer, this prints about 4200, meaning over half the updates have been lost in the data race.
What if we declare s volatile? Interestingly, we still get about 4200 as a result, so data races were not prevented. That makes sense, because volatile ensures that writes are visible to other threads, but does not prevent intermediary updates, i.e. what happens is something like:
Thread 1 reads s and starts making a new String
Thread 2 reads s and starts making a new String
Thread 1 stores its result in s
Thread 2 stores its result in s, overwriting the previous result
To prevent this, you can use a plain old synchronized block:
executor.submit(() -> {
    synchronized (Test.class) {
        s += "1";
    }
});
And indeed, this returns 10000, as expected.
It is working because you are using Thread.sleep((long) (Math.random() * 1000)); every thread has a different sleep time, so they may execute one by one while all the other threads are sleeping or have completed. But even though your code appears to work, it is not thread-safe. Even using volatile will not make your code thread-safe: volatile only ensures visibility, i.e., when one thread makes a change, other threads are able to see it.
In your case the operation is a multi-step process: reading the variable, updating it, and writing it back to memory. So you need a locking mechanism to make it thread-safe.
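If locking feels heavyweight, one alternative is to make the whole read-modify-write a single atomic step with AtomicReference. This is a sketch with my own class name, scaled down to 1,000 tasks:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class AtomicAppend {
    // updateAndGet retries a compare-and-set until the update wins,
    // so no appended "1" is ever lost.
    static final AtomicReference<String> s = new AtomicReference<>("");

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 1000; i++) {
            executor.submit(() -> s.updateAndGet(old -> old + "1"));
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(s.get().length()); // prints 1000: no lost updates
    }
}
```

Note that under heavy contention this retries a lot; for string building, a synchronized block is usually the simpler choice.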
In Python, using two threads for a simple counter program (as demonstrated below) is slower than the same program with a single thread. The reason given for this is the mechanism behind the Global Interpreter Lock.
I tested the same thing in Java to see the performance. Here again, I see that a single thread outperforms the two-threaded version by a significant margin. Why is that?
Here is the code:
public class ThreadTiming {
    static void threadMessage(String message) {
        String threadName = Thread.currentThread().getName();
        System.out.format("%s: %s%n", threadName, message);
    }

    private static class Counter implements Runnable {
        private int count = 500000000;

        @Override
        public void run() {
            while (count > 0) {
                count--;
            }
            threadMessage("done processing");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(new Counter());
        Thread t2 = new Thread(new Counter());
        long startTime = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        long endTime = System.currentTimeMillis();
        System.out.println("Time taken by two threads " + (endTime - startTime) / 1000.0);

        startTime = System.currentTimeMillis();
        calculate(2 * 500000000);
        endTime = System.currentTimeMillis();
        System.out.println("Time taken by single thread " + (endTime - startTime) / 1000.0);
    }

    public static void calculate(int x) {
        while (x > 0) {
            x--;
        }
        threadMessage("Done processing");
    }
}
Output:
Thread-1: done processing
Thread-2: done processing
Time taken by two threads 0.052
main: Done processing
Time taken by single thread 0.0010
Very simple: the single-threaded version uses a local variable, which HotSpot has no problem proving never leaves its scope, hence the whole function is reduced to a no-op.
On the other hand, proving that the instance variable never leaves scope (hello, reflection!) is much harder, and evidently HotSpot cannot do it here, hence the loop isn't removed.
On a general note, benchmarking is hard (I count at least three other mistakes that could lead to "wrong" results) and requires tons of knowledge. You are better off using JMH (the Java Microbenchmark Harness), which takes care of most of these things.
The basic answer is that you have code the optimiser can eliminate, and you are timing how long it takes to detect this. You are also adding the time it takes to start and stop two threads, which could be more than half of this time.
The second test doesn't start a new thread, it uses the current one so you just need to wait for it to detect the loop doesn't do anything.
For example, you have timed that a single thread can do 1 billion loops in 1 ms. If you have a 3.33 GHz processor, it would have to do 300 iterations in a single clock cycle. If this sounds too good to be true, that is because it is. ;)
@Voo seems to be generally right, as you can see by moving ThreadTiming.Counter.count to be a local variable of ThreadTiming.Counter.run(). That eliminates any possibility of non-local references, and the resulting program exhibits a much smaller single-thread vs. dual-thread performance difference.
HOWEVER, that doesn't eliminate all the difference. The timing reported for the dual-thread case is still worse by about a factor of 9 for me. But if I then swap so that the single-threaded case is measured first, the two-thread case wins by about a factor of 2.
But that, too, is illusory, because the two tests are running different -- albeit similar -- code. The single-thread case can easily be made to run exactly the same code as the dual thread case:
Counter c = new Counter();
c.run();
c.run();
(Using the version where count is local to run().) If that approach is used then I observe no difference in performance (at the resolution of the measurement) between single- and dual-threaded, regardless of which case is tested first.
As @Voo said, benchmarking is hard.
It just looks like the cost of loading each thread and its context into the CPU: it's thrashing. There's probably a more detailed answer waiting to strike, but let's start by posting the basics...
When running two threads, your timer is including the time taken to launch the two threads. Creating and starting threads has some overhead, and in this case, the overhead is longer than the time to actually carry out the process.
From the Stream javadoc:
Stream pipelines may execute either sequentially or in parallel. This execution mode is a property of the stream. Streams are created with an initial choice of sequential or parallel execution.
My assumptions:
There is no functional difference between sequential/parallel streams. Output is never affected by execution mode.
A parallel stream is always preferable, given appropriate number of cores and problem size to justify the overhead, due to the performance gains.
We want to write code once and run anywhere without having to care about the hardware (this is Java, after all).
Assuming these assumptions are valid (nothing wrong with a bit of meta-assumption), what's the value in having the execution mode exposed in the api?
It seems like you should just be able to declare a Stream, and the choice of sequential/parallel execution should be handled automagically in a layer below, either by library code or the JVM itself as a function of the cores available at runtime, the size of the problem, etc.
Sure, assuming parallel streams also work on a single core machine, perhaps just always using a parallel stream achieves this. But this is really ugly - why have explicit references to parallel streams in my code when it's the default option?
Even if there is a scenario where you'd deliberately want to hard code the use of a sequential stream - why is there not just a sub-interface SequentialStream for that purpose, rather than polluting Stream with an execution mode switch?
It seems like you should just be able to declare a Stream, and the choice of sequential/parallel execution should be handled automagically in a layer below, either by library code or the JVM itself as a function of the cores available at runtime, the size of the problem, etc.
The reality is that a) streams are a library, and have no special JVM magic, and b) you can't really design a library smart enough to automagically figure out what the right decision is in this particular case. There's no sensible way to estimate how costly a particular function will be without running it -- even if you could introspect its implementation, which you can't -- and now you're introducing a benchmark into every stream operation, trying to figure out if parallelizing it will be worth the cost of the parallelism overhead. That's just not practical, especially given that you don't know in advance how bad the parallelism overhead is, either.
A parallel stream is always preferable, given appropriate number of cores and problem size to justify the overhead, due to the performance gains.
Not always, in practice. Some tasks are just so small that they're not worth parallelizing, and parallelism does always have some overhead. (And frankly, most programmers tend to overestimate the usefulness of parallelism, slapping it everywhere when it's really hurting performance.)
Basically, it's a hard enough problem that you have to shove it off onto the programmer.
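A minimal illustration of that explicit switch: the same pipeline, with the programmer choosing the mode, and the result identical either way.

```java
import java.util.stream.IntStream;

public class StreamModes {
    public static void main(String[] args) {
        // Identical pipeline; only the chosen execution mode differs.
        int sequential = IntStream.rangeClosed(1, 1000).sum();
        int parallel   = IntStream.rangeClosed(1, 1000).parallel().sum();
        // sum() is associative, so the output is the same in both modes.
        System.out.println(sequential + " " + parallel); // 500500 500500
    }
}
```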
There's an interesting case in this question showing that a parallel stream can sometimes be slower by orders of magnitude. In that particular example, the parallel version runs for ten minutes while the sequential one takes several seconds.
There is no functional difference between sequential/parallel streams. Output is never affected by execution mode.
There is a difference between sequential and parallel stream execution. In the code below, the TEST_2 results show that parallel execution is much faster than the sequential approach.
A parallel stream is always preferable, given appropriate number of cores and problem size to justify the overhead, due to the performance gains.
Not really. If a task is not worth executing in parallel threads (a simple task), then we are simply adding overhead to our code. The TEST_1 results show this. Also note that if all the worker threads are busy on one parallel task, then other parallel stream operations elsewhere in your code will be waiting for it.
We want to write code once and run anywhere without having to care about the hardware (this is Java, after all).
Only the programmer knows whether a given task is worth executing in parallel or sequentially, irrespective of the CPUs, so the Java API exposes both options to the developer.
import java.util.ArrayList;
import java.util.List;
/*
* Performance test over internal(parallel/sequential) and external iterations.
* https://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html
*
*
* Parallel computing involves dividing a problem into subproblems,
* solving those problems simultaneously (in parallel, with each subproblem running in a separate thread),
* and then combining the results of the solutions to the subproblems. Java SE provides the fork/join framework,
* which enables you to more easily implement parallel computing in your applications. However, with this framework,
* you must specify how the problems are subdivided (partitioned).
* With aggregate operations, the Java runtime performs this partitioning and combining of solutions for you.
*
* Limit the parallelism that the ForkJoinPool offers you. You can do it yourself by supplying the -Djava.util.concurrent.ForkJoinPool.common.parallelism=1,
* so that the pool size is limited to one and no gain from parallelization
*
* @see ForkJoinPool
* https://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
*
* ForkJoinPool, that pool creates a fixed number of threads (default: number of cores) and
* will never create more threads (unless the application indicates a need for those by using managedBlock).
* * http://stackoverflow.com/questions/10797568/what-determines-the-number-of-threads-a-java-forkjoinpool-creates
*
*/
public class IterationThroughStream {
private static boolean found = false;
private static List<Integer> smallListOfNumbers = null;
public static void main(String[] args) throws InterruptedException {
// TEST_1
List<String> bigListOfStrings = new ArrayList<String>();
for(Long i = 1l; i <= 1000000l; i++) {
bigListOfStrings.add("Counter no: "+ i);
}
System.out.println("Test Start");
System.out.println("-----------");
long startExternalIteration = System.currentTimeMillis();
externalIteration(bigListOfStrings);
long endExternalIteration = System.currentTimeMillis();
System.out.println("Time taken for externalIteration(bigListOfStrings) is :" + (endExternalIteration - startExternalIteration) + " , and the result found: "+ found);
long startInternalIteration = System.currentTimeMillis();
internalIteration(bigListOfStrings);
long endInternalIteration = System.currentTimeMillis();
System.out.println("Time taken for internalIteration(bigListOfStrings) is :" + (endInternalIteration - startInternalIteration) + " , and the result found: "+ found);
// TEST_2
smallListOfNumbers = new ArrayList<Integer>();
for(int i = 1; i <= 10; i++) {
smallListOfNumbers.add(i);
}
long startExternalIteration1 = System.currentTimeMillis();
externalIterationOnSleep(smallListOfNumbers);
long endExternalIteration1 = System.currentTimeMillis();
System.out.println("Time taken for externalIterationOnSleep(smallListOfNumbers) is :" + (endExternalIteration1 - startExternalIteration1));
long startInternalIteration1 = System.currentTimeMillis();
internalIterationOnSleep(smallListOfNumbers);
long endInternalIteration1 = System.currentTimeMillis();
System.out.println("Time taken for internalIterationOnSleep(smallListOfNumbers) is :" + (endInternalIteration1 - startInternalIteration1));
// TEST_3
Thread t1 = new Thread(IterationThroughStream :: internalIterationOnThread);
Thread t2 = new Thread(IterationThroughStream :: internalIterationOnThread);
Thread t3 = new Thread(IterationThroughStream :: internalIterationOnThread);
Thread t4 = new Thread(IterationThroughStream :: internalIterationOnThread);
t1.start();
t2.start();
t3.start();
t4.start();
Thread.sleep(30000);
}
private static boolean externalIteration(List<String> bigListOfStrings) {
found = false;
for(String s : bigListOfStrings) {
if(s.equals("Counter no: 1000000")) {
found = true;
}
}
return found;
}
private static boolean internalIteration(List<String> bigListOfStrings) {
found = false;
bigListOfStrings.parallelStream().forEach(
(String s) -> {
if(s.equals("Counter no: 1000000")){ //Have a breakpoint to look how many threads are spawned.
found = true;
}
}
);
return found;
}
private static boolean externalIterationOnSleep(List<Integer> smallListOfNumbers) {
found = false;
for(Integer s : smallListOfNumbers) {
try {
Thread.sleep(100);
} catch (Exception e) {
e.printStackTrace();
}
}
return found;
}
private static boolean internalIterationOnSleep(List<Integer> smallListOfNumbers) {
found = false;
smallListOfNumbers.parallelStream().forEach( //Removing parallelStream() will behave as single threaded (sequential access).
(Integer s) -> {
try {
Thread.sleep(100); //Have a breakpoint to look how many threads are spawned.
} catch (Exception e) {
e.printStackTrace();
}
}
);
return found;
}
public static void internalIterationOnThread() {
smallListOfNumbers.parallelStream().forEach(
(Integer s) -> {
try {
/*
* DANGEROUS
* This will tell you that if all the 7 FJP(Fork join pool) worker threads are blocked for one single thread (e.g. t1),
* then other normal three(t2 - t4) thread wont execute, will wait for FJP worker threads.
*/
Thread.sleep(100); //Have a breakpoint here.
} catch (Exception e) {
e.printStackTrace();
}
}
);
}
}
It seems like you should just be able to declare a Stream, and the choice of sequential/parallel execution should be handled automagically in a layer below, either by library code or the JVM itself as a function of the cores available at runtime, the size of the problem, etc.
To add to the already given answers:
That's a pretty bold assumption. Imagine simulating a board game for training some form of AI: it's pretty easy to parallelize the execution of different playthroughs; just create a new instance and let it run on its own thread. As it doesn't share any state with another playthrough, you don't even have to consider multi-threading issues in your game logic. If, on the other hand, you parallelize the game logic itself, you get all sorts of multi-threading issues and most likely pay a steep price in complexity and even performance.
Having control over the behaviour of streams gives you (appropriately limited) flexibility which in and of itself is a key feature for good library design.
Our application requires all worker threads to synchronize at a defined point. For this we use a CyclicBarrier, but it does not seem to scale well. With more than eight threads, the synchronization overhead seems to outweigh the benefits of multithreading. (However, I cannot support this with measurement data.)
EDIT: Synchronization happens very frequently, in the order of 100k to 1M times.
If synchronization of many threads is "hard", would it help to build a synchronization tree? Thread 1 waits for 2 and 3, which in turn wait for 4+5 and 6+7, respectively, etc.; after finishing, threads 2 and 3 wait for thread 1, threads 4 and 5 wait for thread 2, and so on.
1
| \
2 3
|\ |\
4 5 6 7
Would such a setup reduce synchronization overhead? I'd appreciate any advice.
See also this featured question: What is the fastest cyclic synchronization in Java (ExecutorService vs. CyclicBarrier vs. X)?
With more than eight threads, the synchronization overhead seems to outweigh the benefits of multithreading. (However, I cannot support this with measurement data.)
Honestly, there's your problem right there. Figure out a performance benchmark and prove that this is the problem, or risk spending hours / days solving the entirely wrong problem.
You are thinking about the problem in a subtly wrong way that tends to lead to very bad coding. You don't want to wait for threads, you want to wait for work to be completed.
Probably the most efficient way is a shared, waitable counter. When you make new work, increment the counter and signal the counter. When you complete work, decrement the counter. If there is no work to do, wait on the counter. If you drop the counter to zero, check if you can make new work.
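The "shared, waitable counter" idea can be sketched with intrinsic locking; the class and method names here are my own:

```java
// Sketch of a waitable work counter: producers increment, workers
// decrement, and anyone can block until all outstanding work is done.
public class WorkCounter {
    private int outstanding = 0;

    public synchronized void workAdded() {
        outstanding++;
        notifyAll(); // signal: work has appeared
    }

    public synchronized void workCompleted() {
        outstanding--;
        if (outstanding == 0) notifyAll(); // signal: quiescence reached
    }

    public synchronized void awaitAllDone() throws InterruptedException {
        while (outstanding > 0) {
            wait(); // released and re-acquired atomically by the monitor
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WorkCounter counter = new WorkCounter();
        counter.workAdded();
        new Thread(counter::workCompleted).start();
        counter.awaitAllDone(); // returns once the worker has finished
        System.out.println("all work done");
    }
}
```

The key point matches the advice above: callers wait on the count of outstanding work, not on any particular thread.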
If I understand correctly, you're trying to break your solution up into parts and solve them separately, but concurrently, right? Then have your current thread wait for those tasks? You want to use something like a fork/join pattern.
List<CustomThread> threads = new ArrayList<CustomThread>();
for (Something something : somethings) {
threads.add(new CustomThread(something));
}
for (CustomThread thread : threads) {
thread.start();
}
for (CustomThread thread : threads) {
thread.join(); // Blocks until thread is complete
}
List<Result> results = new ArrayList<Result>();
for (CustomThread thread : threads) {
results.add(thread.getResult());
}
// do something with results.
In Java 7, there's even further support via a fork/join pool. See ForkJoinPool and its trail, and use Google to find one of many other tutorials.
You can recurse on this concept to get the tree you want, just have the threads you create generate more threads in the exact same way.
Edit: I was under the impression that you wouldn't be creating that many threads, so this is better for your scenario. The example won't be horribly short, but it goes along the same vein as the discussion you're having in the other answer, that you can wait on jobs, not threads.
First, you need a Callable for your sub-jobs that takes an Input and returns a Result:
public class SubJob implements Callable<Result> {
    private final Input input;

    public SubJob(Input input) {
        this.input = input;
    }

    public Result call() {
        // Actually process input here and return a result
        return JobWorker.processInput(input);
    }
}
Then, to use it, create an ExecutorService with a fixed-size thread pool. This will limit the number of jobs running concurrently so you don't accidentally thread-bomb your system. Here's your main job:
public class MainJob extends Thread {
    // Adjust the pool to the appropriate number of concurrent
    // threads you want running at the same time
    private static final ExecutorService pool = Executors.newFixedThreadPool(30);

    private final List<Input> inputs;

    public MainJob(List<Input> inputs) {
        super("MainJob");
        this.inputs = new ArrayList<Input>(inputs);
    }

    public void run() {
        CompletionService<Result> compService = new ExecutorCompletionService<Result>(pool);
        List<Result> results = new ArrayList<Result>();
        int submittedJobs = inputs.size();
        for (Input input : inputs) {
            // Starts the job when a thread is available
            compService.submit(new SubJob(input));
        }
        try {
            for (int i = 0; i < submittedJobs; i++) {
                // take() blocks until a job completes; get() unwraps its result
                results.add(compService.take().get());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        // Do something with results
    }
}
This will allow you to reuse threads instead of generating a bunch of new ones every time you want to run a job. The completion service will do the blocking while it waits for jobs to complete. Also note that the results list will be in order of completion.
You can also use Executors.newCachedThreadPool, which creates a pool with no upper limit (effectively Integer.MAX_VALUE). It will reuse a thread if one is available and create a new one if all the threads in the pool are running a job. This may be desirable later if you start encountering deadlocks (because there are so many jobs waiting in the fixed thread pool that sub-jobs can't run and complete). It will at least limit the number of threads you're creating and destroying.
Lastly, you'll need to shut down the ExecutorService manually (perhaps via a shutdown hook), or the threads it contains will prevent the JVM from terminating.
Hope that helps/makes sense.
If you have a generation task (like the example of processing columns of a matrix) then you may be stuck with a CyclicBarrier. That is to say, if every single piece of work for generation 1 must be done in order to process any work for generation 2, then the best you can do is to wait for that condition to be met.
If there are thousands of tasks in each generation, then it may be better to submit all of those tasks to an ExecutorService (ExecutorService.invokeAll) and simply wait for the results to return before proceeding to the next step. The advantage of doing this is eliminating context switching and wasted time/memory from allocating hundreds of threads when the physical CPU is bounded.
If your tasks are not generational but instead form more of a tree-like structure, in which only a subset needs to be complete before the next step can occur on that subset, then you might want to consider a ForkJoinPool, and you don't need Java 7 to do that: a reference implementation for Java 6 is available from JSR 166, which introduced the ForkJoinPool code.
I also have another answer which provides a rough implementation in Java 6:
public class Fib implements Callable<Integer> {
    int n;
    Executor exec;

    Fib(final int n, final Executor exec) {
        this.n = n;
        this.exec = exec;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public Integer call() throws Exception {
        if (n == 0 || n == 1) {
            return n;
        }
        // Divide the problem
        final Fib n1 = new Fib(n - 1, exec);
        final Fib n2 = new Fib(n - 2, exec);
        // FutureTask only allows run to complete once
        final FutureTask<Integer> n2Task = new FutureTask<Integer>(n2);
        // Ask the Executor for help
        exec.execute(n2Task);
        // Do half the work ourselves
        final int partialResult = n1.call();
        // Do the other half of the work if the Executor hasn't
        n2Task.run();
        // Return the combined result
        return partialResult + n2Task.get();
    }
}
Keep in mind that if you have divided the tasks up too much and the unit of work being done by each thread is too small, there will negative performance impacts. For example, the above code is a terribly slow way to solve Fibonacci.
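A common remedy for that over-division, sketched below, is a sequential cutoff: below some threshold the task computes directly instead of splitting further. The class name and cutoff value are my own; the cutoff should be tuned by measurement.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class FibTask extends RecursiveTask<Long> {
    private static final int CUTOFF = 12; // assumption: tune by measurement
    private final int n;

    FibTask(int n) {
        this.n = n;
    }

    @Override
    protected Long compute() {
        if (n <= CUTOFF) {
            return seqFib(n); // the work is too small to be worth splitting
        }
        FibTask left = new FibTask(n - 1);
        left.fork();                               // run one half asynchronously
        long right = new FibTask(n - 2).compute(); // do the other half ourselves
        return left.join() + right;
    }

    // Plain iterative Fibonacci for the small cases.
    private static long seqFib(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(new ForkJoinPool().invoke(new FibTask(30))); // 832040
    }
}
```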
Please look at my following code....
private static final int NTHREDS = 10;
ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);

while (rs.next()) {
    webLink = rs.getString(1);
    FirstName = rs.getString(2);
    MiddleName = rs.getString(3);
    Runnable worker = new MyRunnable(webLink, FirstName, MiddleName); // this interface has run method....
    executor.execute(worker);
}
//added
public class MyRunnable implements Runnable {
    MyRunnable(String webLink, String FirstName, String MiddleName) {
        ** Assigning Values...***
    }

    @Override
    public void run() {
        long sum = 0;
        **Calling method to crawl by passing those Values**
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
In this part, if the ResultSet (rs) has 100 records, the executor creates 100 threads... I need to run this process within 10 threads. I need your help to understand how to get control of the threads: if any thread has completed its task, it should process the next available task from the ResultSet. Is it possible to achieve this using the Executor framework?
Thanks...
vijay365
The code you've already posted does this. Your code will not immediately spawn 100 threads. It will spawn 10 threads that consume tasks from a queue containing your Runnables.
From the Executors.newFixedThreadPool Javadocs:
Creates a thread pool that reuses a fixed set of threads operating off a shared unbounded queue.
Instead of using a static number of threads (10 in this case) you should determine the number dynamically:
final int NTHREADS = Runtime.getRuntime().availableProcessors();
Also, I don't get why you are calling Thread.sleep?
ResultSet is probably a JDBC query result.
This design is almost certainly doomed to failure.
The JDBC interface implementations are not thread-safe.
ResultSets are scarce resources that should be closed in the same scope in which they were created. If you pass them around, you're asking for trouble.
Multi-threaded code is hard to write well and even harder to debug if incorrect.
You are almost certainly headed in the wrong direction with this design. I'd bet a large sum of money that you're guilty of premature optimization. You are hoping that multiple threads will make your code faster, but what will happen is ten threads time-slicing on one CPU, taking the same time or longer. (Context switching takes time, too.)
A slightly better idea would be to load the ResultSet into an object or collection, close the ResultSet, and then do some multi-threaded processing on that returned object.
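That approach might look roughly like the sketch below. The Row class and the "crawl" step are stand-ins for the real JDBC code: in practice, step 1 is the while (rs.next()) loop, run on a single thread before the ResultSet is closed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CrawlAfterLoad {
    // Hypothetical value class standing in for one ResultSet row.
    static class Row {
        final String webLink, firstName, middleName;
        Row(String w, String f, String m) {
            webLink = w; firstName = f; middleName = m;
        }
    }

    public static void main(String[] args) throws Exception {
        // Step 1: drain the ResultSet into plain objects on ONE thread,
        // then close it. (Stand-in data here instead of JDBC calls.)
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("http://a.example", "Ada", "M"));
        rows.add(new Row("http://b.example", "Bob", "N"));

        // Step 2: only now hand the plain objects to a bounded pool.
        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<Future<String>> results = new ArrayList<>();
        for (Row row : rows) {
            results.add(executor.submit(() -> "crawled " + row.webLink));
        }
        for (Future<String> f : results) {
            System.out.println(f.get()); // printed in submission order
        }
        executor.shutdown();
    }
}
```

This keeps all JDBC access on one thread and lets the pool work on simple immutable objects instead.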
Try executor.submit(worker);