I am developing a program that sends HTTP requests to fetch documents.
I have filled a queue with all the request items:
Queue<RequestItem> requestItems = buildRequest4Docs();
Then,
int threadNum = requestItems.size();
//ExecutorService exs = Executors.newFixedThreadPool(threadNum);
for (int i = 0; i < threadNum; i++) {
ResponseInterface response = new CMSGOResponse();
RequestTask task = new RequestTask(requestItems.poll(), this, response);
task.run();
//exs.execute(new RequestTask(requestItems.poll(), this, response));
}
//exs.shutdown();
I am confused here: in the for loop, do the tasks run simultaneously, or do they run one by one?
Thanks!
As written, the tasks are executed one by one. If you uncomment the lines that are currently commented out, and comment out the lines RequestTask task = new RequestTask(requestItems.poll(), this, response); and task.run(); you will get concurrent execution.
So for concurrent execution it has to look like this:
int threadNum = requestItems.size();
ExecutorService exs = Executors.newFixedThreadPool(threadNum);
for (int i = 0; i < threadNum; i++) {
ResponseInterface response = new CMSGOResponse();
exs.execute(new RequestTask(requestItems.poll(), this, response));
}
exs.shutdown();
while (! exs.isTerminated()) {
try {
exs.awaitTermination(1L, TimeUnit.DAYS);
}
catch (InterruptedException e) {
// you may or may not care here, but if you truly want to
// wait for the pool to shutdown, just ignore the exception
// otherwise you'll have to deal with the exception and
// make a decision to drop out of the loop or something else.
}
}
In addition, I suggest that you do not tie the number of threads created by the ExecutorService to the number of tasks you have to work on. Tying it to the number of processors of the host system is usually a better approach. To get the number of processors, use: Runtime.getRuntime().availableProcessors()
You then submit the items of your queue to the executor service initialized this way. That works nicely without fetching the total size: just poll the Queue until it returns null.
The final result of my proposals could look like this:
final int threadNum = Runtime.getRuntime().availableProcessors();
final ExecutorService exs = Executors.newFixedThreadPool(threadNum);
while (true) {
final RequestItem requestItem = requestItems.poll();
if (requestItem == null) {
break;
}
final ResponseInterface response = new CMSGOResponse();
exs.execute(new RequestTask(requestItem , this, response));
}
exs.shutdown();
I am confused here, in the for loop,does the tasks run simultaneously? Or the tasks run one by one?
With the code you've posted, they'll run one by one, because (assuming RequestTask is a subclass of Thread) you've called run. You should call start. Now that you've said RequestTask implements Runnable, the correct code wouldn't call start on the task (it doesn't have one!) but rather new Thread(task).start();. (But it looks like you've now received a good answer regarding the ExecutorService, which is another way to do it.)
Assuming you start them on different threads instead, then yes, they'll all run in parallel (as much as they can on the hardware, etc.).
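To make the difference between run() and start() concrete, here is a minimal sketch. The class name RunVersusStart and the thread name "worker-1" are made up for illustration; the task simply records which thread executed it.

```java
import java.util.concurrent.atomic.AtomicReference;

public class RunVersusStart {

    /** Returns the name of the thread that executes the task when run() is called directly. */
    public static String threadUsedByRun() {
        AtomicReference<String> seen = new AtomicReference<>();
        Runnable task = () -> seen.set(Thread.currentThread().getName());
        task.run(); // plain method call: executes on the calling thread
        return seen.get();
    }

    /** Returns the name of the thread that executes the task when it is wrapped in a Thread and started. */
    public static String threadUsedByStart() {
        AtomicReference<String> seen = new AtomicReference<>();
        Runnable task = () -> seen.set(Thread.currentThread().getName());
        Thread worker = new Thread(task, "worker-1"); // "worker-1" is an arbitrary name
        worker.start(); // run() is now invoked on the new thread
        try {
            worker.join(); // wait for the worker before reading what it recorded
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("run():   " + threadUsedByRun());
        System.out.println("start(): " + threadUsedByStart());
    }
}
```

Calling run() reports the caller's own thread, while start() reports the separate worker thread.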
Currently you are running your tasks sequentially. You have two ways to run them on separate threads (assuming that RequestTask extends Thread):
I. Either create the thread object and call its start() method.
RequestTask task = new RequestTask(requestItems.poll(), this, response);
task.start(); // run() method will be called, you don't need to call it
II. Or create an ExecutorService
ExecutorService pool = Executors.newFixedThreadPool(poolSize);
//....
for (int i = 0; i < threadNum; i++) {
ResponseInterface response = new CMSGOResponse();
RequestTask task = new RequestTask(requestItems.poll(), this, response);
pool.execute(task);
}
You are running them one by one in the current thread. You need to use the ExecutorService to run them concurrently.
I am confused here, in the for loop,does the tasks run simultaneously? Or the tasks run one by one?
The tasks will be executed in the same thread, i.e. one by one, since you are calling run() rather than start(); calling run() directly does not execute the task in a new thread.
int threadNum = requestItems.size();
ExecutorService exs = Executors.newFixedThreadPool(threadNum);
ResponseInterface response = new CMSGOResponse();
RequestTask task = new RequestTask(requestItems.poll(), this, response);
exs.execute(task );
exs.shutdown();
In the above case the task will be executed in a new thread, and if you hand 10 different tasks to the ExecutorService they will be executed asynchronously in different threads.
I usually tend to create my Threads (or classes implementing Runnable), THEN launch them with the start() method.
In your case, since RequestTask implements Runnable, you could add a start() method like this:
public class RequestTask implements Runnable {
Thread t;
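volatile boolean running; // volatile so that a change made by another thread is seen by run()
public RequestTask() { // add whatever constructor parameters your task needs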
t = new Thread(this);
}
public void start() {
running = true; // you could use a setter
t.start();
}
public void run() {
while (running) {
// your code goes here
}
}
}
Then:
int threadNum = requestItems.size();
RequestTask[] rta = new RequestTask[threadNum];
// Create the so-called Threads ...
for (int i=0;i<threadNum;i++) {
rta[i] = new RequestTask(requestItems.poll(), this, new CMSGOResponse());
}
// ... THEN launch them
for (int i=0;i<threadNum;i++) {
rta[i].start();
}
I believe I am getting bad data because the instance variables are not thread safe.
I am trying to use multithreading in a way that opens (at most) 13 threads at a time, based on a list. I am using it in a service and need to pass parameters into the run method, so I made some instance variables and set them. I also want those thirteen threads to finish before moving on to the next iteration of the first for loop.
private EnergyPortalGroup superGroup;
private EnergyPortalSubGroups singleSubGroup;
private BillingPeriod singlePeriod;
private DateTime[] dateTimeArray;
private void parseGroup(EnergyPortalGroup superGroup) throws InterruptedException{
EnergyPortalSubGroupsCriteria criteria = new EnergyPortalSubGroupsCriteria();
criteria.setGroupId(superGroup.getId());
List<EnergyPortalSubGroups> wholeSubGroupList = subgroupsFactory.readList(criteria);
for (EnergyPortalSubGroups singleSubGroup : wholeSubGroupList){
this.singleSubGroup = singleSubGroup;
this.deleteSubGroupRecordsFromDB(singleSubGroup);
List<BillingPeriod> billingPeriodList = this.getPreviousTwelveBillingPeriods(singleSubGroup, superGroup);
if (billingPeriodList != null && billingPeriodList.size() > 0){
Thread[] threads = new Thread[billingPeriodList.size()];
for (int i = 0; i < billingPeriodList.size(); i++){
this.singlePeriod = billingPeriodList.get(i);
threads[i] = new Thread(this);
threads[i].start();
}
for (Thread thread : threads){
thread.join();
}
}
}
}
Here is my overridden run method:
@Override
public void run(){
List<GroupSummarization> groupSummarizationsToWriteList = new ArrayList<>();
WidgetDataSummationHolder holder = new WidgetDataSummationHolder();
holder = energyPortalService.getEnergyPortalWidgetsSummedData(singleSubGroup, null, null, singlePeriod);
parseSummationHolder(groupSummarizationsToWriteList, holder, singleSubGroup, dateTimeArray, singlePeriod);
processBatchLists(groupSummarizationsToWriteList, superGroup, singlePeriod);
}
Can anyone help me make this thread safe? I am obviously new to multithreading. I tried passing these variables in with a constructor, but I have some autowired services that were null, and I was getting a NullPointerException at this line: holder = energyPortalService.getEnergyPortalWidgetsSummedData(singleSubGroup, null, null, singlePeriod);
energyPortalService cannot be null at some times and not at others, given the code you provided. If it is not null when you launch a new Thread(this), then it will also be there if you use a new Thread(() -> {...});
(Since you are talking about autowiring, I presume a whole lot of foul play can occur with OSGi, AOP and similar evils.)
In the end, I went with ExecutorService and a new class/service like a few suggested. So here is an example in case anyone else runs into this type of problem:
for (final Object x : list){
List<Object> someList = getList();
if (someList != null && !someList.isEmpty()){
ExecutorService executorService = Executors.newCachedThreadPool();
List<Future<?>> futures = new ArrayList<Future<?>>();
for (final Object n : someList){
futures.add(executorService.submit(new Runnable(){
@Override
public void run(){
someOtherService.process(parameters);
}
}));
}
for (Future<?> f : futures){
try {
f.get();
} catch (InterruptedException | ExecutionException e) {
//do some logging
}
}
}
}
Basically this uses an ExecutorService, which manages the thread pool. I call newCachedThreadPool so that it creates new threads as needed, instead of assuming I know how many threads I will need, which is what newFixedThreadPool(n) requires. To make sure every task of one pass has finished before moving on, after the inner loop I loop through the futures list (a Future represents the result of an asynchronous computation) and call f.get(), which waits if necessary for the computation to complete and then retrieves its result.
This worked great for what I needed. And the key part is that inside of the overridden run function, the process method takes whatever parameters you want (notice the use of final) instead of trying to force feed run() or worrying about an autowired service when you are calling a constructor. This bypasses all of that.
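For anyone who wants something compilable, here is a small self-contained sketch of the same idea. It is only an illustration under assumptions: process(...) is a hypothetical stand-in for the real service call, and the String/Integer parameters stand in for the sub group and billing period. The point is that every task captures its own parameters instead of reading shared instance fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerTaskParameters {

    /** Hypothetical stand-in for the real service call; it just labels its input. */
    static String process(String subGroup, int period) {
        return subGroup + "/" + period;
    }

    /** Runs one task per period and returns the results in submission order. */
    public static List<String> processAll(String subGroup, List<Integer> periods) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final Integer period : periods) {
                // Each task captures its own 'period'; nothing is stored in a shared field.
                futures.add(pool.submit(() -> process(subGroup, period)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the loop variable is captured per task, a later iteration cannot overwrite the value an earlier task is still using, which is what went wrong with the instance fields.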
Thank you to all who put me on the correct path
In Java, how do I pass objects back to the main thread from worker threads? Take the following code as an example:
main(String[] args) {
String[] inputs;
Result[] results;
Thread[] workers = new WorkerThread[numThreads];
for (int i = 0; i < numThreads; i++) {
workers[i] = new WorkerThread(i, inputs[i], results[i]);
workers[i].start();
}
....
}
....
class WorkerThread extends Thread {
String input;
int name;
Result result;
WorkerThread(int name, String input, Result result) {
super(name+"");
this.name = name;
this.input = input;
this.result = result;
}
public void run() {
result = Processor.process(input);
}
}
How do I pass the result back to main's results[i]?
How about passing this to WorkerThread,
workers[i] = new WorkerThread(i, inputs[i], results[i], this);
so that it could
mainThread.results[i] = Processor.process(inputs[i]);
Why don't you use Callables and an ExecutorService?
main(String[] args) {
String[] inputs;
Future<Result>[] results;
for (int i = 0; i < inputs.length; i++) {
results[i] = executor.submit(new Worker(inputs[i]));
}
for (int i = 0; i < inputs.length; i++) {
Result r = results[i].get();
// do something with the result
}
}
@Thilo's and @Erickson's answers are the best ones. There are existing APIs that do this kind of thing simply and reliably.
But if you want to persist with your current approach of doing it by hand, then the following change to your code may be sufficient:
for (int i = 0; i < numThreads; i++) {
results[i] = new Result();
...
workers[i] = new WorkerThread(i, inputs[i], results[i]);
workers[i].start();
}
...
public void run() {
Result tmp = Processor.process(input);
this.result.updateFrom(tmp);
// ... where the updateFrom method copies the state of tmp into
// the Result object that was passed from the main thread.
}
Another approach is to replace Result[] in the main program with Result[][] and pass each child thread a one-element Result[] that it can update with its result object (a lightweight holder).
However, there is an important gotcha when you implement this at a low level: the main thread needs to call Thread.join on all of the child threads before attempting to retrieve the results. If you don't, there is a risk that the main thread will occasionally see stale values in the Result objects. The join also ensures that the main thread doesn't try to access a Result before the corresponding child thread has completed it.
The main thread will need to wait for the worker threads to complete before getting the results. One way to do this is for the main thread to wait for each worker thread to terminate before attempting to read the result. A thread terminates when its run() method completes.
For example:
for (int i = 0; i < workers.length; i++) {
workers[i].join(); // wait for worker thread to terminate
Result result = results[i]; // get the worker thread's result
// process the result here...
}
You still have to arrange for the worker thread's result to be inserted into the results[] array somehow. As one possibility, you could do this by passing the array and an index into each worker thread and having the worker thread assign the result before terminating.
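A minimal sketch of that array-plus-index idea follows. To keep it self-contained it uses an int[] instead of Result[], and process(...) is a hypothetical stand-in for Processor.process(input).

```java
public class ArraySlotWorkers {

    /** Hypothetical stand-in for Processor.process(input). */
    static int process(String input) {
        return input.length();
    }

    /** Starts one thread per input; each thread writes only to its own slot of the result array. */
    public static int[] computeLengths(String[] inputs) {
        final int[] results = new int[inputs.length];
        Thread[] workers = new Thread[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            final int slot = i; // the index handed to this worker
            workers[i] = new Thread(() -> {
                results[slot] = process(inputs[slot]);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // after join() returns, the worker's write is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

No locking is needed in this sketch because no two threads touch the same slot, and the main thread reads the array only after joining every worker.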
Some typical solutions would be:
Hold the result in the worker thread's instance (be it Runnable or Thread). This is similar to the use of the Future interface.
Use a BlockingQueue that the worker threads are constructed with which they can place their result into.
Simply use the ExecutorService and Callable interfaces to get a Future which can be asked for the result.
It looks like your goal is to perform the computation in parallel, then once all results are available to the main thread, it can continue and use them.
If that's the case, implement your parallel computation as a Callable rather than a thread. Pass this collection of tasks to the invokeAll() method of an ExecutorService. This method will block until all the tasks have been completed, and then your main thread can continue.
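As a rough sketch of the invokeAll() approach (the squaring task and the pool size of 4 are arbitrary choices for illustration, not something from the question):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeAllExample {

    /** Squares every input in parallel and returns the results in input order. */
    public static List<Integer> squareAll(List<Integer> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary pool size
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (final Integer n : inputs) {
                tasks.add(() -> n * n);
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task has completed; the futures
            // come back in the same order as the tasks.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Since every Future is already done when invokeAll returns, the get() calls in the loop do not wait.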
I think I have a better solution: why don't you have your worker threads put their results into a LinkedBlockingQueue that is passed to them, and have your main function pick the results up from the queue like this:
while (true) {
linkedBlockingQueue.take();
// TODO: fill in what you want to do with the result
// exit if a specific kind of object is returned or a CountDownLatch has finished
}
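A slightly fuller sketch of the queue hand-off, under the assumption that the main thread knows how many results to expect and can therefore simply call take() that many times (the length computation is a made-up placeholder for the real work):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueHandOff {

    /** Each worker puts one result on the queue; the caller takes exactly one result per worker. */
    public static int sumOfLengths(List<String> inputs) {
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();
        for (final String input : inputs) {
            // add() cannot fail here because the queue is unbounded
            new Thread(() -> results.add(input.length())).start();
        }
        int total = 0;
        try {
            for (int i = 0; i < inputs.size(); i++) {
                total += results.take(); // blocks until some worker has delivered a result
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }
}
```

Results arrive in completion order rather than submission order, so this fits best when the order does not matter.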
I am writing a thread pool utility for my multithreaded program. I need to verify that the following methods are correct and return the right values. I am using a LinkedBlockingQueue with a size of 1. The Javadoc always says that these methods return an 'approximate' number, so I doubt whether the following conditions are correct.
public boolean isPoolIdle() {
return myThreadPool.getActiveCount() == 0;
}
public int getAcceptableTaskCount() {
//initially poolSize is 0 ( after pool executes something it started to change )
if (myThreadPool.getPoolSize() == 0) {
return myThreadPool.getCorePoolSize() - myThreadPool.getActiveCount();
}
return myThreadPool.getPoolSize() - myThreadPool.getActiveCount();
}
public boolean isPoolReadyToAcceptTasks(){
return myThreadPool.getActiveCount()<myThreadPool.getCorePoolSize();
}
Please let me know your thoughts and suggestions.
UPDATE
The interesting thing is that when getAcceptableTaskCount tells me there are 3 threads available and I pass 3 tasks to the pool, sometimes one task is rejected and handled by the RejectedExecutionHandler, and sometimes the pool handles all the tasks I passed. I am wondering why the pool rejects tasks, since I am passing tasks according to the available thread count.
--------- implementation of Gray's answer ---
class MyTask implements Runnable {
@Override
public void run() {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("exec");
}
}
@Test
public void testTPool(){
ExecutorService pool = Executors.newFixedThreadPool(5);
List<Future<MyTask>> list = new ArrayList<Future<MyTask>>();
for (int i = 0; i < 5; i++) {
MyTask t = new MyTask();
list.add(pool.submit(t, t));
}
for (int i = 0; i < list.size(); i++) {
Future<MyTask> t = list.get(i);
System.out.println("Result -"+t.isDone());
MyTask m = new MyTask();
list.add(pool.submit(m,m));
}
}
This prints Result -false in the console, meaning that the task is not complete.
From your comments:
I need to know whether the pool is idle or can accept tasks. If it can accept tasks, I need to know how many free threads there are in the pool. If it is 5, I will send 5 tasks to the pool for processing.
I don't think that you should be doing the pool accounting yourself. For your thread pool if you use Executors.newFixedThreadPool(5) then you can submit as many tasks as you want and it will only run them in 5 threads.
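If you want to convince yourself of that, here is a small sketch that submits more tasks than there are threads and records the highest number that ever ran at the same time. The class name, the counters and the 20 ms sleep are all made up for the demonstration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolCap {

    /** Submits taskCount short tasks to a pool of poolSize threads and returns the peak number running at once. */
    public static int peakConcurrency(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest value seen
                try {
                    Thread.sleep(20); // simulate some work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown(); // already submitted tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }
}
```

With a pool of 5 and 20 tasks, the extra tasks wait in the pool's internal queue, so the peak never exceeds 5 and nothing is rejected.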
So I take the first (at most) 5 tasks from the vector and assign them to the pool, ignoring the other tasks in the vector, since they may be updated or removed by a separate cycle.
Ok, I see. So you want to maximize parallelization while at the same time not pre-loading jobs? I would think that something like the following pseudo code would work:
int numThreads = 5;
ExecutorService threadPool = Executors.newFixedThreadPool(numThreads);
List<Future<MyJob>> futures = new ArrayList<Future<MyJob>>();
// submit the initial jobs
for (int i = 0; i < numThreads; i++) {
MyJob myJob = getNextBestJob();
futures.add(threadPool.submit(myJob, myJob));
}
// the list is growing so we use for i
for (int i = 0; i < futures.size(); i++) {
// wait for a job to finish
MyJob myJob = futures.get(i).get();
// process the job somehow
// get the next best job now that the previous one finished
MyJob nextJob = getNextBestJob();
if (nextJob != null) {
// submit the next job unless we are done
futures.add(threadPool.submit(nextJob, nextJob));
}
}
However, I don't quite understand how the thread count would change. If you edit your question with some more details I can tweak my response.
I have a .csv file containing over 70 million lines. Each line is turned into a Runnable that is executed by a thread pool and inserts a record into MySQL.
What's more, I want to record a position in the CSV file so that the RandomAccessFile can seek to it later. The position is written to a file. I want to write this record once all the threads in the thread pool have finished, so ThreadPoolExecutor.shutdown() is invoked. But when more lines come, I need a thread pool again. How can I reuse the current thread pool instead of making a new one?
The code is as follows:
public static boolean processPage() throws Exception {
long pos = getPosition();
long start = System.currentTimeMillis();
raf.seek(pos);
if(pos==0)
raf.readLine();
for (int i = 0; i < PAGESIZE; i++) {
String lineStr = raf.readLine();
if (lineStr == null)
return false;
String[] line = lineStr.split(",");
final ExperienceLogDO log = CsvExperienceLog.generateLog(line);
//System.out.println("userId: "+log.getUserId()%512);
pool.execute(new Runnable(){
public void run(){
try {
experienceService.insertExperienceLog(log);
} catch (BaseException e) {
e.printStackTrace();
}
}
});
long end = System.currentTimeMillis();
}
BufferedWriter resultWriter = new BufferedWriter(
new OutputStreamWriter(new FileOutputStream(new File(
RESULT_FILENAME), true)));
resultWriter.write("\n");
resultWriter.write(String.valueOf(raf.getFilePointer()));
resultWriter.close();
long time = System.currentTimeMillis()-start;
System.out.println(time);
return true;
}
Thanks !
As stated in the documentation, you cannot reuse an ExecutorService that has been shut down. I'd recommend against any workarounds, since (a) they may not work as expected in all situations; and (b) you can achieve what you want using standard classes.
You must either
instantiate a new ExecutorService; or
not terminate the ExecutorService.
The first solution is easily implemented, so I won't detail it.
For the second, since you want to execute an action once all the submitted tasks have finished, you might take a look at ExecutorCompletionService and use it instead. It wraps an ExecutorService which will do the thread management, but the runnables will get wrapped into something that will tell the ExecutorCompletionService when they have finished, so it can report back to you:
ExecutorService executor = ...;
ExecutorCompletionService ecs = new ExecutorCompletionService(executor);
for (int i = 0; i < totalTasks; i++) {
... ecs.submit(...); ...
}
for (int i = 0; i < totalTasks; i++) {
ecs.take();
}
The method take() on the ExecutorCompletionService class will block until a task has finished (either normally or abruptly). It will return a Future, so you can check the results if you wish.
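To show the whole round trip, here is a minimal sketch that runs two batches on the same pool without ever calling shutdown() in between. The squaring tasks and the pool size of 4 are made up for illustration; in your case each task would be one insert.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionServiceBatches {

    /** Submits the squares of 1..totalTasks and waits for all of them; the pool stays usable afterwards. */
    public static int sumOfSquares(ExecutorService executor, int totalTasks) {
        CompletionService<Integer> ecs = new ExecutorCompletionService<>(executor);
        for (int i = 1; i <= totalTasks; i++) {
            final int n = i;
            ecs.submit(() -> n * n);
        }
        int sum = 0;
        try {
            for (int i = 0; i < totalTasks; i++) {
                sum += ecs.take().get(); // take() blocks until some task has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return sum;
    }

    /** Runs two batches one after the other on a single pool. */
    public static int[] twoBatchesOnOnePool() {
        ExecutorService executor = Executors.newFixedThreadPool(4); // arbitrary pool size
        try {
            int first = sumOfSquares(executor, 3);  // 1 + 4 + 9
            int second = sumOfSquares(executor, 4); // 1 + 4 + 9 + 16
            return new int[] { first, second };
        } finally {
            executor.shutdown(); // only once, when the pool is no longer needed
        }
    }
}
```

The place where sumOfSquares returns is where you would write the file position, since every task of that batch has finished by then.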
I hope this can help you, since I didn't completely understand your problem.
Create and group all the tasks and submit them to the pool with invokeAll (which only returns once all tasks have completed, whether normally or by throwing an exception).
After calling shutdown on an ExecutorService, no new tasks will be accepted. This means you have to create a new ExecutorService for each round of tasks.
However, with Java 8 ForkJoinPool.awaitQuiescence was introduced. If you can switch from a normal ExecutorService to ForkJoinPool, you can use this method to wait until no more tasks are running in a ForkJoinPool without having to call shutdown. This means you can fill a ForkJoinPool with Tasks, waiting until it is empty (quiescent), and then again begin filling it with Tasks, and so on.
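A rough sketch of that fill, wait, refill cycle is below. The counter stands in for real work, and the parallelism of 4 and the one-minute timeout are arbitrary assumptions.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QuiescenceRounds {

    /** Runs two rounds of tasks on one ForkJoinPool, waiting for quiescence after each round. */
    public static int runTwoRounds(int tasksPerRound) {
        ForkJoinPool pool = new ForkJoinPool(4); // arbitrary parallelism
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int round = 0; round < 2; round++) {
                for (int i = 0; i < tasksPerRound; i++) {
                    pool.execute(() -> { completed.incrementAndGet(); });
                }
                // Waits until no tasks are queued or running; the pool stays open for the next round.
                if (!pool.awaitQuiescence(1, TimeUnit.MINUTES)) {
                    throw new IllegalStateException("pool did not become quiescent in time");
                }
            }
            return completed.get();
        } finally {
            pool.shutdown(); // only once, after the last round
        }
    }
}
```

Note that awaitQuiescence returns false on timeout rather than throwing, so the return value should be checked before moving on.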
The setup:
I am in the process of changing the way a program works under the hood. The current version works like this:
public void threadWork( List<MyCallable> workQueue )
{
ExecutorService pool = Executors.newFixedThreadPool(someConst);
List<Future<myOutput>> returnValues = new ArrayList<Future<myOutput>>();
List<myOutput> finishedStuff = new ArrayList<myOutput>();
for( int i = 0; i < workQueue.size(); i++ )
{
returnValues.add( pool.submit( workQueue.get(i) ) );
}
while( !returnValues.isEmpty() )
{
try
{
// Future.get() waits for a value from the callable
finishedStuff.add( returnValues.remove(0).get() );
}
catch(Throwable iknowthisisbaditisjustanexample){}
}
doLotsOfThings(finishedStuff);
}
But the new system is going to use a private inner Runnable to call a synchronized method that writes the data into a global variable. My basic setup is:
public void threadReports( List<String> workQueue )
{
ExecutorService pool = Executors.newFixedThreadPool(someConst);
List<MyRunnable> runnables = new ArrayList<MyRunnable>();
for ( int i = 0; i < workQueue.size(); i++ )
{
runnables.add( new MyRunnable( workQueue.get(i) ) );
pool.submit( runnables.get(i) );
}
while( !runnables.isEmpty() )
{
try
{
runnables.remove(0).wait(); // I realized that this wouldn't work
}
catch(Throwable iknowthisisbaditisjustanexample){}
}
doLotsOfThings(finishedStuff); // finishedStuff is the global the Runnables write to
}
If you read my comment in the try of the second piece of code you will notice that I don't know how to use wait(). I had thought it was basically like thread.join() but after reading the documentation I see it is not.
I'm okay with changing some structure as needed, but the basic system of taking work, using runnables, having the runnables write to a global variable, and using a threadpool are requirements.
The Question
How can I wait for the threadpool to be completely finished before I doLotsOfThings()?
You should call ExecutorService.shutdown() and then ExecutorService.awaitTermination.
...
pool.shutdown();
if (pool.awaitTermination(<long>,<TimeUnit>)) {
// finished before timeout
doLotsOfThings(finishedStuff);
} else {
// Timeout occurred.
}
Try this:
pool.shutdown();
pool.awaitTermination(WHATEVER_TIMEOUT, TimeUnit.SECONDS);
Have you considered using the Fork/Join framework that is now included in Java 7? If you do not want to use Java 7 yet you can get the jar for it here.
public void threadReports( List<String> workQueue )
{
ExecutorService pool = Executors.newFixedThreadPool(someConst);
Set<Future<?>> futures = new HashSet<Future<?>>();
for ( int i = 0; i < modules.size(); i++ )
{
futures.add(pool.submit(threads.get(i)));
}
while( !futures.isEmpty() )
{
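Set<Future<?>> removed = new HashSet<Future<?>>();
for(Future<?> f : futures) {
try {
f.get(100, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
// not finished yet; it is checked again on the next pass
} catch (InterruptedException | ExecutionException e) {
// log or handle as appropriate
}
if(f.isDone()) removed.add(f);
}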
for(Future<?> f : removed) futures.remove(f);
}
doLotsOfThings(finishedStuff); // finishedStuff is the global the Runnables write to
}
shutdown is a lifecycle method of the ExecutorService and renders the executor unusable after the call. Creating and destroying thread pools in a method is as bad as creating and destroying threads: it pretty much defeats the purpose of using a thread pool, which is to reduce the overhead of thread creation by enabling transparent reuse.
If possible, you should keep your ExecutorService lifecycle in sync with your application: create it when first needed, and shut it down when your app is closing down.
To achieve your goal of executing a bunch of tasks and waiting for them, the ExecutorService provides the method invokeAll(Collection<? extends Callable<T>> tasks) (and the version with timeout if you want to wait a specific period of time.)
Using this method and some of the points mentioned above, the code in question becomes:
public void threadReports( List<String> workQueue ) throws InterruptedException {
List<Callable<MyRunnable>> callables = new ArrayList<Callable<MyRunnable>>(workQueue.size());
for (String work:workQueue) {
MyRunnable runnable = new MyRunnable(work);
// invokeAll takes Callables, so adapt each Runnable; the Runnable itself is returned as the result
callables.add(Executors.callable(runnable, runnable));
}
// Executor is obtained from some applicationContext that takes care of lifecycle mgmt
// invokeAll(...) will block and return when all callables are executed
List<Future<MyRunnable>> results = applicationContext.getExecutor().invokeAll(callables);
// I wouldn't use a global variable unless you have a VERY GOOD reason for that.
// b/c all the threads of the pool doing work will be contending for the lock on that variable.
// doLotsOfThings(finishedStuff);
// Note that the List of Futures holds the individual results of each execution.
// That said, the preferred way to harvest your results would be:
doLotsOfThings(results);
}
PS: Not sure why threadReports is void. It could/should return the calculation of doLotsOfThings to achieve a more functional design.