Scheduling task vs busy waiting


I need to perform a task every few hours and I'm looking for the most efficient solution for that. I thought about two approaches:
1) busy waiting
while (true) {
    doMyJob();
    Thread.sleep(TimeUnit.HOURS.toMillis(2)); // block for two hours
}
2) executor scheduling:
executor.schedule(new MyJobTask(), 2, TimeUnit.HOURS);
...
class MyJobTask implements Runnable {
    public void run() {
        doMyJob();
        ...
        // reschedule itself for the next run
        executor.schedule(new MyJobTask(), 2, TimeUnit.HOURS);
    }
}
Could you please advise which solution is more efficient, and in which situations each of them is preferable (if any)? Intuitively, I would go for the second solution, but I couldn't find anything to back up my intuition. If you have other solutions, please share them. The solution should also be memory efficient (hence my dilemma: do I need to create and keep a ThreadPool object just to do a simple job every two hours?).

None of the proposed solutions is really advisable inside an EE container (where you should avoid messing with threads), which you seem to target according to the tags of your question.
Starting with Java EE 5 there is the timer service which according to my tests works quite nicely with longer timeouts like the 2 hours in your example. There is one point that you really shouldn't forget though - quoting from the aforementioned tutorial:
Timers are persistent. If the server is shut down (or even crashes), timers are saved and will become active again when the server is restarted. If a timer expires while the server is down, the container will call the @Timeout method when the server is restarted.
If for whatever reason this solution is not acceptable you should have a look at the Quartz Scheduler. Its possibilities exceed your requirements by far, but at least it gives you a ready to use solution whose compatibility with a wide range of application servers is guaranteed.

Both should have about the same efficiency, but I would suggest using ScheduledExecutorService:
Executors.newSingleThreadScheduledExecutor()
    .scheduleAtFixedRate(new MyJobTask(), 0, 2, TimeUnit.HOURS);
There are several reasons, detailed here: A better way to run code for a period of time
But importantly, ScheduledExecutorService allows you to use multiple threads, so that tasks which take a long time don't necessarily have to back up your queue of tasks (a pool with more than one thread can run two different tasks simultaneously, although successive runs of the same periodic task never overlap). Be aware, though, that if doMyJob throws an exception, scheduleAtFixedRate suppresses all subsequent executions of the task, just as your second approach would fail to reschedule itself; catch exceptions inside the task if it should keep running.
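As a minimal sketch of that last point: the wrapper below catches whatever the job throws so that the periodic schedule survives a failed run. The class name PeriodicJob and the methods guarded and startEvery are made up for this illustration; only the java.util.concurrent calls are real API.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
class PeriodicJob {
    // Wraps a job so that a RuntimeException is logged instead of escaping run().
    // An exception that escapes run() would suppress all later executions.
    static Runnable guarded(Runnable job) {
        return () -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                System.err.println("job failed, next run stays scheduled: " + e);
            }
        };
    }

    // Runs the guarded job immediately and then once per period.
    static ScheduledFuture<?> startEvery(ScheduledExecutorService scheduler,
                                         Runnable job, long period, TimeUnit unit) {
        return scheduler.scheduleAtFixedRate(guarded(job), 0, period, unit);
    }
}
```

For the two-hour job from the question this could be called as PeriodicJob.startEvery(Executors.newSingleThreadScheduledExecutor(), () -> doMyJob(), 2, TimeUnit.HOURS). The scheduler's thread is not a daemon thread, so shutdown() has to be called on it when the application stops.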

Related

How to use Executor when there are thousands/dynamic number of tasks?

I just found out about Java's Executor and I'm thinking whether it fits my needs. I have a database, to which tasks are constantly added.
I SELECT all rows (tasks) from the database with SQL. At any time, there could be 100000, 200, 3 or 0 of them.
I see all examples are like this:
ExecutorService taskExecutor = Executors.newFixedThreadPool(50);
while(...) {
taskExecutor.execute(new MyTask());
}
How could it be adjusted to my scenario? I can't just instantiate 100000 MyTask objects; it would waste a ton of memory.
How should I wait for the tasks to finish and then check again? I can't just wait until all 100000 tasks are finished, because the first one might finish in 5 seconds and its thread would then sit unused for a long time until the last one is finished, so we would be wasting time.

An elegant and efficient way to do this would be to use a BoundedExecutor implementation as illustrated within the book Java Concurrency In Practice by Brian Goetz (Github link here).
The implementation provided at the Github link, while a good starting point, would need a few changes:
In case you want to access the return values of the tasks, make sure that your tasks implement the Callable interface instead of Runnable.
Using an ExecutorService instead of an Executor would be more beneficial, since it exposes more granular shutdown functionality.
Expose a set of methods that allow the executor to be shut down cleanly. These shutdown implementations can simply delegate to the underlying ExecutorService.shutdown() or ExecutorService.shutdownNow(), depending on the requirement.
Also it would be good to keep the bound as a configuration parameter - this way when the bounded executor is initialized the number of tasks that can be executed safely in parallel is read off this configuration parameter.
This configuration parameter can be tuned based on your performance and scale testing to identify the optimal bound for your application without degrading its performance or that of the underlying system.
Hope this helps.
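For illustration, here is a rough sketch of such a bounded executor. It follows the semaphore idea from the book but is not the book's listing: it takes an ExecutorService, rethrows a rejection instead of swallowing it, and the shutdown and awaitTermination delegates are additions made for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch only; adapted from the semaphore-based idea, not a verbatim copy.
class BoundedExecutor {
    private final ExecutorService exec;
    private final Semaphore permits;

    // bound = maximum number of tasks that may be queued or running at once
    BoundedExecutor(ExecutorService exec, int bound) {
        this.exec = exec;
        this.permits = new Semaphore(bound);
    }

    // Blocks the caller until a permit is free, so a producer reading rows
    // from the database cannot run ahead of the workers.
    void submitTask(Runnable task) throws InterruptedException {
        permits.acquire();
        try {
            exec.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot when the task ends
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release(); // the task never started, give the permit back
            throw e;
        }
    }

    void shutdown() {
        exec.shutdown();
    }

    boolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException {
        return exec.awaitTermination(timeout, unit);
    }
}
```

With this in place, the loop that reads tasks from the database can call submitTask for each row; it simply blocks whenever the configured number of tasks is already in flight, so only that many task objects exist at a time.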

Is Thread to be favoured over Executor here?

As far as I understand Executors help handling the execution of runnables. E.g. I would choose using an executor when I have several worker threads that do their job and then terminate.
The executor would handle the creation and the termination of the Threads needed to execute the worker runnables.
However now I am facing another situation. A fixed number of classes/objects shall encapsulate their own thread. So the thread is started at the creation of those objects and the Thread shall continue running for the whole life time of these objects.
These few objects, in turn, are created at the start of the program and exist for the whole run time.
I guess Threads are preferable over Executors in this situation, however when I read the internet everybody seems to suggest using Executors over Threads in any possible situation.
Can somebody please tell me if I want to choose Executors or Threads here and why?
Thanks

You're somewhat mixing things. Executor is just an interface. Thread is a core class. There's nothing which directly implies that Executor implementations execute tasks in separate threads.
Read the first few lines of the JavaDoc for Executor.
So if you want full control, just use Thread and do things on your own.
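To make the point about Executor concrete, here is the simplest possible implementation, along the lines of the one shown in the Executor JavaDoc. It runs every task in the caller's thread and never creates a thread of its own.

```java
import java.util.concurrent.Executor;

class DirectExecutor implements Executor {
    // Runs the task synchronously in the calling thread.
    @Override
    public void execute(Runnable task) {
        task.run();
    }
}
```

Calling new DirectExecutor().execute(r) is therefore equivalent to calling r.run() directly, which shows that the interface itself says nothing about threading.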

Without knowing more about the context, it's hard to give a good answer, but generally speaking I'd say that the situations that call for using Thread are few and far between. If you start trying to synchronize your program "manually" using synchronized, I bet things will get out of hand quickly. (Not to mention how hard it will be to debug the code.)
Last time I used a thread was when I wanted to record some audio in the background. It was a "start"/"stop" kind of thing, and not "task oriented". (I tried long and hard to try to find an audio library that would encapsulate that for me but failed.)
If you choose to go for a thread solution, I suggest you try to limit the scope of the thread so that it only executes within the associated object. As far as possible, this will spare you from having to think about happens-before relations, thread-safe publishing of values, etc. throughout the code.

ExecutorService can have thread pool
It optimizes performance, because creating a Thread is expensive.
ExecutorService has life cycle control
shutdown(), shutdownNow() etc are provided.
ExecutorService is flexible
You can configure a variety of behaviors: customize the ThreadFactory, set the thread pool size, schedule delayed or periodic execution with ScheduledThreadPoolExecutor, etc.
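As a small illustration of the flexibility point, the sketch below builds a fixed-size pool with a custom ThreadFactory. The class NamedPools, the method newNamedFixedPool and the naming scheme are invented for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the JDK.
class NamedPools {
    // Returns a fixed-size pool whose threads are named "<prefix>-1", "<prefix>-2", ...
    static ExecutorService newNamedFixedPool(int size, String prefix) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true); // these workers will not keep the JVM alive
            return t;
        };
        return Executors.newFixedThreadPool(size, factory);
    }
}
```

The pool reuses its threads across tasks, and shutdown() or shutdownNow() ends its life cycle when the objects that own it are disposed of.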

Threadpoolsize of ScheduledExecutorService

I have a ScheduledExecutorService that is given tasks for periodic execution:
scheduler = Executors.newScheduledThreadPool( /* what size? */ );
public ScheduledFuture<?> addTask(long delay, long interval) {
    return scheduler.scheduleAtFixedRate(new Runnable() {
        public void run() {
            // doing work here
        }
    },
    delay,
    interval,
    TimeUnit.MILLISECONDS);
}
The number of tasks the scheduler gets depends solely on the user of my program. Normally, as far as I know, it is a good idea to set the thread pool size to the number of CPU threads, so that each CPU or CPU thread executes one task at a time, because this should give the best throughput. But what should I do if the tasks involve I/O (as they do in my program)? The tasks in my program fetch data from a server on the internet and save it in a database, so most of the time they are waiting for the data to come in (i.e. idle). What would be the best solution for this problem?

It really depends on the exact context:
How many tasks will be added? (You've said it's up to the user, but do you have any idea? Do you know this before you need to create the pool?)
How long does each of them take?
Will they be doing any intensive work?
If they're all saving to the same database, is there any concurrency issue there? (Perhaps you want to have several threads fetching from different servers and putting items in a queue, but only one thread actually storing data in the database?)
So long as you don't get "behind", how important is the performance anyway?
Ultimately I strongly suspect you'll need to benchmark this yourself - it's impossible to give general guidance without more information, and even with specific numbers it would be mostly guesswork. Hard data is much more useful :)
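Note that the argument to newScheduledThreadPool is the number of core threads to keep in the pool even when they are idle. ScheduledThreadPoolExecutor acts as a fixed-size pool with an unbounded queue, so it never grows beyond that number: the value you pick is also the upper limit on how many tasks can run at once.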
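The suggestion above of several fetching threads feeding a single database-writing thread could look roughly like the sketch below. Everything in it is a stand-in: the class FetchAndStore is invented, a string is produced instead of a real network fetch, and a list plays the role of the database. The tasks are also one-shot rather than periodic, to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the asker's code.
class FetchAndStore {
    // Fetches every source on a pool of fetcherThreads threads while a single
    // writer thread drains the queue, so only one thread touches the "database".
    static List<String> run(List<String> sources, int fetcherThreads) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> stored = new ArrayList<>(); // stands in for the database
        ExecutorService fetchers = Executors.newFixedThreadPool(fetcherThreads);
        for (String source : sources) {
            // A real task would block on network I/O here.
            fetchers.execute(() -> queue.add("data from " + source));
        }
        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < sources.size(); i++) {
                    stored.add(queue.take()); // only this thread writes
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        fetchers.shutdown();
        fetchers.awaitTermination(1, TimeUnit.MINUTES);
        writer.join(); // after join(), the writer's updates are visible here
        return stored;
    }
}
```

Because the fetchers spend most of their time waiting, fetcherThreads can reasonably be larger than the number of CPUs; the right value is something to measure rather than guess.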

ScheduledThreadPoolExecutor and corePoolSize 0?

I'd like to have a ScheduledThreadPoolExecutor which also stops the last thread if there is no work to do, and creates (and keeps threads alive for some time) if there are new tasks. But once there is no more work to do, it should again discard all threads.
I naively created it as new ScheduledThreadPoolExecutor(0), but as a consequence no thread is ever created and no scheduled task is ever executed.
Can anybody tell me if I can achieve my goal without writing my own wrapper around the ScheduledThreadPoolExecutor?
Thanks in advance!

Actually you can do it, but it's non-obvious:
Create a new ScheduledThreadPoolExecutor.
In the constructor, set the core threads to the maximum number of threads you want.
Set the keepAliveTime of the executor.
Finally, allow the core threads to time out.
m_Executor = new ScheduledThreadPoolExecutor(16);
m_Executor.setKeepAliveTime(5, TimeUnit.SECONDS);
m_Executor.allowCoreThreadTimeOut(true);
This works only with Java 6 or later, though.
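Put together as a self-contained sketch, the steps above look like this. The class name ShrinkingScheduler and the create method are invented for the example; the three executor calls are the ones from the answer.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
class ShrinkingScheduler {
    // Creates a scheduler with up to maxThreads threads, all of which are
    // discarded once they have been idle for keepAliveMillis.
    static ScheduledThreadPoolExecutor create(int maxThreads, long keepAliveMillis) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(maxThreads);
        // The keep-alive time must be positive before core timeout is enabled.
        executor.setKeepAliveTime(keepAliveMillis, TimeUnit.MILLISECONDS);
        executor.allowCoreThreadTimeOut(true);
        return executor;
    }
}
```

After the last task has finished and the keep-alive time has passed, getPoolSize() on the returned executor should drop back to 0.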

I suspect that nothing provided in java.util.concurrent will do this for you, just because if you need a scheduled execution service, then you often have recurring tasks to perform. If you have a recurring task, then it usually makes more sense to just keep the same thread around and use it for the next recurrence of the task, rather than tearing down your thread and having to build a new one at the next recurrence.
Of course, a scheduled executor could be used for inserting delays between non-recurring tasks, or it could be used in cases where resources are so scarce and recurrence is so infrequent that it makes sense to tear down all your threads until new work arrives. So, I can see cases where your proposal would definitely make sense.
To implement this, I would consider trying to wrap a cached thread pool from Executors.newCachedThreadPool together with a single-threaded scheduled executor service (i.e. new ScheduledThreadPoolExecutor(1)). Tasks could be scheduled via the scheduled executor service, but the scheduled tasks would be wrapped in such a way that rather than having your single-threaded scheduled executor execute them, the single-threaded executor would hand them over to the cached thread pool for actual execution.
That compromise would give you a maximum of one thread running when there is absolutely no work to do, and it would give you as many threads as you need (within the limits of your system, of course) when there is lots of work to do.
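A rough sketch of that wrapper might look like the following. The class name HandOffScheduler is invented, and only one-shot delayed tasks are covered to keep it short.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper, not part of the JDK.
class HandOffScheduler {
    private final ScheduledExecutorService timer = new ScheduledThreadPoolExecutor(1);
    private final ExecutorService workers = Executors.newCachedThreadPool();

    // The single timer thread only forwards the task when the delay expires;
    // the cached pool creates or reuses a thread to actually run it.
    ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit) {
        return timer.schedule(() -> workers.execute(task), delay, unit);
    }

    void shutdown() {
        timer.shutdownNow(); // drop tasks whose delay has not expired yet
        workers.shutdown();  // let tasks that are already running finish
    }
}
```

One consequence of this design is that the returned ScheduledFuture completes when the task has been handed over, not when the task itself has finished, so it cannot be used to wait for the result.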

Reading the ThreadPoolExecutor javadocs might suggest that Alex V's solution is okay. However, doing so will result in unnecessarily creating and destroying threads, nothing like a cached thread pool. The ScheduledThreadPool is not designed to work with a variable number of threads. Having looked at the source, I'm sure you'll end up spawning a new thread almost every time you submit a task. Joe's solution should work even if you are ONLY submitting delayed tasks.
PS. I'd monitor your threads to make sure you're not wasting resources in your current implementation.

This problem is a known bug in ScheduledThreadPoolExecutor (Bug ID 7091003) and has been fixed in Java 7u4. Though looking at the patch, the fix is that "at least one thread is started even if corePoolSize is 0."

Java daemon - handling shutdown requests

I'm currently working on a daemon that will be doing A LOT of different tasks. It's multi threaded and is being built to handle almost any kind of internal-error without crashing. Well I'm getting to the point of handling a shutdown request and I'm not sure how I should go about doing it.
I have a shutdown hook setup, and when it's called it sets a variable telling the main daemon loop to stop running. The problem is, this daemon spawns multiple threads and they can take a long time. For instance, one of these threads could be converting a document. Most of them will be quick (I'm guessing under 10 seconds), but there will be threads that can last as long as 10+ minutes.
What I'm thinking of doing right now is this: when the shutdown hook is called, first notify all threads that a shutdown has been requested, then loop for about 5 seconds on ThreadGroup.activeCount() with a 500 ms (or so) sleep (all these threads are in a ThreadGroup). The threads will then have to clean up and shut down immediately, no matter what they're doing.
Does anyone have any other suggestions? I'm interested in what a daemon like MySQL, for instance, does when it is told to stop: it stops instantly. What happens if, say, 10 very slow queries are running at that moment? Does it wait, or does it just end them? Servers are really quick, so there really isn't any kind of operation that I shouldn't be able to do in less than a second. You can do A LOT in 1000 ms nowadays.
Thanks

The java.util.concurrent package provides a number of utilities, such as ThreadPoolExecutor (along with various specialized types of other Executor implementations from the Executors class) and ThreadPoolExecutor.awaitTermination(), which you might want to look into - as they provide the same exact functionality you are looking to implement. This way you can concentrate on implementing the actual functionality of your application/tasks instead of worrying about things like thread and task scheduling.

Are your thread jobs amenable to interruption via Thread#interrupt()? Do they mostly call on functions that themselves advertise throwing InterruptedException? If so, then the aforementioned java.util.concurrent.ExecutorService#shutdownNow() is the way to go. It will interrupt any running threads and return the list of jobs that were never started.
Similarly, if you hang on to the Futures produced by ExecutorService#submit(), you can use Future#cancel(boolean) and pass true to request that a running job be interrupted.
Unless you're calling on code out of your control that swallows interrupt signals (say, by catching InterruptedException without calling Thread.currentThread().interrupt()), using the built-in cooperative interruption facility is a better choice than introducing your own flags to approximate what's already there.
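A minimal sketch of such a shutdown sequence, modelled on the two-phase pattern from the ExecutorService documentation, is shown below. The class GracefulShutdown, the method stop and the grace period parameter are assumptions made for this example.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
class GracefulShutdown {
    // Returns true if the pool terminated within the two grace periods.
    static boolean stop(ExecutorService pool, long graceSeconds) {
        pool.shutdown(); // stop accepting new jobs, let running ones finish
        try {
            if (!pool.awaitTermination(graceSeconds, TimeUnit.SECONDS)) {
                // Interrupt running jobs and collect the ones that never started.
                List<Runnable> neverStarted = pool.shutdownNow();
                System.err.println(neverStarted.size() + " queued jobs were dropped");
                return pool.awaitTermination(graceSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

It could be registered with Runtime.getRuntime().addShutdownHook(new Thread(() -> GracefulShutdown.stop(pool, 5))). Jobs that ignore interruption will still keep running after shutdownNow(), which is why the long-running conversions need to check for interrupts.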
