Are there any frameworks available in Java or .NET to execute long-running tasks?
This framework should give me the flexibility to plug in my own implementation to execute the job, and also the ability to control the run-time, such as the number of tasks that execute and load balancing of execution.
I would like to hear different approaches to solving the problem.
In Java, you can try the Java 5 Executor framework, Spring Batch or Quartz, depending on your needs.
In Java:
ThreadPoolExecutor offers you a thread pool and lets you execute tasks in it. It also offers means to determine current active tasks, number of already executed tasks etc.
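As a rough illustration of the point above, here is a minimal sketch of a fixed-size ThreadPoolExecutor that caps how many tasks run at once and reports its own counters. The class name PoolDemo, the pool size and the sleep standing in for real work are all assumptions made for the example, and the lambda syntax assumes Java 8 or later.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of any framework.
final class PoolDemo {

    /** Runs taskCount short tasks on a fixed-size pool and returns the completed-task count. */
    static long runTasks(int poolSize, int taskCount) {
        // Core and maximum size are equal, so at most poolSize tasks run concurrently;
        // the rest wait in the unbounded queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());

        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(20); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Approximate snapshot of the pool while work is still in flight.
        System.out.println("active=" + pool.getActiveCount()
                + " queued=" + pool.getQueue().size());

        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return pool.getCompletedTaskCount();
    }

    public static void main(String[] args) {
        System.out.println("completed=" + runTasks(4, 10));
    }
}
```

Note that this only limits concurrency inside one JVM; balancing load across machines is outside what a thread pool does.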
Windows Workflow Foundation might work for you in .NET, depending on what exactly you mean by "Long Running Task". It has the capability of recording the task's state and restoring it at a later time, allowing the task to survive a reboot or other interruption.
The Task Parallel Library in .NET 4.0 has facilities for working with parallel, asynchronous and long-running tasks.
See http://msdn.microsoft.com/en-us/library/dd460717.aspx
In .NET, for "long running tasks" it would be Windows Workflow Foundation. And for "ability to control the run-time like the number of tasks that execute and load balancing of execution" it would be the Task Parallel Library. You might need to evaluate your problem statement and see whether either one of these, or a combination of both, can solve your problem.
Let's say I have some unit of work that needs to get done and I want to do it asynchronously relative to the rest of my application because it can take a long time e.g. 10 seconds to 2 minutes. To accomplish this I'm considering two options:
Schedule a Quartz job with a simple trigger set to fire only once and as soon as possible.
Create a Runnable instance, hand it off to a Thread, and call start().
In the context of the above I have the following questions:
What does using the Quartz job get me that the thread doesn't have?
What does using the Runnable get me that using the Quartz job doesn't?
In terms of best practices, what criteria ought to be used for deciding between Quartz jobs and Runnables for this use case?
With Quartz you get many features already well implemented, such as:
transaction management of job execution
job persistence, so that you know the status of running jobs
clustering support
scheduling control; even if you only need a simple trigger now, the option is there
Without Quartz you have to handle all of this yourself, and some of these issues can get complicated.
Starting a new thread:
lightweight: no job persistence, no Quartz API, etc.
your application runs without an extra dependency (Quartz)
one fewer source of errors (Quartz itself)
It depends on what kind of job you want to start, and on whether other features of your application require job scheduling too.
If your only concern is asynchronous execution, you can just start a thread. If there are other concerns, such as clustering, you may consider using Quartz.
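For the plain-thread route, a minimal sketch might look like the following. The class and method names are hypothetical, and the join() is there only so the demo can report which thread did the work; a real caller would normally carry on without waiting. The key detail is that start() spawns the new thread, whereas calling run() directly would execute the work on the caller's thread.

```java
// Hypothetical demo class showing a one-off background thread.
final class PlainThread {

    /** Runs the work on a new thread and returns the name of the thread that executed it. */
    static String runInBackground(Runnable work) {
        final String[] ranOn = new String[1];
        Thread worker = new Thread(() -> {
            ranOn[0] = Thread.currentThread().getName();
            work.run();
        }, "one-off-worker");

        worker.start(); // start() creates the new thread; run() would execute inline
        try {
            worker.join(); // demo only: wait so the result can be inspected
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn[0];
    }

    public static void main(String[] args) {
        String name = runInBackground(() -> System.out.println("working..."));
        System.out.println("work ran on thread: " + name);
    }
}
```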
I would not add Quartz to a project just for this capability, but if I already had Quartz installed and was already using it, then, yea, even for a one off I would use a one time immediate Quartz job.
The reason is simply consistency. Quartz already manages all of the details of the thread and job process. A single thread is Simple, but we also know from experience that even a single thread can be Not Simple.
Quartz wraps the thread into a high-level concept (the Job), with everything that brings with it.
From a code base point of view you get the consistency of all your jobs having the same semantics; your developers don't have to "shift gears" "just for a thread". Later, they may "just do a thread" and run into a complexity that Quartz manages painlessly.
The overhead of the abstraction and conditions that make up a Quartz job is not significant enough to justify using a bare thread in this case just because "it's lighter weight".
Consistency and commonality are important aspects to a codebase. I would stick to the single abstraction and leverage as much as I can.
If it's a one-time job and there are no additional requirements, like job persistence, scheduling, etc., then you're better off with regular threads.
Quartz jobs are much more robust than regular threads and support scheduling, job persistence, and so on: all the other stuff that you probably don't need.
There is no need to set anything up with Runnables and Threads.
If you think there might be more jobs than this (scheduled jobs, delayed jobs, etc.), you have two options. One is to go with Java's standard Executors: set up a thread pool and use it to run your jobs. The other is Spring's TaskExecutor abstraction, which lets you easily switch between Quartz and Executors when you need to. But that seems like overkill for a one-time gig.
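To make the Executors option concrete, here is a small sketch that submits one long-running unit of work and collects its result through a Future. The names OneOffJob and slowReport are invented for the example, and the pool is created and shut down inside the method only to keep the sketch self-contained; an application would more likely keep one long-lived pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; names are illustrative only.
final class OneOffJob {

    /** Stand-in for a unit of work that takes a while. */
    static String slowReport(String name) throws InterruptedException {
        Thread.sleep(50);
        return "report:" + name;
    }

    /** Submits the work to an executor and waits (with a timeout) for its result. */
    static String runAsync(final String name) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> job = () -> slowReport(name);
            Future<String> pending = pool.submit(job);
            // The caller is free to do other work here before asking for the result.
            return pending.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("job did not complete", e);
        } finally {
            pool.shutdown(); // let the worker thread exit
        }
    }

    public static void main(String[] args) {
        System.out.println(runAsync("daily"));
    }
}
```

Compared with a bare Thread, the Future gives the caller a handle for the result, failure and timeout of the job.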
For an immediate one-time task, a Thread will be enough.
But there are fuller-featured options available, like Quartz or the Spring scheduler.
In a Java web application (servlets/Spring MVC) running on Tomcat, is it possible to run a cron-job-type service?
e.g. every 15 minutes, purge the log database.
Can you do this in a way that is container-independent, or does it have to be run using Tomcat or some other container?
Please specify whether the method is guaranteed to run at a specific time, or runs every 15 minutes but may be reset etc. if the application recycles (that's how it is in .NET if you use timers).
As documented in Chapter 23. Scheduling and Thread Pooling, Spring has scheduling support through integration classes for the Timer and the Quartz Scheduler (http://www.quartz-scheduler.org/). For simple needs, I'd recommend going with the JDK Timer.
Note that Java schedulers are usually used to trigger Java business oriented jobs. For sysadmin tasks (like the example you gave), you should really prefer cron and traditional admin tools (bash, etc).
If you're using Spring, you can use the built-in Quartz or Timer hooks. See http://static.springsource.org/spring/docs/2.5.x/reference/scheduling.html
It will be container-specific. You can do it in Java with Quartz or just using Java's scheduling concurrent utils (ScheduledExecutorService) or as an OS-level cron job.
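A minimal sketch of the ScheduledExecutorService approach follows. The helper runPeriodically and the class name PeriodicPurge are hypothetical; the helper stops after a fixed number of runs only so that the example terminates, whereas a real purge job would simply keep the scheduler alive for the lifetime of the application.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
final class PeriodicPurge {

    /** Runs task at a fixed rate until it has run `times` times; returns the number of runs. */
    static int runPeriodically(Runnable task, long period, TimeUnit unit, int times) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(times);

        scheduler.scheduleAtFixedRate(() -> {
            if (done.getCount() == 0) {
                return; // enough runs already; ignore any late firing
            }
            task.run();
            runs.incrementAndGet();
            done.countDown();
        }, 0, period, unit);

        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow(); // cancel further runs
        }
        return runs.get();
    }

    public static void main(String[] args) {
        // In an application this might be scheduleAtFixedRate(purge, 0, 15, TimeUnit.MINUTES).
        int n = runPeriodically(() -> System.out.println("purging log table (placeholder)"),
                100, TimeUnit.MILLISECONDS, 3);
        System.out.println("runs=" + n);
    }
}
```

The schedule lives only in memory, so it starts over whenever the application is restarted or redeployed; a fixed wall-clock time would need cron or a cron-style Quartz trigger.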
Every 15 minutes seems extreme. Generally I'd also advise you only to truncate/delete log files that are no longer being written to (and they're generally rolled over overnight).
Jobs are batch-oriented, started either by a manual trigger or cron-style (as you seem to want).
Still, I don't get the relation between your webapp and a cron-style job. The only webapp use case I can think of is that you want an HTTP endpoint to trigger a job (but this contradicts your statement about being 'cron-style').
In general, use a dedicated framework that addresses the problem area of batch jobs. I can recommend Quartz.
We need to run a function periodically in a Java web application.
How can we call a function of some class periodically?
Is there any way to call a function when some event occurs, like high load on the server? And what is crontab? Does it work periodically?
To call something periodically, see TimerTask
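A minimal sketch of the Timer/TimerTask approach, assuming a hypothetical class TimerDemo; the latch and the fixed number of ticks exist only so the example finishes, and the period is arbitrary.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names are illustrative only.
final class TimerDemo {

    /** Fires a TimerTask every periodMillis; returns true once it has run `times` times. */
    static boolean tick(long periodMillis, int times) {
        Timer timer = new Timer("periodic-demo", true); // daemon thread, will not block JVM exit
        final CountDownLatch remaining = new CountDownLatch(times);

        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                System.out.println("tick"); // the periodic work goes here
                remaining.countDown();
            }
        }, 0L, periodMillis);

        try {
            return remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // stop the timer thread
        }
    }

    public static void main(String[] args) {
        System.out.println("finished=" + tick(100, 3));
    }
}
```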
If you need something more robust you can use Quartz
As for crontab, it is the scheduling tool on Unix machines.
For calling methods when the server has high load, you have at least two possible approaches. Your app server might have management hooks that would allow you to monitor its behaviour and take programmatic action. Alternatively, if you have some system monitoring capability (e.g. Tivoli or OpenView) that generates "events", it should not be too hard to deliver such events as (for example) JMS messages and have your server pick them up.
However, you might want to explain a bit more about what you want to achieve. Adaptive application behaviour might be quite tricky to get right.
If you want to run something periodically, don't do it in the webserver. That would be a very wrong approach IMO. It's better to use cron instead, if you are on a Unix-like operating system. Windows servers offer similar functionality.
We need to run a function periodically in a Java web application.
(1) So look in your deployment descriptor (web.xml) and define a listener to run at startup time.
How can we call a function of some class periodically?
(2) Create a Timer in the listener.
Is there any way to call a function when some event occurs, like high load on the server?
(3) Run some Threads to check for system conditions that are accessible from Java, or even run system programs (uptime, etc.) and parse the output.
crontab could be a way, but each execution would start another JVM, whereas the big advantage of servlet containers is that everything runs in the same JVM.
Don't forget about java.util.concurrent - it has a lot of classes for scheduling, e.g. ScheduledThreadPoolExecutor, if you need more than a simple Timer.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/package-summary.html
There is also a backport of it to Java 1.4: http://backport-jsr166.sourceforge.net.
If you already use Spring, you might want to have a look at Spring's task execution framework - using @Scheduled and @Async for annotating methods as tasks and implementing the functionality in a Processor that delegates the actual work to a Worker, as described in:
http://blog.springsource.com/2010/01/05/task-scheduling-simplifications-in-spring-3-0/
The advantage is that you can define timers using a cron-like syntax in your spring context and you don't need anything special to set up tasks, it is also well integrated into Java EE applications and should play well with web servers (which custom Threads tend not to do always).
How to call function of some class periodically?
There are several solutions: a Timer, a Java cron implementation like cron4j, Quartz, or even the EJB Timer API. Choosing one or the other highly depends on the context: the type of application, the technologies used, the number of jobs, etc.
Is there any way that call function when some event occurred like high load in server and so on
You could maybe use JMX in your jobs to access and monitor information and trigger an action under some specific condition. But this is more of a pull mode, not event-based.
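As one possible reading of the pull approach, the sketch below polls the platform OperatingSystemMXBean from java.lang.management and compares the load average per CPU against a threshold. The class name LoadCheck, the threshold value and the decision rule are assumptions for illustration; a periodic job would call isOverloadedNow and act on the result.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical demo class; the overload rule is an illustrative assumption.
final class LoadCheck {

    /** Returns true when the load average per CPU exceeds the given threshold. */
    static boolean isOverloaded(double loadAverage, int cpus, double thresholdPerCpu) {
        if (loadAverage < 0) {
            return false; // a negative value means the load average is unavailable on this platform
        }
        return loadAverage / cpus > thresholdPerCpu;
    }

    /** Polls the JVM's operating-system MXBean for the current values. */
    static boolean isOverloadedNow(double thresholdPerCpu) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return isOverloaded(os.getSystemLoadAverage(),
                os.getAvailableProcessors(), thresholdPerCpu);
    }

    public static void main(String[] args) {
        if (isOverloadedNow(1.5)) {
            System.out.println("high load: trigger the special action here");
        } else {
            System.out.println("load is normal (or not reported)");
        }
    }
}
```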
I'm drawing up a design for a system to do daily business functions for my company. It will consist of an Oracle 10g database with PL/SQL packages and a Java-based web application. All of this is running on a Solaris 10 server. Aside from handling transactions from the web interface, scheduled tasks need to run on the database to run calculations, load data, etc.
This is a redesign of a legacy system that currently controls everything with a plethora of cron jobs. Given the task of redesigning it, would you do it differently? I know Oracle has its own task scheduler, but the DBA argues that he would rethink using it because if the database is down or offline for some reason, it can't send alerts or log errors of any kind. The cron jobs currently have the ability to send SMS messages or emails should one of the tasks fail. Another option would be to have the web application do it somehow.
What do you suggest?
Are all the scheduled tasks related to the database? If so, then your DBA's objection is irrelevant: you don't want to run the jobs when the database is offline for planned downtime, and the DBA ought to have something in place to alert them if the database is down for unplanned reasons, rather than relying on a signal from a failing cron job.
If you have jobs which run on other parts of the architecture without touching the database then certainly an external scheduler makes sense. There are plenty of commercial products, but if you want to go for FOSS then you probably ought to look at Quartz.
Having used both cron and the Oracle job scheduler, I have always found cron a lot more reliable and easier to use and understand. It can also do more (it interfaces with the entire OS, not just Oracle). I would choose cron.
My rule of thumb for scheduled jobs is consistency. Since you've already got a lot of infrastructure in place like alerting I'd stick with cron.
I already asked a separate question on how to create time triggered event in Java. I was introduced to Quartz.
At the same time, I also googled it, and people are saying cron in Unix is a neat solution.
Which one is better? What are the pros and cons?
Some specification of the system:
* Unix OS
* program written in Java
* I have a task queue with 1000+ entries, for each timestamp, up to 500 tasks might be triggered.
Using cron seems to add another entry point into your application, while Quartz would integrate into it. So you would be forced to deal with some inter-process communication if you wanted to pass some information to/from the process invoked from cron. In Quartz you simply (hehe) run multiple threads.
cron is platform dependent, Quartz is not.
Quartz may allow you to reliably make sure a task is run at the given time or some time after if the server was down for some time. Pure cron wouldn't do it for you (unless you handle it manually).
Quartz has a more flexible language for expressing occurrences (when the tasks should be fired).
Consider the memory footprint. If your single tasks share nothing or little, then it might be better to run them from the operating system as a separate process. If they share a lot of information, it's better to have them as threads within one process.
Not quite sure how you could handle the clustering in the cron approach. Quartz might be used with Terracotta following the scaling out pattern (I haven't tried it, but I believe it's doable).
The plus for cron is that any sysadmin knows how to use it and it's documented in many places. If cron will do the job then it would really be the preferred solution.