I'm writing a Java-based app (not a web app) that should be able to run standalone, without any container. The tasks it carries out are below:
Windows scheduler fires off either Quartz or a simple POJO
pick up file(s) at midnight
import the data into DB
move the files over from original destination to another drive
Now, my dilemma: from what I've been reading, it appears Quartz needs a web container to function.
Is that correct, and what would be the simplest and most durable solution?
To answer your question: Quartz does not need a web container; it can run in any Java application. See the Quartz Quickstart Guide for how to configure Quartz.
If you use Quartz, the Windows scheduler shouldn't be necessary, but this implies that your Java application is running constantly.
I think Quartz has the advantage that you can configure the scheduling in one place, inside your application, which makes you independent of the OS-specific scheduling mechanism.
But none of these advantages are relevant if your application is not running all the time.
On the other hand, if you want a fire-and-forget application that runs, does its work, and then quits, you will be on the safe side delegating the scheduling to the operating system your application runs on.
So, for this specific context, I think using the operating system's scheduling mechanism is the better option.
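To make the fire-and-forget option concrete, here is a minimal sketch of such a run-once application, which the Windows scheduler could start at midnight. The class name, the two directory arguments, and the importIntoDatabase placeholder are all made up for illustration; the real import would use JDBC against your database.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class NightlyImport {

    // Hypothetical placeholder for the real database import (e.g. JDBC batch inserts).
    public static void importIntoDatabase(List<String> lines) {
        System.out.println("Imported " + lines.size() + " line(s)");
    }

    // Imports every regular file found in 'inbox', then moves it to 'archive'.
    // Returns the number of files processed.
    public static int processAll(Path inbox, Path archive) throws IOException {
        Files.createDirectories(archive);
        int processed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inbox)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories
                }
                importIntoDatabase(Files.readAllLines(file));
                // Move only after the import succeeded, so a failed file stays in the inbox.
                Files.move(file, archive.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: NightlyImport <inbox-dir> <archive-dir>");
            System.exit(2);
        }
        int count = processAll(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Processed " + count + " file(s)");
    }
}
```

The process exits as soon as main returns, so nothing stays resident between runs; the scheduled task only needs to invoke java with the two directories.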
I developed jobs in Spring Batch to replace a data loading process that was previously done using bash scripts.
My company's scheduler of choice is control-m. The old bash scripts were triggered from control-m on file arrival using a file watcher.
For reasons beyond my control, we still need to use control-m. Using Spring Boot or any other framework tied to a webserver is not a possibility.
The safe approach seems to be to package the spring batch application as a jar and trigger from control-m the job using "java -jar", but this doesn't seem the right way considering we have 20+ jobs.
I was wondering if it's possible to trigger the app once (like a daemon) and communicate with it using JMS or any other approach. In this way we wouldn't need to spawn multiple JVMs (considering jobs might run simultaneously).
I'm open to different solutions. Feel free to teach me the best way to solve this use case.
The safe approach seems to be to package the spring batch application as a jar and trigger from control-m the job using "java -jar", but this doesn't seem the right way considering we have 20+ jobs.
IMO, running jobs on demand is the way to go, because it is the most efficient use of resources. Keeping a JVM running 24/7 just to run a batch job once in a while wastes resources, as the JVM sits idle between scheduled runs. If your concern is the packaging side of things, you can package all jobs in a single jar and use the Spring Boot property spring.batch.job.names at launch time to specify which job to run.
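If Spring Boot is not an option, the same "one jar, pick the job at launch time" idea can be sketched in plain Java. This is not Spring Batch code: the registry, the job names, and the exit codes below are assumptions for illustration, and in a real application each entry would launch the corresponding batch job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JobLauncherMain {

    // Hypothetical job registry; names and bodies are placeholders.
    static final Map<String, Runnable> JOBS = new LinkedHashMap<>();
    static {
        JOBS.put("importCustomers", () -> System.out.println("importing customers"));
        JOBS.put("importOrders", () -> System.out.println("importing orders"));
    }

    // Runs the named job and returns a process exit code:
    // 0 = success, 1 = job failed, 2 = unknown job name.
    public static int run(String jobName) {
        Runnable job = JOBS.get(jobName);
        if (job == null) {
            System.err.println("Unknown job '" + jobName + "', expected one of " + JOBS.keySet());
            return 2;
        }
        try {
            job.run();
            return 0;
        } catch (RuntimeException e) {
            e.printStackTrace();
            return 1;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: JobLauncherMain <jobName>");
            System.exit(2);
        }
        // A non-zero exit code lets the external scheduler see that the run failed.
        System.exit(run(args[0]));
    }
}
```

Each scheduler entry then differs only in the job name passed on the command line, so 20+ jobs do not mean 20+ jars.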
I was wondering if it's possible to trigger the app once (like a daemon) and communicate with it using JMS or any other approach. In this way we wouldn't need to spawn multiple JVMs (considering jobs might run simultaneously).
I would recommend exposing a REST endpoint in your JVM and implementing a controller that launches batch jobs on demand. You can find an example in the Running Jobs from within a Web Container section. In this case, the job name and its parameters could be passed in as request parameters.
Another way is to combine Spring Batch and Spring Integration to launch jobs using JMS requests (JobLaunchRequest). This approach is explained in detail, with code examples, in the Launching Batch Jobs through Messages section.
In addition to the helpful answer from Mahmoud, Control-M isn't great with daemons. Sure, you can launch a daemon but anything running for a substantial length of time (i.e. into several weeks and beyond) is prone to error and you can often end up with daemons that are running that Control-M is no longer "aware" of (e.g. if the system has issues that cause the Control-M Agent to assume the daemon job has failed and then launches another one).
When I had no other methods available I used to add daemons as batch jobs in Control-M but there was an overhead in additional jobs that checked for multiple daemons, did the stop/starts and various housekeeping tasks. Best avoided if possible.
We have a web application that receives incoming data via RESTful web services running on Jersey/Tomcat/Apache/PostgreSQL. Separately from this web-service application, we have a number of repeating and scheduled tasks that need to be carried out. For example, purging different types of data at different intervals, pulling data from external systems on varying schedules, and generating reports on specified days and times.
So, after reading up on Quartz Scheduler, I see that it seems like a great fit.
My question is: should I design my Quartz-based scheduling application to run in Tomcat (via QuartzInitializerListener), or build it into a standalone application to run as a Linux daemon (e.g., via Apache Commons Daemon or the Tanuki Java Service Wrapper)?
On the one hand, it strikes me as counterintuitive to use Tomcat to host an application that is not geared towards receiving http calls. On the other hand, I haven't used Apache Commons Daemon or the Java Service Wrapper before, so maybe running inside Tomcat is the path of least resistance.
Are there any significant benefits or dangers with either approach that I should be aware of? Our core modules already take care of data access, logging, etc., so I don't see that those services are much of a factor either way.
Our scheduling will be data driven, so our Quartz-based scheduler will read the relevant data from PostgreSQL. However, if we run the scheduling application within Tomcat, is it possible/reasonable to send messages to our application via http calls to Tomcat? Finally, fwiw, since our jobs will be driven by our existing application data, I don't see any need for the Quartz JDBCJobStore.
To run a standalone Java application as a Linux daemon, simply end the java command with an ampersand (&) so that it runs in the background, and put it in an Upstart script, for example.
As for the design: in this case I would go for whatever is easier to maintain. And it looks like running an app in Tomcat is already familiar. One benefit that comes to mind is that configuration files (for the database for example) can be shared/re-used so that only one set of configuration files needs to be maintained.
However, if you think the scheduled tasks can have a significant impact on resource usage, then you might want to run the tasks on a separate (virtual) machine. Since the timing of the tasks is data driven, it is hard to predict the exact load. E.g. it could happen that all the different tasks are executed at the same time (worst case/highest load scenario).
Also consider the complexity of the software for the scheduled tasks and the related risk of nasty bugs: if you think there is a low chance of nasty bugs, then running the tasks in Tomcat next to the web service is a good option; if not, run the tasks as a separate application.
Lastly, consider the infrastructure in general: production-line systems (providing a continuous flow of data processing critical to the business) should be separate from non-production-line systems. E.g. if the reports are created an hour later than usual and the business is largely unaffected, then this is non-production line. But if the web service goes down and the business is immediately affected, then this is production line. Purging data and pulling updates is a bit of a gray area: it depends on what happens if these tasks are performed late or not at all.
In a java web application (servlets/spring mvc), using tomcat, is it possible to run a cron job type service?
e.g. every 15 minutes, purge the log database.
Can you do this in a way that is container independent, or it has to be run using tomcat or some other container?
Please specify if the method is guaranteed to run at a specific time or one that runs every 15 minutes, but may be reset etc. if the application recycles (that's how it is in .net if you use timers)
As documented in Chapter 23. Scheduling and Thread Pooling, Spring has scheduling support through integration classes for the Timer and the Quartz Scheduler (http://www.quartz-scheduler.org/). For simple needs, I'd recommend going with the JDK Timer.
Note that Java schedulers are usually used to trigger Java business oriented jobs. For sysadmin tasks (like the example you gave), you should really prefer cron and traditional admin tools (bash, etc).
If you're using Spring, you can use the built-in Quartz or Timer hooks. See http://static.springsource.org/spring/docs/2.5.x/reference/scheduling.html
It will be container-specific. You can do it in Java with Quartz or just using Java's scheduling concurrent utils (ScheduledExecutorService) or as an OS-level cron job.
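On the "guaranteed time vs. every 15 minutes" part of the question, here is a minimal sketch using ScheduledExecutorService that aligns the first run to the next quarter-hour boundary and then repeats every 15 minutes. The class name and the purge placeholder are made up. The schedule lives in memory only, so it starts over whenever the application restarts, and a run that falls during downtime is simply missed.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class QuarterHourPurge {

    // Time from 'now' until the next wall-clock boundary at :00, :15, :30 or :45.
    // Exactly on a boundary, this returns a full 15 minutes.
    public static Duration untilNextQuarterHour(LocalDateTime now) {
        LocalDateTime hourStart = now.truncatedTo(ChronoUnit.HOURS);
        int nextSlot = now.getMinute() / 15 + 1; // 1..4
        LocalDateTime next = hourStart.plusMinutes(15L * nextSlot);
        return Duration.between(now, next);
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelayMs = untilNextQuarterHour(LocalDateTime.now()).toMillis();
        // Note: if the task throws, scheduleAtFixedRate cancels all later runs,
        // so a real task should catch and log its own exceptions.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("purging log table (placeholder)"),
                initialDelayMs,
                TimeUnit.MINUTES.toMillis(15),
                TimeUnit.MILLISECONDS);
        // The non-daemon scheduler thread keeps the JVM alive until the process is stopped.
    }
}
```

Inside a web application the scheduler would be created at startup and shut down on undeploy instead of living in main.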
Every 15 minutes seems extreme. Generally I'd also advise you only to truncate/delete log files that are no longer being written to (and they're generally rolled over overnight).
Jobs are batch oriented. Either by manual trigger or cron-style (as you seem to want).
Still, I don't get the relation between the webapp and the cron-style job. The only webapp use case I can think of is that you want an HTTP endpoint to trigger a job (but this contradicts your statement about being 'cron-style').
Generally, use a dedicated framework that solves the 'batch jobs' problem area. I can recommend Quartz.
We need to run a function periodically in a Java web application.
How can we call a function of some class periodically?
Is there any way to call a function when some event occurs, such as high load on the server? Also, what is crontab? Does it work periodically?
To call something periodically, see TimerTask
If you need something more robust you can use Quartz
As for crontab, it is the scheduling tool on Unix machines.
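For the Timer/TimerTask route, a minimal sketch could look like the following. The class and method names are made up, and the run limit and timeout exist only so the demo terminates; a real application would keep the Timer for its whole lifetime and cancel it on shutdown.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class PeriodicTimerDemo {

    // Runs 'action' every 'periodMs' milliseconds on a daemon timer thread until it
    // has executed at least 'runs' times. Returns true if that happened within 'timeoutMs'.
    public static boolean runPeriodically(Runnable action, long periodMs, int runs, long timeoutMs)
            throws InterruptedException {
        CountDownLatch remaining = new CountDownLatch(runs);
        Timer timer = new Timer("periodic-demo", true); // true = daemon thread
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                action.run();
                remaining.countDown();
            }
        }, 0, periodMs); // first run immediately, then one run per period
        try {
            return remaining.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            timer.cancel(); // stop the timer thread
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runPeriodically(() -> System.out.println("tick"), 200, 3, 5_000);
    }
}
```

One caveat worth knowing: an uncaught exception in a TimerTask terminates the timer thread, so the task body should handle its own errors.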
For calling methods when the server is under high load, you have at least two possible approaches. Your app server might have management hooks that would allow you to monitor its behaviour and take programmatic action. Alternatively, if you have some system monitoring capability (e.g. Tivoli or OpenView) that generates "events", it should not be too hard to deliver such events as (for example) JMS messages and have your server pick them up.
However, you might want to explain a bit more about what you want to achieve. Adaptive application behaviour can be quite tricky to get right.
If you want to run something periodically, don't do it in the webserver. That would be a very wrong approach IMO. It's better to use cron instead, if you are on a Unix-like operating system. Windows servers offer similar functionality.
we need to run a function periodically in a Java web application
(1) In your deployment descriptor (web.xml), define a listener that runs at application startup.
How can we call a function of some class periodically?
(2) Create a Timer in the listener.
Is there any way to call a function when some event occurs, such as high load on the server
(3) And run some threads that check the system conditions accessible from Java, or even run system programs (uptime, etc.) and parse the output.
crontab could be a way, but each execution would start another JVM, and the main appeal of servlet containers is precisely that everything runs in the same JVM.
Don't forget about java.util.concurrent - it has a lot of classes for scheduling, e.g. ScheduledThreadPoolExecutor, if you need more than a simple Timer.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/package-summary.html
There is also a backport of it to Java 1.4: http://backport-jsr166.sourceforge.net.
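A minimal sketch with ScheduledThreadPoolExecutor, assuming nothing beyond the JDK. The guarded wrapper is my own helper, not part of java.util.concurrent; it is there because a periodic task that throws is silently dropped from the schedule.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PooledSchedulerDemo {

    // Wraps a task so that a RuntimeException is logged instead of propagating.
    // Without this, the executor would cancel all later runs of the failing task.
    public static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("task failed: " + e);
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(2);
        AtomicInteger runs = new AtomicInteger();
        // Fixed delay: each run starts 100 ms after the previous run has finished.
        ScheduledFuture<?> handle = pool.scheduleWithFixedDelay(
                guarded(() -> {
                    int n = runs.incrementAndGet();
                    if (n == 2) {
                        throw new IllegalStateException("simulated failure");
                    }
                    System.out.println("run " + n);
                }),
                0, 100, TimeUnit.MILLISECONDS);

        Thread.sleep(550);    // let it run a few times
        handle.cancel(false); // stop rescheduling without interrupting a running task
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

Compared with a single Timer, the pool lets several periodic tasks run concurrently, so one slow task does not delay the others.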
If you already use Spring, you might want to have a look at Spring's task execution framework - using @Scheduled and @Async for annotating methods as tasks and implementing the functionality in a Processor that delegates the actual work to a Worker, as described in:
http://blog.springsource.com/2010/01/05/task-scheduling-simplifications-in-spring-3-0/
The advantage is that you can define timers using a cron-like syntax in your Spring context, and you don't need anything special to set up tasks. It is also well integrated into Java EE applications and should play well with web servers (which custom threads do not always do).
How to call function of some class periodically?
There are several solutions: a Timer, a Java cron implementation like cron4j, Quartz, or even the EJB Timer API. Choosing one or the other highly depends on the context: the type of application, the technologies used, the number of jobs, etc.
Is there any way that call function when some event occurred like high load in server and so on
You could perhaps use JMX in your jobs to access monitoring information and trigger an action under some specific condition. But this is more of a pull model, not event-based.
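As a rough illustration of that pull model, the JDK's own platform MXBeans can be polled from a periodic job. The class name and the 0.9 per-CPU threshold are arbitrary assumptions, and getSystemLoadAverage() returns a negative value on platforms where the figure is not available (Windows, for instance).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class LoadCheck {

    // True when the load average per CPU exceeds the threshold.
    // A negative load average means "not available" and is treated as not overloaded.
    public static boolean isOverloaded(double loadAverage, int cpus, double thresholdPerCpu) {
        if (loadAverage < 0 || cpus <= 0) {
            return false;
        }
        return loadAverage / cpus > thresholdPerCpu;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // load average over the last minute
        int cpus = os.getAvailableProcessors();
        if (isOverloaded(load, cpus, 0.9)) {
            System.out.println("High load (" + load + "), taking action (placeholder)");
        } else {
            System.out.println("Load is fine or unavailable: " + load);
        }
    }
}
```

Calling this check from any of the periodic mechanisms above gives polling at a fixed interval, not a true event.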
This app must perform connection to a web service, grab data, save it in the database.
Every hour 24/7.
What's the most effective way to create such an app in java?
How should it be run - as a system application or as a web application?
Keep it simple: use cron (or task scheduler)
If that's all you want to do, namely to probe some web service once an hour, do it as a console app and run it with cron.
An app that starts and stops every hour
cannot leak resources
cannot hang (at worst you lose one cycle)
consumes 0 resources 99% of the time
Look at Quartz; it's a scheduling library for Java. They have sample code to get you started.
You'd need that and the JDBC driver for your database of choice.
No web container is required - this can easily be done with a standalone application.
Try the ScheduledExecutorService.
Why not use cron to start the Java application every hour? No need to soak up server resources keeping the Java application active if it's not doing anything the rest of the time; just start it when needed.
If you are intent on doing it in Java, a simple Timer would be more than sufficient.
Create a web page and schedule its execution with one of the many online scheduling services. The majority of them are free, very simple to use and very reliable. Some allow you to create schedules of any complexity, just like in cron, the SQL Server job UI, etc. This saves you a LOT of headaches creating/debugging/maintaining your own scheduling engine, even if it's based on some framework like NCron, Quartz, etc. I'm speaking from my own experience.