Oracle scheduled tasks? (Java)

I'm drawing up a design for a system to handle daily business functions for my company. It will consist of an Oracle 10g database with PL/SQL packages and a Java-based web application, all running on a Solaris 10 server. Aside from handling transactions from the web interface, scheduled tasks need to run on the database to perform calculations, load data, etc.
This is a redesign of a legacy system that currently controls everything with a plethora of cron jobs. Given the task of redesigning it, would you do it differently? I know Oracle has its own task scheduler, but the DBA argues against using it because if the database is down or offline for some reason, it can't send alerts or log errors of any kind. The cron jobs currently have the ability to send SMS messages or emails should one of the tasks fail. Another option would be to have the web application do it somehow.
What do you suggest?

Are all the scheduled tasks related to the database? If so, then your DBA's objection is irrelevant: you don't want to run the jobs when the database is offline for planned downtime, and the DBA ought to have something in place to alert them if the database is down for unplanned reasons, rather than relying on a signal from a failing cron job.
If you have jobs that run on other parts of the architecture without touching the database, then an external scheduler certainly makes sense. There are plenty of commercial products, but if you want to go the FOSS route then you probably ought to look at Quartz.
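For illustration, here is a minimal sketch of what a Quartz-scheduled job could look like (Quartz 2.x builder API assumed; the job class, names and cron expression are made up for the example, not taken from the question):

    import org.quartz.CronScheduleBuilder;
    import org.quartz.Job;
    import org.quartz.JobBuilder;
    import org.quartz.JobDetail;
    import org.quartz.JobExecutionContext;
    import org.quartz.Scheduler;
    import org.quartz.Trigger;
    import org.quartz.TriggerBuilder;
    import org.quartz.impl.StdSchedulerFactory;

    public class DataLoadScheduler {

        // The unit of work; the body is a placeholder for the real calculation/load logic.
        public static class DataLoadJob implements Job {
            @Override
            public void execute(JobExecutionContext context) {
                // call the data-load / calculation code here
            }
        }

        public static void main(String[] args) throws Exception {
            Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

            JobDetail job = JobBuilder.newJob(DataLoadJob.class)
                    .withIdentity("dataLoad")
                    .build();

            // Hypothetical schedule: 01:00 every day.
            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("dataLoadNightly")
                    .withSchedule(CronScheduleBuilder.cronSchedule("0 0 1 * * ?"))
                    .build();

            scheduler.scheduleJob(job, trigger);
            scheduler.start();
        }
    }

Note that the SMS/email alerting requirement would still have to be built around this, for example inside the job itself or via a Quartz job listener.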

Having used both cron and the Oracle job scheduler, I have always found cron a lot more reliable and easier to use and understand. It can also do more (it interfaces with the entire OS, not just Oracle). I would choose cron.

My rule of thumb for scheduled jobs is consistency. Since you've already got a lot of infrastructure in place, like alerting, I'd stick with cron.

Related

Launch Spring Batch jobs from Control-M

I developed jobs in Spring Batch to replace a data-loading process that was previously done with bash scripts.
My company's scheduler of choice is Control-M. The old bash scripts were triggered from Control-M on file arrival, using a file watcher.
For reasons beyond my control, we still need to use Control-M. Using Spring Boot or any other framework tied to a web server is not a possibility.
The safe approach seems to be to package the Spring Batch application as a jar and trigger the job from Control-M using "java -jar", but this doesn't seem like the right way considering we have 20+ jobs.
I was wondering if it's possible to trigger the app once (like a daemon) and communicate with it using JMS or some other approach. That way we wouldn't need to spawn multiple JVMs (considering jobs might run simultaneously).
I'm open to different solutions. Feel free to teach me the best way to solve this use case.
The safe approach seems to be to package the Spring Batch application as a jar and trigger the job from Control-M using "java -jar", but this doesn't seem like the right way considering we have 20+ jobs.
IMO, running jobs on demand is the way to go, because it is the most efficient use of resources. Having a JVM running 24/7 and making it run a batch job once in a while is a waste of resources, as the JVM will sit idle between scheduled runs. If your concern is the packaging side of things, you can package all the jobs in a single jar and use the Spring Boot property spring.batch.job.names at launch time to specify which job to run.
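As a rough sketch of that single-jar approach (assuming the jobs are packaged as a Spring Boot application, as suggested above; the class name, jar name and job name are invented for the example):

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    // One jar containing all 20+ job definitions; Control-M picks the job at launch time, e.g.
    //   java -jar batch-jobs.jar --spring.batch.job.names=loadCustomersJob run.date=2020-06-01
    @SpringBootApplication
    public class BatchJobsApplication {
        public static void main(String[] args) {
            // Propagate the batch job's exit status so Control-M can detect failures.
            System.exit(SpringApplication.exit(SpringApplication.run(BatchJobsApplication.class, args)));
        }
    }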
I was wondering if it's possible to trigger the app once (like a daemon) and communicate with it using JMS or some other approach. That way we wouldn't need to spawn multiple JVMs (considering jobs might run simultaneously).
I would recommend exposing a REST endpoint in your JVM and implementing a controller that launches batch jobs on demand. You can find an example in the Running Jobs from within a Web Container section. In this case, the job name and its parameters could be passed in as request parameters.
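A minimal sketch of such a controller could look like this (the bean names, mapping and parameter are assumptions for illustration, not taken from the referenced documentation):

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class JobLaunchController {

        private final JobLauncher jobLauncher;
        private final Job dataLoadJob;   // hypothetical job bean

        public JobLaunchController(JobLauncher jobLauncher, Job dataLoadJob) {
            this.jobLauncher = jobLauncher;
            this.dataLoadJob = dataLoadJob;
        }

        @PostMapping("/jobs/data-load")
        public String launch(@RequestParam String runDate) throws Exception {
            // Identifying parameter so each run creates a new job instance.
            JobParameters params = new JobParametersBuilder()
                    .addString("run.date", runDate)
                    .toJobParameters();
            JobExecution execution = jobLauncher.run(dataLoadJob, params);
            return execution.getStatus().toString();
        }
    }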
Another way is to use a combination of Spring Batch and Spring Integration to launch jobs via JMS requests (JobLaunchRequest). This approach is explained in detail, with code examples, in the Launching Batch Jobs through Messages section.
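As a rough idea of the messaging variant, assuming a "jobRequests" channel fed by a JMS inbound adapter (not shown here), the handler below mirrors what Spring Batch Integration's JobLaunchingMessageHandler does:

    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.batch.integration.launch.JobLaunchRequest;
    import org.springframework.integration.annotation.MessageEndpoint;
    import org.springframework.integration.annotation.ServiceActivator;

    @MessageEndpoint
    public class JobLaunchRequestHandler {

        private final JobLauncher jobLauncher;

        public JobLaunchRequestHandler(JobLauncher jobLauncher) {
            this.jobLauncher = jobLauncher;
        }

        // Each incoming JobLaunchRequest (job + parameters) is launched on arrival.
        @ServiceActivator(inputChannel = "jobRequests")
        public JobExecution launch(JobLaunchRequest request) throws Exception {
            return jobLauncher.run(request.getJob(), request.getJobParameters());
        }
    }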
In addition to the helpful answer from Mahmoud, Control-M isn't great with daemons. Sure, you can launch a daemon, but anything running for a substantial length of time (several weeks and beyond) is prone to problems, and you can often end up with daemons still running that Control-M is no longer "aware" of (e.g. if a system issue causes the Control-M Agent to assume the daemon job has failed, and it then launches another one).
When I had no other option I used to add daemons as batch jobs in Control-M, but there was an overhead of additional jobs that checked for multiple daemons, did the stops/starts and handled various housekeeping tasks. Best avoided if possible.

Quartz Scheduler - to run in Tomcat or application jar?

We have a web application that receives incoming data via RESTful web services running on Jersey/Tomcat/Apache/PostgreSQL. Separately from this web-service application, we have a number of repeating and scheduled tasks that need to be carried out. For example, purging different types of data at different intervals, pulling data from external systems on varying schedules, and generating reports on specified days and times.
So, after reading up on Quartz Scheduler, I see that it seems like a great fit.
My question is: should I design my Quartz-based scheduling application to run in Tomcat (via QuartzInitializerListener), or build it as a standalone application to run as a Linux daemon (e.g. via Apache Commons Daemon or the Tanuki Java Service Wrapper)?
On the one hand, it strikes me as counterintuitive to use Tomcat to host an application that is not geared towards receiving http calls. On the other hand, I haven't used Apache Commons Daemon or the Java Service Wrapper before, so maybe running inside Tomcat is the path of least resistance.
Are there any significant benefits or dangers with either approach that I should be aware of? Our core modules already take care of data access, logging, etc., so I don't see that those services are much of a factor either way.
Our scheduling will be data driven, so our Quartz-based scheduler will read the relevant data from PostgreSQL. However, if we run the scheduling application within Tomcat, is it possible/reasonable to send messages to our application via http calls to Tomcat? Finally, fwiw, since our jobs will be driven by our existing application data, I don't see any need for the Quartz JDBCJobStore.
To run a standalone Java application as a Linux daemon, simply end the java command with an & so that it runs in the background, and wrap it in an Upstart script, for example.
As for the design: in this case I would go for whatever is easier to maintain, and it sounds like running an app in Tomcat is already familiar territory. One benefit that comes to mind is that configuration files (for the database, for example) can be shared/re-used, so that only one set of configuration files needs to be maintained.
However, if you think the scheduled tasks can have a significant impact on resource usage, then you might want to run them on a separate (virtual) machine. Since the timing of the tasks is data driven, it is hard to predict the exact load; for example, it could happen that all the different tasks execute at the same time (the worst-case/highest-load scenario).
Also consider the complexity of the software for the scheduled tasks and the related risk of nasty bugs: if you think the chance of nasty bugs is low, then running the tasks in Tomcat next to the web service is a good option; if not, run the tasks as a separate application.
Lastly, consider the infrastructure in general: production-line systems (providing a continuous flow of data processing critical to the business) should be kept separate from non-production-line systems. If the reports are created an hour later than usual and the business is largely unaffected, that is non-production-line. But if the web service goes down and the business is (immediately) affected, that is production-line. Purging data and pulling updates are a bit of a gray area: it depends on what happens if these tasks are not performed, or are performed late.

Divide process workflow between remote workers

I need to develop a Java platform to download and process information from Twitter. The basic idea is to have a centralized controller that generates tasks (an id and keywords, basically) and sends these tasks to remote workers (one per computer). I need to receive a status report periodically to know the status of both the task and the worker. I'll have at least 60 workers (ten times more in the near future).
My initial idea was to use RMI, but I need to communicate in both directions and I don't feel comfortable with RMI. The other approach was to use SSL sockets to send serialized objects, but I would have to handle a lot of errors and add a lot of code to monitor tasks and workers. Some people have told me to use a framework like Spring Batch, GigaSpaces or Quartz.
What do you think would be the best option for this project? For the time being I've read a lot of good things about GigaSpaces, but I can't find a good tutorial on how to implement it, and Quartz seems promising. What do you think? Is it worth using any of them?
It's not easy to recommend a technology based on your question alone. GigaSpaces is certainly up to the job, but so is Spring Batch. Quartz only covers the scheduling part of your question, not so much the remoting and the distribution of workload.
GigaSpaces is a fully fledged application platform for handling scenarios where parallelism, high throughput and scalability are factors. Spring Batch can definitely also do the job, but unlike GigaSpaces it is not an application platform, so you would still need to deploy your application somewhere.
Note, however, that GigaSpaces is a commercial product (a free version is available). Other frameworks that can help you, such as Storm (http://storm-project.net/) and Hazelcast (www.hazelcast.com), also come to mind.
So without clarifying your use case it's hard to give a single answer. It all depends on what exactly you want and how you want to use it, now and in the future.

Is it possible to run a cron job in a web application?

In a Java web application (servlets/Spring MVC) running on Tomcat, is it possible to run a cron-job-type service?
e.g. every 15 minutes, purge the log database.
Can you do this in a way that is container independent, or does it have to be run using Tomcat or some other container?
Please specify whether the method is guaranteed to run at a specific time, or whether it runs every 15 minutes but may be reset if the application recycles (that's how it is in .NET if you use timers).
As documented in Chapter 23. Scheduling and Thread Pooling, Spring has scheduling support through integration classes for the JDK Timer and the Quartz Scheduler (http://www.quartz-scheduler.org/). For simple needs, I'd recommend going with the JDK Timer.
Note that Java schedulers are usually used to trigger Java business oriented jobs. For sysadmin tasks (like the example you gave), you should really prefer cron and traditional admin tools (bash, etc).
If you're using Spring, you can use the built-in Quartz or Timer hooks. See http://static.springsource.org/spring/docs/2.5.x/reference/scheduling.html
It will be container-specific. You can do it in Java with Quartz, or just with Java's scheduling concurrency utilities (ScheduledExecutorService), or as an OS-level cron job.
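For the container-independent, in-process option, a rough sketch with the plain Servlet API and a ScheduledExecutorService could look like the following (the purge method is a placeholder, and the schedule restarts from zero whenever the application is redeployed, so it is "every 15 minutes" rather than "at fixed wall-clock times"):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import javax.servlet.ServletContextEvent;
    import javax.servlet.ServletContextListener;
    import javax.servlet.annotation.WebListener;

    @WebListener
    public class LogPurgeScheduler implements ServletContextListener {

        private ScheduledExecutorService executor;

        @Override
        public void contextInitialized(ServletContextEvent sce) {
            executor = Executors.newSingleThreadScheduledExecutor();
            // First run after 15 minutes, then every 15 minutes.
            executor.scheduleAtFixedRate(this::purgeLogDatabase, 15, 15, TimeUnit.MINUTES);
        }

        @Override
        public void contextDestroyed(ServletContextEvent sce) {
            executor.shutdownNow();   // stop the task when the webapp shuts down
        }

        private void purgeLogDatabase() {
            // placeholder for the actual purge logic
        }
    }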
Every 15 minutes seems extreme. Generally I'd also advise you only to truncate/delete log files that are no longer being written to (and they're generally rolled over overnight).
Jobs are batch oriented: either triggered manually or cron-style (as you seem to want).
Still, I don't get the relation between the webapp and a cron-style job. The only webapp use case I can think of is that you want an HTTP endpoint to trigger a job (but that contradicts your statement about it being 'cron-style').
Generally, use a dedicated framework that addresses the 'batch jobs' problem area. I can recommend Quartz.

Daemon in Java: simple schedule application?

The app must connect to a web service, grab data, and save it in the database.
Every hour, 24/7.
What's the most effective way to create such an app in Java?
How should it be run - as a system application or as a web application?
Keep it simple: use cron (or task scheduler)
If that's all you want to do, namely probe some web service once an hour, write it as a console app and run it with cron.
An app that starts and stops every hour:
cannot leak resources
cannot hang (at worst you lose one cycle)
consumes zero resources 99% of the time
Look at Quartz, a scheduling library for Java. They have sample code to get you started.
You'd need that plus the JDBC driver for your database of choice.
No web container required; this can easily be done as a standalone application.
Try the ScheduledExecutorService.
Why not use cron to start the Java application every hour? There's no need to soak up server resources keeping the Java application active when it's not doing anything the rest of the time; just start it when needed.
If you are intent on doing it in Java, a simple Timer would be more than sufficient.
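A minimal sketch of the Timer approach (the fetch-and-save body is a placeholder); the non-daemon Timer thread keeps the JVM alive between runs:

    import java.util.Timer;
    import java.util.TimerTask;

    public class HourlyPoller {
        public static void main(String[] args) {
            Timer timer = new Timer("hourly-poll");
            timer.scheduleAtFixedRate(new TimerTask() {
                @Override
                public void run() {
                    // call the web service and save the result to the database here
                }
            }, 0, 60 * 60 * 1000L);   // run immediately, then every hour
        }
    }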
Create a web page and schedule its execution with one of the many online scheduling services. The majority of them are free, very simple to use and very reliable. Some allow you to create schedules of any complexity, just like in cron, the SQL Server job UI, etc. It saves you a LOT of headache creating/debugging/maintaining your own scheduling engine, even if it's based on some framework like NCron, Quartz, etc. I'm speaking from my own experience.
