I want to develop 'tasks' in Java that run periodically according to a defined schedule.
How do I run this on my Linux server? If it is a jar file, is it enough to create the jar, run it from a shell script, and schedule that script with cron?
I was planning to use the Spring Framework. Do I really need it, given that I can schedule my Java program with cron?
How should I approach this?
You can build the app using Spring Boot and run it as a daemon:
https://docs.spring.io/spring-boot/docs/current/reference/html/deployment-install.html
And then use Quartz to schedule the tasks.
You can use either a cron job or a scheduler library (Quartz, etc.) to run your Java task. I think a cron job is the more convenient way to run a jar file: you simply add a crontab entry that launches the jar.
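To illustrate the cron route, here is a minimal sketch of a task that does its work once and exits. The class name ReportTask and its argument are hypothetical; the only point is that the program returns a non-zero exit code on failure so that cron or a wrapper script can notice.

```java
import java.time.Instant;

// Hypothetical one-shot task: cron, not the JVM, decides when it runs.
final class ReportTask {

    // Performs one unit of work and returns a process exit code (0 = success).
    static int run(String[] args) {
        try {
            String target = args.length > 0 ? args[0] : "default";
            System.out.println(Instant.now() + " processing " + target);
            return 0;
        } catch (RuntimeException e) {
            e.printStackTrace();
            return 1; // non-zero so a cron wrapper script can detect the failure
        }
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

Packaged as an executable jar, it could then be launched by a crontab entry along the lines of "0 2 * * * java -jar /path/to/report-task.jar" (the path and the schedule are placeholders).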
Check out Quartz; it's an awesome scheduling library that you can include in any Java application.
Once the scheduler is started, it fires at the intervals defined in a cron expression, say
( ***** )
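Quartz is an external dependency, so the sketch below uses the JDK's ScheduledExecutorService as a stand-in to show the same "start the scheduler once, then let it fire at intervals" idea. Note that this is fixed-rate scheduling, not a cron expression, and the class and method names are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class IntervalDemo {

    // Starts a scheduler, lets `task` fire every `periodMillis` until it has run
    // at least `times` times, then stops the scheduler.
    // Returns false if the runs did not complete in time or the wait was interrupted.
    static boolean runPeriodically(Runnable task, long periodMillis, int times) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(times);
        scheduler.scheduleAtFixedRate(() -> {
            task.run();
            remaining.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            // Allow the expected duration plus a generous safety margin.
            return remaining.await(periodMillis * times + 2_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        runPeriodically(() -> System.out.println("tick"), 1_000, 3);
    }
}
```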
Our project currently runs on MapReduce (MR), and we use Oozie to orchestrate the MR jobs. We are now moving to Spark and would like to know the recommended ways to schedule or trigger Spark jobs on the CDH cluster. Note that CDH Oozie does not support Spark2 jobs, so please suggest an alternative.
Last time I looked, Hue had a Spark option in the Workflow editor. If Cloudera didn't support that, I'm not sure why it'd be there...
CDH Oozie does support plain shell scripts, though; you just need to be sure the spark-submit command is available locally on every NodeManager.
If that doesn't work, it also supports Java actions for running a JAR, so you could write your Spark jobs to start from a main method that loads any configuration it needs.
As soon as you submit the spark job from the shell, like:
spark-submit <script_path> <arguments_list>
it gets submitted to the CDH cluster. You will immediately be able to see the Spark job and its progress in Hue. This is how we trigger our Spark jobs.
Further, to orchestrate a series of jobs, you can wrap them in a shell script, or use a cron job to trigger them on a schedule.
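If the wrapper is easier to maintain in Java than in shell, the same spark-submit call can be issued through ProcessBuilder. This is only a sketch: the class name is hypothetical, and it assumes spark-submit is on the PATH of the machine that runs it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SparkSubmitLauncher {

    // Builds the argument vector: spark-submit <scriptPath> <jobArgs...>
    static List<String> buildCommand(String scriptPath, List<String> jobArgs) {
        List<String> command = new ArrayList<>();
        command.add("spark-submit"); // assumed to be on the PATH
        command.add(scriptPath);
        command.addAll(jobArgs);
        return command;
    }

    // Runs spark-submit, forwards its output to this process, and returns its exit code.
    static int submit(String scriptPath, List<String> jobArgs)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(scriptPath, jobArgs))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length == 0) {
            System.err.println("usage: SparkSubmitLauncher <script_path> [arguments...]");
            System.exit(2);
        }
        List<String> jobArgs = Arrays.asList(args).subList(1, args.length);
        System.exit(submit(args[0], jobArgs));
    }
}
```

Chaining several jobs then amounts to calling submit for each one in order and stopping at the first non-zero exit code.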
I have a MapReduce job packaged as a jar that should run daily. I also need to launch this jar from a remote Java application. How can I schedule it? That is, I just want to run the job daily from my remote Java application.
I read about Oozie, but I don't think it is a good fit here.
Take a look at Quartz. It lets you run jobs from a standalone Java program or inside a web or application container (like JBoss or Apache Tomcat). It integrates well with Spring, and with Spring Batch in particular.
Quartz can be configured outside of the Java code, in XML, and its cron syntax is very similar to crontab's (Quartz expressions add a leading seconds field), so I found it very handy.
Some examples can be found here and here.
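For comparison, a once-a-day trigger inside a Java application can also be sketched with the JDK alone, without Quartz. The names below are hypothetical, and the fixed 24-hour period is a simplification that will drift by an hour across daylight-saving changes, which a cron-style trigger would handle for you.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class DailyTrigger {

    // Time from `now` until the next occurrence of `runAt`
    // (later today if it is still ahead, otherwise tomorrow).
    static Duration delayUntilNext(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Schedules `job` to run at `runAt` and then every 24 hours.
    static ScheduledExecutorService scheduleDaily(Runnable job, LocalTime runAt) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelay = delayUntilNext(LocalDateTime.now(), runAt).toMillis();
        scheduler.scheduleAtFixedRate(
                job, initialDelay, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // The scheduler thread is non-daemon, so the JVM stays alive between runs.
        scheduleDaily(() -> System.out.println("submitting the daily job"), LocalTime.of(2, 0));
    }
}
```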
Your requirement is not clear to me. You can use an SSH command-execution library in your program:
SSH library for Java
If your program itself runs in a Linux environment, you can set up a crontab entry for periodic execution.
If your Java program is what triggers the jar, then you should schedule the Java program (daily, in your case) rather than the jar. If the two are separate, you can schedule the jar in an Oozie workflow, with the Java code executing in the first step of the workflow and the jar in the second.
In Oozie, you can also pass parameters from one step to the next. Hope this helps.
-Dipika Harwani
I'm trying to build a small web application for reminders, and I use the Quartz Scheduler to fire the reminder events. As I understand it, jobs and schedulers can be configured from a database over JDBC, but I have searched and cannot find an example that shows what information to put in the tables and what Java code to run to start the scheduled tasks. If anyone has an example or something else that would serve this purpose, I would be grateful.
You have understood wrong. You can use any JobStore (including the JDBCJobStore) to store your jobs, triggers, etc., but creating them manually in the database is a bad idea™.
Depending on how you are using Quartz, you can set it up either through Spring or with the fluent syntax (which I believe is the preferred method these days).
Further reading: http://quartz-scheduler.org/documentation/quartz-2.1.x/tutorials/tutorial-lesson-09
I'd like to ask whether anyone can suggest a suitable framework for backend scheduled jobs. Currently the whole backend is built from multiple scheduled jobs. All the jobs are written in Java and deployed on a Linux machine. They are controlled by cron (via crontab) with simple bash scripts as wrappers, so basically I have a couple of jars (all Spring-based uber-jars with dependencies) that are fired periodically. These Java modules do various things, such as processing CSV/XML files, fetching data from web services, calling external APIs over HTTP, and collecting data from FTP.
Is there a framework that would let me keep all the modules in one place and manage them easily? I was thinking about Camel (I have used it before), but my must-haves are:
ability to deploy/undeploy a single module without interrupting the rest of the modules.
ability to reschedule jobs (cron expression) at runtime.
Camel is almost perfect because it has all the features for external integration (FTP, HTTP, WS) and easy Quartz integration. I don't know whether it is possible to have multiple modules and deploy/undeploy them at runtime.
Maybe there are other frameworks that would fit my needs. Please suggest.
If you plan to do this in Java/Scala, try Quartz.
It (also) offers a cron-like syntax for scheduling jobs.
We have our "modules" deployed as webapps on a simple servlet container (Jetty) and trigger actions on them using a Quartz scheduler (also in a webapp, to expose a simple UI).
For managing the loading and unloading of modules, you might want to look at something based on OSGi, like Apache ServiceMix, which seems especially good at module management in the way you're describing (I admit I don't quite understand your requirement for loading and unloading modules). Add Quartz to ServiceMix for scheduling jobs.
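The "reschedule at runtime" requirement can be illustrated independently of any framework. The sketch below uses the JDK's ScheduledExecutorService rather than Quartz (Quartz offers a comparable operation through Scheduler.rescheduleJob), works with a fixed period instead of a cron expression, and uses a hypothetical class name.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// A periodic job whose period can be changed while the application keeps running.
final class ReschedulableJob {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable job;
    private ScheduledFuture<?> handle;
    private long periodMillis;

    ReschedulableJob(Runnable job, long periodMillis) {
        this.job = job;
        reschedule(periodMillis);
    }

    // Cancels future runs (an in-flight run is allowed to finish)
    // and schedules the job again with the new period.
    synchronized void reschedule(long newPeriodMillis) {
        if (handle != null) {
            handle.cancel(false);
        }
        periodMillis = newPeriodMillis;
        handle = scheduler.scheduleAtFixedRate(
                job, newPeriodMillis, newPeriodMillis, TimeUnit.MILLISECONDS);
    }

    synchronized long currentPeriodMillis() {
        return periodMillis;
    }

    // Stops the scheduler thread so the JVM can exit.
    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

A management endpoint or UI action would call reschedule with the new period; other jobs, each held in their own ReschedulableJob, are not affected.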
I want to implement a task scheduler to run in Apache Felix. The idea is that the scheduler reads a crontab file and periodically executes each task (a task is defined by an installed service or bundle). What is the best way to do this? I am new to OSGi, and any good suggestions are appreciated.
Well, it's not really an OSGi matter (OSGi doesn't cover crontab-type event scheduling), I'd say use a 3rd party open source scheduler like Quartz:
http://quartz-scheduler.org/
However, it's not an OSGi bundle out of the box, so that still might require some effort to make it work.
Another suggestion: Apache Sling seems to have a built-in scheduler (also based on Quartz), and as Sling is OSGi-based, it should be reasonably easy to add to your app.
http://sling.apache.org/documentation/bundles/scheduler-service-commons-scheduler.html
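Whichever scheduler ends up firing the tasks, the crontab file from the question has to be read first. A minimal sketch of that step, assuming the classic five-field crontab layout followed by a task name; the class name and the idea of using the last field as a service identifier are assumptions. Quartz cron expressions use a different layout (with a seconds field), so the parsed fields would still need converting before being handed to Quartz.

```java
// One crontab entry: five schedule fields followed by the task to run.
final class CrontabEntry {
    final String minute;
    final String hour;
    final String dayOfMonth;
    final String month;
    final String dayOfWeek;
    final String task; // hypothetical: the name of the service or bundle to invoke

    private CrontabEntry(String[] parts) {
        minute = parts[0];
        hour = parts[1];
        dayOfMonth = parts[2];
        month = parts[3];
        dayOfWeek = parts[4];
        task = parts[5];
    }

    // Parses one line; returns null for blank lines and comments.
    // Environment assignments such as MAILTO=... are not handled in this sketch.
    static CrontabEntry parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null;
        }
        // Limit 6 keeps any whitespace inside the task part intact.
        String[] parts = trimmed.split("\\s+", 6);
        if (parts.length < 6) {
            throw new IllegalArgumentException(
                    "expected 5 schedule fields and a task: " + line);
        }
        return new CrontabEntry(parts);
    }
}
```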
Hope this helps, Frank