I'd like to ask whether anyone can suggest a proper framework for backend scheduled jobs. Currently the whole backend is based on multiple scheduled jobs. All the jobs are written in Java and deployed on a Linux machine. They are controlled by cron (using crontab) with simple bash scripts as wrappers, so basically I have a couple of jars (all Spring-based uber-jars with dependencies) which are fired periodically. These Java modules do various things, such as processing CSV/XML files, getting data from web services, calling external APIs (HTTP), and collecting data from FTP.
Is there a framework that would let me keep all the modules in one place and manage them easily? I was thinking about Camel (I have used it before), but the must-haves for me are:
the ability to deploy/undeploy a single module without interrupting the rest of the modules.
the ability to reschedule jobs (cron expression) at runtime.
Camel is almost perfect because it has all the features for external integration (FTP, HTTP, WS) and also easy Quartz integration. I don't know whether it's possible to have multiple modules and deploy/undeploy them at runtime.
Maybe there are other frameworks that would fit my needs. Please suggest.
If you're planning to do this in Java/Scala, try using Quartz.
It (also) offers a cron-like syntax for scheduling jobs.
We have our "modules" deployed as webapps on a simple servlet container (Jetty) and trigger actions on them using a Quartz scheduler (also in a webapp, to expose a simple UI).
For managing the loading and unloading of modules, you might want to look at something based on OSGi, like Apache ServiceMix, which seems especially good at module management in the way you're describing (I admit I don't quite understand your requirement for loading and unloading modules). Add Quartz to ServiceMix for scheduling jobs.
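To make the "reschedule at runtime" requirement concrete, here is a minimal sketch that uses only the JDK. It is a stand-in, not Quartz: it changes a fixed period rather than a cron expression, and the class name ReschedulableJob is made up for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one periodic job whose period can be changed while it runs. */
class ReschedulableJob {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable task;
    private ScheduledFuture<?> handle;

    ReschedulableJob(Runnable task) {
        this.task = task;
    }

    /** Starts the job, or replaces its current schedule with a new period. */
    synchronized void schedule(long periodMillis) {
        if (handle != null) {
            // Let a run in progress finish, but prevent any further runs.
            handle.cancel(false);
        }
        // Initial delay 0: the task runs once right away, then every periodMillis.
        handle = executor.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops this job only; other ReschedulableJob instances keep running. */
    synchronized void stop() {
        executor.shutdownNow();
    }
}
```

With Quartz itself, the cron-based equivalent would go through Scheduler.rescheduleJob with a new cron trigger; the sketch only shows the cancel-and-reschedule idea.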
Here are the goals I'm trying to achieve:
Take the scheduled jobs out of the microservice, because they can and would harm timings/performance
Execute jobs in a separate computation cluster, aka workers
Avoid code duplication: I want to keep all my business logic in one Service and all DB-related operations in one Dao, without writing additional services/DAOs for jobs
Avoid dependency management problems: different jobs may require different libs/versions/etc. For instance, a job from ServiceA may use javax.annotation-api while a job originating from ServiceB may use jakarta.annotation-api. Making a worker depend on both ServiceA and ServiceB will cause build or runtime problems.
Are there any approaches/libraries/solutions to achieve all the goals at the same time?
UPD:
Neither Temporal.io nor Quartz is quite what I need: they both require the worker to depend on the workflow tasks.
I may well be approaching this issue the wrong way, so architectural advice is also appreciated.
From an architectural perspective, expose the service (business logic) via an API.
Have the schedulers run on a separate instance or, if you are using one of the popular cloud solutions, have its FaaS offering (function as a service, in your case the scheduler) trigger the service API via HTTP (on any instance or a dedicated one).
Azure -> use Azure Functions
AWS -> Lambda functions
Google Cloud -> Google Cloud Functions
All of the above have comprehensive guides on how to create a scheduled function, aka a trigger.
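Whatever fires the schedule, the scheduler side can stay tiny. A minimal sketch, assuming Java 11+ and a hypothetical job endpoint on the service (the class name JobTrigger and the URL path are illustrative, not part of any product):

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical scheduler-side client: it knows only the service's URL, not its classes. */
class JobTrigger {
    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    /** Sends an empty POST to the job endpoint and returns the HTTP status code. */
    int trigger(URI jobEndpoint) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(jobEndpoint)
                .timeout(Duration.ofSeconds(30))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.statusCode();
    }
}
```

A call could look like new JobTrigger().trigger(URI.create("http://service-a.internal/jobs/cleanup")), where the host and path are placeholders. The worker then has no compile-time dependency on ServiceA or ServiceB at all.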
Hope this helps and I'm not off topic.
From my perspective, you can use one of three possible solutions:
Most straightforward: ensure that the service logic required in the jobs also implements a local API (programming API).
It can then be imported as a library and reused in jobs without code duplication.
If you have a larger development organization, you also want to make sure that such libraries are properly versioned and that releases are pre-planned, which allows the teams using the libraries to treat them as they would third-party libraries.
Also, there is no magic: you would have to work through any build/dependency problems if there are conflicts. (Since your question makes this sound like a deal breaker, let's look at the other solutions.)
The second solution would be to provide a wrapper for each service's logic that exposes the functionality via a CLI. That means you don't have to import the libraries, but rather execute them as jars/executables through the CLI. This would allow you to use the same code but avoid dependency problems.
(You will still have to deal with version management, version upgrades, etc.)
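As a rough illustration of the CLI approach, the worker can start each job as a child process, so each jar keeps its own classpath. This is only a sketch: JobProcessRunner, the jar name and the arguments are made up for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical launcher: runs a job in a child process so its classpath stays separate. */
class JobProcessRunner {

    /** Builds the command line for an executable jar: java -jar <jar> <args...>. */
    static List<String> jarCommand(String javaBinary, String jarPath, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add(javaBinary);
        command.add("-jar");
        command.add(jarPath);
        command.addAll(args);
        return command;
    }

    /** Runs the command, forwards its output to this process, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

For example, JobProcessRunner.run(JobProcessRunner.jarCommand("java", "service-a-job.jar", List.of("--date", "2024-01-01"))) would run a (hypothetical) ServiceA job; by convention a non-zero exit code signals failure.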
If you use containerized deployments/hosting, you can also consider bundling multiple containers together just for your jobs, where each job gets its own private service container instances for use during the job. Kubernetes and Docker Compose, for example, have options to run such multi-container deployments/jobs.
That solution would allow you to reuse the same services as they run for other purposes, but you have to make sure that they are configurable enough to work in this scenario.
One problem that all of the approaches share is that you have to make sure there are no runtime conflicts between your jobs and the regularly deployed services (for example, state conflicts).
How to execute jobs depends on your deployment scenario. Kubernetes can run containers as jobs natively, which makes it easy to bundle multiple jars, etc. But it is always an option to deploy a dedicated scheduler or workflow tool like Apache Airflow to run your jobs.
I'm planning to build a Java-based system to handle different business processes where each of these is a particular module in the system. Most modules would depend on some of the other modules to handle their particular business process. In other words, top modules would consume some sort of basic services provided by underlying modules. Some modules will be developed from the very beginning, but some will be added to the system later. Next, some modules will expose RESTful interfaces to handle external input / output.
To handle all this, OSGi seems appropriate, but it's a bit difficult to learn with all the different "distributions" out there (Equinox, Felix, etc.) and I'm concerned about the ease of using the Spring framework and other 3rd party libraries within each module (starting with Spring 3.2 the different jars might not come with OSGi manifests).
On top of this, I'd like a central web portal to administer all bundles, thus with each new bundle there will be a new admin section.
That's why we developed OSGi-less modularity for Spring: https://github.com/griddynamics/banshun Your feedback is appreciated!
Why do you need OSGi? Why not use a Web Server like Tomcat, and deploy your application as a war? You can deploy it on multiple servers in a cluster, and your application can scale on and on.
Why do you need Spring? It has become incredibly coupled. And it has a complexity that I find quite useless, since OSGi applications tend to be built from small components communicating through services, voiding most of the advantages of the Spring wiring model, which assumes it is central.
And "hard to configure" is a strange remark: OSGi has excellent configuration support. It is just different from what you're used to.
Instead of using Spring, why not use OSGi Blueprint? It will give you an "easy" transition from Spring to OSGi.
We want to integrate Drools with our current web application, which is based on Struts 2. Is there a sample application which could be used as a reference?
Generally, all the applications we see use Spring + Drools.
Also, would it be possible later on to integrate Guvnor as a GUI for the rules created?
Yes, it is possible. Drools is not tier-specific; you can plug it into your Java application however you see fit. As a general rule, you would incorporate it into your service tier, where all the heavy lifting is done.
Drools needs very little configuration (in many scenarios it needs none at all). Simply drop the applicable JAR files into your library folder and reference them in your classpath.
I actually built a prototype application for a client using Yahoo UI, Struts and Drools. It works like a charm (can't share the source unfortunately). To wit, you are definitely not tied to Spring.
As far as your second question, note that using Guvnor to manage rules and accessing those rules from your app logic are two totally separate things. The Guvnor governance application is bundled as a web app that you deploy on a server. Once deployed it provides a very nice interface that you can use for managing a rules repository. To use those managed rules in your application you need to include the appropriate JAR files in your application and do some configuration.
I would recommend standing up a simple application first that simply executes some rules in an embedded DRL, before attempting anything more complex like integrating with Guvnor.
Using a Java framework, I want to achieve the following:
Hot code (jar) deployment which will perform a certain task in an environment
At any time, if I update that jar file, it should automatically unload the old code and load the new code
I want to schedule that deployed jar file to perform tasks.
Currently I see that Apache Karaf/Felix fulfills this requirement, but there is little help available and it is difficult to manage.
Is there an alternative framework I can use instead of Karaf/Felix?
If you aren't going to go the OSGi route, which you basically implied by forgoing Karaf / Felix (Karaf runs on Felix by default, though it can also use Equinox), then about the best thing I can suggest is LiveRebel, when it comes out. @Daniel's answer mentioned JRebel, which is outstanding for hot deployment during development but is not meant as a tool for production systems. Instead you should check out LiveRebel, also made by ZeroTurnaround, which might be able to fulfill your needs. Please note that this is a commercial product, but they are offering a private beta right now.
[Edit]
Idiotically, I forgot to mention that there's also Knopflerfish, another OSGi runtime, which has a BSD-style license. Perhaps give that a shot?
Give JRebel a try. It is a great tool.
Not sure what environment you mean (e.g. web, desktop, server-side, etc.), but...
Working backwards:
3: Scheduled Tasks
You can achieve this in any Java container with the Quartz Scheduler library. This allows you to schedule events in a cron-like fashion.
1-2: Hot Deployment
Then it's a question of where you want to deploy and how to handle hot deployment. Other answers have mentioned JRebel and OSGi, which will work. If you want some super-quick deployment (e.g. save the code and it's available) and have it hosted in a web container, then use the Play Framework. It uses Quartz to implement scheduled jobs in a very nice way.
For example (from the Play docs) :
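    import java.util.List;
    import play.jobs.*;

    // User and Notifier are the application's own model and mailer classes.
    @Every("1h")  // run this job once an hour
    public class Bootstrap extends Job {
        public void doJob() {
            List<User> newUsers = User.find("newAccount = true").fetch();
            for (User user : newUsers) {
                Notifier.sayWelcome(user);
            }
        }
    }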
JBoss has the hot deploy feature that you're describing. However, I'm guessing it's as complicated to configure as Karaf. It may be possible to find out how JBoss achieves it and use the libraries yourself, though.
hot code(jar) deployment which will perform certain task in an environment
At any time if i update that jar file it should automatically unload old code and load new code
I want to schedule that deployed jar file to perform tasks.
In a nutshell, hot deploy/redeploy works like this:
Use a classloader (java.net.URLClassLoader is a good start) to load the jar(s); copy the jar somewhere (temp) before loading it.
You need some interface for the module to implement: instantiate the class implementing the interface (named in META-INF in the jar, custom XML, whatever) and configure it (props/XML, whatever).
Call start() and perform the tasks.
Monitor the jar: have a thread check it each second and compare the last-modified time/size.
If it changed, call stop() and undeploy (you may need to wait for threads, etc.), then start over.
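The steps above can be sketched with the JDK alone. This is a simplified illustration, not a production deployer: the class name HotDeployer is made up, the module interface is reduced to Runnable instead of a start()/stop() pair, and the entry class name is passed in rather than read from META-INF.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical deployer for one jar whose entry class implements Runnable. */
class HotDeployer {
    private final Path jar;
    private final String entryClass;
    private long lastModified = -1;
    private long lastSize = -1;
    private URLClassLoader loader;

    HotDeployer(Path jar, String entryClass) {
        this.jar = jar;
        this.entryClass = entryClass;
    }

    /** Returns true when the jar's modification time or size differs from the previous call. */
    boolean changed() throws IOException {
        long modified = Files.getLastModifiedTime(jar).toMillis();
        long size = Files.size(jar);
        boolean differs = modified != lastModified || size != lastSize;
        lastModified = modified;
        lastSize = size;
        return differs;
    }

    /** Closes the old class loader, then loads the entry class from a private copy of the jar. */
    Runnable reload() throws Exception {
        if (loader != null) {
            // Undeploy: the caller should have stopped the old task before this point.
            loader.close();
        }
        // Load from a temp copy so the original jar can be overwritten at any time.
        Path copy = Files.createTempFile("deployed-", ".jar");
        Files.copy(jar, copy, StandardCopyOption.REPLACE_EXISTING);
        loader = new URLClassLoader(
                new URL[] {copy.toUri().toURL()}, getClass().getClassLoader());
        Class<?> type = Class.forName(entryClass, true, loader);
        return (Runnable) type.getDeclaredConstructor().newInstance();
    }
}
```

A monitoring thread would call changed() about once per second and, when it returns true, stop the running task and call reload() to get a fresh instance.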
There are a lot of frameworks that allow dynamic deployment.
The hot deploy feature of most web containers (like Tomcat or Jetty) allows you to have the behaviour you want for web applications.
Such an application can be very simple, and essentially just contain your jar.
What is it you need your application to do?
I am looking for something very close to an application server with these features:
it should handle a series of threads/daemons, allowing the user to start-stop-reload each one without affecting the others
it should keep libraries separated between different threads/daemons
it should allow some libraries to be shared
Currently we have some legacy code reinventing the wheel... and not a perfectly round-shaped one at that!
I thought to use Tomcat, but I don't need a web server, except maybe for the simple backoffice user interface (/manager/html).
Any suggestion? Is there a non-web application server, or is there a better alternative to Tomcat (more lightweight, for example, or easier to configure)? Thanks in advance.
Have you looked at OSGi? You can load/unload bundles (basically .jar files with metadata) independently of each other, and optionally define dependencies between them (with a software lifecycle defined such that bundles are aware of other bundles being loaded/unloaded).
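To show what "separate libraries per module, plus some shared ones" means at the class loader level, here is a rough plain-JDK approximation. It is not how OSGi is implemented (OSGi adds versioned imports/exports and a bundle lifecycle on top), and ModuleLoaders is a made-up name for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry: one class loader per module, all sharing one parent for common jars. */
class ModuleLoaders {
    private final ClassLoader shared;
    private final Map<String, URLClassLoader> modules = new ConcurrentHashMap<>();

    ModuleLoaders(URL[] sharedLibraries) {
        // Classes from the shared jars are visible to every module.
        this.shared = new URLClassLoader(sharedLibraries, ModuleLoaders.class.getClassLoader());
    }

    /** (Re)loads a module's private jars; their classes are invisible to other modules. */
    ClassLoader load(String name, URL[] privateLibraries) throws IOException {
        unload(name);
        URLClassLoader loader = new URLClassLoader(privateLibraries, shared);
        modules.put(name, loader);
        return loader;
    }

    /** Closes one module's loader without touching the others. */
    void unload(String name) throws IOException {
        URLClassLoader old = modules.remove(name);
        if (old != null) {
            old.close();
        }
    }

    boolean isLoaded(String name) {
        return modules.containsKey(name);
    }
}
```

Two modules can then ship different versions of the same library in their private jars, because each version is only resolved through its own loader.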
I have found the Jetty "contexts" concept very useful in handling applications (packaged as WARs and with servlet context listeners), where the XML file placed in contexts/ fully describes what you want to have started. When you remove the XML file again, the thing described is stopped.
If you do not start a server connector you will just have a start-stop thing which sounds like what you are looking for.
Jetty can be made very small so the overhead is not bad.
You could consider Spring dmServer. It's a rather non-traditional appserver, with a very lightweight OSGi core (the web container is optional, for example), but it gives you classloader isolation and basic container services. It's not a JavaEE container, but comes with plug-in modules that are.
You're still going to have to do a lot of work yourself, but the basics of dmServer are very sound.
No one stops you from sending binary and text data instead of HTML pages over the HTTP protocol. That is what servlets are for. So I would use the Tomcat server.