How can I define multiple jobs dynamically in Spring Batch?


I have an application that uses Spring Batch to define a preset number of jobs, which are currently all defined in the XML.
We add more jobs over time, which requires updating the XML; however, these jobs are always based on the same parent and can easily be determined with a simple SQL query.
So I've been trying to switch to use some combination of XML configuration and Java-based configuration but am quickly getting confused.
Even though we have many jobs, each job definition falls into essentially one of two categories. All of the jobs inherit from one or the other parent job and are effectively identical, besides having different names. The job name is used in the process to select different data from the database.
I've come up with some code much like the following but have run into problems getting it to work.
Full disclaimer that I'm also not entirely sure I'm going about this in the right way. More on that in a second; first, the code:
@Configuration
@EnableBatchProcessing
public class DynamicJobConfigurer extends DefaultBatchConfigurer implements InitializingBean {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private JobRegistry jobRegistry;
    @Autowired
    private DataSource dataSource;
    @Autowired
    private CustomJobDefinitionService customJobDefinitionService;

    // Flows defined in the XML configuration and injected through the setters below.
    private Flow injectedFlow1;
    private Flow injectedFlow2;

    // Builds one job per definition returned by the query and registers it by name.
    public void setupJobs() throws DuplicateJobException {
        List<JobDefinition> jobDefinitions = customJobDefinitionService.getAllJobDefinitions();
        for (JobDefinition jobDefinition : jobDefinitions) {
            Job job = null;
            if (jobDefinition.getType() == 1) {
                job = jobBuilderFactory.get(jobDefinition.getName())
                        .start(injectedFlow1).build()
                        .build();
            } else if (jobDefinition.getType() == 2) {
                job = jobBuilderFactory.get(jobDefinition.getName())
                        .start(injectedFlow2).build()
                        .build();
            }
            if (job != null) {
                jobRegistry.register(new ReferenceJobFactory(job));
            }
        }
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        setupJobs();
    }

    public void setInjectedFlow1(Flow injectedFlow1) {
        this.injectedFlow1 = injectedFlow1;
    }

    public void setInjectedFlow2(Flow injectedFlow2) {
        this.injectedFlow2 = injectedFlow2;
    }
}
I have the flows that get injected defined in the XML, much like this:
<batch:flow id="injectedFlow1">
    <batch:step id="InjectedFlow1.Step1" next="InjectedFlow1.Step2">
        <batch:flow parent="InjectedFlow.Step1" />
    </batch:step>
    <batch:step id="InjectedFlow1.Step2">
        <batch:flow parent="InjectedFlow.Step2" />
    </batch:step>
</batch:flow>
So as you can see, I'm effectively kicking off the setupJobs() method (which is intended to dynamically create these job definitions) from the afterPropertiesSet() method of InitializingBean. I'm not sure that's right. It is running, but I'm not sure if there's a different entry point that's better intended for this purpose. Also, I'm not sure what the point of the @Configuration annotation is, to be honest.
The problem I'm currently running into is as soon as I call register() from JobRegistry, it throws the following IllegalStateException:
To use the default BatchConfigurer the context must contain no more than one DataSource, found 2.
Note: my project actually has two data sources defined. The first is the default dataSource bean which connects to the database that Spring Batch uses. The second data source is an external database, and this second one contains all the information I need to define my list of jobs. But the main one does use the default name "dataSource" so I'm not quite sure how else I can tell it to use that one.

First of all, I don't recommend combining XML and Java configuration. Use only one, preferably Java, as it's not much effort to convert XML config to Java config (unless you have some very good reasons for mixing them that you haven't explained).
I haven't used Spring Batch on its own; I have always used it with Spring Boot. I have a project where I define multiple jobs, and code similar to what you have shown has always worked well there.
For your issue, there are some answers on SO, like this or this, which basically say that you need to write your own BatchConfigurer and not rely on the default one.
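As a configuration fragment, one way to supply your own BatchConfigurer might look like the sketch below. It assumes Spring Batch 4.x (where DefaultBatchConfigurer has a constructor taking a DataSource) and that the batch metadata lives in the bean named "dataSource", as stated in the question. The class name BatchInfrastructureConfig is made up for illustration.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.BatchConfigurer;
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class; assumes Spring Batch 4.x.
@Configuration
@EnableBatchProcessing
public class BatchInfrastructureConfig {

    // Naming the DataSource explicitly means Spring Batch does not have to
    // pick one itself when the context contains several DataSource beans.
    @Bean
    public BatchConfigurer batchConfigurer(@Qualifier("dataSource") DataSource dataSource) {
        return new DefaultBatchConfigurer(dataSource);
    }
}
```

With a BatchConfigurer bean present, the default one is not created, so the "no more than one DataSource" check should no longer apply.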
Now, coming to a solution using Spring Boot:
With Spring Boot, you should try to segregate job definitions and job executions.
You should first just define the jobs and initialize the Spring context without enabling the jobs (spring.batch.job.enabled=false).
In your Spring Boot main method, when you start the app with something like SpringApplication.run(Application.class, args); you will get an ApplicationContext ctx.
Now you can get the relevant beans from this context and launch specific jobs by getting their names from a property, the command line, etc., and using the JobLauncher.run(...) method.
You can refer to this answer of mine if you want to order job executions. You can also write job schedulers using Java.
The point being, you separate your job building / bean configuration concerns from your job execution concerns.
Challenge
Keeping multiple jobs in a single project can be challenging when you need different settings for each job, because the application.properties file is environment-specific, not job-specific; i.e., Spring Boot properties will apply to all jobs.

In my particular case, the solution was to actually eliminate the @Configuration and @EnableBatchProcessing annotations from my class above. Something about these caused it to try to use the DefaultBatchConfigurer, which fails when you have more than one data source defined (even if you've identified them clearly with "dataSource" as the primary and some other name for the secondary).
The @Configuration annotation in particular wasn't necessary because all it really does is let your class get auto-instantiated without having to define it as a bean in the app context. But since I was doing that anyway, it was superfluous.
One of the downsides of removing @EnableBatchProcessing was that I could no longer autowire the JobBuilderFactory bean. So instead I had to create it myself:
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDataSource(dataSource);
factory.setTransactionManager(transactionManager);
factory.afterPropertiesSet();
jobRepository = factory.getObject();
jobBuilderFactory = new JobBuilderFactory(jobRepository);
Then it seems I was on the right track already by using jobRegistry.register(...) to define my jobs. So essentially once I removed those annotations above everything started working. I'm going to mark Sabir's answer as the correct one however because it helped me out.

Related

Spring Boot 2 - Wire Two LDAP Templates

I need to configure multiple LDAP data sources / LdapTemplates in my Spring Boot 2 application. The first LdapTemplate will be used for most of the work, while the second will be used for a once-in-a-while subset of data (housed elsewhere).
I have read these StackOverflow questions regarding doing that, but they seem to be for Spring Boot 1.
Can a spring ldap repository project access two different ldap directories?
Multiple LDAP repositories with Spring LDAP Repository
From what I can gather, much of that configuration/setup had to be done anyway, even for just one LDAP data source, back in Spring Boot 1. With Spring Boot 2, I just put the properties in my config file like so
ldap.url=ldap://server.domain.com:389
ldap.base:DC=domain,DC=com
ldap.username:domain\ldap.svc.acct
ldap.password:secret
and autowire the template in my repository like so
@Autowired
private LdapTemplate ldapTemplate;
and I'm good to go. (See: https://stackoverflow.com/a/53474188/3669288)
For a second LDAP data source, can I just add the properties and configuration elements for "ldap2" and be done (see linked questions)? Or does adding this configuration cause Spring Boot 2's auto configuration to think I'm overriding it and so now I lose my first LdapTemplate, meaning I now need to go explicitly configure that as well?
If so, do I need to configure everything, or will only a partial configuration work? For example, if I add the context source configuration and mark it as @Primary (does that work for LDAP data sources?), can I skip explicitly assigning it to the first LdapTemplate? On a related note, do I still need to add the @EnableLdapRepositories annotation, which is otherwise autoconfigured by Spring Boot 2?
TLDR: What's the minimum configuration I need to add in Spring Boot 2 to wire in a second LdapTemplate?

This takes what I've learned over the weekend and applies it as an answer to my own question. I'm still not an expert in this so I welcome more experienced answers or comments.
The Explanation
First, I still don't know for certain if I need the @EnableLdapRepositories annotation. I don't yet make use of those features, so I can't say if not having it matters, or if Spring Boot 2 is still taking care of that automatically. I suspect Spring Boot 2 is, but I'm not certain.
Second, Spring Boot's autoconfigurations all happen after any user configurations, such as my code configuring a second LDAP data source. The autoconfiguration is using a couple of conditional annotations for whether or not it runs, based on the existence of a context source or an LdapTemplate.
This means that it sees my "second" LDAP context source (the condition is just that a context source bean exists, regardless of what its name is or what properties it is using) and skips creating one itself, meaning that I no longer have that piece of my primary data source configured.
It will also see my "second" LdapTemplate (again, the condition is just that an LdapTemplate bean exists, regardless of what its name is or what context source or properties it is using) and skip creating one itself, so I again no longer have that piece of my primary data source configured.
Unfortunately, those conditions mean that in this case there is no in-between either (where I can manually configure the context source, for example, and then allow the autoconfiguration of the LdapTemplate to still happen). So the solution is to either make my configuration run after the autoconfiguration, or to not leverage the autoconfiguration at all and set them both up myself.
As for making my configuration run after the autoconfiguration: the only way to do that is to make my configuration an autoconfiguration itself and specify its order to be after Spring's built-in autoconfiguration (see: https://stackoverflow.com/a/53474188/3669288). That's not appropriate for my use case, so for my situation (because Spring Boot's setup does make sense for a standard single-source situation) I'm stuck forgoing the autoconfiguration and setting them both up myself.
The Code
Setting up two data sources is pretty well covered in the following two answers (though partly for other reasons), as linked in my question, but I'll also detail my setup here.
Can a spring ldap repository project access two different ldap directories?
Multiple LDAP repositories with Spring LDAP Repository
First up, the configuration class needs to be created, as one was not previously needed at all with Spring Boot 2. Again, I left out the @EnableLdapRepositories annotation partly because I don't use it yet, and partly because I think Spring Boot 2 will still cover that for me. (Note: All of this code was typed up in the Stack Overflow answer box as I don't have a development environment where I'm writing this, so imports are skipped and the code may not be perfectly compilable and function correctly, though I hope it's good.)
@Configuration
public class LdapConfiguration {
}
Second is manually configuring the primary data source; the one that used to be autoconfigured but no longer will be. There is one piece of Spring Boot's autoconfiguration that can be leveraged here, and that is its reading in of the standard spring.ldap.* properties (into a properties object), but since it wasn't given a name, you have to reference it by its fully qualified class name. This means you can skip straight to setting up the context source for the primary data source. This code is not quite as full featured as the actual autoconfiguration code (See: Spring Code)
I marked this LdapTemplate as @Primary because for my use, this is the primary data source and so it's what all other autowired calls should default to. This also means you don't need a @Qualifier where you autowire this source up (as seen later).
@Configuration
public class LdapConfiguration {

    @Bean(name="contextSource")
    public LdapContextSource ldapContextSource(@Qualifier("spring.ldap-org.springframework.boot.autoconfigure.ldap.LdapProperties") LdapProperties properties) {
        LdapContextSource source = new LdapContextSource();
        source.setUrls(properties.getUrls());
        source.setUserDn(properties.getUsername());
        source.setPassword(properties.getPassword());
        source.setBaseEnvironmentProperties(Collections.unmodifiableMap(properties.getBaseEnvironment()));
        return source;
    }

    @Bean(name="ldapTemplate")
    @Primary
    public LdapTemplate ldapTemplate(@Qualifier("contextSource") LdapContextSource source) {
        return new LdapTemplate(source);
    }
}
Third is to manually configure the secondary data source, the one that caused all of this to begin with. For this one, you do need to configure the reading of your properties into an LdapProperties object. This code builds on the previous code, so you can see the complete class for context.
@Configuration
public class LdapConfiguration {

    @Bean(name="contextSource")
    public LdapContextSource ldapContextSource(@Qualifier("spring.ldap-org.springframework.boot.autoconfigure.ldap.LdapProperties") LdapProperties properties) {
        LdapContextSource source = new LdapContextSource();
        source.setUrls(properties.getUrls());
        source.setUserDn(properties.getUsername());
        source.setPassword(properties.getPassword());
        source.setBaseEnvironmentProperties(Collections.unmodifiableMap(properties.getBaseEnvironment()));
        return source;
    }

    @Bean(name="ldapTemplate")
    @Primary
    public LdapTemplate ldapTemplate(@Qualifier("contextSource") LdapContextSource source) {
        return new LdapTemplate(source);
    }

    @Bean(name="ldapProperties2")
    @ConfigurationProperties("app.ldap2")
    public LdapProperties ldapProperties2() {
        return new LdapProperties();
    }

    @Bean(name="contextSource2")
    public LdapContextSource ldapContextSource2(@Qualifier("ldapProperties2") LdapProperties properties) {
        LdapContextSource source = new LdapContextSource();
        source.setUrls(properties.getUrls());
        source.setUserDn(properties.getUsername());
        source.setPassword(properties.getPassword());
        source.setBaseEnvironmentProperties(Collections.unmodifiableMap(properties.getBaseEnvironment()));
        return source;
    }

    @Bean(name="ldapTemplate2")
    public LdapTemplate ldapTemplate2(@Qualifier("contextSource2") LdapContextSource source) {
        return new LdapTemplate(source);
    }
}
Finally, in your class that uses these LdapTemplates, you can autowire them as normal. This uses constructor autowiring instead of the field autowiring the other two answers used. Either is technically valid though constructor autowiring is recommended.
@Component
public class LdapProcessing {

    protected LdapTemplate ldapTemplate;
    protected LdapTemplate ldapTemplate2;

    @Autowired
    public LdapProcessing(LdapTemplate ldapTemplate, @Qualifier("ldapTemplate2") LdapTemplate ldapTemplate2) {
        this.ldapTemplate = ldapTemplate;
        this.ldapTemplate2 = ldapTemplate2;
    }
}
TLDR: Defining a "second" LDAP data source stops the autoconfiguration of the first LDAP data source, so both must be (nearly fully) manually configured if using more than one; Spring's autoconfiguration can not be leveraged even for the first LDAP data source.

Spring Batch with modular=true and GenericApplicationContextFactory is cumbersome

I'm using Spring Batch with @EnableBatchProcessing(modular = true).
The problem is that in this mode you have to explicitly declare which beans to initialize (i.e., which classes Spring needs to scan).
Here's an example:
@Configuration
@EnableBatchProcessing(modular = true)
public class ModularJobsConfig {

    @Autowired
    private AutomaticJobRegistrar registrar;

    @PostConstruct
    public void initialize() {
        // Every class that makes up the job's child context has to be listed by hand.
        registrar.addApplicationContextFactory(new GenericApplicationContextFactory(
                SomeJobConfig.class,
                SomeJobTasklet.class,
                SomeClassToDefineTaskExecutor.class,
                SomeClassToRunTheJob.class));
    }
}
I can imagine that by the time I have several jobs, this configuration class will be bloated. How can I automate this?
It is worth mentioning that each job has its own package (e.g. com.example.jobs.<job_name>) and that the jobs are defined in different Maven modules, but I think that's irrelevant.
Further Clarification
I have a core module which contains the configuration above. Every job is defined in a separate Maven module and is registered as a Maven dependency in core.
Mainly to prevent naming clashes, I'm using @EnableBatchProcessing(modular = true) and registering the jobs with AutomaticJobRegistrar, as you can see in the example code above.
Ideally, I'd like Spring to scan the Maven dependency and do it for me (i.e., define the GenericApplicationContextFactory).
Currently, it's cumbersome to manually add each and every class (in the example above: SomeJobConfig.class, SomeJobTasklet.class, etc.).
As a counterexample, if I didn't use modular=true I could let Spring Batch load all beans on its own, but then I'd have to make sure method names are unique across all the modules.

How to lazy load all the Spring beans, whether defined by @Bean or @Component, in Spring Boot 2.2

I am writing a spring application which is interactive and basically handles lots of commands like create, list, update, delete various types of resources.
For now, just assume a single run of the application handles only a single command and the program exits.
For every class that validates a command, executes a command, or serves as a factory for a resource, there is a separate class, and each is annotated with @Component so that Spring manages all the components.
There are also some beans defined manually by @Bean methods.
Since my application first identifies what kind of command is being executed (create, delete, list, update, etc.), I want only the beans for that command to be created and autowired wherever required (after taking the command from the user), and I want to avoid creating dozens of beans related to other commands.
On searching, I came to know about Lazy instantiation of Beans that spring provides.
However, I am not sure if it is the weapon I am searching for.
What I tried
First I found the @Lazy annotation, but since I want to lazily load all the beans, I don't want to write @Lazy everywhere in each class.
Then I found that setting the property below in application.yml does the work.
spring:
  main:
    lazy-initialization: true
I tried that but still, it is not lazily creating the beans.
My application.yml file looks like this:
spring:
  main:
    lazy-initialization: true
My main SpringBootApplication file looks like this:
@Slf4j
@SpringBootApplication
public class SpringBootApplication {

    public static void main(String[] args) {
        System.out.println("Loading Application...");
        ApplicationContext context = SpringApplication.run(SpringBootApplication.class, args);
        final AtomicInteger counter = new AtomicInteger(0);
        log.info("**************** START: Total Bean Objects: {} ******************", context.getBeanDefinitionCount());
        Arrays.asList(context.getBeanDefinitionNames())
                .forEach(beanName -> {
                    log.info("{}) Bean Name: {} ", counter.incrementAndGet(), beanName);
                });
        log.info("**************** END: Total Bean: {} ******************", context.getBeanDefinitionCount());
    }
}
My other classes look like this:
@Component
@RequiredArgsConstructor(onConstructor = @__(@Autowired))
public class MyClass1 implements ResourceCreator<MyClass2, MyClass3> {

    private final RequestValidatorImpl requestValidator;
    private final ResourceCreator resourceCreator;

    @Override
    public MyClass2 toImplementFunction(MyClass3 myclass3) {
        // logic
    }
}
On running the application, it prints all the classes that I annotated with @Component as well as the beans created by @Bean methods.
I have also tried using the property below in application.properties, but still no use.
spring.main.lazy-initialization=true
Also, if you wish, please comment on whether I should use @Component for each class like I am doing, or what the better practice is instead.

I think you have misunderstood what the lazy flag does.
It means that a bean instance is created only when it is first requested; it does not mean that its package is not scanned. Spring still scans all packages and registers every bean definition, so getBeanDefinitionNames() returns the same names either way. Only the creation of the objects is deferred when the lazy flag is set.
You can verify this behaviour by comparing the number of singletons actually created (not the number of bean definitions, which does not change) with the lazy flag set to true and to false.
You can check it as follows:
ConfigurableApplicationContext context = SpringApplication.run(SpringBootApplication.class, args);
System.out.println("definitions: " + context.getBeanDefinitionCount());
System.out.println("singletons created: " + context.getBeanFactory().getSingletonCount());
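To make the difference between a registered definition and a created instance concrete, here is a plain-Java analogy. It is only a toy sketch, not how Spring is implemented, and LazyRegistry and its method names are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy container: definitions are registered up front, instances are created on first lookup. */
class LazyRegistry {

    private final Map<String, Supplier<?>> definitions = new LinkedHashMap<>();
    private final Map<String, Object> singletons = new LinkedHashMap<>();

    /** Registers how to build a bean without building it. */
    void register(String name, Supplier<?> factory) {
        definitions.put(name, factory);
    }

    /** Creates the instance on the first call and returns the cached one afterwards. */
    Object getBean(String name) {
        Object existing = singletons.get(name);
        if (existing != null) {
            return existing;
        }
        Supplier<?> factory = definitions.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("No definition named " + name);
        }
        Object created = factory.get();
        singletons.put(name, created);
        return created;
    }

    int getDefinitionCount() {
        return definitions.size();
    }

    int getSingletonCount() {
        return singletons.size();
    }

    public static void main(String[] args) {
        LazyRegistry registry = new LazyRegistry();
        registry.register("createCommand", () -> new StringBuilder("create"));
        registry.register("deleteCommand", () -> new StringBuilder("delete"));
        registry.getBean("createCommand"); // only this one gets instantiated
        System.out.println("definitions: " + registry.getDefinitionCount());
        System.out.println("instances: " + registry.getSingletonCount());
    }
}
```

Listing the definition names, as the question does, therefore says nothing about which objects were actually constructed.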
Edit on Apr/7th 2020 start
Another way to do that is to create a constructor, use it to inject the autowired properties, and print a log message when the constructor is entered.
I did the same in a sample project and below is the result; the first one is for eager initialization and the next one for lazy.
spring.main.lazy-initialization=false
Application logs
Inside Constructor
calling bean
inside bean method
spring.main.lazy-initialization=true
Application logs
calling bean
Inside Constructor
inside bean method
Edit on Apr/7th 2020 end
Please mark this as answered if I answered your question.
Thank you

Long story short:
For anyone wanting to lazily initialize their whole Spring Boot context, setting this property to true is the way to go:
spring.main.lazy-initialization=true
Pro tip:
It can be used in combination with the @Lazy annotation, set to false; so all the defined beans will use lazy initialization, except for those that we explicitly configure with @Lazy(false).
In such a way, the lazy initialization becomes opt-out instead of the default opt-in.
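As a small configuration fragment, opting a single bean out might look like this. It assumes spring.main.lazy-initialization=true is set, and the class name StartupCommandParser is hypothetical.

```java
import org.springframework.context.annotation.Lazy;
import org.springframework.stereotype.Component;

// Hypothetical bean that should still be created at startup.
@Component
@Lazy(false) // explicit setting, so the global lazy default is not applied to this bean
public class StartupCommandParser {

    public StartupCommandParser() {
        // Runs during context startup even though other beans are created on first use.
        System.out.println("StartupCommandParser created eagerly");
    }
}
```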

Options for dynamic properties in Spring Boot

I have an application with some externalized configuration in the form of properties. I would like the application to react to a change of such properties without a restart or full context refresh.
I am not clear what my options are.
Artificial example: the application implements a service that receives requests and decides whether to queue or reject them. The maximum size of the queue is a property.
final int queueMaxSize = queueProperties.getMaxSize();
if (queue.size() >= queueMaxSize) {
    // reject
}
Where QueueProperties is a class annotated with @ConfigurationProperties.
@ConfigurationProperties(prefix = "myapp.limits.queue")
@Getter
@Setter
public class QueueProperties {
    private int maxSize = 10;
}
This works as far as allowing me to control behavior via system properties, profiles, etc.
However, I would like to be able to change this value without releasing/deploying/restarting/refreshing the application.
My application already uses Archaius.
If I push a new value for this property using our internal infrastructure, I can see that the application's Spring Environment does receive the new value.
(e.g., /admin/env reflects the value and changes dynamically).
The part I'm not clear on is: how to make my service react to the change of value in the environment?
I found two ways, but they seem hairy, I wonder if there are better options. I expected this to be a common problem with a first class solution in the Spring ecosystem.
Hacky solution #1:
@Bean
@Scope("prototype")
@ConfigurationProperties(prefix = "myapp.limits.queue")
QueueProperties queueProperties() {
    return new QueueProperties();
}
And inject this into the service using the properties as Provider<QueueProperties> and use it as queuePropertiesProvider.get().getMaxSize().
This works but has a few side-effects I'm not a fan of:
- The ConfigurationProperties annotation moved from the class to the bean definition
- A new QueueProperties object is created and bound to values for every request coming in
- Provider might throw on get()
- Invalid values are not detected until the first request comes in
Hacky solution #2:
Don't annotate my properties class with ConfigurationProperties; inject the Spring Environment at construction time instead. Implement the getters as such:
int getMaxSize() {
    return environment.getProperty("myapp.limits.queue.max-size", Integer.class, 10);
}
This also works ok in terms of behavior. However
- This is not annotated as property (unlike the rest of the properties classes in this large project, makes it harder to find)
- This class does not show up in /admin/configprops
Hacky solution #3:
Schedule a recurring task that uses Environment to update my singleton QueueProperties bean.
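A minimal plain-Java sketch of this third idea follows, using only the JDK. The class RefreshableQueueLimit and the IntSupplier source are assumptions standing in for the singleton properties bean and the Environment lookup; in the real application the refresh would probably be wired through Spring's own scheduling instead.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Caches the queue limit and refreshes it from a source on a fixed schedule. */
class RefreshableQueueLimit {

    // Hypothetical stand-in for a lookup such as environment.getProperty(...).
    private final IntSupplier source;
    private final AtomicInteger maxSize;

    RefreshableQueueLimit(IntSupplier source) {
        int initial = source.getAsInt();
        if (initial <= 0) {
            // Fail at construction time instead of on the first request.
            throw new IllegalArgumentException("maxSize must be positive, got " + initial);
        }
        this.source = source;
        this.maxSize = new AtomicInteger(initial);
    }

    /** Re-reads the source; an invalid value is ignored and the last good value is kept. */
    void refresh() {
        int candidate = source.getAsInt();
        if (candidate > 0) {
            maxSize.set(candidate);
        }
    }

    /** Cheap read for the request path: no binding and no new objects. */
    int getMaxSize() {
        return maxSize.get();
    }

    /** Starts a daemon thread that calls refresh() every periodMillis milliseconds. */
    ScheduledExecutorService startRefreshing(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "queue-limit-refresh");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Swallow so one failed read does not cancel the recurring task.
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        int[] backing = {10}; // simulates a value that changes at runtime
        RefreshableQueueLimit limit = new RefreshableQueueLimit(() -> backing[0]);
        backing[0] = 25;
        limit.refresh(); // in the application the scheduler would trigger this
        System.out.println(limit.getMaxSize());
    }
}
```

The trade-off is staleness: a changed value becomes visible only after the next refresh, so the period bounds how long the old limit stays in effect.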
Any further ideas/suggestions/pointers?
Is there a canonical/recommended way to do this that does not have the shortcoming of my solutions above?

Making @Schedule run only once in a clustered environment

I have two tomee instances clustered.
Each one has a method annotated like this:
@Schedule(dayOfWeek = "*")
public void runMeDaily() {...}
I'd like to run this method only once a day, not twice a day (once on each instance).
I could use a flag as described in "Run @Scheduled task only on one WebLogic cluster node?" or just elect some node, but I wonder if there's a more elegant way to do that.
This question is somewhat related to "EJB3.1 @Schedule in clustered environment", but I am not using JBoss (and that one is not answered).

I'm using the same approach as in the other thread: checking that a particular host is the correct one to run the job. But...
I'm not very familiar with Java EE tools, but in Spring you can use profiles for that. You can probably find a similar solution for your needs. Take a look at http://spring.io/blog/2011/06/21/spring-3-1-m2-testing-with-configuration-classes-and-profiles
You can define two separate beans:
@Configuration
@Profile("dev")
public class StandaloneDataConfig {

    @Bean
    public DataSource dataSource() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.HSQL)
                .addScript("classpath:com/bank/config/sql/schema.sql")
                .addScript("classpath:com/bank/config/sql/test-data.sql")
                .build();
    }
}

@Configuration
@Profile("production")
public class JndiDataConfig {

    @Bean
    public DataSource dataSource() throws Exception {
        Context ctx = new InitialContext();
        return (DataSource) ctx.lookup("java:comp/env/jdbc/datasource");
    }
}
and decide which one to turn on by switching the profile. In the same way, your class with the method annotated @Scheduled would be loaded only for a specific profile. Of course, you then need to configure your app to turn on that profile on only one of the nodes. In a Spring app it would be as simple as passing -Dspring.profiles.active=profile to one of them.

I could only solve this using a non-Java EE solution, specific to the platform (proprietary). In my case, I am using TomEE+ and Quartz. Running Quartz in the clustered mode (org.quartz.jobStore.isClustered = true) and persisting the timers in a single database forces Quartz to choose an instance to trigger the timer, so it will only run once.
This link was very useful -- http://rmannibucau.wordpress.com/2012/08/22/tomee-quartz-configuration-for-scheduled-methods/
It's a shame Java EE does not specify a behavior for that. (yet, I hope) :-)

I solved this problem by designating one of the boxes as the master: basically, set an environment variable such as master=true on that box only,
and read it in your Java code through System.getenv("master"). If it is present and true, run your code.
Basic snippet:
@Schedule()
void process() {
    // Boolean.parseBoolean returns false for null, so nodes without the variable skip the work.
    boolean master = Boolean.parseBoolean(System.getenv("master"));
    if (master) {
        // your logic
    }
}
