Spring Batch failure at the end of the day - java

Is there a solution that allows checking the JobRepository, for a given job (JobInstance), for the presence of a completed execution during the current day? If there is no COMPLETED status in the BATCH_JOB_EXECUTION table for the current day, I must send a notification or return an exit code, something like "we got nothing today".
I plan to implement the solution in a class that extends JobExecutionListenerSupport, like this:
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {

    private Logger logger = LoggerFactory.getLogger(JobCompletionNotificationListener.class);

    private JobRegistry jobRegistry;
    private JobRepository jobRepository;

    public JobCompletionNotificationListener(JobRegistry jobRegistry, JobRepository jobRepository) {
        this.jobRegistry = jobRegistry;
        this.jobRepository = jobRepository;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        System.out.println("finishhhhh");
        // the logic if no job completed today (pseudocode)
        if (noJobCompletedToDay) {
            Notify();
        }
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            logger.info("!!! JOB FINISHED! -> example action execute after Job");
        }
    }
}

You can use JobExplorer#getLastJobExecution to get the last execution for your job instance and check if it's completed during the current day.
Depending on when you are going to do that check, you might also make sure there are no currently running jobs (JobExplorer#findRunningJobExecutions can help).
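The date part of that check is plain java.time. A minimal sketch of the "completed during the current day" test, assuming the execution end time comes from JobExecution#getEndTime() (a java.util.Date in Spring Batch 4.x); the class and method names here are illustrative:

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

public class CompletionCheck {

    // In the real listener this would be fed from jobExecution.getEndTime(),
    // alongside a check that jobExecution.getStatus() == BatchStatus.COMPLETED.
    static boolean completedToday(Date endTime, ZoneId zone) {
        if (endTime == null) {
            return false; // the execution never finished
        }
        LocalDate endDate = endTime.toInstant().atZone(zone).toLocalDate();
        return endDate.equals(LocalDate.now(zone));
    }
}
```

If this returns false for the last execution of the day, the listener can send the "we got nothing today" notification.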

You can implement monitoring in multiple ways. Since version 4.2, Spring Batch provides support for metrics and monitoring based on Micrometer. There is a [grafana sample][1] with Prometheus and Grafana on which you can rely to build a custom dashboard or raise alerts from these tools.
If you have several batch processes this may be the best option; in addition, these tools will help you monitor services, applications, etc.
Built-in metrics:
Duration of job execution
Currently active jobs
Duration of step execution
Duration of item reading
Duration of item processing
Duration of chunk writing
You can create your own custom metrics (e.g. execution failures).
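A custom metric like that takes only a few lines with Micrometer. A sketch, assuming Micrometer is on the classpath and a MeterRegistry is injected; the metric name batch.job.failures and the class itself are made up for illustration:

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

public class JobFailureMetrics {

    private final Counter failureCounter;

    public JobFailureMetrics(MeterRegistry registry) {
        // Registers a custom counter that Prometheus/Grafana can scrape.
        this.failureCounter = Counter.builder("batch.job.failures")
                .description("Number of failed job executions")
                .register(registry);
    }

    // Call this from a JobExecutionListener when a job ends with FAILED status.
    public void onJobFailed() {
        failureCounter.increment();
    }
}
```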
Otherwise, you can implement the monitoring through another independent batch process that runs periodically and sends a notification/mail, collecting for example the state of the process from the database, from the application, or from a filesystem shared between both processes.
You can also implement the check the way you describe it; there is an interesting thread describing how to throw an exception in one step and process it in a subsequent step that sends an alert if appropriate.

How to Limit the Job Execution Ids Listings in Spring-data-flow-server's Task Execution Details page?

I have a set of Spring Boot batch jobs, which I have deployed on a Spring Cloud Data Flow server. I'm using the local server configuration. But I also want the scheduling option for each job inside my application. So, as mentioned in the document Scheduling for Local Config for scheduling jobs using the local configuration, I use REST services along with the @Scheduled annotation to kick-start the job, otherwise known as a task in SCDF.
These scheduled jobs are supposed to run at 15-minute intervals for several days, and there are 10 jobs. So what's happening when I launch the job using the REST API is:
The job gets launched and executed at the scheduled intervals. A Job Id link is produced on the Task Execution Details page.
The job gets launched again as per the scheduled interval, and a new Job Id link is produced along with the previous Job Id.
Since these are all jobs that will have several executions (500+) over several days, the Task Execution Details page would end up showing hundreds of Job Ids. There's no scroll bar, and the list occupies more than half the page.
//Job Config
@Configuration
@EnableBatchProcessing
@EnableTask
public class Job1Loader {

    @Bean
    public Job loadJob1() {
        return jobBuilderFactory().get("JOb1Loader")
                .incrementer(new RunIdIncrementer())
                .flow(job01_step01())
                .end()
                .build(); // return job
    }
}
//Rest Controller
@RestController
public class JobLauncherController {

    Logger logger = LoggerFactory.getLogger(JobLauncherController.class);

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    @Qualifier(value = "loadJob1")
    Job job1;

    @Scheduled(cron = "0 */2 * * * ?")
    @RequestMapping("/LaunchJob1")
    public String LaunchJob1() throws Exception {
        logger.info("Executing LaunchJob1");
        JobParameters jobParameters = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(job1, jobParameters);
        return "Job has been launched";
    }
}
So my question is this: how can I limit the number of Job Ids listed on the "Task Execution Details" page to a minimum of 10? Or is there a possibility of introducing a scroll bar once a certain threshold of Job Ids is reached? I have attached a screenshot for better understanding.
Currently, the REST API for the task execution response, which includes the JobExecutionIds, doesn't have such filtering options. What you mention above is more of a feature request than an issue :-)
Would you mind creating a feature request here? And of course, you are welcome to contribute by providing a pull request with the changes.

Spring Integration @Scheduled not working due to @Poller

I have a Spring Boot / Spring Integration application that makes use of @Poller in Spring Integration and also @Scheduled on another method in a mostly-unrelated class. The @Poller is for polling an FTP server for new files. However, I've found that the @Poller seems to be somehow interfering with my @Scheduled method.
The @Poller has maxMessagesPerPoll = -1 so that it will process as many files as it can get. However, when I first start my application, there are over 100 files on the FTP server, so it's going to process them all. What I have found is that, while these files are being processed, the @Scheduled method stops triggering at all.
For example, if I set my @Scheduled method to fixedDelay = 1 so that it triggers every millisecond and then start my application, the @Scheduled method will trigger a few times, until the @Poller triggers and begins processing messages, at which point my @Scheduled method completely stops triggering. I assumed that there was simply some task queue being filled by the @Poller and I needed to wait for all of the messages to be processed, but even after the @Poller is completely done and has processed all of the files, the @Scheduled method still does not trigger at all.
My thought is that maybe there is some task queue being filled by the @Poller that is breaking my @Scheduled method, but if so, I still don't see any way to use a separate task queue for the different methods, or any other options for customizing or fixing this issue.
Does anyone have any idea what might be happening to my @Scheduled method, and how I can fix it?
@Poller:
@Bean
@InboundChannelAdapter(channel = "ftpChannel", poller = @Poller(cron = "0/5 * * ? * *", maxMessagesPerPoll = "-1"))
public MessageSource<InputStream> myMessageSource() {
    // Build my message source
    return messageSource;
}

@Scheduled:
@Scheduled(fixedDelay = 6000)
public void myScheduledMethod() {
    // Do stuff
}
They both use the same scheduler bean name, taskScheduler.
It should only be a problem if you have 10 or more pollers (the default scheduler bean configured by Spring Integration has a pool size of 10 by default). A common mistake is having many queue channels (which hold on to scheduler threads for a second at a time, by default).
If you only have one poller, and not a lot of queue channels, I can't explain why you would get thread starvation.
You can increase the pool size - see Configuring the Task Scheduler.
Or you can use a different scheduler in the ScheduledAnnotationBeanPostProcessor.
As already pointed out, the problem is linked to the task schedulers having the same bean name, although it may occur even if there are fewer than 10 pollers. Spring Boot auto-configuration provides a scheduler with a default pool size of 1, and registration of this scheduler may happen before the registration of the taskScheduler provided by Spring Integration.
Configuring the task scheduler via Spring Integration properties doesn't help, as that bean doesn't get registered at all. But providing your own instance of TaskScheduler with an adjusted pool size, changing the pool size of the auto-configured scheduler via the spring.task.scheduling.pool.size property, or excluding TaskSchedulingAutoConfiguration should solve the issue.
In our case, the Poller was used by an inbound-channel-adapter to fetch mail from an IMAP server, but when it polled for an email with large attachments, it blocked the thread used by @Scheduled, as only a single thread was available for scheduling tasks.
So we set the Spring property spring.task.scheduling.pool.size=2, which now allows the @Scheduled method to run on a different thread even if the poller gets blocked (on another thread) while trying to fetch mail from the IMAP server.
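A sketch of the "provide your own instance" option as a Spring @Configuration class; the pool size of 4 and the thread-name prefix are arbitrary example values:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.TaskScheduler;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;

@Configuration
public class SchedulingConfig {

    // Replaces the single-threaded auto-configured scheduler with a small
    // pool, so one blocked poller thread cannot starve @Scheduled methods.
    @Bean
    public TaskScheduler taskScheduler() {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setPoolSize(4);
        scheduler.setThreadNamePrefix("app-scheduler-");
        scheduler.initialize();
        return scheduler;
    }
}
```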

Is it possible to activate or deactivate jobs via a configuration file so as to avoid unintentional startups?

I would like to have the possibility to activate or deactivate jobs via:
a configuration file with a specific ON/OFF flag for each job, or
a MySQL table with a specific ON/OFF flag for each job.
The change must be picked up on each change of status: for example, if a job is OFF and its setting is switched to ON (a state change), the Java app should receive the status update.
Thanks for helping me out.
If I understood your question correctly, you are looking for capability to control your jobs from configuration. If yes, then following might be helpful.
You can schedule your jobs using Apache Camel routing.
public class JobExecutorRoute extends RouteBuilder {

    private String script = "exec:yourjob.sh";
    private String cron = "quartz2://group/timer?cron=00+00+010+*+*+?";

    public void configure() throws Exception {
        // Trigger the script on the cron schedule.
        from(cron).autoStartup(true).to(script);
    }
}
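If Camel is more than you need, the ON/OFF file idea from the question can be sketched with a plain properties file that is re-read before each launch; the property naming scheme here is made up for illustration:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class JobSwitch {

    // Re-read the configuration before each scheduled launch, so a status
    // change (OFF -> ON) is picked up without restarting the application.
    static boolean isJobEnabled(Reader config, String jobName) throws IOException {
        Properties props = new Properties();
        props.load(config);
        return Boolean.parseBoolean(props.getProperty("jobs." + jobName + ".enabled", "false"));
    }

    public static void main(String[] args) throws IOException {
        String config = "jobs.import.enabled=true\njobs.export.enabled=false\n";
        System.out.println(isJobEnabled(new StringReader(config), "import")); // prints "true"
        System.out.println(isJobEnabled(new StringReader(config), "export")); // prints "false"
    }
}
```

The same check works against a MySQL table by replacing the Properties lookup with a query for the job's flag.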

Is running a long-running-process always bad on a servlet thread having a single HTTP request? [duplicate]

This question already has answers here:
How to run a background task in a servlet based web application?
(5 answers)
Closed 6 years ago.
I have a Java application (running on WAS 8.5) which acts as a kind of server inventory for the client. The application has a servlet that triggers a long running process.
The process fetches data from a third-party DB, executes Java logic, and writes records back to the application's own DB (these DB connections are pooled).
The servlet is not load-on-startup and is manually triggered only once a month by a single Operations guy (on some particular date based on the client's choice each month). The servlet had been historically using Timer and TimerTask in this way:
public class SyncMissingServlet extends HttpServlet implements Servlet {

    public void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        try {
            SyncMissing.runSync();
        } catch (Exception ex) {
            logger.error(new LogMessage("ERROR: "), ex);
            this.sendReply(printWriter, "ERROR: " + ex.toString());
        }
    }
}

public class SyncMissing {

    public static void runSync() throws Exception {
        Timer t = new Timer(true);
        SyncMissingTask task = new SyncMissingTask(); // SyncMissingTask is an instance of TimerTask
        // Start the synchronization 5 secs from now, and run it every 30 days.
        t.schedule(task, 5000, 2592000000L); // this 30-day timing never really worked out for the client,
                                             // since the app server is restarted frequently for deployments.
    }
}
There is no call to Timer.cancel() or TimerTask.cancel() in the current code.
Recently this servlet seems to have been auto-triggered after a system reboot and restart of the WAS services on the system... and that's the worry.
While I couldn't explain the auto-trigger to my client, I proposed the following options:
1. drop the use of Timer and TimerTask (the long-running process then runs on the servlet's thread itself)
2. instead of TimerTask, make it a regular Runnable and run it in a separate thread spawned from the servlet thread.
3. make use of Java's ExecutorService
4. migrate to Servlet 3.0 and turn the servlet into an async servlet.
5. drop the servlet altogether and replace it with a batch job.
I understand that options 3 and 4 are really the recommended ones (or possibly option 5). But I have a feeling that in my business scenario options 3 and 4 may be overkill.
If the need is really a manual invocation of the servlet by only one user per month, are options 1 and 2 that bad?
(my client wants the quickest solution and would certainly not fund option 5)
Well, if the servlet is supposed to be run only once in a month and there is only one client triggering it, it is fine to run it in the servlet's thread itself or create a new thread inside the servlet and let that do the task. The question of load and response times arises when you have a lot of clients making simultaneous requests, at which point you might want to use an Executor service or an async servlet.
There is no need to activate a background task by invoking a servlet. Your web app has its own lifecycle. The Servlet spec provides hooks for your web app getting set-up and torn-down. Perfect place to launch and quit your background task without ever invoking a servlet by a client user.
No need to depend on a human user remembering to start the background task. Let your web app technology do the work for you.
Also, you may often hear/read "never launch threads from JSP or Servlet". This is worthy advice with regard to processing each incoming request for generating a web page. But background tasks (not directly related to a single servlet request) is a different animal; perfectly okay to have threads for background tasks as long as you handle them properly. By 'properly' I mean you explicitly end those threads appropriately, and you handle thread-safety issues. An example of a background task might be regularly polling a web service or database to refresh a cache of data.
ServletContextListener
If you want an automated task to be performed regularly within your web app, use a ServletContextListener.
That interface defines a pair of methods. One, contextInitialized, is called automatically when the web app launches, guaranteed to run before any HTTP requests are handled. The other method, contextDestroyed, runs when the web app is being torn down.
Tip: Marking your listener with a #WebListener annotation will cause your Servlet container to automatically notice it and instantiate when the web app is launched.
Beware of a nasty bug when doing development with NetBeans & Tomcat (development problem only, not a problem in deployment) where the web app does a double launch.
ScheduledExecutorService
In your custom class implementing that interface, in contextInitialized, establish a ScheduledExecutorService object to run your task repeatedly. In contextDestroyed, shutdown that executor. This is very important as the thread(s) of that executor will survive the shutdown of your web app and even the servlet container.
The ScheduledExecutorService technology supplants the Timer and TimerTask classes. These classes are especially not recommended for use in a Java Servlet environment.
You can store a reference to the executor in your listener object.
@WebListener
public class MonthlyTaskRunner implements ServletContextListener {

    private ScheduledExecutorService scheduledExecutorService;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // Initialize your ScheduledExecutorService.
        // The executor will use one or more threads of its own for its work.
        this.scheduledExecutorService = … ;
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // Shut down the executor along with its thread(s).
        this.scheduledExecutorService.shutdown();
    }
}
I and others have posted on this extensively here on Stack Overflow, such as this. So search Stack Overflow. I have posted extensive examples in the context of Vaadin web apps, but the principles apply to any servlet web app. And see the Oracle Tutorial on Executors.
Where to store a reference to your ScheduledExecutorService once instantiated? You could store in a member variable on your context listener. But a more accessible place would be as an "attribute" on the servlet context. I describe this in detail along with example code and a nifty diagram in my Answer to another Question: Start & Stop a ScheduledExecutorService in Java EE environment using servlet
YearMonth
In that executor task, get the year-month of the current date for the time zone of your business context. Compare that year-month to one recorded when the task was last performed. Record that year-month somewhere, in a file, in a database, someplace.
Schedule your ScheduledExecutorService to run more often than necessary. Rather than worry about scheduling out a month, just let it run everyday. The check to compare current YearMonth with stored year-month requires nearly no execution time. KISS.
Java includes a YearMonth class.
YearMonth ymThen = YearMonth.parse( "2016-11" ); // Retrieve that string from storage.
ZoneId z = ZoneId.of( "America/Montreal" );
YearMonth ymNow = YearMonth.now( z );
if ( ymNow.isAfter( ymThen ) ) {
    // … run the task
    String ymOutput = ymNow.toString();
    // … write that `ymOutput` string someplace in storage.
} // Else do nothing. Let the ScheduledExecutorService run again after its designated rest period.
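That month-rollover comparison is easy to verify in isolation; a minimal, self-contained version (class and method names are illustrative):

```java
import java.time.YearMonth;

public class MonthlyCheck {

    // The monthly task is due when the current year-month is after the
    // last recorded one (or when nothing has been recorded yet).
    static boolean isTaskDue(YearMonth lastRun, YearMonth now) {
        return lastRun == null || now.isAfter(lastRun);
    }

    public static void main(String[] args) {
        YearMonth then = YearMonth.parse("2016-11");
        YearMonth now = YearMonth.parse("2016-12");
        System.out.println(isTaskDue(then, now)); // prints "true"
    }
}
```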

Using Spring @Async and ThreadPoolTaskScheduler with pool-size=1

We have a service implementation in our Spring-based web application that increments some statistics counters in the DB. Since we don't want to mess up response time for the user, we defined them as asynchronous using Spring's @Async:
public interface ReportingService {

    @Async
    Future<Void> incrementLoginCounter(Long userid);

    @Async
    Future<Void> incrementReadCounter(Long userid, Long productId);
}
And the spring task configuration like this:
<task:annotation-driven executor="taskExecutor" />
<task:executor id="taskExecutor" pool-size="10" />
Now, having pool-size="10", we have concurrency issues when two threads try to create the same initial record that will contain the counter.
Is it a good idea to set pool-size="1" here to avoid those conflicts? Does this have any side effects? We have quite a few places that fire async operations to update statistics.
The side effects would depend on the speed at which tasks are added to the executor compared to how quickly a single thread can process them. If the number of tasks added per second is greater than the number a single thread can process in a second, you run the risk of the queue growing over time until you finally get an out-of-memory error.
Check out the executor section on this page: Task Execution. It states that having an unbounded queue is not a good idea.
If you know that you can process tasks faster than they will be added, you are probably safe. If not, you should add a queue capacity and handle the input thread blocking if the queue reaches this size.
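The bounded-queue advice can be sketched with a plain JDK executor; the pool size, queue capacity, and the CallerRunsPolicy choice below are illustrative. With CallerRunsPolicy, a full queue makes the submitting thread run the task itself, throttling producers instead of letting the queue grow without bound:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedExecutorDemo {

    // Runs n trivial tasks through a bounded executor and returns how many ran.
    static int runTasks(int n) throws InterruptedException {
        // 2 worker threads, queue capped at 10 tasks.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(10),
                // When the queue is full, run the task on the submitting
                // thread, which slows producers down instead of growing the queue.
                new ThreadPoolExecutor.CallerRunsPolicy());

        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            executor.execute(processed::incrementAndGet);
        }
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(100)); // prints "100"
    }
}
```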
Looking at the two examples you posted, instead of a constant stream of @Async calls, consider updating a JVM-local variable upon client requests, and then have a background thread write it to the database every now and then. Along the lines of (mind the semi-pseudo-code):
class DefaultReportingService implements ReportingService {

    ConcurrentMap<Long, AtomicLong> numLogins = new ConcurrentHashMap<>();

    public void incrementLoginCounterForUser(Long userId) {
        // computeIfAbsent avoids an NPE (and a race) when the user has no entry yet.
        numLogins.computeIfAbsent(userId, id -> new AtomicLong()).incrementAndGet();
    }

    @Scheduled(..)
    void saveLoginCountersToDb() {
        for (Map.Entry<Long, AtomicLong> entry : numLogins.entrySet()) {
            AtomicLong counter = entry.getValue();
            Long toBeSummedWithTheValueInDb = counter.getAndSet(0L);
            // ...
        }
    }
}
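The counter-and-flush pattern itself is plain JDK and can be verified in isolation; the class below is a stripped-down illustration, not the answer's Spring service:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class LoginCounter {

    private final ConcurrentMap<Long, AtomicLong> numLogins = new ConcurrentHashMap<>();

    public void increment(Long userId) {
        // computeIfAbsent closes the race where two threads try to create
        // the initial entry for the same user at once.
        numLogins.computeIfAbsent(userId, id -> new AtomicLong()).incrementAndGet();
    }

    // Returns the count accumulated since the last flush and resets it to
    // zero, as the scheduled "save to DB" method would do.
    public long flush(Long userId) {
        AtomicLong counter = numLogins.get(userId);
        return counter == null ? 0L : counter.getAndSet(0L);
    }
}
```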
