Stop all spring batch jobs at shutdown (CTRL-C)

I have a Spring Boot / Spring Batch application, which starts different jobs.
When the app is stopped (CTRL-C), the jobs are left in the running state (STARTED).
Even though CTRL-C gives the app enough time to gracefully stop the jobs, the result is the same as a kill -9.
I've found a way (see below) to gracefully stop all jobs when the application is killed using CTRL-C, but I would like to know if there is a better / simpler way to achieve this goal.
Everything below is documentation on how I managed to stop the jobs.
In a blog entry from 부알프레도 a JobExecutionListener is used to register shutdown hooks which should stop jobs:
public class ProcessShutdownListener implements JobExecutionListener {
    private final JobOperator jobOperator;
    ProcessShutdownListener(JobOperator jobOperator) { this.jobOperator = jobOperator; }
    @Override public void afterJob(JobExecution jobExecution) { /* do nothing. */ }
    @Override
    public void beforeJob(final JobExecution jobExecution) {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                super.run();
                try {
                    jobOperator.stop(jobExecution.getId());
                    while (jobExecution.isRunning()) {
                        try { Thread.sleep(100); } catch (InterruptedException e) {}
                    }
                } catch (NoSuchJobExecutionException | JobExecutionNotRunningException e) { /* ignore */ }
            }
        });
    }
}
In addition to the provided code I also had to create a JobRegistryBeanPostProcessor.
Without this PostProcessor the jobOperator would not be able to find the job.
(NoSuchJobException: No job configuration with the name [job1] was registered)
@Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
    JobRegistryBeanPostProcessor postProcessor = new JobRegistryBeanPostProcessor();
    postProcessor.setJobRegistry(jobRegistry);
    return postProcessor;
}
The shutdown hook was not able to write the state to the database, as the database connection was already closed:
org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database is already closed (to disable automatic closing at VM shutdown, add ";DB_CLOSE_ON_EXIT=FALSE" to the db URL)
Processing item 2 before
Shutdown Hook is running !
2021-02-08 22:39:48.950 INFO 12676 --- [extShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2021-02-08 22:39:49.218 INFO 12676 --- [extShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
Processing item 3 before
Exception in thread "Thread-3" org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30004ms.
To make sure that Spring Boot doesn't close the Hikari datasource pool before the jobs have been stopped, I used a SmartLifecycle, as mentioned here.
The final ProcessShutdownListener looks like:
@Component
public class ProcessShutdownListener implements JobExecutionListener, SmartLifecycle {
    private final JobOperator jobOperator;
    public ProcessShutdownListener(JobOperator jobOperator) { this.jobOperator = jobOperator; }
    @Override
    public void afterJob(JobExecution jobExecution) { /* do nothing. */ }
    private static final List<Runnable> runnables = new ArrayList<>();
    @Override
    public void beforeJob(final JobExecution jobExecution) {
        runnables.add(() -> {
            try {
                if (!jobOperator.stop(jobExecution.getId())) return;
                while (jobExecution.isRunning()) {
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException ignored) { /* ignore */ }
                }
            } catch (NoSuchJobExecutionException | JobExecutionNotRunningException e) { /* ignore */ }
        });
    }
    @Override
    public void start() {}
    @Override
    public void stop() {
        // runnables.stream()
        //     .parallel()
        //     .forEach(Runnable::run);
        runnables.forEach(Runnable::run);
    }
    @Override
    public boolean isRunning() { return true; }
    @Override
    public boolean isAutoStartup() { return true; }
    @Override
    public void stop(Runnable callback) { stop(); callback.run(); }
    @Override
    public int getPhase() { return Integer.MAX_VALUE; }
}
This listener has to be registered when configuring a job:
@Bean
public Job job(JobBuilderFactory jobs,
               ProcessShutdownListener processShutdownListener) {
    return jobs.get("job1")
            .listener(processShutdownListener)
            .start(step(null))
            .build();
}
Finally, as mentioned in the exception output, the flag ;DB_CLOSE_ON_EXIT=FALSE must be added to the JDBC URL.

This approach is the way to go, because shutdown hooks are the only way (to my knowledge) offered by the JVM to intercept external signals. However, this approach is not guaranteed to work because shutdown hooks are not guaranteed to be called by the JVM. Here is an excerpt from the Javadoc of Runtime.addShutdownHook method:
In rare circumstances the virtual machine may abort, that is, stop running
without shutting down cleanly. This occurs when the virtual machine is
terminated externally, for example with the SIGKILL signal on Unix or
the TerminateProcess call on Microsoft Windows.
Moreover, shutdown hooks are expected to run "quickly":
Shutdown hooks should also finish their work quickly. When a program invokes
exit the expectation is that the virtual machine will promptly shut down
and exit.
In your case, JobOperator.stop involves a database transaction (which might cross a network) to update the job's status, and I'm not sure if this operation is "quick" enough.
As a side note, there is an example in the samples module called GracefulShutdownFunctionalTests. This example is based on JobExecution.stop which is deprecated, but it will be updated to use JobOperator.stop.
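To illustrate the concern about hooks having to finish quickly, here is a minimal sketch of a wait loop with a time budget, so that the polling in the question cannot keep the JVM alive indefinitely. BoundedWait and awaitStopped are hypothetical names and not part of Spring Batch; only plain JDK classes are used.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not a Spring Batch class): polls a "still running?" check within a time budget. */
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Returns true if isRunning turned false within the timeout, and false if the
     * timeout was reached or the waiting thread was interrupted.
     */
    static boolean awaitStopped(BooleanSupplier isRunning, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (isRunning.getAsBoolean()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // budget used up: stop waiting and let shutdown continue
            }
            try {
                // never sleep past the deadline
                Thread.sleep(Math.min(pollInterval.toMillis(), Math.max(1, remainingNanos / 1_000_000)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                return false;
            }
        }
        return true;
    }
}
```

In the listener from the question, the inner while loop could then become something like BoundedWait.awaitStopped(jobExecution::isRunning, Duration.ofSeconds(10), Duration.ofMillis(100)); the ten-second budget is an arbitrary example value.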

Related

Is there any way to ensure all CheckpointListeners notified about checkpoint completion on Flink on job cancel with savepoint?

I'm using flink 1.9 and the REST API /jobs/:jobid/savepoints to trigger the savepoint and cancel job (stop the job gracefully to run later on from savepoint).
I use a two-phase commit in source function so my source implements both CheckpointedFunction and CheckpointListener interfaces. On snapshotState() method call I snapshot the internal state and on notifyCheckpointComplete() I checkpoint state to 3rd party system.
From what I can see from source code, only the snapshotState() part is synchronous in CheckpointCoordinator -
// send the messages to the tasks that trigger their checkpoint
for (Execution execution: executions) {
    if (props.isSynchronous()) {
        execution.triggerSynchronousSavepoint(checkpointID, timestamp, checkpointOptions, advanceToEndOfTime);
    } else {
        execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);
    }
}
The checkpoint acknowledge and completion notification is asynchronous in AsyncCheckpointRunnable.
That being said, when the savepoint is triggered with cancel-job set to true, some of the Task Managers still receive the completion notification and execute notifyCheckpointComplete() after the snapshot is taken and before the job is cancelled, and some do not.
The question is whether there is a way to cancel a job with a savepoint so that notifyCheckpointComplete() is guaranteed to be invoked by all Task Managers before the job is cancelled, or whether there is no way to achieve this at the moment.

It's been a while since I looked at Flink 1.9 so please take my answer with some caution.
My guess is that your sources cancel too early. So notifyCheckpointComplete is actually sent to all tasks, but some SourceFunctions already quit the run and the respective task is cleaned up.
Afaik, what you described should be possible if you ignore cancellation and interruptions until you have received the last notifyCheckpointComplete.
class YourSource implements SourceFunction<Object>, CheckpointListener, CheckpointedFunction {
    private volatile boolean canceled = false;
    private volatile boolean pendingCheckpoint = false;
    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        pendingCheckpoint = true;
        // start two-phase commit
    }
    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
    }
    @Override
    public void notifyCheckpointComplete(long checkpointId) throws Exception {
        // finish two-phase commit
        pendingCheckpoint = false;
    }
    @Override
    public void run(SourceContext<Object> ctx) throws Exception {
        while (!canceled) {
            // do normal source stuff
        }
        // keep the task running after cancellation
        while (pendingCheckpoint) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                // ignore interruptions until two-phase commit is done
            }
        }
    }
    @Override
    public void cancel() {
        canceled = true;
    }
}
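The "ignore interruptions until the two-phase commit is done" step can also be written without a sleep loop. The following is a plain-JDK sketch, not Flink API: UninterruptibleWait is a hypothetical helper, and it assumes that snapshotState() would create a CountDownLatch and notifyCheckpointComplete() would count it down.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper (plain JDK, not Flink API): waits on a latch without giving up on interrupts. */
final class UninterruptibleWait {
    private UninterruptibleWait() {}

    /** Blocks until the latch reaches zero, swallowing interrupts while waiting. */
    static void await(CountDownLatch latch) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    latch.await();
                    return;
                } catch (InterruptedException e) {
                    interrupted = true; // remember the interrupt, but keep waiting
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore the flag once the wait is over
            }
        }
    }
}
```

Restoring the interrupt flag afterwards means the code that runs after the wait can still see that cancellation was requested.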

Wouldn't using stop-with-savepoint[1][2] solve the problem?
[1]https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jobs-jobid-stop
[2]https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html
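For reference, a sketch of how the stop-with-savepoint request from [1] could be built with the JDK 11+ HTTP client. The base URL, job ID and savepoint directory are placeholders, StopWithSavepoint is a hypothetical helper, and the body fields targetDirectory and drain are taken from the REST API documentation linked above; they should be checked against the Flink version in use.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper: builds (but does not send) a stop-with-savepoint request. */
final class StopWithSavepoint {
    private StopWithSavepoint() {}

    static HttpRequest buildRequest(String restBaseUrl, String jobId, String savepointDir) {
        // Assumption: savepointDir contains no characters that need JSON escaping.
        String body = "{\"targetDirectory\": \"" + savepointDir + "\", \"drain\": false}";
        return HttpRequest.newBuilder(URI.create(restBaseUrl + "/jobs/" + jobId + "/stop"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The request could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).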

Block Java application from exiting until ThreadPool is empty

I've got an ExecutorService sitting inside a singleton class which receives tasks from many different classes. On application shutdown, I need to wait for the pool to be empty before I allow the application to exit.
private static NotificationService instance = null;
private ExecutorService executorService = Executors.newFixedThreadPool(25);
public static synchronized NotificationService getInstance() {
    if (instance == null) {
        instance = new NotificationService(true);
    }
    return instance;
}
While using this NotificationService, it frequently happens that I restart the application and the executorService hasn't finished processing all the notifications.
For Testing, I can manually shutdown the executorService and wait until all tasks are completed.
public static boolean canExit() throws InterruptedException {
    NotificationService service = getInstance();
    service.executorService.shutdown();
    service.executorService.awaitTermination(30, TimeUnit.SECONDS);
    return service.executorService.isTerminated();
}
Is it reliable and safe to override the finalize method and wait there until the pool is empty? From what I've read, finalize is not always called, especially not when using a singleton class.
@Override
protected void finalize() throws Throwable {
    while (!canExit()) {
        Thread.sleep(100);
    }
    super.finalize();
}
This code is included in a library that will be included in another application, so there's no main method where I can wait until the pool is empty, unless I force the person using it to do so which is not great.
What is the correct way to stall the application (for a reasonable amount of time) from terminating until the pool is empty?

You can use addShutdownHook to catch the process termination event and wait for the pool there.
example:
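Runtime.getRuntime().addShutdownHook(new Thread() {
    @Override
    public void run() {
        NotificationService service = getInstance();
        service.executorService.shutdown();
        try {
            // awaitTermination throws the checked InterruptedException
            service.executorService.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
});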

Answered here: Java Finalize method call when close the application
Finalizers do not run on exit by default and the functionality to do this is deprecated.
One common advice is to use the Runtime.addShutdownHook but be aware of the following line of documentation:
Shutdown hooks should also finish their work quickly. When a program invokes exit the expectation is that the virtual machine will promptly shut down and exit. When the virtual machine is terminated due to user logoff or system shutdown the underlying operating system may only allow a fixed amount of time in which to shut down and exit. It is therefore inadvisable to attempt any user interaction or to perform a long-running computation in a shutdown hook.
In all honesty the best way to ensure everything gets properly cleaned up is to have your own application lifecycle which you can end before you even ask the VM to exit.

Don't use blocking shutdown hooks or anything similar in a library. You never know how the library is meant to be used. So it should always be up to the code that is using your library to take sensible actions on shut down.
Of course, you have to provide the necessary API for that, e.g. by adding lifecycle-methods to your class:
public class NotificationService {
    ...
    public void start() {
        ...
    }
    /**
     * Stops this notification service and waits until
     * all notifications have been processed, or a timeout occurs.
     * @return the list of unprocessed notifications (in case of a timeout
     *         or an interrupt), or an empty list.
     */
    public List<Notification> stop(long timeout, TimeUnit unit) {
        service.shutdown();
        try {
            if (service.awaitTermination(timeout, unit)) {
                return Collections.emptyList();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and fall through
        }
        // timed out or interrupted: cancel what is left and report it
        List<Runnable> tasks = service.shutdownNow();
        return extractNotification(tasks);
    }
    private List<Notification> extractNotification(List<Runnable> tasks) {
        ...
    }
}
Then, the application code can take the required actions to handle your service, e.g.:
public static void main(String[] args) {
    NotificationService service = new NotificationService(...);
    service.start();
    try {
        // use service here
    } finally {
        List<Notification> pending = service.stop(30, TimeUnit.SECONDS);
        if (!pending.isEmpty()) {
            // timeout occurred => handle pending notifications
        }
    }
}
Btw.: Avoid using singletons, if feasible.

How to close Spring beans properly after received a SIGTERM?

I'm trying to keep my application alive in order to listen for some messages from my queue. However, once my application receives a SIGTERM, I would like to ensure that it shuts down nicely, meaning that the jobs running internally have finished before the shutdown.
After reading about it, I came up with this:
@Component
public class ParserListenerStarter {
    public static void main(final String[] args) throws InterruptedException {
        ConfigurableApplicationContext context = new AnnotationConfigApplicationContext(ParserReceiveJmsContext.class);
        context.registerShutdownHook();
        System.out.println("Listening...");
        Runtime.getRuntime().addShutdownHook( //
            new Thread() {
                @Override
                public void run() {
                    System.out.println("Closing parser listener gracefully!");
                    context.close();
                }
            });
        while (true) {
            Thread.sleep(1000);
        }
    }
}
Then I send a kill command to my application; and this is my output:
Listening...
// output from my other beans here
Closing parser listener gracefully!
Process finished with exit code 143 (interrupted by signal 15: SIGTERM)
The shutdown methods from my beans were not called:
@PreDestroy public void shutdown() {..}
I'm not an expert in Spring, so I'm sorry for any silly point that I'm missing here.
How can I shut down my beans and then close my application nicely?

All you need:
context.registerShutdownHook();
So, add the code above and then your @PreDestroy method will be invoked.
After that you don't need to do anything else. This means you can delete
Runtime.getRuntime().addShutdownHook( //
    new Thread() {
        @Override
        public void run() {
            System.out.println("Closing parser listener gracefully!");
            context.close();
        }
    });
EDIT:
Documentation says that you can have multiple shutdown hooks:
When the virtual machine begins its shutdown sequence it will start all registered shutdown hooks
So the statement below is incorrect:
When you added this, you replaced the Spring hook, which do the beans destroying, because internally, method looks like
if (this.shutdownHook == null) {
    // No shutdown hook registered yet.
    this.shutdownHook = new Thread() {
        @Override
        public void run() {
            doClose();
        }
    };
    Runtime.getRuntime().addShutdownHook(this.shutdownHook);
}

ExecutorService design pattern

I have a java application which has to be run as a Linux process. It connects to a remote system via socket connection. I have two threads which run through whole life cycle of the program. This is the brief version of my application entry point:
public class SMPTerminal {
    private static java.util.concurrent.ExecutorService executor;
    public static void main(String[] args) {
        executor = Executors.newFixedThreadPool(2);
        Runtime.getRuntime().addShutdownHook(new Thread(new ShutdownHook()));
        run(new SMPConsumer());
        run(new SMPMaintainer());
    }
    public static void run(Service callableService) {
        try {
            Future<Callable> future = executor.submit(callableService);
            run(future.get().restart());
        } catch (InterruptedException | ExecutionException e) {
            // Program will shutdown
        }
    }
}
This is Service interface:
public interface Service {
    public Service restart();
}
And this is one implementation of Service interface:
public class SMPConsumer implements Callable<Service>, Service {
    @Override
    public Service call() throws Exception {
        // ...
        try {
            while (true) {
                // Perform the service
            }
        } catch (InterruptedException | IOException e) {
            // ...
        }
        return this; // Returns this instance to run again
    }
    public Service restart() {
        // Perform the initialization
        return this;
    }
}
I arrived at this structure after having headaches when a temporary IO failure or other problems caused my application to shut down. Now, if my program encounters a problem, it doesn't shut down completely but just initializes itself from scratch and continues. But I think this is somewhat weird and that I am violating OOP design rules. My questions:
Is this kind of failure handling correct or efficient?
What problems might I encounter in the future?
Do I have to study any particular design pattern for my problem?

You might not have noticed, but your run method waits for the callableService to finish execution before it returns. So you are not able to start two services concurrently. This is because Future.get() waits until the task computation completes.
public static void run(Service callableService) {
    try {
        Future<Callable> future = executor.submit(callableService);
        run(future.get().restart()); // <=== will block until task completes!
    } catch (InterruptedException | ExecutionException e) {
        // Program will shutdown
    }
}
(You should have noticed that because of the InterruptedException that must be caught; it indicates that there is some blocking, long-running operation going on.)
This also renders the executor service useless. If the code that submits a task to the executor always waits for the task to complete, there is no need to execute this task via an executor. Instead, the submitting code should call the service directly.
So I assume that blocking is not intended in this case. Your run method should probably look something like this:
public static void run(Service callableService) {
    executor.submit(() -> {
        Service result = callableService.call();
        run(result.restart());
        return result;
    });
}
This code snippet is just basic, you might want to extend it to handle exceptional situations.
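One possible way to extend it for exceptional situations is sketched below: each service runs in its own executor task, is restarted when it throws, and stops being restarted after a limit or when it is interrupted. Supervisor, supervise and maxRestarts are hypothetical names, and the loop replaces the recursive run(...) call; this is an illustration under those assumptions rather than a drop-in replacement for the code in the question.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical helper: restarts a failing task on an executor without blocking the caller. */
final class Supervisor {
    private Supervisor() {}

    /**
     * Runs the service on the executor and runs it again whenever it throws,
     * up to maxRestarts times. The returned future completes with the number of
     * attempts once the service returns normally, or fails with the last exception.
     */
    static Future<Integer> supervise(ExecutorService executor, Callable<?> service, int maxRestarts) {
        return executor.submit(() -> {
            int attempts = 0;
            while (true) {
                attempts++;
                try {
                    service.call();
                    return attempts;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw e; // interrupted (e.g. by shutdownNow()): stop instead of restarting
                } catch (Exception e) {
                    if (attempts > maxRestarts) {
                        throw e; // give up; the failure surfaces through Future.get()
                    }
                    // otherwise loop around; re-initialization of the service would go here
                }
            }
        });
    }
}
```

Because supervise returns immediately, two services can be started one after the other on the two-thread pool, and a shutdown hook can call executor.shutdownNow() to interrupt both.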

Is this kind of handling failures correct or efficient? That depends on the context of the application and how you are using error handling.
You may encounter situations where I/O failures etc. are not handled properly.
It looks like you are already using an Adapter-type design pattern. Look at the Adapter design pattern: http://www.oodesign.com/adapter-pattern.html

Handling Shutdown Event

Hi, I have a standalone application in which a file abc.lck is created when a user logs in and deleted when the application is closed. I have used addShutdownHook() to delete the file when the power supply is interrupted, i.e. when the power is switched off while my application is running. My problem is that the file is not deleted when I shut down the system manually (Start --> Shut down). I should also prompt the user with a message to save the changes, using a confirm dialog box as in MS Word. Can someone help me?
Thanks,
Chaithu

The general contract of addShutdownHook is:
The Java virtual machine shuts down in response to two kinds of events:
The program exits normally, when the last non-daemon thread exits or when the exit (equivalently, System.exit) method is invoked, or
The virtual machine is terminated in response to a user interrupt, such as typing ^C, or a system-wide event, such as user logoff or system shutdown.
A shutdown hook is simply an initialized but unstarted thread. When the virtual machine begins its shutdown sequence it will start all registered shutdown hooks in some unspecified order and let them run concurrently. When all the hooks have finished it will then run all uninvoked finalizers if finalization-on-exit has been enabled. Finally, the virtual machine will halt. Note that daemon threads will continue to run during the shutdown sequence, as will non-daemon threads if shutdown was initiated by invoking the exit method.
In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly. This occurs when the virtual machine is terminated externally, for example with the SIGKILL signal on Unix or the TerminateProcess call on Microsoft Windows. The virtual machine may also abort if a native method goes awry by, for example, corrupting internal data structures or attempting to access nonexistent memory. If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.
Hence during shutdown, the Windows machine may call TerminateProcess and hence your shutdown hook might not be invoked.

Use deleteOnExit method instead of adding shutdownhook. However, take a look at this sample,
class Shutdown {
    private Thread thread = null;
    // volatile so the worker thread sees the update made by the shutdown hook
    protected volatile boolean flag = false;
    public Shutdown() {
        thread = new Thread("Sample thread") {
            public void run() {
                while (!flag) {
                    System.out.println("Sample thread");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException ie) {
                        break;
                    }
                }
                System.out.println("[Sample thread] Stopped");
            }
        };
        thread.start();
    }
    public void stopThread() {
        flag = true;
    }
}
class ShutdownThread extends Thread {
    private Shutdown shutdown = null;
    public ShutdownThread(Shutdown shutdown) {
        super();
        this.shutdown = shutdown;
    }
    public void run() {
        System.out.println("Shutdown thread");
        shutdown.stopThread();
        System.out.println("Shutdown completed");
    }
}
public class Main {
    public static void main(String [] args) {
        Shutdown shutdown = new Shutdown();
        try {
            Runtime.getRuntime().addShutdownHook(new ShutdownThread(shutdown));
            System.out.println("[Main thread] Shutdown hook added");
        } catch (Throwable t) {
            System.out.println("[Main thread] Could not add Shutdown hook");
        }
        try {
            Thread.sleep(10000);
        } catch (InterruptedException ie) {}
        System.exit(0);
    }
}
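The deleteOnExit suggestion from this answer could look roughly like the sketch below. LockFile and acquire are hypothetical names. Note that File.deleteOnExit is documented to attempt deletion only on normal termination of the virtual machine, so a power cut or a forced kill can still leave a stale abc.lck behind that the application has to deal with on its next start.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper: creates a lock file that the JVM removes on normal termination. */
final class LockFile {
    private LockFile() {}

    /**
     * Creates the lock file and registers it for deletion on JVM exit.
     * Returns false if the file already exists (another instance, or a stale lock).
     */
    static boolean acquire(File lock) throws IOException {
        if (!lock.createNewFile()) {
            return false;
        }
        lock.deleteOnExit();
        return true;
    }
}
```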
