How to end a Spring @Scheduled task gracefully? - java

I'm trying to get a Spring Boot service to shut down gracefully. It has a method with a @Scheduled annotation. The service uses spring-data for the DB and spring-cloud-stream for RabbitMQ. It's vital that the DB and RabbitMQ remain accessible until the scheduled method ends. There's an autoscaler in place which frequently starts and stops service instances, so crashing while stopping is not an option.
From the post "Spring - Scheduled Task - Graceful Shutdown" I gather that it should be enough to add
    @Bean
    TaskSchedulerCustomizer taskSchedulerCustomizer() {
        return taskScheduler -> {
            taskScheduler.setWaitForTasksToCompleteOnShutdown(true);
            taskScheduler.setAwaitTerminationSeconds(30);
        };
    }
and Spring should then delay shutting down the application until the scheduled method has finished or 30 seconds have elapsed.
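For orientation, these two settings correspond to plain java.util.concurrent behaviour: the underlying executor is stopped with shutdown() rather than shutdownNow(), and the shutdown then blocks in awaitTermination() for at most the configured time. The following is a minimal JDK-only sketch of that behaviour, not Spring's actual implementation; the class and method names are made up for illustration, and the 200 ms sleep merely stands in for the real DB and RabbitMQ work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, JDK-only illustration; not Spring's actual implementation.
final class GracefulSchedulerShutdownSketch {

    /**
     * Stops accepting new work, lets in-flight tasks finish, and blocks for at
     * most the given timeout. Returns true if the executor terminated in time.
     */
    static boolean shutdownAndAwait(ScheduledExecutorService scheduler, long timeout, TimeUnit unit) {
        scheduler.shutdown(); // unlike shutdownNow(), does not interrupt running tasks
        try {
            return scheduler.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Starts one slow task, requests shutdown while it runs, and reports whether it completed. */
    static boolean inFlightTaskCompletes() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean completed = new AtomicBoolean(false);

        scheduler.schedule(() -> {
            started.countDown();
            try {
                Thread.sleep(200); // stands in for the DB and RabbitMQ work
                completed.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // would only happen with shutdownNow()
            }
        }, 0, TimeUnit.MILLISECONDS);

        try {
            started.await(); // make sure the task is running before shutting down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        boolean terminated = shutdownAndAwait(scheduler, 5, TimeUnit.SECONDS);
        return terminated && completed.get();
    }

    public static void main(String[] args) {
        System.out.println("in-flight task completed: " + inFlightTaskCompletes());
    }
}
```

This only covers the scheduler's own thread pool. By itself it says nothing about when other resources the task depends on are closed.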
When the service is stopped while the scheduled method is executing, I can see the following in the log:
* spring-cloud-stream closes its connections without waiting for the method to finish.
* spring-data also closes the DB connection right away.
* The method is not stopped and tries to finish, but fails since it can no longer access the DB.
Any ideas how to get the scheduled method, including its DB connection and RabbitMQ access, to finish gracefully?
This is my application class:
    @SpringBootApplication(scanBasePackages = {
            "xx.yyy.infop.dao",
            "xx.yyy.infop.compress"})
    @EntityScan("ch.sbb.infop.common.entity")
    @EnableJpaRepositories({"xx.yyy.infop.dao", "xx.yyy.infop.compress.repository"})
    @EnableBinding(CompressSink.class)
    @EnableScheduling
    public class ApplicationCompress {
        @Value("${max.commpress.timout.seconds:300}")
        private int maxCompressTimeoutSeconds;
        public static void main(String[] args) {
            SpringApplication.run(ApplicationCompress.class, args);
        }
        @Bean
        TaskSchedulerCustomizer taskSchedulerCustomizer() {
            return taskScheduler -> {
                taskScheduler.setWaitForTasksToCompleteOnShutdown(true);
                taskScheduler.setAwaitTerminationSeconds(maxCompressTimeoutSeconds);
            };
        }
    }
And this is the bean:
    @Component
    @Profile("!integration-test")
    public class CommandReader {
        private static final Logger LOGGER = LoggerFactory.getLogger(CommandReader.class);
        private final CompressSink compressSink;
        private final CommandExecutor commandExecutor;
        CommandReader(CompressSink compressSink, CommandExecutor commandExecutor) {
            this.compressSink = compressSink;
            this.commandExecutor = commandExecutor;
        }
        @PreDestroy
        private void preDestory() {
            LOGGER.info("preDestory");
        }
        @Scheduled(fixedDelay = 1000)
        public void poll() {
            LOGGER.debug("Start polling.");
            ParameterizedTypeReference<CompressCommand> parameterizedTypeReference = new ParameterizedTypeReference<>() {
            };
            // Poll the sync input first; fall back to the async input if nothing was received
            if (!compressSink.inputSync().poll(this::execute, parameterizedTypeReference)) {
                compressSink.inputAsync().poll(this::execute, parameterizedTypeReference);
            }
            LOGGER.debug("Finished polling.");
        }
        private void execute(Message<?> message) {
            CompressCommand compressCommand = (CompressCommand) message.getPayload();
            // uses spring-data to write to DB
            CompressResponse compressResponse = commandExecutor.execute(compressCommand);
            // Writes the response to the response queue
            compressSink.outputResponse().send(MessageBuilder.withPayload(compressResponse).build());
        }
    }
And here some lines from the log (see https://pastebin.com/raw/PmmqhH1P for the full log):
2020-05-15 11:59:35,640 [restartedMain] - INFO org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler.initialize - traceid= - Initializing ExecutorService 'taskScheduler'
2020-05-15 11:59:44,976 [restartedMain] - INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.initialize - traceid= - Initializing ExecutorService 'applicationTaskExecutor'
Disconnected from the target VM, address: '127.0.0.1:52748', transport: 'socket'
2020-05-15 12:00:01,228 [SpringContextShutdownHook] - INFO org.springframework.cloud.stream.binder.BinderErrorChannel.adjustCounterIfNecessary - traceid= - Channel 'application-1.kompressSync.komprimierungSyncProcessingGroup.errors' has 1 subscriber(s).
2020-05-15 12:00:01,229 [SpringContextShutdownHook] - INFO org.springframework.cloud.stream.binder.BinderErrorChannel.adjustCounterIfNecessary - traceid= - Channel 'application-1.kompressSync.komprimierungSyncProcessingGroup.errors' has 0 subscriber(s).
2020-05-15 12:00:01,232 [SpringContextShutdownHook] - INFO org.springframework.cloud.stream.binder.BinderErrorChannel.adjustCounterIfNecessary - traceid= - Channel 'application-1.kompressAsync.komprimierungAsyncProcessingGroup.errors' has 1 subscriber(s).
2020-05-15 12:00:01,232 [SpringContextShutdownHook] - INFO org.springframework.cloud.stream.binder.BinderErrorChannel.adjustCounterIfNecessary - traceid= - Channel 'application-1.kompressAsync.komprimierungAsyncProcessingGroup.errors' has 0 subscriber(s).
2020-05-15 12:00:01,237 [SpringContextShutdownHook] - INFO org.springframework.integration.endpoint.EventDrivenConsumer.logComponentSubscriptionEvent - traceid= - Removing {logging-channel-adapter:_org.springframework.integration.errorLogger} as a subscriber to the 'errorChannel' channel
2020-05-15 12:00:01,237 [SpringContextShutdownHook] - INFO org.springframework.integration.channel.PublishSubscribeChannel.adjustCounterIfNecessary - traceid= - Channel 'application-1.errorChannel' has 0 subscriber(s).
2020-05-15 12:00:01,237 [SpringContextShutdownHook] - INFO org.springframework.integration.endpoint.EventDrivenConsumer.stop - traceid= - stopped bean '_org.springframework.integration.errorLogger'
2020-05-15 12:00:01,244 [SpringContextShutdownHook] - INFO org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.shutdown - traceid= - Shutting down ExecutorService 'applicationTaskExecutor'
2020-05-15 12:00:01,245 [SpringContextShutdownHook] - INFO yy.xxx.infop.compress.CommandReader.preDestory - traceid= - preDestory
2020-05-15 12:00:01,251 [SpringContextShutdownHook] - INFO org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean.destroy - traceid= - Closing JPA EntityManagerFactory for persistence unit 'default'
2020-05-15 12:00:01,256 [SpringContextShutdownHook] - INFO org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler.shutdown - traceid= - Shutting down ExecutorService 'taskScheduler'
2020-05-15 12:00:01,256 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 4
2020-05-15 12:00:02,257 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 5
2020-05-15 12:00:03,258 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 6
2020-05-15 12:00:04,260 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 7
2020-05-15 12:00:05,260 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 8
2020-05-15 12:00:06,261 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - 9
2020-05-15 12:00:07,262 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.doInTransactionWithoutResult - traceid=d22b696edc90e123 - end
2020-05-15 12:00:07,263 [scheduling-1] - INFO yy.xxx.infop.compress.condense.VmLaufVerdichter.verdichte - traceid=d22b696edc90e123 - VarianteTyp=G, vmId=482392382, vnNr=8416
2020-05-15 12:00:07,326 [scheduling-1] -ERROR yy.xxx.infop.compress.CommandExecutor.execute - traceid=d22b696edc90e123 -
org.springframework.beans.factory.BeanCreationNotAllowedException: Error creating bean with name 'inMemoryDatabaseShutdownExecutor': Singleton bean creation not allowed while singletons of this factory are in destruction (Do not request a bean from a BeanFactory in a destroy method implementation!)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:208) ~[spring-beans-5.2.4.RELEASE.jar:5.2.4.RELEASE]
2020-05-15 12:00:08,332 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.execute - traceid=d22b696edc90e123 - Compress started. compressCommand=yy.xxx.infop.compress.client.CompressCommand@247ec0d[hostName=K57176,jobId=b1211ee8-4a54-47f2-a58b-92b3560bbddd,cmdId=1,userId=goofy2,commandTyp=verdichtet G, T und komprimiert G, T,vmId=482392382,started=1589536752609]
2020-05-15 12:00:08,337 [scheduling-1] -ERROR yy.xxx.infop.compress.CommandExecutor.execute - traceid=d22b696edc90e123 -
org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is java.lang.IllegalStateException: EntityManagerFactory is closed
at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:448) ~[spring-orm-5.2.4.RELEASE.jar:5.2.4.RELEASE]
2020-05-15 12:00:10,339 [scheduling-1] - INFO yy.xxx.infop.compress.CommandExecutor.execute - traceid=d22b696edc90e123 - Compress started. compressCommand=yy.xxx.infop.compress.client.CompressCommand@247ec0d[hostName=K57176,jobId=b1211ee8-4a54-47f2-a58b-92b3560bbddd,cmdId=1,userId=goofy2,commandTyp=verdichtet G, T und komprimiert G, T,vmId=482392382,started=1589536752609]
2020-05-15 12:00:10,343 [scheduling-1] -ERROR yy.xxx.infop.compress.CommandExecutor.execute - traceid=d22b696edc90e123 -
org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is java.lang.IllegalStateException: EntityManagerFactory is closed
2020-05-15 12:00:10,351 [scheduling-1] -DEBUG yy.xxx.infop.compress.CommandReader.poll - traceid=d22b696edc90e123 - Finished polling.
2020-05-15 12:00:10,372 [SpringContextShutdownHook] - INFO org.springframework.integration.monitor.IntegrationMBeanExporter.destroy - traceid= - Summary on shutdown: bean 'response'
2020-05-15 12:00:10,372 [SpringContextShutdownHook] - INFO org.springframework.integration.monitor.IntegrationMBeanExporter.destroy - traceid= - Summary on shutdown: nullChannel
2020-05-15 12:00:10,373 [SpringContextShutdownHook] - INFO org.springframework.integration.monitor.IntegrationMBeanExporter.destroy - traceid= - Summary on shutdown: bean 'errorChannel'
2020-05-15 12:00:10,373 [SpringContextShutdownHook] - INFO org.springframework.integration.monitor.IntegrationMBeanExporter.destroy - traceid= - Summary on shutdown: bean '_org.springframework.integration.errorLogger.handler' for component '_org.springframework.integration.errorLogger'
2020-05-15 12:00:10,374 [SpringContextShutdownHook] - INFO com.zaxxer.hikari.HikariDataSource.close - traceid= - HikariPool-1 - Shutdown initiated...
2020-05-15 12:00:10,405 [SpringContextShutdownHook] - INFO com.zaxxer.hikari.HikariDataSource.close - traceid= - HikariPool-1 - Shutdown completed.
Process finished with exit code 130

I've tested this configuration which should do the same as your TaskSchedulerCustomizer:
spring.task.scheduling.shutdown.await-termination=true
spring.task.scheduling.shutdown.await-termination-period=30s
With this configuration, if there are active tasks, Spring waits up to 30 seconds with all services still available before shutting anything down. If there are no active tasks, the shutdown is immediate.
It is worth mentioning that what brought me to this question was the graceful shutdown of @Async methods, which is configured in a very similar way:
spring.task.execution.shutdown.await-termination=true
spring.task.execution.shutdown.await-termination-period=1s
or in code:
    @Bean
    public TaskExecutorCustomizer taskExecutorCustomizer() {
        // Applies to @Async tasks, not @Scheduled as in the question
        return (taskExecutor) -> {
            taskExecutor.setWaitForTasksToCompleteOnShutdown(true);
            taskExecutor.setAwaitTerminationSeconds(10);
        };
    }
Back to your case, my guess is that the TaskSchedulerCustomizer is not actually executed or is overridden by something else after it executes.
For the first option, validate by adding logging statements or setting a breakpoint in taskSchedulerCustomizer().
For the second option, I suggest setting a breakpoint in TaskSchedulerBuilder::configure() to see what happens. Once the debugger breaks in that method, add a data breakpoint on the ExecutorConfigurationSupport::awaitTerminationMillis property of the taskScheduler to see if that property is modified elsewhere.
You can see the final termination period used in the shutdown process in the method ExecutorConfigurationSupport::awaitTerminationIfNecessary.

Related

Apache Camel Route is getting started automatically

The route loadfile starts automatically when I start the main class.
On an exception, when the process should finish, it starts loadfile again and again.
The timer should start first and then call the loadfile route, but loadfile is being started independently as well as from the timer.
    CamelContext context = new DefaultCamelContext(sr);
    try {
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() throws Exception {
                onException(Exception.class)
                    .log(LoggingLevel.INFO, "Extype:${exception.message}")
                    .stop();
                from("timer://alertstrigtimer?period=60s&repeatCount=1")
                    .startupOrder(1)
                    .log(LoggingLevel.INFO, "*******************************Job-Alert-System: Started: alertstrigtimer******************************")
                    .to("direct:loadFile").stop();
                from("direct:loadFile").routeId("loadfile")
                    .log(LoggingLevel.INFO, "*******************************Job-Alert-System: Started: direct:loadFile******************************")
                    .from(getTriggerFileURI(getWorkFilePath(), getWorkFileName())).choice()
                    .
                    .
        });
        context.start();
        Thread.sleep(40000);
The following is the log:
[main] INFO org.apache.camel.impl.DefaultCamelContext - Apache Camel 2.21.1 (CamelContext: camel-1) is starting
[main] INFO org.apache.camel.management.ManagedManagementStrategy - JMX is enabled
[main] INFO org.apache.camel.impl.converter.DefaultTypeConverter - Type converters loaded (core: 194, classpath: 14)
[main] INFO org.apache.camel.impl.DefaultCamelContext - StreamCaching is not in use. If using streams then its recommended to enable stream caching. See more details at http://camel.apache.org/stream-caching.html
[main] INFO org.apache.camel.impl.DefaultCamelContext - Route: route1 started and consuming from: timer://alertstrigtimer?period=60s&repeatCount=1
[main] INFO org.apache.camel.impl.DefaultCamelContext - Skipping starting of route loadfile as its configured with autoStartup=false
[main] INFO org.apache.camel.impl.DefaultCamelContext - Route: loadDataAndAlerts started and consuming from: direct://loadDataAndAlerts
[main] INFO org.apache.camel.impl.DefaultCamelContext - Total 4 routes, of which 2 are started
[main] INFO org.apache.camel.impl.DefaultCamelContext - Apache Camel 2.21.1 (CamelContext: camel-1) started in 0.761 seconds
[Camel (camel-1) thread #1 - timer://alertstrigtimer] INFO route1 - *******************************Job-Alert-System: Started: alertstrigtimer******************************
[Camel (camel-1) thread #2 - timer://alertstrigtimer] INFO loadfile - *******************************Job-Alert-System: Started: direct:loadFile******************************
[Camel (camel-1) thread #1 - file://null] INFO loadfile - *******************************Job-Alert-System: Started: direct:loadFile******************************
The problem could be caused by the line .from(getTriggerFileURI(getWorkFilePath(), getWorkFileName())) in the loadfile route. A route with multiple from endpoints is known as multiple inputs, and this pattern was removed in Camel 3.x.
From the Red Hat documentation:
from("URI1").from("URI2").from("URI3").to("DestinationUri");
..., exchanges from each of the input endpoints,
URI1, URI2, and URI3, are processed independently of each other and in
separate threads. In fact, you can think of the preceding route as
being equivalent to the following three separate routes:
from("URI1").to("DestinationUri");
from("URI2").to("DestinationUri");
from("URI3").to("DestinationUri");
Rather than using multiple from endpoints (each one an extra independent input), try the Content Enricher pattern (pollEnrich for the file component).

Kafka streams shutting down and don't run

Good morning guys,
I'm trying to run a Kafka Streams application, but every time I try, it starts and then immediately shuts down. Below is the output printed on the console:
[main] WARN org.apache.kafka.clients.consumer.ConsumerConfig - The configuration 'admin.retries' was supplied but isn't a known config.
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 2.1.0
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : eec43959745f444f
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] Starting
[main] INFO org.apache.kafka.streams.KafkaStreams - stream-client [application-brute-test-client] Started Streams client
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] State transition from CREATED to RUNNING
[Thread-0] INFO org.apache.kafka.streams.KafkaStreams - stream-client [application-brute-test-client] State transition from RUNNING to PENDING_SHUTDOWN
[kafka-streams-close-thread] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] Informed to shut down
[kafka-streams-close-thread] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] State transition from RUNNING to PENDING_SHUTDOWN
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] Shutting down
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=application-brute-test-client-StreamThread-1-restore-consumer, groupId=] Unsubscribed all topics or patterns and assigned partitions
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
[application-brute-test-client-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [application-brute-test-client-StreamThread-1] Shutdown complete
[kafka-admin-client-thread | application-brute-test-client-admin] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=application-brute-test-client-admin] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
[kafka-streams-close-thread] INFO org.apache.kafka.streams.KafkaStreams - stream-client [application-brute-test-client] State transition from PENDING_SHUTDOWN to NOT_RUNNING
[Thread-0] INFO org.apache.kafka.streams.KafkaStreams - stream-client [application-brute-test-client] Streams client stopped completely
Note the following line:
[application-brute-test-client-StreamThread-1] Informed to shut down
The application was informed to shut down, but I don't know why. Can someone help me with this problem?
Here is my simple code only to test the stream:
Properties properties = new Properties();
properties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "myserver");
properties.put(StreamsConfig.APPLICATION_ID_CONFIG, "application-brute-test");
properties.put(StreamsConfig.CLIENT_ID_CONFIG, "application-brute-test-client");
properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
properties.setProperty(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE); // Enable the exactly-once feature
properties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass()); // Set a default key serde
properties.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass()); // Set a default value serde
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> input = builder.stream("neurotech_propostas", Consumed.with(Serdes.String(), Serdes.String()));
input.print(Printed.toSysOut());
KStream<String, String> output = input.mapValues((value) -> value.toUpperCase());
output.to("brute-test-out");
KafkaStreams stream = new KafkaStreams(builder.build(), properties);
stream.cleanUp();
stream.start();
Runtime.getRuntime().addShutdownHook(new Thread(stream::close));
To solve the problem, I simply stopped using JUnit to run the stream and executed it through a main class instead. Running Kafka Streams via JUnit was causing this trouble.
Maybe in this environment JUnit doesn't hold the thread's execution, so the run ends as soon as the test method returns?
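The log is consistent with that guess: the transition to PENDING_SHUTDOWN is logged on [Thread-0], which looks like the shutdown hook registered in the last line of the code, so the JVM appears to have been exiting. One common way to keep a run alive is to block the calling thread on a CountDownLatch until a stop is requested. Below is a minimal JDK-only sketch of that pattern; Worker is a hypothetical stand-in for KafkaStreams, and all names are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, JDK-only illustration; Worker stands in for KafkaStreams.
final class KeepAliveLatchSketch {

    /** Background worker that counts ticks until it is closed. */
    static final class Worker {
        private final AtomicInteger ticks = new AtomicInteger();
        private volatile boolean running = true;
        private final Thread thread = new Thread(this::loop, "worker");

        private void loop() {
            while (running) {
                ticks.incrementAndGet();
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }

        void start() {
            thread.start();
        }

        /** Asks the loop to stop and waits for the worker thread to end. */
        void close() {
            running = false;
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        int ticks() {
            return ticks.get();
        }
    }

    /**
     * Starts the worker and blocks the caller until stopRequested is counted
     * down or maxWaitMillis elapses, then closes the worker and returns its ticks.
     */
    static int runUntilStopped(CountDownLatch stopRequested, long maxWaitMillis) {
        Worker worker = new Worker();
        worker.start();
        try {
            stopRequested.await(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            worker.close();
        }
        return worker.ticks();
    }

    public static void main(String[] args) {
        CountDownLatch stopRequested = new CountDownLatch(1);
        // Ctrl+C (or any other JVM shutdown) releases the latch.
        Runtime.getRuntime().addShutdownHook(new Thread(stopRequested::countDown));
        int ticks = runUntilStopped(stopRequested, 2_000);
        System.out.println("worker ticked " + ticks + " times");
    }
}
```

In a test, the same idea would mean the test method blocks (with a timeout) instead of returning right after start(), so the worker gets time to process something before the runner finishes.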

Project Reactor: Schedulers#parallel & Schedulers#elastic purpose

I am learning Project Reactor where I am exploring Schedulers factory.
I tried the following code:
    ExecutorService executorService = Executors.newFixedThreadPool(10);
    Flux.range(1,4)
        .map(i -> {
            logger.info(i +" [MAP] " + Thread.currentThread().getName());
            return 10 / i;
        })
        .publishOn(Schedulers.fromExecutorService(executorService)) // .publishOn(Schedulers.parallel())
        .subscribe(
            n -> {
                logger.info("START "+((Long)(System.currentTimeMillis() % 10000000L)).toString());
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                logger.info(n.toString());
                logger.info("END "+((Long)(System.currentTimeMillis() % 10000000L)).toString());
            }
        );
    executorService.shutdown();
I also tried this code with Schedulers.parallel() and Schedulers.elastic(), and with the subscribeOn() operator, and saw similar results.
The logs are:
02:07:30.142 [main] INFO - 1 [MAP] main
02:07:30.143 [main] INFO - 2 [MAP] main
02:07:30.143 [main] INFO - 3 [MAP] main
02:07:30.143 [main] INFO - 4 [MAP] main
02:07:30.143 [pool-1-thread-2] INFO - START 1050143
02:07:30.247 [pool-1-thread-2] INFO - 10
02:07:30.247 [pool-1-thread-2] INFO - END 1050247
02:07:30.247 [pool-1-thread-2] INFO - START 1050247
02:07:30.350 [pool-1-thread-2] INFO - 5
02:07:30.350 [pool-1-thread-2] INFO - END 1050350
02:07:30.350 [pool-1-thread-2] INFO - START 1050350
02:07:30.455 [pool-1-thread-2] INFO - 3
02:07:30.455 [pool-1-thread-2] INFO - END 1050455
02:07:30.455 [pool-1-thread-2] INFO - START 1050455
02:07:30.557 [pool-1-thread-2] INFO - 2
02:07:30.558 [pool-1-thread-2] INFO - END 1050558
Since the Flux's elements are ordered and processed in sequence (as is apparent from the logs above, where every element is handled on pool-1-thread-2), having multiple threads available to an operator (or operator chain) does not seem to make sense. I am sure I am either misinterpreting the Schedulers or missing something in my basic understanding. Can someone point me in the right direction?
I understand that the purpose of Schedulers is to make the processing asynchronous and free up the main thread. But why would anyone want to give multiple threads to the operator(s) when they work on one element at a time?
Does it make sense only when dealing with the flatMap operator?

Quartz Job executed multiple times simultaneously by each cluster machine, rather than one time by one machine for the entire cluster

Goal:
* Have Job1 run once for a three-node cluster every 10 minutes, and Job2 run once for the same cluster every 5 minutes. Each job generates an email; so at 10:55am I should receive only one Job2 email from the cluster, and at 11:00am I should receive one Job1 email and one Job2 email from the cluster, at 11:05am I should receive only one Job2 email from the cluster, and so on...
Problem:
* Job1 is being run multiple times every 10 minutes on each node in the cluster, and the same for Job2 (except every 5 minutes). This leads to many, many more than one or two emails.
Configuration:
* Three-node linux cluster
* Each machine NTP configured and time-sync'd
* Oracle DB
* Quartz v2.2.0 (cluster mode)
* Jobs configured via CronTrigger
* Each node has an instance of the same standalone Java application running on it, and the Java application instantiates an instance of the quartz scheduler in cluster-mode.
* quartz.properties files are identical on each machine.
I have investigated all the obvious potential causes, but nothing explains it or presents a fix. I have even tried inserting an artificial 10-second sleep instruction in the job, to ensure that it doesn't finish in under a second. Please find relevant artifacts below (quartz.properties and log output). Any help would be greatly appreciated!
Artifact #1:
============================================================================
============================================================================
Q U A R T Z --- P R O P E R T I E S
==================
#============================================================================
# Configure Main Scheduler Properties
#============================================================================
org.quartz.scheduler.instanceName: MyQrtzScheduler
org.quartz.scheduler.instanceId: AUTO
org.quartz.scheduler.skipUpdateCheck: true
#============================================================================
# Configure ThreadPool
#============================================================================
org.quartz.threadPool.class: org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount: 1
org.quartz.threadPool.threadPriority: 5
#============================================================================
# Configure JobStore
#============================================================================
org.quartz.jobStore.misfireThreshold: 2592000000
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
org.quartz.jobStore.useProperties=false
org.quartz.jobStore.dataSource=myDS
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.clusterCheckinInterval=60000
#============================================================================
# Other Example Delegates
#============================================================================
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DB2v6Delegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DB2v7Delegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DriverDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.HSQLDBDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.MSSQLDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PointbaseDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.WebLogicDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.WebLogicOracleDelegate
#============================================================================
# Configure Datasources
#============================================================================
org.quartz.dataSource.myDS.driver: oracle.jdbc.driver.OracleDriver
org.quartz.dataSource.myDS.URL: jdbc:oracle:thin:@myServer:myPort:blah
org.quartz.dataSource.myDS.user: myDBUser
org.quartz.dataSource.myDS.password: myDBPassword
org.quartz.dataSource.myDS.maxConnections: 2
org.quartz.dataSource.myDS.validationQuery: select 0
#============================================================================
# Configure Plugins
#============================================================================
org.quartz.plugin.shutdownHook.class: org.quartz.plugins.management.ShutdownHookPlugin
org.quartz.plugin.shutdownHook.cleanShutdown: true
org.quartz.plugin.triggerHistory.class=org.quartz.plugins.history.LoggingTriggerHistoryPlugin
org.quartz.plugin.jobHistory.class=org.quartz.plugins.history.LoggingJobHistoryPlugin
Artifact #2:
============================================================================
============================================================================
L O G --- O U T P U T
==================
2015-01-29 12:56:16,602 [main] INFO com.mycompany.myapp.jobs.QuartzHelper - Initializing Quartz scheduler...
2015-01-29 12:56:16,829 [main] INFO org.quartz.impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
2015-01-29 12:56:16,855 [main] INFO org.quartz.core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
2015-01-29 12:56:16,855 [main] INFO org.quartz.core.QuartzScheduler - Quartz Scheduler v.2.2.0 created.
2015-01-29 12:56:16,857 [main] INFO org.quartz.plugins.management.ShutdownHookPlugin - Registering Quartz shutdown hook.
2015-01-29 12:56:16,859 [main] INFO org.quartz.impl.jdbcjobstore.JobStoreTX - Using db table-based data access locking (synchronization).
2015-01-29 12:56:16,864 [main] INFO org.quartz.impl.jdbcjobstore.JobStoreTX - JobStoreTX initialized.
2015-01-29 12:56:16,865 [main] INFO org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.2.0) 'MyQrtzScheduler' with instanceId 'node1_1422554176832'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 1 threads.
Using job-store 'org.quartz.impl.jdbcjobstore.JobStoreTX' - which supports persistence. and is clustered.
2015-01-29 12:56:16,865 [main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'MyQrtzScheduler' initialized from specified file: '/my/install/directory/quartz.properties'
2015-01-29 12:56:16,866 [main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 2.2.0
2015-01-29 12:56:16,866 [main] INFO com.mycompany.myapp.jobs.QuartzHelper - Quartz scheduler initialized successfully.
2015-01-29 12:59:53,450 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 1 triggers
2015-01-29 13:00:00,007 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,008 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,809 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,836 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,839 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.simpl.PropertySettingJobFactory - Producing instance of Job 'node2_1422546730757.Job1', class=com.mycompany.myapp.job.Job1
2015-01-29 13:00:00,851 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node2_1422546730757.Job1Trigger fired job node2_1422546730757.Job1 at: 13:00:00 01/29/2015
2015-01-29 13:00:00,852 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node2_1422546730757.Job1 fired (by trigger node2_1422546730757.Job1Trigger) at: 13:00:00 01/29/2015
2015-01-29 13:00:00,852 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.core.JobRunShell - Calling execute on job node2_1422546730757.Job1
2015-01-29 13:00:00,853 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - ***Executing Inbound File SLA Job...
2015-01-29 13:00:02,054 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - ***Inbound File SLA Job: No SLA breaches found...
2015-01-29 13:00:02,150 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - Job1 completed successfully in [1297ms]; sleeping [63703ms] to meet the required minimum runtime for quartz-jobs
2015-01-29 13:00:24,881 [QuartzScheduler_MyQrtzScheduler-node1_1422554176832_ClusterManager] DEBUG org.quartz.impl.jdbcjobstore.JobStoreTX - ClusterManager: Check-in complete.
2015-01-29 13:01:05,862 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - Job1 sleep-delay completed.
2015-01-29 13:01:05,864 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node2_1422546730757.Job1 execution complete at 13:01:05 01/29/2015 and reports: SUCCESS
2015-01-29 13:01:05,865 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node2_1422546730757.Job1Trigger completed firing job node2_1422546730757.Job1 at 13:01:05 01/29/2015 with resulting trigger instruction code: DO NOTHING
2015-01-29 13:01:05,868 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,869 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,872 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,880 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,915 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 1 triggers
2015-01-29 13:01:05,917 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,918 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,921 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,954 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,955 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.simpl.PropertySettingJobFactory - Producing instance of Job 'node1_1422543657050.Job2', class=com.mycompany.myapp.jobs.Job2
2015-01-29 13:01:05,961 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node1_1422543657050.Job2Trigger fired job node1_1422543657050.Job2 at: 13:01:05 01/29/2015
2015-01-29 13:01:05,962 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node1_1422543657050.Job2 fired (by trigger node1_1422543657050.Job2Trigger) at: 13:01:05 01/29/2015
2015-01-29 13:01:05,963 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.core.JobRunShell - Calling execute on job node1_1422543657050.Job2
2015-01-29 13:01:05,963 [MyQrtzScheduler_Worker-1] WARN com.mycompany.myapp.jobs.Job2 - No outbound files found; Outbound File SLA Job cannot check for SLA breaches.
2015-01-29 13:01:05,965 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node1_1422543657050.Job2 execution complete at 13:01:05 01/29/2015 and reports: null
2015-01-29 13:01:05,966 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node1_1422543657050.Job2Trigger completed firing job node1_1422543657050.Job2 at 13:01:05 01/29/2015 with resulting trigger instruction code: DO NOTHING
The following answer was given by the OP.
The problem was that I was defining Quartz jobs with identities whose group id was unique to each host (the scheduler id) instead of a group id common to all hosts in the cluster. Because the scheduler id is unique to the host, each host looked up the job by its fully qualified name, groupId.jobName, did not find it, and therefore created a new instance of Job1 and Job2 during startup. Quartz jobs and triggers are never expired or cleared without an explicit request in Java or a manual SQL statement in Oracle. So the instances built up over time, and instead of running a single instance of Job1 and Job2, Quartz ran every instance of each job that had ever been created (hence the multiple executions and multiple email alerts).
The solution was to replace the scheduler id with a static string such as "MyQuartzJobs" when defining a job's identity.
Basically, I changed the following line of Java code:
JobDetail job =
    newJob(Job1.class).withIdentity(JOB1_JOB_NAME, uniqueSchedulerId)
        .withDescription(JOB1_DESC + " created [" + new Date() + "]")
        .storeDurably(false)
        .requestRecovery(false)
        .build();
to something like the following:
JobDetail job =
    newJob(Job1.class).withIdentity(JOB1_JOB_NAME, "MyQuartzJobs")
        .withDescription(JOB1_DESC + " created [" + new Date() + "]")
        .storeDurably(false)
        .requestRecovery(false)
        .build();
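To illustrate why the group id matters, here is a minimal plain-Java model of the lookup described above. It does not use the Quartz API: JobStoreModel and registerIfAbsent are hypothetical names, and a map keyed by "group.name" stands in for the shared job store. The scheduler ids are taken from the log above purely as sample values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for a shared (clustered) job store. This is NOT the Quartz API.
class JobStoreModel {
    // Key is the fully qualified job name "group.name", as described in the answer.
    private final Map<String, String> jobs = new LinkedHashMap<>();

    // Stores the job only if no job with the same group and name exists yet.
    // Returns true when a new entry was created.
    boolean registerIfAbsent(String group, String name, String description) {
        return jobs.putIfAbsent(group + "." + name, description) == null;
    }

    int size() {
        return jobs.size();
    }

    public static void main(String[] args) {
        // Each host start produces a new, unique scheduler id (sample values).
        String[] schedulerIds = {
            "node1_1422543657050", "node2_1422546730757", "node1_1422554176832"
        };

        JobStoreModel perHostGroup = new JobStoreModel();
        JobStoreModel sharedGroup = new JobStoreModel();
        for (String id : schedulerIds) {
            // Unique group: the lookup never matches, so every start adds a job.
            perHostGroup.registerIfAbsent(id, "Job1", "created by " + id);
            // Static group: only the first start adds the job.
            sharedGroup.registerIfAbsent("MyQuartzJobs", "Job1", "created by " + id);
        }
        System.out.println("per-host group: " + perHostGroup.size() + " stored jobs");
        System.out.println("shared group: " + sharedGroup.size() + " stored job(s)");
    }
}
```

With a per-host group, every start adds another copy of Job1 to the store. With the static group, only the first registration succeeds.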

ERROR backtype.storm.util - Async loop died

I am an absolute beginner with Storm. This is the word counter example from the book Getting Started with Storm. I am running Storm in local mode, and when I run the example with Maven I get this error.
Maven command used:
[knk@kinock Storm-Starter]$ mvn exec:java -Dexec.mainClass="TopologyMain" -Dexec.args="src/main/resources/words.txt"
This is the error:
java.lang.NullPointerException: null
at spouts.WordReader.open(WordReader.java:62) ~[classes/:na]
at backtype.storm.daemon.executor$fn__3430$fn__3445.invoke(executor.clj:504) ~[storm-core-0.9.0.1.jar:na]
at backtype.storm.util$async_loop$fn__444.invoke(util.clj:401) ~[storm-core-0.9.0.1.jar:na]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
5608 [Thread-22-word-reader] ERROR backtype.storm.daemon.executor -
java.lang.NullPointerException: null
at spouts.WordReader.open(WordReader.java:62) ~[classes/:na]
at backtype.storm.daemon.executor$fn__3430$fn__3445.invoke(executor.clj:504) ~[storm-core-0.9.0.1.jar:na]
at backtype.storm.util$async_loop$fn__444.invoke(util.clj:401) ~[storm-core-0.9.0.1.jar:na]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
5610 [Thread-6] INFO backtype.storm.daemon.executor - Loading executor __system:[-1 -1]
5618 [Thread-6] INFO backtype.storm.daemon.task - Emitting: __system __system ["startup"]
5619 [Thread-6] INFO backtype.storm.daemon.executor - Loaded executor tasks __system: [-1 -1]
5624 [Thread-6] INFO backtype.storm.daemon.executor - Finished loading executor __system:[-1 -1]
5629 [Thread-24-__system] INFO backtype.storm.daemon.executor - Preparing bolt __system:(-1)
5638 [Thread-24-__system] INFO backtype.storm.daemon.executor - Prepared bolt __system:(-1)
5668 [Thread-6] INFO backtype.storm.daemon.executor - Loading executor __acker:[1 1]
5670 [Thread-6] INFO backtype.storm.daemon.task - Emitting: __acker __system ["startup"]
5671 [Thread-6] INFO backtype.storm.daemon.executor - Loaded executor tasks __acker:[1 1]
5672 [Thread-22-word-reader] INFO backtype.storm.util - Halting process: ("Worker died")
5678 [Thread-6] INFO backtype.storm.daemon.executor - Timeouts disabled for executor __acker:[1 1]
5679 [Thread-6] INFO backtype.storm.daemon.executor - Finished loading executor __acker: [1 1]
This is the block of code that raises the exception; line 62 is marked with a comment.
public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
    try {
        //this.collector = collector;
        this.context = context;
        this.fileReader = new FileReader(conf.get("wordsFile").toString());//exception is raised here. 62nd line
    } catch (FileNotFoundException e){
        throw new RuntimeException("Error reading file ["+conf.get("wordFile")+"]");
    }
    this.collector = collector;
}
Thanks in advance.
[knk@kinock Storm-Starter]$ mvn exec:java -Dexec.mainClass="TopologyMain" -Dexec.args="src/main/resources/words.txt"
In the above command, the complete path to the file should be given; otherwise Java cannot find the file.
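A relative path like the one in the command is resolved against the JVM's working directory, so one way to see which file Java actually looks for is to print the resolved absolute path before opening it. A minimal sketch using only the standard library; WordsFileCheck and resolve are hypothetical names, not part of the book's example:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WordsFileCheck {
    // Turns a possibly relative argument into a normalized absolute path,
    // resolved against the JVM's current working directory.
    static Path resolve(String arg) {
        return Paths.get(arg).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // Falls back to the path from the question when no argument is given.
        String arg = args.length > 0 ? args[0] : "src/main/resources/words.txt";
        Path resolved = resolve(arg);
        System.out.println("Resolved path: " + resolved);
        System.out.println("Readable: " + Files.isReadable(resolved));
    }
}
```

If the printed path is not where words.txt lives, either run Maven from the project root or pass the absolute path in -Dexec.args.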
The most probable reason is that the spout cannot read the wordsFile entry from the Config object. For example, if the spout's open method calls FileReader(conf.get("wordsFile").toString()) but the topology sets the path with conf.put("SomeOtherWordsFile", ..), then the lookup for the key wordsFile finds nothing, because that key is not actually present. conf.get then returns null, and calling toString() on it throws the NullPointerException.
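A missing key is easier to diagnose if the lookup fails with a descriptive message instead of a bare NullPointerException. The following is a minimal sketch in plain Java, without any Storm classes; ConfLookup and requireString are hypothetical helper names, and the map stands in for the conf argument passed to open:

```java
import java.util.HashMap;
import java.util.Map;

class ConfLookup {
    // Hypothetical helper: returns the entry for the key as a String,
    // or fails with a message that names the key and lists the keys present.
    static String requireString(Map<?, ?> conf, String key) {
        Object value = conf.get(key);
        if (value == null) {
            throw new IllegalStateException(
                "Missing config entry '" + key + "'; available keys: " + conf.keySet());
        }
        return value.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("wordsFile", "src/main/resources/words.txt");

        // The key matches, so the stored path is returned.
        System.out.println(requireString(conf, "wordsFile"));

        try {
            // Misspelled key, as in the catch block of the question's code.
            requireString(conf, "wordFile");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Inside open, the same check could replace the direct conf.get("wordsFile").toString() call, so that a key mismatch between the topology and the spout shows up in the error message.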
