Camel process receiving hang up signal and shuts down gracefully - java

I have a Camel process which pulls files from an Azure container and processes them in an Azure environment.
I expect the process to run continuously, but it shuts down after a random interval.
Logs:
CamelHangupInterceptor: INFO (MainSupport.java:87) - Received hang up - stopping the main instance.
CamelHangupInterceptor: DEBUG (Main.java:187) - Stopping Spring ApplicationContext: org.springframework.context.support.ClassPathXmlApplicationContext#47ef968d
2019-01-29 21:39:50,782: main: INFO (MainSupport.java:502) - MainSupport exiting code: 0
...
It closes all the routes after the inflight exchanges are completed, since it is using the DefaultShutdownStrategy.
Spring context route:
- starts with a scheduler for an initial delay,
- then a <delay> component that randomly generates a wait time (logic used for scalability, to avoid a race condition),
- then invokes a custom class implementing the Processor interface, which holds the Azure container URL with credentials and fetches the file from the container,
- then uses the wireTap component to download the file,
- finally invokes another class implementing the Processor interface.
The Camel (v2.20) process starts and executes as expected, but after a random interval the process shuts down.
I see the hang up signal being received, but I am not sure how it happens. The logs show a graceful shutdown. Is there a way to send a hang up signal to the Camel process from an external process?
In one of the routes I am stopping the exchange to stop the route forcibly:
exchange.setProperty(Exchange.ROUTE_STOP, Boolean.TRUE);
More logs:
2019-01-29 21:40:17,838: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultExecutorServiceManager.java:363) - Forcing shutdown of ExecutorService: org.apache.camel.util.concurrent.RejectableThreadPoolExecutor#647b9364[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0][WireTap]
...
2019-01-29 21:40:17,840: Camel Thread #0 - CamelHangupInterceptor: DEBUG (RouteService.java:289) - Shutting down services on route: route1
2019-01-29 21:40:17,841: Camel Thread #0 - CamelHangupInterceptor: DEBUG (BeanComponent.java:72) - Clearing BeanInfo cache[size=1, hits=1, misses=1, evicted=0]
2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (SimpleLanguage.java:136) - Clearing simple language predicate cache[size=0, hits=0, misses=0, evicted=0]
2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (SimpleLanguage.java:142) - Clearing simple language expression cache[size=4, hits=1, misses=4, evicted=0]
2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=routecontroller,name="camelAzureBlobContext"
2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=health,name="camelAzureBlobContext"
2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (TimerListenerManager.java:128) - Removed TimerListener: org.apache.camel.management.mbean.ManagedCamelContext#473692b
2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=context,name="camelAzureBlobContext"
2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: INFO (MainLifecycleStrategy.java:44) - CamelContext: camelAzureBlobContext has been shutdown, triggering shutdown of the JVM.
2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultExecutorServiceManager.java:363) - Forcing shutdown of ExecutorService: org.apache.camel.util.concurrent.RejectableThreadPoolExecutor#72b68833[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1][ShutdownTask]
2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultInflightRepository.java:183) - Shutting down with no inflight exchanges.
2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultServicePool.java:110) - Stopping service pool: org.apache.camel.impl.SharedPollingConsumerServicePool#13ed066e
2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultServicePool.java:110) - Stopping service pool: org.apache.camel.impl.SharedProducerServicePool#151ab2b9
...
2019-01-29 21:40:17,856: Camel Thread #0 - CamelHangupInterceptor: DEBUG (MBeanInfoAssembler.java:79) - Clearing cache[size=30, hits=12, misses=30, evicted=0]
2019-01-29 21:40:17,864: Camel Thread #0 - CamelHangupInterceptor: DEBUG (IntrospectionSupport.java:134) - Clearing cache[size=93, hits=192, misses=93, evicted=0]
2019-01-29 21:40:17,869: Camel Thread #0 - CamelHangupInterceptor: INFO (DefaultCamelContext.java:3575) - Apache Camel 2.20.0 (CamelContext: camelAzureBlobContext) uptime 32 minutes
-- ROUTE INFORMATION --
<camel:camelContext id="camelAzureBlobContext" xmlns="http://camel.apache.org/schema/spring" typeConverterStatisticsEnabled="true" autoStartup="true">
<endpoint id="listBlobendpoint"
uri="azure-blob://storageaccount/containerName?credentials=containercredientiasl&operation=listBlobs" /> <!-- changed the actual values -->
<dataFormats>
<json id="inputMsg" library="Jackson" unmarshalTypeName="pacakage.requiredinputpojojacksonclass" /> <!-- renamed the class name-->
</dataFormats>
<onException>
<exception>com.fasterxml.jackson.core.JsonParseException</exception>
<exception>com.fasterxml.jackson.databind.JsonMappingException</exception>
<handled>
<constant>true</constant>
</handled>
<process ref="parseExceptionResponse" />
</onException>
<route>
<from uri="scheduler://tempScheduler?initialDelay=5000&delay=50000" /> <!-- changed the actual values -->
<setHeader headerName="BlobListingDetails">
<simple resultType="com.microsoft.azure.storage.blob.BlobListingDetails">METADATA</simple>
</setHeader>
<delay>
<method ref="blobcamelprocess" method="randomDelayToPoll"></method> <!-- Method which has some random number generation-->
</delay>
<to uri="ref:listBlobendpoint" /> <!-- bean which actually sets the metadata value-->
<process ref="blobcamelprocess" />
<!-- creating recipient list to update metadata of the container blob file -->
<recipientList>
<header>update_metadata</header>
</recipientList>
<log message="Message | $simple{in.header[filename]}" loggingLevel="INFO"></log>
<wireTap uri="file://location?fileName=$simple{in.header[filename]}"/>
<unmarshal ref="inputMsg" />
<process ref="messageconversionprocess" /> <!-- bean which actually converts uses the parsed json to construct java object-->
<process ref="deleteblobProcess" /> <!-- bean that will be used to delete the file from the blob store -->
<recipientList>
<header>delete_blob</header> <!-- endpoint details is set from the above bean and passed here -->
</recipientList>
</route>
</camel:camelContext>

As per the logs, your whole Camel context is shutting down. It's quite unlikely that the context shutdown is initiated by your ROUTE_STOP call.
Looking at the MainSupport.java source, the INFO log that you see in your log files is executed as part of the hang up notification hook, which is notified while a JVM shutdown is in progress. When this happens, does your JVM die as well? This documentation says the hooks are notified when an abnormal JVM termination is initiated by SIGINT, SIGTERM or SIGHUP.
Perhaps we can look into further details if you could provide answers to the following:
Is Camel running in a web container or something else? I see MainSupport.java :)
What JVM version do you use?
What OS platform is it running on?
When Camel context shuts down, is the JVM still running or JVM exits too?
Do you see additional exceptions in the log files?
Can we correlate this CamelContext shutdown to a system resource exhaustion event (such as running short of memory)? It is quite unlikely to be the OOM killer, since this is a graceful shutdown.
On a side note, don't be misled by the name CamelHangupInterceptor; it doesn't mean Camel received a SIGHUP.
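If you want to narrow down where the shutdown request comes from, one option is to register your own JVM shutdown hook early in main() (before starting the Camel main/Spring context) and dump all thread stack traces when it fires. This is only a diagnostic sketch using plain JDK APIs, not anything Camel-specific:

import java.util.Map;

public final class ShutdownDiagnostics {

    // Call once at startup, before starting the Camel main/Spring context.
    public static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.err.println("JVM shutdown hook fired; dumping all threads:");
            for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
                System.err.println("Thread: " + entry.getKey().getName());
                for (StackTraceElement frame : entry.getValue()) {
                    System.err.println("    at " + frame);
                }
            }
        }, "shutdown-diagnostics"));
    }
}

If the dump shows a thread blocked in Runtime.exit()/System.exit(), the shutdown was triggered from inside the JVM; otherwise the signal most likely came from outside (a process manager, the container platform, or an operator sending SIGTERM/SIGHUP), which you would need to trace at the OS or Azure platform level.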

Related

Apache Camel Route is getting started automatically

The route loadfile is getting started automatically when I start the main class.
On exception, the process should finish, but it starts loadfile again and again.
It should start from the timer and then call the loadfile route, but loadfile is starting independently as well as from the timer.
CamelContext context = new DefaultCamelContext(sr);
try {
context.addRoutes(new RouteBuilder() {
@Override
public void configure() throws Exception {
onException(Exception.class)
.log(LoggingLevel.INFO, "Extype:${exception.message}")
.stop();
from("timer://alertstrigtimer?period=60s&repeatCount=1")
.startupOrder(1)
.log(LoggingLevel.INFO, "*******************************Job-Alert-System: Started: alertstrigtimer******************************")
.to("direct:loadFile").stop();
from("direct:loadFile").routeId("loadfile")
.log(LoggingLevel.INFO, "*******************************Job-Alert-System: Started: direct:loadFile******************************")
.from(getTriggerFileURI(getWorkFilePath(), getWorkFileName())).choice()
.
.
});
context.start();
Thread.sleep(40000);
Following is the log:
[main] INFO org.apache.camel.impl.DefaultCamelContext - Apache Camel 2.21.1 (CamelContext: camel-1) is starting
[main] INFO org.apache.camel.management.ManagedManagementStrategy - JMX is enabled
[main] INFO org.apache.camel.impl.converter.DefaultTypeConverter - Type converters loaded (core: 194, classpath: 14)
[main] INFO org.apache.camel.impl.DefaultCamelContext - StreamCaching is not in use. If using streams then its recommended to enable stream caching. See more details at http://camel.apache.org/stream-caching.html
[main] INFO org.apache.camel.impl.DefaultCamelContext - Route: route1 started and consuming from: timer://alertstrigtimer?period=60s&repeatCount=1
[main] INFO org.apache.camel.impl.DefaultCamelContext - Skipping starting of route loadfile as its configured with autoStartup=false
[main] INFO org.apache.camel.impl.DefaultCamelContext - Route: loadDataAndAlerts started and consuming from: direct://loadDataAndAlerts
[main] INFO org.apache.camel.impl.DefaultCamelContext - Total 4 routes, of which 2 are started
[main] INFO org.apache.camel.impl.DefaultCamelContext - Apache Camel 2.21.1 (CamelContext: camel-1) started in 0.761 seconds
[Camel (camel-1) thread #1 - timer://alertstrigtimer] INFO route1 - *******************************Job-Alert-System: Started: alertstrigtimer******************************
[Camel (camel-1) thread #2 - timer://alertstrigtimer] INFO loadfile - *******************************Job-Alert-System: Started: direct:loadFile******************************
[Camel (camel-1) thread #1 - file://null] INFO loadfile - *******************************Job-Alert-System: Started: direct:loadFile******************************
The problem could be caused by the line .from(getTriggerFileURI(getWorkFilePath(), getWorkFileName())) in the loadfile route. A route with multiple from endpoints is known as Multiple Inputs, and this pattern was removed in Camel 3.x.
From RedHat,
from("URI1").from("URI2").from("URI3").to("DestinationUri");
..., exchanges from each of the input endpoints,
URI1, URI2, and URI3, are processed independently of each other and in
separate threads. In fact, you can think of the preceding route as
being equivalent to the following three separate routes:
from("URI1").to("DestinationUri");
from("URI2").to("DestinationUri");
from("URI3").to("DestinationUri");
Rather than using multiple from endpoints (an extra, independent input), try the content enricher pattern (pollEnrich for the file component), as sketched below.
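A minimal sketch of that approach, assuming a hypothetical trigger-file URI (substitute whatever your getTriggerFileURI(...) helper returns); the file is polled from inside the timer-triggered route instead of acting as a second consumer:

import org.apache.camel.LoggingLevel;
import org.apache.camel.builder.RouteBuilder;

public class LoadFileRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("timer://alertstrigtimer?period=60s&repeatCount=1")
            .routeId("loadfile")
            .log(LoggingLevel.INFO, "Job-Alert-System: triggered by alertstrigtimer")
            // pollEnrich consumes the file on demand; the 10s timeout avoids blocking forever
            .pollEnrich("file://work/dir?fileName=trigger.txt", 10000)
            .log(LoggingLevel.INFO, "Trigger file loaded: ${header.CamelFileName}")
            .to("direct:loadDataAndAlerts");
    }
}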

Spring 5 webclient with jetty connector continues to run after main completes execution

I am trying to test a REST API using the Spring 5 WebClient with the Jetty connector. I am getting data from the API call, but the program continues to run even after main completes execution. How do I resolve this? What configuration is needed so that the Jetty connector stops after main completes its execution?
My connector initialization code:
SslContextFactory.Client sslContextFactory = new SslContextFactory.Client();
HttpClient httpClient = new HttpClient();
httpClient.setIdleTimeout(DefaultIdleTimeout);
ClientHttpConnector clientConnector = new JettyClientHttpConnector(httpClient, jettyResourceFactory);
webClient = WebClient.builder().baseUrl(getBaseUrl()).clientConnector(clientConnector).build();
Using the WebClient in the main class:
webClient.get().uri("/getUri").exchange().flatMap(response -> response.bodyToMono(String.class)).subscribe(di -> {
System.out.println(di);
}, error -> {
System.out.println(error.getStackTrace());
}, () -> {
System.out.println("Execution complete");
});
I am getting the below in the log:
13:39:11.225 [HttpClient#1165b38-scheduler-1] DEBUG org.eclipse.jetty.io.AbstractConnection - HttpConnectionOverHTTP#4eb423a5::DecryptedEndPoint#4345fb54{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=15016/15000} onFillInterestedFailed {}
13:39:11.225 [HttpClient#1165b38-scheduler-1] DEBUG org.eclipse.jetty.io.ManagedSelector - Wakeup ManagedSelector#75ed9710{STARTED} id=1 keys=0 selected=0 updates=0
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - Selector sun.nio.ch.WindowsSelectorImpl#3d003adb woken with none selected
13:39:11.226 [HttpClient#1165b38-scheduler-1] DEBUG org.eclipse.jetty.util.thread.QueuedThreadPool - queue org.eclipse.jetty.io.ManagedSelector$DestroyEndPoint#74c9b13c startThread=0
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - Selector sun.nio.ch.WindowsSelectorImpl#3d003adb woken up from select, 0/0/0 selected
13:39:11.226 [HttpClient#1165b38-scheduler-1] DEBUG org.eclipse.jetty.io.FillInterest - onClose FillInterest#52ac96eb{null}
13:39:11.226 [HttpClient#1165b38-21] DEBUG org.eclipse.jetty.util.thread.QueuedThreadPool - run org.eclipse.jetty.io.ManagedSelector$DestroyEndPoint#74c9b13c in QueuedThreadPool[HttpClient#1165b38]#45efc20d{STARTED,8<=8<=200,i=1,r=8,q=0}[ReservedThreadExecutor#4bef0fe3{s=2/8,p=0}]
13:39:11.226 [HttpClient#1165b38-scheduler-1] DEBUG org.eclipse.jetty.client.http.HttpConnectionOverHTTP - Closed HttpConnectionOverHTTP#4eb423a5::DecryptedEndPoint#4345fb54{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=15017/15000}
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - Selector sun.nio.ch.WindowsSelectorImpl#3d003adb processing 0 keys, 0 updates
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - updateable 0
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - updates 0
13:39:11.226 [HttpClient#1165b38-24] DEBUG org.eclipse.jetty.io.ManagedSelector - Selector sun.nio.ch.WindowsSelectorImpl#3d003adb waiting with 0 keys
13:39:11.226 [HttpClient#1165b38-21] DEBUG org.eclipse.jetty.io.ManagedSelector - Destroyed SocketChannelEndPoint#7a9db40c{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=0/15000}{io=0/0,kio=-1,kro=-1}->SslConnection#53046985{NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint#4345fb54{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=15017/15000}=>HttpConnectionOverHTTP#4eb423a5(l:/<sourceIp>:65106 <-> r:<hostname>/<targetIp>:443,closed=true)=>HttpChannelOverHTTP#7dafb76e(exchange=null)[send=HttpSenderOverHTTP#4491419(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator#22805291{s=START}],recv=HttpReceiverOverHTTP#52385f05(rsp=IDLE,failure=null)[HttpParser{s=START,0 of -1}]]
13:39:11.227 [HttpClient#1165b38-21] DEBUG org.eclipse.jetty.io.AbstractConnection - onClose HttpConnectionOverHTTP#4eb423a5::DecryptedEndPoint#4345fb54{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=15018/15000}
13:39:11.227 [HttpClient#1165b38-21] DEBUG org.eclipse.jetty.io.AbstractConnection - onClose SslConnection#53046985::SocketChannelEndPoint#7a9db40c{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=0/15000}{io=0/0,kio=-1,kro=-1}->SslConnection#53046985{NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint#4345fb54{<hostname>/<targetIp>:443<->/<sourceIp>:65106,CLOSED,fill=-,flush=-,to=15018/15000}=>HttpConnectionOverHTTP#4eb423a5(l:/<sourceIp>:65106 <-> r:<hostname>/<targetIp>:443,closed=true)=>HttpChannelOverHTTP#7dafb76e(exchange=null)[send=HttpSenderOverHTTP#4491419(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator#22805291{s=START}],recv=HttpReceiverOverHTTP#52385f05(rsp=IDLE,failure=null)[HttpParser{s=START,0 of -1}]]
13:39:11.227 [HttpClient#1165b38-21] DEBUG org.eclipse.jetty.util.thread.QueuedThreadPool - ran org.eclipse.jetty.io.ManagedSelector$DestroyEndPoint#74c9b13c in QueuedThreadPool[HttpClient#1165b38]#45efc20d{STARTED,8<=8<=200,i=1,r=8,q=0}[ReservedThreadExecutor#4bef0fe3{s=2/8,p=0}]
and I continue to get log output in the console, and the program continues to run.
Taken from the official docs
https://docs.spring.io/spring-framework/docs/current/spring-framework-reference/web-reactive.html#webflux-client-builder-jetty
You can share resources between multiple instances of the Jetty client (and server) and ensure that the resources are shut down when the Spring ApplicationContext is closed by declaring a Spring-managed bean of type JettyResourceFactory
Your JettyClientHttpConnector should be instantiated with a JettyResourceFactory bean whose lifecycle ends when the Spring ApplicationContext (or your test) is closed; a minimal configuration is sketched below.
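This is a sketch based on the linked documentation (bean and class names here are illustrative); with the connector built this way, closing the ApplicationContext stops the shared Jetty resources:

import org.eclipse.jetty.client.HttpClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ClientHttpConnector;
import org.springframework.http.client.reactive.JettyClientHttpConnector;
import org.springframework.http.client.reactive.JettyResourceFactory;
import org.springframework.web.reactive.function.client.WebClient;

@Configuration
public class WebClientConfig {

    // Spring manages the shared Jetty executor/scheduler/buffer pool and shuts
    // them down when the ApplicationContext closes.
    @Bean
    public JettyResourceFactory jettyResourceFactory() {
        return new JettyResourceFactory();
    }

    @Bean
    public WebClient webClient(JettyResourceFactory resourceFactory) {
        HttpClient httpClient = new HttpClient();
        // further HttpClient customizations (timeouts, TLS) go here
        ClientHttpConnector connector = new JettyClientHttpConnector(httpClient, resourceFactory);
        return WebClient.builder().clientConnector(connector).build();
    }
}

If you are not running inside a Spring context at all (a plain main method), the alternative is to keep a reference to the Jetty HttpClient and call httpClient.stop() once your work is done, so its non-daemon threads can exit.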

Camel route publisher not publishing to JMS outbound queue fails with java.util.concurrent.RejectedExecutionException

So I have a route consuming a JMS message from one queue, and I want to send it to another queue on the same broker. The Blueprint XML is pretty simple:
<camel:route id="from-jms-consumer-to-jms" trace="false">
<from uri="jms:queue:test" />
<log message="Message consumed :: ${body}" />
<to uri="jms:queue:other_test" />
</camel:route>
But it doesn't work! It consumes and logs the message (so there is no issue with the connection factory), but when it tries to publish, it fails with java.util.concurrent.RejectedExecutionException.
Why? I tried everything:
separate connection factories
pooling/no-pooling
transacted/non-transacted
But even a simple route like this fails on the publisher.
This is Camel 3.0.0-M4 and I'm running on Karaf 4.2.1 on JDK 9. There are no other issues. All other components work and have been individually tested.
I even looked at the code for JmsProducer in the Camel source. There's an isRunAllowed() check that returns false for the producer. I don't know what I could be missing: a thread pool? A JMS header on the message? I'm stumped!
Error stack trace:
Message History
---------------------------------------------------------------------------------------------------------------------------------------
RouteId ProcessorId Processor Elapsed (ms)
[from-jms-consumer-] [from-jms-consumer-] [jmsFrom://queue:test ] [ 2]
[from-jms-consumer-] [log7 ] [log ] [ 1]
[from-jms-consumer-] [to7 ] [jms:queue:other_test ] [ 0]
Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------
java.util.concurrent.RejectedExecutionException: null
at org.apache.camel.component.jms.JmsProducer.process(JmsProducer.java:140) ~[127:org.apache.camel.camel-jms:3.0.0.M4]
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:130) ~[110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$RedeliveryState.run(RedeliveryErrorHandler.java:480) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.schedule(DefaultReactiveExecutor.java:185) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.impl.engine.DefaultReactiveExecutor.scheduleMain(DefaultReactiveExecutor.java:59) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.processor.Pipeline.process(Pipeline.java:87) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:222) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.process(DefaultAsyncProcessorAwaitManager.java:77) [110:org.apache.camel.camel-base:3.0.0.M4]
at org.apache.camel.support.AsyncProcessorSupport.process(AsyncProcessorSupport.java:40) [143:org.apache.camel.camel-support:3.0.0.M4]
at org.apache.camel.component.jms.EndpointMessageListener.onMessage(EndpointMessageListener.java:111) [127:org.apache.camel.camel-jms:3.0.0.M4]
at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:736) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:696) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:674) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:318) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:245) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076) [167:org.apache.servicemix.bundles.spring-jms:5.0.8.RELEASE_1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) [?:?]
at java.lang.Thread.run(Thread.java:844) [?:?]
I got the publisher to work by using toD instead of to on the route (https://camel.apache.org/message-endpoint.html); see the Java DSL sketch below for reference.
The transaction manager is still throwing an ERROR about not being able to log a transaction (something about the connection not being a "NamedXAResource", whatever that means), but at least the publisher is publishing.
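For reference, the equivalent of the working route in the Java DSL looks roughly like this (a sketch; in the Blueprint XML the change is simply replacing the <to> element with <toD>):

import org.apache.camel.builder.RouteBuilder;

public class JmsForwardRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("jms:queue:test")
            .log("Message consumed :: ${body}")
            // toD resolves the target endpoint dynamically per exchange, which
            // worked here where the static to() producer was being rejected
            .toD("jms:queue:other_test");
    }
}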

AMQP Channel shutdown but Consumer not always restart

I have frequent Channel shutdown: connection error issues (under the 24.133.241:5671 thread; the name is truncated) in the RabbitMQ Java client (my producer and consumer are far apart). Most of the time the consumer is automatically restarted, as I have enabled heartbeats (15 seconds). However, there were some instances with only Channel shutdown: connection error but no Consumer raised exception and no Restarting Consumer (under the cTaskExecutor-4 thread).
My current workaround is to restart my application. Can anyone shed some light on this matter?
2017-03-20 12:42:38.856 ERROR 24245 --- [24.133.241:5671] o.s.a.r.c.CachingConnectionFactory : Channel shutdown: connection error
2017-03-20 12:42:39.642 WARN 24245 --- [cTaskExecutor-4] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it
...
2017-03-20 12:42:39.642 INFO 24245 --- [cTaskExecutor-4] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer: tags=[{amq.ctag-4CqrRsUP8plDpLQdNcOjDw=21-05060179}], channel=Cached Rabbit Channel: AMQChannel(amqp://21-05060179#10.24.133.241:5671/,1), conn: Proxy#7ec31754 Shared Rabbit Connection: SimpleConnection#44bac9ec [delegate=amqp://21-05060179#10.24.133.241:5671/], acknowledgeMode=NONE local queue size=0
Generally, this is due to the consumer thread being "stuck" in user code somewhere, so it can't react to the broken connection.
If you have network issues, perhaps it's stuck reading or writing to a socket; make sure you have timeouts set for any I/O operations.
The next time it happens, take a thread dump to see what the consumer threads are doing.
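As a concrete illustration of the timeout advice above (a hypothetical example; your consumer may be doing a different kind of I/O), any blocking call made from the listener should have explicit timeouts so the thread can return and notice the broken connection:

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public final class DownstreamCall {

    // Hypothetical blocking call made from the consumer's message handler.
    static int fetchStatus() throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL("https://example.org/downstream-api").openConnection();
        conn.setConnectTimeout(5_000); // fail fast if the TCP connection cannot be established
        conn.setReadTimeout(10_000);   // do not block indefinitely waiting for a response
        return conn.getResponseCode();
    }
}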

Quartz Job executed multiple times simultaneously by each cluster machine, rather than one time by one machine for the entire cluster

Goal:
* Have Job1 run once for a three-node cluster every 10 minutes, and Job2 run once for the same cluster every 5 minutes. Each job generates an email; so at 10:55am I should receive only one Job2 email from the cluster, and at 11:00am I should receive one Job1 email and one Job2 email from the cluster, at 11:05am I should receive only one Job2 email from the cluster, and so on...
Problem:
* Job1 is being run multiple times every 10 minutes on each node in the cluster, and the same happens for Job2 (except every 5 minutes). This leads to many, many more than one or two emails.
Configuration:
* Three-node linux cluster
* Each machine NTP configured and time-sync'd
* Oracle DB
* Quartz v2.2.0 (cluster mode)
* Jobs configured via CronTrigger
* Each node has an instance of the same standalone Java application running on it, and the Java application instantiates an instance of the quartz scheduler in cluster-mode.
* quartz.properties files are identical on each machine.
I have investigated all the obvious potential causes, but nothing explains it or presents a fix. I have even tried inserting an artificial 10-second sleep in the job to ensure that it doesn't finish in under a second. Please find the relevant artifacts below (quartz.properties and log output). Any help would be greatly appreciated!
Artifact #1:
============================================================================
============================================================================
Q U A R T Z --- P R O P E R T I E S
==================
#============================================================================
# Configure Main Scheduler Properties
#============================================================================
org.quartz.scheduler.instanceName: MyQrtzScheduler
org.quartz.scheduler.instanceId: AUTO
org.quartz.scheduler.skipUpdateCheck: true
#============================================================================
# Configure ThreadPool
#============================================================================
org.quartz.threadPool.class: org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount: 1
org.quartz.threadPool.threadPriority: 5
#============================================================================
# Configure JobStore
#============================================================================
org.quartz.jobStore.misfireThreshold: 2592000000
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
org.quartz.jobStore.useProperties=false
org.quartz.jobStore.dataSource=myDS
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.clusterCheckinInterval=60000
#============================================================================
# Other Example Delegates
#============================================================================
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DB2v6Delegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DB2v7Delegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.DriverDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.HSQLDBDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.MSSQLDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PointbaseDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.WebLogicDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.WebLogicOracleDelegate
#============================================================================
# Configure Datasources
#============================================================================
org.quartz.dataSource.myDS.driver: oracle.jdbc.driver.OracleDriver
org.quartz.dataSource.myDS.URL: jdbc:oracle:thin:@myServer:myPort:blah
org.quartz.dataSource.myDS.user: myDBUser
org.quartz.dataSource.myDS.password: myDBPassword
org.quartz.dataSource.myDS.maxConnections: 2
org.quartz.dataSource.myDS.validationQuery: select 0
#============================================================================
# Configure Plugins
#============================================================================
org.quartz.plugin.shutdownHook.class: org.quartz.plugins.management.ShutdownHookPlugin
org.quartz.plugin.shutdownHook.cleanShutdown: true
org.quartz.plugin.triggerHistory.class=org.quartz.plugins.history.LoggingTriggerHistoryPlugin
org.quartz.plugin.jobHistory.class=org.quartz.plugins.history.LoggingJobHistoryPlugin
Artifact #2:
============================================================================
============================================================================
L O G --- O U T P U T
==================
2015-01-29 12:56:16,602 [main] INFO com.mycompany.myapp.jobs.QuartzHelper - Initializing Quartz scheduler...
2015-01-29 12:56:16,829 [main] INFO org.quartz.impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
2015-01-29 12:56:16,855 [main] INFO org.quartz.core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
2015-01-29 12:56:16,855 [main] INFO org.quartz.core.QuartzScheduler - Quartz Scheduler v.2.2.0 created.
2015-01-29 12:56:16,857 [main] INFO org.quartz.plugins.management.ShutdownHookPlugin - Registering Quartz shutdown hook.
2015-01-29 12:56:16,859 [main] INFO org.quartz.impl.jdbcjobstore.JobStoreTX - Using db table-based data access locking (synchronization).
2015-01-29 12:56:16,864 [main] INFO org.quartz.impl.jdbcjobstore.JobStoreTX - JobStoreTX initialized.
2015-01-29 12:56:16,865 [main] INFO org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.2.0) 'MyQrtzScheduler' with instanceId 'node1_1422554176832'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 1 threads.
Using job-store 'org.quartz.impl.jdbcjobstore.JobStoreTX' - which supports persistence. and is clustered.
2015-01-29 12:56:16,865 [main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'MyQrtzScheduler' initialized from specified file: '/my/install/directory/quartz.properties'
2015-01-29 12:56:16,866 [main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 2.2.0
2015-01-29 12:56:16,866 [main] INFO com.mycompany.myapp.jobs.QuartzHelper - Quartz scheduler initialized successfully.
2015-01-29 12:59:53,450 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 1 triggers
2015-01-29 13:00:00,007 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,008 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,809 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,836 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:00:00,839 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.simpl.PropertySettingJobFactory - Producing instance of Job 'node2_1422546730757.Job1', class=com.mycompany.myapp.job.Job1
2015-01-29 13:00:00,851 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node2_1422546730757.Job1Trigger fired job node2_1422546730757.Job1 at: 13:00:00 01/29/2015
2015-01-29 13:00:00,852 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node2_1422546730757.Job1 fired (by trigger node2_1422546730757.Job1Trigger) at: 13:00:00 01/29/2015
2015-01-29 13:00:00,852 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.core.JobRunShell - Calling execute on job node2_1422546730757.Job1
2015-01-29 13:00:00,853 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - ***Executing Inbound File SLA Job...
2015-01-29 13:00:02,054 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - ***Inbound File SLA Job: No SLA breaches found...
2015-01-29 13:00:02,150 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - Job1 completed successfully in [1297ms]; sleeping [63703ms] to meet the required minimum runtime for quartz-jobs
2015-01-29 13:00:24,881 [QuartzScheduler_MyQrtzScheduler-node1_1422554176832_ClusterManager] DEBUG org.quartz.impl.jdbcjobstore.JobStoreTX - ClusterManager: Check-in complete.
2015-01-29 13:01:05,862 [MyQrtzScheduler_Worker-1] INFO com.mycompany.myapp.job.Job1 - Job1 sleep-delay completed.
2015-01-29 13:01:05,864 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node2_1422546730757.Job1 execution complete at 13:01:05 01/29/2015 and reports: SUCCESS
2015-01-29 13:01:05,865 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node2_1422546730757.Job1Trigger completed firing job node2_1422546730757.Job1 at 13:01:05 01/29/2015 with resulting trigger instruction code: DO NOTHING
2015-01-29 13:01:05,868 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,869 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,872 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,880 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_Worker-1
2015-01-29 13:01:05,915 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 1 triggers
2015-01-29 13:01:05,917 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is desired by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,918 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' is being obtained: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,921 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' given to: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,954 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.impl.jdbcjobstore.StdRowLockSemaphore - Lock 'TRIGGER_ACCESS' returned by: MyQrtzScheduler_QuartzSchedulerThread
2015-01-29 13:01:05,955 [MyQrtzScheduler_QuartzSchedulerThread] DEBUG org.quartz.simpl.PropertySettingJobFactory - Producing instance of Job 'node1_1422543657050.Job2', class=com.mycompany.myapp.jobs.Job2
2015-01-29 13:01:05,961 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node1_1422543657050.Job2Trigger fired job node1_1422543657050.Job2 at: 13:01:05 01/29/2015
2015-01-29 13:01:05,962 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node1_1422543657050.Job2 fired (by trigger node1_1422543657050.Job2Trigger) at: 13:01:05 01/29/2015
2015-01-29 13:01:05,963 [MyQrtzScheduler_Worker-1] DEBUG org.quartz.core.JobRunShell - Calling execute on job node1_1422543657050.Job2
2015-01-29 13:01:05,963 [MyQrtzScheduler_Worker-1] WARN com.mycompany.myapp.jobs.Job2 - No outbound files found; Outbound File SLA Job cannot check for SLA breaches.
2015-01-29 13:01:05,965 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingJobHistoryPlugin - Job node1_1422543657050.Job2 execution complete at 13:01:05 01/29/2015 and reports: null
2015-01-29 13:01:05,966 [MyQrtzScheduler_Worker-1] INFO org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger node1_1422543657050.Job2Trigger completed firing job node1_1422543657050.Job2 at 13:01:05 01/29/2015 with resulting trigger instruction code: DO NOTHING
The following answer was given by the OP.
The problem was that I was defining Quartz jobs with identities that used a unique group id (the scheduler id) instead of a group id common to all hosts in the cluster. Since the scheduler id is unique to the host, each host in the cluster would check whether that job already existed using the fully qualified job name groupId.jobName, find that it didn't, and create a new instance of Job1 and Job2 during startup. Quartz jobs/triggers are never expired or cleared without an explicit request in Java or a manual SQL statement in Oracle. So over time the instances would build up, and instead of running a single instance of Job1 and Job2, Quartz would run all the instances of each job that had been created over time (hence the multiple executions and multiple email alerts).
The solution is to replace schedulerId with a static string such as "MyQuartzJobs" when defining a job's identity.
Basically, I changed the following line of Java code:
JobDetail job =
newJob(Job1.class).withIdentity(JOB1_JOB_NAME, uniqueSchedulerId)
.withDescription(JOB1_DESC + " created [" + new Date() + "]")
.storeDurably(false)
.requestRecovery(false)
.build();
to something like the following:
JobDetail job =
newJob(Job1.class).withIdentity(JOB1_JOB_NAME, "MyQuartzJobs")
.withDescription(JOB1_DESC + " created [" + new Date() + "]")
.storeDurably(false)
.requestRecovery(false)
.build();
