ActiveMQ stops receiving messages after servers are left idle for a few hours - java

I've been browsing the forums for the last few days and have tried almost everything I could find, but without any luck.
The situation is: inside our Java web application we have ActiveMQ 5.7 (I know it's very old; we will eventually upgrade to a newer version, but for various reasons that's not possible right now). We have only one broker and multiple consumers.
When I start the servers (I have tried with 2, 3, 4 and more) everything is OK. The servers are communicating with each other, and QUEUE messages are consumed instantly. But when I leave the servers idle (for example, to finally catch some sleep ;) ) that is no longer the case. Messages are stuck in the database and are not being consumed. The only way to have them delivered is to restart the server.
Part of my configuration (we keep it in properties file, it's the actual state, however I have tried many different combinations):
BrokerServiceURI=broker:(tcp://0.0.0.0:{0})/{1}?persistent=true&useJmx=false&populateJMSXUserID=false&useShutdownHook=false&deleteAllMessagesOnStartup=false&enableStatistics=true
ConnectionFactoryURI=failover://({0})?initialReconnectDelay=100&timeout=6000
ConnectionFactoryServerURI=tcp://{0}:{1}?keepAlive=true&soTimeout=100&wireFormat.cacheEnabled=false&wireFormat.tightEncodingEnabled=false&wireFormat.maxInactivityDuration=0
BrokerService.startAsync=true
BrokerService.networkConnectorStartAsync=true
BrokerService.keepDurableSubsActive=false
Do you have a clue?

I cannot actually tell you the reason from the description above, but I can list a few checks that are fresh in my mind. Please confirm whether the following are valid for you:
Can you check the consumer connections?
Are the consumer sessions still active?
If all the consumer connections are up, then check a thread dump to see whether the active consumer threads (I'm assuming you created consumer threads; correct me if I'm wrong) are in RUNNING or WAITING state. This happened to me once: all the consumers were active, but they were stuck in WAITING state because another thread in the server was holding a lock on the Logger while posting a message to Slack.
Check the dispatch queue size for each consumer. Check the prefetch of each consumer and then compare the dispatch queue size with the prefetch (see the JMX sketch after this list).
Is there a JMSXGroupID you are assigning to each message?
Can you tell a little more about your consumer/producer/broker configurations?
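For the dispatch-queue vs. prefetch check, here is a rough sketch of how it could be inspected over JMX. It assumes you run the broker with useJmx=true (your configuration above has it disabled) and expose a JMX connector; the service URL and the object-name pattern are assumptions and vary between ActiveMQ versions.

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ConsumerCheck {
    public static void main(String[] args) throws Exception {
        // Adjust host/port to wherever the broker's JMX connector listens.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Object-name keys differ between ActiveMQ versions; this pattern
            // matches the 5.x "Type=Subscription" MBeans.
            Set<ObjectName> subs = conn.queryNames(
                    new ObjectName("org.apache.activemq:Type=Subscription,*"), null);
            for (ObjectName sub : subs) {
                int prefetch = ((Number) conn.getAttribute(sub, "PrefetchSize")).intValue();
                int dispatched = ((Number) conn.getAttribute(sub, "DispatchedQueueSize")).intValue();
                // A consumer whose dispatched count sits at the prefetch limit
                // and never drops is likely stuck or not acknowledging.
                System.out.printf("%s prefetch=%d dispatched=%d%n", sub, prefetch, dispatched);
            }
        } finally {
            connector.close();
        }
    }
}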

Related

Any way to set the RMQ consumer delivery timeout at a per-message level?

We have a use case that holds active jobs in a RabbitMQ job queue. The clients, when they are free, pull jobs from this queue. Pretty normal. But in our case, we do not ACK the jobs. We allow them to stay in the Unacked state so that if the client dies, the job goes to a pre-defined dead-letter queue. We then have a process that pulls messages from the dead-letter queue and decides either to requeue the message back to the original job queue or to discard it.
This has worked well for a long time. Now, we have upgraded to a newer version of RMQ, and found that we get disconnects with PRECONDITION_FAILED, because the default ack timeout of 30 minutes has expired.
Beyond removing this from the server, does anyone know a way to configure this on a per-message level?
Some might say to just ACK the job and use a handler to return it to the dead-letter queue if needed. Well, sorry, that will not work for us.
So, any thoughts?
No, there is not at this time. You should configure the default to be greater than the longest expected job duration. Please note that if you are using quorum queues this may cause disk usage growth because the log files can't be compacted while messages are outstanding.
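For reference, a sketch of what that looks like, assuming RabbitMQ 3.8.17 or later where the consumer_timeout key is supported in rabbitmq.conf (the value below is illustrative and in milliseconds):

# rabbitmq.conf: raise the delivery acknowledgement timeout broker-wide
# (pick a value longer than the longest expected job duration)
consumer_timeout = 7200000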
We may make this timeout configurable in a more granular way, so please keep an eye on future RabbitMQ releases for that.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Kafka Consumer Reading Previous Records

I am facing a problem with my Kafka consumer. I have two Kafka brokers running with replication factor 2 for the topic. Every time a broker restarts and I then restart my consumer service, it starts to read records which it has already read. E.g. before I restarted the consumer this was the state,
and the consumer was sitting idle, not receiving any messages, as it had read all of them.
I restart my consumer, and all of a sudden it starts receiving messages which it has processed previously; here is the offset situation now.
Also, what are LOG-END-OFFSET and LAG? It looks like these are something to consider here.
Note that it only happens when one of the brokers gets restarted due to Kubernetes shifting it to another node.
This is the topic configuration:
Based on the info you posted, a couple of things immediately come to mind:
The first screenshot shows a lag of 182, which means the consumer either was not running, or it had some odd configuration that made it stop consuming. (LOG-END-OFFSET is the offset of the latest message in the partition, and LAG is the difference between it and the consumer group's committed offset, i.e. how many messages are still unconsumed.) Was it possible one of the brokers was down when the consumer stopped consuming?
On restart, the consumer finally consumed all the remaining messages, because it now shows lag of 0. This is correct, expected Kafka behavior.
Make sure that the consumer group name is not changing between restarts. Some clients default to "randomized" consumer group names, which works only as long as the consumer is not restarted (see the sketch below).
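A minimal sketch of the relevant settings with the plain Java client, assuming auto-commit is acceptable for your service; the group id and bootstrap servers are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FixedGroupConsumer {
    public static KafkaConsumer<String, String> create() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");
        // A fixed group id: committed offsets survive consumer restarts.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-service");
        // Commit offsets so a restart resumes from the last committed position.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        // Only used when the group has no committed offset at all.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        return new KafkaConsumer<>(props);
    }
}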

Detecting ActiveMQ flow control

I have a production system that uses ActiveMQ (5.3.2) to send messages from server A to server B. A few weeks ago, the system inexplicably started taking 10+ seconds to send a message. After a reboot of the producer, the system worked fine.
After investigating, I'm pretty sure this is due to producer flow control. (I have a fairly standard ActiveMQ setup.) The day before this happened, my consumer software had (for other reasons) been acting erratically and had even stopped accepting connections for a while, so I'm guessing that triggered it. (It does puzzle me that the requests were still being throttled a day later.)
Question: how can I confirm that the requests were being throttled? I took a heap dump of the server; is there data in memory I can look for?
Edit: I've found the following:
WireFormatNegotiator.tcpNoDelayEnabled=false for one of the three WireFormatNegotiator instances in memory. I'm trying to figure out what sets this.
And second (and more important): is there a way I can use JMX to tell whether messages are being throttled? I'd like to set up a Nagios alert to let me know if this happens in the future. Which property should I check with JMX?
You can configure the broker so that your producer client gets a javax.jms.ResourceAllocationException instead of being blocked; the exception can then be detected/logged, etc. Just set one of the following on the broker's systemUsage:
<systemUsage>
  <systemUsage sendFailIfNoSpaceAfterTimeout="3000"/>
</systemUsage>

...OR...

<systemUsage>
  <systemUsage sendFailIfNoSpace="true"/>
</systemUsage>
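A sketch of how the producer side might then detect it, assuming one of the sendFail options above is set on the broker; the connection URL and queue name are just placeholders:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.ResourceAllocationException;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class ThrottleAwareProducer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://serverB:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("EXAMPLE.QUEUE"));
        try {
            producer.send(session.createTextMessage("payload"));
        } catch (ResourceAllocationException e) {
            // The broker refused the send because a usage limit was hit
            // (producer flow control); log it so Nagios can alert on the log.
            System.err.println("Broker is throttling the producer: " + e.getMessage());
        } finally {
            connection.close();
        }
    }
}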

JMS queue is full

My Java EE application sends JMS messages to a queue continuously, but sometimes the JMS consumer application stops receiving them. This causes the JMS queue to grow very large, even full, which brings down the server.
My server is JBoss or WebSphere. Do the application servers provide a strategy to remove "timed out" JMS messages?
What is a good strategy to handle a large JMS queue? Thanks!
With any asynchronous messaging you must deal with the "fast producer/slow consumer" problem. There are a number of ways to handle this:
Add consumers. With WebSphere MQ you can trigger a queue based on depth. Some shops use this to add new consumer instances as queue depth grows. Then as queue depth begins to decline, the extra consumers die off. In this way, consumers can be made to automatically scale to accommodate changing loads. Other brokers generally have similar functionality.
Make the queue and underlying file system really large. This method attempts to absorb peaks in workload entirely in the queue. This is after all what queuing was designed to do in the first place. Problem is, it doesn't scale well and you must allocate disk that 99% of the time will be almost empty.
Expire old messages. If the messages have an expiry set, then you can cause them to be cleaned up. Some JMS brokers will do this automatically, while on others you may need to browse the queue in order to cause the expired messages to be deleted. The problem with this is that not all messages lose their business value and become eligible for expiry; it's mostly fire-and-forget messages (audit logs, etc.) that fall into that category.
Throttle back the producer. When the queue fills, nothing can put new messages to it. In WebSphere MQ the producing application then receives a return code indicating that the queue is full. If the application distinguishes between fatal and transient errors, it can stop and retry (see the sketch after this answer).
The key to successfully implementing any of these is that your system be allowed to provide "soft" errors that the application will respond to. For example, many shops will raise the MAXDEPTH parameter of a queue the first time they get a QFULL condition. If the queue depth exceeds the size of the underlying file system the result is that instead of a "soft" error that impacts a single queue the file system fills and the entire node is affected. You are MUCH better off tuning the system so that the queue hits MAXDEPTH well before the file system fills but then also instrumenting the app or other processes to react to the full queue in some way.
But no matter what else you do, option #4 above is mandatory. No matter how much disk you allocate, how many consumer instances you deploy, or how quickly you expire messages, there is always a possibility that your consumer(s) won't keep up with message production. When this happens your producer app should throttle back, or raise an alarm and stop, or do anything other than hang or die. Asynchronous messaging is only asynchronous up to the point that you run out of space to queue messages. After that your apps are synchronous and must gracefully handle that situation, even if that means to (gracefully) shut down.
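In JMS terms, option #4 might look roughly like the sketch below. It assumes a send against a full queue surfaces as a JMSException (the exact error differs per broker), and the retry limits and delays are illustrative:

import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageProducer;

public class BackoffSender {
    // Retry a send with exponential backoff; after maxAttempts, give up and
    // let the caller raise an alarm instead of hanging or dying.
    public static void sendWithBackoff(MessageProducer producer, Message message,
                                       int maxAttempts) throws JMSException, InterruptedException {
        long delayMillis = 500;
        for (int attempt = 1; ; attempt++) {
            try {
                producer.send(message);
                return;
            } catch (JMSException e) {
                // Treat the failure as transient (e.g. queue full) up to maxAttempts.
                if (attempt >= maxAttempts) {
                    throw e; // surface the error: alert, park the work, but do not hang
                }
                Thread.sleep(delayMillis);
                delayMillis = Math.min(delayMillis * 2, 60_000);
            }
        }
    }
}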
Sure!
http://download.oracle.com/docs/cd/E17802_01/products/products/jms/javadoc-102a/index.html
MessageProducer#setTimeToLive(long) (or the timeToLive argument of send) does what you want; the provider then sets JMSExpiration on each message accordingly.
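A minimal sketch of setting a time-to-live on a plain JMS producer; the 30-minute value is just an example:

import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

public class ExpiringSender {
    // Send a message the broker may discard once its time-to-live has passed.
    public static void send(Session session, MessageProducer producer, String body)
            throws JMSException {
        // 30 minutes, in milliseconds; the provider computes JMSExpiration from this.
        producer.setTimeToLive(30 * 60 * 1000L);
        TextMessage message = session.createTextMessage(body);
        producer.send(message);
    }
}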

How can I get this external JMS client to terminate?

I'm working through the 'Simple Point-to-Point Example' section of the Sun JMS tutorial (sender source, receiver source), using GlassFish as my JMS provider. I've set up the QueueConnectionFactory and Queue in the GlassFish admin UI, added the relevant JARs to my classpath, and the receiver is receiving the messages sent by the sender.
However, neither sender nor receiver terminate. The main thread exits normally (after successfully calling queueConnection.close()) but two non-daemon threads are left hanging around:
iMQReadChannel-0
imqConnectionFlowControl-0
It seems (from this java.net thread) that the reason is that queueConnection.close() just returns the connection to the pool rather than really closing it. I can't find any way to tell the pool to shut down, so the only option I'm left with is System.exit(), which feels wrong.
I've tried setting the minimum pool size to 0, the maximum pool size to 1 and the idle timeout to 10 seconds but it seems to make no difference. Even when I just lookup the connection factory and don't ask for a connection, these two threads are still started and don't terminate.
Any help much appreciated!
Why don't you simply terminate with a System.exit(0)? Given the sample, the current behavior is correct (a Java program terminates only when all non-daemon threads have ended).
Maybe you can get the samples to shut down properly by playing with the client library's properties (idle time, etc.), but it seems others (http://www.nabble.com/Simple-JMS-Client-doesn%27t-quit-td15662753.html) still experience the very same problem (and, anyway, I still don't understand what the point would be).
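A minimal sketch of that suggestion, assuming the tutorial's queueConnection variable:

import javax.jms.JMSException;
import javax.jms.QueueConnection;

public class SenderShutdown {
    // Close the JMS connection, then force the JVM down despite the
    // provider's leftover non-daemon threads (iMQReadChannel, flow control).
    static void closeAndExit(QueueConnection queueConnection) throws JMSException {
        queueConnection.close(); // may only return the connection to the pool
        System.exit(0);          // ends the process regardless
    }
}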
Good news for us. "Will not be fixed."
http://java.net/jira/browse/GLASSFISH-1429?focusedCommentId=85555&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_85555
