Is it possible to limit the number of JMS receiver instances to a single instance? I.e. only process a single message from a queue at any one time?
The reason I ask is because I have a fairly intensive render type process to run for each message (potentially many thousands). I'd like to limit the execution of this code to a single instance at a time.
My application server is JBoss AS 6.0
You can configure the queue listener pool to have a single thread, so no more than one listener is handling requests, but this makes no sense to me.
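If you do decide to cap it at one anyway, here is a minimal sketch of what that could look like as a message-driven bean, assuming your JBoss resource adapter honours the maxSession activation property (the queue name and class are placeholders, so verify the property name against your RA's documentation):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

// Hypothetical MDB that processes the render messages one at a time.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/renderQueue"),
    // Cap the bean at a single concurrent JMS session => one message processed at a time.
    @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "1")
})
public class RenderListener implements MessageListener {
    @Override
    public void onMessage(Message message) {
        // run the intensive render job for this message here
    }
}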
The right answer is to tune the size of the thread pool to balance performance with memory requirements.
Many thousands? Per second, per minute, per hour? The rate at which they arrive, and the time each task takes, are both crucial. How much time, memory, CPU per request? Make sure you configure your queue to handle what could be a rather large backlog.
UPDATE: If ten messages arrive per second and each takes 10 seconds for a single listener to process, you'll need about 101 listener threads to keep up: 10 messages/second * 10 seconds means 100 messages arrive by the time the first listener finishes its 10-second task, so the 101st listener handles the 101st message and subsequent listeners free up in time to keep pace. If you need 1 MB of RAM per listener, that's roughly 101 MB of RAM just to process the messages on one server. You'll need a similar estimate for CPU as well.
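The same back-of-envelope arithmetic as a tiny self-contained snippet (the numbers are the illustrative ones above, not measurements):

public class ListenerSizing {
    public static void main(String[] args) {
        double arrivalsPerSecond = 10.0;  // messages arriving per second
        double secondsPerMessage = 10.0;  // processing time per message
        double mbPerListener = 1.0;       // memory cost per listener thread

        // Work in flight = arrival rate * processing time; +1 is the headroom thread above.
        int listeners = (int) Math.ceil(arrivalsPerSecond * secondsPerMessage) + 1;
        System.out.printf("listeners=%d, memory=%.0f MB%n", listeners, listeners * mbPerListener);
    }
}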
It might be wise to think about multiple queues on multiple servers and load balancing between them if one server isn't sufficient.
We have a kafka sdk written over apache-kafka (2.7.0) that we use to produce and consume messages to kafka topics.
By default the configuration is like this -
Auto commit is set to false
We use commitSync() for offsets
poll frequency for consumers is 1000 ms
max.poll.records is set to 2
Consumers are single threaded, and a single consumer runs per instance/pod (we use EKS)
Now, there is an order service that produces an order-created message to the order topic, and it is consumed by another service, the fulfilment service. The fulfilment logic takes on average 20s to process each message (too high!).
Because of this, even though we have 10 partitions in the topic and 10 application pods / consumers running (all in the same consumer group), we can only process 3 messages per minute per consumer (30 messages per minute overall).
The problem is that the rate of message production at peak is around 300 per minute. Even if we scaled to 50 partitions with 50 consumers, we could only process 150 per minute, and each consumer would still be underutilized in terms of CPU and memory usage.
Because of this, over time, there is a huge build up in consumer lag.
How do we scale to solve this problem? We can't have 100s of underutilized consumers running as that is not cost effective. Please help with any pointers to solve this.
P.S.: We are looking into optimizing the consumer step that takes 20s on average, but that will take time, and we need a short-term solution that is cost effective as well.
I'd suggest a "semi-lambda" architecture approach. If you are already running on k8s, use OpenFaaS/Knative to decouple the handling of these messages:
A first service consumes the messages, verifies them, and spins up lambdas to handle them.
The actual lambdas, managed by OpenFaaS or similar. This is a classic use case when handling a message takes more than about 500ms but no more than a couple of minutes. When a lambda finishes handling a message it returns a response to the first service. If it's okay, commit the offset; if not, also commit, but re-send the message to a dead-letter queue.
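A rough sketch of that first service, assuming a plain kafka-clients consumer and producer, a hypothetical HTTP endpoint for the fulfilment function (FULFIL_URL) and a hypothetical dead-letter topic (order-dlq); it is only meant to show the dispatch/commit flow described above:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderDispatcher {
    private static final String FULFIL_URL = "http://fulfil-fn.example.local/"; // hypothetical function endpoint
    private static final String DLQ_TOPIC = "order-dlq";                        // hypothetical dead-letter topic

    private final HttpClient http = HttpClient.newHttpClient();

    void dispatchLoop(KafkaConsumer<String, String> consumer, KafkaProducer<String, String> producer) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            List<CompletableFuture<Void>> inFlight = new ArrayList<>();
            for (ConsumerRecord<String, String> record : records) {
                HttpRequest request = HttpRequest.newBuilder(URI.create(FULFIL_URL))
                        .POST(HttpRequest.BodyPublishers.ofString(record.value()))
                        .build();
                // Hand the slow (~20s) fulfilment off to the function instead of doing it in the poll loop.
                inFlight.add(http.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                        .thenAccept(response -> {
                            if (response.statusCode() >= 400) {
                                // Failed handling: park the message on the dead-letter topic.
                                producer.send(new ProducerRecord<>(DLQ_TOPIC, record.key(), record.value()));
                            }
                        }));
            }
            // Wait for this batch, then commit either way: successes move on, failures went to the DLQ.
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
            consumer.commitSync();
        }
    }
}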
I have a typical kafka consumer/producer app that is polling all the time for data. Sometimes, there might be no data for hours, but sometimes there could be thousands of messages per second. Because of this, the application is built so it's always polling, with a 500ms duration timeout.
However, I've noticed that if the Kafka cluster goes down, the consumer client, once started, won't throw an exception; it will simply time out after 500ms and keep returning empty ConsumerRecords<K,V>. So, as far as the application is concerned, there is no data to consume, when in reality the whole Kafka cluster could be unreachable, and the app itself has no idea.
I checked the docs, and I couldn't find a way to validate consumer health, other than maybe closing the connection and subscribing to the topic every single time, but I really don't want to do that on a long-running application.
What's the best way to validate that the consumer is active and healthy while polling, ideally from the same thread/client object, so that the app can distinguish between no data and an unreachable kafka cluster situation?
I am sure this is not the best way to achieve what you are looking for.
But one simple way, which I implemented in my application, is to maintain a counter in the application, emptyRecordSetReceived, and increment it whenever a poll operation returns an empty record set.
This counter was emitted to Graphite at a periodic interval (say every minute) with the help of the metric registry in the application.
Now let's say you know the maximum time frame for which messages may legitimately be unavailable to this application, for example 6 hours. Given that you are polling every 500 milliseconds, you know that if no message is received for 6 hours, the counter will have increased by
2 polls per second * 60 seconds * 60 minutes * 6 hours = 43200.
We placed an alerting check on this counter value as reported to Graphite. This metric gave me a decent idea of whether there was a genuine lack of messages on the application side or whether something was down on the broker or producer side.
This is just the naive way I solved this use case to some extent. I would love to hear how it is actually done without maintaining such counters.
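For illustration, a minimal sketch of that counter, assuming the Dropwizard Metrics library with its Graphite reporter; the metric name, host and port are placeholders:

import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;

public class EmptyPollTracker {
    private final MetricRegistry registry = new MetricRegistry();
    // Cumulative count of empty poll results; the metric name is a placeholder.
    private final Counter emptyPolls = registry.counter("kafka.consumer.emptyRecordSetReceived");

    public EmptyPollTracker(String graphiteHost, int graphitePort) {
        Graphite graphite = new Graphite(new InetSocketAddress(graphiteHost, graphitePort));
        // Report to Graphite once a minute, as described above.
        GraphiteReporter.forRegistry(registry).build(graphite).start(1, TimeUnit.MINUTES);
    }

    // Call this from the poll loop after every poll().
    public void onPoll(int recordCount) {
        if (recordCount == 0) {
            emptyPolls.inc(); // alert when this approaches 43200 over a 6 hour window
        }
    }
}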
I am using the GAE task queue to update bulk data in the Datastore. The number of records is around 1-2M. To do this I scheduled a cron job and configured a queue this way:
<queue>
  <name>queueName</name>
  <rate>20/s</rate>
  <bucket-size>300</bucket-size>
  <retry-parameters>
    <task-retry-limit>1</task-retry-limit>
  </retry-parameters>
  <max-concurrent-requests>800</max-concurrent-requests>
</queue>
Each task does the following:
1. Fetch 1500 records from the Datastore using a cursor.
2. If a next cursor exists, create a new task and push it onto the queue (a rough sketch of this chaining step follows the list).
3. Process the 1500 fetched records, i.e. update all 1500 back into the Datastore.
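For reference, the chaining step (item 2 above) might look roughly like this with the GAE Java Task Queue API; the queue name, worker URL and parameter name are placeholders:

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class BulkUpdateChaining {
    // Enqueue the next batch, carrying the web-safe cursor where this batch stopped.
    void enqueueNext(String nextCursorWebSafe) {
        Queue queue = QueueFactory.getQueue("queueName");      // the queue configured above
        queue.add(TaskOptions.Builder
                .withUrl("/worker/bulkUpdate")                 // placeholder worker URL
                .param("cursor", nextCursorWebSafe)            // resume point for the next task
                .method(TaskOptions.Method.POST));
    }
}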
The expected number of tasks should be around 667, but I can only see 40 tasks in the logs.
In the logs, I can see the 40 tasks were added to the queue within 40 seconds. I'm not getting any errors in the logs.
Can anybody help me understand what is happening? Why am I not able to add all the tasks?
Thanks
In your approach the task enqueueing is very tightly coupled to the task request processing, in the sense that the request for one task in the queue needs to be processed in order to enqueue the next task. So you need to look at the rate-limiting factors your task processing may hit. The ones from your queue configuration are pretty generous, but there are others.
If you configured your app as threadsafe, and if your app design takes advantage of it, an instance of your app will be able to handle multiple requests concurrently, up to a maximum depending on its max-concurrent-requests config and its processing latency. Without the threadsafe config that maximum is 1.
Once an instance hits the maximum number of task requests it can process concurrently, it won't start processing new tasks from the queue (so it won't execute step #2, enqueueing a new task) until it completes at least one of the tasks already in progress. The task enqueueing rate per app instance is thus effectively limited: each running instance can contribute to the overall number of tasks in the queue only a number equal to the maximum number of tasks it can process in parallel.
But your app is configured for automatic scaling, so once you manage to quickly "fill up" all your running instances, the scheduler will start new instances for it. As new instances are started they will be able to process more of the tasks in the queue and thus also enqueue new tasks, contributing with the above-mentioned amount to the total number of tasks in the queue.
But this growth in the number of enqueued tasks can be much slower than before the instances hit their maximum processing rate: it takes some time to measure how the new instances help with the traffic and to determine whether more instances are needed. The overall growth in the number of tasks in the queue will have a "staircase" profile, with the height of a step being the maximum number of concurrent requests an instance can handle and the number of steps being the number of new instances started plus one.
Since you aren't seeing any actual task enqueueing errors, I can only suspect that you're somehow hitting a rate limit in processing your enqueued tasks, or that the processing somehow stops completely. There can be many reasons for it, including, for example:
hitting your app's daily budget (most likely due to the number of instance-hours)
hitting automatic scaling limits
You'd have to investigate your app from this perspective to pinpoint the culprit.
Side note: I assume this is on GAE, not on the development server (which doesn't respect the task queue configs and most likely can't get even close to GAE's parallel processing capability).
I have one server and multiple clients. Periodically, each client sends an alive packet to the server (at the moment, the server doesn't respond to alive packets). The period may vary from device to device and is configurable at runtime, for both the server and the clients. I want to generate an alert when one or more clients don't send their alive packet (one missed packet, or two in a row, etc.). This liveness information is used in other parts of the application, so the quicker the notice the better. I came up with some ideas but couldn't choose one.
Create a task that checks every client's last alive-packet timestamp against the current time and generates the alert or alerts. Run this task at some period smaller than the minimum client period.
That actually seems better to me; however, this way I check some clients unnecessarily (e.g. if client periods range from 1 to 5 minutes, the task has to run at least every minute, so checking all the clients with periods above 2 minutes is redundant). Also, if the minimum client period decreases, I have to decrease the task's period as well.
Create a task for each client that checks that client's last alive-packet timestamp against the current time, then sleeps for that client's period.
This way, if the number of clients grows very high, there will be dozens of tasks. Since they will sleep most of the time, I still doubt this is more elegant.
Is there any idiom or pattern for this kind of situation? I think a watchdog-style implementation would suit this well, but I haven't seen anything like it in Java.
Approach 2 is not very practical, as spinning up 100 tasks for 100 clients is an unwieldy idea.
Approach 1 can be optimized if you use the average client period instead of the minimum.
It depends on your needs.
Is it critical if the alert is generated a few seconds later (or earlier) than it should be?
If not, then it may be worth grouping clients with similar heartbeat intervals and running the check against a group of clients rather than a single client. This decreases the number of tasks (100 -> 10) and increases the number of clients handled by a single task (1 -> 10).
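For what it's worth, a minimal JDK-only sketch of approach 1, where each client is checked against its own configured period (the maps and the alert hook are placeholders); grouping clients as suggested above would just mean running one such check per group:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AliveWatchdog {
    // clientId -> last alive-packet timestamp (millis); updated by the packet handler
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();
    // clientId -> configured heartbeat period (millis); updatable at runtime
    private final Map<String, Long> periodMs = new ConcurrentHashMap<>();

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start(long checkIntervalMs) {
        scheduler.scheduleAtFixedRate(this::checkAll, checkIntervalMs, checkIntervalMs, TimeUnit.MILLISECONDS);
    }

    public void onAlivePacket(String clientId) {
        lastSeen.put(clientId, System.currentTimeMillis());
    }

    private void checkAll() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            // e.g. alert after two missed packets; default period is a placeholder
            long allowed = 2 * periodMs.getOrDefault(e.getKey(), 60_000L);
            if (now - e.getValue() > allowed) {
                alert(e.getKey());
            }
        }
    }

    private void alert(String clientId) {
        // placeholder alerting hook
        System.err.println("Client " + clientId + " missed its alive packets");
    }
}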
The first approach is fine.
The only thing I would suggest is to create an independent service to do this monitoring. If you run this check as a thread inside your server and that thread breaks or is killed, how would you notice? So build an independent OS service, a separate Java program, to check the last alive timestamps periodically.
This way you can easily modify and restart the service and see its logs separately. Depending on its importance, you might even build a "watchdog of the watchdog" service too.
We encountered a problem under WebLogic 8.1 that we lived with but could never fix. We often queue up a hundred or more JMS messages, each of which represents a unit of work. Despite the fact that each message is of the same size and looks the same, one may take only seconds to complete while the next one represents 20 minutes of solid crunching.
Our problem is that each of the message driven beans we have doing the work of these messages ends up on a thread that seems to grab ten messages at a time (we think it is being done as a WebLogic optimization to keep from having to hit the queue over and over again for small messages). Then, as one thread after another finishes all of its small jobs and no new ones come in, we end up with a single thread log jammed on a long running piece of work with up to nine other items sitting waiting on it to finish, despite the fact that other threads are free and could start on those units of work.
Now we are at a point where we are converting to WebLogic 10 so it is a natural point to return to this problem and find out if there is any solution that we could implement so that either: a) each thread only grabs one JMS message at a time to process and leaves all the others waiting in the incoming queue, or b) it would automatically redistribute waiting messages (even ones already assigned to a particular thread) out to free threads. Any ideas?
Enable the Forward Delay and provide an appropriate value. This will cause the JMS queue to redistribute messages to its peers if they have not been processed within the configured time.
Taking a single message off the queue every time might be overkill; it's all a balance between the number of messages you are processing and what you gauge as an issue.
There are also multiple issues with JMS on WebLogic 10 depending on your setup. You can save yourself a lot of time and trouble by using the latest maintenance pack (MP) right from the start.
A thread is in "starvation" when it cannot get regular access to the resources it needs and so is unable to make progress; the threads that hold onto those resources for long stretches are the "greedy" threads.