Greedy threads are grabbing too many JMS messages under WebLogic


We encountered a problem under WebLogic 8.1 that we lived with but could never fix. We often queue up a hundred or more JMS messages, each of which represents a unit of work. Despite the fact that each message is of the same size and looks the same, one may take only seconds to complete while the next one represents 20 minutes of solid crunching.
Our problem is that each of the message driven beans we have doing the work of these messages ends up on a thread that seems to grab ten messages at a time (we think it is being done as a WebLogic optimization to keep from having to hit the queue over and over again for small messages). Then, as one thread after another finishes all of its small jobs and no new ones come in, we end up with a single thread log jammed on a long running piece of work with up to nine other items sitting waiting on it to finish, despite the fact that other threads are free and could start on those units of work.
Now we are at a point where we are converting to WebLogic 10 so it is a natural point to return to this problem and find out if there is any solution that we could implement so that either: a) each thread only grabs one JMS message at a time to process and leaves all the others waiting in the incoming queue, or b) it would automatically redistribute waiting messages (even ones already assigned to a particular thread) out to free threads. Any ideas?

Enable the Forward Delay and provide an appropriate value. This will cause the JMS queue to redistribute messages to its peers if they have not been processed in the configured time.
Taking a single message off the queue every time might be overkill. It's a balance between the number of messages you are processing and what you gauge to be an issue.
There are also multiple issues with JMS on WebLogic 10 depending on your setup. You can save yourself a lot of time and trouble by using the latest MP right from the start.

A thread is in "starvation" when it is able to execute only after it gets the resources it is waiting for. Threads in starvation are referred to as "greedy threads".

Related

Size of event bus in vert.x

I am using vert.x to read a file, transform it, and then push it to kafka.
I am using 2 verticles, without any worker threads (I don't want to change the order of the logs in the file).
Verticle 1 : Read the file and filter
Verticle 2 : Publish to kafka
Each file contains approximately 120000 lines.
However, I observed that after some time I stop seeing logs from verticle 1.
I suspect that the event bus is getting full, so the consumer is still consuming but the producer thread is waiting for the event bus to empty.
So my questions are:
1. What is the default size of event bus? In Docs it says
DEFAULT_ACCEPT_BACKLOG
The default accept backlog = 1024
2. How do I confirm my suspicion that publisher thread is blocked?

Vert.x uses Netty's SingleThreadEventLoop internally for its event bus; the maximum number of pending tasks allowed is Integer.MAX_VALUE, which is 2,147,483,647 (about 2 billion) messages.
You could try VertxOptions.setWarningExceptionTime(long warningExceptionTime) to set the value lower than the default (5 sec) and see whether there is any warning about a blocked thread.

To complement @iwat's answer, in the version I am using, it looks like the max size is read from a system property:
protected static final int DEFAULT_MAX_PENDING_TASKS = Math.max(16, SystemPropertyUtil.getInt("io.netty.eventLoop.maxPendingTasks", 2147483647));
So you can control the size of the queues in front of the Verticles by setting that system property.
If the event bus is full (the queue in NioEventLoop reaches the max size), the task will be rejected. So if you hit that, you should start to see error responses to your messages; you should not see any blocked producers.

I'm not sure the accept-backlog setting has any effect on the eventbus, given the documentation it might have something to do with the netserver, but from a short scan of the code I haven't found any use in the eventbus.
The event bus however does deliver the message immediately, messages don't get queued up somewhere (at least that's what I understand from the code). So regarding your first question, it doesn't have any size, at least not when running locally (don't know about the clustered version, but I assume that doesn't apply in your case anyway)
Confirming that an (eventloop) thread is actually blocked is easy: there should be tons of exceptions in your log stating the event loop is blocked.
I guess your problem is somewhere else, but that's actually hard to tell without any code or meaningful logs.

I would like to ask the community for feedback on a strategy I have been considering to resolve some performance issues in my project.
The context:
We have an important process that performs 4 steps:
1. An entity's status changes and the change is persisted.
2. If 1 ends OK, the entity is exported into a CSV file.
3. If 2 ends OK, the entity is exported into another CSV, this one with far more info.
4. If 3 ends OK, the last CSV is sent by mail.
Steps 1 and 2 are linked and they are critical.
Steps 3 and 4 are not critical; we don't even care whether they end successfully.
Performance of 1-2 is fine, but 3-4 are insanely slow in some scenarios, mostly because of step 3.
If we execute all the steps in sequence, step 3 sometimes causes a timeout. The client does not get any response about steps 1 and 2 (the important ones) and the user doesn't know what's going on.
This case made me think of JMS queues in order to delegate the last 2 steps to another app/process and decouple the notification from the business logic. The second export and the mailing will be processed when possible, and probably in parallel. I could also split it into 2 queues: exports and mail notification.
Our webapp runs in a WebLogic 11 cluster, so I could use its implementation.
What do you think about the strategy? Is the WebLogic JMS implementation any good? Should I check another implementation, such as ActiveMQ or RabbitMQ?
I have also been thinking about implementing a ticketing system with spring-tasks.
At this point I have to mention spring-batch. Its usage is limited for us: we already have many jobs focused on important data consolidation processes, and the time window for allocating more jobs is limited. There is also the impact of trying to process all items massively at once.
Maybe we could if we found a way to use spring-batch's multithreading, but we have not yet found a way to fit our requirements into such a strategy.
Thank you in advance and excuse my english. I promise to keep working hard on it :-).

One problem to consider is data integrity. If step n fails, does step n-1 need to be reversed? Are there any ordering dependencies that you need to be aware of? And are you writing to the same or different CSV files? If the same, then you might have contention issues.
Now, back to the original problem. I would consider Java executors, using 4 fixed-sized pools and move the task through the pools as successes occur:
Submit step 1 to pool 1, getting a Future back, which will be used to check for completion.
When step 1 completes, you submit step 2 to pool 2.
When step 2 completes, you now can return a result to the caller. The call to this point has been waiting (likely with a timeout so it doesn't hang around forever) but now the critical tasks are done.
After returning to the client, submit step 3 to pool 3.
When step 3 completes, submit step 4 to pool 4.
The pools themselves, while fixed sized, could be larger for pool 1/2 to get maximum throughput (and to get back to your client as quickly as possible) and pool 3/4 could be smaller but still large enough to get the work done.
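The flow above can be sketched roughly as follows. This is a minimal illustration, not a production design: the class name StagedPipeline, the pool sizes, the 30-second timeouts and the record(...) placeholder (which stands in for the real status change, exports and mail) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class StagedPipeline {
    // Pool sizes are illustrative assumptions: larger for the critical steps.
    private final ExecutorService pool1 = Executors.newFixedThreadPool(4);
    private final ExecutorService pool2 = Executors.newFixedThreadPool(4);
    private final ExecutorService pool3 = Executors.newFixedThreadPool(2);
    private final ExecutorService pool4 = Executors.newFixedThreadPool(2);

    // Stand-in for real side effects so the order of steps can be observed.
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    /** Waits for steps 1 and 2, then schedules steps 3 and 4 and returns immediately. */
    boolean process(String entityId) {
        try {
            Future<Boolean> step1 = pool1.submit(() -> record("status:" + entityId));
            if (!step1.get(30, TimeUnit.SECONDS)) return false;
            Future<Boolean> step2 = pool2.submit(() -> record("csv1:" + entityId));
            if (!step2.get(30, TimeUnit.SECONDS)) return false;
        } catch (Exception e) {
            return false; // timeout, interruption or failure of a critical step
        }
        // Non-critical steps: the caller does not wait for these.
        pool3.submit(() -> {
            if (record("csv2:" + entityId)) {
                pool4.submit(() -> record("mail:" + entityId));
            }
        });
        return true;
    }

    // Hypothetical placeholder for the real work of each step.
    private boolean record(String entry) {
        log.add(entry);
        return true;
    }

    /** Stops the pools in stage order so step 3 can still hand work to pool 4. */
    boolean shutdownAndWait() {
        try {
            pool1.shutdown();
            pool2.shutdown();
            pool3.shutdown();
            boolean stage3Done = pool3.awaitTermination(10, TimeUnit.SECONDS);
            pool4.shutdown();
            return stage3Done && pool4.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that nothing here survives a server restart: work queued in pools 3 and 4 is lost if the JVM goes down.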
You could do something similar with JMS, but the issues are similar: you need to have multiple listeners or multiple threads per listener so that you can process at an appropriate speed. You could do steps 1/2 synchronously without a pool, but then you don't get some of the thread management that executors give you. You still need to "schedule" steps 3/4 by putting them on the JMS queue and still have listeners to process them.
The ability to recover from a server going down is key here, but Executors/ExecutorService has no persistence, so then I'd definitely be looking at JMS (and then I'd be queuing absolutely everything up, even the first 2 steps), but depending on your use case it might be overkill.

Yes, an event-driven approach where a message bus handles the integration sounds good. Messages are asynchronous, so you will not get timeouts. Of course you will need to use a Topic. WLS has some memory issues when you have too many messages in the server, so maybe a different server would work better for separation of concerns and resources.

Fair task queue for Java EE

I intend to make a service where people can submit tasks (specifically transcoding tasks) to the system. Tasks should get serviced soon, but at the same time the system should not starve anyone else, i.e. it must be fair. If a person submits 2000 tasks, the system should not cater only to him all the time but instead do a round robin or something like that among other people's requests...
Are there any solutions available? I looked at rabbitMQ and other messaging systems but they don't exactly cater to my problem. How are fair task queues implemented?

I would implement like this:
Have a queue listener on a queue which when a message arrives checks the last time a task from the given user was received; if the time < 1 sec put it on queue 1, if time < 10 seconds put on queue 2, if time < 100 seconds put on queue 3, else put on queue 4. You would then have listeners on the 4 queues that would be processing the tasks.
Of course you can change the number of queues and change the times to match the best throughput. Ideally you want your queues to be busy all the time.
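The routing decision described above could look something like this. It is only a sketch: the class name TieredRouter is made up, the thresholds are the ones from the answer, and treating a user's first task as "long ago" (queue 4) is an assumption, since the answer does not say what to do when there is no previous task.

```java
import java.util.HashMap;
import java.util.Map;

class TieredRouter {
    // Arrival time (in milliseconds) of the most recent task per user.
    private final Map<String, Long> lastSeenMillis = new HashMap<>();

    /** Returns the queue number (1 to 4) for a task from the given user arriving at nowMillis. */
    synchronized int route(String user, long nowMillis) {
        Long previous = lastSeenMillis.put(user, nowMillis);
        if (previous == null) {
            return 4; // assumption: no earlier task counts as "more than 100 seconds ago"
        }
        long elapsed = nowMillis - previous;
        if (elapsed < 1_000) return 1;
        if (elapsed < 10_000) return 2;
        if (elapsed < 100_000) return 3;
        return 4;
    }
}
```

How much listener capacity each of the four queues gets is a separate tuning decision; the router only decides where a message lands.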

I don't think this behavior exists natively but I could see it being implemented with some of RabbitMQ's features.
http://www.rabbitmq.com/blog/2010/08/03/well-ill-let-you-go-basicreject-in-rabbitmq/
That would let you reject messages and requeue them. You would then have to write a utility that can choose to execute or requeue messages based on some identifying property of the message (in this case, the report requester, which is custom to your app). Conceivably you could design the policy entirely around the routing key if it contains the ID of the user you are trying to throttle.
Your policy could be structured by responding with basic.reject using {requeue=true}.
Hopefully this helps!

Limiting JMS Destination Instances

Is it possible to limit the number of JMS receiver instances to a single instance? I.e. only process a single message from a queue at any one time?
The reason I ask is because I have a fairly intensive render type process to run for each message (potentially many thousands). I'd like to limit the execution of this code to a single instance at a time.
My application server is JBoss AS 6.0

You can configure the queue listener pool to have a single thread, so no more than one listener is handling requests, but this makes no sense to me.
The right answer is to tune the size of the thread pool to balance performance with memory requirements.
Many thousands? Per second, per minute, per hour? The rate at which they arrive, and the time each task takes, are both crucial. How much time, memory, CPU per request? Make sure you configure your queue to handle what could be a rather large backlog.
UPDATE: If ten messages arrive per second, and it takes 10 seconds for a single listener to process a message, then you'll need 101 listener threads to be able to keep up. (10 messages/second * 10 seconds means 100 messages arrive by the time the first listener finishes its 10 second task. The 101st listener will handle the 101st message, and subsequent listeners will finish in time to keep up.) If you need 1 MB of RAM per listener, you'll need 101 MB RAM just to process all the messages on one server. You'll need a similar estimate for CPU as well.
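The arithmetic in the update can be written down as a small helper. This is a sketch under assumptions: the class and method names are invented, and the headroom parameter is just a way to express the extra listener. Little's law (arrival rate times service time) gives 100 messages in service in steady state; the figure of 101 above corresponds to one listener of headroom.

```java
class ListenerSizing {
    /** Little's law estimate of busy listeners, plus optional spare listeners. */
    static long listenersNeeded(double arrivalsPerSecond, double serviceSeconds, int headroom) {
        return (long) Math.ceil(arrivalsPerSecond * serviceSeconds) + headroom;
    }

    /** Memory needed if every listener uses mbPerListener megabytes. */
    static long memoryMb(long listeners, long mbPerListener) {
        return listeners * mbPerListener;
    }

    public static void main(String[] args) {
        long listeners = listenersNeeded(10, 10, 1);
        System.out.println(listeners + " listeners, " + memoryMb(listeners, 1) + " MB");
    }
}
```

With a single listener, as the question asks for, the same formula shows the backlog growing without bound whenever arrivals per second times service seconds exceeds 1.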
It might be wise to think about multiple queues on multiple servers and load balancing between them if one server isn't sufficient.

Multiple SingleThreadExecutors for a given application...a good idea?

This question is about the fallouts of using SingleThreadExecutor (JDK 1.6). Related questions have been asked and answered in this forum before, but I believe the situation I am facing, is a bit different.
Various components of the application (let's call the components C1, C2, C3 etc.) generate (outbound) messages, mostly in response to messages (inbound) that they receive from other components. These outbound messages are kept in queues which are usually ArrayBlockingQueue instances - fairly standard practice perhaps. However, the outbound messages must be processed in the order they are added. I guess use of a SingleThreadExecutor is the obvious answer here. We end up having a 1:1 situation - one SingleThreadExecutor for one queue (which is dedicated to messages emanating from one component).
Now, the number of components (C1,C2,C3...) is unknown at a given moment. They will come into existence depending on the need of the users (and will be eventually disposed of too). We are talking about 200-300 such components at the peak load. Following the 1:1 design principle stated above, we are going to arrange for 200 SingleThreadExecutors. This is the source of my query here.
I am uncomfortable with the thought of having to create so many SingleThreadExecutors. I would rather try and use a pool of SingleThreadExecutors, if that makes sense and is plausible (any ready-made, seen-before classes/patterns?). I have read many posts on recommended use of SingleThreadExecutor here, but what about a pool of the same?
What do learned women and men here think? I would like to be directed, corrected or simply, admonished :-).

If your requirement is that the messages be processed in the order that they're posted, then you want one and only one SingleThreadExecutor. If you have multiple executors, then messages will be processed out-of-order across the set of executors.
If messages need only be processed in the order that they're received for a single producer, then it makes sense to have one executor per producer. If you try pooling executors, then you're going to have to put a lot of work into ensuring affinity between producer and executor.
Since you indicate that your producers will have defined lifetimes, one thing that you have to ensure is that you properly shut down your executors when they're done.
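One way to keep one executor per producer and still shut each one down when its producer goes away is sketched below. The class name PerProducerExecutors and the string producer IDs are assumptions for illustration; the lambda syntax needs Java 8 or later rather than the JDK 1.6 mentioned in the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PerProducerExecutors {
    // One single-threaded executor per live producer, created on first use.
    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

    /** Queues a message; messages from the same producer run in posting order. */
    void post(String producerId, Runnable message) {
        executors.computeIfAbsent(producerId, id -> Executors.newSingleThreadExecutor())
                 .execute(message);
    }

    /**
     * Call when a producer is disposed of. Already queued messages are still
     * processed; returns true if they finished within the timeout.
     */
    boolean dispose(String producerId, long timeoutSeconds) {
        ExecutorService executor = executors.remove(producerId);
        if (executor == null) {
            return true; // nothing was ever posted for this producer
        }
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A post that races with dispose for the same producer can create a fresh executor that then needs disposing again, so in this sketch the caller is expected to stop posting before disposing.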

Messaging and batch jobs is something that has been solved time and time again. I suggest not attempting to solve it again. Instead, look into Quartz, which maintains thread pools, persisting tasks in a database etc. Or, maybe even better look into JMS/ActiveMQ. But, at the very least look into Quartz, if you have not already. Oh, and Spring makes working with Quartz so much easier...

I don't see any problem there. Essentially you have independent queues and each has to be drained sequentially, so one thread for each is a natural design. Anything else you can come up with is essentially the same. As an example, when Java NIO first came out, frameworks were written trying to take advantage of it and get away from the thread-per-request model. In the end some authors admitted that to provide a good programming model they were just reimplementing threading all over again.

It's impossible to say whether 300 or even 3000 threads will cause any issues without knowing more about your application. I strongly recommend that you profile your application before adding more complexity.
The first thing to check is that the number of concurrently running threads is not much higher than the number of cores available to run them. The more active threads you have, the more time is wasted managing those threads (context switches are expensive) and the less work gets done.
The easiest way to limit the number of running threads is to use a semaphore. Acquire the semaphore before starting work and release it after the work is done.
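A minimal sketch of the semaphore approach, assuming a hypothetical BoundedRunner wrapper that every worker thread calls into. The peak counter and the demo method exist only to show that the limit holds; they are not part of the technique.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedRunner {
    private final Semaphore permits;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    BoundedRunner(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Blocks until a permit is free, runs the work, then always releases the permit. */
    void run(Runnable work) {
        permits.acquireUninterruptibly();
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max); // remember the highest concurrency seen
        try {
            work.run();
        } finally {
            running.decrementAndGet();
            permits.release();
        }
    }

    int peakConcurrency() {
        return peak.get();
    }

    /** Starts threadCount threads and returns the highest number that ran work at once. */
    static int demo(int threadCount, int maxConcurrent) {
        BoundedRunner runner = new BoundedRunner(maxConcurrent);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> runner.run(BoundedRunner::pause));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runner.peakConcurrency();
    }

    // Simulated unit of work.
    private static void pause() {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The threads still exist and still block while waiting for a permit; the semaphore only caps how many are doing work at the same moment.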
Unfortunately, limiting the number of running threads may not be enough. While it may help, the overhead may still be too great if the time spent per context switch is a major part of the total cost of one unit of work. In this scenario, the most efficient way is often to have a fixed number of queues. Each component gets a queue from a global pool of queues when it initializes, using an algorithm such as round-robin for queue selection.
If you are in one of those unfortunate cases where the most obvious solutions do not work, I would start with something relatively simple: one thread pool, one concurrent queue, a lock, a list of queues and a temporary queue for each thread in the pool.
Posting work to the queue is simple: add the payload and the identity of the producer.
Processing is relatively straightforward as well. First you get the next item from the queue. Then you acquire the lock. While you hold the lock, you check whether any other thread is running a task for the same producer. If not, you register the thread by adding a temporary queue to the list of queues. Otherwise you add the task to the existing temporary queue. Finally you release the lock. Now you either run the task or poll for the next one and start over, depending on whether the current thread was registered to run tasks. After running the task, you take the lock again and see whether there is more work to be done in the temporary queue. If not, remove the queue from the list. Otherwise get the next task. Finally you release the lock. Again, you choose whether to run the task or to start over.
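Here is a rough sketch of that idea. It is a simplified variant rather than the exact bookkeeping described above: the thread pool's own work queue plays the role of the global concurrent queue, and the check for "is another thread already running this producer" happens at submit time under the lock. The names KeyedSerialExecutor, pending and drain are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class KeyedSerialExecutor {
    private final ExecutorService pool;
    private final Object lock = new Object();
    // Temporary queue per producer that currently has a pool thread working for it.
    private final Map<String, Queue<Runnable>> pending = new HashMap<>();

    KeyedSerialExecutor(ExecutorService pool) {
        this.pool = pool;
    }

    void submit(String producer, Runnable task) {
        synchronized (lock) {
            Queue<Runnable> queue = pending.get(producer);
            if (queue != null) {
                queue.add(task); // a thread already owns this producer: park the task
                return;
            }
            pending.put(producer, new ArrayDeque<>()); // register ownership
        }
        pool.execute(() -> drain(producer, task));
    }

    // Runs the first task, then keeps taking parked tasks for the same producer.
    private void drain(String producer, Runnable first) {
        Runnable next = first;
        while (next != null) {
            try {
                next.run();
            } catch (RuntimeException e) {
                // a real implementation would log this; keep draining regardless
            }
            synchronized (lock) {
                next = pending.get(producer).poll();
                if (next == null) {
                    pending.remove(producer); // no more work: unregister
                }
            }
        }
    }

    /** Two producers share a 4-thread pool; returns true if each producer's order was kept. */
    static boolean demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        KeyedSerialExecutor executor = new KeyedSerialExecutor(pool);
        List<Integer> a = Collections.synchronizedList(new ArrayList<>());
        List<Integer> b = Collections.synchronizedList(new ArrayList<>());
        List<Integer> expected = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int n = i;
            expected.add(n);
            executor.submit("A", () -> a.add(n));
            executor.submit("B", () -> b.add(n));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return a.equals(expected) && b.equals(expected);
    }
}
```

One trade-off in this sketch is that a very busy producer can hold on to a pool thread for as long as its temporary queue keeps refilling, so the pool should have at least a few more threads than there are such producers.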
