I intend to make a service wherein people can submit tasks (specifically transcoding tasks) to the system. Tasks should be serviced promptly, but at the same time the system must not starve anyone else, i.e. it must be fair. If one person submits 2000 tasks, the system should not cater only to him all the time, but should instead round-robin (or something like it) among other people's requests...
Are there any solutions available? I looked at RabbitMQ and other messaging systems, but they don't exactly cater to my problem. How are fair task queues implemented?
I would implement it like this:
Have a listener on an intake queue that, when a message arrives, checks the last time a task from the given user was received: if less than 1 second ago, put it on queue 1; if less than 10 seconds ago, put it on queue 2; if less than 100 seconds ago, put it on queue 3; otherwise put it on queue 4. You would then have listeners on the four queues processing the tasks.
Of course, you can change the number of queues and tune the times to get the best throughput. Ideally you want your queues to be busy all the time.
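A rough sketch of that dispatcher in Java (routeTo() and the Task type are placeholders for whatever your broker's publish call and message payload look like; this isn't tied to any particular messaging product):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Routes each incoming task to one of four queues based on how recently
// the same user last submitted a task.
public class FairDispatcher {

    interface Task { } // placeholder for your message payload type

    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    public void dispatch(String userId, Task task) {
        long now = System.currentTimeMillis();
        Long previous = lastSeen.put(userId, now); // record this submission, fetch the previous one
        long elapsed = (previous == null) ? Long.MAX_VALUE : now - previous;

        if (elapsed < 1_000) {
            routeTo("queue1", task);      // user submitted again within 1 second
        } else if (elapsed < 10_000) {
            routeTo("queue2", task);      // within 10 seconds
        } else if (elapsed < 100_000) {
            routeTo("queue3", task);      // within 100 seconds
        } else {
            routeTo("queue4", task);      // first task, or a quiet user
        }
    }

    private void routeTo(String queueName, Task task) {
        // publish to the named broker queue; implementation depends on your messaging system
    }
}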
I don't think this behavior exists natively but I could see it being implemented with some of RabbitMQ's features.
http://www.rabbitmq.com/blog/2010/08/03/well-ill-let-you-go-basicreject-in-rabbitmq/
That would let you reject messages and requeue them. You would then have to write a utility that can choose to execute or requeue messages based on some identifying property of the message (in this case the submitting user, which is custom to your app). Conceivably you could design the policy entirely around the routing key, if it contains the ID of the user you are trying to throttle.
Your policy could be structured around responding with basic.reject using {requeue=true}.
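A hedged sketch of such a consumer with the RabbitMQ Java client, assuming the consumer is registered with autoAck=false and that the user ID travels in a message header (the userId header name and the shouldThrottle() policy are app-specific assumptions):

import java.io.IOException;
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

// Consumer that either processes a message or rejects it back onto the
// queue, depending on a per-user throttling policy. Register it with
// autoAck=false so the explicit ack/reject below takes effect.
public class ThrottlingConsumer extends DefaultConsumer {

    public ThrottlingConsumer(Channel channel) {
        super(channel);
    }

    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body) throws IOException {
        // Identify the requester; assumed here to travel in a message header.
        Object userId = (properties.getHeaders() == null)
                ? null : properties.getHeaders().get("userId");

        if (shouldThrottle(userId)) {
            // basic.reject with requeue=true puts the message back on the queue.
            getChannel().basicReject(envelope.getDeliveryTag(), true);
        } else {
            process(body);
            getChannel().basicAck(envelope.getDeliveryTag(), false);
        }
    }

    private boolean shouldThrottle(Object userId) {
        return false; // your fairness policy, e.g. recent submissions per user
    }

    private void process(byte[] body) {
        // the actual transcoding work
    }
}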
Hopefully this helps!
I have one server and multiple clients. At some period, each client sends an alive packet to the server. (At the moment, the server doesn't respond to alive packets.) The period may vary from device to device and is configurable at runtime, for both the server and the clients. I want to generate an alert when one or more clients fail to send the alive packet (one missed packet, or two in a row, etc.). This aliveness information is used by other parts of the application, so the quicker the notice, the better. I came up with some ideas but couldn't choose between them.
Create a task that compares every client's last alive-packet timestamp with the current time and generates the alert or alerts. Run this task at some period, which should be smaller than the minimum client period.
Actually that seems better to me; however, this way I check some clients unnecessarily. (For example, if client periods range from 1 to 5 minutes, the task should run at least every minute, so checking every client with a period above 2 minutes on every run is redundant.) Also, if the minimum of the client periods decreases, I have to decrease the task's period as well.
Create a task for each client that compares that client's last alive-packet timestamp with the current time, then sleeps for that client's period.
This way, if the number of clients gets very high, there will be dozens of tasks. Since they will sleep most of the time, I doubt this is the more elegant option.
Is there an idiom or pattern for this kind of situation? I think a watchdog-style implementation would suit it well, but I haven't seen anything like that in Java.
Approach 2 is not very useful, as spawning 100 tasks for 100 clients is wasteful.
Approach 1 can be optimized if you use the average client period instead of the minimum.
It depends on your needs.
Is it critical if an alert is generated a few seconds later (or earlier) than it should be?
If not, then maybe it's worth grouping clients with similar heartbeat intervals and running the check against a group of clients rather than a single client. This would decrease the number of tasks (100 -> 10) and increase the number of clients handled by a single task (1 -> 10).
First approach is fine.
The only thing I can suggest is to create an independent service to do this monitoring. If you run this task as a thread inside your server, it won't be very manageable: imagine your control thread breaks or gets killed; how would you notice? So build an independent OS service, a separate Java program, that checks the last alive timestamps periodically.
That way you can easily modify and restart your service and view its logs separately. Depending on how important this is, you may even build a "watchdog of the watchdog" service, too.
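A minimal sketch of that periodic check (approach 1), assuming a map of last-seen timestamps that your packet-receiving code updates; the client ID type, the stored periods, and alert() are illustrative:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// One periodic task scans all clients' last-seen timestamps and alerts
// on any client that has gone quiet for too long.
public class AliveWatchdog {

    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();   // clientId -> last packet millis
    private final Map<String, Long> periodMs = new ConcurrentHashMap<>();   // clientId -> expected period

    public void start(long checkEveryMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::checkAll, checkEveryMs, checkEveryMs, TimeUnit.MILLISECONDS);
    }

    /** Called by whatever code receives alive packets. */
    public void packetReceived(String clientId) {
        lastSeen.put(clientId, System.currentTimeMillis());
    }

    private void checkAll() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            long allowed = periodMs.getOrDefault(e.getKey(), 60_000L);
            if (now - e.getValue() > 2 * allowed) {   // e.g. two missed packets in a row
                alert(e.getKey());
            }
        }
    }

    private void alert(String clientId) {
        System.out.println("client " + clientId + " missed its alive packets");
    }
}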
I have a piece of middleware that sits between two JMS queues. From one it reads, processes some data into the database, and writes to the other.
[Diagram: Queue 1 -> middleware service (writes to database) -> Queue 2]
With that in mind, I have some interesting logic that I would like to integrate into the service.
Scenario 1: Say the middleware service receives a message from Queue 1, and hits the database to store portions of that message. If all goes well, it constructs a new message with some data, and writes it to Queue 2.
Scenario 2: Say the database complains about something when the service attempts to perform some logic after getting a message from Queue 1. In this case, instead of writing a message to Queue 2, I would retry the database operation with incremental timeouts, i.e. try again in 5 seconds, then 30 seconds, then 1 minute if it's still down. The catch, of course, is to be able to read other messages independently of this retry, i.e. retry this one request while still listening for other requests.
With that in mind, what is the correct and most modern way to construct a future-proof solution?
After reading some posts on the net, it seems that I have several options.
One, I could spin off a new thread once a new message is received, so that I can both perform the "re-try" functionality and listen to new requests.
Two, I could possibly send the message back to the queue with a delay, i.e. if the database operation fails, write the message back to the JMS queue with some amount of delay added to it.
I am more fond of the first solution; however, I wanted to get the community's opinion on whether there is a newer/better way to solve this in Java 7. Is there something built into JMS to support this sort of "send the message back for reprocessing at a specific time"?
The JMS 2.0 specification describes the concept of delayed delivery of messages; see the "What's new" section of https://java.net/projects/jms-spec/pages/JMS20FinalRelease. Many JMS providers have implemented the delayed delivery feature.
But I wonder how delayed delivery will help your scenario. Since the database writes are failing, processing subsequent messages and attempting to write to the database might end up in the same situation. It might be better to sort out the database issues first and then pick up the messages from the queue.
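For reference, here is roughly what JMS 2.0 delayed delivery looks like for re-enqueuing a failed message with the escalating delays from the question (the retryQueue destination and the retryAttempt property are assumptions, not a standard convention):

import javax.jms.ConnectionFactory;
import javax.jms.JMSContext;
import javax.jms.Message;
import javax.jms.Queue;

// Re-enqueues a failed message so that it is redelivered after 5 s,
// then 30 s, then 60 s, using the JMS 2.0 delivery-delay feature.
public class DelayedRetrySender {

    private static final long[] DELAYS_MS = {5_000L, 30_000L, 60_000L};

    private final ConnectionFactory connectionFactory;
    private final Queue retryQueue;

    public DelayedRetrySender(ConnectionFactory connectionFactory, Queue retryQueue) {
        this.connectionFactory = connectionFactory;
        this.retryQueue = retryQueue;
    }

    public void scheduleRetry(Message failed, int attempt) {
        long delay = DELAYS_MS[Math.min(attempt, DELAYS_MS.length - 1)];
        try (JMSContext context = connectionFactory.createContext()) {
            context.createProducer()
                   .setDeliveryDelay(delay)                  // JMS 2.0 only
                   .setProperty("retryAttempt", attempt + 1) // track how many tries so far
                   .send(retryQueue, failed);
        }
    }
}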
In my case, let's say there are 50 JMS queues receiving different types of messages.
If I implemented 50 JMS listeners (one for each queue), it is working pretty good.
However, when all 50 queues had many pending messages, all 50 JMS listeners were working at the same time (i.e. 50 Java threads were running). This overloaded my server (it has very limited RAM and easily ran out of memory).
So I am thinking about limiting the number of active listeners. Let's say, limit it to a maximum of 10 active listeners at a time: sometimes listeners 01~10 work on queues 01~10, and sometimes listeners 11~20 work on queues 11~20, etc.
And even if new messages keep coming into queues 01~10, listeners 01~10 should be able to sleep for a while and let the other listeners work.
How can I achieve this case?
Usually it's one listener per queue, so unless you're going to manage listeners being active/inactive, you'll get a running thread each time a message is delivered to the queue.
What you need is a way to manage the scaling, regardless of where the messages are delivered. A few ideas come to mind:
1) Does the message processing require some memory-intensive resource that could be shared somehow? For example, database connections are often shared/pooled to avoid creating too many (though too many connections is usually a server-side issue; maybe there's another resource you need to share).
2) Using semaphores, limit the number of concurrent threads by having each thread acquire a permit from the semaphore before starting and return it at the end (very important!). Then, if a lot of messages come in concurrently, only n are processed at a time and the others queue up in the listener handlers; see the sketch after this list.
3) You could aggregate messages into new queues that have listeners that do the processing. Listeners for queues 1-10 post the message to newQueue1, queues 11-20 post the message to newQueue2, etc., and then you have listeners working on newQueue1, newQueue2, etc.
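Here is a sketch of idea 2, the semaphore cap; ThrottledListener and processMessage() are illustrative names, and the permit count of 10 matches the limit from the question:

import java.util.concurrent.Semaphore;
import javax.jms.Message;
import javax.jms.MessageListener;

// A shared semaphore caps concurrent message processing across all 50
// listeners: at most 10 messages are being worked on at any moment.
public class ThrottledListener implements MessageListener {

    // Shared by every listener instance.
    private static final Semaphore PERMITS = new Semaphore(10);

    @Override
    public void onMessage(Message message) {
        try {
            PERMITS.acquire();            // blocks when 10 listeners are already busy
            try {
                processMessage(message);  // your real work goes here
            } finally {
                PERMITS.release();        // always return the permit (very important!)
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void processMessage(Message message) {
        // ... application-specific processing ...
    }
}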
Background
At a high level, I have a Java application in which certain events should trigger a certain action to be taken for the current user. However, the events may be very frequent, and the action is always the same. So when the first event happens, I would like to schedule the action for some point in the near future (e.g. 5 minutes). During that window of time, subsequent events should take no action, because the application sees that there's already an action scheduled. Once the scheduled action executes, we're back to Step 1 and the next event starts the cycle over again.
My thought is to implement this filtering and throttling mechanism by embedding an in-memory ActiveMQ instance within the application itself (I don't care about queue persistence).
I believe that JMS 2.0 supports this concept of delayed delivery, with delayed messages sitting in a "staging queue" until it's time for delivery to the real destination. However, I also believe that ActiveMQ does not yet support the JMS 2.0 spec... so I'm thinking about mimicking the same behavior with time-to-live (TTL) values and Dead Letter Queue (DLQ) handling.
Basically, my message producer code would put messages on a dummy staging queue from which no consumers ever pull anything. Messages would be placed with a 5-minute TTL value, and upon expiration ActiveMQ would dump them into a DLQ. That's the queue from which my message consumers would actually consume the messages.
Question
I don't think I want to actually consume from the "default" DLQ, because I have no idea what other internal things ActiveMQ might dump there that are completely unrelated to my application code. So I think it would be best for my dummy staging queue to have its own custom DLQ. I've only seen one page of ActiveMQ documentation which discusses DLQ config, and it only addresses XML config files for a standalone ActiveMQ installation (not an in-memory broker embedded within an app).
Is it possible to programmatically configure a custom DLQ at runtime for a queue in an embedded ActiveMQ instance?
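For what it's worth, an embedded broker can be given a per-destination dead-letter strategy programmatically via the broker's destination policy; here is a sketch under the assumption that you are using the activemq-broker API (verify the class names against your ActiveMQ version):

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.region.policy.IndividualDeadLetterStrategy;
import org.apache.activemq.broker.region.policy.PolicyEntry;
import org.apache.activemq.broker.region.policy.PolicyMap;

// Configures an embedded broker so that each queue gets its own DLQ
// (e.g. messages expiring on "staging" would land on "DLQ.staging").
public class EmbeddedBrokerWithCustomDlq {

    public static BrokerService createBroker() throws Exception {
        BrokerService broker = new BrokerService();
        broker.setPersistent(false); // in-memory only, per the question

        IndividualDeadLetterStrategy dlqStrategy = new IndividualDeadLetterStrategy();
        dlqStrategy.setQueuePrefix("DLQ.");            // DLQ name = prefix + queue name
        dlqStrategy.setUseQueueForQueueMessages(true);
        dlqStrategy.setProcessExpired(true);           // route expired messages to the DLQ

        PolicyEntry entry = new PolicyEntry();
        entry.setDeadLetterStrategy(dlqStrategy);

        PolicyMap policyMap = new PolicyMap();
        policyMap.setDefaultEntry(entry);
        broker.setDestinationPolicy(policyMap);

        broker.start();
        return broker;
    }
}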
I'd also be interested to hear alternative suggestions if you think I'm on the wrong track. I'm much more familiar with JMS than AMQP, so I don't know if this is much easier with Qpid or some other Java-embeddable AMQP broker. Whatever Apache Camel actually is (!), I believe it's supposed to excel at this sort of thing, but that learning curve might be gross overkill for this use case.
Although you're worried that Camel might be gross overkill for this use case, I think that ActiveMQ is already gross overkill for the use case you've described.
You're looking to schedule something to happen 5 minutes after an event happens, and for it to consume only the first event and ignore all the ones between the first one and when the 5 minutes are up, right? Why not just schedule your processing method for 5 minutes from now via ScheduledExecutorService or your favorite scheduling mechanism, and save the event in a HashMap<User, Event> member variable. If any more events come in for this user before the processing method fires, you'll just see that you already have an event stored and not store the new one, so you'll ignore all but the first. At the end of your processing method, delete the event for this user from your HashMap, and the next event to come in will be stored and scheduled.
Running ActiveMQ just to get this behavior seems like way more than you need. Or if not, can you explain why?
EDIT:
If you do go down this path, don't use the message TTL to expire your messages; just have the (one and only) consumer read them into memory and use the in-memory solution described above to only process (at most) one batch every 5 minutes. Either have a single queue with message selectors, or use dynamic queues, one per user. You don't need the DLQ to implement the delay, and even if you could get it to do that, it won't give you the functionality of batching everything so you only run once per 5 minutes. This isn't a path you want to go down, even if you figure out how.
A simple solution is to keep track of the pending actions in a concurrent structure and use a ScheduledExecutorService to execute them:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Marker value meaning "an action is already scheduled for this user".
private static final Object RUNNING = new Object();

private final ConcurrentMap<UserId, Object> pendingActions =
        new ConcurrentHashMap<>();
private final ScheduledExecutorService ses = Executors.newScheduledThreadPool(10);

public void takeAction(final UserId id) {
    Object running = pendingActions.putIfAbsent(id, RUNNING); // atomic
    if (running == null) { // no pending action for this user yet
        ses.schedule(new Runnable() {
            @Override
            public void run() {
                doWork();
                pendingActions.remove(id); // next event for this user starts a new cycle
            }
        }, 5, TimeUnit.MINUTES);
    }
}
With Camel this could be easily achieved with an Aggregator component using the completionInterval parameter: every five minutes you check whether the list of aggregated messages is empty, and if it's not, fire a message to the route responsible for your user action and empty the list. You don't need to maintain the whole list of exchanges, just the state (user action planned or not).
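A rough sketch of that route in Camel's Java DSL (Camel 2.x, assuming the events carry a userId header and that seda: endpoints feed the route and the user-action handler):

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.aggregate.UseLatestAggregationStrategy;

// Keeps only the latest event per user and emits it at most once per
// interval; everything arriving in between is absorbed by the aggregator.
public class ThrottleRoute extends RouteBuilder {

    @Override
    public void configure() {
        from("seda:events")
            .aggregate(header("userId"), new UseLatestAggregationStrategy())
                .completionInterval(5 * 60 * 1000)   // fire every 5 minutes
            .to("seda:userAction");                  // route that performs the action
    }
}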
We encountered a problem under WebLogic 8.1 that we lived with but could never fix. We often queue up a hundred or more JMS messages, each of which represents a unit of work. Despite the fact that each message is of the same size and looks the same, one may take only seconds to complete while the next one represents 20 minutes of solid crunching.
Our problem is that each of the message-driven beans doing the work of these messages ends up on a thread that seems to grab ten messages at a time (we think this is a WebLogic optimization to avoid hitting the queue over and over for small messages). Then, as one thread after another finishes all of its small jobs and no new ones come in, we end up with a single thread logjammed on a long-running piece of work, with up to nine other items sitting and waiting on it to finish, despite the fact that other threads are free and could start on those units of work.
Now we are at a point where we are converting to WebLogic 10, so it is a natural time to return to this problem and find out whether there is any solution we could implement so that either: a) each thread grabs only one JMS message at a time to process, leaving all the others waiting in the incoming queue, or b) waiting messages (even ones already assigned to a particular thread) are automatically redistributed to free threads. Any ideas?
Enable the Forward Delay and provide an appropriate value. This will cause the JMS queue to redistribute messages to its peers if they have not been processed within the configured time.
Taking a single message off the queue every time might be overkill; it's all a balance between the number of messages you are processing and what you gauge as an issue.
There are also multiple issues with JMS on WebLogic 10 depending on your setup. You can save yourself a lot of time and trouble by using the latest maintenance pack (MP) right from the start.
A thread is in "starvation" when it cannot get regular access to the resources it needs; once it finally gets those resources, it is able to execute. The threads that hold those resources for long periods, causing the starvation, are called "greedy" threads.
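An illustrative sketch with contrived names: the greedy thread monopolizes the lock by reacquiring it almost continuously, so the other thread starves and rarely executes.

// Contrived demo of starvation caused by a greedy thread.
public class StarvationDemo {

    private static final Object LOCK = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            while (true) {
                synchronized (LOCK) {
                    slowWork(); // holds the lock for a full second at a time
                }
                // re-acquires almost immediately, leaving little chance for others
            }
        }, "greedy").start();

        new Thread(() -> {
            while (true) {
                synchronized (LOCK) {
                    System.out.println("starved thread finally got the lock");
                }
            }
        }, "starved").start();
    }

    private static void slowWork() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}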