I'm currently facing the problem that I want to realize a simple master-slave pattern, where the master initializes a job queue by publishing all jobs up front to a topic. The slaves would pull those jobs whenever they have free working capacity, one job at a time. The example code on GitHub pulls multiple messages for a specific amount of time:
subscriber.startAsync().awaitRunning();
Thread.sleep(params.y());
I don't want that; I just want to pull one job message from the queue, let the slave do the work, and only after the work is done call the pulling method again to fetch the next job message, one at a time. Since I'm executing the jobs in an ExecutorService, I want to ensure that I don't pull any messages while my thread pool is full. How would I realize pulling one message, submitting that job to my ExecutorService, and only pulling the next job message once a job has finished and a thread is free?
Pulling a single message at a time would be considered an anti-pattern for Google Cloud Pub/Sub. You can control the number of messages delivered to your worker by specifying FlowControlSettings via the Subscriber Builder. In particular, you could call setMaxOutstandingElementCount on the FlowControlSettings Builder to limit the maximum number of messages that have been delivered to the MessageReceiver you provided. If each of your workers is individually a subscriber and wants to perform a single action at a time, you could even set this number to 1.
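For illustration, here is a minimal sketch of a worker that processes one job at a time using flow control; the project and subscription names are placeholders, not from the original post:

import com.google.api.gax.batching.FlowControlSettings;
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

public class SingleJobWorker {
    public static void main(String[] args) {
        // placeholder project/subscription ids
        ProjectSubscriptionName subscription =
                ProjectSubscriptionName.of("my-project", "job-subscription");

        // the receiver is only handed a message while fewer than
        // maxOutstandingElementCount messages are outstanding
        MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
            doWork(message);   // run the job
            consumer.ack();    // ack only after the work is done, so the next job is delivered
        };

        FlowControlSettings flowControl = FlowControlSettings.newBuilder()
                .setMaxOutstandingElementCount(1L) // at most one undelivered/unacked message
                .build();

        Subscriber subscriber = Subscriber.newBuilder(subscription, receiver)
                .setFlowControlSettings(flowControl)
                .build();
        subscriber.startAsync().awaitRunning();
    }

    private static void doWork(PubsubMessage message) {
        // execute the job encoded in the message
    }
}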
If you need more exact control over the pull semantics for your subscriber, then you can use the gRPC library's pull method directly. The Service APIs Overview has more information on this approach.
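If you do go the route of calling pull directly, a hedged sketch using the gRPC-based SubscriberStub could look like this (the subscription name is again a placeholder):

import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
import com.google.pubsub.v1.AcknowledgeRequest;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PullRequest;
import com.google.pubsub.v1.PullResponse;
import com.google.pubsub.v1.ReceivedMessage;

public class SinglePull {
    public static void pullOneJob() throws Exception {
        String subscription = ProjectSubscriptionName.format("my-project", "job-subscription");
        try (SubscriberStub stub = GrpcSubscriberStub.create(SubscriberStubSettings.newBuilder().build())) {
            PullRequest pullRequest = PullRequest.newBuilder()
                    .setSubscription(subscription)
                    .setMaxMessages(1)            // exactly the "one job at a time" semantics
                    .build();
            PullResponse response = stub.pullCallable().call(pullRequest);
            for (ReceivedMessage received : response.getReceivedMessagesList()) {
                // process received.getMessage(), then acknowledge it
                AcknowledgeRequest ack = AcknowledgeRequest.newBuilder()
                        .setSubscription(subscription)
                        .addAckIds(received.getAckId())
                        .build();
                stub.acknowledgeCallable().call(ack);
            }
        }
    }
}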
Related
I have an application which should use JMS to queue several long running tasks asynchronously in response to a specific request. Some of these tasks might complete within seconds while others might take a longer time to complete. The original request should already complete after all the tasks have been started (i.e. the message to start the task has been queued) - i.e. I don't want to block the request while the tasks are being executed.
Now, however, I would like to execute another action per request once all of the messages have been processed successfully. For this, I would like to send another message to another queue - but only after all messages have been processed.
So what I am doing is a bit similar to a reply-response pattern, but not exactly: The responses of multiple messages (which were queued in the same transaction) should be aggregated and processed in a single transaction once they are all available. Also, I don't want to "block" the transaction enqueuing the messages by waiting for replies.
My first, naive approach would be the following:
When a requests comes in:
Queue n messages for each of the n actions to be performed. Give them all the same correlation id.
Store n (i.e. the number of messages sent) in a database along with the correlation id of the messages.
Complete the request successfully
Each of the workers would do the following:
Receive a message from the queue
Do the work that needs to be done to handle the message
Decrement the counter stored in the database based on the correlation id.
If the counter has reached zero: Send a "COMPLETED" message to the completed-queue
However, I am wondering if there is an alternative solution which doesn't require a database (or any other kind of external store) to keep track whether all messages have already been processed or not.
Does JMS provide some functionality which would help me with this?
Or do I really have to use the database in this case?
If your system is distributed, and I presume it is, it's very hard to solve this problem without some kind of global countdown latch like the one you have implemented. The main thing to notice is that the tasks have to signal in some global storage that they are finished. Your app is essentially creating a new countdown-latch instance (identified by the correlation ID) each time a new request comes in, by inserting a row in a database. Your tasks "signal" the end of their work by counting that latch down, and the task that counts it down to zero cleans up the row.
Now, the global storage doesn't have to be a database, but it still has to be some kind of globally accessible state, and you still have to keep counting. If the only thing you have is JMS, you have to create the latch and count it down by sending messages.
The simplest solution that comes to mind is to have each job send a TASK_ENDED message to a JOBS_FINISHED queue. A TASK_ENDED message means "task X, triggered by request Y with correlation ID Z, has ended", just like counting down in the database. The recipient of this queue is a special job whose only purpose is to trigger a COMPLETED message once all messages for a request with a given correlation ID have been received. This job simply reads messages sequentially and counts each unique correlation ID it encounters; once the count reaches the expected number, it clears that counter and sends the COMPLETED message.
You can encode the number of triggered tasks and any other specifics in the JMS headers of the messages created when processing the request. For example:
// pretend this request handling triggers 10 tasks
// here we are creating first of ten START TASK messages
TextMessage msg1 = session.createTextMessage("Start a first task");
msg1.setJMSCorrelationID(request.id);
msg1.setIntProperty("TASK_NUM", 1);
msg1.setIntProperty("TOTAL_TASK_COUNT", 10);
Then you just pass that info along in the TASK_ENDED messages all the way to the final job. You have to make sure that all messages sent to the ending job are received by the same instance of that job.
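As an illustration only (the queue and property names follow the example above, everything else is assumed), that ending job could be a single-instance listener on JOBS_FINISHED that counts per correlation ID:

import java.util.HashMap;
import java.util.Map;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class CompletionTracker implements MessageListener {
    private final Map<String, Integer> endedPerCorrelationId = new HashMap<>();
    private final Session session;
    private final MessageProducer completedQueueProducer; // producer bound to the COMPLETED queue

    public CompletionTracker(Session session, MessageProducer completedQueueProducer) {
        this.session = session;
        this.completedQueueProducer = completedQueueProducer;
    }

    @Override
    public void onMessage(Message taskEnded) { // one TASK_ENDED message per finished task
        try {
            String correlationId = taskEnded.getJMSCorrelationID();
            int total = taskEnded.getIntProperty("TOTAL_TASK_COUNT");
            int ended = endedPerCorrelationId.merge(correlationId, 1, Integer::sum);
            if (ended == total) {                      // all tasks of this request have finished
                endedPerCorrelationId.remove(correlationId);
                Message completed = session.createTextMessage("COMPLETED");
                completed.setJMSCorrelationID(correlationId);
                completedQueueProducer.send(completed);
            }
        } catch (JMSException e) {
            throw new RuntimeException(e); // or your own error handling
        }
    }
}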
You could go further from here by expanding the idea with publish-subscribe messaging, error handling, temporary queues, and so on, but that gets very specific to your needs, so I'll end here.
I have the following use case on a Spring-based Web application:
I need to apply the Competing Consumers EIP with the following twist: the messages in the queue are actually split tasks belonging to the same job. Therefore, I need to track when all tasks of a job have completed, and their completion status, in order to save the scenario as either COMPLETED or FAILED, log the outcome, and notify the users accordingly, e.g. by e-mail.
So, given the requirements I described above, my question is:
Can this be done with RabbitMQ and if yes how?
I created a quick gist to show a very crude example of how one could do it. In this example, there is one producer, two consumers, and two queues: the producer sends to the "SEND" queue, which the consumers consume from, and the consumers publish to the "RECV" queue, which the producer consumes from.
Now bear in mind this is a pretty crude example, as the producer in that case simply sends one job (a random number of tasks between 0 and 5) and blocks until the job is done. A way to circumvent this would be to store the job id and its number of tasks in a Map, and each time a result comes in, check the number of tasks reported as done for that job id, as sketched below.
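A tiny sketch of that bookkeeping (the names are made up, and it assumes a single producer process) could be:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class JobTracker {
    private final ConcurrentMap<String, AtomicInteger> remainingTasks = new ConcurrentHashMap<>();

    // called by the producer right after publishing the job's tasks to "SEND"
    public void jobSubmitted(String jobId, int taskCount) {
        remainingTasks.put(jobId, new AtomicInteger(taskCount));
    }

    // called for every result read from "RECV"; returns true when the whole job is done
    public boolean taskCompleted(String jobId) {
        AtomicInteger remaining = remainingTasks.get(jobId);
        if (remaining != null && remaining.decrementAndGet() == 0) {
            remainingTasks.remove(jobId);
            return true; // now mark the job COMPLETED/FAILED, log it and notify the users
        }
        return false;
    }
}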
What you are trying to do is beyond the scope of RabbitMQ. RabbitMQ is for sending and receiving messages, with the ability to queue them.
It can't track your job tasks for you.
You will need a "Job Storage" service. Whenever a consumer finishes a task, it updates the Job Storage service, marking the task as done. The Job Storage service knows how many tasks are in the job, and when the last task is done, it marks the job as succeeded. In this service you will also implement all your other business logic, such as when to treat a job as failed.
I have a piece of middleware that sits between two JMS queues. From one it reads, processes some data into the database, and writes to the other.
Here is a small diagram to depict the design:
With that in mind, I have some interesting logic that I would like to integrate into the service.
Scenario 1: Say the middleware service receives a message from Queue 1, and hits the database to store portions of that message. If all goes well, it constructs a new message with some data, and writes it to Queue 2.
Scenario 2: Say that the database complains about something when the service attempts to perform some logic after getting a message from Queue 1. In this case, instead of writing a message to Queue 2, I would retry the database operation with incremental timeouts, i.e. try again in 5 seconds, then 30 seconds, then 1 minute if it is still down. The catch, of course, is being able to read other messages independently of this retry, i.e. retry processing this one request while listening for other requests.
With that in mind, what is both the correct and most modern way to construct a future-proof solution?
After reading some posts on the net, it seems that I have several options.
One, I could spin off a new thread once a new message is received, so that I can both perform the "re-try" functionality and listen to new requests.
Two, I could possibly send the message back to the queue with a delay, i.e. if the database operation failed, write the message back to the JMS queue with some amount of delay added to it.
I am more fond of the first solution (a rough sketch of it follows below); however, I wanted to get the community's opinion on whether there is a newer/better way to achieve this functionality in Java 7. Is there something built into JMS to support this sort of "send the message back for reprocessing at a specific time"?
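To make the first option concrete, this is roughly what I have in mind, with placeholder methods (writeToDatabase, writeToQueue2) standing in for the real work:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RetryingProcessor {
    private static final long[] RETRY_DELAYS_SECONDS = {5, 30, 60}; // incremental back-off
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);

    // called from the JMS listener for each message read from Queue 1;
    // returns immediately so other messages can still be consumed
    public void process(String payload) {
        attempt(payload, 0);
    }

    private void attempt(String payload, int attemptNo) {
        scheduler.execute(() -> {
            try {
                writeToDatabase(payload); // may fail while the database is down
                writeToQueue2(payload);   // only written once the database work succeeded
            } catch (Exception e) {
                int delayIndex = Math.min(attemptNo, RETRY_DELAYS_SECONDS.length - 1);
                scheduler.schedule(() -> attempt(payload, attemptNo + 1),
                        RETRY_DELAYS_SECONDS[delayIndex], TimeUnit.SECONDS);
            }
        });
    }

    private void writeToDatabase(String payload) { /* store portions of the message */ }
    private void writeToQueue2(String payload)   { /* construct and send the new message */ }
}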
The JMS 2.0 specification describes the concept of delayed delivery of messages; see the "What's new" section of https://java.net/projects/jms-spec/pages/JMS20FinalRelease. Many JMS providers have implemented the delayed delivery feature.
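For example, with a JMS 2.0 provider the "requeue with delay" option could be sketched like this (the connection factory and queue are assumed to be injected or looked up):

import javax.jms.ConnectionFactory;
import javax.jms.JMSContext;
import javax.jms.Queue;

public class DelayedRequeue {
    // put the message back on the queue, to be delivered again after delayMillis
    public void requeueWithDelay(ConnectionFactory factory, Queue queue,
                                 String body, long delayMillis) {
        try (JMSContext context = factory.createContext()) {
            context.createProducer()
                   .setDeliveryDelay(delayMillis) // e.g. 5000, then 30000, then 60000
                   .send(queue, body);
        }
    }
}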
But I wonder how delayed delivery will help your scenario. Since the database writes have issues, processing subsequent messages and attempting to write to the database might end up in the same situation. I guess it might be better to sort out the issues with the database updates first and then pick up the messages from the queue.
I have a Java/Akka-based application where one Akka actor tells another Akka actor to do certain jobs, and it starts doing the job in a command prompt. But if I give it 10 jobs, it starts all 10 jobs at once in 10 command prompts.
If I have 100+ jobs, my system will hang.
So how can I make my application do one job at a time, with all the other jobs getting the CPU in a FIFO (first in, first out) manner?
The question is not quite clear, but I'll try to answer based on my understanding.
So, it looks like you use an actor as a job dispatcher which translates job messages into calls to some "job executor system". Each incoming message is translated into such a call.
If this call is synchronous (which of course smells when working with actors, but just for the sake of understanding), then there is no problem in your case: your actor waits until the call is complete, then proceeds with the next message in its mailbox.
If that call is asynchronous, which I guess is what you have, then all the messages will be handled one after another without waiting for the previous call to finish.
So you need to throttle the message handling in order to have at most one message being processed at a time. This can be achieved with the "pull" pattern, which is described here.
You basically allocate one master actor which holds a queue of incoming messages (jobs) and one worker actor which asks for a job whenever it is free. Be careful with the queue in the master actor - you probably don't want it to grow too much, so think about monitoring it and applying back-pressure, which is another big topic covered by akka-stream.
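A rough sketch of that pull pattern with classic Akka actors (all class and message names here are made up for illustration):

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import java.util.ArrayDeque;
import java.util.Deque;

// worker -> master: "I'm idle, give me work"
class GiveMeWork {}
// a unit of work; payload is whatever your command-prompt job needs
class Job {
    final String payload;
    Job(String payload) { this.payload = payload; }
}

class JobMaster extends AbstractActor {
    private final Deque<Job> pending = new ArrayDeque<>();      // watch its size, apply back-pressure
    private final Deque<ActorRef> idleWorkers = new ArrayDeque<>();

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(Job.class, job -> {
                    if (idleWorkers.isEmpty()) pending.add(job);
                    else idleWorkers.poll().tell(job, getSelf());
                })
                .match(GiveMeWork.class, req -> {
                    if (pending.isEmpty()) idleWorkers.add(getSender());
                    else getSender().tell(pending.poll(), getSelf());
                })
                .build();
    }
}

class Worker extends AbstractActor {
    private final ActorRef master;
    Worker(ActorRef master) { this.master = master; }

    @Override
    public void preStart() {
        master.tell(new GiveMeWork(), getSelf()); // ask for the first job
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(Job.class, job -> {
                    runJob(job);                                   // blocking: one job at a time
                    getSender().tell(new GiveMeWork(), getSelf()); // pull the next job only when done
                })
                .build();
    }

    private void runJob(Job job) { /* start the command-prompt job here */ }
}

Jobs arriving while the single worker is busy simply queue up in the master in FIFO order.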
Background
At a high level, I have a Java application in which certain events should trigger a certain action to be taken for the current user. However, the events may be very frequent, and the action is always the same. So when the first event happens, I would like to schedule the action for some point in the near future (e.g. 5 minutes). During that window of time, subsequent events should take no action, because the application sees that there's already an action scheduled. Once the scheduled action executes, we're back to Step 1 and the next event starts the cycle over again.
My thought is to implement this filtering and throttling mechanism by embedding an in-memory ActiveMQ instance within the application itself (I don't care about queue persistence).
I believe that JMS 2.0 supports this concept of delayed delivery, with delayed messages sitting in a "staging queue" until it's time for delivery to the real destination. However, I also believe that ActiveMQ does not yet support the JMS 2.0 spec... so I'm thinking about mimicking the same behavior with time-to-live (TTL) values and Dead Letter Queue (DLQ) handling.
Basically, my message producer code would put messages on a dummy staging queue from which no consumers ever pull anything. Messages would be placed with a 5-minute TTL value, and upon expiration ActiveMQ would dump them into a DLQ. That's the queue from which my message consumers would actually consume the messages.
Question
I don't think I want to actually consume from the "default" DLQ, because I have no idea what other internal things ActiveMQ might dump there that are completely unrelated to my application code. So I think it would be best for my dummy staging queue to have its own custom DLQ. I've only seen one page of ActiveMQ documentation which discusses DLQ config, and it only addresses XML config files for a standalone ActiveMQ installation (not an in-memory broker embedded within an app).
Is it possible to programmatically configure a custom DLQ at runtime for a queue in an embedded ActiveMQ instance?
I'd also be interested to hear alternative suggestions if you think I'm on the wrong track. I'm much more familiar with JMS than AMQP, so I don't know if this is much easier with Qpid or some other Java-embeddable AMQP broker. Whatever Apache Camel actually is (!), I believe it's supposed to excel at this sort of thing, but that learning curve might be gross overkill for this use case.
Although you're worried that Camel might be gross overkill for this use case, I think that ActiveMQ is already gross overkill for the use case you've described.
You're looking to schedule something to happen 5 minutes after an event happens, and for it to consume only the first event and ignore all the ones between the first one and when the 5 minutes are up, right? Why not just schedule your processing method for 5 minutes from now via ScheduledExecutorService or your favorite scheduling mechanism, and save the event in a HashMap<User, Event> member variable. If any more events come in for this user before the processing method fires, you'll just see that you already have an event stored and not store the new one, so you'll ignore all but the first. At the end of your processing method, delete the event for this user from your HashMap, and the next event to come in will be stored and scheduled.
Running ActiveMQ just to get this behavior seems like way more than you need. Or if not, can you explain why?
EDIT:
If you do go down this path, don't use the message TTL to expire your messages; just have the (one and only) consumer read them into memory and use the in-memory solution described above to only process (at most) one batch every 5 minutes. Either have a single queue with message selectors, or use dynamic queues, one per user. You don't need the DLQ to implement the delay, and even if you could get it to do that, it won't give you the functionality of batching everything so you only run once per 5 minutes. This isn't a path you want to go down, even if you figure out how.
A simple solution is to keep track of the pending actions in a concurrent structure and use a ScheduledExecutorService to execute them:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// marker value meaning "an action is already scheduled for this user"
private static final Object RUNNING = new Object();
private final ConcurrentMap<UserId, Object> pendingActions =
        new ConcurrentHashMap<>();
private final ScheduledExecutorService ses = Executors.newScheduledThreadPool(10);

public void takeAction(final UserId id) {
    Object running = pendingActions.putIfAbsent(id, RUNNING); // atomic check-and-set
    if (running == null) { // no pending action for this user yet
        ses.schedule(new Runnable() {
            @Override
            public void run() {
                doWork();
                pendingActions.remove(id); // allow the next event to schedule a new action
            }
        }, 5, TimeUnit.MINUTES);
    }
}
With Camel this could be easily achieved with the Aggregator component and its completionInterval parameter: every five minutes you check whether the list of aggregated messages is empty, and if it's not, you fire a message to the route responsible for your user action and empty the list. You don't need to maintain the whole list of exchanges, just the state (user action planned or not).
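A very rough sketch of such a route, assuming Camel 2.x and made-up endpoint names, might look like:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.aggregate.UseLatestAggregationStrategy;

public class ThrottledActionRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("seda:userEvents")                                     // events produced by the application
                .aggregate(header("userId"), new UseLatestAggregationStrategy())
                .completionInterval(5 * 60 * 1000)                  // release at most once per 5 minutes
                .to("seda:userAction");                             // route that performs the action
    }
}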