How is back-and-forth communication established in MPI? - java

For example, I have a root process which sends some computations to be completed by worker processes. But because I have a limited number (4) of processes, I have to share the workload among all of them, so I send to each worker multiple times. The workaround that I have found is this:
int me = MPI.COMM_WORLD.Rank();
int[] Buf = new int[BUF_SIZE]; // receive buffer; its size must match what the root sends
if (me == 0) {
    sendToWorkers(); // sends more than once to the workers
} else {
    while (true) { // wait indefinitely; accept data received from the root process and work on it
        MPI.COMM_WORLD.Recv(Buf, 0, Buf.length, MPI.INT, 0, 0);
        doTask(Buf);
    }
}
Now the problem is that I want to send the processed data back to the root process, but I can't just add another while(true) loop. I am sure there must be a much more elegant way to accomplish this.
EDIT 1: The reason I want to send results to the root process is that it is cleaner. Alternatively, I could just print the computed solutions from the worker processes, but the output gets mangled due to interleaving. Declaring the print method synchronized doesn't work, since synchronized only coordinates threads within one JVM and each MPI rank is a separate process.

One simple solution: at the end of task distribution, the master sends a "FINISH/STOP/END" message (any custom message indicating that the tasks are over) to all workers. A worker that receives the finish message exits its loop and sends its results back to the master. The master can then loop over the total number of tasks and wait for those results.
From your example, this is a typical master-worker use case. When you send a task to a worker using MPI_Send(), there is a corresponding MPI_Recv() in your worker process. After receiving a task, you perform doTask(Buf) and then go back to the loop. So in your case, to summarise, a rank receives a new task only after computing the previously received one, right? In that case, the master process can also wait for a reply from any finished task and send a new task based on that; you may want to consider that approach. If your doTask uses threads, this becomes more complicated: each worker node then has to keep track of its tasks, and after all tasks are completed, the master should start a loop and wait for the results.
Alternatively, you can use a multithreaded implementation, with separate threads for sending and receiving in the master.
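As an illustration, here is a minimal sketch of the finish-message ("poison pill") idea, assuming the MPJ Express-style API used in the question; the STOP sentinel, tags and example workload are made up for the sketch, and it assumes at least one worker rank:

import mpi.MPI;

public class MasterWorker {
    static final int STOP = -1; // hypothetical sentinel marking "no more tasks"
    static final int TAG_TASK = 0, TAG_RESULT = 1;

    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int me = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size(); // assumes size >= 2
        if (me == 0) {
            int[] tasks = {10, 20, 30, 40, 50, 60}; // example workload
            for (int i = 0; i < tasks.length; i++) { // round-robin tasks over the workers
                MPI.COMM_WORLD.Send(new int[]{tasks[i]}, 0, 1, MPI.INT, 1 + i % (size - 1), TAG_TASK);
            }
            for (int w = 1; w < size; w++) { // tell every worker the task stream is over
                MPI.COMM_WORLD.Send(new int[]{STOP}, 0, 1, MPI.INT, w, TAG_TASK);
            }
            int[] result = new int[1];
            for (int i = 0; i < tasks.length; i++) { // collect exactly one result per task
                MPI.COMM_WORLD.Recv(result, 0, 1, MPI.INT, MPI.ANY_SOURCE, TAG_RESULT);
                System.out.println("root got result " + result[0]);
            }
        } else {
            int[] buf = new int[1];
            while (true) {
                MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, TAG_TASK);
                if (buf[0] == STOP) break;      // finish message ends the loop
                int[] result = {buf[0] * 2};    // stand-in for doTask(buf)
                MPI.COMM_WORLD.Send(result, 0, 1, MPI.INT, 0, TAG_RESULT);
            }
        }
        MPI.Finalize();
    }
}

The worker's loop now has a well-defined exit, and the root knows exactly how many result messages to wait for, so neither side blocks forever.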

Related

Sending a message after several other messages have completed without utilizing an external store?

I have an application which should use JMS to queue several long-running tasks asynchronously in response to a specific request. Some of these tasks might complete within seconds while others might take longer. The original request should complete once all the tasks have been started (i.e. the messages to start the tasks have been queued); I don't want to block the request while the tasks are being executed.
Now, however, I would like to execute another action per request once all of the messages have been processed successfully. For this, I would like to send another message to another queue - but only after all messages have been processed.
So what I am doing is a bit similar to a request-response pattern, but not exactly: the responses to multiple messages (which were queued in the same transaction) should be aggregated and processed in a single transaction once they are all available. Also, I don't want to "block" the transaction enqueuing the messages by waiting for replies.
My first, naive approach would be the following:
When a request comes in:
Queue n messages, one for each of the n actions to be performed. Give them all the same correlation ID.
Store n (i.e. the number of messages sent) in a database along with the correlation ID of the messages.
Complete the request successfully
Each of the workers would do the following:
Receive a message from the queue
Do the work that needs to be done to handle the message
Decrement the counter stored in the database based on the correlation id.
If the counter has reached zero: Send a "COMPLETED" message to the completed-queue
However, I am wondering if there is an alternative solution which doesn't require a database (or any other kind of external store) to keep track whether all messages have already been processed or not.
Does JMS provide some functionality which would help me with this?
Or do I really have to use the database in this case?
If your system is distributed, and I presume it is, it's very hard to solve this problem without some kind of global latch like the one you have implemented. The main thing to notice is that the tasks have to signal within some "global storage" that they are over. Your app is essentially creating a new countdown-latch instance (identified by the correlation ID) each time a new request comes in, by inserting a row in a database. Your tasks "signal" the end of their jobs by counting that latch down. The job that ends up holding the latch has to clean up the row.
Now, the global storage doesn't have to be a database, but it still has to be some kind of globally accessible state, and you have to keep counting. And if the only thing you have is JMS, you have to create the latch and count it down by sending messages.
The simplest solution that comes to mind is to have each job send a TASK_ENDED message to a JOBS_FINISHED queue. A TASK_ENDED message stands for the signal "task X, triggered by request Y with correlation ID Z, has ended", just like counting down in the database. The recipient of this queue is a special task whose only job is to trigger the COMPLETED message once all messages for a request with a given correlation ID have been received. So this job just reads messages sequentially and counts each unique correlation ID it encounters. Once it has counted up to the expected number, it clears that counter and sends the COMPLETED message.
You can encode the number of triggered tasks and any other specifics in the JMS headers of the messages created when processing the request. For example:
// pretend this request handling triggers 10 tasks
// here we are creating the first of ten START_TASK messages
TextMessage msg1 = session.createTextMessage("Start the first task");
msg1.setJMSCorrelationID(request.id); // the correlation ID must be a String
msg1.setIntProperty("TASK_NUM", 1);
msg1.setIntProperty("TOTAL_TASK_COUNT", 10);
Then you just pass that info along in the TASK_ENDED messages all the way to the final job. You have to make sure that all messages sent to the ending job are routed to the same instance of that job.
You could go further by expanding the idea with publish-subscribe messaging, error handling, temporary queues and so on, but that becomes very specific to your needs, so I'll end here.
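For completeness, here is a minimal sketch of that counting job using the standard JMS API; the queue name, property key and in-memory map are illustrative, and a real deployment might persist the counts. It assumes a single instance registered as the listener on the JOBS_FINISHED queue:

import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Session;
import java.util.HashMap;
import java.util.Map;

// Counts TASK_ENDED messages per correlation ID and emits COMPLETED once
// TOTAL_TASK_COUNT messages have arrived for that ID. Register it with
// consumer.setMessageListener(...) on the JOBS_FINISHED queue.
public class CompletionAggregator implements MessageListener {
    private final Session session;
    private final MessageProducer completedProducer;
    private final Map<String, Integer> seen = new HashMap<>();

    public CompletionAggregator(Session session) throws JMSException {
        this.session = session;
        this.completedProducer = session.createProducer(session.createQueue("COMPLETED"));
    }

    @Override
    public void onMessage(Message msg) {
        try {
            String id = msg.getJMSCorrelationID();
            int total = msg.getIntProperty("TOTAL_TASK_COUNT");
            int count = seen.merge(id, 1, Integer::sum);
            if (count == total) {              // all tasks for this request are done
                seen.remove(id);               // clear the counter
                Message done = session.createTextMessage("COMPLETED");
                done.setJMSCorrelationID(id);
                completedProducer.send(done);
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}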

RabbitMQ how to split jobs to tasks and handle results

I have the following use case on a Spring-based Web application:
I need to apply the Competing Consumers EIP with the following twist: the messages in the queue are actually split tasks belonging to the same job. Therefore, I need to track when all tasks of a job have completed, and their completion status, in order to save the scenario as either COMPLETED or FAILED, log the outcome, and notify the users accordingly, e.g. by e-mail.
So, given the requirements I described above, my question is:
Can this be done with RabbitMQ and if yes how?
I created a quick gist to show a very crude example of how one could do it. In this example there is one producer and two consumers, and two queues: one ("SEND") that the producer publishes to and the consumers consume from, and vice versa, one ("RECV") that the consumers publish to and the producer consumes from.
Now bear in mind this is a pretty crude example, as the producer in that case simply sends one job (a random number of tasks between 0 and 5) and blocks until the job is done. A way around this would be to store a job ID and its task count in a Map, and each time a task finishes, check the number of completed tasks reported for that job ID.
What you are trying to do is beyond the scope of RabbitMQ. RabbitMQ is for sending and receiving messages, with the ability to queue them.
It can't track your job tasks for you.
You will need a "Job Storage" service. Whenever a consumer finishes a task, it updates the Job Storage service, marking the task as done. The Job Storage service knows how many tasks are in the job, and when the last task is done it marks the job as succeeded. In this service you will also implement all your other business logic, such as when to treat a job as failed.
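As an illustration only, here is a minimal in-memory sketch of such a Job Storage service; a real one would be backed by a database so it survives restarts, and the class and method names are hypothetical:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JobStorage {
    private final Map<String, Integer> remainingTasks = new ConcurrentHashMap<>();

    public void registerJob(String jobId, int taskCount) {
        remainingTasks.put(jobId, taskCount);
    }

    /** Called by a consumer when it finishes one task; returns true when the job completes. */
    public boolean taskDone(String jobId) {
        // Atomically decrement; the entry is removed when the count hits zero.
        // Assumes registerJob was called for this jobId first.
        Integer left = remainingTasks.computeIfPresent(jobId, (id, n) -> n > 1 ? n - 1 : null);
        return left == null; // null means the last task just finished
    }
}

When taskDone returns true, the caller can mark the job COMPLETED (or FAILED, with a similar counter for failures) and send the notifications.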

How to restrict the akka actor to do one job at a time

I have a Java/Akka-based application where one Akka actor tells another Akka actor to do certain jobs, and it starts doing the job in the command prompt. But if I give it 10 jobs, it starts all of them at once in 10 command prompts.
If I have 100+ jobs, my system will hang.
So how can I make my application do one job at a time, with all the other jobs getting the CPU in FIFO (first in, first out) order?
The question is not quite clear, but I'll try to answer based on my understanding.
So, it looks like you use an actor as a job dispatcher which translates job messages into calls to some "job executor system". Each incoming message is translated into a call.
If this call is synchronous (which smells when working with actors, of course, but just for understanding), then there is no problem in your case: your actor waits until the call is complete, then proceeds with the next message in its mailbox.
If that call is asynchronous, which I guess is what you have, then the messages will all be handled one after another without waiting for each other's work to finish.
So you need to throttle message handling so that at most one message is being processed at a time. This can be achieved with the "pull" pattern described here.
You basically allocate one master actor which holds a queue of incoming messages (jobs) and one worker actor which asks for a job whenever it is idle. Be careful with the queue in the master actor: you probably don't want it to grow too much, so think about monitoring it and applying back-pressure, which is another big topic covered by akka-stream.
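Here is a minimal sketch of that pull setup with classic Akka actors (Java API); the message classes and names are illustrative, not from the question:

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import java.util.ArrayDeque;
import java.util.Queue;

public class PullPattern {
    static final class Job { final String payload; Job(String p) { payload = p; } }
    static final class JobDone {}

    static class Master extends AbstractActor {
        private final Queue<Job> pending = new ArrayDeque<>(); // monitor its size for back-pressure
        private boolean workerIdle = true;
        private ActorRef worker;

        @Override
        public void preStart() {
            worker = getContext().actorOf(Props.create(Worker.class));
        }

        @Override
        public Receive createReceive() {
            return receiveBuilder()
                .match(Job.class, job -> { pending.add(job); dispatch(); })
                .match(JobDone.class, done -> { workerIdle = true; dispatch(); })
                .build();
        }

        private void dispatch() {
            // Hand out at most one job at a time, in FIFO order.
            if (workerIdle && !pending.isEmpty()) {
                worker.tell(pending.poll(), getSelf());
                workerIdle = false;
            }
        }
    }

    static class Worker extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                .match(Job.class, job -> {
                    // ... run the job synchronously here ...
                    getSender().tell(new JobDone(), getSelf()); // pull the next one
                })
                .build();
        }
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("jobs");
        ActorRef master = system.actorOf(Props.create(Master.class));
        for (int i = 0; i < 10; i++) master.tell(new Job("job-" + i), ActorRef.noSender());
    }
}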

how to serialize multi-threaded program

I have many threads performing different operations on objects, and when nearly 50% of the work is finished I want to serialize everything (perhaps because I want to shut down my machine).
When I come back, I want to start from the point where I left off.
How can this be achieved?
This is like saving the state of the objects of a game while playing.
Normally we save the state of an object and retrieve it later. But here we are also storing each process's progress.
For example:
I have a thread which is creating a salary Excel sheet for 50 thousand employees.
Another thread is creating appraisal letters for the same 50 thousand employees.
Another thread is writing a "Happy New Year" e-mail to 50 thousand employees.
So imagine multiple operations.
Now I want to shut down when about 50% of the tasks have finished, say when salary Excel sheets have been written for 25-30 thousand employees, appraisal letters are done for 25-30 thousand, and so on.
When I come back the next day, I want the process to start from where it left off.
This is like resuming.
I'm not sure if this might help, but you can achieve this if the threads communicate via in-memory queues.
To serialize the whole application, what you need to do is disable consumption of the queues; when all the threads are idle you'll reach a "safe point" where you can serialize the whole state. You'll need to keep track of all the threads you spawn to know whether they are idle.
You might be able to do this with another technology (maybe a Java agent?) that freezes the JVM and allows you to dump the whole state, but I don't know if such a thing exists.
Well, it's not much different from saving the state of an object.
Just maintain separate queues for the different kinds of input, and on every launch (first launch or relaunch) check those queues; if they are not empty, resume your stopped process by starting a new process with the remaining data.
Say, for example, an app is sending messages and you quit it with 10 messages remaining. Have a global queue which the app's senderMethod checks on every launch. In this case it will find 10 messages in the pending queue, so it will continue sending the remaining ones.
Edit:
Basically, for all resumable processes, say pr1, pr2, ..., prN, maintain queues of inputs, say q1, q2, ..., qN. Each queue should remove processed elements so that it contains only pending inputs. As soon as you suspend the system, store these queues, and on relaunch restore them. Have a common routine, say resumeOperation, which calls all resumable processes (pr1, pr2, ..., prN). It will trigger the execution of the methods with non-empty queues, which in turn replicates resuming behaviour.
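A minimal sketch of that idea using plain Java serialization; the file name and element type are illustrative, and rewriting the whole file per item is only acceptable for a sketch:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayDeque;
import java.util.Queue;

public class ResumableQueue {
    private static final File STORE = new File("pending-queue.ser");

    @SuppressWarnings("unchecked")
    static Queue<String> load() throws IOException, ClassNotFoundException {
        if (!STORE.exists()) return new ArrayDeque<>();
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(STORE))) {
            return (Queue<String>) in.readObject();
        }
    }

    static void save(Queue<String> pending) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(STORE))) {
            out.writeObject(pending);
        }
    }

    public static void main(String[] args) throws Exception {
        Queue<String> pending = load();            // relaunch: only unprocessed items remain
        if (pending.isEmpty()) {
            for (int i = 0; i < 10; i++) pending.add("msg-" + i); // first launch: enqueue work
        }
        String item;
        while ((item = pending.poll()) != null) {
            System.out.println("processing " + item);
            save(pending);                         // persist after each item so a shutdown resumes here
        }
    }
}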
Java provides the java.io.Serializable interface to indicate serialization support in classes.
You don't provide much information about the task, so it's difficult to give an answer.
One way to think about a task is as a general algorithm which can be split into several steps. Each of these steps is in turn a task itself, so you should see a pattern here.
By cutting each algorithm into smaller pieces until you cannot divide any further, you get a pretty good idea of where your task can be interrupted and recovered later.
The result of a task can be:
a success: the task returns a value of the expected type
a failure: somehow, something didn't turn right while doing computation
an interrupted computation: the work wasn't finished, but it may be resumed later, and the return value is the state of the task
(Note that the latter case could be considered a subcase of a failure; it's up to you to organize your protocol as you see fit.)
Depending on how you generate the interruption event (will it be a message passed from the main thread to the worker threads? Will it be an exception?), that event will have to bubble through the task tree and trigger each task to evaluate whether its work can be resumed or not, and then provide a serialized version of itself to the larger task containing it.
I don't think serialization is the correct approach to this problem. What you want is persistent queues, from which you remove an item once you've processed it. Every time you start the program you just start processing the queue from the beginning. There are numerous ways of implementing a persistent queue, but a database comes to mind given the scale of your operations.

Java Async Processing

I am currently developing a system that uses a lot of async processing. The transfer of information is done using queues: one process puts info in the queue (and terminates) and another picks it up and processes it. My implementation leaves me facing a number of challenges, and I am interested in everyone's approach to these problems (in terms of architecture as well as libraries).
Let me paint the picture. Lets say you have three processes:
Process A -----> Process B
                     |
Process C <----------|
So Process A puts a message in a queue and ends, Process B picks up the message, processes it and puts it in a "return" queue. Process C picks up the message and processes it.
How does one handle Process B not listening to or processing messages off the queue? Is there some JMS-type method that prevents a producer from submitting a message when the consumer is not active, so that Process A would submit but throw an exception?
Let's say Process C has to get a reply within X minutes, but Process B has stopped (for any reason). Is there some mechanism that enforces a timeout on a queue, so a reply is guaranteed within X minutes, which would kick off Process C?
Can all of these concerns be handled with a dead-letter queue of some sort, or should I be doing this all manually with timers and checks? I have mentioned JMS, but I am open to anything; in fact I am using Hazelcast for the queues.
Please note this is more of an architectural question in terms of available Java technologies and methods, and I do feel this is a proper question.
Any suggestions will be greatly appreciated.
Thanks
IMHO, the simplest solution is to use an ExecutorService, or a solution based on one. It supports a queue of work and scheduled tasks (for the timeouts).
It can also work in a single process. (I believe Hazelcast supports a distributed ExecutorService.)
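For instance, a minimal sketch of the timeout side with a plain ExecutorService; the names and delays are illustrative:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ExecutorQueueDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newSingleThreadExecutor(); // plays "Process B"
        Future<String> reply = workers.submit(() -> {
            Thread.sleep(500);            // stand-in for the real processing
            return "processed message";
        });
        try {
            // "Process C" waits at most X for a reply.
            String result = reply.get(2, TimeUnit.SECONDS);
            System.out.println(result);
        } catch (TimeoutException e) {
            reply.cancel(true);           // kick off whatever timeout handling you need
            System.out.println("no reply within the deadline");
        } finally {
            workers.shutdown();
        }
    }
}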
It seems to me that the type of questions you're asking are "smells" suggesting that queues and async processing may not be the best tools for your situation.
1) That defeats the purpose of a queue. It sounds like you need a synchronous request-response process.
2) Generally speaking, Process C does not get a reply; it gets a message from a queue. If there is a message in the queue and Process C is ready, it will get it. Process C could, for example, decide that the message is stale once it receives it.
I think your first question has already been answered adequately by the other posters.
On your second question, what you are trying to do may be possible depending on the messaging engine used by your application. I know this works with IBM MQ. I have seen it done using the WebSphere MQ Classes for Java, but not JMS. The way it works is that when Process A puts a message on a queue, it specifies the time it will wait for a response message. If Process A fails to receive a response within the specified time, the system throws an appropriate exception.
I do not think there is a standard way in JMS to handle request/response timeouts the way you want, so you may have to use platform-specific classes like the WebSphere MQ Classes for Java.
Well, kind of the point of queues is to keep things pretty isolated.
If you're not stuck on any particular tech, you could use a database for your queues.
But first, a simple mechanism to ensure two processes are coordinated is to use a socket. If practical, simply have Process B create a socket listener on some well-known port; Process A connects to that socket and monitors it. If Process B ever goes away, Process A can tell, because its socket gets shut down, and it can use that as an alert of problems with Process B.
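A tiny sketch of that liveness idea; the port is arbitrary and error handling is omitted:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class LivenessProbe {
    // Run inside Process B: hold a listener open as a heartbeat.
    static void listen() throws IOException {
        ServerSocket server = new ServerSocket(9099);
        Socket peer = server.accept(); // keep the connection open while B does its real work
        peer.getInputStream().read();  // blocks; returning -1 means A went away
    }

    // Run inside Process A: a closed stream means B died.
    static void watch() throws IOException {
        Socket socket = new Socket("localhost", 9099);
        if (socket.getInputStream().read() == -1) { // blocks until B closes or dies
            System.out.println("process B went away - raise an alert");
        }
    }
}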
For the B -> C problem, have a db table:
create table queue (
    id integer,
    payload varchar(100),  -- or whatever you can use to indicate a payload
    status varchar(1),
    updated timestamp
)
Then Process A puts its entry on the queue with the current time and a status of 'B'. B listens on the queue:
select * from queue where status = 'B' order by updated
When B is done, it updates the queue to set the status to "C".
Meanwhile, "C" is polling the DB with:
select * from queue
where status = 'C'
   or (status = 'B' and updated < (now - threshold))
order by updated
(with the threshold being however long you want things to rot on the queue).
Finally, C updates the queue row to 'D' for done, or deletes it, or whatever you like.
The dark side is that there is a bit of a race condition here, where C might try to grab an entry while B is just starting on it. You can probably get through that with a strict isolation level and some locking. Something as simple as:
select * from queue
where status = 'C'
   or (status = 'B' and updated < (now - threshold))
order by updated
FOR UPDATE
Also use FOR UPDATE for B's select. That way, whoever wins the select race gets an exclusive lock on the row.
This will get you pretty far down the road in terms of actual functionality.
You are expecting the semantics of synchronous processing from an async (messaging) setup, which is not possible. I have worked on WebSphere MQ, and normally when the consumer dies, the messages are kept in the queue forever (unless you set an expiry). Once the queue reaches its maximum depth, subsequent messages are moved to the dead-letter queue.
I've used a similar approach to create a queuing and processing system for video transcoding jobs. Basically the way it worked was:
Process A posts a "schedule" message to Arbiter Q, which adds the job into its "waiting" queue.
Process B requests the next job from Arbiter Q, which removes the next item in its "waiting" queue (subject to some custom scheduling logic to ensure that a single user couldn't flood transcode requests and prevent other users from being able to transcode videos) and inserts it into its "processing" set before returning the job back to Process B. The job is timestamped when it goes into the "processing" set.
Process B completes the job and posts a "complete" message to Arbiter Q, which removes the job from the "processing" set and then modifies some state so that Process C knows the job completed.
Arbiter Q periodically inspects the jobs in its "processing" set and times out any that have been running for an unusually long time. Process A is then free to attempt to queue up the same job again, if it wants. A rough sketch of this bookkeeping follows.
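This is a compact sketch of the arbiter's bookkeeping described above, with a "waiting" queue, a timestamped "processing" set, and a periodic timeout sweep; all names and the timeout value are illustrative:

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Queue;

public class Arbiter {
    private static final long TIMEOUT_MS = 10 * 60 * 1000;
    private final Queue<String> waiting = new ArrayDeque<>();
    private final Map<String, Long> processing = new HashMap<>(); // job -> start time

    /** Process A schedules a job. */
    public synchronized void schedule(String job) { waiting.add(job); }

    /** A worker (Process B) pulls the next job; it moves to the "processing" set. */
    public synchronized String nextJob() {
        String job = waiting.poll();
        if (job != null) processing.put(job, System.currentTimeMillis());
        return job;
    }

    /** The worker reports completion; the job leaves the "processing" set. */
    public synchronized void complete(String job) { processing.remove(job); }

    /** Periodically drop jobs that have been running unusually long, so they can be rescheduled. */
    public synchronized void sweepTimeouts() {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<String, Long>> it = processing.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() > TIMEOUT_MS) it.remove();
        }
    }
}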
This was implemented using JMX (JMS would have been much more appropriate, but I digress). Process A was simply the servlet thread which responded to a user-initiated transcode request. Arbiter Q was an MBean singleton (persisted/replicated across all the nodes in a cluster of servers) that received "schedule" and "complete" messages. Its internally managed "queues" were simply List instances, and when a job completed it modified a value in the application's database to refer to the URL of the transcoded video file. Process B was the transcoding thread. Its job was simply to request a job, transcode it, and then report back when it finished. Over and over again until the end of time. Process C was another user/servlet thread. It would see that the URL was available, and present the download link to the user.
In such a case, if Process B were to die, the jobs would sit in the "waiting" queue forever. In practice, however, that never happened. If your Process B is not running or not doing what it is supposed to do, I think that suggests a problem in your deployment, configuration or implementation of Process B more than a problem in your overall approach.
