GPars - How do I know if actor is busy? - java

I'm trying to use GPars in Java to handle messages of a few types.
There is one actor for each message type.
But message processing takes a long time, and messages keep arriving in the meantime. I need to ignore incoming messages (just throw them away) while the actor is busy.
How do I know if a GPars actor is busy? I know about the Actor.isActive() method, but I'm not sure it is the right thing (the JavaDoc is ambiguous and unclear), and I couldn't find any useful info either.

There's no built-in way to determine this, I'm afraid. You'd have to implement a busy-controlling algorithm yourself, perhaps based on a CountDownLatch.

In general, this can be problematic: an actor can only process one message at a time, so whatever solution you choose will have limitations. For instance, you can have an actor within an actor: a supervisor and a worker.
The worker actor is the one that actually does the work. Work is sent to the supervisor, which keeps a boolean variable such as isBusy, initially false. When work is received, the supervisor sets the variable to true and passes the work on to the worker. When the work is finished, the worker sends the result back to the supervisor, and the supervisor sets isBusy to false and returns the result.
If another message comes in whilst isBusy is true, the supervisor can just send a message back, such as an isBusy message, or do nothing, which is what you say you want.
Note that if the worker crashes or restarts, isBusy will still be true. You will need to think about whether this solution meets your needs. There may be a mailbox implementation that would be better suited for this; I don't know.
Whatever you do, try your best to avoid creating situations where you could leave your actor system in a bad state. Best of luck.
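The isBusy idea above can be sketched without any actor library. The following is only an illustration in plain java.util.concurrent, not GPars API: the class name DropIfBusyWorker and its methods are hypothetical, and a single-threaded executor stands in for the worker actor. The flag is reset in a finally block, which addresses the "worker crashed and isBusy stays true" concern for handler exceptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical helper, not part of GPars: hands work to a single worker thread
// and discards anything offered while a previous message is still in progress.
final class DropIfBusyWorker<T> implements AutoCloseable {
    private final AtomicBoolean busy = new AtomicBoolean(false);
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<T> handler;

    DropIfBusyWorker(Consumer<T> handler) {
        this.handler = handler;
    }

    /** Returns true if the message was accepted, false if it was dropped. */
    boolean offer(T message) {
        // compareAndSet makes "check the flag, then set it" one atomic step.
        if (!busy.compareAndSet(false, true)) {
            return false;
        }
        try {
            worker.execute(() -> {
                try {
                    handler.accept(message);
                } finally {
                    busy.set(false); // reset even if the handler throws
                }
            });
        } catch (RuntimeException e) {
            busy.set(false); // e.g. the task was rejected after close()
            throw e;
        }
        return true;
    }

    boolean isBusy() {
        return busy.get();
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

With one such object per message type, the sender calls offer(message) and simply ignores a false return value. Inside a real actor, the same flag could live in the supervisor's state instead of an AtomicBoolean, since an actor handles one message at a time.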

Related

simplest application possible that needs multiple (two) JVMs

I have an actor system "Main" running potentially forever. This main actor understands "Snapshot" or "Stop" messages (defined by me).
I would like to create a bash script that, while Main actor is running, launches a second (short lived) system, or actor, or whatever and sends a snapshot or stop message to the Main actor.
With akka classic that was very easy with actorSelection from a secondary actor
ActorRef mainActorRef = Await.result(system.actorSelection("akka.main.actor.path").resolveOne(timeout));
mainActorRef.tell(new StopMessage(), ActorRef.noSender()); // or new SnapshotMessage()
What is the analogous and hopefully equally easy solution in akka typed?
Ok, let's try to sort this mess a bit... First of all, your question is highly unclear:
In the title, you ask for something based on two JVMs, but in the text you ask for a "second (short lived) system, or actor, or whatever". No clue if multiple JVMs are a requirement or just an idea to solve this. Additionally, your example code is something that - disregarding clustering - works in one JVM and you also only mention a second "actor" there.
So, if the requirement is using two JVMs, then I would suggest making it more clear in what way, why, etc. Then people can also actually provide help for that part.
For now, let me assume you want to simply have...
A (typed) actor system
...that can somehow process StopMessage/SnapshotMessage...
...both of which can be triggered from the outside
The way you can do this very simply is the usual typed way:
Define a RootGuardian actor that accepts those two messages (that actor is basically what the implicit /user actor was in classic). You have to do that for your typed actor system anyway, because to set up a typed actor system you supply the behavior of the RootGuardian.
Let it create the needed child actors to process those messages (either at start or when needed). Of course, in your simple example, the root guardian can also process these messages itself, but an actor system with only one actor is not a very typical use case.
Let it delegate the messages to the appropriate child actor(s).
Add a simple API endpoint that calls system.tell(...) to send the message into the system, where your RootGuardian actor will delegate it correctly.
Use curl to call your API endpoint (or use any other way to communicate with your application; there are dozens, but most of them are outside the scope of Akka itself).
As a general idea, Akka Typed tends to be much more strict about who can send what messages where. In Akka classic, it was easy to basically send everything everywhere and find and access any actor from everywhere, including outside the system. Unfortunately, this "freedom" leads to a huge amount of problems and was thus severely limited in Typed, which makes for clearer contracts and better defined message flows.
Of course, in a highly complex system, you might, for example, want to use a Receptionist instead to find the target actor for your specific message, but since the question was for a simple application, I would skip that for now.
You can, of course, also add ways to get your ActorRefs outside the system, for example by using the Ask Pattern to implement something like an actor discovery in the RootGuardian, but there is simply no need to try to circumvent the concepts of Akka Typed by re-implementing ActorSelection.
Obviously you also could use clustering, start up a 2nd JVM, connect it to the cluster, send the message and shut it down again, but we can assume that this would be overkill and very, very slow (waiting for long seconds while starting up the app, connecting to the cluster, etc. just to then use a few milliseconds to send the message).
If you absolutely want a 2nd JVM there, you can, of course, simply create a REST client that sends the message and start that, but... curl exists, so... what for?
So, tl;dr: The "analogous and hopefully equally easy solution" is system.tell( new StopMessage() );, which is basically the same in akka typed as the code for akka classic you provided. Obviously, implementing the actor system in a way that this code works, is the (slightly) more tricky part.

Akka, how can i preserve the sender reference when working with simple Java object and Futures?

I have a Java class which creates a supervisor Actor which then creates a child actor for each request it needs to generate.
Java class uses a Future to get response back from supervisor Actor
Supervisor uses tell and sender reference to get response back from child Actor
However, my supervisor Actor is a singleton-scoped bean, so I need to find a way to store the sender reference each time my Java class makes a request to the supervisor. What is the best way to do this?
I don't think singleton scope is a good idea, as Akka needs to be able to restart actors when exceptions occur. That might lead to some weird problems. We used prototype scope ourselves.
Other than that, one possibility is to simply decouple receiving the request from sending the response by passing the sender through your actors: you use forward instead of tell, and the last actor in your pipeline responds to the original sender. This way the supervisor does not need to care about the response. Obviously this is ideal if your supervisor does nothing but send the response to the sender.
If there is some processing to be done before sending the response, you can create a temporary actor and pass it the sender reference, and let this actor collect the results, send the response to your future and stop itself. This is especially useful if you need to wait for more than one response and aggregate it.
You can also add the sender reference to the message you send from the supervisor to your actors and back to the supervisor. Simple, yet effective.

After creating SQS message, how can I monitor when it gets deleted?

My program creates a message in a SQS queue and then needs to wait for one of the workers pulling work on the queue to process it. I want to monitor the status of a message to determine when it gets deleted, since that would be my indicator that the work is done. But I can't figure out a way to do this with the SQS API.
SendMessageRequest msgRequest = new SendMessageRequest(SQS_QUEUE_URL, messageBody);
SendMessageResult result = sqsClient.sendMessage(msgRequest);
String msgId = result.getMessageId();
// so, in theory, this is what I WANT to do...
while (!sqsClient.wasThisMessageDeletedYet(msgId))
    Thread.sleep(1000L);
// continue, confident that because the message was deleted, I can rely upon the fact that the result of the Worker is now stashed where it's supposed to be
What's the right way to do "wasThisMessageDeletedYet(id)"?
I'm afraid such an API endpoint doesn't exist; looking at the API reference (http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/Welcome.html), you can see that there are no methods for querying messages.
Maybe you could try with "change message visibility", but that:
has side effects
you would need to know the receipt handle which you obtain when receiving the message
So I suppose your best bet is to store that state in some external database (if you want to stay in Amazon-land, maybe Dynamo?), with a simple message id -> boolean mapping indicating whether each message has been processed.
Another (similar) option is for the consumer to publish the status in a response queue. The wait will have to be done asynchronously (a Future perhaps).
Obviously there is some overhead in processing, as well as complexity in programming, due to the asynchronous nature of the interactions. But typically it is done this way.
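A rough sketch of the "message id -> status" idea combined with a Future, under loud assumptions: CompletionTracker is a hypothetical in-memory stand-in, and in a real setup the worker's completion signal would arrive through the external store or response queue described above rather than a shared map. No SQS API calls are shown.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the external status store or response queue.
final class CompletionTracker {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Producer side: call with the message ID returned when the message was sent. */
    CompletableFuture<String> register(String messageId) {
        return pending.computeIfAbsent(messageId, id -> new CompletableFuture<>());
    }

    /** Worker side: call once the work for this message is finished. */
    void markDone(String messageId, String result) {
        // computeIfAbsent also covers a worker that finishes before the producer registers.
        pending.computeIfAbsent(messageId, id -> new CompletableFuture<>())
               .complete(result);
    }
}
```

The producer would call register(msgId) right after sending and attach a callback (for example with thenAccept) instead of polling in a sleep loop. Entries are never removed in this sketch, so a real version would need some form of cleanup.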

Right design in akka. - Message delivery

I have gone through some posts on how and why akka does not guarantee message delivery. The documentation, this discussion and the other discussions on group do explain it well.
I am pretty new to akka and wish to know the appropriate design for a case. For example say I have 3 different actors all on different machines. One is responsible for cookbooks, the other for history and the last for technology books.
I have a main actor on another machine. Suppose there is a query to the main-actor to search if we have some book available. The main actor sends requests to the 3 remote actors, and expects the result. So I do this:
val scatter = system.actorOf(
Props[SearchActor].withRouter(ScatterGatherFirstCompletedRouter(
routees=someRoutees, within = 10 seconds)), "router")
implicit val timeout = Timeout(10 seconds)
val futureResult = scatter ? Text("Concurrency in Practice")
// What should I do here?.
//val result = Await.result(futureResult, timeout.duration) line(a)
In short, I have sent requests to all 3 remote actors and expect the result in 10 seconds.
What should be the action?
Say I do not get the result in 10 seconds, should I send a new request to all of them again?
What if the within time above is premature? I do not know beforehand how much time it might take.
What if the within time was sufficient but the message got dropped?
Suppose I don't get a response within that time and resend the request. With something like this, it remains asynchronous:
futureResult onComplete{
case Success(i) => println("Result "+i)
case Failure(e) => //send again
}
But under too many queries, won't that mean too many threads on the call, making it bulky? If I uncomment line(a), it becomes synchronous and might perform badly under load.
Say I don't get a response in 10 seconds. If the within time was premature, then a heavy, useless computation happens again. If the message got dropped, then 10 seconds of valuable time were wasted. If I knew that the message had been delivered, I would probably wait longer without being skeptical.
How do people solve such issues? ACK? But then I have to store the state of all queries in the actor. It must be a common thing, and I am looking for the right design.
I'm going to try and answer some of these questions for you. I'm not going to have concrete answers for everything, but hopefully I can guide you in the right direction.
For starters, you will need to change how you are communicating the request to the 3 actors that do book searches. Using a ScatterGatherFirstCompletedRouter is probably not the correct approach here. This router will only wait for an answer from one of the routees (the first one to respond), so your set of results will be incomplete, as it will not contain results from the other 2 routees. There is also a BroadcastRouter, but that will not fit your needs either, as it only handles tell (!) and not ask (?). To do what you want, one option is to send the request to each recipient, get Futures for the responses, and then combine them into an aggregate Future using Future.sequence. A simplified example could look like this:
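import akka.actor.{Actor, ActorRef, Status}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._
import scala.util.{Failure, Success}

case class SearchBooks(title: String)
case class Book(id: Long, title: String)

class BookSearcher extends Actor {
  def receive = {
    case req: SearchBooks =>
      val routees: List[ActorRef] = ... // Look up routees here
      implicit val timeout = Timeout(10 seconds)
      implicit val ec = context.system.dispatcher
      // Ask every routee and combine the individual futures into one
      val futures = routees.map(routee => (routee ? req).mapTo[List[Book]])
      val fut = Future.sequence(futures)
      val caller = sender // Important: capture sender here, do not close over it in the callback
      fut onComplete {
        case Success(books) => caller ! books.flatten
        case Failure(ex) => caller ! Status.Failure(ex)
      }
  }
}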
Now that's not going to be our final code, but it's an approximation of what your sample was attempting to do. In this example, if any one of the downstream routees fails/times out, we will hit our Failure block, and the caller will also get a failure. If they all succeed, the caller will get the aggregate List of Book objects instead.
Now onto your questions. First, you ask if you should send a request to all of the actors again if you do not get an answer from one of the routees within the timeout. The answer to this question is really up to you. Would you allow your user on the other end to see a partial result (i.e. the results from 2 of the 3 actors), or does it always have to be the full set of results every time? If partial results are acceptable, you could tweak the code that is sending to the routees to look like this:
val futures = routees.map(routee => (routee ? req).mapTo[List[Book]].recover {
  case ex =>
    // probably log something here
    List()
})
With this code, if any of the routees times out or fails for any reason, an empty list of Book will be substituted for the response instead of the failure. Now, if you can't live with partial results, then you could resend the entire request again, but you have to remember that there is probably someone on the other end waiting for their book results, and they don't want to wait forever.
For your second question, you ask what to do if your timeout is premature. The timeout value you select is going to be completely up to you, but it most likely should be based on two factors. The first factor will come from testing the call times of the searches. Find out how long they take on average and select a value based on that, with a little cushion just to be safe. The second factor is how long someone on the other end is willing to wait for their results. You could just be very conservative in your timeout, making it something like 60 seconds just to be safe, but if there is indeed someone on the other end waiting for results, how long are they willing to wait? I'd rather get a failure response indicating that I should try again than wait forever. So, taking those two factors into account, you should select a value that will allow you to get responses a very high percentage of the time while still not making the caller on the other end wait too long.
For question 3, you ask what happens if the message gets dropped. In this case I'm guessing that the future for whoever was to receive that message will just time out: the recipient actor never receives a message to respond to, so no response ever arrives. Akka is not JMS; it doesn't have acknowledgement modes where a message can be resent a number of times if the recipient does not receive and ack the message.
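For readers on the Java side, the same "ask everyone, tolerate individual failures" idea can be approximated with CompletableFuture. This is a sketch under assumptions, not Akka API: ScatterGather and askAll are hypothetical names, each searcher is modelled as a function returning a future, and orTimeout requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper: asks every searcher, replaces an individual failure or
// timeout with an empty list, and merges whatever arrived.
final class ScatterGather {
    static <Q, R> CompletableFuture<List<R>> askAll(
            List<Function<Q, CompletableFuture<List<R>>>> searchers,
            Q query, long timeout, TimeUnit unit) {
        List<CompletableFuture<List<R>>> replies = new ArrayList<>();
        for (Function<Q, CompletableFuture<List<R>>> searcher : searchers) {
            replies.add(searcher.apply(query)
                    .orTimeout(timeout, unit)           // fail this reply if it is too slow
                    .exceptionally(ex -> List.of()));   // partial result instead of a failure
        }
        return CompletableFuture.allOf(replies.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    List<R> merged = new ArrayList<>();
                    for (CompletableFuture<List<R>> reply : replies) {
                        merged.addAll(reply.join()); // already complete here, does not block
                    }
                    return merged;
                });
    }
}
```

Dropping the exceptionally step gives the stricter behaviour, where one failed or slow searcher fails the whole aggregate.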
Also, as you can see from my example, I agree with not blocking on the aggregate Future by using Await. I prefer using the non-blocking callbacks. Blocking in a receive function is not ideal as that Actor instance will stop processing its mailbox until that blocking operation completes. By using a non-blocking callback, you free that instance up to go back to processing its mailbox and allow the handling of the result to be just another job that is executed in the ExecutionContext, decoupled from the actor processing its mailbox.
Now if you really want to not waste communications when the network is not reliable, you could look into the Reliable Proxy available in Akka 2.2. If you don't want to go this route, you could roll it yourself by sending ping-type messages to the routees periodically. If one does not respond in time, you mark it as down and do not send messages to it until you can get a reliable (in a very short amount of time) ping from it, sort of like an FSM per routee. Either of these can work, but you need to remember that these solutions add complexity and should only be employed if you absolutely need this behavior. If you're developing bank software and you absolutely need guaranteed delivery semantics because bad financial implications will result otherwise, by all means go with this kind of approach. Just be judicious in deciding if you need something like this, because I bet 90% of the time you don't. In your model, the only person probably affected by waiting on something that you might have already known won't be successful is the caller on the other end. Because the actor uses non-blocking callbacks, it is not halted by the fact that something might take a long time; it has already moved on to its next message. You also need to be careful if you decide to resubmit on failure. You don't want to flood the receiving actors' mailboxes. If you decide to resend, cap it at a fixed number of times.
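A minimal sketch of "resend, but cap it at a fixed number of times", again in plain Java with CompletableFuture rather than Akka. The names CappedRetry and retry are hypothetical, and there is no delay between attempts in this version, which a real system would probably want to add.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical helper: re-runs an asynchronous request after a failure,
// but never more than maxAttempts times in total.
final class CappedRetry {
    static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        CompletableFuture<T> result = new CompletableFuture<>();
        tryOnce(attempt, maxAttempts, result);
        return result;
    }

    private static <T> void tryOnce(Supplier<CompletableFuture<T>> attempt,
                                    int attemptsLeft,
                                    CompletableFuture<T> result) {
        attempt.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);                      // success: stop retrying
            } else if (attemptsLeft > 1) {
                tryOnce(attempt, attemptsLeft - 1, result);  // failure: try again
            } else {
                result.completeExceptionally(error);         // cap reached: give up
            }
        });
    }
}
```

The caller gets a single future either way, so the code waiting for book results does not need to know how many attempts were made.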
One other possible approach if you need these guaranteed kind of semantics might be to look into Akka's Clustering Model. If you clustered the downstream routees, and one of the servers was failing, then all traffic would be routed to the node that was still up until that other node recovered.

Implementing a JMS Request-Reply. Queue vs Topic?

I understand that there are different ways (or permutations) to implementing a JMS Request-Reply mechanism, i.e. request queue and response queue, request topic and response topic, or a mix of either.
What I would like to know is, (1) what is the recommended (or most common) way and (2) how do the different permutations measure up?
Next, is it more correct to say
a. "Send a message to a queue" or b. "Send a message through a queue"?
Cheers!
Normally, use a queue. "Request" implies a recipient, not a notice to anyone who cares, so you probably want the behaviour of a queue.
Queues usually do better for one thing - or a limited number of peer things - receiving the message and processing it. They also tend to have saner persistence models than topics, when it matters that the message actually gets to someone who processes it. (e.g. if dropping the message is a problem, you probably want a queue)
Topics are more broadcast oriented: say something, and anyone who cares will hear about it. Normally that goes hand-in-hand with "...and no direct response is expected" because the "zero or more listeners" model ... well, zero listeners is always a problem if you expect a response.
Topics can do persistence, but the rules are stranger, and seldom what you actually want.
Finally, I think most people say "to" a queue, because the queue and the thing(s) processing messages off it are distinct, but really, it doesn't matter much as long as you convey your meaning.
Also, with a Queue you are able to have multiple consumers process the messages, so it's kind of a built-in load balancer. You cannot do this easily with a Topic.
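To make the queue-based request-reply shape concrete, here is a small in-memory sketch. It deliberately does not use the JMS API: BlockingQueue stands in for the request queue, a per-request queue plays the role a reply destination (such as the one named in JMSReplyTo) would play, and all class and method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-memory illustration only; a real system would use JMS destinations.
final class RequestReplyDemo {
    /** A request that carries its own reply destination. */
    static final class Request {
        final String body;
        final BlockingQueue<String> replyTo;

        Request(String body, BlockingQueue<String> replyTo) {
            this.body = body;
            this.replyTo = replyTo;
        }
    }

    /** Starts a consumer that takes requests off the shared queue and replies to each. */
    static Thread startConsumer(String name, BlockingQueue<Request> requests) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Request r = requests.take(); // each request is taken by exactly one consumer
                    r.replyTo.put(name + " handled " + r.body);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Sends one request and waits for its reply; returns null on timeout. */
    static String request(BlockingQueue<Request> requests, String body, long timeoutMillis)
            throws InterruptedException {
        BlockingQueue<String> replyTo = new LinkedBlockingQueue<>(); // per-request reply queue
        requests.put(new Request(body, replyTo));
        return replyTo.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Starting several consumers on the same request queue gives the load-balancing effect mentioned above, since each request is handled by whichever consumer takes it first.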
