Get ActorRef to previously spawned EventSourcedBehavior - java

We are using event sourcing with Akka Persistence by extending EventSourcedBehavior. When we create the persistent actor we give it a unique name, using a UUID (the same one we use inside create to build the PersistenceId, for entity sharding):
UUID uuid = UUID.randomUUID();
String name = MyBehavior.nameFor(uuid);
ActorRef<Command> actorRef =
context.spawn(MyBehavior.create(uuid), name);
Later on, when we want to send further commands to the actor, we would like to get an ActorRef<Command> from the context, since the actorRef object returned by spawn will no longer be in scope. Think of the commands as the result of subsequent HTTP requests.
We can't use context.getChild(name) as it returns ActorRef<Void>.
We've also considered actor discovery with Receptionist, but the documentation says it doesn't scale to any number of actors:
https://doc.akka.io/docs/akka/current/typed/actor-discovery.html#receptionist-scalability
On the other hand, ActorSelection is not supported in typed, as per the following link:
https://doc.akka.io/docs/akka/current/typed/from-classic.html#actorselection
We are not sure about the right approach here. Any help would be much appreciated.

If I understand your question correctly, you want to access the ActorRef of your previously spawned actor. Here is what I usually do.
private final Map<String, ActorRef<Command>> instanceIdToActor = new HashMap<>();

private ActorRef<Command> getActorRef(String instanceId) {
    ActorRef<Command> instanceActor = instanceIdToActor.get(instanceId);
    if (instanceActor == null) {
        instanceActor = getContext().spawn(MyBehavior.create(), instanceId);
        instanceIdToActor.put(instanceId, instanceActor);
    }
    return instanceActor;
}
You must also remove the reference whenever the actor dies.
instanceIdToActor.remove(instanceId);
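One way to do that, sketched with the names from the snippet above, is to watch each spawned actor and drop its map entry when the Terminated signal arrives:

// after spawning, watch the child so the parent gets a Terminated signal
ActorRef<Command> instanceActor =
        getContext().spawn(MyBehavior.create(), instanceId);
getContext().watch(instanceActor);
instanceIdToActor.put(instanceId, instanceActor);

// ...and in the parent behavior's createReceive():
@Override
public Receive<Command> createReceive() {
    return newReceiveBuilder()
            // ... regular message handlers ...
            .onSignal(Terminated.class, sig -> {
                // remove the mapping for whichever child just stopped
                instanceIdToActor.values().remove(sig.getRef());
                return this;
            })
            .build();
}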

I finally found it in the documentation. In a typed system, the correct way to handle persistent actors is by using EntityRef with ClusterSharding, as in the example linked below:
https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#persistence-example
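For completeness, a minimal sketch of that approach (assuming MyBehavior.create is changed to take the entity id, and with SomeCommand standing in for a real command):

import akka.cluster.sharding.typed.javadsl.ClusterSharding;
import akka.cluster.sharding.typed.javadsl.Entity;
import akka.cluster.sharding.typed.javadsl.EntityRef;
import akka.cluster.sharding.typed.javadsl.EntityTypeKey;

// register the entity type once, at startup
EntityTypeKey<Command> typeKey = EntityTypeKey.create(Command.class, "MyEntity");
ClusterSharding sharding = ClusterSharding.get(system);
sharding.init(Entity.of(typeKey,
        entityContext -> MyBehavior.create(entityContext.getEntityId())));

// later, from anywhere: the same id always resolves to the same entity,
// which sharding starts on demand, so no ActorRef needs to stay in scope
EntityRef<Command> entity = sharding.entityRef(typeKey, uuid.toString());
entity.tell(new SomeCommand());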

Related

Utilizing SpawnProtocol.Command from Guardian Actor

I need the ActorSystem<SpawnProtocol.Command> in another actor. I would like the GuardianActor to have a reference to the ActorSystem<SpawnProtocol.Command> so I can pass it along when I spawn actors through the guardian. Is there any way to do this? I don't see how it is possible, since we only get the ActorSystem<SpawnProtocol.Command> after creating the actor system and the guardian:
ActorSystem<SpawnProtocol.Command> system = ActorSystem.create(GuardianActor.create(), "System", config);
The only option I see is doing something like
system.tell(new SpawnProtocol.Spawn<>(NewActor.create(system), "NewActor", Props.empty(), system.ignoreRef()));
in this case I will not be spawning NewActor using the guardian actor - which I think is not a clean implementation (correct me if I am wrong)
If I'm understanding you, you want to spawn actors with a reference to the ActorSystem.
Every actor can obtain a reference to its ActorSystem via its ActorContext; the ActorContext is typically injected by using Behaviors.setup. You obtain the ActorSystem by calling getSystem() on the ActorContext. Note that it is an ActorSystem<Void>; since the only way the system really uses its type parameter is when you use it as an ActorRef, the unsafeUpcast() method can be applied. unsafeUpcast does not validate the cast in any way, but since there is typically no confusion around the type (there being only one ActorSystem in a typical application) this is normally not a problem; if an improper cast is made, it will show up as a crash when a message is sent.
// Apologies, my Java is pretty rusty
public class Actor1 extends AbstractBehavior<Actor1.Command> {
    public interface Command {}

    public static Behavior<Command> create(int x) {
        return Behaviors.setup(context -> new Actor1(context, x));
    }

    private int x;
    private final ActorRef<SpawnProtocol.Command> systemGuardian;

    private Actor1(ActorContext<Command> context, int x) {
        super(context);
        this.x = x;
        // If doing an unsafeUpcast on the ActorSystem and there's a message
        // which will do nothing in its protocol, it might be a good idea to
        // send that message eagerly, so everything crashes quickly...
        systemGuardian = context.getSystem().unsafeUpcast();
    }

    @Override
    public Receive<Command> createReceive() {
        return newReceiveBuilder().build();
    }
}
When an Actor1 wants to spawn an actor as a child of the guardian (to be honest, I'm not sure when you'd want to do this from inside of another actor: the purpose of the SpawnProtocol is for code outside of an actor), you just send a SpawnProtocol.Spawn to systemGuardian.
It's also worth noting that the SpawnProtocol can be handled by an actor which isn't the guardian: the guardian actor can spawn an actor handling the SpawnProtocol and provide a ref to that actor as a means to spawn an actor which won't be a child of the requestor.
Note that the ActorRef for the ActorSystem is the guardian actor and it is the guardian actor that will spawn the actor when you do system.tell(new SpawnProtocol.Spawn...).
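To illustrate, a sketch of asking the guardian to spawn (Actor2 is an assumed behavior here):

// fire-and-forget: the guardian spawns "actor2" as its own child; use a
// message adapter instead of ignoreRef() if the new ref is needed back
systemGuardian.tell(new SpawnProtocol.Spawn<>(
        Actor2.create(),
        "actor2",
        Props.empty(),
        context.getSystem().ignoreRef()));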

Retrieve an Akka actor or create it if it does not exist

I am developing an application that creates some Akka actors to manage and process messages coming from a Kafka topic. Messages with the same key are processed by the same actor. I use the message key also to name the corresponding actor.
When a new message is read from the topic, I don't know if the actor with the id equal to the message key was already created by the actor system or not. Therefore, I try to resolve the actor using its name, and if it does not exist yet, I create it. I need to manage concurrency in regard to actor resolution. So it is possible that more than one client asks the actor system if an actor exists.
The code I am using right now is the following:
private CompletableFuture<ActorRef> getActor(String uuid) {
    return system.actorSelection(String.format("/user/%s", uuid))
        .resolveOne(Duration.ofMillis(1000))
        .toCompletableFuture()
        .exceptionally(ex ->
            system.actorOf(Props.create(MyActor.class, uuid), uuid))
        .exceptionally(ex -> {
            try {
                return system.actorSelection(String.format("/user/%s", uuid))
                    .resolveOne(Duration.ofMillis(1000))
                    .toCompletableFuture()
                    .get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        });
}
The above code is not optimised, and the exception handling can be made better.
However, is there in Akka a more idiomatic way to resolve an actor, or to create it if it does not exist? Am I missing something?
Consider creating an actor that maintains as its state a map of message IDs to ActorRefs. This "receptionist" actor would handle all requests to obtain a message processing actor. When the receptionist receives a request for an actor (the request would include the message ID), it tries to look up an associated actor in its map: if such an actor is found, it returns the ActorRef to the sender; otherwise it creates a new processing actor, adds that actor to its map, and returns that actor reference to the sender.
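A rough sketch of such a receptionist (classic API; GetProcessor is an assumed request message carrying the message key):

public class ProcessorRegistry extends AbstractActor {
    // message key -> processing actor
    private final Map<String, ActorRef> processors = new HashMap<>();

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(GetProcessor.class, req -> {
                    // all lookups go through this single actor, so the
                    // check-then-create sequence is free of races
                    ActorRef processor = processors.computeIfAbsent(req.key,
                            k -> getContext().actorOf(Props.create(MyActor.class, k), k));
                    getSender().tell(processor, getSelf());
                })
                .build();
    }
}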
I would consider using akka-cluster and akka-cluster-sharding. First, this gives you throughput as well as reliability. It also makes the system manage the creation of the 'entity' actors for you.
But you have to change the way you talk to those actors. You create a ShardRegion actor which handles all the messages:
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.cluster.sharding.ClusterSharding;
import akka.cluster.sharding.ClusterShardingSettings;
import akka.cluster.sharding.ShardRegion;
import akka.event.Logging;
import akka.event.LoggingAdapter;

public class MyEventReceiver extends AbstractActor {

    private final LoggingAdapter log = Logging.getLogger(getContext().getSystem(), this);
    private final ActorRef shardRegion;

    public static Props props() {
        return Props.create(MyEventReceiver.class, MyEventReceiver::new);
    }

    static ShardRegion.MessageExtractor messageExtractor
            = new ShardRegion.HashCodeMessageExtractor(100) {
        // using the supplied hash code extractor to shard
        // the actors based on the hashcode of the entityId
        @Override
        public String entityId(Object message) {
            if (message instanceof EventInput) {
                return ((EventInput) message).uuid().toString();
            }
            return null;
        }

        @Override
        public Object entityMessage(Object message) {
            // forward the message unchanged to the entity
            return message;
        }
    };

    public MyEventReceiver() {
        ActorSystem system = getContext().getSystem();
        ClusterShardingSettings settings =
                ClusterShardingSettings.create(system);
        // this is setup for the money shot
        shardRegion = ClusterSharding.get(system)
                .start("EventShardingSystem",
                        Props.create(EventActor.class),
                        settings,
                        messageExtractor);
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder().match(
                EventInput.class,
                e -> {
                    log.info("Got an event with UUID {}, forwarding ... ",
                            e.uuid());
                    // the money shot
                    shardRegion.tell(e, getSender());
                }
        ).build();
    }
}
So this Actor MyEventReceiver runs on all nodes of your cluster and encapsulates the shardRegion Actor. You no longer message your EventActors directly; instead, via MyEventReceiver and the shardRegion Actor, you let the sharding system keep track of which node in the cluster the particular EventActor lives on. It will create one if none has been created before, or route messages to it if it has. Every EventActor must have a unique id, which is extracted from the message (so a UUID is pretty good for that, but it could be some other id, like a customerID or an orderID, as long as it's unique for the Actor instance you want to process it with).
(I'm omitting the EventActor code; it's otherwise a pretty normal Actor, depending on what you are doing with it. The 'magic' is in the code above.)
The sharding system automatically knows to create the EventActor and allocate it to a shard, based on the algorithm you've chosen (in this particular case, it's based on the hashCode of the unique ID, which is all I've ever used). Furthermore, you're guaranteed only one Actor for any given unique ID. The message is transparently routed to the correct Node and Shard wherever it is; from whichever Node and Shard it's being sent.
There's more info and sample code in the Akka site & documentation.
This is a pretty rad way to make sure that the same Entity/Actor always processes messages meant for it. The cluster and sharding take automatic care of distributing the Actors properly, of failover, and the like (you would have to add akka-persistence to get passivation, rehydration, and failover if the Actor has a bunch of strict state associated with it that must be restored).
The answer by Jeffrey Chung is indeed the Akka way. The downside of that approach is its low performance. The most performant solution is to use Java's ConcurrentHashMap.computeIfAbsent() method.
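Roughly, as a sketch that keeps the classic API from the question (system and MyActor as above):

private final ConcurrentHashMap<String, ActorRef> actors = new ConcurrentHashMap<>();

private ActorRef getOrCreate(String uuid) {
    // atomic per key: concurrent callers for the same uuid get one actor;
    // remember to remove entries when the actor terminates
    return actors.computeIfAbsent(uuid,
            id -> system.actorOf(Props.create(MyActor.class, id), id));
}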

Number of Actors when using @Inject

I am building an application in Play Framework that has to do some intense file parsing. This parsing involves parsing multiple files, preferably in parallel.
A user uploads an archive that gets unzipped, and the files are stored on the drive.
In that archive there is a file (let's call it main.csv) that has multiple columns. One such column is the name of another file from the archive (like subPage1.csv). This column can be empty, so that not all rows from the main.csv have subpages.
Now, I start an Akka Actor to parse the main.csv file. In this actor, using @Inject, I have another ActorRef:
public class MainParser extends AbstractActor {

    @Inject
    @Named("subPageParser")
    private ActorRef subPageParser;

    public Receive createReceive() {
        ...
        if (column[3] != null) {
            subPageParser.tell(column[3], getSelf());
        }
    }
}
SubPageParser Props:
public static Props getProps(JPAApi jpaApi) {
    return new RoundRobinPool(3).props(Props.create(SubPageParser.class, jpaApi));
}
Now, my question is this. Considering that a subPage may take 5 seconds to be parsed, will I be using a single instance of SubPageParser, or will there be multiple instances that do the processing in parallel?
Also, consider another scenario, where the names are stored in the DB, and I use something like this:
List<String> names = dao.getNames();
for (String name : names) {
    subPageParser.tell(name, null);
}
In this case, considering that the subPageParser ActorRef is obtained using Guice #Inject as before, will I do parallel processing?
If I am doing processing in parallel, how do I control the number of Actors that are being spawned? If I have 1000 subPages, I don't want 1000 Actors. Also, their lifetime may be an issue.
NOTE:
I have an ActorsModule like this, so that I can use @Inject and not Props:
public class ActorsModule extends AbstractModule implements AkkaGuiceSupport {
    @Override
    protected void configure() {
        bindActor(MainParser.class, "mainparser");
        Function<Props, Props> props = p -> SubPageParser.getProps();
        bindActor(SubPageParser.class, "subPageParser", props);
    }
}
UPDATE: I have modified the code to use a RoundRobinPool. However, this does not work as intended. I specified 3 as the number of instances, but I get a new object for each parse request in the if.
Injecting an actor like you did will lead to one SubPageParser per MainParser. While you might send 1000 messages to it (using tell), they will get processed one by one while the others are waiting in the mailbox to be processed.
With regard to your design, you need to be aware that injecting an actor like that will create another top-level actor rather than create the SubPageParser as a child actor, which would allow the parent actor to control and monitor it. The Play Framework has support for injecting child actors, as described in its documentation: https://www.playframework.com/documentation/2.6.x/JavaAkka#Dependency-injecting-child-actors
While you could get Akka to use a certain number of child actors to distribute the load, I think you should question why you have used actors in the first place. Most problems can be solved with simple Futures. For example, you can configure a custom thread pool to run your Futures on and have them do the work at whatever parallelization level you wish: https://www.playframework.com/documentation/2.6.x/ThreadPools#Using-other-thread-pools
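For instance, a rough sketch (parseSubPage and ParsedPage are assumed names for your parsing logic and its result):

// a fixed pool bounds the parallelism, playing the role of the router
ExecutorService pool = Executors.newFixedThreadPool(3);

List<CompletableFuture<ParsedPage>> results = names.stream()
        .map(name -> CompletableFuture.supplyAsync(() -> parseSubPage(name), pool))
        .collect(Collectors.toList());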

Unit testing private methods in Akka

I'm new to Akka and I'm trying Akka with Java. I'd like to understand unit testing of business logic within actors. I read the documentation and the only example of isolated business logic within an actor is:
static class MyActor extends UntypedActor {
    public void onReceive(Object o) throws Exception {
        if (o.equals("say42")) {
            getSender().tell(42, getSelf());
        } else if (o instanceof Exception) {
            throw (Exception) o;
        }
    }

    public boolean testMe() { return true; }
}

@Test
public void demonstrateTestActorRef() {
    final Props props = Props.create(MyActor.class);
    final TestActorRef<MyActor> ref = TestActorRef.create(system, props, "testA");
    final MyActor actor = ref.underlyingActor();
    assertTrue(actor.testMe());
}
While this is simple, it implies that the method I want to test is public. However, considering that actors should communicate only via messages, my understanding is that there is no reason to have public methods, so I'd make my method private. Like in the example below:
public class LogRowParser extends AbstractActor {
    private final Logger logger = LoggerFactory.getLogger(LogRowParser.class);

    public LogRowParser() {
        receive(ReceiveBuilder.
                match(LogRow.class, lr -> {
                    ParsedLog log = parse(lr.rowText);
                    final ActorRef logWriter = getContext().actorOf(Props.create(LogWriter.class));
                    logWriter.tell(log, self());
                }).
                matchAny(o -> logger.info("Unknown message")).build()
        );
    }

    private ParsedLog parse(String rowText) {
        // Log parsing logic
    }
}
So to test the method parse I either:
need to make it package-private,
or test the actor's public interface, i.e. that the next actor LogWriter received the correct parsed message from my actor LogRowParser.
My questions:
Are there any downsides to option #1? Assuming that actors communicate only via messages, are encapsulation and clean open interfaces less important?
If I try to use option #2, is there a way to catch messages sent from the actor downstream in a test (testing LogRowParser and catching in LogWriter)? I reviewed various examples of JavaTestKit, but all of them catch messages that are responses back to the sender, and none show how to intercept a message sent to a new actor.
Is there another option that I'm missing?
Thanks!
UPD:
Forgot to mention that I also considered options like:
Moving logic out of actors completely into helper classes. Is this common practice with Akka?
Powermock... but I'm trying to avoid it if a redesign is possible.
There's really no good reason to make that method private. One generally makes a method on a class private to prevent someone who has a direct reference to an instance of that class from calling it. With an actor instance, no one will have a direct reference to an instance of that actor class. All you get to communicate with an instance of that actor class is an ActorRef, which is a lightweight proxy that only allows you to communicate by sending messages to be handled by onReceive via the mailbox. An ActorRef does not expose any internal state or methods of that actor class. That's one of the big selling points of an actor system: an actor instance completely encapsulates its internal state and methods, protecting them from the outside world, and only allows those internal things to change in response to receiving messages. That's why it does not seem necessary to mark that method as private.
Edit
Unit testing of an actor, IMO, should always go through the receive functionality. If you have some internal methods that are then called by the handling in receive, you should not focus on testing these methods in isolation but instead make sure that the paths that lead to their invocation are properly exercised via the messages that you pass during test scenarios.
In your particular example, parse is producing a ParsedLog message that is then sent on to a logWriter child actor. For me, knowing that parse works as expected means asserting that the logWriter received the correct message. In order to do this, I would allow the creation of the child logWriter to be overridden and then do just that in the test code and replace the actor creation with a TestProbe. Then, you can use expectMsg on that probe to make sure that it received the expected ParsedLog message thus also testing the functionality in parse.
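A sketch of that test, assuming LogRowParser is changed so the writer ActorRef can be passed in (e.g. via its constructor):

JavaTestKit writerProbe = new JavaTestKit(system);
ActorRef parser = system.actorOf(
        Props.create(LogRowParser.class, writerProbe.getRef()));

parser.tell(new LogRow("some raw log line"), ActorRef.noSender());

// if parse worked, the probe standing in for LogWriter sees the result
writerProbe.expectMsgClass(ParsedLog.class);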
As far as your other comment around moving the real business for the actor out into a separate and more testable class and then calling that from in the actor, some people do this, so it's not unheard of. I personally don't, but that's just me. If that approach works for you, I don't see any major issues with it.
I had the same problem 3 years ago when dealing with actors: the best approach I found was to give the actor only the minimum responsibility, i.e. the messaging responsibility.
The actor will receive the message and choose the object's method to call, or the message to send, or the exception to throw, and that's it.
This way it becomes very simple to mock up both the services called by the actor and the input to those services.

Akka: Cleanup of dynamically created actors necessary when they have finished?

I have implemented an Actor system using Akka and its Java API UntypedActor. In it, one actor (type A) starts other actors (type B) dynamically on demand, using getContext().actorOf(...);. Those B actors will do some computation which A doesn't really care about anymore. But I'm wondering: is it necessary to clean up those actors of type B when they have finished? If so, how?
By having B actors call getContext().stop(getSelf()) when they're done?
By having B actors call getSelf().tell(Actors.poisonPill()); when they're done? [this is what I'm using now].
By doing nothing?
By ...?
The docs are not clear on this, or I have overlooked it. I have some basic knowledge of Scala, but the Akka sources aren't exactly entry-level stuff...
What you are describing are single-purpose actors created per “request” (defined in the context of A), which handle a sequence of events and then are done, right? That is absolutely fine, and you are right to shut those down: if you don’t, they will accumulate over time and you run into a memory leak. The best way to do this is the first of the possibilities you mention (most direct), but the second is also okay.
A bit of background: actors are registered within their parent in order to be identifiable (e.g. needed in remoting but also in other places), and this registration keeps them from being garbage collected. OTOH, each parent has a right to access the children it created, hence no automatic termination (i.e. by Akka) makes sense; instead, explicit shutdown in user code is required.
In addition to Roland Kuhn's answer, rather than create a new actor for every request, you could create a predefined set of actors that share the same dispatcher, or you can use a router that distributes requests to a pool of actors.
The Balancing Pool Router, for example, allows you to have a fixed set of actors of a particular type share the same mailbox:
akka.actor.deployment {
  /parent/router9 {
    router = balancing-pool
    nr-of-instances = 5
  }
}
Read the documentation on dispatchers and on routing for further detail.
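Creating the pool from that configuration then looks roughly like this (Worker is an assumed actor class):

import akka.routing.FromConfig;

// the name must match the deployment path /parent/router9 above
ActorRef router = getContext().actorOf(
        FromConfig.getInstance().props(Props.create(Worker.class)),
        "router9");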
I was profiling (VisualVM) one of the sample cluster applications from the Akka documentation, and I see garbage collection cleaning up the per-request actors during every GC. I am unable to completely understand the recommendation to explicitly kill the actor after use. My actor system and actors are managed by the Spring IoC container, and I use the Spring extension's indirect actor-producer to create actors. The "aggregator" actor is getting garbage collected on every GC; I monitored the number of instances in VisualVM.
@Component
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
public class StatsService extends AbstractActor {

    private final LoggingAdapter log = Logging.getLogger(getContext().getSystem(), this);

    @Autowired
    private ActorSystem actorSystem;

    private ActorRef workerRouter;

    @Override
    public void preStart() throws Exception {
        System.out.println("Creating Router" + this.getClass().getCanonicalName());
        workerRouter = getContext().actorOf(SPRING_PRO.get(actorSystem)
                .props("statsWorker").withRouter(new FromConfig()), "workerRouter");
        super.preStart();
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(StatsJob.class, job -> !job.getText().isEmpty(), job -> {
                    final String[] words = job.getText().split(" ");
                    final ActorRef replyTo = sender();
                    final ActorRef aggregator = getContext().actorOf(SPRING_PRO.get(actorSystem)
                            .props("statsAggregator", words.length, replyTo));
                    for (final String word : words) {
                        workerRouter.tell(new ConsistentHashableEnvelope(word, word),
                                aggregator);
                    }
                })
                .build();
    }
}
Actors by default do not consume much memory. If the application intends to use the B actors later on, you can keep them alive. If not, you can shut them down via PoisonPill. As long as your actors are not holding resources, leaving an actor alive should be fine.
