I'm currently working on a study project involving Domain-Driven Design (DDD) and integration scenarios across multiple domains.
I have a use case in one of my bounded contexts where I need to contact another BC to validate an aggregate. In fact, several BCs could be asked for validation data in the future (but not for now).
Right now I'm suffering a DDD obsessive-compulsive nervous breakdown where I cannot find a way to apply the patterns correctly (lol). I would really appreciate some feedback from people about it.
About the two bounded contexts:
- The first one (BC_A), where the use case takes place, contains a list of elements related to the user.
- The external one (BC_B) has some knowledge about those elements.
So a validation request from BC_A to BC_B asks for a review of all elements of the aggregate from BC_A, and returns a report containing some specifications about what to do with those elements (whether we should keep each one or not, and why).
The state of the aggregate passes through (let's say) "draft", then "validating" after a request, and then, depending on the report sent back, "valid" or "has_error" if there is one. If the user later chooses not to follow the spec, they can move the state of the aggregate to "controlled", meaning there are some errors but we are not taking care of them.
The command is ValidateMyAggregateCommand.
The use case is:
1. get the target aggregate by id
2. change its state to "validating"
3. persist the aggregate
4. make the validation call (to another BC)
5. persist the validation report
6. acknowledge the validation report with the target aggregate (which will change its state again depending on the result, either "OK" or "HAS_ERROR")
7. persist the aggregate again
8. generate a domain event depending on the validation result
It contains 8 steps, spanning from 1 to 3 transactions or more.
I need to persist the validation report locally (to access it in the UI), and I think I could do it either:
- after the validation call, independently (the report being its own aggregate), or
- when I persist the target aggregate (the report would be inside it).
I prefer the first option (step 5) because it is more decoupled, even if we could argue that there is an invariant here (???), and so there is a consistency delay between the persistence of the report and the acknowledgement by the aggregate.
I'm actually struggling with the call itself (step 4).
I think I could do it in several ways:
A. a synchronous RPC call with a REST implementation
B. a call without a response (void, fire and forget), leaving several implementation options on the table (sync/async)
C. a domain event translated into a technical event to reach the other BC
A. Synchronous RPC call
// code_fragment_a
// = ValidateMyAggregateCommandHandler
// ---
MyAggregate myAggregate = myAggregateRepository.find(command.myAggregateId()); // #1
myAggregate.changeStateTo(VALIDATING); // #2
myAggregateRepository.save(myAggregate); // #3
ValidationReport report = validationService.validate(myAggregate); // #4
validationReportRepository.save(report); // #5
myAggregate.acknowledge(report); // #6
myAggregateRepository.save(myAggregate); // #7
// ---
The validationService is a domain service implemented in the infrastructure layer with a REST service bean (it could be local validation as well, but not in my scenario).
The call needs a response immediately, and the caller (the command handler) is blocked until the response is returned. So it introduces high temporal coupling.
In case the validation call fails for technical reasons, we get an exception and have to roll back everything. The command would have to be replayed later.
B. Call without response (sync or async)
In this version, the command handler persists the "validating" state of the aggregate, and then fires (and forgets) the validation request.
// code_fragment_b0
// = ValidateMyAggregateCommandHandler
// ---
MyAggregate myAggregate = myAggregateRepository.find(command.myAggregateId()); // #1
myAggregate.changeStateTo(VALIDATING); // #2
myAggregateRepository.save(myAggregate); // #3
validationRequestService.requestValidation(myAggregate); // #4
// ---
Here, the acknowledgement of the report could happen in a sync or async manner, inside or outside the initial transaction.
Having the code above in a dedicated transaction allows failures in the validation call to be harmless (provided we have a retry mechanism in the implementation).
This solution would allow starting with a sync communication quickly and easily, and switching to an async one later. So it is flexible.
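As a side note, the retry mechanism mentioned above could be a small decorator around the fire-and-forget call. A minimal sketch, with hypothetical names (RetryingDispatcher is not from any framework) and a naive blocking backoff:

```java
import java.util.function.Consumer;

// Hypothetical sketch: retries a fire-and-forget dispatch a bounded
// number of times before surfacing the last failure.
public class RetryingDispatcher {
    private final int maxAttempts;      // assumed >= 1
    private final long backoffMillis;

    public RetryingDispatcher(int maxAttempts, long backoffMillis) {
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
    }

    // Runs the action; on failure, sleeps and tries again up to maxAttempts.
    public <T> void dispatch(T payload, Consumer<T> action) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.accept(payload);
                return; // success: stop retrying
            } catch (RuntimeException e) {
                last = e;
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw last;
                }
            }
        }
        throw last; // all attempts failed: surface the last error
    }
}
```

In a real implementation you would likely delegate this to the messaging infrastructure's redelivery policy rather than sleep in the command handler's thread.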
B.1. Synchronous impl
In this case, the implementation of the validationRequestService (in the infrastructure layer) does a direct request/response.
// code_fragment_b1_a
// = SynchronousValidationRequestService
// ---
private ValidationCaller validationCaller;
private ValidationReportRepository validationReportRepository;

public void requestValidation(MyAggregate myAggregate) {
    ValidationReport report = validationCaller.validate(myAggregate);
    validationReportRepository.save(report);
    DomainEventPublisher.publish(new ValidationReportReceived(report));
}
// ---
The report is persisted in a dedicated transaction, and the publishing of an event activates a third code fragment (in the application layer) that does the actual acknowledgement work on the aggregate.
// code_fragment_b1_b
// = ValidationReportReceivedEventHandler
// ---
public void when(ValidationReportReceived event) {
    MyAggregate myAggregate = myAggregateRepository.find(event.targetAggregateId());
    ValidationReport report = validationReportRepository.find(event.reportId());
    myAggregate.acknowledge(report);
    myAggregateRepository.save(myAggregate);
}
// ---
So here, we have an event going from the infrastructure layer to the application layer.
B.2. Asynchronous
The asynchronous version would change the previous solution in the ValidationRequestService implementation (code_fragment_b1_a). Using a JMS/AMQP bean would allow sending a message first and receiving the response later, independently.
I guess the messaging listener would fire the same ValidationReportReceived event, and the rest of the code would stay the same as code_fragment_b1_b.
As I write this post, I realize this solution (B2) presents a nicer symmetry in the exchange and better technical properties, because it is more decoupled and more reliable regarding network communications. At this point it is not introducing much complexity.
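To make the B2 shape concrete without committing to a broker API, here is a sketch where an in-memory queue stands in for JMS/AMQP. All names here (AsyncValidationSketch, ValidationRequested, drainNextRequest, onResponse) are illustrative assumptions, not from any framework:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Sketch of the asynchronous exchange: the command side enqueues a
// request and returns; a listener later fires the same
// ValidationReportReceived event used by the synchronous variant (B1).
public class AsyncValidationSketch {
    record ValidationRequested(String aggregateId) {}
    record ValidationReportReceived(String aggregateId, boolean valid) {}

    private final Queue<ValidationRequested> outbox = new ArrayDeque<>();
    private final Consumer<ValidationReportReceived> eventHandler;

    public AsyncValidationSketch(Consumer<ValidationReportReceived> eventHandler) {
        this.eventHandler = eventHandler;
    }

    // Fire and forget: the command handler only enqueues the request.
    public void requestValidation(String aggregateId) {
        outbox.add(new ValidationRequested(aggregateId));
    }

    // Stands in for the transport (a JMS/AMQP sender in real life)
    // draining the outbox toward the remote BC.
    public String drainNextRequest() {
        ValidationRequested next = outbox.poll();
        return next == null ? null : next.aggregateId();
    }

    // Stands in for the messaging listener receiving the remote BC's
    // response and firing the domain event.
    public void onResponse(String aggregateId, boolean valid) {
        eventHandler.accept(new ValidationReportReceived(aggregateId, valid));
    }

    public int pendingRequests() {
        return outbox.size();
    }
}
```

The point of the symmetry is visible here: request and response are two independent one-way messages, so either side can fail and be retried without blocking the other.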
C. Domain events and bus between BCs
For the last implementation, instead of using a domain service to request a validation from the other BC, I would raise a domain event like MyAggregateValidationRequested. I realize it is a "forced" domain event (ok, the user requested it, but it never really emerges in conversation), yet it is still a domain event.
The thing is, I don't know yet how and where to put the event handlers. Should the infrastructure handlers take it directly?
Should I translate the domain event into a technical event before sending it to its destination???
(A technical event being some kind of DTO, if it were a data structure.)
I guess all the code related to messaging belongs to the infrastructure layer (the port/adapter slot), because it is used to communicate between systems only.
And the technical events transferred inside those pipes, with their raising/handling code, should belong to the application layer because, like commands, they end up in a mutation of the system state. They coordinate the domain, and are fired by the infra (like controllers firing application services).
I read some solutions about translating events into commands, but I think it makes the system more complex for no benefit.
So my application facade would expose 3 types of interaction:
- commands
- queries
- events
With this separation, I think we can isolate commands from UI and events from other BCs more clearly.
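A minimal sketch of what that three-way facade could look like, reusing the post's ValidateMyAggregateCommand and ValidationReportReceived names; the nested record shapes and the ValidationReportView query type are my own assumptions:

```java
// Hypothetical application facade: commands from the UI, queries for the
// UI, and events arriving from other bounded contexts.
public interface ApplicationFacade {
    record ValidateMyAggregateCommand(String aggregateId) {}
    record ValidationReportView(String aggregateId, String status) {}
    record ValidationReportReceived(String aggregateId, boolean valid) {}

    // Commands: state mutations requested by the UI.
    void handle(ValidateMyAggregateCommand command);

    // Queries: read-only access for the UI.
    ValidationReportView findReport(String aggregateId);

    // Events: notifications coming in from other BCs.
    void on(ValidationReportReceived event);
}
```

The value of the split is that each entry point can get its own transport adapter (controller, query endpoint, message listener) without the three concerns leaking into each other.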
Ok, I realize this post is pretty long and maybe a little bit messy, but this is where I'm stuck, so I thank you in advance for anything that could help me.
So my problem is that I'm struggling with the integration of the 2 BCs.
Different solutions:
- The RPC service (#A) is simple but limits scale,
- the service with messaging (#B) seems right, but I still need feedback,
- and with domain events (#C) I don't really know how to cross boundaries.
Thank you again!
I have a use case in one of my bounded context where I need to contact another BC to validate an aggregate.
That's a really weird problem to have. Typically, aggregates are valid, or not valid, entirely dependent on their own internal state -- that would be why they are aggregates, and not merely entities in some larger web.
In other words, you may be having trouble applying the DDD patterns because your understanding of the real problem you are trying to solve is incomplete.
As an aside: when asking for help with DDD, you should stick as closely as you can to your actual problem, rather than trying to make it abstract.
That said, there are some patterns that can help you out. Udi Dahan walks through them in detail in his talk on reliable messaging, but I'll cover the high points here.
When you run a command against an aggregate, there are two different aspects to be considered:
- persisting the change of state
- scheduling side effects
"Side effects" can include commands to be run against other aggregates.
In your example, we would see three distinct transactions in the happy path.
The first transaction would update the state of your aggregate to Validating, and schedule the task to fetch the validation report.
That task runs asynchronously, querying the remote domain context, then starts transaction #2 in this BC, which persists the validation report and schedules a second task.
The second task - built from the data copied into the validation report - starts transaction #3, running a command against your aggregate to update its state. When this command is finished, there are no more commands to schedule, and everything gets quiet.
This works, but it couples your aggregates perhaps too tightly to your process. Furthermore, your process is disjoint - scattered about in your aggregate code, not really recognized as being a first class citizen.
So you are more likely to see this implemented with two additional ideas. First, the introduction of a domain event. Domain events are descriptions of changes of state that are of special significance. So the aggregate describes the change (ValidationExpired?) along with the local state needed to make sense of it, publishing the event asynchronously. (In other words, instead of asynchronously running an arbitrary task, we asynchronously schedule a PublishEvent task with an arbitrary domain event as the payload.)
Second, the introduction of a "process manager". The process manager subscribes to the events, updates its internal state machine, and schedules (asynchronous) tasks to run. (These tasks are the same tasks that the aggregate was scheduling before.) Note that the process manager doesn't have any business rules; those belong in the aggregates. But it knows how to match commands with the domain events they generate (see the messaging chapter in Enterprise Integration Patterns, by Gregor Hohpe), to schedule timeout tasks that help detect which scheduled tasks haven't completed within their SLA, and so on.
Fundamentally, process managers are analogous to aggregates; they themselves are part of the domain model, and access to them is provided by the application component. With aggregates, the command handler is part of the application; when the command has been processed by the aggregate, it's the application that schedules the asynchronous tasks. The domain events are published to the event bus (infrastructure), and the application's event handlers subscribe to that bus, loading the process managers via persistence, passing them the domain event to be processed, using the persistence component again to save the updated process manager, and then scheduling the pending tasks.
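The process-manager idea above can be reduced to a sketch: a state machine that matches incoming events to the tasks that should be scheduled next, holding no business rules of its own. All names here (ValidationProcessManager, the task strings) are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal process-manager sketch for the validation flow described above:
// it only tracks where the process is and which task to schedule next.
public class ValidationProcessManager {
    enum State { IDLE, AWAITING_REPORT, DONE }

    private State state = State.IDLE;
    private final List<String> scheduledTasks = new ArrayList<>();

    // Event: the aggregate entered "validating" (transaction #1).
    public void onValidationRequested(String aggregateId) {
        state = State.AWAITING_REPORT;
        scheduledTasks.add("FetchValidationReport:" + aggregateId);
    }

    // Event: the report came back from the remote BC (transaction #2).
    public void onReportReceived(String aggregateId) {
        state = State.DONE;
        scheduledTasks.add("AcknowledgeReport:" + aggregateId);
    }

    public State state() { return state; }
    public List<String> scheduledTasks() { return scheduledTasks; }
}
```

In a real system the scheduled tasks would be persisted alongside the process manager's state, so a crash between event and task does not lose the process.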
I realize it is a "forced" domain event, ok the user requested it but it never really emerges in conversation, but still it is a domain event.
I wouldn't describe it as forced; if the requirement for this validation process really comes from the business, then the domain event is a thing that belongs in the ubiquitous language.
Should I translate the domain event into a technical event before sending it to its destination
I have no idea what you think that means. An event is a message describing something that happened. "Domain event" means that the something happened within the domain. It's still a message to be published.
Related
I have a question about Axon Sagas. I have a project with three microservices, each with its own database, but the two "Slave" microservices have to share their data with the "Master" microservice, and for that I want to use an Axon Saga. I already asked a question about compensation; when something goes wrong I have to deal with the compensation myself, which is ok but not ideal. Currently I am using the DistributedCommandBus to communicate between the microservices; is it good for that? I am using the Choreography Saga model, so here is what it looks like now:
Master -> Send command -> Slave1 -> Handles event
Slave1 -> Send back command -> Master -> Handles event
Master -> Send command -> Slave2 -> Handles event
Slave2 -> Send back command -> Master -> Handles event
If something goes wrong, then the compensating Commands/Events come backwards.
My question is: has anybody done something like this with Axon, with compensation, and what are the best practices for that? How can I retry the Saga process? With the RetryScheduler? Add a GitHub repo if you can.
Thanks, Máté
First and foremost, let me answer your main question:
My question is: has anybody done something like this with Axon?
Shortly, yes, as this is one of the main use cases for Sagas.
As a rule of thumb, I'd like to state a Saga can be used to coordinate a complex business transaction between:
Several distinct Aggregate Instances
Several Bounded Contexts
On face value, it seems you've landed on option two for delegating a complex business transaction.
It is important to note that when you are using Sagas, you should very consciously deal with any exceptions and/or command dispatching results.
Thus, if you dispatch a command from the "Master" to "Slave 1" and the latter fails the operation, this result will come back into the Saga.
This gives you the first option to retry an operation, which I would suggest doing with a compensating action.
Lastly, with a compensating action, I am talking about dispatching a command to trigger it.
If you cannot rely on the direct response from dispatching the command, retrying/rescheduling a message within the Saga would be a reasonable second option.
To that end, Axon has the EventScheduler and DeadlineManager.
Note that the former of the two publishes an event for everyone to see.
The latter schedules a DeadlineMessage within the context of that single Saga instance, thus limiting the scope of who can see a retry is occurring.
Typically, the DeadlineManager would be my preferred mode of operation for this, unless you require this 'rescheduling action' to be seen by everybody.
FYI, check this page for EventScheduler information and this page for DeadlineManager info.
Sample Update
Here's a bit of pseudo-code to get a feel what a compensating action in a Saga Event Handler would look like:
class SomeSaga {

    private CommandGateway commandGateway;

    @SagaEventHandler(associationProperty = "some-key")
    public void on(SomeEvent event) {
        // perform some checks, validation and state setting, if necessary
        commandGateway.send(new MyActionCommand(...))
                      .exceptionally(throwable -> {
                          commandGateway.send(new CompensatingAction(...));
                          return null; // the exceptionally lambda must yield a value
                      });
    }
}
I don't know your exact use case, but from this and your previous question I get the impression you want to roll back, or in this case undo, the event if one of the event handlers cannot process it.
In general, there are some things you are able to do. You can see if the aggregate that applied the event in the first place has, or can have, the information to check whether the 'slave' microservice will be able to handle the event before you apply it. If this isn't practical, the slave microservice can also publish a 'failure' event directly on the event bus to inform the rest of the system that a failure state has occurred and needs to be handled:
https://docs.axoniq.io/reference-guide/implementing-domain-logic/event-handling/dispatching-events#dispatching-events-from-a-non-aggregate
I'm working on a CQRS+ES system, mainly using the Axon Framework, but really this question applies to any implementation. So I have a command handler and one or more event handlers, running on different JVMs, containers, etc., and at some point one of these handlers encounters an error.
We have two cases, an 'expected' business error and an 'unexpected' system error. As I understand it, we are now in an asynchronous handler, and the event is now a fact, so in reality we cannot directly roll back the command in either case (as that could entail rolling it back in numerous other projections and break CQRS).
So my question is, should such an error be 'resolved' in an accounts ledger sort of way, i.e. by sending a new 'reversal' command that is then propagated to the projections in such a way that the event that failed is now resolved?
As an example, let's say we have a command that updates a customer's credit. The event is published; one projection updates its "total credits" statistic, another publishes the update to some websocket for the UI, and finally another one, which maintains the credit state, fails. Should we send a command rolling back the business transaction, and again deduct the credit, update the websocket, etc.? And in the case of Axon, is there some way in which this is captured as a transaction?
I'd state that the decision whether taking an action, i.e. handling a command, is okay should always lie with the Command Model/Aggregate. The aggregate being in an incorrect state to handle an action will typically lead to a 'business exception/error'.
If you make decisions upon event handling failing, however, you're adding decision-making logic to an event handling service which in most cases doesn't care about it. If such an event handling service updates views/query models but fails to do so, I'd argue that's not a valid reason to publish a 'compensating command' to your aggregate to 'roll back/undo the event'.
In your example you have a 'credit-state-maintainer', which I'd guess updates a query model. As such, I'd deem the problem of dealing with the exception to lie within the service itself, not in performing a compensating action.
From an Axon Framework perspective, you could wrap your CreditStateEventHandler in a TrackingEventProcessor and trigger a reset on that event processor by calling the TrackingEventProcessor#resetTokens() function. This takes the stance that the exception in your CreditStateEventHandler is due to faulty coding, of course; otherwise a replay would result in the exact same exception.
I have to coordinate 5 separate microservices, e.g. A, B, C, D, E.
I need to create a coordinator which might monitor a queue for new jobs for A. If A completes ok, then a REST request should be sent to B; then, if everything is ok (happy path), C is called, all the way down to E.
However, B, C, etc. might fail for one reason or another, e.g. an endpoint is down or credentials are insufficient, causing the flow to fail at a particular stage. I'd like to be able to create something that could check the status of a failed job and rerun it, e.g. let's try B again; ok, now it works, and the flow then continues.
Any tips or advice on patterns/frameworks to do this? I'd like something fairly simple and not overly complex.
I've already looked briefly at Netflix Conductor / Camunda but ideally I'd like something a bit less complex.
Thanks
W
Any tips or advice on patterns/frameworks to do this? I'd like something fairly simple and not overly complex.
What you describe is the good ol' domain of A, B, C, D and E. Because the dependencies and engagement rules between the letters are complex enough, it's good to create a dedicated service for this domain. It could be as simple as this overarching service being triggered by queue events.
The only other alternative is to do more on the client side and orchestrate the service calls from there. But that isn't feasible in every domain, for security or other reasons.
And since it sounds like you already have an event queue going, I won't recommend one (Kafka).
One way, apart from Camunda and Conductor, is to send an event from Service A on a messaging queue (e.g. Kafka) which provides at-least-once delivery semantics.
Then write a consumer which receives the event and does the orchestration part (talking to services B, C, D, E).
All these operations need to be idempotent. First, before starting orchestration, create a RequestAgg. for the event from A and keep updating its state to represent where you have reached in your orchestration journey.
Now, even if the other services are down or your node goes down, this should either reach the end, or you should write rollback functions as well.
And to check the states and debug, you can look at the read model of the RequestAgg.
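A rough sketch of that idempotent RequestAgg. state, assuming the B, C, D, E journey; the class and method names here are my own, not from any framework:

```java
import java.util.List;

// Sketch: tracks how far the orchestration journey has progressed, so a
// redelivered (at-least-once) event can be replayed without side effects.
public class RequestAggregate {
    private final List<String> steps = List.of("B", "C", "D", "E");
    private int nextStep = 0;

    // Returns the next service to call, or null when the journey is done.
    public String nextService() {
        return nextStep < steps.size() ? steps.get(nextStep) : null;
    }

    // Idempotent: marking an already-completed step again is a no-op,
    // since only the step currently awaited can advance the journey.
    public void markCompleted(String service) {
        if (nextStep < steps.size() && steps.get(nextStep).equals(service)) {
            nextStep++;
        }
    }

    public boolean finished() {
        return nextStep == steps.size();
    }
}
```

Persisting this aggregate after each step is what lets the orchestration resume from the right place after a crash, and its current state doubles as the debuggable read model mentioned above.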
I have written a REST application based on Spring MVC wherein I am required to do some validations; some of the validations are hard rules and some of them are soft rules. If soft rules fail they generate a warning, but if hard rules fail they generate an error.
First I check the hard rules; if any fail then, at that time only, I return the response, but let the process continue with the subsequent soft rules.
Herein I would like to know how to create two parallel threads in Spring to do this,
OR how to publish a custom event and asynchronously handle it in another thread, letting the original thread continue its work in Spring?
I know about @Async and Spring's TaskExecutor, but not how best to use them here.
I am seeking design and architectural guidelines and ideas to handle this task in the best possible way.
As mentioned, soft rule validation failure only generates a warning, so it can be handled in a separate background process. This way the main thread can focus solely on hard rules without bothering itself about soft rules.
For the above behavior, the points below need to be implemented:
- For every request, persist the relevant data for soft rule processing, with a flag processed=false and preferably timestamps (for insertion and for processing).
- After persisting the data, let the main thread continue with hard rule processing.
- Introduce a scheduled service (via @Scheduled) which will periodically fetch the unprocessed data and mark it processed=true after soft rule processing, along with the relevant processed timestamp. (This will act as the background process which periodically polls the data for unprocessed rules.)
- Ensure that the respective transactions, viz. soft rule data insertion and processing, are well isolated. Also, the error handling should be robust against system failures while rule processing is in progress.
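A framework-agnostic sketch of the submit/poll split described above, with an in-memory list standing in for the persisted soft-rule data; in Spring, the process() method would carry @Scheduled. All names here are illustrative, and the "evaluation" is a stand-in that always produces a warning:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch: the main thread persists soft-rule jobs and returns; a
// background poller later processes them and records warnings.
public class SoftRuleProcessor {
    static class SoftRuleJob {
        final String payload;
        boolean processed = false;
        Instant processedAt = null;
        SoftRuleJob(String payload) { this.payload = payload; }
    }

    private final List<SoftRuleJob> store = new ArrayList<>();
    private final List<String> warnings = new ArrayList<>();

    // Main thread: persist the soft-rule data and return immediately.
    public void submit(String payload) {
        store.add(new SoftRuleJob(payload));
    }

    // Background poller (in Spring: annotated with @Scheduled): fetch
    // unprocessed jobs, evaluate them (here: always emit a warning as a
    // stand-in), and mark them processed with a timestamp.
    public void process() {
        for (SoftRuleJob job : store) {
            if (!job.processed) {
                warnings.add("warning: soft rule failed for " + job.payload);
                job.processed = true;
                job.processedAt = Instant.now();
            }
        }
    }

    public List<String> warnings() { return warnings; }

    public long unprocessedCount() {
        return store.stream().filter(j -> !j.processed).count();
    }
}
```

In the real design the store would be a database table, and submit() and process() would run in separate, well-isolated transactions as the answer recommends.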
Let know in comments if more information is required.
Let's imagine a situation where you have an incoming upstream message which contains multiple items. Each item contains information which participates in the business logic implemented as part of the pipeline.
Difficulties I can see:
- The message has to be split and converted into multiple internal events; those are processed further, and if one of them fails, then all internal events should be rolled back.
- If one upstream message were one item, it would be much easier.
How should one cater for such a situation from an architecture point of view?
What is the best pattern to employ here?
How should one set up transactions?
Thanks!
It looks like your question isn't clear, and the word transaction is used for different subjects...
Anyway, let me guess what you want.
If you are going (and able) to roll back part of the business request, you should just ensure a global XA transaction for all of them and do all the split sub-tasks in the same thread, because only this lets you keep and track the transaction and roll it back afterwards, if needed.
If you can't deal with XA and a single thread, then you should take a look at solutions like compensating transactions or acknowledgement with claim checks.
But that is already outside of Spring Integration's scope.