I want to use the SAGA pattern in my Spring Boot microservices. For example, in a customer order flow, when the order is created an event like OrderCreatedEvent is produced; then, in the customer microservice, a listener on OrderCreatedEvent updates the customer's credit and produces a CreditUpdateEvent, and so on.
I use a session-transacted JmsTemplate for producing events. The javadoc of JmsTemplate says that the JMS transaction is committed after the main transaction:
This has the effect of a local JMS transaction being managed alongside the main transaction (which might be a native JDBC transaction), with the JMS transaction committing right after the main transaction.
Now my question is: how can I handle the following scenario?
The main transaction committed (for example, the order record was committed), but the system was unable to commit the JMS transaction (for any reason).
I want to use SAGA instead of two-phase commit, but I think SAGA just moves the problem from the order and customer services to the order service and the JMS provider.
SAGA hints at the issue:
There are also the following issues to address:
...
In order to be reliable, a service must atomically update its database and publish an event. It cannot use the traditional mechanism of a distributed transaction that spans the database and the message broker. Instead, it must use one of the patterns listed below.
...
The following patterns are ways to atomically update state and publish events:
Event sourcing
Application events
Database triggers
Transaction log tailing
Event Sourcing is special in this list, as it brings a radical change to how your system stores and processes data. Usually, systems store only the current state of their entities. Some systems add explicit support for historical states with validity periods and/or bitemporal data.
Systems based on Event Sourcing store the sequence of events instead of the entity state, in a way that allows them to reconstruct the state from the events. There is only one transactional resource to maintain - the event store - so there is no need to coordinate transactions.
The other patterns in the list avoid the issue of transaction coordination by requiring the event producer code to commit all changes - both entity state and events (stored as entities) - to a single data store. A dedicated but separate mechanism - an event publisher - is then implemented to fetch the events from the data store and publish them to the event consumers.
The event publisher needs to keep track of published/unpublished events, which usually brings back the problem of coordinated transactions. That's where the idempotency of the event consumers comes in: the event publisher replays events from the last known position, while consumers ignore duplicates.
You may also reverse the active/passive roles of the event producer and event consumer. The event producer stores entity state and events (as entities) in a single data store and provides an endpoint that allows event consumers to access the event streams. Each event consumer keeps track of processed/unprocessed events - which it needs to do anyway for idempotency reasons - but only for the event streams it is interested in. A really good explanation of this approach is given in the book REST in Practice - chapters 7 and 8.
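To make the "events as entities" idea concrete, here is a minimal sketch of a transactional outbox in Spring, assuming JPA and a session-transacted JmsTemplate. OutboxEvent, OrderRepository, OutboxRepository and the "order-events" destination are hypothetical names, not from the original post:

```java
import org.springframework.jms.core.JmsTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderService {

    private final OrderRepository orderRepository;   // hypothetical Spring Data repository
    private final OutboxRepository outboxRepository; // hypothetical Spring Data repository

    public OrderService(OrderRepository orderRepository, OutboxRepository outboxRepository) {
        this.orderRepository = orderRepository;
        this.outboxRepository = outboxRepository;
    }

    @Transactional // one local DB transaction covers both inserts
    public void createOrder(Order order) {
        orderRepository.save(order);
        // The event is stored in the same database, in the same transaction,
        // so "order created" and "event recorded" are atomic.
        outboxRepository.save(new OutboxEvent("OrderCreatedEvent", order.getId()));
    }
}

// A separate publisher polls the outbox and sends unpublished events to JMS.
// If the send fails, the event stays unpublished and is retried later;
// consumers must be idempotent, because an event may be delivered twice.
@Component
public class OutboxPublisher {

    private final OutboxRepository outboxRepository;
    private final JmsTemplate jmsTemplate;

    public OutboxPublisher(OutboxRepository outboxRepository, JmsTemplate jmsTemplate) {
        this.outboxRepository = outboxRepository;
        this.jmsTemplate = jmsTemplate;
    }

    @Scheduled(fixedDelay = 1000)
    @Transactional
    public void publishPending() {
        for (OutboxEvent event : outboxRepository.findByPublishedFalse()) {
            jmsTemplate.convertAndSend("order-events", event.getPayload());
            event.markPublished(); // flushed in the same DB transaction
        }
    }
}
```

This answers the original scenario: if the JMS send cannot be committed, the order commit is unaffected, and the event is simply published on the next polling pass.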
With SAGA you split or reorder your transaction (tx) steps into 3 phases (see the sketch after this list):
Tx steps for which you have a compensating action: for each T1..N you have a C1..N.
Tx steps that cannot be compensated: if they fail, you trigger the previously defined C1..N.
Retriable tx steps that always (eventually) succeed.
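A hypothetical sketch of the three phases for the order/credit example; the names (CreateOrderSaga, OrderService, CustomerClient) are illustrative, not a framework API:

```java
public class CreateOrderSaga {

    private final OrderService orderService;
    private final CustomerClient customerClient;

    public CreateOrderSaga(OrderService orderService, CustomerClient customerClient) {
        this.orderService = orderService;
        this.customerClient = customerClient;
    }

    public void run(Order order) {
        // Phase 1: compensatable step T1 (compensation C1 = reject the order).
        orderService.createPendingOrder(order);
        try {
            // Phase 2: a step that cannot be compensated; if it fails,
            // trigger the compensations defined so far.
            customerClient.reserveCredit(order.getCustomerId(), order.getTotal());
        } catch (Exception e) {
            orderService.rejectOrder(order.getId()); // C1
            throw e;
        }
        // Phase 3: retriable step that must eventually succeed (retry until it does).
        orderService.approveOrder(order.getId());
    }
}
```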
SAGAs are not ACID, only ACD. You have to implement isolation yourself to prevent dirty reads, usually with locking.
Why SAGA? To avoid synchronous runtime coupling and poor availability: with a distributed transaction you wait for the last participant to commit.
It's quite a hefty price to pay.
The chance is small, but you can still end up with inconsistent events that might later be used to source an aggregate.
Related
I am working on building a microservice which uses a transaction manager implemented based on the Java Transaction API (JTA).
My question is: does the transaction manager have the ability to handle concurrency issues in distributed database scenarios?
Scenario:
Assume there are multiple instances of a service running and we get two requests to update the balance of an account by $10. Initially the account has $100; the first instance reads that and increments it to $110, but has not committed yet.
At the same time the second instance also retrieves the account, which still reads $100, increments it by $10, and commits, updating the balance to $110; then the first instance commits and updates the account to $110 again.
By now you will have figured out that the balance was supposed to be incremented by $20, not $10. Do I have to write some kind of optimistic-lock exception mechanism to prevent the above scenario, or will a transaction manager based on the JTA specification already ensure that such a thing cannot happen?
does the transaction manager have the ability to handle concurrency issues in distributed database scenarios?
Transactions and concurrency are two independent concepts, and though transactions become most significant in contexts where we also see concurrency, transactions can be important without concurrency.
To answer your question: no, a transaction manager generally does not concern itself with handling issues that arise from concurrent updates. It takes a very naive and simple (and often most meaningful) approach: if, after the start of a transaction, it detects that the state has become inconsistent (because of concurrent updates), it simply raises an exception and rolls back the transaction. Only if it can establish that all the conditions of the ACID properties of the transaction still hold will it commit the transaction.
For this type of request you can use optimistic concurrency, where you have a column on the database (a timestamp) as a reference to the version number.
Each time a change is committed, it modifies the timestamp value.
If two requests try to commit a change at the same time, only one of them will succeed, as the version (timestamp) column will have changed by then, preventing the other request from committing its changes.
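A minimal JPA sketch of this version-column approach; the Account entity is hypothetical. JPA's @Version field works the same way as the timestamp column described above:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;
import java.math.BigDecimal;

@Entity
public class Account {

    @Id
    private Long id;

    private BigDecimal balance;

    // JPA increments this on every commit; an UPDATE whose WHERE clause
    // still references the old version matches no rows and fails with an
    // OptimisticLockException, so the second $10 increment is rejected
    // instead of silently overwriting the first.
    @Version
    private long version;

    public void credit(BigDecimal amount) {
        this.balance = this.balance.add(amount);
    }
}
```

The losing request can then reread the row and retry, so both increments are applied and the balance ends at $120.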
The transaction manager (as an implementation of the JTA specification) makes work across multiple resources transparent. It ensures that all the operations happen as a single unit of work. "Work across multiple resources" means that the application can insert data into a database and meanwhile send a message to a JMS broker, and the transaction manager guarantees the ACID properties hold for these two operations together. In the simple case, when the transaction finishes successfully, the application developer can be sure both operations were processed. When some trouble happens, it is up to the transaction manager to handle it - possibly throw an exception and roll back the data changes - so that neither operation takes effect.
It makes this transparent for the application developer, who does not need to take care to update the database first, then JMS, and then check whether all data changes were really processed or a failure happened.
In general, the JTA specification was not written with microservice architecture in mind. It really depends on your system design(!), but if I assume you have two microservices where each one has its own transaction manager attached, then the transaction manager can't help you sort out your concurrency issue. Transaction managers do not (usually) synchronize with each other. You are not working with multiple resources from one microservice (which is the use case for the transaction manager) but with one resource from multiple microservices.
As there is one shared resource, it is the synchronization point for all your updates, and it is up to that resource how concurrency is managed. Assuming it's a SQL database, it depends on the isolation level it uses (ACID - I = isolation, see https://en.wikipedia.org/wiki/ACID_(computer_science)). Your particular example describes the lost-update phenomenon (https://vladmihalcea.com/a-beginners-guide-to-database-locking-and-the-lost-update-phenomena/), as both microservices try to update one record. One solution for avoiding the issue is optimistic/pessimistic locking (you can implement it on your own, e.g. with timestamps as stated above); another is to use the serializable isolation level in your database; or you can design your application not to read and then update data based on what was read first, but to change the SQL query so that the update is atomic (a sketch follows below). There are also other strategies for working with your data model to achieve the desired outcome.
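As an illustration of the "atomic update" alternative, a sketch using plain JDBC (the account table and column names are hypothetical). Instead of read-then-write, the increment happens inside a single UPDATE, so the database serializes the two requests on the row and no update is lost:

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public final class AccountDao {

    // Read-modify-write is replaced by a single atomic statement:
    // the database applies both +10 increments one after the other,
    // so the final balance is $120 even under concurrency.
    public void credit(Connection con, long accountId, BigDecimal amount) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            ps.setBigDecimal(1, amount);
            ps.setLong(2, accountId);
            ps.executeUpdate();
        }
    }
}
```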
In summary: it depends on how your transaction manager is implemented; it can help you in a way, but that is not its purpose. Your goal should be to check how the isolation level is set up on the shared storage and consider whether your application needs to handle the lost-update phenomenon at the application level or whether your storage can manage it for you.
Related
Let's imagine a situation where you have an incoming upstream message which contains multiple items. Each item contains information that participates in the business logic implemented as part of the pipeline.
Difficulties I can see:
The message has to be split and converted into multiple internal events; those are processed further, and if one of them fails, then all internal events should be rolled back.
If one upstream message equaled one item, it would be much easier.
How should one cater for such a situation from an architectural point of view?
What is the best pattern to employ here?
How should one set up transactions?
Thanks!
It looks like your question isn't clear, and the word transaction is being used for different subjects...
Anyway, let me guess what you want.
If you are going (and are able) to roll back part of the business request, you should just ensure a global XA transaction for all of them and do all the split sub-tasks in the same thread, because only this lets you keep track of the transaction and roll it back afterwards if needed.
If you can't deal with XA and a single thread, then you should take a look at solutions like compensating transactions or acknowledgement with claim checks.
But that is already outside of Spring Integration scope.
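A minimal sketch of the "single transaction, single thread" option, using plain Spring rather than any particular Spring Integration DSL; UpstreamMessageHandler, Item and ItemProcessor are illustrative names:

```java
import java.util.List;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class UpstreamMessageHandler {

    private final ItemProcessor itemProcessor; // hypothetical collaborator

    public UpstreamMessageHandler(ItemProcessor itemProcessor) {
        this.itemProcessor = itemProcessor;
    }

    // All items are processed in the caller's thread inside one transaction:
    // if any item fails, the exception rolls back the work done for all of them.
    @Transactional
    public void handle(List<Item> items) {
        for (Item item : items) {
            itemProcessor.process(item);
        }
    }
}
```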
Related
Looking for an architectural pattern to solve the following problem.
In my architecture, I have a Stateless EventDispatcher EJB that implements:
public void dispatchEvent(MyEvent ev)
This method is called by a variety of other EJBs in their business methods. The purpose of my EventDispatcher is to hide the complexity of how events are dispatched (be it JMS or another mechanism).
For now let's assume my bean is using JMS. So it simply looks at the event passed to it, builds JMS messages, and dispatches them to the right topic. It can produce several JMS messages, and they are only sent if the surrounding transaction ends up being committed successfully (XA transaction).
Problem: I may be looking at transactions where I send thousands of individual messages. Some messages might become invalid because of other things that happened in the transaction (an object updated and then later deleted). So I need a good deal of logic to "scrub" messages based on context and make a final decision on whether to send one big JMS batch message or multiple small ones.
Solutions: What I would like to do is use some sort of "TransactionalContext" object in my stateless EJB to buffer all the events. Then I need a callback of some sort to tell me the transaction is about to commit. This is similar to how we use EntityManager: I can make changes to entities, and it holds onto the changes and is shared between stateless EJBs. At "flush" time (transaction completion) it does its logic to figure out what SQL to execute. I need a TransactionContext available to my stateless bean that has a unique session per transaction and a callback as the transaction is about to complete.
What would you do?
Note that I am NOT in a valid CDI context; some of these transactions start because of @Schedule timers, and other transactions begin because of JMS MDBs.
I believe the thing I am looking for is the TransactionSynchronizationRegistry.
http://docs.oracle.com/javaee/5/api/javax/transaction/TransactionSynchronizationRegistry.html#putResource(java.lang.Object
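A sketch of how the registry could back the "TransactionalContext" buffer described above, assuming a JNDI lookup of the standard java:comp/TransactionSynchronizationRegistry name; the buffer key and scrubAndSend are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.transaction.Synchronization;
import javax.transaction.TransactionSynchronizationRegistry;

public class EventDispatcher {

    private static final String BUFFER_KEY = "eventBuffer"; // hypothetical key

    public void dispatchEvent(MyEvent ev) throws NamingException {
        TransactionSynchronizationRegistry tsr = (TransactionSynchronizationRegistry)
                new InitialContext().lookup("java:comp/TransactionSynchronizationRegistry");

        // putResource/getResource are scoped to the current transaction, so each
        // transaction gets its own buffer, shared across stateless EJB calls.
        @SuppressWarnings("unchecked")
        List<MyEvent> buffer = (List<MyEvent>) tsr.getResource(BUFFER_KEY);
        if (buffer == null) {
            buffer = new ArrayList<>();
            tsr.putResource(BUFFER_KEY, buffer);
            final List<MyEvent> events = buffer;
            tsr.registerInterposedSynchronization(new Synchronization() {
                @Override
                public void beforeCompletion() {
                    // Runs just before commit: scrub the buffered events and
                    // send the surviving JMS messages here.
                    scrubAndSend(events);
                }

                @Override
                public void afterCompletion(int status) {
                    // status == javax.transaction.Status.STATUS_COMMITTED on success
                }
            });
        }
        buffer.add(ev);
    }

    private void scrubAndSend(List<MyEvent> events) {
        // hypothetical: drop messages invalidated later in the transaction,
        // batch the rest, and send them via JMS
    }
}
```

This works from @Schedule timers and MDBs alike, since it depends only on the active JTA transaction, not on a CDI context.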
Related
I found a similar question here but didn't find a clear answer on the transaction management for the back end (database).
My current project is to create a producer/consumer and let the consumer digest JMS messages and persist them in a database. Because the back end of the application is managed by JPA, it is critical to keep the whole process transactional. My question is: what is the downside of placing the @Transactional annotation on the classic onMessage method? Is there any potential performance challenge in doing so?
The only problem may be if the whole queue process takes too long and the connection closes in the middle of the operation. Apart from this, if you enable the transaction for the whole queue process rather than per specific service method, then theoretically the performance should be the same.
It would be better to enable two-phase commit (also known as an XA transaction) for each queue process. Then define each specific service method as @Transactional and interact with your database as expected. At the end, the XA transaction will perform all the commits done by the @Transactional service methods. Note that using this approach does affect your performance.
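For illustration, a minimal Spring sketch of a transactional consumer, using @JmsListener in place of a raw onMessage; OrderRepository and the mapping from the payload are hypothetical:

```java
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class OrderMessageConsumer {

    private final OrderRepository orderRepository; // hypothetical JPA repository

    public OrderMessageConsumer(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    // With a JTA/XA transaction manager configured, the JPA write and the
    // message acknowledgement are committed together; if the persist fails,
    // the message is rolled back onto the queue and redelivered.
    @JmsListener(destination = "orders")
    @Transactional
    public void onMessage(String payload) {
        orderRepository.save(Order.fromJson(payload)); // hypothetical mapping
    }
}
```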
Related
This question follows directly from a previous question of mine on SO. I am still unable to grok the concept of JMS sessions being used as a transactional unit of work.
From the Java Message Service book:
The QueueConnection object is used to create a JMS Session object (specifically, a Queue Session), which is the working thread and transactional unit of work in JMS. Unlike JDBC, which requires a connection for each transactional unit of work, JMS uses a single connection and multiple Session objects. Typically, applications will create a single JMS Connection on application startup and maintain a pool of Session objects for use whenever a message needs to be produced or consumed.
I am unable to understand the meaning of the phrase transactional unit of work. A plain and simple explanation with an example is what I am looking for here.
A unit of work is something that must complete entirely or not at all. If it fails to complete, it must be as if it never happened.
In JTA parlance a unit of work consists of interactions with a transactional resource between a transaction.begin() call and a transaction.commit() call.
Let's say you define a unit of work that pulls a message off a source queue, inserts a record into a database, and puts another message on a destination queue. In this scenario the transaction-aware resources are the two JMS queues and the database.
If a failure occurs after the database insert, a number of things must happen to achieve atomicity: the database insert must be rolled back so you don't have an orphaned record in the data source, and the message that was pulled off the source queue must be put back.
The net result in this contrived scenario is that, regardless of where a failure occurs in the unit of work, you end up in the exact state you started in.
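A sketch of this contrived unit of work using the plain JTA UserTransaction API, assuming XA-capable JMS resources and a datasource enlisted by the container; the audit_log table is hypothetical:

```java
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.TextMessage;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;

public class UnitOfWorkExample {

    // All three interactions happen between begin() and commit(); a failure
    // anywhere triggers rollback(), which un-receives the source message,
    // undoes the insert, and never exposes the destination message.
    public void process(MessageConsumer source, MessageProducer destination,
                        DataSource ds) throws Exception {
        UserTransaction tx = (UserTransaction)
                new InitialContext().lookup("java:comp/UserTransaction");
        tx.begin();
        try {
            Message in = source.receive();
            try (java.sql.Connection con = ds.getConnection();
                 java.sql.PreparedStatement ps = con.prepareStatement(
                         "INSERT INTO audit_log(body) VALUES (?)")) {
                ps.setString(1, ((TextMessage) in).getText()); // assumes a text message
                ps.executeUpdate();
            }
            destination.send(in);
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        }
    }
}
```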
The key thing to remember about messaging systems is that a more global transaction can be composed of several smaller atomic transactional handoffs, queue to queue:
Queue A -> Processing Agent -> Queue B -> Processing Agent -> Queue C
While in this scenario there isn't really a global transactional context (for instance, rolling a failure in B -> C all the way back to A), what you do have is a guarantee that messages will either be delivered down the chain or remain in their source queues. This makes the system consistent at any instant. Exception states can be handled by creating error routes to achieve a more global state of consistency.
A series of messages of which all or none are processed/sent.
A session may be created as transacted. For a transacted session, on session.commit() all messages which consumers of this session have received are committed: the received messages are removed from their destinations (queues or topics), and the messages that producers of this session have sent become visible to other clients. On rollback, received messages are returned to their destinations and sent messages are removed from the destination. All messages sent and received until the commit/rollback form one unit of work.
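A minimal sketch of a transacted session in plain JMS; the "in" and "out" queue names are illustrative:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class TransactedSessionExample {

    public void moveOneMessage(ConnectionFactory factory) throws Exception {
        try (Connection connection = factory.createConnection()) {
            connection.start();
            // 'true' makes the session transacted; the acknowledge mode is ignored.
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageConsumer consumer = session.createConsumer(session.createQueue("in"));
            MessageProducer producer = session.createProducer(session.createQueue("out"));

            Message m = consumer.receive(1000);
            if (m != null) {
                producer.send(m);
                // Until commit(), the receive is not acknowledged and the send is
                // not visible to other clients: one unit of work.
                session.commit();
                // session.rollback() would instead return the received message
                // to the "in" queue and discard the sent message.
            }
        }
    }
}
```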