We have a microservice architecture where, for the most part, each microservice is independent. But for legacy reasons, there is a situation where we have to call one microservice from within another.
E.g., the following method is part of the Legal Service:

@Autowired
private ServiceManager userServiceManager;

public void updateUserLegalData() {
    // do DB update of legal info for the user
    userServiceManager.setAcceptedLegal(true);
}
There are two DB transactions going on above: one updates the Legal Service DB and the other updates the User Service DB. Please note that the User Service is a microservice running on a separate VM.
We are seeing situations where the Legal Service DB is updated but the call to the User Service fails (internal server error), which leaves the application in an inconsistent state. How can we fix this in a recommended way?
Thanks
This situation can be handled only with JTA global/distributed transactions. JTA is part of the Java EE standard and has various implementations; Atomikos is often the tool of choice.
Here is a good write-up from Dave Syer (a Spring ecosystem contributor). It also contains working examples. It's a little bit outdated but still relevant; you can apply some more modern Spring abstractions on top of his examples.
I created a few GitHub examples of JTA transactions for my book. Notice that errors are simulated there and the transaction spans JMS and JDBC data sources.
But also bear in mind that JTA transactions across various data sources are slow because of the two-phase commit algorithm involved. So people often try to avoid them and instead deal with inconsistencies somewhat pragmatically.
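For illustration, here is a minimal sketch of what such a setup might look like with Spring's declarative transactions on top of a JTA provider such as Atomikos (e.g. via spring-boot-starter-jta-atomikos). The table, queue name, and wiring are assumptions for the sketch, not taken from the examples above:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class LegalService {

    private final JdbcTemplate jdbcTemplate; // backed by an XA-aware DataSource
    private final JmsTemplate jmsTemplate;   // backed by an XA-aware ConnectionFactory

    public LegalService(JdbcTemplate jdbcTemplate, JmsTemplate jmsTemplate) {
        this.jdbcTemplate = jdbcTemplate;
        this.jmsTemplate = jmsTemplate;
    }

    @Transactional // a JtaTransactionManager enlists both resources in one XA transaction
    public void updateUserLegalData(long userId) {
        jdbcTemplate.update("UPDATE legal_info SET accepted = true WHERE user_id = ?", userId);
        // if this send fails, the JDBC update above rolls back as well
        jmsTemplate.convertAndSend("user-service.legal-accepted", userId);
    }
}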
Don't do distributed transactions.
For integration with your existing legacy system, one approach could be a separate (micro)service which listens to update events from your userService and forwards the respective updates to the legalService.
Spring Integration may be suitable for such a task.
Cheers,
Michael
Well, if you read a little about the subject on the internet, you'll see it is a hotly debated point at the moment, but there is one answer nearly everybody agrees on: distributed transactions are not the way to go. They are too clumsy and buggy to rely on for data consistency.
So what are our options? People are currently trying to coordinate microservice transactions via Apache Kafka or with Event Sourcing (which concentrates on saving the events that change the data instead of saving the data itself). So what is the problem with those? Well, they are quite different from the usual programming model we are used to, and from a technical and organisational point of view quite complex, so instead of programming for business problems you start programming against the technical challenges.
So what is the alternative? I personally developed another concept and wrote a blog post about it; it might be interesting for you. At its core, it uses full microservice design principles and Spring Boot + Netflix in a J2EE container while fully using transactions. It is too long to cover all the details here; if you are interested, you can read it at the link below.
Micro Services and Transactions with Spring Boot + Netflix
Transactions across microservices can become complex and can slow down the system; one of the best ways to solve the problem of distributed transactions is to avoid them completely.
If you avoid distributed transactions across microservices, you will not end up in such a situation.
If you do have to implement distributed transactions across microservices, I think there are a couple of ways:
Two-phase commit protocol
Eventual consistency
In your case, I would recommend using a message bus and a flag to communicate among the services. When the Legal Service adds the data into the legal database, it puts a lock on that record and sends a message on the message bus; when the User Service is up, it picks up the message, updates the database at its end, and sends an ack message back onto the message bus. Once the ack message is received, the lock is removed; otherwise the record is deleted/rolled back after a certain time duration. This looks complex, but it is a reliable and failure-tolerant solution in your case, as shown in the sketch below.
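A minimal sketch of that flag-and-ack flow, assuming Spring JMS with @EnableJms and @EnableScheduling configured elsewhere; the queue names, the pending column, and the Postgres-style timeout SQL are all made up for illustration:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class LegalDataCoordinator {

    private final JdbcTemplate jdbc;
    private final JmsTemplate jms;

    public LegalDataCoordinator(JdbcTemplate jdbc, JmsTemplate jms) {
        this.jdbc = jdbc;
        this.jms = jms;
    }

    @Transactional
    public void updateUserLegalData(long userId) {
        // 1. write the record in a "pending" (locked) state
        jdbc.update("UPDATE legal_info SET accepted = true, pending = true WHERE user_id = ?", userId);
        // 2. tell the User Service about the change
        jms.convertAndSend("legal.accepted", userId);
    }

    // 3. the User Service replies on an ack queue once its own DB is updated
    @JmsListener(destination = "legal.accepted.ack")
    public void onAck(Long userId) {
        jdbc.update("UPDATE legal_info SET pending = false WHERE user_id = ?", userId);
    }

    // 4. a scheduled sweep rolls back records whose ack never arrived
    @Scheduled(fixedDelay = 60_000)
    public void rollBackStalePending() {
        jdbc.update("UPDATE legal_info SET accepted = false, pending = false "
                + "WHERE pending = true AND updated_at < now() - interval '10 minutes'");
    }
}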
I'm "playing" with Axon Framework with some small examples where the query and command services (and the logic behind them) are running as separated applications in several Docker containers.
Everything works fine so far and I started to evolve the event versioning topic. I haven't implemented that yet, but I like the idea to share the events as an API via JSON schema. But I've got stuck using that idea with the potential need of event upcasters.
If I understand that approach correctly every listening component has to upcasts the events independently, therefore it might be a good idea to share the upcasters, there is no need for different implementations, right? But then the upcasters seem to became a part of the API, or am I missing something?
How do you deal with that situation? Or generally, what are the best practices for API definitions in such scenario?
When working in a microservices environment with distinct repositories for the different services, I feel it is commonplace to have a dedicated module/package/repository for the API of a given microservice, or a dedicated module for the shared language within a Bounded Context.
Especially when following the notion of a Bounded Context, where every service within the context speaks the same language, this to me emphasizes the requirement to share the created upcasters as well.
So, in short: yes, I would group the upcasters together with the API in question.
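To make that concrete, here is a hedged sketch of what such a shared upcaster might look like in Axon; the event name, the revision bump, and the added field are hypothetical, and the dom4j representation assumes the XStream-based serializer:

import org.axonframework.serialization.SimpleSerializedType;
import org.axonframework.serialization.upcasting.event.IntermediateEventRepresentation;
import org.axonframework.serialization.upcasting.event.SingleEventUpcaster;
import org.dom4j.Document;

// lives in the shared API module, next to the event schema it upgrades
public class LegalAcceptedEventUpcaster extends SingleEventUpcaster {

    private static final SimpleSerializedType OLD_TYPE =
            new SimpleSerializedType("com.example.api.LegalAcceptedEvent", null); // null revision = original

    @Override
    protected boolean canUpcast(IntermediateEventRepresentation representation) {
        return representation.getType().equals(OLD_TYPE);
    }

    @Override
    protected IntermediateEventRepresentation doUpcast(IntermediateEventRepresentation representation) {
        return representation.upcastPayload(
                new SimpleSerializedType(OLD_TYPE.getName(), "1.0"), // target revision
                Document.class,
                document -> {
                    // add the field introduced in revision 1.0, with a default value
                    document.getRootElement().addElement("acceptedAt").addText("unknown");
                    return document;
                });
    }
}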
Schema languages typically also have solutions in place to support several versions of a message, for example. Thus, if you were to use a schema language as your core API, that would also include a (albeit different) form of upcaster.
This is my 2 cents on the situation; hope this helps you out!
I am researching how to build a general application or microservice to enable building workflow-centric applications. I have done some research on frameworks (see below), and the most promising candidates share a hard reliance upon RDBMSes to store workflow and process state, combined with JPA-annotated entities. In my opinion, this damages the possibility of designing a general, data-driven workflow microservice. It seems that a truly general workflow system could be built upon NoSQL solutions like MongoDB or Cassandra by storing data objects and rules in JSON or XML. These would allow executing code to enforce types or schemas while using one or two simple Java objects to retrieve and save entities. As I see it, this could enable a single application to be deployed as a Controller for different domains' Model-View pairs without modification (admittedly given a very clever interface).
I have tried to find a workflow engine/BPM framework that supports NoSQL backends. The closest I have found is Activiti-Neo4J, which appears to be an abandoned project providing a connector between Activiti and Neo4J.
Is there a Java workflow engine/BPM framework that supports NoSQL backends and generalizes data objects without requiring specific POJO entities?
If I were to give up on my ideal, magically general solution, I would probably choose a framework like jBPM or Activiti, since they have great feature sets and are mature. In trying to find other candidates, I have found a veritable graveyard of abandoned projects like this one on Java-Source.net.
Yes, Temporal Workflow has pluggable persistence and runs on Cassandra as well as on SQL databases. It has been tested with up to 100 Cassandra nodes and can support tens of thousands of events per second and hundreds of millions of open workflows.
It allows you to model your workflow logic as plain old Java classes and ensures that the code is fully fault tolerant and durable across all sorts of failures. This includes local variables and threads.
See this presentation that goes into more details about the programming model.
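To give a flavour of that programming model, here is a minimal sketch of a workflow written as plain Java classes; the activity interface and all names are made up, while the Temporal SDK annotations and stubs are the real API:

import java.time.Duration;
import io.temporal.activity.ActivityInterface;
import io.temporal.activity.ActivityOptions;
import io.temporal.workflow.Workflow;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;

@WorkflowInterface
public interface LegalWorkflow {
    @WorkflowMethod
    void updateUserLegalData(String userId);
}

@ActivityInterface
interface LegalActivities {
    void updateLegalDb(String userId);
    void notifyUserService(String userId);
}

class LegalWorkflowImpl implements LegalWorkflow {

    // activity calls are retried automatically according to these options
    private final LegalActivities activities = Workflow.newActivityStub(
            LegalActivities.class,
            ActivityOptions.newBuilder()
                    .setStartToCloseTimeout(Duration.ofSeconds(30))
                    .build());

    @Override
    public void updateUserLegalData(String userId) {
        // each step is durably recorded; after a crash the workflow
        // resumes from the last completed step instead of starting over
        activities.updateLegalDb(userId);
        activities.notifyUserService(userId);
    }
}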
I think the reason why workflow engines are often based on an RDBMS is not the database schema but rather the combination with a transaction-safe data store.
Transactional robustness is an important factor for workflow engines, especially for long-running or nested transactions, which are typical for complex workflows.
So maybe this is one reason why most engines (like Activiti) did not focus on a data-driven approach. (I am not talking about data replication here, which is covered by NoSQL databases in most cases.)
If you take a look at the Imixs-Workflow project, you will find a different approach based on Java Enterprise. This engine uses a generic data object which can consume any kind of serializable data values. The problem of data retrieval is solved with Lucene search technology: each object is translated into a virtual document with name/value pairs for each item. This makes it easy to search through the processed business data as well as to query structured workflow data like the status information or the process owners. So this is one possible solution.
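Roughly, the generic data object idea looks like this (a hedged sketch loosely following Imixs-Workflow's ItemCollection; the item names are purely illustrative):

import org.imixs.workflow.ItemCollection;

public class WorkitemExample {
    public static void main(String[] args) {
        // one generic object carries arbitrary business items as name/value pairs
        ItemCollection workitem = new ItemCollection();
        workitem.replaceItemValue("customer.name", "ACME Corp");
        workitem.replaceItemValue("order.total", 1234.50);
        // during processing the engine adds workflow items (status, owner, ...),
        // and all items end up as searchable fields in the Lucene index
        System.out.println(workitem.getItemValueString("customer.name"));
    }
}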
Apart from that, you always have the option to store your business data in a NoSQL database. This is independent of the workflow data of a running process instance, as long as you link both objects together.
Going back to the aspect of transactional robustness, it's a good idea to store the reference to your NoSQL data storage in the process instance, which is transaction-aware. Take also a look here.
The only problem you can run into is the fact that it's very hard to synchronize a transaction context from EJB/JPA to an 'external' NoSQL database. For example: what will you do when your data was successfully saved into your NoSQL data storage (e.g. Cassandra), but the transaction of the workflow engine fails and a rollback is triggered?
The designers of the Activiti project have also been aware of the problem you have stated, but knew it would take quite a rewrite to implement such flexibility, which, arguably, should have been designed into the project from the beginning. As you'll see in the link provided below, the problem has been a lack of interfaces against which to code implementations other than the relational database one. With version 6 they went ahead, ripped off the bandaid, and refactored the framework with a set of interfaces for which different implementations (think Neo4J, MongoDB, or whatever other persistence technology you fancy) can be written and plugged in.
In the linked article below, they provide some code examples for a simple in-memory implementation of the aforementioned interfaces. It looks pretty cool and sounds like it might be precisely what you're looking for.
https://www.javacodegeeks.com/2015/09/pluggable-persistence-in-activiti-6.html
I'm working on an architecture with Hibernate/JPA/Spring/ZK, and I have a lot of questions these days because I have to learn a lot of frameworks.
I have a question that has left me perplexed for several days.
I hear about the OpenSessionInView "pattern" for keeping a Hibernate session alive to allow lazy loading.
Many also say that this pattern is not very clean.
On the other hand, it is said that an extended PersistenceContext is not thread-safe, and is therefore not suitable for keeping the EntityManager alive.
So, what is the real solution to these problems?
I presume these issues arise from the introduction of Ajax, which opens up more possibilities, especially using lazy loading to fetch heavy collections only when necessary.
For the moment, I have tried @PersistenceContext in extended mode. It's working...
I had to set it up for my JUnit tests, and it's working in my web application too, with lazy loading and no further configuration.
Does the evolution of the frameworks (Spring, JPA 2.0) mean that it is now easier and "cleaner" to work with an extended PersistenceContext?
If this is not the case, should we use Spring's OpenSessionInViewFilter and keep the PersistenceContext in transactional mode?
Thank you.
I hear you. I've implemented both patterns in several applications since 2008. Now, I abandon any stateful patterns altogether. When you introduce state to the client, you create scalability and state-management issues: do you merge in the client? Do you save in the user session? What happens when you walk through a wizard and the object must remain transient before save? How would you synchronize client and server-side state? What happens when the DB changes: does the client break?
Look at the trend of existing technologies, including Spring MVC: the pattern is to build two projects: 1) RESTful web services and 2) user interfaces. State is shared through an immutable domain model. Sure, you might end up maintaining a set of DTOs, but they're predictable, cheap, and they scale.
My recommendation? Avoid sending proxied objects over the wire; deal with DTOs on the client, or share a domain model with the client if you want to reuse server-side validations. Lazy collections can be loaded via fine-grained API calls through Ajax, as in the sketch below. That way, you give complete control to the client.
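As an illustration of those fine-grained API calls, a minimal Spring MVC sketch; UserRepository, the User entity with its lazy getOrders(), and the URL layout are assumptions, and in a real app the transactional mapping would live in a service layer:

import java.math.BigDecimal;
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.http.HttpStatus;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ResponseStatusException;

@RestController
public class UserOrdersController {

    public record OrderDto(long id, BigDecimal total) {} // plain, serialization-safe DTO

    private final UserRepository userRepository; // assumed Spring Data repository

    public UserOrdersController(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    @GetMapping("/users/{id}/orders")
    @Transactional(readOnly = true) // session is open only for this call; no OSIV needed
    public List<OrderDto> orders(@PathVariable long id) {
        return userRepository.findById(id)
                .map(user -> user.getOrders().stream()   // lazy collection initialized here
                        .map(o -> new OrderDto(o.getId(), o.getTotal()))
                        .collect(Collectors.toList()))
                .orElseThrow(() -> new ResponseStatusException(HttpStatus.NOT_FOUND));
    }
}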
That's how the social web has scaled in the past five years.
I am looking for a multitenancy solution for my web application.
I would like to implement an application with the separate-schema model. I am thinking of having a datasource per session. In order to do that, I put the datasource and EntityManager in session scope, but that's not working. I am now thinking of loading the data-access-context.xml file (which includes the datasource and other repository beans) once the user has entered their username, password, and tenant ID. I would like to know if this is a good solution.
Multitenancy is a tricky subject and it has to be handled on the JPA provider side, so that from the client code's perspective nothing, or almost nothing, changes. EclipseLink has support for multitenancy (see: EclipseLink/Development/Indigo/Multi-Tenancy); Hibernate added it only recently.
Another approach is to use AbstractRoutingDataSource; see: Multi tenancy in Hibernate.
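A minimal sketch of the AbstractRoutingDataSource approach, assuming the current tenant ID is resolved per request into a ThreadLocal; TenantContext and the key scheme are made up, and one DataSource per tenant is registered via setTargetDataSources:

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class TenantRoutingDataSource extends AbstractRoutingDataSource {

    @Override
    protected Object determineCurrentLookupKey() {
        // key under which the tenant's DataSource was registered
        // in setTargetDataSources(Map<Object, Object>)
        return TenantContext.getCurrentTenant();
    }
}

final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void setCurrentTenant(String tenantId) { CURRENT.set(tenantId); }
    static String getCurrentTenant() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); } // call at the end of each request
}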
Using session scope is way too risky (you would also end up with thousands of database connections, a few for every session/user). Finally, the EntityManager and the underlying database connections are not serializable, so you cannot migrate your session and scale your app properly.
I have worked with a number of multi-tenancy systems. The challenge here is how you 1) keep an open architecture and 2) provide a solution that evolves with your business.
Let's look at the second challenge first. Multi-tenancy systems have a tendency to evolve to a point where you'll need to support use cases in which the same data (record) can be accessed by multiple tenants with different capacities (e.g. https://bugs.eclipse.org/bugs/show_bug.cgi?id=355458). So, the system ultimately needs an Access Control List.
To keep the architecture open, you can code to a standard (like JPA). Coding directly to EclipseLink or Hibernate makes me uncomfortable.
Spring Security ACL provides a very flexible, community-supported solution to both of these challenges. Give it a try. I did, and I have been happy with its performance. However, I must caution you: it took me some digging to get my head around it.
Can someone give me a good explanation of the motivation for and application of JTA in modern Java applications? I don't want overly technical details; just a paragraph on why we need JTA, what JTA accomplishes, and maybe a piece of pseudo code showing how JTA is used.
Normally, an application performs transactional operations over information resources like a database, JMS, etc. As these transactions are totally isolated from each other, it can happen that the application commits the transaction on one resource but fails on the other. That would leave the information inconsistent across these resources, as one transaction got committed and the other did not.
XA is an open standard for such a problem, and JTA is the name given to XA in the J2EE world.
Hope that helps.
Nitin
The greatest book about JTA: Java Transaction Design Strategies by Mark Richards.
You can find a lot of the basics about JTA, transactions, XA, Spring, and EJB support there, along with a good explanation of all aspects of programming and designing transactional applications. Recommended.
JTA defines the semantics (specification + API) of the orchestration that allows third-party enterprise information systems and your application to exchange information with integrity.
The JTA Specification's introduction pretty much sums it up.
JTA allows you to write code or systems that have multiple transactional resources (databases, message queues, your own custom resources, or resources accessed from multiple processes, perhaps on multiple hosts) participating in a single transaction.
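Since the question asked for pseudo code, here is a minimal sketch of programmatic JTA; the JNDI name is the Java EE standard one, while the two resource operations are placeholders for real work on XA-enlisted JDBC/JMS resources:

import javax.naming.InitialContext;
import javax.transaction.UserTransaction;

public class TransferService {

    public void transfer() throws Exception {
        UserTransaction utx = (UserTransaction)
                new InitialContext().lookup("java:comp/UserTransaction");
        utx.begin();
        try {
            updateDatabase();  // JDBC work on an XA DataSource
            sendJmsMessage();  // JMS send on an XA ConnectionFactory
            utx.commit();      // both commit together via two-phase commit...
        } catch (Exception e) {
            utx.rollback();    // ...or neither does
            throw e;
        }
    }

    private void updateDatabase() { /* placeholder */ }
    private void sendJmsMessage() { /* placeholder */ }
}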
This has a fairly nice explanation of what JTA is:
http://www.roseindia.net/interviewquestions/j2ee-interview-questions-2.shtml
To learn more, you can look at the link at the top of this page for the PDF version of the tutorial. As you search for JTA, you will find code examples for it.
http://docs.sun.com/app/docs/doc/819-3669/bnciz?a=view
JTA allows us to write code that has multiple transactional resources (databases, and resources accessed from multiple processes) participating in a single transaction.