I have a service method where I request an entity by ID from the database. If the entity has paid == false, I set it to true and do something; if paid == true, it just returns.
@Override
@Transactional(rollbackFor = ServiceException.class)
public void handleIntentSucceeded(PaymentIntent intent) throws ServiceException {
    LOGGER.trace("handleIntentSucceeded({})", intent);
    CreditCharge charge = transactionRepository.findByPaymentIntentId(intent.getId());
    if (charge.getPaid()) {
        return;
    }
    // do some stuff
    charge.setPaid(true);
    transactionRepository.save(charge);
}
Now, if there are multiple requests with the same intent at the same time, this method is no longer consistent: the first request receives the charge with paid == false and does "some stuff", and if a second request reaches this method before the first has saved the charge with paid == true, it will also do "some stuff", even though the first request already did. Is this a correct conclusion?
To be sure that only one request can run this method at a time, so that "some stuff" is not done multiple times, I could set the annotation to @Transactional(isolation = Isolation.SERIALIZABLE). That way, a request can only process this method/transaction once the previous one has committed.
Is this the best approach or is there a better way?
One solution, as already mentioned above, is to use optimistic locking. However, an optimistic locking failure (Spring's ObjectOptimisticLockingFailureException) will lead to a failed HTTP request. If this is a problem, you can handle the exception.
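For reference, a minimal sketch of what enabling optimistic locking can look like, assuming JPA/Hibernate and an entity shaped like the CreditCharge from the question (the field names are assumptions):

import javax.persistence.*;

@Entity
public class CreditCharge {

    @Id
    @GeneratedValue
    private Long id;

    private boolean paid;

    // Hibernate adds "WHERE version = ?" to every UPDATE and raises an
    // optimistic locking failure when a concurrent update wins the race.
    @Version
    private long version;

    public boolean getPaid() { return paid; }
    public void setPaid(boolean paid) { this.paid = paid; }
}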
But if you are sure that you will not run multiple instances of the application and performance requirements are modest, or you simply want to deal with the problem later and use a workaround until then, you can make the method synchronized (https://www.baeldung.com/java-synchronized). That way, the Java runtime ensures that the method cannot run in parallel.
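A minimal sketch of that workaround, with the caveats in the comment:

// Only one thread per JVM can enter the method at a time. Two caveats:
// with Spring's proxy-based @Transactional, the commit happens after the
// method returns, i.e. after the lock is released, so a second thread may
// still read pre-commit state; and the lock does not span multiple
// application instances.
public synchronized void handleIntentSucceeded(PaymentIntent intent)
        throws ServiceException {
    // ... same body as above ...
}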
I would probably look for a way of optimistically locking the record (e.g. using some kind of update counter), so that only the first of the concurrent transactions changing the paid property would complete successfully.
Any subsequent transaction trying to modify the same entity in the meantime would then fail, and its actions done during "do some stuff" would be rolled back.
Optimistic vs. Pessimistic locking
edit: the REPEATABLE_READ isolation level (as suggested by one of the comments) might also behave similarly to optimistic locking, though this might depend on the implementation
Say I receive two concurrent requests with the same payload. But I must (1) perform a single payment transaction (using a third-party API) and (2) somehow return the same response to both requests. It's this second requirement that complicates things. Otherwise, I could have just returned an error response to the duplicate request.
I have two entities: Session and Payment (related via a @OneToOne relation). Session has two fields to keep track of the overall state: PaymentStatus (NONE, OK, ERROR), SessionStatus (CHECKED_IN, CHECKED_OUT). The initial condition is NONE and CHECKED_IN.
The request payload does contain a unique session number, which I use to get the relevant session. For now, assume that the payment service is sort of "idempotent" for a unique order id: it performs only one transaction for a given order id. The order id also comes in the request payload (the same value for the twin requests).
The flow I have in mind is along these lines:
1. Get the session.
2. If session.getPaymentStatus() == OK, find the payment and return a success response.
3. Perform the payment.
4. Save the payment to the DB. Session has a field with a unique constraint generated from the request payload, so if one of the threads tries to insert a duplicate, a DataIntegrityViolationException will be thrown. I catch it, find the already inserted payment, and return a response based on it.
5. If no exception is thrown in 4, return the appropriate response.
In this flow, there seems to be at least one scenario where I might have to return error responses to both requests despite the fact that the payment transaction was successfully completed! For instance, say an error occurs for the "first" request, payment is not done, and an error response is returned. But for the "second" request, which happens to take a bit longer to process, payment is done, but upon insertion to DB, the already inserted payment record is discovered, and an error response is formed on the basis of it.
I'd like to avoid all these race condition-like situations. And I've a feeling that I'm missing something very obvious here. In essence, the problem is to somehow make one request to wait for another to complete. Is there a way that I can utilize DB transactions and locks to handle this smoothly?
Above I assumed that the payment service is idempotent for a given order id. What if it wasn't and I had to absolutely avoid sending duplicate requests to it?
Here's the relevant part of the service method:
Session session = sessionRepo.findById(sessionId)
.orElseThrow(SessionNotFoundException::new);
Payment payment = paymentManager.pay(session, req.getReference(), req.getAmount());
Payment saved;
try {
saved = paymentRepo.save(payment);
} catch (DataIntegrityViolationException ex) {
saved = paymentRepo.findByOrderId(req.getReference())
.orElseThrow(PaymentNotFoundException::new);
}
PaymentStatus status = saved.getSession().getPaymentStatus();
PaymentStage stage = saved.getSession().getPaymentStage();
if (stage == COMPLETION && status == OK)
return CheckOutResponse.success(req.getTerminalId(), req.getReference(),
req.getPlateNumber(), saved.getAmount(), saved.getRrn());
return CheckOutResponse.error(req.getTerminalId(), req.getReference(),
"Unable to complete transaction.");
You are talking about the "same payload", so you have to create a Payload class whose hashCode/equals methods implement your notion of "same".
Then you keep a synchronized (or concurrent) map of all payloads ever started.
When the next request is processed, you create a new entry if absent and start it. If such a payload already exists, simply return its result. The existing payload's work may not be finished yet; to wait for its result comfortably, store it as a CompletableFuture.
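A minimal sketch of that idea, assuming a Payload class with proper equals()/hashCode() and an illustrative processPayment method doing the actual work:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PaymentDeduplicator {

    private final ConcurrentMap<Payload, CompletableFuture<Response>> inFlight =
            new ConcurrentHashMap<>();

    public Response handle(Payload payload) {
        // computeIfAbsent is atomic: exactly one future is created per
        // distinct payload; a duplicate request simply joins the existing one.
        CompletableFuture<Response> future = inFlight.computeIfAbsent(
                payload, p -> CompletableFuture.supplyAsync(() -> processPayment(p)));
        return future.join(); // both requests block here and get the same result
    }

    private Response processPayment(Payload p) {
        // call the payment service, save to the DB, build the response...
        throw new UnsupportedOperationException("illustrative stub");
    }
}

In production you would also evict completed entries from the map after some retention period, or bound its size.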
I'd like to avoid all these race condition-like situations. And I've a feeling that I'm missing something very obvious here. In essence, the problem is to somehow make one request to wait for another to complete. Is there a way that I can utilize DB transactions and locks to handle this smoothly?
I'm inclined to think that there is no way to eliminate all possibility of returning an error response despite the payment being processed successfully, because there are too many places where breakage can occur, including outside your own code. But yes, you can remove some opportunities for inconsistent responses by applying some locking.
For example,
1. Get the session.
2. Get the session's PaymentStatus and take out a pessimistic lock on it (see the repository sketch after this list). You must also include code to ensure that this lock is released before request processing completes, even in error cases (I say nothing further about this).
3. If session.getPaymentStatus() != NONE, return a corresponding response.
4. Perform the payment.
5. Save the payment to the DB, which I am supposing includes updating the PaymentStatus to either OK or ERROR. Because of the locking, no attempt to insert a duplicate is expected. If that happens anyway, an admin needs to be notified, and a different response should be returned, maybe a 501.
6. Return the appropriate response.
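A sketch of what the lock in step 2 could look like with Spring Data JPA (the entity field and method names are assumptions). The @Lock annotation makes the generated query use SELECT ... FOR UPDATE, so a concurrent request blocks on that row until the first transaction commits or rolls back:

import java.util.Optional;
import javax.persistence.LockModeType;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

public interface SessionRepository extends JpaRepository<Session, Long> {

    // Must be called inside a transaction; the row lock is held until that
    // transaction ends, which also covers the release-on-error case.
    @Lock(LockModeType.PESSIMISTIC_WRITE)
    @Query("select s from Session s where s.sessionNumber = :number")
    Optional<Session> findBySessionNumberForUpdate(@Param("number") String number);
}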
Note that idempotency of successfully making the payment does not help you in this area, but if idempotency extended to payment failure cases, then your original workflow would not be susceptible to the inconsistent-response issue described in the question.
I think that this combination of assigning an ID to an entity (hopefully always the same for the same request) and a UNIQUE (id) constraint constitutes a sufficient condition to avoid DB duplicates.
If you want to (for a reason unknown to me) avoid the first condition, you can always check the request's timestamp or design your persistence layer to "manually" check for duplicates before updating.
But, as always, the question is: what are you trying to achieve? This place (Stack Overflow) is more about discussing/correcting implementations than theoretical questions.
edit
If I understand correctly, it's a matter of setting a public static flag somewhere (or a list of flags, you get the idea). In your service, you'd check the flag first, and if it's true, wait for it to become false; then finally perform the main operation.
As for the duplicate requests, I'd compare every req with the last one. If all the parameters are the same and the timestamp is close enough, I'd return status 400 or whatever.
But I still don't get why you would want the same responses. Of course, you could wait for an arbitrary amount of time after receiving every request and before actually executing it, but why not always permit a "unique" request to proceed?
I have a method
@Transactional
public void updateSharedStateByCommunity(List<String> idList)
This method is called from the following REST API:
@RequestMapping(method = RequestMethod.POST)
public ret_type updateUser(param) {
// call updateSharedStateByCommunity
}
Now the ID lists are very large, around 200,000 entries. When I try to process them, it takes a long time and a timeout error occurs on the client side.
So I want to split the work into two calls with a list size of 100,000 each.
But the problem is that the two calls are then treated as two independent transactions.
NB: The two calls are just an example; the list can be split into many more calls if the number of IDs grows larger.
I need the two separate calls to run in a single transaction. If either call fails, all operations should roll back.
Also, on the client side we need to show a progress dialog, so I can't just use a timeout.
The most obvious direct answer to your question, IMO, is to slightly change the code:
@RequestMapping(method = RequestMethod.POST)
public ret_type updateUser(param) {
updateSharedStateByCommunityBlocks(resolveIds);
}
...
And in the service, introduce a new method (if you can't change the code of the service, provide an intermediate class that you'll call from the controller with the following functionality):
@Transactional
public void updateSharedStateByCommunityBlocks(List<String> resolveIds) {
    List<List<String>> blocks = split(resolveIds, 100000); // 100000 = block size
    for (List<String> block : blocks) {
        updateSharedStateByCommunity(block);
    }
}
If this method is in the same service, the @Transactional on the original updateSharedStateByCommunity won't do anything (self-invocation bypasses the proxy), so it will work. If you put this code into some other class, it will also work, since the default propagation level of Spring transactions is REQUIRED.
So it addresses the hard requirements: you wanted a single transaction and you've got it. All the code now runs in the same transaction, each method call now handles 100,000 IDs rather than all of them, and everything is synchronous :)
However, this design is problematic for many different reasons.
It doesn't allow you to track progress (and show it to the user), as you stated yourself in the last sentence of the question. REST is synchronous.
It assumes that the network is reliable and that waiting for 30 minutes is technically not a problem (leaving aside the UX and the 'nervous' user who has to wait :) )
In addition, network equipment can force the connection closed (like load balancers with a pre-configured request timeout).
That's why people suggest some kind of asynchronous flow.
That said, you can still use an async flow: spawn the task, and after each bulk, update some shared state (in-memory in the case of a single instance, persistent, like a database, in the case of a cluster); a rough sketch follows the list below.
The interaction with the client then changes:
Client calls "updateUser" with 200,000 IDs.
Service responds "immediately" with something like: "I've got your request, here is a request ID, ping me once in a while to see what happens."
Service starts an async task and processes the data chunk by chunk in a single transaction.
Client calls a "get" method with that ID, and the server reads the progress from the shared state.
Once ready, the "get" method responds with "done".
If something fails during the transaction execution, the rollback is done, and the process updates the database status to "failure".
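A rough sketch of that flow, assuming a single instance with an in-memory progress store, @EnableAsync configured, and the split/updateSharedStateByCommunity pieces from above (all other names are illustrative). The worker is its own bean so the @Async proxy is not bypassed by self-invocation:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class BulkUpdateWorker {

    private final CommunityService communityService; // assumed: owns updateSharedStateByCommunity
    private final Map<String, String> progress = new ConcurrentHashMap<>(); // use a DB table in a cluster

    public BulkUpdateWorker(CommunityService communityService) {
        this.communityService = communityService;
    }

    // The controller generates a request id, calls this method (through the
    // Spring proxy, so it returns immediately) and hands the id to the client.
    @Async
    @Transactional // one physical transaction for all blocks
    public void process(String requestId, List<String> ids) {
        try {
            List<List<String>> blocks = split(ids, 100_000);
            for (int i = 0; i < blocks.size(); i++) {
                communityService.updateSharedStateByCommunity(blocks.get(i));
                progress.put(requestId, "processed " + (i + 1) + "/" + blocks.size());
            }
            progress.put(requestId, "done");
        } catch (RuntimeException e) {
            progress.put(requestId, "failure"); // the transaction rolls back
            throw e;
        }
    }

    // Backs the "get" endpoint the client polls.
    public String status(String requestId) {
        return progress.getOrDefault(requestId, "unknown");
    }

    private static List<List<String>> split(List<String> ids, int size) {
        List<List<String>> blocks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += size) {
            blocks.add(ids.subList(i, Math.min(i + size, ids.size())));
        }
        return blocks;
    }
}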
You can also use more modern technologies to notify the client (web sockets, for example), but that's kind of out of scope for this question.
Another thing to consider here: from what I know, processing 200,000 objects should take much less than 30 minutes; that's not much for modern RDBMSs.
Of course, without knowing your use case it's hard to tell what happens there, but maybe you can optimize the flow itself (using bulk operations, reducing the number of requests to the DB, caching, and so forth).
My preferred approach in these scenarios is to make the call asynchronous (Spring Boot allows this using the @Async annotation), so the client doesn't wait for the HTTP response. Notification can then be done via a WebSocket that pushes a message to the client with the progress every X items processed.
Surely it adds more complexity to your application, but if you design the mechanism properly, you'll be able to reuse it for any other similar operation you may face in the future.
The @Transactional annotation accepts a timeout (although not all underlying implementations support it). I would argue against trying to split the IDs into two calls; instead, try to fix the timeout (after all, what you really want is a single, all-or-nothing transaction). You can also set timeouts for the whole application instead of on a per-method basis.
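For example, a sketch (whether the timeout is honored depends on the transaction manager and the JDBC driver):

@Transactional(timeout = 1800) // seconds, counted from the start of the transaction
public void updateSharedStateByCommunity(List<String> idList) {
    // process all 200000 ids in one all-or-nothing transaction
}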
From a technical point of view, it can be done with org.springframework.transaction.annotation.Propagation#NESTED. The NESTED behavior makes nested Spring transactions use the same physical transaction, but sets savepoints between nested invocations, so inner transactions may roll back independently of the outer transaction (or let the rollback propagate). The limitation is that this only works with org.springframework.jdbc.datasource.DataSourceTransactionManager.
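A sketch of what that could look like (the names are illustrative; the inner method must live in a different bean so the call goes through the Spring proxy):

// In the outer bean:
@Transactional
public void updateAllBlocks(List<List<String>> blocks) {
    for (List<String> block : blocks) {
        blockUpdater.updateBlock(block); // proxied call into the inner bean
    }
}

// In the inner bean: same physical transaction, but behind a JDBC
// savepoint, so a failure here can roll back to the savepoint without
// dooming the outer transaction.
@Transactional(propagation = Propagation.NESTED)
public void updateBlock(List<String> block) {
    // update the shared state for this block
}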
But for a really large dataset, it will still need a lot of processing time and keep the client waiting, so from a solution point of view an async approach may be better, though it depends on your requirements.
After I create a new 'Order' object, I would like to get its generated ID and put it on an AMQP queue, so that a worker can do other stuff with it. The worker takes the generated ID (the message) and looks up the order, but complains that no record exists, even though I just created one. I am trying to figure out either how long to wait after I call .persist() before I put the message (the generated ID) on the queue (which I don't think is a good idea); have the worker loop over and over until MySQL does return a record (which I don't like either); or find a point where I can put the message on the queue once I know the data is safely in MySQL (this sounds best). I'm thinking that it needs to be done outside of any @Transactional method.
The worker that is going to read the data back out of MySQL is part of a different system on a different server. So when can I tell the worker that the data is in MySQL, so that it can get started on its task?
Is it true that after the @Transactional method finishes, the data is done being written to MySQL? I am having trouble understanding this.
Thanks a million in advance.
Is it true that after the @Transactional method finishes, the data is done being written to MySQL? I am having trouble understanding this. Thanks a million in advance.
So first, as Kayamann and Ralf wrote in the comments, it is guaranteed that the data is stored and available for other processes once the transaction commits (ends).
@Transactional methods are easy to understand. When you have a @Transactional method, the container (the application that actually invokes the method) begins the transaction before the method is invoked, and automatically commits or rolls it back depending on success or error.
So if we have
@Transactional
public void modify(){
    doSomething();
}
And when you call it somewhere in the code (or it is invoked via the container, e.g. due to some bindings), the actual flow will be as follows:
tx = entityManager.getTransaction();
tx.begin();
object.modify();
tx.commit();
This is quite simple. Such an approach means the transactions are container-controlled.
As for your situation: to let your external system know that the transaction has completed, you either use the message queue you already have (with a message saying the transaction is complete for some ID and processing can start), or use a different technology, REST for example.
Remote systems can signal each other about various events via queues and REST services, so there is no real difference.
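One concrete way to publish the ID only after the data is visible to other processes is to register an after-commit callback inside the transactional method, sketched below. TransactionSynchronizationManager and TransactionSynchronization are Spring's actual APIs (before Spring 5.3, extend TransactionSynchronizationAdapter instead); the rabbitTemplate and queue name are assumptions for illustration:

import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.support.TransactionSynchronization;
import org.springframework.transaction.support.TransactionSynchronizationManager;

@Transactional
public Long createOrder(Order order) {
    orderRepository.save(order);
    TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
        @Override
        public void afterCommit() {
            // Runs only after a successful commit, so the worker on the
            // other server is guaranteed to find the row in MySQL.
            rabbitTemplate.convertAndSend("order-queue", order.getId());
        }
    });
    return order.getId();
}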
We use Spring and Hibernate in our project and have a layered architecture: Controller -> Service -> Manager -> DAO. Transactions start in the manager layer. A method in the service layer which updates an object in the DB is called by many threads, and this causes a stale object exception to be thrown. So I made this method synchronized, and I still see the stale object exception thrown. What am I doing wrong here? Is there a better way to handle this case?
Thanks for the help in advance.
The stale object exception is thrown when an entity has been modified between the time it was read and the time it's updated. This can happen inside a single transaction, but may also happen when you read an object in a transaction, modify it (in the controller layer, for example), then start another transaction and merge/update it (in this case, minutes or hours can separate the read and the update).
The exception is thrown to help you avoid conflicts between users.
If you don't care about conflicts (i.e. the last update always wins and replaces what the previous ones have written), then don't use optimistic locking. If you're concerned about conflicts, then StaleObjectStateExceptions will happen, and you should show a meaningful message to the end user, asking them to reload the data and try to modify it again. There's no way to avoid them. You just have to be optimistic and hope that they won't happen often.
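For instance, a sketch of that handling at the web layer, assuming Spring MVC (ObjectOptimisticLockingFailureException is what Spring throws when Hibernate reports StaleObjectStateException; the service and entity names are illustrative):

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.orm.ObjectOptimisticLockingFailureException;

try {
    service.update(entity);
    return ResponseEntity.ok("updated");
} catch (ObjectOptimisticLockingFailureException e) {
    // Ask the user to reload the data and redo the modification.
    return ResponseEntity.status(HttpStatus.CONFLICT)
            .body("The record was modified concurrently; please reload and try again.");
}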
Note that your synchronized trick will work only if
the exception happens only when reading and writing in the same transaction
updates to the entity are only made by this service
your application is not clustered.
It might also reduce the throughput dramatically, because you forbid any concurrent updates, regardless of which entities are updated by the concurrent transactions. It's as if you locked the whole table for the duration of the whole transaction.
My guess is that you would need to configure optimistic locking on the Hibernate side.
I have a case where, in my Java application, I want to call another service (JMS, web service, SMS gateway, etc.) from inside a transaction. I don't want to depend on the result of the call (success, failure, exception thrown, etc.), so if it fails somehow it won't affect my transaction's completion.
What is the best approach for this? I am using the Spring framework.
I also want to ask: if I used threads to handle this, given that my deployment will be on clusters (i.e. different nodes with separate JVMs), what's the best way to handle locking and synchronization?
Regards,
You could spawn a new thread (preferably via a java.util.concurrent.Executor or a Spring TaskExecutor) to perform the subsidiary task. Spring's transaction synchronization works using non-inheritable ThreadLocal variables, so a new thread will not participate in the current transaction.
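A sketch of that first option; the taskExecutor bean and the smsGateway call are assumptions for illustration:

@Autowired
private TaskExecutor taskExecutor; // e.g. a ThreadPoolTaskExecutor bean

@Transactional
public void placeOrder(Order order) {
    orderRepository.save(order); // participates in the surrounding transaction

    taskExecutor.execute(() -> {
        try {
            smsGateway.send(order.getCustomerPhone(), "Order received");
        } catch (RuntimeException e) {
            // Swallowed on purpose: a failure here must not affect the transaction.
            LOGGER.warn("SMS notification failed", e);
        }
    });
    // Note: the task may start before this transaction commits; if the other
    // service must see the committed data, combine this with an after-commit
    // callback.
}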
Alternatively, and perhaps more elegantly, you can specify an explicit transaction propagation setting around the subsidiary task, something like:
@Transactional(propagation = Propagation.NOT_SUPPORTED)
public void doTheThing() { /* ... */ }
This will suspend the existing transaction for the duration of that method, although you'd still need to be careful that runtime exceptions don't bubble up into your main transaction boundary.
Regarding your second question: locking and synchronisation in a cluster is a very complex topic, and not one I can really answer with the information you've given. I suggest opening a new question for this and elaborating on your requirements.
I'd schedule this in a Quartz job.