I have the following case:
I have a thread which uses the session to save or update
public void run()
{
Session session = DAO.getInstance().getCurrentSession();
Transaction tx = null;
try
{
tx = session.beginTransaction();
session.saveOrUpdate(entity);
}catch.....
}
But in the meantime during the serialization with session.saveorUpdate i change the entity object...
So the User-thread will change the data during the session serialization..
How can I overcome this problem? is there a simple way in hibernate?
EDIT:
The biggest problem lies when the UserThread changes some data in the entity object durignt the saveOrUpdate method.
It sounds like you'd be interested in optimistic concurrency control using versioning.
Optimistic Concurrency Control
If you haven't come across it before, it's a similar idea to compare-and-swap whereby Hibernate will manage a version along with the entity. By incrementing a version number during updates and checking it hasn't changed after, Hibernate can detect conflict and error. It optimistically assumes that actual contention is rare and leaves it to the developer to handle the exceptions. I've generally found this to be the case and as the Hibernate docs put it;
The only approach that is consistent with high concurrency and high
scalability, is optimistic concurrency control with versioning.
You can tweak Hibernate's transaction visibility and isolation level to affect the finer details, see
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/transactions.html#transactions-optimistic
Transaction Demarcation
I can't tell from the question's code snippet but it may also be worth considering the transaction boundary. Usually, I'll start a transaction (beginTransaction) at the start of a business operation or request and commit and completion. All updates are performed in this session (with one thread-per-session Hibernate) model. I still have each business operation or request processed on their own thread and rely on usual Hiernate issolation levels etc to manage conflicts.
I mention it because there may be a chance to step back and consider why you make updates from multiple threads. It may be that your application doesn't suit the approach I've tried to outline but it may be worth considering if it can be shifted around to avoid genuine multiple-thread updates.
Failing that it's certainly worth understanding if there is likely to be frequent conflicts in production. Testing this could help you understand if you really need to worry about it or if you can rely on the usually transaction control to detect conflicts and handle them in other ways.
One way is to synchronize on the entity object:
public void run()
{
Session session = DAO.getInstance().getCurrentSession();
Transaction tx = null;
try
{
tx = session.beginTransaction();
synchronized(entity) {
session.saveOrUpdate(entity);
}
}
Related
I am working on building a microservice which is using transaction manager implemented based on Java Transaction API(JTA).
My question is does Trasaction maanger have ability to handle concurrency issue in distributed database scenario's .
Scenario:
Assume there are multiple instance of a service running and we get two requests to update balance amount by 10 in an account. Initially an account can have $100 and the first instance gets that and increments it to $10 but has not been commited yet.
At the same time the second instance also retreive's account which is still 100 and increments it by $10 and then commits it updating balance to $110 and then service one updates account again to $110.
By this time you must have figured that balance was supposed to be incremented by $20 and not 10. Do I have to write some kind of Optimistic lock exception mechanism to prevent the above scenario or will Transaction Manager based on JTA specification already ensure such a thing will not happen ?
does Trasaction maanger have ability to handle concurrency issue in distributed database scenario's .
Transactions and concurrency are two independent concepts and though Transactions become most siginificant in context where we also see concurrency , transactions can be important without concurrency.
To answer your question : No , Transaction Manager generally does not concern itself with handling issues that arise with concurrent updates. It takes a very naive and simple ( and often most meaningful ) approach : if after the start of a transaction , it detects that the state has become inconsistent ( because of concurrent updates ) it would simply raise it as an exception and Rollback the transaction. If only it can establish that all the conditions of the ACID properties of the transaction are still valid will it commit the transaction.
For such type of requests, you can handle through Optimistic Concurrency where you would have a column on the database (Timestamp) as a reference to the version number.
Each time when a change is commited it would modify the timestamp value.
If two requests try to commit the change at the same time, only one of them will succeed as the version (Timestamp) column will change by then negating other request from comitting its changes.
The transaction manager (as implementation of the JTA specification) makes transparent a work above multiple resources. It ensures all the operations happens as a single unit of work. The "work above multiple resources" mean that that the application can insert data to database and meanwhile it sends a message to a JMS broker. Transaction manager guarantees ACID properties to be hold for this two operations. In simplistic form when the transaction finishes successfully the application developer can be sure both operation was processed. When some trouble happens is on the transaction manager to handle it - possibly throw an exception and rollback the data changes. Thus neither operation was processed.
It makes this transparent for the application developer who does not need to care to update first database and then JMS and checks if all data changes were really processed or a failure happens.
In general the JTA specification was not written with microservice architecture in mind. Now it really depends on your system design(!) But if I consider you have two microservices where each one has attached its own transaction manager then the transaction manager can't help you to sort out your concurrency issue. Transaction managers does not work (usually) in some synchronization. You don't work with multiple resources from one microservice (what is the usecase for the transaction manager) but with one resource from multiple microservices.
As there is the one resource it's the synchronization point for all you updates. It depends on it how it manages concurrency. Considering it's a SQL database then it depends on the level of the isolation it uses (ACID - I = isolation, see https://en.wikipedia.org/wiki/ACID_(computer_science)). Your particular example talks about lost update phenomena (https://vladmihalcea.com/a-beginners-guide-to-database-locking-and-the-lost-update-phenomena/). As both microservices tries to update one record. One solution for the avoiding the issue is using optimistic/pesimistic locking (you can implement it on your own by e.g. timestamps as stated above), the other is to use serializable isolation level in your database, or you can design your application for not reading and updating data based on what is read first time but change the sql query having the update atomic (or there are possibly other strategies how to work with your data model to achieve the desired outcome).
In summary - it depends on how your transaction manager is implemented, it can help you in a way but it's not its purpose. Your goal should be to check how the isolation level is set up at the shared storage and consider if your application needs to handle lost update phenomena at application level or your storage cang manage it for you.
I have a bank project which customer balances should be updated by parallel threads in parallel applications. I hold customer balances in an Oracle database. My java applications will be implemented with Spring and Hibernate.
How can i implement the race condition between parallel applications? Should my solution be at database level or at application level?
I assume what you would like to know is how to handle concurrency, preventing race conditions which can occur where two parts of the application modify and accidentally overwrite the same data.
You have mostly two strategies for this: pessimistic locking and optimistic locking:
Pessimistic locking
here you assume that the likelyhood that two threads overwrite the same data is high, so you would like it to handle it in a transparent way. To handle this, increase the isolation level of your Spring transactions from it's default value of READ_COMMITTED to for example REPEATABLE_READ which should be sufficient in most cases:
#Transactional(isolation=Isolation.REPEATABLE_READ)
public void yourBusinessMethod {
...
}
In this case if you read some data in the beginning of the method, you are sure that noone can overwrite the data in the database while your method is ongoing. Note that it's still possible for another thread to insert extra records to a query you made (a problem known as phantom reads), but not change the records you already read.
If you want to protect against phantom reads, you need to upgrade the isolation level to SERIALIZABLE. The improved isolation comes at a performance cost, your program will run slower and will more frequently 'hang' waiting for the other part of the program to finish.
Optimistic Locking
Here you assume that data access colisions are rare, and that in the rare cases they occur they are easilly recoverable by the application. In this mode, you keep all your business methods in their default REPEATABLE_READ mode.
Then each Hibernate entity is marked with a version column:
#Entity
public SomeEntity {
...
#Version
private Long version;
}
With this each entity read from the database is versioned using the version column. When Hibernate write changes to an entity in the database, it will check if the version was incremented since the last time that transaction read the entity.
If so it means someone else modified the data, and decisions where made using stale data. In this case a StaleObjectException is thrown, that needs to be caught by the application and handled, ideally at a central place.
In the case of a GUI, you usuall catch the exception, show a message saying user xyz changed this data while you where also editing it, your changes are lost. Press Ok to reload the new data.
With optimistic locking your program will run faster but the applications needs to handle some concurrency aspects that would otherwise be transparent with pessimistic locking: version entities, catch exceptions.
The most frequently used method is optimistic locking, as it seems to be acceptable in most applications. With pessimistic locking it's very easy to cause performance problems, specially when data access colisions are rare and can be solved in a simple way.
There are no constraints to mix the use of the two concurrency handling methods in the same application if needed.
I am looking at EntityManager API, and I am trying to understand an order in which I would do a record lock. Basically when a user decides to Edit a record, my code is:
entityManager.getTransaction().begin();
r = entityManager.find(Route.class, r.getPrimaryKey());
r.setRoute(txtRoute.getText());
entityManager.persist(r);
entityManager.getTransaction().commit();
From my trial and error, it appears I need to set WWEntityManager.entityManager.lock(r, LockModeType.PESSIMISTIC_READ); after the .begin().
I naturally assumed that I would use WWEntityManager.entityManager.lock(r, LockModeType.NONE); after the commit, but it gave me this:
Exception Description: No transaction is currently active
I haven't tried putting it before the commit yet, but wouldn't that defeat the purpose of locking the record, since my goal is to avoid colliding records in case 50 users try to commit a change at once?
Any help as to how to I can lock the record for the duration of the edit, is greatly appreciated!
Thank You!
Performing locking inside transaction makes perfectly sense. Lock is automatically released in the end of the transaction (commit / rollback). Locking outside of transaction (in context of JPA) does not make sense, because releasing lock is tied to end of the transaction. Also otherwise locking after changes are performed and transaction is committed does not make too much sense.
It can be that you are using pessimistic locking to purpose other than what they are really for. If my assumption is wrong, then you can ignore end of the answer. When your transaction holds pessimistic read lock on entity (row), following is guaranteed:
No dirty reads: other transactions cannot see results of operations you performed to locked rows.
Repeatable reads: no modifications from other transactions
If your transaction modifies locked entity, PESSIMISTIC_READ is upgraded to PESSIMISTIC_WRITE or transaction fails if lock cannot be upgraded.
Following coarsely describes scenario with obtaining locking in the beginning of transaction:
entityManager.getTransaction().begin();
r = entityManager.find(Route.class, r.getPrimaryKey(),
LockModeType.PESSIMISTIC_READ);
//from this moment on we can safely read r again expect no changes
r.setRoute(txtRoute.getText());
entityManager.persist(r);
//When changes are flushed to database, provider must convert lock to
//PESSIMISTIC_WRITE, which can fail if concurrent update
entityManager.getTransaction().commit();
Often databases do not have separate support for pessimistic read, so you are actually holding lock to row since PESSIMISTIC_READ. Also using PESSIMISTIC_READ makes sense only if no changes to the locked row are expected. In case above changes are done always, so using PESSIMISTIC_WRITE from the beginning on is reasonable, because it saves you from the risk of concurrent update.
In many cases it also makes sense to use optimistic instead of pessimistic locking. Good examples and some comments about choosing between locking strategies can be found from: Locking and Concurrency in Java Persistence 2.0
Great work attempting to be safe in write locking your changing data. :) But you might be going overboard / doing it the long way.
First a minor point. The call to persist() isn't needed. For update, just modify the attributes of the entity returned from find(). The entityManager automatically knows about the changes and writes them to the db during commit. Persist is only needed when you create a new object & write it to the db for the first time (or add a new child object to a parent relation and which to cascade the persist via cascade=PERSIST).
Most applications have a low probability of 'clashing' concurrent updates to the same data by different threads which have their own separate transactions and separate persistent contexts. If this is true for you and you would like to maximise scalability, then use an optimistic write lock, rather than a pessimistic read or write lock. This is the case for the vast majority of web applications. It gives exactly the same data integrity, much better performance/scalability, but you must (infrequently) handle an OptimisticLockException.
optimistic write locking is built-in automatically by simply having a short/integer/long/TimeStamp attribute in the db and entity and annotating it in the entity with #Version, you do not need to call entityManager.lock() in that case
If you were satisfied with the above, and you added a #Version attribute to your entity, your code would be:
try {
entityManager.getTransaction().begin();
r = entityManager.find(Route.class, r.getPrimaryKey());
r.setRoute(txtRoute.getText());
entityManager.getTransaction().commit();
} catch (OptimisticLockException e) {
// Logging and (maybe) some error handling here.
// In your case you are lucky - you could simply rerun the whole method.
// Although often automatic recovery is difficult and possibly dangerous/undesirable
// in which case we need to report the error back to the user for manual recovery
}
i.e. no explicit locking at all - the entity manager handles it automagically.
IF you had a strong need to avoid concurrent data update "clashes", and are happy to have your code with limited scalability then serialise data access via pessimistic write locking:
try {
entityManager.getTransaction().begin();
r = entityManager.find(Route.class, r.getPrimaryKey(), LockModeType.PESSIMISTIC_WRITE);
r.setRoute(txtRoute.getText());
entityManager.getTransaction().commit();
} catch (PessimisticLockException e) {
// log & rethrow
}
In both cases, a successful commit or an exception with automatic rollback means that any locking carried out is automatically cleared.
Cheers.
I am reading the official GAE documentation on transactions and I can't understand when a ConcurrentModificationException is thrown.
Look at one of the examples which I am copy-pasting here:
int retries = 3;
while (true) {
Transaction txn = datastore.beginTransaction();
try {
Key boardKey = KeyFactory.createKey("MessageBoard", boardName);
Entity messageBoard = datastore.get(boardKey);
long count = (Long) messageBoard.getProperty("count");
++count;
messageBoard.setProperty("count", count);
datastore.put(messageBoard);
txn.commit();
break;
} catch (ConcurrentModificationException e) {
if (retries == 0) {
throw e;
}
// Allow retry to occur
--retries;
} finally {
if (txn.isActive()) {
txn.rollback();
}
}
}
Now, all the writes to the datastore (in this example) are wrapped under a transaction. So why would a ConcurrentModificationException be thrown?
Does it happen when some other code which is not wrapped in a transaction updates the same entity that is being modified by the above code? If I ensure that all code that updates an Entity is always wrapped in a transaction, is it guaranteed that I won't get a ConcurrentModificationException?
I found the answer on the GAE mailing list.
I had a misconceived notion of how transactions work in GAE. I had imagined that beginning a transaction will lock out any concurrent updates to the datastore until the transaction commits. That would have been a performance nightmare as all updates would block on this transaction and I am happy that this isn't the case.
Instead, what happens is, the first update wins, and if a collision is detected in subsequent updates, then an exception is thrown.
This surprised me at first, because it means many transactions will need a retry logic. But it seems similar to the PostgreSQL semantics for "serializable isolation" level, though in PostgreSQL you also have the option to lock individual rows and columns.
It seems that you're doing what they suggest you shouldn't do: http://code.google.com/appengine/docs/java/datastore/transactions.html#Uses_for_Transactions
Warning! The above sample depicts transactionally incrementing a counter only for the sake of simplicity. If your app has counters that are updated frequently, you should not increment them transactionally, or even within a single entity. A best practice for working with counters is to use a technique known as counter-sharding.
Perhaps the above warning doesn't apply, but what follows after it seems to hint at the issue you're seeing:
This requires a transaction because the value may be updated by another user after this code fetches the object, but before it saves the modified object. Without a transaction, the user's request uses the value of count prior to the other user's update, and the save overwrites the new value. With a transaction, the application is told about the other user's update. If the entity is updated during the transaction, then the transaction fails with a ConcurrentModificationException. The application can repeat the transaction to use the new data.
In other words: it seems that somebody is modifying your entity without using a transaction at the same time that you're updating the same entity with a transaction.
Note: In extremely rare cases, the transaction is fully committed even if a transaction returns a timeout or internal error exception. For this reason, it's best to make transactions idempotent whenever possible.
A fair warning: I'm not familiar with the library, but the above quotes were taken from the documentation showing sample transactions (which seems identical to what you've posted in the original question).
I was going through ACID properties regarding Transaction and encountered the statement below across the different sites
ACID is the acronym for the four properties guaranteed by transactions: atomicity, consistency, isolation, and durability.
**My question is specifically about the phrase.
guaranteed by transactions
**. As per my experience these properties are not taken care by
transaction automatically. But as a java developer we need to ensure that these properties criteria are met.
Let's go through for each property:-
Atomicity:- Assume when we create the customer the account should be created too as it is compulsory. So now during transaction
the customer gets created while during account creation some exception oocurs. So the developer can now go two ways: either he rolls back the
complete transaction (atomicity is met in this case) or he commits the transaction so customer will be created but not the
account (which violates the atomicity). So responsibility lies with developer?
Consistency:- Same reason holds valid for consistency too
Isolation :- as per definition isolation makes a transaction execute without interference from another process or transactions.
But this is achieved when we set the isolation level as Serializable. Otherwis in another case like read commited or read uncommited
changes are visible to other transactions. So responsibility lies with the developer to make it really isolated with Serializable?
Durability:- If we commit the transaction, then even if the application crashes, it should be committed on restart of application. Not sure if it needs to be taken care by developer or by database vendor/transaction?
So as per my understanding these ACID properties are not guaranteed automatically; rather we as a developer sjould achieve them. Please let me know
if above understanding regarding each point is correct? Would appreciate if you folks can reply for each point(yes/no will also do.
As per my understanding read committed should be most logical isolation level in most application, though it depends on requirement too.
The transactions guarantees ACID more or less:
1) Atomicity. Transaction guarantees all changes are made or none of them. But you need to manually set the start and end of a transaction and manually perform commit or rollback. Depending on the technology you use (EJB...), transactions are container-managed, setting the start and end to the whole "method" you are creating. You can control by configuration if a method invoked requires a new transaction or an existing one, no transaction...
2) Consistency. Guaranteed by atomicity.
3) Isolation. You must define the isolation level your application needs. Default value is defined depending upon the database, container... The commonest one is READ COMMITTED. Be careful with locks as can cause dead-lock depending on your logic and isolation level.
4) Durability. Managed entirely by the database. If your commit executes without error, nearly all database guarantees durability of changes, but some scenarios can cause to not guarantee that (writes to disk are cached in memory and flushed later...)
In general, you should be aware of transactions and configure it in the container of declare by code the star and end (commit, rollback).
Database transactions are atomic: They either happen in their entirety or not at all. By itself, this says nothing about the atomicity of business transactions. There are various strategies to map business transactions to database transactions. In the simplest case, a business transaction is implemented by one database transaction (where a business transaction is aborted by rolling back the database one). Then, atomicity of database transactions implies atomicity of business transactions. However, things get tricky once business transactions span several database transactions ...
See above.
Your statement is correct. Often, the weaker guarantees are sufficient to prove correctness.
Database transactions are durable (unless there is a hardware failure): if the transaction has committed, its effect will persist until other transactions change the data. However, calling code might not learn whether a transaction has comitted if the database or the network between database and calling code fails. Therefore
If we commit the transaction, then even if application crash, it should be committed on restart of application.
is wrong. If the transaction has committed, there is nothing left to do.
To summarize, the database does give strong guarantees - about the behaviour of the database. Obviously, it can not give guarantees about the behaviour of the entire application.