I kinda understand the purpose of entity locking and transaction isolation levels, but I can't get the difference between pessimistic locking and the serializable isolation level. As I understand it, in both cases the table gets locked and no other transaction can access it, so in both cases the DB takes action to prevent concurrent modifications, which makes it look like there's no difference. Could someone please explain whether there actually is a difference here?
(I don't assume you're using ObjectDB. You'll probably get better answers if you edit your question, and include the specific database you're using with JPA.)
I don't like the terms optimistic locking and pessimistic locking. I think optimistic concurrency control and pessimistic concurrency control are more accurate. Locks are the most common way to deal with concurrency control problems, but they're not the only way. (Date's chapter on concurrency in An Introduction to Database Systems is about 25 pages long.)
The topics of transaction management and concurrency control aren't limited to the relational model of data or to SQL database management systems (dbms). Transaction isolation levels have to do with SQL.
Pessimistic concurrency control really means only that you expect the dbms to prevent other transactions from accessing something when the dbms starts processing your request. Behavior is up to the dbms vendor. Different vendors might prevent access by locking the entire database, locking some tables, locking some pages, or locking some rows. Or the dbms might prevent access in some other way that doesn't directly involve locks.
Transaction isolation levels are how SQL tries to solve concurrency control problems. Transaction isolation levels are defined in SQL standards.
The serializable transaction isolation level guarantees that the effect of concurrent, serializable transactions is the same as running them one at a time in some particular order. The guarantee describes the effect--not any particular kind of concurrency control or locking needed to achieve that effect.
Pessimistic locking normally involves taking write locks in the database so that changes can be made in a safe, exclusive way. This is normally done with select ... for update. It will prevent or delay other connections from doing their own select ... for update, or from changing the locked records, until the first connection's transaction is completed.
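In JPA terms this usually means requesting a pessimistic lock mode when reading the entity. A minimal sketch, assuming an Account entity and a plain EntityManager (both illustrative, not from the question); most providers, Hibernate included, translate PESSIMISTIC_WRITE into select ... for update:

import java.math.BigDecimal;
import javax.persistence.EntityManager;
import javax.persistence.LockModeType;

public class AccountService {

    // Sketch only: Account, accountId and the balance logic are illustrative.
    public void debit(EntityManager em, Long accountId, BigDecimal amount) {
        em.getTransaction().begin();
        // The provider typically issues "select ... for update" for this call,
        // so other connections block (or time out) on the same row.
        Account account = em.find(Account.class, accountId, LockModeType.PESSIMISTIC_WRITE);
        account.setBalance(account.getBalance().subtract(amount));
        em.getTransaction().commit(); // committing releases the lock
    }
}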
The serializable isolation level is not concerned with changes as such, but makes sure that after the transaction starts, the result of its reads will always stay the same (except for changes made by the transaction itself) until that transaction ends. To support this, a non-MVCC DBMS must take many locks (one on each record read by the connection working at serializable) and may therefore hinder concurrency considerably.
The same effect can also be achieved without locking when the database provides MVCC, as Oracle, MySQL/InnoDB, MariaDB and PostgreSQL do.
Related
I have a bank project in which customer balances should be updated by parallel threads in parallel applications. I hold customer balances in an Oracle database. My Java applications will be implemented with Spring and Hibernate.
How can I handle the race condition between parallel applications? Should my solution be at the database level or at the application level?
I assume what you would like to know is how to handle concurrency, preventing race conditions which can occur when two parts of the application modify and accidentally overwrite the same data.
You have mostly two strategies for this: pessimistic locking and optimistic locking:
Pessimistic locking
Here you assume that the likelihood of two threads overwriting the same data is high, so you would like the situation to be handled in a transparent way. To handle this, increase the isolation level of your Spring transactions from its default value of READ_COMMITTED to, for example, REPEATABLE_READ, which should be sufficient in most cases:
@Transactional(isolation = Isolation.REPEATABLE_READ)
public void yourBusinessMethod() {
    ...
}
In this case, if you read some data at the beginning of the method, you are sure that no one can overwrite that data in the database while your method is ongoing. Note that it's still possible for another thread to insert extra records that match a query you made (a problem known as phantom reads), but not to change the records you already read.
If you want to protect against phantom reads as well, you need to upgrade the isolation level to SERIALIZABLE. The improved isolation comes at a performance cost: your program will run slower and will more frequently 'hang' waiting for another part of the program to finish.
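If you go that route, only the isolation attribute changes; a minimal sketch reusing yourBusinessMethod from above:

@Transactional(isolation = Isolation.SERIALIZABLE)
public void yourBusinessMethod() {
    // reads are now also protected against phantom rows, at a performance cost
}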
Optimistic Locking
Here you assume that data access collisions are rare, and that in the rare cases where they occur they are easily recoverable by the application. In this mode, you keep all your business methods at their default READ_COMMITTED level.
Then each Hibernate entity is marked with a version column:
@Entity
public class SomeEntity {
    ...
    @Version
    private Long version;
}
With this, each entity read from the database is versioned using the version column. When Hibernate writes changes to an entity back to the database, it will check whether the version was incremented since that transaction last read the entity.
If so, it means someone else modified the data in the meantime and decisions were made based on stale data. In this case a StaleObjectStateException is thrown, which needs to be caught by the application and handled, ideally at a central place.
In the case of a GUI, you usually catch the exception and show a message saying "user xyz changed this data while you were also editing it; your changes are lost. Press OK to reload the new data."
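A minimal sketch of such central handling; customerService, Customer, showMessage and reloadCustomer are hypothetical names, not from the answer. Note that when the save runs inside a Spring-managed transaction, Hibernate's StaleObjectStateException is usually surfaced wrapped in Spring's ObjectOptimisticLockingFailureException:

import org.hibernate.StaleObjectStateException;
import org.springframework.orm.ObjectOptimisticLockingFailureException;

public void saveFromGui(Customer editedCustomer) {
    try {
        customerService.updateCustomer(editedCustomer);   // hypothetical service call
    } catch (StaleObjectStateException | ObjectOptimisticLockingFailureException e) {
        // another user saved a newer version while this one was being edited
        showMessage("User xyz changed this data while you were also editing it. "
                + "Your changes are lost. Press OK to reload the new data.");
        reloadCustomer(editedCustomer.getId());           // hypothetical reload helper
    }
}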
With optimistic locking your program will run faster, but the application needs to handle some concurrency aspects that would otherwise be transparent with pessimistic locking: versioning entities and catching exceptions.
The most frequently used method is optimistic locking, as it seems to be acceptable in most applications. With pessimistic locking it's very easy to cause performance problems, especially when data access collisions are rare and can be solved in a simple way.
Nothing prevents you from mixing both concurrency handling methods in the same application if needed.
I want to use Spring-Hibernate and JDBC together in my application.
Hibernate should do all the updating and writing from one thread and other threads should just be able to read from the database without too much synchronization effort.
Will those JDBC-using threads deliver correct results (if they read from the database a short time after calling persist() or merge()), or could it happen that Hibernate has not flushed the updates yet and the other threads therefore read wrong database entries?
"Wrong" depends on the isolation level you set for your connection pool.
I think it can work if Hibernate and Spring share the same connection pool and you set the isolation level to SERIALIZABLE for all connections.
Long-running transactions will be the problem. If all your write operations are fast you won't block. If you don't commit and flush updates quickly the read operations will either have to block and wait OR allow "dirty reads".
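A minimal sketch of what "SERIALIZABLE for all connections" can look like at the JDBC level; dataSource is a placeholder for the shared pool, and in practice you would usually set the equivalent default in the pool's configuration instead of per connection:

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public Connection serializableConnection(DataSource dataSource) throws SQLException {
    Connection con = dataSource.getConnection();        // dataSource: the shared pool
    con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
    // Reads on this connection now run at SERIALIZABLE, at the price of more
    // blocking (or serialization failures) under concurrent writes.
    return con;
}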
That depends. You're basically describing a race condition - if you want to make sure that your read-thread only reads after the write-thread has persisted, you will have to look into thread synchronization methodology.
Although I have learned about and can reference the JPA 2.0 pessimistic lock modes,
I don't know where they are used and how I can test them.
What is the best way to test this without using a web client?
What would be the best example?
See Locking and Concurrency in Java Persistence 2.0
Pessimistic locking assumes that transactions will frequently collide. In pessimistic locking, a transaction that reads the data locks it. Another transaction cannot change the data until the first transaction completes (commits or rolls back).
So if you use a pessimistic lock you block the entire entity (at least its row) and nobody else can read or write it at the same time.
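One way to test it without any web client is a plain JUnit-style method that opens two EntityManagers and lets the second one fail to acquire the lock. A minimal sketch, assuming an EntityManagerFactory emf and an already persisted SomeEntity with id 1L (all names illustrative), and noting that a lock timeout of 0 is not honoured by every database:

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManager;
import javax.persistence.LockModeType;
import javax.persistence.LockTimeoutException;
import javax.persistence.PessimisticLockException;

public void pessimisticWriteBlocksSecondTransaction() {
    EntityManager em1 = emf.createEntityManager();
    EntityManager em2 = emf.createEntityManager();
    try {
        em1.getTransaction().begin();
        // first transaction takes the row lock
        em1.find(SomeEntity.class, 1L, LockModeType.PESSIMISTIC_WRITE);

        em2.getTransaction().begin();
        Map<String, Object> hints = new HashMap<>();
        hints.put("javax.persistence.lock.timeout", 0); // fail fast instead of waiting
        try {
            em2.find(SomeEntity.class, 1L, LockModeType.PESSIMISTIC_WRITE, hints);
            // reaching this line would mean the row was not actually locked
        } catch (PessimisticLockException | LockTimeoutException expected) {
            // expected: the second transaction cannot touch the locked row
        }
    } finally {
        if (em2.getTransaction().isActive()) em2.getTransaction().rollback();
        if (em1.getTransaction().isActive()) em1.getTransaction().rollback();
        em2.close();
        em1.close();
    }
}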
I was going through the ACID properties of transactions and encountered the statement below on several different sites:
ACID is the acronym for the four properties guaranteed by transactions: atomicity, consistency, isolation, and durability.
My question is specifically about the phrase "guaranteed by transactions". In my experience these properties are not taken care of by the transaction automatically; as Java developers we need to ensure that these criteria are met.
Let's go through each property:
Atomicity: assume that when we create a customer, an account should be created too, as it is compulsory. Now, during the transaction, the customer gets created but some exception occurs during account creation. The developer can now go two ways: either he rolls back the complete transaction (atomicity is met in this case) or he commits the transaction, so the customer is created but not the account (which violates atomicity). So does the responsibility lie with the developer?
Consistency: the same reasoning holds for consistency too.
Isolation: by definition, isolation makes a transaction execute without interference from other processes or transactions. But this is only achieved when we set the isolation level to SERIALIZABLE. Otherwise, at levels like READ COMMITTED or READ UNCOMMITTED, changes are visible to other transactions. So does the responsibility lie with the developer to make it really isolated by using SERIALIZABLE?
Durability: if we commit the transaction, then even if the application crashes, it should be committed on restart of the application. I'm not sure whether this needs to be taken care of by the developer or by the database vendor/transaction manager.
So as per my understanding these ACID properties are not guaranteed automatically; rather, we as developers should achieve them. Please let me know whether my understanding of each point is correct. I would appreciate it if you could reply for each point (a yes/no will also do).
As per my understanding, READ COMMITTED should be the most logical isolation level for most applications, though it also depends on the requirements.
Transactions guarantee ACID more or less:
1) Atomicity. A transaction guarantees that all changes are made or none of them. But you need to manually set the start and end of a transaction and manually perform the commit or rollback. Depending on the technology you use (EJB, ...), transactions can be container-managed, with the start and end set around the whole method you are writing. You can then control by configuration whether an invoked method requires a new transaction, an existing one, no transaction...
2) Consistency. Guaranteed by atomicity.
3) Isolation. You must define the isolation level your application needs. The default value depends on the database, the container... The most common one is READ COMMITTED. Be careful with locks, as they can cause deadlocks depending on your logic and isolation level.
4) Durability. Managed entirely by the database. If your commit executes without error, nearly all databases guarantee durability of the changes, but some scenarios can break that guarantee (for example, writes to disk that are cached in memory and flushed later...).
In general, you should be aware of transactions and either configure them in the container or declare the start and end (commit, rollback) in code.
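A minimal JDBC sketch of declaring the start and end by code, reusing the customer/account example from the question; dataSource and the SQL strings are placeholders, not real statements:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public void createCustomerWithAccount(DataSource dataSource) throws SQLException {
    try (Connection con = dataSource.getConnection()) {
        con.setAutoCommit(false);                            // start of the transaction
        try (Statement st = con.createStatement()) {
            st.executeUpdate("insert into customer ...");    // placeholder SQL
            st.executeUpdate("insert into account ...");     // placeholder SQL
            con.commit();                                    // both rows become permanent together
        } catch (SQLException e) {
            con.rollback();                                  // neither row is kept (atomicity)
            throw e;
        }
    }
}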
Database transactions are atomic: They either happen in their entirety or not at all. By itself, this says nothing about the atomicity of business transactions. There are various strategies to map business transactions to database transactions. In the simplest case, a business transaction is implemented by one database transaction (where a business transaction is aborted by rolling back the database one). Then, atomicity of database transactions implies atomicity of business transactions. However, things get tricky once business transactions span several database transactions ...
See above.
Your statement is correct. Often, the weaker guarantees are sufficient to prove correctness.
Database transactions are durable (unless there is a hardware failure): if the transaction has committed, its effect will persist until other transactions change the data. However, the calling code might not learn whether a transaction has committed if the database, or the network between the database and the calling code, fails. Therefore
If we commit the transaction, then even if the application crashes, it should be committed on restart of the application.
is wrong: if the transaction has committed, there is nothing left to do on restart, because the database has already made the changes permanent.
To summarize, the database does give strong guarantees, but only about the behaviour of the database itself. Obviously, it cannot give guarantees about the behaviour of the entire application.
I have the flow below in a multi-threaded environment:
start a transaction
read the top n rows (based on a column) from the DB
check some criteria
update that set of rows
commit/rollback the transaction
I am using optimistic locking to handle the multi-threaded scenario, but in the above situation the DB always returns the same set of rows, so if a second thread runs at the same time it will always fail.
Is there a better way to handle this?
Could we force the DB to return a different set of rows for each transaction using some option?
The reason you are getting the same top n records for all your threads is because of the I in the ACID (atomicity, consistency, isolation, durability) principles of transactions. Isolation means other operations cannot access data that has been modified during a transaction that has not yet completed. So until your threads commit their transactions the other threads cannot see what they have done.
It is possible to change the Isolation level on most databases to one of the following:
SERIALIZABLE
REPEATABLE READ
READ COMMITTED
READ UNCOMMITTED
In your case you probably want READ UNCOMMITTED as it allows one transaction to see uncommitted changes made by some other transaction.
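With Spring-managed transactions (as in the earlier answers on this page) that would just be a different isolation attribute; a minimal sketch with an illustrative method name:

@Transactional(isolation = Isolation.READ_UNCOMMITTED)
public void processTopRows() {
    // reads here may see rows another thread has modified but not yet committed,
    // so two threads are less likely to pick exactly the same "top n" set
}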
Note: this is almost certainly the wrong isolation level for most applications and could lead to data corruption. If applications other than the one you described here are accessing the same database, you probably don't want to change the isolation level, as those applications may start to see unexpected and incorrect behaviour.