Maybe somebody can help me with a transactional issue in Spring (3.1) / PostgreSQL (8.4.11).
My transactional service is as follows:
@Transactional(isolation = Isolation.SERIALIZABLE, readOnly = false)
@Override
public Foo insertObject(Bar bar) {
    // these methods are just examples
    int x = firstDao.getMaxNumberOfAllowedObjects(bar);
    int y = secondDao.getNumerOfExistingObjects(bar);
    // comparison
    if (x - y > 0) {
        secondDao.insertNewObject(...);
    }
    ....
}
The Spring webapp configuration contains:
@Configuration
@EnableTransactionManagement
public class ....{

    @Bean
    public DataSource dataSource() {
        org.apache.tomcat.jdbc.pool.DataSource ds = new DataSource();
        // ....configuration details
        return ds;
    }

    @Bean
    public DataSourceTransactionManager txManager() {
        return new DataSourceTransactionManager(dataSource());
    }
}
Let us say a request "x" and a request "y" execute concurrently and both arrive at the comment "comparison" (method insertObject). Then both of them are allowed to insert a new object and their transactions are committed.
Why am I not getting a RollbackException? As far as I know, that is what the SERIALIZABLE isolation level is for. Coming back to the previous scenario: if "x" manages to insert a new object and commit its transaction, then "y"'s transaction should not be allowed to commit since there is a new object it did not read.
That is, if "y" could read the value of secondDao.getNumerOfExistingObjects(bar) again, it would realize that there is one more object. A phantom read?
The transaction configuration seems to be working fine:
For each request I can see the same connection for firstDao and secondDao.
A transaction is created every time insertObject is invoked.
Both first and second DAOs are as follows:
@Autowired
public void setDataSource(DataSource dataSource) {
    this.jdbcTemplate = new JdbcTemplate(dataSource);
}

@Override
public Object daoMethod(Object param) {
    // uses jdbcTemplate
}
I am sure I am missing something. Any idea?
Thanks for your time,
Javier
TL;DR: Detection of serializability conflicts improved dramatically in Pg 9.1, so upgrade.
It's tricky to figure out from your description what the actual SQL is and why you expect to get a rollback. It looks like you've seriously misunderstood serializable isolation, perhaps thinking it perfectly tests all predicates, which it doesn't, especially not in Pg 8.4.
SERIALIZABLE doesn't perfectly guarantee that the transactions execute as if they were run in series - doing so would be prohibitively expensive from a performance point of view, if it were possible at all. It only provides limited checking. Exactly what is checked and how varies from database to database and version to version, so you need to read the docs for your version of your database.
Anomalies are possible, where two transactions executing in SERIALIZABLE mode produce a different result to if those transactions truly executed in series.
Read the documentation on transaction isolation in Pg to learn more. Note that SERIALIZABLE changed behaviour dramatically in Pg 9.1, so make sure to read the version of the manual appropriate for your Pg version. Here's the 8.4 version. In particular read 13.2.2.1. Serializable Isolation versus True Serializability. Now compare that to the greatly improved predicate locking based serialization support described in the Pg 9.1 docs.
It looks like you're trying to perform logic something like this pseudocode:
count = query("SELECT count(*) FROM the_table");
if (count < threshold):
query("INSERT INTO the_table (...) VALUES (...)");
If so, that's not going to work in Pg 8.4 when executed concurrently - it's pretty much the same as the anomaly example used in the documentation linked above. Amazingly it actually works on Pg 9.1; I didn't expect even 9.1's predicate locking to catch use of aggregates.
You write that:
Coming back to the previous scenario, if x manages to insert a new
object and commits its transaction, then "y"'s transaction should not
be allowed to commit since there is a new object he did not read.
but 8.4 won't detect that the two transactions are interdependent, something you can trivially prove by using two psql sessions to test it. It's only with the true-serializability stuff introduced in 9.1 that this will work - and frankly, I was surprised it works in 9.1.
If you want to do something like enforce a maximum row count in Pg 8.4, you need to LOCK the table to prevent concurrent INSERTs, doing the locking either manually or via a trigger function. Doing it in a trigger will inherently require a lock promotion and thus will frequently deadlock, but will successfully do the job. It's better done in the application, where you can issue LOCK TABLE my_table IN EXCLUSIVE MODE before even SELECTing from the table, so it already holds the highest lock mode it will need on the table and thus shouldn't need deadlock-prone lock promotion. The EXCLUSIVE lock mode is appropriate because it permits SELECTs but nothing else.
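Adapted to the Spring/JdbcTemplate setup from the question, that could look roughly like the sketch below; the table and column names are made up, and the essential part is simply taking the EXCLUSIVE table lock first, inside the same transaction as the count and the insert:
public class LimitedInsertService {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    @Transactional
    public void insertIfBelowLimit(int maxAllowed) {
        // Lock first so no concurrent INSERT can slip in between our count and our insert.
        // EXCLUSIVE still allows plain SELECTs but blocks other writers, and taking it
        // before the first read avoids deadlock-prone lock promotion.
        jdbcTemplate.execute("LOCK TABLE the_table IN EXCLUSIVE MODE");

        int existing = jdbcTemplate.queryForObject("SELECT count(*) FROM the_table", Integer.class);
        if (existing < maxAllowed) {
            jdbcTemplate.update("INSERT INTO the_table (x) VALUES (?)", "bob");
        }
        // The table lock is released when the surrounding transaction commits or rolls back.
    }
}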
Here's how to test the serialization behaviour in two psql sessions:
SESSION 1                                   SESSION 2

create table ser_test( x text );

BEGIN TRANSACTION
ISOLATION LEVEL SERIALIZABLE;

                                            BEGIN TRANSACTION
                                            ISOLATION LEVEL SERIALIZABLE;

SELECT count(*) FROM ser_test;

                                            SELECT count(*) FROM ser_test;

INSERT INTO ser_test(x) VALUES ('bob');

                                            INSERT INTO ser_test(x) VALUES ('bob');

COMMIT;

                                            COMMIT;
When run on Pg 9.1, the first COMMIT succeeds, then the second COMMIT fails with:
regress=# COMMIT;
ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
HINT: The transaction might succeed if retried.
but when run on 8.4 both COMMITs succeed, because 8.4 didn't have the predicate locking code for serializability that was added in 9.1.
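Note the HINT in the 9.1 error above: even on 9.1 the application is expected to retry the failed transaction. A minimal hand-rolled retry around the (proxied) transactional service method might look like the sketch below. It assumes Spring translates the serialization failure (SQLSTATE 40001) into a ConcurrencyFailureException, which depends on your exception translator, and it must live outside the @Transactional method so that each attempt gets a fresh transaction:
public Foo insertObjectWithRetry(Bar bar) {
    int attempts = 0;
    while (true) {
        try {
            return service.insertObject(bar);      // the SERIALIZABLE method from the question
        } catch (ConcurrencyFailureException e) {  // assumed mapping of the serialization failure
            if (++attempts >= 3) {
                throw e;                           // give up after a few attempts
            }
            // optionally back off briefly before retrying
        }
    }
}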
Related
I've seen articles saying that we should try to limit the scope of transaction, e.g. instead of doing this:
@Transactional
public void save(User user) {
    queryData();
    addData();
    updateData();
}
We should exclude queryData from the transaction by using Spring's TransactionTemplate (or just move it out of the transactional method):
@Autowired
private TransactionTemplate transactionTemplate;

public void save(final User user) {
    queryData();
    transactionTemplate.execute(status -> {
        addData();
        updateData();
        return Boolean.TRUE;
    });
}
But my understanding is that since JDBC always needs a transaction for every operation, if I use the second way there will be two transactions opened and closed: one for queryData (opened by JDBC), and another, opened by our class, for the code inside transactionTemplate.execute. If so, won't this be a waste of resources, now that one transaction has been split into two?
If a transaction starts, it uses up one DB connection. So we generally want the transaction to complete as fast as possible, and to delay starting it until we really need to access the DB, so that the connection pool has more available connections for other requests to use.
So if part of the workflow within your function takes some time to finish and does not need to access the DB, it is indeed better to limit the scope of the transaction to exclude that part of the code, as sketched below.
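For example (a sketch only; callRemoteService and RemoteData are made-up placeholders for slow work that does not touch the database):
public void save(final User user) {
    // Slow, non-DB work: keep it outside the transaction so no pooled
    // connection is held while it runs (callRemoteService is hypothetical).
    RemoteData data = callRemoteService(user);

    // Only now is a connection bound and a transaction opened.
    transactionTemplate.execute(status -> {
        addData(data);
        updateData(user);
        return Boolean.TRUE;
    });
}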
But in your example, both parts are executed in series and both need to access the DB, so I don't see any point in separating them into two different transactions.
Also, in terms of Hibernate, it is very normal to load and update the entities in the same transaction, so that you do not need to deal with detached entities, which you would if the entities you update had been loaded in another, already closed transaction. Dealing with detached entities is not easy if you are not familiar with Hibernate.
Sample Scenario
I have a limit that controls the total value of a column. If I make a save that exceeds this limit, I want it to throw an exception. For example:
Suppose I have already added the following data: LIMIT = 20
id | code | value
---|------|------
1  | A    | 15
2  | A    | 5
3  | B    | 12
4  | B    | 3
If I insert (A, 2), it exceeds the limit (15 + 5 + 2 > 20) and I want to get an exception
If I insert (B, 4), the transaction should succeed since it doesn't exceed the limit (12 + 3 + 4 <= 20)
code and value are interrelated
What can I do?
I can check this with the required queries: for example, write a method for it and call it in the save method. That works.
However, I'm looking for a more useful solution than this.
For example, is there any annotation I can use when designing the entity?
Can I do this without calling the method that performs this check every time?
What examples can I give? @UniqueConstraint, for instance, checks whether the same values are being added.
Using transaction
The most common and long-accepted way is to simply abstract in a suitable form (in a class, a library, a service, ...) the business rules that govern the behavior you describe, within a transaction:
@Transactional(propagation = Propagation.REQUIRED)
public RetType operation(ReqType args) {
    ...
    perform operations;
    ...
    if (fail post conditions)
        throw ...;
    ...
}
In this case, if there is already an open transaction when the method is called, that transaction is used (and there are no interlocks); if there is no transaction yet, a new one is created, so that both the operations and the postcondition check are performed within the same transaction.
Note that with this strategy both the operations and the invariant check can span multiple transactional resources managed by the TransactionManager (e.g. Redis, MySQL, MQS, ...) simultaneously and in a coordinated manner.
Using only the database
It has fallen out of use (in favor of the first way), but TRIGGERS were the canonical option some decades ago for checking postconditions; this solution is usually coupled to the specific database engine (e.g. PostgreSQL or MySQL).
It can be useful when the client making the modifications is unable or cannot be trusted to check the postconditions within a transaction (e.g. bash processes). But nowadays it is infrequent.
The use of TRIGGERS may also be preferable in certain scenarios where efficiency is required, as there are certain optimization options within the database scripts.
Neither Hibernate nor Spring Data JPA has anything built-in for this scenario. You have to program the transaction logic in your repository yourself:
@PersistenceContext
EntityManager em;

public void addValue(String code, int value) {
    // SUM over an integer column returns a Long in JPQL, or null when no rows match
    var checkQuery = em.createQuery(
            "SELECT SUM(e.value) FROM Entity e WHERE e.code = :code", Long.class);
    checkQuery.setParameter("code", code);
    Long sum = checkQuery.getSingleResult();
    if ((sum == null ? 0 : sum) + value > 20) {
        throw new LimitExceededException("attempted to exceed limit for " + code);
    }
    var newEntity = new Entity();
    newEntity.setCode(code);
    newEntity.setValue(value);
    em.persist(newEntity);
}
Then (it's important!) you have to define the SERIALIZABLE isolation level on the @Transactional annotations for the methods that work with this table.
Read more about the serializable isolation level here; they have an oddly similar example.
Note that you have to consider retrying the failed transaction.
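One possible way to wire up both the isolation level and the retry is sketched below. It is only a sketch and makes a few assumptions: the spring-retry library is on the classpath, @EnableRetry is configured on a configuration class, the serialization failure surfaces as a ConcurrencyFailureException (which depends on your provider and exception translator), and the retry advice runs outside the transaction advice:
// SERIALIZABLE makes concurrent inserts for the same code conflict instead of both committing.
// @Retryable (from spring-retry) re-invokes the method, and thus a fresh transaction,
// when the previous attempt failed with a concurrency error.
@Retryable(value = ConcurrencyFailureException.class, maxAttempts = 3)
@Transactional(isolation = Isolation.SERIALIZABLE)
public void addValueWithRetry(String code, int value) {
    addValue(code, value);  // the EntityManager logic shown above
}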
You should use a singleton (javax.ejb.Singleton):
@Singleton
public class Register {

    @Lock(LockType.WRITE)
    public void register(String code, int value) {
        if (i_can_insert_modify(code, value)) {
            // use entityManager or some dao
        } else {
            // do something
        }
    }
}
I am trying to perform batch inserts with data that is currently being inserted into the DB one statement per transaction. The transactional code looks similar to the snippet below. Currently, the addHolding() method is called for each quote that comes in from an external feed, and each of these quote updates happens about 150 times per second.
public class HoldingServiceImpl {

    @Autowired
    private HoldingDAO holdingDao;

    @Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = Exception.class)
    public void addHolding(Quote quote) {
        Holding holding = transformQuote(quote);
        holdingDao.addHolding(holding);
    }
}
And the DAO gets the current session from the Hibernate SessionFactory and calls save on the object:
public class HoldingDAOImpl {

    @Autowired
    private SessionFactory sessionFactory;

    public void addHolding(Holding holding) {
        sessionFactory.getCurrentSession().save(holding);
    }
}
I have looked at the Hibernate batching documentation, but it is not clear from the document how I would organize the code for batch inserting in this case, since I don't have the full list of data at hand but am waiting for it to stream in.
Does merely setting the Hibernate batching properties in the properties file (e.g. hibernate.jdbc.batch_size=20) "magically" batch-insert these? Or will I need to, say, capture each quote update in a synchronized list, and then insert the list's contents and clear the list when the batch size limit is reached?
Also, the whole purpose of implementing batching is to see if performance improves. If there is a better way to handle inserts in this scenario, let me know.
Setting the property hibernate.jdbc.batch_size=20 is an indication for Hibernate to group the insert statements into JDBC batches of 20 when the session is flushed.
When you call session.save(), the insert is only applied to the in-memory Hibernate cache (the session). Only once flush is called does Hibernate synchronize these changes with the database. Hence setting the Hibernate batch size is enough to get batched inserts. Fine-tune the batch size according to your needs.
Also make sure your transactions are handled properly: committing a transaction also forces Hibernate to flush the session.
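If you prefer to control the flushing explicitly (for example because the per-quote REQUIRES_NEW transaction in your current code prevents batching across calls), a common pattern is to buffer the incoming quotes and save them in one transaction, flushing and clearing every batch_size records. This is only a sketch; addHoldings and the batch size of 20 are illustrative:
@Transactional(rollbackFor = Exception.class)
public void addHoldings(List<Quote> quotes) {
    Session session = sessionFactory.getCurrentSession();
    int batchSize = 20;                      // keep in sync with hibernate.jdbc.batch_size
    int i = 0;
    for (Quote quote : quotes) {
        session.save(transformQuote(quote));
        if (++i % batchSize == 0) {
            session.flush();                 // send the batched INSERTs to the database
            session.clear();                 // detach saved entities to keep the session small
        }
    }
    // remaining entities are flushed when the transaction commits
}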
How to specify a @Lock timeout for a query?
I am using Oracle 11g; I hope I can use something like 'select id from table where id = ?1 for update wait 5'.
I defined the method like this:
@Lock(LockModeType.PESSIMISTIC_WRITE)
Stock findById(String id);
It seems to lock forever.
When I set javax.persistence.lock.timeout=0 in LocalContainerEntityManagerFactoryBean.jpaProperties, there is no effect.
To lock entities pessimistically, set the lock mode to
PESSIMISTIC_READ, PESSIMISTIC_WRITE, or
PESSIMISTIC_FORCE_INCREMENT.
If a pessimistic lock cannot be obtained, but the locking failure
doesn’t result in a transaction rollback, a LockTimeoutException is
thrown.
Pessimistic Locking Timeouts
The length of time in milliseconds the persistence provider should
wait to obtain a lock on the database tables may be specified using
the javax.persistence.lock.timeout property. If the time it takes to
obtain a lock exceeds the value of this property, a
LockTimeoutException will be thrown, but the current transaction
will not be marked for rollback. If this property is set to 0, the
persistence provider should throw a LockTimeoutException if it
cannot immediately obtain a lock.
If javax.persistence.lock.timeout is set in multiple places, the
value will be determined in the following order:
The argument to one of the EntityManager or Query methods.
The setting in the #NamedQuery annotation.
The argument to the Persistence.createEntityManagerFactory method.
The value in the persistence.xml deployment descriptor.
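For example, the first option in that list, passing the hint directly to an EntityManager call, might look like this (the 5-second value is arbitrary, and whether a per-lock timeout is honoured still depends on the JPA provider and database):
// Request a pessimistic write lock, but wait at most ~5 seconds for it.
Map<String, Object> props = new HashMap<>();
props.put("javax.persistence.lock.timeout", 5000);

Stock stock = em.find(Stock.class, id, LockModeType.PESSIMISTIC_WRITE, props);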
For Spring Data 1.6 or greater
@Lock is supported on CRUD methods as of version 1.6 of Spring Data JPA (in fact, there's already a milestone available). See this ticket for more details.
With that version you simply declare the following:
interface WidgetRepository extends Repository<Widget, Long> {

    @Lock(LockModeType.PESSIMISTIC_WRITE)
    Widget findOne(Long id);
}
This will cause the CRUD implementation part of the backing repository proxy to apply the configured LockModeType to the find(…) call on the EntityManager.
On the other hand,
For versions of Spring Data prior to 1.6
The Spring Data pessimistic @Lock annotations only apply (as you pointed out) to queries. There are no annotations I know of which can affect an entire transaction. You can either create a findByOnePessimistic method which calls findByOne with a pessimistic lock, or you can change findByOne to always obtain a pessimistic lock.
If you wanted to implement your own solution you probably could. Under the hood the @Lock annotation is processed by LockModePopulatingMethodIntercceptor, which does the following:
TransactionSynchronizationManager.bindResource(method, lockMode == null ? NULL : lockMode);
You could create some static lock manager which had a ThreadLocal<LockMode> member variable and then have an aspect wrapped around every method in every repository which called bindResource with the lock mode set in the ThreadLocal. This would allow you to set the lock mode on a per-thread basis. You could then create your own @MethodLockMode annotation which would wrap the method in an aspect which sets the thread-specific lock mode before running the method and clears it after running the method.
Resource Link:
How to enable LockModeType.PESSIMISTIC_WRITE when looking up entities with Spring Data JPA?
How to add custom method to Spring Data JPA
Spring Data Pessimistic Lock timeout with Postgres
JPA Query API
Various Example of Pessimistic Lock Timeout
Setting a Pessimistic Lock
An entity object can be locked explicitly by the lock method:
em.lock(employee, LockModeType.PESSIMISTIC_WRITE);
The first argument is an entity object. The second argument is the requested lock mode.
A TransactionRequiredException is thrown if there is no active transaction when lock is called because explicit locking requires an active transaction.
A LockTimeoutException is thrown if the requested pessimistic lock cannot be granted:
A PESSIMISTIC_READ lock request fails if another user (which is
represented by another EntityManager instance) currently holds a
PESSIMISTIC_WRITE lock on that database object.
A PESSIMISTIC_WRITE lock request fails if another user currently
holds either a PESSIMISTIC_WRITE lock or a PESSIMISTIC_READ lock on
that database object.
Setting Query Hint (Scopes)
Query hints can be set in the following scopes (from global to local):
For the entire persistence unit - using a persistence.xml property:
<properties>
<property name="javax.persistence.query.timeout" value="3000"/>
</properties>
For an EntityManagerFactory - using the createEntityManagerFactory method:
Map<String,Object> properties = new HashMap<>();
properties.put("javax.persistence.query.timeout", 4000);
EntityManagerFactory emf =
    Persistence.createEntityManagerFactory("pu", properties);
For an EntityManager - using the createEntityManager method:
Map<String,Object> properties = new HashMap<>();
properties.put("javax.persistence.query.timeout", 5000);
EntityManager em = emf.createEntityManager(properties);
or using the setProperty method:
em.setProperty("javax.persistence.query.timeout", 6000);
For a named query definition - using the hints element:
@NamedQuery(name="Country.findAll", query="SELECT c FROM Country c",
    hints={@QueryHint(name="javax.persistence.query.timeout", value="7000")})
For a specific query execution - using the setHint method (before query execution):
query.setHint("javax.persistence.query.timeout", 8000);
Resource Link:
Locking in JPA
Pessimistic Lock Timeout
You can use @QueryHints in Spring Data:
@Lock(LockModeType.PESSIMISTIC_WRITE)
@QueryHints({@QueryHint(name = "javax.persistence.lock.timeout", value = "5000")})
Stock findById(String id);
For Spring Data 1.6 or greater, we can use the @Lock annotation provided by Spring Data JPA.
Also, the lock timeout can be set by using @QueryHints. Originally there was no support for query hint annotations in default CRUD methods, but it's been available since the fix in 1.6 M1.
https://jira.spring.io/browse/DATAJPA-173
Below is an example of a pessimistic lock with the PESSIMISTIC_WRITE mode type, which is an exclusive lock.
@Lock(LockModeType.PESSIMISTIC_WRITE)
@QueryHints({@QueryHint(name = "javax.persistence.lock.timeout", value = "5000")})
Customer findByCustomerId(Long customerId);
javax.persistence.lock.timeout doesn't seem to be working for me either when provided like below:
@QueryHints({@QueryHint(name = "javax.persistence.lock.timeout", value = "15000")})
But then I tried something else which worked. Instead of using @Repository and CrudRepository, I am now configuring Hibernate through the EntityManager, using createQuery along with the lock mode and the lock timeout hint. This configuration is working as expected.
I have two transactions running in parallel and trying to lock the exact same row in the DB. The first transaction is able to acquire the WRITE lock and holds it for around 10 seconds before releasing it. Meanwhile, the second transaction tries to acquire the lock on the same row, but since javax.persistence.lock.timeout is set to 15 seconds, it waits for the lock to be released and then acquires its own lock, making the flow serialized.
@Component
public class Repository {

    @PersistenceContext
    private EntityManager em;

    public Optional<Cache> getById(int id) {
        List<Cache> list = em.createQuery("select c from Cache c where c.id = ?1", Cache.class)
                .setParameter(1, id)
                .setHint("javax.persistence.lock.timeout", 15000)
                .setLockMode(LockModeType.PESSIMISTIC_WRITE)
                .getResultList();
        // avoid an IndexOutOfBoundsException when no row matches
        return list.isEmpty() ? Optional.empty() : Optional.of(list.get(0));
    }

    public void save(Cache cache) {
        cache = em.find(Cache.class, cache.getId());
        em.merge(cache);
    }
}
I've searched the net but found no answer so far, at least no clear one.
Suppose you are in the following situation
@Transactional(readOnly = false, propagation = Propagation.REQUIRES_NEW)
private void usingManagerTest()
{
    List<SomeType> someList = someDao.findAll();
    for (SomeType listItem : someList)
    {
        someManager.create();
    }
}
where someManager.create() sets the fields of an entity, say someEntity, and then calls someDao.create(someEntity).
The MySQL logs show that for every iteration of the for loop, the following queries are performed:
set autocommit = 0
insert into ...
commit
Now suppose you are in the following situation:
@Transactional(readOnly = false, propagation = Propagation.REQUIRES_NEW)
private void usingDaoTest()
{
    List<SomeType> someList = someDao.findAll();
    for (SomeType listItem : someList)
    {
        SomeEntity someEntity = someManager.createEntity();
        someDao.create(someEntity);
    }
}
where the createEntity method calls some setters on a Java entity, and the create is performed by the DAO. This leads to a MySQL log like the following:
set autocommit = 0
insert into ...
insert into ...
insert into ...
...
commit
where the number of insert queries equals the number of iterations of the for loop.
I've read the Spring documentation, but so far it is not clear to me why this happens.
Can anyone explain this behaviour?
Thanks
P.S. I know that the title is not clear; any suggestion is welcome.
UPDATE: it seems that it works differently from what I said: the log resulting from running usingDaoTest() does not show the autocommit query at all (which is not good for me).
I'm still interested in understanding why the two scripts work differently, but now I'm also interested in understanding how to achieve the second log result (where all the operations in the for loop are executed between autocommit = 0 and commit).
Thanks again
UPDATE 2: after some more tests I've understood a bit more of the logic behind @Transactional, so I've done some more specific research and found a solution here.
This discussion can be considered closed, thanks to all.
MySQL performs the operations for you while your transaction is running (which is why autocommit is set to 0). Once you commit your transaction, all changes are effectively applied to the database tables and become visible to other transactions.
This is the normal situation. However, it is possible to define transactions whose isolation level allows changes to be seen across transactions before they are committed (dirty reads). This has its up- and downsides.
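In Spring terms that means lowering the isolation level of the reading transaction, for instance (a sketch only; readWhileOthersWrite is a hypothetical reader method):
// A transaction running at READ UNCOMMITTED can see changes that other transactions
// have not committed yet (dirty reads) - usually undesirable, hence the downsides.
@Transactional(isolation = Isolation.READ_UNCOMMITTED)
public List<SomeType> readWhileOthersWrite() {
    return someDao.findAll();
}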