How to check special conditions before saving data with Hibernate - java

Sample Scenario
I have a limit that controls the total value of a column. If I make a save that exceeds this limit, I want it to throw an exception. For example:
Suppose I have already added the following data (LIMIT = 20):

id | code | value
---+------+------
 1 | A    |  15
 2 | A    |   5
 3 | B    |  12
 4 | B    |   3
If I insert (A, 2), it exceeds the limit (15 + 5 + 2 > 20) and I want an exception to be thrown.
If I insert (B, 4), the transaction should succeed, since the limit is not exceeded (12 + 3 + 4 <= 20).
code and value are interrelated.
What can I do?
I can check this scenario with the required queries. For example, I can write a method that performs the check and call it from the save method. That works.
However, I'm looking for a more convenient solution than this.
For example, is there an annotation I can use when designing the entity?
Can I do this without explicitly calling the method that performs this check every time?
What examples can I give?
@UniqueConstraint, which checks whether duplicate values are inserted, is the kind of thing I mean.

Using a transaction
The most common and long-accepted way is simply to abstract, in a suitable place (a class, a library, a service, ...), the business rules that govern the behavior you describe, and run them within a transaction:
@Transactional(propagation = Propagation.REQUIRED)
public RetType operation(ReqType args) {
    ...
    perform operations;
    ...
    if (fail post conditions)
        throw ...;
    ...
}
In this case, if there is already an open transaction when the method is called, that transaction is used (and no deadlocks are introduced); if there is no transaction, a new one is created, so that both the operations and the postcondition check are performed within the same transaction.
Note that with this strategy the operation and the invariant check can span multiple transactional resources managed by the TransactionManager (e.g. Redis, MySQL, MQs, ... simultaneously and in a coordinated manner).
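Applied to the question's limit scenario, a minimal sketch of this pattern might look as follows; the SampleRepository, its sumValueByCode query method, the Sample entity and the LimitExceededException are assumptions for illustration, not part of the question's code:

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class SampleService {

    private static final int LIMIT = 20;

    private final SampleRepository repository; // hypothetical Spring Data repository

    public SampleService(SampleRepository repository) {
        this.repository = repository;
    }

    @Transactional(propagation = Propagation.REQUIRED)
    public void addValue(String code, int value) {
        repository.save(new Sample(code, value));
        // postcondition check runs inside the same transaction;
        // sumValueByCode is an assumed query method, e.g. a JPQL
        // "select coalesce(sum(s.value), 0) from Sample s where s.code = :code"
        if (repository.sumValueByCode(code) > LIMIT) {
            throw new LimitExceededException("limit exceeded for code " + code);
        }
    }
}

Because the exception is unchecked, Spring marks the transaction rollback-only and the offending insert never becomes visible.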
Using only the database
Using TRIGGERS was the canonical option some decades ago for checking postconditions; it has long fallen out of use (in favor of the first approach), and the solution is usually coupled to the specific database engine (e.g. PostgreSQL or MySQL).
It can be useful when the client making the modifications is unable or unwilling (not safe) to check postconditions within a transaction (e.g. bash processes). But nowadays this is infrequent.
The use of TRIGGERS may also be preferable in certain scenarios where efficiency is required, as there are certain optimization options within the database scripts.
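For the question's limit scenario, a hedged sketch of the trigger approach (PostgreSQL syntax; the table name the_table, its columns and the hard-coded limit of 20 are assumptions), installed here through Spring's JdbcTemplate:

jdbcTemplate.execute(
    "CREATE FUNCTION check_limit() RETURNS trigger AS $$ " +
    "BEGIN " +
    "  IF (SELECT COALESCE(SUM(value), 0) FROM the_table WHERE code = NEW.code) + NEW.value > 20 THEN " +
    "    RAISE EXCEPTION 'limit exceeded for code %', NEW.code; " +
    "  END IF; " +
    "  RETURN NEW; " +
    "END; " +
    "$$ LANGUAGE plpgsql");
jdbcTemplate.execute(
    "CREATE TRIGGER enforce_limit BEFORE INSERT ON the_table " +
    "FOR EACH ROW EXECUTE PROCEDURE check_limit()");

On the Java side the RAISE EXCEPTION surfaces as a PersistenceException (or a translated DataAccessException), regardless of which client performs the insert.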

Neither Hibernate nor Spring Data JPA has anything built in for this scenario. You have to program the transaction logic in your repository yourself:
@PersistenceContext
EntityManager em;

public void addValue(String code, int value) {
    // SUM(...) in JPQL returns a Long, or null when no rows match
    var checkQuery = em.createQuery(
            "SELECT COALESCE(SUM(e.value), 0) FROM Entity e WHERE e.code = :code", Long.class);
    checkQuery.setParameter("code", code);
    if (checkQuery.getSingleResult() + value > 20) {
        throw new LimitExceededException("attempted to exceed limit for " + code);
    }
    var newEntity = new Entity();
    newEntity.setCode(code);
    newEntity.setValue(value);
    em.persist(newEntity);
}
Then (this is important!) you have to set the SERIALIZABLE isolation level on the @Transactional annotations of the methods that work with this table.
Read more about the serializable isolation level in your database's documentation; the PostgreSQL docs have an oddly similar example.
Note that you have to consider retrying the failed transaction. No idea how to do this with Spring though.
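One option (an assumption on my part, not something the answer above prescribes) is spring-retry, which can re-invoke the whole transactional method when the database aborts the serializable transaction:

import org.springframework.dao.ConcurrencyFailureException;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

// Requires the spring-retry dependency and @EnableRetry on a configuration class.
// The retrying method lives in a separate bean so the retry advice wraps the whole
// transaction, including the commit where serialization failures surface.
@Service
public class RetryingValueService {

    private final ValueRepository repository; // hypothetical bean containing the addValue() above

    public RetryingValueService(ValueRepository repository) {
        this.repository = repository;
    }

    @Retryable(value = ConcurrencyFailureException.class, maxAttempts = 3)
    public void addValueWithRetry(String code, int value) {
        repository.addValue(code, value); // the @Transactional(isolation = SERIALIZABLE) method
    }
}

Spring translates the database's serialization failure into a subclass of ConcurrencyFailureException, so retrying on that type covers the aborted transaction.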

You should use a singleton EJB (javax.ejb.Singleton):

@Singleton
public class Register {

    @Lock(LockType.WRITE)
    public void register(String code, int value) {
        if (i_can_insert_modify(code, value)) {
            // use entityManager or some DAO
        } else {
            // do something
        }
    }
}

Related

Jpa reader Spring Batch

I would like to know whether this approach is recommended for implementing a Spring Batch reader with JPA, or whether it is better to look for another solution; and if this approach is not recommended, where can I find information on a better option?
public class CreditCardItemReader implements ItemReader<CreditCard> {

    @Autowired
    private CreditCardRepository repository;

    private Iterator<CreditCard> usersIterator;

    @BeforeStep
    public void before(StepExecution stepExecution) {
        usersIterator = repository.someQuery().iterator();
    }

    @Override
    public CreditCard read() {
        if (usersIterator != null && usersIterator.hasNext()) {
            return usersIterator.next();
        } else {
            return null;
        }
    }
}
This implementation is acceptable only for small datasets, because the data is read by one query and the whole result list is stored in memory. Also, it is not thread-safe.
In the case of loading large volumes:
on an environment with limited memory it can lead to an out-of-memory error
it can lead to performance problems, since we wait until thousands of records are loaded from the DB by one call
Solution 1, org.springframework.batch.item.database.JpaCursorItemReader
A similar implementation is defined out of the box in Spring Batch: JpaCursorItemReader
The main difference is that this implementation works with a specific JPQL query instead of a repository, and it uses JPA's Query.getResultStream() method to fetch the query results.
Implementation of JpaCursorItemReader:
protected void doOpen() throws Exception {
    ...
    Query query = createQuery();
    if (this.parameterValues != null) {
        this.parameterValues.forEach(query::setParameter);
    }
    this.iterator = query.getResultStream().iterator();
}
Hibernate, for example, introduced the Query.getResultStream() method in version 5.2.
It uses Hibernate's ScrollableResults implementation to move through the result set and fetch the records in batches. That prevents you from loading all records of the result set at once and allows you to process them more efficiently.
Example of creation:
protected ItemReader<Foo> getItemReader() throws Exception {
    // in a real application the EntityManagerFactory is usually injected
    LocalContainerEntityManagerFactoryBean factoryBean = new LocalContainerEntityManagerFactoryBean();

    JpaCursorItemReader<Foo> itemReader = new JpaCursorItemReader<>();
    itemReader.setQueryString("from Foo");
    itemReader.setEntityManagerFactory(factoryBean.getObject());
    itemReader.setSaveState(true);
    itemReader.afterPropertiesSet(); // validate after all properties are set
    return itemReader;
}
Solution 2, org.springframework.batch.item.database.JpaPagingItemReader
It is a more flexible solution for JPQL queries than JpaCursorItemReader. The ItemReader loads and stores data page by page and is thread-safe.
According to the documentation:

ItemReader for reading database records built on top of JPA. It executes the JPQL setQueryString(String) to retrieve requested data. The query is executed using paged requests of a size specified in AbstractPagingItemReader.setPageSize(int). Additional pages are requested when needed as the AbstractItemCountingItemStreamItemReader.read() method is called, returning an object corresponding to the current position.
The performance of the paging depends on the JPA implementation and its use of database specific features to limit the number of returned rows.
Setting a fairly large page size and using a commit interval that matches the page size should provide better performance.
In order to reduce the memory usage for large results the persistence context is flushed and cleared after each page is read. This causes any entities read to be detached. If you make changes to the entities and want the changes persisted then you must explicitly merge the entities.
The implementation is thread-safe in between calls to open(ExecutionContext), but remember to use saveState=false if used in a multi-threaded client (no restart available).
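The answer gives no creation example for this reader; a minimal sketch using its builder (the injected entityManagerFactory and the Foo entity are assumptions, mirroring the earlier examples) might look like this:

JpaPagingItemReader<Foo> itemReader = new JpaPagingItemReaderBuilder<Foo>()
        .name("fooPagingReader")                    // required for saving state in the ExecutionContext
        .entityManagerFactory(entityManagerFactory) // assumed to be injected
        .queryString("select f from Foo f")
        .pageSize(100)                              // match the commit interval, as the docs suggest
        .build();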
Solution 3, org.springframework.batch.item.data.RepositoryItemReader
It is a more efficient solution. It works with a repository, loads and stores data in chunks, and is thread-safe.
According to documentation:
An ItemReader that reads records utilizing a PagingAndSortingRepository.
Performance of the reader is dependent on the repository implementation, however setting a reasonably large page size and matching that to the commit interval should yield better performance.
The reader must be configured with a PagingAndSortingRepository, a Sort, and a pageSize greater than 0.
This implementation is thread-safe between calls to AbstractItemCountingItemStreamItemReader.open(ExecutionContext), but remember to use saveState=false if used in a multi-threaded client (no restart available).
Example of creation:
PagingAndSortingRepository<Foo, Long> repository = new FooRepository<>();
RepositoryItemReader<Foo> reader = new RepositoryItemReader<>();
reader.setRepository(repository);   // the PagingAndSortingRepository implementation used to read input
reader.setMethodName("findByName"); // the method on the repository to call
reader.setArguments(arguments);     // arguments to be passed to the data-providing method
reader.setSort(Collections.singletonMap("id", Sort.Direction.ASC)); // a Sort is required, per the docs above
Creation via builder:
PagingAndSortingRepository<Foo, Long> repository = new FooRepository<>();
RepositoryItemReader<Foo> reader = new RepositoryItemReaderBuilder<Foo>()
        .repository(repository)
        .methodName("findByName")
        .arguments(new ArrayList<>())
        .sorts(Collections.singletonMap("id", Sort.Direction.ASC)) // required
        .name("fooReader") // required when saveState is true
        .build();
More examples of usage: RepositoryItemReaderTests and RepositoryItemReaderIntegrationTests
To summarise:
Your implementation is good only for simple use cases.
I recommend using the out-of-the-box solutions.

Spring @Transactional DAO calls return same object

We are using Spring and iBatis, and I have discovered something interesting in the way a service method annotated with @Transactional handles multiple DAO calls that return the same record. Here is an example of a method that does not work.
@Transactional
public void processIndividualTrans(IndvTrans trans) {
    Individual individual = individualDAO.selectByPrimaryKey(trans.getPartyId());
    individual.setFirstName(trans.getFirstName());
    individual.setMiddleName(trans.getMiddleName());
    individual.setLastName(trans.getLastName());

    Individual oldIndvRecord = individualDAO.selectByPrimaryKey(trans.getPartyId());
    individualHistoryDAO.insert(oldIndvRecord);
    individualDAO.updateByPrimaryKey(individual);
}
The problem with the above method is that the second execution of the line
individualDAO.selectByPrimaryKey(trans.getPartyId())
returns the exact same object that was returned by the first call.
This means that oldIndvRecord and individual are the same object, and the line
individualHistoryDAO.insert(oldIndvRecord);
adds a row to the history table that contains the changes (which we do not want).
In order for it to work, it must look like this:
@Transactional
public void processIndividualTrans(IndvTrans trans) {
    Individual individual = individualDAO.selectByPrimaryKey(trans.getPartyId());
    individualHistoryDAO.insert(individual);
    individual.setFirstName(trans.getFirstName());
    individual.setMiddleName(trans.getMiddleName());
    individual.setLastName(trans.getLastName());
    individualDAO.updateByPrimaryKey(individual);
}
We wanted to write a service method, updateIndividual, usable for all updates of this table, that stores a row in the IndividualHistory table before performing the update.
@Transactional
public void updateIndividual(Individual individual) {
    Individual oldIndvRecord = individualDAO.selectByPrimaryKey(individual.getPartyId());
    individualHistoryDAO.insert(oldIndvRecord);
    individualDAO.updateByPrimaryKey(individual);
}
But it does not store the row as it was before the object was changed. We can even explicitly instantiate different objects before the DAO calls, and the second one still becomes the same object as the first.
I have looked through the Spring documentation and cannot determine why this is happening.
Can anyone explain this?
Is there a setting that can make the second DAO call return the database contents and not the previously returned object?
You are using Hibernate as the ORM, and this behavior is perfectly described in the Hibernate documentation, in the Transaction chapter:
Through Session, which is also a transaction-scoped cache, Hibernate provides repeatable reads for lookup by identifier and entity queries and not reporting queries that return scalar values.
The same goes for iBatis (MyBatis):
MyBatis uses two caches: a local cache and a second level cache. Each time a new session is created MyBatis creates a local cache and attaches it to the session. Any query executed within the session will be stored in the local cache so further executions of the same query with the same input parameters will not hit the database. The local cache is cleared upon update, commit, rollback and close.
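If you need the second select to reflect the database rather than the cache, one workaround (my assumption, not something the quoted docs prescribe) is to clear MyBatis' local cache before re-reading; alternatively, setting localCacheScope=STATEMENT in the MyBatis configuration disables session-scoped caching altogether:

import org.apache.ibatis.session.SqlSession;

@Autowired
private SqlSession sqlSession; // the Spring-managed SqlSessionTemplate

@Transactional
public void updateIndividual(Individual individual) {
    sqlSession.clearCache(); // drop the local cache so the select below hits the database
    Individual oldIndvRecord = individualDAO.selectByPrimaryKey(individual.getPartyId());
    individualHistoryDAO.insert(oldIndvRecord); // the unmodified database state
    individualDAO.updateByPrimaryKey(individual);
}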

Hibernate: Accessing created entity from different transaction

I have quite complex methods which create different entities during their execution and use them. For instance, I create some images and then add them to an article:
@Transactional
public void createArticle() {
    List<Image> images = ...
    for (int i = 0; i < 10; i++) {
        // creating some new images, method annotated @Transactional
        images.add(repository.createImage(...));
    }
    Article article = getArticle();
    article.addImages(images);
    em.merge(article);
}
This works correctly: the images get their IDs and are then added to the article. The problem is that during this execution the database is locked and nothing can be modified. This is very inconvenient, because the images might be processed by some graphics processor and that might take some time.
So we might try to remove the @Transactional from the main method. That could be good.
What happens is that the images are correctly created and have their IDs. But once I try to add them to the article and call merge, I get a javax.persistence.EntityNotFoundException for Image with ID XXXX. The entity manager can't see that the image was created and has its ID. So the database is not locked, but we can't do anything either.
So what can I do? I don't want the database to be locked during the whole execution, and I want to be able to access the created entities!
I am using the current versions of Spring and Hibernate, everything defined by annotations. I don't use a session factory; I access everything via javax.persistence.EntityManager.
Consider leveraging Hibernate's cascading functionality to persist the object tree in one go with minimal database locking:

@Entity
public class Article {
    @OneToMany(cascade = CascadeType.MERGE)
    private List<Image> images;
}
@Transactional
public void createArticle() {
    // images created as Java objects in memory, no DAOs called yet
    List<Image> images = ...
    Article article = getArticle();
    article.addImages(images);
    // cascading will save the article AND the images
    em.merge(article);
}
This way the article AND its images get persisted at the end of the transaction, in a single transaction with a minimal lifetime. Until then no locking occurs on the database.
Alternatively, split createArticle into two @Transactional business methods, one createImages and the other addImagesToArticle, and call them one after the other in a third method in another bean:
@Service
public class OtherBean {

    @Autowired
    private YourService yourService;

    // note that no transactional annotation is used; this is intentional
    public void otherMethod() {
        yourService.createImages();       // first transaction - images are committed
        yourService.addImagesToArticle(); // second transaction - images are added to the article
    }
}
You could try setting the transaction isolation on your datasource to READ_UNCOMMITTED, though that can lead to inconsistencies, so it is generally not recommended.
My best guess is that your transaction isolation level is SERIALIZABLE. That's why the DB locks affected tables for the whole duration of a transaction.
If that's the case, change the level to READ_COMMITTED. Hibernate (or any JPA provider) works nicely with this one.
It won't lock anything unless you explicitly call entityManager.lock(someEntity, someLockModeType).
Also, when you choose transaction boundaries, think first in terms of atomicity. If createArticle() is an atomic unit of work, it simply has to be transactional; breaking it into smaller transactions for the sake of 'optimization' is wrong.

Isolation level SERIALIZABLE in Spring-JDBC

Maybe somebody can help me with a transactional issue in Spring (3.1) / PostgreSQL (8.4.11).
My transactional service is as follows:
@Transactional(isolation = Isolation.SERIALIZABLE, readOnly = false)
@Override
public Foo insertObject(Bar bar) {
    // these methods are just examples
    int x = firstDao.getMaxNumberOfAllowedObjects(bar);
    int y = secondDao.getNumerOfExistingObjects(bar);
    // comparison
    if (x - y > 0) {
        secondDao.insertNewObject(...);
    }
    ....
}
The Spring configuration of the webapp contains:

@Configuration
@EnableTransactionManagement
public class .... {

    @Bean
    public DataSource dataSource() {
        org.apache.tomcat.jdbc.pool.DataSource ds = new DataSource();
        // ... configuration details
        return ds;
    }

    @Bean
    public DataSourceTransactionManager txManager() {
        return new DataSourceTransactionManager(dataSource());
    }
}
Let us say a request "x" and a request "y" execute concurrently and both arrive at the comment "comparison" (method insertObject). Then both of them are allowed to insert a new object and their transactions are committed.
Why am I not getting a RollbackException? As far as I know, that is what the SERIALIZABLE isolation level is for. Coming back to the previous scenario: if "x" manages to insert a new object and commits its transaction, then "y"'s transaction should not be allowed to commit, since there is a new object it did not read.
That is, if "y" could read the value of secondDao.getNumerOfExistingObjects(bar) again, it would realize that there is one object more. A phantom read?
The transaction configuration seems to be working fine:
for each request I can see the same connection for firstDao and secondDao
a transaction is created every time insertObject is invoked
Both the first and second DAOs look like this:

@Autowired
public void setDataSource(DataSource dataSource) {
    this.jdbcTemplate = new JdbcTemplate(dataSource);
}

@Override
public Object daoMethod(Object param) {
    // uses jdbcTemplate
}
I am sure I am missing something. Any idea?
Thanks for your time,
Javier
TL;DR: Detection of serializability conflicts improved dramatically in Pg 9.1, so upgrade.
It's tricky to figure out from your description what the actual SQL is and why you expect to get a rollback. It looks like you've seriously misunderstood serializable isolation, perhaps thinking it perfectly tests all predicates, which it doesn't, especially not in Pg 8.4.
SERIALIZABLE doesn't perfectly guarantee that transactions execute as if they were run in series, as doing so would be prohibitively expensive from a performance point of view, if it were possible at all. It only provides limited checking. Exactly what is checked, and how, varies from database to database and version to version, so you need to read the docs for your version of your database.
Anomalies are possible, where two transactions executing in SERIALIZABLE mode produce a different result to if those transactions truly executed in series.
Read the documentation on transaction isolation in Pg to learn more. Note that SERIALIZABLE changed behaviour dramatically in Pg 9.1, so make sure to read the version of the manual appropriate for your Pg version. Here's the 8.4 version. In particular read 13.2.2.1. Serializable Isolation versus True Serializability. Now compare that to the greatly improved predicate locking based serialization support described in the Pg 9.1 docs.
It looks like you're trying to perform logic something like this pseudocode:
count = query("SELECT count(*) FROM the_table");
if (count < threshold):
    query("INSERT INTO the_table (...) VALUES (...)");
If so, that's not going to work in Pg 8.4 when executed concurrently - it's pretty much the same as the anomaly example used in the documentation linked above. Amazingly it actually works on Pg 9.1; I didn't expect even 9.1's predicate locking to catch use of aggregates.
You write that:
Coming back to the previous scenario, if x manages to insert a new
object and commits its transaction, then "y"'s transaction should not
be allowed to commit since there is a new object he did not read.
but 8.4 won't detect that the two transactions are interdependent, something you can trivially prove by using two psql sessions to test it. It's only with the true-serializability stuff introduced in 9.1 that this will work - and frankly, I was surprised it works in 9.1.
If you want to do something like enforce a maximum row count in Pg 8.4, you need to LOCK the table to prevent concurrent INSERTs, doing the locking either manually or via a trigger function. Doing it in a trigger will inherently require a lock promotion and thus will frequently deadlock, but it will successfully do the job. It's better done in the application, where you can issue the LOCK TABLE my_table IN EXCLUSIVE MODE before even SELECTing from the table, so it already has the highest lock mode it will need and thus shouldn't need deadlock-prone lock promotion. The EXCLUSIVE lock mode is appropriate because it permits SELECTs but nothing else.
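In Spring terms, the application-side locking described above might look like this sketch, based on the question's insertObject (the table name the_table and the failure handling are assumptions):

@Transactional // with the explicit lock, SERIALIZABLE isolation is no longer needed
public Foo insertObject(Bar bar) {
    // take the strongest lock we will need up front; EXCLUSIVE still allows
    // concurrent SELECTs but blocks concurrent INSERT/UPDATE/DELETE
    jdbcTemplate.execute("LOCK TABLE the_table IN EXCLUSIVE MODE");
    int x = firstDao.getMaxNumberOfAllowedObjects(bar);
    int y = secondDao.getNumerOfExistingObjects(bar);
    if (x - y > 0) {
        return secondDao.insertNewObject(bar);
    }
    throw new IllegalStateException("maximum number of objects reached"); // hypothetical handling
}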
Here's how to test it in two psql sessions:

SESSION 1                                  SESSION 2

create table ser_test( x text );

BEGIN TRANSACTION
ISOLATION LEVEL SERIALIZABLE;

                                           BEGIN TRANSACTION
                                           ISOLATION LEVEL SERIALIZABLE;

SELECT count(*) FROM ser_test;

                                           SELECT count(*) FROM ser_test;

INSERT INTO ser_test(x) VALUES ('bob');

                                           INSERT INTO ser_test(x) VALUES ('bob');

COMMIT;

                                           COMMIT;
When run on Pg 9.1, the first COMMIT succeeds, then the second COMMIT fails with:
regress=# COMMIT;
ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
HINT: The transaction might succeed if retried.
but when run on 8.4 both COMMITs succeed, because 8.4 didn't have the predicate locking code for serializability that was added in 9.1.

How to use @Transactional with Spring Data?

I just started working on a Spring Data, Hibernate, MySQL, JPA project. I switched to Spring Data so that I wouldn't have to worry about creating queries by hand.
I noticed that the use of @Transactional isn't required when you're using Spring Data, since I also tried my queries without the annotation.
Is there a specific reason why I should or shouldn't be using the @Transactional annotation?
Works:
@Transactional
public List listStudentsBySchool(long id) {
    return repository.findByClasses_School_Id(id);
}
Also works:
public List listStudentsBySchool(long id) {
    return repository.findByClasses_School_Id(id);
}
What is your question actually about: the usage of the @Repository annotation or @Transactional?
@Repository is not needed at all, as the interface you declare will be backed by a proxy the Spring Data infrastructure creates, and that proxy activates exception translation anyway. So using this annotation on a Spring Data repository interface has no effect at all.
@Transactional - for the JPA module we have this annotation on the implementation class backing the proxy (SimpleJpaRepository). This is for two reasons: first, persisting and deleting objects requires a transaction in JPA, so we need to make sure a transaction is running, which we do by having the method annotated with @Transactional.
Reading methods like findAll() and findOne(…) use @Transactional(readOnly = true), which is not strictly necessary but triggers a few optimizations in the transaction infrastructure (setting the FlushMode to MANUAL to let persistence providers potentially skip dirty checks when closing the EntityManager). Beyond that, the flag is set on the JDBC Connection as well, which causes further optimizations on that level.
Depending on the database you use, it can omit table locks or even reject write operations you might trigger accidentally. Thus we recommend using @Transactional(readOnly = true) for query methods as well, which you can easily achieve by adding that annotation to your repository interface. Make sure you add a plain @Transactional to the manipulating methods you might have declared or re-decorated in that interface.
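Applied to the question's code, that recommendation could look like the following sketch (Student and the query method come from the question; the rest is the standard Spring Data pattern):

@Transactional(readOnly = true)
public interface StudentRepository extends JpaRepository<Student, Long> {

    // read-only transaction from the class-level annotation
    List<Student> findByClasses_School_Id(long id);

    // manipulating method re-decorated with a plain read-write transaction
    @Override
    @Transactional
    <S extends Student> S save(S entity);
}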
In your examples it depends on whether your repository has @Transactional or not.
If yes, then the service (as it is in your case) should not use @Transactional, since there is no point in using it. You may add @Transactional later if you plan to add more logic to your service that deals with other tables / repositories; then there will be a point in having it.
If no, then your service should use @Transactional if you want to make sure you do not have isolation issues, e.g. reading something that is not yet committed.
--
If talking about repositories in general (as a CRUD collection interface):
I would say: NO, you should not use @Transactional.
Why not: if we believe that the repository is outside of the business context, then it should not know about propagation or isolation (the level of locking). It cannot guess which transaction context it might be involved in.
Repositories are "business-less" (if you believe so).
Say you have a repository:

class MyRepository {
    void add(Entity entity) {...}
    Entity findByName(String name) {...}
}
and there is business logic, say MyService:

class MyService {
    @Transactional(propagation = Propagation.REQUIRED, isolation = Isolation.SERIALIZABLE)
    void doIt() {
        var entity = myRepository.findByName("some-name");
        if (entity.field.equals("expected")) {
            ...
            myRepository.add(newEntity);
        }
    }
}
I.e. in this case MyService decides what it wants to involve the repository in.
In this case propagation = REQUIRED makes sure that BOTH repository methods, findByName() and add(), are involved in a single transaction, and isolation = SERIALIZABLE makes sure that nobody can interfere with that: it keeps a lock on the table(s) that findByName() & add() touch.
But some other service may want to use MyRepository differently, not involving it in any transaction at all; say it uses the findByName() method and is not interested in any restrictions, reading whatever it can find at this moment.
I would say YES if you treat your repository as one that always returns a valid entity (no dirty reads etc.), saving users from using it incorrectly. I.e. your repository should take care of the isolation problem (concurrency & data consistency), as in this example:
we want the repository to make sure that when we add(newEntity) it first checks whether a conflicting entity with the same name already exists; if not, it inserts, all in one locking unit of work (the same as what we did at the service level above, but now we move this responsibility to the repository).
Say, there cannot be 2 tasks with the same name in the "in-progress" state (a business rule):
class TaskRepository {
    @Transactional(propagation = Propagation.REQUIRED,
                   isolation = Isolation.SERIALIZABLE)
    void add(Task entity) {
        var name = entity.getName();
        var found = this.findFirstByName(name);
        // insert only if no task with this name is already in progress
        if (found == null || !found.getStatus().equals("in-progress")) {
            // .. do insert
        }
    }

    @Transactional
    Task findFirstByName(String name) {...}
}
The 2nd is more like a DDD-style repository.
I guess there is more to cover if:

class Service {
    @Transactional(isolation = ..., propagation = ...) // where ... differ from what is defined in TaskRepository.add()
    void doStuff() {
        taskRepository.add(task);
    }
}
You should use the @Repository annotation.
This is because @Repository translates your unchecked SQL exceptions to Spring exceptions, so the only exception you have to deal with is DataAccessException.
We also use the @Transactional annotation to lock the record, so that another thread/request cannot change the read.
We use the @Transactional annotation when we create/update one or more entities at the same time. If the method annotated with @Transactional throws an exception, the annotation helps to roll back the previous inserts.
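A small sketch of that rollback behaviour (the entities and repositories here are hypothetical):

@Transactional
public void createOrderWithItems(Order order, List<OrderItem> items) {
    orderRepository.save(order);
    for (OrderItem item : items) {
        // any RuntimeException thrown here rolls back the order insert as well
        itemRepository.save(item);
    }
}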
