We have a method that reads from and writes to MySQL, and the method can be called by multiple threads. The DB operations look like this:
public List<Record> getAndUpdate() {
    Task task = taskMapper.selectByPrimaryKey(id);
    if (task.getStatus() == 0) {
        insertRecords();
        task.setStatus(1);
        taskMapper.update(task);
    }
    // some queries and return data
    return someRecordMapper.selectByXXX();
}

private void insertRecords() {
    // read some files and create someRecords
    someRecordMapper.insertBatch(someRecords);
}
The method reads a task's status; if the status is 0, it inserts a bunch of records (for that task) into the Records table and then sets the task's status to 1.
I want those DB operations to be transactional and exclusive, meaning that when one thread enters the transaction, other threads trying to read the same task should block. Otherwise they will see the task's status as 0 and insertRecords() will be called multiple times, resulting in duplicated data.
The @Transactional annotation doesn't seem to block transactions from other threads; it only ensures a rollback if the transaction is aborted. So I think this issue cannot be avoided with @Transactional alone.
I'm using MySQL with MyBatis. I think MySQL itself can provide this kind of synchronization between threads, so I try not to introduce extra components such as a Redis lock for it. How can I do this in Spring?
I ended up using a "SELECT ... FOR UPDATE" query. Once this query has been executed, other transactions that try to lock or write the same row are blocked until the current transaction commits or rolls back. The method also needs to be annotated with @Transactional, but note that the row lock and the transaction are two different concerns. The test results are satisfactory.
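For reference, a minimal sketch of that combination with MyBatis annotations and Spring; the mapper method selectByPrimaryKeyForUpdate, the id parameter and the column names are illustrative assumptions, not part of the original code:

public interface TaskMapper {
    // Same query as selectByPrimaryKey, but FOR UPDATE locks the selected row
    // until the surrounding transaction ends.
    @Select("SELECT id, status FROM task WHERE id = #{id} FOR UPDATE")
    Task selectByPrimaryKeyForUpdate(@Param("id") Long id);

    int update(Task task);
}

@Service
public class TaskService {

    @Autowired
    private TaskMapper taskMapper;
    @Autowired
    private SomeRecordMapper someRecordMapper;

    @Transactional
    public List<Record> getAndUpdate(Long id) {
        // Concurrent callers block on this line until the first transaction
        // commits or rolls back, so insertRecords() runs at most once per task.
        Task task = taskMapper.selectByPrimaryKeyForUpdate(id);
        if (task.getStatus() == 0) {
            insertRecords();
            task.setStatus(1);
            taskMapper.update(task);
        }
        return someRecordMapper.selectByXXX();
    }
}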
Related
I would like to know if it is possible to have a transaction that spans multiple threads.
The problem is that I need to persist a large number of records. Doing it sequentially takes a long time, so I partition the entity objects and persist them in parallel, which is much faster and works great. The only trade-off is that each thread runs in its own transaction: if one of them fails, it rolls back its own transaction but not the others.
Is there any way to run all the threads in a single transaction, so that I can control the transaction as a whole?
I am using a JpaRepository repository to persist and delete.
code sample
@Transactional
public boolean executeTransaction(final EntityWrapper entityWrapper) {
    transactionTemplate = new TransactionTemplate(transactionManagerRepo);
    try {
        transactionTemplate.execute(new TransactionCallbackWithoutResult() {
            protected void doInTransactionWithoutResult(TransactionStatus status) {
                repo.saveAll(list); // saves all at once (batch commit), which takes a long time
                // other way: save partitions of 200 records in parallel via a ThreadPoolExecutor
                entitiesList = partition(entitiesMasterList, 200);
                executor = ThreadPoolExecutor.initializeThreadPoolExecutor(entitiesList.size());
                // ... tasks submitted to the executor, each saving one partition ...
            }
        });
    } catch (Exception e) {
        // ...
    }
    // ...
}
Please suggest.
Good Morning,
I am trying to create a scheduled task that updates database entities cyclically. I am using Spring MVC and Hibernate as the ORM.
Problem
The scheduled task should update entities in the background, but the changes are not persisted to the database.
Structure of the system
I have a Batch entity with basic information and plenty of sensors inserting records into the DB every few seconds.
Related to the Batch entity there is a TrackedBatch entity, which contains many fields calculated from the Batch entity itself. The scheduled task takes each Batch one by one, updates the related sensor data with lotto = lottoService.updateBatchRelations(batch), and then updates the TrackedBatch entity with the newly computed data.
A user can modify a Batch's basic information; the system should then recompute the TrackedBatch data and update the entity (this is done by the controller, which calls the updateBatchFollowingModification method). This step works correctly with an async method; the problem arises when the scheduled task recomputes the same information.
Async method used to update entities after a user modification (working correctly)
@Async("threadPoolTaskExecutor")
@Transactional
public void updateBatchFollowingModification(Lotto lotto)
{
    logger.debug("Daemon started");
    Lotto batch = lottoService.findBatchById(lotto.getId_lotto(), false);
    lotto = lottoService.updateBatchRelations(batch);
    lotto.setTrackedBatch(trackableBatchService.modifyTrackedBatch(batch.getTrackedBatch(), batch));
    logger.debug("Daemon ended");
}
Scheduled methods used to update entities cyclically (not working as expected)
@Scheduled(fixedDelay = 10000)
public void updateActiveBatchesWithDaemon()
{
    logger.info("updating active batches in background");
    List<Integer> idsOfActiveBatches = lottoService.findIdsOfActiveBatchesInAllSectors();
    if(!idsOfActiveBatches.isEmpty())
    {
        logger.info("found " + idsOfActiveBatches.size() + " active batches");
        for(Integer id : idsOfActiveBatches)
        {
            logger.debug("update batch " + id + " in background");
            updateBatch(id);
        }
    }
    else
    {
        logger.info("no active batches found");
    }
}
@Transactional
public void updateBatch(Integer id)
{
    Lotto activeLotto = lottoService.findBatchById(id, false);
    updateBatchFollowingModification(activeLotto);
}
As a premise, I can state that the scheduled method is configured correctly and fires continuously (the same holds for the async method: following a user modification, all entities are updated correctly). At the line updateBatchFollowingModification(activeLotto) in the updateBatch method, the related entities are modified correctly (even the TrackedBatch; I checked with the debugger), but the changes are not persisted to the database when the method ends, and no exception is thrown.
Looking around the internet I didn't find any solution to this problem, nor does it seem to be a known Hibernate or Spring bug.
Reading the Spring documentation about scheduling didn't help either. I also tried calling the save method in the scheduled task to save the entity again (but it obviously didn't work).
Further considerations
I do not know if the @Scheduled annotation needs some extra configuration to handle @Transactional methods: on the web, developers use those annotations together without problems, and the documentation mentions no caveats.
I also do not think it is a concurrency problem, because if the async method is modifying the data, the scheduled one should be held back by the implicit optimistic locking mechanism and finish after the first transaction commits; the same holds if the scheduled method is the first to acquire the lock (correct me if I am wrong).
I cannot figure out why the changes are not persisted when the scheduled method is used. Can someone link documentation or tutorials on this topic so I can find a solution? Or better, if someone has faced a similar problem, how can it be solved?
Finally I managed to resolve the issue by explicitly defining the isolation level of the transaction involved and by eliminating the updateBatch method (it was a duplicate, since updateBatchFollowingModification does the same thing). In particular, I set the isolation level of updateBatchFollowingModification to @Transactional(isolation = Isolation.SERIALIZABLE).
This obviously works in my case because no scalability is needed, so serializing these operations does not cause any problems for the application.
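For clarity, this is roughly what the resolved method looks like; the body is the one from the question, and only the isolation attribute is new:

@Async("threadPoolTaskExecutor")
@Transactional(isolation = Isolation.SERIALIZABLE)
public void updateBatchFollowingModification(Lotto lotto)
{
    // With serializable isolation, concurrent transactions touching the same
    // rows are effectively executed one after the other, so a scheduled run
    // and a user modification cannot interleave on the same Batch/TrackedBatch.
    Lotto batch = lottoService.findBatchById(lotto.getId_lotto(), false);
    lotto = lottoService.updateBatchRelations(batch);
    lotto.setTrackedBatch(trackableBatchService.modifyTrackedBatch(batch.getTrackedBatch(), batch));
}

The scheduled loop then calls updateBatchFollowingModification(activeLotto) directly, since updateBatch was removed.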
In a recent task, after I created an object I flushed the result to the database. The database table had a unique constraint, meaning that if I tried to flush the same record for the second time, I would get a ConstraintViolationException. A sample snippet is shown below:
createEntityAndFlush(result);
sendAsyncRequestToThirdSystem(param);
The code for the createEntityAndFlush:
private T createEntityAndFlush(final T entity) throws ServiceException {
    log.debug("Persisting {}", entity.getClass().getSimpleName());
    getEntityManager().persist(entity);
    getEntityManager().flush();
    return entity;
}
The reason I used flush was that I wanted to make sure a ConstraintViolationException would be thrown before the transaction finished, and thus before sendAsyncRequestToThirdSystem was called. But that was not the case: sendAsyncRequestToThirdSystem was still called, and the exception only surfaced afterwards.
To test the code under race conditions, I used the ManagedExecutorService and created two runnable tasks (Future<?> submit(Runnable task)) to replicate the incoming request.
Eventually the problem was solved by taking a lock on a new table for each unique request id, but I would like to know where I went wrong in my first approach (e.g. a wrong use of flush, or the ManagedExecutorService being responsible for the awkward behaviour). Thanks in advance!
The issue is that while flush() does push the changes to the database, the transaction is still open, and the unique constraint is checked when the transaction is committed (this may depend on the database, but it is the case at least with Postgres and any MVCC-based DB).
So you will need to make sure that createEntityAndFlush(result) runs in its own transaction, possibly with @Transactional(propagation = Propagation.REQUIRES_NEW) (or the equivalent if you are not using Spring), to see whether the unique index is violated.
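A minimal sketch of that suggestion with Spring; the body is the one from the question, and the assumptions are that the method is made public and is called through the Spring proxy (i.e. from another bean), otherwise the REQUIRES_NEW propagation is not applied:

@Transactional(propagation = Propagation.REQUIRES_NEW)
public T createEntityAndFlush(final T entity) throws ServiceException {
    log.debug("Persisting {}", entity.getClass().getSimpleName());
    getEntityManager().persist(entity);
    getEntityManager().flush();
    // The inner transaction commits when this method returns, so a unique
    // constraint violation surfaces here, before the caller ever reaches
    // sendAsyncRequestToThirdSystem.
    return entity;
}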
There is one batch job looking like this:
@Transactional
public void myBatchJob() {
    // retrieves thousands of entries and locks them
    // to prevent other jobs from touching this dataset
    entries = getEntriesToProcessWithLock();
    additional = doPrepWork(); // interacts with DB
    processor = applicationContext.getBean(getClass());
    while (!entries.isEmpty()) {
        result = doActualProcessing(entries, additional); // takes as many entries as it needs; removes them from the collection afterwards
        resultDao.save(result);
    }
}
However, I occasionally get the error below if the entries collection is big enough.
ORA-01000: maximum open cursors exceeded
I decided to blame the doActualProcessing() and save() methods, as they could end up creating hundreds of blobs in one transaction.
The obvious way out seems to be to split the processing into multiple transactions: one for getting and locking the entries, and several others for processing and persisting. Like this:
@Transactional
public void myBatchJob() {
    // retrieves thousands of entries and locks them
    // to prevent other jobs from touching this dataset
    entries = getEntriesToProcessWithLock();
    additional = doPrepWork(); // interacts with DB
    processor = applicationContext.getBean(getClass());
    while (!entries.isEmpty()) {
        processor.doProcess(entries, additional);
    }
}

@Transactional(propagation=REQUIRES_NEW)
public void doProcess(entries, additional) {
    result = doActualProcessing(entries, additional); // takes as many entries as it needs; removes them from the collection afterwards
    resultDao.save(result);
}
and now whenever doProcess is called I get:
Caused by: org.hibernate.HibernateException: illegally attempted to associate a proxy with two open Sessions
How do I make HibernateTransactionManager do what the REQUIRES_NEW javadoc suggests: suspend the current transaction and start a new one?
In my opinion the problem lies in the fact that you have retrieved the entities in the outer transaction, and while they are still associated with that transaction you pass them (as proxies) to a method that is processed in a separate transaction.
I think you could try two options:
1) Detach the entities before invoking processor.doProcess(entries, additional):
session.evict(entity); // loop through the list and do this
then inside the inner transaction try to merge:
session.merge(entity);
2) The second option would be to retrieve ids instead of entities in getEntriesToProcessWithLock. Then you would pass plain primitive fields, which won't cause proxy problems, and retrieve the actual entities inside the inner transaction, as sketched below.
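A rough sketch of that second option, assuming getEntriesToProcessWithLock can be changed to return ids and that a DAO can load entities by id; the class and type names (MyBatchJob, Entry, Additional, Result, entryDao) and the chunking via Guava's Lists.partition are illustrative assumptions:

@Transactional
public void myBatchJob() {
    // ids only: no managed entities leave this transaction
    List<Long> entryIds = getEntryIdsToProcessWithLock();
    Additional additional = doPrepWork();
    MyBatchJob processor = applicationContext.getBean(getClass());
    for (List<Long> chunk : Lists.partition(entryIds, 100)) {
        processor.doProcess(chunk, additional);
    }
}

@Transactional(propagation = Propagation.REQUIRES_NEW)
public void doProcess(List<Long> chunkIds, Additional additional) {
    // The entities are loaded inside the new transaction, so they belong to
    // this session only and no proxy is shared between two open Sessions.
    List<Entry> entries = entryDao.findByIds(chunkIds);
    Result result = doActualProcessing(entries, additional);
    resultDao.save(result);
}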
I am trying to perform batch inserts with data that is currently being inserted into the DB one statement per transaction. The transaction code looks similar to the snippet below. Currently, the addHolding() method is called for each quote that comes in from an external feed, and these quote updates happen about 150 times per second.
public class HoldingServiceImpl {

    @Autowired
    private HoldingDAO holdingDao;

    @Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = Exception.class)
    public void addHolding(Quote quote) {
        Holding holding = transformQuote(quote);
        holdingDao.addHolding(holding);
    }
}
And the DAO gets the current session from the Hibernate SessionFactory and calls save on the object.
public class HoldingDAOImpl {

    @Autowired
    private SessionFactory sessionFactory;

    public void addHolding(Holding holding) {
        sessionFactory.getCurrentSession().save(holding);
    }
}
I have looked at the Hibernate batching documentation, but it is not clear from the document how I would organize the code for batch inserting in this case, since I don't have the full list of data at hand but am waiting for it to stream in.
Does merely setting the Hibernate batching property in the properties file (e.g. hibernate.jdbc.batch_size=20) "magically" batch these inserts? Or will I need to, say, capture each quote update in a synchronized list, and then insert the list's contents and clear the list when the batch size limit is reached?
Also, the whole purpose of implementing batching is to see if performance improves. If there is a better way to handle inserts in this scenario, let me know.
Setting the property hibernate.jdbc.batch_size=20 tells Hibernate to group the pending insert statements into JDBC batches of up to 20 when the session is flushed.
When you call session.save(), the insert is only queued in the in-memory Hibernate session (the first-level cache); only once a flush happens does Hibernate synchronize these changes with the database. Hence setting the Hibernate batch size is enough to get batched inserts at flush time; fine-tune the batch size according to your needs.
Also make sure your transactions are handled properly: committing a transaction also forces Hibernate to flush the session.
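For reference, the flush/clear pattern from the Hibernate batching documentation looks roughly like this once a chunk of holdings has been collected; the holdings list, the batchSize variable and the property placement are assumptions for illustration, while hibernate.jdbc.batch_size itself is a standard Hibernate setting:

// in the Hibernate/Spring properties:
// hibernate.jdbc.batch_size=20

Session session = sessionFactory.getCurrentSession();
int batchSize = 20; // keep in sync with hibernate.jdbc.batch_size
for (int i = 0; i < holdings.size(); i++) {
    session.save(holdings.get(i));
    if (i > 0 && i % batchSize == 0) {
        // sends the queued inserts to the DB as JDBC batches and frees session memory
        session.flush();
        session.clear();
    }
}

Note that inserts are only grouped into JDBC batches when a flush happens inside a single transaction; with one REQUIRES_NEW transaction per quote, each commit still issues a single insert, so buffering quotes (for example in the synchronized list you mention) and saving them within one transaction is what actually lets the batch size take effect.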