How to audit in a for-loop (Hibernate Envers)? - java

I am using Hibernate Envers for auditing.
It works fine, but today I realized that it doesn't if I create entities in a for-loop.
After enabling SQL logging I figured out that the revision tables are not updated after each iteration. Apparently Hibernate collects all changes and fires the audit statements at the end of the request. How can I make Hibernate perform the auditing after each iteration of my for-loop?
What I already tried:
for (...) {
    Obj a = new Obj();
    objRepository.save(a);
    entityManager.flush();
    entityManager.clear();
}

As @gtosto points out, Hibernate Envers operates on transaction boundaries, so audit records won't be flushed and persisted until the transaction commits.
One way to synchronize this would be to manually control the transaction boundary yourself as a part of the for-loop so you basically persist small buckets of the list and commit.
The downside here is that this can be performance intensive, particularly if the list of objects you're trying to persist is quite large.
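For illustration, a minimal sketch of that batching approach using Spring's TransactionTemplate; the ObjRepository type, the BatchedImport class, and the batch size are assumptions, not part of the original question:

import java.util.List;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.support.TransactionTemplate;

public class BatchedImport {

    private final ObjRepository objRepository;          // assumed Spring Data repository for Obj
    private final TransactionTemplate transactionTemplate;

    public BatchedImport(ObjRepository objRepository, PlatformTransactionManager txManager) {
        this.objRepository = objRepository;
        this.transactionTemplate = new TransactionTemplate(txManager);
    }

    public void importAll(List<Obj> items, int batchSize) {
        for (int i = 0; i < items.size(); i += batchSize) {
            List<Obj> batch = items.subList(i, Math.min(i + batchSize, items.size()));
            // each batch commits in its own transaction, so Envers writes its
            // audit rows (and REV entries) per batch rather than once at the end
            transactionTemplate.execute(status -> {
                for (Obj a : batch) {
                    objRepository.save(a);
                }
                return null;
            });
        }
    }
}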
The jira issue HHH-9622 outlines a request to make the AuditProcess flushable; however, there are consequences to introducing such behavior that need to be considered.

In fact the problem was that I had added the @Transactional annotation to the respective class. Remove it and Hibernate will fire the audit statements as soon as you call objRepository.save(a). No need for the entity manager.

Related

Hibernate Envers and Hibernate Auto Flush

I've got a Spring application using Hibernate. I've implemented Envers into it, which is working fine. However, Hibernate will by default automatically flush before some transactions are committed.
For example, I have an MVC endpoint that will update a record, but before saving it, will have to make various other queries to retrieve some other data. Each time another query is run, Hibernate flushes and this results in there being multiple audit rows for each change. This creates some confusion, as there is already a modified date on my record which isn't changed in each update (as it's flushing before this property is changed).
What are my options for managing this more effectively, and creating a reliable audit log even with Hibernate flushing in this way? Is the only answer to implement my own listener with some custom logic to check if it should actually be committing an audit change or not?
You can detach the entity and merge it back when you are done. The intermediate queries only trigger a flush if they touch tables that would be affected by pending inserts/updates/deletes. Native queries are a different topic: Hibernate has no SQL parser to figure out which tables you are touching, so it is conservative and flushes all pending changes.
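A minimal sketch of that detach/merge approach, assuming a plain EntityManager; the MyRecord entity and its setter are hypothetical placeholders:

MyRecord record = entityManager.find(MyRecord.class, id);
entityManager.detach(record);            // no longer managed: auto-flush before queries skips it

record.setSomeField(newValue);           // edits happen on the detached copy
// ... run the other lookup queries needed to build the update ...

entityManager.merge(record);             // reattach at the end: one UPDATE and one audit row at commit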

How would I audit the changes to a list of JPA entities?

I've got two lists of entities: One that is the current state of the rows in the DB, the other is the changes that were made to the list. How do I audit the rows that were deleted, added, and the changes made to the entities? My audit table is used by all the entities.
Entity listeners and Callback methods look like a perfect fit, until you notice the sentence that says: A callback method must not invoke EntityManager or Query methods! Because of this restriction, I can collect audits, but I can't persist them to the database :(
My solution has been a complex algorithm to discover the audits.
If the entity is in the change list and has no key, it's an add
If the entity is in the db but not the changes list, it's a delete
If the entity is in both lists, recursively compare their fields to find differences to audit (if any)
I collect these and insert them into the DB in the same transaction I merge the changes list. But I hate the fact that I'm writing this by hand. It seems like JPA should be able to do this logic for me.
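For concreteness, a rough sketch of that comparison; the Item entity, AuditEntry type, and compareFields helper are hypothetical stand-ins, not the actual classes:

List<AuditEntry> audits = new ArrayList<>();
for (Item changed : changesList) {
    if (changed.getId() == null) {
        audits.add(AuditEntry.forAdd(changed));              // no key yet -> it's an add
    }
}
for (Item existing : dbList) {
    Item match = changesList.stream()
            .filter(c -> existing.getId().equals(c.getId()))
            .findFirst().orElse(null);
    if (match == null) {
        audits.add(AuditEntry.forDelete(existing));          // in the DB but not in the changes -> delete
    } else {
        audits.addAll(compareFields(existing, match));       // in both -> compare fields for differences
    }
}
// persist 'audits' in the same transaction that merges the changes list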
One solution we've come up with is to use an Entity Listener that posts the audits to a JMS queue. The queue then inserts the audits into the database. But I don't like this solution because I think setting up a JMS queue is a pain. It's currently the best solution we've got though.
I'm using EclipseLink (ideally, that's not relevant) and have found these two things that look helpful, but the JMS queue is a better solution than either of them:
http://wiki.eclipse.org/EclipseLink/FAQ/JPA#How_to_access_what_changed_in_an_object_or_transaction.3F This looks really difficult to use. You search for the fields by a string. So if I refactor my entity and forget to update this, it'll throw a runtime error.
http://wiki.eclipse.org/EclipseLink/Examples/JPA/History This isn't consistent with the way we currently audit. It expects a special entity_history table.
The EntityListener looks like a good approach since you are able to collect the audit information.
Have you tried persisting the audit information in a different transaction than the one persisting the changes? Perhaps obtain a reference to a stateless EJB (assuming you are using EJBs) and use a method marked with @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW). In this way the transaction persisting the original changes is put on hold while the transaction for the audit completes. Note that you will not be able to access the updated information in this separate audit transaction, since the original one has not committed yet.
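A minimal sketch of such a bean; the AuditWriter name and AuditRecord entity are placeholders:

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class AuditWriter {

    @PersistenceContext
    private EntityManager em;

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public void write(AuditRecord record) {
        // runs in its own transaction, so the audit row commits even though
        // the caller's transaction is still in flight
        em.persist(record);
    }
}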

JPA not taking update into account within single Transaction

Within a transactional service method, I loop, querying the database for the first 10 entities of type A that match a criterion.
I update each A entity from the list so that they don't match the criteria anymore, and call a flush() to make sure changes are made.
The second call to the query within the loop returns the exact same set of A entities.
Why isn't the flushed change on the entities taken into account?
I'm using JPA 2.0 with Hibernate 4.1.7.
The same process using the Hibernate API directly seems to work.
I've turned off the second level cache and the query cache, to no avail.
I'm using a rather plain configuration: JpaTransactionManager, Spring over JPA over Hibernate. The main method is annotated with @Transactional.
The code would be something like this:
do {
    modelList = exportContributionDao.getContributionListToExport(10);
    for (M m : modelList) {
        // export m to a file
        m.setToBeExported(false);   // mark as exported so it no longer matches the criterion
        super.flush();
    }
} while (modelList.size() == 10);
With each iteration of the loop, the DAO method returns the same 10 results; JPA does not take the updated 'isToBeExported' attribute into account.
I'm not trying to solve a problem, rather I want to understand why JPA is not behaving as expected here.
I expect this to be a 'classic' problem.
No doubt it would be solved if the transaction were committed at each iteration.
AFAIK, the L1 cache, i.e. the Session when Hibernate is the underlying JPA provider, should be up to date, and the query in the second iteration should take the updated entities into account, even though the changes haven't been committed yet.
So my question is: why isn't that the case? Is this a misconfiguration or known behavior?
Flush does not necessarily commit the changes to the database. What do you want to achieve? From what I understand, you do something like:
Loop about entities
Within the loop, change the entity
Call 'flush' on the entity
Read the entity back again
You wonder why the data did not change on the database?
If this is correct, why do you re-read the changes instead of just working with the elements you already have? After leaving the transaction, the changes will be made persistent automatically.
This should definitely be working.
This is a configuration problem on our part.
Apologies for the question; the reason was pretty hard to spot, but I hope the answer will at least be useful to some:
JPA definitely takes into account changes made on entities within a single transaction.

EntityManager refresh

I have a web application using JPA. The entity manager keeps a bunch of entities, and suddenly I update the database from the other side: I use MySQL, and I change some rows via phpMyAdmin.
How do I tell the entity manager to re-synchronize, i.e. to forget all the entities in its cache?
I know there is the refresh(Object) method, but is there any way to do a refreshAll() or something with the same effect?
I'm sure this is an expensive operation, but it has to be done.
entityManager.getEntityManagerFactory().getCache().evictAll()
Refresh is something different since it modifies your object. This line will just empty the cache, so if you fetch objects changed outside the entity manager, it will do an actual database query instead of using the outdated cached value.
I had a similar issue and the evictAll() line above worked for me.
Alternatively, the @Cache annotation on the entity class worked too, with the benefit of being able to control caching parameters:
@Cache(coordinationType = CacheCoordinationType.INVALIDATE_CHANGED_OBJECTS)
See: http://wiki.eclipse.org/EclipseLink/Examples/JPA/Caching
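For context, a minimal sketch of that annotation in place on an entity (EclipseLink API; the SomeEntity class is a placeholder):

import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.Cache;
import org.eclipse.persistence.annotations.CacheCoordinationType;

@Entity
@Cache(coordinationType = CacheCoordinationType.INVALIDATE_CHANGED_OBJECTS)
public class SomeEntity {

    @Id
    private Long id;   // changed instances are invalidated in the shared cache instead of kept stale
}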
If you are using EclipseLink instead of Hibernate the hint is:
em.createNamedQuery("SomeEntity.SomeNamedQuery")
.setHint(QueryHints.REFRESH, true)
.getResultList();
For those (like me) who tried adding factory.getCache().evictAll(); without success, and who are using JPA + Hibernate: to refresh a query, set the hint org.hibernate.cacheMode to IGNORE. Example:
em.createNamedQuery("SomeEntity.SomeNamedQuery")
.setHint("org.hibernate.cacheMode", "IGNORE")
.getResultList();
cache.evictAll() did not work for me, so to retrieve data pushed from another app I perform:
em.getTransaction().begin();
em.getTransaction().commit();
After that, my find query retrieves refreshed data. I don't know if it's a very safe solution, but it works properly.
When you read an object into an EntityManager, it becomes part of the persistence context, and the same object will remain in the EntityManager until you either clear() it or get a new EntityManager.
So if you update the database, the EntityManager will not see the change unless you call refresh() on the object or clear() the EntityManager. This has nothing to do with the shared cache (L2); it is the nature of the persistence context (L1). If you are also using a shared cache and updating the database directly, then your shared cache will be out of date. You need to refresh() the object, or mark it as invalid so it is refreshed the next time it is queried.
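To illustrate the options just described (a small hedged sketch; the staleEntity variable is a placeholder):

// reload a single managed object from the database
em.refresh(staleEntity);

// or drop the whole persistence context so subsequent finds hit the database
em.clear();

// and, if a shared (L2) cache is enabled, drop that too
em.getEntityManagerFactory().getCache().evictAll();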
The code should follow a sequence like this:
DETACH
REFRESH
MERGE
FLUSH

Hibernate transaction problem

We are using Hibernate and Spring MVC with the OpenSessionInView filter.
Here is a problem we are running into (pseudocode):
transaction 1
    load object foo
transaction 1 end

update foo's properties (not calling session.save or session.update, only foo's setters)
validate foo (using Hibernate Validator)

if validation fails:
    go back to edit screen
    transaction 2 (read only)
        load form backing objects from db
    transaction 2 end
    go to view
else:
    transaction 3
        session.update(foo)
    transaction 3 end
The problem we have is when the validation fails:
foo is marked "dirty" in the Hibernate session (since we use OpenSessionInView we only have one session throughout the HTTP request). When we load the form backing objects (like a list of some entities using an HQL query), Hibernate checks before performing the query whether there are dirty objects in the session; it sees that foo is dirty and flushes it, and when transaction 2 is committed the updates are written to the database.
The problem is that even though transaction 2 is read-only, and even though foo wasn't updated in it, Hibernate has no knowledge of which object was updated in which transaction and doesn't flush only objects from that transaction.
Any suggestions? Has somebody run into a similar problem before?
Update: this post sheds some more light on the problem: http://brian.pontarelli.com/2007/04/03/hibernate-pitfalls-part-2/
You can run a get on foo to put it into the Hibernate session, and then replace it with the object you created elsewhere. But for this to work, you have to know all the IDs for your objects so that the IDs will look correct to Hibernate.
There are a couple of options here. The first is that you don't actually need transaction 2: since the session is open, you could just load the backing objects from the db, thus avoiding the dirty check on the session. The other option is to evict foo from the session after it is retrieved, and later use session.merge() to reattach it when you want your changes to be stored.
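A small sketch of the evict/merge option, using the Hibernate Session API; the Foo entity, its setter, and the validation call are placeholders:

Foo foo = (Foo) session.get(Foo.class, fooId);
session.evict(foo);                 // detached: later queries no longer auto-flush foo's pending changes

foo.setSomeProperty(newValue);      // edit the detached instance via its setters

if (validationPasses(foo)) {
    session.merge(foo);             // reattach only when valid; the update is flushed at commit
} else {
    // foo stays detached, so the read-only transaction never writes it
}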
With Hibernate it is important to understand what exactly is going on under the covers. At every commit boundary it will attempt to flush all changes to objects in the current session, regardless of whether or not the changes were made in the current transaction, or any transaction at all for that matter. This is why you don't actually need to call session.update() for any object that is already in the session.
Hope this helps
There is a design issue here. Do you think an ORM is a transparent abstraction of your datastore, or do you think it's a set of data manipulation libraries? I would say that Hibernate is the former. Its whole reason for existing is to remove the distinction between your in-memory object state and your database state. It does provide low-level mechanisms to allow you to pry the two apart and deal with them separately, but by doing so you're removing a lot of Hibernate's value.
So very simply - Hibernate = your database. If you don't want something persisted, don't change your persistent objects.
Validate your data before you update your domain objects. By all means validate domain objects as well, but that's a last line of defense. If you do get a validation error on a persistent object, don't swallow the exception. Unless you prevent it, Hibernate will do the right thing, which is to close the session there and then.
What about using Session.clear() and/or Session.evict()?
What about setting singleSession=false on the filter? That might put your operations into separate sessions so you don't have to deal with the 1st level cache issues. Otherwise you will probably want to detach/attach your objects manually as the user above suggests. You could also change the FlushMode on your Session if you don't want things being flushed automatically (FlushMode.MANUAL).
Implement a service layer, take a look at Spring's @Transactional annotation, and mark your methods as @Transactional(readOnly=true) where applicable.
Your flush mode is probably set to auto, which means you don't really have control over when changes are flushed to the DB.
You could also set your flush mode to manual, and your services/repos will only synchronize the DB with your app when you tell them to.
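As an illustration of the manual flush mode idea (a hedged sketch; the Foo entity, its setter, and the validation flag are placeholders):

Session session = sessionFactory.getCurrentSession();
session.setFlushMode(FlushMode.MANUAL);   // nothing is written until flush() is called explicitly

Foo foo = (Foo) session.get(Foo.class, fooId);
foo.setSomeProperty(newValue);            // dirty, but intermediate queries no longer trigger a flush

// ... run read-only queries, validation, etc. ...

if (valid) {
    session.flush();                      // only now are the UPDATEs sent to the database
}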
