Hibernate Dirty Object usage

I have a Hibernate entity in my code. I fetch it and, based on the value of one of its properties, say "isProcessed", I go on to:
change the value of "isProcessed" to "Yes" (the property that I checked)
add a task to a DelayedExecutor.
In my performance test, I found that if I hammer the function, a classic dirty-read scenario occurs: too many tasks are added to the executor, and all of them get executed. I can't deduplicate by comparing the objects in the queue; Java simply executes every task that was added.
How can I use Hibernate's dirty-object tracking to check "isProcessed" before adding the task to the executor? Would it work?
I hope I have explained this clearly enough.

If you can run all of the queries that dispatch your tasks through the same Session, you can probably patch something together. The caveat is that you have to understand how Hibernate's caching mechanisms (yes, that's plural) work. The first-level cache that is associated with the Session is going to be the key here. Also, it's important to know that executing a query and hydrating objects will not look into and return objects from the first-level cache...the right hand is not talking to the left hand.
So, to accomplish what you're trying to do (assuming you can keep using the same Session...if you can't, then I think you're out of luck), you can do the following:
1) execute your query
2) for each returned object, re-load it with Session's get method
3) check the isProcessed flag and dispatch if need be
By calling get, you'll be sure to get the object from the first-level cache...where all the dirty objects pending flush are held.
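To make the idea concrete, here is a minimal plain-Java sketch of what a session-scoped identity map buys you. It is not Hibernate code: ToySession, Job and claim are hypothetical names made up for illustration, and the loader lambda only simulates a database read. It just shows that when every lookup goes through the same cache, a flag flipped earlier in the session is visible to the next check, even though nothing was flushed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java toy, NOT the Hibernate API: mimics a session-scoped identity map.
class FirstLevelCacheSketch {

    // Hypothetical entity carrying the flag from the question.
    static class Job {
        final long id;
        boolean processed;

        Job(long id, boolean processed) {
            this.id = id;
            this.processed = processed;
        }
    }

    // Hypothetical stand-in for a Session: at most one cached instance per id.
    static class ToySession {
        private final Map<Long, Job> cache = new HashMap<>();
        private final Function<Long, Job> loader; // simulates a database read

        ToySession(Function<Long, Job> loader) {
            this.loader = loader;
        }

        // Returns the cached instance if there is one, so unflushed changes stay visible.
        Job get(long id) {
            return cache.computeIfAbsent(id, loader);
        }
    }

    // Checks the flag on the session's instance, flips it, and reports whether to dispatch.
    static boolean claim(ToySession session, long id) {
        Job job = session.get(id);
        if (job.processed) {
            return false; // already claimed earlier in this session
        }
        job.processed = true; // changed in memory only, as if pending flush
        return true;
    }

    public static void main(String[] args) {
        // The simulated database always reports the job as unprocessed.
        ToySession session = new ToySession(id -> new Job(id, false));
        System.out.println(claim(session, 1L));
        System.out.println(claim(session, 1L));
    }
}
```

Like a real Session, this toy is not thread-safe, and it does nothing for callers that use a different session; that case needs locking at the database level.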
For background, this is an extremely well-written and helpful document about hibernate caching.

Related

Concurrency with Hibernate in Spring

I found a lot of posts regarding this topic, but all answers were just links to documentations with no example code, i.e., how to use concurrency in practice.
My situation: I have an entity House with (for simplification) two attributes, number (the id) and owner. The database is initialized with 10 Houses with numbers 1-10 and owner always null.
I want to assign a new owner to the house with currently no owner, and the smallest number. My code looks like this:
@Transactional
void assignNewOwner(String newOwner) {
    // this is flagged as @Transactional too
    House tmp = houseDao.getHouseWithoutOwnerAndSmallestNumber();
    tmp.setOwner(newOwner);
    // this is flagged as @Transactional too
    houseDao.update(tmp);
}
As I understand it, even though @Transactional is used, the same House could be assigned to two different owners if two requests fetch the same empty House as tmp. How do I ensure this cannot happen?
I know that folding the update into the selection of the empty House would solve the issue, but in the near future I want to do more work with the tmp object.

Optimistic
If you add a version column to your entity/table, you can take advantage of a mechanism called optimistic locking. It is an efficient way of making sure that the state of an entity has not changed since it was read in a transactional context.
Once you createQuery using the session you can then call setLockMode(LockModeType.OPTIMISTIC);
Then, just before the transaction is committed, the persistence provider queries for the current version of that entity and checks whether it has been incremented by another transaction. If so, you get an OptimisticLockException and the transaction is rolled back.
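A rough plain-Java sketch of the version comparison behind optimistic locking, using the House example from the question. This is not JPA code: HouseTable, HouseRow and StaleVersionException are hypothetical names, and the in-memory map only stands in for a table with a version column. The check in update is the same in spirit as an UPDATE whose WHERE clause includes the version that was read.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java toy, NOT JPA: shows the version comparison behind optimistic locking.
class OptimisticVersionSketch {

    // Hypothetical immutable row: owner plus a version counter.
    static class HouseRow {
        final String owner;
        final int version;

        HouseRow(String owner, int version) {
            this.owner = owner;
            this.version = version;
        }
    }

    // Hypothetical exception standing in for OptimisticLockException.
    static class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    // Hypothetical in-memory table keyed by house number.
    static class HouseTable {
        private final Map<Integer, HouseRow> rows = new HashMap<>();

        synchronized void insert(int number) {
            rows.put(number, new HouseRow(null, 0));
        }

        synchronized HouseRow read(int number) {
            return rows.get(number);
        }

        // Writes only if the row still has the version the caller read; bumps the version.
        synchronized void update(int number, int expectedVersion, String newOwner) {
            HouseRow current = rows.get(number);
            if (current.version != expectedVersion) {
                throw new StaleVersionException("house " + number + " was changed concurrently");
            }
            rows.put(number, new HouseRow(newOwner, expectedVersion + 1));
        }
    }

    public static void main(String[] args) {
        HouseTable table = new HouseTable();
        table.insert(1);
        // Two requests read the same empty house.
        HouseRow seenByA = table.read(1);
        HouseRow seenByB = table.read(1);
        table.update(1, seenByA.version, "alice");
        try {
            table.update(1, seenByB.version, "bob"); // stale version, rejected
        } catch (StaleVersionException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(table.read(1).owner);
    }
}
```

The second writer loses and has to decide what to do next, for example re-read and pick another house.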
Pessimistic
If you do not version your rows, then you are left with pessimistic locking, which basically means that you physically create a lock on the queried entities at the database level, so that other transactions cannot read/update those rows.
You achieve that by setting this on the Query object:
setLockMode(LockModeType.PESSIMISTIC_READ);
or
setLockMode(LockModeType.PESSIMISTIC_WRITE);

Actually it's pretty easy, at least in my opinion, and I am going to abstract away what Hibernate generates when you say pessimistic/optimistic. You might think this is SELECT FOR UPDATE, but that's not always the case; MSSQL, AFAIK, does not have that...
These are JPA lock modes and they guarantee some functionality, not the implementation.
Fundamentally they are entirely different things: PESSIMISTIC vs OPTIMISTIC locking. When you do pessimistic locking you sort of enter a synchronized block, at least logically: you can do whatever you want and you are safe within the scope of the transaction. Whether the lock is held for the row, the table, or even the page is unspecified, so it's a bit dangerous. Databases may escalate locks; MSSQL does that, if I recall correctly.
Obviously lock starvation is an issue, so you might think that OPTIMISTIC locking would help. As a side note, this is the same idea as transactional memory in modern CPUs.
So optimistic locking is like saying: I will mark this row with an ID/date, etc., then I will take a snapshot of that and work with it; before committing I will check whether that ID has changed. Obviously there is contention on that ID, but not on the data. If it has changed, abort (aka throw OptimisticLockException); otherwise commit the work.
The thing that bothers everyone IMO is that OptimisticLockException - how do you recover from that? And here is something you are not going to like - it depends. There are apps where a simple retry would be enough, there are apps where this would be impossible. I have used it in rare scenarios.
I usually go with Pessimistic locking (unless Optimistic is totally not an option). At the same time I would look at what Hibernate generates for that query. For example, you might need an index on the column used to retrieve the entry so that the DB actually locks just the row, because ultimately that is what you want.

Creating a Jooq Caching Layer

So, I'm working on using Jooq to create a caching layer over Postgres. I've been using the MockConnection/MockDataProvider objects to intercept every query, and this is working, but I'm having a few issues.
First, how do I distinguish between reads and writes? That is, how do I tell whether a query is an insert/update/etc. or a select, given only the MockExecuteContext that's passed into the execute method in MockDataProvider?
And I'm a bit confused on how I can do invalidations. The basic scheme I'm implementing right now is that whenever a "write" query is made to a table, I invalidate all cached queries that involve that table. This goes back to my first question, on telling different types of queries from each other, but also brings up another issue: how would I identify the tables used in a query given only the sql string and the bindings (both are attributes of MockExecuteContext)?
Also, is this a correct approach at caching? My first thought was to override the fetch() method, but that method is final, and I'd rather not change something already embedded in Jooq itself. This is the only other way I could think of to intercept all requests made so I could create a separate, persistent caching layer.
I have seen this (https://groups.google.com/forum/#!topic/jooq-user/xSjrvnmcDHw) question, but I'm still not clear on how Lukas recommended to identify tables from the object. I can try to implement a Postgres NOTIFY, but I wanted something native in Jooq first. I've seen this issue (https://github.com/jOOQ/jOOQ/issues/2665) pop up a lot too, but I'm not sure how it applies.
Keep in mind that I'm new to Jooq, so it's quite possible that I'm missing something obvious.
Thanks!

Parallel updates to different entity properties

I'm using JDO to access Datastore entities. I'm currently running into issues because different processes access the same entities in parallel and I'm unsure how to go about solving this.
I have entities containing values and calculated values: (key, value1, value2, value3, calculated)
The calculation happens in a separate task queue.
The user can edit the values at any time.
If the values are updated, a new task is pushed to the queue that overwrites the old calculated value.
The problem I currently have is in the following scenario:
1) User creates entity
2) Task is started
3) User notices an error in his initial entry and quickly updates the entity
4) Task finishes based on the old data (from step 1) and overwrites the entire entity, also removing the newly entered values (from step 3)
5) User is not happy
So my questions:
Can I make the task fail on update in step 4? Wrapping the task in a transaction does not seem to solve this issue for all cases due to eventual consistency (or, quite possibly, my understanding of datastore transactions is just wrong)
Is using the low-level setProperty method the only way to update a single field of an entity and will this solve my problem?
If none of the above, what's the best way to deal with a use case like this?
background:
At the moment, I don't mind trading performance for consistency. I will care about performance later.
This was my first AppEngine application, and because it was a learning process, it does not use some of the best practices. I'm well aware that, in hindsight, I should have thought longer and harder about my data schema. For instance, none of my entities use ancestor relationships where they would be appropriate. I come from a relational background and it shows.
I am planning a major refactoring, probably moving to Objectify, but in the meantime I have a few urgent issues that need to be solved ASAP. And I'd like to first fully understand the Datastore.

Obviously JDO comes with optimistic concurrency checking (should the user enable it) for transactions, which would prevent/reduce the chance of such things. Optimistic concurrency is equally applicable with relational datastores, so you likely know what it does.
Google's JDO plugin uses the low-level API setProperty() method obviously. The log even tells you what low level calls are made (in terms of PUT and GET). Moving to some other API will not on its own solve such problems.

Whenever you need to handle write conflicts in GAE, you almost always need transactions. However, it's not just as simple as "use a transaction":
First of all, make sure each logical unit of work can be defined in a transaction. There are limits to transactions; no queries without ancestors, only a certain number of entity groups can be accessed. You might find you need to do some extra work prior to the transaction starting (ie, lookup keys of entities that will participate in the transaction).
Make sure each unit of work is idempotent. This is critical. Some units of work are automatically idempotent, for example "set my email address to xyz". Some units of work are not automatically idempotent, for example "move $5 from account A to account B". You can make transactions idempotent by creating an entity before the transaction starts, then deleting the entity inside the transaction. Check for existence of the entity at the start of the transaction and simply return (completing the txn) if it's been deleted.
When you run a transaction, catch ConcurrentModificationException and retry the process in a loop. Now when any txn gets conflicted, it will simply retry until it succeeds.
The only bad thing about collisions here is that it slows the system down and wastes effort during retries. However, you will get at least one completed transaction per second (maybe a bit less if you have XG transactions) throughput.
Objectify4 handles the retries for you; just define your unit of work as a run() method and run it with ofy().transact(). Just make sure your work is idempotent.
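The retry loop described above might look roughly like this in plain Java. runWithRetries is a hypothetical helper, not part of any GAE or Objectify API, and the Supplier stands in for "begin transaction, do the idempotent unit of work, commit". Unlike the "retry until it succeeds" wording above, this sketch gives up after maxAttempts, which is an assumption on my part to avoid spinning forever.

```java
import java.util.ConcurrentModificationException;
import java.util.function.Supplier;

// Plain-Java sketch of retry-on-conflict; the Supplier is assumed to wrap one idempotent transaction.
class RetrySketch {

    // Runs the unit of work, retrying on ConcurrentModificationException up to maxAttempts times.
    static <T> T runWithRetries(Supplier<T> work, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ConcurrentModificationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (ConcurrentModificationException e) {
                last = e; // another writer got in first; try the whole unit of work again
            }
        }
        throw last; // every attempt collided
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated unit of work that collides twice before succeeding.
        String result = runWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new ConcurrentModificationException("simulated collision");
            }
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Because the whole Supplier is re-run on every collision, this only stays correct if the work is idempotent, as stressed above.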

The way I see it, you can either prevent the first task from updating the object because certain values have changed since the task was first launched,
or you can embed the object's values within the task request so that the second calc task restores the object state with consistent values and calculated members.

StaleObjectstateException row was updated or deleted by

I am getting this exception in a controller of a web application based on spring framework using hibernate. I have tried many ways to counter this but could not resolve it.
In the controller's method, handleRequestInternal, calls are made to the database mainly for 'read', unless it's a submit action.
I had been using Spring's Session but moved to getHibernateTemplate(), and the problem still remains.
Basically, the second call to the database throws this exception. That is:
1) getEquipmentsByNumber(number) { firstly an equipment is fetched from the DB based on the 'number', which has a list of properties and each property has a list of values. I loop through those values (primitive objects Strings) to read in to variables)
2) getMaterialById(id) {fetches materials based on id}
I do understand that the second call is most probably making the session flush, but I am only 'reading' objects, so why does the second call throw the stale object state exception on the Equipment property if nothing has changed?
I cannot clear the cache after the call since it causes LazyExceptions on objects that I pass to the view.
I have read this:
https://forums.hibernate.org/viewtopic.php?f=1&t=996355&start=0
but could not solve the problem based on the suggestions provided.
How can I solve this issue? Any ideas and thoughts are appreciated.
UPDATE:
What I just tested: in the function getEquipmentsByNumber(), after reading the variables from the list of properties, I call getHibernateTemplate().flush(); and now the exception is thrown on this line rather than on the call to fetch the material (that is, getMaterialById(id)).
UPDATE:
Before explicitly calling flush, I am removing the object from session cache so that no stale object remains in the cache.
getHibernateTemplate().evict(equipment);
getHibernateTemplate().flush();
OK, so now the problem has moved to the next fetch from DB after I did this. I suppose I have to label the methods as synchronized and evict the Objects as soon as I am finished reading their contents! it doesn't sound very good.
UPDATE:
Made the handleRequestInternal method "synchronized". The error disappeared. Of course, this is not the best solution, but what to do!
Tried in handleRequestInternal to close the current session and open a new one. But it would cause other parts of the app not to work properly. Tried to use ThreadLocal that did not work either.

You're mis-using Hibernate in some way that causes it to think you're updating or deleting objects from the database.
That's why calling flush() is throwing an exception.
One possibility: you're incorrectly "sharing" a Session or entities via member field(s) of your servlet or controller. This is the main reason 'synchronized' would change your error symptoms. Short solution: don't ever do this. Sessions and entities shouldn't and don't work this way; each request should be processed independently.
Another possibility: unsaved-value defaults to 0 for "int" PK fields. You may be able to type these as "Integer" instead, if you really want to use 0 as a valid PK value.
Third suggestion: use Hibernate Session explicitly, learn to write simple correct code that works, then load the Java source for Hibernate/ Spring libraries so you can read & understand what these libraries are actually doing for you.

I also have been struggling with this exception, but when it continued to recur even when I put a lock on the object (and in a test environment, where I knew I was the only process touching the object), I decided to give the parenthetical in the stack trace its due consideration.
org.hibernate.StaleObjectStateException: Row was updated or deleted by
another transaction (or unsaved-value mapping was incorrect):
[com.rc.model.mexp.MerchantAccount#59132]
In our case it turned out that the mapping was wrong; we had type="text" in the mapping for one field that was a mediumtext type in the database, and it seems that Hibernate really hates that, at least under certain circumstances. We removed the type specification altogether from the mapping for this field, and the problem was resolved.
Now the weird thing is that in our production environment, with the supposedly problematic mapping in place, we do NOT get this exception. Does anybody have any idea why this might be? We are using the same version of MySQL - "5.0.22-log" (I don't know what the "-log" means) - in dev and production envs.

Here are 3 possibilities (as I do not know exactly, which kind of hibernate session handling you are using). Add one after another and test:
Use bi-directional mapping with inverse=true between parent object and child object, so that a change in the parent or child propagates properly to the other end of the relation.
Add support for Optimistic Locking using TimeStamp or Version column
Use join query to fetch the whole object graph [ parent+children] together to avoid the second call altogether.
Lastly, if and only if nothing works:
Load the parent again by Id (you have that already) and populate modified data then update.
Life will be good! :)

This problem was something that I had experienced and was quite frustrating, although there has to be something a little odd going on in your DAO/Hibernate calls, because if you're doing a lookup by ID there is no reason to get a stale state, since that is just a simple lookup for an object.
First, make sure all your methods are annotated with @Transaction(required=true) // you'll have to look up the exact syntax
However, this exception is usually thrown when you try to make changes to an object that has been detached from the session it was retrieved from. The solution to this is often not simple and would require more code posted so we can see exactly what is going on; my general suggestion would be to create a @Service that performs these kinds of things within a single transaction

When Hibernate flushes a Session, how does it decide which objects in the session are dirty?

My understanding of Hibernate is that as objects are loaded from the DB they are added to the Session. At various points, depending on your configuration, the session is flushed. At this point, modified objects are written to the database.
How does Hibernate decide which objects are 'dirty' and need to be written?
Do the proxies generated by Hibernate intercept assignments to fields, and add the object to a dirty list in the Session?
Or does Hibernate look at each object in the Session and compare it with the objects original state?
Or something completely different?

Hibernate does/can use bytecode generation (CGLIB) so that it knows a field is dirty as soon as you call the setter (or even assign to the field afaict).
This immediately marks that field/object as dirty, but doesn't reduce the number of objects that need to be dirty-checked during flush. All it does is impact the implementation of org.hibernate.engine.EntityEntry.requiresDirtyCheck(). It still does a field-by-field comparison to check for dirtiness.
I say the above based on a recent trawl through the source code (3.2.6GA), with whatever credibility that adds. Points of interest are:
SessionImpl.flush() triggers an onFlush() event.
SessionImpl.list() calls autoFlushIfRequired() which triggers an onAutoFlush() event. (on the tables-of-interest). That is, queries can invoke a flush. Interestingly, no flush occurs if there is no transaction.
Both those events eventually end up in AbstractFlushingEventListener.flushEverythingToExecutions(), which ends up (amongst other interesting locations) at flushEntities().
That loops over every entity in the session (source.getPersistenceContext().getEntityEntries()) calling DefaultFlushEntityEventListener.onFlushEntity().
You eventually end up at dirtyCheck(). That method does make some optimizations with respect to CGLIB dirty flags, but we've still ended up looping over every entity.

Hibernate takes a snapshot of the state of each object that gets loaded into the Session. On flush, each object in the Session is compared with its corresponding snapshot to determine which ones are dirty. SQL statements are issued as required, and the snapshots are updated to reflect the state of the (now clean) Session objects.
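As a rough illustration of that snapshot comparison, here is a plain-Java toy. It is not Hibernate's implementation: ToyPersistenceContext, Person, attach and flush are hypothetical names, and a real flush would issue SQL where this one only returns the dirty objects. It does show the cost that follows from the approach: every attached object is compared on every flush.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java toy, NOT Hibernate internals: snapshot-based dirty checking.
class SnapshotDirtyCheckSketch {

    // Hypothetical entity with two persistent properties.
    static class Person {
        String name;
        int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // Copies the current property values into an array.
        Object[] state() {
            return new Object[] { name, age };
        }
    }

    // Hypothetical stand-in for the session's bookkeeping.
    static class ToyPersistenceContext {
        // Keyed by object identity: one snapshot per attached instance.
        private final Map<Person, Object[]> snapshots = new IdentityHashMap<>();

        // Records the state the object had when it was "loaded".
        void attach(Person person) {
            snapshots.put(person, person.state());
        }

        // Compares every attached object with its snapshot, returns the dirty ones,
        // and refreshes their snapshots so they count as clean afterwards.
        List<Person> flush() {
            List<Person> dirty = new ArrayList<>();
            for (Map.Entry<Person, Object[]> entry : snapshots.entrySet()) {
                Object[] current = entry.getKey().state();
                if (!Arrays.equals(entry.getValue(), current)) {
                    dirty.add(entry.getKey());
                    entry.setValue(current);
                }
            }
            return dirty;
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Person ann = new Person("Ann", 30);
        Person bob = new Person("Bob", 41);
        context.attach(ann);
        context.attach(bob);
        bob.age = 42; // no interception needed; the change is found at flush time
        System.out.println(context.flush().size());
        System.out.println(context.flush().size());
    }
}
```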

Take a look at org.hibernate.event.def.DefaultFlushEntityEventListener.dirtyCheck
Every element in the session goes to this method to determine if it is dirty or not by comparing with an untouched version (one from the cache or one from the database).

Hibernate's default dirty checking mechanism traverses the currently attached entities and matches all properties against their initial loading-time values.

These answers are incomplete (at best; I am not an expert here). If you have a Hibernate-managed entity in your session and you do NOTHING to it, you can still get an update issued when you call save() on it. When? When another session updates that object between your load() and save(). Here is my example of this: hibernate sets dirty flag (and issues update) even though client did not change value
