I hope someone can clarify the scenario below for me.
From what I understand, when you request a 'row' from hibernate, for example:
User user = UserDao.get(1);
I now have the user with id=1 in memory.
In a web application, if two web pages request and load the user at the same time, and then both update a property on the user object, what will happen? For example:
user.pageViews += 1; // the value is currently 10 before the increment
UserDao.update(user);
Will this use the value that is in-memory (both requests have the value 10), or will it use the value in the database?
You must use two Hibernate sessions for the two users. This means there are two instances of the object in memory. If you use only one Hibernate session (and thus one instance of the object in memory), then the result is unpredictable.
In the case of a concurrent update, the second update wins: the value written by the first update is overwritten by the second. To avoid losing the first update you normally use a version column (see the Hibernate docs); the second update then fails with an error which you can catch and react to (for example with an error message "Your record was modified in the meantime. Please reload."), which allows the second user to redo his modification on the modified record so that it does not get lost.
In the case of a page view counter, as in your example, a different solution would be to write a synchronized method which counts the page views sequentially.
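A minimal sketch of such a synchronized counter in plain Java (the class and method names are made up for illustration; a real implementation would still have to write the value back through the DAO):

```java
public class PageViewCounter {
    private long pageViews = 0;

    // synchronized: only one thread may increment at a time,
    // so no increments are lost under concurrency
    public synchronized void increment() {
        pageViews++;
    }

    public synchronized long get() {
        return pageViews;
    }

    public static void main(String[] args) throws InterruptedException {
        PageViewCounter counter = new PageViewCounter();
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) counter.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get()); // always 10000, no lost updates
    }
}
```

Without the synchronized keyword, two threads could read the same value and both write back value+1, which is exactly the lost-update problem described above.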
By default the in-memory value is used for the update.
In the following I assume you want to implement an automatic page view counter, not to modify the User in a web user interface. If you want the latter, take a look at Hibernate optimistic locking.
So, supposing you need 100% accuracy when counting the page views, you can lock your User entity while you modify its pageViews value, to obtain exclusivity on the table row:
Session session = ...
Transaction tx = ...
session.lock(user, LockMode.UPGRADE);
user.increasePageViews();
tx.commit();
session.close();
The LockMode.UPGRADE will translate into a SELECT ... FOR UPDATE in your database, so be careful to hold the lock for as short a time as possible so as not to impact application scalability.
Related
My application parses a CSV file, about 100 - 200 records per file, performs database CRUD operations, and commits them all at the end.
public static void main(String[] args) {
    Transaction t = null;
    try {
        List<Row> rows = parseCSV();
        t = openHibernateTransaction();
        // doCrudStuff INSERTs some records into the database
        for (Row r : rows)
            doCrudStuff(r);
        t.commit();
    } catch (Exception ex) {
        // log error
        if (t != null) t.rollback();
    }
}
When I was about to doCrudStuff on the 78th Row, I suddenly got this error:
Data truncation: Data too long for column 'SOME_COLUMN_UNRELATED_TO_78TH_ROW' at row 1.
I read the stack trace and the error was triggered by a SELECT statement to a table unrelated to the 78th row. Huh, weird right?
I checked the CSV file and found that on the 77th row, some field was indeed too long for the database column. But Hibernate didn't catch the error during the INSERT of the 77th row and threw the error when I was doing a SELECT for the 78th row. Why is it delayed?
Does Hibernate really behave like this? I commit only once at the very end because I want to make sure that everything succeeded, otherwise, rollback.
Actually not really, if you take into account what Hibernate is doing behind the scenes for you.
Hibernate does not actually execute your write statements (UPDATE, INSERT) until it needs to. So in your case I assume your "doCrudStuff" executes a SELECT and then executes an UPDATE or INSERT, right?
This is what is happening:
You tell hibernate to execute "UPDATE my_table SET something = value;" which causes hibernate to cache this in the session and return right away.
You may do more writes, which Hibernate will likely continue to cache in the session, until either 1) you manually flush the session or 2) Hibernate decides it's time to flush the session.
You then execute a SELECT statement to get some data from the database. At this point, the state of the database is not consistent with the state of the session since there is data waiting to be written. Hibernate will then start executing your writes to catch up the database state to the session state.
If one of the writes fails, the stack trace will not map back to the exact point where you asked Hibernate to execute the write (this is an important distinction between an ORM and using JDBC directly); instead, the failure surfaces when the session has to be flushed (either manually or automatically).
At the expense of performance, you can always tell Hibernate to flush your session right after your writes. But as long as you are aware of the lifecycle of the Hibernate session and how it caches those statements, you should be able to debug these more easily.
By the way, if you want to see this in practice, you can tell Hibernate to log the queries.
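For example, assuming a standard hibernate.properties setup (equivalent settings exist in hibernate.cfg.xml), these two flags make Hibernate print every statement at the moment it is actually sent, so you can watch the queued writes go out at flush time rather than where your code issued them:

```
hibernate.show_sql=true
hibernate.format_sql=true
```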
Hope this helps!
EDIT: I understand how this can be confusing, let me try to augment my answer by highlighting the difference between a Transaction and a Hibernate Session.
A transaction is a sequence of atomic operations performed on the database. Until a transaction is committed, it is typically not visible to other clients of the database. The state of the transaction is fully managed by the database - i.e. you can start a transaction and send your operations to the database, and it will ensure consistency of these operations within the transaction.
A Hibernate Session is a session managed by Hibernate, outside the database, mostly for performance reasons. Hibernate will queue operations whenever possible to improve performance, and only go to the database when it deems necessary.
Imagine you have 50 marbles that are all different colors and need to be stored in their correct buckets, but these buckets are 100 feet away and you need someone to correctly sort them into their rightful buckets. You ask your friend Bob to store the blue marbles, then the red marbles, then the green marbles. Your friend is smart and anticipates that you will ask him to make multiple round trips, so he waits until your last request to walk those 100 feet and store them in their proper buckets, which is much faster than making 3 round trips.
Now imagine that you ask him to store the yellow marbles, and then you ask him how many total marbles you have across all the buckets. He is then forced to go to the buckets (since he needs to gather information), store the yellow marbles (so he can accurately count all buckets) before he can give you an answer. This is in essence what hibernate is doing with your data.
Now, in your case, imagine there is NO yellow bucket. Bob unfortunately is not going to find that out until he tries to answer your query about how many total marbles you have - thus, in the sequence of events, he comes back to tell you he couldn't complete your request only after he tries to count the marbles (as opposed to when you asked him to store the yellow ones, which is what he was actually unable to do).
Hope this helps clear things a little bit!
I have a single table. I am trying to understand the logic behind session.merge, but I think it is somehow useless; I will try to explain with some code.
public static void main(String[] args)
{
    final Merge clazz = new Merge();
    Ragonvalia ragonvalia = clazz.load(); // loaded from database
    System.out.println("ORIGINAL: " + ragonvalia); // prints c02=1953

    clazz.session.evict(ragonvalia); // we evict here to force merge to reload from DB

    // Here I make some modifications to the record in the DB directly.....
    try { Thread.sleep(20000); } catch (final Exception e) { e.printStackTrace(); }
    // now c02=2000

    final Ragonvalia merge = ragonvalia = (Ragonvalia) clazz.session.merge(ragonvalia); // the SELECT is in fact fired
    System.out.println("MERGING");
    System.out.println("merge: " + merge);
    System.out.println("ragonvalia: " + ragonvalia);
    System.out.println(merge.toString().equals(ragonvalia.toString())); // prints true

    ragonvalia.setC01("PUEBLO LINDO"); // I modify the C01 field
    System.out.println(ragonvalia);

    final Transaction tx = clazz.session.beginTransaction();
    clazz.session.update(merge); // we update
    tx.commit();
    // At this point I can see that c02 was reset back to 1953

    clazz.shutDown();
}
Yep, I know that merge is used for detached objects and all that stuff, but what is really behind the SELECT? I just thought about some things.
When I retrieved the record for the first time, the field was c02=1953; later it was changed to c02=2000. I just thought that when the merge was made, Hibernate would keep the newly changed value c02=2000, and that, since I did not modify that field in my session, the update would write 2000 rather than the original 1953. Instead Hibernate keeps the 1953, the update writes 1953, and 1953 replaces the 2000 in the database - so of course the other person's work is lost.
I have read some stuff on the internet and I saw something like this: "Essentially, if you do not have a version or timestamp field, Hibernate must check the existing record before updating it so that concurrent modifications do not occur. You would not want a record updated that someone else modified after you read it. There are a couple of solutions, outlined in the link above. But it makes life much easier if you can add a version field on each table." Sounds great, but the "before updating it so that concurrent modifications do not occur" part is not happening: Hibernate just updates the fields I have in my class, even when they differ from the current DB record.
"Hibernate must check the existing record before updating" - checking for what? What does Hibernate check?
In fact I am not using any version field in my models, but it seems the merge only verifies that the record exists in the database.
I know this question is somewhat simple, or maybe a duplicate, but I just can't see the logic or the benefit of firing a SELECT.
Summary
After the merge, Hibernate updates all the properties, even the unmodified ones. I don't know why; I thought Hibernate would update only the modified ones to gain performance. And since the values of those properties are the same whether the class was loaded the first time or modified by hand, I think the merge was useless:
update
    ragonvalia
set
    .....
    .....
    .....
    c01=?,
    c02=?, -- bound to 1953, even though the merge was fired and the value in the DB was 2000
    c03=?
where
    ID=?
Your question mixes what appear to be two concerns.
Dynamic Update Statements
Hibernate has had support for @DynamicUpdate since 4.1.
For updating, should this entity use dynamic sql generation where only changed
columns get referenced in the prepared sql statement.
Note, for re-attachment of detached entities this is not possible without the
@SelectBeforeUpdate annotation being used.
This simply means that within the bounds of an open session, any entity attached to the session and flagged with @DynamicUpdate will track field-level changes and issue UPDATE statements that include only the altered columns.
Should your entity be detached from the session and you issue a merge, the @SelectBeforeUpdate annotation forces Hibernate to refresh the entity from the database, attach it to the session, and then determine the dirty attributes in order to write the UPDATE statement with only the altered columns.
It's worth pointing out that this will not guard you against concurrent updates to the same database records in a highly concurrent environment. This is simply a means to minimize the UPDATE statement for legacy or wide tables where the majority of the columns aren't changed.
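As a sketch, the mapping might look like this (the entity and column names are made up for illustration; assumes Hibernate 4.1+ with JPA annotations, so this fragment is not runnable on its own):

```java
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.DynamicUpdate;
import org.hibernate.annotations.SelectBeforeUpdate;

@Entity
@DynamicUpdate        // UPDATE statements reference only the changed columns
@SelectBeforeUpdate   // re-attached detached entities are re-read so they can be diffed
public class WideLegacyEntity {
    @Id
    private Long id;

    private String rarelyChangedA;
    private String rarelyChangedB;
    private String frequentlyChanged;

    // getters/setters omitted
}
```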
Concurrent Changes
In order to deal with concurrent operations on the same data, you can approach this using two types of locking mechanisms.
Pessimistic
In this situation, you would want to apply a lock at read time which basically forces the database to prevent any other transaction from reading/altering that row until the lock is released.
Since this type of locking can have severe performance implications, particularly on a highly concurrent table, it's generally not preferred. Most modern databases will reach a point of many row-level locks and eventually escalate them to the data page or, worse, the table, causing all other sessions to block until the locks are released.
public void alterEntityWithLockExample() {
// Read row, apply a row level update/write lock.
Entity entity = entityManager.find(Entity.class, 1L, LockModeType.PESSIMISTIC_WRITE);
// update entity and save it.
entity.setField(someValue);
entityManager.merge(entity);
}
It is probably worth noting that had any other session queried the entity with id 1 prior to the write lock being applied in the code above, the two sessions would still step on one another. All operations on Entity that would result in state changes would need to query using a lock in order to prevent concurrency issues.
Optimistic
This is a much more desirable approach in a highly concurrent environment. This is where you'd want to annotate a new field with @Version on your entity.
This has the benefit that you can query an entity, leave it detached, reattach it later, perhaps in another request or thread, alter it, and merge the changes. If the record had changed since it was originally fetched at the start of your process, an OptimisticLockException will be thrown, allowing your business case to handle that scenario however you need.
In a web application as an example, you may want to inform the user to requery the page, make their changes, and resave the form to continue.
Pessimistic locking is a proactive locking mechanism, whereas optimistic locking is more reactionary.
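A minimal sketch of the optimistic variant (illustrative names; assumes JPA annotations, so this mapping fragment is not runnable on its own). The @Version field is incremented by the provider on every successful update:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Document {
    @Id
    private Long id;

    @Version            // managed by the provider; never set it manually
    private long version;

    private String field;

    public void setField(String value) { this.field = value; }
}
```

On merge, the provider compares the detached copy's version against the row's current version; a mismatch ends in an OptimisticLockException, which you can catch and turn into a "please reload" message for the user.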
We are using Spring and Hibernate for a web application:
The application has a shopping cart where the user can place items. In order to keep the items viewable between different logins, the shopping cart items are stored in tables. When the shopping cart is submitted, the items are saved into a different table, where we need to generate the order number.
When we insert the values into that table, we get the order number by taking the max order number and adding +1 to it. We are using the Spring transaction manager and Hibernate; in the code flow we get the order number and update the Hibernate object to hold the order number value. When I debug, I notice that the order number entity bean is inserted only when the complete transaction is committed.
The issue is that when two requests are submitted to the server at the same time, the same order number is used for both, and only one request's data gets inserted; the other request's value, which should again be unique, cannot be inserted.
The order number column in the table is unique.
I noticed while debugging that the persistence layer does not insert into the database even after issuing a session flush:
session.flush()
It just updates memory and inserts the data into the DB only at the end of the Spring transaction. I tried explicitly committing the transaction:
session.getTransaction().commit();
This inserted the values into the database immediately, but further down the code flow it displayed a message that a transaction could not be started.
Any help is highly appreciated.
Added:
Oracle database i used.
There is a sequence number which is unique for that table and also the order number maps to it.
Follow these steps:
1) Create a service method with propagation REQUIRES_NEW in a different service class.
2) Move your code (whatever code you want flushed to the DB) into this new method.
3) Call this new method from the existing API. (Because of proxying in Spring, the new service method must be called from a different class; otherwise REQUIRES_NEW will not take effect, and it is what makes sure your data is flushed.)
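A sketch of these steps, assuming Spring's @Transactional support (the class, method, and DAO calls are all hypothetical and not runnable without a Spring context):

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderNumberService {   // step 1: a *separate* service class,
                                    // so calls go through the Spring proxy

    // step 2: the code to flush lives in its own REQUIRES_NEW transaction;
    // it commits when this method returns, independently of the
    // caller's surrounding transaction
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public long allocateOrderNumber() {
        long next = findMaxOrderNumber() + 1; // hypothetical DAO query
        saveOrderNumber(next);                // hypothetical DAO insert
        return next;
    }

    private long findMaxOrderNumber() { /* SELECT MAX(...) */ return 0L; }
    private void saveOrderNumber(long n) { /* INSERT ... */ }
}

// step 3: inject OrderNumberService into the existing service and call
// orderNumberService.allocateOrderNumber() from there, not from within
// OrderNumberService itself.
```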
I would set the order number with a trigger, which will run in the same transaction as the shopping cart insert.
After you save the shopping cart, to see the updated order count, you'll have to call:
session.refresh(cart);
The count shouldn't be managed by Hibernate (insertable/updatable = false or @Transient).
Your first problem is that of serializing access around the number generation when multiple threads execute the same logic. If you could use Oracle sequences, this would be taken care of automatically at the database level, as sequences are guaranteed to return unique values no matter how many times they are called. Since this now has to be managed on the server side instead, you need a synchronization mechanism around your number generation logic (select max and increment by one) across the transaction boundary. You can make the service method synchronized (your service class would be a Spring-managed singleton) and declare the transaction boundary around it. Note, however, that this has performance implications and is usually bad for scalability.
Another option is a variation of this: store the id to be allocated in a separate table with one column, "currentVal", and use a pessimistic lock to get the next number. This way the main table does not take any big lock; the lock is held on the sequence-generator row until the main entity creation transaction completes. The main idea behind these techniques is to serialize access to the sequence generator and hold the lock until the main entity transaction commits. Also, delay the number generation as late as possible.
The solution suggested by @Vlad is a good one if using triggers is fine in your design.
Regarding your question about the flush behaviour: the SQL is sent to the database at the flush call, but the data is not committed until the transaction is committed, either declaratively or via a manual commit. The transaction can, however, see the data it proposes to change, while other transactions may not, depending on the transaction isolation level.
I am writing a system that holds a hibernate-managed entity called Voucher that has a field named serialNumber, which holds a unique number for the only-existing valid copy of the voucher instance. There may be old, invalid copies in the database table as well, which means that the database field may not be declared unique.
The operation that saves a new valid voucher instance (that will need a new serial number) is, first of all, synchronized on an appropriate entity. Thereafter the whole procedure is encapsulated in a transaction, the new value is fetched by the JPQL
SELECT MAX(serialNumber) + 1 FROM Voucher
the field gets the result from the query, the instance is thereafter saved, the session is flushed, the transaction is committed and the code finally leaves the synchronized block.
In spite of all this, the database sometimes (if seldom) ends up with Vouchers with duplicate serial numbers.
My question is: Considering that I am rather confident in the synchronization and transaction handling, is there anything more or less obvious that I should know about hibernate that I have missed, or should I go back to yet another debugging session, trying to find anything else causing the problem?
The service running the save process is a web application running on tomcat6 and is managed by Spring's HttpRequestHandlerServlet. The db connections are pooled by C3P0, running a very much default-based configuration.
I'd appreciate any suggestions.
Thanks
You can use a MultipleHiLoPerTableGenerator: it generates @Id values outside the current transaction.
You do not need to debug to find the cause; in a multi-threaded environment this is likely to happen. You are selecting the max from your table. So suppose that TX1 reads the max value, which is a, and inserts a row with serial number a+1; at this stage, if any TX2 reads the DB, the max value is still a, as TX1 has not committed its data. So TX2 may insert a row with serial number a+1 as well.
To avoid this issue you might change the isolation level of your database or change the way you generate serial numbers (it depends entirely on the circumstances of your project). But generally I do not recommend changing isolation levels, as it is too much effort for such an issue.
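The interleaving can be reproduced deterministically in plain Java, with an in-memory list standing in for the table (a latch forces both simulated transactions to read the max before either one inserts):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class MaxPlusOneRace {
    static final List<Integer> serials =
            Collections.synchronizedList(new ArrayList<>(List.of(1, 2, 3)));

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch bothHaveRead = new CountDownLatch(2);
        Runnable tx = () -> {
            int max = Collections.max(serials);   // SELECT MAX(serialNumber) FROM Voucher
            bothHaveRead.countDown();             // the other TX reads before we "commit"
            try {
                bothHaveRead.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            serials.add(max + 1);                 // INSERT with serialNumber = max + 1
        };
        Thread t1 = new Thread(tx);
        Thread t2 = new Thread(tx);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(serials); // [1, 2, 3, 4, 4] -- duplicate serial number
    }
}
```

Both transactions see max = 3 and both insert 4, which is exactly the duplicate observed in the question; the application-level synchronized block does not help if the two requests land on different JVM instances or the read happens before the other transaction commits.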
I am trying to implement the active record pattern using Java/JDBC and MySQL, along with optimistic locking for concurrency handling.
I have a 'version_number' field on all the records in a table, which is incremented after every update.
There seem to be 2 strategies for implementing this:
The application, when it requests the data, also stores the corresponding version number of each object (i.e. record). On updating, the version number is 'sent down' to the data layer, which uses it in the UPDATE...SET...WHERE query for optimistic locking.
The application DOES NOT store the version number, but only some parts of the object (as opposed to an entire row of data). For optimistic locking to succeed, the data layer (active record) needs to first fetch the 'row' from the DB, get the version number, and then fire the same UPDATE...SET...WHERE query to update the record.
In the former there is the 'first fetch' and then an update. In the latter case you have the 'first fetch' but also another fetch right before each update.
The question is this: by design, which is the better approach? Is it okay/safe/correct to have all the data, including the version number be stored in the web application's front-end (Javascript/HTML)? Or is it better to take a performance hit of a read before update?
Is there a 'right way' to implement this design? I'm not sure how current implementations of active record handle this (Ruby, Play, ActiveJDBC etc.) If I'm to implement it 'raw' in JDBC what's the right design decision in this case?
This is neither a matter of performance nor of security; the two approaches are functionally different and achieve different goals.
With the first approach you are optimistically locking the row for the user's entire "think time." If User 1 loads the screen, then User 2 makes changes, User 1's changes will fail and they will see an error that they were looking at out of date data.
With the second approach you are only protecting against interleaving writes between competing request threads. User 1 may load a page, then User 2 makes changes, then when User 1 hits submit, their changes will go through and blow away User 2's changes. User 1 may have made a decision based on outdated information and never know.
It's a matter of which behaviour your business rules call for, not of one or the other being technically "correct." They are both valid; they do different things.
ActiveJDBC implements strategy 1. With strategy 2, you might introduce race conditions.
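The version check behind the first approach can be sketched in plain Java, with an in-memory map standing in for the table (hypothetical names; in real JDBC it is the UPDATE ... WHERE id = ? AND version_number = ? statement and its affected-row count):

```java
import java.util.HashMap;
import java.util.Map;

public class VersionedStore {
    static class Row {
        String value;
        long version;
        Row(String value, long version) { this.value = value; this.version = version; }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    public void insert(long id, String value) { rows.put(id, new Row(value, 0)); }

    // Returns a detached copy, like a row read by the application layer
    public Row read(long id) {
        Row r = rows.get(id);
        return new Row(r.value, r.version);
    }

    // Mirrors: UPDATE t SET value = ?, version_number = version_number + 1
    //          WHERE id = ? AND version_number = ?
    public boolean update(long id, String newValue, long expectedVersion) {
        Row r = rows.get(id);
        if (r == null || r.version != expectedVersion) return false; // stale: 0 rows affected
        r.value = newValue;
        r.version++;
        return true;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1L, "initial");
        Row user1 = store.read(1L); // both users load version 0
        Row user2 = store.read(1L);
        System.out.println(store.update(1L, "user2 change", user2.version)); // true
        System.out.println(store.update(1L, "user1 change", user1.version)); // false: version is now 1
    }
}
```

The second update fails because the stored version has moved on, which is the out-of-date error the user would see with strategy 1; with strategy 2 the data layer would re-read the fresh version first and the second update would silently win.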