I'm using Spring transactions and JPA with the Hibernate implementation.
I have marked a method as:
@Transactional(readOnly = true, isolation = Isolation.READ_UNCOMMITTED)
Is this combination of readOnly and READ_UNCOMMITTED valid? I am using it on a method whose native SQL is a SELECT statement for a report page. First, I am marking it READ_UNCOMMITTED so that it does not get stuck waiting on a table that is updated frequently, because users want to generate the report even during processing time (a warning is shown if this happens). Second, I am marking it readOnly to tell JPA not to keep the entities in the persistence context.
Is this a correct understanding?
Since you need intermediate data during processing time, the READ_UNCOMMITTED isolation level makes sense.
A readOnly transaction makes sure that all the SQL statements in that particular transaction are SELECT statements; if an INSERT/UPDATE statement is issued, an error is thrown immediately.
I think readOnly is just an additional check to make sure that you are not updating any data during a particular transaction.
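For context, a minimal sketch of how this combination might sit on a Spring service (the entity manager wiring, query, and table name are hypothetical placeholders):

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import java.util.List;

@Service
public class ReportService {

    @PersistenceContext
    private EntityManager em;

    // readOnly hints Hibernate to skip dirty checking for loaded entities;
    // READ_UNCOMMITTED lets the report query proceed without waiting on in-flight updates.
    @Transactional(readOnly = true, isolation = Isolation.READ_UNCOMMITTED)
    public List<?> generateReport() {
        return em.createNativeQuery("SELECT id, total FROM report_rows").getResultList();
    }
}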
Related
I've got a Spring application using Hibernate, and I've implemented Envers in it, which is working fine. However, by default Hibernate will automatically flush the persistence context before certain queries are executed, not just before the transaction is committed.
For example, I have an MVC endpoint that updates a record, but before saving it has to make various other queries to retrieve other data. Each time one of those queries runs, Hibernate flushes, and this results in multiple audit rows for a single change. This creates confusion, because my record has a modified date that isn't updated in each flush (the flush happens before this property is changed).
What are my options for managing this more effectively and creating a reliable audit log despite Hibernate flushing this way? Is the only answer to implement my own listener with custom logic that checks whether an audit change should actually be written?
You can detach the entity and merge it back when you are done. These automatic flushes only happen when a query touches tables that would be affected by pending inserts/updates/deletes. Native queries are a different topic: Hibernate has no SQL parser to figure out which tables you are touching, so it is conservative and flushes all pending changes.
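A rough sketch of the detach-and-merge approach (entity and query names are invented; em is an injected EntityManager):

MyEntity entity = em.find(MyEntity.class, id);
em.detach(entity);          // pending changes to 'entity' are no longer tracked
entity.setName("updated");  // modify while detached

// intermediate lookups no longer force a flush of 'entity'
List<Other> others = em.createQuery("select o from Other o", Other.class).getResultList();

entity = em.merge(entity);  // re-attach at the end; the update is flushed once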
I don't have a clear idea of why autocommit is false by default in Hibernate when transaction management APIs are provided.
I have three questions:
Why is autocommit mode not recommended by Hibernate?
What happens when we use autocommit=true and then use the Hibernate Transaction APIs for transaction management?
When using Spring declarative transaction management, how does @Transactional(readOnly = true) help the read-only (Hibernate) code we write?
I will answer one by one, starting with (2), as I don't know much about (1).
(2): autocommit=true means that by default every statement gets committed as soon as it executes. In that case:
if there is a @Transactional annotation on a method, it encloses all queries in a single transaction, thus overriding the autocommit;
if a @Transactional method calls other @Transactional-annotated methods, the outermost annotation overrides the inner annotations and creates one larger transaction, so the annotations also override each other (see the sketch below).
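A rough illustration of that nesting behaviour, assuming the default REQUIRED propagation (class, method, and entity names are made up):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
class OrderService {

    @Autowired
    private PaymentService paymentService;

    @Transactional // outer transaction starts here
    public void placeOrder(Order order) {
        // ... save the order ...
        paymentService.charge(order); // joins the same transaction (default REQUIRED propagation)
    }
}

@Service
class PaymentService {

    @Transactional // does NOT open a second transaction with default propagation
    public void charge(Order order) {
        // runs inside the caller's transaction; commit/rollback happens at the outer boundary
    }
}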
(3): In databases like Oracle/MySQL, a read-only transaction can be translated to a READ_ONLY level, which allows no dirty reads and no unrepeatable reads but doesn't allow any updates. The flush mode will be set to FlushMode.NEVER in the current Hibernate Session, preventing the session from committing the transaction. setReadOnly(true) will even be called on the JDBC Connection, which ensures that you cannot call session.flush() to flush the session manually.
Since Spring doesn't do persistence itself, it cannot specify exactly what readOnly should mean. This attribute is only a hint to the provider; the behavior depends on, in this case, Hibernate.
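To make the "hint" concrete: with Hibernate as the provider, the effect is roughly what you would get by doing this by hand (a sketch of the idea, not Spring's actual internals):

import org.hibernate.FlushMode;
import org.hibernate.Session;

Session session = em.unwrap(Session.class);
session.setHibernateFlushMode(FlushMode.MANUAL);            // older Hibernate versions used FlushMode.NEVER
session.doWork(connection -> connection.setReadOnly(true)); // read-only hint on the JDBC connection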
Regarding (1):
Suppose you own a company and have 1000 employees. You maintain a database table to track whether each employee has already been paid at the end of the month.
So here we are at the end of the month, on pay day. Payroll sends you an Excel file with 600 names of people who have just received their compensation for the last month. You log on to your computer, start your Java app, and select the Excel file to save all 600 records to your database. It usually takes 2 minutes, but this time it fails at 1 minute and 23 seconds. What do you expect now? A partially uploaded file with gosh knows how many records, or nothing at all? Your autocommit setting decides: if autocommit=false, you can simply upload the whole file again, but if autocommit=true, you might first need to tweak your input data to remove some records and prevent duplicates in your database.
I hope my simplified example helps you understand it better, but the real purpose is to ensure that either everything from a batch is saved in the database (including every write operation: insert/update/delete) or nothing at all if an error occurs at any time during the process. In real life, in most cases, you and your organisation will expect complete data sets rather than partial ones in the database. Understanding this is key, and anybody who advises you to "use autocommit=true because it is safe" should be avoided by miles. This is a key concept and one of the foundations of data management.
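In code, the "all or nothing" expectation maps to a single transaction around the whole batch; a hypothetical sketch (entity and service names are invented):

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import java.util.List;

@Service
public class PayrollService {

    @PersistenceContext
    private EntityManager em;

    // One transaction for the whole file: if any record fails,
    // everything rolls back and the file can simply be uploaded again.
    @Transactional
    public void markAsPaid(List<PaymentRecord> records) {
        for (PaymentRecord record : records) {
            em.persist(record); // nothing is committed until the method returns normally
        }
    }
}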
I have a question about the persist and merge strategy of EclipseLink. I would like to know how EclipseLink/JPA inserts and updates records. Does it insert/update them one by one into the database, or does it queue them up and then flush them to the database?
This is important to me because I am going to have a history table with a trigger that fires on insert and update. So if, for example, an update happens per field and 3 fields are updated, will I have 3 records in the history table or one?
I would appreciate it if anyone could answer and also leave some reference links for further information.
The persistence provider is quite free to flush changes whenever it sees fit. So you cannot reliably predict the number of update callbacks or the expected SQL statements.
In general, the provider will flush changes before each query to make changes in the persistence context available to the query. You can hint the provider to defer the flush until commit time, but the provider can still flush at will.
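For example, JPA lets you request commit-time flushing for the whole persistence context or for a single query (still only a hint, as noted above; Employee is a placeholder entity):

import javax.persistence.FlushModeType;

// defer automatic flushing until commit for the whole persistence context...
em.setFlushMode(FlushModeType.COMMIT);

// ...or only for one query
em.createQuery("select e from Employee e", Employee.class)
  .setFlushMode(FlushModeType.COMMIT)
  .getResultList();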
Please see the relevant chapters of the JPA (2.0) spec:
§3.2.4 Synchronization to the Database
§3.8.7 Queries and Flush Mode
EDIT: There is an important point about flushing and transaction isolation. The changes are flushed to the database and the lifecycle listeners are invoked, but the data is not committed and not visible to other transactions (read-committed isolation is the default). The commit itself is atomic.
I am not sure what the consequences of a server crash would be, but under normal circumstances, data integrity is ensured.
I have been confused about transaction.rollback(). Here is example pseudocode:
Transaction transaction = session.beginTransaction();
EntityA a = new EntityA();
session.save(a);
session.flush();
transaction.rollback();
What happens when this code runs? Will the entity be in the database or not?
Short answer: no, you won't have the entity in the database.
Longer answer: Hibernate is smart enough not to send inserts/updates to the DB until it knows whether the transaction is going to be committed or rolled back (although this behavior can be changed by setting a different FlushMode). In your case, calling flush() forces the SQL to be sent to the DB, but you still have the DB transaction to protect you: when you call rollback(), the DB transaction is rolled back, undoing the changes performed inside it, so nothing is actually saved. Note that, depending on your configured transaction isolation level, other transactions may be able to see the EntityA you saved during the short interval between the save and the rollback.
Also note that flush() is called automatically when you try to read from the DB; in 99% of cases calling it explicitly is not necessary. One exception that comes to mind is unit testing with automatically rolled-back tests.
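That testing exception looks roughly like this with Spring's transactional test support, which rolls the transaction back after each test (a sketch; EntityA and its generated id getter are assumptions):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.transaction.annotation.Transactional;
import static org.junit.jupiter.api.Assertions.assertNotNull;

@SpringBootTest
@Transactional // Spring rolls this back after each test method
class EntityATest {

    @PersistenceContext
    private EntityManager em;

    @Test
    void savesEntity() {
        EntityA a = new EntityA();
        em.persist(a);
        em.flush(); // force the INSERT so database constraints are actually exercised,
                    // even though the transaction is rolled back afterwards
        assertNotNull(a.getId());
    }
}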
When you call session.save(a), Hibernate basically remembers somewhere inside the session that this object has to be saved. It can decide whether it wants to issue the INSERT INTO... immediately, some time later, or at commit. This is a performance improvement, allowing Hibernate to batch inserts or avoid them altogether if the transaction is rolled back.
When you call session.flush(), Hibernate is forced to issue the INSERT INTO... against the database. The entity is stored in the database but not yet committed. Depending on the transaction isolation level, it won't be seen by other running transactions. But now the database knows about the record.
When you call transaction.rollback(), Hibernate rolls back the database transaction. The database handles the rollback, removing the newly created record.
Now consider the scenario without flush(). First of all, you never touch the database, so performance is better and the rollback is basically a no-op. On the other hand, if the transaction isolation level is READ UNCOMMITTED, other transactions can see the inserted record even before commit/rollback. Without flush() this won't happen, unless Hibernate decides to flush implicitly.
I think you are confusing flush with commit.
flush() synchronizes the state with the database, but it does not commit. The state is still visible only within the transaction, so you can still call rollback to roll it back.
So the answer to your question is: no, you don't have the entity (a) in the database.
I was gathering information about the flush() method, but I'm not quite clear on when to use it and how to use it correctly. From what I read, my understanding is that the contents of the persistence context are synchronized with the database, i.e. outstanding statements are issued or entity data is refreshed.
Now I have the following scenario with two entities A and B (in a one-to-one relationship, but not enforced or modelled by JPA). A has a composite PK, which is set manually, and also has an auto-generated IDENTITY field recordId. This recordId should be written to entity B as a foreign key to A. I'm saving A and B in a single transaction. The problem is that the auto-generated value A.recordId is not available within the transaction unless I make an explicit call to em.flush() after calling em.persist() on A. (If I had an auto-generated IDENTITY PK, the value would be set directly in the entity, but that's not the case here.)
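A sketch of the pattern being described (constructor, setter, and getter names are placeholders): flush after persisting A so the IDENTITY value is assigned, then copy it to B.

EntityA a = new EntityA(keyPart1, keyPart2); // composite PK set manually
em.persist(a);
em.flush(); // forces the INSERT, so the database assigns a.recordId

EntityB b = new EntityB();
b.setRecordIdOfA(a.getRecordId()); // now available within the same transaction
em.persist(b);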
Can em.flush() cause any harm when using it within a transaction?
The exact details of em.flush() are probably implementation-dependent.
In general, though, JPA providers like Hibernate can cache the SQL instructions they are supposed to send to the database, often until you actually commit the transaction.
For example, when you call em.persist(), Hibernate remembers it has to make a database INSERT but does not actually execute the instruction until you commit the transaction. AFAIK, this is mainly done for performance reasons.
In some cases, though, you want the SQL instructions to be executed immediately, generally when you need the result of some side effect, such as an autogenerated key or a database trigger.
What em.flush() does is empty the internal SQL instruction cache and execute the statements immediately against the database.
Bottom line: no harm is done; at most you take a (minor) performance hit, since you are overriding the JPA provider's decisions about the best timing for sending SQL instructions to the database.
Can em.flush() cause any harm when using it within a transaction?
Yes, it may hold locks in the database for a longer duration than necessary.
Generally, when using JPA you delegate transaction management to the container (a.k.a. CMT, using the @Transactional annotation on business methods), which means a transaction is automatically started when entering the method and committed / rolled back at the end. If you let the EntityManager handle the database synchronization, SQL statement execution is only triggered just before the commit, leading to short-lived locks in the database. Otherwise your manually flushed write operations may retain locks between the manual flush and the automatic commit, which can be a long time depending on the remaining method execution time.
Note that some operations automatically trigger a flush: executing a native query against the same session (the EM state must be flushed to be reachable by the SQL query), and inserting entities with a database-generated id (the insert statement must be issued so the EM can retrieve the generated id and properly manage relationships).
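As an illustration of the native-query case (entity and table names are invented):

em.persist(newOrder); // pending INSERT, not yet flushed

// A native query forces a flush first: Hibernate cannot parse the SQL to know
// which tables it touches, so it conservatively synchronizes all pending changes.
List<?> rows = em.createNativeQuery("SELECT * FROM orders").getResultList();
// From here until commit, the locks taken by the INSERT stay held in the database.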
Actually, em.flush() does more than just send the cached SQL commands. It tries to synchronize the persistence context with the underlying database, which can consume a lot of time if the context contains large collections to be synchronized. Use it with caution.