I'm writing some application for GlassFish 2.1.1 (JavaEE 5, JPA 1.0, as far as I know). I have the following code in my servlet (which I mostly borrowed from some sample on the Internet):
@PersistenceContext(name = "persistence/em", unitName = "pu")
private EntityManager em;

@Resource
private UserTransaction utx;

@Override
protected void doPost(...) {
    utx.begin();
    . . . perform retrieving operations on em . . .
    utx.rollback();
}
web.xml has the following in it:
<persistence-context-ref>
<persistence-context-ref-name>persistence/em</persistence-context-ref-name>
<persistence-unit-name>pu</persistence-unit-name>
</persistence-context-ref>
The problem is, the em doesn't see changes that have been made in another, outside transaction. Roughly, I make a request to my servlet from the web browser, see data, perform some DML in an SQL console, reload the servlet page, and it doesn't show any change. I've tried many combinations of em.flush(), utx.rollback(), and em.joinTransaction(), but none of it seems to do any good.
Situation is complicated by me being a total newbie in JPA, so I do not have a clear understanding of how the underlying machinery works. So any help and -- more importantly -- explanations/links of what is happening there would be very appreciated. Thanks!
The JPA implementation maintains a cache of entities that it has accessed. When you perform operations in a different transaction without going through JPA, that cache is no longer up to date, and hence you never see the changes made outside it.
If you do wish to see the changes, you will have to refresh the cache, which evicts all entities from it. Of course, you'll need to know when to do this (after the other transaction has completed), otherwise you'll continue to see stale entities. If this is your business need, then JPA is possibly not a good fit for your problem domain.
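If stale data is what you're seeing, you can force the persistence context to re-read the database. A minimal sketch, assuming a JTA transaction as in the question (MyEntity and id are placeholders):

utx.begin();
em.joinTransaction();                        // bind the injected EntityManager to the JTA transaction
em.clear();                                  // detach everything, so nothing stale is reused
MyEntity e = em.find(MyEntity.class, id);    // loaded fresh from the database
// or, for an instance that is already managed:
em.refresh(e);                               // overwrite its state with the current database row
utx.commit();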
Related:
Are entities cached in jpa by default ?
Invalidating JPA EntityManager session
As axtavt says, you need to commit the transaction in the console. Assuming you did that, it is also possible the data is still being cached by the EntityManager's persistence context (or the underlying infrastructure).
To prevent trouble with caching you can evict by hand (which may be tricky as you have to know when to evict) or you can go to pessimistic locking. Pessimistic locking can have a huge impact on performance, but if you have multiple independent connections to the database you may not have a choice.
If your process has concurrent read/writes from different sources the whole time, you may really need pessimistic locks. If you sometimes have a batch update from an external source, you may try to signal, from that batch job, your JPA application that it should evict. Perhaps via a web service or so. That way you would not incur pessimistic locking performance degradation the entire time.
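As a sketch of that signalling idea: the JPA application could expose a small hook that the batch job calls after it commits. The Cache API below is JPA 2.0; on JPA 1.0 you would have to go through the provider instead (e.g. Hibernate's SessionFactory), so treat this purely as an illustration:

// called by the batch job (e.g. via a web service) once its changes are committed
public void evictAll() {
    entityManagerFactory.getCache().evictAll();   // drop all entities from the shared cache
}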
The wise lesson here is that synchronization of processes can be really complicated :)
Perhaps you need to commit the transaction you made in the SQL console.
I need to audit changes to some entities in our application and am thinking of using JaVers. I like the support for interrogating the audit data provided by JaVers. Hibernate Envers looks good, but it stores data in the same DB.
Here are my requirements:
async logging - for minimal performance impact
store audit data in a different db - performance reasons as well
As far as I can see, JaVers is not designed for this out of the box, but it seems possible to adapt it. Here's how:
JaVers actually allows audit data to be stored in a different DB: you can hand it a connection to any database. It's not how it's intended to be used, but it works. Code below (note the connectionProvider, which can supply a connection to any DB):
final Connection dbConnection =
        DriverManager.getConnection("jdbc:mysql://localhost:3306/javers", "root", "root");

ConnectionProvider connectionProvider = new ConnectionProvider() {
    @Override
    public Connection getConnection() {
        //suitable only for testing!
        return dbConnection;
    }
};

JaversSqlRepository sqlRepository = SqlRepositoryBuilder
        .sqlRepository()
        .withConnectionProvider(connectionProvider)
        .withDialect(DialectName.MYSQL)
        .build();
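You can then build the JaVers instance on top of that repository in the usual way; roughly:

Javers javers = JaversBuilder.javers()
        .registerJaversRepository(sqlRepository)
        .build();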
Async logging can be achieved by moving the execution of the JaVers commit into a thread/executor. The challenge is that if the execution is delayed, the object might change before it gets logged. There are two solutions I can think of here:
we could create a snapshot of the object (e.g. serialize it to JSON or the like) and pass that to a Thread to log it.
we provide our custom implementation of Javers Repository which processes the differences in the current thread, and then passes the Snapshot objects to be persisted in another thread. This way we'd only do reading from DB in the application thread, and do writing (which is generally more costly performance wise) in the Auditing thread.
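A rough sketch of the first solution, with the commit handed off to an executor (auditExecutor and audit are names I made up; note the caveat above that the domain object must not be mutated before the task runs, hence the idea of passing a snapshot/serialized copy instead):

final ExecutorService auditExecutor = Executors.newSingleThreadExecutor();

void audit(final String author, final Object domainObject) {
    auditExecutor.submit(new Runnable() {
        @Override
        public void run() {
            javers.commit(author, domainObject);   // the normal synchronous commit, off the request thread
        }
    });
}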
QUESTIONS:
am I missing anything here? Could this work?
Does JaVers have support to create a snapshot of the object which can then be moved to another thread? It does this internally somewhere, so maybe it's something we could use.
JUST FYI: Not relevant for the question, but here are some other challenges I can think of and how I'm planning to solve them:
Since audits are not done in the same transaction, a failed business transaction would make rolling back the audit complex. So we need to audit only objects that were successfully committed. I intend to do that by using a Hibernate Interceptor, listening to afterTransactionCompletion and only committing objects updated by that transaction (see the sketch after this list).
In the case of lazy-loaded objects, I can see how, if we try to access them once the transaction is finished, the lazy-loaded properties might not be accessible anymore (as the session might be closed too). I don't know how to fix this, but it might not be an issue as I think we load most properties eagerly.
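A hedged sketch of that interceptor idea (AuditInterceptor is a made-up name; the interceptor has to be session-scoped so the collected set belongs to a single transaction, and onSave would be needed as well if inserts should be audited):

public class AuditInterceptor extends EmptyInterceptor {

    private final Set<Object> dirtyEntities = new HashSet<Object>();

    @Override
    public boolean onFlushDirty(Object entity, Serializable id, Object[] currentState,
                                Object[] previousState, String[] propertyNames, Type[] types) {
        dirtyEntities.add(entity);   // remember what this transaction touched
        return false;                // we don't modify any state here
    }

    @Override
    public void afterTransactionCompletion(Transaction tx) {
        if (tx.wasCommitted()) {
            // hand dirtyEntities over to the asynchronous JaVers commit here
        }
        dirtyEntities.clear();
    }
}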
Interesting question.
First, a correction: all JaVers core modules are designed to decouple audit data from application data. As you mentioned, the user provides a ConnectionProvider to be used by JaVers; it can point to any database you want.
What is not designed for multiple databases are the Spring integration modules for SQL, i.e. javers-spring-jpa and javers-spring-boot-starter-sql. They just cover the most common scenario: the same DB for the application and for JaVers.
You are right about the lack of an async commit. Fortunately, it can be implemented in JaVers core alone, without changing the repositories.
The API could be:
CompletableFuture<Commit> javers.commitAsync(..., Executor);
First, JaVers takes a snapshot of the user's objects; that's fast, so it can be done in the current thread.
Then, DB reads (loading latest snapshots) and DB writes (inserting new snapshots) can be done asynchronously (submitted to the given Executor).
As you mentioned, it requires a new approach to DB transactions. We plan to implement the Commit Withdrawal feature, so the app would be able to withdraw a JaVers commit after a rollback of the main DB transaction. See https://github.com/javers/javers/issues/588
We have a somewhat huge application which started a decade ago and is still under active development. So some parts are still in J2EE 1.4 architecture, others using Java EE 5/6.
While testing some new code, I realized that I had data inconsistencies between information coming in through the old and new code parts, where the old one uses the Hibernate Session directly and the new one an injected EntityManager. This led to the problem that one part couldn't see new data from the other and thus also created a database record, resulting in a primary key constraint violation.
It is planned to migrate the old code completely to get rid of J2EE, but in the meantime - what can I do to coordinate database access between the two parts? And shouldn't both ways come together at some point within the application server, in the Hibernate layer, regardless of whether the data is accessed via JPA or directly?
You can mix both the Hibernate Session and the EntityManager in the same application without any problem. The EntityManagerImpl simply delegates calls to a private SessionImpl instance.
What you describe is a transaction configuration anomaly. Every database transaction runs in isolation (unless you use READ_UNCOMMITTED, which I guess is not the case), but once you commit it the changes become visible to any other transaction or connection. So once a transaction is committed you should see all changes in any other Hibernate Session, JDBC connection or even your database UI tool.
You said that there was a primary key conflict. This can't happen if you use the Hibernate identity or sequence generators. With the old hi/lo generator, you can have problems if an external connection inserts records into the same table for which Hibernate is managing hi/lo identifiers.
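For reference, a mapping along these lines lets the database itself hand out identifiers, so the old and new code paths cannot collide (the sequence name is just an example):

@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "order_seq")
@SequenceGenerator(name = "order_seq", sequenceName = "ORDER_SEQ")
private Long id;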
This problem can also occur if there is a master/master replication anomaly. If you have multiple nodes and there is no strictly consistent replication, you can end up with primary key constraint violations.
Update
Solution 1:
When coordinating the new and the old code trying to insert the same entity, you could have a select-then-insert logic running in a SERIALIZABLE transaction. The SERIALIZABLE transaction acquires the appropriate locks on your behalf, so you can keep a default READ_COMMITTED isolation level while only the problematic service methods are marked SERIALIZABLE.
So both the old code and the new code run this logic: a select to check whether there is already a row satisfying the constraint, followed by an insert only if nothing is found. The SERIALIZABLE isolation level prevents phantom reads, so it should prevent the constraint violations.
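A minimal sketch of that select-then-insert logic (Widget is a placeholder; how the SERIALIZABLE isolation level is applied depends on your transaction setup, e.g. Connection.setTransactionIsolation or a container transaction attribute on the service method):

public Widget findOrCreate(Session session, String code) {
    Widget existing = (Widget) session.createQuery("from Widget w where w.code = :code")
            .setString("code", code)
            .uniqueResult();
    if (existing == null) {
        existing = new Widget(code);
        session.save(existing);   // the insert only happens if the select found nothing
    }
    return existing;
}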
Solution 2:
If you are open to delegating this task to JDBC, you might also investigate the MERGE SQL statement, if your current database supports it. Basically, this is an upsert operation issuing an update or an insert behind the scenes. This option is much more attractive since you can run it even with READ_COMMITTED. The only drawback is that you can't use Hibernate for it, and only some databases support it.
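A sketch of the JDBC route, assuming an Oracle-style MERGE (the table and column names are made up, and the exact syntax differs per database):

PreparedStatement ps = connection.prepareStatement(
        "MERGE INTO lookup_value t " +
        "USING (SELECT ? AS code FROM dual) s ON (t.code = s.code) " +
        "WHEN NOT MATCHED THEN INSERT (code) VALUES (s.code)");
ps.setString(1, code);
ps.executeUpdate();   // inserts if the row is absent, does nothing otherwise
ps.close();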
If you instantiate a SessionFactory for the old code and an EntityManagerFactory for the new code separately, that can lead to different values in the first-level cache. If, during a single HTTP request, you change a value in the old code but do not immediately commit, the value will be changed in that session's cache, but it will not be available to the new code until it is committed. Independently of any transaction or database locking that would protect persistent values, that mix of two different Hibernate sessions can produce weird results for in-memory values.
Admittedly, the injected EntityManager still uses Hibernate. IMHO the most robust solution is to get the EntityManagerFactory for the PersistenceUnit and cast it to a Hibernate EntityManagerFactoryImpl. Then you can directly access the underlying SessionFactory:
SessionFactory sessionFactory = ((EntityManagerFactoryImpl) entityManagerFactory).getSessionFactory();
You can then safely use this SessionFactory in your old code, because now it is unique in your application and shared between old and new code.
You still have to deal with the problem of session creation/closing and transaction management. I suppose it is already implemented in the old code. Without knowing more, I think that you should port it to JPA, because I am pretty sure that if an EntityManager exists, sessionFactory.getCurrentSession() will give you its underlying Session, but I cannot say the same for the opposite direction.
I've run into a similar problem when I had a list of enumerated lookup values, where two pieces of code would check for the existence of a given value in the list, and if it didn't exist the code would create a new entry in the database. When both of them came across the same non-existent value, they'd both try to create a new one and one would have its transaction rolled back (throwing away a bunch of other work we'd done in the transaction).
Our solution was to create those lookup values in a separate transaction that committed immediately; if that transaction succeeded, then we knew we could use that object, and if it failed, then we knew we simply needed to perform a get to retrieve the one saved by another process. Once we had a lookup object that we knew was safe to use in our session, we could happily do the rest of the DB modifications without risking the transaction being rolled back.
It's hard to know from your description whether your data model would lend itself to a similar approach, where you'd at least commit the initial version of the entity right away, and then once you're sure you're working with a persistent object you could do the rest of the DB modifications that you knew you needed to do. But if you can find a way to make that work, it would avoid the need to share the Session between the different pieces of code (and would work even if the old and new code were running in separate JVMs).
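For illustration, the "commit the lookup immediately" pattern could look like this with Spring's REQUIRES_NEW propagation (EJB's REQUIRES_NEW attribute works the same way; LookupValue, lookupDao and lookupService are hypothetical, and with Spring proxies create() must be called through the proxy, i.e. live in a different bean than getOrCreate()):

// runs in its own transaction, committed as soon as the method returns
@Transactional(propagation = Propagation.REQUIRES_NEW)
public LookupValue create(String code) {
    LookupValue created = new LookupValue(code);
    lookupDao.save(created);
    return created;
}

// the caller falls back to a plain read if the insert lost the race
public LookupValue getOrCreate(String code) {
    try {
        return lookupService.create(code);
    } catch (DataIntegrityViolationException e) {
        return lookupDao.findByCode(code);   // someone else committed it first
    }
}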
I have a scenario where I read from a set of tables in a Java service.
I've annotated the service class @Transactional.
Is there any way to lock the corresponding rows I read, in all the tables I use, within my transaction and release them at the end of the transaction?
P.S.: I'm using Spring with Hibernate, and I'm new to this locking concept.
Any material/example links would be of much help.
Thanks
This depends on the underlying database engine and selected transaction isolation level.
Some database systems lock rows on reads; others use MVCC, in which case your updates won't be visible to other transactions until your transaction finishes, and your transaction operates on a snapshot of the data taken at its start.
So a simple answer is: choose appropriately high transaction isolation level (e.g. SERIALIZABLE) for your needs and a database engine that supports it.
http://en.wikipedia.org/wiki/Isolation_(database_systems)
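For example, with Spring's @Transactional you can raise the isolation level just for the method that does the sensitive reads (the method and DAO names are made up):

@Transactional(isolation = Isolation.SERIALIZABLE)
public List<Order> loadOrdersForReport(long customerId) {
    // all reads in here see one consistent view of the data; whether the rows
    // are actually locked or snapshotted depends on the database engine
    return orderDao.findByCustomer(customerId);
}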
In my web application I have several threads that potentially access the same data concurrently, which is why I decided to implement optimistic (versioning) and pessimistic locking with Hibernate.
Currently I use the following pattern to lock an entity and perform write operations on it (using Spring's transaction manager and transaction demarcation with @Transactional):
@Transactional
public void doSomething(Entity entity) {
    session.lock(entity, LockMode.UPGRADE);
    session.refresh(entity);

    // I change the entity itself as well as entities in a relationship.
    entity.setBar(...);
    for (Child childEntity : entity.getChildren()) {
        childEntity.setFoo(...);
    }
}
However, sometimes I get a StaleObjectException when the @Transactional method is flushing, telling me that a ChildEntity has been modified concurrently and now has a wrong version.
I guess I am not correctly refreshing the entity and its children, so I am working with stale data. Can someone point out how to achieve this? Some of my thoughts included clearing the persistence context (the session) or calling session.lock(entity, LockMode.READ) again, but I am not sure what is correct here.
Thanks for your help!
You may want to take a look at this Hibernate issue: LockMode.Upgrade doesn't refresh entity values.
In short: Hibernate does NOT perform a select after a successful lock if the given entity was already loaded. You need to call refresh on the entity yourself after you have acquired the lock.
Why do you make LockMode.UPGRADE and optimistic locking live together? They seem like conflicting approaches.
Hibernate never locks objects in memory and always uses the locking mechanism of the database. Also, "if the requested lock mode is not supported by the database, Hibernate uses an appropriate alternate mode instead of throwing an exception. This ensures that applications are portable." This means that if your database doesn't support SELECT ... FOR UPDATE, you will most probably get these exceptions.
Another possible reason is that you haven't used "org.hibernate.annotations.CascadeType.LOCK" for children.
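On the parent mapping that would look roughly like this (a sketch; the children field matches the question's getChildren(), the mappedBy value is an assumption):

@OneToMany(mappedBy = "parent")
@Cascade(org.hibernate.annotations.CascadeType.LOCK)
private Set<Child> children = new HashSet<Child>();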
I've got a Vehicle entity, with a Vehicle DTO.
I use OpenSessionInView with Stripes.
In my Stripes action bean, I need to generate a CSV containing the data for about 50,000 Vehicles.
Thus, as a Stripes developer told me to do, I write the file to the output stream in the following method:
StreamingResolution() {...}.stream(HttpServletResponse)
I have a service that takes some pagination information, loads a part of the vehicles and transforms them into DTOs.
These DTOs are returned to the view and written to the CSV.
The pagination system (500 items per page) was introduced to avoid holding a list of 50,000 DTOs and running into memory problems.
But it doesn't work perfectly yet. With jmap I saw that at the end of the CSV process, there are more than 40,000 vehicles loaded in heap space and not garbage collected.
With the YourKit profiler, it seems to me that these entities are still in Hibernate's L1 cache (referenced in the StatefulPersistenceContext), and since I have OpenSessionInView, I guess the problem is that the conversation is rather long and the cache needs to be cleaned...
I just wonder how to do that in an elegant way, since my DAO methods loading vehicles are used by a lot of services that don't necessarily need a session clear/flush.
Does anyone know what I could do? I guess I could add a method in the DAO/service to clear the session, but it's not very elegant...
This is a pretty big project and I've given a very simplified description of it. Please don't tell me not to use OpenSessionInView or something like that; it's not my decision... ;)
It may not be elegant, but in cases like this evicting entities from the session is the only practical solution.
For example, once you've finished writing an entity's data to the output stream, call session.evict(entity) to remove it from the session cache. Alternatively, call this at the end of each "page".
The combination of the paging mechanism and the eviction should ensure you never have more than 500 entities in the cache at one time.
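A sketch of what the export loop could look like with per-page eviction (the DAO and writer names are placeholders):

int pageSize = 500;
for (int page = 0; ; page++) {
    List<Vehicle> vehicles = vehicleDao.findPage(page, pageSize);
    if (vehicles.isEmpty()) {
        break;
    }
    for (Vehicle vehicle : vehicles) {
        csvWriter.write(toDto(vehicle));
        session.evict(vehicle);            // drop it from the first-level cache right away
    }
    // alternatively: session.flush(); session.clear(); once per page
}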
Evicting objects is elegant, and it is the only good solution when doing reports, exports to CSV, etc.
The most elegant way is to implement it inside a service that opens an iterator and evicts each item after processing it. The caller passes a class to the service call that acts as the item consumer:
public interface ItemConsumer {
void consume(Item item);
}
public void processAllItems(ItemConsumer consumer) {
.. do your job
}
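A possible implementation of processAllItems, using a scrollable cursor and per-item eviction (just a sketch of the idea above; session is the current Hibernate Session):

public void processAllItems(ItemConsumer consumer) {
    ScrollableResults results = session.createQuery("from Item")
            .setReadOnly(true)
            .scroll(ScrollMode.FORWARD_ONLY);
    try {
        while (results.next()) {
            Item item = (Item) results.get(0);
            consumer.consume(item);
            session.evict(item);   // keep the persistence context small
        }
    } finally {
        results.close();
    }
}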
You probably want to explicitly open a stateless session.
e.g.
StatelessSession session = sessionFactory.openStatelessSession();
From the docs:
"A StatelessSession has no persistence context associated with it and does not provide many of the higher-level life cycle semantics. In particular, a stateless session does not implement a first-level cache nor interact with any second-level or query cache. It does not implement transactional write-behind or automatic dirty checking."
See here for more details:
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html
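For the CSV case above, that could look roughly like this (Vehicle, toDto and csvWriter are placeholders):

StatelessSession session = sessionFactory.openStatelessSession();
Transaction tx = session.beginTransaction();
try {
    ScrollableResults results = session.createQuery("from Vehicle")
            .scroll(ScrollMode.FORWARD_ONLY);
    while (results.next()) {
        Vehicle vehicle = (Vehicle) results.get(0);
        csvWriter.write(toDto(vehicle));   // nothing is cached, so nothing to evict
    }
    results.close();
    tx.commit();
} finally {
    session.close();
}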