How do you do your Hibernate session management in a Java Desktop Swing application? Do you use a single session? Multiple sessions?
Here are a few references on the subject:
http://www.hibernate.org/333.html
http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
http://in.relation.to/Bloggers/HibernateAndSwingDemoApp
Single session. Start a transaction when you need to do a set of operations (such as updating data after a dialog's OK button), and commit it at the end. The connection stays constantly open (since it's the same session), so all opportunities for caching can be used by both Hibernate and the RDBMS.
It may also be a good idea to implement a transparent session re-open in case the connection goes dead -- users tend to leave applications open for extended periods, and the app should still work on Monday even if the DB server was rebooted over the weekend.
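For illustration, here is a minimal sketch of such a helper with transparent re-open, assuming one shared session for the whole application (the MyHibernateUtils class is hypothetical, matching the example further below):

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public final class MyHibernateUtils {
    private static final SessionFactory FACTORY =
            new Configuration().configure().buildSessionFactory();
    private static Session session;

    // Transparently re-opens the session if the connection died,
    // e.g. after the DB server was rebooted over the weekend.
    public static synchronized Session getSession() {
        if (session == null || !session.isOpen() || !session.isConnected()) {
            session = FACTORY.openSession();
        }
        return session;
    }

    public static void begin() {
        getSession().beginTransaction();
    }

    public static void commit() {
        getSession().getTransaction().commit();
    }
}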
Update
Jens Schauder provided a reason to use multiple sessions: partial (unwanted) updates to the session. Well, that comes down to the way you use Hibernate.
Suppose we have two dialogs open (as in Jens' blog example). If the user clicks a radio button and we immediately update the Hibernate entity associated with it, then, when the user clicks Cancel, we're in trouble -- the session has already been updated.
The right way, as I see it, is to update only dialog variables (non-Hibernate objects). Then, when the user clicks OK, we begin a transaction, merge the updated objects, and commit the transaction. No garbage ever gets saved into the session.
MyHibernateUtils.begin();
Settings settings = DaoSettings.load();
// update settings here
DaoSettings.save(settings);
MyHibernateUtils.commit();
If we implement such a clean separation of concerns, we can later switch to multiple sessions with a simple change of MyHibernateUtils.begin() implementation.
As for a possible memory leak, well... Transaction.commit() calls Session.flush(), which, AFAIK, cleans the cache too. Also, one may manually control the caching policy by calling Session.setCacheMode().
The problem with "session per thread" is that good Swing applications do database access outside the EDT, usually in newly created SwingWorker threads. This way, "session per thread" quickly becomes "session per click".
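As a sketch, the usual shape of that off-EDT access looks something like this (Settings, DaoSettings and MyHibernateUtils are the hypothetical helpers from the examples above):

// Inside e.g. an ActionListener on the OK button:
new SwingWorker<Settings, Void>() {
    @Override
    protected Settings doInBackground() {
        // Runs on a worker thread, not the EDT. With "session per
        // thread", each such worker would get a brand-new session.
        MyHibernateUtils.begin();
        Settings settings = DaoSettings.load();
        MyHibernateUtils.commit();
        return settings;
    }

    @Override
    protected void done() {
        // Back on the EDT: safe to update Swing components here.
    }
}.execute();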
Don't use a single session. For everything but the smallest applications it will grow, collecting outdated data, and become slower and slower, since the dirty check needs to check every entity in the session.
If you don't need/want lazy loading and tracking of changes by Hibernate, you can use short-lived sessions.
But if you want to benefit from the power of Hibernate use the approach I described in my blog:
http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
or in the German version:
http://blog.schauderhaft.de/2007/12/17/hibernate-sessions-in-fat-client-anwendungen/
AFAIK it is really the same approach described at http://in.relation.to/Bloggers/HibernateAndSwingDemoApp but with a recommendation on how to actually scope your session:
One session per frame, with the exception of modal frames, which use the session of the parent frame.
Just make sure never to combine objects from different sessions. It will cause lots of trouble.
In reply to Vladimir's update:
Cancel actually works extremely nicely with my approach: just throw away the session.
session.flush() does not fix the problem of the ever-growing session when you work with a single session for the application. Of course, with the approach you describe, you can work with short-lived sessions, which should work OK. BUT
you lose a lot: lazy loading only works with attached objects, and so does automatic detection of dirty objects. If you work with detached objects (or objects that aren't entities at all), you have to do this yourself.
Use one session per thread (doc) and a version or timestamp column to allow optimistic concurrency, thereby avoiding session-to-instance conflicts. Attach instances to the session when needed, unless you need long-running transactions or a restrictive isolation level.
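A minimal sketch of such a version column, using JPA annotations (the Settings entity is illustrative):

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Settings {
    @Id
    private Long id;

    // Hibernate increments this on every update and includes it in the
    // UPDATE's WHERE clause, so a stale concurrent update fails with an
    // optimistic-lock exception instead of silently overwriting data.
    @Version
    private int version;

    // ... other fields, getters and setters
}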
Related
I am writing an application in Java SE 8 and have recently migrated the database system from raw JDBC code to JPA. The interface itself is so much simpler, but I am running into an issue with the way I have designed my code which does not work well with JPA and I am unsure of how to proceed.
The primary issue I am having is that I cannot store references to my entities in code for any period of time anymore, because they immediately become out-of-date. I used to have a central persistence context where the one "true" instance of all my entities was always stored in code, and changes made to them would always be reflected everywhere because there were no duplicate instances. I realize this is not smart design when it comes to memory efficiency, but it allowed me to, for instance, implement the observer pattern and guarantee that any entity updates would be immediately visible in GUIs. But now, as soon as I load an entity from the database using JPA and close the EntityManager (as I have read so often that you must do), that instance merely represents a snapshot in time from when it was loaded, and my GUIs will be waiting for updates from a dead object. Loading that entity from elsewhere in the code and making a change will do nothing, as it is a different instance altogether, with an empty list of subscribers (transient). There are a lot more cases in my code where I attempt to hold a reference to an entity for whatever purpose, and a lot of them rely on those entities being up-to-date.
I know that EntityManager is intended to be a short-lived object, but now I am thinking that it maybe wouldn't be such a bad idea after all to keep an EntityManager open for the lifetime of my program to replace that construct that I had in my old code. I quite frankly don't understand what the point of closing EntityManager so quickly is - isn't it beneficial to have your entities managed over a longer period of time? When I was first reading about how changes to managed entities are detected and persisted automatically, I hoped that that would allow me to completely detach my business logic from my persistence layer, and trust that all my changes were being saved. It was rather disillusioning to discover that in order for those entities to be managed in the first place, I would have to leave the EntityManager open for the duration of that business logic. And that would require them to be scoped higher than the method they are created in, so I could close them later. But all the literature implores the use of short-lived, low-scoped EntityManagers, which just seems like a direct contradiction.
I am somewhat at a loss for how to proceed. I would love to make full use of JPA and all of its extremely useful features, but I feel like I might be missing the point of EntityManager being short-lived. It seems like it would be so much more convenient long-lived. Can anyone give me some guidance?
Your central 'cache' with a single instance of data is a common idea, but it is difficult to manage. Some ORM/JPA providers have caching built in and maintain something similar (check out EclipseLink's shared cache), but they usually have complex mechanisms for limiting and managing what could be endless amounts of data that can quickly become stale. EclipseLink has tie-ins to the database to get notifications when data changes, and can be configured for cache coordination when run on different servers. Without such capabilities, your cache will be stale - and worse, your cache will have great difficulty maintaining transactional isolation. Any change to those cached objects is immediately visible to all processes, regardless of whether the transaction goes through to the database or rolls back. Use of JPA is meant to guarantee that you only see committed data (excluding the changes you've made in the current transaction/unit of work).
To answer your specific question about keeping an EM open, as it generally applies to JPA providers: EntityManagers keep hooks to the entities read in through them so that they can track and manage all changes made to them. This can lead to very large amounts of data being held - check the forums for memory-leak questions, as keeping EMs open for an extended period is the cause of quite a few. You gain object identity, but it comes at the cost of tracking everything read in through the EM - so you will likely have to clear the persistence context (em.clear()) at key points, or find provider-specific mechanics to dereference what it might be holding onto so GC can do its thing.
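A sketch of that periodic clearing, assuming a deliberately long-lived EntityManager (emf, idsToProcess and MyEntity are illustrative names, not from the original question):

EntityManager em = emf.createEntityManager();
int processed = 0;
for (Long id : idsToProcess) {
    em.getTransaction().begin();
    MyEntity e = em.find(MyEntity.class, id); // becomes managed
    e.setProcessed(true);                     // picked up by dirty checking
    em.getTransaction().commit();
    if (++processed % 100 == 0) {
        em.clear(); // detach everything so the context doesn't grow unbounded
    }
}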
Other drawbacks are that the EntityManager itself then becomes very large and difficult to merge changes into. Depending on how you merge changes into your app, you'll need a way to get those changes into your database. Having JPA go through a very large set of entities built up over time to find changes to a small dataset is very inefficient, and you'll still have to find ways to refresh these entities if changes are made through other EntityManagers or applications.
We have a somewhat huge application which started a decade ago and is still under active development. So some parts are still in J2EE 1.4 architecture, others using Java EE 5/6.
While testing some new code, I realized that I had data inconsistency between information coming in through old and new code parts, where the old one uses the Hibernate session directly and the new one an injected EntityManager. This led to the problem that one part couldn't see new data from the other part and thus also created a database record, resulting in a primary key constraint violation.
It is planned to migrate the old code completely to get rid of J2EE, but in the meantime - what can I do to coordinate database access between the two parts? And shouldn't at some point within the application server both ways come together in the Hibernate layer, regardless if accessed via JPA or directly?
You can mix both the Hibernate Session and the EntityManager in the same application without any problem. The EntityManagerImpl simply delegates calls to a private SessionImpl instance.
What you describe is a transaction configuration anomaly. Every database transaction runs in isolation (unless you use READ_UNCOMMITTED, which I guess is not the case), but once you commit it, the changes are available to any other transaction or connection. So once a transaction is committed, you should see all changes in any other Hibernate Session, JDBC connection, or even your database UI management tool.
You said that there was a primary key conflict. This can't happen if you use the Hibernate identity or sequence generators. With the old hi/lo generator, you can have problems if an external connection tries to insert records into the same table for which Hibernate uses the hi/lo identifier generator.
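For instance, a sequence-based identifier sidesteps the hi/lo clash (a sketch; the generator and sequence names are illustrative):

import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

// The database sequence hands out ids atomically, so external
// connections inserting into the same table cannot collide the
// way they can with hi/lo.
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "log_seq")
@SequenceGenerator(name = "log_seq", sequenceName = "LOG_SEQ")
private Long id;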
This problem can also occur if there is a master/master replication anomaly. If you have multiple nodes and there is no strictly consistent replication, you can end up with primary key constraint violations.
Update
Solution 1:
When coordinating the new and the old code trying to insert the same entity, you could have select-then-insert logic running in a SERIALIZABLE transaction. The SERIALIZABLE transaction acquires the appropriate locks on your behalf, so you can still have a default READ_COMMITTED isolation level while only the problematic service methods are marked as SERIALIZABLE.
So both the old code and the new code run this logic: a select to check whether there is already a row satisfying the constraint, followed by an insert only if nothing is found. The SERIALIZABLE isolation level prevents phantom reads, so I think it should prevent constraint violations.
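A sketch of that select-then-insert as a Spring service method (the entity, query and method names are illustrative, not from the original question):

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

public class LookupService {

    @PersistenceContext
    private EntityManager em;

    // Only this method runs at SERIALIZABLE; the rest of the
    // application keeps the default READ_COMMITTED level.
    @Transactional(isolation = Isolation.SERIALIZABLE)
    public LookupValue findOrCreate(String code) {
        List<LookupValue> existing = em.createQuery(
                "select v from LookupValue v where v.code = :code",
                LookupValue.class)
            .setParameter("code", code)
            .getResultList();
        if (!existing.isEmpty()) {
            return existing.get(0);
        }
        LookupValue created = new LookupValue(code);
        em.persist(created);
        return created;
    }
}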
Solution 2:
If you are open to delegating this task to JDBC, you might also investigate the MERGE SQL statement, if your current database supports it. Basically, this is an upsert operation issuing an update or an insert behind the scenes. This command is much more attractive since you can run it even at READ_COMMITTED. The only drawback is that you can't use Hibernate for it, and only some databases support it.
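A sketch of such an upsert over plain JDBC (Oracle-style MERGE syntax; the table and column names are illustrative):

// Inserts the row only if it does not exist yet, in one atomic
// statement; this runs fine at READ_COMMITTED.
String sql =
    "MERGE INTO lookup_value t " +
    "USING (SELECT ? AS code FROM dual) s ON (t.code = s.code) " +
    "WHEN NOT MATCHED THEN INSERT (code) VALUES (s.code)";
try (PreparedStatement ps = connection.prepareStatement(sql)) {
    ps.setString(1, code);
    ps.executeUpdate();
}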
If you separately instantiate a SessionFactory for the old code and an EntityManagerFactory for the new code, that can lead to different values in the first-level cache. If, during a single HTTP request, you change a value in old code but do not immediately commit, the value will be changed in the session cache, but it will not be available to new code until it is committed. Independently of any transaction or database locking that would protect persistent values, that mix of two different Hibernate sessions can produce weird results for in-memory values.
I assume that the injected EntityManager still uses Hibernate. IMHO the most robust solution is to get the EntityManagerFactory for the persistence unit and cast it to Hibernate's EntityManagerFactoryImpl. Then you can directly access the underlying SessionFactory:
SessionFactory sessionFactory =
        ((EntityManagerFactoryImpl) entityManagerFactory).getSessionFactory();
You can then safely use this SessionFactory in your old code, because now it is unique in your application and shared between old and new code.
You still have to deal with the problem of session creation/closing and transaction management. I suppose it is already implemented in the old code. Without knowing more, I think you should port it to JPA, because I am pretty sure that if an EntityManager exists, sessionFactory.getCurrentSession() will return its underlying Session, but I cannot affirm anything for the opposite direction.
I've run into a similar problem when I had a list of enumerated lookup values, where two pieces of code would check for the existence of a given value in the list, and if it didn't exist the code would create a new entry in the database. When both of them came across the same non-existent value, they'd both try to create a new one and one would have its transaction rolled back (throwing away a bunch of other work we'd done in the transaction).
Our solution was to create those lookup values in a separate transaction that committed immediately; if that transaction succeeded, then we knew we could use that object, and if it failed, then we knew we simply needed to perform a get to retrieve the one saved by another process. Once we had a lookup object that we knew was safe to use in our session, we could happily do the rest of the DB modifications without risking the transaction being rolled back.
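A sketch of that idea with Spring transactions (all names are illustrative, and the two classes would live in separate files; the creator must be a separate bean so the REQUIRES_NEW advice actually applies -- a self-call would bypass the transactional proxy):

public class LookupCreator {
    @PersistenceContext
    private EntityManager em;

    // Commits on its own, immediately, independent of the caller's
    // transaction -- so a duplicate-key failure here rolls back
    // nothing else.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public LookupValue create(String code) {
        LookupValue v = new LookupValue(code);
        em.persist(v);
        return v;
    }
}

public class LookupService {
    private LookupCreator creator; // injected by Spring

    @PersistenceContext
    private EntityManager em;

    public LookupValue findOrCreate(String code) {
        try {
            return creator.create(code);
        } catch (PersistenceException duplicate) {
            // Another process won the race: just read the committed row.
            return em.createQuery(
                    "select v from LookupValue v where v.code = :code",
                    LookupValue.class)
                .setParameter("code", code)
                .getSingleResult();
        }
    }
}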
It's hard to know from your description whether your data model would lend itself to a similar approach, where you'd at least commit the initial version of the entity right away, and then once you're sure you're working with a persistent object you could do the rest of the DB modifications that you knew you needed to do. But if you can find a way to make that work, it would avoid the need to share the Session between the different pieces of code (and would work even if the old and new code were running in separate JVMs).
ThreadLocal<Session> tl = new ThreadLocal<Session>();
tl.set(session);
To get the session later on the same thread:
Employee emp = (Employee) tl.get().get(Employee.class, 1);
If our application is web based, the web container creates a separate thread for each request.
If all these requests concurrently used the same single Session object, we could get
unwanted results in our database operations.
To overcome this, it is said to be good practice to store our session in a ThreadLocal object,
which does not allow concurrent usage of the session. I think, if that is correct, application performance would be very poor.
What is a good approach in the above scenario?
If I'm on the wrong track, in which situations do we need ThreadLocal?
I'm new to Hibernate; please excuse me if this type of question is silly.
Thanks in advance.
Putting the Hibernate Session in a ThreadLocal is unlikely to achieve the isolation between requests that you want. Surely you create a new Session for each request, using a SessionFactory backed by a connection-pooling implementation of DataSource, which means that the local reference to the Session is on the stack anyway. Changing that local reference to a member variable only complicates the code, IMHO.
Anyhow, ensuring isolation within a single container doesn't address the actual problem - how data is accessed efficiently while maintaining consistency in a multi-threaded environment.
There are two parts to the problem you mention - the first is that a database connection is an expensive resource, the second that you need to ensure some level of data consistency between threads/requests.
The general approach to the resource problem is to use a database connection pool (which I'd guess you're already doing). As each request is processed, connections are obtained from the pool and returned when finished, but importantly the connections in the pool are maintained beyond the lifetime of a request, thus avoiding the cost of creating a connection each time one is needed.
The consistency problem is a little trickier, and there's no one-size-fits-all model. What you need to do is think about what level of consistency you need - questions like: does it matter if data is read at the same time it's being written? Do updates absolutely have to be atomic? And so on.
Once you know the answers to these questions, there are two places you need to look at consistency - in the database and in the code.
With the database, you need to look at database-level locks and create a scheme suitable for your application by applying the appropriate isolation levels.
With the code, things are a little more complicated. Data is often loaded and displayed for a period of time before updates are written back - no problem if there's a single user, but in a multi-user system it's possible that updates are made based on stale data or that multiple updates occur simultaneously. It may be acceptable to have a last-update-wins policy, in which case it's simple; if not, you'll need to use version numbers or old/new comparisons to ensure integrity at the time the updates are applied.
I am not sure whether you are compelled to use ThreadLocal. Using ThreadLocal to store the session object is definitely not a good idea, especially when you are using Hibernate along with Spring.
A typical scheme for using Hibernate with Spring is:
Inject the SessionFactory into your DAO. I assume that you already have a SessionFactory configured, backed by a pooled DataSource.
Now in your DAO class, a session can be accessed as follows.
Session session = sessionFactory.getCurrentSession();
Here is a link to a related article.
Please note that this example is specific to the Hibernate 3.x APIs. It takes care of session creation/closure and thread-safety internally, and it's neat too.
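A minimal DAO along these lines might look as follows (EmployeeDao and Employee are illustrative names):

import org.hibernate.SessionFactory;

public class EmployeeDao {

    private SessionFactory sessionFactory;

    // Injected by Spring; backed by a pooled DataSource.
    public void setSessionFactory(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public Employee find(int id) {
        // getCurrentSession() returns the session bound to the current
        // transaction, so no hand-rolled ThreadLocal is needed.
        return (Employee) sessionFactory.getCurrentSession()
                .get(Employee.class, id);
    }
}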
I've recently taken on the database/hibernate side of our project and am having terrible trouble understanding some fundamentals of our design regarding the use of managed sessions.
We have a util class containing a static session that is only initialised once. Retrieval of the session is used by every DAO in the system via a static method getBoundSession(). The application runs 24/7. Is this a common design?
One benefit, which is extremely useful, is that lazy attributes/collections on domain objects can be used throughout the business-logic tier, since the session is always open. Another benefit is that retrieved objects stay cached within the session.
I feel we must be using Hibernate in the wrong way, it just doesn't seem right to have a single permanently open session. Also it causes problems when separate threads are using the util class, hence sharing the session. On the flip side I can't find a way to achieve the above benefits (particularly the first) with a different design. Can anyone shed any light on this?
Thanks
James
We have a util class containing a static session that is only initialised once. Retrieval of the session is used by every DAO in the system via a static method getBoundSession(). The application runs 24/7. Is this a common design?
No, it's not. The most common pattern in a multi-user client/server application is session-per-request, and a session-per-application approach in a multi-user application is not only an anti-pattern, it's totally wrong:
A Session is not thread-safe.
You should roll back the transaction and close the Session after a Hibernate exception if you want to keep object state and the database in sync.
The Session will grow indefinitely if you keep it open too long.
You really need to read the whole of Chapter 11, Transactions and Concurrency.
On the flip side I can't find a way to achieve the above benefits (particularly the first) with a different design.
Either use the OSIV (Open Session In View) pattern or explicitly load what you need per flow. And if you want to benefit from global caching, use the second-level cache.
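For reference, here is a minimal OSIV-style filter sketch (one session per request, kept open while the view renders; real applications would typically use Spring's OpenSessionInViewFilter rather than rolling their own):

import java.io.IOException;
import javax.servlet.*;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class OsivFilter implements Filter {

    private SessionFactory sessionFactory; // assume looked up in init()

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        Session session = sessionFactory.openSession();
        // Make the session reachable for DAOs during this request,
        // e.g. via a request attribute or a ThreadLocal holder.
        req.setAttribute("hibernateSession", session);
        try {
            chain.doFilter(req, res); // lazy loading works while the view renders
        } finally {
            session.close(); // closed only after the view is done
        }
    }

    public void init(FilterConfig config) { /* look up the SessionFactory */ }

    public void destroy() { }
}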
Keeping a session open for an extended period of time is OK (although that should not be eternity :-). A session should identify a unit of work - a coherent set of queries/updates which logically belong together. Can you identify such units in your app, e.g. client requests or conversations? If so, create a separate session for each of them.
You should also definitely use a separate session per thread (typically a unit of work is handled by a single thread anyway). A simple way to achieve this is to use thread-local storage.
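A minimal sketch of that thread-local scheme (the SessionFactory is assumed to be configured elsewhere):

import org.hibernate.Session;
import org.hibernate.SessionFactory;

public final class ThreadLocalSessions {

    private static final ThreadLocal<Session> HOLDER = new ThreadLocal<Session>();

    public static Session current(SessionFactory factory) {
        Session s = HOLDER.get();
        if (s == null || !s.isOpen()) {
            s = factory.openSession(); // one session per thread
            HOLDER.set(s);
        }
        return s;
    }

    // Call at the end of the unit of work handled by this thread.
    public static void release() {
        Session s = HOLDER.get();
        HOLDER.remove();
        if (s != null && s.isOpen()) {
            s.close();
        }
    }
}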
It's an anti-pattern.
If you use one session for all requests, then consider 100 clients (100 requests/threads) running almost simultaneously. You detach something from the session, but then another user reloads the same thing. You would need synchronization, which would hurt performance, and you would get totally random behaviour that would be a nightmare to debug.
The SessionFactory is static/per-application, not the Session. The factory should build a session whenever one is required. Read Hibernate's sessions and transactions documentation.
Please describe the typical lifecycle of a Hibernate object (one that maps to a DB table) in a web app.
Suppose you create a new instance of an object and persist it in the DB.
But during the app's lifetime you'll be working on a detached object, and eventually
you need to update it in the database, for example on exit.
What does this look like with Hibernate and Spring?
P.S. Can transactions and sessions live across servlet transitions, so that we open one session and use it in all servlets without needing to reopen it?
I'll try to give a descriptive example.
Suppose that when the app starts, a log record is created. This can be done at once:
Log log = new Log(...) and then something like save(log) -- Log corresponds to a table LOG.
Then, as the application processes user input and keeps going, new data accumulates,
and after the second step we could add something to the log object, a collection for example:
// now we have a record of what the user has chosen: Set thisUserChoice,
// so we can update the persistent object -- we have new data now!
// log.userChoices = thisUserChoice;
Here is the heart of my question: how are we supposed to deal with this if we want to
update the database whenever new data comes in from the user?
In a relational model we can work with a row id, so we could fetch the record and update some other data in the row.
In Hibernate we are also able to load an object by its id.
But is THAT the way to go? Is anything better?
You could do everything in a single session. But that's like doing everything in a single class. It could make sense from a beginner's point of view, but nobody does it like that in practice.
In a web app, you can normally expect to have several threads running at once, each dealing with a different user. Each thread would typically have a separate session, and the session would only have managed instances of the objects that were actually needed by that user. It's not that you can completely ignore concurrency in your own code, but it's useful to have hibernate's help. If you were to do everything with one session, you would have to do all the concurrency management yourself.
Hibernate can also manage the concurrency if you have multiple application servers talking to a single database. The separate JVMs can't possibly share the same session in this case...
The lifecycle is described in the hibernate documentation (which I'm sure you've seen).
Whenever a request comes from the web client to the server, the first thing you should do is load the relevant objects (see section 10.3), so that you have persistent, not detached, entities to deal with. Then you do whatever operations are required. When the session closes (i.e. when the server returns the response to the client), it will write any updates to the database. Or, if your operation involves creating new entities, you'll create transient ones (with new) and then call persist() or save() (see section 10.2). That results in a managed entity -- you can make more changes to it, and Hibernate will record those changes when the session closes.
I try to avoid using detached objects. But if I have to use them (perhaps they're stored in the user's HTTP session), then whenever they might need to be saved to the database, I use update() (see section 10.6). This converts the detached object into a managed one, so the session will save any changes to the database when it's closed.
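A minimal sketch of that reattachment, reusing the Log example from the question (sessionFactory and the modified collection are assumed to exist):

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
log.setUserChoices(thisUserChoice); // modified while detached
session.update(log);                // reattach: log is managed again
tx.commit();                        // changes are written here
session.close();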
Spring makes it very easy to generate a new session for each request. You would normally tell Spring to create a sessionFactory, and then every request will be given its own session. Search for "spring hibernate tutorial" and you'll find several examples.
http://scbcd.blogspot.com/2007/01/hibernate-persistence-lifecycle.html This explains transient, persistent objects.
Also have a look at the Lifecycle interface to see what Hibernate does (it provides hooks at all stages for the user to do something).