I've recently taken on the database/hibernate side of our project and am having terrible trouble understanding some fundamentals of our design regarding the use of managed sessions.
We have a util class containing a static session that is only initialised once. Retrieval of the session is used by every DAO in the system via a static method getBoundSession(). The application runs 24/7. Is this a common design?
One extremely useful benefit is that lazy attributes/collections on domain objects can be used throughout the business logic tier, since the session is always open. Another benefit is that the objects retrieved stay cached within the session.
I feel we must be using Hibernate in the wrong way; it just doesn't seem right to have a single permanently open session. It also causes problems when separate threads use the util class and hence share the session. On the flip side I can't find a way to achieve the above benefits (particularly the first) with a different design. Can anyone shed any light on this?
Thanks
James
We have a util class containing a static session that is only initialised once. Retrieval of the session is used by every DAO in the system via a static method getBoundSession(). The application runs 24/7. Is this a common design?
No, it's not. The most common pattern in a multi-user client/server application is session-per-request. A session-per-application approach in a multi-user application is not only an anti-pattern, it's totally wrong:
A Session is not thread-safe.
You should roll back the transaction and close the Session after a Hibernate exception if you want to keep object state and database in sync.
The Session will grow indefinitely if you keep it open too long.
You really need to read the whole Chapter 11. Transactions and Concurrency.
On the flip side I can't find a way to achieve the above benefits (particularly the first) with a different design.
Either use the OSIV (Open Session In View) pattern or explicitly load what you need per flow. And if you want to benefit from global caching, use the second level cache.
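To make the OSIV idea concrete, here is a minimal sketch using only the JDK. FakeSession, currentSession() and handleRequest() are hypothetical stand-ins I made up for illustration, not Hibernate or servlet API; in a real web application a servlet filter or the framework would play the role of handleRequest(). The point is only that the session stays open while both the handler and the view run, and is closed once the request is done.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; none of this is Hibernate API.
class OpenSessionInViewSketch {

    /** Stand-in for a session that can only lazy-load while it is open. */
    static final class FakeSession {
        boolean open = true;

        String lazyLoad(String what) {
            if (!open) {
                throw new IllegalStateException("session is closed");
            }
            return "loaded " + what;
        }
    }

    // The session bound to the thread that is handling the current request.
    private static final ThreadLocal<FakeSession> CURRENT = new ThreadLocal<>();

    static FakeSession currentSession() {
        FakeSession session = CURRENT.get();
        if (session == null) {
            throw new IllegalStateException("no session bound to this thread");
        }
        return session;
    }

    /** Plays the role of the filter: the session spans handler AND view rendering. */
    static String handleRequest(Supplier<String> handlerAndView) {
        FakeSession session = new FakeSession();
        CURRENT.set(session);
        try {
            return handlerAndView.get();
        } finally {
            // Always close and unbind, even if the handler throws.
            session.open = false;
            CURRENT.remove();
        }
    }
}
```

Lazy access works anywhere inside handleRequest(...), and fails fast outside of it, which is the behaviour you want instead of one eternal session.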
Keeping a session open for an extended period of time is OK (although that should not be an eternity :-)). A session should identify a unit of work - a coherent set of queries / updates which logically belong together. Can you identify such units in your app - e.g. client requests or conversations? If so, create a separate session for each of these.
You should also definitely use a separate session per thread (typically a unit of work is handled by a single thread anyway). A simple way to achieve this is using thread local storage.
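A rough sketch of the thread-local idea, assuming a stand-in FakeSession class (hypothetical, not the Hibernate Session) so that it runs with the JDK alone. Each thread that calls current() gets its own instance, and close() unbinds it when the unit of work is over.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for illustration; FakeSession is not a Hibernate class.
class ThreadLocalSessionHolder {

    static final class FakeSession {
        final int id;

        FakeSession(int id) {
            this.id = id;
        }
    }

    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    // Lazily creates one session per thread on first access.
    private static final ThreadLocal<FakeSession> CURRENT =
            ThreadLocal.withInitial(() -> new FakeSession(NEXT_ID.getAndIncrement()));

    static FakeSession current() {
        return CURRENT.get();
    }

    /** Call at the end of the unit of work so the thread does not keep a stale session. */
    static void close() {
        CURRENT.remove();
    }

    /** Runs on a fresh thread and reports which session id that thread saw. */
    static int sessionIdSeenByNewThread() {
        int[] seen = new int[1];
        Thread worker = new Thread(() -> {
            try {
                seen[0] = current().id;
            } finally {
                close();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Within one thread, repeated calls to current() return the same object; a different thread never sees it.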
It's an anti-pattern.
If you use one session for all requests, consider 100 clients (100 requests/threads) running almost simultaneously. You detach something from the session, but then another user reloads the same thing. You will need synchronization, which will hurt performance, and you will get seemingly random behaviour that is a nightmare to debug.
The SessionFactory is static / per-application, not the Session. The factory should build a session whenever one is required. Read the sessions and transactions docs for Hibernate.
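Something along these lines shows the shape of it. FakeSessionFactory and FakeSession are hypothetical stand-ins (not the Hibernate classes) so the sketch compiles on its own; the only static, application-wide object is the factory, and every unit of work opens and closes its own session.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration; these are NOT Hibernate classes.
class SessionPerUnitOfWork {

    /** Stand-in for a session: cheap, short-lived, never shared. */
    static final class FakeSession implements AutoCloseable {
        final int id;
        boolean open = true;

        FakeSession(int id) {
            this.id = id;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Stand-in for the factory: created once and shared by the whole application. */
    static final class FakeSessionFactory {
        private final AtomicInteger nextId = new AtomicInteger(1);

        FakeSession openSession() {
            return new FakeSession(nextId.getAndIncrement());
        }
    }

    // The factory is the static singleton, not the session.
    static final FakeSessionFactory FACTORY = new FakeSessionFactory();

    /** One unit of work: open a session, use it, always close it. */
    static FakeSession handleRequest() {
        try (FakeSession session = FACTORY.openSession()) {
            // ... load and save objects through 'session' here ...
            return session; // returned only so the caller can inspect it afterwards
        }
    }
}
```

Two calls to handleRequest() get two different sessions, and each one is already closed by the time the call returns.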
I'm considering using a singleton Session to write the request log for our application.
Usually I would use Hibernate with Lifestyle PerWebRequest, but in my specific case there is no HTTP context (we use a socket thread to listen for requests). However, since we only use the session to record the log, I think I could use a singleton Hibernate session as well.
Even if multiple threads use the session, all the request log entries will just accumulate in the session and get saved whenever Hibernate decides to flush them. There is no need to read the log right after it is written, so that would work.
The code would be something like this:
// Got socket request
// Doing some stuff here, saving to a legacy database (cannot use Hibernate with this one)
var logging = new Message(info);
loggingService.save(logging); // FYI: if this fails, we don't want to roll back the previous work
The only concern I have is load balancing: with two instances of the program running on multiple machines, it could be a problem. I figure that in that case we would need some locking/synchronization to avoid possible conflicts (though I can't think of any for now).
Is this an OK use for a singleton session, or are there possible impacts that I haven't thought of?
This "brilliant" idea of using a singleton Session cost me quite a few days of cleaning up the mess. At first it produced good results, but then came lots of unpredictable errors.
Now I can confirm that Session isn't designed for something like this.
I was going to delete this question, but decided to leave it here for anyone who has the same "brilliant moment".
TL;DR: I'd recommend everyone stick with the "good old way": one thread, one session.
ThreadLocal<Session> tl = new ThreadLocal<Session>();
tl.set(session);
To get the session and use it:
Employee emp = (Employee) tl.get().get(Employee.class, 1);
If our application is web based, the web container uses a separate thread for each request.
If all these requests concurrently use the same single Session object, we will get unwanted results in our database operations.
To avoid this, it is considered good practice to put our session in a ThreadLocal object, which does not allow concurrent usage of the session. I think that, if this is correct, the application's performance would be very poor.
What is a good approach in the above scenario?
If I'm on the wrong track, in which situations do we need ThreadLocal?
I'm new to Hibernate, so please excuse me if this question is silly.
thanks in advance.
Putting the Hibernate Session in ThreadLocal is unlikely to achieve the isolation between requests that you want. Surely you create a new Session for each request using a SessionFactory backed by a connection pooling implementation of DataSource, which means that the local reference to the Session is on the stack anyway. Changing that local reference to a member variable only complicates the code, imho.
Anyhow, ensuring isolation within a single container doesn't address the actual problem: how is data accessed efficiently while maintaining consistency within a multi-threaded environment?
There are two parts to the problem you mention - the first is that a database connection is an expensive resource, the second that you need to ensure some level of data consistency between threads/requests.
The general approach to the resource problem is to use a database connection pool (which I'd guess you're already doing). As each request is processed, connections are obtained from the pool and returned when finished but importantly the connections in the pool are maintained beyond the lifetime of a request thus avoiding the cost of creating a connection each time it is needed.
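As a toy illustration of what a pool does (real applications should use an existing pooling DataSource rather than anything like this), here is a sketch where TinyConnectionPool and FakeConnection are names I made up. Connections are created once up front, handed out per request and put back afterwards, so the same connection object gets reused across requests.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy pool for illustration only; FakeConnection is not a JDBC connection.
class TinyConnectionPool {

    static final class FakeConnection {
        final int id;

        FakeConnection(int id) {
            this.id = id;
        }
    }

    private final BlockingQueue<FakeConnection> idle;

    /** Pays the (expensive) creation cost once, when the pool is built. */
    TinyConnectionPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 1; i <= size; i++) {
            idle.add(new FakeConnection(i));
        }
    }

    /** Hands out an idle connection; a real pool would usually wait instead of failing. */
    FakeConnection borrow() {
        FakeConnection connection = idle.poll();
        if (connection == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return connection;
    }

    /** Returns the connection so a later request can reuse it. */
    void giveBack(FakeConnection connection) {
        idle.add(connection);
    }

    int idleCount() {
        return idle.size();
    }
}
```

With a pool of size one, a second request that borrows after the first has given its connection back receives the very same object, which is exactly the saving the pool is there for.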
The consistency problem is a little trickier and there's no one size fits all model. What you need to be doing is thinking about what level of consistency you need - questions like does it matter if data is read at the same time it's being written, do updates absolutely have to be atomic, etc.
Once you know the answers to these questions, there are two places you need to look at consistency: in the database and in the code.
With the database you need to look at database-level locks and create a scheme suitable for your application by applying the appropriate isolation levels.
With the code, things are a little more complicated. Data is often loaded and displayed for a period of time before updates are written back - no problem if there's a single user, but in a multi-user system it's possible that updates are made based on stale data or that multiple updates occur simultaneously. It may be acceptable to have a policy of last update wins, in which case it's simple, but if not you'll need to be using version numbers or old/new comparisons to ensure integrity at the time the updates are applied.
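To illustrate the version-number idea, here is a small in-memory sketch. The class and method names are invented for the example, and the map merely stands in for a table with a version column. An update only succeeds if the row still carries the version the caller originally read; otherwise the caller knows its data was stale. Hibernate's own versioning support performs a comparable check when it writes an update.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory illustration of optimistic locking; the map stands in for a table.
class OptimisticVersionCheck {

    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> table = new HashMap<>();

    synchronized void insert(int id, String value) {
        table.put(id, new Row(value, 0));
    }

    synchronized Row read(int id) {
        return table.get(id);
    }

    /**
     * Writes the new value only if nobody changed the row since the caller read it.
     * Returns false when the expected version is stale.
     */
    synchronized boolean update(int id, String newValue, long expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // someone else updated (or deleted) the row first
        }
        table.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

If two users read the same row and both try to save, the first update succeeds and bumps the version; the second is rejected and can reload and retry instead of silently overwriting.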
I am not sure whether you are compelled to use ThreadLocal. Using ThreadLocal to store the session object yourself is definitely not a good idea, especially when you are using Hibernate along with Spring.
A typical scheme for using Hibernate with Spring is:
Inject the sessionFactory in your DAO. I assume that you have sessionFactory already configured which is backed by a pooled datasource.
Now in your DAO class, a session can be accessed as follows.
Session session = sessionFactory.getCurrentSession();
Here is a link to related article.
Please note that this example is specific to the Hibernate 3.x APIs. It takes care of session creation, closure and thread safety internally, and it's neat too.
I am trying to understand the best practice for using ThreadLocal for the above questions. From my understanding, the reason for using it is to ensure only one session/pm is created for the entire application. My questions are:
Is there any impact of using ThreadLocal like this on a clustered application (for example, Google App Engine)?
If I use "transactional" begin/commit in my application, I do not need to use ThreadLocal, right? Since the transaction already ensures my session is opened and closed properly?
If I need to use a "transactional" tx, should it be in a ThreadLocal as well?
Why not just use "static" instead of ThreadLocal?
I am interested to hear your feedback regarding the advantages/disadvantages of using this technique.
Probably not unless your clustering software can migrate threads between nodes. In this case, you'd need to migrate the thread local data as well.
No. The transaction is attached to the session, so you must keep both in sync. While you can begin a transaction in thread A and commit it in thread B, it's usually very hard to make sure that this works reliably. Therefore: don't.
Yes.
static is global for the whole application. threadlocal is global per Thread.
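A tiny demonstration of that difference, with made-up names and only the JDK: the static counter is shared, so every thread sees the increments made by the others, while the ThreadLocal counter starts fresh in each thread.

```java
// Illustration only: contrasts one shared static value with one value per thread.
class StaticVersusThreadLocal {

    // One value for the whole application.
    static int sharedCounter = 0;

    // One independent value per thread, each starting at 0.
    static final ThreadLocal<Integer> perThreadCounter = ThreadLocal.withInitial(() -> 0);

    /** Increments both counters on a fresh thread; returns {shared, perThread} as seen there. */
    static int[] incrementOnNewThread() {
        int[] seen = new int[2];
        Thread worker = new Thread(() -> {
            synchronized (StaticVersusThreadLocal.class) {
                sharedCounter++;
                seen[0] = sharedCounter;
            }
            perThreadCounter.set(perThreadCounter.get() + 1);
            seen[1] = perThreadCounter.get();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Calling incrementOnNewThread() twice shows the shared value going up by one each time, while the per-thread value is 1 in both threads. Replace the counter with a session and you have the reason a static session ends up shared by everybody.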
Conclusion: If you're a beginner in this area, I suggest using Spring. The Spring Framework solves many of the problems for you and helps you with useful error messages when something breaks.
Follow the documentation to the letter, especially when it doesn't make sense. Chances are that you missed something important and the Spring guys are right.
ThreadLocal is not used to create one session for the whole application. It is used to create one session for every thread. Each request is typically handled by one thread, so the ThreadLocal ensures that every user accessing your web page/database gets their own database connection. If you use a static singleton pattern, every user on the server will use the same database connection, and I don't know how that would work out.
The implementation of many of the Transaction engines is actually using ThreadLocal to associate the session state you have with the database to a particular thread. This makes for instance running multiple threads inside of a transaction very difficult.
ThreadLocal gives you thread safety while still letting another piece of code query the value later in a semi-static way. It's a thread-global variable. This makes it useful for temporary but session-aware information. Another use beyond transactions might be holding onto internal parameters for Authorisation which are then checked with a proxy.
Is using the 'synchronized' keyword on methods in a Java DAO going to cause issues when used by a web application?
I ask because I have a multi-threaded standalone application that needs the methods to be synchronized to avoid resource conflicts, as seen here.
java.util.concurrent.ExecutionException: javax.persistence.PersistenceException: org.hibernate.HibernateException: Found shared references to a collection: com.replaced.orm.jpa.Entity.stuffCollection
What I am concerned about is that when a significant number of people try and use the application that the synchronized methods will block and slow the entire application down.
I am using a Spring injected JPA entity manager factory, which provides an entity manager to the DAO. I could technically remove the DAO layer and have the classes call the entity manager factory directly, but I enjoy the separation the DAO provides.
I should also note that I am being very careful not to pass around connected entity ORM objects between threads. I speculate that the resource conflict error comes about when accessing the DAO. I think multiple threads are going at the same time and try to persist or read from the database in non-atomic ways.
In this case, is using a DAO going to do more harm than good?
A big piece of information I left out of the question is that the DAO is not a singleton. If I had been thinking lucidly enough to include that detail I probably wouldn't have asked the question in the first place.
If I understand correctly, Spring creates a new instance of the DAO class for each class that uses it. So the backing entity manager should be unique to each thread. Not sharing the entity manager is, as Rob H answered, the key thing here.
However, now I don't understand why I get errors when I remove synchronized.
According to this thread, the @PersistenceContext annotation creates a thread-safe SharedEntityManager. So you should be able to create a singleton DAO.
You say you are not sharing entity objects across threads. That's good. But you should also make sure you're not sharing EntityManager objects (or Session objects in Hibernate) across threads either. Frameworks like Spring manage this for you automatically by storing the session in a thread-local variable. If you're coding your own DAOs without the help of a framework, you need to take precautions yourself to avoid sharing them.
Once you do this, there should be no reason to synchronize DAO methods because none of the conversational state will be shared across threads. This is critical for a highly concurrent web application. The alternative is that only one thread will be able to access the DAO at one time, assuming they all share the same DAO instance. Not good at all for throughput.
If the methods need to be synchronized for thread safety, then leave them that way. The blocking is required anyway in that case. If the blocking is not required for the web application case, you can either:
Leave it as is, since the performance hit when there is no contention on the lock is negligible, and insignificant compared with the expense of hitting the database.
Redesign it so that you add a synchronization layer for the standalone application case which protects the underlying unsynchronized DAO.
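The second option could look roughly like the sketch below. MessageDao and both implementations are hypothetical names, and the list stands in for the database. The plain DAO has no locking at all; the standalone application wraps it in a decorator that serializes every call, while the web application would use the plain DAO directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DAO types for illustration; the list stands in for the database.
class SynchronizedDaoLayer {

    interface MessageDao {
        void save(String message);

        int count();
    }

    /** Plain DAO with no locking; meant for callers that never share it across threads. */
    static final class PlainMessageDao implements MessageDao {
        private final List<String> rows = new ArrayList<>();

        @Override
        public void save(String message) {
            rows.add(message);
        }

        @Override
        public int count() {
            return rows.size();
        }
    }

    /** Synchronization layer: serializes all access to the wrapped DAO. */
    static final class SynchronizedMessageDao implements MessageDao {
        private final MessageDao delegate;

        SynchronizedMessageDao(MessageDao delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized void save(String message) {
            delegate.save(message);
        }

        @Override
        public synchronized int count() {
            return delegate.count();
        }
    }

    /** Hammers the DAO from several threads and returns the final row count. */
    static int saveFromManyThreads(MessageDao dao, int threads, int savesPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < savesPerThread; j++) {
                    dao.save("message");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return dao.count();
    }
}
```

The wrapped DAO keeps its count consistent under concurrent use, and the locking cost is paid only by the application that actually shares the instance.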
Personally, I would leave it as is and profile it to see if you need to refactor. Until then you are simply doing premature optimization.
How do you do your Hibernate session management in a Java Desktop Swing application? Do you use a single session? Multiple sessions?
Here are a few references on the subject:
http://www.hibernate.org/333.html
http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
http://in.relation.to/Bloggers/HibernateAndSwingDemoApp
Single session. Start a transaction when you need to do a set of operations (like updating data after the dialog box OK button is clicked), and commit the tx at the end. The connection, though, is constantly open (since it's the same session), and thus all opportunities for caching can be used by both Hibernate and the RDBMS.
It may also be a good idea to implement a transparent session re-open in case the connection went dead -- users tend to leave applications open for extended periods of time, and it should continue to work Monday even if DB server was rebooted on weekend.
Update
Jens Schauder provided a reason to use multiple sessions: partial (unwanted) updates to the session. Well, that comes down to the way you use Hibernate.
Suppose we have two dialogs open (as in Jens' blog example). If user clicks a radiobox, and we immediately update a Hibernate entity associated with this radiobox, then, when user clicks Cancel, we're in trouble -- session is already updated.
The right way, as I see it, is to update dialog variables (non-Hibernate objects) only. Then, when the user clicks OK, we begin a transaction, merge the updated objects, and commit the transaction. No garbage ever gets saved into the session.
MyHibernateUtils.begin();
Settings settings = DaoSettings.load();
// update settings here
DaoSettings.save(settings);
MyHibernateUtils.commit();
If we implement such a clean separation of concerns, we can later switch to multiple sessions with a simple change of MyHibernateUtils.begin() implementation.
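For what it's worth, the "dialog variables only" idea can be sketched without Hibernate at all. Settings and SettingsDialog below are invented for illustration; the dialog edits a draft value, and the persistent object is only touched in ok(), which is where the real code would begin the transaction, merge and commit.

```java
// Illustration only: the dialog works on a draft and touches the real object on OK.
class DialogDraft {

    /** Stand-in for a persistent entity. */
    static final class Settings {
        String theme;

        Settings(String theme) {
            this.theme = theme;
        }
    }

    static final class SettingsDialog {
        private final Settings persistent;
        private String draftTheme;

        SettingsDialog(Settings persistent) {
            this.persistent = persistent;
            this.draftTheme = persistent.theme;
        }

        /** A click in the dialog changes only the draft, never the entity. */
        void selectTheme(String theme) {
            draftTheme = theme;
        }

        /** OK: the real code would begin a transaction, merge, and commit here. */
        void ok() {
            persistent.theme = draftTheme;
        }

        /** Cancel: drop the draft; the entity was never modified. */
        void cancel() {
            draftTheme = persistent.theme;
        }
    }
}
```

Because the entity is never changed before OK, Cancel has nothing to undo in the session.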
As for possible memory leak, well... Transaction.commit() calls Session.flush(), which AFAIK, cleans the cache too. Also, one may manually control the caching policy by calling Session.setCacheMode().
The problem with "session per thread" is that good Swing applications do the database access outside the EDT, usually in newly created SwingWorker threads. This way, "session per thread" quickly becomes "session per click".
Don't use a single session. For everything but the smallest applications, it will grow, collecting outdated data and becoming slower and slower, since the dirty check needs to check every entity in the session.
If you don't need/want lazy loading and tracking of changes by Hibernate, you can use short-lived sessions.
But if you want to benefit from the power of Hibernate use the approach I described in my blog:
http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
or in the German version:
http://blog.schauderhaft.de/2007/12/17/hibernate-sessions-in-fat-client-anwendungen/
AFAIK it is really the same approach described in the http://in.relation.to/Bloggers/HibernateAndSwingDemoApp but with a recommendation how to actually scope your session:
One session per frame, with the exception of modal frames, which use the session of the parent frame.
Just make sure never to combine objects from different sessions. It will cause lots of trouble.
In reply to Vladimir's update:
Cancel actually works extremely nicely with my approach: throw away the session.
session.flush does not fix the problem of the ever-growing session when you work with a single session for the application. Of course, with the approach you describe you can work with short-lived sessions, which should work OK. BUT you lose a lot: lazy loading only works with attached objects, and so does automatic detection of dirty objects. If you work with detached objects (or objects that aren't entities at all) you have to do this yourself.
Use one session per thread (doc) and a version or timestamp column to allow optimistic concurrency and thereby avoiding session-to-instance conflicts. Attach instances to session when needed unless you need long running transactions or a restrictive isolation level.