Session management using Hibernate in a *multi-threaded* Swing application

I'm currently working on a (rather large) pet project of mine, a Swing application that by its very nature needs to be multi-threaded. Almost all user interactions might fetch data from remote servers over the internet. Since I control neither these servers nor the internet itself, long response times are inevitable. A Swing UI obviously cannot repaint itself while the EDT is busy, so all remote server calls need to be executed by background thread(s).
My problem:
Data fetched by the background threads gets 'enriched' with data from a local (in-memory) database (the remote server returns IDs/references to data in the local database). This data eventually gets passed to the EDT, where it becomes part of the view model. Some entities are not completely initialized at this point (lazy-fetching enabled), so the user might trigger lazy-fetching by e.g. scrolling in a JTable. Since the Hibernate session is already closed, this will trigger a LazyInitializationException. I can't know when lazy-fetching might be triggered by the user, so creating a session on demand/attaching the detached object will not work here.
I 'solved' this problem by:
using a single (synchronized, since Session instances are not thread-safe) Session for the whole application
disabling lazy-fetching completely
While this works, the application's performance has suffered greatly (sometimes being close to unusable). The slowdown is mainly caused by the large number of objects that are now fetched by each query.
I'm currently thinking about changing the application's design to 'Session-per-thread' and migrating all entities fetched by non-EDT threads to the EDT thread's Session (similar to this posting on the Hibernate forums).
Side-note: Any problems related to database updates do not apply since all database entities are read-only (reference data).
Any other ideas on how to use Hibernate with lazy-loading in this scenario?

Don't expose the Session itself in your data API. You can still load lazily; just make sure that the hydration is done from the 'data' thread each time. You could use a block (a Runnable or some kind of command class is probably the best Java can do for you here, unfortunately) that's wrapped by code that performs the load asynchronously on the 'data' thread. In UI code (on the UI thread, of course), handle some kind of 'data is ready' event that is posted by the data service. You can then take the data from the event and use it in the UI.
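A minimal sketch of that idea, under some assumptions: DataService, load and the two executors are hypothetical names, the Supplier stands in for whatever code opens a Session and fully hydrates the result, and the 'data is ready' event is reduced to a plain callback.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical data service: the Session never leaves the 'data' thread. */
final class DataService {
    private final Executor dataThread; // assumed to be a single-threaded executor
    private final Executor uiThread;   // in Swing: SwingUtilities::invokeLater

    DataService(Executor dataThread, Executor uiThread) {
        this.dataThread = dataThread;
        this.uiThread = uiThread;
    }

    /** Runs the query on the data thread, then hands the result to the UI thread. */
    <T> void load(Supplier<T> query, Consumer<T> onReady) {
        dataThread.execute(() -> {
            // All hydration (including lazy associations the UI will need)
            // is assumed to happen inside query.get().
            T result = query.get();
            // The 'data is ready' notification is delivered on the UI thread.
            uiThread.execute(() -> onReady.accept(result));
        });
    }
}
```

In a Swing application this could be wired up as new DataService(Executors.newSingleThreadExecutor(), SwingUtilities::invokeLater), so the callback can update the view model directly.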

You could have a look at Ebean ORM. It is session-less and lazy loading just works. This doesn't answer your question; it proposes an alternative.
I know Ebean has built in support for asynchronous query execution which may also be interesting for your scenario.
Maybe worth a look.
Rob.

There are two distinct problems that should be resolved separately:
Handling of Hibernate Sessions in Swing Applications. Let me recommend my own article, regarding this problem: http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
The basic idea is to have a session for every frame, excluding modal frames, which use the session of the spawning frame. It is not easy, but it works. Meaning, you won't get any LazyInitializationExceptions anymore.
How to get your GUI thread separated from the back end.
I recommend keeping the Hibernate objects strictly on the back-end thread they originate from. Only give wrapper objects to the EDT. If these wrapper objects are asked for a value, they create a request which gets passed to the back-end thread, which eventually will return the value.
I'd envision three kinds of wrapper Implementations:
Async: requests the value, and gets notified when the value is available. It would return immediately with some dummy value. On notification it will fire a PropertyChange event in order to inform the GUI about the 'changed' value (changed from unknown to a real value).
Sync: requests the value and waits for it to be available.
Timed: a mixture of the two, waiting for a short time (0.01 seconds) before returning. This would avoid many of the change events fired by the async version.
As a basis for these wrappers I recommend the ValueModel of the JGoodies Binding library: http://www.jgoodies.com/downloads/libraries.html
Obviously you need to take care that any action is only performed on actually loaded values, but since you don't plan on doing updates this shouldn't be too much of an issue.
Let me end with a warning: I have thought about it a lot, but never actually tried it, so move with care.
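For illustration, here is a rough sketch of the Async variant. It uses only java.beans.PropertyChangeSupport rather than the JGoodies ValueModel, and AsyncValue, the loader and the two executors are hypothetical names. It assumes getValue() is only ever called on the EDT and that the edt executor runs its tasks there (e.g. SwingUtilities::invokeLater).

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

/** Hypothetical async wrapper: returns a placeholder until the backend delivers the value. */
final class AsyncValue<T> {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private final Supplier<T> loader;  // runs on the backend thread, touches Hibernate objects
    private final T placeholder;       // dummy value shown while loading
    private final Executor backend;
    private final Executor edt;
    // The fields below are only read and written on the EDT, so no locking is used.
    private T value;
    private boolean requested;
    private boolean loaded;

    AsyncValue(Supplier<T> loader, T placeholder, Executor backend, Executor edt) {
        this.loader = loader;
        this.placeholder = placeholder;
        this.backend = backend;
        this.edt = edt;
    }

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    /** Returns immediately; triggers at most one backend request. */
    T getValue() {
        if (!loaded && !requested) {
            requested = true;
            backend.execute(() -> {
                T fetched = loader.get();
                edt.execute(() -> {
                    value = fetched;
                    loaded = true;
                    // Tells the GUI that 'unknown' has become a real value.
                    changes.firePropertyChange("value", placeholder, fetched);
                });
            });
        }
        return loaded ? value : placeholder;
    }
}
```

A table model could register a listener on each wrapper and repaint the affected cell when the event arrives.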

Related

How to handle transactions in Async calls

Due to the slowness of the application, we made some of our long-running queries asynchronous. The problem is that these are part of a single transaction, and if any of these queries/routines fail I need to roll back everything. How to achieve this? This is a legacy application using JDBC, Oracle and Java 8. I'd also like to know if there is any support for this in a Spring Boot / JPA application.
Thanks in advance.

Don't try to interact with the same DB connection from multiple threads at once. JDBC's connection system isn't specced to let you do this.
A transaction belongs to a single connection. You can't smear it out over multiples.
The obvious way to ensure that 'it is all rolled back' is to have a single long-lived transaction (but see later).
Combine these 3 facets and you end up with: Do all work in the async block. At least, all work that either needs to all happen, or none of it happens (i.e. the one transaction).
Any other basic approach wouldn't work or wouldn't be useful; there's no point freezing the main thread to wait for the async task (just do the async task on the spot; moving code to another thread doesn't magically make it go any faster. On the contrary, in fact).
However, transactions that aren't just long-lived but also make a ton of changes to a DB are their own problem, and now we're getting into the performance characteristics of your specific batch of queries and your particular DB engine, version, indices, and data. Kinda hard to answer with specifics, what with all those unknowns.
There are ways to design your DB to deal with this (mostly involving a table representing a calculation, and having a row indicate whether the calculation is complete or not. As long as you aren't done, don't set it to 'completed', and all your queries should ignore non-complete results. Upon bootup, delete (and with it, let that cascade) any non-complete results: Those must be half-baked work done right before your server crashed, and now you've restarted it). It's probably not the right answer here, just making sure you're aware that such options also exist.
As a general rule of thumb, countering a problem of "Our code has been observed to run too slowly" with "let's make it all async" doesn't work. Async makes code harder to read, way harder to debug, and doesn't make stuff go faster. All you can really do with async is soothe the user by playing them some elevator music or, slightly more pragmatic, a progress bar or whatnot, whilst they wait. And that's actually generally easier by spawning off the bits that tell the user what's happening into a separate thread, instead of asyncing the work itself. That, and make your algorithm better and/or fix your DB index definitions. You can search the web for that too; run EXPLAIN variants of your queries to make the DB tell you whether it is using any table sweeps (that's where it goes through the entire dataset before it can answer a query. You want to avoid those).
If you need help with either of those parts (show the user what is going on, instead of freezing the webpage or freezing the GUI / how to optimize a DB query), search the web for this information, there are tons of tutorials. Make sure to include the frontend tech; java can be used for swing apps, javafx, android, and there are at last count like a 100 web frameworks.
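To make the "do all work in the async block" point concrete, here is a minimal sketch in plain JDBC. TxStep and SingleTransactionTask are hypothetical names. It assumes the caller owns the Connection, closes it afterwards, and does not touch it from any other thread while the task runs.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical unit of work that runs against the shared connection. */
interface TxStep {
    void run(Connection connection) throws SQLException;
}

final class SingleTransactionTask {
    /**
     * Runs every step on ONE worker thread, ONE connection, ONE transaction.
     * Either all steps commit together or everything is rolled back.
     */
    static CompletableFuture<Void> runAll(Connection connection, List<TxStep> steps, Executor worker) {
        return CompletableFuture.runAsync(() -> {
            try {
                connection.setAutoCommit(false);
                try {
                    for (TxStep step : steps) {
                        step.run(connection);
                    }
                    connection.commit();
                } catch (SQLException | RuntimeException e) {
                    connection.rollback(); // undo the work of all earlier steps
                    throw e;
                }
            } catch (SQLException e) {
                // Surfaces through the returned future as a failed completion.
                throw new IllegalStateException("transaction failed and was rolled back", e);
            }
        }, worker);
    }
}
```

The calling thread only keeps the returned future, for example to hide a progress indicator or show an error once it completes.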

Concurrency : Handling multiple submits in a web application

This is a recent interview question to my friend:
How would you handle a situation where users enter some data in the screen and, let's say, 5 of them click on the Submit button *at the SAME time?*
(By same time, the interviewer insisted that they are the same to the level of nanoseconds.)
My answer was just to make the method that handles the request synchronized and only one request can acquire the lock on the method at a given time.
But it looks like the interviewer kept insisting there was a "better way" to handle it .
One other approach is to handle locking at the database level, but I don't think it is "better".
Are there any other approaches? This seems to be a fairly common problem.

If you have only one network card, you can only have one request coming down it at once. ;)
The answer he is probably looking for is something like
Make the servlet stateless so they can be executed concurrently.
Use components which allow thread safe concurrent access like Atomic* or Concurrent*
Use locks only where you absolutely have to.
What I prefer to do is to make the service so fast it can respond before the next request can come in. ;) Though I don't have the overhead of Java EE or databases to worry about.
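As a small illustration of the Atomic*/Concurrent* point, the sketch below keeps shared state in a ConcurrentHashMap of LongAdder counters so that concurrent requests need no synchronized method. SubmissionStats and its methods are hypothetical names, not part of any servlet API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical shared component that many request threads may call at once. */
final class SubmissionStats {
    private final ConcurrentHashMap<String, LongAdder> perForm = new ConcurrentHashMap<>();

    /** Thread-safe without locks in our own code. */
    void recordSubmit(String formId) {
        perForm.computeIfAbsent(formId, key -> new LongAdder()).increment();
    }

    long count(String formId) {
        LongAdder adder = perForm.get(formId);
        return adder == null ? 0L : adder.sum();
    }
}
```

The servlet itself stays stateless; only this component holds shared, mutable data.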

Does it matter that they click at the same time e.g. are they both updating the same record on a database?
A synchronized method will not cut it, especially if it's a webapp distributed amongst multiple JVMs. Also the synchronized method may block, but then the other threads would just fire after the first completes and you'd have lost writes.
So locking at database level seems to be the option here i.e. if the record has been updated, report an error back to the users whose updates were serviced after the first.
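The "report an error to whoever comes second" behaviour is what optimistic locking with a version column gives you. The sketch below is an in-memory stand-in, not database code: the map's atomic replace plays the role of an UPDATE ... WHERE id = ? AND version = ? statement, and Versioned and OptimisticStore are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

/** A row together with the version it had when it was read. */
final class Versioned<T> {
    final long version;
    final T value;

    Versioned(long version, T value) {
        this.version = version;
        this.value = value;
    }
}

/** In-memory stand-in for a table with a version column. */
final class OptimisticStore<K, T> {
    private final ConcurrentHashMap<K, Versioned<T>> rows = new ConcurrentHashMap<>();

    void insert(K id, T value) {
        rows.put(id, new Versioned<>(0, value));
    }

    Versioned<T> read(K id) {
        return rows.get(id);
    }

    /** Succeeds only if nobody has updated the row since expectedVersion was read. */
    boolean update(K id, long expectedVersion, T newValue) {
        Versioned<T> current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // stale read: the caller should report a conflict to the user
        }
        // Atomic compare-and-swap; loses cleanly if another thread got in first.
        return rows.replace(id, current, new Versioned<>(expectedVersion + 1, newValue));
    }
}
```

With a real database the same check would be done by the UPDATE statement itself, which works across multiple JVMs.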

You do not have to worry about this, as the web server runs each request in an isolated thread and manages it.
But if you have some shared resource, such as a log file, then you need to handle concurrency yourself and put a lock on it, both within a request and across requests.

Two threads reading from the same table: how do I make both threads not read the same set of data from the TASKS table?

I have a tasks thread running in two separate instances of tomcat.
The task threads concurrently read (using SELECT) the TASKS table with a certain WHERE condition and then do some processing.
The issue is that sometimes both threads pick the same task, because of which the task is executed twice.
My question is: how do I make both threads not read the same set of data from the TASKS table?

It is just because your DAO function (the code which is accessing the database) is not synchronized. Make it synchronized; I think your problem will be solved.

If the TASKS table you mention is a database table then I would use Transaction isolation.
As a suggestion, within a transaction, set an attribute of the TASKS table to some unique identifiable value if it is not set. Commit the transaction. If all is OK then the task has been selected by the thread.
I haven't come across this use case so treat my suggestion with caution.
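A possible shape for that suggestion in plain JDBC is sketched below. The column name claimed_by and the class TaskClaimer are assumptions for illustration, and the code assumes either auto-commit is on or the caller commits right after a successful claim.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TaskClaimer {
    // Hypothetical schema: TASKS(id, claimed_by), with claimed_by NULL while unclaimed.
    private static final String CLAIM_SQL =
            "UPDATE TASKS SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL";

    /**
     * Tries to mark the task as owned by workerId.
     * The database applies the update to the row at most once, so only one
     * worker sees an update count of 1; everyone else sees 0 and must skip the task.
     */
    static boolean tryClaim(Connection connection, long taskId, String workerId) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(CLAIM_SQL)) {
            statement.setString(1, workerId);
            statement.setLong(2, taskId);
            return statement.executeUpdate() == 1;
        }
    }
}
```

Each Tomcat instance would call tryClaim with its own unique worker ID and process the task only when it returns true.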

I think you should look at how an enterprise job scheduler such as Quartz handles this.

For your use case there is a better tool for the job - and that's messaging. You are persisting items that need to be worked on, and then attempting to synchronise access between workers. There are a number of issues that you would need to resolve in making this work - in general updating a table and selecting from it should not be mixed (it locks), so storing state there doesn't work; neither would synchronization in your Java code, as that wouldn't survive a server restart.
Using the JMS API with a message broker like ActiveMQ, you would publish a message to a queue. This message would contain the details of the task to be executed. The message broker would persist this somewhere (either in its own message store, or a database). Worker threads would then subscribe to the queue on the message broker, and each message would only be handed off to one of them. This is quite a powerful model, as you can have hundreds of message consumers all acting on tasks so it scales nicely. You can also make this as resilient as it needs to be, so tasks can survive both Tomcat and broker restarts.
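The key property here is that each message is handed to exactly one consumer. The sketch below shows only that property, using an in-process LinkedBlockingQueue as a stand-in. It is not JMS, it is not persistent, and it would not work across two Tomcat instances; a real broker such as ActiveMQ is what provides those parts. InProcessTaskQueue is a hypothetical name.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** In-process stand-in for a broker queue: competing consumers, one delivery per task. */
final class InProcessTaskQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Producer side: publish the details of a task to be executed. */
    void publish(String taskId) {
        queue.add(taskId);
    }

    /**
     * Consumer side: returns the next task, or null if none is waiting.
     * Removal is atomic, so two workers can never receive the same task.
     */
    String takeNextOrNull() {
        return queue.poll();
    }
}
```

Workers simply loop on the consumer call; no SELECT on a shared table and no coordination between workers is needed.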

Whether the database can provide graceful management of this will depend largely on whether it is using strict two-phase locking (S2PL) or multi-version concurrency control (MVCC) techniques to manage concurrency. Under MVCC reads don't block writes, and vice versa, so it is very possible to manage this with relatively simple logic. Under S2PL you would spend too much time blocking for the database to be a good mechanism for managing this, so you would probably want to look at external mechanisms. Of course, an external mechanism can work regardless of the database, it's just not really necessary with MVCC.
Databases using MVCC are PostgreSQL, Oracle, MS SQL Server (in certain configurations), InnoDB (except at the SERIALIZABLE isolation level), and probably many others. (These are the ones I know of off-hand.)
I didn't pick up any clues in the question as to which database product you are using, but if it is PostgreSQL you might want to consider using advisory locks. http://www.postgresql.org/docs/current/interactive/explicit-locking.html#ADVISORY-LOCKS I suspect many of the other products have some similar mechanism.

I think you need to have some variable (column) where you keep the last modified date of rows. Your threads can read the same set of data with the same modified date limitation.
Edit:
I did not see "not to read"
In this case you need to have another table TaskExecutor (taskId, executorId), and when some thread runs a task you put data into TaskExecutor; when you start another thread it just checks whether that task is already executing or not (Select ... from TaskExecutor where taskId = ...).
You also need to take care of the isolation level for transactions.

What's the best practice around when to start a new session or transaction for batch jobs using Spring/Hibernate?

I have set of batch/cron jobs in Java that call my service classes. I'm using Hibernate and Spring as well.
Originally the batch layer was always creating an outer transaction, and then the batch job would call a service to get a list of objects from the DB w/ the same session, then call a service to process each object separately. There's a tx-advice set for my service layer to roll back on any throwable. So if on the 5th object there's an exception, the first 4 objects that were processed get rolled back too because they were all part of the same transaction.
So I was thinking this outer transaction created in the batch layer was unnecessary. I removed that, and now I call a service to get a list of objects, then call another service to process each object separately, and if one of those objects fails, the other ones will still persist because it's a new transaction/session for each service call. But the problem I have now is that after getting a list of objects, when I pass each object to a service to process, if I try to get one of the properties I get a lazy initialization error because the session used to load that object (from the list) is closed.
Some options I thought of were to just get a list of IDs in the batch job and pass each ID to a service, and the service will retrieve the whole object in that one session and process it. Another one is to set lazy loading to false for that object's attributes, but this would load everything every time even if sometimes the nested attributes aren't needed.
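The first option (pass IDs, let the service load and process each object itself) could look roughly like the sketch below. IdBatchRunner is a hypothetical name, and processOne stands in for a service method that is assumed to open its own transaction/session, load the entity by ID, process it, and commit or roll back on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class IdBatchRunner {
    /**
     * Calls processOne once per ID and returns the IDs that failed.
     * Because each call is assumed to run in its own transaction, a failure on
     * one ID does not undo the IDs that were already processed.
     */
    static <I> List<I> processEach(List<I> ids, Consumer<I> processOne) {
        List<I> failed = new ArrayList<>();
        for (I id : ids) {
            try {
                processOne.accept(id); // entity is loaded inside the service's own session
            } catch (RuntimeException e) {
                failed.add(id);        // log and carry on with the remaining IDs
            }
        }
        return failed;
    }
}
```

Since the entity is loaded and used inside the same service call, lazy properties are touched while their session is still open.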
I could always go back to the way it was originally w/ the outer transaction around every batch job, and then create another transaction in the batch job before each call to the service for processing each individual object...
What's the best practice for something like this?

Well I would say that you listed every possible option except OpenSessionInView. That would keep your session alive across transactions, but it's difficult to implement properly. So difficult that it's considered an AntiPattern by many.
However, since you're not implementing a web interface and you aren't dealing with a highly threaded environment, I would say that's the way to go. It's not like you're passing entities to views. Your biggest fear is an N+1 call to the database while iterating through a collection, but since this is a cron job, performance may not be a major issue when compared with code cleanliness. If you're really worried about it, just make sure you get all of your collections via a call to a DAO who can do a select *.
Additionally, you were effectively doing an Open Session In View before when you were doing everything in the same transaction. In Spring, Sessions are opened on a per transaction basis, so keeping a transaction open a long period of time is effectively the same as keeping a Session open a long period of time. The only real difference in your case will be the fact that you can commit periodically without fear of a lazy initialization error down the road.
Edit
All that being said, it takes a bit of time to set up an Open Session in View, so unless you have any particular issues against doing everything in the same transaction, you might consider just going back to that.
Also, I just noticed that you mentioned opening a transaction in the batch layer and then opening "mini transactions" in the Service layer. This is most emphatically NOT a good idea. Spring's annotation-driven transactions will piggyback on any currently open transaction in the session. This means that transactions that are supposed to be read-only will suddenly become read-write if the currently open transaction is read-write. Additionally, the Session won't be flushed until the outermost transaction is finished anyway, so there's no point in marking the Service layer with @Transactional. Putting @Transactional on multiple layers only lends to a false sense of security.
I actually blogged about this issue some time ago.

java methods and race condition in a jsp/servlets application

Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering if it is possible that a race condition will occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
What I have noticed is that many methods in my application may lead to a race condition or dirty read/write. For example, I have a poll system, and for each voting operation a certain method will change a single cell value for that poll, as in the following:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/Servlets app solve these issues by itself, or do I have to solve all that myself?
Thanks..

It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
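For the simple counter case, a minimal sketch with AtomicInteger might look like this (HitCounter is a hypothetical name):

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Shared counter that is safe to call from many request threads. */
final class HitCounter {
    private final AtomicInteger hits = new AtomicInteger();

    /** Atomic read-modify-write; unlike hits++ on a plain int, no update is lost. */
    int recordHit() {
        return hits.incrementAndGet();
    }

    int current() {
        return hits.get();
    }
}
```

A single instance can be shared by all servlet threads without any synchronized blocks around it.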

The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.

The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.

Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JDBC connection pooling give you some helpful guarantees, so that you can write your servlet code without using Java synchronisation provided your variables are in method scope. In concept you have:
Start transaction (perhaps implicit, perhaps on entry to an ejb)
Get connection to DB ( Gets you a connection from pool, associated with your tran)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: static method. This presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) implies a single JVM and you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data then using a DB is probably better.
OR, how about a completely different approach: the servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other process pick up the votes from the queue and add them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual processing, which in similar scenarios can be quite complex.

I think that the best solution for your problem is to use something like the "synchronized" keyword and wait/notify!
