JAX-WS web services thread safety and performance concerns - java

From a few other posts and my own reading, I understand that JAX-WS web services are not thread-safe. My web service is going to be called by hundreds of clients, and we need to be able to process around 200 transactions per second.
My web service is going to interact with a database to perform its work. If I introduce the synchronized keyword around the code that accesses the database, I will essentially ensure that only one thread accesses the database at a time. I wonder if I will still be able to achieve the required throughput in this case. Thanks in advance for your help.
I have been told to move the database access work into another class and instantiate that class at the method level; that way I won't need the synchronized keyword and will still achieve thread safety. Is that correct?

If you need transactions and thread safety why aren't you just using EJBs as your JAX-WS endpoints?

We need more info on the application.
In general, for performance in the case you describe plus database access, I recommend:
Carefully plan your database - index where possible/makes sense, use views, etc...
Try to use a database with a good locking mechanism (lock per row). This way when two requests access different rows, you will not suffer from whole table locking.
Keep your transactions as short as possible. If using EJBs, make sure the transaction attribute for "read data" methods is not Required or RequiresNew (either might cause a transaction to be opened).
If you do use synchronization, choose the proper lock carefully. Don't be tempted to automatically use "synchronized" just because it's the easiest to code. Consider using a ReadWriteLock where possible.
Consider using caching where possible, but carefully plan this, so your flows work on "relevant" data.
Start with these directions - I think you will see you can achieve your performance target like this.
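The advice about choosing the proper lock can be sketched with java.util.concurrent.locks.ReentrantReadWriteLock, which lets many readers proceed concurrently and only serializes writers. This is a minimal sketch: the PriceCache class and its methods are hypothetical names for illustration, not part of JAX-WS or of the asker's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory cache shared by all request threads.
class PriceCache {
    private final Map<String, Integer> prices = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many threads may hold the read lock at the same time.
    Integer get(String product) {
        lock.readLock().lock();
        try {
            return prices.get(product);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for readers and blocks new ones.
    void put(String product, int price) {
        lock.writeLock().lock();
        try {
            prices.put(product, price);
        } finally {
            lock.writeLock().unlock();
        }
    }
}

class PriceCacheDemo {
    public static void main(String[] args) {
        PriceCache cache = new PriceCache();
        cache.put("widget", 250);
        System.out.println(cache.get("widget")); // 250
        System.out.println(cache.get("gadget")); // null
    }
}
```

For a plain map like this one, ConcurrentHashMap would usually be simpler. A read-write lock tends to pay off when the read section does more than a single lookup and reads far outnumber writes.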

Related

Prevent REST clients from simultaneously executing the same methods with the same arguments

Consider Spring MVC java web-application, which provides some REST API.
Let's say it has many methods, one of them is DELETE /api/foo/{id}, which obviously deletes foo entity from the DB with given id.
The problem is that, due to the amount of data in the DB, this operation is not immediate, so if a client tries to perform multiple delete operations on the same entity simultaneously, say
DELETE /api/foo/123 x N times (by mistake in client software of course),
it causes some unpleasant side effects in the DB (you know, if you try to delete the same entity in several transactions, that's generally not nice).
My question is: what is the best practice in Spring MVC to prevent such situations?
I can certainly introduce synchronisation on the Foo id in each such update method (PUT/DELETE). I would need to do it for all entities and all PUT/DELETE API methods though, which I really don't want to do. I suppose there should be some elegant solution that performs this type of synchronisation at the interceptor/servlet level, i.e. not at the service or controller level.
I can also create a specific interceptor that waits on duplicated requests (requests with the same URL and parameters). But again, that doesn't sound like an elegant solution (unless I can be sure that Spring MVC cannot be configured to do this in some nicer way).
That is a concurrency problem that should be handled by using the appropriate transaction and locking level. Unfortunately, there is no one-size-fits-all way here; depending on your actual requirements, you may have to implement optimistic or pessimistic locking, as well as choose one of the possible transaction levels (from no transaction at all to serializable transactions).
In general, handling such questions at the web level is a bad idea, because you will end up with questions like: what should happen if one request wants to delete some data that another one is displaying at the same time? In Spring MVC, the common way is to use transactional methods in the service layer. Additionally, you should declare an optimistic or pessimistic locking scheme in the persistence layer.
Optimistic locking normally gives higher throughput, at the cost of some transactions ending in exceptions. In that case, the current best practice is to report the problem to the user and ask them to send the request again.
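A minimal in-memory sketch of the optimistic approach described above: each record carries a version number, and an update or delete is applied only if the row the caller read is still the current one, which mirrors an "UPDATE ... WHERE id = ? AND version = ?" statement. The FooStore and Foo names are hypothetical, and the map only stands in for the database; JPA provides the real mechanism through a @Version field.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Immutable snapshot of a row: the version changes on every successful update.
final class Foo {
    final String name;
    final long version;

    Foo(String name, long version) {
        this.name = name;
        this.version = version;
    }
}

// Hypothetical store; the map stands in for a database table.
class FooStore {
    private final ConcurrentMap<Long, Foo> rows = new ConcurrentHashMap<>();

    void insert(long id, String name) {
        rows.put(id, new Foo(name, 0));
    }

    Foo read(long id) {
        return rows.get(id);
    }

    // Succeeds only if nobody has replaced the row since 'seen' was read.
    // Foo does not override equals, so the comparison is by identity.
    boolean update(long id, Foo seen, String newName) {
        return rows.replace(id, seen, new Foo(newName, seen.version + 1));
    }

    // Succeeds only if 'seen' is still the current row.
    boolean delete(long id, Foo seen) {
        return rows.remove(id, seen);
    }
}

class FooStoreDemo {
    public static void main(String[] args) {
        FooStore store = new FooStore();
        store.insert(123L, "first");
        Foo a = store.read(123L); // two clients read the same version
        Foo b = store.read(123L);
        System.out.println(store.update(123L, a, "second")); // true
        System.out.println(store.delete(123L, b));           // false: b is stale
    }
}
```

The caller that gets false back has lost the race and can report the conflict, which corresponds to the "ask the user to send the request again" step.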

How to lock database records in a Java EE application?

I want to write a Java EE web application where different users work with a database. A user can start editing a record, and then either save changes or cancel editing. While the user is editing, the record should be locked for other users. It should be locked on the database level, because there are also other non-Java users editing the same database, locking the records they work on.
I understand some basic Java + databases, but I am not good at multiple-user things like locking. Looking for examples on the internet, it seems to me that every "hello world" example for a Java EE technology introduces at least one more technology. To access objects in the database, I use JPA. To lock records, I probably need transactions, which brings in JTA. To work with JTA, I need JNDI. To work with all those objects, I probably also need EJB and injection... and at this point I wonder whether this is really the simplest way to solve the problem, or whether I missed something important. I do not know whether all those technologies are necessary (if yes, I will use them; I would just like to be sure before I learn them all). I just see that the examples I found on the web introduce them very generously.
I would like a simple example of Java EE code which:
uses JPA;
connects to a database described in the "persistence.xml" file;
has a MyObject class with properties id and name, stored in the MYOBJECT table;
has a method (e.g. called from a JSP page) that database-level locks the object with id = 42 (so that non-Java users with access to the same database also cannot modify it), or displays an error if the record is already locked by another user (either another Java user, or a non-Java user);
has another method (e.g. called from another JSP) that either updates the name to a specified value and releases the lock, or just releases the lock if empty string is provided.
For each new technology you introduce in the solution, I would like to hear a very short explanation why did you use it. Also whether that technology requires me to install new libraries, create or modify configuration files, write additional code, etc. (The JSP files which call the methods are not necessary; I am interested in the database-related parts.)
(Another detail: Here is described a difference between EntityTransaction and UserTransaction. If I understand it correctly, JTA is needed only if I use multiple databases. Is it also necessary if I use only one Oracle database with different schemas? If yes, then please write the example code using JTA.)
1) If you want to lock a record in a database, you need something called pessimistic lock. Remember this keyword and use it for further googling. Simply said, pessimistic lock means really locking the record in the database. Which means that if your Java application makes a pessimistic lock, the record is really locked; so even if some other non-Java program accesses the same database, the record will be locked, and they cannot modify it.
On the other hand, the so-called optimistic lock is mostly a pretend-lock. It is, approximately, a "we most likely don't need to lock this record anyway, so we will not really lock it, and if something bad happens, then we will try to fix the problem afterwards" approach. Which actually makes sense and increases performance, but only in situations where the assumptions behind this approach are true; where the conflicts are really rare, and where you really can fix the problem afterwards. Unless you understand it well (which you don't seem to), just don't use it.
2) JPA is a unified approach for using a database with transactions and stuff, and it also maps objects to tables for you. This is probably what you want.
JTA is the Java Transaction API: similar transaction handling, plus a unified approach to using transactions across many databases, so it is more powerful than JPA's own transactions, but that means it has additional functionality that you don't really need. On the other hand, for using these superpowers you pay some cost, like losing the ability to start and end transactions on a whim. The server will manage the transactions for you, as the server needs. If you completely understand how exactly that works, then you know whether this fits your needs; but if you don't, then you should rather avoid it. Your development environment may offer you JTA as a default option, but that is only because it thinks that you are going to write Skynet. By not using JTA you also don't have to use JNDI, EJB, and many other Skynet-related technologies.
3) After hearing this, now it is time for you to do your homework. Because now you have an idea of what to do. Read the "javax.persistence" API documentation.
You can use annotated Java classes to represent your database tables; or you can use the old-fashioned SQL queries; or both, as you wish. You can use either of them to lock and release records. A lock must be inside of a transaction, so if you want to keep the lock, you have to keep the transaction.
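The behaviour the question asks for (lock record 42, report an error if someone else already holds it, then release) can be modelled in plain Java as below. This is only an in-memory illustration of the semantics with hypothetical names (RecordLocks, tryLock, release); it does not lock anything in a database, so on its own it would not protect against the non-Java users. A real implementation would hold a row lock inside an open transaction, for example with SELECT ... FOR UPDATE or JPA's LockModeType.PESSIMISTIC_WRITE.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory model of exclusive record locks, keyed by record id.
class RecordLocks {
    private final ConcurrentMap<Long, String> owners = new ConcurrentHashMap<>();

    // Returns true if 'user' now holds (or already held) the lock on the record.
    boolean tryLock(long recordId, String user) {
        String current = owners.putIfAbsent(recordId, user);
        return current == null || current.equals(user);
    }

    // Releases the lock only if 'user' is the current owner.
    boolean release(long recordId, String user) {
        return owners.remove(recordId, user);
    }
}

class RecordLocksDemo {
    public static void main(String[] args) {
        RecordLocks locks = new RecordLocks();
        System.out.println(locks.tryLock(42L, "alice")); // true
        System.out.println(locks.tryLock(42L, "bob"));   // false: already locked
        System.out.println(locks.release(42L, "alice")); // true
        System.out.println(locks.tryLock(42L, "bob"));   // true
    }
}
```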
We will not solve this for you. You are asking for everything. You need to code it yourself, but here is a link for JPA locking.
Hint: Use @Version
Read here for information on locking for JPA

DAO and Service layer design

I am developing a web application with Java EE 6. In order to minimize calls to the database, would it be a good idea to have these classes:
A data access class (DAO) that offers only basic methods: getAllClients, getAllProducts, getAllOrders, delete, update, i.e. CRUD methods.
A service class that calls the CRUD methods but in addition offers filter methods, e.g. findClientByName, findProductByType, findProductByYear, findOrderFullyPaid/NotPaid etc., which are built on the basic DAO methods.
Thank you
In my experience (albeit, limited) DAO classes tend to have all the possible database operations which the application is allowed to perform. So in your case, it will have methods such as getAllClients() and getClientByName(String name), etc.
Getting all the users in your DAO and iterating all over them until you find the one you need will result in unneeded waste of computational time and memory consumption.
If you want to reduce the amount of times that your database is hit you could, maybe, implement some caching mechanism. An ORM framework such as Hibernate should be able to provide what you need as shown here.
EDIT:
As per your comment question, no, your service will not be made redundant. What one usually does is use a Service layer to expose the DAO functionality. This basically keeps the DAO invisible from the front end of your application. It usually also allows for extra methods, such as, for instance, public String getUserFormatted(String userName). This will make use of the getUserByName function offered by the DAO but provide some extra functionality.
The Service layer will also make itself useful should there be a change in specification and you now also need a web service to interface with your application. Having a service layer in between will allow the web service to query the DAO through the Service layer.
So basically, the DAO layer will still worry about the database stuff (CRUD Operations) while the service will adapt the data returned by the DAO without exposing the DAO.
It's hard to say without more information, but I think it's probably a good idea to leverage your database more than with just CRUD operations. Databases are good at searching, provided you configure them correctly, so IMHO it's a good idea to let your database handle the searching in your find methods for you. This means that your find methods would probably go in your DAOs...
It's good to think about and be aware of the implications of DB access on performance, but don't go overboard. Also, your approach implies that since your services are going to be doing the filtering, you are going to load a large amount of DB data into your application, which is a bad idea. The bottom line is that you should use your RDBMS as it is intended to be used, and worry about performance due to over-access when you can show it's a problem. I doubt you will run into that scenario.
I would say that you're better off having your DAO be more fine grained than you've specified.
I'd suggest putting findClientByName, findProductByType, findProductByYear, findOrderFullyPaid/NotPaid on your DAO as well in some way, because your database will most likely be better at filtering and sorting data than your in-memory code.
Imagine you have 10 years of data and you call findProductsByYear on your service class and it then calls getAllProducts and then throws away 9 years of data in memory. You're far better off getting your database to only return you the year you are interested in.
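The layering suggested in these answers might look like the sketch below: the find method lives on the DAO so that a database-backed implementation can push the filter into the query, and the service only adapts what the DAO returns. All names (Product, ProductDao, InMemoryProductDao, ProductService, productNamesForYear) are hypothetical, and the in-memory DAO is only a stand-in so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class Product {
    final String name;
    final int year;

    Product(String name, int year) {
        this.name = name;
        this.year = year;
    }
}

// DAO contract: filtering belongs here so a real implementation can do it in SQL.
interface ProductDao {
    List<Product> findProductsByYear(int year);
}

// In-memory stand-in. A database-backed DAO would instead run a query with a
// WHERE clause on the year, so only matching rows leave the database.
class InMemoryProductDao implements ProductDao {
    private final List<Product> table = new ArrayList<>();

    void save(Product product) {
        table.add(product);
    }

    @Override
    public List<Product> findProductsByYear(int year) {
        return table.stream()
                .filter(p -> p.year == year)
                .collect(Collectors.toList());
    }
}

// The service adapts DAO results for callers without exposing the DAO itself.
class ProductService {
    private final ProductDao dao;

    ProductService(ProductDao dao) {
        this.dao = dao;
    }

    List<String> productNamesForYear(int year) {
        return dao.findProductsByYear(year).stream()
                .map(p -> p.name)
                .collect(Collectors.toList());
    }
}

class ProductServiceDemo {
    public static void main(String[] args) {
        InMemoryProductDao dao = new InMemoryProductDao();
        dao.save(new Product("lamp", 2011));
        dao.save(new Product("desk", 2012));
        System.out.println(new ProductService(dao).productNamesForYear(2012)); // [desk]
    }
}
```

Because the service depends only on the ProductDao interface, the in-memory version can later be swapped for a JPA or JDBC implementation without touching the service.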
Yes, this is the right way to do it.
The service will own the transactions. You should write these as POJOs; that way you can expose them as SOAP or REST web services, EJBs, or anything else that you want later on.

ThreadLocal when using hibernate session/JDO persistenceManager

I am trying to understand the best practice for using ThreadLocal in the cases above. From my understanding, the reason for using it is to ensure only one session/pm is created for the entire application. My questions are:
1) Is there any impact of using ThreadLocal like this in a clustered application (for example Google App Engine)?
2) If I use "transactional" begin/commit in my application, I do not need to use ThreadLocal, right? The "transaction" already ensures my session is opened and closed properly?
3) If I need to use a "transactional" tx, should it be in a ThreadLocal as well?
4) Why not just use "static" instead of "ThreadLocal"?
I am interested to hear feedback from you all regarding the advantages/disadvantages of using this technique.
1) Probably not, unless your clustering software can migrate threads between nodes. In this case, you'd need to migrate the thread-local data as well.
2) No. The transaction is attached to the session, so you must keep both in sync. While you can begin a transaction in thread A and commit it in thread B, it's usually very hard to make sure that this works reliably. Therefore: Don't.
3) Yes.
4) static is global for the whole application. ThreadLocal is global per thread.
Conclusion: If you're a beginner in this area, I suggest using Spring. The Spring Framework solves many of the problems for you and helps you with useful error messages when something breaks.
Follow the documentation to the letter, especially when it doesn't make sense. Chances are that you missed something important and the Spring guys are right.
ThreadLocal is not used to create one session for the whole application. It is used to create one session for every thread. Each request is handled by one thread, so the ThreadLocal ensures that every user accessing your web page/database gets their own database connection. If you use a static singleton pattern, every user on the server will use the same database connection, and I don't know how that would work out.
The implementation of many of the Transaction engines is actually using ThreadLocal to associate the session state you have with the database to a particular thread. This makes for instance running multiple threads inside of a transaction very difficult.
ThreadLocal gives you thread safety while still letting another piece of code look the value up later in a semi-static way. It's a thread-global variable. This makes it useful for temporary but session-aware information. Another use beyond transactions might be holding onto internal parameters for authorisation, which are then checked by a proxy.
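The difference between "global for the whole application" and "global per thread" can be seen in a small experiment. In this sketch FakeSession is a hypothetical stand-in for a Hibernate Session or JDO PersistenceManager, and SessionHolder is an assumed helper name: every thread that asks for the current session gets its own instance, and repeated calls on the same thread return the same one.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Hibernate Session or JDO PersistenceManager.
final class FakeSession {
    private static final AtomicInteger COUNTER = new AtomicInteger();
    final int id = COUNTER.incrementAndGet();
}

class SessionHolder {
    // One FakeSession per thread, created lazily on first access.
    private static final ThreadLocal<FakeSession> CURRENT =
            ThreadLocal.withInitial(FakeSession::new);

    static FakeSession current() {
        return CURRENT.get();
    }

    // Call when the unit of work ends; pooled threads would otherwise keep the old value.
    static void clear() {
        CURRENT.remove();
    }
}

class ThreadLocalDemo {
    // Runs 'threads' workers and returns how many distinct sessions they saw.
    static int distinctSessions(int threads) {
        ConcurrentMap<Integer, Boolean> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                FakeSession session = SessionHolder.current();
                seen.put(session.id, Boolean.TRUE);
                SessionHolder.clear();
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctSessions(4)); // 4
    }
}
```

With a plain static field instead of the ThreadLocal, all four threads would share one FakeSession, which is exactly the sharing the answers above warn against.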

Is using synchronized on a Java DAO going to cause issues?

Is using the 'synchronized' keyword on methods in a Java DAO going to cause issues when used by a web application?
I ask because I have a multi-threaded standalone application that needs the methods to be synchronized to avoid resource conflicts, as seen here.
java.util.concurrent.ExecutionException: javax.persistence.PersistenceException: org.hibernate.HibernateException: Found shared references to a collection: com.replaced.orm.jpa.Entity.stuffCollection
What I am concerned about is that when a significant number of people try and use the application that the synchronized methods will block and slow the entire application down.
I am using a Spring injected JPA entity manager factory, which provides an entity manager to the DAO. I could technically remove the DAO layer and have the classes call the entity manager factory directly, but I enjoy the separation the DAO provides.
I should also note that I am being very careful not to pass around connected entity ORM objects between threads. I speculate that the resource conflict error comes about when accessing the DAO. I think multiple threads are going at the same time and try to persist or read from the database in non-atomic ways.
In this case, is using a DAO going to do more harm than good?
A big piece of information I left out of the question is that the DAO is not a singleton. If I had been thinking lucidly enough to include that detail I probably wouldn't have asked the question in the first place.
If I understand correctly, Spring creates a new instance of the DAO class for each class that uses it. So the backing entity manager should be unique to each thread. Not sharing the entity manager is, as Rob H answered, the key thing here.
However, now I don't understand why I get errors when I remove synchronized.
According to this thread, the @PersistenceContext annotation creates a thread-safe SharedEntityManager. So you should be able to create a singleton DAO.
You say you are not sharing entity objects across threads. That's good. But you should also make sure you're not sharing EntityManager objects (or Session objects in Hibernate) across threads either. Frameworks like Spring manage this for you automatically by storing the session in a thread-local variable. If you're coding your own DAOs without the help of a framework, you need to take precautions yourself to avoid sharing them.
Once you do this, there should be no reason to synchronize DAO methods because none of the conversational state will be shared across threads. This is critical for a highly concurrent web application. The alternative is that only one thread will be able to access the DAO at one time, assuming they all share the same DAO instance. Not good at all for throughput.
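A minimal sketch of why no synchronization is needed once nothing is shared: the DAO below has no mutable fields, and each call works on its own UnitOfWork object, so a single DAO instance can serve many threads at once. UnitOfWork, OrderDao and the method names are hypothetical; UnitOfWork merely stands in for a per-thread EntityManager or Session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical per-call unit of work; stands in for an EntityManager or Session.
final class UnitOfWork {
    private final List<String> statements = new ArrayList<>(); // touched by one thread only

    void execute(String sql) {
        statements.add(sql);
    }

    int statementCount() {
        return statements.size();
    }
}

// Singleton-friendly DAO: no mutable fields, so no synchronized is needed.
class OrderDao {
    int saveOrder(String customer, int items) {
        UnitOfWork work = new UnitOfWork(); // confined to the calling thread
        work.execute("insert order for " + customer);
        for (int i = 0; i < items; i++) {
            work.execute("insert item " + i);
        }
        return work.statementCount();
    }
}

class OrderDaoDemo {
    // Calls one shared DAO from several threads; true if every call saw only its own state.
    static boolean runConcurrently(int threads, int items) {
        OrderDao dao = new OrderDao();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                String customer = "customer-" + i;
                Callable<Integer> task = () -> dao.saveOrder(customer, items);
                results.add(pool.submit(task));
            }
            for (Future<Integer> result : results) {
                if (result.get() != items + 1) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(8, 5)); // true
    }
}
```

If UnitOfWork were instead stored in a field of OrderDao, concurrent calls would interleave their statements, and that is the situation where synchronized (or a per-thread instance) becomes necessary.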
If the methods need to be synchronized for thread safety, then leave the keyword there. The blocking is required anyway in that case. If the blocking is not required for the web application case, you can either:
Leave it as is, since the performance hit when there is no contention on the lock is negligible, and insignificant compared with the expense of hitting the database.
Redesign it so that you add a synchronization layer for the standalone application case which protects the underlying unsynchronized DAO.
Personally, I would leave it as is and profile it to see if you need to refactor. Until then you are simply doing premature optimization.
