I thought finding this answer would be easy...but not so much.
Does anyone know if the OracleDataSource.getConnection method is thread safe?
I do not mean the Connection objects it returns, but the calling of getConnection itself.
Specifically, this method: http://download.oracle.com/otn_hosted_doc/jdeveloper/905/jdbc-javadoc/oracle/jdbc/pool/OracleDataSource.html#getConnection()
The docs and the class itself don't say so explicitly but, seeing as it's a connection pool, I'm inclined to believe it is.
This is a problem discussed in Java Concurrency in Practice (Brian Goetz):
4.5.1. Interpreting Vague Documentation
Many Java technology specifications are silent, or at least unforthcoming, about thread safety guarantees and requirements for interfaces such as ServletContext, HttpSession, or DataSource.
...information about servlets...
One can make a similar inference about the JDBC DataSource interface, which represents a pool of reusable database connections. A DataSource provides service to an application, and it doesn't make much sense in the context of a single-threaded application. It is hard to imagine a use case that doesn't involve calling getConnection from multiple threads. And, as with servlets, the examples in the JDBC specification do not suggest the need for any client-side locking in the many code examples using DataSource. So, even though the specification doesn't promise that DataSource is thread-safe or require container vendors to provide a thread-safe implementation, by the same "it would be absurd if it weren't" argument, we have no choice but to assume that DataSource.getConnection does not require additional client-side locking.
...
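To make the conclusion of that passage concrete, here is a minimal sketch of several threads sharing one DataSource and calling getConnection without any client-side locking. The class and method names (SharedDataSourceDemo, borrowConcurrently) are made up for illustration, and the code assumes the DataSource implementation you pass in really is safe to share, which is the assumption the book argues for, not a guarantee from the specification.

```java
import java.sql.Connection;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.sql.DataSource;

// Hypothetical helper, not part of any JDBC driver.
class SharedDataSourceDemo {

    /**
     * Submits `tasks` jobs to a small thread pool. Each job borrows one
     * connection from the shared DataSource and closes it again.
     * Returns how many jobs obtained a non-null connection.
     */
    static int borrowConcurrently(DataSource dataSource, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    // No synchronized block around getConnection():
                    // the DataSource is assumed to handle concurrent callers.
                    try (Connection connection = dataSource.getConnection()) {
                        return connection != null;
                    }
                }));
            }
            int succeeded = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    succeeded++;
                }
            }
            return succeeded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that each Connection is still confined to the task that borrowed it; only the DataSource itself is shared.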
Under the IBM JVM we have faced an issue when multiple threads call Class.getAnnotation at the same time on different objects (but with the same annotation). The threads deadlock waiting on a monitor inside a Hashtable, which the IBM JVM uses as a cache for annotations. The weirdest thing is that the thread holding this monitor is put into the 'waiting on condition' state right inside Hashtable.get, making all other threads wait indefinitely.
IBM support stated that their implementation of Class.getAnnotation is not thread safe.
Comparing with other JVM implementations (for example, OpenJDK), we see that they implement Class methods in a thread-safe manner. The IBM JVM is closed source; they do publish some source code together with their JVM, but it's not enough to make a clear judgment on whether their implementation of Class is thread safe.
The Class documentation doesn't clearly state whether its methods are thread safe. So is it a safe assumption to treat Class methods (getAnnotation in particular) as thread safe, or must we use synchronized blocks in a multi-threaded environment?
How do popular frameworks (e.g. Hibernate) mitigate this problem? We haven't found any use of synchronization in the Hibernate code that calls getAnnotation.
Your problem might be related to a bug that was fixed in version 8 of Oracle Java.
One thread calls isAnnotationPresent on an annotated class where the
annotation is not yet initialised for its defining classloader. This
will result in a call on AnnotationType.getInstance, locking the class
object for sun.reflect.annotation.AnnotationType. getInstance will
result in a Class.initAnnotationsIfNecessary for that annotation,
trying to acquire a lock on the class object of that annotation.
In the meanwhile, another thread has requested Class.getAnnotations
for that annotation(!). Since getAnnotations locks the class object it
was requested on, the first thread can't lock it when it runs into
Class.initAnnotationsIfNecessary for that annotation. But the thread
holding the lock will try to acquire the lock for the class object of
sun.reflect.annotation.AnnotationType in AnnotationType.getInstance
which is hold by the first thread, thus resulting in the deadlock.
JDK-7122142 : (ann) Race condition between isAnnotationPresent and getAnnotations
Well, there is no specified behavior, so normally the correct way to deal with it would be to say “if no behavior is specified, assume no safety guarantees”.
But…
The problem here is that if these methods are not thread-safe, the specification lacks a documentation of how to achieve thread-safety correctly here. Recall that instances of java.lang.Class are visible across all threads of the entire application or even within multiple applications if your JVM hosts multiple apps/applets/servlets/beans/etc.
So unlike classes you instantiate for your own use, where you can control access to the instances, you can’t preclude other threads from accessing the same methods of a particular java.lang.Class instance. So even if we accept the very awkward idea of relying on some kind of convention for accessing such a global resource (e.g. saying “the caller has to do synchronized(x.class)”), the even bigger problem is that no such convention exists (or it isn’t documented, which comes down to the same thing).
So in this special case, where no caller’s responsibility is documented and none can be established without such documentation, it is up to IBM to say how they think programmers should use these methods correctly when they are implemented in a non-thread-safe manner.
There is an alternative interpretation I want to add: all the information java.lang.Class offers is of a static, constant nature. The class reflects what has been invariably compiled into the class file, and it has no methods to alter any state. So maybe there’s no additional thread-safety documentation because all of this information is to be considered immutable and hence naturally thread-safe.
Rather, the fact that some information is loaded on demand under the hood is an undocumented implementation detail that the programmer does not need to be aware of. So if JRE developers decide to implement lazy creation for efficiency, they must maintain the immutable-like behavior, in other words thread safety.
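To illustrate what "lazy but still immutable-looking" can mean, here is a minimal sketch of a cache whose entries are created on first access yet never appear half-built or duplicated to callers. It is not how any JVM implements Class.getAnnotation; LazyMetadataCache and its methods are hypothetical names, and the "metadata" is just a string.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, loosely modelled on a lazily filled metadata cache.
class LazyMetadataCache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    // Counts how often the expensive load actually ran (for demonstration only).
    private final AtomicInteger loads = new AtomicInteger();

    /** Returns the metadata for a class name, loading it on first use. */
    String describe(String className) {
        // ConcurrentHashMap.computeIfAbsent runs the loader at most once per key,
        // so concurrent callers all observe the same fully built value.
        return cache.computeIfAbsent(className, name -> {
            loads.incrementAndGet();
            return "metadata for " + name;
        });
    }

    int loadCount() {
        return loads.get();
    }
}
```

From the caller's point of view describe behaves like a read of constant data; the lazy loading stays an implementation detail.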
I have declared a Spring bean, which polls my email server every so and so seconds. If there is mail, it fetches it, and tries to extract any attached files in it. These files are then submitted to an Uploader which stores them safely. The uploader is also declared as a Spring bean. A third bean associates the email's sender with the file's filename and stores that in a DB.
It turned out that when a few people tried to send emails at the same time, a bunch of messy stuff happened. Records in the DB got wrong filenames. Some did not get filenames at all, etc.
I attributed the problem to the fact that beans are singleton-scoped by default. This means that a bunch of threads are probably messing with one and the same instance at the same time. The question is how to solve this.
If I synchronize all the sensitive methods, then all threads will stack up and wait for each other, which is kind of against the whole idea of multithreading.
On the other hand, scoping the beans to "request" is going to create new instances of each of them, which is not really good either in terms of memory consumption and thread scheduling.
I am confused. What should I do?
Singleton-scoped beans should not hold any state - that solves the problem usually. If you only pass data as method parameters, and don't assign it to fields, you will be safe.
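A minimal sketch of what that looks like in plain Java (no Spring involved, and the names AttachmentService and AttachmentRecord are invented for illustration): everything a call needs arrives as parameters and lives in local variables, so one shared instance can serve many threads.

```java
// Immutable result object; safe to hand to any thread.
final class AttachmentRecord {
    final String sender;
    final String storedName;

    AttachmentRecord(String sender, String storedName) {
        this.sender = sender;
        this.storedName = storedName;
    }
}

// Stands in for a singleton-scoped bean: it has no mutable fields at all.
class AttachmentService {

    /** Builds the record for one e-mail attachment using only its arguments. */
    AttachmentRecord process(String sender, String fileName) {
        // Local variable: each calling thread has its own copy on its own stack.
        String storedName = sender.replace('@', '_') + "-" + fileName;
        return new AttachmentRecord(sender, storedName);
    }
}
```

The buggy variant would keep something like a currentFileName field on the service and assign it per e-mail; two concurrent calls could then overwrite each other's value, which matches the mixed-up filenames described in the question.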
I agree with both @Bozho's and @stivio's answers.
The preferred options are either to store no state in singleton-scoped beans and pass a context object to the methods, or to use prototype- or request-scoped beans that get created for every processing cycle. With either approach synchronization can usually be avoided, and you gain much more performance while avoiding deadlocks. Just make sure you're not modifying any shared state, such as static members.
There are pros and cons for each approach:
Singleton beans act as service-like classes, which some may say is not good object-oriented design.
Passing context to methods in a long chain of methods may make your code messy, if you're not careful.
Prototype beans may hold a lot of memory for longer than you intended, and may cause memory exhaustion. You need to be careful with the life cycle of these beans.
Prototype beans may make your design neater. Make sure you're not reusing beans by multiple threads though.
I tend to go with the service approach in most simple cases. You can also let these singleton beans create a processing object that holds its own state for the computation. This is the solution that may serve you best in more complex scenarios.
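Here is a rough sketch of that last idea, again in plain Java with invented names (MailProcessor, Job): the shared service stays stateless and creates a short-lived processing object per call, and only that object carries mutable state.

```java
import java.util.ArrayList;
import java.util.List;

// Stands in for the singleton bean; it has no mutable fields.
class MailProcessor {

    /** Handles one e-mail; all intermediate state lives in a per-call Job. */
    String handle(String sender, List<String> attachments) {
        Job job = new Job(sender); // fresh instance per call, never shared
        for (String attachment : attachments) {
            job.store(attachment);
        }
        return job.summary();
    }

    // Mutable, not thread-safe, and it does not need to be:
    // only the thread that created it ever sees it.
    private static final class Job {
        private final String sender;
        private final List<String> stored = new ArrayList<>();

        Job(String sender) {
            this.sender = sender;
        }

        void store(String fileName) {
            stored.add(fileName);
        }

        String summary() {
            return sender + ":" + String.join(",", stored);
        }
    }
}
```

For example, handle("a@example.com", Arrays.asList("x.txt", "y.txt")) returns "a@example.com:x.txt,y.txt" no matter how many other threads call handle at the same time.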
Edit:
There are cases when you have a singleton bean depending on prototype scoped bean, and you want a new instance of the prototype bean for each method invocation. Spring supplies several solutions for that:
The first is using Method Injection, as described in the Spring reference documentation. I don't really like this approach, as it forces your class to be abstract.
The second is to use a ServiceLocatorFactoryBean, or your own factory class (which needs to be injected with the dependencies, and invoke a constructor). This approach works really well in most cases, and does not couple you to Spring.
There are cases when you also want the prototype beans to have runtime dependencies. A good friend of mine wrote a good post about this here: http://techo-ecco.com/blog/spring-prototype-scoped-beans-and-dependency-injection/.
Otherwise just declare your beans as request-scoped and don't worry about the memory consumption: the garbage collector will clean the instances up, and as long as there is enough memory it won't be a performance problem either.
Speaking abstractly: if you're using Spring Integration, then you should build your code in terms of the messages themselves, i.e. all important state should be propagated with the messages. This makes it trivial to scale out by adding more Spring Integration instances to handle the load. The only state (really) in Spring Integration is in components like the aggregator, which waits for and collects messages that share a correlation. In this case, you can delegate to a backing store like MongoDB to handle the storage of these messages, and that is of course thread safe.
More generally, this is an example of a staged event-driven architecture: components must handle messages statelessly (O(1), no matter how many messages) and then forward them on a channel for consumption by another component, which knows nothing about the component the message came from.
If you are encountering thread-safety issues using Spring Integration, you might be doing something a bit differently than intended and it might be worth revisiting your approach...
Singletons should be stateful and thread-safe.
If a singleton is stateless, it's a degenerate case of being stateful, and thread safety is trivially true. But then what's the point of it being a singleton? Just create a new one every time someone requests it.
If an instance is stateful and not thread-safe, then it must not be singleton; each thread should exclusively have a different instance.
I have a stand-alone JMS app that subscribes to several different JMS topics. Each topic has its own session and onMessage() listener. Each onMessage() method updates a common current value table - all the onMessage() methods update the same current value table.
I've read that the onMessage method is actually called on the JMS provider's thread. So, my question is: if all these onMessage() methods are called on threads other than my app's, doesn't this present a concurrency problem, since all these threads update a common CVT? It seems like I need to synchronize access to the CVT somehow?
Short answer to your question: YES, you need to take care of concurrency concerns when your JMS code is updating some common in-memory object.
However, I'm not sure what you mean by "common current value table". If this is a database table, then the database should take care of concurrency issues for you.
EDIT: it turned out that "common current value table" is a common in-memory object. As I mentioned earlier, in this case you need to handle the concurrency concerns yourself (Java concurrency tutorial).
There are mainly two approaches to this problem:
synchronization - your best choice if you have low contention or are stuck with some non-thread-safe object.
high-level concurrency objects that come with the JDK - the best fit if you have high contention and are using a class from the regular collections; just swap in the corresponding concurrent collection.
In any case, it is highly recommended to do your own testing to choose the best approach for you.
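As a minimal sketch of the second approach, assuming the current value table is essentially a map from a key to its latest value (the class name CurrentValueTable and the String/Double types are guesses, since the question doesn't show the real structure):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the shared "current value table".
class CurrentValueTable {
    private final ConcurrentMap<String, Double> values = new ConcurrentHashMap<>();

    /** Called from any onMessage() thread: replaces the current value. */
    void update(String key, double value) {
        values.put(key, value);
    }

    /** Read-modify-write in one atomic step; a separate get() then put() would race. */
    void accumulate(String key, double delta) {
        values.merge(key, delta, Double::sum);
    }

    /** Returns the current value, or null if the key has never been written. */
    Double get(String key) {
        return values.get(key);
    }
}
```

Single put and get calls are safe on their own; anything that reads a value and then writes back a result derived from it should go through an atomic method such as merge or compute, or be guarded by a lock.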
If you were dealing with non-thread-safe, stateless procedural code that is expensive to create (no storage of data involved), you could also use object pooling (e.g. Commons Pool), but that is not relevant to your current issue.
JMS onMessage() method is always called by the JMS provider's thread (also known as asynchronous calling).
Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering whether a race condition can occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
I have noticed that many methods in my application may lead to a race condition or a dirty read/write. For example, I have a poll system, and for each voting operation a certain method changes a single cell value for that poll, as follows:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/Servlets app solve these issues by itself, or do I have to solve all of that myself?
Thanks..
It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not by itself fix every concurrency bug. If doSomething() increments a value on a shared object, then all accesses to that variable need to be synchronized, since i++ is not an atomic operation. For something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
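A minimal sketch of the counter case, assuming the shared value is an in-memory vote count (PollCounter is a made-up name; a count stored in the database needs the database's own mechanisms instead):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory vote counter shared by all request threads.
class PollCounter {
    private final AtomicInteger votes = new AtomicInteger();

    /** Atomic replacement for votes++; returns the new total. */
    int vote() {
        return votes.incrementAndGet();
    }

    int total() {
        return votes.get();
    }
}
```

With a plain int field and votes++, two threads can read the same old value and both write back old + 1, losing one vote; incrementAndGet performs the read, add and write as one atomic operation.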
The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.
The Servlet API will not do anything to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea, because you are basically forcing your requests to be processed one at a time, which ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.
Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
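To give a feel for the optimistic variant, here is a small in-memory simulation. It is only an analogy: a real implementation would use a version column and an UPDATE ... WHERE version = ? statement (or JPA's @Version), whereas this sketch uses an AtomicReference to play the role of the row. All names (OptimisticPollRow, Row, tryUpdate) are invented for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for one database row guarded by a version number.
class OptimisticPollRow {

    // Immutable snapshot of the row as a reader saw it.
    static final class Row {
        final int version;
        final int votes;

        Row(int version, int votes) {
            this.version = version;
            this.votes = votes;
        }
    }

    private final AtomicReference<Row> row = new AtomicReference<>(new Row(0, 0));

    Row read() {
        return row.get();
    }

    /**
     * Succeeds only if nobody changed the row since `expected` was read.
     * Every successful write installs a new Row object, so the reference
     * comparison in compareAndSet plays the role of the version check.
     */
    boolean tryUpdate(Row expected, int newVotes) {
        return row.compareAndSet(expected, new Row(expected.version + 1, newVotes));
    }

    /** Read, compute, attempt the write, and retry on conflict. Returns the attempts needed. */
    int vote() {
        int attempts = 0;
        while (true) {
            attempts++;
            Row current = read();
            if (tryUpdate(current, current.votes + 1)) {
                return attempts;
            }
        }
    }

    int currentVotes() {
        return row.get().votes;
    }

    int currentVersion() {
        return row.get().version;
    }
}
```

No lock is held while the new value is computed; a conflicting writer is detected at write time and the loser retries. Pessimistic locking takes the opposite approach and locks the row before reading it.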
The servlet API and JDBC connection pooling give you some helpful guarantees, so that you can write your servlet code without using Java synchronisation, provided your variables are in method scope. In concept you have:
Start transaction (perhaps implicit, perhaps on entry to an EJB)
Get connection to DB (gets you a connection from the pool, associated with your transaction)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again, maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: static method. This presumably implies that you now keep everything in an in-memory structure. This (barring remote invocation of some sort) implies a single JVM, with you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data, then using a DB is probably better.
Or how about a completely different approach: the servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other process pick up the votes from the queue and add them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual processing, which in similar scenarios can be quite complex.
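The shape of that idea can be sketched with an in-memory queue. This is only a stand-in: a java.util.concurrent queue is not persistent and lives in one JVM, unlike the JMS queue suggested above, and VoteQueue with its methods is an invented name.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory stand-in for the persistent queue between servlet and tallying process.
class VoteQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Producer side (the servlet): just enqueue the choice and return immediately. */
    void record(String choice) {
        queue.offer(choice);
    }

    /**
     * Consumer side (a single background worker): drains what is queued
     * right now and adds it up. Only this thread touches the tally map,
     * so the map itself needs no locking.
     */
    Map<String, Integer> drainAndTally() {
        Map<String, Integer> tally = new TreeMap<>();
        String choice;
        while ((choice = queue.poll()) != null) {
            tally.merge(choice, 1, Integer::sum);
        }
        return tally;
    }
}
```

The request threads never touch the totals; contention is limited to the thread-safe queue, and all counting happens in one place.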
I think that the best solution for your problem is to use something like the "synchronized" keyword and wait/notify!
I'm using ThreadLocal variables (through Clojure's vars, but the following is the same for plain ThreadLocals in Java) and very often run into the issue that I can't be sure that a certain code path will be taken on the same thread or on another thread. For code under my control this is obviously not too big a problem, but for polymorphic third party code there's sometimes not even a way to statically determine whether it's safe to assume single threaded execution.
I tend to think this is an inherent issue with ThreadLocals, but I'd like to hear some advice on how to use them in a safe way.
Then don't use ThreadLocals! They are specifically for when you want a variable that's associated with a Thread, as if there were a Map<Thread,T>.
The typical use case (as far as I know) for a ThreadLocal is in a web application framework. An HTTP filter obtains a database connection on an incoming request, and stores the connection in a static ThreadLocal. All subsequent controllers needing the connection can easily obtain it from the framework using a static call. When the response is returned, the same filter releases the connection again.
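A small sketch of both points, the Map<Thread,T> behaviour and the set-then-release pattern of such a filter. The names are made up and a String stands in for the database connection:

```java
// Hypothetical demo of per-thread visibility and cleanup of a ThreadLocal.
class ThreadLocalDemo {
    // Stands in for the framework's static ThreadLocal holding the connection.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /**
     * Sets a value on the calling thread, then reads the same ThreadLocal
     * from a second thread and reports what each thread saw.
     */
    static String demo() {
        CURRENT.set("alice"); // like the filter on the way in
        try {
            String[] seenByOther = new String[1];
            Thread other = new Thread(() -> seenByOther[0] = CURRENT.get());
            other.start();
            try {
                other.join(); // join() makes the other thread's write visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "caller=" + CURRENT.get() + ", other=" + seenByOther[0];
        } finally {
            CURRENT.remove(); // like the filter on the way out
        }
    }
}
```

demo() returns "caller=alice, other=null": the second thread never sees the value. This is exactly the hazard from the question when third-party code hops to another thread, and it is why the value is removed in a finally block, so that a pooled thread does not carry it into the next request.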