I am working on implementing a very specific health check strategy for my service. Here are the details:
All the application threads update a mutable Spring bean "problemDetectedTimestamp" when they come across a problem.
I have a background thread (implemented using ScheduledExecutorService) running every second which updates another mutable Spring bean "isServiceHealthy" based on the value of "problemDetectedTimestamp". To elaborate, the background thread checks whether the value of "problemDetectedTimestamp" falls within the last thread execution interval. If yes, the "isServiceHealthy" flag is updated to "false". I have some other application logic which depends on the "isServiceHealthy" flag.
Since multiple threads are running, I want "problemDetectedTimestamp" to be updated in a thread-safe manner without incurring the overhead of "synchronized" blocks. I am considering declaring "problemDetectedTimestamp" as a volatile variable so that writes from application threads are atomic and my background thread reads it from main memory rather than from its local cache. However, I am not sure how to declare the bean as "volatile" in Spring XML, as "volatile" is a concept specific to Java/C++ and I am pretty sure Spring is not tightly coupled with Java as such.
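For illustration, here is a minimal sketch of how this setup could look in plain Java. Note that volatile is a Java field modifier, so it would be declared on a field of the bean's class rather than in the Spring XML. The class name HealthState and its methods are hypothetical, and the sketch assumes the flag flips back to healthy once an interval passes without a problem, which the description above does not specify.

```java
// Hypothetical holder class; it could be registered as an ordinary singleton bean.
class HealthState {
    // Written by application threads, read by the background checker.
    // Reads and writes of a volatile long are atomic. -1 means "no problem seen yet".
    private volatile long problemDetectedTimestamp = -1L;

    // Written only by the background checker, read by application logic.
    private volatile boolean serviceHealthy = true;

    // Called by any application thread that comes across a problem.
    void reportProblem(long nowMillis) {
        problemDetectedTimestamp = nowMillis;
    }

    // Called by the background thread once per execution interval.
    void evaluate(long intervalStartMillis, long nowMillis) {
        long last = problemDetectedTimestamp;
        boolean problemInInterval = last >= intervalStartMillis && last <= nowMillis;
        // Assumption: the service counts as healthy again if the interval was clean.
        serviceHealthy = !problemInInterval;
    }

    boolean isServiceHealthy() {
        return serviceHealthy;
    }
}
```

The background task scheduled on the ScheduledExecutorService would then call evaluate(...) with the start of the previous interval and the current time.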
Scenario
We are developing an API that will handle around 2-3 million hits per hour in a multi-threaded environment. The server is Apache Tomcat 7.0.64.
We have a custom object with a lot of data; let's call it XYZDataContext. When a new request comes in, we associate an XYZDataContext object with the request context, one XYZDataContext object per request. We will be spawning various threads in parallel to serve that request and to collect/process data from/into the XYZDataContext object. The threads that process things in parallel need access to this XYZDataContext object, and to avoid passing it around everywhere in the application (to various objects/methods/threads), we are thinking of making it a ThreadLocal. Threads will use data from the XYZDataContext object and will also update data in it.
When the thread finishes we are planning to merge the data from the updated XYZDataContext object in the spawned child thread into the main thread's XYZDataContext object.
My questions:
Is this a good approach?
Thread pool risks: the Tomcat server maintains a thread pool, and I read that using ThreadLocal with thread pools is a disaster because a pooled thread is not GCed per se but reused, so the references to the ThreadLocal objects will not get GCed. That would keep huge objects in memory that we no longer need, eventually resulting in OutOfMemory issues...
UNLESS they are referenced as weak references so that they get GCed immediately.
We're using Java 1.7 OpenJDK. I looked at the source code for ThreadLocal, and although ThreadLocalMap.Entry is a WeakReference, it's not associated with a ReferenceQueue, and the comment for the Entry constructor says "since reference queues are not used, stale entries are guaranteed to be removed only when the table starts running out of space."
I guess this works great in case of caches but is not the best thing in our case. I would like that the threadlocal XYZDataContext object be GCed immediately. Will the ThreadLocal.remove() method be effective here?
Is there any way to enforce emptying the space in the next GC run?
Is this the right scenario for using ThreadLocal objects? Or are we abusing the ThreadLocal concept and using it where it shouldn't be used?
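For context, the usual way to use ThreadLocal.remove() with pooled threads is to call it in a finally block, so the value is unbound before the thread goes back to the pool. A minimal sketch, where ContextHolder and the nearly empty XYZDataContext stand-in are hypothetical names for illustration:

```java
// Hypothetical stand-in for the real context class.
class XYZDataContext {
    final StringBuilder data = new StringBuilder();
}

class ContextHolder {
    private static final ThreadLocal<XYZDataContext> CURRENT =
            new ThreadLocal<XYZDataContext>();

    // Returns the context bound to the calling thread, or null if none is bound.
    static XYZDataContext get() {
        return CURRENT.get();
    }

    // Binds the context for the duration of the task and always unbinds it,
    // even if the task throws, so a reused pool thread keeps no reference.
    static void runWithContext(XYZDataContext ctx, Runnable task) {
        CURRENT.set(ctx);
        try {
            task.run();
        } finally {
            CURRENT.remove();
        }
    }
}
```

After remove() the thread no longer references the context through the ThreadLocal, so the object becomes collectable as soon as nothing else refers to it.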
My gut feeling tells me you're on the wrong path. Since you already have a central context object (one for all threads) and you want to access it from multiple threads at the same time I would go with a Singleton hosting the context object and providing threadsafe methods to access it.
Instead of manipulating multiple properties of your context object, I would strongly suggest to do all manipulations at the same time. Best would be if you pass only one object containing all the properties you want to change in your context object.
e.g
Singleton.getInstance().adjustContext(ContextAdjuster contextAdjuster)
You might also want to consider using a threadsafe queue, filling it up with ContextAdjuster objects from your threads and finally processing it in the Context's thread.
Google for things like Concurrent, Blocking and Nonblocking Queue in Java. I am sure you'll find tons of example code.
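To make the queue idea a bit more concrete, here is a rough sketch using ConcurrentLinkedQueue. The ContextAdjuster interface, the SharedContext class, and the map-based context data are all assumptions made for illustration; your real context object will look different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical: one adjuster carries all the changes a worker wants applied at once.
interface ContextAdjuster {
    void applyTo(Map<String, String> contextData);
}

class SharedContext {
    // Only ever touched by the thread that calls drain() and get().
    private final Map<String, String> data = new HashMap<String, String>();

    // Thread-safe, non-blocking queue filled by the worker threads.
    private final Queue<ContextAdjuster> pending =
            new ConcurrentLinkedQueue<ContextAdjuster>();

    // Called from worker threads; does not modify the data directly.
    void submit(ContextAdjuster adjuster) {
        pending.add(adjuster);
    }

    // Called from the context's own thread; applies queued changes in order.
    void drain() {
        ContextAdjuster adjuster;
        while ((adjuster = pending.poll()) != null) {
            adjuster.applyTo(data);
        }
    }

    String get(String key) {
        return data.get(key);
    }
}
```

Because only one thread ever modifies the data, the context itself needs no locking; the queue is the only shared structure.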
Under the IBM JVM we have faced an issue when multiple threads try to call Class.getAnnotation at the same time on different objects (but with the same annotation). Threads start to deadlock waiting on a monitor inside a Hashtable, which is used as a cache for annotations in the IBM JVM. The weirdest thing is that the thread holding this monitor is put into the 'waiting on condition' state right inside Hashtable.get, making all other threads wait indefinitely.
IBM support stated that the implementation of Class.getAnnotation is not thread safe.
Comparing with other JVM implementations (for example, OpenJDK), we see that they implement the Class methods in a thread-safe manner. The IBM JVM is closed source; they do publish some source code together with their JVM, but it's not enough to make a clear judgment on whether their implementation of Class is thread safe or not.
The Class documentation doesn't clearly state whether its methods are thread safe or not. So is it a safe assumption to treat Class methods (getAnnotation in particular) as thread safe, or must we use synchronized blocks in a multi-threaded environment?
How do popular frameworks (e.g. Hibernate) mitigate this problem? We haven't found any use of synchronization in the Hibernate code that calls the getAnnotation method.
Your problem might be related to a bug that was fixed in version 8 of Oracle Java.
One thread calls isAnnotationPresent on an annotated class where the
annotation is not yet initialised for its defining classloader. This
will result in a call on AnnotationType.getInstance, locking the class
object for sun.reflect.annotation.AnnotationType. getInstance will
result in a Class.initAnnotationsIfNecessary for that annotation,
trying to acquire a lock on the class object of that annotation.
In the meanwhile, another thread has requested Class.getAnnotations
for that annotation(!). Since getAnnotations locks the class object it
was requested on, the first thread can't lock it when it runs into
Class.initAnnotationsIfNecessary for that annotation. But the thread
holding the lock will try to acquire the lock for the class object of
sun.reflect.annotation.AnnotationType in AnnotationType.getInstance
which is hold by the first thread, thus resulting in the deadlock.
JDK-7122142 : (ann) Race condition between isAnnotationPresent and getAnnotations
Well, there is no specified behavior, so normally the correct way to deal with it would be to say “if no behavior is specified, assume no safety guarantees”.
But…
The problem here is that if these methods are not thread-safe, the specification lacks a documentation of how to achieve thread-safety correctly here. Recall that instances of java.lang.Class are visible across all threads of the entire application or even within multiple applications if your JVM hosts multiple apps/applets/servlets/beans/etc.
So unlike classes you instantiate for your own use where you can control access to these instances, you can’t preclude other threads from accessing the same methods of a particular java.lang.Class instance. So even if we engage with the very awkward concept of relying on some kind of convention for accessing such a global resource (e.g. like saying “the caller has to do synchronized(x.class)”), the problem here is, even bigger, that no such convention exists (well, or isn’t documented which comes down to the same).
So in this special case, where no caller’s responsibility is documented and none can be established without such documentation, IBM is in charge of telling programmers how they think these methods should be used correctly when they are implemented in a non-thread-safe manner.
There is an alternative interpretation I want to add: all the information java.lang.Class offers is of a static, constant nature. This class reflects what has been invariably compiled into the class, and it has no methods to alter any state. So maybe there’s no additional thread-safety documentation because all information is to be considered immutable and hence naturally thread-safe.
Rather, the fact that some information is loaded on demand under the hood is an undocumented implementation detail that the programmer does not need to be aware of. So if JRE developers decide to implement lazy creation for efficiency, they must maintain the immutable-like behavior, that is, thread safety.
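To illustrate what "lazy but still immutable-like" can mean, here is a small sketch of a cache that is filled on demand yet always hands every thread the same value. It is not taken from any JVM's implementation of Class; LazyMetadata and its load method are made-up names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical lazily filled cache; callers cannot tell that loading is deferred.
class LazyMetadata {
    private final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();

    String describe(String name) {
        String value = cache.get(name);
        if (value == null) {
            // Several threads may compute concurrently, but only one result is kept.
            String computed = load(name);
            String raced = cache.putIfAbsent(name, computed);
            value = (raced != null) ? raced : computed;
        }
        return value;
    }

    // Stand-in for an expensive but deterministic lookup.
    private String load(String name) {
        return "meta:" + name;
    }
}
```

No caller-side synchronization is needed here: the lazy loading stays an internal detail, which is the behavior the interpretation above expects from Class.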
I've been exposing beans in our Spring web applications in case we need to make configuration changes on the fly. Recently I've been reviewing concurrency and I started to wonder what happens in other threads when you mutate one of these beans through JMX?
Does JMX have some way of forcing a memory model refresh so you don't need to worry about making the field volatile/synchronized to ensure other threads see the change?
When Tomcat creates a new thread to handle a request, that thread will see the changes even if the field is not thread-safe, correct? So unless I need the change to immediately take effect in current request threads is there any reason to worry about concurrency issues?
The JMX handler is not a special thread. Any changes there that need to be seen in other threads will need to be marked as volatile or synchronized.
// this needs to be volatile because another thread is accessing it
private volatile boolean shutdown;
...
@JmxOperation(description = "Shutdown the server")
public void shutdownSystem() {
    // this is set by the JMX connection thread
    shutdown = true;
}
That said, typically JMX values are statistics or configuration settings that I don't mind being lazily updated when the next memory barrier is crossed. This applies to counters and other debug information as well as booleans and other values that are only set by the JMX although they are used by other threads. In those cases, I do not mark the fields as volatile and it hasn't bitten us yet. YMMV.
FYI, the @JmxOperation annotation is from my SimpleJmx library.
I have a multithreaded program that loads its configuration on startup. The configuration is then handed down to the threads via their constructors.
But now I want to load one new config instance regularly, and pass it to the threads.
One idea would be to make the reference to the config in the thread class volatile. Then, when an updated config instance is available, call an update(Config c) method.
Is this the way to go? I will have terrible performance because every time the thread needs some setting it has to do all that volatile checking stuff.
Some better suggestions? Best practice? Maybe don't make it volatile and hope that the processor fetches that new object from main memory from time to time?
You could encapsulate all of your configuration values in a single immutable object, and when the configuration changes create a new instance of the object, and pass it to the threads, through listeners or explicit calling. The object has no volatile fields, only the object itself could be written in a volatile variable (or an AtomicReference).
The direct volatile approach with no other synchronization mechanisms is dangerous: you could read a halfway-rewritten configuration.
In any case, the performance impact on your application is likely to be negligible. Think about optimization later, if you find this is really a bottleneck.
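A minimal sketch of the immutable-snapshot idea, assuming a Config with two illustrative fields (host and timeoutMillis are made up) and a hypothetical ConfigHolder that wraps an AtomicReference:

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot: all fields are final and there are no setters.
final class Config {
    private final String host;
    private final int timeoutMillis;

    Config(String host, int timeoutMillis) {
        this.host = host;
        this.timeoutMillis = timeoutMillis;
    }

    String getHost() { return host; }
    int getTimeoutMillis() { return timeoutMillis; }
}

// Hypothetical holder shared by all worker threads.
class ConfigHolder {
    private final AtomicReference<Config> current;

    ConfigHolder(Config initial) {
        current = new AtomicReference<Config>(initial);
    }

    // Readers always get a complete snapshot, never a half-updated one.
    Config getConfig() {
        return current.get();
    }

    // The reloader swaps in a whole new instance in one atomic step.
    void update(Config next) {
        current.set(next);
    }
}
```

A thread that reads several settings in a row should fetch the snapshot once and read from that local reference, so all values come from the same configuration version.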
Actually, are you sure that will have terrible performance?
If volatile is used mostly for reading, its performance is not that bad. I'd recommend trying volatile first and measuring the performance degradation, and only reworking it if the degradation is significant.
If you are really worried about rapid volatile reads, then in the run method of your thread you could check for a timeout: if 60 seconds have passed since the last config read, reread it. The logic is reversed from update(Config c) to
if (moreThan60SecondsPassed) {
    localConfig = configHolder.getConfig();
}
Also, if you use a non-volatile field, you won't get a half-read config. The danger is that some threads might never see the updated value (no happens-before relationship).
Btw, did you consider recreating the threads on config update? In that case you could still pass the config through the constructor. It depends on how often you want to update the configuration.
You can use observer pattern to notify the threads of new config via listeners.
There is no way to avoid the volatile checking. It may be expensive (do some performance tests), but your program will run correctly.
You may want to look at Commons Configuration:
A common issue with file-based configurations is to handle the reloading of the data file when it changes. This is especially important if you have long running applications and do not want to restart them when a configuration file was updated. Commons Configuration has the concept of so called reloading strategies that can be associated with a file-based configuration. Such a strategy monitors a configuration file and is able to detect changes. A default reloading strategy is FileChangedReloadingStrategy. It can be set on a file-based configuration as follows.
ManagedReloadingStrategy is an alternative to automatic reloading. It allows to hot-reload properties on a running application but only when requested by admin. The refresh() method will force a reload of the configuration source.
Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering whether a race condition can occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
I have noticed that many methods in my application may lead to a race condition or dirty reads/writes. For example, I have a poll system, and for each voting operation a certain method changes a single cell value for that poll, as follows:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/Servlets app solve these issues by itself, or do I have to solve all of that myself?
Thanks..
It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
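As a small illustration of the counter case, here is a sketch with a hypothetical VoteCounter class built on AtomicInteger. It only covers an in-memory counter, not the database write from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory counter shared by all request threads.
class VoteCounter {
    private final AtomicInteger votes = new AtomicInteger();

    // Atomic read-modify-write; unlike votes++ on a plain int, no update is lost.
    int vote() {
        return votes.incrementAndGet();
    }

    int total() {
        return votes.get();
    }
}
```

With a plain int field and i++, two threads could read the same old value and both write back the same new one, losing a vote.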
The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.
The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.
Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JDBC connection pooling give you some helpful guarantees, so that you can write your servlet code without using Java synchronisation, provided your variables are in method scope. In concept you have:
Start transaction (perhaps implicit, perhaps on entry to an EJB)
Get connection to DB (gets you a connection from the pool, associated with your transaction)
Read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again, maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: static method. This presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) implies a single JVM and you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data, then using a DB is probably better.
OR, how about a completely different approach: the servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other process pick up the votes from the queue and add them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual (in similar scenarios quite complex) processing.
I think that the best solution for your problem is to use something like the "synchronized" keyword and wait/notify!