I have a multithreaded program that loads its configuration on startup. The configuration is then handed down to the threads via their constructors.
But now I want to load one new config instance regularly, and pass it to the threads.
One idea would be to make the thread class's reference to the config object volatile. Then, when an updated config instance is available, call an update(Config c) method.
Is this the way to go? I will have terrible performance because every time the thread needs some setting it has to do all that volatile checking stuff.
Some better suggestions? Best practice? Maybe don't make it volatile and hope that the processor fetches that new object from main memory from time to time?
You could encapsulate all of your configuration values in a single immutable object, and when the configuration changes, create a new instance of that object and pass it to the threads, through listeners or explicit calls. The object itself has no volatile fields; only the reference to it needs to be stored in a volatile variable (or an AtomicReference).
The direct volatile approach with no other synchronization mechanisms is dangerous: you could read a halfway-rewritten configuration.
In any case, the performance impact on your application is likely to be negligible. Think about optimization later, if you find this is really a bottleneck.
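As a rough illustration of the immutable-object idea above, here is a minimal sketch. The names Config, ConfigHolder, host and timeoutMillis are made up for this example and do not come from the question.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable configuration: all fields final, no setters.
final class Config {
    private final String host;
    private final int timeoutMillis;

    Config(String host, int timeoutMillis) {
        this.host = host;
        this.timeoutMillis = timeoutMillis;
    }

    String host() { return host; }
    int timeoutMillis() { return timeoutMillis; }
}

// Hypothetical holder shared by all threads.
final class ConfigHolder {
    private final AtomicReference<Config> current;

    ConfigHolder(Config initial) {
        current = new AtomicReference<>(initial);
    }

    // Readers see either the old instance or the new one, never a mix.
    Config get() { return current.get(); }

    // The reloader publishes a fully constructed instance in one step.
    void update(Config replacement) { current.set(replacement); }
}
```

A thread that needs several settings for one unit of work would call get() once and read all of them from that snapshot, so the values stay consistent with each other.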
Actually, are you sure that will have terrible performance?
If the volatile field is mostly read, its performance is not that bad. I'd recommend trying volatile first and measuring the performance degradation; only if it is significant should you rework anything.
If you are really worried about rapid volatile reads, then the run method of your thread could check a timeout: if 60 seconds have passed since the last config read, reread it. The logic is reversed, from update(Config c) to
if(moreThan60SecondsPassed)
{
localConfig = configHolder.getConfig();
}
Also, if you use a non-volatile field, you won't get a half-read config. The danger is that some threads might never see the updated value (there is no happens-before relationship).
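A slightly fuller sketch of that timeout check might look like the following. CachedConfig is a hypothetical helper, and the clock is passed in only so the behaviour can be exercised without waiting; in real code it could be System::currentTimeMillis, and the source could be something like configHolder::getConfig.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical per-thread cache: re-reads the shared config only after
// refreshMillis has elapsed, so most reads touch a plain local field.
final class CachedConfig<T> {
    private final Supplier<T> source;   // where the shared config comes from
    private final LongSupplier clock;   // current time in milliseconds
    private final long refreshMillis;
    private T local;
    private long lastRead;

    CachedConfig(Supplier<T> source, LongSupplier clock, long refreshMillis) {
        this.source = source;
        this.clock = clock;
        this.refreshMillis = refreshMillis;
        this.local = source.get();
        this.lastRead = clock.getAsLong();
    }

    // Intended to be called only by the owning thread; this class is
    // deliberately not thread-safe itself.
    T get() {
        long now = clock.getAsLong();
        if (now - lastRead >= refreshMillis) {
            local = source.get();
            lastRead = now;
        }
        return local;
    }
}
```

Each worker thread would own one CachedConfig instance. The shared source still needs a volatile field or an AtomicReference behind it, otherwise the reread has no visibility guarantee either.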
Btw, did you consider recreating the threads on config update? In that case you could still pass the config through the constructor. It depends on how often you want to update the configuration.
You can use the observer pattern to notify the threads of a new config via listeners.
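One possible shape for such listeners is sketched below. ConfigListener, ConfigPublisher and Worker are hypothetical names, and the config is a plain String only to keep the example short.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface implemented by each worker.
interface ConfigListener<T> {
    void configChanged(T newConfig);
}

// Hypothetical publisher owned by whatever reloads the configuration.
final class ConfigPublisher<T> {
    // CopyOnWriteArrayList allows safe iteration while listeners register.
    private final List<ConfigListener<T>> listeners = new CopyOnWriteArrayList<>();

    void addListener(ConfigListener<T> listener) {
        listeners.add(listener);
    }

    void publish(T newConfig) {
        for (ConfigListener<T> listener : listeners) {
            listener.configChanged(newConfig);
        }
    }
}

// Hypothetical worker: the callback runs on the publisher's thread, so the
// field is volatile to make the new value visible to the worker's own thread.
final class Worker implements ConfigListener<String> {
    private volatile String config;

    Worker(String initial) {
        this.config = initial;
    }

    @Override
    public void configChanged(String newConfig) {
        this.config = newConfig;
    }

    String currentConfig() {
        return config;
    }
}
```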
There is no way to avoid the volatile checking stuff. It may be expensive (do some performance tests), but your program will run correctly.
You may want to look at Commons Configuration:
A common issue with file-based configurations is to handle the reloading of the data file when it changes. This is especially important if you have long running applications and do not want to restart them when a configuration file was updated. Commons Configuration has the concept of so called reloading strategies that can be associated with a file-based configuration. Such a strategy monitors a configuration file and is able to detect changes. A default reloading strategy is FileChangedReloadingStrategy. It can be set on a file-based configuration as follows.
ManagedReloadingStrategy is an alternative to automatic reloading. It allows to hot-reload properties on a running application but only when requested by admin. The refresh() method will force a reload of the configuration source.
I have a pretty basic method,
handleOrder(IOrder order) {
//do stuff
}
I was having issues in that new quotes would update the order, so I wanted to synchronize on the order parameter. So my code would look like:
handleOrder(IOrder order) {
synchronized(order){
//do stuff
}
}
Now however, intellij is complaining that:
Synchronization on method parameter 'order'
Inspection info: Reports synchronization on a local variable or parameter. It is very difficult to guarantee correctness when such synchronization is used. It may be possible to improve code like this by controlling access through e.g. a synchronized wrapper class, or by synchronizing on a field.
Is this something I actually need to be concerned about?
Yes, because this type of synchronization is generally an indication that the code cannot easily be reviewed to ensure that deadlocks don't take place.
When you synchronize on a field, you're combining the synchronization code with the instance being used in a way that permits you to have most, if not all of the competing methods in the same file. This makes it easier to review the file for deadlocks and errors in the synchronization approach. The same idea applies when using a synchronized wrapper class.
When you synchronize on a passed instance (a parameter or local variable), you need to review the code of the entire application for other synchronization on the same instance to get the same level of assurance that no mistake was made. In addition, this has to be done frequently, as there is little assurance that after the next commit a developer will have done the same code scan to make sure that their synchronization didn't impact code that lives in some remote directory (or even in a remote JAR file that doesn't have source code on their machine).
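To illustrate synchronizing on a field instead, here is a small sketch. OrderBook and its methods are hypothetical, since the question does not show IOrder or the surrounding class, and note that it trades the per-order lock for one lock covering the whole book.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class that owns both the shared state and the lock guarding
// it, so every competing access can be reviewed in this one file.
final class OrderBook {
    private final Object lock = new Object();
    private final Map<String, Integer> quantities = new HashMap<>();

    void handleOrder(String orderId, int quantity) {
        synchronized (lock) {
            // Read-modify-write happens entirely under the private lock.
            quantities.merge(orderId, quantity, Integer::sum);
        }
    }

    int quantityOf(String orderId) {
        synchronized (lock) {
            return quantities.getOrDefault(orderId, 0);
        }
    }
}
```

Because the lock object is private, no code outside this class can synchronize on it, which is what makes the deadlock review local.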
Please note: Although this question mentions Java, I think it's an OOP/concurrency problem at heart and can probably be answered by anyone with significant programming experience.
So I'm building a ConfigurationLoader that will read Configuration POJO from a remote service and make it available via an API. A few things:
As soon as the ConfigurationLoader is asked for the Configuration the first time, a background thread (worker) will ping the remote service every, say, 30 seconds, for updates and then apply those updates to the Configuration instance; and
If the Configuration is modified, the background worker will be notified of the change and will push the "new" Configuration to the remote service;
Both the ConfigurationLoader and the Configuration must be thread-safe
So when I think "thread safety" the first thing I think of is immutability, which leads me towards excellent projects like Immutables. The problem is that Configuration can't be immutable because we need to be able to change it on the client-side and then let the loader ripple those changes back to the server.
My next thought was to try and make both ConfigurationLoader and Configuration singletons, but the problem there is that the ConfigurationLoader takes a lot of arguments to instantiate, and as this excellent answer points out, a singleton that takes arguments in construction is not a true singleton.
// Groovy pseudo-code
class Configuration {
// Immutable? Singleton? Other?
}
class ConfigurationLoader {
// private fields
Configuration configuration
ConfigurationLoader(int fizz, boolean buzz, Foo foo, List<Bar> bars) {
super()
this.fizz = fizz
this.buzz = buzz;
// etc.
}
Configuration loadConfiguration() {
if(configuration == null) {
// Create background worker that will constantly update
// 'configuration'
}
}
}
What are my options here? How do I create both the loader and the config to be thread-safe, where the config is changeable by the client-side (on-demand) or asynchronously by a background worker thread?
The problem is that Configuration can't be immutable because we need to be able to change it
It can still be immutable, you just create a new one for every change ("copy-on-write").
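A minimal copy-on-write sketch, written in Java rather than Groovy, could look like this. The with and change methods and the key/value map are assumptions made for the example; the question does not say what a Configuration contains.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot: "changing" it produces a new instance.
final class Configuration {
    private final Map<String, String> values;

    Configuration(Map<String, String> values) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.values = Collections.unmodifiableMap(new HashMap<>(values));
    }

    String get(String key) {
        return values.get(key);
    }

    // Hypothetical "wither": returns a modified copy, leaves this one intact.
    Configuration with(String key, String value) {
        Map<String, String> copy = new HashMap<>(values);
        copy.put(key, value);
        return new Configuration(copy);
    }
}

final class ConfigurationLoader {
    private final AtomicReference<Configuration> current;

    ConfigurationLoader(Configuration initial) {
        this.current = new AtomicReference<>(initial);
    }

    Configuration loadConfiguration() {
        return current.get();
    }

    // Used by clients and by the background worker alike: atomically swaps
    // in a modified copy, retrying if another thread updated in between.
    Configuration change(String key, String value) {
        return current.updateAndGet(c -> c.with(key, value));
    }
}
```

Pushing the new snapshot to the remote service would hang off change(); that part is left out here because it depends on the service's API.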
What are my options here?
First thing you'll have to think about: How do you want to react to configuration changes in concurrently running tasks? Basically, you have three options:
Ignore configuration change until the task is done
E.g. some directory your code writes files to: finish writing the current file to the current target dir, then put new files in the new dir. Writing some bytes into /new/path/somefile won't be a good idea if you never created that file. Your best option for this is probably an immutable Configuration object that you store in a field of your task instance (e.g. at task creation, in which case you can also make that field final for clarity). This usually works best if your code is designed as a collection of isolated small tasks.
Pros: Config never changes within a single task, so this is simple to get thread-safe and easy to test.
Cons: Config updates never make it to already running tasks.
Make your tasks check for config changes
E.g. your task regularly sends some data to an email address. Have a central storage for your config (like in your pseudo-code) and re-fetch it from your task code at some interval (e.g. between collecting data and sending the mail). This usually works best for long-running/permanent tasks.
Pros: Config can change during a task run, but still somewhat simple to get safe - just make sure you have some memory barrier in place for reading the config (make your private configuration field volatile, use an AtomicReference, guard it with a lock, whatever).
Cons: Task code will be harder to test than first option. Config values may still be outdated between checks.
Signal config changes to your tasks
Basically option two, but the other way around. Whenever config changes, interrupt your tasks, handle the interrupt as a "config needs updating" message, continue/restart with new config.
Pros: Config values are never outdated.
Cons: This is the hardest to get right. Some might even argue that you cannot get this right, because interruption should only be used for task abortion. There are only very minor benefits (if any) over the second option if you place your task's update checks at the right spots. Don't do this unless you have a good reason to.
You need a singleton to pull this off, but your singleton isn't the immutable thing. It's the thread-safe thing. Make your singleton (Configuration) contain a simple Properties object or something similar, and protect access to it with synchronization. Your ConfigurationLoader somehow knows of this Configuration singleton and, when it detects a change, sets a new instance of the Properties object under synchronization.
I'm pretty sure Apache Commons Configuration does something like this.
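A bare-bones sketch of that synchronized singleton follows. SharedConfiguration and its methods are hypothetical names, and this is not a claim about how Commons Configuration is implemented.

```java
import java.util.Properties;

// Hypothetical thread-safe singleton wrapping a replaceable Properties.
final class SharedConfiguration {
    private static final SharedConfiguration INSTANCE = new SharedConfiguration();

    private Properties properties = new Properties();

    private SharedConfiguration() {
    }

    static SharedConfiguration getInstance() {
        return INSTANCE;
    }

    // Readers and the loader synchronize on the same monitor (this).
    synchronized String get(String key) {
        return properties.getProperty(key);
    }

    // Called by the loader when it detects a change. Copies the entries so
    // the caller cannot mutate the published state afterwards.
    synchronized void replace(Properties fresh) {
        Properties copy = new Properties();
        copy.putAll(fresh);
        properties = copy;
    }
}
```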
I am creating multiple threads and all the threads read the same property file (there is no write operation done to property file). Will this cause performance overhead since the same property file is read multiple times by multiple threads?
I suggest to load the properties file once and use the same Properties instance by all the threads.
Loading once reduces disk access:
better performance for this application
better availability of the entire system
Multiple reading is not a concurrency problem.
A comment by didierc highlights a possible bottleneck: each access to Properties is synchronized, so while one thread reads a value, all the others may have to wait.
To avoid this, you may confine the use of the Properties in the constructor or initialization of your threads. Don't use p.getProperty( XXX ) in a loop inside the Thread.run() methods.
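As a sketch of that advice, the worker below copies what it needs out of the shared Properties in its constructor and never touches it again. PropertyWorker and the batch.size key are made up for the example.

```java
import java.util.Properties;

// Hypothetical worker that reads its settings once, at construction time.
final class PropertyWorker implements Runnable {
    private final int batchSize;   // copied out of the shared Properties
    private long processed;

    PropertyWorker(Properties shared) {
        // The only access to the shared Properties instance.
        this.batchSize = Integer.parseInt(shared.getProperty("batch.size", "10"));
    }

    @Override
    public void run() {
        for (int i = 0; i < 1000; i++) {
            // The hot loop uses the final field, not p.getProperty(...).
            processed += batchSize;
        }
    }

    long processed() {
        return processed;
    }
}
```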
The answer is "it depends". Mostly it depends on how much work each thread does in addition to reading the properties file. If each thread does much work in addition to reading the file, performance will not be much affected.
You should be more concerned about potential correctness problems: if different threads used different properties, would the program behave correctly? If not, your program has a race hazard bug: if the property file is altered (or deleted) while the program runs, some threads could use different properties and so produce an incorrect computation.
Property files are used for program configuration. Programs typically read all their configuration information soon after they start, before doing any real work. They can therefore fail fast if the configuration is faulty. You probably ought to do likewise, treating the spawning of threads as the real work to do later. This also ensures a user of your program gets only one error message per fault in the configuration, rather than one message per thread.
I've been exposing beans in our Spring web applications in case we need to make configuration changes on the fly. Recently I've been reviewing concurrency and I started to wonder what happens in other threads when you mutate one of these beans through JMX?
Does JMX have some way of forcing a memory model refresh so you don't need to worry about making the field volatile/synchronized to ensure other threads see the change?
When Tomcat creates a new thread to handle a request, that thread will see the changes even if the field is not thread-safe, correct? So unless I need the change to immediately take effect in current request threads is there any reason to worry about concurrency issues?
The JMX handler is not a special thread. Any changes there that need to be seen in other threads will need to be marked as volatile or synchronized.
// this needs to be volatile because another thread is accessing it
private volatile boolean shutdown;
...
@JmxOperation(description = "Shutdown the server")
public void shutdownSystem() {
// this is set by the JMX connection thread
shutdown = true;
}
That said, typically JMX values are statistics or configuration settings that I don't mind being lazily updated when the next memory barrier is crossed. This applies to counters and other debug information as well as booleans and other values that are only set by the JMX although they are used by other threads. In those cases, I do not mark the fields as volatile and it hasn't bitten us yet. YMMV.
FYI, the @JmxOperation annotation is from my SimpleJmx library.
Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering whether a race condition can occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
I have noticed that many methods in my application may lead to a race condition or a dirty read/write. For example, I have a poll system, and for each voting operation a certain method changes a single cell value for that poll, as follows:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/Servlets app solve these issues by itself, or do I have to solve all of that myself?
Thanks..
It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
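For the simple counter case, a sketch with AtomicInteger might look like this. VoteCounter is a hypothetical in-memory class and says nothing about how the poll row in the database should be updated.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory counter shared by all request threads.
final class VoteCounter {
    private final AtomicInteger votes = new AtomicInteger();

    // incrementAndGet performs the read-modify-write as one atomic step,
    // unlike votes++ on a plain int field.
    int vote() {
        return votes.incrementAndGet();
    }

    int total() {
        return votes.get();
    }
}
```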
The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.
The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.
Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JDBC connection pooling give you some helpful guarantees, so that you can write your servlet code without using Java synchronisation provided your variables are in method scope. Conceptually you have:
Start transaction (perhaps implicit, perhaps on entry to an ejb)
Get connection to DB ( Gets you a connection from pool, associated with your tran)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again, maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: static method. This presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) implies a single JVM and you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data then using a DB is probably better.
Or, how about a completely different approach: the servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other process pick up the votes from the queue and add them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual processing, which in similar scenarios can be quite complex.
I think that the best solution for your problem is to use something like the "synchronized" keyword and wait/notify!