Why aren't variables in Java volatile by default? - java

Possibly similar question:
Do you ever use the volatile keyword in Java?
Today I was debugging my game. It had a very difficult threading problem that would show up every few minutes but was difficult to reproduce. So first I added the synchronized keyword to each of my methods. That didn't work. Then I added the volatile keyword to every field. The problem seemed to just fix itself.
After some experimentation I found that the field responsible was a GameState object which kept track of my game's current state, which can be either playing or busy. When busy, the game ignores user input. What I had was a thread that constantly changed the state variable, while the Event thread read the state variable. However, after one thread changed the variable, it took several seconds for the other thread to recognize the change, which ultimately caused the problem.
It was fixed by making the state variable volatile.
Why aren't variables in Java volatile by default and what's a reason not to use the volatile keyword?

To make a long story short, volatile variables, whether in Java or C#, are never cached locally within the thread. This has little effect on a single core, where all threads look at the same cache; it matters on a multiprocessor/multicore machine with threads executing on different cores. When you declare a variable as volatile, every read and write behaves as if it went straight to the actual main memory location, with no thread-local cached copy involved. This restricts the optimizations that can be applied, and making every variable volatile (when most variables don't need to be) would inflict a performance penalty, paltry as it may or may not be, for a relatively small gain.

Volatiles are really only needed when you're trying to write low-level thread-safe, lock-free code. Most of your code probably shouldn't be either thread-safe or lock-free. In my experience, lock-free programming is only worth attempting after you've found that the simpler version which does do locking is incurring a significant performance hit due to the locking.
The more pleasant alternative is to use other building blocks in java.util.concurrent, some of which are lock-free but don't mess with your head quite as much as trying to do it all yourself at a low level.
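As a rough illustration of the java.util.concurrent route, here is a minimal sketch that keeps the game state in an AtomicReference. The GameState enum and the GameStateHolder class are hypothetical names made up for this example, since the question does not show the real code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the asker's state type.
enum GameState { PLAYING, BUSY }

class GameStateHolder {
    // AtomicReference gives the same visibility guarantees as a volatile
    // field, plus an atomic compare-and-set.
    private final AtomicReference<GameState> state =
            new AtomicReference<>(GameState.PLAYING);

    GameState current() {
        return state.get();
    }

    // Moves PLAYING -> BUSY; returns false if the game was already busy.
    boolean tryBeginBusy() {
        return state.compareAndSet(GameState.PLAYING, GameState.BUSY);
    }

    void endBusy() {
        state.set(GameState.PLAYING);
    }
}
```

The compare-and-set is the part a plain volatile field cannot do: two threads calling tryBeginBusy() at the same time cannot both succeed.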
Volatility has its own performance costs, and there's no reason why most code should incur those costs.

Personally I think fields should have been final by default and mutable only with an extra keyword, but that boat sailed a long time ago. ;)

While others are correct in pointing out why it would be a bad idea to default to volatile, there's another point to make: there is very likely a bug in your code.
Variables seldom need to be made volatile: there is almost always a way to properly synchronize access to them (either with the synchronized keyword or with the AtomicXxx classes from java.util.concurrent.atomic). Exceptions would include JNI code manipulating them, which is not bound by synchronization directives.
So instead of adding volatile, you may want to figure out WHY it resolved the problem. It isn't the only way to solve it, and there is probably a better way.
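For synchronized to make a write visible, the reader and the writer both have to lock the same monitor. A minimal sketch of that, assuming the shared value is only ever touched through accessors (SynchronizedFlag is a hypothetical class, not the asker's code):

```java
// Hypothetical holder; the names are illustrative, not from the question.
class SynchronizedFlag {
    private boolean busy; // guarded by "this"

    // Reads and writes lock the same monitor, so a write made by one thread
    // is visible to any thread that calls isBusy() afterwards.
    synchronized boolean isBusy() {
        return busy;
    }

    synchronized void setBusy(boolean value) {
        busy = value;
    }
}
```

If any thread reads the field directly, or through a method that locks a different object, the guarantee is lost, which may be worth checking in code where adding synchronized "didn't work".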

Because the compiler can't optimise volatile variables.
volatile tells the compiler that the variable can change at any time. Therefore, it can't assume that the variable won't change and optimise accordingly.

Declaring variables volatile generally has a huge impact on performance. On traditional single-threaded systems, it was relatively easy to know what needed to be volatile; it was those things that accessed hardware.
On multi-threaded systems it can be a little more complex, but I would generally encourage using notifications and event queues to handle passing data between threads in lieu of magic variables. In Java it may not matter much; in C/C++ you would get into trouble when those variables cannot be set atomically by the underlying hardware.
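A minimal sketch of the event-queue idea, using a BlockingQueue from java.util.concurrent. The class name, the queue capacity of 16, and the string messages are all assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class StateChangeQueueDemo {
    // Sends the given messages from a producer thread and returns them
    // concatenated in the order the calling thread received them.
    static String relay(String... messages) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                for (String m : messages) {
                    queue.put(m); // blocks if the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        StringBuilder received = new StringBuilder();
        try {
            for (int i = 0; i < messages.length; i++) {
                received.append(queue.take()); // blocks until a message arrives
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received.toString();
    }
}
```

The queue handles both the hand-off and the visibility, so neither thread needs a shared "magic" field.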

Related

Tomcat: shared static variables and methods across sessions.

As far as I know, static variables and methods are shared across different sessions. Can this sort of behavior cause performance degradation, for example when different sessions are reading a static variable or calling a static method at the same time?
There's usually no performance penalty involved in multiple threads reading the same variable or calling the same method at the same time, as long as no other threads are writing to that variable.
And if one thread can write a variable that another thread is reading, then you have a concurrency control issue that you need to handle carefully.
Note, however, that there may be an exception to the above on specific kinds of hardware when a variable that one thread writes is adjacent in memory to a variable that other threads read. In this case they may be in the same "cache line" -- the unit of memory that is read from RAM and cached, and in that case there may be contention between the readers and writers, as the hardware can't tell that they aren't accessing the same location.
The googlable term for this is "false sharing".
Simply "using static variables across sessions" does not inherently have performance implications. There is, however, a cousin concern that you need to look at, instead.
The fields that you're reading from/writing to from multiple user sessions will be accessed concurrently. This means that you will need to make your objects thread-safe (that's going to be necessary if you are writing to these static fields). This is what can have direct performance implications.
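One way to make a static field that several sessions write to thread-safe is to back it with a concurrent collection. The sketch below is only an illustration: PageHits and its methods are hypothetical, and simulate() stands in for real request threads.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PageHits {
    // Shared by every request thread, so it must be a thread-safe collection.
    private static final ConcurrentMap<String, Long> HITS = new ConcurrentHashMap<>();

    static void record(String page) {
        HITS.merge(page, 1L, Long::sum); // atomic per key on ConcurrentHashMap
    }

    static long hits(String page) {
        return HITS.getOrDefault(page, 0L);
    }

    // Simulates several request threads hitting the same page.
    static long simulate(String page, int threads, int hitsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    record(page);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits(page);
    }
}
```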

How Often Will Java Sync To Main Memory?

I have a set of counters which will only ever be updated in a single thread.
If I read these values from another thread and I don't use volatile/atomic/synchronized how out of date can these values be?
I ask as I am wondering if I can avoid using volatile/atomic/synchronized here.
I currently believe that I can't make any assumptions about time to update (so I am forced to use at least volatile). Just want to make sure I am not missing something here.
In practice, the CPU cache is probably going to be synchronized to main memory anyway on a regular basis (how often depends on many parameters), so it sounds like you would be able to see some new values from time to time.
But that is missing the point: the actual problem is that if you don't use a proper synchronization pattern, the compiler is free to "optimise" your code so that the reading thread never re-reads the variable and so never sees the update.
For example:
class Broken {
    boolean stop = false;
    void broken() throws Exception {
        while (!stop) {
            Thread.sleep(100);
        }
    }
}
The compiler is authorised to rewrite that code as:
void broken() throws Exception {
    while (true) {
        Thread.sleep(100);
    }
}
because there is no obligation to check if the non-volatile stop might change while you are executing the broken method. Mark the stop variable as volatile and that optimisation is not allowed any more.
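For comparison, a sketch of the fixed version with a volatile flag. StoppableWorker and the stopsWithin helper are hypothetical names added for the example; the helper only exists to show the loop really does terminate.

```java
class StoppableWorker {
    // volatile: the loop must re-read the flag on every iteration.
    private volatile boolean stop = false;

    void requestStop() {
        stop = true;
    }

    void runUntilStopped() {
        while (!stop) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Starts a worker, asks it to stop, and reports whether it terminated
    // within the timeout.
    static boolean stopsWithin(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker::runUntilStopped);
        t.setDaemon(true); // do not keep the JVM alive if something goes wrong
        t.start();
        worker.requestStop();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```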
Bottom line: if you need to share state you need synchronization.
How stale a value can get is left entirely to the discretion of the implementation -- the spec doesn't provide any guarantees. You will be writing code that depends on the implementation details of a particular JVM and which can be broken by changes to memory models or to how the JIT reorders code. The spec seems to be written with the intent of giving the implementers as much rope as they want, as long as they observe the constraints imposed by volatile, final, synchronized, etc.
It looks like the only way that I can avoid the synchronization of these variables is to do the following (similar to what Zan Lynx suggested in the comments):
1) Figure out the maximum age I am prepared to accept. I will make this the "update interval".
2) Each "update interval", copy the unsynchronized counter variables to synchronized variables. This needs to be done on the write thread.
3) Read thread(s) can only read from these synchronized variables.
Of course, this optimization may only be a marginal improvement and would probably not be worth it considering the extra complexity it would create.
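The scheme above might look roughly like this. It is a sketch under assumptions: the class and method names are invented, the interval is measured with System.nanoTime(), and the check only happens when the writer calls increment(), so an idle writer publishes nothing until flush() is called.

```java
class IntervalPublishedCounter {
    private final long intervalNanos;
    private long local;                                // writer thread only
    private long lastPublishNanos = System.nanoTime(); // writer thread only
    private volatile long published;                   // the only field readers touch

    IntervalPublishedCounter(long intervalNanos) {
        this.intervalNanos = intervalNanos;
    }

    // Writer thread only: a cheap plain increment, with an occasional
    // volatile write once the update interval has elapsed.
    void increment() {
        local++;
        long now = System.nanoTime();
        if (now - lastPublishNanos >= intervalNanos) {
            publish(now);
        }
    }

    // Writer thread only: force a publish, e.g. before shutting down.
    void flush() {
        publish(System.nanoTime());
    }

    private void publish(long now) {
        published = local;
        lastPublishNanos = now;
    }

    // Any thread: returns the last published value, which may lag behind
    // the writer's private count.
    long read() {
        return published;
    }
}
```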
Java 8 has a new class called LongAdder which helps with the problem of using volatile on a field. But until then...
If you do not use volatile on your counter then the results are unpredictable. If you do use volatile then there are performance problems since each write must guarantee cache/memory coherency. This is a huge performance problem when there are many threads writing frequently.
For statistics and counters that are not critical to the application, I give users the option of volatile/atomic or none with none the default. So far, most use none.
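For reference, a small sketch of LongAdder (java.util.concurrent.atomic, Java 8+) used as a statistics counter written by several threads. The LongAdderDemo class and its helper are hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

class LongAdderDemo {
    // Increments one shared LongAdder from several threads and returns
    // the total once all of them have finished.
    static long countWithThreads(int threads, int incrementsPerThread) {
        LongAdder hits = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    hits.increment(); // spreads contention over internal cells
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.sum(); // exact here because all writers have finished
    }
}
```

Note that sum() is only a moving estimate while writers are still running, which is usually acceptable for statistics.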

Java avoid race condition WITHOUT synchronized/lock

In order to avoid race conditions, we can synchronize the write and access methods on the shared variables, to lock other threads out of these variables.
My question is whether there are other (better) ways to avoid race conditions. Locks make the program slow.
What I found are:
using Atomic classes, if there is only one shared variable.
using an immutable container for multiple shared variables and declaring this container object volatile. (I found this method in the book "Java Concurrency in Practice")
I'm not sure if they perform faster than the synchronized way; are there any other better methods?
thanks
Avoid state.
Make your application as stateless as it is possible.
Each thread (sequence of actions) should take a context in the beginning and use this context passing it from method to method as a parameter.
When this technique does not solve all your problems, use the Event-Driven mechanism (+Messaging Queue).
When your code has to share something with other components it throws event (message) to some kind of bus (topic, queue, whatever).
Components can register listeners to listen for events and react appropriately.
In this case there are no race conditions (except inserting events to the queue). If you are using ready-to-use queue and not coding it yourself it should be efficient enough.
Also, take a look at the Actors model.
Atomics are indeed more efficient than classic locks due to their non-blocking behavior i.e. a thread waiting to access the memory location will not be context switched, which saves a lot of time.
Probably the best guideline when synchronization is needed is to see how you can reduce the critical section size as much as possible. General ideas include:
Use read-write locks instead of full locks when only a part of the threads need to write.
Find ways to restructure code in order to reduce the size of critical sections.
Use atomics when updating a single variable.
Note that some algorithms and data structures that traditionally need locks have lock-free versions (they are more complicated however).
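To illustrate the read-write lock suggestion, here is a minimal sketch of a read-mostly map guarded by a ReentrantReadWriteLock. ReadMostlySettings is a hypothetical class; whether this beats a plain lock depends on how heavily reads outnumber writes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadMostlySettings {
    private final Map<String, String> values = new HashMap<>(); // guarded by lock
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    String get(String key) {
        lock.readLock().lock();
        try {
            return values.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer gets exclusive access; the critical section is kept short.
    void put(String key, String value) {
        lock.writeLock().lock();
        try {
            values.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```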
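Well, first off, the Atomic classes are not free of synchronization either: they are built on volatile fields and hardware compare-and-swap rather than the synchronized keyword, but they still pay for coordinating between threads, just as you would if you did it yourself by hand.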
Second, immutability works great for multi-threading, you no longer need monitor locks and such, but that's because you can only read your immutables, you can't modify them.
You can't get rid of synchronized/volatile if you want to avoid race conditions in a multithreaded Java program (i.e. if the multiple threads can read AND WRITE the same data). Your best bet is, if you want better performance, to avoid at least some of the built in thread safe classes which do sort of a more generic locking, and make your own implementation which is more tied to your context and thus might allow you to use more granular synchronization & lock acquisition.
Check out this implementation of BlockingCache done by the Ehcache guys;
http://www.massapi.com/source/ehcache-2.4.3/src/net/sf/ehcache/constructs/blocking/BlockingCache.java.html
One of the alternatives is to make shared objects immutable. Check out this post for more details.
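A sketch of the immutable-object approach, combined with the volatile holder mentioned in the question. IntRange and RangeHolder are hypothetical names; the point is that both bounds are replaced together, so a reader never sees a half-updated pair.

```java
// Immutable value: both fields are final and set once in the constructor.
final class IntRange {
    final int lower;
    final int upper;

    IntRange(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        this.lower = lower;
        this.upper = upper;
    }
}

class RangeHolder {
    // volatile publishes the new object safely to reader threads.
    private volatile IntRange range = new IntRange(0, 0);

    IntRange get() {
        return range;
    }

    // Replaces both bounds in one step by swapping in a new immutable object.
    void set(int lower, int upper) {
        range = new IntRange(lower, upper);
    }
}
```

This covers replacing the whole value. An update that depends on the old value (read, modify, write back) still needs a lock or a compare-and-set.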
You can perform up to 50 million lock/unlocks per second. If you want this to be more efficient I suggest using more coarse-grained locking, i.e. don't lock every little thing, but have locks for larger objects. Once you have many more locks than threads, you are less likely to have contention, and having more locks may just add overhead.

if multiple threads are updating the same variable, what should be done so each thread updates the variable correctly?

If multiple threads are updating the same variable, what should I do so each thread updates the variable correctly?
Any help would be greatly appreciated
There are several options:
1) Using no synchronization at all
This can only work if the data is of primitive type (not long/double), and you don't care about reading stale values (which is unlikely)
2) Declaring the field as volatile
This will guarantee that stale values are never read. It also works fine for objects (assuming the objects aren't changed after creation), because of the happens-before guarantees of volatile variables (See "Java Memory Model").
3) Using java.util.concurrent.AtomicLong, AtomicInteger etc
They are all thread safe, and support special operations like atomic incrementation and atomic compare-and-set operations.
4) Protecting reads and writes with the same lock
This approach provides mutual exclusion, which allows defining a large atomic operation, where multiple data members are manipulated as a single operation.
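Option 4 could be sketched like this, with two fields that must always change together guarded by the same lock. RunningStats and recordConcurrently are hypothetical names used only for illustration.

```java
class RunningStats {
    private long count; // guarded by "this"
    private long sum;   // guarded by "this"

    // Both fields change under one lock, so no thread can observe
    // a count that does not match the sum.
    synchronized void record(long value) {
        count++;
        sum += value;
    }

    synchronized long count() {
        return count;
    }

    synchronized double mean() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    // Records the same value from several threads and returns the result.
    static RunningStats recordConcurrently(int threads, int perThread, long value) {
        RunningStats stats = new RunningStats();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    stats.record(value);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats;
    }
}
```

For a single counter, option 3 (AtomicLong.incrementAndGet()) would do the same job without an explicit lock. A volatile field alone would not, because count++ is a read followed by a write.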
This is a major problem with multi-threaded applications, and spans more than I could really cover in an answer, so I'll point you to some resources.
http://download.oracle.com/javase/tutorial/essential/concurrency/sync.html
http://www.vogella.de/articles/JavaConcurrency/article.html#concurrencyjava_synchronized
Essentially, you use the synchronized keyword to place a lock around the code that touches a variable. This makes sure that the piece of code is run by only one thread at a time. You can also lock on the same object in multiple places.
Additionally, you need to look out for several pitfalls, such as Deadlock.
http://tutorials.jenkov.com/java-concurrency/deadlock.html
Errors caused by misuse of locks are often very difficult to debug and track down, because they aren't very consistent. So, you always need to be careful that you put all of your locks in the correct location.
You should implement locking on the variable in question.
Eg.
http://download.oracle.com/javase/tutorial/essential/concurrency/newlocks.html
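A minimal sketch of explicit locking with java.util.concurrent.locks, the API that tutorial page covers. LockedCounter is a hypothetical class.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockedCounter {
    private final Lock lock = new ReentrantLock();
    private int value; // guarded by lock

    int increment() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

Unlike synchronized, an explicit Lock is not released automatically, hence the try/finally.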

Approach to a thread safe program

All,
What should be the approach to writing a thread safe program? Given a problem statement, my perspective is:
1 > Start off by writing the code for a single threaded environment.
2 > Underline the fields which would need atomicity and replace them with possible concurrent classes
3 > Underline the critical sections and enclose them in synchronized
4 > Perform tests for deadlocks
Does anyone have any suggestions on other approaches or improvements to my approach? So far, I can see myself enclosing most of the code in synchronized blocks and I am sure this is not correct.
Programming in Java
Writing correct multi-threaded code is hard, and there is not a magic formula or set of steps that will get you there. But, there are some guidelines you can follow.
Personally I wouldn't start with writing code for a single threaded environment and then converting it to multi-threaded. Good multi-threaded code is designed with multi-threading in mind from the start. Atomicity of fields is just one element of concurrent code.
You should decide on what areas of the code need to be multi-threaded (in a multi-threaded app, typically not everything needs to be threadsafe). Then you need to design how those sections will be threadsafe. The methods for making one area of the code threadsafe may be different from those for other areas. For example, understanding whether there will be a high volume of reading vs writing is important and might affect the types of locks you use to protect the data.
Immutability is also a key element of threadsafe code. When elements are immutable (i.e. cannot be changed), you don't need to worry about multiple threads modifying them since they cannot be changed. This can greatly simplify thread safety issues and allow you to focus on where you will have multiple data readers and writers.
Understanding details of concurrency in Java (and details of the Java memory model) is very important. If you're not already familiar with these concepts, I recommend reading Java Concurrency In Practice http://www.javaconcurrencyinpractice.com/.
You should use final and immutable fields wherever possible; any other data that you want to change should be updated inside:
    synchronized (this) {
        // update
    }
And remember, sometimes stuff breaks, and if that happens, you don't want to prolong the program execution by taking every possible way to counter it - instead "fail fast".
As you have asked about "thread-safety" and not concurrent performance, then your approach is essentially sound. However, a thread-safe program that uses synchronisation probably does not scale much in a multi cpu environment with any level of contention on your structure/program.
Personally I like to try and identify the highest level state changes and try and think about how to make them atomic, and have the state changes move from one immutable state to another – copy-on-write if you like. Then the actual write can be either a compare-and-set operation on an atomic variable or a synchronised update or whatever strategy works/performs best (as long as it safely publishes the new state).
This can be a bit difficult to structure if your new state is quite different (requires updates to several fields for instance), but I have seen it very successfully solve concurrent performance issues with synchronised access.
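The "move from one immutable state to another" idea might be sketched as follows, using compare-and-set on an AtomicReference. Score and ScoreBoard are hypothetical names, and the two-field state is just an example of an update that touches several fields at once.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of the state; an update produces a new instance.
final class Score {
    final int points;
    final int moves;

    Score(int points, int moves) {
        this.points = points;
        this.moves = moves;
    }

    Score afterMove(int gained) {
        return new Score(points + gained, moves + 1);
    }
}

class ScoreBoard {
    private final AtomicReference<Score> current =
            new AtomicReference<>(new Score(0, 0));

    Score snapshot() {
        return current.get();
    }

    // Copy-on-write update: build the next state from the current one and
    // retry if another thread swapped in a different state in the meantime.
    void recordMove(int gained) {
        Score old;
        Score next;
        do {
            old = current.get();
            next = old.afterMove(gained);
        } while (!current.compareAndSet(old, next));
    }

    // Applies moves from several threads and returns the final state.
    static Score play(int threads, int movesPerThread, int gained) {
        ScoreBoard board = new ScoreBoard();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < movesPerThread; i++) {
                    board.recordMove(gained);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return board.snapshot();
    }
}
```

Readers always get a consistent snapshot without locking; the cost is an allocation per update and retries under heavy write contention.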
Buy and read Brian Goetz's "Java Concurrency in Practice".
Any variables (memory) accessible by multiple threads potentially at the same time, need to be protected by a synchronisation mechanism.
