All,
What should be the approach to writing a thread safe program. Given a problem statement, my perspective is:
1 > Start of with writing the code for a single threaded environment.
2 > Underline the fields which would need atomicity and replace with possible concurrent classes
3 > Underline the critical section and enclose them in synchronized
4 > Perform test for deadlocks
Does anyone have any suggestions on the other approaches or improvements to my approach. So far, I can see myself enclosing most of the code in synchronized blocks and I am sure this is not correct.
Programming in Java
Writing correct multi-threaded code is hard, and there is not a magic formula or set of steps that will get you there. But, there are some guidelines you can follow.
Personally I wouldn't start with writing code for a single threaded environment and then converting it to multi-threaded. Good multi-threaded code is designed with multi-threading in mind from the start. Atomicity of fields is just one element of concurrent code.
You should decide on what areas of the code need to be multi-threaded (in a multi-threaded app, typically not everything needs to be threadsafe). Then you need to design how those sections will be threadsafe. Methods of making one area of the code threadsafe may be different than making other areas different. For example, understanding whether there will be a high volume of reading vs writing is important and might affect the types of locks you use to protect the data.
Immutability is also a key element of threadsafe code. When elements are immutable (i.e. cannot be changed), you don't need to worry about multiple threads modifying them since they cannot be changed. This can greatly simplify thread safety issues and allow you to focus on where you will have multiple data readers and writers.
Understanding details of concurrency in Java (and details of the Java memory model) is very important. If you're not already familiar with these concepts, I recommend reading Java Concurrency In Practice http://www.javaconcurrencyinpractice.com/.
You should use final and immutable fields wherever possible, any other data that you want to change add inside:
synchronized (this) {
// update
}
And remember, sometimes stuff brakes, and if that happens, you don't want to prolong the program execution by taking every possible way to counter it - instead "fail fast".
As you have asked about "thread-safety" and not concurrent performance, then your approach is essentially sound. However, a thread-safe program that uses synchronisation probably does not scale much in a multi cpu environment with any level of contention on your structure/program.
Personally I like to try and identify the highest level state changes and try and think about how to make them atomic, and have the state changes move from one immutable state to another – copy-on-write if you like. Then the actual write can be either a compare-and-set operation on an atomic variable or a synchronised update or whatever strategy works/performs best (as long as it safely publishes the new state).
This can be a bit difficult to structure if your new state is quite different (requires updates to several fields for instance), but I have seen it very successfully solve concurrent performance issues with synchronised access.
Buy and read Brian Goetz's "Java Concurrency in Practice".
Any variables (memory) accessible by multiple threads potentially at the same time, need to be protected by a synchronisation mechanism.
Related
I am trying understand a rationale behind biased locking and making it a default. Since reading this blog post, namely:
"Since most objects are locked by at most one thread during their lifetime, we allow that thread to bias an object toward itself"
I am perplexed... Why would anyone design a synchronized set of methods to be accessed by one thread only? In most cases, people devise certain building blocks specifically for the multi-threaded use-case, and not a single-threaded one. In such cases, EVERY lock aquisition by a thread which is not biased is at the cost of a safepoint, which is a huge overhead! Could someone please help me understand what I am missing in this picture?
The reason is probably that there are a decent number of libraries and classes that are designed to be thread safe but that are still useful outside of such circumstances. This is especially true of a number of classes that predate the Collections framework. Vector and it's subclasses is a good example. If you also consider that most java programs are not multi threaded it is in most cases an overall improvement to use a biased locking scheme, this is especially true of legacy code where the use of such Classes is all to common.
You are correct in a way, but there are cases when this is needed, as Holger very correctly points in his comment. There is so-called, the grace period when no biased-locking is attempted at all, so it's not like this will happen all the time. As I last remember looking at the code, it was 5 seconds. To prove this you would need a library that could inspect Java Object's header (jol comes to my mind), since biased locking is hold inside mark word. So only after 5 seconds will the object that held a lock before will be biased towards the same lock.
EDIT
I wanted to write a test for this, but seems like there is one already! Here is the link for it
I am learning multithreading, and I have a little question.
When I am sharing some variable between threads (ArrayList, or something other like double, float), should it be lcoked by the same object in read/write? I mean, when 1 thread is setting variable value, can another read at same time withoud any problems? Or should it be locked by same object, and force thread to wait with reading, until its changed by another thread?
All access to shared state must be guarded by the same lock, both reads and writes. A read operation must wait for the write operation to release the lock.
As a special case, if all you would to inside your synchronized blocks amounts to exactly one read or write operation, then you may dispense with the synchronized block and mark the variable as volatile.
Short: It depends.
Longer:
There is many "correct answer" for each different scenarios. (and that makes programming fun)
Do the value to be read have to be "latest"?
Do the value to be written have let all reader known?
Should I take care any race-condition if two threads write?
Will there be any issue if old/previous value being read?
What is the correct behaviour?
Do it really need it to be correct ? (yes, sometime you don't care for good)
tl;dr
For example, not all threaded programming need "always correct"
sometime you tradeoff correctness with performance (e.g. log or progress counter)
sometime reading old value is just fine
sometime you need eventually correct (e.g. in map-reduce, nobody nor synchronized is right until all done)
in some cases, correct is mandatory for every moment (e.g. your bank account balance)
in write-once, read-only it doesn't matter.
sometime threads in groups with complex cases.
sometime many small, independent lock run faster, but sometime flat global lock is faster
and many many other possible cases
Here is my suggestion: If you are learning, you should thing "why should I need a lock?" and "why a lock can help in DIFFERENT cases?" (not just the given sample from textbook), "will if fail or what could happen if a lock is missing?"
If all threads are reading, you do not need to synchronize.
If one or more threads are reading and one or more are writing you will need to synchronize somehow. If the collection is small you can use synchronized. You can either add a synchronized block around the accesses to the collection, synchronized the methods that access the collection or use a concurrent threadsafe collection (for example, Vector).
If you have a large collection and you want to allow shared reading but exclusive writing you need to use a ReadWriteLock. See here for the JavaDoc and an exact description of what you want with examples:
ReentrantReadWriteLock
Note that this question is pretty common and there are plenty of similar examples on this site.
In order to avoid race condition, we can synchronize the write and access methods on the shared variables, to lock these variables to other threads.
My question is if there are other (better) ways to avoid race condition? Lock make the program slow.
What I found are:
using Atomic classes, if there is only one shared variable.
using a immutable container for multi shared variables and declare this container object with volatile. (I found this method from book "Java Concurrency in Practice")
I'm not sure if they perform faster than syncnronized way, is there any other better methods?
thanks
Avoid state.
Make your application as stateless as it is possible.
Each thread (sequence of actions) should take a context in the beginning and use this context passing it from method to method as a parameter.
When this technique does not solve all your problems, use the Event-Driven mechanism (+Messaging Queue).
When your code has to share something with other components it throws event (message) to some kind of bus (topic, queue, whatever).
Components can register listeners to listen for events and react appropriately.
In this case there are no race conditions (except inserting events to the queue). If you are using ready-to-use queue and not coding it yourself it should be efficient enough.
Also, take a look at the Actors model.
Atomics are indeed more efficient than classic locks due to their non-blocking behavior i.e. a thread waiting to access the memory location will not be context switched, which saves a lot of time.
Probably the best guideline when synchronization is needed is to see how you can reduce the critical section size as much as possible. General ideas include:
Use read-write locks instead of full locks when only a part of the threads need to write.
Find ways to restructure code in order to reduce the size of critical sections.
Use atomics when updating a single variable.
Note that some algorithms and data structures that traditionally need locks have lock-free versions (they are more complicated however).
Well, first off Atomic classes uses locking (via synchronized and volatile keywords) just as you'd do if you did it yourself by hand.
Second, immutability works great for multi-threading, you no longer need monitor locks and such, but that's because you can only read your immutables, you cand modify them.
You can't get rid of synchronized/volatile if you want to avoid race conditions in a multithreaded Java program (i.e. if the multiple threads cand read AND WRITE the same data). Your best bet is, if you want better performance, to avoid at least some of the built in thread safe classes which do sort of a more generic locking, and make your own implementation which is more tied to your context and thus might allow you to use more granullar synchronization & lock aquisition.
Check out this implementation of BlockingCache done by the Ehcache guys;
http://www.massapi.com/source/ehcache-2.4.3/src/net/sf/ehcache/constructs/blocking/BlockingCache.java.html
One of the alternatives is to make shared objects immutable. Check out this post for more details.
You can perform up to 50 million lock/unlocks per second. If you want this to be more efficient I suggest using more course grain locking. i.e. don't lock every little thing, but have locks for larger objects. Once you have much more locks than threads, you are less likely to have contention and having more locks may just add overhead.
If multiple threads are updating the same variable, what should I do so each thread updates the variable correctly?
Any help would be greatly appreciated
There are several options:
1) Using no synchronization at all
This can only work if the data is of primitive type (not long/double), and you don't care about reading stale values (which is unlikely)
2) Declaring the field as volatile
This will guarantee that stale values are never read. It also works fine for objects (assuming the objects aren't changed after creation), because of the happens-before guarantees of volatile variables (See "Java Memory Model").
3) Using java.util.concurrent.AtomicLong, AtomicInteger etc
They are all thread safe, and support special operations like atomic incrementation and atomic compare-and-set operations.
4) Protecting reads and writes with the same lock
This approach provides mutual exclusion, which allows defining a large atomic operation, where multiple data members are manipulated as a single operation.
This is a major problem with multi-threaded applications, and spans more than I could really cover in an answer, so I'll point you to some resources.
http://download.oracle.com/javase/tutorial/essential/concurrency/sync.html
http://www.vogella.de/articles/JavaConcurrency/article.html#concurrencyjava_synchronized
Essentially, you use the synchronized keyword to place a lock around a variable. This makes sure that the piece of code is only being run once at a time. You can also place locks around the same object in multiple areas.
Additionally, you need to look out for several pitfalls, such as Deadlock.
http://tutorials.jenkov.com/java-concurrency/deadlock.html
Errors caused by misuse of locks are often very difficult to debug and track down, because they aren't very consistent. So, you always need to be careful that you put all of your locks in the correct location.
You should implement locking on the variable in question.
Eg.
http://download.oracle.com/javase/tutorial/essential/concurrency/newlocks.html
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I'm midway through programming a Java program, and I'm at the stage where I'm debugging far more concurrency issues than I'd like to be dealing with.
I have to ask: how do you deal with concurrency issues when setting out your program mentally? In my case, it's for a relatively simple game, yet issues with threads keep popping up - any quick-fix almost certainly leads to a new issue.
Speaking in very general terms, what techniques should I use when deciding how my application should 'flow' with out all my threads getting in a knot?
Concurrency boils down to managing shared state.
"All concurrency issues boil down to
coordinating access to mutable state.
The less mutable state, the easier it
is to ensure thread safety."
-- Java Concurrency in Practice
So the question you must ask yourself are:
What is the inherent shared data that the my application will need?
When can a thread work on a snapshot of the data, that is, it momentary work on a clone of the shared data?
Can I identify known pattern and use higher-level abstraction rather than low-level locks and thread coordination, e.g. queues, executor, etc. ?
Think of a global locking scheme as to avoid deadlock and have a consistent acquisition of locks
The simplest approach to manage shared state is to serialize every action. This coarse-grained approach results however into a high lock contention and poor performance. Managing concurrency can be seen an optimization exercise where you try to reduce the contention. So subsequent questions are:
How would the simplest approach be?
What are the simple choice that I can make to reduce contention (possibly with fine grained locking) and improve performance without making the solution overly complicated?
When am I going too fined-grained, that is, the complexity introduced isn't worth the performance gain?
A lot of approach to reduce contention rely on some form of trade-off between what would be necessary to enforce the correct behavior and what is feasible to reduce contention.
Where can I relax a few constraint and accept that sometimes stuff won't be 100% correct (e.g. a counter) ?
Can I be optimistic and deal with conflict only when concurrent modifications happen (e.g. using time stamp and retry logic - that's what TM do)?
Note that I never worked on a game, only on server-side part of enterprise apps. I can imagine that it can be quite different.
I use immutable data structures as much as possible. About the only time I do use mutable structures is when I have to such as with a library that will save a boatload of work. Even then I try to encapsulate that library in an immutable structure. If things can't change then there's less to worry about.
I should add that some things to keep in mind on your future endeavors are STM and Actor models. Both of these approaches to concurrency are showing very good progress. While there is some overhead for each, depending on the nature of your program that might not be an issue.
Edit:
Here are a few links to some libraries you could use in your next project. There's Deuce STM which as the name implies is an STM implementation for Java. Then there's the ActorFoundry which as the name implies is an Actor model for Java. However, I can't help but make the plug for Scala with its built in Actor model.
The fewer threads you have, the smaller state they share, and the simpler their interaction pattern on this shared state, the simpler your life will be.
You say Lists are throwing ConcurrentModificationException. I take it that your lists are acessed by seperate threads. So the first thing you should ask yourself is whether this is necessary. Is it not possible for the second thread to operate on a copy of the list?
If it is indeed necessary for the threads to access the list concurrently, locking the list during the entire traversal might be an option (Iterators are invalidated if the list is modified by any other means than that iterator). Of course, if you do other things while traversing the list, this traversal might take long, and locking out other threads might threaten the liveness of the system.
Also keep in mind that if the list is shared state, so are its contents, so if you intend to circumwent locking by copying the list, be sure to perform a deep copy, or prove that the objects contained in the list are themselves thread safe.
It's possible that the multi-threaded nature of your application might be a red herring, with respect to the ConcurrentModificationExceptions you mentioned: there are other ways that you can get a ConcurrentModificationException that don't necessarily involve multiple threads.
Consider the following:
List<Item> items = new ArrayList<Item>();
//... some code adding items to the list
for (Item item : items) {
if(item.isTheOneIWantToRemove()) {
items.remove(item); //This will result in a ConcurrentModificationException
}
}
Changing your for loop to a loop with an iterator, or an increasing index value solves the problem:
for (Iterator<String> it = items.iterator(); it.hasNext();) {
if(item.isTheOneIWantToRemove()) {
it.remove(); //No exception thrown
}
}
or
for (int i = 0; i < items.size(); i++) {
if(item.isTheOneIWantToRemove()) {
items.remove(items.get(i)); //No exception thrown
}
}
From the design perspective, I've found it useful to draw sequence diagrams where each thread's actions are color coded (that is, each thread has its own color). Using color in this way may be a non-standard use of a sequence diagram, but it's good for giving an overview of how and where threads interract.
As others have mentioned though, reducing the amount of threading in your design to the absolute minimum it needs to work properly will help a lot as well.
It depends what your threads do. Typically programs have a main thread that does the thinking and worker threads to do parallel tasks (timers, handling long computations on a GUI, etc.) But your app may be different - it depends on your design. What do you use threads for? What locks do you have to protect shared datastructures? If you use multiple locks, do you have a single order in which you lock to prevent deadlocks?
Try to use collections from java.util.concurrent package or even better immutable collections from Google Collections.
Read about using synchronized blocks