What is the worst that can happen in a Java race condition? - java

I know this is clearly a race condition. But what are the possible things that can happen?
class Blah {
    List<String> stuff;

    public List<String> getStuff() {
        return stuff;
    }

    public void setStuff(List<String> newValue) {
        this.stuff = newValue;
    }
}

Blah b = new Blah();

// Thread one
b.setStuff(getListFromSomeNetworkResource());
for (String c : b.getStuff()) {
    // Work with c
}

// Thread two
b.setStuff(getListFromSomeNetworkResource());
for (String c : b.getStuff()) {
    // Work with c
}
Can this throw a RuntimeException?
Can this segfault the JVM?
Can this segfault one of the threads?
Does it depend on the processor? What if it is an Intel Xeon processor?
Can this throw a NullPointerException?
Thread 2 can read the contents set by Thread 1 and vice versa if the function actually returned different values
I understand this is a race condition and would not write such code. But how do I convince others not to?
Update:
Assumptions:
getListFromSomeNetworkResource() always returns a new ArrayList. Size may be 0 or more.
getListFromSomeNetworkResource() is thread safe.

Can this throw a RuntimeException?
No, if getListFromSomeNetworkResource() is thread-safe and doesn't return null.
Can this segfault the JVM?
Can this segfault one of the threads?
Does it depend on the processor? What if it is an Intel Xeon processor?
No.
Can this throw a NullPointerException?
Only if getListFromSomeNetworkResource() can return null.
Thread 2 can read the contents set by Thread 1 and vice versa if the function actually returned different values
Yes, this is likely to happen.

The danger would be an ordering such as:
Thread one: b.setStuff(getListFromSomeNetworkResource());
Thread two: b.setStuff(getListFromSomeNetworkResource());
Thread one: b.stuff.iterator() (via b.getStuff(), at the start of the for loop)
In this case, thread one may be iterating over the list that thread two set. That publication, from thread two to thread one, was done without any synchronization -- that's a data race. Assuming that list is not itself thread-safe, lots of things can happen. The main issue would be that some of the list's state is visible to thread one, but not all of it, due to that data race.
It may throw some RuntimeException. For instance, one field might say that the list has n elements after a resize, but the new array that came from that resize didn't make it over, so you end up with an ArrayIndexOutOfBoundsException.
It may throw a NullPointerException for any number of reasons; maybe it's a linked list, and one of the reference writes didn't make it to thread one.
It should not cause any segfaults: those can only come about from bugs in the JVM, but never bugs in your code.
It may depend on the processor, in that processors may have different handling of things like flushing memory from one CPU cache to another -- that's one of the reasons that unsafe publication can cause you to see only some of the data that one thread wrote, from another thread. There are ways to force those caches to get flushed; the way to specify them in Java is through the various data synchronization mechanisms (acquiring locks, using volatile fields, etc).
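One way to close the data race described above is to publish the list through a volatile field. The following is only a minimal sketch under stated assumptions: SafeBlah, SafeBlahDemo and fetch() are hypothetical names (fetch() stands in for getListFromSomeNetworkResource()), and the defensive copy assumes nobody mutates a list after handing it over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SafeBlah {
    // volatile: the write in setStuff happens-before any later read in getStuff,
    // so a reader that sees the new reference also sees the list's contents.
    private volatile List<String> stuff = new ArrayList<>();

    List<String> getStuff() {
        return stuff;
    }

    void setStuff(List<String> newValue) {
        // Defensive copy (an assumption): the published list is never mutated afterwards.
        this.stuff = new ArrayList<>(newValue);
    }
}

class SafeBlahDemo {
    // Hypothetical stand-in for getListFromSomeNetworkResource(); always returns 3 items.
    static List<String> fetch(String tag) {
        return Arrays.asList(tag + "-1", tag + "-2", tag + "-3");
    }

    // Publishes a list and iterates over whichever list is currently published.
    static int setAndCount(SafeBlah b, String tag) {
        b.setStuff(fetch(tag));
        int n = 0;
        for (String c : b.getStuff()) {
            n++; // "work with c"
        }
        return n;
    }

    // Runs the two-thread scenario from the question; true if every iteration saw a complete list.
    static boolean runTwoThreads() {
        final SafeBlah b = new SafeBlah();
        final AtomicInteger bad = new AtomicInteger();
        Runnable work = () -> {
            String tag = Thread.currentThread().getName();
            for (int i = 0; i < 1000; i++) {
                if (setAndCount(b, tag) != 3) {
                    bad.incrementAndGet();
                }
            }
        };
        Thread one = new Thread(work, "one");
        Thread two = new Thread(work, "two");
        one.start();
        two.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return bad.get() == 0;
    }
}
```

A thread may still iterate over the list that the other thread published, but it will always be a fully visible list. If each thread must only see its own list, keeping the result in a local variable instead of reading it back through the shared object avoids the sharing altogether.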

Related

Is it thread-safe to synchronize only on add to HashSet?

Imagine having a main thread which creates a HashSet and starts a lot of worker threads passing HashSet to them.
Just like in code below:
void main() {
    final Set<String> set = new HashSet<>();
    final ExecutorService threadExecutor =
            Executors.newFixedThreadPool(10);
    threadExecutor.submit(() -> doJob(set));
}

void doJob(final Set<String> pSet) {
    // do some stuff
    final String x = ... // doesn't matter how we received the value.
    if (!pSet.contains(x)) {
        synchronized (pSet) {
            // double check to prevent multiple adds within different threads
            if (!pSet.contains(x)) {
                // do some exclusive work with x.
                pSet.add(x);
            }
        }
    }
    // do some stuff
}
I'm wondering whether it is thread-safe to synchronize only on the add method. Are there any possible issues if contains is not synchronized?
My intuition tells me this is fine: after leaving the synchronized block, changes made to the set should be visible to all threads. But the JMM can be counter-intuitive sometimes.
P.S. I don't think it's a duplicate of How to lock multiple resources in java multithreading
Even though the answers to both could be similar, this question addresses a more particular case.
I'm wondering is it thread-safe to synchronize only on the add method? Are there any possible issues if contains is not synchronized as well?
Short answers: No and Yes.
There are two ways of explaining this:
The intuitive explanation
Java synchronization (in its various forms) guards against a number of things, including:
Two threads updating shared state at the same time.
One thread trying to read state while another is updating it.
Threads seeing stale values because memory caches have not been written to main memory.
In your example, synchronizing on add is sufficient to ensure that two threads cannot update the HashSet simultaneously, and that both calls will be operating on the most recent HashSet state.
However, if contains is not synchronized as well, a contains call could happen simultaneously with an add call. This could lead to the contains call seeing an intermediate state of the HashSet, leading to an incorrect result, or worse. This can also happen if the calls are not simultaneous, due to changes not being flushed to main memory immediately and/or the reading thread not reading from main memory.
The Memory Model explanation
The JLS specifies the Java Memory Model which sets out the conditions that must be fulfilled by a multi-threaded application to guarantee that one thread sees the memory updates made by another. The model is expressed in mathematical language, and not easy to understand, but the gist is that visibility is guaranteed if and only if there is a chain of happens before relationships from the write to a subsequent read. If the write and read are in different threads, then synchronization between the threads is the primary source of these relationships. For example in
// thread one
synchronized (sharedLock) {
    sharedVariable = 42;
}

// thread two
synchronized (sharedLock) {
    other = sharedVariable;
}
Assuming that the thread one code is run before the thread two code, there is a happens-before relationship between thread one releasing the lock and thread two acquiring it. With this and the "program order" relations, we can build a chain from the write of 42 to the assignment to other. This is sufficient to guarantee that other will be assigned 42 (or possibly a later value of the variable) and NOT any value in sharedVariable before 42 was written to it.
Without the synchronized block synchronizing on the same lock, the second thread could see a stale value of sharedVariable; i.e. some value written to it before 42 was assigned to it.
That code is thread safe for the synchronized (pSet) { } part:
if (!pSet.contains(x)) {
    synchronized (pSet) {
        // Here you are sure to have the updated value of pSet
        if (!pSet.contains(x)) {
            // do some exclusive work with x.
            pSet.add(x);
        }
    }
}
because inside the synchronized statement on the pSet object:
one and only one thread may be in this block.
and inside it, pSet is guaranteed to be seen in its updated state, through the happens-before relationship established by the synchronized keyword.
So whatever value the first if (!pSet.contains(x)) check returned for a waiting thread, when that thread wakes up and enters the synchronized statement, it will see the latest state of pSet. So even if the same element was added by a previous thread, the second if (!pSet.contains(x)) check would evaluate to false.
But this code is not thread safe for the first if (!pSet.contains(x)) statement, which could be executed while another thread is writing to the Set.
As a rule of thumb, a collection not designed to be thread safe should not be used for concurrent reads and writes, because the internal state of the collection could be in an in-progress or inconsistent state when a read occurs during a write.
While some non-thread-safe collection implementations tolerate such usage in practice, there is no guarantee at all that this will always be true.
So you should use a thread-safe Set implementation to make the whole thing thread safe.
For example with :
Set<String> pSet = ConcurrentHashMap.newKeySet();
That uses a ConcurrentHashMap under the hood, so there is no lock for reading and a minimal lock for writing (only on the entry to modify and not the whole structure).
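With a concurrent set, the check and the insert can be collapsed into a single atomic add call, because Set.add returns true only for the caller that actually inserted the element. This is a sketch, not the asker's code: ExclusiveAdd, doJob and race are hypothetical names, and the counter stands in for the "exclusive work with x".

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ExclusiveAdd {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final AtomicInteger exclusiveRuns = new AtomicInteger();

    void doJob(String x) {
        // add() is atomic: exactly one caller per distinct value gets true.
        if (seen.add(x)) {
            exclusiveRuns.incrementAndGet(); // stand-in for the exclusive work with x
        }
    }

    int exclusiveRuns() {
        return exclusiveRuns.get();
    }

    // Starts several threads that all try to process the same value; returns how many "won".
    static int race(int threads, String x) {
        final ExclusiveAdd job = new ExclusiveAdd();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> job.doJob(x));
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return job.exclusiveRuns();
    }
}
```

Note one difference from the original: here the element becomes visible in the set before the exclusive work has finished. If other threads must not see x until the work is done, a lock around both steps is still needed.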
No.
You don't know what state the HashSet might be in during an add by another thread. There might be fundamental changes in progress, such as the splitting of buckets, so that contains may return false while another thread is adding, even though the element would be found in a single-threaded HashSet. In that case you would try to add the element a second time.
An even worse scenario: contains might get into an endless loop or throw an exception because of a temporarily invalid state of the HashSet in the memory used by the two threads at the same time.

How should I maintain a cache of values read from a file?

Setup
There is a program running that is performing arbitrary computations and writing a status (an integer value, representing progress) to a file. The integer values can only be incremented.
Now I am developing another application that can (among other things) perform arithmetic operations, e.g., comparisons, on those integer values. The files are continually deleted and rewritten by a different program. As such, there is no guarantee that a file exists at any given time.
Basically, the application needs to execute something arbitrary, but has a constraint on the other program's progress, i.e., it may only execute something if the other program has done enough work.
Problem
When performing the arithmetic operations, the application should not care about where the integer values come from. Especially, accessing those integer values must not throw an exception. How should I separate all the bad things that can happen when performing io access?
Note that I do not want the execution thread to block until a value can be read from the file. E.g., say the file system dies somehow, then the integer values will not be updated, but the main thread should still continue to work. This desire is driven by the definition of the arithmetic comparison as a predicate, which has exactly two outcomes, true and false, but no third "error"-outcome. That's why I think that the values that are read from the file would need to be cached somehow.
Limitation
Java 1.7, Scala 2.11
Current Approach
I have a solution that looks as if it would work, but I am not sure whether something could go wrong.
The solution is to maintain a cache of those integer values for each file. The core functionality is provided by the getters of the cache, while a separate "updater" thread constantly reads the files and updates the caches.
If an error occurs the producer should take notice (i.e., log the error), but continue to run, because an incomplete computation should not affect subsequent computations.
A minimal example of what I am currently doing would look something like this:
import java.io.InputStream
import java.nio.file.{Files, Path, Paths}

object Application {
  def main(args: Array[String]) {
    val caches = args.map(filename => new Cache(Paths.get(filename)))
    val producer = new Thread(new Updater(caches))
    producer.start()
    execute(caches)
    producer.interrupt()
  }
  def execute(values: Seq[AccessValue]) {
    while (values.head.getValue < 5) {/* This should never throw an exception */}
  }
  class Updater(caches: Array[Cache]) extends Runnable {
    def run() {
      var interrupted = false
      while(!interrupted) {
        caches.foreach{cache =>
          try {
            val input = Files.newInputStream(cache.file)
            cache.updateValue(parse(input))
          } catch {
            case _: InterruptedException =>
              interrupted = true
            case t: Throwable =>
              log.error(t)
              /*continue as if nothing happened*/
          }
        }
      }
    }
    def parse(input: InputStream): Int = input.read() /* In reality, some xml parsing */
  }
}
trait AccessValue{
  def getValue: Int // should not throw an exception
}
class Cache(val file: Path) extends AccessValue{
  private var value = 0
  def getValue = value
  def updateValue(newValue: Int) { value = newValue }
}
Doing it like this works on a synthetic test setup, but I am wondering whether something bad can happen. Also, if anyone would approach the problem differently, I would be glad to hear how.
Could there be a throwable that could cause other threads to go wild? I am thinking of something like OutOfMemoryError or StackOverflowError. Would I need to handle them differently, or does it not matter, because, e.g., the whole application would die anyway?
What would happen if the InterruptedException is thrown outside the try block, or even in the catch block? Is there a better way to terminate a thread?
Must the member value of class Cache be declared volatile? I do not care much about the ordering of reads and writes, but the compiler must not "optimize" reading the value away just because it deduces that the value is constant.
There are a lot of different concurrency-related libraries. Would you suggest using something other than new Thread(...).start()? If yes, what facility do you suggest? I know of Scala's ExecutionContext, Futures, and Java's Executors class, which provides various static factory methods for thread pools. However, I have never used any of these before and I do not know their advantages and disadvantages. I also stumbled upon the name "Akka", but my guess is that using Akka is overkill for what I want to achieve.
Thank you
I would recommend reading through Oracle's documentation on concurrency.
When one thread writes a value and a different thread reads it, you should always use a synchronized block or declare that value as volatile. Otherwise there is no guarantee that the value written by one thread is visible to the other thread (see Oracle's documentation on establishing a happens-before relationship).
An OutOfMemoryError can affect the other threads, because the heap space it refers to is shared among threads. A StackOverflowError would kill only the thread in which it occurs, because each thread has its own stack.
If you do not need some sort of synchronization between the two threads then you probably do not need any Futures or Executors.
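To make the volatile advice concrete, here is a rough Java sketch of the cache and updater, written without lambdas so it stays within the Java 1.7 limitation. Everything in it is an assumption for illustration: ProgressCache, Updater, Source and CacheDemo are hypothetical names, Source stands in for reading and parsing the file, and the polling interval is arbitrary.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class ProgressCache {
    // volatile: a value written by the updater thread becomes visible to readers.
    private volatile int value = 0;

    int getValue() {            // never blocks, never throws
        return value;
    }

    void updateValue(int newValue) {
        value = newValue;
    }
}

class Updater implements Runnable {
    // Hypothetical abstraction over "open the file and parse the integer".
    interface Source {
        int read() throws Exception;
    }

    private final ProgressCache cache;
    private final Source source;
    private final long pollMillis;

    Updater(ProgressCache cache, Source source, long pollMillis) {
        this.cache = cache;
        this.source = source;
        this.pollMillis = pollMillis;
    }

    public void run() {
        // The interrupt flag is the termination signal; it is checked on every pass.
        while (!Thread.currentThread().isInterrupted()) {
            try {
                cache.updateValue(source.read());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            } catch (Exception e) {
                // Exception, not Throwable: Errors such as OutOfMemoryError still propagate.
                System.err.println("update failed, keeping old value: " + e);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

class CacheDemo {
    // Runs the updater against a flaky source; true if the cache progressed and the thread stopped.
    static boolean run() {
        final ProgressCache cache = new ProgressCache();
        final AtomicInteger calls = new AtomicInteger();
        Updater.Source flaky = new Updater.Source() {
            public int read() throws Exception {
                int n = calls.incrementAndGet();
                if (n % 2 == 0) {
                    throw new IOException("file missing"); // simulated I/O failure
                }
                return n;
            }
        };
        Thread producer = new Thread(new Updater(cache, flaky, 1));
        producer.start();
        long deadline = System.currentTimeMillis() + 5000;
        while (cache.getValue() < 5 && System.currentTimeMillis() < deadline) {
            Thread.yield(); // the predicate only ever reads the cached value
        }
        producer.interrupt();
        try {
            producer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cache.getValue() >= 5 && !producer.isAlive();
    }
}
```

Checking the interrupt flag in the loop condition and restoring it in the catch blocks means an interrupt is honoured no matter where it arrives, which is one possible answer to the question about interrupts outside the try block.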

Java: is using synchronized(this) an advisable practice when creating a ConcurrentHashMap object?

I just finished developing a java web service server for a distributed programming course I am attending. One of the requirements was to guarantee multi-thread safety to our project hence I decided to use ConcurrentHashMap objects to store my data.
At the end of it all I am left with a question regarding this snippet of code:
public List<THost> getHList() throws ClusterUnavailable_Exception {
    logger.entering(logger.getName(), "getHList");
    if (hMap == null) {
        synchronized (this) {
            if (hMap == null) {
                hMap = createHMap();
            }
        }
    }
    if (hMap == null) {
        ClusterUnavailable cu = new ClusterUnavailable();
        cu.setMessage("Data unavailable.");
        // pass the populated fault object instead of a fresh, empty one
        throw new ClusterUnavailable_Exception("Data unavailable.", cu);
    } else {
        List<THost> hList = new ArrayList<THost>(hMap.values());
        logger.info("Returning list of hosts. Number of hosts returned = " + hList.size());
        logger.exiting(logger.getName(), "getHList");
        return hList;
    }
}
Do I have to use the synchronized statement when creating the ConcurrentHashMap object itself in order to guarantee that the service will not have any unpredictable behavior in a multi-threaded environment?
Don't bother. Eagerly initialize the Map, make the field final, and drop the synchronization until you have proven that it is actually necessary. The cost is minuscule and the "obviously safe and correct" solution will almost never be too slow.
You mentioned this is a class project -- focus on getting the code working. Concurrency is hard enough without inventing additional obstacles that you must then hurdle over.
The simple solution is to avoid the problem by eagerly initializing. And unless you have clear evidence (i.e. profiling) that eager initialization is a performance problem, that is also the best solution.
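Eager initialization could look roughly like the sketch below. It assumes the map can start out empty and be filled later; HostRegistry, register and the nested THost class are hypothetical stand-ins for the asker's types.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class HostRegistry {
    // Hypothetical stand-in for the web service's THost type.
    static final class THost {
        final String name;

        THost(String name) {
            this.name = name;
        }
    }

    // final + eager initialization: the map is safely published with the object,
    // so no null check and no synchronized block are needed.
    private final ConcurrentMap<BigInteger, THost> hMap = new ConcurrentHashMap<>();

    void register(BigInteger id, THost host) {
        hMap.put(id, host);
    }

    List<THost> getHList() {
        // Copy, so callers get a stable list even while the map keeps changing.
        return new ArrayList<>(hMap.values());
    }
}
```

For example, a fresh HostRegistry returns an empty list from getHList(), and after register(BigInteger.ONE, new HostRegistry.THost("a")) it returns a list with one element.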
As to your question, the answer is that the synchronized block is necessary for correctness. Without it you can get the following sequence of events.
thread 1 calls getHList()
thread 1 sees that hMap is null and starts to create a map.
thread 2 calls getHList()
thread 2 sees that hMap is null and starts to create a map.
thread 1 finishes creating, and assigns the new map to hMap, and returns that map.
thread 2 finishes creating, and assigns the second new map to hMap, and returns that map.
In short, thread 1 and thread 2 could get different maps if they simultaneously call getHList() while hMap has its initial null value.
(In the above, I'm assuming that getHList() is a getter for hMap. However, the method as written won't compile, and its declared return type doesn't match the type of hMap ... so it is unclear what it is really intended to do.)
The line below has nothing to do with ConcurrentHashMap as such. It's just creating an instance of a ConcurrentHashMap object.
It's just like synchronizing any other object creation in Java.
hMap=new ConcurrentHashMap<BigInteger, THost>();
Double-checked locking is broken before Java 1.5 (and is inefficient in Java 1.6 and later). From Java 5 onward it is only correct if the field is declared volatile. See: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
Consider using an initialization-on-demand holder or a single-element enum type.
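The initialization-on-demand holder idiom relies on the JVM's class initialization guarantees instead of explicit locking. A minimal sketch, assuming the map can live in a static field; LazyHosts, Holder, hosts() and the creations counter are hypothetical names, and String stands in for THost:

```java
import java.math.BigInteger;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class LazyHosts {
    // Only for the demonstration: counts how often the map was built.
    static final AtomicInteger creations = new AtomicInteger();

    // The nested class is not initialized until hosts() first touches Holder.MAP.
    // Class initialization is thread-safe and runs at most once.
    private static class Holder {
        static final ConcurrentMap<BigInteger, String> MAP = createHMap();
    }

    private static ConcurrentMap<BigInteger, String> createHMap() {
        creations.incrementAndGet();
        ConcurrentMap<BigInteger, String> m = new ConcurrentHashMap<>();
        m.put(BigInteger.ONE, "host-1"); // placeholder data
        return m;
    }

    static ConcurrentMap<BigInteger, String> hosts() {
        return Holder.MAP;
    }
}
```

Every call to LazyHosts.hosts() returns the same map instance, and createHMap() runs once regardless of how many threads call it concurrently.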

specific question on java threading + synchronization

I know this question sounds crazy, but consider the following java snippets:
Part - I:
class Consumer implements Runnable {
    private boolean shouldTerminate = false;

    public void run() {
        while (!shouldTerminate) {
            // consume and perform some operation.
        }
    }

    public void terminate() {
        this.shouldTerminate = true;
    }
}
So, the first question is: do I ever need to synchronize on the shouldTerminate boolean? If so, why? I don't mind missing the flag being set to true for one or two cycles (cycle = one loop execution). And second, can a boolean variable ever be in an inconsistent state (anything other than true or false)?
Part - II of the question:
class Cache<K, V> {
    private Map<K, V> cache = new HashMap<K, V>();

    public V getValue(K key) {
        if (!cache.containsKey(key)) {
            synchronized (this.cache) {
                V value = loadValue(key);
                cache.put(key, value);
            }
        }
        return cache.get(key);
    }
}
Should access to the whole map be synchronized? Is there any possibility that two threads run this method, with one "writer thread" halfway through storing a value into the map while, simultaneously, a "reader thread" invokes the "contains" method? Will this cause the JVM to blow up? (I don't mind overwriting values in the map if two writer threads try to load at the same time.)
Both of the code examples have broken concurrency.
The first one requires at least the field marked volatile or else the other thread might never see the variable being changed (it may store its value in CPU cache or a register, and not check whether the value in memory has changed).
The second one is even more broken, because the internals of HashMap are not thread-safe and it's not just a single value but a complex data structure; using it from many threads produces completely unpredictable results. The general rule is that both reading and writing the shared state must be synchronized. You may also use ConcurrentHashMap for better performance.
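Both fixes can be sketched in a few lines. This is an illustration rather than the asker's code: StoppableConsumer, LoadingCache and StopDemo are hypothetical names, the loader function replaces loadValue, and computeIfAbsent requires Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class StoppableConsumer implements Runnable {
    // volatile: the write in terminate() is guaranteed to become visible to run().
    private volatile boolean shouldTerminate = false;

    public void run() {
        while (!shouldTerminate) {
            Thread.yield(); // stand-in for "consume and perform some operation"
        }
    }

    public void terminate() {
        shouldTerminate = true;
    }
}

class LoadingCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed never to return null

    LoadingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V getValue(K key) {
        // Atomic check-then-load: safe for concurrent readers and writers.
        return cache.computeIfAbsent(key, loader);
    }
}

class StopDemo {
    // Starts the consumer, asks it to stop, and reports whether it actually stopped.
    static boolean stops() {
        StoppableConsumer consumer = new StoppableConsumer();
        Thread t = new Thread(consumer);
        t.start();
        consumer.terminate();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

For instance, new LoadingCache<String, Integer>(String::length).getValue("abc") loads and caches 3, and later calls with the same key return the cached value without invoking the loader again.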
Unless you either synchronize on the variable, or mark the variable as volatile, there is no guarantee that separate threads' views of the object ever get reconciled. To quote the Wikipedia article on the Java Memory Model:
The major caveat of this is that as-if-serial semantics do not prevent different threads from having different views of the data.
Realistically, so long as the two threads synchronize on some lock at some time, the update to the variable will be seen.
I am wondering why you wouldn't want to mark the variable volatile?
It's not that the JVM will "blow up" as such. But both cases are incorrectly synchronised, and so the results will be unpredictable. The bottom line is that JVMs are designed to behave in a particular way if you synchronise in a particular way; if you don't synchronise correctly, you lose that guarantee.
It's not uncommon for people to think they've found a reason why certain synchronisation can be omitted, or to unknowingly omit necessary synchronisation but with no immediately obvious problem. But with inadequate synchronisation, there is a danger that your program could appear to work fine in one environment, only for an issue to appear later when a particular factor is changed (e.g. moving to a machine with more CPUs, or an update to the JVM that adds a particular optimisation).
Synchronizing shouldTerminate: see Dilum's answer.
Your boolean value will never be in an inconsistent state.
If one thread is calling cache.containsKey(key) while another thread is calling cache.put(key, value), the JVM will not blow up with a ConcurrentModificationException. Something bad might happen if that put call caused the map to grow, but it will usually mostly work (which is worse than failure).

Java concurrency scenario -- do I need synchronization or not?

Here's the deal. I have a hash map containing data I call "program codes", it lives in an object, like so:
class Metadata
{
    private HashMap validProgramCodes;
    public HashMap getValidProgramCodes() { return validProgramCodes; }
    public void setValidProgramCodes(HashMap h) { validProgramCodes = h; }
}
I have lots and lots of reader threads each of which will call getValidProgramCodes() once and then use that hashmap as a read-only resource.
So far so good. Here's where we get interesting.
I want to put in a timer which every so often generates a new list of valid program codes (never mind how), and calls setValidProgramCodes.
My theory -- which I need help to validate -- is that I can continue using the code as is, without putting in explicit synchronization. It goes like this:
At the time that validProgramCodes are updated, the value of validProgramCodes is always good -- it is a pointer to either the new or the old hashmap. This is the assumption upon which everything hinges. A reader who has the old hashmap is okay; he can continue to use the old value, as it will not be garbage collected until he releases it. Each reader is transient; it will die soon and be replaced by a new one who will pick up the new value.
Does this hold water? My main goal is to avoid costly synchronization and blocking in the overwhelming majority of cases where no update is happening. We only update once per hour or so, and readers are constantly flickering in and out.
Use Volatile
Is this a case where one thread cares what another is doing? Then the JMM FAQ has the answer:
Most of the time, one thread doesn't care what the other is doing. But when it does, that's what synchronization is for.
In response to those who say that the OP's code is safe as-is, consider this: There is nothing in Java's memory model that guarantees that this field will be flushed to main memory when a new thread is started. Furthermore, a JVM is free to reorder operations as long as the changes aren't detectable within the thread.
Theoretically speaking, the reader threads are not guaranteed to see the "write" to validProgramCodes. In practice, they eventually will, but you can't be sure when.
I recommend declaring the validProgramCodes member as "volatile". The speed difference will be negligible, and it will guarantee the safety of your code now and in future, whatever JVM optimizations might be introduced.
Here's a concrete recommendation:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class Metadata {
    private volatile Map validProgramCodes = Collections.emptyMap();

    public Map getValidProgramCodes() {
        return validProgramCodes;
    }

    public void setValidProgramCodes(Map h) {
        if (h == null)
            throw new NullPointerException("validProgramCodes == null");
        // copy, then wrap: readers get an immutable snapshot
        validProgramCodes = Collections.unmodifiableMap(new HashMap(h));
    }
}
Immutability
In addition to wrapping it with unmodifiableMap, I'm copying the map (new HashMap(h)). This makes a snapshot that won't change even if the caller of setter continues to update the map "h". For example, they might clear the map and add fresh entries.
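The effect of the copy plus the unmodifiable wrapper can be checked with a small experiment. This sketch uses a generic variant of the class above under the hypothetical name SnapshotMetadata; the keys and values are made up.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SnapshotMetadata {
    private volatile Map<String, String> validProgramCodes = Collections.emptyMap();

    Map<String, String> getValidProgramCodes() {
        return validProgramCodes;
    }

    void setValidProgramCodes(Map<String, String> h) {
        if (h == null)
            throw new NullPointerException("validProgramCodes == null");
        validProgramCodes = Collections.unmodifiableMap(new HashMap<>(h));
    }
}

class SnapshotDemo {
    // True if the published map is unaffected by later changes to the caller's map
    // and rejects modification attempts by readers.
    static boolean snapshotIsIsolated() {
        SnapshotMetadata meta = new SnapshotMetadata();
        Map<String, String> h = new HashMap<>();
        h.put("A1", "alpha");
        meta.setValidProgramCodes(h);
        Map<String, String> seen = meta.getValidProgramCodes();

        // The caller keeps mutating its own map after publishing it.
        h.clear();
        h.put("B2", "beta");
        boolean unchanged = seen.size() == 1 && "alpha".equals(seen.get("A1"));

        boolean readOnly;
        try {
            seen.put("C3", "gamma");
            readOnly = false;
        } catch (UnsupportedOperationException e) {
            readOnly = true; // the wrapper refuses writes from readers
        }
        return unchanged && readOnly;
    }
}
```

Without the new HashMap<>(h) copy, the clear() call on the caller's map would show through the unmodifiable view, since unmodifiableMap only wraps the underlying map.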
Depend on Interfaces
On a stylistic note, it's often better to declare APIs with abstract types like List and Map, rather than concrete types like ArrayList and HashMap. This gives flexibility in the future if concrete types need to change (as I did here).
Caching
The result of assigning "h" to "validProgramCodes" may simply be a write to the processor's cache. Even when a new thread starts, "h" will not be visible to a new thread unless it has been flushed to shared memory. A good runtime will avoid flushing unless it's necessary, and using volatile is one way to indicate that it's necessary.
Reordering
Assume the following code:
HashMap codes = new HashMap();
codes.putAll(source);
meta.setValidProgramCodes(codes);
If setValidProgramCodes is simply the OP's validProgramCodes = h;, the compiler is free to reorder the code into something like this:
1: meta.validProgramCodes = codes = new HashMap();
2: codes.putAll(source);
Suppose after execution of writer line 1, a reader thread starts running this code:
1: Map codes = meta.getValidProgramCodes();
2: Iterator i = codes.entrySet().iterator();
3: while (i.hasNext()) {
4: Map.Entry e = (Map.Entry) i.next();
5: // Do something with e.
6: }
Now suppose that the writer thread calls "putAll" on the map between the reader's line 2 and line 3. The map underlying the Iterator has experienced a concurrent modification, and throws a runtime exception—a devilishly intermittent, seemingly inexplicable runtime exception that was never produced during testing.
Concurrent Programming
Any time you have one thread that cares what another thread is doing, you must have some sort of memory barrier to ensure that actions of one thread are visible to the other. If an event in one thread must happen before an event in another thread, you must indicate that explicitly. There are no guarantees otherwise. In practice, this means volatile or synchronized.
Don't skimp. It doesn't matter how fast an incorrect program fails to do its job. The examples shown here are simple and contrived, but rest assured, they illustrate real-world concurrency bugs that are incredibly difficult to identify and resolve due to their unpredictability and platform-sensitivity.
Additional Resources
The Java Language Specification - 17 Threads and Locks sections: §17.3 and §17.4
The JMM FAQ
Doug Lea's concurrency books
No, the code example is not safe, because there is no safe publication of any new HashMap instances. Without any synchronization, there is a possibility that a reader thread will see a partially initialized HashMap.
Check out @erickson's explanation under "Reordering" in his answer. Also I can't recommend Brian Goetz's book Java Concurrency in Practice enough!
Whether or not it is okay with you that reader threads might see old (stale) HashMap references, or might even never see a new reference, is beside the point. The worst thing that can happen is that a reader thread might obtain reference to and attempt to access a HashMap instance that is not yet initialized and not ready to be accessed.
No, by the Java Memory Model (JMM), this is not thread-safe.
There is no happens-before relation between writing and reading the HashMap implementation objects. So, although the writer thread appears to write out the object first and then the reference, a reader thread may not see the same order.
As also mentioned, there is no guarantee that the reader thread will ever see the new value. In practice, with current compilers on existing hardware, the value should get updated, unless the loop body is small enough to be sufficiently inlined.
So, making the reference volatile is adequate under the new JMM. It is unlikely to make a substantial difference to system performance.
The moral of this story: Threading is difficult. Don't try to be clever, because sometimes (maybe not on your test system) you won't be clever enough.
As others have already noted, this is not safe and you shouldn't do this. You need either volatile or synchronized here to force other threads to see the change.
What hasn't been mentioned is that synchronized and especially volatile are probably a lot faster than you think. If it's actually a performance bottleneck in your app, then I'll eat this web page.
Another option (probably slower than volatile, but YMMV) is to use a ReentrantReadWriteLock to protect access so that multiple concurrent readers can read it. And if that's still a performance bottleneck, I'll eat this whole web site.
import java.util.HashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Metadata
{
    private HashMap validProgramCodes;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public HashMap getValidProgramCodes() {
        lock.readLock().lock();
        try {
            return validProgramCodes;
        } finally {
            lock.readLock().unlock();
        }
    }

    public void setValidProgramCodes(HashMap h) {
        lock.writeLock().lock();
        try {
            validProgramCodes = h;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
I think your assumptions are correct. The only thing I would do is make validProgramCodes volatile.
private volatile HashMap validProgramCodes;
This way, when you update the "pointer" of validProgramCodes, you guarantee that all threads access the same latest HashMap "pointer", because they don't rely on a thread-local cache and go directly to memory.
The assignment will work as long as you're not concerned about reading stale values, and as long as you can guarantee that your hashmap is properly populated on initialization. You should at least wrap the HashMap with Collections.unmodifiableMap to guarantee that your readers won't be changing or deleting objects in the map, and to avoid multiple threads stepping on each other's toes and invalidating iterators when other threads modify the map.
(The writer above is right about the volatile; I should've seen that.)
While this is not the best solution for this particular problem (erickson's idea of a new unmodifiableMap is), I'd like to take a moment to mention the java.util.concurrent.ConcurrentHashMap class introduced in Java 5, a version of HashMap specifically built with concurrency in mind. This construct does not block on reads.
Check this post about concurrency basics. It should be able to answer your question satisfactorily.
http://walivi.wordpress.com/2013/08/24/concurrency-in-java-a-beginners-introduction/
I think it's risky. Threading results in all kinds of subtle issues that are a giant pain to debug. You might want to look at FastHashMap, which is intended for read-only threading cases like this.
At the least, I'd also declare validProgramCodes to be volatile so that the reference won't get optimized into a register or something.
If I read the JLS correctly (no guarantees there!), accesses to references are always atomic, period. See Section 17.7 Non-atomic Treatment of double and long
So, if the access to a reference is always atomic and it doesn't matter what instance of the returned Hashmap the threads see, you should be OK. You won't see partial writes to the reference, ever.
Edit: After review of the discussion in the comments below and other answers, here are references/quotes from Doug Lea's book (Concurrent Programming in Java, 2nd Ed), p 94, section 2.2.7.2 Visibility, item #3:
"The first time a thread accesses a field of an object, it sees either the initial value of the field or the value since written by some other thread."
On p. 94, Lea goes on to describe risks associated with this approach:
The memory model guarantees that, given the eventual occurrence of the above operations, a particular update to a particular field made by one thread will eventually be visible to another. But eventually can be an arbitrarily long time.
So when it absolutely, positively, must be visible to any calling thread, volatile or some other synchronization barrier is required, especially in long running threads or threads that access the value in a loop (as Lea says).
However, in the case where there is a short lived thread, as implied by the question, with new threads for new readers and it does not impact the application to read stale data, synchronization is not required.
@erickson's answer is the safest in this situation, guaranteeing that other threads will see the changes to the HashMap reference as they occur. I'd suggest following that advice simply to avoid the confusion over the requirements and implementation that resulted in the "down votes" on this answer and the discussion below.
I'm not deleting the answer in the hope that it will be useful. I'm not looking for the "Peer Pressure" badge... ;-)
