Related
I'm using Java to write a multithreading application. One question I have is I have a list that is accessed by multiple threads, and I have one thread trying to update it. However, each time, the update thread will create a new List and then make the public shared list point to the new List, like this:
Public List<DataObject> publicDataObject = XXXX; // <- this will be accessed by multiple threads
Then I have one thread updating this List:
List<DataObject> newDataObjectList = CreateNewDataObject();
publicDataObject = newDataObjectList;
When I update the pointer of publicDataObject, do I need a lock to make it thread-safe?
Before going into the answer, let's first check if I understand the situation and the question.
Assumption as stated in problem description: Only a single thread creates new versions of the list, based on the previous value of publicDataObject, and stores that new list in publicDataObject.
Derived assumption: Other threads access DataObject elements from that list, but do not add, remove or change the order of elements.
If this assumption holds, the answer is below.
Otherwise, please make sure your question includes this in its description. This makes the answer much more complex, though, and I advise you to study the topic of concurrency more, for example by reading a book about Java concurrency.
Additional assumption:The DataObject objects themselves are thread-safe.
If this assumption does not hold, this would make the scope of the question too broad and I would suggest to study the topic of concurrency more, for example by reading a book about Java concurrency.
Answer
Given that the above assumptions are true, you do not need a lock, however, you cannot just access publicDataObject from multiple threads, using its definition in you code example. The reason is the Java Memory Model. The Java Memory Model makes no guarantees whatsoever about threads seeing changes made by other threads, unless you use special language constructs like atomic variables, locks or synchronization.
Simplified, these constructs ensure that a read in one thread that happens after a write in another, can see that written value, as long as you are using the same construct: the same atomic variable, lock or synchronisation on the same object. Locks and intrinsic locks (used by synchronisation) can also ensure exclusive access of a single thread to a block of code.
Given, again, that the above assumptions are true, you can suffice using an AtomicReference, whose get and set methods have the desired relationship:
// Definition
public AtomicReference<List<DataObject>> publicDataObject;
The reasons that a simple construct can be used that "only" guarantees visibility are:
The list that publicDataObject refers to, is always a new one (the first assumption). Therefore, other threads can never see a stale value of the list itself. Making sure that threads see the correct value of publicDataObject is therefore enough
As long as other threads don't change the list.
If in addition, only thread sets publicDataObject, there can be no race conditions in setting it, for example loosing updates that are overwritten by more recent updates before ever being read.
Are actions in a thread prior to calling ConcurrentMap.remove() guaranteed to happen-before actions subsequent to seeing the removal from another thread?
Documentation says this regarding objects placed into the collection:
Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
Example code:
{
final ConcurrentMap map = new ConcurrentHashMap();
map.put(1, new Object());
final int[] value = { 0 };
new Thread(() -> {
value[0]++;
value[0]++;
value[0]++;
value[0]++;
value[0]++;
map.remove(1); // A
}).start();
new Thread(() -> {
if (map.get(1) == null) { // B
System.out.println(value[0]); // expect 5
}
}).start();
}
Is A in a happens-before relationship with B? Therefore, should the program only, if ever, print 5?
You have found an interesting subtle aspect of these concurrency tools that is easy to overlook.
First, it’s impossible to provide a general guaranty regarding removal and the retrieval of a null reference, as the latter only proves the absence of a mapping but not a previous removal, i.e. the thread could have read the map’s initial state, before the key ever had a mapping, which, of course, can’t establish a happens-before relationship with the actions that happened after the map’s construction.
Also, if there are multiple threads removing the same key, you can’t assume a happens-before relationship, when retrieving null, as you don’t know which removal has been completed. This issue is similar to the scenario when two threads insert the same value, but the latter can be fixed on the application side by only perform insertions of distinguishable values or by following the usual pattern of performing the desired modifications on the value object which is going to be inserted and to query the retrieved object only. For a removal, there is no such fix.
In your special case, there’s a happens-before relationship between the map.put(1, new Object()) action and the start of the second thread, so if the second thread encounters null when querying the key 1, it’s clear that it witnessed the sole removal of your code, still, the specification didn’t bother to provide an explicit guaranty for this special case.
Instead, the specification of Java 8’s ConcurrentHashMap says,
Retrievals reflect the results of the most recently completed update operations holding upon their onset. (More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.)
clearly ruling out null retrievals.
I think, with the current (Java 8) ConcurrentHashMap implementation, your code can’t break as it is rather conservative in that it performs all access to its internal backing array with volatile semantics. But that is only the current implementation and, as explained above, your code is a special case and likely to become broken with every change towards a real-life application.
No, you have the order wrong.
There is a happens-before edge from the put() to the subsequent get(). That edge is not symmetric, and doesn't work in the other direction. There is no happens-before edge from at get() to another get() or a remove(), or from a put() to another put().
In this case, you put an object in the map. Then you modify another object. That's a no-no. There's no edge from the those writes to the get() in the second thread, so those writes may not be visible to the second thread.
On Intel hardware, I think this will always work. However, it isn't guaranteed by the Java memory model, so you have to be wary if you ever port this code to different hardware.
A does not need to happen before B.
Only the original put happens before both. Thus a null at B means that A happened.
However write back of thread local memory cache and instruction order of ++ and remove are not mentioned. volatile is not used; instead a Map and an array are used to hopefully keep thread data synchrone. On writing the data back, in-order relation should hold again.
To my understanding A could remove and be written back, then the last ++ happen, and something like 4 being printed at B. I would add volatile to the array. The Map itself will go fine.
I am far from certain, but as I did not see a corresponding answer, I stick my neck out. (To learn myself.)
As ConcurrentHashMap is a thread safe collection, the statement map.remove(1) must have a read barrier and a write barrier if it alters the map. The expression map.get(1) must have a read barrier or one, or both of those operations are not thread safe.
In reality ConcurrentHashMap up to Java 7, uses partitioned locks, so it always has a read/write barrier for nearly every operation.
A ConcurrentSkipListMap doesn't have to use locks, but to perform any thread safe write action, a write barrier is required.
This means your test should always act as expected.
I have a controller class that runs in thread A and composes a local variable list like this
Thread A
list = new ArrayList<Map<String, Order>>();
list.add(...);
list.add(...);
where Order is a java bean with several primitive properties like String, int, long, etc.
Once this list is constructed, its reference is passed to a UI thread (thread B) of Activity and accessed there. The cross-thread communication is done using a Handler class + post() method.
So the question is, can I access the list data from thread B without synchronization at all? Please note, that after constructed in thread A, the list will not be accessed/modified at all. It just exists like a local variable and is passed to thread B afterwards.
It is safe. The synchronization done at the message queue establishes a happens-before relationship. This of course assumes that you don't modify the Maps either afterwards. Also any objects contained in the maps, and so on must not be modified by other threads without proper synchronization.
In short, if the list and none of the data within it are not modified by other threads than B, you don't need any further synchronization.
It is not clear from the context you provide that where does this happen:
list = new ArrayList<Map<String, Order>>();
list.add(...);
list.add(...);
If it is in a constructor and list is final and the this reference does not leak from the constructor and you are absolutely sure that list won't change (for example by using the unmodifiableList decorator method) and the references to the Order instances are not accessible from elsewhere than it may be OK to not use synchronization. Otherwise you have Sword of Damocles over your head.
I mentioned the Order references because you may not get exceptions if you change them from somewhere else but it may lead to data inconsistency/corruption.
If you can guarantee that the list will not be modified then you don't need synchonization since all threads will always see the same List.
Yes, no need to synchronize if you're only going to read the data.
Note that even if thread A is going to eventually modify the list while thread B (or any other number of threads) is accessing it, you still do not have to synchronize because there's only one writer at any given time.
Sorry, the above statement is not completely correct. As stated in the JavaDoc:
If multiple threads access an ArrayList instance concurrently, and at
least one of the threads modifies the list structurally, it must be
synchronized externally. (A structural modification is any operation
that adds or deletes one or more elements, or explicitly resizes the
backing array; merely setting the value of an element is not a
structural modification.)
Also note I'm not taking into account element modification but purely list modifications.
Here's the deal. I have a hash map containing data I call "program codes", it lives in an object, like so:
Class Metadata
{
private HashMap validProgramCodes;
public HashMap getValidProgramCodes() { return validProgramCodes; }
public void setValidProgramCodes(HashMap h) { validProgramCodes = h; }
}
I have lots and lots of reader threads each of which will call getValidProgramCodes() once and then use that hashmap as a read-only resource.
So far so good. Here's where we get interesting.
I want to put in a timer which every so often generates a new list of valid program codes (never mind how), and calls setValidProgramCodes.
My theory -- which I need help to validate -- is that I can continue using the code as is, without putting in explicit synchronization. It goes like this:
At the time that validProgramCodes are updated, the value of validProgramCodes is always good -- it is a pointer to either the new or the old hashmap. This is the assumption upon which everything hinges. A reader who has the old hashmap is okay; he can continue to use the old value, as it will not be garbage collected until he releases it. Each reader is transient; it will die soon and be replaced by a new one who will pick up the new value.
Does this hold water? My main goal is to avoid costly synchronization and blocking in the overwhelming majority of cases where no update is happening. We only update once per hour or so, and readers are constantly flickering in and out.
Use Volatile
Is this a case where one thread cares what another is doing? Then the JMM FAQ has the answer:
Most of the time, one thread doesn't
care what the other is doing. But when
it does, that's what synchronization
is for.
In response to those who say that the OP's code is safe as-is, consider this: There is nothing in Java's memory model that guarantees that this field will be flushed to main memory when a new thread is started. Furthermore, a JVM is free to reorder operations as long as the changes aren't detectable within the thread.
Theoretically speaking, the reader threads are not guaranteed to see the "write" to validProgramCodes. In practice, they eventually will, but you can't be sure when.
I recommend declaring the validProgramCodes member as "volatile". The speed difference will be negligible, and it will guarantee the safety of your code now and in future, whatever JVM optimizations might be introduced.
Here's a concrete recommendation:
import java.util.Collections;
class Metadata {
private volatile Map validProgramCodes = Collections.emptyMap();
public Map getValidProgramCodes() {
return validProgramCodes;
}
public void setValidProgramCodes(Map h) {
if (h == null)
throw new NullPointerException("validProgramCodes == null");
validProgramCodes = Collections.unmodifiableMap(new HashMap(h));
}
}
Immutability
In addition to wrapping it with unmodifiableMap, I'm copying the map (new HashMap(h)). This makes a snapshot that won't change even if the caller of setter continues to update the map "h". For example, they might clear the map and add fresh entries.
Depend on Interfaces
On a stylistic note, it's often better to declare APIs with abstract types like List and Map, rather than a concrete types like ArrayList and HashMap. This gives flexibility in the future if concrete types need to change (as I did here).
Caching
The result of assigning "h" to "validProgramCodes" may simply be a write to the processor's cache. Even when a new thread starts, "h" will not be visible to a new thread unless it has been flushed to shared memory. A good runtime will avoid flushing unless it's necessary, and using volatile is one way to indicate that it's necessary.
Reordering
Assume the following code:
HashMap codes = new HashMap();
codes.putAll(source);
meta.setValidProgramCodes(codes);
If setValidCodes is simply the OP's validProgramCodes = h;, the compiler is free to reorder the code something like this:
1: meta.validProgramCodes = codes = new HashMap();
2: codes.putAll(source);
Suppose after execution of writer line 1, a reader thread starts running this code:
1: Map codes = meta.getValidProgramCodes();
2: Iterator i = codes.entrySet().iterator();
3: while (i.hasNext()) {
4: Map.Entry e = (Map.Entry) i.next();
5: // Do something with e.
6: }
Now suppose that the writer thread calls "putAll" on the map between the reader's line 2 and line 3. The map underlying the Iterator has experienced a concurrent modification, and throws a runtime exception—a devilishly intermittent, seemingly inexplicable runtime exception that was never produced during testing.
Concurrent Programming
Any time you have one thread that cares what another thread is doing, you must have some sort of memory barrier to ensure that actions of one thread are visible to the other. If an event in one thread must happen before an event in another thread, you must indicate that explicitly. There are no guarantees otherwise. In practice, this means volatile or synchronized.
Don't skimp. It doesn't matter how fast an incorrect program fails to do its job. The examples shown here are simple and contrived, but rest assured, they illustrate real-world concurrency bugs that are incredibly difficult to identify and resolve due to their unpredictability and platform-sensitivity.
Additional Resources
The Java Language Specification - 17 Threads and Locks sections: §17.3 and §17.4
The JMM FAQ
Doug Lea's concurrency books
No, the code example is not safe, because there is no safe publication of any new HashMap instances. Without any synchronization, there is a possibility that a reader thread will see a partially initialized HashMap.
Check out #erickson's explanation under "Reordering" in his answer. Also I can't recommend Brian Goetz's book Java Concurrency in Practice enough!
Whether or not it is okay with you that reader threads might see old (stale) HashMap references, or might even never see a new reference, is beside the point. The worst thing that can happen is that a reader thread might obtain reference to and attempt to access a HashMap instance that is not yet initialized and not ready to be accessed.
No, by the Java Memory Model (JMM), this is not thread-safe.
There is no happens-before relation between writing and reading the HashMap implementation objects. So, although the writer thread appears to write out the object first and then the reference, a reader thread may not see the same order.
As also mentioned there is no guarantee that the reaer thread will ever see the new value. In practice with current compilers on existing hardware the value should get updated, unless the loop body is sufficienly small that it can be sufficiently inlined.
So, making the reference volatile is adequate under the new JMM. It is unlikely to make a substantial difference to system performance.
The moral of this story: Threading is difficult. Don't try to be clever, because sometimes (may be not on your test system) you wont be clever enough.
As others have already noted, this is not safe and you shouldn't do this. You need either volatile or synchronized here to force other threads to see the change.
What hasn't been mentioned is that synchronized and especially volatile are probably a lot faster than you think. If it's actually a performance bottleneck in your app, then I'll eat this web page.
Another option (probably slower than volatile, but YMMV) is to use a ReentrantReadWriteLock to protect access so that multiple concurrent readers can read it. And if that's still a performance bottleneck, I'll eat this whole web site.
public class Metadata
{
private HashMap validProgramCodes;
private ReadWriteLock lock = new ReentrantReadWriteLock();
public HashMap getValidProgramCodes() {
lock.readLock().lock();
try {
return validProgramCodes;
} finally {
lock.readLock().unlock();
}
}
public void setValidProgramCodes(HashMap h) {
lock.writeLock().lock();
try {
validProgramCodes = h;
} finally {
lock.writeLock().unlock();
}
}
}
I think your assumptions are correct. The only thing I would do is set the validProgramCodes volatile.
private volatile HashMap validProgramCodes;
This way, when you update the "pointer" of validProgramCodes you guaranty that all threads access the same latest HasMap "pointer" because they don't rely on local thread cache and go directly to memory.
The assignment will work as long as you're not concerned about reading stale values, and as long as you can guarantee that your hashmap is properly populated on initialization. You should at the least create the hashMap with Collections.unmodifiableMap on the Hashmap to guarantee that your readers won't be changing/deleting objects from the map, and to avoid multiple threads stepping on each others toes and invalidating iterators when other threads destroy.
( writer above is right about the volatile, should've seen that)
While this is not the best solution for this particular problem (erickson's idea of a new unmodifiableMap is), I'd like to take a moment to mention the java.util.concurrent.ConcurrentHashMap class introduced in Java 5, a version of HashMap specifically built with concurrency in mind. This construct does not block on reads.
Check this post about concurrency basics. It should be able to answer your question satisfactorily.
http://walivi.wordpress.com/2013/08/24/concurrency-in-java-a-beginners-introduction/
I think it's risky. Threading results in all kinds of subtly issues that are a giant pain to debug. You might want to look at FastHashMap, which is intended for read-only threading cases like this.
At the least, I'd also declare validProgramCodes to be volatile so that the reference won't get optimized into a register or something.
If I read the JLS correctly (no guarantees there!), accesses to references are always atomic, period. See Section 17.7 Non-atomic Treatment of double and long
So, if the access to a reference is always atomic and it doesn't matter what instance of the returned Hashmap the threads see, you should be OK. You won't see partial writes to the reference, ever.
Edit: After review of the discussion in the comments below and other answers, here are references/quotes from
Doug Lea's book (Concurrent Programming in Java, 2nd Ed), p 94, section 2.2.7.2 Visibility, item #3: "
The first time a thread access a field
of an object, it sees either the
initial value of the field or the
value since written by some other
thread."
On p. 94, Lea goes on to describe risks associated with this approach:
The memory model guarantees that, given the eventual occurrence of the above operations, a particular update to a particular field made by one thread will eventually be visible to another. But eventually can be an arbitrarily long time.
So when it absolutely, positively, must be visible to any calling thread, volatile or some other synchronization barrier is required, especially in long running threads or threads that access the value in a loop (as Lea says).
However, in the case where there is a short lived thread, as implied by the question, with new threads for new readers and it does not impact the application to read stale data, synchronization is not required.
#erickson's answer is the safest in this situation, guaranteeing that other threads will see the changes to the HashMap reference as they occur. I'd suggest following that advice simply to avoid the confusion over the requirements and implementation that resulted in the "down votes" on this answer and the discussion below.
I'm not deleting the answer in the hope that it will be useful. I'm not looking for the "Peer Pressure" badge... ;-)
There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?
(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious if a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")
Jeremy Manson, the god when it comes to the Java Memory Model, has a three part blog on this topic - because in essence you are asking the question "Is it safe to access an immutable HashMap" - the answer to that is yes. But you must answer the predicate to that question which is - "Is my HashMap immutable". The answer might surprise you - Java has a relatively complicated set of rules to determine immutability.
For more info on the topic, read Jeremy's blog posts:
Part 1 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/04/immutability-in-java.html
Part 2 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-2.html
Part 3 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-3.html
Your idiom is safe if and only if the reference to the HashMap is safely published. Rather than anything relating the internals of HashMap itself, safe publication deals with how the constructing thread makes the reference to the map visible to other threads.
Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it - so the only interesting part is how the HashMap reference is published.
For example, imagine you publish the map like this:
class SomeClass {
public static HashMap<Object, Object> MAP;
public synchronized static setMap(HashMap<Object, Object> m) {
MAP = m;
}
}
... and at some point setMap() is called with a map, and other threads are using SomeClass.MAP to access the map, and check for null like this:
HashMap<Object,Object> map = SomeClass.MAP;
if (map != null) {
.. use the map
} else {
.. some default behavior
}
This is not safe even though it probably appears as though it is. The problem is that there is no happens-before relationship between the set of SomeObject.MAP and the subsequent read on another thread, so the reading thread is free to see a partially constructed map. This can pretty much do anything and even in practice it does things like put the reading thread into an infinite loop.
To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that[1]:
Exchange the reference through a properly locked field (JLS 17.4.5)
Use static initializer to do the initializing stores (JLS 12.4)
Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
Initialize the value into a final field (JLS 17.5).
The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you transform the declaration of MAP to:
public static volatile HashMap<Object, Object> MAP;
then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
The other methods change the semantics of your method, since both (2) (using the static initalizer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.
In practice, the rules are simple for safe access to "never-modified objects":
If you are publishing an object which is not inherently immutable (as in all fields declared final) and:
You already can create the object that will be assigned at the moment of declarationa: just use a final field (including static final for static members).
You want to assign the object later, after the reference is already visible: use a volatile fieldb.
That's it!
In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and doesn't inhibit further optimizationsc.
Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time - but this effect is generally small. In exchange, you actually get more than what you asked for - not only can you safely publish one HashMap, you can store as many more not-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.
For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.
[1] Directly quoting from shipilev.
a That sounds complicated, but what I mean is that you can assign the reference at construction time - either at the declaration point or in the constructor (member fields) or static initializer (static fields).
b Optionally, you can use a synchronized method to get/set, or an AtomicReference or something, but we're talking about the minimum work you can do.
c Some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read - but these are very rare today.
The reads are safe from a synchronization standpoint but not a memory standpoint. This is something that is widely misunderstood among Java developers including here on Stackoverflow. (Observe the rating of this answer for proof.)
If you have other threads running, they may not see an updated copy of the HashMap if there is no memory write out of the current thread. Memory writes occur through the use of the synchronized or volatile keywords, or through uses of some java concurrency constructs.
See Brian Goetz's article on the new Java Memory Model for details.
After a bit more looking, I found this in the java doc (emphasis mine):
Note that this implementation is not
synchronized. If multiple threads
access a hash map concurrently, and at
least one of the threads modifies the
map structurally, it must be
synchronized externally. (A structural
modification is any operation that
adds or deletes one or more mappings;
merely changing the value associated
with a key that an instance already
contains is not a structural
modification.)
This seems to imply that it will be safe, assuming the converse of the statement there is true.
One note is that under some circumstances, a get() from an unsynchronized HashMap can cause an infinite loop. This can occur if a concurrent put() causes a rehash of the Map.
http://lightbody.net/blog/2005/07/hashmapget_can_cause_an_infini.html
There is an important twist though. It's safe to access the map, but in general it's not guaranteed that all threads will see exactly the same state (and thus values) of the HashMap. This might happen on multiprocessor systems where the modifications to the HashMap done by one thread (e.g., the one that populated it) can sit in that CPU's cache and won't be seen by threads running on other CPUs, until a memory fence operation is performed ensuring cache coherence. The Java Language Specification is explicit on this one: the solution is to acquire a lock (synchronized (...)) which emits a memory fence operation. So, if you are sure that after populating the HashMap each of the threads acquires ANY lock, then it's OK from that point on to access the HashMap from any thread until the HashMap is modified again.
According to http://www.ibm.com/developerworks/java/library/j-jtp03304/ # Initialization safety you can make your HashMap a final field and after the constructor finishes it would be safely published.
...
Under the new memory model, there is something similar to a happens-before relationship between the write of a final field in a constructor and the initial load of a shared reference to that object in another thread.
...
This question is addressed in Brian Goetz's "Java Concurrency in Practice" book (Listing 16.8, page 350):
#ThreadSafe
public class SafeStates {
private final Map<String, String> states;
public SafeStates() {
states = new HashMap<String, String>();
states.put("alaska", "AK");
states.put("alabama", "AL");
...
states.put("wyoming", "WY");
}
public String getAbbreviation(String s) {
return states.get(s);
}
}
Since states is declared as final and its initialization is accomplished within the owner's class constructor, any thread who later reads this map is guaranteed to see it as of the time the constructor finishes, provided no other thread will try to modify the contents of the map.
So the scenario you described is that you need to put a bunch of data into a Map, then when you're done populating it you treat it as immutable. One approach that is "safe" (meaning you're enforcing that it really is treated as immutable) is to replace the reference with Collections.unmodifiableMap(originalMap) when you're ready to make it immutable.
For an example of how badly maps can fail if used concurrently, and the suggested workaround I mentioned, check out this bug parade entry: bug_id=6423457
Be warned that even in single-threaded code, replacing a ConcurrentHashMap with a HashMap may not be safe. ConcurrentHashMap forbids null as a key or value. HashMap does not forbid them (don't ask).
So in the unlikely situation that your existing code might add a null to the collection during setup (presumably in a failure case of some kind), replacing the collection as described will change the functional behaviour.
That said, provided you do nothing else concurrent reads from a HashMap are safe.
[Edit: by "concurrent reads", I mean that there are not also concurrent modifications.
Other answers explain how to ensure this. One way is to make the map immutable, but it's not necessary. For example, the JSR133 memory model explicitly defines starting a thread to be a synchronised action, meaning that changes made in thread A before it starts thread B are visible in thread B.
My intent is not to contradict those more detailed answers about the Java Memory Model. This answer is intended to point out that even aside from concurrency issues, there is at least one API difference between ConcurrentHashMap and HashMap, which could scupper even a single-threaded program which replaced one with the other.]
http://www.docjar.com/html/api/java/util/HashMap.java.html
here is the source for HashMap. As you can tell, there is absolutely no locking / mutex code there.
This means that while its okay to read from a HashMap in a multithreaded situation, I'd definitely use a ConcurrentHashMap if there were multiple writes.
Whats interesting is that both the .NET HashTable and Dictionary<K,V> have built in synchronization code.
If the initialization and every put is synchronized you are save.
Following code is save because the classloader will take care of the synchronization:
public static final HashMap<String, String> map = new HashMap<>();
static {
map.put("A","A");
}
Following code is save because the writing of volatile will take care of the synchronization.
class Foo {
volatile HashMap<String, String> map;
public void init() {
final HashMap<String, String> tmp = new HashMap<>();
tmp.put("A","A");
// writing to volatile has to be after the modification of the map
this.map = tmp;
}
}
This will also work if the member variable is final because final is also volatile. And if the method is a constructor.