specific question on java threading + synchronization - java

I know this question sounds crazy, but consider the following java snippets:
Part - I:
class Consumer implements Runnable{
private boolean shouldTerminate = false
public void run() {
while( !shouldTerminate ){
//consume and perform some operation.
}
}
public void terminate(){
this.shouldTerminate = true;
}
}
So, the first question is, should I ever need to synchronize on shouldTerminate boolean? If so why? I don't mind missing the flag set to true for one or two cycles(cycle = 1 loop execution). And second, can a boolean variable ever be in a inconsistent state?(anything other than true or false)
Part - II of the question:
class Cache<K,V> {
private Map<K, V> cache = new HashMap<K, V>();
public V getValue(K key) {
if ( !cache.containsKey(key) ) {
synchronized(this.cache){
V value = loadValue(key)
cache.put(key, value);
}
}
return cache.get(key);
}
}
Should access to the whole map be synchronized? Is there any possibility where two threads try to run this method, with one "writer thread" halfway through the process of storing value into the map and simultaneously, a "reader thread" invoking the "contains" method. Will this cause the JVM to blow up? (I don't mind overwriting values in the map -- if two writer threads try to load at the same time)

Both of the code examples have broken concurrency.
The first one requires at least the field marked volatile or else the other thread might never see the variable being changed (it may store its value in CPU cache or a register, and not check whether the value in memory has changed).
The second one is even more broken, because the internals of HashMap are no thread-safe and it's not just a single value but a complex data structure - using it from many threads produces completely unpredictable results. The general rule is that both reading and writing the shared state must be synchronized. You may also use ConcurrentHashMap for better performance.

Unless you either synchronize on the variable, or mark the variable as volatile, there is no guarantee that separate threads' view of the object ever get reconciled. To quote the Wikipedia artible on the Java Memory Model
The major caveat of this is that as-if-serial semantics do not prevent different threads from having different views of the data.
Realistically, so long as the two threads synchronize on some lock at some time, the update to the variable will be seen.
I am wondering why you wouldn't want to mark the variable volatile?

It's not that the JVM will "blow up" as such. But both cases are incorrectly synchronised, and so the results will be unpredictable. The bottom line is that JVMs are designed to behave in a particular way if you synchronise in a particular way; if you don't synchronise correctly, you lose that guarantee.
It's not uncommon for people to think they've found a reason why certain synchronisation can be omitted, or to unknowingly omit necessary synchronisation but with no immediately obvious problem. But with inadequate synchronisation, there is a danger that your program could appear to work fine in one environment, only for an issue to appear later when a particular factor is changed (e.g. moving to a machine with more CPUs, or an update to the JVM that adds a particular optimisation).

Synchronizing shouldTerminate: See
Dilum's answer
Your bool value will
never be inconsistent state.
If one
thread is calling
cache.containsKey(key) while
another thread is calling
cache.put(key, value) the JVM will
blow up (by throwing ConcurrentModificationException)
something bad might happen if that put call caused the map
the grow, but will usually mostly work (worse than failure).

Related

Java ArrayList's set thread-safety

I have a usecase where multiple threads can be reading and modifying an ArrayList where the default values for these booleans are True.
The only modification the threads can make is setting an element of that ArrayList from True to False.
All of the threads will be also reading the ArrayList concurrently, but it is okay to read staled values.
Note:
The size of the ArrayList will not change throughout the lifetime of the ArrayList.
Question:
Is it necessary to synchronize the ArrayList across these threads? The only synchronization I'm doing is marking the ArrayList as volatile such that any update to it will be flushed back to the main memory from a thread's local memory. Is this enough?
Here is a sample code on how this ArrayList gets used by threads
myList is the ArrayList in question and its values are initialized to True
if (!myList.get(index)) {
return;
} else {
// do some operations to determine if we should
// update the value of myList to False.
if (needToUpdateList) {
myList.set(index, False);
}
}
Update
I previously said these threads do not care about staled values which is true. However, I have another thread that only reads these values and perform one final action. This thread does care about staleness. Does the synchronization requirement change?
Is there a cheaper way to "publish" the updated values besides requiring synchronization on every update? I'm trying to minimize locking as much as possible.
As it says in the Javadoc of ArrayList:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
You're not modifying the list structurally, so you don't need to synchronize for that reason.
The other reason you'd want to synchronize is to avoid reading stale values; but you say that don't care about that.
As such there is no reason to synchronize.
Edit for the update #3
If you do care about reading stale values, you do need to synchronize.
An alternative to synchronization which would avoid locking the entire list would be to make it a List<AtomicBoolean>: this would not require synchronization of the list, because you aren't changing the values stored in the list; but reads of an AtomicBoolean value guarantees visibility.
It depends on what you want to do when an element is true. Consider your code, with a separate thread messing with the value you're looking at:
if (!myList.get(index)) { // <--- at this point, the value is True, so go to else branch
return;
} else {
// <--- at this point, other thread sets value to False.
// do some operations to determine if we should
// update the value of myList to False.
// **** Do these operations assume the value is still True?
// **** If so, then your application is not thread-safe.
if (needToUpdateList) {
myList.set(index, False);
}
}
Update
I previously said these threads do not care about staled values which is true. However, I have another thread that only reads these values and perform one final action. This thread does care about staleness. Does the synchronization requirement change?
You just invalidated a lot of perfectly good answers.
Yes, synchronization matters now. In fact probably atomicity matters too. Use a synchronized List or maybe even a Concurrent list or map of some sort.
Anytime you read-modify-write the list, you probably also have to hold the synchronized lock to preserve atomicity:
synchronized( myList ) {
if (!myList.get(index)) {
return;
} else {
// do some operations to determine if we should
// update the value of myList to False.
if (needToUpdateList) {
myList.set(index, False);
}
}
}
Edit: to reduce the time the lock is held, a lot depends on how long "do some operations" take, but a CocurrentHashMap reduces lock contention at the cost of some additional overhead. It might be worth profiling your code to determine the actual overhead and which method is faster/better.
ConcurrentHashMap<Integer,Boolean> myList = new ConcurrentHashMap<>();
//...
if( myList.get( index ) != null ) return;
// "do some opertaions" here
if( needToUpdate )
myList.put( index, false );
But I'm still not convinced that this isn't premature optimization. Write correct code first, fully synchronizing the list is probably fine. Then profile working code. If you do find a bottleneck, then you can worry about whether reducing lock contention is a good idea. But probably the code bottleneck won't be in the lock contention and will in fact be somewhere totally different.
I did some more googling and found that each thread might be storing the values in the registry or the local cache. The specs only offer some guarantee on when data would be written to shared/global memory.
https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5
basically volatile, synchronized, thread.start(), thread.join()...
So yeah using the AtmoicBoolean will probably be the easiest, but you can also synchronize or make a class with a volatile boolean in it.
check this link out:
http://tutorials.jenkov.com/java-concurrency/volatile.html#variable-visibility-problems

Volatile keyword for thread safety

I have the below code:
public class Foo {
private volatile Map<String, String> map;
public Foo() {
refresh();
}
public void refresh() {
map = getData();
}
public boolean isPresent(String id) {
return map.containsKey(id);
}
public String getName(String id) {
return map.get(id);
}
private Map<String, String> getData() {
// logic
}
}
Is the above code thread safe or do I need to add synchronized or mutexes in there? If it's not thread safe, please clarify why.
Also, I've read that one should use AtomicReference instead of this, but in the source of the AtomicReference class, I can see that the field used to hold the value is volatile (along with a few convenience methods).
Is there a specific reason to use AtomicReference instead?
I've read multiple answer related to this but the concept of volatile still confuses me. Thanks in advance.
If you're not modifying the contents of map (except inside of refresh() when creating it), then there are no visibility issues in the code.
It's still possible to do isPresent(), refresh(), getName() (if no outside synchronization is used) and end up with isPresent()==true and getName()==null.
A class is "thread safe" if it does the right thing when it is used by multiple threads at the same time. There is no way to tell whether a class is thread safe unless you can say what "the right thing" means, and especially, what "the right thing when used by multiple threads" means.
What is the right thing if thread A calls foo.isPresent("X") and it returns true, and then thread B calls foo.refresh(), and then thread A calls foo.getName("X")?
If you are going to claim "thread safety", then you must be very explicit about what the caller should expect in cases like that.
Volatile is only useful in this scenario to update the value immediately. It doesn't really make the code by itself thread-safe.
But because you've stated in your comment, you only update the reference and because the reference-switch is atomic, your code will be thread-safe.(from the given code)
If I understood your question correctly and your comments - your class Foo holds a Map in which only the reference should be updated e.g. a whole new Map added instead of mutating it. If this is the premise:
It does not make any difference if you declare it as volatile or not. Every read/write operation in Java is atomic itself. You will never see a half transaction on these operations. See the JLS 17.7
17.7. Non-Atomic Treatment of double and long
For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
Writes and reads of volatile long and double values are always atomic.
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
Some implementations may find it convenient to divide a single write action on a 64-bit long or double value into two write actions on adjacent 32-bit values. For efficiency's sake, this behavior is implementation-specific; an implementation of the Java Virtual Machine is free to perform writes to long and double values atomically or in two parts.
Implementations of the Java Virtual Machine are encouraged to avoid splitting 64-bit values where possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications.
EDIT: Although the top statement still stands as it is - for thread safety it's necessary to add volatile to reflect the immediate update on different Threads to reflect the reference update. The behavior of a Thread is to make local copy of it while with volatile it will do a happens-before relationship in other words the Threads will have the same state of the Map.

Do I need to add some locks or synchronization if there is only one thread writing and several threads reading?

Say I have a global object:
class Global {
public static int remoteNumber = 0;
}
There is a thread runs periodically to get new number from remote, and updates it (only write):
new Thread {
#override
public void run() {
while(true) {
int newNumber = getFromRemote();
Global.remoteNumber = newNumber;
Thread.sleep(1000);
}
}
}
And there are one or more threads using this global remoteNumber randomly (only read):
int n = Global.remoteNumber;
doSomethingWith(n);
You can see I don't use any locks or synchronize to protected it, is it correct? Is there any potential issue that might cause problems?
Update:
In my case, it's not really important that the reading threads must get the latest new value in realtime. I mean, if there is any issue (caused of lacking lock/synchronization) make one reading thread missed that value, it doesn't matter, because it will have chance to run the same code soon (maybe in a loop)
But reading a undetermined value is not allowed (I mean, if the old value is 20, the new updated value is 30, but the reading threads reads a non-existent value say 33, I'm not sure if it's possible)
You need synchronization here (with one caveat, which I'll discuss later).
The main problem is that the reader threads may never see any of the updates the writer thread makes. Usually any given write will be seen eventually. But here your update loop is so simple that a write could easily be held in cache and never make it out to main memory. So you really must synchronize here.
EDIT 11/2017 I'm going to update this and say that it's probably not realistic that a value could be held in cache for so long. I think it's a issue though that a variable access like this could be optimized by the compiler and held in a register though. So synchronization is still needed (or volatile) to tell the optimizer to be sure to actually fetch a new value for each loop.
So you either need to use volatile, or you need to use a (static) getter and setter methods, and you need to use the synchronized keyword on both methods. For an occasional write like this, the volatile keyword is much lighter weight.
The caveat is if you truly don't need to see timely updates from the write thread, you don't have to synchronize. If a indefinite delay won't affect your program functionality, you could skip the synchronization. But something like this on a timer doesn't look like a good use case for omitting synchronization.
EDIT: Per Brian Goetz in Java Concurrency in Practice, it is not allowed for Java/a JVM to show you "indeterminate" values -- values that were never written. Those are more technically called "out of thin air" values and they are disallowed by the Java spec. You are guaranteed to see some write that was previously made to your global variable, either the zero it was initialized with, or some subsequent write, but no other values are permitted.
Read threads can read old value for undetermined time, but in practice there no problem. Its because each thread has own copy of this variable. Sometimes they sync. You can use volatile keyword to remove this optimisation:
public static volatile int remoteNumber = 0;

Proper use of volatile variables and synchronized blocks

I am trying to wrap my head around thread safety in java (or in general). I have this class (which I hope complies with the definition of a POJO) which also needs to be compatible with JPA providers:
public class SomeClass {
private Object timestampLock = new Object();
// are "volatile"s necessary?
private volatile java.sql.Timestamp timestamp;
private volatile String timestampTimeZoneName;
private volatile BigDecimal someValue;
public ZonedDateTime getTimestamp() {
// is synchronisation necessary here? is this the correct usage?
synchronized (timestampLock) {
return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
}
}
public void setTimestamp(ZonedDateTime dateTime) {
// is this the correct usage?
synchronized (timestampLock) {
this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
this.timestampTimeZoneName = dateTime.getZone().getId();
}
}
// is synchronisation required?
public BigDecimal getSomeValue() {
return someValue;
}
// is synchronisation required?
public void setSomeValue(BigDecimal val) {
someValue = val;
}
}
As stated in the commented rows in the code, is it necessary to define timestamp and timestampTimeZoneName as volatile and are the synchronized blocks used as they should be? Or should I use only the synchronized blocks and not define timestamp and timestampTimeZoneName as volatile? A timestampTimeZoneName of a timestamp should not be erroneously matched with another timestamp's.
This link says
Reads and writes are atomic for all variables declared volatile
(including long and double variables)
Should I understand that accesses to someValue in this code through the setter/getter are thread safe thanks to volatile definitions? If so, is there a better (I do not know what "better" might mean here) way to accomplish this?
To determine if you need synchronized, try to imagine a place where you can have a context switch that would break your code.
In this case, if the context switch happens where I put the comment, then in getTimestamp() you're going to be reading different values from each timestamp type.
Also, although assignments are atomic, this expression java.sql.Timestamp.from(dateTime.toInstant()); certainly isn't, so you can get a context switch inbetween dateTime.toInstant() and the call to from. In short you definitely need the synchronized blocks.
synchronized (timestampLock) {
this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
//CONTEXT SWITCH HERE
this.timestampTimeZoneName = dateTime.getZone().getId();
}
synchronized (timestampLock) {
return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
}
In terms of volatile, I'm pretty sure they're required. You have to guarantee that each thread definitely is getting the most updated version of a variable.
This is the contract of volatile. And although it may be covered by the synchronized block, and volatile not actually necessary here, it's good to write anyway. If the synchronized block does the job of volatile already, the VM won't do the guarantee twice. This means volatile won't cost you any more, and it's a very good flashing light that says to the programmer: "I'M USED IN MULTIPLE THREADS".
For someValue: If there's no synchronized block here, then volatile is definitely necessary. If you call a set in one thread, the other thread has no queue that tells it that may have been updated outside of this thread. So it may use an old and cached value. The JIT can do a lot of funny optimizations if it assumes single thread. Ones that can simply break your program.
Now I'm not entirely certain if synchronized is required here. My guess is no. I would add it anyway to be safe though. Or you can let java worry about the synchronization and use http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
Nothing new here, this is just a more explicit version of something #Cruncher already said:
You need synchronized whenever it is important for two or more fields in your program to be consistent with one another. Suppose you have two parallel lists, and your code depends on them both being the same length. That's called an invariant as in, the two lists are invariably the same length.
How can you write a method, append(x,y), that adds a new pair of values to the lists without temporarily breaking the invariant? You can't. The method must add one item to the first list, breaking the invariant, and then add the other item to the second list, fixing it again. There's no other way.
In a single-threaded program, that temporary broken state is no problem because no other method can possibly use the lists while append(x,y) is running. That's no longer true in a multithreaded program. In the worst case, append(x,y) could add x to the x list, and then the scheduler could suspend the thread at that exact moment to allow other threads to run. The CPUs could execute millions of instructions before append(x,y) gets to finish the job and make the lists right again. During all of that time, other threads would see the broken invariant, and possibly corrupt your data or crash the program as a result.
The fix is for append(x,y) to be synchronized on some object, and (this is the important part), for every other method that uses the lists to be synchronized on the same object. Since only one thread can be synchronized on a given object at a given time, it will not be possible for any other thread to see the lists in an inconsistent state.
So, if thread A calls append(x,y), and thread B tries to look at the lists "at the same time", will thread B see the what the lists looked like before or after thread A did its work? That's called a data race. And with only the synchronization that I have described so far, there's no way to know which thread will win. All we've done so far is to guarantee one particular invariant.
If it matters which thread wins the race, then that means that there is some higher-level invariant that also needs protection. You will have to add more synchronization to protect that one too. "Thread safety" -- two little words to name a subject that is both broad and deep.
Good Luck, and Have Fun!
// is synchronisation required?
public BigDecimal getSomeValue() {
return someValue;
}
// is synchronisation required?
public void setSomeValue(BigDecimal val) {
someValue = val;
}
I think Yes you are require to put the synchronization block because consider an example in which one thread is setting the value and at the same time other thread is trying to read from getter method, like here in the example you will see the syncronization block.So, if you take your variable inside the method then you must require the synchronization block.

Java concurrency scenario -- do I need synchronization or not?

Here's the deal. I have a hash map containing data I call "program codes", it lives in an object, like so:
Class Metadata
{
private HashMap validProgramCodes;
public HashMap getValidProgramCodes() { return validProgramCodes; }
public void setValidProgramCodes(HashMap h) { validProgramCodes = h; }
}
I have lots and lots of reader threads each of which will call getValidProgramCodes() once and then use that hashmap as a read-only resource.
So far so good. Here's where we get interesting.
I want to put in a timer which every so often generates a new list of valid program codes (never mind how), and calls setValidProgramCodes.
My theory -- which I need help to validate -- is that I can continue using the code as is, without putting in explicit synchronization. It goes like this:
At the time that validProgramCodes are updated, the value of validProgramCodes is always good -- it is a pointer to either the new or the old hashmap. This is the assumption upon which everything hinges. A reader who has the old hashmap is okay; he can continue to use the old value, as it will not be garbage collected until he releases it. Each reader is transient; it will die soon and be replaced by a new one who will pick up the new value.
Does this hold water? My main goal is to avoid costly synchronization and blocking in the overwhelming majority of cases where no update is happening. We only update once per hour or so, and readers are constantly flickering in and out.
Use Volatile
Is this a case where one thread cares what another is doing? Then the JMM FAQ has the answer:
Most of the time, one thread doesn't
care what the other is doing. But when
it does, that's what synchronization
is for.
In response to those who say that the OP's code is safe as-is, consider this: There is nothing in Java's memory model that guarantees that this field will be flushed to main memory when a new thread is started. Furthermore, a JVM is free to reorder operations as long as the changes aren't detectable within the thread.
Theoretically speaking, the reader threads are not guaranteed to see the "write" to validProgramCodes. In practice, they eventually will, but you can't be sure when.
I recommend declaring the validProgramCodes member as "volatile". The speed difference will be negligible, and it will guarantee the safety of your code now and in future, whatever JVM optimizations might be introduced.
Here's a concrete recommendation:
import java.util.Collections;
class Metadata {
private volatile Map validProgramCodes = Collections.emptyMap();
public Map getValidProgramCodes() {
return validProgramCodes;
}
public void setValidProgramCodes(Map h) {
if (h == null)
throw new NullPointerException("validProgramCodes == null");
validProgramCodes = Collections.unmodifiableMap(new HashMap(h));
}
}
Immutability
In addition to wrapping it with unmodifiableMap, I'm copying the map (new HashMap(h)). This makes a snapshot that won't change even if the caller of setter continues to update the map "h". For example, they might clear the map and add fresh entries.
Depend on Interfaces
On a stylistic note, it's often better to declare APIs with abstract types like List and Map, rather than a concrete types like ArrayList and HashMap. This gives flexibility in the future if concrete types need to change (as I did here).
Caching
The result of assigning "h" to "validProgramCodes" may simply be a write to the processor's cache. Even when a new thread starts, "h" will not be visible to a new thread unless it has been flushed to shared memory. A good runtime will avoid flushing unless it's necessary, and using volatile is one way to indicate that it's necessary.
Reordering
Assume the following code:
HashMap codes = new HashMap();
codes.putAll(source);
meta.setValidProgramCodes(codes);
If setValidCodes is simply the OP's validProgramCodes = h;, the compiler is free to reorder the code something like this:
1: meta.validProgramCodes = codes = new HashMap();
2: codes.putAll(source);
Suppose after execution of writer line 1, a reader thread starts running this code:
1: Map codes = meta.getValidProgramCodes();
2: Iterator i = codes.entrySet().iterator();
3: while (i.hasNext()) {
4: Map.Entry e = (Map.Entry) i.next();
5: // Do something with e.
6: }
Now suppose that the writer thread calls "putAll" on the map between the reader's line 2 and line 3. The map underlying the Iterator has experienced a concurrent modification, and throws a runtime exception—a devilishly intermittent, seemingly inexplicable runtime exception that was never produced during testing.
Concurrent Programming
Any time you have one thread that cares what another thread is doing, you must have some sort of memory barrier to ensure that actions of one thread are visible to the other. If an event in one thread must happen before an event in another thread, you must indicate that explicitly. There are no guarantees otherwise. In practice, this means volatile or synchronized.
Don't skimp. It doesn't matter how fast an incorrect program fails to do its job. The examples shown here are simple and contrived, but rest assured, they illustrate real-world concurrency bugs that are incredibly difficult to identify and resolve due to their unpredictability and platform-sensitivity.
Additional Resources
The Java Language Specification - 17 Threads and Locks sections: §17.3 and §17.4
The JMM FAQ
Doug Lea's concurrency books
No, the code example is not safe, because there is no safe publication of any new HashMap instances. Without any synchronization, there is a possibility that a reader thread will see a partially initialized HashMap.
Check out #erickson's explanation under "Reordering" in his answer. Also I can't recommend Brian Goetz's book Java Concurrency in Practice enough!
Whether or not it is okay with you that reader threads might see old (stale) HashMap references, or might even never see a new reference, is beside the point. The worst thing that can happen is that a reader thread might obtain reference to and attempt to access a HashMap instance that is not yet initialized and not ready to be accessed.
No, by the Java Memory Model (JMM), this is not thread-safe.
There is no happens-before relation between writing and reading the HashMap implementation objects. So, although the writer thread appears to write out the object first and then the reference, a reader thread may not see the same order.
As also mentioned there is no guarantee that the reaer thread will ever see the new value. In practice with current compilers on existing hardware the value should get updated, unless the loop body is sufficienly small that it can be sufficiently inlined.
So, making the reference volatile is adequate under the new JMM. It is unlikely to make a substantial difference to system performance.
The moral of this story: Threading is difficult. Don't try to be clever, because sometimes (may be not on your test system) you wont be clever enough.
As others have already noted, this is not safe and you shouldn't do this. You need either volatile or synchronized here to force other threads to see the change.
What hasn't been mentioned is that synchronized and especially volatile are probably a lot faster than you think. If it's actually a performance bottleneck in your app, then I'll eat this web page.
Another option (probably slower than volatile, but YMMV) is to use a ReentrantReadWriteLock to protect access so that multiple concurrent readers can read it. And if that's still a performance bottleneck, I'll eat this whole web site.
public class Metadata
{
private HashMap validProgramCodes;
private ReadWriteLock lock = new ReentrantReadWriteLock();
public HashMap getValidProgramCodes() {
lock.readLock().lock();
try {
return validProgramCodes;
} finally {
lock.readLock().unlock();
}
}
public void setValidProgramCodes(HashMap h) {
lock.writeLock().lock();
try {
validProgramCodes = h;
} finally {
lock.writeLock().unlock();
}
}
}
I think your assumptions are correct. The only thing I would do is set the validProgramCodes volatile.
private volatile HashMap validProgramCodes;
This way, when you update the "pointer" of validProgramCodes you guaranty that all threads access the same latest HasMap "pointer" because they don't rely on local thread cache and go directly to memory.
The assignment will work as long as you're not concerned about reading stale values, and as long as you can guarantee that your hashmap is properly populated on initialization. You should at the least create the hashMap with Collections.unmodifiableMap on the Hashmap to guarantee that your readers won't be changing/deleting objects from the map, and to avoid multiple threads stepping on each others toes and invalidating iterators when other threads destroy.
( writer above is right about the volatile, should've seen that)
While this is not the best solution for this particular problem (erickson's idea of a new unmodifiableMap is), I'd like to take a moment to mention the java.util.concurrent.ConcurrentHashMap class introduced in Java 5, a version of HashMap specifically built with concurrency in mind. This construct does not block on reads.
Check this post about concurrency basics. It should be able to answer your question satisfactorily.
http://walivi.wordpress.com/2013/08/24/concurrency-in-java-a-beginners-introduction/
I think it's risky. Threading results in all kinds of subtly issues that are a giant pain to debug. You might want to look at FastHashMap, which is intended for read-only threading cases like this.
At the least, I'd also declare validProgramCodes to be volatile so that the reference won't get optimized into a register or something.
If I read the JLS correctly (no guarantees there!), accesses to references are always atomic, period. See Section 17.7 Non-atomic Treatment of double and long
So, if the access to a reference is always atomic and it doesn't matter what instance of the returned Hashmap the threads see, you should be OK. You won't see partial writes to the reference, ever.
Edit: After review of the discussion in the comments below and other answers, here are references/quotes from
Doug Lea's book (Concurrent Programming in Java, 2nd Ed), p 94, section 2.2.7.2 Visibility, item #3: "
The first time a thread access a field
of an object, it sees either the
initial value of the field or the
value since written by some other
thread."
On p. 94, Lea goes on to describe risks associated with this approach:
The memory model guarantees that, given the eventual occurrence of the above operations, a particular update to a particular field made by one thread will eventually be visible to another. But eventually can be an arbitrarily long time.
So when it absolutely, positively, must be visible to any calling thread, volatile or some other synchronization barrier is required, especially in long running threads or threads that access the value in a loop (as Lea says).
However, in the case where there is a short lived thread, as implied by the question, with new threads for new readers and it does not impact the application to read stale data, synchronization is not required.
#erickson's answer is the safest in this situation, guaranteeing that other threads will see the changes to the HashMap reference as they occur. I'd suggest following that advice simply to avoid the confusion over the requirements and implementation that resulted in the "down votes" on this answer and the discussion below.
I'm not deleting the answer in the hope that it will be useful. I'm not looking for the "Peer Pressure" badge... ;-)

Categories