Q: Code isn't working without synchronized method [duplicate] - java

I have a class that contains a boolean field like this one:
public class MyClass
{
private bool boolVal;
public bool BoolVal
{
get { return boolVal; }
set { boolVal = value; }
}
}
The field can be read and written from many threads using the property. My question is whether I should fence the getter and setter with a lock statement. Or should I simply use the volatile keyword and save the locking? Or should I totally ignore multithreading, since getting and setting boolean values are atomic?
regards,

There are several issues here.
The simple first. Yes, reading and writing a boolean variable is an atomic operation. (clarification: What I mean is that read and write operations by themselves are atomic operations for booleans, not reading and writing, that will of course generate two operations, which together will not be atomic)
However, unless you take extra steps, the compiler might optimize away such reading and writing, or move the operations around, which could make your code operate differently from what you intend.
Marking the field as volatile means that the operations will not be optimized away, the directive basically says that the compiler should never assume the value in this field is the same as the previous one, even if it just read it in the previous instruction.
However, on multicore and multicpu machines, different cores and cpus might have a different value for the field in their cache, and thus you add a lock { } clause, or anything else that forces a memory barrier. This will ensure that the field value is consistent across cores. Additionally, reads and writes will not move past a memory barrier in the code, which means you have predictability in where the operations happen.
So if you suspect, or know, that this field will be written to and read from multiple threads, I would definitely add locking and volatile to the mix.
Note that I'm no expert in multithreading; I'm able to hold my own, but I usually program defensively. It may well be (I would assume it is highly likely) that you can implement something that doesn't use a lock (there are many lock-free constructs), but sadly I'm not experienced enough in this topic to handle those things. Thus my advice is to add both a lock clause and a volatile directive.

volatile alone is not enough and serves a different purpose. lock should be fine, but in the end it depends on whether anyone is going to set boolVal in MyClass itself; who knows, you may have a worker thread spinning in there. It also depends on how you are using boolVal internally. You may also need protection elsewhere. If you ask me, if you are not DEAD SURE you are going to use MyClass in more than one thread, then it's not worth even thinking about it.
P.S. you may also want to read this section

Related

Do I need to add some locks or synchronization if there is only one thread writing and several threads reading?

Say I have a global object:
class Global {
public static int remoteNumber = 0;
}
There is a thread that runs periodically to get a new number from remote, and updates it (only write):
new Thread() {
    @Override
    public void run() {
        while (true) {
            int newNumber = getFromRemote();
            Global.remoteNumber = newNumber;
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                return; // stop polling when interrupted
            }
        }
    }
}.start();
And there are one or more threads using this global remoteNumber randomly (only read):
int n = Global.remoteNumber;
doSomethingWith(n);
You can see I don't use any locks or synchronization to protect it. Is that correct? Is there any potential issue that might cause problems?
Update:
In my case, it's not really important that the reading threads get the latest value in real time. I mean, if some issue (caused by the lack of locks/synchronization) makes a reading thread miss that value, it doesn't matter, because it will have a chance to run the same code again soon (maybe in a loop).
But reading an undetermined value is not allowed (I mean, if the old value is 20 and the new updated value is 30, but a reading thread reads a non-existent value, say 33; I'm not sure if that's possible).
You need synchronization here (with one caveat, which I'll discuss later).
The main problem is that the reader threads may never see any of the updates the writer thread makes. Usually any given write will be seen eventually. But here your update loop is so simple that a write could easily be held in cache and never make it out to main memory. So you really must synchronize here.
EDIT 11/2017 I'm going to update this and say that it's probably not realistic that a value could be held in cache for so long. I think it is an issue, though, that a variable access like this could be optimized by the compiler and held in a register. So synchronization is still needed (or volatile) to tell the optimizer to be sure to actually fetch a new value for each loop.
So you either need to use volatile, or you need to use a (static) getter and setter methods, and you need to use the synchronized keyword on both methods. For an occasional write like this, the volatile keyword is much lighter weight.
The caveat is that if you truly don't need to see timely updates from the write thread, you don't have to synchronize. If an indefinite delay won't affect your program functionality, you could skip the synchronization. But something like this on a timer doesn't look like a good use case for omitting synchronization.
EDIT: Per Brian Goetz in Java Concurrency in Practice, it is not allowed for Java/a JVM to show you "indeterminate" values -- values that were never written. Those are more technically called "out of thin air" values and they are disallowed by the Java spec. You are guaranteed to see some write that was previously made to your global variable, either the zero it was initialized with, or some subsequent write, but no other values are permitted.
Reader threads can see an old value for an undetermined time, but in practice that causes no problem here. This is because each thread may work with its own cached copy of the variable, and the copies are only synchronized from time to time. You can use the volatile keyword to remove this optimisation:
public static volatile int remoteNumber = 0;
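To make the volatile suggestion concrete, here is a minimal sketch. The class name VolatileGlobal and the method waitForFirstUpdate are made up for illustration (they are not from the question), and a one-shot writer stands in for the periodic updater. It only shows that a reader spinning on a volatile field does observe the writer's update.

```java
// Hypothetical stand-in for the question's Global class; names are illustrative.
class VolatileGlobal {
    // volatile: a read observes the most recent write made by any thread
    static volatile int remoteNumber = 0;

    // Starts a writer that publishes one non-zero value, then spins until
    // the calling (reader) thread sees it.
    static int waitForFirstUpdate(int newValue) {
        Thread writer = new Thread(() -> remoteNumber = newValue);
        writer.start();
        // Without volatile, the JIT could legally hoist this read out of the loop.
        while (remoteNumber == 0) {
            Thread.yield();
        }
        return remoteNumber;
    }
}
```

Calling VolatileGlobal.waitForFirstUpdate(30) returns 30 once the write has become visible. For a single int that is only ever overwritten, volatile is sufficient; a compound update such as remoteNumber++ would still need a lock or an atomic class.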

Is unsynchronized read of integer threadsafe in java?

I see this code quite frequently in some OSS unit tests, but is it thread safe ? Is the while loop guaranteed to see the correct value of invoc ?
If no; nerd points to whoever also knows which CPU architecture this may fail on.
private int invoc = 0;
private synchronized void increment() {
invoc++;
}
public void isItThreadSafe() throws InterruptedException {
for (int i = 0; i < TOTAL_THREADS; i++) {
new Thread(new Runnable() {
public void run() {
// do some stuff
increment();
}
}).start();
}
while (invoc != TOTAL_THREADS) {
Thread.sleep(250);
}
}
No, it's not threadsafe. invoc needs to be declared volatile, or accessed while synchronizing on the same lock, or changed to use AtomicInteger. Just using the synchronized method to increment invoc, but not synchronizing to read it, isn't good enough.
The JVM does a lot of optimizations, including CPU-specific caching and instruction reordering. It uses the volatile keyword and locking to decide when it can optimize freely and when it has to have an up-to-date value available for other threads to read. So when the reader doesn't use the lock the JVM can't know not to give it a stale value.
This quote from Java Concurrency in Practice (section 3.1.3) discusses how both writes and reads need to be synchronized:
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
The next section (3.1.4) covers using volatile:
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Back when we all had single-CPU machines on our desktops we'd write code and never have a problem until it ran on a multiprocessor box, usually in production. Some of the factors that give rise to the visibility problems, things like CPU-local caches and instruction reordering, are things you would expect from any multiprocessor machine. Elimination of apparently unneeded instructions could happen on any machine, though. There's nothing forcing the JVM to ever make the reader see the up-to-date value of the variable; you're at the mercy of the JVM implementors. So it seems to me this code would not be a good bet for any CPU architecture.
Well!
private volatile int invoc = 0;
Will do the trick.
And see "Are java primitive ints atomic by design or by accident?", which cites some of the relevant Java definitions. Apparently int is fine, but double & long might not be.
edit, add-on. The question asks, "see the correct value of invoc ?". What is "the correct value"? As in the timespace continuum, simultaneity doesn't really exist between threads. One of the above posts notes that the value will eventually get flushed, and the other thread will get it. Is the code "thread safe"? I would say "yes", because it won't "misbehave" based on the vagaries of sequencing, in this case.
Theoretically, it is possible that the read is cached. Nothing in Java memory model prevents that.
Practically, that is extremely unlikely to happen (in your particular example). The question is, whether JVM can optimize across a method call.
read #1
method();
read #2
For the JVM to reason that read#2 can reuse the result of read#1 (which can be stored in a CPU register), it must know for sure that method() contains no synchronization actions. This is generally impossible, unless method() is inlined and the JVM can see from the flattened code that there are no sync/volatile or other synchronization actions between read#1 and read#2; then it can safely eliminate read#2.
Now in your example, the method is Thread.sleep(). One way to implement it is to busy loop for certain times, depending on CPU frequency. Then JVM may inline it, and then eliminate read#2.
But of course such implementation of sleep() is unrealistic. It is usually implemented as a native method that calls OS kernel. The question is, can JVM optimize across such a native method.
Even if JVM has knowledge of internal workings of some native methods, therefore can optimize across them, it's improbable that sleep() is treated that way. sleep(1ms) takes millions of CPU cycles to return, there is really no point optimizing around it to save a few reads.
--
This discussion reveals the biggest problem of data races - it takes too much effort to reason about it. A program is not necessarily wrong, if it is not "correctly synchronized", however to prove it's not wrong is not an easy task. Life is much simpler, if a program is correctly synchronized and contains no data race.
As far as I understand the code it should be safe. The bytecode can be reordered, yes. But eventually invoc should be in sync with the main thread again. Synchronize guarantees that invoc is incremented correctly so there is a consistent representation of invoc in some register. At some time this value will be flushed and the little test succeeds.
It is certainly not nice and I would go with the answer I voted for and would fix code like this because it smells. But thinking about it I would consider it safe.
If you're not required to use "int", I would suggest AtomicInteger as a thread-safe alternative.
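As a rough sketch of the AtomicInteger suggestion, the test from the question could look like the following. The class name InvocationCounter and the method runWorkers are invented for this example, and Thread.yield() replaces the sleep in the polling loop to keep the sketch free of checked exceptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative rewrite of the test from the question; names are made up.
class InvocationCounter {
    private final AtomicInteger invoc = new AtomicInteger();

    // Starts totalThreads workers and polls until all of them have incremented.
    int runWorkers(int totalThreads) {
        for (int i = 0; i < totalThreads; i++) {
            // incrementAndGet() is an atomic read-modify-write
            new Thread(invoc::incrementAndGet).start();
        }
        // get() has volatile-read semantics, so the loop cannot be stuck
        // on a stale value the way a plain int read could be.
        while (invoc.get() != totalThreads) {
            Thread.yield();
        }
        return invoc.get();
    }
}
```

Here both the increment and the read go through the same atomic variable, so no synchronized method is needed at all.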

Thread safety in java

All,
I started learning Java threads in the past few days and have only read about scenarios where, even after using synchronized methods/blocks, the code/class remains vulnerable to concurrency issues. Can anyone please provide a scenario where synchronized blocks/methods fail? And what should be the alternative in these cases to ensure thread safety?
Proper behaviour under concurrent access is a complex topic, and it's not as simple as just slapping synchronized on everything, as now you have to think about how operations might interleave.
For instance, imagine you have a class like a list, and you want to make it threadsafe. So you make all the methods synchronized and continue. Chances are, clients might be using your list in the following way:
int index = ...; // this gets set somewhere, maybe passed in as an argument
// Check that the list has enough elements for this call to make sense
if (list.size() > index)
{
return list.get(index);
}
else
{
return DEFAULT_VALUE;
}
In a single-threaded environment this code is perfectly safe. However, if the list is being accessed (and possibly modified) concurrently, it's possible for the list's size to change after the call to size(), but before the call to get(). So the list could "impossibly" throw an IndexOutOfBoundsException (or similar) in this case, even though the size was checked beforehand.
There's no shortcut for fixing this. You simply need to think carefully about the use-cases for your class/interface, and ensure that you can actually guarantee them when interleaved with any other valid operations. Often this might require some additional complexity, or simply more specifics in the documentation. If the hypothetical list class specified that it always synchronized on its own monitor, then that specific situation could be fixed as
synchronized(list)
{
if (list.size() > index)
{
return list.get(index);
}
}
but under other synchronization schemes, this would not work. Or it might be too much of a bottleneck. Or forcing the clients to make the multiple calls within the same lexical scope may be an unacceptable constraint. It all depends on what you're trying to achieve, as to how you can make your interface safe, performant and elegant.
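One alternative to client-side locking, sketched here under the assumption that you control the list class, is to offer the whole check-then-get as a single synchronized operation. GuardedList and getOrDefault are hypothetical names for this example, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical list wrapper; the names are illustrative.
class GuardedList<T> {
    private final List<T> items = new ArrayList<>();

    synchronized void add(T item) {
        items.add(item);
    }

    synchronized void clear() {
        items.clear();
    }

    // The size check and the read happen under one lock acquisition, so no
    // other thread can shrink the list between them.
    synchronized T getOrDefault(int index, T defaultValue) {
        return (index >= 0 && index < items.size()) ? items.get(index) : defaultValue;
    }
}
```

With this shape, callers never see the gap between size() and get(), at the cost of a slightly larger interface.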
Scenario 1 Classic deadlock:
Object Mutex1;
Object Mutex2;
public void method1(){
synchronized(Mutex1){
synchronized(Mutex2){
}
}
}
public void method2(){
synchronized(Mutex2){
synchronized(Mutex1){
}
}
}
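A common way to avoid the deadlock above is to make every code path take the locks in the same order. The following is only a sketch of that idea; OrderedLocks, the counter field and runBoth are names invented for the example.

```java
// Illustrative sketch; class, field and method names are assumptions.
class OrderedLocks {
    private final Object mutex1 = new Object();
    private final Object mutex2 = new Object();
    private int counter = 0;

    void method1() {
        synchronized (mutex1) {
            synchronized (mutex2) {
                counter++;
            }
        }
    }

    // Same acquisition order as method1 (mutex1, then mutex2), so the two
    // methods can never end up waiting on each other in a cycle.
    void method2() {
        synchronized (mutex1) {
            synchronized (mutex2) {
                counter--;
            }
        }
    }

    // Runs both methods concurrently; returns true if both threads finished
    // and the increments and decrements balanced out.
    static boolean runBoth(int iterations) {
        OrderedLocks locks = new OrderedLocks();
        Thread a = new Thread(() -> { for (int i = 0; i < iterations; i++) locks.method1(); });
        Thread b = new Thread(() -> { for (int i = 0; i < iterations; i++) locks.method2(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return locks.counter == 0;
    }
}
```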
Other scenarios include anything with a shared resource, even a variable, because one thread could change the variable's contents, or even make it point to null, without the other thread knowing. Writing to IO has similar issues; try writing to a file, or out to a socket, using two threads.
Very good articles about concurrency and the Java Memory Model can be found on Angelika Langer's website.
"vulnerable to concurrency issues" is very vague. It would help to know what you have actually read and where. Two things that come to mind:
Just slapping on "synchronized" somewhere does not mean the code is synchronized correctly - it can be very hard to do correctly, and developers frequently miss some problematic scenarios even when they think they're doing it right.
Even if the synchronization correctly prevents non-deterministic changes to the data, you can still run into deadlocks.
Synchronized methods prevent other methods/blocks requiring the same monitor from being executed while you execute them.
But if you have 2 methods, let's say int get() and set(int val), and somewhere else have a method which does
obj.set(1+obj.get());
and this method runs in two threads, you can end up with the value increased by one or by two, depending on unpredictable factors.
Therefore you must somehow protect the use of such methods too (but only if it's needed).
Btw, use each monitor for as few functions/blocks as possible, so that only those which can wrongly influence each other are synchronized.
And try to expose as few methods requiring further protection as possible.
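To illustrate the get()/set() problem, here is a small sketch in which the read-modify-write is moved into one synchronized method. SharedValue, increment and incrementFromThreads are hypothetical names chosen for the example.

```java
// Hypothetical holder class; names are illustrative.
class SharedValue {
    private int val;

    synchronized int get() { return val; }

    synchronized void set(int v) { val = v; }

    // The read and the write share one lock acquisition, unlike
    // obj.set(1 + obj.get()), which releases the lock in between.
    synchronized void increment() { val = val + 1; }

    // Runs `threads` workers that each call increment() `perThread` times.
    static int incrementFromThreads(int threads, int perThread) {
        SharedValue obj = new SharedValue();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) obj.increment();
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return obj.get();
    }
}
```

Because increment() is a single synchronized operation, the result of incrementFromThreads(4, 1000) is always 4000, whereas the obj.set(1 + obj.get()) form could lose updates.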

Java memory model - can someone explain it?

For years and years, I've tried to understand the part of the Java specification that deals with the memory model and concurrency. I have to admit that I've failed miserably. Yes, I understand about locks and "synchronized" and wait() and notify(). And I can use them just fine, thank you. I even have a vague idea about what "volatile" does. But all of that was not derived from the language spec, but rather from general experience.
Here are two sample questions that I am asking. I am not so much interested in particular answers, as I need to understand how the answers are derived from the spec (or may be how I conclude that the spec has no answer).
What does "volatile" do, exactly?
Are writes to variable atomic? Does it depend on variable's type?
I'm not going to attempt to actually answer your questions here. Instead I'll redirect you to the book which I'm seeing recommended for advice on this topic: Java Concurrency in Practice.
One word of warning: if there are answers here, expect quite a few of them to be wrong. One of the reasons I'm not going to post details is because I'm pretty sure I'd get it wrong in at least some respects. I mean no disrespect whatsoever to the community when I say that the chances of everyone who thinks they can answer this question actually having enough rigour to get it right is practically zero. (Joe Duffy recently found a bit of the .NET memory model that he was surprised by. If he can get it wrong, so can mortals like us.)
I will offer some insight on just one aspect, because it's often misunderstood:
There's a difference between volatility and atomicity. People often think that an atomic write is volatile (i.e. you don't need to worry about the memory model if the write is atomic). That's not true.
Volatility is about whether one thread performing a read (logically, in the source code) will "see" changes made by another thread.
Atomicity is about whether there is any chance that if a change is seen, only part of the change will be seen.
For instance, take writing to an integer field. That is guaranteed to be atomic, but not volatile. That means that if we have (starting at foo.x = 0):
Thread 1: foo.x = 257;
Thread 2: int y = foo.x;
It's possible for y to be 0 or 257. It won't be any other value, (e.g. 256 or 1) due to the atomicity constraint. However, even if you know that in "wall time" the code in thread 2 executed after the code in thread 1, there could be odd caching, memory accesses "moving" etc. Making the variable x volatile will fix this.
I'll leave the rest up to real honest-to-goodness experts.
non-volatile variables can be cached thread-locally, so different threads may see different values at the same time; volatile prevents this (source)
writes to variables of 32 bits or smaller are guaranteed to be atomic (implied here); not so for long and double, though 64bit JVMs probably implement them as atomic operations
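For the long/double point, the Java Language Specification (section 17.7) states that reads and writes of volatile long and double values are always atomic. The sketch below relies on that guarantee; TornReadCheck and onlyWholeValuesSeen are names made up for the example.

```java
// Illustrative sketch; names are made up. Relies on the JLS 17.7 guarantee
// that reads and writes of volatile long values are atomic.
class TornReadCheck {
    private static volatile long value = 0L;

    // A writer flips between 0 and -1 (all 64 bits set) while this thread reads.
    // Returns true if no read ever saw a mix of the two 32-bit halves.
    static boolean onlyWholeValuesSeen(int reads) {
        Thread writer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                value = 0L;
                value = -1L;
            }
        });
        writer.setDaemon(true);
        writer.start();
        boolean ok = true;
        for (int i = 0; i < reads; i++) {
            long seen = value;
            if (seen != 0L && seen != -1L) {
                ok = false;
            }
        }
        writer.interrupt();
        return ok;
    }
}
```

Dropping the volatile modifier would remove the guarantee, although on most 64-bit JVMs you would probably still not observe a torn value.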
I won't try to explain these issues here but instead refer you to Brian Goetz's excellent book on the subject.
The book is "Java Concurrency in Practice", can be found at Amazon or any other well sorted store for computer literature.
This is a good link which can give you a little in depth information:
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html
I recently found an excellent article that explains volatile as:
First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way:
Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one).
You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system.
Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in which they get seen.
The Java volatile modifier is an example of a special mechanism to guarantee that communication happens between threads. When one thread writes to a volatile variable, and another thread sees that write, the first thread is telling the second about all of the contents of memory up until it performed the write to that volatile variable.
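The last point of the quote, that a volatile write also publishes everything written before it, can be sketched as follows. Handoff, payload, ready and transfer are hypothetical names for this example.

```java
// Illustrative sketch; the class, field and method names are assumptions.
class Handoff {
    private int payload;             // plain field, written before the flag
    private volatile boolean ready;  // volatile flag that publishes the payload

    // Hands `value` from a writer thread to the calling thread.
    int transfer(int value) {
        Thread writer = new Thread(() -> {
            payload = value; // ordinary write
            ready = true;    // volatile write: earlier writes become visible with it
        });
        writer.start();
        while (!ready) {     // volatile read
            Thread.yield();
        }
        // Having seen ready == true, this thread is guaranteed to see the payload.
        return payload;
    }
}
```

Note that only the flag is volatile; the payload is visible because its write happens before the volatile write that the reader observed.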
Additional links:
http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html
http://www.javaperformancetuning.com/news/qotm030.shtml
JVM Memory model
High level diagram
Code sample
class MainClass {
void method1() { //<- main
int variable1 = 1;
Class1 variable2 = new Class1();
variable2.method2();
}
}
class Class1 {
static Class2 classVariable4 = new Class2();
int instanceVariable5 = 0;
Class2 instanceVariable6 = new Class2();
void method2() {
int variable3 = 3;
}
}
class Class2 { }
*Notes:
thread stack contains only local variables
Members (class and instance variables) are stored on the heap even if they are primitives
What does "volatile" do, exactly?
[Java volatile]
Are writes to variable atomic? Does it depend on variable's type?
[Atomic variable]
[Java thread safe of local variables]
Other answers above are absolutely correct in that your question is not for the faint of heart.
However, I understand your pain in really wanting to get what is under the hood. For this I would point you back to the world of compilers and the lower-level predecessors to Java, i.e. assembly, C and C++.
Read about different kinds of barriers ('fences'). Understanding what a memory barrier is, and where it is necessary, will help you have an intuitive grasp of what volatile does.
One notion might be helpful: data (datum) and copies.
If you declare a variable, let's say a byte, it resides somewhere in the memory, in a data segment (roughly speaking). There are 8 bits somewhere in the memory devoted to store that piece of information.
However, there can be several copies of that data, moving around in your machine. For various technical reasons, e.g. thread's local storage, compiler optimizations. And if we have several copies, they might be out of sync.
So you should always keep this notion in mind. It's true not only for java class fields, but for cpp variables, database records (the record state data gets copied into several sessions etc.). Variables, their hidden/visible copies and the subtle syncing issues will be around forever.
Another attempt to provide a summary of things I understood from the answers here and from other sources (the first attempt was pretty far off base. I hope this one is better).
Java memory model is about propagating values written to memory in one thread to other threads so that other threads can see them as they read from memory.
In short, if you obtain a lock on a mutex, anything written by any thread that released that mutex before will be visible to your thread.
If you read a volatile variable, anything written to that volatile variable before you read it is visible to the reading thread. Also, any write to a volatile variable made by the thread that wrote to your variable, before it wrote to your variable, is visible. Moreover, in Java 1.5 any write at all, volatile or not, that happened on any thread that wrote to your volatile variable, before the write to your volatile variable, will be visible to you.
After an object is constructed, you can pass it to another thread, and all final members will be visible and fully constructed in the new thread. There are no similar guarantees about non-final members. That makes me think that assignment to a final member acts as a write to volatile variable (memory fence).
Anything that a thread wrote before its Runnable exited is visible to the thread that executes join(). Anything that a thread wrote before executing start() will be visible to the spawned thread.
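The start() and join() guarantees can be sketched with plain fields and no volatile or synchronized at all. JoinVisibility and compute are names invented for this example.

```java
// Illustrative sketch; names are made up.
class JoinVisibility {
    private int input;  // deliberately neither volatile nor synchronized
    private int result; // deliberately neither volatile nor synchronized

    int compute(int n) {
        input = n; // written before start(), so the worker is guaranteed to see it
        Thread worker = new Thread(() -> result = input * 2);
        worker.start();
        try {
            worker.join(); // once join() returns, the worker's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return result;
    }
}
```

The guarantee only covers the thread that calls join(); a third thread reading result would still need its own synchronization.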
Another thing to mention: volatile variables and synchronization have a function that's rarely mentioned: besides flushing the thread cache and providing one-thread-at-a-time access they also prevent compiler and CPU from reordering reads and writes across sync boundary.
None of it is new and the other answers have stated it better. I just wanted to write this up to clear my head.
This explains it using cities (threads) and planets (main memory).
http://mollypages.org/tutorials/javamemorymodel.mp
There are no direct flights from city to city.
You have to first go to another planet (Mars in this case) and then to another city on your home planet. So, from NYC to Tokyo, you have to go:
NYC -> Mars -> Tokyo
Now replace NYC and Tokyo with 2 threads, Mars with Main memory and the flights as acquiring/releasing locks and you have the JMM.

Java concurrency scenario -- do I need synchronization or not?

Here's the deal. I have a hash map containing data I call "program codes", it lives in an object, like so:
class Metadata
{
    private HashMap validProgramCodes;
    public HashMap getValidProgramCodes() { return validProgramCodes; }
    public void setValidProgramCodes(HashMap h) { validProgramCodes = h; }
}
I have lots and lots of reader threads each of which will call getValidProgramCodes() once and then use that hashmap as a read-only resource.
So far so good. Here's where we get interesting.
I want to put in a timer which every so often generates a new list of valid program codes (never mind how), and calls setValidProgramCodes.
My theory -- which I need help to validate -- is that I can continue using the code as is, without putting in explicit synchronization. It goes like this:
At the time that validProgramCodes are updated, the value of validProgramCodes is always good -- it is a pointer to either the new or the old hashmap. This is the assumption upon which everything hinges. A reader who has the old hashmap is okay; he can continue to use the old value, as it will not be garbage collected until he releases it. Each reader is transient; it will die soon and be replaced by a new one who will pick up the new value.
Does this hold water? My main goal is to avoid costly synchronization and blocking in the overwhelming majority of cases where no update is happening. We only update once per hour or so, and readers are constantly flickering in and out.
Use Volatile
Is this a case where one thread cares what another is doing? Then the JMM FAQ has the answer:
Most of the time, one thread doesn't
care what the other is doing. But when
it does, that's what synchronization
is for.
In response to those who say that the OP's code is safe as-is, consider this: There is nothing in Java's memory model that guarantees that this field will be flushed to main memory when a new thread is started. Furthermore, a JVM is free to reorder operations as long as the changes aren't detectable within the thread.
Theoretically speaking, the reader threads are not guaranteed to see the "write" to validProgramCodes. In practice, they eventually will, but you can't be sure when.
I recommend declaring the validProgramCodes member as "volatile". The speed difference will be negligible, and it will guarantee the safety of your code now and in future, whatever JVM optimizations might be introduced.
Here's a concrete recommendation:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class Metadata {
    // volatile guarantees readers see the most recently published map
    private volatile Map validProgramCodes = Collections.emptyMap();

    public Map getValidProgramCodes() {
        return validProgramCodes;
    }

    public void setValidProgramCodes(Map h) {
        if (h == null)
            throw new NullPointerException("validProgramCodes == null");
        // copy, then wrap, so the published snapshot can never change
        validProgramCodes = Collections.unmodifiableMap(new HashMap(h));
    }
}
Immutability
In addition to wrapping it with unmodifiableMap, I'm copying the map (new HashMap(h)). This makes a snapshot that won't change even if the caller of setter continues to update the map "h". For example, they might clear the map and add fresh entries.
Depend on Interfaces
On a stylistic note, it's often better to declare APIs with abstract types like List and Map, rather than a concrete types like ArrayList and HashMap. This gives flexibility in the future if concrete types need to change (as I did here).
Caching
The result of assigning "h" to "validProgramCodes" may simply be a write to the processor's cache. Even when a new thread starts, "h" will not be visible to a new thread unless it has been flushed to shared memory. A good runtime will avoid flushing unless it's necessary, and using volatile is one way to indicate that it's necessary.
Reordering
Assume the following code:
HashMap codes = new HashMap();
codes.putAll(source);
meta.setValidProgramCodes(codes);
If setValidProgramCodes is simply the OP's validProgramCodes = h;, the compiler is free to reorder the code something like this:
1: meta.validProgramCodes = codes = new HashMap();
2: codes.putAll(source);
Suppose after execution of writer line 1, a reader thread starts running this code:
1: Map codes = meta.getValidProgramCodes();
2: Iterator i = codes.entrySet().iterator();
3: while (i.hasNext()) {
4: Map.Entry e = (Map.Entry) i.next();
5: // Do something with e.
6: }
Now suppose that the writer thread calls "putAll" on the map between the reader's line 2 and line 3. The map underlying the Iterator has experienced a concurrent modification, and throws a runtime exception—a devilishly intermittent, seemingly inexplicable runtime exception that was never produced during testing.
Concurrent Programming
Any time you have one thread that cares what another thread is doing, you must have some sort of memory barrier to ensure that actions of one thread are visible to the other. If an event in one thread must happen before an event in another thread, you must indicate that explicitly. There are no guarantees otherwise. In practice, this means volatile or synchronized.
Don't skimp. It doesn't matter how fast an incorrect program fails to do its job. The examples shown here are simple and contrived, but rest assured, they illustrate real-world concurrency bugs that are incredibly difficult to identify and resolve due to their unpredictability and platform-sensitivity.
Additional Resources
The Java Language Specification - 17 Threads and Locks sections: §17.3 and §17.4
The JMM FAQ
Doug Lea's concurrency books
No, the code example is not safe, because there is no safe publication of any new HashMap instances. Without any synchronization, there is a possibility that a reader thread will see a partially initialized HashMap.
Check out @erickson's explanation under "Reordering" in his answer. Also I can't recommend Brian Goetz's book Java Concurrency in Practice enough!
Whether or not it is okay with you that reader threads might see old (stale) HashMap references, or might even never see a new reference, is beside the point. The worst thing that can happen is that a reader thread might obtain reference to and attempt to access a HashMap instance that is not yet initialized and not ready to be accessed.
No, by the Java Memory Model (JMM), this is not thread-safe.
There is no happens-before relation between writing and reading the HashMap implementation objects. So, although the writer thread appears to write out the object first and then the reference, a reader thread may not see the same order.
As also mentioned, there is no guarantee that the reader thread will ever see the new value. In practice, with current compilers on existing hardware, the value should get updated, unless the loop body is sufficiently small that it can be sufficiently inlined.
So, making the reference volatile is adequate under the new JMM. It is unlikely to make a substantial difference to system performance.
The moral of this story: Threading is difficult. Don't try to be clever, because sometimes (maybe not on your test system) you won't be clever enough.
As others have already noted, this is not safe and you shouldn't do this. You need either volatile or synchronized here to force other threads to see the change.
What hasn't been mentioned is that synchronized and especially volatile are probably a lot faster than you think. If it's actually a performance bottleneck in your app, then I'll eat this web page.
Another option (probably slower than volatile, but YMMV) is to use a ReentrantReadWriteLock to protect access so that multiple concurrent readers can read it. And if that's still a performance bottleneck, I'll eat this whole web site.
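import java.util.HashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Metadata
{
    private HashMap validProgramCodes;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    public HashMap getValidProgramCodes() {
        lock.readLock().lock();
        try {
            return validProgramCodes;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for readers to finish and blocks new ones.
    public void setValidProgramCodes(HashMap h) {
        lock.writeLock().lock();
        try {
            validProgramCodes = h;
        } finally {
            lock.writeLock().unlock();
        }
    }
}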
I think your assumptions are correct. The only thing I would do is declare validProgramCodes volatile.
private volatile HashMap validProgramCodes;
This way, when you update the "pointer" of validProgramCodes, you guarantee that all threads see the same latest HashMap "pointer", because they don't rely on a thread-local cached copy and go directly to memory.
The assignment will work as long as you're not concerned about reading stale values, and as long as you can guarantee that your HashMap is properly populated on initialization. At the very least, you should wrap the HashMap with Collections.unmodifiableMap to guarantee that your readers won't change or delete entries in the map, and to keep threads from stepping on each other's toes and invalidating iterators when another thread modifies the map.
(The writer above is right about volatile; I should've seen that.)
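The Collections.unmodifiableMap suggestion above could be sketched roughly as follows. The helper name ReadOnlyCodes.snapshotOf and the String types are hypothetical. Note that unmodifiableMap only returns a read-only view, so the sketch copies the source first; otherwise later changes to the original map would still show through.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the original question's code.
final class ReadOnlyCodes {
    private ReadOnlyCodes() {}

    // Returns a read-only snapshot: a defensive copy wrapped in an unmodifiable view.
    static Map<String, String> snapshotOf(Map<String, String> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }
}
```

Any attempt by a reader to call put or remove on the returned map throws UnsupportedOperationException. The wrapper by itself does not provide safe publication, so the reference to the snapshot still needs to be published through a volatile field or a lock.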
While this is not the best solution for this particular problem (erickson's idea of a new unmodifiableMap is), I'd like to take a moment to mention the java.util.concurrent.ConcurrentHashMap class introduced in Java 5, a version of HashMap specifically built with concurrency in mind. This construct does not block on reads.
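For comparison, a rough sketch of the ConcurrentHashMap route, where the map is updated in place instead of being swapped. The class ConcurrentCodes and its methods are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper around a shared, in-place updated map.
final class ConcurrentCodes {
    private final Map<String, String> codes = new ConcurrentHashMap<>();

    void put(String code, String description) {
        codes.put(code, description);
    }

    // Lookups do not block, even while another thread is writing.
    boolean isValid(String code) {
        return codes.containsKey(code);
    }

    // Each step is thread-safe, but the refresh as a whole is NOT atomic:
    // a concurrent reader can observe a mix of old and new entries.
    void refresh(Map<String, String> fresh) {
        codes.putAll(fresh);
        codes.keySet().retainAll(fresh.keySet());
    }
}
```

That non-atomic refresh is one reason swapping in a complete new read-only map fits this particular problem better.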
Check this post about concurrency basics. It should be able to answer your question satisfactorily.
http://walivi.wordpress.com/2013/08/24/concurrency-in-java-a-beginners-introduction/
I think it's risky. Threading results in all kinds of subtle issues that are a giant pain to debug. You might want to look at FastHashMap, which is intended for read-only threading cases like this.
At the least, I'd also declare validProgramCodes to be volatile so that the reference won't get optimized into a register or something.
If I read the JLS correctly (no guarantees there!), accesses to references are always atomic, period. See Section 17.7 Non-atomic Treatment of double and long
So, if the access to a reference is always atomic and it doesn't matter what instance of the returned Hashmap the threads see, you should be OK. You won't see partial writes to the reference, ever.
Edit: After review of the discussion in the comments below and other answers, here are references/quotes from
Doug Lea's book (Concurrent Programming in Java, 2nd Ed), p 94, section 2.2.7.2 Visibility, item #3: "The first time a thread access a field of an object, it sees either the initial value of the field or the value since written by some other thread."
On p. 94, Lea goes on to describe risks associated with this approach:
The memory model guarantees that, given the eventual occurrence of the above operations, a particular update to a particular field made by one thread will eventually be visible to another. But eventually can be an arbitrarily long time.
So when it absolutely, positively, must be visible to any calling thread, volatile or some other synchronization barrier is required, especially in long running threads or threads that access the value in a loop (as Lea says).
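The long-running loop case mentioned above might look like the following sketch. PollingWorker and its method names are hypothetical. Without volatile on the flag, the JIT compiler is allowed to read it once and never again, so the loop could spin forever.

```java
// Hypothetical worker that spins until another thread asks it to stop.
final class PollingWorker implements Runnable {
    // volatile forces every loop iteration to re-read the flag.
    private volatile boolean stopRequested = false;

    void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        while (!stopRequested) {
            // Real work would go here; the body is left empty on purpose.
        }
    }
}
```

A caller would start the worker on its own thread, call requestStop() from another thread, and then join the worker thread.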
However, in the case where there is a short-lived thread, as implied by the question, with new threads for new readers, and it does not impact the application to read stale data, synchronization is not required.
@erickson's answer is the safest in this situation, guaranteeing that other threads will see the changes to the HashMap reference as they occur. I'd suggest following that advice simply to avoid the confusion over the requirements and implementation that resulted in the "down votes" on this answer and the discussion below.
I'm not deleting the answer in the hope that it will be useful. I'm not looking for the "Peer Pressure" badge... ;-)
