Let's say I have a very simple web service whose only task is to count how many times its endpoint was called. The endpoint is /hello.
@Controller
public class HelloController {

    private int calls = 0;

    @RequestMapping("/hello")
    public String hello() {
        incrementCalls();
        return "hello";
    }

    private void incrementCalls() {
        calls++;
    }
}
Now this works fine as long as two users don't call /hello at the same time. But when parallel calls to /hello do happen, the calls variable might only get incremented once (if I am not mistaken). So obviously some kind of synchronization needs to take place here.
The question is what would be the best way to make this method thread-safe?
The reason calls++ can behave differently than you expect is that it is not atomic. Atomic operations happen in such a way that the entire operation cannot be intercepted by another thread. Atomicity is implemented either by locking the operation, or by using hardware that already performs it in an atomic manner.
Incrementing is most likely not an atomic operation, as you have supposed, since it is a shortcut for calls = calls + 1;. It could indeed happen that two threads retrieve the same value for calls before either has a chance to increment. Both would then store the same value back, instead of one getting the already-incremented value.
There are a few simple ways of turning the get-and-increment into an atomic operation. The simplest one for you, not requiring any imports, is to make your method synchronized:
private synchronized void incrementCalls() {
    calls++;
}
This will implicitly lock on the HelloController object it belongs to whenever a thread enters the method. Other threads will have to wait until the lock is released to enter the method, making the whole method into an atomic operation.
Another option is to explicitly synchronize only the portion of the code that needs it. This is often a better choice for large methods where only a small part needs to be atomic, since synchronization is fairly expensive in time and space:
private void incrementCalls() {
    synchronized (this) {
        calls++;
    }
}
Making the whole method synchronized is just a shortcut for wrapping its entire contents in synchronized(this).
java.util.concurrent.atomic.AtomicInteger handles the synchronization for you, turning most of the operations you would want to do on an integer into atomic operations. In this case you could call getAndIncrement() (or getAndAdd(1)). This is probably the cleanest solution in terms of keeping the code legible, as it reduces nesting and uses carefully designed library functions.
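For illustration, here is a minimal sketch of the original controller rewritten around AtomicInteger (assuming the same Spring annotations as in the question):

import java.util.concurrent.atomic.AtomicInteger;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;

@Controller
public class HelloController {

    // incrementAndGet() performs the read-modify-write as a single atomic step,
    // so no explicit locking is needed
    private final AtomicInteger calls = new AtomicInteger();

    @RequestMapping("/hello")
    public String hello() {
        calls.incrementAndGet();
        return "hello";
    }
}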
Related
I'm working with a framework that requires a callback when sending a request. Each callback has to implement this interface. The methods in the callback are invoked asynchronously.
public interface ClientCallback<RESP extends Response>
{
    public void onSuccessResponse(RESP resp);

    public void onFailureResponse(FailureResponse failure);

    public void onError(Throwable e);
}
To write integration tests with TestNG, I wanted to have a blocking callback. So I used a CountDownLatch to synchronize between threads.
Is the AtomicReference really needed here or is a raw reference okay? I know that if I use a raw reference and a raw integer (instead of CountDownLatch), the code wouldn't work because visibility is not guaranteed. But since the CountDownLatch is already synchronized, I wasn't sure whether I needed the extra synchronization from AtomicReference.
Note: The Result class is immutable.
public class BlockingCallback<RESP extends Response> implements ClientCallback<RESP>
{
    private final AtomicReference<Result<RESP>> _result = new AtomicReference<Result<RESP>>();
    private final CountDownLatch _latch = new CountDownLatch(1);

    public void onSuccessResponse(RESP resp)
    {
        _result.set(new Result<RESP>(resp, null, null));
        _latch.countDown();
    }

    public void onFailureResponse(FailureResponse failure)
    {
        _result.set(new Result<RESP>(null, failure, null));
        _latch.countDown();
    }

    public void onError(Throwable e)
    {
        _result.set(new Result<RESP>(null, null, e));
        _latch.countDown();
    }

    public Result<RESP> getResult(final long timeout, final TimeUnit unit) throws InterruptedException, TimeoutException
    {
        if (!_latch.await(timeout, unit))
        {
            throw new TimeoutException();
        }
        return _result.get();
    }
}
You don't need another synchronization object (AtomicReference) here. The point is that the variable is set before countDown() is invoked in one thread and read after await() returns in another thread. CountDownLatch already performs the thread synchronization and establishes the memory barrier, so the order of the write before and the read after is guaranteed. Because of this you don't even need to make that field volatile.
A good starting point is the javadoc (emphasis mine):
Memory consistency effects: Until the count reaches zero, actions in a thread prior to calling countDown() happen-before actions following a successful return from a corresponding await() in another thread.
Now there are two options:
either you never call the onXxx setter methods once the count is 0 (i.e. you only call one of the methods once) and you don't need any extra synchronization
or you may call the setter methods more than once and you do need extra synchronization
If you are in scenario 2, you need to make the variable at least volatile (no need for an AtomicReference in your example).
If you are in scenario 1, you need to decide how defensive you want to be:
to err on the safe side you can still use volatile
if you are confident that the calling code won't misuse the class, you can use a normal variable, but I would at least make it clear in the javadoc of the methods that only the first call to the onXxx methods is guaranteed to be visible
Finally, in scenario 1, you may want to enforce the fact that the setters can only be called once, in which case you would probably use an AtomicReference and its compareAndSet method to make sure that the reference was null beforehand and throw an exception otherwise.
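A rough sketch of that last option, assuming the same _result and _latch fields as in the question (the helper name is made up):

// Hypothetical helper: enforces that the result can only be set once.
private void completeOnce(Result<RESP> result) {
    if (!_result.compareAndSet(null, result)) {
        throw new IllegalStateException("callback already completed");
    }
    _latch.countDown();
}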
Short answer is you don't need AtomicReference here. You'll need volatile though.
The reason is that you're only writing to and reading from the reference (Result) and not doing any composite operations like compareAndSet().
Reads and writes are atomic for reference variables and for most primitive variables (all types except long and double).
Reference: the Sun Java tutorial on atomic access:
https://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
Then there is the JLS (Java Language Specification):
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
Java 8: http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7
Java 7: http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.7
Java 6: http://docs.oracle.com/javase/specs/jls/se6/html/memory.html#17.7
Source: https://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
Atomic actions cannot be interleaved, so they can be used without fear of thread interference. However, this does not eliminate all need to synchronize atomic actions, because memory consistency errors are still possible. Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up to the change.
Since you have only a single write and a single read, and each is atomic, making the variable volatile will suffice.
Regarding the use of CountDownLatch, it's meant for waiting for n operations in other threads to complete. Since you have only one operation, you could use a Condition instead of a CountDownLatch.
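For illustration, a sketch of the same blocking behaviour built on a Lock/Condition pair instead of a CountDownLatch (only one callback method is shown; the other two would follow the same pattern, and the field names are made up):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

private final Lock lock = new ReentrantLock();
private final Condition done = lock.newCondition();
private Result<RESP> result; // guarded by lock

public void onSuccessResponse(RESP resp) {
    lock.lock();
    try {
        result = new Result<RESP>(resp, null, null);
        done.signalAll(); // wake up the waiting test thread
    } finally {
        lock.unlock();
    }
}

public Result<RESP> getResult(long timeout, TimeUnit unit)
        throws InterruptedException, TimeoutException {
    lock.lock();
    try {
        long nanos = unit.toNanos(timeout);
        while (result == null) {
            if (nanos <= 0L) {
                throw new TimeoutException();
            }
            nanos = done.awaitNanos(nanos);
        }
        return result;
    } finally {
        lock.unlock();
    }
}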
If you're interested in the usage of AtomicReference, you can check Java Concurrency in Practice (page 326); you can find the book below:
https://github.com/HackathonHackers/programming-ebooks/tree/master/Java
Or the same example used by @Binita Bharti in the following StackOverflow answer:
When to use AtomicReference in Java?
In order for an assignment to be visible across threads some sort of memory barrier must be crossed. This can be accomplished several different ways, depending on what exactly you're trying to do.
You can use a volatile field. Reads and writes to volatile fields are atomic and visible across threads.
You can use an AtomicReference. This is effectively the same as a volatile field, but it's a little more flexible (you can reassign and pass around references to the AtomicReference) and has a few extra operations, like compareAndSet().
You can use a CountDownLatch or similar synchronizer class, but you need to pay close attention to the memory invariants they offer. CountDownLatch, for instance, guarantees that all threads that await() will see everything that occurs in a thread that calls countDown() up to when countDown() is called.
You can use synchronized blocks. These are even more flexible, but require more care: both the write and the read must be synchronized, otherwise the write may not be seen (a minimal sketch follows after this list).
You can use a thread-safe collection, such as a ConcurrentHashMap. Overkill if all you need is a cross-thread reference, but useful for storing structured data that multiple threads need to access.
This isn't intended to be a complete list of options, but hopefully you can see there are several ways to ensure a value becomes visible to other threads, and that AtomicReference is simply one of those mechanisms.
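As a minimal sketch of the synchronized-blocks option mentioned above (illustrative class, not from the question):

public class SharedHolder {

    private Object value; // guarded by "this"

    // Both the write and the read lock on the same monitor, so a value
    // set by one thread is guaranteed to be visible to a later reader.
    public synchronized void set(Object v) {
        value = v;
    }

    public synchronized Object get() {
        return value;
    }
}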
I am trying to wrap my head around thread safety in Java (or in general). I have this class (which I hope complies with the definition of a POJO) which also needs to be compatible with JPA providers:
public class SomeClass {

    private Object timestampLock = new Object();

    // are "volatile"s necessary?
    private volatile java.sql.Timestamp timestamp;
    private volatile String timestampTimeZoneName;

    private volatile BigDecimal someValue;

    public ZonedDateTime getTimestamp() {
        // is synchronisation necessary here? is this the correct usage?
        synchronized (timestampLock) {
            return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
        }
    }

    public void setTimestamp(ZonedDateTime dateTime) {
        // is this the correct usage?
        synchronized (timestampLock) {
            this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
            this.timestampTimeZoneName = dateTime.getZone().getId();
        }
    }

    // is synchronisation required?
    public BigDecimal getSomeValue() {
        return someValue;
    }

    // is synchronisation required?
    public void setSomeValue(BigDecimal val) {
        someValue = val;
    }
}
As stated in the comments in the code, is it necessary to declare timestamp and timestampTimeZoneName volatile, and are the synchronized blocks used as they should be? Or should I use only the synchronized blocks and not declare timestamp and timestampTimeZoneName volatile? A timestampTimeZoneName should not be erroneously matched with another timestamp's.
This link says:
Reads and writes are atomic for all variables declared volatile (including long and double variables)
Should I understand that accesses to someValue in this code through the setter/getter are thread safe thanks to volatile definitions? If so, is there a better (I do not know what "better" might mean here) way to accomplish this?
To determine if you need synchronized, try to imagine a place where you can have a context switch that would break your code.
In this case, if the context switch happens where I put the comment, then getTimestamp() could read the timestamp from one update and the time-zone name from another.
Also, although assignments are atomic, the expression java.sql.Timestamp.from(dateTime.toInstant()); certainly isn't, so you can get a context switch in between dateTime.toInstant() and the call to from. In short, you definitely need the synchronized blocks.
synchronized (timestampLock) {
    this.timestamp = java.sql.Timestamp.from(dateTime.toInstant());
    // CONTEXT SWITCH HERE
    this.timestampTimeZoneName = dateTime.getZone().getId();
}

synchronized (timestampLock) {
    return ZonedDateTime.ofInstant(timestamp.toInstant(), ZoneId.of(timestampTimeZoneName));
}
As for volatile, I'm pretty sure the fields need it. You have to guarantee that each thread is definitely getting the most up-to-date value of a variable.
That is the contract of volatile. Although visibility may already be covered by the synchronized blocks, so volatile is not strictly necessary here, it's good to write anyway. If the synchronized block already does the job of volatile, the VM won't pay for the guarantee twice. This means volatile won't cost you anything extra, and it's a very good flashing light that says to the programmer: "I'M USED IN MULTIPLE THREADS".
For someValue: if there's no synchronized block here, then volatile is definitely necessary. If you call the setter in one thread, the other thread has no cue that the value may have been updated outside of it, so it may use an old, cached value. The JIT can do a lot of funny optimizations if it assumes a single thread, ones that can simply break your program.
Now I'm not entirely certain whether synchronized is required here. My guess is no. I would add it anyway to be safe, though. Or you can let Java worry about the synchronization and use http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
Nothing new here, this is just a more explicit version of something @Cruncher already said:
You need synchronized whenever it is important for two or more fields in your program to be consistent with one another. Suppose you have two parallel lists, and your code depends on them both being the same length. That's called an invariant, as in: the two lists are invariably the same length.
How can you write a method, append(x,y), that adds a new pair of values to the lists without temporarily breaking the invariant? You can't. The method must add one item to the first list, breaking the invariant, and then add the other item to the second list, fixing it again. There's no other way.
In a single-threaded program, that temporary broken state is no problem because no other method can possibly use the lists while append(x,y) is running. That's no longer true in a multithreaded program. In the worst case, append(x,y) could add x to the x list, and then the scheduler could suspend the thread at that exact moment to allow other threads to run. The CPUs could execute millions of instructions before append(x,y) gets to finish the job and make the lists right again. During all of that time, other threads would see the broken invariant, and possibly corrupt your data or crash the program as a result.
The fix is for append(x,y) to be synchronized on some object, and (this is the important part), for every other method that uses the lists to be synchronized on the same object. Since only one thread can be synchronized on a given object at a given time, it will not be possible for any other thread to see the lists in an inconsistent state.
So, if thread A calls append(x,y), and thread B tries to look at the lists "at the same time", will thread B see what the lists looked like before or after thread A did its work? That's called a data race. And with only the synchronization that I have described so far, there's no way to know which thread will win. All we've done so far is to guarantee one particular invariant.
If it matters which thread wins the race, then that means that there is some higher-level invariant that also needs protection. You will have to add more synchronization to protect that one too. "Thread safety" -- two little words to name a subject that is both broad and deep.
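A minimal sketch of that idea, using a hypothetical class with the two-list invariant described above:

import java.util.ArrayList;
import java.util.List;

public class PairedLists {

    private final List<String> xs = new ArrayList<String>();
    private final List<Integer> ys = new ArrayList<Integer>();

    // Both lists are modified while holding the same lock, so no other
    // thread can observe them with different lengths.
    public synchronized void append(String x, Integer y) {
        xs.add(x); // invariant temporarily broken...
        ys.add(y); // ...and restored before the lock is released
    }

    // Readers must synchronize on the same object to see a consistent pair.
    public synchronized int size() {
        return xs.size(); // equals ys.size() by the invariant
    }
}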
Good Luck, and Have Fun!
// is synchronisation required?
public BigDecimal getSomeValue() {
    return someValue;
}

// is synchronisation required?
public void setSomeValue(BigDecimal val) {
    someValue = val;
}
I think yes, you are required to use synchronization here: consider an example in which one thread is setting the value while, at the same time, another thread is trying to read it through the getter. If the shared variable is accessed inside these methods, then they need the synchronization block, as in the example above.
Is this Java class thread-safe, or does the reset method need to be synchronized too? If so, can someone tell me the reason why?
public class NamedCounter {
    private int count;

    public synchronized void increment() { count++; }
    public synchronized int getCount() { return count; }
    public void reset() { count = 0; }
}
Not without synchronizing reset(), and even then you will run into cases where you need more methods. For example:
NamedCounter counter = new NamedCounter();
counter.increment();
// at this exact point (before reaching the line below) another thread might have changed the value of counter!!!!
if (counter.getCount() == 1) {
    // do something... this is not thread safe since you depended on a value that might have been changed by another thread
}
To fix the above you need something like
NamedCounter counter = new NamedCounter();
if (counter.incrementAndGet() == 1) { // incrementAndGet() must be a synchronized method
    // do something... now it is thread safe
}
Instead, use Java's built-in class AtomicInteger, which covers all cases. Or if you are trying to learn thread safety, then use AtomicInteger as a standard (to learn from).
For production code, go with AtomicInteger without even thinking twice! Please note that using AtomicInteger does not automatically guarantee thread safety in your code. You MUST make use of the methods that are provided by the API. They are there for a reason.
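For illustration, a sketch of NamedCounter rebuilt on AtomicInteger (method names mirror the original, plus the compound operation discussed above):

import java.util.concurrent.atomic.AtomicInteger;

public class NamedCounter {

    private final AtomicInteger count = new AtomicInteger();

    public void increment() { count.incrementAndGet(); }

    public int getCount() { return count.get(); }

    public void reset() { count.set(0); }

    // Check-then-act done as one atomic step, unlike increment() followed by getCount()
    public int incrementAndGet() { return count.incrementAndGet(); }
}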
Note that synchronized is not just about mutual exclusion, it is fundamentally about the proper ordering of operations in terms of the visibility of their actions. Therefore reset must be synchronized as well, otherwise the write it makes may occur concurrently with the other two methods and has no guarantee of being visible.
To conclude, your class is not thread-safe as it stands, but will be as soon as you synchronize the reset method.
You have to synchronize your reset() method also.
To make a class thread-safe you have to synchronize all paths that access a variable; otherwise you will get undesired results from the unsynchronized paths.
You need to add synchronized to the reset method too, and then the class will be synchronized. But this way you achieve synchronization through locks, that is, each thread accessing a method will lock on the NamedCounter object instance.
However, if you use AtomicInteger for your count variable, you don't need to synchronize anymore, because it uses the CAS CPU operation to achieve atomicity without locking.
Not an answer, but too long for a comment:
If reset() is synch'ed, then the 0 becomes visible to any thread that reads or increments the counter later. Without synchronization, there is no visibility guarantee. Looking at the interaction of a concurrent increment and the unsynchronized reset: it may be that the 0 becomes visible to the incrementing thread before it enters the method, in which case the result will be 1. If the counter is set to 0 between increment's read and write, the reset will be forgotten. If it is set after the write, the end result will be 0. So, if you want to assert that for every reading thread the counter is 0 after reset, that method must be synchronized, too. But David Schwartz is correct that those low-level synchronizations make little sense without higher-level semantics of those interactions.
I have a problem with limiting concurrent access to a method. I have a method, MyService, that can be called from many places at many times. This method must return a String that should be updated according to some rules. For this, I have an updatedString class. Before returning the String, it makes sure the String is up to date; if not, it updates it. Many threads could read the String at the same time, but ONLY ONE at a time should renew the String if it is out of date.
public final class updatedString {

    private static final String UPstring;
    private static final Object lock = new Object();

    public static String getUpdatedString(){
        synchronized(lock){
            if(stringNeedRenewal()){
                renewString();
            }
        }
        return getString();
    }
...
This works fine. If I have 7 threads getting the String, it guarantees that, if necessary, ONLY one thread is updating the String.
My question is, is it a good idea to have all this static? Why if not? Is it fast? Is there a better way to do this?
I have read posts like this:
What Cases Require Synchronized Method Access in Java? which suggests that static mutable variables are not a good idea, and neither are static classes. But I cannot see any deadlock in the code, or a better valid solution. Only that some threads will have to wait until the String is updated (if necessary) or wait for another thread to leave the synchronized block (which causes a small delay).
If the method is not static, then I have a problem: this will not work, since a synchronized method only locks the particular instance the thread is using. Synchronizing the method does not work either; the lock seems to be instance-specific and not class-specific.
The other solution could be to have a Singleton that avoids creating more than one instance and then use synchronized non-static methods on it, but I do not like this solution too much.
Additional information:
stringNeedRenewal() is not too expensive although it has to read from a database. renewString() on the contrary is very expensive, and has to read from several tables on the database to finally come to an answer. The String needs arbitrary renewal, but this does not happen very often (from once per hour to once per week).
@forsvarir made me think... and I think he/she was right. return getString(); MUST be inside the synchronized block. At first sight it looks as if it could be outside, so that threads would be able to read it concurrently, but what happens if a thread stops running WHILE calling getString() and another thread partially executes renewString()? We could have this situation (assuming a single processor):
THREAD 1 starts getString(). The OS starts copying into memory the bytes to be returned.
THREAD 1 is stopped by the OS before finishing the copy.
THREAD 2 enters the synchronized block and starts renewString(), changing the original String in memory.
THREAD 1 gets control back and finishes getString() using a corrupted String!! So it copied one part from the old string and another part from the new one.
Having the read inside the synchronized block can make everything very slow, since threads could only access it one at a time.
As @Jeremy Heiler pointed out, this is the abstract problem of a cache. If the cache is old, renew it. If not, use it. It is clearer to picture the problem like this instead of as a single String (or imagine that there are 2 strings instead of one). So what happens if someone is reading at the same time as someone else is modifying the cache?
First of all, you can remove the lock and the synchronized block and simply use:
public static synchronized String getUpdatedString(){
    if(stringNeedRenewal()){
        renewString();
    }
    return getString();
}
this synchronizes on the updatedString.class object.
Another thing you can do is use double-checked locking to prevent unnecessary waiting. Declare the string to be volatile and:
public static String getUpdatedString(){
    if(stringNeedRenewal()){
        synchronized(lock) {
            if(stringNeedRenewal()){
                renewString();
            }
        }
    }
    return getString();
}
Then, whether to use static or not - it seems it should be static, since you want to invoke it without any particular instance.
I would suggest looking into a ReentrantReadWriteLock. (Whether or not it is performant is up to you to decide.) This way you can have many read operations occur simultaneously.
Here is the example from the documentation:
class CachedData {
    Object data;
    volatile boolean cacheValid;
    ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    void processCachedData() {
        rwl.readLock().lock();
        if (!cacheValid) {
            // Must release read lock before acquiring write lock
            rwl.readLock().unlock();
            rwl.writeLock().lock();
            // Recheck state because another thread might have acquired
            // write lock and changed state before we did.
            if (!cacheValid) {
                data = ...
                cacheValid = true;
            }
            // Downgrade by acquiring read lock before releasing write lock
            rwl.readLock().lock();
            rwl.writeLock().unlock(); // Unlock write, still hold read
        }
        use(data);
        rwl.readLock().unlock();
    }
}
This isn't exactly what you're after, and I'm not a Java specialist, so take this with a pinch of salt :)
Perhaps the code sample you've provided is contrived, but if not, I'm unclear what the purpose of the class is. You only want one thread to update the string to its new value. Why? Is it to save effort (because you'd rather use the processor cycles on something else)? Is it to maintain consistency (once a certain point is reached, the string must be updated)?
How long is the cycle between required updates?
Looking at your code...
public final class updatedString {

    private static final String UPstring;
    private static final Object lock = new Object();

    public static String getUpdatedString(){
        synchronized(lock){
            // One thread is in this block at a time
            if(stringNeedRenewal()){
                renewString(); // This updates the shared string?
            }
        }
        // At this point, you're calling out to a method. I don't know what the
        // method does, I'm assuming it just returns UPstring, but at this point,
        // you're no longer synchronized. The string actually returned may or may
        // not be the same one that was present when the thread went through the
        // synchronized section hence the question, what is the purpose of the
        // synchronization...
        return getString(); // This returns the shared string?
    }
The right locking / optimizations depend upon the reason you're putting them in place, the likelihood of a write being required and, as Paulo has said, the cost of the operations involved.
For some situations where writes are rare, and obviously depending upon what renewString does, it may be desirable to use an optimistic write approach, where each thread checks if a refresh is required, performs the update on a local value, and only at the end assigns the value across to the field being read (you need to track the age of your updates if you follow this approach). This would be faster for reading, since the check for 'does the string need renewing' can be performed outside of the synchronised section. Various other approaches could be used, depending upon the individual scenario; a rough sketch of the optimistic approach follows below.
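The sketch reuses the lock and stringNeedRenewal() from the question; computeFreshString() is a made-up helper that returns the renewed value instead of mutating shared state:

private static volatile String cached;  // last published value
private static volatile long cachedAt;  // age of the last update

public static String getUpdatedString() {
    if (stringNeedRenewal()) {               // cheap check, outside any lock
        String fresh = computeFreshString(); // expensive work done on a local
        long now = System.currentTimeMillis();
        synchronized (lock) {
            // Only publish if nobody installed a newer value in the meantime.
            if (now > cachedAt) {
                cached = fresh;
                cachedAt = now;
            }
        }
    }
    return cached;
}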
As long as your lock is static, everything else doesn't have to be, and things will work just as they do now.
Let's say I have a class like this in Java:
public class Function {
    public static int foo(int n) {
        return n + 1;
    }
}
What happens if I call the foo method like this from a thread?
x = Function.foo(y);
Can I do that with two threads, without them waiting for each other? Let's say that foo takes a while, and that it gets called a lot, so that each thread would likely be trying to use foo at the same time. Can they do that, or do I have to make all the methods in Function instance methods and give each thread its own Function object?
The code you are calling does not store any state, and thus will return deterministically whether called from one thread or many. It's not as if the "lines of code" need to be guarded (as you seem to imply in your question): it's fine to run the same lines of code from multiple threads, provided they don't share data (which in this case they don't).
The problem comes if you have code like:
public class Function {
    private static int last = 0;

    public static int foo(int n) {
        last += n;
        return last;
    }
}
This is when you start needing to worry about different threads clobbering the static field last.
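If such shared state really is needed, one minimal fix is to make the read-modify-write atomic; for example, a sketch using AtomicInteger:

import java.util.concurrent.atomic.AtomicInteger;

public class Function {

    private static final AtomicInteger last = new AtomicInteger();

    public static int foo(int n) {
        // addAndGet() performs the read-modify-write atomically,
        // so concurrent callers cannot clobber each other's update.
        return last.addAndGet(n);
    }
}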
As long as foo() uses just parameters and local variables, any number of threads can call it at once (and if you have multiple cores, they might even execute it at the same time).
The problem with calling the same method from multiple threads appears when that method accesses shared state. For example, if Functions also declared a static map:
private static Map<String,Object> myObjects;
In this case, two threads could attempt to update the map at the same time. Since most map implementations aren't internally synchronized, the two threads could change the same internal structures, and corrupt the map data.
Synchronizing on shared state, while easy in theory, is not so easy in practice. For example, you could simply use a ConcurrentHashMap, which can be accessed by multiple threads simultaneously. However, it makes no guarantees about preservation of state between calls, so you could put something into the map at time X, and some other thread could remove it at time Y, before your first thread tries to access it again at time Z.
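For example, a sketch of that pitfall and the usual fix (assuming Java 8+ for computeIfAbsent; the class and method names are illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class Functions {

    private static final ConcurrentMap<String, Object> myObjects =
            new ConcurrentHashMap<String, Object>();

    public static Object lookup(String key) {
        // A separate get() followed by put() would not be atomic as a pair:
        // another thread could insert or remove the entry in between.
        // computeIfAbsent() performs the check and the insert as one atomic operation.
        return myObjects.computeIfAbsent(key, k -> new Object());
    }
}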
You should be aware that each thread has its own stack, and n (the only variable here) lives on the stack; so those threads do not interfere.