Let's say I have a class like this in Java:
public class Function {
public static int foo(int n) {
return n+1;
}
}
What happens if I call the foo method like this from a thread?
x = Function.foo(y);
Can I do that with two threads, without them waiting for each other? Let's say that foo takes a while and gets called a lot, so each thread would likely be trying to use foo at the same time. Can they do that, or do I have to make all the methods in Function instance methods and give each thread its own Function object?
The code you are calling does not store any state, so it returns the same result for the same input whether it is called from one thread or many. The lines of code themselves do not need to be guarded (as your question seems to imply): it is fine to run the same code from multiple threads, provided the threads don't share data, and here they don't.
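A minimal sketch of that point, assuming a hypothetical StatelessCallDemo class and an arbitrary four-thread pool (neither is from the question). Several tasks call the same static method at once, and each call still returns n + 1 because every invocation works only on its own parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; foo mirrors Function.foo from the question.
final class StatelessCallDemo {
    // Reads and writes only its own parameter, so there is nothing to share.
    static int foo(int n) {
        return n + 1;
    }

    // Submits foo(0), foo(1), ..., foo(tasks - 1) to a pool and sums the results.
    static long sumOfFoo(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int arg = i;
                Callable<Integer> call = () -> foo(arg);
                futures.add(pool.submit(call));
            }
            long sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // waits for that task to finish
            }
            return sum;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Under these assumptions, StatelessCallDemo.sumOfFoo(1000) returns 500500, the same total a single-threaded loop would produce.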
The problem comes if you have code like this:
public class Function {
private static int last = 0;
public static int foo(int n) {
last += n;
return last;
}
}
This is when you need to start worrying about different threads clobbering the static last.
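One way to stop that clobbering, sketched under the assumption that a plain lock is acceptable here. SynchronizedFunction and growthAfter are hypothetical names introduced only for this illustration.

```java
// Hypothetical variant of the Function class above with the shared field guarded.
final class SynchronizedFunction {
    private static int last = 0;

    // A static synchronized method locks on SynchronizedFunction.class,
    // so the read-add-write on last cannot interleave between threads.
    static synchronized int foo(int n) {
        last += n;
        return last;
    }

    // Starts the given number of threads, each calling foo(1) repeatedly,
    // and returns how much last grew once all of them have finished.
    static int growthAfter(int threads, int callsPerThread) {
        int before = foo(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    foo(1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return foo(0) - before;
    }
}
```

The price is that callers now do wait for each other while inside foo, which is exactly what the stateless version avoids.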
As long as foo() uses just parameters and local variables, any number of threads can call it at once (and if you have multiple cores, they might even execute it at the same time).
The problem with calling the same method from multiple threads appears when that method accesses shared state. For example, if Function also declared a static map:
private static Map<String,Object> myObjects;
In this case, two threads could attempt to update the map at the same time. Since most map implementations aren't internally synchronized, the two threads could change the same internal structures, and corrupt the map data.
Synchronizing on shared state, while easy in theory, is not so easy in practice. For example, you could simply use a ConcurrentHashMap, which can be accessed by multiple threads simultaneously. However, it makes no guarantees about preservation of state between calls, so you could put something into the map at time X, and some other thread could remove it at time Y, before your first thread tries to access it again at time Z.
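A small sketch of the difference between thread-safe individual calls and a thread-safe sequence of calls, assuming a hypothetical HitCounts class. A separate get followed by put leaves a gap that another thread can slip into, whereas ConcurrentHashMap.merge performs the read-modify-write for one key as a single atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class for illustration only.
final class HitCounts {
    private static final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Racy alternative (not used here): get(key), add one, put(key, ...) are
    // three separate calls, so another thread can change the entry in between.

    // merge inserts 1 if the key is absent, otherwise stores oldValue + 1,
    // and does so atomically for that key.
    static void record(String key) {
        counts.merge(key, 1, Integer::sum);
    }

    static int countFor(String key) {
        return counts.getOrDefault(key, 0);
    }

    // Hammers one key from several threads and returns how much its count grew.
    static int growthAfter(String key, int threads, int perThread) {
        int before = countFor(key);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    record(key);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return countFor(key) - before;
    }
}
```

This only covers a single-key update; invariants that span several keys or several calls still need their own coordination.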
You should be aware that each thread has its own stack, and n (the only variable here) lives on the stack; so those threads do not interfere.
I was reading Effective Java, and came across a condition where Joshua Bloch recommends something like
class MyComparator implements Comparator<String>{
private MyComparator(){}
private static final MyComparator INSTANCE = new MyComparator();
public int compare(String s1,String s2){
// Omitted
}
}
XYZComparator is stateless: it has no fields. Hence all instances of the class are functionally equivalent. Thus it should be a singleton to save on unnecessary object creation.
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields? Wouldn't this cause a multithreading issue when compare is called from two threads in parallel? Or have I misunderstood something basic? Is it that every thread has autonomy of execution if no fields are shared?
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
I would dare to say yes. Having no fields makes a class stateless and, thus, immutable, which is always desirable in a multithreading environment.
Stateless objects are always thread-safe.
Immutable objects are always thread-safe.
An excerpt from Java Concurrency In Practice:
Since the actions of a thread accessing a stateless object cannot affect the correctness of operations in other threads, stateless objects are thread-safe.
Stateless objects are always thread-safe.
The fact that most servlets can be implemented with no state greatly reduces the burden of making servlets thread-safe. It is only when servlets want to remember things from one request to another that the thread-safety requirement becomes an issue.
...
An immutable object is one whose state cannot be changed after construction. Immutable objects are inherently
thread-safe; their invariants are established by the constructor, and if their state cannot be changed, these invariants
always hold.
Immutable objects are always thread-safe.
Immutable objects are simple. They can only be in one state, which is carefully controlled by the constructor. One of the
most difficult elements of program design is reasoning about the possible states of complex objects. Reasoning about
the state of immutable objects, on the other hand, is trivial.
Wouldn't this cause a multithreading issue when compare is called from two threads in parallel?
No. Each thread has its own stack where local variables (including method parameters) are stored. A thread's stack isn't shared, so there is no way for another thread to corrupt it.
Another good example would be a stateless servlet. One more extract from that great book.
@ThreadSafe
public class StatelessFactorizer implements Servlet {
public void service(ServletRequest req, ServletResponse resp) {
BigInteger i = extractFromRequest(req);
BigInteger[] factors = factor(i);
encodeIntoResponse(resp, factors);
}
}
StatelessFactorizer is, like most servlets, stateless: it has no fields and references no fields from other classes. The
transient state for a particular computation exists solely in local variables that are stored on the thread's stack and are
accessible only to the executing thread. One thread accessing a StatelessFactorizer cannot influence the result of
another thread accessing the same StatelessFactorizer; because the two threads do not share state, it is as if they
were accessing different instances.
Is it that every thread has autonomy of execution if no fields are shared?
Each thread has its own program counter, stack, and local variables. There is a term "thread confinement" and one of its forms is called "stack confinement".
Stack confinement is a special case of thread confinement in which an object can only be reached through local variables. Just as encapsulation can make it easier to preserve invariants, local variables can make it easier to confine objects to a thread. Local variables are intrinsically confined to the executing thread; they exist on the executing thread's stack, which is not accessible to other threads.
To read:
Java Concurrency In Practice
Thread Confinement
Stack Confinement using local object reference
Multithreading issues are caused by unwanted changes in state. If there is no state that is changed, there are no such issues. That is also why immutable objects are very convenient in a multithreaded environment.
In this particular case, the method only operates on the input parameters s1 and s2 and no state is kept.
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
"Always" is too strong a claim. It's easy to construct an artificial class where instances are not thread-safe despite having no fields:
public class NotThreadSafe {
private static final class MapHolder {
private static final Map<NotThreadSafe, StringBuilder> map =
// use ConcurrentHashMap so that different instances don't
// interfere with each other:
new ConcurrentHashMap<>();
}
private StringBuilder getMyStringBuilder() {
return MapHolder.map.computeIfAbsent(this, k -> new StringBuilder());
}
public void append(final Object s) {
getMyStringBuilder().append(s);
}
public String get() {
return getMyStringBuilder().toString();
}
}
. . . but that code is not realistic. If your instances don't have any mutable state, then they'll naturally be threadsafe; and in normal Java code, mutable state means instance fields.
XYZComparator is stateless: it has no fields. Hence all instances of the class are functionally equivalent. Thus it should be a singleton to save on unnecessary object creation.
From that point of view, the "current day" answer is probably: make MyComparator an enum. The JVM guarantees that MyComparatorEnum.INSTANCE will be a true singleton, and you don't have to worry about the subtle details that you have to consider when building singletons "yourself".
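A sketch of what such an enum could look like. The name LengthComparator and the compare-by-length body are assumptions made for illustration, since the original compare body was omitted.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical enum singleton; ordering by length is an arbitrary choice.
enum LengthComparator implements Comparator<String> {
    INSTANCE;

    // No fields are read or written, so concurrent calls cannot interfere.
    @Override
    public int compare(String s1, String s2) {
        return Integer.compare(s1.length(), s2.length());
    }
}

// Hypothetical helper showing the singleton in use.
final class EnumComparatorDemo {
    static String[] sortedByLength(String... words) {
        String[] copy = words.clone(); // leave the caller's array untouched
        Arrays.sort(copy, LengthComparator.INSTANCE);
        return copy;
    }
}
```

Callers write LengthComparator.INSTANCE wherever a Comparator<String> is expected; no constructor or static factory is needed.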
Explanation
So is it always safe to create a static final Object of whatever class it is pointing to if it has no fields?
Depends. Multi-threading issues can only occur when one thread is changing something while another thread is using it at the same time. The other thread might then not be aware of the change, due to caching and other effects. Or the result is a pure logic bug, where the author did not consider that a thread can be interrupted in the middle of an operation.
So when a class is stateless, which you have here, it is absolutely safe to be used in a multi-threaded environment. Since there is nothing for any thread to change in the first place.
Note that this also means the class must not use anything that is not thread-safe from elsewhere, for example by changing a field in some other class while another thread is using it.
Example
Here is a pretty classic example:
public class Value {
private int value;
public int getValue() {
return value;
}
public void increment() {
int current = value; // or just value++
value = current + 1;
}
}
Now, let's assume two threads both call value.increment(). One thread gets interrupted after:
int current = value; // is 0
Then the other starts and fully executes increment. So
int current = value; // is 0
value = current + 1; // is 1
So value is now 1. Now the first thread continues, the expected outcome would be 2, but we get:
value = current + 1; // is 1
Its current was already read before the second thread ran, so it is still 0.
We say that such an operation (or method, in this case) is not atomic: it can be interrupted by the scheduler partway through.
This issue can of course only happen because Value has a field value, so it has a changeable state.
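The interleaving walked through above can be replayed deterministically in a single thread. This is only a sketch: LostUpdateReplay is a hypothetical helper, and the local variable value stands in for the shared field.

```java
// Hypothetical helper that replays the interleaving from the text step by step.
final class LostUpdateReplay {
    static int replay() {
        int value = 0;             // stands in for the shared field

        int currentA = value;      // thread A reads 0, then is suspended
        int currentB = value;      // thread B reads 0
        value = currentB + 1;      // thread B writes 1
        value = currentA + 1;      // thread A resumes with its stale 0 and writes 1

        return value;              // 1, although increment() ran twice
    }
}
```

LostUpdateReplay.replay() returns 1, not 2, which is the lost update described in the answer.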
YES. It is safe to create a static final object of a class if it has no fields. Here, the Comparator provides functionality only, through its compare(String, String) method.
With multiple threads, the compare method deals with local variables only (because it belongs to a stateless class), and local variables are not shared between threads. Each thread has its own copies of the two String references on its own stack, so the calls do not interfere with each other.
Calling the compare method from two threads in parallel is safe (stack confinement). The parameters you pass to the method are stored on the calling thread's stack, which no other thread can access.
An immutable singleton is always recommended. Avoid creating mutable singletons: they introduce global state into your application, which is bad.
Edit: If the params passed are mutable object references, then you have to take special care to ensure thread safety.
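To illustrate the caveat about mutable arguments, here is a hedged sketch with hypothetical names (MutableArgumentDemo, addSquares). The method is stateless, yet it mutates the list it is handed, so when several threads pass the same list, the list itself has to be thread-safe. The sketch assumes Collections.synchronizedList is an acceptable way to achieve that.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration only.
final class MutableArgumentDemo {
    // Stateless: no fields. It still mutates whatever list the caller passes in.
    static void addSquares(List<Integer> out, int upTo) {
        for (int i = 1; i <= upTo; i++) {
            out.add(i * i);
        }
    }

    // Every thread appends to the SAME list, so the list must be thread-safe.
    // With a plain ArrayList here, elements could be lost or add() could throw.
    static int sizeAfterSharedAppends(int threads, int upTo) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> addSquares(shared, upTo));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.size();
    }
}
```

If each thread passed its own list instead, no synchronization would be needed at all.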
Let's say I have a very simple web-service whose only task is to count how many times its endpoint was called. The endpoint is /hello.
@Controller
public class HelloController {
private int calls = 0;
@RequestMapping("/hello")
public String hello() {
incrementCalls();
return "hello";
}
private void incrementCalls() {
calls++;
}
}
Now this all works fine as long as two users don't call /hello at the same time. But when parallel calls to /hello do happen, the calls variable might only get incremented once (if I am not mistaken). So obviously some kind of synchronization needs to take place here.
The question is what would be the best way to make this method thread-safe?
calls++ can behave differently from what you expect because it is not atomic. An atomic operation happens in such a way that no other thread can interleave with it partway through. Atomicity is implemented either by locking around the operation or by relying on hardware that already performs it atomically.
Incrementing an int field is not an atomic operation, as you have supposed: calls++ is shorthand for calls = calls + 1;. It can indeed happen that two threads read the same value of calls before either has stored its increment. Both then store the same value back, instead of one of them seeing the already-incremented value.
There are a few simple ways of turning the get-and-increment into an atomic operation. The simplest one for you, not requiring any imports, is to make your method synchronized:
private synchronized void incrementCalls() {
calls++;
}
This will implicitly lock on the HelloController object it belongs to whenever a thread enters the method. Other threads will have to wait until the lock is released to enter the method, making the whole method into an atomic operation.
Another method would be to explicitly synchronize the portion of the code that you want. This is often a better choice for large methods that do not need to have a lot of atomic operations, since synchronization is fairly time- and space-expensive:
private void incrementCalls() {
synchronized(this) {
calls++;
}
}
Making the whole method synchronized is just a shortcut for wrapping its entire contents in synchronized(this).
java.util.concurrent.atomic.AtomicInteger handles the synchronization to make most of the operations you would want to do on an integer into atomic operations for you. In this case you could call getAndAdd(1), or getAndIncrement(). This would probably be the cleanest solution in terms of keeping the code legible as it reduces the number of braces and uses carefully designed library functions.
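A minimal sketch of the AtomicInteger approach. HelloCounter is a hypothetical plain-Java stand-in for HelloController, with the Spring annotations left out so that it runs without a framework.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for HelloController, without the web framework.
final class HelloCounter {
    private final AtomicInteger calls = new AtomicInteger();

    String hello() {
        calls.incrementAndGet(); // atomic read-modify-write, no explicit lock
        return "hello";
    }

    int getCalls() {
        return calls.get();
    }

    // Calls hello() from several threads and returns the final count.
    static int callsAfter(int threads, int perThread) {
        HelloCounter counter = new HelloCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.hello();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.getCalls();
    }
}
```

incrementAndGet and getAndIncrement differ only in which value they return (after or before the increment); either one works when the return value is ignored.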
Just to make sure I understand the concepts presented in Java Concurrency in Practice.
Let's say I have the following program:
public class Stuff{
private int x;
public Stuff(int x){
this.x=x;
}
public int getX(){return x;}
}
public class UseStuff {
private Stuff s;
public void makeStuff(int x){
s=new Stuff(x);
}
public int useStuff(){
return s.getX();
}
}
If I let multiple threads play with this code, then I'm in trouble not only because s might end up pointing to different instances if two or more threads enter makeStuff, but also because, even if just one thread creates a new Stuff, another thread that has just entered useStuff can return either the value 0 (the default int value) or the value assigned to x by the constructor.
That all depends on whether the constructor has finished initializing x.
So at this point, to make it thread safe I must do one thing and then I can choose from two different ways.
First I must make makeStuff() atomic, so "s" will point to one object at a time.
Then I either make useStuff synchronized as well, which ensures that I get back the Stuff object's x only after its constructor has finished building it, or I make Stuff's x final, in which case the JMM makes sure that x's value is visible only after it has been initialized.
Do I understand the importance of final fields in the context of concurrency and JMM?
Do I understand the importance of final fields in the context of concurrency and JMM?
Not quite. The spec writes:
final fields also allow programmers to implement thread-safe immutable objects without synchronization. A thread-safe immutable object is seen as immutable by all threads, even if a data race is used to pass references to the immutable object between threads. This can provide safety guarantees against misuse of an immutable class by incorrect or malicious code
If you make x final, this guarantees that every thread that obtains a reference to a Stuff instance will observe x to have been assigned. It does not guarantee that any thread will obtain such a reference.
That is, in the absence of synchronization action in useStuff(), the runtime is permitted to satisfy a read of s from a register, which might return a stale value.
The cheapest correctly synchronized variant of this code is declaring s volatile, which ensures that writes to s happen-before (and are therefore visible to) subsequent reads of s. If you do that, you need not even make x final (because the write to x happens-before the write of s, the read of s happens-before the read of x, and happens-before is transitive).
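A sketch of the volatile variant described above. The names VolatileUseStuff and publishAndRead are hypothetical, Stuff is nested only to keep the example in one file, and the null check in useStuff is an addition that the question's code does not have.

```java
// Hypothetical rewrite of UseStuff with a volatile field.
final class VolatileUseStuff {
    static final class Stuff {
        private int x; // deliberately not final, as in the question

        Stuff(int x) {
            this.x = x;
        }

        int getX() {
            return x;
        }
    }

    private volatile Stuff s;

    void makeStuff(int x) {
        s = new Stuff(x); // volatile write, ordered after the write to x
    }

    int useStuff() {
        Stuff current = s; // volatile read
        if (current == null) {
            throw new IllegalStateException("makeStuff has not been called yet");
        }
        return current.getX();
    }

    // One thread publishes a Stuff; the calling thread spins until it sees it.
    static int publishAndRead(int x) {
        VolatileUseStuff holder = new VolatileUseStuff();
        Thread writer = new Thread(() -> holder.makeStuff(x));
        writer.start();
        while (holder.s == null) {
            Thread.yield(); // the volatile read guarantees the loop sees the write
        }
        return holder.useStuff();
    }
}
```

Once the reader observes a non-null s, the happens-before chain in the answer means it also observes the value the constructor assigned to x.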
Some answers claim that s can only refer to one object at a time. This is wrong; because there is no memory barrier, different threads can have their own notion about the value of s. In order for all threads to see a consistent value assigned to s, you need to declare s as volatile, or use some other memory barrier.
If you do this, you won't need to declare x as final for the correct value to be visible to all threads (but you might still want to; fields shouldn't be mutable without a reason). That's because the initialization of x happens-before the assignment of s in "source code order," and the write of the volatile field s happens-before other thread reads that value from s. If you subsequently modified the value of a non-final field x, however, you could run into trouble because the modification isn't guaranteed to be visible to other threads. Making Stuff immutable would eliminate that possibility.
Of course, there's nothing to stop threads from clobbering the value assigned to s, so different threads could still see different values for x. This isn't really a threading issue though. Even a single thread could write and then read different values of x over time. But preventing this behavior in a multi-threaded environment requires atomicity, that is, checking to see whether s has a value and assigning one if not should appear as one indivisible action to other threads. An AtomicReference would be the best solution, but the synchronized keyword would work too.
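The check-and-assign step can be made indivisible with AtomicReference.compareAndSet. The following is a sketch under assumed names (SetOnceUseStuff, successfulMakers); it also assumes that "assign only if s has no value yet" is the behavior wanted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical rewrite of UseStuff in which s can be assigned only once.
final class SetOnceUseStuff {
    static final class Stuff {
        private final int x;

        Stuff(int x) {
            this.x = x;
        }

        int getX() {
            return x;
        }
    }

    private final AtomicReference<Stuff> s = new AtomicReference<>();

    // Installs a new Stuff only if none is present; the null check and the
    // assignment happen as one atomic step. Returns true for the one winner.
    boolean makeStuff(int x) {
        return s.compareAndSet(null, new Stuff(x));
    }

    int useStuff() {
        Stuff current = s.get();
        if (current == null) {
            throw new IllegalStateException("makeStuff has not been called yet");
        }
        return current.getX();
    }

    // Lets several threads race to call makeStuff and counts how many succeeded.
    static int successfulMakers(int threads) {
        SetOnceUseStuff holder = new SetOnceUseStuff();
        AtomicInteger winners = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int x = t;
            workers[t] = new Thread(() -> {
                if (holder.makeStuff(x)) {
                    winners.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return winners.get();
    }
}
```

However many threads race, exactly one makeStuff call succeeds, and every later useStuff call sees that one object.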
What are you trying to protect by making things synchronized? Are you concerned that thread A will call makeStuff and then thread B will call useStuff afterwards and the value won't be there? I'm not sure how synchronizing any of this will help that. Depending on what problem you are trying to avoid, it might be as simple as marking s as volatile.
I'm not sure what you're doing there. Why are you creating an object and then assigning it to a field? Why save it if it can be overwritten by another call to makeStuff? It seems you are using UseStuff both as a proxy and as a factory for your actual Stuff model object. You had better separate the two:
public class StuffFactory {
public static Stuff createStuff(int value) {
return new StuffProxy(value);
}
}
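public class StuffProxy extends Stuff {
// Stuff has no no-argument constructor, so pass the value up explicitly
public StuffProxy(int value) {
super(value);
}
// Replacement for useStuff from your original UseStuff class
@Override
public int getX() {
//Put custom logic here
return super.getX();
}
}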
The logic here is that each thread is responsible for creating its own Stuff objects (using the factory), so concurrent access is no longer an issue.
I know that concurrently accessing the same object from different threads, without synchronisation, is in general a bad thing. But what about this case:
I have multiple threads running (consider two, ThreadA & ThreadB). I also have this static class to keep count of the number of times a Thread does something.
public class Counter {
static private int counter=0;
static public void incCounter() {
counter++;
}
}
What happens if ThreadA and ThreadB both call Counter.incCounter()?
It's not safe.
Each thread will attempt to read counter, add one to it, and write back the result. You're not guaranteed what order these reads and writes happen in, or even if the results are visible to each thread.
In particular, one failure case would be that each thread reads the value 0, increments it to 1, and writes back the value 1. This would give the counter the value 1 even after two threads attempted to increment it.
Consider using AtomicInteger.incrementAndGet() instead.
Its value will either be 1 or 2. There's no difference between static and non static variables in this context.
It doesn't matter whether it's a static object or an instance: if you change it from multiple threads, you're going to have a problem.
To avoid the conflict, use the keyword synchronized.
public class Counter {
static private int counter=0;
public static synchronized void incCounter() {
counter++;
}
}
This keyword allows only one thread at a time to execute incCounter().
Dave is correct, but a quick fix is just to add the "synchronized" keyword to that method declaration; if multiple threads call that method, they will block at the method boundary until the one inside (that won the race) increments and exits, then the second caller will enter.
This is a lot like designing a good "getInstance()" method on a Singleton class; you typically want it to be synchronized so you don't have the case where 2+ threads enter the method, ALL see that the "instance" is null, and then ALL create a new instance, assign it to the local member and return it.
Your threads can end up with different references to the "same" instance in that case. So you synchronize the code block, only let the first thread create the instance if it's null, and otherwise ALWAYS return the same one to all callers.
The if(instance == null) check plus the return are cheap; on the order of microseconds I believe for the future calls to getInstance (or in your example incCounter) so no need to shy away from the synchronized keyword if you need it; that's what it is for.
That being said, if you can't spare microseconds... well then you might be using the wrong language :)
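The getInstance() comparison can be sketched as follows. LazySingleton and distinctInstancesSeen are hypothetical names; the helper only counts how many different objects the threads received.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lazily initialized singleton guarded by a synchronized method.
final class LazySingleton {
    private static LazySingleton instance; // only touched while holding the class lock

    private LazySingleton() {
    }

    // synchronized makes the null check and the assignment one indivisible
    // step, so two threads can never both see null and both create an instance.
    static synchronized LazySingleton getInstance() {
        if (instance == null) {
            instance = new LazySingleton();
        }
        return instance;
    }

    // Calls getInstance() from several threads and counts the distinct objects returned.
    static int distinctInstancesSeen(int threads) {
        Set<LazySingleton> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> seen.add(getInstance()));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

With the synchronized keyword in place, every caller receives the same object regardless of how the threads are scheduled.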
Consider this class, with no instance variables and only non-synchronized methods. Can we infer from this that the class is thread-safe?
public class test{
public void test1{
// do something
}
public void test2{
// do something
}
public void test3{
// do something
}
}
It depends entirely on what state the methods mutate. If they mutate no shared state, they're thread safe. If they mutate only local state, they're thread-safe. If they only call methods that are thread-safe, they're thread-safe.
Not being thread safe means that if multiple threads try to access the object at the same time, something might change from one access to the next, and cause issues. Consider the following:
int incrementCount() {
this.count++;
// ... Do some other stuff
return this.count;
}
would not be thread safe. Why is it not? Imagine thread 1 accesses it, count is increased, then some processing occurs. While going through the function, another thread accesses it, increasing count again. The first thread, which had it go from, say, 1 to 2, would now have it go from 1 to 3 when it returns. Thread 2 would see it go from 1 to 3 as well, so what happened to 2?
In this case, you would want something like this (written here as Java, where the synchronized keyword goes before the return type):
synchronized int incrementCount() {
this.count++;
// ... Do some other stuff
return this.count;
}
The synchronized keyword here would make sure that as long as one thread is accessing it, no other threads could. This would mean that thread 1 hits it, count goes from 1 to 2, as expected. Thread 2 hits it while 1 is processing, and it has to wait until thread 1 is done. When it's done, thread 1 gets a return of 2, then thread 2 goes through, and gets the expected 3.
Now, an example, similar to what you have there, that would be entirely thread-safe, no matter what:
int incrementCount(int count) {
count++;
// ... Do some other stuff
return count;
}
As the only variables being touched here are fully local to the function, there is no case where two threads accessing it at the same time could try working with data changed from the other. This would make it thread safe.
So, to answer the question, assuming that the functions don't modify anything outside of the specific called function, then yes, the class could be deemed to be thread-safe.
Consider the following quote from an article about thread safety ("Java theory and practice: Characterizing thread safety"):
In reality, any definition of thread safety is going to have a certain degree of circularity, as it must appeal to the class's specification -- which is an informal, prose description of what the class does, its side effects, which states are valid or invalid, invariants, preconditions, postconditions, and so on. (Constraints on an object's state imposed by the specification apply only to the externally visible state -- that which can be observed by calling its public methods and accessing its public fields -- rather than its internal state, which is what is actually represented in its private fields.)
Thread safety
For a class to be thread-safe, it first must behave correctly in a single-threaded environment. If a class is correctly implemented, which is another way of saying that it conforms to its specification, no sequence of operations (reads or writes of public fields and calls to public methods) on objects of that class should be able to put the object into an invalid state, observe the object to be in an invalid state, or violate any of the class's invariants, preconditions, or postconditions.
Furthermore, for a class to be thread-safe, it must continue to behave correctly, in the sense described above, when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, without any additional synchronization on the part of the calling code. The effect is that operations on a thread-safe object will appear to all threads to occur in a fixed, globally consistent order.
So your class itself is thread-safe, as long as it doesn't have any side effects. As soon as the methods mutate any external objects (e.g. some singletons, as already mentioned by others) it's not any longer thread-safe.
Depends on what happens inside those methods. If they manipulate / call any method parameters or global variables / singletons which are not themselves thread safe, the class is not thread safe either.
(Yes, I see that the methods as shown here have no parameters, but no brackets either, so this is obviously not full working code; it wouldn't even compile as is.)
Yes, as long as there are no instance variables. Method calls using only input parameters and local variables are inherently thread-safe. You might consider making the methods static too, to reflect this.
If it has no mutable state - it's thread safe. If you have no state - you're thread safe by association.
No, I don't think so.
For example, one of the methods could obtain a (non-thread-safe) singleton object from another class and mutate that object.
Yes - this class is thread safe but this does not mean that your application is.
An application is thread safe if the threads in it cannot concurrently access heap state. All objects in Java (and therefore all of their fields) are created on the heap. So, if there are no fields in an object then it is thread safe.
In any practical application, objects will have state. If you can guarantee that these objects are not accessed concurrently then you have a thread safe application.
There are ways of optimizing access to shared state, e.g. atomic variables or careful use of the volatile keyword, but I think this is going beyond what you've asked.
I hope this helps.