I was wondering what the implications are for using static methods in a Java EE application.
For example: There is a class that handles converting of dates, reordering of strings etc.
All methods in this class are static.
These methods are used by Servlets.
Does this mean that the static methods need to be thread safe (in that if many users are using the application at the same time and are accessing the static method at the same time that there could be some issues)?
Edit: I would like to know about this in the context of a web application: are two users going to hit the static methods at the same time and mess with each other's results (of the static method)?
Accessing the methods in parallel is fine, as long as there are no shared class variables. For example, if the method declares its own local variables, you're good:
public static void thing() {
String x = "";
// do stuff with x
}
The above is fine.
static String x = "";
public static void thing() {
    // do stuff with x
}
This one isn't: x is a static field shared by every thread that calls thing().
Does this mean that the static methods need to be thread safe (in that
if many users are using the application at the same time and are
accessing the static method at the same time that there could be some
issues)?
Only if there's shared state. If you're allocating new objects on the heap for each invocation then it's not an issue.
But I never like doing this sort of thing since introducing shared state immediately means you have thread-safety issues. I prefer to create an instance of a converter/helper class (object creation normally being negligible performance-wise). That immediately means you're thread-safe (provided you're not sharing state) and each instance can (for example) be customisable upon construction, to give different behaviours where required.
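A minimal sketch of the instance-per-use approach described above. The class name `DateConverter` and its pattern parameter are hypothetical; the only point is that each caller owns its own `SimpleDateFormat`, so nothing is shared as long as the converter instance itself is not handed to another thread.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: every instance owns its own (non-thread-safe) SimpleDateFormat.
class DateConverter {
    private final SimpleDateFormat formatter;

    // The pattern is chosen at construction time, so different callers
    // can get different behaviour without touching shared state.
    DateConverter(String pattern) {
        this.formatter = new SimpleDateFormat(pattern);
    }

    String format(Date date) {
        return formatter.format(date);
    }

    Date parse(String text) throws ParseException {
        return formatter.parse(text);
    }
}
```

A servlet would create a `new DateConverter("yyyy-MM-dd")` inside the request-handling method rather than keeping one in a static field.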
Of course there will be issues if you don't protect your static methods and they change shared state.
Consider this sample
public class GlobalCount {
    private static int count = 0; // must be static to be reachable from a static method
    public static void increment() {
        count++; // that is: count = count + 1 (some thread may use an old value of count when assigning)
    }
}
If more than one thread calls increment, you may lose some increments (that is, you may end up with count smaller than the number of times the increment method was called).
So you have to set your method as synchronized :
public static synchronized void increment() {
count++;
}
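As a rough check of the synchronized version, here is a small sketch that calls the counter from several threads at once. The class name `SyncCount` and the `runThreads` helper are made up for illustration; with `synchronized` on both methods, the final count should equal the total number of calls.

```java
// Hypothetical demo class, not part of the original example.
class SyncCount {
    private static int count = 0;

    static synchronized void increment() {
        count++;
    }

    static synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each call increment() `perThread` times,
    // waits for all of them, then returns the final count.
    static int runThreads(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return get();
    }
}
```

Removing `synchronized` from `increment()` makes lost updates possible, although they will not necessarily show up on every run.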
Even if you think you have no shared state, be careful: many standard classes (for example SimpleDateFormat, since you mention date formatting) aren't thread-safe and may fail if one instance is used from more than one thread at the same time.
So, as soon as you have a static instance accessed from more than one thread, be very careful.
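One way to keep a static date helper without sharing a `SimpleDateFormat` is `java.time.format.DateTimeFormatter` (Java 8 and later), which is documented as immutable and thread-safe. The sketch below assumes the dates can be represented as `LocalDate`; the class name `DateUtil` and the pattern are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical utility class: the only static state is an immutable formatter.
final class DateUtil {
    // DateTimeFormatter is immutable, so sharing one instance between threads is safe.
    private static final DateTimeFormatter ISO_DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    private DateUtil() {
    }

    static String format(LocalDate date) {
        return ISO_DAY.format(date);
    }

    static LocalDate parse(String text) {
        return LocalDate.parse(text, ISO_DAY);
    }
}
```

If the code has to stay on `SimpleDateFormat`, creating a new instance per call (or per thread) avoids the sharing problem at a small allocation cost.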
Related
Here is an example method for explaining thread safety:
class Counter {
private int counter = 0;
public void increment() {
counter++;
}
public int getValue() {
return counter;
}
}
There are several ways to provide thread safety, and I would prefer the AtomicInteger approach. However:
1. I am also wondering whether I can provide thread safety by using final for the necessary variable(s). If so, how?
2. Is thread safety one of the reasons final is so commonly used for variables and method arguments in Java?
In properly synchronized code, the final isn't needed.
E.g. if you would use:
import java.util.concurrent.atomic.AtomicInteger;
class MyCounter {
    private AtomicInteger c = new AtomicInteger();
    public int inc() { return c.incrementAndGet(); }
    public int get() { return c.get(); }
}
And you would share the MyCounter-instance with another thread, you need to make sure that there is a happens-before edge between writing c and reading c. This can be done in various ways e.g. you pass the MyCounter-instance to the constructor of some thread (thread start rule). Or you pass it through a volatile field (volatile variable rule) or a synchronized block (monitor lock rule).
This is typically called 'safe publication' and for a correctly synchronized system, this is all you need. If you don't pass the reference safely, you have a data race and weird problems can happen like seeing a partially constructed object. Therefore there is a second mechanism called initialization safety; so no matter if the reference to an object isn't published safely, initialization safety using final will act as a backup solution. The primary use-case for this AFAIK is security.
So for correctly synchronized code, there is no need for final.
That doesn't mean that you should not add finals. It has all kinds of benefits like no accidental changes and it is pretty informative. So I prefer to make as many fields final as possible.
Final has no meaning for method arguments from a memory model perspective, since they are private to a thread. Only shared memory needs to be dealt with in a memory model. Making arguments of a method final is a flavor issue. Some people want it, others don't. I'm not crazy about long method signatures and tend not to add them unless I'm writing some difficult code. But I would be fine if local variables and formal arguments would be final by default (like Rust).
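To make the safe-publication idea concrete, here is a minimal sketch of the thread start rule: the counter is fully constructed before `start()` is called, so the worker is guaranteed to see it, and `join()` gives the caller a happens-before edge back from the worker. The class names `PublishedCounter` and `PublicationDemo` are illustrative only.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative counter; the field is final as a matter of style, not necessity here.
class PublishedCounter {
    private final AtomicInteger c = new AtomicInteger();

    int inc() {
        return c.incrementAndGet();
    }

    int get() {
        return c.get();
    }
}

class PublicationDemo {
    static int run() {
        PublishedCounter counter = new PublishedCounter(); // constructed before start()
        Thread worker = new Thread(() -> {
            // Thread start rule: everything done before start() is visible here.
            for (int i = 0; i < 1000; i++) {
                counter.inc();
            }
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Passing the reference through a plain non-volatile, non-final static field instead would be the unsafe publication the answer warns about.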
I'm just going to add to Erwan Daniel's answer.
If you want a counter shared between all your threads, here is another version of your code.
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounter {
    private final AtomicInteger sharedCounter;

    // The constructor name must match the class name.
    public SharedCounter() {
        this.sharedCounter = new AtomicInteger(0);
    }

    public void increment() {
        sharedCounter.getAndIncrement();
    }

    public int value() {
        return sharedCounter.get();
    }
}
The final keyword prevents atomicInteger12 from being reassigned to a different object, while you can still freely change the counter's value.
final SharedCounter atomicInteger12 = new SharedCounter();
No, the final keyword does not by itself make mutable state thread-safe.
The final keyword on a variable prevents reassignment: once it is initialized, you can't make it refer to anything else.
However, it's not like the const keyword in C++, where the whole content of the variable cannot change. In Java only the reference is immutable.
final AtomicReference<String> toto = new AtomicReference<>("text");
toto.set("new text"); // totally fine
toto = new AtomicReference<>("text"); // does not compile, as toto is immutable reference.
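But there is another keyword that is closer to what you are looking for: volatile. https://www.baeldung.com/java-volatile
In short, a write to a volatile field is visible to every thread that subsequently reads it. On its own, though, volatile does not make a compound operation such as counter++ atomic.
That's what the Atomic* Java classes build on: a volatile field combined with atomic compare-and-set updates.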
Ex. https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/share/classes/java/util/concurrent/atomic/AtomicInteger.java
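The retry loop behind an atomic increment can be sketched with the public `AtomicInteger.compareAndSet` method. This is an illustration of the compare-and-set idea under my own naming (`CasCounter`), not a copy of how the JDK implements `incrementAndGet` internally.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative counter built on a compare-and-set retry loop.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    int increment() {
        while (true) {
            int current = value.get();   // volatile read of the latest value
            int next = current + 1;
            // Succeeds only if no other thread changed the value in between;
            // otherwise loop and try again with the fresh value.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value.get();
    }
}
```

In practice you would simply call `incrementAndGet()`; the loop is spelled out here only to show why a plain volatile `counter++` is not enough.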
If I have one instance of an object A and it has an instance method foo() with only variables created and used in that method is that method thread safe even if the same instance is accessed by many threads?
If yes, does this still apply if instance method bar() on object A creates many threads and calls method foo() in the text described above?
Does it mean that every thread gets a "copy" of the method even if it belongs to the same instance?
I have deliberately not used the synchronized keyword.
Thanks
Yes. All local variables (variables defined within a method) live in the calling thread's own stack frame. So they are thread-safe, provided a reference does not escape the scope of the method.
Note: If a local reference escapes the method (for example as an argument to another method), or the method works on class-level or instance-level fields, then it is not automatically thread-safe.
Does it mean that every thread gets a "copy" of the method even if it belongs to the same instance
No, there is only one method, and every thread shares the same code. But each thread has its own stack frame, and local variables live on that thread's stack frame. Even if you use synchronized on a local object, the JVM can use escape analysis to prove that the object never leaves the thread, and it may then remove the synchronization entirely.
example :
public static void main(String[] args) {
Object lock = new Object();
synchronized (lock) {
System.out.println("hello");
}
}
may effectively be optimized by the JIT compiler to:
public static void main(String[] args) {
Object lock = new Object(); // the JVM might as well remove this line as an unused object, or allocate it on the stack
System.out.println("hello");
}
You have to separate the code being run, and the data being worked on.
The method is code, executed by each of the threads. If that code contains a statement such as int i=5 which defines a new variable i, and sets its value to 5, then each thread will create that variable.
The problem with multi-threading is not with common code, but with common data (and other common resources). If the common code accesses some variable j that was created elsewhere, then all threads will access the same variable j, i.e. the same data. If one of these threads modifies the shared data while the others are reading, all kinds of errors might occur.
Now, regarding your question, your code should be thread safe as long as your variables are defined within bar(), and bar() doesn't access some common resource such as a file.
You should post some example code to make sure we understand the use case.
For this example:
public class Test {
private String varA;
public void doSomething() {
String varB;
}
}
If you don't do anything to modify varA in this example and only modify varB, this example is Thread Safe.
If, however, you create or modify varA and depend on its state, then the method is NOT thread-safe.
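A small sketch of the situation asked about: one shared instance, one method that only uses local variables, and many threads calling it at once. The names `Summer` and `LocalsDemo` are hypothetical. Each thread gets its own copy of `total` and `i` on its own stack, so every call returns the same correct result.

```java
// Hypothetical class: sumTo() touches only its parameter and local variables.
class Summer {
    long sumTo(int n) {
        long total = 0;                 // lives on the calling thread's stack
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}

class LocalsDemo {
    // Calls sumTo(1000) on ONE shared Summer instance from several threads
    // and reports whether every thread saw the correct answer.
    static boolean allCorrect(int threads) {
        Summer shared = new Summer();
        long[] results = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;         // each thread writes only its own slot
            workers[t] = new Thread(() -> {
                results[slot] = shared.sumTo(1000);
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        for (long result : results) {
            if (result != 500500L) {
                return false;
            }
        }
        return true;
    }
}
```

If `total` were moved into an instance field of `Summer`, the same test could start to fail, because the threads would then be updating shared data.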
The well acclaimed book JCIP says this about ThreadLocal usage :
It is easy to abuse ThreadLocal by treating its thread confinement property as a license to use global variables or as a means of creating "hidden" method arguments.
Thread-local variables can detract from reusability and introduce hidden couplings among classes, and should therefore be used with care.
What does it mean by saying that Thread-local variables can reduce reusability and introduce hidden couplings among classes?
They reduce reusability in much the same way that global variables do: when your method's computations depend on state that is external to the method and not passed as parameters (class fields, for example), your method is less reusable, because it's tightly coupled to the state of the object/class in which it resides (or worse, to a different class entirely).
Edit: Ok, here's an example to make it more clear. I've used ThreadLocal just for the sake of the question, but it applies to global variables in general. Assume I want to calculate the sum of the first N integers in parallel on several threads. We know that the best way to do it is to calculate local sums for each thread and then sum them up at the end. For some reason we decide that the call method of each Task will use a ThreadLocal sum variable which is defined in a different class as a global (static) variable:
class Foo {
public static ThreadLocal<Long> localSum = new ThreadLocal<Long>() {
public Long initialValue() {
return 0L; // autoboxed; avoids the deprecated Long constructor
}
};
}
class Task implements Callable<Long> {
private int start = 0;
private int end = 0;
public Task(int start, int end) {
this.start = start;
this.end = end;
}
public Long call() {
for(int i = start; i < end; i++) {
Foo.localSum.set(Foo.localSum.get() + i);
}
return Foo.localSum.get();
}
}
The code works correctly and gives us the expected value of the global sum, but we notice that the class Task and its call method are now tightly coupled to the Foo class. If I want to reuse the Task class in another project, I must also move the Foo class, otherwise the code will not compile.
Although this is a simple example complicated on purpose, you can see the perils of "hidden" global variables. It also affects readability, since someone else reading the code will have to also search for the class Foo and see what the definition of Foo.localSum is. You should keep your classes as self-contained as possible.
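For contrast, here is a sketch of the same task written without the hidden global. The class name `LocalSumTask` is my own; the partial sum is kept in a local variable, so the class depends on nothing outside itself and can be moved to another project unchanged.

```java
import java.util.concurrent.Callable;

// Self-contained variant: no reference to Foo or any ThreadLocal.
class LocalSumTask implements Callable<Long> {
    private final int start;
    private final int end;

    LocalSumTask(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Long call() {
        long sum = 0;                        // local to this call, hence per thread
        for (int i = start; i < end; i++) {  // sums the half-open range [start, end)
            sum += i;
        }
        return sum;
    }
}
```

The caller submits several such tasks to an executor and adds up the returned values, which keeps the data flow visible in the method signatures.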
A ThreadLocal holds one value per thread, whereas a normal field holds one value per object of the class. Starting from this difference, a whole lot of things can go wrong if a ThreadLocal is misused.
If a thread passes through multiple objects (either of a single class or of multiple classes), the ThreadLocal value seen by this thread is the same across all of those objects. This is the coupling BG is talking about. The moment there is coupling, reuse becomes difficult and error-prone.
Suppose I have a Utility class,
public class Utility {
private Utility() {} //Don't worry, just doing this as guarantee.
public static int stringToInt(String s) {
return Integer.parseInt(s);
}
};
Now suppose that, in a multithreaded application, one thread calls Utility.stringToInt() and, while that call is in progress, another thread calls the same method with a different s.
What happens in this case? Does Java lock a static method?
There is no issue here. Each thread will use its own stack so there is no point of collision among different s. And Integer.parseInt() is thread safe as it only uses local variables.
Java does not lock a static method, unless you add the keyword synchronized.
Note that when you lock a static method, you grab the Mutex of the Class object the method is implemented under, so synchronizing on a static method will prevent other threads from entering any of the other "synchronized" static methods.
Now, in your example, you don't need to synchronize in this particular case. That is because parameters are passed by copy; so, multiple calls to the static method will result in multiple copies of the parameters, each in their own stack frame. Likewise, simultaneous calls to Integer.parseInt(s) will each create their own stack frame, with copies of s's value passed into the separate stack frames.
Now if Integer.parseInt(...) were implemented in a very bad way (if it used static non-final members during a parseInt's execution), then there would be a large cause for concern. Fortunately, the implementers of the Java libraries are better programmers than that.
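The point about a synchronized static method taking the lock of the Class object can be observed with the standard `Thread.holdsLock` method. The class `LockDemo` below is a made-up illustration.

```java
// Illustrative class showing which monitor a static synchronized method holds.
class LockDemo {
    // While this method runs, the current thread owns the monitor of LockDemo.class.
    static synchronized boolean holdsClassLockInside() {
        return Thread.holdsLock(LockDemo.class);
    }

    // A plain static method takes no lock at all.
    static boolean holdsClassLockOutside() {
        return Thread.holdsLock(LockDemo.class);
    }
}
```

Because every static synchronized method of a class uses this one monitor, two unrelated static synchronized methods on the same class will also block each other.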
In the example you gave, there is no shared data between threads AND there is no data which is modified. (You would have to have both for there to be a threading issue)
You can write
public enum Utility {
; // no instances
public synchronized static int stringToInt(String s) {
// does something which needs to be synchronised.
}
}
this is effectively the same as
public enum Utility {
; // no instances
public static int stringToInt(String s) {
synchronized(Utility.class) {
// does something which needs to be synchronised.
}
}
}
However, Java won't mark the method as synchronized for you, and you don't need synchronisation unless you are accessing shared data which can be modified.
It should not, unless specified explicitly. Further, in this case there won't be any thread-safety issue, since s is immutable and also local to the method.
You don't need synchronization here, as the variable s is local.
You need to worry only if multiple threads share resources; e.g. if s were a static field, you would have to think about multi-threading.
In what cases is it necessary to synchronize access to instance members?
I understand that access to static members of a class always needs to be synchronized- because they are shared across all object instances of the class.
My question is when would I be incorrect if I do not synchronize instance members?
for example if my class is
public class MyClass {
    private int instanceVar = 0;
    public void setInstanceVar()
    {
        instanceVar++;
    }
    public int getInstanceVar()
    {
        return instanceVar;
    }
}
in what cases (of usage of the class MyClass) would I need to have methods:
public synchronized void setInstanceVar() and
public synchronized int getInstanceVar()?
Thanks in advance for your answers.
The synchronized modifier is really a bad idea and should be avoided at all costs. I think it is commendable that Sun tried to make locking a little easier to achieve, but synchronized just causes more trouble than it is worth.
The issue is that a synchronized method is actually just syntax sugar for getting the lock on this and holding it for the duration of the method. Thus, public synchronized void setInstanceVar() would be equivalent to something like this:
public void setInstanceVar() {
synchronized(this) {
instanceVar++;
}
}
This is bad for two reasons:
All synchronized methods within the same class use the exact same lock, which reduces throughput
Anyone can get access to the lock, including members of other classes.
There is nothing to prevent me from doing something like this in another class:
MyClass c = new MyClass();
synchronized(c) {
...
}
Within that synchronized block, I am holding the lock which is required by all synchronized methods within MyClass. This further reduces throughput and dramatically increases the chances of a deadlock.
A better approach is to have a dedicated lock object and to use the synchronized(...) block directly:
public class MyClass {
private int instanceVar;
private final Object lock = new Object(); // must be final!
public void setInstanceVar() {
synchronized(lock) {
instanceVar++;
}
}
}
Alternatively, you can use the java.util.concurrent.locks.Lock interface and the java.util.concurrent.locks.ReentrantLock implementation to achieve basically the same result (in fact, it is the same on Java 6).
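A minimal sketch of the `ReentrantLock` variant, assuming a counter like the one in the question (the class name `LockedCounter` is hypothetical). The `try`/`finally` is the important part: unlike a synchronized block, an explicit lock is not released automatically when the body throws.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by a private, explicit lock.
class LockedCounter {
    private final Lock lock = new ReentrantLock(); // private, so outside code cannot grab it
    private int value;

    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();       // reads take the same lock so they see the latest write
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```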
It depends on whether you want your class to be thread-safe. Most classes shouldn't be thread-safe (for simplicity), in which case you don't need synchronization. If you need it to be thread-safe, you should synchronize access or make the variable volatile. (Volatile keeps other threads from reading "stale" data, but it does not make instanceVar++ atomic.)
If you want to make this class thread-safe, I would declare instanceVar as volatile so that readers always see the most recent value, and I would also make setInstanceVar() synchronized, because an increment is not an atomic operation in the JVM.
private volatile int instanceVar = 0;
public synchronized void setInstanceVar() {
    instanceVar++;
}
Roughly, the answer is "it depends". Synchronizing your setter and getter here would only guarantee that multiple threads cannot read the variable in the middle of each other's increment operations:
synchronized void increment()
{
    i++;
}
synchronized int get()
{
    return i;
}
But that wouldn't really even work here, because to ensure that your caller thread got the same value it incremented, you'd have to guarantee that you're atomically incrementing and then retrieving, which you're not doing here. That is, you'd have to do something like
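synchronized int incrementAndGet() {
    increment();   // intrinsic locks are reentrant, so calling the synchronized methods here is fine
    return get();
}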
Basically, synchronization is useful for defining which operations need to be guaranteed to run thread-safe (in other words, you can't create a situation where a separate thread undermines your operation and makes your class behave illogically, or undermines what you expect the state of the data to be). It's actually a bigger topic than can be addressed here.
This book Java Concurrency in Practice is excellent, and certainly much more reliable than me.
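The "increment and then retrieve as one step" requirement is what `AtomicInteger.incrementAndGet()` provides. The sketch below, with the made-up names `TicketCounter` and `distinctTickets`, has several threads draw values and checks that no two callers ever receive the same one.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative counter: each call returns the value that this very call produced.
class TicketCounter {
    private final AtomicInteger next = new AtomicInteger();

    int incrementAndGet() {
        return next.incrementAndGet(); // increment and read happen as one atomic step
    }

    // Draws threads * perThread tickets concurrently and counts the distinct values seen.
    static int distinctTickets(int threads, int perThread) {
        TicketCounter counter = new TicketCounter();
        Set<Integer> seen = ConcurrentHashMap.newKeySet(); // thread-safe set
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(counter.incrementAndGet());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

With separate `increment()` and `get()` calls, two threads could both read the same value after their increments, and the set would end up smaller than the number of draws.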
Simply put, you use synchronized when you have multiple threads accessing the same method of the same instance, and that method changes the state of the object or application.
It is meant as a simple way to prevent race conditions between threads, and really you should only use it when you are planning on having concurrent threads accessing the same instance, such as a global object.
Now when you are reading the state of an instance of an object with concurrent threads, you may want to look into java.util.concurrent.locks.ReentrantReadWriteLock, which allows many threads to read at a time but only one thread to write. So in the getter and setter example that everyone seems to be giving, you could do the following:
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class MyClass {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private int myValue = 0;

    public void setValue() {
        rwl.writeLock().lock();
        try {
            myValue++;
        } finally {
            rwl.writeLock().unlock(); // always release, even if the body throws
        }
    }

    public int getValue() {
        rwl.readLock().lock();
        try {
            return myValue;
        } finally {
            rwl.readLock().unlock();
        }
    }
}
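In Java, a single read or write of an int is atomic, so if all you ever do is one plain write and one plain read at a time, you will not see a corrupted value. Note that instanceVar++ is a read followed by a write, so concurrent increments can still be lost, and without synchronization or volatile another thread is not guaranteed to see the latest value.
If these were non-volatile longs or doubles, you would need to synchronize even for plain reads and writes, because the JVM is allowed to update the two 32-bit halves separately, so another thread could read the value between the two updates.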