I have a class something like this:
public class Outer {

    public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
        public void doSomething(Task task) {
            // uses and modifies task and some other logic
        }
    };

    public void taskRelatedMethod() {
        // some logic
        TASK_UPDATER.doSomething(new Task());
        // some other logic
    }
}
I've noticed some strange behaviour when running this in a multi-threaded environment that I can't reproduce locally, and I suspect it's a threading issue. Is it possible for two instances of Outer to somehow interfere with each other by both calling doSomething on TASK_UPDATER? Each will be passing a different instance of Task into the doSomething method.
Is it possible for two instances of Outer to somehow interfere with each other by both calling doSomething on TASK_UPDATER?
The answer is "it depends". Any time you have multiple threads sharing the same object instances, you may have concurrency issues. In your case, you have multiple instances of Outer sharing the same static TaskUpdater instance. This in itself is not a problem however if TaskUpdater has any fields, they will be shared by the threads. If the threads make any changes to those fields in any way then data synchronization needs to happen and possible blocking around critical code sections. If the TaskUpdater is only reading and operating on the Task argument, which seems to be per Outer instance, then there is no problem.
For example, you could have a task updater like:
public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
    public void doSomething(Task task) {
        int total = 0;
        for (Job job : task.getJobs()) {
            total += job.getSize();
        }
        task.setTotalSize(total);
    }
};
In this case, the updater only changes the Task instance passed in. It can use local variables without a problem because those live on the stack and are not shared between threads. This is thread safe.
However, consider this updater:
public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
    private long total = 0;

    public void doSomething(Task task) {
        for (Job job : task.getJobs()) {
            // race condition and memory synchronization issues here
            total += job.getSize();
        }
    }

    public long getTotal() {
        return total;
    }
};
In this case, both threads will be updating the same total field on the shared TaskUpdater. This is not thread safe: you have race conditions around the += (since it is really 3 operations: read, add, write) as well as memory synchronization issues. One thread may have a cached version of total which is 5 and which it increments to 6, while another thread has already incremented its own cached version of total to 10.
When threads share common fields you need to protect those operations and worry about synchronization both in terms of mutual exclusion and memory publishing. In this case, making total an AtomicLong would be in order.
private AtomicLong total = new AtomicLong(0);
...
total.addAndGet(job.getSize());
AtomicLong wraps a volatile long, so the value is published appropriately to all threads, and it performs atomic compare-and-set operations internally, which removes the race conditions.
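Putting that together, a minimal sketch of a thread-safe version of the updater (assuming the same TaskUpdater, Task and Job types as above, with java.util.concurrent.atomic.AtomicLong imported) might look like this:

public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
    // AtomicLong publishes its value to all threads and makes the
    // read-add-write a single atomic operation, so concurrent callers
    // cannot lose updates
    private final AtomicLong total = new AtomicLong(0);

    public void doSomething(Task task) {
        for (Job job : task.getJobs()) {
            total.addAndGet(job.getSize());
        }
    }

    public long getTotal() {
        return total.get();
    }
};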
Related
I understand that reading and writing data from multiple threads needs a good locking mechanism to avoid data races. However, consider this situation: if multiple threads try to write the same single value to a single variable, can this be a problem?
For example, here is my sample code:
public class Main {
    public static void main(String[] args) {
        final int[] a = {1};
        while (true) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    a[0] = 1;
                    assert a[0] == 1;
                }
            }).start();
        }
    }
}
I have run this program for a long time, and it looks like everything is fine. If this code can cause a problem, how can I reproduce it?
Your test case does not cover the actual problem. You test the variable's value in the same thread that wrote it - and when the value changes within that thread, the change is visible to that thread, just like in any single-threaded application. The real issue with write operations is how and when the updated value is seen by the other threads.
For example, if you were to write a counter where each thread increments the value, you would run into issues. Another problem is that your test operation takes far less time than creating a thread, so the execution is pretty much linear. If you had longer code in the threads, it would be possible for multiple threads to access the variable at the same time. I wrote this test using Thread.sleep(), which is known to be unreliable (which is exactly what we need here):
int[] a = new int[]{0};
for (int i = 0; i < 100; i++) {
    final int k = i;
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            a[0]++;
            System.out.println(a[0]);
        }
    }).start();
}
If you execute this code, you will see how unreliable the output is. The order of the numbers changes (they are not in ascending order), and there are duplicates and missing numbers as well. This is because the variable is copied into CPU-local memory separately for each thread and written back to the shared RAM only after the operation completes (and not necessarily right away, to save time in case the value is needed again).
There also might be some other mechanics in the JVM that copy the values within the RAM for threads, but I'm unaware of them.
The thing is, even locking does not automatically prevent these issues: a lock keeps threads from accessing the variable at the same time, but the updated value is only guaranteed to be visible to the next thread if every access, reads included, goes through the same lock.
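For comparison, here is a sketch of the same experiment with the shared counter replaced by an AtomicInteger (the class name is just for illustration). The print order still varies, but no increment is lost:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    public static void main(String[] args) {
        final AtomicInteger a = new AtomicInteger(0);
        for (int i = 0; i < 100; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        Thread.sleep(20);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    // incrementAndGet is an atomic read-modify-write,
                    // so updates are never lost even when threads overlap
                    System.out.println(a.incrementAndGet());
                }
            }).start();
        }
    }
}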
I would like to understand what happens from a cache point of view (MESI protocol) if a programmer forgets to add the volatile keyword to a variable used for synchronization.
The following snippet of code simply stops after some iterations. I know it's because one of the two threads (let's assume THREAD 0) does not see the update made to the current variable by THREAD 1 via getNext(), and therefore it keeps looping forever.
However, I don't understand why that is the case. THREAD 0 is looping on current and should see at some point that the cache line was updated by THREAD 1 (the cache line switched to the Modified state) and issue a "Read" message on the memory bus to fetch it into its local cache. Adding the volatile modifier to the current variable makes everything work fine.
What is happening that prevents THREAD 0 from continuing its execution?
Reference: Memory Barriers: a Hardware View for Software Hackers
public class Volatile {

    public static final int THREAD_0 = 0;
    public static final int THREAD_1 = 1;

    public static int current;
    public static int counter = 0;

    public static void main(String[] args) {
        current = 0;

        /** Thread 0 **/
        Thread thread0 = new Thread(() -> {
            while (true) { /** THREAD_0 */
                while (current != THREAD_0);
                counter++;
                System.out.println("Thread0:" + counter);
                current = getNext(THREAD_0);
            }
        });

        /** Thread 1 **/
        Thread thread1 = new Thread(() -> {
            while (true) { /** THREAD_1 */
                while (current != THREAD_1);
                counter++;
                System.out.println("Thread1:" + counter);
                current = getNext(THREAD_1);
            }
        });

        thread0.start();
        thread1.start();
    }

    public static int getNext(int threadId) {
        return threadId == THREAD_0 ? THREAD_1 : THREAD_0;
    }
}
Different processor architectures can have different degrees of cache coherence. Some may be pretty minimal, because the goal is to maximize throughput and synching up caches with memory unnecessarily will drag that down. Java imposes its own memory barriers when it detects that coordination is required, but that depends on the programmer following the rules.
If the programmer doesn't add indications (such as the synchronized or volatile keywords) that a variable can be shared across threads, the JIT is allowed to assume that no other thread is modifying that variable, and it can optimize accordingly. The bytecode that tests that variable can be reordered drastically. If the JIT reorders the code enough, it may not matter when the hardware detects a modification and fetches the new value.
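As an illustration only (no particular JIT is guaranteed to emit exactly this), the busy-wait loop from the question can legally be transformed as if the read of the non-volatile field were hoisted out of the loop:

// original source:
while (current != THREAD_0);

// one transformation the runtime may effectively perform when 'current'
// is not volatile: read the field once and spin on that value forever
if (current != THREAD_0) {
    while (true);
}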
Quoting from Java Concurrency in Practice 3.1:
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions "must" happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.
(There are several places in the JCIP book making the case that reasoning about insufficiently-synchronized code is futile.)
I was thinking about how to solve a race condition between two threads that try to write to the same variable, using immutable objects and without the help of any keywords such as synchronized (locks) or volatile in Java.
But I couldn't figure it out. Is it possible to solve this problem with such a solution at all?
public class Test {

    private static IAmSoImmutable iAmSoImmutable;

    private static final Runnable increment1000Times = () -> {
        for (int i = 0; i < 1000; i++) {
            iAmSoImmutable.increment();
        }
    };

    public static void main(String... args) throws Exception {
        for (int i = 0; i < 10; i++) {
            iAmSoImmutable = new IAmSoImmutable(0);

            Thread t1 = new Thread(increment1000Times);
            Thread t2 = new Thread(increment1000Times);
            t1.start();
            t2.start();
            t1.join();
            t2.join();

            // Prints a different result every time -- why? :
            System.out.println(iAmSoImmutable.value);
        }
    }

    public static class IAmSoImmutable {
        private int value;

        public IAmSoImmutable(int value) {
            this.value = value;
        }

        public IAmSoImmutable increment() {
            return new IAmSoImmutable(++value);
        }
    }
}
If you run this code you'll get a different answer every time, which means a race condition is happening.
You cannot solve a race condition without using one of the existing synchronization (or volatile) techniques; that is what they were designed for. If it were possible to avoid them, there would be no need for them.
More to the point, your code seems to be broken. This method:
public IAmSoImmutable increment() {
    return new IAmSoImmutable(++value);
}
is nonsense for two reasons:
1) It breaks the immutability of the class, because it modifies the object's value field.
2) Its result - a new instance of IAmSoImmutable - is never used.
The fundamental problem here is that you've misunderstood what "immutability" means.
"Immutability" means — no writes. Values are created, but are never modified.
Immutability ensures that there are no race conditions, because race conditions are always caused by writes: either two threads performing writes that aren't consistent with each other, or one thread performing writes and another thread performing reads that give inconsistent results, or similar.
(Caveat: even an immutable object is effectively mutable during construction — Java creates the object, then populates its fields — so in addition to being immutable in general, you need to use the final keyword appropriately and take care with what you do in the constructor. But, those are minor details.)
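As a small sketch of that caveat (a hypothetical class, not from the question), a properly immutable type makes every field final, assigns it exactly once in the constructor, and returns new instances instead of mutating:

public final class ImmutablePoint {
    // final fields are safely published once the constructor finishes,
    // provided 'this' does not escape during construction
    private final int x;
    private final int y;

    public ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // "modification" produces a new instance rather than changing this one
    public ImmutablePoint translate(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }
}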
With that understanding, we can go back to your initial sentence:
I was thinking about how to solve a race condition between two threads that try to write to the same variable, using immutable objects and without the help of any keywords such as synchronized (locks) or volatile in Java.
The problem here is that you actually aren't using immutable objects: your entire goal is to perform writes, and the entire concept of immutability is that no writes happen. These are not compatible.
That said, immutability certainly has its place. You can have immutable IAmSoImmutable objects, with the only writes being that you swap these objects out for each other. That helps simplify the problem, by reducing the scope of writes that you have to worry about: there's only one kind of write. But even that one kind of write will require synchronization.
The best approach here is probably to use an AtomicReference<IAmSoImmutable>. This provides a non-blocking way to swap out your IAmSoImmutable-s, while guaranteeing that no write gets silently dropped.
(In fact, in the special case that your value is just an integer, the JDK provides AtomicInteger that handles the necessary compare-and-swap loops and so on for threadsafe incrementation.)
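A minimal sketch of that approach, reworking the classes from the question (the compare-and-swap loop retries until our new instance replaces the one we read, so no update is silently dropped):

import java.util.concurrent.atomic.AtomicReference;

public class Test {

    private static final AtomicReference<IAmSoImmutable> ref =
            new AtomicReference<>(new IAmSoImmutable(0));

    private static void increment() {
        IAmSoImmutable current;
        IAmSoImmutable next;
        do {
            current = ref.get();
            next = current.increment(); // builds a new instance, leaves 'current' untouched
        } while (!ref.compareAndSet(current, next));
    }

    public static class IAmSoImmutable {
        private final int value;

        public IAmSoImmutable(int value) {
            this.value = value;
        }

        public IAmSoImmutable increment() {
            return new IAmSoImmutable(value + 1);
        }
    }
}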
Even if the problems are resolved by:
Avoiding the change of IAmSoImmutable.value
Reassigning the new object created within increment() back into the iAmSoImmutable reference.
there are still pieces of your code that are not atomic and that need some sort of synchronization.
A solution would, of course, be to use a synchronized method:
public synchronized static void increment() {
    iAmSoImmutable = iAmSoImmutable.increment();
}

Thread t1 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});

Thread t2 = new Thread(() -> {
    for (int i = 0; i < 1000; i++) {
        increment();
    }
});
I have read an article concerning atomic operations in Java but still have some doubts that need to be clarified:
volatile int num;

public void doSomething() {
    num = 10; // write operation
    System.out.println(num); // read
    num = 20; // write
    System.out.println(num); // read
}
So I have done four operations (write-read-write-read) in one method. Are they atomic operations? What will happen if multiple threads invoke the doSomething() method simultaneously?
An operation is atomic if no thread will see an intermediary state, i.e. the operation will either have completed fully, or not at all.
Reading an int field is an atomic operation, i.e. all 32 bits are read at once. Writing an int field is also atomic: the field will either have been written fully, or not at all.
However, the method doSomething() is not atomic; a thread may yield the CPU to another thread while the method is executing, and that other thread may see that some, but not all, of the operations have been executed.
That is, if threads T1 and T2 both execute doSomething(), the following may happen:
T1: num = 10;
T2: num = 10;
T1: System.out.println(num); // prints 10
T1: num = 20;
T1: System.out.println(num); // prints 20
T2: System.out.println(num); // prints 20
T2: num = 20;
T2: System.out.println(num); // prints 20
If doSomething() were synchronized, its atomicity would be guaranteed, and the above scenario impossible.
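For example, a sketch of the synchronized variant (assuming num lives in some enclosing class whose single instance both threads share):

private volatile int num;

public synchronized void doSomething() {
    num = 10;                 // no other doSomething() call can interleave here
    System.out.println(num);  // always prints 10 for this caller
    num = 20;
    System.out.println(num);  // always prints 20 for this caller
}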
volatile ensures that if you have a thread A and a thread B, any change to that variable will be seen by both: if at some point thread A changes the value, thread B will see the new value when it next reads it. (See: What the volatile keyword promises.)
Atomic operations ensure that the execution of the said operation happens "in one step." This is somewhat confusing because looking at the code, x = 10; may appear to be "one step", but it actually requires several steps on the CPU. An atomic operation can be formed in a variety of ways, one of which is by locking using synchronized: the lock of an object (or of the Class in the case of static methods) is acquired, and no two threads can execute the locked section at the same time.
As you asked in a comment earlier, even if you had three separate atomic steps that thread A was executing at some point, there's a chance that thread B could begin executing in the middle of those three steps. To ensure the thread safety of the object, all three steps would have to be grouped together to act like a single step. This is part of the reason locks are used.
A very important thing to note is that if you want to ensure that your object can never be accessed by two threads at the same time, all of your methods must be synchronized. You could create a non-synchronized method on the object that would access the values stored in the object, but that would compromise the thread safety of the class.
You may be interested in the java.util.concurrent.atomic library. I'm also no expert on these matters, so I would suggest a book that was recommended to me: Java Concurrency in Practice
Each individual read and write to a volatile variable is atomic. This means that a thread won't see the value of num changing while it's reading it, but it can still change between each statement. So a thread running doSomething while other threads are doing the same will print a 10 or 20 followed by another 10 or 20. After all threads have finished calling doSomething, the value of num will be 20.
My answer, modified according to Brian Roach's comment:
It's atomic in this case because the variable is an int.
volatile only guarantees visibility among threads, not atomicity: it makes a change to the variable visible, but it cannot guarantee the integrity of compound changes.
For example, non-volatile long and double fields can expose an unexpected intermediate state, because their reads and writes are not guaranteed to be atomic.
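For instance (a sketch, not from the original answer), a plain long field may be written in two 32-bit halves on some JVMs, whereas a volatile long write is guaranteed to be atomic:

class Counter {
    // without volatile, a reader may observe a "torn" value made of half
    // an old write and half a new one (JLS 17.7, non-atomic treatment of
    // double and long)
    private long unsafeValue;

    // volatile long reads and writes are always atomic
    private volatile long safeValue;
}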
Atomic Operations and Synchronization:
Atomic operations execute as a single unit of work, without being affected by other executions. Atomic operations are required in a multi-threaded environment to avoid data inconsistency.
Reading or writing a single int value is an atomic operation. But generally, if such access happens inside a method that is not synchronized, many threads can interleave, which can lead to inconsistent values. In particular, int++ is not an atomic operation, so by the time one thread has read the value and incremented it by one, another thread may have read the older value, leading to a wrong result.
To solve the data inconsistency, we have to make sure that the increment operation on count is atomic. We can do that using synchronization, but since Java 5, java.util.concurrent.atomic provides wrapper classes for int and long that can be used to achieve this atomically without synchronization.
Using a plain int might create data inconsistencies as shown below:
public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreardProcesing pt = new ThreardProcesing();

        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();

        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}

class ThreardProcesing implements Runnable {
    private int count;

    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count++;
        }
    }

    public int getCount() {
        return this.count;
    }

    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
OUTPUT: count value varies between 5,6,7,8
We can resolve this using java.util.concurrent.atomic, which will always output a count value of 8, because the AtomicInteger method incrementAndGet() atomically increments the current value by one, as shown below:
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreardProcesing pt = new ThreardProcesing();

        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();

        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}

class ThreardProcesing implements Runnable {
    private AtomicInteger count = new AtomicInteger();

    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count.incrementAndGet();
        }
    }

    public int getCount() {
        return this.count.get();
    }

    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
Source: Atomic Operations in java
Consider the code snippet below:
package sync;

public class LockQuestion {

    private String mutable;

    public synchronized void setMutable(String mutable) {
        this.mutable = mutable;
    }

    public String getMutable() {
        return mutable;
    }
}
At time Time1, thread Thread1 will update the mutable variable. Synchronization is needed in the setter in order to flush memory from the local cache to main memory.
At time Time2 (Time2 > Time1, no thread contention), thread Thread2 will read the value of mutable.
The question is: do I need to put synchronized on the getter? It looks like this won't cause any issues - memory should be up to date and Thread2's local cache should be invalidated and updated by Thread1 - but I'm not sure.
Rather than wonder, why not just use the atomic references in java.util.concurrent?
(and for what it's worth, my reading of happens-before does not guarantee that Thread2 will see changes to mutable unless it also uses synchronized ... but I always get a headache from that part of the JLS, so use the atomic references)
It will be fine if you make mutable volatile; see the details in the "cheap read-write lock" pattern, sketched below.
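A sketch of that pattern applied to the class from the question: the volatile field makes the plain read in the getter safe, while the setter stays synchronized in case it ever needs to do more than a single assignment:

public class LockQuestion {

    private volatile String mutable;

    // writer: synchronized only matters if the update becomes a compound
    // action (check-then-act, read-modify-write, etc.)
    public synchronized void setMutable(String mutable) {
        this.mutable = mutable;
    }

    // reader: the volatile read guarantees the latest written value is seen
    public String getMutable() {
        return mutable;
    }
}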
Are you absolutely sure that the getter will be called only after the setter is called? If so, you don't need the getter to be synchronized, since concurrent reads do not need to synchronized.
If there is a chance that get and set can be called concurrently then you definitely need to synchronize the two.
If you worry so much about performance in the reading thread, then read the value once using proper synchronization, volatile, or an atomic reference, and assign it to a plain old local variable, as sketched below.
The assignment to the plain variable is guaranteed to happen after the safely-published read (how else could it get the value?), and if the value will never be written by another thread again, you are all set.
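A sketch of that idea (hypothetical Config class): perform one safely-published read, then keep working with the local copy:

public class Config {

    private volatile String value; // written once by some producer thread

    public void useValue() {
        // one volatile read...
        String local = value;
        // ...after which the local variable can be used freely, provided
        // no other thread will write 'value' again
        if (local != null) {
            System.out.println(local.length() + ": " + local);
        }
    }
}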
I think you should start with something which is correct and optimise later when you know you have an issue. I would just use AtomicReference unless a few nano-seconds is too long. ;)
public static void main(String... args) {
    AtomicReference<String> ars = new AtomicReference<String>();
    ars.set("hello");
    long start = System.nanoTime();
    int runs = 1000 * 1000 * 1000;
    int length = test(ars, runs);
    long time = System.nanoTime() - start;
    System.out.printf("get() costs %d ps.%n", 1000 * time / runs);
}

private static int test(AtomicReference<String> ars, int runs) {
    int len = 0;
    for (int i = 0; i < runs; i++)
        len = ars.get().length();
    return len;
}
Prints
get() costs 1219 ps.
A ps is a picosecond, which is one millionth of a microsecond.
This probably will never result in incorrect behavior, but unless you also guarantee the order the threads start up in, you cannot necessarily guarantee that the compiler didn't reorder the read in Thread2 before the write in Thread1. More specifically, the Java runtime only has to guarantee that each thread executes as if its own operations ran in program order. So, as long as a thread produces the same results as it would running serially, the entire language stack (compiler, hardware, language runtime) can do pretty much whatever it wants, including allowing Thread2 to cache the result of LockQuestion.getMutable().
In practice, I would be very surprised if that ever happened. If you want to guarantee that this doesn't happen, have LockQuestion.mutable declared final and initialized in the constructor, or use the following idiom:
private static class LazySomethingHolder {
    // the JVM's class-initialization guarantees safely publish this
    // final field to whichever thread triggers initialization first
    public static final Something something = new Something();
}

public static Something getInstance() {
    return LazySomethingHolder.something;
}