I have a Java program which creates threads, each one executing the same code (the same run()).
My main looks like:
{
// Create threads
GameOfLifeThread[][] threads = new GameOfLifeThread[vSplit][hSplit];
for(int i=0; i<vSplit; i++){
for(int j=0; j<hSplit; j++){
threads[i][j] = new GameOfLifeThread(initalField, ...);
}
}
// Run threads
for(int i=0; i<vSplit; i++){
for(int j=0; j<hSplit; j++){
// threads[i][j].run();
(new Thread(threads[i][j])).start();
}
}
return ...;
}
initialField is a global 2D array. Each thread is supposed to make some changes to it.
The problem is that after the threads have executed, the array stays unchanged, even if there is only a single worker thread. However, when I run
threads[i][j].run();
instead of
(new Thread(threads[i][j])).start();
with a single worker thread (i.e. pure serial execution by the main thread) the initalField changes as it should.
What could be the problem? It looks like the array's elements are passed by value, but it cannot be so.
Thank you in advance.
Just a guess out of the blue:
Your initalField must be volatile; otherwise the threads may cache it thread-locally, and their changes may not become visible to the other threads.
This and this answer may explain it a bit better.
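The snippet in the question starts the workers and then returns without waiting for them, so the main thread may read the array before any worker has run. A minimal sketch, assuming the caller should wait for the workers: Thread.join() both waits for a thread and guarantees that the finished thread's writes are visible afterwards. The names JoinDemo and runWorkers are made up for illustration and are not from the question.

```java
import java.util.Arrays;

final class JoinDemo {
    // Each worker increments every cell of its own row of the shared field.
    static int[][] runWorkers(int rows, int cols) {
        int[][] field = new int[rows][cols];
        Thread[] workers = new Thread[rows];
        for (int r = 0; r < rows; r++) {
            final int row = r;
            workers[r] = new Thread(() -> {
                for (int c = 0; c < cols; c++) {
                    field[row][c]++;
                }
            });
            workers[r].start();
        }
        for (Thread worker : workers) {
            try {
                // After join() returns, everything the worker wrote is visible here.
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(runWorkers(2, 3)));
    }
}
```

Without the join loop, the method could return the field while some rows are still untouched.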
I understand that reading and writing data from multiple threads requires a good locking mechanism to avoid data races. However, consider one situation: if multiple threads try to write the same value to a single variable, can this be a problem?
For example, here my sample code:
public class Main {
public static void main(String[] args) {
final int[] a = {1};
while(true) {
new Thread(new Runnable() {
@Override
public void run() {
a[0] = 1;
assert a[0] == 1;
}
}).start();
}
}
}
I have run this program for a long time, and it looks like everything is fine. If this code can cause a problem, how can I reproduce it?
Your test case does not cover the actual problem. You test the variable's value in the same thread, but that thread has already copied the initial state of the variable, and when the value changes within the thread, the change is visible to that thread, just like in any single-threaded application. The real issue with write operations is how and when the updated value is used in the other threads.
For example, if you were to write a counter where each thread increments the value, you would run into issues. Another problem is that your test operation takes far less time than creating a thread, so the execution is pretty much linear. If you had longer code in the threads, it would be possible for multiple threads to access the variable at the same time. I wrote this test using Thread.sleep(), which is known to be unreliable (which is what we need):
int[] a = new int[]{0};
for(int i = 0; i < 100; i++) {
final int k = i;
new Thread(new Runnable() {
@Override
public void run() {
try {
Thread.sleep(20);
} catch(InterruptedException e) {
e.printStackTrace();
}
a[0]++;
System.out.println(a[0]);
}
}).start();
}
If you execute this code, you will see how unreliable the output is. The order of the numbers changes (they are not in ascending order), and there are duplicates and missing numbers as well. This is because the variable is copied into CPU memory multiple times (once for each thread) and is written back to the shared RAM after the operation is complete. (To save time, this does not necessarily happen right after the operation completes, in case the value is needed again later.)
There also might be some other mechanics in the JVM that copy the values within the RAM for threads, but I'm unaware of them.
The thing is, even locking doesn't prevent these issues. It prevents threads from accessing the variable at the same time, but it generally doesn't make sure that the value of the variable is updated before the next thread accesses it.
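For the counter case specifically, here is a minimal sketch using java.util.concurrent.atomic.AtomicInteger, whose incrementAndGet() performs the read-modify-write as a single atomic step. The class and method names (AtomicCounterDemo, countWithThreads) are hypothetical; the sketch only illustrates that no increments are lost, not the order in which threads run.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicCounterDemo {
    // Starts threadCount threads that each increment the shared counter once.
    static int countWithThreads(int threadCount) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(20); // same artificial delay as in the test above
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                counter.incrementAndGet(); // atomic replacement for a[0]++
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait, and make the thread's effects visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(100));
    }
}
```

With 100 threads the final value is 100, whereas the plain a[0]++ version can end up lower.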
My question is extremely basic: once I have written some array values by one or more threads (phase 1), how can I 'publish' my array to make all the changes visible to other threads (phase 2)?
I have code that does all the array writing, then all the array reading, then again all the writing, then again all the reading etc. I'd like to do it in multiple threads, so multiple threads first would do the array writing phase, then multiple threads would do the array reading phase etc.
My concern is how to safely publish the array writes after each writing phase.
Consider the following simplified thread-unsafe code, that does just one writing phase with just one thread and then just one reading phase with multiple threads:
ExecutorService executor = Executors.newFixedThreadPool(5);
double[] arr = new double[5];
for (int i=0; i<5; ++i) {
arr[i] = 1 + Math.random();
}
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr[j])));
}
The code normally prints non-zero values, but I understand that it might occasionally print zeros as well, as the array is not properly published by the writing thread, so some writes might not be visible to other threads.
I'd like to fix this problem and write the above code properly, in a thread-safe manner, i.e. to make sure that all my writes will be visible to the reading threads.
1. Could you advise on the best way to do so?
The concurrent collections and AtomicXxxArray are not an option for me because of performance (and also code clarity), as I have 2D arrays etc.
2. I can think of the following possible solutions, but I am not 100% sure they would work. Could you also advise on the solutions below?
Solution 1: assignment to a final array
Justification: I expect a final field to be always properly initialized with the latest writes, including all its recursive dependencies.
for (int i=0; i<5; ++i) {
arr[i] = 1 + Math.random();
}
final double[] arr2 = arr; //<---- safe publication?
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> System.out.println(String.format("arr[%s]=%s", j, arr2[j])));
}
Solution 2: a latch
Justification: I expect the latch to establish a perfect happens-before relationship between the writing thread(s) and the reading threads.
CountDownLatch latch = new CountDownLatch(1); //1 = the number of writing threads
for (int i=0; i<5; ++i) {
arr[i] = Math.random();
}
latch.countDown(); //<- writing is done
for (int i=0; i<5; ++i) {
final int j=i;
executor.submit(() -> {
try {latch.await();} catch (InterruptedException e) {...} //happens-before(writings, reading) guarantee?
System.out.println(String.format("arr[%s]=%s", j, arr[j]));
});
}
Update: this answer https://stackoverflow.com/a/5173805/1847482 suggests the following solution:
volatile int guard = 0;
...
//after the writing is done:
guard = guard + 1; //write some new value
//just before the reading: read the volatile variable, e.g.
guard = guard + 1; //includes reading
... //do the reading
This solution uses the following rule: "if thread A writes some non-volatile stuff and a volatile variable after that, thread B is guaranteed to see the changes of the non-volatile stuff as well if it reads the volatile variable first".
Your first example is perfectly safe, because the tasks originate from the writer thread. As the docs say:
Actions in a thread prior to the submission of a Runnable to an Executor happen-before its execution begins.
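The guarantee quoted above can be sketched as follows: the array is filled on the calling thread, and each reader task is submitted only afterwards, so every task sees the completed writes. This is an illustrative variant of the question's first example that returns the values instead of printing them; the names SubmitPublishesWrites and writeThenRead are assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SubmitPublishesWrites {
    static List<Double> writeThenRead(int size) {
        ExecutorService executor = Executors.newFixedThreadPool(size);
        try {
            // Writing phase: done entirely on the submitting thread.
            double[] arr = new double[size];
            for (int i = 0; i < size; i++) {
                arr[i] = 1 + Math.random();
            }
            // Reading phase: submit() happens-before each task starts running.
            List<Future<Double>> futures = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                final int j = i;
                Callable<Double> reader = () -> arr[j];
                futures.add(executor.submit(reader));
            }
            List<Double> seen = new ArrayList<>();
            for (Future<Double> future : futures) {
                seen.add(future.get());
            }
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Every returned value lies in the range [1, 2), never the array's default 0.0. Note that this covers writes made by the submitting thread only; writes made by other threads would need their own happens-before edge to the submitter.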
I have a method that runs with Selenium to create user accounts on a website quickly. Currently it processes one after the other, but I'm thinking if I can process 10 at once that would be better.
I have a for loop currently, which is used to tell the code within, which line of my 2D array to read the user information from. I am struggling with the concept of how to make any stream or thread use the correct value and fetch the correct user information.
Currently I have something similar to the below simplified:
I need to load a new page and driver every time this loops, and I need to send the value from the array to the web field. So basically I want each iteration to start without waiting for the previous one to finish, but probably with a limit of 10 or so running at once.
for (int i = 0; i < myArray.length; i++)
{
Webdriver.start();
WebElement.findby.(By.name("field1").sendkeys(myArray[i][2]);
Webdriver.end();
}
As I said, this is not actual code; it is just to get my question across.
Hope that is clear.
I think you're saying, I am iterating through myArray and running my test once for each element in that array, but instead of running one test and waiting for it to finish before running the next, I want to run a whole bunch at a time.
You can do this pretty trivially with the Java 8 ForkJoinPool.
ForkJoinTask<?>[] tasks = new ForkJoinTask<?>[myArray.length];
for (int i = 0; i < myArray.length; i++)
{
int j = i; // need an effectively final copy of i
tasks[i] = ForkJoinPool.commonPool().submit(() -> {
Webdriver.start();
WebElement.findby.(By.name("field1").sendkeys(myArray[j][2]);
Webdriver.end();
});
}
for (int i = 0; i < myArray.length; i++) {
tasks[i].join();
}
The tests will run in parallel using threads from the "common" ForkJoinPool. If you want to adjust the number of threads that are used, create your own ForkJoinPool. (See this question for more information.)
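As an alternative to a custom ForkJoinPool, here is a minimal sketch that caps concurrency with a plain fixed-size ExecutorService instead. The browser work is replaced by a placeholder string, and the names BoundedParallelRows, processAll and maxParallel are hypothetical; only the task-per-row structure and the thread limit are the point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BoundedParallelRows {
    // Runs one task per row, with at most maxParallel tasks executing at once.
    static List<String> processAll(String[][] rows, int maxParallel) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String[] row : rows) {
                // Placeholder for the per-row browser work; it only reads column 2.
                tasks.add(() -> "submitted " + row[2]);
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task is done; results keep task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling processAll(rows, 10) would correspond to the "limit to 10 or so" in the question; each task captures its own row, so there is no shared loop index to get wrong.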
I would explicitly start a separate Thread for every task, as most of the time will likely be spent waiting until the user account is created.
Please see the rough example below:
public void createAccounts() throws InterruptedException {
List<Thread> threadList = new ArrayList<>();
Object[][] myArray = new Object[1][1];
for(int i=0; i<myArray.length; i++) {
final int index = i;
//Add thread for user creation
threadList.add(new Thread(new Runnable() {
@Override
public void run() {
Webdriver.start();
WebElement.findby.(By.name("field1").sendkeys(myArray[index][2]);
Webdriver.end();
}
}));
}
//Start all threads
for (Thread thread : threadList) {
thread.start();
}
//Wait until all threads are finished
for (Thread thread : threadList) {
thread.join();
}
}
I'm trying to implement a fast version of LZ77 and I have a question to ask you about concurrent programming.
For now I have a final byte[] buffer and a final int[] resultHolder, both of the same length. The program does the following:
1. The main thread writes the whole buffer, then notifies the worker threads and waits for them to complete.
2. Each worker thread processes a portion of the buffer, saving the results in the same portion of the result holder. Each worker's portion is exclusive. After that, the main thread is notified and the worker pauses.
3. When all the workers have paused, the main thread reads the data in resultHolder and updates the buffer; then (if needed) the process begins again from point 1.
The important things in the manager (main Thread) are declared as follows:
final byte[] buffer = new byte[SIZE];
final MemoryHelper memoryHelper = new MemoryHelper();
final ArrayBlockingQueue<Object> waitBuffer = new ArrayBlockingQueue<Object>(TOT_WORKERS);
final ArrayBlockingQueue<Object> waitResult = new ArrayBlockingQueue<Object>(TOT_WORKERS);
final int[] resultHolder = new int[SIZE];
MemoryHelper simply wraps a volatile field and provides two methods: one for reading it and one for writing to it.
Worker's run() code:
public void run() {
try {
// Wait main thread
while(manager.waitBuffer.take() != SHUTDOWN){
// Load new buffer values
manager.memoryHelper.readVolatile();
// Do something
for (int i = a; i <= b; i++){
manager.resultHolder[i] = manager.buffer[i] + 10;
}
// Flush new values of resultHolder
manager.memoryHelper.writeVolatile();
// Signal job done
manager.waitResult.add(Object.class);
}
} catch (InterruptedException e) { }
}
Finally, the important part of main Thread:
for(int i=0; i < 100_000; i++){
// Start workers
for (int j = 0; j < TOT_WORKERS; j++)
waitBuffer.add(Object.class);
// Wait workers
for (int j = 0; j < TOT_WORKERS; j++)
waitResult.take();
// Load results
memoryHelper.readVolatile();
// Do something
processResult();
setBuffer();
// Store buffer
memoryHelper.writeVolatile();
}
Synchronization on ArrayBlockingQueue works well. My doubt is in using readVolatile() and writeVolatile(). I've been told that writing to a volatile field flushes to memory all the previously changed data, then reading it from another thread makes them visible.
So is it enough in this case to ensure a correct visibility? There is never a real concurrent access to the same memory areas, so a volatile field should be a lot cheaper than a ReadWriteLock.
You don't even need volatile here, because BlockingQueues already provide necessary memory visibility guarantees:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a BlockingQueue happen-before actions subsequent to the access or removal of that element from the BlockingQueue in another thread.
In general, if you already have some kind of synchronization, you probably don't need to do anything special to ensure memory visibility, because it's already guaranteed by synchronization primitives you use.
However, volatile reads and writes can be used to ensure memory visibility when you don't have explicit synchronization (e.g. in lock-free algorithms).
P. S.
Also, it looks like you can use a CyclicBarrier instead of your solution with queues; it is designed specifically for scenarios like this.
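A minimal sketch of the CyclicBarrier idea, under the assumption that the manager's per-round work can run as the barrier action (which executes once per round, after all workers have arrived and before any is released). The names BarrierRounds and run are made up, each worker owns a single array slot instead of a range, and the manager step is reduced to copying results back into the buffer.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

final class BarrierRounds {
    static int[] run(int workers, int rounds) {
        int[] buffer = new int[workers];        // one exclusive slot per worker
        int[] resultHolder = new int[workers];
        // Barrier action: stands in for processResult() and setBuffer().
        CyclicBarrier barrier = new CyclicBarrier(workers, () ->
                System.arraycopy(resultHolder, 0, buffer, 0, workers));
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int slot = w;
            threads[w] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        resultHolder[slot] = buffer[slot] + 10;
                        // Writes before await() are visible to the barrier action,
                        // and its writes are visible after await() returns.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    // Sketch only: this worker simply stops if the barrier fails.
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return buffer;
    }
}
```

With 3 rounds every slot ends at 30, since each round adds 10 to the value published by the previous round. No volatile field or queue is needed, because CyclicBarrier documents the same kind of memory consistency effects as the BlockingQueue quoted above.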
I am wondering if it is possible to avoid the lost update problem, where multiple threads are updating the same data, while avoiding the use of synchronized(x) { }.
I will be doing numerous adds and increments:
val++;
ary[x] += y;
ary[z]++;
I do not know how Java compiles these into bytecode, or whether a thread could be interrupted in the middle of the bytecode for one of these statements. In other words, are those statements thread safe?
Also, I know that the Vector class is synchronized, but I am not sure what that means. Will the following code be thread safe, in that the value at position i will not change between the vec.get(i) and the vec.set(...)?
class myClass {
Vector<Integer> vec = new Vector<>();
public void someMethod() {
for (int i=0; i < vec.size(); i++)
vec.set(i, vec.get(i) + value);
}
}
Thanks in advance.
For the purposes of threading, ++ and += are treated as two operations (four for double and long), so updates can clobber one another. And not just by one: a scheduler acting at the wrong moment could wipe out milliseconds of updates.
java.util.concurrent.atomic is your friend.
Your code can be made safe, assuming you don't mind each element updating individually and you don't change the size(!), as:
for (int i=0; i < vec.size(); i++) {
synchronized (vec) {
vec.set(i, vec.get(i) + value);
}
}
If you want to add resizing to the Vector you'll need to move the synchronized statement outside of the for loop, and you might as well just use plain new ArrayList. There isn't actually a great deal of use for a synchronised list.
But you could use AtomicIntegerArray:
private final AtomicIntegerArray ints = new AtomicIntegerArray(KNOWN_SIZE);
[...]
int len = ints.length();
for (int i=0; i<len; ++i) {
ints.addAndGet(i, value);
}
}
That has the advantage of no locks(!) and no boxing. The implementation is quite fun too, and you would need to understand it to do more complex updates (random number generators, for instance).
vec.set() and vec.get() are thread safe in that they will not set and retrieve values in such a way as to lose sets and gets in other threads. It does not mean that your set and your get will happen without an interruption.
If you're really going to be writing code like in the examples above, you should probably lock on something. And synchronized(vec) { } is as good as any. You're asking here for two operations to happen in sync, not just one thread safe operation.
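Even java.util.concurrent.atomic only guarantees that each individual call happens atomically, so a separate get followed by a set can still lose updates. You need to do the get-and-increment in one operation, for example with getAndIncrement() or addAndGet().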
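To illustrate the single-operation point, here is a minimal sketch in which several threads add to every element of an AtomicIntegerArray using addAndGet(index, delta), which reads, adds and writes as one atomic step. The class and method names (AtomicArrayAdds, addFromThreads) and the parameters are assumptions for the example.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

final class AtomicArrayAdds {
    // Every thread adds value to every element; no updates should be lost.
    static int[] addFromThreads(int size, int threadCount, int value) {
        AtomicIntegerArray ints = new AtomicIntegerArray(size);
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < size; i++) {
                    ints.addAndGet(i, value); // one atomic read-modify-write
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Copy into a plain array so callers can inspect the result.
        int[] snapshot = new int[size];
        for (int i = 0; i < size; i++) {
            snapshot[i] = ints.get(i);
        }
        return snapshot;
    }
}
```

With 8 threads each adding 5, every element ends at 40. Writing ints.set(i, ints.get(i) + value) instead would reintroduce the lost update, because the get and the set are separate calls.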