I am new to multi-threading in Java and don't quite understand what's going on.
From online tutorials and lecture notes, I know that the synchronized block, which must be applied to a non-null object, ensures that only one thread can execute that block of code. Since an array is an object in Java, synchronize can be applied to it. Further, if the array stores objects, I should be able to synchronize each element of the array too.
My program has several threads updated an array of numbers, hence I created an array of Long objects:
synchronized (grid[arrayIndex]){
grid[arrayIndex] += a.getNumber();
}
This code sits inside the run() method of the thread class which I have extended. The array, grid, is shared by all of my threads. However, this does not return the correct results while running the same program on one thread does.
This will not work. It is important to realize that grid[arrayIndex] += ... is actually replacing the element in the grid with a new object. This means that you are synchronizing on an object in the array and then immediately replacing the object with another in the array. This will cause other threads to lock on a different object so they won't block. You must lock on a constant object.
You can instead lock on the entire array object, if it is never replaced with another array object:
synchronized (grid) {
// this changes the object to another Long so can't be used to lock
grid[arrayIndex] += a.getNumber();
}
This is one of the reasons why it is a good pattern to lock on a final object. See this answer with more details:
Why is it not a good practice to synchronize on Boolean?
Another option would be to use an array of AtomicLong objects, and use their addAndGet() or getAndAdd() method. You wouldn't need synchronization to increment your objects, and multiple objects could be incremented concurrently.
The java class Long is immutable, you cannot change its value. So when you perform an action:
grid[arrayIndex] += a.getNumber();
it is not changing the value of grid[arrayIndex], which you are locking on, but is actually creating a new Long object and setting its value to the old value plus a.getNumber. So you will end up with different threads synchronizing on different objects, which leads to the results you are seeing
The synchronized block you have here is no good. When you synchronize on the array element, which is presumably a number, you're synchronizing only on that object. When you reassign the element of the array to a different object than the one you started with, the synchronization is no longer on the correct object and other threads will be able to access that index.
One of these two options would be more correct:
private final int[] grid = new int[10];
synchronized (grid) {
grid[arrayIndex] += a.getNumber();
}
If grid can't be final:
private final Object MUTEX = new Object();
synchronized (MUTEX) {
grid[arrayIndex] += a.getNumber();
}
If you use the second option and grid is not final, any assignment to grid should also be synchronized.
synchronized (MUTEX) {
grid = new int[20];
}
Always synchronize on something final, always synchronize on both access and modification, and once you have that down, you can start looking into other locking mechanisms, such as Lock, ReadWriteLock, and Semaphore. These can provide more complex locking mechanisms than synchronization that is better for scenarios where Java's default synchronization alone isn't enough, such as locking data in a high-throughput system (read/write locking) or locking in resource pools (counting semaphores).
Related
I've searched for this question and I only found answer for primitive type arrays.
Let's say I have a class called MyClass and I want to have an array of its objects in my another class.
class AnotherClass {
[modifiers(?)] MyClass myObjects;
void initFunction( ... ) {
// some code
myObjects = new MyClass[] { ... };
}
MyClass accessFunction(int index) {
return myObjects[index];
}
}
I read somewhere that declaring an array volatile does not give volatile access to its fields, but giving a new value of the array is safe.
So, if I understand it well, if I give my array a volatile modifier in my example code, it would be (kinda?) safe. In case of I never change its values by the [] operator.
Or am I wrong? And what should I do if I want to change one of its value? Should I create a new instance of the array an replace the old value with the new in the initial assignment?
AtomicXYZArray is not an option because it is only good for a primitive type arrays. AtomicIntegerArray uses native code for get() and set(), so it didn't help me.
Edit 1:
Collections.synchronizedList(...) can be a good alternative I think, but now I'm looking for arrays.
Edit 2: initFunction() is called from a different class.
AtomicReferenceArray seems to be a good answer. I didn't know about it, up to now. (I'm still interested in that my example code would work with volatile modifier (before the array) with only this two function called from somewhere else.)
This is my first question. I hope I managed to reach the formal requirements. Thanks.
Yes you are correct when you say that the volatile word will not fulfill your case, as it will protect the reference to the array and not its elements.
If you want both, Collections.synchronizedList(...) or synchronized collections is the easiest way to go.
Using modifiers like you are inclining to do is not the way to do this, as you will not affect the elements.
If you really, must, use and array like this one: new MyClass[]{ ... };
Then AnotherClass is the one that needs to take responsibility for its safety, you are probably looking for lower level synchronization here: synchronized key word and locks.
The synchonized key word is the easier and yuo may create blocks and method that lock in a object, or in the class instance by default.
In higher levels you can use Streams to perform a job for you. But in the end, I would suggest you use a synchronized version of an arraylist if you are already using arrays. and a volatile reference to it, if necessary. If you do not update the reference to your array after your class is created, you don't need volatile and you better make it final, if possible.
For your data to be thread-safe you want to ensure that there are no simultaneous:
write/write operations
read/write operations
by threads to the same object. This is known as the readers/writers problem. Note that it is perfectly fine for two threads to simultaneously read data at the same time from the same object.
You can enforce the above properties to a satisfiable level in normal circumstances by using the synchronized modifier (which acts as a lock on objects) and atomic constructs (which performs operations "instantaneously") in methods and for members. This essentially ensures that no two threads can access the same resource at the same time in a way that would lead to bad interleaving.
if I give my array a volatile modifier in my example code, it would be (kinda?) safe.
The volatile keyword will place the array reference in main memory and ensure that no thread can cache a local copy of it within their private memory, which helps with thread visibility although it won't guarantee thread safety by itself. Also the use of volatile should be used sparsely unless by experienced programmers as it may cause unintended effects on the program.
And what should I do if I want to change one of its value? Should I create a new instance of the array an replace the old value with the new in the initial assignment?
Create synchronized mutator methods for the mutable members of your class if they need to be changed or use the methods provided by atomic objects within your classes. This would be the simplest approach to changing your data without causing any unintended side-effects (for example, removing the object from the array whilst a thread is accessing the data in the object being removed).
Volatile does actually work in this case with one caveat: all the operations on MyClass may only read values.
Compared to all what you might read about what volatile does, it has one purpose in the JMM: creating a happens-before relationship. It only affects two kinds of operations:
volatile read (eg. accessing the field)
volatile write (eg. assignment to the field)
That's it. A happens-before relationship, straight from the JLS §17.4.5:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
These relationships are transitive. Taken all together this implies some important points: All actions taken on a single thread happened-before that thread's volatile write to that field (third point above). A volatile write of a field happens-before a read of that field (point two). So any other thread that reads the volatile field would see all the updates, including all referred to objects like array elements in this case, as visible (first point). Importantly, they are only guaranteed to see the updates visible when the field was written. This means that if you fully construct an object, and then assign it to a volatile field and then never mutate it or any of the objects it refers to, it will be never be in an inconsistent state. This is safe taken with the caveat above:
class AnotherClass {
private volatile MyClass[] myObjects = null;
void initFunction( ... ) {
// Using a volatile write with a fully constructed object.
myObjects = new MyClass[] { ... };
}
MyClass accessFunction(int index) {
// volatile read
MyClass[] local = myObjects;
if (local == null) {
return null; // or something else
}
else {
// should probably check length too
return local[index];
}
}
}
I'm assuming you're only calling initFunction once. Even if you did call it more than once you would just clobber the values there, it wouldn't ever be in an inconsistent state.
You're also correct that updating this structure is not quite straightforward because you aren't allowed to mutate the array. Copy and replace, as you stated is common. Assuming that only one thread will be updating the values you can simply grab a reference to the current array, copy the values into a new array, and then re-assign the newly constructed value back to the volatile reference. Example:
private void add(MyClass newClass) {
// volatile read
MyClass[] local = myObjects;
if (local == null) {
// volatile write
myObjects = new MyClass[] { newClass };
}
else {
MyClass[] withUpdates = new MyClass[local.length + 1];
// System.arrayCopy
withUpdates[local.length] = newClass;
// volatile write
myObjects = withUpdates;
}
}
If you're going to have more than one thread updating then you're going to run into issues where you lose additions to the array as two threads could copy and old array, create a new array with their new element and then the last write would win. In that case you need to either use more synchronization or AtomicReferenceFieldUpdater
As Integer class is also immutable class and we know that immutable class is thread-safe what is the need of Atomic Integer.
I am confused .
Is it the reason that reads and write of immutable objects need not be atomic whereas read and write of atomic integer is atomic .
That means atomic classes are also thread-safe.
AtomicInteger is used in multithreaded environments when you need to make sure that only one thread can update an int variable. The advantage is that no external synchronization is requried since the operations which modify it's value are executed in a thread-safe way.
Consider the followind code:
private int count;
public int updateCounter() {
return ++count;
}
If multiple threads would call the updateCounter method, it's possible that some of them would receive the same value. The reason it that the ++count operation isn't atomical since isn't only one operation, but made from three operations: read count, add 1 to it's value and write it back to it. Multiple calling threads could see the variable as unmodified to it's latest value.
The above code should be replaced with this:
private AtomicInteger count = new AtomicInteger(0);
public int updateCounter() {
return count.incrementAndGet();
}
The incrementAndGet method is guaranteed to atomically increment the stored value and return it's value without using any external synchonization.
If your value never changes, you don't have to use the AtomicInteger, it's enought to use int.
AtomicInteger is thread safe (in fact, all classes from java.util.concurrent.atomic package are thread safe), while normal integers are NOT threadsafe.
You would require 'synchronized' & 'volatile' keywords, when you are using an 'Integer' variable in multi-threaded environment (to make it thread safe) where as with atomic integers you don't need 'synchronized' & 'volatile' keywords as atomic integers take care of thread safety.
Also, I would recommend the below helpful tutorial on the same subject:
http://tutorials.jenkov.com/java-concurrency/compare-and-swap.html
Please refer below oracle doc for more information on 'atomic' package:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html
While immutable objects are thread-safe by definition, mutable objects can be thread safe too.
That is precisely the purpose of the Atomic... classes (AtomicInteger, AtomicBoolean, and so on).
The various ...get... and ...set... methods allow thread-safe access and mutation of the object.
Not surprisingly, the class is declared in the java.util.concurrent package.
You only have to browse the API for the java.util.concurrent.atomic package:
A small toolkit of classes that support lock-free thread-safe programming on single variables.
Consider a variable
int myInt = 3;
AtomicInteger relates to myInt.
Integer relates to 3.
in other words, your variable is mutable and can change it's value. While the value 3 is an integer literal, a constant, an immutable expression.
Integers are object representations of literals and are therefore immutable, you can basically only read them.
AtomicIntegers are containers for those values. You can read and set them. Same as asigning a value to variable. But different to changing the value of int variable, operations on an AtomicInteger are atomic.
For example this is not atomic
if(myInt == 3) {
myInt++;
}
This is atomic
AtomicInteger myInt = new AtomicInteger(3);
//atomic
myInt.compareAndSet(3, 4);
I think the main difference between AtomicInteger and normal immutable Integer will come into the picture, once we understand why even immutable Integers are not thread-safe.
Let's see with an example.
Suppose, we have a value of int count = 5, which is being shared by two different threads named T1 and T2 with both reading and writing at the same time.
We know that, if there is any value being reassigned into an immutable object, the old object remains at the pool and the new one takes over.
Now, when T1 and T2 are updating their values into count variable, Java might take this value into some cache and will do the set operations there and we won't know when JVM will write the updated value into main memory, so there might be a possibility that one of the threads may be updating the value into a totally stale value.
This brings us to the volatile keyword.
Volatile - This keyword ensures that all the I/O operations on any variable will take place on the main memory so that all the threads are working with the most updated value.
Consider, if 1 Thread is writing and all other threads are reading then, volatile will solve our problem, but if all the threads are reading and writing on the same variable at the same time, then we need synchronizing to ensure thread-safety.
Volatile keyword does not ensure thread-safety.
Now, coming to why AtomicIntegers. Even if are using syncrhonized keyword to ensure thread-safety, the actual update operation of count variable will be a three step process.
get updated value of count variable
increment the value by 1
set the value to count variable
This is why it takes a slightly longer time to update any value for normal Integers to update values once the thread safety is taken into consideration.
**AtomicIntegers solve this problem furthermore of thread safety and also faster updates by an optimized lock-free algorithm called Compare-And-Swap (CAS method).
They perform all the update operations atomically as a single-step process. **
I have a class Helper with one single method int findBiggestNumber(int [] array) and no instance variables.
If I make an object Helper h = new Helper(); and let 10 different threads use that object's only method findBiggestNumber to find their array's biggest number, will they interfere with each other?
My fear is that for example thread-1 starts calculating its array's biggest number when the parameter in findBiggestNumber is referencing an array in for example thread-8. Could that happen in my example?
No the problem you described could not happen. As your Helper class has no members, it is thread safe.
Thread safety issues arise when mutable (changeable) data is shared between multiple threads. However in your example, Helper does not contain any data (i.e. variables) that will be shared between multiple threads as each thread will pass their own data (int[] array) into Helper's findBiggestNumber() method.
Without your implementation of findBiggestNumber, it's impossible to say if it is thread safe since you could be mutating the array passed as an argument. If that is true, and you pass the same array to multiple threads, then there is potentially a race condition; otherwise, it is thread safe.
findBiggestNumber could also be modifying global or static data which would also make it thread unsafe.
I have a static array of classes similar to the following:
public class Entry {
private String sharedvariable1= "";
private String sharedvariable2= "";
private int sharedvariable3= -1;
private int mutablevariable1 = -1
private int mutablevariable2 = -2;
public Entry (String sharedvariable1,
String sharedvariable2,
int sharedvariable3) {
this.sharedvariable1 = sharedvariable1;
this.sharedvariable2 = sharedvariable2;
this.sharedvariable3 = sharedvariable 3;
}
public Entry (Entry entry) { //copy constructor.
this (entry.getSharedvariable1,
entry.getSharedvariable2,
entry.getSharedvaraible3);
}
....
/* other methods including getters and setters*/
}
At some point in my program I access an instance of this object and make a copy of it using the copy constructor above. I then change the value of the two mutable variables above. This program is running in a multithreaded environment. Please note. ALL VARIABLES ARE SET WITH THEIR INITIAL VALUES PRIOR TO THREADING. Only after the program is threaded an a copy is made, are the variables changed. I believe that it is thread safe because I am only reading the static object, not writing to it (even shared variable3, although an int and mutable is only read) and I am only making changes to the copy of the static object (and the copy is being made within a thread). But, I want to confirm that my thinking is correct here.
Can someone please evaluate what I am doing?
It is not thread-safe. You need to wrap anything that modifies the sharedvariables thusly:
synchronized (this) {
this.sharedvariable1 = newValue;
}
For setters, you can do this instead:
public synchronized void setSharedvariable1(String sharedvariable1) {
this.sharedvariable1 = sharedvariable1;
}
Then in your copy constructor, you'll do similarly:
public Entry (Entry entry) {
this();
synchronized(entry) {
this.setSharedvariable1(entry.getSharedvariable1());
this.setSharedvariable2(entry.getSharedvariable2());
this.setSharedvariable3(entry.getSharedvariable3());
}
}
This ensures that if modifications are being made to an instance, the copy operation will wait until the modifications are done.
It is not thread-safe, you should synchronize in your copy constructor. You are reading each of the three variables from the original object in your copy constructor. These operations are not atomic together. So it could be that while you are reading the first value the third value gets changed by another thread. In this case you have a "copied" object in an inconsistent state.
It's not thread safe. And I mean that is does not guarantee thread safety for multiple threads that use the same Entry instance.
The problem I see here is as follows:
Thread 1 starts constructing an Entry instance. It does not keep that instance hidden from other threads access.
Thread 2 accesses that instance, using its copy constructor, while it is still in the middle of construction.
Considering the initial value for Entry's field private int sharedvariable3= -1;, the result might be that the new "copied" instance created by Thread 2 will have its sharedvariable3 field set to 0 (the default for int class fields in java).
That's the problem.
If it bothers you, you've got to either synchronize the read/write operations, or take care of Entry instances publication. Meaning, don't allow access of other threads to an Entry instance that is in the middle of construction.
I don't really get, why you consider private instance variables as shared. Usually shared fields are static and not private - I recommend you not to share private instance variables. For thread-safety you should synchronize the operations that mutate the variables values.
You can use the synchronized keyword for that but choose the correct monitor object (I think the entry itself should do). Another alternative is to use some lock implementation from java.util.concurrent. Usually locks offer higher throughput and better granularity (for example multiple parallel reads but only one write at any given time).
Another thing you have to think about is what is called the memory barrier. Have a look at this interesting article http://java.dzone.com/articles/java-memory-model-programer%E2%80%99s
You can enforce the happens before semantic with the volatile keyword. Explicit synchronization (locks or synchonized code) also crosses the memory barrier and enforces happens before semantics.
Finally a general piece of advice: You should avoid shared mutable state at all costs. Synchronization is a pain in the ass (performance and maintenance wise). Bugs that result from incorrect synchronization are incredibly hard to detect. It is better to design for immutability or isolated mutability (e.g. actors).
The answer is that it is thread safe under the conditions outlined since I am only reading from the variables in their static state and only changing the copies.
I am just curious about the java memory model a little.
Here is what i though.
If i have the following class
public class Test {
int[] numbers = new int[Integer.MAX_VALUE]; // kids dont try this at home
void increment(int ind){
numbers[ind]++;
}
int get(int ind){
return numbers[ind];
}
}
There are multiple readers get() and one writer increment() thread accessing this class.
The question is here , is there actually any synchronization at all that i have to do in order to leave the class at a consistent state after each method call?
Why i am asking this, i am curious if the elements in the array are cached in some way by the JVM or is this only applied to class members? If the members inside the array could be cached, is there a way to define them as volatile ?
Thanks
Roman
As an alternative to synchronizing those methods, you could also consider replacing the int[] with an array of AtomicIntegers. This would have the benefit/downside (depending on your application) of allowing concurrent access to different elements in your list.
You will definitely have to use some sort of synchronization (either on your class or the underlying data structure) in order to ensure the data is left in a consistent state after method calls. Consider the following situations, with two Threads A and B, with the integer array initially containing all zero values.
Thread A calls increment(0). The post-increment operation is not atomic; you can actually consider it to be broken down into at least three steps:
Read the current value; Add one to the current value; Store the value.
Thread B also calls increment(0). If this happens soon after Thread A has done the same, they will both read the same initial value for the element at index 0 of the array.
At this point, both Thread A and B have read a value of '0' for the element they want to increment. Both will increment the value to '1' and store it back in the first element of the array.
Thus, only the work of the Thread that last writes to the array is seen.
The situation is similar if you had a decrement() method. If both increment() and decrement() were called at near-simultaneous times by two separate Threads, there is no telling what the outcome would be. The value would either be incremented by one or decremented by one, and the operations would not "cancel" each other out.
EDIT: Update to reflect Roman's (OP) comment below
Sorry, I mis-read the post. I think I understand your question, which is along the lines of:
"If I declare an array as volatile,
does that mean access to its elements
are treated as volatile as well?"
The quick answer is No: Please see this article for more information; the information in the previous answers here is also correct.
Yes, the VM is allowed to cache inside the thread any field that is not synchronized or voltile. To prevent this, you could mark the fields as volatile, but they still wouldn't be thread safe, since ++ is not an atomic operation. Add the synchronized keyword to the methods, and you're safe.