Right now I am trying to learn more about Java threading, and I have a small question that I cannot find a direct answer to anywhere. Let's say I have two threads that both share an object:
public class FooA implements Runnable
{
Object data;
public FooA(final Object newData)
{
data = newData;
}
public void doSomething()
{
synchronized(data)
{
data = new Integer(1);
}
}
public void run() {
// Does stuff
}
}
public class FooB implements Runnable
{
Object data;
public FooB(final Object newData)
{
data = newData;
}
public void doSomething()
{
synchronized(data)
{
System.out.println(data);
}
}
}
Would FooA block FooB when it is in the doSomething section of the code? Or vice versa? My gut feeling says yes, but the book I am reading says no, hence the need for monitor objects. I made a slightly more complex version of this, and everything worked fine.
I looked around a bit, but couldn't find a concrete answer.
There are a few issues with this example.
Firstly, synchronized(data) means that it synchronizes on the object that is in data at the time. If you've initialized your two objects with the same object, you should get the synchronization.
However, since you're setting data itself within the code, this won't really work after that (since it won't be the same object then).
final in the constructor's parameter isn't particularly useful. It would be more useful as a modifier on the field itself. It wouldn't work in this particular example because you're modifying the value, but in general, it's a good way of preventing some concurrency issues when you know the value is going to be fixed.
I made a slightly more complex version of this, and everything worked fine.
It's very hard, or almost impossible, to debug concurrency issues by trial and error. The fact that it doesn't fail doesn't mean it will work reliably.
I'd recommend reading this book: http://www.javaconcurrencyinpractice.com/
The problem is that one of the synchronised blocks assigns a new object to data. If that block starts first, and changes data, subsequent runs will be using a different object to lock on. So from then on, both will be able to run simultaneously.
The answer is yes in this case but it is brittle (i.e. the next change to the code will probably break something). Why does it work?
Because FooB never notices that FooA changes the object (each instance holds its own reference in its data field, so FooB never sees the new value that FooA assigns to its own field).
For cases like that, I suggest using AtomicReference, which lets two threads share one reference: any thread can update it at any time, and the other threads see the new value once the update completes.
Any Java Object may be used for synchronization.
If FooA and FooB are both constructed with references to same object then they are sharing the same "lock" object and will block as you are expecting. As the
Object data;
declarations are not final, either FooA or FooB, or both, could assign different values to data and then be synchronizing on different objects, which may be good or bad depending upon what you are trying to do.
In your code sample, fooA.data and fooB.data are not the same object unless someone initializes them as such with something like:
Object o = new Object();
FooA fooA = new FooA(o);
FooB fooB = new FooB(o);
Unless they are initialized to the same instance, they are not the same object, only the same type and name.
When FooA assigns new Integer(1) to data, they will again not be the same object, only the same type and name. So after that, they won't be synchronized unless you call:
fooB.data = fooA.data;
This would have to happen inside the synchronization block to guarantee synchronized execution.
Also, something to know about threading is that even if everything works once, that doesn't mean your program is correct or that it will work every time. Threading problems only occur when the timing happens to work out just right (or just wrong, as it were).
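It's correct insofar as you have to synchronize on the same object, but since one thread reassigns the reference held in "data", you would need to synchronize on something that never changes. Note that "this" only works if both threads go through the same instance, which is not the case for separate FooA and FooB objects.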
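One way to make the two classes block each other reliably is to separate the lock from the data. The following is only a sketch under assumptions: SharedBox, BoxWriter and BoxReader are hypothetical names that do not appear in the question. Both runnables keep a final reference to the same SharedBox, whose monitor guards a value that is free to change.

```java
// Hypothetical holder: the lock (the box itself) never changes, only its content does.
final class SharedBox {
    private Object value;

    SharedBox(Object initial) {
        this.value = initial;
    }

    // Both accessors lock on the same SharedBox instance.
    synchronized void set(Object newValue) {
        value = newValue;
    }

    synchronized Object get() {
        return value;
    }
}

class BoxWriter implements Runnable {
    private final SharedBox box; // final: always the same monitor

    BoxWriter(SharedBox box) {
        this.box = box;
    }

    @Override
    public void run() {
        box.set(Integer.valueOf(1));
    }
}

class BoxReader implements Runnable {
    private final SharedBox box;

    BoxReader(SharedBox box) {
        this.box = box;
    }

    @Override
    public void run() {
        System.out.println(box.get());
    }
}
```

Because the field holding the lock is final, reassigning the guarded value can no longer change which object the threads synchronize on.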
I've searched for this question and I only found answer for primitive type arrays.
Let's say I have a class called MyClass and I want to have an array of its objects in my another class.
class AnotherClass {
[modifiers(?)] MyClass[] myObjects;
void initFunction( ... ) {
// some code
myObjects = new MyClass[] { ... };
}
MyClass accessFunction(int index) {
return myObjects[index];
}
}
I read somewhere that declaring an array volatile does not give volatile access to its fields, but giving a new value of the array is safe.
So, if I understand it well, if I give my array a volatile modifier in my example code, it would be (kinda?) safe, as long as I never change its values with the [] operator.
Or am I wrong? And what should I do if I want to change one of its values? Should I create a new instance of the array and replace the old value with the new one in the initial assignment?
AtomicXYZArray is not an option because it is only good for primitive type arrays. AtomicIntegerArray uses native code for get() and set(), so it didn't help me.
Edit 1:
Collections.synchronizedList(...) can be a good alternative I think, but now I'm looking for arrays.
Edit 2: initFunction() is called from a different class.
AtomicReferenceArray seems to be a good answer. I didn't know about it until now. (I'm still interested in whether my example code would work with a volatile modifier (before the array) with only these two functions called from somewhere else.)
This is my first question. I hope I managed to reach the formal requirements. Thanks.
Yes you are correct when you say that the volatile word will not fulfill your case, as it will protect the reference to the array and not its elements.
If you want both, Collections.synchronizedList(...) or synchronized collections is the easiest way to go.
Using modifiers like you are inclining to do is not the way to do this, as you will not affect the elements.
If you really must use an array like this one: new MyClass[]{ ... };
then AnotherClass is the one that needs to take responsibility for its safety. You are probably looking for lower-level synchronization here: the synchronized keyword and locks.
The synchronized keyword is the easier option: you can create blocks and methods that lock on a given object, or on the current instance by default.
At a higher level you can use Streams to perform a job for you. But in the end, I would suggest you use a synchronized version of an ArrayList if you are already using arrays, and a volatile reference to it, if necessary. If you do not update the reference to your array after your class is created, you don't need volatile and you had better make it final, if possible.
For your data to be thread-safe you want to ensure that there are no simultaneous:
write/write operations
read/write operations
by threads to the same object. This is known as the readers/writers problem. Note that it is perfectly fine for two threads to read data from the same object at the same time.
You can enforce the above properties to a satisfactory level in normal circumstances by using the synchronized modifier (which acts as a lock on objects) and atomic constructs (which perform operations "instantaneously") in methods and for members. This essentially ensures that no two threads can access the same resource at the same time in a way that would lead to bad interleaving.
if I give my array a volatile modifier in my example code, it would be (kinda?) safe.
The volatile keyword ensures that every read of the array reference sees the most recent write to it, rather than a value cached by the thread. This helps with visibility, although it won't guarantee thread safety by itself. Also, volatile should be used sparingly except by experienced programmers, as it may cause unintended effects on the program.
And what should I do if I want to change one of its values? Should I create a new instance of the array and replace the old value with the new one in the initial assignment?
Create synchronized mutator methods for the mutable members of your class if they need to be changed or use the methods provided by atomic objects within your classes. This would be the simplest approach to changing your data without causing any unintended side-effects (for example, removing the object from the array whilst a thread is accessing the data in the object being removed).
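For the "atomic objects" route mentioned above, the JDK's AtomicReferenceArray holds object references rather than primitives. A minimal sketch, assuming a hypothetical Item class standing in for MyClass and a hypothetical ItemTable wrapper (neither name comes from the question):

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical stand-in for the question's MyClass.
final class Item {
    final String name;

    Item(String name) {
        this.name = name;
    }
}

class ItemTable {
    // Each slot is read and written with volatile semantics.
    private final AtomicReferenceArray<Item> items;

    ItemTable(int size) {
        items = new AtomicReferenceArray<>(size);
    }

    Item get(int index) {
        return items.get(index);
    }

    void set(int index, Item item) {
        items.set(index, item);
    }

    // Replaces a slot only if it still holds the expected element.
    boolean replace(int index, Item expected, Item update) {
        return items.compareAndSet(index, expected, update);
    }
}
```

This only makes the slots themselves safe to read and write; if Item had mutable fields, those would still need their own protection.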
Volatile does actually work in this case with one caveat: all the operations on MyClass may only read values.
Contrary to much of what you might read about what volatile does, it has one purpose in the JMM: creating a happens-before relationship. It only affects two kinds of operations:
volatile read (eg. accessing the field)
volatile write (eg. assignment to the field)
That's it. A happens-before relationship, straight from the JLS §17.4.5:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
These relationships are transitive. Taken all together this implies some important points: All actions taken on a single thread happen-before that thread's volatile write to that field (third point above). A volatile write of a field happens-before a read of that field (point two). So any other thread that reads the volatile field would see all the updates, including all referred to objects like array elements in this case, as visible (first point). Importantly, they are only guaranteed to see the updates visible when the field was written. This means that if you fully construct an object, and then assign it to a volatile field and then never mutate it or any of the objects it refers to, it will never be in an inconsistent state. This is safe taken with the caveat above:
class AnotherClass {
private volatile MyClass[] myObjects = null;
void initFunction( ... ) {
// Using a volatile write with a fully constructed object.
myObjects = new MyClass[] { ... };
}
MyClass accessFunction(int index) {
// volatile read
MyClass[] local = myObjects;
if (local == null) {
return null; // or something else
}
else {
// should probably check length too
return local[index];
}
}
}
I'm assuming you're only calling initFunction once. Even if you did call it more than once you would just clobber the values there, it wouldn't ever be in an inconsistent state.
You're also correct that updating this structure is not quite straightforward because you aren't allowed to mutate the array. Copy and replace, as you stated is common. Assuming that only one thread will be updating the values you can simply grab a reference to the current array, copy the values into a new array, and then re-assign the newly constructed value back to the volatile reference. Example:
private void add(MyClass newClass) {
// volatile read
MyClass[] local = myObjects;
if (local == null) {
// volatile write
myObjects = new MyClass[] { newClass };
}
else {
MyClass[] withUpdates = new MyClass[local.length + 1];
// copy the existing elements into the new array
System.arraycopy(local, 0, withUpdates, 0, local.length);
withUpdates[local.length] = newClass;
// volatile write
myObjects = withUpdates;
}
}
If you're going to have more than one thread updating, then you're going to run into issues where you lose additions to the array: two threads could copy an old array, each create a new array with their new element, and then the last write would win. In that case you need to either use more synchronization or AtomicReferenceFieldUpdater.
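To illustrate the multi-writer case, here is a sketch that swaps in AtomicReference instead of AtomicReferenceFieldUpdater (same compare-and-set idea, simpler to set up). CopyOnWriteHolder is a hypothetical name, and MyClass is an empty placeholder for the question's class. A writer that loses the race simply retries against the newer array, so no addition is lost.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

// Empty placeholder for the question's class.
class MyClass {
}

class CopyOnWriteHolder {
    // The array itself is never mutated after it has been published.
    private final AtomicReference<MyClass[]> myObjects =
            new AtomicReference<>(new MyClass[0]);

    void add(MyClass newClass) {
        while (true) {
            MyClass[] current = myObjects.get();
            MyClass[] updated = Arrays.copyOf(current, current.length + 1);
            updated[current.length] = newClass;
            // Succeeds only if no other thread replaced the array in the meantime.
            if (myObjects.compareAndSet(current, updated)) {
                return;
            }
        }
    }

    MyClass get(int index) {
        return myObjects.get()[index];
    }

    int size() {
        return myObjects.get().length;
    }
}
```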
Situation: I have multiple states of the same object represented by different instances (which are made using a deep-copy). Now I want to make sure that, no matter which of these grouped instances is accessed, all operations that perform modifications are redirected onto the youngest of these instances[1].
Example:[2]
//Let's create an object
MyObject mObj = new MyObject(...);
//Let's create a list of past states
List<MyObject> pastStates = new ArrayList<MyObject>();
//doing some operations on mObj ....
mObj.modify(...);
//done modifying mObj, now let's save its state and then create a copy to begin again
pastStates.add(mObj.copy());
//more of this...
mObj.modify(...);
pastStates.add(mObj.copy());
//let's compare some old states for whatever reason (e.g. part of an algorithm)
compare(MyObject o1, MyObject o2) {
if(o1.getA() == o2.getA()) {
o2.modify(...); //wait, we modified an old state...
}
}
Now this is a rather obvious example and probably a classic case of programmer's fault. They modified something that is clearly advertised as being a past state. But say we still want to be nice and try to help, and thus intercept the method call and perform it on the correct instance, namely the youngest/master instance.[3]
Question: Is there a way to do this with standard java?
Bonus: Is there a way that doesn't have a horrible impact on performance?
Background: I'm experimenting with different ways to make a library/engine, which I'm writing for fun, harder for the end user to misuse. As I will need these states internally anyway (snapshots in time for certain background functionalities), I would like to make them available to the end user as well so they can profit from my statekeeping, e.g. for use in analytical algorithms.
[1] There can be multiple groups of instances of an object that are not related to each other; relation will presumably be kept by a one way link to the youngest instance which simply won't ever change.
[2] This code is meant as an example, it is clear that this mistake could be prevented by the enduser paying more attention when writing code.
[3] Now an easy way to prevent modification is to wrap the object into an immutable version which throws an exception when trying to modify it, but we do not write this object ourselves and don't want to force the end user to write two versions of their own object if we don't have to...
I would probably create two classes: an "inner" one which is immutable and an "outer" one that maintains a list of inners. (Note: I don't mean inner classes in the JLS sense, just an object that is fully controlled by its wrapper.)
Something like this:
public final class Outer {
private final List<Inner> history = new ArrayList<>(); //history is inverted for brevity, 0 is the latest one
public Outer(int x) {
this.history.add(new Inner(x));
}
public void add(int x) {
history.add(0, new Inner(history.get(0).x + x));
}
public Inner current() {
return history.get(0);
}
public static final class Inner {
private final int x;
private Inner(int x) {
this.x = x;
}
public int getX() {
return x;
}
}
}
With this setup clients can only instantiate Outer, can only mutate Outer but have access to a read-only copy of all the past states. There is no way to accidentally modify a past state. There is no need for separate grouping logic either because each instance of Outer naturally only records its own history.
Method interception can be done with AOP by using an around advice. AspectJ is a good tool for solving such problems. The impact on performance should also be no problem.
In an around advice in most cases you call proceed to execute the target method on the target object, but you can also prevent the method execution and instead do a method call on another object.
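If a third-party AOP tool is not wanted, a similar redirect can be sketched with the JDK's own java.lang.reflect.Proxy. This is a swapped-in technique, not AspectJ, and it only works when the object is used through an interface. Counter, SimpleCounter and Redirecting below are hypothetical names invented for the illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical interface; java.lang.reflect.Proxy can only proxy interfaces.
interface Counter {
    int getValue();

    void modify(int delta);
}

class SimpleCounter implements Counter {
    private int value;

    SimpleCounter(int value) {
        this.value = value;
    }

    public int getValue() {
        return value;
    }

    public void modify(int delta) {
        value += delta;
    }
}

class Redirecting {
    // Wraps a past state: reads hit the snapshot, "modify" is sent to the master.
    static Counter snapshotOf(Counter snapshot, Counter master) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object target = method.getName().equals("modify") ? master : snapshot;
            return method.invoke(target, args);
        };
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(),
                new Class<?>[] { Counter.class },
                handler);
    }
}
```

Each call through the proxy goes through reflection, so this is likely slower than a direct call; whether that matters would need measuring.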
Yes, it is possible using bytecode modification.
Actually, if it were done by AspectJ or another library, it would be implemented using proxies or bytecode modification. But I'm not sure that this specific task is possible with the APIs of aspect programming libraries.
You can find a working example for your task in this repo.
This test from the repository works fine:
//Let's create an object
MyObject mObj = new MyObject();
MyObjectActiveRepository.INSTANCE.putToGroup(mObj, "group1");
MyObjectActiveRepository.INSTANCE.registerActiveForItsGroup(mObj);
//Let's create a list of past states
List<MyObject> pastStates = new ArrayList<MyObject>();
//doing some operations on mObj ....
mObj.modify("state1");
//done modifying mObj, now let's save its state and then create a copy to begin again
pastStates.add(mObj.copy());
//more of this...
mObj.modify("state2");
pastStates.add(mObj.copy());
mObj.modify("state3");
assertEquals("state1", pastStates.get(0).getState());
assertEquals("state2", pastStates.get(1).getState());
assertEquals("state3", mObj.getState());
pastStates.get(0).modify("stateNew");
assertEquals("state1", pastStates.get(0).getState());
assertEquals("state2", pastStates.get(1).getState());
assertEquals("stateNew", mObj.getState());
In short:
I use ByteBuddy (a bytecode generation and modification tool) to redefine the class bytecode before it is loaded, in order to:
remove final from the class (if present)
add a field to save MyObject's "group", to address your note [1]
intercept calls to copy (we need to copy the "group" field additionally) and modify (to retarget the call)
replace the class code in the classloader
TypePool typePool = TypePool.Default.ofClassPath();
new ByteBuddy()
.rebase(typePool.describe("MyObject").resolve(), ClassFileLocator.ForClassLoader.ofClassPath())
.modifiers(TypeManifestation.PLAIN) //our class can be final and we have no access to it - so remove final
.defineField("group", String.class, Visibility.PUBLIC)
.method(named("modify")).intercept(MethodDelegation.to(typePool.describe("Interceptors").resolve()))
.method(named("copy")).intercept(MethodDelegation.to(typePool.describe("Interceptors").resolve()))
.make()
.load(InterceptorsInitializer.class.getClassLoader(), ClassLoadingStrategy.Default.INJECTION);
I implemented MyObjectActiveRepository, which contains information about the active object for each group and the "group" field related functionality, and Interceptors, with a simple copy redefinition which adds the "group" setting, and modify, which does our retargeting.
I think it should be lightweight code. The most expensive part is the reflection call to the setter on group-to-object assignment after object creation. This part can be improved: if we use ByteBuddy, we can replace reflection by implementing a new interface with getGroup() and setGroup(String) methods during bytecode generation and delegating them to FieldAccessor.ofField("group"), so we will have an efficient invocation through the interface. modify() should have nearly the same performance, because it doesn't use reflection, only fully generated bytecode. I didn't do any benchmarking.
Could anyone explain what is the difference between these examples?
Example # 1.
public class Main {
private Object lock = new Object();
private MyClass myClass = new MyClass();
public void testMethod() {
// TODO Auto-generated method stub
synchronized (myClass) {
// TODO: modify myClass variable
}
}
}
Example # 2.
package com.test;
public class Main {
private MyClass myClass = new MyClass();
private Object lock = new Object();
public void testMethod() {
// TODO Auto-generated method stub
synchronized (lock) {
// TODO: modify myClass variable
}
}
}
What should I use as a monitor lock if I need to take care about synchronization when modifying the variable?
Assuming that Main is not intended to be a "leaky abstraction", there is minimal difference between the first and second examples.
It may be better to use an Object rather than some other class because an Object instance has no fields and is therefore smaller. And the Object-as-lock idiom makes it clear that the lock variable is intended to only ever be used as a lock.
Having said that, there is a definite advantage in locking on an object that nothing else will ever see. The problem with a Main method synchronizing on a Main (e.g. this) is that other unrelated code could also be synchronizing on it for an unrelated purpose. By synchronizing on dedicated (private) lock object you avoid that possibility.
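The private-lock idiom described above can be sketched as follows. GuardedCounter is a hypothetical class made up for the illustration; the point is only that the lock field is private and final, so no code outside the class can ever synchronize on it.

```java
class GuardedCounter {
    // Private lock: no outside code can synchronize on it.
    private final Object lock = new Object();
    private long total;

    void add(long amount) {
        synchronized (lock) {
            total += amount;
        }
    }

    long getTotal() {
        // Reads take the same lock so they see the latest write.
        synchronized (lock) {
            return total;
        }
    }
}
```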
In response to the comment:
There is a MAJOR difference in the two cases. In the first you're locking the object that you want to manipulate. In the second you're locking some other object that has no obvious relationship to the object being manipulated. And the second case takes more space, since you must allocate the (otherwise unused) Object, rather than using the already-existing instance you're protecting.
I think you are making an INCORRECT assumption - that MyClass is the data structure that needs protecting. In fact, the Question doesn't say that. Indeed the way that the example is written implies that the lock is intended to protect the entire Main class ... not just a part of its state. And in that context, there IS an obvious connection ...
The only case where it would be better to lock the MyClass would be if the Main was a leaky abstraction that allowed other code to get hold of its myClass reference. That would be bad design, especially in a multi-threaded app.
Based on the revision history, I'm pretty sure that is not the OP's intention.
The synchronized statement is useful when changing variables of an object.
You are changing variables of myClass so you want to lock on myClass object. If you were to change something in lock then you want to lock on lock object.
In example #2 you are modifying myClass but locking on lock object which is nonsense.
In the first case you lock on an object that is known only within this method, so it is unlikely that anybody else will use the same object to lock on, so such a lock is almost useless. The second variant makes much more sense to me.
At the same time, myClass variable is also known only within this method, so it is unlikely that other thread will access it, so probably lock is not necessary here at all. Need more complete example to say more.
In general, you want to lock on the "root" object of the data you're manipulating. If you're, eg, going to subtract a value from a field in object A and add that value to object B, you need to lock some object that is somehow common (at least by convention) between A and B, possibly the "owner" object of the two. This is because you're doing the lock to maintain a "contract" of consistency between separate pieces of data -- the object locked must be common to and conceptually encompassing of the entire set of data that must be kept consistent.
The simple case, of course, is when you're modifying field A and field B in the same object, in which case locking that object is the obvious choice.
A little less obvious is when you're dealing with static data belonging to a single class. In that case you generally want to lock the class.
A separate "monitor" object -- created only to serve as a lockable entity -- is rarely needed in Java, but might apply to, say, elements of two parallel arrays, where you want to maintain consistency between element N of the two arrays. In that case, something like a 3rd array of monitor objects might be appropriate.
(Note that this is all just a "quick hack" at laying out some rules. There are many subtleties that one can run into, especially when attempting to allow the maximum of concurrent access to heavily-accessed data. But such cases are rare outside of high-performance computing.)
Whatever you choose, it's critical that the choice be consistent across all references to the protected data. You don't want to lock object A in one case and object B in another, when referencing/modifying the same data. (And PLEASE don't fall into the trap of thinking you can lock an arbitrary instance of Class A and that will somehow serve to lock another instance of Class A. That's a classical beginner's mistake.)
In your above example you'd generally want to lock the created object, assuming the consistency you're assuring is all internal to that object. But note that in this particular example, unless the constructor for MyClass somehow lets the object address "escape", there is no need to lock at all, since there is no way that another thread can get the address of the new object.
The differences are the class of the lock and its scope. Both topics are pretty much orthogonal to synchronization:
objects with different classes may have different sizes
objects in different scopes may be available in different contexts
Basically, both will behave the same in relation to synchronization.
Neither example is good synchronization practice.
The lock Object should be placed in MyClass as private field.
public class Test{
private MyObj myobj = new MyObj(); //it is not volatile
public class Updater extends Thread{
public void run(){
myobj = getNewObjFromDb(); //now I am setting a new object
}
}
public MyObj getData(){
//getting stale data is fine
return myobj;
}
}
Updater regularly updates myobj
Other classes fetch data using getData
Is this code thread safe without using the volatile keyword?
I think yes. Can someone confirm?
No, this is not thread safe. (What makes you think it is?)
If you are updating a variable in one thread and reading it from another, you must establish a happens-before relationship between the write and the subsequent read.
In short, this basically means making both the read and write synchronized (on the same monitor), or making the reference volatile.
Without that, there are no guarantees that the reading thread will see the update - and it wouldn't even be as simple as "well, it would either see the old value or the new value". Your reader threads could see some very odd behaviour with the data corruption that would ensue. Look at how lack of synchronization can cause infinite loops, for example (the comments to that article, especially Brian Goetz', are well worth reading):
The moral of the story: whenever mutable data is shared across threads, if you don’t use synchronization properly (which means using a common lock to guard every access to the shared variables, read or write), your program is broken, and broken in ways you probably can’t even enumerate.
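The "make the reference volatile" option from this answer can be sketched like this. Snapshot and SnapshotHolder are hypothetical names; Snapshot is deliberately immutable so that readers can never observe it half-updated.

```java
// Hypothetical immutable value; all fields are final.
final class Snapshot {
    final String payload;

    Snapshot(String payload) {
        this.payload = payload;
    }
}

class SnapshotHolder {
    // volatile: a write to this field happens-before every later read of it.
    private volatile Snapshot current = new Snapshot("initial");

    // Called by the updater thread.
    void publish(Snapshot fresh) {
        current = fresh;
    }

    // Called by reader threads.
    Snapshot getData() {
        return current;
    }
}
```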
No, it isn't.
Without volatile, calling getData() from a different thread may return a stale cached value.
volatile forces assignments from one thread to be visible on all other threads immediately.
Note that if the object itself is not immutable, you are likely to have other problems.
You may get a stale reference. You may not get an invalid reference.
The reference you get is the value of the reference to an object that the variable points to or pointed to or will point to.
Note that there are no guarantees as to how stale the reference may be, but it's still a reference to some object and that object still exists. In other words, writing a reference is atomic (nothing can happen during the write) but not synchronized (it is subject to instruction reordering, thread-local caching and so on).
If you declare the reference as volatile, you create a synchronization point around the variable. Simply speaking, that means that all cache of the accessing thread is flushed (writes are written and reads are forgotten).
The only types that don't get atomic reads/writes are non-volatile long and double, because they are larger than 32 bits and the JVM is allowed to split such an access into two 32-bit operations.
If MyObj is immutable (all fields are final), you don't need volatile.
The big problem with this sort of code is the lazy initialization. Without volatile or synchronized keywords, you could assign a new value to myobj that had not been fully initialized. The Java memory model allows for part of an object construction to be executed after the object constructor has returned. This re-ordering of memory operations is why the memory-barrier is so critical in multi-threaded situations.
Without a memory-barrier limitation, there is no happens-before guarantee so you do not know if the MyObj has been fully constructed. This means that another thread could be using a partially initialized object with unexpected results.
Here are some more details around constructor synchronization:
Constructor synchronization in Java
A volatile reference would also give you visibility here, but since myobj behaves like a cached object, an AtomicReference is a good fit as well. Since your code extracts the value from the DB, I'll let the code stay as is and add the AtomicReference to it.
import java.util.concurrent.atomic.AtomicReference;
public class AtomicReferenceTest {
private AtomicReference<MyObj> myobj = new AtomicReference<MyObj>();
public class Updater extends Thread {
public void run() {
MyObj newMyobj = getNewObjFromDb();
updateMyObj(newMyobj);
}
public void updateMyObj(MyObj newMyobj) {
// an unconditional set is enough here; compareAndSet(myobj.get(), ...) could silently fail under contention
myobj.set(newMyobj);
}
}
public MyObj getData() {
return myobj.get();
}
}
class MyObj {
}
I have a class Cache which picks up List<SomeObject> someObjectList from DB and stores it in static variable.
Now I have another thread A which uses this List as follows
class A extends Thread{
private List<SomeObject> somobjLst;
public A(){
somobjLst = Cache.getSomeObjectList();
}
public void run(){
//somobjLst is used in a loop here; nothing is added to it, but its values are read
}
}
Now if at some point objects are added to Cache.someObjectList, will that be reflected in class A? I think it should, as A only holds a reference to it.
Will there be any problem in A's code when the content of Cache.someObjectList changes?
EDIT:
As per suggestions :
if i make
void run (){
while(true){
synchronized(someObjList){
}
try{
Thread.sleep(INTERVAL);
}catch(Exception e){
}
}
}
will this solve problem?
Yes, the changes will be reflected in class A as well. Exactly as you say: A holds a reference to the exact same object as Cache.
Yes, it can lead to a problem if A doesn't expect it to change. It also can lead to a problem if the List implementation is not thread safe (most general-purpose implementations are not thread-safe!). Accessing a non-thread-safe data structure from two threads at the same time can lead to very nasty problems.
Sure, you are holding a reference to the collection in your thread. If the collection is changed while you are iterating over it in the thread, a ConcurrentModificationException may be thrown.
To avoid it you have to use some kind of synchronization mechanism. For example, synchronize both the iteration over the collection and its modification in the other thread using synchronized(collection).
This is a kind of "pessimistic" locking.
Another possibility is to use collections from the java.util.concurrent package.
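As one example from that package, CopyOnWriteArrayList lets a reader iterate over a snapshot while the list is being modified. A small sketch, with CacheDemo as a hypothetical class name (the modification here happens in the same thread purely to keep the example short):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CacheDemo {
    // Hypothetical factory for the shared cache list.
    static List<String> newCache() {
        return new CopyOnWriteArrayList<>();
    }

    // Adds to the list while iterating over it. The iterator works on a
    // snapshot of the underlying array, so it never sees the new elements;
    // a plain ArrayList would throw ConcurrentModificationException here.
    static int iterateWhileAdding(List<String> list) {
        int seen = 0;
        for (String s : list) {
            list.add(s + "-copy");
            seen++;
        }
        return seen;
    }
}
```

Every write copies the whole array, so this tends to suit lists that are read far more often than they are changed.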