Volatile class instances and member access in Java - java

I think what I'm doing is correct, but since this could blow up quite badly if not, I'd really like clarification.
The code is an example to try and express the point, sorry for any minor typos.
I have the following class
public class Components
{
public final String mVar1;
public final boolean mVar2;
public Components(String var1, boolean var2)
{
mVar1 = var1;
mVar2 = var2;
}
}
If I create a volatile instance of this class, I believe that assigning the value of this component to the address of one already created and in memory is thread safe.
public class Storage
{
public static volatile Components sComponents = null;
}
So, regardless of whether I set this variable on the main or any other thread (where a set will simply point it to an object already created, NOT create a new one), it should be thread safe, because the volatile keyword is acting on the Components reference, which will just be updated to point to the object that already exists.
So, for example
public class ThreadedClass
{
public ThreadedClass()
{
// Create an instance of Components so we have something to copy
mInitialComponents = new Components("My String", false);
// Spin off a thread
create_a_new_thread( threadEntryPoint );
}
// This function is called every frame on the main thread
public void update()
{
// If we have our components, print them out
if (Storage.sComponents != null)
{
print(Storage.sComponents.mVar1);
print(Storage.sComponents.mVar2);
}
}
private Components mInitialComponents = null;
private void threadEntryPoint()
{
// Just sleep for a bit so update gets called a few times
sleep(3000);
// Set our components
Storage.sComponents = mInitialComponents;
}
}
(In the real world code, mInitialComponents is created and accessed via a synchronized function so accessing the original object is thread safe).
So, my question is: when update is called on the main or any other thread, once Storage.sComponents has been set to the existing object in threadEntryPoint, is it simply updating the object's reference, so that the object is guaranteed to be complete whenever the null check passes?
Or is it possible for some or none of the internal members to have been correctly assigned?
Thanks

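Your update method is not thread safe: it reads Storage.sComponents more than once, so it could throw a NullPointerException if another thread sets the field back to null between the check and the use. This can be resolved by reading the field once into a local variable: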
// This function is called every frame on the main thread
public void update()
{
final Components components = Storage.sComponents;
// If we have our components, print them out
if (components != null)
{
print(components.mVar1);
print(components.mVar2);
}
}
The inner values within Components are safe to use as they are final. This is assuming that you do not leak references to the Components instance from within its constructor.

It is safe to assume that if components is not null, its member variables have been initialised correctly. According to the Java Language Specification, a thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values of that object's final fields. See the JLS, section 17.5.
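The two answers above can be combined into one runnable sketch: final fields, a volatile reference, and a single read into a local variable. The class name SafePublicationDemo and the describe() helper are assumptions made for illustration; they are not part of the original code.

```java
// Hypothetical demo class; the names mirror the question but are only a sketch.
final class SafePublicationDemo {

    // Immutable value object: both fields are final and assigned in the constructor.
    static final class Components {
        final String mVar1;
        final boolean mVar2;

        Components(String var1, boolean var2) {
            mVar1 = var1;
            mVar2 = var2;
        }
    }

    // Volatile reference: a write here is visible to any thread that later reads it.
    static volatile Components sComponents = null;

    // Reads the shared reference exactly once, then works only with the local copy.
    static String describe() {
        final Components components = sComponents;
        if (components == null) {
            return "not set";
        }
        return components.mVar1 + "/" + components.mVar2;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(describe()); // prints "not set"
        Thread writer = new Thread(() -> sComponents = new Components("My String", false));
        writer.start();
        writer.join();
        System.out.println(describe()); // prints "My String/false"
    }
}
```

Because describe() never re-reads sComponents after the null check, a later write of null by another thread cannot cause a NullPointerException inside it.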

Related

Init static variable in synchronized block for all the threads, read without synchronized

I often see the following pattern. One thread initializes the client in the init() method, inside a synchronized block. All the other threads also call init() before they start to use the other class methods.
The client value is not changed after initialization, and the field is not declared volatile.
My question is whether this is correct. Will every thread that calls init() see, once init() has finished, the correctly initialized value that was set by the first thread to call init()?
public class DB {
private static Object lock = new Object();
private static Client client;
public void init() {
synchronized (lock) {
if (client != null) {
return;
}
client = new Client();
}
}
public void insert(Object data) {
client.insert(data); // is this ok to access the client without volatile or synchronized?
}
}
The rationale behind that pattern is that, because they read the client inside the synchronized block in the init() method, the client will be set to the correct initialized value, and because the client never changes afterwards, they can use it without volatile or synchronized. Is this a correct assumption?
You can see this pattern for example here: https://github.com/brianfrankcooper/YCSB/blob/cd1589ce6f5abf96e17aa8ab80c78a4348fdf29a/mongodb/src/main/java/site/ycsb/db/MongoDbClient.java where they initialized the database in init method and used it without synchronization after.
It is only safe to do this if you are guaranteed to have called init() before calling insert(data).
There is a happens-before edge created by the synchronized block: the end of the synchronized block happens before the next invocation of the same block.
This means that if a thread has invoked init(), then either:
client was previously uninitialized, so it is initialized on this call.
client was previously initialized, and the write to client has happened before the current thread enters the synchronized block.
No further synchronization is then necessary, at least with respect to client.
However, if a thread doesn't call init(), then there are no guarantees as to whether client is initialized; and no guarantee as to whether the client initialized by another thread (one that did call init()) will be visible to the current thread (the one that didn't call init()).
Relying on clients to call init() first is brittle. It would be much better either to use an eagerly-initialized field:
public class DB {
private static final Client client = new Client();
public void insert(Object data) {
client.insert(data); // Guaranteed to be initialized once class loading is complete.
}
}
or, if you must do it lazily, use a lazy holder:
public class DB {
private static class Holder {
private static final Client client = new Client();
}
public void insert(Object data) {
Holder.client.insert(data); // Holder.client is initialized on first access.
}
}
Or, of course, chuck in a call to init() inside the insert method:
public void insert(Object data) {
init();
client.insert(data);
}
The disadvantage of the latter approach is that all threads must synchronize. In the other two approaches, there is no contention after the first invocation.
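As a rough check of the lazy holder variant, the sketch below calls insert() from several threads and counts how many Client instances get constructed. LazyHolderDemo, the CONSTRUCTED counter and the empty Client are hypothetical stand-ins, not the real classes from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for the DB and Client classes in the question.
final class LazyHolderDemo {

    // Counts how many Client instances were constructed.
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    static final class Client {
        Client() {
            CONSTRUCTED.incrementAndGet();
        }

        void insert(Object data) {
            // A real client would talk to the database here.
        }
    }

    // The JVM initializes Holder on first access and guarantees that this happens once.
    private static final class Holder {
        static final Client CLIENT = new Client();
    }

    static void insert(Object data) {
        Holder.CLIENT.insert(data);
    }

    // Calls insert() from several threads and returns the number of clients built.
    static int runThreads(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> insert("row " + id));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return CONSTRUCTED.get();
    }

    public static void main(String[] args) {
        System.out.println("clients constructed: " + runThreads(8)); // prints "clients constructed: 1"
    }
}
```

No thread takes an explicit lock here; the once-only guarantee comes from class initialization.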
It looks like the rationale behind this pattern is to ensure that the application has only one instance of Client. Repeated invocations of init(), whether parallel or sequential and whether on the same or on different DB objects, will not create a new Client if one already exists; the synchronized block only ensures that the client is created once when multiple threads call init() in parallel.
But that has nothing to do with whether it is safe to call insert() on the client object; that depends entirely on the implementation of insert(), which may or may not be thread-safe.

Inner fields of a class are not collected by GC via Phantom References

I am having problems with phantom references when the referents are fields inside a class. When the object of the class is set to null, its fields are not collected automatically by the GC.
Controller.java
public class Controller {
public static void main( String[] args ) throws InterruptedException
{
Collector test = new Collector();
test.startThread();
Reffered strong = new Reffered();
strong.register();
strong = null; //It doesn't work
//strong.next =null; //It works
test.collect();
Collector.m_stopped = true;
System.out.println("Done");
}
}
Collector.java: I am having a Collector that registers an object to reference queue and prints it when it is collected.
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashMap;
import java.util.Map;
public class Collector {
private static Thread m_collector;
public static boolean m_stopped = false;
private static final ReferenceQueue refque = new ReferenceQueue();
Map<Reference,String> cleanUpMap = new HashMap<Reference,String>();
PhantomReference<Reffered> pref;
public void startThread() {
m_collector = new Thread() {
public void run() {
while (!m_stopped) {
try {
Reference ref = refque.remove(1000);
System.out.println(" Timeout ");
if (null != ref) {
System.out.println(" ref not null ");
}
} catch (Exception ex) {
break;
}
}
}
};
m_collector.setDaemon(true);
m_collector.start();
}
public void register(Test obj) {
System.out.println("Creating phantom references");
//Referred strong = new Referred();
pref = new PhantomReference(obj, refque);
cleanUpMap.put(pref, "Free up resources");
}
public static void collect() throws InterruptedException {
System.out.println("GC called");
System.gc();
System.out.println("Sleeping");
Thread.sleep(5000);
}
}
Reffered.java
public class Reffered {
int i;
public Collector test;
public Test next;
Reffered () {
test= new Collector();
next = new Test();
}
void register() {
test.register(next);
}
}
Test is an empty class. I can see that the "next" field in the Reffered class is not collected when the Reffered object is set to null. In other words, when "strong" is set to null, "next" is not collected. I assumed that "next" would be collected automatically by the GC, because "next" is no longer referenced once "strong" is set to null. However, when "strong.next" is set to null, "next" is collected as expected. Why is "next" not collected automatically when strong is set to null?
You have a very confusing code structure.
At the beginning of your code, you have the statements
Collector test = new Collector();
test.startThread();
so you are creating an instance of Collector that the background thread will have a reference to. That thread isn’t even touching that reference, but since it is an anonymous inner class, it will hold a reference to its outer instance.
Within Reffered you have a field of type Collector that is initialized with new Collector() in the constructor, in other words, you are creating another instance of Collector. This is the instance on which you invoke register.
So all artifacts created by register, the PhantomReference held in pref and the HashMap held in cleanUpMap, which has also a reference to the PhantomReference, are only referenced by the instance of Collector referenced by Reffered. If the Reffered instance becomes unreachable, all these artifacts become unreachable too and nothing will be registered at the queue.
This is the place to recall the java.lang.ref package documentation:
The relationship between a registered reference object and its queue is one-sided. That is, a queue does not keep track of the references that are registered with it. If a registered reference becomes unreachable itself, then it will never be enqueued. It is the responsibility of the program using reference objects to ensure that the objects remain reachable for as long as the program is interested in their referents.
There are some ways to illustrate the issue with your program.
Instead of doing either, strong = null; or strong.next = null;, you may do both:
strong.next = null;
strong = null;
Here, it doesn’t matter that next has been nulled out; this variable is unreachable anyway once strong = null has been executed. After that, the PhantomReference that was only reachable through the Reffered instance has become unreachable itself, and no “ref not null” message will be printed.
Alternatively, you may change that code part to
strong.next = null;
strong.test = null;
which will also make the PhantomReference unreachable, thus never enqueued.
But if you change it to
Object o = strong.test;
strong = null;
the message “ref not null” will be printed, as o holds an indirect reference to the PhantomReference. It must be emphasized that this is not guaranteed behavior: Java is allowed to eliminate the effect of unused local variables. But it is sufficiently reproducible with the current HotSpot implementation to demonstrate the point.
The bottom line is, the Test instance has been always collected as expected. It’s just that in some cases, more has been collected than you were aware of, including the PhantomReference itself, so no notification happened.
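A minimal sketch of the arrangement the package documentation asks for: the PhantomReference is held in a static field, so it stays reachable after the referent is dropped. PhantomDemo and its methods are hypothetical names. System.gc() is only a hint, so the printed result is not guaranteed on every JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical demo: one class owns both the queue and the reference object.
final class PhantomDemo {

    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

    // A strong, static field keeps the PhantomReference itself reachable.
    static PhantomReference<Object> tracked;

    static void register(Object referent) {
        tracked = new PhantomReference<>(referent, QUEUE);
    }

    // Requests GC repeatedly and waits up to roughly timeoutMillis for a notification.
    static boolean awaitEnqueued(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                System.gc(); // only a hint; the JVM may ignore it
                Reference<?> ref = QUEUE.remove(100);
                if (ref != null) {
                    return ref == tracked;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        Object next = new Object();
        register(next);
        next = null; // drop the only strong reference to the referent
        System.out.println("enqueued: " + awaitEnqueued(5000));
    }
}
```

If tracked were instead a field of an object that itself becomes unreachable, as in the question, the reference could be collected together with its referent and never reach the queue.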
As a last remark, a variable like public static boolean m_stopped that you share between two threads must be declared volatile to ensure that a thread will notice modifications made by another thread. It happens to work here without, because the JVM’s optimizer did not do much work for such a short running program and architectures like x86 synchronize caches. But it’s unreliable.
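The stop flag remark can be illustrated in isolation. In this sketch (StopFlagDemo and runAndStop() are assumed names), a worker polls a volatile boolean and the calling thread checks that the worker terminates after the flag is set.

```java
// Hypothetical demo of a stop flag shared between two threads.
final class StopFlagDemo {

    // volatile guarantees that the worker sees the write made by another thread.
    static volatile boolean stopped = false;

    // Starts a polling worker, asks it to stop, and reports whether it terminated.
    static boolean runAndStop() {
        stopped = false;
        Thread worker = new Thread(() -> {
            while (!stopped) {
                // busy-wait; a real worker would do useful work or block here
            }
        });
        worker.setDaemon(true);
        worker.start();
        stopped = true;
        try {
            worker.join(2000); // wait at most two seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

Without volatile, the JIT compiler would be allowed to hoist the read of stopped out of the loop, and the worker might then never exit.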

One thread updates variable and another read it, do I need something special

I have a class that has the object "Card". This class keeps checking to see if the object is not null anymore. Only one other thread can update this object. Should I just do it like the code below? Use volatile? Synchronized? A lock (which I don't really know how to use)? What do you recommend as the easiest solution?
class A {
public Card myCard = null;
public void keepCheck() throws InterruptedException {
while (myCard == null) {
Thread.sleep(100);
}
// value updated
callAnotherMethod();
}
}
Another thread has the following:
public void run(){
a.myCard = new Card(5);
}
What do you suggest?
You should use a proper wait event (see the Guarded Block tutorial), otherwise you run the risk of the "watching" thread seeing the reference before it sees completely initialized member fields of the Card. Also wait() will allow the thread to sleep instead of sucking up CPU in a tight while loop.
For example:
class A {
private final Object cardMonitor = new Object();
private volatile Card myCard;
public void keepCheck () {
synchronized (cardMonitor) {
while (myCard == null) {
try {
cardMonitor.wait();
} catch (InterruptedException x) {
// either abort or ignore, your choice
}
}
}
callAnotherMethod();
}
public void run () {
synchronized (cardMonitor) {
myCard = new Card(5);
cardMonitor.notifyAll();
}
}
}
I made myCard private in the above example. I do recommend avoiding lots of public fields in a case like this, as the code could end up getting messy fast.
Also note that you do not need cardMonitor -- you could use the A itself, but having a separate monitor object lets you have finer control over synchronization.
Beware, with the above implementation, if run() is called while callAnotherMethod() is executing, it will change myCard which may break callAnotherMethod() (which you do not show). Moving callAnotherMethod() inside the synchronized block is one possible solution, but you have to decide what the appropriate strategy is there given your requirements.
The variable needs to be volatile when modifying from a different thread if you intend to poll for it, but a better solution is to use wait()/notify() or even a Semaphore to keep your other thread sleeping until myCard variable is initialized.
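The Semaphore option mentioned above could look roughly like this. CardHandoffDemo, publish() and awaitCard() are hypothetical names, and Card is reduced to a single field for the sake of the sketch.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch; Card and the field names only mirror the question.
final class CardHandoffDemo {

    static final class Card {
        final int value;

        Card(int value) {
            this.value = value;
        }
    }

    // Starts with zero permits, so acquiring blocks until the producer releases one.
    private final Semaphore cardReady = new Semaphore(0);
    private volatile Card myCard;

    // Producer side: store the card, then signal the waiting thread.
    void publish(Card card) {
        myCard = card;
        cardReady.release();
    }

    // Consumer side: blocks without polling until a card has been published.
    Card awaitCard() {
        cardReady.acquireUninterruptibly();
        return myCard;
    }

    public static void main(String[] args) {
        CardHandoffDemo a = new CardHandoffDemo();
        new Thread(() -> a.publish(new Card(5))).start();
        System.out.println("card value: " + a.awaitCard().value); // prints "card value: 5"
    }
}
```

This version hands off a single card to a single waiter; several waiters would each need their own permit.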
Looks like you have a classic producer/consumer case.
You can handle this case using wait()/notify() methods. See here for an example: How to use wait and notify in Java?
Or here, for more examples: http://www.programcreek.com/2009/02/notify-and-wait-example/

Concurrent access to static methods

I have a static method with the following signature:
public static List<ResultObjects> processRequest(RequestObject req){
// process the request object and return the results.
}
What happens when there are multiple calls made to the above method concurrently? Will the requests be handled concurrently or one after the other?
Answering exactly your question:
Method will be executed concurrently (multiple times in the same time if you have several threads).
Requests will be handled concurrently.
You need to add the synchronized modifier only if the method works with shared mutable state that must not be accessed by several threads at once.
All your calls to the method will be executed concurrently... but:
You may have a concurrency issue (and end up in a non-thread-safe situation) as soon as the code of your static method modifies static variables. In that case, you can declare your method as synchronized.
If your method only uses local variables, you won't have concurrency issues.
If you need to avoid concurrent execution, you need to explicitly synchronize. The fact that the method is static has nothing to do with it. If you declare the method itself to be synchronized, then the synchronization will be on the class object. Otherwise you will need to synchronize on some static object (since this doesn't exist for static methods).
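Both options from the paragraph above, sketched side by side: a static synchronized method, which locks on the class object, and a block synchronized on a static lock object. StaticSyncDemo and its counters are hypothetical.

```java
// Hypothetical demo of the two ways to guard a static method's critical section.
final class StaticSyncDemo {

    private static final Object LOCK = new Object();
    private static int viaModifier;
    private static int viaLockObject;

    // A static synchronized method locks on StaticSyncDemo.class.
    static synchronized void incrementWithModifier() {
        viaModifier++;
    }

    // The same protection using an explicit static lock object.
    static void incrementWithLockObject() {
        synchronized (LOCK) {
            viaLockObject++;
        }
    }

    // Runs both increments from several threads and returns the two totals.
    static int[] run(int threadCount, int perThread) {
        viaModifier = 0;
        viaLockObject = 0;
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    incrementWithModifier();
                    incrementWithLockObject();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] {viaModifier, viaLockObject};
    }

    public static void main(String[] args) {
        int[] totals = run(4, 1000);
        System.out.println(totals[0] + " " + totals[1]); // prints "4000 4000"
    }
}
```

Without either form of synchronization, the increments could interleave and the totals could come out lower than expected.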
I see a lot of answers but none really pointing out the reason.
So this can be thought like this,
Whenever a thread is created, it is created with its own stack (I guess the size of the stack at the time of creation is ~2MB). So any execution that happens actually happens within the context of this thread stack.
Any object that is created lives on the heap, but its reference lives on the stack; the exception is static variables, which do not live on the thread stack.
Any function call you make is actually pushed onto the thread stack, be it static or non-static. Since the complete method was pushed onto the stack, any variable creation that takes place lives within the stack (again, the exception being static variables) and is only accessible to one thread.
So all methods are thread safe as long as they do not change shared state such as a static variable.
You can check it yourself:
public class ConcurrentStatic {
public static void main(String[] args) {
for (String name: new String[] {"Foo", "Bar", "Baz"}) {
new Thread(getRunnable(name)).start();
}
}
public static Runnable getRunnable(final String name) {
return new Runnable() {
public void run() {
longTask(name);
}
};
}
public static void longTask(String label) {
System.out.println(label + ": start");
try {
Thread.sleep(10000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println(label + ": end");
}
}
All method invocations from separate threads in Java are concurrent by default.

Implementation of "canonical" lock objects

I have a store of data objects and I wish to synchronize modifications that are related to one particular object at a time.
class DataStore {
Map<ID, DataObject> objects = // ...
// other indices and stuff...
public final void doSomethingToObject(ID id) { /* ... */ }
public final void doSomethingElseToObject(ID id) { /* ... */ }
}
That is to say, I do not wish my data store to have a single lock since modifications to different data objects are completely orthogonal. Instead, I want to be able to take a lock that pertains to a single data object only.
Each data object has a unique id. One way is to create a map of ID => Lock and synchronize upon the one lock object associated with the id. Another way is to do something like:
synchronized (dataObject.getId().toString().intern()) {
// ...
}
However, this seems like a memory leak -- the internalized strings may never be collected.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
In summary, how can I write a function f(s), where s is a String, such that f(s)==f(t) if s.equals(t) in a memory-safe manner?
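The "map of ID => Lock" idea from the question can be sketched with ConcurrentHashMap.computeIfAbsent, which returns the same value for any two keys that are equal. LockRegistry and lockFor() are hypothetical names. Note that this sketch never removes entries, so the map grows with the number of distinct ids; unlike interned strings, though, the registry can be pruned or discarded by the application.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper; the class and method names are illustrative only.
final class LockRegistry {

    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    // Returns the same lock object for any two keys that are equal().
    Object lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new Object());
    }

    public static void main(String[] args) {
        LockRegistry registry = new LockRegistry();
        Object first = registry.lockFor("id-42");
        Object second = registry.lockFor(new String("id-42")); // equal but not identical key
        System.out.println(first == second); // prints "true"

        synchronized (registry.lockFor("id-42")) {
            // modifications to the data object with this id would go here
        }
    }
}
```

Because the lock exists independently of the data object, an operation such as addDataObject can take it before the object has been stored.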
Add the lock directly to this DataObject, you could define it like this:
public class DataObject {
private Lock lock = new ReentrantLock();
public void lock() { this.lock.lock(); }
public void unlock() { this.lock.unlock(); }
public void doWithAction( DataObjectAction action ) {
this.lock();
try {
action.doWithLock( this );
} finally {
this.unlock();
}
}
// other methods here
}
public interface DataObjectAction { void doWithLock( DataObject object ); }
And when using it, you could simply do it like this:
DataObject object = // something here
object.doWithAction( new DataObjectAction() {
public void doWithLock( DataObject object ) {
object.setProperty( "Setting the value inside a locked object" );
}
} );
And there you have a single object locked for changes.
You could even make this a read-write lock if you also have read operations happening while writing.
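A read-write variant might look like the following sketch, which uses ReentrantReadWriteLock from java.util.concurrent.locks. ReadWriteDataObject and its single property field are assumptions made for illustration.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical variant of DataObject guarded by a read-write lock.
final class ReadWriteDataObject {

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private String property = "";

    // Many readers may hold the read lock at the same time.
    String getProperty() {
        lock.readLock().lock();
        try {
            return property;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for readers and other writers to finish.
    void setProperty(String value) {
        lock.writeLock().lock();
        try {
            property = value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteDataObject object = new ReadWriteDataObject();
        object.setProperty("updated under the write lock");
        System.out.println(object.getProperty()); // prints "updated under the write lock"
    }
}
```

This pays off mainly when reads greatly outnumber writes; for short critical sections a plain lock may perform just as well.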
For such a case, I normally use two levels of locking:
The first level is a reader-writer lock, which makes sure that updates to the map (add/delete) are properly synchronized by treating them as "writes", while access to entries in the map is treated as a "read" on the map. Once you have the value, synchronize on the value itself as the second level. Here is a little example:
class DataStore {
Map<ID, DataObject> objMap = // ...
ReadWriteLock objMapLock = new ReentrantReadWriteLock();
// other indices and stuff...
public void addDataObject(DataObject obj) {
objMapLock.writeLock().lock();
try {
// do what you need; you may synchronize on obj too, depending on the situation
objMap.put(obj.getId(), obj);
} finally {
objMapLock.writeLock().unlock();
}
}
public final void doSomethingToObject(ID id) {
objMapLock.readLock().lock();
try {
DataObject dataObj = this.objMap.get(id);
if (dataObj == null) {
return; // no object is registered under this id
}
synchronized (dataObj) {
// do what you need
}
} finally {
objMapLock.readLock().unlock();
}
}
}
Everything should then be properly synchronized without sacrificing much concurrency.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
Synchronizing on the object is probably viable.
If the object doesn't exist yet, then nothing else can see it. Provided that you can arrange that the object is fully initialized by its constructor, and that it is not published by the constructor before the constructor returns, then you don't need to synchronize it. Another approach is to partially initialize in the constructor, and then use synchronized methods to do the rest of the construction and the publication.
