I have a Set with any type of values and an AtomicBoolean that indicates if the functionality provided by that class is running.
private Set<Object> set = new HashSet<>();
private AtomicBoolean running;
Now, I have two methods: one adds objects to the set and the other serves as a setup method for my class.
public void start() {
// ...
set.forEach(someApi::addObject);
// ...
running.set(true);
}
public void addObject(Object o) {
set.add(o);
if(running.get()) {
someApi.addObject(o);
}
}
However, there is a problem with that code. If addObject is invoked from another thread while start is still iterating through the set, running is still false, so the object will not be added to the API.
Question: How can I guarantee that all objects in the set and all objects added with addObject will be added to the API exactly once?
My ideas:
Use a lock and block the addObject method while the setup is adding objects to the API (or make both methods synchronized, which will slightly decrease performance, though).
Question: How can I guarantee that all objects in the set and objects added with addObject will be added to the api exactly one time?
You have to be careful here because this gets close to the old "double-checked locking" bug.
If I understand your question, you want to:
queue the objects passed into addObject(...) in the set before the call to start().
then when start() is called, call the API method on the objects in the set.
handle the overlap if additional objects are added during the call to start()
call the method once and only once on all objects passed to addObject(...).
What is confusing is that your API call is also named addObject(). I assume this is different from the addObject(...) method in your code sample. I'm going to rename it below to be someApiMethod(...) to show that it's not going recursive.
The easiest way is going to be, unfortunately, having a synchronized block in each of the methods:
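private final Set<Object> set = new HashSet<>();
private boolean running; // only read or written while holding the lock on set

public void start() {
    synchronized (set) {
        // flush everything that was queued before start()
        set.forEach(someApi::someApiMethod);
        running = true;
    }
}

public void addObject(Object obj) {
    synchronized (set) {
        // set.add returns false for duplicates; before start() the object is only queued
        if (set.add(obj) && running) {
            someApi.someApiMethod(obj);
        }
    }
}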
Making it faster takes considerably more complicated code. One thing you could do is use a ConcurrentHashMap and an AtomicBoolean running. Something like:
private final ConcurrentMap<Object, Object> map = new ConcurrentHashMap<>();
private final Set<Object> beforeStart = new HashSet<>();
private final AtomicBoolean running = new AtomicBoolean();
public void start() {
synchronized (beforeStart) {
for (Object obj : beforeStart) {
doIfAbsent(obj);
}
running.set(true);
}
}
public void addObject(Object obj) {
if (running.get()) {
doIfAbsent(obj);
} else {
synchronized (beforeStart) {
// we have to test running again once we get the lock
if (running.get()) {
doIfAbsent(obj);
} else {
beforeStart.add(obj);
}
}
}
}
private void doIfAbsent(Object obj) {
// putIfAbsent returns the previous value, or null if obj was not yet in the map
if (map.putIfAbsent(obj, obj) == null) {
someApi.someApiMethod(obj);
}
}
This is pretty complicated and it may not be any faster depending on how large your hash map is and other factors.
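The "once and only once" gate in doIfAbsent can be illustrated in isolation. The following is only a minimal sketch: CountingApi and OnceOnlyGate are hypothetical names invented for this illustration, and the real someApi is replaced by a counter. It relies on the fact that add() on a set created with ConcurrentHashMap.newKeySet() returns true only for the single caller that actually inserted the element.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the real API; it only counts how often it is called.
class CountingApi {
    final AtomicInteger calls = new AtomicInteger();

    void someApiMethod(Object obj) {
        calls.incrementAndGet();
    }
}

// Hypothetical helper showing the exactly-once gate on its own.
class OnceOnlyGate {
    // Concurrent set view backed by a ConcurrentHashMap.
    private final Set<Object> seen = ConcurrentHashMap.newKeySet();
    private final CountingApi api;

    OnceOnlyGate(CountingApi api) {
        this.api = api;
    }

    void doIfAbsent(Object obj) {
        // Only the thread whose add() inserted obj gets true back.
        if (seen.add(obj)) {
            api.someApiMethod(obj);
        }
    }
}
```

This covers only the de-duplication step; the queueing before start() still needs the running flag and the lock shown above.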
I have the following classes:
class MyObjectManager {
Map<String, MyObject> myObjects;
void start(String myObjectName) {
// Create or reuse myObject for given name and run its start() method.
}
void stop(String myObjectName) {
// Create or reuse myObject for given name and run its stop() method.
}
}
class MyObject {
void start() {
// do something
}
void stop() {
// do something
}
}
I need MyObject's start and stop methods to be run synchronously for every MyObject instance, and in the same order in which MyObjectManager's methods were called with that myObjectName. However, I don't care about the order in which methods of different MyObject instances are called. Initially the map is empty, and I create each MyObject the first time any of MyObjectManager's methods is called with a specific myObjectName.
I think this can be solved with a master lock that synchronizes operations on the myObjects map, such as checking, adding and retrieving MyObject instances. But once I have obtained a MyObject instance, either by retrieving it from the map or by creating it, I need to lock on that instance and release the master lock, so that the master lock is not held unnecessarily while I execute MyObject.start() or MyObject.stop().
I'm used to placing unlocking in finally blocks, so I imagine the code to be something like
void start(String myObjectName) {
MyObject myObject = null;
try {
synchronized(myObjects) {
myObject = createOrRetrieveMyObject(myObjectName);
if (myObject == null) {
// It's possible that myObjectName is invalid and MyObject will not be created.
return;
}
myObject.lock();
}
myObject.start();
} finally {
if (myObject != null) {
myObject.unlock();
}
}
}
That doesn't look pretty. Is there a better way?
I need MyObject's start and stop methods to be run synchronously for every MyObject instance ...
I think it might be more convenient to implement this synchronization logic inside MyObject.start() and MyObject.stop() (for instance, you can make these methods synchronized), instead of requiring lock() + unlock() every time someone calls these methods.
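A minimal sketch of that suggestion might look like the following. The history() method and the internal log are hypothetical additions made only so the behaviour can be observed, and using ConcurrentHashMap.computeIfAbsent in the manager is an assumption of this sketch rather than something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MyObject {
    // Hypothetical log of calls, guarded by this object's monitor.
    private final List<String> log = new ArrayList<>();

    // synchronized: at most one thread runs start()/stop() on a given instance at a time
    synchronized void start() {
        log.add("start");
    }

    synchronized void stop() {
        log.add("stop");
    }

    synchronized List<String> history() {
        return new ArrayList<>(log);
    }
}

class MyObjectManager {
    private final Map<String, MyObject> myObjects = new ConcurrentHashMap<>();

    void start(String myObjectName) {
        get(myObjectName).start();
    }

    void stop(String myObjectName) {
        get(myObjectName).stop();
    }

    // computeIfAbsent creates at most one MyObject per name, even under contention.
    MyObject get(String myObjectName) {
        return myObjects.computeIfAbsent(myObjectName, n -> new MyObject());
    }
}
```

Note that intrinsic locks make no fairness guarantee, so this gives mutual exclusion per instance but does not by itself guarantee that contending threads run in the order in which they called the manager.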
I would like to convert the following code to work in a multithreaded environment.
List<Observer> list = new ArrayList<>();
public void removeObserver(Observer p) {
for (Observer observer: list) {
if (observer.equals(p)) {
list.remove(observer);
break;
}
}
}
public void addObserver(Observer p) {
list.add(p);
}
public void notifyObserver(Event obj) {
for (Observer observer: list) {
observer.notify(obj);
}
}
Definitely, one of the easiest ways to do so is to add the synchronized keyword, which ensures that only one thread can run the logic at a time and thereby that the result is correct.
However, is there a better way to solve the issue? I have done some research and found that I can use Collections.synchronizedList. I also noticed that the iterator of such a list is not thread-safe, so I should avoid using a for-each loop or an iterator directly unless I wrap it in synchronized (list).
I would rather not use synchronized, so I wonder whether there is another possible approach. Here is my second attempt.
List<Observer> list = Collections.synchronizedList(new ArrayList<Observer>()); // which is thread safe
public void removeObserver(Observer p) {
// as the list may get modify, I create a copy first
List<Observer> copy = new CopyOnWriteArrayList(list);
for (Observer observer: copy) {
if (observer.equals(p)) {
// but now, no use of iterator
list.remove(observer); // remove it from the original copy
break;
}
}
}
public void addObserver(Observer p) {
list.add(p);
}
public void notifyObserver(Event obj) {
List<Observer> copy = new CopyOnWriteArrayList(list);
// not use iterator, as thread safe list's iterator can be thread unsafe
// and for-each loop use iterator concept
for (Observer observer: copy) {
observer.notify(obj);
}
}
I just want to ask whether my second attempt is thread-safe. Also, is there a better approach than my proposed second method?
Definitely, one of the easiest ways to do so is to add the synchronized keyword, which ensures that only one thread can run the logic at a time and thereby that the result is correct.
This is correct.
However, is there a better way to solve the issue?
Possibly. But let's take a look at your second attempt:
List<Observer> list = Collections.synchronizedList(new ArrayList<Observer>());
// which is thread safe
Yes it is thread-safe. With certain constraints.
public void removeObserver(Observer p) {
// as the list may get modify, I create a copy first
List<Observer> copy = new CopyOnWriteArrayList(list);
...
Three problems here:
You are creating a copy of the list. That is an O(N) operation.
The CopyOnWriteArrayList constructor is going to iterate list ... and iteration of a list created by synchronizedList is not atomic / thread-safe so you have a race condition.
There is no actual benefit of using CopyOnWriteArrayList here over (say) ArrayList. The copy object is local and thread-confined so it doesn't need to be thread-safe.
In summary, this is not thread-safe AND it is more expensive than simply making the original methods synchronized.
A possibly better way:
List<Observer> list = new CopyOnWriteArrayList<>();
public void removeObserver(Observer p) {
list.remove(p);
}
public void addObserver(Observer p) {
list.add(p);
}
public void notifyObserver(Event obj) {
for (Observer observer: list) {
observer.notify(obj);
}
}
This is thread-safe with the caveat that an Observer added while a notifyObserver call is in progress will not be notified.
The only potential problem is that mutations to a CopyOnWriteArrayList are expensive since they create a copy of the entire list. So if the ratio of mutations to notifies is too high, this may be more expensive than the solution using synchronized methods. On the other hand, if multiple threads call notifyObserver, those calls can proceed in parallel.
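To make the snapshot behaviour concrete, here is a small self-contained sketch of the CopyOnWriteArrayList version. Subject is a hypothetical name for the enclosing class, and the minimal Observer and Event types are stand-ins assumed for this illustration. An observer can even remove itself while a notification is in progress, because the for-each loop iterates over the array as it was when the loop started.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal stand-ins for the types used in the question.
class Event {
    final String name;

    Event(String name) {
        this.name = name;
    }
}

interface Observer {
    void notify(Event obj);
}

// Hypothetical holder class for the three methods.
class Subject {
    private final List<Observer> list = new CopyOnWriteArrayList<>();

    void addObserver(Observer p) {
        list.add(p);
    }

    void removeObserver(Observer p) {
        list.remove(p);
    }

    void notifyObserver(Event obj) {
        // Iterates over a snapshot; concurrent add/remove never throws
        // ConcurrentModificationException and only affects later notifications.
        for (Observer observer : list) {
            observer.notify(obj);
        }
    }
}
```

The trade-off described above still applies: every addObserver or removeObserver call copies the backing array.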
I've come across this particular scenario many times, and I wonder what the "clean" way of solving it is. It all comes down to this: how can I store a reference to an object that is being set in a different thread?
Let me illustrate this with an example, imagine I have a class named Bar, and objects from this class are retrieved from this method:
public class BarBuilder {
public static void buildNewBar(final BarListener listener) {
// This could be an HTTP request or something that can only be done in a
// different thread
new Thread(new Runnable() {
@Override
public void run() {
listener.onNewBar(new Bar());
}
}).start();
}
}
The important part here is that buildNewBar() method has to be executed in another Thread, so instead of returning the value, it will communicate the result through a listener. This is quite common for operations that need HTTP requests or any sort of connection.
Now, my problem is: if I need the value before continuing execution, how can I access it? I can block a thread with a semaphore until I have my value, but how to store the value is what is unclear to me (if I declare a final variable, it cannot be set again). I solved it by creating a new class which I named "Pointer", but I wonder why there isn't a built-in Java class for this (I used Vector before, but that doesn't seem like a good solution either).
public Bar getBar() {
final Pointer<Bar> barPointer = new Pointer<Bar>();
final Semaphore semaphore = new Semaphore(0);
BarBuilder.buildNewBar(new BarListener() {
@Override
public void onNewBar(Bar bar) {
barPointer.set(bar);
semaphore.release();
}
});
semaphore.acquireUninterruptibly();
// Now I have my value
return barPointer.get();
}
public class Pointer<T> {
T object;
public void set(T object) {
this.object = object;
}
public T get() {
return object;
}
}
Let's see if there is a better way of doing this supported by Java language, I have seen classes like Reference, but it seems like their purpose is something different and setters don't exist (they are read-only), so that doesn't solve my issues either.
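public Bar getBar() throws InterruptedException, ExecutionException {
    final BarPointer barPointer = new BarPointer();
    BarBuilder.buildNewBar(barPointer);
    // get() blocks until onNewBar() has called set()
    return barPointer.get();
}

public class BarPointer extends FutureTask<Bar> implements BarListener {
    public BarPointer() {
        // FutureTask has no no-arg constructor; this callable is never run
        super(() -> null);
    }

    @Override
    public void onNewBar(Bar bar) {
        set(bar);
    }
}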
In order to eliminate the need to write a custom Pointer class, I would simply use AtomicReference.
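As a rough sketch of that suggestion, the question's getBar() can keep its Semaphore and swap the custom Pointer for an AtomicReference. Bar, BarListener and BarBuilder are reduced to minimal stand-ins here, and BarClient is a hypothetical name for the class holding getBar().

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicReference;

// Minimal stand-ins for the types from the question.
class Bar {
}

interface BarListener {
    void onNewBar(Bar bar);
}

class BarBuilder {
    static void buildNewBar(final BarListener listener) {
        // Simulates work that has to happen on another thread.
        new Thread(() -> listener.onNewBar(new Bar())).start();
    }
}

// Hypothetical holder for getBar().
class BarClient {
    static Bar getBar() {
        // AtomicReference replaces the hand-written Pointer class.
        final AtomicReference<Bar> barRef = new AtomicReference<>();
        final Semaphore semaphore = new Semaphore(0);
        BarBuilder.buildNewBar(bar -> {
            barRef.set(bar);
            semaphore.release();
        });
        // Blocks until the listener has stored the value.
        semaphore.acquireUninterruptibly();
        return barRef.get();
    }
}
```

The reference itself is final, so it can be captured by the listener, while its contents can still be set from the other thread.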
Suppose I have the following:
public class Foo {
private ReadingList mReadingList = new ReadingList();
public ReadingList getReadingList() {
synchronized (mReadingList) {
return mReadingList;
}
}
}
If I try modifying the ReadingList object in two threads, the synchronization above won't help me, right?:
// Thread 1
foo1.getReadingList().setName("aaa");
// Thread 2
foo2.getReadingList().setName("bbb");
Do I have to wrap each method I want synchronized like so:
public class Foo {
private ReadingList mReadingList = new ReadingList();
public synchronized void setReadingListName(String name) {
mReadingList.setName(name);
}
public synchronized void setReadingListAuthor(String author) {
mReadingList.setAuthor(author);
}
...
and so on for each method of ReadingList I want exposed and synched? I'd end up just writing wrapper methods for each method of ReadingList.
Thanks
1. You have access to the ReadingList source
If you have access to the ReadingList source, add synchronized to all of the methods of ReadingList if you want synchronized access to all of its fields, or only to a certain group of setters if you only need to prevent interleaved access to certain fields.
2. You do not have access to ReadingList
You would have to write something like:
public class Foo {
private ReadingList mReadingList = new ReadingList();
public void setReadingListName(String name) {
synchronized(mReadingList) {
mReadingList.setName(name);
}
}
public void setReadingListAuthor(String author) {
synchronized(mReadingList) {
mReadingList.setAuthor(author);
}
}
...
3. Use a general purpose lock object
Depending on the nature of Foo and how general-purpose this whole thing is, you may find that only a certain class or classes present the threading issue in ReadingList.
In such a class you could use a general purpose lock object:
public class Bar {
Object readingListLock = new Object();
public void someMethodThatModifiesReading() {
synchronized(readingListLock) {
foo.getReadingList().setName("1");
}
}
public void someOtherMethodThatModifiesReading() {
synchronized(readingListLock) {
foo.getReadingList().setName("2");
}
}
...
}
A quick (and not so efficient) solution is to synchronize all methods of ReadingList in ReadingList's implementation.
Look into a reader-writer lock for a more efficient way to synchronize access: it allows multiple concurrent reads but only a single write at a time.
Your first solution only makes sure one thread gets the ReadingList at a time, and nothing else - many threads can read and modify the ReadingList concurrently.
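A reader-writer lock around the wrapped ReadingList could look roughly like this. It is only a sketch: the ReadingList below is a minimal stand-in with a single name field, and getReadingListName() is a hypothetical accessor added to show the read side.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal stand-in for the real ReadingList.
class ReadingList {
    private String name;

    void setName(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class Foo {
    private final ReadingList mReadingList = new ReadingList();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void setReadingListName(String name) {
        // Writers are exclusive: no other reader or writer may hold the lock.
        lock.writeLock().lock();
        try {
            mReadingList.setName(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    String getReadingListName() {
        // Several readers may hold the read lock at the same time.
        lock.readLock().lock();
        try {
            return mReadingList.getName();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

This only helps if the ReadingList never escapes Foo; handing out the object itself through a getter would bypass the lock again.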
I have a store of data objects and I wish to synchronize modifications that are related to one particular object at a time.
class DataStore {
Map<ID, DataObject> objects = // ...
// other indices and stuff...
public final void doSomethingToObject(ID id) { /* ... */ }
public final void doSomethingElseToObject(ID id) { /* ... */ }
}
That is to say, I do not wish my data store to have a single lock since modifications to different data objects are completely orthogonal. Instead, I want to be able to take a lock that pertains to a single data object only.
Each data object has a unique id. One way is to create a map of ID => Lock and synchronize upon the one lock object associated with the id. Another way is to do something like:
synchronized (dataObject.getId().toString().intern()) {
// ...
}
However, this seems like a memory leak -- the internalized strings may never be collected.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
In summary, how can I write a function f(s), where s is a String, such that f(s)==f(t) if s.equals(t) in a memory-safe manner?
Add the lock directly to this DataObject, you could define it like this:
public class DataObject {
private Lock lock = new ReentrantLock();
public void lock() { this.lock.lock(); }
public void unlock() { this.lock.unlock(); }
public void doWithAction( DataObjectAction action ) {
this.lock();
try {
action.doWithLock( this );
} finally {
this.unlock();
}
}
// other methods here
}
public interface DataObjectAction { void doWithLock( DataObject object ); }
And when using it, you could simply do it like this:
DataObject object = // something here
object.doWithAction( new DataObjectAction() {
public void doWithLock( DataObject object ) {
object.setProperty( "Setting the value inside a locked object" );
}
} );
And there you have a single object locked for changes.
You could even make this a read-write lock if you also have read operations happening while writing.
For such a case, I normally use two levels of locking:
The first level is a reader-writer lock, which makes sure that updates to the map (add/delete) are properly synchronized by treating them as "writes", while access to entries in the map is treated as a "read" of the map. Once you have the value, synchronize on the value itself. Here is a small example:
class DataStore {
Map<ID, DataObject> objMap = // ...
ReadWriteLock objMapLock = new ReentrantReadWriteLock();
// other indices and stuff...
public void addDataObject(DataObject obj) {
objMapLock.writeLock().lock();
try {
// do what you need; you may synchronize on obj too, depending on the situation
objMap.put(obj.getId(), obj);
} finally {
objMapLock.writeLock().unlock();
}
}
public final void doSomethingToObject(ID id) {
objMapLock.readLock().lock();
try {
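DataObject dataObj = this.objMap.get(id);
if (dataObj == null) {
return; // unknown id; synchronized (null) would throw a NullPointerException
}
synchronized(dataObj) {
// do what you need
}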
} finally {
objMapLock.readLock().unlock();
}
}
}
Everything should then be properly synchronized without sacrificing much concurrency.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
Synchronizing on the object is probably viable.
If the object doesn't exist yet, then nothing else can see it. Provided that you can arrange that the object is fully initialized by its constructor, and that it is not published by the constructor before the constructor returns, then you don't need to synchronize it. Another approach is to partially initialize in the constructor, and then use synchronized methods to do the rest of the construction and the publication.
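For operations on an id whose data object does not exist yet, the "map of ID => Lock" option from the question can be sketched as follows. IdLocks is a hypothetical helper name, and the sketch deliberately never removes entries, so the map grows with the number of distinct ids; whether that is acceptable depends on how many ids there are.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: returns the same lock object for equal ids.
class IdLocks<ID> {
    private final Map<ID, Object> locks = new ConcurrentHashMap<>();

    Object lockFor(ID id) {
        // computeIfAbsent creates at most one lock object per id,
        // so lockFor(s) == lockFor(t) whenever s.equals(t).
        return locks.computeIfAbsent(id, k -> new Object());
    }
}

// Hypothetical usage: per-id counters updated under the per-id lock.
class CounterStore {
    private final IdLocks<String> idLocks = new IdLocks<>();
    private final Map<String, Integer> counters = new ConcurrentHashMap<>();

    void increment(String id) {
        synchronized (idLocks.lockFor(id)) {
            // Only threads working on the same id contend here.
            Integer current = counters.get(id);
            counters.put(id, current == null ? 1 : current + 1);
        }
    }

    int get(String id) {
        Integer current = counters.get(id);
        return current == null ? 0 : current;
    }
}
```

Unlike String.intern(), the lifetime of these lock objects is under the store's control, since the map could be cleared or pruned when ids are retired.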