How often is a thread executed? My Observer pattern gone wrong? - java

The following is a simplified version of my current code. I am pretty sure I am not doing anything wrong syntax-wise, and I can't locate my conceptual mistake.
This is a sort of observer pattern I tried to implement. I could not afford to inherit from java.util.Observable, as my class is already complicated and inherits from another class.
There are two parts here:
There's a Notifier class implementing Runnable :
public class Notifier implements Runnable{
public void run()
{
while(true)
{
MyDataType data = getData();
if(data.isChanged()==true)
{
refresh();
}
}
}
}
And then there is my main class which needs to respond to changes to MyDataType data.
public class abc {
private MyDataType data;
public void abc(){
Notifier notifier = new Notifier();
Thread thread = new Thread(notifier);
thread.start();
}
public MyDataType getData(){
return this.data;
}
public void refresh(){
MyDataType data = getData();
//Do something with data
}
}
The problem : What's happening is that the notifier is calling refresh() when 'data' changes. However inside refresh(), when I do getData(), I am getting the old version of 'data'!
I should mention that there are other parts of the code which are calling the refresh() function too.
What am I overlooking?
Any other better solutions to this problem?
How should I approach designing Subject-Observer systems if I can't apply the default Java implementation out of the box?

when I do getData(), I am getting the old version of 'data'!
Your data field is shared among more than one thread so it must be marked with the volatile keyword.
private volatile MyDataType data;
This causes a "memory barrier" around the read and the write that keeps the value visible to all threads. Even though the notifier thread is calling getData(), the value for data is being retrieved out of its memory cache. Without the memory barrier, the data value may be updated at unpredictable times or never.
As @JB mentioned in the comments, volatile only protects you against a re-assignment of the data field. If you update one of the fields within the current data value, no memory barrier is crossed, so the notifier's memory will not be updated.
Looking back at your code, it looks like this is the case:
if(data.isChanged()==true)
{
refresh();
}
If data is not being assigned to a new object, then making data volatile won't help you. You will have to do one of the following:
Set some sort of volatile boolean dirty; field whenever data has been updated.
Update or read data within a synchronized block each and every time.
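A minimal sketch of the first option, assuming a hypothetical holder class DirtyFlagData (not from the question) whose payload is a plain int. The writer mutates the payload and then sets the volatile flag; a reader that sees the flag set is guaranteed to see the earlier write as well. It assumes one writer and one reader; with more threads, an AtomicBoolean or a synchronized block would be the safer choice.

```java
// Hypothetical holder: the names are illustrative, not from the question.
class DirtyFlagData {
    private int value;               // plain field, published via the volatile flag
    private volatile boolean dirty;  // volatile write/read creates the memory barrier

    void update(int newValue) {
        value = newValue;  // 1. mutate the data
        dirty = true;      // 2. volatile write: makes step 1 visible to readers
    }

    // Returns true and clears the flag if the data changed since the last call.
    boolean consumeIfDirty() {
        if (!dirty) {      // volatile read
            return false;
        }
        dirty = false;
        return true;
    }

    int getValue() {
        return value;
    }
}

class DirtyFlagDemo {
    // Starts a writer thread, then spins until the change is observed.
    static int awaitUpdate() {
        DirtyFlagData data = new DirtyFlagData();
        Thread writer = new Thread(() -> data.update(42));
        writer.start();
        while (!data.consumeIfDirty()) {
            Thread.yield(); // busy-wait only for the sake of the demo
        }
        return data.getValue();
    }

    public static void main(String[] args) {
        System.out.println("observed value: " + awaitUpdate());
    }
}
```

In real code the notifier would sleep or block between checks instead of spinning.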

First, your data variable might be cached, so you will always need to get the latest value by making it volatile.
Second, what you are doing here is a producer / consumer pattern. This pattern is usually best implemented with messages. When you receive new data, you could create an immutable object and post it to the consumer thread (via a thread safe queue like a BlockingQueue) instead of having a shared variable.
Something along these lines:
// Assumes java.util.concurrent.BlockingQueue and LinkedBlockingQueue are imported.
public class Notifier extends Thread {
    private volatile BlockingQueue<MyDataType> consumerQueue = null;
    public void setConsumerQueue(BlockingQueue<MyDataType> val) {
        consumerQueue = val;
    }
    // main method where data is received from socket...
    public void run() {
        while (!interrupted()) {
            MyDataType data = ... // got new data here
            if (!data.isChanged()) continue;
            // Post new data only when it has changed
            if (consumerQueue != null) consumerQueue.offer(data);
        }
    }
}
public class Consumer extends Thread {
    // BlockingQueue is an interface, so a concrete implementation is needed
    private final BlockingQueue<MyDataType> consumerQueue = new LinkedBlockingQueue<MyDataType>();
    public Consumer(Notifier val) {
        val.setConsumerQueue(consumerQueue);
    }
    public void run() {
        try {
            while (!interrupted()) {
                MyDataType data = consumerQueue.take(); // block until there is data from producer
                processData(data);
            }
        } catch (InterruptedException e) {
            // interrupted while waiting: leave the loop and let the thread end
        }
    }
}


flink SourceFunction<> is being replaced in StreamExecutionEnvironment.addSource()?

I ran into this problem when I was trying to create a custom event source. It contains a queue that allows my other process to add items to it. I then expect my CEP pattern to print some debug messages when there is a match.
But there is no match no matter what I add to the queue. Then I noticed that the queue inside mySource.run() is always empty, which means the queue I used to create the mySource instance is not the same as the one inside StreamExecutionEnvironment. If I make the queue static, forcing all instances to share the same queue, everything works as expected.
DummySource.java
public class DummySource implements SourceFunction<String> {
private static final long serialVersionUID = 3978123556403297086L;
// private static Queue<String> queue = new LinkedBlockingQueue<String>();
private Queue<String> queue;
private boolean cancel = false;
public void setQueue(Queue<String> q){
queue = q;
}
@Override
public void run(org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext<String> ctx)
throws Exception {
System.out.println("run");
synchronized (queue) {
while (!cancel) {
if (queue.peek() != null) {
String e = queue.poll();
if (e.equals("exit")) {
cancel();
}
System.out.println("collect "+e);
ctx.collectWithTimestamp(e, System.currentTimeMillis());
}
}
}
}
@Override
public void cancel() {
System.out.println("canceled");
cancel = true;
}
}
So I dug into the source code of StreamExecutionEnvironment. Inside the addSource() method there is a clean() method, which looks like it replaces the instance with a new one. Its documentation says:
Returns a "closure-cleaned" version of the given function.
Why is that, and why does it need to be serialized?
I've also tried turning off the closure cleaner using getConfig(). The result is still the same: my queue instance is not the same one the env is using.
How do I solve this problem?
The clean() method used on functions in Flink mainly ensures that the function (such as a SourceFunction or MapFunction) is serialisable. Flink serialises those functions and distributes them to the task nodes that execute them, so the instance that runs is a deserialised copy, and any non-static field such as your queue is a copy as well.
Simple variables in your Flink main code, like an int, can simply be referenced in your function. For large or non-serialisable ones, it is better to use broadcast variables and a rich source function. Please refer to https://cwiki.apache.org/confluence/display/FLINK/Variables+Closures+vs.+Broadcast+Variables
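The effect can be reproduced without Flink. The sketch below assumes plain Java serialization and uses a hypothetical QueueHolder class as a stand-in for the source function (it is not a Flink class). After a serialise/deserialise round trip the holder carries its own copy of the queue, so items added to the original queue never reach it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

// Stand-in for a source function holding a queue; not a Flink class.
class QueueHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final Queue<String> queue;

    QueueHolder(Queue<String> queue) {
        this.queue = queue;
    }
}

class SerializationCopyDemo {
    // Serialises and deserialises the holder, similar to shipping a function to a task.
    static QueueHolder roundTrip(QueueHolder original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (QueueHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Queue<String> queue = new LinkedBlockingQueue<>();
        QueueHolder copy = roundTrip(new QueueHolder(queue));
        queue.add("event"); // added after the copy was made
        System.out.println(copy.queue == queue);   // false: a different instance
        System.out.println(copy.queue.isEmpty());  // true: the copy never sees "event"
    }
}
```

A static field is not part of the serialised state, which is consistent with the observation that the static queue worked when everything ran in one JVM.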

How can I synchronize the class so that I can use from UI thread and background threads?

I have a utility class as follows:
public class MetaUtility {
private static final SparseArray<MetaInfo> metaInfo = new SparseArray<>();
public static void flush() {
metaInfo.clear();
}
public static void addMeta(int key, MetaInfo info) {
if(info == null) {
throw new NullPointerException();
}
metaInfo.append(key, info);
}
public static MetaInfo getMeta(int key) {
return metaInfo.get(key);
}
}
This class is very simple and I wanted to have a "central" container to be used across classes/activities.
The issue is threading.
Right now it is populated (i.e the addMeta is called) only in 1 place in the code (not in the UI thread) and that is not going to change.
The getter is accessed by UI thread and in some cases by background threads.
Carefully reviewing the code I don't think that I would end up with the case that the background thread would add elements to the sparse array while some other thread would try to access it.
But this is very tricky for someone to know unless he knew the code very well.
My question is, how could I design my class so that I can safely use it from all threads including UI thread?
I can't just add a synchronized or make it block because that would block the UI thread. What can I do?
You should just synchronize on your object, because right now your class is just a wrapper around a SparseArray. If there are thread-level blocking issues, they would come from misuse of this object (well, class, considering it only exposes public static methods) in some other part of your project.
A first attempt can simply use synchronized.
@Jim What about the thread scheduling latency?
The Android scheduler is based on Linux's and is known as the Completely Fair Scheduler (CFS). It is "fair" in the sense that it tries to balance the execution of tasks based not only on the priority of the thread but also on the amount of execution time that has been given to a thread.
If you see "Skipped xx frames! The application may be doing too much work on its main thread", then you need some optimisations.
If the lock is uncontended, you should not be afraid of using synchronized. In this case the lock should be thin, which means that it would not hand a blocked thread to the OS scheduler but would try to acquire the lock again a few instructions later. But if you still want a non-blocking implementation, you could use an AtomicReference to hold the SparseArray<MetaInfo> and update it with CAS.
The code might look something like this:
static final AtomicReference<SparseArray<MetaInfo>> atomicReference =
        new AtomicReference<>(new SparseArray<MetaInfo>());
public static void flush() {
    atomicReference.set(new SparseArray<MetaInfo>());
}
public static void addMeta(int key, MetaInfo info) {
    if (info == null) {
        throw new NullPointerException();
    }
    // declared outside the loop so the while condition can see them
    SparseArray<MetaInfo> current;
    SparseArray<MetaInfo> newArray;
    do {
        current = atomicReference.get();
        newArray = current.clone(); // copy; never mutate the published array
        newArray.put(key, info);
    } while (!atomicReference.compareAndSet(current, newArray));
}
public static MetaInfo getMeta(int key) {
    return atomicReference.get().get(key);
}
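For comparison, the plain synchronized approach suggested earlier in this thread could look like the sketch below. To keep it runnable outside Android it is written under two assumptions: a HashMap stands in for SparseArray and a String stands in for MetaInfo; the class name SynchronizedMetaUtility is made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in: HashMap replaces SparseArray and String replaces MetaInfo.
final class SynchronizedMetaUtility {
    private static final Map<Integer, String> META = new HashMap<>();

    private SynchronizedMetaUtility() {
    }

    static void flush() {
        synchronized (META) {
            META.clear();
        }
    }

    static void addMeta(int key, String info) {
        if (info == null) {
            throw new NullPointerException("info");
        }
        synchronized (META) {
            META.put(key, info);
        }
    }

    static String getMeta(int key) {
        synchronized (META) {
            return META.get(key);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // A background thread populates the map while the main thread waits for it.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                addMeta(i, "meta-" + i);
            }
        });
        writer.start();
        writer.join();
        System.out.println(getMeta(999)); // meta-999
    }
}
```

Each critical section is a single map operation, so the UI thread would only ever wait for the duration of one put, get, or clear.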

How to Thread a complex Model class to provide synchronization with a Controller class?

How can I proceed in a controller based on whether just one part of a complex model has produced the correct flag?
A controller class is playing a queue of Midi sequences while holding onto an instance of a model class that is dynamically updated via user button presses. After the Midi queue ends, the controller needs to synchronize with the model to check that the user has made a certain number of entries before proceeding to update the interface and move to the next part of the application. The Model represents quite a lot of other data in addition to the ArrayList of user button presses, so the challenge is how to best compartmentalize the synchronization part.
Right now, the pattern I'm trying is something like the following, which doesn't work because of nested class access between the controller and the model:
//Controller
...
Thread entriesCoordination = new Thread( new Model.InnerClass);
entriesCoordination.start();
Thread t = new Thread (this);
t.run();
...
//in runnable nested class in controller
private Model.InnerClass c = new Model.InnerClass();
public void run() {
synchronized( c) {
while (!c.hasFinishedEntries()){
try{
c.wait();
} catch (InterruptedException ignore) {}
}
}
}
//Midiqueue completed and Entries finished
}
//in Model
//in runnable nested class in Model
public synchronized boolean hasFinishedEntries() {
return fIsFinishedWithEntries;
}
public void run() {
while(true) {
try{
synchronized(this) {
try{
if(entriesArray.size() == max_size) {
fIsFinishedWithEntries = true;
notifyAll();
} else {...}
}
}
}
}
}
Furthermore, this seems wasteful because it basically means that I need to create a thread and run the inner class of the Model in parallel the entire duration of the time that the user can make these button selections, rather than something that would just poll the Model when I know that the Midi queue has ended.
What's the design pattern for synchronizing on one flag in a Model class from a Controller class without having to create an inner class in the model just to handle the synchronization?
I think the right thing to do here is to use an AtomicBoolean and define methods on each of your thread objects to get and set the boolean.
The Model.InnerClass would be changed to add the AtomicBoolean and to change the getter to not be synchronized.
private final AtomicBoolean fIsFinishedWithEntries = new AtomicBoolean();
public boolean hasFinishedEntries() {
return fIsFinishedWithEntries.get();
}
Something in the run method needs to set the finished boolean to true.
public void run() {
    while (true) {
        if (entriesArray.size() == max_size) {
            synchronized (this) {
                fIsFinishedWithEntries.set(true);
                notifyAll();
            }
        } else {...}
    }
}
You'll need to reset it to false somewhere if you are doing this more than once.
Right now, the pattern I'm trying is something like the following, which doesn't work because of nested class access between the controller and the model:
You first need to create your Model.InnerClass instance and inject it into your controller thread. Making hasFinishedEntries() static is ugly, so instead your controller would do:
private final Model.InnerClass innerClass;
public ControllerThread(Model.InnerClass innerClass) {
    this.innerClass = innerClass;
}
...
public void run() {
    synchronized (innerClass) {
        try {
            while (!innerClass.hasFinishedEntries()) {
                innerClass.wait();
            }
        } catch (InterruptedException e) {
            // restore the interrupt flag and stop waiting
            Thread.currentThread().interrupt();
        }
    }
}
How can I access whether the entries are finished without synchronizing on the entire Model class?
You can obviously just poll hasFinishedEntries() whenever you want to see if the queue has ended. I'm not sure of a better way to do this without a thread. Is there some way to set up a UI event which checks for a condition every so often?
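One way to avoid a dedicated model thread is to raise the notification from the method that records a button press. This is a variation on the answer above rather than the answerer's code, and it assumes that every press goes through a single method; EntriesTracker, addEntry and awaitFinished are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the part of the Model that tracks button presses.
class EntriesTracker {
    private final List<String> entries = new ArrayList<>();
    private final int maxSize;

    EntriesTracker(int maxSize) {
        this.maxSize = maxSize;
    }

    // Called on each button press; wakes waiters once enough entries exist.
    synchronized void addEntry(String entry) {
        entries.add(entry);
        if (entries.size() >= maxSize) {
            notifyAll();
        }
    }

    synchronized boolean hasFinishedEntries() {
        return entries.size() >= maxSize;
    }

    // Blocks the caller until enough entries have arrived.
    synchronized void awaitFinished() throws InterruptedException {
        while (!hasFinishedEntries()) {
            wait();
        }
    }
}

class EntriesDemo {
    // Simulates a user thread adding presses while the controller waits.
    static boolean run() {
        EntriesTracker tracker = new EntriesTracker(3);
        Thread user = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                tracker.addEntry("press-" + i);
            }
        });
        user.start();
        try {
            tracker.awaitFinished(); // controller side, e.g. after the MIDI queue ends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return tracker.hasFinishedEntries();
    }

    public static void main(String[] args) {
        System.out.println("finished: " + run());
    }
}
```

Only the small tracker object is locked, so the rest of the Model stays outside the synchronization.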

Implementation of "canonical" lock objects

I have a store of data objects and I wish to synchronize modifications that are related to one particular object at a time.
class DataStore {
Map<ID, DataObject> objects = // ...
// other indices and stuff...
public final void doSomethingToObject(ID id) { /* ... */ }
public final void doSomethingElseToObject(ID id) { /* ... */ }
}
That is to say, I do not wish my data store to have a single lock since modifications to different data objects are completely orthogonal. Instead, I want to be able to take a lock that pertains to a single data object only.
Each data object has a unique id. One way is to create a map of ID => Lock and synchronize upon the one lock object associated with the id. Another way is to do something like:
synchronized (dataObject.getId().toString().intern()) {
// ...
}
However, this seems like a memory leak -- the internalized strings may never be collected.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
In summary, how can I write a function f(s), where s is a String, such that f(s)==f(t) if s.equals(t) in a memory-safe manner?
Add the lock directly to this DataObject, you could define it like this:
public class DataObject {
private Lock lock = new ReentrantLock();
public void lock() { this.lock.lock(); }
public void unlock() { this.lock.unlock(); }
public void doWithAction( DataObjectAction action ) {
this.lock();
try {
action.doWithLock( this );
} finally {
this.unlock();
}
}
// other methods here
}
public interface DataObjectAction { void doWithLock( DataObject object ); }
And when using it, you could simply do it like this:
DataObject object = // something here
object.doWithAction( new DataObjectAction() {
public void doWithLock( DataObject object ) {
object.setProperty( "Setting the value inside a locked object" );
}
} );
And there you have a single object locked for changes.
You could even make this a read-write lock if read operations can happen while you are writing.
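A rough sketch of that read-write variant, assuming Java 8 functional interfaces in place of the DataObjectAction callback. RwDataObject and its property field are hypothetical names used only for illustration.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical variant of DataObject; the 'property' field is illustrative.
class RwDataObject {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private String property = "";

    // Runs a mutation while holding the exclusive write lock.
    void write(Consumer<RwDataObject> action) {
        lock.writeLock().lock();
        try {
            action.accept(this);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs a read-only function while holding the shared read lock.
    <T> T read(Function<RwDataObject, T> query) {
        lock.readLock().lock();
        try {
            return query.apply(this);
        } finally {
            lock.readLock().unlock();
        }
    }

    void setProperty(String value) {
        property = value;
    }

    String getProperty() {
        return property;
    }
}

class RwDataObjectDemo {
    public static void main(String[] args) {
        RwDataObject object = new RwDataObject();
        object.write(o -> o.setProperty("updated under the write lock"));
        System.out.println(object.read(RwDataObject::getProperty));
    }
}
```

Several readers can hold the read lock at once, while a writer waits for them and then runs alone.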
For such cases, I normally use two levels of locking:
The first level is a reader-writer lock, which makes sure that updates to the map (add/delete) are properly synchronized by treating them as "writes", while access to entries in the map is treated as a "read" on the map. Once you have obtained the value, you then synchronize on the value itself. Here is a little example:
class DataStore {
    Map<ID, DataObject> objMap = // ...
    ReadWriteLock objMapLock = new ReentrantReadWriteLock();
    // other indices and stuff...
    public void addDataObject(DataObject obj) {
        objMapLock.writeLock().lock();
        try {
            // do what you need; you may synchronize on obj too, depending on the situation
            objMap.put(obj.getId(), obj);
        } finally {
            objMapLock.writeLock().unlock();
        }
    }
    public final void doSomethingToObject(ID id) {
        objMapLock.readLock().lock();
        try {
            DataObject dataObj = this.objMap.get(id);
            if (dataObj == null) {
                return; // no object with this id; nothing to lock
            }
            synchronized (dataObj) {
                // do what you need
            }
        } finally {
            objMapLock.readLock().unlock();
        }
    }
}
Everything should then be properly synchronized without sacrificing much concurrency.
Yet another idea is to synchronize upon the data object itself; however, what if you have an operation where the data object doesn't exist yet? For example, what will a method like addDataObject(DataObject) synchronize upon?
Synchronizing on the object is probably viable.
If the object doesn't exist yet, then nothing else can see it. Provided that you can arrange that the object is fully initialized by its constructor, and that it is not published by the constructor before the constructor returns, then you don't need to synchronize it. Another approach is to partially initialize in the constructor, and then use synchronized methods to do the rest of the construction and the publication.

Is there any way to know the progress of a EJB Asynchronous process?

I'm trying to get the percentage of the progress from a EJB Asynchronous process. Is this possible?
Does anyone have an idea how I could do this?
Tracking the progress of asynchronous processes is always tricky, especially if you don't know whether they have actually started yet.
The best way I have found is to write another function that just gets the progress: if you have a unique id for each call, update a hash map with the current progress. You may want to look at ConcurrentHashMap (http://download-llnw.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ConcurrentHashMap.html).
This lookup function then just takes the unique id and returns the progress to the client.
If the process hasn't started, you can return that too, and ideally you will also want to return any error messages that came up during processing.
Then, once it has finished and you have returned the error message or success, delete the entry from the hash map: the client has the information and it won't change, so there is no point in keeping it around.
UPDATE:
In your interface make a new function
String progressDone(String id);
You then call that synchronously, as it just goes out and comes right back: it looks up the id in the hash map and returns either the percentage done or an error message.
But this means that your actual worker function needs to put information about where it is into the hash map every so often, which is why I suggested the concurrent hash map, so that you don't have to worry about concurrent writes and locking.
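The idea can be sketched in plain Java as follows. ProgressRegistry and its method names other than progressDone are hypothetical, the status strings are an arbitrary choice, and how the registry is shared between beans (for example through a singleton) is left open.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry shared by the worker and the lookup function.
class ProgressRegistry {
    private final ConcurrentMap<String, String> status = new ConcurrentHashMap<>();

    // Called by the worker every so often.
    void report(String id, int percent) {
        status.put(id, percent + "%");
    }

    // Called by the worker when processing fails.
    void fail(String id, String message) {
        status.put(id, "ERROR: " + message);
    }

    // Synchronous lookup for the client; unknown ids count as not started.
    String progressDone(String id) {
        return status.getOrDefault(id, "NOT_STARTED");
    }

    // Called once the client has received the final state.
    void forget(String id) {
        status.remove(id);
    }
}

class ProgressRegistryDemo {
    public static void main(String[] args) {
        ProgressRegistry registry = new ProgressRegistry();
        System.out.println(registry.progressDone("job-1")); // NOT_STARTED
        registry.report("job-1", 40);
        System.out.println(registry.progressDone("job-1")); // 40%
        registry.forget("job-1");
    }
}
```

Because ConcurrentHashMap handles concurrent put and get calls safely, neither the worker nor the lookup needs explicit locking.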
The solution I have found is a context object shared between the asynchronous method and the main thread. Here is an example:
Asynchronous job itself:
@Stateless
public class AsyncRunner implements AsyncRunnerLocal {
    @Asynchronous
    public Future<ResultObject> doWorkAsynchronous(WorkContext context) {
        context.setRunning(true);
        for (int i = 0; i < 100; i++) {
            //Do the next iteration of your work here
            context.setProgress(i);
        }
        context.setRunning(false);
        return new AsyncResult<ResultObject>(new ResultObject());
    }
}
Shared context object. The important thing here is the volatile keyword. Without it, field values may be cached locally in each thread and the progress may not be visible in the main thread:
public class WorkContext {
//volatile is important!
private volatile Integer progress = 0;
private volatile boolean running = false;
//getters and setters are omitted
}
Usage example:
public class ProgressChecker {
@EJB
private AsyncRunnerLocal asyncRunner;
private WorkContext context;
private Future<ResultObject> future;
public void startJob() {
this.context = new WorkContext();
future = asyncRunner.doWorkAsynchronous(this.context);
//the job is running now
while (!future.isDone()) {
System.out.println("Progress: " + this.context.getProgress());
Thread.sleep(1000); //try catch is omitted
}
}
}
In EJB 3.1, @Asynchronous method calls can return a java.util.concurrent.Future. This interface provides boolean isCancelled() and boolean isDone(), but no information about whether execution has started. From my point of view, there is no standard way to find out from the EJB container whether the process has started executing.
