Possible concurrency issue in Xlet development - java

I am involved in the development of an Xlet using the Java 1.4 API.
The docs say the Xlet interface methods (the xlet life-cycle methods) are called on a special thread (not the EDT thread). I checked by logging, and this is true. It was a bit surprising to me, because it differs from the BlackBerry/Android frameworks, where life-cycle methods are called on the EDT, but it's OK so far.
In the project code I see the app extensively uses Display.getInstance().callSerially(Runnable task) calls (the LWUIT way of running a Runnable on the EDT thread).
So basically, some pieces of code inside the Xlet implementation class do create/update/read operations on the xlet's internal state objects from the EDT thread, while other pieces do so from the life-cycle thread, without any synchronization (and the state variables are not declared volatile). Something like this:
class MyXlet implements Xlet {
    Map state = new HashMap();

    public void initXlet(XletContext context) throws XletStateChangeException {
        state.put("foo", "bar"); // does not run on the EDT thread
        Display.getInstance().callSerially(new Runnable() {
            public void run() {
                // runs on the EDT thread
                Object foo = state.get("foo");
                // branch logic depending on the retrieved foo
            }
        });
    }
    ..
}
My question is: does this create the potential for rare concurrency issues? Should access to the state be synchronized explicitly (or at least, should state be declared volatile)?
My guess is that it depends on whether the code runs on a multi-core CPU, because I'm aware that on a multi-core CPU, if two threads each run on their own core, variables are cached, so each thread has its own version of the state unless it is explicitly synchronized.
I would like to get a trustworthy answer to these concerns.

Yes, in the scenario you describe, access to the shared state must be made thread safe.
There are two problems you need to be aware of:
The first issue, visibility (which you've already mentioned), can occur even on a uniprocessor. The problem is that the JIT compiler is allowed to cache variables in registers, and on a context switch the OS will most likely dump the contents of the registers to a thread context so that execution can be resumed later. However, this is not the same as writing the contents of the registers back to the fields of an object, so after a context switch we cannot assume that the fields of an object are up to date.
For example, take the follow code:
class Example {
    private int i;

    public void doSomething() {
        for (i = 0; i < 1000000; i++) {
            doSomeOperation(i);
        }
    }
}
Since the loop variable (an instance field) i is not declared volatile, the JIT is allowed to optimise it into a CPU register. If this happens, the JIT is not required to write the register's value back to the instance field i until after the loop has completed.
So, let's say a thread is executing the above loop and then gets pre-empted. A newly scheduled thread won't be able to see the latest value of i, because that value is in a register, and the register was saved to a thread-local execution context. At a minimum, the instance field i needs to be declared volatile to force each update of i to be made visible to other threads.
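A minimal sketch of the fix, adding a `progress()` accessor (my addition, not in the original example) to show what another thread would safely observe:

```java
class Example {
    // Declaring the instance field volatile forces every write of i back
    // to memory, so another thread polling it can observe progress.
    private volatile int i;

    public void doSomething() {
        for (i = 0; i < 1000000; i++) {
            doSomeOperation(i);
        }
    }

    // Safe to call from another thread: always reads the latest value.
    public int progress() {
        return i;
    }

    private void doSomeOperation(int n) {
        // placeholder for the work done in the original example
    }
}
```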
The second issue is consistent object state. Take the HashMap in your code as an example: internally it is composed of several non-final member variables size, table, threshold and modCount, where table is an array of Entry nodes forming linked lists. When an element is put into or removed from the map, two or more of these state variables need to be updated atomically for the state to remain consistent. For HashMap this has to be done within a synchronized block (or similar) to be atomic.
For this second issue, you would experience problems even on a uniprocessor. The OS or JVM could pre-emptively switch threads while the current thread is partway through executing the put or remove method, and then switch to another thread that tries to perform some other operation on the same HashMap.
Imagine what would happen if your EDT thread were in the middle of calling the 'get' method when a pre-emptive thread switch occurred and a callback tried to insert another entry into the map - but this time the map exceeds the load factor, causing it to be resized and all the entries to be re-hashed and re-inserted.
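A minimal sketch of making the shared map safe in this scenario (the class and method names are illustrative, and raw types are used deliberately to match the Java 1.4 setting of the question):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SharedState {
    // Collections.synchronizedMap (available since Java 1.2) makes each
    // get/put atomic and establishes the needed happens-before edges
    // between the life-cycle thread and the EDT.
    private final Map state = Collections.synchronizedMap(new HashMap());

    // Called from the xlet life-cycle thread.
    void putState(Object key, Object value) {
        state.put(key, value);
    }

    // Called from the EDT via callSerially().
    Object getState(Object key) {
        return state.get(key);
    }
}
```

Note that compound actions (check-then-put) would still need an explicit synchronized block around both calls.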

Related

Background Thread doesn't work, but with a simple 'System.out' works

A strange thing. The code below works: if the condition desiredHealth < player.getFakeHealth() is true, it DOES SOMETHING.
@Override
public void run() {
    while (game_running) {
        System.out.println("asd");
        if (desiredHealth < player.getFakeHealth()) {
            // DOES SOMETHING
        }
    }
}
BUT... without the 'System.out' it does not work. It doesn't check the condition.
It is somehow on lower priority, or something.
@Override
public void run() {
    while (game_running) {
        if (desiredHealth < player.getFakeHealth()) {
            // DOES SOMETHING
        }
    }
}
I'm new to threads, so please, don't shout at me :)
Just for info, this thread is a normal class which 'extends Thread' and yes - it is running. Also 'game_running' is true all the time.
The variable must be volatile, because the volatile keyword indicates that a value may change between accesses, even if it does not appear to be modified.
So, be sure game_running is declared volatile.
Explanation:
Ahh, I have seen this in an older SO question; I'll try to find it for further information.
Your problem is happening because the print call blocks the current thread, and the desiredHealth and player.getFakeHealth() expressions get a second chance to be evaluated/changed by the other thread, and voilà! Magic happens. This is because println on System.out is synchronized, so when you print, the rest of the operations wait for the println operation to complete.
Resolution:
We don't have enough context(who is initializing the player, who does the changes and so on), but it's obvious that you have a threading issue, something is not properly synchronized and your background thread works with bad values. One of the reasons might be that some variables are not volatile and if your background thread reads a cached value, you already have a problem.
One of the topics you need to study regarding concurrency is the Java memory model (that's the official spec but I suggest you read a tutorial or a good book, as the spec is rather complicated for beginners).
One of the issues when different threads work with the same memory (use the same variables - e.g. one is writing to a variable while the other makes decisions based on its value) is that, for optimization reasons, the values written by one thread are not always seen by the other.
For example, one thread could run on one CPU, and that variable is loaded into a register in that CPU. If it needed to write it back to main memory all the time, it would slow processing. So it manipulates it in that register, and only writes it back to memory when it's necessary. But what if another thread is expecting to see the values the first thread is writing?
In that case, it won't see them until they are written back, which may never happen.
There are several ways to ensure that write operations are "committed" to memory before another thread needs to use them. One is to use synchronization, another is to use the volatile keyword.
System.out.println() in fact includes a synchronized operation, so it may cause such variables to be committed to memory, and thus enable the thread to see the updated value.
Declaring the variable as volatile means that any changes in it are seen by all the other threads immediately. So using volatile variables is also a way of ensuring that they are seen.
The variable that is used to decide whether to keep the thread running should normally be declared volatile. But also, in your case, the variables desiredHealth (if it's written by a different thread) and whatever variables getFakeHealth() relies on (if they are written by a different thread) should be volatile or otherwise synchronized.
The bottom line is that whatever information is shared between two threads needs to be synchronized or at the very least use volatile. Information that is not shared can be left alone.
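A minimal, self-contained sketch of the fix described above (the names mirror the question's `game_running` flag; the loop body is a placeholder):

```java
public class GameLoopDemo {
    // Without volatile, the JIT may hoist the flag read out of the loop
    // and the thread could spin forever on a stale cached value.
    static volatile boolean gameRunning = true;

    public static void main(String[] args) throws InterruptedException {
        Thread loop = new Thread(new Runnable() {
            public void run() {
                while (gameRunning) {
                    // game logic; no println is needed for the flag to be seen
                }
            }
        });
        loop.start();
        Thread.sleep(50);
        gameRunning = false;   // the write is immediately visible to the loop thread
        loop.join(2000);
        System.out.println("loop alive: " + loop.isAlive());  // prints "loop alive: false"
    }
}
```

Removing `volatile` from the flag reproduces the question's symptom: the loop may never observe the change.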

How can I be notified when a thread (that I didn't start) ends?

I have a library in a Jar file that needs to keep track of how many threads use my library. When a new thread comes in, that's no problem: I add it to a list. But I need to remove the thread from the list when it dies.
This is in a Jar file so I have no control over when or how many threads come through. Since I didn't start the thread, I cannot force the app (that uses my Jar) to call a method in my Jar that says, "this thread is ending, remove it from your list". I'd REALLY rather not have to constantly run through all the threads in the list with Thread.isAlive().
By the way: this is a port of some C++ code which resides in a DLL and easily handles the DLL_THREAD_DETACH message. I'd like something similar in Java.
Edit:
The reason for keeping a list of threads is: we need to limit the number of threads that use our library - for business reasons. When a thread enters our library we check to see if it's in the list. If not, it's added. If it is in the list, we retrieve some thread-specific data. When the thread dies, we need to remove it from the list. Ideally, I'd like to be notified when it dies so I can remove it from the list. I can store the data in ThreadLocal, but that still doesn't help me get notification of when the thread dies.
Edit2:
Original first sentence was: "I have a library in a Jar file that needs to keep track of threads that use objects in the library."
Normally you would let the GC clean up resources. You can add a component to the thread which will be cleaned up when it is no longer accessible.
If you use a custom ThreadGroup, it will be notified when a thread is removed from the group. If you start the JAR using a thread in the group, it will also be part of the group. You can also change a thread's group via reflection so it will be notified.
However, polling the threads every few seconds is likely to be simpler.
You can use a combination of ThreadLocal and WeakReference. Create some sort of "ticket" object, and when a thread enters the library, create a new ticket and put it in the ThreadLocal. Also, create a WeakReference (with a ReferenceQueue) to the ticket instance and put it in a list inside your library. When the thread exits, the ticket will be garbage collected and your WeakReference will be queued. By polling the ReferenceQueue, you can essentially get "events" indicating when a thread exits.
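A sketch of that ticket idea (all class and method names here are illustrative; note the WeakReference objects themselves must be kept strongly reachable, or they will never be enqueued):

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

class ThreadTracker {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
    // Holds the WeakReferences so they stay reachable until reaped.
    private final Set<WeakReference<Object>> tickets = new HashSet<WeakReference<Object>>();

    private final ThreadLocal<Object> ticket = new ThreadLocal<Object>() {
        protected Object initialValue() {
            Object t = new Object();  // the per-thread "ticket"
            synchronized (tickets) {
                tickets.add(new WeakReference<Object>(t, queue));
            }
            return t;
        }
    };

    // Call on every entry into the library; creates the ticket on first use.
    void enter() {
        ticket.get();
    }

    // Poll the queue: each enqueued reference is a thread that has exited
    // (its ticket became unreachable and was collected).
    int reapDeadThreads() {
        int reaped = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            synchronized (tickets) {
                tickets.remove(r);
            }
            reaped++;
        }
        return reaped;
    }

    int trackedCount() {
        synchronized (tickets) {
            return tickets.size();
        }
    }
}
```

The "event" is only as prompt as the garbage collector, so this gives eventual rather than immediate notification.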
Based on your edits, your real problem is not tracking when a thread dies, but instead limiting access to your library. Which is good, because there's no portable way to track when a thread dies (and certainly no way within the Java API).
I would approach this using a passive technique, rather than an active technique of trying to generate and respond to an event. You say that you're already creating thread-local data on entry to your library, which means that you already have the cutpoint to perform a passive check. I would implement a ThreadManager class that looks like the following (you could as easily make the methods/variables static):
public class MyThreadLocalData {
    // ...
}

public class TooManyThreadsException extends RuntimeException {
    // ...
}

public class ThreadManager {
    private final static int MAX_SIZE = 10;

    private ConcurrentHashMap<Thread,MyThreadLocalData> threadTable = new ConcurrentHashMap<Thread,MyThreadLocalData>();
    private Object tableLock = new Object();

    public MyThreadLocalData getThreadLocalData() {
        MyThreadLocalData data = threadTable.get(Thread.currentThread());
        if (data != null) return data;

        synchronized (tableLock) {
            if (threadTable.size() >= MAX_SIZE) {
                doCleanup();
            }
            if (threadTable.size() >= MAX_SIZE) {
                throw new TooManyThreadsException();
            }
            data = createThreadLocalData();
            threadTable.put(Thread.currentThread(), data);
            return data;
        }
    }
The thread-local data is maintained in threadTable. This is a ConcurrentHashMap, which means that it provides fast concurrent reads, as well as concurrent iteration (that will be important below). In the happy case, the thread has already been here, so we just return its thread-local data.
In the case where a new thread has called into the library, we need to create its thread-local data. If we have fewer threads than the limit, this proceeds quickly: we create the data, store it in the map, and return it (createThreadLocalData() could be replaced with a plain new, but I tend to like factory methods in code like this).
The sad case is where the table is already at its maximum size when a new thread enters. Because we have no way to know when a thread is done, I chose to simply leave the dead threads in the table until we need space -- just like the JVM and memory management. If we need space, we execute doCleanup() to purge the dead threads (garbage). If there still isn't enough space once we've cleared dead threads, we throw (we could also implement waiting, but that would increase complexity and is generally a bad idea for a library).
Synchronization is important. If we have two new threads come through at the same time, we need to block one while the other tries to get added to the table. The critical section must include the entirety of checking, optionally cleaning up, and adding the new item. If you don't make that entire operation atomic, you risk exceeding your limit. Note, however, that the initial get() does not need to be in the atomic section, so we don't need to synchronize the entire method.
OK, on to doCleanup(): this simply iterates the map and looks for threads that are no longer alive. If it finds one, it calls the destructor ("anti-factory") for its thread-local data:
private void doCleanup() {
    for (Thread thread : threadTable.keySet()) {
        if (!thread.isAlive()) {
            MyThreadLocalData data = threadTable.remove(thread);
            if (data != null) {
                destroyThreadLocalData(data);
            }
        }
    }
}
Even though this function is called from within a synchronized block, it's written as if it could be called concurrently. One of the nice features of ConcurrentHashMap is that any iterators it produces can be used concurrently, and give a view of the map at the time of call. However, that means that two threads might check the same map entry, and we don't want to call the destructor twice. So we use remove() to get the entry, and if it's null we know that it's already been (/being) cleaned up by another thread.
As it turns out, you might want to call the method concurrently. Personally, I think the "clean up when necessary" approach is simplest, but your thread-local data might be expensive to hold if it's not going to be used. If that's the case, create a Timer that will repeatedly call doCleanup():
public Timer scheduleCleanup(long interval) {
    TimerTask task = new TimerTask() {
        @Override
        public void run() {
            doCleanup();
        }
    };
    Timer timer = new Timer(getClass().getName(), true);
    timer.scheduleAtFixedRate(task, 0L, interval);
    return timer;
}

If threads only can access static methods to a class with its own instance, created by a different thread, what thread will this execute on?

Suppose I have a class with two public static methods that control a single private instance of itself. The basic structure of the class is below:
public class MyClass {
    private static MyClass myclass = null;
    // private final Process, OutputStreamWriter, Strings, ints, etc....
    // private constructor....
    // private methods....

    public static void command(String cmd) {
        if (myclass == null) {
            myclass = new MyClass();
        }
        myclass.setCmd(cmd);
    }

    public static void execute() {
        myclass.run();
        myclass.close();
    }
}
I'm using this in an Android application, and I just want to verify how this works before I go too far into designing around it. Suppose that the command for the class comes from the UI thread. The UI thread calls the first static method
MyClass.command("parse and do what's in this string");
Now I expect the MyClass.execute() call, in some cases, may take almost up to a second to complete. I basically just want to verify that if I call the MyClass.execute() method from a Service or Runnable, that the execution will happen on that thread.
In the post static-method-behavior-in-multi-threaded-environment-in-java selig states that:
Memory in java is split up into two kinds - the heap and the stacks. The heap is where all the objects live and the stacks are where the threads do their work. Each thread has its own stack and can't access each others stacks. Each thread also has a pointer into the code which points to the bit of code they're currently running.
When a thread starts running a new method it saves the arguments and local variables in that method on its own stack. Some of these values might be pointers to objects on the heap. If two threads are running the same method at the same time they will both have their code pointers pointing at that method and have their own copies of arguments and local variables on their stacks....
Now, since the UI thread made the call to the static method MyClass.command("Do this"), which technically instantiated the private instance and its arguments and variables, would this mean that the class is located on the UI thread's stack? Meaning that if I called MyClass.execute() from a service thread or runnable thread, the actual execution would happen on the UI thread while the service or runnable waits on it? Is my understanding of this correct?
Thank You!
Ok, there's a lot of misinformation in your post.
1) Services and Runnables do not, by default, have their own thread. Services run on the UI thread, although they can create a thread (an IntentService will do so by default). Runnables run on whatever thread calls run on them. But unless you post them to a Handler attached to another thread or to a Thread object, they won't start a new one.
2) All Java objects are on the heap, not the stack. The stack only holds primitive types and references to objects (but the objects themselves are on the heap).
3) Yes, each Thread has its own stack, so it can have its own set of local variables. But that doesn't prevent it from touching anything on the heap, which includes any object in the program.
4) The only things that are private to a Thread are the local variables in the function. And notice that any local object is still on the heap, and a reference to it can be saved and passed to another thread.
5) There is absolutely nothing that restricts threads to calling only static methods. You can call any type of method you want.
Classes are not located on stacks. There is no interaction between threads and classes.
If you call a method (including MyClass.execute()) the method will run on the same thread as its caller. So if you call it from a service, it will run in the service's thread (but note that this might also be the UI thread, unless you made the service run in a separate thread!). If you call it from some random thread, it will run on that thread.
The stack is not actually important to understanding what Java code does.
would this mean that the class is located on the UI thread's stack?
You're mixing up references and objects.
Objects are always on the heap in Java. Each thread has its own copy of the reference to the object in its own stack. So whatever changes command() makes will affect the object whose reference is on the stack -- the reference itself is unchanged, as well as all the other values on the stack, as the entity being altered is on the heap.
Meaning that if I called the MyClass.execute() from a service thread or runnable thread, the actual execution would happen on the UI thread while the service or runnable waits on it?
If you call MyClass.execute() from a different thread, the code for execute() will be executed on that different thread, period. Each thread keeps track of what code it is executing independently of other threads. So if you call MyClass.execute() from a different thread, the execution will not magically transfer to the UI thread, and would occur independently of other threads, with the exception of any shared objects.
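A small demonstration of the point above (the class and thread names are illustrative): the static method reports the thread it actually executes on, which is always the caller's thread.

```java
public class WhereDoesItRun {
    // A static method simply runs on whichever thread calls it.
    static String execute() {
        return Thread.currentThread().getName();
    }

    public static void main(String[] args) throws InterruptedException {
        final String[] result = new String[1];
        Thread worker = new Thread(new Runnable() {
            public void run() {
                result[0] = execute();  // runs on "worker-thread", not main
            }
        }, "worker-thread");
        worker.start();
        worker.join();
        System.out.println(result[0]);  // prints "worker-thread"
    }
}
```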

Thread with custom states

Is it possible in Java (Android) to implement a customized version of a Thread which carries its own States?
What I mean is:
While ThreadA is in Running state, it still can be polled by ThreadB that asks for its state
e.g.
ThreadA.getState();
Is it possible to modify the state values to some custom ones, so as to implement a sort of basic communication system between those two threads?
Thanks.
Yes, that is possible; I used this a lot in my previous projects. All you need is to extend the Thread class.
public class StateThread extends Thread {
    private String state = "ThreadState";

    // Named setCustomState/getCustomState because Thread already declares
    // getState() returning Thread.State, which can't be overridden with a
    // String return type.
    public synchronized void setCustomState(String newState) {
        state = newState;
    }

    public synchronized String getCustomState() {
        return state;
    }

    @Override
    public void run() {
        // Do stuff and update state...
    }
}
Yes, it is possible to perform this task.
Is it a good design? I don't think so.
There are other means of communication between threads -
For example, you could use a queue with a Producer/Consumer pattern.
I am sure that Android, like Java SE, supports ThreadLocal - you can use it to manage per-thread data (including states), perhaps in combination with a queue that receives "operations" to change the state managed by a thread.
If you do decide to go with the setState and getState methods, at least consider using a ReadWriteLock to optimize your locking.
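A hedged sketch of the getter/setter approach using an AtomicReference instead of synchronized methods (the class, state names, and accessors are all illustrative; this keeps the custom state beside the task rather than fighting Thread's own getState()):

```java
import java.util.concurrent.atomic.AtomicReference;

class StatefulTask implements Runnable {
    // Custom application-level states, distinct from Thread.State.
    static final String IDLE = "IDLE";
    static final String WORKING = "WORKING";
    static final String DONE = "DONE";

    private final AtomicReference<String> state = new AtomicReference<String>(IDLE);

    // Safe to poll from any other thread, no locking needed.
    public String currentState() {
        return state.get();
    }

    public void run() {
        state.set(WORKING);
        // ... do the actual work ...
        state.set(DONE);
    }
}
```

Another thread can poll `task.currentState()` at any time, much like the `ThreadA.getState()` idea in the question, without touching the VM-managed thread state.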
A thread's state is maintained by the virtual machine. The VM uses the state to monitor and manage the actual thread.
That's why there is no mechanism to modify the state of a Thread: there is no setState function that allows you to set a custom state.
For your application's purposes, you can define your own instance variables by extending Thread, but that cannot alter the Thread's state in any way.
Synchronizing with shared data is not very useful for determining the 'state' of a thread: the thread writes its state as 'healthy', then gets stuck - the monitor thread then checks the state and finds it healthy.
Monitoring the 'state' should mean making the checked thread do something, not just looking directly at some shared object.
If you have a message-passing design (as suggested by zaske), you can pass around a 'state record' on the input queue of every thread, asking each to record its state inside and pass it on to the next thread. The 'monitor' thread waits for the record to come back, all filled in. If it does not get it in a reasonable time, it can log what it has got - it keeps a reference to the state-record object, so it can see which thread has not updated its state. It could, perhaps, fail to feed a watchdog timer.
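A minimal sketch of that state-record idea for a single worker (all names are hypothetical; a stuck worker simply never returns the record, which the monitor detects as a timeout):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class StateRecordDemo {
    static class StateRecord {
        volatile String status = "UNKNOWN";
    }

    // The monitor drops a record on the worker's input queue, then waits
    // a bounded time for it to come back on the output queue.
    public static String checkWorker(BlockingQueue<StateRecord> in,
                                     BlockingQueue<StateRecord> out,
                                     long timeoutMs) throws InterruptedException {
        in.put(new StateRecord());
        StateRecord r = out.poll(timeoutMs, TimeUnit.MILLISECONDS);
        return (r == null) ? "STUCK" : r.status;
    }

    public static void main(String[] args) throws InterruptedException {
        final BlockingQueue<StateRecord> in = new ArrayBlockingQueue<StateRecord>(1);
        final BlockingQueue<StateRecord> out = new ArrayBlockingQueue<StateRecord>(1);
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    StateRecord r = in.take();  // wait for a health-check request
                    r.status = "HEALTHY";       // record own state
                    out.put(r);                 // pass it back to the monitor
                } catch (InterruptedException ignored) { }
            }
        });
        worker.start();
        System.out.println(checkWorker(in, out, 2000));  // prints "HEALTHY"
        worker.join();
    }
}
```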

Thread Pool, Shared Data, Java Synchronization

Say, I have a data object:
class ValueRef { double value; }
Where each data object is stored in a master collection:
Collection<ValueRef> masterList = ...;
I also have a collection of jobs, where each job has a local collection of data objects (where each data object also appears in the masterList):
class Job implements Runnable {
    Collection<ValueRef> neededValues = ...;

    void run() {
        double sum = 0;
        for (ValueRef x : neededValues) sum += x.value; // note: x.value, not x
        System.out.println(sum);
    }
}
Use-case:
1. for (ValueRef x : masterList) { x.value = Math.random(); }
2. Populate a job queue with some jobs.
3. Wake up the thread pool.
4. Wait until each job has been evaluated.
Note: During the job evaluation, all of the values are all constant. The threads however, have possibly evaluated jobs in the past, and retain cached values.
Question: what is the minimal amount of synchronization necessary to ensure each thread sees the latest values?
I understand synchronize from the monitor/lock perspective; I do not understand it from the cache/flush perspective (i.e. what is guaranteed by the memory model on entry/exit of a synchronized block).
To me, it feels like I should need to synchronize once in the thread that updates the values to commit the new values to main memory, and once per worker thread, to flush the cache so the new values are read. But I'm unsure how best to do this.
My approach: create a global monitor: static Object guard = new Object(); Then, synchronize on guard, while updating the master list. Then finally, before starting the thread pool, once for each thread in the pool, synchronize on guard in an empty block.
Does that really cause a full flush of any value read by that thread? Or just values touched inside the synchronize block? In which case, instead of an empty block, maybe I should read each value once in a loop?
Thanks for your time.
Edit: I think my question boils down to, once I exit a synchronized block, does every first read (after that point) go to main memory? Regardless of what I synchronized upon?
It doesn't matter that threads of a thread pool have evaluated some jobs in the past.
Javadoc of Executor says:
Memory consistency effects: Actions in a thread prior to submitting a Runnable object to an Executor happen-before its execution begins, perhaps in another thread.
So, as long as you use standard thread pool implementation and change the data before submitting the jobs you shouldn't worry about memory visibility effects.
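A small sketch of that guarantee (ValueRef mirrors the question's class; the pool and helper names are illustrative): the writes to value happen before submit(), so the pooled thread is guaranteed to see them with no extra synchronization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HappensBeforeDemo {
    static class ValueRef { double value; }

    // submit() establishes the happens-before edge quoted from the
    // Executor Javadoc: writes made before submission are visible to the task.
    static double sumInPool(final List<ValueRef> refs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(new Callable<Double>() {
                public Double call() {
                    double sum = 0;
                    for (ValueRef x : refs) sum += x.value;
                    return sum;
                }
            }).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<ValueRef> masterList = new ArrayList<ValueRef>();
        for (int i = 0; i < 4; i++) masterList.add(new ValueRef());
        for (ValueRef x : masterList) x.value = 1.0;  // mutate BEFORE submitting
        System.out.println(sumInPool(masterList));    // prints "4.0"
    }
}
```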
What you are planning sounds sufficient. It depends on how you plan to "wake up the thread pool."
The Java Memory Model provides that all writes performed by a thread before entering a synchronized block are visible to threads that subsequently synchronize on that lock.
So, if you are sure the worker threads are blocked in a wait() call (which must be inside a synchronized block) during the time you update the master list, when they wake up and become runnable, the modifications made by the master thread will be visible to these threads.
I would encourage you, however, to apply the higher level concurrency utilities in the java.util.concurrent package. These will be more robust than your own solution, and are a good place to learn concurrency before delving deeper.
Just to clarify: it's almost impossible to control worker threads without using a synchronized block in which a check is made as to whether the worker has a task to perform. Thus, any changes made by the controller thread to the job happen-before the worker thread wakes. You require a synchronized block, or at least a volatile variable, to act as a memory barrier; however, I can't think how you'd create a thread pool without using one of these.
As an example of the advantages of using the java.util.concurrent package, consider this: you could use a synchronized block with a wait() call in it, or a busy-wait loop on a volatile variable. Because of the overhead of context switching between threads, a busy wait can actually perform better under certain conditions - it's not necessarily the horrible idea that one might assume at first glance.
If you use the Concurrency utilities (in this case, probably an ExecutorService), the best selection for your particular case can be made for you, factoring in the environment, the nature of the task, and the needs of other threads at a given time. Achieving that level of optimization yourself is a lot of needless work.
Why don't you make Collection<ValueRef> and ValueRef immutable, or at least avoid modifying the values in the collection after you have published the reference to it? Then you will not have to worry about synchronization at all.
That is, when you want to change the values of the collection, create a new collection and put the new values in it. Once the values have been set, pass the collection reference to the new job objects.
The only reason not to do this would be if the size of the collection is so large that it barely fits in memory and you cannot afford two copies, or if swapping the collections would cause too much work for the garbage collector (prove that one of these is a problem before you use a mutable data structure for threaded code).
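A sketch of the publish-a-new-snapshot idea (the Snapshot class is illustrative): jobs receive a defensively copied, unmodifiable view, so neither later writes by the publisher nor accidental writes by a job can affect what a thread reads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Snapshot {
    private final List<Double> values;

    // Copy on construction, so later changes to the source are invisible;
    // wrap unmodifiable, so jobs cannot mutate the shared snapshot.
    Snapshot(List<Double> source) {
        this.values = Collections.unmodifiableList(new ArrayList<Double>(source));
    }

    List<Double> values() {
        return values;
    }
}
```

Safe publication still matters: hand the Snapshot to jobs before starting them (or via an Executor's submit), and the final field guarantees of the memory model do the rest.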
