Simple Thread Management - Java - Android - java

I have an application which spawns a new thread when a user asks for an image to be filtered.
This is the only type of task that I have and all are of equal importance.
If I ask for too many concurrent threads (Max I ever want is 9) the thread manager throws a RejectedExecutionException.
At the moment, what I do is:
// Manage Concurrent Tasks
private Queue<AsyncTask<Bitmap,Integer,Integer>> tasks = new LinkedList<AsyncTask<Bitmap,Integer,Integer>>();
@Override
public int remainingSize() {
    return tasks.size();
}
@Override
public void addTask(AsyncTask<Bitmap, Integer, Integer> task) {
    try {
        task.execute(currentThumbnail);
        while (!tasks.isEmpty()) {
            task = tasks.remove();
            task.execute(currentThumbnail);
        }
    } catch (RejectedExecutionException r) {
        Log.i(TAG, "Caught RejectedExecutionException Exception - Adding task to Queue");
        tasks.add(task);
    }
}
Simply add the rejected task to a queue and the next time a thread is started the queue is checked to see if there is a backlog.
The obvious issue with this is that if the final task gets rejected on its first attempt it will never be restarted (Until after it's no longer needed).
Just wondering if there's a simple model I should use for managing this sort of thing. I need tasks to notify the queue when they are done.

The RejectedExecutionException is thrown because AsyncTask implements a thread pool of its own (per Mr. Martelli's answer), but one that is capped at a maximum of 10 simultaneous tasks. Why it has that limit, I have no idea.
Hence, one possibility is for you to clone AsyncTask, raise the limit (or go unbounded, which is also possible with LinkedBlockingQueue), and use your clone. Then, perhaps, submit the change as a patch to AsyncTask for future Android releases.
Click here to run a Google Code Search for AsyncTask -- the first hit should be the implementation.
If you just want to raise the limit, adjust MAXIMUM_POOL_SIZE to be as big as you're likely to need. If you want to go unbounded, use the zero-argument LinkedBlockingQueue constructor instead of the one being presently used. AFAICT, the rest of the code probably stays the same.
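If cloning AsyncTask is more than you need, the same idea (a fixed number of workers fed by an unbounded LinkedBlockingQueue) can be sketched with a plain ThreadPoolExecutor. This is only an illustration, not AsyncTask's actual code: the class name FilterPool, the submitFilter method, and the pool size of 9 (taken from the question) are assumptions.

```java
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a fixed pool of workers backed by an unbounded queue.
final class FilterPool {
    private static final int POOL_SIZE = 9; // assumption: the question's stated maximum

    // core size == max size, so surplus tasks wait in the queue instead of being rejected
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            POOL_SIZE, POOL_SIZE,
            30L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>()); // zero-argument constructor: unbounded

    FilterPool() {
        executor.allowCoreThreadTimeOut(true); // let idle workers exit after 30 seconds
    }

    Future<?> submitFilter(Runnable filterTask) {
        return executor.submit(filterTask);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With an unbounded queue the pool never throws RejectedExecutionException while it is running, so the manual backlog queue from the question is no longer needed; the trade-off is that a burst of requests can pile up in memory.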

You seem to have implemented a version of the Thread Pool design pattern -- the Wikipedia article points to many helpful articles on the subject, which may help you refine your implementation. I also recommend this Java-specific article, which has clear code and explanation.

Maybe an option is to have the task wait on a blocking queue (of bitmaps) instead of taking bitmap as a parameter, but you will have to add a way for the task(s) to terminate.
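A minimal sketch of that idea, assuming a sentinel value ("poison pill") as the termination mechanism. The class QueueWorker and its methods are hypothetical, and String stands in for Bitmap so the example runs outside Android.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical worker: pulls items from a queue until it sees the STOP marker.
final class QueueWorker implements Runnable {
    // Sentinel compared by identity, so it cannot collide with a real work item.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new CopyOnWriteArrayList<>();

    void offerWork(String item) { queue.add(item); }
    void requestStop() { queue.add(STOP); }
    List<String> processed() { return processed; }

    @Override
    public void run() {
        try {
            while (true) {
                String item = queue.take(); // blocks until work arrives
                if (item == STOP) {
                    return; // everything queued before the marker has been handled
                }
                processed.add("filtered:" + item); // stand-in for the real image filter
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // interruption also ends the worker
        }
    }
}
```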

Related

Purpose of re-usable Executor, ServiceExecutor, etc interfaces

I'm building a GUI'd application with javaFX that supports a long-running CPU intensive operation, something like Prime95 or Orthos.
One of the problems I've run into is trying to get counters to increment nicely. If you think about an ElapsedTime field with an incrementing counter with millisecond resolution, what I need is a job on the UI thread to call elapsedTimeTextField.setText("00:00:00.001") to happen 1ms before a corresponding call elapsedTimeTextField.setText("00:00:00.002"). I also need to let the UI thread do more important jobs between those two calls.
Structuring code to do this has been tedious, and has resulted in a number of our controller classes creating threads that simply loop on code similar to:
Thread worker = new Thread(this::doUpdates);
worker.start();
//...
private void doUpdates() {
    while (true) {
        String computedTime = computeTimeToDisplay();
        runLaterOnUI(() -> textField.setText(computedTime));
        sleep(DUTY_CYCLE_DOWNTIME);
    }
}
While this does the job, it's unfavorable because:
It's difficult to unit test: from a testing environment you either have to modify this code to give some kind of signal when it completes its first pass, (typically a count-down-latch) or you have to do silly non-deterministic & arbitrary sleep()s
It doesn't have any kind of backoff: if the UI thread is flooded with jobs this code is going to exacerbate the problem. Some kind of requeueing scheme, whereby the downtime takes into account the latency of the job and some kind of hard-coded sleep is preferable since it means that if the UI job is flooded we're not asking it to do work unduly.
It doesn't have centralized exception handling short of the thread's default handler. This means that if an exception is raised in the computeTimeToDisplay() method (or, for that matter, in the runLaterOnUI call or the sleep() call), the text field will no longer be updated.
I have addressed each of these concerns reasonably well individually, but I don't have any obvious and reusable idiom for tackling these three problems.
I suspect that the Future, Task, Executor, ServiceExecutor, etc classes (the classes in the java.util.concurrent package that aren't a lock or a collection) can help me to this goal, but I'm not sure how to use them.
Can somebody suggest some documentation to read and some idioms to follow that will help me in pursuit of these goals? Is there an agreed on idiom --that doesn't involve anonymous classes and contains minimal boiler-plate-- for this kind of concurrent-job?
I recommend using a ScheduledThreadPoolExecutor with a core pool size of 1 and optionally with a thread priority of Thread.NORM_PRIORITY + 1 (use a ThreadFactoryBuilder to create a ThreadFactory with higher than standard priority) for the UI thread - this will let you schedule tasks such as the counter increment using ScheduledThreadPoolExecutor#scheduleAtFixedRate. Don't execute anything other than UI tasks on this executor - execute your CPU tasks on a separate ThreadPoolExecutor with standard priority; if you have e.g. 16 logical cores then create a ThreadPoolExecutor with 16 core threads to make full use of your computer when the UI thread is idle, and let the virtual machine take care of ensuring that the UI thread executes its jobs when it's supposed to.
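As a rough sketch of the scheduleAtFixedRate approach: the class ElapsedTicker and the sink parameter below are assumptions for illustration; in a JavaFX application the sink would typically wrap a call that hands the string to the UI thread.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical ticker: formats the elapsed time at a fixed rate and hands it to a sink.
final class ElapsedTicker {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long startNanos = System.nanoTime();

    // sink is an assumption: whatever finally displays the text
    ScheduledFuture<?> start(long periodMillis, Consumer<String> sink) {
        return scheduler.scheduleAtFixedRate(() -> {
            try {
                long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
                sink.accept(String.format("%02d:%02d:%02d.%03d",
                        ms / 3_600_000, (ms / 60_000) % 60, (ms / 1000) % 60, ms % 1000));
            } catch (RuntimeException e) {
                // An exception escaping the task would suppress all later runs,
                // so it is handled here, in one central place.
                System.err.println("update failed: " + e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Because the sink is injected, a unit test can pass a lambda that counts down a latch instead of sleeping for an arbitrary time.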
Your question is multi-faceted and I am not going to pretend that I understand all of it. This answer will address only one part of the question.
It doesn't have any kind of backoff: if the UI thread is flooded with jobs this code is going to exacerbate the problem. Some kind of requeueing scheme, whereby the downtime takes into account the latency of the job and some kind of hard-coded sleep is preferable since it means that if the UI job is flooded we're not asking it to do work unduly.
The in-built javafx.concurrent classes such as Task, Service and ScheduledService include facilities to send message updates from a non-UI thread to the UI thread in a way that does not flood the UI thread. You could use those classes directly (which would seem advisable, though perhaps that perception is naive of me as I don't fully understand your requirements). Or you can implement a similar custom facility in your code if you aren't using javafx.concurrent directly.
Here is the relevant code from the Task implementation:
/**
 * Used to send message updates in a thread-safe manner from the subclass
 * to the FX application thread. AtomicReference is used so as to coalesce
 * updates such that we don't flood the event queue.
 */
private AtomicReference<String> messageUpdate = new AtomicReference<>();
private final StringProperty message = new SimpleStringProperty(this, "message", "");
/**
 * Updates the <code>message</code> property. Calls to updateMessage
 * are coalesced and run later on the FX application thread, so calls
 * to updateMessage, even from the FX Application thread, may not
 * necessarily result in immediate updates to this property, and
 * intermediate message values may be coalesced to save on event
 * notifications.
 * <p>
 * <em>This method is safe to be called from any thread.</em>
 * </p>
 *
 * @param message the new message
 */
protected void updateMessage(String message) {
    if (isFxApplicationThread()) {
        this.message.set(message);
    } else {
        // As with the workDone, it might be that the background thread
        // will update this message quite frequently, and we need
        // to throttle the updates so as not to completely clobber
        // the event dispatching system.
        if (messageUpdate.getAndSet(message) == null) {
            runLater(new Runnable() {
                @Override public void run() {
                    final String message = messageUpdate.getAndSet(null);
                    Task.this.message.set(message);
                }
            });
        }
    }
}
The code works by ensuring that a runLater call is only made if the UI has processed (i.e. rendered) the last update.
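If you are not using Task, the same coalescing trick can be lifted into a small reusable class. The following is a sketch under assumptions, not JavaFX code: CoalescingUpdater is a hypothetical name, and the Executor parameter stands in for whatever schedules work on the UI thread.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical generalisation of the updateMessage pattern shown above.
final class CoalescingUpdater<T> {
    private final AtomicReference<T> pending = new AtomicReference<>();
    private final Executor uiExecutor;  // assumption: runs jobs on the UI thread
    private final Consumer<T> uiAction; // receives only the most recent value

    CoalescingUpdater(Executor uiExecutor, Consumer<T> uiAction) {
        this.uiExecutor = uiExecutor;
        this.uiAction = uiAction;
    }

    // Safe to call from any thread; value must not be null.
    void update(T value) {
        // Schedule a UI job only if none is outstanding; otherwise the value is
        // simply overwritten and the already-queued job will pick up the latest one.
        if (pending.getAndSet(value) == null) {
            uiExecutor.execute(() -> uiAction.accept(pending.getAndSet(null)));
        }
    }
}
```

However quickly a background thread calls update, at most one job is ever waiting on the UI side, which gives the backoff behaviour asked for in the question. Passing a hand-driven Executor makes the class straightforward to unit test.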
Internally the JavaFX 8 system runs on a pulse system. Unless there is an unusually long time consuming operation on the UI thread or general system slowdown, each pulse will usually occur 60 times a second, or approximately every 16-17 milliseconds.
You mention the following:
what I need is a job on the UI thread to call elapsedTimeTextField.setText("00:00:00.001") to happen 1ms before a corresponding call elapsedTimeTextField.setText("00:00:00.002").
However, you can see from the JavaFX architecture description that updating the text more than 60 times a second is pointless, as the additional updates will never be rendered. The sample code above from Task takes care of this by ensuring that a UI update request is only issued when the UI thread can actually reflect the new value in the UI.
Some General Advice
This is just advice, it does not directly solve your problem, take it for what you will, some of it might not even be particularly relevant to your situation or problem.
Make clear the problem you are trying to solve in your questions. That is sometimes more important than a description of the symptoms you are experiencing and trying to resolve. It also helps prevent XY questions.
Be clear from the start on what you are actually doing to solve the problem. An MCVE (minimal, complete, verifiable example) can sometimes help here.
For example, your initial problem statement does not state that you may have 10,000 controllers or provide code for what you term to be a controller. There is not much information on the expected length of time for tasks, what the UI display representing task progress and result is, why millisecond accuracy level might be important to display, if task results need to be coalesced, if the tasks can be split and run concurrently, how many threads you are using, etc.
Don't try to develop your own higher level concurrency tools from primitives like ConcurrentLinkedQueue.
For your backend segmented work jobs, use high level concurrency utilities from Java SE, such as Executors, ForkJoin and BlockingQueue.
Orchestrate and synchronize the output of backend jobs with your UI using JavaFX concurrency utilities such as Task.
Know that the high level concurrency utilities and JavaFX concurrency tools can be used in unison, like in this example. I.e., the choice of concurrency tools doesn't need to be an either/or situation.
Extensive use of immutable objects can be a lifesaver in concurrent development.
If you will be doing a lot of concurrent development, take time for detailed study of high quality resources on concurrent programming such as Java Concurrency in Practice.
Concurrency in general is often simply hard to get right.

How can I be notified when a thread (that I didn't start) ends?

I have a library in a Jar file that needs to keep track of how many threads use my library. When a new thread comes in, that's no problem: I add it to a list. But I need to remove the thread from the list when it dies.
This is in a Jar file so I have no control over when or how many threads come through. Since I didn't start the thread, I cannot force the app (that uses my Jar) to call a method in my Jar that says, "this thread is ending, remove it from your list". I'd REALLY rather not have to constantly run through all the threads in the list with Thread.isAlive().
By the way: this is a port of some C++ code which resides in a DLL and easily handles the DLL_THREAD_DETACH message. I'd like something similar in Java.
Edit:
The reason for keeping a list of threads is: we need to limit the number of threads that use our library - for business reasons. When a thread enters our library we check to see if it's in the list. If not, it's added. If it is in the list, we retrieve some thread-specific data. When the thread dies, we need to remove it from the list. Ideally, I'd like to be notified when it dies so I can remove it from the list. I can store the data in ThreadLocal, but that still doesn't help me get notification of when the thread dies.
Edit2:
Original first sentence was: "I have a library in a Jar file that needs to keep track of threads that use objects in the library."
Normally you would let the GC clean up resources. You can add a component to the thread which will be cleaned up when it is no longer accessible.
If you use a custom ThreadGroup, it will be notified when a thread is removed from the group. If you start the JAR using a thread in the group, it will also be part of the group. You can also change a thread's group via reflection so that it will be notified.
However, polling the threads every few second is likely to be simpler.
You can use a combination of ThreadLocal and WeakReference. Create some sort of "ticket" object and when a thread enters the library, create a new ticket and put it in the ThreadLocal. Also, create a WeakReference (with a ReferenceQueue) to the ticket instance and put it in a list inside your library. When the thread exits, the ticket will be garbage collected and your WeakReference will be queued. By polling the ReferenceQueue, you can essentially get "events" indicating when a thread exits.
Based on your edits, your real problem is not tracking when a thread dies, but instead limiting access to your library. Which is good, because there's no portable way to track when a thread dies (and certainly no way within the Java API).
I would approach this using a passive technique, rather than an active technique of trying to generate and respond to an event. You say that you're already creating thread-local data on entry to your library, which means that you already have the cutpoint to perform a passive check. I would implement a ThreadManager class that looks like the following (you could as easily make the methods/variables static):
public class MyThreadLocalData {
    // ...
}
public class TooManyThreadsException
extends RuntimeException {
    // ...
}
public class ThreadManager
{
    private static final int MAX_SIZE = 10;
    private final ConcurrentHashMap<Thread,MyThreadLocalData> threadTable = new ConcurrentHashMap<Thread,MyThreadLocalData>();
    private final Object tableLock = new Object();
    public MyThreadLocalData getThreadLocalData() {
        // fast path: this thread has been here before
        MyThreadLocalData data = threadTable.get(Thread.currentThread());
        if (data != null) return data;
        // slow path: check the limit, clean up if needed, and register atomically
        synchronized (tableLock) {
            if (threadTable.size() >= MAX_SIZE) {
                doCleanup();
            }
            if (threadTable.size() >= MAX_SIZE) {
                throw new TooManyThreadsException();
            }
            data = createThreadLocalData();
            threadTable.put(Thread.currentThread(), data);
            return data;
        }
    }
    // (the class continues with the methods shown below)
The thread-local data is maintained in threadTable. This is a ConcurrentHashMap, which means that it provides fast concurrent reads, as well as concurrent iteration (that will be important below). In the happy case, the thread has already been here, so we just return its thread-local data.
In the case where a new thread has called into the library, we need to create its thread-local data. If we have fewer threads than the limit, this proceeds quickly: we create the data, store it in the map, and return it (createThreadLocalData() could be replaced with a new, but I tend to like factory methods in code like this).
The sad case is where the table is already at its maximum size when a new thread enters. Because we have no way to know when a thread is done, I chose to simply leave the dead threads in the table until we need space -- just like the JVM and memory management. If we need space, we execute doCleanup() to purge the dead threads (garbage). If there still isn't enough space once we've cleared dead threads, we throw (we could also implement waiting, but that would increase complexity and is generally a bad idea for a library).
Synchronization is important. If we have two new threads come through at the same time, we need to block one while the other tries to get added to the table. The critical section must include the entirety of checking, optionally cleaning up, and adding the new item. If you don't make that entire operation atomic, you risk exceeding your limit. Note, however, that the initial get() does not need to be in the atomic section, so we don't need to synchronize the entire method.
OK, on to doCleanup(): this simply iterates the map and looks for threads that are no longer alive. If it finds one, it calls the destructor ("anti-factory") for its thread-local data:
private void doCleanup() {
    for (Thread thread : threadTable.keySet()) {
        if (! thread.isAlive()) {
            MyThreadLocalData data = threadTable.remove(thread);
            if (data != null) {
                destroyThreadLocalData(data);
            }
        }
    }
}
Even though this function is called from within a synchronized block, it's written as if it could be called concurrently. One of the nice features of ConcurrentHashMap is that its iterators can be used while other threads modify the map: they are weakly consistent and never throw ConcurrentModificationException. However, that means that two threads might check the same map entry, and we don't want to call the destructor twice. So we use remove() to get the entry, and if it returns null we know that the entry has already been (or is being) cleaned up by another thread.
As it turns out, you might want to call the method concurrently. Personally, I think the "clean up when necessary" approach is simplest, but your thread-local data might be expensive to hold if it's not going to be used. If that's the case, create a Timer that will repeatedly call doCleanup():
public Timer scheduleCleanup(long interval) {
    TimerTask task = new TimerTask() {
        @Override
        public void run() {
            doCleanup();
        }
    };
    Timer timer = new Timer(getClass().getName(), true);
    timer.scheduleAtFixedRate(task, 0L, interval);
    return timer;
}

Why Thread.sleep is bad to use

Apologies for this repeated question, but I haven't found any satisfactory answers yet. Most of the existing questions had their own specific use cases:
Java - alternative to thread.sleep
Is there any better or alternative way to skip/avoid using Thread.sleep(1000) in Java?
My question is for the very generic use case. Wait for a condition to complete. Do some operation. Check for a condition. If the condition is not true, wait for some time and again do the same operation.
For example, consider a method that creates a DynamoDB table by calling its CreateTable API. A DynamoDB table takes some time to become active, so the method calls the DescribeTable API to poll for the status at regular intervals, up to some maximum time (let's say 5 minutes; deviation due to thread scheduling is acceptable). It returns if the table becomes active within 5 minutes; otherwise it throws an exception.
Here is pseudo code:
public void createDynamoDBTable(String name) {
    //call create table API to initiate table creation
    //wait for table to become active
    long endTime = System.currentTimeMillis() + MAX_WAIT_TIME_FOR_TABLE_CREATE;
    while (System.currentTimeMillis() < endTime) {
        boolean status = //call DescribeTable API to get status;
        if (status) {
            //status is now true, return
            return;
        } else {
            try {
                Thread.sleep(10*1000);
            } catch (InterruptedException e) {
                // restore the interrupt flag and stop waiting instead of swallowing it
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted while waiting for table", e);
            }
        }
    }
    throw new RuntimeException("Table still not created");
}
I understand that Thread.sleep blocks the current thread, thereby consuming resources. But in a fairly mid-size application, is one thread a big concern?
I read somewhere that I should use a ScheduledThreadPoolExecutor and do the status polling there. But again, we would have to initialize this pool with at least one thread, on which the runnable that does the polling would run.
Any suggestions on why using Thread.sleep is said to be such a bad idea, and what are the alternative options for achieving the same as above?
http://msmvps.com/blogs/peterritchie/archive/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program.aspx
It's fine to use Thread.sleep in that situation. People discourage Thread.sleep because it's frequently used in a misguided attempt to fix a race condition, or where notification-based synchronization would be a much better choice.
In this case, AFAIK you have no option but to poll, because the API doesn't provide you with notifications. I can also see it's an infrequent operation, because presumably you are not going to create thousands of tables.
Therefore, I find it fine to use Thread.sleep here. As you said, spawning a separate thread when you are going to block the current thread anyways seems to complicate things without merit.
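If the same "check, sleep, re-check" loop shows up in several places, it can be pulled into one small helper. This is a sketch only; Poller and pollUntil are hypothetical names, and the condition would wrap something like the DescribeTable call from the question.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for the "check, sleep, re-check" loop.
final class Poller {
    private Poller() {}

    /**
     * Polls the condition until it returns true or timeoutMillis elapses.
     * Returns true if the condition was met, false on timeout.
     */
    static boolean pollUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline; an interrupt propagates to the caller.
            Thread.sleep(Math.min(intervalMillis, Math.max(1, remainingNanos / 1_000_000L)));
        }
    }
}
```

Letting InterruptedException propagate, rather than catching and ignoring it, keeps the waiting thread cancellable.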
Yes, one should try to avoid Thread.sleep(x), but it shouldn't be forgotten entirely:
Why it should be avoided:
It doesn't release any locks the thread holds.
It doesn't guarantee that execution resumes exactly when the sleep time elapses (so, in rare cases, it may keep waiting far longer than requested).
If we mistakenly put a foreground processing thread to sleep, we won't be able to close the application until the x milliseconds have passed.
We now have a full-featured concurrency package with tools for specific problems (somewhat like design patterns, though of course not exactly), so why use Thread.sleep(x)?
Where to use Thread.sleep(x):
For providing delays in background threads.
And a few others.

Best way to wait on tasks to complete before adding new ones to threadpool in Java?

I want to use something like a ThreadPoolExecutor to manage running a bunch of tasks on available threads. These tasks are all of the same type but deal with different accounts. New tasks for these accounts can be added at regular intervals and I want it to check and not allow the new tasks to start until the old tasks for the same account have already completed. What's the best way to do this?
EXAMPLE
Task for account "234" is started (via ThreadPoolExecutor.execute())
Task for account "238" is started (via ThreadPoolExecutor.execute())
New Task for account "234" created but not added to execute because first "234" task not complete (best way to check this?)
Task for account "238" completes
New Task for account "238" starts (via ThreadPoolExecutor.execute()) because none currently running for that account
What's the best way to do this? Simply have it check with a wait/sleep() for some check variable in the Runnable for "234"'s first task to finish? Or is there a better solution?
I have no doubt some one with more experience with this part of the API will have a better idea, but here are my thoughts on the subject...
Basically, I'd start with a "running" and a "waiting" queue. The "running" queue keeps track of what's currently running; the "waiting" queue keeps track of the tasks that you are holding back. These queues will need to be keyed to some kind of "group identifier" (for example, your account number) to make lookups easier (i.e. Map<String, List<Runnable>>).
I'd look at overriding the execute method. In here I'd compare the incoming task against the running queue to determine if any related tasks are currently running. If there is, I'd drop the new task into a wait queue.
I'd then override the beforeExecute method. Here I would register the task in the "running" queue.
I'd override the 'afterExecute' method. Here I would remove the completed task from "running" queue, look up the queue of waiting tasks (via the group identifier of the completed tasks) and add the first task in the queue into the executor via the execute method
Or you could do as Louis suggests :P
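The running/waiting bookkeeping described above can also be sketched as a thin wrapper around any Executor, instead of overriding execute, beforeExecute and afterExecute. This is a swapped-in variant of the same idea, and KeyedSerialExecutor is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;

// Hypothetical wrapper: tasks with the same key run one at a time, in submission
// order; tasks with different keys run in parallel on the shared pool.
final class KeyedSerialExecutor {
    private final Executor pool;
    // A key is present in this map exactly while one of its tasks is running;
    // the deque holds that key's waiting tasks.
    private final Map<String, Deque<Runnable>> waiting = new HashMap<>();

    KeyedSerialExecutor(Executor pool) {
        this.pool = pool;
    }

    synchronized void execute(String key, Runnable task) {
        Deque<Runnable> queue = waiting.get(key);
        if (queue != null) {
            queue.add(task); // a task for this key is running: hold this one back
        } else {
            waiting.put(key, new ArrayDeque<>());
            pool.execute(wrap(key, task));
        }
    }

    private Runnable wrap(String key, Runnable task) {
        return () -> {
            try {
                task.run();
            } finally {
                runNext(key); // plays the role of afterExecute
            }
        };
    }

    private synchronized void runNext(String key) {
        Runnable next = waiting.get(key).poll();
        if (next == null) {
            waiting.remove(key); // nothing waiting: the key is idle again
        } else {
            pool.execute(wrap(key, next));
        }
    }
}
```

No task ever blocks or sleeps while waiting for its account to become free, so pool threads stay available for other accounts.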
One simple possibility, perhaps overly simple:
Create 10 single-thread executors (Executors.newSingleThreadExecutor()).
For each task, "hash" the accountID by taking accountID mod 10 to find the appropriate executor. (In practice, accountID may not be an int; e.g. if it's a String, take its hashCode() mod 10.)
Submit the task to that executor.
This may not be ideal, as processing of account 238 will be forced to wait until 358 is complete, but at least you are sure that two tasks for a specific account, say 234, will never be running at the same time. It depends on how much latency you can allow. Obviously, you could play with the number of executors and the simplistic "hashing" algorithm I described.
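A minimal sketch of that striping scheme; StripedExecutor and stripeFor are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical striped executor: each account always maps to the same single worker thread.
final class StripedExecutor {
    private final ExecutorService[] stripes;

    StripedExecutor(int stripeCount) {
        stripes = new ExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    int stripeFor(Object accountId) {
        return Math.floorMod(accountId.hashCode(), stripes.length);
    }

    void execute(Object accountId, Runnable task) {
        stripes[stripeFor(accountId)].execute(task);
    }

    void shutdown() {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
    }
}
```

Math.floorMod is used rather than the % operator because String.hashCode() can be negative, and a negative remainder would be an invalid array index.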
I faced the same issue. My solution was to use a HashSet.
private static final HashSet<Integer> runningTasks = new HashSet<Integer>();
public void run() {
    boolean isAlreadyRunning = false;
    synchronized (runningTasks) {
        if (runningTasks.contains(this.accountId)) {
            isAlreadyRunning = true;
        } else {
            runningTasks.add(this.accountId);
        }
    }
    if (isAlreadyRunning) {
        //schedule this task to run later here
        //what I did was to reinsert this task into the task queue 5 seconds later
        return;
    }
    try {
        //do your stuff here
    } finally {
        //always release the account, even if the work above throws
        synchronized (runningTasks) {
            runningTasks.remove(this.accountId);
        }
    }
}

Cancelling a thread safely due to timeout

I have a queue of tasks that need to be performed, and a pool of workers that pick up the tasks and perform them. There's also a "manager" class that keeps track of the worker, allows the user to stop or restart them, reports on their progress, etc. Each worker does something like this:
public void doWork() {
    checkArguments();
    performCalculation();
    saveResultsToDatabase();
    performAnotherCalculation();
    saveResultsToDatabase();
    performYetAnotherCalculation();
    saveResultsToDatabase();
}
In this case, "database" does not necessarily refer to an Oracle database. That's certainly one of the options, but the results could also be saved on disk, in Amazon SimpleDB, etc.
So far, so good. However, sometimes the performCalculation() code locks up intermittently, due to a variety of factors, but mostly due to a poor implementation of networking code in a bunch of third-party libraries (e.g. Socket.read() never returns). This is bad, obviously, because the task is now stuck forever, and the worker is now dead.
What I'd like to do is wrap that entire doWork() method in some sort of a timeout, and, if the timeout expires, give the task to someone else.
How can I do that, though? Let's say the original worker is stuck in the "performCalculation()" method. I then give the task to some other worker, who completes it, and then the original worker decides to wake up and save its intermediate results to the database... thus corrupting perfectly valid data. Is there some general pattern I can use to avoid this?
I can see a couple of solutions, but most of them will require some serious refactoring of all the business-logic code, from the ground up... which is probably the right thing to do philosophically, but is simply not something I have time for.
Have you tried using a Future? They are useful for running a task and waiting for it to complete, using a timeout etc. For example:
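private Runnable performCalc = new Runnable() {
    public void run() {
        performCalculation();
    }
};
public void doWork() {
    ExecutorService executor = Executors.newFixedThreadPool(1);
    try {
        executor.submit(performCalc).get(); // Timeouts can be used here.
        executor.submit(anotherCalc).get(); // anotherCalc: a second Runnable, defined like performCalc
    } catch (InterruptedException e) {
        // Asked to stop. Roll back our transactions.
        Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
        // The calculation itself threw; the cause is available via e.getCause().
    } finally {
        executor.shutdownNow();
    }
}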
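To make the timeout part concrete, here is a sketch that uses Future.get with a timeout and cancels the step when the deadline passes. TimedStep and runWithTimeout are hypothetical names, and the cached thread pool is an assumption chosen so that a stuck step does not block the steps submitted after it.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a step with a deadline and reports whether it finished.
final class TimedStep {
    private final ExecutorService executor = Executors.newCachedThreadPool();

    /** Returns true if the step completed within timeoutMillis, false on timeout. */
    boolean runWithTimeout(Runnable step, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        Future<?> future = executor.submit(step);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker thread. Code blocked in classic
            // socket I/O may ignore the interrupt, so the thread can still linger.
            future.cancel(true);
            return false;
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

A false return only tells the caller that the deadline passed. It does not prove the original step has stopped, so the late-write problem from the question still needs a guard on the saving side.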
If performCalculation is stuck on blocking I/O, there is little you can do to interrupt it. One solution is to close the underlying socket, or to set a timeout on socket operations using Socket.setSoTimeout, but you have to own the code that reads from the socket to do that.
Otherwise, you can add some reconciliation mechanism before saving the data into the database. Use some kind of timestamp to detect whether the data in the database is newer than the data the original worker fetched from the network.
I suppose the easiest thing to do would be to have a separate timer thread, started when the thread with performCalculation() starts. The timer thread can wake up after a period of time and Thread.interrupt() the calculation thread, which can then perform any necessary rollback when handling the InterruptedException.
Granted, this is bolting on additional complexity to manage other problems, and consequently isn't the most elegant solution.
