Dynamic Scheduled Concurrent Task Execution in Java

I'm trying to implement an application that schedules tasks based on user input. Users can enter a number of IPs, each with an associated telnet command (a one-to-one relationship), a frequency of execution, and two groups (cluster, objectClass).
The user should be able to add/remove IPs, Clusters, commands, etc, at runtime. They should also be able to interrupt the executions.
This application should be able to send the telnet commands to the IPs, wait for a response and save the response in a database based on the frequency. The problem I'm having is trying to make all of this multithreaded, because there are at least 60,000 IPs to telnet, and doing it in a single thread would take too much time. One thread should process a group of IPs in the same cluster with the same objectClass.
I've looked at Quartz to schedule the jobs. With Quartz I tried to make a dynamic job that took a list of IPs (with commands), processed them and saved the result to database. But then I ran into the problem of the different timers that users gave. The examples on the Quartz webpage are incomplete and don't go too much into detail.
Then I tried to do it the old-fashioned way, using Java Threads, but I need exception handling and parameter passing, and Threads don't provide that. Then I discovered Callables and Executors, but I can't schedule tasks with Callables.
So now I'm stumped. What do I do?

OK, here are some ideas. Take with the requisite grain of salt.
First, create a list of all of the work that you need to do. I assume you have this in tables somewhere and you can make a join that looks like this:
cluster | objectClass | ip-address | command | frequency | last-run-time
This represents all of the work your system needs to do. For the sake of explanation, I'll say frequency can take the form of "1 per day", "1 per hour", "4 per hour", "every minute". This table has one row per (cluster, objectClass, ip-address, command). Assume a different table has a history of runs, with error messages and other things.
Now what you need to do is read that table, and schedule the work. For scheduling use one of these:
ScheduledExecutorService exec = Executors...
When you schedule something, you need to tell it how often to run (easy enough with the frequencies we've given), and an initial delay. If something is to run every minute and it last ran 4 min 30 seconds ago, it is overdue, so the initial delay is zero. If something is to run each hour, then the initial delay is 60 min - 4.5 min = 55.5 min.
ScheduledFuture<?> handle = exec.scheduleAtFixedRate(...);
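The initial-delay arithmetic above can be sketched as follows. This is only an illustration: the class name InitialDelay and the use of java.time types are my assumptions, not something the question prescribes.

```java
import java.time.Duration;
import java.time.Instant;

class InitialDelay {
    /**
     * Time to wait before the first run: zero if the task is already overdue,
     * otherwise whatever is left of the period since the last run.
     */
    static Duration compute(Duration period, Instant lastRun, Instant now) {
        Duration sinceLastRun = Duration.between(lastRun, now);
        Duration remaining = period.minus(sinceLastRun);
        return remaining.isNegative() ? Duration.ZERO : remaining;
    }
}
```

The result can then be passed on, for example as exec.scheduleAtFixedRate(task, delay.toMillis(), period.toMillis(), TimeUnit.MILLISECONDS).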
More complex types of scheduling are why things like Quartz exist, but basically you just need a way to resolve, given (schedule, last-run), the elapsed time until the next execution. If you can do that, then instead of scheduleAtFixedRate(...) you can use schedule(...) and then schedule the next run of a task as that task completes.
Anyway, when you schedule something, you'll get a handle back to it
ScheduledFuture<?> handle = exec.scheduleAtFixedRate(...);
Hold this handle in something that's accessible. For the sake of argument let's say it's a map by TaskKey. TaskKey is (cluster | objectClass | ip-address | command) together as an object.
Map<TaskKey,ScheduledFuture<?>> tasks = ...;
You can use that handle to cancel and schedule new jobs.
void cancelForCustomer(CustomerId id) {
    List<TaskKey> keys = db.findAllTasksOwnedByCustomer(id);
    for (TaskKey key : keys) {
        ScheduledFuture<?> f = tasks.get(key);
        // Future.cancel takes a flag: false lets a run already in progress finish
        if (f != null) f.cancel(false);
    }
}
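To make the handle bookkeeping concrete, here is a small sketch of a registry that schedules and cancels by key. TaskKey's fields follow the four columns named above; the TaskRegistry class, its method names and the pool size of 4 are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical key type: the four columns that identify one unit of work. */
final class TaskKey {
    final String cluster, objectClass, ipAddress, command;

    TaskKey(String cluster, String objectClass, String ipAddress, String command) {
        this.cluster = cluster;
        this.objectClass = objectClass;
        this.ipAddress = ipAddress;
        this.command = command;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof TaskKey)) return false;
        TaskKey k = (TaskKey) o;
        return cluster.equals(k.cluster) && objectClass.equals(k.objectClass)
                && ipAddress.equals(k.ipAddress) && command.equals(k.command);
    }

    @Override public int hashCode() {
        return Objects.hash(cluster, objectClass, ipAddress, command);
    }
}

class TaskRegistry {
    private final ScheduledExecutorService exec = Executors.newScheduledThreadPool(4);
    private final Map<TaskKey, ScheduledFuture<?>> tasks = new ConcurrentHashMap<>();

    /** Schedules the periodic task for key, replacing any task already registered. */
    void schedule(TaskKey key, Runnable work, long initialDelaySec, long periodSec) {
        ScheduledFuture<?> handle =
                exec.scheduleAtFixedRate(work, initialDelaySec, periodSec, TimeUnit.SECONDS);
        ScheduledFuture<?> previous = tasks.put(key, handle);
        if (previous != null) previous.cancel(false);
    }

    /** Cancels the task for key; returns true if one was registered. */
    boolean cancel(TaskKey key) {
        ScheduledFuture<?> handle = tasks.remove(key);
        if (handle == null) return false;
        handle.cancel(false);
        return true;
    }

    int size() { return tasks.size(); }

    void shutdown() { exec.shutdownNow(); }
}
```

Because TaskKey implements equals and hashCode, a key rebuilt from a database row finds the handle that was stored under an equal key earlier.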
For parameter passing, create an object to represent your work. Create one of these with all the parameters you need.
class HostCheck implements Runnable {
private final Address host;
private final String command;
private final int something;
public HostCheck(Address host, String command, int something) {
this.host = host; this.command = command; this.something = something;
}
....
}
For exception handling, localize that all into your object
class HostCheck implements Runnable {
...
public void run() {
try {
check();
scheduleNextRun(); // optionally, if fixed-rate doesn't work
} catch( Exception e ) {
db.markFailure(task); // or however.
// Point is tell somebody about the failure.
// You can use this to decide to stop scheduling checks for the host
// or whatever, but just record the info now and use it to influence
// future behavior in, er, the future.
}
}
}
OK, so up to this point I think we're in pretty good shape. Lots of detail to fill in but it feels manageable. Now we get to some complexity, and that's the requirement that the work for each "cluster/objectClass" pair is executed serially.
There are a couple of ways to handle this.
If the number of unique pairs is low, you can just make Map<ClusterObjectClassPair,ScheduledExecutorService>, making sure to create single-threaded executor services (e.g., Executors.newSingleThreadScheduledExecutor()). So instead of a single scheduling service (exec, above), you have a bunch. Simple enough.
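A rough sketch of that map, assuming the pair can be identified by a "cluster|objectClass" string (a dedicated ClusterObjectClassPair class with equals/hashCode would work just as well). GroupExecutors and its method names are made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

class GroupExecutors {
    // One single-threaded scheduler per cluster/objectClass pair,
    // so all work for a given pair runs serially on that pair's thread.
    private final Map<String, ScheduledExecutorService> byGroup = new ConcurrentHashMap<>();

    ScheduledExecutorService executorFor(String cluster, String objectClass) {
        // computeIfAbsent creates the executor at most once per key, even under contention
        return byGroup.computeIfAbsent(cluster + "|" + objectClass,
                k -> Executors.newSingleThreadScheduledExecutor());
    }

    void shutdownAll() {
        byGroup.values().forEach(ScheduledExecutorService::shutdownNow);
    }
}
```

Each HostCheck would then be scheduled on executorFor(cluster, objectClass) instead of on the shared exec.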
If you need to control the amount of work you attempt concurrently, then you can have each HostCheck acquire a permit before execution. Have some global permit object
public static final Semaphore permits = new java.util.concurrent.Semaphore(30);
And then
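class HostCheck implements Runnable {
    ...
    public void run() {
        try {
            permits.acquire(); // blocks until one of the permits is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and skip this run
            return;
        }
        try {
            check();
            scheduleNextRun();
        } catch (Exception e) {
            // regular handling
        } finally {
            permits.release();
        }
    }
}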
You only have one thread per ClusterObjectClassPair, which serializes that work, and then permits just limit how many ClusterObjectClassPair you can talk to at a time.
I guess this turned into quite a long answer. Good luck.

Related

Concurrent and scalable data structure in Java to handle tasks?

For my current development I have many threads (Producers) that create Tasks and many threads that consume these Tasks (Consumers).
Each Producer is identified by a unique name. A Task is made of:
the name of its Producers
a name
data
My question concerns the data structure used by the (Producers) and the (consumers).
Concurrent Queue?
Naively, we could imagine that Producers populate a concurrent-queue with Tasks and (consumers) reads/consumes the Tasks stored in the concurrent-queue.
I think that this solution would scale rather well, but one case is problematic: if a Producer very quickly creates two Tasks having the same name but not the same data (tasks T1 and T2 have the same name, but T1 has data D1 and T2 has data D2), it is theoretically possible that they are consumed in the order T2 then T1!
Task Map + Queue?
Now, I imagine creating my own data structure (let's say MyQueue) based on Map + Queue. Like a queue, it would have a pop() and a push() method.
The pop() method would be quite simple
The push() method would:
Check if an existing Task is not yet inserted in MyQueue (doing find() in the Map)
if found: data stored in the Task to-be-inserted would be merged with data stored in the found Task
if not found: the Task would be inserted in the Map and an entry would be added in the Queue
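As a rough sketch of the push()/pop() behaviour listed above (not a scalable design): here a LinkedHashMap stands in for the separate Map and Queue, since it offers lookup by name and keeps insertion order. The class name MergingQueue and the merger function are assumptions for illustration, and a single lock guards every operation.

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

class MergingQueue<D> {
    // Insertion-ordered map: doubles as the lookup table and the FIFO queue.
    private final LinkedHashMap<String, D> pending = new LinkedHashMap<>();
    private final BinaryOperator<D> merger;

    MergingQueue(BinaryOperator<D> merger) {
        this.merger = merger;
    }

    /** Inserts the task, or merges its data into a pending task with the same name. */
    synchronized void push(String name, D data) {
        // merge() keeps the original queue position of an existing entry
        pending.merge(name, data, merger);
    }

    /** Removes and returns the oldest pending task, or null if nothing is pending. */
    synchronized Map.Entry<String, D> pop() {
        Iterator<Map.Entry<String, D>> it = pending.entrySet().iterator();
        if (!it.hasNext()) return null;
        // Copy before removing: the live entry is backed by the map
        Map.Entry<String, D> head = new AbstractMap.SimpleImmutableEntry<>(it.next());
        it.remove();
        return head;
    }
}
```

With string data and a concatenating merger, pushing ("T", "D1") and then ("T", "D2") leaves one pending task T carrying both pieces of data, so T2 can no longer overtake T1.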
Of course, I'll have to make it safe for concurrent access... and that will certainly be my problem; I am almost sure that this solution won't scale.
So What?
So my question is: what is the best data structure to use in order to fulfill my requirements?
You could try Heinz Kabutz's Striped Executor Service as a possible candidate.
This magical thread pool would ensure that all Runnables with the same stripeClass would be executed in the order they were submitted, but StripedRunners with different stripedClasses could still execute independently.
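To illustrate the idea (this is not Heinz Kabutz's implementation, just a minimal sketch of per-key ordering), tasks that share a key can be chained on a CompletableFuture while different keys run independently on a shared pool. KeyedSerialExecutor is a name invented here, and the sketch never removes finished chains from the map, which a real version would need to do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

class KeyedSerialExecutor {
    private final Executor pool;
    // Tail of the task chain for each key; the next task for that key is appended to it.
    private final ConcurrentHashMap<String, CompletableFuture<Void>> tails =
            new ConcurrentHashMap<>();

    KeyedSerialExecutor(Executor pool) {
        this.pool = pool;
    }

    /** Runs task on the pool after every earlier task submitted with the same key. */
    CompletableFuture<Void> submit(String key, Runnable task) {
        // compute() is atomic per key, so the chain order equals the submission order
        return tails.compute(key, (k, tail) -> {
            CompletableFuture<Void> prev =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            // handleAsync fires whether prev succeeded or failed,
            // so one failing task does not stall the rest of its key
            return prev.<Void>handleAsync((result, error) -> {
                task.run();
                return null;
            }, pool);
        });
    }
}
```

In the question's terms, the key would be the task name, so T1 always completes before T2 starts even when both are handed to a multi-threaded pool.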
Instead of making a data structure safe for concurrent access, why not opt out of concurrency and go for parallelism?
Functional programming models such as MapReduce are a very scalable way to solve this kind of problem.
I understand that D1 and D2 can be analyzed either together or in isolation, and the only constraint is that they shouldn't be analyzed in the wrong order (I am making some assumptions here). But in case the real problem is only the way the results are combined, there might be an easy solution.
You could remove the constraint altogether, allowing them to be analyzed separately, and then have a reduce function that is able to recombine them in a sensible way.
In this case you'd have the first step as map and the second as reduce.
Even if the computation is more efficient if done in a single go, a big part of scaling, especially scaling out is accomplished by denormalization.
If consumers are running in parallel, I doubt there is a way to make them execute tasks with the same name sequentially.
In your example (from comments):
BlockingQueue can really be a problem (unfortunately) if a Producer
"P1" adds a first task "T" with data D1 and quickly a second task "T"
with data D2. In this case, the first task can be handled by a thread
and the second task by another thread; If the threads handling the
first task is interrupted, the thread handling the second one can
complete first
It makes no difference if P1 submits D2 less quickly. Consumer 1 could still be too slow, so consumer 2 would be able to finish first. Here is an example of such a scenario:
P1: submit D1
C1: read D1
P1: submit D2
C2: read D2
C2: process D2
C1: process D1
To solve it, you will have to introduce some kind of completion detection, which I believe will overcomplicate things.
If you have enough load and can process some tasks with different names not sequentially, then you can use a queue per consumer and put same named tasks to the same queue.
public class ParallelQueue {
private final BlockingQueue<Task>[] queues;
private final int consumersCount;
public ParallelQueue(int consumersCount) {
this.consumersCount = consumersCount;
queues = new BlockingQueue[consumersCount];
for (int i = 0; i < consumersCount; i++) {
queues[i] = new LinkedBlockingQueue<>();
}
}
public void push(Task<?> task) {
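// hashCode() may be negative; floorMod always yields an index in [0, consumersCount)
int index = Math.floorMod(task.name.hashCode(), consumersCount);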
queues[index].add(task);
}
public Task<?> pop(int consumerId) throws InterruptedException {
int index = consumerId % consumersCount;
return queues[index].take();
}
private final static class Task<T> {
private final String name;
private final T data;
private Task(String name, T data) {
this.name = name;
this.data = data;
}
}
}

Processing sub-streams of a stream in Java using executors

I have a program that processes a huge stream (not in the sense of java.util.stream, but rather InputStream) of data coming in through the network. The stream consists of objects, each having a sort of sub-stream identifier. Right now the whole processing is done in a single thread, but it takes a lot of CPU time and each sub-stream can easily be processed independently, so I'm thinking of multi-threading it.
However, each sub-stream requires to keep a lot of bulky state, including various buffers, hash maps and such. There is no particular reason to make it concurrent or synchronized since sub-streams are independent of each other. Moreover, each sub-stream requires that its objects are processed in the order they arrive, which means that probably there should be a single thread for each sub-stream (but possibly one thread processing multiple sub-streams).
I'm thinking of several approaches to this, but they are not quite elegant.
1. Create a single ThreadPoolExecutor for all tasks. Each task will contain the next object to process and the reference to a Processor instance which keeps all the state. That would ensure the necessary happens-before relationship, thus ensuring that the processing thread will see the up-to-date state for this sub-stream. This approach has no way to make sure that the next object of the same sub-stream will be processed in the same thread, as far as I can see. Moreover, it needs some guarantee that objects will be processed in the order they come in, which will require additional synchronization of the Processor objects, introducing unnecessary delays.
2. Create multiple single-thread executors manually and a sort of hash-map that maps sub-stream identifiers to executors. This approach requires manual management of executors, creating them or shutting them down as new sub-streams begin or end, and distributing the tasks between them accordingly.
3. Create a custom executor that processes a special subclass of tasks, each having a sub-stream ID. This executor would use it as a hint to use the same thread for executing this task as the previous one with the same ID. However, I don't see an easy way to implement such an executor. Unfortunately, it doesn't seem possible to extend any of the existing executor classes, and implementing an executor from scratch is kind of overkill.
4. Create a single ThreadPoolExecutor, but instead of creating a task for each incoming object, create a single long-running task for each sub-stream that would block on a concurrent queue, waiting for the next object. Then put objects in queues according to their sub-stream IDs. This approach needs as many threads as there are sub-streams because the tasks will be blocked. The expected number of sub-streams is about 30-60, so that may be acceptable.
5. Alternatively, proceed as in 4, but limit the number of threads, assigning multiple sub-streams to a single task. This is sort of a hybrid between 2 and 4. As far as I can see, this is the best approach of these, but it still requires some sort of manual sub-stream distribution between tasks and some way to shut the extra tasks down as sub-streams end.
What would be the best way to ensure that each sub-stream is processed in its own thread without a lot of error-prone code? So that the following pseudo-code will work:
// loop {
Item next = stream.read();
int id = next.getSubstreamID();
Processor processor = getProcessor(id);
SubstreamTask task = new SubstreamTask(processor, next, id);
executor.submit(task); // This makes sure that the task will
// be executed in the same thread as the
// previous task with the same ID.
// } // loop
I suggest having an array of single threaded executors. If you can devise a consistent hashing strategy for sub-streams, you can map sub-streams to individual threads. e.g.
final ExecutorService[] es = ...
public void submit(int id, Runnable run) {
es[(id & 0x7FFFFFFF) % es.length].submit(run);
}
The key could be a String or a long, as long as there is some way to identify the sub-stream. If you know a particular sub-stream is very expensive, you could assign it a dedicated thread.
The solution I finally chose looks like this:
private final Executor[] streamThreads
= new Executor[Runtime.getRuntime().availableProcessors()];
{
for (int i = 0; i < streamThreads.length; ++i) {
streamThreads[i] = Executors.newSingleThreadExecutor();
}
}
private final ConcurrentHashMap<SubstreamId, Integer>
threadById = new ConcurrentHashMap<>();
This code determines which executor to use:
Message msg = in.readNext();
SubstreamId msgSubstream = msg.getSubstreamId();
int exe = threadById.computeIfAbsent(msgSubstream,
id -> findBestExecutor());
streamThreads[exe].execute(() -> {
// processing goes here
});
And the findBestExecutor() function is this:
private int findBestExecutor() {
// Thread index -> substream count mapping:
final int[] loads = new int[streamThreads.length];
for (int thread : threadById.values()) {
++loads[thread];
}
// return the index of the minimum load
return IntStream.range(0, streamThreads.length)
.reduce((i, j) -> loads[i] <= loads[j] ? i : j)
.orElse(0);
}
This is, of course, not very efficient, but note that this function is only called when a new sub-stream shows up (which happens several times every few hours, so it's not a big deal in my case). My real code looks a bit more complicated because I have a way to determine whether two sub-streams are likely to finish simultaneously, and if they are, I try to assign them to different threads in order to maintain even load after they do finish. But since I never mentioned this detail in the question, I guess it doesn't belong to the answer either.

Picky host (lock?)

I believe my problem can be considered regardless of used language but, to have some 'anchor', I'll describe it using the Java language.
Let's consider the following scenario:
I have a class PickyHost extending Thread and an instance of it, pickyHostInst running.
That class might look like this:
class PickyHost extends Thread {
private ArrayList<Guest> guests;
public void enter(Guest g) {
// deal with g
}
private void pickGuests() {
// ...
}
public void run() {
// listen indefinitely
}
}
Moreover, in the background, I have many Guest instances running (they also extend Thread class) and once in a while, some guest wants to invoke enter method on pickyHostInst with an argument g being itself.
Now, I want PickyHost to be picky in the following sense:
Immediately after someone invokes enter method, it puts g at the end of guests list and forces g to wait for notification. Also (I think here lies the crux of the matter) it goes itself for a 5 seconds sleep and somehow allows (during these 5 seconds) other guests to invoke enter method (if so happens, then it forgets about how long it had to sleep and resets its alarm clock to sleep exactly 5 seconds again) - I'll call it a sensitive sleep.
As you can see, the total amount of time pickyHostInst sleeps can be huge if many guests arrive - like: A arrives, then after 4 seconds B arrives, then after another 4 seconds C arrives and so on. However, suppose there's been created a chain A, B, ..., G of guests and from the moment of arrival of G till 5 seconds later, no-one arrived.
Then I want pickyHostInst to invoke pickGuests method which, using some algorithm, determines a subset S of {A, B, ..., G} of guests to notify that they can stop waiting and carry on doing what they normally do and moreover removes elements of S from guests list. Method pickGuests can take some time to accomplish and in the meantime some guest H might have arrived and invoked enter - then enter should proceed normally but pickGuests should ignore H and to the end of its last invocation deal with {A, B, ..., G} - not with {A, B, ..., G, H}.
After finishing pickGuests, pickyHostInst should (here I have 2 ideas - implementing any of them will make me happy :))
either
fall again into 5 seconds of sensitive sleep after which, if no guest after H arrived, invoke pickGuests again, or
simultaneously serves guests via enter method as usual but invokes pickGuests only after
max("a moment when last guest from S (from the last invocation) notifies pickyHostInst (like: the last "Thank you, Mr Host" from among S)", "a moment 5 seconds after the last (newest) guest invoked enter").
Finally, after a long introduction, my question - which tools do I need to accomplish such task? I'm unfortunately a bit lost among the richness of various locks and multithreading/locking mechanisms and can't discern which one fits to my problem (or which ones, combined somehow).
I'll greatly appreciate some code-sketches that would put me on the right track.
You can use a java.util.Timer object, which can be reset in the enter method. The timer task will run in its own thread and do the picking for you if it is not canceled beforehand.
Note that the enter method will be running on one of the many Guest threads. This means that it should probably be synchronized. The easiest way to do this is to add the synchronized keyword to the method declaration in Java: public synchronized void enter(Guest g). This will ensure that only one guest can enter at a time. You can put the timer cancel/restart code in here.
java.util.Timer works through the abstract java.util.TimerTask class. This is a type of Runnable that also has a method to cancel the task. My recommendation is to schedule a task that will pick guests after a 5000 ms delay whenever a guest enters. If a task scheduled for the previous guest is still pending, cancel it first.
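The enter method should acquire the guest's lock (using a synchronized block) and have the guest wait. Note that wait() releases only the monitor it is called on, so if enter is declared synchronized, a guest waiting inside it still holds the host's lock and keeps other guests from entering; do the waiting outside the host's synchronized section. The picking should call the notify() method on the guests you select. This will allow them to continue executing.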
When you remove selected guests from your queue, be aware of the fact that Java collections are not thread-safe by default. You will have to use an external lock to ensure that no-one else is modifying your list when you add and remove guests. The Collections.synchronizedList(List) method provides a handy way to do this.
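The cancel-and-restart part of this suggestion could look roughly like the sketch below. ResettableTimer and reset() are names chosen for illustration; in the scenario above, the action would be the host's pickGuests and the delay 5000 ms.

```java
import java.util.Timer;
import java.util.TimerTask;

class ResettableTimer {
    private final Timer timer = new Timer(true); // daemon thread, does not keep the JVM alive
    private final long delayMillis;
    private final Runnable action;
    private TimerTask pending; // guarded by this

    ResettableTimer(long delayMillis, Runnable action) {
        this.delayMillis = delayMillis;
        this.action = action;
    }

    /** Call from enter(): drops the countdown in progress and starts a fresh one. */
    synchronized void reset() {
        if (pending != null) {
            pending.cancel(); // no effect on a task that has already started running
        }
        pending = new TimerTask() {
            @Override
            public void run() {
                action.run();
            }
        };
        timer.schedule(pending, delayMillis);
    }
}
```

A TimerTask cannot be rescheduled once cancelled, which is why each reset creates a new one. Because cancel() does not interrupt a task that is already executing, a guest arriving during the picking simply starts the next countdown.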
Here is a list of links that discuss the topics I have mentioned:
http://docs.oracle.com/javase/tutorial/essential/concurrency/ (excellent tutorial for beginners)
http://docs.oracle.com/javase/7/docs/api/java/util/Timer.html
http://docs.oracle.com/javase/7/docs/api/java/util/TimerTask.html
http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#synchronizedList%28java.util.List%29
I might do this like this. I'd try to avoid notify/notifyAll as you'll have to involve a flag because of spurious wakeups and that clutters the code quite a bit. CountDownLatch is a much better choice here IMO even though the name is a bit weird.
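// Pair is assumed to be a simple generic tuple with public fields first/second
static final long five_sec = TimeUnit.SECONDS.toNanos(5);
final Queue<Pair<Guest, CountDownLatch>> guests = new LinkedList<>();
long earliest = System.nanoTime() + five_sec;
// Synchronizing on "this" is fine but using private lock
// object is even better
final Object lock = new Object();

void enter(Guest g) throws InterruptedException {
    Pair<Guest, CountDownLatch> p = Pair.of(g, new CountDownLatch(1));
    synchronized (lock) {
        guests.add(p);
        earliest = System.nanoTime() + five_sec; // restart the five-second window
    }
    p.second.await(); // the guest blocks here until the host counts the latch down
}

void pickGuests() {
    synchronized (lock) {
        // pop a few guests from the queue and wake them
        Pair<Guest, CountDownLatch> p = guests.poll();
        if (p != null) {
            p.second.countDown();
        }
    }
}

public void run() {
    while (!Thread.currentThread().isInterrupted()) {
        long sleepTime;
        synchronized (lock) {
            if (System.nanoTime() - earliest > 0) { // overflow-safe nanoTime comparison
                pickGuests();
            }
            sleepTime = earliest - System.nanoTime();
            // deadline already passed: check again in five seconds
            sleepTime = sleepTime <= 0 ? five_sec : sleepTime;
        }
        try {
            TimeUnit.NANOSECONDS.sleep(sleepTime); // Thread.sleep would expect milliseconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag so the loop exits
        }
    }
}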

Java- Efficient Scheduling Structure?

I apologise for the length of this problem, but I thought it important to include sufficient detail given that I'm looking for a suitable approach to my problem, rather than a simple code suggestion!
General description:
I am working on a project that requires tasks being able to be 'scheduled' at some relative repeating interval.
These intervals are in terms of some internal time that is represented as an integer and incremented as the program executes (so it is not equal to real time). Each time it is incremented, the schedule will be interrogated to check for any tasks due to execute at this timestep.
If a task is executed, it should then be rescheduled to run again at a position relative to the current time (e.g. in 5 timesteps). This relative position is simply stored as an integer property of the Task object.
The problem:
I am struggling somewhat to decide how I should structure this, partly because it is a slightly difficult set of search terms to look for.
As it stands, I am thinking that each time the timer is incremented I need to:
Execute tasks at the '0' position in the schedule
Re-add those tasks to the schedule again at their relative position (e.g. a task that repeats every 5 steps will be returned to the position 5)
Each group of tasks in the schedule will have their 'time until execution' decremented one (e.g. a task at position 1 will move to position 0)
Assumptions:
There are a couple of assumptions that may limit the possible solutions I can use:
The interval must be relative, not a specific time, and is defined to be an integer number of steps from the current time
These intervals may take any integer value, i.e. they are not bounded.
Multiple tasks may be scheduled for the same timestep, but their order of execution is not important
All execution should remain in a single thread- multi-threaded solutions are not suitable due to other constraints
The main questions I have are:
How could I design this Schedule to work in an efficient manner? What datatypes/collections may be useful?
Is there another structure/approach I should consider?
Am I wrong to dismiss scheduling frameworks (e.g. Quartz), which appear to work more in the 'real' time domain rather than the 'non-real' time domain?
Many thanks for any possible help. Please feel free to comment for further information if necessary, I will edit wherever needed!
Well, Quartz is quite a powerful tool, however it has limited configuration possibilities, so if you need specific features, you should probably write your own solution.
However, it's a good idea to study the Quartz source code and data structures, because they have successfully dealt with many of the problems you would run into, e.g. inter-process synchronization at the database level, running delayed tasks, etc.
I once wrote my own scheduler, adapted to tasks that Quartz probably would not have been easy to adapt to, but once I had learned Quartz I understood how much I could improve my solutions by knowing how it was done in Quartz.
How about this, it uses your own Ticks with executeNextInterval() :
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
public class Scheduler {
private LinkedList<Interval> intervals = new LinkedList<Scheduler.Interval>();
public void addTask(Runnable task, int position) {
if(position<0){
throw new IllegalArgumentException();
}
while(intervals.size() <= position){
intervals.add(new Interval());
}
Interval interval = intervals.get(position);
interval.add(task);
}
public void executeNextInterval(){
Interval current = intervals.removeFirst();
current.run();
}
private static class Interval {
private List<Runnable> tasks = new ArrayList<Runnable>();
public void add(Runnable task) {
tasks.add(task);
}
public void run() {
for (Runnable task : tasks) {
task.run();
}
}
}
}
You might want to add some error handling, but it should do your job.
And here are some UnitTests for it :)
import junit.framework.Assert;
import org.junit.Test;
public class TestScheduler {
private static class Task implements Runnable {
public boolean didRun = false;
public void run() {
didRun = true;
}
}
Runnable fail = new Runnable() {
@Override
public void run() {
Assert.fail();
}
};
@Test
public void queue() {
Scheduler scheduler = new Scheduler();
Task task = new Task();
scheduler.addTask(task, 0);
scheduler.addTask(fail, 1);
Assert.assertFalse(task.didRun);
scheduler.executeNextInterval();
Assert.assertTrue(task.didRun);
}
@Test
public void queueWithGaps() {
Scheduler scheduler = new Scheduler();
scheduler.addTask(fail, 1);
scheduler.executeNextInterval();
}
@Test
public void queueLonger() {
Scheduler scheduler = new Scheduler();
Task task0 = new Task();
scheduler.addTask(task0, 1);
Task task1 = new Task();
scheduler.addTask(task1, 1);
scheduler.addTask(fail, 2);
scheduler.executeNextInterval();
scheduler.executeNextInterval();
Assert.assertTrue(task0.didRun);
Assert.assertTrue(task1.didRun);
}
}
A circular linked list might be the data structure you're looking for. Instead of decrementing fields in each task element, you simply increment the index of the 'current' field in the circular list of tasks. A pseudocode structure might look something like this:
tick():
current = current.next()
for task : current.tasklist():
task.execute()
any time you schedule a new task, you just add it in the position N ticks forward of the current 'tick'
Here are a couple of thoughts:
Keep everything simple. If you don't have millions of tasks, there is no need for an optimized data structure (except pride or the urge for premature optimization).
Avoid relative times. Use an absolute internal tick. If you add a task, set the "run next time" to the current tick value. Add it to the list, sort the list by time.
When looking for tasks, start at the head of the list and pick everything which has a time <= current tick, run the task.
Collect all those tasks in another list. After all have run, calculate the "run next time" based on the current tick and the increment (so you don't get tasks that loop), add all of them to the list, sort.
Take a look at the way DelayQueue uses a PriorityQueue to maintain such an ordered list of events. DelayQueue works using real time and hence can use the variable timed wait methods available in Condition and LockSupport. You could implement something like a SyntheticDelayQueue that behaves in the same way as DelayQueue but uses your own synthetic time service. You would obviously have to replace the timed wait/signalling mechanisms that come for free with the jdk though and this might be non trivial to do efficiently.
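A minimal single-threaded sketch of that idea, combining an absolute tick counter with a PriorityQueue ordered by due tick. TickScheduler, schedule() and tick() are names assumed for this example, and since nothing ever blocks in a single thread, no timed-wait mechanism is needed here.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class TickScheduler {
    private static final class Entry {
        final Runnable task;
        final int interval; // repeat distance in ticks
        long due;           // absolute tick at which the task runs next

        Entry(Runnable task, int interval, long due) {
            this.task = task;
            this.interval = interval;
            this.due = due;
        }
    }

    // Earliest due tick first
    private final PriorityQueue<Entry> queue =
            new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.due));
    private long now = 0;

    /** Registers a task that runs every interval ticks, first in interval ticks from now. */
    void schedule(Runnable task, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be at least 1");
        }
        queue.add(new Entry(task, interval, now + interval));
    }

    /** Advances the internal time by one step and runs everything that is due. */
    void tick() {
        now++;
        while (!queue.isEmpty() && queue.peek().due <= now) {
            Entry e = queue.poll();
            e.task.run();
            // interval >= 1, so the new due tick is in the future and the loop terminates
            e.due = now + e.interval;
            queue.add(e);
        }
    }
}
```

Nothing is decremented per tick: only the tasks that are actually due get touched, and intervals can be arbitrarily large without allocating empty slots.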
If I had to do it, I'd create a simple queue ( linked list variant). This queue would contain a dynamic data structure (simple list for example) containing all the tasks that need to be done. At each time interval (or time-step), the process reads the first node of the queue, executes the instructions it finds in the list of that node. At the end of each execution it would compute the rescheduling and add the new execution to another node in the queue or create nodes up to that position before storing the instruction within that node. The first node is then removed and the second node (now the first) is executed at the next time-step. This system would also not require any integers to be kept track of and all data structures needed are found in the java language. This should solve your problem.
Use a ScheduledExecutorService. It has everything you need built right in. Here's how simple it is to use:
// Create a single-threaded ScheduledExecutorService
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1); // 1 thread
// Schedule something to run in 10 seconds time
scheduler.schedule(new Runnable() {
public void run() {
// Do something
}}, 10, TimeUnit.SECONDS);
// Schedule something to run in 2 hours time
scheduler.schedule(new Runnable() {
public void run() {
// Do something else
}}, 2, TimeUnit.HOURS);
// etc as you need

Java: Best way to retrieve timings from multiple threads

We have 1000 threads that hit a web service and time how long the call takes. We wish for each thread to return their own timing result to the main application, so that various statistics can be recorded.
Please note that various tools were considered for this, but for various reasons we need to write our own.
What would be the best way for each thread to return the timing - we have considered two options so far :-
1. Once a thread has its timing result, it calls a singleton that provides a synchronised method to write to the file. This ensures that each thread will write to the file in turn (although in an undetermined order, which is fine), and since the call is made after the timing result has been taken by the thread, being blocked waiting to write is not really an issue. When all threads have completed, the main application can then read the file to generate the statistics.
2. Using the Executor, Callable and Future interfaces
Which would be the best way, or are there any other better ways ?
Thanks very much in advance
Paul
Use the latter method.
Your workers implement Callable. You then submit them to a threadpool, and get a Future instance for each.
Then just call get() on the Futures to get the results of the calculations.
import java.util.*;
import java.util.concurrent.*;

public class WebServiceTester {
    public static class Tester implements Callable<Long> {
        public Long call() {
            long start = System.nanoTime();
            // Do your test here
            long end = System.nanoTime();
            return end - start; // elapsed time in nanoseconds
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1000);
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            futures.add(pool.submit(new Tester()));
        }
        // A List keeps duplicate timings, which a Set would silently drop
        List<Long> results = new ArrayList<>();
        for (Future<Long> future : futures) {
            results.add(future.get()); // blocks until that worker has finished
        }
        pool.shutdown();
        // Manipulate results however you wish....
    }
}
Another possible solution I can think of would be to use a CountDownLatch (from the java concurrency packages), each thread decrementing it (flagging they are finished), then once all complete (and the CountDownLatch reaches 0) your main thread can happily go through them all, asking them what their time was.
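A small sketch of the latch variant, assuming each thread writes its timing into its own array slot. LatchTimings and timeAll are hypothetical names, and call stands for whatever invokes the web service.

```java
import java.util.concurrent.CountDownLatch;

class LatchTimings {
    /** Runs call once on each of n threads and returns the elapsed nanoseconds per thread. */
    static long[] timeAll(int n, Runnable call) throws InterruptedException {
        long[] elapsed = new long[n];
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            final int slot = i; // each thread writes only to its own slot
            new Thread(() -> {
                long start = System.nanoTime();
                try {
                    call.run();
                } finally {
                    elapsed[slot] = System.nanoTime() - start;
                    done.countDown(); // count down even if the call failed
                }
            }).start();
        }
        // Writes made before countDown() are visible once await() returns
        done.await();
        return elapsed;
    }
}
```

Compared with the Future-based version, the main thread gets all results in one step but has no per-task handle for cancellation or error reporting.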
The executor framework can be used here. The timing can be done by the Callable object. The Future can help you identify whether the thread has completed processing.
You could pass an ArrayBlockingQueue to the threads to report their results to. You could then have a file writing thread that takes from the queue to write to the file.
