Long running HTTP request and threads synchronization

I need to code a web service that solves a complex problem using a heuristic algorithm. The algorithm runs until the amount of time specified in the POST request has elapsed (i.e. passing timeAllowance=60 makes the heuristic algorithm stop after 60 seconds and return the best solution found).
The heuristic algorithm has to run on several threads to take advantage of all the server cores. During the execution of the algorithm, these threads have to "communicate" with each other. Each thread will run the heuristic algorithm and, after a certain amount of time, the threads will communicate the solutions they found; if the allowed time has not expired, a new cycle is run with a different initial population. Summarizing:
1. Generate initial populations (pretty much randomly)
2. Launch heuristic algorithm threads, each one taking a population as input
3. After a certain amount of time, terminate the threads and communicate to a "controller entity" the new populations found by the threads
4. Do some logical reasoning and generate the new populations based on the result of the threads launched at point 2
5. If the allowed time has not expired, go back to point 2 with the new populations. Otherwise quit
My question is: how would you structure the code using Spring MVC?
Just as a test, I tried to launch 10 threads in a service method and to call that method from a controller (autowiring the service). All the threads do is sleep for 60 seconds. I was expecting the HTTP request to wait for all the threads to terminate (i.e. about 60 seconds), but it actually responds straight away.
Any help very much appreciated.
Thank you!

You don't want raw threads, you want a thread pool (ExecutorService). Submit some number of Callable<HeuristicResult> tasks to your pool and wait on the returned Future<HeuristicResult> objects. Once all futures are done, do your point 4 and go back to point 2 (reusing the thread pool).
At the end, shut down the pool, or reuse it for all requests (more scalable).
I tried to launch 10 threads [...] I was expecting the HTTP request to wait for all the threads to terminate [...], but it actually responds straight away.
Starting a thread is non-blocking; from that moment on the thread works asynchronously. You can call join() on the created thread to wait for its termination, but a thread pool with Future.get() is a much more modern and flexible approach.
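The cycle described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name HeuristicCycles, the heuristicTask stand-in (which just adds a random amount to a score) and the use of a plain double as the "solution" are all hypothetical placeholders for the real algorithm and result type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

class HeuristicCycles {

    // Hypothetical stand-in for one heuristic run: "improves" a starting score a little.
    static Callable<Double> heuristicTask(double startScore) {
        return () -> startScore + ThreadLocalRandom.current().nextDouble();
    }

    // Runs cycles until the time allowance is used up and returns the best score seen.
    static double solve(long timeAllowanceMillis, int workers) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeAllowanceMillis);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            double best = 0.0;
            do {
                // Points 1 and 4: build one task per worker from the current best result.
                List<Callable<Double>> tasks = new ArrayList<>();
                for (int i = 0; i < workers; i++) {
                    tasks.add(heuristicTask(best));
                }
                // Points 2 and 3: invokeAll blocks the calling thread until every task is done.
                for (Future<Double> future : pool.invokeAll(tasks)) {
                    best = Math.max(best, future.get());
                }
            } while (System.nanoTime() - deadline < 0); // Point 5: repeat while time remains.
            return best;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a heuristic task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("best = " + solve(200, 4));
    }
}
```

Because invokeAll and Future.get() block the caller, a controller method that calls solve(...) would not return until the allowance has elapsed, which is the behaviour the question expected from the raw threads.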

Without seeing any code, I would guess that the reason this returned straight away is that you started the task in a background thread rather than in the thread servicing the request.
If I were writing this service I would probably not wait for 60 seconds before returning the response. I would start the task in the background (using a service) and return a status page immediately. On this page you could use AJAX to poll the server for the status of the task and use JavaScript to render a progress bar in the browser.
Therefore you would need a controller method to start the process and one to allow the browser to obtain the status. Since you just need the time since it started to derive the progress I would most likely just put the start time and total allowed time in the session. Then you need a controller method to calculate the percentage of time elapsed and return that to the browser.
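The percentage calculation itself is small. A minimal sketch, assuming a hypothetical ProgressTracker object that holds the start time and the allowance (in a Spring MVC application it could be the thing stored in the session, and the status controller method would call percentElapsed with the current time):

```java
class ProgressTracker {
    private final long startMillis;
    private final long allowanceMillis;

    ProgressTracker(long startMillis, long allowanceMillis) {
        if (allowanceMillis <= 0) {
            throw new IllegalArgumentException("allowanceMillis must be positive");
        }
        this.startMillis = startMillis;
        this.allowanceMillis = allowanceMillis;
    }

    // Percentage of the allowed time that has elapsed at nowMillis, clamped to 0..100.
    int percentElapsed(long nowMillis) {
        long elapsed = Math.max(0, nowMillis - startMillis);
        return (int) Math.min(100, elapsed * 100 / allowanceMillis);
    }

    public static void main(String[] args) {
        ProgressTracker tracker = new ProgressTracker(System.currentTimeMillis(), 60_000);
        System.out.println(tracker.percentElapsed(System.currentTimeMillis()) + "%");
    }
}
```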

Related

Java: Controlling hardware tasks with pausable ThreadPoolExecutor

I want to implement a single-producer - multi-consumer logic where each consumer processing time depends on a hardware response.
EDIT
I have a Set of objects (devices). Each object (device) corresponds to a hardware real unit I want to simulate in software.
My main class distributes a list of tasks to each device. Each task takes a certain time to complete, which I want to control in order to simulate the hardware operation. Each device object has its own single-thread ExecutorService to manage its own queued tasks. A sleep in a task of a specific device object should not interfere with the performance of main or of the other device objects.
So far things are working but I am not sure how to get a future from the tasks without blocking the main thread with a while(!future.isDone()). When I do it, two problems occur:
task 1 is submitted to device[ 1 ].executor. Tasks 1 sleeps to simulate hardware operation time.
task 2 should be submitted to device[ 2 ].executor as soon as task 1 is submitted, but it won't be, because the main thread is held up while waiting for task 1 to return a Future. This issue accumulates delay in the simulation, since every task added causes the next device to wait for the previous one to complete, instead of running simultaneously.
Orange line indicates a command to force device to wait for 1000 milliseconds.
When Future returns, it then submits a new task to device 2, but it is already 1 second late, seen in blue line. And so on, green line shows the delay increment.
If I don't use Future to find out when tasks have finished, the simulation seems to run correctly. I couldn't find a way to use future.isDone() without having to create a new thread just to check it. I would really be glad if someone could advise me on how to proceed in this scenario.

If your goal is to implement something where each consumer task is talking to a hardware device during the processing of its task, then the run method of the task should simply talk to the device and block until it receives the response from the device. (How you do that will depend on the device and its API ...)
If your goal is to do the above with a simulated device (i.e. for testing purposes) then have the task call Thread.sleep(...) to simulate the time that the device would take to respond.
Based on your problem description (as I understand it), the PausableSchedulerThreadPoolExecutor class that you have found won't help. What that class does is to pause the threads themselves. All of them.
UPDATE
task 2 should be submitted to device[ 2 ].executor as soon as task 1 is submitted, but it won't, because main thread is hold while waiting for task 1 to return a Future.
That is not correct. The Future object is returned immediately ... when the task is submitted.
Your mistake (probably) is that the main thread is calling get on the Future. That will block. But the point is that if your main thread actually needs to call get on the Future before submitting the next task, then it is essentially single-threaded.
Real solution: figure out how to break that dependency that makes your application single threaded. (But beware: if you pass the Future as a parameter to a task, then the corresponding worker thread may block. Unless you have enough threads in the thread pool you could end up with starvation and reduced concurrency.)
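One way to picture this: submit to every device executor first and only call get afterwards. The sketch below is an assumption-laden illustration, not the asker's code; DeviceSimulation, runAll and the use of Thread.sleep as the simulated hardware delay are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DeviceSimulation {

    // Submits one simulated task per device, then waits for all of them.
    // Returns the total wall-clock time in milliseconds.
    static long runAll(int devices, long taskMillis) {
        List<ExecutorService> executors = new ArrayList<>();
        List<Future<Integer>> futures = new ArrayList<>();
        long start = System.nanoTime();
        try {
            for (int i = 0; i < devices; i++) {
                ExecutorService executor = Executors.newSingleThreadExecutor();
                executors.add(executor);
                final int deviceId = i;
                // submit() returns the Future immediately; nothing blocks here.
                futures.add(executor.submit(() -> {
                    Thread.sleep(taskMillis); // simulated hardware response time
                    return deviceId;
                }));
            }
            // Only now wait for the results; the devices have been running in parallel.
            for (Future<Integer> future : futures) {
                future.get();
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for devices", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a device task failed", e);
        } finally {
            for (ExecutorService executor : executors) {
                executor.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("3 devices took about " + runAll(3, 200) + " ms");
    }
}
```

With three devices and a 200 ms task each, the total should stay close to 200 ms rather than 600 ms, because the waiting happens after all submissions.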

Socket timeout not respected with multiple threads in Java?

I have a java program that spawns out 4 threads. In each thread, I have multiple socket timeouts. However, it seems that these timeouts are not respected i.e. the readLine() function might block for a longer period of time.
I want the following behavior: If I set a socket timeout to 300 ms, then I want the readLine() function to return within 300 ms from when the readLine() (i.e. the underlying select call) was invoked, no matter what. I understand that the OS scheduler would be putting the threads to sleep while doing processor sharing, but is there any way in Java to force the threads to always be woken up to ensure this behavior? Or is this just not the right way to think when doing multi-threaded programming?
Ideally, since I am spawning out 4 threads and running on a 6-core machine, each thread should be able to get its own CPU and run in parallel, and respect the select timeout ... but that's perhaps too much to expect ...
PS: I actually do use Thread.interrupt() to ensure that each of my threads exits within a certain time (I check the time elapsed in the master thread, and interrupt the child threads if it's been too long). In each of my threads, I connect to a (different) server, make a request, and wait for a response. I do not know how long the response will be. So I keep on calling the readLine() method until it times out with a SocketTimeoutException. I enforce a timeout of 300 ms, since I expect the server to start responding within this time. The reason I want to enforce this timeout is that the server behaves in a broadcast fashion, and sends responses to a request by a single client to all clients. So if I don't have a timeout, I will keep on getting data in response to requests by some other clients.

If I really understood your problem, you can always try to invoke Thread.interrupt() in the thread that is performing the readLine() operation. As you didn't provide any code, I leave this link for you to read. It's general, but it provides enough information about interrupting threads.
These two links may also be of use to you: How do you kill a thread in Java? and How to abort a thread in a fast and clean way in java?.
Regarding your question about the OS scheduler, you should be aware that in general purpose OSes you do not have total control on the way the OS schedules tasks. For instance, in Linux the interrupts are the highest priority tasks, then there are scheduling policies that enable you to put "some" determinism on how the tasks are scheduled. In Java, you can use the setPriority() method to change the priority of the thread, but in fact it is the same as using the nice command, and still you don't get any guarantees that this thread will be scheduled ahead of other OS threads.
I hope this helps.

You are making some wrong assumptions here:
that the timeout will be exactly 300 ms. In fact, the guarantee is that it will be at least 300 ms.
the OS scheduler does nothing within the Java threads other than scheduling the Java OS processes.
having 6 cores does not mean that each of your threads will run on a separate core; it is not possible to bind a thread to a core in Java.
lastly, you assume that the JVM has only your 4 threads running, but in fact there are more threads, for example the garbage collector thread(s).
Answering your question: "is there any way in Java to force the threads to always be woken up to ensure this behavior? "
Yes, depending on how your code is written: if the thread is in Thread.sleep(), you can use thread.interrupt() (your readLine() code already uses it) and handle the InterruptedException; if it is in object.wait(), you can use object.notify() or object.notifyAll().
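A minimal sketch of the sleep/interrupt case, with hypothetical names (InterruptDemo, interruptSleeper). Note that this covers Thread.sleep() and Object.wait(); a platform thread blocked in a classic java.net.Socket read is generally not woken by interrupt() and instead relies on the socket timeout or on the socket being closed.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptDemo {

    // Starts a thread that would sleep for a minute, interrupts it,
    // and reports whether the sleeper observed the InterruptedException.
    static boolean interruptSleeper() {
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // sleep() ends early (or never starts) once the thread is interrupted.
                sawInterrupt.set(true);
            }
        });
        sleeper.start();
        sleeper.interrupt();
        try {
            sleeper.join(5_000); // wait for the sleeper to finish handling the interrupt
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println("sleeper was interrupted: " + interruptSleeper());
    }
}
```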

Is it dangerous to start threads in Java and not wait for them (with .join())?

When writing a multithreaded internet server in Java, the main thread starts new ones to serve incoming requests in parallel.
Is there any problem if the main thread does not wait (with .join()) for them?
(It is obviously absurd to create a new thread and then wait for it.)
I know that, in a practical situation, you should (or "you must"?) implement a pool of threads to "re-use" them for new requests when they become idle.
But for small applications, should we use a pool of threads?

You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and the executor methods, since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigate using (say) a servlet container or a framework such as Mina.

The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible ThreadPool implementation, see ThreadPoolExecutor. Using this is probably even easier than implementing everything with your original approach, so no reason not to use it.
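A rough sketch of a bounded pool, under assumptions of my own: the class BoundedPool, the pool and queue sizes, and the choice of AbortPolicy (reject excess work immediately instead of letting it time out in the queue, a variation on what is described above) are illustrative, not part of the answer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {

    // Fixed number of threads plus a bounded queue; extra work is rejected.
    static ThreadPoolExecutor create(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits a burst of blocking "requests" and returns how many were rejected.
    static int submitBurst(int threads, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = create(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a request that is still being served
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // the pool is saturated: all threads busy, queue full
                }
            }
        } finally {
            release.countDown(); // let the accepted requests finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + submitBurst(2, 3, 10));
    }
}
```

With 2 threads and a queue of 3, a burst of 10 blocking requests leaves 5 accepted and 5 rejected, while the number of threads never grows beyond the bound.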

Thread.join() lets you wait for the Thread to end, which is mostly contrary to what you want when starting a new Thread. After all, you start the new thread to do work in parallel with the original Thread.
You should join() the spawned thread only if you really need to wait for it to finish.

You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead, otherwise not.
For the Thread-Pool: I would use it whenever you have some non-fixed number of tasks to run, i.e. if the number depends on the input.

I would like to collect the main ideas of this interesting (for me) question.
I can't totally agree with "you don't need to wait for threads". Only in the sense that if you don't join a thread (and don't have a pointer to it), once the thread is done its resources are freed (right? I'm not sure).
The use of a thread pool is only necessary to avoid the overhead of thread creation, because ...
You can limit the number of parallel running threads by accounting, with shared variables (and without a thread pool), for how many of them were started but not yet finished.

Java daemon - handling shutdown requests

I'm currently working on a daemon that will be doing A LOT of different tasks. It's multi threaded and is being built to handle almost any kind of internal-error without crashing. Well I'm getting to the point of handling a shutdown request and I'm not sure how I should go about doing it.
I have a shutdown hook setup, and when it's called it sets a variable telling the main daemon loop to stop running. The problem is, this daemon spawns multiple threads and they can take a long time. For instance, one of these threads could be converting a document. Most of them will be quick (I'm guessing under 10 seconds), but there will be threads that can last as long as 10+ minutes.
What I'm thinking of doing right now is: when the shutdown hook is called, first send a notification to all threads telling them a shutdown has been requested, then loop for about 5 seconds on ThreadGroup.activeCount() with a 500 ms (or so) sleep (all these threads are in a ThreadGroup). The threads will then have to clean up and shut down immediately, no matter what they're doing.
Does anyone else have any suggestions? I'm interested in what a daemon like MySQL, for instance, does when it gets told to stop; it stops instantly. What happens if, say, 10 very slow queries are running? Does it wait, or does it just end them? I mean, servers are really quick, so there really isn't any kind of operation that I shouldn't be able to do in less than a second. You can do A LOT in 1000 ms nowadays.
Thanks

The java.util.concurrent package provides a number of utilities, such as ThreadPoolExecutor (along with various specialized types of other Executor implementations from the Executors class) and ThreadPoolExecutor.awaitTermination(), which you might want to look into - as they provide the same exact functionality you are looking to implement. This way you can concentrate on implementing the actual functionality of your application/tasks instead of worrying about things like thread and task scheduling.

Are your thread jobs amenable to interruption via Thread#interrupt()? Do they mostly call on functions that themselves advertise throwing InterruptedException? If so, then the aforementioned java.util.concurrent.ExecutorService#shutdownNow() is the way to go. It will interrupt any running threads and return the list of jobs that were never started.
Similarly, if you hang on to the Futures produced by ExecutorService#submit(), you can use Future#cancel(boolean) and pass true to request that a running job be interrupted.
Unless you're calling on code out of your control that swallows interrupt signals (say, by catching InterruptedException without calling Thread.currentThread().interrupt()), using the built-in cooperative interruption facility is a better choice than introducing your own flags to approximate what's already there.
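A common shape for this is "shutdown, wait a grace period, then shutdownNow". The sketch below assumes hypothetical names (GracefulShutdown, stop, demo) and an interruptible job simulated with Thread.sleep; real jobs need to respond to interruption for the second phase to work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class GracefulShutdown {

    // Stops accepting work, waits for running jobs, then interrupts whatever is left.
    // Returns true if the pool terminated within the two grace periods.
    static boolean stop(ExecutorService pool, long graceMillis) {
        pool.shutdown(); // no new tasks; already submitted ones keep running
        try {
            if (pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow(); // interrupts running jobs, discards queued ones
            return pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Demo: a job that would run for a minute is cut short by the interrupt.
    static boolean demo() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(() -> {
            Thread.sleep(60_000); // stands in for a long, interruptible conversion job
            return null;
        });
        return stop(pool, 200);
    }

    public static void main(String[] args) {
        System.out.println("terminated cleanly: " + demo());
    }
}
```

In a daemon, stop(...) could be called from the shutdown hook in place of a hand-written loop over ThreadGroup.activeCount().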

handling sleep in java scheduled executor service

I have a sort of complex problem like below.
- We have a real-time system that requires a large number of threads. In order to optimize performance, we are thinking of the following design:
create a thread pool executor with the maximum number of threads
each thread is used to create a scheduled executor service
the tasks are then assigned to these executor services evenly, based on load
BUT the biggest problem is that if one of the tasks in the queue contains a sleep (for a few seconds), it blocks the corresponding scheduled executor service thread for that duration, and subsequently all the following tasks in that queue.
In this regard, please suggest how to suspend the execution of the task with the sleep, OR how to override the sleep somehow and requeue/reschedule the task.
Thanks in advance
Seshu

Assuming I understand your question, your Schedule Executor service threads have a deadline requirement, but the actual workers can sleep for an unknown length of time, possibly throwing off the timing of the Schedule Executors. From your description I'm guessing what you want is for a task that needs to sleep to actually stop, save progress information and then requeue itself for the remainder of the work to be rescheduled at some future time. You'd have to build this into your application architecture.
Alternatively, you could have the scheduler threads launch the worker tasks in their own separate threads, letting them sleep as necessary, with one scheduler thread collecting all the worker terminations.
To get a better answer you're going to have to provide more information about what you're trying to accomplish.

Tasks which sleep are inherently unfriendly for running in any kind of bounded thread pool. The sleep is explicitly telling the thread that it must do nothing for a period of time.
If possible, split the task into 2 (or more parts), eliminating the sleep completely. Get the first half-task to schedule the second task with an appropriate delay.
Failing that, you could consider increasing the size of your thread pool somewhat, either by setting a much larger cap on its size or possibly even by eliminating the cap altogether (not recommended for a server that might end up with many clients).
Alternatively, move the tasks with sleep statements in them into their own Scheduled executor. Then, they'll delay each other, but better-behaved tasks, with no wait statements in them, will get preferential treatment.
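The "split the task in two" suggestion could look roughly like this. It is a sketch with hypothetical names (SplitTask, run, and the "part1"/"part2"/"other" labels): instead of sleeping, the first half schedules the second half with a delay, so the single scheduler thread stays free for other work in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SplitTask {

    // Runs part one now and part two after delayMillis, without sleeping in the pool.
    // Returns the order in which the steps actually ran.
    static List<String> run(long delayMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try {
            scheduler.execute(() -> {
                log.add("part1");
                // Instead of Thread.sleep(delayMillis), hand the rest back to the scheduler.
                scheduler.schedule(() -> {
                    log.add("part2");
                    done.countDown();
                }, delayMillis, TimeUnit.MILLISECONDS);
            });
            // Another task queued behind part one; it is not held up by the delay.
            scheduler.execute(() -> log.add("other"));
            if (!done.await(delayMillis + 5_000, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("second half did not run in time");
            }
            return new ArrayList<>(log);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            scheduler.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(300));
    }
}
```

Even with a single scheduler thread, the "other" task runs between the two halves, which is the effect the sleep was preventing.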
