ThreadPoolExecutor to handle high Memory Tasks in Grails? - java

I have some jobs that process images for my application. They take a lot of Heap memory. Thus I want to restrict the number of image processing tasks or queue them in some way.
I also use GPars to handle the image processing, but with my approach too many worker threads are sometimes open concurrently.
How can I use a ThreadPoolExecutor in Grails to get this done right?

I think you can do this by using GParsExecutorsPool:
GParsExecutorsPool.withPool() {
    Closure longLastingCalculation = { calculate() }
    Closure fastCalculation = longLastingCalculation.async() // create a new closure, which starts the original closure on a thread pool
    Future result = fastCalculation() // returns almost immediately
    // do stuff while calculation performs …
    println result.get()
}
For more details check this link:
Use of ThreadPool - the Java Executors' based concurrent collection processor
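If the goal is simply to cap how many memory-heavy tasks run at once, a plain fixed-size pool from java.util.concurrent (usable from Groovy and Grails as-is) is one option. The following is a minimal sketch, not a drop-in solution: the class name BoundedImagePool, the limit MAX_CONCURRENT and the sleep that stands in for image processing are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedImagePool {
    // Hypothetical limit: at most 2 memory-heavy tasks run at the same time.
    static final int MAX_CONCURRENT = 2;

    /** Runs taskCount dummy tasks and returns the highest concurrency observed. */
    static int runTasks(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(MAX_CONCURRENT);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // Extra tasks wait in the pool's queue until a worker is free.
            pool.execute(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(20); // stand-in for the real image processing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + runTasks(10));
    }
}
```

Only the queued task objects wait in memory; the heap-hungry work itself never runs on more than MAX_CONCURRENT threads.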

Related

Balancing multiple queues

I suspect this is really easy but I’m unsure if there’s a naïve way of doing it in Java. Here’s my problem, I have two scripts for processing data and both have the same inputs/outputs except one is written for the single CPU and the other is for GPUs. The work comes from a queue server and I’m trying to write a program that sends the data to either the CPU or GPU script depending on which one is free.
I do not understand how to do this.
I know that with an ExecutorService I can specify how many threads I want to keep running, but I'm not sure how to balance between two different ones. I have 2 GPUs and 8 CPU cores on the system and thought I could have an ExecutorService keep 2 GPU and 8 CPU processes running, but I'm unsure how to balance between them since the GPU tasks will finish a lot quicker than the CPU tasks.
Any suggestions on how to approach this? Should I create two queues and keep polling them to see which one is less busy? Or is there a way to just put all the work units (all the same) into one queue and have the GPU or CPU process take from the same queue as they become free?
UPDATE: just to clarify, the CPU/GPU programs are outside the scope of the program I'm making; they are simply scripts that I call via two different methods. I guess the simplified version of what I'm asking is whether two methods can take work from the same queue.
Can two methods take work from the same queue?
Yes, but you should use a BlockingQueue to save yourself some synchronization heartache.
Basically, one option would be to have a producer which places tasks into the queue via BlockingQueue.offer. Then design your CPU/GPU threads to call BlockingQueue.take and perform work on whatever they receive.
For example:
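public static void main(String[] args) {
    BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < CPUs; i++) {
        new CPUThread(queue).start();
    }
    for (int i = 0; i < GPUs; i++) {
        new GPUThread(queue).start();
    }
    for (/*all data*/) {
        queue.offer(task);
    }
}

class CPUThread extends Thread {
    private final BlockingQueue<Task> queue;

    CPUThread(BlockingQueue<Task> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (/*some condition*/) {
                Task task = queue.take(); // blocks until a task is available
                // do task work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }
}
//etc...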
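The outline above leaves open how the workers stop. One common way is to put an end-of-work marker on the queue for each worker. Below is a self-contained sketch of that idea; the STOP marker, the string work items and the map that records which kind of worker handled each item are all hypothetical stand-ins for the real CPU and GPU scripts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class SharedQueueDemo {
    static final String STOP = "__STOP__"; // hypothetical end-of-work marker

    /** Processes `tasks` items with the given worker counts; returns item -> worker kind. */
    static Map<String, String> process(int cpuWorkers, int gpuWorkers, int tasks) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Map<String, String> handledBy = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < cpuWorkers + gpuWorkers; i++) {
            String kind = i < cpuWorkers ? "CPU" : "GPU";
            Thread t = new Thread(() -> {
                try {
                    // take() blocks until an item is available; whichever worker is free gets it
                    for (String item = queue.take(); !item.equals(STOP); item = queue.take()) {
                        handledBy.put(item, kind); // stand-in for calling the CPU or GPU script
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        for (int i = 0; i < tasks; i++) {
            queue.add("task-" + i);
        }
        for (int i = 0; i < workers.size(); i++) {
            queue.add(STOP); // one marker per worker, queued after all real work
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return handledBy;
    }

    public static void main(String[] args) {
        System.out.println(process(8, 2, 20));
    }
}
```

Because the faster workers return to take() sooner, they naturally pick up more of the items without any explicit balancing.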
Obviously there is more than one way to do it; usually the simplest is best. I would suggest two thread pools: one with 8 threads for CPU tasks and a second with 2 threads for GPU tasks, matching the 8 cores and 2 GPUs in the question. Your work unit manager can submit work to whichever pool has idle threads at the moment (I would recommend synchronizing that block of code). The standard Java ThreadPoolExecutor has a getActiveCount() method you can use for this; see
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ThreadPoolExecutor.html#getActiveCount().
Use Runnables like this:
class CPUGPURunnable implements Runnable {
    @Override
    public void run() {
        Thread current = Thread.currentThread();
        if (current instanceof CPUGPUThread) {
            CPUGPUThread t = (CPUGPUThread) current;
            if (t.isGPU())
                runGPU();
            else
                runCPU();
        }
    }
}
CPUGPUThread is a Thread subclass that knows, via a flag, whether it runs in CPU or GPU mode. Have a ThreadFactory for the ThreadPoolExecutor that creates either a CPU or a GPU thread. Set up a ThreadPoolExecutor with two workers. Make sure the ThreadFactory creates a CPU and then a GPU thread instance.
I suppose you have two objects that represent the two GPUs, with methods like boolean isFree() and void execute(Runnable). Then you should start 8 threads that each loop, taking the next job from the queue and handing it to a free GPU if there is one, or otherwise executing the job themselves.
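One way to sketch that last idea without writing isFree() by hand is to let a Semaphore with two permits stand for the two GPUs. This swaps the suggested GPU objects for a semaphore, so treat it as a variation rather than the answer's exact design; the class name and the counters that stand in for the real scripts are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GpuOrCpuDemo {
    /** Runs `jobs` jobs on 8 worker threads; returns {jobs run on GPU, jobs run on CPU}. */
    static int[] run(int jobs) {
        Semaphore freeGpus = new Semaphore(2);                 // one permit per GPU
        ExecutorService workers = Executors.newFixedThreadPool(8);
        AtomicInteger onGpu = new AtomicInteger();
        AtomicInteger onCpu = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            workers.execute(() -> {
                if (freeGpus.tryAcquire()) {                   // non-blocking: is a GPU free right now?
                    try {
                        onGpu.incrementAndGet();               // stand-in for the GPU script
                    } finally {
                        freeGpus.release();                    // hand the GPU back
                    }
                } else {
                    onCpu.incrementAndGet();                   // stand-in for the CPU script
                }
            });
        }
        workers.shutdown();
        try {
            workers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { onGpu.get(), onCpu.get() };
    }

    public static void main(String[] args) {
        int[] counts = run(100);
        System.out.println("GPU jobs: " + counts[0] + ", CPU jobs: " + counts[1]);
    }
}
```

The split between the two counters varies from run to run; only their sum is fixed.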

Any available design pattern for a thread that is capable of executing a specific job sent by another threads?

I'm working on a project where execution time is critical. In one of the algorithms I have, I need to save some data into a database.
What I did is call a method that does that. It fires a new thread every time it's called. I ran into an out-of-memory problem because more than 20,000 threads had been created ...
My question now is: I want to start only one thread. When the method is called, it adds the job to a queue and notifies the thread, which sleeps when no jobs are available, and so on. Are there any design patterns or examples available online?
Run, do not walk to your friendly Javadocs and look up ExecutorService, especially Executors.newSingleThreadExecutor().
ExecutorService myXS = Executors.newSingleThreadExecutor();
// then, as needed...
myXS.submit(myRunnable);
And it will handle the rest.
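To make that concrete, here is a small runnable sketch. The job body is a stand-in for the real database write, and the set of thread names exists only to show that a single worker thread handles every job; both are assumptions for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleWriterDemo {
    /** Submits `count` jobs and returns the names of the threads that ran them. */
    static Set<String> run(int count) {
        ExecutorService writer = Executors.newSingleThreadExecutor();
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < count; i++) {
            // Each call only enqueues the job; the single worker runs jobs in submission order.
            writer.execute(() -> {
                threadNames.add(Thread.currentThread().getName()); // stand-in for the database write
            });
        }
        writer.shutdown(); // the worker drains the queue, then exits
        try {
            writer.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadNames;
    }

    public static void main(String[] args) {
        System.out.println(run(20000));
    }
}
```

Submitting 20,000 jobs this way creates 20,000 small queue entries but only one thread.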
Yes, you want a worker thread or thread pool pattern.
http://en.wikipedia.org/wiki/Thread_pool_pattern
See http://www.ibm.com/developerworks/library/j-jtp0730/index.html for Java examples
I believe the pattern you're looking for is called producer-consumer. In Java, you can use the blocking methods on a BlockingQueue to pass tasks from the producers (that create the jobs) to the consumer (the single worker thread). This will make the worker thread automatically sleep when no jobs are available in the queue, and wake up when one is added. The concurrent collections should also handle using multiple worker threads.
Are you looking for java.util.concurrent.Executor?
That said, if you have 20000 concurrent inserts into the database, using a thread pool will probably not save you: if the database can't keep up, the queue will get longer and longer until you run out of memory again. Also, note that an executor's queue lives only in memory, i.e. if the server crashes, the data in it will be gone.
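One way to keep the queue from growing without limit is to give the pool a bounded queue and let the submitting thread do the work itself when the queue is full, which slows the producer down. A minimal sketch using ThreadPoolExecutor.CallerRunsPolicy follows; the pool size of 2, the queue capacity of 100 and the counter that stands in for the insert are assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BackpressureDemo {
    /** Pushes `jobs` jobs through a bounded pool; returns how many completed. */
    static int run(int jobs) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),               // at most 100 jobs wait in memory
                new ThreadPoolExecutor.CallerRunsPolicy());  // queue full: the submitter runs the job
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> done.incrementAndGet());      // stand-in for one database insert
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + run(20000));
    }
}
```

This bounds memory use but does nothing about durability: jobs still waiting in the queue are lost if the process dies.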

Spawning tons of threads without running out of memory

I have a multi-threaded application which creates hundreds of threads on the fly. When the JVM has less memory available than necessary to create the next Thread, it's unable to create more threads. Every thread lives for 1-3 minutes. Is there a way, if I create a thread and don't start it, the application can be made to automatically start it when it has resources, and otherwise wait until existing threads die?
You're responsible for checking your available memory before allocating more resources, if you're running close to your limit. One way to do this is to use the MemoryUsage class, or use one of:
Runtime.getRuntime().totalMemory()
Runtime.getRuntime().freeMemory()
...to see how much memory is available. To figure out how much is used, of course, you just subtract free from total. Then, in your app, simply set a MAX_MEMORY_USAGE value: once your app has used that much memory or more, it stops creating threads until the amount of used memory has dropped back below the threshold. This way you're always running with the maximum number of threads without exceeding the memory available.
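The check described above might look like the following sketch. Interpreting MAX_MEMORY_USAGE as a fraction of the maximum heap is an assumption, as are the class and method names; the readings are also only approximate, since unreachable objects count as used until the garbage collector reclaims them.

```java
class MemoryCheck {
    // Hypothetical threshold: stop starting work once 80% of the max heap is in use.
    static final double MAX_MEMORY_USAGE = 0.80;

    /** Heap currently in use: what the JVM has claimed minus what is still free inside it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** True while used heap is below the threshold. */
    static boolean canStartMoreWork() {
        return usedBytes() < MAX_MEMORY_USAGE * Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("used bytes: " + usedBytes());
        System.out.println("can start more work: " + canStartMoreWork());
    }
}
```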
Finally, instead of trying to create threads without starting them (because once you've created the Thread object, you're already taking up the memory), simply do one of the following:
Keep a queue of things that need to be done, and create a new thread for those things as memory becomes available
Use a "thread pool", let's say a max of 128 threads, as all your "workers". When a worker thread is done with a job, it simply checks the pending work queue to see if anything is waiting to be done, and if so, it removes that job from the queue and starts work.
I ran into a similar issue recently and I used the NotifyingBlockingThreadPoolExecutor solution described at this site:
http://today.java.net/pub/a/today/2008/10/23/creating-a-notifying-blocking-thread-pool-executor.html
The basic idea is that this NotifyingBlockingThreadPoolExecutor will execute tasks in parallel like the ThreadPoolExecutor, but if you try to add a task and there are no threads available, it will wait. It allowed me to keep the code with the simple "create all the tasks I need as soon as I need them" approach while avoiding huge overhead of waiting tasks instantiated all at once.
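The sketch below is not the NotifyingBlockingThreadPoolExecutor from that article; it only illustrates the same "submission blocks when the pool is busy" idea with a standard fixed pool and a Semaphore. The class name, the limit parameter and the in-flight counters are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingSubmitDemo {
    /** Submits `jobs` jobs with at most `limit` in flight; returns the peak in-flight count. */
    static int run(int jobs, int limit) {
        ExecutorService pool = Executors.newFixedThreadPool(limit);
        Semaphore slots = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            for (int i = 0; i < jobs; i++) {
                slots.acquire(); // blocks the submitter while every slot is taken
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                pool.execute(() -> {
                    try {
                        Thread.yield(); // stand-in for the real task
                    } finally {
                        inFlight.decrementAndGet();
                        slots.release(); // frees a slot for the next submission
                    }
                });
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + run(200, 4));
    }
}
```

The submitting loop stays simple, yet no more than `limit` task objects exist at any moment.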
It's unclear from your question, but if you're using straight threads instead of Executors and Runnables, you should be learning about java.util.concurrent package and using that instead: http://docs.oracle.com/javase/tutorial/essential/concurrency/executors.html
Just write code to do exactly what you want. Your question describes a recipe for a solution, just implement that recipe. Also, you should give serious thought to re-architecting. You only need a thread for things you want to do concurrently and you can't usefully do hundreds of things concurrently.
This is an alternative, lower-level solution than the above-mentioned NotifyingBlockingThreadPoolExecutor. It is probably not as ideal, but it will be simple to implement.
If you want a lot of threads on standby, then you ultimately need a mechanism for them to know when it's okay to "come to life". This sounds like a case for semaphores.
Make sure that each thread allocates no unnecessary memory before it starts working. Then implement as follows:
1) Create n threads on startup of the application, stored in a queue. You can base this n on the memory figures reported by Runtime (for example Runtime.getRuntime().maxMemory()), rather than hard-coding it.
2) Also create a semaphore with n-k permits. Again, base this on the amount of memory available.
3) now, have each of n-k threads periodically check if the semaphore has permits, calling Thread.sleep(...) in between checks, for example.
4) if a thread notices a permit, then update the semaphore, and acquire the permit.
If this satisfies your needs, you can go on to manage your threads using a more sophisticated polling or wait/lock mechanism later.

Multithreading help in Java

I'm new to Java, and I need some help working on this program. This is a small part of a large class project, and I must use multithreading.
Here's what I want to do algorithmically:
while (there is still input left, store chunk of input in <chunk>)
{
if there is not a free thread in my array then
wait until a thread finishes
else there is a free thread then
apply the free thread to <chunk> (which will do something to chunk and output it).
Note: The ordering of the chunks being output must be the same as input
}
So, the main things I don't know how to do:
How can I check whether or not there's a free thread in the array? I know that there is a function ThreadAlive, but it seems super inefficient to poll every single thread every time in my loop.
If there is no free thread, how can I wait until one has finished?
The ordering is important. How can I preserve the ordering in which the threads output? As in, the order of the output needs to match the order of the input. How can I guarantee this synchronization?
How do I even pass the chunk to my thread? Can I just use the Runnable interface to do this?
Any help with these four bullets is greatly appreciated. Since I'm a super noob, code samples would help significantly.
(side-note: Making an array of threads was just an idea of mine to handle the user defined number of threads. If you have a better way to handle this you're welcome to suggest it!)
It sounds like you basically have a producer/consumer model, which can be solved with an ExecutorService and a BlockingQueue. Here is a similar question with a similar answer:
producer/consumer work queues
As @altaiojok mentioned, you want to use an ExecutorService and BlockingQueue. The basic algorithm works like this:
ExecutorService executor = Executors.newFixedThreadPool(...); // or newCachedThreadPool, etc.
BlockingQueue<Future<?>> outputQueue = new LinkedBlockingQueue<Future<?>>();

// To be run by an input processing thread
void submitTasks() throws IOException {
    BufferedReader input = ... // assuming a file to read; this could be any input source (file, keyboard, network, etc.)
    String line = input.readLine();
    while (line != null) {
        outputQueue.add(executor.submit(new YourCallableImplementation(line)));
        line = input.readLine();
    }
}

// To be run by a different output processing thread
void processTaskOutput() {
    try {
        while (true) {
            Future<?> resultFuture = outputQueue.take();
            Object result = resultFuture.get(); // blocks until this task has finished
            // process the output (write to file, send to network, print to screen, etc.)
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
        // the task itself threw; the original exception is available via e.getCause()
    }
}
I'll leave it to you to figure out how to implement Runnable to make the input and output thread as well as how to implement Callable for the tasks you need to process.
I would suggest using commons-pool which offers pooling of threads so you can easily limit the number of used threads and it also offers some other helper methods.
Concerning the ordering: have a look at the synchronized keyword.
And I would suggest to have a look at the java tutorial (the part about concurrency): http://download.oracle.com/javase/tutorial/essential/concurrency/index.html
Streams might come in handy:
List<Chunk> chunks = new ArrayList<>();
//....
Function<Chunk, String> toWeightInfo = (chunk) -> "weight = "+(chunk.size()*chunk.prio());
List<String> results = chunks.parallelStream()
.map(toWeightInfo)
.collect(Collectors.toList());
System.out.println(results);
The parallel stream uses the JVM's common fork/join pool, whose size is derived from the number of available logical CPUs, and processes your items in parallel. Because the source list is ordered, collecting to a list also preserves the order of the results.
The parallel streams API hides all the complexity of assigning free threads to jobs and optimizations like work-stealing away from you. Just give it something to chew on and it will work its magic.
If you need to use a thread pool of a custom size, please refer to the
Custom thread pool in Java 8 parallel stream question.
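The approach usually given for that question is to start the stream from inside a task running in your own ForkJoinPool. Note that the stream then using that pool is observed implementation behavior rather than something the Stream documentation promises, so treat this as a sketch under that assumption; the class name and the squaring step are placeholders.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolStream {
    /** Squares 0..n-1 in a parallel stream started from a pool of the given size. */
    static List<Integer> squares(int n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The terminal operation is invoked from a task that runs inside `pool`.
            return pool.submit(() ->
                    IntStream.range(0, n).parallel()
                            .map(x -> x * x)                  // stand-in for the per-chunk work
                            .boxed()
                            .collect(Collectors.toList())     // keeps encounter order
            ).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(squares(5, 2)); // prints [0, 1, 4, 9, 16]
    }
}
```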
You might also have a look at this good Java 8 Stream Tutorial.
If your case is rather complex and you're streaming chunks into your program, and you've got multiple stages of work, where some must be serial and some can be parallel and some depend on each other, you might have a look at the Disruptor framework from LMAX.
Use ExecutorCompletionService and Future<T>. Together they provide a threadpool based task framework that takes care of all your concerns.
How can I check whether or not there's a free thread in the array? I know that there is a function ThreadAlive, but it seems super inefficient to poll every single thread every time in my loop.
You don't have to. The executor will do this for you in a (super) efficient manner. You just have to submit tasks to it and sit back.
If there is no free thread, how can I wait until one has finished?
Again, you really don't have to. This is taken care of by the executor.
The ordering is important. How can I preserve the ordering in which the threads output? As in, the order of the output needs to match the order of the input. How can I guarantee this synchronization?
This is a concern. If you want the processed output (of chunks, in your words) to arrive in the same order as the chunks appear in the initial array, you have to address a few points:
Is it just the order of arrival of the results that matters, or do the tasks themselves depend on the order in which they are processed? If it is the former, it is easily done, but if it is the latter, then you have problems (which I think are very hard things to start with, considering your admission of being new to Java, so I would just recommend more learning on your part before attempting this).
Assuming it is the former case, what you can do is this: submit the chunks to the executor in order, and each submission will give you a handle (called a Future<Result>) to that task's processed output. Store these handles in an ordered queue, and when you want the results, call get() on each Future in turn. Note that if some task in the middle of the order takes a long time to complete, then the results of the following tasks will also be delayed.
How do I even pass the chunk to my thread? Can I just use the Runnable interface to do this?
Create a Callable instance wrapping one chunk each into the instance. This represents your task that you will submit() to the ExecutorService.
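Put together, the Callable-per-chunk and ordered-Future ideas could look like this sketch. ChunkTask, the String chunk type and the upper-casing that stands in for real processing are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderedChunks {
    /** Hypothetical task: wraps one chunk and returns its processed form. */
    static class ChunkTask implements Callable<String> {
        private final String chunk;
        ChunkTask(String chunk) { this.chunk = chunk; }
        @Override
        public String call() { return chunk.toUpperCase(); } // stand-in for real work
    }

    /** Processes chunks on `threads` workers and returns results in input order. */
    static List<String> process(List<String> chunks, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                futures.add(executor.submit(new ChunkTask(chunk))); // list order = input order
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // waits for this chunk, even if later ones finished first
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("a", "b", "c"), 2)); // prints [A, B, C]
    }
}
```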

a "simple" thread pool in java

I'm looking for a simple object that will hold my work threads and I need it to not limit the number of threads, and not keep them alive longer than needed.
But I do need it to have a method similar to an ExecutorService.shutdown();
(Waiting for all the active threads to finish but not accepting any new ones)
so maybe a threadpool isn't what I need, so I would love a push in the right direction.
(as they are meant to keep the threads alive)
Further clarification of intent:
each thread is an upload of a file, and I have another process that modifies files, but it waits for the file to not have any uploads. by joining each of the threads. So when they are kept alive it locks that process. (each thread adds himself to a list for a specific file on creation, so I only join() threads that upload a specific file)
One way to do what you want is to use a Callable with a Future that returns the File object of a completed upload. Then pass the Future into another Callable that checks Future.isDone(), spins until it returns true, and then does whatever you need to do to the file. Your use case is not unique and fits very neatly into the capabilities of the java.util.concurrent package.
One interesting class is ExecutorCompletionService class which does exactly what you want with waiting for results then proceeding with an additional calculation.
A CompletionService that uses a supplied Executor to execute tasks. This class arranges that submitted tasks are, upon completion, placed on a queue accessible using take. The class is lightweight enough to be suitable for transient use when processing groups of tasks.
Usage Examples: Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers)
throws InterruptedException, ExecutionException
{
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
for (Callable<Result> s : solvers) { ecs.submit(s); }
int n = solvers.size();
for (int i = 0; i < n; ++i)
{
Result r = ecs.take().get();
if (r != null) { use(r); }
}
}
You don't want an unbounded ExecutorService
You almost never want to allow unbounded thread pools, as they actually can limit the performance of your application if the number of threads gets out of hand.
Your domain is limited by disk or network I/O or both, so a small thread pool would be sufficient. You are not going to want to try to read from hundreds or thousands of incoming connections with a thread per connection.
Part of your solution, if you are receiving more than a handful of concurrent uploads, is to investigate the java.nio package and read about non-blocking I/O as well.
Is there a reason that you don't want to reuse threads? Seems to me that the simplest thing would be to use ExecutorService anyway and let it reuse threads.
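As a rough illustration of reusing threads while still being able to wait for a group of uploads, invokeAll blocks until every task handed to it has finished, which can replace calling join() on each upload thread. The pool size of 4, the class name and the string result that stands in for a real upload are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class UploadDemo {
    /** Runs one upload task per target and returns how many had finished when invokeAll returned. */
    static int uploadAll(List<String> targets) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // hypothetical pool size
        try {
            List<Callable<String>> uploads = new ArrayList<>();
            for (String target : targets) {
                uploads.add(() -> "uploaded to " + target); // stand-in for the real upload
            }
            int finished = 0;
            // invokeAll returns only after every upload in the list has completed.
            for (Future<String> f : pool.invokeAll(uploads)) {
                if (f.isDone()) {
                    finished++;
                }
            }
            return finished; // safe to modify the file from here on
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            pool.shutdown(); // idle workers exit; no new uploads are accepted
        }
    }

    public static void main(String[] args) {
        System.out.println(uploadAll(Arrays.asList("server-a", "server-b", "server-c"))); // prints 3
    }
}
```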
