What is the use of allowCoreThreadTimeout( ) in ThreadPoolExecutor?

What is the use of allowCoreThreadTimeout( ) in ThreadPoolExecutor? - java

I have to use Java ThreadPoolExecutor in one of my component in Android. I was searching the use of allowCoreThreadTimeout( ).
I have read the Java & Android docs related.
But I didn't get any useful implementation scenario of the method.
Can someone please help me??

This method allows you to specify whether to terminate the core thread if there is no incoming task within the thread keep alive time. This is related to other configuration like, setCorePoolSize(), setKeepAliveTime(..)
When you are creating a thread pool and idle threads exist in the pool even though there is no task is running. It is costly to keep these thread alive. If you want get rid of these when you have no task to execute, this method is useful. You need to pass true value then they will be die after the keep alive time.
In Summary:
allowCoreThreadTimeOut(true) // Could save memory compromising performance
allowCoreThreadTimeOut(false) // Comsume memory but high performance

Permitting core threads to timeout allows an application to efficiently handle 'bursty' traffic. Consider a scenario where an enterprise application is idle during business hours but receives a burst of many requests at the end of the day.
One way of efficiently handling this scenario would be to enable allowCoreThreadTimeout() and set coreThreads = maxThreads to some appropriately high value. During that peak time your thread pool would scale up to handle the traffic and then afterwards scale back down to zero, freeing up server resources.

public void allowCoreThreadTimeOut(boolean value)
This is greatly explained in javadoc
Sets the policy governing whether core threads may time out and terminate if no tasks arrive within the keep-alive time, being replaced if needed when new tasks arrive.
When false, core threads are never terminated due to lack of incoming tasks. When true, the same keep-alive policy applying to non-core threads applies also to core threads. To avoid continual thread replacement, the keep-alive time must be greater than zero when setting true. This method should in general be called before the pool is actively used.

It's useful in situations when you can not call ThreadPoolExecutor.shutdown method explicitly at the end of your object lifecycle (framework doesn't provide "onClose" hook, for example), but you need to use ThreadPoolExecutor. In this case, without allowCoreThreadTimeout(true) method call, ThreadPoolExecutor's core threads will block GC of your object and cause memory leaks.
Here's how this scenario referenced in "Finalization" section of the ThreadPoolExecutor documentation:
A pool that is no longer referenced in a program AND has no remaining
threads will be shutdown automatically. If you would like to ensure
that unreferenced pools are reclaimed even if users forget to call
shutdown(), then you must arrange that unused threads eventually die,
by setting appropriate keep-alive times, using a lower bound of zero
core threads and/or setting allowCoreThreadTimeOut(boolean).

you can check parse implementation for android sdk, it's really nice.

Related

Prevent from slow job taking over a thread pool

I have a system where currently every job has it's own Runnable class and I pre defined a fixed number of threads for every job.
My understanding is that it is a wrong practice, because:
You have to tailor the number of threads with respect to the machine running the process.
Each threads can only take one type of job.
Would you agree on that? (current solution is wrong)
So, I'd like to use something like Java's ThreadPool instead. I was conflicted with an argument claiming that by doing so, slow jobs will take over most of the thread pool, leaving no place to the other jobs. Whereas, with the current solution, a fixed number of threads were assigned to the slow worker and it won't hurt the others.
(Notice that you can't know a-priori if a job will be "slow")
How can a system be both adaptive in the number of threads it uses, but at the same time not be bounded to the most slow job?

You could try getting the time it takes for the job to complete (With a hand-made Timer class of sorts. Then you normalize this value by dividing this time by the maximum time any given thread has taken. Finally, you multiply this number by a fixed number which varies depending on how many threads you want running per job per second. This will be the requested amount of threads this process should be using. You can adjust that according.
Edit: You can set minimum and maximum values that regulate how many threads a job is entitled to. You could alternatively request threads from a very spacious job when another thread enters the system.
Hope that helps!

It's more of a business problem. Let's say I am a telecom operator. I bar my subscribers from making outgoing calls when they don't clear their dues. When they make payment I clear a flag and in a second the subscriber can make calls. But a lot of other activities go on in my system like usage processing, billing, bill formatting etc.
Now let's assume I have a system wide common pool of threads and I started the billing of 50K subscribers. All my threads are now processing the relatively long running billing jobs and a huge queue is building up.
A poor customer now makes a payment and wants to make an urgent call. But I have no thread left in my pool to clear the flag. The customer had to wait for an hour before he can make the call. That's SLA breach.
What I should have done is create separate thread pools. If the call unblocking jobs are not very frequent and short, I can create a separate pool for it with core size 5 maybe. For billing jobs I'd rather create a pool with core size 25 and max-size 30.
So, my system limits won't anyway exceed because I know in even the worst situation I won't have more than 30 threads.
This will also make it easy to debug. If I have a different thread name pattern for each pool amd my system has some issues. I can easily take a thread dump and understand if the billing or the payment stuff is the culprit.
So, I think the existing design is based on some business use case which you need to thoroughly understand before proposing a solution.

Best practices with Akka in Scala and third-party Java libraries

I need to use memcached Java API in my Scala/Akka code. This API gives you both synchronous and asynchronous methods. The asynchronous ones return java.util.concurrent.Future. There was a question here about dealing with Java Futures in Scala here How do I wrap a java.util.concurrent.Future in an Akka Future?. However in my case I have two options:
Using synchronous API and wrapping blocking code in future and mark blocking:
Future {
blocking {
cache.get(key) //synchronous blocking call
}
}
Using asynchronous Java API and do polling every n ms on Java Future to check if the future completed (like described in one of the answers above in the linked question above).
Which one is better? I am leaning towards the first option because polling can dramatically impact response times. Shouldn't blocking { } block prevent from blocking the whole pool?

I always go with the first option. But i am doing it in a slightly different way. I don't use the blocking feature. (Actually i have not thought about it yet.) Instead i am providing a custom execution context to the Future that wraps the synchronous blocking call. So it looks basically like this:
val ecForBlockingMemcachedStuff = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(100)) // whatever number you think is appropriate
// i create a separate ec for each blocking client/resource/api i use
Future {
cache.get(key) //synchronous blocking call
}(ecForBlockingMemcachedStuff) // or mark the execution context implicit. I like to mention it explicitly.
So all the blocking calls will use a dedicated execution context (= Threadpool). So it is separated from your main execution context responsible for non blocking stuff.
This approach is also explained in a online training video for Play/Akka provided by Typesafe. There is a video in lesson 4 about how to handle blocking calls. It is explained by Nilanjan Raychaudhuri (hope i spelled it correctly), who is a well known author for Scala books.
Update: I had a discussion with Nilanjan on twitter. He explained what the difference between the approach with blocking and a custom ExecutionContext is. The blocking feature just creates a special ExecutionContext. It provides a naive approach to the question how many threads you will need. It spawns a new thread every time, when all the other existing threads in the pool are busy. So it is actually an uncontrolled ExecutionContext. It could create lots of threads and lead to problems like an out of memory error. So the solution with the custom execution context is actually better, because it makes this problem obvious. Nilanjan also added that you need to consider circuit breaking for the case this pool gets overloaded with requests.
TLDR: Yeah, blocking calls suck. Use a custom/dedicated ExecutionContext for blocking calls. Also consider circuit breaking.

The Akka documentation provides a few suggestions on how to deal with blocking calls:
In some cases it is unavoidable to do blocking operations, i.e. to put
a thread to sleep for an indeterminate time, waiting for an external
event to occur. Examples are legacy RDBMS drivers or messaging APIs,
and the underlying reason is typically that (network) I/O occurs under
the covers. When facing this, you may be tempted to just wrap the
blocking call inside a Future and work with that instead, but this
strategy is too simple: you are quite likely to find bottlenecks or
run out of memory or threads when the application runs under increased
load.
The non-exhaustive list of adequate solutions to the “blocking
problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router), making sure to configure a thread pool which is either
dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded
number of tasks of this nature will exhaust your memory or thread
limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the
hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they
occur as actor messages.
The first possibility is especially well-suited for resources which
are single-threaded in nature, like database handles which
traditionally can only execute one outstanding query at a time and use
internal synchronization to ensure this. A common pattern is to create
a router for N actors, each of which wraps a single DB connection and
handles queries as sent to the router. The number N must then be tuned
for maximum throughput, which will vary depending on which DBMS is
deployed on what hardware.

Common practices to avoid timeouts / starvation in Java?

I have a web-service that write files to disk and other stuff to database. The entire operation takes 1-2 seconds for each write.
The service can, bur that is unlikely, be called from several clients at the same time. Let´s assume that 20 clients call the webservice at the same time, the write operations must be synchronized. In that case, some clients can get a time out exception because they have to wait to many seconds.
Are there any good practices to solve these kind of situations? As it is now, the methods are synchronized (and that can cause the starvation/timeouts).
Should I let all threads get into the write method by removing the synchronized keyword and put their task into a task queue to avoid a timeout? Is that the correct way to get arount this?

Removing the synchronized and putting it into a task queue by itself will not help you (because that's effectively what the synchronized is doing for you). However if you respond to the web request as soon as you put it on the queue, then you will reduce your response fime. But at the cost of some reliability as the user will get a confirmation that the work is done and the work will not really have been done (the system could crash before the work is done).

Francis Upton's practice is indeed an accepted practice.
Another one, is making more fine grained synchronization. Instead of synchronizing all read/write methods of a class, you can synchronize access of the exact invariants that should be synchronized.
And yet even better, is to get rid of synchronization altogether. This is possible using the java.util.concurrent package. This package introduce new collections that use Non-Blocking Algorithms (implemented in java using Compare-Ans-Swap atomic instructions). These collections, such as ConcurrentHashMap, enable much better throughput when scaling.
You can read more about it in this article.

In this type of implementation (slow service under increasing load) you want to make as much as possible async, including the timeout processing (if server-based) and the required I/O. Don't hold up your client response threads waiting for either of these time-consuming operations, to preserve the server's responsiveness to new requests, but instead fire off the required operations (maybe to a dynamic thread pool) and let callbacks process the results, whether timeout, complete I/O, or errors.
Send the appropriate response depending on what happens first, but be prepared to roll back I/O if you send an error/timeout message and then a completed I/O arrives (due to a race condition between I/O and timer). This implies transactional semantics are required in the server.
This is an area that get increasingly complex as your load grows but good design early on should allow you to scale as load grows. Ideally the client servicing threads should not block at all.

Servers and threading models

I am troubled with the following concept:
Most books/docs describe how robust servers are multithreaded and that the most common approach is to start a new thread to serve each new client. E.g. a thread is dedicated to each new connection. But how is this actually implemented in big systems? If we have a server that accepts requests from 100000 clients, it has started 100000 threads? Is this realistic? Aren't there limits on how many threads can run in a server? Additionally the overhead of context switching and synchronization, doesn't it degrade performance? Is it implemented as a mix of queues and threads? In this case is the number of queues fixed? Can anybody enlighten me on this, and perhaps give me a good reference that describes these?
Thanks!

The common method is to use thread pools. A thread pool is a collection of already created threads. When a new request gets to the server it is assigned a spare thread from the pool. When the request is handled, the thread is returned to the pool.
The number of threads in a pool is configured depending on the characteristics of the application. For example, if you have an application that is CPU bound you will not want too many threads since context switches will decrease performance. On the other hand, if you have a DB or IO bound application you want more threads since much time is spent waiting. Hence, more threads will utilize the CPU better.
Google "thread pools" and you will for sure find much to read about the concept.

Also Read up on the SEDA pattern link , link

In addition to the answers above I should notice, that really high-performance servers with many incoming connections attempt not to spawn a thread per each connection but use IO Completion Ports, select() and other asynchronous techniques for working with multiple sockets in one thread. And of course special attention must be paid to ensure that problems with one request or one socket won't block other sockets in the same thread.
Also thread management consumes CPU time, so threads should not be spawned for each connection or each client request.

In most systems a thread pool is used. This is a pool of available threads that wait for incoming requests. The number of threads can grow to a configured maximum number, depending on the number of simultaneous requests that come in and the characteristics of the application.
If a requests arrives, an unoccupied thread is requested from the thread pool. This thread is then dedicated to handling the request until the request finishes. When that happens, the thread is returned to the thread pool to handle another request.
Since there is only a limited number of threads, in most server systems one should attempt to make the lifetime of requests as short as possible. The less time a request needs to execute, the sooner a thread can be reused for a new request.
If requests come in while all threads are occupied, most servers implement a queueing mechanism for requests. Of course the size of the queue is also limited, so when more requests arrive than can be queued, new requests will be denied.
One other reason for having a thread pool instead of starting threads for each request is that starting a new thread is an expensive operation. It's better to have a number of threads started beforehand and reusing them then starting new threads all the time.

To get network servers to handle lots of concurrent connections there are several approaches (mostly divided up in "one thread per connection" and "several connections per thread" categories), take a look at the C10K page, which is a great resource on this topic, discussing and comparing a lot of approaches and linking to further resources on them.

Creating 10k threads is not likely to be efficient in most cases, but can be done and would work.
If you needed to serve 10k clients at once, doing so on a single machine would be unlikely but possible.
Depending on the client side implementation, it may be that the 10,000 clients do not need to maintain an open TCP connection - depending on the purpose, the protocol design can greatly improve the efficiency of implementation.
I think the appropriate solution for high scale systems is probably extremely domain-specific, and if you wanted a suggestion you'd have to explain more about your problem domain.

Java concurrency - Should block or yield?

I have multiple threads each one with its own private concurrent queue and all they do is run an infinite loop retrieving messages from it. It could happen that one of the queues doesn't receive messages for a period of time (maybe a couple seconds), and also they could come in big bursts and fast processing is necessary.
I would like to know what would be the most appropriate to do in the first case: use a blocking queue and block the thread until I have more input or do a Thread.yield()?
I want to have as much CPU resources available as possible at a given time, as the number of concurrent threads may increase with time, but also I don't want the message processing to fall behind, as there is no guarantee of when the thread will be reescheduled for execution when doing a yield(). I know that hardware, operating system and other factors play an important role here, but setting that aside and looking at it from a Java (JVM?) point of view, what would be the most optimal?

Always just block on the queues. Java yields in the queues internally.
In other words: You cannot get any performance benefit in the other threads if you yield in one of them rather than just block.

You certainly want to use a blocking queue - they are designed for exactly this purpose (you want your threads to not use CPU time when there is no work to do).
Thread.yield() is an extremely temperamental beast - the scheduler plays a large role in exactly what it does; and one simple but valid implementation is to simply do nothing.

Alternatively, consider converting your implementation to use one of the managed ExecutorService implementations - probably ThreadPoolExecutor.
This may not be appropriate for your use case, but if it is, it removes the whole burden of worrying about thread management from your own code - and these questions about yielding or not simply vanish.
In addition, if better thread management algorithms emerge in future - for example, something akin to Apple's Grand Central Dispatch - you may be able to convert your application to use it with almost no effort.

Another thing that you could do is use the concurrent hash map for your queue. When you do a read it gives you a reference of the object you were looking for, so it is possible you my miss a message that was just put into the queue. But if all this is doing is listening for a message you will catch it the next iteration. It would be different if the messages could be updated by other threads. But there doesn't really seem to be a reason to block that I can see.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.