In the catalina.out I saw this message appearing in the log:
Maximum number of threads (200) created for connector with address null and port 80
Does this mean a process is hogging something or do I need to just increase the thread size?
After restarting tomcat, I had spam like this message:
"SEVERE: The web application [/MyServlet] is still processing a request that has yet to finish. This is very likely to create a memory leak. You can control the time allowed for requests to finish by using the unloadDelay attribute of the standard Context implementation."
Is there a way to solve my situation?
Yes, it sounds like you've got some request handler which never completes. Each time it's invoked, it'll basically soak up another thread, until you've run out of threads in the pool.
You need to work out which request is failing to complete, and fix the code. If you can take a dump of the stacks of all threads, it's likely to become clear which requests are failing to complete.
Related
I have created a spring boot web application and deployed war of the same to tomcat container.
The application connects to mongoDB using Async connections. I am using mongodb-driver-async library for that.
At startup everything works fine. But as soon as load increases, It shows following exception in DB connections:
org.springframework.web.context.request.async.AsyncRequestTimeoutException: null
at org.springframework.web.context.request.async.TimeoutDeferredResultProcessingInterceptor.handleTimeout(TimeoutDeferredResultProcessingInterceptor.java:42)
at org.springframework.web.context.request.async.DeferredResultInterceptorChain.triggerAfterTimeout(DeferredResultInterceptorChain.java:75)
at org.springframework.web.context.request.async.WebAsyncManager$5.run(WebAsyncManager.java:392)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest.onTimeout(StandardServletAsyncWebRequest.java:143)
at org.apache.catalina.core.AsyncListenerWrapper.fireOnTimeout(AsyncListenerWrapper.java:44)
at org.apache.catalina.core.AsyncContextImpl.timeout(AsyncContextImpl.java:131)
at org.apache.catalina.connector.CoyoteAdapter.asyncDispatch(CoyoteAdapter.java:157)
I am using following versions of software:
Spring boot -> 1.5.4.RELEASE
Tomcat (installed as standalone binary) -> apache-tomcat-8.5.37
Mongo DB version: v3.4.10
mongodb-driver-async: 3.4.2
As soon as I restart the tomcat service, everything starts working fine.
Please help, what could be the root cause of this issue.
P.S.: I am using DeferredResult and CompletableFuture to create Async REST API.
I have also tried using spring.mvc.async.request-timeout in application and configured asynTimeout in tomcat. But still getting same error.
It's probably obvious that Spring is timing out your requests and throwing AsyncRequestTimeoutException, which returns a 503 back to your client.
Now the question is, why is this happening? There are two possibilities.
These are legitimate timeouts. You mentioned that you only see the exceptions when the load on your server increases. So possibly your server just can't handle that load and its performance has degraded to the point where some requests can't complete before Spring times them out.
The timeouts are caused by your server failing to send a response to an asynchronous request due to a programming error, leaving the request open until Spring eventually times it out. It's easy for this to happen if your server doesn't handle exceptions well. If your server is synchronous, it's okay to be a little sloppy with exception handling because unhandled exceptions will propagate up to the server framework, which will send a response back to the client. But if you fail to handle an exception in some asynchronous code, that exception will be caught elsewhere (probably in some thread pool management code), and there's no way for that code to know that there's an asynchronous request waiting on the result of the operation that threw the exception.
It's hard to figure out what might be happening without knowing more about your application. But there are some things you could investigate.
First, try looking for resource exhaustion.
Is the garbage collector running all the time?
Are all CPUs pegged at 100%?
Is the OS swapping heavily?
If the database server is on a separate machine, is that machine showing signs of resource exhaustion?
How many connections are open to the database? If there is a connection pool, is it maxed out?
How many threads are running? If there are thread pools in the server, are they maxed out?
If something's at its limit then possibly it is the bottleneck that is causing your requests to time out.
Try setting spring.mvc.async.request-timeout to -1 and see what happens. Do you now get responses for every request, only slowly, or do some requests seem to hang forever? If it's the latter, that strongly suggests that there's a bug in your server that's causing it to lose track of requests and fail to send responses. (If setting spring.mvc.async.request-timeout appears to have no effect, then the next thing you should investigate is whether the mechanism you're using for setting the configuration actually works.)
A strategy that I've found useful in these cases is to generate a unique ID for each request and write the ID along with some contextual information every time the server either makes an asynchronous call or receives a response from an asynchronous call, and at various checkpoints within asynchronous handlers. If requests go missing, you can use the log information to figure out the request IDs and what the server was last doing with that request.
A similar strategy is to save each request ID into a map in which the value is an object that tracks when the request was started and what your server last did with that request. (In this case your server is updating this map at each checkpoint rather than, or in addition to, writing to the log.) You can set up a filter to generate the request IDs and maintain the map. If your filter sees the server send a 5xx response, you can log the last action for that request from the map.
Hope this helps!
Asynchroneus tasks are arranged in a queue(pool) which is processed in parallel depending on the number of threads allocated. Not all asynchroneus tasks are executed at the same time. Some of them are queued. In a such system getting AsyncRequestTimeoutException is normal behaviour.
If you are filling up the queues with asynchroneus tasks that are unable to execute under pressure. Increasing the timeout will only delay the problem. You should focus instead on the problem:
Reduce the execution time(through various optimizations) of asynchroneus task. This will relax the pooling of async tasks. It oviously requires coding.
Increase the number of CPUSs allocated in order to be able to run more efficiently the parallel tasks.
Increase the number of threads servicing the executor of the driver.
Mongo Async driver is using AsynchronousSocketChannel or Netty if Netty is found in the classpath. In order to increase the number of the worker threads servicing the async comunication you should use:
MongoClientSettings.builder()
.streamFactoryFactory(NettyStreamFactoryFactory(io.netty.channel.EventLoopGroup eventLoopGroup,
io.netty.buffer.ByteBufAllocator allocator))
.build();
where eventLoopGroup would be io.netty.channel.nio.NioEventLoopGroup(int nThreads))
on the NioEventLoopGroup you can set the number of threads servicing your async comunication
Read more about Netty configuration here https://mongodb.github.io/mongo-java-driver/3.2/driver-async/reference/connecting/connection-settings/
I'm using Hibernate and Tomcat JDBC connection pool to use my MySQL DB.
When, from any reason, the DB is down, my application got stuck.
For example, I have REST resources (with Jersey), they are not getting any requests.
I also using quartz for schedule tasks, they aren't running as well.
When I start my DB again, everything goes back to normal.
I don't even know where to start looking, anyone has an idea?
Thanks
What must be happening is your application must be receiving request but there must be some exception while establishing database connection .see the logs.
try some flow where your are not doing any database operation. It must work fine.
When the application has hung, get a thread dump of the JVM, this will tell you the state of each thread and, rather than guessing as to the cause, you'll have concrete evidence.
Having said that, and going with the guess work approach, a lot comes down to how your connection pool is configured and the action your code takes when it receives the SQLException. If the application is totally hung, then my first port of call would be to find out if the db accessing threads are in a WAIT state, or even deadlocked. The thread dump will help you to determine if that is the case.
see kill -3 to get java thread dump for how to take a thread dump
We're using Glassfish 3.0.1 and experiencing very long response times; in the order of 5 minutes for 25% of our POST/PUT requests, by the time the response comes back the front facing load balancer has timed out.
My theory is that the requests are queuing up and waiting for an available thread.
The reason I think this is because the access logs reveal that the requests are taking a few seconds to complete however the time at which the requests are being executed are five minutes later than I'd expect.
Does anyone have any advice for debugging what is going on with the thread pools? or what the optimum settings should be for them?
Is it required to do a thread dump periodically or will a one off dump be sufficient?
At first glance, this seems to have very little to do with the threadpools themselves. Without knowing much about the rest of your network setup, here are some things I would check:
Is there a dead/nonresponsive node in the load balancer pool? This can cause all requests to be tried against this node until they fail due to timeout before being redirected to the other node.
Is there some issue with initial connections between the load balancer and the Glassfish server? This can be slow or incorrect DNS lookups (though the server should cache results), a missing proxy, or some other network-related problem.
Have you checked that the clocks are synchronized between the machines? This could cause the logs to get out of sync. 5min is a pretty strange timeout period.
If all these come up empty, you may simply have an impedance mismatch between the load balancer and the web server and you may need to add webservers to handle the load. The load balancer should be able to give you plenty of stats on the traffic coming in and how it's stacking up.
Usually you get this behaviour if you configured not enough worker threads in your server. Default values range from 15 to 100 threads in common webservers. However if your application blocks the server's worker threads (e.g. by waiting for queries) the defaults are way too low frequently.
You can increase the number of workers up to 1000 without problems (assure 64 bit). Also check the number of workerthreads (sometimes referred to as 'max concurrent/open requests') of any in-between server (e.g. a proxy or an apache forwarding via mod_proxy).
Another common pitfall is your software sending requests to itself (e.g. trying to reroute or forward a request) while blocking an incoming request.
Taking threaddump is the best way to debug what is going on with the threadpools. Please take 3-4 threaddumps one after another with 1-2 seconds gap between each threaddump.
From threaddump, you can find the number of worker threads by their name. Find out long running threads from the multiple threaddumps.
You may use TDA tool (http://java.net/projects/tda/downloads/download/tda-bin-2.2.zip) for analyzing threaddumps.
We've got a normal-ish server stack including BlazeDS, Tomcat and Hibernate.
We'd like to arrange things such that if certain errors (especially AssertionError) are thrown, the current thread is considered to be in an unknown state and won't be used for further HTTP requests. (Partly because we're storing some things, such as Hibernate transaction session, in thread-local storage. Now, we can catch throwables and make sure to roll back transactions and rethrow, but there's no guarantee about what other code may have left who knows what in the thread-local storage.)
Tomcat with the default thread pool behavior reuses threads. We tried specifying our own executor, which seems to be the most specific method of changing its thread pool behavior, but it doesn't always call Executor.execute() with a new task for each request. (Most likely it reuses the same execution context for all requests in the same HTTP connection.)
One option is to disable keepalive, so that there's only one request per HTTP connection, but that's ugly.
Anyway, I'd like to know. Is there a way to tell Tomcat not to reuse a thread, or to kill or exit the thread so that Tomcat is forced to create a new one?
(From the Tomcat source, it appears Tomcat will close the connection and abandon the task/thread after sending a HTTP 500 response, but I don't know how to get BlazeDS to generate a 500 response; that's another angle I'd like to know more about.)
I would strongly suggest simply getting rid of your use of thread-local storage, or at least coming up with a method to clear the thread-local storage when a request first enters the pipeline (with a <filter> in web.xml, for example).
Having to reconfigure something basic about Tomcat to get it to work with your app points to a code smell.
I'm working through the 'Simple Point-to-Point Example' section of the Sun JMS tutorial (sender source, receiver source), using Glassfish as my JMS provider. I've set up the QueueConnectionFactory and Queue in the Glassfish admin UI, and added the relevant JARs to my classpath and the receiver is receiving the messages sent by the sender.
However, neither sender nor receiver terminate. The main thread exits normally (after successfully calling queueConnection.close()) but two non-daemon threads are left hanging around:
iMQReadChannel-0
imqConnectionFlowControl-0
It seems (from this java.net thread) that the reason is that queueConnection.close() just returns the connection to the pool, rather than really closing it. I can't find any way to tell the pool to shutdown, so the only option I'm left with is System.exit(), which feels wrong.
I've tried setting the minimum pool size to 0, the maximum pool size to 1 and the idle timeout to 10 seconds but it seems to make no difference. Even when I just lookup the connection factory and don't ask for a connection, these two threads are still started and don't terminate.
Any help much appreciated!
Why don't you simply terminate with a System.exit(0)? Given the sample, the current behavior is correct (a Java program terminates when all non-daemon threads end).
Maybe you can have the samples shutting down properly by playing with client library's properties (idle time, etc...), but it seems others ( http://www.nabble.com/Simple-JMS-Client-doesn%27t-quit-td15662753.html) still experience the very same problem (and, anyway, i still don't understand what the point is).
Good news for us. "Will not fixed"
http://java.net/jira/browse/GLASSFISH-1429?focusedCommentId=85555&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_85555