GlassFish server log flooding with "Interrupting idle Thread" - java

We have deployed a Spring MVC application on GlassFish Server Open Source Edition 3.1.2.2. The server log is on warning level, and after deployment I observed that lots of server.log files are generated; almost 95-97% of the log entries are:
[#|2015-10-15T20:19:20.995+0530|WARNING|glassfish3.1.2|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=13;_ThreadName=Thread-2;|GRIZZLY0023: Interrupting idle Thread: http-thread-pool-80(7).|#]
While googling I came across an issue posted on JIRA with a patch attached to it. I haven't tried that patch yet, but I wanted to understand the reason behind this WARNING. Some doubts are on my mind:
Is this warning safe to ignore?
Why is the GlassFish service interrupting threads? What is actually happening inside GlassFish?
How can I avoid generating this warning, and what will be the impact if I ignore it?

1) If your CPU usage is high, it is not safe to ignore, as it could cause the death of your server.
2) Most probably you see this warning because a servlet/webapp takes longer than 15 minutes (the default timeout) to process a request.
3) If that is acceptable to you, you will need to raise the request timeout or disable it. On the other hand, disabling it is not safe if long processing times are not something you actually expect.
Try this patch or check your web app. If you provide more information about the servlet/webapp that is causing this issue, it would be easier to answer.
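For reference, the timeout in question should be the request-timeout-seconds attribute of the HTTP protocol in GlassFish 3.x, settable via asadmin; a sketch (the protocol name http-listener-1 is an assumption, check your domain's actual configuration):
asadmin set configs.config.server-config.network-config.protocols.protocol.http-listener-1.http.request-timeout-seconds=-1
Here -1 should disable the timeout, and any positive value is the timeout in seconds.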


Solution to remove lock after Application server redeployment

Before redeploying the application war, I checked the xd.lck file from one of the environment paths:
Private property of Exodus: 20578#localhost
jetbrains.exodus.io.LockingManager.lock(LockingManager.kt:89)
I'm testing from both Nginx Unit and Payara server to eliminate the possibility that this is an isolated case with Unit.
And process 20578 shows from htop:
20578 root 20 0 2868M 748M 7152 S 0.7 75.8 14:05.75 /usr/lib/jvm/zulu-8-amd64/bin/java -cp /
After redeployment finished successfully, accessing the web application throws:
at jetbrains.exodus.log.Log.tryLock(Log.kt:799)
at jetbrains.exodus.log.Log.<init>(Log.kt:120)
at jetbrains.exodus.env.Environments.newLogInstance(Environments.java:142)
at jetbrains.exodus.env.Environments.newLogInstance(Environments.java:121)
at jetbrains.exodus.env.Environments.newLogInstance(Environments.java:10
at java.lang.Thread.run(Thread.java:748)
And checking the same xd.lck file shows the same content. That is to say, the lock is not immediately released, contrary to what is described here.
My assumption for this specific case with Payara Server (which is based on GlassFish) is that the server does not kill the previous process even after redeployment has completed. Perhaps this is for "zero-downtime" redeployment; I'm not sure, Payara experts can correct me here.
Checking with htop the process 20578 is still running even after the redeployment.
As with Xodus, since most application servers behave this way, what would be the best solution and/or workaround so we don't need to manually delete all lock files of each environment (if they can even be deleted) every time we redeploy?
The solution is for the Java application to look up the process locking the file and then send it, for example, a kill -15 signal, so that the Java process can handle the signal gracefully and close its environments:
// Close all PersistentEntityStores so their locks are released
entityStoreMap.forEach((dir, entityStore) -> {
    entityStore.getEnvironment().close();
    entityStore.close();
});
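For kill -15 to trigger that cleanup, the JVM has to run the close logic when it receives SIGTERM. One way, a minimal sketch reusing the entityStoreMap from above, is to register a JVM shutdown hook once at startup:
// Runs on kill -15 (SIGTERM) as well as on normal JVM exit
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    entityStoreMap.forEach((dir, entityStore) -> {
        entityStore.getEnvironment().close();   // same close logic as above
        entityStore.close();
    });
}));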

Spring Boot + Tomcat 8.5 + MongoDB, AsyncRequestTimeoutException

I have created a Spring Boot web application and deployed its war to a Tomcat container.
The application connects to MongoDB using async connections; I am using the mongodb-driver-async library for that.
At startup everything works fine, but as soon as load increases, the following exception shows up in DB connections:
org.springframework.web.context.request.async.AsyncRequestTimeoutException: null
at org.springframework.web.context.request.async.TimeoutDeferredResultProcessingInterceptor.handleTimeout(TimeoutDeferredResultProcessingInterceptor.java:42)
at org.springframework.web.context.request.async.DeferredResultInterceptorChain.triggerAfterTimeout(DeferredResultInterceptorChain.java:75)
at org.springframework.web.context.request.async.WebAsyncManager$5.run(WebAsyncManager.java:392)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest.onTimeout(StandardServletAsyncWebRequest.java:143)
at org.apache.catalina.core.AsyncListenerWrapper.fireOnTimeout(AsyncListenerWrapper.java:44)
at org.apache.catalina.core.AsyncContextImpl.timeout(AsyncContextImpl.java:131)
at org.apache.catalina.connector.CoyoteAdapter.asyncDispatch(CoyoteAdapter.java:157)
I am using following versions of software:
Spring boot -> 1.5.4.RELEASE
Tomcat (installed as standalone binary) -> apache-tomcat-8.5.37
Mongo DB version: v3.4.10
mongodb-driver-async: 3.4.2
As soon as I restart the Tomcat service, everything starts working fine again.
Please help: what could be the root cause of this issue?
P.S.: I am using DeferredResult and CompletableFuture to create async REST APIs.
I have also tried setting spring.mvc.async.request-timeout in the application and configured asyncTimeout in Tomcat, but I still get the same error.
It's probably obvious that Spring is timing out your requests and throwing AsyncRequestTimeoutException, which returns a 503 back to your client.
Now the question is, why is this happening? There are two possibilities.
These are legitimate timeouts. You mentioned that you only see the exceptions when the load on your server increases. So possibly your server just can't handle that load and its performance has degraded to the point where some requests can't complete before Spring times them out.
The timeouts are caused by your server failing to send a response to an asynchronous request due to a programming error, leaving the request open until Spring eventually times it out. It's easy for this to happen if your server doesn't handle exceptions well. If your server is synchronous, it's okay to be a little sloppy with exception handling because unhandled exceptions will propagate up to the server framework, which will send a response back to the client. But if you fail to handle an exception in some asynchronous code, that exception will be caught elsewhere (probably in some thread pool management code), and there's no way for that code to know that there's an asynchronous request waiting on the result of the operation that threw the exception.
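With the DeferredResult and CompletableFuture combination mentioned in the question, this means every completion path, including the exceptional one, has to complete the DeferredResult. A minimal sketch of the pattern (findUserAsync is a hypothetical stand-in for the real async Mongo query):
import java.util.concurrent.CompletableFuture;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

@RestController
public class UserController {

    @GetMapping("/users/{id}")
    public DeferredResult<String> getUser(@PathVariable String id) {
        DeferredResult<String> result = new DeferredResult<>();
        findUserAsync(id).whenComplete((user, ex) -> {
            if (ex != null) {
                result.setErrorResult(ex);   // without this branch the request hangs until Spring times it out
            } else {
                result.setResult(user);
            }
        });
        return result;
    }

    // Hypothetical async lookup; in the real app this would be the async Mongo call
    private CompletableFuture<String> findUserAsync(String id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id);
    }
}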
It's hard to figure out what might be happening without knowing more about your application. But there are some things you could investigate.
First, try looking for resource exhaustion.
Is the garbage collector running all the time?
Are all CPUs pegged at 100%?
Is the OS swapping heavily?
If the database server is on a separate machine, is that machine showing signs of resource exhaustion?
How many connections are open to the database? If there is a connection pool, is it maxed out?
How many threads are running? If there are thread pools in the server, are they maxed out?
If something's at its limit then possibly it is the bottleneck that is causing your requests to time out.
Try setting spring.mvc.async.request-timeout to -1 and see what happens. Do you now get responses for every request, only slowly, or do some requests seem to hang forever? If it's the latter, that strongly suggests that there's a bug in your server that's causing it to lose track of requests and fail to send responses. (If setting spring.mvc.async.request-timeout appears to have no effect, then the next thing you should investigate is whether the mechanism you're using for setting the configuration actually works.)
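For the record, that property goes in application.properties (or the YAML equivalent); a one-line example of the setting suggested above:
spring.mvc.async.request-timeout=-1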
A strategy that I've found useful in these cases is to generate a unique ID for each request and write the ID along with some contextual information every time the server either makes an asynchronous call or receives a response from an asynchronous call, and at various checkpoints within asynchronous handlers. If requests go missing, you can use the log information to figure out the request IDs and what the server was last doing with that request.
A similar strategy is to save each request ID into a map in which the value is an object that tracks when the request was started and what your server last did with that request. (In this case your server is updating this map at each checkpoint rather than, or in addition to, writing to the log.) You can set up a filter to generate the request IDs and maintain the map. If your filter sees the server send a 5xx response, you can log the last action for that request from the map.
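A minimal sketch of such a filter (class and attribute names are illustrative; note the comment about async requests, where the chain returns early):
import java.io.IOException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class RequestTrackingFilter implements Filter {

    // request ID -> last recorded action; handlers update this at each checkpoint
    private static final ConcurrentMap<String, String> LAST_ACTION = new ConcurrentHashMap<>();

    public static void checkpoint(String requestId, String action) {
        LAST_ACTION.put(requestId, action);
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        String requestId = UUID.randomUUID().toString();
        req.setAttribute("requestId", requestId);   // handlers read this and call checkpoint()
        checkpoint(requestId, "request received");
        chain.doFilter(req, res);
        // For async requests doFilter returns before the response is written, so
        // entries for requests that never complete remain in LAST_ACTION: dump the
        // map periodically (or from an AsyncListener) to see where they got stuck.
    }

    @Override public void init(FilterConfig filterConfig) { }

    @Override public void destroy() { }
}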
Hope this helps!
Asynchronous tasks are arranged in a queue (pool) which is processed in parallel depending on the number of threads allocated. Not all asynchronous tasks are executed at the same time; some of them are queued. In such a system, getting AsyncRequestTimeoutException is normal behaviour.
If you are filling up the queues with asynchronous tasks that are unable to execute under pressure, increasing the timeout will only delay the problem. You should focus instead on the problem itself:
Reduce the execution time (through various optimizations) of the asynchronous tasks. This will relax the pooling of async tasks. It obviously requires coding.
Increase the number of CPUs allocated in order to run the parallel tasks more efficiently.
Increase the number of threads servicing the executor of the driver.
The Mongo async driver uses AsynchronousSocketChannel, or Netty if Netty is found on the classpath. To increase the number of worker threads servicing the async communication you should use:
EventLoopGroup eventLoopGroup = new NioEventLoopGroup(nThreads);  // io.netty.channel.nio.NioEventLoopGroup
MongoClientSettings settings = MongoClientSettings.builder()      // com.mongodb.async.client.MongoClientSettings
        .streamFactoryFactory(new NettyStreamFactoryFactory(eventLoopGroup, PooledByteBufAllocator.DEFAULT))
        .build();
where nThreads passed to the NioEventLoopGroup sets the number of threads servicing your async communication; NettyStreamFactoryFactory lives in com.mongodb.connection.netty, and PooledByteBufAllocator.DEFAULT is Netty's default allocator.
Read more about Netty configuration here https://mongodb.github.io/mongo-java-driver/3.2/driver-async/reference/connecting/connection-settings/

Tomcat restarts with errors (exit 143), runs and then fails after time

This is my first time asking a question on Stack Overflow. I recently configured an Ubuntu 16.04 virtual private server to host a web application. I run nginx in front of a Tomcat server that reads and writes to a MySQL database. The application runs fine except for the fact that Tomcat restarts itself once in a while, which results in a 500 error stemming from a "broken pipe" whenever anyone tries to log in (i.e. make a connection to the database).
I will post an image of the 500 next time it happens. I went into my VPS and looked at the Tomcat restart message. This is what I see: Tomcat status message.
I also did a little diving into the Tomcat logs, and this is a log file that corresponds with that restart time: Tomcat log file
I did some research to try to solve this myself, but with no success. I believe that exit=143 means the process was terminated by another program or by the system itself. I have also moved mysql-connector-java.jar around: I read that it should be located in the Tomcat/lib directory and not in the WEB-INF of the web application. Perhaps I need to configure other settings.
Any help or direction would be much appreciated. I've fought this issue for a week and learned much, but accomplished little.
Thanks
Look at the timeline. It starts at 19:49:23.766 in the Tomcat log with this message:
A valid shutdown command was received via the shutdown port. Stopping the Server instance.
Exit code 143 is a result of that shutdown and doesn't indicate anything by itself.
The question you need answered is: who sent that shutdown command, and why?
On a side note: the earlier messages indicate that Tomcat lost its connection to the database, and that you didn't configure a validation query. You should always configure one, since database connections in the connection pool go stale, and that needs to be detected.
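For example, if the webapp uses Tomcat's JDBC pool (tomcat-jdbc), validation can be configured along these lines (URL and driver are placeholders; the same attribute names also work on a Resource element in context.xml):
import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class PoolConfig {
    public static DataSource createDataSource() {
        PoolProperties p = new PoolProperties();
        p.setUrl("jdbc:mysql://localhost:3306/mydb");   // placeholder URL
        p.setDriverClassName("com.mysql.jdbc.Driver");
        p.setValidationQuery("SELECT 1");    // probe connections before handing them out
        p.setTestOnBorrow(true);             // validate each connection as it is borrowed
        p.setValidationInterval(30000);      // but at most once every 30 seconds per connection
        return new DataSource(p);
    }
}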
Theory: do you have some monitoring service running that tests whether your application is up? Does that monitoring detect a timed-out database connection, classify that as a hung webapp, and auto-restart Tomcat?
While I don't think I can see to the core of the problem you have with your overall setup, given the small excerpt of your log files, one thing strikes the eye. In the Tomcat log, there is the line
A valid shutdown command was received via the shutdown port. Stopping the server instance.
This explains why the server was restarted. Someone (some external process, a malicious attacker, a script, or whatever; it could be anything, depending on the setup of your server) sent a shutdown command to Tomcat's shutdown port (8005 by default), which made Tomcat shut down.
Refer to OWASP's recommendations for securing a Tomcat server instance to fix this possible security hole.
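In particular, if nothing on the machine legitimately needs the shutdown port, it can be disabled entirely in conf/server.xml by setting the port to -1:
<Server port="-1" shutdown="SHUTDOWN">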
Regarding the ostensible Hibernate problems you have, I don't get enough information from your logs to make a useful statement. But you can leave the MySQL jar in Tomcat/lib, since it is not the root cause of your problem.

Tomcat 7 stops responding

I have a REST (Jersey) web server using Tomcat 7 + Hibernate + Spring + Ehcache (as local cache).
The server randomly stops responding. I haven't captured (reproduced) the hanging behavior, so it is hard to tell exactly when the server hangs. Once the server hangs, if I send a request, the request can't even hit the server (I don't see any request coming in in the application log file).
I understand these are very generic questions, but where do I need to look to find out more?
After spending quite some time googling, I found out that I need to look at the catalina.out log file and examine a heap dump for possible deadlocks, JDBC connections, etc.
Where/how do I obtain the heap dump? And where do I see any logs for JDBC connections?
I am using Spring + Hibernate and use transaction manager to manage the transaction. Is there any particular configuration I need to specify in the data source?
Very hard to give any definitive advice with such a generic question.
Before going for a heap dump, I would start with a thread dump using the jstack tool found in a JDK install.
This could give you an idea of what your Tomcat is doing (or not doing) when it stops responding.
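For example (the pid is illustrative):
jps -l                    # find the pid of the Tomcat JVM
jstack -l 12345 > thread-dump.txt
Take a few dumps a couple of seconds apart; threads sitting in the same stack frames across dumps, or a deadlock report at the bottom of the output, are the first things to look for.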

failed HTTP requests on Tomcat

My web app is running on 64-bit Java 6.0.23, Tomcat 6.0.29 (with Apache Portable Runtime 1.4.2), on Linux (CentOS). Tomcat's JAVA_OPTS includes -Xincgc, which is supposed to help prevent long garbage collections.
The app is under heavy load and has intermittent failures, and I'd like to troubleshoot it.
Here is the symptom: Very intermittently, an HTTP client will send an HTTP request to the web app and get an empty response back.
The app doesn't use a database, so it's definitely not a problem with JDBC connections. So I figure the problem is one of: memory (perhaps long garbage collections), running out of threads, or running out of file descriptors.
I used javamelody to view the number of threads that are being used, and it seems that maxThreads is set high enough to not be running out of threads. Similarly, we have the number of available of file descriptors set to a very high number.
The app does use a lot of memory. Does it seem like memory is probably the culprit here, or is there something else that I might be overlooking?
I guess my main confusion, though, is why garbage collections would cause HTTP requests to fail. Intuitively, I would guess that a long garbage collection might cause an HTTP request to take a long time to run, but I would not guess that a long garbage collection would cause an HTTP request to fail.
Additional info in response to Jon Skeet's comments...
The client is definitely not timing out. The empty response happens fairly quickly. When it fails, there is no data and no HTTP headers.
I very much doubt that garbage collection is responsible for the issue.
You really really need to find out exactly what this "empty response" consists of:
Does the server just chop the connection?
Does the client perhaps time out?
Does the server give a valid HTTP response but with no data?
Each of these could suggest very different ways of finding out what's going on. Determining the failure mode should be your primary concern, IMO. Until you know that, it's complete guesswork.
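One simple way to pin down the failure mode (assuming you can reach the server directly, bypassing any proxy) is to reproduce the request with curl:
curl -v http://your-server:8080/some/endpoint
With -v you see exactly what comes back: a connection that is simply dropped shows up as "Empty reply from server", while a valid HTTP response with no body still prints a status line and headers.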
