I have a browser game built on a Java web server using JSP.
I added a new module that keeps some data in the HTTP session object. However, after it runs for 3-4 hours, the server suddenly stops working and freezes. When I check the error log, I don't see any exception thrown.
The server has 50-60 users online at any given moment.
I monitored the server using VisualVM, and here is the result after 4 hours, up to the point where it stops:
I set the max memory to 1024 MB. As you can see, the problem is not memory.
What I notice is that by the time the server stops, the thread count has increased.
According to the screenshot, should I suspect the HttpSession object? Why does the server stop responding?
It looks like a system limitation or a deadlock.
Your thread graph looks problematic: the number of live threads is high and never decreases. A web application should be stateless. The live thread count should rise when requests arrive but also drop when the requests are finished.
I do not have the impression that this is the case in your application.
MGorgon is right.
You should also check "Deadlock detection" in jconsole.
If you use JDK 6 or later, you could use ThreadMXBean. It has a findDeadlockedThreads() method and other useful methods for this purpose.
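A minimal sketch of such a check, assuming it runs inside the affected JVM (for example from a scheduled task or an admin-only JSP). The class name `DeadlockCheck` and the method `describeDeadlocks` are made up for illustration; only the `java.lang.management` calls are standard.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {

    /** Returns a description of all deadlocked threads, or an empty string if there are none. */
    public static String describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers both monitor locks and java.util.concurrent locks; returns null when no deadlock exists.
        long[] ids = mx.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // true, true: also report locked monitors and locked ownable synchronizers.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have died in the meantime
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlock detected" : report);
    }
}
```

An empty result only rules out a lock cycle. Threads that are all waiting on an exhausted pool or a slow external resource will not show up here.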
Anyway, if it is not a deadlock, I advise you to look in the system log, whatever your OS is, to get more information about the cause of the problem. You may find something interesting there.
Related
I have a Java .jar file that I launch on an AWS instance in detached mode, so when I exit the SSH session, it still runs.
The app does some network stuff and is expected to run for days until it finishes its task.
I have added logging all over the app, including a log statement at the end of the main method. I also added a global try/catch with logging in the catch section.
Still, after some days I log in over SSH and see that the app has just stopped running. There are no exceptions, and the main method did not complete, because the log statement at the end did not trigger. It seems that the process was just killed in the middle of working. Sometimes it works for 5 hours, sometimes for 3-4 days without stopping.
I have no idea what could be the cause of this. I expect the Java process to run until it finishes or until it crashes. Am I missing something?
Update:
It is an AWS t2.micro, I think, the free tier one. It runs Ubuntu 18.04.3 LTS.
You need to monitor the server and the application. The first thing to look at is your instance's CloudWatch statistics for any CPU or memory spikes (note that memory metrics are only reported if the CloudWatch agent is installed on the instance). If you find a spike, you know what you need to fix if you want to run your application on a micro instance. For further reading:
Monitoring Your Instances Using CloudWatch
Alternatively, you can collect and dump the Java process statistics regularly while the application is running. This can give insight into heap, stack and CPU usage. Check this SO post for further details:
How do I monitor the computer's CPU, memory, and disk usage in Java?
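A rough sketch of the "dump statistics regularly" idea, using only the standard `java.lang.management` beans. The class name `JvmStatsLogger`, the output format and the use of `System.out` are assumptions made for illustration; in a real application the line would go to the existing logger.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class JvmStatsLogger {

    /** Builds a one-line snapshot of heap usage, live thread count and system load. */
    public static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        // Negative when the platform cannot provide a load average (e.g. Windows).
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        return String.format("heapUsed=%d heapMax=%d threads=%d loadAvg=%.2f",
                heap.getUsed(), heap.getMax(), threads, load);
    }

    /** Prints a snapshot every periodSeconds on a daemon thread, so it never keeps the JVM alive. */
    public static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "jvm-stats");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleAtFixedRate(() -> System.out.println(snapshot()),
                0, periodSeconds, TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Calling `JvmStatsLogger.start(60)` at the top of `main` would leave a trail of snapshots in the log. If the last entries before the process disappears show the heap close to its maximum, memory pressure becomes a more likely suspect.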
I run a web server on Tomcat 7 with Java 8. The server performs a lot of I/O operations, mostly DB and HTTP calls. Each transaction consumes a generous amount of memory, and the server handles a concurrency of around 100 at any given time.
After some time, roughly 10,000 requests although the number varies, the server starts to hang: it either does not respond or responds with empty 500 responses.
I see some errors in the logs, which I am currently trying to solve, but what bugs me is that I can't figure out what eventually causes the hang. The catalina log file does not show a heap space exception, and the memory dumps I took suggest there's always room to grow and garbage to collect, so I decided it is not a memory problem. Then I took thread dumps, and I always saw dozens of threads in WAITING, TIMED_WAITING, PARKING, etc. From what I read, it seems like these threads are available to handle incoming work.
It's worth mentioning that all the work is done asynchronously, with no blocking operations, and it seems like all the thread pools are available. Moreover, when I stop the traffic to the server and let it rest for some time, the issue still doesn't go away. So I figured it's also not a thread problem.
So... my question is:
Maybe it is a memory issue after all? Can it be a thread or CPU issue? Can it be anything else?
I'm seeing strange behavior that I don't know how to investigate further, and I'm hoping someone can help.
Background: I have a query that takes a long time to return results, so instead of making the user wait for the data directly upon request, I execute this query via a Timer object at regular intervals and store the results in a static variable. When the user requests the data, I always just pull from the static variable, making the response virtually instant. So far so good.
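For readers following along, the setup described above might look roughly like the sketch below. This is not the poster's actual code: `CachedQuery`, `runSlowQuery` and the placeholder strings are hypothetical, and the slow query is simulated with a sleep. The syntax is kept compatible with Java 1.5.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicReference;

public class CachedQuery {

    // Readers only ever dereference this holder, so a running refresh cannot block them.
    private static final AtomicReference<String> CACHE =
            new AtomicReference<String>("not loaded yet");

    /** Hypothetical stand-in for the long-running database query. */
    static String runSlowQuery() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result@" + System.currentTimeMillis();
    }

    /** Runs the slow query and publishes the result only once it is complete. */
    public static void refresh() {
        CACHE.set(runSlowQuery());
    }

    /** Returns the most recently published result without taking any lock. */
    public static String get() {
        return CACHE.get();
    }

    /** Starts a daemon Timer that refreshes the cache at a fixed period. */
    public static Timer startRefreshing(long periodMillis) {
        Timer timer = new Timer("cache-refresh", true);
        timer.schedule(new TimerTask() {
            public void run() {
                refresh();
            }
        }, 0L, periodMillis);
        return timer;
    }
}
```

With this shape, a request that still hangs during a refresh is probably not waiting on the static variable itself, so the wait has to come from somewhere else on the request path.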
Issue: The behavior I'm seeing, however, is that if I make a request for the data just as the background (Timer) request has begun to query the data, my user's request waits for the data to come back before responding -- forcing the user to wait. It's as if Tomcat is behaving synchronously with the threads (I know it's not -- it just looks that way).
This is in a Production environment and, for the most part, everything works great but for users there are times when the site just hangs for them and they feel it's unreliable (well, in a sense it is).
What I've done: Since the requests for the data were in a static method, I thought "A ha! The threads are synchronized, which is causing the delay!" So I pulled all of my static methods out, removed the synchronization and forced each call to instantiate its own object to retrieve the data (to keep it thread safe). There isn't any synchronization or semaphore on the static variable either.
I've also installed javamelody to try and gain some insight into what's going on but nothing new thus far. I have noticed a lot (majority) of threads are in "WAITING" state but they also have 0ms for User and CPU time so don't think that is pointing to anything(?).
Running Tomcat 5.5 (no Apache layer), Struts 2, Java 1.5
If anyone has any idea why a simple request for a static variable hangs while a longer background process is running, I would really appreciate it! Or if you know how I can gain more insight, that would be great too.
Thanks!
One possible explanation is that the threads are actually blocking at the database level due to database locking (or something) caused by the long-running query.
The way to figure out what is going on is to find out exactly where the blocked threads are blocking. A thread dump can be produced by sending a SIGQUIT (or equivalent) to the JVM, and it includes stack traces for all Java threads. Alternatively, you can get the same information (and more) by attaching a debugger, etcetera. Either way, the class name and line number of the top frame of each stack should allow you to look at the source code and figure out (at least) what kind of locking or blocking is going on.
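If sending a signal or attaching a debugger is inconvenient, a similar (less detailed) view can be produced from inside the JVM. The sketch below is an illustration only: the class name `TopFrames` is made up, and it relies on `Thread.getAllStackTraces()`, which is available from Java 1.5. It prints the name, state and top stack frame of every live thread.

```java
import java.util.Map;

public class TopFrames {

    /** One line per live thread: name, state and the top stack frame (if any). */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            StackTraceElement[] stack = e.getValue();
            sb.append(t.getName()).append(" [").append(t.getState()).append("] ");
            // The top frame shows the class, method and line where the thread currently is.
            sb.append(stack.length > 0 ? stack[0].toString() : "(no frames)");
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Unlike a full thread dump, this output does not say which lock a thread is waiting for or who holds it, so it is best treated as a quick first look.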
For those who would like to know, I eventually found VisualVM (http://visualvm.java.net/download.html). It's perfect. I run Tomcat from Eclipse like I normally do and it appears within the VisualVM client. Right-click the Tomcat icon, choose Thread Dump and, boom, I've got it all.
Thanks, all, for the help and pointers towards the right direction!
We have a Java App that connects via RMI to another Java app.
There are multiple instances of this app running at the same time, and after a few days an instance just stops processing... CPU usage is at 0, and I have an extra thread listening on a specific port that helps to shut down the app.
I can connect to the specific port but the app doesn't do anything.
We're using Log4j to keep a log and nothing is written, so there aren't any exceptions thrown.
We also use c3p0 for the DB connections.
Anyone have ideas?
Thanks,
I would suggest starting with a thread dump of the affected application.
You need to see what is going on on a thread by thread basis. It could be that you have a long running thread, or other process which is blocking other work from being done.
Since you are running Linux, you can get your thread dump with the following command:
kill -3 <pid>
If you need help reading the output, please post it in your original question.
If nothing is shown from the thread dump, other alternatives can be looked at.
Hum... I would suggest using JMeter to stress the application and taking note of anything weird that might be happening (such as memory leaks, deadlocks and so on). Also review the code for any exceptions that might interrupt the program (or System.exit() calls). Finally, if other people have access to the computer, it makes sense to check whether the process was killed manually somehow.
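One way to make silent exits easier to diagnose is to install a default uncaught-exception handler and a shutdown hook early in `main`. The sketch below is an assumption about how that could be wired up: `ExitDiagnostics` is a made-up name, and `System.err` stands in for the application's own logger (Log4j in the question).

```java
import java.util.Date;

public class ExitDiagnostics {

    /** Installs last-resort logging for uncaught exceptions and for JVM shutdown. */
    public static void install() {
        // Called for any thread that dies from an exception nobody caught.
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                System.err.println("Uncaught exception in thread " + t.getName());
                e.printStackTrace();
            }
        });
        // Runs on normal exit, System.exit() and SIGTERM, but not on SIGKILL or a JVM crash.
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                System.err.println("JVM shutting down at " + new Date());
            }
        }, "exit-logger"));
    }

    public static void main(String[] args) {
        install();
        System.out.println("Diagnostics installed");
    }
}
```

If the process disappears and the shutdown line never shows up in the log, the JVM was most likely killed hard (for example with `kill -9`) or crashed natively, which narrows the search to the operating system side.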
We are working on a large Java program that was converted from a Forte application. During the day we are getting blocking SPIDs on the server. We had a DBA visit yesterday, and he set up a profile template to run to catch the locking/blocking action. When we run this profile, the blocking problem goes away. Why?
This application is distributed using RMI and has around 70 users. We are using SQL Server 2000 and Windows 2000 servers to keep compatibility with a bunch of old VB helper applications.
We have traced the blocking down to a specific screen and stored procedure but now we can't get the errors to happen with profiler running.
Thanks for any help!
Theo
The good old Heisenberg debugger problem.
Any profiler does two things: it adds code in place to invoke the debugger, and it stores data. The first can thwart optimizers, and the second can change the timing of something, causing a race condition to go away.
This blocking SPID problem seems to show up on Google a lot; the reason appears to be that it occurs when some resource is locked when another one wants it, so the timing error sounds likely.
Microsoft has an article on how to deal with the problem.
Just a collection of random thoughts.. I've seen traces take a server down but never make things better.
What trace template are you using? (These are taken from SQL Server 2005 tools, sorry)
The "Standard (default)" one tracks high-level calls and logon/logout
The "TSQL_SPs" one tracks statement calls, which would be a lot more intrusive
Is it binary and guaranteed too? Trace on = no blocks, trace off = blocks, or is it an unlucky coincidence? When you're all watching the DBA, does someone stop clicking in the client and come to watch?
Is something else being switched off as part of the trace? That is, are you using Profiler or a scripted trace (lots of sp_trace_set% statements)? In a scripted trace, there may be something that switches something else off.