I've got a somewhat dated Java EE application running on Sun Application Server 8.1 (aka SJSAS, precursor to Glassfish). With 500+ simultaneous users the application becomes unacceptably slow and I'm trying to help identify where most of the execution time is spent and what can be done to speed it up. So far we've been experimenting and measuring with LoadRunner, the app server logs, Oracle Statspack, snoop, adjusting the app server acceptor and session (worker) threads, adjusting the Hibernate batch size and join fetch use, etc., but after some initial gains we're struggling to improve matters further.
Ok, with that introduction to the problem, here's the real question: If you had a slow Java EE application running on a box whose CPU and memory use never went above 20% and while running with 500+ users you showed two things: 1) that requesting even static files within the same app server JVM process was exceedingly slow, and 2) that requesting a static file outside of the app server JVM process but on the same box was fast, what would you investigate?
My thoughts initially jumped to the application server threads, both acceptor and session threads, thinking that even requests for static files were being queued, waiting for an available thread, and if the CPU/memory weren't really taxed then more threads were in order. But then we upped both the acceptor and session threads substantially and there was no improvement.
Clarification Edits:
1) Static files should be served by a web server rather than an app server. I am using the fact that in our case this (unfortunately) is not the configuration so that I can see the app server performance for files that it doesn't execute -- therefore excluding any database performance costs, etc.
2) I don't think there is a proxy between the requesters and the app server, but even if there were, it doesn't seem to be overloaded, because static files requested from the same application server machine but outside of the application's JVM instance return immediately.
3) The JVM heap size (Xmx) is set to 1GB.
Thanks for any help!
SunONE itself is a pain in the ass. I have the very same problem, and you know what? Simply redeploying the same application to WebLogic reduced memory and CPU consumption by about 30%.
SunONE is a reference implementation server and shouldn't be used for production (I don't know about Glassfish).
I know this answer doesn't really help, but I've noticed considerable pauses even in very simple operations, such as getting a bean instance from a pool.
Maybe trying to deploy JBoss or WebLogic on the same machine would give you a hint?
P.S. You shouldn't serve static content from the application server (though I do it too sometimes, when CPU is abundant).
P.P.S. 500 concurrent users is quite a high load; I'd definitely put SunONE behind a caching proxy or an Apache that serves the static content.
After using a Sun performance monitoring tool, we found that the garbage collector was running every couple of seconds and that only about 100MB of the 1GB heap was being used. So we tried adding the following JVM options and, so far, this new configuration has greatly improved performance.
-XX:+DisableExplicitGC -XX:+AggressiveHeap
See http://java.sun.com/docs/performance/appserver/AppServerPerfFaq.html
Our lesson: don't leave JVM option tuning and garbage collection adjustments to the end. If you're having performance trouble, look at these settings early in your troubleshooting process.
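Incidentally, if you want to confirm the GC behaviour yourself before changing anything, verbose GC logging is cheap enough to enable even in production (these flags are from the Sun JDK 5 era; gc.log is just an example path):
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log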
Related
Recently we migrated all our company's applications from WebSphere to the Tomcat application server. As part of this process we had performance testing done.
We found that a couple of applications show over 100% performance degradation in Tomcat. We increased the number of threads, configured the datasource settings to accommodate our test, and also increased the read and write buffer sizes in the Tomcat server.
Application Background:
-> Spring Framework
-> Hibernate
-> Oracle 12c
-> JSPs
-> OpenJDK 8
We have already checked the database and found no performance issues in the DB.
The CPU utilization while running the test is always less than 10%.
Heap settings are -Xms = 1.5G and -Xmx = 2G, and the app never utilizes more than 1.2G.
We also have two nodes and HAProxy on top to balance the load. (We don't have a web server in place).
Despite our best efforts we couldn't pinpoint the issue causing the performance degradation. I am aware that this information isn't enough to provide a solution to our problem; however, any suggestion on how to proceed would be very helpful, as we have hit a dead end and are unable to proceed.
Appreciate it if you can share any points that will be helpful in finding the issue.
Thanks.
Take thread dumps, analyze which part of the application is having issues, and start troubleshooting from there.
Follow this article for a detailed explanation of thread dump analysis - https://dzone.com/articles/how-analyze-java-thread-dumps
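If you'd rather capture dumps from inside the JVM (for example, on a timer during the load test), something like the following works on Java 6+. This is just a sketch; the class name is made up, and for ad-hoc dumps jstack <pid> or kill -3 <pid> are simpler:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // true, true = include locked monitors and ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            // Note: ThreadInfo.toString() truncates very deep stacks,
            // so use jstack when you need complete traces
            System.out.print(info);
        }
    }
}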
There are plenty of possible reasons for the problem you've mentioned and there really isn't much data to work with. Regardless, as kann commented, a good way to start would be gathering thread dumps of the Java process.
I'd also question whether you're running on the same servers or on newly set-up servers, and how they look resource-wise. Are there any CPU/memory/IO constraints during the test?
Regarding the Xmx, it sounds like you're not passing the -XX:+AlwaysPreTouch flag to the JVM, but I would advise you to look into it, as it makes the JVM zero the heap memory on start-up instead of doing it at runtime (which can mean a performance hit).
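For example (heap sizes here are illustrative, not a recommendation):
-Xms2g -Xmx2g -XX:+AlwaysPreTouch
Setting -Xms equal to -Xmx also avoids heap-resize pauses while the test is running.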
I usually use Spring Boot + JPA + Hibernate + Postgres.
At the end of developing a web application I package it as a JAR, run it directly with Java, and then reverse-proxy it with Apache (httpd).
I have noticed that at startup there are no problems or latency, and accessing the website works very quickly. But when several hours pass without anyone making a request to the server and I then try to access it, I must wait at least 20 seconds until the server responds; after this I can continue to access the site normally.
Why does this happen? It is as if Spring went into standby mode whenever it detects that it has no request load, but I am not sure whether that is really the case or whether it is a problem. If it's some native Spring functionality, how can I disable it?
Even if it means using a little more memory when idle, I want responses to be fast regardless of whether the server is under load.
Without knowing more, it is likely that while your webapp is sitting idle, other programs on your server are using memory and causing the JVM's memory to be swapped to disk.
When you then access the webapp again, the OS has to swap that JVM memory back into RAM, one page at a time. That takes time, but once the memory is back in RAM, your webapp will run normally.
Unfortunately, the way Java memory works, swapping JVM memory to disk is very bad for performance. That is an issue for most languages that rely on garbage collectors to free memory. Languages with manual memory management, e.g. C++ code, will usually not be hit as badly, when memory is swapped to disk, because memory use is more "focused" in those languages.
Solution: If my guess at the cause of your problem is correct, reconfigure your server so the JVM memory won't be swapped to disk.
Note that when I say server, I mean the physical machine. The "other programs" that your JVM is fighting for memory with might be running in different VMs, i.e. not in the same OS.
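If swapping does turn out to be the culprit and the host runs Linux (an assumption; check with vmstat or free first), lowering the kernel's eagerness to swap is the usual fix:
vm.swappiness = 1
(added to /etc/sysctl.conf and applied with sysctl -p)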
Let's say I have a very large Java application that's deployed on Tomcat. Over the course of a few weeks, the server will run out of memory, application performance is degraded, and the server needs a restart.
Obviously the application has some memory leaks that need to be fixed.
My question is.. If the application were deployed to a different server, would there be any change in memory utilization?
Certainly the services offered by the application server might vary in their memory utilization, and if the server includes its own unique VM -- i.e., if you're using J9 or JRockit with one server and Oracle's JVM with another -- there are bound to be differences. One relevant area that does matter is class loading: some app servers have better behavior than others with regard to administration. Warm-starting the application after a configuration change can result in serious memory leaks due to class loading problems on some server/VM combinations.
But none of these are really going to help you with an application that leaks. It's the program using the memory, not the server, so changing the server isn't going to affect much of anything.
There will probably be a slight difference in memory utilisation, but only in as much as the footprint differs between servlet containers. There is also a slight chance that you've encountered a memory leak with the container - but this is doubtful.
The most likely issue is that your application has a memory leak - in any case, the cause is more important than a quick fix - what would you do if the 'new' container just happens to last an extra week etc? Moving the problem rarely solves it...
You need to start analysing the application's heap memory to locate the source of the problem. If your application is crashing with an OOME, you can add this to the JVM arguments (note the +, which enables the heap dump; with a - the option is disabled):
-XX:+HeapDumpOnOutOfMemoryError
If the performance is just degrading until you restart the container manually, you should get into the routine of triggering periodic heap dumps. A timeline of dumps is often the most help, as you can see which object stores just grow over time.
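For the periodic dumps, jmap from the JDK can capture one on demand (the pid and file name are placeholders):
jmap -dump:live,format=b,file=heap-1.hprof <pid>
The live option triggers a full GC first, so the dump contains only reachable objects.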
To do this, you'll need a heap analysis tool:
JHat or IBM Heap Analyser or whatever your preference :)
Also see this question:
Recommendations for a heap analysis tool for Java?
Update:
And this may help (for obvious reasons):
How do I analyze a .hprof file?
I have a J2EE Java application which processes SOAP requests. In our production environment (HP-UX, OC4J, Java 5) we have about 20 threads running for this process, and we sometimes see one thread pausing for ~15 seconds. So far I haven't succeeded in replicating the problem in our preproduction environment, and I'm scared of breaking stuff and violating SLAs if I use jconsole and associated tools on our production server.
Does anyone have any inspiration? I know about http://java.sun.com/j2se/1.5/pdf/jdk50_ts_guide.pdf but I lack the experience to dare use it straight in production (plus, the HP-UX guys threw some of these tools out of the toolbox, replacing them with HPjmeter).
Also, although this suggests a GC problem to me, I don't yet know enough to prove or disprove this theory and I am open to other suggestions.
We connect jconsole (and other tools) straight to production regularly. There is no significant overhead for us; the instrumentation is already going on within the JVM, so you'd just be connecting a remote process to read the published values. I say go for it!
Either way, you really need to see what's going on on the box. Take thread dumps, or add some internal instrumentation. By internal instrumentation, I mean recording key measures within the code and exposing them somehow. It's essentially what the JVM does (exposing them via JMX), but rolling your own gives you more specificity. For example, I frequently record request/response or other critical-path performance timings internally.
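As a rough sketch of what I mean by internal instrumentation, here's a servlet filter that records request timings. The class name and the println are illustrative; in a real setup you'd expose the numbers via JMX attributes or your logging framework:
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class TimingFilter implements Filter {
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong requests = new AtomicLong();

    public void init(FilterConfig config) {}

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        long start = System.nanoTime();
        try {
            chain.doFilter(req, res);
        } finally {
            // Accumulate elapsed time and request count across all threads
            long total = totalNanos.addAndGet(System.nanoTime() - start);
            long n = requests.incrementAndGet();
            if (n % 100 == 0) {
                // Replace with JMX or proper logging in practice
                System.out.println("avg ms over " + n + " requests: "
                        + (total / n) / 1000000.0);
            }
        }
    }

    public void destroy() {}
}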
Oh, and one more thing: you can set up your app to use an agent to provide even more information. Typically this would be to plug in a profiler (like JProfiler or YourKit), but this usually adds more overhead and isn't recommended for production.
It's also worth thinking about the cost of not getting the information you need out of the VM. For example, is the cost of not fixing the issue more or less than the cost of a small % drop of performance when monitoring?
More scientifically, this article has some comments. It suggests up to 7% overhead (contradicting my previous point); a previous article from 2006 suggests 3-4%, but both are highly contextual results. For example, CPU-intensive applications may or may not be affected more than IO-bound ones.
So a more appropriate answer from me (rather than just "go for it") would be to understand the impact it would have for your application in your environment, through measurement. Run representative tests on an environment similar to production, with jconsole connected and disconnected, and see for yourself.
Also see this stackoverflow question.
There are a few things that you can do on HP-UX to get additional information from a running Java process. If you send the PROF signal to the JVM, it will toggle the generation of a GC log (as if you had used the -Xverbosegc command line option). Generating the GC log is very inexpensive, so you should be able to turn this on in production without affecting the performance.
If you send the USR2 signal to the JVM, it starts profiling (same as -Xeprof). If you send the signal a second time, it turns the profiling off. This will have a noticeable performance impact, though it is smaller than what you would see from an external, third-party profiler.
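From a shell, toggling these would look something like this (the pid is the JVM's process id):
kill -PROF <pid>   (toggle GC logging)
kill -USR2 <pid>   (toggle -Xeprof profiling)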
You can analyze the resulting data files using HPjmeter. HPjmeter can also connect to a running JVM for real-time monitoring. With Java 5, you need to start the JVM with the -agentlib option. If you were using Java 6, you could attach to the running JVM without needing any extra command line options.
First, just a bit of background:
One of our customers is experiencing CPU usage spikes for WebSphere instances running one of our web apps (other instances with other apps are fine). They have a test environment and a live environment (both iSeries) which both experience the problem - with a single app per instance setup. We have deployed this application locally in our own test environments and also for many other customers all on iSeries with no similar problems.
What's actually happening:
Every second or so, the WebSphere process's CPU usage jumps to anywhere from 7% to 20%, even though no requests are being processed at the time. The customer has reported seeing spikes as high as 30%. These spikes average out to 1.5% CPU overall; the other WebSphere instances typically use 0%-0.1% when idle.
My investigations so far
So, I had a look at the threads. One thread in their test environment was using ~350 CPU cycles per second. A similar thread in their live environment was using ~1500 CPU cycles per second (the live box has a bigger CPU). The call stack for these threads looks like:
Type Program Statement Procedure
QLESPI QSYS 17 LE_Create_Thread2__FP12crtt >
QJVALIBJVM QSYS 7 startThread__FPv
J com/ibm/ws/util/Threa > run
J com/ibm/ws/util/Threa > run
J com/ibm/ws/util/Threa > getTask
J com/ibm/ws/util/Bound > poll
The entire class name from the bottom line is com/ibm/ws/util/BoundedBuffer. I asked the customer to do a JVM Dump for me - the only additional information I got from this was the thread name:
Thread: 00002F82 Deferrable Alarm : 11
Now for my questions:
Can any of you identify the problem, given these symptoms? (Maybe that's a long shot!)
What is a Deferrable Alarm? From the JVM dump, I can see 4 threads with this name. The other three seem to be doing just fine. By debugging my local WebSphere (on Windows) and adding breakpoints in the BoundedBuffer class, I can see that BoundedBuffers poll and periodically invoke some listener.
I don't have access to the WebSphere console for the customer machines, and they aren't owning up to having made any config changes. I can ask them to check the console for me though - what should I be asking them to look at?
I have telnet access to the customer boxes, is there anything else I can investigate here? Looking at the WebSphere profile files, etc? Which files should I be looking at?
Because the Call Stack and JVM Dump don't explicitly reference our code, is it safe to assume that this is a configuration problem?
It's been a long question, so thanks for reading this far.
30 April Update (1)
This morning I've noticed that this behaviour only happens after the first request of the day has been processed (irrespective of which Web Service is invoked). This points the finger back at our application or Apache Axis. Could it be that this is just normal behaviour?!
30 April Update (2)
So it seems that this CPU activity is some kind of housekeeping activity for the web-container or maybe something within Apache Axis. I've now observed this happening on a few different web-applications on a few different servers. Applications with no web component don't suffer the same additional CPU overhead.
I'd imagine that if it is housekeeping work, "tuning" it somehow could be counterproductive; by that I mean that making the app server idle better would probably reduce the amount of "real" work it can do.
You could try profiling and taking heap dumps of the application; that could answer a few questions related to memory and CPU usage.
I would recommend following the MustGather documentation provided by IBM and raising a PMR, along with doing your own investigation. Things you might suspect:
Garbage collection (unlikely on low application utilization)
Timers or tasks (such as java.util.Timer or commonj work manager)
Pretest connection that has a complex SQL query (in the DataSource's WebSphere Application Server data source properties)
I would also recommend using a profiler to determine the cause; the YourKit profiler is a pretty decent one.
Very instinctively (being unfamiliar with iSeries platforms) I would look at disk IO related issues. Can you describe the disk subsystem? Can you see if your app is spending an unusually large amount of time in iowait?
I know this doesn't quite match your problem, but it might be worth a look if you're running anything prior to WAS 6.1 patch 17.
http://www-01.ibm.com/support/docview.wss?uid=swg24018437
Hope this helps. Cheers John
My best guess is that it is some type of monitoring being done on the instance, like Tivoli etc. Have you ruled out any GC activity?
HTH Tom
Most application servers are implemented in Java itself, and so is WebSphere. These servers, apart from serving client requests, have to do other periodic jobs, such as resource pool management. Performing these jobs creates some temporary objects that need to be garbage collected.
Depending on how much heap you have allocated, its usage, and the garbage collector settings, the garbage collector will be invoked. I'd say try to see if it is the garbage collector thread that is taking up your CPU. To do this, connect the jconsole utility to the remote WebSphere process for a day and see if there is any correlation between heap usage and CPU usage.
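For a standard Sun/HotSpot JVM, remote jconsole access usually means starting the process with the stock JMX properties (the port is arbitrary, leaving authentication and SSL off is only sane on a trusted network, and WebSphere may layer its own connector configuration on top of this):
-Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false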
I am also experiencing this very same issue: [Deferrable Alarm:x] threads using BoundedBuffer. The only difference I have is that this is on a Windows 7 64-bit machine. There is absolutely no Tivoli or other batch process running, and no requests being made; the single instance is just idle.
I can run the application in DEBUG mode and pause the Deferrable Alarm thread, and the CPU spikes stop; resume it, and they start again.
I've checked disk activity and network activity, and there is nothing happening there.
I am running WebSphere 6.1.0.27 .