We're using the AppDynamics Java agent to monitor our production applications. We have noticed slow memory growth, and the application eventually stalls. We took a heap dump on one of the JVMs and got the reports below.
Problem Suspect 1:
The thread com.singularity.ee.agent.appagent.kernel.config.xml.a# 0x1267......
AD thread config Poller keeps local variable config size of 28546.79(15.89%) KB
Problem Suspect 2:
280561 Instances of
com.singularity.ee.agent.appagent.services.transactionmonitor.com.exitcall.p loaded by com.singularity.ee.agent.appagent.kernel.classloader.d# 0x6c000....
occupy 503413.3(28.05%) KB. These instances are referenced from one instance of java.util.HashMap$Node[]...
We figured out that these classes come from the AppDynamics APM agent, which hooks into the running JVM and sends monitored events to the controller. Reaching out to the vendor is a convoluted process, so I am wondering whether there are any workarounds, such as enabling JMX in our Java apps and having AppDynamics collect the monitoring events from JMX rather than hooking directly into the applications' JVM. Thanks for your suggestions.
I have developed a REST API using Spring Framework. When I deploy this in Tomcat 8 on RHEL, the response times for POST and PUT requests are very high when compared to deployment on my local machine (Windows 8.1). On RHEL server it takes 7-9 seconds whereas on local machine it is less than 200 milliseconds.
The RHEL server has 4 times the RAM and CPU of the local machine. Default Tomcat configurations are used on both Windows and RHEL. Network latency is ruled out because GET requests take more or less the same time as on the local machine, whereas the time to first byte is higher for POST and PUT requests.
I even tried profiling the remote JVM using VisualVM. There are no major hotspots in my custom code.
I was able to reproduce the same issue on other RHEL servers. Is there any Tomcat setting that could help fix this performance issue?
The profiling log you have placed means nothing, more or less. It shows the following:
The blocking queue is blocking. That is normal, because blocking is its purpose. It means there is nothing to take from the queue.
It is waiting for connection on the socket. Which is also normal.
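To illustrate why these two entries are not evidence of a problem, here is a minimal sketch (the class and thread names are made up for this example) showing that a worker with nothing to do simply parks inside `take()`. In a profiler or thread dump, such a thread shows up as waiting on the queue, even though the application is just idle.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class, not part of Tomcat or the application in question.
class IdleWorkerDemo {

    // Starts a worker that takes from an empty queue and reports the state
    // the worker thread is in while it has nothing to process.
    static Thread.State stateOfIdleWorker() throws InterruptedException {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    queue.take(); // parks here until an element arrives
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "idle-worker");
        worker.setDaemon(true);
        worker.start();

        // Wait (up to 2 seconds) for the worker to park inside take().
        long deadline = System.currentTimeMillis() + 2000;
        while (worker.getState() != Thread.State.WAITING
                && System.currentTimeMillis() < deadline) {
            Thread.sleep(10);
        }
        Thread.State observed = worker.getState();

        // Hand the worker something to do so it can finish.
        queue.put("work");
        worker.join(2000);
        return observed;
    }
}
```

A profiler that samples thread states attributes all of that parked time to the queue, which is why it tends to dominate the report of a lightly loaded server.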
You do not specify the physical/hardware setup of your RHEL 8 server. The operating system might not be the only factor here. You still cannot rule out network latency. What if you have a SAN? The SAN may have latency of its own. If your local machine uses an SSD and the RHEL server uses a SAN with replication, you may experience network latency there.
I am more inclined to first check the IO on the disk than to focus on operating system. If the server is shared there might be other processes occupying the disk.
You say that latency is ruled out because the GET requests take the same time. That is not enough to rule it out. As I said, this measures the latency between the client and the application server; it does not check the latency between your app server machine and your SAN, disk, or whatever storage is there.
My app client accesses my Tomcat server. Sometimes it works well, but sometimes it times out, especially when two people quickly flush the frame to access the server. What might be the problem?
I am sure that my database doesn't hang, because I also have a management system on the same Tomcat that uses the same database. That system works well even when my app can't access the server.
First check the configuration of the system running your Tomcat server, such as RAM capacity and network speed, because it seems you are using the same system for the database as well.
Sometimes bad or slow network connections on the client side will also cause this kind of timeout error, so just add a conn.setTimeout(60000) line in your client code near the HTTP call.
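The client library is not stated, so the following is only a sketch under the assumption that the client uses the JDK's `java.net.HttpURLConnection`. That class has no `setTimeout` method; it exposes separate connect and read timeouts instead. The class name `ClientTimeouts` and the URL in the usage note are made up for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper; assumes the client talks to Tomcat via HttpURLConnection.
class ClientTimeouts {

    // Opens (but does not yet connect) a connection with both timeouts set.
    static HttpURLConnection open(String url, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        // Maximum time to wait for the TCP connection to be established.
        conn.setConnectTimeout(timeoutMillis);
        // Maximum time to wait for data once the connection is open.
        conn.setReadTimeout(timeoutMillis);
        return conn;
    }
}
```

For example, `ClientTimeouts.open("http://localhost:8080/app", 60000)` gives the server up to 60 seconds in each phase. A longer timeout only hides slow responses; it does not explain why concurrent requests are slow in the first place.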
I have a producer-consumer application based on Netty. The basic requirement was to build a message-oriented middleware (MOM).
So the MOM is based on the concept of queuing (Queuing makes systems loosely coupled and that was the basic requirement of the application).
The broker understands the MQTT protocol. We performed stress testing of the application on our local machine. These are the specs of the local machine.
We were getting great results. However, our production server runs Ubuntu on AWS, so we stress tested the same application on an AWS Ubuntu server. The performance was 10x worse than on the local system. This is the configuration of the AWS server.
We have tried the following options to figure out where the issue is.
Initially we checked for bugs in our business logic. Did not find any.
Made the broker, client, and all other dependencies the same on the Mac and on AWS. By the same dependencies I mean that we installed the same versions on AWS as on the Mac.
Increased the ulimit on AWS.
Played with sysctl settings.
We were using Netty 4.1 and we had a doubt that it might be a Netty error as we do not have stable release for Netty 4.1 yet. So we even built the entire application using Netty 3.9.8 Final (Stable) and we still faced the same issue.
Substantially increased the hardware configuration of the AWS machine.
Now we have literally run out of options. The java version is the same on both machines.
So the last resort for us is to build the entire application using NodeJS but that would require a lot of effort rather than tweaking something in Netty itself. We are not searching for Java based alternatives to Netty as we think this might even be a bug in JVM NIO's native implementation on Mac and Ubuntu.
What other options can we try to solve this? Is this an inherent Netty issue, or does it have to do with internal implementation differences between Mac and Ubuntu that lead to the performance differences we see?
EDIT
The stress testing parameters are as follows.
We had 1000 clients sending 1000 messages per second (Global rate).
We ran the test for about 10 minutes to note the latency.
On the server side we have 10 consumer threads handling the messages.
We have a new instance of ChannelHandler per client.
For boss pool and worker pool required by Netty, we used the Cached Thread pool.
We have tried tuning the consumer threads but to no avail.
Edit 2
These are the profiler results provided by jvmtop for one phase of load testing.
We have some applications that sometimes get into a bad state, but only in production (of course!). While taking a heap dump can help to gather state information, it's often easier to use a remote debugger. Setting this up is easy -- one need only add this to his command line:
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=PORT
There seems to be no available security mechanism, so turning on debugging in production would effectively allow arbitrary code execution (via hotswap).
We have a mix of 1.4.2 and 1.5 Sun JVMs running on Solaris 9 and Linux (Redhat Enterprise 4). How can we enable secure debugging? Any other ways to achieve our goal of production server inspection?
Update: For JDK 1.5+ JVMs, one can specify an interface and port to which the debugger should bind. So, KarlP's suggestion of binding to loopback and just using an SSH tunnel to a local developer box should work, given SSH is set up properly on the servers.
However, it seems that JDK 1.4.x does not allow an interface to be specified for the debug port. So, we can either block access to the debug port somewhere in the network or do some system-specific blocking in the OS itself (IPChains as Jared suggested, etc.).
Update #2: This is a hack that will let us limit our risk, even on 1.4.2 JVMs:
Command line params:
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=9001,onthrow=com.whatever.TurnOnDebuggerException,launch=nothing
Java Code to turn on debugger:
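try {
    // Throwing this exception is the signal that activates the JDWP agent (see onthrow above).
    throw new TurnOnDebuggerException();
} catch (TurnOnDebuggerException td) {
    // Nothing to do: the throw itself is the trigger.
}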
TurnOnDebuggerException can be any exception guaranteed not to be thrown anywhere else.
I tested this on a Windows box to prove that (1) the debugger port does not receive connections initially, and (2) throwing the TurnOnDebuggerException as shown above causes the debugger to come alive. The launch parameter was required (at least on JDK 1.4.2), but a garbage value was handled gracefully by the JVM.
We're planning on making a small servlet that, behind appropriate security, can allow us to turn on the debugger. Of course, one can't turn it off afterward, and the debugger still listens promiscuously once it's on. But these are limitations we're willing to accept, as debugging of a production system will always result in a restart afterward.
Update #3: I ended up writing three classes: (1) TurnOnDebuggerException, a plain ol' Java exception, (2) DebuggerPoller, a background thread that checks for the existence of a specified file on the filesystem, and (3) DebuggerMainWrapper, a class that kicks off the polling thread and then reflectively calls the main method of another specified class.
This is how it's used:
Replace your "main" class with DebuggerMainWrapper in your start-up scripts
Add two system (-D) params, one specifying the real main class, and the other specifying a file on the filesystem.
Configure the debugger on the command line with the onthrow=com.whatever.TurnOnDebuggerException part added
Add a jar with the three classes mentioned above to the classpath.
Now, when you start up your JVM everything is the same except that a background poller thread is started. Presuming that the file (ours is called TurnOnDebugger) doesn't initially exist, the poller checks for it every N seconds. When the poller first notices it, it throws and immediately catches the TurnOnDebuggerException. Then, the agent is kicked off.
You can't turn it back off, and the machine is not terribly secure when it's on. On the upside, I don't think the debugger allows for multiple simultaneous connections, so maintaining a debugging connection is your best defense. We chose the file notification method because it allowed us to piggyback off of our existing Unix authentication/authorization by specifying the trigger file in a directory where only the proper users have rights. You could easily build a little war file that achieved the same purpose via a socket connection. Of course, since we can't turn off the debugger, we'll only use it to gather data before killing off a sick application. If anyone wants this code, please let me know. However, it will only take you a few minutes to throw it together yourself.
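For readers who want a starting point, here is a minimal sketch of the three classes described in Update #3. It is a reconstruction from the description, not the author's code. The system property names `debugger.mainClass` and `debugger.triggerFile`, the 5-second polling interval, and the `wasTriggered()` accessor are assumptions made for illustration, and the class would live in whatever package the onthrow= option names. It is written for a current JDK and would need small changes to compile on 1.4.2 (no varargs, no class literals for array types in `getMethod`).

```java
import java.io.File;
import java.lang.reflect.Method;

// (1) Marker exception; its fully qualified name goes into the JDWP onthrow= option.
class TurnOnDebuggerException extends Exception {
    private static final long serialVersionUID = 1L;
}

// (2) Polls for a trigger file, then throws and catches the marker exception once.
class DebuggerPoller implements Runnable {
    private final File triggerFile;
    private final long intervalMillis;
    private volatile boolean triggered;

    DebuggerPoller(File triggerFile, long intervalMillis) {
        this.triggerFile = triggerFile;
        this.intervalMillis = intervalMillis;
    }

    // Assumed helper so callers can see whether the poller has fired.
    boolean wasTriggered() {
        return triggered;
    }

    public void run() {
        try {
            // Sleep between checks until the trigger file appears.
            while (!triggerFile.exists()) {
                Thread.sleep(intervalMillis);
            }
            try {
                throw new TurnOnDebuggerException();
            } catch (TurnOnDebuggerException expected) {
                // Intentionally empty: the throw itself is the signal to the agent.
            }
            triggered = true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// (3) Starts the poller, then hands control to the real main class.
class DebuggerMainWrapper {
    public static void main(String[] args) throws Exception {
        // Hypothetical property names; pass them with -D on the command line.
        String mainClass = System.getProperty("debugger.mainClass");
        String triggerPath = System.getProperty("debugger.triggerFile");

        Thread poller = new Thread(
                new DebuggerPoller(new File(triggerPath), 5000L), "debugger-poller");
        poller.setDaemon(true); // must not keep the JVM alive on its own
        poller.start();

        // Reflectively invoke the real application's main(String[]).
        Method realMain = Class.forName(mainClass).getMethod("main", String[].class);
        realMain.invoke(null, (Object) args);
    }
}
```

With this in place, creating the trigger file (for example with `touch`) in a directory that only authorized users can write to is what switches the debugger on. The poller exits after firing once, since the agent cannot be turned off again anyway.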
If you use SSH, you can allow tunneling and tunnel a port to your local host. No development is required; it is all done using sshd, ssh, and/or PuTTY.
The debug socket on your java server can be set up on the local interface 127.0.0.1.
You're absolutely right: the Java Debugging API is inherently insecure. You can, however, limit it to UNIX domain sockets, and write a proxy with SSL/SSH to let you have authenticated and encrypted external connections that are then proxied into the UNIX domain socket. That at least reduces your exposure to someone who can get a process into the server, or someone who can crack your SSL.
Export information/services into JMX and then use RMI+SSL to access it remotely. Your situation is what JMX is designed for (the M stands for Management).
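As a rough sketch of what exporting information into JMX can look like, the example below registers a standard MBean with the platform MBean server. Everything named here (`AppDiagnostics`, `QueueStats`, the attributes, and the `com.example.app` domain) is hypothetical; a real application would expose whatever state is useful for diagnosing it.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical holder class; all names are illustrative.
class AppDiagnostics {

    // Standard MBean convention: the interface is named <ClassName>MBean
    // and must be public. Each getter becomes a read-only attribute.
    public interface QueueStatsMBean {
        int getPendingRequests();
        long getUptimeMillis();
    }

    public static class QueueStats implements QueueStatsMBean {
        private final long startedAt = System.currentTimeMillis();
        private final AtomicInteger pending = new AtomicInteger();

        // Called by application code around each unit of work.
        public void requestStarted() { pending.incrementAndGet(); }
        public void requestFinished() { pending.decrementAndGet(); }

        public int getPendingRequests() { return pending.get(); }
        public long getUptimeMillis() { return System.currentTimeMillis() - startedAt; }
    }

    // Registers the bean with the JVM's built-in MBean server.
    static ObjectName register(QueueStats stats) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example.app:type=QueueStats");
        server.registerMBean(stats, name);
        return name;
    }
}
```

Once registered, the attributes are visible to any JMX client such as JConsole. Remote access over RMI with SSL and authentication is configured separately through the `com.sun.management.jmxremote.*` system properties rather than in application code.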
Good question.
I'm not aware of any built-in ability to encrypt connections to the debugging port.
There may be a much better/easier solution, but I would do the following:
Put the production machine behind a firewall that blocks access to the debugging port(s).
Run a proxy process on the host itself that connects to the port, and encrypts the input and output from the socket.
Run a proxy client on the debugging workstation that also encrypts/decrypts the input. Have this connect to the server proxy. Communication between them would be encrypted.
Connect your debugger to the proxy client.