Java application monitoring over longer time periods?

Using tools like JConsole I can monitor a Java application in real time. How can I analyze the performance over a longer time period, say a day or a week?
Are there simple tools like JConsole I can use?

There are several options for the generic "monitor as many parameters as we can" approach:
Command line: jcmd <PID|main class> PerfCounter.print. You will then need to wrap your head around the names of the properties it outputs, schedule it to run periodically, store the data somewhere, and visualize it yourself.
Much (all?) of this information is also exposed by JMX beans. You can browse them and what they export in JConsole, for example, and then use a command-line tool such as jmxterm to record the values and visualize them. Same procedure: schedule it yourself, record, visualize yourself. It's not very user-friendly; the reason I mention this approach is that...
...people usually use a specialized monitoring system (think Graphite, Zabbix, Logstash/Kibana etc.; I am throwing these in just as search keywords to combine with "Java/JVM/JMX/JFR") that can collect information from Java processes through JMX and present it nicely. Periodic execution, storage of the time-series data, and visualization are all handled by these systems.
JFR ("Java Flight Recorder") is a mechanism built into the JVM that continuously records many JVM and system metrics and periodically dumps them into a file, which you can then visualize with JMC ("Java Mission Control"). It is "cheaper" in the sense that you do not need to install and support a separate monitoring system, but it is less accessible (unless paired with a monitoring system): you need to collect, download, and process the files.
In addition to these, there is jstat, which is basically the same as jcmd ... PerfCounter.print but mostly for memory-related metrics; it has the "run periodically" functionality built in and presents results slightly differently (one line per "recording").
I would say: if you need to do it once or occasionally, even over a longer period, and need just a few parameters such as memory or number of threads, then use jstat or jcmd PerfCounter.print; if you need more parameters, use JFR/JMC. If you need something that runs alongside your system, always collecting and presenting data, and available to people who do not have admin rights on the machine where the JVM resides, then look into the monitoring systems and their integration with Java applications.
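If a few parameters such as heap usage and thread count are enough, and the sampling code is allowed to run inside the application itself, the "schedule, record, visualize yourself" idea could be sketched roughly as follows. It reads the same platform MXBeans that JConsole displays. The class name JvmSampler, the CSV layout, and the default of three samples one second apart are assumptions made purely for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.time.Instant;

class JvmSampler {
    // Builds one CSV record: timestamp, used heap bytes, committed heap bytes, live threads.
    static String sampleLine() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        int liveThreads = ManagementFactory.getThreadMXBean().getThreadCount();
        return Instant.now() + "," + heap.getUsed() + "," + heap.getCommitted() + "," + liveThreads;
    }

    public static void main(String[] args) throws InterruptedException {
        // Optional arguments (hypothetical): number of samples and interval in milliseconds.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        long intervalMillis = args.length > 1 ? Long.parseLong(args[1]) : 1000;

        System.out.println("timestamp,heap_used_bytes,heap_committed_bytes,live_threads");
        for (int i = 0; i < samples; i++) {
            System.out.println(sampleLine());
            if (i < samples - 1) {
                Thread.sleep(intervalMillis);
            }
        }
    }
}
```

As written, main samples its own JVM, so in a real application sampleLine() would be called from a background thread of the process you want to observe, with the output appended to a file. Something like 8640 samples at a 10000 ms interval covers one day, and the resulting CSV can be opened in any spreadsheet or plotting tool.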

Related

How do I get memory information of jvm cluster

I have to display JVM memory usage data on a page. I need to find JVM memory stats such as free memory and max memory.
The Java Runtime functions give data for only one JVM. How do I find this for a JVM cluster consisting of 4 JVMs?
If possible, it could be a Unix command or some Java function.

Since the JVM doesn't support clustering out of the box (assuming you are referring to the standard Oracle distribution), you will have to develop your own aggregation of memory stats from the different JVMs.

There is no such thing as a "JVM cluster", since JVMs can't really be clustered. That is, there is no clustering capability in the JVM itself.
Programs (themselves running on a jvm) can be clustered using a third party tool or library (or by writing the relevant code yourself, which I would advise against).
This means that, since there is no core-jdk support for clustering, there is also no java function call that can give certain values for the cluster. The software/tool/library you are using to cluster your program might be able to give this information but you'd have to look that up in the documentation.
For the same reason, there's also no unix call. *nix OSes know nothing about your java cluster, they just know that there are processes running on them that use the CPU and memory and probably do some I/O. They have no idea about any clustering and therefore can not help you with your question.
So, to find what you are looking for:
If it's a true scaling cluster, i.e. the workload gets automatically divided over the different JVMs in the cluster, you'd have to take a look at the documentation for the clustering software (tool/library) you use to find out whether it can give you that information.
If you use a third party application (such as Zabbix) to monitor different JVMs you might construct a screen or view which can show you the data for multiple JVMs in one screen. Again, you'd have to look this up in the documentation for that tool.
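If you end up writing the aggregation yourself, one possible shape for it is to read the standard Memory MBean of each JVM over JMX and sum the values. The sketch below is only an illustration under assumptions: the class name ClusterHeap and the choice to sum "used" and "max" are mine, and the demo in main uses the local platform MBean server as a stand-in. For the four real JVMs, each connection would instead come from JMXConnectorFactory.connect(new JMXServiceURL(...)).getMBeanServerConnection(), with remote JMX enabled on every JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

class ClusterHeap {
    // Reads the HeapMemoryUsage attribute of the standard java.lang:type=Memory MBean.
    static MemoryUsage heapOf(MBeanServerConnection connection) throws Exception {
        ObjectName memoryBean = new ObjectName(ManagementFactory.MEMORY_MXBEAN_NAME);
        CompositeData data = (CompositeData) connection.getAttribute(memoryBean, "HeapMemoryUsage");
        return MemoryUsage.from(data);
    }

    // Sums heap figures over all connections; returns {usedBytes, maxBytes}.
    static long[] totals(List<MBeanServerConnection> connections) throws Exception {
        long used = 0;
        long max = 0;
        for (MBeanServerConnection connection : connections) {
            MemoryUsage heap = heapOf(connection);
            used += heap.getUsed();
            if (heap.getMax() > 0) { // getMax() returns -1 when the maximum is undefined
                max += heap.getMax();
            }
        }
        return new long[] {used, max};
    }

    public static void main(String[] args) throws Exception {
        // Local stand-in for demonstration; replace with one remote connection per JVM.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        long[] result = totals(List.of(local));
        System.out.println("used=" + result[0] + " bytes, max=" + result[1] + " bytes");
    }
}
```

Free memory in the sense of the question would then be roughly max minus used, computed on the summed values or per JVM, depending on what the page should show.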

Limit resource utilization of JNA calls without changing dll

How can you prevent a JNA method-call from exceeding thresholds for CPU utilization, thread-counts, and memory limits?
Background:
I'm working on a safety critical application and one of the non-safety-critical features requires the use of a library written in C. The dlls have been given to me as a black-box and there's no chance that I'll get access to the source code beyond the java interface files. Is there a way to limit the CPU usage, thread-count, and memory used by the JNA code?

See ulimit and sysctl, which are applicable to your overall JVM process (or any other process, for that matter).
It's not readily possible to segment parts of your JVM which are making native accesses via JNA from those that aren't, though.
You should run some profiling while you exercise your shared library to figure out which resources it actually uses, so you can focus on setting limits around those (lsof or strace would be used on Linux; I'm not sure of the equivalent on Windows).

For most operating systems you must call your C code from either a new thread or a new process. I would recommend calling it from a new process, as you can then sandbox it more easily and more thoroughly. Typically on a Unix-like system one switches to a new user set aside for the service, with user resource limits applied to it. On Linux, however, one can use user namespaces and cgroups for more dynamic and flexible sandboxing. On Microsoft Windows one typically uses Job objects for resource sandboxing, but permission-based sandboxing is more complicated (a lot of Windows is easily sandboxable with access controls, but the GUI and window messaging parts make things complicated and annoying).
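From the Java side, the separate-process approach could look something like the sketch below. It only shows starting a child process and enforcing a wall-clock limit; the CPU, memory, and thread limits themselves would still have to come from the operating system mechanisms mentioned above (user limits, cgroups, Job objects), which are not shown. The class name IsolatedRunner is an assumption, and "java -version" merely stands in for a hypothetical helper program that would load the DLL through JNA.

```java
import java.io.File;
import java.util.List;
import java.util.concurrent.TimeUnit;

class IsolatedRunner {
    // Runs the command in a child process and returns its exit code,
    // or -1 if the child had to be killed because it exceeded the timeout.
    static int runWithTimeout(List<String> command, long timeoutSeconds) throws Exception {
        Process child = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // avoid blocking on a full pipe
                .start();
        if (child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return child.exitValue();
        }
        child.destroyForcibly().waitFor();
        return -1;
    }

    // Path of the java launcher belonging to the currently running JVM.
    static String javaBinary() {
        return System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command; a real setup would start the helper that wraps the native library.
        int exitCode = runWithTimeout(List.of(javaBinary(), "-version"), 30);
        System.out.println("child exit code: " + exitCode);
    }
}
```

A crash or runaway loop in the native code then takes down only the child, and the safety-critical JVM just sees a non-zero exit code or the timeout value.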

How to profile a distributed app in java?

I've got an app running on a grid of uniform Java processes (potentially on different physical machines). I'd like to collect CPU usage statistics from a single run of this app. I've gone over profiling tools looking for an option for automatic collection of data but failed to find any in NetBeans, TPTP, jvisualvm, YourKit etc.
Maybe I'm looking in a wrong way?
What I was thinking is:
run the processes on the grid with some special setup that allows them to dump profiling info
run my app as usual - it will push tasks to the grid, the processes will execute the tasks and publish profiling info
use some tool to collect and analyze the profiling results
but I can't find anything even remotely similar to this.
Any thoughts, experience, suggestions?
Thank you!

If you have allowed remote JMX access and you are using Sun JDK 1.6, then try jvisualvm. It has an option for remote JMX connections, though I haven't used it for CPU profiling in a distributed environment.
Note: For CPU profiling your application should be running on SUN JDK 1.6 or above.
Have a look at these links:
JVisualVM
JVisualVM - Working with Remote Applications
Get heap dump from a remote application in Java using JVisualVM
Unable to profile JBoss 5 using jvisualvm
http://www.taranfx.com/java-visualvm

I have used CA Introscope for this type of monitoring. It uses instrumentation to collect metrics over time. As an example, it can be configured to give you a view of all nodes and their performance over time. From that node view, you can drill down to the method level to help you figure out where your bottlenecks are.
Yes, it will provide CPU utilization.
It's a commercial $$$ tool, but it's a great tool for collecting, monitoring and interrogating performance data.

If you look at something like Zabbix (though there are tons of other monitoring tools), it allows gathering data via JMX from a Java app. If you enable JMX in your app and allow it to be queried externally (via TCP/IP), you get access to a lot of the HotSpot internals (free memory etc.) as well as thread stacks. You could then have these values graphed as well. It does need configuration, but I don't think what you're looking for can be done with a one-line script.

Just to add that the profiling information on each node usually contains timestamps.
To match these timestamps, all machines should have closely synchronized clocks (10 ms delta at most).
Cluster nodes should synchronize with a single network time (NTP) server.

You can use a JMX tool, e.g. jmxterm, and wrap it in some code to connect to multiple hosts and poll them for changes. If you are a bit familiar with Python, look at my simple script here for some inspiration: http://rostislav-matl.blogspot.com/2011/02/monitoring-tomcat-with-jmxterm.html .

http://www.hyperic.com/products/open-source-systems-monitoring
I never tried the other tools mentioned in other answers. I was more than satisfied with Hyperic.
It also exposes a web services API, which you can use to write your own analysis tools.

If you know the critical paths you want to analyse I would suggest time stamping your process in key places and combining the logs yourself. This is likely to be a useful addition to your profiling, can be used in production and may be even more useful as a result. (It is for my project)
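Combining the logs can be as simple as parsing each node's timing lines and sorting everything by timestamp. The following is a minimal sketch under assumptions: the line format "<epochMillis> <label>", the node names, and the class name LogMerger are all made up for illustration, and the node clocks are assumed to be synchronized (as another answer points out).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LogMerger {
    // One timing record taken at a key place in the code.
    static final class Entry {
        final long epochMillis;
        final String node;
        final String label;

        Entry(long epochMillis, String node, String label) {
            this.epochMillis = epochMillis;
            this.node = node;
            this.label = label;
        }
    }

    // Parses lines of the assumed form "<epochMillis> <label>" written by one node.
    static List<Entry> parse(String node, List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            int space = line.indexOf(' ');
            long timestamp = Long.parseLong(line.substring(0, space));
            entries.add(new Entry(timestamp, node, line.substring(space + 1)));
        }
        return entries;
    }

    // Merges the per-node lists into one timeline ordered by timestamp.
    static List<Entry> merge(List<List<Entry>> perNode) {
        List<Entry> all = new ArrayList<>();
        perNode.forEach(all::addAll);
        all.sort(Comparator.comparingLong((Entry e) -> e.epochMillis));
        return all;
    }

    public static void main(String[] args) {
        List<Entry> timeline = merge(List.of(
                parse("nodeA", List.of("1000 task-submitted", "1500 result-received")),
                parse("nodeB", List.of("1200 task-started", "1400 task-finished"))));
        for (Entry e : timeline) {
            System.out.println(e.epochMillis + " " + e.node + " " + e.label);
        }
    }
}
```

In the application itself, each key place would write System.currentTimeMillis() plus a label to the node's log, and the differences between consecutive entries in the merged timeline give the time spent on each leg of the critical path.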
I have used YourKit to monitor a number of processes at once. It can show you what is happening in each in real time and collect the results when all is finished.
I don't know if it provides a combined view of what is happening.

I was looking for something similar and found Hyperic
The claim is that the tool can monitor most common applications and systems, gather all the information, and present it in a convenient fashion.
To be honest, this is on my to-do list, so I can't say whether it will do the job or not. Anyway, it seems impressive.

Profiling JVMs with JVMTI, how to distinguish the different JVMs?

I am writing a profiler with the aid of the JVM TI.
In C++ I have written a simple agent, which writes the information collected to a socket. With Java Swing I have built a simple GUI which reads these data from a socket to visualize it.
However I am facing some usability issues. I would like to provide the functionality to start profiling a Java application on request. There is the Attach API which provides the possibility to inject an agent into a running JVM.
But to start a new Java program and inject the agent is a little bit more complicated. One way would be, to make a call to the command line and start the Java program from the GUI Profiler:
java -agentlib:agent Program
I kind of dislike this idea because it is somewhat hacky, but I see no other way. Do you?
To summarize I need two ways to start profiling a JVM:
Start a Java application from scratch and start profiling it directly
Attach to a running JVM and inject the agent to start profiling it
Further, I need to distinguish between the different JVMs that I inspect, but how do I do that? There is no unique identifier for the different JVMs. The Attach API makes it possible to list the different JVMs with their name and id, but what should I do in the first case? Is it possible to inject the agent with arguments?

You can also generate your own GUID in Agent_OnLoad and use that for logging. This way, if some of your processes are short-lived and others long-lived, you can distinguish between recycled PIDs.
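As for passing arguments: the JVM accepts an options string after the agent name (-agentlib:<name>=<options>) and hands it to Agent_OnLoad, so the GUI profiler could also generate the identifier itself and pass it in when it launches the program. A rough sketch of building such a command line follows; the "id=" key is an invented convention that the agent would have to parse, and the class name AgentLauncher is hypothetical.

```java
import java.util.List;
import java.util.UUID;

class AgentLauncher {
    // Builds: java -agentlib:<agentName>=id=<jvmId> <mainClass>
    // The text after the first '=' reaches Agent_OnLoad as its options string.
    static List<String> command(String agentName, String jvmId, String mainClass) {
        return List.of("java", "-agentlib:" + agentName + "=id=" + jvmId, mainClass);
    }

    public static void main(String[] args) {
        String jvmId = UUID.randomUUID().toString();
        List<String> command = command("agent", jvmId, "Program");
        System.out.println(String.join(" ", command));
        // To actually start the profiled program from the GUI:
        // new ProcessBuilder(command).inheritIO().start();
    }
}
```

For the attach case, the Attach API's loadAgentLibrary and loadAgentPath methods likewise take an options string, which is delivered to Agent_OnAttach, so the same identifier scheme can be used for both start modes.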

I solved the problem by using the local process identifier (PID) and the network address to uniquely identify the JVM.
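The answer presumably computes this inside the native agent. Purely as an illustration of the same idea on the Java side (Java 9 or later), such an identifier could be put together as below. The "pid@address" format and the class name JvmIdentity are assumptions, not what the answer's agent actually does.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class JvmIdentity {
    // Formats the identifier; kept separate so the format is easy to change.
    static String of(long pid, String address) {
        return pid + "@" + address;
    }

    // Identifier of the JVM this code runs in.
    static String current() {
        String address;
        try {
            address = InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            address = "unknown-host"; // fallback when the local host name cannot be resolved
        }
        return of(ProcessHandle.current().pid(), address);
    }

    public static void main(String[] args) {
        System.out.println(current());
    }
}
```

Keep in mind that operating systems recycle PIDs, so this pair identifies a JVM only while it is running; adding the start time or a generated GUID makes it unique over time.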

Distributed Program Execution Manager

Given the information about the machines in a cluster (IP address/machine name) and a program (written in Java) to run, is there software (a manager) available that would execute this program and return the output along with the run times on each of the machines?
Currently, I am using a shell script to do this, but I couldn't get the time taken (in seconds) to run the Java program back. It would be good if there were a distributed program execution manager like the one I described above.

Instead of writing your own script, you could simply use something like tentakel or shmux to run your application in parallel on multiple nodes. You can run tentakel as
tentakel 'time <your application name>'
to get the output and the time it takes for the application to run.

I like to use Hudson for stuff like that. It was originally written for performing software builds and tests, but is more generic than that. Basically a controller for managing jobs and executions along with a client to deploy on nodes. Hadoop is another option if you have flexibility to re-write your app for a specific distributed computing framework.

I don't quite understand your question. What "runtime" do you want to get back? What clustering solution are you using? For distributed communication in Java I would recommend JGroups. For a distributed JVM, check Terracotta.
