How do I get memory information of jvm cluster - java

I have to display jvm memory usage data on a page. I need to find the jvm memory stats such as free memory and max memory.
java runtime functions give data only of one jvm. How do I find this for a jvm cluster consisting of 4 jvms.
If possible it could be a unix command or some java function.

since JVM doesnt support clustering out of the box. (assuming you are referring to the standard oracle distribution)
you will have to develop an aggregation of JVM memory stats from different JVMs .

There is no such thing as a "jvm cluster" since JVMs can't really be clustered. Ie, there is no clustering capability in the JVM itself.
Programs (themselves running on a jvm) can be clustered using a third party tool or library (or by writing the relevant code yourself, which I would advise against).
This means that, since there is no core-jdk support for clustering, there is also no java function call that can give certain values for the cluster. The software/tool/library you are using to cluster your program might be able to give this information but you'd have to look that up in the documentation.
For the same reason, there's also no unix call. *nix OSes know nothing about your java cluster, they just know that there are processes running on them that use the CPU and memory and probably do some I/O. They have no idea about any clustering and therefore can not help you with your question.
So, to find what you are looking for:
If it's a true scaling cluster, ie the workload gets automatically divided over the different jvms in the cluster, you'd have to take a look at the documentation for the clustering software (tool/library) you use to find out if they can give you that information.
If you use a third party application (such as Zabbix) to monitor different JVMs you might construct a screen or view which can show you the data for multiple JVMs in one screen. Again, you'd have to look this up in the documentation for that tool.

Related

Java application monitoring over longer time periods?

Using tools like JConsole I can monitor a Java application real time. How can analyze the performance over a longer time period? Let's say over a day? Or week?
Are there simple tools like jConsole I can use?
There are options for the generic "monitor as much parameters as we can" approach:
Command line: jcmd <PID|main class> PerfCounter.print (ref) – you will then need to wrap your head around the names of the properties it outputs, schedule running this periodically, store the data somewhere, visualize it yourself.
A lot (all?) of this information is also exposed by JMX beans. You can then find them out (you can see them and what they export in JConsole for example), and using a command-line tool like e.g. jmxterm you can record the values and visualize them. Same procedure: schedule yourself, record, visualize yourself. It's not too user-friendly, so why I am mentioning this approach is that...
...people usually use a specialized monitoring system (think Graphite, Zabbix, Logstash/Kibana etc. — I am throwing these in just as keys for search together with "Java/JVM/JMX/JFR") that can collect information from Java processes through JMX and nicely present it. Periodic running, storing the timeseries data, visualization is solved by these systems.
JFR ("Java Flight Recorder") is a mechanism built into the JVM that allows to have continuous recording of many JVM + system metrics, dumps them into a file periodically, then you can visualize these with JMC ("Java Mission Control"). It is "cheaper" in the sense that you do not need to install/support a separate monitoring system, but is less accessible (unless paired with a monitoring system): you need to collect, download, process files.
In addition to these, there is jstat which is basically the same as jcmd ... PerfCounter.print, but mostly for memory-related metrics, it has the "run periodically" functionality built in and presents results slightly differently (one "recording" – 1 line).
I would say: if you need to do it once or occasionally, be it over a longer period, and need just a few parameters, like memory/number of threads/..., then target using jstat, jcmd PerfCounter.print; if you need more parameters, then JFR/JMC. If you need it as something that runs alongside your system, always collecting and present, available to people not having admin rights in the system where the JVM resides, then look into the monitoring systems and their integration with Java applications.

JAVA Distributed processing on a single machine (Ironic i know)

I am creating a (semi) big data analysis app. I am utilizing apache-mahout. I am concerned about the fact that with java, I am limited to 4gb of memory. This 4gb limitation seems somewhat wasteful of the memory modern computers have at their disposal. As a solution, I am considering using something like RMI or some form of MapReduce. (I, as of yet, have no experience with either)
First off: is it plausible to have multiple JVM's running on one machine and have them talk? and if so, am I heading in the right direction with the two ideas alluded to above?
Furthermore,
In attempt to keep this an objective question, I will avoid asking "Which is better" and instead will ask:
1) What are key differences (not necessarily in how they work internally, but in how they would be implemented by me, the user)
2) Are there drawbacks or benefits to one or the other and are there certain situations where one or the other is used?
3) Is there another alternative that is more specific to my needs?
Thanks in advance
First, re the 4GB limit, check out Understanding max JVM heap size - 32bit vs 64bit . On a 32 bit system, 4GB is the maximum, but on a 64 bit system the limit is much higher.
It is a common configuration to have multiple jvm's running and communicating on the same machine. Two good examples would be IBM Websphere and Oracle's Weblogic application servers. They run the administrative console in one jvm, and it is not unusual to have three or more "working" jvm's under its control.
This allows each JVM to fail independently without impacting the overall system reactiveness. Recovery is transparent to the end users because some fo the "working" jvm's are still doing their thing while the support team is frantically trying to fix things.
You mentioned both RMI and MapReduce, but in a manner that implies that they fill the same slot in the architecture (communication). I think that it is necessary to point out that they fill different slots - RMI is a communications mechanism, but MapReduce is a workload management strategy. The MapReduce environment as a whole typically depends on having a (any) communication mechanism, but is not one itself.
For the communications layer, some of your choices are RMI, Webservices, bare sockets, MQ, shared files, and the infamous "sneaker net". To a large extent I recommend shying away from RMI because it is relatively brittle. It works as long as nothing unexpected happens, but in a busy production environment it can present challenges at unexpected times. With that said, there are many stable and performant large scale systems built around RMI.
The direction the world is going this week for cross-tier communication is SOA on top of something like spring integration or fuse. SOA abstracts the mechanics of communication out of the equation, allowing you to hook things up on the fly (more or less).
MapReduce (MR) is a way of organizing batched work. The MR algorithm itself is essentially turn the input data into a bunch of maps on input, then reduce it to the minimum amount necessary to produce an output. The MR environment is typically governed by a workload manager which receives jobs and parcels out the work in the jobs to its "worker bees" splattered around the network. The communications mechanism may be defined by the MR library, or by the container(s) it runs in.
Does this help?

JVM memory profiling when multiple applications are running on the same JVM

I am running a Web Based Java application on JBOSS and OFBIZ. I am suspecting some memory leak is happening so, did some memory profiling of the JVM on which the application along with JBOSS and OFBIZ are running. I am suspecting garbage collection is not working as expected for the application.
I used VisulaVM, JConsole, YourKit, etc to do the memory profiling. I could see how much heap memory is getting used, how many classes are getting loaded, how many threads are getting created, etc. But I need to know how much of memory is used only by the application, how much by JBOSS and how much by OFBIZ, respectively. I want to find out who is using how much memory and what is the usage pattern. That will help me identify where the memory leak is happening, and where tuning is needed.
But the memory profilers I ran so far, I was unable to differentiate the usage of each application separately. Can you please tell me which tool can help me with that?
There is no way to do this with Java since the Java runtime has no clear way to say "this is application A and this is B".
When you run several applications in one Java VM, you're just running one: JBoss. JBoss then has a very complex classloader but the app you're profiling is actually JBoss.
To do what you want to do, you have to apply filters but this only works when you have a memory leak in a class which isn't shared between applications (so when com.pany.app.a.Foo leaks, you can do this).
If you can't use filters, you have to look harder at the numbers to figure out what's going on. That means you'll probably have to let the app server run out of memory, create a heap dump and then look for what took most of the memory and work from there.
The only other alternative is to install a second server, deploy just one app there and watch it.
You can install and create Docker containers, allowing you to run processes in isolation. This will allow you to use multiple containers with the same base and without having to install the JDK multiple times. The advantage of this is separation of concerns- Every application can be deployed in a separate container. With this, you can then profile any specific application running on the JVM because each namespace is provided with a completely isolated application's view of the operating environment, including process trees, network, user ids and mounted file system.
Here are a couple of resources for Docker:
Deploying Java applications with Docker
JVM plus Docker: Better together
Docker
Please let me know if you have any questions!
Another good tool to use to find java memory leaks is Plumbr. you can try it out for free, it will find the cause for the java.lang.OutOfMemoryError and even shows you the exact location of the problem along with solution guidelines.
I explored various Java memory profilers, and found that YourKit can give me the closest result. In YourKit dashboard you can get links to individual classes running. So, if you are familiar with the codebase, you will know which class belongs to which app. You click on any class, you will see the CPU, Memory usage related to that. Also, if you notice any issues, YourKit can help you trace back to the particular line of the code in your source java files!
If you add YourKit to Eclipse, clicking on the object name in the 'issue area', will highlight the code line in the particular source file, which is the source of the problem.
Pretty cool!!

How to profile a distributed app in java?

I've got an app running on a grid of uniform java processes (potentially on different physical machines). I'd like to collect cpu usage statistics from a single run of this app. I've went over profiling tools looking for an option of automatic collection of data but failed to find any in netbeans, tptp, jvisualvm, yourkit etc.
Maybe I'm looking in a wrong way?
What I was thinking is:
run the processes on the grid with some special setup that allows them to dump profiling info
run my app as usual - it will push tasks to the grid, the processes will execute the tasks and publish profiling info
uses some tool to collect and analyze the profiling results
but I can't find anything even remotely similar to this.
Any thoughts, experience, suggestions?
Thank you!
If you have allowed remote JMX access and if you are using SUN JDK 1.6 then try using jvisualvm. It has the option of remote JMX connection. Though I haven't it used for profiling CPU in a distributed environment.
Note: For CPU profiling your application should be running on SUN JDK 1.6 or above.
Have a look at these links:
JVisualVM
JVisualVM - Working with Remote Applications
Get heap dump from a remote application in Java using JVisualVM
Unable to profile JBoss 5 using jvisualvm
http://www.taranfx.com/java-visualvm
I have used CA Introscope for this type of monitoring. It uses Instrumentation to collect metrics over time. As an example, it can be configured to provide you a view of all nodes and their performance over time. From that node view, you can drill down to the method level to help you figure out where your bottle necks are.
Yes, it will provide CPU utilization.
It's a commercial $$$ tool, but its a great tool for collecting, monitoring and interrogating performance data.
if you look at something like zabbix (though there are tons others of monitoring tools), this allows for gathering data via JMX from a Java app. And if you enable JMX in your app and allow it to be queried externally (via TCP/IP) you will have access to a lot of the hotspot internals (free memory etc) also thread stacks etc. Then you could have these values graphed as well. It does need configuration but what you're looking for don't think can be done with a one line of a script.
Just to add that profiling information on each node usually contain timestamps.
To match these timestamps all machines should have exactly the same time (10 millis delta maximum)
cluster nodes should synchronize with single source network time server (NTP)
You can use some JMX library, e.g. jmxterm and wrap it in some code to connect to multiple hosts an poll them for changes. If you are abit familiar with Python, look at mys simple script here for some inspiration: http://rostislav-matl.blogspot.com/2011/02/monitoring-tomcat-with-jmxterm.html .
http://www.hyperic.com/products/open-source-systems-monitoring
I never tried other tools mentioned in other answers. I was more than satisfied with hyperic.
It exposes webservices API as well which you can use to write your own analysis tools.
If you know the critical paths you want to analyse I would suggest time stamping your process in key places and combining the logs yourself. This is likely to be a useful addition to your profiling, can be used in production and may be even more useful as a result. (It is for my project)
I have used YourKit to monitor a number of processes at once. It can show you what is happening in each in real time and collect the results when all is finished.
I don't know if it provides a combined view of what is happening.
I was looking for something similar and found Hyperic
Claims are the tool can monitor most common applications ans systems, gather all information and present them in a conveniant fashion.
To be honest this is on my todo list, so I can't say if it will do the job or not. Anyway, it seem impressive.

Profiling Java EE1.5 or SE1.6

I'm new into Java profiling, and I'd like to ask about it.
Does it make sense to profile an application on the server, where I only have the console?
Is there any console profiler which make sense?
Or should I profile the application only on localhost?
VisualVM is able to connect to a remote host. The profiling works the same way as local. It's part of the JDK since JDK 6 update 7.
When you do profiling you should generally try to reproduce the production environment as closely as possible. Differences in hardware (# of cores, memory, etc) and software (OS, JVM version) can make your profiling results as unique as the runtime environment.
For example, what looks like a CPU bottleneck worth optimizing on your local machine might disappear entirely or turn into a disk bottleneck on your production server depending on the differences in the CPU.
All modern profilers will allow to attach to a remotely running JVM so you don't need to worry about only having console access.
What profiler you decide to use will depend on your needs and preferences. Certain profilers will show you "hotspots" where your code is spending the majority of the time and these are often good candidates for optimization.
I prefer to use JProfiler for its extensive features and good performance. I previously used YourKit but switched to JProfiler for its memory and thread profiling features.
Does it make sense to profile an
application on the server, where I
only have the console?
Fortunately, that does not matter, since Java has always (well, for a long time, anyway) supported remote profiling, i.e. the Profiler can run on a different machine than the JVM being profiled and get its data via the network.
All Java profilers that I've ever seen support this, including visualvm, which comes with recent JDKs (in the bin directory).
There's a "quick and dirty" but effective way to find performance problems in Java.
This is a language-agnostic explanation of why it works.
Note that the JDK comes with a built-in profiler, HPROF. HPROF is a bit simplistic, but will find many problems. It is activated simply by invoking the JVM with parameter -agentlib:hprof; it will then automatically run as long as your JVM is running. It collects data until the JVM terminates, then dumps it into a file on the server, which you can analyze.
See e.g. http://java.sun.com/developer/technicalArticles/Programming/HPROF.html for a good introduction. A nice graphical analyzer for HPROF's results is PerfAnal: http://java.sun.com/developer/technicalArticles/Programming/perfanal/

Categories