UPDATES on Oct 25:
Now I found out what's causing the problem.
1) The child process kills itself, that's why strace/perf/auditctl cannot track it down.
2) The JNI call to create a process is triggered from a Java thread. When the thread eventually dies, it's also destroying the process that it creates.
3) In my code to fork and execve() a child process, I have the code to monitor parent process death and kill my child process with the following line: prctl( PR_SET_PDEATHSIG, SIGKILL ); My fault that I didn't pay special attention to this flag before b/c it's considered as a BEST PRACTICE for my other projects where child process is forked from the main thread.
4) If I comment out this line, the problem is gone. The original purpose is to kill the child process when the parent process is gone. Even w/o this flag, it's still the correct behavior. Seems like the ubuntu box default behavior.
5) Finally found it's a kernel bug, fixed in kernel version 3.4.0, my ubuntu box from AWS is kernel version 3.13.0-29-generic.
There are a couple of useful links to the issues:
a) http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
b) prctl(PR_SET_PDEATHSIG, SIGNAL) is called on parent thread exit, not parent process exit.
c) https://bugzilla.kernel.org/show_bug.cgi?id=43300
UPDATES on Oct 15:
Thanks so much for all the suggestions. I am investigating from one area of the system to another area. It's hard 2 find a reason.
I am wondering 2 things.
1) why are powerful tools such as strace, auditctl and perf script not able to track down who caused the kill?
2) Is +++ killed by SIGKILL +++ really means its killed from signal?
ORIGINAL POST
I have a long running C process launched from a Java application server in Ubuntu 12 through the JNI interface. The reason I use JNI interface to start a process instead of through Java's process builder is b/c of the performance reasons. It's very inefficient for java process builder to do IPC especially b/c extra buffering introduces very long delay.
Periodically it is terminated by SIGKILL mysteriously. The way I found out is through strace, which says: "+++ killed by SIGKILL +++"
I checked the following:
It's not a crash.
It's not a OOM. Nothing in dmesg. My process uses only 3.3% of 1Gbytes of memory.
Java layer didn't kill the process. I put a log in the JNI code if the code terminates the process, but no log was written to indicate that.
It's not a permission issue. I tried to run as sudo or a different user, both cases causes the process to be killed.
If I run the process locally in a shell, everything works fine. What's more, in my C code for my long-running process, I ignore the signal SIGHUP. Only when it's running as a child process of Java server, it gets killed.
The process is very CPU intensive. It's using 30% of the CPU. There are lots of voluntary context switch and nonvoluntary_ctxt_switches.
(NEW UPDATE) One IMPORTANT thing very likely related to why my process is killed. If the process do some heavy lifting, it won't be killed, however, sometimes it's doing little CPU intensive work. When that happens, after a while, roughly 1 min, it is killed. It's status is always S(Sleeping) instead of R(Running). It seems that the OS decides to kill the process if it was idle most of the time, and not kill the process if it was busy.
I suspect Java's GC is the culprit, however, Java will NEVER garbage collect a singleton object associated with JNI. (My JNI object is tied to that singleton).
I am puzzled by the reason why it's terminated. Does anyone has a good suggestion how to track it down?
p.s.
On my ubuntu limit -a result is:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7862
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7862
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I tried to increase the limits, and still does not solve the issue.
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Here is proc status when I run cat /proc/$$$/status
Name: mimi_coso
State: S (Sleeping)
Tgid: 2557
Ngid: 0
Pid: 2557
PPid: 2229
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups: 0
VmPeak: 146840 kB
VmSize: 144252 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 36344 kB
VmRSS: 34792 kB
VmData: 45728 kB
VmStk: 136 kB
VmExe: 116 kB
VmLib: 23832 kB
VmPTE: 292 kB
VmSwap: 0 kB
Threads: 1
SigQ: 0/7862
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000004
SigIgn: 0000000000011001
SigCgt: 00000001c00064ee
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: 7fff
Cpus_allowed_list: 0-14
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 16978
nonvoluntary_ctxt_switches: 52120
strace shows:
$ strace -p 22254 -s 80 -o /tmp/debug.lighttpd.txt
read(0, "SGI\0\1\0\0\0\1\0c\0\0\0\t\0\0T\1\2248\0\0\0\0'\1\0\0(\0\0"..., 512) = 113
read(0, "SGI\0\1\0\0\0\1\0\262\1\0\0\10\0\1\243\1\224L\0\0\0\0/\377\373\222D\231\214"..., 512) = 448
sendto(3, "<15>Oct 10 18:34:01 MixCoder[271"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(1, "SGO\0\0\0\0 \272\1\0\0\t\0\1\253\1\243\273\0\0\0\0'\1\0\0\0\0\0\1\242"..., 454) = 454
sendto(3, "<15>Oct 10 18:34:01 MixCoder[271"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(1, "SGO\0\0\0\0 \341\0\0\0\10\0\0\322\1\254Z\0\0\0\0/\377\373R\4\0\17\21!"..., 237) = 237
read(0, "SGI\0\1\0\0\0\1\0)\3\0\0\t\0\3\32\1\224`\0\0\0\0'\1\0\0\310\0\0"..., 512) = 512
read(0, "\344u\233\16\257\341\315\254\272\300\351\302\324\263\212\351\225\365\1\241\225\3+\276J\273\37R\234R\362z"..., 512) = 311
read(0, "SGI\0\1\0\0\0\1\0\262\1\0\0\10\0\1\243\1\224f\0\0\0\0/\377\373\222d[\210"..., 512) = 448
sendto(3, "<15>Oct 10 18:34:01 MixCoder[271"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(1, "SGO\0\0\0\0 %!\0\0\t\0\0+\1\243\335\0\0\0\0\27\0\0\0\0\1B\300\36"..., 8497) = 8497
sendto(3, "<15>Oct 10 18:34:01 MixCoder[271"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(1, "SGO\0\0\0\0 \341\0\0\0\10\0\0\322\1\254t\0\0\0\0/\377\373R\4\0\17\301\31"..., 237) = 237
read(0, "SGI\0\1\0\0\0\1\0\262\1\0\0\10\0\1\243\1\224\200\0\0\0\0/\377\373\222d/\200"..., 512) = 448
sendto(3, "<15>Oct 10 18:34:01 MixCoder[271"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(1, "SGO\0\0\0\0 \341\0\0\0\10\0\0\322\1\254\216\0\0\0\0/\377\373R\4\0\17\361+"..., 237) = 237
read(0, "SGI\0\1\0\0\0\1\0\221\0\0\0\t\0\0\202\1\224\210\0\0\0\0'\1\0\0P\0\0"..., 512) = 159
read(0, unfinished ...)
+++ killed by SIGKILL +++
Assuming that you have root access on your machine, you can enable audit on kill(2) syscall to gather such information.
root # auditctl -a exit,always -F arch=b64 -S kill -F a1=9
root # auditctl -l
LIST_RULES: exit,always arch=3221225534 (0xc000003e) a1=9 (0x9) syscall=kill
root # sleep 99999 &
[2] 11688
root # kill -9 11688
root # ausearch -sc kill
time->Tue Oct 14 00:38:44 2014
type=OBJ_PID msg=audit(1413272324.413:441376): opid=11688 oauid=52872 ouid=0 oses=20 ocomm="sleep"
type=SYSCALL msg=audit(1413272324.413:441376): arch=c000003e syscall=62 success=yes exit=0 a0=2da8 a1=9 a2=0 a3=0 items=0 ppid=6107 pid=6108 auid=52872 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsg
id=0 tty=pts2 ses=20 comm="bash" exe="/bin/bash" key=(null)
The other way is to set up kernel tracing which may be an over-kill when audit system can do same work.
Finally I figured out the reason why.
The child process kills itself and it's a linux kernel bug.
Details:
1) The child process kills itself, that's why strace/perf/auditctl cannot track it down.
2) The JNI call to create a process is triggered from a Java thread. When the thread eventually dies, it's also destroying the process that it creates.
3) In my code to fork and execve() a child process, I have the code to monitor parent process death and kill my child process with the following line: prctl( PR_SET_PDEATHSIG, SIGKILL ); I didn't pay special attention to this flag before b/c it's considered as a BEST PRACTICE for my other projects where child process is forked from the main thread.
4) If I comment out this line, the problem is gone. The original purpose is to kill the child process when the parent process is gone. Even w/o this flag, it's still the correct behavior. Seems like the ubuntu box default behavior.
5) From this article, https://bugzilla.kernel.org/show_bug.cgi?id=43300. it's a kernel bug, fixed in kernel version 3.4.0, my ubuntu box from AWS is kernel version 3.13.0-29-generic.
My machine configuration:
===>Ubuntu 14.04 LTS
===>3.13.0-29-generic
Some useful links to the issues:
a) http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
b) prctl(PR_SET_PDEATHSIG, SIGNAL) is called on parent thread exit, not parent process exit
c) https://bugzilla.kernel.org/show_bug.cgi?id=43300
As I mentioned earlier, the other choice is to use kernel trace, which can be done by perf tool.
# apt-get install linux-tools-3.13.0-35-generic
# perf list | grep kill
syscalls:sys_enter_kill [Tracepoint event]
syscalls:sys_exit_kill [Tracepoint event]
syscalls:sys_enter_tgkill [Tracepoint event]
syscalls:sys_exit_tgkill [Tracepoint event]
syscalls:sys_enter_tkill [Tracepoint event]
syscalls:sys_exit_tkill [Tracepoint event]
# perf record -a -e syscalls:sys_enter_kill sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.634 MB perf.data (~71381 samples) ]
// Open a new shell to kill.
$ sleep 9999 &
[1] 2387
$ kill -9 2387
[1]+ Killed sleep 9999
$ echo $$
9014
// Dump the trace in your original shell.
# perf script
...
bash 9014 [001] 1890350.544971: syscalls:sys_enter_kill: pid: 0x00000953, sig: 0x00000009
Related
Caused by: java.sql.SQLException: Network error IOException: Too many open files
at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:436)
at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:184)
at net.sourceforge.jtds.jdbcx.JtdsDataSource.getConnection(JtdsDataSource.java:186)
at net.sourceforge.jtds.jdbcx.JtdsDataSource.getXAConnection(JtdsDataSource.java:99)
at bitronix.tm.resource.jdbc.PoolingDataSource.createPooledConnection(PoolingDataSource.java:341)
at org.springframework.boot.jta.bitronix.PoolingDataSourceBean.createPooledConnection(PoolingDataSourceBean.java:110)
at bitronix.tm.resource.common.XAPool.createPooledObject(XAPool.java:283)
at bitronix.tm.resource.common.XAPool.grow(XAPool.java:391)
at bitronix.tm.resource.common.XAPool.getInPool(XAPool.java:371)
at bitronix.tm.resource.common.XAPool.getConnectionHandle(XAPool.java:123)
at bitronix.tm.resource.common.XAPool.getConnectionHandle(XAPool.java:91)
at bitronix.tm.resource.jdbc.PoolingDataSource.getConnection(PoolingDataSource.java:258)
... 112 common frames omitted
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at net.sourceforge.jtds.jdbc.SharedSocket.createSocketForJDBC3(SharedSocket.java:288)
at net.sourceforge.jtds.jdbc.SharedSocket.<init>(SharedSocket.java:251)
at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:331)
... 123 common frames omitted
In spring boot application which is running on linux we are using IBM MQ9 with MSSQL 2017
Using com.ibm.mq.jms.MQXAQueueConnectionFactory to connect to 4 different Qs and processing high volume of data.
We have tried increasing the soft limit and hard limit to 8192 but still we get this error around
4k+ above value for command
lsof -u <ouruser> | grep "jsk" | wc -l
Also tried with XX:+MaxFDLimit but no use.
I have tried almost everything that I could find for similar issues suggested in stackoverflow and the issue still persists.
If I run a simple java program like below in the same linux box , I can see this value going till 16380 and then throwing Too Many Open files exception
public class tooMany {
public static void main(String[] args) throws IOException {
System.out.println("Trying to reproduce");
try {
for (int x = 0; x < 1000000; x++) {
System.out.println("Opening the file");
FileInputStream leakyHandle = new FileInputStream("some file to read");
System.out.println("file is opened");
}
System.out.println("Method Should Have Failed");
} catch (IOException e) {
System.out.println("too many open files");
} catch (Exception e) {
System.out.println("Unexpected exception");
}
}
}
but in my application its failing after 4k+ ..
Any help is much needed as this issue is been dragging for long now. The concern is we could not reproduce this in lower environment and seeing this twice a week in PROD..
Output of ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 770116
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 16384
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Background/context:
We are running Java application on one of CompuLab CoM:
https://compulab.co.il/products/computer-on-modules/cm-fx6/#overview
JVM version: Oracle Java 7 ARM 1.7.0_60
OS reference:
http://www.compulab.co.il/workspace/mediawiki/index.php5/CM-FX6_Linux
The application is not trivial: lots of threads, access to Ethernet (LAN), serial interface, GPRS/UMTS modem, access to Internet (ppp deamon), GPS, touch screen, database (SQLite), file system. In other words use OS resources extensively.....
We are observing that Java application (all of its threads) and OS basic functionality randomly hangs. I would say it is a Linux kernel bug but by killing the Java application it recovers and operates normally.
This state always takes exactly 24 minutes. Afterwards it recovers and behaves normally. Average rate of occurrence is once per 24-30 hours.
When it happens, externally invoked events like messages sent to application via Ethernet or serial interface are buffered (by OS probably) and all of them are processed immediately after it recovers.
When I establish SSH connection to device in advance, after it happens the connection is either blocked (all command are buffered and processed after it recovers - 24 mintes) or its working, than:
basic OS utilities does not work: "top" for example
jstack -F does not work, just hangs and does not produce any output
killing Java application by kill -9 PID released the OS and everything starts to operate normally
While it is in this state, the OS each time behaves differently. Other findings:
Basic network based utilities does not work (SSH, FTP) – can not
establish new connection to OS from another machine.
PING from another machine does work until I unplug an plug Ethernet
cable from device, sometimes PING than stops working
Sometimes OS system time hangs as well (not always), after 24 minutes
it continues delayed for 24 minutes.
New USB input devices (mouse, keyboard) can not be connected while in
that state (happens always).
Another strange thing:
A touch screen is used for interaction with a user (driver compiled as kernel module). And it works even while it is hung. Java application (GUI Swing) can handle events like pressing button so I can run some code behind button click handler.
It seems like all threads are blocked but Java Swing can handle some input events and our application precesses them until it needs to interact with already blocked threads or OS (run bash script on button click) or call sleep method. Than it hangs as well.
In other words, the Java application is hung ”partially” - can still handle something.
Already tried:
Tools for JVM remote debugging: Java Mission Control, VisualVM.
Connection was also established before it hung. Everything seemed OK
in terms of thread dump, heap dump etc. (I can send by e-mail). Even
the connection remained and I could see in thees tools that processor
usage dropped to 0 % for JVM.
jstack -F (via SSH): does not work, just hangs and does not produce
any output
I tried to run OS without the driver for touch screen and it still
happened.
I tried to run two parallel Java application. One of them was very
simple – just writing to log timestamps. And both of them hung.
I tried to run System.exit(0) in terms of button click handler while
app. and all threads hung and it does not worked (hung as well)
Questions:
Is it Linux kernel bug or JVM (its ARM implementation) bug?
Is Java (JVM) able to hang and block basic OS functionality (FTP, SSH, system time, other utilities)?
How can I further diagnose/debug this issue when basic utilities like jstack -F does not work?
Do you have any ideas what could be the cause of this issue and why it always recovers exactly after 24 minutes?
Update 1: 2014-07-10
Finally I manage to “catch” this weird state again. Here are my further findings.
Based on nos suggestion I tried run via ssh (established in advanced):
*strace -f -p PID*
Unfortunately the bash script command hung as well (same behavior like with jstack).
As far as the user limit (ulimit) and OS resources are concerned, bellow I report figures taken just after the system recovered from last hung. At that state it had been running for 24 hours and I can confirm that those figures remain roughly the same during long-term operation (no random peeks during operation). From my point of view, they are ok and application is not stepping over any resource or other limit in any way.
Java current heap
Used: 18 MB, Free: 12 MB, Total: 30 MB, Max: 230 MB
Java heap
root#cm-debian:~# /usr/lib/jvm/jdk1.7.0_60/bin/jmap -heap 3242
Attaching to process ID 3242, please wait...
Debugger attached successfully.
Client compiler detected.
JVM version is 24.60-b09
using thread-local object allocation.
Mark Sweep Compact GC
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 249561088 (238.0MB)
NewSize = 1048576 (1.0MB)
MaxNewSize = 4294836224 (4095.875MB)
OldSize = 4194304 (4.0MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 12582912 (12.0MB)
MaxPermSize = 67108864 (64.0MB)
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 10092544 (9.625MB)
used = 6772088 (6.458366394042969MB)
free = 3320456 (3.1666336059570312MB)
67.09991058745942% used
Eden Space:
capacity = 9043968 (8.625MB)
used = 6620336 (6.3136444091796875MB)
free = 2423632 (2.3113555908203125MB)
73.2016743093297% used
From Space:
capacity = 1048576 (1.0MB)
used = 151752 (0.14472198486328125MB)
free = 896824 (0.8552780151367188MB)
14.472198486328125% used
To Space:
capacity = 1048576 (1.0MB)
used = 0 (0.0MB)
free = 1048576 (1.0MB)
0.0% used
tenured generation:
capacity = 22134784 (21.109375MB)
used = 17650936 (16.83324432373047MB)
free = 4483848 (4.276130676269531MB)
79.7429782915433% used
Perm Generation:
capacity = 19136512 (18.25MB)
used = 19023016 (18.141761779785156MB)
free = 113496 (0.10823822021484375MB)
99.40691386183647% used
9597 interned Strings occupying 729344 bytes.
top
top - 11:41:29 up 21:59, 2 users, load average: 1.51, 1.25, 1.22
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.4%us, 8.0%sy, 0.0%ni, 82.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 966780k total, 273080k used, 693700k free, 27216k buffers
Swap: 0k total, 0k used, 0k free, 126352k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3242 root 20 0 398m 79m 11m S 23.6 8.4 346:16.82 java
3889 root 20 0 2804 1096 848 R 5.5 0.1 0:00.07 top
1 root 20 0 2124 688 596 S 0.0 0.1 0:02.92 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:14.32 ksoftirqd/0
5 root 20 0 0 0 0 S 0.0 0.0 0:00.14 kworker/u:0
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
7 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 khelper
java limits
root#cm-debian:~# java -XX:+PrintFlagsFinal -version | grep -iE 'HeapSize|PermSize|ThreadStackSize'
uintx AdaptivePermSizeWeight = 20 {product}
intx CompilerThreadStackSize = 0 {pd product}
uintx ErgoHeapSizeLimit = 0 {product}
uintx HeapSizePerGCThread = 67108864 {product}
uintx InitialHeapSize := 15468480 {product}
uintx LargePageHeapSizeThreshold = 134217728 {product}
uintx MaxHeapSize := 249561088 {product}
uintx MaxPermSize = 67108864 {pd product}
uintx PermSize = 12582912 {pd product}
intx ThreadStackSize = 320 {pd product}
intx VMThreadStackSize = 512 {pd product}
java version "1.7.0_60"
Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
Java HotSpot(TM) Client VM (build 24.60-b09, mixed mode)
process limits
root#cm-debian:~# cat /proc/3242/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes unlimited unlimited processes
Max open files 8192 8192 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 16382 16382 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
system memory info
root#cm-debian:~# cat /proc/meminfo
MemTotal: 966780 kB
MemFree: 694312 kB
Buffers: 27384 kB
Cached: 126364 kB
SwapCached: 0 kB
Active: 140748 kB
Inactive: 107684 kB
Active(anon): 94992 kB
Inactive(anon): 2064 kB
Active(file): 45756 kB
Inactive(file): 105620 kB
Unevictable: 0 kB
Mlocked: 0 kB
HighTotal: 524288 kB
HighFree: 301088 kB
LowTotal: 442492 kB
LowFree: 393224 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 94692 kB
Mapped: 21220 kB
Shmem: 2376 kB
Slab: 13268 kB
SReclaimable: 5284 kB
SUnreclaim: 7984 kB
KernelStack: 960 kB
PageTables: 980 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 483388 kB
Committed_AS: 137260 kB
VmallocTotal: 286720 kB
VmallocUsed: 2928 kB
VmallocChunk: 283040 kB
root#cm-debian:~# vmstat -s
966780 K total memory
272468 K used memory
140776 K active memory
107712 K inactive memory
694312 K free memory
27392 K buffer memory
126404 K swap cache
0 K total swap
0 K used swap
0 K free swap
726963 non-nice user cpu ticks
0 nice user cpu ticks
621187 system cpu ticks
6371123 idle cpu ticks
3683 IO-wait cpu ticks
324 IRQ cpu ticks
2146 softirq cpu ticks
0 stolen cpu ticks
130871 pages paged in
97520 pages paged out
0 pages swapped in
0 pages swapped out
293822206 interrupts
494034482 CPU context switches
1412595732 boot time
3916 forks
threads
root#cm-debian:~# cat /proc/sys/kernel/pid_max
32768
root#cm-debian:~# cat /proc/sys/kernel/threads-max
15102
root#cm-debian:~# cat /proc/sys/vm/max_map_count
65530
root#cm-debian:~# ls -l /proc/3242/task/ | wc -l
33
root#cm-debian:~# ps huH p 3242 | wc -l
32
root#cm-debian:~# grep -s '^Threads' /proc/[0-9]*/status | awk '{ sum += $2; } END { print sum; }'
122
open files / file descriptors
root#cm-debian:~# ls -l /proc/3242/fd | wc -l
81
Update 2: 2014-13-10
This time I logged all Java threads stack traces while the OS was hung (as I stated previously, the touch screen and its events still works so I wrote stack traces to log file in terms of UI button handler).
From my point of view, all threads are in “correct” state (sleeping, waiting for UDP datagram etc..) and it is obvious that the hang is not caused by a Java application SW operation which would took 24 minutes.
10:49:42,293> [INFO ] THREAD stack traces:
****************************************
ID: 56, name: Mpg123AudioPlayer_PASSENGER_ctrlLoop
java.lang.Thread.sleep(Native Method)
java.lang.Thread.sleep(Thread.java:340)
java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.ctrlLoop(MpgAudioOutputPlayer.java:169)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.access$000(MpgAudioOutputPlayer.java:19)
epis5fcc.audio.mpg.MpgAudioOutputPlayer$1.run(MpgAudioOutputPlayer.java:88)
java.lang.Thread.run(Thread.java:745)
ID: 11, name: AWT-EventQueue-0
java.lang.Thread.getStackTrace(Thread.java:1589)
epis5fcc.domain.debug.ThreadStackTracesLogger.log(ThreadStackTracesLogger.java:30)
epis5fcc.ui.settings.FccRegistryScreen$7.actionPerformed(FccRegistryScreen.java:303)
javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2018)
javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2341)
javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:252)
java.awt.Component.processMouseEvent(Component.java:6516)
javax.swing.JComponent.processMouseEvent(JComponent.java:3320)
java.awt.Component.processEvent(Component.java:6281)
java.awt.Container.processEvent(Container.java:2229)
java.awt.Component.dispatchEventImpl(Component.java:4872)
java.awt.Container.dispatchEventImpl(Container.java:2287)
java.awt.Component.dispatchEvent(Component.java:4698)
java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4832)
java.awt.LightweightDispatcher.processMouseEvent(Container.java:4492)
java.awt.LightweightDispatcher.dispatchEvent(Container.java:4422)
java.awt.Container.dispatchEventImpl(Container.java:2273)
java.awt.Window.dispatchEventImpl(Window.java:2719)
java.awt.Component.dispatchEvent(Component.java:4698)
java.awt.EventQueue.dispatchEventImpl(EventQueue.java:735)
java.awt.EventQueue.access$200(EventQueue.java:103)
java.awt.EventQueue$3.run(EventQueue.java:694)
java.awt.EventQueue$3.run(EventQueue.java:692)
java.security.AccessController.doPrivileged(Native Method)
java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:87)
java.awt.EventQueue$4.run(EventQueue.java:708)
java.awt.EventQueue$4.run(EventQueue.java:706)
java.security.AccessController.doPrivileged(Native Method)
java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
java.awt.EventQueue.dispatchEvent(EventQueue.java:705)
java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
java.awt.EventDispatchThread.run(EventDispatchThread.java:91)
ID: 34, name: Mpg123AudioPlayer_DRIVER_ctrlLoop
java.lang.Thread.sleep(Native Method)
java.lang.Thread.sleep(Thread.java:340)
java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.ctrlLoop(MpgAudioOutputPlayer.java:169)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.access$000(MpgAudioOutputPlayer.java:19)
epis5fcc.audio.mpg.MpgAudioOutputPlayer$1.run(MpgAudioOutputPlayer.java:88)
java.lang.Thread.run(Thread.java:745)
ID: 26, name: IOTxUdpAccessLoop_IODispatchAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOUdpAccess.transmitLoop(IOUdpAccess.java:114)
jCommons.comm.io.access.IOAccessBase$2.run(IOAccessBase.java:50)
java.lang.Thread.run(Thread.java:745)
ID: 29, name: MasterLoop_main
java.lang.Thread.sleep(Native Method)
jCommons.master.MasterLoop.ctrlLoop(MasterLoop.java:87)
jCommons.master.MasterLoop.access$000(MasterLoop.java:11)
jCommons.master.MasterLoop$1.run(MasterLoop.java:58)
java.lang.Thread.run(Thread.java:745)
ID: 27, name: IORxSerialPortAccessPollLoop_IOModemAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOSerialPortAccessPoll.reciveLoop(IOSerialPortAccessPoll.java:256)
jCommons.comm.io.access.IOAccessBase$1.run(IOAccessBase.java:43)
java.lang.Thread.run(Thread.java:745)
ID: 31, name: UsbUpdateWatchService_ctrlLoop
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
java.util.concurrent.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:489)
java.util.concurrent.LinkedBlockingDeque.take(LinkedBlockingDeque.java:678)
sun.nio.fs.AbstractWatchService.take(AbstractWatchService.java:118)
jCommons.update.usb.UsbUpdateWatchService.ctrlLoop(UsbUpdateWatchService.java:107)
jCommons.update.usb.UsbUpdateWatchService.access$000(UsbUpdateWatchService.java:25)
jCommons.update.usb.UsbUpdateWatchService$1.run(UsbUpdateWatchService.java:75)
java.lang.Thread.run(Thread.java:745)
ID: 25, name: IORxUdpAccessLoop_IODispatchAccess
java.net.PlainDatagramSocketImpl.receive0(Native Method)
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
java.net.DatagramSocket.receive(DatagramSocket.java:786)
jCommons.comm.io.access.IOUdpAccess.reciveLoop(IOUdpAccess.java:175)
jCommons.comm.io.access.IOAccessBase$1.run(IOAccessBase.java:43)
java.lang.Thread.run(Thread.java:745)
ID: 2, name: Reference Handler
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:503)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
ID: 30, name: VehicleCtrl_ctrlLoop
java.lang.Thread.sleep(Native Method)
epis5fcc.domain.vehicle.control.VehicleCtrl.ctrlLoop(VehicleCtrl.java:74)
jCommons.comm.protocol.ProtCtrlBase$1.run(ProtCtrlBase.java:24)
java.lang.Thread.run(Thread.java:745)
ID: 35, name: Mpg123AudioPlayer_INNER_ctrlLoop
java.lang.Thread.sleep(Native Method)
java.lang.Thread.sleep(Thread.java:340)
java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.ctrlLoop(MpgAudioOutputPlayer.java:169)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.access$000(MpgAudioOutputPlayer.java:19)
epis5fcc.audio.mpg.MpgAudioOutputPlayer$1.run(MpgAudioOutputPlayer.java:88)
java.lang.Thread.run(Thread.java:745)
ID: 21, name: IORxSerialPortAccessPollLoop_IOFccAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOSerialPortAccessPoll.reciveLoop(IOSerialPortAccessPoll.java:256)
jCommons.comm.io.access.IOAccessBase$1.run(IOAccessBase.java:43)
java.lang.Thread.run(Thread.java:745)
ID: 7, name: FileWatchdog
java.lang.Thread.sleep(Native Method)
org.apache.log4j.helpers.FileWatchdog.run(FileWatchdog.java:104)
ID: 8, name: Java2D Disposer
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
sun.java2d.Disposer.run(Disposer.java:145)
java.lang.Thread.run(Thread.java:745)
ID: 17, name: com.google.inject.internal.util.$Finalizer
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
com.google.inject.internal.util.$Finalizer.run(Finalizer.java:114)
ID: 10, name: AWT-XAWT
sun.awt.X11.XToolkit.waitForEvents(Native Method)
sun.awt.X11.XToolkit.run(XToolkit.java:541)
sun.awt.X11.XToolkit.run(XToolkit.java:505)
java.lang.Thread.run(Thread.java:745)
ID: 32, name: Thread-4
sun.nio.fs.LinuxWatchService.poll(Native Method)
sun.nio.fs.LinuxWatchService.access$600(LinuxWatchService.java:47)
sun.nio.fs.LinuxWatchService$Poller.run(LinuxWatchService.java:311)
java.lang.Thread.run(Thread.java:745)
ID: 28, name: IOTxSerialPortAccessPollLoop_IOModemAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOSerialPortAccessPoll.transmitLoop(IOSerialPortAccessPoll.java:187)
jCommons.comm.io.access.IOAccessBase$2.run(IOAccessBase.java:50)
java.lang.Thread.run(Thread.java:745)
ID: 14, name: DestroyJavaVM
ID: 22, name: IOTxSerialPortAccessPollLoop_IOFccAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOSerialPortAccessPoll.transmitLoop(IOSerialPortAccessPoll.java:187)
jCommons.comm.io.access.IOAccessBase$2.run(IOAccessBase.java:50)
java.lang.Thread.run(Thread.java:745)
ID: 19, name: TimerQueue
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
javax.swing.TimerQueue.run(TimerQueue.java:171)
java.lang.Thread.run(Thread.java:745)
ID: 12, name: AWT-Shutdown
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:503)
sun.awt.AWTAutoShutdown.run(AWTAutoShutdown.java:296)
java.lang.Thread.run(Thread.java:745)
ID: 23, name: IORxUdpAccessLoop_IOCityScrnAccess
java.net.PlainDatagramSocketImpl.receive0(Native Method)
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
java.net.DatagramSocket.receive(DatagramSocket.java:786)
jCommons.comm.io.access.IOUdpAccess.reciveLoop(IOUdpAccess.java:175)
jCommons.comm.io.access.IOAccessBase$1.run(IOAccessBase.java:43)
java.lang.Thread.run(Thread.java:745)
ID: 3, name: Finalizer
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
ID: 4, name: Signal Dispatcher
ID: 52, name: pool-3-thread-1
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
ID: 24, name: IOTxUdpAccessLoop_IOCityScrnAccess
java.lang.Thread.sleep(Native Method)
jCommons.comm.io.access.IOUdpAccess.transmitLoop(IOUdpAccess.java:114)
jCommons.comm.io.access.IOAccessBase$2.run(IOAccessBase.java:50)
java.lang.Thread.run(Thread.java:745)
ID: 36, name: RemoteUpdateCtrl_ctrlLoop
java.lang.Thread.sleep(Native Method)
epis5fcc.domain.update.remote.RemoteUpdateCtrl.ctrlLoop(RemoteUpdateCtrl.java:94)
jCommons.comm.protocol.ProtCtrlBase$1.run(ProtCtrlBase.java:24)
java.lang.Thread.run(Thread.java:745)
ID: 55, name: Mpg123AudioPlayer_OUTER_ctrlLoop
java.lang.Thread.sleep(Native Method)
java.lang.Thread.sleep(Thread.java:340)
java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.ctrlLoop(MpgAudioOutputPlayer.java:169)
epis5fcc.audio.mpg.MpgAudioOutputPlayer.access$000(MpgAudioOutputPlayer.java:19)
epis5fcc.audio.mpg.MpgAudioOutputPlayer$1.run(MpgAudioOutputPlayer.java:88)
java.lang.Thread.run(Thread.java:745)
This appear to be a problem related with GPT and local timers simultaneous use.
On Freescale community you can see one more question similar to yours and other from someone given some clarification.
For the resolution, apply this patch.
From the second post you can jump to kernel 3.10.17 from Fresscale or 3.13.3 from kernel.org
Currently I am trying the patch to see if resolves a similar problem.
I have a Java application that uses a native library for some of its functionality. It uses JNI to control the native library and also receives asynchronous callback from the library. You can think of it as a Java frontend and native backend that communicate with each other.
I am facing a memory leak. Shortly after I start the application, the memory slowly but steadily increases. So I tried to look what could cause the leak.
First, I tried replacing the Java frontend with a simple C++ text interface. That way, the application doesn't use Java in any way - and the leaks stopped. So the problem must be in Java frontend.
So I fired up the jvisualVM to see if the heap increases - and it turned out it doesn't. The Java heap size was fairly constant. I even launched the program with xmx32m, but the memory kept increasing well past 100m without any OutOfMemoryErrors. In fact, the jvisualVM showed Java heap at about 7m.
So I dug deeper into the program with WinDbg. I analyzed the heap patterns with !heap -s command and I got this:
Heaps on a freshly run program:
0:059> !heap -s
LFH Key : 0x382288b9
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
00330000 00000002 2048 1704 2048 22 71 2 0 0 LFH
005b0000 00001002 1088 212 1088 68 3 2 0 0 LFH
00aa0000 00001002 1088 108 1088 15 7 2 0 0 LFH
004f0000 00001002 15424 12876 15424 1372 89 9 0 1 LFH
...
0:059> !heap -stat -h 004f0000
heap # 004f0000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
2b110 20 - 562200 (60.36)
98 166e - d5150 (9.33)
6cd20 1 - 6cd20 (4.77)
...
Heaps on a program that has been running for about half an hour:
0:046> !heap -s
LFH Key : 0x5e47ba72
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
006b0000 00000002 2048 1744 2048 46 92 2 0 0 LFH
00200000 00001002 1088 220 1088 68 3 2 0 0 LFH
00950000 00001002 1088 108 1088 15 7 2 0 0 LFH
001b0000 00001002 47808 31936 47808 1855 102 12 0 0 LFH
...
0:046> !heap -stat -h 001b0000
heap # 001b0000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
98 59d1 - 355418 (36.67)
2b110 10 - 2b1100 (29.61)
6cd20 1 - 6cd20 (4.68)
...
Now it can be clearly seen that the leaks are caused by a growing number of blocks with size 98. But when I try to analyze one of the blocks with !heap -p -a, I get:
*** ERROR: Symbol file could not be found. Defaulted to export symbols for jvm.dll
without any stack trace. So the blocks are allocated somewhere inside the jvm.dll, and because there are no pdbs for JVM, I cannot debug the leak further.
I managed to pinpoint where the leak is occuring in my code. All callbacs to the Java frontend pass through one function:
void callback(JNIEnv *env, int stream, double value, char *callbackName){
jclass jni = env->FindClass("nativ/Callbacks");
jmethodID callbackMethodID = env->GetStaticMethodID(jni, callbackName, "(ID)V");
jvalue params[2];
params[0].i = (long)(stream);
params[1].d = value;
env->CallStaticVoidMethodA(jni, callbackMethodID, params); //commenting this out stops the leaks
}
When I comment out the last command, the leaks stop, but I get no feedback back to the frontend.
Could this be a JVM bug? How do I find out?
malloc() internally calls HeapAlloc(). I guess you need a 'Release' method to release the memory allocated by JVM, as long as your library hold reference to JVM's internal state.
I have deployed Java code on two different servers.The code is doing File Writing operations.
On the local server ,parameters are :
uname -a
SunOS snmi5001 5.10 Generic_120011-14 sun4u sparc SUNW,SPARC-Enterprise
ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 389296
coredump(blocks) unlimited
nofiles(descriptors) 20000
vmemory(kbytes) unlimited
Java Version:
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Server VM (build 1.5.0_12-b04, mixed mode)
On a Different(lets say MIT) server :
uname -a
SunOS au11qapcwbtels2 5.10 Generic_147440-05 sun4u sparc SUNW,Sun-Fire-15000
ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited
java -version
java version "1.5.0_32"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_32-b05)
Java HotSpot(TM) Server VM (build 1.5.0_32-b05, mixed mode)
The problem is that the code is running signficatly slower on the MIT server.
Because of the difference in nofiles and stack for the two OS's ,i thought if i change the ulimit -s and ulimit -n it would make a difference.
I cannot change the parameters on MIT server without confirming the problem,so the decreased the ulimit parameters for the local server and retested.But code finished execution is same time.
I have no idea what difference between the OS parameters which could be causing this.
Any help is appreciated.I will post more paramters if anyone tells me what to look for.
EDIT:
For MIT Server
No of CPU: psrinfo -p
24
psrinfo -pv
The physical processor has 2 virtual processors (0 4)
UltraSPARC-IV+ (portid 0 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (1 5)
UltraSPARC-IV+ (portid 1 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (2 6)
UltraSPARC-IV+ (portid 2 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (3 7)
UltraSPARC-IV+ (portid 3 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (32 36)
UltraSPARC-IV+ (portid 32 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (33 37)
UltraSPARC-IV+ (portid 33 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (34 38)
UltraSPARC-IV+ (portid 34 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (35 39)
UltraSPARC-IV+ (portid 35 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (64 68)
UltraSPARC-IV+ (portid 64 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (65 69)
UltraSPARC-IV+ (portid 65 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (66 70)
UltraSPARC-IV+ (portid 66 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (67 71)
UltraSPARC-IV+ (portid 67 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (96 100)
UltraSPARC-IV+ (portid 96 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (97 101)
UltraSPARC-IV+ (portid 97 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (98 102)
UltraSPARC-IV+ (portid 98 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (99 103)
UltraSPARC-IV+ (portid 99 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (128 132)
UltraSPARC-IV+ (portid 128 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (129 133)
UltraSPARC-IV+ (portid 129 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (130 134)
UltraSPARC-IV+ (portid 130 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (131 135)
UltraSPARC-IV+ (portid 131 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (224 228)
UltraSPARC-IV+ (portid 224 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (225 229)
UltraSPARC-IV+ (portid 225 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (226 230)
UltraSPARC-IV+ (portid 226 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (227 231)
UltraSPARC-IV+ (portid 227 impl 0x19 ver 0x24 clock 1800 MHz)
kstat cpu_info :
module: cpu_info instance: 231
name: cpu_info231 class: misc
brand UltraSPARC-IV+
chip_id 227
clock_MHz 1800
core_id 231
cpu_fru hc:///component=SB7
cpu_type sparcv9
crtime 587.102844985
current_clock_Hz 1799843256
device_ID 9223937394446500460
fpu_type sparcv9
implementation UltraSPARC-IV+ (portid 227 impl 0x19 ver 0x24 clock 1800 MHz)
pg_id 48
snaptime 19846866.5310415
state on-line
state_begin 1334854522
For the Local server i could only get the kstat info :
module: cpu_info instance: 0
name: cpu_info0 class: misc
brand SPARC64-VI
chip_id 1024
clock_MHz 2150
core_id 0
cpu_fru hc:///component=/MBU_A/CPUM0
cpu_type sparcv9
crtime 288.5675516
device_ID 250691889836161
fpu_type sparcv9
implementation SPARC64-VI (portid 1024 impl 0x6 ver 0x93 clock 2150 MHz)
snaptime 207506.8330168
state on-line
state_begin 1354493257
module: cpu_info instance: 1
name: cpu_info1 class: misc
brand SPARC64-VI
chip_id 1024
clock_MHz 2150
core_id 0
cpu_fru hc:///component=/MBU_A/CPUM0
cpu_type sparcv9
crtime 323.4572206
device_ID 250691889836161
fpu_type sparcv9
implementation SPARC64-VI (portid 1024 impl 0x6 ver 0x93 clock 2150 MHz)
snaptime 207506.8336113
state on-line
state_begin 1354493292
Similarly total 59 instances .
Also the memory for local server : vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s4 s1 in sy cs us sy id
0 0 0 143845984 93159232 431 895 1249 30 29 0 2 6 0 -0 1 3284 72450 6140 11 3 86
The memory for the MIT server : vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m2 m3 in sy cs us sy id
0 0 0 180243376 184123896 81 786 248 15 15 0 0 3 14 -0 4 1854 7563 2072 1 1 98
df -h for MIT server:
Filesystem Size Used Available Capacity Mounted on
/dev/md/dsk/d0 7.9G 6.7G 1.1G 86% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 171G 1.7M 171G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap2.so.1
7.9G 6.7G 1.1G 86% /platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
7.9G 6.7G 1.1G 86% /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
/dev/md/dsk/d3 7.9G 6.6G 1.2G 85% /var
swap 6.0G 56K 6.0G 1% /tmp
swap 171G 40K 171G 1% /var/run
swap 171G 0K 171G 0% /dev/vx/dmp
swap 171G 0K 171G 0% /dev/vx/rdmp
/dev/md/dsk/d5 2.0G 393M 1.5G 21% /home
/dev/vx/dsk/appdg/oravl
2.0G 17M 2.0G 1% /ora
/dev/md/dsk/d60 1.9G 364M 1.5G 19% /apps/stats
/dev/md/dsk/d4 16G 2.1G 14G 14% /var/crash
/dev/md/dsk/d61 1005M 330M 594M 36% /opt/controlm6
/dev/vx/dsk/appdg/oraproductvl
10G 2.3G 7.6G 24% /ora/product
/dev/md/dsk/d63 963M 1.0M 904M 1% /var/opt/app
/dev/vx/dsk/dmldg/appsdmlsvtvl
1.0T 130G 887G 13% /apps/dml/svt
/dev/vx/dsk/appdg/homeappusersvl
20G 19G 645M 97% /home/app/users
/dev/vx/dsk/dmldg/appsdmlmit2vl
20G 66M 20G 1% /apps/dml/mit2
/dev/vx/dsk/dmldg/datadmlmit2vl
1.9T 1.1T 773G 61% /data/dml/mit2
/dev/md/dsk/d62 9.8G 30M 9.7G 1% /usr/openv/netbackup/logs
df -h for local server :
Filesystem Size Used Available Capacity Mounted on
/dev/dsk/c0t0d0s0 20G 7.7G 12G 40% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 140G 1.6M 140G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
fd 0K 0K 0K 0% /dev/fd
/dev/dsk/c0t0d0s5 9.8G 9.3G 483M 96% /var
swap 140G 504K 140G 1% /tmp
swap 140G 80K 140G 1% /var/run
swap 140G 0K 140G 0% /dev/vx/dmp
swap 140G 0K 140G 0% /dev/vx/rdmp
/dev/dsk/c0t0d0s6 9.8G 9.4G 403M 96% /opt
/dev/vx/dsk/eva8k/tlkhome
2.0G 66M 1.8G 4% /tlkhome
/dev/vx/dsk/eva8k/tlkuser4
48G 26G 20G 57% /tlkuser4
/dev/vx/dsk/eva8k/ST82
1.1G 17M 999M 2% /ST_A_82
/dev/vx/dsk/eva8k/tlkuser11
37G 37G 176M 100% /tlkuser11
/dev/vx/dsk/eva8k/oravl97
20G 12G 7.3G 63% /oravl97
/dev/vx/dsk/eva8k/tlkuser5
32G 23G 8.3G 74% /tlkuser5
/dev/vx/dsk/eva8k/mbtlkproj1
2.0G 18M 1.9G 1% /mbtlkproj1
/dev/vx/dsk/eva8k/Oravol98
38G 25G 12G 68% /oravl98
/dev/vx/dsk/eva8k_new/tlkuser15
57G 57G 0K 100% /tlkuser15
/dev/vx/dsk/eva8k/Oravol1
39G 16G 22G 42% /oravl01
/dev/vx/dsk/eva8k/Oravol99
30G 8.3G 20G 30% /oravl99
/dev/vx/dsk/eva8k/tlkuser9
18G 13G 4.8G 73% /tlkuser9
/dev/vx/dsk/eva8k/oravl08
32G 25G 6.3G 81% /oravl08
/dev/vx/dsk/eva8k/oravl07
46G 45G 1.2G 98% /oravl07
/dev/vx/dsk/eva8k/Oravol3
103G 90G 13G 88% /oravl03
/dev/vx/dsk/eva8k_new/tlkuser12
79G 79G 0K 100% /tlkuser12
/dev/vx/dsk/eva8k/Oravol4
88G 83G 4.3G 96% /oravl04
/dev/vx/dsk/eva8k/oravl999
10G 401M 9.0G 5% /oravl999
/dev/vx/dsk/eva8k_new/tlkuser14
54G 39G 15G 73% /tlkuser14
/dev/vx/dsk/eva8k/Oravol2
85G 69G 14G 84% /oravl02
/dev/vx/dsk/eva8k/sdkhome
1.0G 17M 944M 2% /sdkhome
/dev/vx/dsk/eva8k/tlkuser7
44G 36G 7.8G 83% /tlkuser7
/dev/vx/dsk/eva8k/tlkproj1
1.0G 17M 944M 2% /tlkproj1
/dev/vx/dsk/eva8k/tlkuser3
35G 29G 5.9G 84% /tlkuser3
/dev/vx/dsk/eva8k/tlkuser10
29G 29G 2.7M 100% /tlkuser10
/dev/vx/dsk/eva8k/oravl05
30G 29G 1.2G 97% /oravl05
/dev/vx/dsk/eva8k/oravl06
36G 34G 1.6G 96% /oravl06
/dev/vx/dsk/eva8k/tlkuser6
29G 27G 2.1G 93% /tlkuser6
/dev/vx/dsk/eva8k/tlkuser2
36G 30G 5.8G 84% /tlkuser2
/dev/vx/dsk/eva8k/tlkuser1
66G 49G 16G 75% /tlkuser1
/dev/vx/dsk/eva8k_new/tlkuser13
84G 77G 7.0G 92% /tlkuser13
/dev/vx/dsk/eva8k_new/tlkuser16
44G 37G 6.4G 86% /tlkuser16
/dev/vx/dsk/eva8k/db2
1.0G 593M 404M 60% /opt/db2V8.1
/dev/vx/dsk/eva8k/WebSphere6029
3.0G 2.2G 776M 75% /opt/WebSphere6029
/dev/vx/dsk/eva8k/websphere6
2.0G 88M 1.8G 5% /opt/websphere6
/dev/vx/dsk/eva8k/wli
4.0G 1.4G 2.5G 36% /opt/wli10gR3MP1
/dev/vx/dsk/eva8k/user
2.0G 19M 1.9G 1% /user/telstra/history
dvcinasdm3:/oracle_cdrom/data
576G 576G 206M 100% /oracle_cdrom
dvcinasdm2:/system_kits
822G 818G 4.2G 100% /system_kits
dvcinasdm2:/db_share 295G 283G 13G 96% /db_share
dvcinas2dm2:/system_data/data
315G 283G 32G 90% /system_data
dvcinas2dm2:/ossinfra/data
49G 18G 32G 36% /ossinfra
For local server the command : /usr/sbin/prtpicl -v | egrep "devfs-path|driver-name|subsystem-id" | nawk '/:subsystem-id/ { print $0; getline; print $0; getline; print $0; }' | nawk -F: '{ print $2 }' gives :
subsystem-id 0x13a1
devfs-path /pci#0,600000/pci#0/pci#8/pci#0/scsi#1
driver-name mpt
subsystem-id 0x1648
devfs-path /pci#0,600000/pci#0/pci#8/pci#0/network#2
driver-name bge
subsystem-id 0x1648
devfs-path /pci#0,600000/pci#0/pci#8/pci#0/network#2,1
driver-name bge
subsystem-id 0xfc11
devfs-path /pci#0,600000/pci#0/pci#8/pci#0,1/SUNW,emlxs#1
driver-name emlxs
subsystem-id 0x125e
devfs-path /pci#3,700000/network
driver-name e1000g
subsystem-id 0x125e
devfs-path /pci#3,700000/network
driver-name e1000g
subsystem-id 0x13a1
devfs-path /pci#10,600000/pci#0/pci#8/pci#0/scsi#1
driver-name mpt
subsystem-id 0x1648
devfs-path /pci#10,600000/pci#0/pci#8/pci#0/network
driver-name bge
subsystem-id 0x1648
devfs-path /pci#10,600000/pci#0/pci#8/pci#0/network
driver-name bge
subsystem-id 0xfc11
devfs-path /pci#10,600000/pci#0/pci#8/pci#0,1/SUNW,emlxs#1
driver-name emlxs
For MIT server it gives :
subsystem-id 0xfc00
devfs-path /pci#3d,600000/SUNW,emlxs#1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci#3d,600000/SUNW,emlxs#1,1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci#5d,600000/SUNW,emlxs#1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci#5d,600000/SUNW,emlxs#1,1
driver-name emlxs
on the start of i/o consuming code,iostat -d c3t50001FE1502613A9d7 5 shows :
1161 37 134 0 0 0 0 0 0 329 24 2
3 2 3 0 0 0 0 0 0 554 71 10
195 26 6 0 0 0 0 0 0 853 108 19
37 6 4 0 0 0 0 0 0 1134 143 10
140 8 7 0 0 0 0 0 0 3689 86 7
173 24 85 0 0 0 0 0 0 9914 74 9
0 0 0 0 0 0 0 0 0 12323 114 2
13 9 41 0 0 0 0 0 0 10609 117 2
0 0 0 0 0 0 0 0 0 10746 72 2
sd0 sd1 sd4 ssd134
kps tps serv kps tps serv kps tps serv kps tps serv
1 0 3 0 0 0 0 0 0 11376 137 2
2 0 10 0 0 0 0 0 0 11980 157 3
231 39 14 0 0 0 0 0 0 10584 140 3
785 175 5 0 0 0 0 0 0 13503 170 2
9 4 32 0 0 0 0 0 0 11597 168 2
7 1 6 0 0 0 0 0 0 11555 106 2
On the MIT server iostat shows :
0.0 460.4 0.0 4029.2 0.4 0.6 0.9 1.2 2 11 c6t5006048452A79BD6d206
0.0 885.2 0.0 8349.3 0.5 0.8 0.6 0.9 3 24 c4t5006048452A79BD9d206
0.0 660.0 0.0 5618.8 0.5 0.7 0.7 1.0 2 18 c6t5006048452A79BD6d206
0.0 779.1 0.0 7408.6 0.3 0.7 0.4 0.8 2 21 c4t5006048452A79BD9d206
0.0 569.8 0.0 4893.9 0.3 0.5 0.5 1.0 2 15 c6t5006048452A79BD6d206
0.0 521.5 0.0 5433.6 0.2 0.5 0.3 0.9 1 16 c4t5006048452A79BD9d206
0.0 362.8 0.0 3134.8 0.2 0.4 0.6 1.1 1 10 c6t5006048452A79BD6d206
So,we can see that the kps for local server is much more than that of MIT server,during the time of max i/o operations.
Conclusions on the local and MIT server
A quick glance at your machines:
Local server is a small-chassis Sun Enterprise machine on SPARC VI, possibly a M4000. You are writing data on an external file system (called eva8k_new) over multipathed PCIe slots using a direct SCSI connection. This machine is 3-5 years old.
MIT server is a SunFire 15000 - an old, mainframe-class Solaris server. It has 12 dual-core UltraSPARC IV+ CPUs in the hardware partition that you are running in (the physical chassis can be logically split into several different hardware partitions which cannot see each other at all). You are writing to a SAN over a 1Gb/s or 2Gb/s fibre channel (the LUN might be called dmldg) on multipathed PCI slots. This machine is at least 7 years old, but the technology is 10 years old.
The storage system used on the local and MIT servers are both external. The performance of the storage is dependent on a number of factors including the I/O speed of the physical interface (PCI vs. PCIe) and the interconnect (1 or 2Gb/s fibre channel on the SunFire). This article explains how to get this information.
Theoretical performance problems
The performance of your application may be gated on one of several bottlenecks (assuming no code problems and network latencies/bottlenecks):
CPU: If your CPU were faster, you could get the application to go faster.
Single-threaded: Some applications are bottlenecked on a single thread, and so adding threads/cores does not improve performance.
Multi-thread capable: Sometimes, if the application is multi-threaded, adding more threads/cores can improve performance
Storage IO bandwidth or IOPS: The application is reading from or writing to storage system (including disks). Adding disks, changing RAID type, adding disk cache and other things may improve IO or IOPS; alternatively you might change to another storage subsystem.
IO bandwidth is the maximum amount of data that can pass in a given second, which may saturate first if streaming data to or from a disk
IOPS (IO operations per second) is the maximum number of IO commands (read or write) that can be processed per second. Typically this saturates first for processes that are searching for or in files, or (re)writing small chunks.
Looking at your issue, we can do a quick check:
If the issue is CPU, then:
You should see the CPU utilisation for the java process in top to be very high during program execution (90-99%)
The problem is not likely threading, because the SunFire MIT Server has a good number of cores available, therefore the problem is single-thread performance.
The UltraSPARC IV+ is quite a lot slower than the SPARC VI's. This is easily a noticeable drop, so this might be the reason the MIT server is slower
If the issue is IO, then:
You will see the CPU utilization for the java process in top to be low (probably 50% or lower, but possibly as high as 80% or so as a rule of thumb)
You will see the IO to the disk subsystem using iostat saturate - that is immediately rise to a fixed number and not really 'peak' over this number. The following options might be useful: iostat -d <disk> 5. The throughput value and number of operations/sec will be higher on the local server, and lower on the MIT server
You need to speak to the administrator to see if a faster storage system is available for the MIT server.
All the above is assuming that other processes on the servers are not interfering with the operation of your program - clearly another high-cpu process or one writing a lot to the same disk will affect the performance greatly.
Conclusions
From the CPU data you provide, there is no evidence of a CPU bottleneck.
From the iostat data you provide, as you comment, the IO on the SunFire is significantly below that of the local server. This is likely the result of the attached storage, namely at least one of:
Lower performance of PCI vs. PCIe in the local server
Probable 1Gb/s fibre channel slower than the (possibly faster) SCSI attached storage on the local server
Older and slower disks on the SunFire vs. the local attached storage
(Note that the same SAN appears connected to the local server, so this could be tested).
With clear evidence of a hardware being the cause of the performance difference, there is little that can be done.
Some things may improve the general performance of the application, though. It's a good idea to run a Java profiler on the application. Examples include Netbeans and JProfiler.
The profiler will identify which IO operations are the problem. You might be able to:
Generally improve the algorithm at the bottleneck
Use a caching layer to aggregate multiple write operations before writing once
If using the original Java I/O clases (in java.io), you could rewrite the application to use Java NIO
EDIT: Thoughts on a caching layer
Assumption: That the problematic IO operation is either repeatedly writing small chunks to disk and flushing them, or keeps performing random-access write-to-disk operations. Your application may already be streaming to disk efficiently, in which case caching would not be useful.
When you have an expensive or slow operation in an application, you will want to minimize the number of times it is invoked - ideally to the theoretical minimum which hopefully is 1. However your code may not be doing so - for example you are using an OutputStream and writing small chunks to it and flushing to disk. In this case, you may write each disk block (8k) many times, each time with just a little more data.
Instead, you could use a RAM cache to consolidate all the writes; when you know there will be no more writes to the block, then you write it exactly once to disk. For streaming, Java has the BufferedOutputStream for this for simplistic cases. When you obtain the FileOutputStream instance from the File, wrap the FileOutputStream in the BufferedOutputStream and use only the BufferedOutputStream.
If, however, you are performing true random-access writes (eg using a java.io.RandomAccessFile), and moving the file pointer with RandomAccessFile.seek(), you may want to consider writing a write cache in RAM. Precisely what this would look like depends wholly on your file data structure, but you might want to start with a block paging mechanism. Chapter 1 of Java NIO has an introduction to those concepts, but hopefully you either don't need to go there or you find a close match in the NIO API.
If you are concerned about performance, I wouldn't use such an old version of Java. It's quite likely that the OS calls and native code generated for one architecture is sub-optimal. I would expect the newer architecure to suffer.
Can you compare Java 7 between these machines?
The ulimit suggest the first machine has much more resources. Which model of CPUs and how much memory do the two machines have?
I have a concurrent system with many machines/nodes involved. Each machine run several JVMs doing different stuff. It is a "layered" architecture where each layer consists of many JVM running across the machines. Basically the top-layer JVM receives input from the outside via files, parses the input and sends it as many small records for "storage" in layer-two. Layer-two doesn't actually persist the data itself but actually persists it in layer-three (HBase and Solr) and HBase actually doesn't persist it itself either since it sends it to layer-four (HDFS) for persistence.
Most of the communication among the layers is synchronized so of course it ends up in a lot of threads waiting for lower layers to complete. But I would expect those waiting threads to be "free" wrt CPU usage.
I see a very high iowait (%wa in top) though - something like 80-90% iowait and only 10-20% sys/usr CPU usage. The system seems exhausted - slow to login via ssh and slow to respond to commands etc.
My question is if all those JVM threads waiting for lower layers to complete can cause this? Is it not supposed to be "free" waiting for responses (sockets). Does it matter with respect to this, whether the different layers uses blocking or non-blocking (NIO) io? Exactly in what situations does Linux count something as iowait (%wa in top)? When all threads in all JVMs on the machines are in a situation where it is waiting (counting because there is no other thread to run to do something meaningful in the meantime)? Or does threads waiting also count in %wa even though there are other processes ready to use the CPU for real processing?
I would really want to get a thorough explanation on how it works and how to interpret this high %wa. In the beginning I guessed that it counted as %wa when all threads where waiting, but that there where actually plenty of room for doing more, so I tried to increase the number of threads expecting to get more throughput, but that doesn't happen. So it is a real problem, not just a "visual" problem looking at top.
The output below is taken from a machine where only HBase and HDFS is running. It is on machines with HBase and/or HDFS that the problem i showing (most clearly)
--- jps ---
19498 DataNode
19690 HRegionServer
19327 SecondaryNameNode
---- typical top -------
top - 11:13:21 up 14 days, 18:20, 1 user, load average: 4.83, 4.50, 4.25
Tasks: 99 total, 1 running, 98 sleeping, 0 stopped, 0 zombie
Cpu(s): 14.1%us, 4.3%sy, 0.0%ni, 5.4%id, 74.8%wa, 0.0%hi, 1.3%si, 0.0%st
Mem: 7133800k total, 7099632k used, 34168k free, 55540k buffers
Swap: 487416k total, 248k used, 487168k free, 2076804k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
19690 hbase 20 0 4629m 4.2g 9244 S 51 61.7 194:08.84 java
19498 hdfs 20 0 1030m 116m 9076 S 16 1.7 75:29.26 java
---- iostat -kd 1 ----
root#edrxen1-2:~# iostat -kd 1
Linux 2.6.32-29-server (edrxen1-2) 02/22/2012 _x86_64_ (2 CPU)
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
xvda 3.53 3.36 15.66 4279502 19973226
dm-0 319.44 6959.14 422.37 8876213913 538720280
dm-1 0.00 0.00 0.00 912 624
xvdb 229.03 6955.81 406.71 8871957888 518747772
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
xvda 0.00 0.00 0.00 0 0
dm-0 122.00 3852.00 0.00 3852 0
dm-1 0.00 0.00 0.00 0 0
xvdb 105.00 3252.00 0.00 3252 0
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
xvda 0.00 0.00 0.00 0 0
dm-0 57.00 1712.00 0.00 1712 0
dm-1 0.00 0.00 0.00 0 0
xvdb 78.00 2428.00 0.00 2428 0
--- iostat -x ---
Linux 2.6.32-29-server (edrxen1-2) 02/22/2012 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
8.06 0.00 3.29 65.14 0.08 23.43
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.74 0.35 3.18 6.72 31.32 10.78 0.11 30.28 6.24 2.20
dm-0 0.00 0.00 213.15 106.59 13866.95 852.73 46.04 1.29 14.41 2.83 90.58
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.78 1.12 0.00
xvdb 0.07 86.97 212.73 15.69 13860.27 821.42 64.27 2.44 25.21 3.96 90.47
--- free -o ----
total used free shared buffers cached
Mem: 7133800 7099452 34348 0 55612 2082364
Swap: 487416 248 487168
IO wait on Linux indicates that processes are blocked on uninterruptible I/O. In practice, it typically means that the process is performing disk access -- in this case, I'd guess one of the following:
hdfs is performing a lot of disk accesses, and it's making other disk access slow as a result. (Checking iostat -x may help, as it'll show an extra "%util" column which indicates what percentage of the time the disk is "busy".)
You're running low on system memory under load, and are ending up dipping into swap sometimes.