I have a pretty simple script that is something like the following:
#!/bin/bash
VAR1="$1"
MOREF='sudo run command against $VAR1 | grep name | cut -c7-'
echo $MOREF
When I run this script from the command line and pass it the arguments, I am not getting any output. However, when I run the commands contained within the $MOREF variable, I am able to get output.
How can one take the results of a command that needs to be run within a script, save it to a variable, and then output that variable on the screen?
In addition to backticks `command`, command substitution can be done with $(command) or "$(command)", which I find easier to read, and allows for nesting.
OUTPUT=$(ls -1)
echo "${OUTPUT}"
MULTILINE=$(ls \
-1)
echo "${MULTILINE}"
Quoting (") does matter to preserve multi-line variable values; it is optional on the right-hand side of an assignment, as word splitting is not performed, so OUTPUT=$(ls -1) would work fine.
$(sudo run command)
If you're going to use an apostrophe, you need `, not '. This character is called "backticks" (or "grave accent"):
#!/bin/bash
VAR1="$1"
VAR2="$2"
MOREF=`sudo run command against "$VAR1" | grep name | cut -c7-`
echo "$MOREF"
Some Bash tricks I use to set variables from commands
Sorry, there is a loong answer, but as bash is a shell, where the main goal is to run other unix commands and react on result code and/or output, ( commands are often piped filter, etc... ).
Storing command output in variables is something basic and fundamental.
Therefore, depending on
compatibility (posix)
kind of output (filter(s))
number of variable to set (split or interpret)
execution time (monitoring)
error trapping
repeatability of request (see long running background process, further)
interactivity (considering user input while reading from another input file descriptor)
do I miss something?
First simple, old (obsolete), and compatible way
myPi=`echo '4*a(1)' | bc -l`
echo $myPi
3.14159265358979323844
Compatible, second way
As nesting could become heavy, parenthesis was implemented for this
myPi=$(bc -l <<<'4*a(1)')
Using backticks in script is to be avoided today.
Nested sample:
SysStarted=$(date -d "$(ps ho lstart 1)" +%s)
echo $SysStarted
1480656334
bash features
Reading more than one variable (with Bashisms)
df -k /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/dm-0 999320 529020 401488 57% /
If I just want a used value:
array=($(df -k /))
you could see an array variable:
declare -p array
declare -a array='([0]="Filesystem" [1]="1K-blocks" [2]="Used" [3]="Available" [
4]="Use%" [5]="Mounted" [6]="on" [7]="/dev/dm-0" [8]="999320" [9]="529020" [10]=
"401488" [11]="57%" [12]="/")'
Then:
echo ${array[9]}
529020
But I often use this:
{ read -r _;read -r filesystem size using avail prct mountpoint ; } < <(df -k /)
echo $using
529020
( The first read _ will just drop header line. ) Here, in only one command, you will populate 6 different variables (shown by alphabetical order):
declare -p avail filesystem mountpoint prct size using
declare -- avail="401488"
declare -- filesystem="/dev/dm-0"
declare -- mountpoint="/"
declare -- prct="57%"
declare -- size="999320"
declare -- using="529020"
Or
{ read -a head;varnames=(${head[#]//[K1% -]});varnames=(${head[#]//[K1% -]});
read ${varnames[#],,} ; } < <(LANG=C df -k /)
Then:
declare -p varnames ${varnames[#],,}
declare -a varnames=([0]="Filesystem" [1]="blocks" [2]="Used" [3]="Available" [4]="Use" [5]="Mounted" [6]="on")
declare -- filesystem="/dev/dm-0"
declare -- blocks="999320"
declare -- used="529020"
declare -- available="401488"
declare -- use="57%"
declare -- mounted="/"
declare -- on=""
Or even:
{ read _ ; read filesystem dsk[{6,2,9}] prct mountpoint ; } < <(df -k /)
declare -p mountpoint dsk
declare -- mountpoint="/"
declare -a dsk=([2]="529020" [6]="999320" [9]="401488")
(Note Used and Blocks is switched there: read ... dsk[6] dsk[2] dsk[9] ...)
... will work with associative arrays too: read _ disk[total] disk[used] ...
Other related sample: Parsing xrandr output: and end of Firefox tab by bash in a size of x% of display size? or at AskUbuntu.com Parsing xrandr output
Dedicated fd using unnamed fifo:
There is an elegent way! In this sample, I will read /etc/passwd file:
users=()
while IFS=: read -u $list user pass uid gid name home bin ;do
((uid>=500)) &&
printf -v users[uid] "%11d %7d %-20s %s\n" $uid $gid $user $home
done {list}</etc/passwd
Using this way (... read -u $list; ... {list}<inputfile) leave STDIN free for other purposes, like user interaction.
Then
echo -n "${users[#]}"
1000 1000 user /home/user
...
65534 65534 nobody /nonexistent
and
echo ${!users[#]}
1000 ... 65534
echo -n "${users[1000]}"
1000 1000 user /home/user
This could be used with static files or even /dev/tcp/xx.xx.xx.xx/yyy with x for ip address or hostname and y for port number or with the output of a command:
{
read -u $list -a head # read header in array `head`
varnames=(${head[#]//[K1% -]}) # drop illegal chars for variable names
while read -u $list ${varnames[#],,} ;do
((pct=available*100/(available+used),pct<10)) &&
printf "WARN: FS: %-20s on %-14s %3d <10 (Total: %11u, Use: %7s)\n" \
"${filesystem#*/mapper/}" "$mounted" $pct $blocks "$use"
done
} {list}< <(LANG=C df -k)
And of course with inline documents:
while IFS=\; read -u $list -a myvar ;do
echo ${myvar[2]}
done {list}<<"eof"
foo;bar;baz
alice;bob;charlie
$cherry;$strawberry;$memberberries
eof
Practical sample parsing CSV files:
As this answer is loong enough, for this paragraph,
I just will let you refer to
this answer to How to parse a CSV file in Bash?, I read a file by using an unnamed fifo, using syntax like:
exec {FD}<"$file" # open unnamed fifo for read
IFS=';' read -ru $FD -a headline
while IFS=';' read -ru $FD -a row ;do ...
... But using bash loadable CSV module.
On my website, you may find the same script, reading CSV as inline document.
Sample function for populating some variables:
#!/bin/bash
declare free=0 total=0 used=0 mpnt='??'
getDiskStat() {
{
read _
read _ total used free _ mpnt
} < <(
df -k ${1:-/}
)
}
getDiskStat $1
echo "$mpnt: Tot:$total, used: $used, free: $free."
Nota: declare line is not required, just for readability.
About sudo cmd | grep ... | cut ...
shell=$(cat /etc/passwd | grep $USER | cut -d : -f 7)
echo $shell
/bin/bash
(Please avoid useless cat! So this is just one fork less:
shell=$(grep $USER </etc/passwd | cut -d : -f 7)
All pipes (|) implies forks. Where another process have to be run, accessing disk, libraries calls and so on.
So using sed for sample, will limit subprocess to only one fork:
shell=$(sed </etc/passwd "s/^$USER:.*://p;d")
echo $shell
And with Bashisms:
But for many actions, mostly on small files, Bash could do the job itself:
while IFS=: read -a line ; do
[ "$line" = "$USER" ] && shell=${line[6]}
done </etc/passwd
echo $shell
/bin/bash
or
while IFS=: read loginname encpass uid gid fullname home shell;do
[ "$loginname" = "$USER" ] && break
done </etc/passwd
echo $shell $loginname ...
Going further about variable splitting...
Have a look at my answer to How do I split a string on a delimiter in Bash?
Alternative: reducing forks by using backgrounded long-running tasks
In order to prevent multiple forks like
myPi=$(bc -l <<<'4*a(1)'
myRay=12
myCirc=$(bc -l <<<" 2 * $myPi * $myRay ")
or
myStarted=$(date -d "$(ps ho lstart 1)" +%s)
mySessStart=$(date -d "$(ps ho lstart $$)" +%s)
This work fine, but running many forks is heavy and slow.
And commands like date and bc could make many operations, line by line!!
See:
bc -l <<<$'3*4\n5*6'
12
30
date -f - +%s < <(ps ho lstart 1 $$)
1516030449
1517853288
So we could use a long running background process to make many jobs, without having to initiate a new fork for each request.
You could have a look how reducing forks make Mandelbrot bash, improve from more than eight hours to less than 5 seconds.
Under bash, there is a built-in function: coproc:
coproc bc -l
echo 4*3 >&${COPROC[1]}
read -u $COPROC answer
echo $answer
12
echo >&${COPROC[1]} 'pi=4*a(1)'
ray=42.0
printf >&${COPROC[1]} '2*pi*%s\n' $ray
read -u $COPROC answer
echo $answer
263.89378290154263202896
printf >&${COPROC[1]} 'pi*%s^2\n' $ray
read -u $COPROC answer
echo $answer
5541.76944093239527260816
As bc is ready, running in background and I/O are ready too, there is no delay, nothing to load, open, close, before or after operation. Only the operation himself! This become a lot quicker than having to fork to bc for each operation!
Border effect: While bc stay running, they will hold all registers, so some variables or functions could be defined at initialisation step, as first write to ${COPROC[1]}, just after starting the task (via coproc).
Into a function newConnector
You may found my newConnector function on GitHub.Com or on my own site (Note on GitHub: there are two files on my site. Function and demo are bundled into one unique file which could be sourced for use or just run for demo.)
Sample:
source shell_connector.sh
tty
/dev/pts/20
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30745 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
newConnector /usr/bin/bc "-l" '3*4' 12
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
30952 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
declare -p PI
bash: declare: PI: not found
myBc '4*a(1)' PI
declare -p PI
declare -- PI="3.14159265358979323844"
The function myBc lets you use the background task with simple syntax.
Then for date:
newConnector /bin/date '-f - +%s' #0 0
myDate '2000-01-01'
946681200
myDate "$(ps ho lstart 1)" boottime
myDate now now
read utm idl </proc/uptime
myBc "$now-$boottime" uptime
printf "%s\n" ${utm%%.*} $uptime
42134906
42134906
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
32615 pts/20 S 0:00 \_ /bin/date -f - +%s
3162 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
From there, if you want to end one of background processes, you just have to close its fd:
eval "exec $DATEOUT>&-"
eval "exec $DATEIN>&-"
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
4936 pts/20 Ss 0:00 bash
5256 pts/20 S 0:00 \_ /usr/bin/bc -l
6358 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
which is not needed, because all fd close when the main process finishes.
As they have already indicated to you, you should use `backticks`.
The alternative proposed $(command) works as well, and it also easier to read, but note that it is valid only with Bash or KornShell (and shells derived from those),
so if your scripts have to be really portable on various Unix systems, you should prefer the old backticks notation.
I know three ways to do it:
Functions are suitable for such tasks:**
func (){
ls -l
}
Invoke it by saying func.
Also another suitable solution could be eval:
var="ls -l"
eval $var
The third one is using variables directly:
var=$(ls -l)
OR
var=`ls -l`
You can get the output of the third solution in a good way:
echo "$var"
And also in a nasty way:
echo $var
Just to be different:
MOREF=$(sudo run command against $VAR1 | grep name | cut -c7-)
When setting a variable make sure you have no spaces before and/or after the = sign. I literally spent an hour trying to figure this out, trying all kinds of solutions! This is not cool.
Correct:
WTFF=`echo "stuff"`
echo "Example: $WTFF"
Will Fail with error "stuff: not found" or similar
WTFF= `echo "stuff"`
echo "Example: $WTFF"
If you want to do it with multiline/multiple command/s then you can do this:
output=$( bash <<EOF
# Multiline/multiple command/s
EOF
)
Or:
output=$(
# Multiline/multiple command/s
)
Example:
#!/bin/bash
output="$( bash <<EOF
echo first
echo second
echo third
EOF
)"
echo "$output"
Output:
first
second
third
Using heredoc, you can simplify things pretty easily by breaking down your long single line code into a multiline one. Another example:
output="$( ssh -p $port $user#$domain <<EOF
# Breakdown your long ssh command into multiline here.
EOF
)"
You need to use either
$(command-here)
or
`command-here`
Example
#!/bin/bash
VAR1="$1"
VAR2="$2"
MOREF="$(sudo run command against "$VAR1" | grep name | cut -c7-)"
echo "$MOREF"
If the command that you are trying to execute fails, it would write the output onto the error stream and would then be printed out to the console.
To avoid it, you must redirect the error stream:
result=$(ls -l something_that_does_not_exist 2>&1)
This is another way and is good to use with some text editors that are unable to correctly highlight every intricate code you create:
read -r -d '' str < <(cat somefile.txt)
echo "${#str}"
echo "$str"
You can use backticks (also known as accent graves) or $().
Like:
OUTPUT=$(x+2);
OUTPUT=`x+2`;
Both have the same effect. But OUTPUT=$(x+2) is more readable and the latest one.
Here are two more ways:
Please keep in mind that space is very important in Bash. So, if you want your command to run, use as is without introducing any more spaces.
The following assigns harshil to L and then prints it
L=$"harshil"
echo "$L"
The following assigns the output of the command tr to L2. tr is being operated on another variable, L1.
L2=$(echo "$L1" | tr [:upper:] [:lower:])
Mac/OSX nowadays come with old Bash versions, ie GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21). In this case, one can use:
new_variable="$(some_command)"
A concrete example:
newvar="$(echo $var | tr -d '123')"
Note the (), instead of the usual {} in Bash 4.
Some may find this useful.
Integer values in variable substitution, where the trick is using $(()) double brackets:
N=3
M=3
COUNT=$N-1
ARR[0]=3
ARR[1]=2
ARR[2]=4
ARR[3]=1
while (( COUNT < ${#ARR[#]} ))
do
ARR[$COUNT]=$((ARR[COUNT]*M))
(( COUNT=$COUNT+$N ))
done
I am trying to extract the frequency of the most frequent line in a file using the following command:
sort file.txt | uniq -c | sort -r | head -1| xargs
I am trying to accomplish from within a Java program, using the ProcessBuilder class. Here is how I am passing to its constructor:
ProcessBuilder builder=new ProcessBuilder("/bin/sh", "-c","sort",fileName,"| uniq -c | sort -r | head -1 | xargs");
When I run the program, it just stops executing beyond this line. There are no errors, but the program just halts at this line. What is it that I might be doing wrong?
Thanks!
Try including a filename directly into command:
ProcessBuilder builder=new ProcessBuilder("/bin/sh", "-c","sort " + fileName + " | uniq -c | sort -r | head -1 | xargs");
Despite my luddite tendencies, I now have a phone with Java support - but no Flash support. I also have a copy of Macromedia Flash MX2004, though I'm unlikely to upgrade any time soon.
What I'd like to be able to do is develop some content (including vector animations) in Flash, then use those resources in a Java Micro Edition application. I don't need all features of Flash - in particular, I don't care about ActionScript support. But I do want to be able to load a SWF file (or, perhaps better, an alternative file format that can be generated using a converter tool), and to be able to display animations and use other resources (particularly play sounds) from in that file.
Is there a good library and toolkit to support this kind of thing? Obviously (from the MX2004) it doesn't need to be completely up to date.
On knowledge level - I've been a programmer for decades, and my everyday language these days is C++. However, I have a very limited knowledge of Java, and virtually no knowledge (yet) of Micro Edition and its libraries.
I've already heard of Flash to J2ME converters, but so far as I can see they generate complete applications in one step, rather than treating the SWF file as a source of resources to be controlled from separately written Java code.
EDIT
I get the feeling that this is (with slight modifications) probably quite easy. Java Mobile Edition supports SVG vector graphics. SVG supports animations. There are (I'm pretty certain) ways to convert flash animations to SVG - probably a simple export-to-SVG in the application, though I've not checked.
This in itself doesn't give me a convenient bundle-of-media resources file format, but that's a relatively simple problem to solve, so long as there's a way to "load" SVG and other media files from some kind of non-file stream class that gets its data in turn from the bundle-of-media file.
I've never used Java ME before, so I won't be able to help on that side, but I use actionscript/flash on a daily basis.
The 'easiest' thing I can think of is a 2 step process:
Export your animation as vector sequence via File > Export > Export Movie and choosing the right format (e.g. .ai/.eps/.dxf).
Convert the vector sequence to svg. Inkscape has a few handy SVG conversion tools.
A long winded way would be to write a JSFL script in Flash MX 2004.
You would traverse the shapes for each frame, then write the path data to SVG.
Another slightly different way would be to export the vector sequence as explained above (unfortunately there is no JSFL functionality to automate that), then from JSFL read and loop through each file, parse it and write an SVG.
The only advantage this would give you though is not having to install Inkscape and you wouldn't need to switch to another application.
I wouldn't recommend this though because:
You would need to write a parser (dxf/eps might be the simplest)
You will need to make an SVG and you only have Strings at your disposal (E4X XML support was added in Flash CS3)
I'm not saying it's impossible, it just seems impractical.
Found this thread on the Inkscape forum sharing a bash script that
extracts SWF objects to an SVG file
using SWFTools, but haven't tried that yet. For reference, here is hadi's script:
#!/bin/bash
#USAGE ./swf2svg.sh /path/to/file.swf > output.svg
FILE=$1;
DUMP="dump.txt"
echo '<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="100%" height="100%" version="1.1"
xmlns="http://www.w3.org/2000/svg">
';
swfdump -s $FILE > $DUMP
fillCols=();
lineCols=();
lineWidth=();
FILLREGEX="[0-9]+(\s*)SOLID [0-f]{8}";
LINEREGEX="[0-9]+(\s*)[0-9]\.[0-9]{0,2} [0-f]{8}";
lastStartPoint="";
pathClosedTag="";
firstGroup="TRUE";
firstPath="TRUE";
cat $DUMP | while read line
do
#Remove ( and )
line=`echo $line | sed "s/[()]//g"`
#tmp=`echo $line | egrep -o "DEFINE(SHAPE|SPRITE)"`;
tmp=`echo $line | egrep -o "DEFINE(SHAPE|SPRITE)[0-9]? defines id [0-9]+"`;
if [ "$tmp" != "" ]
then
if [ "$firstGroup" == "TRUE" ]
then
firstGroup="FALSE";
else
if [ "$firstPath" == "FALSE" ]
then
if [ "$lastStartPoint" != "" ]
then
if [ "$lastStartPoint" == "$curPoint" ]
then
pathClosedTag="Z";
fi
fi
lastStartPoint=$curPoint;
echo $pathClosedTag'" />';
fi;
firstPath="TRUE";
echo '</g>';
fi
id=`echo $tmp | awk {'print $4'}`
echo '<g id="'$id'">';
fillCols=();
lineCols=();
lineWidth=();
fi
tmp=`echo $line | egrep -o "($FILLREGEX)?((\s*)$LINEREGEX)?"`;
if [ "$tmp" != "" ]
then
fillInx=`echo $tmp | egrep -o "$FILLREGEX" | awk {'print $1'}`;
fillCol=`echo $tmp | egrep -o "$FILLREGEX" | awk {'print $3'}`;
if [ "$fillCol" != "" ]
then
fillCols[$fillInx]=$fillCol;
fi
lineInx=`echo $tmp | egrep -o "$LINEREGEX" | awk {'print $1'}`;
lineWth=`echo $tmp | egrep -o "$LINEREGEX" | awk {'print $2'}`;
lineCol=`echo $tmp | egrep -o "$LINEREGEX" | awk {'print $3'}`;
if [ "$lineCol" != "" ]
then
lineCols[$lineInx]=$lineCol;
lineWidth[$lineInx]=$lineWth;
fi
fi
tmp=`echo $line | awk {'print $6'}`;
if [ "$tmp" == "lineTo" ]
then
echo $line | awk {'print "L"$7" "$8'}
fi
if [ "$tmp" == "moveTo" ]
then
curPoint=`echo $line | awk {'print $9" "$10'}`;
if [ "$lastStartPoint" != "" ]
then
if [ "$lastStartPoint" == "$curPoint" ]
then
pathClosedTag="Z";
fi
fi
lastStartPoint=$curPoint;
if [ "$firstPath" == "TRUE" ]
then
firstPath="FALSE";
else
echo $pathClosedTag'" />';
fi;
#Remove : and /
line=`echo $line | sed "s/[:/]/ /g"`
fInx=`echo $line | awk '{printf "%d", $4}'`;
lInx=`echo $line | awk '{printf "%d", $6}'`;
stl="";
val=${fillCols[$fInx]:0:6};
if [ $fInx -gt 0 -a "$val" != "" ]
then
stl="fill:#$val;";
fi
val=${lineCols[$lInx]:0:6};
if [ $lInx -gt 0 -a "$val" != "" ]
then
stl=$stl"stroke:#$val;";
val=${lineWidth[$lInx]};
if [ "$val" != "" ]
then
stl=$stl"stroke-width:$val;";
fi
fi
echo '<path style="'$stl'" d="';
echo $line | awk {'print "M"$9" "$10'}
fi
if [ "$tmp" == "splineTo" ]
then
echo $line | awk {'print "Q"$7" "$8" "$9" "$10'}
fi
done
echo 'Z" />';
echo '</g>';
echo '</svg>';
If anybody else using a more recent version of Flash (like CS4 or CS5) reads this, there is a Flash 2 SVG extension available.
I am inspecting a Java process in Linux using
top -H
However, I cannot read the name of the thread in the "COMMAND" column (because it is too long). If I use 'c' to expand the full name of the process, then it is still to long to fit.
How can I obtain the full name of the command?
You can inspect java threads with the tool jstack. It will list the names, stacktraces and other useful information of all threads belonging to the specified process pid.
Edit: The parameter nid in the thread dump of jstack is the hex version of the LWP that is displayed by top in the pid column for threads.
This might be a little old, but here's what I did to kinda merge top and jstack together. I used two scripts, but I'm sure it all could be done in one.
First, I save the output of top with the pids for my java threads into a file and save the jstack output into another file:
#!/bin/sh
top -H -b -n 1 | grep java > /tmp/top.log
jstack -l `ps fax | grep java | grep tomcat | sed "s/ *\([0-9]*\) .*/\1/g"` > /tmp/jstack.log
Then I use a perl script to call the bash script (called cpu-java.sh here) and kinda merge the two files (/tmp/top.log and /tmp/jstack.log):
#!/usr/bin/perl
system("sh cpu-java.sh");
open LOG, "/tmp/top.log" or die $!;
print "PID\tCPU\tMem\tJStack Info\n";
while ($l = <LOG>) {
$pid = $l;
$pid =~ s/root.*//g;
$pid =~ s/ *//g;
$hex_pid = sprintf("%#x", $pid);
#values = split(/\s{2,}/, $l);
$pct = $values[4];
$mem = $values[5];
open JSTACK, "/tmp/jstack.log" or die $!;
while ($j = <JSTACK>){
if ($j =~ /.*nid=.*/){
if ($j =~ /.*$hex_pid.*/){
$j =~ s/\n//;
$pid =~ s/\n//;
print $pid . "\t" . $pct . "\t" . $mem . "\t" . $j . "\n";
}
}
}
close JSTACK;
}
close LOG;
The output helps me to find out which threads are hogging my cpu:
PID CPU Mem JStack Info
22460 0 8.0 "main" prio=10 tid=0x083cb800 nid=0x57bc runnable [0xb6acc000]
22461 0 8.0 "GC task thread#0 (ParallelGC)" prio=10 tid=0x083d2c00 nid=0x57bd runnable
22462 0 8.0 "GC task thread#1 (ParallelGC)" prio=10 tid=0x083d4000 nid=0x57be runnable
22463 0 8.0 "GC task thread#2 (ParallelGC)" prio=10 tid=0x083d5800 nid=0x57bf runnable
22464 0 8.0 "GC task thread#3 (ParallelGC)" prio=10 tid=0x083d7000 nid=0x57c0 runnable
...
Then I can go back to /tmp/jstack.log and take a look at the stack trace for the problematic thread and try to figure out what's going on from there. Of course this solution is platform-dependent, but it should work with most flavors of *nix and some tweaking here and there.
I have created a top-like command specifically for visualizing Java threads ordered by CPU usage and posted the source code at: https://github.com/jasta/jprocps. The command-line syntax is not nearly as rich as top, but it does support some of the same commands:
$ jtop -n 1
Sample output (showing ant and IntelliJ running):
PID TID USER %CPU %MEM THREAD
13480 13483 jasta 104 2.3 main
13480 13497 jasta 86.3 2.3 C2 CompilerThread1
13480 13496 jasta 83.0 2.3 C2 CompilerThread0
4866 4953 jasta 1.0 13.4 AWT-EventQueue-1 12.1.4#IC-129.713, eap:false
4866 14154 jasta 0.9 13.4 ApplicationImpl pooled thread 36
4866 5219 jasta 0.8 13.4 JobScheduler pool 5/8
From this output, I can pull up the thread's stack trace in jconsole or jstack manually and figure out what's going on.
NOTE: jtop is written in Python and requires that jstack be installed.
As far as I found out jstack is outdated as of JDK 8. What I used to retrieve all Java Thread names is:
<JDK_HOME>/bin/jcmd <PID> Thread.print
Check jcmd documentation for more.
With OpenJDK on Linux, JavaThread names don't propagate to native threads, you cannot see java thread name while inspecting native threads with any tool.
However there is some work in progress:
https://bugs.openjdk.java.net/browse/JDK-7102541
http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-July/006211.html
Personally, I find the OpenJDK development tool slow so I just apply patches myself.
Threads don't have names as far as the kernel is concerned; they only have ID numbers. The JVM assigns names to threads, but that's private internal data within the process, which the "top" program can't access (and doesn't know about anyway).
This shell script combines the output from jstack and top to list Java threads by CPU usage. It expects one argument, the account user that owns the processes.
Name: jstack-top.sh
#!/bin/sh
#
# jstack-top - join jstack and top to show cpu usage, etc.
#
# Usage: jstack-top <user> | view -
#
USER=$1
TOPS="/tmp/jstack-top-1.log"
JSKS="/tmp/jstack-top-2.log"
PIDS="$(ps -u ${USER} --no-headers -o pid:1,cmd:1 | grep 'bin/java' | grep -v 'grep' | cut -d' ' -f1)"
if [ -f ${JSKS} ]; then
rm ${JSKS}
fi
for PID in ${PIDS}; do
jstack -l ${PID} | grep "nid=" >>${JSKS}
done
top -u ${USER} -H -b -n 1 | grep "%CPU\|java" | sed -e 's/[[:space:]]*$//' > ${TOPS}
while IFS= read -r TOP; do
NID=$(echo "${TOP}" | sed -e 's/^[[:space:]]*//' | cut -d' ' -f1)
if [ "${NID}" = "PID" ]; then
JSK=""
TOP="${TOP} JSTACK"
else
NID=$(printf 'nid=0x%x' ${NID})
JSK=$(grep "${NID} " ${JSKS})
fi
echo "${TOP} ${JSK}"
done < "${TOPS}"
Expanding on Andre's earlier answer in Perl, here is one in Python that runs significantly faster.
It re-uses files created earlier and does not loop several times over the jstack output:
#!/usr/bin/env python
import re
import sys
import os.path
import subprocess
# Check if jstack.log top.log files are present
if not os.path.exists("jstack.log") or not os.path.exists("top.log"):
# Delete either file
os.remove("jstack.log") if os.path.exists("jstack.log") else None
os.remove("top.log") if os.path.exists("top.log") else None
# And dump them via a bash run
cmd = """
pid=$(ps -e | grep java | sed 's/^[ ]*//g' | cut -d ' ' -f 1)
top -H -b -n 1 | grep java > top.log
/usr/intel/pkgs/java/1.8.0.141/bin/jstack -l $pid > jstack.log
"""
subprocess.call(["bash", "-c", cmd])
# Verify that both files were written
for f in ["jstack.log", "top.log"]:
if not os.path.exists(f):
print "ERROR: Failed to create file %s" % f
sys.exit(1)
# Thread ID parser
jsReg = re.compile('"([^\"]*)".*nid=(0x[0-9a-f]*)')
# Top line parser
topReg = re.compile('^\s*([0-9]*)(\s+[^\s]*){7}\s+([0-9]+)')
# Scan the entire jstack file for matches and put them into a dict
nids = {}
with open("jstack.log", "r") as jstack:
matches = (jsReg.search(l) for l in jstack if "nid=0x" in l)
for m in matches:
nids[m.group(2)] = m.group(1)
# Print header
print "PID\tNID\tCPU\tTHREAD"
# Scan the top output and emit the matches
with open("top.log", "r") as top:
matches = (topReg.search(l) for l in top)
for m in matches:
# Grab the pid, convert to hex and fetch from NIDS
pid = int(m.group(1))
nid = "0x%x" % pid
tname = nids.get(nid, "<MISSING THREAD>")
# Grab CPU percent
pct = int(m.group(3))
# Emit line
print "%d\t%s\t%d\t%s" % (pid, nid, pct, tname)
Old question, but I had just the same problem with top.
It turns out, you can scroll top's output to the right simply by using the cursors keys :)
(but unfortunately there won't be any thread name shown)
You mentioned "Linux". Then using the little tool "threadcpu" might be a solution:
threadcpu_-_show_cpu_usage_of_threads
$ threadcpu -h
threadcpu shows CPU usage of threads in user% and system%
usage:
threadcpu [-h] [-s seconds] [-p path-to-jstack]
options:
-h display this help page
-s measuring interval in seconds, default: 10
-p path to JRE jstack, default: /usr/bin/jstack
example usage:
threadcpu -s 30 -p /opt/java/bin/jstack 2>/dev/null|sort -n|tail -n 12
output columns:
user percent <SPACE> system percent <SPACE> PID/NID [ <SPACE> JVM thread name OR (process name) ]
Some sample outputs:
$ threadcpu |sort -n|tail -n 8
3 0 33113 (klzagent)
3 0 38518 (klzagent)
3 0 9874 (BESClient)
3 41 6809 (threadcpu)
3 8 27353 VM Periodic Task Thread
6 0 31913 hybrisHTTP4
21 8 27347 C2 CompilerThread0
50 41 3244 (BESClient)
$ threadcpu |sort -n|tail -n 8
0 20 52358 (threadcpu)
0 40 32 (kswapd0)
2 50 2863 (BESClient)
11 0 31861 Gang worker#0 (Parallel CMS Threads)
11 0 31862 Gang worker#1 (Parallel CMS Threads)
11 0 31863 Gang worker#2 (Parallel CMS Threads)
11 0 31864 Gang worker#3 (Parallel CMS Threads)
47 10 31865 Concurrent Mark-Sweep GC Thread
$ threadcpu |sort -n|tail -n 8
2 0 14311 hybrisHTTP33
2 4 60077 ajp-bio-8009-exec-11609
2 8 30657 (klzagent)
4 0 5661 ajp-bio-8009-exec-11649
11 16 28144 (batchman)
15 20 3485 (BESClient)
21 0 7652 ajp-bio-8009-exec-11655
25 0 7611 ajp-bio-8009-exec-11654
The output is intentionally very simple to make further processing (e.g. for monitoring) more easy.