Inspecting Java threads in Linux using top

Inspecting Java threads in Linux using top - java

I am inspecting a Java process in Linux using
top -H
However, I cannot read the name of the thread in the "COMMAND" column (because it is too long). If I use 'c' to expand the full name of the process, then it is still to long to fit.
How can I obtain the full name of the command?

You can inspect java threads with the tool jstack. It will list the names, stacktraces and other useful information of all threads belonging to the specified process pid.
Edit: The parameter nid in the thread dump of jstack is the hex version of the LWP that is displayed by top in the pid column for threads.

This might be a little old, but here's what I did to kinda merge top and jstack together. I used two scripts, but I'm sure it all could be done in one.
First, I save the output of top with the pids for my java threads into a file and save the jstack output into another file:
#!/bin/sh
top -H -b -n 1 | grep java > /tmp/top.log
jstack -l `ps fax | grep java | grep tomcat | sed "s/ *\([0-9]*\) .*/\1/g"` > /tmp/jstack.log
Then I use a perl script to call the bash script (called cpu-java.sh here) and kinda merge the two files (/tmp/top.log and /tmp/jstack.log):
#!/usr/bin/perl
system("sh cpu-java.sh");
open LOG, "/tmp/top.log" or die $!;
print "PID\tCPU\tMem\tJStack Info\n";
while ($l = <LOG>) {
$pid = $l;
$pid =~ s/root.*//g;
$pid =~ s/ *//g;
$hex_pid = sprintf("%#x", $pid);
#values = split(/\s{2,}/, $l);
$pct = $values[4];
$mem = $values[5];
open JSTACK, "/tmp/jstack.log" or die $!;
while ($j = <JSTACK>){
if ($j =~ /.*nid=.*/){
if ($j =~ /.*$hex_pid.*/){
$j =~ s/\n//;
$pid =~ s/\n//;
print $pid . "\t" . $pct . "\t" . $mem . "\t" . $j . "\n";
}
}
}
close JSTACK;
}
close LOG;
The output helps me to find out which threads are hogging my cpu:
PID CPU Mem JStack Info
22460 0 8.0 "main" prio=10 tid=0x083cb800 nid=0x57bc runnable [0xb6acc000]
22461 0 8.0 "GC task thread#0 (ParallelGC)" prio=10 tid=0x083d2c00 nid=0x57bd runnable
22462 0 8.0 "GC task thread#1 (ParallelGC)" prio=10 tid=0x083d4000 nid=0x57be runnable
22463 0 8.0 "GC task thread#2 (ParallelGC)" prio=10 tid=0x083d5800 nid=0x57bf runnable
22464 0 8.0 "GC task thread#3 (ParallelGC)" prio=10 tid=0x083d7000 nid=0x57c0 runnable
...
Then I can go back to /tmp/jstack.log and take a look at the stack trace for the problematic thread and try to figure out what's going on from there. Of course this solution is platform-dependent, but it should work with most flavors of *nix and some tweaking here and there.

I have created a top-like command specifically for visualizing Java threads ordered by CPU usage and posted the source code at: https://github.com/jasta/jprocps. The command-line syntax is not nearly as rich as top, but it does support some of the same commands:
$ jtop -n 1
Sample output (showing ant and IntelliJ running):
PID TID USER %CPU %MEM THREAD
13480 13483 jasta 104 2.3 main
13480 13497 jasta 86.3 2.3 C2 CompilerThread1
13480 13496 jasta 83.0 2.3 C2 CompilerThread0
4866 4953 jasta 1.0 13.4 AWT-EventQueue-1 12.1.4#IC-129.713, eap:false
4866 14154 jasta 0.9 13.4 ApplicationImpl pooled thread 36
4866 5219 jasta 0.8 13.4 JobScheduler pool 5/8
From this output, I can pull up the thread's stack trace in jconsole or jstack manually and figure out what's going on.
NOTE: jtop is written in Python and requires that jstack be installed.

As far as I found out jstack is outdated as of JDK 8. What I used to retrieve all Java Thread names is:
<JDK_HOME>/bin/jcmd <PID> Thread.print
Check jcmd documentation for more.

With OpenJDK on Linux, JavaThread names don't propagate to native threads, you cannot see java thread name while inspecting native threads with any tool.
However there is some work in progress:
https://bugs.openjdk.java.net/browse/JDK-7102541
http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-July/006211.html
Personally, I find the OpenJDK development tool slow so I just apply patches myself.

Threads don't have names as far as the kernel is concerned; they only have ID numbers. The JVM assigns names to threads, but that's private internal data within the process, which the "top" program can't access (and doesn't know about anyway).

This shell script combines the output from jstack and top to list Java threads by CPU usage. It expects one argument, the account user that owns the processes.
Name: jstack-top.sh
#!/bin/sh
#
# jstack-top - join jstack and top to show cpu usage, etc.
#
# Usage: jstack-top <user> | view -
#
USER=$1
TOPS="/tmp/jstack-top-1.log"
JSKS="/tmp/jstack-top-2.log"
PIDS="$(ps -u ${USER} --no-headers -o pid:1,cmd:1 | grep 'bin/java' | grep -v 'grep' | cut -d' ' -f1)"
if [ -f ${JSKS} ]; then
rm ${JSKS}
fi
for PID in ${PIDS}; do
jstack -l ${PID} | grep "nid=" >>${JSKS}
done
top -u ${USER} -H -b -n 1 | grep "%CPU\|java" | sed -e 's/[[:space:]]*$//' > ${TOPS}
while IFS= read -r TOP; do
NID=$(echo "${TOP}" | sed -e 's/^[[:space:]]*//' | cut -d' ' -f1)
if [ "${NID}" = "PID" ]; then
JSK=""
TOP="${TOP} JSTACK"
else
NID=$(printf 'nid=0x%x' ${NID})
JSK=$(grep "${NID} " ${JSKS})
fi
echo "${TOP} ${JSK}"
done < "${TOPS}"

Expanding on Andre's earlier answer in Perl, here is one in Python that runs significantly faster.
It re-uses files created earlier and does not loop several times over the jstack output:
#!/usr/bin/env python
import re
import sys
import os.path
import subprocess
# Check if jstack.log top.log files are present
if not os.path.exists("jstack.log") or not os.path.exists("top.log"):
# Delete either file
os.remove("jstack.log") if os.path.exists("jstack.log") else None
os.remove("top.log") if os.path.exists("top.log") else None
# And dump them via a bash run
cmd = """
pid=$(ps -e | grep java | sed 's/^[ ]*//g' | cut -d ' ' -f 1)
top -H -b -n 1 | grep java > top.log
/usr/intel/pkgs/java/1.8.0.141/bin/jstack -l $pid > jstack.log
"""
subprocess.call(["bash", "-c", cmd])
# Verify that both files were written
for f in ["jstack.log", "top.log"]:
if not os.path.exists(f):
print "ERROR: Failed to create file %s" % f
sys.exit(1)
# Thread ID parser
jsReg = re.compile('"([^\"]*)".*nid=(0x[0-9a-f]*)')
# Top line parser
topReg = re.compile('^\s*([0-9]*)(\s+[^\s]*){7}\s+([0-9]+)')
# Scan the entire jstack file for matches and put them into a dict
nids = {}
with open("jstack.log", "r") as jstack:
matches = (jsReg.search(l) for l in jstack if "nid=0x" in l)
for m in matches:
nids[m.group(2)] = m.group(1)
# Print header
print "PID\tNID\tCPU\tTHREAD"
# Scan the top output and emit the matches
with open("top.log", "r") as top:
matches = (topReg.search(l) for l in top)
for m in matches:
# Grab the pid, convert to hex and fetch from NIDS
pid = int(m.group(1))
nid = "0x%x" % pid
tname = nids.get(nid, "<MISSING THREAD>")
# Grab CPU percent
pct = int(m.group(3))
# Emit line
print "%d\t%s\t%d\t%s" % (pid, nid, pct, tname)

Old question, but I had just the same problem with top.
It turns out, you can scroll top's output to the right simply by using the cursors keys :)
(but unfortunately there won't be any thread name shown)

You mentioned "Linux". Then using the little tool "threadcpu" might be a solution:
threadcpu_-_show_cpu_usage_of_threads
$ threadcpu -h
threadcpu shows CPU usage of threads in user% and system%
usage:
threadcpu [-h] [-s seconds] [-p path-to-jstack]
options:
-h display this help page
-s measuring interval in seconds, default: 10
-p path to JRE jstack, default: /usr/bin/jstack
example usage:
threadcpu -s 30 -p /opt/java/bin/jstack 2>/dev/null|sort -n|tail -n 12
output columns:
user percent <SPACE> system percent <SPACE> PID/NID [ <SPACE> JVM thread name OR (process name) ]
Some sample outputs:
$ threadcpu |sort -n|tail -n 8
3 0 33113 (klzagent)
3 0 38518 (klzagent)
3 0 9874 (BESClient)
3 41 6809 (threadcpu)
3 8 27353 VM Periodic Task Thread
6 0 31913 hybrisHTTP4
21 8 27347 C2 CompilerThread0
50 41 3244 (BESClient)
$ threadcpu |sort -n|tail -n 8
0 20 52358 (threadcpu)
0 40 32 (kswapd0)
2 50 2863 (BESClient)
11 0 31861 Gang worker#0 (Parallel CMS Threads)
11 0 31862 Gang worker#1 (Parallel CMS Threads)
11 0 31863 Gang worker#2 (Parallel CMS Threads)
11 0 31864 Gang worker#3 (Parallel CMS Threads)
47 10 31865 Concurrent Mark-Sweep GC Thread
$ threadcpu |sort -n|tail -n 8
2 0 14311 hybrisHTTP33
2 4 60077 ajp-bio-8009-exec-11609
2 8 30657 (klzagent)
4 0 5661 ajp-bio-8009-exec-11649
11 16 28144 (batchman)
15 20 3485 (BESClient)
21 0 7652 ajp-bio-8009-exec-11655
25 0 7611 ajp-bio-8009-exec-11654
The output is intentionally very simple to make further processing (e.g. for monitoring) more easy.

Related

Will this NGS data annotation code pipeline work?

threads=36
task() {
NEWNAME="${1/%.norm.vcf.gz/.snv.indel.vcf.gz}"
java -Xmx8g -jar /home/ubuntu/snpEff/SnpSift.jar annotate /home/ubuntu/data/spliceai_scores.raw.snv.hg19.vcf.gz $1 | java -Xmx8g -jar /home/ubuntu/snpEff/SnpSift.jar annotate /home/ubuntu/data/spliceai_scores.raw.indel.hg19.vcf.gz > $NEWNAME1
}
for file in $(ls *norm.vcf.gz)
do
if [ $(jobs -r | wc -l) -ge $threads ]; then
wait $(jobs -r -p | head -1)
fi
task "$file" &
done
wait
I am now learning NGS pipeline and I would like to annotate two DBs(snv,indel) using SnpSift. Currently I am using 36 threads CPU. Since there are two commands in task() function, annotate snv and indel, will my code wait $(jobs -r -p | head -1) work fine?
Are there any other recommendations to use CPU efficiently?

How to return a value printed by Java to the bash script it is called from? [duplicate]

I have a pretty simple script that is something like the following:
#!/bin/bash
VAR1="$1"
MOREF='sudo run command against $VAR1 | grep name | cut -c7-'
echo $MOREF
When I run this script from the command line and pass it the arguments, I am not getting any output. However, when I run the commands contained within the $MOREF variable, I am able to get output.
How can one take the results of a command that needs to be run within a script, save it to a variable, and then output that variable on the screen?

In addition to backticks `command`, command substitution can be done with $(command) or "$(command)", which I find easier to read, and allows for nesting.
OUTPUT=$(ls -1)
echo "${OUTPUT}"
MULTILINE=$(ls \
-1)
echo "${MULTILINE}"
Quoting (") does matter to preserve multi-line variable values; it is optional on the right-hand side of an assignment, as word splitting is not performed, so OUTPUT=$(ls -1) would work fine.

$(sudo run command)
If you're going to use an apostrophe, you need `, not '. This character is called "backticks" (or "grave accent"):
#!/bin/bash
VAR1="$1"
VAR2="$2"
MOREF=`sudo run command against "$VAR1" | grep name | cut -c7-`
echo "$MOREF"

Some Bash tricks I use to set variables from commands
Sorry, there is a loong answer, but as bash is a shell, where the main goal is to run other unix commands and react on result code and/or output, ( commands are often piped filter, etc... ).
Storing command output in variables is something basic and fundamental.
Therefore, depending on
compatibility (posix)
kind of output (filter(s))
number of variable to set (split or interpret)
execution time (monitoring)
error trapping
repeatability of request (see long running background process, further)
interactivity (considering user input while reading from another input file descriptor)
do I miss something?
First simple, old (obsolete), and compatible way
myPi=`echo '4*a(1)' | bc -l`
echo $myPi
3.14159265358979323844
Compatible, second way
As nesting could become heavy, parenthesis was implemented for this
myPi=$(bc -l <<<'4*a(1)')
Using backticks in script is to be avoided today.
Nested sample:
SysStarted=$(date -d "$(ps ho lstart 1)" +%s)
echo $SysStarted
1480656334
bash features
Reading more than one variable (with Bashisms)
df -k /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/dm-0 999320 529020 401488 57% /
If I just want a used value:
array=($(df -k /))
you could see an array variable:
declare -p array
declare -a array='([0]="Filesystem" [1]="1K-blocks" [2]="Used" [3]="Available" [
4]="Use%" [5]="Mounted" [6]="on" [7]="/dev/dm-0" [8]="999320" [9]="529020" [10]=
"401488" [11]="57%" [12]="/")'
Then:
echo ${array[9]}
529020
But I often use this:
{ read -r _;read -r filesystem size using avail prct mountpoint ; } < <(df -k /)
echo $using
529020
( The first read _ will just drop header line. ) Here, in only one command, you will populate 6 different variables (shown by alphabetical order):
declare -p avail filesystem mountpoint prct size using
declare -- avail="401488"
declare -- filesystem="/dev/dm-0"
declare -- mountpoint="/"
declare -- prct="57%"
declare -- size="999320"
declare -- using="529020"
Or
{ read -a head;varnames=(${head[#]//[K1% -]});varnames=(${head[#]//[K1% -]});
read ${varnames[#],,} ; } < <(LANG=C df -k /)
Then:
declare -p varnames ${varnames[#],,}
declare -a varnames=([0]="Filesystem" [1]="blocks" [2]="Used" [3]="Available" [4]="Use" [5]="Mounted" [6]="on")
declare -- filesystem="/dev/dm-0"
declare -- blocks="999320"
declare -- used="529020"
declare -- available="401488"
declare -- use="57%"
declare -- mounted="/"
declare -- on=""
Or even:
{ read _ ; read filesystem dsk[{6,2,9}] prct mountpoint ; } < <(df -k /)
declare -p mountpoint dsk
declare -- mountpoint="/"
declare -a dsk=([2]="529020" [6]="999320" [9]="401488")
(Note Used and Blocks is switched there: read ... dsk[6] dsk[2] dsk[9] ...)
... will work with associative arrays too: read _ disk[total] disk[used] ...
Other related sample: Parsing xrandr output: and end of Firefox tab by bash in a size of x% of display size? or at AskUbuntu.com Parsing xrandr output
Dedicated fd using unnamed fifo:
There is an elegent way! In this sample, I will read /etc/passwd file:
users=()
while IFS=: read -u $list user pass uid gid name home bin ;do
((uid>=500)) &&
printf -v users[uid] "%11d %7d %-20s %s\n" $uid $gid $user $home
done {list}</etc/passwd
Using this way (... read -u $list; ... {list}<inputfile) leave STDIN free for other purposes, like user interaction.
Then
echo -n "${users[#]}"
1000 1000 user /home/user
...
65534 65534 nobody /nonexistent
and
echo ${!users[#]}
1000 ... 65534
echo -n "${users[1000]}"
1000 1000 user /home/user
This could be used with static files or even /dev/tcp/xx.xx.xx.xx/yyy with x for ip address or hostname and y for port number or with the output of a command:
{
read -u $list -a head # read header in array `head`
varnames=(${head[#]//[K1% -]}) # drop illegal chars for variable names
while read -u $list ${varnames[#],,} ;do
((pct=available*100/(available+used),pct<10)) &&
printf "WARN: FS: %-20s on %-14s %3d <10 (Total: %11u, Use: %7s)\n" \
"${filesystem#*/mapper/}" "$mounted" $pct $blocks "$use"
done
} {list}< <(LANG=C df -k)
And of course with inline documents:
while IFS=\; read -u $list -a myvar ;do
echo ${myvar[2]}
done {list}<<"eof"
foo;bar;baz
alice;bob;charlie
$cherry;$strawberry;$memberberries
eof
Practical sample parsing CSV files:
As this answer is loong enough, for this paragraph,
I just will let you refer to
this answer to How to parse a CSV file in Bash?, I read a file by using an unnamed fifo, using syntax like:
exec {FD}<"$file" # open unnamed fifo for read
IFS=';' read -ru $FD -a headline
while IFS=';' read -ru $FD -a row ;do ...
... But using bash loadable CSV module.
On my website, you may find the same script, reading CSV as inline document.
Sample function for populating some variables:
#!/bin/bash
declare free=0 total=0 used=0 mpnt='??'
getDiskStat() {
{
read _
read _ total used free _ mpnt
} < <(
df -k ${1:-/}
)
}
getDiskStat $1
echo "$mpnt: Tot:$total, used: $used, free: $free."
Nota: declare line is not required, just for readability.
About sudo cmd | grep ... | cut ...
shell=$(cat /etc/passwd | grep $USER | cut -d : -f 7)
echo $shell
/bin/bash
(Please avoid useless cat! So this is just one fork less:
shell=$(grep $USER </etc/passwd | cut -d : -f 7)
All pipes (|) implies forks. Where another process have to be run, accessing disk, libraries calls and so on.
So using sed for sample, will limit subprocess to only one fork:
shell=$(sed </etc/passwd "s/^$USER:.*://p;d")
echo $shell
And with Bashisms:
But for many actions, mostly on small files, Bash could do the job itself:
while IFS=: read -a line ; do
[ "$line" = "$USER" ] && shell=${line[6]}
done </etc/passwd
echo $shell
/bin/bash
or
while IFS=: read loginname encpass uid gid fullname home shell;do
[ "$loginname" = "$USER" ] && break
done </etc/passwd
echo $shell $loginname ...
Going further about variable splitting...
Have a look at my answer to How do I split a string on a delimiter in Bash?
Alternative: reducing forks by using backgrounded long-running tasks
In order to prevent multiple forks like
myPi=$(bc -l <<<'4*a(1)'
myRay=12
myCirc=$(bc -l <<<" 2 * $myPi * $myRay ")
or
myStarted=$(date -d "$(ps ho lstart 1)" +%s)
mySessStart=$(date -d "$(ps ho lstart $$)" +%s)
This work fine, but running many forks is heavy and slow.
And commands like date and bc could make many operations, line by line!!
See:
bc -l <<<$'3*4\n5*6'
12
30
date -f - +%s < <(ps ho lstart 1 $$)
1516030449
1517853288
So we could use a long running background process to make many jobs, without having to initiate a new fork for each request.
You could have a look how reducing forks make Mandelbrot bash, improve from more than eight hours to less than 5 seconds.
Under bash, there is a built-in function: coproc:
coproc bc -l
echo 4*3 >&${COPROC[1]}
read -u $COPROC answer
echo $answer
12
echo >&${COPROC[1]} 'pi=4*a(1)'
ray=42.0
printf >&${COPROC[1]} '2*pi*%s\n' $ray
read -u $COPROC answer
echo $answer
263.89378290154263202896
printf >&${COPROC[1]} 'pi*%s^2\n' $ray
read -u $COPROC answer
echo $answer
5541.76944093239527260816
As bc is ready, running in background and I/O are ready too, there is no delay, nothing to load, open, close, before or after operation. Only the operation himself! This become a lot quicker than having to fork to bc for each operation!
Border effect: While bc stay running, they will hold all registers, so some variables or functions could be defined at initialisation step, as first write to ${COPROC[1]}, just after starting the task (via coproc).
Into a function newConnector
You may found my newConnector function on GitHub.Com or on my own site (Note on GitHub: there are two files on my site. Function and demo are bundled into one unique file which could be sourced for use or just run for demo.)
Sample:
source shell_connector.sh
tty
/dev/pts/20
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30745 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
newConnector /usr/bin/bc "-l" '3*4' 12
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
30952 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
declare -p PI
bash: declare: PI: not found
myBc '4*a(1)' PI
declare -p PI
declare -- PI="3.14159265358979323844"
The function myBc lets you use the background task with simple syntax.
Then for date:
newConnector /bin/date '-f - +%s' #0 0
myDate '2000-01-01'
946681200
myDate "$(ps ho lstart 1)" boottime
myDate now now
read utm idl </proc/uptime
myBc "$now-$boottime" uptime
printf "%s\n" ${utm%%.*} $uptime
42134906
42134906
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
32615 pts/20 S 0:00 \_ /bin/date -f - +%s
3162 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
From there, if you want to end one of background processes, you just have to close its fd:
eval "exec $DATEOUT>&-"
eval "exec $DATEIN>&-"
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
4936 pts/20 Ss 0:00 bash
5256 pts/20 S 0:00 \_ /usr/bin/bc -l
6358 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
which is not needed, because all fd close when the main process finishes.

As they have already indicated to you, you should use `backticks`.
The alternative proposed $(command) works as well, and it also easier to read, but note that it is valid only with Bash or KornShell (and shells derived from those),
so if your scripts have to be really portable on various Unix systems, you should prefer the old backticks notation.

I know three ways to do it:
Functions are suitable for such tasks:**
func (){
ls -l
}
Invoke it by saying func.
Also another suitable solution could be eval:
var="ls -l"
eval $var
The third one is using variables directly:
var=$(ls -l)
OR
var=`ls -l`
You can get the output of the third solution in a good way:
echo "$var"
And also in a nasty way:
echo $var

Just to be different:
MOREF=$(sudo run command against $VAR1 | grep name | cut -c7-)

When setting a variable make sure you have no spaces before and/or after the = sign. I literally spent an hour trying to figure this out, trying all kinds of solutions! This is not cool.
Correct:
WTFF=`echo "stuff"`
echo "Example: $WTFF"
Will Fail with error "stuff: not found" or similar
WTFF= `echo "stuff"`
echo "Example: $WTFF"

If you want to do it with multiline/multiple command/s then you can do this:
output=$( bash <<EOF
# Multiline/multiple command/s
EOF
)
Or:
output=$(
# Multiline/multiple command/s
)
Example:
#!/bin/bash
output="$( bash <<EOF
echo first
echo second
echo third
EOF
)"
echo "$output"
Output:
first
second
third
Using heredoc, you can simplify things pretty easily by breaking down your long single line code into a multiline one. Another example:
output="$( ssh -p $port $user#$domain <<EOF
# Breakdown your long ssh command into multiline here.
EOF
)"

You need to use either
$(command-here)
or
`command-here`
Example
#!/bin/bash
VAR1="$1"
VAR2="$2"
MOREF="$(sudo run command against "$VAR1" | grep name | cut -c7-)"
echo "$MOREF"

If the command that you are trying to execute fails, it would write the output onto the error stream and would then be printed out to the console.
To avoid it, you must redirect the error stream:
result=$(ls -l something_that_does_not_exist 2>&1)

This is another way and is good to use with some text editors that are unable to correctly highlight every intricate code you create:
read -r -d '' str < <(cat somefile.txt)
echo "${#str}"
echo "$str"

You can use backticks (also known as accent graves) or $().
Like:
OUTPUT=$(x+2);
OUTPUT=`x+2`;
Both have the same effect. But OUTPUT=$(x+2) is more readable and the latest one.

Here are two more ways:
Please keep in mind that space is very important in Bash. So, if you want your command to run, use as is without introducing any more spaces.
The following assigns harshil to L and then prints it
L=$"harshil"
echo "$L"
The following assigns the output of the command tr to L2. tr is being operated on another variable, L1.
L2=$(echo "$L1" | tr [:upper:] [:lower:])

Mac/OSX nowadays come with old Bash versions, ie GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21). In this case, one can use:
new_variable="$(some_command)"
A concrete example:
newvar="$(echo $var | tr -d '123')"
Note the (), instead of the usual {} in Bash 4.

Some may find this useful.
Integer values in variable substitution, where the trick is using $(()) double brackets:
N=3
M=3
COUNT=$N-1
ARR[0]=3
ARR[1]=2
ARR[2]=4
ARR[3]=1
while (( COUNT < ${#ARR[#]} ))
do
ARR[$COUNT]=$((ARR[COUNT]*M))
(( COUNT=$COUNT+$N ))
done

Grep for the 5 lines after a specific text string found in files

I have many java files, and I want to find how many times we are logging via
logger.isDebugEnabled(){
logger.Debug("some debug message");
}
To get an idea of how often we may be overusing the isDebugEnabled function. I have found the number of times we have called/where it is called via
grep -r "isDebugEnabled" --include=*.java . | wc -l
But I want to know how many of those are 1 line statements. Does anyone have a good script to search for this or any ideas on the most efficient way of doing this?

Don’t use grep for this, use the following AWK program:
prev ~ /isDebugEnabled/ && $0 ~ /logger\.Debug\("[^"]"\)/ {
print FILENAME ":" NR ": " $0
}
{
prev = $0
}
This program remembers the previous line in the prev variable and thereby allows you to compare two lines at a time.
To actually use it, write:
find . -name '*.java' -print \
| xargs awk 'prev ~ /isDebugEnabled/ && /logger\.Debug\("[^"]"\)/ { print FILENAME ":" NR ": " $0 } { prev = $0 }'

As mentioned in comments grep provides options to print certain number of lines after and before match.
To print lines after match:
grep -A 2 "string to match" file.txt
To print lines before match:
grep -B 2 "string to match" file.txt

Before you try to write a script giving one final answer, try different approaches for insight what is best for what you want. Test with one file.
Do you think, that logger.Debug("The database has ", getDbNum(), "and the server has ", getRemoteNumSpecial(), "records."); is a simple oneliner?
You can collect some numbers for a first orientation. The examples beneath is using x.java as example sourcefile.
# Nr of isDebugEnabled calls
grep -c "logger.isDebugEnabled" x.java
# Some may be comment with //
grep -c "//.*logger.isDebugEnabled" x.java
# How much debug-lines
grep -c "logger.Debug" x.java
# How much debug-lines with more than 1 parameter (having a ,)
grep -c "logger.Debug.*," x.java
# How much debug-lines without the closing ) on the same line
grep "logger.Debug" x.java | grep -v "Debug.*)"
# How much logger.isDebugEnabled() without a logger.Debug on the next line
grep -A1 "logger.isDebugEnabled" x.java | grep -c "logger.Debug"
# How much logger.Debug a "}" on the next line
# The debugline might have a }, so skip lines with logger.Debug
grep -A1 "logger.Debug" x.java | grep -v "logger.Debug" | grep -c "}"

Redis Mass Insertion HSET command

I have been looking all over SO for a working solution but no luck :/
I want to perform the mass insertion using the pipelining feature of redis-cli but I am not able to do so.
I have a JAVA code snippet which create a file containing all the commands to run.
String add = "*4\r\n$4\r\nHSET\r\n$22\r\ndiscountProgramOffers\r\n$" +
key.getBytes().length + "\r\n" + key + "\r\n$" +
json.getBytes().length + "\r\n" + json + "\r\n";
System.out.println(add);
In the above code, I followed mass insertion link present at Redis Documentation site.
And this is a demo String which is getting created.
*4\r\n$4\r\nHSET\r\n$22\r\ndiscountProgramOffers\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n
When I run the file which is created by the snippet, sometimes I get nothing, and sometimes I get this error:
Error writing to the server: Connection reset by peer
EX:
echo -e "$(cat massInsert.txt)" | redis/src/redis-cli --pipe
Error writing to the server: Connection reset by peer
Am I doing something wrong??
Please help.
FYI: I referred to these questions:
Redis Mass Insertion
redis - mass inserts and counters

Looks like a simple off-by-one error in creating the RESP protocol string:) There are 21, not 22 characters in the discountProgramOffers hash name.
It worked for me (also with printf instead of echo -e), but only in Alpine's shell ash (see the other answer for details):
# inspect data
$ cat redis-cmd-tst.txt
*4\r\n$4\r\nHSET\r\n$21\r\ndiscountProgramOffers\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n
# transfer
$ echo -e $(cat redis-cmd-tst.txt) | redis-cli --pipe -a $REDIS_PASSWORD
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 1
# verify
$ redis-cli -a $REDIS_PASSWORD hget discountProgramOffers mykey
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
"myvalue"

If you are doing everything right, but it still does not work, it may be your linux shell's fault and you may need to switch the shell (e.g. to BusyBox ash on Alpine, where Redis is usually conveniently installed, at least in containerized versions).
Correctly formed RESP strings should display in full (i.e. with each keyword and key or value name in separate line) when using either of these commands:
$ printf $(cat redis-cmd-tst.txt)
*1
$4
HSET
$21
discountProgramOffers
$6
mykey1
$8
myvalue1
$ echo -e $(cat /tmp/tst/redis-cmd-tst.txt)
*1
$4
HSET
$21
discountProgramOffers
$6
mykey1
$8
myvalue1
Currently I've verified it works either in BusyBox ash (linked to /bin/sh) on Alpine:
$ echo $0
/bin/ash
$ /bin/ash --help
BusyBox v1.35.0 (2022-08-01 15:14:44 UTC) multi-call binary.
Usage: ash [-il] [-|+Cabefmnuvx] [-|+o OPT]... [-c 'SCRIPT' [ARG0 ARGS] | FILE ARGS | -s ARGS]
Unix shell interpreter
$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.16.2
PRETTY_NAME="Alpine Linux v3.16"
... while it fails to print correctly RESP files in either shell preinstalled on Ubuntu (so both in bash and in dash, to which sh is linked on Ubuntu):
$ printf $(cat dict-to-redis-working.txt)
$ echo -e $(cat dict-to-redis-working.txt)
31T3
$ echo $0
/bin/dash
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.5 LTS"
VERSION_ID="20.04"
UBUNTU_CODENAME=focal

Get Unix Thread Dump

I am trying to get a unix (solaris and linux) thread dump on a java application.
1) When the java application is a tomcat web application,
using kill -3 , the dump goes to the catalina.out file, as this is standard output.
kill -3 pid > td.out does not work.
2) For another spring standalone java application, how do I find the standard output for it.
I have used:
kill -3 pid, and I have checked in my application logs, and I cannot find anything.
Please advise how I can determine standard output for the java application and see the thread dump.
Thanks,
B.

If you're using OpenJDK or Sun JDK 6 or later, try the jstack command in the bin folder. This is useful when redirecting standard out to a file is problematic for some reason. Execute the following, passing in the Java process ID:
jstack -l JAVA_PID > jstack.out

Try using Process
Process p = null;
String cmd[] = {"bash","-c","ps -C java | awk '{ print $1; }' | sed -n '2{p;q;}'"};
try
{
p = Runtime.getRuntime().exec(cmd);
try
{
p.waitFor();
System.out.println("Executed Successfully");
}catch(Exception e)
{
e.printStackTrace();
}
}catch(Exception e)
{
e.printStackTrace();
}
Then read the line by BufferReader and use the above function to kill and output the same
Explanation for the command :
ps -C java | awk '{ print $1; }' | sed -n '2{p;q;}'
" ps -C ProcessName " tells the process if its running with pid, here Java is used as ProcessName
" awk '{ print $1; }' " outputs the 1st row of the output
" sed -n '2{p;q;} " to display the 2nd column

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Inspecting Java threads in Linux using top - java

I am inspecting a Java process in Linux using top -H However, I cannot read the name of the thread in the "COMMAND" column (because it is too long). If I use 'c' to expand the full name of the process, then it is still to long to fit. How can I obtain the full name of the command?

As far as I found out jstack is outdated as of JDK 8. What I used to retrieve all Java Thread names is: <JDK_HOME>/bin/jcmd <PID> Thread.print Check jcmd documentation for more.

Threads don't have names as far as the kernel is concerned; they only have ID numbers. The JVM assigns names to threads, but that's private internal data within the process, which the "top" program can't access (and doesn't know about anyway).

Old question, but I had just the same problem with top. It turns out, you can scroll top's output to the right simply by using the cursors keys :) (but unfortunately there won't be any thread name shown)

Related

Will this NGS data annotation code pipeline work?

How to return a value printed by Java to the bash script it is called from? [duplicate]

Grep for the 5 lines after a specific text string found in files

Redis Mass Insertion HSET command

Get Unix Thread Dump

Categories

Resources