Reliably restarting a java process after a crash in a shell script - java

I have a jar that when run, goes through the files in a directory and processes 10 of them before exiting.
I have a shell script that looks something like this:
while true;
do java -jar myjar.jar
sleep 2;
done
I have another shell script that runs the previous one on startup like so:
nohup loopscript.sh > /var/log/error.log
The problem is that sometimes the jar crashes when it needs more memory than the system has, and the entire loop seems to stop running. My log file ends with a stack trace when the memory cap is hit.
How can I reliably restart the loop after a crash? I read elsewhere on SO to do something like
until myserver; do
echo "Server 'myserver' crashed with exit code $?. Respawning.." >&2
sleep 1
done
But this only works if myserver is itself in a loop, and I'm intentionally halting the jar after 10 runs to force garbage collection and reduce the chance of a crash midway. Is my logic flawed? Should I just put the jar into a loop and use the above method of restarting it when it crashes?

As a quick and dirty solution, you can kill the process after some timeout. Here are two scripts in a parent-child relationship:
b.sh - parent
echo Parent running
while true; do
./a.sh &
pid=$!
echo Child running as $pid
sleep 2
# kill -0 sends no signal; it only tests whether the child is still alive
if kill -0 "$pid" 2>/dev/null; then
kill "$pid" >/dev/null 2>&1
echo Killed $pid
fi
done
a.sh - child
echo Child running
seconds=$RANDOM
let "seconds %= 4"
sleep $seconds
echo Child finished
However, as @Jim Garrison notes, it's probably much better to design your app to run correctly, whatever that means in your case. This way, you can actually improve your app and find out why it needs that much memory. You'll probably also fix some cases that would pop up in the future but are not visible now, because you are just "solving" the problem by restarting.
It's like playing Russian roulette - yes, you may get lucky 20 times in a row, but it's going to happen...
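If restarting is still wanted as a stopgap, the loop can be written so that it reruns the command whether it exits cleanly (after its 10 files) or crashes, and logs any nonzero exit status. This is only a minimal sketch: run_forever, MAX_RUNS and RESTART_DELAY are hypothetical names that do not come from the question, and MAX_RUNS exists only so the loop can be bounded while trying it out.

```shell
#!/bin/sh
# run_forever CMD [ARGS...]: hypothetical helper, not from the question.
# Reruns CMD after every exit, clean or not, and logs nonzero statuses.
# MAX_RUNS (hypothetical) bounds the loop; unset or 0 means loop forever.
run_forever() {
    runs=0
    while :; do
        "$@"
        status=$?
        if [ "$status" -ne 0 ]; then
            echo "'$1' exited with status $status; restarting" >&2
        fi
        runs=$((runs + 1))
        if [ "${MAX_RUNS:-0}" -gt 0 ] && [ "$runs" -ge "$MAX_RUNS" ]; then
            break
        fi
        # Pause between runs so a command that fails instantly cannot spin.
        sleep "${RESTART_DELAY:-2}"
    done
}

# Possible usage: run_forever java -jar myjar.jar
```

Because the exit status is only logged and never used to decide whether to continue, the jar can keep exiting on purpose after 10 files without ending the loop.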

Related

killing screen but java process not ending

I am running a script that basically runs a bunch of servers for local testing.
These jars are run in different screens because they need to all independently accept keyboard input. To do this I used screen.
command1="java -jar $fullPath"
screen -d -m -S "${screenName[$i]}" bash -c "$command1"
It worked great!
then I needed a way to kill all of the servers
so I wrote a script that does that
for session in $(screen -ls | grep -o "[0-9]*\.${screenName[$i]}")
do
screen -X -S "$session" quit
echo "killing screen $session"
done
It works great; the screens are killed.
BUT the second I do that, the java processes all of a sudden take up 100% of my CPU.
[Screenshot: CPU usage before I use the quit screen command]
[Screenshot: CPU usage after I use the quit screen command]
They also take forever to kill through the gui using force quit
Other info:
The servers use Jetty, which runs on one Java thread, and another thread just sits and waits for keyboard input.
Obviously this is running on Mac and the script is in bash, so I would like a bash solution that works on both Mac and Linux.
Also, they are built using Java 7 but run using Java 8.
Because the servers accept keyboard input, all commands sent to screen are ingested by the servers.
They do have an input that quits them, but I don't want to trust the servers to quit.
So my questions are:
Is there a way to have a screen terminate all running processes in it when it terminates?
if not is there a way to send the ctrl-c to a specific screen?
If not is there a way to see what the running process of a certain screen is without running commands in the screen itself? (then I can just use kill)
tl;dr when I kill screen the running process starts using all my cpu and does not terminate. I want it to terminate.
Made the solution myself.
In a nutshell it finds the screen process and finds all java processes then looks for a java process whose grandparent process is a screen.
It is extremely inefficient as it loops through the array for every screen. So basically O(n^2) but there are very few so it works for me!
Code:
length=$(( ${#screenName[@]} - 1 ))
# gets all of the java processes and their grand parents
# the reason is that the screen makes 2 processes one is the java process and the other is the parent process
# I can't grab a children in mac for some reason BUT i can grab the parent process
javaPs=()
javaGpPs=()
for javaId in $(pgrep java)
do
#echo
#echo $javaId
#echo $(ps -o ppid= $javaId)
#echo $(ps -o ppid= $(ps -o ppid= $javaId))
javaPs+=($javaId)
javaGpPs+=($(ps -o ppid= $(ps -o ppid= $javaId)))
done
echo "Child processes followed by screen processes"
echo "${javaPs[@]}"
echo "${javaGpPs[@]}"
#gets the index of an element in an array
#search term is first followed by the array
#note that because it returns by echo you can not add any debug statements into this function
search() {
local i=1;
searchTerm="${1}"
shift #moves over the argument looking
array=("$@") #grabs the rest of the args as an array
for str in "${array[@]}"; do
if [ "$str" = "$searchTerm" ]; then
echo $((i - 1)) #should reference the correct index (0 to something)
return
else
((i++))
fi
done
echo "-1"
}
for (( i=0; i<=$length; i++ ))
do
#looks to see if there are multiple screens with the same name
for session in $(screen -ls | grep -o "[0-9]*\.${screenName[$i]}")
do
echo
echo "killing screen $session"
IFS='.' read -ra ADDR <<< "$session" #splits the id from the name
pid=${ADDR[0]}
screen -X -S "$session" quit # exit session
# now we kill the still running java process (because it will not exit for some reason)
itemIndex=$(search "${pid}" "${javaGpPs[@]}")
# search prints -1 when no java process has this screen as its grandparent
if [ "$itemIndex" -ge 0 ]; then
javaId=${javaPs[$itemIndex]}
# the process that is being killed
echo "killing java process"
ps -p "$javaId"
kill -9 "$javaId"
fi
sleep 1
done
done
echo
echo "All process should now be dead doing extra clean up now"
screen -wipe #remove all dead screens
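An alternative to matching grandparents by hand is to walk down from the screen process and signal everything below it. The following is only a sketch, assuming pgrep -P (list processes by parent PID) is available; kill_tree is a hypothetical name, and it sends a plain SIGTERM rather than kill -9.

```shell
#!/bin/sh
# kill_tree PID: hypothetical helper, not part of the original solution.
# Sends SIGTERM to every descendant of PID (deepest first), then to PID.
kill_tree() {
    # The child list is expanded once, before any recursive call runs.
    for child in $(pgrep -P "$1"); do
        kill_tree "$child"
    done
    kill "$1" 2>/dev/null
}

# Possible usage with a session name such as 994.screenname, where the
# number before the dot is the PID of the screen process:
#   kill_tree "${session%%.*}"
```

Since the screen process itself is the last one signalled, this would take the place of the separate screen -X quit step.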

How to stop Bash from going into the next loop iteration until the current process finishes?

I have a Bash script that goes something like this
#!/bin/bash
for i in $(seq 1 100); do
nohup java -jar myProgram.jar -myParameter1 input$i -myParameter2 $i > /dev/null 2>&1 &
done
The Java program (myProgram.jar) prints a lot of output to the stderr and stdout that I don't need and also takes at least 10 minutes to run, so I want to be able to log off from the remote computer and have it keep going for hours (thus, I use nohup and redirect stdout and stderr to /dev/null with 2>&1).
However, if I run this script as it is, it will just keep running the programs one on top of the other until it runs out of memory/processors (I suppose) and then it runs the next program once one of them finishes. I wouldn't mind this happening but it is a shared server so I cannot slow it down for several hours. Is there a way to prevent Bash from going onto the next loop iteration until the current myProgram is finished running?
Remove & from end of your command line to make it run in foreground. With & at end of line you are forcing your program to run in background in the loop.
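With the & removed, the script itself now has to outlive the login session, so one option is to start the whole loop under nohup instead of each java command. A minimal sketch, in which process_all and run_all.sh are hypothetical names and the command to run is passed in as arguments:

```shell
#!/bin/sh
# process_all N CMD [ARGS...]: hypothetical helper that runs CMD once per
# index 1..N, strictly one at a time, passing "input$i" and "$i" as the
# last two arguments.
process_all() {
    n=$1
    shift
    for i in $(seq 1 "$n"); do
        # No trailing '&', so the loop blocks until this run has exited.
        "$@" "input$i" "$i" > /dev/null 2>&1
    done
}

# Possible usage from the login shell, detaching the loop as a whole
# (run_all.sh is a hypothetical script that calls process_all):
#   nohup ./run_all.sh > /dev/null 2>&1 &
```

Only one run exists at any moment, which keeps the load on the shared server to a single process.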

Background process in linux

I have developed a Java socket server, which is working fine.
When started from a terminal, it starts listening for clients. But when I close the terminal, it stops listening.
I need it to continue even after the user closes the terminal from which the jar file was started.
How can I run a Java server socket application in Linux as a background process?
There are several ways you can achieve such a thing:
1. nohup java -server -jar myApplication.jar > /log.txt & - this is pretty straightforward. The trailing & puts the application in the background, and nohup keeps it running after the terminal closes. This will work, but it's just not a very good way to do so.
2. Use a shell wrapper and the above OR daemon app. This approach is used by many open source projects and it's quite good for most scenarios. Additionally, it can be included in init.d and the required run level with regular start, stop and status commands. I can provide an example if needed.
3. Build your own daemon server using either Java Service Wrapper or Apache Jakarta Commons Daemon. Again, both are extremely popular, well tested and reliable, and available for both Linux and Windows! The one from Apache Commons is used by the Tomcat server! Additionally, there is Akuma.
Personally I would go with solution 2 or 3 if you need to use this server in the future and/or distribute it to clients, end users, etc. nohup is good if you need to run something and have no time to develop more complex solution for the problem.
Ad 2:
The best scripts, used by many projects, can be found here.
For Debian/Ubuntu one can use a very simple script based on start-stop-daemon. If in doubt there is /etc/init.d/skeleton one can modify.
#!/bin/sh
DESC="Description"
NAME=YOUR_NAME
PIDFILE=/var/run/$NAME.pid
RUN_AS=USER_TO_RUN
COMMAND="/usr/bin/java -- -jar YOUR_JAR"
d_start() {
start-stop-daemon --start --quiet --background --make-pidfile --pidfile $PIDFILE --chuid $RUN_AS --exec $COMMAND
}
d_stop() {
start-stop-daemon --stop --quiet --pidfile $PIDFILE
if [ -e $PIDFILE ]
then rm $PIDFILE
fi
}
case $1 in
start)
echo -n "Starting $DESC: $NAME"
d_start
echo "."
;;
stop)
echo -n "Stopping $DESC: $NAME"
d_stop
echo "."
;;
restart)
echo -n "Restarting $DESC: $NAME"
d_stop
sleep 1
d_start
echo "."
;;
*)
echo "usage: $NAME {start|stop|restart}"
exit 1
;;
esac
exit 0
There's one crucial thing you need to do after adding a & at the end of the command. The process is still linked to the terminal. You need to run disown after running the java command.
java -jar yourApp.jar > log.txt &
disown
Now, you can close the terminal.
The key phrase you need here is "daemonizing a process". Ever wondered why system server processes often end in 'd' on Linux / Unix? The 'd' stands for "daemon", for historical reasons.
So, the process of detaching and becoming a true server process is called "daemonization".
It's completely general, and not limited to just Java processes.
There are several tasks that you need to do in order to become a truly independent daemon process. They're listed on the Wikipedia page.
The two main things you need to worry about are:
Detach from parent process
Detach from the tty that created the process
If you google the phrase "daemonizing a process", you'll find a bunch of ways to accomplish this, and some more detail as to why it's necessary.
Most people would just use a little shell script to start up the java process, and then finish the java command with an '&' to start up in background mode. Then, when the startup script process exits, the java process is still running and will be detached from the now-dead script process.
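Such a startup script could look roughly like the sketch below. It combines nohup, a trailing & and redirection of all three standard streams, so that nothing ties the process to the terminal. start_detached is a hypothetical name, and the log path is whatever the caller passes in.

```shell
#!/bin/sh
# start_detached LOG CMD [ARGS...]: hypothetical helper.
# Starts CMD in the background with SIGHUP ignored and with stdin, stdout
# and stderr detached from the terminal, then prints the PID of CMD.
start_detached() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo $!
}

# Possible usage:
#   pid=$(start_detached /tmp/server.log java -jar yourApp.jar)
```

Redirecting stdin and both output streams matters as much as the &: a process that still holds the terminal open for reading or writing can fail or block once that terminal is gone.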
try,
java -jar yourApp.jar &
& will start the process in the background. I have not tested this, but if it still does not work, write the command in a script file and start that with &.
Did you try putting & at the end of the command line?
For example:
java -jar mySocketApp.jar &
You can also use bg and fg commands to send a process to background and foreground. You can pause the running process by CTRL+Z.
Check out this article: http://lowfatlinux.com/linux-processes.html
Step 1.
To create new screen
screen -RD screenname
Step 2.
To enter into screen terminal
press Enter
Step 3.
Run your command or script (to run in the background) in the newly opened terminal
Step 4.
To come out of screen terminal
Ctrl+A, then D
Step 5.
To list screen terminals
screen -ls
that will print something like below
There is a screen on:
994.screenname (12/10/2018 09:24:31 AM) (Detached)
1 Socket in /run/screen/S-contact.
Step 6.
To login to the background process
screen -rd 994.screenname
To quit the terminal while the process keeps working in the background: for me, the simple and fast way to run a process in the background is to put &! at the end of the command (this is zsh syntax; it backgrounds the job and disowns it in one step):
if this app is built for X server: (eg: Firefox,Zathura,Gimp...)
$ java -jar yourApp.jar &!
if this app is cli (work on the terminal)
# st is my terminal like kitty alacritty
$ st -e bash -c "lookatme --style one-dark --one $1" &!

Starting and killing java app with shell script (Debian)

I'm new to UNIX. I want to start my java app with a script like so:
#!/bin/sh
java -jar /usr/ScriptCheck.jar &
echo $! > /var/run/ScriptCheck.pid
This is supposedly working. It does run the app and it does write the pid file. But when I try to stop the process with a different script which contains this:
#!/bin/sh
kill -9 /var/run/ScriptCheck.pid
the console gives me this error:
bash: kill: /var/run/ScriptCheck.pid: arguments must be process or job IDs
My best guess is that I'm not writing the right code in the stop script, maybe not giving the right command to open the .pid file.
Any help will be very appreciated.
You're passing a file name as an argument to kill when it expects a (process ID) number, so just read the process ID from that file and pass it to kill:
#!/bin/sh
PID=$(cat /var/run/ScriptCheck.pid)
kill -9 $PID
A quick and dirty method would be :
kill -9 $(cat /var/run/ScriptCheck.pid)
Your syntax is wrong, kill takes a process id, not a file. You also should not be using kill -9 unless you absolutely know what you are doing.
kill $(cat /var/run/ScriptCheck.pid)
or
xargs kill </var/run/ScriptCheck.pid
I think you need to read in the contents of the ScriptCheck.pid file (which I'm assuming has only one entry with the PID of the process in the first row).
#!/bin/sh
procID=0;
while read line
do
procID="$line";
done </var/run/ScriptCheck.pid
kill -9 "$procID"
I've never had to create my own pid; your question was interesting.
Here is a bash code snippet I found:
#!/bin/bash
PROGRAM=/path/to/myprog
$PROGRAM &
PID=$!
echo $PID > /path/to/pid/file.pid
You would have to have root privileges to put your file.pid into /var/run --referenced by a lot of articles -- which is why daemons have root privileges.
In this case, you need to put your pid some agreed upon place, known to your start and stop scripts. You can use the fact a pid file exists, for example, not to allow a second identical process to run.
The trailing & in $PROGRAM & runs the program in the background, so the script carries on straight away and can record the PID.
If you want the program to hang around after your script exits, I suggest launching it with nohup, which means the program won't die when you log out.
I just checked. The PID is returned with a nohup.
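The idea of using the pid file to refuse a second identical process can be sketched as a pair of start and stop functions. This is a rough sketch under stated assumptions: the /tmp location is a hypothetical choice that needs no root privileges, and is_running, start and stop are hypothetical names.

```shell
#!/bin/sh
# Hypothetical pid file location; pick one that both scripts agree on.
PIDFILE=${PIDFILE:-/tmp/ScriptCheck.pid}

# True if the pid file exists and names a process that is still alive.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

# start CMD [ARGS...]: refuse to start a second copy, else record the PID.
start() {
    if is_running; then
        echo "already running as $(cat "$PIDFILE")" >&2
        return 1
    fi
    "$@" > /dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

# stop: send SIGTERM (not kill -9) to the recorded PID and clean up.
stop() {
    if is_running; then
        kill "$(cat "$PIDFILE")"
    fi
    rm -f "$PIDFILE"
}

# Possible usage: start java -jar /usr/ScriptCheck.jar
```

A stale pid file left over from a crash is handled by the kill -0 check, which only reports whether the recorded process is still alive.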

Can java call a script to restart java in solaris?

We have a jboss application server running a webapp. We need to implement a "restart" button somewhere in the UI that causes the entire application server to restart. Our naive implementation was to call our /etc/init.d script with the restart command. This shuts down our application server then restarts it.
However, it appears that when the java process shuts down, the child process running the restart scripts dies as well, before getting to the point in the script where it starts the app server again.
We tried variations on adding '&' to the places where scripts are called, but that didn't help. Is there some way to fire the script and die without killing the script process?
Try using the nohup command to run something from within the script that you execute via Java. That is, if the script that you execute from Java currently runs this:
/etc/init.d/myservice restart
then change it to do this:
nohup /etc/init.d/myservice restart
Also, ensure that you DO NOT have stdin, stdout, or stderr being intercepted by the Java process. This could cause problems, potentially. Thus, maybe try this (assuming bash or sh):
nohup /etc/init.d/myservice restart >/dev/null 2>&1
Set your signal handlers in the restart script to ignore your signal with trap:
trap "" 2 # ignore SIGINT
trap "" 15 # ignore SIGTERM
After doing this, you'll need to kill your restart script with some other signal when needed, probably SIGKILL.
