Auto restart a process on crash

lcslex · Jan 23, 2011

Hello,

I have a problem with a process that dies so I need some script or something that will check to see if the process is running and if not, start the process.

Can someone help please? Thanks and sorry for my English.

francis · Jan 24, 2011

To see whether a process is running, you can use several commands, such as ps(1) and top(1). grep(1) is also very useful:

Code:

$ ps
$ ps ax | grep process_name
$ top -S

The output is similar, but top(1) refreshes its display automatically (every two seconds by default). Check whether the process you are looking for is listed, and start it if it is not.

Writing scripts is not my strong point, but I have done something like that. Please improve this example.

Code:

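#!/bin/sh

# No spaces around "=" in a shell assignment.
process='firefox-bin'

# "grep -v grep" drops the grep command itself from the ps output.
if ps ax | grep -v grep | grep "$process" > /dev/null
then
    echo "$process is alive."
else
    echo "$process is dead, but will be launched."
    /usr/local/bin/firefox3
fi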

In general, the script structure comes down to a few steps:

Code:

if process is running
then
    do nothing
else
    start process

Anonymous · Jan 24, 2011

A script that periodically checks the output of top, ps, or the like will consume some CPU time. IMHO, monitoring and auto-restarting should be done by some sort of guard process that launches the monitored process as its child and respawns that child whenever it dies. On Mac OS X 10.4-10.6, launchd is used for this, and on Mac OS X Server 10.2-10.3 there was a quite sophisticated watchdogd that could do this.

I am new to FreeBSD, and unfortunately I am not aware of a FreeBSD utility that serves this purpose. There is surely something, and I would also be interested to learn about it.

Anyway, here is a (working) proof of concept written in C:

Code:

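// pguard.c -- process guard: run a command and respawn it whenever it exits
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
   pid_t pid;
   int   statloc;

   if (argc < 2)
      return 1;                      // usage: pguard command [args ...]

   for (;;)
   {
      pid = fork();
      if (pid < 0)
         return 1;                   // fork failed

      if (pid == 0)
      {
         // child: argv[1] is the command to guard (argv[0] is pguard itself),
         // and &argv[1] is its NULL-terminated argument vector
         execv(argv[1], &argv[1]);
         _exit(127);                 // only reached if execv failed
      }

      // parent: block until the child terminates, then loop to respawn it
      waitpid(pid, &statloc, 0);
   }
}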

This would be compiled by:
cc pguard.c -o pguard

Usage:
pguard command args

For example, I tested it by maintaining an SSH tunnel for MySQL database replication from one server to another:

pguard /usr/bin/ssh -N -L4306:127.0.0.1:3306 tunnel@example.com > /dev/null &

Of course, authentication has to be done with RSA keys. Note also that ssh is not run in -f (background) mode. If the tunnel dies for some reason, pguard respawns it.

Again, this is only a proof of concept. It is missing error handling and more sophisticated features such as running in the background, guarding more than one process, scheduling capabilities, etc. So, again, I would be very interested to learn about such a beast on FreeBSD.

Best regards

Rolf

DutchDaemon · Jan 25, 2011

I remember launchd was being ported to FreeBSD, not sure what the present status is -> http://wiki.freebsd.org/launchd

wblock@ · Jan 25, 2011

There are some possibilities in ports:

sysutils/daedalus
sysutils/monitord
sysutils/watchmen

SirDice · Jan 25, 2011

There's also sysutils/daemontools.

SirDice · Jan 25, 2011

francis said:
To see if process is running, You can use several commands: ps(1) and top(1). Also grep(1) is very useful;

I prefer pgrep(1) for scripts. Then you don't have to grep your own grep out of the mix.
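
As a rough sketch of that suggestion, the check from the earlier script could be written with pgrep(1) instead of the ps/grep pipeline. The helper name ensure_running is made up for illustration, and the process name and path in the commented example are simply the ones used earlier in the thread. The -x flag asks pgrep for an exact match on the process name.

```shell
#!/bin/sh
# Sketch only: ensure_running is a hypothetical helper, not an existing tool.
# usage: ensure_running NAME COMMAND [ARGS...]
ensure_running() {
    name=$1
    shift
    # pgrep -x matches the process name exactly; it exits 0 only on a match.
    if pgrep -x "$name" > /dev/null 2>&1; then
        echo "$name is alive."
    else
        echo "$name is dead, but will be launched."
        "$@" &      # start in the background so the check returns at once
    fi
}

# Example call (commented out so that sourcing this file starts nothing):
# ensure_running firefox-bin /usr/local/bin/firefox3
```

Something like this could be run periodically, for example from cron(8), though it still polls rather than reacting to the crash itself.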

kyentei · Feb 3, 2011

I'm just wondering something here.. Wouldn't starting this process in a while (true) loop be just as useful? Whenever the process ends, it'll just be started again - and therefore doesn't require another application checking if it is still running?

Something like this:

Code:

#!/bin/sh

while true; do
    /usr/bin/firefox3
done

phoenix · Feb 3, 2011

If the process forks into the background, you will Denial-of-Service yourself by continuously starting up new firefox processes.

To test this, just run the process from the command-line. If you get returned to your prompt while the process is running, then your while loop will kill your system.
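
The manual test above could be scripted roughly as follows. This is only a sketch: stays_in_foreground is a made-up helper name, and the idea is simply to start the command, wait a few seconds, and see whether the very same process is still there. If it is already gone, the command either detached into the background or exited right away, and in both cases a bare restart loop would spin.

```shell
#!/bin/sh
# Sketch only: stays_in_foreground is a hypothetical helper.
# usage: stays_in_foreground SECONDS COMMAND [ARGS...]
# Returns 0 if COMMAND is still running as the same process after SECONDS,
# 1 if it has already returned (daemonized, or exited quickly).
stays_in_foreground() {
    delay=$1
    shift
    "$@" &
    pid=$!
    sleep "$delay"
    if kill -0 "$pid" 2>/dev/null; then
        kill "$pid" 2>/dev/null      # stop the probe copy again
        wait "$pid" 2>/dev/null
        return 0
    fi
    wait "$pid" 2>/dev/null
    return 1
}

# Example call (commented out so that sourcing this file starts nothing):
# stays_in_foreground 5 /usr/bin/firefox3 && echo "safe to wrap in a loop"
```

Note that the probe starts a real copy of the program, so it is only suitable for commands that are harmless to launch and kill.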

bsdgooch · Feb 3, 2011

Take a look: fscd -- service state monitoring daemon

You may want to take a look at fscd(8). From the FreeBSD Status Report - 4Q/2010:

FreeBSD Services Control (fsc)

Contact: Tom Rhodes <trhodes@FreeBSD.org>

FreeBSD Services Control is a mix of binaries which integrate into the
rc.d system and provide for service (daemon) monitoring. It knows about
signals, pidfiles, and uses very little resources.

The fscd utilities will be set up as a port and, hopefully, dropped
into the ports collection in the coming weeks. This will allow easier
testing by everyone and it should make migration into -CURRENT much
easier.

Here's a link to the proposed port (I assume anyway):

http://people.freebsd.org/~trhodes/fsc/fsc-port.tar

Anonymous · Feb 4, 2011

phoenix said:
If the process forks into the background, you will Denial-of-Service yourself by continuously starting up new firefox processes.

I guess this is one of the reasons why many daemons have a command-line switch that keeps them from daemonizing, for example:

Code:

httpd -D FOREGROUND
smbd -F
sshd -D
ntpd -n
afpd -d

Others daemonize only when a switch is set:

Code:

ftpd -D
ssh -f

kyentei's script can be used in any of the above cases, provided the daemon is kept in the foreground, and yes, it will restart the daemon automatically when it crashes for some reason.
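
A slightly more defensive variant of that loop might look like the sketch below. The function name respawn and its MAX_RUNS and DELAY parameters are assumptions made for illustration. The pause between runs keeps a command that fails immediately from spinning, and the run limit lets the loop give up eventually.

```shell
#!/bin/sh
# Sketch only: respawn, MAX_RUNS and DELAY are hypothetical names.
# usage: respawn MAX_RUNS DELAY COMMAND [ARGS...]
# MAX_RUNS=0 means "keep restarting forever".
respawn() {
    max=$1
    delay=$2
    shift 2
    count=0
    while :; do
        "$@"                         # must stay in the foreground
        status=$?
        count=$((count + 1))
        echo "respawn: '$1' exited with status $status (run $count)" >&2
        if [ "$max" -gt 0 ] && [ "$count" -ge "$max" ]; then
            return "$status"
        fi
        sleep "$delay"               # avoid a tight restart loop
    done
}

# Example call (commented out so that sourcing this file starts nothing):
# respawn 0 5 /usr/sbin/sshd -D
```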

Best regards

Rolf

ksym · Jul 27, 2015

sysutils/fsc seems to use kqueue(2) to be notified when a process terminates.

But here is my question: can kqueue(2) deliver an EVFILT_PROC event for a crashed process? I have just written a piece of software that does not get a notification for a crashed process; it only received EVFILT_PROC events for processes that terminated normally. What am I missing here?

EDIT: I will post a piece of code tomorrow that demonstrates what I mean. However, if it is true that kqueue(2) can deliver an event for a process that has crashed (not terminated normally), then I could finally start working on a service manager that provides parallel startup and FIFO activation ...

ksym · Jul 28, 2015

Okay, problem solved. It seems that kqueue() was broken in 10.1 for a brief moment. I am seriously sure about this! Because now when I tested, a crashed process would generate an EVFILT_PROC event. Strange, really strange.

Anyway, carry on.

ksym · Jul 28, 2015

I gotta get back to designing the service executive. I need something that works on FreeBSD, even with the rc.d scripts as service utilities, and it needs to support defining watchdogs (i.e. programs that are executed after a specific service goes down). And the whole thing needs to have an extremely small footprint and depend only on base system facilities and libraries.

Pierre Pirault · Nov 24, 2016

A cool, short little script I sometimes use as a temporary solution goes like this:

Code:

until script; do
  echo 'crashed'
  sleep 1
done;

The loop keeps restarting the script until it exits cleanly, that is, with exit status 0.
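
To see why a run ended, the loop can also report the exit status. The sketch below assumes the usual sh convention that a status above 128 means the command was killed by signal (status - 128), which is how a crash such as a segmentation fault typically shows up. The names classify_status, run_until_clean and RESTART_DELAY are made up for illustration.

```shell
#!/bin/sh
# Sketch only: classify_status, run_until_clean and RESTART_DELAY are
# hypothetical names. Assumes the common sh convention: status > 128
# means "terminated by signal (status - 128)".
classify_status() {
    if [ "$1" -eq 0 ]; then
        echo "clean exit"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "failed with status $1"
    fi
}

# usage: run_until_clean COMMAND [ARGS...]
run_until_clean() {
    until "$@"; do
        status=$?                    # exit status of the command just run
        echo "restarting: $(classify_status "$status")" >&2
        sleep "${RESTART_DELAY:-1}"
    done
}

# Example call (commented out so that sourcing this file starts nothing):
# run_until_clean ./script
```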
