Solved Logging all process exit codes

I am trying to generate a list of all failed applications/processes to meet some security reporting requirements. As we don't necessarily get errors for all of them in /var/log/messages, I thought it would be useful to have a script/program that pulls out the process accounting records for anything with a non-zero exit status, so that the failed/crashing applications can be investigated as necessary.

I enabled Process Accounting, and then looked at the various options for lastcomm(1) and sa(8) - none of the outputs show the exit status, and the man pages don't list any options for this information to be produced. The acct(5) man page lists AXSIG for "killed by a signal", but no exit codes.

Someone else has had the same requirement but on Linux, and a responder offered up a patch to Linux's lastcomm which output the exit status information. Having had a quick look at FreeBSD's lastcomm code, I don't think that this would be applicable as the program is quite a bit smaller - and the acct(5) format suggests that the information isn't recorded anyway.

I thought I might have better luck with BSM auditing, so I switched this on and audited all 'ex' class activity. I created a very simple shell script that just returns a non-zero value...

Code:
#!/bin/sh
echo Now testing.
exit 1

Code:
# /tmp/test.sh
Now testing.
# echo $?
1
# audit -n
Trigger sent.
# praudit -xl 20150726210852.20150726211055 | tail

Code:
<record version="11" event="execve(2)" modifier="0" time="Sun Jul 26 22:10:45 2015" msec=" + 885 msec" ><exec_args><arg>/tmp/test.sh</arg></exec_args><path>/tmp/test.sh</path><attribute mode="755" uid="root" gid="wheel" fsid="315125549" nodeid="16" device="2157742912" /><subject audit-uid="root" uid="root" gid="wheel" ruid="root" rgid="wheel" pid="955" sid="811" tid="49769 172.16.1.1" /><return errval="success" retval="0" /></record>
As you can see from the audit output, the retval is 0. However, I am likely misinterpreting what this field is used for: the invoked app/daemon is probably still running, as the audit event is generated at the point of entry.

Any ideas on how I can reliably get all exit codes system-wide?

Many thanks.
 
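The execve(2) call itself succeeded, so the ex event shows a success; the retval there is the return value of the system call, not the exit status of the program. If you run an intentionally bad command, such as dsf at the command line, you'll see a failure instead.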
Code:
<record version="11" event="execve(2)" modifier="0" time="Sun Jul 26 23:29:20 2015" msec=" + 258 msec" ><exec_args><arg>dsf</arg></exec_args><path>/usr/sbin/dsf</path><subject audit-uid="jason" uid="jason" gid="unovitch" ruid="jason" rgid="unovitch" pid="88520" sid="87393" tid="27654 10.100.82.100" /><return errval="failure : No such file or directory" retval="4294967295" /></record>
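
The difference between "the exec failed" and "the program exited non-zero" can also be seen outside the audit trail. The following is only a minimal Python sketch, not part of the audit tooling: it assumes /bin/sh exists and that no command named dsf is on the PATH. It also shows that the retval of 4294967295 in the record above reads as -1 when treated as a signed 32-bit value.

```python
import errno
import subprocess

# Case 1: the exec succeeds; the failure only shows up in the exit status.
result = subprocess.run(["/bin/sh", "-c", "exit 1"])
exit_status = result.returncode

# Case 2: the exec itself fails, which is what the execve(2) record reports.
# "dsf" is the intentionally bad command name from the example above.
try:
    subprocess.run(["dsf"])
    exec_failed_with_enoent = False
except FileNotFoundError as err:
    # ENOENT corresponds to "No such file or directory" in the audit record.
    exec_failed_with_enoent = err.errno == errno.ENOENT

# retval="4294967295" is -1 printed as an unsigned 32-bit number.
signed_retval = 4294967295 - 2**32
```

In the first case an execve(2) audit record would still show success, so the exit status has to come from a different event.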

Take a look at the pc flag for process auditing. Here's an example using the same shell script that exits with a "2".
Code:
<record version="11" event="exit(2)" modifier="0" time="Sun Jul 26 23:29:05 2015" msec=" + 612 msec" ><exit errval="Error 2" retval="0" /><subject audit-uid="jason" uid="jason" gid="unovitch" ruid="jason" rgid="unovitch" pid="88462" sid="87393" tid="27654 10.100.82.100" /><return errval="success" retval="0" /></record>

Here's a quick compiled C program that exits with a "1".
Code:
<record version="11" event="exit(2)" modifier="0" time="Sun Jul 26 23:32:04 2015" msec=" + 141 msec" ><exit errval="Error 1" retval="0" /><subject audit-uid="jason" uid="jason" gid="unovitch" ruid="jason" rgid="unovitch" pid="89516" sid="87393" tid="27654 10.100.82.100" /><return errval="success" retval="0" /></record>

See audit_class(5) for more details on the types of events.
 
Thanks very much, Jason.

I have added auditing of the pc class, and am now seeing the exit codes in the errval as you described. I am going to modify our SIEM parser to lift the numeric value out of this field for exit events, and then add a correlator against the PID value so I can tie it back to the application name available in the execve(2) event.

Code:
<record version="11" event="setpgrp(2)" modifier="0" time="Mon Jul 27 00:49:40 2015" msec=" + 90 msec" ><subject audit-uid="root" uid="root" gid="wheel" ruid="root" rgid="wheel" pid="818" sid="810" tid="50433 172.16.1.1" /><return errval="success" retval="0" /></record>

<record version="11" event="execve(2)" modifier="0" time="Mon Jul 27 00:49:40 2015" msec=" + 90 msec" ><exec_args><arg>/tmp/test.sh</arg></exec_args><path>/tmp/test.sh</path><attribute mode="755" uid="root" gid="wheel" fsid="315125549" nodeid="16" device="2157742912" /><subject audit-uid="root" uid="root" gid="wheel" ruid="root" rgid="wheel" pid="818" sid="810" tid="50433 172.16.1.1" /><return errval="success" retval="0" /></record>

<record version="11" event="exit(2)" modifier="0" time="Mon Jul 27 00:49:40 2015" msec=" + 92 msec" ><exit errval="Error 2" retval="0" /><subject audit-uid="root" uid="root" gid="wheel" ruid="root" rgid="wheel" pid="818" sid="810" tid="50433 172.16.1.1" /><return errval="success" retval="0" /></record>
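
For anyone wanting to try the same correlation outside a SIEM, here is a rough Python sketch of the idea. It is not the actual parser: the function name failed_processes is made up for illustration, and it assumes one record per line (as produced by praudit -xl) and that a non-zero status always appears as errval="Error N" on the exit token, as in the records above. Records whose errval does not match that pattern are simply ignored.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: non-zero exit statuses appear as errval="Error N" on the <exit> token.
EXIT_ERRVAL = re.compile(r"^Error (\d+)$")


def failed_processes(lines):
    """Yield (pid, path, exit_code) for each exit(2) record with a non-zero status."""
    last_exec = {}  # pid -> path of the most recent successful execve(2)
    for line in lines:
        line = line.strip()
        if not line.startswith("<record"):
            continue  # skip the XML header, the <audit> wrapper and blank lines
        rec = ET.fromstring(line)
        subject = rec.find("subject")
        if subject is None:
            continue
        pid = subject.get("pid")
        event = rec.get("event")
        if event == "execve(2)":
            ret = rec.find("return")
            if ret is not None and ret.get("errval") == "success":
                last_exec[pid] = rec.findtext("path")
        elif event == "exit(2)":
            # Forget the PID on exit so a later reuse of it cannot mismatch.
            path = last_exec.pop(pid, None)
            token = rec.find("exit")
            errval = token.get("errval", "") if token is not None else ""
            match = EXIT_ERRVAL.match(errval)
            if match and int(match.group(1)) != 0:
                yield pid, path, int(match.group(1))
```

Passing it the lines of a praudit -xl dump should yield one tuple per failed process. The path comes back as None for a process that exited without a matching execve(2) record in the input, for example a forked child that never called exec, or one whose exec happened before the start of the trail.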
 