devd - no notifications?

Hi,

I have a FreeBSD 10.3 installation with ZFS only. Last week my data zpool had an error. The pool state was DEGRADED, but I never got a message about that. devd is running, and /etc/devd/zfs.conf contains some actions that should be active in a default installation, right?

Code:
root@<host> ~ # zpool status
  pool: data
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 4h51m with 0 errors on Mon Sep 23 13:24:44 2013
config:

        NAME                     STATE     READ WRITE CKSUM
        data                     DEGRADED     0     0     0
          raidz2-0               DEGRADED     0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            1831235927447848087  DEGRADED     1   442     0  was /dev/gpt/XXXXXX
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0
            gpt/XXXXXX           ONLINE       0     0     0

errors: No known data errors

Any idea why I got no notifications?

Thanks.
 
You could enable daily_status_zfs_enable in /etc/periodic.conf. Or use a monitoring application like Nagios, Zabbix, Monit, Munin or one of the dozens of others.
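
For reference, a minimal sketch of the periodic.conf entry. /etc/periodic.conf uses sh variable syntax; as far as I know, the daily ZFS status check runs `zpool status -x` and its output lands in the daily run mail, so it only helps if that mail is actually read.

```shell
# /etc/periodic.conf
# Include ZFS pool status in the daily periodic(8) report.
daily_status_zfs_enable="YES"
```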
 
I already have Nagios and daily_status running, so that part is covered. So there is no solution with devd itself, only with "external" tools?
 
Just run a Nagios check every so often. It's much easier to implement and maintain.
 
Are there any other solutions to get a message?

I wrote a small script that cron runs every 15 minutes. At its core, the script runs the command /sbin/zpool list -H -o health ${ZPool} and checks if the result of the command is "ONLINE". If not, then notification procedures start to occur. Here's the operative excerpt...

Code:
# quote the pool name so an empty or odd value cannot break the command
PoolStatus=$(/sbin/zpool list -H -o health "${ZPool}")

# uncomment to test error condition
#PoolStatus=DEGRADED

# if OK, then punt
# quoted so that an empty result does not make test(1) fail with a syntax error
test "${PoolStatus}" = ONLINE && exit 0

#
# if we've come this far, there may be a problem
#
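
To make the excerpt above a bit more concrete, here is a rough sketch of how the whole check could look. This is not the actual script; the function names (health_is_ok, alert_text, check_pool) and the choice to mail root are my assumptions for illustration. Only `zpool list -H -o health` comes from the excerpt.

```shell
#!/bin/sh
# Sketch only: helper names and the mail recipient are hypothetical.

# Succeeds only when the health string reported by zpool is ONLINE.
health_is_ok() {
    [ "$1" = "ONLINE" ]
}

# Builds the one-line alert text from a pool name and its health string.
alert_text() {
    printf 'ZFS pool %s is %s' "$1" "$2"
}

# check_pool POOL: returns 0 if the pool is healthy,
# otherwise mails a short alert to root and returns 1.
check_pool() {
    pool=$1
    # If zpool itself fails (e.g. unknown pool), treat the state as UNKNOWN.
    health=$(/sbin/zpool list -H -o health "$pool" 2>/dev/null) || health=UNKNOWN
    health_is_ok "$health" && return 0
    alert_text "$pool" "$health" | mail -s "ZFS alert: $pool" root
    return 1
}
```

Adding a line such as `check_pool data` at the end and running the script from cron every 15 minutes would give roughly the behaviour described above. Note that it keeps mailing on every run for as long as the pool stays unhealthy.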
 
rotor
Thanks

SirDice wrote: "These things aren't detected. Hence no messages."
Are you sure about this?
Code:
notify 10 {
    match "system"  "ZFS";
    match "type"    "resource.fs.zfs.statechange";
    action "logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=$pool_guid vdev_guid=$vdev_guid'";
};
case_file.cc
Code:
} else if (event.Value("class") == "resource.fs.zfs.statechange") {
    RefreshVdevState();
    /*
     * If this vdev is DEGRADED, FAULTED, or UNAVAIL, try to
     * activate a hotspare. Otherwise, ignore the event
     */
    if (VdevState() == VDEV_STATE_FAULTED ||
        VdevState() == VDEV_STATE_DEGRADED ||
        VdevState() == VDEV_STATE_CANT_OPEN)
        (void) ActivateSpare();
    consumed = true;
}
I think the resource.fs.zfs.statechange rule in /etc/devd/zfs.conf should produce a notification for a DEGRADED event.
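
One way to check whether that rule ever fired is to look for the text its logger action would write. A small sketch, assuming kern.notice messages end up in /var/log/messages (the default syslog.conf does that, but a customised one may not); the function name is made up:

```shell
#!/bin/sh
# Count log lines that match the message from the zfs.conf statechange rule.
# $1: log file to scan; defaults to /var/log/messages (an assumption).
count_statechange() {
    grep -c "vdev state changed" "${1:-/var/log/messages}"
}
```

If `count_statechange` prints 0 even though a vdev went DEGRADED, devd either never received the event or the rule did not match it.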
 
Are you sure about this?
To be honest, not really. I've never seen it log anything and I've had plenty of pools in a degraded state.

The header comment in the file suggests the rules might not be reliable, since they are only samples:
Code:
# Sample ZFS problem reports handling.
 