7.3-RELEASE amd64 not calling 'zfs stop' prior to restart?

Good day all, this is my first time posting, so please bear with me if it needs some polishing or additional information.

Per the title, I have set up a FreeBSD 7.3-RELEASE (amd64) system (see specs below) which hangs on reboot, but only when ZFS is enabled. The hang occurs right after the uptime is printed, and you never get the "Rebooting...", "killing CPUs", etc. messages.

Upon investigation I did the following tests:

Testing whether '/etc/rc.d/zfs stop' adequately stops ZFS before a reboot
- Ran '/etc/rc.d/zfs stop' then 'reboot' = FAIL (hangs as outlined above)

Testing whether the problem is the module not being unloaded
- Ran '/etc/rc.d/zfs stop' then 'kldunload zfs.ko' then 'reboot' = PASS

Verifying via edits to rc.d/zfs and a manual stop/reboot
- Edited /etc/rc.d/zfs: added 'kldunload zfs.ko' to the 'stop' subroutine
- - Issued '/etc/rc.d/zfs stop' and 'reboot' = PASS

Testing whether the proven-functional change to rc.d/zfs works during a live reboot:
- Ran 'reboot' = FAIL

So this brings us to two questions:
1) Should we have to unload the module manually via the rc.d script?
2) Are we not giving rc.d/zfs time to do its job, or is it simply not being called at shutdown/reboot?

Side note: my method of manually unloading also cleans up the shutdown messages regarding forced unmounts of ZFS volumes. A nice side effect, provided this isn't part of a larger problem.

Thanks for any help provided!

-d

System Specs
CPU: Quad Xeon L3426
MEM: 16G
CONTROLLER: 3ware 9750-8i (all disks as single arrays, except OS which is a HW Raid1)
ZFS: 10 x 2TB raidz2 currently running defaults (for testing).
 
zero problems on FreeBSD-8. Simple # shutdown -r now works fine
please show your /boot/loader.conf and /etc/rc.conf
 
No issues with ZFSv13 on 7.3-RELEASE (amd64) with 12 TB of storage space and over 150 snapshots mounted. All shutdown commands work correctly for both root and non-root users (members of the operator group).

Same with ZFSv13 and ZFSv14 on 8-STABLE (amd64) (identical hardware to above).

Note: both of these use UFS for / and /usr, with ZFS for everything else.
 
killasmurf86 said:
zero problems on FreeBSD-8. Simple # shutdown -r now works fine
please show your /boot/loader.conf and /etc/rc.conf

Thank you for the response. Below are my very vanilla configs. I've also posted my zfs and zpool creation commands:

### /etc/rc.conf
Code:
defaultrouter="172.0.0.228"
hostname="gluon-int"
ifconfig_em0="inet 172.0.0.100 netmask 255.255.0.0"
tws_load="YES"
nfs_server_enable="YES"
rpcbind_enable="YES"
sshd_enable="YES"
ntpd_enable="YES"
zfs_enable="YES"

# nagios-nrpe and turn off SSL
nrpe2_enable="YES"
nrpe2_flags="-n"


### /boot/loader.conf
Code:
tws_load="YES"

### ZFS Commands, each disk is a 2TB Hitachi enterprise drive.
Code:
zpool create storage01 raidz2 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 
zpool add storage01 spare da11 da12 
zfs create storage01/tachyon 
zfs create storage01/iron 
zfs create storage01/boson 
zfs create storage01/hydro

### Filesystem Layout (/dev/da0* is a HW Raid1 of 2x500G drives via the 9750, all UFS)
Code:
Filesystem           Size    Used   Avail Capacity  Mounted on
/dev/da0s1a          989M    265M    645M    29%    /
devfs                1.0K    1.0K      0B   100%    /dev
/dev/da0s1e          415G    1.8G    380G     0%    /usr
/dev/da0s1d          3.9G    123M    3.4G     3%    /var
storage01            2.1T    128K    2.1T     0%    /storage01
storage01/boson      4.6T    2.5T    2.1T    54%    /storage01/boson
storage01/hydro      2.5T    362G    2.1T    14%    /storage01/hydro
storage01/iron       6.0T    3.9T    2.1T    65%    /storage01/iron
storage01/tachyon    7.5T    5.4T    2.1T    72%    /storage01/tachyon
 
I have revisited this situation via a full re-install of 7.3-RELEASE and found it to be reproducible. I have since devised a workaround which, in my professional opinion, should already be part of the existing /etc/rc.d/zfs init script. Below are the edits I made so that the system properly unmounts the zpools and shuts down/reboots without hanging.

Edits made: added "KEYWORD: shutdown" to the init script so that shutdown(8), which runs /etc/rc.shutdown and hence rcorder, can gracefully stop critical services before halting. Also added module removal and some sleep time to ensure the modules actually get taken out. The system would still hang if opensolaris.ko remained loaded, hence its removal was also required during the shutdown process.

Please note: I still cannot use "reboot"; I must use "shutdown -r now" to reboot the system. Reading rc(8), rcorder(8), rc.subr(8), reboot(8), and rc.shutdown leads me to believe this is because "reboot" never runs /etc/rc.shutdown (and thus never invokes "rcorder -k shutdown /etc/rc.d/*") the way "shutdown -r now" does.
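For anyone wanting to see the keyword mechanism in action: rc.shutdown selects which scripts to run via rcorder's -k filter, so a script whose header lacks a "KEYWORD: shutdown" line is never selected at all. A minimal simulation of that filter, using a scratch file with a stock-7.3-style header (file name and paths here are illustrative, not the real /etc/rc.d/zfs):

```shell
# rc.shutdown effectively runs: rcorder -k shutdown /etc/rc.d/*
# Any script without "# KEYWORD: shutdown" in its header is skipped.
# Simulate the filter against a stock-style header in a scratch file:
cat > ./zfs.rc <<'EOF'
# PROVIDE: zfs
# REQUIRE: mountcritlocal
EOF

if grep -q '^# KEYWORD:.*shutdown' ./zfs.rc; then
    echo "zfs selected at shutdown"
else
    echo "zfs skipped at shutdown"
fi
```

This prints "zfs skipped at shutdown". On a real system the direct check is simply rcorder -k shutdown /etc/rc.d/* | grep zfs, which returns nothing until the KEYWORD line is added.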

My edits to /etc/rc.d/zfs in bold
<--- SNIP --->
Code:
# PROVIDE: zfs
# REQUIRE: mountcritlocal
[B]# KEYWORD: shutdown[/B]
<--- SNIP --->
zfs_stop_main()
Code:
{
        # Disable swap on ZVOLs with property org.freebsd:swap=on.
        zfs list -H -o org.freebsd:swap,name -t volume | \
        while read state name; do
                case "${state}" in
                [oO][nN])
                        swapoff /dev/zvol/${name}
                        ;;
                esac
        done
        zfs unshare -a
        zfs unmount -a
        zfs volfini
[B]        kldunload zfs.ko opensolaris.ko
        sleep 5
        kldunload zfs.ko opensolaris.ko
        sleep 5[/B]
}

I will be submitting a bug report on this, along with my full hardware specs. The only thing 'unique' to my environment I can imagine is that I'm using a 3Ware 9750-8i SAS card with this configuration (rather than cheap SAS/SATA multipliers), though that really should not be a cause. During my testing I also performed full manual cache flushes to the RAID card/disks to rule out the card doing any caching tricks on shutdown that ZFS didn't know about, but that had no impact on the issue. Only ensuring that "/etc/rc.d/zfs stop" was called on shutdown, and that the modules were unloaded, fixed my problem.

Thanks to those that pinged back earlier when I posted this!
 
How bizarre. We use 3Ware 9550SXU and 9650SE RAID controllers (all drives configured as Single Disk arrays) without any issues. / and /usr are on a gmirror volume on CompactFlash, though, and not on the same controller as the pool. Perhaps that's the reason?

Do you get the same results if you create a ZFS-only system (use the PC-BSD install CD)?

Don't know about the kldunload lines, but the missing KEYWORD line sounds like a definite bug.
 
I know this thread is old, but I'm having the exact same issue.

From what I can tell, the only similarity between my setup and the setup descriptus has is the tws driver that we need to load for our 3Ware cards.

I'm running the 3Ware 9750-8e, though, so the cards aren't the same. One possibility, which I haven't tried yet, would be to build the kernel with the tws.ko source.

Still, I find it quite odd that rc doesn't use the zfs script to stop everything. Is that intended?
 
I forgot to mention that I'm running 8.1-RELEASE (amd64).

I also forgot to add that it's odd that the tws driver isn't shipping with FreeBSD yet. I created a PR for this a while back, but it hasn't been picked up yet. The source code, specifically for FreeBSD 8.1, is on the drivers download page for the RAID card.
 
Re-reading the posts in depth, the root of the issue is operator error: do not use reboot(8); it does not do what you think.

On Linux, reboot and shutdown are effectively the same. On every other Unix-like system, they are very different.

For example, on FreeBSD shutdown(8) does an ordered shutdown through init(8), using the rc framework, while reboot(8) skips all of that and reboots the system directly.

There's a long thread about this on -stable.

Long story short: use the right tools!!
 
phoenix said:
Re-reading the posts in depth, the root of the issue is operator error: do not use reboot(8); it does not do what you think.

On Linux, reboot and shutdown are effectively the same. On every other Unix-like system, they are very different.

For example, on FreeBSD shutdown(8) does an ordered shutdown through init(8), using the rc framework, while reboot(8) skips all of that and reboots the system directly.

There's a long thread about this on -stable.

Long story short: use the right tools!!


phoenix,

I believe you are entirely correct. The only issue I have had to overcome is augmenting the OS rc scripts to work around the current lack of ZFS awareness at shutdown in the distribution set. After pushing this change through my configuration-management system, all nodes and clusters are behaving quite well.

My approach has been to make a few small changes to the rc.d/zfs script, along with moving the 'reboot' binary and replacing it with an informational note on how to properly reboot the system.
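The "move the binary and leave a note" safeguard can be sketched roughly as below. The file name and message are illustrative, and the script writes into the current directory rather than /sbin; on a real host you would first move the original aside as root (e.g. mv /sbin/reboot /sbin/reboot.real) and install the wrapper in its place:

```shell
# Hypothetical wrapper installed where /sbin/reboot used to live:
# a stray `reboot` now prints a reminder instead of bypassing
# /etc/rc.shutdown (and therefore rc.d/zfs).
cat > ./reboot <<'EOF'
#!/bin/sh
echo "reboot(8) is disabled on this host: it bypasses /etc/rc.shutdown," >&2
echo "so rc.d/zfs never runs. Use: shutdown -r now" >&2
exit 64
EOF
chmod +x ./reboot

# Demonstrate the note (ignore the non-zero exit for the demo):
./reboot 2>&1 || true
```

Operators who genuinely need the old behavior can still run the moved-aside binary directly; the wrapper only guards against habit.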

As we run multiple 72+ TB (raw) ZFS systems on FreeBSD 8.1-RELEASE in a production environment, these kinds of safeguards are a simple necessity, and they do not remove the need for proper training in handling ZFS.

Thank you for your feedback.

-D
 
A small extra: in our situation, [cmd=]shutdown -r now[/cmd] did not work until we added "KEYWORD: shutdown nojail" to the rc script. Right tools or not, and whether due to our specific hardware or not, it did not function across three rebuilds (rcorder -k sees nothing without the proper keywords). Many people have also contacted me directly with the same issue via my FreeBSD bug report: http://www.freebsd.org/cgi/query-pr.cgi?pr=147444&cat=

As noted earlier, I do not disagree with phoenix's information on proper handling. Whether it is down to the specific and quite new hardware we are using, or some other combination, a proper shutdown (especially in the case of ZFS) is mandatory.

-D
 
While I agree that nobody should use reboot, you seem to be ignoring the fact that the rc script for ZFS doesn't get executed on shutdown unless you add the keyword. I've done exactly what descriptus suggested, to avoid the use of `reboot`, and I expect things will be perfectly fine from now on, but that's a workaround, not a solution.

I run ZFS on plenty of systems without trouble, which is why I found it interesting that descriptus and I were using the same driver (tws.ko). I wonder if the order in which the kernel modules are unloaded is a factor here.

In short: The issue exists regardless of tools!!!
 