ZFS autostart/mount on reboot does not work

lofty

New Member


Messages: 2

FreeBSD: 12.2-RELEASE-p10

I followed the manual: https://docs.freebsd.org/en/books/handbook/zfs/#zfs-zfs

I set zfs_enable="YES" in /etc/rc.conf but it does not work after reboot. After every reboot I have to run service zfs start by hand. After that ZFS is working and the partitions are mounted.

I don't know what I'm doing wrong.
 

dminor125

New Member

Reaction score: 3
Messages: 3

Is there anything ZFS-related in your log files that might indicate what the issue is? I assume your base system is not on ZFS?
 

iucoen

New Member

Reaction score: 7
Messages: 11

I'm having the same problem. Then I saw this in my kernel logs:
Code:
Trying to mount root from ufs:/dev/nvd0p2 [rw]...
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
pid 65 (zpool) is attempting to use unsafe AIO requests - not logging anymore
pid 65 (zpool), jid 0, uid 0: exited on signal 6
WARNING: /tmp was not properly dismounted
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0

So it looks like zpool tried to run and then abort()ed. My system boots from an NVMe device, but the ZFS volumes are on SATA drives. The kernel finds the NVMe drive, mounts it, and then `zpool import` runs immediately, before the SATA drives have even been probed.

I worked around this problem by editing /etc/rc.d/zpool and adding this line inside the zpool_start() function:
Code:
while ! [ -c /dev/ada0p1 ]; do sleep 1; done

Then my problem was fixed.
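
The one-line loop above waits forever if /dev/ada0p1 never appears (for example after a disk failure), which would hang the boot. A minimal sketch of a bounded variant follows. The function name wait_for_cdev and the 30-second default are assumptions made for illustration; neither is part of the stock /etc/rc.d/zpool.

```shell
#!/bin/sh
# Hypothetical helper, not part of the stock /etc/rc.d/zpool.
# Wait until a character device node exists, giving up after a timeout
# so that a missing disk cannot stall the boot indefinitely.
# usage: wait_for_cdev /dev/ada0p1 [timeout_seconds]
wait_for_cdev()
{
    wfc_dev=$1
    wfc_timeout=${2:-30}    # seconds; 30 is an arbitrary default
    wfc_waited=0

    while ! [ -c "$wfc_dev" ]; do
        if [ "$wfc_waited" -ge "$wfc_timeout" ]; then
            echo "gave up waiting for $wfc_dev after ${wfc_timeout}s" >&2
            return 1
        fi
        sleep 1
        wfc_waited=$((wfc_waited + 1))
    done
    return 0
}
```

Inside zpool_start() this could be called as `wait_for_cdev /dev/ada0p1 30`, ignoring the return value so that the import is still attempted after a timeout.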

Anyway, I think there is a bigger issue here... maybe the kernel should wait for all drives to be probed before importing the zpool?
 

Alain De Vos

Son of Beastie

Reaction score: 870
Messages: 2,827

This problem can happen when, for instance, the zfs service is started before an external USB drive is detected.
Also try zpool_enable="YES"
 

SirDice

Administrator
Staff member
Administrator
Moderator

Reaction score: 13,132
Messages: 39,736

grep -i -e Solaris -e ZFS /boot/loader.conf

What's found?
You don't need to explicitly load opensolaris.ko, it will get automatically loaded as a dependency of zfs.ko. But you might need to add zfs_load="YES" in /boot/loader.conf.

Also try zpool_enable="YES"
This doesn't do anything as there is no kernel module named zpool.ko.
 

sko

Aspiring Daemon

Reaction score: 445
Messages: 747

But you might need to add zfs_enable="YES" in /boot/loader.conf.

For loader.conf the entry is zfs_load="YES" and for rc.conf zfs_enable="YES". I also fell over this a few times when manually setting up zfs - usually the installer properly sets those options as needed.
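
To make the distinction concrete, here is a small sketch that reports which of the two knobs is present in which file. The function names knob_is_yes and check_zfs_knobs are hypothetical, and the check is a plain text match, not a full parser: it does not look at /etc/rc.conf.local or other places a setting may come from. A missing zfs_load is also not necessarily an error, since the module can be loaded later as a dependency of the rc script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE contains a line setting KEY to YES
# (quoted or unquoted, any letter case for the value).
# usage: knob_is_yes FILE KEY
knob_is_yes()
{
    grep -Eq "^[[:space:]]*$2=[\"']?[Yy][Ee][Ss][\"']?[[:space:]]*(#.*)?\$" "$1" 2>/dev/null
}

# Hypothetical helper: print a note for each knob that was not found.
# usage: check_zfs_knobs [rc.conf path] [loader.conf path]
check_zfs_knobs()
{
    czk_rcconf=${1:-/etc/rc.conf}
    czk_loaderconf=${2:-/boot/loader.conf}

    knob_is_yes "$czk_rcconf" zfs_enable ||
        echo "zfs_enable=\"YES\" not found in $czk_rcconf"
    knob_is_yes "$czk_loaderconf" zfs_load ||
        echo "zfs_load=\"YES\" not found in $czk_loaderconf"
}
```

Running `check_zfs_knobs` with no arguments checks the default paths and prints nothing when both entries are found.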

lofty
paste your full rc.conf and loader.conf; maybe there's a bogus entry in one of those that chokes the execution at boot and prevents zfs_enable="YES" from being recognized.

iucoen
any chance you have unfinished upgrades on this system (i.e. zfs module or userland out of sync with the kernel)? I've seen the zfs module break with weird errors at boot on such occasions...
Also make sure those SATA disks are healthy, and remember that disk firmware always lies! look at SMART values, but don't trust them. do you have any errors logged regarding e.g. timeouts or a lot of retries for one of the disks? checksum or other errors in zpool status -v output? (maybe scrub the pool and check again)
I'm also running a mix of nvme and sata on several systems and all pools are always imported properly, so I don't think this is a general problem or race-condition...
 

SirDice

For loader.conf the entry is zfs_load="YES" and for rc.conf zfs_enable="YES".
That's what you get when muscle memory takes over. Yes, I meant zfs_load="YES" for loader.conf. Edited my post to fix this obvious error.
 

sko

Oh, and zfs_load="YES" is usually preceded by opensolaris_load, but this should be automatically loaded as a dependency of the zfs module, so it might not be needed...
 

SirDice

but this should be automatically loaded as a dependency of the zfs module, so it might not be needed
It will indeed be automagically loaded. On 13.0 and higher it's not needed at all any more.
 

iucoen

iucoen
any chance you have unfinished upgrades on this system (i.e. zfs module or userland out of sync with the kernel)? I've seen the zfs module break with weird errors at boot on such occasions...
Also make sure those SATA disks are healthy, and remember that disk firmware always lies! look at SMART values, but don't trust them. do you have any errors logged regarding e.g. timeouts or a lot of retries for one of the disks? checksum or other errors in zpool status -v output? (maybe scrub the pool and check again)
I'm also running a mix of nvme and sata on several systems and all pools are always imported properly, so I don't think this is a general problem or race-condition...

Take a look at the dmesg I posted... the sequence is definitely:
1. Mount root from /dev/nvd0p2
2. zpool runs, then crashes with signal 6 (SIGABRT)
3. The first SATA drive, ada0, is probed. I have a total of 6 SATA drives.

If I add a 7th SATA drive and put my boot drive on that, then this problem doesn't happen at all. So to fix the problem I need a way to delay the mount root step until all SATA drives are probed...
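
The ordering described above can be checked mechanically. The following is a rough sketch, assuming the kernel messages look like the excerpt posted earlier in this thread (a "(zpool) ... exited on signal" line and "adaN at ..." attach lines). The function name zpool_died_before_disks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel messages on stdin and succeeds if a
# zpool process exited on a signal before the first ada (SATA) disk
# attach line, or if no ada attach line appears at all.
zpool_died_before_disks()
{
    awk '
        /\(zpool\).*exited on signal/ && !crash { crash = NR }  # first zpool crash
        /^ada[0-9]+ at /              && !disk  { disk  = NR }  # first SATA attach
        END {
            if (crash && (!disk || crash < disk)) {
                print "zpool exited before any ada disk was attached"
                exit 0
            }
            exit 1
        }
    '
}
```

It could be fed from the live message buffer with `dmesg | zpool_died_before_disks`.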
 

mer

Aspiring Daemon

Reaction score: 451
Messages: 715

"...If I add a 7th SATA drive and put my boot drive on that, then this problem doesn't happen at all. So to fix the problem I need a way to delay the mount root step until all SATA drives are probed..."
Sounds like a possible dependency in the zpool script may not be correct for your system.
The SATA drives are likely probed/looked at during devmatch or something, I'm not sure if that exposes any condition to the rest of the init system.
 

dminor125

I'll add some information (and ultimately what I did to correct it) because I ran into a similar issue when I upgraded to 13-STABLE.

My root file system is not on zfs: it's on a small M.2 NVMe drive partition with ffs. I have a separate 12GB ZFS pool that consists of 3 SATA drives. After upgrading via source to 13-STABLE, I discovered that my pool was not being automatically mounted at boot.

I checked /etc/rc.conf for anything I had overlooked such as missing
Code:
zfs_enable="YES"
or typos or a corrupt file. Nothing seemed to be incorrect, missing, or out of place and I also verified the file /etc/zfs/zpool.cache actually existed on my system which it did.

After the machine was up and running, if I reloaded ZFS manually by running the command service zfs restart, my pool was properly mounted so I ruled out trouble with my pool or zfs versions being wackado.

The only specific zpool messages I could find in my log files were pid 63 (zpool) is attempting to use unsafe AIO requests - not logging anymore. Since I assumed this might be coming from /etc/rc.d/zpool, I turned on debugging briefly for the RC subsystem using
Code:
rc_debug="YES"
but there were no debugging messages other than more of the same so I turned debugging off.

At some point in this process, I thought about looking at the source for /etc/rc.d/zpool in the git repository https://cgit.freebsd.org/ for stable/13 and ALSO for main. The location in the source tree is root/libexec/rc/rc.d/zpool. The file I am posting is from main (HEAD) at https://cgit.freebsd.org/src/tree/libexec/rc/rc.d/zpool

Code:
#!/bin/sh
#
# $FreeBSD$
#

# PROVIDE: zpool
# REQUIRE: hostid disks
# BEFORE: mountcritlocal
# KEYWORD: nojail

. /etc/rc.subr

name="zpool"
desc="Import ZPOOLs"
rcvar="zfs_enable"
start_cmd="zpool_start"
required_modules="zfs"

zpool_start()
{
    local cachefile

    for cachefile in /etc/zfs/zpool.cache /boot/zfs/zpool.cache; do
        if [ -r $cachefile ]; then
            zpool import -c $cachefile -a -N
            if [ $? -ne 0 ]; then
                echo "Import of zpool cache ${cachefile} failed," \
                    "will retry after root mount hold release"
                root_hold_wait
                zpool import -c $cachefile -a -N
            fi
            break
        fi
    done
}

load_rc_config $name
run_rc_command "$1"

The section of code that has been added to this file (and is missing in stable/13) is

Code:
if [ $? -ne 0 ]; then
    echo "Import of zpool cache ${cachefile} failed," \
        "will retry after root mount hold release"
    root_hold_wait
    zpool import -c $cachefile -a -N
fi

I believe the root_hold_wait takes into account that the root file system may not necessarily be on ZFS and gives the system time to mount the root file system and then release the hold and continue on to import existing ZFS pools. I pulled the HEAD version of this file and temporarily replaced my existing /etc/rc.d/zpool with this one and rebooted. At that point, my pool was automatically mounted at boot.

This may not be the same problem the op has reported; however, I suspect it might be similar to the trouble iucoen is having (?) since he specifically said the system is booting from an NVMe device. It's not necessarily the device per se, it's having the root file system not on ZFS that I believe is the issue since the 13-STABLE version of this file does not have a root_hold_wait.
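
Before replacing anything, it may be worth checking whether the installed script already contains the retry. A minimal sketch, assuming the script lives at /etc/rc.d/zpool as in this thread; the function name script_uses_root_hold_wait is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given rc script calls root_hold_wait
# on a line that is not a comment. Defaults to /etc/rc.d/zpool.
script_uses_root_hold_wait()
{
    grep -v '^[[:space:]]*#' "${1:-/etc/rc.d/zpool}" 2>/dev/null |
        grep -qw 'root_hold_wait'
}
```

For example, `script_uses_root_hold_wait && echo "retry already present"` prints the message only when the installed script has the call.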

Let me start by saying this is my first post (long time user of FreeBSD) and I have tried super hard to follow the Formatting Guidelines at https://forums.freebsd.org/threads/formatting-guidelines.49535/. If I have made any mistakes, I do apologize in advance.
 

lofty

So it looks like zpool tried to run and then abort()ed. My system boots from an NVMe device, but the ZFS volumes are on SATA drives. The kernel finds the NVMe drive, mounts it, and then `zpool import` runs immediately, before the SATA drives have even been probed.
Interesting. I'm using the same configuration (NVMe as boot device and SATA drives for ZFS pool).

Nice to see you found a workaround. But I don't like the thought of replacing the zpool file. It looks like this is something that has to be fixed (maybe it is on 13 with OpenZFS?).

And sorry for the late reply guys. Thanks a lot for all your answers. :)
 

grahamperrin

Son of Beastie

Reaction score: 1,046
Messages: 3,512

… my first post (long time user of FreeBSD) …
Welcome :)
root_hold_wait
<https://cgit.freebsd.org/src/log/?qt=grep&q=root_hold_wait> finds:
… If I have made any mistakes, I do apologize in advance.
I mention this only because of your interest in mistakes (and this is not exactly a mistake): in lieu of blue, I would have used inline code, i.e.
root_hold_wait
 

dminor125

Thank you :)
<https://cgit.freebsd.org/src/log/?qt=grep&q=root_hold_wait> finds:

I mention this only because of your interest in mistakes (and this is not exactly a mistake): in lieu of blue, I would have used inline code, i.e.
root_hold_wait
Correct, I should have used inline code for
Code:
root_hold_wait
I appreciate the critique. It would be nice to see this code merged into stable/13 but I can function fine with my temporary workaround for now.
 

Euclides

New Member


Messages: 1

Dear Sirs, good morning!
I would like you to help me to make visible in freeNAS 9.3 the datastore where I have my data.
datastore/SAN-IMAGENS
datastore/SAN-VOLUME

As per the attached images.
 

Attachments

  • 1 - Imagem - zfs list.PNG
  • 2 - Imagem - zpool status.PNG
  • 3 - Imagem - ls -l dev da.PNG
  • 4 - Imagem - df -k.PNG
  • 5 - Imagem - gpart e camcontrol.PNG

astyle

Daemon

Reaction score: 764
Messages: 1,645

Dear Sirs, good morning!
I would like you to help me to make visible in freeNAS 9.3 the datastore where I have my data.
datastore/SAN-IMAGENS
datastore/SAN-VOLUME

As per the attached images.
Not to mention it's better to start a new thread when asking for help, rather than continuing an old one. Euclides : you can always link to a relevant thread, and explain why you think it's relevant.
 