Solved ZFS: certain files disappearing after each reboot

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

I have been migrating from UFS to ZFS on my production servers and desktop and have been quite happy with it, except for one weird phenomenon on one of my production servers. I've noticed that each time I reboot this server, my name server (BIND916) stops working. I checked the logs and found plenty of these errors:

Code:
May 31 18:38:28 whateverhostname named[45201]: zone whateverdomain.org/IN: loading from master file /usr/local/etc/namedb/master/whateverdomain.org.db.signed failed: file not found
May 31 18:38:28 whateverhostname named[45201]: zone whateverdomain.org/IN: not loaded due to errors.
I noticed those signed zone files inside /usr/local/etc/namedb/master keep disappearing after reboot. I have to mention that I have a script that generates those *.db.signed every 24 hours. Here are my datasets:

Code:
zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
zroot                      23.8G  51.3G    88K  /zroot
zroot/ROOT                 8.36G  51.3G    88K  none
zroot/ROOT/default         8.36G  51.3G  8.36G  /
zroot/gitea                 104M  51.3G   104M  /var/db/gitea
zroot/mail                 5.44G  51.3G  5.44G  /mail
zroot/omnibackup            134M  51.3G   134M  /var/tmp/omnibackup
zroot/pgdb12               28.8M  40.0G   360K  /var/db/postgres/data12
zroot/pgdb12/data          15.7M  40.0G  15.7M  /var/db/postgres/data12/base
zroot/pgdb12/wal           12.8M  40.0G  12.8M  /var/db/postgres/data12/pg_wal
zroot/srv                   449M  51.3G   449M  /srv
zroot/tmp                   128K  51.3G   128K  /tmp
zroot/usr                  6.91G  51.3G    88K  /usr
zroot/usr/home              299M  51.3G   299M  /usr/home
zroot/usr/obj              1.73G  51.3G  1.73G  /usr/obj
zroot/usr/ports            3.52G  51.3G  2.27G  /usr/ports
zroot/usr/ports/distfiles  1.25G  51.3G  1.25G  /usr/ports/distfiles
zroot/usr/src              1.37G  51.3G  1.37G  /usr/src
zroot/var                  2.33G  51.3G    88K  /var
zroot/var/audit              88K  51.3G    88K  /var/audit
zroot/var/cache            2.31G  51.3G   138M  /var/cache
zroot/var/cache/ccache     2.03G  51.3G  2.03G  /var/cache/ccache
zroot/var/cache/pkg         150M  51.3G   150M  /var/cache/pkg
zroot/var/crash              88K  51.3G    88K  /var/crash
zroot/var/log              24.1M  51.3G  24.1M  /var/log
zroot/var/mail              120K  51.3G   120K  /var/mail
zroot/var/tmp               928K  51.3G   928K  /var/tmp
The output of zfs get all is attached.

Thanks in advance.
 

ralphbsz

Daemon

Reaction score: 1,480
Messages: 2,431

It is exceedingly likely that this is NOT a problem with ZFS, your disks, or your OS in general. Yes, it is theoretically possible for a file to vanish on a hard crash: if you don't use the various flavors of fsync, sync, O_SYNC and all that, and the system crashes within a few (milli-)seconds after a file is written or created, then the file may not be there after the reboot. (Note that I said "crashes", not "shutdown".) More likely, something you are doing is making these files go away.

ZFS is old and extensively tested. It is virtually unimaginable that it deletes random (or non-random!) files on reboot.

If you are still worried about it: Look at boot messages from ZFS in dmesg, and run zpool scrub. But this is unlikely to be the problem.
 

Argentum

Member

Reaction score: 6
Messages: 43

I have seen such a thing. The problem in that case was that a ZFS dataset was not automatically mounted after reboot.
Try zfs mount [-vO] [-o property[,property]...] -a | filesystem

All the "lost" files were actually on the unmounted dataset and ls showed only the contents of the unmounted mount point directory.
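
One way to check for this, as a minimal sketch: list every dataset that is allowed to mount automatically (canmount=on) and has a real mountpoint, but is not currently mounted. The helper name unmounted_datasets is made up for illustration; it only filters the tab-separated output of zfs list -H, so it assumes dataset names and mountpoints contain no tabs.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): reads tab-separated lines of
#   name  canmount  mounted  mountpoint
# on stdin, as printed by `zfs list -H -o name,canmount,mounted,mountpoint`,
# and prints the datasets that should be mounted automatically but are not.
unmounted_datasets() {
    awk -F '\t' '
        $2 == "on" && $3 == "no" && $4 != "none" && $4 != "legacy" && $4 != "-" {
            print $1
        }
    '
}

# Usage on a live system:
#   zfs list -H -o name,canmount,mounted,mountpoint -t filesystem | unmounted_datasets
```

An empty result suggests that every auto-mountable dataset is mounted, which would point away from the "hidden under a mount point" explanation.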
 

asteriskRoss

Well-Known Member

Reaction score: 150
Messages: 444

I agree with Argentum that the most likely explanation is that one of your ZFS datasets is being mounted over the top of a directory that contains data, which exists on another ZFS dataset.

Since the files you are missing reside in /usr/local, possible datasets containing your missing files are:
  • zroot/ROOT/default
  • zroot/usr.
Looking further at the output you attached to your post (very helpful, thank you), I see that zroot/usr has the following properties:
  • mountpoint set to /usr
  • canmount set to off
  • mounted reported as no.
So your zroot/usr dataset wasn't mounted when you generated that output. I suggest seeing what files are in that dataset with the following shell commands:
Code:
mkdir /tmp/zroot_usr
zfs set mountpoint=/tmp/zroot_usr zroot/usr
zfs set canmount=on zroot/usr
zfs mount zroot/usr
ls /tmp/zroot_usr
You may be able to find duplicate/missing files. If you do, that answers the question as to what is happening but not why.

As to the why, does the script you mentioned (or another) mount or unmount ZFS datasets? Do you have zfs_enable="YES" in rc.conf(5) to automatically mount ZFS datasets on startup?

Edit: Corrected zfs set mountpoint command, added zfs mount command
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Thank you all for your help and informative responses; and, sorry for the tardy response.

I checked the ZFS logs after another reboot and it seems everything is fine:

Code:
dmesg | grep -i zfs
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
            to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Trying to mount root from zfs:zroot/ROOT/default []...
I am sure all the datasets are mounted automatically.

Regarding the scripts, they are the same set of scripts I've been running for years and they've never failed me. Here are the scripts.

/usr/local/etc/namedb/scripts/sign-all-zones.sh
Code:
#!/usr/bin/env bash

cd /usr/local/etc/namedb/master/
for file in /usr/local/etc/namedb/master/*.db; do
    /usr/local/etc/namedb/scripts/sign-master-zone.sh "$file"
done

/usr/local/etc/rc.d/named restart
/usr/local/etc/namedb/scripts/sign-master-zone.sh
Code:
#!/usr/bin/env bash

declare ZONE="$1"
declare RC

if [[ $ZONE == *"/"* ]] ;
then
    ZONE=$(basename "$ZONE")
    RC=$?
    if [[ ${RC} -ne 0 ]] ;
    then
        echo "Error: cannot extract file name from path!"
        exit 1
    fi
fi

if [[ "${ZONE}" == "empty.db" ]] \
    || [[ "${ZONE}" == "localhost-forward.db" ]] \
    || [[ "${ZONE}" == "localhost-reverse.db" ]] ;
then
    echo "Error: ${ZONE} cannot be signed!"
    exit 1
fi

if [[ "${ZONE##*.}" == "db" ]] ;
then
    ZONE="${ZONE%.*}"
fi

if [[ -z "${ZONE}" ]] ;
then
    echo "Error: Zone is not specified!"
    exit 1
fi

cd /usr/local/etc/namedb/master
RC=$?

if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: cannot change path to master zone directory!"
    exit 1
fi

declare KSK_KEY=($(ls -tr /usr/local/etc/namedb/keys/ksk/K${ZONE}.*.key))
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to get the KSK key for ${ZONE}!"
    exit 1
fi

declare ZSK_KEY=($(ls -tr /usr/local/etc/namedb/keys/zsk/K${ZONE}.*.key))
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to get the ZSK key for ${ZONE}!"
    exit 1
fi

rm -f /usr/local/etc/namedb/master/${ZONE}.db.signed

dnssec-signzone -a -t -3 `head -c 1024 /dev/urandom | sha512 | cut -b 1-16` -N unixtime \
    -o ${ZONE} \
    -k ${KSK_KEY[0]} \
    /usr/local/etc/namedb/master/${ZONE}.db \
    ${ZSK_KEY[0]}
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to sign the zone ${ZONE}!"
    exit 1
fi

echo "Successfully signed ${ZONE} zone!"

/usr/local/etc/rc.d/named reload
A cron job calls the sign-all-zones.sh script every 24 hours.

I am not sure what goes wrong here, but for the moment everything works as expected. If I find out what's wrong, I'll post back.
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

asteriskRoss thank you for pointing that out. The ZFS setup was done by the FreeBSD installer. What could be the reason for that? I'd like to know so that I can investigate it.
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

You are right asteriskRoss, I did:

Code:
$ zfs get canmount
NAME                       PROPERTY  VALUE     SOURCE
zroot                      canmount  on        default
zroot/ROOT                 canmount  on        default
zroot/ROOT/default         canmount  noauto    local
zroot/gitea                canmount  on        default
zroot/mail                 canmount  on        default
zroot/omnibackup           canmount  on        default
zroot/pgdb12               canmount  on        default
zroot/pgdb12/data          canmount  on        default
zroot/pgdb12/wal           canmount  on        default
zroot/srv                  canmount  on        default
zroot/tmp                  canmount  on        default
zroot/usr                  canmount  off       local
zroot/usr/home             canmount  on        default
zroot/usr/obj              canmount  on        default
zroot/usr/ports            canmount  on        default
zroot/usr/ports/distfiles  canmount  on        default
zroot/usr/src              canmount  on        default
zroot/var                  canmount  off       local
zroot/var/audit            canmount  on        default
zroot/var/cache            canmount  on        default
zroot/var/cache/ccache     canmount  on        default
zroot/var/cache/pkg        canmount  on        default
zroot/var/crash            canmount  on        default
zroot/var/log              canmount  on        default
zroot/var/mail             canmount  on        default
zroot/var/tmp              canmount  on        default
And it's off for both zroot/usr and zroot/var. As I mentioned earlier this was done by the installer :|
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Also, when I try to run your suggested commands, I get:

Code:
$ mkdir /tmp/zroot_usr
$ # zfs set mountpoint=/tmp/zroot_usr
missing dataset name(s)
usage:
    set <property=value> ... <filesystem|volume|snapshot> ...

The following properties are supported:

    PROPERTY       EDIT  INHERIT   VALUES

    available        NO       NO   <size>
    clones           NO       NO   <dataset>[,...]
    compressratio    NO       NO   <1.00x or higher if compressed>
    createtxg        NO       NO   <uint64>
    creation         NO       NO   <date>
    defer_destroy    NO       NO   yes | no
    guid             NO       NO   <uint64>
    logicalreferenced  NO       NO   <size>
    logicalused      NO       NO   <size>
    mounted          NO       NO   yes | no
    origin           NO       NO   <snapshot>
    receive_resume_token  NO       NO   <string token>
    refcompressratio  NO       NO   <1.00x or higher if compressed>
    referenced       NO       NO   <size>
    type             NO       NO   filesystem | volume | snapshot | bookmark
    used             NO       NO   <size>
    usedbychildren   NO       NO   <size>
    usedbydataset    NO       NO   <size>
    usedbyrefreservation  NO       NO   <size>
    usedbysnapshots  NO       NO   <size>
    userrefs         NO       NO   <count>
    written          NO       NO   <size>
    aclinherit      YES      YES   discard | noallow | restricted | passthrough | passthrough-x
    aclmode         YES      YES   discard | groupmask | passthrough | restricted
    atime           YES      YES   on | off
    canmount        YES       NO   on | off | noauto
    casesensitivity  NO      YES   sensitive | insensitive | mixed
    checksum        YES      YES   on | off | fletcher2 | fletcher4 | sha256 | sha512 | skein
    compression     YES      YES   on | off | lzjb | gzip | gzip-[1-9] | zle | lz4
    copies          YES      YES   1 | 2 | 3
    dedup           YES      YES   on | off | verify | sha256[,verify], sha512[,verify], skein[,verify]
    devices         YES      YES   on | off
    dnodesize       YES      YES   legacy | auto | 1k | 2k | 4k | 8k | 16k
    exec            YES      YES   on | off
    filesystem_count YES       NO   <count>
    filesystem_limit YES       NO   <count> | none
    jailed          YES      YES   on | off
    logbias         YES      YES   latency | throughput
    mlslabel        YES      YES   <sensitivity label>
    mountpoint      YES      YES   <path> | legacy | none
    nbmand          YES      YES   on | off
    normalization    NO      YES   none | formC | formD | formKC | formKD
    primarycache    YES      YES   all | none | metadata
    quota           YES       NO   <size> | none
    readonly        YES      YES   on | off
    recordsize      YES      YES   512 to 1M, power of 2
    redundant_metadata YES      YES   all | most
    refquota        YES       NO   <size> | none
    refreservation  YES       NO   <size> | none
    reservation     YES       NO   <size> | none
    secondarycache  YES      YES   all | none | metadata
    setuid          YES      YES   on | off
    sharenfs        YES      YES   on | off | share(1M) options
    sharesmb        YES      YES   on | off | sharemgr(1M) options
    snapdir         YES      YES   hidden | visible
    snapshot_count  YES       NO   <count>
    snapshot_limit  YES       NO   <count> | none
    sync            YES      YES   standard | always | disabled
    utf8only         NO      YES   on | off
    version         YES       NO   1 | 2 | 3 | 4 | 5 | current
    volblocksize     NO      YES   512 to 128k, power of 2
    volmode         YES      YES   default | geom | dev | none
    volsize         YES       NO   <size>
    vscan           YES      YES   on | off
    xattr           YES      YES   on | off
    userused@...     NO       NO   <size>
    groupused@...    NO       NO   <size>
    userquota@...   YES       NO   <size> | none
    groupquota@...  YES       NO   <size> | none
    written@<snap>   NO       NO   <size>

Sizes are specified in bytes with standard units such as K, M, G, etc.

User-defined properties can be specified by using a name containing a colon (:).

The {user|group}{used|quota}@ properties must be appended with
a user or group specifier of one of these forms:
    POSIX name      (eg: "matt")
    POSIX id        (eg: "126829")
    SMB name@domain (eg: "matt@sun")
    SMB SID         (eg: "S-1-234-567-89")
 

Eric A. Borisch

Aspiring Daemon

Reaction score: 295
Messages: 505

The installer tries to set reasonable defaults that support beadm(1) boot environments, as these are an awesome feature, and it’s harder to change your configuration to work with them after the fact.

The (unmounted) zroot/usr dataset is created to act as an organizational/hierarchical parent to the zroot/usr/home, ports, obj and src directories, as these are directories you don’t (in general) want to save versions of in boot environments. (Likewise the zroot/var dataset is unmounted and used as an organizational node.)

As with any default setting in an operating system install, they work well for some (hopefully most) but not everyone.
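
To see which dataset actually holds a given path under this layout, the following is a rough sketch. The function name backing_dataset is hypothetical; it reads the two-column output of zfs mount and picks the mounted dataset with the longest matching mountpoint, assuming mountpoints contain no whitespace. On a live system, df /some/path gives the same answer more directly.

```shell
#!/bin/sh
# Hypothetical helper: backing_dataset <absolute-path>
# Reads "dataset mountpoint" lines on stdin (the format printed by `zfs mount`)
# and prints the mounted dataset whose mountpoint is the longest prefix of the path.
backing_dataset() {
    awk -v path="$1" '
        {
            mp = $2
            # Match whole path components only, so /usr/home does not match /usr/homestead.
            prefix = (mp == "/") ? "/" : (mp "/")
            if (path == mp || index(path, prefix) == 1) {
                if (length(mp) > best_len) {
                    best_len = length(mp)
                    best = $1
                }
            }
        }
        END { if (best != "") print best }
    '
}

# Usage on a live system:
#   zfs mount | backing_dataset /usr/local/etc/namedb/master
```

With the mount list posted in this thread, /usr/local/etc/namedb/master has no dataset mounted at /usr or /usr/local above it, so it falls through to zroot/ROOT/default and is part of the boot environment.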
 

asteriskRoss

Well-Known Member

Reaction score: 150
Messages: 444

$ # zfs set mountpoint=/tmp/zroot_usr
Whoops. I should have suggested running zfs set mountpoint=/tmp/zroot_usr zroot/usr (noting the addition of the ZFS dataset name).

If this dataset has never been mounted there should be nothing on it. When you said in your earlier post that you had migrated your systems from UFS, I assumed you had manually copied files (with tar(1) or similar) from a UFS filesystem to a manually configured ZFS filesystem, though one of your later posts indicates this might not be the case. Remember to set the canmount property back to off when you're done.
 

mjollnir

Active Member

Reaction score: 66
Messages: 231

[...] I noticed those signed zone files inside /usr/local/etc/namedb/master keep disappearing after reboot. I have to mention that I have a script that generates those *.db.signed every 24 hours. [...]
Is /usr/local/etc/namedb or namedb/master a symlink (to /var/...)?
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Eric A. Borisch thank you for the great explanation.

asteriskRoss it says:

Code:
$ zfs set mountpoint=/tmp/zroot_usr zroot/usr
cannot unmount '/usr/home': Device busy
I did back up all my data using sysutils/omnibackup (e.g. Postgres, Gitea, configuration files, web server files, etc) and restored them after installation using tar xvJpf.

mjollnir no, they are not symlinks.
 

asteriskRoss

Well-Known Member

Reaction score: 150
Messages: 444

[canmount is] off for both zroot/usr and zroot/var. As I mentioned earlier this was done by the installer :|
In itself it isn't a problem to have unmountable ZFS datasets to group child datasets together. However, it is not what you expected. When I use such unmountable grouping datasets on my systems I set the mountpoint property to none so I'm not confused about the relationship of ZFS datasets to the mounted filesystem later. It might be worth suggesting a change to the FreeBSD installer for the same reason.

Code:
$ zfs set mountpoint=/tmp/zroot_usr zroot/usr
cannot unmount '/usr/home': Device busy
I just tried the commands on my development system for an unmounted and unmountable (canmount set to off) parent dataset of a mounted child dataset. It worked fine for me, though I did need to run the command zfs mount pool/dataset to explicitly mount it -- I've updated my earlier post to include this.

What this suggests to me is that your zroot/usr dataset was, in fact, mounted when you posted. Can you please post the outputs of:
Bash:
zfs list -o name,mounted,mountpoint,canmount zroot/usr
zfs mount
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Thank you for the explanation. Here is the output:

Code:
$ zfs list -o name,mounted,mountpoint,canmount zroot/usr
NAME       MOUNTED  MOUNTPOINT  CANMOUNT
zroot/usr       no  /usr             off

$ zfs mount
zroot/ROOT/default              /
zroot/tmp                       /tmp
zroot/mail                      /mail
zroot/srv                       /srv
zroot/var/mail                  /var/mail
zroot/gitea                     /var/db/gitea
zroot/var/crash                 /var/crash
zroot/var/log                   /var/log
zroot/var/tmp                   /var/tmp
zroot                           /zroot
zroot/var/cache                 /var/cache
zroot/pgdb12                    /var/db/postgres/data12
zroot/usr/obj                   /usr/obj
zroot/usr/home                  /usr/home
zroot/var/audit                 /var/audit
zroot/usr/src                   /usr/src
zroot/usr/ports                 /usr/ports
zroot/omnibackup                /var/tmp/omnibackup
zroot/pgdb12/data               /var/db/postgres/data12/base
zroot/var/cache/ccache          /var/cache/ccache
zroot/var/cache/pkg             /var/cache/pkg
zroot/pgdb12/wal                /var/db/postgres/data12/pg_wal
zroot/usr/ports/distfiles       /usr/ports/distfiles
 

asteriskRoss

Well-Known Member

Reaction score: 150
Messages: 444

What this suggests to me is that your zroot/usr dataset was, in fact, mounted when you posted.
:oops: I was wrong about this. The "device busy" error message about zroot/usr/home appears because it inherits its mountpoint value from zroot/usr, so when you tried to change the mountpoint for zroot/usr you were in fact also trying to change the mountpoints for zroot/usr/home, zroot/usr/obj, etc. This was clear in the output you posted from zfs get all. My apologies.

Since the canmount property of zroot/usr is set to off and you aren't running any scripts that change it, I think it would be reasonable to eliminate my "mounting over the top of an existing directory" hypothesis. Unfortunately, that still doesn't identify the source of your issue. Slightly clutching at straws, it might be worth checking fstab(5) in case you have any filesystems mounting that you weren't intending and also looking at the output of the mount command for the same reason.
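
For anyone wanting to check the inheritance before changing a parent's mountpoint, here is a small sketch. The helper name inherits_mountpoint_from is invented for illustration; it filters the tab-separated name and source columns printed by zfs get -H and lists the datasets whose mountpoint source reads "inherited from" the given parent.

```shell
#!/bin/sh
# Hypothetical helper: inherits_mountpoint_from <parent-dataset>
# Reads tab-separated "name<TAB>source" lines on stdin, as printed by
#   zfs get -H -r -o name,source mountpoint <parent-dataset>
# and prints every dataset whose mountpoint is inherited from that parent.
inherits_mountpoint_from() {
    awk -F '\t' -v parent="$1" '$2 == ("inherited from " parent) { print $1 }'
}

# Usage on a live system:
#   zfs get -H -r -o name,source mountpoint zroot/usr | inherits_mountpoint_from zroot/usr
```

Every dataset it prints would be remounted by a zfs set mountpoint on the parent, which is why a busy child such as /usr/home can block the change.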
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Here is the output for mount:

Code:
zroot/ROOT/default on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs, local, multilabel)
zroot/srv on /srv (zfs, local, noatime, nfsv4acls)
zroot/mail on /mail (zfs, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/obj on /usr/obj (zfs, local, noatime, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/src on /usr/src (zfs, local, noatime, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/cache on /var/cache (zfs, local, noatime, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/pgdb12 on /var/db/postgres/data12 (zfs, local, noatime, nfsv4acls)
zroot/gitea on /var/db/gitea (zfs, local, noatime, nfsv4acls)
zroot/var/cache/pkg on /var/cache/pkg (zfs, local, noatime, nfsv4acls)
zroot/var/cache/ccache on /var/cache/ccache (zfs, local, noatime, nfsv4acls)
zroot/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noatime, nosuid, nfsv4acls)
zroot/pgdb12/wal on /var/db/postgres/data12/pg_wal (zfs, local, noatime, nfsv4acls)
zroot/pgdb12/data on /var/db/postgres/data12/base (zfs, local, noatime, nfsv4acls)
zroot/omnibackup on /var/tmp/omnibackup (zfs, local, noatime, nfsv4acls)
fdescfs on /dev/fd (fdescfs)
And,

Code:
$ cat /etc/fstab
# Device        Mountpoint    FStype    Options        Dump    Pass#
/dev/ada0p2        none    swap    sw        0    0

# for some programs such as kde4
proc /proc procfs rw,noauto,late 0 0

# for some programs such as bash
fdesc /dev/fd fdescfs rw,late 0 0
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

I came to the conclusion that the problem is not ZFS itself. After much thought, I have some clues. I ran the script many times and rebooted the system every time, with no issues so far. But I now remember that the last two times it happened, it was after security/libressl updates. I have this in my /etc/make.conf:

Code:
CPUTYPE?=corei7
MAKE_JOBS_SAFE=yes
MAKE_JOBS_NUMBER=3

# /usr/ports/Mk/bsd.default-versions.mk
DEFAULT_VERSIONS+=gcc=9
DEFAULT_VERSIONS+=linux=c7_64
DEFAULT_VERSIONS+=llvm=10
DEFAULT_VERSIONS+=mysql=10.4m
DEFAULT_VERSIONS+=pgsql=12
DEFAULT_VERSIONS+=php=7.4
DEFAULT_VERSIONS+=python=3.7 python2=2.7 python3=3.7
DEFAULT_VERSIONS+=rust=rust
DEFAULT_VERSIONS+=ssl=libressl

OPTIONS_UNSET+=X11

WITH_CCACHE_BUILD=yes
CCACHE_DIR=/var/cache/ccache

KERNCONF=CUSTOM
So, everything in ports is built against LibreSSL. I remember I had build issues with some packages when I ran:

Code:
$ portupgrade -fr security/libressl
Or,

Code:
$ portmaster -r libressl
In the past, they worked fine and successfully rebuilt everything dependent on security/libressl. But during the last two upgrades, they failed to rebuild OpenLDAP and, if I remember correctly, BIND as well. So I had to uninstall and reinstall those packages manually to get them working. That is my clue: maybe something went wrong during the dns/bind916 uninstall/install, although that should not touch files in /usr/local/etc/namedb that have been modified or created manually. I have to investigate more.

Thank you all for your help.
 

richardtoohey2

Active Member

Reaction score: 35
Messages: 100

A ports-building/installation issue is a more comfortable thought than a file system that loses files!

Will be interesting to see what you find.
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

Well, I have a theory now!

I uninstalled and reinstalled dns/bind916 and, as expected, it did not touch my configuration files inside /usr/local/etc/namedb. Here is what I think happens: my scripts run every 24 hours. During these upgrades (I am building from ports and this VM is a bit slow), if the scripts run after a libressl version bump, the BIND tools probably fail to run due to missing .so files. Although my scripts above check status codes and should exit on failure, I have this line in one of them:

Code:
rm -f /usr/local/etc/namedb/master/${ZONE}.db.signed
And after that:

Code:
dnssec-signzone -a -t -3 `head -c 1024 /dev/urandom | sha512 | cut -b 1-16` -N unixtime \
    -o ${ZONE} \
    -k ${KSK_KEY[0]} \
    /usr/local/etc/namedb/master/${ZONE}.db \
    ${ZSK_KEY[0]}
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to sign the zone ${ZONE}!"
    exit 1
fi

echo "Successfully signed ${ZONE} zone!"

/usr/local/etc/rc.d/named reload
So the signed file gets deleted first, and only then is dnssec-signzone called; if it fails to run, the file is already gone. I guess I have to do a check before running that rm -f. I'll try to test it.
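
One way to test the missing .so theory without running the signing script is to ask ldd(1) which shared objects a binary cannot resolve. This is only a sketch: the helper name missing_libs is made up, and it assumes ldd marks unresolved libraries with the words "not found" on the same line as the library name.

```shell
#!/bin/sh
# Hypothetical helper: reads ldd(1) output on stdin and prints the names of
# shared objects that ldd reports as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage on a live system (prints nothing when all libraries resolve):
#   ldd "$(command -v dnssec-signzone)" | missing_libs
```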
 

NuLL3rr0r

Active Member

Reaction score: 19
Messages: 220

OK, theory confirmed! richardtoohey2 you were right about this.

I deinstalled security/libressl and as expected dnssec-signzone failed with the error:

Code:
$ dnssec-signzone -V
ld-elf.so.1: Shared object "libcrypto.so.46" not found, required by "dnssec-signzone"
I then ran the script, and those *.signed files did get deleted. So I modified this part of the script to add a safety fence for such situations:

Code:
dnssec-signzone -V
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to run dnssec-signzone!"
    echo "Aborting..."
    exit 1
fi

rm -f /usr/local/etc/namedb/master/${ZONE}.db.signed

dnssec-signzone -a -t -3 `head -c 1024 /dev/urandom | sha512 | cut -b 1-16` -N unixtime \
    -o ${ZONE} \
    -k ${KSK_KEY[0]} \
    /usr/local/etc/namedb/master/${ZONE}.db \
    ${ZSK_KEY[0]}
RC=$?
if [[ ${RC} -ne 0 ]] ;
then
    echo "Error: failed to sign the zone ${ZONE}!"
    exit 1
fi

echo "Successfully signed ${ZONE} zone!"

/usr/local/etc/rc.d/named reload
Now, if the dnssec-signzone -V command fails for whatever reason, the script exits:

Code:
$ /usr/local/etc/namedb/scripts/sign-master-zone.sh babaei.net
Error: failed to run dnssec-signzone!
Aborting...
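
A possible further hardening, sketched here under stated assumptions rather than taken from the script in this thread: sign into a temporary file with the -f output option of dnssec-signzone and rename it over the old *.signed file only when signing succeeds. The function name sign_zone_safely and the SIGNER variable are hypothetical, and the -3 salt option is left out to keep the example short.

```shell
#!/bin/sh
# SIGNER is an assumption for illustration: it defaults to dnssec-signzone
# and can be pointed at a stub when testing.
SIGNER="${SIGNER:-dnssec-signzone}"

# Hypothetical helper: sign_zone_safely <zone> <zone-file> <ksk-key> <zsk-key>
# Signs into a temporary file and replaces <zone-file>.signed only on success,
# so a failed run leaves the previous signed zone in place.
sign_zone_safely() {
    local zone="$1" zone_file="$2" ksk="$3" zsk="$4"
    local signed="${zone_file}.signed"
    local tmp="${signed}.tmp.$$"

    # -f names the output file; without it the signer writes <zone-file>.signed.
    if ! "${SIGNER}" -a -t -N unixtime -o "${zone}" -f "${tmp}" \
            -k "${ksk}" "${zone_file}" "${zsk}"; then
        rm -f "${tmp}"
        echo "Error: failed to sign the zone ${zone}; old signed file kept." >&2
        return 1
    fi

    # A rename within the same directory is atomic, so named never sees a
    # missing or half-written signed zone.
    mv -f "${tmp}" "${signed}"
}

# Usage (key paths are placeholders):
#   sign_zone_safely example.org /path/to/example.org.db /path/to/ksk /path/to/zsk
```

With this shape the rm -f before signing is no longer needed, and any signing failure, not only a missing library, leaves the last good *.signed file for named to load.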
Thank you all for your help. The issue has been solved and as you have predicted, it had nothing to do with ZFS stability on FreeBSD.
 