Bizarre ZFS filesystem behaviour - zombie mount of dataset

Hi,

I have a problem with disappearing datasets.
They don't completely disappear, but they are not doing what they should.

The first time it happened, a simple zfs umount and zfs mount sorted it out.
Now I have a running jail whose root is the top of a dataset.
If I enter the jail, all of the files are visible.
If I look from outside the jail, they are not.

This is the third time this has happened, for 3 different datasets/jails.

Code:
root@currawong:/m/.zfs/snapshot # uname -a
FreeBSD currawong 12.1-STABLE FreeBSD 12.1-STABLE r354424 GENERIC  amd64

root@currawong:/m/jails/wp.example.com.au # zfs mount | grep wp
m/jails/wp.example.com.au      /m/jails/wp.example.com.au
m/db/wpdb                       /m/jails/wp.example.com.au/var/db


root@currawong:/m/jails/wp.example.com.au # ls -la /m/jails/wp.example.com.au
total 3
drwxr-xr-x   3 root  wheel   3 Dec 17 07:36 .
drwxr-xr-x  19 root  wheel  19 Dec 19 14:56 ..
drwxr-xr-x   3 root  wheel   3 Dec 17 07:36 var

root@currawong:/m/jails/wp.example.com.au # jls
   JID  IP Address      Hostname                      Path
     1  2.3.4.5       wp.example.com.au            /m/jails/wp.example.com.au

root@currawong:/m/jails/wp.example.com.au # jexec 1 ls -la /
total 221
drwxr-xr-x  21 root  wheel   22 Dec 11 07:40 .
drwxr-xr-x  21 root  wheel   22 Dec 11 07:40 ..
drwxr-xr-x   2 root  wheel   47 Sep 29  2017 bin
dr-xr-xr-x  18 root  wheel  512 Dec 11 02:41 dev
drwxr-xr-x  25 root  wheel  113 Oct 24 16:28 etc
drwxr-xr-x   4 root  wheel    4 Sep 12  2018 home
drwxr-xr-x   3 root  wheel   52 Sep 29  2017 lib
drwxr-xr-x   3 root  wheel    4 Sep 29  2017 libexec
drwxr-xr-x   2 root  wheel    2 Sep 29  2017 media
drwxr-xr-x   2 root  wheel    2 Sep 29  2017 mnt
drwxr-xr-x  11 root  wheel   18 Aug 21  2018 old
drwxr-xr-x   2 root  wheel    2 Oct 16 15:14 oldvhosts
dr-xr-xr-x   1 root  wheel    0 Dec 19 15:24 proc
drwxr-xr-x   2 root  wheel  148 Sep 29  2017 rescue
drwxr-xr-x   7 root  wheel   30 Dec 19 14:37 root
drwxr-xr-x   2 root  wheel  134 Sep 29  2017 sbin
lrwxr-xr-x   1 root  wheel   11 Sep 29  2017 sys -> usr/src/sys
drwxrwsrwt  17 root  wheel  744 Dec 19 03:39 tmp
drwxr-xr-x  16 root  wheel   16 Sep 12  2018 usr
drwxr-xr-x  25 root  wheel   25 Sep  4  2018 var
drwxrwx--x  54 root  wheel   55 Dec  5 11:34 vhosts
drwxr-xr-x   2 root  wheel   11 Oct 22 13:17 whosts

root@currawong:/ # cd /m/jails/wp.example.com.au/.zfs/snapshot
/m/jails/wp.example.com.au/.zfs/snapshot: No such file or directory.

root@currawong:/m/.zfs/snapshot # zfs get all m/jails/wp.example.com.au | grep mount
m/jails/wp.example.com.au  mounted               yes                          -
m/jails/wp.example.com.au  mountpoint            /m/jails/wp.example.com.au  default
m/jails/wp.example.com.au  canmount              on                           default

root@currawong:/m/.zfs/snapshot # zfs mount m/jails/wp.example.com.au
cannot mount 'm/jails/wp.example.com.au': filesystem already mounted
root@currawong:/m/.zfs/snapshot # zfs umount m/jails/wp.example.com.au
cannot unmount '/m/jails/wp.example.com.au': Device busy
root@currawong:/m/.zfs/snapshot # zfs mount m/jails/wp.example.com.au
cannot mount 'm/jails/wp.example.com.au': filesystem already mounted

The last time it happened, I completely stopped the jail, umounted and mounted, and all was well.
But that's not going to be a long term solution when this system is live.

Any help appreciated,

Danny
 
ZFS filesystems can be mounted over one another.
So, while zfs mount tells you where some filesystem has been mounted, df /path tells you which filesystem is actually visible to the system at a given path.
I would get curious what the system says about df / /m /m/jails /m/jails/wp.example.com.au

Besides: the Alien movies told us that when the landscape gets strange, it is unwise to just step into things of unknown origin. Same goes for cd'ing into filesystems when things appear strange. Better analyze the situation from a safe place like /root.

And: I'm wondering how You get a default explicit mountpoint? All my explicit mountpoints are either local, received or inherited - logically, as somewhere the information must have come from.
 
Hi PMc,

Thanks for your response. I think I understand what is going on. Not entirely sure what to do about it, but...

Something (some command sequence) has caused datasets like m/jails/wp.example.com.au to be mounted BEFORE m/jails.

Just glancing at the output of mount I can see that the invisible datasets are earlier in the list than m/jails, which indicates to me that they were mounted first.
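That suspicion can be checked mechanically. Here is a rough sketch (my own, not from the affected machine) that walks a list of mountpoints in the order mount printed them and flags any child that appears before its parent; the sample paths are just the ones from this thread:

```shell
#!/bin/sh
# Sketch: flag mountpoints that appear in `mount` output before their
# parent mountpoint. The sample paths below are illustrative; on a real
# system you would feed it something like `mount -p | awk '{print $2}'`.
check_order() {
    seen="|"
    bad=""
    for mp in "$@"; do
        parent=$(dirname "$mp")
        if [ "$parent" != "/" ]; then
            case "$seen" in
                *"|$parent|"*) : ;;      # parent already mounted: fine
                *) bad="$bad $mp" ;;     # child mounted before its parent
            esac
        fi
        seen="$seen$mp|"
    done
    echo "shadowed:$bad"
}

# The broken order from this thread: the jail dataset before m/jails.
check_order /m /m/jails/wp.example.com.au /m/jails
```

Anything listed after "shadowed:" was mounted before its parent and will be hidden once the parent mounts over it.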

So now, we get this:

Code:
root@currawong:/m/jails # df / /m /m/jails/ /m/jails/wp.example.com.au
Filesystem                  1K-blocks    Used      Avail Capacity  Mounted on
/dev/ada0p2                  10143484 2061864    7270144    22%    /
m                          1051977486      26 1051977460     0%    /m
m/jails                    1051977494      34 1051977460     0%    /m/jails
m/jails                    1051977494      34 1051977460     0%    /m/jails


while it should look like this:

Code:
root@currawong:/m/jails # df / /m /m/jails/ /m/jails/wp.example.com.au
Filesystem                  1K-blocks    Used      Avail Capacity  Mounted on
/dev/ada0p2                  10143484 2061960    7270048    22%    /
m                          1051976626      26 1051976600     0%    /m
m/jails                    1051976634      34 1051976600     0%    /m/jails
m/jails/wp.example.com.au  1055366501 3389901 1051976600     0%    /m/jails/wp.example.com.au


I did notice previously that the /var/db dataset for some jails was mounted before the jail dataset, causing /var/db/mysql to be invisible, but I hadn't joined the dots to the disappearing entire jails.

Is there any way to help zfs to know the sane order in which to mount datasets?

I'm not sure about the comment about the default explicit mountpoint. All my dataset mountpoints look like that when they are the default mountpoint.
When they are explicit, "local" is in the last column.

Thanks very much,

Danny
 
Something (some command sequence) has caused datasets like m/jails/wp.example.com.au to be mounted BEFORE m/jails.

Something like that is what I supposed. It can happen.

Is there any way to help zfs to know the sane order in which to mount datasets?

Normally it doesn't need to, it figures that out by itself. I have a quite similar setup, and never had problems with that, even when mixing different pools together.
Recently a "parallel mount" feature was added, so now my mount list looks rather chaotic, but the sequence is correct.
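The sane sequence is essentially what a plain sort of the mountpoints gives, since a parent path always sorts before its children. A toy illustration of that ordering (my own sketch, not the actual ZFS mount code):

```shell
#!/bin/sh
# Toy illustration: a lexical sort puts every parent mountpoint before
# its children, which is the order the mounts must happen in.
# Sample paths from this thread only.
sane_order() {
    printf '%s\n' "$@" | sort
}

sane_order /m/jails/wp.example.com.au /m /m/jails
```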
Problems come when you mix "legacy" mounts (which are declared in /etc/fstab) with regular ZFS mounts - then you must take care of the proper sequence yourself. Or when some other script interferes by running its own mount commands.
ZFS itself does umount/mount operations when e.g. receiving a dataset. That should also work correctly, but I don't know to what extent. If such operations are in place, You should have a look at them.
Or there might be some problem with 12.x - I'm still on 11.3.

I'm not sure about the comment about the default explicit mountpoint. All my dataset mountpoints look like that when they are the default mountpoint.
When they are explicit, "local" is in the last column.

Strange. Mine look like this:
Code:
gr                                               mountpoint  none                                 received
gr/pgdata                                        mountpoint  /var/db/postgres                     received
gr/pgdata/arch                                   mountpoint  /var/db/postgres/arch                inherited from gr/pgdata
gr/pgdata/tblspc2                                mountpoint  /var/db/postgres/tblspc2             inherited from gr/pgdata
im                                               mountpoint  none                                 received
im/local                                         mountpoint  /usr/local                           local
im/pgdata                                        mountpoint  /var/db/postgres/data10              received
im/pgdata/pg_wal                                 mountpoint  /var/db/postgres/data10/pg_wal       inherited from im/pgdata

Hm, maybe they have been "default" long ago, and there has been too much copied around, so they are all "received" now...
 