zfs snapshot NFS hang

So, I have an issue where, after some point (an arbitrary number of snapshots? a specific snapshot? gremlins?), exporting a directory that contains a .zfs directory, whether or not snapdir=hidden is set, and then listing that directory on the NFS client causes a total hang of the dataset and of some commands like "zdb -d poolname". I don't know the root cause, so I don't know how to reproduce it or create a PR.

e.g.
# echo /tank/dataset -maproot=root 10.10.10.10 >> /etc/exports
# kill -HUP `cat /var/run/mountd.pid`
# tail /var/log/messages
Code:
Feb  8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120204134001: Invalid argument
Feb  8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120208140000: Invalid argument
...
(I would think the above shows that the NFS server doesn't really cope with this situation, i.e. it's buggy.)
# ssh 10.10.10.10 "mount bcnas1:/tank/dataset /mountpoint ; ls /mountpoint/.zfs/snapshot"
(this command hangs, and so does anything else that uses the same dataset after this point)

There was a point when this would not hang. Instead, it would just show many directories, and for many other snapshots (all of the ones listed in the /var/log/messages errors, and more) there would be strange binary files, or directories containing the wrong files (it would show me a subdirectory of the snapshot instead of the snapshot's correct root).

All of these problems happen whether I set snapdir=hidden or snapdir=visible.

Today an identical problem happened, and I rebooted to fix it. I don't know if it was the same cause, but I would like to find out.
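Next time it hangs, the only idea I have so far is to dump the kernel stacks of the stuck processes before rebooting. A minimal sketch, assuming the commands are wedged in the kernel rather than spinning (the PID is whatever is actually hung, e.g. the stuck zdb):
Code:
# ps axl                                   # look for processes stuck in state "D" (uninterruptible wait)
# procstat -kk <pid-of-hung-process>       # kernel stack of the hung process
# procstat -kk `cat /var/run/mountd.pid`   # mountd's stack, for comparison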

Beyond that, can someone give me ideas on how to track down the problem, or tell me which source files in /usr/src I should open up, hack apart, or add debugging to, so that I can either:
  • find the root cause of the problem
  • prevent NFS from exporting any .zfs directories

Or does someone know if this has been fixed in the latest 8-STABLE or 9?

I mentioned this problem briefly before in my old thread here.

The only 'perfect' workaround I can think of is reorganizing all my datasets so the root directory is empty except for one subdirectory, and then sharing only that subdirectory, which does not contain a .zfs directory. But ideally, NFS clients should be able to view snapshots to recover files.
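A rough sketch of that layout (paths made up), where only the subdirectory is exported, so the root of the client's mount has no .zfs entry:
Code:
# zfs create tank/newdataset               # dataset root: this is where .zfs lives; never exported
# mkdir /tank/newdataset/data              # plain subdirectory: no .zfs of its own
# echo /tank/newdataset/data -maproot=root 10.10.10.10 >> /etc/exports
# kill -HUP `cat /var/run/mountd.pid`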
 
Why don't you use sharenfs?

# zfs set sharenfs="maproot=0" tank/dataset

This only exports .zfs if it's not hidden.
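If in doubt, check what the dataset is actually set to first (adjust the dataset name):
Code:
# zfs get sharenfs,snapdir tank/dataset
# zfs set snapdir=hidden tank/dataset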
 
Everything I have read about ZFS NFS sharing on FreeBSD says that it is not like Solaris: it is just the same NFS server as the normal FreeBSD one and acts exactly the same. Therefore, I would conclude that "sharenfs" only edits the /etc/zfs/exports file, and the normal daemon does the work, reading both that file and /etc/exports.

See output from # ps aux | grep mount
Code:
root  51407  0.0  0.0 19208  4556  ??  Is    7:59PM   0:00.01 /usr/sbin/mountd -r -p 876 /etc/exports /etc/zfs/exports
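If my conclusion is right, something like this should confirm it (assuming the default file locations shown in the mountd command line above):
Code:
# zfs set sharenfs="maproot=0" tank/dataset
# cat /etc/zfs/exports            # sharenfs should have written the export line here
# showmount -e localhost          # mountd serves entries from this file and /etc/exports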

Are you sure about that? Test it by sharing a directory and checking whether .zfs exists. Do not simply do # ls -a, but actually use the directory as if it existed: # cd .zfs/snapshot; ls.
 
peetaur said:
Are you sure about that? Test it by sharing a directory and checking whether .zfs exists. Do not simply do # ls -a, but actually use the directory as if it existed: # cd .zfs/snapshot; ls.
Yep, you're right.

Code:
dice@molly:~>zfs list tank/FreeBSD/ports
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank/FreeBSD/ports  19.7G  1.29T   362M  /usr/ports
dice@molly:~>cd /usr/ports/.zfs/snapshot
dice@molly:/usr/ports/.zfs/snapshot>ls
20111225 20120112 20120123 20120131 20120204

But it's not an issue on my side.

Code:
dice@williscorto:~>mount | grep /usr/ports
molly:/usr/ports on /usr/ports (nfs, read-only)
molly:/usr/ports/distfiles on /usr/ports/distfiles (nfs, read-only)
molly:/usr/ports/packages on /usr/ports/packages (nfs, read-only)
dice@williscorto:~>cd /usr/ports/.zfs/snapshot
dice@williscorto:/usr/ports/.zfs/snapshot>ls
20111225 20120112 20120123 20120131 20120204
 
Maybe # ls -l or # find . -maxdepth 2 would be more likely to crash it.

But also, it looks like you don't have as many snapshots as I do.

Code:
[CMD="#"]ls -1 /tank/.zfs/snapshot/ | wc -l[/CMD]
     751

And thanks very much for trying. I really wish I had help fixing this.
 
peetaur said:
Maybe # ls -l or # find . -maxdepth 2 would be more likely to crash it.
Tried # find /usr/ports/.zfs/snapshot -type f, which produced a humongous list, but it didn't crash the server.

FWIW I'm running
Code:
dice@williscorto:/usr/ports/.zfs/snapshot>uname -a
FreeBSD williscorto.dicelan.home 9.0-STABLE FreeBSD 9.0-STABLE #0: Wed Jan 25 13:03:03 CET 2012     root@molly.dicelan.home:/usr/obj/usr/src/sys/CORTO  amd64
Same version on the server.
 
I suggested using maxdepth so that you would scan inside all of the snapshots, rather than just the first one before hitting CTRL+C.
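Something like this is what I had in mind (adjust the path), so it touches the top level of every snapshot without recursing all the way down:
Code:
# find /usr/ports/.zfs/snapshot -maxdepth 2 -ls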

And unfortunately, this bug is hard to reproduce... I only have the one server where it happens. I think even on the replicated server, the problem does not exist. Let's test that...

# mount bcnas1bak:/tank/bcnasvm1 bcnasvm1 -o ro
# cd bcnasvm1/.zfs/snapshot
# ls -l

No hang. But some results are strange and interesting.

Normal looking snapshot:
Code:
drwxr-xr-x 27 root    root          27 2012-01-12 17:45 daily-2012-01-13T00:00:09

Several strange ones (files instead of directories; also wrong owner):
Code:
-rw-r--r--  1 root    root           0 2012-01-03 15:33 daily-2012-01-18T00:00:00
-rw-r--r--  1 root    root        2674 2011-03-15 02:20 daily-2012-01-29T00:00:00
-rwxr-xr-x  1 openvpn sambashare  8355 2011-09-20 12:33 daily-2012-02-03T00:00:00
-rw-rw-r--  1 root    root        8689 2010-12-30 13:04 daily-2012-02-08T00:00:00
-rw-r--r--  1 root    sambashare 23778 2011-04-02 19:20 replication-20120130002001

One of those weird things where it is not the correct directory (also wrong owner):

Code:
drwxr-xr-x  2 openvpn users          3 2011-10-11 14:16 replication-20120202195044

root@peter:/mnt/bcnasvm1/.zfs/snapshot# ls -l replication-20120202195044
total 3
-r--r--r-- 1 openvpn users 704 2011-10-11 14:16 schema.m

(Note: there is no openvpn user on the NAS; it just has the same UID as something on my workstation.)
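(A quick way to see the raw numbers behind that; the UID below is only an example, not the real one:)
Code:
# ls -ln replication-20120202195044    # -n shows the numeric UID/GID that NFS actually sent
# getent passwd 1002                   # example UID: resolves to "openvpn" on my workstation, to nothing on the NAS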

Some snapshots that seem to think they are symlinks:
Code:
lrwxrwxrwx  1 root    root          25 2011-05-17 11:40 replication-20120203042000 -> applications-graphics.png
lrwxrwxrwx  1 root    root          42 2011-08-05 08:43 replication-20120203044001 -> ../../C/figures/naut_sampleemblem_icon.png

And of course these files look fine without NFS:

# ls -ld /tank/.zfs/snapshot/replication-20120203044001
Code:
drwxr-xr-x  13 root  wheel  13 Jan 12 12:47 /tank/.zfs/snapshot/replication-20120203044001

I also ran # find . -maxdepth 2, which didn't hang.

I wish it would hang... but of course, nothing ever hangs on the backup server... only on the most important one. (Must be some variation on Murphy's Law)
 
peetaur said:
I suggested using maxdepth so you would scan inside all snapshots instead of just the first before you hit CTRL+c.
I left it running until it finished. Maxdepth didn't do much, but that's probably because I don't have a lot of snapshots.

And unfortunately, this bug is hard to reproduce... I only have the one server where it happens.
The trickiest bugs are the ones that can't easily be reproduced ;)

I don't have any other ideas, except perhaps posting your problem to the freebsd-fs mailing list. Perhaps one of the developers can help out.
 
I just tested it with a FreeBSD NFS client, and it all looks correct. So I guess only the Linux client triggers this strange behavior. But the errors in /var/log/messages clearly show that the server is messed up, not (only?) the client.
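For reference, the FreeBSD-side test was nothing special, roughly this (mount point made up):
Code:
# mount -t nfs -o ro bcnas1:/tank/dataset /mnt
# ls -l /mnt/.zfs/snapshot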
 
Last week I updated to the csup'd version I downloaded and built on Feb 4th, but it shows the same issues (other than the unreproducible hang).

However, it does seem to fix another hang I had (which I posted in PR #161968). :) (This is why I decided to upgrade the running system now.)
 