resetting a replica and exports file errors

abatie · Feb 22, 2014

I am building several ZFS file servers on FreeBSD 10, with the eventual goal of getting two primary/replica pairs working on some used Dell 2950s with SAS/6 controllers. I have one pair operational and was working on the third server. The main question I'm asking is: what is causing the "bad exports" and related errors, and how do I fix them?

I start a replica with these commands:

Code:

zfs snapshot -r $volume@repl-init.$$
zfs send -R $volume@repl-init.$$ | ssh $target zfs receive -F $destvol

Then, every 5 minutes, I run the mirror script (inside a locking wrapper, "just in case" a replication run takes too long; runs typically take about 20 seconds, occasionally up to a couple of minutes):

Code:

touch /vol/.mirror-stamp
zfs destroy -r vol@repl.1
zfs rename -r vol@repl vol@repl.1
zfs snapshot -r vol@repl
zfs send -R -i repl.1 vol@repl | ssh nas03a zfs receive -F vol

These scripts are still a work in progress. The mirror script hasn't been parameterized yet, and neither had the initialize script, which is what led to the problem: I initialized the third server as another replica and, in doing so, rotated the repl snapshot, so the incremental send to the first replica could no longer be done. I grumbled at myself and re-initialized the first replica, but now whenever I run the replication I get these errors on the replica server:

Code:

Feb 22 12:15:21 nas03a mountd[1059]: can't delete exports for /vol/data: Invalid argument 
Feb 22 12:15:21 nas03a mountd[1059]: can't delete exports for /vol/vm-images: Invalid argument 
Feb 22 12:15:21 nas03a mountd[1059]: bad exports list line /vol/data/home
... repeat for all the filesystems in vol/data
Feb 22 12:15:21 nas03a mountd[1059]: can't change attributes for /vol/vm-images: Invalid radix node head, rn: 0 0xfffff8014d779e00
Feb 22 12:15:21 nas03a mountd[1059]: bad exports list line /vol/vm-images       -network 10.1.1.0/24 -maproot
Feb 22 12:15:21 nas03a mountd[1059]: bad exports list line /vol/vm-images/mail.batie.org
Feb 22 12:15:21 nas03a mountd[1059]: bad exports list line /vol/vm-images/ns1
... repeat for the filesystems in vol/vm-images

/etc/exports is a zero-length file (it's not used, but the system kept complaining if it wasn't there).
/etc/zfs/exports contains a line like this for every filesystem on vol:

Code:

/vol/data	-network 10.1.1.0/24 -maproot=0 
/vol/data/home	-network 10.1.1.0/24 -maproot=0 
...

It hasn't changed since I first set the system up, and replication had been working great for several days before I brought up the second replica while working out some hardware issues (in fact, the intent of the second replica is to exercise that hardware to figure out what is causing a drive fault)...
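
For what it's worth, here is a rough sketch of how both scripts could be parameterized so that each replica gets its own snapshot pair, which would keep the initialization of one replica from rotating a snapshot that another replica still needs. The `repl-<target>` naming scheme, the function names, and the `DRY_RUN` switch are all assumptions made for illustration; they are not what the scripts above actually do.

```shell
#!/bin/sh
# Sketch only. The snapshot naming scheme (repl-<target>), the function
# names, and DRY_RUN are illustrative assumptions, not the original scripts.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pipe "zfs send" to the target, or print the pipeline when DRY_RUN=1.
# Usage: send_to TARGET DESTVOL SEND-ARGS...
send_to() {
    target=$1 destvol=$2
    shift 2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "zfs send $* | ssh $target zfs receive -F $destvol"
    else
        zfs send "$@" | ssh "$target" zfs receive -F "$destvol"
    fi
}

# Full initial copy, using a snapshot name that belongs to this target only.
# Usage: init_replica VOLUME TARGET DESTVOL
init_replica() {
    volume=$1 target=$2 destvol=$3
    run zfs snapshot -r "$volume@repl-$target" || return 1
    send_to "$target" "$destvol" -R "$volume@repl-$target"
}

# One incremental pass: rotate this target's snapshot pair and send the delta.
# Usage: mirror_once VOLUME TARGET DESTVOL
mirror_once() {
    volume=$1 target=$2 destvol=$3
    cur="repl-$target"
    prev="repl-$target.1"
    # As in the original script, the destroy is allowed to fail
    # (for example on the first pass after initialization).
    run zfs destroy -r "$volume@$prev"
    run zfs rename -r "$volume@$cur" "$volume@$prev" || return 1
    run zfs snapshot -r "$volume@$cur" || return 1
    send_to "$target" "$destvol" -R -i "$prev" "$volume@$cur"
}
```

With something like this, `mirror_once vol nas03a vol` would rotate only `vol@repl-nas03a` and `vol@repl-nas03a.1`, so running `init_replica` for a third server would leave those snapshots alone. Setting `DRY_RUN=1` prints the commands instead of running them.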

abatie · Mar 1, 2014

A follow-up: after a while I noticed something else odd. Although "zfs list" and "df" showed all the filesystems as fine, and the timestamp file I keep in the root of the pool was being updated properly by the mirroring, any attempt to access something in one of the filesystems in the pool got "file not found" errors. Somehow the mounting was all bollixed up, so I rebooted the machine, and everything is working normally again. It doesn't give me warm fuzzies about the stability of ZFS here, but since I can't afford a NetApp, I'll have to make do, and it seems to work mostly...
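
Since "zfs list" and "df" both looked healthy, a check that actually looks inside each mountpoint might flag this kind of state sooner. The sketch below is only a heuristic and an assumption on my part: it flags any filesystem that reports itself as not mounted, or whose mountpoint cannot be listed or is empty. A legitimately empty filesystem would be flagged too, and there is no guarantee it would have caught the exact failure described here. The function name `check_mounts` is hypothetical.

```shell
#!/bin/sh
# Sketch only: a heuristic sanity check, not a known fix for the problem.
# Reads "name<TAB>mountpoint<TAB>mounted" lines on stdin, as produced by:
#   zfs list -H -o name,mountpoint,mounted -r vol
check_mounts() {
    tab=$(printf '\t')
    bad=0
    while IFS=$tab read -r name mountpoint mounted; do
        # Skip datasets without a real mountpoint ("none", "legacy", "-").
        case $mountpoint in
            /*) ;;
            *) continue ;;
        esac
        # Suspect if not mounted, or if the mountpoint lists as empty
        # (or cannot be listed at all).
        if [ "$mounted" != yes ] || [ -z "$(ls -A "$mountpoint" 2>/dev/null)" ]; then
            echo "SUSPECT: $name ($mountpoint) mounted=$mounted"
            bad=1
        fi
    done
    return $bad
}
```

It could be run on the replica after each pass, for example as `zfs list -H -o name,mountpoint,mounted -r vol | check_mounts`, with a non-zero exit status meaning at least one filesystem looked suspect.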
