ZFS ZPool UNAVAIL but datasets still mounted

Hi,

I started a large rsync from one FreeBSD system to another. The target system is:

Code:
# uname -a
FreeBSD nas-02.thismonkey.com 10.1-RELEASE-p5 FreeBSD 10.1-RELEASE-p5 #0: Tue Jan 27 08:55:07 UTC 2015     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

Much later into the rsync, I checked the status of the pool and saw this:

Code:
# zpool status
  pool: ZP-BACKUP-01
 state: UNAVAIL
status: One or more devices are faulted in response to persistent errors.  There are insufficient replicas for the pool to
        continue functioning.
action: Destroy and re-create the pool from a backup source.  Manually marking the device
        repaired using 'zpool clear' may allow some data to be recovered.
  scan: resilvered 0 in 307445734561825860h14m with 0 errors on Fri Dec 18 21:45:34 2015
config:

        NAME                                          STATE     READ WRITE CKSUM
        ZP-BACKUP-01                                  UNAVAIL      0    79     0
          gptid/b68b3b56-291f-11e4-9cce-6cf04955bd94  ONLINE       0     0     0
          gptid/aeebe4da-2cee-11e4-9b29-6cf04955bd94  ONLINE       0     0     0
          gptid/4c8e865d-2a0b-11e4-acd6-3cd92b042b6f  ONLINE       0     0     0
          gptid/b08bd044-2cee-11e4-9b29-6cf04955bd94  ONLINE       0     0     0
          gptid/074b373a-2a55-11e4-963e-3cd92b042b6f  ONLINE       0     0     0  block size: 512B configured, 4096B native
          gptid/b253ae47-2cee-11e4-9b29-6cf04955bd94  FAULTED      0   159     0  too many errors

As you can see, this pool is a simple JBOD-style pool with no redundancy (it's a backup server, which is why I also don't care about the non-native block size warning).
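For reference, a pool like this is just a plain stripe of single-disk vdevs, so a single failed disk takes out the whole pool. A minimal sketch of how such a pool might have been created (the da0..da5 device names are placeholders, not the actual disks behind the gptids above):

Code:
# Sketch only: a non-redundant striped pool of six single-disk vdevs.
# Device names are hypothetical; losing any one of them fails the pool.
zpool create ZP-BACKUP-01 da0 da1 da2 da3 da4 da5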

What I didn't expect is that the datasets have remained mounted and the rsync is still happily running.

Code:
# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
ZP-BACKUP-01                 6.93T  2.72T   113K  /ZP-BACKUP-01
ZP-BACKUP-01/backup-01       6.93T  2.72T  6.93T  /z/backup-01

A few minutes later...

Code:
# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
ZP-BACKUP-01                 7.03T  2.63T   113K  /ZP-BACKUP-01
ZP-BACKUP-01/backup-01       7.03T  2.63T  7.03T  /z/backup-01
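(For what it's worth, I was just re-running the status commands by hand; a simple loop like the sketch below would watch the same thing while the rsync runs.)

Code:
# Poll pool health and dataset usage once a minute.
while true; do
    date
    zpool status -x ZP-BACKUP-01
    zfs list -r ZP-BACKUP-01
    sleep 60
done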

What I'm asking is: is this expected behaviour? I would have thought not.
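If it's relevant, I believe the pool's failmode property (which I've left at its default of wait) is what governs how I/O behaves when a pool becomes unavailable; it can be checked like this:

Code:
# Show how the pool is set to behave on catastrophic failure
# (possible values are wait, continue and panic; wait is the default).
zpool get failmode ZP-BACKUP-01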

Thanks.
Scott
ps. I apologise for using CODE tags. I tried CMD tags but they collapse white space.
 
Odd, I've never seen that before, so I'm not sure what to suggest. Someone else may have an idea.

Your post formatting is correct so no apologies needed. Thanks!
 
Hi,

Unfortunately, after a few hours the datasets did indeed disappear, after which I rebooted the machine.

Just prior to rebooting, running zpool status (without -v) showed the same output as above; however, zfs list returned nothing. dmesg showed nothing of interest, nor was there anything in /var/log/messages.
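For reference, mapping the faulted gptid back to a physical disk and searching the logs can be done roughly like this (as mentioned, the logs had nothing of interest):

Code:
# Map the faulted GPT label (gptid/b253ae47-...) back to its provider,
# then look for device errors in the system log.
glabel status | grep b253ae47
grep -iE 'error|timeout' /var/log/messages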

Upon reboot I got a clean pool:

Code:
# zpool status -v
  pool: ZP-BACKUP-01
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
        pool.
  scan: resilvered 0 in 0h10m with 0 errors on Fri Dec 25 01:23:24 2015
config:

        NAME                                          STATE     READ WRITE CKSUM
        ZP-BACKUP-01                                  ONLINE       0     0     0
          gptid/b68b3b56-291f-11e4-9cce-6cf04955bd94  ONLINE       0     0     0
          gptid/aeebe4da-2cee-11e4-9b29-6cf04955bd94  ONLINE       0     0     0
          gptid/4c8e865d-2a0b-11e4-acd6-3cd92b042b6f  ONLINE       0     0     0
          gptid/b08bd044-2cee-11e4-9b29-6cf04955bd94  ONLINE       0     0     0
          gptid/074b373a-2a55-11e4-963e-3cd92b042b6f  ONLINE       0     0     0  block size: 512B configured, 4096B native
          gptid/b253ae47-2cee-11e4-9b29-6cf04955bd94  ONLINE       0     0     0

errors: No known data errors

I've restarted the rsync.

I understand the pool is probably corrupt and will run a scrub in a couple of days. (As mentioned, it's a backup server, so I'm not overly concerned about integrity.)
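For the record, the scrub will just be the standard commands, along the lines of:

Code:
# Start a scrub of the whole pool, then check on its progress.
zpool scrub ZP-BACKUP-01
zpool status ZP-BACKUP-01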

If the problem recurs, I will post again.

Thanks,
Scott
 