ZFS: errors: Permanent errors have been detected

Post by overmind » 23 May 2011, 23:41

Hi,

I've been playing with ZFS lately, building a raidz pool from three USB sticks. On three 8 GB USB sticks I created 4 GB partitions with gpart and added those partitions to a ZFS pool. Then I tried removing one stick and replacing it with a new USB stick partitioned at the full 8 GB. The idea is to simulate upgrading hard drives. After the first stick was replaced and everything had resilvered, I replaced the second one. When the pool resilvered for the second drive I got an error:

Code:
errors: Permanent errors have been detected in the following files:
        /tank/FreeBSD-8.2-RELEASE-amd64-memstick.img


That was a file I had copied to the pool to test its speed, before swapping any sticks. Here is the status of my pool:

Code:
zpool status -v
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 0h4m with 1 errors on Tue May 24 02:34:03 2011
config:

        NAME             STATE     READ WRITE CKSUM
        tank             DEGRADED     0     0     2
          raidz1         DEGRADED     0     0     4
            replacing    DEGRADED     0     0     0
              da1p1/old  REMOVED      0     0     0
              da1p1      ONLINE       0     0     0  1.05G resilvered
            da0p1        ONLINE       0     0     0
            da3p1        ONLINE       0     0     0  21.5K resilvered

errors: Permanent errors have been detected in the following files:

        /tank/FreeBSD-8.2-RELEASE-amd64-memstick.img


Code:
zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank  10.5G  3.15G  7.35G    29%  DEGRADED  -


So my question is: is it safe to upgrade the drives in a raidz pool? And another weird thing: I've destroyed my pool, and now when I try to create it again I get this:

Code:
zpool create tank2 raidz da0p1 da1p1 da2p1
invalid vdev specification
use '-f' to override the following errors:
/dev/da1p1 is part of potentially active pool 'tank'


But I no longer have a pool named tank:
Code:
zpool list
no pools available

Post by usdmatt » 24 May 2011, 11:20

Yes, it should usually be safe to expand a raidz vdev. It looks like you've had checksum errors reading data from the remaining disks during the resilver (rebuild) which has meant you've lost that data. The only option would be to remove the offending files, along with any snapshots that reference them, possibly followed by a scrub.

I'm never quite sure why zpool status shows checksum errors in the vdev, but none on the disks. I can only guess that data is read from the stripe on the disks, and then checksummed, meaning that a checksum error can not be mapped to a specific disk, only to the raidz vdev.
*That's only a guess though, and it would mean that you'll never know which disk has the problem unless you start getting physical read/write errors, which seems a bit of a flaw.*
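
Editor's sketch: to see at a glance which rows of the status table carry non-zero counters, the output can be filtered with awk. This is a minimal sketch, not part of the original post; the function name nonzero_errors is made up for illustration, and it assumes the column layout (NAME STATE READ WRITE CKSUM) shown in the status output earlier in the thread.

```shell
#!/bin/sh
# Print every row of the "config:" table in `zpool status` output whose
# READ, WRITE or CKSUM counter is non-zero.
# Reads the status text on stdin, so it can also be tried on saved output.
# nonzero_errors is a hypothetical helper name, not a zpool subcommand.
nonzero_errors() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The first blank line after the header ends the table.
        in_table && NF == 0 { in_table = 0 }
        # Report rows where any of the three counters is above zero.
        in_table && NF >= 5 && ($3 + $4 + $5) > 0 {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}

# Typical use on a live system (pool name "tank" taken from the thread):
#   zpool status tank | nonzero_errors
```

Run against the status output quoted in this thread, it would list only the tank and raidz1 rows, which matches the observation that the errors are counted on the vdev and not on an individual disk.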

If you are running FreeBSD with the v28 version of ZFS and have the zpool autoexpand property set to on, the pool should increase automatically as soon as the last resilver is done. Otherwise you have to export & import the pool to see the new size.
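
Editor's sketch: the two ways of picking up the new size described above, written as a small dry-run helper that only prints the commands. The function name expand_plan and its second argument are made up for illustration; the pool name tank comes from the thread. Whether autoexpand is available depends on the ZFS version, as noted above, so treat this as a sketch under that assumption.

```shell
#!/bin/sh
# Print (do not run) the commands for making a pool show its larger size.
# $1 = pool name
# $2 = "yes" if the pool version supports the autoexpand property
# expand_plan is a hypothetical helper, not part of ZFS.
expand_plan() {
    pool=$1
    if [ "$2" = "yes" ]; then
        # Set this before the last device is replaced.
        echo "zpool set autoexpand=on $pool"
    else
        # Older pools: re-import after the last resilver has finished.
        echo "zpool export $pool"
        echo "zpool import $pool"
    fi
    # Either way, check the SIZE column afterwards.
    echo "zpool list $pool"
}

# Example: expand_plan tank no
```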

As for that create error, it looks like da1p1 wasn't cleared when the pool was destroyed. That may have something to do with it still being in the replacing state shown in the status output.

zpool list only shows active, imported pools, so you won't see it there. You may see it if you run zpool import, which scans all disks for ZFS pool info and displays their status.

You can wipe the device manually, or just use the -f option to force ZFS to use the disk.

Post by overmind » 28 May 2011, 12:36

I've purchased other USB sticks and repeated the process, and now it works (I am able to "upgrade" the ZFS pool to a bigger size).

The weird thing is that I did not get any errors from the older sticks. I think one of them was a little slow (well, those "old" ones were in fact not very old).

The error appeared every time I tried to resize the pool.
And by every time I mean after reformatting the drives and re-creating the pool.

So it was a hardware problem for me. If something goes wrong while replacing the drives in the pool with bigger ones, data is lost.

This was useful for me: http://kerneltrap.org/mailarchive/freebsd-fs/2007/9/24/298289

If you do tests with ZFS on USB sticks and you want to clean a drive so you can add it to the pool as if it were a new drive, run dd over more than just the first blocks. I think ZFS metadata is stored at the end of the drive too.
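
Editor's sketch: ZFS keeps copies of its label at both the start and the end of a device, so a wipe has to cover both ends. The helper below only prints dd commands and does not run them. The function name wipe_plan is made up for illustration, da1p1 is just the device from this thread, and zeroing 1 MiB at each end is a deliberately generous assumption that covers the 256 KiB label copies.

```shell
#!/bin/sh
# Print (do not run) dd commands that zero the first and last 1 MiB of a
# device, which covers the ZFS label copies kept at both ends.
# $1 = device name under /dev (example from the thread: da1p1)
# $2 = device size in bytes
# wipe_plan is a hypothetical helper name.
wipe_plan() {
    dev=$1
    size=$2
    bs=262144                  # 256 KiB, the size of one ZFS label copy
    blocks=$((size / bs))      # whole 256 KiB blocks on the device
    if [ "$blocks" -lt 8 ]; then
        echo "device too small" >&2
        return 1
    fi
    # Front of the device: blocks 0..3.
    echo "dd if=/dev/zero of=/dev/$dev bs=$bs count=4"
    # End of the device: the last four whole blocks.
    echo "dd if=/dev/zero of=/dev/$dev bs=$bs seek=$((blocks - 4)) count=4"
}

# Example for a 4 GiB partition: wipe_plan da1p1 4294967296
```

On FreeBSD the size in bytes should be obtainable from diskinfo (assumed here to be the third field of its output). The printed commands are destructive, so check the device name twice before running them.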

Post by carlton_draught » 29 May 2011, 00:35

overmind wrote:
Code:
zpool status -v
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 0h4m with 1 errors on Tue May 24 02:34:03 2011
config:

        NAME             STATE     READ WRITE CKSUM
        tank             DEGRADED     0     0     2
          raidz1         DEGRADED     0     0     4
            replacing    DEGRADED     0     0     0
              da1p1/old  REMOVED      0     0     0
              da1p1      ONLINE       0     0     0  1.05G resilvered
            da0p1        ONLINE       0     0     0
            da3p1        ONLINE       0     0     0  21.5K resilvered

errors: Permanent errors have been detected in the following files:

        /tank/FreeBSD-8.2-RELEASE-amd64-memstick.img


Did you remove da1p1 (the old one) before the pool finished resilvering? From what I see it looks like that is what you did.

However, I don't use RAIDZ of any sort any more. I don't see the benefits outweighing the costs/problems in most applications.
sysutils/zxfer - transfer everything on ZFS easily and reliably. www.zxfer.org

Post by overmind » 29 May 2011, 11:12

I waited for the pool to finish resilvering every time I changed a drive, and I repeated the process a few times. Every time I lost files. After I switched to a new set of USB sticks the problem was solved. So it was a hardware (USB) problem.

Well, I think raidz is OK for surviving a single device failure while still getting the maximum pool size. What would you choose instead of raidz?

Post by carlton_draught » 29 May 2011, 22:55

overmind wrote: I waited for the pool to finish resilvering every time I changed a drive, and I repeated the process a few times. Every time I lost files. After I switched to a new set of USB sticks the problem was solved. So it was a hardware (USB) problem.

Ok. Did you scrub before you replaced the (USB) drives? Maybe there was a latent error waiting to be discovered that would be uncovered after a scrub. ZFS won't catch an error unless the file on which the error lies is read, or a scrub is performed.
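
Editor's sketch: the "scrub first, then replace" check can be automated by looking at the status text before starting a replacement. This is only a sketch; the function name ready_for_replace is made up, and it assumes zpool status prints "errors: No known data errors" for a clean pool and the words "in progress" while a scrub or resilver is running.

```shell
#!/bin/sh
# Decide from `zpool status` text (read on stdin) whether it looks safe to
# start replacing a device: nothing running and no known data errors.
# ready_for_replace is a hypothetical helper name.
ready_for_replace() {
    status=$(cat)
    case $status in
        *"in progress"*)
            echo "wait: scrub or resilver still running"
            return 1 ;;
    esac
    case $status in
        *"errors: No known data errors"*)
            echo "ok"
            return 0 ;;
    esac
    echo "stop: pool reports errors"
    return 1
}

# Possible use, with the pool and device names from the thread:
#   zpool scrub tank
#   (wait for the scrub to finish)
#   zpool status -v tank | ready_for_replace && zpool replace tank da1p1
```

A scrub reads every block in the pool, so a latent error on one of the remaining sticks would show up while the pool still has its full redundancy, not halfway through a resilver.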

overmind wrote: Well, I think raidz is OK for surviving a single device failure while still getting the maximum pool size. What would you choose instead of raidz?

RAIDZ tolerates the failure of only 1 device. I use mirror instead, wherever possible. And triple mirror, on HDD (you get 3x read performance instead of 2x, and you can tolerate two drives failing, meaning you can afford to relax a little more). See here and here. Drives are cheap. Your data is not.
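
Editor's sketch: the trade-off between the two layouts is simple arithmetic. The helper below is a rough illustration only; layout_summary is a made-up name, it assumes equally sized devices, handles only single-parity raidz and n-way mirrors, and ignores metadata overhead.

```shell
#!/bin/sh
# Rough usable space and number of device failures survived for a single
# vdev built from equally sized devices.
# $1 = layout (raidz1 | mirror), $2 = number of devices, $3 = size of each in GB
# layout_summary is a hypothetical helper name.
layout_summary() {
    layout=$1
    n=$2
    s=$3
    case $layout in
        raidz1)
            usable=$(( (n - 1) * s ))   # one device's worth goes to parity
            survives=1 ;;
        mirror)
            usable=$s                   # every device holds a full copy
            survives=$(( n - 1 )) ;;
        *)
            echo "unknown layout" >&2
            return 1 ;;
    esac
    echo "$layout x$n: ${usable}G usable, survives $survives failure(s)"
}

# Example with three 4 GB partitions, as in the thread:
#   layout_summary raidz1 3 4
#   layout_summary mirror 3 4
```

With three 4 GB devices this gives 8G usable and one tolerated failure for raidz1, against 4G usable and two tolerated failures for a triple mirror.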