[Solved] Strategy to replace failed disk in RAIDZ2 array

In my RAIDZ2 array a disk has been marked as faulted because of lots of bad sector messages. I've contacted the shop where I bought the disk and it can be swapped under warranty. So, the question is how to proceed. The pool status is:

Code:
# zpool status vault
  pool: vault
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 21h44m with 0 errors on Fri Mar 28 14:11:22 2014
config:

        NAME            STATE     READ WRITE CKSUM
        vault           DEGRADED     0     0     0
          raidz2-0      DEGRADED     0     0     0
            gpt/WD3T01  ONLINE       0     0     0
            gpt/WD3T02  ONLINE       0     0     0
            gpt/WD3T03  ONLINE       0     0     0
            gpt/WD3T04  ONLINE       0     0     0
            gpt/WD3T05  FAULTED      0   394     0  too many errors
            gpt/WD3T06  ONLINE       0     0     0

errors: No known data errors

The steps I want to take are:

  1. Remove the faulted disk from the array:
    Code:
    # zpool detach vault gpt/WD3T05
  2. Determine which disk I need to replace using the serial number from camcontrol identify (see the sketch after this list), remove the disk, pick up the replacement at the store, and put the new disk in. Also add a label to each disk for easier identification next time.
  3. Prepare the new disk:
    Code:
    # gpart create -s gpt <device>
    # gpart add -l WD3T07 -t freebsd-zfs <device>
    # gnop create -S 4096 /dev/gpt/WD3T07
  4. Add the new partition to the array:
    Code:
    # zpool add vault /dev/gpt/WD3T07.nop
    # gnop destroy /dev/gpt/WD3T07.nop
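
For step 2, something along these lines should map the faulted GPT label to its physical device and serial number (ada6 is assumed here, matching the gpart list output further down):
Code:
# glabel status | grep WD3T05
# camcontrol identify ada6 | grep -i serial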

The only thing that worries me is the partitioning of the new disk. Currently the disk has the following layout:
Code:
# gpart list ada6
Geom name: ada6
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 5860533134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada6p1
   Mediasize: 3000592941056 (2.7T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e2
   rawuuid: 4e88bd89-058e-11e2-a165-bc5ff4458016
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: WD3T05
   length: 3000592941056
   offset: 20480
   type: freebsd-zfs
   index: 1
   end: 5860533127
   start: 40
Consumers:
1. Name: ada6
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e3

Is this the correct way to handle this? I cannot do a
Code:
# zpool replace vault gpt/WD3T05 gpt/WD3T07
as I need to exchange the faulted disk and I don't have a spare.
 
Re: Strategy to replace failed disk in RAIDZ2 array

Hi @KdeBruin!

You cannot "detach" a drive from a raidz vdev, only replace.

The "ashift" value is only set when creating the vdev, so you don´t need to create another NOP device for a replace. You can check if your has the correct ashift value by doing:
# zdb | grep ashift

It should be 12. If it's 9, there is nothing you can do to change that, short of backing up the entire pool and starting over from scratch. Sorry.
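
For what it's worth, on a pool built for 4K-sector disks that grep should print one line per vdev looking roughly like:
Code:
            ashift: 12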

One thing you can do however, is take care of proper alignment when partitioning the disk:
Code:
# gpart create -s gpt <device>
# gpart add -l WD3T07 -t freebsd-zfs -b 2048 <device>
# zpool replace vault gpt/WD3T05 gpt/WD3T07
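
If you want to verify the alignment afterwards, the new partition should start at sector 2048 (a multiple of 8, so 4 KiB-aligned on 512-byte sectors); something like this will show it (ada6 assumed as the new device):
Code:
# gpart show ada6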

If you were to "zpool add" in the new disk, you would be sorry for the rest of your life, trust me.

/Sebulon
 
Re: Strategy to replace failed disk in RAIDZ2 array

The alignment of the pool is 4K and the zdb command returns value 12 for the ashift so that should be covered.

But can I do the zfs replace when the initial disk is missing? That was something I couldn't really find in the ZFS documentation.
 
Re: Strategy to replace failed disk in RAIDZ2 array

Hi @KdeBruin!

Yes, it'll work. ZFS doesn't care that the original disk is gone; it rebuilds the replacement from the parity data on the other drives. Having to keep the original drive around in order to replace it after it has failed would be a really bad design :)

/Sebulon
 
Re: Strategy to replace failed disk in RAIDZ2 array

Thanks! I just have to wait until the replacement drive has arrived at the shop (probably today or tomorrow) and have a go at it.
 
Re: Strategy to replace failed disk in RAIDZ2 array

And we're "almost" back in business:

Code:
# zpool status vault
  pool: vault
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Apr  3 19:40:38 2014
        1.49G scanned out of 14.1T at 33.1M/s, 124h29m to go
        252M resilvered, 0.01% done
config:

	NAME                        STATE     READ WRITE CKSUM
	vault                       DEGRADED     0     0     0
	  raidz2-0                  DEGRADED     0     0     0
	    gpt/WD3T01              ONLINE       0     0     0
	    gpt/WD3T02              ONLINE       0     0     0
	    gpt/WD3T03              ONLINE       0     0     0
	    gpt/WD3T04              ONLINE       0     0     0
	    replacing-4             UNAVAIL      0     0     0
	      10130644432126285557  UNAVAIL      0     0     0  was /dev/gpt/WD3T05
	      gpt/WD3T07            ONLINE       0     0     0  (resilvering)
	    gpt/WD3T06              ONLINE       0     0     0

errors: No known data errors
 
Re: Strategy to replace failed disk in RAIDZ2 array

The "correct" order is:
  • zpool offline <poolname> <diskname>
  • Physically remove the drive from the system
  • Physically add the new drive to the system
  • Partition the drive as needed; label the partitions as needed
  • zpool replace <poolname> <olddiskname> <newdiskname>
The nice thing about using "zpool offline" is that if something goes wrong with the new disk, you can re-add the old one, "zpool online" it, "zpool detach" the failed new drive, and the pool will be back in the state it was in before you started. That process has actually saved my bacon 3 times now. Before, I would just physically replace the drive, then do the "zpool replace", and when things went sideways ... I'd just cry while rebuilding the pool from scratch. :(
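
With the pool and labels from this thread (and assuming the new drive again shows up as ada6 and gets the label WD3T07), that sequence boils down to roughly:
Code:
# zpool offline vault gpt/WD3T05
  <physically swap the drive>
# gpart create -s gpt ada6
# gpart add -l WD3T07 -t freebsd-zfs -b 2048 ada6
# zpool replace vault gpt/WD3T05 gpt/WD3T07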
 
Re: Strategy to replace failed disk in RAIDZ2 array

Oh @phoenix, why did you have to jinx it :(

After an hour or so the new disk was faulted with write errors (the resilver is still running, though). The pool status is now:

Code:
# zpool status vault
  pool: vault
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Apr  3 19:40:38 2014
        10.1T scanned out of 14.1T at 176M/s, 6h44m to go
        173G resilvered, 71.17% done
config:

        NAME                        STATE     READ WRITE CKSUM
        vault                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            gpt/WD3T01              ONLINE       0     0     0
            gpt/WD3T02              ONLINE       0     0     0
            gpt/WD3T03              ONLINE       0     0     0
            gpt/WD3T04              ONLINE       0     0     0
            replacing-4             UNAVAIL      0     0     0
              10130644432126285557  UNAVAIL      0     0     0  was /dev/gpt/WD3T05
              gpt/WD3T07            FAULTED      0   167     0  too many errors  (resilvering)
            gpt/WD3T06              ONLINE       0     0     0

errors: No known data errors

The errors I get are:
Code:
# dmesg | grep ada6
(ada6:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich7:0:0:0): Retrying command
(ada6:ahcich7:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 10 40 76 40 1a 00 00 00 00 00
(ada6:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich7:0:0:0): Retrying command
(ada6:ahcich7:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 80 50 40 76 40 1a 00 00 00 00 00
(ada6:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich7:0:0:0): Retrying command
(ada6:ahcich7:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 70 54 31 40 18 00 00 00 00 00
(ada6:ahcich7:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich7:0:0:0): Retrying command
(ada6:ahcich7:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 d0 40 76 40 1a 00 00 01 00 00
...

I searched around a bit and will try replacing the SATA cable, as that is a common cause of these errors. After that, what can I do to get the resilvering to start again?
 
Re: Strategy to replace failed disk in RAIDZ2 array

KdeBruin said:
After that, what can I do to have the resilvering start again?
It should start again automatically when the pool is loaded.
 
Re: Strategy to replace failed disk in RAIDZ2 array

You should be able to "zpool online" the FAULTED disk after fixing the cable issue, and that will restart the resilver.
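
With the names used here that would be something like the following (a zpool clear on the device may also be needed if it stays FAULTED after the cable swap):
Code:
# zpool online vault gpt/WD3T07
# zpool clear vault gpt/WD3T07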
 
Re: Strategy to replace failed disk in RAIDZ2 array

I've replaced the cable and after some time I have a healthy pool again:

Code:
# zpool status vault
  pool: vault
 state: ONLINE
  scan: resilvered 2.19T in 12h57m with 0 errors on Sat Apr  5 22:25:11 2014
config:

	NAME            STATE     READ WRITE CKSUM
	vault           ONLINE       0     0     0
	  raidz2-0      ONLINE       0     0     0
	    gpt/WD3T01  ONLINE       0     0     0
	    gpt/WD3T02  ONLINE       0     0     0
	    gpt/WD3T03  ONLINE       0     0     0
	    gpt/WD3T04  ONLINE       0     0     0
	    gpt/WD3T07  ONLINE       0     0     0
	    gpt/WD3T06  ONLINE       0     0     0

errors: No known data errors
 