Replacing-0: three disk labels of the same disk

Hi folks,

I have a RAIDZ pool in which a drive failed. I replaced the failed drive and ran zpool replace, and shortly thereafter the new drive failed as well (talk about bad luck!).
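(For reference, the replace was run along these lines; the disk names below are just placeholders, not the exact ones from my system:)

Code:
# replace the failed member of pool "z" with the new disk (placeholder device names)
zpool replace z ada0 /dev/disk/by-id/ata-ST2000DM001-XXXXXXX_XXXXXXXX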

I then replaced the drive again, reran zpool replace, and I am now at this stage:

Note in particular:
Code:
	    replacing-0                        DEGRADED     0     0     0
	      8965941623156308046              OFFLINE      0     0     0  was /dev/ada0/old
	      ada3s1                           OFFLINE      0     0     0
	      ata-ST2000DM001-9YN164_W2F0M0RS  ONLINE       0     0     0  (resilvering)

My question is: after the drive finishes resilvering, it is listed as ONLINE but still under the replacing-0 subheading. How do I fix this up?

Code:
~# zpool status z
  pool: z
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Oct  7 14:49:45 2013
    3.40T scanned out of 5.30T at 41.3M/s, 13h25m to go
    1.13T resilvered, 64.08% done
config:

	NAME                                   STATE     READ WRITE CKSUM
	z                                      DEGRADED     0     0 33.4K
	  raidz1-0                             DEGRADED     0     0 84.6K
	    replacing-0                        DEGRADED     0     0     0
	      8965941623156308046              OFFLINE      0     0     0  was /dev/ada0/old
	      ada3s1                           OFFLINE      0     0     0
	      ata-ST2000DM001-9YN164_W2F0M0RS  ONLINE       0     0     0  (resilvering)
	    ata-ST2000DM001-9YN164_Z1E0Q9Z0    ONLINE       0     0     0  (resilvering)
	    ata-ST2000DM001-9YN164_S240C17E    ONLINE       0     0     0
 
Wait for the resilvering to complete, which it hopefully will...
If you still have OFFLINE devices left around afterwards, the first thing to try would be to detach them. (A replacing entry functions very similarly to a ZFS mirror.)

Code:
zpool detach z 8965941623156308046
zpool detach z ada3s1
 
Thanks @usdmatt, I tried your suggestion and had the following results:

Code:
root@server:~# zpool detach z 8965941623156308046
cannot detach 8965941623156308046: no valid replicas
root@server:~# zpool detach z ada3s1
cannot detach ada3s1: no such device in pool
root@server:~#
 
@ryanjohnson

Would you give us the output of:
# uname -a

please?

/Sebulon
 
Hi Sebulon,

Output from that command was:

Code:
Linux gorda 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 
Hmm, that's an interesting version of FreeBSD... ;)

The 'no such device' message suggests that device no longer exists in the pool. Did that device disappear (as it should) when the resilver finished? What does the pool status look like now?

The 'no valid replicas' message is common on older releases (probably why @Sebulon asked for the uname), but as you're not actually on FreeBSD it's impossible for me to know if your ZFS codebase contains these issues. I would suggest making sure you are running the latest OpenZFS release available for your OS, and if you are (or update and still can't detach the OFFLINE device), see if there's a mailing list or forum more specific to ZFS on your platform.
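For ZFS on Linux (the ZoL kernel module, as opposed to ZFS-FUSE), something along these lines should show the version you're actually running; exact commands may vary by distribution:

Code:
# version of the currently loaded ZFS kernel module
cat /sys/module/zfs/version
# or, if the module isn't loaded yet:
modinfo zfs | grep -iw version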
 
usdmatt said:
Hmm, that's an interesting version of FreeBSD... ;)
Very interesting indeed! And here I've thought all along that the name of this forum clearly had FreeBSD in it, but I guess I was wrong. :)

@ryanjohnson
That was... unexpected, to say the least. And you never thought that asking for ZFS help on a FreeBSD forum while using Linux was maybe worth mentioning from the start? You need to understand that ZFS on FreeBSD and ZFS on Linux are at different versions. Heck, even ZFS on Linux and ZFS on Linux can differ, depending on whether you are doing ZFS through FUSE or running ZoL (ZFS on Linux).
I can tell you this much though: in FreeBSD there have been problems with resilvered disks just not disappearing afterwards, but staying around as those "ghosts" that you are seeing. This was fixed somewhere between 9.1-RELEASE and 9.1-RELEASE-p7. Typically, what you had to do was scrub over and over until those ghosts were exorcised. ;) There is no way to detach a disk from a raidz.
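Something along these lines, with the pool name taken from your status output, repeated until the ghosts disappear:

Code:
# start a scrub, then check whether the stale entries under replacing-0 are gone
zpool scrub z
zpool status z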

/Sebulon
 
Sebulon said:
There is no way to detach a disk from a raidz.

No, you can't detach a disk directly from a raidz. However, a replacing 'sub-vdev' works just like a mirror. When I ran into this problem of the replacing entry not disappearing, back while testing ZFS around the 8.0-8.1 days, the workable solution I found was to detach the replaced drive.

The usage message returned by zpool itself also confirms that detach applies to replacing vdevs:

Code:
# zpool status test
  pool: test
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0     0
            md2     ONLINE       0     0     0

errors: No known data errors
# zpool detach test md0
cannot detach md0: only applicable to mirror and replacing vdevs
 
usdmatt said:
mirror and replacing vdevs
That does spring to mind now that you mention it, thank you for the explanation. But can we at least say that it should work, instead of saying that it definitely works like that? Because, as previously noted, detaching a "ghost" just doesn't always work in practice. I've tried on several occasions to detach a ghost where ZFS just hasn't felt like doing so: either it replied that there wasn't a disk like that in the pool (of course, it's been replaced), or that it can't because there aren't any valid replicas.

/Sebulon
 
I think I experienced the problem of being unable to detach the faulted drive in a RAIDZ2 once. IIRC, using # zpool remove pool device worked. Still, you shouldn't try to detach/remove until the replacement drive has finished resilvering.
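If you do try that once the resilver is done, the form would be something like the following, substituting your pool name and the stale GUID shown in zpool status (no promises it behaves any differently from detach here):

Code:
# attempt to remove the stale device by the GUID shown in zpool status
zpool remove z 8965941623156308046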
 