zfs/zpool offline problems

Hi,
I'm having problems replacing a WD10EARS (hasn't failed yet) in my raidz1.

The pool seems to be fine:
Code:
# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad2     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0

errors: No known data errors

But if I now try to "zpool offline" a drive, zpool returns an error message:
Code:
# zpool offline storage ad2
cannot offline ad2: no valid replicas

Isn't this the way to replace a disk in a raidz1?

I'm using 8.0-RELEASE-p2

Please help if you can. Thanks!
 
I don't see any spare disks, that would replace it in your setup :)
Your raid will work without ad2.....

What I understand is that you want some other disk to replace ad2, right?
For this you need 1 unused disk. You need to add this disk with # zpool add storage spare ad8 for example.

Or rebuild your raid with 3 disks, and use the fourth as a spare
 
The spare disk is still on my desk.

Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.
 
That depends on how you want to do it :D
You can keep it in the PC and have it replace a faulty disk automatically, without needing a shutdown :D
If it's a server, then I would add it to the zpool as a spare.
If it's a home PC, I would keep it in a box :D
 
I did it on my pool just for fun and didn't have any problems. I'm running a raid10 zfs setup though.

It should be possible to offline 1 disk in a raidz1 pool, keep the pool running, and online the disk again after a few seconds to see what the resilvering looks like.
 
c_geier said:
The spare disk is still on my desk.

Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.

That would be the normal procedure with non-hotswap disks.
 
killasmurf86 said:
I don't see any spare disks, that would replace it in your setup :)
Your raid will work without ad2.....

What I understand is that you want some other disk to replace ad2, right?
For this you need 1 unused disk. You need to add this disk with # zpool add storage spare ad8 for example.

Or rebuild your raid with 3 disks, and use the fourth as a spare

No, no, no, no, and no. :)

You do not *NEED* a spare vdev in a pool in order to replace a drive in the pool.

The process for doing so is:
  • zpool offline poolname devicename
  • stop/detach the disk using controller methods or ata/ahci/camcontrol
  • physically remove the disk
  • insert new disk
  • label/partition if needed
  • zpool replace poolname devicename (if the device names differ: zpool replace poolname olddevicename newdevicename)

Process works beautifully on FreeBSD 7.x and 8.x using ZFSv6, ZFSv13, and ZFSv14 (those are the versions I've done this on). Works for replacing dead drives, works for replacing good drives with larger ones. IOW, it just works.

No spare vdevs required.
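
As a rough illustration of the steps above, here is a minimal dry-run sketch. It only prints the commands and executes nothing. The function name replace_plan is made up for this example, and the pool and device names (storage, ad2) are taken from the first post as placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for replacing one disk in a pool.
# Nothing is executed. replace_plan is a hypothetical helper name.
replace_plan() {
    pool=$1
    old=$2
    new=${3:-$2}    # new device name defaults to the old one
    echo "zpool offline $pool $old"
    echo "# detach the disk via the controller, swap the hardware, label/partition if needed"
    if [ "$old" = "$new" ]; then
        echo "zpool replace $pool $old"
    else
        echo "zpool replace $pool $old $new"
    fi
    echo "zpool status $pool"
}

# Same device name after the swap (the case discussed in this thread):
replace_plan storage ad2
```

Calling replace_plan storage ad2 ad8 instead prints the two-device form of zpool replace, for the case where the new disk shows up under a different name.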
 
c_geier said:
Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.

That is the best method for replacing a "good" disk.

You can also do it by powering off the box, physically replacing the drive, booting, and using "zpool replace" to replace the FAULTED drive. However, that can lead to problems where you end up with a disk that is not replaceable and a new drive that won't finish resilvering, and you run the risk of losing the entire pool. Very stressful situation. :)

Better to do it using the "zpool offline" method. :D
 
So we all agree on the right method, but why can't TS offline the disk? :P 3 out of 4 disks should be enough to keep the pool going.

@TS: Have you tried to scrub the pool first?
 
As a test, power off the machine, disconnect 1 drive, and boot. See if the pool comes up or not. If it does, then try to offline the FAULTED drive. If there are no errors, then power off, replace the drive, and boot, and see if you can do the zpool replace.

Note: do not format the old drive, as you may need to boot with it attached if things go awry with the replace. ;)
 
@Matty: No, I didn't scrub first :r, but that did the trick: I could offline the disk after scrubbing the pool first, and the pool is now resilvering. Thanks!

But strangely the scrub finished in 0h0m and did not report any errors:
Code:
# zpool status storage
  pool: storage
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Wed Aug 25 18:49:13 2010
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad2     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0

errors: No known data errors

While I don't completely understand this I won't complain since it's working now. :) Thanks everybody!
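
For anyone who wants to script the "scrub first, then offline" order that worked here, below is a small sketch that checks status text for a finished scrub. The helper name scrub_completed is made up, and the pattern only matches the "scrub: scrub completed" line format shown in this post (newer ZFS versions print a "scan:" line instead). Why the scrub made the offline succeed is not explained in this thread, so treat this as a convenience check rather than a fix.

```shell
#!/bin/sh
# Succeeds if the status text on stdin reports a completed scrub.
# Assumes the "scrub: scrub completed ..." line format from this thread.
# scrub_completed is a hypothetical helper name.
scrub_completed() {
    grep -q '^ *scrub: scrub completed'
}

# Example using the line from the output above. On a live system you
# would pipe the output of "zpool status storage" into the function.
if printf ' scrub: scrub completed after 0h0m with 0 errors on Wed Aug 25 18:49:13 2010\n' | scrub_completed; then
    echo "scrub finished, next step: zpool offline storage ad2"
fi
```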
 
phoenix said:
The process for doing so is:
  • zpool offline poolname devicename
  • stop/detach the disk using controller methods or ata/ahci/camcontrol
  • physically remove the disk
  • insert new disk
  • label/partition if needed
  • zpool replace poolname devicename (if the device names differ: zpool replace poolname olddevicename newdevicename)

Actually, I had to run
Code:
zpool online poolname devicename
before the pool's status changed to ONLINE again. Is this normal behaviour?
 