Solved Lost Drive in ZFS pool - Cannot add new drive

Hi,

I have a backup server running in the office on a HP ProLiant MicroServer N54L.
This server is not high-end hardware, but as a backup unit it does a good job.

Today we realised that one of the disks has failed. We put a new one in the machine, but we are not sure what to do next.
Below is the status:
zpool status
Code:
  pool: zroot
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME                          STATE     READ WRITE CKSUM
        zroot                         DEGRADED     0     0     0
          raidz2-0                    DEGRADED     0     0     0
            ufsid/52a0f5122e690fcap2  ONLINE       0     0     0
            ufsid/52a0f53ee01d7083p2  ONLINE       0     0     0
            ufsid/52a0f56322d7540ap2  ONLINE       0     0     0
            9926649283297867681       UNAVAIL      0     0     0  was /dev/ada3p2

errors: No known data errors
Is it right that I should run zpool online zroot 9926649283297867681 when the state is UNAVAIL?

Could anyone please assist?

Thank you in advance
 
I have a page on this--for whatever reason, it took me longer to find out how to replace a root mirror than I expected (whereas finding out how to replace any other missing zfs component was easy.) :)

See if it works for you. WARNING--haven't tried to do this in over 6 months, so no guarantee that the method still works, but I think it will.

https://srobb.net/zfsroot.html
 
Hi scottro, thank you very much for your guide.
Unfortunately it does not seem to work on raidz2:
zpool detach zroot /dev/ada3p2
cannot detach /dev/ada3p2: only applicable to mirror and replacing vdevs
 
OK, here is more info. I really hope someone can help.
The information given below is with the new drive plugged into the server.
root@kryten:~ # zpool status
Code:
  pool: zroot
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 5.56M in 0h0m with 0 errors on Thu Jun 30 22:31:15 2016
config:

        NAME                          STATE     READ WRITE CKSUM
        zroot                         DEGRADED     0     0     0
          raidz2-0                    DEGRADED     0     0     0
            ufsid/52a0f5122e690fcap2  ONLINE       0     0     0
            ufsid/52a0f53ee01d7083p2  ONLINE       0     0     0
            ufsid/52a0f56322d7540ap2  ONLINE       0     0     0
            9926649283297867681       UNAVAIL      0     0     0  was /dev/ada3p2

errors: No known data errors
root@kryten:~ # gpart show
Code:
=>        34  1953525101  ufsid/52a0f5122e690fca  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ufsid/52a0f53ee01d7083  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ufsid/52a0f56322d7540a  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)
root@kryten:~ # ll /dev/ad*
Code:
lrwxr-xr-x  1 root  wheel        4 Jul  1 08:59 /dev/ad10@ -> ada3
lrwxr-xr-x  1 root  wheel        6 Jul  1 09:13 /dev/ad10p1@ -> ada3p1
lrwxr-xr-x  1 root  wheel        4 Jul  1 08:59 /dev/ad4@ -> ada0
lrwxr-xr-x  1 root  wheel        4 Jul  1 08:59 /dev/ad6@ -> ada1
lrwxr-xr-x  1 root  wheel        4 Jul  1 08:59 /dev/ad8@ -> ada2
crw-r-----  1 root  operator  0x5b Jul  1 08:59 /dev/ada0
crw-r-----  1 root  operator  0x62 Jul  1 08:59 /dev/ada1
crw-r-----  1 root  operator  0x64 Jul  1 08:59 /dev/ada2
crw-r-----  1 root  operator  0x66 Jul  1 08:59 /dev/ada3
crw-r-----  1 root  operator  0x70 Jul  1 09:14 /dev/ada3p1
I tried to follow the instructions in Dealing with Failed Devices but was unsuccessful:
root@kryten:~ # zpool replace zroot 9926649283297867681 ada4p2
Code:
cannot open 'ada4p2': no such GEOM provider
must be a full path or shorthand device name
I then tried the following, based on scottro's guide:
root@kryten:~ # gpart create -s gpt ada3
Code:
ada3 created
root@kryten:~ # gpart add -a 4k -s 512k -t freebsd-boot ada3
Code:
ada3p1 added
root@kryten:~ # gpart add -a 4k -s 1T -t freebsd-zfs -l disk3 ada3
Code:
gpart: autofill: No space left on device
root@kryten:~ # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada3
Code:
bootcode written to ada3
root@kryten:~ # gnop create -S 4096 /dev/gpt/disk3
Code:
gnop: Provider gpt/disk3 is invalid.

Could anyone advise further please?
Thank you
 
I think this is what you should use:

zpool replace zroot 9926649283297867681 ada3p2

Make absolutely sure that the partition on the new disk really is ada3p2 before proceeding. If that doesn't work, you can try zpool offline zroot 9926649283297867681 first.
 
Do not use gnop(8) for this purpose. It's only useful when creating a completely new pool on an older version of FreeBSD that doesn't have the vfs.zfs.min_auto_ashift sysctl(8). All necessary partition alignment for the replacement disk can be done with gpart(8).
 
kpa, thank you, but that didn't work:
root@kryten:~ # zpool replace zroot 9926649283297867681 ada3p2
Code:
cannot open 'ada3p2': no such GEOM provider
must be a full path or shorthand device name
 
Try the zpool offline first. Double check that gpart show shows the partitions on the new disk after partitioning.
 
kpa, I'm getting closer but I'm still not there yet.

root@kryten:~ # gpart show
Code:
=>        34  1953525101  ufsid/52a0f5122e690fca  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ada3  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064  1953524071        - free -  (932G)

=>        34  1953525101  ufsid/52a0f53ee01d7083  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ufsid/52a0f56322d7540a  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

root@kryten:~ # gpart show ada3
Code:
=>        34  1953525101  ada3  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064  1953524071        - free -  (932G)
root@kryten:~ # zpool offline zroot ada3p2
root@kryten:~ # zpool status
Code:
  pool: zroot
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 5.56M in 0h0m with 0 errors on Thu Jun 30 22:31:15 2016
config:

        NAME                          STATE     READ WRITE CKSUM
        zroot                         DEGRADED     0     0     0
          raidz2-0                    DEGRADED     0     0     0
            ufsid/52a0f5122e690fcap2  ONLINE       0     0     0
            ufsid/52a0f53ee01d7083p2  ONLINE       0     0     0
            ufsid/52a0f56322d7540ap2  ONLINE       0     0     0
            9926649283297867681       OFFLINE      0     0     0  was /dev/ada3p2

errors: No known data errors
root@kryten:~ # zpool replace zroot 9926649283297867681 ada3p2
Code:
cannot open 'ada3p2': no such GEOM provider
must be a full path or shorthand device name
So close, yet so far.
My hardware does not support hot-plugging. Do I need to add an entry to my /etc/fstab?

Thank you for your help
 
Your problem is here:

root@kryten:~ # gpart show ada3
Code:
=>        34  1953525101  ada3  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064  1953524071        - free -  (932G)

You tried to add a larger partition (1 TB) than fits on the drive, and gpart(8) returned an error that you missed. Add the ZFS partition with this command so that it matches exactly what's on the other disks:

gpart add -b 1064 -s 1929379840 -t freebsd-zfs ada3

I left out the -a option because LBA sector 1064 is already on a 4k boundary.
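
For anyone who wants to check the numbers before touching the disk, the arithmetic can be done in the shell. This is only a sketch: it assumes 512-byte logical sectors (consistent with gpart reporting 1953525101 sectors as 932G), and the two function names are made up for illustration, not part of gpart(8).

```shell
#!/bin/sh
# Hypothetical helpers; the 512-byte sector size is an assumption.
SECTOR_BYTES=512

# is_4k_aligned START_SECTOR
# Exit 0 if the byte offset of START_SECTOR is a multiple of 4096.
is_4k_aligned() {
    [ $(( $1 * SECTOR_BYTES % 4096 )) -eq 0 ]
}

# fits_on_disk START SIZE FIRST_USABLE USABLE_COUNT
# Exit 0 if sectors [START, START+SIZE) lie inside the usable area
# that gpart show prints on the "=>" line.
fits_on_disk() {
    [ "$1" -ge "$3" ] && [ $(( $1 + $2 )) -le $(( $3 + $4 )) ]
}

# Values taken from the gpart output in this thread.
is_4k_aligned 1064 && echo "sector 1064 is 4k-aligned"
fits_on_disk 1064 1929379840 34 1953525101 && echo "partition fits"
```

The same check with a 1T size (2147483648 sectors of 512 bytes) fails, which matches the "No space left on device" error from the earlier gpart add attempt.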
 
kpa, thank you very much for all your help. I'd buy you a beer if you were nearby :)
root@kryten:~ # zpool status
Code:
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Jul  1 10:30:49 2016
        714M scanned out of 285G at 9.04M/s, 8h57m to go
        175M resilvered, 0.24% done
config:

        NAME                          STATE     READ WRITE CKSUM
        zroot                         DEGRADED     0     0     0
          raidz2-0                    DEGRADED     0     0     0
            ufsid/52a0f5122e690fcap2  ONLINE       0     0     0
            ufsid/52a0f53ee01d7083p2  ONLINE       0     0     0
            ufsid/52a0f56322d7540ap2  ONLINE       0     0     0
            replacing-3               OFFLINE      0     0     0
              9926649283297867681     OFFLINE      0     0     0  was /dev/ada3p2/old
              ada3p2                  ONLINE       0     0     0  (resilvering)

errors: No known data errors

I guess all I need to do now is wait for the resilvering process to finish :)
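
Waiting can be scripted rather than re-running zpool status by hand. The following is a rough sketch, not a tested monitoring tool: the function name is invented, and it relies only on the "resilver in progress" wording seen in the status output above, which may differ on other ZFS versions.

```shell
#!/bin/sh
# resilver_in_progress (hypothetical name): reads `zpool status` text on
# stdin and exits 0 if a resilver is still running, non-zero otherwise.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# A scan line copied from the status output above, used in place of a live pool.
sample='  scan: resilver in progress since Fri Jul  1 10:30:49 2016'

if printf '%s\n' "$sample" | resilver_in_progress; then
    echo "still resilvering"
else
    echo "resilver finished"
fi

# On the real system the wait loop might look like this (not run here):
# while zpool status zroot | resilver_in_progress; do sleep 60; done
```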
 
root@kryten:~ # zpool status
Code:
  pool: zroot
 state: ONLINE
  scan: resilvered 70.0G in 1h3m with 0 errors on Fri Jul  1 11:34:23 2016
config:

        NAME                          STATE     READ WRITE CKSUM
        zroot                         ONLINE       0     0     0
          raidz2-0                    ONLINE       0     0     0
            ufsid/52a0f5122e690fcap2  ONLINE       0     0     0
            ufsid/52a0f53ee01d7083p2  ONLINE       0     0     0
            ufsid/52a0f56322d7540ap2  ONLINE       0     0     0
            ada3p2                    ONLINE       0     0     0

errors: No known data errors
root@kryten:~ # gpart show
Code:
=>        34  1953525101  ufsid/52a0f5122e690fca  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ada3  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064  1929379840     2  freebsd-zfs  (920G)
  1929380904    24144231        - free -  (12G)

=>        34  1953525101  ufsid/52a0f53ee01d7083  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

=>        34  1953525101  ufsid/52a0f56322d7540a  GPT  (932G)
          34           6                          - free -  (3.0K)
          40        1024                       1  freebsd-boot  (512K)
        1064  1929379840                       2  freebsd-zfs  (920G)
  1929380904    24144231                          - free -  (12G)

Last question: does it matter that the naming convention for the new disk isn't the same as for the rest in gpart?
 
I really wonder where the ufsid labels come from. They shouldn't show up on disks that don't have UFS filesystems. How exactly were those other disks prepared? Did they have UFS filesystems before being used for the ZFS pool?

It doesn't really matter which names are used for the disks in the pool. ZFS uses the on-disk ZFS labels to detect which disks/partitions are part of the pool.
 
I'm pretty sure the disk was formatted before being put in that server.
It was so long ago that I can't remember.
Anyway, am I right to think that I'm all good on ZFS now?
 
Yes, keep it as it is for now since it works. For the next system(s) you're about to build you should turn off some of the labels and legacy compatibility names in loader.conf(5):

Code:
kern.cam.ada.legacy_aliases=0
kern.geom.label.gptid.enable=0
kern.geom.label.ufsid.enable=0

After disabling those you should use GPT labels as the partition identifiers:

Code:
...
gpart add -t freebsd-zfs -s NNN -l zpool0 ada0
...
gpart add -t freebsd-zfs -s NNN -l zpool1 ada1
...

Then you can use the GPT labels to create the pool:

Code:
zpool create zpool raidz2 gpt/zpool0 gpt/zpool1 ...
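
With more than a couple of disks, the label numbering is easy to get wrong by hand, so the commands can be generated first and reviewed before running anything. This is a dry-run sketch only: the function name is hypothetical, the size is left as the NNN placeholder from the post above, and creating the partition table and boot partition on each disk is not covered.

```shell
#!/bin/sh
# print_label_cmds SIZE DISK... (hypothetical helper)
# Prints, but does not run, one gpart command per disk with labels
# zpool0, zpool1, ... followed by the matching zpool create line.
# SIZE is passed through to `gpart add -s` unchanged.
print_label_cmds() {
    size=$1
    shift
    i=0
    vdevs=""
    for disk in "$@"; do
        echo "gpart add -t freebsd-zfs -s $size -l zpool$i $disk"
        vdevs="$vdevs gpt/zpool$i"
        i=$((i + 1))
    done
    echo "zpool create zpool raidz2$vdevs"
}

# NNN is a placeholder; substitute the real partition size before use.
print_label_cmds NNN ada0 ada1 ada2 ada3
```

Printing instead of executing keeps the sketch safe to try on any machine; the output can be pasted into a root shell once it looks right.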
 