Solved Help with layout after replacing a failed disk in a RAIDZ2 with root on ZFS

Hi Everyone,

I'm sure I'm providing too much info, but please bear with me:

Several years ago, I set up my first FreeBSD server. There's probably a lot of "best practice" failures here, so I apologize if any of this is especially cringeworthy. It all seemed like a good idea at the time!

I used 5 x 3 TB drives in a RAIDZ2. I also set it up with Root on ZFS and called the pool zjrr. From what I can tell, this created 3 partitions on each drive: freebsd-boot, freebsd-swap, and freebsd-zfs.

Code:
jrr@guts: /home/jrr$ ls /dev/da*
/dev/da0    /dev/da0p1  /dev/da0p2  /dev/da0p3  /dev/da1    /dev/da1p1  /dev/da1p2  /dev/da1p3  /dev/da2    /dev/da2p1  /dev/da2p2  /dev/da2p3  /dev/da3    /dev/da4    /dev/da4p1  /dev/da5    /dev/da5p1  /dev/da6

da4 and da5 are spare drives that I added to back up before I did anything else. You can ignore them.

da6 is the replacement drive. We'll get to that shortly.

da0, da1, da2, and da3 are the drives on which the whole system was originally installed. The 5th drive failed, so I set it to OFFLINE, then hot-unplugged it. I hot-plugged a new drive, which I thought came up as da3, but I could not perform zpool replace zjrr da3, so I tried to reboot with shutdown -r now. Since the system had lost one of its freebsd-swap devices, it threw a ton of errors to /var/log/messages and would not shut down. I ended up commenting out the fstab line mentioning
Code:
/dev/da3p2              none    swap
and hard powering the server off.

The server booted back up okay. I removed the comment from fstab and rebooted; server came up okay.

I moved the new hard drive to a different plug on the controller and was then able to zpool replace zjrr da3 da6. It resilvered overnight, hooray!

When I issued the replace command, I was informed:
Code:
Make sure to wait until resilver is done before rebooting.

If you boot from pool 'zjrr', you may need to update
boot code on newly attached disk 'da6'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

...which I have not done yet. I noticed that the new drive does not have the freebsd-swap or freebsd-boot partitions on it. I misunderstood and thought those would be created during the resilver process. Correspondingly, I see that I have less swap space than before, though I have 24 GB of RAM in this system and do not believe I have a huge need for swap.

If I do a zpool status zjrr, it looks weird:
Code:
jrr@guts: /Shares$ zpool status zjrr
  pool: zjrr
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 2.19T in 0 days 06:25:26 with 0 errors on Tue Jun 23 04:50:52 2020
config:

        NAME                        STATE     READ WRITE CKSUM
        zjrr                        ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            da0p3                   ONLINE       0     0     0
            da1p3                   ONLINE       0     0     0
            da2p3                   ONLINE       0     0     0
            da6                     ONLINE       0     0     0
            diskid/DISK-W6A0Y7ZKp3  ONLINE       0     0     0

errors: No known data errors

and if I do a gpart show, da6 doesn't even show on the list:
Code:
jrr@guts: /Shares$ gpart show
=>       63  488397105  nvd0  MBR  (233G)
         63       1985        - free -  (993K)
       2048  488392704     1  ntfs  (233G)
  488394752       2416        - free -  (1.2M)

=>       63  488397105  diskid/DISK-S3ESNX0K200974J  MBR  (233G)
         63       1985                               - free -  (993K)
       2048  488392704                            1  ntfs  (233G)
  488394752       2416                               - free -  (1.2M)

=>        34  5860533101  da0  GPT  (2.7T)
          34           6       - free -  (3.0K)
          40        1024    1  freebsd-boot  (512K)
        1064         984       - free -  (492K)
        2048     4194304    2  freebsd-swap  (2.0G)
     4196352  5856335872    3  freebsd-zfs  (2.7T)
  5860532224         911       - free -  (456K)

=>        34  5860533101  da1  GPT  (2.7T)
          34           6       - free -  (3.0K)
          40        1024    1  freebsd-boot  (512K)
        1064         984       - free -  (492K)
        2048     4194304    2  freebsd-swap  (2.0G)
     4196352  5856335872    3  freebsd-zfs  (2.7T)
  5860532224         911       - free -  (456K)

=>        34  5860533101  da2  GPT  (2.7T)
          34           6       - free -  (3.0K)
          40        1024    1  freebsd-boot  (512K)
        1064         984       - free -  (492K)
        2048     4194304    2  freebsd-swap  (2.0G)
     4196352  5856335872    3  freebsd-zfs  (2.7T)
  5860532224         911       - free -  (456K)

=>        40  7814037088  da4  GPT  (3.6T)
          40  7814037088    1  freebsd-ufs  (3.6T)

=>        40  7814037088  da5  GPT  (3.6T)
          40  7814037088    1  freebsd-ufs  (3.6T)

=>        34  5860533101  diskid/DISK-W6A0Y7ZK  GPT  (2.7T)
          34           6                        - free -  (3.0K)
          40        1024                     1  freebsd-boot  (512K)
        1064         984                        - free -  (492K)
        2048     4194304                     2  freebsd-swap  (2.0G)
     4196352  5856335872                     3  freebsd-zfs  (2.7T)
  5860532224         911                        - free -  (456K)

=>        40  7814037088  diskid/DISK-Z1Z9DGBM  GPT  (3.6T)
          40  7814037088                     1  freebsd-ufs  (3.6T)

=>        40  7814037088  diskid/DISK-Z1Z9DLNP  GPT  (3.6T)
          40  7814037088                     1  freebsd-ufs  (3.6T)

My questions:
- Do I need for da6 to have freebsd-swap and freebsd-boot?
- Should I run that gpart bootcode command?
- Is there a way to change the label so that my zpool status doesn't show "diskid/DISK-W6A0Y7ZKp3", or is that a sign that something is broken?
- Is there a good reference for replacing a failed disk when using Root on ZFS, or should I not be using Root on ZFS?

Thanks for your patience and for your help. For all intents and purposes, I'm pretty new to all this.
 
You added da6 as a “whole disk”, which likely wasn’t what you wanted (it doesn’t match the other drives’ layouts). You’ll want to do the following (roughly; double-check the man pages to be sure):
  1. zpool offline zjrr da6 (you’ll be down to one redundant drive at this point for a bit)
  2. gpart backup da0 | gpart restore da6
  3. zpool replace zjrr da6 da6p3
  4. gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da6
So (1) removes the “whole disk” da6 from the active pool, (2) replicates the partition layout (but no content) from da0 to da6, (3) replaces the (offline) whole disk da6 with partition da6p3, and (4) adds the bootcode to da6p1. You may want to make sure the bootcode is on your other boot partitions, too, for good measure.
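
If it helps to review the sequence before touching the pool, here is a minimal dry-run sketch of the four steps above. It only prints the commands and runs nothing. The function name plan_repartition is made up for illustration, and it assumes the freebsd-zfs partition is index 3, as on your existing drives.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from steps 1-4, executes nothing.
# plan_repartition is a hypothetical helper name, not a FreeBSD tool.
plan_repartition() {
    pool=$1   # pool name, e.g. zjrr
    src=$2    # healthy disk whose GPT layout gets copied, e.g. da0
    new=$3    # replacement disk now in the pool as a whole disk, e.g. da6

    # 1. take the whole-disk member out of service
    echo "zpool offline ${pool} ${new}"
    # 2. copy the partition layout (not the data) from the healthy disk
    echo "gpart backup ${src} | gpart restore ${new}"
    # 3. swap the whole disk for its freebsd-zfs partition (assumed index 3)
    echo "zpool replace ${pool} ${new} ${new}p3"
    # 4. install boot code into the freebsd-boot partition (index 1)
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ${new}"
}

plan_repartition zjrr da0 da6
```

Read the printed lines against the man pages, then run them one at a time as root, waiting for the resilver after step 3 to finish before rebooting.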

Edit: fixed (4) to say da6.
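
To confirm afterwards that every pool member is a partition rather than a whole disk, something like the following sketch could help. It reads zpool status text on stdin and prints members whose names do not end in a pN suffix. whole_disk_members is a made-up helper name, and the list of vdev group names it skips (raidz, mirror, spare, logs, cache) is an assumption that covers your layout but not every possible pool.

```shell
#!/bin/sh
# Sketch: list pool members that look like whole disks (no pN suffix).
# whole_disk_members is a hypothetical helper; it parses the "config:"
# table of zpool status output supplied on stdin.
whole_disk_members() {
    pool=$1   # pool name, so the pool's own row can be skipped
    awk -v pool="$pool" '
        $1 == "NAME" { in_cfg = 1; next }   # table header starts the config rows
        /^errors:/   { in_cfg = 0 }         # the errors line ends them
        in_cfg && NF >= 2 && $1 != pool &&
            $1 !~ /^(raidz|mirror|spare|logs|cache)/ &&
            $1 !~ /p[0-9]+$/ { print $1 }
    '
}

# Typical use on the server (not executed here):
#   zpool status zjrr | whole_disk_members zjrr
```

Against the zpool status output you posted, this would print da6 only. After the steps above and a finished resilver it should print nothing.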
 