ZFS Replace all disks in ZFS-on-root and expand zpool

Hi, my old zpool has eight 2T disks and I'd like to replace all of them with 4T disks to expand the pool.

The old zpool:

Code:
    zroot       ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        da0p4   ONLINE       0     0     0
        da1p4   ONLINE       0     0     0
        da2p4   ONLINE       0     0     0
        da3p4   ONLINE       0     0     0
        da4p4   ONLINE       0     0     0
        da5p4   ONLINE       0     0     0
        da6p4   ONLINE       0     0     0
        da7p4   ONLINE       0     0     0

First, I just pulled out da0, and zpool status showed the disk as REMOVED, as expected.

Code:
    NAME        STATE     READ WRITE CKSUM
    zroot       DEGRADED     0     0     0
      raidz1-0  DEGRADED     0     0     0
        da0p4   REMOVED      0     0     0
        da1p4   ONLINE       0     0     0
        da2p4   ONLINE       0     0     0
        da3p4   ONLINE       0     0     0
        da4p4   ONLINE       0     0     0
        da5p4   ONLINE       0     0     0
        da6p4   ONLINE       0     0     0
        da7p4   ONLINE       0     0     0

Then I inserted the new 4T disk and ran:

Code:
zpool replace zroot da0p4 da0

The new disk was added to the pool and started resilvering.

Code:
    zroot            DEGRADED     0     0     0
      raidz1-0       DEGRADED     0     0     0
        replacing-0  DEGRADED     0     0     0
          da0p4      REMOVED      0     0     0
          da0        ONLINE       0     0     0  (resilvering)
        da1p4        ONLINE       0     0     0
        da2p4        ONLINE       0     0     0
        da3p4        ONLINE       0     0     0
        da4p4        ONLINE       0     0     0
        da5p4        ONLINE       0     0     0
        da6p4        ONLINE       0     0     0
        da7p4        ONLINE       0     0     0

Then I realized I hadn't created any partitions on it, so ZFS is using the entire disk, which means there is no swap and no EFI partition on it.

After the resilver completed:

Code:
    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        da0     ONLINE       0     0     0
        da1p4   ONLINE       0     0     0
        da2p4   ONLINE       0     0     0
        da3p4   ONLINE       0     0     0
        da4p4   ONLINE       0     0     0
        da5p4   ONLINE       0     0     0
        da6p4   ONLINE       0     0     0
        da7p4   ONLINE       0     0     0

Then I took da0 offline.

Code:
    zroot       DEGRADED     0     0     0
      raidz1-0  DEGRADED     0     0     0
        da0     OFFLINE      0     0     0
        da1p4   ONLINE       0     0     0
        da2p4   ONLINE       0     0     0
        da3p4   ONLINE       0     0     0
        da4p4   ONLINE       0     0     0
        da5p4   ONLINE       0     0     0
        da6p4   ONLINE       0     0     0
        da7p4   ONLINE       0     0     0

And I tried to create the partitions from an existing disk's layout.

Code:
gpart backup da1 | gpart restore -F da0

Then I checked the gpart info.

Code:
=>        40  3907029088  da1  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        34  7814037101  da0  GPT  (3.6T)
          34           6       - free -  (3.0K)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992  3907008143       - free -  (1.8T)

Then I tried to resize it.

Code:
# gpart resize -i 4 da0 
da0p4 resized

Then I checked the partitions again.

Code:
=>        40  3907029088  da1  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        34  7814037101  da0  GPT  (3.6T)
          34           6       - free -  (3.0K)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  7805113999    4  freebsd-zfs  (3.6T)

It looks like the table on the new 4T disk starts at sector 34, not 40. Is this a problem?
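
On the alignment question, a minimal sketch, assuming 512-byte logical sectors (which the sector counts above suggest but do not prove): a partition is 4K-aligned when its start sector is a multiple of 8. The function name `is_4k_aligned` is made up for illustration; the sector numbers are taken from the gpart output above.

```shell
#!/bin/sh
# Assumes 512-byte logical sectors, so 4096 / 512 = 8 sectors per 4K block.
is_4k_aligned() {
    if [ $(( $1 % 8 )) -eq 0 ]; then
        echo "sector $1: 4K-aligned"
    else
        echo "sector $1: NOT 4K-aligned"
    fi
}

# 34 is where the usable area of the table begins on da0;
# the other values are the partition start sectors.
for start in 34 40 532520 534528 8923136; do
    is_4k_aligned "$start"
done
```

Only sector 34 comes out unaligned, and no partition starts there: it is the first usable sector of the table, followed by a 6-sector free gap before the first partition at sector 40.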

And then I could not replace the disk.
Code:
# zpool replace -f zroot da0 da0p4
invalid vdev specification
the following errors must be manually repaired:
/dev/da0p4 is part of active pool 'zroot'

But the weird thing is, I can still bring da0 online.

Code:
# zpool online -e zroot da0
# zpool status

    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        da0     ONLINE       0     0     0
        da1p4   ONLINE       0     0     0
        da2p4   ONLINE       0     0     0
        da3p4   ONLINE       0     0     0
        da4p4   ONLINE       0     0     0
        da5p4   ONLINE       0     0     0
        da6p4   ONLINE       0     0     0
        da7p4   ONLINE       0     0     0

At this point, gpart shows no partition table on da0.

Code:
# gpart show da0
gpart: No such geom: da0.
# gpart show
=>        40  3907029088  da6  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da7  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da4  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da5  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da3  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da2  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  3907029088  da1  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

But if I mark da0 offline, gpart show da0 does show the partitions:
Code:
# zpool offline zroot da0 
# gpart show da0
=>        34  7814037101  da0  GPT  (3.6T) [CORRUPT]
          34           6       - free -  (3.0K)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  7805113999    4  freebsd-zfs  (3.6T)


So what's the right procedure to replace a disk in a root-on-ZFS pool?

Should I create the partitions manually to make sure they are 4K-aligned?
 
Update: I destroyed all partitions on da0 and created them manually. It looks like I can replace the disk now.

Code:
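gpart destroy -F da0
gpart create -s gpt da0
# "ada0" does not match the da0 used everywhere else, and the resulting table
# below has no such partition, so this line presumably had no effect on da0.
gpart add -t freebsd-zfs -a 4k ada0
gpart add -t efi -s 260M da0
gpart add -a 4k -s 4g -t freebsd-swap da0
gpart add -a 4k -t freebsd-zfs da0
# Note: index 1 is the efi partition in the resulting table; gptzfsboot is
# normally written to a freebsd-boot partition.
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0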

=>        40  3907029088  da1  GPT  (1.8T)
          40      532480    1  efi  (260M)
      532520        1024    2  freebsd-boot  (512K)
      533544         984       - free -  (492K)
      534528     8388608    3  freebsd-swap  (4.0G)
     8923136  3898105856    4  freebsd-zfs  (1.8T)
  3907028992         136       - free -  (68K)

=>        40  7814037088  da0  GPT  (3.6T)
          40      532480    1  efi  (260M)
      532520     8388608    2  freebsd-swap  (4.0G)
     8921128  7805116000    3  freebsd-zfs  (3.6T)
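
A hedged sketch of the same three-partition layout (efi, swap, ZFS) with GPT labels added, since a later reply points out that /etc/fstab on this server mounts the efi partition by label. It only prints the commands so they can be reviewed before anything is run. The function name `plan_partitions` and the label names (efibootN, swapN, zfsN) are assumptions, not something the thread prescribes.

```shell
#!/bin/sh
# Dry run: prints the commands for one replacement disk instead of running them.
# Label names (efibootN, swapN, zfsN) are assumptions for illustration.
plan_partitions() {
    disk="$1"   # device name, e.g. da0
    n="$2"      # label suffix, e.g. 0
    echo "gpart destroy -F ${disk}"
    echo "gpart create -s gpt ${disk}"
    echo "gpart add -a 4k -s 260M -t efi -l efiboot${n} ${disk}"
    echo "gpart add -a 4k -s 4g -t freebsd-swap -l swap${n} ${disk}"
    echo "gpart add -a 4k -t freebsd-zfs -l zfs${n} ${disk}"
}

plan_partitions da0 0
```

This layout has no freebsd-boot partition, so it only suits machines that boot via UEFI; the efi partition still needs a FAT file system and a loader before it is bootable.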

Then I triggered the disk replacement in the pool; it looks like it is working now.

Code:
# zpool replace zroot da0 da0p3  
# zpool status

    zroot            DEGRADED     0     0     0
      raidz1-0       DEGRADED     0     0     0
        replacing-0  DEGRADED     0     0     0
          da0        OFFLINE      0     0     0
          da0p3      ONLINE       0     0     0  (resilvering)
        da1p4        ONLINE       0     0     0
        da2p4        ONLINE       0     0     0
        da3p4        ONLINE       0     0     0
        da4p4        ONLINE       0     0     0
        da5p4        ONLINE       0     0     0
        da6p4        ONLINE       0     0     0
        da7p4        ONLINE       0     0     0
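
The thread does not show the final expansion step, so the following is only a sketch: once every member has been replaced by a larger partition, the pool usually grows only if its autoexpand property is on or each device is expanded with `zpool online -e` (the flag already used earlier in this post). The loop prints the commands rather than running them and assumes every new disk ends up with its ZFS partition at index 3, as da0 does here; `plan_expand` is a made-up name.

```shell
#!/bin/sh
# Dry run: prints one expand command per pool member; nothing is executed.
# Assumes the ZFS partition is index 3 on every replaced disk (as on da0 above).
plan_expand() {
    pool="$1"
    shift
    for disk in "$@"; do
        echo "zpool online -e ${pool} ${disk}p3"
    done
}

plan_expand zroot da0 da1 da2 da3 da4 da5 da6 da7
```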

But I still have one question: for the EFI partition, do I need to copy the boot files to it?
 
For EFI booting you do. Every disk (actually vdev) of your RAIDZ1 pool should be ready to boot from in case one disk goes offline. "Update of the bootcodes for a GPT scheme (x64 architecture)" might be helpful (although it is formally about updating).

BTW, when you've finalized the transformation, you have an 8-disk RAIDZ1 made up of 4TB disks: in my view that's a big pool for only a RAIDZ1 with "one disk" redundancy; I'd be more comfortable with RAIDZ2 or 3. Also take into consideration the longer resilver times when the pool is storing more data.
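
To put rough numbers on that trade-off, a small sketch of the usual back-of-the-envelope estimate, (disks - parity) * disk size. It ignores metadata, padding and TB/TiB differences, so the real figures will be lower; the function name is made up for illustration.

```shell
#!/bin/sh
# Rough usable capacity of a single RAIDZ vdev: (disks - parity) * disk size.
# Ignores metadata and padding overhead, so treat the result as an upper bound.
raidz_usable_tb() {
    disks="$1"
    parity="$2"
    size_tb="$3"
    echo $(( (disks - parity) * size_tb ))
}

echo "RAIDZ1: $(raidz_usable_tb 8 1 4) TB usable, survives 1 disk failure"
echo "RAIDZ2: $(raidz_usable_tb 8 2 4) TB usable, survives 2 disk failures"
```

With eight 4TB disks the second parity disk costs about 4 TB of the roughly 28 TB that RAIDZ1 would offer.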
 
It might have been better to copy the 2 TB disk to the 4 TB disk first, which gets you both the partitioning and the boot code.
 
But then you'd be relying on the old boot code; I'd be sure and install/update to the newest boot code on all disks.
Maybe I'm missing boot details, but with the 2TB disks, you could be booting via the protective MBR of GPT and that won't work any more on 4TB disks.
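
For context on the 2TB remark, a small sketch of the arithmetic: an MBR partition entry stores sector addresses in 32 bits, so with 512-byte sectors (an assumption here) it can describe at most 2^32 sectors, i.e. 2 TiB. Whether that actually affects booting depends on which boot path the machine uses; the sketch only compares the usable sector counts from the gpart output earlier in the thread with that limit.

```shell
#!/bin/sh
# 32-bit LBA limit of an MBR partition entry, in sectors.
MBR_MAX_SECTORS=$(( 1 << 32 ))

fits_in_mbr() {
    if [ "$1" -le "$MBR_MAX_SECTORS" ]; then
        echo "$1 sectors: within the 32-bit limit"
    else
        echo "$1 sectors: beyond the 32-bit limit"
    fi
}

fits_in_mbr 3907029088   # usable sectors on an old 2T disk (da1)
fits_in_mbr 7814037088   # usable sectors on the new 4T disk (da0)
```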
 
True.
 
Update:

Finally, I replaced all 8 disks. I also used dd to copy the efi partition during disk replacement.

e.g.
Code:
dd if=/dev/da0p1 of=/dev/da7p1

But after the reboot the server was not able to boot and complained that the disk is not bootable.

I'm wondering whether that is because I forgot to copy the freebsd-boot partition. But I thought UEFI doesn't need this boot partition.

Then I booted into a live CD and ran efibootmgr; no local disk shows up in the result.
So it looks like the UEFI boot entry was not added for the disk as expected. So:

1. Copying the EFI partition via dd is not enough.
2. I copied the partition from the second disk (da1) at the beginning. I remember an article saying that only the first disk has valid UEFI boot info.

Is there any way to fix it? It looks like I'm not able to mount the EFI partition for now.

[Attached screenshot: QQ截图20240716162438.jpg]

Update:

Fixed with the commands below:

Code:
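# Create a fresh FAT32 file system on the efi partition and mount it.
newfs_msdos -F 32 -c 1 /dev/da0p1
mount -t msdosfs /dev/da0p1 /mnt
# Install the FreeBSD loader at the default removable-media path
# (msdosfs is case-insensitive, so EFI/BOOT and efi/boot are the same directory).
mkdir -p /mnt/EFI/BOOT
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
# Create and activate a UEFI boot entry labeled "FreeBSD".
efibootmgr -a -c -l /mnt/efi/boot/bootx64.efi -L FreeBSD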

After that reboot the server can boot via UEFI, but it still complains:

[Attached screenshot: 2.jpg]
 
bumping thread ...

Update:

Fixed by below command
...
I just happened to notice that you've updated your message after Emrion's last message: please don't do that.
Just create a new message.

People watching this thread and trying to help you won't get notified.
Even when they happen to look at this thread specifically, they likely will not notice your "Update:" added to your existing message.
They only see that Emrion's message is still the last one and that you haven't answered his question yet.
 
Here you added an efi partition but didn't give it a GPT label:
gpart add -t efi -s 260M da0
The efi partition's file system is set in /etc/fstab to mount by its GPT label ( /dev/gpt/efiboot0 ). If there is no label, the file system can't be mounted, and the boot process stops at the single-user prompt for the user to fix it.

At the single-user prompt, label the efi partition ( gpart modify -i 1 -l efiboot0 da1 ), then continue booting.

Do the same with all the other efi partitions on all disks.
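
A dry-run sketch of that step for all eight disks. It only prints the gpart modify commands so the disk-to-label mapping can be checked first. The mapping itself (da0 gets efiboot0, da1 gets efiboot1, and so on) and the assumption that the efi partition is index 1 on every disk are illustrative; `plan_efi_labels` is a made-up name.

```shell
#!/bin/sh
# Dry run: prints one labelling command per disk; nothing is executed.
# Assumes the efi partition is index 1 on every disk and labels efiboot0..N-1.
plan_efi_labels() {
    n=0
    while [ "$n" -lt "$1" ]; do
        echo "gpart modify -i 1 -l efiboot${n} da${n}"
        n=$(( n + 1 ))
    done
}

plan_efi_labels 8
```

Whichever mapping is used, the label named in /etc/fstab (efiboot0 here) has to exist on one of the disks.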

And I also used dd to copy the efi partition during disk replacement.

e.g.
Code:
dd if=/dev/da0p1 of=/dev/da7p1
The efi partition with the efi boot loader should be copied to all disks, in case the disk with the only efi loader is removed.

Should the disk that the "FreeBSD"-labeled UEFI menu entry refers to be removed, it will be necessary to update the entry again.
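
As a sketch of what "copied to all disks" could look like, the loop below prints, per disk, the same mount/copy commands the original poster used on da0. It is a dry run, and it assumes the efi partition is index 1 on every disk and already holds a FAT file system; `plan_loader_copy` is a made-up name.

```shell
#!/bin/sh
# Dry run: prints the commands that would put the loader on one disk's efi partition.
# Assumes the efi partition is index 1 and already holds a FAT file system.
plan_loader_copy() {
    part="/dev/${1}p1"
    echo "mount -t msdosfs ${part} /mnt"
    echo "mkdir -p /mnt/EFI/BOOT"
    echo "cp /boot/loader.efi /mnt/EFI/BOOT/BOOTX64.EFI"
    echo "umount /mnt"
}

for disk in da0 da1 da2 da3 da4 da5 da6 da7; do
    plan_loader_copy "$disk"
done
```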

"no pools available to import". Are there other pools to import besides "zroot"?
 
Thanks for your reply.

I just checked /etc/fstab on a working server (same configuration). It is as follows:

Code:
/dev/da0p2                none    swap    sw              0       0
/dev/da1p2                none    swap    sw              0       0
/dev/da10p2               none    swap    sw              0       0
/dev/da11p2               none    swap    sw              0       0
/dev/da2p2                none    swap    sw              0       0
/dev/da3p2                none    swap    sw              0       0
/dev/da4p2                none    swap    sw              0       0
/dev/da5p2                none    swap    sw              0       0
/dev/da6p2                none    swap    sw              0       0
/dev/da7p2                none    swap    sw              0       0

There is no GPT label.
 
I just checked /etc/fstab on a working server (same configuration). It is as follows
But on the server which stops at the single-user prompt, there must be an efi partition set to mount in fstab under a label name.

Otherwise there wouldn't be those messages from the fstab mount:
Code:
Can't open /dev/gpt/efiboot0
/dev/gpt/efiboot0 UNEXPECTED INCONSISTENCY. RUN fsck_mdosfs MANUALLY
THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY
    msdosfs: /dev/gpt/efiboot0 (/boot/efi)
 
I just brought the server back:

Code:
# newfs_msdos -F 32 -c 1 -L EFISYS1 /dev/gpt/efiboot0
# mount_msdosfs /dev/gpt/efiboot0 /mnt
# mkdir -p /mnt/EFI/BOOT
# cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
# umount /mnt

Then I rebooted, and the server can boot into normal mode.

But honestly speaking, I don't know why this works.
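
One plausible reading, offered as a guess rather than a diagnosis: the boot stopped because /etc/fstab on this server mounts /boot/efi from /dev/gpt/efiboot0, so it needs both that GPT label and a valid FAT file system on the partition, and the commands above provide the latter. A small sketch for finding which GPT labels an fstab depends on; the function name and the sample fstab line are assumptions built from the error message quoted earlier.

```shell
#!/bin/sh
# Print every GPT label that an fstab file refers to as /dev/gpt/<label>.
fstab_gpt_labels() {
    awk '$1 !~ /^#/ && $1 ~ /^\/dev\/gpt\// { sub("^/dev/gpt/", "", $1); print $1 }' "$1"
}

# Hypothetical sample; on a real system pass /etc/fstab instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Device            Mountpoint  FStype   Options  Dump  Pass#
/dev/gpt/efiboot0   /boot/efi   msdosfs  rw       2     2
/dev/da0p2          none        swap     sw       0     0
EOF

fstab_gpt_labels "$sample"
rm -f "$sample"
```

Each label it prints should show up in the output of gpart show -l; a label that is missing there would explain a "Can't open /dev/gpt/..." message at boot.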
 