How to replace zfs failed drive?

I have a failed drive in my FreeBSD NAS. Long story short, this one drive was dropped on the floor. The system does not boot with the drive present. I got another drive and put it in place of the failed drive. This is what zpool status shows.

What commands do I need to issue to get FreeBSD to use the new drive, which is present as ada3? I understand that FreeBSD will resilver the new drive.

I'm afraid of making some mistake and wiping out my data. The new drive is not even partitioned:

Code:
# gpart show ada3
gpart: No such geom: ada3.

I'm not sure if the replacement process takes care of that.

Code:
# zpool status

  pool: zroot
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
  scan: resilvered 1.56M in 00:00:01 with 0 errors on Sat Oct 14 12:37:20 2023
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0
            ada3p3  UNAVAIL      0     0     0  cannot open

errors: No known data errors
 
The system does not boot with the drive present.
Yes, that can happen. A broken disk can hang up the entire bus, causing all sorts of issues. Just pull the drive out; you have RAID-Z, so one missing disk should be fine.

I understand that FreeBSD will resilver the new drive.
Technically it's ZFS that will resilver the drive, not FreeBSD.

What commands do I need to issue to get FreeBSD to use the new drive, which is present as ada3?
Create the same partitions as on the other drives, and make sure you use ada3p3 not the whole drive (ada3). Use the zpool-replace(8) command.
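
To see exactly what you need to replicate, a quick read-only look at one of the healthy drives is enough, something like this:

Code:
# Read-only, so safe: show the partition layout of a healthy member
gpart show ada0
# Same, but showing the partition labels instead of the types
gpart show -l ada0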

If you can, don't use the data while it's resilvering. While you can certainly read and/or write data when the pool is in a degraded state it will cause the resilvering to take longer.
 
Thanks. These are the partitions on the other three drives (see below).

Reading around, this is my guess for how to complete the task. I found the disk label disk10 in an online example. No idea if this is right, immaterial, or a problem.

Code:
sudo gpart create -s gpt /dev/ada3
sudo gpart add -b 40 -s 1024 -t freebsd-boot /dev/ada3
sudo gpart add -b 1064 -s 984 -t free /dev/ada3
sudo gpart add -b 2048 -s 4194304 -t freebsd-swap -l swap10 /dev/ada3
sudo gpart add -b 4196352 -s 7809839104 -t freebsd-zfs -l disk10 /dev/ada3
sudo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 /dev/ada3

sudo zpool replace zroot /dev/gpt/disk10
Code:
# gpart show ada0
=>        40  7813971560  ada0  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  7809773568     3  freebsd-zfs  (3.6T)
  7813969920        1680        - free -  (840K)

# gpart show ada1
=>        40  7814037088  ada1  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  7809839104     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

# gpart show ada2
=>        40  7814037088  ada2  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  7809839104     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)
 
Code:
sudo gpart add -b 1064 -s 984 -t free /dev/ada3
Don't add the "free" bits; they're not actual partitions, they're a consequence of alignment (2048 is a clean 4K/8K boundary, 1064 is not). If you don't align partitions on a proper boundary you're going to see serious performance degradation if the disk itself has 4K sectors. Other than that, yes, it looks good. You can also use gpart backup and gpart restore to "copy" the partition table from one disk to another; see the bottom of gpart(8) for some examples.
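
Put differently, a rough sketch of the manual route, based on the ada1/ada2 layout you posted (the swap10/disk10 labels are just names you chose; keep or change them, gpart doesn't care):

Code:
# Create a new GPT scheme and recreate the three real partitions;
# the "free" gaps appear by themselves from the chosen offsets
gpart create -s gpt ada3
gpart add -b 40 -s 1024 -t freebsd-boot -i 1 ada3
gpart add -b 2048 -s 4194304 -t freebsd-swap -l swap10 ada3
gpart add -b 4196352 -s 7809839104 -t freebsd-zfs -l disk10 ada3
# Write the protective MBR and the gptzfsboot loader into partition 1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada3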

sudo zpool replace zroot /dev/gpt/disk10
You're missing something here: you need to tell it which drive it should replace. If you removed the old drive, it might show up as a long number (its GUID). The syntax is zpool replace <pool> <old disk> <new disk>; if it happens to be the exact same device name it'll be something like zpool replace zroot /dev/ada3p3 /dev/ada3p3 or, if the disks moved around, zpool replace zroot <long number> /dev/ada3p3.

If you happen to pick the wrong disk the command will fail and tell you. Do NOT be tempted to use -f to force it. It's giving you an error for a reason.
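
For example, something along these lines, assuming the new partition ends up as ada3p3; if your zpool supports the -g flag (see zpool-status(8)) it will show the GUIDs when the old member only appears as a long number:

Code:
# Show pool members by GUID instead of device name
zpool status -g zroot
# Old and new name are the same device:
zpool replace zroot /dev/ada3p3 /dev/ada3p3
# Or, if the old member only shows up as a long number (its GUID):
# zpool replace zroot <long number> /dev/ada3p3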
 
Can one do a gpart backup/gpart restore?
sudo gpart backup ada0 > gpart.ada0
sudo gpart restore ada3 < gpart.ada0

The only thing that could be confusing is if ada0 has gpart labels and something is mounted by them.
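
If it helps, the two steps can also be piped together; -l keeps the labels and -F overwrites whatever scheme might already be on the target (gpart(8) has a similar example near the end):

Code:
# Copy the partition table from ada0 straight onto ada3;
# drop -l if you don't want the labels duplicated
gpart backup ada0 | gpart restore -lF ada3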
 
Can one do a gpart backup/gpart restore?
sudo gpart backup ada0 > gpart.ada0
sudo gpart restore ada3 < gpart.ada0

The only thing that could be confusing is if ada0 has gpart labels and something is mounted by them.

Thanks. I did this. The only odd thing is that ada3 now has 33M free.

Code:
# gpart show ada3
=>        40  7814037088  ada3  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  7809773568     3  freebsd-zfs  (3.6T)
  7813969920       67208        - free -  (33M)

You're missing something here: you need to tell it which drive it should replace. If you removed the old drive, it might show up as a long number (its GUID). The syntax is zpool replace <pool> <old disk> <new disk>; if it happens to be the exact same device name it'll be something like zpool replace zroot /dev/ada3p3 /dev/ada3p3 or, if the disks moved around, zpool replace zroot <long number> /dev/ada3p3.

Where can I see the names of these disks? BTW, I put the new disk in the same space the old disk occupied.
 
Where can I see the names of these disks?
zpool status will tell you.
BTW, I put the new disk in the same space the old disk occupied.
Yes, but that doesn't necessarily mean it gets the same drive designation. I have an older LSI card and if I take a disk out all the other disks shift their designation. What used to be da3 is now da2, the original da2 became da1, etc. The new disk I put in might show up as da4, even though it is in the first or second slot. Which can be utterly confusing. When you insert the new disk, take a close look at /var/log/messages to see which drive designation it got.
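
A few read-only ways to double-check which designation the new disk actually got (the serial number is usually the give-away):

Code:
# List all disks the system currently sees
camcontrol devlist
# Model and serial number ("ident") of the suspected new disk
geom disk list ada3
# Kernel messages from when the disk was detected
grep ada3 /var/log/messages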
 
Yes, but that doesn't necessarily mean it gets the same drive designation. I have an older LSI card and if I take a disk out all the other disks shift their designation. What used to be da3 is now da2, the original da2 became da1, etc. The new disk I put in might show up as da4, even though it is in the first or second slot. Which can be utterly confusing. When you insert the new disk, take a close look at /var/log/messages to see which drive designation it got.

Thanks again. It is ada3.

So I will do

zpool replace zroot /dev/ada3p3 /dev/ada3p3
 
Thanks. I did this. The only odd thing is that ada3 now has 33M free.
"1TB does not always equal 1TB" depends on the manufacturer, some include extra space, some say 1024 is 1K others say 1000 is 1K.

But that is why I prefer doing ZFS on partitions rather than whole devices: you can make everything the same size. With whole devices you're fine as long as the new one is bigger than the existing ones; if it's smaller you can run into problems.
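
If you want to compare the raw sizes directly, something like this shows the media size in bytes and sectors for each disk:

Code:
# Sector size, media size and sector count for each disk
diskinfo -v ada0 ada3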
 
No, manufacturers ALWAYS use 1000 for 1K. They're bound by various rules and regulations and must use the SI standards.
Glad they finally standardized; in the "really old days" they weren't, but if they are now, then good for us.
 
Strange problems

Code:
zpool status -v
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Jan  3 21:40:10 2024
        10.4T scanned at 362M/s, 9.87T issued at 345M/s, 12.6T total
        2.35T resilvered, 78.37% done, 02:18:02 to go
config:

        NAME              STATE     READ WRITE CKSUM
        zroot             DEGRADED     0     0     0
          raidz1-0        DEGRADED     0     0     0
            ada0p3        ONLINE       1     0   917  (resilvering)
            ada1p3        ONLINE       0     0   915
            ada2p3        ONLINE       0     0   915
            replacing-3   DEGRADED     0     0   915
              ada3p3/old  UNAVAIL      0     0     0  cannot open
              ada3p3      ONLINE       0     0     0  (resilvering)

errors: Permanent errors have been detected in the following files:

        //share/Movies/0 Documentaries/Napoleonic/1812.Napoleonic.Wars.in.Russia.3of4.1080p.WEB-DL.x264.AAC.MVGroup.Forum.mkv
 
No, not really. It's busy resilvering, that's good, just over 3/4 has been done (78.37%). Depending on the state of the pool before the disk failure there could be some broken files. How old is that pool? Did you ever scrub it?
I built up that data over a few weeks in preparation to move it by plane in my hand luggage. It is a copy of my data in another country. I never scrubbed it.
Thanks for the reassurance.
 
This is the final result. Only two files seem to have been lost and they are unimportant.

Thanks for your help.

Code:
# zpool status -v
  pool: zroot
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 3.01T in 11:10:13 with 90 errors on Thu Jan  4 08:50:23 2024
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0p3  ONLINE       1     0 1.60K
            ada1p3  ONLINE       0     0 1.59K
            ada2p3  ONLINE       0     0 1.59K
            ada3p3  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        //share/Movies/0 Documentaries/channel4.dispatches.avi
        //share/Movies/0 Documentaries/Napoleonic/1812.Napoleonic.Wars.in.Russia.3of4.1080p.WEB-DL.x264.AAC.MVGroup.Forum.mkv
 
What is the reason for the file corruption? There is a working mirror.
They were probably already corrupt before the drive fault. When the pool was healthy, a scrub would have been able to fix them. But now 2 of the 4 parts are gone, so there's not enough redundancy anymore.
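
For reference, a scrub is just this (a rough sketch; ideally run it when the pool isn't busy):

Code:
# Walk all data in the pool and repair what it can from redundancy
zpool scrub zroot
# Check progress and results
zpool status zroot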
 
They were probably already corrupt before the drive fault. When the pool was healthy, a scrub would have been able to fix them. But now 2 of the 4 parts are gone, so there's not enough redundancy anymore.

Do I need to do something regularly to catch corrupt files and fix them?
 
If the server is running overnight every day (or at least overnight when it is run), then a periodic script will do the scrub every 35 days if I am not mistaken.
I believe you need to add daily_scrub_zfs_enable="YES" to your /etc/periodic.conf. It looks to be disabled by default in /etc/defaults/periodic.conf.
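
In other words, something like this in /etc/periodic.conf (variable names as they appear in /etc/defaults/periodic.conf; double-check yours):

Code:
# /etc/periodic.conf
daily_scrub_zfs_enable="YES"
# Optional: limit to specific pools and set the days between scrubs (default 35)
#daily_scrub_zfs_pools="zroot"
#daily_scrub_zfs_default_threshold="35"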
 