I had a hard time describing this problem succinctly to generate a title. Thanks also for your patience, as I am not a professional sysadmin but a scientist.
The bottom line is that my 4U JBOD power-cycled (or the controller dropped it) for some reason. When the disks came back online, there were two problems:
1. Devices were re-numbered (/dev/daNN)
2. Two devices never came back online (not recognized by the CAM subsystem).
I think either of these problems alone would be no big deal for the system, but as a result of these two problems interacting, ZFS now shows duplicate device names in different RAID groups (see below).
With respect to #1, this is a known issue. On Linux, I avoid it by referring to devices by SCSI WWN (/dev/disk/by-id/...) when creating the zpool. I'm not sure what I should have done differently here other than manually labeling all 24 disks before creating the pool, but it seems ridiculous to have a human do something a computer should be able to handle without difficulty. In any event, this is not the primary problem.
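On Linux, pool creation with stable names looks roughly like this (illustrative sketch only; the pool name and WWNs below are placeholders, not my actual disks):
```
# Refer to disks by persistent WWN paths instead of sdX names
# (placeholder WWNs, for illustration only)
zpool create tank raidz2 \
    /dev/disk/by-id/wwn-0x5000c500a0000001 \
    /dev/disk/by-id/wwn-0x5000c500a0000002 \
    /dev/disk/by-id/wwn-0x5000c500a0000003 \
    /dev/disk/by-id/wwn-0x5000c500a0000004
```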
With respect to #2, the devices were recognized by the MPR driver, but not by the CAM subsystem. `camcontrol rescan` brought them back online and assigned them the last two IDs, `da58` and `da59` (thanks to this forum post [0]).
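Roughly what I ran, in case anyone finds this later (a sketch from memory; exact output and bus numbers will differ):
```
# Confirm the two disks are missing from CAM's device list
camcontrol devlist

# Re-probe all buses so the MPR-attached disks are picked up again
camcontrol rescan all

# They should now show up (they came back here as da58 and da59)
camcontrol devlist
```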
Now note carefully below: `da50` and `da57` are listed in both the `raidz2-1` and `raidz2-2` groups (where `da58` and `da59` should appear instead).
```
# zpool status bigpool
  pool: bigpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 12:53:37 with 0 errors on Sat Mar 16 04:36:15 2024
config:

        NAME        STATE     READ WRITE CKSUM
        bigpool     DEGRADED     0     0     0
          raidz2-0  ONLINE       0     0     0
            da53    ONLINE       0     0     0
            da54    ONLINE       0     0     0
            da36    ONLINE       0     0     0
            da56    ONLINE       0     0     0
            da51    ONLINE       0     0     0
            da55    ONLINE       0     0     0
            da38    ONLINE       0     0     0
            da39    ONLINE       0     0     0
          raidz2-1  DEGRADED     0     0     0
            da52    ONLINE       0     0     0
            da57    ONLINE       0     0     0
            da37    ONLINE       0     0     0
            da40    ONLINE       0     0     0
            da50    FAULTED      0     0     0  corrupted data
            da47    ONLINE       0     0     0
            da42    ONLINE       0     0     0
            da41    ONLINE       0     0     0
          raidz2-2  DEGRADED     0     0     0
            da44    ONLINE       0     0     0
            da48    ONLINE       0     0     0
            da45    ONLINE       0     0     0
            da49    ONLINE       0     0     0
            da43    ONLINE       0     0     0
            da50    ONLINE       0     0     0
            da57    FAULTED      0     0     0  corrupted data
            da46    ONLINE       0     0     0

errors: No known data errors
```
Now, I'm terrified to issue a `zpool replace` command for two reasons:
1. Naively replacing, say, `da50` might replace the working device in `raidz2-2` rather than the faulted device in `raidz2-1`, leaving me with zero redundancy while resilvering. `zpool replace` may be smart enough to prevent this unless I am dumb enough to include `-f`.
2. I don't actually know which of the two now-available devices (`da58` and `da59`) belongs to which RAID group.
Concern #1 could be solved, I think, by using the GUID (obtained from `zdb`) instead of the device name in the `zpool replace` command.
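Presumably something along these lines would give me the GUIDs (an untested sketch on my part):
```
# Show vdev GUIDs in place of device names in the pool layout
zpool status -g bigpool

# Dump the ZFS label stored on one of the rescanned disks; the label
# records the pool GUID, the vdev tree, and the device's own guid
zdb -l /dev/da58
```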
**How do I solve concern #2?**
For example, if the GUID associated with the FAULTED drive in `raidz2-1` is 10104343158814001513 and I issue `zpool replace bigpool 10104343158814001513 <daNN>`, it will certainly work no matter which of `da58` or `da59` I pick, but one will resilver nearly instantly and the other will take quite a long time.
(Bonus Question: Did I screw up by not GPT-labeling my disks before creating the pool? Chapter 22 of the handbook [1] never recommends this. Is there some other way to add devices by a stable identifier to prevent this problem in the future?)
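For what it's worth, the kind of labeling I mean would look roughly like this on FreeBSD (hypothetical sketch; `bay01` is a made-up label tied to a physical slot, and `da58` is just an example device):
```
# Partition a disk with GPT and give the ZFS partition a human-chosen label
gpart create -s gpt da58
gpart add -t freebsd-zfs -l bay01 da58

# The partition is then also available as /dev/gpt/bay01, which survives
# device renumbering and could be handed to zpool create/replace
```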
[0] https://muc.lists.freebsd.scsi.nark...ll-disks-come-back-after-power-cycling-a-jbod
[1] https://docs.freebsd.org/en/books/handbook/zfs/