ZFS Devices in raidz1 config with states UNAVAIL and OFFLINE

Checking my ZFS pool today, I noticed that two of my drives are in a non-functioning state: one is REMOVED and the other is OFFLINE. zpool status shows the following:
Code:
  pool: library
 state: UNAVAIL
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        library                  UNAVAIL      0     0     0
          raidz1-0               UNAVAIL      0     0     0
            da0.eli              ONLINE       0     0     0
            da1.eli              ONLINE       0     0     0
            da2.eli              ONLINE       0     0     0
            da3.eli              ONLINE       0     0     0
            ada0.eli             ONLINE       0     0     0
            ada1.eli             ONLINE       0     0     0
            ada2.eli             ONLINE       0     0     0
            ada3.eli             ONLINE       0     0     0
            8101502927522773059  REMOVED      0     0     0  was /dev/ada8.eli
            591043243035698817   OFFLINE      0     0     0  was /dev/ada6.eli
            ada6.eli             ONLINE       0     0     0
            ada9.eli             ONLINE       0     0     0

I have attached ada8 and ada6 with geli and they seem to be working fine. I tried following the action above with zpool replace and zpool online, but both commands result in: cannot open 'library': pool is unavailable
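For reference, this is roughly what I ran (the GUIDs are copied from the zpool status output above, so the exact invocations may not be word-for-word what I typed):
Code:
zpool online library 591043243035698817
zpool replace library 8101502927522773059 /dev/ada8.eli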

I am not sure how to proceed. Is there some step I am missing?
I also do not understand why, under NAME, the two faulty disks are listed as 8101502927522773059 and 591043243035698817.

Also, I know having a raidz1 config with this many devices is really dumb :)
 
Code:
591043243035698817   OFFLINE   0  0  0  was /dev/ada6.eli
ada6.eli             ONLINE    0  0  0
It looks like some disks may have switched position. Was a disk removed and then reinserted? That can shuffle device designations around, causing a bit of confusion on the ZFS side. The long numbers are the internal GUIDs ZFS keeps for each vdev; zpool status falls back to showing those when it cannot find the device node it expects.
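If both geli providers can be attached again, an export/import cycle sometimes helps, since it makes ZFS re-scan the device paths and match the disks up by their GUIDs. Roughly like this (untested here, and the export may need -f while the pool is UNAVAIL):
Code:
# with all da*.eli and ada*.eli providers attached:
zpool export library
zpool import library    # or: zpool import -d /dev library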
 
That makes sense... my server is odd, as I have to remove the SATA cables from a couple of the drives when booting up and then plug them back in once it is up and running. It could be that I have switched some of the cables. I will try to see if I can find the correct order and try again.
 
My old LSI SAS controller is weird in that respect too. If I pull out a drive (hot-swap cradle), all drives after it move up one place, which confuses the heck out of ZFS. It does recover if I just reboot with the disk removed, but then I get the same issue again when I reinsert a new drive in that position.
 
For this reason, it is better practice not to add disks to pools by their device names (such as /dev/adaXX), but to give the partitions used by ZFS labels (using gpart) and add them by their label names under /dev/gpt/XXX. That way, when the disks move around, ZFS still finds them under the same name.

Now, I don't know how that would interact with geli encryption, but I bet it can be made to work, and I vaguely remember seeing a HowTo or guide for that recently.
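As a rough sketch of what that could look like for one fresh disk (the label name is made up, and the gpart/geli steps below are destructive on that disk):
Code:
# create a GPT scheme and one labelled freebsd-zfs partition
gpart create -s gpt ada6
gpart add -t freebsd-zfs -l library-disk06 ada6

# the partition now also appears as /dev/gpt/library-disk06
geli init /dev/gpt/library-disk06      # prompts for a passphrase
geli attach /dev/gpt/library-disk06    # creates /dev/gpt/library-disk06.eli

# hand the label-based .eli provider to ZFS instead of adaX.eli
zpool replace library 591043243035698817 /dev/gpt/library-disk06.eli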
 