Unavailable disks in ZFS pool

Hello all,

I am running NAS4Free (I have already posted my issue on that forum, but the suggestions so far have not resolved the problem; the issue relates to my ZFS pool rather than to the OS itself).

I had the OS installed on a 1TB drive and a separate zpool 'media' consisting of 4x2TB drives. Following a failure of the OS drive, I replaced it and reinstalled the OS. Unfortunately, since then I have been unable to import the zpool:

Code:
# zpool import -f
   pool: media
     id: 6325274310150933716
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

        media                     UNAVAIL  insufficient replicas
          raidz1-0                UNAVAIL  insufficient replicas
            raid/r0               ONLINE
            5557966061127549840   UNAVAIL  cannot open
            15250106807823902241  UNAVAIL  cannot open
            12240457910996348905  UNAVAIL  cannot open

From my investigations it appears the drive labels are allocated differently from what was previously configured (I have tried several different types of install and they all end in the same problem).

From /var/log/system.log:
Code:
Sep 21 20:37:57 nas4free kernel: ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
Sep 21 20:37:57 nas4free kernel: ada0: <SAMSUNG HD203WI 1AN10002> ATA-8 SATA 2.x device
Sep 21 20:37:57 nas4free kernel: ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Sep 21 20:37:57 nas4free kernel: ada0: Command Queueing enabled
Sep 21 20:37:57 nas4free kernel: ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Sep 21 20:37:57 nas4free kernel: ada0: Previously was known as ad4
Sep 21 20:37:57 nas4free kernel: ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
Sep 21 20:37:57 nas4free kernel: ada1: <SAMSUNG HD203WI 1AN10002> ATA-8 SATA 2.x device
Sep 21 20:37:57 nas4free kernel: ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Sep 21 20:37:57 nas4free kernel: ada1: Command Queueing enabled
Sep 21 20:37:57 nas4free kernel: ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Sep 21 20:37:57 nas4free kernel: ada1: Previously was known as ad6
Sep 21 20:37:57 nas4free kernel: ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
Sep 21 20:37:57 nas4free kernel: ada2: <SAMSUNG HD203WI 1AN10002> ATA-8 SATA 2.x device
Sep 21 20:37:57 nas4free kernel: ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Sep 21 20:37:57 nas4free kernel: ada2: Command Queueing enabled
Sep 21 20:37:57 nas4free kernel: ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Sep 21 20:37:57 nas4free kernel: ada2: Previously was known as ad8
Sep 21 20:37:57 nas4free kernel: ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
Sep 21 20:37:57 nas4free kernel: ada3: <SAMSUNG HD203WI 1AN10002> ATA-8 SATA 2.x device
Sep 21 20:37:57 nas4free kernel: ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Sep 21 20:37:57 nas4free kernel: ada3: Command Queueing enabled
Sep 21 20:37:57 nas4free kernel: ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Sep 21 20:37:57 nas4free kernel: ada3: Previously was known as ad10

...

on import:

Code:
Sep 21 21:14:16 nas4free kernel: ZFS WARNING: Unable to attach to ada1.
Sep 21 21:14:17 nas4free kernel: ZFS WARNING: Unable to attach to ada2.
Sep 21 21:14:17 nas4free kernel: ZFS WARNING: Unable to attach to ada3.

I think this has happened because I previously had a SATA cable failure, which forced me to do a replace and scrub of the pool (hence the labels ad4, ad6, ad8 and ad10 above instead of ad0, ad1, ad2 and ad3). I assume the pool must store this information, but that is just my guess.
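
For what it is worth, this is roughly how one can compare what the pool has recorded on each disk with what the system currently presents. I am assuming the zdb and glabel shipped with NAS4Free behave like the stock FreeBSD tools; the device name below is just one of my pool members.
Code:
# print the ZFS label stored on a member disk (pool name, vdev GUIDs, recorded device path)
zdb -l /dev/ada1
# list the GEOM labels/providers the running system currently knows about
glabel status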

Either way, I am stuck and cannot figure out how to import my zpool. Reading up on related topics, I think something along the lines of this post may be required: http://forums.freebsd.org/showthread.php?t=28181

I would greatly appreciate help on this matter.

Thanks,
 
Thank you for the replies. Here is output from graid:

Code:
~# graid status
   Name      Status  Components
raid/r0  SUBOPTIMAL  ada0 (ACTIVE (STALE))
                     ada1 (ACTIVE (STALE))
                     ada2 (ACTIVE (STALE))
                     ada3 (ACTIVE (STALE))
 
I have tried to stop it, but the array is not found (even with the force option):

Code:
nas4free:~# graid stop -fv r0
graid: Array 'r0' not found.
nas4free:~# graid stop -fv raid/r0
graid: Array 'raid/r0' not found.
nas4free:~#
 
This should give you the name of the array (in the first column) that you can use with graid stop:
# graid status -g
 
Thanks. I think everything has worked OK! (I had to force the import.)
Code:
nas4free:~# graid status -g
            Name      Status  Components
SiI-100515180954  SUBOPTIMAL  ada0 (ACTIVE (STALE))
                              ada1 (ACTIVE (STALE))
                              ada2 (ACTIVE (STALE))
                              ada3 (ACTIVE (STALE))
nas4free:~# graid stop SiI-100515180954
nas4free:~# zpool import -f  media
nas4free:~# zpool status
  pool: media
 state: ONLINE
  scan: resilvered 6K in 0h0m with 0 errors on Sun May 20 23:51:09 2012
config:

        NAME        STATE     READ WRITE CKSUM
        media       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0

errors: No known data errors

Let me know if there are any concerns with the above.

Furthermore, what is best practice here? Should I be labelling the disks? I want to know how to improve the reliability and maintainability of this setup (a sketch of what I mean by labelling is below).
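
By labelling I mean giving each disk a GPT label and building the pool from those labels instead of the raw adaX names, so that enumeration order stops mattering. A rough sketch for a single fresh disk follows; gpt/media0 is just an example name, I am assuming a rebuild from backup, and the gpart commands destroy whatever is on the disk, so this is not something to run against the live pool.
Code:
# one blank disk: create a GPT scheme and a labelled ZFS partition (destroys existing data)
gpart create -s gpt ada0
gpart add -t freebsd-zfs -l media0 ada0
# after doing the same for the other disks, a rebuilt pool references the labels
zpool create media raidz1 gpt/media0 gpt/media1 gpt/media2 gpt/media3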

Many thanks
 
First, back up your data immediately. Then try to get rid of the SiI RAID metadata on the disks, because it might cause similar conflicts in the future. This should do the trick, but it might also destroy ZFS disk labels or metadata, which is why you should make a backup first:

# graid remove SiI-100515180954 ada0

And repeat that for the other disks ada1, ada2 and ada3, as sketched below.
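
Something along these lines should clear the remaining members and let you confirm nothing is left afterwards; double-check the array name against your own graid status -g output first.
Code:
# erase the SiI metadata from the remaining members
graid remove SiI-100515180954 ada1
graid remove SiI-100515180954 ada2
graid remove SiI-100515180954 ada3
# afterwards this should no longer list the SiI-100515180954 array
graid status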
 
Sorry for the delay, and thank you for the advice; it took a while to get hold of backup hard disks and complete the transfers.

The issue was occurring every time I rebooted the server. However, after removing the metadata as you suggested, the zpool is now recognised automatically on system startup.

However, I would still like to know how this could have happened in the first place. I'm still very much a newbie with ZFS; could it be the way I initially set up the pool?
 
At some point, a hardware RAID was created with those drives. Maybe you did that earlier, or the drives might have even come from the manufacturer that way. The metadata from that RAID controller was still on there.

The old RAID metadata is not a problem unless something notices it. Prior to FreeBSD 9.1, that would have required loading graid(8) by hand; now it is part of the GENERIC kernel, so we will be seeing this problem more often.
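
If you ever hit this on a system where wiping the metadata is not an option, my understanding is that graid tasting can also be switched off with a loader tunable so the stale metadata is simply never picked up; verify the tunable name against graid(8) on your release before relying on it.
Code:
# /boot/loader.conf -- prevent graid from tasting on-disk RAID metadata at boot
kern.geom.raid.enable="0"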
 