3ware RAID - one unit INOPERABLE

dvl@

Developer
After a reboot (to upgrade the kernel), one of the units on my 3ware controller is showing as INOPERABLE...

Before I proceed with a fix, I wanted a second opinion. This is what I think I need to do:

Code:
# tw_cli maint deleteunit c0 u2
# tw_cli maint createunit c0 p0 rspare
Make sense?

The current state of the system is:

Code:
FreeBSD supernews.example.org 8.2-STABLE FreeBSD 8.2-STABLE #0: Fri Nov 11 20:08:41 UTC 2011     dvl@supernews.example.org:/usr/obj/usr/src/sys/OPTI  amd64

Code:
# tw_cli info

Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8         8        3       1       4       1      OK       

# tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    RAID-10   INOPERABLE     -       -       64K     195.548   OFF    ON     

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u2     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    02-Sep-2010
 
dvl@ said:
Code:
Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    RAID-10   INOPERABLE     -       -       64K     195.548   OFF    ON

Hm, it looks like the u2 disk was used in a RAID-10 configuration. Strange if u0 wasn't degraded.
I would check the disk itself before adding it again as a spare disk.
 
I've seen similar situations where a disk in a RAID array drops off the bus, the spare kicks in and rebuilds the array, and then the original disk comes back online. The RAID metadata on the disk says it's part of an array ... but the array it's part of is already complete, so the disk is shown as part of an inoperable/incomplete array. Happens quite a bit on our RAID5+spare setups, and is kind of annoying.

You just need to double-check that all disks in the u0 array are actually online and running correctly. If so, then just delete the u2 unit. And then re-add it as a spare or whatever.
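
For what it's worth, the "are all disks in u0 online" check can be scripted against the port table that tw_cli prints. This is only a rough sketch: `unit_port_summary` is a made-up helper name (not part of tw_cli), and it assumes the column layout shown in the first post (port, status, unit).

```shell
#!/bin/sh
# Sketch: count the ports assigned to a unit and how many of them report OK.
# Reads `tw_cli info c0` output on stdin.
# unit_port_summary is a hypothetical helper, not a tw_cli feature.
unit_port_summary() {
    awk -v unit="$1" '
        # Port rows start with p<N>; column 2 is the status, column 3 the unit.
        $1 ~ /^p[0-9]+$/ && $3 == unit {
            total++
            if ($2 == "OK") ok++
        }
        # Prints "<ports in unit> <ports with status OK>".
        END { printf "%d %d\n", total, ok }
    '
}

# Example, on the affected host:
#   tw_cli info c0 | unit_port_summary u0
```

If the two numbers match (and match the number of drives you expect in the array), every member of that unit is reporting OK.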
 
Via the 3dm2 web GUI it's easy enough: just click on the unit to get the overview, then click on each disk to get the smartctl output, which shows it's online and running. Not sure how to do it via the CLI, never had to use it much.
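
On the CLI side, smartmontools can talk to disks behind a 3ware controller with `-d 3ware,N`, where N is the port number. A rough sketch follows, assuming smartmontools is installed and the controller shows up as /dev/twa0 (the usual device node for the twa driver on FreeBSD; check yours). `smart_health_cmds` is a made-up helper that only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: print one smartctl health-check command per controller port.
# Arguments: controller device node, first port, last port.
# smart_health_cmds is a hypothetical helper; /dev/twa0 is an assumption.
smart_health_cmds() {
    dev="$1"
    port="$2"
    last="$3"
    while [ "$port" -le "$last" ]; do
        # -H asks for the overall health status; -d 3ware,N selects the port.
        echo "smartctl -H -d 3ware,$port $dev"
        port=$((port + 1))
    done
}

# Review the commands for ports p2 through p7, then run them as root:
#   smart_health_cmds /dev/twa0 2 7
#   smart_health_cmds /dev/twa0 2 7 | sh
```

Swapping `-H` for `-a` gives the full SMART report for each disk, which is closer to what the web GUI shows.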
 
Ah, your post above shows it, although you use slightly different syntax than I (tw_cli /c0 show). There are 6 drives listed as part of unit u0, all listed with status OK (ports p2 through p7).

If the RAID10 array consists of those 6 drives, then everything is kosher, and you can delete the "extra" u2.
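
In the object-style syntax I use, the delete and re-add would look roughly like the sketch below. Treat it as an assumption to verify against `tw_cli help` on your controller: `/cX/uY del` and `/cX add type=spare disk=N` are how I understand the 3ware CLI guide, and `respare_plan` is a made-up helper that only prints the commands and runs nothing.

```shell
#!/bin/sh
# Sketch: print the commands to delete a stale unit and re-add its disk as a spare.
# Arguments: controller (e.g. c0), stale unit (e.g. u2), port number (e.g. 0).
# respare_plan is a hypothetical helper; the tw_cli syntax is an assumption
# to confirm with `tw_cli help` before running anything.
respare_plan() {
    ctl="$1"
    unit="$2"
    port="$3"
    echo "tw_cli /$ctl/$unit del"
    echo "tw_cli /$ctl add type=spare disk=$port"
}

# For the layout in this thread (u2 on port p0 of c0):
#   respare_plan c0 u2 0
```

Deleting a unit destroys whatever array metadata is on it, so double-check the unit number against the table before running the first command.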
 
Phoenix: Yes, indeed. The RAID10 array is composed of 6 drives.

Thanks y'all
 