LSI 9201-16e - drives not detected

After a failure with an IBM ServeRAID, I bought an LSI 9201-16e HBA. In Windows it is detected as SAS2 2116 Meteor ROC(E), but it doesn't see my hard drives. I tried two different HDDs and mounted the card in two computers. It looks like either the HBA or my SFF-8088-to-SATA cable (bought new) is broken. Any ideas?

I haven't tried it under Unix, but it doesn't matter; there is obviously some hardware problem here. It displays nothing at startup and doesn't respond to Ctrl-C or Ctrl-R. Nothing happens when switching the cable to the other port.
 
In Windows it is detected as SAS2 2116 Meteor ROC(E).
That means that the chip on it isn't completely dead; it's alive enough to put its manufacturer and model on the PCI bus when asked. But ...
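
(On FreeBSD, the equivalent check would be whether the chip still enumerates on the PCI bus; a minimal sketch, assuming the reported device string contains "SAS2":)
Code:
# A live chip should show up here even if its option ROM never runs at boot.
pciconf -lv | grep -B 2 -A 3 -i "SAS2"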

It displays nothing at startup and doesn't respond to Ctrl-C or Ctrl-R.
Displaying nothing on the BIOS screen is very worrisome. The Ctrl-C/Ctrl-R hotkeys can be a bit finicky; you have to press them at the correct time, but with trial and error it can be done. My theory: this card was bricked by installing bad firmware on it. I don't know how to unbrick LSI cards.
 
Thank you, and sorry for the delay. I returned the card to the store. The only replacements they can offer me are a PERC H200E or an H800. As far as I know, the former requires re-flashing (which I'd prefer to avoid, as it ended fatally before), and the latter is useless (it can't work in IT mode).

I'm still looking for an inexpensive HBA.
 
Hi, I have the same problem. On some mainboards there is a controller (or two of them), more or less similar to this one:
Code:
ahci0: <Intel Wellsburg AHCI SATA controller>
ahci0: AHCI v1.30 with 4 6Gbps ports, Port Multiplier not supported
and I would like to have just the same thing again on a PCIe card - but that doesn't exist.

What does exist is either some cheap RAID constructs (and they do not seem to have a particularly good reputation), or the SAS cards from LSI/Avago.

So, recently I decided to give it a try. The first attempt was an elderly Marvell controller that should at least run SATA-II (but with 8 ports). Sadly, the seller declined to sell it to me (no reason given), so this experiment did not really work out.
Next I tried a Dell PERC H200 - that was sent to me a week ago (from some 50 km away), but it hasn't arrived yet. So no practical data there either.
It seems really difficult to actually get hold of these things.

In the meantime I got the idea to look into my own machines. This one is a dedibox (i.e. rented), in fact some elderly Dell rackmount thing - I never looked into the hardware in detail, because it's none of my business and the thing just works. But it shows this:

Code:
mps0: <Avago Technologies (LSI) SAS2008> port 0xfc00-0xfcff mem 0xdf2b0000-0xdf2bffff,0xdf2c0000-0xdf2fffff irq 16 at device 0.0 on pci1
mps0: Firmware: 07.15.08.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>

So this is one of these LSI things. And it is apparently not flashed to IT mode, because
  1. it shows the "IR" capability
  2. the hosting provider offers to run the disks as RAID, and
  3. the installation web-interface indeed offers the choice of RAID0, RAID1 or none.
But then, it just shows the disks in the usual way:
Code:
da1: <ATA WDC WD1003FBYX-1 1V02> Fixed Direct Access SPC-3 SCSI device
da0: <ATA WDC WD1003FBYX-1 1V02> Fixed Direct Access SPC-3 SCSI device

I installed my standard mixed UFS/ZFS layout, I run smartctl, and all my automated testing & monitoring works just like anywhere else:

Code:
# cat /ext/diskstat/smart.2023w36/da0
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital RE4
Device Model:     WDC WD1003FBYX-18Y7B0
Add. Product Id:  DELL(tm)
Firmware Version: 01.01V02
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database 7.3/5319
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
...
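
(Something like the following could produce such a weekly per-disk dump; this is only a sketch, and the /ext/diskstat location and the smart.<year>w<week> naming are assumptions read off the path above:)
Code:
#!/bin/sh
# Sketch: weekly SMART snapshot, one file per disk.
dir="/ext/diskstat/smart.$(date +%Yw%V)"    # assumed naming convention
mkdir -p "$dir"
for disk in $(sysctl -n kern.disks); do
    smartctl -a "/dev/$disk" > "$dir/$disk" 2>&1
done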

So why would one need to flash such a thing around? And is it only mine, or do SAS2008 controllers with "IR" firmware generally treat SATA disks in this friendly way?
Indeed, on eBay one can buy such controllers as defective, after the seller tried to flash them and bricked them in the process.
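
(For reference, which firmware a card runs can be checked without flashing anything; a sketch, assuming LSI's sas2flash utility is available:)
Code:
# IR firmware advertises the "IR" flag in IOCCapabilities, IT firmware does not:
dmesg | grep -E 'mps[0-9]+: (Firmware|IOCCapabilities)'
# If LSI's sas2flash utility is installed, it also reports the firmware product:
sas2flash -listall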

The box is a Dell rackmount, and even the disks are Dell-customized (for whatever reason), so probably the controller is also a Dell version...
 
So why would one need to flash such a thing around?
I believe the drives are faster in IT mode: straight passthrough versus packets going through a RAID controller.
Plus, don't you have to make eight RAID0 devices?
The IR devices are generally called MegaRAID. They cost more but have a battery-backed cache.
 
I believe the drives are faster in IT mode: straight passthrough versus packets going through a RAID controller.
Plus, don't you have to make eight RAID0 devices?
No, that's my point: it doesn't look like that. If these were fabricated RAID0 devices, they would have some artificial MegaRAID identity - but these show the native identity of the WD disks, all the disks' internal data (SMART status, error log, temperature graph, SCT capabilities) are present, the extended self-test works, etc.
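
(One way to check this from the OS side - a sketch only, the device name is an example:)
Code:
# A fabricated RAID0 volume would report an LSI/MegaRAID virtual-disk identity;
# a passed-through disk shows its native ATA identity and answers ATA commands:
camcontrol devlist
camcontrol identify da0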

For example, when I look at a disk that is attached via a USB3-to-SATA bridge, it looks like this:
Code:
=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

"SMART Status" is a low-level ATA command, and it does not even get through the USB3 here. But on the SAS2008 are no such problems, all the low-level ATA seems to work. So I don't think it is a RAID0 construct. (But to be certain I would need to move that disk to another controller - which I cannot because it's rented)

The IR devices are generally called MegaRAID. They cost more but have a battery-backed cache.
It is a weird landscape. Some are just "HBA" and do not support RAID5 (only RAID0 and 1), some are MegaRAID and do support RAID5, and then some of those additionally have a battery-protected cache.

These guys have put up an extensive list, and it still seems to be maintained:
 
So, recently I decided to give it a try. The first attempt was an elderly Marvell controller that should at least run SATA-II (but with 8 ports). Sadly, the seller declined to sell it to me (no reason given), so this experiment did not really work out.
Next I tried a Dell PERC H200 - that was sent to me a week ago (from some 50 km away), but it hasn't arrived yet. So no practical data there either.
It seems really difficult to actually get hold of these things.
Here is an update. The postal service finally managed to deliver (I would have been four times faster on foot). Having a look at it, it says "Dell" but nothing more specific. I'm on leave now and have no time, so I just put it into my desktop for a quick check:

Code:
kernel: mps0: <Avago Technologies (LSI) SAS2008> port 0xe000-0xe0ff mem 0xf7d40000-0xf7d4ffff,0xf7d00000-0xf7d3ffff irq 16 at device 0.0 on pci1
kernel: mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
kernel: mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
kernel: da2 at mps0 bus 0 scbus8 target 0 lun 0
kernel: da2: <ATA HP SSD S700 500G 033> Fixed Direct Access SPC-4 SCSI device
kernel: da2: 600.000MB/s transfers
kernel: da2: Command Queueing enabled
kernel: da2: 476940MB (976773168 512 byte sectors)

So, this one is already modded: the IOCCapabilities line no longer shows "IR". It detects the disk, it hotplugs the disk, it reads data and SMART data from the disk. And it is very power-hungry: I measured 73°C on the cooler, and then it got stuck. This will need a fan from an old graphics card... The mains intake shows an additional 9-10 watts. OTOH I have seen some 530 MB/s reading from the disk.
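
(For completeness: a simple way to reproduce such a sequential-read figure on FreeBSD - a sketch, with da2 as in the dmesg above:)
Code:
# Built-in transfer-rate test:
diskinfo -t /dev/da2
# Or a raw streaming read of the first 4 GB:
dd if=/dev/da2 of=/dev/null bs=1m count=4096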
 
I measured 73°C on the cooler
These cards are typically used in 1U servers, where the chassis fans push enough airflow over the cooler. In a regular desktop (or a big 4U box) there's not enough airflow going over them. My old LSI card got all the way up to 95°C this weekend (the air temperature was in the 30°C range). I usually get a small 4 cm fan and screw it on. It's not much, but it's better than no airflow at all.
 
These cards are typically used in 1U servers, where the chassis fans push enough airflow over the cooler. In a regular desktop (or a big 4U box) there's not enough airflow going over them.
The 73° was with the case open, measured with a K-type sensor.

My old LSI card got all the way up to 95°C this weekend
How do you measure that? I'm surprised it survives that.

(the air temperature was in the 30°C range). I usually get a small 4 cm fan and screw it on. It's not much, but it's better than no airflow at all.
That's exactly the plan. However, I'm not so sure I really want these additional 10 watts in the baseline consumption...
 
I usually get a small 4 cm fan and screw it on. It's not much, but it's better than no airflow at all.
Did the same with LSI and smartpqi HBAs, and it really made a difference for me, with the temperature going down 15-20°C (as reported by the smartpqi HBA via its sensors).
 
How do you measure that?
I have a fan controller with temperature sensors in that machine. The incessant beeping when the temperature goes over 90°C has been driving me nuts the past few days. I don't think the sensor is very accurate, though. But I've literally burned my fingers on similar cards in the past; I know they can get very hot.

I'm surprised it survives that.
It should. That said, I'm sure it's not very good to run at this temperature for very long. It will probably shorten its lifespan.
 
So why would one need to flash such a thing around?

The main reason is the safety of your data. With a copy-on-write filesystem on pooled storage, like ZFS or MS Storage Spaces + ReFS, the storage subsystem should have direct access to the hard drives. When the operating system receives a response that a particular write has been committed, it should really have been written to the media. If you have some abstraction layer in between and your rig hangs or reboots unexpectedly, you can end up with inconsistencies in the metadata. In the worst-case scenario, you can lose the entire pool that way.
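
For illustration, with an IT-mode HBA the pool sits directly on the raw disks, so ZFS's writes and cache-flush requests reach the media without a RAID layer in between (a sketch; device and pool names are just examples):
Code:
zpool create tank mirror da0 da1
zpool status tank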

I prefer to kill a controller during reflashing rather than put my data at risk later. I decided to buy a Fujitsu 3116 (LSI 2208) and try flashing again. If that fails, I have found an LSI 9200-8, but at triple the price of the former.
 
The main reason is the safety of your data. With a copy-on-write filesystem on pooled storage, like ZFS or MS Storage Spaces + ReFS, the storage subsystem should have direct access to the hard drives. When the operating system receives a response that a particular write has been committed, it should really have been written to the media. If you have some abstraction layer in between and your rig hangs or reboots unexpectedly, you can end up with inconsistencies in the metadata. In the worst-case scenario, you can lose the entire pool that way.
I see. That is an argument that can be understood - still, I would not agree with it. If we don't trust the controller card to do the right thing, why should we trust the disk drive to do the right thing?
Also, I don't believe in losing a pool. Basically a pool is just a bunch of (rather long) files. It consists of bytes, no magic. One could look into these bytes. With ZFS we have the source. It may indeed look different with the MS stuff; I never touched that, for a reason.
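
(Indeed, the on-disk structures can be inspected directly; a sketch, with "tank" as a placeholder pool name:)
Code:
zdb -l /dev/da0     # dump the vdev labels stored on that disk
zdb -C tank         # print the cached pool configuration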
 