Issues with mpr (LSI SAS3008 driver) after upgrade from 12.3 to 12.4 releng

RELENG-12.3 is no longer supported, but I cannot upgrade to RELENG-12.4 because of an issue that appears after the upgrade.

The regression appears when updating from the FreeBSD 12.3 GENERIC kernel to the 12.4 kernel.

The mpr device driver (LSI Fusion controller 3008i) fails to discover the logical drive. If the logical drive is removed and the drives are in JBOD mode, the drives appear normally.

This is NOT an issue in FreeBSD 12.3. After updating to 12.4 the server fails to boot: it discovers no drives, so it has nothing to mount. Rolling back to 12.3-GENERIC resolves the problem.

In FreeBSD 12.4 the controller is discovered:
Code:
mpr0: <Avago Technologies (LSI) SAS3008> port 0xe000-0xe0ff mem 0xfb240000-0xfb24ffff,0xfb200000-0xfb23ffff irq 26 at device 0.0 on pci2
mpr0: Firmware: 10.00.03.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 6985c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR,MSIXIndex,FastPath,RDPQArray>
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x0009> enclosureHandle<0x0001> slot 0
mpr0: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000a> enclosureHandle<0x0001> slot 1
mpr0: At enclosure level 0 and connector name (    )

However, ONLY in 12.3 is the logical drive discovered, and the system boots normally:
Code:
da0 at mpr0 bus 0 scbus0 target 0 lun 0
da0: <LSI Logical Volume 3000> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number 739763315135195629
da0: 150.000MB/s transfers
da0: Command Queueing enabled
da0: 142097MB (291014656 512 byte sectors)

When booting with 12.4 this logical drive is NOT discovered, so the system cannot boot.
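For anyone comparing boot logs from the two kernels, here is a minimal sketch (not something from the original report) that lists which `da` devices attached on an `mpr` controller in one saved dmesg capture but not in another. The function names `attached_disks` and `missing_after_upgrade` are made up for illustration, and the regular expression only assumes the attach-line format shown in the output above.

```python
import re

# Matches CAM attach lines such as "da0 at mpr0 bus 0 scbus0 target 0 lun 0".
ATTACH_RE = re.compile(
    r"^(da\d+) at (mpr\d+) bus \d+ scbus\d+ target \d+ lun \d+", re.M
)


def attached_disks(dmesg_text):
    """Return the set of (disk, controller) pairs attached in a dmesg capture."""
    return set(ATTACH_RE.findall(dmesg_text))


def missing_after_upgrade(before, after):
    """Return disks present in the 'before' capture but absent from 'after'."""
    return sorted(attached_disks(before) - attached_disks(after))


if __name__ == "__main__":
    # Sample lines copied from the outputs in this thread.
    dmesg_123 = (
        "da0 at mpr0 bus 0 scbus0 target 0 lun 0\n"
        "da0: <LSI Logical Volume 3000> Fixed Direct Access SPC-4 SCSI device\n"
    )
    dmesg_124 = "mpr0: At enclosure level 0 and connector name (    )\n"
    print(missing_after_upgrade(dmesg_123, dmesg_124))
```

Feeding it the full `dmesg` output saved from each kernel (for example from a rescue boot) would show at a glance whether only the logical volume is missing or every disk behind the controller.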

Hardware information (commands run on RELENG-12.3):
Code:
root@host:~ # mprutil show all
Adapter:
mpr0 Adapter:
       Board Name: Asus SAS3008
   Board Assembly:
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 8.35.00.00
Firmware Revision: 15.00.04.00
  Integrated RAID: yes
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 53 C

PhyNum  CtlrHandle  DevHandle  Disabled  Speed   Min    Max    Device
0       0001        0009       N         6.0     3.0    12     SAS Initiator
1       0002        000a       N         6.0     3.0    12     SAS Initiator
2       0003        000b       N         6.0     3.0    12     SAS Initiator
3       0004        000c       N         6.0     3.0    12     SAS Initiator
4                              N                 3.0    12     SAS Initiator
5                              N                 3.0    12     SAS Initiator
6                              N                 3.0    12     SAS Initiator
7                              N                 3.0    12     SAS Initiator

Devices:
B____T    SAS Address      Handle  Parent    Device        Speed Enc  Slot  Wdt
00   02   4433221100000000 0009    0001      SATA Target   6.0   0001 00    1
00   03   4433221101000000 000a    0002      SATA Target   6.0   0001 01    1
00   04   4433221102000000 000b    0003      SATA Target   6.0   0001 02    1
00   05   4433221103000000 000c    0004      SATA Target   6.0   0001 03    1

Enclosures:
Slots      Logical ID     SEPHandle  EncHandle    Type
  08    500112f950000f70               0001     Direct Attached SGPIO

Expanders:
NumPhys   SAS Address     DevHandle   Parent  EncHandle  SAS Level

Code:
root@host:~ # dmesg | grep mpr0
mpr0: <Avago Technologies (LSI) SAS3008> port 0x6000-0x60ff mem 0x94240000-0x9424ffff,0x94200000-0x9423ffff irq 16 at device 0.0 on pci1
mpr0: Firmware: 15.00.04.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 6985c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR,MSIXIndex,FastPath,RDPQArray>
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x0009> enclosureHandle<0x0001> slot 0
mpr0: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000a> enclosureHandle<0x0001> slot 1
mpr0: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000b> enclosureHandle<0x0001> slot 2
mpr0: At enclosure level 0 and connector name (    )
mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000c> enclosureHandle<0x0001> slot 3
mpr0: At enclosure level 0 and connector name (    )
da0 at mpr0 bus 0 scbus0 target 0 lun 0
da1 at mpr0 bus 0 scbus0 target 1 lun 0

Code:
root@host:~ # dmesg | grep da0
Trying to mount root from ufs:/dev/da0p2 [rw]...
da0 at mpr0 bus 0 scbus0 target 0 lun 0
da0: <LSI Logical Volume 3000> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number 3978164406233064500
da0: 150.000MB/s transfers
da0: Command Queueing enabled
da0: 456809MB (935544832 512 byte sectors)
 
I'm assuming that by 'releng' you mean -RELEASE?

I'd try upgrading the controller firmware first. IIRC 15.xxx was rather buggy...

I'm running several SAS3008 HBAs in different systems. At least 2 of them were updated all the way from 11.x to 12.4-RELEASE (or even to the 13 branch) without such issues.
Also if you are using ZFS *DON'T* use logical drives. Flash the 'IT' firmware image and run the controller as a simple HBA.

Code:
# uname -a
FreeBSD srv1 12.4-RELEASE FreeBSD 12.4-RELEASE r372781 GENERIC  amd64
# mprutil show adapter
mpr0 Adapter:
       Board Name: SAS9300-8i
   Board Assembly: 
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 8.37.00.00
Firmware Revision: 16.00.01.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 52 C
 
This looks weird

mpr0: Firmware: 10.00.03.00, Driver: 23.00.00.00-fbsd
root@host:~ # mprutil show all
Firmware Revision: 15.00.04.00

root@host:~ # dmesg | grep mpr0
mpr0: Firmware: 15.00.04.00, Driver: 23.00.00.00-fbsd
 
I suspect those are 2 different systems, especially given that the controllers are on different slots (pci1 vs pci2) and IRQs (16 vs 26). So if the 'non-working' system is really on firmware revision 10.xx (which is ancient!), it's more than plausible that this is the cause.
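To make that comparison less error-prone across several hosts, here is a rough sketch that pulls the firmware revision out of the `mprN: Firmware: ..., Driver: ...` line of a saved dmesg capture and compares revisions numerically. The helper names `firmware_by_controller` and `version_key` are hypothetical, and the parsing assumes only the line format quoted above.

```python
import re

# Matches lines such as
# "mpr0: Firmware: 15.00.04.00, Driver: 23.00.00.00-fbsd".
FW_RE = re.compile(r"^(mpr\d+): Firmware: ([\d.]+), Driver: (\S+)", re.M)


def firmware_by_controller(dmesg_text):
    """Map each mpr controller name to the firmware revision it reported."""
    return {dev: fw for dev, fw, _driver in FW_RE.findall(dmesg_text)}


def version_key(firmware):
    """Turn '15.00.04.00' into (15, 0, 4, 0) so revisions compare numerically."""
    return tuple(int(part) for part in firmware.split("."))


if __name__ == "__main__":
    # The two firmware lines quoted in this thread.
    failing = "mpr0: Firmware: 10.00.03.00, Driver: 23.00.00.00-fbsd\n"
    working = "mpr0: Firmware: 15.00.04.00, Driver: 23.00.00.00-fbsd\n"
    old = firmware_by_controller(failing)["mpr0"]
    new = firmware_by_controller(working)["mpr0"]
    if version_key(old) != version_key(new):
        print(f"firmware differs: {old} vs {new}")
```

If the two captures really come from the same machine, the revisions should match; a mismatch like the one above points to two different controllers.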
 
I'm waiting for new hardware so I can move the server. As the next step I will update the firmware, and I'll report the results.
 