IBM ServeRaid M1015 and FreeBSD

thethirdnut · Jan 18, 2013

No problem - good luck.

Keep in mind, as I said, that quite a few people have had trouble finding the right combination of tools for the flashing. This worked for me, so hopefully it helps you as well.

Migelo · Jan 18, 2013

Did you use the megarec command there because you have a UEFI system?

I've tried on two PCs with the bootable DOS USB stick and I still can't get the sas2flsh utility to recognise my card...

Migelo · Jan 18, 2013

Turns out those megarec commands are crucial xD

The card works flawlessly under Windows 7 now. I just need to put those drivers in /boot/kernel and I'm (hopefully) done.

And if someone is looking for the megarec file, check out this blog post, under "How to cross flash".

thethirdnut · Jan 18, 2013

Good to hear you got it working.

m6tt · Jan 19, 2013

The built-in drivers are fine; don't use LSI's drivers for this card. If you have any problems flashing the card, try a different machine (or, failing that, an older or newer BIOS); it's quite picky. My desktop worked until a recent BIOS update, so I had to use an old machine to flash it this time.

Migelo · Jan 19, 2013

It's working now, with great performance, so I'm not going to change anything.

Boeri · Mar 18, 2013

I'm also using the IBM M1015 card (flashed in IT mode) with the latest mpslsi driver and firmware from LSI on FreeBSD 9.1.

Code:

dev.mpslsi.0.%desc: LSI SAS2008
dev.mpslsi.0.%driver: mpslsi
dev.mpslsi.0.%location: slot=0 function=0
dev.mpslsi.0.%pnpinfo: vendor=0x1000 device=0x0072 subvendor=0x1000 subdevice=0x3020 class=0x010700
dev.mpslsi.0.%parent: pci1
dev.mpslsi.0.debug_level: 4
dev.mpslsi.0.disable_msix: 0
dev.mpslsi.0.disable_msi: 0
dev.mpslsi.0.firmware_version: 15.00.00.00
dev.mpslsi.0.driver_version: 15.00.00.00
dev.mpslsi.0.io_cmds_active: 10
dev.mpslsi.0.io_cmds_highwater: 291
dev.mpslsi.0.chain_free: 2048
dev.mpslsi.0.chain_free_lowwater: 2014
dev.mpslsi.0.max_chains: 2048
dev.mpslsi.0.chain_alloc_fail: 0

One of the drives in my striped mirror zpool configuration is bad, and this freezes the whole ZFS pool.

Code:

mpslsi0: mpssas_scsiio_timeout checking sc 0xffffff8000862000 cm 0xffffff80008c1f48
mpslsi0: mpssas_alloc_tm freezing simq
mpslsi0: timedout cm 0xffffff80008c1f48 allocated tm 0xffffff8000879908
mpslsi0: mpssas_scsiio_timeout checking sc 0xffffff8000862000 cm 0xffffff800089d708
mpslsi0: queued timedout cm 0xffffff800089d708 for processing by tm 0xffffff8000879908
mpslsi0: mpssas_scsiio_timeout checking sc 0xffffff8000862000 cm 0xffffff80008a0670
mpslsi0: mpssas_free_tm releasing simq
mpslsi0: mpssas_alloc_tm freezing simq
mpslsi0: timedout cm 0xffffff80008a0670 allocated tm 0xffffff8000879a50
mpslsi0: mpssas_free_tm releasing simq

Code:

048 scsi 0 state c xfer(noperiph:mpslsi0:0:3:0): SMID 15 abort TaskMID 89 status 0x0 code 0x0 count 1
(da3:mpslsi0:0:3:0): WRITE(10). CDB: 2a 0 2 d4 40 80 0 1 0 0
(da3:mpslsi0:0:3:0): CAM status: Command timeout
(da3:mpslsi0:0:3:0): Retrying command
(da3:mpslsi0:0:3:0): WRITE(10). CDB: 2a 0 2 d4 41 80 0 1 0 0 length 131072 SMID 325 command timeout cm 0xffffff800088f068 ccb 0xfffffe0008f74000
(da3:mpslsi0:0:3:0): WRITE(10). CDB: 2a 0 2 d4 41 80 0 1 0 0 length 131072 SMID 325 completed timedout cm 0xffffff800088f068 ccb 0xfffffe0008f74000 during recovery ioc 8048 scsi 0 state c xfe(noperiph:mpslsi0:0:3:0): SMID 16 abort TaskMID 325 status 0x0 code 0x0 count 1
(da3:mpslsi0:0:3:0): WRITE(10). CDB: 2a 0 2 d4 41 80 0 1 0 0
(da3:mpslsi0:0:3:0): CAM status: Command timeout
(da3:mpslsi0:0:3:0): Retrying command

I don't want ZFS to freeze when one of the disks in a striped mirror configuration misbehaves. I had hoped that the ZFS code and the mpslsi driver would drop the disk, but that does not seem to be the case.

This was the status of the pool a few seconds before the LSI driver "crashed" and got stuck.

Code:

root@freebsd-san:/root # zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 0h10m with 0 errors on Mon Mar 18 09:51:52 2013
config:

        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            label/disk1  ONLINE       0     0     0
            label/disk2  ONLINE       0     0     3
          mirror-1       ONLINE       0     0     0
            label/disk3  ONLINE       0     0     0
            label/disk4  ONLINE       0     0     0
        cache
          label/disk5    ONLINE       0     0     0

errors: No known data errors
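For anyone who wants to catch this kind of thing before the pool hangs, here is a minimal sketch that scans `zpool status` text for devices whose READ/WRITE/CKSUM counters are not all zero. The function name `devices_with_errors` and the `SAMPLE` string are made up for illustration; the sample is cut down from the output above, and the parser assumes the usual five-column config layout.

```python
def devices_with_errors(status_text):
    """Return {device: (read, write, cksum)} for rows with a non-zero counter.

    Counters are kept as strings because zpool may abbreviate large
    values (for example with a K suffix).
    """
    result = {}
    in_config = False
    for line in status_text.splitlines():
        if line.startswith("errors:"):
            break  # end of the config table
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # header row of the config table
            continue
        if not in_config or len(fields) < 5:
            continue  # skips blank lines and section labels like "cache"
        name = fields[0]
        counters = fields[2:5]
        # Only accept rows whose three counter columns look numeric.
        if not all(c[0].isdigit() for c in counters):
            continue
        if any(c != "0" for c in counters):
            result[name] = tuple(counters)
    return result


# Hypothetical sample, shortened from the zpool status output above.
SAMPLE = """\
config:

        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            label/disk1  ONLINE       0     0     0
            label/disk2  ONLINE       0     0     3
errors: No known data errors
"""

print(devices_with_errors(SAMPLE))
```

On the sample this reports only label/disk2, the disk with the 3 checksum errors.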

Code:

root@freebsd-san:/root # camcontrol stop da3
Error received from stop unit command

Every zfs command seems to be stuck in a blocked state, probably due to the "crashed" LSI driver.

Any ideas? I can try the "normal" mps driver, but that is based on phase 14 if I'm correct. My firmware is phase 15; I hope this is not an issue...

Edit: After a long time, the zfs command succeeded.

Code:

(da3:mpslsi0:0:3:0): WRITE(10). CDB: 2a 0 2 d4 4f c5 0 1 0 0
(da3:mpslsi0:0:3:0): CAM status: Command timeout
(da3:mpslsi0:0:3:0): Error 5, Periph was invalidated
(da3:mpslsi0:0:3:0): oustanding 0
(da3:mpslsi0:0:3:0): removing device entry

Code:

root@freebsd-san:/root # zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub in progress since Mon Mar 18 10:52:43 2013
        2.23G scanned out of 35.3G at 18.3M/s, 0h30m to go
        0 repaired, 6.33% done
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            label/disk1           ONLINE       0     0     0
            11585224050511345959  REMOVED      0     0     0  was /dev/label/disk2
          mirror-1                ONLINE       0     0     0
            label/disk3           ONLINE       0     0     0
            label/disk4           ONLINE       0     0     0
        cache
          label/disk5             ONLINE       0     0     0

errors: No known data errors

This took longer than 5 minutes. With a 60-second timeout (kern.cam.da.default_timeout: 60) and 4 retries, is that possible?
Or does the mpslsi driver handle timeouts differently?
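As a rough sanity check on the numbers, here is the arithmetic as a small sketch. It assumes each attempt runs for the full timeout and that the retries happen one after another; whether the driver actually behaves that way is exactly the open question. The function name `worst_case_seconds` is made up for illustration.

```python
def worst_case_seconds(timeout_s, retry_count):
    """Upper bound for a single command: first attempt plus each retry,
    every one of them running into the full timeout (an assumption)."""
    return timeout_s * (1 + retry_count)


# 60 s timeout (kern.cam.da.default_timeout) and 4 retries, as in the post.
per_command = worst_case_seconds(60, 4)
print(per_command, "seconds =", per_command / 60, "minutes")
```

Under those assumptions a single command can already take 300 seconds, so 5 minutes would be the floor rather than the ceiling. If several outstanding commands to the same disk each had to go through that cycle, the total could be longer still, but that part is a guess.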
