FreeBSD 9-RC2 breaks mpt driver when used with VMware and PCIe passthrough?

All,

I currently run FreeBSD 8.2 in a VM under VMware ESXi 5.0. Several of my non-boot drives are attached to an LSI controller embedded on the motherboard, and I pass the entire controller through to the VM using VMDirectPath (PCIe passthrough).

This has worked flawlessly for me. It lets my drives remain ZFS native while I take advantage of VMware to run several other VMs alongside my big "do everything" FreeBSD server.

Last night I upgraded this system to FreeBSD 9.0-RC2. During boot, the system hangs with error messages just after loading em0.

(I will edit this in a minute... will reproduce in another VM and type it in here)

I also tried removing the passthrough device and attaching it to another VM running FreeBSD 9.0-RC1, which I installed from scratch. Same problem.

I used a snapshot to restore my primary system to FreeBSD 8.2 (love having root pool on ZFS!).

Here is my system under 8.2. I can't boot it far enough on 9.0-RC2 to gather much information:
Code:
[root@dante /usr/home/bill]# mptutil -u 1 show adapter
mpt1 Adapter:
       Board Name: UNUSED
   Board Assembly: 
        Chip Name: C1068E
    Chip Revision: UNUSED
      RAID Levels: none
[root@dante /usr/home/bill]# mptutil -u 1 show config 
mpt1 Configuration: 0 volumes, 5 drives
    drive da2 (932G) ONLINE <SAMSUNG HD103SJ 0001> SATA
    drive da3 (932G) ONLINE <SAMSUNG HD103SJ 0001> SATA
    drive da4 (932G) ONLINE <Hitachi HDT72101 A3AA> SATA
    drive da5 (932G) ONLINE <SAMSUNG HD103SJ 0001> SATA
    drive da6 (932G) ONLINE <SAMSUNG HD103SJ 0001> SATA
 
Here is a screen capture of the VM console showing the errors during the "hang". It is not really a hang, but a timeout that never exits, as far as I've been able to test.

[Image: Screen Shot 2011-11-23 at 12.35.14 PM.png]
 
My apologies, wrong size image. Here you go.

[Image: Screen Shot 2011-11-23 at 12.35.14 PM.png]


I'd hate to think that I'm stuck at FreeBSD 8.2, when I'd really like to upgrade to 9.0 when it goes GA.

Hypervisor: ESXi 5.0
PCIe Passthrough of LSI chipset
Works with FreeBSD 8.2, fails with 9.0 RC.
 
I believe I may have solved my own problem.

I found the following on the FreeBSD-stable mailing list, archived at http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063937.html

On Mon, Sep 19, 2011 at 02:45:04PM +0200, Thomas Vogt wrote:
> Hello
>
> I've stability issue with my new intel SASUC8I [1] PCIe controller. It's a LSI 1068e based controller. After a few minutes with disk io (csup or scrub by example) my FreeBSD 8-stable (64bit) is "freezing" for a couple of minutes and I see a lot of error messages like:
>
> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bc3b0:48367 timed out for ccb 0xffffff00050a8000 (req->ccb 0xffffff00050a8000)
> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bbb40:48368 timed out for ccb 0xffffff0004f81800 (req->ccb 0xffffff0004f81800)
> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bc3b0:48367
> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bbb40:48368
> Sep 17 03:10:03 gw kernel: mpt0: Timedout requests already complete. Interrupts may not be functioning
>

If this really is an issue with interrupts not getting delivered you
could try whether disabling MSI/MSI-X by setting hw.pci.enable_msi=0
and hw.pci.enable_msix=0 either on the loader prompt or via loader.conf
works around it.

Marius

I then passed through my LSI chip and drives to a test VM, and booted it with the loader settings to disable MSI interrupts per the above. And it worked!
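One way to confirm that the workaround really removed the timeouts is to count the mpt "timed out" lines in the kernel messages after boot. The following is only a rough sketch: `count_mpt_timeouts` is a helper name I made up, and the pattern is based solely on the message format quoted from the mailing list above, so it may need adjusting for other message variants.

```shell
#!/bin/sh
# count_mpt_timeouts is a hypothetical helper, not a FreeBSD tool.
# It reads kernel log text on stdin and prints how many lines look like
#   "mptN: request ... timed out ..."
# (pattern assumed from the messages quoted in the mailing list post).
count_mpt_timeouts() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'mpt[0-9][0-9]*: request .* timed out' || true
}

# Usage on the FreeBSD guest, assuming the messages are still in the
# kernel message buffer:
#   dmesg | count_mpt_timeouts
```

A result of 0 after some disk I/O (a scrub, for example) would suggest the timeouts are gone, although it does not prove that interrupts are healthy in general.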

I know that MSI interrupts should be more efficient, but I am not sure that it matters much for my purposes. I will test more with MSI interrupts disabled in my primary VM, and if it appears stable and performs acceptably, I'll upgrade my primary VM to 9.0 again.
 
Found a workaround on the FreeBSD list archives.

I added the following to my loader.conf:

Code:
hw.pci.enable_msi=0
hw.pci.enable_msix=0

This works around the problem: my VM can now boot with the LSI passthrough.
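For anyone who wants to script the change instead of editing the file by hand, here is a minimal sketch that sets the two tunables without creating duplicate lines if it is run twice. The helper name `set_loader_tunable` and the `LOADER_CONF` variable are my own inventions for illustration. The script defaults to a scratch file so it is safe to try; point `LOADER_CONF` at /boot/loader.conf (as root) to apply it for real.

```shell
#!/bin/sh
# set_loader_tunable is a hypothetical helper, not a FreeBSD tool.
# It sets key=value in a loader.conf-style file, replacing any
# existing assignment of the same key.
set_loader_tunable() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except an existing assignment of this key.
    if [ -f "$file" ]; then
        grep -v "^${key}=" "$file" > "$tmp" || true
    fi
    # Append the new assignment, then write the result back.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# LOADER_CONF is an assumed variable; it defaults to a scratch file.
LOADER_CONF=${LOADER_CONF:-./loader.conf.test}
set_loader_tunable "$LOADER_CONF" hw.pci.enable_msi 0
set_loader_tunable "$LOADER_CONF" hw.pci.enable_msix 0
```

The settings only take effect at the next boot, since the loader reads them before the kernel starts.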

Question for a driver or kernel developer: Why has the behavior of the driver changed between FreeBSD 8.2 and 9.0 RC2?
 