bhyve Patch: allow any MM BARs passthru + without interrupts on bhyve

When I tried to pass through simple CompactPCI (basically PCI) cards based on the PLX9030 and PLX9050 chips, which have:
  • memory-mapped (MM) BARs only
  • no interrupts (and thus no MSI/MSI-X support)
I ran into unexpected fatal problems with bhyve (FreeBSD 14.3-RELEASE):
  • it does not allow MM BARs whose size is not a multiple of the page size (4096 bytes)
  • it does not allow read/write access to MM BARs at all (!) - only I/O BARs are implemented
  • it does not allow a card without MSI/MSI-X, even though a card that has NO interrupts at all does not need it
As a workaround I created an experimental patch that:
  1. allows MM BARs of any size (removes the constraint that the size must be a multiple of the page size)
  2. allows reads and writes to MM BARs (even though this is slow, because I don't map them to memory)
  3. allows using a card that has no MSI/MSI-X (because our cards have no interrupts at all)
The patch is simple and crude but works for me (see attachment). I'm interested in what a proper solution would be, so that such cards work with the official FreeBSD bhyve release.

Here are the details of one such card with a PLX9050 chip:
Bash:
$ pciconf -lb ppt1@pci0:6:14:0

ppt1@pci0:6:14:0:       class=0xff0000 rev=0x02 hdr=0x00 vendor=0x10b5 device=0x9050 subvendor=0x1761 subdevice=0x00a4
    bar   [10] = type Memory, range 32, base 0xf7c01000, size 128, disabled
    bar   [18] = type Memory, range 32, base 0xf7c00000, size 2048, disabled

NOTE: The card has no interrupts, so the absence of MSI(-X) should be fine.

My `vm-bhyve` config contains:
Bash:
# passthru cards
passthru0="6/10/0=11:0"
passthru1="6/14/0=12:0"
wired_memory="yes"
debug="yes"
 


Addendum - my claim:
* it does not allow read/write access to MM BARs at all (!) - only I/O BARs are implemented
is incorrect.

It is actually a side effect of a failing call to vm_map_pptdev_mmio. In that case the fallback is passthru_write/read, which correctly fails with an assertion because the BAR type is not I/O.

So I have to find out why vm_map_pptdev_mmio fails in my case, which should also resolve this assertion:
Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_read, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 1211.
 
I made a 2nd patch that now uses mapped memory for MEM BARs, so the passthru_write/read callbacks are again used only for I/O BARs. It appears to work, with the exception of one specific BAR. I suspect this is caused by a strange condition in the kernel vmm.ko module.

I show only the 2nd card here, the one that causes issues:
Bash:
$ pciconf -lb ppt1

ppt1@pci0:6:14:0:       class=0xff0000 rev=0x02 hdr=0x00 vendor=0x10b5 device=0x9050 subvendor=0x1761 subdevice=0x00a4
    bar   [10] = type Memory, range 32, base 0xf7c01000, size 128, disabled
    bar   [18] = type Memory, range 32, base 0xf7c00000, size 2048, disabled

Note that BAR2 (register 0x18) uses a lower base address than BAR0 (register 0x10), and this somehow seems to cause issues in one condition. Here is the dmesg output with my patch:
Code:
# kldload ./modules/usr/src/sys/modules/vmm/vmm.ko

ppt0 port 0xe000-0xe07f mem 0xf7c02000-0xf7c027ff at device 10.0 on pci6
ppt0: 6/10/0 OK: attached HPv7           
ppt1 mem 0xf7c01000-0xf7c0107f,0xf7c00000-0xf7c007ff at device 14.0 on pci6
ppt1: 6/14/0 OK: attached HPv7

# vm start -f VM_NAME
tap0: Ethernet address: 58:9c:fc:10:ff:98
tap0: promiscuous mode enabled
tap0: link state changed to UP                                                                                         
vm-default: link state changed to UP                                                                                   
ppt1: 6/14/0 BAR0 HPv7: ppt_valid_bar_mapping:497 FALSE hpa=0xf7c00000 base=0xf7c01000 len=0x1000 size=0x80 max_size=0x1000 hpa(0xf7c00000) >= base(0xf7c01000): FALSE hpa+len (f7c01000) <= base+max_size(f7c02000): TRUE
HPv7: ppt_map_mmio:516 6/14/0 ERR PAGE_SIZE len=0x1000 hpa=0xf7c01000 gpa=0xc0008800
HPv7: ppt_map_mmio:516 6/14/0 ERR PAGE_SIZE len=0x1000 hpa=0xf7c01000 gpa=0xc0008800
ppt1: 6/14/0 BAR0 HPv7: ppt_valid_bar_mapping:497 FALSE hpa=0xf7c00000 base=0xf7c01000 len=0x1000 size=0x80 max_size=0x1000 hpa(0xf7c00000) >= base(0xf7c01000): FALSE hpa+len (f7c01000) <= base+max_size(f7c02000): TRUE
# VM shutdown...
tap0: link state changed to DOWN
vm-default: link state changed to DOWN

And here is bhyve.log when such a card is actively used (my app uses only BAR2 in normal operation, so it appears to work by chance):
Code:
bhyve: WARN: passthru device 6/14/0 BAR2 aligning size 0x800 to 0x1000 (page size: 0x1000)
bhyve: WARN: passthru device 6/14/0 BAR0 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: ERROR: passthru_mmio_addr:1317 pci_passthru: device 6/14/0 BAR0 map_pptdev_mmio failed
wrmsr to register 0x140(0) on vcpu 0
wrmsr to register 0x140(0) on vcpu 1
bhyve: WARN: passthru device 6/14/0 BAR0 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: ERROR: passthru_mmio_addr:1304 pci_passthru: device 6/14/0 BAR0 unmap_pptdev_mmio failed
bhyve: WARN: passthru device 6/14/0 BAR2 aligning size 0x800 to 0x1000 (page size: 0x1000)
bhyve: WARN: passthru device 6/14/0 BAR0 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: ERROR: passthru_mmio_addr:1317 pci_passthru: device 6/14/0 BAR0 map_pptdev_mmio failed
bhyve: WARN: passthru device 6/14/0 BAR2 aligning size 0x800 to 0x1000 (page size: 0x1000)

What puzzles me is the detail of the condition failure:
Code:
ppt1: 6/14/0 BAR0 HPv7: ppt_valid_bar_mapping:497 FALSE hpa=0xf7c00000 base=0xf7c01000 len=0x1000 size=0x80 max_size=0x1000 hpa(0xf7c00000) >= base(0xf7c01000): FALSE hpa+len (f7c01000) <= base+max_size(f7c02000): TRUE

corresponding to this snippet in the kernel:
C:
// /usr/src/sys/amd64/vmm/io/ppt.c
ppt_valid_bar_mapping(struct pptdev *ppt, vm_paddr_t hpa, size_t len)
{
    // ...
    if (hpa >= base && hpa + len <= base + max_size)
        return (true);
    else
        // ERROR
    // ...
}

In my case hpa=0xf7c00000 and base=0xf7c01000, so the condition hpa >= base is false.

Can anybody with better knowledge of the FreeBSD kernel tell me why this condition fails?
 


What version of FreeBSD are you hacking on?
That's not what the actual code of ppt_valid_bar_mapping looks like.
It iterates over all BARs and fails only if none of them covers (hpa, hpa + len).
 
My patches are against the official 14.3-RELEASE:
Bash:
$ freebsd-version -u

14.3-RELEASE

$ freebsd-version -k

14.3-RELEASE

$ dmesg | grep 'GENERIC amd64'

FreeBSD 14.3-RELEASE releng/14.3-n271432-8c9ce319fef7 GENERIC amd64

You can verify the original ppt_valid_bar_mapping code here: https://github.com/freebsd/freebsd-src/blob/release/14.3.0/sys/amd64/vmm/io/ppt.c#L468
Here is the full listing of ppt_valid_bar_mapping in 14.3-RELEASE:
C:
static bool
ppt_valid_bar_mapping(struct pptdev *ppt, vm_paddr_t hpa, size_t len)
{
    struct pci_map *pm;
    pci_addr_t base, size;

    for (pm = pci_first_bar(ppt->dev); pm != NULL; pm = pci_next_bar(pm)) {
        if (!PCI_BAR_MEM(pm->pm_value))
            continue;
        base = pm->pm_value & PCIM_BAR_MEM_BASE;
        size = (pci_addr_t)1 << pm->pm_size;
        if (hpa >= base && hpa + len <= base + size)
            return (true);
    }
    return (false);
}
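
To see what this loop does with the values from the logs above, here is a minimal userland sketch of the same containment test. It is not kernel code: `bar_model`, `model_valid_bar_mapping` and `ppt1_bars` are hypothetical names made up for illustration, and the BAR bases and sizes are taken from the `pciconf -lb ppt1` output earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pci_map; not the real type. */
struct bar_model {
    uint64_t base;      /* host physical base address of the BAR */
    unsigned size_log2; /* log2 of the BAR size, like pm_size */
};

/*
 * Same containment test as the 14.3-RELEASE loop: returns true if at
 * least one BAR fully covers the range [hpa, hpa + len).
 */
static bool
model_valid_bar_mapping(const struct bar_model *bars, size_t nbars,
    uint64_t hpa, uint64_t len)
{
    assert(bars != NULL || nbars == 0);
    for (size_t i = 0; i < nbars; i++) {
        uint64_t base = bars[i].base;
        uint64_t size = (uint64_t)1 << bars[i].size_log2;

        if (hpa >= base && hpa + len <= base + size)
            return (true);
    }
    return (false);
}

/* BARs of ppt1 (6/14/0): 128 bytes = 2^7, 2048 bytes = 2^11. */
static const struct bar_model ppt1_bars[] = {
    { 0xf7c01000, 7 },  /* BAR0, register 0x10 */
    { 0xf7c00000, 11 }, /* BAR2, register 0x18 */
};
```

In this model a request with hpa=0xf7c00000 and len=0x800 is covered by the 2048-byte BAR, so a FALSE result for the other BAR in the same pass is expected. The same request with len=0x1000 (the size after rounding up to a page) is covered by neither BAR, because the real BAR size is smaller than one page.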
 
I'm getting closer to the last problem (present even with my patch): the BAR0 mapping fails because the address is somehow not page aligned (but why?).

On the host side I see that both BARs are properly aligned to the page size:
Bash:
$ pciconf -lb ppt0

ppt0@pci0:6:11:0:       class=0xff0000 rev=0x02 hdr=0x00 vendor=0x10b5 device=0x9050 subvendor=0x1761 subdevice=0x00a4
    bar   [10] = type Memory, range 32, base 0xf7c02000, size 128, disabled
    bar   [18] = type Memory, range 32, base 0xf7c01000, size 2048, disabled

But already inside /usr/src/usr.sbin/bhyve/pci_passthru.c I see:
Code:
bhyve: ERROR: passthru_mmio_addr:1317 pci_passthru: device 6/11/0 BAR0 at 0xc0008800 (size: 0x1000) map_pptdev_mmio failed
Note that the BAR0 address inside passthru_mmio_addr() is no longer page aligned (the remainder is 0x800 relative to the required 0x1000 alignment).

This results in a failure of the kernel validation in ppt_map_mmio():
Code:
ERROR: EINVAL HPv8: ppt_map_mmio:524 6/11/0 if() PAGE_SIZE len=0x1000 hpa=0xf7c02000 gpa=0xc0008800 hpa+len=0xf7c03000 gpa+len=0xc0009800 hpa_align=0 gpa_align=0x800
The condition inside ppt_map_mmio() is:
C:
if (len % PAGE_SIZE != 0 || len == 0 || gpa % PAGE_SIZE != 0 ||
            hpa % PAGE_SIZE != 0 || gpa + len < gpa || hpa + len < hpa)
    // ...

Here gpa % PAGE_SIZE != 0 is true (the remainder is 0x800 instead of 0), so the request is rejected with EINVAL.
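
The check can be replayed in userland with the values from the log line above. This is only a sketch of the quoted condition, not the kernel function: `MODEL_PAGE_SIZE` and `model_mmio_args_invalid` are hypothetical names, and the page size is assumed to be 4096 bytes as on amd64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 0x1000u /* assumed 4 KiB page, as on amd64 */

static_assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0,
    "page size must be a power of two");

/*
 * Mirrors the quoted condition: returns true when the arguments would
 * be rejected (length or addresses not page aligned, zero length, or
 * address wrap-around).
 */
static bool
model_mmio_args_invalid(uint64_t gpa, uint64_t len, uint64_t hpa)
{
    return (len % MODEL_PAGE_SIZE != 0 || len == 0 ||
        gpa % MODEL_PAGE_SIZE != 0 || hpa % MODEL_PAGE_SIZE != 0 ||
        gpa + len < gpa || hpa + len < hpa);
}
```

Calling `model_mmio_args_invalid(0xc0008800, 0x1000, 0xf7c02000)` returns true solely because of the gpa term; with a page-aligned guest address such as 0xc0009000 the same length and hpa pass the check.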
 


Last mystery resolved - I forgot to also hack /usr/src/usr.sbin/bhyve/pci_emul.c, which aligns the MEM BAR address to the BAR size (which in my case is NOT a multiple of the page size). Here is the patch that solves this problem for me:
C:
// /usr/src/usr.sbin/bhyve/pci_emul.c
static int
pci_emul_alloc_resource(uint64_t *baseptr, uint64_t limit, uint64_t size,
                        uint64_t *addr)
{
        uint64_t base;
// ADDED
        if ((size & PAGE_MASK) != 0){
                uint64_t aligned_size = (size + PAGE_MASK) & ~ PAGE_MASK;
                warnx("WARN: HPv9 %s:%d aligning size=%#lx to %#lx (page_size=%#x)\n",
                                __func__, __LINE__, size, aligned_size, PAGE_SIZE);
                size = aligned_size;
        }
// END of ADDED
        assert((size & (size - 1)) == 0);       /* must be a power of 2 */

        base = roundup2(*baseptr, size);
// ...
}
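
The effect of rounding the size up before the base is aligned can be shown with a small standalone sketch. It only models the two roundup steps; `model_roundup2` and `model_bar_gpa` are hypothetical names, and the allocator cursor value 0xc0008800 is an assumption inferred from the logs (a 0x800-byte BAR placed at 0xc0008000 immediately before BAR0).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 0x1000u /* assumed 4 KiB page, as on amd64 */

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t
model_roundup2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((x + align - 1) & ~(align - 1));
}

/*
 * Guest address chosen for a BAR of the given size when the allocator
 * cursor is at 'cursor'. With pad_to_page set, the size is first
 * rounded up to a whole page, as in the added block of the patch.
 */
static uint64_t
model_bar_gpa(uint64_t cursor, uint64_t size, int pad_to_page)
{
    if (pad_to_page)
        size = model_roundup2(size, MODEL_PAGE_SIZE);
    return (model_roundup2(cursor, size));
}
```

With the assumed cursor, `model_bar_gpa(0xc0008800, 0x80, 0)` stays at 0xc0008800, which is aligned to the BAR size but not to a page, while `model_bar_gpa(0xc0008800, 0x80, 1)` yields 0xc0009000, matching the guest address that appears once the size is padded.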

Finally, when I apply my complete attached patch, everything works properly.
The guest sees all MEM BARs aligned to 4KB:
Bash:
arch-linux$ lspci -vd 10b5:9050
00:0b.0 Unassigned class [ff00]: PLX Technology, Inc. PCI <-> IOBus Bridge (rev 02)
        Subsystem: Pickering Interfaces Ltd Device 00a4
        Flags: bus master, medium devsel, latency 0
        Memory at c0009000 (32-bit, non-prefetchable) [size=128]
        Memory at c0008000 (32-bit, non-prefetchable) [size=2K]

And bhyve.log now contains no errors - just the warnings from my debug code:
Code:
bhyve: WARN: HPv9 pci_emul_alloc_resource:612 aligning size=0x800 to 0x1000 (page_size=0x1000)

bhyve: WARN: passthru device 6/11/0 BAR2 at 0xc0008000 aligning size 0x800 to 0x1000 (page size: 0x1000)
bhyve: WARN: HPv9 pci_emul_alloc_resource:612 aligning size=0x80 to 0x1000 (page_size=0x1000)

bhyve: WARN: passthru device 6/11/0 BAR0 at 0xc0009000 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: WARN: HPv9 pci_emul_alloc_resource:612 aligning size=0x40 to 0x1000 (page_size=0x1000)

wrmsr to register 0x140(0) on vcpu 0
wrmsr to register 0x140(0) on vcpu 1
bhyve: WARN: passthru device 6/11/0 BAR0 at 0xc0009000 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: WARN: passthru device 6/11/0 BAR2 at 0xc0008000 aligning size 0x800 to 0x1000 (page size: 0x1000)
bhyve: WARN: passthru device 6/11/0 BAR0 at 0xc0009000 aligning size 0x80 to 0x1000 (page size: 0x1000)
bhyve: WARN: passthru device 6/11/0 BAR2 at 0xc0008000 aligning size 0x800 to 0x1000 (page size: 0x1000)

Now the tricky part - how to clean up this patch so it can be included in official bhyve?

To summarize, this patch fixes the following limitations in bhyve:
  1. Support for MEM BARs that are smaller than the page size (4096 bytes on PC)
  2. Support for PCI cards without interrupts - in that case no MSI(-X) is needed at all
 
