bhyve Bhyve VM stuck when passthru enabled

jbo@

Developer
Hey folks,

I'm running into an issue for a few days now that is absolutely driving me nuts.
A few weeks ago I set up a Windows 11 Pro guest using sysutils/vm-bhyve. Installation and configuration of the VM worked without any problems.
Then I started to passthru a PCIe USB controller card. This also worked without any problems.
However, after a few days, the VM refused to boot. It gets stuck eternally somewhere in the Windows bootloader (when the Windows loader circle appears just after TianoCore finished). At this point there's nothing I can do other than to kill the bhyve process. I even need to kill -9.

After some dicking around I figured out that the VM boots fine 100% of the time if I simply don't pass thru the PCIe USB controller. And once I add the passthru again, it locks up exactly the same.

I'm completely out of ideas. This used to work well for a few days after I initially created the VM and fails consistently ever since then.

Scenario:
  • FreeBSD 14-STABLE host
  • Windows 11 Pro guest
vm-bhyve config:
Code:
loader="uefi"
graphics="yes"
xhci_mouse="yes"

# cpu
cpu=8
cpu_sockets=1
cpu_cores=8
cpu_threads=1

# memory
memory=16G

# AHCI
ahci_device_limit="8"

# networking
network0_type="virtio-net"
network0_switch="public"

# disk
disk0_type="nvme"
disk0_name="disk0.img"

# Windows expects the host to expose localtime by default, not UTC
utctime="no"

# Passthru
#passthru0="6/0/0"    <---- Uncommenting this makes the VM get stuck at Windows boot :(

uuid="151ec8ab-cdb6-11ee-8d6b-000acd2d4844"
network0_mac="xx:xx:xx:xx:xx:xx"  # Redacted for public post

vm passthru
Code:
DEVICE     BHYVE ID     READY        DESCRIPTION
hostb0     0/0/0        No           8th Gen Core Processor Host Bridge/DRAM Registers
pcib1      0/1/0        No           6th-10th Gen Core Processor PCIe Controller (x16)
pcib2      0/1/1        No           Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8)
xhci0      0/20/0       No           200 Series/Z370 Chipset Family USB 3.0 xHCI Controller
none0      0/22/0       No           200 Series PCH CSME HECI
ahci0      0/23/0       No           200 Series PCH SATA controller [AHCI mode]
pcib3      0/27/0       No           200 Series PCH PCI Express Root Port
pcib4      0/27/4       No           200 Series PCH PCI Express Root Port
pcib10     0/28/0       No           200 Series PCH PCI Express Root Port
pcib11     0/28/1       No           200 Series PCH PCI Express Root Port
pcib15     0/28/4       No           200 Series PCH PCI Express Root Port
pcib16     0/29/0       No           200 Series PCH PCI Express Root Port
isab0      0/31/0       No           Z370 Chipset LPC/eSPI Controller
none1      0/31/2       No           200 Series/Z370 Chipset Family Power Management Controller
hdac1      0/31/3       No           200 Series PCH HD Audio
ichsmb0    0/31/4       No           200 Series/Z370 Chipset Family SMBus Controller
em0        0/31/6       No           Ethernet Connection (2) I219-V
vgapci0    1/0/0        No           GP104GL [Quadro P5000]
hdac0      1/0/1        No           GP104 High Definition Audio Controller
nvme0      2/0/0        No           NVMe SSD Controller PM9A1/PM9A3/980PRO
nvme1      3/0/0        No           NVMe SSD Controller SM981/PM981/PM983
pcib5      4/0/0        No           PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
pcib6      5/1/0        No           PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
pcib7      5/2/0        No           PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
pcib8      5/3/0        No           PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
pcib9      5/4/0        No           PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
ppt0       6/0/0        Yes          uPD720202 USB 3.0 Host Controller
ppt1       7/0/0        Yes          uPD720202 USB 3.0 Host Controller
ppt2       8/0/0        Yes          uPD720202 USB 3.0 Host Controller
ppt3       9/0/0        Yes          uPD720202 USB 3.0 Host Controller
pcib12     11/0/0       No           PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch
pcib13     12/1/0       No           PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch
pcib14     12/2/0       No           PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch
re0        13/0/0       No           RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller
re1        14/0/0       No           RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller
xhci1      15/0/0       No           ASM2142/ASM3142 USB 3.1 Host Controller

/boot/loader.conf
Code:
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
cryptodev_load="YES"
zfs_load="YES"

kern.racct.enable=1

autoboot_delay="5"

# Don't wait on USB enumeration
hw.usb.no_boot_wait="1"

# CPU microcode updates
cpu_microcode_load="YES"
cpu_microcode_name="/boot/firmware/intel-ucode.bin"

# Use usbhid(4)
hw.usb.usbhid.enable="1"    # Still necessary on FreeBSD 13.1+ ?

# VFS
vfs.usermount=1

# VMM
vmm_load="YES"

# PCI passthrough (bhyve)
pptdevs="6/0/0 7/0/0 8/0/0 9/0/0"

VNC screenshot of when/where the guest gets stuck (notice the segment of the Windows bootloader progress thingy - it stops moving):
1711309570452.png


Any ideas?
I have since updated to a more recent stable/14 commit twice - no changes in behavior.
Changes to VM configuration such as CPU count, CPU topology or memory have no effect.
Also enabling/disabling the network config in the VM config doesn't change anything.
 
Not exactly the same, but I had a problem with restarting the virtual machine when I passed an xHCI PCI USB controller through to it. Actually I don't use vm-bhyve, but plain bhyve. 'bhyve+' helped me: https://forums.freebsd.org/threads/cant-restart-vm-in-freebsd-13-1-release.85379/post-570084
(post #18)

And I do not use '-A', as Windows VMs do not load with this option: https://forums.freebsd.org/threads/...e-the-bhyve-a-h-p-w-s-flags.92420/post-644239
(post #8)

And yes, I use Windows 10 and Windows Server 2022, not Windows 11. My host - FreeBSD 14.0-RELEASE

Edit.
Maybe this will help somehow :
(post #25)

 
Try resetting the device from your host with devctl reset. I am not sure if you will need to use the -d option. Or try usbconfig -d <ugen-device> reset.
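
A minimal sketch of the reset suggested above, assuming the controller is the one listed as ppt0 (6/0/0) in the `vm passthru` output, that the guest is stopped, and that the command is run as root. The variable names are illustrative only. The `command -v` guard just keeps the snippet harmless on a machine without devctl(8).

```sh
#!/bin/sh
# Host-side name of the passthru device (6/0/0 in the `vm passthru` listing).
dev="ppt0"

# -d asks devctl to detach the driver before the reset and reattach it after.
reset_cmd="devctl reset -d ${dev}"

if command -v devctl >/dev/null 2>&1; then
    # Requires root; run only while the guest is not using the device.
    $reset_cmd || echo "reset of ${dev} failed" >&2
else
    echo "devctl not found; would run: ${reset_cmd}"
fi
```

Dropping `-d` from `reset_cmd` gives the plain reset variant, so both forms mentioned above can be tried.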
 
Thank you for your input guys - Much appreciated!

I did further mess around with it including "bypassing" sysutils/vm-bhyve and just manually using bhyve and bhyvectl directly. No change in behavior.

One thing I figured out in the meantime: The VM boots successfully 100% of the time with the passthru if I do not set more than one CPU core. Also 1 CPU core but 2 threads works 100% of the time.
But once I give it more than one CPU core the VM locks up with the PCIe passthru (and still boots successfully 100% of the time without the passthru).
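
For reference, the "1 core, 2 threads" layout described above could be written in the vm-bhyve guest config roughly like this. It is only a sketch built from the option names already used in the original config; the values are the ones reported to boot reliably with the passthru enabled.

```sh
# cpu: one socket, one core, two threads (boots with passthru per the report above)
cpu=2
cpu_sockets=1
cpu_cores=1
cpu_threads=2
```

The product of sockets, cores and threads should equal `cpu`, otherwise the topology and the vCPU count disagree.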

Any ideas?


Try resetting the device from your host with devctl reset. I am not sure if you will need to use the -d option. Or try usbconfig -d <ugen-device> reset.
I did mess around with devctl reset but didn't observe any change in behavior (the command(s) succeeded tho).
Not sure how usbconfig is relevant here as I am passing through the entire USB controller. The host (FreeBSD) doesn't get to see the devices behind it, hence they don't show up there.

And I do not use '-A', as Windows VMs do not load with this option: https://forums.freebsd.org/threads/...e-the-bhyve-a-h-p-w-s-flags.92420/post-644239
(post #8)
I'm not using the -A flag of bhyve either.
 
In another thread (where problems not directly related to yours were discussed) there are links that I found useful - thanks to YYY__ (see post #8 there). Strictly speaking these:

(post #4 and below)

I think the problem has something to do with 'interrupt affinity for MSI/MSI-X capable devices'.
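
To check whether the passed-through controller actually advertises MSI or MSI-X, its PCI capability list can be inspected on the host. The sketch below assumes PCI domain 0 and uses a hypothetical helper, `bhyve_id_to_selector`, to turn a bhyve ID such as 6/0/0 into a pciconf(8) selector. It only shows where to look and does not confirm the interrupt affinity theory.

```sh
#!/bin/sh
# Convert a bhyve "bus/slot/function" ID (as printed by `vm passthru`)
# into a pciconf selector. Assumes PCI domain 0.
bhyve_id_to_selector() {
    echo "pci0:$(echo "$1" | tr '/' ':')"
}

sel=$(bhyve_id_to_selector "6/0/0")

if command -v pciconf >/dev/null 2>&1; then
    # -l lists the device, -c adds its capabilities (look for MSI / MSI-X).
    pciconf -lc "$sel"
else
    echo "pciconf not found; would run: pciconf -lc ${sel}"
fi
```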

Edit: (clarified)
 
You can assign some of your USB devices (like a mouse and a keyboard) from the host OS to a guest / bhyve VM using a PCIe riser which splits the PCIe port.

It could be something like:

https://www.galaxus.de/de/s1/produc...ZWbcHzTTtxjyxeW5ZrgaAttnEALw_wcB&gclsrc=aw.ds

Please search yourself for a suitable card!!!

Most of those cards are for mining and only expose a single PCIe lane to each device.

A single PCIe lane will noticeably degrade your GPU performance.

If it's only mouse and keyboard, you could use bhyve's virtio-input emulation.

I think it has been accepted on 13.2 and 14.0.
 
Is your CPU AMD? If so, did you enable IOMMU both in BIOS and /boot/loader.conf?

By default, AMD-Vi passthrough support is disabled; set hw.vmm.amdvi.enable and reload vmm.ko to enable it.
I am using AMD and I was able to pass one of the USB controllers (the other two controllers cause the VM to not boot) to the Windows 10 virtual machine.
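
On an AMD host, the tunable mentioned above would typically go into /boot/loader.conf next to the existing vmm and pptdevs lines. This is a sketch that assumes the same pptdevs list as in the original post. Since vmm is loaded at boot there, a reboot is the simplest way to apply it.

```
# VMM
vmm_load="YES"

# Enable AMD-Vi (IOMMU) support for bhyve passthru on AMD hosts
hw.vmm.amdvi.enable=1

# PCI passthrough (bhyve)
pptdevs="6/0/0 7/0/0 8/0/0 9/0/0"
```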
 
ziomario Stop trolling or you'll become the first person I ever put on an ignore list.

Is your CPU AMD? If so, did you enable IOMMU both in BIOS and /boot/loader.conf?
Nope, this is an Intel rig (i7-8086K). VT-d (IOMMU) is enabled in BIOS. Otherwise passthru would have never worked :)
 
ziomario : Stop trolling or you'll become the first person I ever put on an ignore list.

I don't understand why you are saying that. I'm not trolling, I'm trying to help. Anyway, sorry, it wasn't my intention to troll you. I don't even see a reason to do that. It's just that I copied and pasted the solution a developer gave me because I thought it could be useful for you as well. Sorry.
 
I don't understand why you are saying that. I'm not trolling, I'm trying to help. Anyway, sorry, it wasn't my intention to troll you. I don't even see a reason to do that. It's just that I copied and pasted the solution a developer gave me because I thought it could be useful for you as well. Sorry.
It's the "solution" that I suggested to you in a different thread, hence the trolling remark:
Furthermore, I don't see how this has any relevance to the topic/problem discussed here.

Let's please not add any more noise to this thread.
 
This is how learning works. Someone tells you how to do something, you listen, you test it, you see that it works, and when a case arises where what you have learned is needed, you suggest it. I wouldn't call this trolling, but learning and helping. However, I also forgot that you gave me that advice. Finally, I also gave you another suggestion different from yours, which could be the proof that I wasn't trolling you.
 
jbo@
One more thought. Apparently, by default (according to bhyve_config(5)), bhyve uses memory ballooning. When starting bhyve(8) VMs, I always specify the '-S' (wire guest memory) option. Surely there is a similar option in the vm-bhyve settings. Maybe turning this option on will help somehow:-/
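
In vm-bhyve the equivalent of bhyve's `-S` is, as far as I recall from its sample config, the `wired_memory` option; treat the exact name as an assumption and check the vm-bhyve documentation. bhyve(8) also states that guest memory must be wired when a passthru device is configured, so with `passthru0` set this may already be in effect.

```sh
# memory
memory=16G

# Wire all guest memory (vm-bhyve counterpart of `bhyve -S`; option name assumed)
wired_memory="yes"
```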
 
I have since upgraded from an Intel i7-8086K (from the original post) to an AMD Threadripper 7960X and I am getting the exact same behavior.

One more thought. Apparently, by default (according to bhyve_config(5)), bhyve uses memory ballooning. When starting bhyve(8) VMs, I always specify the '-S' (wire guest memory) option.
That did not seem to be relevant :(
 
Bumped into the same issue.
I had two VMs, Windows 10 and Windows 11.
Windows 10 is able to boot, but Windows 11 was stuck.
The weird thing is that the two vm-bhyve configs are identical and almost like yours.

The only thing that has changed since the last successful boot is that uefi-edk2-bhyve-csm expired on 4/1/2024.
So I had to install edk2@bhyve.
Don't know if that's relevant.
 