bhyve 15.0-RELEASE causes unknown failure of Bhyve PCI-Passthru guest

To begin, I have multiple servers that I migrated from 14.3-RELEASE to 15.0-RELEASE. Everything went without a hitch except one very important server that has an OPNSense Bhyve guest using PCI-Passthru for WAN and LAN networking. There were no warnings or errors or anything out of the ordinary on this server.

So, upon reboot after installing the kernel, the guest would not start - okay, this makes sense since no userland was updated yet. So another freebsd-update install later and the guest started right up with no errors. Now begins the odd bit....

The OPNSense guest has the LAN and WAN devices (em0 and em1 respectively) present and configured. In fact, sometimes the WAN port would successfully get an address from the ISP's DHCP. No matter what, however, the OPNSense guest is completely unable to ping any address, WAN or LAN, even the internal gateway which is directly connected (10.99.99.1/30 is OPNSense and 10.99.99.2/30 is the L3 switch, attached directly by Cat6). I verified the ports are still correct by turning them on and off and checking the modem and switch that are connected to the appropriate ports. I verified the routing. Nope, even with the correct routing tables (IPv4 and IPv6), no packets were going in either direction. I rebooted multiple times.

I spent a good 2 hours trying to diagnose this but I had to get the network back online. I used bectl activate to switch back to the latest 14.3-RELEASE boot environment from just prior to upgrade and rebooted. The OPNSense guest works normally and with zero issues.

So with that out of the way, what could be affecting PCI-Passthru network devices to a Bhyve guest from this upgrade?

I am using the vm-bhyve pkg and the config file is as such:
Code:
loader="bhyveload"
priority="1"
cpu="3"
memory="8G"
disk0_type="nvme"
disk0_name="disk0.img"
uuid="3b52dad1-c916-11ed-a8a9-002590247e86"
passthru0="3/0/0" # LAN
passthru1="5/0/0" # Modem

The full /boot/loader.conf:
Code:
boot_serial="YES"
comconsole_port="0x3e8"
console="comconsole"
autoboot_delay="3"
security.bsd.allow_destructive_dtrace="0"
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
cryptodev_load="YES"
zfs_load="YES"
coretemp_load="YES"
mlx4en_load="YES"
kern.racct.enable="1"
cpu_microcode_load="YES"
cpu_microcode_name="/boot/firmware/intel-ucode.bin"

# PCI passthrough of em2 and em3 to OPNSense
vmm_load="YES"
pptdevs="3/0/0 5/0/0"
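
One host-side check that may help narrow this down, as a minimal sketch: confirm after boot that both entries from pptdevs are really claimed by the ppt driver. The helper name ppt_selector is made up for illustration, and PCI domain 0 is an assumption.

```sh
#!/bin/sh
# Convert a bhyve-style "bus/slot/func" triple into the selector that
# pciconf prints, e.g. 3/0/0 -> pci0:3:0:0 (PCI domain 0 is assumed).
ppt_selector() {
    echo "$1" | awk -F/ '{ printf "pci0:%d:%d:%d\n", $1, $2, $3 }'
}

# Devices copied from pptdevs in /boot/loader.conf above.
for dev in 3/0/0 5/0/0; do
    sel=$(ppt_selector "$dev")
    if ! command -v pciconf >/dev/null 2>&1; then
        echo "pciconf not found, skipping $sel"
    elif pciconf -l | grep -q "^ppt[0-9]*@${sel}:"; then
        echo "$sel is attached to ppt"
    else
        echo "$sel is NOT attached to ppt"
    fi
done
```

With vm-bhyve installed, `vm passthru` should give a similar overview. Since the guest does see its NICs here, this only rules out the host side of the binding; it says nothing about why traffic fails inside the guest.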

Ideas to troubleshoot this or provide better insight?
 
For what it's worth, I'm also having issues (with vm-bhyve) and networking. It happened on 14.3, guests wouldn't start, and it turned out the switch was having problems and I was unable to destroy the switch. SirDice pointed out a PR that might be the cause. I updated ports and reinstalled vm-bhyve and it was fine. But then a few days later it happened again, same symptoms, only thing in logs is that it can't find the switch and again I can't destroy the switch. Same issue still there after upgrading to 15.0.

Here's the link to that post though it seems as if it was solved, but perhaps it may be of use.
 
Hmmmm. I don't believe that is related, since I am not using any virtual switches with the Bhyve guests.
Code:
rich@ecorp:~ % doas vm switch list
NAME  TYPE  IFACE  ADDRESS  PRIVATE  MTU  VLAN  PORTS
rich@ecorp:~ %

The two ethernet ports are PCI-passthru directly to the Bhyve guest.....and that is why I am most confused as to why 15.0-RELEASE would at all affect how these passthru'd devices are behaving in the guest!

Further info if at all helpful:
Code:
rich@ecorp:~ % doas vm info bob
------------------------
Virtual Machine: bob
------------------------
  state: running (4750)
  datastore: default
  loader: bhyveload
  uuid: 3b52dad1-c916-11ed-a8a9-002590247e86
  cpu: 3
  memory: 8G
  memory-resident: 8606818304 (8.015G)

  console-ports
    com1: /dev/nmdm-bob.1B

  virtual-disk
    number: 0
    device-type: file
    emulation: nvme
    options: -
    system-path: /vms/bob/disk0.img
    bytes-size: 21474836480 (20.000G)
    bytes-used: 6243009536 (5.814G)

  snapshots
    zroot/vms/bob@zfs-auto-snap_monthly-2025-07-01-00h28        3.65G   Tue Jul  1  0:28 2025
    zroot/vms/bob@zfs-auto-snap_monthly-2025-08-01-00h28        2.38G   Fri Aug  1  0:28 2025
    zroot/vms/bob@zfs-auto-snap_monthly-2025-09-01-00h28        2.25G   Mon Sep  1  0:28 2025
    zroot/vms/bob@zfs-auto-snap_monthly-2025-10-01-00h28        2.19G   Wed Oct  1  0:28 2025
    zroot/vms/bob@zfs-auto-snap_monthly-2025-11-01-00h28        1.23G   Sat Nov  1  0:28 2025

CPU: Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz
MoBo: Supermicro X9SCI/X9SCA
RAM: 4x SK Hynix 8GB ECC UDIMM

Switch configuration:
Code:
interface ethernet 1/1/1
 port-name ecorp-bob-LAN
 route-only
 ip address 10.99.99.2 255.255.255.252
 ip ospf area 0
 ipv6 address fe80::1 link-local
 ipv6 address fd33:58bc:59a0:9991::2/126
 ipv6 enable
 ipv6 ospf area 0
 ipv6 nd suppress-ra
 no spanning-tree
 no flow-control both
 no inline power

FWIW: OSPFv2/v3 is used between OPNSense and the switch, but I tried manual routing entries and that made no difference. Not even the directly connected link addresses (10.99.99.1/30 and 10.99.99.2/30) were able to ping each other under any circumstances that I tried.
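
Because even the directly connected /30 fails, routing is probably not the layer to look at. A rough sketch of what could be collected inside the guest to see whether frames and interrupts reach the NIC at all; it assumes the LAN NIC is still em0 in the guest, and the helper name check is made up for illustration.

```sh
#!/bin/sh
# Run inside the OPNSense guest. IFACE is taken from the posts above.
IFACE="${IFACE:-em0}"

# Print each command before running it; skip tools that are not installed
# and report (rather than abort on) a non-zero exit status.
check() {
    echo "== $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" || echo "(exit status $?)"
    else
        echo "(skipped: $1 not found)"
    fi
}

check ifconfig "$IFACE"          # link state and negotiated media
check netstat -I "$IFACE" -nd    # packet, error and drop counters
check arp -an                    # is the switch's MAC ever learned?
check vmstat -i                  # do the NIC's interrupt counters advance?
```

Running it twice, a few seconds apart, while pinging 10.99.99.1 from the switch would show whether the input counters or interrupt counts move at all. A `tcpdump -ni em0 arp or icmp` in a second console is a useful companion.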
 
It's clear that 15.0-RELEASE is a no-go for my home router until a correction is made.
I will try this weekend on a test machine, but whether it works or not, I can't risk a problem like this on my home router.
 
I just upgraded my test machine and I can confirm the problem. Passthru itself works (the guest sees the device), but networking on the passthru device doesn't work at all.

Waiting for patches...
 
Just as an experiment you may wish to apply this patch, rebuild bhyve and use it to see if your problem disappears....

diff --git a/usr.sbin/bhyve/pci_fbuf.c b/usr.sbin/bhyve/pci_fbuf.c
index 1e3ec77c15b0..91ef57261af5 100644
--- a/usr.sbin/bhyve/pci_fbuf.c
+++ b/usr.sbin/bhyve/pci_fbuf.c
@@ -71,10 +71,10 @@ static int fbuf_debug = 1;
 
 #define DMEMSZ 128
 
-#define FB_SIZE (32*MB)
+#define FB_SIZE (16*MB)
 
-#define COLS_MAX 3840
-#define ROWS_MAX 2160
+#define COLS_MAX 1920
+#define ROWS_MAX 1200
 
 #define COLS_DEFAULT 1024
 #define ROWS_DEFAULT 768


This is a workaround for making 9front work on 15 and -current and I am curious to see if it fixes other seemingly unrelated problems.
[Patch due to Mark Peek]
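
For anyone who wants to try it, here is a hedged sketch of applying the diff and rebuilding only bhyve. It assumes a source tree in /usr/src that matches the installed 15.0 system and that the diff was saved to /tmp/fbuf.diff (a hypothetical path). It is a dry run by default and only prints the commands.

```sh
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set DRY_RUN=0 and run as root to really execute them.
SRC="${SRC:-/usr/src}"              # source tree matching the running system (assumption)
PATCH="${PATCH:-/tmp/fbuf.diff}"    # hypothetical file holding the diff above

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@" || exit 1              # stop at the first failing step
    fi
}

run cp -p /usr/sbin/bhyve /usr/sbin/bhyve.orig   # keep the stock binary for a quick revert
run cd "$SRC"
run patch -p1 -i "$PATCH"                        # -p1 strips the a/ and b/ prefixes
run cd "$SRC/usr.sbin/bhyve"
run make
run make install
```

To go back, copy /usr/sbin/bhyve.orig over /usr/sbin/bhyve and restart the guest. If the hunk is rejected because of whitespace differences from copy and paste, `patch -l` may help.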
 
The link between a video framebuffer and a NIC escapes me. The VM starts, no problem from this side.
 
Note that I have no problem with a passthrough NIC working in a VM (on 15.0-stable), but you do, so I suspect something is messed up with the PCI address space. Maybe the framebuffer address window overlaps with something else? Hence the experiment to check this hypothesis.
 
I am unable to try any changes until at least Tuesday. I am currently states away from home and accessing the network via WireGuard until then (the OPNSense guest is the endpoint).
 
I'm ready to try anything! I can't turn my lamps on or off! I guess it's not straightforward to downgrade 15 to 14.3?
It takes 5 minutes to test the patch and another 5 minutes to go back to the original bhyve if you want to keep 15.0.
And yes, the only known remedy for this problem is to return to 14.x. The downgrade can be trivial if you have a root-on-ZFS system (boot environments).
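
A minimal sketch of the boot environment rollback, assuming a root-on-ZFS install where a 14.3 environment still exists. The default name below is a made-up placeholder and rollback_cmds is a hypothetical helper; take the real name from `bectl list`.

```sh
#!/bin/sh
# Pick the real name from `bectl list`; this default is only a placeholder.
BE="${BE:-14.3-RELEASE-before-upgrade}"

# Emit the rollback commands as text so they can be reviewed before running.
rollback_cmds() {
    printf 'bectl activate %s\nshutdown -r now\n' "$1"
}

if command -v bectl >/dev/null 2>&1; then
    bectl list    # shows the existing boot environments and the active one
fi
echo "To boot back into $BE, run as root:"
rollback_cmds "$BE"
```

The script only prints the two commands on purpose, since activating the wrong environment on a remote router is hard to undo without console access.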

You can also wait until a correction is made.
 