Solved 11.4-RELEASE to 12.2-RELEASE network issue for bhyve VMs

After upgrading to 12.2-RELEASE, networking is not working for my bhyve VMs.

In 11.4, I had daemonized the launching of the VMs using daemon -rft in /etc/rc.local.

Oddly, in 12.2 the VMs come up and are accessible with VNC but not over the network, and from inside them I am unable to browse the web or ping anything across the network. I shut them down, commented out the relevant lines in /etc/rc.local, and rebooted. Then I started the VMs by hand after logging into the system. Started that way, they can browse the web and ping hosts on the network, but they cannot ping each other, which was possible in 11.4.

I shut the VMs down again and started trying to compare what was different between 11.4 and 12.2. One thing I noticed was that in 11.4, when I kldstat, I would see if_tap(4) loaded. I have /etc/rc.conf loading this module. In 12.2, when I kldstat, I do not see if_tap(4) loaded.

Doing some research, it looks like if_tun(4) and if_tap(4) are being merged.

I tried switching my /etc/rc.conf to reference if_tuntap(4) and rebooting, but the behavior was no different and I didn't see if_tuntap(4) when I kldstat.
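A note on checking this: plain kldstat lists only the loaded kernel files, while kldstat -v also lists the modules contained in each file, so a driver compiled into the kernel shows up only in the verbose output. A minimal sketch of that check follows. The function name has_tap_module is hypothetical, and the module names if_tap and if_tuntap are assumptions that may differ between releases.

```shell
# Reads `kldstat -v` output on stdin and succeeds if a tap driver module is listed.
# The names if_tap and if_tuntap are assumptions; adjust them for your release.
has_tap_module() {
    grep -Eqw 'if_tap|if_tuntap'
}

# Usage on the host (not run here):
#   kldstat -v | has_tap_module && echo "tap support present"
```

If the check succeeds while plain kldstat shows no tap module, the driver is most likely part of the kernel itself rather than a separately loaded file.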

These are relevant lines of my /etc/rc.conf:

Code:
hostname="einstein"
ifconfig_em0="inet 10.10.20.121/16"
defaultrouter="10.10.0.1"
kld_list="aesni coretemp vmm if_tap if_bridge bridgestp"
cloned_interfaces="bridge21"
ifconfig_bridge21="inet 10.10.21.1/24"

I have the following in /etc/sysctl.conf:

Code:
# bhyve
net.link.tap.up_on_open=1
net.link.ether.inet.proxyall=1
net.inet.ip.random_id=1
net.inet.ip.forwarding=1

My VMs are using virtio-net with the tap device created by ifconfig tap create.
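For reference, the tap setup described above can be sketched roughly as follows. This is an illustration, not my actual launch script: the helper name attach_tap and the RUN switch are hypothetical, and it names the tap device explicitly instead of letting ifconfig tap create pick the next free number.

```shell
# Prints the commands by default; run with RUN="" as root to execute them.
RUN=${RUN-echo}

# attach_tap BRIDGE TAP: create the tap device, then add it to the bridge
# and bring the bridge up.
attach_tap() {
    $RUN ifconfig "$2" create
    $RUN ifconfig "$1" addm "$2" up
}

# Usage (dry run): attach_tap bridge21 tap0
```

The dry-run default makes it easy to review the commands before running them against a live bridge.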

Does anybody have any idea why I am not able to ping between VMs or to other hosts on the network after this upgrade? Is this a bug?

What steps can I take to further troubleshoot?
 
What does your ifconfig show?

Do you have a switch like this:
Code:
vm-internal: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether fa:21:f2:f5:56:61
        inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        groups: bridge vm-switch viid-d1efa@
        nd6 options=1<PERFORMNUD>

Can you ping the switch IP from the main system? Can you connect tap interfaces to the switch?

BTW, I recommend using sysutils/vm-bhyve for managing bhyve VMs. It is a collection of shell scripts that makes VM management easy.
 
I don't have a switch, just a bridge. I'm not using vm-bhyve; I have been trying to go without a management utility because it's another tool to learn and depend on.

Here's my ifconfig:
Code:
# ifconfig
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether d0:50:99:c0:5b:c4
        inet 10.10.20.121 netmask 0xffff0000 broadcast 10.10.255.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether d0:50:99:c0:5b:c2
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em2: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether d0:50:99:c0:5b:c5
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em3: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether d0:50:99:c0:5b:c3
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
bridge21: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:5f:73:1a:6f:15
        inet 10.10.21.1 netmask 0xffffff00 broadcast 10.10.21.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 9 priority 128 path cost 2000000
        member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 2000000
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 2000000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:f4:21
        inet6 fe80::5a9c:fcff:fe10:f421%tap0 prefixlen 64 scopeid 0x7
        groups: tap
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 2459
tap1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:80:38
        inet6 fe80::5a9c:fcff:fe10:8038%tap1 prefixlen 64 scopeid 0x8
        groups: tap
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 2470
tap2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 58:9c:fc:00:62:58
        inet6 fe80::5a9c:fcff:fe00:6258%tap2 prefixlen 64 scopeid 0x9
        groups: tap
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 2471
# ping 10.10.21.1
PING 10.10.21.1 (10.10.21.1): 56 data bytes
64 bytes from 10.10.21.1: icmp_seq=0 ttl=64 time=0.036 ms
64 bytes from 10.10.21.1: icmp_seq=1 ttl=64 time=0.047 ms
64 bytes from 10.10.21.1: icmp_seq=2 ttl=64 time=0.051 ms
^C
--- 10.10.21.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.036/0.045/0.051/0.006 ms
# ping 10.10.21.101
PING 10.10.21.101 (10.10.21.101): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=119 time=8.201 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=8.567 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=8.827 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 8.201/8.531/8.827/0.257 ms
# ping 10.10.0.1
PING 10.10.0.1 (10.10.0.1): 56 data bytes
64 bytes from 10.10.0.1: icmp_seq=0 ttl=64 time=0.214 ms
64 bytes from 10.10.0.1: icmp_seq=1 ttl=64 time=0.185 ms
64 bytes from 10.10.0.1: icmp_seq=2 ttl=64 time=0.173 ms
^C
--- 10.10.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.173/0.191/0.214/0.017 ms
#
 
Add em0 to the bridge with ifconfig bridge21 addm em0; see bridge(4) and ifconfig(8).

As a side note, I can highly recommend using sysutils/vm-bhyve to maintain your bhyve VMs. It is much easier to use and makes it easy to set up different networks.

Code:
root@hosaka:~ # vm list
NAME            DATASTORE  LOADER     CPU  MEMORY  VNC           AUTOSTART  STATE
case            default    bhyveload  4    4096M   -             Yes [3]    Running (23627)
jenkins         default    bhyveload  4    4096M   -             Yes [5]    Running (51734)
kdc             default    none       2    2048M   0.0.0.0:5901  Yes [2]    Running (11392)
debian          stor10k    grub       2    4096M   -             No         Stopped
fbsd-test       stor10k    bhyveload  2    4096M   -             No         Stopped
gitlab          stor10k    bhyveload  4    6144M   -             Yes [10]   Running (5920)
gitlab-runner   stor10k    bhyveload  4    4096M   -             Yes [11]   Running (5967)
kibana          stor10k    bhyveload  4    6144M   -             Yes [1]    Running (80017)
lady3jane       stor10k    bhyveload  4    4096M   0.0.0.0:5900  Yes [12]   Running (5262)
plex            stor10k    bhyveload  4    4096M   -             Yes [6]    Running (6764)
riviera         stor10k    bhyveload  2    4096M   -             Yes [9]    Running (5059)
sdgame01        stor10k    bhyveload  4    4096M   -             No         Stopped
tessierashpool  stor10k    bhyveload  4    32768M  -             Yes [4]    Running (51792)
wintermute      stor10k    bhyveload  4    4096M   -             Yes [8]    Running (23585)
Code:
root@hosaka:~ # vm switch list
NAME     TYPE      IFACE       ADDRESS  PRIVATE  MTU   VLAN  PORTS
servers  standard  vm-servers  -        no       9000  11    lagg0
public   standard  vm-public   -        no       9000  10    lagg0

 
That's what I thought. I tried adding em0 to bridge21 but it didn't work. I compared the configuration to another host which is still on 11.4-RELEASE, and in /etc/rc.conf it has:
Code:
# bhyve
kld_list="aesni coretemp vmm if_tap if_bridge bridgestp"
cloned_interfaces="bridge21"
ifconfig_bridge21="inet 172.20.21.1/24"
Obviously the IP addresses are different, but on that system em0 is not added to the bridge.
Code:
# ifconfig
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 0c:c4:7a:07:c5:84
        hwaddr 0c:c4:7a:07:c5:84
        inet 172.20.20.220 netmask 0xffff0000 broadcast 172.20.255.255
        inet 172.20.20.222 netmask 0xffff0000 broadcast 172.20.255.255
        inet 172.20.20.221 netmask 0xffff0000 broadcast 172.20.255.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
em1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 0c:c4:7a:07:c5:85
        hwaddr 0c:c4:7a:07:c5:85
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
bridge21: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:c4:0b:37:19:15
        inet 172.20.21.1 netmask 0xffffff00 broadcast 172.20.21.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        groups: bridge
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 5 priority 128 path cost 2000000
        member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 6 priority 128 path cost 2000000
tap1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 00:bd:c1:a0:f7:01
        hwaddr 00:bd:c1:a0:f7:01
        inet6 fe80::2bd:c1ff:fea0:f701%tap1 prefixlen 64 scopeid 0x6
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: tap
        Opened by PID 807
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 00:bd:6f:8c:cd:00
        hwaddr 00:bd:6f:8c:cd:00
        inet6 fe80::2bd:6fff:fe8c:cd00%tap0 prefixlen 64 scopeid 0x5
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: tap
        Opened by PID 10933
#

If I install sysutils/vm-bhyve will it recognize my existing VMs, or do I need to rebuild everything from scratch using the tool?
 
You're still trying to supernet (172.20.21.1/24 is part of the 172.20.20.220/16 network). Remove the IP address from the bridge. You could also bind the bridge to the em1 interface; it doesn't seem to be used right now.

If I install sysutils/vm-bhyve will it recognize my existing VMs, or do I need to rebuild everything from scratch using the tool?
You can probably just create the right configs for them and use them with vm(8). I don't see a reason why this wouldn't work.
 
I understand what you're saying about supernet but that configuration is actually working...

One thing I did notice is that the MAC address of my guests does not match the MAC address of my tap devices on 12.2-RELEASE. However, on 11.4-RELEASE, the MAC address of my tap device agrees with the MAC address in the guest. That's got to be a problem, no?
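To make that comparison repeatable, the host-side address can be pulled out of the ifconfig output with a small helper. This is a minimal sketch: the function name tap_mac is hypothetical, and it assumes the address sits on a line starting with "ether", as in the listings earlier in this thread.

```shell
# Reads `ifconfig tapN` output on stdin and prints the address from its "ether" line.
tap_mac() {
    awk '$1 == "ether" { print $2; exit }'
}

# Usage on the host (not run here):
#   ifconfig tap0 | tap_mac
```

The printed value can then be compared by eye with the MAC address the guest reports for its virtio-net adapter.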
 
Turns out there was nothing wrong with the networking. It was an issue with my bhyve configs. Unfortunately, I blew away the OS and installed a fresh 11.4-RELEASE system to discover this.

When I launch my Windows guests, I like to see the SAC come up, so I have -l com1,stdio in my bhyve command. For some reason this needs to be removed when they're daemonized and launched automatically with /etc/rc.local. If it's not removed, the Windows VMs come up just fine and are accessible with VNC, but the virtio-net adapter doesn't work. I have absolutely no idea why, and maybe this is a bug, or maybe I just don't know what I'm doing.

Once I removed this directive from my bhyve command, everything started working properly. On 12.2-RELEASE, my /etc/rc.conf is slightly different than it was on 11.4-RELEASE: neither if_tap nor if_tuntap is in my kld_list, because they're evidently built into the kernel now.

Anybody know why the virtio-net adapter doesn't work when -l com1,stdio is included? The adapter is there, drivers can be loaded, it just doesn't pass any traffic. What a sincere waste of time fiddling with the network and rebuilding the OS so many times!
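One relevant fact for the question above: daemon(8) with -f redirects standard input, output, and error to /dev/null, so a guest started that way has no terminal behind com1,stdio. Whether that is what stalls virtio-net is not confirmed anywhere in this thread. A minimal, untested sketch of a workaround picks stdio only when a terminal is attached and otherwise falls back to a null-modem device. The helper name com1_backend is hypothetical, and /dev/nmdm0A assumes the nmdm(4) module is loaded.

```shell
# Prints the backend to use after "-l com1," on the bhyve command line.
# /dev/nmdm0A is an assumed device name; it needs the nmdm(4) module loaded.
com1_backend() {
    if [ -t 0 ]; then
        echo "stdio"          # interactive start: keep the SAC on the terminal
    else
        echo "/dev/nmdm0A"    # daemonized start: no terminal on stdin
    fi
}

# Usage in a launch script (not run here):
#   bhyve ... -l "com1,$(com1_backend)" ...
```

With the null-modem backend, the SAC should still be reachable from the host by attaching to the other end of the pair, for example with cu -l /dev/nmdm0B.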

SirDice, I did remove ifconfig_bridge21="inet 10.10.21.1/24" from my /etc/rc.conf. I'm not sure why it was ever there. On 11.4 my bridge works without em0 as a member, but on 12.2 I did need to add ifconfig_bridge21="addm em0 up".
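Putting those changes together, the bhyve-related part of /etc/rc.conf on 12.2-RELEASE would look roughly like this. It is reconstructed from the description in this post and the earlier listing, not copied from the running system, so treat the exact kld_list contents as an assumption.

```shell
# bhyve (12.2-RELEASE): no if_tap or if_tuntap in kld_list
kld_list="aesni coretemp vmm if_bridge bridgestp"
cloned_interfaces="bridge21"
# em0 is now a bridge member; the bridge no longer carries its own IP address
ifconfig_bridge21="addm em0 up"
```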
 