jails Huge downgrade in iperf3 speed when a jail is active

Hi, I was wondering if anyone's ever noticed this strange issue.

I am running a FreeBSD 13.1-RELEASE VM on Proxmox with 2 vCPUs and 16 GB RAM. If I run an iperf3 test against the Proxmox host without any jails running, I get an average of 22 Gbits/sec.
However, as soon as I fire up a jail (it doesn't matter whether it's 1 or 4 jails, or which jail), transfer speeds immediately drop to 4.3 Gbits/sec. As soon as I kill all those jails, transfer speeds go back up to 22 Gbits/sec.
I've verified this against a Debian Linux VM I also have running, also with 2 vCPUs but only 2 GB RAM. It too gets 22 Gbits/sec, so the issue is definitely related to the jails, but I can't figure out why they would impact performance that much, especially because they're all basically idle and not really doing anything. The issue triggers if ANY jail is active.

The jails are created as VNET jails by BastilleBSD, set up with DHCP IP addresses handed out by an existing router on my network. The Proxmox host is a 10c/20t Xeon Silver with 224GB RAM.

Any help is greatly appreciated!
 
Never saw such behavior on bare-metal jailhosts or on bhyve VMs running on a FreeBSD or SmartOS host...
Are you sure this isn't an issue with the host's (Proxmox/Linux) network stack? SmartOS once had a bug where VMs with an excessive number of virtual interfaces (10+) had degraded network performance and occasional packet loss. IIRC this was shortly after they ported bhyve to SmartOS, so several years (5+?) ago...
 
This is certainly interesting.
I built and ran some smaller Proxmox clusters a few years ago with dozens of FreeBSD VMs on top of them, most of which were running jails. While we experienced a lot of issues with Proxmox over the years and eventually abandoned it, there was a lot of performance testing before putting it into production, and I cannot recall having encountered any such behavior.
The underlying Proxmox hosts/cluster-nodes had similar specs to what you describe.

I doubt that this will make any difference but did you try the same without VNET jails?
Furthermore, are there any differences between running iperf3 from the FreeBSD VM vs. from the FreeBSD VM jail? Given that this is a VNET jail I assume that you can run iperf3 in it (I'm actually not sure).

In any case, please keep us posted about new insights and - hopefully - the corresponding conclusions at the end :)
 
Are you sure this isn't an issue with the host's (Proxmox/Linux) network stack? SmartOS once had a bug where VMs with an excessive number of virtual interfaces (10+) had degraded network performance and occasional packet loss. IIRC this was shortly after they ported bhyve to SmartOS, so several years (5+?) ago...
Pretty sure it isn't, because the FreeBSD VM with a jail is the ONLY one doing it. I even spun up more FreeBSD VMs and none of them exhibit this behavior, but as soon as I create a VNET jail in any one of them, BOOM, it happens.
I doubt that this will make any difference but did you try the same without VNET jails?
You'd be surprised. I actually tried out a non-VNET jail and it does NOT exhibit this performance degradation. It happens ONLY with VNET jails.
Furthermore, are there any differences between running iperf3 from the FreeBSD VM vs. from the FreeBSD VM jail? Given that this is a VNET jail I assume that you can run iperf3 in it (I'm actually not sure).
Running it inside the jails incurs an even bigger performance hit... though not nearly as dramatic (a loss of 200-300 Mbits/sec).

It's worth noting that I haven't tried creating a VNET jail manually without BastilleBSD. I don't really feel like doing that by hand because I have a lot of jails, but if I have to as a last resort, I will. I'm really bummed that this only happens with VNET jails.
Really hoping someone with more familiarity or who's run into this before could shed more light.
 
You'd be surprised. I actually tried out a non-VNET jail and it does NOT exhibit this performance degradation. It happens ONLY with VNET jails.
Unfortunately I'm no expert (I know, I hate it too), but as far as I know the "only" (again, don't quote me on that) difference between a VNET and a non-VNET jail is that a VNET jail runs its own copy of the networking stack, whereas a non-VNET jail uses the host's networking stack.
My current guess would be that some hardware offloading feature (such as checksum offloading) is not working. Whether that is a bug, a "desired side effect", or simply a misconfiguration is beyond my current knowledge, but maybe this gives you some direction to dig further.

Running it inside the jails incurs an even bigger performance hit... though not nearly as dramatic (a loss of 200-300 Mbits/sec).
The confirmation bias is strong in this one, but: this might be evidence of non-working hardware offloading.

My recommendation would be to figure out whether network-stack-related hardware offloading is working.
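For example (assuming the VM's NIC is vtnet0; adjust the name as needed), you could compare what is currently enabled with what the driver is capable of, with and without a jail running:
Code:
# offload flags currently enabled on the interface
ifconfig vtnet0 | grep options
# everything the driver is capable of
ifconfig -m vtnet0 | grep capabilities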
 
What happens if you disable "TCP Segment Offload" on the host?

Code:
sysctl net.inet.tcp.tso=0
Doesn't seem to do anything. Do I have to reboot for it to take effect? Never mind, the setting seems to revert when the machine is rebooted.
 
the setting seems to revert when the machine is rebooted
All of them do; there's no persistence unless you explicitly put the setting in /boot/loader.conf or /etc/sysctl.conf (or set it by some other means during boot).

This particular one should not need to be set at boot time, though.
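For the record: run-time sysctls like this one go in /etc/sysctl.conf if you do want them persistent, while loader tunables belong in /boot/loader.conf. E.g.:
Code:
# /etc/sysctl.conf
net.inet.tcp.tso=0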
 
You'd be surprised. I actually tried out a non-VNET jail and it does NOT exhibit this performance degradation. It happens ONLY with VNET jails.
VNET interfaces have their own MAC addresses, which the hypervisor's network stack initially has no knowledge of. With non-VNET jails there are no extra MAC addresses beyond the one the hypervisor 'gave' to the VM and already knows about.
To test this theory you could create e.g. a bridge attached to the virtio interface and let the iperf traffic originate from/to that bridge.
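Something along these lines should do for the test (assuming the VM's virtio NIC is vtnet0; names are just placeholders), with no jail running:
Code:
# create a bridge and add the virtio NIC as a member
ifconfig bridge0 create
ifconfig bridge0 addm vtnet0 up
# then repeat the iperf3 run against the Proxmox host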

I don't have any host on a 25Gbit link I could test with, but I'm running some FreeBSD VMs on bhyve/SmartOS as well as some bare-metal FreeBSD hosts on 10Gbit networks. Some of the VMs also run VNET jails and don't show that behaviour - from the jailhost and even from within the VNET jails I get >9Gbit transfer speeds with iperf3.
That being said - I only have 12.4-RELEASE hosts/VMs in production; the only 13.1-RELEASE hosts don't have 10G links but at least on 1GBit links there is no degraded performance whether they run vnet jails or not. But can you test with a 12.4-RELEASE VM?
Also what's the throughput between the VM and host with/without vnet jails? And between the jailhost VM and its vnet jail?
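For the jailhost-to-jail case, a quick way to test (jail name and address below are just placeholders, and this assumes iperf3 is installed in the jail) is to run the server inside the jail and the client on the jailhost:
Code:
# 'myjail' and the jail's IP are placeholders
jexec myjail iperf3 -s
iperf3 -c 192.0.2.10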
 
Just a shot in the dark: I had major problems with jail network speed until I found a workaround that fixed it - disabling checksum offloading on the VirtIO NIC.
Important to mention: my jails run inside a virtual machine on FreeBSD and the network interface is VirtIO! I think this is important.
I put this in my /boot/loader.conf:
Code:
hw.vtnet.csum_disable=1

I don't understand it but you can quickly check if this works for you.
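After a reboot you can quickly check whether it took effect (assuming the interface is vtnet0; I believe the tunable can be read back with sysctl) and whether the checksum flags are gone from the NIC:
Code:
sysctl hw.vtnet.csum_disable
ifconfig vtnet0 | grep options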
 
Do you have filtering on the bridge interface (net.link.bridge.pfil_bridge)?
Code:
root@freebsd1:~ # sysctl net.link.bridge.pfil_bridge
net.link.bridge.pfil_bridge: 1
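
If you want to rule it out, it can be switched off at run time:
Code:
# disable packet filtering on the bridge itself
sysctl net.link.bridge.pfil_bridge=0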

It appears I do. Would this be the issue? I've never enabled it, nor do I have any firewall enabled. Is this the default on FreeBSD, or a default set by BastilleBSD?

I've disabled it, but it doesn't seem to improve the speed.
 
That being said - I only have 12.4-RELEASE hosts/VMs in production; the only 13.1-RELEASE hosts don't have 10G links but at least on 1GBit links there is no degraded performance whether they run vnet jails or not. But can you test with a 12.4-RELEASE VM?
I'll give that a try today.
Also what's the throughput between the VM and host with/without vnet jails?
VM and host throughput is ~24 Gbits/sec without VNET jails active. With VNET jails active, it drops immediately to ~4 Gbits/sec.
And between the jailhost VM and its vnet jail?
Actually, now that you mention it, there's another mystery, and it may be related to the throughput issue: I can't seem to get any communication going between the jail host and the jail, whether via SSH or iperf. Pings work, however... which is puzzling.
 
Just a shot in the dark: I had major problems with jail network speed until I found a workaround that fixed it - disabling checksum offloading on the VirtIO NIC.
Important to mention: my jails run inside a virtual machine on FreeBSD and the network interface is VirtIO! I think this is important.
I put this in my /boot/loader.conf:
Code:
hw.vtnet.csum_disable=1

I don't understand it but you can quickly check if this works for you.
Thanks for the suggestion. It sounded promising, but unfortunately it didn't seem to work. If anything, believe it or not, it had the opposite effect and I lost another ~300 Mbits/sec.
 
I did some testing and here is what I found:
FreeBSD 13.1 on bare metal, Celeron N4000, so no powerhouse.
iperf looped back through the Ethernet interface (re0 is 10.1.1.181):
route delete 10.1.1.181 (deletes the host route through lo0)
iperf -s -B 10.1.1.181 / iperf -c 10.1.1.181: about 2.5 Gb/s
Then create a bridge with re0 and tap0 (tap0 is otherwise unused):
I get about 1.7 Gb/s, so the bridge causes roughly a 35% speed decrease.
Bridging with a vnet/epair: same penalty.
Throughput through lo0 is not affected by a VNET jail (about 12 Gb/s).

So you can test the same:
attach the host NIC to a bridge with some dummy interface and see what happens.
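For example, something like this (interface names are just placeholders; on the Proxmox VM the NIC would presumably be vtnet0):
Code:
# create an otherwise-unused tap and a bridge, then add both the NIC and the tap to it
ifconfig tap0 create
ifconfig bridge0 create
ifconfig bridge0 addm vtnet0 addm tap0 up
# re-run the iperf3 test and compare against the no-bridge numbers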
 
That being said - I only have 12.4-RELEASE hosts/VMs in production; the only 13.1-RELEASE hosts don't have 10G links but at least on 1GBit links there is no degraded performance whether they run vnet jails or not. But can you test with a 12.4-RELEASE VM?
I just tried this with a 12.4-RELEASE VM and a 12.4-RELEASE jail. Unfortunately, I still observe the issue. As a next step, maybe I'll create the jail manually without using Bastille and see whether one of Bastille's defaults is causing it.
 
I did some testing and here is what I found:
FreeBSD 13.1 on bare metal, Celeron N4000, so no powerhouse.
iperf looped back through the Ethernet interface (re0 is 10.1.1.181):
route delete 10.1.1.181 (deletes the host route through lo0)
iperf -s -B 10.1.1.181 / iperf -c 10.1.1.181: about 2.5 Gb/s
Then create a bridge with re0 and tap0 (tap0 is otherwise unused):
I get about 1.7 Gb/s, so the bridge causes roughly a 35% speed decrease.
Bridging with a vnet/epair: same penalty.
Throughput through lo0 is not affected by a VNET jail (about 12 Gb/s).

So you can test the same:
attach the host NIC to a bridge with some dummy interface and see what happens.
This sounds promising. So the bridge definitely seems to impose a speed penalty. Still, 35% is far less than what I'm observing; going from 24 Gbps to 4 Gbps is an ~83% plunge.
 
Here's another example, between Hyper-V and a FreeBSD 13.1-RELEASE-p3 VM, with and without TSO4.

The same thing happens if I create bridge0 and add the tap0 and hn0 interfaces to it. If I re-enable TSO on hn0 (ifconfig hn0 tso) after it has been added to the bridge, its throughput is back to normal.

Here's the quote from if_bridge(4). So in my test tap0 doesn't support TSO, and when I add it to the bridge, TSO on the hn0 interface gets disabled.
The TOE, TSO, TXCSUM and TXCSUM6 capabilities on all interfaces added to the bridge are disabled if any of the interfaces do not support/enable them. The LRO capability is always disabled. All the capabilities are restored when the interface is removed from the bridge. Changing capabilities at run-time may cause NIC reinit and a link flap.
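
One way to see which member is the culprit (interface names as in my example) is to compare what each member's driver advertises versus what is currently enabled:
Code:
ifconfig -m hn0 | grep -E 'capabilities|options'
ifconfig -m tap0 | grep -E 'capabilities|options'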


Code:
# ifconfig hn0
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:64:79:11
        inet 192.168.100.124 netmask 0xffffff00 broadcast 192.168.100.255
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
# iperf3 -p 8000 -c 192.168.100.101
Connecting to host 192.168.100.101, port 8000
[  5] local 192.168.100.124 port 26015 connected to 192.168.100.101 port 8000
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.26 GBytes  28.0 Gbits/sec    0    208 KBytes
[  5]   1.00-2.00   sec  3.31 GBytes  28.5 Gbits/sec    0    208 KBytes
[  5]   2.00-3.00   sec  3.10 GBytes  26.6 Gbits/sec    0    208 KBytes
[  5]   3.00-4.00   sec  3.40 GBytes  29.2 Gbits/sec    0    208 KBytes
[  5]   4.00-5.00   sec  3.08 GBytes  26.4 Gbits/sec    0    208 KBytes
[  5]   5.00-6.00   sec  3.05 GBytes  26.2 Gbits/sec    0    208 KBytes
[  5]   6.00-7.00   sec  3.21 GBytes  27.6 Gbits/sec    0    208 KBytes
[  5]   7.00-8.00   sec  3.31 GBytes  28.4 Gbits/sec    0    208 KBytes
[  5]   8.00-9.00   sec  3.26 GBytes  28.0 Gbits/sec    0    208 KBytes
[  5]   9.00-10.00  sec  3.19 GBytes  27.4 Gbits/sec    0    208 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  32.2 GBytes  27.6 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  32.2 GBytes  27.6 Gbits/sec                  receiver

iperf Done.
# ifconfig hn0 -tso
# iperf3 -p 8000 -c 192.168.100.101
Connecting to host 192.168.100.101, port 8000
[  5] local 192.168.100.124 port 29985 connected to 192.168.100.101 port 8000
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   342 MBytes  2.87 Gbits/sec    0    209 KBytes
[  5]   1.00-2.00   sec   399 MBytes  3.35 Gbits/sec    0    209 KBytes
[  5]   2.00-3.00   sec   478 MBytes  4.01 Gbits/sec    0    209 KBytes
[  5]   3.00-4.00   sec   496 MBytes  4.16 Gbits/sec    0    209 KBytes
[  5]   4.00-5.00   sec   490 MBytes  4.11 Gbits/sec    0    209 KBytes
[  5]   5.00-6.00   sec   494 MBytes  4.14 Gbits/sec    0    209 KBytes
[  5]   6.00-7.00   sec   506 MBytes  4.25 Gbits/sec    0    209 KBytes
[  5]   7.00-8.00   sec   501 MBytes  4.21 Gbits/sec    0    209 KBytes
[  5]   8.00-9.00   sec   502 MBytes  4.21 Gbits/sec    0    209 KBytes
[  5]   9.00-10.00  sec   502 MBytes  4.22 Gbits/sec    0    209 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.60 GBytes  3.95 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.60 GBytes  3.95 Gbits/sec                  receiver

iperf Done.
#
 
Here's another example, between Hyper-V and a FreeBSD 13.1-RELEASE-p3 VM, with and without TSO4.

The same thing happens if I create bridge0 and add the tap0 and hn0 interfaces to it. If I re-enable TSO on hn0 (ifconfig hn0 tso) after it has been added to the bridge, its throughput is back to normal.

Here's the quote from if_bridge(4). So in my test tap0 doesn't support TSO, and when I add it to the bridge, TSO on the hn0 interface gets disabled.

Oh, this is EXACTLY what I am seeing. Is there a fix for it? I've tried both ifconfig vtnet0bridge tso and ifconfig vtnet0bridge -tso, but neither seems to help.

EDIT:
Some good news: I did ifconfig vtnet0 tso instead, and throughput to the jail host VM is fully restored. However, iperf3 to the jails is still stuck at ~3.5 Gbps. I've tried ifconfig vnet0 tso inside the jail, but unfortunately that doesn't do the trick.
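In case it helps anyone who lands here later: to keep that flag across reboots I'm assuming it can simply be appended to the interface's rc.conf line (extra words after DHCP should get passed on to ifconfig) - untested on my side:
Code:
# /etc/rc.conf - hypothetical; adjust to however the interface is actually configured
ifconfig_vtnet0="DHCP tso"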

I wonder why having TSO enabled isn't the default here?
 
I would think it's all advertised by the hypervisor, because it works just fine without VNET jails, and it also works on other VMs (Linux), all by default. It only stops working once I boot up a VNET jail.

Throughput inside the jails is also terrible. While I expect some performance degradation, I don't expect an 83+% drop.
 
I suppose no one else has any other ideas? I guess the solution to avoid this massive throughput hit is just to spin up more VMs instead of using jails. Kind of a bummer, since it's going to be a lot heavier than jails, but I guess it'll suffice for now.
 