FreeBSD to Linux 25G Network Card Iperf3 Test

My cluster is currently deployed on Linux, and I want to migrate it to FreeBSD. At the moment I am testing a 25G network card under FreeBSD. Since it is not possible to move everything over at once, the plan is to run the two systems side by side for a while. During testing, however, I found that an iperf3 test between FreeBSD and FreeBSD is normal, and an iperf3 test between Linux and Linux is also normal, but the iperf3 test sending from FreeBSD to Linux has a problem.

Below are my test results. The MTU (Maximum Transmission Unit) of all the network cards here is 9000.
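For reference, the MTU on each side can be double-checked with something like the following (ice0 and ens1f0 are simply the interface names that appear in the outputs further down):
Code:
# FreeBSD side
ifconfig ice0 | grep mtu
# Linux side
ip link show ens1f0 | grep mtu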

FreeBSD to FreeBSD
Code:
[root@mfsbsd roothome]$ dmesg|grep ice0
ice0: <Intel(R) Ethernet Network Adapter E810-XXV-2 - 1.43.2-k> mem 0xd4000000-0xd5ffffff,0xd6010000-0xd601ffff at device 0.0 numa-domain 1 on pci17
ice0: Loading the iflib ice driver
ice0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.53.0, track id 0xc0000001.
ice0: fw 7.10.1 api 1.7 nvm 4.91 etid 800214ab netlist 4.4.5000-1.18.0.db8365cf oem 1.3909.0
ice0: Using 24 Tx and Rx queues
ice0: Reserving 24 MSI-X interrupts for iRDMA
ice0: Using MSI-X interrupts with 49 vectors
ice0: Using 4096 TX descriptors and 4096 RX descriptors
ice0: Ethernet address: b4:96:91:c3:a3:ea
ice0: PCI Express Bus: Speed 16.0GT/s Width x8
ice0: Firmware LLDP agent disabled
ice0: link state changed to UP
ice0: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
ice0: netmap queues/slots: TX 24/4096, RX 24/4096
ice0: link state changed to DOWN
ice0: link state changed to UP
ice0: Link is up, 25 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: FC-FEC/BASE-R, Autoneg: False, Flow Control: None
Code:
[root@mfsbsd roothome]$ iperf3 -c 192.168.100.108
Connecting to host 192.168.100.108, port 5201
[  5] local 192.168.100.107 port 40442 connected to 192.168.100.108 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.06   sec  3.04 GBytes  24.6 Gbits/sec    0   1.76 MBytes
[  5]   1.06-2.06   sec  2.87 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   2.06-3.02   sec  2.78 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   3.02-4.06   sec  2.99 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   4.06-5.06   sec  2.87 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   5.06-6.06   sec  2.87 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   6.06-7.06   sec  2.87 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   7.06-8.06   sec  2.89 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   8.06-9.06   sec  2.88 GBytes  24.7 Gbits/sec    0   1.76 MBytes
[  5]   9.06-10.03  sec  2.80 GBytes  24.7 Gbits/sec    0   1.76 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.03  sec  28.9 GBytes  24.7 Gbits/sec    0            sender
[  5]   0.00-10.03  sec  28.9 GBytes  24.7 Gbits/sec                  receiver

iperf Done.
Code:
[root@mfsbsd roothome]$ iperf3 -c 192.168.100.108 -R
Connecting to host 192.168.100.108, port 5201
Reverse mode, remote host 192.168.100.108 is sending
[  5] local 192.168.100.107 port 26306 connected to 192.168.100.108 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.06   sec  3.05 GBytes  24.7 Gbits/sec
[  5]   1.06-2.06   sec  2.87 GBytes  24.7 Gbits/sec
[  5]   2.06-3.05   sec  2.86 GBytes  24.7 Gbits/sec
[  5]   3.05-4.06   sec  2.91 GBytes  24.7 Gbits/sec
[  5]   4.06-5.06   sec  2.87 GBytes  24.7 Gbits/sec
[  5]   5.06-6.06   sec  2.88 GBytes  24.7 Gbits/sec
[  5]   6.06-7.05   sec  2.87 GBytes  24.7 Gbits/sec
[  5]   7.05-8.03   sec  2.81 GBytes  24.7 Gbits/sec
[  5]   8.03-9.06   sec  2.97 GBytes  24.7 Gbits/sec
[  5]   9.06-10.03  sec  2.45 GBytes  21.7 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.03  sec  28.5 GBytes  24.4 Gbits/sec    0            sender
[  5]   0.00-10.03  sec  28.5 GBytes  24.4 Gbits/sec                  receiver

iperf Done.
Linux to Linux
Code:
[root@hadoop-docker-test3 roothome]# iperf3 -c 192.168.100.104
Connecting to host 192.168.100.104, port 5201
[  5] local 192.168.100.103 port 37014 connected to 192.168.100.104 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.42 GBytes  20.8 Gbits/sec   87   1.07 MBytes
[  5]   1.00-2.00   sec  2.58 GBytes  22.2 Gbits/sec    0   1.16 MBytes
[  5]   2.00-3.00   sec  2.61 GBytes  22.4 Gbits/sec    1   1.21 MBytes
[  5]   3.00-4.00   sec  2.56 GBytes  22.0 Gbits/sec    0   1.25 MBytes
[  5]   4.00-5.00   sec  2.18 GBytes  18.7 Gbits/sec   11   1.13 MBytes
[  5]   5.00-6.00   sec  2.59 GBytes  22.2 Gbits/sec    0   1.19 MBytes
[  5]   6.00-7.00   sec  2.72 GBytes  23.3 Gbits/sec    0   1.20 MBytes
[  5]   7.00-8.00   sec  2.67 GBytes  23.0 Gbits/sec    2   1.02 MBytes
[  5]   8.00-9.00   sec  2.49 GBytes  21.4 Gbits/sec    7   1.21 MBytes
[  5]   9.00-10.00  sec  2.43 GBytes  20.9 Gbits/sec    0   1.25 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  25.3 GBytes  21.7 Gbits/sec  108             sender
[  5]   0.00-10.00  sec  25.3 GBytes  21.7 Gbits/sec                  receiver

iperf Done.
[root@hadoop-docker-test3 roothome]# iperf3 -c 192.168.100.104 -R
Connecting to host 192.168.100.104, port 5201
Reverse mode, remote host 192.168.100.104 is sending
[  5] local 192.168.100.103 port 37074 connected to 192.168.100.104 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  2.22 GBytes  19.1 Gbits/sec
[  5]   1.00-2.00   sec  2.63 GBytes  22.6 Gbits/sec
[  5]   2.00-3.00   sec  2.32 GBytes  20.0 Gbits/sec
[  5]   3.00-4.00   sec  2.30 GBytes  19.8 Gbits/sec
[  5]   4.00-5.00   sec  2.55 GBytes  21.9 Gbits/sec
[  5]   5.00-6.00   sec  2.51 GBytes  21.5 Gbits/sec
[  5]   6.00-7.00   sec  2.56 GBytes  22.0 Gbits/sec
[  5]   7.00-8.00   sec  2.66 GBytes  22.9 Gbits/sec
[  5]   8.00-9.00   sec  2.64 GBytes  22.7 Gbits/sec
[  5]   9.00-10.00  sec  2.79 GBytes  23.9 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  25.2 GBytes  21.6 Gbits/sec   35             sender
[  5]   0.00-10.00  sec  25.2 GBytes  21.6 Gbits/sec                  receiver

iperf Done.

FreeBSD to Linux

Code:
[root@hadoop-docker-test3 roothome]# lspci | grep -i ethernet
06:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
06:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)

[root@mfsbsd roothome]$ iperf3 -c 192.168.100.104
Connecting to host 192.168.100.104, port 5201
[  5] local 192.168.100.107 port 49741 connected to 192.168.100.104 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.06   sec  1.58 GBytes  12.8 Gbits/sec  439    131 KBytes
[  5]   1.06-2.06   sec  1.17 GBytes  10.1 Gbits/sec  384    169 KBytes
[  5]   2.06-3.06   sec  1.17 GBytes  10.1 Gbits/sec  412    158 KBytes
[  5]   3.06-4.06   sec  1.22 GBytes  10.5 Gbits/sec  388    105 KBytes
[  5]   4.06-5.05   sec  1.19 GBytes  10.3 Gbits/sec  410    259 KBytes
[  5]   5.05-6.05   sec  1.21 GBytes  10.4 Gbits/sec  401    122 KBytes
[  5]   6.05-7.06   sec  1.21 GBytes  10.3 Gbits/sec  421    174 KBytes
[  5]   7.06-8.06   sec  1.21 GBytes  10.4 Gbits/sec  442    182 KBytes
[  5]   8.06-9.04   sec  1.17 GBytes  10.2 Gbits/sec  478    213 KBytes
[  5]   9.04-10.06  sec  2.13 GBytes  17.9 Gbits/sec  205   2.75 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.06  sec  13.3 GBytes  11.3 Gbits/sec  3980            sender
[  5]   0.00-10.06  sec  13.3 GBytes  11.3 Gbits/sec                  receiver

iperf Done.
[root@mfsbsd roothome]$ iperf3 -c 192.168.100.104 -R
Connecting to host 192.168.100.104, port 5201
Reverse mode, remote host 192.168.100.104 is sending
[  5] local 192.168.100.107 port 16107 connected to 192.168.100.104 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.06   sec  3.02 GBytes  24.4 Gbits/sec
[  5]   1.06-2.06   sec  2.82 GBytes  24.4 Gbits/sec
[  5]   2.06-3.06   sec  2.87 GBytes  24.7 Gbits/sec
[  5]   3.06-4.06   sec  2.89 GBytes  24.7 Gbits/sec
[  5]   4.06-5.06   sec  2.88 GBytes  24.7 Gbits/sec
[  5]   5.06-6.06   sec  2.88 GBytes  24.7 Gbits/sec
[  5]   6.06-7.06   sec  2.88 GBytes  24.7 Gbits/sec
[  5]   7.06-8.05   sec  2.86 GBytes  24.7 Gbits/sec
[  5]   8.05-9.06   sec  2.90 GBytes  24.7 Gbits/sec
[  5]   9.06-10.04  sec  2.69 GBytes  23.6 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec  28.7 GBytes  24.6 Gbits/sec    0            sender
[  5]   0.00-10.04  sec  28.7 GBytes  24.6 Gbits/sec                  receiver

iperf Done.

I tried adjusting parameters such as the network card queue size, but sending from FreeBSD to Linux still has issues.
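For illustration, the kind of adjustments I mean are along these lines; this is only a sketch, assuming the generic iflib override tunables apply to the ice driver, with ethtool used on the Linux side, and the values are examples rather than recommendations:
Code:
# FreeBSD: /boot/loader.conf (iflib per-device overrides, reboot required)
dev.ice.0.iflib.override_nrxqs=8
dev.ice.0.iflib.override_ntxqs=8
dev.ice.0.iflib.override_nrxds=2048
dev.ice.0.iflib.override_ntxds=2048

# Linux: channel and ring sizes for the XXV710 port
ethtool -L ens1f0 combined 8
ethtool -G ens1f0 rx 4096 tx 4096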
 
How many cpu cores do you have in these boxes?

Since your new cluster will be all FreeBSD, why do you care about this? Linux and FreeBSD have had peer-to-peer defugalties since the beginning of time (or at least since BSDI roamed the Earth).
 
I am glad you finally posted your interface: Intel ice0. That gives us some idea what you are using.

Look below for some tuning help. You will need to adjust it for the Intel interface.
Things like LRO/TSO are all things you need to consider, and look at the other offloading features too.
There really is tuning needed, with an eye to your desired network services and the devices your packets need to traverse.

Here are the interface flags I found best for my 40G chelsio network gateway:
Code:
### LAGG ###
ifconfig_cxl0="up mtu 9000 -tso4 -tso6 -lro -vlanhwtso"
ifconfig_cxl1="up mtu 9000 -tso4 -tso6 -lro -vlanhwtso"
cloned_interfaces="lagg0"
#
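For the Intel interface that would translate to roughly the following; I have not tested it on an E810/ice, so treat it as a starting point only:
Code:
### /etc/rc.conf, rough equivalent for the ice driver ###
ifconfig_ice0="up mtu 9000 -tso4 -tso6 -lro -vlanhwtso"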
 
It's not a hardware performance problem. It's probably something with the window negotiation or implementation. Put a monitor on it.

Have you tried it in both directions, with FreeBSD as the receiver? The reason I asked about the number of CPUs is that using 24 queues is absurd; you need 24 x 9000 x 4096 (~1 GB) just for the receive buffers. The default settings for FreeBSD drivers are 20 years out of date, from back in the day when you needed 8 cores to drive 10 Gb/s. It's inefficient to split them up more than necessary.
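(Working the arithmetic out, purely as a sanity check:)
Code:
$ echo $((24 * 4096 * 9000))
884736000    # ~0.88 GB: 24 queues x 4096 descriptors x 9000-byte frames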
 
How many cpu cores do you have in these boxes?

Since your new cluster will be all FreeBSD, why do you care about this? Linux and FreeBSD have had peer-to-peer defugalties since the beginning of time (or at least since BSDI roamed the Earth).
The Linux machines have 56 CPUs each, while the FreeBSD machines have one with 96 CPUs and one with 56 CPUs. Because we intend to run a hybrid deployment, the newer machines will be switched to FreeBSD while the existing ones stay as they are for the time being.
 
It's not a hardware performance problem. It's probably something with the window negotiation or implementation. Put a monitor on it.

Have you tried it in both directions, with FreeBSD as the receiver? The reason I asked about the number of CPUs is that using 24 queues is absurd; you need 24 x 9000 x 4096 (~1 GB) just for the receive buffers. The default settings for FreeBSD drivers are 20 years out of date, from back in the day when you needed 8 cores to drive 10 Gb/s. It's inefficient to split them up more than necessary.
I have tested both directions between FreeBSD and Linux. With FreeBSD as the receiver it performs normally, with throughput reaching 24.6 Gbits/sec. Below are the configurations of my machines.
freebsd 1:
[root@mfsbsd roothome]$ sysctl hw.ncpu
hw.ncpu: 96

sysctl kern.ipc.maxsockbuf
sysctl net.inet.tcp.sendspace
sysctl net.inet.tcp.recvspace
net.inet.tcp.sendspace: 32768
net.inet.tcp.recvspace: 65536
net.inet.tcp.rack.misc.autoscale: 20
net.inet.tcp.rack.misc.prr_sendalot: 1
net.inet.tcp.rack.tlp.send_oldest: 0
net.inet.tcp.rack.stats.persist_sends: 0
net.inet.tcp.rack.stats.nfto_send: 8606356
net.inet.tcp.rack.stats.nfto_resend: 0
net.inet.tcp.rack.stats.fto_rsm_send: 208
net.inet.tcp.rack.stats.fto_send: 160877
net.inet.tcp.ack_war_timewindow: 1000
net.inet.tcp.sendbuf_auto_lowat: 0
net.inet.tcp.sendbuf_max: 2097152
net.inet.tcp.sendbuf_inc: 8192
net.inet.tcp.sendbuf_auto: 1
net.inet.tcp.recvbuf_max: 2097152
net.inet.tcp.recvbuf_auto: 1
kern.ipc.maxsockbuf: 16777216
net.inet.tcp.sendspace: 32768
net.inet.tcp.recvspace: 65536
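Those send/receive values are still the FreeBSD defaults. If larger buffers turn out to be worth trying, this is a sketch of what could go in /etc/sysctl.conf (example values only, not what is currently applied):
Code:
# /etc/sysctl.conf - example values, not currently applied
kern.ipc.maxsockbuf=67108864
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144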

freebsd2:

[root@hadoop-test8 ~]# sysctl hw.ncpu
hw.ncpu: 56


linux1 and linux2

[root@hadoop-docker-test3 roothome]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 56
On-line CPU(s) list: 0-55
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
Stepping: 2
CPU MHz: 1411.242
CPU max MHz: 3600.0000
CPU min MHz: 1200.0000
BogoMIPS: 5199.98
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-13,28-41
NUMA node1 CPU(s): 14-27,42-55
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single intel_ppin ssbd rsb_ctxsw ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d


[root@hadoop-docker-test3 roothome]# ethtool -l ens1f0
Channel parameters for ens1f0:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 64
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 56

[root@hadoop-docker-test3 roothome]# ethtool -g ens1f0
Ring parameters for ens1f0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 512
RX Mini: 0
RX Jumbo: 0
TX: 512



[root@hadoop-docker-test3 roothome]# sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 16777216 67108864
[root@hadoop-docker-test3 roothome]# sysctl net.ipv4.tcp_wmem
net.ipv4.tcp_wmem = 4096 16777216 67108864
[root@hadoop-docker-test3 roothome]# sysctl net.core.rmem_max
net.core.rmem_max = 67108864
[root@hadoop-docker-test3 roothome]# sysctl net.core.wmem_max
net.core.wmem_max = 212992
[root@hadoop-docker-test3 roothome]# sysctl net.ipv4.tcp_window_scaling
net.ipv4.tcp_window_scaling = 1
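For comparison, the matching Linux-side knobs from the dump above could be raised like this (example values only, shown for reference rather than as settings that have been applied):
Code:
# Linux: example adjustments corresponding to the values shown above
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
ethtool -G ens1f0 rx 4096 tx 4096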
 