Unstable/low throughput on loopback interface on 13.1-RELEASE

Hello everyone,

I can't get more than about 60 Gb/s of total throughput (tx + rx) over the lo0 interface. I tested with both iperf and iperf3, and the results are roughly the same.
Throughput fluctuates between ~30 Gb/s and ~10 Gb/s:

Code:
...
[  5]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[  8]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 10]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 12]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 14]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 16]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 18]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 20]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 22]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 24]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 26]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 28]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 30]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[ 32]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec
[SUM]   9.00-10.00  sec  3.39 GBytes  29.1 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[  8]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 10]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 12]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 14]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 16]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 18]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 20]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 22]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 24]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 26]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 28]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 30]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[ 32]  10.00-11.00  sec   248 MBytes  2.08 Gbits/sec
[SUM]  10.00-11.00  sec  3.39 GBytes  29.1 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[  8]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 10]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 12]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 14]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 16]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 18]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 20]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 22]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 24]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 26]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 28]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 30]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[ 32]  11.00-12.00  sec   248 MBytes  2.08 Gbits/sec
[SUM]  11.00-12.00  sec  3.40 GBytes  29.2 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[  8]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 10]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 12]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 14]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 16]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 18]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 20]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 22]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 24]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 26]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 28]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 30]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[ 32]  12.00-13.00  sec   248 MBytes  2.08 Gbits/sec
[SUM]  12.00-13.00  sec  3.39 GBytes  29.2 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[  8]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 10]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 12]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 14]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 16]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 18]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 20]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 22]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 24]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 26]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 28]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 30]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[ 32]  13.00-14.00  sec   158 MBytes  1.32 Gbits/sec
[SUM]  13.00-14.00  sec  2.16 GBytes  18.5 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[  8]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 10]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 12]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 14]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 16]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 18]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 20]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 22]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 24]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 26]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 28]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 30]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[ 32]  14.00-15.00  sec  98.2 MBytes   823 Mbits/sec
[SUM]  14.00-15.00  sec  1.34 GBytes  11.5 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[  8]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 10]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 12]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 14]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 16]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 18]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 20]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 22]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 24]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 26]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 28]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 30]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[ 32]  15.00-16.00  sec  98.2 MBytes   824 Mbits/sec
[SUM]  15.00-16.00  sec  1.34 GBytes  11.5 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[  8]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 10]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 12]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 14]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 16]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 18]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 20]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 22]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 24]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 26]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 28]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 30]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[ 32]  16.00-17.00  sec  90.4 MBytes   758 Mbits/sec
[SUM]  16.00-17.00  sec  1.24 GBytes  10.6 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[  8]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 10]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 12]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 14]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 16]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 18]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 20]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 22]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 24]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 26]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 28]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 30]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[ 32]  17.00-18.00  sec  87.9 MBytes   736 Mbits/sec
[SUM]  17.00-18.00  sec  1.20 GBytes  10.3 Gbits/sec
...
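To put a number on the swing, the per-interval `[SUM]` lines can be pulled out of a saved iperf3 log and compared. This is only a rough sketch: the regular expression assumes the exact column layout shown above, and the `sample` text is three lines copied from that output rather than a full run.

```python
import re

# Matches iperf3 per-interval aggregate lines such as:
# [SUM]   9.00-10.00  sec  3.39 GBytes  29.1 Gbits/sec
SUM_RE = re.compile(
    r"^\[SUM\]\s+(\d+\.\d+)-(\d+\.\d+)\s+sec\s+\S+\s+\S+\s+([\d.]+)\s+([KMG])bits/sec"
)

# Scale factors to express every rate in Gbit/s.
TO_GBITS = {"K": 1e-6, "M": 1e-3, "G": 1.0}


def sum_rates(lines):
    """Return a list of (interval_start, interval_end, gbits_per_sec)."""
    rates = []
    for line in lines:
        m = SUM_RE.match(line.strip())
        if m:
            start, end, value, unit = m.groups()
            rates.append((float(start), float(end), float(value) * TO_GBITS[unit]))
    return rates


# Three [SUM] lines copied from the output above (not the whole log).
sample = """\
[SUM]   9.00-10.00  sec  3.39 GBytes  29.1 Gbits/sec
[SUM]  13.00-14.00  sec  2.16 GBytes  18.5 Gbits/sec
[SUM]  17.00-18.00  sec  1.20 GBytes  10.3 Gbits/sec
""".splitlines()

rates = sum_rates(sample)
peak = max(r[2] for r in rates)
floor = min(r[2] for r in rates)
print(f"peak {peak:.1f} Gbit/s, floor {floor:.1f} Gbit/s, ratio {peak / floor:.2f}x")
```

Feeding the whole log through `sum_rates` instead of `sample` would show at which second the rate starts to fall.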

I've noticed that during iperf 2 tests, one CPU core (CPU 3 in this case) is constantly handling interrupts. This does not happen when I run iperf3 tests, but the results are the same.

Code:
root@dev:~ # top -P
last pid:  4126;  load averages:  5.51,  2.67,  1.23                                                up 0+00:50:04  16:03:35
78 processes:  2 running, 76 sleeping
CPU 0:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 12:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 19:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 20:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 21:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 22:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 23:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 24:  0.4% user,  0.0% nice,  6.1% system,  0.0% interrupt, 93.5% idle
CPU 25:  0.3% user,  0.0% nice,  5.2% system,  0.0% interrupt, 94.5% idle
CPU 26:  0.6% user,  0.0% nice, 15.7% system,  0.0% interrupt, 83.7% idle
CPU 27:  0.4% user,  0.0% nice, 18.5% system,  0.0% interrupt, 81.1% idle
CPU 28:  0.0% user,  0.0% nice,  0.3% system,  0.0% interrupt, 99.7% idle
CPU 29:  0.3% user,  0.0% nice, 11.5% system,  0.0% interrupt, 88.1% idle
CPU 30:  0.4% user,  0.0% nice, 18.9% system,  0.0% interrupt, 80.7% idle
CPU 31:  0.4% user,  0.0% nice, 17.5% system,  0.0% interrupt, 82.1% idle
CPU 32:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 33:  0.8% user,  0.0% nice, 36.6% system,  0.0% interrupt, 62.6% idle
CPU 34:  0.4% user,  0.0% nice, 16.8% system,  0.0% interrupt, 82.8% idle
CPU 35:  0.0% user,  0.0% nice,  5.6% system,  0.0% interrupt, 94.4% idle
CPU 36:  1.0% user,  0.0% nice, 11.4% system,  0.0% interrupt, 87.6% idle
CPU 37:  0.0% user,  0.0% nice, 23.3% system,  0.0% interrupt, 76.7% idle
CPU 38:  0.0% user,  0.0% nice, 19.3% system,  0.0% interrupt, 80.7% idle
CPU 39:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 40:  0.4% user,  0.0% nice, 16.4% system,  0.0% interrupt, 83.2% idle
CPU 41:  0.0% user,  0.0% nice,  4.0% system,  0.0% interrupt, 96.0% idle
CPU 42:  0.7% user,  0.0% nice, 15.6% system,  0.0% interrupt, 83.7% idle
CPU 43:  0.4% user,  0.0% nice, 27.0% system,  0.0% interrupt, 72.6% idle
CPU 44:  0.8% user,  0.0% nice, 10.5% system,  0.0% interrupt, 88.7% idle
CPU 45:  0.3% user,  0.0% nice, 24.6% system,  0.0% interrupt, 75.1% idle
CPU 46:  0.0% user,  0.0% nice, 14.3% system,  0.0% interrupt, 85.7% idle
CPU 47:  0.0% user,  0.0% nice,  2.2% system,  0.0% interrupt, 97.8% idle
Mem: 136M Active, 50M Inact, 1881M Wired, 40K Buf, 122G Free
ARC: 102M Total, 15M MFU, 85M MRU, 268K Anon, 413K Header, 1872K Other
     27M Compressed, 75M Uncompressed, 2.79:1 Ratio
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 4092 root          6  20    0    47M    14M uwait   45   4:10 176.11% iperf
 4091 root          7  20    0    51M    15M uwait    4   3:46 159.72% iperf
 4126 root          1  20    0    14M  3776K CPU4     4   0:00   0.05% top
 3441 www           1  20    0    92M    25M kqread   0   0:00   0.02% nginx
 3477 www           1  20    0    92M    24M kqread  30   0:00   0.02% nginx
 3478 www           1  20    0    92M    24M kqread  31   0:00   0.01% nginx
 3488 www           1  20    0    92M    24M kqread  40   0:00   0.01% nginx
 3489 www           1  20    0    92M    24M kqread  41   0:00   0.01% nginx
 3491 www           1  20    0    92M    24M kqread  43   0:00   0.01% nginx
 3490 www           1  20    0    92M    24M kqread  42   0:00   0.01% nginx
 3481 www           1  20    0    92M    24M kqread  34   0:00   0.01% nginx
 3497 www           1  20    0    92M    24M kqread  47   0:00   0.01% nginx
 3480 www           1  20    0    92M    24M kqread  33   0:00   0.01% nginx
 3493 www           1  20    0    92M    24M kqread  44   0:00   0.01% nginx
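To spot the interrupt-bound core without reading all 48 lines by eye, the per-CPU lines of a saved `top -P` snapshot can be filtered. This is a minimal sketch: the pattern assumes the line format shown above, and the 90% cutoff is an arbitrary value picked for illustration.

```python
import re

# Matches per-CPU lines from `top -P`, e.g.:
# CPU 3:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU_RE = re.compile(
    r"^CPU\s+(\d+):\s+([\d.]+)% user,\s+([\d.]+)% nice,\s+([\d.]+)% system,"
    r"\s+([\d.]+)% interrupt,\s+([\d.]+)% idle"
)


def interrupt_bound_cpus(lines, threshold=90.0):
    """Return ids of CPUs whose interrupt share is >= threshold (arbitrary cutoff)."""
    hot = []
    for line in lines:
        m = CPU_RE.match(line.strip())
        if m and float(m.group(5)) >= threshold:
            hot.append(int(m.group(1)))
    return hot


# Three lines copied from the snapshot above.
snapshot = """\
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 33:  0.8% user,  0.0% nice, 36.6% system,  0.0% interrupt, 62.6% idle
""".splitlines()

print(interrupt_bound_cpus(snapshot))
```

Running it over several snapshots taken during a test would show whether the same core is saturated every time.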


/etc/sysctl.conf
Code:
vfs.zfs.min_auto_ashift=12
net.inet.tcp.recvspace=8388608
net.inet.tcp.sendspace=8388608
kern.ipc.maxsockbuf=16388608
net.inet.tcp.abc_l_var=44
net.inet.tcp.mssdflt=1460
net.inet.tcp.minmss=536
kern.threads.max_threads_per_proc=4096
kern.ipc.somaxconn=4096
net.inet.ip.intr_queue_maxlen=4096
net.inet.ip.maxfragpackets=0
net.inet.ip.maxfragsperpacket=0
net.inet6.ip6.maxfragpackets=0
net.inet6.ip6.maxfrags=0
net.inet.tcp.syncookies=0
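As a coarse sanity check on the values above, the TCP socket buffer defaults can be compared against `kern.ipc.maxsockbuf`. This sketch rests on the assumption that a default larger than that limit could not take effect. As far as I understand, the kernel's effective cap is somewhat lower than the raw value because of mbuf overhead, so passing this check is not a guarantee. The helper names are made up for illustration.

```python
def parse_sysctl_conf(text):
    """Parse key=value lines, ignoring blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def oversized_buffers(settings):
    """Return the TCP buffer defaults that exceed kern.ipc.maxsockbuf."""
    limit = int(settings["kern.ipc.maxsockbuf"])
    keys = ("net.inet.tcp.sendspace", "net.inet.tcp.recvspace")
    return [k for k in keys if int(settings.get(k, 0)) > limit]


# The three relevant lines from the /etc/sysctl.conf shown above.
conf = """\
net.inet.tcp.recvspace=8388608
net.inet.tcp.sendspace=8388608
kern.ipc.maxsockbuf=16388608
"""

settings = parse_sysctl_conf(conf)
print(oversized_buffers(settings))
```

With the values posted here, both defaults are below the raw limit, so the list comes back empty.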

/boot/loader.conf
Code:
net.inet.tcp.soreceive_stream="1"
net.inet.tcp.hostcache.enable="0"
net.inet.tcp.hostcache.cachelimit="0"
net.isr.bindthreads="1"
net.isr.defaultqlimit="2048"
net.isr.maxthreads="-1"
net.link.ifqmaxlen="1024"

Localhost interface config
Code:
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 65535
    options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
    inet 127.0.0.1 netmask 0xff000000
    groups: lo
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Hardware spec:
CPU: 2 x Intel(R) Xeon(R) Silver 4310 CPU @ 2100MHz
Memory: 8 x 16GB DDR4
Network: 2 x Mellanox ConnectX-6 Dx (100GbE)

I saw this thread (https://lists.freebsd.org/pipermail/freebsd-net/2016-August/045784.html) and tried disabling NUMA in the BIOS, but it didn't help.

Since I don't have much experience with FreeBSD, it is entirely possible that I've missed some basic configuration. I'm stuck and don't know what to try next. Any help is appreciated.
 