Request for help maximizing 10G sniffer performance

Hey FreeBSD folks, this is my first ever post. I'll try to do a good job, but bear with me. I'm not a FreeBSD guru by anyone's account. I know it's a lot to ask, but I'd love to get a reply from Mr. Vogel on this thread.

At issue is a 10G host in our lab that is meant to be a network sniffer for all kinds of network troubleshooting, including problems that require sniffing on 10G interfaces. Obviously we're looking to maximize the host's performance to get the most capability we can out of it. The problem is that the box has seriously underperformed in testing, and I'm looking for help improving it. For the packet captures we are using plain tcpdump. We've identified two areas where we're losing packets: 1) the NIC interrupts are pegging one of the CPUs, and 2) the kernel is dropping some packets when writing to the array. Of the two, the first is by far the more severe, and that is where this thread will focus.
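For reference, here is the capture command and where the drop numbers come from; this is just a sketch of our measurement method, with the interface name and path from our setup:

Code:
# Capture on the 10G interface with full-size frames, writing to the array.
# On exit tcpdump prints summary counters: "packets captured",
# "packets received by filter", and "packets dropped by kernel".
# The kernel-drop counter is where the ~2% figure comes from; drops at the
# NIC/interrupt level never reach BPF, so those are inferred by comparing
# the capture against the smartbits/switchport counters.
tcpdump -i ix0 -s 1518 -w /data/test1.cap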

Questions:
1) Is network polling supported on 10G interfaces?
2) If 10G network polling is not supported, or if someone familiar with this area knows it's not advisable to use, what other changes could improve performance?


Details:

Hardware: Without getting extremely specific, it's a Sun/Oracle X4270 with a 6-core Nehalem processor, 24GB of RAM, and a 10-disk array in RAID 0. The 10G NIC is an Intel PRO/10GbE in a PCI Express slot.

Issue – losing packets at the NIC/interrupt level:

When we ran tcpdump on the host while 4.3Gbps of traffic was being forwarded to the 10G interface, the host dropped 87% of the packets. According to the tcpdump output, only about 2% were dropped by the kernel (related to writing to the array). What we found was that the rxq kernel thread for the 10G interface (interface name ix0) was pegging one of the CPUs. The ps output below was taken midway through the 4.3Gbps test and shows the "ix0 rxq" thread at 98% utilization.

Code:
new# ps -aux | more
USER      PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
root       13 100.0  0.0     0    16  ??  RL   Wed01PM 7202:39.30 [idle: cpu9]     *most of the CPUs are idle
root       14 100.0  0.0     0    16  ??  RL   Wed01PM 7202:33.35 [idle: cpu8]
root       15 100.0  0.0     0    16  ??  RL   Wed01PM 7202:08.77 [idle: cpu7]
root       16 100.0  0.0     0    16  ??  RL   Wed01PM 7201:44.21 [idle: cpu6]
root       18 100.0  0.0     0    16  ??  RL   Wed01PM 7183:21.35 [idle: cpu4]
root       21 100.0  0.0     0    16  ??  RL   Wed01PM 7197:24.85 [idle: cpu1]
root       22 100.0  0.0     0    16  ??  RL   Wed01PM 7184:48.43 [idle: cpu0]
root       11 99.0  0.0     0    16  ??  RL   Wed01PM 7202:38.07 [idle: cpu11]
root       12 98.0  0.0     0    16  ??  RL   Wed01PM 7194:30.17 [idle: cpu10]
root       54 98.0  0.0     0    16  ??  RL   Wed01PM  51:26.68 [ix0 rxq]	*rxq processes pegging one of the CPUs
root       17 96.2  0.0     0    16  ??  RL   Wed01PM 7187:29.89 [idle: cpu5]
root       19 71.1  0.0     0    16  ??  RL   Wed01PM 7189:34.04 [idle: cpu3]
root       20 43.7  0.0     0    16  ??  RL   Wed01PM 7187:08.42 [idle: cpu2]
root    21357 13.5  0.0  9492  2512  d0  S+    2:13PM   0:26.49 tcpdump -i ix0 -s 1518 -w /data/test1.cap
- The Nehalem is a 6-core processor with hyperthreading, which presents each physical core as two logical CPUs, so the ps output shows 12 CPUs. Most of them are idle, yet that one thread, ix0 rxq, is at 98%. Coupled with the knowledge that there was 4.3Gbps of traffic on the interface (confirmed by the smartbits tester and switchport counters) and that the resulting capture file held only about 13% of the total packets, it appears that we're dropping about 87% of the packets at the NIC/interrupt level.
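If anyone wants to reproduce the observation, the per-CPU picture is easier to read in top than in ps; a sketch of the commands we use to watch it:

Code:
# Per-CPU display (-P) with system processes (-S) broken out per thread (-H);
# the ix0 rxq kernel thread sits near 100% on one CPU while the rest idle.
top -SHP

# Per-device interrupt counters; watch the rate on the ix0 lines.
vmstat -i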

- It's fair to say that part of the problem is that we're not using multiple CPUs. Tcpdump being a single-threaded application creates a condition where one CPU is hammered and the rest sit almost unused. I'm not aware of a good way to spread the load among the CPUs.
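One partial mitigation I've been considering (it doesn't fix the rxq bottleneck, it just keeps tcpdump from competing with it) is pinning the capture process to another CPU with cpuset(1). A sketch; CPU 5 is an assumption, so check where the rxq thread actually lands first:

Code:
# Run tcpdump restricted to CPU 5, away from the CPU servicing ix0 rxq.
cpuset -l 5 tcpdump -i ix0 -s 1518 -w /data/test1.cap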


Enter network polling:

One of the performance enhancements we've tried is network polling, using the guidance found at http://networking.ringofsaturn.com/Unix/freebsd-polling.php and http://www.cyberciti.biz/faq/freebsd-device-polling-network-polling-tutorial/ . However, though configured, it does not take effect on the 10G interface. There is some question whether network polling is supported on 10G interfaces at all: the lists of supported cards on those sites include no 10G interfaces, but those articles date from 2009, and I've also read that Mr. Vogel made improvements to the 10G drivers in FreeBSD 8.2. Hence the question of whether 10G network polling is supported.

We tried it. We upgraded to 8.2. We configured the following options:

Code:
	options DEVICE_POLLING
	options HZ=1000

and recompiled the kernel. The host shows that polling is enabled, and /etc/rc.conf is configured to enable polling on the 10G interface persistently. However, the ifconfig output shows it never gets enabled.

Code:
sniffer# sysctl -a kern.polling
kern.polling.idlepoll_sleeping: 1
kern.polling.stalled: 0
kern.polling.suspect: 0
kern.polling.phase: 0
kern.polling.handlers: 0
kern.polling.residual_burst: 0
kern.polling.pending_polls: 0
kern.polling.lost_polls: 0
kern.polling.short_ticks: 0
kern.polling.reg_frac: 20
kern.polling.user_frac: 50
kern.polling.idle_poll: 0
kern.polling.each_burst: 5
kern.polling.burst_max: 150
kern.polling.burst: 5

/etc/rc.conf
Code:
ifconfig_ix0="inet 192.168.0.2 netmask 255.255.255.0 polling"

ifconfig output showing that polling is not enabled:
Code:
sniffer# ifconfig ix0
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:6e:c9:b0
        inet 192.168.1.10 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
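One check that should settle whether the driver supports polling at all: ifconfig -m prints the full supported capability list, not just the currently enabled options. A sketch of the verification; if POLLING never appears on the capabilities line, no rc.conf setting will turn it on:

Code:
# Show supported capabilities (the "capabilities=" line) vs. enabled options.
ifconfig -m ix0

# Try enabling polling by hand; on a driver without polling support this
# silently does nothing and POLLING stays absent from the options line.
ifconfig ix0 polling
ifconfig ix0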


Thanks in advance for any help that we receive,


D
 
On machines with heavy (but still far from your needs) network traffic I use the following tunables:
/boot/loader.conf:
Code:
net.isr.maxthreads=4
net.isr.bindthreads=1

/etc/sysctl.conf:
Code:
net.inet.ip.process_options=0
net.inet.ip.dummynet.io_fast=1
net.isr.direct=0
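The net.isr ones are boot-time tunables, so it's worth confirming they actually took effect after a reboot; a quick check:

Code:
# Verify the netisr settings the running kernel is actually using.
sysctl net.isr.maxthreads net.isr.bindthreads net.isr.direct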
 
I'm also trying to tune performance on our Intel(R) PRO/10GbE interfaces.
The kernel (9-CURRENT) is now compiling with polling support; I'll let you know the result.

I found this at http://silverwraith.com/papers/freebsd-tuning.php

The DEVICE_POLLING option by default does not work with SMP-enabled kernels. When the author of the DEVICE_POLLING code initially committed it, he admitted he was unsure of the benefits of the feature in a multiple-CPU environment, as only one CPU would be doing the polling. Since that time many administrators have found that there is a significant advantage to DEVICE_POLLING even in SMP-enabled kernels and that it works with no problems at all. If you are compiling an SMP kernel with DEVICE_POLLING, edit the file /usr/src/sys/kern/kern_poll.c and remove the following lines:

Code:
        #ifdef SMP
        #include "opt_lint.h"
        #ifndef COMPILING_LINT
        #error DEVICE_POLLING is not compatible with SMP
        #endif
        #endif
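If anyone wants to test that edit, the rebuild afterwards is the standard cycle; a sketch, with GENERIC standing in for whatever kernel config you actually use:

Code:
# Rebuild and install the kernel after editing kern_poll.c.
cd /usr/src
make buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
shutdown -r now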

Is this still relevant information?
 
I'm experiencing the same problem.

POLLING is enabled for my 1Gbit NICs but not for the 10Gbit NICs.

Code:
Intel(R) PRO/1000 Network Connection version - 2.2.3
Code:
Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.3.10
The 10GbE card is from Supermicro and uses the dual-port Intel® 82598EB chip.

Code:
[root@bsd /tank]# ifconfig ix0  
ix0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
	ether 00:25:90:01:a0:ba
	media: Ethernet autoselect
	status: no carrier
[root@bsd /tank]# ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=1fb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,POLLING,VLAN_HWCSUM,TSO4>
	ether 00:30:48:f3:5a:42
	inet 192.168.250.175 netmask 0xffffff00 broadcast 192.168.250.255
	inet6 fe80::230:48ff:fef3:5a42%lagg0 prefixlen 64 scopeid 0xe 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	laggproto lacp
	laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

From ftp://ftp.supermicro.com/CDR-BH_1.0...tel/LAN/v12.4/PROXGB/DOCS/FREEBSD/freebsd.htm

Polling

To enable polling in the driver, add the following options to the kernel configuration, and then recompile the kernel:

options DEVICE_POLLING
options HZ=1000

To turn on polling:

ifconfig <interface_num> polling

To turn off polling:

ifconfig <interface_num> -polling

NOTES: DEVICE POLLING is only valid for non-SMP kernels.

The driver has to be built into the kernel for DEVICE POLLING to be enabled in the driver.

What I don't understand is that it lists the POLLING option on my 1Gb network card even though I'm running an SMP-enabled kernel.
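For completeness, this is roughly how polling is kept persistent on the 1G ports where it does work; a sketch matching the lagg member names above:

Code:
# /etc/rc.conf - enable polling on the lagg member ports at boot.
ifconfig_igb0="up polling"
ifconfig_igb1="up polling"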
 