interrupts killing server

Hi.

Recently my FreeBSD 8.0 server running the opentracker software started to malfunction at peak hours (when the peer count approaches 3M). Some debugging with top -S and vmstat -i showed that the network card em0 was generating about 8000 interrupts/sec, and when the intr process reached 100% WCPU the server became unresponsive. I concluded that this was the problem.

I upgraded to 8.2, compiled polling support into the kernel and enabled it on the em0 interface, but the behaviour is unchanged: when the intr process reaches 100% WCPU, the kernel and opentracker eat up all remaining CPU power until nothing is left and the server becomes unresponsive.
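For reference, this is roughly how I set polling up (the usual recipe from polling(4); the IP address below is just a placeholder, not my real one):
Code:
# kernel configuration: add these options, then rebuild and reinstall the kernel
options DEVICE_POLLING
options HZ=1000

# enable polling on the interface at runtime
ifconfig em0 polling

# or make it persistent in /etc/rc.conf
ifconfig_em0="inet 192.0.2.10 netmask 255.255.255.0 polling"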

What should I do next to determine and fix the problem? Thank you in advance for assistance.
 
Further digging using top -HS showed that the biggest CPU consumer is
Code:
99.17% {swi1: netisr 0}
My guess is that increasing net.isr.maxthreads might help, but the result will be seen at the next peak time.
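If it helps, net.isr.maxthreads is a boot-time tunable, so the change has to go into /boot/loader.conf and needs a reboot (the values below are just what I intend to try, not a recommendation):
Code:
# /boot/loader.conf
net.isr.maxthreads=4     # allow one netisr thread per core
net.isr.bindthreads=1    # bind each netisr thread to its own CPU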
 
Try to disable MSI on the network interface, see if that changes anything. If it does, you know where to look.
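Something like this in /boot/loader.conf should do it (system-wide PCI tunables, reboot required; see pci(4)):
Code:
# /boot/loader.conf -- turn off MSI-X and MSI allocation
hw.pci.enable_msix=0
hw.pci.enable_msi=0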
 
Disabling MSI didn't help. It looks like I may have misdiagnosed the problem. At failure time the packet rate reaches 40k pps in / 35k pps out, which might be the networking driver's limit on a single CPU core, but the system isn't using more than one core for this task and the three remaining cores are just idling.
I disabled polling and increased
Code:
net.isr.maxthreads=3
also switched
Code:
net.isr.direct=0
but currently in top -HPS I see only one netisr thread serving all the load while the others are idle.
Code:
   12 root     -44    -     0K   384K CPU3    3 159:32 54.69% {swi1: netisr 3}
   12 root     -44    -     0K   384K WAIT    0   0:00  0.00% {swi1: netisr 0}
   12 root     -44    -     0K   384K WAIT    0   0:00  0.00% {swi1: netisr 2}
Will the other threads only start working when the first one is overloaded, or should they already be sharing the load?
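For completeness, this is how I am checking the netisr configuration (plain sysctl reads; maxthreads and bindthreads only take effect when set in /boot/loader.conf):
Code:
# dump the whole netisr subtree
sysctl net.isr

# the values I am watching in particular
sysctl net.isr.maxthreads net.isr.numthreads net.isr.direct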
 
To get multithreaded receive processing I installed the Yandex PRO/1000 driver; result from top -HPS:
Code:
    0 root      76    0     0K   176K PKWAIT  1   3:38 41.55% {em0_rx0_0}
    0 root      76    0     0K   176K PKWAIT  3   3:37 41.06% {em0_rx0_2}
    0 root      76    0     0K   176K CPU1    1   3:38 40.58% {em0_rx0_1}
CPU usage seems quite high for only 30k pps in / 25k pps out. We'll see soon how this handles peak hours.
 
If it's usable at your site, you could set
Code:
net.inet.ip.process_options=0
and see what happens.
AFAIK,
Code:
net.inet.ip.process_options=1
is the biggest performance killer I've seen on machines handling more than 30 kpps. I expect a big improvement (no more packet loss), but disabling IP options processing may have side effects in other areas.
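It's an ordinary runtime sysctl, so it can be flipped on the fly for a quick test and made permanent afterwards:
Code:
# test immediately, no reboot needed
sysctl net.inet.ip.process_options=0

# keep it across reboots: add this line to /etc/sysctl.conf
net.inet.ip.process_options=0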
 
Sadly neither the Yandex driver nor
Code:
net.inet.ip.process_options=0
had any positive effect (the same goes for the netisr and polling changes). It almost seems that the server load grows exponentially with pps; I will try to log that.
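For the logging I have something simple like this in mind (just a sketch; interface name, interval and log path are placeholders):
Code:
#!/bin/sh
# append interface counters and load average once a minute;
# deltas between samples give the packets-per-second rate
while :; do
    date '+%F %T'
    netstat -ibn -I em0 | head -2
    uptime
    sleep 60
done >> /var/log/em0-pps-load.log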
 
Indeed, the system load seems (at least to me) to increase exponentially until the system fails:
Code:
k pps in / k pps out - status, LOAD last minute
28/22 - system running smoothly, almost no CPU load as expected, 0.05
30/24 - system still running but CPU load increasing fast, 0.96
32/27 - system still running but CPU load increasing fast, 1.87
33/28 - system still running but soon won't be, 2.96
+few/+few - system becomes unresponsive
Some outputs I based my above assumptions on:
Code:
 top -HPS
last pid:  1239;  load averages:  0.05,  0.18,  0.10
up 0+00:04:21  21:46:07
111 processes: 5 running, 86 sleeping, 20 waiting
CPU 0:  1.2% user,  0.0% nice,  8.1% system,  0.0% interrupt, 90.7% idle
CPU 1:  0.0% user,  0.0% nice,  1.9% system,  0.0% interrupt, 98.1% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system, 40.4% interrupt, 59.6% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 31M Active, 11M Inact, 170M Wired, 344K Cache, 18M Buf, 3709M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    64K CPU0    0   4:03 100.00% {idle: cpu0}
   11 root     171 ki31     0K    64K CPU3    3   3:56 100.00% {idle: cpu3}
   11 root     171 ki31     0K    64K RUN     1   3:45 81.59% {idle: cpu1}
   12 root     -44    -     0K   320K WAIT    2   1:08 63.96% {swi1: netisr 0}
   11 root     171 ki31     0K    64K CPU2    2   3:48 57.28% {idle: cpu2}
 1207 web       49    0 22916K 13044K kqread  0   0:22  4.49% {initial thread}
   20 root      47    -     0K    16K flowcl  3   0:03  0.78% flowcleaner
   12 root     -32    -     0K   320K WAIT    1   0:01  0.20% {swi4: clock}
    0 root      76    0     0K   176K sched   1   0:49  0.00% {swapper}

netstat -h 1
            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       28K     0     0       3.5M        22K     0       2.4M     0


 vmstat -i
interrupt                          total       rate
irq19: atapci0                      1750          6
irq21: ehci0                         504          1
irq23: ehci1                         619          2
cpu0: timer                       570964       1968
irq257: em0:tx 0                   22514         77
cpu1: timer                       570714       1967
cpu2: timer                       570833       1968
cpu3: timer                       570830       1968
Total                            2308728       7961

====

 top -HPS
last pid:  1268;  load averages:  0.96,  0.49,  0.24
up 0+00:07:22  21:49:08
111 processes: 8 running, 85 sleeping, 18 waiting
CPU 0:  0.0% user,  0.0% nice, 22.7% system,  0.0% interrupt, 77.3% idle
CPU 1:  0.0% user,  0.0% nice, 27.3% system, 22.7% interrupt, 50.0% idle
CPU 2:  0.0% user,  0.0% nice, 22.7% system,  0.0% interrupt, 77.3% idle
CPU 3:  0.0% user,  0.0% nice, 36.4% system,  0.0% interrupt, 63.6% idle
Mem: 35M Active, 12M Inact, 212M Wired, 336K Cache, 19M Buf, 3663M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    64K RUN     2   6:00 76.56% {idle: cpu2}
   11 root     171 ki31     0K    64K RUN     3   6:18 70.17% {idle: cpu3}
   11 root     171 ki31     0K    64K RUN     0   6:11 69.19% {idle: cpu0}
   11 root     171 ki31     0K    64K CPU1    1   5:50 53.56% {idle: cpu1}
 1207 web       66    0 27012K 16836K kqread  0   1:10 29.59% {initial thread}
    0 root      63    0     0K   176K PKWAIT  1   0:35 28.76% {em0_rx0_1}
    0 root      63    0     0K   176K PKWAIT  3   0:36 28.27% {em0_rx0_0}
    0 root      63    0     0K   176K CPU3    0   0:36 27.49% {em0_rx0_2}
   12 root     -44    -     0K   320K CPU0    0   1:41 24.27% {swi1: netisr 0}
   12 root     -32    -     0K   320K WAIT    1   0:03 11.38% {swi4: clock}
   20 root      45    -     0K    16K flowcl  3   0:07  1.56% flowcleaner

 netstat -h 1
            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       30K     0     0       3.7M        24K     0       2.8M     0

 vmstat -i
interrupt                          total       rate
irq19: atapci0                      1828          3
irq21: ehci0                         750          1
irq23: ehci1                         947          2
cpu0: timer                       903241       1972
irq256: em0:rx 0                 1280943       2796
irq257: em0:tx 0                   91065        198
cpu1: timer                       902991       1971
cpu2: timer                       903109       1971
cpu3: timer                       903105       1971
Total                            4987979      10890

===

 top -HPS
last pid:  1352;  load averages:  1.87,  1.62,  0.95
up 0+00:17:08  21:58:54
111 processes: 10 running, 82 sleeping, 19 waiting
CPU 0:  3.7% user,  0.0% nice, 55.6% system,  0.0% interrupt, 40.7% idle
CPU 1:  0.0% user,  0.0% nice, 23.1% system, 23.1% interrupt, 53.8% idle
CPU 2:  0.0% user,  0.0% nice, 74.1% system,  0.0% interrupt, 25.9% idle
CPU 3:  0.0% user,  0.0% nice, 38.5% system, 19.2% interrupt, 42.3% idle
Mem: 49M Active, 12M Inact, 310M Wired, 324K Cache, 22M Buf, 3551M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    64K RUN     3  12:19 52.20% {idle: cpu3}
    0 root      76    0     0K   176K RUN     2   4:08 50.68% {em0_rx0_0}
    0 root      76    0     0K   176K CPU0    0   4:08 50.39% {em0_rx0_2}
    0 root      76    0     0K   176K CPU3    3   4:08 50.00% {em0_rx0_1}
   11 root     171 ki31     0K    64K RUN     2  11:44 47.46% {idle: cpu2}
   11 root     171 ki31     0K    64K RUN     1  11:20 47.17% {idle: cpu1}
   11 root     171 ki31     0K    64K RUN     0  11:23 44.97% {idle: cpu0}
 1207 web      105    0 41740K 31580K RUN     0   4:47 44.68% {initial thread}
   12 root     -44    -     0K   320K CPU2    1   3:33 19.19% {swi1: netisr 0}
   12 root     -32    -     0K   320K WAIT    2   0:22  0.39% {swi4: clock}


netstat -h 1
            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       32K     0     0       3.9M        27K     0       3.3M     0

vmstat -i
interrupt                          total       rate
irq19: atapci0                      2097          1
irq21: ehci0                        1638          1
irq23: ehci1                        2135          2
cpu0: timer                      2084298       1981
irq256: em0:rx 0                 5584479       5308
irq257: em0:tx 0                  364073        346
cpu1: timer                      2084049       1981
cpu2: timer                      2084163       1981
cpu3: timer                      2084155       1981
Total                           14291087      13584


===

last pid:  1890;  load averages:  2.96,  2.97,  2.94
up 0+02:50:34  17:15:25
113 processes: 10 running, 84 sleeping, 19 waiting
CPU 0:  1.5% user,  0.0% nice, 74.5% system, 13.5% interrupt, 10.5% idle
CPU 1:  2.3% user,  0.0% nice, 75.9% system, 10.5% interrupt, 11.3% idle
CPU 2:  1.9% user,  0.0% nice, 68.0% system, 12.4% interrupt, 17.7% idle
CPU 3:  0.8% user,  0.0% nice, 60.2% system, 16.9% interrupt, 22.2% idle
Mem: 68M Active, 13M Inact, 394M Wired, 276K Cache, 18M Buf, 3447M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
    0 root     112    0     0K   176K CPU1    1  83:40 73.97% {em0_rx0_2}
    0 root      76    0     0K   176K RUN     2  83:44 73.58% {em0_rx0_1}
    0 root      76    0     0K   176K RUN     3  83:45 71.48% {em0_rx0_0}
 1208 web      110    0 59780K 49532K CPU2    0  76:17 68.16% {initial thread}
   11 root     171 ki31     0K    64K RUN     3  77:35 27.20% {idle: cpu3}
   11 root     171 ki31     0K    64K RUN     2  73:50 23.10% {idle: cpu2}
   12 root     -44    -     0K   320K CPU3    3  53:15 22.66% {swi1: netisr 0}
   11 root     171 ki31     0K    64K RUN     1  69:23 16.89% {idle: cpu1}
   11 root     171 ki31     0K    64K RUN     0  64:39 13.77% {idle: cpu0}
   20 root      49    -     0K    16K flowcl  3   3:12  4.30% flowcleaner
   12 root     -32    -     0K   320K WAIT    1  10:45  3.37% {swi4: clock}


netstat -hI em0 1
            input          (em0)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       33K     0     0       4.0M        28K     0       3.8M     0


vmstat -i
interrupt                          total       rate
irq19: atapci0                      4975          0
irq21: ehci0                       15473          1
irq23: ehci1                       23388          2
cpu0: timer                     20501007       1991
irq256: em0:rx 0                61712428       5996
irq257: em0:tx 0                 5291014        514
cpu3: timer                     20500731       1991
cpu1: timer                     20500830       1991
cpu2: timer                     20500782       1991
Total                          149050628      14482

Any ideas?
 
I have a question: why do you have only a single netisr thread active? How many NICs are installed in the system?

On a machine with multiple interfaces running as a PPPoE server (with ports/net/mpd5) I see:
Code:
last pid: 68145;  load averages:  0.20,  0.08,  0.06                                                     up 3+17:04:57  13:45:46
230 processes: 5 running, 199 sleeping, 26 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system, 48.1% interrupt, 51.9% idle
CPU 1:  0.0% user,  0.0% nice,  1.5% system,  4.9% interrupt, 93.6% idle
CPU 2:  0.0% user,  0.0% nice,  1.1% system,  7.1% interrupt, 91.7% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  6.7% interrupt, 93.3% idle
Mem: 408M Active, 910M Inact, 452M Wired, 315M Buf, 1196M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root       171 ki31     0K    64K CPU3    3  83.4H 98.97% {idle: cpu3}
   11 root       171 ki31     0K    64K CPU2    2  84.1H 93.26% {idle: cpu2}
   11 root       171 ki31     0K    64K RUN     1  82.8H 92.87% {idle: cpu1}
   11 root       171 ki31     0K    64K CPU0    0  65.7H 59.28% {idle: cpu0}
   12 root       -68    -     0K   416K WAIT    0 676:42 21.09% {irq257: bce0}
   12 root       -68    -     0K   416K WAIT    0 428:00 14.06% {irq26: bge1}
[color="DarkGreen"]   12 root       -44    -     0K   416K WAIT    0 145:55  8.40% {swi1: netisr 0}
   12 root       -44    -     0K   416K WAIT    2 200:58  6.79% {swi1: netisr 2}
   12 root       -44    -     0K   416K WAIT    1 186:37  6.59% {swi1: netisr 1}
   12 root       -44    -     0K   416K WAIT    3 224:33  5.96% {swi1: netisr 3}
   12 root       -64    -     0K   416K WAIT    0  86:47  1.66% {irq18: uhci2}
 1507 root        48    0 13368K  7212K select  2   3:36  1.37% zebra
    0 root       -68    0     0K   112K -       2  20:28  1.07% {dummynet}
 2709 root        44    0 45700K 26708K select  1  44:45  0.20% snmpd
68128 ecazamir    44    0 74320K 10868K nanslp  1   0:00  0.10% bmon
    9 root        44    -     0K    16K ipmire  2  22:14  0.00% ipmi0: kcs

While the network traffic is:
Code:
# bmon
          Name                          RX                         TX
---------------------------------------------------------------------------------------------------------------------------------
localhost (local)           x      Rate         #   % x      Rate         #   %
  1   bge1                  x   16.91MiB    20.17K    x   12.47MiB    16.31K
  2   bce0                  x   12.12MiB    16.17K    x   16.44MiB    18.69K
  5   vlan1                 x    6.75MiB     9.06K    x   10.17MiB    11.44K
  6   vlan2                 x    5.37MiB     7.11K    x    6.27MiB     7.25K
  7   vlan100               x    8.75MiB    10.86K    x    7.77MiB     9.22K
  8   vlan101               x    1.12MiB     1.81K    x    1.89MiB     1.81K
  9   vlan102               x  231.35KiB      220     x   14.63KiB      142
  10  vlan103               x   85.64KiB      333     x  276.30KiB      401
  11  vlan104               x    1.36MiB     1.22K    x  534.92KiB     1003
  12  vlan105               x  318.95KiB      375     x  401.08KiB      360
  13  vlan106               x    5.02MiB     4.78K    x    1.06MiB     2.98K
  14  vlan107               x   48.76KiB      598     x  566.58KiB      442
vlan1 and vlan2 (WAN links) are carried by bce0, vlan100 to vlan107 (local network links) are carried by bge1.

vmstat -i shows:
Code:
> vmstat -i
interrupt                          total       rate
irq1: atkbd0                          50          0
irq14: ata0                           35          0
irq16: uhci0 ehci0              13404940         41
irq18: uhci2                  1587610129       4942
irq22: uhci4                      566567          1
irq24: bge0                     12540884         39
irq26: bge1                   1601372757       4985
cpu0: timer                    642446616       2000
irq256: ciss0                    1769298          5
[color="DarkGreen"]irq257: bce0                  2582178776       8038[/color]
cpu2: timer                    642439482       1999
cpu3: timer                    642439491       1999
cpu1: timer                    642439493       1999
Total                         8369208518      26054
and 8000 irq/sec does not seem to be a problem for bce on a server with a single Xeon 5130 @ 2.00GHz.
 
Indeed, my initial assumption about IRQs was wrong. Instead there seems to be some networking problem that shows up above roughly 33k pps in / 28k pps out. Only one network card (em0) is active; another (em1) is unplugged.

For now I have gone back to a GENERIC kernel with the default driver instead of the Yandex NIC driver, as it gave no improvement.

Good question about the netisr threads; I asked the same thing a few posts back.
Code:
top -HPS 1000 | grep netisr
   12 root     -44    -     0K   400K CPU2    2   6:03 71.97% {swi1: netisr 3}
   12 root     -44    -     0K   400K WAIT    1   0:00  0.00% {swi1: netisr 1}
   12 root     -44    -     0K   400K WAIT    0   0:00  0.00% {swi1: netisr 0}
   12 root     -44    -     0K   400K WAIT    0   0:00  0.00% {swi1: netisr 2}
One netisr thread is running, the others are idle.
Code:
last pid:  1223;  load averages:  0.35,  0.34,  0.20
up 0+00:06:59  15:59:58
108 processes: 6 running, 78 sleeping, 24 waiting
CPU 0:  3.9% user,  0.0% nice, 12.4% system,  1.7% interrupt, 82.0% idle
CPU 1:  0.9% user,  0.0% nice,  4.3% system,  3.0% interrupt, 91.8% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system, 69.4% interrupt, 30.6% idle
CPU 3:  2.1% user,  0.0% nice,  2.1% system,  0.0% interrupt, 95.7% idle
Mem: 33M Active, 10M Inact, 268M Wired, 368K Cache, 14M Buf, 3612M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    64K RUN     1   6:14 92.48% {idle: cpu1}
   11 root     171 ki31     0K    64K CPU0    0   5:51 86.08% {idle: cpu0}
   11 root     171 ki31     0K    64K RUN     3   4:19 79.88% {idle: cpu3}
   12 root     -44    -     0K   400K CPU2    2   4:24 68.65% {swi1: netisr 3}
   11 root     171 ki31     0K    64K RUN     2   4:50 60.79% {idle: cpu2}
 1051 web       60    0 27012K 17644K kqread  1   1:44 24.66% {initial thread}
   12 root     -68    -     0K   400K WAIT    0   0:11  2.10% {irq256: em0:rx 0}
   19 root      48    -     0K    16K flowcl  3   0:07  1.56% flowcleaner
   12 root     -68    -     0K   400K WAIT    1   0:04  0.68% {irq257: em0:tx 0}
   12 root     -32    -     0K   400K WAIT    1   0:04  0.10% {swi4: clock}

This is at 30/25 kpps.
 
Have you looked at
Code:
# sysctl hw.em
?
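Most of the em(4) knobs are boot-time tunables set from /boot/loader.conf, for example (values purely illustrative, check em(4) for your driver version):
Code:
# /boot/loader.conf
hw.em.rxd=4096               # receive descriptors per queue
hw.em.txd=4096               # transmit descriptors per queue
hw.em.rx_process_limit=500   # packets processed per interrupt, -1 = no limit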
And... how long is your firewall ruleset? ipfw and/or pf packet filtering may be accounted as interrupt/swi time.
What happens if you temporarily disable packet filtering (or add a rule matching all packets near the beginning of the ruleset)?
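For example, with ipfw a catch-all rule at the very top takes the rest of the ruleset out of the picture (the rule number is arbitrary):
Code:
# accept everything before any other rule is evaluated
ipfw add 10 allow ip from any to any

# or simply check whether a packet filter is loaded at all
kldstat | grep -E 'ipfw|pf'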
 
I have looked, and to my inexperienced eye there is no obvious problem there, at least at off-peak hours.
Code:
sysctl dev.em.0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.1.9
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 subdevice=0x0605 class=0x020000
dev.em.0.%parent: pci4
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: -1
dev.em.0.flow_control: 3
dev.em.0.link_irq: 2
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 2052
dev.em.0.queue0.txd_tail: 2056
dev.em.0.queue0.tx_irq: 390141769
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 3989
dev.em.0.queue0.rxd_tail: 3987
dev.em.0.queue0.rx_irq: 415629309
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 1143070495
dev.em.0.mac_stats.good_pkts_recvd: 1143070495
dev.em.0.mac_stats.bcast_pkts_recvd: 1618178
dev.em.0.mac_stats.mcast_pkts_recvd: 240849
dev.em.0.mac_stats.rx_frames_64: 664262794
dev.em.0.mac_stats.rx_frames_65_127: 292581802
dev.em.0.mac_stats.rx_frames_128_255: 4373990
dev.em.0.mac_stats.rx_frames_256_511: 177880553
dev.em.0.mac_stats.rx_frames_512_1023: 2656818
dev.em.0.mac_stats.rx_frames_1024_1522: 1314537
dev.em.0.mac_stats.good_octets_recvd: 140894372999
dev.em.0.mac_stats.good_octets_txd: 119833116215
dev.em.0.mac_stats.total_pkts_txd: 942354956
dev.em.0.mac_stats.good_pkts_txd: 942354954
dev.em.0.mac_stats.bcast_pkts_txd: 47
dev.em.0.mac_stats.mcast_pkts_txd: 844
dev.em.0.mac_stats.tx_frames_64: 471305127
dev.em.0.mac_stats.tx_frames_65_127: 279859643
dev.em.0.mac_stats.tx_frames_128_255: 141605650
dev.em.0.mac_stats.tx_frames_256_511: 19146803
dev.em.0.mac_stats.tx_frames_512_1023: 11279327
dev.em.0.mac_stats.tx_frames_1024_1522: 19158406
dev.em.0.mac_stats.tso_txd: 3199317
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 3
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
But I will try to log what happens with these counters at peak time, roughly along the lines sketched below.
No firewall is being used.
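(A minimal sketch; the counter names come straight from the sysctl output above, only the log path is made up.)
Code:
#!/bin/sh
# record em0 drop/overrun counters once a minute during peak hours
while :; do
    date '+%F %T'
    sysctl dev.em.0.mac_stats.missed_packets \
           dev.em.0.mac_stats.recv_no_buff \
           dev.em.0.rx_overruns \
           dev.em.0.dropped
    sleep 60
done >> /var/log/em0-counters.log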
 
Although I had previously tuned some well-known settings and everything that looked obvious from dmesg, netstat and vmstat, it apparently wasn't enough. After applying additional settings from http://serverfault.com/questions/64356/freebsd-performance-tuning-sysctls-loader-conf-kernel the server is currently running without problems at the required 40k/35k pps, with CPU load growing linearly as expected and only about 20% of CPU power used at peak hours.
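The changes were of this general kind: ring sizes, mbuf clusters and queue/buffer limits (the values here are only an illustration, not the exact list from that page):
Code:
# /boot/loader.conf              (values illustrative)
hw.em.rxd=4096
hw.em.txd=4096
kern.ipc.nmbclusters=262144

# /etc/sysctl.conf               (values illustrative)
kern.ipc.maxsockbuf=16777216
net.inet.ip.intr_queue_maxlen=4096
kern.ipc.somaxconn=4096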

It's a pity that I don't know exactly which setting was the critical one, and that it wasn't visible to me in the well-known places.

Thank you, ecazamir, for the assistance; brainstorming this would probably have taken me much longer without your help.
 