PF states limit reached

Hi

I've been having a problem with PF for some time now and I can't solve it (FreeBSD 11.1-RELEASE-p8). It appears randomly (sometimes when there is more traffic, sometimes not, roughly 1 to 5 times per month); the message is:

kernel: [zone: pf states] PF states limit reached

This drops all connections until I restart the pf service. I tried increasing the limits to very large numbers, but that doesn't resolve the problem:

set limit { states 800000, frags 400000, src-nodes 300000 }
set timeout { adaptive.start 18000, adaptive.end 39000 } # note: I tried adding this, but it doesn't help

'pfctl -si' shows:

Code:
Status: Enabled for 0 days 01:55:25           Debug: Urgent

Interface Stats for vmx0              IPv4             IPv6
  Bytes In                       201676094           540906
  Bytes Out                     1274748598          1157607
  Packets In
    Passed                         1065152             2725
    Blocked                           7017                0
  Packets Out
    Passed                          573385             2501
    Blocked                             19                0

State Table                          Total             Rate
  current entries                      700
  searches                         1650824          238.4/s
  inserts                            64402            9.3/s
  removals                             335            0.0/s
Counters
  match                              73832           10.7/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                        41            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  map-failed                             0            0.0/s

'pfctl -sm':

Code:
states        hard limit   800000
src-nodes     hard limit   300000
frags         hard limit   400000
table-entries hard limit   200000

The system is a VM on VMware. I have this problem with different pf.conf configurations. It usually happens on one particular VM, but sometimes it happens on other VMs as well.

I have googled a lot, but I can't solve this.

Any ideas? Thank you
 
I'm feeling alone here :)

Does anyone have an idea how to start debugging this problem?

Maybe spoofing attacks?

How could the limit be getting reached?

Thank you, I appreciate your help.
 
From your pfctl -si output the state limits aren't your problem (700 active states vs. an 800000 limit). Or do the states actually exceed the limit when the error is triggered?
Also increase your "table-entries" limit, as this defines how many addresses can be kept in _all_ tables combined, so setting the states limit higher than the table-entries limit doesn't make much sense. This might already solve your problem IF your system isn't under heavy memory pressure.
PF state tables reside in kernel memory; if your system's memory usage has already exceeded the available memory, your state tables can't grow any larger no matter what limits you've set.
Have a look at the "memory" counter in the pfctl -si output: if it is >0, PF was unable to allocate memory. If your states were below the set limit, you have hit the kernel memory limit and most likely the physical memory limit.
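
For example, something along these lines in pf.conf (the numbers are only placeholders; size table-entries so it covers all of your tables combined):

Code:
# pf.conf - example values only
set limit { states 800000, frags 400000, src-nodes 300000, table-entries 1000000 }

Reload with pfctl -f /etc/pf.conf and verify with pfctl -sm.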

vmstat | grep pf may also give hints if you've hit any memory limits due to memory pressure.


If there is enough physical memory available, you might need to adjust kern.ipc.nmbclusters. Have a look at tuning(7):
kern.ipc.nmbclusters may be adjusted to increase the number of network
mbufs the system is willing to allocate. Each cluster represents
approximately 2K of memory, so a value of 1024 represents 2M of kernel
memory reserved for network buffers. You can do a simple calculation to
figure out how many you need. If you have a web server which maxes out
at 1000 simultaneous connections, and each connection eats a 16K receive
and 16K send buffer, you need approximately 32MB worth of network buffers
to deal with it. A good rule of thumb is to multiply by 2, so 32MBx2 =
64MB/2K = 32768. So for this case you would want to set
kern.ipc.nmbclusters to 32768. We recommend values between 1024 and 4096
for machines with moderate amounts of memory, and between 4096 and 32768
for machines with greater amounts of memory. Under no circumstances
should you specify an arbitrarily high value for this parameter, it could
lead to a boot-time crash. The -m option to netstat(1) may be used to
observe network cluster use. Older versions of FreeBSD do not have this
tunable and require that the kernel config(8) option NMBCLUSTERS be set
instead.
While this affects the total number of connections a system can handle, it shouldn't trigger the "states limit reached" error in PF.
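
For reference, a quick way to check the current value and the current cluster usage (the loader.conf entry is only an example; pick a value that fits your memory):

Code:
# current limit and current mbuf cluster usage
sysctl kern.ipc.nmbclusters
netstat -m | grep 'mbuf clusters'

# to raise it persistently, set the tunable in /boot/loader.conf, e.g.:
# kern.ipc.nmbclusters="65536"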
 
Hi Sko, thanks for the reply

My current kern.ipc.nmbclusters is 530452 (unchanged; do you think I need to increase it?)

I increased table-entries:

Code:
states        hard limit   800000
src-nodes     hard limit   300000
frags         hard limit   400000
table-entries hard limit  1000000

My vmstat (no 'pf' text returned):

Code:
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr da0 da1   in    sy    cs us sy id
0 0 0 9.1G  185M  1659  25   0   1  1902 1568   0   0  246  3297  6283  4  1 95

Regarding the memory counter in pfctl: I don't know whether this value increased when the PF limit was reached; I'll check when the issue appears again:
Code:
pfctl -si | grep memory
  memory                                 0            0.0/s

My system top:

Code:
Mem: 2326M Active, 3463M Inact, 961M Laundry, 1373M Wired, 750M Buf, 165M Free
Swap: 11G Total, 849M Used, 11G Free, 7% Inuse

Perhaps temporary heavy memory pressure could cause the PF limit to be reached? I would have expected other services to fail as well, but maybe that is the problem; I'll keep an eye on memory.

I will post any additional information. In any case, if you or someone else has more ideas, I'd appreciate it.
Thank you
 
My vmstat (no 'pf' text returned):
Sorry, my bad. You need to specify the -m switch for vmstat:
vmstat -m | grep -E 'pf|Size'

Try to get the actual number of states and the memory used (by pf/tables) / free when the system fails. At least then you'd know whether it is really memory-related or something else is wrong...

Also check whether you can establish any egress connections from the host (i.e. whether the whole interface is dead or misbehaving and PF tipping over is just a symptom).
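
A rough sketch of what to capture the moment it fails again (the log path and the exact set of commands are just a suggestion):

Code:
{
  date
  pfctl -si                        # counters, incl. the "memory" line
  pfctl -ss | wc -l                # actual number of states right now
  pfctl -sm                        # configured limits
  vmstat -m | grep -E 'pf|Size'    # kernel memory used by pf
} >> /var/log/pf-debug.log 2>&1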
 
Hi again. I was waiting until the system failed again; here is the report:

Code:
vmstat -m | grep -E 'pf|Size' :

         Type InUse MemUse HighUse Requests  Size(s)
    pfs_nodes    21     6K       -       21  256
      tcpfunc     1     1K       -        1  32
       pfsync     1     1K       -        1  1024
      pf_temp     0     0K       -       67  32
      pf_hash     3  2880K       -        3
     pf_ifnet    49     9K       -      170  128,256,2048
      pf_rule   175   175K       -      205  1024
      pf_osfp  1184   122K       -     7104  64,128
     pf_table    21    42K       -       51  2048
(I have been restarting PF and the values stay the same)

Code:
/usr/bin/netstat -m :

272/10423/10695 mbufs in use (current/cache/total)
258/5394/5652/530452 mbuf clusters in use (current/cache/total/max)
258/5281 mbuf+clusters out of packet secondary zone in use (current/cache)
8/191/199/265226 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/78585 9k jumbo clusters in use (current/cache/total/max)
0/0/0/44204 16k jumbo clusters in use (current/cache/total/max)
616K/14157K/14773K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

Code:
pfctl -si

Status: Enabled for 1 days 13:57:33           Debug: Urgent

Interface Stats for vmx0              IPv4             IPv6
  Bytes In                      3285955569         15377272
  Bytes Out                    29910308246         37350332
  Packets In
    Passed                        20779333            78514
    Blocked                          89914               32
  Packets Out
    Passed                        15088131            72172
    Blocked                            357                0

State Table                          Total             Rate
  current entries                     1201
  searches                        36108835          264.2/s
  inserts                           851345            6.2/s
  removals                           51490            0.4/s
Counters
  match                             946269            6.9/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              2            0.0/s
  memory                              4128            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                       773            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  map-failed                             0            0.0/s

pfctl -si now reports a non-zero memory counter:
Code:
memory                              4128            0.0/s

System top:

Code:
Mem: 3680M Active, 1994M Inact, 819M Laundry, 1491M Wired, 813M Buf, 304M Free
Swap: 11G Total, 1330M Used, 10G Free, 11% Inuse

If the pfctl -si memory counter is >0, does that mean it's a memory allocation problem? How could I solve it?

Thank you
 
Hi, the problem still continues, even with 'set optimization aggressive'.

With 'vmstat -z' I get:
Code:
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                48,      0,       0,     332,     109,   0,   0
pf states:              296, 800007,     956,     214, 3175553,132508,   0
pf state keys:           88,      0,     956,    4894, 3175553,   0,   0
pf source nodes:        136, 300005,       1,      86,      27,   0,   0
pf table entries:       160, 1000000,      38,      62,     294,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               112,      0,       0,       0,       0,   0,   0
pf frag entries:         40, 400000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0
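
To see whether the FAIL column for "pf states" keeps growing while the problem is happening, a simple loop like this could help (the interval is arbitrary):

Code:
while true; do
  date
  vmstat -z | grep 'pf states'   # FAIL counts failed state allocations
  sleep 60
done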

Why are pf states failing? Do you have any ideas to solve this problem?

Thank you
 
Some time ago, with an increase in traffic on the network, PF became congested...

The solution was to increase some parameters in pf.conf:

pf.conf
[...]
#-------------------------------------------------------------------------------
# (3) PF: Options
#-------------------------------------------------------------------------------

# Misc Options
set skip on lo
set debug urgent
set block-policy drop
set loginterface $ext_if
set state-policy if-bound
set fingerprints "/etc/pf.os"
set ruleset-optimization basic
set optimization normal
set limit { states 1000000, frags 1000000, src-nodes 100000, table-entries 1000000 }
[...]
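
In that same options section, adaptive timeouts can also help shed idle states before the limit is hit; a sketch relative to the 1000000 states limit above (see pf.conf(5) for adaptive.start/adaptive.end):

Code:
# timeouts scale down linearly once more than 600000 states exist,
# reaching zero at 1200000
set timeout { adaptive.start 600000, adaptive.end 1200000 }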
 

Attachments

  • pf.conf
    51 KB
Hi, I already had the limits set high.

I'm looking at vmstat -z and it looks like pf is reaching the states limit (800005).

I'm increasing this limit further and further.

Thank you
 
Look in your state table. What is the most frequent state? Is there one type (e.g. UDP to a specific port) that's overrepresented, or are expired states not getting purged?
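
A quick way to summarize the state table, for example (column positions can shift a bit for NAT'd states, so adjust the awk fields as needed):

Code:
# states grouped by protocol
pfctl -ss | awk '{print $2}' | sort | uniq -c | sort -rn

# states grouped by protocol and first endpoint (host:port)
pfctl -ss | awk '{print $2, $3}' | sort | uniq -c | sort -rn | head -20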
 
Kristof Provost, how do I ensure that expired states are getting purged? I think we may have hit that issue today with one of our servers. It had very low load, over 20 GB of free RAM, and essentially 1-2% CPU and disk activity for the day (no spike). But suddenly it became impossible to connect to the host (HTTP, SSH, etc.) and we received the log message posted by the OP.
The only thing I can see is that this server has been up for almost a year without any reboot, so I suspect that expired states are not being purged.
 
> Kristof Provost, how do I ensure that expired states are getting purged?

Normally that happens automatically, but there have been bugs (both in pf and on ARM in the timing code) that prevented that from happening.

You need to look at the current state table and see if it contains states that should be expired. We can't debug anything without working out what the exact problem is, and we can't do that without looking at what's actually going on.
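
The verbose state listing shows an age and an expiry timer for every state, so something like this gives a first impression (just a sketch):

Code:
# every state is printed with an "age ..., expires in ..." line
pfctl -vvss | grep 'expires in' | head -20

# states that keep showing "expires in 00:00:00" but never disappear would
# point at the purge not running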
 