PF PSA: Subtle changes to pf rules may be required when moving to 13.0

I ran across a problem while upgrading one of the routers I manage from 12.2 to 13.0, and thought I should warn others who might run into the same thing. Before I get to the issue, let me describe how things are set up. Keep in mind I'm not sure everything below is relevant, as I've had limited time to work out the full scope of the change, but I thought I should include as much as possible in case it matters.
  • The router has multiple vlan(4)s on top of a lagg(4) interface with multiple em(4) interfaces underneath.
  • Routing is performed between the VLANs, and several vlan interfaces have aliased IPs on them.
  • One vlan interface in particular has numerous IP aliases on it, one for each jail that requires IP connectivity (e.g. unbound(8)).
  • At least two of the jails (both hosted on the router in question) need to communicate with one another (e.g. unbound's jail needs to talk to nsd(8)'s to resolve stub zones).
  • There's only one lo(4) interface, lo0, and it has no IP aliases and no jails using any addresses on it.
  • I was not, and am not, skipping filtering on lo0. Any traffic that needed to be sent over this interface had explicit rules permitting it.
  • No jails are using vnet, they are just assigned IP addresses (always aliases) already present on the host.
In 12.x and earlier, traffic sent between the unbound and nsd jails flowed over the loopback interface (despite the IP addresses being assigned to a vlan interface), and was permitted by a rule like this:
Code:
pass quick on lo0 inet proto udp from unbound to nsd port 53 keep state
After upgrading to 13.0, I noticed they could no longer reach one another, despite making no changes to my pf.conf. Puzzled, I looked a little deeper and found an error from unbound's side of things: notice: send failed: Permission denied. This made me suspect pf might be involved, and to make a long story short, it was. In 13.0, traffic between these jails was now being sent over the vlan interface instead of over the loopback as it had in the past. Simply changing the interface in the rule to be the vlan instead of the loopback fixed the issue once the new rules were loaded.
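
For anyone who wants one ruleset that survives the upgrade in both directions, here is a sketch of the adjusted rule. It assumes the jail aliases live on an interface named vlan10, which is a made-up name, so substitute your own. Listing both interfaces should let the rule match whether the traffic takes the loopback (12.x) or the vlan (13.0):
Code:
# vlan10 is a placeholder for the vlan interface that holds the jail aliases
pass quick on { lo0 vlan10 } inet proto udp from unbound to nsd port 53 keep state
Once you've confirmed which interface the traffic actually uses on your system, you can narrow the list back down to that one.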

While I don't regard this as a bug, especially since the new behavior makes more sense than the old behavior in my opinion, it was an unexpected change while upgrading. As such, I wanted to make other users of pf who might be in a similar situation aware of the potential pitfall when upgrading.

Finally, I'm not sure yet on all the specifics, so this might affect you even if you're not using jails, VLANs, etc. If you've done some testing yourself please post any additional findings, or any questions you have, below.
 
One thing I've noticed myself: pf seems to have a problem with tables after upgrading to 13.0-RELEASE. pfctl reports "Cannot allocate memory", while the configured limits are:

Code:
pfctl -s memory
...
LIMITS:
states        hard limit    10000
src-nodes     hard limit    10000
frags         hard limit     2000
table-entries hard limit   200000

And having 6 tables defined:

Code:
table <dns> const { 1.2.3.4 5.6.7.8 9.10.11.12 }
table <backs> const { 13.14.15.16 2a02:: }

table <badhosts> persist file "/etc/pf.badhosts"
table <goodhosts> persist file "/etc/pf.goodhosts"
table <spamhosts> persist file "/etc/pf.spamhosts"
table <rbl> persist file "/etc/rbl.conf"

Counting all tables together, I have 1195 entries, and adding more isn't possible. Stopping and starting pf via rc.d also doesn't work anymore until the system is rebooted, after which everything gets loaded again.
The badhosts table is filled manually, the rbl table automatically once a day. I guess this might also affect setups that use things like fail2ban.

Dunno what's wrong here as there's some memory available and the system isn't even using swap.
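
For reference, the table-entries ceiling shown by pfctl -s memory is controlled from pf.conf with set limit. A sketch using the value reported above (1195 entries is far below it, so raising it is unlikely to be the fix here, but setting it explicitly at least rules the limit out):
Code:
# same value as reported by "pfctl -s memory" above; place it with the other options, before the rules
set limit table-entries 200000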
 
Interesting. I don't have any persistent tables like that, but it makes me curious. Did you delete the files between 12.x and 13, or just keep them as they were?
 
Just kept them. The files are just lists of IPs from which nothing useful has come so far, kept so I can use them for things like:

Code:
no rdr on $int_ext proto { udp tcp } from { <badhosts> <rbl> } to any

Though, it looks like the pfctl memory problem has been solved by not compiling pf into the kernel directly and using it as a module instead.

Another problem, for which bug reports already exist, involves rules like:

Code:
rdr on $int_ext inet6 proto tcp from any to $int_ext port 443 -> $int_www port 443

Here int_ext is the external interface and int_www is a webserver jail interface. So rdr with IPv6 seems to be problematic, but it appears to depend on the external IPv6 address, as the same rule works without any problems on another system.
 