HAPROXY and PF - storm of retried ACK packets

dam23 · Oct 5, 2010

Hello guys,

We seem to be having a problem here, when using HAProxy in combination with PF.

This is the basic layout:

Code:

PF -- HAPROXY -- web1
   \- HAPROXY \- web2
              \- web3
              \- web4

PF acts as the front-end firewall.

It uses relayd to redirect (rdr) incoming requests to either of the 2 active/active HAProxy servers.

The HAProxy servers then dispatch requests to the web servers.

In a perfect world, the response comes back from the web server to HAProxy, then to the client.

However, we experience problems when one of the following events occurs:
1/ an IP gets blacklisted by PF for being too aggressive
2/ PF isn't able to create any more states

When PF no longer holds a state for a client's connection to HAProxy, HAProxy keeps resending the same packet over and over (a PSH/ACK data segment, going by the [P.] flags below):

Code:

23:20:14.885321 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885325 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885475 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885481 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885598 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885604 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885704 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885710 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885824 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885829 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885927 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885934 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886039 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886044 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886138 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886145 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886272 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886278 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886397 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48

Notice how the firewall (192.168.26.252) replies with an ICMP host unreachable every time, yet HAProxy (192.168.26.1) still insists on resending the same segment.

We're not entirely sure how to solve this problem.
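
To put a number on the storm, a capture like the one above can be summarised with a short script. This is only a sketch: it assumes the default tcpdump text format shown in the excerpt, and the names `TCP_LINE`, `count_segments` and `retransmissions` are made up for illustration.

```python
import re
from collections import Counter

# Matches tcpdump TCP lines such as:
# 23:20:14.885325 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ...
# ICMP lines carry no "Flags [...]" field and are therefore skipped.
TCP_LINE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\], seq (?P<seq>\d+(?::\d+)?)"
)


def count_segments(lines):
    """Count how often each (src, dst, flags, seq) combination appears."""
    seen = Counter()
    for line in lines:
        m = TCP_LINE.match(line.strip())
        if m:
            seen[(m["src"], m["dst"], m["flags"], m["seq"])] += 1
    return seen


def retransmissions(lines):
    """Keep only the segments that were seen more than once."""
    return {key: n for key, n in count_segments(lines).items() if n > 1}


if __name__ == "__main__":
    import sys

    # Usage: tcpdump -n ... | python this_script.py
    for (src, dst, flags, seq), n in retransmissions(sys.stdin).items():
        print(f"{src} > {dst} [{flags}] seq {seq}: seen {n} times")
```

Fed the excerpt above, it should report a single segment (`seq 0:951` from 192.168.26.1.80) seen nine times, all within about one millisecond.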

For the time being, we're using (amongst unrelated others) the following PF rules and options:

Code:

### TIMEOUTS

set timeout tcp.first           5
set timeout tcp.opening         5
set timeout tcp.established     30
set timeout tcp.closing         5
set timeout tcp.finwait         10
set timeout tcp.closed          5


### LIMITS
# Adaptation
# set timeout adaptive.start      24000
# set timeout adaptive.end        48000

# memory pool(9) limits
set limit states        200000
set limit frags         25000
set limit src-nodes     70000


### FILTERING
scrub in all no-df random-id
rdr-anchor "relayd/*"

# Block outgoing packets from HAProxy that do not match an existing state, answering with a RST/ICMP
block return on $dmzif inet proto tcp from <lb> port http to any

# Allow incoming HTTP and become more aggressive as fewer states remain available
pass in quick on $extif proto tcp from any to <lb> port 80 flags S/SA keep state (max 100000, source-track rule, max-src-conn-rate 50/2, overload <abusive_hosts> flush)
pass in quick on $extif proto tcp from any to <lb> port 80 flags S/SA keep state (max 80000, source-track rule, max-src-conn-rate 25/2, overload <abusive_hosts> flush)
block return in quick on $extif proto tcp from any to <lb> port 80

The goal of these rules is to:
1/ Allow up to 100,000 states to be created by the first rule, which permits up to 50 connections per 2 seconds per IP
2/ Allow up to an additional 80,000 states to be created by the second rule, with a more aggressive blacklisting policy
3/ Refuse any further HTTP state creation, answering new external clients with a TCP RST

These rules allow us to retain a few spare states to establish our own SSH and VPN sessions on the firewall during HTTP attacks (180,000 max HTTP states vs 200,000 global).
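
For reference, the arithmetic behind that state budget can be spelled out as below. It is only a sketch: the figures are copied from the rules above, the constant and function names are made up for illustration, and it ignores any states created by the other (unrelated) rules that are not shown here.

```python
# Values copied from the pf.conf excerpt above.
GLOBAL_STATE_LIMIT = 200_000          # set limit states
RULE_MAX_STATES = [100_000, 80_000]   # "max" on the two pass rules
CONN_RATES = [(50, 2), (25, 2)]       # max-src-conn-rate N/S per rule


def http_state_budget(rule_limits):
    """Total number of states the HTTP pass rules may create together."""
    return sum(rule_limits)


def spare_states(global_limit, rule_limits):
    """States left in the global pool once the HTTP rules are saturated."""
    return global_limit - http_state_budget(rule_limits)


def per_second(rate):
    """Convert an N/S rate (N connections per S seconds) to connections per second."""
    conns, seconds = rate
    return conns / seconds


if __name__ == "__main__":
    print("HTTP budget :", http_state_budget(RULE_MAX_STATES))
    print("Spare states:", spare_states(GLOBAL_STATE_LIMIT, RULE_MAX_STATES))
    for rate in CONN_RATES:
        print(f"{rate[0]}/{rate[1]} -> {per_second(rate)} new connections/s per source IP")
```

So the two rules tolerate 25 and 12.5 new connections per second per source IP respectively, and 20,000 states stay out of reach of HTTP traffic.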

So basically: once no state exists on the firewall between the client and the load balancers, the LBs keep spamming the client with packets, each of which the firewall refuses with an ICMP unreach.

HAProxy keeps trying, literally flooding the firewall with meaningless traffic; we hit 1 Gb/s last night.

I think the problem lies with HAProxy because, in my opinion, it doesn't honor the ICMP Unreachable packets it receives.

We're going to try in a few minutes with TCP RST packets instead, but I'd like to hear your thoughts on the matter.

dam23 · Oct 5, 2010

As a quick update, we have tried the exact same rules against a loadbalancer running under Debian 5 with a higher kernel revision and it seems to honor the ICMP Unreach packets correctly.

More info as we progress.

SirDice · Oct 5, 2010

dam23 said:
I think the problem lies with HAProxy because, in my opinion, it doesn't honor the ICMP Unreachable packets it receives.

We're going to try in a few minutes with TCP RST packets instead, but I'd like to hear your thoughts on the matter.

I think you're on the right track. I don't think PF is the issue here as it does seem to do what it's supposed to do.

I'd also check why HAProxy keeps retrying. Shouldn't it just give up at some point?
Have a look at the netstat states of those connections. Maybe that will give some clues.

And do try a RST. That really should kill any TCP connection. But you may get some RST/ACKs in return.
This state will probably time out faster anyway.
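
For the netstat check suggested above, something along these lines could tally the TCP states per remote address. It is a rough sketch, assuming the usual six-column `netstat -an` layout (Proto, Recv-Q, Send-Q, local address, foreign address, state); the function name `tally_tcp_states` and its `remote_prefix` parameter are made up for illustration.

```python
from collections import Counter


def tally_tcp_states(netstat_output, remote_prefix=""):
    """Count TCP connection states, optionally only for foreign
    addresses that start with remote_prefix."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, non-TCP sockets and lines without a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        if foreign.startswith(remote_prefix):
            counts[state] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: netstat -an | python this_script.py 91.178.77.58
    prefix = sys.argv[1] if len(sys.argv) > 1 else ""
    for state, n in tally_tcp_states(sys.stdin.read(), prefix).most_common():
        print(f"{state:<14}{n}")
```

A pile of connections stuck in one state towards the blocked client would be the kind of clue to look for.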

dam23 · Oct 6, 2010

The thing is, we've experienced odd behavior with the "block return" rule.

It's supposed to send a TCP RST for TCP packets and an ICMP Unreach for other protocols.

Oddly, the firewall insisted on sending ICMPs until we explicitly wrote "return-rst".

This firewall is running 8.0-RELEASE-p2 (64-bit) with the following kernel configuration options:

Code:

device pf
device pflog
device pfsync
device carp

options ALTQ
options ALTQ_CBQ
options ALTQ_RED
options ALTQ_RIO
options ALTQ_HFSC
options ALTQ_PRIQ
options ALTQ_NOPCC

Regarding why HAProxy keeps trying, an HAProxy developer points out there might be an IP stack issue somewhere.

We're still experimenting around.
