Hello guys,
We seem to be having a problem here, when using HAProxy in combination with PF.
This is the basic layout:
Code:
PF -- HAPROXY -- web1
   \- HAPROXY \- web2
              \- web3
              \- web4
PF acts as a frontal firewall.
It uses relayd to RDR incoming requests to either of the two active/active HAProxy servers.
The HAProxy servers then dispatch requests to the web servers.
In a perfect world, the response comes back from the web server to HAProxy, then to the client.
However, we experience problems when one of the following events occurs:
1/ an IP gets blacklisted by PF for being too aggressive
2/ PF isn't able to create any more states
When there is no longer a state between a client IP and HAProxy, HAProxy keeps resending the same ACK-flagged data segment:
Code:
23:20:14.885321 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885325 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885475 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885481 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885598 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885604 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885704 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885710 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885824 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885829 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.885927 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.885934 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886039 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886044 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886138 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886145 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886272 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
23:20:14.886278 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951
23:20:14.886397 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48
Notice how the firewall replies with an ICMP unreachable every time, but HAProxy still insists on resending the same segment (PSH/ACK, seq 0:951).
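To put a number on this, one option is to count how often an identical segment shows up in a capture. Below is a minimal sketch in Python, assuming tcpdump text output in the same format as the dump above. The function name `count_repeats` and the three-line sample are only for illustration.

```python
import re
from collections import Counter

# Matches TCP lines in the tcpdump text format shown above, e.g.
# "23:20:14.885325 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ..."
# ICMP lines have no "Flags [...]" field and are skipped.
TCP_LINE = re.compile(
    r"^\S+ IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\], seq (?P<seq>\d+(?::\d+)?)"
)

def count_repeats(lines):
    """Return segments seen more than once, keyed by (src, dst, flags, seq)."""
    seen = Counter()
    for line in lines:
        m = TCP_LINE.match(line.strip())
        if m:
            seen[(m["src"], m["dst"], m["flags"], m["seq"])] += 1
    return {key: n for key, n in seen.items() if n > 1}

# Hypothetical sample: three lines taken from the capture above.
sample = [
    "23:20:14.885321 IP 192.168.26.252 > 192.168.26.1: ICMP host 91.178.77.58 unreachable, length 48",
    "23:20:14.885325 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951",
    "23:20:14.885481 IP 192.168.26.1.80 > 91.178.77.58.50964: Flags [P.], seq 0:951, ack 1, win 16, length 951",
]
repeats = count_repeats(sample)
for (src, dst, flags, seq), n in repeats.items():
    print(f"{src} > {dst} [{flags}] seq {seq}: seen {n} times")
```

Feeding it a longer capture (for example the lines of a saved tcpdump text file) should show which flows are stuck resending the same sequence range.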
We're not entirely sure how to solve this problem.
For the time being, we're using (amongst unrelated others) the following PF rules and options:
Code:
### TIMEOUTS
set timeout tcp.first 5
set timeout tcp.opening 5
set timeout tcp.established 30
set timeout tcp.closing 5
set timeout tcp.finwait 10
set timeout tcp.closed 5
### LIMITS
# Adaptation
# set timeout adaptive.start 24000
# set timeout adaptive.end 48000
# memory pool(9) limits
set limit states 200000
set limit frags 25000
set limit src-nodes 70000
### FILTERING
scrub in all no-df random-id
rdr-anchor "relayd/*"
# Block outgoing connections from HAProxy which do not match an established one with a RST/ICMP
block return on $dmzif inet proto tcp from <lb> port http to any
# Allow incoming HTTP and become more aggressive as we have less states available
pass in quick on $extif proto tcp from any to <lb> port 80 flags S/SA keep state (max 100000, source-track rule, max-src-conn-rate 50/2, overload <abusive_hosts> flush)
pass in quick on $extif proto tcp from any to <lb> port 80 flags S/SA keep state (max 80000, source-track rule, max-src-conn-rate 25/2, overload <abusive_hosts> flush)
block return in quick on $extif proto tcp from any to <lb> port 80
The goal of these rules is to:
1/ Allow up to 100,000 states to be created by the first rule, which permits up to 50 connections per 2 seconds per IP
2/ Allow up to an additional 80,000 states to be created by the second rule, with a more aggressive blacklisting policy
3/ Deny the creation of any new state for HTTP with a TCP RST to new external clients
These rules allow us to retain a few spare states to establish our own SSH and VPN sessions on the firewall during HTTP attacks (180,000 max HTTP states vs 200,000 global).
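For reference, the state budget described above works out as follows. This is just the arithmetic on the values from our pf.conf, written out as a small Python sketch; the variable names are made up for the example.

```python
# Values copied from the pf.conf excerpt above.
GLOBAL_STATE_LIMIT = 200_000        # "set limit states"
HTTP_RULE_MAX = [100_000, 80_000]   # "max" on the two pass rules
CONN_RATES = [(50, 2), (25, 2)]     # max-src-conn-rate <connections>/<seconds>

# Total number of states the two HTTP pass rules can create together.
http_ceiling = sum(HTTP_RULE_MAX)
# States that remain available for everything else (SSH, VPN, ...).
headroom = GLOBAL_STATE_LIMIT - http_ceiling
# Per-source connection rate allowed by each rule, in connections per second.
per_second = [conns / secs for conns, secs in CONN_RATES]

print(f"HTTP state ceiling: {http_ceiling}")
print(f"States left for SSH/VPN: {headroom}")
print(f"Per-source rates (conn/s): {per_second}")
```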
So basically: once there is no longer a state on the firewall between the client and the load balancers, the LBs keep spamming ACK packets to the client, which the firewall refuses with an ICMP unreachable.
HAProxy keeps trying, literally flooding the firewall with meaningless traffic. We hit 1 Gb/s last night.
I think the problem lies with HAProxy because, in my opinion, it doesn't honor the ICMP Unreachable packets it receives.
We're going to try in a few minutes with TCP RST packets instead, but I'd like to hear your thoughts on the matter.