Solved CARP Gateway - Slow internet access

jbo@

Developer
Hello folks,

I have two machines (exact same hardware) that I'd like to use as a gateway/router for LAN clients to access the internet. I am using CARP for fail-over which is already working - therefore, let's refer to the two machines as "the gateway" from here on.
Here's a drawing for sanity:
Code:
                                                +-----------+
                                172.31.255.6/24 |           | 192.168.100.1/24
                                       +--------+  silver1  +---------+
                                       |        |           |         |           192.168.100.222/24
          +-----------+                |        +-----------+         |                +-----------+
          |           |                |                              |                |           |
+---------+  ISP GW   +----------------+                              +----------------+  client1  |
          |           |172.31.255.5    |                              |                |           |
          +-----------+                |        +-----------+         |                +-----------+
                                       |        |           |         |
                                       +--------+  silver2  +---------+
                                172.31.255.6/24 |           | 192.168.100.1/24
                                                +-----------+
It took me quite some reading but I managed to get it working: client1 is successfully able to access websites hosted on random webservers on the internet. However, everything is horribly slow.
The ISP uplink is a 1G/1G connection. Using iperf3 between silver1 and a server in a datacenter I get over 900 Mbps, and downloading files on silver1 works well too. Performance only becomes bad once I try to access anything from client1: I can load webpages and download content in general, but it's slow. Running iperf3 on client1 against the machine in the datacenter even times out.

My question: What's going on here? Where did I screw up in my setup/configuration?

Here's the network & routing configuration from silver1|2:
Code:
# Network
ifconfig_igb0="inet 192.168.8.12/24 up"
ifconfig_igb1="inet 192.168.1.12/24 up"  # DNS access
ifconfig_igb2="inet 192.168.10.1/24 up"
ifconfig_igb2_alias0="inet vhid 1 advskew 100 pass testpass alias 192.168.100.1/24 up"
ifconfig_igb3="inet 192.168.10.3/24 up"
ifconfig_igb3_alias0="inet vhid 2 advskew 100 pass testpass alias 172.31.255.6/24 up"
defaultrouter="172.31.255.5"

# Routing
gateway_enable="YES"
static_routes="ispSwisscom"
route_ispSwisscom="-net 192.168.100.0/24 172.31.255.5"

Here's the corresponding PF configuration:
Code:
if_lan0="igb0"  # Management
if_lan1="igb1"  # DNS access
if_lan2="igb2"  # Client gateway 1
if_lan3="igb3"  # Swisscom modem
if_loc0="lo0"   # Loopback

# Options
set block-policy drop

# Scrub
scrub in all

# Ignore loopback interface
set skip on $if_loc0

nat on $if_lan3 from $if_lan2:network to any -> ($if_lan3) static-port

table <bruteforce> persist
block quick from <bruteforce>

block in log all
antispoof for $if_lan3
antispoof for $if_lan4
pass out keep state

pass quick on $if_lan3 all

pass from {$if_loc0, $if_lan2:network } to any keep state
pass in quick on {$if_lan0, $if_lan3} proto tcp from any to any port 22 flags S/SA keep state (max-src-conn 10, max-src-conn-rate 50/3600, overload <bruteforce> flush global)

And for completion, here's the routing table of silver1:
Code:
root@silver1:~ # netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            172.31.255.5       UGS        igb3
127.0.0.1          link#9             UH          lo0
172.31.255.0/24    link#6             U          igb3
172.31.255.6       link#6             UHS         lo0
192.168.1.0/24     link#4             U          igb1
192.168.1.12       link#4             UHS         lo0
192.168.8.0/24     link#3             U          igb0
192.168.8.12       link#3             UHS         lo0
192.168.10.0/24    link#5             U          igb2
192.168.10.1       link#5             UHS         lo0
192.168.10.3       link#6             UHS         lo0
192.168.100.0/24   link#5             U          igb2
192.168.100.1      link#5             UHS         lo0

Internet6:
Destination                       Gateway                       Flags     Netif Expire
::/96                             ::1                           UGRS        lo0
::1                               link#9                        UH          lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%lo0/64                     link#9                        U           lo0
fe80::1%lo0                       link#9                        UHS         lo0
ff02::/16                         ::1                           UGRS        lo0

silver1 and silver2 specs:
- FreeBSD 11.2-RELEASE
- Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz
- 8 GB DDR4 memory
- 128 GB NVMe drive

What did I miss? I'd appreciate any help on this!
In general I feel like my PF configuration is not really suitable for what I want to achieve - I'd be thankful for any input here as well!
 
Make sure both igb2 and igb3 are in the same state, either both MASTER or both BACKUP. Otherwise you risk asymmetric routing, where outgoing traffic passes through silver1 and the return traffic comes back through silver2, or vice versa.

I also noticed a discrepancy with your static route. The 192.168.100.0/24 network is already implicitly attached to igb2 because the interface is directly attached to that network (CARP address).

And another thing I noticed, igb2 and igb3 are both tied to the same 192.168.10.0/24 network, that's going to cause routing problems.
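
The first point can be checked from the shell. This is only a sketch, assuming the interface names and vhids from the configuration above; the `net.inet.carp.preempt` knob is documented in carp(4), but whether enabling it suits this setup is an assumption:

```sh
# Run on both silver1 and silver2. Each matching line reports the
# state (MASTER or BACKUP) of one vhid on that interface.
ifconfig igb2 | grep carp
ifconfig igb3 | grep carp

# Assumption: preemption is wanted. With it enabled, a backup with a
# lower advskew takes over from a master advertising a higher one,
# which helps both vhids end up on the same machine.
sysctl net.inet.carp.preempt=1
```

If one machine shows MASTER on igb2 and BACKUP on igb3, traffic is likely being split between the two gateways.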
 
Make sure both igb2 and igb3 are in the same state, either both MASTER or both BACKUP. Otherwise you risk asymmetric routing, where outgoing traffic passes through silver1 and the return traffic comes back through silver2, or vice versa.
Thanks! I did that and I confirmed that that's working. This is actually shown as an example in the CARP documentation.

I also noticed a discrepancy with your static route. The 192.168.100.0/24 network is already implicitly attached to igb2 because the interface is directly attached to that network (CARP address).
Hmm... did I set it somewhere explicitly as well? I agree with you but I don't understand why you're mentioning this - what am I missing?

And another thing I noticed, igb2 and igb3 are both tied to the same 192.168.10.0/24 network, that's going to cause routing problems.
This is actually what I expected to be the problem. After reading through the CARP docs I'm not sure what the requirements/recommendations here are. How should I set it up for my environment? My idea was that I simply use the 192.168.10.0/24 network for CARP (synchronization?) traffic.
 
So I just noticed that the speed is not the only problem. Pulling packages on client1 via pkg shows around 500 kbps. However, it takes about 10 minutes to complete something like pkg install wget. I assume that this is indeed a routing issue.
 
Hmm... did I set it somewhere explicitly as well? I agree with you but I don't understand why you're mentioning this - what am I missing?
This:
Code:
ifconfig_igb2_alias0="inet vhid 1 advskew 100 pass testpass alias 192.168.100.1/24 up"
Ties the 192.168.100.0/24 network to igb2 as a directly connected network, the route to it is then implicitly applied:
Code:
192.168.100.0/24   link#5             U          igb2
And that conflicts with this:
Code:
static_routes="ispSwisscom"
route_ispSwisscom="-net 192.168.100.0/24 172.31.255.5"
This explicitly sets a route to 192.168.100.0/24, which will fail to apply because it's already implicitly applied to igb2 as a directly connected network.
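
In rc.conf terms this amounts to dropping the static route and letting the connected route do the work. A sketch based on the configuration posted earlier; the `route get` check is only a suggested way to confirm the result:

```sh
# /etc/rc.conf (routing section): no static route for the LAN is needed
gateway_enable="YES"
defaultrouter="172.31.255.5"
# static_routes and route_ispSwisscom lines removed

# Afterwards, confirm which interface the kernel picks for a LAN client:
#   route -n get 192.168.100.222
# The "interface:" line of the output should name igb2.
```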

This is actually what I expected to be the problem. After reading through the CARP docs I'm not sure what the requirements/recommendations here are. How should I set it up for my environment? My idea was that I simply use the 192.168.10.0/24 network for CARP (synchronization?) traffic.
You put the physical address and the virtual (CARP) address in the same network. Normally you use .1 as the VIP (Virtual IP; CARP) and .2 for silver1 and .3 for silver2 for example. CARP itself communicates through multicast.
 
Thank you very much Sir - very valuable information as always :)

So this would be a decent setup then?
/etc/rc.conf of silver1:
Code:
ifconfig_igb0="inet 192.168.8.12/24 up"
ifconfig_igb1="inet 192.168.1.12/24 up"  # DNS access
ifconfig_igb2="inet 192.168.100.2/24 up"
ifconfig_igb2_alias0="inet vhid 1 advskew 100 pass testpass alias 192.168.100.1/24 up"
ifconfig_igb3="inet 172.31.255.7/24 up"
ifconfig_igb3_alias0="inet vhid 2 advskew 100 pass testpass alias 172.31.255.6/24 up"
defaultrouter="172.31.255.5"
/etc/rc.conf of silver2:
Code:
ifconfig_igb0="inet 192.168.8.18/24 up"
ifconfig_igb1="inet 192.168.1.18/24 up"  # DNS access
ifconfig_igb2="inet 192.168.100.3/24 up"
ifconfig_igb2_alias0="inet vhid 1 advskew 200 pass testpass alias 192.168.100.1/24 up"
ifconfig_igb3="inet 172.31.255.8/24 up"
ifconfig_igb3_alias0="inet vhid 2 advskew 200 pass testpass alias 172.31.255.6/24 up"
defaultrouter="172.31.255.5"

The problem I have is that while client1 is still able to ping silver1/silver2 it is not able to reach a host "on the internet":
Code:
root@client1:~ # ping 192.168.100.1
PING 192.168.100.1 (192.168.100.1): 56 data bytes
64 bytes from 192.168.100.1: icmp_seq=0 ttl=64 time=0.224 ms
64 bytes from 192.168.100.1: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 192.168.100.1: icmp_seq=2 ttl=64 time=0.253 ms
64 bytes from 192.168.100.1: icmp_seq=3 ttl=64 time=0.273 ms
^C
--- 192.168.100.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.224/0.260/0.288/0.024 ms

root@client1:~ # ping 8.8.4.4
PING 8.8.4.4 (8.8.4.4): 56 data bytes
>>>>>>> nothing happening, not even after over a minute <<<<<<<<<

root@client1:~ # traceroute 8.8.4.4
traceroute to 8.8.4.4 (8.8.4.4), 64 hops max, 40 byte packets
 1  192.168.100.2 (192.168.100.2)  0.299 ms  0.189 ms  0.103 ms
 2  * * *
 3  * * *
It simply halts there/keeps going on forever.

I still seem to miss something obvious here...
 
Interesting... with that configuration silver1 cannot access the internet itself anymore. It cannot ping the ISP gateway (172.31.255.5). I must be misunderstanding something when it comes to configuring CARP.
 
I must be misunderstanding something when it comes to configuring CARP.
Take CARP out of the equation for a minute. Configure things so everything routes through silver1 correctly. When that correctly works, do the same for silver2. If both systems are working individually, add CARP. Just think of the CARP VIP as 'floating' between silver1 and silver2.
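
While testing a single gateway like this, a few standard commands help to see where packets stop. A sketch, assuming the interface names and addresses used earlier in the thread:

```sh
# Is IP forwarding on? gateway_enable="YES" should make this report 1.
sysctl net.inet.ip.forwarding

# Which translation and filter rules did PF actually load?
pfctl -s nat
pfctl -s rules

# Ping 8.8.8.8 from client1, then watch the ISP-facing interface.
# Requests leaving with a 192.168.100.x source mean NAT is not applied;
# requests leaving with the gateway's own address but getting no
# replies point at something upstream.
tcpdump -ni igb3 icmp
```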
 
Okay, that makes sense. I took down silver2 and removed all the CARP stuff from silver1 and rebooted it.
I'm back to the scenario I faced yesterday: client1 can ping silver1, silver1 can ping 8.8.8.8, but client1 cannot ping 8.8.8.8.

Here's /etc/rc.conf of silver1:
Code:
# Network
ifconfig_igb0="inet 192.168.8.12/24 up"
ifconfig_igb1="inet 192.168.1.12/24 up"  # DNS access
ifconfig_igb2="inet 192.168.100.1/24 up"
ifconfig_igb3="inet 172.31.255.6/24 up"
defaultrouter="172.31.255.5"

# Routing
gateway_enable="YES"

I still have a strong feeling that this is PF related. My PF skills are still very limited (although I'm learning constantly). I'd appreciate it if you could have a look at this. Here's the PF configuration of silver1:
Code:
if_lan0="igb0"  # Management
if_lan1="igb1"  # DNS access
if_lan2="igb2"  # Client gateway 1
if_lan3="igb3"  # Swisscom modem
if_loc0="lo0"   # Loopback

# Options
set block-policy drop

# Scrub
scrub in all

# Ignore loopback interface
set skip on $if_loc0

nat on $if_lan3 from $if_lan2:network to any -> ($if_lan3) static-port

table <bruteforce> persist
block quick from <bruteforce>

block in log all
antispoof for $if_lan3
pass out keep state

pass quick on $if_lan3 all

pass from {$if_loc0, $if_lan2:network } to any keep state
pass in quick on {$if_lan0, $if_lan3} proto tcp from any to any port 22 flags S/SA keep state (max-src-conn 10, max-src-conn-rate 50/3600, overload <bruteforce> flush global)
 
Having purchased "The Book of PF" definitely paid off - they had a section about pretty much exactly this gateway scenario - managed to get it working. Makes a lot more sense to my brain now :p

I'll mark this thread as "solved" as soon as I've got it working with CARP again.

Thank you for your help & patience!
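
The working ruleset itself was not posted in the thread. For reference, a minimal NAT gateway ruleset for this kind of scenario might look like the sketch below. The macro names `ext_if`, `int_if`, `mgmt_if`, and `localnet` and the scratch file path are assumptions for illustration, not the poster's actual configuration. `pfctl -nf` only parses the file and does not load it.

```sh
# Write a candidate ruleset to a scratch file (the path is arbitrary).
ruleset=/tmp/pf.conf.sketch
cat > "$ruleset" <<'EOF'
ext_if = "igb3"                 # Swisscom modem side
int_if = "igb2"                 # client LAN side
mgmt_if = "igb0"                # management network
localnet = $int_if:network

# Options first, then normalization, translation, and filtering.
set block-policy drop
set skip on lo0

scrub in all

# Translate LAN clients to whatever address ext_if currently holds.
nat on $ext_if from $localnet to any -> ($ext_if)

block in log all
pass out keep state
pass in on $int_if from $localnet to any keep state
pass in on $mgmt_if proto tcp to port 22 keep state
EOF

# Syntax-check only, where pfctl is available; nothing is loaded.
if command -v pfctl >/dev/null 2>&1; then
    pfctl -nf "$ruleset"
fi
```

Once CARP is added back, the advertisements between silver1 and silver2 would presumably also need a pass rule (`proto carp`) on both CARP interfaces.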
 
Having purchased "The Book of PF" definitely paid off - they had a section about pretty much exactly this gateway scenario - managed to get it working. Makes a lot more sense to my brain now :p

I'll mark this thread as "solved" as soon as I've got it working with CARP again.

Thank you for your help & patience!
How did you solve the problem?
 