PF Jail as default gateway

As I'm using pf in a jail that serves as the DMZ, and I have some rdr rules for port forwarding, I would like this jail to also be the default gateway for the server and for the other jails and VMs running on it. Does this make sense, or is it better to leave the server pointing to the IP provided by the ISP?
 
Horse or cart... Will your server boot without a functioning default gateway? Do you need that for DNS and NTP services at boot time?
 
Horse or cart... Will your server boot without a functioning default gateway? Do you need that for DNS and NTP services at boot time?
I don't think so. I must try that.

I ask this because the DMZ jail has Nginx running as a reverse proxy to a couple of web servers running in other jails. If I use the DMZ jail as the default gateway on those jails, the packets addressed to port 443 on the DMZ don't complete the round trip, but if I set the ISP default gw (192.168.0.1) they work as expected.
 
Just connect the uplink to the jail and let it handle the egress connection; don't do port forwarding and fiddling around with loopback interfaces on the host.

I've been running gateways in jails for several years now (since vnet became usable without nuking the host on teardown). Works as advertised and I wouldn't want it any other way now.
 
...or it's better to leave the server pointing to the IP provided by the ISP?
...but if I set the ISP default gw (192.168.0.1) they work as expected.
Understanding the location of your firewall is germane to the problem. If your existing gateway is 192.168.0.1, then I'm guessing it's on some sort of appliance, and the "IP provided by the ISP" is actually attached to the appliance uplink, either statically or by DHCP when the appliance boots.

tommiie asked for a network diagram which would have clarified this.

I don't really see a problem using one VM (using an appliance as its default gateway) as the default gateway for other VMs. You would want to manage the boot order and the routing tables.

However, I'd be acutely reluctant to point a virtualisation server at one of its own VMs as a default gateway, as that gateway clearly will not exist at the point in time when the virtualisation server boots. The creation of the default route would very likely fail. You would have to arrange to plant the default gateway on the virtualisation server after the VM boots. Even if it can be made to work (it's messy, but it probably can), you will have created a fragile setup, likely to break easily, and obfuscate other problems.
 
However, I'd be acutely reluctant to point a virtualisation server at one of its own VMs as a default gateway, as that gateway clearly will not exist at the point in time when the virtualisation server boots.
Admittedly, I'm new to all of this, but I'm in the midst of configuring a host with a handful of jails to run my gateway services. One of the jails is the gateway, another runs dnsmasq for DHCP and DNS. The host passes the WAN and trunked LAN interfaces to the jailed gateway. The host also has its own static IP and a default route that runs over a VLAN ultimately connected via a bridge to the jailed gateway. I haven't rebooted it in a few days, but it's working fine and I haven't noticed any problems with the default routes or what not when it comes up. The gateway comes up quick and fetches its DHCP "WAN" address—not really WAN, right now, as it's all running on a separated network with my current router which I'm replacing.

The most critical piece, which I think OP should consider, is how to console into the system. The machine that's running the above is a NUC and I don't have it connected to a keyboard or monitor, but it does have a console port. It also happens to have 6 NICs. I could use one of the NICs to connect via SSH, but instead I'm currently using a Raspberry Pi: I connect to that, then reach the console via a USB-console cable adapter. And I can VPN into the Pi, too. The Pi has its own NIC plus a USB NIC and lives on both networks, so I can connect into it and test gateway configurations as a client machine while also connecting to the console and running everything on the host/jails.

I have discovered tmux, and now I think my life is complete.

ssh>cu>tmux, everything I need to work on the system.
 
However, I'd be acutely reluctant to point a virtualisation server at one of its own VMs as a default gateway, as that gateway clearly will not exist at the point in time when the virtualisation server boots.
Exactly this. I was only referring to locally accessible hosts.
I have those hosts connected to a management VLAN and the jail(s) handles the uplink(s) and acts as a router for e.g. local VLANs. That management LAN might be accessible via a jumphost for remote access, but the host itself never touches anything directly related to internet routing and hence isn't directly accessible from the outside world by choice and design.

For remote hosts without other means of access (e.g. IPMI/remote console) like VPS, I still let the host handle the uplink and depending on the use-case of the server, one jail acts as a gateway for all other jails. The gateway-jail may have a globally routable address different to the host or it simply uses the host as its uplink (in RFC1918 address space) and provides firewalling/forwarding/routing and basic services (DNS, NTP...) to other jails on that host. If needed that jail also connects to remote hosts via VPN.

Jail startup order is always crucial for such setups, and it's easy to overcomplicate things and build chicken-and-egg problems, e.g. distinct jails for DNS and gateway where PF uses hostnames but DNS can only start with a working uplink. So you should always follow the KISS principle and test your startup sequence extensively before shipping out the server or hardening it after initial setup.
 
The host passes the WAN and trunked LAN interfaces to the jailed gateway
There are 16 services on my FreeBSD 13.5 system that explicitly require networking to function before they can be started:
Code:
[gunsynd.187] $ grep "REQUIRE.*NETWORKING" * | wc -l
      16
There are (at least) another 55 services that REQUIRE services that implicitly follow NETWORKING (e.g. LOGIN and FILESYSTEMS). For example, sendmail, where the comments say "We make mail start late, so that things like .forward's are not processed until the system is fully operational".

Sendmail is just one example. It will probably recover as a service when a default route eventually appears after the system has booted and started its VMs, because it's designed to be resilient in the face of network outages. But if anyone's ".forward" file is NFS mounted, you probably just broke mail delivery.

Unless you understand how all of the services behave if the default route is missing until after the system has fully booted, you are just hacking away in the dark. As I said above "you have created a fragile setup, likely to break easily, and obfuscate other problems".
 
We are operating in entirely different spheres as I'm setting up a homelab; I wouldn't presume to understand the intricacies of an IT operation upon which more than a handful of people depend.

I am confused about one thing, however. If the gateway (in its jail) has static IPs everywhere, networking is going to come up quickly. If it needs a WAN DHCP address (appropriate for a SOHO), it's going to have to either come up quick and wait on the negotiation or configure SYNCDHCP and block all of NETWORKING while waiting for an address. For a large business, I'd assume they'd have a static IP and be up and running. The gateway jail launches quickly, and if it doesn't have a lot of other stuff running on it directly, it will be one of the quickest up and running.

I don't see how this is different for the host vs. a jail.

And, again ignorance here, a private network can run with a WAN link down but its router(s) up, can't it? I'm confused how having the host system be the gateway solves anything you wrote above that wouldn't also work with a jail gateway being the first up after the host has launched.

Jail launch order can be specified in the host jail list, meaning the gateway can be listed first. It's possible to launch them in parallel, but that does seem like asking for trouble.
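
As a rough sketch of what "listing the gateway first" could look like in the host's /etc/rc.conf (which is read as sh variable assignments): the jail names gateway, dns and web are made up for illustration, while jail_enable, jail_list and jail_parallel_start are the standard rc.conf knobs.

```sh
# /etc/rc.conf on the host (sketch; jail names are hypothetical)
jail_enable="YES"
# Jails are started in the order listed, so the gateway comes first.
jail_list="gateway dns web"
# Start jails one after another rather than in parallel.
jail_parallel_start="NO"
```

If the jails are defined in jail.conf, its depend parameter is another way to express that one jail must be up before another starts.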

Anyway, like I said, we are moving in very different circles with all of this.
 
I am confused about one thing, however. If the gateway (in its jail) has static IPs everywhere, networking is going to come up quickly. If it needs a WAN DHCP address (appropriate for a SOHO), it's going to have to either come up quick and wait on the negotiation or configure SYNCDHCP and block all of NETWORKING while waiting for an address. For a large business, I'd assume they'd have a static IP and be up and running. The gateway jail launches quickly, and if it doesn't have a lot of other stuff running on it directly, it will be one of the quickest up and running.

I don't see how this is different for the host vs. a jail.
In addressing the OP, my assumption was that there was a separate firewall/router appliance handling DHCP, DNS, and NTP.

It's been a while since I tested FreeBSD, but systems generally don't wait indefinitely for a DHCP response. If they don't get one in reasonable time, they just carry on booting regardless, and try to do something sensible.

No matter how quickly a VM launches it can't boot before its VM server! That is the difference. Don't confuse speed with correctness.

Now, if you want to run a firewall in a VM, and you want your physical virtualisation server to use that firewall, you could boot the physical server using an ISP appliance as the default gateway, and switch the physical server to use a different gateway after the VMs have started, and their packet filters are running. /etc/rc.local might be a place to do this. I'd be reluctant to do that myself, and would look for a better design, but it's possible.
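
A minimal sketch of that kind of switch-over, assuming it is called from something like /etc/rc.local after the jails or VMs have been started. The function name switch_default_route and the address in the example call are assumptions for illustration; ping and route are the ordinary base-system commands.

```sh
#!/bin/sh
# switch_default_route GATEWAY [TRIES]
# Waits until GATEWAY answers a ping, then replaces the default route.
# If the gateway never answers, the existing default route is left alone.
switch_default_route() {
    gw="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # One ICMP echo request; succeeds once the gateway responds.
        if ping -c 1 "$gw" >/dev/null 2>&1; then
            # Drop the boot-time default route (ignore "not in table").
            route delete default >/dev/null 2>&1
            route add default "$gw"
            return $?
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "gateway $gw not reachable, keeping existing default route" >&2
    return 1
}

# Example call (the address is an assumption, not from a tested setup):
# switch_default_route 192.168.100.200
```

Keeping the old route when the gateway does not answer means a failed jail start leaves the host reachable, which is the fragility concern raised above.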
And, again ignorance here, a private network can run with a WAN link down but its router(s) up, can't it? I'm confused how having the host system be the gateway solves anything you wrote above that wouldn't also work with a jail gateway being the first up after the host has launched.
A computer network that is designed to be run independently from the Internet should work just fine, but, these days, few are.

I didn't suggest that any particular host system had to be a gateway. I did suggest that booting a VM server when you know that it is configured to use a default gateway attached to a VM that does not exist has multiple risks that would need to be individually assessed.
Jail launch order can be specified in the host jail list, meaning the gateway can be listed first. It's possible to launch them in parallel, but that does seem like asking for trouble.
You are still creating the default gateway after the physical server is booted, with all the attendant (and unassessed) risks.
 
We are operating in entirely different spheres as I'm setting up a homelab; I wouldn't presume to understand the intricacies of an IT operation upon which more than a handful of people depend.

I am confused about one thing, however. If the gateway (in its jail) has static IPs everywhere, networking is going to come up quickly. If it needs a WAN DHCP address (appropriate for a SOHO), it's going to have to either come up quick and wait on the negotiation or configure SYNCDHCP and block all of NETWORKING while waiting for an address. For a large business, I'd assume they'd have a static IP and be up and running. The gateway jail launches quickly, and if it doesn't have a lot of other stuff running on it directly, it will be one of the quickest up and running.
I'm also running such setups on connections without static IP and even on lines that are slow to get an address (e.g. DSL).
As long as you start the gateway first it's all fine - why should the other jails or services not start up? NETWORKING doesn't mean these services need a working route to the internet, just the network stack needs to be up so they can bind to an interface or address.
If a service requires a working uplink for startup (e.g. unbound for getting a trust anchor or ntpd for initial sync) you can either delay that service startup or check if it can be configured to start anyways and e.g. use some cached values.

On an especially pesky line (ancient DSL crap) that took ages to come up I added some @reboot cronjobs that sleep for 2 minutes and then restart a troublesome service. This workaround would also have been needed for a non-jail setup unless I'd completely block startup with SYNCDHCP, but as this was a line from Deutsche Telekom with constant problems (2-3 outages per month) the host would have been stuck at bootup quite often...
 
Hi!, sorry for the delay, the network diagram is this:

Code:
                             +-------------------+
                             |  Internet Router   |
                             |   192.168.100.1    |
                             +-------------------+
                                       | (LAN)
                     +-----------------+-----------------+
                     |                                   |
          +-------------------+             +-----------------------+
          |    Workstation PC |             | Firewall/Gateway VM    |
          |  192.168.100.100  |             | 192.168.100.200        |
          |  GW: 192.168.100.1|             | GW: 192.168.100.1      |
          +-------------------+             +-----------------------+
                                                       |
                                                       | (Protected Zone)
                                       +---------------+---------------+
                                       |                               |
                          +-----------------------+     +-----------------------+
                          |   Web Server VM        |     |   SQL Server VM       |
                          | 192.168.100.205        |     | 192.168.100.206       |
                          | GW: 192.168.100.200    |     | GW: 192.168.100.200   |
                          +-----------------------+     +-----------------------+

P.S.: diagram generated by DeepSeek, from this input:

Hi, I need a network diagram. The internet router is 192.168.100.1 and there are these virtual machines:
192.168.100.200 is a firewall/gateway
192.168.100.205 is a web server that uses 192.168.100.200 as a default gateway
192.168.100.206 is a sql server that uses 192.168.100.200 as a default gateway
And there's a PC with IP 192.168.100.100 with 192.168.100.1 as gateway
 
Hi!, sorry for the delay, the network diagram is this:

Code:
                             +-------------------+
                             |  Internet Router   |
                             |   192.168.100.1    |
                             +-------------------+
                                       | (LAN)
                     +-----------------+-----------------+
                     |                                   |
          +-------------------+             +-----------------------+
          |    Workstation PC |             | Firewall/Gateway VM    |
          |  192.168.100.100  |             | 192.168.100.200        |
          |  GW: 192.168.100.1|             | GW: 192.168.100.1      |
Btw, there's only one PC; all the jails, including the firewall, run on that PC.
 
You have five devices pictured and numbered. I suggest that the number may present a challenge for discussion. From your original post it looks as if you’d like the Firewall/Gateway to be the gateway to the Internet and that it is a jail. Is that right? I’m assuming you want the SQL and Web servers to be jails and use the FG as their gateway. Could you explain what the PC and internet router are, what their relationship is to the gateway (inside and routing through, or outside and using a different gateway), etc.?

Could you describe what you’re trying to do (aside from setting this up), like what do you want to have happen? Is the PC to use the web server? Is someone else beyond the PC from the internet using the web server? Or, is there some other usage model?

Want to make sure we aren’t working on an X-Y problem.
 
Hi codeedog, let me describe what I want.

First, 192.168.100.1 is the IP of the router/modem provided by my ISP, this is the original default gw.
Then, my PC is 192.168.100.100 and I assigned 192.168.100.1 as its gw, this of course works ok.
Then, I created a Jail with VNET to work as the door for all external traffic, this is 192.168.100.200, also in the ISP's router I configured this as the DMZ, so, all external traffic enters through it. I installed pf there to limit what ports are allowed in and out.
The next step was creating a couple of jails to provide external services, like apache and an application that must be accessed from the outside. I assigned them the IPs 192.168.100.205 and 192.168.100.206 and set 192.168.100.200 as their default gw, so if they want to go outside they must pass through pf.

The problem I face is that when I set 192.168.100.200 as the gw for .205 and .206, the connection from the outside is established but never finishes; it looks like the reply wants to return from another IP, or pf is blocking it. But if I set 192.168.100.1 as gw for both jails, everything works ok.
 
You have two different default gateways on the 192.168.100.0/24 subnet. One at 192.168.100.1 (which routes to the Internet) and one at 192.168.100.200 (which does not). A gateway is a router. You need to redesign your network. DMZ is always on an isolated subnet, and you route (usually with firewall) to and from it.
 
You have two different default gateways on the 192.168.100.0/24 subnet. One at 192.168.100.1 (which routes to the Internet) and one at 192.168.100.200 (which does not). A gateway is a router. You need to redesign your network. DMZ is always on an isolated subnet, and you route (usually with firewall) to and from it.
Well, in fact, 192.168.100.200 has pf and acts as a router, and it also has 192.168.100.1 as its gw.
 
Well, in fact, 192.168.100.200 has pf and acts as a router, and it also has 192.168.100.1 as its gw.
Based upon your questions and answers so far, I do not believe this network structure which you intend to build gets you the behavior you intend. If I understand correctly, you want a structure that supports a modem that forwards all WAN traffic to an inside device which acts as a gateway and allows some network traffic (on specific ports) into and out of two interior devices: Web Server and maybe SQL Server? You didn’t specify the latter as reachable by the internet (I’d advise against this, but I’m leaving it open that you might).

However, in the structure you’ve created everything lives on the 192.168.100/24 subnet. Gateways move Layer 3 packets between subnets, not within a subnet. Usually, that interior gateway would be set up such that it and the modem share one subnetwork (192.168.100/24) and it would sit on another subnetwork (eg. 192.168.125/24) and forward and filter packets between them. However, you don’t have two isolated networks connected via a gateway, you have one unified network (192.168.100/24) and the “interior” devices aren’t really interior because you’ve placed them on the modem side of the gateway; they are reachable directly from the modem.
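
To make the same-subnet point concrete, here is a small sketch in plain sh. The helper names ip_to_int and same_subnet are made up for illustration, and 192.168.125.10 is just an example address on the suggested second subnet. A host only hands a packet to its default gateway when the destination falls outside its own network.

```sh
#!/bin/sh
# ip_to_int A.B.C.D -> prints the IPv4 address as one integer
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet IP1 IP2 PREFIXLEN -> exit status 0 if both share a network
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

# From the web server jail's point of view (192.168.100.205/24):
for peer in 192.168.100.1 192.168.125.10; do
    if same_subnet 192.168.100.205 "$peer" 24; then
        echo "$peer: same subnet, delivered directly (no gateway involved)"
    else
        echo "$peer: different subnet, sent to the default gateway"
    fi
done
```

With everything on 192.168.100/24, the modem and the jails can always reach each other directly, whatever default gateway the jails are given.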

gpw928 is correct, you have two gateways on the 192.168.100/24 subnet.

I’m not enough of an expert to predict how this structure behaves, but I suspect your interior gateway won’t forward layer 3 packets back onto the network from which they originated. That is, when it is contacted by the Web Server (192.168.100.205) it won’t forward the packets to the modem (192.168.100.1) because they both live on the 192.168.100/24 subnet. I could be wrong about this.

Most importantly, you’ve gotten the network you intended to set up because you’re the one doing all the address assignments, but you didn’t get the network behavior you’ve intended because you set it up incorrectly for that purpose. This is why I said we might have an XY problem. You’re asking about how to do X, but you really want Y.

Your options are to (a) dig in and investigate why the gateway is or is not routing packets (tcpdump and pflog0 would help with this) or (b) reconfigure your network to the way other people do it and likely have a satisfactory result. I do believe you’ll learn something doing (a). Although I don’t know, I suspect you'll never get (a) working the way you want from a behavior perspective.

So, set up (b): place the inside (jail) devices on a separate subnet (e.g. 192.168.125/24), create a bridge, attach one epair per jail to the bridge, and also give the bridge an IP on that separate subnet. Then, your gateway can route packets between the two subnetworks (between the interface with the modem and the bridge interface).

You’ll have to take one more step and make it possible for internal addresses in 192.168.125/24 to be handled properly when entering the external network (192.168.100/24). If you don't, your modem won’t know what to do with them. At the moment, it only knows how to handle 192.168.100/24. You have two options.

  1. You can use NAT and RDR on the interior gateway:
    Code:
    rdr on $ext_if proto tcp from any to port 80 -> $web_server
    nat log on $ext_if -> ($ext_if)
  2. If your modem has the feature, you can create a static route on it. I don’t know the exact method, but the static route would tell the modem that when it sees packets addressed to/from 192.168.125/24, the device at 192.168.100.200 knows how to handle them, so send them there.
#2 is better, if available, because it won’t double NAT your network. #1 will be double NAT and you also must have the redirect rule (RDR) to ensure outside ports make it through to the inside machines. If you leave the NATing up to the modem (with its static route), then there’s no address translation at the interior gateway and you don’t have to worry about synchronizing any modem side pinholes with the gateway pinholes.

I'm fairly certain I got all of that right, but others may correct me.
 
Well, in fact, 192.168.100.200 has pf and acts as a router
192.168.100.200 is either a router, and thus might be used as a default gateway (routing packets from one subnet to another), or not. All the information you have provided suggests it is not. The routing tables enumerated with netstat -4rn will show if it's a router. A router will have at least two NICs, each connected to a different subnet.
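
Alongside the routing table, one quick check is whether the kernel is set to forward packets at all. On FreeBSD that is the net.inet.ip.forwarding sysctl, which gateway_enable="YES" in /etc/rc.conf turns on at boot. A small sketch, with the function name is_router made up for illustration:

```sh
#!/bin/sh
# is_router: succeeds when the kernel is set to forward IPv4 packets.
# The function name is hypothetical; the sysctl name is FreeBSD's.
is_router() {
    [ "$(sysctl -n net.inet.ip.forwarding 2>/dev/null)" = "1" ]
}

if is_router; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled; this host will not route packets"
fi
```

Forwarding being enabled is necessary but not sufficient: the box still needs interfaces on two different subnets to have anything to route between.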

The raison d'être of a DMZ is to isolate Internet-originating traffic from your private local network. Pointing the ISP appliance-defined DMZ into your private LAN is both pointless (because it's not going to a secured network) and dangerous (because it's going into your private LAN). The DMZ needs to be on its own subnet, where any security breach is fully contained to that subnet.
 