Musings of a noob as I migrate from Windows to FreeBSD in my homelab

Getting back at it: I'm in the process of installing BIND on both servers.
  • SR-IOV continues to behave exactly the way I'd want when combined with vnet jails.
    • 11 VFs created on ixl2 (the 1st port of my X710-T2L) and 6 VFs created on ixl3 (the 2nd port). ixl0 & 1 are the onboard ports on my Supermicro X11DPH-i.
      • It's not well-documented, but it is possible to create and use a separate .conf file for each interface that's getting configured with VFs: I have ixl2.conf and ixl3.conf located in /etc/iov/.
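        For reference, here's a minimal sketch of what /etc/iov/ixl2.conf can look like (section and key names per iovctl.conf(5); the MAC override only matters if you want to set your own addresses):

```
PF {
        device : "ixl2";
        num_vfs : 11;
}

DEFAULT {
        allow-set-mac : true;
}

VF-0 {
        mac-addr : "00:00:53:24:02:00";
}
```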
      • The syntax to use in /etc/rc.conf is where I had a problem... couldn't figure out how to get both files to load. Eventually I stumbled on a post somewhere (and have since lost the link) that said to simply list the files in a single line with a space separator:
        Code:
        iovctl_files="/etc/iov/ixl2.conf /etc/iov/ixl3.conf"
      • Some extra fun you can have is assigning whatever MAC address you want to VF interfaces.
        • Just specify allow-set-mac : true; at the top of the iov .conf file, and within each VF definition specify the MAC address you want: mac-addr : "XX:XX:XX:XX:XX:XX";
        • According to the IANA, MAC addresses whose first three octets fall in the range 00:00:00 to 00:00:FF are reserved (for what? I don't know).
        • So, I set up my VF MAC addresses using the various jersey numbers my kid wore during his youth hockey career... 00:00:53:24:02:XX. Doesn't seem to cause any problems with my networking gear, so whatever that range is reserved for clearly isn't important (and doesn't exist/conflict).
  • BIND installed on the jail running on the primary server without issue.
  • In order to get jails working on the secondary server (running on consumer-grade hardware), I needed to use a bridge interface. SR-IOV is not an option on the Z270 motherboard, nor does the X540-T2 NIC support it.
    • This machine has been running a pair of bhyve VMs with my old Windows Servers (and very reliably, I might add), so I thought I'd keep things simple and just use one bridge for bhyve and jails. Once the Windows VMs are decommissioned I won't need to do anything except shut them down and archive (or delete) them.
    • Implementation for this is pretty straightforward - no gotchas lurking. The bridge interface that's created in the background when running the vm switch create command doesn't seem like the best choice, since I don't think it'll persist after I get rid of the VMs and destroy the vm switch. It is possible to set up a bridge first and then force the vm switch to use it. Steps I followed:
      • Shut down both Windows VMs
      • vm switch destroy public (I didn't see a need to get creative with the switch name)
      • Set up the bridge in /etc/rc.conf, adding the ix0 interface and assigning the host IP to the bridge instead of the ethernet port.
      • Update the host to 14.3-RELEASE-p5 while I'm at it...
      • Reboot and check ifconfig to see that bridge0 is set up properly. Verify connectivity to the outside world.
      • Re-create the vm-bhyve switch using the existing bridge: vm switch create -t manual -b bridge0 public
        • I gave it the same name (public) so I wouldn't need to modify the VM network configs at all.
      • Start both VMs back up and verify connectivity
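      For completeness, the rc.conf pieces from the bridge step look something like this (addresses are placeholders for my real ones):

```
cloned_interfaces="bridge0"
ifconfig_ix0="up"
ifconfig_bridge0="addm ix0 inet 10.0.0.5/24 up"
defaultrouter="10.0.0.1"
```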
    • Worked like a charm, so I moved on to trying to get BIND installed on a jail on the secondary server.
  • My jail.conf on the secondary server has all the appropriate exec.prestart/start/stop/poststop directives for dynamically creating, attaching, detaching, and destroying epairs for the jails when they're started/stopped. Works great. If anyone cares, here's a privacy-redacted version:
    # STARTUP/LOGGING
    exec.prestart = "logger Starting jail ${name}...";
    exec.prestart += "/sbin/ifconfig ${epair} create up";
    exec.prestart += "/sbin/ifconfig ${epair}a up descr jail:${name}";
    exec.prestart += "/sbin/ifconfig ${bridge} addm ${epair}a up";
    exec.start = "/sbin/ifconfig ${epair}b ${ip} up";
    exec.start += "/sbin/route add default ${gateway}";
    exec.start += "/bin/sh /etc/rc";
    exec.poststart = "logger Started jail ${name}.";
    exec.prestop = "logger Stopping jail ${name}...";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    exec.poststop = "logger Stopped jail ${name}.";
    exec.poststop += "/sbin/ifconfig ${bridge} deletem ${epair}a";
    exec.poststop += "/sbin/ifconfig ${epair}a destroy";
    exec.timeout = 90;
    exec.consolelog = "/var/log/jail_console_${name}.log";

    # PERMISSIONS
    allow.raw_sockets;
    exec.clean;
    mount.devfs;
    devfs_ruleset = "5";

    # NAME/PATH
    path = "/usr/local/jails/containers/${name}";
    host.hostname = "${name}.somedomain.com";

    # NETWORKS/INTERFACES
    $ip = "10.0.0.${id}/24";
    $gateway = "10.0.0.1";
    $bridge = "bridge0";
    $epair = "epair${id}";

    jail10 {
        $id = "10";
        vnet;
        vnet.interface = "${epair}b";
    }
    jail11 {
        $id = "11";
        vnet;
        vnet.interface = "${epair}b";
    }
    jail12 {
        $id = "12";
        vnet;
        vnet.interface = "${epair}b";
    }
    jail13 {
        $id = "13";
        vnet;
        vnet.interface = "${epair}b";
    }
  • Last night's hiccup: I really have no idea what I'm doing when it comes to the pf firewall.
    • Unlike on my primary server, where pkg -j jailname install bind920 worked without a hitch, I got some sort of permission denied error when using pkg install against a jail on the secondary machine.
    • The only difference is the network config - I'd wondered before if pf would behave the same on a machine using bridged networking for the jails, and I'm fairly certain the answer is no.
      • This morning I actually went and read the PACKET FILTERING section of the if_bridge(4) manpage (should have done that before), and I think it confirms my suspicions.
      • I don't think the per-jail pf configuration I'm using on the primary server will work... the manpage says that all inbound packets are filtered on the originating interface (ix0 in my case), while outbound packets are filtered on the "appropriate interfaces." My guess is that it means outbound packets are filtered on epairb interfaces...
    • I'll have to toy with this a bit to see what's actually happening and adjust my firewalling strategy. I'm not inclined to simply shut off pf so I can install BIND - I'd rather have the pf setup procedure completely ironed out for a simple service that uses one UDP port and one TCP management port. When I go to set up more involved jails I really don't want to be flailing around wondering why things are behaving poorly.
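    For a sense of scale, the ruleset I'm after is tiny - a sketch of the jail-side pf.conf, with the interface macro and management net as placeholders for my real ones:

```
ext_if = "ixlv0"        # placeholder for the jail's vnet interface
mgmt_net = "10.0.0.0/24"

set skip on lo0
block in on $ext_if
pass out on $ext_if keep state
# DNS queries: UDP and TCP port 53
pass in on $ext_if proto { udp, tcp } to port 53
# rndc management: TCP 953, from the management net only
pass in on $ext_if proto tcp from $mgmt_net to port 953
```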
  • Assuming I'm right and it's just my lack of understanding about how filtering works with bridge interfaces, and assuming I learn enough to move on, I'm fairly confident that I'll get BIND up and running in the next day or two.
    • Simple setup: a single forward lookup zone and a few reverse lookup zones, all of which are private (so I don't need to mess around signing things for DNSSEC). These BIND instances will also act as resolvers - not best practice for networks with large loads, but mine is small and has comparatively light traffic. Should be fine. Filter lists will wait a bit, because...
  • I hadn't planned on it originally, but it looks as though I'll be implementing a DHCP jail. I run a Unifi UDM as my router, and I've been using the built-in DHCP service. Unfortunately Unifi doesn't seem to offer any support for RFC2136 DDNS updates from their DHCP service, which is an unpleasant (if maybe not surprising) discovery. Seriously... this stuff has been around for ages. I've seen some documented attempts to install nsupdate on a Unifi device, or even to add a middleware layer that makes the UDM think it's updating a cloud DDNS provider (which it supports). I'm not interested. If Unifi DHCP won't natively interface with a standards-compliant DNS server, I'll just roll my own.
    • So, as soon as I have my static zones set up and working properly in BIND, I'll make the tweaks necessary to support DDNS updates and go ahead with a DHCP jail. I'm thinking Kea, since it's the currently-supported offering from ISC. The old ISC dhcpd 4.4 offering was EOL-ed in 2022... Seems like the FreeBSD handbook should be updated to reflect this. I'm sure it works fine, but EOL means minimal (or no) security fixes, and that's not ok with me.
 
Interesting read... Sometimes I get the impression that the setup is more complicated than it needs to be, esp. with a lot of components that need to be set up by hand. Keeping a system like this up-to-date is gonna be a headache.

But considering that this looks like a private hobbyist project, I'm just gonna sit on the sidelines and enjoy the occasional read.
 
Sometimes I get the impression that the setup is more complicated than it needs to be, esp. with a lot of components that need to be set up by hand.
Likely true, if I'm being completely honest with myself. I think it comes down to 3 factors:
  1. I'm using a random collection of curated (by that I mean, did it fit within budget) hardware that occasionally forces me to look for workarounds to problems I could have avoided if I'd made different selections (e.g. my panic when one of my Seagate HDDs didn't update to 4kn - should have just bought 4kn drives to begin with, but the 512e upgradable drives were cheaper).
  2. Sometimes I'm way off track trying to do stuff without a clue. Most times, in fact. Good thing I know how to read...
  3. iX Systems has had teams of professionals working for years on their TrueNAS product. What I'm essentially doing is trying to recreate what they've already done, and really there's no good reason besides hobbyist fun. Linux is, imo, a bit of a s***-show these days. While FreeBSD isn't perfect, I do appreciate the Unix philosophy more. Since I have no deadlines and no set finish line, puttering along one step at a time suits me just fine. If I needed it today, I'd wipe it all and install TrueNAS.
Keeping a system like this up-to-date is gonna be a headache.
Likely 15.0-RELEASE will be out before I'm anywhere close to done... we'll see if I feel adventurous when it comes out, but I'm thinking I'll probably hold off a bit. My plan is to update base and/or packages as security errata indicate I should, but not otherwise. The exception would be if I'm suffering a bug that someone's fixed and I want that patch, or if a new feature gets added to a package that I really want. Generally, though, I expect updates to be sparse. Boot environments and snapshots of userland files also have me excited. If I blow it (and I'm sure I will at some point), I'll have better odds of recovering before my wife notices I've bricked her ability to browse the web.
 
I expect updates to be sparse. Boot environments and snapshots of userland files also have me excited. If I blow it (and I'm sure I will at some point), I'll have better odds of recovering before my wife notices I've bricked her ability to browse the web.
Actually this, I'd avoid at all costs.

A setup that I have at home is to have a router with DD-WRT that my family connects to directly for Internet (over wi-fi), and a collection of my own machines that also have a direct connection (both cat5e and wifi) to the router. This allows me to play with networking and servers all I want, even play with IPv6 and HTTP/2, HTTP/3, and the like - and my family is none the wiser, because their own Internet access is not disturbed. There are ways to play with stuff while keeping basic essential functionality undisturbed.

REALLY need to engineer out and isolate points of failure.
 
REALLY need to engineer out and isolate points of failure.
Yup, agree 100%. That's why I'm installing secondary DNS, DHCP failover, secondary LDAP/Kerberos, etc on the junky backup server. At some point in the next year I'll get better hardware for that secondary role and migrate those jails to a more reliable box. Upgrade policy is to only work on one server at a time, verify after performing any updates, and touch the secondary only once the primary is confirmed in a good state. For hobbyist home networks there will always be a single point of failure somewhere.

The best I can do is to push those points to the non-DIY parts of the network. E.g. my Unifi UDM doesn't have a shadow copy waiting in the wings, but I _did_ spring for the service contract on that device. Cost me an extra 20% and it's already paid off - my original UDM had its SSD go bad, which manifested in a slow failure mode: it would run fine for 24 hours or so, then the GUI would die (no ssh access either), and then a few hours after that it would stop routing and switching. A hard power cycle would reset the clock, so the network was limping along and my wife didn't know about it until the warranty replacement showed up on our doorstep. Unifi backup/restore worked perfectly, and I had the dying router replaced in a span of 30 minutes while she was out on an errand. If I hadn't purchased the service contract I'd have had to ship back the old unit before they sent me the new one. Well worth the money, imo, to have the new one in hand before having to send the old router back. Fingers crossed it was a one-time thing... I've never seen a new SSD go bad that quickly before. Then again, I don't know what brand/model of SSD Unifi is using inside their products.
 
I've never seen a new SSD go bad that quickly before. Then again, I don't know what brand/model of SSD Unifi is using inside their products.
Samsung has very good SSDs, I swear by the brand. Expensive, but well worth the money.

Worst brand for SSDs, one I learned to stay away from is Teamgroup. Their SSDs are cheap, but they deteriorate very quickly.
 
Yup, agree 100%. That's why I'm installing secondary DNS, DHCP failover, secondary LDAP/Kerberos, etc on the junky backup server. At some point in the next year I'll get better hardware for that secondary role and migrate those jails to a more reliable box. Upgrade policy is to only work on one server at a time, verify after performing any updates, and touch the secondary only once the primary is confirmed in a good state. For hobbyist home networks there will always be a single point of failure somewhere.
I'd say that's just the wrong way to do it.

Have a secondary router behind your primary one, and all your play boxes behind that secondary router. Your wife gets to simply connect to the primary router for Internet.

Your secondary router can be the box where you host your secondary DNS, DHCP, subnetting, etc. Everything else that you wanna play with - it should be on a subnet behind that secondary router.

Yeah, not an elegant solution, you may need to buy more ethernet cable than you expected - but that does prevent you from bricking your wife's Internet access. The extra expense for more cat5e cable (and running it in weird places) is well worth it, trust me.
 
Isn't that the entire point of virtualization, be it via hypervisor or a container-based concept? If I muck up a jail it shouldn't kill any of the others, nor cause the host to die. Assuming FreeBSD jails are robust in this manner, I'm pretty happy with my trajectory.
 
Isn't that the entire point of virtualization, be it via hypervisor or a container-based concept? If I muck up a jail it shouldn't kill any of the others, nor cause the host to die. Assuming FreeBSD jails are robust in this manner, I'm pretty happy with my trajectory.
No, it's not. First, set up your networking correctly, then muck around with jails and then, based on your results, DNS. Separation of concepts matters. Besides, if you think about it, that's how pros handle things. There's a reason for Best Practices out there.

If you muck up your secondary DNS jail, it should NOT affect the DNS on the primary router. Yeah, there's a bit of redundancy, but that's what it takes to stay out of trouble.
 
No, it's not. First, set up your networking correctly, then muck around with jails. Separation of concepts matters. Besides, if you think about it, that's how pros handle things. There's a reason for Best Practices out there.

If you muck up your secondary DNS jail, it should NOT affect the DNS on the primary router. Yeah, there's a bit of redundancy, but that's what it takes to stay out of trouble.
Most of the jailed services are networking: DNS, DHCP, auth, storage. Where else would I put them except in a jail? I have strong philosophical objections to letting the Unifi UDM do most of the heavy lifting. Sure, it can do more, but that's too many eggs in one (non-redundant) basket.

The whole point of this exercise is to set up networking "correctly" on FreeBSD. Sure, a few services at the end are fluff (eg jellyfin), but most of this is just getting core net services daemons running in containerized environments to keep things simple (and therefore less likely to fail).

I appreciate your level of caution, but I think we differ philosophically about where the line needs to be drawn. Although it's been 2+ decades, I did run a small ISP back in the late 90s... I still have a pretty good idea about how the basic network should be architected and how services should be deployed. Maybe it's dated knowledge, but I'm not flying blind when it comes to network design.
 
Most of the jailed services are networking: DNS, DHCP, auth, storage. Where else would I put them except in a jail? I have strong philosophical objections to letting the Unifi UDM do most of the heavy lifting. Sure, it can do more, but thats too many eggs in one (non-redundant) basket.
In all honesty, I also have strong objections to letting my ISP handle DNS and whatnot at home.

That's why I made sure to get a dumb modem from my ISP (Comcast), and to use my own router (which I flashed with DD-WRT). That dumb modem (Arris S33) was actually difficult to find, but that effort is paying off, I have the flexibility to play without disrupting Internet service at my place.

And man, $300 USD is too much for a wifi router... I use an Asus AC 1900 that I got for less than half that, and it's a workhorse with DD-WRT, and I'm the one in control of DNS entries on it. Comcast did have those all-in-one gateways that they tried to rope me into, I said no, and found a compatible dumb modem instead.
 
Minor distraction (I have the attention span of a cat, in case you hadn't noticed): I decided I need a client machine in the house so I can test services as I deploy them without needing to sit in the basement or "cheat" by SSH-ing from a machine that isn't part of the new deployment.
  • I grabbed my daughter's old laptop, an Asus Vivobook she used during high school, and wiped Windows away.
  • I installed 15.0-BETA5, which as of this morning is now upgraded to 15.0-RC1
  • Mostly it works - the iwl wifi driver is a bit picky (doesn't like the access points at work at ALL, but works fine at home). Sound is kind of scratchy from the speakers (I'll debug that later).
  • The laptop is an upgrade over the POS Clevo I've been lugging with me to conferences (that Clevo is very long in the tooth). I think I'll keep the Asus for that purpose.
    • If I want to take this machine with me on the road, I need it to run Matlab for some light work - mostly re-plotting figures for last-minute edits to my slide decks. Matlab doesn't provide a build for FreeBSD, and I'm not overly eager to find something else, although I'm aware of a couple of free alternatives (I get a Matlab license for free thru work).
    • My first thought was to try Linux emulation by installing Ubuntu in a jail and running Matlab from there. I used the writeup describing how to run Chrome from a Linux jail as an example.
    • I was able to get Chrome to run on my desktop from the jail, but Matlab flat-out refused to work after installation. It seemed to be spinning its wheels loading something and never launched a GUI. Since Matlab isn't on the list of applications that Linuxulator supports, I'm not all that surprised. It was worth a shot...
  • I figured if Linux emulation didn't work I'd go ahead and try a VM.
    • I installed Ubuntu 24.04 LTS in Bhyve and enabled p9fs... with the help of this post, I managed to get vm-bhyve to mount my /home/user directory from the FreeBSD host into Ubuntu as the same /home/user directory (yes, UIDs and GIDs match between systems). Getting the /home directory shared between systems enables seamless transition - everything I'm working on can be seen and edited using Matlab, even though Matlab is running in a VM.
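    The guest side ends up being just a 9p mount in the Ubuntu VM's /etc/fstab - something along these lines, where the share name matches whatever vm-bhyve was told to export:

```
homedir  /home/user  9p  trans=virtio,version=9p2000.L  0  0
```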
    • as a test I set up Chrome in the Linux VM and taught myself enough about X11 forwarding to get it to work over SSH. I have a desktop launcher that fires up Chrome on the desktop without issue.
    • last step: I installed Matlab and it works flawlessly. It's not snappy at all running the GUI over X11 forwarding, but it will suffice for the occasions where I need to touch up a figure. For real computation this is the wrong machine... I'll just connect to resources at work.
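    The launcher itself is just a .desktop file wrapping ssh -X (hostname and user are placeholders, and key-based auth is assumed so no password prompt gets in the way):

```
[Desktop Entry]
Type=Application
Name=Chrome (Ubuntu VM)
Exec=ssh -X user@ubuntu-vm google-chrome
Terminal=false
```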
Here's a screenshot of the machine I'll be using as my 1st client device on the new home network:

matlab on ubuntu on 15.0-RC1-b.png
 
Service deployment update: DNS and DHCP are up and running.
  • BIND 9.20 installed from packages in a jail on the primary server, with another named instance installed in a jail on the backup machine.
  • Authoritative zones configured on the primary named instance for my lone forward-lookup domain and the 5 reverse-lookup domains that make up my network.
    • LOTS of room to grow... I'm lazy and configure each VLAN with a /24, which means I have hundreds of unused addresses on nearly every VLAN. That's fine, since it really wasn't that fun manually typing in the static address assignments into the DNS zone files.
    • BIND was without doubt the easiest install I'll have - it's evolved since the late 90s, but the fundamentals are the same.
  • Automatic zone transfers from primary to secondary work flawlessly
  • Recursive resolvers are fast... happy to stop using my ISP's resolvers (I had Windows DNS servers forwarding to my ISP, since WinDNS sucks as a resolver)
  • pf rules locking down both jails were straightforward in the end:
    • I poked and prodded the bridged config for the backup DNS jail, and wasn't able to get pf to work the way I wanted: something similar to the primary server, where SR-IOV makes the VF assigned to the jail look like an entirely separate adapter. I tied myself in knots trying to force it, and I'm not interested enough to keep going.
      • So, I punted and moved all pf filtering onto the host... none of the jails on the backup server will run pf. I'll just have one big config on the host doing everything. I think this'll work out, so long as I keep to my plan of putting jail-specific rules in groups, it won't be that hard to edit.
  • For DHCP I configured another pair of jails the same way (one using an SR-IOV VF on the primary, and the other using an epair attached to the backup machine bridge) and installed Kea DHCP 3.0.2 from pkg.
    • Commentary: this is a wildly complex package - config options are endless, and I can see why some people might get frustrated. The documentation along with the knowledgebase are fantastic, however, so if you're willing to spend the time reading you should be up and running reasonably quickly.
    • Having said that, it was not so simple to get DHCP working on my network. I followed the guides for a high-availability setup - specifically I configured for hot spare mode. No load balancing or anything like that... just another instance on another box waiting in the wings in case the primary dies.
  • This should be straightforward - the config files are identical, with only one tweak needed to tell each instance which server it is. So, I configured a test pool, verified a healthy config and inter-instance communication (heartbeat), and enabled the dhcp4 service. I disabled the DHCP service on the Unifi UDM SE so there wouldn't be a conflict, and then tried to get a lease on my phone.
    • NOTHING - no lease on the phone, and no info in the logs indicating my phone was asking for a lease.
    • Down the rabbit hole I went...
      • My first try was to disable pf. It was likely that I didn't have the filter rules set correctly, so I thought this was a good 1st step.
      • WRONG. Same behavior (or lack of any behavior whatsoever)
      • Seemed like the DHCPDISCOVER broadcasts weren't making it to the DHCP jail...
      • Digging in, I learned that SR-IOV VFs don't see everything that crosses the adapter - traffic is pre-filtered so that the VF only gets traffic bound for its MAC address. There's a fix for this, if you have hardware that supports it... enable promiscuous mode on the VF that needs to see everything. My Intel X710 supports this (or claims to), so I edited the VF config file and re-created all VFs to allow the DHCP jail's VF to see all traffic on the interface.
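      The edit itself was small - a per-VF parameter in the iov config file; for ixl(4) I believe the key is allow-promisc (check iovctl.conf(5) for your driver's parameter names):

```
VF-3 {
        mac-addr : "00:00:53:24:02:03";
        allow-promisc : true;
}
```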
      • STILL NOTHING. WTF? I can imagine difficulty in getting broadcast traffic through the bridge on the backup machine, but I thought I'd guessed right by enabling promiscuous mode for the VF on the primary.
      • Last-ditch try: both the primary and backup servers have integrated GigE NICs on the motherboards. Those ports were unused, so I thought I'd try assigning a whole dedicated physical interface to each DHCP jail.
        • this is easy with Vnet jails - the only caveat is that on the backup jail I have to re-enable pf, since it won't be running through that host's bridge.
        • Both DHCP jails communicate just fine... pf rules look correct based on how DHCP should work (clients send from UDP port 68, servers listen on UDP port 67)
      • Guess what? NOTHING.
      • As a final try I switched the tri-position toggle for the Unifi UDM DHCP service from "disabled" to "relay". I shouldn't need a relay, since the SSID my phone is connecting to is tied to the same VLAN that has the DHCP servers. No tagging involved... the native VLAN is correct end to end, and broadcasts using .255 should be going from my phone thru the AP thru the POE port on the UDM SE and into every downstream switch port that's using that VLAN natively (or tagged), including the ports that my DHCP servers are attached to.
        • Surprisingly, this wasn't a worthless thing to do. Once I switched on the DHCP relay feature I started seeing log traffic that indicated the default gateway address for that VLAN was attempting to communicate with the DHCP server, from port 67 to port 67. So, relay works, and I just needed to adjust my pf rules to allow this combination of from/to addresses and ports (I assumed relay would use port 68, but I guess not...)
        • Once I had pf rules set to allow the UDM relay thru, voila! Lease obtained.
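        For anyone chasing the same thing, the rule that finally mattered looks roughly like this (gateway and jail IPs are placeholders for my real ones):

```
# UDM relay sends unicast DHCP from the gateway IP, sport 67 -> dport 67
pass in on $dhcp_if inet proto udp from 10.0.0.1 port 67 to 10.0.0.10 port 67
```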
      • My conclusion after all this: the UDM SE box that sits at the heart of my network is filtering DHCP broadcasts instead of forwarding them through the VLAN. I've combed thru all the settings and can't find one that says "disable/enable broadcasts," but the behavior is clear as day:
        • enable Unifi DHCP and it'll serve up leases
        • disable Unifi DHCP and it won't allow another DHCP server to serve requests by listening for broadcast DHCPDISCOVER packets, since they're just not forwarded.
        • use Unifi DHCP Relay and let the UDM SE translate the broadcast into a unicast communication coming from the UDM gateway IP, and things work out.
      • Whatever... I really thought it'd be simpler to get DHCP working. My whole reason for falling down this hole was that I wanted DDNS updates sent to my name servers from the DHCP server. Unifi's DHCP service won't do this, but ISC has a nice ddns service bundled with Kea. So, I don't really need my DHCP jails to use the dedicated NIC ports, but now that it's configured I'll just leave it that way. I guess if I ever run short of switch ports I can reclaim a couple by moving back to the old VF or bridge design, but I don't really see that happening any time soon. See my signature below.
  • Kea DHCP is configured for 3 of my 5 VLANs at this point (the other 2 don't need it), and sub-zones are configured in DNS for dynamic clients. All is working perfectly, except:
    • Some sort of permissions error updating the in-addr.arpa zones with dynamic PTR records. Those zones were pre-configured with some static addresses, but it seems that in doing so I borked the permissions and named won't create a .jnl file for dynamic updates. Dynamic updates to the forward-lookup sub-zones work perfectly... it's just the reverse-lookup zones that aren't getting updates. A bit more poking and I'll get this sorted.
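    For context, the dynamic reverse zones are declared more or less like this in named.conf (the TSIG key name is a placeholder); the journal (.jnl) gets created next to the zone file, so that directory has to be writable by the bind user - which is exactly where I suspect I messed up:

```
zone "0.0.10.in-addr.arpa" {
    type primary;
    file "/usr/local/etc/namedb/dynamic/0.0.10.in-addr.arpa.db";
    allow-update { key "kea-ddns"; };
};
```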
  • Last step will be to set up DNS RPZ to provide blacklist/sinkhole type filtering to prevent ad sites, telemetry sites, and malware sites from resolving. Looking at my options now for whose data feed to use...
Edit: I forgot to mention that I made sure the jails could see /dev/bpf* by creating a custom devfs_ruleset, and raw sockets were allowed... there's no evidence either of these helped or hurt. I just never see broadcast DHCP traffic.
 
DHCP is a protocol that only works across a single broadcast domain (LAN segment). If you want it to reach further you need DHCP relay enabled at your routers / gateways. Always.

As for "high availability" the simplest way is often to just split the ip address pool in two, have two DHCP servers, and run one half on each DHCP server. No complicated config, just a different pool range on each DHCP server. Of course, this assumes that everything else is the same.
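With Kea, for example, that's just a different pools range in each server's otherwise-identical dhcp4 config - a sketch:

```
{ "Dhcp4": {
    "subnet4": [ {
        "id": 1,
        "subnet": "10.0.0.0/24",
        // server A hands out the lower half;
        // server B uses "10.0.0.150 - 10.0.0.199" here instead
        "pools": [ { "pool": "10.0.0.100 - 10.0.0.149" } ]
    } ]
} }
```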
 
DHCP is a protocol that only works across a single broadcast domain (LAN segment). If you want it to reach further you need DHCP relay enabled at your routers / gateways. Always.

I shouldn't need a relay, since the SSID my phone is connecting to is tied to the same VLAN that has the DHCP servers. No tagging involved... the native VLAN is correct end to end, and broadcasts using .255 should be going from my phone thru the AP thru the POE port on the UDM SE and into every downstream switch port that's using that VLAN natively
 
...I switched the tri-position toggle for the Unifi UDM DHCP service from "disabled" to "relay"...My conclusion after all this: the UDM SE box that sits at the heart of my network is filtering DHCP broadcasts instead of forwarding them through the VLAN.
Sounds like this Unifi box is not a real bridge. I have two OpenWRT access points in bridge mode on my network, and DHCP works happily across them.
 