I have a pair of routers running FreeBSD 13. Tonight I attempted an upgrade from 13.0-RELEASE-p11 to 13.1-RELEASE. After the upgrade, I was suddenly unable to get DHCP or PPPoE working on the WAN interfaces of either router.
It looks like the same underlying problem as Thread install-13-1-rc-3-not-getting-ip-from-dhcp-with-atheros-ar8161-and-realtek-wifi.84927, but the OP there doesn't appear to have done anything specific to fix it; it just started working. In my case, it fails consistently on both machines.
The 3 WAN interfaces are vLANs on an underlying LACP LAGG of two Intel 82576 interfaces. Two of them use DHCP, and the third uses PPPoE via mpd5.
Relevant parts of my rc.conf are:
Bash:
cloned_interfaces="lagg0 vlan11 vlan12 vlan13 [several more too...]"
ifconfig_igb0="-vlanhwtag mtu 9000 up"
ifconfig_igb1="-vlanhwtag mtu 9000 up"
ifconfig_igb2="-vlanhwtag mtu 9000 up"
ifconfig_igb3="-vlanhwtag mtu 9000 up"
ifconfig_lagg0="mtu 9000 laggproto lacp lagghash l2 laggport igb0 laggport igb1"
# Virtual Interfaces (WAN)
# WAN1 DOCSIS
create_args_vlan11="vlan 11 vlandev lagg0"
ifconfig_vlan11="mtu 1500 dhcp"
ifconfig_vlan11_alias0="link 54:e1:ad:15:ba:61"
# WAN2 FWA
create_args_vlan12="vlan 12 vlandev lagg0"
ifconfig_vlan12="mtu 1500 dhcp"
ifconfig_vlan12_alias0="link 54:e1:ad:15:ba:62"
# WAN3 DSL
create_args_vlan13="vlan 13 vlandev lagg0"
ifconfig_vlan13="mtu 1500 up"
mpd_enable="YES"
mpd_flags="-b WAN3"
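Since the two DHCP vLANs override their MAC with a `link` alias, one thing that may be worth comparing between 13.0 and 13.1 is whether the override is still applied and how it differs from the parent lagg0 address. A minimal sketch of that check follows; `ether_of` is a helper name made up for illustration, not a FreeBSD tool, and the loop in the comments is meant to be run on the router itself:

```shell
#!/bin/sh
# Print the MAC ("ether") address found in ifconfig output read on stdin.
# ether_of is a hypothetical helper, not part of FreeBSD.
ether_of() {
    awk '$1 == "ether" { print $2; exit }'
}

# On the router (not executed here), compare the parent with each WAN vLAN:
#   for i in lagg0 vlan11 vlan12; do
#       printf '%s %s\n' "$i" "$(ifconfig "$i" | ether_of)"
#   done
```

If the vLAN addresses match the `link` values in rc.conf, the override itself is probably not what changed.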
I handle the actual DHCP up/down via a CARP script, but regardless of which host is MASTER, the result was the same:
On 13.0, the DHCP requests complete immediately and the PPPoE connection completes successfully.
On 13.1, dhclient sends DHCPDISCOVER indefinitely and then times out, whether it runs automatically (as part of my CARP scripts) or manually, and the PPPoE connection fails with repeated timeouts and no specific error, e.g.
Code:
Jun 4 22:42:17 dcr2 mpd[34258]: [vlan13_link0] PPPoE: Connecting to 'WAN3'
Jun 4 22:42:26 dcr2 mpd[34258]: [vlan13_link0] PPPoE connection timeout after 9 seconds
Jun 4 22:42:26 dcr2 mpd[34258]: [vlan13_link0] Link: DOWN event
Jun 4 22:42:26 dcr2 mpd[34258]: [vlan13_link0] LCP: Down event
Nothing else changed on my network, just the system upgrade from 13.0 to 13.1.
The only thing in the release notes I could see related to networking was the change of "net.inet.ip.broadcast_lowest", but adjusting this had no effect on either type of connection.
I did attempt some packet captures on the modem side. I do see the DHCPDISCOVER packets, but I get zero response from either of my upstream providers (two completely separate ones), which is not what normally happens. That potentially points to a malformed packet of some kind, but I couldn't see any errors in them, and it doesn't explain why both dhclient *and* mpd5 are failing in a similar way. It seems like it could be an issue with the network drivers at some level (perhaps vLANs?), but...
All my other vLANs (12 of them) worked flawlessly, including DHCP requests *in* to the router from client devices. It was only these 3 outbound interfaces that seemed to have problems, which makes it even more confusing.
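To complement the modem-side captures, a capture on the router side with link-level headers would show whether the frames leave lagg0 with the expected 802.1Q tag and source MAC. The sketch below only builds the pcap filter strings; `wan_filter` is a name made up for illustration, and it assumes the tags are visible when capturing on lagg0:

```shell
#!/bin/sh
# Build a pcap filter for one WAN vLAN. wan_filter is a hypothetical helper.
#   $1 = VLAN ID, $2 = "dhcp" or "pppoe"
wan_filter() {
    case "$2" in
        dhcp)  echo "vlan $1 and (udp port 67 or udp port 68)" ;;  # DHCP client/server ports
        pppoe) echo "vlan $1 and pppoed" ;;                        # PPPoE discovery frames only
        *)     echo "usage: wan_filter VLAN_ID dhcp|pppoe" >&2; return 1 ;;
    esac
}

# On the router (needs root; not executed here):
#   tcpdump -e -n -i lagg0 "$(wan_filter 11 dhcp)"
#   tcpdump -e -n -i lagg0 "$(wan_filter 13 pppoe)"
```

`-e` prints the Ethernet header, so the source MAC and vLAN tag of each DHCPDISCOVER or PPPoE discovery frame can be compared directly between a working 13.0 boot and a failing 13.1 boot.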
I'm not really sure where else to look or what else would be useful to troubleshoot further; I'm a relative FreeBSD newbie aside from these routers, but very well-versed in Linux so hit me with the advanced commands. Does anyone have any advice, either for how to find more information about what's going on or a potential cause?