Possible Bug with LAGG Trunk interfaces

Hello all. I am running into some network problems, and after some weeks of troubleshooting I am beginning to suspect a bug: tonight I discovered that while tcpdump is open on an interface, traffic on that interface works. If anybody is proficient with LAG configurations, I would appreciate input or advice. My initial description below is meant to broach the topic; if you would like more detail, please let me know.

Situation
I have 4 network interfaces participating in 2 different LAGs (config excerpts below). There are multiple cloned VLAN interfaces trunked over these LAGs/port-channels. I am seeing several symptoms, all amounting to traffic through the network interfaces working only intermittently. My workarounds so far have been configuring the interfaces manually or destroying and recreating them after boot, and running tcpdump, after which traffic starts passing. The tcpdump workaround is what made me suspect a bug, so I thought I'd reach out to see whether I have made any mistakes here.

  • Symptom: DHCP is slow to complete, or occasionally times out completely, when requesting a lease on a VLAN-tagged interface. The DHCP requests are confirmed going out in tcpdump, and the DHCP responses are confirmed via a Cisco monitor session.
  • Symptom: Intermittently, response packets are NOT seen by the host (but they are confirmed via a Cisco monitor session).
    • This symptom can be alleviated by performing a tcpdump on the cloned interface (tcpdump -ni vlan303 icmp)
    • Once tcpdump is open, the host receives ICMP pings on VLAN303 with no problem
    • When tcpdump is closed, ICMP stops on VLAN303
    • VLAN302 and VLAN304 work with no issue
  • Symptom: DHCP configuration via rc.conf is not always honored on startup (also intermittent).
    • Workaround has been to use rc.local and call dhclient separately.
  • Symptom: The ether setting is not honored in rc.conf on cloned VLAN interfaces; however, it can be set after boot and is honored then.
Here is the setup. Thank you to all who take the time to review.

Cisco 3850x IOS switch running Cisco IOS XE Software, Version 16.12.10a

Total of 5 FIBs, 3 WAN internet links, multiple VLANs.
2 LAGs (one LAN facing, one WAN facing).

Current kernel: FreeBSD rtr0 13.2-RELEASE-p5 FreeBSD 13.2-RELEASE-p5 releng/13.2-n254643-3a088f485f74 (14 shows the same)
PCEngines APU4 using Intel I211 chipset
dev.igb.3.%desc: Intel(R) I211 (Copper)
dev.igb.2.%desc: Intel(R) I211 (Copper)
dev.igb.1.%desc: Intel(R) I211 (Copper)
dev.igb.0.%desc: Intel(R) I211 (Copper)

::Cisco config on PO1 and PO2
Code:
core0(config-if)#do sh run int po1
Building configuration...

Current configuration : 148 bytes
!
interface Port-channel1
 description rtr0-lan
 switchport trunk allowed vlan 2,4,5,9,10,20-25
 switchport mode trunk
 spanning-tree portfast
end

core0(config-if)#do sh run int po2
Building configuration...

Current configuration : 123 bytes
!
interface Port-channel2
 switchport trunk allowed vlan 302-304
 switchport mode trunk
 spanning-tree portfast trunk
end

EtherChannel shows fine on the Cisco side
Code:
core0(config-if)#do sh etherchan sum
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

        A - formed by Auto LAG


Number of channel-groups in use: 6
Number of aggregators:           6

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         LACP        Gi1/0/47(P)     Gi1/0/48(P)
2      Po2(SU)         LACP        Gi1/0/45(P)     Gi1/0/46(P)
::ifconfig
(Public IPs on VLAN302, 303, and 304 have been scrubbed for sensitivity, but they are valid.)
(MAC addresses in the ifconfig output have been anonymized for sensitivity, but they are valid.)
Code:
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        hwaddr aa:bb:cc:dd:f1:95
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb2: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:96
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb3: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:96
        hwaddr aa:bb:cc:dd:f1:97
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=2b<PERFORMNUD,ACCEPT_RTADV,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        laggproto lacp lagghash l2,l3,l4
        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e120bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:96
        laggproto lacp lagghash l2,l3,l4
        laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.1.1 netmask 0xffffff00 broadcast 10.178.1.255
        groups: vlan
        vlan: 2 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.9.1 netmask 0xffffff00 broadcast 10.178.9.255
        groups: vlan
        vlan: 9 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.10.1 netmask 0xffffff00 broadcast 10.178.10.255
        groups: vlan
        vlan: 10 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan24: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.24.1 netmask 0xffffff00 broadcast 10.178.24.255
        groups: vlan
        vlan: 24 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.4.1 netmask 0xffffff00 broadcast 10.178.4.255
        groups: vlan
        vlan: 4 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan20: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.20.1 netmask 0xffffff00 broadcast 10.178.20.255
        groups: vlan
        vlan: 20 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan21: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.21.1 netmask 0xffffff00 broadcast 10.178.21.255
        groups: vlan
        vlan: 21 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan22: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.22.1 netmask 0xffffff00 broadcast 10.178.22.255
        groups: vlan
        vlan: 22 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan23: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.23.1 netmask 0xffffff00 broadcast 10.178.23.255
        groups: vlan
        vlan: 23 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan25: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:94
        inet 10.178.25.1 netmask 0xffffff00 broadcast 10.178.25.255
        groups: vlan
        vlan: 25 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan302: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:96
        inet 100.XXX.XXX.XXX netmask 0xfffff000 broadcast 100.XXX.XXX.XXX
        groups: vlan
        vlan: 302 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg1
        fib: 2
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan304: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:f1:96
        inet 97.XXX.XXX.XXX netmask 0xfffffff8 broadcast 97.XXX.XXX.XXX
        groups: vlan
        vlan: 304 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg1
        fib: 4
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
vlan303: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether aa:bb:cc:dd:a7:2b
        inet 98.XXX.XXX.XXX netmask 0xfffff000 broadcast 255.255.255.255
        groups: vlan
        vlan: 303 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg1
        fib: 3
        media: Ethernet autoselect
        status: active
        nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>


::rc.conf
Code:
#bring all interfaces online
ifconfig_igb0="up -tso4 -tso6 -lro -vlanhwtso"
ifconfig_igb1="up -tso4 -tso6 -lro -vlanhwtso"
ifconfig_igb2="up -tso4 -tso6 -lro -vlanhwtso"
ifconfig_igb3="up -tso4 -tso6 -lro -vlanhwtso"

#create cloned interfaces
cloned_interfaces="lagg0 lagg1 vlan2 vlan9 vlan10 vlan24 vlan4 vlan20 vlan21 vlan22 vlan23 vlan25 vlan302 vlan303 vlan304"

#po3 on core0, lan
ifconfig_lagg0="up laggproto lacp laggport igb0 laggport igb1"

#po4 on core0, wan
ifconfig_lagg1="up laggproto lacp laggport igb2 laggport igb3"

#assign vlan to lagg
ifconfig_vlan9="vlan 9 vlandev lagg0 10.178.9.1/24"
ifconfig_vlan10="vlan 10 vlandev lagg0 10.178.10.1/24"
ifconfig_vlan2="vlan 2 vlandev lagg0 10.171.8.1/24"
ifconfig_vlan24="vlan 24 vlandev lagg0 10.178.24.1/24"
ifconfig_vlan4="vlan 4 vlandev lagg0 10.178.4.1/24"
ifconfig_vlan20="vlan 20 vlandev lagg0 10.178.20.1/24"
ifconfig_vlan21="vlan 21 vlandev lagg0 10.178.21.1/24"
ifconfig_vlan22="vlan 22 vlandev lagg0 10.178.22.1/24"
ifconfig_vlan23="vlan 23 vlandev lagg0 10.178.23.1/24"
ifconfig_vlan25="vlan 25 vlandev lagg0 10.178.25.1/24"

#(Note this will bring the interface up, but ether is not honored here)
ifconfig_vlan302="fib 2 vlan 302 vlandev lagg1" 
ifconfig_vlan303="fib 3 vlan 303 vlandev lagg1"
ifconfig_vlan304="fib 4 vlan 304 vlandev lagg1"

ifconfig_vlan302_ipv4="inet6 accept_rtadv"
ifconfig_vlan303_ipv4="inet6 accept_rtadv"
ifconfig_vlan304_ipv4="inet6 accept_rtadv"
::The following has to be run via rc.local
Code:
setfib 2 dhclient vlan302
setfib 3 dhclient vlan303
setfib 4 dhclient vlan304
 
I wanted to follow up on this post and clarify that my initial suspicion of a bug was incorrect. I found the issue, and my own ignorance was at fault. It is important to understand the layers of communication at play.

Layers of communication
igb interfaces -> lagg with LACP -> cloned VLAN interfaces

Issue
The LAGG has a specific hardware address, which is carried into the cloned interfaces. When I specified an alternate Ethernet address on a cloned sub-interface (vlan303), traffic addressed to that MAC was being blocked at the LAG interface unless the interface was placed in promiscuous mode. tcpdump puts the interface it captures on into promiscuous mode by default, which is why traffic flowed while it was running, and that discovery led to this finding. The intermittent nature was, ironically, by my own hand: I was using tcpdump to troubleshoot and getting sudden packet successes.

When the cloned interface was being used without a custom MAC, everything worked as expected.
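
For anyone who wants to check for the same condition, the explanation above can be sketched as a small shell helper that compares the MAC of a VLAN interface with the MAC of its parent. This is only a sketch: ether_of and check_vlan_mac are hypothetical names made up for illustration, and the parsing assumes ifconfig output shaped like the listing earlier in this thread.

```sh
#!/bin/sh
# Sketch only: compare a VLAN interface's MAC with its parent's MAC.
# ether_of and check_vlan_mac are hypothetical helper names, not FreeBSD tools.

# Print the first "ether" address found in ifconfig-style text on stdin.
ether_of() {
    awk '$1 == "ether" { print $2; exit }'
}

# Usage: check_vlan_mac <vlan-if> <parent-if>
# Prints whether the two MACs match; returns 0 on match, 1 on mismatch.
check_vlan_mac() {
    child=$(ifconfig "$1" | ether_of)
    parent=$(ifconfig "$2" | ether_of)
    if [ "$child" = "$parent" ]; then
        echo "match: $1 and $2 both use $child"
    else
        echo "mismatch: $1 uses $child, $2 uses $parent"
        return 1
    fi
}
```

With the interfaces shown in this thread, check_vlan_mac vlan303 lagg1 should report a mismatch, while vlan302 and vlan304 should match lagg1. A related check is to run tcpdump with -p (tcpdump -p -ni vlan303 icmp), which asks tcpdump not to enable promiscuous mode; if pings keep failing while that capture is open, promiscuous mode is likely the deciding factor.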

To address this, putting lagg1 into promiscuous mode (ifconfig lagg1 promisc) resolved the issue.
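
To make that setting survive a reboot, one option is to append promisc to the existing lagg1 line in rc.conf. The line below is a sketch based on the rc.conf shown earlier in the thread, not a configuration confirmed here; interface names should be adapted to your own setup.

```sh
# /etc/rc.conf fragment (sketch; assumes the lagg1 definition shown earlier)
# Same lagg1 line as before, with "promisc" appended so that frames for the
# custom MAC on vlan303 are accepted after boot.
# One-off equivalent on a running system: ifconfig lagg1 promisc
ifconfig_lagg1="up laggproto lacp laggport igb2 laggport igb3 promisc"
```

After a reboot, ifconfig lagg1 should list PROMISC among the interface flags if the setting was applied.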
I hope this helps others who may be dealing with this.

"Irony is a disciplinarian feared only by those who do not know it, but cherished by those who do."
 