Moving LAGG(LACP) to a different switch

moriksan80 · Dec 9, 2024

Hello everyone,
A Deciso DEC4040 firewall appliance (FreeBSD version: 14.1-RELEASE-p6, running OPNsense 24.10.1) has 2x25G SFP28 ports (ice0, ice1) in a lagg (lagg0) with a switch (S1). Various VLAN subnets, derivative interfaces, rules, etc. depend on lagg0. This lagg is the “main” LAN connection. A 100G breakout DAC is used. Life is good, everything works well.

I need to move the 2 physical connections on the FreeBSD appliance over to a different switch (S2) in the short term and later, in the medium term, split the connections across two switches (S2, S3) which are in a vPC configuration. For the short-term move, I have tried the following steps, none of which has yielded a successful outcome:

Move the 100G cable from S1 to S2. LACP isn’t re-established. “No carrier” status is shown on OPNsense.
Move the 100G cable from S1 to S2 after cycling lagg0:

Bash:

ifconfig lagg0 down 
ifconfig lagg0 up

LACP isn’t re-established. “No carrier” status is shown on the FreeBSD side. Interface timeouts are shown on the switch side.

Move the 100G cable from S1 to S2 after bringing down all involved interfaces (lagg0,ice0,ice1).

Bash:

ifconfig lagg0 down 
ifconfig ice0 down 
ifconfig ice1 down 
# insert cable on both ends
ifconfig ice0 up 
ifconfig ice1 up 
ifconfig lagg0 up

LACP isn’t re-established. “No carrier” status is shown on OPNsense.
Put a new 100G QSFP on S2 and 2x25G SFP28 on the DEC4040, connected with an MTP-to-LC breakout cable. Repeat combinations of the above ifconfig up/down steps. LACP isn’t re-established. “No carrier” status is shown on the FreeBSD side.
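
One distinction that may help when reading the results above: as far as I understand, a lagg in LACP mode reports "no carrier" whenever it has no active member ports, even if the member links themselves are physically up. Checking the status line of ice0 and ice1 separately from lagg0 shows which case applies. A minimal sketch, where `link_status` is a hypothetical helper name (not part of FreeBSD) that reads ifconfig output on stdin:

```sh
#!/bin/sh
# Hypothetical helper: print the value of the "status:" line from
# ifconfig output supplied on stdin (e.g. "active" or "no carrier").
link_status() {
    awk '$1 == "status:" {
        sub(/^[ \t]*status:[ \t]*/, "")   # strip the label, keep the value
        print
        exit
    }'
}
```

It could be used as `ifconfig ice0 | link_status` and `ifconfig lagg0 | link_status`. If the members report "active" while lagg0 reports "no carrier", the physical layer is probably fine and the problem lies in the LACP negotiation.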

What I've ruled out:

bad DAC cable
bad fiber cable
bad transceivers

Start status of lagg0

Bash:

ifconfig -v ice0
ice0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9000
    options=4800028<VLAN_MTU,JUMBO_MTU,HWSTATS,MEXTPG>
    ether f4:90:ea:00:9f:72
    inet6 fe80::f690:eaff:fe00:a206%ice0 prefixlen 64 scopeid 0x5
    media: Ethernet autoselect (25G-AUI <full-duplex>)
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
    drivername: ice0
    plugged: SFP/SFP+/SFP28 25GBASE-CR CA-25G-S (Copper pigtail)
    vendor: CISCO-LEONI PN: L45593-D278-B30 SN: LCC2506GADX-CH3 DATE: 2021-02-10
root@MorikCage:~ # ifconfig -v lagg0
lagg0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9000
    description: main_LAGG (opt1)
    options=4800028<VLAN_MTU,JUMBO_MTU,HWSTATS,MEXTPG>
    ether f4:90:ea:00:9f:72
    hwaddr 00:00:00:00:00:00
    inet 192.168.98.1 netmask 0xffffff00 broadcast 192.168.98.255
    inet6 fe80::f690:eaff:fe00:9f72%lagg0 prefixlen 64 scopeid 0xd
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=0<>
        flowid_shift: 16
    lagg statistics:
        active ports: 2
        flapping: 0
    lag id: [(8000,F4-90-EA-00-9F-72,09A8,0000,0000),
         (8000,E8-0A-B9-75-49-87,0001,0000,0000)]
    laggport: ice0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,F4-90-EA-00-9F-72,09A8,8000,0005),
         (8000,E8-0A-B9-75-49-87,0001,8000,01C3)]
    laggport: ice1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,F4-90-EA-00-9F-72,09A8,8000,0006),
         (8000,E8-0A-B9-75-49-87,0001,8000,01C4)]
    groups: lagg FG_ALL_VLANs FG_CRITICAL_LAN
    media: Ethernet autoselect
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    drivername: lagg0

S1 (and S2) configs

Markdown (GitHub flavored):

interface port-channel9
  switchport mode trunk
  mtu 9216

interface Ethernet1/9/1
  switchport mode trunk
  mtu 9216
  channel-group 9 mode active

interface Ethernet1/9/2
  switchport mode trunk
  mtu 9216
  channel-group 9 mode active

End status of lagg0 after the move, in each of the above cases

Bash:

ifconfig -vv lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
    description: main_LAGG (opt1)
    options=4800028<VLAN_MTU,JUMBO_MTU,HWSTATS,MEXTPG>
    ether f4:90:ea:00:9f:72
    hwaddr 00:00:00:00:00:00
    inet 192.168.98.1 netmask 0xffffff00 broadcast 192.168.98.255
    inet6 fe80::f690:eaff:fe00:9f72%lagg0 prefixlen 64 scopeid 0xd
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=0<>
        flowid_shift: 16
    lagg statistics:
        active ports: 0
        flapping: 0
    lag id: [(0000,00-00-00-00-00-00,0000,0000,0000),
         (0000,00-00-00-00-00-00,0000,0000,0000)]
    laggport: ice0 flags=0<> state=41<ACTIVITY,DEFAULTED>
        [(8000,F4-90-EA-00-9F-72,8005,8000,0005),
         (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
    laggport: ice1 flags=0<> state=41<ACTIVITY,DEFAULTED>
        [(8000,F4-90-EA-00-9F-72,8006,8000,0006),
         (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
    groups: lagg FG_ALL_VLANs FG_CRITICAL_LAN
    media: Ethernet autoselect
    status: no carrier
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    drivername: lagg0
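
A note on reading the output above: `state=41<ACTIVITY,DEFAULTED>` together with the zeroed/FFFF partner tuple suggests that FreeBSD is sending LACPDUs but is still using default partner information on each port, which normally means no LACPDU from the switch has been received there. A small sketch for summarizing this per port; `lagg_port_states` is a hypothetical helper name, and it assumes the `laggport:` line layout shown in this thread:

```sh
#!/bin/sh
# Hypothetical helper: summarize the LACP state of each laggport line
# from "ifconfig -v laggN" output supplied on stdin.
lagg_port_states() {
    awk '$1 == "laggport:" {
        state = $NF                              # e.g. state=41<ACTIVITY,DEFAULTED>
        sub(/^state=[0-9a-fA-F]*</, "", state)   # drop the "state=41<" prefix
        sub(/>$/, "", state)                     # drop the trailing ">"
        # DEFAULTED means the port is using default partner information
        note = (state ~ /DEFAULTED/) ? "no LACPDU received from partner" : "partner information received"
        print $2 ": " state " (" note ")"
    }'
}
```

Usage would be `ifconfig -v lagg0 | lagg_port_states`, once against S1 and once after moving to S2, to compare the two.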

I think I may be missing something basic here. When changing lagg endpoints, does FreeBSD require removal and re-addition of the physical interfaces? Would there be another way short of deleting and re-creating lagg0? If not, I'll lose all derivative configuration (VLANs, firewall rules, etc.) from lagg0 if lagg0 is removed or re-created. Or is there perhaps some other subtlety I may be overlooking?

shurik · Dec 12, 2024

moriksan80 said:
When changing lagg endpoints, does FreeBSD require removal and re-addition of the physical interfaces? Would there be another way short of deleting and re-creating lagg0? If not, I'll lose all derivative configuration (VLANs, firewall rules, etc.) from lagg0 if lagg0 is removed or re-created. Or is there perhaps some other subtlety I may be overlooking?

No, it is not necessary to recreate the lagg interface. Check whether ice0 and ice1 are in the up state and their status is active when connected to S2. If not, something is wrong with the physical layer. Or check whether the ports on S2 are being blocked by STP.
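
If a member port ever does need to be detached and re-attached, ifconfig can do that on the existing lagg0 with `-laggport` and `laggport`, leaving lagg0 and the VLANs on top of it in place. A minimal sketch, untested on the DEC4040: `readd_laggport` and the `IFCONFIG` override variable are hypothetical names used only for illustration, and OPNsense manages interface configuration itself, so changes made from the shell may be overwritten on the next apply or reboot.

```sh
#!/bin/sh
# Hypothetical helper: detach and re-attach one member port of an
# existing lagg without destroying the lagg itself.
# Set IFCONFIG=echo for a dry run that only prints the arguments.
readd_laggport() {
    lagg=$1
    port=$2
    ${IFCONFIG:-ifconfig} "$lagg" -laggport "$port" &&   # remove port from the lagg
    ${IFCONFIG:-ifconfig} "$port" up &&                  # make sure the port is up
    ${IFCONFIG:-ifconfig} "$lagg" laggport "$port"       # add it back
}
```

A dry run such as `IFCONFIG=echo readd_laggport lagg0 ice0` prints the three argument lists without touching the interfaces. Doing this one port at a time, in a maintenance window, keeps the other member attached.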

moriksan80 · Dec 13, 2024

Thank you so much for taking the time to reply. All interfaces were up prior to the switchover. The port-channel was configured on S2 beforehand, so STP blocking shouldn’t occur. However, as I’m presently away, please grant me at least until tomorrow to get back to you with confirmation.

VladiBG · Dec 13, 2024

Recreate the lagg.
lagg doesn't support virtual port-channel. You may try with mlacp, but it may also fail with STP blocking. Your best bet is to use only one of the ports with lagg failover in a multichassis setup.

moriksan80 · Dec 17, 2024

Thank you for your patience, shurik. My responses are found below:

Check whether ice0 and ice1 are in the up state and their status is active when connected to S2.

All three interfaces (lagg0, ice0, ice1) are in UP state with S1 prior to the move. S2 had a replicated configuration of S1, with all three interfaces (po9, e1/9/1, e1/9/2) in admin-disabled state until the physical connection move was ready. At that time, po9 and e1/9/1 were set to no shut, the physical connection for ice0 was moved from S1 to S2 e1/9/1, and the other physical connection, for ice1, was disconnected (to prevent an STP storm between S1 and S2). Ample time (up to 5 minutes) was then allowed for S2 e1/9/1 <----> (lagg0, ice0) to come up. Alas, it does not. I see an interface timeout at layer 2 on S2 e1/9/1. The same happens if e1/9/2 is chosen instead.

If not, something is wrong with the physical layer.

The ports and cables are fine (confirmed by plugging the same cable into a different connection endpoint).

Or check whether the ports on S2 are being blocked by STP.

The LACP ports don't go into STP blocking on S1 or S2. This is confirmed by the output of sh spanning-tree blocked ports.

moriksan80 · Dec 17, 2024

VladiBG said:
Recreate the lagg.
lagg doesn't support virtual port-channel. You may try with mlacp, but it may also fail with STP blocking. Your best bet is to use only one of the ports with lagg failover in a multichassis setup.

VladiBG, thank you for your guidance.
To re-create the lagg, I'll have to delete it. Since the VLAN interfaces and their corresponding firewall rules are derivatives of lagg0, that would also mean losing the rules and various DHCP/DNS settings associated with not just the lagg but also all the remaining interfaces. Essentially, it would involve rebuilding the firewall from scratch. It wouldn't be my preferred option.

Also, since S2 and S3 are already in a (Cisco) vPC, connecting ice0 and ice1 to one port each on S2 and S3 is feasible and is the original design intent. However, moving a single interface doesn't bring ice0 (or ice1) into UP state, so it won't work across S3 either.

Unfortunately, I'm at a standstill. Are there debug logs I could enable on the FreeBSD side to better understand why ice0 and ice1 aren't coming UP for lagg0?
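
Two FreeBSD-side checks that may apply here, sketched under assumptions: the lagg(4) LACP code has a `net.link.lagg.lacp.debug` sysctl (1 = debug, 2 = trace) whose messages go to the kernel log, and LACPDUs can be watched on a member port with tcpdump by filtering on the slow-protocols EtherType 0x8809. The function names and the `SYSCTL`/`TCPDUMP` override variables below are hypothetical, used only so the commands can be dry-run with `echo`.

```sh
#!/bin/sh
# Hypothetical helper: set the LACP debug level in the lagg(4) driver
# (0 = off, 1 = debug, 2 = trace). Output appears in dmesg.
lacp_debug() {
    ${SYSCTL:-sysctl} net.link.lagg.lacp.debug="$1"
}

# Hypothetical helper: capture LACPDUs (EtherType 0x8809) on one
# member port, printing link-level headers without name resolution.
lacp_capture() {
    ${TCPDUMP:-tcpdump} -i "$1" -n -e ether proto 0x8809
}
```

For example, `lacp_debug 1` followed by `lacp_capture ice0` while the cable is on S2 would show whether LACPDUs from the switch arrive at all; if only outgoing frames from F4-90-EA-00-9F-72 appear, the switch side is likely not sending on that port. Setting the level back with `lacp_debug 0` afterwards keeps the log quiet.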