VLAN interface does not receive CARP packets

Hi,

I have two servers (Proxmox and FreeNAS), and each of them should run an OPNsense VM so that I can run the pair in HA mode.
Both OPNsense VMs run fine, and everything works as long as only one of them is running at a time.
For HA, the VMs need to communicate via CARP, and this is where the problem is: CARP packets get lost between my lagg0 interface (which carries tagged VLANs) and the vlan2/vlan42/vlan43 interfaces (see the red line in the picture below). The other direction works fine: CARP packets from vlan2/vlan42/vlan43 can pass to lagg0.

This is what my setup looks like:
opnsense1.png


I used tcpdump and wireshark to analyse the packets.
All unicast packets are working fine between both VMs.
CARP is working fine from the VM on FreeNAS down to the VM on Proxmox (I can see CARP packets from both VMs if I listen on "vmbr2").
If I listen on "bridge2" or "vlan2" I only see CARP packets from the VM on the FreeNAS host.
If I listen on "lagg0" I see CARP packets of both VMs.

So it really looks like the CARP packets from the VM on Proxmox get lost between "lagg0" and "vlan2" on the FreeNAS host.
I also checked the headers of the CARP packets captured on "lagg0" and verified that they are tagged with VLAN ID 2.
I have no idea why all other traffic with VLAN ID 2 passes from "lagg0" to "vlan2" but the CARP packets do not.

Tcpdump vlan2:
Code:
tcpdump -B 16000 -i vlan2 -s0 -vv -n | grep VRRPv2
tcpdump: listening on vlan2, link-type EN10MB (Ethernet), capture size 262144 bytes
    192.168.0.3 > 224.0.0.18: vrrp 192.168.0.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 100, authtype none, intvl 1s, length 36, addrs(7): 204.26.121.131,113.231.163.36,111.27.69.51,139.52.213.104,103.208.241.213,47.0.182.224,252.5.96.64

Tcpdump lagg0:
Code:
tcpdump -B 16000 -i lagg0 -s0 -vv -n | grep VRRPv2
tcpdump: listening on lagg0, link-type EN10MB (Ethernet), capture size 262144 bytes
    192.168.0.3 > 224.0.0.18: vrrp 192.168.0.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 100, authtype none, intvl 1s, length 36, addrs(7): 111.100.125.40,62.223.145.20,142.113.126.246,185.154.196.161,103.0.53.151,212.222.211.243,85.170.90.42
    192.168.43.3 > 224.0.0.18: vrrp 192.168.43.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 3, prio 100, authtype none, intvl 1s, length 36, addrs(7): 96.66.4.97,132.211.112.69,212.103.17.121,143.149.122.238,4.30.250.155,65.48.107.187,65.63.152.79
    192.168.42.3 > 224.0.0.18: vrrp 192.168.42.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 100, authtype none, intvl 1s, length 36, addrs(7): 32.109.10.203,100.5.167.83,103.19.183.233,178.227.26.106,226.114.149.25,181.206.12.219,83.59.18.64
    192.168.0.2 > 224.0.0.18: vrrp 192.168.0.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 0, authtype none, intvl 1s, length 36, addrs(7): 111.100.125.40,62.223.145.21,135.66.16.237,211.4.211.42,117.193.93.14,212.76.235.95,104.10.216.75

Tcpdump vmbr2:
Code:
tcpdump -B 16000 -i vmbr2 -s0 -vv -n | grep VRRPv2
tcpdump: listening on vmbr2, link-type EN10MB (Ethernet), capture size 262144 bytes
    192.168.0.3 > 224.0.0.18: vrrp 192.168.0.3 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 100, authtype none, intvl 1s, length 36, addrs(7): 121.117.48.10,56.7.253.87,70.65.213.181,67.105.250.170,222.9.115.236,50.252.143.196,169.224.5.94
    192.168.0.2 > 224.0.0.18: vrrp 192.168.0.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 0, authtype none, intvl 1s, length 36, addrs(7): 121.117.48.10,56.7.253.88,58.219.37.12,15.210.142.91,192.209.83.22,41.242.4.18,137.13.56.178
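
The three captures above are easier to compare when each one is reduced to its unique senders. The following is only a rough sketch: `carp_senders` is a hypothetical helper name (not part of tcpdump or FreeBSD), and it assumes tcpdump prints advertisements in the same `VRRPv2, Advertisement, vrid N, prio N` form as the output shown above.

```shell
#!/bin/sh
# carp_senders: hypothetical helper, not part of tcpdump or FreeBSD.
# Reads `tcpdump -n -vv` text on stdin and prints one line per unique
# CARP/VRRPv2 sender in the form "<source-ip> vrid=<n> prio=<n>".
carp_senders() {
    awk '/VRRPv2, Advertisement/ {
        src = $1; vrid = ""; prio = ""
        # Pick up the values that follow the "vrid" and "prio" keywords.
        for (i = 1; i < NF; i++) {
            if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
            if ($i == "prio") { prio = $(i + 1); sub(/,$/, "", prio) }
        }
        key = src " vrid=" vrid " prio=" prio
        # Print each sender only the first time it is seen.
        if (!(key in seen)) { seen[key] = 1; print key }
    }'
}

# Example (needs root, stop with Ctrl-C; -l makes tcpdump line-buffered):
#   tcpdump -l -B 16000 -i lagg0 -s0 -vv -n | carp_senders
#   tcpdump -l -B 16000 -i vlan2 -s0 -vv -n | carp_senders
```

Running it once per interface should give short lists that can be compared directly. A sender that appears for "lagg0" but not for "vlan2" would be one whose advertisements are dropped in between.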

Code:
ifconfig vlan2
vlan2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: WAN vlan
        ether f4:52:14:88:30:60
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect
        status: active
        vlan: 2 vlanpcp: 0 parent interface: lagg0
        groups: vlan

Code:
ifconfig lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
        description: 10G + 1G bond
        options=88<VLAN_MTU,VLAN_HWCSUM>
        ether f4:52:14:88:30:60
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect
        status: active
        groups: lagg
        laggproto failover lagghash l2,l3,l4
        laggport: em0 flags=0<>
        laggport: mlxen0 flags=5<MASTER,ACTIVE>

Code:
ifconfig mlxen0
mlxen0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
        description: Bond SFP+
        options=8c00a8<VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:88:30:60
        hwaddr f4:52:14:88:30:60
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active

Does anyone have an idea?

I've been trying to fix this for days, and no one in the OPNsense, FreeNAS or Proxmox forums has any idea why the CARP packets get lost.

I can post tcpdumps, logs and so on if more information is needed.

It's FreeNAS version 11.3U4.1, so it's based on FreeBSD 11.3.
 