Solved No ingress VLAN traffic

tscho

New Member

Reaction score: 1
Messages: 12

Hi,

I've been fighting with FreeBSD networking for a couple of days. I'm trying to create a VLAN on my bhyve host for some VMs. Everything works fine as long as I don't use VLAN tags. I have already swapped the physical switch and double-checked my VLAN setup, but to me it looks like a FreeBSD problem. I also see the MAC addresses of the router and the VM in the MAC address table of my switch.

My setup looks like this:
Code:
┌──────────────────────────────────┐  ┌───────┐   ┌───────┐  ┌──────┐
│ ┌─────┐         ┌───────┐  ┌┬───┬┤  │       │   │       │  │Vl1   │
│ │VM1  │         │bridge1│  ││em0│├──┤       │   │       │  │Vl101 │
│ │Vl104├─────────┤       │  │└───┘│  │       │   │       │  │Vl102 │
│ └─────┘         │Vl104  ├──┤┌───┐│  │       ├───│       ├──┤Vl104 │
│                 │       │  ││em1│├──┤       │   │       │  │      │
│                 │       │  │└───┘│  │       │   │       │  │      │
│                 │       │  │lagg0│  │       │   │       │  │      │
│                 └───────┘  └─────┤  │switch1│   │switch2│  │router│
└──────────────────────────────────┘  └───────┘   └───────┘  └──────┘
If I ping from inside the VM, I see the packets arriving on my router, which then also sends a response:
Code:
[#] sudo tcpdump -i eth0.104

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.104, link-type EN10MB (Ethernet), capture size 262144 bytes
13:21:42.426419 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
13:21:42.426490 ARP, Reply 10.192.168.1 is-at 04:18:d6:83:27:df (oui Unknown), length 28
13:21:43.460105 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
13:21:43.460169 ARP, Reply 10.192.168.1 is-at 04:18:d6:83:27:df (oui Unknown), length 28
On the other hand, I don't see the response packets arriving on my host on the lagg0.104 interface:
Code:
[#] sudo tcpdump -nni lagg0.104
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on lagg0.104, link-type EN10MB (Ethernet), capture size 262144 bytes
00:18:23.237551 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
00:18:24.257961 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
00:18:25.277020 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
I created everything with the neat vm tool:
Code:
[#] sudo vm switch list

NAME             TYPE      IFACE      ADDRESS  PRIVATE  MTU  VLAN  PORTS
Zuheim           standard  vm-Zuheim  -        no       -    -     lagg0
ChocolateCoding  standard  bridge1    -        no       -    104   lagg0

Did I miss something here?
 

SirDice

Administrator
Staff member
Moderator

Reaction score: 6,975
Messages: 28,967

Did you create a trunk on the switch ports? Your switch must be able to understand 802.1q VLAN tagging. The specific ports must be configured to allow VLAN-tagged packets.
 
tscho

The switch is configured with trunk interfaces. If I set an IP address on the lagg0.104 interface, I can reach the router from the host. But communication with the VM is not possible.

Code:
[tscho@horst01:~] ifconfig lagg0.104
lagg0.104: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1496
        description: vm-vlan-ChocolateCoding-lagg0.104
        ether 00:19:99:ae:b7:0e
        inet6 fe80::219:99ff:feae:b70e%lagg0.104 prefixlen 64 scopeid 0x7
        inet 172.16.0.20 netmask 0xffffff00 broadcast 172.16.0.255
        groups: vlan vm-vlan viid-71b8c@
        vlan: 104 vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

[tscho@horst01:~] ping 10.192.168.1
PING 10.192.168.1 (10.192.168.1): 56 data bytes
64 bytes from 10.192.168.1: icmp_seq=0 ttl=64 time=0.256 ms
64 bytes from 10.192.168.1: icmp_seq=1 ttl=64 time=0.276 ms
 

SirDice

Don't configure any VLANs on the VM itself. It gets the untagged VLAN network (your bridge1 does the tagging/untagging).
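For reference, a minimal rc.conf sketch of that layout, assuming the interface names used in this thread (em0, em1, lagg0, bridge1, VLAN 104). vm-bhyve normally creates the VLAN interface and the bridge itself, so this is only the hand-configured equivalent and has not been tested on this hardware. The VM's tap interface joins bridge1 untagged, and nothing VLAN-related is configured inside the guest.

```shell
# /etc/rc.conf fragment (sketch; interface names taken from this thread)
ifconfig_em0="up"
ifconfig_em1="up"
cloned_interfaces="lagg0 bridge1"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 up"
# Creates lagg0.104: tagged VLAN 104 on top of the lagg
vlans_lagg0="104"
ifconfig_lagg0_104="up"
# bridge1 carries VLAN 104 untagged; the VM tap interfaces join this bridge
ifconfig_bridge1="addm lagg0.104 up"
```

After boot, ifconfig bridge1 should list lagg0.104 as a member. If it does not, the bridge may have been configured before the VLAN interface existed, and the addm has to be repeated by hand.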
 
tscho

Thank you for your reply. I'm aware of that. I see the traffic from the VM on the bridge1 interface and then incoming on the router. My router replies, for instance to the ARP requests, but then I don't see this traffic incoming on my bridge1.
If I configure an IP address on the bridge1 interface on the host, I see ARP requests on bridge1 and a ping is possible.
 

SirDice

How is the router set up? Is that also a FreeBSD machine?

You shouldn't need an IP address on the virtual switches (bridge1). But your router must be set as the gateway for each VLAN.

For example, this is my router:
Code:
em1.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=3<RXCSUM,TXCSUM>
        ether 90:e2:ba:54:ff:22
        inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
        inet6 fe80::92e2:baff:fe54:ff22%em1.10 prefixlen 64 scopeid 0x7
        inet6 2001:470:1f15:bcd::1 prefixlen 64
        groups: vlan
        vlan: 10 vlanpcp: 0 parent interface: em1
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
dice@maelcum:~ % ifconfig em1.20
em1.20: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=3<RXCSUM,TXCSUM>
        ether 90:e2:ba:54:ff:22
        inet 10.0.1.1 netmask 0xffffff00 broadcast 10.0.1.255
        inet6 fe80::92e2:baff:fe54:ff22%em1.20 prefixlen 64 scopeid 0x9
        inet6 2001:470:7989:20::1 prefixlen 64
        groups: vlan
        vlan: 20 vlanpcp: 0 parent interface: em1
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
As each network is "directly attached" there are some implicit routes:
Code:
192.168.10.0/24    link#7             U        em1.10
192.168.10.1       link#7             UHS         lo0
10.0.1.0/24        link#9             U        em1.20
10.0.1.1           link#9             UHS         lo0
Basic routing does the rest (you do have gateway_enable="YES"?).

As you can see I have a VLAN 10 with 192.168.10.0/24 (gateway address 192.168.10.1) and VLAN 20 with 10.0.1.0/24 (gateway 10.0.1.1).
 
tscho

It's a Ubiquiti router, which runs Linux. I configured the IP address on the bridge1 interface just to check whether I can get IP connectivity between the bhyve host and the router, which I can. But layer 2 traffic between the VM and the router is not possible.
 
tscho

I checked again. I configured the IP address on my bhyve host on lagg0.104, there I can reach the router. If I configure the IP address on the bridge1 I can't reach the router. To me it seems like the layer2 traffic is not going beyond the lagg0.104 interface.

I saw that you configured an MTU of 9000. My MTU is set to the default of 1500. Could this be the problem?
 

SirDice

No, MTU is not the issue here. I enabled Jumbo frames, that's really the only difference.

Are the various tap(4) interfaces correctly added to the bridge? In your case you should have a bridge interface named bridge1.

I do recommend turning off every hardware assisted checking on the 'physical' interfaces:
Code:
cloned_interfaces="lagg0"
ifconfig_igb0="inet 192.168.10.180 netmask 255.255.255.0 mtu 9000"
ifconfig_igb1="up mtu 9000 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso"
ifconfig_igb2="up mtu 9000 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso"
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2"
defaultrouter="192.168.10.1"
I'm using igb0 as a 'management' interface for the server. The igb1 and igb2 interfaces are bundled with lagg(4) (similar to your setup). The 'virtual' switches are then tied to the lagg0 interface:
Code:
root@hosaka:~ # vm switch list
NAME     TYPE      IFACE       ADDRESS  PRIVATE  MTU   VLAN  PORTS
servers  standard  vm-servers  -        no       9000  11    lagg0
public   standard  vm-public   -        no       9000  10    lagg0
The actual bridge looks like this:
Code:
root@hosaka:~ # ifconfig vm-servers
vm-servers: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        ether 8e:c9:d0:83:80:6b
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap7 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 18 priority 128 path cost 2000000
        member: tap2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 13 priority 128 path cost 2000000
        member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 12 priority 128 path cost 2000000
        member: lagg0.11 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 2000000
        groups: bridge vm-switch viid-d5539@
        nd6 options=1<PERFORMNUD>
 
tscho

It seems to me like our configuration is pretty much the same, but yours is working and mine isn't. The interface is connected to bridge1:

Code:
[tscho@horst01:~] ifconfig lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=810098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER>
        ether 00:19:99:ae:b7:0e
        inet 172.16.0.10 netmask 0xfffffe00 broadcast 172.16.1.255
        inet6 fe80::219:99ff:feae:b70e%lagg0 prefixlen 64 scopeid 0x4
        inet6 2001:858:5:2500::10 prefixlen 64
        laggproto lacp lagghash l2,l3,l4
        laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Did you change any of the net.link.bridge settings? Mine look like this:
Code:
[tscho@horst01:~] sudo sysctl net.link.bridge.
net.link.bridge.ipfw: 0
net.link.bridge.allow_llz_overlap: 0
net.link.bridge.inherit_mac: 0
net.link.bridge.log_stp: 0
net.link.bridge.pfil_local_phys: 0
net.link.bridge.pfil_member: 1
net.link.bridge.ipfw_arp: 0
net.link.bridge.pfil_bridge: 1
net.link.bridge.pfil_onlyip: 1
 

SirDice

"It seems to me like our configuration is pretty much the same but yours is working and mine isn't."
That's why I'm leaning towards an issue on the router. Apart from the router, we have a near-identical setup.

"Did you change any of the net.link.bridge settings?"
No, nothing.
 
tscho

When I configured an IP address on lagg0.104, it was reachable from the router and the VM. An IP address on bridge1 is not pingable. To me it looks like there is a problem with the bridge interfaces. Is there a way to look at all layer 2 traffic, even malformed packets? tcpdump doesn't show anything.
 

SirDice

"Is there a way to look at all layer 2 traffic, even malformed packets? tcpdump doesn't show anything."
tcpdump(1) already shows layer 2; you can add the -e option to have it show the source and destination MAC addresses. Perhaps that will provide some hints as to why it's not working.
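One way to narrow this down is to capture ARP with -e on both the parent interface and the VLAN interface, then compare request and reply counts. Replies visible on lagg0 but not on lagg0.104 would point at the host; no replies on either would point upstream. The helper below is only a sketch: arp_summary is a made-up name, and the sample lines are the ones posted earlier from lagg0.104.

```shell
#!/bin/sh
# arp_summary (hypothetical helper): count ARP requests and replies in
# tcpdump text output read from stdin. Matches output with or without -e.
arp_summary() {
    awk '/Request who-has/ { req++ }
         /Reply .* is-at/  { rep++ }
         END { printf "requests=%d replies=%d\n", req, rep }'
}

# Live use (as root), on the parent and on the VLAN interface:
#   tcpdump -l -e -nn -c 20 -i lagg0 vlan 104 and arp | arp_summary
#   tcpdump -l -e -nn -c 20 -i lagg0.104 arp | arp_summary

# Sample: the capture posted from lagg0.104 earlier in the thread.
arp_summary <<'EOF'
00:18:23.237551 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
00:18:24.257961 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
00:18:25.277020 ARP, Request who-has 10.192.168.1 tell 10.192.168.50, length 46
EOF
```

For the sample this prints requests=3 replies=0, which matches what was described: requests leave the host, replies never show up on the VLAN interface.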
 
tscho

What kind of network cards do you use? I found a couple of posts where users of Intel cards report similar problems. In my case, the onboard cards are an 82574L and an 82579LM.
 

SirDice

They're Intel Pro/1000, 4 ports integrated on the mainboard. They're detected as igb; this driver used to be 'stand-alone', but on 12 it has been integrated into the em(4) driver. This merge may have introduced new bugs of course, so I can't rule out bugs in the Intel Pro/1000 driver. But so far it seems I've been lucky not to hit any.
 
tscho

I played around a little more and found the following. It seems to me like the VLAN is not ending up in the right bridge/switch, and I have no idea why.

Code:
[tscho@horst01:~] sudo vm switch list
NAME    TYPE      IFACE      ADDRESS  PRIVATE  MTU  VLAN  PORTS
Zuheim  standard  vm-Zuheim  -        no       -    -     em1
CC      standard  vm-CC      -        no       -    500   em1
Code:
[tscho@horst01:~] sudo ifconfig vm-CC addr
58:9c:00:00:55:dc Vlan1 tap5 1178 flags=0<>

[tscho@horst01:~] sudo ifconfig vm-Zuheim addr
04:03:00:00:a3:36 Vlan1 em1 1101 flags=0<>
04:18:00:00:27:df Vlan500 em1 1173 flags=0<>
c8:69:00:00:46:4d Vlan1 em1 1074 flags=0<>
3c:2e:00:00:fd:2c Vlan1 em1 1078 flags=0<>
00:17:00:00:d6:bf Vlan1 em1 1184 flags=0<>
80:2a:00:00:48:a3 Vlan1 em1 1200 flags=0<>
04:18:00:00:27:df Vlan1 em1 1199 flags=0<>
fc:2a:00:00:14:0b Vlan1 em1 1067 flags=0<>
f0:18:00:00:b3:4f Vlan1 em1 1200 flags=0<>
80:2a:00:00:48:d4 Vlan1 em1 1198 flags=0<>
6c:f0:00:00:2e:e8 Vlan1 em1 1199 flags=0<>
08:02:00:00:23:91 Vlan1 em1 1191 flags=0<>
08:02:00:00:23:93 Vlan1 em1 1197 flags=0<>
58:9c:00:00:b2:d5 Vlan1 tap1 1101 flags=0<>
58:9c:00:00:e1:d4 Vlan1 tap3 1159 flags=0<>
58:9c:00:00:eb:df Vlan1 tap4 806 flags=0<>
58:9c:00:00:97:4c Vlan1 tap0 1194 flags=0<>
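To check where a particular MAC address was learned, the ifconfig addr output can be filtered per bridge. A small sketch, with mac_seen as a made-up helper name, using entries from the vm-Zuheim table above. The router's MAC as shown there (04:18:00:00:27:df) turns up on vm-Zuheim, once with Vlan1 and once with Vlan500, and not at all on vm-CC.

```shell
#!/bin/sh
# mac_seen (hypothetical helper): print VLAN and member port for every
# entry matching the MAC in "ifconfig <bridge> addr" output on stdin.
mac_seen() {
    awk -v mac="$1" '$1 == mac { print $2, $3 }'
}

# Live use: ifconfig vm-Zuheim addr | mac_seen 04:18:00:00:27:df

# Sample: three entries copied from the vm-Zuheim table above.
mac_seen 04:18:00:00:27:df <<'EOF'
04:18:00:00:27:df Vlan500 em1 1173 flags=0<>
c8:69:00:00:46:4d Vlan1 em1 1074 flags=0<>
04:18:00:00:27:df Vlan1 em1 1199 flags=0<>
EOF
```

Running the same filter against each bridge shows quickly which one actually learned the address.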
 

sko

Well-Known Member

Reaction score: 218
Messages: 429

Can you ping the VM host from your router? The Linux network stack does some really weird/annoying stuff like replying to ARP requests on _ANY_ interface, even the ones that aren't responsible for the IP and even on interfaces that are located in a totally different (and supposedly separated) network/VLAN. This default behaviour is not only very bad from a security standpoint, but also leads to weird problems that are extremely annoying to debug.

I somehow suspect this is exactly what's happening here - your router answers to an ARP request on the wrong VLAN; hence the reply never reaches the FreeBSD machine.
I'm connecting Linux hosts/appliances/embedded systems exclusively to access ports because this behaviour (for me it's a bug!) has bitten us so many times in the past...

Assuming that your bridge/vlan configuration is correct (vlan104 interface with lagg0 as parent; vlan104 + tap interfaces are members of the bridge), the bridge acts as if it were connected to an access port, so the tap interfaces for the VMs should _not_ use vlan tagging - vlan tagging is handled when traffic enters/leaves the bridge network through the vlan interface to lagg0. However, the output of vm switch list you've posted looks like you have connected the VM interfaces directly to em1 instead of the bridge?
But you've only given very scarce snippets of your interface configurations - can you give us your full network configuration in rc.conf and full output of ifconfig? (Maybe use something like nopaste, gitlab snippets etc and post a link.)


Double-check VLAN and port configuration on all switches and especially the router. VLANs usually need to be configured on all switches to forward tagged packets (VTP or other mechanisms are _very_ helpful if you have more than 1 switch).
Maybe you can also connect another FreeBSD host to switch 2 to verify that VLAN traffic is properly forwarded through the switches and their interconnecting trunk. Using a mirror/monitor port to analyze traffic from/to the router might also be helpful to see on what VLAN the packets are sent/received by the router.
 
tscho

I can't ping the router from the VM, or in the other direction.
In my opinion the problem is that the MAC table on bridge1 is not getting populated. If I configure the IP address directly on the lagg0.104 interface, a ping in both directions is possible. I would say the VLAN configuration in between is working as intended.

The output of rc.conf and ifconfig can be found here
 

sko

Where do you configure/set up the bridges? They are not in the rc.conf

All members of the bridges are still in "STP LEARNING" mode, so it seems they never receive any traffic to finally switch to FORWARDING mode.

em0/1 and the lagg interface across those two have HW-offloading for VLAN-tagging activated - this usually interferes with VLAN tagging on higher levels (like above a lagg) as the HW is filtering every VLAN-tag. This might be the culprit why your bridges never receive any tagged packets.
Also, some of the cheaper/older Intel Pro/1000 chipsets have buggy VLAN and checksum offloading. Try disabling all HW offloads and only enable them one by one while confirming that everything is still working.
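A rough sketch of the "disable everything first" step, assuming the interface names from this thread (em0, em1, lagg0). The option names are standard ifconfig(8) capability flags; whether a given NIC honours each of them depends on the driver. The script only prints the commands unless DRYRUN=0 is set, and disable_offloads is just an illustrative function name.

```shell
#!/bin/sh
# Offload capabilities to switch off; each is an ifconfig(8) option name.
OFFLOADS="-rxcsum -txcsum -tso -lro -vlanhwtag -vlanhwfilter -vlanhwcsum -vlanhwtso"

# DRYRUN=1 (the default here) only prints the commands.
# Run as root with DRYRUN=0 to apply them.
DRYRUN="${DRYRUN:-1}"

disable_offloads() {
    for nic in "$@"; do
        if [ "$DRYRUN" = 1 ]; then
            echo ifconfig "$nic" $OFFLOADS
        else
            ifconfig "$nic" $OFFLOADS
        fi
    done
}

# Physical ports first, then the lagg on top of them.
disable_offloads em0 em1 lagg0
```

To re-enable a single feature afterwards, drop the leading minus (for example ifconfig em0 vlanhwtag) and re-test the VLAN before moving on to the next one.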
 
tscho

The bridges are created by vm-bhyve(8). I also configured them by hand, with the same result.

The result with HW offloads disabled is the same. I also enabled STP on the switch. ifconfig now shows the MAC of the root bridge, but the result is the same.

The untagged traffic on the bridge vm-Zuheim is working perfectly. Probably the driver for the two onboard NICs (82574L and 82579LM) is buggy.
 
tscho

The problem was caused by the onboard card(s). I just put an Intel I350 into the machine and everything is working. Thank you all for your help and your patience.
 