Schrodinger's Network: Same packet getting passed and blocked on same interface, in same direction

I have a fun issue here. I think it's just some critical misunderstanding on my side, but I figured it was worth typing up.

I'm trying to set up a Windows guest with bhyve, using a bridged interface for networking and pf as the firewall.

I have the VM booted up and set up with a virtual interface (tap0) with an IP address given by DHCP. The server's real interface (igb0) also has an IP address from DHCP. They're bridged together on bridge0.

Here's the FreeBSD version, network interfaces and pf.conf:

Code:
[root@td350 ~]# uname -a
FreeBSD td350 11.1-RELEASE-p9 FreeBSD 11.1-RELEASE-p9 #0: Tue Apr  3 16:59:16 UTC 2018     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
[root@td350 ~]# ifconfig
...
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6>
        ether 08:94:ef:50:d5:7a
        hwaddr 08:94:ef:50:d5:7a
        inet 10.10.1.31 netmask 0xffffff00 broadcast 10.10.1.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether 00:bd:ff:9a:f7:00
        hwaddr 00:bd:ff:9a:f7:00
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: tap
        Opened by PID 43738
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:ae:37:60:33:00
        nd6 options=9<PERFORMNUD,IFDISABLED>
        groups: bridge
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 6 priority 128 path cost 2000000
        member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 3 priority 128 path cost 20000
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
        groups: pflog

[root@td350 ~]# cat /etc/pf.conf
# Interfaces
default_if = "igb0"
tap_if = "tap0"

# Local traffic
set skip on lo0
set skip on bridge0

# Block everything coming in by default
block in log

# But allow everything going out from here
pass out all keep state

# DHCP Rules for VMs
pass in quick on {$default_if $tap_if} inet proto udp from any port 67:68 to any port 67:68 keep state

# SSH
pass in on $default_if inet proto tcp from any to ($default_if) port ssh keep state

[root@td350 ~]# pfctl -F rules -f /etc/pf.conf
rules cleared

Alright, let's set up the experiment!

First, let's start up the bhyve VM that we're going to use for this test:
Code:
[root@td350 ~]# bhyve -c 2 -m 2G -H -w \
-s 0,hostbridge \
-s 3,ahci-hd,windows2016.img \
-s 5,virtio-net,tap0 \
-s 31,lpc \
-l com1,stdio \
-l com2,/dev/nmdm1A \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
windows2016

And that seems to start up fine:

Code:
Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.

C:\Windows\system32>

I have a tmux window running 3 different tcpdump commands to see how the packets travel through my system. I'm only going to look at ICMP packets to filter out regular network noise.

The first one is listening on the VM's tap0 interface:
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i tap0 icmp
tcpdump: listening on tap0, link-type EN10MB (Ethernet), capture size 262144 bytes

The second one is listening on pflog to check what's getting blocked system-wide
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i pflog0 icmp
tcpdump: listening on pflog0, link-type PFLOG (OpenBSD pflog file), capture size 262144 bytes

And the third one is on igb0 which is the real networking hardware of the PC
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i igb0 icmp
tcpdump: listening on igb0, link-type EN10MB (Ethernet), capture size 262144 bytes

Let's go to the VM and fire off some packets:

Code:
C:\Windows\system32>ping 8.8.8.8

Pinging 8.8.8.8 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 8.8.8.8:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss)

Ok, now let's check to see how everything worked.

VM Interface, tap0:
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i tap0 icmp
tcpdump: listening on tap0, link-type EN10MB (Ethernet), capture size 262144 bytes
1523018732.893835 00:a0:98:08:5c:c8 > f0:9f:c2:67:7b:f1, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 13858, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 1, length 40
1523018737.578264 00:a0:98:08:5c:c8 > f0:9f:c2:67:7b:f1, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 13859, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 2, length 40
1523018742.572494 00:a0:98:08:5c:c8 > f0:9f:c2:67:7b:f1, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 13860, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 3, length 40
1523018747.582308 00:a0:98:08:5c:c8 > f0:9f:c2:67:7b:f1, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 13861, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 4, length 40

Packets dropped by pf, pflog0:
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i pflog0 icmp
tcpdump: listening on pflog0, link-type PFLOG (OpenBSD pflog file), capture size 262144 bytes
1523018732.893848 rule 0/0(match): block in on tap0: (tos 0x0, ttl 128, id 13858, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 1, length 40
1523018737.578274 rule 0/0(match): block in on tap0: (tos 0x0, ttl 128, id 13859, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 2, length 40
1523018742.572509 rule 0/0(match): block in on tap0: (tos 0x0, ttl 128, id 13860, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 3, length 40
1523018747.582319 rule 0/0(match): block in on tap0: (tos 0x0, ttl 128, id 13861, offset 0, flags [none], proto ICMP (1), length 60)
    10.10.1.113 > 8.8.8.8: ICMP echo request, id 1, seq 4, length 40

Physical outside interface, igb0:
Code:
[root@td350 ~]# tcpdump -vv -n -e -tt -i igb0 icmp
tcpdump: listening on igb0, link-type EN10MB (Ethernet), capture size 262144 bytes

Woah, did you see that?

The packets travel through the tap0 interface first, *then* they get blocked coming into the interface. igb0 sees nothing.

I tried to make a quick diagram:

Code:
pflog0 (system-wide)

 +-igb0-(sees none of this)-------------------------------+
 |                       +-tap0----------------+          |
 |  icmp packet -------> | (1) Passes through! |--???-->  |
 |                     | +---------------------+          |
 |              <--???-+                                  |
 |    (2) Dropped by pf                                   |
 +--------------------------------------------------------+

So (1) happens at 1523018732.893835 Unix time and (2) happens at 1523018732.893848 Unix time. First the packet passes through tap0 in the incoming direction; the first tcpdump, listening on tap0, tells us that. Then the packet gets dropped because there's a match on "block in on tap0"... so, again in the incoming direction on tap0. But we've already seen it go through!

I don't understand how a packet can be dropped after it's already gone through an interface.

I hope that explains the issues I'm having wrapping my head around this. Any help is appreciated!
 
From what I can see the packet comes into the host on tap0, then gets blocked by the firewall.

Code:
       |----------HOST-----------|
GUEST -> tap0 -> BRIDGE -> igb0 -> NETWORK
              ^
              |
              Packet stopped here

I'm not a pf expert but do you not just need a rule to allow traffic going through the host from guest interfaces?
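
For the ping test above, a rule along these lines might be enough. This is only a sketch: it assumes the $tap_if macro from the posted pf.conf, and that the rule is placed after the "block in log" line, since pf applies the last matching rule unless "quick" is used.

Code:
# Hypothetical addition to the posted /etc/pf.conf, placed after "block in log".
# Lets ICMP from the guest enter the host on tap0; the existing
# "pass out all keep state" rule then covers the way out through igb0.
pass in on $tap_if inet proto icmp from any to any keep state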
 
Also not a pf expert here :p

However, there's a rule in pf.conf allowing all traffic on bridge0 as well as lo0 for good measure:

Code:
# Local traffic
set skip on lo0
set skip on bridge0

So pf shouldn't be dropping the packet there, right?
 
Didn’t see those, but then pf is blocking packets in on tap0. As a simple test does adding a skip rule for tap0 fix it?
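
Something like this next to the existing skip lines, as a rough and untested sketch. Note that "set skip on bridge0" only exempts bridge0 itself, while the pflog output shows the block matching on tap0.

Code:
# Local traffic
set skip on lo0
set skip on bridge0
# Assumption for this test: exempt the VM's tap interface from pf entirely
set skip on tap0

Keep in mind this turns off all filtering on tap0, so it's more of a diagnostic step than a final ruleset.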
 
Yes and no.

Yes, it allows traffic to go through tap0, go out through igb0 and come back the same way again.

But no, that doesn't really solve the overarching problem of knowing why a packet shows up in both tcpdump commands. It doesn't make sense to me for a packet to show up as both passing through an interface and rejected on that interface.

It also will make the pf.conf a little harder to follow, since all the rules will be based on the real physical interface instead of each VM's virtual interface. It would be great to have rules written like...

Code:
pass in on $vm inet proto tcp from any to ($vm) port 22

instead of...

Code:
pass in on $default_if inet proto tcp from any to ($vm) port 22

Although now that I see the two rules written out I have to agree it doesn't look that different...

I'll write the rules out that other way but will keep eyes on this thread in case anyone can clear up what's really going on internally.

But thanks again for your help!
 
The packets show up in both your tcpdump commands because the first one is showing the packets coming into tap0, at which point they get dropped. The second command is monitoring pflog0, which isn't a real network interface. It's just showing you a log of the packets that are being dropped, which are those 4 that came in on tap0.
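
If you want pflog0 to show passed packets as well, one option is to put the log keyword on a pass rule. In the pf.conf you posted, only the "block in log" rule logs anything. A sketch, assuming the $tap_if macro from that file and placement after the block rule:

Code:
# Hypothetical rule: pass ICMP from the guest and log it to pflog0.
# With keep state, plain "log" records only the packet that creates the state.
pass in log on $tap_if inet proto icmp all keep state

With that in place, the tcpdump on pflog0 should label those packets as passed rather than blocked, which makes it easier to tell the two cases apart.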
 
The packets show up in both your tcpdump commands because the first one is showing the packets coming into tap0

That sounds like such a weird idea to me that I'd call it a bug. If tcpdump says that it saw a packet, it by definition means it wasn't dropped, right? At least not before coming into the interface. If it was dropped after it went through that would make sense to me but then it would be dropped by an out rule, not by an in rule... right?

I just don't want to get to a point where I start doubting everything that tcpdump says from now on :\
 