ARP table entries on wrong/duplicate interface

Hello!

I'm running what feels like a rather odd network setup, and it's giving me rather odd issues. If you see any flaws in my train of thought, please let me know what they are!

I've got a machine with two physical interfaces, igb0 and igb1. I want to virtualise a pfSense router/gateway in bhyve on this machine, as well as run a handful of jails and other bhyve guests on the "LAN" side.

I've got igb0 connected to my ISP-provided cat5e cable from the wall. igb0 is then connected to tap0 via bridge0, and tap0 is shared to the bhyve instance running pfsense.
On the "other side" of the bhyve, I've got tap1 connected, which in turn is bridge1'd to igb1. igb1 then connects physically to a switch and wireless access point, for laptops and phones to connect to.
bridge1 will also get more taps connected for more bhyves - I don't think two bhyves can share a tap(?).
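
As far as I know a tap device can only be held open by one process at a time, so each additional bhyve guest would need its own tap added to bridge1. A minimal rc.conf sketch of that, extending the settings quoted later in this post. The name tap2 is an assumption for illustration and does not exist in the setup described here:

```sh
# /etc/rc.conf sketch: tap2 is a hypothetical extra tap for a second guest
cloned_interfaces="lo1 bridge0 bridge1 tap0 tap1 tap2"

# Every guest-facing tap on the LAN side joins the same bridge as igb1
ifconfig_bridge1="addm igb1 addm tap1 addm tap2 up"
```

The second guest would then be started with tap2 as its network backend instead of tap1.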

Neither of the bridges has an IP address. I tried giving bridge1 an IP address and using that to reach the gateway inside the bhyve, but the problems persisted.

Now, my problems seem rather odd to me. *Some* machines on my LAN (10.0.0.0/24) are unable to access the "host" (10.0.0.2, the physical machine hosting everything). Upon looking into this, I found that, in the host's ARP table, the machines unable to connect either have duplicate entries or have a single entry that points to the wrong interface.

By this, I mean that they've either got two ARP table entries, one for tap1 and one for igb1, or just one - in the case of my phone, igb1 only (no tap1 at all).

I am unable to delete the superfluous ARP table entries (on igb1):
Code:
arp: writing to routing socket: No such file or directory

which I guess is related to the fact that igb1 isn't really used, but rather "just" a physical interface?

Here's what my ARP table looks like (do note Android device on igb1, and laptop.host.domain.tld on two interfaces):
Code:
root@host:~ # arp -a
host.domain.tld (10.0.0.2) at 00:bd:97:18:f7:01 on tap1 permanent [ethernet]
torrent.host.domain.tld (10.0.0.3) at 00:bd:97:18:f7:01 on tap1 permanent [ethernet]
router.host.domain.tld (10.0.0.1) at 00:a0:98:c7:06:c5 on tap1 expires in 264 seconds [ethernet]
bots.host.domain.tld (10.0.0.6) at 00:bd:97:18:f7:01 on tap1 permanent [ethernet]
smokeping.host.domain.tld (10.0.0.7) at 00:bd:97:18:f7:01 on tap1 permanent [ethernet]
samba.host.domain.tld (10.0.0.5) at 00:bd:97:18:f7:01 on tap1 permanent [ethernet]
laptop.host.domain.tld (10.0.0.181) at ac:bc:32:99:a5:4d on tap1 expires in 14 seconds [ethernet]
android-deviceID.host.domain.tld (10.0.0.187) at 40:b8:37:11:1c:27 on igb1 expires in 815 seconds [ethernet]
laptop.host.domain.tld (10.0.0.181) at ac:bc:32:99:a5:4d on igb1 expires in 1132 seconds [ethernet]

Here's what my ifconfig looks like:
Code:
root@host:~ # ifconfig -a
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=2400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6>
    ether d0:50:99:c1:4b:ef
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (100baseTX <full-duplex>)
    status: active
igb1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=2400b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6>
    ether d0:50:99:c1:4b:f0
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
    inet 127.0.0.1 netmask 0xff000000
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    groups: lo
lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    inet 127.0.1.1 netmask 0xffffffff
    inet 127.0.1.4 netmask 0xffffffff
    inet 127.0.1.2 netmask 0xffffffff
    inet 127.0.1.3 netmask 0xffffffff
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    groups: lo
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 02:32:fd:c9:80:00
    nd6 options=9<PERFORMNUD,IFDISABLED>
    groups: bridge
    id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
    maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
    root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
    member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
           ifmaxaddr 0 port 7 priority 128 path cost 2000000
    member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
           ifmaxaddr 0 port 1 priority 128 path cost 2000000
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 02:32:fd:c9:80:01
    nd6 options=9<PERFORMNUD,IFDISABLED>
    groups: bridge
    id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
    maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
    root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
    member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
           ifmaxaddr 0 port 8 priority 128 path cost 2000000
    member: igb1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
           ifmaxaddr 0 port 2 priority 128 path cost 2000000
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=80000<LINKSTATE>
    ether 00:bd:8a:18:f7:00
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    groups: tap
    Opened by PID 83037
tap1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=80000<LINKSTATE>
    ether 00:bd:97:18:f7:01
    inet 10.0.0.2 netmask 0xffffff00 broadcast 10.0.0.255
    inet 10.0.0.3 netmask 0xffffffff broadcast 10.0.0.3
    inet 10.0.0.7 netmask 0xffffffff broadcast 10.0.0.7
    inet 10.0.0.5 netmask 0xffffffff broadcast 10.0.0.5
    inet 10.0.0.6 netmask 0xffffffff broadcast 10.0.0.6
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    groups: tap
    Opened by PID 83037

ezjail is responsible for adding the extra addresses to tap1. Since they're /32s, I assume it's safe.
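
One thing the ifconfig output shows is that tap1 is both a member of bridge1 and the interface carrying the host's 10.0.0.2 address. The FreeBSD Handbook's bridging section suggests that when the host needs an address on a bridged segment, it goes on the bridge rather than on a member. Whether that explains the ARP entries here is not established in this thread. A rough sketch for flagging such members from `ifconfig -a` output (members_with_inet is a made-up name):

```sh
#!/bin/sh
# Print bridge member interfaces that also carry an IPv4 address,
# given FreeBSD `ifconfig -a` output on stdin.
# members_with_inet is a hypothetical helper name.
members_with_inet() {
    awk '
    { first = substr($0, 1, 1) }
    # An unindented line such as "tap1: flags=..." starts a new interface.
    NF && first != " " && first != "\t" { cur = $1; sub(/:$/, "", cur) }
    # "member: igb1 flags=..." lines appear under a bridge.
    $1 == "member:" { bridge_of[$2] = cur }
    # Remember the first IPv4 address seen on each interface.
    $1 == "inet" && !(cur in addr) { addr[cur] = $2 }
    END {
        for (m in bridge_of)
            if (m in addr)
                print m " (member of " bridge_of[m] ") has " addr[m]
    }'
}

# On the host: ifconfig -a | members_with_inet
```

For the output above this should print a single line for tap1, since igb1 and tap0 carry no inet address.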

Here's the relevant section from my rc.conf:
Code:
cloned_interfaces="lo1 bridge0 bridge1 tap0 tap1"

ifconfig_bridge0="addm igb0 addm tap0 up"
ifconfig_bridge1="addm igb1 addm tap1 up"

ifconfig_lo1="up"
ifconfig_igb0="up"
ifconfig_igb1="up"
ifconfig_tap1="inet 10.0.0.2/24"
defaultrouter="10.0.0.1"

I've been trying most things I can think of, and I hope I'm not missing something super obvious.

Thanks in advance!
 
It's usually caused by having two or more interfaces in the same network. This is a bad idea in general.
 

Alright, makes sense to me, apart from... As far as I can see, I don't have two interfaces in the same network. My igb1 isn't even in a network, the only interface I have in a network is tap1.

Am I missing something here?
 
Is the pfSense machine perhaps bridging tap0 and tap1? The only way a MAC address would show up on the "wrong" side would be if there's a network loop somewhere. Something appears to connect both ends to the same broadcast domain.
 
Is the pfSense machine perhaps bridging tap0 and tap1?

Not bridging. It's using them (known to pfSense as vtnet0 and vtnet1) as its WAN and LAN interfaces respectively, with NAT and a firewall between them. The pfSense ARP table has some WAN entries as well (my upstream gateway and some other machine with my ISP's hostname), but they are on vtnet0. All in all, the pfSense ARP table looks perfectly reasonable.

The only way a MAC address would show up on the "wrong" side would be if there's a network loop somewhere. Something appears to connect both ends to the same broadcast domain.

It's not on the "wrong" "side"; everything on the WAN side (igb0, bridge0, tap0, pfSense vtnet0) is working as intended. The problem is that I have ARP entries on the "wrong" interface, igb1, which doesn't have an IP address or anything. It's just bridged to tap1. I don't have any network loops that I know of.
 
Adding gateway_enable="YES" to /etc/rc.conf didn't remove the duplicate ARP table entries, but clients can talk to the server anyway. I guess this is because, instead of the packets being "caught" on igb1, they're now forwarded to bridge1.
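
For reference, gateway_enable="YES" makes the rc scripts switch on IPv4 forwarding (the net.inet.ip.forwarding sysctl) at boot. A short sketch of the setting plus a way to check it at runtime, assuming a stock FreeBSD sysctl(8):

```sh
# /etc/rc.conf: turn on IPv4 forwarding at boot
gateway_enable="YES"

# To check or change it at runtime without rebooting (run as root):
#   sysctl net.inet.ip.forwarding
#   sysctl net.inet.ip.forwarding=1
```

This only changes whether the host forwards packets between interfaces. It does not by itself decide which interface an ARP entry is recorded on, which fits the duplicates still being present.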

I'm not marking this thread as solved because I feel it isn't solved in a good way. After all, I still have the same MAC addresses tied to two separate interfaces.
 
I had this problem too (yesterday!). arp -a showed my gateway on a physical interface that is a member of a bridge. I was confused about how this happened, because the physical interface doesn't even have an IP address, just like yours.

What I found out is that after making my 3 NICs members of a bridge, I forgot to remove the physical interface from my DHCP server configuration, so it was still serving DHCP on that NIC. Check your /etc/rc.conf and your service configurations for anything that still runs on the physical NIC. If you find something, replace the physical interface with your bridge. The ARP entries will disappear on their own once they expire.
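
A rough way to hunt for such leftovers is to grep the config files for the physical interface name. The sketch below is only an illustration: iface_refs is a made-up helper, and the paths in the example are assumptions that depend on which services are installed.

```sh
#!/bin/sh
# Print file:line:content for every whole-word mention of an interface
# name in the given files. iface_refs is a hypothetical helper name.
iface_refs() {
    iface=$1
    shift
    # -w matches whole words only (igb1 but not igb10), -H always prints
    # the file name, -n adds line numbers. "|| true" hides grep's
    # non-zero exit status when nothing matches.
    grep -Hnw -- "$iface" "$@" 2>/dev/null || true
}

# Example (paths are assumptions; adjust to the services you run):
# iface_refs igb1 /etc/rc.conf /usr/local/etc/dhcpd.conf
```

Because grep counts an underscore as part of a word, a line like ifconfig_igb1="up" is not reported, while values such as "addm igb1" or a service's interface list are. The hits still need reading by hand, since a bridge's addm line is a legitimate mention.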
 