Need some help with epair / bridge for multiple Jails

The Goal
Run many jails that each serve their own ssh. I got about as far as getting the jails running and then hit a wall with networking.

It seemed simple enough going in but I'm completely crashing and burning for some reason. Any help would be appreciated.

The Configuration details
The host is a VM with two network interfaces exposed. (I did that to keep from locking myself out of the machine every time I mess up.) The network itself is a bridged network on the VM host.

I'm using FreeBSD 13.2

em0 - admin interface
vtnet0 - the working interface where I'd like to serve ssh from each jail.

The jail is pretty much a complete FreeBSD 13.2 environment made with bsdinstall jail. I did disable all the services except for sshd. Here is the rc.conf of the jail (I don't think the problem is here but I want to document it just in case)
Code:
dumpdev="NO"
sshd_enable="YES"
syslogd_enable="NO"
sendmail_submit_enable="NO"
cron_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
newsyslog_enable="NO"
motd_enable="NO"
clear_tmp_enable="NO"

Created the Bridge and epair devices as bridge0 and epair0a/epair0b
ifconfig bridge create up
ifconfig epair create

Attach epair0a and vtnet0 to bridge0
ifconfig bridge0 inet 192.168.1.254/24 addm vtnet0 addm epair0a

Set an address for vtnet0 since it still doesn't have one.
ifconfig vtnet0 inet 192.168.1.253/24 up

OK, so far so good. At this point my tiny brain is satisfied things might work...
Code:
#> ifconfig
em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER,NOMAP>
        ether 08:00:27:34:35:10
        inet6 fe80::a00:27ff:fe34:3510%em0 prefixlen 64 scopeid 0x1
        inet6 2601:8c0:d80:b850:a00:27ff:fe34:3510 prefixlen 64 autoconf
        inet 192.168.1.11 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
vtnet0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c00b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE>
        ether 08:00:27:12:b6:92
        inet 192.168.1.253 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:18:02
        inet 192.168.1.254 netmask 0xffffff00 broadcast 192.168.1.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 5 priority 128 path cost 2000
        member: vtnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
epair0a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:9e:56:33:93:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
epair0b: flags=8862<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:9e:56:33:93:0b
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

I create a jail with the following command on a clone of the bsdinstall root fs. It starts up just fine and I can work in it.

Code:
jail -c name=test1-1682048289 \
     path=/var/j/clones/test1-1682048289 \
     vnet \
     vnet.interface=epair0b \
     host.hostname=test1 \
     allow.mount.zfs=1 \
     exec.system_user=root \
     exec.start="/bin/sh /etc/rc" \
     exec.stop="/bin/sh /etc/rc.shutdown" \
     allow.raw_sockets=1 \
     securelevel=9 \
     sysvmsg=new \
     sysvsem=new \
     sysvshm=new \
     mount.devfs \
     devfs_ruleset=4 \
     persist

exec.poststart="/bin/sh -c 'ifconfig epair0b inet 192.168.1.13/24 up'" doesn't seem to work, apparently because of some kind of race condition (it can't seem to find epair0b "post start"), so I removed it. No problem, I'll just do it manually. Not taking any chances, I'm going to jump into the jail and do it there rather than from outside with just jexec.
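A possible explanation, offered as a hedged aside: per jail(8), exec.poststart commands run in the host environment, not inside the jail, and by that point vnet.interface has already moved epair0b into the jail's vnet, so the host can no longer see it. A minimal sketch of a workaround is to wrap the command in jexec. The helper name poststart_cmd is hypothetical; the jail name, interface, and address are the ones used in this post.

```shell
#!/bin/sh
# Build an exec.poststart value that configures the jail-side epair from the host.
# poststart_cmd is a hypothetical helper, not part of jail(8).
poststart_cmd() {
    # $1 = jail name, $2 = interface inside the jail, $3 = address/prefix
    printf 'exec.poststart="jexec %s ifconfig %s inet %s up"\n' "$1" "$2" "$3"
}

# Print the parameter to append to the "jail -c" command line.
poststart_cmd test1-1682048289 epair0b 192.168.1.13/24
```

The printed parameter would replace the removed exec.poststart line in the jail -c command.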

Code:
#> jexec $JID sh
[root@test1]#> ifconfig epair0b inet 192.168.1.13/24 up
[root@test1]#>  ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:9e:56:33:93:0b
        inet 192.168.1.13 netmask 0xffffff00 broadcast 192.168.1.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Looking good I think! Man, I'm some kind of FreeBSD power user. I make this stuff look easy!
Code:
[root@test1]#> ping -c 3 -W 1000 192.168.1.253
PING 192.168.1.253 (192.168.1.253): 56 data bytes

--- 192.168.1.253 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
[root@test1]#> ping -c 3 -W 1000 192.168.1.254
PING 192.168.1.254 (192.168.1.254): 56 data bytes

--- 192.168.1.254 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

Ok. That's not good. I'm no networking genius but I would expect to be able to ping other addresses on the bridge at the very least, right? It's layer 2 from what I understand so something is going really wrong for me. I just don't know what it is. Is there some sysctl for layer 2 stuff I need to set?

Here are my loaded modules
Code:
#> kldstat
Id Refs Address                Size Name
 1   24 0xffffffff80200000  1f3e2d0 kernel
 2    1 0xffffffff8213f000   59dfa8 zfs.ko
 3    1 0xffffffff826de000     a4a0 cryptodev.ko
 4    1 0xffffffff82b18000     3218 intpm.ko
 5    1 0xffffffff82b1c000     2180 smbus.ko
 6    1 0xffffffff82b1f000     7638 if_bridge.ko
 7    1 0xffffffff82b27000     60d8 bridgestp.ko
 8    1 0xffffffff82b2e000     3a64 if_epair.ko

Here are my kernel settings for network bridge related stuff
Code:
#> sysctl -a | grep net.*bridge
net.link.bridge.ipfw: 0
net.link.bridge.allow_llz_overlap: 0
net.link.bridge.inherit_mac: 0
net.link.bridge.log_stp: 0
net.link.bridge.pfil_local_phys: 0
net.link.bridge.pfil_member: 1
net.link.bridge.ipfw_arp: 0
net.link.bridge.pfil_bridge: 1
net.link.bridge.pfil_onlyip: 1
dev.netmap.max_bridges: 8
dev.netmap.bridge_batch: 1024


What am I missing? Can anyone help me with a recipe to reach my desired goal? The jails need to be routable to the 192.168.1.0/24 network.

Thanks for any help!
 
Set an address for vtnet0 since it still doesn't have one.
Why do you think it needs one? The bridge doesn't need an IP address either. So remove all those.

Your admin interface (em0) and your jails (tied to vtnet0 via bridge0) have IP addresses in the same range. That's going to cause routing issues.
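Removing those addresses could look roughly like the sketch below. It assumes the addresses shown in the first post (192.168.1.254 on bridge0, 192.168.1.253 on vtnet0). The run helper and the DRY_RUN variable are illustrative conveniences, so by default the script only prints what it would do.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
# Set DRY_RUN=0 and run as root on the FreeBSD host to apply.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

strip_addresses() {
    # Drop the IPv4 addresses; a bridge and its members forward frames without them.
    run ifconfig bridge0 inet 192.168.1.254 -alias
    run ifconfig vtnet0 inet 192.168.1.253 -alias
    # The member interface still has to be up to pass traffic.
    run ifconfig vtnet0 up
}

strip_addresses
```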
 
Thanks for the help. I knew my mistakes would probably jump out to an expert eye.

I should have added that part of the goal is to have the jails serve ssh to 192.168.1.0/24. It's a flat network. I wanted to just use the bridge device to connect the jails as if each were another host on the network with an address in that network. (So they just look like any other machine on the network.)
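To make that intent concrete for more than one jail, here is a rough sketch: one epair per jail, with every "a" side plugged into the same bridge and every "b" side addressed inside its jail. The jail names (web1, web2), their addresses, and the prep_epair/config_jail helpers are hypothetical; the gateway 192.168.1.1 is the LAN router. It only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

BRIDGE=bridge0
GATEWAY=192.168.1.1   # LAN gateway

# Before "jail -c": create epair<N> and plug its "a" side into the bridge.
prep_epair() {   # $1 = epair unit number
    run ifconfig "epair$1" create
    run ifconfig "$BRIDGE" addm "epair${1}a"
    run ifconfig "epair${1}a" up
}

# After "jail -c ... vnet vnet.interface=epair<N>b": address the "b" side inside the jail.
config_jail() {  # $1 = jail name, $2 = epair unit, $3 = address/prefix
    run jexec "$1" ifconfig "epair${2}b" inet "$3" up
    run jexec "$1" route add default "$GATEWAY"
}

# Hypothetical jail names and addresses, one epair per jail.
n=0
for spec in web1:192.168.1.21/24 web2:192.168.1.22/24; do
    prep_epair "$n"
    echo "# (create jail ${spec%%:*} here with vnet.interface=epair${n}b)"
    config_jail "${spec%%:*}" "$n" "${spec#*:}"
    n=$((n + 1))
done
```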

I made adjustments based on your input.

Removed the IP addresses from bridge0. Rather than adding vtnet0 to bridge0, I bridged em0 and epair0a to it.
I did adjust some sysctls based on the advice on the iocage docs site. Other than that, the main thing I changed was using addresses for the epair that are in a different subnet from em0. I also manually added a route to the 192.168.1.0/24 network in the jail.

Code:
# sysctl settings recommended by iocage for vnet (run after the bridge module is loaded)
sysctl net.inet.ip.forwarding=1       # Enable IP forwarding between interfaces
sysctl net.link.bridge.pfil_onlyip=0  # Don't restrict bridged traffic to IP packets when pfil is enabled
sysctl net.link.bridge.pfil_bridge=0  # Disable packet filtering on the bridge interface
sysctl net.link.bridge.pfil_member=0  # Disable packet filtering on the member interfaces

Network settings now
Code:
EPAIRA_IP="192.168.2.1/24"
EPAIRB_IP="192.168.2.2/24"

jexec $JID route add -net 192.168.1.0/24 192.168.2.1

I was able to ping across the epair and the host's IP (192.168.1.11), but I can't get out to any other machines or the gateway on 192.168.1.0/24. Adding a default gateway of 192.168.1.1 or 192.168.1.11 didn't seem to make a difference. The host already seems to have a route for 192.168.2.0/24, so route add -net 192.168.2.0/24 192.168.2.2 wasn't needed.

It's OK progress but I haven't reached my goal yet. I have hit the limit of my networking knowledge again and need more hints to proceed further.

Here is the host config in the current (more working) setup.
Code:
#> ifconfig
em0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4810099<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER,NOMAP>
        ether 08:00:27:34:35:10
        inet6 fe80::a00:27ff:fe34:3510%em0 prefixlen 64 scopeid 0x1
        inet6 2601:8c0:d80:b850:a00:27ff:fe34:3510 prefixlen 64 autoconf
        inet 192.168.1.11 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
vtnet0: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
        ether 08:00:27:12:b6:92
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:18:02
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 5 priority 128 path cost 2000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 20000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
epair0a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:bc:35:c2:29:0a
        inet 192.168.2.1 netmask 0xffffff00 broadcast 192.168.2.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

#> netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.1.1        UGS         em0
127.0.0.1          link#3             UH          lo0
192.168.1.0/24     link#1             U           em0
192.168.1.11       link#1             UHS         lo0
192.168.2.0/24     link#5             U       epair0a
192.168.2.1        link#5             UHS         lo0

Here are the jail network settings.

Code:
# jexec $JID ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:bc:35:c2:29:0b
        inet 192.168.2.2 netmask 0xffffff00 broadcast 192.168.2.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

#> jexec $JID netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.1.1        UGS     epair0b
127.0.0.1          link#1             UH          lo0
192.168.1.0/24     192.168.2.12       UGS     epair0b
192.168.2.0/24     link#2             U       epair0b
192.168.2.13       link#2             UHS         lo0


#> jexec $JID netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.1.1        UGS     epair0b
default            192.168.1.11       UGS     epair0b
127.0.0.1          link#1             UH          lo0
192.168.1.0/24     192.168.2.1        UGS     epair0b
192.168.2.0/24     link#2             U       epair0b
192.168.2.2        link#2             UHS         lo0

Thank you for the help so far. I'm making progress. I wish I understood the behavior of the bridge and epair devices a bit better. The bridge doesn't work quite the way my mental map says it should: I think of it as a switch, but that's not how it seems to behave for me. The epair driver doesn't work like a "VPN" or a hub, so I don't even know how to relate to it. Can it send traffic only on layer 2? If so, how do I get that behavior out of it? If anyone has pointers to good articles explaining them, I'd greatly appreciate the links.
 
I should have added that part of the goal is to have the jails serve ssh to 192.168.1.0/24. It's a flat network.
If it is a flat network, not a routed network, then you may want to set up the network like this:

em0 --- bridge0(192.168.1.11/24) --- epair0a --- epair0b(192.168.1.x/24)
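As a rough sketch, the diagram could translate into the commands below. It assumes bridge0 and the epair0 pair already exist and em0 is not yet a bridge member, reuses the jail name from the first post, and picks 192.168.1.12/24 as an illustrative free address for the jail. The run helper and DRY_RUN are conveniences so the script only prints commands by default.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
# Set DRY_RUN=0 and run as root on the FreeBSD host to apply.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

JAIL=test1-1682048289     # jail name from the first post
JAIL_IP=192.168.1.12/24   # illustrative free LAN address (assumption)
GATEWAY=192.168.1.1       # default gateway from the host's routing table

flat_setup() {
    # Bridge the NIC and the host side of the epair; neither member needs an address.
    run ifconfig bridge0 addm em0 addm epair0a up
    run ifconfig epair0a up
    # Move the host address from em0 to bridge0 (this drops sessions using em0).
    run ifconfig em0 inet 192.168.1.11 -alias
    run ifconfig bridge0 inet 192.168.1.11/24
    # Removing the address also removes routes through it, so restore the default route.
    run route add default "$GATEWAY"
    # Jail side: an address in the same subnet and a default route to the LAN gateway.
    run jexec "$JAIL" ifconfig epair0b inet "$JAIL_IP" up
    run jexec "$JAIL" route add default "$GATEWAY"
}

flat_setup
```

Because the host address moves between interfaces, this is best run from the console rather than over an ssh session on em0.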
 
I think I'm getting closer to a solution.
If it is a flat network, not a routed network, then you may want to set up the network like this:

em0 --- bridge0(192.168.1.11/24) --- epair0a --- epair0b(192.168.1.x/24)

This seemed to partially work. I mean, it worked about as well as anything I have managed so far.
What I was seeing was that I could ping any of the local addresses. The local host could ping the gateway, and I could ping either side of the epair from the other. I still could not, however, reach the 192.168.1.0/24 network from inside the jail (the b side of the epair).

I tried doing your example exactly, and also one that works more like my mental map of the bridge device, by putting the host address 192.168.1.11/24 on em0. It worked just as well as your example. Is there any reason one way is better than the other?

So, the problem doesn't seem to be local layer 2 traffic. And it seems that layer 2 mostly works internally on the host and jail. But based on the symptoms I was seeing, I started to believe that perhaps there was some kind of layer 2 problem between the epair device and the bridge or the rest of the network.

I dumped the arp table to look for clues.
Code:
#> arp -na
? (192.168.1.11) at 08:00:27:34:35:10 on em0 permanent [ethernet]
? (192.168.1.12) at 02:bc:35:c2:29:0b on em0 expires in 1117 seconds [ethernet]
? (192.168.1.1) at 6c:cd:d6:7e:9f:16 on em0 expires in 1185 seconds [ethernet]
? (192.168.1.7) at d8:5e:d3:e2:1c:5c on em0 expires in 1165 seconds [ethernet]

Interesting. There is an entry for the epair device (the b side on 192.168.1.12) but the bridge still doesn't send it across when I try to ping it from the network.

How do I get the epair devices to behave? em0 works as you would expect it to, even bridged the same way. Is there some tunable I need to adjust?
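A few layer 2 checks that might narrow this down, sketched with the interface names from this thread. The run helper and DRY_RUN are illustrative, so the script only prints the commands by default; the tcpdump captures stop after 10 packets.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
# Set DRY_RUN=0 and run as root on the FreeBSD host to actually capture.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

l2_checks() {
    # MAC addresses the bridge has learned, listed per member interface.
    run ifconfig bridge0 addr
    # Do ARP requests and replies for the jail's address show up on the wire side?
    run tcpdump -n -e -c 10 -i em0 arp
    # And do they show up on the host side of the epair?
    run tcpdump -n -e -c 10 -i epair0a arp
}

l2_checks
```

If ARP requests from the jail leave on em0 but no replies ever come back, the frames are probably being dropped outside the FreeBSD guest rather than by the bridge or the epair.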

I've learned a lot about how this works just working through this so thanks for the help so far everyone.
 
The Configuration details
The host is a VM with two network interfaces exposed. (I did that to keep from locking myself out of the machine every time I mess up.) The network itself is a bridged network on the VM host.
That is (was) the problem. You have to use a physical network adapter and configure it in the VM (?).
I went through the same suck, then tried the settings on a real machine and everything worked like a charm :D
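One hedged guess at why a VM behaves this way: the 08:00:27 MAC prefix on em0 and vtnet0 is the one VirtualBox assigns, and VirtualBox bridged adapters default to a promiscuous-mode policy of "Deny", which would drop frames addressed to the jail's epair MAC. The thread does not confirm the hypervisor, so treat this purely as an assumption. The VM name "freebsd-jails", the adapter number, and the vbox_promisc_cmd helper are hypothetical.

```shell
#!/bin/sh
# Print the VirtualBox command that allows promiscuous mode on one VM adapter.
# Assumes VirtualBox is the hypervisor; run the printed command on the VM host
# while the guest is powered off.
vbox_promisc_cmd() {
    # $1 = VM name, $2 = adapter number (1-based)
    printf 'VBoxManage modifyvm "%s" --nicpromisc%s allow-all\n' "$1" "$2"
}

# Hypothetical VM name and adapter number.
vbox_promisc_cmd freebsd-jails 1
```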
 