Solved Reuse Bhyve's public switch with jails

Hello,

I'm running a FreeBSD 12.1-RELEASE server that hosts a few bhyve VMs. The network config looks like this:

Code:
ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,...>
    ether ...
    inet MY.PUBLIC.IP.ADDR netmask 0xffffffc0 broadcast MY.BRD.CST.ADDR
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
../..
vm-public: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether ...
    inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
    id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
    maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
    root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
    member: tap2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
            ifmaxaddr 0 port 8 priority 128 path cost 2000000
    member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
            ifmaxaddr 0 port 7 priority 128 path cost 2000000
    member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
            ifmaxaddr 0 port 6 priority 128 path cost 2000000
    groups: bridge vm-switch viid-4c918@
    nd6 options=1<PERFORMNUD>
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: vmnet-my-vm-0-public
    options=80000<LINKSTATE>
    ether 00:bd:0e:26:01:00
    inet6 fe80::2bd:eff:fe26:100%tap0 prefixlen 64 tentative scopeid 0x6
    groups: tap vm-port
    media: Ethernet autoselect
    status: active
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    Opened by PID 9520
../..

It works great.

Now I would like to add a few jails for cases where a full VM is not necessary, and I would like to hook them up to my vm-switch. I can't find a proper how-to or recipe for doing so.

Any pointer/help appreciated!
 
vm-public is just a bridge, so you can either

1. add the interface on which your jail will hold and use its IP address to the bridge vm-public

2. use vnet(9) and epair(4) virtual interfaces: place epair0a in the jail and add epair0b to the bridge vm-public.
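
A rough sketch of what option 2 could look like when done by hand. The jail name, its path and the address 192.168.0.60 are made up for illustration, and it assumes epair0 is not in use yet. By default it only prints the commands so nothing changes on a remote box until you ask for it:

```shell
#!/bin/sh
# Sketch only: attach a VNET jail to the vm-public bridge through an epair(4).
# Hypothetical values: jail name and path, 192.168.0.60 as the jail address.
# Commands are only printed unless the script is run with DRY_RUN=0 (as root).
DRY_RUN="${DRY_RUN:-1}"
BRIDGE="vm-public"
JAIL="jail2"
JAIL_IP="192.168.0.60/24"
GATEWAY="192.168.0.1"        # address of vm-public on the host

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

attach_vnet_jail() {
    run ifconfig epair0 create              # creates epair0a and epair0b
    run ifconfig "$BRIDGE" addm epair0b     # host end becomes a bridge member
    run ifconfig epair0b up
    # epair0a is moved into the jail's own network stack at creation time
    run jail -c name="$JAIL" path="/usr/j/$JAIL" vnet vnet.interface=epair0a persist
    run jexec "$JAIL" ifconfig epair0a inet "$JAIL_IP" up
    run jexec "$JAIL" route add default "$GATEWAY"
}

attach_vnet_jail
```

The jail path has to contain a populated userland for the jexec lines to work. The same steps can later be moved into /etc/jail.conf once they behave as expected.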
 
Thanks sol289. I would be happy to do 1/ but I don't know how. I'm not using ezjail or iocage, I just want to keep it simple with vi and /etc/jail.conf. I think I've found a how-to for 2/ but it does seem a little more complex, I'm not sure…
 
Use this page as a cheat sheet for jails (I do it myself from time to time).

For example, here is a simple config for thin jails built from a template. There is a root template in /usr/j/template, which is mounted read-only at /usr/j/jail1, and a thin jail directory /usr/j/jail1_s, which contains /etc/, /var/ and /usr/local/ and is mounted at /usr/j/jail1/s. The template is prepared so that /etc/, /var/ and /usr/local/ are symlinks to /s/... dirs.

/etc/jail.conf:
Code:
#
exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
mount.devfs;
allow.raw_sockets;
path = "/usr/j/$name";
host.hostname = $name.your.domain.com;
mount.fstab = "/usr/j/${name}_s/fstab";

jail1 {
    ip4.addr = "bce0|192.168.245.18/24";
    ip4.addr += "lo0|127.0.0.2/32";
}

/usr/j/jail1_s/fstab:
Code:
# Device    Mountpoint  FStype  Options Dump    Pass#
/usr/j/template /usr/j/jail1     nullfs  ro  0   0
/usr/j/jail1_s   /usr/j/jail1/s   nullfs  rw  0   0


Code:
user@server:/usr/j/jail1_s % ls -l
total 13
drwxr-xr-x  23 root  wheel  106  6 Jan  2018 etc
-rw-r--r--   1 root  wheel  153 20 Mar  2020 fstab
drwxr-xr-x   2 root  wheel    2 13 Jun  2017 home
drwxr-xr-x   6 root  wheel   14  4 Sep 13:15 root
drwxr-xr-t   7 root  wheel    7 27 Oct 03:01 tmp
drwxr-xr-x  12 root  wheel   12 13 Jun  2017 usr-local
drwxr-xr-x  24 root  wheel   24 13 Aug 07:25 var
user@server:/usr/j/jail1_s % cd ../template/
user@server:/usr/j/template % ls -l
total 31
-r--r--r--   1 root  wheel  6197 25 Mar  2016 COPYRIGHT
drwxr-xr-x   2 root  wheel    47 25 Mar  2016 bin
drwxr-xr-x   8 root  wheel    50 25 Mar  2016 boot
dr-xr-xr-x   2 root  wheel     2 25 Mar  2016 dev
lrwxr-xr-x   1 root  wheel     6 13 Jun  2017 etc -> /s/etc
lrwxr-xr-x   1 root  wheel     7 13 Jun  2017 home -> /s/home
drwxr-xr-x   3 root  wheel    52 25 Mar  2016 lib
drwxr-xr-x   3 root  wheel     5 13 Jun  2017 libexec
drwxr-xr-x   2 root  wheel     2 25 Mar  2016 media
drwxr-xr-x   2 root  wheel     2 25 Mar  2016 mnt
dr-xr-xr-x   2 root  wheel     2 25 Mar  2016 proc
drwxr-xr-x   2 root  wheel   146 25 Mar  2016 rescue
lrwxr-xr-x   1 root  wheel     7 13 Jun  2017 root -> /s/root
drwxr-xr-x   2 root  wheel     2 13 Jun  2017 s
drwxr-xr-x   2 root  wheel   132 25 Mar  2016 sbin
lrwxr-xr-x   1 root  wheel    11 25 Mar  2016 sys -> usr/src/sys
lrwxr-xr-x   1 root  wheel     6 13 Jun  2017 tmp -> /s/tmp
drwxr-xr-x  14 root  wheel    15 12 May  2018 usr
lrwxr-xr-x   1 root  wheel     6 13 Jun  2017 var -> /s/var
user@server:/usr/j/template %

And then all you have to do is add the bce0 interface to your bridge as a member:
Code:
user@server:~ # ifconfig vm-public addm bce0

I hope this will help you.
 
Thanks a lot.
How do you choose bce0 as your interface? Does it have to be the same as your host's public interface?
 
Well, it's up to you: pick the interface on which you want your jail's IP address to live. If, for example, your server runs in a private network, its main interface is bce0 with address 192.168.1.100, and you want to run a jail in the same network with address 192.168.1.200, then in jail.conf for that specific jail you add:
Code:
ip4.addr = "bce0|192.168.1.200/24";

Then, when your jail comes up, 192.168.1.200 is added to bce0 as an alias IP address. You can give it a /32 netmask if the server's main interface already has an address from the larger /24 network, because the host, which already holds 192.168.1.100 on that interface, will answer ARP requests of the form "who has 192.168.1.200".

If you are planning to run a jail that serves some public service and you don't have a public IP for it (because your server has only one public IP address), then use NAT for it. There are plenty of "Jail + NAT" HOWTOs around (on this forum too).
 
Thanks. I'm using NAT for bhyve (and for jail also, then), no particular difficulties on this side. Just to make sure I'm not messing with my network setup 400km away from the server I've used interface ixl1 of the host for the jail (ixl0 is my public interface, ixl1 is not used). Jail's up and running, can ping outside world, can fetch pkgs. So everything looks good.
Now I have a question about automation: how do I make sure the jail is started at boot time and its interface is added to the bridge (the bridge is created by bhyve)?
 
How do you start bhyve? I'm using sysutils/vm-bhyve for bhyve management, and on one of our servers I'm using manually created bridges rather than a "switch" created by vm switch. So when the system comes up, the bridges are created, bhyve uses them as its "switches", and jails can use them too. I'll show how:

First, create bridges on start:
/etc/rc.conf
Code:
cloned_interfaces="vlan20 vlan997 bridge997 bridge20"
ifconfig_vlan20="vlan 20 vlandev igb2 up"
ifconfig_vlan997="vlan 997 vlandev igb2 up"
ifconfig_bridge997="addm vlan997 up"
ifconfig_bridge20="addm vlan20 up"

Then, with the vm switch create command, add each bridge interface as a switch for bhyve VMs:
Code:
# vm switch create -t manual -b bridge997 vlan997
# vm switch create -t manual -b bridge20 vlan20

And here you go:
Code:
# vm switch list
NAME     TYPE    IFACE      ADDRESS  PRIVATE  MTU  VLAN  PORTS
vlan997  manual  bridge997  n/a      no       n/a  n/a   n/a
vlan20   manual  bridge20   n/a      no       n/a  n/a   n/a

Your jails hook up to an interface that is a member of the bridge (the member interface can be created at boot, or added to the bridge as a member when the jail starts), and your VMs hook up to the same bridge through their tap interfaces.
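
The "member at boot" variant might look roughly like this in /etc/rc.conf. It is only a sketch: igb3 is a hypothetical spare NIC standing in for whatever interface the jail's ip4.addr line names, and jail1 is the jail from the earlier example.

```shell
# /etc/rc.conf fragment (sketch, not a tested setup).
# igb3 is a hypothetical interface that carries the jail's address; it is
# added to the bridge at boot, next to the VLAN interface.
cloned_interfaces="vlan20 bridge20"
ifconfig_vlan20="vlan 20 vlandev igb2 up"
ifconfig_igb3="up"
ifconfig_bridge20="addm vlan20 addm igb3 up"

# Start the jails defined in /etc/jail.conf at boot.
jail_enable="YES"
jail_list="jail1"
```

Since the bridge and its members come from rc.conf, they exist before jails and VMs are started, so neither has to wait for the other.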
 
Thanks a lot. I'll have to look deeper into this.
I'm using sysutils/vm-bhyve to manage/start my VMs with a standard switch. I feel like this "automated" setup is really nice but will cause much scheduling trouble: not so sure about the availability of the bhyve switch when jails start, etc.
A more static approach like yours is probably more foolproof.
 
Ok, I've rebooted the server, and I do have a scheduling problem:
after reboot, all VMs and the jail are up, and ifconfig displays a correct setup, but VMs and Jail are unreachable and can't resolve domain names. I have to reload pf rules and restart named to get it all working again.
I believe a static bridge manually created in rc.conf could resolve this problem, but I'll have to test extensively. For now a @reboot crontab will be my workaround.
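
For the record, the @reboot workaround could be sketched like this. The script path and the 120 second limit are arbitrary choices of mine, not something vm-bhyve prescribes. It waits for vm-public to show up, then does the same two manual steps (reload pf, restart named). By default it only prints what it would do:

```shell
#!/bin/sh
# Hypothetical helper, e.g. saved as /usr/local/sbin/net-fixup.sh and called
# from root's crontab with:  @reboot /usr/local/sbin/net-fixup.sh
# Sketch only: commands are printed unless the script runs with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
BRIDGE="vm-public"      # switch created by vm-bhyve
MAX_WAIT=120            # seconds, arbitrary upper bound

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Poll once per second until the bridge exists; fail after MAX_WAIT seconds.
wait_for_bridge() {
    [ "$DRY_RUN" = "1" ] && return 0
    waited=0
    until ifconfig "$BRIDGE" >/dev/null 2>&1; do
        [ "$waited" -ge "$MAX_WAIT" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

fixup() {
    wait_for_bridge || return 1
    run pfctl -f /etc/pf.conf     # reload the rules now that the bridge exists
    run service named restart
}

fixup
```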
 
As for pf, I'll paste only beginning of file:
Code:
# Macros: define common values, so they can be referenced and changed easily.
ext_if="ixl0"    # replace with actual external interface name i.e., dc0
loc_if="lo0"    # localhost
set limit table-entries 1000000

set skip on vm-public

# NAT for bhyve
nat on $ext_if from {192.168.0.0/24} to any -> ($ext_if)

# rdr from outside to VMs
rdr inet proto { tcp, udp} from any to any port XX -> 192.168.0.50 port YY
rdr inet proto { tcp, udp} from any to any port ZZ -> 192.168.0.50 port ZZ
../..

Everything below those lines is pass/block rules.
I guess my problem is that vm-public does not exist yet when PF starts at boot time. A bridge defined in rc.conf would probably exist before PF starts.

if/Jail/VM related rc.conf directives:
Code:
$ grep -iE '^jail|^vm|^if' /etc/rc.conf
ifconfig_ixl0="inet MY.IP.ADD.RESS  netmask 255.255.255.192"
vm_dir="zfs:sas/vm"
vm_enable="YES"
vm_list="centos7-01 ubuntu-01 ubuntu-02"
vm_delay="20"
jail_enable="YES"
 
set skip on vm-public
I think this is where PF stops, because no such interface exists at boot time, before vm switch creates it.

Create manual bridges as switches for vm-bhyve, and I think you will be fine. And don't forget to change the network switch name in each VM's .conf file.
 
Almost a year later I come back to say I've had some time this weekend and tried, and so far it seems ok. I've created a bridge interface in rc.conf with no members, with the gateway IP for my VMs:
Code:
cloned_interfaces="bridge0"
ifconfig_bridge0="inet 192.168.0.1 netmask 255.255.255.0 up"

I then created a static switch for vm-bhyve and set this new switch in the VMs' config files.
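
In case it helps someone, those last two steps would look roughly like this. "public" is just a placeholder switch name, and the create command follows the manual-switch syntax shown earlier in the thread:

```shell
# One-time command on the host ("public" is a placeholder switch name):
#   vm switch create -t manual -b bridge0 public
#
# Then in each guest's .conf file (it can be opened with: vm configure <guest>)
# point the first network device at that switch:
network0_type="virtio-net"
network0_switch="public"
```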

Now reboots are much easier to handle, with no more manual intervention to set things straight. I don't use any jails currently, so the test is not complete, but I'm pretty confident this will work.

Thanks
 
I see you have ixl interfaces.
Those are 10G, and they give you an additional option:
SR-IOV and its control utility:

These create virtual interfaces that you can use with VMs or jails. These will be faster than any bridged interface.
 
Thanks, that's good to know, but my real bandwidth is capped anyway, so I don't need a faster option.
 