Solved Child jails in hierarchical jail not starting on parent jail startup

Hi,

I'm trying to set up some hierarchical vnet jails but am having problems getting the child jails to launch automatically after the parent jail is started. I've set jail_list in the parent jail's rc.conf, so I would expect the child jails to launch at startup, but when I check the parent jail after it has launched, none of the child jails are running.
I can manually launch the child jails from the parent using service jail start without any issues. I also tried starting the child jails by adding exec.start += "/usr/sbin/service jail start"; into the jail.conf for the parent jail and while that worked fine in automatically launching all child jails at startup it feels a bit 'clunky' to me. Any ideas on what might be going on here?
The jail.conf used to launch the parent jail and the rc.conf used by the parent jail are included below. The host and all the jails are running FreeBSD 13.0-RELEASE-p3.

jail.conf to launch parent jail
Code:
exec.timeout = 90;
stop.timeout = 30;

path = "/jails/${name}";

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown jail";

exec.consolelog = "/var/log/jails/jail_${name}.log";
exec.clean;

mount.devfs;

parent {
  exec.prestart = "/jails/scripts/jib addm ${name} some_interface";
  exec.poststop = "/jails/scripts/jib destroy ${name}";

  vnet;
  vnet.interface = e0b_${name};

  devfs_ruleset=32;

  allow.mount;
  allow.mount.devfs;
  allow.mount.procfs;
  allow.mlock;
  enforce_statfs = 1;
  children.max = 5;
}


rc.conf used in parent jail

Code:
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
syslogd_flags="-ss"
sendmail_enable="NONE"
clear_tmp_enable="YES"
cron_flags="$cron_flags -J 15"

jail_enable="YES"
jail_list="child1 child2 child3" # Explicitly setting jail_list="" as per the man page didn't launch them either
 
I also tried starting the child jails by adding exec.start += "/usr/sbin/service jail start"; into the jail.conf for the parent jail and while that worked fine in automatically launching all child jails at startup it feels a bit 'clunky' to me.
Agree. Unfortunately (fortunately? It does work...), I went with a similar solution.
Code:
exec.start = "service jail start ; /bin/sh /etc/rc";
 
In case someone else comes across this thread with the same problem: I haven't been able to figure out why the rc.conf settings are being ignored, so I'm just using the workaround mentioned in my original question, which is to start the child jails from the parent jail by adding exec.start += "/usr/sbin/service jail start"; to the jail.conf for the parent.

One caveat with this workaround, though, is that the parent jail won't terminate properly and will stay in the "dying" state (can be checked with jls -d) unless the running child jails are explicitly stopped before the parent jail stops, for example by adding something like exec.stop += "/usr/sbin/service jail stop"; to the jail.conf for the parent jail.
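
Putting the two pieces together, the parent entry would look roughly like the sketch below. It only combines the start and stop lines mentioned in this thread with the parent block from the first post; the comments are added for illustration, and the ordering relative to /etc/rc and /etc/rc.shutdown is what I would expect from appending with +=, not something verified separately.

Code:
parent {
  # ... existing parent settings from the first post stay as they are ...

  # Appended to the global exec.start, so it should run after "/bin/sh /etc/rc"
  exec.start += "/usr/sbin/service jail start";

  # Appended to the global exec.stop; stops the child jails so the parent
  # does not get stuck in the "dying" state
  exec.stop += "/usr/sbin/service jail stop";
}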

In practice I don't think having a bunch of jails stuck in the dying state is a huge deal (someone correct me if I'm wrong), but it can bite you in the butt if you expect the parent jail to release any network interfaces it was assigned. If the jail gets stuck in the dying state, the network interfaces won't necessarily be released back to the host OS. This isn't a problem with interfaces like an epair that are dynamically created and torn down at jail start/stop, for example by the jib script. Interfaces that are not dynamically created, such as physical NICs, do pose a problem though.
If the NIC is not returned to the host OS, the host can no longer 'see' the NIC that is supposed to be assigned to the jail, so the jail will fail to start again. It also means that the parent jail can never be restarted using something like service jail restart. The only easy way I've found to get the 'lost' interface(s) back is a full reboot of the host.

Just thought I'd post this here as it took me a good half hour trying to figure out what the hell was going on when I switched from using an epair to assigning a physical NIC to the VNET parent jail.
 