jail.conf: jails fail to start at boot (question, 2 possible bugs)

So I've got a FreeBSD 10.3 host with three jails that use nullfs to mount a common read-only base system. I've discovered that on reboot only one or two of the three will start, and I cannot predict which ones. The remaining jail (or jails) fail to start. Nothing is logged anywhere on the host, nor in the jails' individual console log files, so to debug this I had to hack some logging into the /etc/rc.d/jail script.

/etc/jail.conf:
Code:
jail1 {
  host.hostname  = "jail1.example.org";
  path  = "/usr/local/jail/jail1";
  ip4.addr  = 127.0.0.11;
  mount  = "/usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail1/basejail nullfs ro 0 0";
  exec.consolelog = "/var/log/jail_${host.hostname}.log";
  exec.prestart = "/bin/sh -c 'echo PRESTART_${host.hostname} >> /tmp/DEBUG'";
  exec.poststart = "/bin/sh -c 'echo POSTSTART_${host.hostname} >> /tmp/DEBUG'";
}

jail2 {
  host.hostname  = "jail2.example.org";
  path  = "/usr/local/jail/jail2";
  ip4.addr  = 127.0.0.12;
  mount  = "/usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail2/basejail nullfs ro 0 0";
  exec.consolelog = "/var/log/jail_${host.hostname}.log";
  exec.prestart = "/bin/sh -c 'echo PRESTART_${host.hostname} >> /tmp/DEBUG'";
  exec.poststart = "/bin/sh -c 'echo POSTSTART_${host.hostname} >> /tmp/DEBUG'";
}

jail3 {
  host.hostname  = "jail3.example.org";
  path  = "/usr/local/jail/jail3";
  ip4.addr  = 127.0.0.13;
  mount  = "/usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail3/basejail nullfs ro 0 0";
  exec.consolelog = "/var/log/jail_${host.hostname}.log";
  exec.prestart = "/bin/sh -c 'echo PRESTART_${host.hostname} >> /tmp/DEBUG'";
  exec.poststart = "/bin/sh -c 'echo POSTSTART_${host.hostname} >> /tmp/DEBUG'";
}

In /etc/rc.d/jail, I added these lines to the _ALL branch of the case statement in jail_start(), to capture the contents of the $_tmp file on error/failure:

Code:
echo "DEBUG: Contents of '$_tmp' are:" >> /tmp/DEBUG
cat "$_tmp" >> /tmp/DEBUG
echo "DEBUG: END OF '$_tmp' CONTENTS" >> /tmp/DEBUG

So I reboot the system. This time only a single jail starts, the first one. (The first nearly always starts; it's usually the second that fails, but sometimes it's the third, or both the second and third.) I look at the contents of /tmp/DEBUG and see:

Code:
PRESTART_jail1
POSTSTART_jail1
DEBUG: Contents of '/tmp/jail.hyLntGie' are:
mount_nullfs: /usr/local/jail/jail2/basejail: Operation not supported by device
mount_nullfs: /usr/local/jail/jail3/basejail: Operation not supported by device
jail: jail2: /sbin/mount -t nullfs -o ro /usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail2/basejail: failed
jail: jail3: /sbin/mount -t nullfs -o ro /usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail3/basejail: failed
jail1: created
DEBUG: jail_start(): END OF '/tmp/jail.hyLntGie' CONTENTS

Workaround discovered: My next step was to add a "depend = jail1;" line to jail2 and a "depend = jail2;" line to jail3, so that the jails start one after another. That fixes the problem.
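
For reference, the workaround in /etc/jail.conf amounts to the two added lines below. Only the depend lines are new; the placeholder comments stand for the parameters already shown in the file above.

```
jail2 {
  # ... existing jail2 parameters unchanged ...
  depend = jail1;   # do not start jail2 until jail1 has been created
}

jail3 {
  # ... existing jail3 parameters unchanged ...
  depend = jail2;   # do not start jail3 until jail2 has been created
}
```

This chains the startup as jail1, then jail2, then jail3, so no two of the nullfs mounts are attempted at the same time.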

So the question is:
Why does jail fail with mount_nullfs errors unless I manually set dependencies that force the jails to launch sequentially?

My conclusion:
This looks like a race between jails being started in parallel when they all mount a common read-only filesystem via nullfs(5).

BUG #1: Race condition on nullfs(5) mount of commonly shared read-only filesystems by jails

BUG #2: NO logging of the failure anywhere! (I had to invent my own.)

What is the correct permanent fix? I can use my workaround, but I hate workarounds when something should just work.


Thanks!

Aaron out.
 
Does it work if you use per-jail fstab files (/etc/fstab.jail1.example.org, etc.) instead of the mount parameter in jail.conf?
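
A rough sketch of what that would look like for jail1, untested on this setup. It assumes the mount.fstab parameter described in jail(8), and the fstab file would hold the same line that is currently in the mount parameter:

```
# /etc/fstab.jail1.example.org would contain the one line from the old mount parameter:
#   /usr/local/jail/basejail_2016_04_19 /usr/local/jail/jail1/basejail nullfs ro 0 0

jail1 {
  # ... other jail1 parameters unchanged, with the inline "mount = ..." line removed ...
  mount.fstab = "/etc/fstab.jail1.example.org";   # per-jail fstab instead of inline mount
}
```

jail2 and jail3 would get the same change with their own fstab files. Whether this avoids the startup failure is exactly what the question above is asking, so treat it as an experiment rather than a known fix.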
 