jails infrastructure as code and orchestration tools: ansible vs iocage vs (appjail + director + overlord)

Ahhh, that's so simple. Ok, I'll try that.
This is starting to touch on why I choose to do things the way I do them (e.g. my jail script vs others) and I'm interested in how you're going to tackle these types of things. Tell us how it goes (obviously, UID--for example--is easy, but I want to see how you're going to tackle these types of "issues" in an automated setup). Very cool to watch! Keep going.
 
atax1a

I'm still having trouble with this. I have created a user named backup in both my HOST and JAIL, and the UID is 2001 in both cases. The jail's root user can see the host files, but the backup user cannot.

Code:
# 1. host user
root@fbsdhost4:/home/toddg # grep backup /etc/passwd
backup:*:2001:2001:User &:/home/backup:/bin/sh

# 2. jail user
root@fbsdhost4:/home/toddg # jexec -l backitup grep backup /etc/passwd
backup:*:2001:2001:User &:/home/backup:/bin/sh

# 3. host datasets / filesystems are chowned to backup:backup
root@fbsdhost4:/home/toddg # ls -lsat /opt
total 21
17 drwxr-xr-x  23 root   wheel  28 Feb 17 21:23 ..
 1 drwxr-x---   4 backup backup  4 Feb 14 12:41 prod
 1 drwxr-x---   5 backup backup  5 Feb 14 12:41 .
 1 drwxr-x---   4 backup backup  4 Feb 14 12:41 stage
 1 drwxr-x---   4 backup backup  4 Feb 14 12:41 dev
root@fbsdhost4:/home/toddg # ls -lsat /opt/dev
total 4
1 drwxr-x---  5 backup backup 5 Feb 14 12:41 ..
1 drwxr-x---  4 backup backup 4 Feb 14 12:41 .
1 drwxr-x---  2 backup backup 2 Feb 14 12:41 postgres
1 drwxr-x---  2 backup backup 2 Feb 14 12:41 feeds
root@fbsdhost4:/home/toddg # ls -lsat /opt/dev/feeds/
total 2
1 drwxr-x---  4 backup backup 4 Feb 14 12:41 ..
1 drwxr-x---  2 backup backup 2 Feb 14 12:41 .
root@fbsdhost4:/home/toddg # ls -lsat /opt/dev/postgres/
total 2
1 drwxr-x---  4 backup backup 4 Feb 14 12:41 ..
1 drwxr-x---  2 backup backup 2 Feb 14 12:41 .

# 4. root user inside jail can see the files 
root@fbsdhost4:/home/toddg # jexec -l backitup find /opt
/opt
/opt/prod
/opt/prod/postgres
/opt/prod/feeds
/opt/dev
/opt/dev/postgres
/opt/dev/feeds
/opt/stage
/opt/stage/postgres
/opt/stage/feeds
/opt/stage/feeds/fakefeed.txt

# 5. the backup user inside the jail cannot see the files
root@fbsdhost4:/home/toddg # jexec -l -u backup backitup find /opt
/opt
find: /opt/prod: Permission denied
find: /opt/dev: Permission denied
find: /opt/stage: Permission denied
find: /opt: Permission denied

# 6. trying with both the "-u" and "-U" flags, same result
root@fbsdhost4:/home/toddg # jexec -l -U backup backitup find /opt
/opt
find: /opt/prod: Permission denied
find: /opt/dev: Permission denied
find: /opt/stage: Permission denied
find: /opt: Permission denied
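
For reference: nullfs passes the numeric IDs through untouched, and each side resolves names against its own /etc/passwd, so comparing numeric ownership from both sides is a quick sanity check (a sketch):

Code:
# -n prints numeric uid/gid instead of resolving names
root@fbsdhost4:/home/toddg # ls -ln /opt/dev
root@fbsdhost4:/home/toddg # jexec -l backitup ls -ln /opt/dev
# if the numbers match on both sides, the mismatch is elsewhere
# (parent directory modes, ACLs, ...)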

Any suggestions would be most welcome
:)
 
Taking a quick fly-by.
Try "chown 2001:wheel"

I, personally, don't necessarily create a user on host AND jail (jail: yes, host: no), but I keep the same UID on both. For example, I created some directories on a dataset with a UID of 211 (way back when). In my jail, I then create a user with a UID of 211 if I want RW access.

Code:
host (/var/db/repositories/):
drwxr-s---   9 211  wheel     9 Jan 25 15:25 server/
jail (/var/db/git/):
drwxr-s---   9 git  wheel     9 Jan 25 15:25 server/

Both the host--in this case--and the jail have this directory mounted from another system (I have an NFS server, three physical servers, and one remote). One host has ~3 or 4 jails, another host has ~4+ jails, and the third host has ~2 jails.
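
If you want to copy that pattern, creating the matching-UID user inside the jail is a one-liner with pw(8). A sketch (the jail name, user name, and paths here are hypothetical):

Code:
# create a jail-only user whose UID matches the on-disk ownership (211 here)
jexec -l myjail pw useradd git -u 211 -d /var/db/git -s /usr/sbin/nologin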
 
Ahhh. I think something is borked in my ZFS dataset. When I try this with a fresh dataset everything works as expected:

Code:
# 1. create fresh new dataset
root@fbsdhost4:/home/toddg # zfs create -o mountpoint=/fubar zroot/fubar

# 2. populate it with a file
root@fbsdhost4:/home/toddg # touch /fubar/somefile.txt

# 3. chown to the user I'll be using in the jail
root@fbsdhost4:/home/toddg # chown -R backup:backup /fubar/

# 4. verify file perms
root@fbsdhost4:/home/toddg # ls -lsat /fubar/
total 19
 1 drwxr-xr-x   2 backup backup  3 Feb 17 22:54 .
 1 -rw-r--r--   1 backup backup  0 Feb 17 22:54 somefile.txt
17 drwxr-xr-x  24 root   wheel  29 Feb 17 22:53 ..

# 5. update the jail's fstab to mount this new dataset
root@fbsdhost4:/home/toddg # vim /jails/fstab/backitup.fstab

 >>> /fubar  /jails/containers/backitup/fubar              nullfs rw 0 0


# 6. make the mount point in the jail
root@fbsdhost4:/home/toddg # jexec -l backitup mkdir /fubar

# 7. restart the jail to reload the fstab
root@fbsdhost4:/home/toddg # service jail restart backitup
Stopping jails: backitup.
Starting jails: backitup.

# 8. drumroll....
root@fbsdhost4:/home/toddg # jexec -l backitup ls /fubar
somefile.txt

# 9. Ok, so that worked
root@fbsdhost4:/home/toddg # jexec -l backitup cat /fubar/somefile.txt

# 10. add some text to the file
root@fbsdhost4:/home/toddg # vim /fubar/somefile.txt

# 11. cat it out
root@fbsdhost4:/home/toddg # jexec -l backitup cat /fubar/somefile.txt
snthaoeusnth
sanoteusatheu
snth

Ok, so the advice from atax1a was spot on, and I have to clean my ZFS.
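
For the record, one way to check whether stale files are hiding under a mount point is to unmount the dataset and inspect the bare directory underneath (a sketch):

Code:
# files written while the dataset was unmounted stay hidden under the mount
root@fbsdhost4:/home/toddg # zfs umount zroot/SAFE/dev/feeds
root@fbsdhost4:/home/toddg # ls -la /opt/dev/feeds
root@fbsdhost4:/home/toddg # zfs mount zroot/SAFE/dev/feeds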

More on this soon.

(thx JohnK, I'll keep y'all updated as I progress)
 
JohnK, I like your idea of not having those user accounts on the HOST.

Code:
# 1. remove the backup user from HOST
root@fbsdhost4:/home/toddg # rmuser backup
Matching password entry:

backup:...

Is this the entry you wish to remove? (yes/no): y
Remove user's home directory? [/home/backup] (yes/no): y
Removing user (backup): mailspool home passwd.

# 2. verify files on the HOST use the correct UID (in this case 2001)
root@fbsdhost4:/home/toddg # ls -lsat /fubar/
total 19
 1 drwxr-xr-x   2 2001 2001   3 Feb 17 22:56 .
 1 -rw-r--r--   1 2001 2001  33 Feb 17 22:56 somefile.txt
17 drwxr-xr-x  24 root wheel 29 Feb 17 22:53 ..

# 3. verify that the jail root user can see the files 
root@fbsdhost4:/home/toddg # jexec -l backitup cat /fubar/somefile.txt
snthaoeusnth
sanoteusatheu
snth

# 4. verify that the HOST user cannot do anything (it's been deleted)
root@fbsdhost4:/home/toddg # jexec -l -u backup backitup cat /fubar/somefile.txt
jexec: backup: no such user

# 5. verify that the JAIL user can see the files
root@fbsdhost4:/home/toddg # jexec -l -U backup backitup cat /fubar/somefile.txt
snthaoeusnth
sanoteusatheu
snth
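
(The difference between steps 4 and 5 is documented in jexec(8): -u looks the username up in the host's password database, while -U looks it up inside the jail's. A quick way to confirm, with output roughly like:)

Code:
# -U resolves "backup" against the jail's /etc/passwd
root@fbsdhost4:/home/toddg # jexec -l -U backup backitup id
uid=2001(backup) gid=2001(backup) groups=2001(backup)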

So sure, that seems like a good way to go, thx!
 
patmaddox, I am following your suggestion of structuring my data separately from my jails. I really like having a root node like this:

Code:
zroot/SAFE

And then configuring sanoid snapshots to recurse from that:

Code:
[zroot/SAFE]
  recursive = yes

[zroot/SAFE/dev]
  use_template = dev

[zroot/SAFE/stage]
  use_template = stage

[zroot/SAFE/prod]
  use_template = prod

However, the way I am creating these datasets seems to be interfering with permissions within the jail (see thread above).

Here's my Ansible code for creating this dataset tree:

Code:
- name: create a new dataset called zroot/SAFE
  community.general.zfs:
    name: zroot/SAFE
    state: present
    extra_zfs_properties:
      canmount: off
      mountpoint: "/opt"

# dev
- name: create a new dataset called zroot/SAFE/dev
  community.general.zfs:
    name: zroot/SAFE/dev
    state: present
    extra_zfs_properties:
      canmount: off
      mountpoint: "/opt/dev"

- name: create a new dataset called zroot/SAFE/dev/feeds
  community.general.zfs:
    name: zroot/SAFE/dev/feeds
    state: present

- name: create a new dataset called zroot/SAFE/dev/postgres
  community.general.zfs:
    name: zroot/SAFE/dev/postgres
    state: present

It's the same pattern for stage and prod environments.

I've tried zroot/SAFE/dev with and without the extra_zfs_properties stanza.

What I'm finding is that when I create the datasets by hand (see the posts above), I can access them from a normal user account within the jail. But the datasets created by this Ansible code produce permission errors inside the jail for normal jail users (jail root can see the files no problem).

I'll continue to debug, but any insights are welcome :-)
 
From the following post:

FWIW: I just noticed you are duplicating a bit of configuration between /etc/jail.conf and your jail.conf.d file, and from what I can see, it can possibly be simplified a bit.
/etc/jail.conf.d/backitup.conf
Code:
backitup {
    $id     = "10";

### 
# all of this can be moved to /etc/jail.conf
# .....................>%
    $ipaddr = "10.0.0.${id}";
    $mask   = "255.255.255.0";
    $gw     = "10.0.0.1";
    vnet;
    vnet.interface = "epair${id}b";

    exec.prestart   = "ifconfig epair${id} create up";
    exec.prestart  += "ifconfig epair${id}a up descr vnet-${name}";
    exec.prestart  += "ifconfig bridge0 addm epair${id}a up";

    exec.start      = "/sbin/ifconfig lo0 127.0.0.1 up";
    exec.start     += "/sbin/ifconfig epair${id}b ${ipaddr} netmask ${mask} up";
    exec.start     += "/sbin/route add default ${gw}";
    exec.start     += "/bin/sh /etc/rc";

    exec.prestop    = "ifconfig epair${id}b -vnet ${name}";

    exec.poststop   = "ifconfig bridge0 deletem epair${id}a";
    exec.poststop  += "ifconfig epair${id}a destroy";
# .....................>%
###
    persist;
}

my jail.conf.d/myjail.conf files typically look like:
Code:
myjail {
    $id = "63";
}
And they only contain information specific to that jail's needs (like 'mlocks' and whatnot, and NOT network-related information).

You can see my /etc/jail.conf file contents here (in the readme or doc) to double-check.
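
For anyone following along, the shape being described is roughly this -- a hedged sketch with hypothetical values, not the actual file (the .include directive needs a reasonably recent jail(8)):

Code:
# /etc/jail.conf -- parameters up here apply to every jail
$mask   = "255.255.255.0";
$gw     = "10.0.0.1";
$ipaddr = "10.0.0.${id}";
vnet;
vnet.interface = "epair${id}b";

exec.prestart   = "ifconfig epair${id} create up";
exec.prestart  += "ifconfig epair${id}a up descr vnet-${name}";
exec.prestart  += "ifconfig bridge0 addm epair${id}a up";

exec.start      = "/sbin/ifconfig lo0 127.0.0.1 up";
exec.start     += "/sbin/ifconfig epair${id}b ${ipaddr} netmask ${mask} up";
exec.start     += "/sbin/route add default ${gw}";
exec.start     += "/bin/sh /etc/rc";

exec.prestop    = "ifconfig epair${id}b -vnet ${name}";
exec.poststop   = "ifconfig bridge0 deletem epair${id}a";
exec.poststop  += "ifconfig epair${id}a destroy";
persist;

# per-jail files then only set what differs (e.g. $id)
.include "/etc/jail.conf.d/*.conf";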
 
Summary: problem solved, but I'm not 100% sure why.

After destroying the datasets, I noticed that I still had data in various directories such as /opt and /opt/dev/feeds. So at some point in my experimentation I wrote files into the mount-point directories themselves, and those files were then hidden underneath the mounted ZFS datasets.

So what I did to figure this out was to iterate over a process like this:

1. destroy the datasets: HOST (root)# zfs destroy -r zroot/SAFE -- nuke the planet from orbit
2. rm the mount points: HOST (root)# rm -rf /opt -- same
3. ensure everything is clean (no datasets, and the directory /opt does not exist): HOST (root)# find /opt -- until I did this, I definitely got confusing results
4. create the datasets: HOST (root)# zfs create -o canmount=off -o mountpoint=/opt zroot/SAFE -- see below for further discussion on this
5. populate the datasets with some test data: HOST (root)# touch /opt/dev/feeds/feed1.txt -- repeat as necessary
6. verify that the HOST (root) account can see the data: HOST (root)# find /opt
7. verify that the JAIL (root) account can see the data: HOST (root)# jexec -l backitup find /opt
8. verify that the JAIL (backup) account can see the data: HOST (root)# jexec -l -U backup backitup find /opt

Based on atax1a's comment, I looked over the zpool history. First, the datasets as the Ansible module created them (note the -p on every create):

Code:
2026-02-18.00:00:33 zfs create -p -o canmount=off -o mountpoint=/opt zroot/SAFE
2026-02-18.00:00:33 zfs create -p zroot/SAFE/dev
2026-02-18.00:00:34 zfs create -p zroot/SAFE/dev/feeds
2026-02-18.00:00:35 zfs create -p zroot/SAFE/dev/postgres
2026-02-18.00:00:35 zfs create -p zroot/SAFE/stage
2026-02-18.00:00:36 zfs create -p zroot/SAFE/stage/feeds
2026-02-18.00:00:36 zfs create -p zroot/SAFE/stage/postgres
2026-02-18.00:00:36 zfs create -p zroot/SAFE/prod
2026-02-18.00:00:36 zfs create -p zroot/SAFE/prod/feeds
2026-02-18.00:00:37 zfs create -p zroot/SAFE/prod/postgres

And here is the history from the later, working run (no -p this time):

Code:
2026-02-18.19:53:36 zfs create -o canmount=off -o mountpoint=/opt zroot/SAFE
2026-02-18.19:53:36 zfs create -o canmount=off zroot/SAFE/dev
2026-02-18.19:53:36 zfs create -o canmount=off zroot/SAFE/stage
2026-02-18.19:53:36 zfs create -o canmount=off zroot/SAFE/prod
2026-02-18.19:54:12 zfs create zroot/SAFE/dev/file
2026-02-18.19:54:12 zfs create zroot/SAFE/dev/db
2026-02-18.19:54:12 zfs create zroot/SAFE/stage/file
2026-02-18.19:54:12 zfs create zroot/SAFE/stage/db
2026-02-18.19:54:12 zfs create zroot/SAFE/prod/file
2026-02-18.19:54:12 zfs create zroot/SAFE/prod/db

For some reason, the community.general.zfs module is creating the ZFS datasets with the '-p' flag, which causes zfs to ignore any other options passed:

From zfs-create(8):

Code:
       -p  Creates all the non-existing parent datasets.  Datasets created in
           this manner are automatically mounted according to the mountpoint
           property inherited from their parent.  Any property specified on
           the command line using the -o option is ignored.  If the target
           filesystem already exists, the operation completes successfully.

So perhaps that was the issue? I'm not sure whether it was that or the stale data under the mount points mentioned above. But it's fixed now, and after 2 days of noodling with ZFS on this issue, I just have a better feel for things.
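
For what it's worth, a quick way to confirm whether -o properties actually landed (rather than being silently dropped by -p) is to ask ZFS which properties were explicitly set:

Code:
# list only properties whose source is "local", i.e. set on the dataset itself
root@fbsdhost4:/home/toddg # zfs get -r -s local canmount,mountpoint zroot/SAFE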

JohnK
re: /etc/jail.conf.d/backitup.conf
That's a great idea. Done! Thx!

BTW - as I have more questions, what's the tradition on this forum? Start new threads, or keep appending questions to this one?
 
BTW - as I have more questions, what's the tradition on this forum? Start new threads, or keep appending questions to this one?
I'm still new, so I'm not 100% sure either, but I'd say here (keeps things nice and tidy), if you ask me.

2 days?! That's a lot of messing around for a single line of code. I'm sort of glad I don't use Ansible (sorry). But stepping back and thinking about your overall objective: do you think Ansible is easy to audit?

Yes, keeping the jail-specific entries in the jail.conf.d/myjail.conf file makes it super easy to update.
 
* pf_enable="YES"
* jail_tasks.yml : jail_enable="YES"
* node_exporter_task.yml : node_exporter_enable="YES"
* pf_enable="YES"
...
Reading this thread with fascination. I'm also getting started bootstrapping a new infrastructure that'll revolve around heavy use of jails, and I'm very pro infrastructure-as-code, so I wanted to add my two cents. I'm not sure they'll be useful to your Ansible workflow, but hopefully you'll still find them interesting.

If you're worried about rc.conf being mutated by multiple tasks, be aware that you can always spread your rc configuration across multiple files. For example, most of my rc configuration is not in /etc/rc.conf at all, but in /usr/local/etc/rc.conf.d, and within that, each service I configure, e.g. GitLab and nginx, gets its own file:

sh:
-> sysrc -f /usr/local/etc/rc.conf.d/gitlab gitlab_enable="YES"
-> sysrc -f /usr/local/etc/rc.conf.d/nginx nginx_enable="YES"

Might seem like overkill, especially if all you're doing is enabling services and nothing more, but it's certainly organized, atomic, and it scales.
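
And to confirm what a service actually ends up with once all the rc.conf fragments are merged, the rcvar subcommand is handy (output trimmed):

sh:
-> service nginx rcvar
nginx_enable="YES"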

HTH!
 
backitup.fstab
Code:
/opt/dev/feeds    /jails/containers/backitup/opt/dev/feeds          nullfs ro 0 0
/opt/dev/postgres    /jails/containers/backitup/opt/dev/postgres          nullfs ro 0 0

/opt/stage/feeds    /jails/containers/backitup/opt/stage/feeds          nullfs ro 0 0
/opt/stage/postgres    /jails/containers/backitup/opt/stage/postgres          nullfs ro 0 0

/opt/prod/feeds    /jails/containers/backitup/opt/prod/feeds          nullfs ro 0 0
/opt/prod/postgres    /jails/containers/backitup/opt/prod/postgres          nullfs ro 0 0
Like I said in an earlier post, I'm also working on fleshing out a new piece of infrastructure that'll revolve around heavy use of jails. Among them, I'm looking to deploy a few that will be writing critical data to their local filesystems, e.g. databases, so I'm obviously concerned about the data backup problem and have been weighing some options:
  1. Mount SMB shares from my ZFS array into those jails. Extremely simple and convenient, as the array is already redundant and on a snapshotting and replication schedule to rsync.net, but I fear that'll translate into poor write performance for those jailed services.
  2. As these jails will be running in a VM on top of that array, I could also put those jails of interests on zvols that'd also benefit from ZFS snapshotting and replication schedules. That feels a lot better than SMB, but still leaves me wary about the data integrity problem: if your database is, for example, holding a lot of data in-memory and flushes to disk at unpredictable intervals, what guarantees could you ever have that any one ZFS snapshot or another holds integral and self-consistent data that would be useful for recovery scenarios?
  3. The nullfs mounts into a secondary, backup-specific jail that you present here is pretty interesting, it's something I hadn't thought about, but wouldn't that suffer from the same data integrity problem as above? After all, you're still simply reading from hot binary files on disk.
  4. And last but not least, the route that, at least for the time being, I'd like to avoid in order to keep complexity to a minimum. For databases at least, it would consist of:
    1. logical & service-specific replication to a secondary jail, e.g. mysql-84-primary --> mysql-84-replica-1 (and its equivalent for PostgreSQL databases, MongoDB, etc.).
    2. either SMB mounts or zvols for the mysql-84-replica-N jails.
    3. a script that, on a schedule, connects to the replica MySQL service(s), locks it, flushes all tables to disk, snapshots the underlying storage, and finally unlocks the replica to resume logical replication (see the sketch after this list).
That final idea is the only one I've come up with so far that I believe solves the speed, availability, data-integrity, and backup problems all at once (you could lock and flush your primary db node to disk and then snapshot its storage, sure, but that would eventually have an impact on availability). It is heavy on complexity, though. Moreover, it doesn't work for services that don't have readily available replication solutions, e.g. Uptime Kuma's SQLite-based data store, so I still need to figure out a backup strategy for those.
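
For what it's worth, here is a minimal sketch of what step 3 of that last option could look like for MySQL -- the jail and dataset names are hypothetical, and the key detail is that FLUSH TABLES WITH READ LOCK only holds while the issuing session stays connected, so the lock, snapshot, and unlock have to share one connection:

Code:
#!/bin/sh
# Quiesce a MySQL replica, snapshot its dataset, then resume replication.
# Assumes this runs on the host that owns the dataset and that the mysql
# client can authenticate non-interactively (e.g. via ~/.my.cnf).
SNAP="zroot/SAFE/prod/db@backup-$(date +%Y%m%d.%H%M%S)"
mysql -h mysql-84-replica-1 <<EOF
STOP REPLICA;
FLUSH TABLES WITH READ LOCK;
system zfs snapshot ${SNAP}
UNLOCK TABLES;
START REPLICA;
EOF

For the SQLite-backed services like Uptime Kuma, sqlite3's online ".backup" command can produce a consistent copy of a live database without stopping the service, which may be enough to sidestep the missing-replication problem there.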

Are these problems you believe you'd face with your setup? And, if I may ask, what ideas would you have to approach them?

Thanks!
 
... you can always spread your rc configuration out to multiple files. For example, most of my rc configurations are not even in /etc/rc.conf to begin with, but in /usr/local/etc/rc.conf.d

In addition/extension to this concept: I use crontab in the same way, and it makes management so much easier. For example (a simple jail config):

I created a "watch script" that just checks every once in a while whether a process is running. To establish "self healing" in my jails, I use this script to restart the jail if it stops. In my jail config file I use `exec.poststart` and `exec.prestop` to set up the "watch script" (i.e. copy it from a staging directory into the host's cron.d directory):

Code:
emby {
  ...
  # Self healing; restart the jail if the `dotnet` process stops.
  exec.poststart += "if [ -f /opt/scheduler/cron/emby.crontab ]; then cp /opt/scheduler/cron/emby.crontab /etc/cron.d/emby.crontab; fi";
  exec.prestop += "if [ -f /etc/cron.d/emby.crontab ]; then rm /etc/cron.d/emby.crontab; fi";
  ...
}
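
The watch script itself can stay tiny. A sketch, assuming the jail is named emby and its main process is dotnet:

Code:
#!/bin/sh
# Run from the host's cron every few minutes (via /etc/cron.d/emby.crontab).
# If the jail's main process has disappeared, bounce the jail.
# (pgrep -j accepts a jail name on recent FreeBSD; use the jid otherwise.)
if ! pgrep -j emby dotnet >/dev/null 2>&1; then
    logger -t emby-watch "dotnet not running, restarting jail"
    service jail restart emby
fi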
 