Solved Binaries get rewritten at jail start with unionfs

D-FENS · Mar 15, 2019

I am working on a solution for mass jail deployment based on ZFS datasets and unionfs.
A jail template is created by extracting the base archive into a dataset. That dataset is then mounted read-only as the root of every jail, and each jail has its own lean dataset where only the deltas are stored.

The goal is that, when an update is applied, only the template is modified and the jails pick up the update automatically, without duplicating the files for each jail.

My problem is that when I start a jail with an initially empty top unionfs layer (R/W), many standard files in /bin, /lib, /sbin etc. get rewritten with the current timestamp, although their sizes are identical. This bloats the jail's dataset and will probably break updates: when I update the template, the older copies in the jail will hide the newer versions in the lower unionfs layer.

Why do the binary files get rewritten? This happens only for some of them, for example /bin/sh, /bin/cat, /bin/mkdir etc. In total, it's about 7 MB and ~300 files.

Here is a little visual help about the situation:

level 0 ------ jail*/root, R/W -------------- : should contain only the changed files in each jail (but the rewritten files in /bin land here and hide the respective files in level 1).
level 1 ------ template/root, R/O -------------- : contains the system base, completely generic, to be updated regularly

I know about the possibility to mount read-only directories via nullfs, but this option looks quite complicated compared to the simple layering with unionfs, which should work in principle.
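
To see which files in a jail's upper layer are plain copies of the template (and would therefore hide later template updates), the two directories can be compared directly. The following is a minimal sketch, not part of the original post: the function name list_shadowed is hypothetical, and it assumes the upper and lower layers are reachable as ordinary directories such as /jails/overlay/root and /jails/template/root.

```shell
#!/bin/sh
# list_shadowed UPPER LOWER
# Print every regular file under UPPER that also exists under LOWER
# with byte-identical content. Such files are redundant copies that
# mask the template's version in a union mount.
list_shadowed() {
    upper=$1
    lower=$2
    (cd "$upper" && find . -type f) | while IFS= read -r f; do
        rel=${f#./}
        # cmp -s is silent and returns 0 only when the contents match
        if [ -f "$lower/$rel" ] && cmp -s "$upper/$rel" "$lower/$rel"; then
            printf '%s\n' "$rel"
        fi
    done
}

# Example call (paths taken from the configuration in this thread):
# list_shadowed /jails/overlay/root /jails/template/root
```

Files that show up here with a fresh timestamp but identical content are the rewritten binaries described above.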

D-FENS · Mar 15, 2019

Here is my configuration:

Bash:

# fstab
/jails/template/root    /jails/overlay/mnt       unionfs         ro                                                                      0       0
/jails/overlay/root     /jails/overlay/mnt       unionfs         rw,noatime,cow,max_files=32768,allow_other,use_ino,suid,nonempty        1       0


#jail.conf
overlay {
        host.hostname = "overlay";
        path = "/jails/overlay/mnt";
        exec.clean;
        
        exec.system_user = "root";
        exec.jail_user = "root";

        vnet;
        vnet.interface = "";

        allow.raw_sockets;
        mount.devfs;
        devfs_ruleset="4";
        
        mount.fstab = "$path/../fstab";
        
        exec.consolelog = "$path/../log/jail_${name}_console.log";
        
        # hooks only create/destroy network interfaces via ifconfig
        exec.prestart  += "$path/../exe/hooks/prestart.sh  $name";
        exec.poststart += "$path/../exe/hooks/poststart.sh $name";
        exec.prestop   += "$path/../exe/hooks/prestop.sh   $name";
        exec.poststop  += "$path/../exe/hooks/poststop.sh  $name";
        
        exec.start += "/bin/sh -x /etc/rc";
        exec.stop  =  "/bin/sh -x /etc/rc.shutdown";
}

D-FENS · Mar 16, 2019

Another interesting observation: I set the template's ZFS dataset property "readonly=on", and all of a sudden those files don't get recreated in the jails anymore.
Why would the readonly property of the template's dataset have this impact on another file system mounted R/W on top of it via unionfs?

SirDice · Mar 18, 2019

Code:

BUGS
     THIS FILE SYSTEM TYPE IS NOT YET FULLY SUPPORTED (READ: IT DOESN'T WORK)
     AND USING IT MAY, IN FACT, DESTROY DATA ON YOUR SYSTEM.  USE AT YOUR OWN
     RISK.  BEWARE OF DOG.  SLIPPERY WHEN WET.  BATTERIES NOT INCLUDED.

From mount_unionfs(8).

D-FENS · Mar 19, 2019

I actually use sysutils/fusefs-unionfs, unionfs(8). Its manual page does not warn about missing support. There is a known issue when copy-on-write is disabled, but I enable it, so in theory it should work.
As mentioned above, it does work as expected, but only when I set the "readonly" ZFS property of the lower layer to "on". I am wondering why the file system behaves like this.

Edit:
Wow, now I took a look at my fstab again, and you're completely right! It actually goes back to mount_unionfs. That's probably what happens.
I initially used a script for mounting (fusefs-unionfs) and then switched to fstab.
Thanks, I'll check if the problem persists when I use fusefs-unionfs via the command line instead of fstab.

The.Silicon.Projects · Mar 19, 2019

You MUST mount the unionfs layer with the NOATIME option, so that the system will not attempt to update the access time.
This is the reason why it rewrites the binaries.

Setting the underlying layer to read-only mode doesn't block the access time update process; it only blocks writes to that layer.
With unionfs, the system finds a writable layer, so it writes to it.

As SirDice said, unionfs is not totally reliable.

I also tested unionfs in jails in the past and finally dropped it, exactly for the reason I explain here: with unionfs it is impossible to force read-only mode.
For example, if I want to "lock" a file, it doesn't work anymore with unionfs, because the filesystem only takes into account the writable flag of the upper layer.

To lock a given file, you must manually create a new version of the file on the upper layer, which masks the underlying version, and then use "chmod", which applies to the file on the upper unionfs layer. This reduces the appeal of unionfs.

The unionfs implementation has some limitations "by design".
Many people have tested unionfs in jails before you; you are not the first. Many of them finally dropped the idea because unionfs brings more problems than it solves.

Unionfs may be useful in some limited scenarios, such as merging a few targeted user directories, for example application lists.

D-FENS · Mar 19, 2019

Wozzeck.Live said:
You MUST mount the unionfs layer with the NOATIME option, so that the system will not attempt to update the access time.
This is the reason why it rewrites the binaries.

Setting the underlying layer to read-only mode doesn't block the access time update process; it only blocks writes to that layer.
With unionfs, the system finds a writable layer, so it writes to it.

As SirDice said, unionfs is not totally reliable.

I also tested unionfs in jails in the past and finally dropped it, exactly for the reason I explain here: with unionfs it is impossible to force read-only mode.
For example, if I want to "lock" a file, it doesn't work anymore with unionfs, because the filesystem only takes into account the writable flag of the upper layer.

To lock a given file, you must manually create a new version of the file on the upper layer, which masks the underlying version, and then use "chmod", which applies to the file on the upper unionfs layer. This reduces the appeal of unionfs.

The unionfs implementation has some limitations "by design".
Many people have tested unionfs in jails before you; you are not the first, you have not invented the wheel. Many of them finally dropped the idea because unionfs brings more problems than it solves.

Unionfs may be useful in some limited scenarios, such as merging a few targeted user directories, for example application lists.

Thank you so much for the detailed explanation! This makes a lot of sense, I'll try it out and report back.

mast07 · Mar 19, 2019

There was a talk at EuroBSDcon 2017 about a similar topic. Perhaps it offers some ideas for a solution without unionfs at all: Talk / Slides

D-FENS · Mar 19, 2019

Wozzeck.Live said:
You MUST mount the unionfs layer with the NOATIME option, so that the system will not attempt to update the access time.
This is the reason why it rewrites the binaries.

I am confirming that mounting the lower layer with NOATIME works. This solves my problem, thanks!
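
For reference, a sketch of what this fix might look like in the fstab posted earlier. The thread does not show the final mount line, so placing noatime on the read-only template entry is an assumption based on the wording "mounting the lower layer with NOATIME"; everything else is copied from the original configuration.

```
# fstab (sketch): lower, read-only template layer with noatime added (assumed placement)
/jails/template/root    /jails/overlay/mnt    unionfs    ro,noatime    0    0
```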

D-FENS · Mar 19, 2019

Wozzeck.Live said:
As SirDice said, unionfs is not totally reliable.

I also tested unionfs in jails in the past and finally dropped it, exactly for the reason I explain here: with unionfs it is impossible to force read-only mode.
For example, if I want to "lock" a file, it doesn't work anymore with unionfs, because the filesystem only takes into account the writable flag of the upper layer.

To lock a given file, you must manually create a new version of the file on the upper layer, which masks the underlying version, and then use "chmod", which applies to the file on the upper unionfs layer. This reduces the appeal of unionfs.

The unionfs implementation has some limitations "by design".
Many people have tested unionfs in jails before you; you are not the first. Many of them finally dropped the idea because unionfs brings more problems than it solves.

Alright, I am convinced that unionfs is probably not the perfect solution for me. And I almost buy the argument, but I have a slight problem that prevents me from going all the way.
Let's consider this scenario:

I mount the template read-only via nullfs as the jail root.
I mount /etc, /var, /usr/local/etc and /home read-write on top of it. These directories are part of the jail's dataset.

Problem: After an update in the template, how do I make sure the updated configuration files are merged into the jails in /etc? The jails will be stuck with their copy of /etc, which will never be updated.
How did you solve this issue back then?
Should I use unionfs then only for /etc? i.e. should I mount $TEMPLATE/root/etc and $JAIL/root/etc on top of each other via unionfs?

SirDice · Mar 20, 2019

roccobaroccoSC said:
After an update in the template, how do I make sure the updated configuration files are merged into the jails in /etc? The jails will be stuck with their copy of /etc, which will never be updated.

You could use the same solution as for EZjail: https://www.freebsd.org/doc/en_US.I....html#jails-ezjail-update-mergemaster-trusted

D-FENS · Mar 20, 2019

SirDice said:
You could use the same solution as for EZjail: https://www.freebsd.org/doc/en_US.I....html#jails-ezjail-update-mergemaster-trusted

Perfect! That was the last missing piece of the puzzle. I already changed the jails to use nullfs and read-only mounts, it works beautifully.
Tons of thanks to everybody who helped me!
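
For readers who want to reproduce the final layout, the per-jail fstab for the nullfs approach can be generated with a small helper. This is a minimal sketch under stated assumptions, not the poster's actual setup: the function name gen_jail_fstab is hypothetical, the writable directories are the four named earlier in the thread (/etc, /var, /usr/local/etc, /home), and each of them is assumed to exist both in the jail's own dataset and as a mount point inside the template.

```shell
#!/bin/sh
# gen_jail_fstab TEMPLATE_ROOT JAIL_DATA JAIL_MNT
# Print an fstab for one jail: the template mounted read-only at the
# jail root, then the jail's own writable directories mounted over it.
gen_jail_fstab() {
    template=$1
    data=$2
    mnt=$3
    # Shared system base, read-only for every jail
    printf '%s\t%s\tnullfs\tro\t0\t0\n' "$template" "$mnt"
    # Per-jail writable directories (assumed list, taken from the thread)
    for dir in etc var usr/local/etc home; do
        printf '%s\t%s\tnullfs\trw\t0\t0\n' "$data/$dir" "$mnt/$dir"
    done
}

# Example call (paths taken from the configuration in this thread):
# gen_jail_fstab /jails/template/root /jails/overlay/root /jails/overlay/mnt
```

The read-only line comes first so that the writable directories are mounted on top of the template rather than hidden beneath it.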
