[Guide] Manually upgrading your jail: easy with ZFS!

Hi gang!

When it comes to systems administration there are many tools which save you from having to deal with all the small individual steps required to set everything up. Now, with all due respect to those tools, I personally believe there's also a risk involved: you become dependent on the tool at hand, and if that suddenly starts to misbehave... then what?

And sometimes things honestly don't have to be as complicated as you might think.

So... FreeBSD 11.2 came out recently, so let's take a look at how you can manually (yet still quickly!) upgrade your jail.

... while you simply leave it running :cool:

Requirements

This method relies on ZFS, and on your jail being installed on its own ZFS filesystem. For example, here is my setup:
Code:
peter@zefiris:/home/peter $ zfs list -r zroot/opt/jails
NAME                  USED  AVAIL  REFER  MOUNTPOINT
zroot/opt/jails      2.27G  88.4M   162M  /opt/jails
zroot/opt/jails/psi  2.11G  50.8G  2.11G  /opt/jails/psi
In case you're wondering: /opt/jails is a placeholder where I store the archives for the base system (such as base.txz). As such I severely limited its storage with refquota, which caps the filesystem itself without affecting its child filesystems. My Psi jail is in a filesystem of its own but not necessarily aware of that ;)
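
If you want to double check that a jail really lives on its own filesystem, a small filter like the one below could help. This is only a sketch: the function name dataset_for_mountpoint is made up for illustration, and it assumes you feed it the tab separated output of zfs list -H -o name,mountpoint.

```shell
#!/bin/sh
# Sketch only: dataset_for_mountpoint is a hypothetical helper name.

# dataset_for_mountpoint <mountpoint>
# Reads "name<TAB>mountpoint" lines on stdin and prints the name of the
# dataset mounted exactly at <mountpoint>. Prints nothing if there is none.
dataset_for_mountpoint() {
    awk -F '\t' -v want="$1" '$2 == want { print $1 }'
}
```

Usage would be something like: zfs list -H -o name,mountpoint | dataset_for_mountpoint /opt/jails/psi. If that prints nothing then the jail directory is just a directory inside another filesystem, and the snapshot and clone steps below won't apply as written.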

Another requirement is that the jail needs access to a recent source tree (/usr/src) where 'recent' means "a source tree representing the new FreeBSD version we're upgrading to".

Preparations

Step 1: proper backups.

Although my method doesn't strictly need these, it never hurts. In fact, it makes much more sense to make them before you start, "just in case".

Step two on the other hand...

First you need to make a snapshot of your jail. This is to ensure that we don't disrupt the live system:
Code:
root@zefiris:/home/peter # zfs snapshot zroot/opt/jails/psi@upgrade
root@zefiris:/home/peter # zfs list -rt all zroot/opt/jails
NAME                          USED  AVAIL  REFER  MOUNTPOINT
zroot/opt/jails              2.27G  88.4M   162M  /opt/jails
zroot/opt/jails/psi          2.11G  50.8G  2.11G  /opt/jails/psi
zroot/opt/jails/psi@upgrade      0      -  2.11G  -
So here comes the fun part: we didn't make that snapshot merely as a means of backup. Nah, we're actually going to perform the upgrade based on the snapshot while our actual jail just keeps on running, as such reducing our downtime.

We do this by cloning our snapshot:
Code:
root@zefiris:/home/peter # zfs clone zroot/opt/jails/psi@upgrade zroot/opt/jails/psi.11_2
root@zefiris:/home/peter # zfs list -rt all zroot/opt/jails
NAME                          USED  AVAIL  REFER  MOUNTPOINT
zroot/opt/jails              2.27G  88.4M   162M  /opt/jails
zroot/opt/jails/psi          2.11G  50.8G  2.11G  /opt/jails/psi
zroot/opt/jails/psi@upgrade   174K      -  2.11G  -
zroot/opt/jails/psi.11_2        1K  50.8G  2.11G  /opt/jails/psi.11_2
So now we have access to /opt/jails/psi.11_2, which is basically a copy of our original jail (at least for now) and which is ready to be upgraded without doing any damage to either our snapshot or the main jail. In fact... I think you'll be astonished to see how smooth this will make the whole upgrade procedure where downtime is concerned.
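
The snapshot and clone steps could be wrapped in a small function, roughly like this. It's a minimal sketch: the names clone_jail_for_upgrade, run and DRY_RUN are made up for illustration, only zfs snapshot and zfs clone come from the session above. With DRY_RUN=1 it merely prints what it would do.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and clone_jail_for_upgrade are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# clone_jail_for_upgrade <dataset> <snapshot-name> <clone-suffix>
# Snapshots the jail's dataset and clones that snapshot right next to it.
clone_jail_for_upgrade() {
    dataset=$1
    snap=$2
    suffix=$3
    run zfs snapshot "${dataset}@${snap}" || return 1
    run zfs clone "${dataset}@${snap}" "${dataset}.${suffix}"
}
```

For my setup that would be: DRY_RUN=1 clone_jail_for_upgrade zroot/opt/jails/psi upgrade 11_2, and then the same again without DRY_RUN once the printed commands look right.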

Removing the flags

Several system files are extra protected against harm. For example:
Code:
peter@zefiris:/home/peter $ ls -lo /sbin/init
-r-xr-xr-x  1 root  wheel  schg 1103736 Jul  8 11:18 /sbin/init*
This means that if we were to simply extract /opt/jails/base.txz into my jail then tar might have a complaint or two about files which it will refuse to overwrite. Good for you tar but bad for us ;) So:
Code:
root@zefiris:/home/peter # cd /opt/jails/psi.11_2/
root@zefiris:/opt/jails/psi.11_2 # cd bin
root@zefiris:/opt/jails/psi.11_2/bin # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/bin # cd ../sbin
root@zefiris:/opt/jails/psi.11_2/sbin # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/sbin # cd ../lib
root@zefiris:/opt/jails/psi.11_2/lib # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/lib # cd ../libexec/
root@zefiris:/opt/jails/psi.11_2/libexec # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/libexec # cd ../usr/sbin
root@zefiris:/opt/jails/psi.11_2/usr/sbin # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/usr/sbin # cd ../lib
root@zefiris:/opt/jails/psi.11_2/usr/lib # chflags noschg *
root@zefiris:/opt/jails/psi.11_2/usr/lib #
This will leave /var/empty but we're just going to ignore that.
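
All that cd'ing around can also be done in a loop. A rough sketch, with some assumptions: clear_schg, run and DRY_RUN are names I made up, and unlike the session above it uses chflags -R so that subdirectories are covered too. The directory list is simply the one used above.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and clear_schg are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# clear_schg <jail-root>
# Clears the schg flag in the same directories as the manual session.
# Assumption: -R (recursive) is used here, the manual session did not recurse.
clear_schg() {
    root=$1
    for dir in bin sbin lib libexec usr/sbin usr/lib; do
        run chflags -R noschg "${root}/${dir}" || return 1
    done
}
```

Run it with DRY_RUN=1 first (for example DRY_RUN=1 clear_schg /opt/jails/psi.11_2) and make very sure the path points at the clone and not at your host system.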

Now it's time to upgrade the system:

Installing the upgrade
Code:
root@zefiris:/opt/jails/psi.11_2/usr/lib # cd ../..
root@zefiris:/opt/jails/psi.11_2 # tar xf ../base.txz
./var/empty/: Can't restore time
tar: Error exit delayed from previous errors.
Depending on your system this can take a while, and you should be grateful here because all this time would otherwise be downtime. And yes, tar may complain (here about /var/empty, which still carries its flag), but trust me: it did its job:
Code:
root@zefiris:/opt/jails/psi.11_2 # cd usr/lib
root@zefiris:/opt/jails/psi.11_2/usr/lib # ls -lo | grep schg
-r--r--r--  1 root  wheel  schg,uarch    23880 Jun 22 06:33 librt.so.1
If you don't believe me then compare with the original jail, which is exactly why we left that intact for now:
Code:
root@zefiris:/opt/jails/psi.11_2/usr/lib # ls -lo librt.so*
lrwxr-xr-x  1 root  wheel  uarch         10 Jun 22 06:33 librt.so@ -> librt.so.1
-r--r--r--  1 root  wheel  schg,uarch 23880 Jun 22 06:33 librt.so.1
root@zefiris:/opt/jails/psi.11_2/usr/lib # ls -lo /opt/jails/psi/usr/lib/librt.so*
lrwxr-xr-x  1 root  wheel  uarch         10 Jul 21  2017 /opt/jails/psi/usr/lib/librt.so@ -> librt.so.1
-r--r--r--  1 root  wheel  schg,uarch 22224 Jul 21  2017 /opt/jails/psi/usr/lib/librt.so.1
root@zefiris:/opt/jails/psi.11_2/usr/lib #
Should be obvious enough, right?

Now, this action was actually destructive because it overwrote files which you might have customized, for example /etc/ftpusers or /etc/hosts. This is where your backup comes into play: copy all those files directly onto your new jail. I usually just restore the entirety of /etc by copying it from the original (and still running) jail.
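
Restoring /etc from the original jail could look something like this. A sketch under assumptions: restore_etc is a made-up name, and plain cp -Rp is just one way to do the copy (tar or rsync would work as well).

```shell
#!/bin/sh
# Sketch only: restore_etc is a hypothetical helper name.

# restore_etc <old-jail-root> <new-jail-root>
# Copies the contents of <old-jail-root>/etc over <new-jail-root>/etc,
# preserving modes and timestamps.
restore_etc() {
    old=$1
    new=$2
    if [ ! -d "${old}/etc" ]; then
        echo "no etc directory in ${old}" >&2
        return 1
    fi
    mkdir -p "${new}/etc" || return 1
    # "etc/." copies the directory contents instead of nesting etc/etc.
    cp -Rp "${old}/etc/." "${new}/etc/"
}
```

In my case: restore_etc /opt/jails/psi /opt/jails/psi.11_2. Keep in mind that this puts the old config files back on purpose; bringing them up to date is what the next step is for.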

Then... the next part is tricky. We're going to upgrade our configuration files in /etc, but in order to do that we'll need access to the source tree. So make this work ;)

In my case I use the following /etc/fstab.psi which is defined in /etc/jail.conf:
Code:
root@zefiris:~ # less /etc/fstab.psi
## Mountpoint(s) for the Psi jail

# Dev   Mountpoint      FS              Options         Dump / Check

/usr/src                /opt/jails/psi/usr/src                  nullfs ro 0 0
/usr/ports              /opt/jails/psi/usr/ports                nullfs rw 0 0
/usr/ports/distfiles    /opt/jails/psi/usr/ports/distfiles      nullfs rw,late 0 0
So I just re-do this manually for the cloned jail: # mount -t nullfs -o ro /usr/src /opt/jails/psi.11_2/usr/src. Another major issue: /dev. So: # mount -t devfs devfs /opt/jails/psi.11_2/dev.
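
Those two mounts can be put in a function as well. Again only a sketch: mount_upgrade_helpers, run and DRY_RUN are made-up names, the mount commands themselves are the ones shown above. It assumes the usr/src and dev directories already exist inside the clone.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and mount_upgrade_helpers are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# mount_upgrade_helpers <clone-root>
# Mounts the host's source tree (read-only) and a devfs inside the clone.
mount_upgrade_helpers() {
    root=$1
    run mount -t nullfs -o ro /usr/src "${root}/usr/src" || return 1
    run mount -t devfs devfs "${root}/dev"
}
```

Try DRY_RUN=1 mount_upgrade_helpers /opt/jails/psi.11_2 first to see what it would run.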

And now it's time to have some fun:
Code:
root@zefiris:~ # chroot /opt/jails/psi.11_2/
root@zefiris:/ # cd /usr/src
root@zefiris:/usr/src # mergemaster -iF

*** Creating the temporary root environment in /var/tmp/temproot
*** /var/tmp/temproot ready for use
*** Creating and populating directory structure in /var/tmp/temproot
You don't need to go to /usr/src, that's just force of habit on my part. So enjoy the awesome stuff which /usr/sbin/mergemaster can do for your config files. For this particular jail I didn't have to do anything because nearly all files in /etc were left default, thus the upgrade just continued "automagically".

But it worked... I can assure you that:
Code:
root@zefiris:/usr/src # cd /etc/ssh/
root@zefiris:/etc/ssh # head sshd_config
#       $OpenBSD: sshd_config,v 1.101 2017/03/14 07:19:07 djm Exp $
#       $FreeBSD: releng/11.2/crypto/openssh/sshd_config 323136 2017-09-02 23:39:51Z des $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.
See?

In case you don't believe me:
Code:
root@zefiris:/etc/ssh # exit
root@zefiris:~ # head /opt/jails/psi/etc/ssh/sshd_config
#       $OpenBSD: sshd_config,v 1.98 2016/02/17 05:29:04 djm Exp $
#       $FreeBSD: releng/11.1/crypto/openssh/sshd_config 311915 2017-01-11 05:56:40Z delphij $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.
Isn't FreeBSD + ZFS just awesome? o_O

So now we've upgraded our jail + all its config files in the base system. All that's left to do now is to actually switch the systems. There is more awesomeness heading our way!

First it's time to clean up though:
Code:
root@zefiris:~ # cd /opt/jails/psi.11_2/
root@zefiris:/opt/jails/psi.11_2 # umount dev
root@zefiris:/opt/jails/psi.11_2 # umount usr/src
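
Before moving on it can't hurt to verify that nothing is still mounted inside the clone. A possible sketch: mounts_below is a made-up name, and it assumes you pipe in fstab style lines such as those printed by mount -p (mount points containing spaces are not handled).

```shell
#!/bin/sh
# Sketch only: mounts_below is a hypothetical helper name.

# mounts_below <root>
# Reads fstab-style lines (device, mount point, ...) on stdin and prints
# every mount point located below <root>, not counting <root> itself.
mounts_below() {
    awk -v root="$1" 'index($2, root "/") == 1 { print $2 }'
}
```

So: mount -p | mounts_below /opt/jails/psi.11_2. No output means you cleaned up properly.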

Verify, double check and let's do this!

So, at this stage we have our old (11.1) jail running in /opt/jails/psi and we have our passive yet upgraded jail available in /opt/jails/psi.11_2. Now, it might be tempting to just edit /etc/jail.conf and be done with it, but that's really not the right approach.

First we're going to promote our upgraded jail:
Code:
root@zefiris:~ # zfs promote zroot/opt/jails/psi.11_2
root@zefiris:~ # zfs list -rt all zroot/opt/jails
NAME                               USED  AVAIL  REFER  MOUNTPOINT
zroot/opt/jails                   2.77G  88.4M   162M  /opt/jails
zroot/opt/jails/psi                385K  50.3G  2.11G  /opt/jails/psi
zroot/opt/jails/psi.11_2          2.62G  50.3G  2.15G  /opt/jails/psi.11_2
zroot/opt/jails/psi.11_2@upgrade   477M      -  2.11G  -
See what happened here?

By promoting our clone we've separated it from its origin: zroot/opt/jails/psi@upgrade. "Sort of" :) Right now we have 2 jail environments: the original jail running on 11.1 and the new upgraded jail for 11.2. With the major difference that the new jail isn't merely separated: it is the main environment which has its own child snapshot which gives us access to the previous (11.1) state. In fact: it's zroot/opt/jails/psi which is now a clone of zroot/opt/jails/psi.11_2@upgrade!
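
If you want to see that role reversal for yourself, the origin property tells the story. A sketch with made-up names (promote_and_check, run, DRY_RUN); zfs promote and zfs get are the only real commands in there.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and promote_and_check are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# promote_and_check <clone-dataset> <original-dataset>
# Promotes the clone, then prints the origin of the original dataset.
# After the promotion that origin should be a snapshot of the former clone.
promote_and_check() {
    clone=$1
    original=$2
    run zfs promote "$clone" || return 1
    run zfs get -H -o value origin "$original"
}
```

For example: promote_and_check zroot/opt/jails/psi.11_2 zroot/opt/jails/psi. Going by the listing above I'd expect this to report zroot/opt/jails/psi.11_2@upgrade as the origin.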

Time to stop our jail, clean up the mess and get the new environment running. Warning: this is where your downtime is happening!

Code:
root@zefiris:~ # jail -r psi
Stopping cron.
Waiting for PIDS: 83473.
Terminated
.
psi: removed
root@zefiris:~ # zfs rename zroot/opt/jails/psi zroot/opt/jails/backup.psi
root@zefiris:~ # zfs rename zroot/opt/jails/psi.11_2 zroot/opt/jails/psi
root@zefiris:~ # jail -c psi

psi: created
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
32-bit compatibility ldconfig path: /usr/lib32
Setting hostname: psi.intranet.lan.
Generating host.conf.
Creating and/or trimming log files.
Starting syslogd.
Clearing /tmp (X related).
Updating motd:.
/etc/rc: WARNING: $testrc_enable is not set properly - see rc.conf(5).
sendmail_submit: /etc/mail/aliases newer than /etc/mail/aliases.db, regenerating
/etc/mail/aliases: 29 aliases, longest 10 bytes, 297 bytes total
Starting sendmail_submit.
Starting sendmail_msp_queue.
Starting cron.

Sun Jul 29 21:31:24 CEST 2018
And what do you know?
Code:
root@zefiris:~ # jexec psi freebsd-version
11.2-RELEASE
And all services are running smoothly on a newly upgraded jail. And our downtime? Literally what you saw up there. If you put that stuff in a script then your downtime will pretty much only consist of the time needed to restart the jail.
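
Such a script could look roughly like this. It's a sketch, not a finished tool: switch_jail, run and DRY_RUN are made-up names, and it assumes the same naming scheme as above (the clone is called name.suffix and the old jail gets parked as backup.name).

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and switch_jail are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# switch_jail <jail-name> <parent-dataset> <clone-suffix>
# Stops the jail, swaps the datasets and starts the jail again.
# Stops at the first failing step so a half-done swap is easy to spot.
switch_jail() {
    jail=$1
    parent=$2
    suffix=$3
    run jail -r "$jail" || return 1
    run zfs rename "${parent}/${jail}" "${parent}/backup.${jail}" || return 1
    run zfs rename "${parent}/${jail}.${suffix}" "${parent}/${jail}" || return 1
    run jail -c "$jail"
}
```

With DRY_RUN=1 switch_jail psi zroot/opt/jails 11_2 you get the four commands from the session above printed in order; drop DRY_RUN to actually make the switch.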

So now that I have my new upgraded jail, let's explore why I went through all this trouble... See: that snapshot we made is now coming in really handy:

Code:
root@zefiris:~ # jexec psi csh
root@psi:/ # cd /.zfs/snapshot/upgrade/usr/sbin
root@psi:/.zfs/snapshot/upgrade/usr/sbin # diff mergemaster /usr/sbin/mergemaster
11c11
< # $FreeBSD: releng/11.1/usr.sbin/mergemaster/mergemaster.sh 302912 2016-07-15 19:58:05Z bdrewery $
---
> # $FreeBSD: releng/11.2/usr.sbin/mergemaster/mergemaster.sh 302912 2016-07-15 19:58:05Z bdrewery $
root@psi:/.zfs/snapshot/upgrade/usr/sbin #
See?

Now that I know everything is ok I can continue and delete the previous jail: # zfs destroy zroot/opt/jails/backup.psi and I'll hang onto the snapshot for a week or so to ensure that nothing went wrong. This can also help me to recover from any silly editing mistakes from here on.

And the worst case scenario? Well, simple: if some customers start calling tomorrow about stuff not working, data loss and whatever, then I always have a backup plan: roll back to the snapshot and my jail will be fully restored to the point before the upgrade.

(edit): Of course if you worry that something like this could happen then it makes more sense to also keep your backup copy of the jail around so that if an emergency does occur you can simply switch jails.
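
For completeness, the rollback route might be scripted along these lines. A sketch under assumptions: rollback_jail, run and DRY_RUN are made-up names, and stopping the jail first is my own precaution. Also note that zfs rollback only goes back to the most recent snapshot unless you add -r, which destroys any newer snapshots.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and rollback_jail are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# rollback_jail <jail-name> <dataset> <snapshot-name>
# Stops the jail, rolls its dataset back to the snapshot and restarts it.
# Everything written to the dataset after the snapshot is lost.
rollback_jail() {
    jail=$1
    dataset=$2
    snap=$3
    run jail -r "$jail" || return 1
    run zfs rollback "${dataset}@${snap}" || return 1
    run jail -c "$jail"
}
```

After the rename the snapshot is called zroot/opt/jails/psi@upgrade, so that would be: DRY_RUN=1 rollback_jail psi zroot/opt/jails/psi upgrade.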

And there you have it!

This is how you upgrade a jail, in my opinion. Minimal downtime, full control over the upgrade, easy to automate and most of all: you'll always have a backup and emergency fallback if you need it.
 