HOWTO: Install, backup and restore a reliable FreeBSD 8.2 system

This guide will be my take on how to install, backup and restore a reliable FreeBSD workstation, or server. What does reliable mean in this context?

  • ZFS on everything practical - I want strong hashes on every bit of data so that we can have self-healing, or at the very least know when and where we have an error.
  • ZFS rather than UFS on root. Even though your OS and application files are not your data, they operate on your data. If corruption causes an application or OS to make errors, this may be a problem in and of itself. Corruption can also cause bad data to be written onto your data filesystems. Running ZFS will prevent that.
  • Redundancy in all pools, to allow self-healing and time to replace faulty drives. This means triple HDD storage mirror, and regular SSD root mirror using reliable and cost effective SSDs.
  • Ability to recover from intentional/unintentional file modification. It is my experience that from time to time, many organizations will have a need to access data that is years old and may have been deleted. Thus, we need a snapshot system that can allow recovery of both recently modified files (where the most likely requests for archived data will come from), and also arbitrarily old files, while balancing needs for disk storage etc. Hence we will have a Grandfather Father Son (GFS) snapshot scheme in place on your HDD storage mirror. These snapshots are replicated on all of your backup pools.
  • A procedure to make proper offsite and offline backups, along with tested restore procedures, that will allow restoration of your data to any point in time you have snapshots for on the backup. Unless you have backups and tested restores of said backups, what you would otherwise build is (IMHO) frankly just a toy.
  • Ability to easily restore your operating system + applications, as well as data. My thoughts are that since it takes so long to get a nice functional OS/application install working, you want to be able to restore that as well as the data. Your customers aren't going to be patient while you install everything from your own personal Howtos and then realize that you have left out certain important but undocumented steps that you will need to figure out under pressure and lack of sleep.
  • Backup pools are automatically scrubbed before backup takes place to catch failing drives.
  • Assurance of end to end data integrity in transfers. e.g. ZFS hashes enable assurance of data integrity on source. Transfer mechanisms are used in ways that ensure that the data written to the destination is checked (via hash, or checksum+hash) with the data that was read from the source. And ZFS hashes on the destination ensure that the data is verifiably the same as it was when written.

What are the additional features of this setup?
  • Fast application and OS speed - most OS and application data is kept on a small SSD mirror.
  • Cost effective data storage - user data, system backup data, and large and non-speed critical system directories are kept on HDD triple mirror. This is easily sped up with the purchase of another SSD to use as an L2ARC.
  • Ability to restore the SSD mirror from the HDD mirror.
  • Backups are easy and fast. All that is required is to connect the backup HDD pool via e-SATA HDD dock and backup with one command. Only the snapshots that have changed since the last backup are transferred.

How did I get to this point? Essentially what happened is that I started with this article, sysutils/zfs-snapshot-mgmt and sysutils/zfs-replicate, and then realized that there was a yawning chasm between what I wanted and what existed at the time. Thus, sysutils/zxfer was born from sysutils/zfs-replicate, and these articles here are the rest of it.

I recommend reading the following two articles as a preface. The first is an explanation of why backups using HDD as the media and ZFS/zxfer as the software can be cheaper, more convenient and have the potential to be just as reliable as tape. The second is a discussion of what sort of hardware to select and what procedures are necessary to bring the reliability to approach the levels found in tape.
1. HDD based backups with ZFS & zxfer – a new paradigm
2. Design considerations for HDD-based backups using ZFS/zxfer

If you find this howto useful, please thank one of these posts.

Some additional notes to readers
I tested these instructions and the accompanying scripts myself a couple of weeks ago, going through every instruction and fixing things until they worked. I had things roughly working back in FreeBSD 8.0, but I've learned a lot since then. I'm finally in a position where I'm confident enough to release these guides. Over the last two weeks, I've just cleaned the scripts up a bit. Hopefully nothing is broken.

I hope that the path I am showing is sound, or at least sound with some editing. If I sound less than confident it's not because I lack confidence (indeed, I'm confident enough to use them on my own workstation), it's because I think that a cautious attitude in general is the best approach to dealing with system reliability.

That's in part why I'm releasing this here, to get some feedback and end up with a better solution than if I'd just kept everything to myself. If I've done something wrong, please comment. I also urge you to test things thoroughly first on a system that doesn't hold any important data. The BSD disclaimer applies in full to any instructions here.

I will also be posting this guide, now and as it gets modified, on the zxfer wiki. Actually, I'll wait a few days until people comment and the dust settles a bit before doing that.

Do note that while much of what I've done is original work, I've incorporated suggestions from many sources via lots of googling and reading forum posts etc. So thanks to everyone, especially to phoenix for many suggestions such as glabeling drives, recommended rsync options etc, and blazing a path in general. And monkeyboy, for provoking some very valid thoughts about what backups should provide. Constantin Gonzalez provided much help and collaboration with zxfer.

You may also find that some encryption, particularly of backups, will prove useful. It should be easy enough to add instructions for that at a later date.
 
Contents and Synopsis

Contents
  1. Synopsis
  2. Background
  3. How to install the system.
    1. Install zroot, the root SSD mirror.
    2. Install some useful software from ports.
    3. Create the storage pool and /home directory
    4. Setup automatic interim backups and snapshots.
  4. Backup to a HDD pool that will be physically taken offsite.
  5. Important file deleted on zroot – how to roll back.
  6. zroot drive dies – how to replace.
  7. zroot pool dead – how to restore (from storage) and optionally roll back to an earlier working state.
  8. Important file deleted on storage/home – how to recover.
  9. storage drive dies – how to replace
  10. storage pool dead – how to restore (from backup01)
  11. Connect the backup HDD pool to the system
  12. System completely dies. How to restore from backup pool.

Synopsis
In 2010 I embarked on a quest for a reliable system. I wanted to be able to rule out errors in hardware with a high degree of certainty. This led to wanting a FreeBSD system that stored as much data on ZFS as possible, with redundancy to allow self-healing. I wanted a modern SSD boot drive + HDD storage solution to increase performance while retaining the ability to store lots of data. I wanted a way of easily backing up and restoring both user data and boot drive data with a similar degree of assurance of end to end data integrity that ZFS gave me, and documented and tested procedures for doing so. I wanted to back up, restore and effectively archive using HDD based ZFS pools, and somehow retain all of the advantages of tape storage.

This didn't exist, so I was forced to design and create the missing parts as best I could with a minimum of wheel reinvention. My hope is that others see enough merit in my approach that they use it and improve on it, or perhaps let me know that I'm doing something stupid and suggest how to improve it. If that happens, this system will become more reliable and convenient than if I had decided to keep it to myself. It might also lead to other developers "playing nice" with this method of doing things, which will also be a win.
 
Background
What does reliability mean? Simply that your system is able to be depended upon. It means that you have set your system up so that ordinarily you can have confidence that it should work as it is supposed to, and that when disaster strikes you are hurt as little as possible.

Achieving reliability
What could go wrong? The best way I can think of to increase reliability is to take a risk management approach and look at all the possible areas that your system can fail, and then look at ways to reduce or eliminate that risk. The only way to stop Murphy's Law from destroying your system is to anticipate everything possible going wrong, and have a counter in place for it.

Thus, I've covered a list of things that can go wrong, with fixes, so that the reasoning behind my design decisions becomes apparent. This is not exhaustive, so if you can come up with valid additions to this list I will add them.

Problem: RAM can cause data corruption due to bit flips from cosmic rays etc. Note that ZFS will happily and silently write data from RAM that has been corrupted. ZFS does not magically know whether the data currently sitting in RAM is corrupt or otherwise.
Solution: Run good error correcting (ECC) RAM. This also means either running Intel Xeon processors or AMD processors other than Bobcat, with a motherboard that supports it.
Your BIOS should be able to print out a list of RAM errors.

Problem: The whole system dies.
Solution:
  • Minimize likelihood of this occurring by protecting the system from power spikes/failures with a UPS.
  • Select components with a degree of redundancy, to keep the system up through individual component failures. This includes splitting mirrors or RAIDZ2 across different HDD controllers such that one can fail and the pools will still be accessible.
  • Minimize the data loss by regularly backing up the data to an offsite and offline location. For our purposes, we will be using HDDs connected to the system through e-SATA.
  • Keep the components in the system at suitable operating temperatures and humidities.

Problem: HDDs can fail by silently corrupting data, still seeming to operate correctly.
Solution(s):
  • Run ZFS. ZFS can verify that each block of data is correct through the use of checksums/hashes, detecting corruption both when files are accessed and during scrubs of the system.
  • Schedule regular ZFS scrubs to identify when this is happening (see the sketch after this list).
  • Use a triple mirror setup (or RAIDZ2 if large amounts of data need to be stored) so that the pool will tolerate two HDDs dying; that way one HDD can fail and you still have redundancy while you detect and replace it.
  • From what I can tell from newegg reviews, the Intel SSDs are much more reliable than the newer mechanical HDDs, so I think it is tolerable to just have a double mirror. YMMV.
  • When backing up to a single HDD, have more than 1 HDD in rotation that is being backed up to so that if one dies at the worst time, there is still another backup. If possible, back up to a pool with redundancy of HDDs (e.g. mirror) rather than a single HDD.
  • Use copies=2+ on single HDDs, so that the data should withstand everything but a head crash.
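
As a concrete illustration of the scrub and copies points above (the schedule and names here are examples only, not part of my scripts), a root crontab entry such as the following scrubs the storage pool every Sunday at 03:00:
Code:
0 3 * * 0 /sbin/zpool scrub storage
and copies can be set per filesystem on a single-HDD backup pool, e.g.:
    # zfs set copies=2 backup01/filesystems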

Problem: Silent data corruption occurs on the drive(s) holding your operating system, which causes your programs to no longer generate the correct output, perhaps writing bad data that will not be detected until a later date.
Solution: Run a ZFS mirror setup on the disks holding your operating system.

Problem: Silent data corruption occurs on the drive(s) holding your swap. This will in turn mean that your programs may write errant data or malfunction, perhaps leading to problems that will not be detected until a later date.
Solution: I'm not sure that there is one at this time. See here. One way of avoiding this situation is to equip your machine with more ECC RAM than it should ever need to use.

Problem: Due to an error of cables or components, incorrect data is copied during a backup or transfer of files. ZFS block checksums will not detect this error, since all they will verify is that the data that is on the destination is the same data that was written, not that the destination data matches the source data.
Solution: Use either ZFS send/receive or rsync to transfer the data. This is what I designed zxfer to do. I discuss this more later.

Problem: During a backup of the root filesystem(s), the contents changes, making the backup inconsistent.
Solution: Use ZFS snapshots as the basis of the rsync transfer. (Again, zxfer does this by design).

Problem: Someone inadvertently or intentionally deletes or edits a critical file.
Solution: Schedule regular snapshots, and keep some around on a decaying frequency (e.g. Grandfather-Father-Son - GFS). That way, if a long time passes before someone realizes how critical the file was, it can still be recovered. This can be done with sysutils/zfs-snapshot-mgmt (thanks to Marcin Simonides).

Problem: The security of the system is compromised, leading to malicious modification of data.
Solution: Poor security practice can lead to very bad things even in a setup that is otherwise extremely reliable. Thus, good security practice is a necessary part of building a reliable system, but it is mostly outside the scope of this guide.

Problem: The CPU has a bug.
Solution: This is a good question. Probably the best solution is don't be a guinea pig. Wait for the processor you use to be tested sufficiently by the market before you wade in.

Problem: A core on your CPU has a flaw that produces errors.
Solution: Until we have an operating system or a CPU that will cause every operation to be done redundantly on different cores and compared, I'm not sure there is a solution within the one computer. Probably the best way to do this would be to have several systems running and then compare the results. This is beyond the scope of this guide. (However, I welcome discussion of this in the comments.)

Problem: The OS/software you use is unreliable, meaning errant data could be written.
Solution: Use well tested software with a reputation for reliability. And hope.
 
Background Continued
A note
I'm far from infallible, and I've only been using FreeBSD for less than two years now. (I have used Unix operating systems as far back as 1997, with heavy use of Linux since 2006 or so.) I've written this guide because nothing quite like it existed at the time of writing. It is my hope that others will find the features a setup like this brings to be very useful, even a must have. If that happens, then I expect that, as is typical in the Free and Open Source Software (FOSS) world, the user base will grow and other people will contribute. They will find and fix bugs, they will suggest better ways of doing things. Reliability and convenience will improve. Perhaps even FreeBSD developers will improve FreeBSD with the idea that systems will be set up like this. In general, a virtuous circle results.

Again, use this guide at your own risk, and please let me know of any mistakes or dubious design decisions I've made. No warranty is implied or expressed, YMMV, if it breaks you get to keep both pieces.

File transfer mechanism: zfs send/receive, rsync
In order to transfer files and filesystems for backup and restore purposes, I needed programs that verify what was written on the destination is the same as what was read from the source. From what I could gather, zfs send/receive and rsync both do this. In order to streamline the process I took Constantin Gonzalez's script sysutils/zfs-replicate, and coded up additional functionality to create sysutils/zxfer. Constantin offered useful advice throughout. Using zxfer it is possible to transfer filesystems and their properties using either zfs send/receive or net/rsync, depending on what you want to do.

As far as reliability goes, zfs send/receive appears to be neither better nor worse than rsync. Both automatically use checksums/hashes to verify that what is written to the destination matches what was read from the source (combined with ZFS checksums on the files, we will have end-to-end checksums on our data - very reliable). I was confused for a while about ZFS send/receive because there is nothing about checksums in the man page, and there are some discussions on the zfs mailing list about corrupted streams. However, those discussions relate to storing ZFS send streams, not the send/receive process itself. The way ZFS send works is that it outputs a snapshot as a stream, with checksums in the stream. The stream is piped to ZFS receive, which (AFAICT) writes data to the target ZFS filesystem and checks it against the checksum (note that the author of the post I linked to should be authoritative - he worked at Sun). If you were to just store the stream, you would have no way of knowing whether the stream is corrupt or not - dangerous if you are going to use it for backup purposes.

If you use ZFS send in tandem with ZFS receive, it's as reliable as rsync. Rsync verifies that the data on the target (after the transfer) is the same as that on the source, using both an MD5 hash and a rolling checksum. (Note that you do not need to use --checksum to get this functionality; it does it automatically. However, it is wise to use --checksum anyway, because you want to verify that files already existing on the destination are the same as those on the source, and force a transfer if they are not.)

So which to use? It really comes down to convenience in most cases. ZFS send/receive is supposedly faster with lots of small files. However, if we only want to transfer a selected few files from a filesystem, rsync is the only way to go as ZFS send/receive operates on snapshots of whole filesystems. Also, if we have two different snapshotting schemes on source and destination, you can only use rsync.
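
To make both options concrete, here is a minimal sketch of each transfer path (the snapshot names, pools and paths are examples only; zxfer wraps the send/receive form for you):
    # zfs send storage/home@auto-2010-11-06_12.00 | zfs receive backup01/filesystems/home
    # zfs send -i auto-2010-11-06_12.00 storage/home@auto-2010-11-07_12.00 | zfs receive backup01/filesystems/home
The second command is the incremental follow-up. The rsync equivalent for pulling selected files out of a snapshot directory would be along the lines of:
    # rsync -aH --checksum /home/.zfs/snapshot/auto-2010-11-06_12.00/user/foo/ /tmp/foo-restored/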

Design decisions
The pools in our system will both be mirrors. The root mirror will consist of two Intel 40GB SSDs, named zroot. This should give good performance for the operating system, and as high a reliability as possible. I recommend using the 320 series SSDs because they have power-loss protection for buffered data in the case of unexpected power loss. Because the workstation has 12GB of RAM, I thought the swap should be at least 16GB. I could not get mirrored swap to work, but seeing as I have enough RAM that swap should never be used, it should not be a concern.
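
(For what it's worth, mirrored swap can be done with gmirror rather than ZFS; the following is a minimal sketch, assuming the two swap partitions are ada0p2 and ada1p2 - I have not adopted this myself.)
    # gmirror label -v swap /dev/ada0p2 /dev/ada1p2
    # echo 'geom_mirror_load="YES"' >> /boot/loader.conf
with an /etc/fstab entry of:
Code:
/dev/mirror/swap  none  swap  sw  0  0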

We'll also have a HDD based triple mirror for storage (pool name: storage), which should have higher reliability, much more capacity, for not much money (at the expense of being slower especially for random reads/writes). Note that this could also be a RAIDZ or RAIDZ2 setup. However, there are plenty of reasons to consider using a mirror instead.

We will keep a minimum of snapshots on zroot so that unwanted changes can be rolled back, but excessive space is not used. We will schedule regular transfers of zroot files over to storage, so that a longer history of snapshots can be built up inexpensively. We will also keep a few directories/filesystems on storage because there is little advantage to keeping them on SSD. They are /usr/ports/packages and /usr/ports/distfiles. They largely consist of compressed files that are rarely used in day to day operations, and when read/written to would be expected to be done so in a sequential manner equally suited to HDD.

On storage, we will keep a longer history of snapshots, with a few going back months, and fewer going back years.

For offsite backups, we will use several individual pools. This is discussed at length in the articles I wrote here and here.

For the sake of this article we will use single drives with copies=2 on their filesystems, so that they are tolerant of silent data corruption, and compression so that space is saved; or mirrored HDDs, again with compression. We will use an e-SATA dock to connect these drives to the system. We will only backup the storage pool, as that contains within it the means to restore zroot.

A word of warning
In testing all of these howtos, I re-used several drives for different zpools. What you will find is that if you re-purpose a drive without first destroying the zpool, ZFS will happily create a new zpool with the drive, without any errors AFAIK. However, when you type
# zpool import
you will see the old pool sitting there in a degraded state. It is confusing, and doesn't look good. What is worse is that ZFS doesn't appear to allow importing that zpool in a faulted state. If you can't import, you can't destroy the zpool, and hence you can't clear the name. I know of two potential ways to clear that error.
  1. Use dd to write zeros to the drive before using it again. This will take a long time, especially if you have to do this to every drive on an old pool. This is actually a pretty good practice on whatever drive you are going to be using as part of a pool. e.g.
    # dd if=/dev/zero of=/dev/ada0 bs=1m
  2. Anticipate that you are going to be re-purposing the drives before you disconnect them, and do a
    # zpool destroy pool_name
    before you will reuse them. This is perhaps the better method of the two, although dd will make sure that gnome won't try and automount an NTFS partition with lingering information that hasn't yet been overwritten.

Which brings me to another point – always use either glabel to label drives or gpart to label partitions (e.g. in the case of zroot). This is done in the scripts I have made for each howto. Thanks to phoenix for this suggestion. This means that you can attach the drives to any SATA port you want, and ZFS will recognize the drives. It is hard for me to over-emphasize how much of a good idea this is.
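
For reference, the labelling itself looks something like this (device nodes and label names here are examples only; the supplied scripts do this for you):
    # glabel label -v wdc1tb-01 /dev/ada1
    # zpool status storage
The drives then show up as label/wdc1tb-01 and so on, regardless of which SATA port they are plugged into. GPT partitions (as on zroot) instead get their labels from gpart add -l at creation time and show up under gpt/.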
 
Install zroot, the root SSD mirror
Again, we are mostly just doing the same thing as in here. It can certainly be done this way. It's a lot of typing though, and I'd rather do the same thing via a script that I have verified is correct - no chance of typos, and much faster. Some differences and notes:
  • tweaked the filesystem setup to use the atime=off option on the appropriate filesystems.
  • checksum is sha256, to really minimize the chances of a bit flip having the same checksum as the original data.
  • gpt labelled everything in relation to the drives and what is on the partitions. Should allow some degree of SATA cable independence.
  • deletes partitions before creating them, to suit testing and retesting.
  • the swap is regular FreeBSD swap, not ZFS swap. Unfortunately, the recommendation appears to be to have no checksum on the swap volume (and also no way to save crash dumps). Assuming this is for a reason, it would invalidate the only reason I would want to run ZFS swap. If you want a highly reliable system, consider using enough RAM that swap is never used. Using reliable SSDs for swap should also increase reliability.

To get to that point, we need some way to store and access that script (via the LiveDVD). I will show how to do so from both a FAT formatted USB stick and a ZFS formatted one. I will assume you know how to copy scripts to such a USB stick, but do not know how to mount the USB stick in FreeBSD.

If you need random help with ZFS, I recommend printing out the zfs and zpool man pages, and this entry from the handbook. The opensolaris ZFS best practices wiki was also useful.

Format the USB stick

If using ZFS
This is only really practical if you have a FreeBSD system up and running already with net access, x11 and a browser to download the scripts. Otherwise, download the scripts and put them onto a regular FAT formatted USB stick.

For completeness, I will show you how it's done from the liveDVD (the only difference between that and a regular install would be having to load the ZFS kernel modules - if this has already been done, pick up at step 5).
  1. Boot the LiveDVD (FreeBSD-8.2-RELEASE-amd64-dvd1.iso.xz) (default option, 1)
  2. Set your region, and your console keymap.
  3. Fixit, option 2 (CDROM/DVD)
  4. Load the ZFS kernel module (if necessary):
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/opensolaris.ko[/CMD]
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/zfs.ko[/CMD]
  5. Determine the device node for your USB stick:
    [CMD=""]Fixit# dmesg | tail[/CMD]
    Insert the USB stick, wait a few seconds.
    [CMD=""]Fixit# dmesg | tail[/CMD]
    Note the device node. In my case, dmesg prints several new lines starting with da3, which is the device node for my usb stick.
  6. Create the zpool on the USB stick:
    [CMD=""]Fixit# zpool create usbstick01 /dev/da3[/CMD]
  7. Create the ZFS filesystem on the usbstick01 zpool. Let's use ZFS to enable compression and a suitable level of redundancy (copies=3) for our usb stick - a great reliability upgrade for a questionably reliable device. Gzip compression is also a great idea - get more storage from your usb-stick, with negligible cost as USB 2.0 speed will be the bottleneck, and compression may even speed up transfer.
    [CMD=""]Fixit# zfs create -o compression=gzip -o copies=3 usbstick01/scripts[/CMD]
If using FAT (msdosfs)

  1. You could use another operating system to do this if you know how, or do the following from FreeBSD:
  2. Insert the USB stick.
  3. Use dmesg to find what device node it is (e.g. da3)
    # dmesg | tail
  4. Format the USB stick.
    # newfs_msdos /dev/da3
Copy the install scripts to the USB stick.
I will assume you know how to do this from your operating system of choice. The scripts are here.

Install preparation
  1. Boot the LiveDVD (FreeBSD-8.2-RELEASE-amd64-dvd1.iso.xz) (default option, 1)
  2. Set your region, and your console keymap.
  3. Fixit, option 2 (CDROM/DVD)

Mount the USB stick
If using FAT
  1. Use dmesg to determine the device node of the USB stick as before (e.g. da3).
  2. Make a directory to mount it in.
    [CMD=""]Fixit# mkdir /mnt/usb[/CMD]
  3. Mount the USB stick.
    [CMD=""]Fixit# mount_msdosfs /dev/da3 /mnt/usb[/CMD]

If using ZFS
  1. Load the ZFS kernel module, if necessary:
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/opensolaris.ko[/CMD]
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/zfs.ko[/CMD]
  2. See what is available to import.
    [CMD=""]Fixit# zpool import | less[/CMD]
  3. Import the zpool (in this example, the zpool is "usbstick01").
    [CMD=""]Fixit# zpool import -f usbstick01[/CMD]
  4. Mount the ZFS filesystem.
    [CMD=""]Fixit# zfs mount usbstick01/scripts[/CMD]

Execute the scripts
  1. Change to script directory
    [CMD=""]Fixit# cd /path/to/scripts[/CMD]
  2. Check that the scripts are executable. There should be an "x" in the fourth position, e.g. -rwx----...
    [CMD=""]Fixit# ls -al[/CMD]
  3. If not, make them executable.
    [CMD=""]Fixit# chmod 700 *[/CMD]
    Determine the device nodes for the drives that you will be mirroring. It is probably something like da0, da1... or ad0, ad1...
    [CMD=""]Fixit# dmesg | less[/CMD]
  4. Ensure you have the correct drive device nodes for your mirrors. Also ensure that the section under "Creating partitions" is correct. The default will work if you want a 16GB swap, and also to ensure that write block, erase block and partition beginning boundaries are aligned.
    [CMD=""]vi install1.sh[/CMD]
  5. Execute the first script. Be aware of what you are doing though, as the script WILL erase everything on whatever the script says are the device nodes for your mirror. Answer yes when it asks if you want to extract the base distribution into /zroot, and doc distribution...
    [CMD=""]Fixit# ./install1.sh[/CMD]
  6. Follow the instructions. Because, as part of the installation, you chroot into /zroot, the second part of the script is copied there for you to execute.
    [CMD=""]Fixit# chroot /zroot[/CMD]
    [CMD=""]Fixit# ./install2.sh[/CMD]
  7. Enter your root password, and set your timezone.
  8. The third part of the script is already copied to /mnt, ready to be executed as you exit the chroot jail.
    [CMD=""]Fixit# exit[/CMD]
    [CMD=""]Fixit# exit[/CMD]
  9. This will throw you back into Fixit menu, just hit 2 and enter again, to get back to the Fixit prompt.
  10. Now get to /mnt to execute the last script.
    [CMD=""]Fixit# cd /mnt/usb[/CMD]
    [CMD=""]Fixit# ./install3.sh[/CMD]
  11. Exit and reboot.
 
Install some useful software from ports
This will involve setting up our network connection, fetching ports, and installing several ports that we will use for various backup/restore/installation related purposes.

Get net connection working
You will need a working /etc/resolv.conf and /etc/rc.conf. See here. After that you will need to restart.

Install various ports
You can also use the install_software.sh script instead.
  1. Update ports
    # portsnap fetch extract
  2. Install portaudit
    # cd /usr/ports/ports-mgmt/portaudit
    # make install clean
  3. Check installed ports for known vulnerabilities:
    # /usr/local/sbin/portaudit -Fda
  4. Install portmaster (feel free to use something else, this is what I use)
    # cd /usr/ports/ports-mgmt/portmaster
    # make install clean
  5. Install sysutils/zfs-snapshot-mgmt (only if installing a new system, if restoring do not do this step.)
    # /usr/local/sbin/portmaster --force-config sysutils/zfs-snapshot-mgmt
  6. Install sysutils/zxfer. This will have the useful side effect of installing net/rsync.
    # /usr/local/sbin/portmaster --force-config sysutils/zxfer
 
Create the storage pool and /home directory
This assumes that you are going to have a HDD based storage mirror for the home directory and short term backups. I've included this because I found it easier to do it this way than to set it up as in the wiki and then have to remove the /usr/home ZFS filesystem, and mess with mount points. /usr/ports/packages and /usr/ports/distfiles will live on storage too, because there is no real reason to crowd out your SSDs with a lot of data that is infrequently used, and sequentially read from.

  1. Exit the install CD and restart, taking out the DVD if you haven't already.
  2. Log in to your new system.
  3. Use dmesg to find the device nodes for your HDDs as per before. Tip:
    # dmesg | grep da
  4. If you are rebuilding or testing and storage already exists, import the old storage pool and then destroy it.
    # zpool import -f storage
    # zpool destroy storage
  5. Mount the usb drive again. Use dmesg if you can't remember the device node. Here are the instructions for a FAT formatted USB drive.
    # mkdir /mnt/usb
    # mount_msdosfs /dev/da3 /mnt/usb
    # cd /mnt/usb
  6. Create the storage mirror. Edit the script create_storage.sh to name the drive labels and device nodes correctly, then execute it. I suggest using a triple mirror, for a combination of reliability, speed, ease of use and peace of mind. Note that in the process we put /usr/ports/packages and /usr/ports/distfiles on storage and wipe whatever was in those directories, which isn't anything of importance.
    # vi create_storage.sh
    # ./create_storage.sh
 
Setup automatic interim backups and snapshots
What will happen here is that everything on storage is snapshotted in a Grandfather Father Son (GFS) scheme, for archival and recovery purposes. zroot is snapshotted but only to allow short term rollbacks/recovery as necessary. A regular zxfer is scheduled to rsync across zroot to storage, where it can be snapshotted in a GFS scheme, giving the ability to restore from arbitrarily far back in time without wasting expensive SSD space.

It would be better in a production system to do the zroot zxfer on a daily basis rather than hourly, e.g. in the small hours. Using an hourly schedule will allow you to easily see and test how it works. To learn how to set it up on a daily basis, see:
# man zfs-snapshot-mgmt
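
For illustration, the root crontab entries follow this general pattern (the times and paths here are placeholders - the real crontab is the one copied from the USB stick below, and the /.crontab_enable guard is what the touch command later switches on; the zfs-snapshot-mgmt path assumes the port installs its script under /usr/local/sbin):
Code:
# run the snapshot manager; its config decides which filesystems are snapshotted and how often
*/15 * * * * [ -e /.crontab_enable ] && /usr/local/sbin/zfs-snapshot-mgmt
# at 11 minutes past the hour, replicate zroot across to storage/zrootbackup
11 * * * * [ -e /.crontab_enable ] && /path/to/zxfer_zroot_to_storage.sh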

Install some useful ports.
See here.

Copy the zfs-snapshot-mgmt.conf to the correct location
  1. Mount the usb drive again. Use dmesg if you can't remember the device node. Here are the instructions for a FAT formatted USB drive.
    # mkdir /mnt/usb
    # mount_msdosfs /dev/da3 /mnt/usb
    # cd /mnt/usb
  2. Copy the config file to the correct location.
    # cp /mnt/usb/zfs-snapshot-mgmt.conf /usr/local/etc/

Create the root crontab
  1. Copy the root crontab to its correct location. (Or just skip to the next step and paste the contents in there, which is probably a more proper way to do things.)
    # cp /mnt/usb/root /var/cron/tabs/
  2. Verify that it is in the right place.
    # crontab -e
  3. Enable the commands in crontab by creating /.crontab_enable file. We do this so that in the event of a full system restore, the snapshotting etc. only starts when WE want it to. Otherwise we have to be more concerned about timing.
    # touch /.crontab_enable
  4. Verify that things are functioning correctly.
    There should be snapshots of storage/home every 30 minutes. There should be recursive snapshots of the filesystem tree under storage/zrootbackup, zroot, storage/distfiles and storage/packages every hour.
    # zfs list -t snapshot
    At 11 minutes past the hour, the first zxfer of the root mirror should start. Give it another 15 minutes or so, and then check that there are filesystems in storage/zrootbackup. Also check that they match correctly with what is in zroot.
    # date
    # zfs list
    # ./diff_zroot_storage.sh
 
Backup to a HDD pool that will be taken offsite
This should be done frequently, and rotated with other HDD pools to a safe offsite and offline location. Also consider using something that can tolerate a HDD head crash, e.g. a mirror.

Create the HDD zpool
  1. Connect a suitable HDD via e-SATA dock to the system. If you are using FreeBSD 8.1+ with AHCI enabled in your /boot/loader.conf, it should autodetect. If not, and if you are willing to reboot your machine at this point, the HDD will be detected when it reboots.
  2. Otherwise, get the system to detect the HDD. X corresponds to scbusX from the following command.
    # camcontrol devlist -v
    Then to get it to detect:
    # camcontrol reset X && camcontrol rescan X
    Note that you will have needed to put the following in your /boot/loader.conf
    Code:
    ahci_load="YES"
  3. Use dmesg to determine which device node it is (e.g. /dev/ada2)
    # dmesg
  4. At this point, it's probably a good idea to zero the drive so we start from a clean slate.
    # dd if=/dev/zero of=/dev/ada2 bs=1m

If the backup is a mirror
  1. glabel the HDDs (e.g. wdc250_backup01 and wdc250_backup02). It is a good idea to physically label each HDD with its glabel as well, and the pool to which it will belong. Note that the second drive will be on a different device node (e.g. /dev/ada3).
    # glabel label -v wdc250_backup01 /dev/ada2
    # glabel label -v wdc250_backup02 /dev/ada3
  2. Test.
    # glabel list
  3. Create the zpool:
    # zpool create backup01 mirror label/wdc250_backup01 label/wdc250_backup02
  4. Test.
    # zpool status

If the backup is a single HDD with copies=2 and compression
  1. glabel the HDD (e.g. wdc250_backup01). It is a good idea to physically label the HDD with the glabel as well.
    # glabel label -v wdc250_backup01 /dev/ada2
  2. Test.
    # glabel list
  3. Create the zpool:
    # zpool create backup01 /dev/label/wdc250_backup01
  4. Test.
    # zpool status

Continuing on
Note that the only difference for a single HDD backup vs a mirror is that for a budget constrained, single HDD solution you will want to specify "-o copies=2,compression=lzjb" in all sysutils/zxfer commands. There is no reason why you can't do the same thing with a mirror. In particular, specifying compression may be useful.
  1. Create the filesystems for the backups:
    # zfs create backup01/pools
    # zfs create backup01/filesystems
  2. Perform the first backup. Note that it won't transfer anything but the empty filesystems if there are no snapshots yet - unless we create snapshots specifically to zxfer across, we will need to have waited until about 25 minutes past the hour for zroot to be zxferred to storage, and a further 45 minutes until it is on the hour and a recursive snapshot is taken of storage/zrootbackup. Note also that if we specify "-o" we are automatically overriding any properties of the original filesystems e.g. with compression and copies, as suits a backup. However, the original property values for each filesystem will be backed up, to be restored as they once were.

    Another thing to be aware of is that the "-g 370" option specifies "grandfather protection", i.e. zxfer will test for any destination snapshots that are to be deleted and fail if there are any over 370 days old (i.e. in the case of grandfathers). In that case, there is likely to be something wrong on your system if such snapshots are being deleted (which causes zxfer to want to delete the corresponding snapshots on the destination).
    # zxfer -dFkPbv -g 370 -N storage/home backup01/filesystems
    or an example of the compression,copies version:
    # zxfer -dFkPbv -g 370 -o copies=2,compression=lzjb -N storage/home backup01/filesystems
    Continuing...
    # zxfer -dFkPbv -g 370 -N storage/distfiles backup01/filesystems
    # zxfer -dFkPbv -g 370 -N storage/packages backup01/filesystems
    # zxfer -dFkPBv -g 370 -N storage/zrootbackup backup01/pools
  3. In future, it's easier to put those commands in a script and execute it (a rough sketch of such a script is shown after this list), e.g.:
    # /mnt/usb/backup_with_zxfer.sh backup01
  4. Export the backup pool.
    # zpool export backup01
  5. Turn off the e-SATA dock and remove the HDD.
Note that I've included the system beep in the zxfer commands, since it's nice to be able to be alerted when the backup as a whole is finished or there is an error that needs your attention.
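
Here is a rough sketch of what such a backup_with_zxfer.sh wrapper might look like. This is an illustration pieced together from the commands above, not the actual script:
Code:
#!/bin/sh
# sketch of backup_with_zxfer.sh - adjust to taste
pool="$1"
[ -n "$pool" ] || { echo "usage: $0 backup_pool"; exit 1; }

# import the backup pool (assumes it is not yet imported)
zpool import "$pool" || exit 1

# precursory scrub to catch failing backup drives, polling until it finishes
zpool scrub "$pool"
while zpool status "$pool" | grep -q "in progress"; do
    sleep 60
done
zpool status "$pool"

# incremental transfers; add "-o copies=2,compression=lzjb" for a single-HDD pool
zxfer -dFkPbv -g 370 -N storage/home $pool/filesystems
zxfer -dFkPbv -g 370 -N storage/distfiles $pool/filesystems
zxfer -dFkPbv -g 370 -N storage/packages $pool/filesystems
zxfer -dFkPBv -g 370 -N storage/zrootbackup $pool/pools

# export by hand afterwards as per the steps above, or uncomment:
# zpool export "$pool"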

Perform incremental backups
  1. Connect the HDD via e-SATA dock, turn it on.
  2. Get the operating system to detect the HDDs, using the above camcontrol commands.
  3. Import the pool.
    # zpool import backup01
  4. Transfer the filesystems. Note that the same command that initially transfers the filesystems will perform the incremental backups, deleting any destination snapshots that no longer exist on the source in order to transfer properly.
    # /path/to/backup_with_zxfer.sh backup01
  5. Note that the above command now includes a precursory scrub of the backup pool, along with showing the progress of the scrub.
  6. Turn off the e-SATA dock, remove the HDDs and safely take them to the offsite location.
 
Roll back zroot to an earlier state
Say you have erased or modified an important file on zroot, and need to recover. How do we do this?

  1. Identify the filesystem that would have held the data destroyed/modified (e.g. say we have accidentally deleted /usr/local/bin/rsync. Then the filesystem we would restore is zroot/usr.)
    # zfs list
  2. Examine the snapshots for that filesystem. There will be several, and it is up to you to identify the likely time.
    # zfs list -t snapshot | grep zroot/usr | less
  3. Check that the file is as it should be in the desired snapshot. A way you can do this is as follows:
    # cd /usr/.zfs/snapshot
    Look at the different snapshots (e.g. auto-2010-11-06_12.00):
    # ls
    And have a look around to find your file.
    # cd auto-2010-11-06_12.00
    # cd local/bin
    # ls
    Once you are satisfied, get out of the snapshots directory.
    # cd /
    Note that in many cases it is not necessary to rollback a filesystem, which is a rather blunt tool. All you may need to do in some cases is to copy the file(s) you want from the appropriate snapshot directory to where you want them. This will prevent the case where you accidentally erase good data in the process of rolling back your snapshot. But if you desire to rollback, here is how to do it:
  4. Roll back the filesystem to the desired snapshot. If you are not rolling back to the last snapshot, then you will have to roll back to a previous version and automatically delete all the snapshots taken since then as well. Note that the following command will do that with the -r option, so be careful.
    # zfs rollback -r zroot/usr@auto-2010-11-06_12.00
  5. Check that the file is as it should be. Things should be working now.
 
A drive on zroot dies - how to replace
If a single SSD on the zroot dies, how do we replace it? We can either treat it as we would a failure of the entire zroot (replacing the faulty drive in the process), or we can just replace the faulty drive and partition it correctly, add bootcode, and inform ZFS that this drive is to be the replacement.
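
For a rough idea of what zroot_drive_restore.sh has to do, here is a minimal sketch (the device node ada1, the label names and the partition sizes are assumptions, and the alignment tweaks made by the real script are ignored - the real values live in the script):
Code:
# partition the new SSD
gpart create -s GPT ada1
gpart add -b 34 -s 128 -t freebsd-boot ada1
gpart add -s 16G -t freebsd-swap -l intel40GB-03-swap ada1
gpart add -t freebsd-zfs -l intel40GB-03-zfs ada1
# install the ZFS-aware boot code
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
# tell ZFS to resilver onto the new partition
zpool replace zroot gpt/intel40GB-01-zfs gpt/intel40GB-03-zfs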

  1. Identify the disk that is to be replaced. It is a good idea to have made a label or map of your physical drives so that you can identify which one has failed by its glabel.
    # zpool status
    zroot will have one drive marked as something other than "online", this is the one we want to disconnect.
  2. Disconnect the failed drive, remove it, replace it with a working one and connect the replacement. (For the sake of example, let's say the failed one is gpt/intel40GB-01-zfs.) We may need to offline it first; we will have to do that anyway in order to test this procedure, which also requires using a physically different drive for testing (to get back to the original drive, repeat the process twice).
    # zpool offline zroot gpt/intel40GB-01-zfs
  3. If necessary, make FreeBSD aware of the existence of the new drive. See here. If this doesn't work, you will need to restart.
  4. Mount the usb drive again. Use dmesg if you can't remember the device node. Here are the instructions for a FAT formatted USB drive.
    # mkdir /mnt/usb
    # mount_msdosfs /dev/da3 /mnt/usb
    # cd /mnt/usb
  5. Edit the file zroot_drive_restore.sh so that the variables match the drive. In order to tell which device node the new drive is on, and depending on what was formerly stored on the drive, you may need to do any of the following (and use a process of elimination):
    # glabel list
    # gpart list
    # zpool status
    # zpool import
    # dmesg
  6. Execute the script.
    # ./zroot_drive_restore.sh
  7. Test that the drive on zroot is now resilvering:
    # zpool status
 
zroot pool dies - how to restore using the storage pool
For some reason the zroot pool dies, how do we restore it (assuming the storage pool is still functional)? These instructions include the ability to restore from any snapshot that exists on storage.

Determining the problem
If zroot is down, it is almost certain that the machine isn't going to boot properly, and has probably just stopped working. The machine should at least be making it through the BIOS; otherwise we'd be looking at replacing the PSU or motherboard. We need to see the status of the drives with Fixit on the LiveDVD first.
  1. Boot the LiveDVD (FreeBSD-8.2-RELEASE-amd64-dvd1.iso.xz) (default option, 1)
  2. Set your region, and your console keymap.
  3. Fixit, option 2 (CDROM/DVD)
  4. Load the ZFS kernel module:
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/opensolaris.ko[/CMD]
    [CMD=""]Fixit# kldload /mnt2/boot/kernel/zfs.ko[/CMD]
  5. Check the zpool status.
    [CMD=""]Fixit# zpool status[/CMD]
    If there is something wrong with both drives, it may be appropriate to restore the zroot mirror from storage, if storage is ok.

Restoring from storage pool
  1. Install the zroot mirror as shown here.
  2. Install some software from ports, as shown here.
  3. Exit install and restart the computer, removing the LiveDVD. Log in as root.
  4. Mount the usb drive again. Use dmesg if you can't remember the device node. Here are the instructions for a FAT formatted USB drive.
    # mkdir /mnt/usb
    # mount_msdosfs /dev/da3 /mnt/usb
    # cd /mnt/usb
  5. Import the storage pool.
    # zpool import storage
  6. At this point we want to decide which snapshot we want to restore zroot from.
    # ls /storage/zrootbackup/.zfs/snapshot
  7. Edit the file restore_from_storage.sh and fix the snapshot variable. Unless you have a reason for using an older snapshot (e.g. a newer one doesn't work), you probably want to use the newest one.
    # vi restore_from_storage.sh
  8. Destroy filesystems that are already found on storage.
    # zfs destroy zroot/usr/ports/distfiles
    # zfs destroy zroot/usr/ports/packages
  9. Find all the schg flags that have been turned on (in order to turn them off temporarily). This creates a text file called schg_flags.txt. (See the sketch after this list for roughly what the flag scripts do.)
    # ./find_schg_flags.sh
  10. Turn off readonly on zroot/var/empty
    # zfs set readonly=off zroot/var/empty
  11. Set all those files with schg to noschg.
    # ./set_noschg.sh
  12. Restore the files necessary to get the root mirror restored (uses rsync, which should have been installed).
    # ./restore_from_storage.sh
  13. Put the schg flags back the way they were.
    # ./set_schg.sh
  14. Turn readonly back on, on zroot/var/empty
    # zfs set readonly=on zroot/var/empty
  15. Reboot
    # shutdown -r now
  16. Enable the commands in crontab by putting a file in the / directory. We do this so that in the event of a full system restore, the snapshotting etc. only starts when WE want it to. Otherwise we have to be more concerned about timing.
    # touch /.crontab_enable
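
For the curious, here is roughly what the three flag-handling scripts amount to. This is a hedged sketch, not the scripts themselves: the location of schg_flags.txt is an assumption, and in practice the real scripts will also want to avoid descending into /storage, /home and the .zfs snapshot directories.
Code:
# find_schg_flags.sh (sketch): record every file that has the schg flag set
find / -flags -schg -print > ./schg_flags.txt

# set_noschg.sh (sketch): clear the flag so rsync can overwrite those files
while read f; do chflags noschg "$f"; done < ./schg_flags.txt

# set_schg.sh (sketch): put the flags back afterwards
while read f; do chflags schg "$f"; done < ./schg_flags.txt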
 
Recover a file on the storage pool
Say you have erased or modified an important file on storage and need to recover it. How do we do this?

The key is again, snapshots. The first port of call is to realize that every snapshot is accessible without having to do anything special, in the filesystem's .zfs/snapshot/$snap_name directory.

Restoring just a file
  1. Say we have deleted the file /home/user/foo/stuff.txt, and /home is the mountpoint of storage/home.
  2. Check to see which snapshots are available.
    # zfs list -t snapshot | grep storage/home | less
  3. Inspect the file in each snapshot at a time around where you think the change was. e.g.
    # vi /home/.zfs/snapshot/auto-2010-11-07_20.00/user/foo/stuff.txt
  4. If you decide you want to be using that version of the file, you probably want to wait until there has been a snapshot taken by zfs-snapshot-mgmt recently so that you don't have to worry about overwriting the new version if you have somehow made an error, and then copy it to the current location. e.g.
    # cp /home/.zfs/snapshot/auto-2010-11-07_20.00/user/foo/stuff.txt /home/user/foo/stuff.txt


Rolling back to an earlier version of the filesystem
If a wholesale disaster has happened and you just want to roll back the filesystem to an earlier state, I suggest first backing up the filesystem so that when you roll it back you can restore if necessary.
  1. Backup the storage pool. I would use a new HDD or pool for this purpose rather than an existing backup. See !!LINK TO article 4!!
  2. Check to see which snapshots are available.
    # zfs list -t snapshot | grep storage/home | less
  3. Inspect the file in each snapshot at a time around where you think the change was. e.g.
    # vi /home/.zfs/snapshot/auto-2010-11-07_20.00/foo/stuff.txt
  4. Roll back to the earlier version.
    # zfs rollback -r storage/home@auto-2010-11-07_20.00
  5. If for whatever reason you change your mind or you have rolled back too many snapshots by accident, you can restore from the backup you have made as follows:
    # zxfer -deFPv -N backup01/filesystems/home storage
 
A drive on the storage pool dies - how to replace it
We have discovered during a routine zpool status check that one of the drives on storage needs to be replaced. How to go about it?

  1. Check the status of the zpool to find out which drive has malfunctioned. e.g. label/wdc250-02
    # zpool status
  2. Disconnect the malfunctioned drive. Hopefully you have the physical drives labelled in advance with the glabel of the drive. It might also be a good idea to offline the drive in question. We will need to do that anyway if we are just testing. e.g.
    # zpool offline storage label/wdc250-02
  3. Connect the replacement.
  4. (If not FreeBSD 8.1+: ) Inform FreeBSD that there is a new device on a SATA port. First find which buses are not already used.
    # camcontrol devlist -v
    At this point, I usually do a trial and error of X, where X is an unused "scbusX" from that command.
    # camcontrol reset X && camcontrol rescan X
    Success happens when there are a few lines of output on the screen recognizing the drive (the same as the last bit of dmesg).
  5. Label the drive with glabel, and physically write on the drive or a map as to the glabel. For sake of example, assume the device node is /dev/ada3. We are going to label it "wdc250-04".
    # glabel label -v wdc250-04 /dev/ada3
  6. Tell zfs to replace the drive.
    # zpool replace storage label/wdc250-02 label/wdc250-04
  7. Check on the resilvering status.
    # zpool status

Note that if you are just testing this by replacing a working drive, the drive that is replaced is marked as no longer part of the pool (handy for us!). You can test this with:
# zpool import
 
The storage pool dies - how to restore it from a backup
Maybe something electrical has gone wrong with the storage pool, or someone has been overly aggressive with the roll backs. You need to restore it from a backup drive (or pool). This assumes zroot is still functional.

  1. Stop regular snapshotting and zxfers via crontab.
    # rm /.crontab_enable
  2. Connect the backup HDD via e-SATA dock to the system. Note that you will have needed to put the following in your /boot/loader.conf and have rebooted before continuing.
    Code:
    ahci_load="YES"
  3. See that the pool is available.
    # zpool import
  4. Import the backup pool.
    # zpool import backup01
  5. If the storage pool is functional but we just want to restore a filesystem to whatever was last backed up (and all snapshots), do as follows:
    # zxfer -deFPv -N backup01/filesystems/home storage
  6. If the storage pool is not working, then we would have checked that it was non-functional (assuming you have checked to see that the HDDs are dead and that it's not just device nodes being switched around - you have glabeled the HDDs, haven't you?):
    # zpool status
  7. Destroy the pool and then disconnect the drives.
    # zpool destroy storage
    Or if you have some sort of hope and desire to recover them later, export the pool and disconnect the drives.
    # zpool export storage
  8. Connect the replacement HDD for storage and create that pool. !!FOLLOW LINK TO ARTICLE 2!!
    There should be a new storage pool now. If that isn't working, it might be a faulty HDD controller or a changed BIOS setting perhaps.
  9. Execute the restore_storage_from_backup.sh script.
    # ./restore_storage_from_backup.sh
  10. After everything appears to be working, enable things in crontab again.
    # touch /.crontab_enable
 
Connect the backup HDD pool to the system

  1. Connect the backup HDD via e-SATA dock to the system. Note that you will have needed to put the following in your /boot/loader.conf and have rebooted before continuing.
    Code:
    ahci_load="YES"
    Note also that the updated script backup_with_zxfer.sh assumes that the backup pool is NOT imported, so the following is no longer necessary.
  2. See that the pool is available.
    # zpool import
  3. Import the backup pool.
    # zpool import backup01
 
System completely destroyed – how to do a full restore from backup
Something has gone horribly wrong, but fortunately you have at least one off-site backup HDD. Here is how to restore the whole system.

  1. Get your computer to a point where it will boot from DVD and you can put all the drives for zroot and storage in it.
  2. Install zroot.
  3. Install some ports.
  4. Install storage
  5. Connect and mount the backup HDD.
  6. Determine the last zroot snapshot to restore from (and edit the restore_full.sh script accordingly).
    # zfs list -t snapshot | grep zroot@
  7. Mount the USB stick: use dmesg to determine the device node of the USB stick as before (e.g. da3).
  8. Make a directory to mount the usb drive.
    # mkdir /mnt/usb
  9. Mount the USB drive.
    # mount_msdosfs /dev/da3 /mnt/usb
  10. Go to the location of the scripts.
    # cd /mnt/usb
  11. Find all the schg flags that have been turned on (in order to turn them off temporarily). This creates a text file called schg_flags.txt.
    # ./find_schg_flags.sh
  12. Turn off readonly on zroot/var/empty
    # zfs set readonly=off zroot/var/empty
  13. Set all those files with schg to noschg.
    # ./set_noschg.sh
  14. Edit restore_full.sh to suit. Make sure you have included every necessary directory and file from the old / directory. You will have needed to modify the regular zxfer command in your crontab for those files/directories to be there. This is a good reason not to store much additional stuff in / that isn't a mountpoint of a non-zroot filesystem.
    # vi restore_full.sh
  15. Restore storage/home, storage/distfiles, storage/packages, storage/zrootbackup and zroot via sysutils/zxfer. Note in the last step which files rsync couldn't overwrite because they were busy. There should not be many. Depending on what they are, choose how to update. e.g. you might decide to reinstall rsync, or do a full system update later.
    # ./restore_full.sh
  16. Put the schg flags back the way they were.
    # ./set_schg.sh
  17. Turn readonly back on, on zroot/var/empty
    # zfs set readonly=on zroot/var/empty
  18. At this point it is a good idea to check to see that things look correct with diff, between directories in /storage/zrootbackup/zroot and /. A diff of /usr and its counterpart will show differing /usr/ports/distfiles and /usr/ports/packages but this is nothing to be concerned about, because these aren't stored in the /storage/zrootbackup/zroot... . For some reason, files in /boot/GENERIC and /usr/share/man differ, but I'm not sure why. /var/db/entropy/... differs, I don't believe this matters. Some files in /var/log differ but this is not cause for concern as logs will have been written in the time between restoring and diffing. AFAIK none of this is cause for concern, but would be glad to be enlightened if otherwise. Here are the commands:
    # ls /usr/ports/distfiles
    # ls /usr/ports/packages
    # ./diff_zroot_storage.sh
  19. Reboot the computer, and disconnect the e-SATA dock.
    # shutdown -r now
  20. At this point, it is probably a good idea to update your system completely, considering that the last step of that script was rsyncing the old zroot over the new. However, I would think that only the software that uses files rsync couldn't overwrite needs to be updated.*
  21. Enable the commands in crontab by putting a file in the / directory. We do this so that in the event of a full system restore, the snapshotting etc. only starts when WE want it to.
    # touch /.crontab_enable

* I think this method I outlined works quite well. A better method would probably be to boot to a FreeBSD install on a USB stick, mount storage, zroot and your backup pool, so that every necessary file might be overwritten. Trying to use zfs send/receive rather than rsync based transfer might be possible. However, because you:
  1. Don't want to destroy the old snapshots on the backup
  2. Need to transfer every snapshot across to what is probably a space constrained pool in order to get a current snapshot over, even if you delete the older snapshots immediately after
  3. Need to somehow get a working /boot/zfs/zpool.cache file to the correct location - IME it won't work if you try and boot it up. (Perhaps the answer is to use a separate filesystem or pool specifically for this /boot/zfs.)
  4. Cannot use the new zroot to bootstrap itself (i.e. you have to do so from another system)
this makes it somewhat more difficult to accomplish.
 
Reserved post 17 - Credits?

At this point, the article is effectively finished as far as a first public draft, and I will wait for some comments before further amendments. (Edit: Actually, I'm still re-reading it every so often and editing as I go, but the changes will be minor.) Mods: I would like to reserve the next few posts in case I need them. Thanks.
 
There is some good discussion going on here. In particular I've included some pictures to show how you would connect up your backups (e.g. what to buy).
 
One thing I always wonder is why people seem to keep on using the released FreeBSD versions to bootstrap new installations. True, this is how you get FreeBSD installed for the very first time or in case of total disaster. However, once you have at least one working FreeBSD system for the particular CPU architecture, you may build your own install 'distribution'. You can control the layout of your distribution and not have your scripts fail when the release puts things differently. You will also need only one medium to deal with.

Probably I am too much of a die-hard in this regard, but I rarely trust prepackaged things. Here I will outline how to create a self-replicating FreeBSD-STABLE system. :)

Let's assume you have a working system, with sources in /usr/src. You build the system as usual.

# cd /usr/src
# make update
# make buildworld
# make buildkernel

Next, we need to prepare our bootable installation medium. My typical installation medium is USB flash stick. This is small, fast and cheap. I would use ZFS on the USB flash. The zpool needs to be bootable. I would create it with a script like this (USB drive is da2):

Code:
# select device node
disk=da2
# wipe out drive
dd if=/dev/zero of=/dev/$disk bs=1m
# create a GPT scheme with a boot partition and a labelled ZFS partition
gpart create -s GPT $disk
gpart add -b 34 -s 128 -t freebsd-boot $disk
gpart add -t freebsd-zfs -l distribution0 $disk
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk
# create the bootable pool on the labelled partition
zpool create -O atime=off distribution /dev/gpt/distribution0
zpool set bootfs=distribution distribution
# re-import with a cache file and copy it into the new pool
zpool export distribution
zpool import -o cachefile=/tmp/zpool.cache distribution
mkdir -p /distribution/boot/zfs
cp /tmp/zpool.cache /distribution/boot/zfs/zpool.cache
I use distribution0 for the label, because sometimes I want to replicate these sticks. To replicate, just label another one with, say, distribution1, then attach it as a mirror to the first one, wait for it to resilver, then export. You will then have two 'broken' mirror halves, both still working perfectly well (and ready to replicate to yet another USB stick).
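
A minimal sketch of that replication, assuming the second stick shows up as da3:
Code:
# partition and label the second stick
gpart create -s GPT da3
gpart add -b 34 -s 128 -t freebsd-boot da3
gpart add -t freebsd-zfs -l distribution1 da3
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da3
# attach it as a mirror, wait for the resilver to finish, then export
zpool attach distribution gpt/distribution0 gpt/distribution1
zpool status distribution
zpool export distribution
# physically separate the sticks; each now imports as a degraded but working mirror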

You will have the new ZFS filesystem mounted at /distribution. Now install your FreeBSD-STABLE there:

# cd /usr/src
# make installworld DESTDIR=/distribution
# make distribution DESTDIR=/distribution
# make installkernel DESTDIR=/distribution
# echo 'zfs_enable="YES"' >> /distribution/etc/rc.conf
# echo 'zfs_load="YES"' >> /distribution/boot/loader.conf
# echo distribution / zfs rw,noatime 0 0 > /distribution/etc/fstab

Adjust /distribution/boot/loader.conf and /distribution/etc/rc.conf to your needs, as you would for any newly installed system. This USB flash drive is now bootable and contains up to date FreeBSD (with all the latest drivers and fixes).

To make it self-replicating, you can do something like this:

# rsync -aH /usr/src/ /distribution/usr/src/
# rsync -aH /usr/obj/ /distribution/usr/obj/

You will now have the current FreeBSD sources and compiled binaries already. When you need to install to a new system from there, you can just repeat the above procedure. You may also find it useful to add rsync to the USB flash. Just

# cp /usr/ports/packages/All/rsync-3.0.8.tbz /distribution/tmp/
# chroot /distribution
# pkg_add /tmp/rsync-3.0.8.tbz

You may now copy your installation scripts etc.

Don't forget to export the pool when you are done.

# zpool export distribution

You may make any other modifications to the bootable USB flash, in order to have a current system. If in a hurry, you can use it on-site to update/recompile the FreeBSD sources on any new hardware. Sometimes I keep the entire ports tree on the bootable USB flash as well.

Or, you may update it on your 'host' system, like this:

# cd /usr/src
# make update
# make buildworld
# make buildkernel
# zpool import distribution
# make installkernel DESTDIR=/distribution
# make installworld DESTDIR=/distribution
# make delete-old DESTDIR=/distribution
# mergemaster -Fi -D /distribution
# zpool export distribution

You have now updated your bootable USB distribution drive.

One final comment. You do not have to use ZFS on the USB boot flash. You may well just create a UFS filesystem. Some people prefer it this way, so as to minimize ZFS exposure. Using UFS will only change the file system creation/mount commands above.
I believe ZFS is stable enough for such usage and because of aggregate writes it may be faster on USB flash storage, should you decide to do in-place updates.
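
For example, under the same assumptions (da2, label distribution0), the ZFS-specific lines in the script above would roughly become:
Code:
gpart add -t freebsd-ufs -l distribution0 $disk
# gptboot rather than gptzfsboot for a UFS root
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $disk
newfs -U /dev/gpt/distribution0
mkdir -p /distribution
mount /dev/gpt/distribution0 /distribution
with /distribution/etc/fstab containing "/dev/gpt/distribution0 / ufs rw,noatime 1 1" instead of the zfs line, and the zfs_enable/zfs_load lines omitted.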

I may have missed or messed something, as usual :)
 
This is really great, lots of effort in here.

Two questions:
1) You mentioned "...in preparation for dedup..." Aside from dedup which would prefer to run SHA256; any other reason to run it instead of fletcher4? That is, how prone is it to collisions in real life?

2) Since dedup is not currently available on 8.2-RELEASE, will it work retroactively, i.e. asynchronously as well as synchronously, on FreeBSD, making it possible to dedup already-written data? Seeing how this is a massively great feature and all.

Thanks
 