Solved: Recovering from a mistake

I'd like to ask for some help understanding how ZFS is supposed to work... Below I have a mock-up of a shell session where as root, I compile the kernel, get the ports, and try to strategically take some ZFS snapshots.
Code:
# make -j4 buildworld
# make -j4 buildkernel
# zfs snapshot zpool@first-snapshot-bare-kernel
# portsnap fetch
# portsnap extract
# zfs snapshot zpool@second-snapshot-empty-ports-tree
# cd /usr/ports/graphics/graphviz
# make && make install
# cd /usr/ports/devel/doxygen
# make && make install
# cd /usr/ports/lang/ruby27
# make && make install
# zfs snapshot zpool@third-snapshot-pre-kde
# cd /usr/ports/x11/plasma5-plasma-desktop
# make && make install
# zfs snapshot zpool@fourth-snapshot-kde

My trouble is, issuing # zfs rollback zpool@third-snapshot-pre-kde generates absolutely no errors, but # pkg info x11-wm/plasma5-kwin shows that I still have it installed. And all the distfiles that I have downloaded since that third snapshot are still there. I want to go back to the zpool@third-snapshot-pre-kde snapshot and try re-compiling the ports again, starting at that snapshot... I'm a little reluctant to use # zfs destroy, but if that's what does the job, I'll bite the bullet.

Thanks in advance!
 
Do you indeed have everything in one single filesystem? Otherwise you need -r for the snapshot, and for the rollback something like for zf in `zfs list -rH -o name -t filesystem zpool`; do zfs rollback $zf@snapshot; done
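Spelled out over several lines, that loop looks roughly like this (a sketch; "snapshot" is a placeholder, substitute the snapshot name you actually want to roll back to):
Code:
# roll every filesystem in the pool "zpool" back to the same snapshot name;
# "snapshot" here is only a placeholder
for zf in `zfs list -rH -o name -t filesystem zpool`; do
    zfs rollback $zf@snapshot
done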
 
Something doesn't seem quite right to me with your snapshot commands.
Typically you snapshot a dataset:
say you have a pool named mypool and you create a dataset called sheep under it:
zfs create -o mountpoint=/sheep mypool/sheep
then you snapshot the "sheep" dataset:
zfs snapshot mypool/sheep@snap1

What is the output of:
zpool list
zfs list

If you want to snapshot all datasets under a pool, you need to do it recursively:
zfs snapshot -r zpool@first1
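To confirm the recursive snapshot landed on every dataset, a quick check (pool and snapshot names follow the example above):
Code:
# every dataset in the pool should now show a matching @first1 snapshot
zfs list -r -t snapshot -o name zpool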

If your intent was to "snapshot the whole system before installing ports and being able to rollback"
I think you missed adding the "-r".
basically what PMc said.
 
Thanks for the quick reply!

I am indeed trying to snapshot the entire disk / whole system. So yeah, the snapshotting commands on my actual system did include the "-r" option... when I do # zfs list -t snapshot, every dataset shows my snapshots just fine.

I don't have it in my notes, but off the top of my head, I recall that /usr/ports/ directory is a separate dataset...

So, if I get PMc correctly, I will need to repeat the # rollback -r command for every single dataset??? Oh, boy, 13.0-RELEASE's default ZFS setup has nearly 10 datasets...
 
So, if I get PMc correctly, I will need to repeat the # rollback -r command for every single dataset??? Oh, boy, 13.0-RELEASE's default ZFS setup has nearly 10 datasets...
Yes, and beware: the -r for rollback does not mean "recursive". It means <go back more than one snapshot and destroy those in between>. There is no "recursive" option for rollback[*]. Don't complain, I already made up a command line for You that does the whole job.. ;)

[*] Rationale: for snapshot we need a "recursive" option because we want all these snapshots taken atomically, at the very same timestamp, with no filesystem activity in between. On rollback we are in recovery, so it does not need to be atomic.
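To illustrate on a single dataset (a sketch reusing the snapshot names from the opening post, and assuming the ports tree lives on its own dataset such as zpool/usr/ports):
Code:
# plain rollback refuses if newer snapshots exist:
zfs rollback zpool/usr/ports@third-snapshot-pre-kde
# -r destroys the newer snapshots (here, fourth-snapshot-kde) and then rolls back:
zfs rollback -r zpool/usr/ports@third-snapshot-pre-kde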
 
Yes, and beware: the -r for rollback does not mean "recursive". It means <go back more than one snapshot and destroy those in between>. There is no "recursive" option for rollback[*]. Don't complain, I already made up a command line for You that does the whole job.. ;)

[*] Rationale: for snapshot we need a "recursive" option because we want all these snapshots taken atomically, at the very same timestamp, with no filesystem activity in between. On rollback we are in recovery, so it does not need to be atomic.
Thanks, PMc , this makes sense. I'm gonna try that later tonight.
 
Shell globbing is not my thing, but further research confirmed that yes: zfs snapshot, zfs clone, zfs get, etc. are per-dataset, not per-pool. From that, one is expected to draw the conclusion that PMc pointed out to me (once again, thanks, PMc!). In every explanation I found - even in the FreeBSD Handbook - the working assumption was that this important point was already understood, with no need to point it out explicitly. Sure, it was casually mentioned, but I somehow kept missing it within dense text, and never realized just how important it was - until I got stuck on something simple. That kind of scenario is frankly one of the pitfalls of a Computer Science education, even if you do pay attention.
 
For what it's worth, I habitually create, activate then boot a new environment before any routine such as pkg-upgrade(8) or freebsd-update(8).

Boot environments might help with what you're doing, although I do see the value in sometimes rolling back other file systems (for example, to tell the point at which something goes wrong).
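In outline, that habit looks something like this (a sketch; the BE name is just a placeholder):
Code:
# create and activate a fresh boot environment, then boot it
# before running freebsd-update or pkg upgrade in it
bectl create before-upgrade
bectl activate before-upgrade
shutdown -r now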

A recent example:

Code:
% bectl list -c creation | grep n247798-f39dd6a9784
n247798-f39dd6a9784-a -      -          14.5M 2021-07-08 07:54
n247798-f39dd6a9784-b -      -          6.58M 2021-07-08 12:12
n247798-f39dd6a9784-c -      -          722M  2021-07-09 19:25
n247798-f39dd6a9784-d NR     /          5.92G 2021-07-09 21:06
n247798-f39dd6a9784-e -      -          45.6M 2021-07-10 04:31
% uname -KUv
FreeBSD 14.0-CURRENT #100 main-n247798-f39dd6a9784: Thu Jul  8 07:38:23 BST 2021     root@mowa219-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  1400025 1400025
%

The most recently created environment is not active because I know that it's one of two that are bugged.

For each of the two bugged environments:

Code:
% bectl list -s -c creation | grep n247798-f39dd6a9784 | grep -e "-a" -e "-e"
n247798-f39dd6a9784-a
  copperbowl/ROOT/n247798-f39dd6a9784-a                         -      -          8K    2021-07-08 07:54
n247798-f39dd6a9784-e
  copperbowl/ROOT/n247798-f39dd6a9784-e                         -      -          9.75M 2021-07-10 04:31
%

– I could place a ZFS hold on the snapshot but in this case, I'll not bother. I have a bug report to remind me which boot environments are bugged.

<https://www.freebsd.org/cgi/man.cgi?query=zfs-hold&sektion=8&manpath=FreeBSD+14.0-current>
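For completeness, the hold workflow is roughly the following (a sketch with placeholder tag and snapshot names):
Code:
# tag the snapshot so it cannot be destroyed, list its holds, release it later
zfs hold keep pool/ROOT/example-be@example-snapshot
zfs holds pool/ROOT/example-be@example-snapshot
zfs release keep pool/ROOT/example-be@example-snapshot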
 
grahamperrin This link from vermaden has steps to install into a boot environment. Basically you create the new BE, mount it, chroot into it, do the freebsd-update steps, do the pkg upgrade steps, exit chroot, unmount the BE, activate new BE and reboot.
Similar to the way TrueOS was doing the updates for a while way back.
It works pretty slick. I think it's a bit easier across versions (when you are using -r): you only reboot once, and you can clean up packages while you're in the chroot, which is nice when, say, python37 goes EOL.
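Roughly, that procedure looks like this (a sketch; the BE name and mount path are placeholders, and the exact freebsd-update/pkg steps depend on the upgrade being done):
Code:
bectl create newBE              # clone the running boot environment
bectl mount newBE /mnt          # mount it somewhere convenient
chroot /mnt
# ... run the freebsd-update and pkg upgrade steps here ...
exit
bectl unmount newBE
bectl activate newBE
shutdown -r now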

 
PMc 's basic idea (the fact that zfs rollback is per-dataset) worked like a charm. I even understood the shell globbing command that PMc provided. To avoid re-downloading the distfiles, I got 'em backed up to a USB stick :p Now THAT is convenient.
 
In the words of the immortal bards, Monty Python:
"...and there was much rejoicing"

Glad you got it figured out. It's always weird when you think a command should be symmetrical and it's not.
"create recursive snapshots of a zpool" but "what I have to rollback every single dataset under zpool, not just recursively rollback zpool"
 
A bit of a follow-up to continue the conversation: extending my opening post, would the following sequence of events even be possible, or are there some pitfalls to watch out for?
  1. Create a new Boot Environment, say test.
  2. Boot into test.
  3. Roll back to "third-snapshot-pre-kde"
  4. Install say, GNOME.
Would I end up with KDE in the default BE and GNOME in the test BE, and be able to boot between the two? Following that same logic, it could very well be possible to have, say, one BE for Wayland and another for Xorg... Is that actually the case?
 
Would I end up with KDE in default BE and GNOME in test BE, and be able to boot between the two?
Typically /usr/local is on a separate file system from the root, so a BE wouldn't really cover it.

You could have it on the same file system, and then the BE would, and yes, it'd work as you expect.
 
Typically /usr/local is on a separate file system from the root, so a BE wouldn't really cover it.

You could have it on the same file system, and then the BE would, and yes, it'd work as you expect.
bectl(8) says that
ZFS boot environments are bootable clones of datasets.
Your comments suggest that only the root dataset (whose mountpoint is /, as shown by # zfs list) is cloned when creating the BE.

That same bectl(8) says that # bectl list -s will also list snapshots - which suggests to me that it may be more accurate to describe a ZFS BE as a bootable clone of the entire zpool on the disk. What further supports my thinking is the idea that # bectl list -a will also list all datasets.

Man, I hope vermaden can weigh in and tell me if I'm on the right track... Before this thread becomes too long to read and establish context. He practically wrote the whole thing, after all, so he'd know.
 
… it may be more accurate to describe a ZFS BE as a bootable clone of the entire zpool

No. This real-world example (including the responses) might help you to understand what a boot environment normally comprises: zpool import with an altroot followed by mount of a boot environment from within the pool : freebsd

… entire zpool

pool (not zpool).

… have one BE for Wayland, and another for Xorg. …

Why separate the two?
 
Your comments suggest that only the root dataset (whose mountpoint is /, as shown by # zfs list) is cloned when creating the BE.
That is correct.

Boot Environments are meant to be clones of your operating system, such as the $POOL/ROOT/default dataset.

/usr/local isn't really your operating system, but feel free to make it part of the root filesystem dataset if you really wish.
 
BE is a clone of the root dataset; that is correct. BUT other datasets that don't have explicit mountpoints may actually wind up under the "/" mountpoint.
Things needed during system startup may live under /usr/local. That's why the paths for init/rc stuff include /usr/local/etc/rc.d.

The best way to see is to use the zfs list command. Here's output from a system that has no datasets beyond what the installer creates (plus a couple of upgraded BEs):
Code:
zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
zroot                       43.7G   179G    88K  /zroot
zroot/ROOT                  13.7G   179G    88K  none
zroot/ROOT/13.0-RELEASE-p2   632K   179G  5.53G  /
zroot/ROOT/13.0-RELEASE-p3  13.7G   179G  5.56G  /
zroot/ROOT/default           908K   179G  5.75G  /
zroot/tmp                   2.23M   179G  2.23M  /tmp
zroot/usr                   29.9G   179G    88K  /usr
zroot/usr/home              29.2G   179G  29.2G  /usr/home
zroot/usr/ports              757M   179G   757M  /usr/ports
zroot/usr/src                 88K   179G    88K  /usr/src
zroot/var                   1.60M   179G    88K  /var
zroot/var/audit               88K   179G    88K  /var/audit
zroot/var/crash               88K   179G    88K  /var/crash
zroot/var/log               1.12M   179G  1.12M  /var/log
zroot/var/mail               112K   179G   112K  /var/mail
zroot/var/tmp                120K   179G   120K  /var/tmp

Take a close look: there are no datasets with a mountpoint of /usr/local. That means that /usr/local is actually part of the initial root dataset, so every new BE created from that has /usr/local as part of it. Notice too that there is no /var/db mountpoint, so that is also part of your root dataset (and then new BEs).

So in my opinion, going back to post #15, astyle, yes, I think that would work the way you intend. Just be careful with "rolling back to a snapshot"; that may have unintended effects depending on exactly what the snapshot was taken of.
Assuming the BE you create the new test BE from has KDE installed, I would reboot into test, pkg delete the KDE components, and then pkg install the GNOME components. That would be similar to the vermaden upgrade procedure I linked to. The implication is that you may not need to reboot into the test BE at all: you could simply mount the test BE, chroot into it, pkg delete the KDE components, and pkg install GNOME (see the sketch below).
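That route could look roughly like this (a sketch; the package names are only examples, check pkg info for what is actually installed):
Code:
bectl create test
bectl mount test /mnt
chroot /mnt
pkg delete plasma5-plasma-desktop   # whichever KDE packages were installed
pkg install gnome                   # example GNOME metapackage
exit
bectl unmount test
bectl activate test
shutdown -r now                     # reboot whenever you want to switch to the test BE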
 
bectl(8) says that

Your comments suggest that only the root dataset (whose mountpoint is /, as shown by # zfs list) is cloned when creating the BE.

That same bectl(8) says that # bectl list -s will also list snapshots - which suggests to me that it may be more accurate to describe a ZFS BE as a bootable clone of the entire zpool on the disk. What further supports my thinking is the idea that # bectl list -a will also list all datasets.

Man, I hope vermaden can weigh in and tell me if I'm on the right track... Before this thread becomes too long to read and establish context. He practically wrote the whole thing, after all, so he'd know.
With the default FreeBSD install on ZFS, /usr/local IS INCLUDED in the ZFS Boot Environment.

The canmount parameter is important here.
Code:
% zfs get canmount                                                      
NAME                  PROPERTY  VALUE     SOURCE
zroot                 canmount  on        default
zroot/ROOT            canmount  on        default
zroot/ROOT/13.0       canmount  on        default
zroot/ROOT/13.0@safe  canmount  -         -
zroot/ROOT/13.0.safe  canmount  off       local
zroot/home            canmount  on        default
zroot/tmp             canmount  on        default
zroot/usr             canmount  off       local
zroot/usr/ports       canmount  on        default
zroot/usr/src         canmount  on        default
zroot/var             canmount  off       local
zroot/var/audit       canmount  on        default
zroot/var/crash       canmount  on        default
zroot/var/log         canmount  on        default
zroot/var/mail        canmount  on        default
zroot/var/tmp         canmount  on        default

What else do You need, mate?
 
I really appreciate everybody's responses here! I was able to learn a lot and tie up some loose ends for myself. Now I know why the default mountpoints for the default ZFS install were chosen to be what they are - some thought was put into that, too, and it affects things down the road. For example, if I want to share Poudriere's output between BEs and hosts, it would be helpful to:
  • Give it a separate dataset and mountpoint (see the sketch below)!
  • Pay attention to exactly when I take snapshots and create new BEs!
A classic example of correlating machine states to directed acyclic graph nodes, and frankly, Computer Science basics. FreeBSD's "ports vs. packages" is another concrete example of that.
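For the first bullet, something like this could work (only an illustration; the pool name, dataset name, and mountpoint are assumptions to adapt to wherever Poudriere is actually pointed):
Code:
# a dataset outside zroot/ROOT/* is not cloned per-BE, so its contents are shared
zfs create -o mountpoint=/usr/local/poudriere zroot/poudriere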
 
With the default FreeBSD install on the ZFS the /usr/local IS INCLUDED in the ZFS Boot Environment.
Just noticed something: Everybody's # zfs list shows that /usr is a separate dataset!!! IIRC, that's the default.

Going by that logic, wouldn't /usr/local be a part of the /usr dataset (and therefore not part of the ROOT dataset)?
 
Good thought, but look at what vermaden posted:
zroot/usr canmount off local

"canmount" is off. By having that off, "any dataset under /usr without it's own dataset is part of the root dataset".

if canmount was on, then you would be correct.

Look at /var: same thing, canmount off. So /var/db is part of the root dataset, but /var/log is not.
That's why "don't put your mysql database in /var/db/mysql"
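An easy way to confirm which dataset a given path actually lives on (a quick check on a default layout):
Code:
# df shows the backing dataset for each path
df /usr/local /var/db /var/log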

Fun stuff, eh?
 