ZFS: undestroying a destroyed ZFS dataset & snapshots

For some extra context about the story, I'll refer you to my previous thread, geom-labels-disappear-after-geli-attaches-to-raw-disk-device.89579.
I'm writing this before having searched the webs, because I'm quite certain I'm going to have to write something myself and was hoping that I'm not the first to tackle this problem.

Another matter I'm trying to address is that I have a ZFS volume (on a dd image now, made before encryption) where I used to have quite a large bunch of snapshots in order to safeguard my data. Sadly this system was used against me: snapshots were manipulated in order to delete data (that would incriminate the hackers) and then the data was rewritten using a modified version of a script I once wrote myself. That's one of the worst parts of this hack: my own knowledge, turned into scripts, was turned against me.

As the subject clearly implies, what I'm looking for is a way to restore a ZFS dataset and/or all snapshots on that dataset from stray metadata. I'm quite certain this should be possible: the disks this dataset was on were of ample size, and combined with one of ZFS's key features - copy-on-write - it should be possible to recover lost datasets & snapshots by simply scanning all (unused) blocks. On top of that, metadata is kept in several copies, so I'm quite certain there's a good chance of recovery.
Like I said, I'm not sure if a tool to do this already exists, but ZFS has been around for quite some time and I was hoping that either someone has heard of a tool that does more or less this or has written something that may provide me a running start.
 
As a bit of follow-up from my side: zpool history shows me all the destroyed snapshots together with their transaction ids. From there I can also gather that a certain number of snapshots have been rewritten ... a rather simple zpool history -il r | grep "snapshot <dataset>" | grep -v @zfs-diff | cut -w -f 4 | sort | uniq -c | grep -v " 1 " shows me all of those that are not unique (but should be).
Now comes of course the biggest hurdle and that is actually restoring them. But something tells me that shouldn't be impossible since every transaction id is unique.
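The same check can be sketched against canned history output. The history lines, pool, and dataset names below are all made up, and awk stands in for FreeBSD's `cut -w -f 4` so the sketch runs on any POSIX system:

```shell
# Hypothetical sample of `zpool history -il` output (format approximated);
# real output also carries the committing user/host per line.
cat <<'EOF' > history.txt
2024-03-01.10:00:00 [txg:123456] snapshot tank/home@auto-10:00 [user 0 (root)]
2024-03-01.10:01:00 [txg:123460] snapshot tank/home@auto-10:01 [user 0 (root)]
2024-04-15.02:11:07 [txg:987654] snapshot tank/home@auto-10:01 [user 0 (root)]
EOF
# A snapshot name created more than once means it was destroyed and
# re-created, i.e. rewritten. awk replaces the FreeBSD-only `cut -w -f 4`.
awk '$3 == "snapshot" { print $4 }' history.txt |
  grep -v '@zfs-diff' |
  sort | uniq -c | grep -v ' 1 '
```

Here the pipeline flags tank/home@auto-10:01, which appears twice in the history, while the once-created tank/home@auto-10:00 is filtered out.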

Is it possible - I am still backing up everything in order to test this without destroying any data - to do something like zfs restore <tx>? The manual page doesn't show this, but I know from looking into the sources that many functions can work with a tx id. I'm perfectly fine with coding something myself, but I have a feeling that I won't have to; I just can't locate it in my brain. Does anyone know of a way to achieve this?

FYI: I know I'm doing things that aren't exactly mainstream, but really, from a ZFS user's point of view, being able to restore a destroyed snapshot holds a lot of value.
 
Snapshots are not a backup. Backup must be stored on other hardware preferably at another site location and regularly tested for restore.

For the dataset it may be impossible to restore, because after it gets destroyed its space is freed and reused for other data stored in the pool. Here's a more detailed explanation:
 
Snapshots are not a backup. Backup must be stored on other hardware preferably at another site location and regularly tested for restore.
Backups were made and overwritten by the modified snapshots. No dice there.
For the dataset it may be impossible to restore, because after it gets destroyed its space is freed and reused for other data stored in the pool. Here's a more detailed explanation:
Copy-on-write is one of the core mechanisms of ZFS and I'm quite sure that on top of that ZFS tries to distribute writes (on mechanical hard drives) across the disk.
Also, metadata is kept in multiple locations - 2 in my case - so the chances for recovery are actually pretty good as I explained.

A quick look at zdb tells me it allows some manipulations based on transactions; I'm going to take a deeper look into it. If I have to dump an entire disk, overwrite, recreate snapshots, etc., I'm fine with it. Of course I'd go to a different disk to prevent overwriting sectors :)
 
Backups shouldn't be 'overwritten'. Backups are written once then stored (preferably off-site) for an X amount of time. You NEVER overwrite your previous backup.
We can have a semantic discussion about backups, which isn't relevant to this thread. Things are what they are. Still, I'll explain, adding to the noise.
My backups consisted of a snapshot of my home directory every single minute, provided any changes had been made. These snapshots are then culled into hourly/daily/monthly etc. versions. They are then copied to several on- and offline servers/disks using zfs send -ReLPcw ... and zfs receive -Fu. These commands overwrite and delete snapshots by design, so yes, backups in my case were overwritten. Otherwise I would have had a snapshot on my backups for every single minute that my PC was on.
However, after those "nice people" decided to rewrite my snapshots ON EVERY SINGLE ONE OF MY 3 COPIES while I was away for a month I did a bunch of new zfs send -ReLPcw commands that overwrote the last good copy of my data.
Did I do the careless thing by doing so, even though I should have noticed that more data was being transferred this time? YES, of course I did. And I know very well that I did. I know this because I've spent weeks poring over all of my disks and servers looking for my data. Trust me, I'm very well aware of the fact that I no longer have the original snapshots anywhere, only the rewritten ones, except in transaction ids and the output of zpool history. But making statements about what backup strategy is or isn't good isn't going to help me, except that it might bump the thread and draw some attention from people who have an actual answer.

I'm simply asking if someone has something close to what I'm looking for, a way to restore datasets/snapshots based on transaction id. If it can be done using existing tools in base, great. If I have to write something myself, that's fine. I've written code on top of libzfs* before, in fact that's what my little snap/zap/(/send - unimplemented) utility called vize had to do as well. Of course if I had been able to finish that program this never would have happened, it would have detected a different UUID on these snapshots and warned me about it. But again, things are what they are. I do not possess the ability to go back in time, I can only look forward and work with what I have.

* Please don't go telling me that libzfs isn't supposed to be used for user-type of tools. I know this and I don't care, in fact I wholeheartedly disagree with it. I'll write my system utilities on top of any library that I please. Nonetheless, it's not conducive to this thread to elaborate on this until I have a working solution in hand ;)
 
For the record, I'm very well aware that my post may have come across as rude. But let's face it, I can't allow this thread to derail into a discussion about something at best tangentially related. I hope most people here, at least the ones that have seen something from me before, know very well that I don't do useless discussions. I try to help people, albeit with varying success, with as much to-the-point remarks as I can muster. I hope people don't compare me to the daily troll asking senseless questions in an effort to make FreeBSD look worse than Linux, or to the weekly poster of "(Free)BSD is dying".

And let's be honest, this forum does have a serious tendency to get derailed in its posts, including the serious ones. I know, most of it is well-meant, but I have seen threads where I might have been able to chip in with something useful and just chose not to because I know it wouldn't stand out within the noise anymore. Sometimes it's better not to answer something so that the actual relevant answers come popping up.

In fact, one of the few things I really like about StackOverflow (and family) and even sometimes (don't hate me) Reddit is a mechanism to prop certain posts up towards the top. I know this isn't possible in a forum, because it would destroy continuity, but it does have benefits for sure.
 
And to get back to the point, something tells me that zdb together with the -e and/or other flags might become my new best friend. I still have to find a way to get from transaction id to something more useful like the GUID though and then hopefully I can use the GUIDs to restore snapshots.
 
Y'know, what SirDice said is actually extremely succinct and relevant to the issue at hand.

To start:
My backups consisted of a snapshot of my home directory every single minute, provided any changes had been made. These snapshots are then culled into hourly/daily/monthly etc. versions. They are then copied to several on- and offline servers/disks using zfs send -ReLPcw ... and zfs receive -Fu. These commands overwrite and delete snapshots by design, so yes, backups in my case were overwritten.
Just bad design from the get-go, and it bit you in the ass just now.

Just as simple as that - the overwrites by design make it impossible to recover from a mistake, be it your own, the users', or that of the 'nice people' you trusted. ZFS is actually irrelevant in this case. A good backup system makes it possible to recover from mistakes. The very design of your installation has defeated the very purpose of having backups in the first place. I would strongly recommend redoing the whole backup system from the ground up, and googling "Best Practices for Backup Systems".

If the snapshots are overwritten (potentially with bad data), there's nothing to restore from. No amount of coding will help. So: redo your backup system from the ground up, with the goal of having some good data that you can restore from, no matter what.

Really looks like the idea of 'rotating backups' took priority over 'actually usable backups'. 😩
 
The only reliable and supported way to get back destroyed datasets is by rolling back to a zpool-checkpoint(8), but this is not something you would normally have.

If you are quick and turn your power off within 5 seconds of a destroy, you might have a decent chance of using zpool-import(8) with the -T option to get back an accidental destroy. This becomes increasingly unlikely to work over the roughly two minutes following a destroy, since the uberblock entries in the labels rotate and older copies that point to an older MOS eventually get overwritten.
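For reference, the two rewind paths can be sketched as commands. The pool name tank, device /dev/ada0p3, and txg 123456 are all made up; the commands are left commented out on purpose, because they have to be aimed at the real (or better, a dd-copied) pool:

```shell
# Safety net, taken BEFORE a destructive change ever happens:
#   zpool checkpoint tank
# ...and to undo a later mistake (discards everything after the checkpoint):
#   zpool import --rewind-to-checkpoint tank

# Without a checkpoint: list which txgs the vdev labels still reference...
#   zdb -ul /dev/ada0p3 | grep -E 'Uberblock\[|txg'
# ...then attempt a read-only, unmounted import rewound to one of them:
#   zpool import -o readonly=on -N -T 123456 tank
```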

From everything said in this thread, you did not suffer a mere accident, but a targeted attack and you could not have pulled power in such a short time of it happening, nor would your attackers have been nice enough to create a checkpoint before messing with your data. On top of that, your backup system is designed to destroy snapshots on the remote system without regard for what you've done on the origin. It replicates mistakes and attacks equally and no longer has your old data.

If you'd like to write your own program to scan for MOSes that aren't referenced by the current uberblocks, go for it. Be warned that recovery in your scenario is extremely unlikely, and ZFS probably did reuse some of the newly-freed space that your old data referenced. Double whammy if you used SSDs with autotrim turned on.
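Such a scan would start from the uberblock magic, 0x00bab10c, stored little-endian on disk. A toy illustration (the image file and the planted offset are fabricated; real uberblocks sit in the four 256 KiB label regions, two at each end of the device):

```shell
# Fabricate a small "disk image" with the 4-byte uberblock magic
# 0x00bab10c (little-endian on disk: 0c b1 ba 00) planted at offset 4096.
dd if=/dev/zero of=disk.img bs=512 count=16 2>/dev/null
printf '\014\261\272\000' | dd of=disk.img bs=1 seek=4096 conv=notrunc 2>/dev/null
# Report byte offsets of candidate uberblocks. The pattern is only the
# first three bytes because a NUL byte cannot be passed in a shell
# argument; a real scanner would read and verify the full structure.
grep -abo "$(printf '\014\261\272')" disk.img | cut -d: -f1   # prints "4096"
```

A serious version would be a small program that walks the image in 1 KiB steps, checks the full magic in both byte orders, and then sanity-checks the txg and checksum fields of each candidate.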
 
Be warned that recovery in your scenario is extremely unlikely, and ZFS probably did reuse some of the newly-freed space that your old data referenced.
Correct; while ZFS is copy-on-write, that doesn't mean WORM: any blocks no longer referenced by a filesystem or snapshot after the COW completes are added to the free-to-be-written-on list.

How soon they are written likely depends first on how much new data is being written, and then on physical layout issues (how large the contiguous free region is and how large the incoming writes are, for example). Other than avoiding needless fragmentation, I’m not aware of any mechanism/back-pressure in ZFS to prefer writing to “never used” vs “no longer used” space.

And if you are on SSDs with autotrim, you can pack up your tools now.
 
From everything said in this thread, you did not suffer a mere accident, but a targeted attack and you could not have pulled power in such a short time of it happening, nor would your attackers have been nice enough to create a checkpoint before messing with your data.
I would not attribute to malice something that can be explained with mere stupidity.
 
I would not attribute to malice something that can be explained with mere stupidity.
I am assuming good faith and that the topic creator has accurately described the scenario. Nothing of what I said would change if that detail were different.
 
astyle, I disagree, and for one simple reason: no matter what kind of perfect backup system I would have had, if someone with malicious intent had overwritten all backups and I were to sync data from said backups, my data would have been toast too.
That said, my backup system was work in progress and it did not revolve around rotating snapshots, but rather used culling - cutting away certain snapshots to save space (i.e. I had only a single snapshot left for 2019, but had a snapshot every minute for the past few days etc). Call them incremental backups if you will, but with the benefit that it's possible to cut a few out in-between other incremental backups. This whole scenario would have been impossible had the system been finished, but that's not the point. I was doing temporary manual synchronizations and I made a few mistakes while doing it. It wasn't finished and it's not going to be retroactively finished in the past no matter how many times people tell me I made a mistake. I cannot time travel and I presume neither can you. Let it rest, don't divert the topic any further, please.
I am assuming good faith and that the topic creator has accurately described the scenario. Nothing of what I said would change if that detail were different.
Thank you, and I did. There's more to it, but these are parts that I can't and probably never will share publicly.
And if you are on SSDs with autotrim, you can pack up your tools now.
One of the backups was on an SSD: a 1 TB SSD for 161 GB of data, 324 GB including all snapshot data. I'm not that worried that it's impossible. Another was on a hard drive, the rotating-disc kinda thing. I'm also not that worried about that one. Most snapshots are extremely small too; remember, one a minute. In many cases that means that a few metadata blocks also contained embedded data.
 
Adding a slight bit to this: the 1 TB SSD was completely zeroed before taking it into commission, less than 1.5 years ago. Irrelevant, but this is always good practice with a new drive, to see if any errors pop up in smartctl before actually putting real data on it.
After copying it with dd to a sparse ZFS volume (zvol), the zvol occupies about 660 GB. Both compressratio & refcompressratio are 1.00, meaning there is no compression and all of the disk space savings are due to the volume being sparse. Being about twice the data size sounds about right given how the snapshotting has occurred over the past 1.5 years, including the rewriting of snapshots by these miscreants.
Knowing that a third of this disk has never been touched by ZFS before, I'd dare say there's a reasonable chance for most if not all of the data to still be there.

If only I wasn't still preoccupied with BIOS troubles. Anyway, if this gives someone an idea I'd love to hear all about it!
 
The whole point of ANY backup system is to have several copies of good data, in different places. If you can't restore from one copy for any reason, that's OK, there's another place you can restore from.

Of course, the downside is that you need tons of disk space (which is expensive, no doubt), and the need to spend time on manual maintenance (and actually paying attention to making sure the system works as intended). Your ROI is peace of mind that yeah, you can restore the data (even if it's a little old), and actually have good options to choose from.

Yeah, no system is perfect. Any suggestion in here has downsides - ZFS can have a steep learning curve for some people, securing data storage space for your backups can be a logistical or financial nightmare, making sure the people you trust with that stuff can actually be trusted... and many more downsides than what I listed here.

High-end consumer-grade SSDs (Like Samsung EVO or WD Black, popular brands among benchmarker blogs) with a 1 TB capacity can be had for around $100 (or even less) on Amazon.

There's a difference between chasing perfection and simply avoiding disastrous scenarios. All my suggestions are made with the idea that disaster scenarios should be easy to avoid (and recover from), especially if they are visible a mile away.
 
astyle No system is perfect, correct. But please stop derailing the thread, it's not about the perfect backup scenario nor is it about your or even my personal backup preferences. It's not about the cost of SSDs and it's not about what happened. It's all and only about data (snapshot) recovery on a ZFS volume...
 
astyle No system is perfect, correct. But please stop derailing the thread, it's not about the perfect backup scenario nor is it about your or even my personal backup preferences. It's not about the cost of SSDs and it's not about what happened. It's all and only about data (snapshot) recovery on a ZFS volume...
which you overwrote as a direct consequence of poor design choices for the backup system.
 
which you overwrote as a direct consequence of poor design choices for the backup system.
No, I didn't. Bad people did. IF people had had access to your backups and wanted to do the same they would have done so too. These people did their homework, any off-site archival storage would have been attacked with all means as well. Nothing you can propose would have saved me and it surely isn't going to save me post-factum.

Fun fact: what you're proposing here is just as flawed because you're talking/writing about hard drives. I'm already in the process of researching what I'm going to do in the future and I can tell you one thing: hard drives aren't going to be part of the actual backup plan. Hard drives (well, mostly SSD's & NVME's) will be used only for the working copy (kept in sync with ZFS snapshots) and as a first rapidly available quick backup (again, based on my current software). However, the long term storage will be on optical media, either BluRay or maybe, but it's proprietary, Archival Disc. That long-term storage will also be based on ZFS, I'll be keeping every single snapshot of my data on disc for as long as I physically can. Again, I told you my system was work-in-progress, not by any chance the final solution. However, with my current experience I've added some extra requirements as you might be able to imagine.
Whatever you're proposing has the exact same weaknesses as the system you're trying so zealously to denounce: data can be modified on hard drives. Everything that is not immutable is a terrible choice for a backup system. Or to use the words you so many times uttered and agreed with:
Backups shouldn't be 'overwritten'.
Just as simple as that - the overwrites by design make it impossible to recover from a mistake,
If you're going to criticize someone, at least make sure you don't start contradicting yourself. Otherwise I'd have to add the following:
I would not attribute to malice something that can be explained with mere stupidity.
Yours Truly,
OP

Now, if you want to discuss backup strategies I will very happily do so with you in any thread you start. In fact, once this is all over I'll be starting my own thread to explain what I'm doing and ask if people see any improvements. However, right now that is not on the top of my priority list so there's no point in reiterating your fallible wisdoms.
 
IF people had had access to your backups and wanted to do the same they would have done so too. These people did their homework, any off-site archival storage would have been attacked with all means as well. Nothing you can propose would have saved me and it surely isn't going to save me post-factum.
In a well-designed system, there would be no possibility to delete backups. Especially not off-site.
 
In a well-designed system, there would be no possibility to delete backups. Especially not off-site.
People can be bought and all facilities including the most secure ones (which I cannot afford by a longshot) are staffed by people. You're being extremely naïve - like I was until just a few months ago - if you think people can be trusted. Anyway, I will try to stop responding to off-topic posts.
 
However, the long term storage will be on optical media, either BluRay or maybe, but it's proprietary, Archival Disc. That long-term storage will also be based on ZFS,
ZFS is really not a good fit for optical media, because writes are horrendously slow, and there's very little error correcting. SSD's, OTOH, are a match for ZFS.

Optical media is a good choice for preserving stuff like movies and music. Another shortcoming of optical media - if you don't set up your burning session correctly, you can have a 500KB Word file on a 700MB CD-R that you can't write to ever again.

If you have some old data that you know won't be added to at a later date - then Archival Disc is not a bad choice (but still an expensive one, more than SSD's).

Your best bet is to set up replication to a few 1TB SSD's. And if you don't trust people in a commercial data center (like Google's, they do have backup racks for rent) to take care of the stuff, can you trust yourself to set up something offsite with a few spare SSD's? 😈
 
ZFS is really not a good fit for optical media, because writes are horrendously slow, and there's very little error correcting. SSD's, OTOH, are a match for ZFS.
For the record, again, all work in progress. I'm sure I'll find a way; the important thing is that I could restore my entire volume including all snapshots from optical media. Error recovery can be done using CRCs, so no issue there. Also, I wouldn't be doing ZFS on optical media; I would be copying my ZFS snapshots (most likely as files) onto optical media. ZFS is incredibly versatile.
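Storing streams as files can be sketched like this. The file stream.zfs stands in for real `zfs send -R` output, and the sizes are shrunk so the example is self-contained (on FreeBSD, sha256 replaces GNU sha256sum):

```shell
# A dummy file stands in for `zfs send -R tank/home@snap > stream.zfs`.
dd if=/dev/urandom of=stream.zfs bs=1k count=64 2>/dev/null
# Split into disc-sized pieces (25 GB for single-layer BD-R in practice;
# 16 KB here) and record checksums to burn alongside the pieces.
split -b 16k stream.zfs stream.part.
sha256sum stream.part.* > SHA256SUMS       # FreeBSD: sha256 -r stream.part.*
# Restore path: verify every piece, reassemble, and (in real use) pipe
# the result into `zfs receive`.
sha256sum -c --quiet SHA256SUMS
cat stream.part.* > restored.zfs
cmp -s stream.zfs restored.zfs && echo "stream intact"   # prints "stream intact"
```

One caveat worth knowing: a saved send stream is all-or-nothing on restore, so the checksum manifest matters - a single corrupt piece invalidates the whole stream, which is exactly why each disc should carry both the pieces and the manifest.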
Optical media is a good choice for preserving stuff like movies and music. Another shortcoming of optical media - if you don't set up your burning session correctly, you can have a 500KB Word file on a 700MB CD-R that you can't write to ever again.
Correct, but that's the point of sessions, simply adding on top of an existing disc. I'll still need to verify I can't delete data, but I'm sure my restoration scripts could take care of that.
If you have some old data that you know won't be added to at a later date - then Archival Disc is not a bad choice (but still an expensive one, more than SSD's).
I'm more worried about procuring a reader/writer 20 years from now; old hardware has a tendency to become extremely expensive when it's no longer produced but still required by some. BD is already up to 4 layers (128 GB) and writers are readily available. And should compatibility become a foreseeable problem, I'll simply go single-layer.
Your best bet is to set up replication to a few 1TB SSD's. And if you don't trust people in a commercial data center (like Google's, they do have backup racks for rent) to take care of the stuff, can you trust yourself to set up something offsite with a few spare SSD's?
Nope, unless it's inside a datacenter I and only I have access to, inside a building that I own, it's not a viable option. In fact even worse: when you trust a company there's no way to verify if they aren't accessing the data or systems without your knowledge. Security is not just about the system itself, it's also about the building it's in, the people who can access the building and system etc.
Anyway, like I said, work in progress and not my priority :)
 
Ahh, what I had in mind is frankly a spare machine in your office that is connected to the network, and is used only for your backups. It's not that hard to set it up so that only you have access to it.

Seems like there's a need to re-evaluate who has access to backups, and why. A good design takes that into consideration.

If you want to be able to restore from a ZFS snapshot (which is what this thread is primarily about), all you need is a couple spare machines with those 1 TB SSDs inside. It doesn't matter if they are in your office, your home, or rented from a datacenter.

Offsite backup can mean a plethora of options. If commercial datacenters make you uncomfortable, you can just set up a spare machine at your own home to mirror your work backups. Frankly, that's what many small shops do for offsite backups.
 