ZFS Undestroy destroyed ZFS dataset & snapshots

Let me summarize what I've read in this thread, and the parent thread (about GELI), and the one about making copies of a ZFS volume using dd.

You run a system that uses a very peculiar "backup" strategy, one that is inherently completely broken in the sense that the backups are designed to be modified and overwritten in place. Well, that was a big mistake, but we all have to learn by first making mistakes. In spite of having a very paranoid attitude towards using computers (for example, having a complex but mis-begotten backup system), you used the same password for everything; that seems so implausible that I'm hard pressed to believe it. Then a set of adversaries took over your system for a month, and performed a series of actions that taken together sound like something right out of a science fiction movie. They hacked your cable modem (so even today you think they're listening for what you do, and are spoofing you), and you haven't fixed that. They edited your facebook posts. They understand internals of ZFS and administration of FreeBSD well enough to perform detailed modification of your file system. They actually set a perfect trap for you, making sure that your own use of backups after they have finished the hack will do you in.

And the next thing is where this turns from a difficult-to-believe story into a fever dream: When you came back from your month-long absence, instead of immediately seeking help from law enforcement and not touching the computer, you started messing around madly, copying partitions using dd in the most bizarre fashion, re-encrypting things, and starting on the very complex journey of wanting to develop a "ZFS time machine", which is what this thread is about. This is insane. Here is my advice: Stop. Get help from professionals. Contact law enforcement about the hack (which if it happened as you described is a serious crime). Consult with a good lawyer about what your options for redress against the hackers (whom you seem to know) are. Abandon the existing system, get a new set of hardware and OS, and start from scratch. And consult with mental health professionals, to see whether your ability to make good decisions is being negatively affected by a psychological or medical problem that should be treated.

Now to the topic you asked about above. You say (correctly) that ZFS typically doesn't overwrite information (both file data and file system metadata) in place, and that it often writes multiple copies of metadata to disk. You wish to create a tool which acts like a time machine does in science fiction: it rewinds the state of a ZFS file system to a given point in time in the past (which you will specify as a transaction ID). To begin with, this is not in general completely possible, as the scattered bits of ZFS metadata are not guaranteed to be complete. It is also in general not unique: If you have the current state of a bit of ZFS metadata, and some earlier version found randomly on disk, there are usually many possible ways that older version could have evolved into the current state. As a hypothetical example: You find that a directory has 10 entries today, files named a ... j. You find an older copy of the metadata for the directory (but not its content), saying that it had 11 entries a week ago. You also find 1000 file metadata entries on disk that were unlinked within the last week. You have no idea which of the 1000 files was the 11th entry a week ago. In order to replay "the log" (the incomplete and unordered sets of metadata), you have to try out all 1000 possibilities. That seems easy, since computers are pretty fast and capable of running many things in parallel. But then you immediately get into another problem: The other 999 unlinked files could be in any other directory, for which you have not found older metadata. If you have 10,000 directories in your system, there are $\binom{10000}{999}$ ways those could be arranged (where the notation means the binomial coefficient), and that number is astronomically large. 
To put it differently, the task of guessing which path an unknown file system modification took requires checking a very large number of combinations, and like most combinatorial problems, it is intractable (learn about complexity theory and NP-hard problems). Certainly, with some research it would be possible to create heuristic solutions to this problem, but that's in the realm of a major CS research problem. In my opinion, this is the kind of thing a grad student could do as a PhD thesis.
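
To get a feeling for the size of these numbers, here is a rough sketch in Python. It uses only the figures from the hypothetical example above (1000 unlinked files, 10,000 directories). The second count, where each of the 999 remaining files may independently land in any directory, is an alternative assumption that is not part of the example; it is shown only because it is an even looser bound. The names are made up for illustration.

```python
import math

DIRECTORIES = 10_000       # directories in the file system (figure from the example)
UNLINKED = 1_000           # unlinked file metadata entries found on disk
REMAINING = UNLINKED - 1   # files left over once one is chosen as the 11th entry


def digits(n: int) -> int:
    """Number of decimal digits of a positive integer."""
    return len(str(n))


# Count used in the example: the binomial coefficient "10000 choose 999".
binomial = math.comb(DIRECTORIES, REMAINING)

# Alternative assumption (not from the example): every remaining file may sit
# in any of the directories, independently of all the others.
free_assignment = DIRECTORIES ** REMAINING

print(f"candidates for the 11th entry:     {UNLINKED}")
print(f"digits in C(10000, 999):           {digits(binomial)}")
print(f"digits in 10000**999:              {digits(free_assignment)}")
```

Either way the result has well over a thousand decimal digits, so trying the arrangements one by one is out of the question; only a heuristic that prunes most of them early has a chance.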

To my knowledge, no such "time machine forensics" tool exists. And given that it is of very limited use (few people need anything better than fsck for UFS and scrub for ZFS), and much better solutions exist for production systems (namely actual backups, and large production file systems that perform actual transaction logging and have built-in backup mechanisms), it makes sense that nobody is going to invest time into it.

If you really want to write such a tool for yourself, I would start by reading the chapter on ZFS in the "black book" (Design and Implementation of FreeBSD, by Kirk McKusick and friends), and then attending one of the classes that Kirk teaches on ZFS internals. That will give you a solid base for understanding the on-disk data structures, and mechanisms by which they are updated in normal operation. Then schedule a few months of free time.
 
Let me summarize what I've read in this thread, and the parent thread (about GELI), and the one about making copies of a ZFS volume using dd.
You make many assumptions that are incorrect, which is probably caused by not having the entire story and misreading. I cannot give you the entire story because I guarantee you that a few are reading along. However I do thank you because I'm guessing more people are asking these questions but do not - as I asked - post them on this thread. It's a rather peculiar problem really, I try to keep this thread clean but at the same time it might discourage people who know (part of) the answer from posting. Life is difficult I guess.
In spite of having a very paranoid attitude towards using computers (for example, having a complex but mis-begotten backup system), you used the same password for everything; that seems so implausible that I'm hard pressed to believe it.
The system I was writing - vize - is actually incredibly simple, but not finished and I had to use temporary manual workarounds. Once I got this wrong and did not notice, as I mentioned earlier. I do not use the same password for everything, I use(d) a variation that I can(/could) remember. I have a rather bad memory - due to neurological problems that I need workarounds for - and it's still better than writing them down. Some of my accounts ended up on haveibeenpwned.com, exposing many of them at once including the one for this forum. And sadly due to said neurological problems I was never able to rectify this, I simply forgot. Having memory issues makes things which are easy to most rather difficult but one would have to experience this in order to fully understand.
They hacked your cable modem (so even today you think they're listening for what you do, and are spoofing you), and you haven't fixed that.
They did. And I did. No more information will I be giving on this topic.
They edited your facebook posts.
No, they deleted posts of a dead man which were incriminating to some of them, and due to the closed nature of Facebook I could not back them up (although it wouldn't have mattered). They also deleted e-mails from all of my systems - this is why they needed to rewrite my snapshots - and my local mail server/archive.
They understand internals of ZFS and administration of FreeBSD well enough to perform detailed modification of your file system.
Actually they don't understand ZFS internals even at the level that I do, I'm sure of that. But rewriting snapshots using a modified testing/rectification script they stole from my own hard drive is rather simple and insulting on top of it.
When you came back from your month-long absence, instead of immediately seeking help from law enforcement and not touching the computer
No. You've got the timeframe all wrong. That absence was long ago, almost a year now. I only noticed the hacks more recently because they're connected to other crimes I'm facing which I will not elaborate on. What I did or did not do with law enforcement is not going to be written down here either.
I also will not entertain your assumptions about my mental state but will add that diagnosing mental problems over the internet and written text at that would be rather difficult for any seriously trained professional. That said, any seriously trained professional would not attempt this so I'm certain you're not one of them. Things happened as I say and said, I don't care what you think about my mental state.
You wish to create a tool which acts like a time machine does in science fictions: it rewinds the state of a ZFS file system to a given point in time in the past (which you will specify as a transaction ID).<snipped more text>
Actually rather correct, although I would imagine it would work differently. If it's not possible then fine, if it doesn't exist and I have to write it myself also fine. Thank you for your book recommendation, I'll probably read it once I find the time. But this attitude of "it's impossible because I don't know how to do this" is what stops progress. Or put differently: "I can't commit the effort to this, so I'm going to discourage other people from trying as well". While I do not wish to compare what I'm asking to other great achievements of our time, I'll ask you this: was it not impossible once to land on the moon - even so much so that quite a few people today still believe it didn't happen? Was it not impossible once to build a machine without moving parts that somehow - magically almost - is able to perform calculations and show me this very forum? Was it not impossible once to imagine needing more than 640kB of RAM ;)? ... Was it not impossible once to make fire of your own instead of relying on a lightning striking a tree? Was it not once impossible to treat diseases that used to kill billions? Was it not once impossible to do ... almost anything?

I want to end with a quote by Eugene Lewis Fordsworthe: Assumption is the mother of all mistakes. Although the author himself admitted that it's not entirely correct and sometimes making an assumption is necessary, I'd say that making assumptions about the thoughts in someone else's head, someone you've never even talked to in person, is always doomed to fail.
 
[...] If you really want to write such a tool for yourself, I would start by reading the chapter on ZFS in the "black book" (Design and Implementation of FreeBSD, by Kirk McKusick and friends), and then attending one of the classes that Kirk teaches on ZFS internals. That will give you a solid base for understanding the on-disk data structures, and mechanisms by which they are updated in normal operation.
Discussion aside, the ZFS On-Disk Specification - Draft* is also quite informative: zfsondisk. Matt Ahrens notes:
This repo contains the ZFS On-Disk Specification, from Sun Microsystems, published 2006. The OpenZFS on-disk format is an extension of this.

___
* I think I noticed this first at Chris Siebenmann's blog entry A broad overview of how ZFS is structured on disk of June 24, 2018.
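
As a small illustration of what the specification describes, here is a minimal Python sketch that reads the leading fields of an uberblock and picks the newest one at or before a given transaction group. It assumes the layout from the 2006 specification (magic, version, txg, guid_sum and timestamp as five 64-bit integers, followed by the root block pointer, which is not parsed here) and 1 KiB uberblock slots. The helper names are hypothetical, and the data is synthetic rather than read from a real pool. It only inspects bytes; it does not rewind or repair anything.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C  # magic number given in the on-disk specification
HEADER_FIELDS = "5Q"          # magic, version, txg, guid_sum, timestamp (uint64 each)


def parse_uberblock(slot: bytes):
    """Return the leading uberblock fields, or None if the slot holds no uberblock."""
    if len(slot) < struct.calcsize(HEADER_FIELDS):
        return None
    # The magic number also reveals the byte order the pool was written in.
    for byte_order in ("<", ">"):
        magic, version, txg, guid_sum, timestamp = struct.unpack_from(
            byte_order + HEADER_FIELDS, slot, 0
        )
        if magic == UBERBLOCK_MAGIC:
            return {"version": version, "txg": txg,
                    "guid_sum": guid_sum, "timestamp": timestamp}
    return None


def newest_at_or_before(slots, target_txg):
    """Pick the parsed uberblock with the highest txg not exceeding target_txg."""
    parsed = [ub for ub in map(parse_uberblock, slots) if ub is not None]
    eligible = [ub for ub in parsed if ub["txg"] <= target_txg]
    return max(eligible, key=lambda ub: ub["txg"], default=None)


def make_slot(txg, timestamp, byte_order="<"):
    """Build a synthetic 1 KiB slot for demonstration only (not real pool data)."""
    header = struct.pack(byte_order + HEADER_FIELDS,
                         UBERBLOCK_MAGIC, 1, txg, 0, timestamp)
    return header.ljust(1024, b"\0")


# Three synthetic uberblocks (one big-endian) and one empty slot.
slots = [make_slot(100, 1_000), make_slot(105, 1_050, ">"),
         make_slot(110, 1_100), bytes(1024)]
chosen = newest_at_or_before(slots, target_txg=107)
print(chosen)
```

On a real pool the slots would have to be read from the vdev labels, the slot size depends on the pool's sector size, and newer OpenZFS versions append further fields, so treat this strictly as a reading aid for the specification.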
 
And consult with mental health professionals,
malavon This is both to have yourself checked up and to possibly add to the charges. Many a court considers data to be worthless; damage to mental health is something else.

And as long as I don't know where the OP is, I can not judge the quality of the law enforcement specialists. Knowing what I know about them here, I would trust the forum more. YMMV extremely here.
 
Really looks like OP failed to detect sarcasm in Professor ralphbsz's post...

Several people who actually know something about backups have been telling OP that his very approach to backups is broken. In fact, I was the one to be blunt and tell OP in post #9 that this really bit him in the ass...
 
Really looks like OP failed to detect sarcasm in Professor ralphbsz's post...
I have a good sarcasm detector but detected none. Always possible I'm wrong though.
malavon This is both to have yourself checked up and to possibly add to the charges. Many a court considers data to be worthless; damage to mental health is something else.
I don't see why this would be the case, data is memory. On top of that, data might be (incriminating) evidence, in which case destruction of data is destruction of evidence. There are laws against that in my country - which as a mod you can certainly see in my profile - and yours.
 
On top of that, data might be (incriminating) evidence, in which case destruction of data is destruction of evidence. There are laws against that in my country
As there are in most countries. And guess what? Someone seems to be messing with the evidence. Or do you work on a strictly read-only source copy? Do the authorities have a signed archive version? If not, you are having one heck of a problem.
 
And guess what? Someone seems to be messing with the evidence.
Yes, every law is going to be broken, that's why it's deemed to be necessary to be codified into a law. But I have a feeling we're missing each other's point. I simply meant that destruction of data should be punishable by law, whether it is or not. I consider my data to be part of my identity as it is part of my memory and thus destroying said data I - personally - consider an attack on my selfbeing. Doesn't matter much, just wanted to make that clear.
 