ZFS A power outage left my ZFS file system with inaccessible files and directories.

My power went out while I was building ports with poudriere (I don't know whether that is related, but I figured I'd share it). After the reboot, several files and directories are in a 'phantom' state. As far as I can tell, the affected files were not being edited at the time, nor are they frequently accessed. These files and directories are missing from the file system and, at the same time, present and cannot be altered. E.g., ls will give me the name of the file or directory in question with the error No such file or directory. cp or mv returns File exists. To work around the problem, I moved the parent directory containing the phantom files out of the way and restored the files in question from a snapshot.

Also, zpool scrub came back clean.

Questions:
What happened?
How can I return the files to their original state or make it so they can be deleted?
I only noticed these phantom files because of the programs that broke. Is there a way I can find other phantom files?
 
Generally there is no escape except what you already did, moving them away. There is no fsck for ZFS.
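As for finding other phantom files: one rough approach is to walk the tree and check every name that a directory listing returns against lstat(). The sketch below assumes the symptom is exactly what was described (the name shows up in the listing, but looking it up fails with No such file or directory). The function name find_phantoms is made up for illustration, and this only reports candidates; it repairs nothing.

```python
import os
import stat


def find_phantoms(root):
    """Return paths that a directory listing reports but lstat() cannot resolve.

    Assumption: a "phantom" entry is one whose name is returned by the
    directory listing while lstat() on it fails with ENOENT.
    """
    phantoms = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except OSError:
            # Unreadable or vanished directory: skip it rather than abort.
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except FileNotFoundError:
                # Listed by the directory, but the lookup says it is absent.
                phantoms.append(path)
                continue
            except OSError:
                # Other errors (e.g. permission denied) are not phantoms.
                continue
            if stat.S_ISDIR(st.st_mode):
                stack.append(path)
    return phantoms
```

Running something like print("\n".join(find_phantoms("/path/to/dataset"))) as root should list the affected entries; the path is a placeholder. A file that is deleted between the listing and the lstat() call would show up as a false positive, so it is best run on a quiet system.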
Ok, I was hoping that there was something that could be done with zdb. After looking at its manpage, it didn't seem like it was going to be a simple feat.

Is this even possible with ZFS sync enabled? ZFS is designed specifically to avoid this. I have had abrupt power-offs many times, with zero problems.
I've had many abrupt power cycles over the years, and this is the first time I've encountered something like this.
 
These files and directories are missing from the file system and, at the same time, present and cannot be altered. E.g., ls will give me the name of the file or directory in question with the error No such file or directory. cp or mv returns File exists.
This sounds like your ZFS metadata is corrupted, in a rather bad way. Obviously, this should never have happened.

There might be other explanations, but I can't think of any sensible ones right now. You can get very bizarre effects if file names contain Unicode characters (for example, you might have multiple files that "look like" they have the same name, but in reality the names are different and just render the same way on the screen). To a user trying to access them, it might seem like a file both exists and doesn't exist: you do "ls" and see a file that looks like it is named "a", but "cat a" fails, because the file name is really a character from another script that just happens to look exactly like a. Usually, doing something like "ls -1 > /tmp/foo" and then examining /tmp/foo with hexdump finds those problems. But this is very rare.
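The same check as the hexdump trick can be sketched in a few lines of Python: list the directory as raw bytes and flag any name containing non-ASCII or control bytes. The function name suspicious_names is invented for this example, and flagging every byte outside printable ASCII is a simplification; legitimate non-English file names will be flagged too.

```python
import os


def suspicious_names(directory):
    """Return (raw_name, hex_dump) for names with non-ASCII or control bytes."""
    result = []
    # Passing a bytes path makes os.listdir return the names as raw bytes.
    for raw in sorted(os.listdir(os.fsencode(directory))):
        if any(b < 0x20 or b >= 0x7F for b in raw):
            result.append((raw, raw.hex(" ")))
    return result
```

A name that renders as "a" but shows up here as d0 b0 is the Cyrillic letter, not the Latin one (which would be the single byte 61 and would not be listed at all).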

As already said: Fixing this with zdb is impractical for most people, and probably not worth the effort.

I suspect you have found a bug. In theory, you should report it and allow some developer to access your file system to see what the exact situation is. In practice, it seems unlikely a developer will have the time to do this.

Purely idle curiosity: What version of FreeBSD (and therefore ZFS) were you running? Has your hardware ever shown any symptoms of memory errors (like random spontaneous crashes)? Does your motherboard have ECC? I'm not saying "blame the hardware", but memory errors in a metadata block are a possibility.
 