AndyUKG said:
The reality is that as I had a backup copy of the data I didn't do any detailed analysis of what might have been affected data wise, it would have been an interesting exercise but impractical as the pool contained millions of files. If I hadn't had a backup copy to restore from, or even to compare the data against I would have felt extremely nervous about the integrity of the data (ie if I'd had to copy the data off, destroy the pool and create it). This seems to me a pretty piss poor result from a simple power outage on a supposedly advanced fault tolerant file system.
It's both well documented and common knowledge that certain types of hardware failures can cause corruption regardless of your filesystem. Specifically in ZFS's case, these failures tend to cause errors exactly as you have stated. See here for some information:
http://docs.sun.com/app/docs/doc/819-5461/gavwg?a=view
Essentially, these issues come down to several kinds of hardware problem:
- Cable cross-talk
- Faulty hardware, e.g. RAM
- Sudden power failure
Since you report sudden power failure, I'll tell you exactly why it happened and why it's your fault and not ZFS's.
ZFS, as with any FS, trusts a flush request. Because of ZFS's COW abilities and transaction grouping this is generally not a problem, but there is one rare situation that results in corruption even on ZFS. Hard drives have a feature, write caching, which greatly increases performance at the risk of possible corruption.
ZFS guarantees "good" data is not moved until the entire write is complete, but that guarantee comes with a caveat some do not realize. If the hard drive "lies" to ZFS that one portion of the write is complete, when it is in fact still sitting in the drive's write cache, ZFS will go ahead with committing the transaction group and updating the uberblock. Say you lose power at this point, and the disk completes some of the cached writes but issues them out of order. You're stuck with new COW data but the wrong uberblock, so the COW differentials are unable to track changes to specific files, and you end up with your exact scenario. This is simplified a bit but you should get the idea. Remember that in ZFS, redundancy and consistency are different things and one doesn't always guarantee the other. The consistency portion is what went wrong for you, so the mirror doesn't help.
You can find plenty more of these but this link shows my explanation in the real world:
http://mail.opensolaris.org/pipermail/zfs-discuss/2010-January/035740.html
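To make the ordering problem concrete, here is a minimal toy sketch in Python. It is not ZFS code and not how ZFS is implemented; the Disk class, the block and uberblock addresses, and the simulate() helper are all made up for illustration. It only models the idea above: a copy-on-write commit writes the new block, flushes, then updates the root pointer, and that ordering only holds if the drive really honors the flush.

```python
class Disk:
    """Toy disk with a volatile write cache (hypothetical, for illustration only)."""

    def __init__(self, honors_flush):
        self.honors_flush = honors_flush
        self.platter = {}  # contents that survive a power loss
        self.cache = []    # pending (address, value) writes, lost on power loss

    def write(self, address, value):
        self.cache.append((address, value))

    def flush(self):
        # An honest disk drains its cache before acknowledging the flush.
        # A "lying" disk acknowledges at once and leaves the cache dirty.
        if self.honors_flush:
            self.settle()

    def settle(self):
        # Drain every cached write to the platter, in order.
        for address, value in self.cache:
            self.platter[address] = value
        self.cache.clear()

    def power_loss(self, survivors):
        # Assumed worst case: the disk writes its cache back newest-first
        # and only `survivors` of those writes reach the platter.
        for address, value in list(reversed(self.cache))[:survivors]:
            self.platter[address] = value
        self.cache.clear()


def is_consistent(disk):
    """Consistent means the uberblock points at a block that actually exists."""
    return disk.platter.get("uberblock") in disk.platter


def simulate(honors_flush, survivors=1):
    disk = Disk(honors_flush)

    # Transaction group 1 is safely on the platter.
    disk.write("block-1", "old data")
    disk.write("uberblock", "block-1")
    disk.settle()

    # Transaction group 2: copy-on-write, so block-1 is never overwritten.
    disk.write("block-2", "new data")
    disk.flush()                        # barrier: the data should be stable now
    disk.write("uberblock", "block-2")  # only then repoint the root
    disk.power_loss(survivors)          # power fails before the final flush

    return is_consistent(disk)


if __name__ == "__main__":
    print("honest flush:", simulate(honors_flush=True))
    print("lying flush: ", simulate(honors_flush=False))
```

With an honest flush the pool is consistent whether or not the last uberblock write survives, because the block it points to is already on the platter. With a drive that acknowledges the flush early, the uberblock can reach the platter while the block it references does not, which is the kind of damage a mirror cannot repair since both sides were told the same thing.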
Every reliability document worth its weight advises you to disable drive write-caching. Here's just one example:
http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks
There are ZFS and hardware methods you can use to reduce the performance impact of disabling this, but for a lot of setups it's going to be more effective to simply keep full backups, as this is generally a rare issue.
Because you didn't take adequate measures to ensure ZFS could operate reliably, you are at fault here. You compounded the issue by blaming ZFS and spreading FUD. I don't think it was intentional on your part, but it is nevertheless potentially harmful. Please don't take this as another personal assault as I'm sure you're a fine person. You also seem like an intelligent person and a decent sysadmin with some room for improvement. I believe you're a decent sysadmin since you had backups.
This explanation was brought to you by your more detailed problem description. HIH.
AndyUKG said:
Is it really necessary to come onto a thread and label people with insulting names when they are trying to share experiences and knowledge??
Maybe you're confusing me with someone else as I'm quite sure I never called anyone here an insulting name. When I do ask questions, I really dislike getting misleading responses, FUD, or answers that do nothing but serve to inflate the responder's post count. So what I was pointing out to you is there are more details in ZFS than are dreamt of in your philosophy, and you need not get snippy when someone asks for a clarification.
I'm all for you sharing your experiences though, as we're all in this ZFS boat together, and hopefully reports like yours (the detailed version, not the original) can help both awareness and resolution for everyone.