FreeBSD on UFS, preventing data loss on crash?

Hi

I've been using FreeBSD on my server for a year now, and I'm mostly quite happy.

There is only one thing that bothers me (a bit too much), and that is file system inconsistencies, which have happened a couple of times. Both times the cause was an unclean shutdown (power was lost).

I first installed FreeBSD 8.1 on UFS, and had the first crash after a few weeks. After upgrading to FreeBSD 9.0 I just now had the second crash.

Running fsck (as advised) cleaned up the file system inconsistencies both times, but when the result is that files are missing or contain the wrong content, I find it hard to trust the system. After the last fsck, /etc/fstab contained the content of /etc/ssh/sshd_config, and the /etc/ssh/sshd_config file itself was missing.
The content of /etc/ssh/sshd_config also appeared in some other files. A lot of other "fixes" were made as well.

Needless to say, it wouldn't boot, so it took quite some time to fix.

Having a backup of /etc/, I could find the changes in that directory with diff, but there could be changes in other directories I don't know about.

So the question I have is: is this "just the way it has to be" when running FreeBSD on UFS, or are there tricks to prevent this from happening? Is UFS really that fragile when it comes to unclean shutdowns?
 
bro said:
So the question I have is: is this "just the way it has to be" when running FreeBSD on UFS, or are there tricks to prevent this from happening? Is UFS really that fragile when it comes to unclean shutdowns?
Not sure if you can turn it on after the fact but FreeBSD 9.0 now installs using UFS+Journaling. That should help.

Another option is to use ZFS, which has a lot more features with regard to data integrity.
 
UFS snapshots with SU+J are broken on 9.0-RELEASE; use 9-STABLE if you are going to use UFS snapshots (dump -L, for example, makes use of UFS snapshots).
 
bro said:
There is only one thing that bothers me (a bit too much), and that is file system inconsistencies, which have happened a couple of times. Both times the cause was an unclean shutdown (power was lost).

Prevention is better than after-the-fact recovery. So install a UPS. Seriously. Even one of the terrible cheapies is better than nothing.
 
Both times the cause was an unclean shutdown (power was lost)
Are you serious?
Power loss = instant crash. Are you aware of the voltage fluctuations that happen just before power is lost and just before it is "restored"? Those events are the #1 killer of HDDs. In my opinion most electronic device failures (including stereos, etc.) should be billed to the power company (good luck with that).

There is no system that can handle power loss, because although in the short term you might recover depending on software quality, in the long run it will become a HARDWARE failure - then you will really pull out your hair! Do you realize what would happen if the HDD read head were to react to a voltage spike by traveling halfway across the platter?

GET A UPS. Make sure it can signal (through COM1 or something) when it is about to lose internal power. Channel that signal to your server to trigger an emergency shutdown. A small, cheap UPS can supply power for 10-15 minutes before giving out. Don't shut down immediately, as it may be a very short outage. Once the outage passes the 10-minute mark (or the battery is down to 40%), the server needs to begin preparing for shutdown.
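
To make that policy concrete, here is a minimal Python sketch of the decision only. How you actually read the UPS status depends on the UPS and the monitoring software, so that part is left out. The function name, parameters, and constants are made up for illustration; the two thresholds are simply the 10-minute and 40% figures from above.

```python
# Thresholds from the post above; tune them for your own UPS and load.
MAX_SECONDS_ON_BATTERY = 10 * 60   # ride out short outages for up to 10 minutes
MIN_BATTERY_PERCENT = 40.0         # ...unless the battery drains faster than expected


def should_shut_down(on_battery: bool,
                     seconds_on_battery: float,
                     battery_percent: float) -> bool:
    """Return True when the server should start a clean shutdown.

    All three inputs are assumed to come from whatever tool monitors
    the UPS; this function only encodes the waiting policy.
    """
    if not on_battery:
        # Mains power is present, nothing to do.
        return False
    # Shut down if the outage has lasted too long or the battery is low.
    return (seconds_on_battery >= MAX_SECONDS_ON_BATTERY
            or battery_percent <= MIN_BATTERY_PERCENT)
```

A monitoring loop would call this every few seconds while on battery and run a normal shutdown once it returns True.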

EDIT: Actually, voltage spikes will kill your HDD faster than power outages. Voltage spikes can occur during normal power and during blinks (outages of < 1 sec), but most of all when power is "restored". I once had an HDD that was kicked so hard by a voltage spike that the read head threw itself across half the disk platter and wiped data while it was doing it. It was like scratching a record along the radius while the record is spinning.
Most UPSes also have surge protectors and can cut the power coming from AC when voltage goes crazy, thereby protecting from spikes, not just outages.
 
wblock@ said:
Prevention is better than after-the-fact recovery. So install a UPS. Seriously. Even one of the terrible cheapies is better than nothing.


+1 to this.

All a file system can do is ensure that it is consistent and not corrupted after power loss. It cannot prevent data loss: if the data hasn't been written to disk (i.e., it is still in the buffer cache, or not yet written out by the app), it will be lost no matter what file system you run.
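
For the "not written out by the app" part, this is roughly what an application has to do if it wants a file to survive a crash. It is only a sketch in Python under a few assumptions: a POSIX system where fsync on a directory is allowed, and a disk that honours flush requests. `durable_write` is a made-up helper name, not something FreeBSD or Python provides.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so that a crash leaves either the
    old file or the new one, never a half-written mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory (same file system),
    # so the final rename is atomic. Note: mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()               # push Python's buffer to the kernel
            os.fsync(tmp.fileno())    # ask the kernel to push it to the disk
        os.replace(tmp_path, path)    # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory entry too, so the rename itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Anything an application writes without a step like fsync can still be sitting in the buffer cache when the power goes, and even with it you depend on the drive's own write cache behaving.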

Get a UPS.
 
SirDice said:
Not sure if you can turn it on after the fact but FreeBSD 9.0 now installs using UFS+Journaling. That should help.

Another option is to use ZFS, which has a lot more features with regard to data integrity.

I did consider reinstalling with ZFS, but don't really have the time at the moment. Will probably do it eventually.

wblock@ said:
Prevention is better than after-the-fact recovery. So install a UPS. Seriously. Even one of the terrible cheapies is better than nothing.

Prevention is of course better, so I'll probably "suck it up" and buy one of those. Thanks for the advice ;-)

@Beeblebrox
Thanks for the in-depth explanation. After investing $5500 in the server, it's actually quite silly not to have a UPS.

throAU said:
+1 to this.

All a file system can do is ensure that it is consistent and not corrupted after power loss. It cannot prevent data loss: if the data hasn't been written to disk (i.e., it is still in the buffer cache, or not yet written out by the app), it will be lost no matter what file system you run.

Get a UPS.

I understand that data can be lost, but having different files all over the FS get messed up (when they haven't been edited recently) is a bit annoying and unexpected. I've never experienced anything similar on Linux, but I might've just been lucky.

Conclusion:
1) Get a UPS
2) Install UFS with journaling or use ZFS.

Thanks for the feedback.
 