Solved data loss due to power outage

I am having several data losses due to power outages, and at the moment I can't buy a UPS. I am trying to use sync in a script to force data to be written to disk. I don't know if there is a more elegant way.
Code:
#!/bin/sh
# Flush all dirty buffers to disk once per second, forever.
while :
do
    sync
    sleep 1
done
I put this in a script which runs from crontab with @reboot.

Any suggestions or ideas are welcome.
 
Could you please explain what you mean by data loss? Are files being corrupted? Is a file being marked as saved but not actually being committed to disk? Or, as your script seems to suggest, is there a process that is terminated abruptly and fails to resume where it left off, and therefore causes data loss?

Clarifying this and giving more details on what you are doing and what you are trying to achieve will put people in a better position to help you.
 
What filesystem, ZFS, UFS, something else?
If UFS, what options, soft updates, journaling, sync, async, noasync?

I'm with reddy: you've not given enough information for anyone to help you.
 
I am having several data losses due to power outages
That seems very implausible. Nearly always a power cut simply causes an unclean shutdown, and the system restarts with no loss. In particular if you use UFS and configure it for soft updates, or if you use ZFS.

I have a suspicion: What you call "data loss" is actually that you are running some long-running application, which writes files, and then is surprised when the last few writes aren't there after a crash (power loss) and reboot. But applications should not have that expectation! You should look at the source code for those applications, and modify them to either use some sync open mode, or to call fsync() after every important write.
 
… data losses due to power outages …

From that, I assume UFS (not ZFS) however I see you reading this:


… sync in a script

… more elegant way …

If it's UFS: you can use fstab(5) (the fourth field) to specify sync.

Also: tuning(7)
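
A minimal sketch of the fstab(5) approach, assuming a UFS root on a hypothetical partition /dev/ada0p2. The device name and the dump/pass numbers are placeholders; the only point is the sync keyword in the fourth field:

```
# Device      Mountpoint  FStype  Options   Dump  Pass#
/dev/ada0p2   /           ufs     rw,sync   1     1
```

Expect writes to become noticeably slower with this option, particularly on spinning disks.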
 
… very implausible. …

With UFS, in this situation (power outage), data loss does not surprise me.

Re: <https://docs.freebsd.org/en/books/faq/#safe-softupdates>, with added emphasis:

  • the answer does nothing to explain soft update journaling
  • the answer mentions a delay of up to thirty seconds before data is written to disk (the norm for UFS)
  • expert test results suggest that the delay may be much greater.

– if I recall correctly (more than a year ago), the expert's calculation was up to two minutes.

FreeBSD bug 261944 – FFS/UFS: explain soft update journaling (SUJ) in documentation and in tunefs(8)
 
Is there ANY file system that will truly be happy with frequent power losses? And never ever lose data? (Asking the experts more than you, OP).

I appreciate the power supply issue might be something you can't fix, and you can't get a UPS at the minute, but not sure there's that many computers/electronics/operating systems that will be 100% happy with flaky power, so not sure there's a file system or script that can "fix" power supply issues. But I might be proved wrong!
 
Is there ANY file system that will truly be happy with frequent power losses? And never ever lose data? (Asking the experts more than you, OP).
Simple: Every FS that is mounted read-only...

ZFS is pretty resilient in that it never keeps incomplete/false data. So if a transaction hasn't finished due to a power loss, it will be rolled back. Yes - you will always have consistent "data" on ZFS, but only from a filesystem perspective. The FS doesn't have any idea what an application might regard as 'consistent data'; i.e. if 2 files have to be updated to reflect plausible/consistent data and only one has been fully updated.
NO filesystem can account for such things. That's what UPSes are there for - if you care about your data and *know* there are frequent outages, you just have to have a UPS or live with frequent data loss.

If you know you fall off your bike at least twice a week yet still don't wear a helmet, you can't complain about getting a concussion every now and then...
 
Just a couple of days ago, I had expat2 get corrupted on ZFS. I am guessing it happened because I had gotten into the habit of powering down my system using the power button instead of properly using the shutdown command. (I had been working on some Windows computers for my wife.)

I was taught, long ago, not to use the power button because you won't get a clean shutdown. However, when I hit the power button, the system goes through a shutdown process that looks the same as when I use the shutdown command, though it seems to finish quicker.
 
I think on some (most?) systems the power button is tied to ACPI events, so it's possible to capture them and perform the correct actions. I've seen this behavior on a few different Linux systems, basically calling shutdown -p now.
Similar stuff to how laptops do suspend/halt/whatever.
 
There's people confusing UFS journaling and ZFS reliability against filesystem corruption with "no data loss."

You're going to lose data without a UPS. That's just all there is to it. Even if the computer wrote the stuff to the disk immediately, it could lose power in the middle of the write.
 
I think on some (most?) systems the power button is tied to ACPI events, so it's possible to capture them and perform the correct actions. I've seen this behavior on a few different Linux systems, basically calling shutdown -p now.
Similar stuff to how laptops do suspend/halt/whatever.
Well, one should be (very) careful about doing anything with the power button* other than giving it a quick push. Holding it down for longer gets you dangerously close to a forced power shut-off: not recommended.

* that is: the one usually at the front of a current ordinary desktop, not to be confused with the power switch of the power supply ...
 
Interesting: many answers, and we don't even know if the OP uses UFS or ZFS, which is quite funny to me.

For ZFS I have
Code:
vfs.zfs.txg.timeout=5 # sync zfs every 5 seconds

For UFS I have
Code:
kern.metadelay=28
kern.dirdelay=29
kern.filedelay=30
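
Before changing these, it can help to see what the running system currently uses. A small sketch, assuming the three kern.*delay sysctls named above are present (FreeBSD); show_syncer_delays is a made-up helper name, and on a system without those sysctls it just reports "unavailable":

```shell
#!/bin/sh
# Print the current syncer delays, one "name=value" line each.
# Assumption: these sysctl names exist only on FreeBSD; anywhere else the
# lookup fails and the value is reported as "unavailable".
show_syncer_delays() {
    for name in kern.metadelay kern.dirdelay kern.filedelay; do
        value=$(sysctl -n "$name" 2>/dev/null) || value=unavailable
        echo "$name=$value"
    done
}
```

Calling show_syncer_delays before and after editing /etc/sysctl.conf shows whether the new values were actually picked up.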
 
You should look at the source code for those applications, and modify them to either use some sync open mode, or to call fsync() after every important write.
Now that is quite unrealistic, I'd say.

Normally filesystems and applications are controlled by kernel tunables and other .conf files.

A complex filesystem like ZFS would hold a lot of data in RAM before periodically writing to disk. Amount of data loss really depends on how active the RAM was (and how full the buffers were) at time of power cut. There are ways to play with that using kernel tunables and values set in other .conf files, but it only makes sense if you can step back and make yourself see that puzzle as part of the bigger picture. :)
 
I think on some (most?) systems the power button is tied to ACPI events, so it's possible to capture them and perform the correct actions. I've seen this behavior on a few different Linux systems, basically calling shutdown -p now. …

If FreeBSD is set to shut down (not sleep) in response to the power button:
  • the shut down will be graceful.
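
For reference, a sketch of that setting as a sysctl.conf fragment, assuming the acpi(4) driver is in use and exposes hw.acpi.power_button_state. S5 is the "power off nicely" state; check acpi(4) on your release before relying on it:

```
# /etc/sysctl.conf
# Assumption: acpi(4) is attached; S5 requests a normal shutdown on button press.
hw.acpi.power_button_state=S5
```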



Copied from <https://forums.freebsd.org/posts/522096>:

… In chronological order:
  1. <https://cgit.freebsd.org/src/commit...c?id=ad4240fec4feed2dfca1ca0e0bb303eb01aa3a5b> move all functions related to shutting down to one file called kern_shutdown.c …
  2. <https://cgit.freebsd.org/src/commit...c?id=3e755f76d1f51651901d3893554a92b8aa371684> Make it possible to pass boot()'s flags to shutdown_nice() …
  3. <https://cgit.freebsd.org/src/commit/?id=acf0ab0669861bdcef9c49f5ed87ff6c82bd1d1b> init: Only run /etc/rc.shutdown if /etc/rc was run. …
  4. <https://cgit.freebsd.org/src/commit/?id=912d59378b93b34f395d4e7e98078d76ec50005f> Clean up shutdown_nice(). Just send the right signal to init(8). Right now, init(8) cannot distinguish between an ACPI power button press or a Ctrl+Alt+Del sequence on the keyboard. …
kern_shutdown.c « kern « sys - src - FreeBSD source tree <https://cgit.freebsd.org/src/tree/sys/kern/kern_shutdown.c>
 
Normally ... applications are controlled by kernel tunables and other .conf files.
No, end-user applications are typically not configured by system-wide conf files. Exceptions are services. Even then

A complex filesystem like ZFS would hold a lot of data in RAM before periodically writing to disk. Amount of data loss really depends ...
Let me be perfectly clear here: If any application opens or creates a file, writes to it, closes it, then the power fails, that is NOT data loss, in the sense of file systems or operating systems. An application that writes to a file can not have an expectation that the data is durable, until and unless the application or the administrator does specific things. File systems try to do the best compromise between how long to hold back data in RAM, and the performance impact of writing faster, but there are no guarantees. Most file systems do indeed take between 5 and 30 seconds to write everything out.

In order to guarantee writes are actually durable, you need to do something special. The easiest one is to open the file in a sync mode; then all the write() system calls immediately become durable. It can have very significant impact on performance. The slightly harder one is to call fsync() at appropriate places in the program, when you need data to become durable (for example, when the writing of the file is about to cause side effects observable from outside the program). In most cases, this is the best compromise between durability and performance. Even simpler is using the sync command from a shell. Or just mount the whole file system with sync, but performance (in particular on spinning disk) might be terrible.
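
A rough shell sketch of the "flush at the important moment" idea, for scripts rather than compiled programs. It assumes FreeBSD's fsync(1) utility is installed and falls back to a plain sync where it is missing; flush_path and durable_write are made-up helper names, not anything from the base system:

```shell
#!/bin/sh
# Sketch: write a file so that the new contents are flushed before the old
# file is replaced. Assumption: fsync(1) exists (FreeBSD); otherwise use sync.

# Flush one path to disk, or everything if fsync(1) is not available.
flush_path() {
    if command -v fsync >/dev/null 2>&1; then
        fsync "$1"
    else
        sync
    fi
}

# durable_write TARGET: copy stdin to TARGET via a flushed temporary file.
durable_write() {
    target=$1
    tmp="${target}.tmp.$$"
    # Write the new contents next to the target, not over it.
    cat > "$tmp" || return 1
    # Flush the data before the rename, so a crash leaves old or new, not half.
    flush_path "$tmp" || { rm -f "$tmp"; return 1; }
    mv -f "$tmp" "$target" || { rm -f "$tmp"; return 1; }
    # Ask the kernel to push out the rename as well.
    sync
}
```

Usage would look like `printf 'theme=dark\n' | durable_write "$HOME/.myapp.conf"`, where the file name is only an example. This flushes once per important write instead of once per second for the whole system.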
 
ZFS transaction groups are held in memory and periodically written to the disk. Depending on how things are tuned, it could be "a lot".
I think ARC is mostly read data (plus attendant metadata).
 
This does look like a bit of a chicken-and-egg problem.
  • Yes, it's possible (within application source code) to make writes durable, as ralphbsz points out. The downsides to that are:
    • complexity (Just try finding all the correct places in the program's source code for all those write calls, and then making edits and then re-compiling the whole enchilada!) and
    • performance of the entire software stack (It will suffer).
  • I have pointed out that it's possible to play with .conf files and kernel tunables. That path leaves the application itself alone, and tries to tune the environment for that application. In that sense, the .conf files and kernel tunables do in fact control the running application. And yes, you can have per-user .conf files, controlling end-user applications.
 
The file system I have for now is UFS, but as soon as 13.1 comes out I will switch to ZFS.

A solution for the moment was:

/boot/loader.conf
kern.cam.ada.write_cache="0"

/etc/sysctl.conf
kern.filedelay=15
kern.dirdelay=14
kern.metadelay=13

----------------
On ZFS the loss is much lower.

-------------

When I say lost, I mean this: I left the machine unused for a few hours, then there was a power outage. When the machine restarted, an fsck ran automatically, and I lost a lot of the configuration in my home directory, such as desktop appearance, some other program settings, and various files.

Sorry for the delay, I'm not in good health
 
Get well soon!

You might want to ask for some decent ergonomic equipment for the computer. Another idea is to keep a notebook of stuff you want to do, so that when you can get to the computer, you can actively work with minimal downtime. Those are just ideas I'm throwing out, health is still an overriding priority :)
 
A complex filesystem like ZFS would hold a lot of data in RAM before periodically writing to disk. Amount of data loss really depends on how active the RAM was (and how full the buffers were) at time of power cut. There are ways to …

… Most file systems do indeed take between 5 and 30 seconds to write everything out. …

ZFS transaction groups are held in memory and periodically written to the disk. …

From Monitoring ZFS (Allan Jude, FreeBSD Journal DE November/December 2017) (PDF):

… You'll notice the natural cycle of ZFS, where there are a minimal number of synchronous writes as requested by applications; then every 5 seconds all other buffered asynchronous writes are flushed out to disk. …

zpool-iostat(8)

ZFS I/O (ZIO) Scheduler — OpenZFS documentation




Less recent, still interesting:
 