ZFS ZFS lost my data

SDK Chan · Mar 27, 2025

Maturin said:
Just a master in electrical engineering; you don't need to be a professor to know how homework is done right.

I hate homework, because it gives me the feeling that someone assumes I don't know how to keep myself busy

Most programming homework I ever had was easy and/or boring, though ...
FreeBSD, and back then Arch Linux, are however sometimes hard to grasp.
Well, due to having high-functioning Asperger syndrome, I sometimes need 5 or 10 minutes more to understand something.
Sometimes it is the opposite, though, where I can grasp things faster ...

Maturin · Mar 27, 2025

SDK Chan said:
I hate homework

Depends. In your last posts you explained that you did a lot of homework, and had fun with it; it depends on what, and for whom.

SDK Chan said:
asperger syndrome

You're not the only one here (not me).

SDK Chan said:
need 5 or 10 minutes more to understand something

That's not necessarily Asperger's. In principle there are slower and faster brains. The slow ones need more time and have a harder time learning new things, but once they've learned something, they know it, while the faster ones also forget faster.

SDK Chan · Mar 27, 2025

Maturin said:
Depends. In your last posts you explained that you did a lot of homework, and had fun with it; it depends on what, and for whom.

Well, homework can be interpreted in many ways, I guess.
If you refer to my last post, then yes, I like homework in that context, because it is not useless like writing a program which sums the numbers from 1 to 10, or a program which asks whether you are an adult just to tell you that you can now drive a car.

Such dumb programs were my homework back then at university, but the good part about them was the documentation. I not only had to write the programs, I also had to document them.
The documentation was the hardest part, because I had to find the right words to describe my steps.
I think the challenge in general is not writing code, but designing robust code and understanding other people's code.

Maturin said:
You're not the only one here (not me).

I am not really feeling it, though.
My partner used to work with autistic people in the past, and there were cases who could not even think straight, sadly.
Luckily I can spend the whole day on a topic without needing breaks.

Maturin said:
That's not necessarily Asperger's. In principle there are slower and faster brains. The slow ones need more time and have a harder time learning new things, but once they've learned something, they know it, while the faster ones also forget faster.

That is true.
I would say people only forget fast if they do not use the learned skills frequently, or at least sometimes.

Minbari · Mar 27, 2025

I had an SSD from that brand too and it failed after a few months. The problem is not ZFS, but those cheap crap SSDs, which are made on the ships that deliver them.

rootbert · Mar 27, 2025

Minbari said:
I had an SSD from that brand too and it failed after a few months. The problem is not ZFS, but those cheap crap SSDs, which are made on the ships that deliver them.

I have no problem with a device failing. My problem is not being notified in any way by the system or the filesystem about a device that does not work properly. How can I be sure with my servers that after a reboot my files are not gone? THAT is the issue here...

Maturin · Mar 28, 2025

rootbert said:
My problem is not being notified in any way by the system or the filesystem about a device that does not work properly.

Everybody here feels bad when anybody loses hardware or data, because everybody fears it, and nobody wants it to happen to anybody else. Not losing data is the name of the game.

FreeBSD is not the kind of OS that bombards its users with unrequested, and so mostly unwanted and useless, information, aka spam.

Since (almost) every piece of information is available somewhere, and the system cannot know what you want to know, or what is important to you and when, it's up to you to ask it for what you want to know. Giving you all available information would be far too much: it would make the system unusable (you would drown in information within seconds).

It's in the very nature of any Unix-like system not only to give comprehensible information, in contrast to Windows' incomprehensible 'Star Trek' gibberish, but also to provide a whole shebang of tools and ways to automate things.
Never forget: with your computer you're sitting in front of a machine whose original, fundamental purpose is to automate things. That most computer users are not aware of that, or have forgotten it, because other systems degrade the computer to an electric typewriter with a porn player function, is partly the fault of those operating systems, and partly the fault of users who neither think nor care about it.
With shell scripts and cron jobs alone you can do a lot.

In this particular case I don't even have a script. The following, most primitive solution works for me.
All my /root/.cshrc files end with:

Code:

echo
uname -v
freebsd-version -u
echo
zpool status
echo
zfs list
echo
zpool list
echo
uptime

Every time I log in as root I get a brief overview of the system, and see whether a drive in a pool has failed.

It's my solution. It works for me. I see it often enough. If you want to see it more often, or need the system to inform you automatically, you need to find your own way.
Of course there is no 100% guarantee that I will never lose all my data. There never is. But I rely on the ZFS pools' RAID redundancy, plus the redundancy of backups made to other machines that also use ZFS pools with even more redundancy, to minimize, at a reasonable price, the risk that everything is irrecoverably gone before I notice any problem at all.
But that's just my idea.
Feel free to adapt this to your needs, or do it completely differently. If you ask others here you will get a whole bunch of other, even better ideas. Maybe a shell script run hourly by cron, using diff (or something else) to see whether something has changed since the last capture, and producing output in whatever form you like to inform you...
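
The hourly cron idea could be sketched roughly as follows. This is only a sketch under assumptions: the state directory, the STATUS_CMD variable, and the function name check_once are made up for illustration, and `zpool status -x` (which only reports pools that have problems) is just one possible thing to capture.

```sh
#!/bin/sh
# Hypothetical pool watcher: report when `zpool status -x` output differs
# from the previous capture. All paths and names here are assumptions.

STATE_DIR="${STATE_DIR:-/var/db/zpool-watch}"   # assumed location for captures
STATUS_CMD="${STATUS_CMD:-zpool status -x}"     # command whose output is compared

check_once() {
    mkdir -p "$STATE_DIR" || return 2
    current="$STATE_DIR/current"
    previous="$STATE_DIR/previous"

    # Capture the current status, including anything written to stderr.
    $STATUS_CMD > "$current" 2>&1

    # Nothing changed since the last capture: stay silent.
    if [ -f "$previous" ] && diff "$previous" "$current" > /dev/null 2>&1; then
        rm -f "$current"
        return 0
    fi

    # First run or changed output: print it and remember it for next time.
    echo "zpool status changed on $(hostname):"
    cat "$current"
    mv "$current" "$previous"
    return 1
}

# Only act when called as "script run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    check_once
fi
```

cron normally mails whatever a job prints to the owner of the crontab, so a line like `0 * * * * /root/bin/zpool-watch.sh run` (the path is hypothetical) in root's crontab should produce mail only when something changed. The first run always reports, because there is no previous capture yet.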

However, it's up to you anyway to figure out what fits your needs best.

All the help that can be provided here can only give you ideas about what individual solutions others use, help with the details of yours, and help as far as possible when something went wrong.
As I already said at the top:
Everybody here feels bad when anybody loses hardware or data. And everybody here wants to help, as far as it's possible.

It sounds like a truism, but it's just pure, simple basics:
After all, prevention is always the best solution. It sounds cynical once disaster has struck, so most won't say it out loud then. But if things are lost irrecoverably, all you can do is learn from it for the future.

ralphbsz · Mar 28, 2025

rootbert said:
I have no problem with a device failing. My problem is not being notified in any way by the system or the filesystem about a device that does not work properly. How can I be sure with my servers that after a reboot my files are not gone? THAT is the issue here...

You are putting your finger on a real-world issue.

Judging by the description of your storage system failure, it was not caused by actual drive failure, but by a transient communication error: the system seemed hung, you rebooted it, and then it got better. What probably happened underneath: ZFS had the files you had recently written still in memory and was trying to write them to disk, but was not succeeding. That's probably what made the system unresponsive. In many cases, IO operations such as these writes have timeouts; for many disk drives, a timeout of 60 seconds or 5 minutes is reasonable. If the timeouts had worked correctly (and perhaps you didn't wait long enough), after 5 minutes you would have seen error messages in dmesg or /var/log/messages. As a matter of fact, if you look in /var/log/messages, there might be error messages there.
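
Looking through the log for such messages could be sketched like this. The function name scan_io_errors is hypothetical, and the keyword list is only a guess at what I/O trouble tends to look like, not an official list of kernel messages, so expect both false hits and misses.

```sh
#!/bin/sh
# Hypothetical helper: print lines from a log file that look like I/O trouble.
# The keyword list is an assumption, not an official list of kernel messages.

scan_io_errors() {
    logfile="${1:-/var/log/messages}"
    [ -r "$logfile" ] || { echo "cannot read $logfile" >&2; return 2; }
    # Case-insensitive match on a few words that commonly show up in
    # timeout and I/O error reports.
    grep -iE 'timeout|timed out|i/o error|retr(y|ies)|cam status' "$logfile"
}
```

Calling `scan_io_errors` without an argument reads /var/log/messages; the same grep pattern can be applied to the output of dmesg. An empty result only means none of these keywords matched, not that nothing went wrong.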

The real root cause of the data loss is actually the fact that you rebooted. Until you did that, the data was actually still in memory. But honestly, it probably doesn't do much good there. In theory, if you had noticed the error messages, and if the system was still responsive and able to do things, you could have read the files (they would have been served from memory copies), and tried to save them to whatever hardware is still available, for example with copy commands (such as cp). Hardware that is still available might include other disks, or perhaps writing them over the network (scp them to a host that has working disks). In practice, that answer is not very realistic, because you probably wouldn't have known to look for errors.

The real-world issue is that the above answer is somewhat unrealistic. It is rare that operating systems actually implement handling of communication (!) errors during IO completely correctly, and actually time out failing operations. It is more common that the OS gets hung, or that the failing IO operations never actually complete (not even with timeout errors). In this case, you would not have seen error messages in any place, and ZFS would have become hung and blocked. I've worked on projects where we hardened OSes to actually handle IO errors with sensible error messages all the time, and it takes many engineers and many months to get it right. You also need to have people from multiple layers cooperating: the disk drives themselves, the interfaces (HBAs, or SATA and NVMe hardware), the motherboard, the OS kernel, and whatever application layer such as a file system is on top. When doing this professionally, it involves lots of conference calls and video meetings.

It is also somewhat unrealistic to expect you (as the untrained end user) to look at error messages in somewhat obscure places, and when you see them, be able to react correctly. In a large commercial system, there would have been automated alerts when IO errors occur, and you would have been able to rely on tech support of the storage system or operating system or file system vendor to guide you. With consumer-grade hardware and a free OS / file system, you won't get that.

Finally: Most file systems allow users to write data to memory first, and then in the background write that data to disk. That immediately means that hardware problems (such as communication errors) that happen on the background write will NOT be reported to the program that writes the data. The underlying assumption is that most of the time this is not a problem, and most users are much more interested in performance than in knowing that their data is really safe. If you have writes that you really care about going to the disk, you should use one of the various sync options when writing the data, and you would have immediately received an error indication. Again, most application-level users (who don't code their own programs) are not capable of doing that. Ultimately, you get what you pay for. And this happens to be inexpensive hardware, a free OS and file system, and no support contract for the software.
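
For readers who do script their own copies, one way to get that immediate error indication is to force a flush before declaring success. The sketch below is an illustration under assumptions: copy_durably is a made-up name, and it relies on a dd that supports conv=fsync (GNU dd does, and recent FreeBSD versions should as well; check dd(1) on your system).

```sh
#!/bin/sh
# Hypothetical helper: copy a file and only report success after the data
# has been flushed to the device. Assumes a dd that supports conv=fsync.

copy_durably() {
    src="$1"
    dst="$2"
    # conv=fsync makes dd call fsync(2) on the output file before it exits,
    # so a failed flush shows up as a non-zero exit status instead of being
    # lost in a background write.
    # 2>/dev/null hides dd's transfer statistics (and also its error text;
    # drop it when debugging).
    if dd if="$src" of="$dst" bs=65536 conv=fsync 2>/dev/null; then
        echo "written and flushed: $dst"
    else
        echo "write or flush FAILED: $dst" >&2
        return 1
    fi
}
```

Usage would be something like `copy_durably photo.raw /tank/photos/photo.raw` (paths hypothetical). For a whole dataset, ZFS also has the sync property, e.g. `zfs set sync=always pool/dataset`, which as far as I understand treats every write as synchronous, at a considerable performance cost. Neither approach helps if the IO stack hangs instead of returning an error.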

6502 · Mar 28, 2025

rootbert said:
How can I be sure with my servers that after a reboot my files are not gone? THAT is the issue here...

I wrote above and will repeat: when you do not need high speed storage, get HDDs.

rootbert · Mar 28, 2025

ralphbsz said:
You are putting your finger on a real-world issue.

Judging by the description of your storage system failure, it was not caused by actual drive failure, but by a transient communication error: the system seemed hung, you rebooted it, and then it got better. What probably happened underneath: ZFS had the files you had recently written still in memory and was trying to write them to disk, but was not succeeding. That's probably what made the system unresponsive. In many cases, IO operations such as these writes have timeouts; for many disk drives, a timeout of 60 seconds or 5 minutes is reasonable. If the timeouts had worked correctly (and perhaps you didn't wait long enough), after 5 minutes you would have seen error messages in dmesg or /var/log/messages. As a matter of fact, if you look in /var/log/messages, there might be error messages there.

That was exactly my assumption. So it baffles me that more than 40 minutes after the rsync job had finished and I had verified the files (which obviously came from cache), there was still no message about any problem, neither from ZFS nor in dmesg.

As a side note: usually I rsync the files, verify if everything is there, inspect a few of them, delete files on the source, make a snapshot, send the snapshot to my storage system while I walk into the room where my camera equipment is stored to equip my camera again with the sdcard and when I return to my desk the zfs sync job is done (usually). So Murphy hit me with this process, too.

Erichans · Mar 28, 2025

rootbert said:
So it baffles me that more than 40 minutes after the rsync job had finished and I had verified the files (which obviously came from cache), there was still no message about any problem, neither from ZFS nor in dmesg.

I quite agree.
If no response comes as to why any messages/logging about this might have been absent, and how to mitigate that, perhaps put it to the appropriate mailing list, or even more directly, to the OpenZFS GitHub site, in whatever form you feel is appropriate.

cracauer@ · Mar 28, 2025

/var/log/messages won't contain information related to hangs when the filesystem for /var/log is the one that hangs.

rootbert · Mar 28, 2025

cracauer@ said:
/var/log/messages won't contain information related to hangs when the filesystem for /var/log is the one that hangs.

The system resides on a different zpool, which is a mirrored NVMe pool. The devices that caused problems were SATA SSDs (QLC).

ralphbsz · Mar 29, 2025

The hang might be in the whole block storage subsystem, way below the ZFS layer, and affecting all file systems.

6502 · Mar 29, 2025

What is the status of the SSDs: are they working, or do they fail to start? Any errors in SMART (if it can be read)?

ralphbsz · Mar 29, 2025

6502 said:
What is the status of the SSDs: are they working, or do they fail to start? Any errors in SMART (if it can be read)?

In general, SSDs work on FreeBSD. They can be attached via SATA and NVMe (and supposedly SAS, although I have not seen that). They are supported by smartctl, inasmuch as they are in the smartctl database of known devices (if they're not, only generic things work). Is the IO stack in FreeBSD highly optimized to handle the high speed of SSDs most efficiently? Is error handling complete and correct? I don't know, and I doubt it.

In the specific case of the OP, the SSDs came back to full function after a reboot, so the general suspicion is that their problem was caused by an intermittent cabling issue. It is unfortunate (but real) that an intermittent cable issue can cause data loss, in spite of the fact that they were using a nominally redundant and fault-tolerant disk configuration, but a storage system can only be as reliable as its most vulnerable component, which in this case was probably the low-level block IO and disk attachment subsystem in FreeBSD.

6502 · Mar 29, 2025

Is there a chance of a bug in the firmware?

T-Aoki · Mar 29, 2025

You'd better use nvmecontrol(8) rather than SMART on NVMe.
The log page would be helpful.
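
A small sketch of that distinction: pick the health query by device name. The function name health_cmd is hypothetical, the device names follow FreeBSD conventions (nvme0, ada0, da0), and the function only prints the command so you can inspect it before running it as root. Log page 2 is the SMART / health information page in the NVMe specification; the exact option syntax is from memory, so double-check it against nvmecontrol(8).

```sh
#!/bin/sh
# Hypothetical helper: print the health query command that fits a device.
# Device names follow FreeBSD conventions (nvme0, ada0, da0).

health_cmd() {
    dev="$1"
    case "$dev" in
        # NVMe controllers: ask for log page 2 (SMART / health information).
        nvme*) echo "nvmecontrol logpage -p 2 $dev" ;;
        # Everything else (SATA/SAS disks): full smartctl report.
        *)     echo "smartctl -a /dev/$dev" ;;
    esac
}
```

For example, `health_cmd nvme0` prints the nvmecontrol invocation, and `health_cmd ada0` prints the smartctl one.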

rootbert · Mar 31, 2025

The SMART data is all fine. I do not care why the drive showed faulty behavior; I am much more interested in why the OS/storage layer/filesystem did not inform me about this the way I would expect it to.

6502 · Mar 31, 2025

Check whether there is a firmware update for this SSD that has not been installed yet.

rootbert · Thursday at 6:15 PM

6502 said:
Check whether there is a firmware update for this SSD that has not been installed yet.

Well, this may fix this specific problem; however, it does not resolve the issue that the user is not notified when writes are not passed through to the storage device(s).