Linus Torvalds Begins Expressing Regrets Merging Bcachefs

homeadm · Aug 28, 2024

malaizhichun said:
Kent responded and argued that "Bcachefs is _definitely_ more trustworthy than Btrfs"

Excuse me, what's the point of comparing yourself to the worst one?

scottro · Aug 28, 2024

Well, to sorta play devil's advocate (I should add that the way we, including me, sometimes bash <heh> Linux here reminds me of the way early Linux forums would bash MS. I have a friend in his 60's who still writes M$ Turd for word, etc.,) but at any rate, as btrfs is the default in Fedora, and probably others, it makes sense because it's saying, Bcachefs is better than the default of several distributions. (I'm making up the several, as I said, I know that Fedora has it as default, but don't know for sure about any others.)

gpw928 · Aug 29, 2024

ralphbsz said:
But there is enormous power in integrating the RAID layer with the file system layer. The single biggest one: when a disk fails, you don't have to immediately resilver unallocated space, and you can treat metadata and data separately, and ... many more things. All that is not available when using ext4 or other "single disk file systems".

I think that the utility of ZFS depends on the usage perspective.

In situations where Unix-like systems are deployed in significant numbers (large data centres including the cloud) they are almost inevitably virtualised Red Hat Linux. [I know that there are exceptions like Netflix, but they are exceptions.]

For virtualised systems, dealing with a dead disk is usually handled transparently in a hypervisor or storage array. So the storage clients are simply not aware of, nor involved in, managing (e.g. resilvering, or optimising access to) storage hardware.

With physical placement and rectification of disk hardware maladies off the list of issues for the client operating system (or volume manager) to address, the advantages of ZFS on that client are somewhat less pronounced.

The Linux Logical Volume Manager is able to virtualise (optionally) redundant storage, creating, replicating, deleting, growing, or shrinking logical volumes on-line at will. In these circumstances, ext4 and xfs continue to provide sound options. And with no main stream commercial support for ZFS (I'll ignore Oracle), this is the prevailing (and absolutely overwhelming) commercial model.

I get what you mean by "single disk file system" but LVM makes them highly flexible, and storage arrays take away physical disk management. That's why I said your description of ext4 "only for single-disk single-node systems" as a little unfair.

Having said all that, I agree with you that ZFS is certainly better in many ways. It's just not used in most serious commercial applications, where ext4 and xfs prevail (and, in my commercial experience with thousands of Linux systems, work pretty well).

From a personal perspective, my FreeBSD ZFS server provides a list of services remarkably similar to what one gets from a Dell EMC Data Domain storage system -- and I'm fairly smug about that...

Beastie7 · Aug 29, 2024

gpw928 said:
The Linux Logical Volume Manager is able to virtualise (optionally) redundant storage, creating, replicating, deleting, growing, or shrinking logical volumes on-line at will. In these circumstances, ext4 and xfs continue to provide sound options. And with no main stream commercial support for ZFS (I'll ignore Oracle), this is the prevailing (and absolutely overwhelming) commercial model.

You can't compare LVM to a zpool. They're fundamentally different. One is separate from the underlying filesystem, and the other inherits the design constraints of it's CoW nature. Linux folks seem to forget (or be oblivious) to this distinction. ZFS isn't just some meme. It's the last (latest?) word on filesystems. IIRC only NetApp and Apple have similar filesystems.

gpw928 said:
Having said all that, I agree with you that ZFS is certainly better in many ways. It's just not used in most serious commercial applications, where ext4 and xfs prevail (and, in my commercial experience with thousands of Linux systems, work pretty well)

You might want to take a look at iXsystem's clients. I beg to differ.

gpw928 · Aug 29, 2024

Beastie7 said:
You can't compare LVM to a zpool. They're fundamentally different

I have used both, quite extensively, and for a very long time (since before you could boot from ZFS). So I get that. What I am saying is that:

ZFS does not work well when the underlying storage is virtualised, and
in the vast majority of real world commercial Unix-like systems (RHEL), nearly all storage is virtualised (and uses LVM with ext4 or xfs).

Beastie7 said:
You might want to take a look at iXsystem's clients. I beg to differ.

iXsystems have great products. But they have only a few hundred employees. They are simply too small to count significantly in a global assessment of what's deployed and in use. I expect that will change, as companies like iXsystems grow and go public. But I don't see a lot of hope for ZFS to be widely deployed on Linux because of the licensing issues. That's probably good news for FreeBSD.

ralphbsz · Aug 29, 2024

I mostly agree, so very minor comments:

gpw928 said:
In situations where Unix-like systems are deployed in significant numbers (large data centres including the cloud) they are almost inevitably virtualised ~~Red Hat~~ Linux.

For customers who need Linux support, it may be RHEL. The really larger customers (the FAANG and friends) instead roll their own distributions, often "based loosely" on a well-known one like Debian or CentOS. If you deploy 10 million Linux machines, you'll have enough engineering, you don't need to pay Red Hat.

For virtualised systems, dealing with a dead disk is usually handled transparently in a hypervisor or storage array. ...
With physical placement and rectification of disk hardware maladies off the list of issues for the client operating system (or volume manager) to address, the advantages of ZFS on that client are somewhat less pronounced.

Absolutely. In the large deployments, individual clients do not run against real disks. Their "block devices" are in reality virtual volumes that run on complex layers of storage, usually with error checking, snapshots, load balancing, internal logging (partially to get the the speed of SSD at the cost of HDD), and so on. In the ones I'm familiar with, the Linux LVM isn't even used, since there are better things in the layers below. And those virtual disks de-facto don't fail: most cloud vendors advertise 11 nines of durability, and those claims are not a lie.

That's why I said your description of ext4 "only for single-disk single-node systems" as a little unfair.

Absolutely! My comment was targeted at the "home" user who actually has physical disks (that includes small commercial systems): they are better served by ZFS, in particular if they have 2 or 5 disks available. If you are inside a giant cloud deployment, the game changes. Or if your data is disposable. For example, I treat Raspberry Pi's as disposable computers: If the SD card fails (happens occasionally), or if the OS has a hiccup, I just put a blank SD card in and re-image them.

Beastie7 said:
ZFS isn't just some meme. It's the last (latest?) word on filesystems. IIRC only NetApp and Apple have similar filesystems.

Similar and better things exist in other parts of the commercial space. Not just in NetApp, who used to be head and shoulders above the competition, and had the intellectual leadership, but that was 25 years ago. These better solutions are not commonly bought by small and medium users. I mean, who has a PureStorage or DDN or Spectrum Scale or Data Domain at home or under the desk in the office? And if you were able look inside the big cloud providers, they have stuff that's at least as good.

gpw928 said:
ZFS does not work well when the underlying storage is virtualised,

I would word that differently: If the underlying storage is virtualized extremely well (so it is reliable, balanced, fast, error-proof), then the power of ZFS is wasted on it, and one needlessly pays for the overhead of it.

Example: I run a FreeBSD machine in the cloud, at one of the large providers. Its file system is UFS.

Beastie7 · Aug 29, 2024

gpw928 said:
ZFS does not work well when the underlying storage is virtualised

That's interesting. Could you explain why? I'd wager this could be a problem worth fixing. Especially with regards to bhyve. I haven't seen fixes for the ARC and mmap/page cache issue either.

LibreQuest · Aug 29, 2024

msplsh said:
No, you can't, and I encourage you to get into an argument about it with the people who have participated in all the threads about if this was possible or not.

Everyone who needs an app that depends on electron.

I've used yum and apt and all other sorts of linux package mangement and they either work or they don't, vs FreeBSD where maybe something will break, hope you read UPDATING, GLHF. I appreciate that I can fix it (I could probably fix linux if I actually had to, but... NEVER HAD TO), but I don't appreciate the hosing. Forum littered with examples.

Yeah, sure, there is no "best" but let's not pretend FreeBSD is something it is not.

In my opinion, the only advantage is using css draggable regions in a frameless application window. I find PWA to be a much simpler option and is more consolidated rather than making mobile apps and electron apps or other web application solutions.

EDIT: Also, I have used Btrfs on Nobara OS before moving to FreeBSD. It did some nice lvm by default but it was replaced in later versions with ext4 I believe. Not sure if there are issues with btrfs but I didn't use it more than a year so I don't have much experience with it running on any of my hardware. ZFS has given me no issues. As a new user I did have to learn how to properly use ZFS but that is to be expected.

Regarding the main topic it's very interesting that this dev tried to push an update out as a bug fix. Interesting stuff.

zirias@ · Aug 29, 2024

msplsh said:
Only supports one version of electron. Way better! Top leader!

That's wrong. There are multiple electron versions available. What's true is only one is built as a binary package at the moment because of a shortage in builder resources ... (I'm the one who opened the PR about that! From what was written there, it seems there will be new/more hardware eventually at least).

msplsh said:
Packages disappear out underneath you if a security fix breaks it.

Packages can disappear from the repository. Not from your machine, unless you deliberately ignore what pkg is telling you and just hit "yes" ... a common bad habit. Regarding the repository, I very much prefer that approach to the alternative: Just keep the old package and hope nothing breaks in weird ways because of changed dependencies.

msplsh said:
Can't mix ports and packages. Easier management!

msplsh said:
No, you can't, and I encourage you to get into an argument about it with the people who have participated in all the threads about if this was possible or not.

Wrong again, nobody ever said you can't. To do it manually, you DO need a good thorough understanding how things work, and it's still easy to mess up, that's why it's never a recommended thing to do. There's been a perfectly safe option for a while now: Let poudriere do it while building your own repo. Poudriere will only add pre-built packages that are a perfect match to your configuration.

That said, even without mixing in packages, messing up with locally built ports happens easily. I'd like to deprecate that altogether and recommend to only ever use pkg to install something on your live system, with the option to build your own repo with poudriere.

msplsh · Aug 29, 2024

zirias@ said:
There are multiple electron versions available

Pedantic. You know what the problem is and why it's a problem. It's not better than other distros that have multiple binaries available.

zirias@ said:
common bad habit

I think you missed where I was trying to express that it was a bad thing that it happened at all. It's only a "bad habit" because it can happen.

zirias@ said:
you DO need a good thorough understanding how things work, and it's still easy to mess up, that's why it's never a recommended thing to do.

Ah yes, thank you for more pedantry! I believe my comment on this was "easier management" in comparison to other systems and you explaining how it do it is super complicated illustrates how FreeBSD does not have easier package management than other systems. I expect your next comment to be about how ports is not "package management" and then we can go round and round again with the electron problem!

If it were to be impossible to build ports and mix with pkg, I would like pkg to just build the port with whatever options are necessary instead of making the user configure poudriere. A "The following ports will have to be built from source" message, or whatever. Turn the option off by default, who cares. Just label the packages internally with the configuration options so systems can figure out if there's a prebuilt or not, sheesh.

My point was that FreeBSD is not inherently superior to these other systems.

zirias@ · Aug 29, 2024

I'm not sure what exactly you're arguing against, I don't see anyone here claiming that everything in FreeBSD was inherently better than everything else ... so maybe your problem is a different one. It doesn't matter much, your claims were wrong, and no amount of strawmanning and quote cutting would change that ...

What I do see here is quite some bashing for whatever reason. Hey, there's a lot to IMHO rightfully bash ... idiotic stuff like flatpak/snap/whatever (or worse, just deliver everything as a docker image?) for example. And of course electron, it's a massively stupid idea to do seemingly simple things by bundling a ton of browser code with it and a horrible pile of node.js dependencies on top (note that's unrelated to the usefulness of the software actually using that stupid idea, so, of course we want electron in our ports/packages, even though it's horse shit). Many more things probably, and almost all of those aren't really Linux, but are somehow tied to typical Linux systems.

The only thing I don't get is why an obviously very sane thing done by Linus, with a pretty childish and stupid reaction by some over-ambitious filesystem developer, is a trigger to get into that kind of bashing ?

mer · Aug 29, 2024

msplsh said:
My point was that FreeBSD is not inherently superior to these other systems.

Conversely, FreeBSD is not inherently inferior to these other systems.

Like anything manmade, it has warts. Other things have different warts. One choses to use a specific thing because it fits their need and the warts don't make a difference.

But that's my opinion, warts and all.

zirias@ · Aug 29, 2024

Pretty much what mer said. Of course there are things done better elsewhere, but then please come up with real examples if you feel you must talk about that. Like Debian's dpkg/apt infrastructure has well-working subpackages (in FreeBSD ports, these are a relatively new feature, I didn't really look into it yet, but from what I hear, it's still far from perfect) and categories for optional (recommended, suggested) dependencies, while it's either "strictly required" or nothing in FreeBSD ports...

For some things like the strategy about packages failing to build, there just isn't a clearly "better" solution, you have to pick one of two bads: Either leave a package out of the repository, or keep the older one with the risk of breakage in other packages depending on it. I prefer FreeBSD's approach here. And after all, we're on a FreeBSD forum, so it's quite likely to find people here prefering the FreeBSD ways for many things. ?

msplsh · Aug 29, 2024

zirias@ said:
trigger to get into that kind of bashing

IMO, people see "linux" on this forum and get defensive.

msplsh · Aug 29, 2024

zirias@ said:
I'm not sure what exactly you're arguing

These things could be better. I don't think other things are worse.

fmc000 · Aug 29, 2024

T...L

kpedersen · Aug 29, 2024

msplsh said:
IMO, people see "linux" on this forum and get defensive.

People often came to this forum to get away from broken ideas. In many ways you can expect they don't want to discuss it or when they do, it won't be with joy and happiness in their hearts

msplsh · Aug 29, 2024

So, defensive and also unable to resist getting closer to posts involving the "broken ideas" they wanted to get away from.

IDK, maybe modify the description to defensive proselytizing.

gpw928 · Aug 29, 2024

Beastie7 said:
That's interesting. Could you explain why? I'd wager this could be a problem worth fixing. Especially with regards to bhyve. I haven't seen fixes for the ARC and mmap/page cache issue either.

I think that there's a fair bit written on it. From the top of my head:

ZFS makes assumptions about redundancy when constructing a RAID set. The assumption that the "disks" are independent, and distributed parity can be used to recover if one "disk" fails. As soon as you virtualise the storage, this assumption can be broken.
ZFS also assumes that I/O optimisation can be done by actuating the read/write heads on all "disks" simultaneously, i.e. striping. If those heads are not independant, then the striping algorithms can cause massive head contention.
ZFS is a copy on write (CoW) file system. The conventional wisdom is that using a CoW file system to provide virtualised storage to another CoW file system is undesirable. I think that the main problem is write amplification.

rbranco · Aug 29, 2024

Debian Orphans Bcachefs-Tools: "Impossible To Maintain In Debian Stable" - Phoronix

www.phoronix.com

kpedersen · Aug 29, 2024

msplsh said:
So, defensive and also unable to resist getting closer to posts involving the "broken ideas" they wanted to get away from.

Better the enemy you know. You don't run from broken ideas, you tackle them head on, expose them and discard them quickly.

Broken ideas are like garden weeds. Treating them is easier before they get together, grow and multiply. Of course, having the weeds not appearing in the first place is the best outcome.

But I suppose once the weeds have fscked up everyone elses garden, it is only natural that they spread to ours, like a rot.

ralphbsz · Aug 30, 2024

gpw928 said:
ZFS makes assumptions about redundancy when constructing a RAID set. The assumption that the "disks" are independent, and distributed parity can be used to recover if one "disk" fails.

ZFS, being designed for small systems (a few disks to dozens of disks), has no notion of failure domains and of correlated failures. It is designed around the assumption that every block device is independent of all others. This assumption is sensible if the main source of failures are disk drives, and each block device corresponds exactly to a disk drive. It also doesn't try to deal with interesting failure scenarios: A disk either works perfectly, or it has an individual sector error, or a whole disk goes away (failstop). In the real world of large storage systems, both assumptions are wrong. For example, groups of disks may have correlated failures. For example a system with 100 disks may have 10 backplanes, each with 10 disks. Failure of a backplane will knock out 10 disks at once. When doing RAID layout, you need to make sure you never use two disks from the same backplane in a RAID set. On a virtualized storage system, the assumption of failure independence is more complex than the simply failure domain example I gave. Or for example a disk may be slowly failing, and you want to begin slowly draining data from it, but without starting a full resilvering which is equivalent to the disk being dead, because the data that is still on the disk remains readable, but no new data should be written to it (with HAMR/MAMR disks, this will become a common syndrome, where one part of a disk goes readonly). Or a disk may have developed performance problems, and the optimal system configuration is to decrease the load on just that disk, by allocating less data onto it. These are all things that a file system integrated with the underlying virtual storage layer can do well, but ZFS doesn't know that the underlying layer is virtualized.

This doesn't mean that ZFS is broken, on the contrary: it is great on a small number of physical disks. But the way it is designed means that it works best if used on a small number of physical disks.

ZFS is a copy on write (CoW) file system. The conventional wisdom is that using a CoW file system to provide virtualised storage to another CoW file system is undesirable. I think that the main problem is write amplification.

Depending on how the underlying storage system is implemented, the workload presented by ZFS's CoW style writing can either lead to terrible performance, or to great performance. The thing is that the average user doesn't know ahead of time which it is going to be.

Jose · Aug 30, 2024

gpw928 said:
The Linux Logical Volume Manager is able to virtualise (optionally) redundant storage, creating, replicating, deleting, growing, or shrinking logical volumes on-line at will. In these circumstances, ext4 and xfs continue to provide sound options. And with no main stream commercial support for ZFS (I'll ignore Oracle), this is the prevailing (and absolutely overwhelming) commercial model.

I get what you mean by "single disk file system" but LVM makes them highly flexible, and storage arrays take away physical disk management. That's why I said your description of ext4 "only for single-disk single-node systems" as a little unfair.

Been there, done that, never again. Linux LVM is a mess of poorly designed tools. It's configuration is large and confusing, and changes randomly from release to release. You're left wondering why your machine won't boot (if you were foolish enough to use LVM in the boot drive) or at least why it won't mount the volumes that worked just yesterday. Hard pass.

I'd like to see an example of a volume whose size changed "on-line at will". Use ZFS datasets, and this is not even a problem.

recluce · Aug 30, 2024

msplsh said:
Only supports one version of electron. Way better! Top leader! Packages disappear out underneath you if a security fix breaks it. Can't mix ports and packages. Easier management!

Hijacking thread with BS on stuff only very few people use, some wrong statements and a lot of ranting. That is what makes a classic troll post, in my book. Welcome to my (very small) ignore list on this forum.

gpw928 · Aug 31, 2024

Jose said:
Been there, done that, never again. Linux LVM is a mess of poorly designed tools. It's configuration is large and confusing...

I find myself wedged into the uncomfortable position of defending LVM against ZFS.

I agree "that ZFS is certainly better in many ways" to LVM and ext4, and I would almost always choose ZFS for personal use.

My original intent was to assert that the ext4 file system was not "only for single-disk single-node systems", and was the defacto choice in real world commercial deployments.

ralphbsz rightly asserts above that "[ZFS] works best if used on a small number of physical disks". So, a vast improvement from Online Disksuite on a physical Sun server, and seriously good functionality anywhere else you have a moderate number of physical disks. But also with some curly problems when used to provision storage for virtual machines.

My experience of LVM, ext4, and xfs is in the context of virtualised data centres where "physical disks" are absent, Linux is the standard choice, and ZFS is not available. That experience is fairly positive (mostly because I had no choice but to master that "large and confusing" mess).

But I will also confidently assert that ZFS ecosystem is also large and confusing... OK, much better design, but still "large and confusing". I actually had to buy books on ZFS, something I never had to do for LVM.

Jose said:
I'd like to see an example of a volume whose size changed "on-line at will". Use ZFS datasets, and this is not even a problem.

I have lost count of the number of times I have grown ext4 and xfs file systems on-line with lvextend. I'll admit that shrinking file systems on-line is dangerous, and I'd always prefer to do it off-line. But shrinking is acutely rare (read never happens).

Even zpools can run out of space -- admittedly much less common because of the shared headroom.