Auto (ZFS) Install - why is compression enabled by default?

And that trying to compress already compressed data is an absolute waste of resources.
It was already mentioned that compression algorithms have a fast way of detecting this.
Old means x32 type of old.
ZFS on i386 is bad for a lot of *real* reasons, compression isn't one of them.
The main reason why I eventually moved away from that one was ... it got slow. Noticeably.
That can't be related to compression, as the default setting is only used when creating new datasets; your existing datasets could not be toggled to use compression by switching the compression default.
I re-did my VM box without compression. And I can see the difference in my specs.
Graphs?
And before you ask me to share graphs and what not: that's not even the point here!
It is, so graphs please? :)
Whatever happened to freedom of choice?!
Code:
peter@bsd:/home/peter $ man zpool-create | grep compression
peter@bsd:/home/peter $ man zfs-create | grep compression
peter@bsd:/home/peter $
You are just looking in the wrong place, see zfsprops(7).
This 'feat' isn't even mentioned in the install screens, there's also no option to turn this mess off.
It is now, at least in FreeBSD 15 snapshots, so you can see that lz4 is being used for the root dataset.
 
peter@bsd:/home/peter $ man zpool-create | grep compression
peter@bsd:/home/peter $ man zfs-create | grep compression
Try man zfsprops | grep compression.

That said, zfsprops(7):
Code:
     compression=on|off|gzip|gzip-N|lz4|lzjb|zle|zstd|zstd-N|zstd-fast|zstd-fast-N
       Controls the compression algorithm used for this dataset.

       When set to on (the default), indicates that the current default
       compression algorithm should be used.  The default balances compression
       and decompression speed, with compression ratio and is expected to work
       well on a wide variety of workloads.  Unlike all other settings for
       this property, on does not select a fixed compression type.  As new
       compression algorithms are added to ZFS and enabled on a pool, the
       default compression algorithm may change.  The current default
       compression algorithm is either lzjb or, if the lz4_compress feature is
       enabled, lz4.
See manual for rest of "compression" property description.
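On a live system you can check what the property is actually set to, and where the value comes from, with zfs get. A sketch, assuming a pool named zroot (names and output here are illustrative):

```
# Show the compression property for every dataset in the pool,
# plus whether the value is local, inherited, or the default.
% zfs get -r -o name,property,value,source compression zroot
NAME          PROPERTY     VALUE  SOURCE
zroot         compression  lz4    local
zroot/ROOT    compression  lz4    inherited from zroot
...
```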

If you ask me then it almost looks like people don't want us to know about any of this mess.
It's there, but the OpenZFS manuals are separated per subcommand (unlike the FreeBSD Solaris ZFS manuals: zfs, zpool), which makes it harder to find a specific expression if one does not know where to look. Perhaps zfsall and zpoolall meta man pages (like zshall(1)) would be helpful for finding specific terms.

One could use find(1):
Code:
% find /usr/share/man -type f \( -name "zfs*" -o -name "zpool*" \) -exec zgrep compression {} +
But it's a long command, not easy to remember.
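One way around that is to wrap it in a small sh function. The name manzgrep and the optional directory argument are my own invention, purely for illustration:

```shell
# Hypothetical helper: search the compressed zfs*/zpool* manual pages
# for a pattern. An optional second argument overrides the man page
# directory (handy for testing); it defaults to /usr/share/man.
manzgrep() {
    find "${2:-/usr/share/man}" -type f \
        \( -name "zfs*" -o -name "zpool*" \) \
        -exec zgrep -- "$1" {} +
}
```

After adding it to ~/.shrc, `manzgrep compression` replaces the full find(1) invocation.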

More convenient is textproc/ugrep.

csh, tcsh, zsh, etc.:
Code:
% ug -rIzw compression /usr/share/man/man{7,8}

Bourne shell:
Code:
$ ug -rIzw compression /usr/share/man/man[78]

There is another fast grep style search utility, textproc/the_silver_searcher, but it doesn't work well with compressed files (man.1.gz).


This 'feat' isn't even mentioned in the install screens, there's also no option to turn this mess off.
If you are referring to the menu-guided installation from an installation medium: a zpool creation options dialog menu was implemented at the end of 2024 in main (alias CURRENT), and has also been merged for the upcoming 14.3-RELEASE.

bsdinstall zfsboot: Add an option to edit the ZFS pool creation options

Here is an image of how the menu looks (from https://reviews.freebsd.org/D47478).
 
There is no denying the fact that compression gobbles up CPU cycles. And that trying to compress already compressed data is an absolute waste of resources.
[...]
Fact remains that not using compression leads to a more responsive system. I'm not just venting, I have proof.

In Zstandard Compression in OpenZFS, The FreeBSD Journal, March/April 2021, Allan Jude wrote:
For various historical reasons, compression still defaults to "off" in newly created ZFS storage pools.
Apparently the 'various historical reasons' were overcome, Default to ON for compression - commit, Mar 3, 2022:
Code:
A simple change, but so many tests break with it,
and those are the majority of this.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13078

In the official OpenZFS 2.2.0 release (Oct 13, 2023), zfsprops(7) - ZFS v2.2 for the first time explicitly mentioned that compression is set to on by default:
Code:
compression=on|off|gzip|gzip-N|lz4|lzjb|zle|zstd|zstd-N|zstd-fast|zstd-fast-N
    Controls the compression algorithm used for this dataset.

    When set to on (the default), indicates that the current default compression algorithm should be used.
Before that version, it looks like it wasn't being explicitly mentioned either way, that is, neither:
  • set to on by default
    nor
  • set to off by default
although it was set to off by default when a pool was created.

As OpenZFS 2.2.0 was introduced with 14.0-RELEASE - Release Notes, ZFS Changes (20 Nov 2023), this includes the 'default on' setting for compression. As 13.5-RELEASE still uses OpenZFS 2.1.14, it looks like the 'default off' setting for compression is still in effect there; I haven't checked.
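Which OpenZFS version (and hence which default) a given system carries can be checked directly; a sketch, with illustrative output:

```
# Print the userland and kernel module OpenZFS versions.
% zfs version
zfs-2.2.4-FreeBSD_g...
zfs-kmod-2.2.4-FreeBSD_g...
```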

If you want to argue OpenZFS' decision of using the 'default on' setting for compression on the merits, I suggest opening a relevant discussion at the OpenZFS github. I think there is much more fundamental ZFS knowledge about this available from the ZFS developer community.
 
This thread kinda spiraled out of control so to speak, but I don't think OP was wrong. There is no denying the fact that compression gobbles up CPU cycles. And that trying to compress already compressed data is an absolute waste of resources.

And the fact that no one bothered to come up with stats of "with" and "without" compression .... IMO that says enough. Fun fact: there's a major difference.

Which makes me laugh because some people easily mention FreeBSD as a solid choice for a Raspberry Pi (and rightfully so!) yet then you get this nonsense?

Fact remains that not using compression leads to a more responsive system. I'm not just venting, I have proof.

First.. I just discovered this today; I accidentally made a new thread about it. The thing is.. I maintain a LAN server (mostly running Minecraft) and this used to be an old PowerEdge running FreeBSD. Old means x32 type of old. 8 years old.

The main reason why I eventually moved away from that one was ... it got slow. Noticeably. And many people claimed that "ZFS is slow", which is something I never bought. So recently I fired up a new VM (FreeBSD powered) using Hyper-V. You can say whatever you want about Microsoft, they do know how to run their statistics. Lo and behold... I re-did my VM box without compression. And I can see the difference in my specs.

And before you ask me to share graphs and what not: that's not even the point here!

Whatever happened to freedom of choice?!
Code:
peter@bsd:/home/peter $ man zpool-create | grep compression
peter@bsd:/home/peter $ man zfs-create | grep compression
peter@bsd:/home/peter $
This 'feat' isn't even mentioned in the install screens, there's also no option to turn this mess off.

If you ask me then it almost looks like people don't want us to know about any of this mess.

And the main reason I am reacting as annoyed as I do now... is because this takes me back to when software (often games) put too much strain on hardware, more or less forcing people to upgrade to get things to work. Windows, as much as I love & respect it today, was a solid example of this crap methodology.

They also gave us no choice in the matter back then. MS clued up, and lo and behold... we're right back to square one: same mindset, different environment: "we know better than the user".

Give me a break!
In my experience, yes... ZFS gets slow. Even atime updates (disabled by default on almost the entire install) seemed to fragment what was originally a streamlined write, to the point where mechanical drive performance drops horribly.

Compression does cause CPU+RAM overhead, but once disk data is involved it's 'usually' the disks that are the bottleneck. Some RAM overhead may be made up by having the ZFS ARC hold more data, as it keeps the compressed version in RAM.

My ZFS-induced slowdowns haven't yet shown they are caused by compression, unless it's zstd turned up quite high (write bottleneck). Reads use multiple CPU cores too, so disk speed vs. CPU speed + cores should be considered. Compression has caused speedups for me, since I get more data from point A to point B with it.
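Whether compression is actually paying off on a given dataset can be read from the compressratio property; a sketch with hypothetical dataset names and numbers:

```
# compressratio reports the achieved ratio; 1.00x means nothing was saved.
% zfs get -o name,value compressratio zroot/usr zroot/var/log
NAME           VALUE
zroot/usr      1.71x
zroot/var/log  4.53x
```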

As for having control: the installer has a spot to override ZFS properties, but it doesn't let us customize each dataset separately through bsdinstall. The installer needs a rework so that creating partitions/pools and filesystems/datasets, with or without GEOM providers, and controlling the properties on each can be done effectively without just leaving to a shell to do it manually. Similarly, the installer could benchmark compression, encryption, disks, etc. on the system, to be able to give the user recommended settings (max space, max compression without bottlenecking a transfer, lighter compression measurements to decide how much or little overhead to take for the system's use, etc.).
 
It was already mentioned that compression algorithms have a fast way of detecting this.

There are a number of catches to that.

1) Some compression algorithms have a higher startup cost before they start outputting compressed data at all. I have been told that zstd is one of those.

2) It doesn't matter in ZFS as of now. With the exception of lz4, all compression "probing" is done by compressing a block into a fixed-size buffer; when the buffer runs out, it just doesn't use the compressed version.

So you almost always get the full brunt of compression when writing incompressible data.
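The effect is easy to see outside ZFS with any compressor: high-entropy (already compressed or random) data simply doesn't shrink, so the CPU time spent on it is pure overhead. A small demonstration using gzip rather than ZFS itself:

```shell
# 1 MiB of zeros (highly compressible) vs 1 MiB of random bytes
# (effectively incompressible).
dd if=/dev/zero of=/tmp/zeros.bin bs=1024 count=1024 2>/dev/null
dd if=/dev/urandom of=/tmp/random.bin bs=1024 count=1024 2>/dev/null
gzip -kf /tmp/zeros.bin /tmp/random.bin
# The zeros shrink to roughly a kilobyte; the random data actually
# grows slightly, since gzip still adds header overhead.
ls -l /tmp/zeros.bin.gz /tmp/random.bin.gz
```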

Having said that, I never found that I have to use up cores that I didn't have to spare anyway.
 
Is the compression property only settable on a zpool, or can it also be set on a dataset? Asking because I'm feeling lazy about doing "man zfsprops".

If it can also apply to a dataset, then I think the property would be inherited from the zpool as the default unless expressly overridden at dataset creation time.

If that premise is true, then it would make sense to have it default to "on" for the zpool but expressly override it in the installer when it creates the datasets.

Having it default to on may show acceptable tradeoffs between CPU usage and "waiting on the device" for some majority of configurations and use cases, especially if you think about different vdev configurations where the slowest device is the gate for the whole thing.
Now if you are using things based on NVMe devices, the math will certainly change.

So if all my assumptions above are at least partially true, my opinion is that having compression=on for zpools in the installer is not nefarious or "we know better than you". Worst case, one can always drop to a shell in the installer, manually create things with the properties they desire, and continue the install.
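For the record, compression is a per-dataset property: it can be set pool-wide at creation time, overridden on any dataset, and is inherited downward. A sketch with a hypothetical pool named tank on disk da0:

```
# Pool-wide default at creation time (-O sets filesystem properties):
% zpool create -O compression=lz4 tank da0
# Override for one dataset; its children will inherit "off":
% zfs create -o compression=off tank/scratch
# Revert a dataset to inheriting from its parent:
% zfs inherit compression tank/scratch
```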
 
If you want to argue OpenZFS' decision of using the 'default on' setting for compression on the merits, I suggest opening a relevant discussion at the OpenZFS github. I think there is much more fundamental ZFS knowledge about this available from the ZFS developer community.
Well, it's not so much the feature itself which bugs me, but rather the fact that as soon as it comes to ZFS, the FreeBSD system makes a lot of choices for you out of the box, sometimes without even directly mentioning this.

Dataset compression is one example, another being that /usr/sbin/adduser automatically creates a new dedicated dataset for a new user when you're using ZFS, also without asking and also opt-out instead of opt-in. Or the default root filesystem which I personally think is very awkwardly set up (referring to beadm). It's the combination of the whole thing that sometimes irks me a little. But not enough to start stirring things up; I got it out of my system now, got a lot of new solid information to reflect on (thanks guys!), all good.

Also considering that nothing gets enforced. At the time of writing I re-did my server setup, made sure to turn compression off for my main zroot dataset (I always set up a server manually, without the main installer) and only enabled it again wherever I needed it and well... things are running very smoothly so far.
 
Old means x32 type of old.
ZFS on i386 is bad for a lot of *real* reasons, compression isn't one of them.
I have the same general view about using ZFS on i386 (apart from the fact that i386 support in FreeBSD is being phased out). However, for i386 and other 32-bit architectures there seem to be additional costs for LZ4 compression (for other algorithms as well, I imagine), [zfs] LZ4 compression algorithm:
2) LZ4 compression functions allocate 16k of state on the stack on
64-bit systems (since their stacks are large enough) and on the heap
on 32-bit i386 (since its stack is too small for this). This can be
changed by lowering COMPRESSLEVEL to <= 11, but this hurts
compression ratio with no speed gain on 64-bit systems (our main
target going forward, I guess).
A cursory scan of sys/cddl/contrib/opensolaris/common/lz4/lz4.c suggests that this is still the case. There seem to be other differences, such as the block size used, but I'm far from knowledgeable in this area.

Therefore, it would be interesting to hear what OpenZFS on GitHub has to say about the speed impact of using (LZ4) compression, with its current compression algorithm and compression level selections, on i386 / 32-bit systems; the specific VM environment used may be relevant too.
 