[Solved] ZFS: unexplained space overhead for small-ish files in large records

I copied the same 478 MP3 files of 2–7 MiB each (1940 MiB in total) to datasets with different recordsize and compression settings:
Code:
NAME                 RECSIZE  REFER  LREFER  COMPRESS
safe/test/none-1M         1M  2.09G   2.09G       off
safe/test/zle-1M          1M  1.87G   2.09G       zle
safe/test/lz4-1M          1M  1.86G   2.09G       lz4
safe/test/none-512K     512K  1.98G   1.98G       off
safe/test/lz4-512K      512K  1.86G   1.98G       lz4
safe/test/none-64K       64K  1.88G   1.88G       off
safe/test/lz4-64K        64K  1.85G   1.88G       lz4
safe/test/none-4K         4K  1.90G   1.88G       off
safe/test/lz4-4K          4K  1.89G   1.88G       lz4
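For reference, the test setup can be reproduced with something along these lines (the source path, mountpoints and exact set of recordsize/compression combinations are assumptions on my part):
Code:
# sketch: create test datasets with varying recordsize/compression and copy the same files into each
zfs create -p safe/test
zfs create -o recordsize=1M -o compression=off safe/test/none-1M
zfs create -o recordsize=1M -o compression=zle safe/test/zle-1M
zfs create -o recordsize=1M -o compression=lz4 safe/test/lz4-1M
# ... likewise for 512K, 64K and 4K ...
cp /path/to/mp3s/*.mp3 /safe/test/none-1M/    # repeat for each dataset

# then compare physical vs. logical space accounting
zfs list -r -o name,recordsize,referenced,logicalreferenced,compression safe/test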
I don't understand why larger recordsizes show more logicalreferenced space, or why that "overhead" (for lack of a better term) gets compressed away even by zle compression.

The best explanation I could come up with is "partial-record overhead", where 800K worth of data in a 1M record would waste 200K on zeroes. But all the documentation says this never happens, because partial records are stored truncated (down to ashift granularity).

So what's happening here?!
 
I was enlightened in IRC, and the explanation seems to be:
ZFS uses a variable record size only while a file is smaller than the recordsize; once the file grows larger than the recordsize, it takes up an integer number of full recordsize records, unless some kind of compression is enabled, which removes that overhead again. In other words: with compression off and recordsize=1M, a 1.1M file takes up 2M of space.
That explains both what I'm seeing and what the documentation says: the documentation usually talks about smaller-than-recordsize files being stored without overhead, not about files spanning multiple records that "don't fit" exactly (and so cause unexpected overhead, which even zle compression cures).
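A rough sanity check against the numbers above: with file sizes spread between 2 and 7 MiB, rounding each of the 478 files up to a whole number of 1 MiB records wastes on average about half a record per file, i.e. roughly 478 × 0.5 MiB ≈ 0.23 GiB, which is in the right ballpark of the 2.09G vs 1.88G gap (and ~0.12 GiB for 512K records, matching 1.98G vs 1.88G). Anyone who wants to verify against their own files could do something like this (a sketch; the mountpoint and FreeBSD stat(1) syntax are assumptions):
Code:
# sum actual file sizes vs. sizes rounded up to whole 1 MiB records
find /safe/test/none-1M -type f -print0 | xargs -0 stat -f %z | awk '
    { actual += $1; rounded += int(($1 + 1048575) / 1048576) * 1048576 }
    END { printf "actual: %.2f GiB  rounded up: %.2f GiB  overhead: %.2f GiB\n",
          actual / 2^30, rounded / 2^30, (rounded - actual) / 2^30 }'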

As always, the answer is "just use compression, it's basically free anyway" :)
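One caveat worth adding: setting compression on an existing dataset only affects data written afterwards, so files that are already there keep their tail-record overhead until they are rewritten. A minimal sketch (dataset name assumed from the test above):
Code:
# enable compression for future writes; existing blocks are left untouched
zfs set compression=lz4 safe/test/none-1M
# already-copied files only shrink once they are rewritten (e.g. copied off and back)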
 