ZFS disk activity every 2 seconds on an idle system

It looks like atime is enabled for all datasets on that zpool. But for this to be a problem, wouldn't something (in userspace) actually have to be reading the filesystem? `zpool iostat -v 1` doesn't show any read operations, only write operations.
Log files, for example, are constantly accessed and written. Those writes end up in cache/txg/dirty_writes first and are only periodically committed, but as said, IIRC ZFS metadata is updated ASAP, so atime for a dataset with constantly open/updated files can very well be the reason...

Basically you can disable atime for everything that doesn't depend on this information. Things that do depend on it are e.g. maildir storage or some file-based databases. The only dataset I could quickly find that actually needs it was the /var/mail dataset on one of our mailservers...

You also mentioned VMs - any chance there is something from Redmond running on one of those? That toy OS is extremely pesky when it comes to random and pointless disk IO even when completely idle...
 
What does
zfs diff sea/media/music@test sea/media/music
show at this point?

Code:
# zfs diff sea/media/music@test sea/media/music
Unable to obtain diffs: No such file or directory
# zfs list | grep music
sea/media/music                                 365G   807G   365G  /sea/media/music
# zfs list -t snapshot | grep music
sea/media/music@2025-03-14                13.2M      -   365G  -
sea/media/music@2025-03-23_1              8.68M      -   365G  -
sea/media/music@testing                   8.55M      -   365G  -

This doesn't bode well...
 
You also mentioned VMs - any chance there is something from Redmond running on one of those? That toy OS is extremely pesky when it comes to random and pointless disk IO even when completely idle...

The VM is linux and the issue persists after `service vm stop`, so I don't think this is the cause.

so atime for a dataset with constantly open/updated files can very well be the reason...

I turned off atime on all datasets on the affected pool (i.e. `sea`):

Code:
# for i in $(zfs list | grep sea | awk '{print $1}'); do echo $i; zfs set atime=off $i; done

but alas, no cigar.
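
A slightly more defensive way to script that change is to ask zfs for the dataset names directly instead of grepping the human-readable listing, since `grep sea` would also match a dataset on another pool that happens to have "sea" in its name. A minimal sketch, assuming a POSIX shell; `atime_off_cmds` is a hypothetical helper name, and it only prints the commands so they can be reviewed before running:

```shell
# Hypothetical helper: read dataset names on stdin (one per line) and
# print, but do not run, one "zfs set atime=off" command per dataset.
atime_off_cmds() {
    while IFS= read -r ds; do
        # Skip empty lines.
        [ -n "$ds" ] || continue
        # Single-quote the name so names containing spaces survive "| sh".
        printf "zfs set atime=off '%s'\n" "$ds"
    done
}
```

Feed it with `zfs list -H -o name -t filesystem -r sea | atime_off_cmds` and pipe the result to `sh` once it looks right. Since atime is an inheritable property, a single `zfs set atime=off sea` should also cover every child that has no local override.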
 
Are you sure this is from ZFS? Reboot, stop at the boot menu, then wait and see whether the same sounds occur. I have a Windows laptop with similar effects, and I think it is HDD calibration.
The earlier zpool iostat result shows that the writes are coming from zfs/system as opposed to HDD background scans.
 
Code:
# zfs diff sea/media/music@test sea/media/music
Unable to obtain diffs: No such file or directory
# zfs list | grep music
sea/media/music                                 365G   807G   365G  /sea/media/music
# zfs list -t snapshot | grep music
sea/media/music@2025-03-14                13.2M      -   365G  -
sea/media/music@2025-03-23_1              8.68M      -   365G  -
sea/media/music@testing                   8.55M      -   365G  -

This doesn't bode well...
Are those volumes rather than filesystems? If they are, there is something from the consumer of the volume (VM ?) that is driving the writes.
 
sea/media/music is a filesystem

Code:
$ zfs list -t filesystem | grep music
sea/media/music                                 365G   807G   365G  /sea/media/music
 
Are those volumes rather than filesystems?

sea/media/music is a filesystem
Don't draw the wrong conclusions here. A typo is responsible for the error message: the @snapname in the zfs-diff(8) argument dataset@snapname should be "testing" instead of "test".
Code:
# zfs diff sea/media/music@test sea/media/music
Unable to obtain diffs: No such file or directory
...
# zfs list -t snapshot | grep music
sea/media/music@2025-03-14                13.2M      -   365G  -
sea/media/music@2025-03-23_1              8.68M      -   365G  -
sea/media/music@testing                   8.55M      -   365G  -

This doesn't bode well...
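
One way to catch this kind of typo before running `zfs diff` is to check for an exact, whole-line match against the snapshot list. A small sketch, assuming a POSIX shell; `snap_exists` is a hypothetical helper name, and it reads snapshot names on stdin so it does not call zfs itself:

```shell
# Hypothetical helper: succeed only if the full snapshot name given as $1
# appears as a complete line on stdin.
snap_exists() {
    # -F: fixed string, -x: whole line, -q: quiet.
    # Whole-line matching means "...@test" does not match "...@testing".
    grep -Fxq -- "$1"
}
```

For example, `zfs list -H -t snapshot -o name -r sea/media/music | snap_exists sea/media/music@test || echo "no such snapshot"` would have flagged the name used above.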
 
Gah! Thanks for pointing that out!

Here's the correct output:

Code:
# zfs diff sea/media/music@testing sea/media/music
#
 
try:


zfs snap sea/media/music@now
zfs send -i sea/media/music@testing sea/media/music@now | zstream dump -v


And perhaps share the output. You can add '-d' to show even more info.
 
OK, I hope I got this right:

Code:
# zfs snap sea/media/music@test1; sleep 20; zfs snap sea/media/music@test2
# # wait 20 seconds, listen to crunchx4
# zfs send -i sea/media/music@test1 sea/media/music@test2 | zstream dump -dv 2>&1 | tee zfslog
BEGIN record
        hdrtype = 1
        features = 4
        magic = 2f5bacbac
        creation_time = 67e31805
        type = 2
        flags = 0xc
        toguid = 54dc784a55accc76
        fromguid = 5f22927494c2bc3
        toname = sea/media/music@test2
        payloadlen = 0

END checksum = 41a6f46d6/114664886d1/252d09ba9c53/3680264e5644c
    checksum = 5b453e828/2978755b2f8/a4fb4e028c25/1cd48fe3a90c05
SUMMARY:
        Total DRR_BEGIN records = 1 (0 bytes)
        Total DRR_END records = 1 (0 bytes)
        Total DRR_OBJECT records = 0 (0 bytes)
        Total DRR_FREEOBJECTS records = 0 (0 bytes)
        Total DRR_WRITE records = 0 (0 bytes)
        Total DRR_WRITE_BYREF records = 0 (0 bytes)
        Total DRR_WRITE_EMBEDDED records = 0 (0 bytes)
        Total DRR_FREE records = 0 (0 bytes)
        Total DRR_SPILL records = 0 (0 bytes)
        Total records = 2
        Total payload size = 0 (0x0)
        Total header overhead = 624 (0x270)
        Total stream length = 624 (0x270)

I imagine we had hoped that all the byte sizes weren't zero.
 
Does zfs get written sea/media/music@test2 show non-zero?

Sorry; we want to check the written property of the filesystem before taking the snapshot. That is: find a filesystem with non-zero written, snapshot it as fs@new, and then send the incremental (-i fs@penultimate fs@new) once the fs@new snapshot exists.
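
Spotting the datasets with a non-zero written value can be scripted rather than eyeballed. A rough sketch, assuming a POSIX shell and tab-separated "name, written" input as produced by scripted-mode `zfs list`; `nonzero_written` is a hypothetical helper name:

```shell
# Hypothetical helper: read "name<TAB>written" lines on stdin and keep
# only the datasets whose written value is greater than zero.
nonzero_written() {
    awk -F '\t' '$2 + 0 > 0 { print }'
}
```

It can be fed with `zfs list -Hp -r -o name,written sea | nonzero_written`, where -H drops the header and -p prints exact byte counts. Anything it prints is a candidate for the snapshot-and-send step described above.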
 
Having trouble getting non-zero written values now. I think that the 136K we saw before was an unrelated fluke.

It's been about 5 mins since I took a snapshot and `zfs list -ro name,written sea` shows all zeros even though the disks have been crunching every 5 seconds for that whole interval.
 
What about the thing I pasted earlier, where dtrace showed that the syncer is fsyncing a path every second? Is that a lead?
 
OK since my last message we have a 5MB change on sea/media/music since sea/media/music@testing was tagged.

Doing:
Code:
# zfs snap sea/media/music@now
# zfs send -i sea/media/music@testing sea/media/music@now | zstream dump -dv 2>&1 | tee log

I've got a load of output in a log file now.

To paste a little of it, it looks like:
Code:
BEGIN record
        hdrtype = 1
        features = 4
        magic = 2f5bacbac
        creation_time = 67e33637
        type = 2
        flags = 0xc
        toguid = d89c1701befd1e5f
        fromguid = 839db504d252de04
        toname = sea/media/music@now
        payloadlen = 0

OBJECT object = 1 type = 21 bonustype = 0 blksz = 1024 bonuslen = 0 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
    checksum = 731cfd705/361c23c99e2/dc3225381cd1/26fb054521430e
FREE object = 1 offset = 1024 length = -1
    checksum = c27f6bbcd/6dae403c7a7/265e11e60e0f7/9fb145c36e4e92
FREE object = 1 offset = 1024 length = 2048
    checksum = fadfbce3b/b80ba01bafe/5300596a343ef/1bf938dc3fd23d7
OBJECT object = 2 type = 20 bonustype = 44 blksz = 1024 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
 5a 50 2f 00  02 04 18 00  ed 41 00 00  00 00 00 00   ZP/. .... .A.. ....
 0c 00 00 00  00 00 00 00  c3 46 af 00  00 00 00 00   .... .... .F.. ....
 e8 03 00 00  00 00 00 00  e8 03 00 00  00 00 00 00   .... .... .... ....
 22 00 00 00  00 00 00 00  44 01 00 00  08 04 00 00   "... .... D... ....
 bd 2c e3 67  00 00 00 00  60 a1 e3 33  00 00 00 00   .,.g .... `..3 ....
 da 8a d4 67  00 00 00 00  20 0c 76 03  00 00 00 00   ...g ....  .v. ....
 da 8a d4 67  00 00 00 00  20 0c 76 03  00 00 00 00   ...g ....  .v. ....
 35 bb db 65  00 00 00 00  48 8e 43 36  00 00 00 00   5..e .... H.C6 ....
 02 00 00 00  00 00 00 00  03 00 00 00  00 00 00 00   .... .... .... ....
 00 00 00 10  bf 01 1e 00  00 00 40 20  a9 00 12 00   .... .... ..@  ....
 00 00 00 40  a9 00 12 00                             ...@ ....
    checksum = 1409f8c8a6/1175cf86486c/997ec90cd7369/3f64a054659e147
FREE object = 2 offset = 1024 length = -1
    checksum = 1bd62c8e40/1d5b24add3b9/146e904908d0f8/accc0ae0e3f8d95
OBJECT object = 3 type = 20 bonustype = 44 blksz = 16384 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
 5a 50 2f 00  02 04 18 00  ed 41 00 00  00 00 00 00   ZP/. .... .A.. ....
 99 01 00 00  00 00 00 00  c3 46 af 00  00 00 00 00   .... .... .F.. ....
 e8 03 00 00  00 00 00 00  e8 03 00 00  00 00 00 00   .... .... .... ....
 22 00 00 00  00 00 00 00  44 01 00 00  08 04 00 00   "... .... D... ....
 b7 1c e2 67  00 00 00 00  30 90 08 32  00 00 00 00   ...g .... 0..2 ....
 60 bd ab 67  00 00 00 00  58 0d 8f 2e  00 00 00 00   `..g .... X... ....
 f2 c7 cc 67  00 00 00 00  a0 44 22 26  00 00 00 00   ...g .... .D"& ....
 60 bd ab 67  00 00 00 00  58 0d 8f 2e  00 00 00 00   `..g .... X... ....
 98 01 00 00  00 00 00 00  03 00 00 00  00 00 00 00   .... .... .... ....
 00 00 00 10  bf 01 1e 00  00 00 40 20  a9 00 12 00   .... .... ..@  ....
 00 00 00 40  a9 00 12 00                             ...@ ....
    checksum = 1ecacb11c6/269d5e1bd694/1eca5483b8ce36/128d27674670a183
FREE object = 3 offset = 65536 length = -1
    checksum = 272de9f72c/37cbf57109e9/34ca014f6d790f/25dd9eeaa337b3a3
OBJECT object = 4 type = 20 bonustype = 44 blksz = 512 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
 5a 50 2f 00  02 04 18 00  ed 41 00 00  00 00 00 00   ZP/. .... .A.. ....
 03 00 00 00  00 00 00 00  c3 46 af 00  00 00 00 00   .... .... .F.. ....
 e8 03 00 00  00 00 00 00  e8 03 00 00  00 00 00 00   .... .... .... ....
 22 00 00 00  00 00 00 00  44 01 00 00  08 04 00 00   "... .... D... ....
 b7 1c e2 67  00 00 00 00  a8 65 50 38  00 00 00 00   ...g .... .eP8 ....
 d0 58 ad 5a  00 00 00 00  ef 8f 24 0b  00 00 00 00   .X.Z .... ..$. ....
 e1 d9 cc 67  00 00 00 00  80 7a 8f 25  00 00 00 00   ...g .... .z.% ....
 d0 58 ad 5a  00 00 00 00  ef 8f 24 0b  00 00 00 00   .X.Z .... ..$. ....
 02 00 00 00  00 00 00 00  03 00 00 00  00 00 00 00   .... .... .... ....
 00 00 00 10  bf 01 1e 00  00 00 40 20  a9 00 12 00   .... .... ..@  ....
 00 00 00 40  a9 00 12 00                             ...@ ....
    checksum = 2b0196fe1d/44c0ea0255f7/47c55f4b745a49/38c7258f2cc33404
...

What shall I look for in this output?
 
And just to be clear: 'zfs diff sea/media/music@testing sea/media/music@now' is still empty?

I would grep it for WRITE, and then look at the object; for example, on my /var/log filesystem:

Code:
zfs send -Ri zroot/var/log@zfs-auto-snap_hourly-2025-03-25-17h00 zroot/var/log@zfs-auto-snap_hourly-2025-03-25-18h00 | zstream dump -v | grep WRITE
WRITE object = 208 type = 19 checksum type = 7 compression type = 0 flags = 0 offset = 393216 logical_size = 131072 compressed_size = 0 payload_size = 131072 props = f000400ff salt = 0000000000000000 iv = 000000000000000000000000 mac = 00000000000000000000000000000000
WRITE object = 368 type = 19 checksum type = 7 compression type = 0 flags = 0 offset = 0 logical_size = 15360 compressed_size = 0 payload_size = 15360 props = f000a001d salt = 0000000000000000 iv = 000000000000000000000000 mac = 00000000000000000000000000000000
WRITE object = 369 type = 19 checksum type = 7 compression type = 0 flags = 0 offset = 393216 logical_size = 131072 compressed_size = 0 payload_size = 131072 props = f002c00ff salt = 0000000000000000 iv = 000000000000000000000000 mac = 00000000000000000000000000000000
WRITE object = 369 type = 19 checksum type = 7 compression type = 0 flags = 0 offset = 524288 logical_size = 131072 compressed_size = 0 payload_size = 131072 props = f002e00ff salt = 0000000000000000 iv = 000000000000000000000000 mac = 00000000000000000000000000000000
WRITE object = 369 type = 19 checksum type = 7 compression type = 0 flags = 0 offset = 655360 logical_size = 131072 compressed_size = 0 payload_size = 131072 props = f002c00ff salt = 0000000000000000 iv = 000000000000000000000000 mac
...

And then look at the object numbers; these should be inodes within the filesystem; to find what file is written, I can look for inode 208, 368, or 369:

Code:
$ find /var/log/.zfs/snapshot/zfs-auto-snap_hourly-2025-03-25-18h00/ -inum 208 -ls
   208       95 -rw-------    1 root                             wheel                              405716 Mar 25 17:48 /var/log/.zfs/snapshot/zfs-auto-snap_hourly-2025-03-25-18h00/debug.log

Sure enough, that file was updated in the snapshot window.
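
The grep-then-find steps above can be strung together. A minimal sketch, assuming a POSIX shell and `zstream dump -v` output in the format shown above; `write_objects` is a hypothetical helper name:

```shell
# Hypothetical helper: read "zstream dump -v" output on stdin and print
# the unique object numbers that appear in WRITE records, sorted numerically.
write_objects() {
    # Match only lines whose first field is exactly WRITE, so record types
    # such as WRITE_EMBEDDED are not counted here.
    awk '$1 == "WRITE" && $2 == "object" && $3 == "=" { print $4 }' | sort -n -u
}
```

Each number it prints can then be handed to `find <snapshot directory> -inum <number> -ls`, as in the example above, to see which file the record belongs to.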

If you're not getting any WRITE objects, inspect what you have. I think the type / bonus type IDs are here, so your type 20 is DMU_OT_DIRECTORY_CONTENTS with bonus 44: DMU_OT_SA (System Attributes).

Depending on how sensitive you feel the contents of sea/media/music are, you could paste it up on pastebin (perhaps without -d for shorter output).

I'll be honest; I don't understand what it's doing with some of the OBJECT/FREE pairs; at least on mine they appear to reference stale files (haven't changed in ages) in the filesystem, but that doesn't seem likely.
 
And just to be clear: 'zfs diff sea/media/music@testing sea/media/music@now' is still empty?

It is, yeah.

If you're not getting any WRITE objects, inspect what you have.

No WRITE objects.

Choosing a random entry:

Code:
OBJECT object = 27287 type = 19 bonustype = 44 blksz = 131072 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0

Code:
 27287    11937 -rwxr-xr-x    1 edd                              edd                               6100688 May 28  2022 ./OGG/ugasanie/wanderers_of_north/Ugasanie - Wanderers of North - 03 Last Shore.ogg

Great ambient album that, but that's beside the point :)

I think 19/44 is plain file/system attributes.

Choosing another random entry, this time type 20:

Code:
OBJECT object = 14362 type = 20 bonustype = 44 blksz = 1536 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0

Code:
$ find . -inum 14362 -ls
 14362       17 drwxr-xr-x    2 edd                              edd                                    18 Nov  2  2021 ./OGG/cephalic_carnage/lucid_interval

This time, metal! and, as expected, a directory.

As you already realised, it looks like most (maybe all) 19 or 20 entries are followed by a FREE entry for that same inum.
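
To get an overview instead of sampling random entries, the OBJECT records in the log can be tallied by type. A rough sketch, assuming a POSIX shell and the `zstream dump -v` line layout shown earlier in the thread (type as the value after "type ="); `object_types` is a hypothetical helper name:

```shell
# Hypothetical helper: read "zstream dump -v" output on stdin and print
# "<type> <count>" for every OBJECT record type, sorted by type number.
object_types() {
    # Field layout assumed: OBJECT object = N type = T bonustype = ...
    awk '$1 == "OBJECT" && $5 == "type" && $6 == "=" { n[$7]++ }
         END { for (t in n) print t, n[t] }' | sort -n
}
```

Running it over the saved log (for example `object_types < log`) would show whether the stream is almost entirely type 19 and type 20 records, as the sampled entries suggest.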

A thought pops to mind. Last week we were investigating some other oddities with this zpool:

The problem dataset in that post was destroyed after I copied it all into a new dataset, the one we are talking about in this post. Could this disk activity be the destroy still in progress after a week or so? The old dataset was mounted in the same location as the one we are talking about in this post (/sea/media/music).

Another theory: I recently ran `zpool upgrade`. Could ZFS have a backlog of adding some new "system attribute" to all of my files or something like that? Sadly I don't know which new features the upgrade pulled into the pool.
 
Could be; I would say it's definitely a strange one.

That said, here's a patch to zstream to make it so you don't have to look up those IDs:
C:
--- a/sys/contrib/openzfs/cmd/zstream/zstream_dump.c
+++ b/sys/contrib/openzfs/cmd/zstream/zstream_dump.c
@@ -477,15 +477,19 @@ zstream_do_dump(int argc, char *argv[])
                        payload_size = DRR_OBJECT_PAYLOAD_SIZE(drro);

                        if (verbose) {
-                               (void) printf("OBJECT object = %llu type = %u "
-                                   "bonustype = %u blksz = %u bonuslen = %u "
+                               (void) printf("OBJECT object = %llu type = %u [%s] "
+                                   "bonustype = %u [%s] blksz = %u bonuslen = %u "
                                    "dn_slots = %u raw_bonuslen = %u "
                                    "flags = %u maxblkid = %llu "
                                    "indblkshift = %u nlevels = %u "
                                    "nblkptr = %u\n",
                                    (u_longlong_t)drro->drr_object,
                                    drro->drr_type,
+                                   drro->drr_type < DMU_OT_NUMTYPES ?
+                                     dmu_ot[drro->drr_type].ot_name : "??",
                                    drro->drr_bonustype,
+                                   drro->drr_bonustype < DMU_OT_NUMTYPES ?
+                                     dmu_ot[drro->drr_bonustype].ot_name : "??",
                                    drro->drr_blksz,
                                    drro->drr_bonuslen,
                                    drro->drr_dn_slots,

With this, you'll get output like this:
Code:
OBJECT object = 6 type = 47 [SA attr layouts] bonustype = 0 [unallocated] blksz = 16384 bonuslen = 0 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
    checksum = 3b79d89469/99f3513905dc/105192117f1c2c2/4dde8052f67d7fd9
FREE object = 6 offset = 32768 length = -1
    checksum = 4090a98602/ad6a6f2cb193/136fb797ee605dc/a4f7cd555b28d234
FREE object = 6 offset = 32768 length = 16384
    checksum = 43ff2a2d39/c209a1e8bcc0/16efd4eb0d063ff/c33c0d0ad8a55c7
OBJECT object = 7 type = 20 [ZFS directory] bonustype = 44 [System attributes] blksz = 512 bonuslen = 168 dn_slots = 1 raw_bonuslen = 0 flags = 0 maxblkid = 0 indblkshift = 0 nlevels = 0 nblkptr = 0
 