WARNING 1: Seeing that you two have received no additional responses in more than 72 hours, and with nothing but good intentions at heart (especially toward helping you), I outsourced your case to a non-subject-matter expert in training; below the horizontal line is what it had to say.
WARNING 2: Use it only as inspiration or a starting point for investigation, and check everything against official sources. Sounding certain doesn't mean it is certain; these non-experts are trained first and foremost to sound certain.
WARNING 3: Read in full before doing anything. Then decide what you want to do.
Case 1 is Dre's.
Case 2 is thorstenr's.
Note: If you end up solving your problem, please write a post explaining what fixed it, so it gets documented for the community.
Diagnose periodic ZFS disk wakeups after upgrading to FreeBSD 15.0
What changed (high-signal suspects)
FreeBSD 15.0-RELEASE updates the in-base OpenZFS implementation to zfs-2.4.0-rc4 (see FreeBSD 15.0-RELEASE Release Notes).
OpenZFS 2.4 introduces a “TXG time database” mechanism that records and flushes transaction-group (TXG) timestamps. In zfs(4), the defaults are:
- zfs_spa_note_txg_time=600 seconds (10 minutes)
- zfs_spa_flush_txg_time=600 seconds (10 minutes)
An upstream OpenZFS 2.4 defect report describes the same symptom pattern (see OpenZFS issue #18082):
- disks spin up exactly every 10 minutes while a pool is imported (even with datasets unmounted)
- exporting the pool stops it
- importing read-only stops it
- increasing spa_flush_txg_time to a very large value allows disks to stay spun down
Terms (quick definitions)
- Pool: ZFS storage pool (zpool(8)), built from one or more disks/vdevs.
- Dataset: ZFS filesystem/volume inside a pool (zfs(8)).
- Imported vs exported: Imported means the pool is active in the kernel; exported means detached (no background pool activity). The observation that zpool export pool1 stops the wakeups is a key discriminator.
- TXG (transaction group): internal ZFS batching unit; periodic TXG-related bookkeeping can trigger small writes even with no open files.
0) Prerequisites and baseline checks
Run as root:
Code:
# freebsd-version -ku
# uname -a
# zfs version
# zpool status
1) Confirm the wakeup cadence (10 minutes vs ~5 seconds)
1.1 Watch live disk I/O
Use a simple live view:
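One way to get such a live view with base tools (a sketch; the pool names pool1 and data are taken from the two cases, so substitute your own):
Code:
# gstat -p                      # per-disk I/O, physical providers only
# zpool iostat -v pool1 data 5  # per-vdev I/O every 5 seconds; pool names are assumptions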
What success looks like: when the pool is idle, the target HDDs show no periodic I/O bursts.
2) Fast isolation: prove “imported pool == wakeups”
Hazard (availability / data path): zpool export immediately detaches the pool and makes all datasets in it unavailable to the host (and any jails/services using them). Scope: the exported pool. Safer pattern: stop services using the pool, confirm nothing is mounted/used (e.g., zfs mount, fstat, zpool status), then export.
This matches the reported findings, but keep it as a repeatable test:
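A minimal repeatable form of the test, assuming pool1 is the affected pool and nothing depends on it during the test window:
Code:
# zpool export pool1   # pool name is an assumption; repeat for the other pool if needed
# zpool import pool1   # re-import once the observation window is over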
Hazard (device power/state control): camcontrol standby changes a disk’s power state and can disrupt in-flight I/O, cause timeouts, or trigger error recovery if anything is still touching the device. Scope: the specified disk device(s). Safer pattern: export the pool first (or otherwise ensure the device is idle), then issue standby to the correct device node; avoid testing on the boot/root disk.
Then place the HDD in standby:
Code:
# camcontrol standby /dev/adaX
Wait longer than the HDD idle timeout.
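To check whether the disk has spun back up without waking it yourself, one option is smartctl from sysutils/smartmontools (not in base; skip this if it is not installed). The -n standby option makes it report and exit instead of spinning the disk up:
Code:
# smartctl -i -n standby /dev/adaX   # adaX as above; exits early if the disk is still in standby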
Interpretation:
- If exported pools stay asleep, the trigger is “pool imported” behavior (kernel/ZFS or pool properties), not Samba mounts or userland file opens.
3) Eliminate ZFS properties that intentionally generate background I/O
3.1 multihost (MMP) — periodic writes by design
When multihost=on, ZFS performs periodic writes to show the pool is in use (see zpoolprops(7)).
Code:
# zpool get -H multihost pool1
# zpool get -H multihost data
Hazard (pool behavior / compatibility): zpool set multihost changes pool coordination behavior. Scope: the target pool; impacts how ZFS guards against multi-host imports. Safer pattern: record the current value first (zpool get multihost) so rollback is trivial.
If it is set to on on a single-host system, turn it off and re-test:
Code:
# zpool set multihost=off pool1
# zpool set multihost=off data
3.2 autotrim
autotrim=on causes periodic TRIM of recently freed space; the default is off (see zpoolprops(7)).
Code:
# zpool get -H autotrim pool1
# zpool get -H autotrim data
Hazard (performance / device behavior): zpool set autotrim can change background I/O patterns and performance characteristics. Scope: the target pool; effect depends on device type. Safer pattern: apply only to the HDD pools under test and keep a note of the prior setting for rollback.
If enabled on HDD pools, disable and re-test:
Code:
# zpool set autotrim=off pool1
# zpool set autotrim=off data
Rollback: restore prior values using zpool set.
4) Check for the 10-minute TXG time database flush (strong match for Case 1)
zfs(4) documents a 600-second default flush interval for the TXG time database. This aligns with “exactly 10 minutes” wakeups and with the upstream OpenZFS 2.4 report (OpenZFS issue #18082).
4.1 Find the exact sysctl node names on this system
Do not guess names; discover them:
Code:
# sysctl -a | egrep 'spa_(note|flush)_txg_time'
Hazard (filesystem durability characteristics): changing ZFS sysctls can alter write patterns and what metadata is preserved across unexpected power loss. Scope: host-wide ZFS behavior (kernel module). Safer pattern: treat as a short diagnostic only; record the prior value and restore it after the test.
4.2 Temporarily change the flush interval to see if the wakeup interval changes
Example pattern (use the exact node name returned in 4.1):
Code:
# sysctl <node_for_spa_flush_txg_time>=3600
What success looks like: wakeups shift from ~600 seconds to ~3600 seconds. That confirms the TXG time DB flush as the trigger.
Rollback: set it back to 600.
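Using the same placeholder convention, record the current value before changing it and restore the default afterwards:
Code:
# sysctl <node_for_spa_flush_txg_time>       # note the current value first
# sysctl <node_for_spa_flush_txg_time>=600   # restore after the test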
5) Rule out base cron/periodic writes (often confused with “about 10 minutes”)
FreeBSD’s default cron invokes /usr/libexec/save-entropy every 11 minutes (not 10) (see save-entropy(8) and the Handbook’s cron discussion). It stores entropy under /var/db/entropy by default, and can be disabled by setting entropy_dir="NO" in rc.conf(5).
5.1 Verify where /var lives
If /var is on the HDD-backed pool, cron can wake those disks.
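A quick way to check (df shows which dataset, and therefore which pool, backs /var; zfs list cross-checks the mountpoints):
Code:
# df /var
# zfs list -o name,mountpoint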
5.2 Watch cron when a wakeup happens
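One simple correlation check, assuming the default syslog configuration that writes cron activity to /var/log/cron (an assumption; adjust if syslog is customized):
Code:
# tail -f /var/log/cron                    # watch timestamps against the audible wakeups
# grep save-entropy /var/log/cron | tail   # review past runs after the fact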
Hazard (system security posture): disabling entropy caching changes how entropy is preserved across reboots; it does not disable the kernel RNG. Scope: host-wide configuration. Safer pattern: only apply if logs show save-entropy correlates with wakeups; revert afterward if not needed.
5.3 Disable entropy caching (only if it is implicated)
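A minimal sketch using sysrc(8) to set the rc.conf(5) variable documented in save-entropy(8):
Code:
# sysrc entropy_dir="NO"   # the cron job still fires, but save-entropy exits without writing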
Rollback: remove the setting or restore the previous entropy_dir value (see save-entropy(8)).
6) Attribute the I/O to the responsible process (authoritative)
lsof/fstat can miss kernel-originated writes. Use DTrace’s io provider to see who is issuing block I/O (see dtrace io provider documentation).
Run this, wait for the next audible tick/spin-up, then press Ctrl-C to print the per-device counts:
Code:
# dtrace -q -n '
io:::start
/args[1]->dev_statname == "ada0" || args[1]->dev_statname == "ada1"/
{
        /* count block I/O per device, with the issuing process name and pid */
        @[args[1]->dev_statname, execname, pid] = count();
}
'
Interpretation:
- If execname is cron, periodic, smbd, etc., fix that service/job.
- If the activity attributes to kernel/ZFS paths (no meaningful userland culprit), the symptom set matches the OpenZFS 2.4 TXG time DB issue report (export/readonly import stop it; 10-minute wakeups; txg sync thread involvement).
7) Use ZFS internal history to correlate events (optional but useful)
zpool history -i includes internally logged ZFS events (see zpool-history(8)).
Code:
# zpool history -il pool1 | tail -200
# zpool history -il data | tail -200
8) Practical mitigations (match both Case 1 and Case 2)
8.1 Keep “cold” pools exported when idle
Hazard (availability / data path): zpool export detaches the pool and breaks access for mounts, jails, and services. Scope: the exported pool. Safer pattern: stop dependents first; verify no mounts and no active users; then export.
This avoids the “imported pool triggers periodic access” class of problems.
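A sketch of that pattern for pool1 (the /pool1 mountpoint and pool name are assumptions; substitute the real ones):
Code:
# zfs list -o name,mounted,mountpoint -r pool1   # confirm what is mounted
# fstat -f /pool1                                # open files on that filesystem, if mounted
# zpool export pool1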
8.2 Import read-only when only reads are needed
Hazard (availability / workflow change): importing read-only prevents writes (including normal metadata updates) and can cause confusing “read-only filesystem/pool” failures in services expecting writes. Scope: the imported pool. Safer pattern: use only for truly read-only access windows; export afterward when idle.
zpoolprops(7) documents the pool import property readonly=on. This also matches the upstream OpenZFS 2.4 report: read-only import allowed disks to spin down.
Code:
# zpool import -o readonly=on pool1
8.3 For “always-imported” HDD pools
If DTrace confirms the TXG time database flush behavior, the root cause is likely the OpenZFS 2.4 regression described upstream (OpenZFS issue #18082). In that situation, exporting idle pools or importing read-only are the lowest-risk workarounds until the upstream issue is fixed and pulled into FreeBSD.
Case mapping (based on the observations)
- Case 1 (exact ~10-minute spin-ups; unmount doesn’t help; export fixes it): strong match for the zfs_spa_flush_txg_time=600 TXG time database flush behavior (see zfs(4)).
- Case 2 (audible accesses every ~5–8 seconds; stop after standby): consistent with frequent TXG sync-thread related activity described in the same upstream report (it mentions txg_sync_thread cadence).