Solved FreeBSD sysctl vfs.zfs.dmu_offset_next_sync and openzfs/zfs issue #15526 (errata notice, FreeBSD bug 275308)

grahamperrin · Nov 24, 2023

FreeBSD sysctl vfs.zfs.dmu_offset_next_sync and openzfs/zfs issue #15526

Posted in r/freebsd by u/grahamperrin • 15 points and 37 comments

old.reddit.com

~~From the pinned comment:~~

(quote removed)

Please see the pinned comment, which has changed, and may change again.

VladiBG · Nov 24, 2023

some copied files are corrupted (chunks replaced by zeros) · Issue #15526 · openzfs/zfs

System information Type Version/Name Distribution Name Gentoo Distribution Version (rolling) Kernel Version 6.5.11 Architecture amd64 OpenZFS Version 2.2.0 Reference https://bugs.gentoo.org/917224 ...

github.com

forquare · Nov 24, 2023

Am I to believe that this should not be an issue on FreeBSD versions lower than 14.0-RELEASE?

Reading the GitHub issue, here is what I see on one box that has been upgraded to 14.0-RELEASE (along with zpool upgrade to zfs-2.2.0-FreeBSD_g95785196f):

Code:

root@loft-bsd:~ # zpool get all zroot | grep block_cloning
zroot  feature@block_cloning          enabled                        local
root@loft-bsd:~ # sysctl vfs.zfs.dmu_offset_next_sync
vfs.zfs.dmu_offset_next_sync: 1
root@loft-bsd:~ # zpool get all zroot | grep bclone
zroot  bcloneused                     0                              -
zroot  bclonesaved                    0                              -
zroot  bcloneratio                    1.00x                          -
root@loft-bsd:~ # sysctl vfs.zfs.bclone_enabled
vfs.zfs.bclone_enabled: 0

Whereas on my server running 13.2-RELEASE-p4 with zfs-2.1.9-FreeBSD_g92e0d9d18:

Code:

root@manaha:~ # zpool get all zroot | grep block_cloning
root@manaha:~ # sysctl vfs.zfs.dmu_offset_next_sync
vfs.zfs.dmu_offset_next_sync: 1
root@manaha:~ # zpool get all zroot | grep bclone
root@manaha:~ # sysctl vfs.zfs.bclone_enabled
sysctl: unknown oid 'vfs.zfs.bclone_enabled'
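
The two sysctl checks above can be wrapped so that a missing OID, as on the 13.2 box, is reported rather than raising an error. This is only a minimal sketch: the function names `show_oid` and `zfs_15526_report` are made up for illustration, and the only command assumed is `sysctl -n`, which prints a bare value.

```sh
#!/bin/sh
# Print "name: value" for a sysctl OID, or "name: not present" when the
# running kernel does not know it (as with vfs.zfs.bclone_enabled on 13.2).
show_oid() (
    oid=$1
    if val=$(sysctl -n "$oid" 2>/dev/null) && [ -n "$val" ]; then
        printf '%s: %s\n' "$oid" "$val"
    else
        printf '%s: not present\n' "$oid"
    fi
)

# Hypothetical wrapper: report both tunables discussed in this thread.
zfs_15526_report() {
    show_oid vfs.zfs.dmu_offset_next_sync
    show_oid vfs.zfs.bclone_enabled
}
```

Calling `zfs_15526_report` prints one line per tunable. The pool-level checks (`zpool get all zroot | grep bclone`) are left out because the pool name differs from system to system.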

cy@ · Nov 24, 2023

Reading the upstream issue, the bug is supposed to exhibit itself on systems with bclone disabled.

Erichans · Nov 24, 2023

forquare said:
Am I to believe that this should not be an issue on FreeBSD versions lower than 14.0-RELEASE?

13.2-RELEASE is most likely also affected: OpenZFS issue - 15526; though it seems that this bug is hard to trigger.

Michael Rüger · Nov 24, 2023

Yes, I am able to reproduce this on my 13.2-RELEASE-p5 system.

I modified the script from github to make it FreeBSD compatible.

Code:

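#!/usr/local/bin/bash
#
# Reproducer for openzfs/zfs issue #15526, adapted for FreeBSD.
# Creates one 16K random file, then builds chains of copies that should all be identical.
prefix="reproducer_${BASHPID}_"
dd if=/dev/urandom of="${prefix}0" bs=16K count=1 status=none

echo "writing files"
end=1000
h=0
for i in $(seq 1 2 $end) ; do
        j=$((i + 1))
        # Copy an earlier file, then immediately copy the fresh copy.
        cp "${prefix}$h" "${prefix}$i"
        cp "${prefix}$i" "${prefix}$j"
        h=$((h + 1))
done

echo "checking files"
# Every copy should match the original; diff reports the ones that do not.
for i in $(seq 1 $end) ; do
        diff "${prefix}0" "${prefix}$i"
done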

When this is run with

Code:

./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & wait

multiple times, sometimes I get something like

Code:

...
checking files
Binary files reproducer_88319_0 and reproducer_88319_228 differ
Binary files reproducer_88337_0 and reproducer_88337_306 differ
Binary files reproducer_88319_0 and reproducer_88319_457 differ
Binary files reproducer_88319_0 and reproducer_88319_458 differ
Binary files reproducer_88337_0 and reproducer_88337_613 differ
Binary files reproducer_88337_0 and reproducer_88337_614 differ
Binary files reproducer_88319_0 and reproducer_88319_915 differ
Binary files reproducer_88319_0 and reproducer_88319_916 differ
Binary files reproducer_88319_0 and reproducer_88319_917 differ
Binary files reproducer_88319_0 and reproducer_88319_918 differ
[1]   Done                    ./reproducer.sh
...

One run takes on my machine 1m16s.
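
When many runs are compared, a count of differing copies can be easier to read than the raw diff output. The sketch below assumes the same file naming as the reproducer (`${prefix}0` is the original, `${prefix}1` up to `${prefix}$end` are the copies). The function name `count_mismatches` is hypothetical.

```sh
#!/bin/sh
# Count how many numbered copies differ from the original "${prefix}0".
# prefix and end mirror the variables of the reproducer script.
count_mismatches() (
    prefix=$1
    end=$2
    bad=0
    i=1
    while [ "$i" -le "$end" ]; do
        # cmp -s is silent and exits non-zero on any difference
        # (it also fails if a copy is missing).
        if ! cmp -s "${prefix}0" "${prefix}${i}"; then
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$bad"
)
```

For example, `count_mismatches reproducer_88319_ 1000` would print the number of damaged or missing copies for that run. A result of 0 means that this run did not trigger the bug, not that the system is unaffected.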

After setting

Code:

doas sysctl vfs.zfs.dmu_offset_next_sync=0

one run takes just 4s and the error is no longer reproducible. I don't understand why it is now even faster. Is that just for me?
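
A value set with `sysctl` at runtime does not survive a reboot. To keep the workaround across reboots, the same assignment can go into `/etc/sysctl.conf`. The sketch below is one hedged way to do that idempotently. The function name `persist_offset_next_sync_off` is hypothetical, and the file path is a parameter so that it can be tried on a scratch copy first. It is a workaround only, not a substitute for the fix.

```sh
#!/bin/sh
# Ensure "vfs.zfs.dmu_offset_next_sync=0" is present exactly once in a
# sysctl.conf-style file. On FreeBSD the real file is /etc/sysctl.conf.
persist_offset_next_sync_off() (
    conf=$1
    touch "$conf" || exit 1
    tmp=$(mktemp) || exit 1
    # Drop any earlier assignment of the key; grep -v exits non-zero when
    # nothing is left to print, which is not an error here.
    grep -v '^[[:space:]]*vfs\.zfs\.dmu_offset_next_sync[[:space:]]*=' "$conf" > "$tmp" || true
    # Append the desired assignment.
    printf '%s\n' 'vfs.zfs.dmu_offset_next_sync=0' >> "$tmp"
    # Write back through the original file so its ownership and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
)
```

Usage would be along the lines of `doas sh -c '. ./persist.sh; persist_offset_next_sync_off /etc/sysctl.conf'`, where `persist.sh` is an assumed file name for the snippet.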

cy@ · Nov 24, 2023

Looking at the upstream issue discussion, it doesn't matter if bclone is enabled or disabled. The bug exhibits itself under both conditions.

grahamperrin · Nov 24, 2023

Vince (darkain) in Discord helped me find the relevant report:

275308 – EN tracking issue for potential ZFS data corruption

bugs.freebsd.org

grahamperrin · Nov 25, 2023

Respecting the wish to minimise noise in GitHub …

The bsdice script

<https://old.reddit.com/comments/182pgki/-/kanif6q/?context=1> re: line 16.

~~Am I doing something wrong, or is the script not yet applicable on FreeBSD?~~ Sorted, I think.

_martin · Nov 25, 2023

I was not happy to see this bug; but then who was. I just did a migration I was really wanting to avoid - jumped from 12.4 to 14.0. I followed this briefly and, to my understanding, it's still not known whether the OpenSolaris version is affected. I ran the script and ruled out all the false positives. Luckily, only one set of private data seems to show a valid issue: photos from 2005 of something I would not be too upset about losing. Of those, only one photo has length 0, which doesn't make sense.

I suspended all my (private) backups for now.

grahamperrin · Nov 25, 2023

_martin said:
the script

Please, which one? I listed three, others may exist (I haven't reviewed the issue in GitHub for a few hours).

_martin · Nov 25, 2023

I'd swear I downloaded it from 15526 github though I can't find it there. On FreeBSD I changed the bash location.

It was

Code:

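#!/bin/bash

# Flags files in whose first RECORD_SIZE bytes grep finds no non-whitespace character.
ZFS_PATH="/yourmountpoint"
RECORD_SIZE=4096

# globstar lets ** match files in all subdirectories.
shopt -s globstar

for FILE in "$ZFS_PATH"/**; do
 # Skip anything that is not a regular file.
 if [ ! -f "$FILE" ]; then
  continue
 fi

 if ! dd if="$FILE" bs=1 count=$RECORD_SIZE 2>/dev/null | grep -q '[^[:space:]]'; then
  echo "Possible data corruption in $FILE"
 fi
done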
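
Since the upstream issue describes chunks replaced by zeros, a different check is to test for NUL bytes directly rather than for missing non-whitespace characters. The sketch below is not the script quoted above and not any of the scripts linked from the thread. It is a swapped-in, simpler technique that looks only at the first n bytes of a file. The function name `leading_bytes_all_zero` and the default of 4096 bytes are assumptions. A hit is a candidate only, because sparse files and files that legitimately start with zeros will match too.

```sh
#!/bin/sh
# Succeed (exit 0) when the file is non-empty and its first n bytes are all NUL.
# This only flags a candidate; zeros at the start of a file can be legitimate.
leading_bytes_all_zero() (
    file=$1
    n=${2:-4096}
    # Empty or missing files are not treated as candidates.
    [ -s "$file" ] || exit 1
    # Delete NUL bytes from the leading block and count what is left.
    rest=$(head -c "$n" "$file" | tr -d '\000' | wc -c)
    [ "$((rest))" -eq 0 ]
)
```

It could be combined with `find`, for example `find /yourmountpoint -type f -exec sh -c '...' {} \;`, with the function defined inside the `sh -c` string. Damage further into a file is not detected by this check.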

My private backups do have a second location running Linux, as FreeBSD is practically unable to use the AX210. There I recently did an 8TB backup from 12.4 to 13.2 (on hardware that is now running Linux) and let that data be used read-only on openzfs 2.2.0~rc3-0ubuntu4.

But maybe I spoke too soon. This script found many issues in various locations containing source code, not only in private data (such as Python scripts for HANA DB servers and Linux sources on various QEMU images I used for debugging). Luckily, none of this matters, as I can download any of it again from either the corporate repo or the public domain.

Those photos are the only "issue" I have right now.

Erichans · Nov 25, 2023

_martin said:
I was not happy to see this bug; but then who was. I just did a migration I was really wanting to avoid - jumped from 12.4 to 14.0.

You jumped from pre-OpenZFS (12.4) to OpenZFS 2.2 (14.0). If you haven't upgraded your zpools at all, then I think your zpools should be still on the old version 28 (=pre Feature Flags era). If your BEs are created as per default then you should be able to revert to 12.4 AFAIK

_martin · Nov 26, 2023

Well, as I wrote above, I did a backup from 12.4 to 13.2. During the backup the receiving host ran 13.2, with pools updated to the latest and greatest. That pool was then imported read-only on the Ubuntu server.
From what I read, though, it's not clear whether this bug affects OpenSolaris versions too. That data, pictures from 2005, was last touched in 2009. Other data was heavily used in 2016-2018. That is all on 12.4. I never used 13.x anywhere, only at home.

grahamperrin · Nov 26, 2023

_martin said:
I can't find

See Convenience links to scripts (the comment in Reddit; NB the opening post here).

_martin said:
This script found many issues

Not definite issues; possible issues.

At a glance: you're working with an outdated edition, maybe the first edition, of the RichardBelzer script, which is currently in the midst of 179 or more hidden items. When I worked with a more recent edition of the script, it reported 270 'Possible data corruption …' lines for my home directory. IIRC GitHub includes discussion of the script.

Instead, maybe try the bsdice script. Whether it's the best, I don't know, but (as noted in Reddit) it reported nothing from a broader scan of /usr/home. It's now scanning /.

HTH

grahamperrin · Nov 28, 2023

Via <https://old.reddit.com/r/freebsd/comments/185gohv/-/>:

Data-destroying defect found after OpenZFS 2.2.0 release • Liam Proven, The Register

Liam's article gives prominence to Ed Maste's email to the freebsd-stable list. …

GitHub - 0x0177b11f/zfs-issue-15526-check-file

github.com

Attached: a result of scanning a mobile hard disk drive, on USB, that's given to a pool where most files are VirtualBox-related.

_martin · Nov 28, 2023

IMO, most likely these are all false positives.

grahamperrin · Nov 28, 2023

_martin said:
IMO, most likely these are all false positives.

Certainly, scan results such as those are non-conclusive. <https://github.com/openzfs/zfs/issues/15526#issuecomment-1826412289>, final paragraph.

bakul · Nov 29, 2023

FYI:

Comment 13, Martin Matuska, 2023-11-28 23:42:05 UTC:

A bugfix from OpenZFS has been merged into:
main (2276e5394)
stable/14 (d92e0d62c)
stable/13 (5858f93a8)

grahamperrin · Nov 29, 2023

bakul thanks.

zfs: merge openzfs/zfs@688514e47 · freebsd/freebsd-src@2276e53

Notable upstream pull request merges: #15532 c1a47de86 zdb: Fix zdb '-O|-r' options with -e/exported zpool #15535 cf3316633 ZVOL: Minor code cleanup #15541 803a9c12c brt: lift internal d...

github.com

– obtained from:

dmu_buf_will_clone: fix race in transition back to NOFILL · freebsd/freebsd-src@688514e

Previously, dmu_buf_will_clone() would roll back any dirty record, but would not clean out the modified data nor reset the state before releasing the lock. That leaves the last-written data in db...

github.com

– with reference to:

some copied files are corrupted (chunks replaced by zeros) · Issue #15526 · openzfs/zfs (15526 was the subject of this topic)
dmu_buf_will_clone: fix race in transition back to NOFILL by robn · Pull Request #15566 · openzfs/zfs (the PR).

grahamperrin · Nov 29, 2023

The slightly bigger picture

zfs-2.2.2 patchset by tonyhutter · Pull Request #15602 · openzfs/zfs

Motivation and Context Patchset for 2.2.2. This release includes the fix for dirty dbuf corruption: #15526 Description Include fix for data corruption. Full details in: #15526. Other fixes also ...

github.com

Expect:

merge upstream
merge to FreeBSD src main (CURRENT)
cherry-picks to stable and releng branches after respectable periods of time.

Graham Perrin (@grahamperrin@bsd.cafe)

@emaste@mastodon.social FYI <https://github.com/openzfs/zfs/pull/15602> Will FreeBSD BR 275308, for the errata notice, broaden? To have a single EN for both: a) what's already merged to FreeBSD src b) openzfs/zfs PR 15602 for 2.2.2...

mastodon.bsd.cafe

skunk · Nov 29, 2023

mer said:
cherry-picks to stable and releng branches after respectable periods of time.

So -RELEASE will not get the fixes, right?

angry_vincent · Nov 29, 2023

I would guess there will be patch updates to the release, assuming that the issue is important.

grahamperrin · Nov 29, 2023

skunk said:
So -RELEASE will not get the fixes, right?

releng is for RELEASE.

The screenshot below might help.

Note that releng/13.2 is ahead of (above) the release/13.2.0 tag.

<https://cgit.freebsd.org/src/log/?h=releng/12.4>

<https://cgit.freebsd.org/src/log/?h=releng/13.2>

<https://cgit.freebsd.org/src/log/?h=releng/14.0>

Unofficial, but useful:

<https://bokut.in/freebsd-patch-level-table/>
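
Commits on a releng branch reach installed systems as patch levels, visible as the `-pN` suffix in version strings such as 13.2-RELEASE-p4. The sketch below pulls that number out of such a string and compares it with a level supplied by the caller. Both function names are hypothetical, and no particular patch level is implied to contain the fix.

```sh
#!/bin/sh
# Extract the patch level from a version string such as "13.2-RELEASE-p5".
# A string without a "-pN" suffix (for example "14.0-RELEASE") yields 0.
patch_level() {
    case $1 in
        *-p[0-9]*) echo "${1##*-p}" ;;
        *)         echo 0 ;;
    esac
}

# Succeed when the given version string is at or above a required patch level.
# The required level is supplied by the caller; none is assumed here.
at_least_patch_level() {
    [ "$(patch_level "$1")" -ge "$2" ]
}
```

On a FreeBSD system the string would typically come from `freebsd-version -u` for the userland or `freebsd-version -k` for the installed kernel, for example `at_least_patch_level "$(freebsd-version -k)" N` with N taken from the eventual errata notice.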

grahamperrin · Nov 29, 2023

CVE - CVE-2023-49298

awaiting analysis at <https://nvd.nist.gov/vuln/detail/CVE-2023-49298>.

In the meantime, according to the author of openzfs/zfs PR 15571:

… The scenario described really just sounds like the author hasn't really understood the detail.
