UFS Why kernel crashes with dirty filesystems?

Cath O'Deray · Jan 3, 2022

mark_j said:
… mangled … It's a pity you don't have the dubious USB and its file system to image and provide with a bug report.

<https://bugs.freebsd.org/bugzilla/showdependencytree.cgi?id=244384&hide_resolved=1>

There is one mangling bug there, plus (page one, post 12) the two earlier mangling bugs that are not linked from the UFS fuzz metabug.

Concerning a fourth example of kernel panics in the presence of mangling (my January 2021 case):

<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244384#c1> "… If you'd like proper background … I can spin it into a new linked bug report …"
<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244384#c2> mckusick@ "… I don't think that we need a separate bug report opened. …".

I do have an image – the verbatimstorengo-bent.img that appears as a directory listing under <https://bugs.kde.org/show_bug.cgi?id=447820#c1> – however I do not expect anyone to require the image for FreeBSD development purposes.

Cath O'Deray · Jan 3, 2022

MassimoM said:
Why kernel crashes with dirty filesystems?

To this particular question: strictly speaking, it does not.

The opening post pictured a kernel panic in the presence of a mangled entry. The file system was not detectably dirty prior to a mount command that did not require force.

ralphbsz describes occurrences of this type of bug as very rare; I agree (I might say, extremely rare).

It's far more commonplace for a dirty file system to be detectably dirty (without mangling), so the end user must decide whether to risk applying force.

mark_j · Jan 3, 2022

I'm not so sure an image of a disk wouldn't help as these issues seem to be edge-cases or fuzz.
I guess you could probably replicate it by finding the directory inode and dd-ing some garbage (zeros) over it.
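
A minimal sketch of the overwrite step suggested above, run against a scratch file rather than a real file system. The file name `scratch.img`, the 512-byte block size and the block index 8 are all hypothetical; on a real UFS image the location of the directory's data block would first have to be found (for example with fsdb(8)), which is not shown here.

```shell
#!/bin/sh
# Hypothetical scratch file: 16 blocks of 512 bytes, every byte an ASCII 'x'.
img=scratch.img
bs=512
target=8          # hypothetical index of the block to clobber

# Build the 16-block file with no zero bytes in it.
dd if=/dev/zero bs="$bs" count=16 2>/dev/null | LC_ALL=C tr '\000' 'x' > "$img"

# Overwrite exactly one block with zeros; conv=notrunc keeps the rest of the file.
dd if=/dev/zero of="$img" bs="$bs" seek="$target" count=1 conv=notrunc 2>/dev/null

# Count the bytes that are no longer 'x' (the zeroed ones) and the total size.
zeros=$(LC_ALL=C tr -d 'x' < "$img" | wc -c | tr -d ' ')
size=$(wc -c < "$img" | tr -d ' ')
echo "size=$size zeroed=$zeros"

rm -f "$img"
```

On the scratch file this prints `size=8192 zeroed=512`: one block is zeroed and the file length is unchanged. Whether zeros in a directory block are enough to reach the "mangled entry" path on a given system is a guess, as in the post above.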

Edit: it seems some of the bug reports linked to the main one do provide an image to test with.

mark_j · Jan 3, 2022

grahamperrin said:
To this particular question: strictly speaking, it does not.

The opening post pictured a kernel panic in the presence of a mangled entry.

The file system was not detectably dirty prior to a mount command that did not require force. ralphbsz describes occurrences of this type of bug as very rare; I agree (I'd say, very or extremely rare).

It's far more commonplace for a dirty file system to be detectably dirty (without mangling), so the end user must decide whether to risk applying force.

I think, and MassimoM can definitively answer this, the use of the word "dirty" seems to mean bad or corrupt (or, by extension, mangled). In other words, the reason for the "mangling" is not the file system and its programming per se but the underlying hardware.

The mount-ing is the problem. Had it been done read-only, a message would have appeared as shown, but the kernel would not have panicked. To me this is what should happen when mounting a disk. At all other times, I think a panic is desired to preserve the data/file system.

I assume (wildly) that the reasoning is that an ACTIVE read-write file system with a bad directory inode that cannot be traversed is serious enough to panic. This type of scenario is unfortunately expected when using USB and SD cards, which have a propensity to simply fail to write or read due to hardware failure.

The assumption of "rare" failures might have been fine with rotational disks, and likewise with SSDs because of their hardware, but not USB and uSD/SD cards.

mark_j · Jan 3, 2022

covacat said:
seems that netbsd/openbsd/dragonflybsd all panic on ufs_dirbad unless fs is r/o

As I said here. UFS/FFS basically is rooted in FreeBSD and branches out to these. (Don't take that as a negative term.)
Edit: A bit of research in my notes finds the earliest reference to this as v2.0R. I will presume it has been in effect since year dot of UFS.
As I said, it's more a feature than a bug nowadays.

covacat · Jan 3, 2022

at least netbsd has sizeable modifications like endian independence, its own journaling stuff (wapbl), so i wondered if any of them did something about this (at least netbsd and openbsd rely heavily on UFS)

Cath O'Deray · Jan 3, 2022

MassimoM said:
… I never heard a similar behaviour in linux, … never in linux, solaris and so on. …

In the past, it was entertainingly simple to trigger kernel panics with Darwin (Apple Mac OS X):

whilst running Apple's OS, insert a perfect fault-free CD/DVD for an alternative (and equally respected) OS.

I almost certainly reported the bug to Apple (rdar), but probably never disclosed it to fellow AppleSeed project members. That was in the distant past, so I imagine that Apple fixed the bug, but still: I'll not disclose the nature of the optical disc. At a glance there's no comparable report in Open Radar or elsewhere. For anyone who's curious: happy hunting!

Cath O'Deray · Jan 3, 2022

Sorry:

grahamperrin said:
strictly speaking

– I wrote that, then continued to use the non-strict phrase dirty. (At the back of my mind: relatively colloquial phrases such as dirty flag and dirty bit.)

More accurately, I was thinking of the sblock.fs_clean field, which appears in parts of code such as these:

<https://github.com/freebsd/freebsd-...0da8ec2c6fb1a1380d2ad7179d247a0666f5e3f9cR254> – note, UFS/FFS superblock within the title
<https://github.com/freebsd/freebsd-...d35d35203b869d677ca71445cb655773c0fdR313-R319> – there's another relatively colloquial phrase, unclean.

According to section 1.2.1 of "1759 'File System' OCT-28" on kexiang's blog, there's a redundant copy of the superblock at a varying offset from the beginning of each cylinder group.

mangled entry is another exact phrase, which was photographed in the opening post. The phrase appears in parts of code such as this:

<https://github.com/freebsd/freebsd-...3791fa7d054a3076a0a9948d1ebbba50e433R287-R290> – sys/ufs/ufs/ufs_lookup.c

– and in the (stress2) comment at <https://github.com/freebsd/freebsd-...51307f9f2e/tools/test/stress2/misc/md2.sh#L29>.

With added emphasis:

"stress2 is a tool for finding problems in the kernel. … stress2 has found a large number of problems: …"

From the comment above, let's assume that mangled entry cases are amongst the problems that may be identified by use of stress2.

Cath O'Deray · Jan 4, 2022

grahamperrin said:
Touch wood, the same volume does not crash FreeBSD 14.0-CURRENT 4bae154fe8c (2021-12-24).

Touching wood – ancient superstition – is not a substitute for a logical approach to testing software.

A mixture of logic plus freedom of thought produced this, twenty-four hours ago:

Code:

mowa219-gjp4-8570p-freebsd dumped core - see /var/crash/vmcore.8

Mon Jan  3 03:26:58 GMT 2022

FreeBSD mowa219-gjp4-8570p-freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #118 main-n251923-4bae154fe8c: Sat Dec 25 08:03:37 GMT 2021     root@mowa219-gjp4-8570p-freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64

panic: ufs_dirbad: /media/Verbatim_STORE_N_GO_17071802004381_p1: bad dir ino 2 at offset 0: mangled entry

…
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: ufs_dirbad: /media/Verbatim_STORE_N_GO_17071802004381_p1: bad dir ino 2 at offset 0: mangled entry
cpuid = 3
time = 1641179709
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0194a5e880
vpanic() at vpanic+0x17f/frame 0xfffffe0194a5e8d0
panic() at panic+0x43/frame 0xfffffe0194a5e930
ufs_lookup_ino() at ufs_lookup_ino+0xc9f/frame 0xfffffe0194a5ea20
vfs_cache_lookup() at vfs_cache_lookup+0xad/frame 0xfffffe0194a5ea70
lookup() at lookup+0x45c/frame 0xfffffe0194a5eb10
namei() at namei+0x2c8/frame 0xfffffe0194a5ebd0
kern_statat() at kern_statat+0xe9/frame 0xfffffe0194a5ed00
sys_fstatat() at sys_fstatat+0x2f/frame 0xfffffe0194a5ee00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe0194a5ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0194a5ef30
--- syscall (552, FreeBSD ELF64, sys_fstatat), rip = 0x8061e38aa, rsp = 0x1035dd8, rbp = 0x1035ef0 ---
KDB: enter: panic
Uptime: 1d8h40m36s
Dumping 2859 out of 16267 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55        __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=textdump@entry=1)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80bfb281 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:487
#3  0xffffffff80bfb6fe in vpanic (
    fmt=0xffffffff81207ead "ufs_dirbad: %s: bad dir ino %ju at offset %ld: %s", ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0xffffffff80bfb503 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:844
#5  0xffffffff80f1077f in ufs_dirbad (ip=0xfffff8003678e600,
    offset=<optimized out>, how=<optimized out>)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:772
#6  ufs_lookup_ino (vdp=<unavailable>,
    vdp@entry=<error reading variable: value is not available>,
    vpp=0xfffffe0194a5ec40,
    vpp@entry=<error reading variable: value is not available>,
    cnp=<optimized out>,
    cnp@entry=<error reading variable: value is not available>, dd_ino=0x0,
    dd_ino@entry=<error reading variable: value is not available>)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:380
#7  0xffffffff80cc37cd in VOP_CACHEDLOOKUP (dvp=0xfffff802331a0380,
    vpp=0xfffffe0194a5ec40, cnp=0xfffffe0194a5ec68) at ./vnode_if.h:99
#8  vfs_cache_lookup (ap=<unavailable>,
    ap@entry=<error reading variable: value is not available>)
    at /usr/src/sys/kern/vfs_cache.c:3066
#9  0xffffffff80cd129c in VOP_LOOKUP (dvp=0xfffff802331a0380,
    vpp=0xfffffe0194a5ec40, cnp=0xfffffe0194a5ec68) at ./vnode_if.h:65
#10 lookup (ndp=ndp@entry=0xfffffe0194a5ebe8)
    at /usr/src/sys/kern/vfs_lookup.c:1128
#11 0xffffffff80cd0438 in namei (ndp=ndp@entry=0xfffffe0194a5ebe8)
    at /usr/src/sys/kern/vfs_lookup.c:658
#12 0xffffffff80cef7b9 in kern_statat (td=0xfffffe0124253720,
    flag=<optimized out>, fd=-100, path=<unavailable>, pathseg=<unavailable>,
    pathseg@entry=UIO_USERSPACE, sbp=sbp@entry=0xfffffe0194a5ed18, hook=0x0)
    at /usr/src/sys/kern/vfs_syscalls.c:2448
#13 0xffffffff80cefeaf in sys_fstatat (td=<unavailable>,
    uap=0xfffffe0124253b10) at /usr/src/sys/kern/vfs_syscalls.c:2425
#14 0xffffffff810904ec in syscallenter (td=0xfffffe0124253720)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#15 amd64_syscall (td=0xfffffe0124253720, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1191
#16 <signal handler called>
#17 0x00000008061e38aa in ?? ()
Backtrace stopped: Cannot access memory at address 0x1035dd8
(kgdb)

------------------------------------------------------------------------
ps -axlww
…

again, mangled entry
the first of three kernel panics
usefulness of the vmcore.⋯ files is, I assume, limited by my preference for a GENERIC-NODEBUG kernel
I suspect that panics in this situation will be consistently reproducible.
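
As a small aid to reading the report above, the `time = 1641179709` line of the kernel message buffer can be converted to a wall-clock time. This is only a sketch: it assumes the value is seconds since the Unix epoch, and it tries the BSD form of date(1) first, then the GNU form.

```shell
#!/bin/sh
# Epoch value copied from the "time = ..." line of the kernel message buffer.
t=1641179709
fmt='+%Y-%m-%d %H:%M:%S UTC'
# BSD date(1) takes the epoch via -r; GNU date(1) takes it via -d @N.
when=$(date -u -r "$t" "$fmt" 2>/dev/null || date -u -d "@$t" "$fmt")
echo "$when"
```

Under that assumption the result is 2022-01-03 03:15:09 UTC, roughly twelve minutes before the timestamp at the top of the report (Mon Jan 3 03:26:58 GMT 2022).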

covacat said:
the code looks the same to me in -CURRENT.
if it reaches the "mangled entry" it will panic unless fs is r/o
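
The behaviour covacat describes can be modelled in a few lines. This is a userland sketch, not the kernel source: the function name `dirbad_model`, its arguments and the `/media/example` path are hypothetical, the wording of the read-only message is an assumption, and only the panic message layout is taken from the format string visible in the backtrace (`ufs_dirbad: %s: bad dir ino %ju at offset %ld: %s`).

```shell
#!/bin/sh
# dirbad_model MOUNTPOINT INODE OFFSET REASON RDONLY
# Hypothetical model: report the bad directory, and "panic" (here: return 1)
# only when the mount is read-write (RDONLY=0).
dirbad_model() {
    mnt=$1 ino=$2 off=$3 how=$4 rdonly=$5
    if [ "$rdonly" -eq 0 ]; then
        printf 'panic: ufs_dirbad: %s: bad dir ino %s at offset %s: %s\n' \
            "$mnt" "$ino" "$off" "$how"
        return 1    # stands in for the kernel panic
    fi
    printf '%s: bad dir ino %s at offset %s: %s\n' "$mnt" "$ino" "$off" "$how"
    return 0        # read-only mount: message only
}

# Same bad entry, once on a read-write mount and once on a read-only mount.
rw_msg=$(dirbad_model /media/example 2 0 'mangled entry' 0) && rw_rc=0 || rw_rc=$?
ro_msg=$(dirbad_model /tmp/danger 2 0 'mangled entry' 1) && ro_rc=0 || ro_rc=$?
echo "$rw_msg (rc=$rw_rc)"
echo "$ro_msg (rc=$ro_rc)"
```

In this model the only input that changes the outcome is the read-only flag, which is the point of the observation quoted above.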

Food for thought

Why did the imaged device previously not cause a kernel panic when there was write access to the file system at mount time?

Code:

root@mowa219-gjp4-8570p-freebsd:~ # fstyp /dev/da2p1
ufs
root@mowa219-gjp4-8570p-freebsd:~ # mkdir /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # mount -o ro /dev/da2p1 /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # ls -ahl /tmp/danger
total 0
root@mowa219-gjp4-8570p-freebsd:~ # du -hs /tmp/danger
4.0K    /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # df /tmp/danger
Filesystem 1K-blocks    Used  Avail Capacity  Mounted on
/dev/da2p1   7386504 5863112 932472    86%    /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # umount /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # fsck -fn /dev/da2p1
** /dev/da2p1 (NO WRITE)
VALUES IN SUPER BLOCK LSB=128 DISAGREE WITH THOSE IN
LAST ALTERNATE LSB=14085120

IGNORE ALTERNATE SUPER BLOCK? no


LOOK FOR ALTERNATE SUPERBLOCKS? no

root@mowa219-gjp4-8570p-freebsd:~ # sync
root@mowa219-gjp4-8570p-freebsd:~ # mount /dev/da2p1 /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ # umount /tmp/danger
root@mowa219-gjp4-8570p-freebsd:~ #

– there, the penultimate command:

mount /dev/da2p1 /tmp/danger

Cath O'Deray · Jan 4, 2022

For messages that can not be saved to /var/log/messages, it's necessary to examine the core.txt.⋯ file.

Code:

…
ugen1.6: <Verbatim STORE N GO> at usbus1
umass2 on uhub5
umass2: <Verbatim STORE N GO, class 0/0, rev 2.00/3.02, addr 6> on usbus1
da2 at umass-sim2 bus 2 scbus6 target 0 lun 0
da2: <Verbatim STORE N GO > Removable Direct Access SPC-2 SCSI device
da2: Serial Number 17071802004381
da2: 40.000MB/s transfers
da2: 7450MB (15257600 512 byte sectors)
da2: quirks=0x2<NO_6_BYTE>
UFS: forcibly unmounting /dev/da2p1 from /media/Verbatim_STORE_N_GO_17071802004381_p1
WARNING: /tmp/danger was not properly dismounted
panic: ufs_dirbad: /media/Verbatim_STORE_N_GO_17071802004381_p1: bad dir ino 2 at offset 0: mangled entry
…

– there, additional food for thought, the two messages immediately before the panic:

UFS: forcibly unmounting /dev/da2p1 from /media/Verbatim_STORE_N_GO_17071802004381_p1
WARNING: /tmp/danger was not properly dismounted

Cath O'Deray · Jan 4, 2022

mark_j said:
… In other words, the reason for the "mangling" is not the file system and its programming per-se but the underlying hardware. …

For the case of MassimoM (no image of the affected device), we can not tell.

In my case, I'm not certain that hardware is a factor.

I can write, to the device, the (2022-01-02) 7.3G image of the entire device:

Code:

root@mowa219-gjp4-8570p-freebsd:/usr/home/grahamperrin/Documents/IT/BSD/FreeBSD/Verbatim STORE N GO kernel panics # gdd status=progress bs=10240 if=verbatimstorengo-bent.img of=/dev/da2
7802275840 bytes (7.8 GB, 7.3 GiB) copied, 1147 s, 6.8 MB/s
762880+0 records in
762880+0 records out
7811891200 bytes (7.8 GB, 7.3 GiB) copied, 1148.19 s, 6.8 MB/s
root@mowa219-gjp4-8570p-freebsd:/usr/home/grahamperrin/Documents/IT/BSD/FreeBSD/Verbatim STORE N GO kernel panics #
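
A quick cross-check of the figures, as a sketch: the sector count comes from the da2 probe message quoted earlier in the thread, and the record count and block size come from the gdd transcript. The variable names are only for illustration.

```shell
#!/bin/sh
# Figures copied from the transcripts in this thread.
sectors=15257600      # "15257600 512 byte sectors" reported for da2
sector_size=512
records=762880        # "762880+0 records out" from gdd
bs=10240              # bs=10240 on the gdd command line

dev_bytes=$((sectors * sector_size))
img_bytes=$((records * bs))
dev_mib=$((dev_bytes / 1024 / 1024))

echo "device: $dev_bytes bytes ($dev_mib MiB)"
echo "image:  $img_bytes bytes"
[ "$dev_bytes" -eq "$img_bytes" ] && echo "image covers the whole device"
```

Both products come to 7811891200 bytes (7450 MiB), so the image written back is exactly the size of the whole device, not just the UFS partition.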

chungy · Jan 4, 2022

MassimoM said:
I never heard a similar behaviour in linux, because bad-input=crash is never accepted in a production-grade environment, expecially for the kernel.

I just have to interject here: You are absolutely wrong about Linux's behavior. Just as FreeBSD can have a kernel crash with a sufficiently malformed/corrupted UFS, Linux can and will crash with sufficiently malformed and corrupted ext4 (or XFS, or btrfs, or <pick file system>).

Kernel code paths usually assume that the file system is mostly sane; userland utilities like fsck are meant to be used to deal with corrupted file systems before passing off to kernel code.

Perhaps on both sides (Linux and FreeBSD), there's room to improve the file system drivers to try to not crash everything on a corrupted file system, but this is the state of the world today and for the likely long-term future.

Cath O'Deray · Jan 4, 2022

chungy said:
userland utilities like fsck are meant to be used to deal with corrupted file systems before passing off to kernel code.

Thanks, it's useful to think of things in that way.

grahamperrin said:
@MassimoM

Your server

The reason for the tunefs question, on page 1, is an fsck_ffs(8) bug for which the fix is not in FreeBSD 13.0-RELEASE.