FreeBSD 13.0 crashes while trying to run 'startx'!

Did you try a fsck? What was the result?
Oh, I'm really sorry, I forgot to reply.
Anyway, I don't know how to use it :(. I typed that command in multi-user mode and it showed several prompts, all of which, as far as I can remember, defaulted to 'no'. Since I am in single-user mode now, I ran that command again, and it takes me through the prompts, but I don't know how to answer them. The man page is not helpful.
 
So, this is what running sudo fsck_ffs /dev/ada0a in multi-user mode gives:
Code:
** /dev/ada0a (NO WRITE)
** SU+J Recovering /dev/ada0a

USE JOURNAL? no
** Skipping journal, falling through to full fsck

SETTING DIRTY FLAG IN READ_ONLY MODE

UNEXPECTED SOFT UPDATE INCONSISTENCY
** Last Mounted on   /
** Root file system 
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames

UNALLOCATED   I=6414210  OWNER=operator  MODE=100600
SIZE= 4096  MTIME=Mar  15 16:33  2022
FILE=/var/db/entropy/saved-entropy.1

UNEXPECTED SOFT UPDATE INCONSISTENCY

REMOVE? no

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPER BLK
SALVAGE? no
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
594331 files, 6852664 used, 10157318 free, (20758 frags, 1267070 blocks, 0.1% fragmentation)
 
It means the old kernel and the dump don't match; that rules it out and answers my question above.

Can you attach a few lines of strings output from that vmcore.3 file? strings /var/crash/vmcore.3 | head -100.
The warning about the FS not being properly dismounted after a panic/sudden reboot is expected. Were there a deeper issue with the FS, the system would stop.
Here you go.
 

Attachments

Thanks.
Can you re-upload the same command, but with head -1000? 100 lines is not enough; 1000 should be (limiting the output is needed, so it's a bit of a guessing game doing this remotely).

For clarification: are you able to boot this VM after the crash/reboot, or does it end up in single-user mode? There's no need to do any fsck if the machine boots up OK after the reboot. You should not fsck a live FS, especially /.

How much free space do you have in /var/crash (or on /, as you have only one partition)?
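To answer the free-space question from inside the guest, a minimal sketch (a hedged example, not from the thread: /var/crash is the FreeBSD default dump directory, and on this single-partition setup it lives on /):

```shell
# Show free space (in 1K blocks) on the filesystem that holds /var/crash;
# fall back to / if the directory does not exist on the machine at hand.
df -Pk /var/crash 2>/dev/null || df -Pk /
```

A vmcore can approach the size of the machine's RAM, so compare the Available column against physical memory (sysctl hw.physmem on FreeBSD).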
 
bsduck, you are right. The more I shut the machine down from VirtualBox instead of from within FreeBSD, the worse the problem gets. At first I started KDE Plasma 5 by running sudo startx; after that I installed sddm and, as a workaround, ran sudo sddm. Recently I shut the VM down a couple of times through VirtualBox instead of doing it inside the guest, because FreeBSD was not responding (it happens when I try to run the mpv or vlc media players). And now I can't start sddm either!
Now when I run sudo fsck, it seems to me that it has gotten much worse. This is the output:
Code:
** /dev/ada0a (NO WRITE)
** SU+J Recovering /dev/ada0a

USE JOURNAL? no
** Skipping journal, falling through to full fsck

SETTING DIRTY FLAG IN READ_ONLY MODE

UNEXPECTED SOFT UPDATE INCONSISTENCY
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=6410741 (432 should be 424)
CORRECT? no

INODE 6414181: FILE SIZE 156495872 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 229736
ADJUST? no

** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED I=5294874 OWNER=jack MODE=40755
SIZE=1024 MTIME=Mar 15 21:26 2022
DIR=/usr/home/jack/.config/session

UNEXPECTED SOFT UPDATE INCONSISTENCY
SALVAGE? no
UNALLOCATED   I=6414210  OWNER=operator  MODE=100600
SIZE= 4096  MTIME=Mar 15 22:00  2022
FILE=/var/db/entropy/saved-entropy.2

UNEXPECTED SOFT UPDATE INCONSISTENCY

REMOVE? no

UNALLOCATED   I=6900994  OWNER=_tor  MODE=100600
SIZE=0 MTIME=Mar 15 22:04  2022
FILE=/var/db/tor/state

UNEXPECTED SOFT UPDATE INCONSISTENCY

REMOVE? no

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT FILE I=6921337 OWNER=_tor MODE=0
SIZE=0 MTIME=Mar 15 22:04  2022 COUNT 0 SHOULD BE -1
ADJUST? no

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPER BLK
SALVAGE? no
SUMMARY INFORMATION BAD
SALVAGE? no
BLK(S) MISSING IN BIT MAPS
SALVAGE? no
594465 files, 6853308 used, 10126075 free, (19451 frags, 1263328 blocks, 0.1% fragmentation)
 
Thanks.
Can you re-upload the same command, but with head -1000? 100 lines is not enough; 1000 should be (limiting the output is needed, so it's a bit of a guessing game doing this remotely).

For clarification: are you able to boot this VM after the crash/reboot, or does it end up in single-user mode? There's no need to do any fsck if the machine boots up OK after the reboot. You should not fsck a live FS, especially /.

How much free space do you have in /var/crash (or on /, as you have only one partition)?
Um, well, you see, typing out 1000 lines is not exactly a trivial task, so I'm sorry, I won't be able to fulfill that wish. I pasted the 100 lines of output into a file, copied that file to my host, and then copied it again from the host to another VM where I browse the FreeBSD forums. Now that I have lost sddm too (see my earlier post), how can I send it? :(
 
Is your VM still bootable, meaning are you able to get a shell? Are you able to ssh to that VM remotely? If yes, you can just do strings /var/crash/vmcore.3 | head -1000 > vmoutput and copy that file from the VM.
If you can't boot the VM properly, you could insert a bootable CD and retrieve it from there.
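The pipeline above can be tried out on any file; in this sketch a throwaway sample file stands in for /var/crash/vmcore.3 (which needs root to read):

```shell
# Stand-in for the real dump: strings(1) extracts printable sequences
# (4+ characters by default) from binary data.
printf 'panic: demo\000\001binary junk\000stack backtrace\n' > /tmp/core.sample

# Bound the output and save it to a file that can be copied off the VM:
strings /tmp/core.sample | head -1000 > /tmp/vmoutput
cat /tmp/vmoutput
```

The resulting vmoutput file is small plain text, so any transfer method works: scp, a shared folder, or mounted rescue media.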
 
Thanks.
Can you re-upload the same command, but with head -1000? 100 lines is not enough; 1000 should be (limiting the output is needed, so it's a bit of a guessing game doing this remotely).

For clarification: are you able to boot this VM after the crash/reboot, or does it end up in single-user mode? There's no need to do any fsck if the machine boots up OK after the reboot. You should not fsck a live FS, especially /.

How much free space do you have in /var/crash (or on /, as you have only one partition)?
Once or twice, while rebooting, I ended up in single-user mode. While booting, FreeBSD sometimes reboots again. It was too fast for me to read (next time I'm going to record it), but it probably dumps something and then reboots.
 
bsduck says to run fsck, while _martin says not to. :(
We don't know what state your VM is in; it's hard to say what you should or should not do in your particular case. However, you should not run fsck on a live system when / is mounted. Also, FreeBSD will detect a corrupted FS upon boot and will try to fix it automatically. If it fails to do so, it stops the boot and lets you interact with it.
As I mentioned above, it's normal for the system to report a dirty FS after a sudden reboot/crash.

I'm a heavy ZFS user and don't use UFS at all nowadays. grahamperrin traced and reported some bugs in 13 that were since fixed. Some of it was expected behavior with certain settings (soft updates enabled or not, etc.); maybe he's able to shed some light on it (I didn't follow that topic deeply).

EDIT: btw, how are you showing us the fsck output if you can't copy stuff?
 
Is your VM still bootable, meaning are you able to get a shell? Are you able to ssh to that VM remotely? If yes, you can just do strings /var/crash/vmcore.3 | head -1000 > vmoutput and copy that file from the VM.
If you can't boot the VM properly, you could insert a bootable CD and retrieve it from there.
So I followed this thread and now I have accidentally overwritten my home directory :(. But anyway, here is the requested file.
EDIT: btw, how are you showing us the fsck output if you can't copy stuff?
I painfully type them, word for word.
 

Attachments

I painfully type them, word for word.
Sometimes you don't know whether someone on the internet is joking or not. :)

Code:
panic: ufs_dirbad: /: bad dir ino 400652 at offset 4608: mangled entry
cpuid = 0
time = 1647226871
KDB: stack backtrace:
#0 0xffffffff80c57525 at kdb_backtrace+0x65
#1 0xffffffff80c09f01 at vpanic+0x181
#2 0xffffffff80c09d73 at panic+0x43
#3 0xffffffff80f0c11f at ufs_lookup_ino+0xe7f
#4 0xffffffff80cc9f2d at vfs_cache_lookup+0xad
#5 0xffffffff80ccdeed at cache_fplookup_noentry+0x1ad
#6 0xffffffff80ccb5f2 at cache_fplookup+0x322
#7 0xffffffff80cd666f at namei+0x6f
#8 0xffffffff80cf40df at kern_statat+0xcf
#9 0xffffffff80cf47bf at sys_fstatat+0x2f
#10 0xffffffff8108baac at amd64_syscall+0x10c
#11 0xffffffff8106243e at fast_syscall_common+0xf8
Uptime: 1h5m48s
Here's your panic string and backtrace. It's related to the corrupted FS.

I'm not sure why you weren't able to grep the panic string using the previous command I shared.

Now, this could have been the result of a bad shutdown previously done to this VM. Do you have other core files in /var/crash? Can you check for the panic string there?
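On FreeBSD, savecore(8) writes a small text summary, info.N, next to each vmcore.N, and its dump header includes the panic string; grepping those files is the cheapest way to survey all saved dumps. A hedged sketch, using sample files in a scratch directory in place of the real /var/crash/info.*:

```shell
# Sample info.N summaries standing in for /var/crash/info.* (savecore(8)
# writes one such text file per saved vmcore.N).
mkdir -p /tmp/crash-demo
printf 'Panic String: ufs_dirbad: /: bad dir\n' > /tmp/crash-demo/info.0
printf 'Panic String: page fault\n'            > /tmp/crash-demo/info.1

# One line per dump, prefixed with the file name (-H):
grep -H 'Panic String' /tmp/crash-demo/info.*
```

On the guest itself the one-liner would be grep -i 'panic' /var/crash/info.* as root.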
 
How is ZFS better than UFS?
By the features and options it provides. It's a bit heavier on resources, but considering benefit vs. cost, it's way more on the benefit side. That's my personal opinion.

I checked the bug tracker and it seems there are two PRs, but one is linked to PR 244384, which is important (and has quite a few dependents).
 
The only thing that comes to my mind is to boot it into single-user mode, do the fsck, and hope for the best. I don't know if that would help, though.
You can also try to find the directory inode with find / -type d -inum 400652 as root. If nothing else, you'd have an idea of which directory is the problem.
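The inode-to-path lookup suggested above can be tried safely on a scratch directory first; in this sketch a temporary directory and its inode number stand in for / and 400652 (on the real system, run find / -type d -inum 400652 as root):

```shell
# Create a scratch directory and read its inode number (first field of ls -di).
mkdir -p /tmp/inum-demo/target
ino=$(ls -di /tmp/inum-demo/target | awk '{print $1}')

# find maps the inode number back to a pathname:
find /tmp/inum-demo -type d -inum "$ino"
# → /tmp/inum-demo/target
```

Note that inode numbers are only unique per filesystem, so on a multi-partition system you would restrict find to one filesystem (the -x flag on FreeBSD's find).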
 
do the fsck and hope for the best.
I fsck-ed it, and now I can start sddm. But what about the problem of / not being dismounted properly? Also, just immediately after fsck-ing it, it crashed once!
I think I should delete this thread and probably find my answers somewhere else in the forum, like this thread for example.
 
As mentioned above, please run the find command to see which directory has the problem. Run strings on the new vmcore and verify what it crashed on (most likely it's an FS issue again).
This is (most likely) a bug in UFS, so it may well not be fixable by fsck.
 
Wait a minute. When I rebooted the system (from within the VM) because it got stuck again, as it always does when I try to run vlc or mpv (I suspect it's a limitation of the hardware resources), I fsck-ed it and the warning disappeared. Also, startx works (though it opens Plasma as the root user, but I can always log out and log in as a regular user), and so does sddm.
Erm, do you still want the strings?
If not, I think I should delete this thread.
 
UFS soft updates without soft update journaling can be problematic with 13.0-RELEASE-⋯. <https://forums.freebsd.org/posts/523465>

Code:
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
  • this combination of preferences is OK.

… If not, I think I should delete this thread.

Please, do not remove. Leave these writings visible to the public, a realistic record of things can be very useful.

… You should not fsck a live FS, especially /

… you should not run fsck on a live system when / is mounted …

As far as I know – anyone, please correct me if I'm wrong:
  1. although fsck_ffs(8) with option -n can be used on a mounted file system, doing so can be meaningless[SUP]†[/SUP]
  2. if a file system is mounted and -n is omitted, the utility will not attempt any repair.

… I'm a heavy ZFS user and don't use UFS at all nowadays …



[SUP]†[/SUP] Meaningless because normal writes to the file system during a check will cause the utility to report abnormalities that do not exist.
 

244384 – UFS fuzz metabug

This meta bug was opened by Conrad Meyer following a series of methodical reports from Neeraj Pal (presumably bsdboy | <https://dev.to/bsdboy>, Senior Product Security Engineer at Qualcomm).

In the list of dependent bugs, I don't know why the first five of eight are (currently) GEOM-specific; they do involve UFS.

geom(8)

… grahamperrin traced and reported some bugs in 13 that were since fixed. …

I might describe my period of focused testing as thoughtful but far less methodical :)

The fix for 256746 is not yet released.

Other FreeBSD bugs for UFS include:

If 259090 is reproducible with 13.1-BETA⋯, it should be reclassified to 13.1-RELEASE.
 
Erm, do you still want the strings?
If not, I think I should delete this thread.
Don't delete the thread; it may help others googling around. No need to share more strings. I'd suggest looking into why you can't read the dump (insufficient space in /var, maybe?) and checking which directory is affected.
The X startup failure is related to the corrupted files; accessing them is most likely what triggers the bug, too.
 
grahamperrin I'm not 100% sure about UFS; it's been a rule of thumb for many years on other systems. I don't know how a forced check behaves on a mounted FS either.
When an FS is in a state where you do need to run fsck, though, it makes sense not to use it until the check is done and any issue is fixed.
 