Random crashes when starting some applications and problems with kgdb

I get occasional random crashes when opening some applications in KDE. The most common culprits are libreoffice, gimp and audacity. When this happens, everything freezes as soon as the application starts, and after a few seconds the system reboots and saves a crash dump. However, when I try to run kgdb on the dump I get the error message "A problem internal to GDB has been detected".
Code:
curlew:/var/crash# kgdb /boot/kernel/kernel vmcore.7
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.2".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
(No debugging symbols found in /boot/kernel/kernel)
/wrkdirs/usr/ports/devel/gdb/work-py39/gdb-13.1/gdb/thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x131ffd1 ???
0x17bda53 ???
0x17bd8b8 ???
0x1c48e03 ???
0x177cfbf ???
0x1454490 ???
0x175e2df ???
0x13536e4 ???
0x178b3b7 ???
0x153b4b5 ???
0x1539edb ???
0x15388de ???
0x122c5c4 ???
---------------------
/wrkdirs/usr/ports/devel/gdb/work-py39/gdb-13.1/gdb/thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

Code:
curlew:/var/crash# cat info.7
Dump header from device: /dev/gpt/swap1
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1759834112
  Blocksize: 512
  Compression: none
  Dumptime: 2023-09-03 14:28:06 +0100
  Hostname: curlew
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 13.2-RELEASE-p2 GENERIC
  Panic String: page fault
  Dump Parity: 3569274929
  Bounds: 7
  Dump Status: good

curlew:/var/crash#
 
I suspect the error is due to the missing symbols. You don't seem to have kernel-dbg installed.
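
A quick way to confirm whether the symbols are present is to test for the debug file itself. This is only a sketch: it assumes the symbols live at /usr/lib/debug/boot/kernel/kernel.debug, the usual location on FreeBSD 13, and check_kernel_debug is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a kernel debug-symbol file exists.
# The default path is an assumption (the usual FreeBSD 13 location);
# pass another path as $1 to check somewhere else.
check_kernel_debug() {
    debug_file="${1:-/usr/lib/debug/boot/kernel/kernel.debug}"
    if [ -f "$debug_file" ]; then
        echo "found: $debug_file"
    else
        echo "missing: $debug_file"
        return 1
    fi
}

# Example call, using the default path:
#   check_kernel_debug
```

If the file is reported missing, kgdb has only the stripped kernel to work with, which matches the "(No debugging symbols found in /boot/kernel/kernel)" line in the output above.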
 
Do I see correctly and this is still 13.1? You will want to upgrade to 13.2 anyway. Packages are now being built for 13.2 because 13.1 is EoL. This might be the source of your problems, running 13.2 executables on 13.1.
 
Do I see correctly and this is still 13.1? You will want to upgrade to 13.2 anyway. Packages are now being built for 13.2 because 13.1 is EoL. This might be the source of your problems, running 13.2 executables on 13.1.
That had me confused too but 'GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]' refers to the version of the package, not the OS.
Code:
curlew:/home/mike% pkg info -E gdb
gdb-13.1_3

curlew:/home/mike% freebsd-version -kru
13.2-RELEASE-p2
13.2-RELEASE-p2
13.2-RELEASE-p2
So it looks like I need to install kernel-dbg
 
The addresses don't look right, and even without debug symbols you should still see the public symbols (those shown with nm kernel).
Either the stack is fubar'd or something is wrong with the dump or with gdb.
 
I suspect the error is due to the missing symbols. You don't seem to have kernel-dbg installed.
I've made a bit of progress but no solution yet.

I installed a 13.2 bhyve guest with kernel debugging enabled and copied the crash dump files into it. Running kgdb in the bhyve guest produced what looks like more useful output:
Code:
root@freebsd132-vm:/var/crash # kgdb /boot/kernel/kernel vmcore.7
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.2".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
drmn0: [drm] GPU HANG: ecode 7:1:85dffffd, in MainThread [101162]


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address   = 0x61
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80f55537
stack pointer           = 0x28:0xfffffe00c36b3b60
frame pointer           = 0x28:0xfffffe00c36b3ba0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (linuxkpi_short_wq_0)
trap number             = 12
panic: page fault
cpuid = 2
time = 1693747686
KDB: stack backtrace:
#0 0xffffffff80c53d95 at kdb_backtrace+0x65
#1 0xffffffff80c06711 at vpanic+0x151
#2 0xffffffff80c065b3 at panic+0x43
#3 0xffffffff810b1fa7 at trap_fatal+0x387
#4 0xffffffff810b1fff at trap_pfault+0x4f
#5 0xffffffff81088e28 at calltrap+0x8
#6 0xffffffff80f5562d at kmem_free+0x2d
#7 0xffffffff83b6381d at __i915_gpu_coredump_free+0x12d
#8 0xffffffff83b34dd9 at intel_gt_handle_error+0xa9
#9 0xffffffff83b20131 at heartbeat+0x2a1
#10 0xffffffff80e63603 at linux_work_fn+0xe3
#11 0xffffffff80c68931 at taskqueue_run_locked+0x191
#12 0xffffffff80c69bf3 at taskqueue_thread_loop+0xc3
#13 0xffffffff80bc2f9e at fork_exit+0x7e
#14 0xffffffff81089e9e at fork_trampoline+0xe
Uptime: 5h55m6s
Dumping 1678 out of 16119 MB:..1%..11%..21%..31% (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort) ..41% (CTRL-C to abort) ..51%..61%..71%..81%..91%

warning: Could not load shared library symbols for 4 libraries, e.g. i915kms.ko.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
0xffffffff80c064fe in ?? () at /usr/src/sys/kern/kern_shutdown.c:353
353     /usr/src/sys/kern/kern_shutdown.c: No such file or directory.
(kgdb)
I have no skills in working with dumps but am I right in thinking that the crash originated from the i915kms.ko module installed by graphics/drm-510-kmod?

I assume the warning about missing shared library symbols for i915kms.ko is just due to graphics/drm-510-kmod not being installed on the bhyve guest.

I have tried to enable kernel debugging on my working PC by extracting /usr/freebsd-dist/kernel-dbg.txz from the installation ISO into /usr/lib/debug/boot/kernel and rebooting, but a subsequent attempt to debug the crash dump produced the original error message, "A problem internal to GDB has been detected". Is there a further step I need to take in order to enable kernel-dbg? I'm assuming that I should be able to do this without compiling a new kernel, because the installer only appears to extract this tar file without building a kernel from source.
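
One thing that may be worth checking here, as an assumption rather than a confirmed diagnosis: FreeBSD distribution sets normally store paths relative to the filesystem root, so kernel-dbg.txz would already contain usr/lib/debug/boot/kernel/ inside it. If so, extracting it into /usr/lib/debug/boot/kernel would nest the files one level too deep, where kgdb will not look for them. A minimal sketch for inspecting the archive before unpacking it; kernel_debug_paths is a hypothetical helper name:

```shell
#!/bin/sh
# List the paths stored in an archive that end in kernel.debug, so the
# correct extraction root can be seen before unpacking anything.
kernel_debug_paths() {
    tar -tf "$1" | grep 'kernel\.debug$'
}

# Example (archive location taken from the post above):
#   kernel_debug_paths /usr/freebsd-dist/kernel-dbg.txz
# If that prints ./usr/lib/debug/boot/kernel/kernel.debug, the archive
# is meant to be extracted relative to / :
#   tar -xf /usr/freebsd-dist/kernel-dbg.txz -C /
```

Listing first and extracting second avoids guessing at the layout; the printed path shows directly which directory the archive expects as its root.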
 
I have no skills in working with dumps but am I right in thinking that the crash originated from the i915kms.ko module installed by graphics/drm-510-kmod?
I'm definitely not an expert in reading these either, but yes, I would suspect the same thing.
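
A rough way to check that suspicion without any kgdb skills is to reduce the "KDB: stack backtrace" lines to frame numbers and function names, then look at which functions sit just below the trap frames. The sketch below assumes the console output has been saved to a text file (backtrace.txt is a hypothetical name), and backtrace_frames is a hypothetical helper:

```shell
#!/bin/sh
# Print "frame function" for each line shaped like
#   #7 0xffffffff83b6381d at __i915_gpu_coredump_free+0x12d
# Reads the named files, or standard input if none are given.
backtrace_frames() {
    awk '$1 ~ /^#[0-9]+$/ && $3 == "at" {
        sub(/\+0x[0-9a-f]+$/, "", $4)   # drop the +0x... offset
        print $1, $4
    }' "$@"
}

# Example:
#   backtrace_frames backtrace.txt
```

In the dump above, the frames immediately below kmem_free are __i915_gpu_coredump_free, intel_gt_handle_error and heartbeat, and the message buffer opens with a drm GPU HANG line, which is consistent with the fault being reached through the i915 driver's error handling.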

I'm assuming that I should be able to do this without compiling a new kernel because the installer only appears to extract this tar file without building a kernel from source.
The GENERIC kernel has some debug options set. If you're using a custom kernel those may have been turned off?
 