Reading/interpreting a backtrace

I need help understanding a thread backtrace.

I've got a kgdb backtrace that makes me wonder, whether I somehow mangled everything with my kernel module.
Upon kldunload of my kernel module, I get the following trace of a kldunload being "stuck" and never returning to the commandline:
Code:
(kgdb) bt
#0  sched_switch (td=td@entry=0xfffffe0119b6d740, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2299
#1  0xffffffff80bfabb7 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:545
#2  0xffffffff80c4cb57 in sleepq_switch (wchan=wchan@entry=0xfffffe011ea4c228, pri=0, pri@entry=-2114226488) at /usr/src/sys/kern/subr_sleepqueue.c:609
#3  0xffffffff80c4ca2e in sleepq_wait (wchan=<unavailable>, wchan@entry=0xfffffe011ea4c228, pri=<unavailable>, pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:660
#4  0xffffffff80b79df6 in _cv_wait (cvp=0xfffffe011ea4c228, lock=0xffffffff81fb76c8 <ww_mutex_global>) at /usr/src/sys/kern/kern_condvar.c:146
#5  0xffffffff80e67606 in linux_ww_mutex_lock_sub (lock=0xfffffe011ea4c208, ctx=0xfffffe011c74eb58, catch_signal=0) at /usr/src/sys/compat/linuxkpi/common/src/linux_lock.c:120
#6  0xffffffff83eeed2c in __ips_chipset_val (ips=0xfffffe011ea4c000) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/gt/intel_rps.c:339
#7  0xffffffff83eee652 in intel_rps_raise_unslice (rps=0xfffffe011c74eb58) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/gt/intel_rps.c:2355
#8  0xffffffff83eb554b in virtual_submission_tasklet (t=0xfffff80002284200) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/gt/intel_execlists_submission.c:3884
#9  0xffffffff83d4d87f in configure_dual_link_mode (encoder=0xfffffe011ea4c000, pipe_config=0x3) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/display/icl_dsi.c:306
#10 0xffffffff83cd88ad in validate_cmds_sorted (engine=0xfffff80016611300, cmd_tables=0xfffff80001228ca8, cmd_table_count=-2048)
    at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/i915_cmd_parser.c:836
#11 0xffffffff83cea6ca in bdw_enable_vblank (_crtc=0xffffffff83eacfb0 <process_csb+304>, _crtc@entry=<error reading variable: value is not available>)
    at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/i915_irq.c:2931
#12 0xffffffff80e69883 in linux_pci_detach_device (pdev=0xfffffe0119b6d740, pdev@entry=0xfffff80016611300) at /usr/src/sys/compat/linuxkpi/common/src/linux_pci.c:492
#13 0xffffffff80e6bbf9 in linux_pci_detach (dev=0xfffff80002284200, dev@entry=<error reading variable: value is not available>) at /usr/src/sys/compat/linuxkpi/common/src/linux_pci.c:482
#14 0xffffffff80c28f45 in DEVICE_DETACH (dev=0xfffffe011c74eb58) at ./device_if.h:234
#15 device_detach (dev=0xfffffe011c74eb58, dev@entry=0xfffff80002284200) at /usr/src/sys/kern/subr_bus.c:2613
#16 0xffffffff80c28527 in devclass_driver_deleted (busclass=busclass@entry=0xfffff80001950500, dc=dc@entry=0xfffff80001d89580, driver=driver@entry=0xffffffff83eacff8 <process_csb+376>)
    at /usr/src/sys/kern/subr_bus.c:742
#17 0xffffffff80c2841f in devclass_delete_driver (busclass=busclass@entry=0xfffff80001950500, driver=driver@entry=0xffffffff83eacff8 <process_csb+376>) at /usr/src/sys/kern/subr_bus.c:815
#18 0xffffffff80e6a70e in linux_pci_unregister_drm_driver (pdrv=<optimized out>) at /usr/src/sys/compat/linuxkpi/common/src/linux_pci.c:906
#19 0xffffffff83cea3b0 in i915gm_enable_vblank (crtc=0xffffffff83cea3b0 <i915gm_enable_vblank+32>) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/i915_irq.c:2847
#20 0xffffffff83cea7e2 in gen11_dsi_configure_te (intel_crtc=0xfffff800155a6c80, enable=255) at /usr/home/user/drm-kmod/drivers/gpu/drm/i915/i915_irq.c:2911
#21 0xffffffff80bc6da2 in module_unload (mod=0xffffffff83e8c818, mod@entry=0xfffff800155a6c80) at /usr/src/sys/kern/kern_module.c:261
#22 0xffffffff80bb659b in linker_file_unload (file=file@entry=0xfffff80001306300, flags=flags@entry=0) at /usr/src/sys/kern/kern_linker.c:715
#23 0xffffffff80bb7e0d in kern_kldunload (td=<unavailable>, td@entry=<error reading variable: value is not available>, fileid=21, fileid@entry=<error reading variable: value is not available>,
    flags=0, flags@entry=<error reading variable: value is not available>) at /usr/src/sys/kern/kern_linker.c:1227
#24 0xffffffff810d13ce in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#25 amd64_syscall (td=0xfffffe0119b6d740, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1200
#26 <signal handler called>
#27 0x000027070c9744ea in ?? ()
Backtrace stopped: Cannot access memory at address 0x27070ad47b58

If I read this correctly, this mixes a thread that came out of a context switch with the actual kldunload. That's what I don't understand. At line 20, apparently the kernel module was doing its thing (but actually should not have, but that's a different story) and then on line 21 we're suddenly in kldunload's program space. What's going on here?

Or am I reading this wrong?
 
At line 20, apparently the kernel module was doing its thing (but actually should not have, but that's a different story) and then on line 21 we're suddenly in kldunload's program space
It's a back trace so you need to read it from bottom up. In frame 25 you see syscall being executed, 23 shows you it's kern_kldunload syscall.
Frames 6 is interesting. At least there's where I'd start debugging to see where's the problem. That function was trying to obtain the mutex it never got. It seems to be in deadlock waiting for the mutex indefinitely (you can't sleep when you hold mutex; you see it got into sleeping).
 
Back
Top