Solved 14.0 - AMDGPU Hard Crash / Reboots drm-515-kmod

Hi thedaemon

I believe I have the same issue here. Would you be so kind as to share the steps you took to fix this? Also, how did you confirm this was a GPU issue?

Thanks a lot!

Edit: Please ignore the second question. I see you've used the core dump to check the problem. :)
 
I've switched from drm-515-kmod-5.15.118_4 to drm-510-kmod-5.10.163_9 and so far it has been stable. I'm not sure why this is happening, but this might be a quick fix for anyone having the same problem as me.

Cheers!
 
The problem with this workaround is that it makes my life a pain every time I want to run pkg autoremove.

pkg then tries to remove all the amdgpu-firmware-... packages because they're linked to drm-515-kmod-5.15.118_4 :(

Any suggestions are highly appreciated 👍
 
… all amdgpu-firmware-... away because they're linked to drm-515-kmod-5.15.118_4

For the latter:

[Attached screenshot: 1715938975282.png]


Try temporarily locking graphics/gpu-firmware-kmod

Postscript: see below for the better approach.
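
For anyone who wants to try the temporary lock anyway, here is a rough sketch. It assumes the package is named gpu-firmware-kmod, like the port; the function names are made up for illustration. Per pkg-lock(8), a locked package should not be deleted or upgraded until it is unlocked.

```shell
# Sketch only: the function names are hypothetical helpers, not part of pkg.
# Assumes the package name matches the port (gpu-firmware-kmod).

# Lock a package so pkg refuses to modify or delete it.
lock_firmware() {
    pkg lock -y "$1"
}

# Undo the lock once the drm-kmod situation is sorted out.
unlock_firmware() {
    pkg unlock -y "$1"
}
```

Run as root, e.g. `lock_firmware gpu-firmware-kmod`, and remember to call `unlock_firmware gpu-firmware-kmod` before the next upgrade, otherwise the locked package will be held back.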
 
This makes pkg try to remove all amdgpu-firmware-... away because they're linked to drm-515-kmod-5.15.118_4
You don't need them all, and they are not linked to a specific drm version.

Check which firmware is loaded

Example:
Code:
 # kldstat | grep amdgpu

0    1 0xffffffff83600000   4129b8 amdgpu.ko
17    1 0xffffffff83a9f000     64e0 amdgpu_renoir_sdma_bin.ko
18    1 0xffffffff83aa6000    2c2e0 amdgpu_renoir_asd_bin.ko
19    1 0xffffffff83ad3000     7560 amdgpu_renoir_pfp_bin.ko
20    1 0xffffffff83adb000     6560 amdgpu_renoir_me_bin.ko
21    1 0xffffffff83ae2000     4560 amdgpu_renoir_ce_bin.ko
22    1 0xffffffff83ae7000     bcd8 amdgpu_renoir_rlc_bin.ko
23    1 0xffffffff83af3000    43800 amdgpu_renoir_mec_bin.ko
24    1 0xffffffff83b37000    43800 amdgpu_renoir_mec2_bin.ko
25    1 0xffffffff83b7b000    1fbe8 amdgpu_renoir_dmcub_bin.ko
26    1 0xffffffff83b9b000    645a0 amdgpu_renoir_vcn_bin.ko

Afterwards, check whether the firmware is on the pkg autoremove list.
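
One way to do that check without removing anything is a dry run. This is only a sketch: the helper name is made up, and it assumes the firmware packages have "gpu-firmware" in their name, as gpu-firmware-amd-kmod-renoir does.

```shell
# Sketch only: firmware_on_autoremove_list is a hypothetical helper name.
# "pkg autoremove -n" is a dry run: it lists what would go, removes nothing.
# Prints the matching lines; exits non-zero when no firmware is listed.
firmware_on_autoremove_list() {
    pkg autoremove -n | grep 'gpu-firmware'
}
```

If `firmware_on_autoremove_list` prints nothing, the firmware is not scheduled for removal and there is nothing to fix.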

If it is listed, mark the package as non-automatic so that autoremove leaves it alone:
Code:
 # pkg set -A 0 gpu-firmware-amd-kmod-renoir

Replace 'renoir' with the GPU name of your system.

For documentation see pkg-set(8)
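
If you are unsure which GPU name applies, the loaded module names give a hint. The helpers below are a sketch with made-up names. The first assumes module names look like amdgpu_<family>_<block>_bin.ko with a single-token block name, as in the kldstat output above, so treat the result as a guess. The second reads the automatic flag via `pkg query '%a'`, which prints 1 for automatically installed packages.

```shell
# Sketch only: both function names are hypothetical helpers.

# Read kldstat output on stdin and print the distinct GPU family names.
# Heuristic: strips the last "_<block>_bin.ko" part of each module name.
amdgpu_families() {
    sed -n 's/.*amdgpu_\(.*\)_[a-z0-9][a-z0-9]*_bin\.ko.*/\1/p' | sort -u
}

# Succeed if the given package is still flagged as automatic,
# i.e. a candidate for pkg autoremove once nothing depends on it.
is_automatic() {
    [ "$(pkg query '%a' "$1")" = "1" ]
}
```

Usage would be something like `kldstat | amdgpu_families`, followed by `is_automatic gpu-firmware-amd-kmod-renoir && echo "still automatic"` with your own GPU name substituted.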
 
I suspect that he is using graphics/drm-kmod, which sets gpu-firmware as a dependency.

So when he removes drm-515, the metapackage gets removed too, and the gpu-firmware with it.
I'm not sure what happens on jcamos's system; I can't reproduce the issue described.

On a VM I installed graphics/drm-kmod, which pulled in graphics/drm-515-kmod and all firmware of graphics/gpu-firmware-amd-kmod (The host has AMD 'Lucienne' GPU).

Then:
Code:
 # pkg del -f drm-515-kmod
 # pkg install drm-510-kmod

Running pkg autoremove doesn't list anything.

By the way, on the host I'm running drm-510-kmod instead of drm-515-kmod. With drm-515-kmod, Xorg becomes sluggish over time, to the point of locking up. I might try the version in the PR you linked. Thanks for the suggestion.
 
How exactly do you 'use' this branch? I have the same issue, in console there is no problem, but in GNOME, it just locks up and reboots in minutes.

i9 13900k on asus B760 prime with RX580.
I cloned it, switched to the required branch, and ran make and make install. At first make failed because I did not have the system sources installed. After installing them, everything was fine. Though I was running 14.1-BETA1.
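
Roughly, those steps could look like the sketch below. The function name is made up, and the repository URL, branch and work directory are placeholders you have to supply yourself. The check for kernel sources under /usr/src/sys is my assumption about where the build looks for them, overridable via SYSDIR.

```shell
# Sketch only: build_drm_branch is a hypothetical helper.
# Arguments: <repo-url> <branch> <workdir>, all supplied by the caller.
build_drm_branch() {
    repo_url=$1
    branch=$2
    workdir=$3
    # Assumption: kernel sources live in /usr/src/sys unless SYSDIR says otherwise.
    src_dir=${SYSDIR:-/usr/src/sys}

    # make fails without the system sources, so check for them first.
    if [ ! -d "$src_dir" ]; then
        echo "kernel sources not found in $src_dir" >&2
        return 1
    fi

    # Clone only the requested branch, then build and install the modules.
    git clone --branch "$branch" "$repo_url" "$workdir" || return 1
    (cd "$workdir" && make && make install)
}
```

The install step needs root, and a reboot (or reloading the modules) is needed before the new build is actually in use.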
 