panic when connecting or disconnecting A/C power

michls

New Member

Reaction score: 1
Messages: 6

Hi all,

my Ryzen 4700-powered HP ProBook, running CURRENT as of July 10, is now capable of suspend and resume (I assume thanks to drm-devel from the same date). I ran "pkg update" and "pkg upgrade" just today.

Whenever I connect or disconnect A/C power, it panics immediately and drops me into kdb.

Since I'm dropped into the console at that moment, I only see part of the stack backtraces, two of them to be precise:
The first seems to be repeated several times; I see two instances of it. It starts with
Code:
WARNING !drm_modeset_is_locked(plane->mutex) failed at ...
#0 0xf.... at linux_dump_stack+0x23
...

the second one has cpuid and a timestamp and then:
Code:
KDB: stack backtrace:
<stuff that looks fairly standard trap handling>
--- trap 0xc, rip ... --
amdgpu_pm_acpi_event_handler() at amdgpu_pm_acpi_event_handler+0x69/frame 0x...
amdgpu_acpi_event() at amdgpu_acpi_event+0x62/frame 0x...
linux_handle_acpi_acad_event() at linux_handle_acpi_acad_event+0x3c/frame 0x...
...

I think it may be relevant that I'm seeing tons of these three messages in the logs and on the text console:

Code:
Jul 31 18:02:30 hbeast kernel: Firmware Error (ACPI): AE_AML_PACKAGE_LIMIT, Index (0x000000005) is beyond end of object (length 0x5) (20210604/exoparg2-569)
Jul 31 18:02:30 hbeast kernel: ACPI Error: Aborting method \_TZ.GTTP due to previous error (AE_AML_PACKAGE_LIMIT) (20210604/psparse-689)
Jul 31 18:02:30 hbeast kernel: ACPI Error: Aborting method \_TZ.CHGZ._TMP due to previous error (AE_AML_PACKAGE_LIMIT) (20210604/psparse-689)

... roughly every ten seconds.
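To check the "every ten seconds" impression against the log itself, something like the following rough sketch could be used. It assumes the syslog line format shown above; the function name `summarize` and the `year` default are made up for illustration (syslog lines carry no year), so treat it as a starting point rather than a finished tool.

```python
import re
from datetime import datetime

# Syslog prefix as in the lines quoted above: "Jul 31 18:02:30 host kernel: msg"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (.*)$")
# Picks the method name and status out of the "Aborting method" lines.
METHOD_RE = re.compile(r"Aborting method (\S+) due to previous error \((\w+)\)")


def summarize(lines, year=2021):
    """Return (seconds between firmware errors, {aborted method: count}).

    The year is an assumption, because syslog timestamps omit it.
    """
    stamps = []
    methods = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a kernel message in the expected format
        stamp, msg = m.groups()
        if msg.startswith("Firmware Error (ACPI)"):
            stamps.append(
                datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            )
        aborted = METHOD_RE.search(msg)
        if aborted:
            name = aborted.group(1)
            methods[name] = methods.get(name, 0) + 1
    intervals = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return intervals, methods
```

Fed the lines from the system log (for example `summarize(open("/var/log/messages"))`), the first return value lists the gaps between consecutive firmware errors and the second counts how often each method was aborted.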

I'd be grateful for any pointers or advice to get my power management fixed.

regards & TIA
Michael
 

mark_j

Daemon

Reaction score: 681
Messages: 1,192

A quick peruse of the web shows:
This seems to point to a BIOS/UEFI issue. Any updates available? That's your first port of call.
 

michls

mark_j thank you - I was hoping I'd get around it since AFAICT these messages are not present when I'm using the other OS ... (and since that's not the one from Redmond, HP only offers an update from the BIOS itself, which I'm a bit hesitant about, as I don't have a FAT partition ... )

cheers!
 

George

Aspiring Daemon

Reaction score: 201
Messages: 506

I guess the ACPI errors refer to a thermal zone (TZ). If they show up every 10 seconds, maybe the temperature is polled every 10 seconds or so, and each time the firmware reports an invalid value.
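For what it's worth, the index error quoted in the log reads like a plain off-by-one: ACPI packages are indexed from zero, so a package of length 0x5 only has valid indices 0 through 4. A toy sketch of that bounds check follows; the function `package_index` and the five-entry table are invented for illustration and say nothing about what the real AML in \_TZ.GTTP looks like.

```python
def package_index(package, index):
    """Mimic the kind of bounds check behind AE_AML_PACKAGE_LIMIT (toy model)."""
    if index >= len(package):
        raise IndexError(
            f"AE_AML_PACKAGE_LIMIT: Index (0x{index:X}) is beyond end of "
            f"object (length 0x{len(package):X})"
        )
    return package[index]


# Hypothetical five-entry table standing in for whatever the firmware indexes.
table = [10, 20, 30, 40, 50]

last_valid = package_index(table, 4)  # index 4 is the last valid element
# package_index(table, 5) would raise, matching "Index (0x5) ... (length 0x5)"
```

If the firmware method really does compute an index one past the end, only a DSDT fix (from HP or via an override) or a workaround in the OS would silence it.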

The kernel backtrace shows an amdgpu event handler that should handle ACAD events (AC adapter events). I would think that the bug is in these event handlers.


Does the laptop boot with the cable unplugged?

I would make a PR.
 

michls

Hi George,
thx for your feedback. The laptop boots with and without a plugged-in power cable.

btw: even removing the power cable while the laptop is suspended causes the panic upon resume.
 

michls

Another update: I've updated to the latest BIOS, with no (real) change. I haven't looked at the new stack traces when the laptop panics, but panic it definitely does (the ACPI error messages look the same).
 

George

Maybe you could boot without the graphics drivers and see whether the problem persists (to find out if it's really amdgpu-related). But either way, create a bug report. ;D I don't think anything can be done except writing a patch.
 

michls

Your point being I should address this to the mailing list? Thx for the reminder :)
 