Lenovo ThinkCentre M90q Gen 2 - ACPI issue (hangup)

On my ThinkCentre everything works without issues until the temperature goes over ~60°C.

Then the CPU fan stops for a couple of seconds and then starts blowing at full speed. While the temperature
is decreasing (down to under 40°C), the fans continue to blow, and after 2-4 minutes the computer hangs and
only a hard reboot can bring it back to life.

During this period of 2-4 minutes, the log is filled with error messages:

AcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable
ACPI Error: AE_NO_MEMORY, Unable to queue handler for GPE 24 - event disabled (20201113/evgpe-1049)
ACPI Error: AE_NO_MEMORY, Unable to queue handler for GPE 24 - event disabledAcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable
(20201113/evgpe-1049)

I tried increasing the debug.acpi.max_tasks tunable up to 4096, but it did not resolve the issue.
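To get a feel for which GPE is storming and how often, the messages can be tallied from a saved log. This is only a minimal sketch: the function name `count_gpe_errors` is made up for illustration, the sample lines are the ones quoted in this post, and the GPE number is kept as a string on the assumption that it may be printed in hexadecimal.

```python
import re
from collections import Counter

# Matches the ACPI error text quoted above and captures the GPE number.
# The number is kept as a string (assumption: it may be hexadecimal).
GPE_RE = re.compile(r"Unable to queue handler for GPE ([0-9A-Fa-f]+)")


def count_gpe_errors(lines):
    """Count 'Unable to queue handler' errors per GPE number."""
    counts = Counter()
    for line in lines:
        # findall also copes with interleaved console output on one line
        for gpe in GPE_RE.findall(line):
            counts[gpe] += 1
    return counts


sample = [
    "AcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable",
    "ACPI Error: AE_NO_MEMORY, Unable to queue handler for GPE 24 - event disabled (20201113/evgpe-1049)",
    "ACPI Error: AE_NO_MEMORY, Unable to queue handler for GPE 24 - event disabledAcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable",
    "(20201113/evgpe-1049)",
]

counts = count_gpe_errors(sample)
for gpe, n in counts.items():
    print(f"GPE {gpe}: {n} queue failures")
```

If a single GPE accounts for all the failures, as it does in the excerpt here, that points at one event source rather than a generally undersized task queue.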

The latest BIOS (12/15/2022, ver. M3JKT38A) has been installed.

All details about the computer (acpidump, devinfo, sysctl, dmesg, dmidecode) are in attached files.
 

Attachments

  • sysctl_-a.txt (632.7 KB)
  • dmesg.boot_verbose.txt (85.5 KB)
  • acpidump_-dt.txt (11 KB)
  • devinfo_-v.txt (21.4 KB)
  • dmesg-errors+dmidecode.txt (43.6 KB)

> On my ThinkCentre everything works without issues, until the temperature goes over ~60°

How high does it ever go?

> Then, CPU fan stop for couple of seconds and then start blowing at full speed. While temperature is decreasing (down to under 40°) fans continue to blow and after 2-4 minutes computer hangs and
> only hard reboot can bring it back to life.

Seems it's shifting to _AC0 (fastest fan) state, which should work up to 82.1C, but it's not shifting back to _AC1 once again below 55.1°C ?

I'd probably post this - with excellent detail BTW - to the freebsd-acpi@ list (after subscribing probably) assuming a likely ACPI issue, bearing in mind that I'm just guessing from many years ago :)

> During this period of 2-4 minutes, log is filled with error messages :
>
> AcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable
> ACPI Error: AE_NO_MEMORY, Unable to queue handler for GPE 24 - event disabled (20201113/evgpe-1049)
> ...


> I did try to increase the debug.acpi.max_tasks tunable up to 4096 but it did not help to resolve the issue.

I suspect any size will only fill up if it's rejecting tasks, perhaps to do with _AC0.

Code:
hw.acpi.thermal.tz0._TSP: -1
hw.acpi.thermal.tz0._TC2: -1
hw.acpi.thermal.tz0._TC1: -1
hw.acpi.thermal.tz0._ACx: 82.1C 55.1C 50.1C 45.1C 40.1C -1 -1 -1 -1 -1
hw.acpi.thermal.tz0._CRT: 105.1C
hw.acpi.thermal.tz0._HOT: -1
hw.acpi.thermal.tz0._PSV: -1
hw.acpi.thermal.tz0.thermal_flags: 0
hw.acpi.thermal.tz0.passive_cooling: 0
hw.acpi.thermal.tz0.active: 4
hw.acpi.thermal.tz0.temperature: 44.1C
hw.acpi.thermal.user_override: 0
hw.acpi.thermal.polling_rate: 10
hw.acpi.thermal.min_runtime: 0
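
As a rough sketch of how those trip points would map to an active cooling level: this assumes the driver simply picks the hottest _ACx threshold the temperature has reached and ignores hw.acpi.thermal.min_runtime. That is my reading, not checked against the acpi_thermal source, and the function names are made up.

```python
def parse_acx(value):
    """Parse an _ACx sysctl value into a list of floats; '-1' (unset) becomes None."""
    trips = []
    for tok in value.split():
        trips.append(None if tok == "-1" else float(tok.rstrip("C")))
    return trips


def expected_active(temp_c, acx):
    """Return the index of the hottest trip point reached, or -1 if none is.

    Assumption: _AC0 is the hottest threshold and the list is in descending order,
    as in the sysctl output above.
    """
    for i, trip in enumerate(acx):
        if trip is not None and temp_c >= trip:
            return i
    return -1


ACX = parse_acx("82.1C 55.1C 50.1C 45.1C 40.1C -1 -1 -1 -1 -1")

for temp in (39.0, 44.1, 46.1, 79.0, 82.1):
    level = expected_active(temp, ACX)
    name = f"_AC{level}" if level >= 0 else "none"
    print(f"{temp:5.1f}C -> {name}")
```

Under that assumption 44.1C lands on _AC4, which matches hw.acpi.thermal.tz0.active: 4 above, and _AC0 should only be selected from 82.1C upward.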

> All details about the computer (acpidump, devinfo, sysctl, dmesg) are in attached files.

Couple of obs:

If you don't have the latest BIOS, start there. (i.e. is 20201113 the BIOS date?)

> hw.acpi.cpu.cx_lowest: C1

C8 will run cooler, especially on relatively idle cpus, and with the firepower you have should be unnoticeable.

Consider running coretemp ?

Oh, just noticed:

<118>Thu Feb 2 12:44:51 CET 2023
<118>Feb 2 12:45:35 <auth.notice> xxxx dbus-daemon[1987]: [system] Rejected send message, 2 matched rules; type="method_call", sender=":1.18" (uid=1001 pid=2278 comm="/usr/local/lib/libexec/org_kde_powerdevil") interface="org.freedesktop.ConsoleKit.Manager" member="CanSuspendThenHibernate" error name="(unset)" requested_reply="0" destination="org.freedesktop.ConsoleKit" (uid=0 pid=2178 comm="/usr/local/sbin/console-kit-daemon --no-daemon")
acpi_tz0: switched from _AC4 to _AC3: 46.1C

I hope KDE powerdevil is not messing with temperature control, but is turned OFF?

Are you running powerd{,++}? If not, why not?

cheers, Ian
 
While working this out, you may find it useful to try limiting temperature with sysutils/powerdxx as shown here, esp. post #17:

At a guess, limiting below that _AC0 temp. of 82.1°C ?
 
> How high does it ever go?
I never saw it higher than 79°C, but most probably it is as you said: once 82.1°C is reached,
the fans go to full speed and (almost) never come back to normal speed. Almost, because
it has already happened that they came back to the normal _AC3 / _AC4 speed.

On the other hand, temperature should not be the issue. When the machine freezes,
the CPU temperature is around 40°C.

> Seems it's shifting to _AC0 (fastest fan) state, which should work up to 82.1C, but it's not shifting back to _AC1 once again below 55.1°C ?
That seems to be the case, indeed.

> Couple of obs:
>
> If you don't have the latest BIOS, start there. (i.e. is 20201113 the BIOS date?)
I do have the latest BIOS version (12/15/2022).

> hw.acpi.cpu.cx_lowest: C1

> C8 will run cooler, especially on relatively idle cpus, and with the firepower you have should be unnoticeable.
I've tried going with C3 (C3 handles deeper states than C3), but the system becomes quite sluggish.
I did monitor the power consumption and temperature; the difference is not staggering.

> Consider running coretemp ?

I did try coretemp, and the temperature difference between ACPI and coretemp is interesting:

dev.cpu.0.coretemp.delta: 50
dev.cpu.0.temperature: 50.0C
dev.cpu.1.coretemp.delta: 51
dev.cpu.1.temperature: 49.0C
dev.cpu.2.coretemp.delta: 51
dev.cpu.2.temperature: 49.0C
dev.cpu.3.coretemp.delta: 50
dev.cpu.3.temperature: 50.0C
dev.cpu.4.coretemp.delta: 50
dev.cpu.4.temperature: 50.0C
dev.cpu.5.coretemp.delta: 51
dev.cpu.5.temperature: 49.0C
dev.cpu.6.coretemp.delta: 50
dev.cpu.6.temperature: 50.0C
dev.cpu.7.coretemp.delta: 48
dev.cpu.7.temperature: 52.0C
dev.cpu.8.coretemp.delta: 52
dev.cpu.8.temperature: 48.0C
dev.cpu.9.coretemp.delta: 50
dev.cpu.9.temperature: 50.0C
dev.cpu.10.coretemp.delta: 49
dev.cpu.10.temperature: 51.0C
dev.cpu.11.coretemp.delta: 42
dev.cpu.11.temperature: 58.0C

hw.acpi.thermal.tz0.temperature: 48.1C
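
One thing that falls out of these numbers: in every row, delta plus temperature adds up to 100, so the delta looks like the distance to a 100°C limit (presumably TjMax, but that is an assumption on my part). A small sketch to check this and to compare the hottest core with the ACPI zone, using three rows from the listing above; `parse_coretemp` is a made-up helper name.

```python
import re

# Matches both 'dev.cpu.N.coretemp.delta: 50' and 'dev.cpu.N.temperature: 50.0C'
CORE_RE = re.compile(r"dev\.cpu\.(\d+)\.(coretemp\.delta|temperature):\s*([\d.]+)C?")


def parse_coretemp(text):
    """Return {cpu: {'delta': float, 'temp': float}} from sysctl-style output."""
    cores = {}
    for m in CORE_RE.finditer(text):
        cpu = int(m.group(1))
        key = "delta" if m.group(2) == "coretemp.delta" else "temp"
        cores.setdefault(cpu, {})[key] = float(m.group(3))
    return cores


# Three rows copied from the listing above; paste the full sysctl output to check all cores.
SAMPLE = """\
dev.cpu.0.coretemp.delta: 50
dev.cpu.0.temperature: 50.0C
dev.cpu.7.coretemp.delta: 48
dev.cpu.7.temperature: 52.0C
dev.cpu.11.coretemp.delta: 42
dev.cpu.11.temperature: 58.0C
"""
ACPI_TZ0 = 48.1  # hw.acpi.thermal.tz0.temperature from the same reading

cores = parse_coretemp(SAMPLE)
implied_limit = {c["delta"] + c["temp"] for c in cores.values()}
hottest = max(c["temp"] for c in cores.values())
gap = round(hottest - ACPI_TZ0, 1)

print("delta + temperature per core:", implied_limit)
print(f"hottest core {hottest}C vs ACPI tz0 {ACPI_TZ0}C (gap {gap}C)")
```

So the ACPI zone reads roughly 10°C below the hottest core here, which means the _ACx thresholds are being compared against a cooler sensor than the one coretemp reports.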


> Oh, just noticed:
>
> <118>Thu Feb 2 12:44:51 CET 2023
> <118>Feb 2 12:45:35 <auth.notice> xxxx dbus-daemon[1987]: [system] Rejected send message, 2 matched rules; type="method_call", sender=":1.18" (uid=1001 pid=2278 comm="/usr/local/lib/libexec/org_kde_powerdevil") interface="org.freedesktop.ConsoleKit.Manager" member="CanSuspendThenHibernate" error name="(unset)" requested_reply="0" destination="org.freedesktop.ConsoleKit" (uid=0 pid=2178 comm="/usr/local/sbin/console-kit-daemon --no-daemon")
> acpi_tz0: switched from _AC4 to _AC3: 46.1C
>
> I hope KDE powerdevil is not messing with temperature control, but is turned OFF?
>
> Are you running powerd{,++}? If not, why not?
>
> cheers, Ian

I really think the issue is on the BIOS / ACPI side, as the temperature is not high
when it crashes. The storm of kernel messages also points to an ACPI issue.

Thank you for your thoughts and your feedback.

P.S.
powerd{xx|++} is managing the CPU freq. As said, that would probably not resolve the issue.

 
I am having a similar problem, and so far BIOS updates have not resolved it. The only way this Lenovo M90q Gen 2 is stable is when the fans run at full speed. Mine has an i9 CPU, which is quite power hungry.

CPU: 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz (2496.00-MHz K8-class CPU)

Was a solution to this issue ever found?
 