i915kms GPU hung on FreeBSD 12.2-RELEASE-p1

agxak

New Member

Reaction score: 1
Messages: 2

Hi. First of all, I am new to FreeBSD, so I apologize in advance for any mistakes.
I have used FreeBSD before, but only for about a week, on this same PC. I recently installed the latest FreeBSD release, installed Xorg and tried several desktop environments such as KDE and GNOME, but they ran slowly whenever compositing was enabled. I ended up installing Xfce, the only desktop I could run without problems (others, such as LXQt and Cinnamon, ended with a segmentation fault). Later I noticed that when I ran a program that used graphics acceleration more heavily than usual (especially 2D acceleration), X became slow and then closed. Checking the console messages, I found something like this:

Code:
error: [drm:pid12:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
info: [drm] capturing error event; look for more information in sysctl hw.dri.0.info.i915_error_state
error: [drm:pid:12:i915] *ERROR* Chip reset has failed... Declaring GPU Hung
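
To see how often the hang recurs, the messages above can be counted in the system log. This is only a sketch: the function name `count_gpu_hangs` is hypothetical, and the default path `/var/log/messages` is an assumption about where syslog stores kernel messages on this machine.

```shell
# Count "GPU hung" hangcheck reports in a log file.
# The default path is an assumption; pass another file as the first argument.
count_gpu_hangs() {
    log_file="${1:-/var/log/messages}"
    # -F treats the pattern as a fixed string, -c prints the number of matching lines.
    grep -F -c 'Hangcheck timer elapsed... GPU hung' "$log_file"
}
```

Usage: `count_gpu_hangs` or `count_gpu_hangs /path/to/saved.log`.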

I have an old 4 Series chipset: the Intel G41 Express chipset with the integrated Intel Graphics Media Accelerator X4500. I am using the i915kms module included in the base system, the modesetting DDX and Mesa DRI. I also have the legacy Intel VA-API driver installed. I was originally using the intel DDX driver, but I switched to modesetting after reading that it is recommended for Intel GPUs. After the change the problem occurs less often, and Xorg no longer ends with a segmentation fault when opening specific applications (though it still does so at random times for no apparent reason). But now every time the GPU hang happens there is corruption on the screen: flickering, incorrectly displayed items, areas of the screen that are not updated, and other artifacts that are not fixed by terminating Xorg and starting it again. A reboot is required to fix it temporarily.
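
For reference, one way to select the modesetting DDX is a small Xorg configuration fragment. The sketch below writes one. The function name, the file name `20-modesetting.conf`, the directory and the identifier `Card0` are all assumptions and not taken from the actual setup described here.

```shell
# Write a minimal Device section that selects the modesetting driver.
# The default path and the "Card0" identifier are assumptions.
write_modesetting_conf() {
    conf_file="${1:-/usr/local/etc/X11/xorg.conf.d/20-modesetting.conf}"
    cat > "$conf_file" <<'EOF'
Section "Device"
    Identifier "Card0"
    Driver     "modesetting"
EndSection
EOF
}
```

Writing to the default location needs root. Xorg has to be restarted for the change to take effect.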

I am using the latest release, as mentioned above:
$ uname -v
FreeBSD 12.2-RELEASE-p1 GENERIC


I found similar open bug reports from some time ago: PR 194766 and PR 227870.

Attached is the output of "hw.dri.0.info.i915_error_state"; I don't know whether it is of any help.
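
A dump like the attached one can be captured with sysctl(8), using the node named in the kernel message. This is a minimal sketch; the function name `save_error_state` and the default output file name are hypothetical.

```shell
# Save the i915 error state to a file so it can be attached to a report.
# The sysctl name comes from the kernel message; the output name is an assumption.
save_error_state() {
    out_file="${1:-i915_error_state.txt}"
    # -n prints only the value, without the "name:" prefix.
    sysctl -n hw.dri.0.info.i915_error_state > "$out_file"
}
```

Usage: `save_error_state hw.dri.0.info.i915_error_state.txt`, run after a hang has been reported.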


Edit: After reading bug report PR 194766 I changed "drm.i915.semaphores" to "1", and Xorg no longer segfaults at all. But now any graphical application that makes heavier use of graphics acceleration ends with a segmentation fault (for example, if I run a game it segfaults, but it no longer brings down the whole X server). Xorg still shows corruption sometimes. In addition, the kernel output shows that the same GPU hang continues to occur.
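
The post does not say how the value was changed. Assuming it is applied as a boot-time tunable in `/boot/loader.conf`, a sketch for setting it without leaving duplicate entries could look like this (the helper name `set_loader_tunable` is hypothetical):

```shell
# Set name="value" in a loader.conf-style file, replacing any earlier
# assignment of the same name. Assumes the tunable is read at boot.
set_loader_tunable() {
    name="$1"
    value="$2"
    conf="${3:-/boot/loader.conf}"
    tmp_conf="$(mktemp)" || return 1
    if [ -f "$conf" ]; then
        # Keep every line that does not start with "name=".
        awk -v n="$name" 'index($0, n "=") != 1' "$conf" > "$tmp_conf"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp_conf"
    cat "$tmp_conf" > "$conf" && rm -f "$tmp_conf"
}
```

Usage: `set_loader_tunable drm.i915.semaphores 1`, as root, followed by a reboot.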
 

Attachments

  • hw.dri.0.info.i915_error_state.txt
    1.3 MB · Views: 61

agxak


I don't quite understand why the thread was moved from Base System/System Hardware to the X.Org forum, but I appreciate the change; it fits the thread better.

To summarize: the main problem is the failure of the i915kms module, in which the GPU is declared hung. The following problems are also present:
- Applications that require graphics acceleration ran slowly despite acceleration being enabled and working.
- Xorg ended with a segmentation fault (fixed after changing "drm.i915.semaphores" to "1"; less frequent after enabling the modesetting DDX).
- Corruption in Xorg (new; appeared after enabling the modesetting DDX).
- Segmentation faults in applications that use graphics acceleration (new; appeared after changing "drm.i915.semaphores" to "1").

I have read the wiki, and apparently support for my chipset will be removed in the next FreeBSD release. The modules should remain available in the graphics/drm-legacy-kmod port, but that port was removed on the first day of this year. So I took a snapshot and decided to try the graphics/drm-kmod port, and, as expected, the modules don't work for me: I can't start the GUI at all. So I reverted the changes. I can therefore only conclude that FreeBSD will not run properly on my hardware in the future, so I will proceed to uninstall it. I am leaving the thread open in case someone with the same problem can share more details, or someone can offer guidance or a solution.

For my part, that is all. Goodbye.
 

the3ajm

Member

Reaction score: 20
Messages: 85

Can you provide the wiki link? I'm running an older system with a GM965 chipset and I get the same error as you:

Code:
kernel: error: [drm:pid12:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jan 23 20:56:48 Dell kernel: info: [drm] capturing error event; look for more information in sysctl hw.dri.0.info.i915_error_state
Jan 23 20:56:48 Dell kernel: error: [drm:pid0:i915_reset] *ERROR* Failed to reset chip.
Jan 23 20:57:36 Dell devd[503]: notify_clients: send() failed; dropping unresponsive client
Jan 23 20:57:37 Dell syslogd: last message repeated 2 times
Jan 23 20:57:40 Dell kernel:
Jan 23 20:57:40 Dell kernel: error: [drm:pid0:assert_pll] *ERROR* PLL state assertion failure (expected on, current off)
Jan 23 20:57:40 Dell syslogd: last message repeated 1 times
Jan 23 20:57:40 Dell kernel: error: [drm:pid0:assert_pipe] *ERROR* pipe A assertion failure (expected on, current off)
Jan 23 20:57:41 Dell kernel: error: [drm:pid0:intel_enable_lvds] *ERROR* timed out waiting for panel to power on
Jan 23 20:57:41 Dell kernel: error: [drm:pid0:intel_modeset_check_state] *ERROR* encoder's hw state doesn't match sw tracking (expected 1, found 0)
Jan 23 20:57:41 Dell kernel: error: [drm:pid0:assert_pipe] *ERROR* pipe A assertion failure (expected on, current off)
Jan 23 20:57:42 Dell kernel: .
Jan 23 20:57:42 Dell devd[503]: notify_clients: send() failed; dropping unresponsive client
Jan 23 20:57:43 Dell kernel: .
Jan 23 20:57:44 Dell ntpd[802]: ntpd exiting on signal 15 (Terminated)
Jan 23 20:57:44 Dell kernel: .
Jan 23 20:57:45 Dell kernel: , 802.
Jan 23 20:57:48 Dell kernel: , 503.
Jan 23 20:57:48 Dell kernel: .

I know 12.2 has an issue with the i915 packages, but it's going to be fixed once 12.1 goes EOL.
 

Emrion

Aspiring Daemon

Reaction score: 235
Messages: 699

"I know 12.2 has an issue with the i915 packages, but it's going to be fixed once 12.1 goes EOL."
That's the point. There are several threads in this forum about this very problem. You can also install drm-kmod from ports, since that compiles the module (drm-fbsd12.0-kmod) against the installed kernel.
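
A minimal sketch of the ports route follows. It assumes a ports tree is checked out under `/usr/ports` and that it matches the installed release; the function name `build_drm_kmod` is hypothetical.

```shell
# Build and install graphics/drm-kmod from the ports tree, so that the
# module is compiled against the kernel sources present on this machine.
# The ports tree location is an assumption; pass another path if needed.
build_drm_kmod() {
    ports_dir="${1:-/usr/ports}"
    make -C "$ports_dir/graphics/drm-kmod" install clean
}
```

Run it as root. The port's install message describes how to load the resulting module.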
 

the3ajm


I'm using 12.1, and the error came up when my browser kept buffering videos and wouldn't play them. I installed another browser, and as soon as I opened YouTube my screen went dark and I had to reboot the machine; then I saw the messages. I'll wait until the end of the month to perform the update, then report back on whether I see similar issues in 12.2.
 

the3ajm


I've recently discovered that I can reproduce the error whenever I run cheese to test my webcam: as the application opens, my screen turns black and I have to restart the machine.
 

the3ajm


My computer shows high disk activity while Firefox plays videos; according to top, the high disk activity made the system slow, and then the screen suddenly went black. I will paste the logs here tomorrow.

Here it is:

Code:
Feb  6 23:18:41 Dell kernel: error: [drm:pid12:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Feb  6 23:18:41 Dell kernel: info: [drm] capturing error event; look for more information in sysctl hw.dri.0.info.i915_error_state
Feb  6 23:18:42 Dell kernel: error: [drm:pid0:i915_reset] *ERROR* Failed to reset chip.

An excerpt from Xorg:

Code:
[ 55601.590] (EE) intel(0): Detected a hung GPU, disabling acceleration.
[ 55601.603] (EE) intel(0): When reporting this, please include i915_error_state from debugfs and the full dmesg.
[ 55601.947] (WW) intel(0): Page flip failed: Invalid argument
[ 55601.947] (EE) intel(0): present flip failed
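
The `intel(0)` prefix in this excerpt suggests the intel DDX is in use here rather than modesetting. A sketch for checking that from the Xorg log is shown below. The function name `detect_ddx` is hypothetical, the default log path is an assumption, and it only looks at screen 0.

```shell
# Guess the active DDX from the driver prefix Xorg writes in its log.
# The default log path is an assumption; only screen 0 is checked.
detect_ddx() {
    xorg_log="${1:-/var/log/Xorg.0.log}"
    if grep -q 'intel(0):' "$xorg_log"; then
        echo intel
    elif grep -q 'modeset(0):' "$xorg_log"; then
        echo modesetting
    else
        echo unknown
    fi
}
```

Usage: `detect_ddx` or `detect_ddx /var/log/Xorg.0.log.old`.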
 