Xfce Sudden unresponsiveness in GUI

Hello,
I'm facing a very odd problem on my FreeBSD 14 desktop running Xfce4: at random, the desktop becomes unresponsive for up to a few minutes. Often the mouse cursor stays fully responsive but nothing else moves; sometimes even the cursor is stuck.
It does not seem to be heat-related, even though the CPU, GPU and storage temperatures rise a lot: I used a browser benchmark to load the GUI and got record-breaking temperatures, yet the GUI remained fully responsive.
I'm a bit lost, I don't know where to look next.

The problem occurs randomly, and a few days can pass between two occurrences.

What system metrics should I track? What tests could I try, given that I can't switch OS, or shut down or reboot too often, since this FreeBSD PC is also my internet gateway?
 
Is a drm-kmod video driver loaded?

I experience similar symptoms with the "amdgpu" kernel driver module (graphics/drm-61-kmod, AMD 5700U "Lucienne"). Over time, content drawing in Xorg slows down, and sometimes it freezes for minutes before input is accepted. Only a system reboot (which reloads the video driver) restores a snappy Xorg response.

To test if the video driver is responsible for the issue, try disabling video acceleration [1], or use a framebuffer video driver [2].

[1]
To see which driver is loaded, look in /var/log/Xorg.0.log. In most cases it's modesetting(4), if no other suitable xf86-video-* driver is installed.

Example:
Code:
[    80.732] (II) AMDGPU: Driver for AMD Radeon:
        All GPUs supported by the amdgpu kernel driver
[    80.732] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
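
If you don't want to scroll through the whole log, a small shell helper can pull out just those lines. This is only a sketch: the function name xorg_driver_lines is made up for illustration, and the default path assumes the usual /var/log/Xorg.0.log location.

```shell
# Hypothetical helper (the name is an assumption, not part of any package):
# print the Xorg log lines that announce a driver or mention glamor.
xorg_driver_lines() {
    # Default to the usual log location if no file is given.
    log="${1:-/var/log/Xorg.0.log}"
    grep -E 'Driver for|glamor' "$log"
}
```

Calling xorg_driver_lines with no argument reads the default log; pass another file (for example an older saved log) as the first argument.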

Create /usr/local/etc/X11/xorg.conf.d/modesetting.conf
Code:
Section "Device"
    Identifier    "Modesetting"
    Driver        "modesetting"
    Option        "AccelMethod"    "none"
EndSection
This will disable "glamor" X acceleration. You may experience some distorted window content when resizing program windows.

[2]
Disable the drm-kmod driver in /etc/rc.conf, install x11-drivers/xf86-video-scfb, and create a driver config file (/usr/local/etc/X11/xorg.conf.d/scfb.conf, see scfb(4)).
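
As a rough sketch of what that could look like: the helper below writes a minimal scfb device section into a given xorg.conf.d directory. The function name write_scfb_conf and the "Card0" identifier are illustrative choices, and the commented commands assume amdgpu was loaded through kld_list in /etc/rc.conf, which may not match your setup.

```shell
# Assumed preparation steps (run as root, adjust to your system):
#   sysrc kld_list-="amdgpu"      # only if amdgpu was added via kld_list
#   pkg install xf86-video-scfb

# Hypothetical helper: write a minimal scfb config into the given directory.
write_scfb_conf() {
    confdir="${1:-/usr/local/etc/X11/xorg.conf.d}"
    cat > "$confdir/scfb.conf" <<'EOF'
Section "Device"
    Identifier    "Card0"
    Driver        "scfb"
EndSection
EOF
}
```

Run write_scfb_conf as root with no argument to use the default directory, then reboot so the drm-kmod driver is no longer loaded.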
 
This might be a very good hypothesis. I do use amdgpu:

Code:
$ kldstat |grep amdg
11    1 0xffffffff83600000   6688e8 amdgpu.ko
18    1 0xffffffff83d0a000    48a60 amdgpu_gc_11_0_1_mes_2_bin.ko
19    1 0xffffffff83d53000    3a890 amdgpu_gc_11_0_1_mes1_bin.ko
20    1 0xffffffff83d8e000     2ae0 amdgpu_psp_13_0_4_toc_bin.ko
21    1 0xffffffff83d91000    446a0 amdgpu_psp_13_0_4_ta_bin.ko
22    1 0xffffffff83dd6000    4ba00 amdgpu_dcn_3_1_4_dmcub_bin.ko
23    1 0xffffffff83e22000    225e0 amdgpu_gc_11_0_1_imu_bin.ko
24    1 0xffffffff83e45000    425e0 amdgpu_gc_11_0_1_pfp_bin.ko
25    1 0xffffffff83e88000    425e0 amdgpu_gc_11_0_1_me_bin.ko
26    1 0xffffffff83ecb000    29430 amdgpu_gc_11_0_1_rlc_bin.ko
27    1 0xffffffff83ef5000    43a80 amdgpu_gc_11_0_1_mec_bin.ko
28    1 0xffffffff83f39000     a7e0 amdgpu_sdma_6_0_1_bin.ko
29    1 0xffffffff83f44000    5bb00 amdgpu_vcn_4_0_2_bin.ko

Xorg.0.log:
[1111264.876] (II) Loading sub module "glamoregl"
[1111264.876] (II) LoadModule: "glamoregl"
[1111264.876] (II) Loading /usr/local/lib/xorg/modules/libglamoregl.so
[1111264.886] (II) Module glamoregl: vendor="X.Org Foundation"
[1111264.886]     compiled for 1.21.1.16, module version = 1.0.1
[1111264.886]     ABI class: X.Org ANSI C Emulation, version 0.4
[1111265.068] (II) modeset(0): glamor X acceleration enabled on AMD Radeon 780M (radeonsi, gfx1103_r1, LLVM 19.1.7, DRM 3.49, 14.2-RELEASE-p1)
[1111265.068] (II) modeset(0): glamor initialized

I’ll wait for my FreeBSD 14.3 update just to check whether the situation is better. If it’s not, I’ll then disable glamor X acceleration.
 
will become unresponsive for up to few minutes
I experience similar symptoms with a "amdgpu"
This might be related to the commit "LinuxKPI: make linux_alloc_pages() honor __GFP_NORETRY", mentioned in the FreeBSD 14.3-RELEASE Release Notes under General Kernel Changes:
LinuxKPI: linux_alloc_pages() now honors __GFP_NORETRY. This is to fix slowdowns with drm-kmod that get worse over time as physical memory become more fragmented (and probably also depending on other factors). 831e6fb0baf6 (Sponsored by The FreeBSD Foundation).
 
It looks really promising! The description of the bug matches perfectly the behavior of my system.
I can't wait to upgrade 😅
 
I know of this linuxkpi issue ( https://forums.freebsd.org/threads/...7735hs-with-radeon-graphics.97700/post-699846).

I recommended that the user experiencing the issue try stable/14 (from which releng/14.3, alias 14.3-RELEASE, was branched). At the time, the linuxkpi fix had already been merged into stable/14, but the issue persisted:

https://forums.freebsd.org/threads/...7735hs-with-radeon-graphics.97700/post-700969

Disabling glamor X acceleration helped (it masks the symptoms).

It looks really promising! The description of the bug matches perfectly the behavior of my system.
I can't wait to upgrade 😅
Since there was no improvement on stable/14, don't be surprised if 14.3 doesn't improve things either.
 
I almost thought that the problem was solved after a few hours with 14.3 running. Usually, on my system, with many applications running simultaneously, it takes up to an hour or two until Xorg window drawing slows down (VirtualBox in particular makes the slowdown come sooner). At that point, the system needs a reboot to be usable again.

Running 14.3 with a lot of applications open (including VirtualBox), xorg window drawing didn't slow down at all. This gave me the impression it's all good now.

I thought wrong: Xorg froze without warning, with no user input possible and the mouse cursor frozen too. The system had to be shut down with the power button; it powered off gracefully, running the normal system shutdown procedure.

At least suspend / resume (by laptop lid close / open) works again. This was broken on 14.2.

YMMV
 
So far the highest load I've tested is a strategy game I played for 2 hours. The PC has 32 GB of RAM, but it runs many other things and swaps a little (a few hundred KB). It was a flawless experience.
I don't use the GUI that much, only once or twice a week, and it seems to me that these unresponsive episodes are random: it's like a startxfce roulette, either my fresh session will be super snappy or super sluggish, with short or long unresponsive periods.
I'll upgrade to 14.3 later and hopefully it'll be enough...
 
Did you happen to be running Firefox when the hangs occurred? I've noticed an intermittent bug where Firefox suddenly decides to chew up 100% CPU and everything else becomes unresponsive. I haven't been able to isolate it. However, it only ever happens when I'm doing something with Firefox... when it happens the fan comes on full and the case gets hot, so the CPU is working very hard.

Most times when this happens I can't move the mouse pointer and am unable to ssh in from another box, ssh can't get a connection. Can't switch to another VT, it's unresponsive. Usually I just power cycle it. It's happened a handful of times over the last few weeks.

Thinking about it more... it's usually when I'm looking at youtube, or weirdly, aliexpress.

Of course it could be my hardware...
 
I've had a complete lock-up out of nowhere with Xfce 4.20 and Firefox open, on 14.2 or 14.3-B3, with Intel UHD 630 when on the modesetting DDX. I normally use the Intel DDX and have never had a complete lock-up with it.

I'm thinking AMDGPU might also default to modesetting rather than the amdgpu DDX; I'd try switching to the amdgpu DDX.

Note that Firefox falls back to software rendering with the Intel DDX, which is on its blacklist (I'm not sure about the AMDGPU DDX; with modesetting it defaults to hardware acceleration just fine). I'd check about:support; if needed, hardware rendering can be forced back on with gfx.webrender.all.
 
I'm using Firefox, but it does not use all the CPU. I can see CPU spikes for Xorg, up to 100%, but they do not last more than 1 or 2 seconds and cannot explain minutes-long freezes of the GUI.
 
Any clues in the logs?
Only two copies of the Xorg.0 log have been saved since the 14.3 upgrade. Both show "Server terminated successfully", which suggests a normal Xorg exit rather than a log from a session where the server froze.

Currently I'm testing Firefox without graphics hardware acceleration [1] (Settings -> General -> Performance -> uncheck "Use hardware acceleration when available").

Firefox was running when Xorg froze, twice now since the 14.3 upgrade. I'll report back when testing has finished.


[1]
Use hardware acceleration when available: This setting allows Firefox to use your computer's graphics processor, if possible, instead of the main processor, to display graphics-heavy web content such as videos or games. This frees up resources on your computer so it can run other applications, like Firefox, faster. This box is checked by default but the feature isn't available for all graphics processors. You must restart Firefox after changing this setting, before it will take effect.
 
Have you (re)tried several times to ssh in?
Have you tried ping-ing the busy system?
Yes. The machine's NIC responds to ping, but the ssh client simply hangs forever, unable to establish the connection. I tried it more than once. The machine itself is completely unresponsive to keypresses and mouse. The fan is whirring away like crazy trying to cool it down, and the case gets very warm. I suspect a hard loop somewhere inside Firefox. I've only ever seen this bug when running FF, never when FF is not running. And it's usually when I'm looking at video, e.g. YouTube, or when some kind of animated banner is playing on the screen, e.g. in an ad. I might try nice'ing the firefox process at startup from now on.
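
For the nice idea, a tiny wrapper would do. This is a sketch only: run_niced and the NICE_LEVEL variable are names invented here, and 10 is an arbitrary default. It only lowers CPU scheduling priority, so it may not help if the stall is actually in the graphics driver.

```shell
# Hypothetical wrapper: run any command at reduced CPU priority.
# NICE_LEVEL is an assumed variable name; 10 is an arbitrary default.
run_niced() {
    nice -n "${NICE_LEVEL:-10}" "$@"
}
```

Starting the browser would then be something like: run_niced firefox &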

I thought Firefox might have been dumping core, but the disk activity LED is off and I can't find any core files. It looks like a tight loop in software. Pretty unusual to lock the whole system up like that. It's pretty infrequent... it's happened perhaps 4 times over the last couple of months. Enough to be annoying. I've updated FF to the latest version (pkg upgrade etc.) but still see the bug.

It's a ThinkPad X201 running 14.2-RELEASE, using Intel graphics with the modesetting driver and crocus acceleration.

Another thing that makes me think it's in Firefox is that I've seen a similar bug using Waterfox on Linux... Waterfox is derived from Firefox; they share a common codebase.

This may not be exactly the same bug as in the OP's report; he says his mouse stays responsive. I will keep trying to isolate it, but it's a bit difficult when the box just locks up hard.

Logs... yeah I should have a more detailed look at the kernel log next time it happens.
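
Something like the following could make that look quicker after the next reboot. It is a sketch under assumptions: gpu_kernel_lines is a made-up name, /var/log/messages is assumed to be where kernel messages end up, and the keyword list is a guess at what is relevant rather than an exhaustive set.

```shell
# Hypothetical helper: show the most recent graphics-related kernel messages.
# The pattern list is a guess at useful keywords, not an exhaustive set.
gpu_kernel_lines() {
    log="${1:-/var/log/messages}"
    # Second argument: how many matching lines to keep (default 50).
    grep -iE 'drm|amdgpu|i915' "$log" | tail -n "${2:-50}"
}
```

With no arguments it prints the last 50 matching lines from the default log; a second argument changes how many lines are shown.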

Unfortunately I have no test case that will recreate this bug. It's intermittent and seems to happen almost at random. The only commonality I've seen is that I'm usually using Firefox when it happens.
 
Having just written all that... it might of course be something like needing to replace the heatsink compound. It might be an intermittent hardware fault. It's quite an old machine.
 