Xorg high disk I/O activity

I made a big jump from 11.1-RELEASE to 12.2-RELEASE (a clean install, not an upgrade). I noticed high disk I/O activity with Xorg + x11-wm/e16. The problem is reproducible in VirtualBox on various PCs and host OSes with a default installation of 12.2-RELEASE, Auto ZFS or Auto UFS, and 1 GiB RAM for the guest. The packages installed after installing the OS were # pkg install Xorg e16 virtualbox-ose-additions. After starting Xorg with E16, continuous I/O activity is visible on top's I/O page.
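For reference, the output below is top's I/O view, which you can get with something like this (pressing m inside top toggles the same display):
Code:
$ top -m io -o total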
Code:
last pid:   826;  load averages:  0.89,  0.68,  0.30    up 0+00:02:04  15:37:07
27 processes:  1 running, 26 sleeping
CPU:  4.6% user,  0.0% nice,  6.2% system,  1.5% interrupt, 87.7% idle
Mem: 52M Active, 82M Inact, 155M Wired, 667M Free
ARC: 96M Total, 16M MFU, 78M MRU, 65K Anon, 604K Header, 2069K Other
     58M Compressed, 136M Uncompressed, 2.33:1 Ratio
Swap: 2048M Total, 2048M Free

  PID USERNAME     VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
  808 u            391    144      6     40     20     66  76.74% Xorg
  811 u             80    204      0      0     20     20  23.26% e16
  818 u              2      1      0      0      0      0   0.00% top

I also tried x11-wm/icewm. It's not as visible as with E16, but the problem still exists. Any movement, iconifying, shading, etc. causes abnormal HDD write activity. x11-wm/fluxbox seems to be free of this problem. I also tried to reproduce it in 11.4-RELEASE and in a fully updated 12.2-RELEASE-p4. The same problems appeared.

I've never seen any similar problem in previous releases of FreeBSD.

Does anyone have any idea what the problem might be?
 
It looks like it's swapping, but in reality the swap is untouched and there is enough free memory. I have a lot more memory in my real computer and the problem is the same.

When I deactivate the swap, the problem is exactly the same.
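For the record, I disabled it like this and verified with swapinfo(8) that nothing was paged out:
Code:
# swapoff -a
# swapinfo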
 
There's little to no room for filesystem and process caches. So it has to constantly load everything from disk instead of getting it from the cache. That will certainly increase the disk I/O. Just give it more memory.
 
I changed the filesystem to UFS and set 2 GiB RAM (I can't give more). The problem still persists.

Code:
last pid:   811;  load averages:  0.59,  0.41,  0.17    up 0+00:02:15  15:57:54
24 processes:  2 running, 22 sleeping
CPU:  3.5% user,  0.0% nice,  8.6% system,  0.0% interrupt, 87.9% idle
Mem: 37M Active, 4864K Inact, 110M Wired, 75M Buf, 1733M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME     VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
  786 u            221    195      0    123      0    123 100.00% Xorg
  789 u            180    118      0      0      0      0   0.00% e16
  807 u              3      3      0      0      0      0   0.00% xterm
  810 u              2      0      0      0      0      0   0.00% top

Is 2 GiB still low memory? How much memory should I give it, when 12.2-RELEASE is a default installation that only runs Xorg with E16? I have 4 GiB RAM in my real PC and the behaviour is the same.

So it has to constantly load everything from disk instead of getting it from the cache.

You talked about loading from disk. Perhaps I don't understand the top information correctly, but I can only see writes in top, and that confuses me. I tried sysutils/gkrellm2 for convenience, and on its HDD monitor only writes are visible. A little bit of reading appears only when I start a new instance of xterm.
 
I don’t think that the system is paging or otherwise low on RAM. According to the top(1) output, 80 % of memory is free.
But I have to admit that I really don’t know what is causing the disk I/O.
You could try to use procstat(1) to list Xorg’s file descriptors, and check which of them change during the course of a few seconds.
(The following assumes that your Xorg process has PID 786. You have to do this with root privileges.)
Code:
# cd /tmp
# procstat -f 786 > foo1
# sleep 5
# procstat -f 786 > foo2
# diff foo1 foo2
 
I tried it. Here is the diff output of procstat.

Code:
    PID COMM                FD T V FLAGS    REF  OFFSET PRO NAME       

29c29
<   815 Xorg                23 v c rw---n--   2   20928 -   /dev/input/event2
---
>   815 Xorg                23 v c rw---n--   2   20976 -   /dev/input/event2
 
That output is normal. It doesn’t indicate any irregular file access activity.

Let’s try another thing. Please type iostat 5 and let it run for, say, one minute. While doing that, do not touch the machine, and try to make sure that no other programs are running that cause disk activity, e.g. do not run a web browser or something like that. If possible, the shell terminal should be the only window. Alternatively, do not open any window at all (so basically your window manager is the only X11 program running), and instead switch back to the text console and run the iostat command there.
 
I did two tests. In the first test, I started the X session and then switched to the text console and ran iostat 5. After switching to the text console, the HDD activity stopped immediately.

Code:
$ iostat 5
       tty            ada0              cd0            pass0             cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   1   167 16.63 116  1.88   0.00   0  0.00   0.00   0  0.00   8  0 15  1 76
   0    46  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    15  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  1 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  1 99
   0    16  8.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
   0    16  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  1  1 99
^C   1    40  0.00   0  0.00   0.00   0  0.00   0.00   0  0.00   0  0  0  1 99

In the second test, I ran iostat 5 from the open Xterm. No other windows or programs were running. The HDD activity was visible.

Code:
$ iostat 5
       tty            ada0              cd0            pass0             cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   1   196 16.14 178  2.80   0.00   0  0.00   0.00   0  0.00  11  0 21  1 66
   0    47  4.96 139  0.68   0.00   0  0.00   0.00   0  0.00   3  0  8  0 88
   0    16  4.80  88  0.41   0.00   0  0.00   0.00   0  0.00   3  0  6  0 92
   0    16  4.41 136  0.58   0.00   0  0.00   0.00   0  0.00   2  0  8  0 90
   0    16  3.80 111  0.41   0.00   0  0.00   0.00   0  0.00   2  0  6  0 91
   0    16  4.37 137  0.58   0.00   0  0.00   0.00   0  0.00   3  0  8  0 90
   0    16  4.32  98  0.41   0.00   0  0.00   0.00   0  0.00   2  0  9  0 89
   0    16  4.35 139  0.59   0.00   0  0.00   0.00   0  0.00   3  0  9  0 88
   0    16  4.44  98  0.43   0.00   0  0.00   0.00   0  0.00   3  0  8  1 89
   0    16  4.56 135  0.60   0.00   0  0.00   0.00   0  0.00   3  0  9  0 87
   0    16  4.05 107  0.42   0.00   0  0.00   0.00   0  0.00   2  0  8  0 89
   0    16  4.09 154  0.62   0.00   0  0.00   0.00   0  0.00   3  0  8  0 89
   0    16  4.10 107  0.43   0.00   0  0.00   0.00   0  0.00   3  0  8  0 89
   0    16  4.29 139  0.58   0.00   0  0.00   0.00   0  0.00   4  0  9  1 86
^C   9    44  4.45 122  0.53   0.00   0  0.00   0.00   0  0.00   2  0  9  1 88
 
Ok ... So, to sum up, there is a constant write activity of about 500 Kbyte/s (average) to your HDD (ada0) as long as X.org is running on your screen. The target of the write activity is not your swap partition (because your swap usage is 0 %, and also you mentioned that the problem persists when you disable swap). So, the target must be some file (or several files) inside a file system on your HDD.

Unfortunately, there is no easy way to see which files a process is actually writing to. You can use procstat -f <PID> or fstat -p <PID> or lsof -p <PID> (the last one is from sysutils/lsof) to get a list of open file descriptors of process ID <PID>. Each of these tools produces slightly different output, but I’m not sure if it is helpful in this case. It should also be noted that the output from those tools need not be complete or accurate, e.g. if a process accesses a file that doesn’t have a directory entry (i.e. a file without a name), then this information cannot be displayed, obviously. Also, if the process closes its file handle after each write access and opens it again for the next one, then the descriptor is only visible for tiny fractions of time, and chances are that the mentioned tools will miss it.
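If you want to try to catch such short-lived descriptors anyway, a crude approach is to sample procstat in a tight loop and keep only the regular-file vnodes – just a rough sketch (assuming the Xorg PID is still 786), and it can still miss a descriptor that only exists for a few milliseconds:
Code:
# while :; do procstat -f 786 | awk '$5 == "r"'; sleep 0.2; done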

I just had a look at my own Xorg process – as far as I can see, the only plain file it has opened with write access is /var/log/Xorg.0.log. Maybe your Xorg process is writing heaps of stuff to that log file? It wouldn’t hurt to have a look, but I would be surprised if that was actually the case. Also, your Xorg drivers and configuration are probably different from mine, so this can make a difference in what files are used (e.g. the Nvidia driver that I use opens quite a bunch of files, although it doesn’t appear to write to any of them).
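If you want to check the log file theory, it's enough to watch whether the file keeps growing, e.g.:
Code:
# ls -l /var/log/Xorg.0.log
# tail -f /var/log/Xorg.0.log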

Just out of curiosity, what filesystems are there? (Output from df.)
Is /tmp a separate filesystem on your HDD, or is it a tmpfs(5), or is it part of the root file system? I’m asking because Xorg creates some stuff in /tmp.

Another thing that crossed my mind: You could try watching the output from df over the course of several seconds. Does the amount of free space shrink? If it does not, then the write activity either overwrites the same file again and again (so no new space needs to be allocated), or it creates new data and at the same time removes other data that has the same size. Not sure if this will help track down the problem, but at least it might serve as another piece of the puzzle.
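A simple loop like this would do (adjust the interval to taste):
Code:
$ while :; do df -k; sleep 2; done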
 
You can use truss or DTrace to see the actual file which is accessed. There was a very good script for DTrace in the forum.
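For example, you could attach truss(1) to the running Xorg process, or trace the open/openat syscalls with a DTrace one-liner – a rough sketch, assuming the Xorg PID is 786 again; you may need kldload dtraceall first:
Code:
# truss -o /tmp/xorg.truss -p 786
# dtrace -n '
    syscall::open:entry   /execname == "Xorg"/ { printf("%s", copyinstr(arg0)); }
    syscall::openat:entry /execname == "Xorg"/ { printf("%s", copyinstr(arg1)); }'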

 
You may want to add message: "query '[%C/%n] %M'" to the ALIAS section of /usr/local/etc/pkg.conf, then read through pkg message | less and apply the requested settings. Note: pkg info -Da is equivalent.
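That is, something like this in /usr/local/etc/pkg.conf:
Code:
ALIAS : {
    message : "query '[%C/%n] %M'",
}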
I read all the messages of all packages, but I didn't find anything important. It is an almost completely default FreeBSD installation inside VirtualBox, with just the Xorg and x11-wm/e16 packages installed.

I just had a look at my own Xorg process – as far as I can see, the only plain file it has opened with write access is /var/log/Xorg.0.log. Maybe your Xorg process is writing heaps of stuff to that log file?
There is no unusual write access to /var/log/Xorg.0.log.

Also, your Xorg drivers and configuration are probably different from mine, so this can make a difference in what files are used (e.g. the Nvidia driver that I use opens quite a bunch of files, although it doesn’t appear to write to any of them).
The HDD write activity is also observed in VirtualBox, so I don't think it has anything to do with the real graphics card. I have no xorg.conf file in VirtualBox.

Just out of curiosity, what filesystems are there? (Output from df.)
This is the default layout automatically created by bsdinstall with ZFS. I also tried UFS, with the same issues.

Output of df:
Code:
Filesystem         1K-blocks      Used     Avail Capacity  Mounted on
zroot/ROOT/default  17773308   1770448  16002860    10%    /
devfs                      1         1         0   100%    /dev
zroot/tmp           16002956        96  16002860     0%    /tmp
zroot/var/log       16003044       184  16002860     0%    /var/log
zroot               16002956        96  16002860     0%    /zroot
zroot/usr/ports     16002956        96  16002860     0%    /usr/ports
zroot/usr/src       16002956        96  16002860     0%    /usr/src
zroot/var/audit     16002956        96  16002860     0%    /var/audit
zroot/var/crash     16002956        96  16002860     0%    /var/crash
zroot/var/mail      16002980       120  16002860     0%    /var/mail
zroot/var/tmp       16002956        96  16002860     0%    /var/tmp
zroot/usr/home      16003128       268  16002860     0%    /usr/home
vbox               487791636 155733752 332057884    32%    /mnt

Output of zfs list:
Code:
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zroot               1.69G  15.3G    96K  /zroot
zroot/ROOT          1.69G  15.3G    96K  none
zroot/ROOT/default  1.69G  15.3G  1.69G  /
zroot/tmp             96K  15.3G    96K  /tmp
zroot/usr            556K  15.3G    96K  /usr
zroot/usr/home       268K  15.3G   268K  /usr/home
zroot/usr/ports       96K  15.3G    96K  /usr/ports
zroot/usr/src         96K  15.3G    96K  /usr/src
zroot/var            688K  15.3G    96K  /var
zroot/var/audit       96K  15.3G    96K  /var/audit
zroot/var/crash       96K  15.3G    96K  /var/crash
zroot/var/log        184K  15.3G   184K  /var/log
zroot/var/mail       120K  15.3G   120K  /var/mail
zroot/var/tmp         96K  15.3G    96K  /var/tmp

Another thing that crossed my mind: You could try watching the output from df over the course of several seconds. Does the amount of free space shrink? If it does not, then the write activity either overwrites the same file again and again (so no new space needs to be allocated), or it creates new data and at the same time removes other data that has the same size. Not sure if this will help track down the problem, but at least it might serve as another piece of the puzzle.
Using df was a good idea. I looped zfs list -o name,referenced every two seconds. The changes appear in the zroot/var/tmp file system during the HDD activity: its size varied slightly between 104K and 108K, but there was no named file in the directory /var/tmp. When the HDD activity stopped, the size returned to 96K.
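The loop was simply:
Code:
$ while :; do zfs list -o name,referenced; sleep 2; done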

When I was playing with # lsof -p 809 | grep var/tmp, I noticed this output:
Code:
Xorg    809 root   33u    VREG 3546908663,1252974314      192 21209 /var/tmp (zroot/var/tmp)
But it did not appear again.

Oh, one important thing: Please provide the output from ls -l /dev/null /dev/zero.
Output of ls:
Code:
# ls -l /dev/null /dev/zero
crw-rw-rw-  1 root  wheel  0x12 Mar 25 17:13 /dev/null
crw-rw-rw-  1 root  wheel  0x13 Mar 25 17:13 /dev/zero

You can use truss or DTrace to see the actual file which is accessed. There was a very good script for DTrace in the forum.
I'll try to use DTrace, but it will take me a while to learn it. :)
 
Code:
SIGNAL 14 (SIGALRM) code=SI_KERNEL
sigprocmask(SIG_SETMASK,{ SIGALRM },0x0) = 0 (0x0)
sigreturn(0x7fffffffe040) EJUSTRETURN
poll({ 4/POLLIN 5/POLLIN 6/POLLIN 15/POLLIN 25/POLLIN 26/POLLIN },6,61334) = 1 (0x1)
setitimer(0,{ 0.005000, 0.005000 },0x0) = 0 (0x0)
recvmsg(26,{NULL,0,[{"5\^X\^D\0\b\^Q`\0\^O\^E\0\0\^A\0"...,16384}],1,{},0,0},0) = 1416 (0x588)
writev(26,[{"\^A\^A\M-,2\0\0\0\0\f\0@\0\0\0\0"...,32}],1) = 32 (0x20)
recvmsg(26,{NULL,0,[{"\^C\^X\^B\0\b\^Q`\0\^N\^E\^B\0\b"...,16384}],1,{},0,0},0) = 16 (0x10)
writev(26,[{"\0\^C\M--2\b\^Q`\0\0\0\^C\0\0\0"...,32}],1) = 32 (0x20)
poll({ 4/POLLIN 5/POLLIN 6/POLLIN 15/POLLIN 25/POLLIN 26/POLLIN },6,0) = 0 (0x0)
writev(26,[{"\^A\^X\M-.2\0\0\0\0\^O\^E\0\0\0"...,32}],1) = 32 (0x20)
recvmsg(26,{NULL,0,[{"+\^X\^A\0\b\^Q`\0\^N\^E\^B\0\b"...,16384}],1,{},0,0},0) = 4 (0x4)
writev(26,[{"\^A\^A\M-/2\0\0\0\0\f\0@\0\0\0\0"...,32}],1) = 32 (0x20)
recvmsg(26,{NULL,0,[{"\^N\^X\^B\0\b\^Q`\0\^N\^E\^B\0\b"...,16384}],1,{},0,0},0) = 8 (0x8)
writev(26,[{"\^A\^X\M-02\0\0\0\0\^O\^E\0\0\0"...,32}],1) = 32 (0x20)
recvmsg(26,{NULL,0,[{"\M^B\a\^D\0\r\^Q`\0\M-@\0\0\0\0"...,16384}],1,{},0,0},0) = 16 (0x10)
fstatat(AT_FDCWD,"/run/shm",0x7fffffffe510,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/var/tmp",{ mode=drwxrwxrwt ,inode=4494347,size=512,blksize=32768 },0x0) = 0 (0x0)
open("/var/tmp/shmfd-9KFsXq",O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC,0600) = 27 (0x1b)
unlink("/var/tmp/shmfd-9KFsXq") = 0 (0x0)

There's some issue with libxshmfence-1.3. I will try to build it from ports and update the system to see if it behaves the same.
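For the rebuild, something like this should do (assuming a checked-out ports tree):
Code:
# cd /usr/ports/x11/libxshmfence
# make install clean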

Here is what I did to reproduce this:

Clean install on Hyper-V gen1, FreeBSD 12.2-RELEASE r366954 GENERIC without any updates:
Code:
pkg install xorg e16
sysctl kern.evdev.rcpt_mask=6
echo "kern.evdev.rcpt_mask=6" >> /etc/sysctl.conf
xinit
e16



Code:
last pid:  1608;  load averages:  0.69,  0.42,  0.18    up 0+00:02:25  21:18:30
31 processes:  1 running, 30 sleeping
CPU:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 40M Active, 27M Inact, 230M Wired, 101M Buf, 662M Free
Swap: 2560M Total, 2560M Free

  PID USERNAME     VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
 1600 root          186    169      0    126      0    126 100.00% Xorg
 1607 root          189    103      0      0      0      0   0.00% e16
 
Very interesting … I haven’t looked closely at this issue, but I’m wondering, what is causing it to use a physical file instead of real shared memory (via SysV IPC or via mmap(2))? This shouldn’t happen.
 
Found a workaround:

From the settings menu, disable the pager display. This stops the activity on the /var/tmp/shmfd* files and reduces the I/O caused by the Pager scanning.

Very interesting … I haven’t looked closely at this issue, but I’m wondering, what is causing it to use a physical file instead of real shared memory (via SysV IPC or via mmap(2))? This shouldn’t happen.
Sorry, I don't know, my knowledge is limited :)

Edit:
Setting Pager Mode to Simple also reduces the polling.
 
I just want to make sure: so the problem lies in Xorg, and E16 is not responsible for it. Is that so?
 
E16 is using libxshmfence:

This library offers a CPU-based synchronization primitive compatible
with the X SyncFence objects that can be shared between processes
using file descriptor passing.

Output of truss(1):
Code:
stat("/var/tmp",{ mode=drwxrwxrwt ,inode=3049729,size=512,blksize=32768 }) = 0 (0x0)
open("/var/tmp/shmfd-sfVKVJ",O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC,0600) = 20 (0x14)
unlink("/var/tmp/shmfd-sfVKVJ") = 0 (0x0)
ftruncate(20,0xc0) = 0 (0x0)
 
From my email exchange with E16 developer Kim Woelders, I received this reply:

I don't think e16 uses libxshmfence but Xorg does.

However, e16 (via imlib2) does a lot of SHM stuff, particularly in the pager scan, which may cause Xorg to use libxshmfence.

Imlib2 uses SHM (when possible) to exchange pixmaps with the server. This can be done in different modes, either using "fd passing" (introduced in imlib2-1.5.0) or not.

You can control which SHM mode imlib2 uses via the IMLIB2_SHM_OPT environment variable (assuming the options are available):
0: SHM off
1: Regular SHM (disable fd passing)
2: SHM with fd passing (default)

So you could try launching e16 with IMLIB2_SHM_OPT=1 (or 0) to see if that makes a difference.

Happy news for all E16 and imlib2 users: in my experience, changing the SHM mode helps. The HDD writing disappears after setting IMLIB2_SHM_OPT to 1. I also tried SHM mode 0; I didn't see any difference between modes 0 and 1.
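For anyone who wants to try it: I now start e16 from my ~/.xinitrc like this (adjust to however you start your session):
Code:
export IMLIB2_SHM_OPT=1
exec e16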
 