3 cores pegged 24/7 by interrupts with no endpoints + MCE

cellinux · 2026-02-09T01:55:59+0000

NAS server based on Supermicro H13SSL-N w/ EPYC 9214. TrueNAS CORE Community 13.0-U6.8 (FreeBSD 13.1-RELEASE-p9)

A month ago we replaced the HBAs, the RAM and the OS; since then, 3 CPU cores have been pegged at 100% continuously. This persists across reboots.

Usually this would suggest an issue with a process, or with a device such as a NIC, HBA or USB controller. Here, however, all of the load is interrupt time, and the interrupts appear to come from PCIe root ports (pcib9, pcib10, pcib12) that have no downstream endpoints at all.

Code:

top -SH
last pid: 30669;  load averages:  3.33,  3.15,  3.12    up 1+12:17:11  16:52:25
2199 threads:  36 running, 2020 sleeping, 143 waiting
CPU:  0.0% user,  0.0% nice,  0.0% system,  9.4% interrupt, 90.6% idle
Mem: 128K Active, 1500M Inact, 179M Laundry, 236G Wired, 11G Free
ARC: 211G Total, 71G MFU, 140G MRU, 31K Anon, 53M Header, 61M Other
     205G Compressed, 252G Uncompressed, 1.23:1 Ratio
Swap: 10G Total, 10G Free
  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
   11 root        155 ki31     0B   512K CPU23   23  36.4H 100.00% idle{idle: cpu23}
   11 root        155 ki31     0B   512K CPU25   25  36.0H 100.00% idle{idle: cpu25}
   11 root        155 ki31     0B   512K CPU13   13  35.8H 100.00% idle{idle: cpu13}
   11 root        155 ki31     0B   512K CPU15   15  35.8H 100.00% idle{idle: cpu15}
   11 root        155 ki31     0B   512K CPU12   12  35.8H 100.00% idle{idle: cpu12}
   11 root        155 ki31     0B   512K RUN     14  35.8H 100.00% idle{idle: cpu14}
   11 root        155 ki31     0B   512K CPU10   10  35.7H 100.00% idle{idle: cpu10}
   11 root        155 ki31     0B   512K CPU7     7  35.9H  99.73% idle{idle: cpu7}
   12 root        -80    -     0B  2336K CPU18   18  36.3H  99.56% intr{irq156: pcib9}
   12 root        -80    -     0B  2336K CPU26   26  36.3H  98.92% intr{irq176: pcib12}
   12 root        -80    -     0B  2336K CPU20   20  36.3H  98.17% intr{irq157: pcib10}
   11 root        155 ki31     0B   512K CPU9     9  35.8H  94.85% idle{idle: cpu9}
...
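
To pull just the busy interrupt threads out of a listing like the one above, a small filter can help. This is a minimal sketch, assuming the `intr{irqNNN: name}` thread naming and a WCPU column ending in `%` as they appear in this output; `busy_intr` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# busy_intr [MIN_WCPU]
# Reads `top -SH` style output on stdin and prints "irq device wcpu" for
# every interrupt thread at or above MIN_WCPU percent (default 90).
# Hypothetical helper; the line format is assumed from the listing above.
busy_intr() {
    min="${1:-90}"
    awk -v min="$min" '
        /intr[{]irq[0-9]+:/ {
            wcpu = ""; irq = ""; dev = ""
            for (i = 1; i <= NF; i++) {
                # WCPU is the only field that looks like "99.56%"
                if ($i ~ /^[0-9.]+%$/) { wcpu = $i; sub(/%$/, "", wcpu) }
                # "intr{irq156:" is followed by the device name, e.g. "pcib9}"
                if ($i ~ /^intr[{]irq[0-9]+:$/) {
                    irq = $i
                    sub(/^intr[{]/, "", irq); sub(/:$/, "", irq)
                    dev = $(i + 1); sub(/[}]$/, "", dev)
                }
            }
            if (wcpu != "" && wcpu + 0 >= min + 0) print irq, dev, wcpu
        }'
}
```

Fed a saved copy of the listing (for example `busy_intr 90 < top.txt`, where `top.txt` is a hypothetical file name), it should print one line each for irq156 (pcib9), irq176 (pcib12) and irq157 (pcib10), and nothing for the idle threads.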

Code:

pciconf -lv | egrep -n 'pcib9@|pcib10@|pcib12@' -A6
112:pcib9@pci0:128:1:1:    class=0x060400 rev=0x01 hdr=0x01 vendor=0x1022 device=0x14ab subvendor=0x1022 subdevice=0x1453
113-    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
114-    class      = bridge
115-    subclass   = PCI-PCI
116:pcib10@pci0:128:1:2:    class=0x060400 rev=0x01 hdr=0x01 vendor=0x1022 device=0x14ab subvendor=0x1022 subdevice=0x1453
117-    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
118-    class      = bridge
119-    subclass   = PCI-PCI
120-pcib11@pci0:128:1:3:    class=0x060400 rev=0x01 hdr=0x01 vendor=0x1022 device=0x14ab subvendor=0x1022 subdevice=0x1453
121-    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
122-    class      = bridge
--
124:pcib12@pci0:128:1:4:    class=0x060400 rev=0x01 hdr=0x01 vendor=0x1022 device=0x14ab subvendor=0x1022 subdevice=0x1453
125-    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
126-    class      = bridge
127-    subclass   = PCI-PCI
128-hostb9@pci0:128:2:0:    class=0x060000 rev=0x01 hdr=0x00 vendor=0x1022 device=0x149f subvendor=0x0000 subdevice=0x0000
129-    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
130-    class      = bridge
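
One way to double-check that nothing sits behind these bridges is to look at what the device tree shows beneath each of them. A minimal sketch, assuming `devinfo` prints a space-indented tree with the device name as the first word on each line; `children_of` is a hypothetical helper, and devices without an attached driver may only show up in verbose output.

```shell
#!/bin/sh
# children_of NAME
# Reads an indented device tree on stdin and prints the name of every
# device nested below NAME. Hypothetical helper; it relies only on
# relative indentation, not on any particular indent width.
children_of() {
    awk -v want="$1" '
        NF == 0 { next }                          # skip blank lines
        {
            indent = match($0, /[^ ]/) - 1        # count of leading spaces
            name = $1
        }
        inside && indent <= base { inside = 0 }   # back at or above NAME: subtree ended
        inside { print name }
        !inside && name == want { inside = 1; base = indent }
    '
}
```

Used as `devinfo | children_of pcib9`, a bridge with nothing behind it would be expected to list only its own child PCI bus, while a populated one would also list the attached devices.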

MSI and MSI-X are enabled.

Also, as it happens, the server had an MCE yesterday. It was the first one, after weeks of testing plus 3 weeks in production.

Version String: FreeBSD 13.1-RELEASE-p9 n245433-9dc2dc9b081 TRUENAS
Panic String: Unrecoverable machine check exception
2026-02-07 04:28:33 Memory [MEM-0001] Uncorrectable ECC / other uncorrectable memory error @DIMMG2 - Assertion Sensor-specific
2026-02-07 04:28:33 ProcessorConfiguration [PC-0153] Configuration error - CPU 1 LS Uncorrectable error - Assertion Sensor-specific
2026-02-07 04:28:15 Watchdog [WDT-0131] Timer interrupt - interrupt type: none, timer use at expiration: SMS/OS - Assertion Sensor-specific
2026-02-07 04:24:49 ProcessorConfiguration [PC-0153] Configuration error - CPU 1 LS Uncorrectable error - Assertion

With all this in mind…

1. How likely is it that the CPU hardware is the cause of both issues?
2. Is the next step to temporarily disable MSI and MSI-X (sysctl hw.pci.enable_msix=0, sysctl hw.pci.enable_msi=0)? How disruptive is that to a production system? (Seconds of downtime? Risk of dropping storage/network mid-I/O?)
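
For the MSI/MSI-X experiment in question 2, the boot-time variant would be to set the same two names as loader tunables instead of flipping them at runtime. A minimal sketch, assuming both names are honored as loader tunables and using a scratch file in place of the real loader configuration (on an appliance OS that file may be managed by the UI); `set_tunable` is a hypothetical helper.

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE
# Replace (or add) a NAME="VALUE" line in a loader.conf-style file.
# Hypothetical helper; point FILE at a scratch copy first.
set_tunable() {
    file=$1; name=$2; value=$3
    [ -f "$file" ] || : > "$file"
    tmp="$file.tmp.$$"
    # Keep every line that does not already assign NAME. index() does a
    # literal prefix match, so the dots in NAME are not regex wildcards.
    awk -v key="$name=" 'index($0, key) != 1' "$file" > "$tmp"
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example (scratch file name is an assumption):
#   set_tunable ./loader.conf.test hw.pci.enable_msi 0
#   set_tunable ./loader.conf.test hw.pci.enable_msix 0
```

Whether a runtime change would affect devices that are already attached is not something this sketch answers; presumably the setting only matters when a driver allocates its interrupts, which is an assumption worth verifying before relying on either approach.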

My colleague, who's the more experienced Linux guy, is "not going to worry about the pegged threads now". That worries me, and so does the issue itself. Also, this is something I can look into now, while conjuring up a replacement CPU, RAM and motherboard will take days at best.

sko · 2026-02-09T09:34:26+0000

cellinux said:
TrueNAS CORE Community 13.0-U6.8 (FreeBSD 13.1-RELEASE-p9)

You did read the forum rules, didn't you?

Thread 'GhostBSD, pfSense, TrueNAS, and all other FreeBSD Derivatives'

Sep 25, 2009

Questions about 'derivative FreeBSDs', like

GhostBSD
TrueNAS
XigmaNAS
OPNsense
pfSense
BSD Router Project
NomadBSD
helloSystem
HardenedBSD

should be asked on the forums and/or mailing lists for these specific products. See below for links.

If you still think your questions should be asked here, beware of the following:

To show that you have indeed tried to get a solution from the forum or mailing list of the...

FreeBSD 13.1 has been EOL for over *two and a half years* now.

Thread 'Topics about unsupported FreeBSD versions'

Jun 24, 2013

The FreeBSD Forums cater primarily to end-users and systems administrators. As such, the Forums focus almost exclusively on FreeBSD versions that are officially supported according to the official FreeBSD website. Since resources are scarce, the FreeBSD Forums strongly suggest that anyone asking questions on the forums run one of the officially supported versions (see the links at the end of this post), for which the installed user base is logically the broadest.

In terms of 'unsupported versions' we make the following distinction:

FreeBSD versions that are...
