Recently I have been experiencing strange behavior with new virtual machines running FreeBSD 13.0-RELEASE (the host is Linux with QEMU/KVM), even on a fresh installation updated to p05 via freebsd-update.
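For the record, the update to the patch level was just the standard procedure, roughly:
Code:
# fetch and install the latest patches for the running release
freebsd-update fetch
freebsd-update install
# reboot so the patched kernel is in use
shutdown -r now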
Symptoms:
* One of the CPUs is always loaded at 100%.
* Shutting down does not always work; sometimes the system just hangs instead of powering off cleanly.
I also tried a 13.0-RELEASE VM at p04, and it has the same problem. However, I have another p04 VM that is fine, with a CPU load of 0. The only significant difference I can see is that they run on different storage.
My setup is as follows (a rough sketch of the ZVOL layout follows the list):
Host machine (Linux/KVM)
- ZFS pool zroot (NVMe SSDs) with ZVOLs: FreeBSD 13.0 VMs with ZFS root pools running on top of the volumes. These VMs are fine.
- ZFS pool tank (SATA HDDs) with ZVOLs: FreeBSD 13.0 VMs with ZFS root pools running on top of the volumes. All of these VMs seem to have the problem of one CPU always at 100%.
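For context, each VM disk is a single ZVOL that QEMU receives as a raw block device, roughly like this (dataset and device names are made up for illustration):
Code:
# on the Linux host: a sparse 20G volume for one guest
zfs create -s -V 20G tank/vm/freebsd130-disk0
# handed to the guest as a virtio disk
qemu-system-x86_64 ... \
    -drive file=/dev/zvol/tank/vm/freebsd130-disk0,if=virtio,format=raw
Inside each guest, the FreeBSD installer then creates its own ZFS root pool on that virtio disk.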
The VMs are virtually empty. Here is the output of htop:
Code:
0[ 0.0%] 3[ 0.0%]
1[ 0.0%] 4[ 0.0%]
2[|||||||||||||||||||||||||||||||||||||||||||100.0%] 5[ 0.0%]
Mem[|||| 142M/3.96G] Tasks: 26, 0 thr, 27 kthr; 3 running
Swp[ 0K/0K] Load average: 0.75 0.23 0.09
Uptime: 00:01:11
PID△USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
1 root 52 0 11824 1116 S 0.0 0.0 0:00.08 /sbin/init
22591 root 20 0 12920 2808 S 0.0 0.1 0:00.01 ├─ /usr/sbin/syslogd -ss
33036 root 52 0 13192 2736 S 0.0 0.1 0:00.00 ├─ dhclient: system.syslog
33843 root 4 0 13192 2808 S 0.0 0.1 0:00.00 ├─ dhclient: vtnet0 [priv]
48160 ntpd 20 0 21864 6624 S 0.0 0.2 0:00.01 ├─ /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.
54105 root 20 0 12964 2648 S 0.0 0.1 0:00.00 ├─ /usr/sbin/cron -s
59512 _dhcp 52 0 13196 2916 S 0.0 0.1 0:00.00 ├─ dhclient: vtnet0
60118 root 20 0 11492 1484 S 0.0 0.0 0:00.00 ├─ /sbin/devd
60954 root 32 0 20952 8396 S 0.0 0.2 0:00.00 ├─ /usr/sbin/sshd
68129 root 20 0 21392 9036 S 0.0 0.2 0:00.02 │ └─ sshd: root@pts/0
68326 root 20 0 13980 4028 S 0.0 0.1 0:00.01 │ └─ -csh
68914 root 20 0 16536 4684 R 0.0 0.1 0:00.01 │ └─ htop
63567 root 52 0 13624 2836 S 0.0 0.1 0:00.00 ├─ sh /etc/rc autoboot
63766 root 52 0 12768 2188 S 0.0 0.1 0:00.00 │ └─ sleep 60
64313 root 52 0 12932 2544 S 0.0 0.1 0:00.00 ├─ logger -p daemon.notice -t fsck
65693 root 52 0 12932 2416 S 0.0 0.1 0:00.00 ├─ logger: system.syslog
67027 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv0
67179 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv1
67220 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv2
67368 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv3
67570 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv4
67755 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv5
67784 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv6
67931 root 52 0 12892 2352 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty Pc ttyv7
68111 root 52 0 12892 2344 S 0.0 0.1 0:00.00 ├─ /usr/libexec/getty 3wire ttyu0
79249 unbound 21 0 27612 10312 S 0.0 0.2 0:00.01 └─ /usr/sbin/local-unbound -c /var/unbound/unbound.co
I am quite puzzled by this. The only thing that comes to mind is that this is somehow caused by a ZFS pool being created on top of a ZVOL in another ZFS pool. I remember reading that experts warn against doing this, but I don't recall the reasons.
Could it be that NVMe is somehow handled differently (perhaps with regard to caching, CPU, and memory), so that ZFS on top of ZFS causes no problem there, but doing the same on an HDD does?
Edit: I also tried copying the VM image to the NVMe ZFS pool, but the problem persists, so presumably it is not caused by the use of SATA HDDs.
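(Moving a ZVOL between pools is a plain snapshot plus send/receive, something along these lines; dataset names are illustrative:)
Code:
zfs snapshot tank/vm/freebsd130-disk0@move
zfs send tank/vm/freebsd130-disk0@move | zfs recv zroot/vm/freebsd130-disk0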
Interestingly, ps aux shows 100% CPU usage going to the kernel process [rand_harvestq]:
Code:
root@freebsd130:~ # ps aux | less
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 498.0 0.0 0 96 - RNL 01:53 21:24.95 [idle]
root 22 99.0 0.0 0 16 - RL 01:53 4:13.76 [rand_harvestq]
root 0 0.0 0.1 0 6224 - DLs 01:53 0:00.38 [kernel]
root 1 0.0 0.0 11824 1120 - ILs 01:53 0:00.07 /sbin/init
root 2 0.0 0.0 0 96 - DL 01:53 0:00.00 [KTLS]
root 3 0.0 0.0 0 16 - DL 01:53 0:00.00 [crypto]
root 4 0.0 0.0 0 16 - DL 01:53 0:00.00 [crypto returns 0]
root 5 0.0 0.0 0 16 - DL 01:53 0:00.00 [crypto returns 1]
...
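A convenient way to watch this live is top with kernel threads enabled:
Code:
# -S includes system (kernel) processes, -H shows threads, -o cpu sorts by CPU
top -SH -o cpu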
It looks like this is a duplicate of https://forums.freebsd.org/threads/freebsd-13-high-cpu-usage-rand_harvestq.80475/
I was able to resolve the problem by applying the workaround described in the link above.
I recreated the VM using the i440FX chipset and BIOS instead of UEFI, and the CPU usage is now back to normal. As the bug is already reported, I will mark this thread as solved.
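For anyone wanting to reproduce the workaround: with a libvirt-managed guest this amounts to creating the domain with the pc (i440FX) machine type and without UEFI firmware, e.g. via virt-install (all names and sizes here are illustrative):
Code:
virt-install --name freebsd130 --memory 4096 --vcpus 6 \
    --machine pc --boot hd \
    --disk path=/dev/zvol/zroot/vm/freebsd130-disk0,bus=virtio \
    --import --osinfo freebsd13.0
Omitting --boot uefi leaves the guest on the default SeaBIOS firmware.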