bhyve virtual machine after S3 mode - glitches or freezes

Hello.
As a continuation of bug #278941, I decided to open a new bug report, because the problem is broader in nature.

Code:
# uname -a
FreeBSD home 13.2-RELEASE-p11 FreeBSD 13.2-RELEASE-p11 releng/13.2-n254665-f5ac4e174fdd MYGENERIC amd64

I manage bhyve virtual machines via vm-bhyve.
But I also tested without that layer, launching bhyve directly. The problem reproduces the same way.
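For reference, a direct launch looks roughly like this (a sketch only; the disk image path, tap interface, VM name, and UEFI firmware path below are examples, not my exact configuration):

```shell
# Sketch of a direct bhyve launch with the UEFI loader.
# Paths, VM name, and tap device are placeholders.
bhyve -c 2 -m 2G -A -H -P \
  -s 0,hostbridge \
  -s 3,virtio-blk,/vm/debian12/disk0.img \
  -s 4,virtio-net,tap0 \
  -s 31,lpc -l com1,stdio \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  debian12
```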

Reproducing the problem:
1) Start a bhyve virtual machine with a guest from the list below.
2) Put the host into S3 mode with the command "zzz" or "acpiconf -s 3".
3) Wake the host from S3 (power the computer back on).
4) Observe glitches in the virtual machine.
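With vm-bhyve, the steps above look roughly like this (the VM name is an example):

```shell
# 1) Start the guest via vm-bhyve (VM name is a placeholder)
vm start debian12

# 2) Suspend the host to S3
acpiconf -s 3     # or simply: zzz

# 3) Wake the host, then 4) check the guest console for glitches
vm console debian12
```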

If you start a bhyve virtual machine and then suspend and resume the host with "zzz" or "acpiconf -s 3", the virtual machine begins to glitch, and some distributions freeze for a while. If you wait, from a few minutes to several hours, the problem may go away on its own.
For example:
- IP addresses cannot be pinged from inside the virtual machine, not even 127.0.0.1;
- some commands stop executing;
- even "reboot", "poweroff" and "init 0" do not work;
- the "fdisk -l" command freezes;
- etc.

Meanwhile, you can ping the guest from the host and even try to connect via ssh, but the ssh connection never completes.

Occasionally the problem does not reproduce, but most of the time it does.
For example, after several S3 suspend/resume cycles, it may fail to reproduce either for all virtual machines or only for some. I.e., some virtual machines will glitch, while others work fine after S3.

I recorded a video of the problem: https://disk.yandex.ru/i/NgmryhIytofmVw
To fully understand it, you need to reproduce the problem yourself and watch what happens to the virtual machine.

I tested this problem with a large number of ISO images and different Linux distributions.
At first I assumed the problem affected only Linux distributions, but I was able to reproduce it on "DragonFly BSD 6.4.0" (a FreeBSD fork) booted from the ISO image. The virtual machine freezes hard, and this is even without X, in a plain console.

The problem with the bhyve virtual machine after "zzz" reproduces on:
Code:
grub loader:
    Debian 10 (debian-10.10.0-amd64-netinst.iso), installed from iso
    Debian 11 (debian-11.5.0-amd64-netinst.iso), installed from iso
    Debian 11 Cloud image (debian-11-nocloud-amd64-20240507-1740.raw), installed
    Debian 12 (debian-12.5.0-amd64-netinst.iso), installed from iso
    Ubuntu 22.04 installed from iso
    Ubuntu 22.04.4 Cloud image (ubuntu-22.04-server-cloudimg-amd64.img), installed
    Ubuntu 24.04 Cloud image (ubuntu-24.04-server-cloudimg-amd64.img), installed
    AlpineLinux (alpine-standard-3.19.1-x86_64.iso), boot from iso
    PureOS 10.3 (pureos-10.3-plasma-live-20230614_amd64.iso), boot from iso
UEFI loader:
    Debian 12 installed from iso
    Debian 12 (debian-12.5.0-amd64-netinst.iso), boot from iso
    Debian 12 Cloud image (debian-12-nocloud-amd64-20240507-1740.raw), installed
    Debian 11 (debian-11.5.0-amd64-netinst.iso), boot from iso
    Debian 11 Cloud image (debian-11-nocloud-amd64-20240507-1740.raw), installed
    Debian 10 (debian-10.10.0-amd64-netinst.iso), boot from iso
    Ubuntu 16.04.7 (ubuntu-16.04.7-server-amd64.iso), boot from iso, freezes
    kubuntu 16.04.1 (kubuntu-16.04.1-desktop-amd64.iso), boot from iso
    Ubuntu 18.04.1 (ubuntu-18.04.1-live-server-amd64.iso), boot from iso
    lubuntu 20.04.2 (lubuntu-20.04.2-desktop-amd64.iso), boot from iso
    Ubuntu 22.04.4 (ubuntu-22.04.4-live-server-amd64.iso), boot from iso
    Ubuntu 23.04 (ubuntu-23.04-desktop-amd64.iso), boot from iso
    Xubuntu 24.04 (xubuntu-24.04-minimal-amd64.iso), boot from iso
    Ubuntu 22.04.4 Cloud image (ubuntu-22.04-server-cloudimg-amd64.img), installed
    Ubuntu 24.04 Cloud image (ubuntu-24.04-server-cloudimg-amd64.img), installed
    ArchLinux (archlinux-2024.05.01-x86_64.iso), boot from iso
    VoidLinux (void-live-x86_64-20221001-xfce.iso), boot from iso, freezes
    ROSA.FRESH 12.4 (ROSA.FRESH.LXQT.12.4.x86_64.uefi.iso), freezes
    MintLinux (lmde-6-cinnamon-64bit.iso), boot from iso, freezes
    Crux (crux-3.6.1.iso), boot from iso
    Trisquel 9.0 (Trisquel_9.0_amd64.iso), boot from iso
    Guix 1.3.0 (guix-system-install-1.3.0.x86_64-linux.iso), boot from iso
    NixOS (nixos-plasma5-x86_64-linux.iso), boot from iso
    SparkyLinux 7.3 (sparkylinux-7.3-x86_64-minimalcli.iso), boot from iso
    antiX 21 (antiX-21_x64-base.iso), boot from iso (base - debian 12)
    MX Linux 23.3 (MX-Linux-23.3_x64.iso), boot from iso, freezes (base - debian 12)
    SystemRescue (systemrescue-10.00-amd64.iso), boot from iso
    FreeBSD and fork:
        DragonFly BSD 6.4.0 (dfly-x86_64-6.4.0_REL.iso), boot from iso, freezes
    parabola 2021.08.11 (parabola-openrc-lxde-2021.08.11-dual.iso), boot from iso, freezes
    parabola 2022.04 (parabola-x86_64-systemd-cli-2022.04-netinstall.iso), boot from iso
    Trisquel 11 (trisquel-mini_11.0_amd64.iso), installed
    Trisquel 11 (trisquel-mini_11.0_amd64.iso), boot from iso
    Devuan 5 (devuan_daedalus_5.0.0_amd64_minimal-live.iso), boot from iso (base - debian 12)
    Fedora 40 Cloud image (Fedora-Cloud-Base-AmazonEC2.x86_64-40-1.14.raw), installed
    Rocky 9 Cloud image (Rocky-9-GenericCloud-Base.latest.x86_64.qcow2), installed
    AlmaLinux 8 Cloud image (AlmaLinux-8-GenericCloud-UEFI-latest.x86_64.qcow2), installed


The problem does not reproduce on:
Code:
UEFI loader:
    OmniOS (omnios-r151050.iso), boot from iso
    Tribblix (tribblix-0m30.iso), boot from iso
    vzLinux 9.0-22 (vzlinux-iso-min-9.0-22.iso), boot from iso
    FreeBSD and fork:
        FreeBSD 13.2, boot from iso
        FreeBSD 14.0, boot from iso
        FreeBSD 15, boot from iso
        GhostBSD 22.06.18 (GhostBSD-22.06.18.iso), boot from iso
        HardenedBSD 14-stable (build-3 2023.10.14), boot from iso
        mfsBSD 13.1 RELEASE (mfsbsd-13.1-RELEASE-amd64.iso), boot from iso
        MidnightBSD-3.0.1 (MidnightBSD-3.0.1--amd64-disc1.iso), boot from iso
        nomadbsd 131 (nomadbsd-131R-20221130.amd64.zfs.img), installed on a flash drive (the mouse does not work)



I also noticed that when the problem resolves on its own, the following messages appear in dmesg inside the guest virtual machine:

Debian 11, Grub loader (Cloud image)
dmesg:
Code:
clocksource: Long readout interval, skipping watchdog check: cs_nsec: 443581789 wd_nsec: 8595201253

Code:
[  102.612635] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  102.616971] clocksource:                       'hpet' wd_now: 18d9585d wd_last: 6610b8ba mask: ffffffff
[  102.620601] clocksource:                       'tsc' cs_now: 3cd9ad0d0a2 cs_last: 3cd2ee5b056 mask: ffffffffffffffff
[  102.624630] tsc: Marking TSC unstable due to clocksource watchdog
[  102.628088] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[  102.631415] sched_clock: Marking unstable (102587039862, 40744173)<-(102634646643, -6561187)
[  102.631784] clocksource: Checking clocksource tsc synchronization from CPU 0.
[  102.638002] clocksource: Switched to clocksource hpet


Debian 12, Grub loader (install from iso)
dmesg:
Code:
[  124.509318] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  124.509386] clocksource:                       'hpet' wd_nsec: 8663531482 wd_now: 849e772a wd_last: 7bf499f7 mask: ffffffff
[  124.509412] clocksource:                       'tsc' cs_nsec: 511967473 cs_now: 101de797fc5 cs_last: 10173beaa3f mask: ffffffffffffffff
[  124.509435] clocksource:                       'tsc' is current clocksource.
[  124.509461] tsc: Marking TSC unstable due to clocksource watchdog
[  124.509535] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[  124.509537] sched_clock: Marking unstable (124548213501, -38421876)<-(124526816918, -17281783)
[  124.510102] clocksource: Checking clocksource tsc synchronization from CPU 0 to CPUs 1.
[  124.510325] clocksource: Switched to clocksource hpet

After these messages appear in dmesg, there are no further problems with the virtual machine. Everything works as it should.
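To see which clocksource the Linux guest is actually using at any moment, the standard sysfs files can be read inside the guest:

```shell
# Inside the Linux guest: show the active clocksource
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

# ...and the clocksources available to the kernel (e.g. tsc, hpet, acpi_pm)
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
```

While the guest is glitching, the active clocksource is still "tsc"; after the watchdog messages above, it reads "hpet".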

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279271
 
Adding the options clocksource=hpet tsc=unstable trace_clock=local to the GRUB_CMDLINE_LINUX_DEFAULT variable in "/etc/default/grub" seems to solve the problem.

With these options, the kernel switches to hpet immediately.

This is written into dmesg immediately upon boot:
Code:
[ 0.514143] clocksource: Switched to clocksource hpet

I.e., for each virtual machine with a Linux guest, you need to add these options to the kernel parameters.
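A sketch of applying the workaround inside a Debian/Ubuntu guest (your existing GRUB_CMDLINE_LINUX_DEFAULT contents may differ; since /etc/default/grub is sourced as a shell file, the last assignment wins):

```shell
# Inside the guest: append the workaround options to the kernel command line.
# If GRUB_CMDLINE_LINux_DEFAULT already carries other options, merge them
# into this line instead of overriding it.
echo 'GRUB_CMDLINE_LINUX_DEFAULT="clocksource=hpet tsc=unstable trace_clock=local"' \
  >> /etc/default/grub

# Regenerate the grub configuration and reboot for it to take effect
update-grub
reboot
```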
 