Solved FreeBSD 12-RELEASE + Ryzen + Bhyve == any luck ?

New build here -

Older-generation Ryzen 2600 (6c/12t, 65W, very cheap)
Asrock B450-Pro4-F (very cheap)
Nvidia GT710 (very cheap)
64GB of cheap DDR4-2666
NVME drive (very cheap)


FreeBSD 12.0 is running great. It's a fantastic little headless server box for general duties and developing jailed apps: ample capacity, cheap, quiet, and power efficient. No complaints there.

Bhyve is looking like a non-starter though. I can fire up a FreeBSD VM, but pretty much everything else I've tried fails instantly in the installer (where "everything else" == half a dozen Linux distros of different vintages, and "fails instantly" == a Linux kernel panic and a hung bhyve process). The host is still OK though: I just kill -9 the offending bhyve process and it's all good.

I do have virtualisation turned on in the BIOS, and I do have hw.vmm.amdvi.enable=1 in /boot/loader.conf, so that's the obvious stuff out of the way. Maybe I have missed some other obvious thing though.

The only use I can think of at the moment for a bhyve VM would be to run up a kubernetes stack in a VM for some experiments. Nothing production worthy.

Wondering if it's worth putting any time into trying to fix this, or is it a slippery slope with Ryzen?

Or maybe the most cost-effective / time-effective solution would be to dual boot the thing if I really need to do some Linux + Kubernetes hacking. (It's unlikely I would need to do both Kubernetes and FreeBSD/jail work at the same time anyway.)

So, any Ryzen + bhyve hints and tips / success stories?

thanks
 
Hi.

I'm having the same issue. I've tried different distros and builds dating back to 2016, to no avail, on a Ryzen 3 2200G.

It has to be fixable in some way, and I'd rather stick with bhyve if we can get it working. I've found some old bugs regarding bhyve and AMD from 2015, but no recent information (yet) on whether it is an actual problem, a lost cause, or currently being worked on.

Want to hack at it? It might still be fixable with a tunable.

Kind regards,
4z0r
 
Hi.

According to bhyve's wiki, AMD should run fine provided the CPU has the POPCNT feature (with some exceptions listed below).

A quick check of the boot messages with grep 'Feature' /var/run/dmesg.boot yields that feature for my CPU model. However, the wiki notes that some architectures with POPCNT are still missing another required feature. From bhyve's wiki:
Most "Barcelona" class and newer AMD processors support bhyve but some, notably the "Kuma" core processors include POPCNT but lack the required "NRIPS" (Next RIP Save) feature.

I couldn't find anything on that feature for my specific processor (yet).
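
For anyone wanting to repeat that check, here is a minimal sketch of a helper that looks for one exact feature name in the feature lines of dmesg.boot. The function name has_cpu_feature is made up for this example, and it assumes the flag appears as a comma-separated token between angle brackets, as in typical FreeBSD boot messages. How (or whether) NRIPS is named in that output is an assumption you would need to verify on your own machine.

```shell
#!/bin/sh
# has_cpu_feature NAME [FILE]
# Hypothetical helper: succeeds if NAME appears as an exact token in the
# "Feature" lines of FILE (default: /var/run/dmesg.boot).
has_cpu_feature() {
    feature=$1
    file=${2:-/var/run/dmesg.boot}
    # Split "<A,B,C>" into one token per line, then match the whole line
    # literally so that e.g. "POP" does not match "POPCNT".
    grep 'Feature' "$file" | tr '<>,' '\n\n\n' | grep -qxF -- "$feature"
}

# Example usage (on the FreeBSD host):
#   has_cpu_feature POPCNT && echo "POPCNT present"
```

Matching whole tokens rather than substrings avoids false positives from similarly named flags.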

Kind regards,
4z0r
 
sweet, more eyes the better.

this weekend, will try a fresh approach, and maybe even try getting some diagnostics.

it must be fixable, yes.
 
vm install kube ubuntu-19.04-live-server-amd64.iso

ACPI error, then a delay, then general protection fault SMP NOPTI

got to read up on what all that means.

Code:
[ 0.047452] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
[ 1.068340] ACPI Error: Could not enable RealTimeClock event (20181213/evxfevnt-184)
[ 1.091518] general protection fault: 0000 [#1] SMP NOPTI
[ 1.091955] CPU: 0 PID: 1 Comm: init Not tainted 5.0.0-13-generic #14-Ubuntu
[ 1.092509] Hardware name: BHYVE, BIOS 1.00 03/14/2014
[ 1.092927] RIP: 0010:switch_mm_irqs_off+0x44c/0x550
[ 1.093322] Code: 20 00 00 00 31 c9 45 31 ed e9 97 fd ff ff 0f 0b 3e 4c 0f ab b3 08 04 00 00 e9 47 fd ff ff b9 49 00 00 00 b8 01 00 00 00 31 d2 <0f> 30 e9 34 fc ff ff 48 8b 15 06 49 66 01 48 8b 34 c2 e8 bd fc fe
[ 1.094779] RSP: 0018:ffffbcd0400d7cc8 EFLAGS: 00010046
[ 1.095191] RAX: 0000000000000001 RBX: ffff9e0fde4f5100 RCX: 0000000000000049
[ 1.095751] RDX: 0000000000000000 RSI: ffff9e0fca42ae00 RDI: ffff9e0fde4f5100
[ 1.096308] RBP: ffffbcd0400d7cf8 R08: 00000000410eb400 R09: 0000000000000000
[ 1.096866] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9e0fde4f4cc0
[ 1.097423] R13: ffff9e0fca42ae00 R14: 0000000000000000 R15: ffff9e0fde4f5100
[ 1.097982] FS: 00007f9ed8f435c0(0000) GS:ffff9e0fdf200000(0000) knlGS:0000000000000000
[ 1.098613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.099065] CR2: 000055c124a29298 CR3: 000000000a764000 CR4: 00000000000406f0
[ 1.099622] Call Trace:
[ 1.099827] __schedule+0x291/0x840
[ 1.100107] schedule+0x2c/0x70
[ 1.100360] do_wait+0x1db/0x240
[ 1.100620] kernel_wait4+0xaf/0x150
[ 1.100907] ? task_stopped_code+0x50/0x50
[ 1.101233] __do_sys_wait4+0x83/0x90
[ 1.101527] ? handle_mm_fault+0xe1/0x210
[ 1.101847] ? __do_page_fault+0x25a/0x4c0
[ 1.102173] __x64_sys_wait4+0x1e/0x20
[ 1.102474]  do_syscall_64+0x5a/0x110
[ 1.102768]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1.103167] RIP: 0033:0x7f9ed8e38387
[ 1.103453] Code: 0f 2b 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 00 48 8d 05 59 90 10 00 8b 00 85 c0 75 13 45 31 d2 b8 3d 00 00 00 0f 05 <48> 3d 00 f0 ff f 77 51 c3 41 54 41 89 d4 55 48 89 f5 53 89 fb 48
[ 1.104901

boom
 
The mystery deepens: OpenSUSE boots in bhyve, at least to a GRUB prompt.
GhostBSD boots in bhyve (to a non-graphical login).
DragonFly loads to some sort of pre-boot command prompt.

So it looks like bhyve is functional at least, and just needs some configuration (well, maybe a lot of configuration).


update ... hmmmm

The bhyve-firmware package is not installed! That might be a problem, not sure yet.
It fails to install because of a size mismatch:

Bash:
# pkg install -f bhyve-firmware
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Updating database digests format: 100%
The following 3 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        bhyve-firmware: 1.0_1
        uefi-edk2-bhyve-csm: 0.2_1,1
        uefi-edk2-bhyve: 0.2_1,1

Number of packages to be installed: 3

The process will require 4 MiB more space.
1 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/2] Fetching uefi-edk2-bhyve-csm-0.2_1,1.txz: 100%  760 KiB 155.7kB/s    00:05
pkg: cached package uefi-edk2-bhyve-csm-0.2_1,1: size mismatch, fetching from remote
[2/2] Fetching uefi-edk2-bhyve-csm-0.2_1,1.txz: 100%  760 KiB 155.7kB/s    00:05
pkg: cached package uefi-edk2-bhyve-csm-0.2_1,1: size mismatch, cannot continue
 
The two firmware images are packaged together in bhyve-firmware.

The only one you need for EFI installs is this one:
sysutils/uefi-edk2-bhyve/

I also copy the *.fd file out into my /vm directory so that the path to the firmware is shorter.
The firmware is only one file.
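
As a rough sketch of what that looks like: copy the firmware file somewhere short and pass it to bhyve as a bootrom. The source path in the usage comment is where I believe the uefi-edk2-bhyve package installs the file, and /vm is the directory mentioned above; both are assumptions to check on your system. The helper name bootrom_arg is invented for this example.

```shell
#!/bin/sh
# bootrom_arg FILE
# Hypothetical helper: print the bhyve bootrom option for a firmware file,
# or fail if the file is not readable.
bootrom_arg() {
    fd=$1
    if [ ! -r "$fd" ]; then
        echo "firmware not readable: $fd" >&2
        return 1
    fi
    printf '%s\n' "-l bootrom,$fd"
}

# Example usage (paths are assumptions, adjust to your setup):
#   cp /usr/local/share/uefi-firmware/BHYVE_UEFI.fd /vm/
#   bootrom_arg /vm/BHYVE_UEFI.fd     # prints: -l bootrom,/vm/BHYVE_UEFI.fd
```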
 
Hi.

I still think this is related to a CPU feature not being present, namely NRIPS. Every distribution and build I've tried has hit approximately the same error during initialization on startup. As far as I've figured out on my end, the sequence is:

  • start_kernel
  • call to the scheduler before entering the idle state
  • panic
This happens regardless of flavor, distribution, or build. So with everything I've tried, it gets stuck during early initialization when the scheduler is called.

Code:
Call Trace:
__schedule+0x5c0/0x67e
? native_safe_halt+0x13/0x14
schedule_idle+0x1b/0x2c
do_idle+0x16e/0x18e
cpu_startup_entry+0x6a/0x6c
start_secondary+0x197/0x1b2
secondary_startup_64+0xa4/0xb0

Again, here are my specs from sysctl hw.model hw.machine hw.ncpu, since it might still be a different issue:

Bash:
hw.model: AMD Ryzen 3 2200G with Radeon Vega Graphics
hw.machine: amd64
hw.ncpu: 4

And the features:

grep Feature /var/run/dmesg.boot

Bash:
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
  Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr>

I have the following packages installed:

sysutils/bhyve-firmware
sysutils/grub2-bhyve
sysutils/uefi-edk2-bhyve-devel

and /etc/rc.conf has
Bash:
kld_list="nmdm vmm"
iohyve_enable="YES"
iohyve_flags="kmod=1 net=wlan0"

and /boot/loader.conf has
Bash:
hw.vmm.amdvi.enable=1
although I don't think this is required at this point (it enables pass-through / direct assignment of devices to the guest).

Kind regards,
4z0r
 
Thanks all. I have tried the above, and ended up building uefi-edk2-bhyve from source too ... same result.

Tried setting the hw.vmm.* sysctl parameters back to their defaults, one at a time ... same result.

Need to think about the next stage.
Wondering if it's hardware? Maybe I should try another motherboard, although these days you would think that most AM4 boards are functionally identical.
 
!!!! FIXED !!!!

Use the option
`ignore_bad_msr="yes"`

in the config for the VM you are trying to boot. It works!

It does say in the docs that this option "is often needed on AMD processors".

... I was trying all the options described in `man vm`. Glad that worked, because it's about the last one left that I hadn't tried yet.

Man, that feels good to see it booting.

TL;DR - Cheap 2nd Gen Ryzen on cheap motherboard + cheap RAM = a winner !!
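
For reference, a minimal sketch of setting that option from a script rather than an editor. It assumes the VM config is a plain file of key="value" lines; the function name set_vm_option and the example path are hypothetical, so point it at wherever your VM's config actually lives. As far as I know, this option corresponds to bhyve's own -w flag, which ignores accesses to unimplemented MSRs.

```shell
#!/bin/sh
# set_vm_option FILE KEY VALUE
# Hypothetical helper: make sure FILE contains exactly one KEY="VALUE" line.
set_vm_option() {
    conf=$1
    key=$2
    value=$3
    [ -f "$conf" ] || return 1
    tmp=$(mktemp) || return 1
    # Keep every line except an existing assignment of KEY
    # (grep -v exits non-zero when nothing is left, which is fine here).
    grep -v "^${key}=" "$conf" > "$tmp" || true
    printf '%s="%s"\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example usage (the path is an assumption):
#   set_vm_option /vm/kube/kube.conf ignore_bad_msr yes
```

Replacing any existing assignment instead of blindly appending keeps the config free of conflicting duplicate lines.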
 
It's interesting that your original error log says:
general protection fault: 0000 [#1] SMP NOPTI

So, perhaps this is related to https://wiki.freebsd.org/SpeculativeExecutionVulnerabilities, specifically:

Code:
Meltdown vulnerability mitigation requires using separate kernel and user mode page tables, so that 
user mode does not have sensitive physical pages mapped even with restricted permissions. The technique
is known as Page Table Isolation (PTI) and implemented for amd64 kernel. PTI is enabled by default for any 
non-AMD CPUs. You can enforce PTI, or instead disable it, with vm.pmap.pti=0 loader tunable.

Anyway, it's just speculation :eek:, I'm glad you've solved it. Still, I wonder whether vm.pmap.pti would fix it.
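
One way to dig into that fault is to look at the instruction the guest kernel died on. In the original log, the byte marked with angle brackets on the first "Code:" line is where the fault happened. A minimal sketch for pulling it out is below; the function names and the tiny opcode table are made up for illustration and do not come from any existing tool. For that log the result is `0f 30`, which is the x86 WRMSR opcode, and the preceding bytes `b9 49 00 00 00` load 0x49 into ECX (matching RCX in the register dump). That would suggest the guest faulted while writing MSR 0x49, which fits an MSR-related fix better than a PTI one, though this reading is an interpretation of the log and not something confirmed in the thread.

```shell
#!/bin/sh
# faulting_opcode: read a kernel oops "Code:" line on stdin and print the
# byte marked <xx> plus the byte after it.
faulting_opcode() {
    sed -n 's/.*<\([0-9a-f][0-9a-f]\)> \([0-9a-f][0-9a-f]\).*/\1 \2/p'
}

# describe_opcode: tiny illustrative lookup, covering only the two MSR opcodes.
describe_opcode() {
    case "$1" in
        "0f 30") echo "wrmsr" ;;
        "0f 32") echo "rdmsr" ;;
        *)       echo "unknown" ;;
    esac
}

# Example usage with an excerpt of the Code: line from the log above:
#   echo 'Code: b9 49 00 00 00 b8 01 00 00 00 31 d2 <0f> 30 e9 34' | faulting_opcode
```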
 
I concur. This fixed the issue on my part as well!

`vm.pmap.pti`, however, did nothing on its own. It was `ignore_bad_msr`.

Cheers!!

Kind regards,
4z0r
 