bhyve Bhyve can't pass through any of my NVIDIA graphics cards to an Ubuntu guest: the VM freezes before recognizing the disk

Hello.

It's me again. I'm trying to pass my NVIDIA RTX 2080 Ti through from FreeBSD to Ubuntu 21.04. Unfortunately, an error that looks like a bug prevents me from completing the task. First of all, here is the full PCI configuration of my PC:

Code:
root@marietto:/home/marietto # pciconf -v -l

hostb0@pci0:0:0:0:    class=0x060000 rev=0x0d hdr=0x00 vendor=0x8086 device=0x3e30 subvendor=0x1458 subdevice=0x5000
    vendor     = 'Intel Corporation'
    device     = '8th/9th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S]'
    class      = bridge
    subclass   = HOST-PCI

pcib1@pci0:0:1:0:    class=0x060400 rev=0x0d hdr=0x01 vendor=0x8086 device=0x1901 subvendor=0x1458 subdevice=0x5000
    vendor     = 'Intel Corporation'
    device     = '6th-10th Gen Core Processor PCIe Controller (x16)'
    class      = bridge
    subclass   = PCI-PCI

pcib2@pci0:0:1:1:    class=0x060400 rev=0x0d hdr=0x01 vendor=0x8086 device=0x1905 subvendor=0x1458 subdevice=0x5000
    vendor     = 'Intel Corporation'
    device     = 'Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8)'
    class      = bridge
    subclass   = PCI-PCI

vgapci2@pci0:0:2:0:    class=0x030000 rev=0x02 hdr=0x00 vendor=0x8086 device=0x3e98 subvendor=0x1458 subdevice=0xd000
    vendor     = 'Intel Corporation'
    device     = 'CoffeeLake-S GT2 [UHD Graphics 630]'
    class      = display
    subclass   = VGA

none0@pci0:0:18:0:    class=0x118000 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa379 subvendor=0x1458 subdevice=0x8888
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH Thermal Controller'
    class      = dasp

xhci1@pci0:0:20:0:    class=0x0c0330 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa36d subvendor=0x1458 subdevice=0x5007
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH USB 3.1 xHCI Host Controller'
    class      = serial bus
    subclass   = USB

none1@pci0:0:20:2:    class=0x050000 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa36f subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH Shared SRAM'
    class      = memory
    subclass   = RAM

none2@pci0:0:22:0:    class=0x078000 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa360 subvendor=0x1458 subdevice=0x1c3a
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH HECI Controller'
    class      = simple comms

ahci0@pci0:0:23:0:    class=0x010601 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa352 subvendor=0x1458 subdevice=0xb005
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH SATA AHCI Controller'
    class      = mass storage
    subclass   = SATA

pcib3@pci0:0:27:0:    class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086 device=0xa340 subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI

pcib4@pci0:0:28:0:    class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086 device=0xa338 subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI

pcib5@pci0:0:28:5:    class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086 device=0xa33d subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI

pcib6@pci0:0:29:0:    class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086 device=0xa330 subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI

isab0@pci0:0:31:0:    class=0x060100 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa305 subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Z390 Chipset LPC/eSPI Controller'
    class      = bridge
    subclass   = PCI-ISA

hdac2@pci0:0:31:3:    class=0x040300 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa348 subvendor=0x1458 subdevice=0xa0c3
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH cAVS'
    class      = multimedia
    subclass   = HDA

ichsmb0@pci0:0:31:4:    class=0x0c0500 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa323 subvendor=0x1458 subdevice=0x5001
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH SMBus Controller'
    class      = serial bus
    subclass   = SMBus

none3@pci0:0:31:5:    class=0x0c8000 rev=0x10 hdr=0x00 vendor=0x8086 device=0xa324 subvendor=0x8086 subdevice=0x7270
    vendor     = 'Intel Corporation'
    device     = 'Cannon Lake PCH SPI Controller'
    class      = serial bus

em0@pci0:0:31:6:    class=0x020000 rev=0x10 hdr=0x00 vendor=0x8086 device=0x15bc subvendor=0x1458 subdevice=0xe000
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection (7) I219-V'
    class      = network
    subclass   = ethernet

vgapci0@pci0:1:0:0:    class=0x030000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1e04 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 [GeForce RTX 2080 Ti]'
    class      = display
    subclass   = VGA

hdac0@pci0:1:0:1:    class=0x040300 rev=0xa1 hdr=0x00 vendor=0x10de device=0x10f7 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA

xhci0@pci0:1:0:2:    class=0x0c0330 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1ad6 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 USB 3.1 Host Controller'
    class      = serial bus
    subclass   = USB

none4@pci0:1:0:3:    class=0x0c8000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1ad7 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 USB Type-C UCSI Controller'
    class      = serial bus

vgapci1@pci0:2:0:0:    class=0x030000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1c02 subvendor=0x19da subdevice=0x2438
    vendor     = 'NVIDIA Corporation'
    device     = 'GP106 [GeForce GTX 1060 3GB]'
    class      = display
    subclass   = VGA

hdac1@pci0:2:0:1:    class=0x040300 rev=0xa1 hdr=0x00 vendor=0x10de device=0x10f1 subvendor=0x19da subdevice=0x2438
    vendor     = 'NVIDIA Corporation'
    device     = 'GP106 High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA

nvme0@pci0:3:0:0:    class=0x010802 rev=0x03 hdr=0x00 vendor=0xc0a9 device=0x5403 subvendor=0xc0a9 subdevice=0x2100
    vendor     = 'Micron/Crucial Technology'
    class      = mass storage
    subclass   = NVM

xhci2@pci0:5:0:0:    class=0x0c0330 rev=0x03 hdr=0x00 vendor=0x1912 device=0x0014 subvendor=0x1912 subdevice=0x0015
    vendor     = 'Renesas Technology Corp.'
    device     = 'uPD720201 USB 3.0 Host Controller'
    class      = serial bus
    subclass   = USB

Then, according to the wiki (https://wiki.freebsd.org/bhyve/pci_passthru), I masked the PCI devices of the graphics card in /boot/loader.conf like this:

Code:
/boot/loader.conf

pptdevs="1/0/0 1/0/1 1/0/2 1/0/3"

I then rebooted the PC and saw that all the relevant PCI devices had been masked correctly:

Code:
ppt0@pci0:1:0:0:    class=0x030000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1e04 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 [GeForce RTX 2080 Ti]'
    class      = display
    subclass   = VGA

ppt1@pci0:1:0:1:    class=0x040300 rev=0xa1 hdr=0x00 vendor=0x10de device=0x10f7 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA

ppt2@pci0:1:0:2:    class=0x0c0330 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1ad6 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 USB 3.1 Host Controller'
    class      = serial bus
    subclass   = USB

ppt3@pci0:1:0:3:    class=0x0c8000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1ad7 subvendor=0x19da subdevice=0x2503
    vendor     = 'NVIDIA Corporation'
    device     = 'TU102 USB Type-C UCSI Controller'
    class      = serial bus
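
As a cross-check, the bus/slot/function triples for pptdevs can be derived from the pciconf -l output instead of being typed by hand. This is only a minimal sketch: the function name pptdevs_for is made up for illustration, and it assumes the selector format name@pci0:bus:slot:func: shown in the listings above.

```shell
# Read `pciconf -l` output on stdin and print a pptdevs line for every
# function of the given vendor on the given bus.
# pptdevs_for is a hypothetical helper, not a FreeBSD tool.
pptdevs_for() {
    vendor=$1   # PCI vendor ID, e.g. 0x10de for NVIDIA
    bus=$2      # PCI bus number, e.g. 1
    awk -v vendor="$vendor" -v bus="$bus" '
        # Match " vendor=..." with a leading space so that
        # "subvendor=..." is not picked up by accident.
        index($0, " vendor=" vendor " ") {
            # $1 looks like "vgapci0@pci0:1:0:0:"
            split($1, sel, "[@:]")   # name, domain, bus, slot, func
            if (sel[3] == bus)
                out = out (out ? " " : "") sel[3] "/" sel[4] "/" sel[5]
        }
        END { if (out) printf "pptdevs=\"%s\"\n", out }
    '
}

# Usage on the host:
#   pciconf -l | pptdevs_for 0x10de 1
```

The extra description lines printed by pciconf -v are ignored because they do not contain the vendor=0x... field, so the same filter works with or without -v.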

So I tried to run the Ubuntu virtual machine with this command:

Code:
bhyve -S -c 4 -m 8G -w -H \
        -s 0,hostbridge \
        -s 1,virtio-blk,/mnt/da1p1/vms/os/ubuntu-budgie-gpu/ubuntu-2104-gpu.img \
        -s 2,passthru,1/0/0 \
        -s 2:1,passthru,1/0/1 \
        -s 2:2,passthru,1/0/2 \
        -s 2:3,passthru,1/0/3 \
        -s 6,virtio-net,tap0 \
        -s 20,hda,play=/dev/dsp8,rec=/dev/dsp8 \
        -s 29,fbuf,tcp=0.0.0.0:5900,w=1440,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
        vm0

Unfortunately, it didn't work because of this error:

Assertion failed: (!err), function hda_init, file /usr/src/usr.sbin/bhyve/pci_hda.c, line 400.
Abort trap (core dumped)


Is this a bug? Do you have any suggestions? Thanks.
 
I can't help but notice that you are not passing through the NVIDIA VGA device at PCI0 2:0:0.
You need to pass through all the NVIDIA devices and pare them back once the VM is up and running.
For example, you don't have to pass through the HDMI audio component.
 
OK, I was slightly off in my diagnosis.
Studying your pciconf output, I see you have two NVIDIA cards installed.
That is a mistake. Please try to make your experiments as simple as possible.
I recommend using the Intel GPU on the motherboard for the bhyve host and trying to pass through one NVIDIA card.
That is step one.
On top of that, simplify further by getting an NVIDIA FreeBSD VM working first.
Then try Mount Everest.
 
Yes, I have three graphics cards in my PC. The default is the Intel GPU integrated on the motherboard, which I use as the primary one. The others are the RTX 2080 Ti and the GTX 1060. If you take a closer look, you will see that I haven't passed through the GTX 1060, only the RTX 2080 Ti and its child devices. I don't want to climb Everest. Are you saying that I shouldn't pass through the audio device of the RTX 2080 Ti? I don't want to pass through the Intel GPU or the 1060 because they aren't powerful enough.
 
I can attach it; what's the name? Please don't insult me. I'm not lazy, I'm just not very experienced. Maybe its name is lldb.core?
 
Sorry if you felt insulted; that wasn't my intention. But I am not happy with how you reacted to my answers. Last time, instead of thanking me, you deleted your question.
Now, in this thread, I suggested analyzing the core file and you just ignored it. Maybe ask yourself whether you are the one insulting people with your actions.

"What's the name?"
Under FreeBSD and other 4.4BSD systems, a core file is called progname.core instead of just core, to make it clearer which program a core file belongs to.
 
I don't remember which question I deleted. As I have said elsewhere, I know my limits. First of all, I want to learn FreeBSD from the basics. Analyzing a core file is not a basic task. Maybe for you it is; for me it isn't. Maybe one day I will be able to do that too. Furthermore, I don't think that everything can be taken as an offence. Yours seems to be revenge against me, but I have nothing against you. I didn't even realize that you felt offended by my *virtual* actions which, as such, cannot contain anything offensive other than what you insist on seeing in them. We are on the internet here; an offence can only be a phrase written as such. Anyway, yours is not a real offence, but an incorrect judgment about a person who does not think like you and whom you do not understand or approve of.
 
"I don't remember which question I deleted."
The one above my post here.
"As I have said elsewhere, I know my limits. First of all, I want to learn FreeBSD from the basics. Analyzing a core file is not a basic task. Maybe for you it is; for me it isn't."
It isn't a basic task, for sure. But you asked for suggestions on how to proceed with your problem. I have never analyzed a core file either, but this would have been the next logical step.
"Yours seems to be revenge against me, but I have nothing against you."
No, no revenge. I have nothing against you either. I am just talking to you.

Enough offtopic for now, if you want to talk further you can send me a private message.
 
I've changed the product. I've selected "product" = base system, "component" = bhyve and version = 13 RELEASE. Is it OK now?
 
Anyway, I want to try to analyze the file. Let's see if I can get some useful information out of it. Thanks for pointing me in this direction, even though I'm skeptical that I will get anywhere with it.
 
So, this is the information I collected by debugging the core file:

Code:
root@marietto:/usr/home/marietto/Desktop/Files/bhyve # lldb -c bhyve.core
(lldb) target create --core "bhyve.core"
Core file '/usr/home/marietto/Desktop/Files/bhyve/bhyve.core' (x86_64) was loaded.

(lldb) thread backtrace all
* thread #1, name = 'bhyve', stop reason = signal SIGABRT
  * frame #0: 0x00000008015f62ea
    frame #1: 0x000000080156b064
  thread #2, name = 'blk-1:0-0', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #3, name = 'blk-1:0-1', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #4, name = 'blk-1:0-2', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #5, name = 'blk-1:0-3', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #6, name = 'blk-1:0-4', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #7, name = 'blk-1:0-5', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #8, name = 'blk-1:0-6', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #9, name = 'blk-1:0-7', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #10, name = 'vtnet-6:0 tx', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
  thread #11, name = 'hda-audio-output', stop reason = signal SIGABRT
    frame #0: 0x000000080149cb3c
    frame #1: 0x00000008014ac660
(lldb)

What now? Nothing that the developers can't do more professionally, right?
 
Did you get anything useful out of that debug output? What I understand is that I shouldn't pass through the audio device integrated in the graphics card. But I've already tried that and the error is still there.
 
lldb -c bhyve.core
This is not the correct command:
To examine a core file, specify the name of the core file in addition to the program itself. Instead of starting up lldb in the usual way, type lldb -c progname.core -- progname
So this should be something like
lldb -c bhyve.core -- bhyve

Maybe you'll get a more descriptive output then.
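
For what it's worth, here is a small sketch of how that invocation could be assembled with the full path of the binary, so that the debugger does not depend on the current directory to find it. The helper name debug_cmd is hypothetical. The lldb options (--batch, -o, -c) and the gdb options (-batch, -ex) exist, but whether either debugger produces a usable backtrace for this particular core has not been tried here.

```shell
# Print (do not run) a batch-mode debugger command line for a core file.
# debug_cmd is a hypothetical helper name, not part of FreeBSD.
debug_cmd() {
    debugger=$1   # "lldb" or "gdb"
    core=$2       # core file, e.g. bhyve.core
    prog=$3       # program that dumped core, e.g. bhyve

    # Resolve the program to a full path via PATH.
    prog_path=$(command -v "$prog") || {
        echo "cannot find $prog in PATH" >&2
        return 1
    }

    case $debugger in
        lldb) echo "lldb --batch -o 'thread backtrace all' -c $core -- $prog_path" ;;
        gdb)  echo "gdb -batch -ex 'thread apply all bt' $prog_path $core" ;;
        *)    echo "unknown debugger: $debugger" >&2; return 1 ;;
    esac
}

# Usage: print the command, review it, then paste it into the shell.
#   debug_cmd lldb bhyve.core bhyve
#   debug_cmd gdb  bhyve.core bhyve
```

Printing the command instead of running it keeps the helper harmless; both variants ask for a backtrace of every thread and then exit.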
 
Code:
(lldb) thread backtrace all

Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace.
Stack dump:
0.    Program arguments: lldb -c bhyve.core -- bhyve
1.    HandleCommand(command = "thread backtrace all")
#0 0x0000000003ae7aee PrintStackTrace /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13
#1 0x0000000003ae5fa5 RunSignalHandlers /usr/src/contrib/llvm-project/llvm/lib/Support/Signals.cpp:69:18
#2 0x0000000003ae8060 SignalHandler /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
#3 0x0000000804c35e00 handle_signal /usr/src/lib/libthr/thread/thr_sig.c:0:3
Abort trap (core dumped)
 
"What does pci_hda.c (that is, the sound card emulation code) have to do with passthrough anyway?"
He wants to pass the "TU102 High Definition Audio Controller" through to bhyve.
Looks like lldb has a bug, see here. The OP suggests using gdb(1) instead.
 
That's what I get for being subtle (or, rather, less than complete asshole) on the Internet.
I don't get it.

What's the point?
A complete backtrace with all the options passed to hda_init()?

Looking at the code, one of these function calls fails:
Code:
p = hda_parse_config(opts, "play=", play);
r = hda_parse_config(opts, "rec=", rec);

Would be interesting to know which one and what options were passed.
 
Did you read this? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258007

This is what I wrote in the bug report:

Then, according to the wiki (https://wiki.freebsd.org/bhyve/pci_passthru), I masked the PCI devices of the graphics card in /boot/loader.conf like this:

Code:
/boot/loader.conf

pptdevs="1/0/0 1/0/1 1/0/2 1/0/3"

But I also tried different combinations, like these:

pptdevs="1/0/0 1/0/2 1/0/3"

or

pptdevs="2/0/0"

or

pptdevs="2/0/0 2/0/1"


This means that I tried to exclude the audio device, but passing through the other devices froze the Ubuntu virtual machine. Maybe I should remove this parameter?

-s 20,hda,play=/dev/dsp8,rec=/dev/dsp8 \
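
Before removing the parameter, it may be worth checking that the device nodes named in play= and rec= actually exist on the host. This is a minimal sketch that assumes, without confirmation from this thread, that a missing or wrong /dev/dsp node is one possible reason for hda_init to fail. check_hda_opts is a hypothetical helper name.

```shell
# Check that every play=/rec= path in a bhyve hda option string is an
# existing character device. check_hda_opts is a hypothetical helper.
check_hda_opts() {
    opts=$1       # e.g. "hda,play=/dev/dsp8,rec=/dev/dsp8"
    rc=0
    old_ifs=$IFS
    IFS=,         # split the option string on commas
    for kv in $opts; do
        case $kv in
            play=*|rec=*)
                dev=${kv#*=}              # strip the "play=" / "rec=" key
                if [ ! -c "$dev" ]; then
                    echo "not a character device: $dev" >&2
                    rc=1
                fi
                ;;
        esac
    done
    IFS=$old_ifs
    return $rc
}

# Usage on the host, with the option string from the bhyve command:
#   check_hda_opts "hda,play=/dev/dsp8,rec=/dev/dsp8" && echo "nodes exist"
# `cat /dev/sndstat` lists the sound devices the host currently knows.
```

If the check passes and the assertion still fires, starting the VM once without the -s 20,hda,... line would at least show whether the emulated sound device is involved at all.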
 