Hi, folks....
I'm mystified on this one since I know it is possible to get the GPU on these working now... Any suggestions?
This is on 12-STABLE, compiled from source, most recently about 48 hours ago (upgrade from 11.2-PRERELEASE, before VEGA graphics were supported with the 4.16 drm merge), rebuilt all ports, etc. Although I am willing to entertain the possibility that there is something wrong with the install, everything else seems to work flawlessly in 12-STABLE, I just still can't seem to get the GPU online properly.
I'm currently downloading the 12-STABLE 20190404 snapshot to do a fresh install to another disk for additional testing.
The board is an ASUS Prime B350-PLUS, which I have tried with the last few BIOS versions up to and including the latest from 3 days ago with the latest microcode updates. Memory speed has been backed off from the perfectly stable 3200 to stock 2133. I have twiddled every knob I can think of in the BIOS to no avail, although I suspect either I must be missing something or this is a faulty BIOS implementation somehow.
The original install was on a GPT-partitioned Intel SSD, booting EFI. I know about the issues with amdgpu on EFI boots, so of course I tried disabling the console via the hw.syscons.disable=1 loader.conf tunable. While it does disable the console, as soon as the screen would normally clear to start the kernel boot messages (where I just have the EFI framebuffer debug info still on the screen), the top few lines of the screen go yellow, then patterns, then yellow... and then I get nothing further from the console, regardless of whether I try to
kldload amdgpu
later or not. (I can usually still wait for it to boot and then SSH in, although every few boots it just crashes and reboots.)

OK, I think, "must be something with virtualization or IOMMU..." so I disable them in the BIOS... No change.
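For anyone following along, the relevant knobs look roughly like this (a sketch, not my exact files; the kld_list line is just an alternative to kldload'ing by hand):

```shell
# /boot/loader.conf -- blank the syscons/EFI console so amdgpu can take over
hw.syscons.disable=1

# /etc/rc.conf -- load amdgpu late in boot instead of a manual kldload
kld_list="amdgpu"
```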
The latest BIOS has an option to set the framebuffer size... OK, set it manually... No change.
OK, screw EFI, this is supposed to Just Work on a legacy boot, so I try to force legacy mode in the BIOS, but of course it can't boot because the disk was set up GPT... Grrrr....

OK, grab a different disk, MBR-partition it, install boot0... do a
dump -0 -f - / | restore rf -
onto the new disk... Surely this will work, right? ...
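For the record, the disk prep went roughly like this (a sketch from memory, so double-check the device names before pointing gpart at anything; ada1 here is an assumption for the scratch disk):

```shell
# WARNING: destructive -- ada1 is assumed to be the scratch disk; verify yours first
gpart create -s mbr ada1
gpart add -t freebsd -a 4k ada1        # one big slice: ada1s1
gpart set -a active -i 1 ada1
gpart bootcode -b /boot/boot0 ada1     # legacy boot0 manager in the MBR
gpart create -s bsd ada1s1
gpart add -t freebsd-ufs ada1s1        # creates ada1s1a
gpart bootcode -b /boot/boot ada1s1    # boot1+boot2 in the slice
newfs -U /dev/ada1s1a
mount /dev/ada1s1a /mnt
cd /mnt && dump -0 -f - / | restore -rf -   # clone the root filesystem over
```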
Boot legacy VT(vga) and it doesn't work either!! Double GRRRRR....

No joy.... Console enabled or disabled, even on my legacy, non-EFI boot, starting amdgpu just goes like this:
(cough... sputter...wheeze...)
Code:
Apr 7 12:47:08 some-hostname kernel: [drm] amdgpu kernel modesetting enabled.
Apr 7 12:47:08 some-hostname kernel: drmn0: <drmn> on vgapci0
Apr 7 12:47:08 some-hostname kernel: vgapci0: child drmn0 requested pci_enable_io
Apr 7 12:47:08 some-hostname syslogd: last message repeated 1 times
Apr 7 12:47:08 some-hostname kernel: [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x1043:0x876B 0xC6).
Apr 7 12:47:08 some-hostname kernel: [drm] register mmio base: 0xFCC00000
Apr 7 12:47:08 some-hostname kernel: [drm] register mmio size: 524288
Apr 7 12:47:08 some-hostname kernel: [drm] PCI I/O BAR is not found.
Apr 7 12:47:09 some-hostname kernel: drmn0: successfully loaded firmware image with name: amdgpu/raven_gpu_info.bin
Apr 7 12:47:09 some-hostname kernel: [drm] probing gen 2 caps for device 1022:15db = 700d03/e
Apr 7 12:47:09 some-hostname kernel: [drm] probing mlw for device 1002:15dd = 400d03
Apr 7 12:47:09 some-hostname kernel: [drm] VCN decode is enabled in VM mode
Apr 7 12:47:09 some-hostname kernel: [drm] VCN encode is enabled in VM mode
Apr 7 12:47:09 some-hostname kernel: [drm] BIOS signature incorrect 0 0
Apr 7 12:47:09 some-hostname kernel: ATOM BIOS: 113-RAVEN-113
Apr 7 12:47:09 some-hostname kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
Apr 7 12:47:09 some-hostname kernel: drmn0: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
Apr 7 12:47:09 some-hostname kernel: drmn0: GTT: 1024M 0x000000F500000000 - 0x000000F53FFFFFFF
Apr 7 12:47:09 some-hostname kernel: Successfully added WC MTRR for [0xe0000000-0xefffffff]: 0;
Apr 7 12:47:09 some-hostname kernel: [drm] Detected VRAM RAM=512M, BAR=256M
Apr 7 12:47:09 some-hostname kernel: [drm] RAM width 128bits UNKNOWN
Apr 7 12:47:09 some-hostname kernel: [drm:amdgpu_ttm_global_init] Failed setting up TTM memory accounting subsystem.
Apr 7 12:47:09 some-hostname kernel: [drm:amdgpu_device_ip_init] sw_init of IP block <gmc_v9_0> failed -12
Apr 7 12:47:09 some-hostname kernel: drmn0: amdgpu_device_ip_init failed
Apr 7 12:47:09 some-hostname kernel: drmn0: Fatal error during GPU init
Apr 7 12:47:09 some-hostname kernel: [drm] amdgpu: finishing device.
Apr 7 12:47:09 some-hostname kernel: vgapci0: child drmn0 requested pci_disable_io
Apr 7 12:47:09 some-hostname syslogd: last message repeated 1 times
Apr 7 12:47:09 some-hostname kernel: device_attach: drmn0 attach returned 12
The last thing actually shown on the console screen, when it is enabled, is the line:
"Apr 7 12:47:08 some-hostname kernel: [drm] PCI I/O BAR is not found."
Things like "BIOS signature incorrect 0 0" make me suspect one of two things: either the firmware modules are wrong somehow, causing the GPU firmware to not load or initialize properly (I have tried the binary versions via pkg install for both the drm kmod and the firmware ports, in case it was something with my source trees... No change.), or something in the machine's BIOS is gibbled and not setting up the GPU's memory ranges correctly.
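If anyone wants to compare notes on what the BIOS is actually programming into the card, the BARs can be inspected from base with pciconf; something like this (vgapci0 is an assumption, check which instance your GPU attached as first):

```shell
# Show the GPU's base address registers as set up by the BIOS
# (vgapci0 assumed -- confirm with: pciconf -lv | grep -B3 display)
pciconf -lb vgapci0
```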
I'm trying a fresh 12-STABLE install now, will update with any progress.
I'm tearing my hair out on this one...
Any bright ideas?