nvme0: Unable to allocate PCI resource issue

BosephusKingfish · Dec 12, 2022

Greetings everyone - unsure if "System hardware" or "storage" is better, but this seemed more basic so I chose the former:

I've got a new dual Asus AMD Epyc 7742 based server (model #RS700A-E11-RS12U). It's unique in that it has PCIe lanes going directly to NVME disk slots on the front of the chassis. No SAS/SATA, and no controller. Upon attempting to install FreeBSD (either 12 or 13 have the issue), I get through the installer which sees no disks in the system, though there is one 7.68TB NVME disk. I noticed the installer probe showed:

Code:

nvme0: <WDC SN200> at device 0.0 numa-domain 0 on pci4
nvme0: unable to allocate pci resource
nvme0: <WDC SN200> at device 0.0 numa-domain 0 on pci4
nvme0: unable to allocate pci resource

So it appears to understand what it is, but for whatever reason, can't initialize it. Ubuntu 22.04 and Windows 10 and 11 both install to this disk without issue, in case it matters. I only found one other mention of this error in the nvme context and the discussion just kinda trailed off and wasn't quite the same details (it was interrupt related).

We have not deployed the system yet so if this is a more sinister problem, I'd love to assist with the debug/investigation/beta testing. Thank you!

cracauer@ · Dec 12, 2022

Can you gather a dmesg from Linux booting this impressive machine, along with a `lspci -v`?

sko · Dec 12, 2022

BosephusKingfish said:
It's unique in that it has PCIe lanes going directly to NVME disk slots on the front of the chassis.

Doesn't matter whether the disk is attached in the front or internal to a PCIe-/M.2 slot; electrically and from a firmware side it's basically the same... Also PCI, M.2 or U.2 connection shouldn't matter - all are just PCIe connections.

Given that there are *lots* of drives out there with broken/buggy/"special" firmware, where they interpreted the standards just a bit too loosely (i.e. wrong): can you try another (known working) NVME drive?
This will also rule out that there is really nothing special going on with those front PCIe connectors (U.2?).

BosephusKingfish · Dec 12, 2022

cracauer@ said:
Can you gather a dmesg from Linux booting this impressive machine, along with a `lspci -v`?

Attached. Thank you!

BosephusKingfish · Dec 12, 2022

sko said:
Given that there are *lots* of drives out there with broken/buggy/"special" firmware, where they interpreted the standards just a bit too loosely (i.e. wrong): can you try another (known working) NVME drive?
This will also rule out that there is really nothing special going on with those front PCIe connectors (U.2?).

Yes, U.2 - I don't have any other disks to try/work with for the time being, but given Linux, Windows 10, and Windows 11 all work with this disk in this system in that location, I tend to think the disk isn't the problem.

cracauer@ · Dec 13, 2022

Definitely open a bug report on this, attaching those Linux logs. This will be high priority for some PCIe wizard.

covacat · Dec 13, 2022

try to boot a 14-CURRENT recent snapshot too

Index of /ftp/snapshots/amd64/amd64/ISO-IMAGES/14.0/

BosephusKingfish · Dec 13, 2022

covacat said:
covacat said:

try to boot a 14-CURRENT recent snapshot too

Index of /ftp/snapshots/amd64/amd64/ISO-IMAGES/14.0/

Click to expand...

I did fire off a copy of 14-CURRENT, and the disk is detected properly on bootup and even shows that it's capable of being installed on. So looks like whatever the issue was/is, 14 takes care of it. See attached dmesg.

covacat said:
try to boot a 14-CURRENT recent snapshot too

Index of /ftp/snapshots/amd64/amd64/ISO-IMAGES/14.0/

Update: Potentially new info. I installed two M.2 NVME disks on the motherboard and installed FreeBSD 13.1-RELEASE to it and booted there. The dmesg.boot probe is slightly different:

nvme0: <WDC SN200> mem 0xf8330000-0xf8333fff,0xf8320000-0xf832ffff at device 0.0 numa-domain 0 on pci14

If I try to newfs it, it says "newfs: /dev/nvme0: could not find special device", and sure enough there is no /dev/nvme0 devices in /dev, just the two M.2 2TB NVME disks:

crw------- 1 root wheel 0x38 Dec 12 21:24 /dev/nvme1
crw------- 1 root wheel 0x5f Dec 12 21:24 /dev/nvme1ns1
crw------- 1 root wheel 0x39 Dec 12 21:24 /dev/nvme2
crw------- 1 root wheel 0x6d Dec 12 21:24 /dev/nvme2ns1

Full dmesg.boot (with the two M.2 NVME disks and the single 7.68TB NVME disks) attached.

BosephusKingfish · Dec 13, 2022

cracauer@ said:
Definitely open a bug report on this, attaching those Linux logs. This will be high priority for some PCIe wizard.

Will do! Additionally, CURRENT-14 works - you can install to the disk, read it, etc... Just 13.x and earlier don't. ;-(

BosephusKingfish · Dec 14, 2022

Further update: Looks like 13.1 Stable works, too! Dmesg attached. So maybe this is just wait for 13.2?

cracauer@ · Dec 14, 2022

At this point you can probably quickly bisect the actual commit and backport/cherry-pick it to 13.1.

Commit 03acc85e3f2aa5478ae6aa74dd5be7ed05d87a98 looks like a candidate.

nvme0: Unable to allocate PCI resource issue

BosephusKingfish

cracauer@

sko

BosephusKingfish

Attachments

BosephusKingfish

cracauer@

covacat

BosephusKingfish

Attachments

BosephusKingfish

BosephusKingfish

Attachments

cracauer@