System hang on boot up?

Hi,

I have a Supermicro X10DRi-T motherboard with FreeBSD 11.1 installed. Sometimes the boot gets stuck as shown below. If I reboot, it boots up fine. Any idea how to solve this?

QST6ZFy.jpg
 
Was this system booting successfully with an older version of FreeBSD (11.0 or 10.3) or is it a new installation? If things used to work, this may be related to the new EARLY_AP_STARTUP kernel option. You could try building / booting a kernel without that option. If it is a new install, that could also be the problem, but it is more difficult to be sure as we don't have a working baseline to compare to.
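
As a rough sketch of what "a kernel without that option" could look like: a small custom kernel config that inherits GENERIC and only removes EARLY_AP_STARTUP. The name NOEARLYAP is made up for illustration, and the file is assumed to live in /usr/src/sys/amd64/conf/ on an amd64 system with the source tree installed.

```
# /usr/src/sys/amd64/conf/NOEARLYAP  (hypothetical file name)
# Start from the stock GENERIC configuration.
include GENERIC
# Give the kernel its own ident so it is easy to spot in uname output.
ident NOEARLYAP
# Remove the option that GENERIC enables.
nooptions EARLY_AP_STARTUP
```

It would then be built and installed from /usr/src with something like `make buildkernel KERNCONF=NOEARLYAP` followed by `make installkernel KERNCONF=NOEARLYAP`.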

Also, make sure your BIOS and IPMI firmware are up-to-date. At present, it looks like the latest BIOS is 5.05 from June 5th, 2017 and the latest IPMI is 3.58 from June 9th, 2017.

If neither of those helps, try a verbose boot (from the loader menu) and see whether a failed boot hangs consistently in the same place. This is often less helpful than it seems, because many boot problems originate earlier in the boot process and only surface later on.
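
Since the hang is intermittent, it may be easier to make every boot verbose rather than picking it from the menu each time. A minimal sketch, assuming you are happy to leave this in /boot/loader.conf until the problem is found:

```
# /boot/loader.conf
# Boot verbosely on every boot, equivalent to "boot -v" at the loader prompt.
boot_verbose="YES"
```

Remove the line again once you are done debugging, since verbose boots are noisy.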
 
The BIOS and IPMI are up to date. I did not have this issue in 10.3; it seems to have crept in with the 11.0/11.1 releases. Was the EARLY_AP_STARTUP option in 11.0?
 
The code was there, but the kernel config to enable it was only MFC'd a little over 2 months ago, so it should not be on in 11.0 unless you built a kernel that enabled it.
 
I built a custom kernel without EARLY_AP_STARTUP. A verbose boot with the new kernel hangs as follows:

nKAKuBu.jpg



My /boot/loader.conf looks like this:
Code:
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
vfs.zfs.min_auto_ashift=12
zfs_load="YES"
net.fibs=6
net.add_addr_allfibs=0
ipfw_load="YES"
net.inet.ip.fw.default_to_accept=1
geom_eli_load="YES"
coretemp_load="YES"
aesni_load="YES"
geom_eli_load="YES"

sfxge_load="YES"

#disable APIC so that debian vm doesnt give error
hw.vmm.vmx.use_apic_vid="0"

#so that windows bhyve can see more cores
hw.vmm.topology.cores_per_package="4"

#powerd
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1

vmm_load="YES"
pptdevs="1/0/0 2/0/0 131/0/0 131/0/1"
 
That may be helpful. The x2APIC support was apparently never committed to FreeBSD 10.x.

A Supermicro FAQ for a much older board says: in “Local APIC Mode” selections of BIOS setup, please select “Compatible APIC Mode.” Your motherboard may have something different, perhaps X2APIC_OPT_OUT.
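
If the BIOS has no usable setting, it may also be possible to tell FreeBSD itself not to switch to x2APIC mode. A minimal sketch, assuming the hw.x2apic_enable loader tunable is present in your release (I believe it came in with the 11.x x2APIC support, but please verify before relying on it):

```
# /boot/loader.conf
# Assumption: hw.x2apic_enable is honored by this kernel.
# 0 asks the kernel to stay in xAPIC mode even if the CPU supports x2APIC.
hw.x2apic_enable=0
```

If the hang still happens with this set, x2APIC is probably not the cause.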
 
X2APIC was set to disabled in the BIOS. Unfortunately, I don't see an option to set 'Compatible APIC Mode'.

kaww64h.jpg


After enabling it, I get an extra option, as you mentioned. Going to try this out.

UtOqHUU.jpg
 
I tried with the X2APIC_OPT_OUT flag both on and off, with a soft reset in between. It didn't help.
 
The system booted up fine with two consecutive reboots. This is with X2APIC_OPT_OUT flag turned on. I will try a reboot after 24 hours to check again.

BXNOZx1.png
 
The culprit was the USB 3 PCIe card. As soon as I took it out, the problems went away. This bug report has the details.
 