Solved Updating from 11.1-STABLE to 11.2-STABLE results in kernel panic on XEN

I'm running a couple of FreeBSD HVM domU guests under Xen 4.10.1 on a Gentoo Linux host, and have found (frustratingly) that the kernel panics on the first reboot after running "make installkernel". Below is the console output up to the panic:
Code:
  ______               ____   _____ _____
|  ____|             |  _ \ / ____|  __ \
| |___ _ __ ___  ___ | |_) | (___ | |  | |
|  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
| |   | | |  __/  __/| |_) |____) | |__| |
| |   | | |    |    ||     |      |      |
|_|   |_|  \___|\___||____/|_____/|_____/    ```                        `
                                             s` `.....---.......--.```   -/
+============Welcome to FreeBSD===========+ +o   .--`         /y:`      +.
|                                         |  yo`:.            :o      `+-
|  1. Boot Multi User [Enter]             |   y/               -/`   -o/
|  2. Boot [S]ingle User                  |  .-                  ::/sy+:.
|  3. [Esc]ape to loader prompt           |  /                     `--  /
|  4. Reboot                              | `:                          :`
|                                         | `:                          :`
|  Options:                               |  /                          /
|  5. [K]ernel: kernel (1 of 2)           |  .-                        -.
|  6. Configure Boot [O]ptions...         |   --                      -.
|                                         |    `:`                  `:`
|                                         |      .--             `--.
|                                         |         .---.....----.
+=========================================+
                                        
/boot/kernel/kernel text=0x1499010 data=0x13d5a8+0x474fa0 syms=[0x8+0x15db90+0x8+0x177861]
/boot/entropy size=0x1000
Booting...
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.2-STABLE #4 r340432: Wed Nov 14 14:37:21 EST 2018
    root@jailer.warfaresdl.com:/usr/obj/usr/src/sys/JAILER amd64
FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)
VT(vga): text 80x25
XEN: Hypervisor version 4.10 detected.
CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz (3600.19-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x906e9  Family=0x6  Model=0x9e  Stepping=9
  Features=0x1fc3fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT>
  Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x1c6fbb<FSGSBASE,TSCADJ,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,MPX,RDSEED,ADX,SMAP>
  Structured Extended Features3=0x9c000000<IBPB,STIBP,L1DFL,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x1000
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 3217031168 (3068 MB)
avail memory = 3076325376 (2933 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <Xen HVM>
WARNING: L1 data cache covers less APIC IDs than a core
0 < 1
WARNING: L2 data cache covers less APIC IDs than a core
0 < 1
WARNING: L3 data cache covers less APIC IDs than a core
0 < 1
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: unblocking device.
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47 on motherboard
MADT: Forcing active-low polarity and level trigger for SCI
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f4d780, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <Xen> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
panic: unable to configure IRQ#8

cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80acdf27 at kdb_backtrace+0x67
#1 0xffffffff80a87417 at vpanic+0x177
#2 0xffffffff80a87293 at panic+0x43
#3 0xffffffff81050352 at xen_intr_pirq_config_intr+0x102
#4 0xffffffff803b95a5 at acpi_alloc_resource+0x1e5
#5 0xffffffff80ac306e at bus_alloc_resource+0x9e
#6 0xffffffff810366cd at atrtc_attach+0x1fd
#7 0xffffffff80ac0598 at device_attach+0x3b8
#8 0xffffffff80ac183d at bus_generic_attach+0x3d
#9 0xffffffff803b8c89 at acpi_attach+0xe39
#10 0xffffffff80ac0598 at device_attach+0x3b8
#11 0xffffffff80ac183d at bus_generic_attach+0x3d
#12 0xffffffff80ac0598 at device_attach+0x3b8
#13 0xffffffff80ac1ea9 at bus_generic_new_pass+0xe9
#14 0xffffffff80ac3b67 at root_bus_configure+0x77
#15 0xffffffff8103a009 at configure+0x9
#16 0xffffffff80a22b78 at mi_startup+0x118
#17 0xffffffff8030602c at btext+0x2c
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
The kernel config is short and sweet:
Code:
include         GENERIC
ident           JAILER

options         XENHVM
device          xenpci
device          hyperv
device          virtio
device          virtio_pci
device          vtnet
device          virtio_blk
device          virtio_scsi
device          virtio_balloon
device          virtio_random
device          virtio_console

# xen performance options
options         NO_ADAPTIVE_MUTEXES
options         NO_ADAPTIVE_RWLOCKS
options         NO_ADAPTIVE_SX

# disable IPSEC
nooptions       IPSEC, IPSEC_SUPPORT
# disable NFS
nooptions       NFSCL, NFSD, NFSLOCKD, NFS_ROOT
# disable VIA Padlock
nodevice        padlock_rng
Sources used:
Code:
root@jailer:~ # svnlite info /usr/src
Path: /usr/src
Working Copy Root Path: /usr/src
URL: https://svn.freebsd.org/base/stable/11
Relative URL: ^/stable/11
Repository Root: https://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 340432
Node Kind: directory
Schedule: normal
Last Changed Author: scottl
Last Changed Rev: 340403
Last Changed Date: 2018-11-13 13:49:43 -0500 (Tue, 13 Nov 2018)

For reference, below is the Xen config file:
Code:
memory = 3072
vcpus = 4
acpi = 1
apic = 1
name = "jailer"

uuid = "b0634902-82d1-421a-8f73-ec702d28bd1c"

# PVHVM stuff
type = "hvm"
firmware_override = "hvmloader"
boot = "c"

vif = [ 'mac=00:16:3e:fe:ce:af,bridge=bridge0' ]
disk = [ 'format=raw, vdev=xvda, access=rw, target=/dev/zvol/rpool/VM/jailer' ]

device_model_version = 'qemu-xen-traditional'

# Necessary for getting the serial console in `xl console`
serial = "pty"
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'destroy'
EDIT: Also of note: I started with the unmodified GENERIC kernel config before attempting the custom kernel route. The GENERIC kernel panicked with the same output, which is why I tried building a custom kernel. Am I missing something?
 
Well, first of all, why run STABLE rather than RELEASE? If you don't have a specific reason, I'd recommend RELEASE, which is the commonly supported version. STABLE is basically a developers' snapshot: not 'bleeding edge', but ickyness can find its way in there, which is less likely with RELEASE.

Second: why bother with a customized kernel if all (well, most) of what you do is add stuff that's already enabled in GENERIC? Most of your entries (above your Xen performance options) can be removed because they are already defined there. See also the GENERIC config file.
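To see which entries are redundant, something like the following could be used. It is only a rough sketch: the function name `already_in_generic` is made up for illustration, and it only looks at plain `device` and `options` lines, ignoring `include`, `nooptions` and `nodevice`.

```shell
#!/bin/sh
# already_in_generic CUSTOM GENERIC   (hypothetical helper, not a FreeBSD tool)
# Print every "device"/"options" directive in CUSTOM that GENERIC already has.
already_in_generic() {
    awk '
        { sub(/#.*/, "") }                            # strip trailing comments
        ($1 != "device" && $1 != "options") { next }  # ignore everything else
        NR == FNR { seen[$1 " " $2] = 1; next }       # first file: GENERIC
        (($1 " " $2) in seen) { print $1, $2 }        # second file: custom config
    ' "$2" "$1"
}
```

Assuming the custom config lives next to GENERIC (the usual place, but an assumption here), it would be called as `already_in_generic /usr/src/sys/amd64/conf/JAILER /usr/src/sys/amd64/conf/GENERIC`. Anything it prints is already pulled in by `include GENERIC`.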

Also: is there any crash / debug info? If you end up with a core file you could try running gdb to see if you can find more hints about any specific causes.
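
For reference, kernel crash dumps are normally enabled along these lines. This is a minimal sketch of the usual settings, not something verified on this guest, and the `vmcore.0` file name is just the typical first dump. Since rc only configures the dump device once the system reaches userland, a panic this early in device attach may well leave nothing to save.

```shell
# /etc/rc.conf fragment (rc.conf is sourced by sh, so these are plain assignments)
dumpdev="AUTO"          # let rc pick a suitable swap device for kernel dumps
dumpdir="/var/crash"    # where savecore(8) writes vmcore.N on the next boot

# After a panic and a successful reboot, a saved dump could be inspected with:
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
```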

(edit) Anyway, for what it's worth, I'd try GENERIC instead (don't configure anything). Better yet: don't build your own kernel at all, but grab the default GENERIC and use that, maybe combined with RELEASE.
 
I can try RELEASE; I have no need to stay on the bleeding edge, and I was under the impression that was CURRENT anyway. At any rate, it would be good to report this issue before it gets into a RELEASE, even if I do switch back to RELEASE.

You're correct about most options being included in GENERIC. I actually started with the GENERIC kernel when I ran into this problem. I then started looking for any kernel options that may be specific to Xen/virtualization and enabled those that I thought were critical. Keeping the overlapping ones ensures that if they are ever dropped from GENERIC, they are not dropped from my config.

I'm not sure where to look for crash/debug info or a core file. All I have is the stack trace in the console output, and I'm fairly sure the system never got as far as mounting the disk.
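
Without a core file, the backtrace in the console log is what a bug report would have to rely on. As a rough sketch, the function names can be pulled out of a saved console log like this. The helper name `backtrace_functions` and the log file name below are made up for illustration.

```shell
#!/bin/sh
# backtrace_functions   (hypothetical helper)
# Read a console log on stdin and print the function name of each KDB
# backtrace frame, innermost first. Frames look like:
#   #3 0xffffffff81050352 at xen_intr_pirq_config_intr+0x102
backtrace_functions() {
    awk '
        { sub(/\r$/, "") }                  # drop a serial-console carriage return
        /^#[0-9]+ 0x[0-9a-f]+ at / {
            sub(/\+0x[0-9a-f]+$/, "", $4)   # strip the +0xNNN offset
            print $4
        }
    '
}
```

On the dom0, the console could be captured with `xl console jailer | tee jailer-console.log` and then fed in with `backtrace_functions < jailer-console.log`. For the panic above, the first frame below `panic` is `xen_intr_pirq_config_intr`, reached from `atrtc_attach`.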
 
Update: Reverting to 11.2-RELEASE fixed the problem. The release kernel (customized with the same config as above) boots and works as expected. Chalk it up to some dev snapshot "ickyness".
 