Kernel panic when trying to unload vmm

Hi guys,
On amd64 FreeBSD 10.2 (r289315) I hit a very specific bug: the kernel panics when I unload the vmm(4) module. I opened PR 203820 for it, hoping that somebody would notice.

In the meantime I'm trying to reproduce this on other servers/desktops with Intel CPUs (the issue occurred in the Intel part of the vmm(4) code). I tried a few i5/i7 desktop computers but didn't hit the bug. I also tried a virtual machine under VMware; no issue there either.

My CPU is:

Code:
CPU: Intel(R) Xeon(R) CPU E3-1240 V2 @ 3.40GHz (3392.37-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306a9  Family=0x6  Model=0x3a  Stepping=9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

If you have such a machine free, would you care to try it yourself? No configuration is needed; just load and unload the vmm(4) kernel module:

Code:
# kldload vmm
# kldunload vmm
 
I am pretty sure I have a different issue, but I can trigger a reboot with my AMD CPU when I try to unload vmm with a running VirtualBox VM. Just curious if you have one running?
 
Just curious if you have one running?
I do have VirtualBox running on the host where this failure occurred, if that's what you mean.

By triggering a reboot you mean panic? If so I'd be curious to see where it fails...
 
By triggering a reboot you mean panic?
No, the computer just resets. I managed to set up dumpdev correctly and got this:
Code:
Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 4; apic id = 14
instruction pointer	= 0x20:0xffffffff8264492d
stack pointer	        = 0x28:0xfffffe0660ec24e0
frame pointer	        = 0x28:0xfffffe0660ec2530
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 4677 (VirtualBox)
trap number		= 1
panic: privileged instruction fault
cpuid = 4
KDB: stack backtrace:
#0 0xffffffff80984e30 at kdb_backtrace+0x60
#1 0xffffffff809489e6 at vpanic+0x126
#2 0xffffffff809488b3 at panic+0x43
#3 0xffffffff80d4aadb at trap_fatal+0x36b
#4 0xffffffff80d4a75c at trap+0x75c
#5 0xffffffff80d307f2 at calltrap+0x8
 
If you have a dump available you could check (assuming a GENERIC kernel and that vmcore.0 is the latest):

Code:
# cd /usr/obj/usr/src/sys/GENERIC
# kgdb /boot/kernel/kernel /var/crash/vmcore.0
(kgdb) list *0xffffffff8264492d

to find out where it failed. It seems VirtualBox itself caused it. I'm very new to kernel debugging and this is unfortunately a bigger chunk than I can chew for now. As you mentioned though, it's not strictly related to the problem I have.
 
Actually, tobik, VirtualBox is the smoking gun here. Today I managed to do some tests. There's no problem (i.e. kldload/kldunload of vmm works) when:

a) no vbox modules are loaded
b) all vbox modules are loaded but no VM is running

It fails when a VM is running. Not bad for a start :) :beer:
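
Based on the test results above, a guard around the unload could look roughly like the sketch below. This is only an illustration, not a fix for the panic: the function names count_running_vms and safe_to_unload_vmm are made up for this example, and it assumes that `VBoxManage list runningvms` prints one line per running VM and nothing when none is running.

```sh
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Count running VirtualBox VMs.
# $1: captured output of `VBoxManage list runningvms`
#     (assumed: one non-empty line per running VM).
count_running_vms() {
    # grep -c prints the number of non-blank lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the function's status clean.
    printf '%s\n' "$1" | grep -c '[^[:space:]]' || true
}

# Succeed (status 0) only when no VirtualBox VM is running.
safe_to_unload_vmm() {
    [ "$(count_running_vms "$1")" -eq 0 ]
}

# Intended use, left commented out so that nothing is unloaded by accident:
#
# running=$(VBoxManage list runningvms)
# if safe_to_unload_vmm "$running"; then
#     kldunload vmm
# else
#     echo "VirtualBox VM is running; not unloading vmm" >&2
# fi
```

Keep in mind that VBoxManage only reports VMs that belong to the user running it, so the check has to be run as the user who owns the VMs (or once per such user) before root unloads the module.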
 
My guess is that it would crash in the same way if you tried it the opposite way: running a VM on bhyve, then trying to unload the vboxdrv module (I'm not sure that you are allowed to unload it anyway). Trying to change anything related to a hypervisor (like kernel modules) while running VMs is asking for trouble. :-)
 
My guess is that it would crash in the same way if you tried it the opposite way: running a VM on bhyve, then trying to unload the vboxdrv module (I'm not sure that you are allowed to unload it anyway). Trying to change anything related to a hypervisor (like kernel modules) while running VMs is asking for trouble. :)

Unloading a module that is in use is not possible. But I'm unloading a module which is not used at all. All VMs are from VirtualBox; vmm is just loaded and then unloaded. No bhyve VMs are defined or used.

From the little I know, the crash seems to occur during the cleanup phase of the vmm unload procedure, while vmx_disable() is executing. It might be that vmm steps on VirtualBox's toes here.
 
From the little I know, the crash seems to occur during the cleanup phase of the vmm unload procedure, while vmx_disable() is executing. It might be that vmm steps on VirtualBox's toes here.
I am going to have to try to get a better stack trace than the one I posted above just to see if it happens in vmx_disable too. If I ever figure out how to do this...
 
That PR is only about two weeks old, you shouldn't get worried yet. Perhaps if you posted about it to the relevant mailing list you could get more / faster attention to it. Not all the developers hang out here in the forums.
 
tingo Yeah, I had never opened a PR before, though I've been using FreeBSD for more than a decade now. I went through the howto but I thought bugs@ was the appropriate mailing list.

Not all the developers hang out here in the forums.
That's why I opened the PR. But I was wondering if anybody else has this issue.

junovitch@ Thanks for assigning it to the correct mailing list.
 