[Solved] Experienced random kernel panic (on logging out of X)

This is maybe the third installation of Release 10.0 on my computer, done in the hope of getting rid of this kernel panic, which has been haunting me for 2 months. Recompiling the kernel, rebuilding world, installing packages from Ports... I have been doing this since 7.2, but this is probably the first time I've experienced a kernel panic causing a reboot.
My make.conf file is almost empty except for CPUTYPE?=native.
After googling, I think solving it is completely beyond my scope. Hopefully the experts here can shed some light on a possible solution.
Thanks

My core dump
Code:
xxxxx.xxxx.xxx dumped core - see /var/crash/vmcore.0

Tue Apr  1 01:45:43 HKT 2014

FreeBSD xxxxx.xxxx.xxx 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r263914: Sun Mar 30 02:49:40 HKT 2014     root@xxxxx.xxxx.xxx:/usr/obj/usr/src/sys/MYKERNEL  amd64

panic: vm_fault: fault on nofault entry, addr: fffffe0000908000

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: vm_fault: fault on nofault entry, addr: fffffe0000908000
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff8050c790 at kdb_backtrace+0x60
#1 0xffffffff804d64c5 at panic+0x155
#2 0xffffffff806d143d at vm_fault_hold+0x14ed
#3 0xffffffff806cff07 at vm_fault+0x77
#4 0xffffffff807101e5 at trap_pfault+0x205
#5 0xffffffff8070f944 at trap+0x454
#6 0xffffffff806f6ee3 at calltrap+0x8
Uptime: 3h29m19s
Dumping 646 out of 4064 MB:..3%..13%..23%..33%..43%..52%..62%..72%..82%..92%

Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
219	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff804d6140 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:447
#2  0xffffffff804d6504 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:754
#3  0xffffffff806d143d in vm_fault_hold (map=0xfffff80002000000, 
    vaddr=<value optimized out>, fault_type=1 '\001', fault_flags=0, 
    m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:279
#4  0xffffffff806cff07 in vm_fault (map=0xfffff80002000000, 
    vaddr=<value optimized out>, fault_type=1 '\001', fault_flags=0)
    at /usr/src/sys/vm/vm_fault.c:224
#5  0xffffffff807101e5 in trap_pfault (frame=0xfffffe012379e770, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:775
#6  0xffffffff8070f944 in trap (frame=0xfffffe012379e770)
    at /usr/src/sys/amd64/amd64/trap.c:463
#7  0xffffffff806f6ee3 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:232
#8  0xffffffff80d1d89b in _nv000232rm () from /boot/modules/nvidia.ko
#9  0xfffffe000151f000 in ?? ()
#10 0xfffff800065f4d00 in ?? ()
#11 0xfffff80002a82000 in ?? ()
#12 0xfffff80002a82000 in ?? ()
#13 0xfffff800065f4d00 in ?? ()
#14 0xffffffff81251c16 in _nv000796rm () from /boot/modules/nvidia.ko
#15 0xfffffe000151f000 in ?? ()
#16 0xfffff800065f4d00 in ?? ()
#17 0xfffffe012379e960 in ?? ()
#18 0xfffff80002a82000 in ?? ()
#19 0xfffff800065f4d00 in ?? ()
#20 0xffffffff81253f32 in rm_free_unused_clients ()
   from /boot/modules/nvidia.ko
#21 0x0000000000018719 in ?? ()
#22 0x13609a80f07660b8 in ?? ()
#23 0x13609a81dee188b8 in ?? ()
#24 0x13609a81dee188b8 in ?? ()
#25 0x13609a8167abf4b8 in ?? ()
#26 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb) 

------------------------------------------------------------------------
 
Re: experienced random kernel panic(mostly on logging out X)

Boot and run Memtest86 for at least an hour or so. Errors like this are often caused by failing memory or a power supply problem. If this is a desktop computer, open it up and inspect the capacitors, particularly the ones near the processor.
 
Re: experienced random kernel panic(mostly on logging out X)

wblock, thanks for your reply

This is my old computer, a C2Q Q9300 that has been running 24/7 for 6 years in my room as my home server and main desktop. I used to replace the PSU about every 2 years when it failed, and before moving up to Release 10.0 it was running fine with FreeBSD 9.0. In fact, it passed Memtest86 a few days ago.
Actually, I have a new computer next to it, waiting to have Release 10.0 installed, but I want to make sure it's not a 10.0 bug causing this. You know how frustrating it is to see a kernel panic after hours or days of compiling a new OS x(

BTW, I found someone mentioning a "lock leak", and he has even posted a patch for the kernel... no idea if it's related :q

http://readlist.com/lists/freebsd.org/freebsd-current/21/105696.html
"From what I see, this is a lock leak, I forgot to unlock the map.
> > It is nice that it is so simple to reproduce the issue in your setup."
 
Re: experienced random kernel panic(mostly on logging out X)

The stack trace suggests a problem in the nvidia.ko kernel module. It can be very tough to debug since the driver is binary only. Ask this on the freebsd-x11 (or maybe freebsd-hackers) mailing list and you might get assistance from someone with better knowledge.
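
For anyone who wants to check a backtrace the same way, here is a minimal sketch in Python that lists which frames kgdb attributed to a loadable module. The function name `module_frames` and the regular expression are my own illustration, not part of any FreeBSD tool, and the sample text is a few frames copied from the dump above. It only assumes the "in symbol () from /path/module.ko" layout that kgdb printed in this thread.

```python
import re
from collections import Counter

# Matches kgdb frames such as
#   "#8  0xffffffff80d1d89b in _nv000232rm () from /boot/modules/nvidia.ko"
# The \s+ before "from" also covers the line break kgdb inserts in long frames.
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\S+) \(\)\s+from (\S+\.ko)")


def module_frames(backtrace: str):
    """Return (frame number, symbol, module path) for frames resolved to a .ko file."""
    return [(int(n), sym, path) for n, sym, path in FRAME_RE.findall(backtrace)]


# A few frames copied from the kgdb output earlier in this thread.
sample = """\
#7  0xffffffff806f6ee3 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:232
#8  0xffffffff80d1d89b in _nv000232rm () from /boot/modules/nvidia.ko
#9  0xfffffe000151f000 in ?? ()
#20 0xffffffff81253f32 in rm_free_unused_clients ()
   from /boot/modules/nvidia.ko
"""

frames = module_frames(sample)
counts = Counter(path for _, _, path in frames)

for number, symbol, path in frames:
    print(f"frame #{number}: {symbol} in {path}")
print(counts.most_common())
```

Frames that resolve to kernel source files ("at /usr/src/...") or to "??" are skipped on purpose, so whatever remains is the code running outside the base kernel at the time of the fault.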
 
Re: experienced random kernel panic(mostly on logging out X)

@kpa, you're probably right. On further examination of /var/log/messages around the time of the last kernel panic, which happened while logging out of KDE, I found some mentions of "kernel: NVRM":
Code:
.....
Apr  1 00:55:37 xxxxxxxx kernel: NVRM: Xid (0000:01:00): 44, Ch 0000000b, engmask 00000101, intr 10000000
Apr  1 00:55:37 xxxxxxxx pkg-static: gutenprint-ijs-5.2.8 installed
Apr  1 00:55:38 xxxxxxxx pkg-static: gutenprint-5.2.8 installed
Apr  1 00:55:38 xxxxxxxx pkg-static: gimp-2.8.10,2 installed
Apr  1 01:45:21 xxxxxxxx syslogd: kernel boot file is /boot/kernel/kernel
Apr  1 01:45:21 xxxxxxxx kernel: panic: vm_fault: fault on nofault entry, addr: fffffe0000908000
Apr  1 01:45:21 xxxxxxxx kernel: cpuid = 2
Apr  1 01:45:21 xxxxxxxx kernel: KDB: stack backtrace:
.......

It looks like the kernel panic was nvidia.ko related. I tried modifying the Makefile in the nvidia-driver port and disabling the checksum so that I could install the latest Nvidia proprietary driver, but I saw the same kernel panic after 1-2 days.
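
In case it helps someone else dig through their own logs, below is a rough Python sketch of the check I did by eye: pull out the "NVRM: Xid" lines and the "panic:" lines together with their timestamps. The names `scan_log`, `XID_RE` and `PANIC_RE` are made up for this example, and the patterns only assume the syslog layout shown in the excerpt above; other driver versions may format these lines differently.

```python
import re

# Patterns based only on the /var/log/messages excerpt in this thread.
XID_RE = re.compile(r"kernel: NVRM: Xid \(([^)]*)\): (\d+),")
PANIC_RE = re.compile(r"kernel: panic: (.*)")


def scan_log(lines):
    """Return (timestamp, kind, detail) tuples for NVRM Xid and panic lines."""
    events = []
    for line in lines:
        stamp = line[:15]  # classic syslog timestamp, e.g. "Apr  1 00:55:37"
        match = XID_RE.search(line)
        if match:
            events.append((stamp, "xid", match.group(2)))
            continue
        match = PANIC_RE.search(line)
        if match:
            events.append((stamp, "panic", match.group(1)))
    return events


# Lines copied from the log excerpt above (hostname already masked).
sample = [
    "Apr  1 00:55:37 xxxxxxxx kernel: NVRM: Xid (0000:01:00): 44, Ch 0000000b, engmask 00000101, intr 10000000",
    "Apr  1 00:55:38 xxxxxxxx pkg-static: gimp-2.8.10,2 installed",
    "Apr  1 01:45:21 xxxxxxxx kernel: panic: vm_fault: fault on nofault entry, addr: fffffe0000908000",
]

events = scan_log(sample)
for stamp, kind, detail in events:
    print(stamp, kind, detail)
```

To run it on a real log, something like `scan_log(open("/var/log/messages"))` should work, assuming the file is readable by your user. An Xid line shortly before a panic does not prove the driver is at fault, but it is a useful hint about where to look.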

I came across this thread on the FreeBSD-STABLE mailing list:
""panic: vm_fault: fault on nofault entry" in nvidia module on 10"
http://lists.freebsd.org/pipermail/freebsd-stable/2013-December/076552.html

It is very similar to my situation... the computer ran fine on 9.X but panics repeatedly on Release 10.0
...but that post has been left unanswered for 3 months.

Should I file a formal bug report with FreeBSD? (I've never done that before.)
 
Re: experienced random kernel panic(mostly on logging out X)

I thought I owed you guys an update on my problem.
And thanks a lot to wblock & kpa for suggesting possible hardware and/or nvidia driver causes of the kernel panics....
I disconnected my GT440 and did a thorough cleaning of the PCI-E slot with a soft brush. Not that I found a lot of dust there; I just did it as a last resort before declaring my motherboard/VGA card dead x(
After a week of testing my old buddy, there have been no more kernel panics, nor any hint of "kernel: panic: vm_fault: fault on nofault entry, addr:...." in the Xorg log after multiple log-ins/log-outs of X. :beer

I never thought it would end up like this, but words can't describe how happy I am now :beergrin
Thanks for all your replies and help!
 