kernel panic (backtrace: at krb5_unwrap+0x12c) when writing files over NFSv4

Hi,

I really want to narrow down this problem, because the system is failing to record some TV programs that I was looking forward to watching over the weekends.

I have FreeBSD-9.1 running as a KVM guest on Scientific Linux (kernel version 3.4.4).
I also have Debian running as another guest, providing NFSv4 with sec=krb5p.
My FreeBSD-9.1 records some TV programs and saves the encoded video files to NFSv4.
The kernel panic mostly happens while recording and saving a video file (100~300 MB).
It happened again just now, while I was writing this post.
I have set the following line in /etc/rc.conf so that a crash dump will be saved when it happens again:
Code:
dumpdev="AUTO"

I tried the freebsd-questions mailing list last month, but got no reply.
Any suggestions or pointers would be really appreciated!

# uname -a
Code:
FreeBSD freebsd.nj-k.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #5 r243328: Sat Nov 24 03:01:20 JST 2012 root@:/usr/obj/usr/src/sys/GENERIC  amd64

Part of /var/log/messages (every backtrace has frames #0 to #4 in common):
Code:
Jan 19 08:09:55 freebsd syslogd: kernel boot file is /boot/kernel/kernel
Jan 19 08:09:55 freebsd kernel: panic: stack overflow detected; backtrace may be corrupted
Jan 19 08:09:55 freebsd kernel: cpuid = 0
Jan 19 08:09:55 freebsd kernel: KDB: stack backtrace:
Jan 19 08:09:55 freebsd kernel: #0 0xffffffff809274a6 at kdb_backtrace+0x66
Jan 19 08:09:55 freebsd kernel: #1 0xffffffff808f13fe at panic+0x1ce
Jan 19 08:09:55 freebsd kernel: #2 0xffffffff8091a452 at __stack_chk_fail+0x12
Jan 19 08:09:55 freebsd kernel: #3 0xffffffff81613ee7 at krb5_unwrap_old+0x407
Jan 19 08:09:55 freebsd kernel: #4 0xffffffff8161413c at krb5_unwrap+0x12c
Jan 19 08:09:55 freebsd kernel: #5 0xffffffff816446e4 at xdr_rpc_gss_unwrap_data+0x164
Jan 19 08:09:55 freebsd kernel: #6 0xffffffff81642a9d at rpc_gss_validate+0x1bd
Jan 19 08:09:55 freebsd kernel: #7 0xffffffff80ad09be at clnt_vc_call+0x8de
Jan 19 08:09:55 freebsd kernel: #8 0xffffffff80acec0b at clnt_reconnect_call+0xfb
Jan 19 08:09:55 freebsd kernel: #9 0xffffffff807f07e5 at newnfs_request+0x595
Jan 19 08:09:55 freebsd kernel: #10 0xffffffff80826c42 at nfscl_request+0x72
Jan 19 08:09:55 freebsd kernel: #11 0xffffffff8080fcd4 at nfsrpc_write+0x4e4
Jan 19 08:09:55 freebsd kernel: #12 0xffffffff8081ea72 at ncl_writerpc+0x62
Jan 19 08:09:55 freebsd kernel: #13 0xffffffff80829f26 at ncl_doio+0x196
Jan 19 08:09:55 freebsd kernel: #14 0xffffffff8082e002 at nfssvc_iod+0xc2
Jan 19 08:09:55 freebsd kernel: #15 0xffffffff808c1f0f at fork_exit+0x11f
Jan 19 08:09:55 freebsd kernel: #16 0xffffffff80bc8c3e at fork_trampoline+0xe
Jan 19 08:09:55 freebsd kernel: Uptime: 8h4m36s
 
The system encountered a kernel panic twice, so I looked into those two crashes.
The first one was a page fault.
Code:
# less info.4
Dump header from device /dev/vtbd0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 924037120B (881 MB)
  Blocksize: 512
  Dumptime: Tue Jan  8 23:16:46 2013
  Hostname: freebsd.nj-k.org
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 9.1-PRERELEASE #5 r243328: Sat Nov 24 03:01:20 JST 2012
    root@:/usr/obj/usr/src/sys/GENERIC
  Panic String: page fault
  Dump Parity: 1640339569
  Bounds: 4
  Dump Status: good

And the second was a stack overflow.
Code:
# less info.5
Dump header from device /dev/vtbd0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 881512448B (840 MB)
  Blocksize: 512
  Dumptime: Sun Jan 20 06:10:07 2013
  Hostname: freebsd.nj-k.org
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 9.1-PRERELEASE #5 r243328: Sat Nov 24 03:01:20 JST 2012
    root@:/usr/obj/usr/src/sys/GENERIC
  Panic String: stack overflow detected; backtrace may be corrupted
  Dump Parity: 3349813351
  Bounds: 5
  Dump Status: good

Looking into the first crash dump, I found that the last mbuf in the chain had a length of zero.
If KASSERT had been enabled (it is only compiled in with the INVARIANTS kernel option), the system would have panicked with "Unexpected empty mbuf" instead.
It seems the second crash was also caused by an empty mbuf.
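
To make the condition concrete, here is a minimal userland sketch of the check I mean. It is not the kernel code: struct fake_mbuf and both helper functions are made-up names for illustration, and only the field names m_next and m_len are borrowed from the real struct mbuf.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the kernel's struct mbuf.
 * Only the two fields needed to walk a chain are modeled here.
 */
struct fake_mbuf {
	struct fake_mbuf *m_next;	/* next buffer in the chain, NULL at the end */
	int		  m_len;	/* number of data bytes held in this buffer */
};

/*
 * Return the 0-based position of the first zero-length buffer in the
 * chain, or -1 if every buffer in the chain carries data.
 */
int
first_empty_mbuf(const struct fake_mbuf *m)
{
	int i;

	for (i = 0; m != NULL; m = m->m_next, i++) {
		if (m->m_len == 0)
			return (i);
	}
	return (-1);
}

/* Return the total number of data bytes in the chain. */
int
chain_length(const struct fake_mbuf *m)
{
	int total = 0;

	for (; m != NULL; m = m->m_next)
		total += m->m_len;
	return (total);
}
```

For a three-buffer chain whose last buffer has m_len == 0, first_empty_mbuf() returns 2. That is the shape of the chain I saw in the first dump, with the empty buffer at the very end.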

I have pasted my output of kgdb here.
 