Panic in httpd

Hi,

On one of our FreeBSD (7.0 amd64) Apache (2.2.14) servers we are getting regular crashes, especially when the load is high. The only additional information I have is the Apache modules in use and the core dump of the system.

I'm at a loss as to how to proceed other than reinstalling the entire system, which is not an option at the moment given that we don't have any redundancy.

Could anyone please help me move the diagnosis forward?

Below is the output of a basic kgdb session. While I have written some C code in my day, I am not familiar with gdb, kernel programming or the syscalls involved, so go easy on me ;).


Code:
[root@www2 /opt/crash]# ls -alh
total 3285792
drwx------   2 root  wheel   512B May 19 17:20 .
drwxr-xr-x  10 root  wheel   512B Feb  9 15:38 ..
-rw-r--r--   1 root  wheel     2B May 19 17:20 bounds
-rw-------   1 root  wheel   436B Feb 15 14:42 info.0
-rw-------   1 root  wheel   436B May  2 00:02 info.1
-rw-------   1 root  wheel   436B May 13 11:39 info.2
-rw-------   1 root  wheel   436B May 14 17:31 info.3
-rw-------   1 root  wheel   435B May 19 17:20 info.4
-rw-------   1 root  wheel   684M Feb 15 14:43 vmcore.0
-rw-------   1 root  wheel   731M May  2 00:03 vmcore.1
-rw-------   1 root  wheel   689M May 13 11:40 vmcore.2
-rw-------   1 root  wheel   633M May 14 17:32 vmcore.3
-rw-------   1 root  wheel   656M May 19 17:21 vmcore.4
[root@www2 /opt/crash]# cat info.4 
Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 687824896B (655 MB)
  Blocksize: 512
  Dumptime: Wed May 19 17:18:09 2010
  Hostname: www2.anonymous.se
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-RELEASE #0: Fri Oct 10 12:24:42 UTC 2008
    drift@www2.anonymous.se:/usr/obj/usr/src/sys/KERNCONF
  Panic String: page fault
  Dump Parity: 779422760
  Bounds: 4
  Dump Status: good
[root@www2 /opt/crash]# cd /usr/obj/usr/src/sys/KERNCONF
[root@www2 /usr/obj/usr/src/sys/KERNCONF]# kgdb kernel.debug /opt/crash/vmcore.4 
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff80468d5f
stack pointer           = 0x10:0xffffffffb4eb09c0
frame pointer           = 0x10:0xffffff00182ca000
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 64033 (httpd)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 16h26m40s
Physical memory: 8178 MB
Dumping 655 MB: 640 624 608 592 576 560 544 528 512 496 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16

#0  doadump () at pcpu.h:194
194             __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff804775f9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff804779fd in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff807305c4 in trap_fatal (frame=0xffffff00612e8000, eva=18446742975826784256) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8073123f in trap (frame=0xffffffffb4eb0910) at /usr/src/sys/amd64/amd64/trap.c:251
#6  0xffffffff80716f3e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#7  0xffffffff80468d5f in lf_advlock (ap=Variable "ap" is not available.
) at /usr/src/sys/kern/kern_lockf.c:294
#8  0xffffffff8044ebbb in kern_fcntl (td=0xffffff00612e8000, fd=Variable "fd" is not available.
) at vnode_if.h:1036
#9  0xffffffff8044ef7f in fcntl (td=0xffffff00612e8000, uap=0xffffffffb4eb0be0) at /usr/src/sys/kern/kern_descrip.c:336
#10 0xffffffff80730c17 in syscall (frame=0xffffffffb4eb0c70) at /usr/src/sys/amd64/amd64/trap.c:852
#11 0xffffffff8071714b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#12 0x000000080112f07c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) list *0xffffffff80468d5f
0xffffffff80468d5f is in lf_advlock (/usr/src/sys/kern/kern_lockf.c:295).
290                                         (td->td_wmesg == lockstr) &&
291                                         (i++ < maxlockdepth)) {
292                                             waitblock = (struct lockf *)td->td_wchan;
293                                             /* Get the owner of the blocking lock */
294                                             waitblock = waitblock->lf_next;
295                                             if ((waitblock->lf_flags & F_POSIX) == 0)
296                                                     break;
297                                             nproc = (struct proc *)waitblock->lf_id;
298                                             if (nproc == (struct proc *)lock->lf_id) {
299                                                     PROC_SUNLOCK(wproc);
(kgdb) q
[root@www2 /usr/obj/usr/src/sys/KERNCONF]#
 
"Page fault; page not present" usually indicates a problem with memory. Check whether the RAM is still good. Also check for bad sectors on the hard drive; there may be some on the swap partition.
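
It may also be worth reading the trace itself before swapping hardware: the fault virtual address is 0x0, and the faulting instruction sits in lf_advlock at kern_lockf.c:294-295, where the code follows waitblock->lf_next and then reads lf_flags. The user-space sketch below only illustrates how a NULL lf_next would produce a read at address 0. The struct is a hypothetical, simplified stand-in (lockf_sketch), and the assumption that lf_flags is the first member has not been checked against the 7.0 source here; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct lockf.
 * Only fields touched at kern_lockf.c:292-297 are modelled, and the
 * field order (lf_flags first) is an assumption. */
struct lockf_sketch {
    short lf_flags;
    void *lf_id;
    struct lockf_sketch *lf_next;
};

/* The whole argument rests on this assumption, so make it explicit. */
static_assert(offsetof(struct lockf_sketch, lf_flags) == 0,
              "sketch assumes lf_flags is the first member");

/* Address a read of p->lf_flags would touch, without dereferencing p. */
static uintptr_t lf_flags_read_addr(const struct lockf_sketch *p)
{
    return (uintptr_t)p + offsetof(struct lockf_sketch, lf_flags);
}

/* Address a read of p->lf_next would touch, without dereferencing p. */
static uintptr_t lf_next_read_addr(const struct lockf_sketch *p)
{
    return (uintptr_t)p + offsetof(struct lockf_sketch, lf_next);
}

/* Guarded version of the step in the listing: follow lf_next and
 * return the owner id, or NULL if either pointer in the chain is NULL. */
static void *blocking_owner(const struct lockf_sketch *waitblock)
{
    if (waitblock == NULL)
        return NULL;
    waitblock = waitblock->lf_next;
    if (waitblock == NULL)
        return NULL;
    return waitblock->lf_id;
}
```

Under those assumptions, lf_flags_read_addr(NULL) is 0 while lf_next_read_addr(NULL) is not, which would fit a NULL lf_next at line 295 better than a NULL td_wchan at line 294. That would suggest a software fault in the lock code path reached through fcntl rather than bad RAM, but it is a reading of one trace, not a confirmed diagnosis. Comparing the backtraces in vmcore.0 through vmcore.3 would show whether every crash ends in the same place.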