hello
I have a FreeBSD 6.2-RELEASE-p8 server (http+mysql+pop3+imap+smtp) that has been crashing about twice a week for the past month or so.
Before that, it ran fine for at least 8 months without a single crash.
I have already replaced the RAM and the PSU, but the crashes keep happening.
I managed to capture a "top" snapshot less than 10 seconds before one of the crashes (I had set up a script to save it periodically):
Code:
last pid: 61576; load averages: 2.04, 2.38, 2.45 up 9+12:13:49 14:42:43
199 processes: 2 running, 196 sleeping, 1 zombie
Mem: 1245M Active, 347M Inact, 281M Wired, 100M Cache, 112M Buf, 31M Free
Swap: 2048M Total, 2096K Used, 2046M Free
  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
16167 mysql      19  20    0   488M 85556K kserel 0  53.3H 27.88% mysqld
61398 apache      1   4    0   168M 33868K sbwait 0   0:01  5.89% httpd
92707 apache      1   4    0   169M 64084K sbwait 0   1:51  3.42% httpd
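For anyone who wants to capture the same kind of pre-crash snapshot, here is a minimal Python sketch of a periodic saver. It is not the exact script that produced the output above. The names `save_snapshot` and `snapshot_loop`, the output directory, and the `top -b` batch-mode invocation are all assumptions for illustration; check top(1) on your system for the right flags.

```python
import os
import subprocess
import time
from pathlib import Path


def save_snapshot(cmd, out_dir, keep=30):
    """Run cmd once, save its stdout to a timestamped file, keep the newest `keep` files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    path = out_dir / f"snapshot-{time.time_ns()}.txt"
    with open(path, "w") as fh:
        fh.write(result.stdout)
        fh.flush()
        # Force the data to disk so it survives a panic a few seconds later.
        os.fsync(fh.fileno())
    # Nanosecond timestamps have a fixed width, so name order is time order.
    snapshots = sorted(out_dir.glob("snapshot-*.txt"))
    for old in snapshots[:-keep]:
        old.unlink()
    return path


def snapshot_loop(cmd, out_dir, interval=5.0, keep=30):
    """Save a snapshot every `interval` seconds until interrupted."""
    while True:
        save_snapshot(cmd, out_dir, keep=keep)
        time.sleep(interval)
```

Something like `snapshot_loop(["top", "-b"], "/var/tmp/top-snapshots", interval=5)` (hypothetical path) would then leave the last few snapshots on disk after a crash. The `fsync` call is the important part, since buffered writes may otherwise be lost when the kernel panics.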
And here is the crash dump with the kgdb backtrace:
Code:
Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address = 0x104
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc067a45d
stack pointer = 0x28:0xe4f58c90
frame pointer = 0x28:0xe4f58c9c
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 5 (thread taskq)
trap number = 12
panic: page fault
cpuid = 2
Uptime: 9d12h14m29s
Physical memory: 2039 MB
Dumping 338 MB: 323 307 291 275 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
#0 doadump () at pcpu.h:165
165 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0 doadump () at pcpu.h:165
#1 0xc0683236 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2 0xc068355d in panic (fmt=0xc08d7a75 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
#3 0xc0889c70 in trap_fatal (frame=0xe4f58c50, eva=260) at /usr/src/sys/i386/i386/trap.c:837
#4 0xc0889426 in trap (frame=
{tf_fs = -968949752, tf_es = -967507928, tf_ds = -453705688, tf_edi = -968921088, tf_esi = 4, tf_ebp = -453669732, tf_isp = -453669764,
tf_ebx = -960082340, tf_edx = 6, tf_ecx = 0, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1066949539, tf_cs = 32, tf_eflags = 65538,
tf_esp = -941363984, tf_ss = 4})
at /usr/src/sys/i386/i386/trap.c:270
#5 0xc087604a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#6 0xc067a45d in _mtx_lock_sleep (m=0xc6c64e5c, tid=3326046208, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:546
#7 0xc06c97c2 in unp_gc (arg=0x0, pending=1) at /usr/src/sys/kern/uipc_usrreq.c:1714
#8 0xc06a3edf in taskqueue_run (queue=0xc64c9080) at /usr/src/sys/kern/subr_taskqueue.c:257
#9 0xc06a43c2 in taskqueue_thread_loop (arg=0x1) at /usr/src/sys/kern/subr_taskqueue.c:376
#10 0xc066c979 in fork_exit (callout=0xc06a4330 <taskqueue_thread_loop>, arg=0xc09d7048, frame=0xe4f58d38) at /usr/src/sys/kern/kern_fork.c:821
#11 0xc08760ac in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
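A note on reading the trap frame in frame #4: kgdb prints the saved registers as signed 32-bit integers, which makes them hard to compare with the hex addresses in the panic message. A small Python sketch for converting them follows; the helper name `to_u32_hex` is just an illustration, and the values are copied from the dump above.

```python
def to_u32_hex(value):
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    return f"0x{value & 0xFFFFFFFF:08x}"


# Values copied from the trap frame printed by kgdb in frame #4.
frame = {
    "tf_eip": -1066949539,
    "tf_ebp": -453669732,
    "tf_edi": -968921088,
}

for name, value in frame.items():
    print(name, to_u32_hex(value))

# tf_eip -> 0xc067a45d, the instruction pointer from the panic (frame #6, _mtx_lock_sleep)
# tf_ebp -> 0xe4f58c9c, the frame pointer from the panic
```

Likewise, `eva=260` in frame #3 is 0x104, the fault virtual address reported in the panic message.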
Any ideas on whether I should replace some other hardware, or whether this is more likely a software/kernel problem?
The DC has offered to swap the entire box, keeping only the HDDs.
Uptime on this server is very important to me, so I'm looking for the lowest-risk way to debug this.
Thanks