Server crashes periodically after upgrading to 11.1-RELEASE-p9

Basically just what the title says. I downgraded to 11.1-RELEASE-p8 and all was well; I went back to -p9 and it started crashing again. I'd appreciate any help anyone can provide.

Here's some info. Please let me know if anything else is needed:

uname -a
Code:
FreeBSD 6.example.net 11.1-RELEASE-p9 FreeBSD 11.1-RELEASE-p9 #11: Tue Apr 24 19:55:09 UTC 2018     test@6.example.net:/usr/obj/usr/src/sys/6  amd64

dmesg
Code:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address    = 0x7c
fault code        = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff80859152
stack pointer            = 0x28:0xfffffe04569a2640
frame pointer            = 0x28:0xfffffe04569a2660
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process        = 12 (irq264: em0:rx0)
trap number        = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff806db147 at kdb_backtrace+0x67
#1 0xffffffff80699226 at vpanic+0x186
#2 0xffffffff80699093 at panic+0x43
#3 0xffffffff809ae342 at trap_fatal+0x322
#4 0xffffffff809ae39b at trap_pfault+0x4b
#5 0xffffffff809adbba at trap+0x2ca
#6 0xffffffff80990a20 at calltrap+0x8
#7 0xffffffff80859482 at udp_common_ctlinput+0x102
#8 0xffffffff807c3ca3 at icmp_input+0x733
#9 0xffffffff807c43af at ip_input+0x10f
#10 0xffffffff807a5350 at netisr_dispatch_src+0xa0
#11 0xffffffff807900df at ether_demux+0x13f
#12 0xffffffff80790d6b at ether_nh_input+0x31b
#13 0xffffffff807a5350 at netisr_dispatch_src+0xa0
#14 0xffffffff80790376 at ether_input+0x26
#15 0xffffffff8078ce0a at if_input+0xa
#16 0xffffffff803caf6c at em_rxeof+0x26c
#17 0xffffffff803ca903 at em_msix_rx+0x33
Uptime: 20h51m34s

kgdb kernel.debug /var/crash/vmcore.0
Code:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
interrupt enabled, resume, IOPL = 0
current process        = 12 (irq264: em0:rx0)
trap number        = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff806db147 at kdb_backtrace+0x67
#1 0xffffffff80699226 at vpanic+0x186
#2 0xffffffff80699093 at panic+0x43
#3 0xffffffff809ae342 at trap_fatal+0x322
#4 0xffffffff809ae39b at trap_pfault+0x4b
#5 0xffffffff809adbba at trap+0x2ca
#6 0xffffffff80990a20 at calltrap+0x8
#7 0xffffffff80859482 at udp_common_ctlinput+0x102
#8 0xffffffff807c3ca3 at icmp_input+0x733
#9 0xffffffff807c43af at ip_input+0x10f
#10 0xffffffff807a5350 at netisr_dispatch_src+0xa0
#11 0xffffffff807900df at ether_demux+0x13f
#12 0xffffffff80790d6b at ether_nh_input+0x31b
#13 0xffffffff807a5350 at netisr_dispatch_src+0xa0
#14 0xffffffff80790376 at ether_input+0x26
#15 0xffffffff8078ce0a at if_input+0xa
#16 0xffffffff803caf6c at em_rxeof+0x26c
#17 0xffffffff803ca903 at em_msix_rx+0x33
Uptime: 20h51m34s
Dumping 3001 out of 16323 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /usr/lib/debug//boot/kernel/geom_mirror.ko.debug...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from /usr/lib/debug//boot/kernel/accf_data.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /usr/lib/debug//boot/kernel/accf_http.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from /usr/lib/debug//boot/kernel/cc_htcp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/cc_htcp.ko
Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done.
done.
Loaded symbols for /boot/kernel/pf.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:229
229        __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:229
#1  0xffffffff80698da1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80699260 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80699093 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff809ae342 in trap_fatal (frame=0xfffffe04569a2580, eva=124) at /usr/src/sys/amd64/amd64/trap.c:826
#5  0xffffffff809ae39b in trap_pfault (frame=0xfffffe04569a2580, usermode=0) at pcpu.h:229
#6  0xffffffff809adbba in trap (frame=0xfffffe04569a2580) at /usr/src/sys/amd64/amd64/trap.c:416
#7  0xffffffff80990a20 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:232
#8  0xffffffff80859152 in udp_notify (inp=0xfffff800325e2740, errno=65) at /usr/src/sys/netinet/udp_usrreq.c:732
#9  0xffffffff80859482 in udp_common_ctlinput (cmd=18, sa=<value optimized out>, vip=0xfffff80029f9b02a, pcbinfo=0xffffffff813262f0)
    at /usr/src/sys/netinet/udp_usrreq.c:779
#10 0xffffffff807c3ca3 in icmp_input (mp=<value optimized out>, offp=<value optimized out>, proto=<value optimized out>)
    at /usr/src/sys/netinet/ip_icmp.c:529
#11 0xffffffff807c43af in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:823
#12 0xffffffff807a5350 in netisr_dispatch_src (proto=1, source=<value optimized out>, m=<value optimized out>) at /usr/src/sys/net/netisr.c:1120
#13 0xffffffff807900df in ether_demux (ifp=0xfffff800063ba000, m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:850
#14 0xffffffff80790d6b in ether_nh_input (m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:639
#15 0xffffffff807a5350 in netisr_dispatch_src (proto=5, source=<value optimized out>, m=<value optimized out>) at /usr/src/sys/net/netisr.c:1120
#16 0xffffffff80790376 in ether_input (ifp=<value optimized out>, m=0x0) at /usr/src/sys/net/if_ethersubr.c:759
#17 0xffffffff8078ce0a in if_input (ifp=<value optimized out>, sendmp=<value optimized out>) at /usr/src/sys/net/if.c:4007
#18 0xffffffff803caf6c in em_rxeof (count=99) at /usr/src/sys/dev/e1000/if_em.c:4880
#19 0xffffffff803ca903 in em_msix_rx (arg=0xfffff80006678200) at /usr/src/sys/dev/e1000/if_em.c:1673
#20 0xffffffff8065f86c in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff80006671100) at /usr/src/sys/kern/kern_intr.c:1262
#21 0xffffffff8065fb56 in ithread_loop (arg=0xfffff8000669bf20) at /usr/src/sys/kern/kern_intr.c:1275
#22 0xffffffff8065cec5 in fork_exit (callout=0xffffffff8065fa80 <ithread_loop>, arg=0xfffff8000669bf20, frame=0xfffffe04569a2ac0)
    at /usr/src/sys/kern/kern_fork.c:1042
#23 0xffffffff809916ee in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:815
#24 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)
 
This has happened to me before: regular nightly crashes at 03:00, apparently triggered by some cron job. They started after a FreeBSD upgrade. Most people on the FreeBSD lists told me to run memtest. Bingo: the memory usage pattern had changed, so the system was now hitting a bad page in RAM that the previous version never touched.
 

Thanks for the tip. I will run memtest and see if it comes up with any errors.

I had a very similar problem with a FreeBSD 10.1 patch that made the server crash every time Apache was restarted. It produced almost exactly the same output as the problem I'm having now. The difference is that this time it seems to be something with the network card, as the current process is always listed as "12 (irq264: em0:rx0)".
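
To check whether every crash really does point at the same interrupt thread, it can help to pull the same few fields out of each saved panic message and compare them. Here is a minimal sketch in Python, assuming the panic text from dmesg has been saved as a string. The function name `summarize_panic` and the dictionary keys are made up for illustration; this is not a FreeBSD tool.

```python
import re

def summarize_panic(text):
    """Pull a few comparable fields out of a FreeBSD panic message (illustrative helper)."""
    fields = {}

    # Matches lines such as "fault virtual address    = 0x7c".
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if m:
        fields["fault_address"] = int(m.group(1), 16)

    # Matches lines such as "current process        = 12 (irq264: em0:rx0)".
    m = re.search(r"current process\s*=\s*(\d+) \((.*)\)", text)
    if m:
        fields["pid"] = int(m.group(1))
        fields["process"] = m.group(2)

    # KDB frames look like "#7 0xffffffff80859482 at udp_common_ctlinput+0x102".
    frames = re.findall(r"^#\d+ 0x[0-9a-f]+ at (\w+)\+0x[0-9a-f]+", text, re.M)

    # Record the first frame KDB lists below calltrap, i.e. below the trap handlers.
    if "calltrap" in frames:
        idx = frames.index("calltrap")
        if idx + 1 < len(frames):
            fields["first_frame_after_trap"] = frames[idx + 1]

    return fields
```

Running this over the dmesg output from each crash and comparing the resulting dictionaries shows at a glance whether the process and the frame below the trap handlers stay the same. For the panic above, `fault_address` comes out as 124, which is the same value kgdb reports as `eva=124` in the `trap_fatal` frame.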

Guess we'll see what the reply to the bug report is and if memtest comes up with anything.
 