Kernel fault, supervisor read data, page not present

Hi,

Can anyone help me find the cause of this panic?
http://piccy.info/view3/8295820/3900b14c1b9833a4ba9773d743bec932/

uname -a:
Code:
FreeBSD ****** 10.1-RELEASE-p3 FreeBSD 10.1-RELEASE-p3 #13 r276782: Thu Jan  8 00:28:53 EET 2015  root@*****:/usr/obj/usr/src/sys/GENERIC  amd64
dmesg:
Code:
Copyright (c) 1992-2014 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.1-RELEASE-p3 #13 r276782: Thu Jan  8 00:28:53 EET 2015
  root@inkiev.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Xeon(R) CPU  L5640  @ 2.27GHz (2266.79-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206c2  Family = 0x6  Model = 0x2c  Stepping = 2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x9ee3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  VT-x: (disabled in BIOS) PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 103083409408 (98308 MB)
avail memory = 100121870336 (95483 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <092810 APIC1354>
FreeBSD/SMP: Multiprocessor System Detected: 12 CPUs
FreeBSD/SMP: 2 package(s) x 6 core(s)
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  2
cpu2 (AP): APIC ID:  4
cpu3 (AP): APIC ID: 16
cpu4 (AP): APIC ID: 18
cpu5 (AP): APIC ID: 20
cpu6 (AP): APIC ID: 32
cpu7 (AP): APIC ID: 34
cpu8 (AP): APIC ID: 36
cpu9 (AP): APIC ID: 48
cpu10 (AP): APIC ID: 50
cpu11 (AP): APIC ID: 52
ioapic0: Changing APIC ID to 1
ioapic1: Changing APIC ID to 3
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
random: <Software, Yarrow> initialized
kbd1 at kbdmux0
acpi0: <SMCI > on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of 400, 100 (3) failed
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
cpu4: <ACPI CPU> on acpi0
cpu5: <ACPI CPU> on acpi0
cpu6: <ACPI CPU> on acpi0
cpu7: <ACPI CPU> on acpi0
cpu8: <ACPI CPU> on acpi0
cpu9: <ACPI CPU> on acpi0
cpu10: <ACPI CPU> on acpi0
cpu11: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
igb0: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xec00-0xec1f mem 0xfbde0000-0xfbdfffff,0xfbdc0000-0xfbddffff,0xfbd9c000-0xfbd9ffff irq 28 at device 0.0 on pci1
igb0: Using MSIX interrupts with 9 vectors
igb0: Ethernet address: 00:25:90:0c:d9:08
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: Bound queue 4 to cpu 4
igb0: Bound queue 5 to cpu 5
igb0: Bound queue 6 to cpu 6
igb0: Bound queue 7 to cpu 7
igb1: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0xe880-0xe89f mem 0xfbd60000-0xfbd7ffff,0xfbd40000-0xfbd5ffff,0xfbd1c000-0xfbd1ffff irq 40 at device 0.1 on pci1
igb1: Using MSIX interrupts with 9 vectors
igb1: Ethernet address: 00:25:90:0c:d9:09
igb1: Bound queue 0 to cpu 8
igb1: Bound queue 1 to cpu 9
igb1: Bound queue 2 to cpu 10
igb1: Bound queue 3 to cpu 11
igb1: Bound queue 4 to cpu 0
igb1: Bound queue 5 to cpu 1
igb1: Bound queue 6 to cpu 2
igb1: Bound queue 7 to cpu 3
pcib2: <ACPI PCI-PCI bridge> at device 3.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 7.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 9.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pci0: <base peripheral, interrupt controller> at device 20.0 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 20.1 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 20.2 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 20.3 (no driver attached)
uhci0: <Intel 82801JI (ICH10) USB controller USB-D> port 0xdc00-0xdc1f irq 16 at device 26.0 on pci0
uhci0: LegSup = 0x2f00
usbus0 on uhci0
uhci1: <Intel 82801JI (ICH10) USB controller USB-E> port 0xd880-0xd89f irq 21 at device 26.1 on pci0
uhci1: LegSup = 0x2f00
usbus1 on uhci1
uhci2: <Intel 82801JI (ICH10) USB controller USB-F> port 0xd800-0xd81f irq 19 at device 26.2 on pci0
uhci2: LegSup = 0x2f00
usbus2 on uhci2
ehci0: <Intel 82801JI (ICH10) USB 2.0 controller USB-B> mem 0xfbeda000-0xfbeda3ff irq 18 at device 26.7 on pci0
usbus3: EHCI version 1.0
usbus3 on ehci0
uhci3: <Intel 82801JI (ICH10) USB controller USB-A> port 0xd480-0xd49f irq 23 at device 29.0 on pci0
uhci3: LegSup = 0x2f00
usbus4 on uhci3
uhci4: <Intel 82801JI (ICH10) USB controller USB-B> port 0xd400-0xd41f irq 19 at device 29.1 on pci0
uhci4: LegSup = 0x2f00
usbus5 on uhci4
uhci5: <Intel 82801JI (ICH10) USB controller USB-C> port 0xd080-0xd09f irq 18 at device 29.2 on pci0
uhci5: LegSup = 0x2f00
usbus6 on uhci5
ehci1: <Intel 82801JI (ICH10) USB 2.0 controller USB-A> mem 0xfbed8000-0xfbed83ff irq 23 at device 29.7 on pci0
usbus7: EHCI version 1.0
usbus7 on ehci1
pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci6: <ACPI PCI bus> on pcib6
vgapci0: <VGA-compatible display> mem 0xf9000000-0xf9ffffff,0xfaffc000-0xfaffffff,0xfb000000-0xfb7fffff irq 18 at device 1.0 on pci6
vgapci0: Boot video device
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH10 SATA300 controller> port 0xd000-0xd007,0xcc00-0xcc03,0xc880-0xc887,0xc800-0xc803,0xc480-0xc48f,0xc400-0xc40f irq 19 at device 31.2 on pci0
ata2: <ATA channel> at channel 0 on atapci0
ata3: <ATA channel> at channel 1 on atapci0
atapci1: <Intel ICH10 SATA300 controller> port 0xc000-0xc007,0xbc00-0xbc03,0xb880-0xb887,0xb800-0xb803,0xb480-0xb48f,0xb400-0xb40f irq 19 at device 31.5 on pci0
ata4: <ATA channel> at channel 0 on atapci1
ata5: <ATA channel> at channel 1 on atapci1
acpi_button0: <Power Button> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
qpi0: <QPI system bus> on motherboard
pcib7: <QPI Host-PCI bridge> pcibus 255 on qpi0
pci255: <PCI bus> on pcib7
pcib8: <QPI Host-PCI bridge> pcibus 254 on qpi0
pci254: <PCI bus> on pcib8
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xc9fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est0 attach returned 6
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est1 attach returned 6
p4tcc1: <CPU Frequency Thermal Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est2 attach returned 6
p4tcc2: <CPU Frequency Thermal Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est3 attach returned 6
p4tcc3: <CPU Frequency Thermal Control> on cpu3
est4: <Enhanced SpeedStep Frequency Control> on cpu4
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est4 attach returned 6
p4tcc4: <CPU Frequency Thermal Control> on cpu4
est5: <Enhanced SpeedStep Frequency Control> on cpu5
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est5 attach returned 6
p4tcc5: <CPU Frequency Thermal Control> on cpu5
est6: <Enhanced SpeedStep Frequency Control> on cpu6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est6 attach returned 6
p4tcc6: <CPU Frequency Thermal Control> on cpu6
est7: <Enhanced SpeedStep Frequency Control> on cpu7
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est7 attach returned 6
p4tcc7: <CPU Frequency Thermal Control> on cpu7
est8: <Enhanced SpeedStep Frequency Control> on cpu8
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est8 attach returned 6
p4tcc8: <CPU Frequency Thermal Control> on cpu8
est9: <Enhanced SpeedStep Frequency Control> on cpu9
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est9 attach returned 6
p4tcc9: <CPU Frequency Thermal Control> on cpu9
est10: <Enhanced SpeedStep Frequency Control> on cpu10
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est10 attach returned 6
p4tcc10: <CPU Frequency Thermal Control> on cpu10
est11: <Enhanced SpeedStep Frequency Control> on cpu11
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 11
device_attach: est11 attach returned 6
p4tcc11: <CPU Frequency Thermal Control> on cpu11
random: unblocking device.
usbus0: 12Mbps Full Speed USB v1.0
Timecounters tick every 1.000 msec
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 480Mbps High Speed USB v2.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 12Mbps Full Speed USB v1.0
usbus6: 12Mbps Full Speed USB v1.0
usbus7: 480Mbps High Speed USB v2.0
ugen1.1: <Intel> at usbus1
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen0.1: <Intel> at usbus0
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen4.1: <Intel> at usbus4
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
ugen3.1: <Intel> at usbus3
uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
ugen2.1: <Intel> at usbus2
uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen6.1: <Intel> at usbus6
uhub5: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus6
ugen5.1: <Intel> at usbus5
uhub6: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
ugen7.1: <Intel> at usbus7
uhub7: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus7
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
ada0 at ata2 bus 0 scbus0 target 0 lun 0
ada0: <Hitachi HDT721010SLA360 ST6OA3AA> ATA-8 SATA 2.x device
ada0: Serial Number STH607MS2WVLLS
ada0: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
cd0 at ata5 bus 0 scbus3 target 0 lun 0
cd0: <TEAC DV-28S-V 1.0B> Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
ada1 at ata2 bus 0 scbus0 target 1 lun 0
ada1: <SAMSUNG HD103UJ 1AA01118> ATA-7 SATA 2.x device
ada1: Serial Number S13PJ90S700753
ada1: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad5
ada2 at ata3 bus 0 scbus1 target 0 lun 0
ada2: <HITACHI HUA7210SASUN1.0T 0944GVG6AF GKAOAC5A> ATA-7 SATA 2.x device
ada2: Serial Number GTF002PBJVG6AF
ada2: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad6
ada3 at ata3 bus 0 scbus1 target 1 lun 0
ada3: <SAMSUNG HD103UJ 1AA01118> ATA-7 SATA 2.x device
ada3: Serial Number S13PJ90S700768
ada3: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada3: Previously was known as ad7
SMP: AP CPU #1 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #11 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #8 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #10 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #9 Launched!
SMP: AP CPU #7 Launched!
Timecounter "TSC-low" frequency 1133397386 Hz quality 1000
GEOM_MIRROR: Cancelling unmapped because of ada3.
GEOM_MIRROR: Cancelling unmapped because of ada1.
GEOM_MIRROR: Device mirror/gm1 launched (1/2).
GEOM_MIRROR: Device gm1: rebuilding provider ada1.
GEOM_MIRROR: Cancelling unmapped because of ada2s1.
GEOM_MIRROR: Cancelling unmapped because of ada0s1.
GEOM_MIRROR: Device mirror/gm-root launched (1/2).
GEOM_MIRROR: Device gm-root: rebuilding provider ada0s1.
Root mount waiting for: usbus7 usbus3
Root mount waiting for: usbus7 usbus3
uhub3: 6 ports with 6 removable, self powered
uhub7: 6 ports with 6 removable, self powered
Trying to mount root from ufs:/dev/mirror/gm-roota [rw]...
WARNING: / was not properly dismounted
WARNING: /: mount pending error: blocks 92 files 7
ugen1.2: <American Megatrends Inc.> at usbus1
ukbd0: <Keyboard Interface> on usbus1
ugen4.2: <APC> at usbus4
kbd2 at ukbd0
ukbd1: <APC Keyboard> on usbus4
kbd3 at ukbd1
WARNING: / was not properly dismounted
WARNING: /raid was not properly dismounted
WARNING: /raid: mount pending error: blocks 6536 files 164
ums0: <Mouse Interface> on usbus1
ums0: 3 buttons and [XY] coordinates ID=0
ums1: <APC Mouse> on usbus4
ums1: 5 buttons and [XYZ] coordinates ID=0
WARNING: attempt to domain_add(netgraph) after domainfinalize()
tun0: link state changed to UP
 
Do you have a crash dump in /var/crash? The panic message alone doesn't reveal much; analyzing the crash dump is the only way to see what state the system was in at the time.
 
The "supervisor read data, page not present" panic is usually caused by bad memory. You might want to check that first. Another possibility is a bad block inside the swap partition, but in that case I would have expected to see something about the swapper process in the backtrace.
 
Running sysrc dumpdev=AUTO, or manually setting dumpdev="AUTO" in /etc/rc.conf, would enable crash dumps, assuming you have a swap partition for the kernel to dump memory to when it panics. Given that your first post showed 145 days of uptime, I think SirDice is spot on. Perhaps a latent static discharge has finally started to show its impact. I would probably go straight to running MemTest86 to vet the memory.
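A minimal sketch of the rc.conf side of this, assuming the stock FreeBSD rc.conf variables (dumpdev, plus dumpdir with its default of /var/crash). It shows roughly what sysrc dumpdev=AUTO would leave in the file; it is not a complete crash-dump procedure:

```shell
# /etc/rc.conf fragment (rc.conf is plain sh variable assignments)

# AUTO: use the first suitable swap device listed in /etc/fstab as the dump device.
dumpdev="AUTO"

# Directory where savecore(8) stores recovered dumps; this is the default,
# shown here only to make the location explicit.
dumpdir="/var/crash"
```

With this in place, a dump written at panic time should be picked up by savecore(8) on the next boot and saved under dumpdir, provided the swap device is large enough to hold it.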
 
Quote:
The "supervisor read data, page not present" panic is usually caused by bad memory. You might want to check that first.
That was very true back in the days when hardware design was a lot sloppier. I once spoke (via a translator) with the designer of a popular PC chipset which will remain nameless but sounds sort of like "Sympathy" who told me that he wasn't concerned about some rare cache coherency issues in his chipset because "Windows doesn't stay up that long anyway".

These days, if it happens in the same place (same backtrace) over and over again, it is almost definitely a kernel bug. If it changes (as this one seems to), then it may well be memory. It could also be a problem where the kernel or driver writes to the wrong memory location(s), causing a subsequent access to "go off into the weeds" due to a corrupted pointer. Those are a lot harder to track down.
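The "same backtrace or not" check can be sketched roughly as follows. This assumes each panic's backtrace has been saved to a text file with frames in the usual "#N 0xADDRESS at function+0xOFFSET" form. The helper names bt_signature and compare_backtraces are made up for illustration, and the result is only a hint, not a diagnosis:

```shell
# Reduce a panic backtrace (read from stdin) to its sequence of function
# names, dropping frame numbers, addresses, and offsets.
bt_signature() {
    sed -n 's/^#[0-9][0-9]* 0x[0-9a-f]* at \([^+]*\)+0x.*/\1/p'
}

# Compare two saved backtraces by function sequence.
# Prints "same", "different", or "unknown" (no frames recognized).
compare_backtraces() {
    a=$(bt_signature < "$1")
    b=$(bt_signature < "$2")
    if [ -z "$a" ] || [ -z "$b" ]; then
        echo unknown
    elif [ "$a" = "$b" ]; then
        echo same
    else
        echo different
    fi
}
```

Calling compare_backtraces on two such files prints "same" when the function sequences match, which points toward a software bug, and "different" when they do not, which is more consistent with a hardware fault such as bad memory.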

In any event, downloading an ISO or USB stick image of Memtest86+ and booting it is an easy way to see if it is a memory issue.

It could also be a power supply issue.

Crash dumps have been problematic for me for years (I think it dates from SMP with Giant-only locking). The last time I needed one was early on in FreeBSD 8, and it was not possible to get one (I think my last successful dump was back in FreeBSD 5.x on a UP system). I worked with various developers on getting some of the problems fixed (interrupts happening on a processor other than the one that panicked was one of the issues) but was never able to get a clean dump. This may well have been fixed by FreeBSD 10; fortunately, I haven't had any system panics in a long time.
 
Quote:
That was very true back in the days when hardware design was a lot sloppier. I once spoke (via a translator) with the designer of a popular PC chipset which will remain nameless but sounds sort of like "Sympathy" who told me that he wasn't concerned about some rare cache coherency issues in his chipset because "Windows doesn't stay up that long anyway".

These days, if it happens in the same place (same backtrace) over and over again, it is almost definitely a kernel bug. If it changes (as this one seems to), then it may well be memory. It could also be a problem where the kernel or driver writes to the wrong memory location(s), causing a subsequent access to "go off into the weeds" due to a corrupted pointer. Those are a lot harder to track down.

Very true indeed, but my experience tells me it's usually memory. Although I did have some badly behaving drivers that showed similar panics, the number of times this happened pales in comparison to the number of panics caused by bad memory. Bad memory is also a lot easier to test, hence my recommendation to test it first, even if only to rule it out as a possible cause.
 