FreeBSD 8.2 kernel panics

Hi

I am using a self-written kernel netgraph module for high-volume traffic processing on FreeBSD 8 boxes. Traffic is processed in netgraph without being passed to the kernel IP stack. Netgraph nodes are connected at boot time and are not changed during processing. Peak load is 3-4 Gbps, about 600-800 Kpps, with no more than 50% CPU utilization.

The problem: sometimes the system crashes with kernel fatal traps 9, 11, or 12. The crashes are not related to load: most incidents happen in off-peak hours, when traffic is 2-3 times lower than at peak. Usually the system crashes after an uptime of a week or more. Sometimes it enters a crash/reboot cycle and stays in it for 10-12 hours, with uptimes of 30-90 minutes.

Only 10G systems (ix) are affected; em-based systems work fine. The problem occurs on servers from different vendors, all Intel Xeon based. FreeBSD 7 boxes run stably with the same module.

Monitoring doesn't show any signs of problems prior to a crash/reboot. CPU and network utilization are monitored, along with some vmstat data related to the module.

Here are some system setup details:
uname -a
Code:
FreeBSD 8.2-RELEASE-p2 #0: Mon Aug  1 20:13:04 UTC 2011     root@c2ip-test02.is74.ru:/usr/obj/usr/src/sys/IP  amd64

sysctl.conf
Code:
net.link.ether.ipfw=1
kern.ipc.nmbclusters=409600
net.graph.recvspace=40960
net.graph.maxdgram=40960
kern.ipc.maxsockbuf=2097152
hw.intr_storm_threshold=9000

loader.conf
Code:
hw.em.txd=4096
hw.em.rxd=4096
hw.ixgbe.txd=4096
hw.ixgbe.rxd=4096
net.graph.maxalloc=128000
net.graph.maxdata=128000
net.graph.threads=4

Kernel is compiled with (diff to GENERIC)
Code:
-ident		GENERIC
+ident		IP

-options 	FLOWTABLE		# per-cpu routing cache
+#options 	FLOWTABLE		# per-cpu routing cache

+options         IPFIREWALL
+options         IPFIREWALL_VERBOSE
+options         IPFIREWALL_VERBOSE_LIMIT=100
+options         IPFIREWALL_DEFAULT_TO_ACCEPT
+options         IPFIREWALL_FORWARD
+options         DUMMYNET

At the moment I am unable to get a kernel dump; for some reason it isn't saved after a crash. As an option, I will try the i386 version of FreeBSD 8. Because of the Intel ix cards I am unable to downgrade the system to FreeBSD 7.

I would be glad to hear suggestions or provide additional information to investigate and fix the problem.
 
I've seen similar problems on a server acting as a router (quagga / bgpd / zebra / 15k routes) and mpd5 (PPPoE) server, on the 6th day after reboot.
The following kernel option fixed the problem in my case.
Code:
nooptions 	FLOWTABLE
 
Thanks,

FLOWTABLE is already off; I mentioned that in the kernel configuration diff above.

I still wasn't able to get a core dump on amd64 (yes, I've read all the man pages), but to catch the problem we changed the OS on one host to FreeBSD 8.2-RELEASE/i386.

The same day we got four kernel core dumps showing that the system takes a SEGV in the 'ix' driver.
That made little sense, as we have other applications processing up to 15 GBs on one box with 'ix' network cards without issues.

After doing some research and profiling, I found a problem in our own module that could lead to a buffer overflow. Once I could trigger it, it reliably reproduced the same kernel panic in the ix driver.
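
For illustration only: the details of the overflow are not given here. Assuming it was a copy of packet data into a fixed-size buffer without a length check, the usual guard looks roughly like the userspace sketch below. The names `pkt_slot`, `SLOT_SIZE` and `slot_store` are hypothetical and are not taken from the actual module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 2048	/* hypothetical per-packet buffer size */

/* Hypothetical fixed-size slot that a processing node copies packets into. */
struct pkt_slot {
	uint8_t	data[SLOT_SIZE];
	size_t	len;
};

/*
 * Copy a packet payload into a slot.
 * Returns 0 on success, or -1 if an argument is NULL or the payload
 * does not fit; on failure the slot is left untouched.
 */
static int
slot_store(struct pkt_slot *slot, const uint8_t *payload, size_t payload_len)
{
	if (slot == NULL || payload == NULL)
		return (-1);
	if (payload_len > sizeof(slot->data))
		return (-1);	/* reject instead of writing past the end */

	memcpy(slot->data, payload, payload_len);
	slot->len = payload_len;

	/* Invariant: the recorded length never exceeds the buffer. */
	assert(slot->len <= sizeof(slot->data));
	return (0);
}
```

Without a check like this, an oversized packet overwrites whatever memory follows the buffer, and the panic then tends to show up later in unrelated code that touches the damaged memory. That would be consistent with the traces pointing at the ix driver rather than at the module.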

We'll keep monitoring, but it is 99% likely that this was not a driver or kernel bug.
 