Hello everybody!
I have a few servers running FreeBSD 7.2 or 8.1 with MPD 5.5 for PPPoE connections. I updated MPD to version 5.6 (from ports, with this patch applied) and added a patch to support the CoA RAD_CLASS attribute:
Code:
--- ../mpd-5.6/src/radsrv.c 2011-12-21 23:58:49.000000000 +0900
+++ ./src/radsrv.c 2012-04-02 19:02:26.106800017 +0900
@@ -94,6 +94,7 @@
Bund B;
Link L;
char *tmpval;
+ u_char *rad_class = NULL;
char *username = NULL, *called = NULL, *calling = NULL, *sesid = NULL;
char *msesid = NULL, *link = NULL, *bundle = NULL, *iface = NULL;
int nasport = -1, serv_type = 0, ifindex = -1, i;
@@ -163,6 +164,13 @@
Log(LG_RADIUS2, ("radsrv: Got RAD_USER_NAME: %s",
username));
break;
+ case RAD_CLASS:
+ tmpval = Bin2Hex(data, len);
+ Log(LG_RADIUS2, ("radsrv: Got RAD_CLASS: %s",
+ tmpval));
+ Freee(tmpval);
+ rad_class = Mdup(MB_AUTH, data, len);
+ break;
case RAD_NAS_IP_ADDRESS:
nas_ip = rad_cvt_addr(data);
Log(LG_RADIUS2, ("radsrv: Got RAD_NAS_IP_ADDRESS: %s ",
@@ -509,6 +517,8 @@
ACLCopy(acl_queue, &L->lcp.auth.params.acl_queue);
ACLCopy(acl_table, &L->lcp.auth.params.acl_table);
#endif /* USE_IPFW */
+ if (rad_class)
+ L->lcp.auth.params.class=rad_class;
#ifdef USE_NG_BPF
for (i = 0; i < ACL_FILTERS; i++) {
ACLDestroy(L->lcp.auth.params.acl_filters[i]);
After this update, the servers started to panic and reboot about once a week. The reasons vary, but usually it is:
Code:
kgdb /boot/kernel/kernel /var/crash/vmcore.2
...
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xc4de1d73
stack pointer = 0x28:0xc3f92670
frame pointer = 0x28:0xc3f926c0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 26 (em1 taskq)
trap number = 18
...
(kgdb) list *0xc4de1d73
0xc4de1d73 is in bpf_filter (/usr/src/sys/modules/netgraph/bpf/../../../net/bpf_filter.c:461).
456 case BPF_ALU|BPF_MUL|BPF_K:
457 A *= pc->k;
458 continue;
459
460 case BPF_ALU|BPF_DIV|BPF_K:
461 A /= pc->k;
462 continue;
463
464 case BPF_ALU|BPF_AND|BPF_K:
465 A &= pc->k;
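As far as I understand, the line kgdb points at divides the accumulator by the constant operand of the filter instruction, and an integer divide fault on an unsigned division means that operand was zero. So one thing I could check is whether any filter program handed to ng_bpf contains a BPF_ALU|BPF_DIV|BPF_K instruction with k == 0. Below is a minimal userland sketch of such a check. The struct and the demo_* names are my own stand-ins for illustration (not taken from MPD or the kernel); only the opcode values follow the classic BPF encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_insn from <net/bpf.h>. */
struct demo_insn {
	uint16_t code;	/* opcode: class | operation | source */
	uint8_t  jt;	/* jump offset if true */
	uint8_t  jf;	/* jump offset if false */
	uint32_t k;	/* constant operand */
};

/* Classic BPF opcode fields, copied here so the sketch builds anywhere. */
#define DEMO_BPF_CLASS(c)	((c) & 0x07)
#define DEMO_BPF_OP(c)		((c) & 0xf0)
#define DEMO_BPF_SRC(c)		((c) & 0x08)
#define DEMO_BPF_ALU		0x04
#define DEMO_BPF_DIV		0x30
#define DEMO_BPF_K		0x00

/*
 * Return the index of the first ALU|DIV|K instruction whose constant
 * divisor is zero, or -1 if the program contains none.
 */
int
demo_find_zero_div(const struct demo_insn *prog, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (DEMO_BPF_CLASS(prog[i].code) == DEMO_BPF_ALU &&
		    DEMO_BPF_OP(prog[i].code) == DEMO_BPF_DIV &&
		    DEMO_BPF_SRC(prog[i].code) == DEMO_BPF_K &&
		    prog[i].k == 0)
			return ((int)i);
	}
	return (-1);
}
```

This only catches a constant zero divisor. A division by the X register (BPF_ALU|BPF_DIV|BPF_X) can only be checked while the filter runs, so it is not covered by this sketch.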
Sometimes there are other errors, but bpf_filter always appears in the output of the "where" command in kgdb. All my kernels have these additional options:
Code:
options IPFIREWALL
options IPDIVERT
options IPFIREWALL_FORWARD
options NETGRAPH
options NETGRAPH_IPFW
options NETGRAPH_PPPOE
options NETGRAPH_IFACE
options DEVICE_POLLING
options HZ=1000
I have also changed these sysctl variables:
Code:
net.inet.icmp.icmplim=800
net.inet.flowtable.enable=0
net.isr.direct=1
kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.point_to_point=0
kern.random.sys.harvest.interrupt=0
net.inet.ip.fastforwarding=1
vm.pmap.shpgperproc=2048
net.isr.maxthreads=2
net.isr.bindthreads=1
There are about 200 users on each server, and pppoe-delay is set to 3 or 4 (see this patch).
What could be causing these kernel panics?