FBSD 8.0 RELEASE server lagging very badly

Hello, we need help with a production FreeBSD 8.0-RELEASE server that is failing to respond in a timely fashion to any input. The server has cPanel installed on it, and because of this we had to rebuild the GENERIC kernel with the option quotas enabled. However, the server was already suffering slowdowns before we did any of this.
We were seeing "Approaching limit on PV entries.." in dmesg, so we changed two kernel variables: kern.ipc.shm_use_phys=1 and vm.pmap.shpgperproc=1400.
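For anyone repeating this: as far as I know, vm.pmap.shpgperproc is a boot-time tunable, so it belongs in /boot/loader.conf and needs a reboot to take effect. A minimal sketch of setting it without leaving duplicate lines behind follows. The helper name set_tunable and the sample file path are made up for illustration; it writes to a local file rather than the real /boot/loader.conf.

```shell
#!/bin/sh
# Set name="value" in a loader.conf-style file, replacing any earlier
# assignment of the same name. set_tunable is a hypothetical helper.
set_tunable() {
    name="$1"; value="$2"; file="$3"
    touch "$file"
    # Keep every line that does not start with "name=".
    awk -v n="$name" 'index($0, n "=") != 1' "$file" > "${file}.tmp"
    # Append the new assignment and swap the file into place.
    printf '%s="%s"\n' "$name" "$value" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
}

# Illustration only: a local sample file, not /boot/loader.conf.
set_tunable vm.pmap.shpgperproc 1400 ./loader.conf.sample
```

Running sysctl vm.pmap.shpgperproc after the reboot should confirm the value that is actually in use.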
The Apache server responds perfectly until n requests, and then suddenly begins lagging for like 5-10 seconds, then starts responding again.

The behavior is like this:
Code:
[root@apollo /var/log]# time ls
auth.log                exim                    sendmail.st.10
auth.log.0.bz2          lastlog                 sendmail.st.2
auth.log.1.bz2          lpd-errs                sendmail.st.3
auth.log.2.bz2          maillog                 sendmail.st.4
auth.log.3.bz2          maillog.0.bz2           sendmail.st.5
auth.log.4.bz2          maillog.1.bz2           sendmail.st.6
auth.log.5.bz2          maillog.2.bz2           sendmail.st.7
auth.log.6.bz2          maillog.3.bz2           sendmail.st.8
auth.log.7.bz2          maillog.4.bz2           sendmail.st.9
chkservd.log            maillog.5.bz2           setuid.today
cpupdate.env            maillog.6.bz2           setuid.yesterday
cron                    maillog.7.bz2           stunnel-4.15-build.log
cron.0.bz2              messages                userlog
cron.1.bz2              messages.0.bz2          wtmp
cron.2.bz2              messages.1.bz2          wtmp.0
cron.3.bz2              messages.2.bz2          wtmp.1
dcpumon                 messages.3.bz2          xferlog
debug.log               messages.4.bz2          xferlog.0.bz2
debug.log.0.bz2         messages.5.bz2          xferlog.1.bz2
debug.log.1.bz2         mount.today             xferlog.2.bz2
debug.log.2.bz2         mount.yesterday         xferlog.3.bz2
debug.log.3.bz2         pf.today                xferlog.4.bz2
debug.log.4.bz2         ppp.log                 xferlog.5.bz2
debug.log.5.bz2         quota_enable.log        xferlog.6.bz2
debug.log.6.bz2         restartsrv_err.log      xferlog.7.bz2
debug.log.7.bz2         security                xferlog.offset
dmesg                   sendmail.st             xferlog.offsetftpsep
dmesg.today             sendmail.st.0
dmesg.yesterday         sendmail.st.1

real    0m4.352s
user    0m0.001s
sys     0m0.000s
[root@apollo /var/log]# time ls
auth.log                exim                    sendmail.st.10
auth.log.0.bz2          lastlog                 sendmail.st.2
auth.log.1.bz2          lpd-errs                sendmail.st.3
auth.log.2.bz2          maillog                 sendmail.st.4
auth.log.3.bz2          maillog.0.bz2           sendmail.st.5
auth.log.4.bz2          maillog.1.bz2           sendmail.st.6
auth.log.5.bz2          maillog.2.bz2           sendmail.st.7
auth.log.6.bz2          maillog.3.bz2           sendmail.st.8
auth.log.7.bz2          maillog.4.bz2           sendmail.st.9
chkservd.log            maillog.5.bz2           setuid.today
cpupdate.env            maillog.6.bz2           setuid.yesterday
cron                    maillog.7.bz2           stunnel-4.15-build.log
cron.0.bz2              messages                userlog
cron.1.bz2              messages.0.bz2          wtmp
cron.2.bz2              messages.1.bz2          wtmp.0
cron.3.bz2              messages.2.bz2          wtmp.1
dcpumon                 messages.3.bz2          xferlog
debug.log               messages.4.bz2          xferlog.0.bz2
debug.log.0.bz2         messages.5.bz2          xferlog.1.bz2
debug.log.1.bz2         mount.today             xferlog.2.bz2
debug.log.2.bz2         mount.yesterday         xferlog.3.bz2
debug.log.3.bz2         pf.today                xferlog.4.bz2
debug.log.4.bz2         ppp.log                 xferlog.5.bz2
debug.log.5.bz2         quota_enable.log        xferlog.6.bz2
debug.log.6.bz2         restartsrv_err.log      xferlog.7.bz2
debug.log.7.bz2         security                xferlog.offset
dmesg                   sendmail.st             xferlog.offsetftpsep
dmesg.today             sendmail.st.0
dmesg.yesterday         sendmail.st.1

real    0m0.001s
user    0m0.001s
sys     0m0.000s
[root@apollo /var/log]#

You can clearly see that the first time the server took more than 4 seconds to respond to the ls on /var/log, and the second time it responded in a timely fashion.

We have already googled everywhere but cannot find anything that helps. Any advice or suggestions would be very welcome.
 
okasion said:
because of this we had to rebuild the GENERIC kernel with the option quotas enabled
Never, ever change GENERIC! People will ask you to run uname -a and it will show up as GENERIC even though it's not. This will inevitably lead to confusion. Always create a new kernel image with a different name (ident).
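A rough sketch of what that could look like: a small config file that pulls in GENERIC, sets its own ident, and adds the quota option. The name APOLLO is only an example taken from your hostname, and the file is written to the current directory here for illustration.

```shell
#!/bin/sh
# Write a custom kernel config that inherits GENERIC but carries its own ident.
# write_kernel_config is a hypothetical helper; APOLLO is an example name.
write_kernel_config() {
    ident="$1"
    out="$2"
    cat > "$out" <<EOF
include GENERIC
ident   $ident
options QUOTA
EOF
}

# Illustration only: creates ./APOLLO in the current directory.
write_kernel_config APOLLO ./APOLLO
```

On a real system the file would go under /usr/src/sys/<arch>/conf/ and be built from /usr/src with make buildkernel KERNCONF=APOLLO followed by make installkernel KERNCONF=APOLLO. After that, uname -a reports APOLLO instead of GENERIC.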

The Apache server responds perfectly until n requests, and then suddenly begins lagging for like 5-10 seconds, then starts responding again.
I had similar issues with Samba and NFS: after a lot of network traffic, everything seemed to stall for a few seconds. This seems to have disappeared somewhere during 8.0-STABLE. You could update to 8.1-RELEASE.

You can clearly see that the first time the server took more than 4 seconds to respond to the ls on /var/log, and the second time it responded in a timely fashion.
The second run came from cache. That's what it's for.
 
Thanks for your input.
Apparently, we solved the problem by setting "KeepAlive Off" in httpd.conf. It does make some sense, because we were seeing lots of httpd@127.0.0.1 requests. Now, with the server's 8 GB of RAM, everything, including the console, is running pretty well, if not perfectly.
We do not yet know whether this problem is related to cPanel, nor whether this is the final solution.
 
lots of httpd@127.0.0.1 requests

I am not familiar with cPanel, but "lots of httpd@127.0.0.1 requests" sounds to me more like a problem with the websites you host on this server.

How many seconds was your "KeepAliveTimeout" set to before you disabled KeepAlive?
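If you are not sure, something like the following should list the keep-alive directives that are currently active in a config file. This is only a sketch: show_keepalive is a made-up helper name, and the path in the usage comment is an assumption that may differ on a cPanel install.

```shell
#!/bin/sh
# Print the uncommented keep-alive directives from an Apache config file.
# Directive names are matched case-insensitively, as Apache treats them.
show_keepalive() {
    grep -Ei '^[[:space:]]*(KeepAlive|KeepAliveTimeout|MaxKeepAliveRequests)[[:space:]]' "$1"
}

# Usage (the path is an assumption, adjust it to your install):
#   show_keepalive /usr/local/apache/conf/httpd.conf
```

Commented-out lines are skipped, so what it prints is what Apache reads from that file. Included files would need to be checked separately.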

You can use the Apache Extended Server Status (mod_status with "ExtendedStatus On") to figure out what the Apache children are doing.

Code:
Current Time: Thursday, 26-Aug-2010 15:19:41 CEST
Restart Time: Wednesday, 25-Aug-2010 16:05:52 CEST
Parent Server Generation: 0
Server uptime: 23 hours 13 minutes 49 seconds
Total accesses: 807484 - Total Traffic: 34.7 GB
CPU Usage: u17.6641 s38.5313 cu744.016 cs0 - .957% CPU load
9.66 requests/sec - 435.3 kB/second - 45.1 kB/request
34 requests currently being processed, 41 idle workers
_K_C_____.___K_CK_K_KK...KK.___K..__K_K_K.KKKK_K.K_.___K..KK....
....__.K__.._K__.K_WK_K._.K_..._......__...K.._K...R._..........
..............K.................................................
................................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
 
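To see at a glance how many workers sit in Keepalive (K) compared with the other states, the scoreboard string can be tallied with a few lines of shell. This is a minimal sketch: the function names are made up, and the sample string is invented, not taken from a real server.

```shell
#!/bin/sh
# Count how often one state character occurs in the scoreboard read from stdin.
count_state() {
    tr -cd "$1" | wc -c | tr -d ' '
}

# Print "<state> <count>" for every state that occurs at least once,
# in the order of the scoreboard key.
summarize_scoreboard() {
    board=$(cat)
    for state in _ S R W K D C L G I .; do
        n=$(printf '%s' "$board" | count_state "$state")
        if [ "$n" -gt 0 ]; then
            printf '%s %s\n' "$state" "$n"
        fi
    done
}

# Example with a short, made-up scoreboard string.
printf '%s' '_K_C_W..K' | summarize_scoreboard
```

On a live server the input could come from the "Scoreboard:" line of the machine-readable server-status?auto page, assuming mod_status is enabled and reachable. A K count that stays high while requests stall would point at keep-alive connections holding the workers.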