FreeBSD 7.2 freezing and locking up with lighttpd + PHP FastCGI

Hi all,

I need some help here.

My brand-new server is freezing! One to three times a day it locks up, with lighttpd driving CPU usage to 100%.

It seems to be related to lighttpd + PHP FastCGI.
When the server locks up, I try to kill -9 the lighttpd process, but it just doesn't work:
"Waiting for PIDS: 77498 77498 77498 77498 77498 77498 77498 77498 77498 77498 77498..."
(and it never stops)

-------------------------------------------------
Some info about my scenario:
-------------------------------------------------
Code:
FreeBSD 7.2-RELEASE #0: Tue Jul 28 19:36:22 BRT 2009
    root@server.com:/usr/src/sys/amd64/compile/www
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5410 @ 2.33GHz (2327.51-MHz K8-class CPU)

Origin = "GenuineIntel"  Id = 0x1067a  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x40ce3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,<b19>,XSAVE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>

Cores per package: 4
usable memory = 4276776960 (4078 MB)
avail memory  = 4104232960 (3914 MB)
ACPI APIC Table: <DELL   PE_SC3  >
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
-------------------------------------------------
mysql-server-5.1.36
xcache-1.2.2
php5-5.2.10
lighttpd-1.4.23
-------------------------------------------------

Right after I restart lighttpd, free shows:
Code:
mem_wire:         507826176 (    484MB) [ 12%]
mem_active:  +    308924416 (    294MB) [  7%]
mem_inactive:+    360247296 (    343MB) [  8%]
mem_cache:   +      5332992 (      5MB) [  0%]
mem_free:    +   2943963136 (   2807MB) [ 71%]
mem_gap_vm:  +       487424 (      0MB) [  0%]
-------------------------------------------------
Some time later, active rises to 30-40%, inactive rises to 30-38%, and free drops to 3-5%.
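
A low free figure on its own does not necessarily mean the box is out of memory, since FreeBSD can usually reclaim inactive and cache pages when it needs them. As a rough sketch only (the function names are made up for illustration, and the parsing assumes exactly the line layout pasted above), the reclaimable total could be added up like this:

```python
import re

# Sample copied from the output above (taken right after a lighttpd restart).
SAMPLE = """\
mem_wire:         507826176 (    484MB) [ 12%]
mem_active:  +    308924416 (    294MB) [  7%]
mem_inactive:+    360247296 (    343MB) [  8%]
mem_cache:   +      5332992 (      5MB) [  0%]
mem_free:    +   2943963136 (   2807MB) [ 71%]
mem_gap_vm:  +       487424 (      0MB) [  0%]
"""

# Name, optional "+", then the byte count; the MB and % columns are ignored.
LINE_RE = re.compile(r"^(mem_\w+):\s*\+?\s*(\d+)", re.MULTILINE)


def parse_mem(text):
    """Return {name: bytes} for every mem_* line in the pasted output."""
    return {name: int(value) for name, value in LINE_RE.findall(text)}


def reclaimable_bytes(stats):
    """Bytes that are free or can normally be reclaimed (inactive + cache + free)."""
    return stats["mem_inactive"] + stats["mem_cache"] + stats["mem_free"]


if __name__ == "__main__":
    stats = parse_mem(SAMPLE)
    print("free MB:       ", stats["mem_free"] // 2**20)
    print("reclaimable MB:", reclaimable_bytes(stats) // 2**20)
```

Comparing this total at restart and shortly before a lockup would show whether memory is really being exhausted or only moving from free to inactive.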

The lighttpd status page reports 8-30 simultaneous users and ~300kb/s each time I refresh it (F5).
Code:
Uptime 14 min 3 s 
Started at 2009-08-24 21:36:35 
absolute (since start) 
Requests 8 kreq 
Traffic 96.29 Mbyte 

average (since start) 
Requests 9 req/s 
Traffic 116.97 kbyte/s 

average (5s sliding average) 
Requests 16 req/s 
Traffic 168.89 kbyte/s

All my pages are dynamic PHP.

Today it stopped at 19:30, according to the Cacti graphs, which show 100% CPU utilization:
it jumped from the normal 2-5% CPU (at 19:25) to 100% (at 19:30).

lighttpd.error.log shows nothing beyond the usual "Connection reset by peer" (an annoying error that I just can't get rid of :-( ):
Code:
2009-08-24 15:33:20: (connections.c.132) (warning) close: 26 Connection reset by peer
2009-08-24 16:38:39: (connections.c.132) (warning) close: 23 Connection reset by peer
2009-08-24 16:47:54: (connections.c.132) (warning) close: 20 Connection reset by peer

Kill did not work, so I ran "reboot" over ssh.

2009-08-24 20:32:30: (log.c.172) server started
2009-08-24 20:52:53: (server.c.1495) server stopped by UID = 0 PID = 0
2009-08-24 20:53:16: (log.c.172) server started

Nothing useful in /var/log/messages either.


/usr/local/etc/lighttpd.conf
Code:
## Tweaks I am trying
server.max-keep-alive-requests = 4
server.max-keep-alive-idle = 4
server.max-fds = 6144
server.max-connections = 2048
server.max-read-idle = 20
server.max-write-idle = 180

fastcgi.server             = ( ".php" =>
                               ( "localhost" =>
                                 (
                                   "socket" => "/var/run/lighttpd/php-fastcgi.socket",
                                   "bin-path" => "/usr/local/bin/php-cgi",
                                   "max-procs"         => 4,
                                   "min-procs"         => 1,
                                   "max-load-per-proc" => 10,
                                   "idle-timeout"      => 20,
                                   "bin-environment" => (
                                                "PHP_FCGI_CHILDREN" => "2",
                                                "PHP_FCGI_MAX_REQUESTS" => "500"
                                                        ),
                                   "bin-copy-environment" => (""),
                                   "broken-scriptfilename" => "enable"
                                 )
                               )
                            )
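
As a rough capacity check on the block above: when lighttpd spawns php-cgi itself, each of the max-procs parent processes forks PHP_FCGI_CHILDREN workers, and the parents only supervise them. A minimal sketch of that arithmetic follows (the function name is hypothetical, and it assumes all max-procs parents are actually running):

```python
def php_cgi_processes(max_procs, fcgi_children):
    """Return (request_workers, total_processes) for a lighttpd-spawned php-cgi pool.

    With PHP_FCGI_CHILDREN set to 0, php-cgi does not fork and each spawned
    process serves requests itself.
    """
    if fcgi_children == 0:
        return max_procs, max_procs
    workers = max_procs * fcgi_children        # processes that serve requests
    total = max_procs * (fcgi_children + 1)    # workers plus one parent each
    return workers, total


if __name__ == "__main__":
    # Values from the fastcgi.server block above.
    workers, total = php_cgi_processes(max_procs=4, fcgi_children=2)
    print(f"{workers} request-serving workers, {total} php-cgi processes in total")
```

With max-procs = 4 and PHP_FCGI_CHILDREN = 2 this gives 8 workers across 12 processes, so at most 8 PHP requests can be handled at the same moment.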

Thank you in advance for any pointers or help.
Francisco
 
We're having the same issues with PHP 5.2.10 and PHP 5.2.9...

Code:
FreeBSD jail.coccozella.com 7.2-RELEASE-p2 FreeBSD 7.2-RELEASE-p2 #0: Wed Jun 24 00:57:44 UTC 2009     root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386

From the logs, the crash sometimes seemed to be triggered by a post in phpBB 3, and it was triggered reliably when we called the following function with a large amount of data:

Code:
<?php
// POST $data to $url through the HTTP stream wrapper and return the response body.
function do_post_request($url, $data, $optional_headers = null){
	$params = array('http' => array(
		'method' => 'POST',
		'content' => $data
		));
	if ($optional_headers !== null) {
		$params['http']['header'] = $optional_headers;
	}
	$ctx = stream_context_create($params);
	$fp = fopen($url, 'rb', false, $ctx);
	// Check the handle before reading from it.
	// Note: $php_errormsg is only populated when track_errors is enabled.
	if (!$fp) {
		throw new Exception("Problem with $url, $php_errormsg");
	}
	$response = @stream_get_contents($fp);
	fclose($fp);
	if ($response === false) {
		throw new Exception("Problem reading data from $url, $php_errormsg");
	}

	return $response;
}
?>

Here's the odd bit: we copied over PHP 5.2.9 from a FreeBSD 7.0 box and it seems to be stable again, even though a stripped-down PHP 5.2.9 that we compiled on this box kept keeling over.

We're running it under VMware and can't get into the box after it crashes: when you try to log in on the console, it shows the last login time and goes no further.
 
Hi. I modified some settings in /etc/sysctl.conf and lighttpd.conf, and my server has now been up for 5 days.

# cat /etc/sysctl.conf
Code:
net.inet.icmp.icmplim=150
net.inet.icmp.maskrepl=0
net.inet.icmp.drop_redirect=1
net.inet.icmp.bmcastecho=0
net.inet.tcp.icmp_may_rst=0

net.inet.tcp.drop_synfin=1
kern.ipc.somaxconn=4096
net.inet.ip.fw.one_pass=1

net.inet.tcp.msl=7500
net.inet.ip.stealth=0

net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1

net.inet.ip.rtexpire=2
net.inet.ip.rtminexpire=2
net.inet.ip.rtmaxcache=256

net.inet.ip.accept_sourceroute=0
net.inet.ip.sourceroute=0

kern.ipc.shmmax=134217728
kern.ipc.shmall=32768
kern.ipc.semmap=256

kern.maxfiles=65536

kern.ipc.maxsockets="131072"
vm.pmap.pg_ps_enabled=1
vm.kmem_size=1G
kern.ipc.nmbclusters=32768
security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
security.bsd.unprivileged_read_msgbuf=0
kern.maxfilesperproc=52428
kern.threads.max_threads_per_proc=4096
net.inet.tcp.keepinit=1000
net.inet.ip.portrange.last=65535
net.inet.ip.portrange.first=10000

# cat /usr/local/etc/lighttpd.conf
Code:
server.max-keep-alive-requests = 4
server.max-keep-alive-idle = 4
server.max-fds = 2048
server.max-connections = 412
server.max-read-idle = 20
server.max-write-idle = 180

server.event-handler = "select"
server.network-backend = "freebsd-sendfile"
net.inet.tcp.recvspace = 4096


fastcgi.server             = ( ".php" =>
                               ( "localhost" =>
                                 (
                                   "socket" => "/var/run/lighttpd/php-fastcgi.socket",
                                   "bin-path" => "/usr/local/bin/php-cgi",
                                   "max-procs"         => 4,
                                   "min-procs"         => 1,
                                   "max-load-per-proc" => 10,
                                   "idle-timeout"      => 20,
                                   "bin-environment" => (
                                                "PHP_FCGI_CHILDREN" => "2",
                                                "PHP_FCGI_MAX_REQUESTS" => "500"
                                                        ),
                                   "bin-copy-environment" => (""),
                                   "broken-scriptfilename" => "enable"
                                 )
                               )
                            )
 
Lighttpd is known to have memory leak problems. I had a scenario similar to yours: no matter what I changed in the kernel parameters or the lighty config file, the crashes kept happening randomly. I also had a dynamic PHP site. I switched to Nginx and never looked back.

Besides, I see that you have only two PHP_FCGI_CHILDREN; don't you think that is way too small? Try increasing the number of children and see what happens. Also, when you start lighty, look at its process memory size via top or ps, and when it hits the crash again, capture that size; you should see a massive difference.
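
To make that before/after comparison easier, the size could be logged periodically instead of checked by hand. The following is only a sketch under some assumptions: the helper names are hypothetical, it shells out to ps with the rss and state keywords (available on FreeBSD and Linux), and the PID has to be supplied by you, for example from lighttpd's pid file.

```python
import subprocess
import time


def parse_ps_output(text):
    """Parse 'RSS STATE' as printed by ps; return (rss_kb, state) or None."""
    fields = text.split()
    if len(fields) < 2 or not fields[0].isdigit():
        return None
    return int(fields[0]), fields[1]


def sample_process(pid):
    """Return (rss_kb, state) for pid, or None if ps is missing or the pid is gone."""
    try:
        result = subprocess.run(
            ["ps", "-o", "rss=", "-o", "state=", "-p", str(pid)],
            capture_output=True, text=True,
        )
    except OSError:
        return None
    if result.returncode != 0:
        return None
    return parse_ps_output(result.stdout)


def log_samples(pid, count, interval_s, out=print):
    """Write `count` timestamped samples for pid, one every interval_s seconds."""
    for i in range(count):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        sample = sample_process(pid)
        if sample is None:
            out(f"{stamp} pid={pid} no data (process gone or ps unavailable)")
            return
        rss_kb, state = sample
        out(f"{stamp} pid={pid} rss_kb={rss_kb} state={state}")
        if i + 1 < count:
            time.sleep(interval_s)
```

Calling log_samples(pid, 60, 60) would record one sample a minute for an hour. A steadily growing rss_kb would support the leak theory, while a state containing D (uninterruptible wait) at the time of the lockup could explain why kill -9 appears to do nothing.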
 