Hi, I need to know what the lockf state is and how to diagnose high system CPU usage. The problem is as follows. The server is a web server; it runs no mysqld and serves only dynamic content, no static media.
When I was using Apache with mod_php and the load got high enough, all http processes would enter the lockf state and the entire HTTP server remained locked until forcefully restarted. When this occurred, CPU usage was quite low and there was no obvious resource saturation (CPU, I/O, etc.). Enabling vfs shared lookups eased the problem but didn't solve it.
I have now switched Apache to threaded mode and moved PHP out to a FastCGI setup. I have tested both fcgid and FPM.
Now HTTP itself has no issues, so the PHP side of the serving is the problem. But I get the same issue: when there are enough connections to the server, system% load jumps in one go from about 10% to 90%, and then all processes enter the lockf state.
I suspect something in the web code itself is doing something FreeBSD doesn't like; however, I don't know how to diagnose it from here, as I cannot see the following:
1 - a breakdown of what's using sys% CPU.
2 - live status of files being accessed on the HDDs.
Here is a snapshot from systat. I even have /tmp on a RAM disk, as it was being flooded with PHP session files.
Code:
2 users Load 3.12 2.76 2.24 Jan 9 06:25
Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 1604080 8560 7193324 11168 5817080 count
All 1726784 12964 1084943k 139136 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt 144 cow 1616 total
1 1k 507 1759 1388 16 117 1668 1023 zfod atkbd0 1
ozfod ata0 irq14
74.5%Sys 0.0%Intr 0.9%User 0.0%Nice 24.6%Idle %ozfod uhci4 22
| | | | | | | | | | | daefr 400 cpu0: time
=====================================> prcfr ciss0 256
19 dtbuf totfr 16 bce1 258
Namei Name-cache Dir-cache 206497 desvn react 400 cpu2: time
Calls hits % hits % 36777 numvn pdwak 400 cpu3: time
2319 2319 100 33420 frevn pdpgs 400 cpu1: time
intrn
Disks md0 da0 pass0 634752 wire
KB/t 0.00 0.00 0.00 1566956 act
tps 0 0 0 83288 inact
MB/s 0.00 0.00 0.00 236 cache
%busy 0 0 0 5817032 free
So only 74% sys, but that's enough: currently all php-fpm processes are in the lockf state and either not responding or responding very slowly. If I forcefully restart php-fpm, it instantly becomes responsive again, with sys% down to around 10%.
The hardware is 4 SCSI HDDs in RAID 10 and a 4-core Intel Xeon, running FreeBSD 8.2 64-bit with the GENERIC kernel.
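
To make clear what I mean by the lockf state: as far as I understand it, a process shows up there while it sleeps waiting for an advisory file lock (flock/fcntl/lockf) that some other process holds. The sketch below is only an illustration of that contention, not the code running on the server. The file name sess_example and the helper names are made up, and it uses non-blocking locks so it returns immediately; a blocking call at the contended step is where a process would sit waiting.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Try to take an exclusive advisory lock on path without blocking.

    Returns the open file descriptor on success, or None if the lock
    is already held through another open of the same file.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Lock is held elsewhere; a blocking flock() would sleep here.
        os.close(fd)
        return None
    return fd


def demo():
    """Return (first_got_lock, second_was_refused, third_got_lock)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical file name, standing in for any locked file.
        path = os.path.join(tmpdir, "sess_example")

        first = try_exclusive_lock(path)    # uncontended, should succeed
        second = try_exclusive_lock(path)   # contended, should be refused

        fcntl.flock(first, fcntl.LOCK_UN)   # holder releases the lock
        third = try_exclusive_lock(path)    # lock is free again

        results = (first is not None, second is None, third is not None)
        for fd in (first, second, third):
            if fd is not None:
                os.close(fd)
        return results


if __name__ == "__main__":
    print(demo())
```

If many workers queue on the same lock like this, they would all appear stuck at once, which is at least consistent with what I see, though I have not confirmed which file or lock is involved.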