Streaming Server Bottleneck

This is a production streaming server running nginx and munin. Its only purpose is to serve static video using nginx (no PHP, no MySQL).

All statistics and monitoring tools show the server as healthy (at least to my understanding), other than gstat, which shows 100% busy at a network transfer rate of 33 MB/s. If I restart nginx, the server can even push 40 MB/s with gstat showing only around 40% busy. Over time, however, % busy creeps back up to 100% and L(q) shoots to 10 instead of staying mostly at 0 (which is the case right after a restart).

I believe it has something to do with the many concurrent connections.
[CMD]netstat -a[/CMD] shows 230 "ESTABLISHED" connections. SSH interactivity is a bit slow. As soon as I stop nginx, everything is fine again.
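A quick way to get that count without scanning the full netstat output (a portable one-liner; the 230 figure is of course specific to this server):

```shell
# count ESTABLISHED TCP connections as seen by netstat
count=$(netstat -an 2>/dev/null | grep -c ESTABLISHED)
echo "$count"
```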

I had set mp4_buffer_size to 5 MB in nginx. Maybe that's the problem, since 5 MB × 230 connections = 1150 MB (over 1 GB) of memory could be required by nginx? Also, why do I have about 1 GB of memory inactive? How can I tell FreeBSD to use all the available memory?
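As a sanity check on that arithmetic (assuming the worst case where every connection allocates the full buffer, which nginx does not actually do):

```shell
# worst-case mp4 buffer memory: buffer size in MB times connection count
echo $(( 5 * 230 ))   # prints 1150, i.e. about 1.1 GB, not 1150 GB
```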

 
The big portion of wired-down memory is normal, since you are using memory-hungry ZFS.

Unless you have unlimited bandwidth (or gave priority to SSH), with that many established connections (downloads) I would expect a little slowdown as well. So I think your problem is not memory shortage but bandwidth (just like the rest of us :)).

BTW, have you tried increasing worker_processes?
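For reference, worker_processes lives at the top level of nginx.conf; a common starting point (an illustrative fragment, not the poster's actual config) is one worker per CPU core:

```nginx
worker_processes auto;   # or set explicitly to the number of CPU cores
events {
    worker_connections 1024;   # per-worker connection limit, illustrative value
}
```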
 
aa said:
So I think your problem is not memory shortage but bandwidth. BTW, have you tried increasing worker_processes?

Actually, I do have unmetered bandwidth on a 1x GbE NIC. I now believe the problem is munin: it seems to be spawning too many processes and slowing the system down. I had worker_processes at 4 before and tried reducing it to 2, but that didn't fix the problem.

I restarted nginx, and for the past 6 hours network throughput has been 35 MB/s with gstat showing the disk around 30% busy. This is perfect, but by tomorrow the system will have slowed down again even without any additional load.


Should I turn munin off completely?
 
You cannot serve a file whose metadata is bigger than mp4_max_buffer_size; I've checked, and it defaults to 10 MB. That doesn't mean nginx needs memory equal to connections × buffer size, of course.
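For anyone hitting that limit, the relevant directives go in the location serving the videos; the path and values below are illustrative, not taken from the poster's config:

```nginx
location /video/ {
    mp4;                          # enable pseudo-streaming for mp4 files
    mp4_buffer_size     1m;       # initial buffer per request
    mp4_max_buffer_size 20m;      # raise if files with large moov atoms fail to serve
}
```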
Should I turn munin off completely?
Worth a try; I've never used it myself. Looking forward to the result after you've done that.
 
Turning munin off didn't help. I now suspect it's due to context switching. When there are a lot of open sockets (sysctl kern.ipc.numopensockets greater than 200), context switching goes really high and the number of processes also increases a lot; then the system is sluggish. It's not a memory issue, as I have 2 GB free at all times. The weird thing is that the system doesn't even need a reboot: just restarting nginx brings everything back to "perfect".
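If context switching under many sockets is the suspect, one thing worth verifying (my assumption, not something confirmed in this thread) is that nginx on FreeBSD is using kqueue for its event loop, which scales far better than poll/select with hundreds of connections:

```nginx
events {
    use kqueue;               # FreeBSD's scalable event notification mechanism
    worker_connections 1024;  # illustrative value
}
```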
 