My server keeps running out of RAM

Discussion in 'Web and Network Services' started by ghostcorps, Apr 6, 2012.

  1. ghostcorps

    ghostcorps New Member

    Hi guys,

    As of a day or two ago my server has been shutting down due to low RAM, according to my host, but I cannot pin down the cause. I have replicated it by running a find command on /usr/local/etc/apache22/extras. That is no huge task, but the crash occurred both times I ran the command today. It is a media streaming server, so it should not have any trouble with such a small request.

    I have started looking through /var, but so far I cannot find out what is freaking it out.

    Here is top at idle:


    Code:
    last pid:  2750;  load averages:  0.00,  0.00,  0.00
    62 processes:  2 running, 60 sleeping
    CPU:  0.3% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.5% idle
    Mem: 76M Active, 83M Inact, 51M Wired, 444K Cache, 85M Buf, 521M Free
    Swap: 988M Total, 988M Free
    Order to sort: res
      PID USERNAME     THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
     2609     88        18  44    0 86896K 45168K ucond    0:01  0.00% mysqld
     2694 www            1  45    0   124M 27496K lockf    0:01  0.00% httpd
     2698 www            1  50    0   122M 25460K lockf    0:01  0.00% httpd
     2695 www            1  50    0   122M 25420K lockf    0:01  0.00% httpd
     2697 www            1  50    0   122M 25420K lockf    0:01  0.00% httpd
     2696 www            1  44    0   112M 14808K lockf    0:00  0.00% httpd
     2706 www            1  44    0   112M 14752K lockf    0:00  0.00% httpd
     2705 www            1  44    0   112M 14752K kqread   0:00  0.00% httpd
     2693 root           1  44    0   112M 14744K select   0:00  0.00% httpd
     1450 www            1  44    0 71420K  7248K accept   0:00  0.00% httpd
     1451 www            1  63    0 71420K  7248K accept   0:00  0.00% httpd
     1452 www            1  63    0 71420K  7248K accept   0:00  0.00% httpd
     1453 www            1  63    0 71420K  7248K accept   0:00  0.00% httpd
     1454 www            1  63    0 71420K  7248K accept   0:00  0.00% httpd
     1376 root           1  44    0 71420K  7244K select   0:00  0.00% httpd
     2667 admin          1  44    0 38104K  5176K RUN      0:00  0.00% sshd
     2664 root           1  45    0 38104K  5168K sbwait   0:00  0.00% sshd
     1405 root           1  44    0 26172K  4500K select   0:00  0.00% sshd
     1242 root           1  44    0 11092K  4184K select   0:00  0.00% openvpn
     1411 root           1  44    0 12096K  4080K select   0:00  0.00% sendmail
     1417 smmsp          1  76    0 12096K  4012K pause    0:00  0.00% sendmail
     1780 smmsp          1  76    0 12004K  3864K pause    0:00  0.00% sendmail
     1774 root           1  44    0 12004K  3616K select   0:00  0.00% sendmail
     2670 root           1  44    0 10216K  2800K wait     0:00  0.00% bash
     2668 admin          1  44    0 10216K  2796K wait     0:00  0.00% bash
     2750 root           1  44    0  9336K  2288K RUN      0:00  0.00% top
     2669 admin          1  44    0 21668K  2008K wait     0:00  0.00% su
     2040     88         1  76    0  8264K  1860K wait     0:00  0.00% sh
     2077 root           1  44    0  8080K  1636K nanslp   0:00  0.00% cron
     1787 root           1  44    0  7952K  1612K nanslp   0:00  0.00% cron
     1424 root           1  44    0  7952K  1612K nanslp   0:00  0.00% cron
     1913 root           1  44    0  7024K  1584K select   0:00  0.00% syslogd
     1089 root           1  44    0  7024K  1564K select   0:00  0.00% syslogd
     1611 root           1  44    0  6896K  1560K select   0:00  0.00% syslogd
     2287 root           1  76    0  9008K  1396K select   0:00  0.00% inetd
     2440 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2445 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2441 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2446 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2447 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2442 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2443 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
     2444 root           1  76    0  6892K  1288K ttyin    0:00  0.00% getty
      115 root           1  76    0  2744K  1024K pause    0:00  0.00% adjkerntz
      852 root           1  44    0  3204K   724K select   0:00  0.00% devd
    
    Can anyone please suggest a way to find the culprit? This is a production server and I am getting my arse kicked every time it goes down :(
     
  2. gkontos

    gkontos Member

    1) How exactly does it shut down?

    2) What do your logs say when this happens? (/var/log/messages)

    3) Is this a dedicated server or a VPS?
     
  3. blakjak

    blakjak New Member

    Your server is running out of RAM.

    You need to create a swap partition when you install FreeBSD. The swap partition is used when the machine runs out of RAM. I hope you have a swap partition?
     
  4. ghostcorps

    ghostcorps New Member

    Thanks for the questions.

    It is a VPS

    It is not entirely clear how it stalls, but we are forced to reboot it through the VM to get it back. It was down this morning; we restarted it and it ran for a few hours. I logged in via ssh to make some changes to the Apache config (unrelated). I ran a find search looking for a string in the /extras folder, and after opening the file the session stalled. At that point the website, which runs off a jailed webserver, went offline, but the failover page on the host was still live, albeit very slow to load.

    After a short time there is nothing on either page and I am forced to reboot.


    A few rules from /etc/ipfw.rules that are mentioned in messages:

    Code:
    $IPF 801 deny log all from any to HOST.SERVER 22-25
    $IPF 900 deny log all from any to WEBSERVER.JAIL 1-79          
    $IPF 910 allow log all from any to WEBSERVER.JAIL 80
    $IPF 920 allow log all from any to WEBSERVER.JAIL 443
    $IPF 930 deny log all from any to WEBSERVER.JAIL 81-442
    $IPF 940 deny log all from any to WEBSERVER.JAIL 444-1934
    

    /var/log/messages, starting from a flood of SYSERRs before the first crash and running up to roughly now. I have cut out a bunch of lines that I didn't think were necessary.

    Code:
    Mar 27 13:32:12 DOMAIN sm-mta[48309]: q2RHW6bp048307: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 14:02:12 DOMAIN sm-mta[48509]: q2RI27mM048507: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 14:32:11 DOMAIN sm-mta[48630]: q2RIW6Fm048628: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 23:01:11 DOMAIN sm-mta[51704]: q2S316ug051625: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 23:02:52 DOMAIN sm-mta[51860]: q2S32kYN051810: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 23:02:52 DOMAIN sm-mta[51863]: q2S32lId051856: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 27 23:47:10 DOMAIN sm-mta[56507]: q2S3l5fM056505: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 09:02:12 DOMAIN sm-mta[64836]: q2SD27Tx064834: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 09:32:11 DOMAIN sm-mta[64957]: q2SDW6uO064955: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 10:02:12 DOMAIN sm-mta[65156]: q2SE26gm065154: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 10:32:12 DOMAIN sm-mta[65280]: q2SEW7Xw065278: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 11:02:11 DOMAIN sm-mta[65482]: q2SF26lJ065480: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 11:32:12 DOMAIN sm-mta[65608]: q2SFW6H2065606: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:02:12 DOMAIN sm-mta[66145]: q2SG27eb066091: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:02:12 DOMAIN sm-mta[66147]: q2SG27ss066092: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:05:29 DOMAIN sm-mta[66335]: q2SG5OXj066330: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:05:29 DOMAIN sm-mta[66338]: q2SG5Oqt066331: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:10:01 DOMAIN sm-mta[70765]: q2SG9uMm070763: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 12:32:12 DOMAIN sm-mta[70880]: q2SGW7n6070878: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 13:02:12 DOMAIN sm-mta[71081]: q2SH27qc071079: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 13:32:12 DOMAIN sm-mta[71203]: q2SHW6EY071201: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 14:02:12 DOMAIN sm-mta[71407]: q2SI27SK071405: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 14:32:11 DOMAIN sm-mta[71531]: q2SIW6iW071529: SYSERR(root): database.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 23:01:11 DOMAIN sm-mta[74646]: q2T316R2074503: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 23:02:50 DOMAIN sm-mta[74804]: q2T32jvM074753: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 23:02:50 DOMAIN sm-mta[74807]: q2T32j9p074802: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    Mar 28 23:50:41 DOMAIN sm-mta[79450]: q2T3oeYv079448: SYSERR(root): webserver.URL.com. config error: mail loops back to me (MX problem?)
    
    ***dmesg***
    
    Apr  2 15:48:45 DOMAIN su: URLadmin to toor on /dev/pts/0
    Apr  3 23:05:06 DOMAIN su: admin to root on /dev/pts/0
    Apr  5 20:05:19 DOMAIN kernel: arp: XXX.XXX.XXX.3 moved from 00:ff:2d:81:3b:3c to 00:ff:03:09:cd:79 on tap0
    Apr  5 20:05:35 DOMAIN sshd[xxx86]: error: PAM: authentication error for root from YYY.YYY.YYY
    Apr  5 20:05:58 DOMAIN sshd[25888]: error: PAM: authentication error for toor from YYY.YYY.YYY
    Apr  5 20:06:01 DOMAIN sshd[25888]: error: PAM: authentication error for toor from YYY.YYY.YYY
    Apr  5 20:07:47 DOMAIN su: URLadmin to toor on /dev/pts/0
    Apr  5 20:29:58 DOMAIN sshd[1401]: error: accept: Software caused connection abort
    Apr  5 20:35:41 DOMAIN su: admin to root on /dev/pts/1
    Apr  5 20:37:53 DOMAIN su: URLadmin to toor on /dev/pts/0
    Apr  5 20:39:32 DOMAIN su: admin to root on /dev/pts/1
    Apr  6 01:17:05 DOMAIN su: admin to root on /dev/pts/0
    Apr  6 01:52:54 DOMAIN su: admin to root on /dev/pts/1
    
    
    Apr  6 02:09:14 DOMAIN kernel: ipfw: limit 5 reached on entry 900
    Apr  6 02:09:14 DOMAIN kernel: ipfw: limit 5 reached on entry 930
    Apr  6 02:09:14 DOMAIN kernel: ipfw: limit 5 reached on entry 940
    Apr  6 02:09:51 DOMAIN kernel: ipfw: limit 5 reached on entry 920
    Apr  6 02:09:51 DOMAIN kernel: ipfw: limit 5 reached on entry 910
    Apr  6 02:10:43 DOMAIN kernel: ipfw: limit 5 reached on entry 910
    Apr  6 02:10:47 DOMAIN kernel: ipfw: limit 5 reached on entry 920
    Apr  6 02:24:46 DOMAIN kernel: ipfw: limit 5 reached on entry 801
    
    ***dmesg***
    
    Apr  6 03:58:42 DOMAIN kernel: ipfw: limit 5 reached on entry 801
    Apr  6 03:59:38 DOMAIN fsck: /dev/da0s1e: 38 files, 145 used, 253670 free (30 frags, 31705 blocks, 0.0% fragmentation)
    Apr  6 04:00:24 DOMAIN fsck: /dev/da0s1f: PARTIALLY TRUNCATED INODE I=711230
    Apr  6 04:00:24 DOMAIN fsck: /dev/da0s1f: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.
    
    Apr  6 04:03:01 DOMAIN kernel: ipfw: limit 5 reached on entry 910
    Apr  6 04:05:47 DOMAIN kernel: ipfw: limit 5 reached on entry 920
    Apr  6 04:19:33 DOMAIN kernel: ipfw: limit 5 reached on entry 900
    Apr  6 04:19:33 DOMAIN kernel: ipfw: limit 5 reached on entry 930
    Apr  6 04:19:33 DOMAIN kernel: ipfw: limit 5 reached on entry 940
    
    ***dmesg***
    
    Apr  6 04:52:01 DOMAIN kernel: ipfw: limit 5 reached on entry 910
    
    Apr  6 04:52:54 DOMAIN fsck: /dev/da0s1e: 39 files, 145 used, 253670 free (30 frags, 31705 blocks, 0.0% fragmentation)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70669  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 02:22 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70680  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 04:00 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70688  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 04:22 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70689  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 03:55 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70692  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 04:11 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70694  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 04:33 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: LINK COUNT FILE I=70705  OWNER=operator MODE=100400
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: SIZE=2048 MTIME=Apr  6 03:44 2012  COUNT 2 SHOULD BE 1 (ADJUSTED)
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: Reclaimed: 0 directories, 1 files, 1 fragments
    Apr  6 04:53:05 DOMAIN fsck: /dev/da0s1d: 25334 files, 126338 used, 623684 free (7228 frags, 77057 blocks, 1.0% fragmentation)
    
    ***dmesg***
    
     
  5. gkontos

    gkontos Member

    I don't think it is a memory-related issue. It looks more like filesystem corruption to me.
    I would suggest a full backup of your data, sites and databases, and then an fsck from single-user mode.

    Also, try fixing sendmail in your database jail by either disabling it or making the proper aliases.
    I don't have much experience with IPFW syntax but I would find a way to keep only 1 rule there also.
     
  6. ghostcorps

    ghostcorps New Member

    Thanks Blakjak, yes, I do :)

    #df -h
    Code:
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/da0s1a    496M    331M    125M    72%    /
    devfs          1.0K    1.0K      0B   100%    /dev
    /dev/da0s1e    496M    290K    456M     0%    /tmp
    /dev/da0s1f     16G     12G    2.6G    82%    /usr
    /dev/da0s1d    1.4G    247M    1.1G    18%    /var
    devfs          1.0K    1.0K      0B   100%    /usr/gaols/webserver/dev
    procfs         4.0K    4.0K      0B   100%    /usr/gaols/webserver/proc
    devfs          1.0K    1.0K      0B   100%    /usr/gaols/database/dev
    procfs         4.0K    4.0K      0B   100%    /usr/gaols/database/proc
    
    #pstat -T
    Code:
    324/12072 files
    0M/987M swap space

    gkontos:

    It looks like one of the guys at the host has run an fsck on it already, but I will check with them and give it a go otherwise. We have gone a night without a crash but who knows what the new day will bring.

    I don't need sendmail on the db so I'll turn that off too. Thanks for pointing that out, I didn't even realise it was on.

    Not sure what you mean about keeping 1 rule in ipfw. The rules above are only a small handful of the hundred or so rules I use to keep the ports blocked. If I remove any of them it will expose me.



    Thanks again for your help :)
     
  7. gkontos

    gkontos Member

    When dealing with firewall rules, you try to write them in such a way that they don't put extra burden on the filtering engine.
    Like I said before, I have absolutely no idea how to script IPFW rules. But you can use this as a general rule of thumb:

    1) Have your most frequent rules processed first.
    2) Explicitly deny all other ports using a more general statement.

    Pseudocode example:

    Code:
    permit any to <webserver> <webservice_tcp_ports>
    deny any  any
    This pretty much works with any type of firewall.
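
    To make the ordering idea concrete, here is a minimal Python sketch of first-match filtering. It is a toy model, not IPFW: it looks only at the destination port and ignores addresses, protocols and logging. The first list mirrors the order of the webserver jail rules quoted from /etc/ipfw.rules earlier in the thread; the second is a hypothetical compact ordering. The names first_match, THREAD_ORDER and COMPACT are made up for this example.

```python
# Each rule is (action, lowest_port, highest_port); rules are tried in order.
THREAD_ORDER = [
    ("deny", 1, 79),
    ("allow", 80, 80),
    ("allow", 443, 443),
    ("deny", 81, 442),
    ("deny", 444, 1934),
]

# Hypothetical compact ordering: wanted ports first, then one general deny.
COMPACT = [
    ("allow", 80, 80),
    ("allow", 443, 443),
    ("deny", 1, 65535),
]


def first_match(rules, port):
    """Return (action, rules_checked) for the first rule whose range holds port."""
    for checked, (action, low, high) in enumerate(rules, start=1):
        if low <= port <= high:
            return action, checked
    # Assumption of this toy model: anything unmatched is denied.
    return "deny", len(rules)


for port in (80, 443, 22):
    print(port, first_match(THREAD_ORDER, port), first_match(COMPACT, port))
```

    In this model a request to port 80 is decided after one rule instead of two, and the list shrinks from five entries to three. Denied traffic can take more checks with the compact list, so which ordering is cheaper depends on the traffic mix.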
     
  8. ghostcorps

    ghostcorps New Member

    Thanks,

    I have made the rule list loosely adhering to that idea. I will see what I can do to optimise it.
     
  9. olav

    olav New Member

    I had the same problem with FreeBSD 8.2-RELEASE, an upgrade to FreeBSD 8.2-STABLE solved it. The STABLE branch is a good branch, and can be used on production systems.
     
  10. ghostcorps

    ghostcorps New Member

    Thanks olav,

    I should have mentioned that this box is running FreeBSD 8.1-RELEASE-p2. There are a few patches outstanding because I have modified the kernel, and rolling it back will take the site offline for a day. We have just gotten some articles out in the news, so we don't want to take it down just yet.

    So far it hasn't crashed again though. * fingers crossed*
     
  11. debguy

    debguy New Member

    sshd is 3mb? Talk about using ash not sh to reserve memory and ssh blows through it :)

    112M is wrong. Apache would use 4MB for a process having a small web page open.

    (1) I would check httpd config files to see what setting would *allow* apache to cache that much data: apache is designed not to break memory limits by any kind of web hits.

    (2) Look at your web content. Is apache loading a corrupt webpage that is in fact 100M to load?

    Please say whether the top output you show is of httpd processes waiting for a web hit or of ones that have already loaded the home page (i.e., the settings may start 5 waiting processes, which get recycled).

    Use netstat -a to see what's LISTENING v. CONNECTED.

    BTW, are those multi-processor / threaded apache processes or regular ones? I think the mp version may be a little wild on memory; it may say so in the docs. Use the right apache2 install pkg.

    By what is allowed I mean the "apache mods" that your config says apache should / can load. There are so many that I don't know if you're loading all of perl, python, php, and everything else including the kitchen sink per process for no reason.

    If you are migrating, you might not want to run the new apache with an old website; maybe use the apache the website had been working fine with.
     
  12. ghostcorps

    ghostcorps New Member

    Thanks Debguy.

    Did you mean bash instead of sh?

    I will look into everything you have mentioned; there looks to be some fine-tuning to be done. I should say that top was run on the host, which holds two jailed servers. Both the host and one of the virtual servers host an Apache installation. The webserver is a video streaming server. I would expect Apache to run pretty heavy in this situation, but I will still see what I can do about lightening the load.

    I am not sure which modules are safe to disable and which are not, whenever I try to thin them out I always end up breaking something that is not obvious.

    It looks to be the multiprocess version, which is the version portmaster chose to install.

    /usr/ports/www/apache22/Makefile
    Code:
    PORTNAME=       apache
    PORTVERSION=    2.2.22
    PORTREVISION=   5
    CATEGORIES=     www
    MASTER_SITES=   ${MASTER_SITE_APACHE_HTTPD}
    DISTNAME=       httpd-${PORTVERSION}
    DIST_SUBDIR=    apache22
    
    MAINTAINER?=    apache@FreeBSD.org
    COMMENT?=       Version 2.2.x of Apache web server with ${WITH_MPM:L} MPM.
    
    netstat -a
    Code:
    Active Internet connections (including servers)
    Proto Recv-Q Send-Q  Local Address          Foreign Address       (state)
    tcp4       0    104 XXX.ssh              ME.35008       ESTABLISHED
    tcp4       0      0 *.*                    *.*                    CLOSED
    tcp46      0      0 *.http                 *.*                    LISTEN
    tcp4       0      0 *.https                *.*                    LISTEN
    tcp4       0      0 *.http                 *.*                    LISTEN
    tcp4       0      0 *.8080                 *.*                    LISTEN
    tcp4       0      0 SITENAME.com..smtp  *.*                    LISTEN
    tcp4       0      0 *.ftp                  *.*                    LISTEN
    tcp4       0      0 *.submission           *.*                    LISTEN
    tcp6       0      0 *.smtp                 *.*                    LISTEN
    tcp4       0      0 *.smtp                 *.*                    LISTEN
    tcp4       0      0 XXX.ssh              *.*                    LISTEN
    tcp4       0      0 XXX.ssh              *.*                    LISTEN
    

    Thankfully we have not had any trouble since posting this thread, but that doesn't mean it cannot happen again.
     
  13. User23

    User23 Member

    If PHP is used as an Apache module, 112 MB per process is nothing special. Running low on RAM can happen if too many processes are running at the same time. Monitor your services and the number of processes and you may find the problem easily.
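
    One way to do that kind of monitoring is to tally resident memory per command from top output. The sketch below is Python, uses six rows copied from the top listing in the first post, and assumes the 11-column layout shown there. RES counts shared pages once per process, so the totals overstate real usage; treat them as an upper bound. The helper names are invented for this example.

```python
from collections import defaultdict

# Six rows copied from the top output in the first post of this thread.
TOP_ROWS = """\
 2609     88        18  44    0 86896K 45168K ucond    0:01  0.00% mysqld
 2694 www            1  45    0   124M 27496K lockf    0:01  0.00% httpd
 2698 www            1  50    0   122M 25460K lockf    0:01  0.00% httpd
 2696 www            1  44    0   112M 14808K lockf    0:00  0.00% httpd
 1450 www            1  44    0 71420K  7248K accept   0:00  0.00% httpd
 2667 admin          1  44    0 38104K  5176K RUN      0:00  0.00% sshd
"""


def to_kib(field):
    """Convert a top size field such as '7248K' or '124M' to KiB."""
    units = {"K": 1, "M": 1024, "G": 1024 * 1024}
    return int(field[:-1]) * units[field[-1]]


def res_by_command(rows):
    """Return {command: (process_count, total_RES_in_KiB)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in rows.splitlines():
        cols = line.split()
        if len(cols) < 11:  # skip headers and blank lines
            continue
        command, res = cols[-1], cols[6]
        totals[command][0] += 1
        totals[command][1] += to_kib(res)
    return {cmd: tuple(pair) for cmd, pair in totals.items()}


summary = res_by_command(TOP_ROWS)
for cmd, (count, kib) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{cmd:8s} {count:3d} procs {kib / 1024:7.1f} MiB resident")
```

    Running something like this at intervals and watching the httpd count and total grow is one way to see whether the process count, rather than the per-process size, is what exhausts memory.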

    I had similar problems with a WordPress statistics plugin. The plugin stored the statistics in a MySQL database so slowly that the whole server could process only 2 requests per second, so the number of running processes sometimes rose to 200 or more and the server began to swap.
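
    The arithmetic behind that pile-up can be sketched with Little's law (requests in the system = completion rate x time in the system), which holds for a steady state. The Python below plugs in the two figures from the anecdote above, and then does a rough memory budget using numbers read off the top output in the first post. The 150 MiB reserve and the function names are assumptions made for illustration.

```python
def seconds_in_system(in_flight, completions_per_second):
    """Little's law rearranged: W = L / lambda."""
    return in_flight / completions_per_second


def workers_that_fit(ram_mib, reserved_mib, per_worker_mib):
    """Whole number of worker processes that fit in the RAM left over."""
    return (ram_mib - reserved_mib) // per_worker_mib


# From the anecdote: about 2 requests/s completed, about 200 processes.
wait = seconds_in_system(200, 2)

# Assumptions for illustration only: 731 MiB is Active + Inact + Wired + Free
# from the Mem line of the top output in the first post, 25 MiB is roughly the
# RES of the larger httpd processes there, and 150 MiB reserved for everything
# else is a guess.
fit = workers_that_fit(731, 150, 25)
needed_mib = 200 * 25

print(f"each request spends about {wait:.0f} s in the system")
print(f"about {fit} workers fit in RAM; 200 workers would need {needed_mib} MiB")
```

    Under these assumptions the box would start swapping long before 200 processes, which is why capping the number of Apache children below what RAM can hold is usually safer than letting them queue in swap.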

    Use apachebench (ab) for a stress test.
     
  14. ghostcorps

    ghostcorps New Member

    I am told we are using a statistics plugin on wordpress. But I ran ab and it held up fine I think.

    ab -n 1000 -c 5 https://URL.com/
    Code:
    
    Server Software:        Apache
    Server Hostname:        URL.com.au
    Server Port:            443
    SSL/TLS Protocol:       TLSv1/SSLv3,DHE-RSA-AES256-SHA,2048,256
    
    Document Path:          /
    Document Length:        7756 bytes
    
    Concurrency Level:      5
    Time taken for tests:   848.079 seconds
    Complete requests:      1000
    Failed requests:        0
    Write errors:           0
    Total transferred:      8188432 bytes
    HTML transferred:       7756000 bytes
    Requests per second:    1.18 [#/sec] (mean)
    Time per request:       4240.396 [ms] (mean)
    Time per request:       848.079 [ms] (mean, across all concurrent requests)
    Transfer rate:          9.43 [Kbytes/sec] received
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:     1194 1278 235.6   1203    3841
    Processing:   803 2950 1326.2   2775   14652
    Waiting:      802 2593 1299.9   2402   14282
    Total:       2004 4227 1337.0   4042   16455
    
    Percentage of the requests served within a certain time (ms)
      50%   4042
      66%   4337
      75%   4558
      80%   4722
      90%   5376
      95%   6395
      98%   7668
      99%  10180
     100%  16455 (longest request)
    
    
    N.B. I ran ab from Australia and the server is in the US.

    Looking at info.php, it says I am using mod_php5, which, if I have googled correctly, means that I am not using PHP as a module. Is this right?
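
    The summary figures in the ab report follow from a handful of its own numbers, which makes them easy to sanity-check. The Python sketch below recomputes them from the concurrency, elapsed time, request count and byte count reported above; the variable names are chosen for this example.

```python
# Values copied from the ab report above.
concurrency = 5
seconds = 848.079
completed = 1000
total_bytes = 8188432

# "Requests per second" is completed requests over wall-clock time.
requests_per_second = completed / seconds

# "Time per request (mean)" is the time one client waits for one request.
ms_per_request = concurrency * seconds * 1000 / completed

# "Time per request (mean, across all concurrent requests)" ignores concurrency.
ms_per_request_overall = seconds * 1000 / completed

# "Transfer rate" is total bytes in KiB over wall-clock time.
kib_per_second = total_bytes / 1024 / seconds

print(f"{requests_per_second:.2f} requests/s")
print(f"{ms_per_request:.3f} ms per request (mean)")
print(f"{ms_per_request_overall:.3f} ms per request (across all concurrent requests)")
print(f"{kib_per_second:.2f} KiB/s received")
```

    The recomputed values agree with the report to within rounding. With a minimum Connect time of about 1.2 s paid before the server does any work, a large share of each 4.2 s request would be consistent with the long-distance link rather than with the server itself.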
     
  15. User23

    User23 Member

    The main problem is not PHP but the number of processes running at the same time, due to the slow statistics plugin.
    Run a stress test with and without the WordPress statistics plugin enabled to verify the problem.

    PHP as an Apache module should be the fastest option, so stay with it.
     
  16. User23

    User23 Member

    "mod_php5" is the PHP5 apache module. So, everything is ok.
     
  17. ghostcorps

    ghostcorps New Member

    I have been running a 50000 pass test overnight and it is almost done. I'll try without the plugin when it finishes. During the test, however, I see no more than 14 threads at any time. Is this good or bad?
     
  18. ghostcorps

    ghostcorps New Member

    Could it have something to do with the time being out of sync by a day between the webserver and the database? I found my servers were not using the same timezone for some reason. But I have corrected it now.

    Will see if this has any effect on the number of threads.
     
  19. ghostcorps

    ghostcorps New Member

    Hello again,

    By taking some time to research the mods and testing each one at a time, I have cut the httpd threads down to about 95-120 MB. Disabling the stats plugin WassUp did not reduce the thread size noticeably.

    This is an improvement of about 20 MB and all the services I can think of are working, but I get the feeling I could go further. Would you mind having a look at my mod list below and letting me know if I have blocked anything subtly crucial? Or if there is anything more I could block :)

    /usr/local/etc/apache22/httpd.conf
    Code:
    LoadModule authn_file_module libexec/apache22/mod_authn_file.so
    #LoadModule authn_dbm_module libexec/apache22/mod_authn_dbm.so
    #LoadModule authn_anon_module libexec/apache22/mod_authn_anon.so
    #LoadModule authn_default_module libexec/apache22/mod_authn_default.so
    #LoadModule authn_alias_module libexec/apache22/mod_authn_alias.so
    LoadModule authz_host_module libexec/apache22/mod_authz_host.so
    LoadModule authz_groupfile_module libexec/apache22/mod_authz_groupfile.so
    LoadModule authz_user_module libexec/apache22/mod_authz_user.so
    #LoadModule authz_dbm_module libexec/apache22/mod_authz_dbm.so
    #LoadModule authz_owner_module libexec/apache22/mod_authz_owner.so
    #LoadModule authz_default_module libexec/apache22/mod_authz_default.so
    LoadModule auth_basic_module libexec/apache22/mod_auth_basic.so
    #LoadModule auth_digest_module libexec/apache22/mod_auth_digest.so
    #LoadModule file_cache_module libexec/apache22/mod_file_cache.so
    #LoadModule cache_module libexec/apache22/mod_cache.so
    #LoadModule disk_cache_module libexec/apache22/mod_disk_cache.so
    #LoadModule dumpio_module libexec/apache22/mod_dumpio.so
    LoadModule reqtimeout_module libexec/apache22/mod_reqtimeout.so
    LoadModule include_module libexec/apache22/mod_include.so
    #LoadModule filter_module libexec/apache22/mod_filter.so
    #LoadModule charset_lite_module libexec/apache22/mod_charset_lite.so
    LoadModule deflate_module libexec/apache22/mod_deflate.so
    LoadModule log_config_module libexec/apache22/mod_log_config.so
    #LoadModule log_forensic_module libexec/apache22/mod_log_forensic.so
    #LoadModule logio_module libexec/apache22/mod_logio.so
    LoadModule env_module libexec/apache22/mod_env.so
    #LoadModule mime_magic_module libexec/apache22/mod_mime_magic.so
    #LoadModule cern_meta_module libexec/apache22/mod_cern_meta.so
    LoadModule expires_module libexec/apache22/mod_expires.so
    LoadModule headers_module libexec/apache22/mod_headers.so
    LoadModule usertrack_module libexec/apache22/mod_usertrack.so
    LoadModule unique_id_module libexec/apache22/mod_unique_id.so
    LoadModule setenvif_module libexec/apache22/mod_setenvif.so
    #LoadModule version_module libexec/apache22/mod_version.so
    LoadModule ssl_module libexec/apache22/mod_ssl.so
    LoadModule mime_module libexec/apache22/mod_mime.so
    #LoadModule dav_module libexec/apache22/mod_dav.so
    #LoadModule status_module libexec/apache22/mod_status.so
    #LoadModule autoindex_module libexec/apache22/mod_autoindex.so
    #LoadModule asis_module libexec/apache22/mod_asis.so
    #LoadModule info_module libexec/apache22/mod_info.so
    LoadModule cgi_module libexec/apache22/mod_cgi.so
    #LoadModule dav_fs_module libexec/apache22/mod_dav_fs.so
    LoadModule vhost_alias_module libexec/apache22/mod_vhost_alias.so
    #LoadModule negotiation_module libexec/apache22/mod_negotiation.so
    LoadModule dir_module libexec/apache22/mod_dir.so
    #LoadModule imagemap_module libexec/apache22/mod_imagemap.so
    LoadModule actions_module libexec/apache22/mod_actions.so
    LoadModule speling_module libexec/apache22/mod_speling.so
    #LoadModule userdir_module libexec/apache22/mod_userdir.so
    LoadModule alias_module libexec/apache22/mod_alias.so
    LoadModule rewrite_module libexec/apache22/mod_rewrite.so
    LoadModule unique_id_module libexec/apache22/mod_unique_id.so
    LoadModule security2_module libexec/apache22/mod_security2.so
    LoadModule php5_module        libexec/apache22/libphp5.so
     
  20. ghostcorps

    ghostcorps New Member

    Just when I thought I was ready to mark this as solved... It crashed again!!

    :(

    I have worked through all the suggestions and still crashing!

    I am lost now ...
     
  21. User23

    User23 Member

    Depends on how many simultaneous queries you used to test and on the server hardware.

    Try
    Code:
    ab -c 20 -n 1000 http://yourdomain.tld/
    for example.
     
  22. ghostcorps

    ghostcorps New Member

    Thanks,

    I ran the test but it timed out after 4 completed requests.

    Code:
    Benchmarking URL.com (be patient)
    apr_poll: The timeout specified has expired (70007)
    Total of 4 requests completed
    I couldn't browse to the site and my ssh session crashed out too, but I was able to log in with some patience. I found a lot of threads were still open; I guess the threads created by ab had not closed.

    After restarting Apache the website came back up and access is normal again. Is it the RAM, or is it more likely that the interface between the site and SQL is too slow? Apache runs on one jailed server and the database is on another.
     
  23. ghostcorps

    ghostcorps New Member

    It is strange, if I run a 20 thread test with a concurrency of 20, it pulls through and the threads clear. But when I run a 40 thread test with the same concurrency the test times out and the threads lock up.

    I have added this to /usr/local/etc/apache22/httpd.conf
    Code:
    RequestReadTimeout header=1-3,MinRate=500
    
    But it has not had any noticeable effect.
     
  24. ghostcorps

    ghostcorps New Member

    By turning off the KeepAlive entry in the config I no longer lock up the server when ab times out. Which is excellent news.

    But I still need to work out how to stop the server grinding to a halt when I run 20 concurrent requests, for example:

    ab -c 20 -n 20 https://some.site.cd/
     
  25. User23

    User23 Member

    You could run the ab test on a static HTML page. This will show how Apache performs without MySQL.

    As I said, I guess it is the WordPress statistics plugin. Keep an eye on the MySQL slow query log and use
    Code:
    show full processlist;
    on the mysql console, while stress testing. If the statistics inserts are the bottleneck it should be easy to identify them in the processlist.
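
    If the processlist is long, filtering it can help. The Python sketch below works on rows shaped like SHOW FULL PROCESSLIST output (the Id, Command, Time, State and Info columns, with Time in seconds) and keeps the INSERT statements that have been running for a while. The sample rows, the stats_hits table name and the 5 second threshold are all made up for illustration; real rows would come from the MySQL console or a client library.

```python
# Hypothetical rows shaped like SHOW FULL PROCESSLIST output.
ROWS = [
    {"Id": 11, "Command": "Sleep", "Time": 3, "State": "", "Info": None},
    {"Id": 12, "Command": "Query", "Time": 14, "State": "update",
     "Info": "INSERT INTO stats_hits (url) VALUES ('/')"},
    {"Id": 13, "Command": "Query", "Time": 0, "State": "Sending data",
     "Info": "SELECT option_value FROM wp_options"},
    {"Id": 14, "Command": "Query", "Time": 9, "State": "Locked",
     "Info": "insert into stats_hits (url) values ('/feed')"},
]


def slow_statements(rows, verb="INSERT", min_seconds=5):
    """Return rows whose statement starts with verb and has run min_seconds or more."""
    hits = []
    for row in rows:
        info = (row["Info"] or "").lstrip()  # idle connections have no statement
        if info.upper().startswith(verb) and row["Time"] >= min_seconds:
            hits.append(row)
    return hits


for row in slow_statements(ROWS):
    print(row["Id"], row["Time"], row["State"], row["Info"])
```

    Many rows of the same INSERT with growing Time values during a stress test would point at the statistics table as the bottleneck.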