Solved [Solved] syslog-ng wedged at 100% CPU

(Solved By: Upgrading to a newer version of the port; known bug in FreeBSD syslog-ng version 3.3.6. Jump to comment in this post)

I installed syslog-ng recently, replacing syslogd (I am planning to capture logs from my pfSense gateway and record to an RDB). I followed the steps on bsdguides.org. I just noticed that it's apparently monopolizing an entire CPU:

Code:
 top -S
last pid: 42036;  load averages:  1.00,  1.11,  1.16                                               up 41+22:50:23  20:06:52
75 processes:  3 running, 71 sleeping, 1 waiting
CPU:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 34M Active, 67M Inact, 20G Wired, 2220M Buf, 3367M Free
Swap: 3052M Total, 3052M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   11 root          8 155 ki31     0K   128K CPU7    7 7752.6 700.00% idle
93520 root          1 103    0   130M 10984K CPU6    6 216.1H 100.00% syslog-ng
    0 root        174  -8    0     0K  2784K -       0 504:47  0.00% kernel
   38 root          4  -8    -     0K    80K tx->tx  3  79:19  0.00% zfskern

Looking at other users' top reports of syslog-ng, this does not appear normal; 3-5% seems more typical. Any thoughts of what might be preoccupying my service so much? I ran ktrace on it and quickly generated a half megabyte of trace data, but I can't make much sense out of the output. In kdump I see this repeatedly:
Code:
 93520 syslog-ng RET   kevent 1
 93520 syslog-ng CALL  clock_gettime(0x4,0x80241d1a0)
 93520 syslog-ng RET   clock_gettime 0
 93520 syslog-ng CALL  kevent(0x3,0x7fffffff5c10,0,0x7fffffff5ad0,0x9,0x7fffffffdc60)
 93520 syslog-ng GIO   fd 3 wrote 0 bytes
       ""
 93520 syslog-ng GIO   fd 3 read 32 bytes
       0x0000 0c00 0000 0000 0000 ffff 0000 0000 0000 4575 0000 0000 0000 f0bb 5102 0800  |................Eu........Q...|
       0x001e 0000                                                                        |..|
... but I'll admit I didn't have the stomach to look in detail through the file. Holding down the spacebar in more has thousands of perfectly identical lines scroll by. I presume that "fd 3" is "file descriptor 3"; I'm at best vaguely incompetent with strace on Linux, how would I go about mapping that back to a file or device? Does the "...Eu...Q..." read data ring a bell for anyone?

As far as I can tell it's configured to write just to /var/log/:
Code:
grep '^destination' /usr/local/etc/syslog-ng.conf
destination messages { file("/var/log/messages"); };
destination security { file("/var/log/security"); };
destination authlog { file("/var/log/auth.log"); };
destination maillog { file("/var/log/maillog"); };
destination lpd-errs { file("/var/log/lpd-errs"); };
destination xferlog { file("/var/log/xferlog"); };
destination cron { file("/var/log/cron"); };
destination debuglog { file("/var/log/debug.log"); };
destination consolelog { file("/var/log/console.log"); };
destination all { file("/var/log/all.log"); };
destination newscrit { file("/var/log/news/news.crit"); };
destination newserr { file("/var/log/news/news.err"); };
destination newsnotice { file("/var/log/news/news.notice"); };
destination slip { file("/var/log/slip.log"); };
destination ppp { file("/var/log/ppp.log"); };
destination console { file("/dev/console"); };
destination allusers { usertty("*"); };
... but /var/log/ only shows a few files with modest changes (so it's working, but not spamming that folder). The exception are the last two lines; could there be vigorous activity on either /dev/console or usertty()?

Thanks for any suggestions!

-CAT

Version info for FreeBSD and syslog-ng, plus currently enabled services:
Code:
uname -a
FreeBSD citadel 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012     root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

syslog-ng --version
syslog-ng 3.3.6
Installer-Version: 3.3.6
Revision: ssh+git://algernon@git.balabit//var/scm/git/syslog-ng/syslog-ng-ose--mainline--3.3#master#d32b5c63e09a4c3897cbc2239c07967cea393de9
Compile-Date: Dec 30 2013 15:55:01
Default-Modules: affile,afprog,afsocket,afuser,basicfuncs,csvparser,dbparser,syslogformat,afsql
Error opening plugin module; module='afsocket-notls', error='/usr/local/lib/syslog-ng/libafsocket-notls.so: Undefined symbol "log_transport_tls_new"'
Available-Modules: afsocket,afsocket-tls,afsql,affile,afprog,afuser,afmongodb,csvparser,confgen,syslogformat,basicfuncs,convertfuncs,dbparser,dummy
Enable-Debug: off
Enable-GProf: off
Enable-Memtrace: off
Enable-IPv6: on
Enable-Spoof-Source: off
Enable-TCP-Wrapper: off
Enable-Linux-Caps: off
Enable-Pcre: on

service -e
/etc/rc.d/hostid
/etc/rc.d/zvol
/etc/rc.d/hostid_save
/etc/rc.d/zfs
/etc/rc.d/cleanvar
/etc/rc.d/ip6addrctl
/etc/rc.d/devd
/usr/local/etc/rc.d/syslog-ng
/etc/rc.d/newsyslog
/etc/rc.d/ntpdate
/etc/rc.d/rpcbind
/etc/rc.d/dmesg
/etc/rc.d/mountd
/etc/rc.d/nfsd
/etc/rc.d/statd
/etc/rc.d/lockd
/etc/rc.d/virecover
/etc/rc.d/motd
/etc/rc.d/ntpd
/usr/local/etc/rc.d/smartd
/usr/local/etc/rc.d/postgresql
/etc/rc.d/sshd
/etc/rc.d/sendmail
/etc/rc.d/cron
/etc/rc.d/mixer
/etc/rc.d/gptboot
/etc/rc.d/bgfsck
 
Re: syslog-ng wedged at 100% CPU

Do things improve if you remove the last line of your /usr/local/etc/syslog-ng.conf file?
Code:
destination allusers { usertty("*"); };
 
Re: syslog-ng wedged at 100% CPU

trh411 said:
Do things improve if you remove the last line of your /usr/local/etc/syslog-ng.conf file?
Code:
destination allusers { usertty("*"); };
Good idea, it does not seem to be doing anything at the moment. As far as I know, the 'destination' lines just define the output, so I started out by commenting out the 'log' line that uses it instead:
Code:
# log { source(src); filter(f_emerg); destination(allusers); };
I stopped then started the service:
Code:
citadel 323 [ROOT] service -v syslog-ng stop
syslog-ng is located in /usr/local/etc/rc.d
Stopping syslog_ng.
Waiting for PIDS: 93520.

# Just to be sure...
citadel 326 [ROOT] service -v syslog-ng stop
syslog-ng is located in /usr/local/etc/rc.d
syslog_ng not running? (check /var/run/syslog.pid).

citadel 327 [ROOT] service -v syslog-ng start
syslog-ng is located in /usr/local/etc/rc.d
Starting syslog_ng.
I quickly started top, and watched CPU usage steadily climb over a period of about 15-20 seconds to 100%. It's been ten minutes and it's stayed pegged at 100%. I went ahead and commented out the destination line as well (in case it "set aside" IO or some-such), stopped/started, but again it quickly climbed to 100% and stayed there.
 
Re: syslog-ng wedged at 100% CPU

Well, I thought it was worth a try. Sorry it did not help.

When viewing the output of top, does the RES remain fairly constant as the WCPU increases for syslog-ng? After 5-10 minutes, does RES show any significant change?
 
Re: syslog-ng wedged at 100% CPU

trh411 said:
When viewing the output of top, does the RES remain fairly constant as the WCPU increases for syslog-ng? After 5-10 minutes, does RES show any significant change?
RES began at 7276 KB, then very soon fell (trivially) to 7260 KB and held steady there. No frets, thanks for trying to help. I assume I've mis-configured something, or am missing a subtle moving part.

syslog-ng --version reported, amongst other things:
Code:
Error opening plugin module; module='afsocket-notls', error='/usr/local/lib/syslog-ng/libafsocket-notls.so: Undefined symbol "log_transport_tls_new"'
I'm not conversant in TLS, but I'm not (yet) doing anything over the network so it would not seem immediately relevant.

Another issue I've been mulling over is persistent complaints over "my own domain name". I see periodically in /var/log/messages:
Code:
Jan  8 03:01:00 citadel sendmail[38216]: My unqualified host name (citadel) unknown; sleeping for retry
... repeats for 4-5 minutes ...
Jan  8 03:05:24 citadel sendmail[38476]: unable to qualify my own domain name (citadel) -- using short name
... and a similar complaint at boot (where it appears to delay the boot process). I don't know if logging requires lookup of the host's host name, and it's spending a lot of time spinning wheels there?
 
Re: syslog-ng wedged at 100% CPU

In a situation like this, although it will be tedious, I would start with an empty configuration (i.e., nothing for syslog-ng to do) and see how it behaves. If it looks okay, then I would give it something to do by adding lines one or two at a time to its configuration and see what ultimately "breaks" it.
 
Re: syslog-ng wedged at 100% CPU

After a night's rest, I realized that syslog-ng probably had a dedicated forum (I really shouldn't be root anytime after 9pm...). I asked there, and someone pointed out that there was a FreeBSD-specific "CPU spinning" bug in the version I had. After figuring out how to upgrade packages, I managed to build the updated version (3.4.6 currently). That seems to have solved the problem! I can't even find the process in top anymore, but it's at 0% in htop:
Code:
  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
89440 root      120   0 18724  3144     0 R  0.0  0.0  0:00.13 htop
89379 root      120   0  102M 10920     0 S  0.0  0.0  0:00.03 /usr/local/sbin/syslog-ng -p /var/run/syslog.pid
89378 root      128   0 33692  4952     0 S  0.0  0.0  0:00.00 /usr/local/sbin/syslog-ng -p /var/run/syslog.pid
I double-checked that it's also making log entries. Thanks everyone for your help and ideas! Special thanks to balabit at the syslog-ng forums for catching the version-specific bug.
 
Back
Top