spamd (spam trapping) crashes

ironmikie · Feb 16, 2010

I am running FreeBSD 7.2-STABLE with spamd from OpenBSD. I use it for spam trapping (greylisting etc.).

I notice that spamd sometimes crashes, which causes unknown SMTP connections to be rejected. After watching spamd.log for a while, I know which error is responsible for this behaviour. Every time the following error pops up in the log, spamd exits.

<snip>
greyreader failed (No such file or directory)
</snip>

I still have no clue what to do about this error. After reading the source I see that "greyreader" is a function. I am not that good at reading C, so can anybody help with this one?

Thanks in advance.

malster · Jul 23, 2010

Hi ironmikie,

How did you go with this? I have the same problem and can't track down any info...

Cheers,

Mal

Klinki · Aug 27, 2010

Just had the same error. For me it was a typo in rc.conf at obspamd_flags="..."

Hope it helps!

ohauer · Sep 4, 2010

ironmikie said:
I am running FreeBSD 7.2-STABLE with spamd from OpenBSD. I use it for spam trapping (greylisting etc.).

I notice that spamd sometimes crashes, which causes unknown SMTP connections to be rejected. After watching spamd.log for a while, I know which error is responsible for this behaviour. Every time the following error pops up in the log, spamd exits.

<snip>
greyreader failed (No such file or directory)
</snip>

I still have no clue what to do about this error. After reading the source I see that "greyreader" is a function. I am not that good at reading C, so can anybody help with this one?

Thanks in advance.

Do you run CPANEL or something similar?

If something sends a signal to spamd, you will see a message like the one you noticed.
Please build spamd with the parameter -DCPANEL; the resulting binary will then be obspamd and not spamd.

You can run the port for a few days without redirecting traffic to spamd to see if the process dies again.
Please test this both with and without the build parameter -DCPANEL; if the process still dies, open a PR.

val · Dec 1, 2011

Yes, I have the same problem. obspamd dies unexpectedly almost every hour.

Code:

FreeBSD 8.2-Stable

gate ~# pfctl -t spamd-white -T show | wc -l
      41 
gate ~# pfctl -t blocked -T show | wc -l
   95413

Maybe obspamd can't work with large pf tables?

BTW, I don't have CPANEL, and there are no typos in obspamd_flags= etc.

How can I debug this problem?
Any ideas?

ohauer · Dec 2, 2011

Hm, almost every hour sounds like spamd gets a signal from the outside.

The largest spamdb I've seen was ~900MB with a few hundred thousand entries on an 8.x x64 machine.

Do you have any log entries?
Can you check if there is something running every hour (for example cron jobs, spamd-setup, ...)?

val · Dec 2, 2011

Not exactly every hour. I can't find a correlation with any cron job. Here are yesterday's statistics from the monitoring daemon:

Code:

2011-12-01 03:17:00 The process obspamd is dead (63508).
2011-12-01 03:17:03 The process obspamd is running (70757).
2011-12-01 04:36:00 The process obspamd is dead (70757).
2011-12-01 04:36:02 The process obspamd is running (72229).
2011-12-01 05:41:00 The process obspamd is dead (72229).
2011-12-01 05:41:02 The process obspamd is running (73479).
2011-12-01 07:10:00 The process obspamd is dead (73479).
2011-12-01 07:10:02 The process obspamd is running (75305).
2011-12-01 09:03:00 The process obspamd is dead (75305).
2011-12-01 09:03:02 The process obspamd is running (77987).
2011-12-01 10:17:00 The process obspamd is dead (77987).
2011-12-01 10:17:03 The process obspamd is running (82077).
2011-12-01 10:47:00 The process obspamd is dead (82077).
2011-12-01 10:47:02 The process obspamd is running (83502).
2011-12-01 11:20:00 The process obspamd is dead (83502).
2011-12-01 11:20:02 The process obspamd is running (85314).
2011-12-01 16:04:00 The process obspamd is dead (95001).
2011-12-01 16:04:02 The process obspamd is running (9791).
2011-12-01 17:03:00 The process obspamd is dead (9791).
2011-12-01 17:03:03 The process obspamd is running (15587).
2011-12-01 18:02:00 The process obspamd is dead (15587).
2011-12-01 18:02:02 The process obspamd is running (20956).
2011-12-01 19:17:00 The process obspamd is dead (20956).
2011-12-01 19:17:02 The process obspamd is running (27318).

A cron job checks service availability every minute.
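
To judge whether the crashes follow a schedule, the gaps between the "is dead" entries can be computed directly. The following is only a rough sketch: the sample lines are the first five deaths copied from the log above, and the function names `death_times` and `gaps_in_minutes` are made up for this illustration. Because the check runs once a minute, each timestamp is accurate only to the minute.

```python
from datetime import datetime

# First five "is dead" lines copied from the watchdog log above.
LOG = """\
2011-12-01 03:17:00 The process obspamd is dead (63508).
2011-12-01 04:36:00 The process obspamd is dead (70757).
2011-12-01 05:41:00 The process obspamd is dead (72229).
2011-12-01 07:10:00 The process obspamd is dead (73479).
2011-12-01 09:03:00 The process obspamd is dead (75305).
"""


def death_times(text):
    """Return the timestamp of every 'is dead' line in the log text."""
    times = []
    for line in text.splitlines():
        if "is dead" in line:
            stamp = line[:19]  # leading "YYYY-MM-DD HH:MM:SS"
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    return times


def gaps_in_minutes(times):
    """Return the whole minutes between consecutive timestamps."""
    return [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(gaps_in_minutes(death_times(LOG)))
```

For these five lines the gaps are 79, 65, 89 and 113 minutes, which does not look like a fixed hourly job.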

val · Dec 2, 2011

In messages

Code:

Dec  1 09:02:11 orion kernel: pid 75305 (spamd), uid 132: exited on signal 11
Dec  1 09:02:11 orion spamd[75308]: greyreader failed (No such file or directory)
Dec  1 09:03:00 orion spamd[77987]: listening for incoming connections.
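
A small sketch for pulling the process IDs out of such lines, to see which process received the signal and which one logged the greyreader error. The regular expressions are written only against the three sample lines shown here (with the forum colour markup removed) and are an assumption about the general log format; the function names are hypothetical.

```python
import re

# The three lines from /var/log/messages quoted above, without forum markup.
MESSAGES = """\
Dec  1 09:02:11 orion kernel: pid 75305 (spamd), uid 132: exited on signal 11
Dec  1 09:02:11 orion spamd[75308]: greyreader failed (No such file or directory)
Dec  1 09:03:00 orion spamd[77987]: listening for incoming connections.
"""

# Kernel report of a process killed by a signal: pid, command name, signal number.
SIGNAL_RE = re.compile(r"kernel: pid (\d+) \(([^)]+)\), uid \d+: exited on signal (\d+)")
# spamd's own syslog line about the greyreader failure: pid of the logging process.
GREY_RE = re.compile(r"spamd\[(\d+)\]: greyreader failed")


def crashed_pids(text):
    """Map each pid that died from a signal to the signal number."""
    return {int(m.group(1)): int(m.group(3)) for m in SIGNAL_RE.finditer(text)}


def greyreader_pids(text):
    """List the pids that logged 'greyreader failed'."""
    return [int(m.group(1)) for m in GREY_RE.finditer(text)]


if __name__ == "__main__":
    print("killed by signal:", crashed_pids(MESSAGES))
    print("logged greyreader failure:", greyreader_pids(MESSAGES))
```

In this sample, signal 11 hit pid 75305, while "greyreader failed" was logged by pid 75308. That may mean the greyreader message is a follow-on symptom reported by another spamd process rather than the original fault, but three log lines are not enough to be sure.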

graudeejs · Dec 2, 2011

I think I used to have a similar problem when I was running my server.
But it crashed perhaps once a week, or a few times a month at most.

(I was using ipfw)

val · Dec 2, 2011

In my case pf is used.

ohauer · Dec 2, 2011

Code:

2011-12-01 18:02:00 The process obspamd is dead (15587).
2011-12-01 18:02:02 The process obspamd is running (20956).
2011-12-01 19:17:00 The process obspamd is dead (20956).
2011-12-01 19:17:02 The process obspamd is running (27318).

Hm, the time between failing and running again is really short.
Are you perhaps monitoring the proctitle, which changes while the (internal) greyreader process runs?

val said:

In messages

Code:

Dec  1 09:02:11 orion kernel: pid 75305 (spamd), uid 132: exited on signal 11
Dec  1 09:02:11 orion spamd[75308]: greyreader failed (No such file or directory)
Dec  1 09:03:00 orion spamd[77987]: listening for incoming connections.

Can you try starting with a fresh spamdb? If greyreader fails, the cause is usually a broken pipe or a corrupt database.

val · Dec 5, 2011

ohauer said:
Hm, the time between failing and running again is really short.

That is because a cron task runs every minute and restarts the dead daemon.

ohauer said:
Are you perhaps monitoring the proctitle, which changes while the (internal) greyreader process runs?

I'm ready to try, but I have no good idea how to implement this.
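
One way to keep a monitoring check independent of the process title is to test whether a recorded pid still exists, instead of matching on the name. This is only a sketch under assumptions: the pid file path below is hypothetical (nothing in this thread says where, or whether, obspamd writes one), and `read_pid` and `pid_alive` are names invented for this example. It relies on POSIX `kill` with signal 0, which performs the existence check without delivering a signal.

```python
import os


def read_pid(pidfile):
    """Return the pid stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def pid_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the daemon's pid is recorded.
    pid = read_pid("/var/run/obspamd.pid")
    print("running" if pid is not None and pid_alive(pid) else "dead")
```

A check like this does not care what the process calls itself at the moment, so a changing proctitle cannot produce a false "dead" report.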

ohauer said:
Can you try starting with a fresh spamdb? If greyreader fails, the cause is usually a broken pipe or a corrupt database.

No success.

val · Dec 6, 2011

OK, I think there was memory corruption in spamd.c, approximately at this place or just after it:

Code:

syslog_r(LOG_DEBUG, ... "Body: %s", cp->addr, p);

Maybe it is due to the compiler flags or ...?
PS: If spamd is compiled with the debug flag, the error doesn't occur.

ohauer · Dec 8, 2011

val said:
OK, I think there was memory corruption in spamd.c, approximately at this place or just after it:

Code:

syslog_r(LOG_DEBUG, ... "Body: %s", cp->addr, p);

Maybe it is due to the compiler flags or ...?
PS: If spamd is compiled with the debug flag, the error doesn't occur.

This sounds strange ... (never happened to me)

Have you used non-default optimization flags before?

val · Dec 8, 2011

The problem occurs when the compiler optimization flag -O2 is used.
Snippet from my make.conf:

Code:

CPUTYPE?=pentium3
CFLAGS=-O2 -pipe
COPTFLAGS=-O2 -pipe

With this make.conf, signal 11 doesn't occur:

Code:

CPUTYPE?=pentium3
CFLAGS=-O -pipe
COPTFLAGS=-O2 -pipe

feld · Jan 20, 2012

val said:
The problem occurs when the compiler optimization flag -O2 is used.
Snippet from my make.conf:

Code:

CPUTYPE?=pentium3
CFLAGS=-O2 -pipe
COPTFLAGS=-O2 -pipe

With this make.conf, signal 11 doesn't occur:

Code:

CPUTYPE?=pentium3
CFLAGS=-O -pipe
COPTFLAGS=-O2 -pipe

If this works, I'll contact the maintainer. Also, there needs to be a proper knob for renaming the binary to obspamd.

kpa · Jan 20, 2012

Those flags should go into the Makefile of the port; setting them unconditionally in /etc/make.conf is an error. Setting CPUTYPE is also an error unless you're cross-compiling for a different machine with a different type of CPU.

feld · Feb 2, 2012

Compiling with CFLAGS=-O -pipe in the port's Makefile did not help. I had my crash again last night. I haven't tested with debug yet, but that seems a bit extreme.

Ben · Jul 4, 2012

I am facing the problem that obspamd-update -b crashes my FreeBSD 9.0-p3 AMD64 machine completely. It loads 170,000 or more IPs into a local table.

Might this issue be related?