Can't start Postgrey

xy16644 · Apr 11, 2013

Hello

This morning I applied the latest security fixes to my FreeBSD 9.1 server. I run a custom kernel so I do this from source. After running the updates I am now running:

Code:

FreeBSD 9.1-RELEASE-p2 #2 r249052:

One strange thing happened after the updates...my Postgrey service will not start:

Code:

/usr/local/etc/rc.d/postgrey status
postgrey is not running.

If I try to start the service it says it is starting the service but when I query it again (as above) it says it is not running. This means that I can't receive email (but can send).

According to /var/log/maillog:

Code:

postfix/smtpd[5380]: warning: connect to 127.0.0.1:10023: Connection refused
postfix/smtpd[5380]: warning: problem talking to server 127.0.0.1:10023: Connection refused

This is what my /var/db/postgrey/ looks like:

Code:

-rw-------  1 root  postgrey     24576 Apr 11 12:50 __db.001
-rw-------  1 root  postgrey     49152 Apr 11 12:50 __db.002
-rw-------  1 root  postgrey    270336 Apr 11 12:50 __db.003
-rw-------  1 root  postgrey     98304 Apr 11 12:50 __db.004
-rw-------  1 root  postgrey     49152 Apr 11 12:50 __db.005
-rw-------  1 root  postgrey  10485760 Apr 11 12:50 log.0000000002
-rw-------  1 root  postgrey    458752 Apr 11 12:50 postgrey.db
-rw-------  1 root  postgrey         0 Apr 11 12:50 postgrey.lock
-rw-------  1 root  postgrey    147456 Apr 11 12:50 postgrey_clients.db

Does anyone know how I can get Postgrey to run again so that I can receive emails again? I haven't changed the configuration at all and have only applied the latest FreeBSD security updates to the operating system.

I am most baffled why this happened! Thanks for any help.

xy16644 · Apr 11, 2013

I also tried reinstalling the postgrey port but this didn't seem to make a difference.

Does anyone have any ideas I could try?

Savagedlight · Apr 11, 2013

Verify that postgrey is indeed not running with # ps -aux | grep postgrey, then remove the lock file on the database with # rm /var/db/postgrey/postgrey.lock, and try starting it again.

xy16644 · Apr 11, 2013

Savagedlight said:
Verify that postgrey is indeed not running with # ps -aux | grep postgrey, then remove the lock file on the database with # rm /var/db/postgrey/postgrey.lock, and try starting it again.

# ps -aux | grep postgrey returns nothing, so postgrey is definitely not running.

I also deleted the /var/db/postgrey/postgrey.lock but I still can't start the postgrey service.

Running # sockstat -4 | egrep 10023 also returns nothing!

Thanks for your help. Any other ideas?

xy16644 · Apr 11, 2013

After running # ps -aux | grep postgrey again after deleting the lock file I now get:

Code:

root      98624   0.0  0.0   9636   1664  0  S+   10:18PM    0:00.00 grep postgrey

Still can't start the service though!

fonz · Apr 11, 2013

xy16644 said:
After running # ps -aux | grep postgrey again after deleting the lock file I now get:

Code:

root      98624   0.0  0.0   9636   1664  0  S+   10:18PM    0:00.00 grep postgrey

Consider that no match. When piping the output of ps(1) to grep(1), chances are good that the grep itself shows up in the result and that's exactly what happened here. Typically, if it's really unwanted (possibly considering further processing), one does something like % ps aux|grep -v grep|grep foo.
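
As a minimal sketch of that filter, the snippet below applies it to the single ps line captured in this thread rather than to a live process table. The helper name count_matches is made up for illustration; on a real system the input would come from ps aux instead of the sample variable.

```shell
#!/bin/sh
# Count ps-style lines on stdin that mention a name, ignoring grep itself.
# count_matches is a hypothetical helper, not part of any tool.
count_matches() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps that from being treated as a failure.
    grep -v 'grep' | grep -c -- "$1" || true
}

# The only line ps returned in the thread: the grep process itself.
sample='root      98624   0.0  0.0   9636   1664  0  S+   10:18PM    0:00.00 grep postgrey'

result=$(printf '%s\n' "$sample" | count_matches postgrey)
echo "postgrey processes: $result"
```

On a live system the same helper could be fed with ps aux | count_matches postgrey.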

Savagedlight · Apr 11, 2013

May I suggest checking the usual suspects? /var/log/messages, # dmesg, or some log files related to the service?

Oh, and I just noticed... all the files in /var/db/postgrey are owned by root:postgrey, but the group has no access. Any reason why? Have you tried giving the group access? I find it unlikely that this should break due to security patching, but you never know..
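
One way to spot such files quickly is to let find(1) list everything the owning group cannot read. This is only a sketch: the function name no_group_read and the DBDIR variable are made up here, and the default path is simply the directory shown earlier in the thread.

```shell
#!/bin/sh
# List regular files under a directory that the owning group cannot read.
# DBDIR defaults to the path from the thread; override it to check elsewhere.
DBDIR=${DBDIR:-/var/db/postgrey}

no_group_read() {
    # "-perm -040" is true when the group-read bit is set; "!" inverts it.
    find "$1" -type f ! -perm -040
}

# Only run the check if the directory actually exists on this machine.
if [ -d "$DBDIR" ]; then
    no_group_read "$DBDIR"
fi
```

If the daemon runs as the postgrey user, changing the owner with chown(8) is the other obvious thing to try, as opposed to widening the mode.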

xy16644 · Apr 12, 2013

Savagedlight said:
May I suggest checking the usual suspects? /var/log/messages, # dmesg, or some log files related to the service?

Oh, and I just noticed... all the files in /var/db/postgrey are owned by root:postgrey, but the group has no access. Any reason why? Have you tried giving the group access? I find it unlikely that this should break due to security patching, but you never know..

Thanks for the suggestions. I have checked the log files you mentioned as well as /var/log/maillog. The maillog was the most useful saying that the connection was refused (due to the service not running) but I still can't figure out WHY the service won't start!

Good spot re the permissions on /var/db/postgrey. I have changed the ownership so that the files now read:

Code:

-rw-------  1 postgrey  postgrey     24576 Apr 11 12:50 __db.001
-rw-------  1 postgrey  postgrey     49152 Apr 11 12:50 __db.002
-rw-------  1 postgrey  postgrey    270336 Apr 11 12:50 __db.003
-rw-------  1 postgrey  postgrey     98304 Apr 11 12:50 __db.004
-rw-------  1 postgrey  postgrey     49152 Apr 11 12:50 __db.005
-rw-------  1 postgrey  postgrey  10485760 Apr 11 12:50 log.0000000002
-rw-------  1 postgrey  postgrey    458752 Apr 11 12:50 postgrey.db
-rw-------  1 postgrey  postgrey    147456 Apr 11 12:50 postgrey_clients.db

Despite this the service still won't start...

wblock@ · Apr 12, 2013

fonz said:
Consider that no match. When piping the output of ps(1) to grep(1), chances are good that the grep itself shows up in the result and that's exactly what happened here. Typically, if it's really unwanted (possibly considering further processing), one does something like % ps aux|grep -v grep|grep foo.

Use pgrep(1), which is made just for this.

wblock@ · Apr 12, 2013

After upgrading from source, the mergemaster(8) step may have overwritten sendmail(8) config files in /etc/mail. No idea what Postfix needs, but given that the mail server won't talk to the filter, that's a good place to look.

For sendmail(8), reintegrating custom changes to /etc/mail/hostname.mc and then doing a # make all install restart is needed. Postfix is probably similar.

xy16644 · Apr 12, 2013

wblock@ said:
After upgrading from source, the mergemaster(8) step may have overwritten sendmail(8) config files in /etc/mail. No idea what Postfix needs, but given that the mail server won't talk to the filter, that's a good place to look.

For sendmail(8), reintegrating custom changes to /etc/mail/hostname.mc and then doing a # make all install restart is needed. Postfix is probably similar.

I assumed Postfix wasn't the problem here? The reason I say that is that I have removed the postgrey configuration from my main.cf file and I can now send and receive mail fine (i.e., postgrey isn't being used with my Postfix setup now).

The way I understand it is, until I get the postgrey service to start again, I can't use postgrey with Postfix.

I should also mention that, during the steps I followed after the reboot:

Code:

# reboot
# cd /usr/src
# make installworld
# mergemaster -iU
# reboot

I received the following error while running # make installworld:

Code:

===> usr.sbin/sendmail (install)
install -s -o root -g smmsp -m 2555   sendmail /usr/libexec/sendmail
install: sendmail: No such file or directory
*** [_proginstall] Error code 71

Stop in /usr/src/usr.sbin/sendmail.
*** [realinstall] Error code 1

Stop in /usr/src/usr.sbin.
*** [realinstall] Error code 1

Stop in /usr/src.
*** [reinstall] Error code 1

Stop in /usr/src.
*** [installworld] Error code 1

Stop in /usr/src.
*** [installworld] Error code 1

Stop in /usr/src.

Could this be related to the problems I am experiencing?

wblock@ · Apr 12, 2013

You may have disabled building sendmail(8) in /etc/src.conf. That is not a problem, Postfix replaces it. But it should not cause an error in installworld.

A configuration file overwritten or incorrectly merged by mergemaster(8) is still a very likely problem.

xy16644 · Apr 12, 2013

wblock@ said:
A config file overwritten or incorrectly merged by mergemaster(8) is still a very likely problem.

I'm afraid I'm not sure how to proceed from here. Do I run mergemaster again?

wblock@ · Apr 12, 2013

First, get installworld to complete successfully. I don't know what that problem is; it would be best handled in another thread.

After that, enable logging or repeat the process of configuring things to use the postgrey filter.

ShelLuser · Apr 18, 2013

Falling in a bit late but even so...

The first thing I usually try, especially when there's nothing to be found in the logs, is starting the daemon manually. So not using the rc.d structure, but the executable itself. In this particular case you'd definitely want to use the -v flag too (verbose) and optionally --syslog-facility (if available).

In a lot of cases daemons send their errors to the terminal (stdout or stderr), especially when dealing with startup problems. This approach can help you to easily see those.
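
A rough sketch of what that could look like for postgrey follows. The binary path, the --inet value, and the database directory are taken from elsewhere in this thread; the --user value is an assumption, and the rc.d script on a given system may pass different options, so compare with /usr/local/etc/rc.d/postgrey first. The wrapper name run_foreground is made up.

```shell
#!/bin/sh
# Start postgrey by hand, in the foreground, with verbose output.
# POSTGREY can be overridden if the binary lives somewhere else.
POSTGREY=${POSTGREY:-/usr/local/sbin/postgrey}

run_foreground() {
    # No -d (daemonize) here, so postgrey stays attached to the terminal
    # and any startup error is printed where you can see it.
    "$POSTGREY" -v --inet=10023 --user=postgrey --dbdir=/var/db/postgrey
}

if [ -x "$POSTGREY" ]; then
    run_foreground
else
    echo "postgrey not found at $POSTGREY" >&2
fi
```

Run it as root in a spare terminal and stop it with Ctrl-C once the error has shown up.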

Hope this can help too.

galaxsat · Apr 21, 2013

I discovered this same problem after installing security updates yesterday. I haven't completely solved it yet, but I can tell you it is related to postgrey trying to bind to all localhost interfaces. Changing the option from the default --inet=10023 to --inet=127.0.0.1:10023 will allow postgrey to be started in the foreground, but I cannot yet get it to daemonize.

Running it in the foreground without specifying the IP yields:

Code:

Resolved [localhost]:10023 to [::1]:10023, IPv6
Resolved [localhost]:10023 to [127.0.0.1]:10023, IPv4
Binding to TCP port 10023 on host ::1 with IPv6
Insecure dependency in socket while running with -T switch at /usr/local/lib/perl5/5.12.4/mach/IO/Socket.pm line 80.
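
For anyone who wants to try the explicit IPv4 address through the rc.d script rather than by hand, the fragment below shows roughly how that might be set. It assumes the port's rc script reads a postgrey_flags variable, which should be confirmed in /usr/local/etc/rc.d/postgrey, and as noted above this change alone did not get daemon mode working here.

```shell
# /etc/rc.conf (fragment)
postgrey_enable="YES"
# Bind to the IPv4 loopback address only, instead of the default --inet=10023.
# postgrey_flags is assumed to be the variable the rc script honours.
postgrey_flags="--inet=127.0.0.1:10023"
```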

galaxsat · May 10, 2013

If anyone is still looking for help with this, a temporary solution is to remove the taint switch on line 1 of /usr/local/sbin/postgrey.
Change:

Code:

#!/usr/bin/perl -T -w

To:

Code:

#!/usr/bin/perl -w

xy16644 · Jun 30, 2013

galaxsat said:
If anyone is still looking for help with this, a temporary solution is to remove the taint switch on line 1 of /usr/local/sbin/postgrey.
Change:

Code:

#!/usr/bin/perl -T -w

To:

Code:

#!/usr/bin/perl -w

Genius! That fixed my problem! I am now able to start the postgrey service and everything is working as it should.

Thank you!

Savagedlight · Jun 30, 2013

Apparently, -T is 'taint mode'. This mode makes Perl assume all external input is hostile and refuse to use it in sensitive operations (such as opening sockets or running commands) until it has been validated. I'm sure the author of the script put the -T switch there for a very good reason - do you really trust everyone who speaks to your mail server?

wblock@ · Jun 30, 2013

But now you have compromised security.

Edit: simulpost with @Savagedlight.

xy16644 · Jun 30, 2013

Fair enough but what are my options then? Up until now I have disabled postgrey and my spam count went through the roof.

I'm open to ideas!

wblock@ · Jun 30, 2013

Step 1: search on "postgrey freebsd taint": http://www.perlmonks.org/?node_id=1025751

Step 2 appears to be "remove and reinstall all Perl modules", but read the PR from that URL (http://www.freebsd.org/cgi/query-pr.cgi?pr=177416).

kpa · Jun 30, 2013

I used mail/milter-greylist for a short time when I had my own server handling the mail for my domain. Was painless to set up for both mail/postfix and sendmail(8) (except for some group tweaking with postfix).

xy16644 · Jun 30, 2013

Since I am going to be replacing this server soon I think I will leave as is for now and accept the risk. Not ideal but my current server is quite slow and recompiling ports and dependencies takes forever.

On the new server I am still deciding on what ports to use but I do want it to be secure going forward!

ShelLuser · Jun 30, 2013

I get a very strong feeling that maybe the problem was caused by not following UPDATING while upgrading several ports.

I'm also using postgrey on all my servers and I don't have this problem. Even the server which I installed first, now several months ago, has seen many updates (including the recent Perl update), yet it still runs postgrey without any problems or modifications whatsoever.

Therefore I can only conclude that this has nothing to do with Postgrey but the environment it runs on. So changing the way Perl gets executed doesn't fix a thing here; it only hides some (severe?) problems on your system.