dovecot lmtp: error reading configuration

Hello,

I am getting the following error on my dovecot LMTP server, upon delivery from my postfix mailserver:

Oct 6 03:31:09 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)



This results in mail being deferred by postfix (different host), and successfully delivered at the next try:

Oct 6 03:31:10 mailer postfix/lmtp[44835]: 46F0BA329: to=<d@l.ynx.fr>, orig_to=<root>, relay=192.168.1.101[192.168.1.101]:24, delay=17, delays=0.2/0.01/16/0, dsn=4.4.2, status=deferred (lost connection with 192.168.1.101[192.168.1.101] while receiving the initial server greeting)

Any idea where it comes from?

I am running dovecot 2.3.21.1 (d492236fa0) on FreeBSD 14.1-RELEASE-p5.

Here is my configuration:

Code:
# 2.3.21.1 (d492236fa0): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.21.1 (49005e73)
# OS: FreeBSD 14.1-RELEASE-p3 amd64  
# Hostname: m.ynx.fr
auth_mechanisms = cram-md5 SCRAM-SHA-256
auth_verbose = yes
listen = *
mail_location = mdbox:~/mdbox
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
namespace inbox {
  inbox = yes
  location = 
  mailbox Drafts {
    auto = create
    special_use = \Drafts
  }
  mailbox Junk {
    auto = create
    special_use = \Junk
  }
  mailbox Sent {
    auto = create
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    auto = create
    special_use = \Trash
  }
  prefix = 
}
passdb {
  args = scheme=CRYPT username_format=%u /usr/local/etc/dovecot/users
  driver = passwd-file
}
protocols = imap lmtp sieve
service auth {
  inet_listener {
    port = 144
  }
}
service lmtp {
  inet_listener lmtp {
    port = 24
  }
  user = vmail
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
}
ssl = no
userdb {
  args = uid=vmail gid=vmail home=/home/vmail/%u
  driver = static
}
protocol lmtp {
  mail_plugins = " sieve"
}
protocol imap {
  mail_max_userip_connections = 100
}
 
Sorry there was a mistake during copy-paste. I fixed it.

As far as I know this prefix can be empty. It's just something that is prepended to the file paths.

The server works 99% of the time. I receive emails just fine it just takes some extra time for a couple of messages a day though. So I don't think it's a config error or it would not work at all.
 
On my system, /var/run/dovecot/dovecot.conf is a symlink.
Code:
ls -l /var/run/dovecot/dovecot.conf 
lrwx------  1 root  wheel  35 Sep 30 17:00 /var/run/dovecot/dovecot.conf -> /usr/local/etc/dovecot/dovecot.conf
 
It's /var/run/dovecot/config not dovecot.conf
srw------- 1 root wheel 0 Sep 22 14:20 config

Most likely the server is busy or out of resources (RAM/CPU/IOps etc.) and it timeout the process.
 
What is the log above this message:


Post the entire process log of 36551

Also check this archive:

There is nothing right before. Older logs are unrelated (imap):
Oct 6 00:25:48 m dovecot[36551]: lmtp(82615): Disconnect from 192.168.1.29: Logged out (state=READY)
Oct 6 01:02:47 m dovecot[36551]: imap(d@l.ynx.fr)<70580><OnJscbMjJLTAqMcH>: Disconnected: Inactivity - no input for 1800 secs in=1064 out=15279 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
Oct 6 01:02:48 m dovecot[36551]: imap(d@l.ynx.fr)<70578><jRHicLMjRKjAqMcH>: Disconnected: Inactivity - no input for 1800 secs in=2531 out=56107 deleted=0 expunged=0 trashed=1 hdr_count=2 hdr_bytes=1150 body_count=2 body_bytes=31053
Oct 6 01:45:18 m dovecot[36551]: imap-login: Login: user=<d@l.ynx.fr>, method=CRAM-MD5, rip=192.168.199.7, lip=192.168.1.101, mpid=83507, session=<JqNOYcMjGJbAqMcH>
Oct 6 02:15:27 m dovecot[36551]: imap(d@l.ynx.fr)<83507><JqNOYcMjGJbAqMcH>: Disconnected: Inactivity - no input for 1800 secs in=1723 out=13703 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
Oct 6 02:46:54 m dovecot[36551]: imap-login: Login: user=<d@l.ynx.fr>, method=CRAM-MD5, rip=192.168.199.7, lip=192.168.1.101, mpid=84183, session=<DTqWPcQj1JPAqMcH>
Oct 6 03:17:08 m dovecot[36551]: imap(d@l.ynx.fr)<84183><DTqWPcQj1JPAqMcH>: Disconnected: Inactivity - no input for 1800 secs in=1723 out=13703 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=0 body_bytes=0
Oct 6 03:31:09 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)



Is it a very busy server? In particular, on those few occasions when you see the error? My guess would be that when the lmtp process tries to ask the config process what mailbox to write to, it's just timing out.

<https://doc.dovecot.org/main/developers/design/processes.html#config-process>

It is not busy at all. Personal mailbox. Like between 5 and 30 mails per day. BUT, this seems to happen when two messages are delivered concurrently.
I can see that in the postfix logs, it's not just one mail, it's actually two of them that get deferred. Here is the complete postfix log for these two messages:

Oct 6 03:30:42 mailer postfix/pickup[42170]: 5D768A327: uid=0 from=<root>
Oct 6 03:30:44 mailer postfix/cleanup[44788]: 5D768A327: message-id=<20241006013042.5D768A327@mailer.daserv.fr>
Oct 6 03:30:51 mailer postfix/pickup[42170]: CE682A2EC: uid=0 from=<root>
Oct 6 03:30:51 mailer postfix/cleanup[44788]: CE682A2EC: message-id=<20241006013051.CE682A2EC@mailer.daserv.fr>
Oct 6 03:30:51 mailer postfix/qmgr[1599]: 5D768A327: from=<root@mailer.daserv.fr>, size=1051, nrcpt=1 (queue active)
Oct 6 03:30:52 mailer postfix/qmgr[1599]: CE682A2EC: from=<root@mailer.daserv.fr>, size=1927, nrcpt=1 (queue active)
Oct 6 03:30:53 mailer postfix/cleanup[44788]: 0AABDA1DD: message-id=<20241006013042.5D768A327@mailer.daserv.fr>
Oct 6 03:30:53 mailer postfix/local[44833]: 5D768A327: to=<root@mailer.daserv.fr>, orig_to=<root>, relay=local, delay=13, delays=12/1.1/0/0.25, dsn=2.0.0, status=sent (forwarded as 0AABDA1DD)
Oct 6 03:30:53 mailer postfix/qmgr[1599]: 0AABDA1DD: from=<root@mailer.daserv.fr>, size=1607, nrcpt=1 (queue active)
Oct 6 03:30:53 mailer postfix/qmgr[1599]: 5D768A327: removed
Oct 6 03:30:53 mailer postfix/cleanup[44788]: 46F0BA329: message-id=<20241006013051.CE682A2EC@mailer.daserv.fr>
Oct 6 03:30:53 mailer postfix/qmgr[1599]: 46F0BA329: from=<root@mailer.daserv.fr>, size=2483, nrcpt=1 (queue active)
Oct 6 03:30:53 mailer postfix/qmgr[1599]: CE682A2EC: removed
Oct 6 03:30:53 mailer postfix/local[44833]: CE682A2EC: to=<root@mailer.daserv.fr>, orig_to=<root>, relay=local, delay=4, delays=3.5/0.35/0/0.2, dsn=2.0.0, status=sent (forwarded as 46F0BA329)
Oct 6 03:31:10 mailer postfix/lmtp[44835]: 46F0BA329: to=<d@l.ynx.fr>, orig_to=<root>, relay=192.168.1.101[192.168.1.101]:24, delay=17, delays=0.2/0.01/16/0, dsn=4.4.2, status=deferred (lost connection with 192.168.1.101[192.168.1.101] while receiving the initial server greeting)
Oct 6 03:31:10 mailer postfix/lmtp[44834]: 0AABDA1DD: to=<d@l.ynx.fr>, orig_to=<root>, relay=192.168.1.101[192.168.1.101]:24, delay=17, delays=0.25/0.04/17/0, dsn=4.4.2, status=deferred (lost connection with 192.168.1.101[192.168.1.101] while receiving the initial server greeting)
Oct 6 03:40:42 mailer postfix/qmgr[1599]: 46F0BA329: from=<root@mailer.daserv.fr>, size=2483, nrcpt=1 (queue active)
Oct 6 03:40:43 mailer postfix/qmgr[1599]: 0AABDA1DD: from=<root@mailer.daserv.fr>, size=1607, nrcpt=1 (queue active)
Oct 6 03:40:48 mailer postfix/lmtp[45285]: 46F0BA329: to=<d@l.ynx.fr>, orig_to=<root>, relay=192.168.1.101[192.168.1.101]:24, delay=595, delays=589/1.1/0.03/5.1, dsn=2.0.0, status=sent (250 2.0.0 <d@l.ynx.fr> /BQEDpvqAWcoUgEANae01Q Saved)
Oct 6 03:40:48 mailer postfix/lmtp[45286]: 0AABDA1DD: to=<d@l.ynx.fr>, orig_to=<root>, relay=192.168.1.101[192.168.1.101]:24, delay=595, delays=590/0.02/0.44/4.7, dsn=2.0.0, status=sent (250 2.0.0 <d@l.ynx.fr> Jx+QJ5vqAWcpUgEANae01Q Saved)
Oct 6 03:40:48 mailer postfix/qmgr[1599]: 46F0BA329: removed
Oct 6 03:40:48 mailer postfix/qmgr[1599]: 0AABDA1DD: removed


What I don't get is that it's just two e-mails. Dovecot servers in production are probably processing tens of thousands of emails per minutes in corporate setups. So there must be something wrong with my conf. But what?

From the error message, and the documentation linked by putney, it looks like setting the CONFIG_FILE environment variable could help avoid this. But I cannot even find a reference to that environment variable in the dovecot documentation.
 
Can you check if this happen at high disk load during the same time period on different days. There may be periodic task at 3:00 which may search your disk and cause high disk usage and slowdown. Look for some time patterns

bzgrep "/var/run/dovecot/config" /var/log/maillog.*.bz2
grep "/var/run/dovecot/config" /var/log/maillog
OR
bzgrep "Fatal" /var/log/maillog.*.bz2
grep "Fatal" /var/log/maillog
 
Thank you for taking the time to look into this with me.

It happens every few days.

root@m:~ # bzgrep "/var/run/dovecot/config" /var/log/maillog*
/var/log/maillog.1.bz2:Oct 6 03:31:09 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
/var/log/maillog.2.bz2:Oct 5 03:07:18 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
/var/log/maillog.2.bz2:Oct 5 03:16:05 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
/var/log/maillog.5.bz2:Oct 2 03:29:50 m dovecot[36551]: lmtp: Fatal: Error reading configuration: read(/var/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /var/run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)


These e-mails seem to usually come from the periodic jobs, but not always. The ones on October 5 were regular e-mail from an external website. But yes it would be while periodic jobs are running on this host / jail, I think.
 
So it's between 3:00 and 4:00. By default periodic daily is started at 3:01. You may try to start the daily job manually so you can observe if the high disc usage is the cause of the failed dovecot. This can be run manually using periodic daily command.

You can view the daily scripts in /etc/periodic/daily and if you need to enable/disable some of them you can create /etc/periodic.conf.local where you can place your settings for example
daily_ntpd_leapfile_enable="no"
daily_clean_disks_enable="no"
daily_status_security_enable="no"


periodic(8)

What filesystem you are using ZFS or UFS?
 
I mean yes I could disable the periodic jobs, but that's a poor workaround. I wish there was something I could do at the dovecot level to avoid the crash instead.

This host (dovecot is running in a jail) is running ZFS.
 
Message seems to say it crashes, partly because the CONFIG_FILE variable is not pointing to the right path. I'll try to set that to point to the actual configuration file and see if it helps.
 
Ah you mean it's a collision of all periodic jobs running at the same time (this host runs like 5 or 6 jails) that might be causing this. That's a very good point. I will see how it goes with the environment variable I mentioned and if I encounter the issue again I'll try spacing out the periodic runs. Thank you!
 
Back
Top