Solved ssh, scp, rsync delay

I have a bunch of machines at home on a LAN, some wired, some wireless. Three of them run FreeBSD -- one 10.3, the other two 11RC2. The rest run openSUSE Linux. I set all the machines up with static IP addresses and recorded those addresses in the /etc/hosts file on each machine.

When scp'ing (or ssh'ing or rsync'ing) from Linux to Linux or from Linux to FreeBSD, using the name rather than the IP address of the target machine, the connection happens instantly. When doing the same from one of the FreeBSD systems, there is a noticeable and very annoying delay. That delay goes away if I provide the IP address instead of the target system's name. I think FreeBSD is doing a DNS lookup of the provided name first, rather than consulting /etc/hosts first. I can't prove that, but it would certainly explain this behavior.

I am aware of the existence of /etc/host.conf. Here's mine (it is the same on all three FreeBSD machines):
Code:
# Auto-generated from nsswitch.conf
hosts
dns
So this appears to be correct.

I have resorted to creating aliases for the inter-machine commands I use most, with the IP addresses embedded in the aliases. It's a workaround, but I shouldn't have to do this. There's clearly a bug here, and it's been present for a long time. (I noticed this in earlier versions of FreeBSD but didn't mention it, because I always ran into some show-stopping bug with those versions and stopped using them; 10.3 has worked well for me, and my limited experience with 11RC2 has been similarly good.)

Any ideas for how to get more information so I can file a bug report? Or a better explanation than mine for what's going on here?

Thanks --
 
What is in /etc/nsswitch.conf? Can you truss ssh to see what's happening during the delay? You should see opens of /etc/nsswitch.conf and /etc/hosts. Does tcpdump -i lo0 port 53 tell you anything?

Juha
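
Alongside truss and tcpdump, one rough way to narrow this down is to time the system resolver on its own, outside of ssh. The sketch below calls Python's socket.getaddrinfo, which should go through the same libc lookup path (and so the same nsswitch.conf ordering) that most programs use. The function name time_lookup and the default port are made up for this illustration.

```python
import socket
import time


def time_lookup(name, port=22):
    """Resolve `name` through the system resolver and time the call."""
    start = time.monotonic()
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    elapsed = time.monotonic() - start
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed
```

Calling time_lookup("somehost") with a name from your own /etc/hosts (somehost is a placeholder) returns the addresses found and the elapsed seconds. If that comes back in milliseconds while ssh to the same name still stalls, the delay is probably not in the plain host lookup.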
 
Code:
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: releng/11.0/etc/nsswitch.conf 301711 2016-06-09 01:28:44Z markj $
#
group: compat
group_compat: nis
hosts: files dns
netgroup: compat
networks: files
passwd: compat
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
This looks right to me, based on the nsswitch.conf man page.

Here's an excerpt from truss output when I ssh'ed to a Linux system by name and encountered the long delay:
Code:
close(4)  = 0 (0x0)
getpid()  = 1570 (0x622)
getpid()  = 1570 (0x622)
getpid()  = 1570 (0x622)
socket(PF_INET,SOCK_DGRAM,17)  = 4 (0x4)
sendto(4,"\0\^T\^A\0\0\^A\0\0\0\0\0\^A\^Ef"...,34,0x0,{ AF_INET 75.75.75.76:53 },0x10) = 34 (0x22) <<<<<<<<<-------------------------------------
select(5,{ 4 },0x0,0x0,{ 5.000000 })  = 0 (0x0)
close(4)  = 0 (0x0)
socket(PF_INET,SOCK_DGRAM,17)  = 4 (0x4)
sendto(4,"\0\^T\^A\0\0\^A\0\0\0\0\0\^A\^Ef"...,34,0x0,{ AF_INET 75.75.75.76:53 },0x10) = 34 (0x22)
select(5,{ 4 },0x0,0x0,{ 5.000000 })  = 0 (0x0)
close(4)  = 0 (0x0)
socket(PF_INET,SOCK_DGRAM,17)  = 4 (0x4)
sendto(4,"\0\^T\^A\0\0\^A\0\0\0\0\0\^A\^Ef"...,34,0x0,{ AF_INET 75.75.75.76:53 },0x10) = 34 (0x22)
select(5,{ 4 },0x0,0x0,{ 5.000000 })  = 0 (0x0)
close(4)  = 0 (0x0)
socket(PF_INET,SOCK_DGRAM,17)  = 4 (0x4)
sendto(4,"\0\^T\^A\0\0\^A\0\0\0\0\0\^A\^Ef"...,34,0x0,{ AF_INET 75.75.75.75:53 },0x10) = 34 (0x22)
select(5,{ 4 },0x0,0x0,{ 5.000000 })  = 1 (0x1)
fcntl(4,F_GETFL,)  = 2 (0x2)
fcntl(4,F_SETFL,O_NONBLOCK|0x2)  = 0 (0x0)
recvfrom(4,"\0\^T\M^A\M-#\0\^A\0\0\0\^F\0\^A"...,65535,0x0,NULL,0x0) = 642 (0x282)
close(4)  = 0 (0x0)
open("/home/dca/.ssh/known_hosts",O_RDONLY,0666) = 4 (0x4)
fstat(4,{ mode=-rw-r--r-- ,inode=23723,size=510,blksize=4096 }) = 0 (0x0)
read(4,"franz ecdsa-sha2-nistp256 AAAAE2"...,4096) = 510 (0x1fe)
read(4,0x802032000,4096)  = 0 (0x0)
close(4)  = 0 (0x0)
open("/home/dca/.ssh/known_hosts2",O_RDONLY,0666) ERR#2 'No such file or directory'
open("/etc/ssh/ssh_known_hosts",O_RDONLY,0666)  ERR#2 'No such file or directory'
open("/etc/ssh/ssh_known_hosts2",O_RDONLY,0666)  ERR#2 'No such file or directory'
write(3,"\0\0\0\f\n\^U\0\0\0\0\0\0\0\0\0"...,16) = 16 (0x10)
getpid()  = 1570 (0x622)
write(3,"y\^Y\a#\M^M\M-p\M-y\M-x\M^J\M-.e"...,44) = 44 (0x2c)
select(4,{ 3 },0x0,0x0,0x0)  = 1 (0x1)
The 'sendto' where I stuck in the arrow is significant, I think. It was there that the delay began. The IP address 75.75.75.76 is one of the two static IP addresses of the DNS servers at Comcast that I use (the other comes up a few lines down -- 75.75.75.75). While I don't know enough about this to say QED, this strikes me as possible proof that something is deciding to do a DNS lookup on the name I provided and it's slow, because the DNS server is not going to find it.
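
For what it's worth, the excerpt shows three select() calls with a 5-second timeout returning 0 after queries to 75.75.75.76, and then a select() returning 1 after the query to 75.75.75.75. A small sketch for adding up that kind of wait from saved truss output (for example from truss -o ssh.truss ssh somehost) might look like this. The regular expressions are written against the line shapes in the excerpt only, and tally_timeouts is a made-up name.

```python
import re

# Matches truss lines such as: select(5,{ 4 },0x0,0x0,{ 5.000000 })  = 0 (0x0)
SELECT_RE = re.compile(r"select\(.*\{\s*([0-9.]+)\s*\}\)\s*=\s*(\d+)")
# Matches the destination in: sendto(...,{ AF_INET 75.75.75.76:53 },0x10)
SENDTO_RE = re.compile(r"sendto\(.*AF_INET\s+([0-9.]+):(\d+)")


def tally_timeouts(truss_lines):
    """Sum select() timeouts that expired (returned 0) per DNS server."""
    waited = {}
    last_server = None
    for line in truss_lines:
        sent = SENDTO_RE.search(line)
        if sent:
            # Only remember destinations on the DNS port.
            last_server = sent.group(1) if sent.group(2) == "53" else None
            continue
        sel = SELECT_RE.search(line)
        if sel:
            if last_server is not None and int(sel.group(2)) == 0:
                waited[last_server] = waited.get(last_server, 0.0) + float(sel.group(1))
            # A select() has been paired with its query; wait for the next sendto.
            last_server = None
    return waited
```

Fed the lines of the excerpt above, this returns {'75.75.75.76': 15.0}: about 15 seconds spent waiting on a server that never answered before the second server replied.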
 
I have stripped it down to
Code:
group: files
hosts: files dns
networks: files
passwd: files
shells: files
services: files
protocols: files
rpc: files
Don't know what benefit the compat lines would give, been happy this way.

Trying to ssh the local host, I see this
Code:
stat("/etc/nsswitch.conf",{ mode=-rw-r--r-- ,inode=7142892,size=260,blksize=32768 }) = 0 (0x0)
open("/etc/nsswitch.conf",O_CLOEXEC,0666)  = 3 (0x3)
ioctl(3,0x402c7413 { IOR 0x74('t'), 19, 44 },0xbfbfc6f8) ERR#25 'Inappropriate ioctl for device'
fstat(3,{ mode=-rw-r--r-- ,inode=7142892,size=260,blksize=32768 }) = 0 (0x0)
read(3,"#\n# nsswitch.conf(5) - name ser"...,32768) = 260 (0x104)
read(3,0x28c38000,32768)  = 0 (0x0)
and this
Code:
open("/etc/hosts",O_CLOEXEC,0666)  = 3 (0x3)
fstat(3,{ mode=-rw-r--r-- ,inode=7142860,size=472,blksize=32768 }) = 0 (0x0)
read(3,"\n127.0.0.1\tlocalhost localhost"...,32768) = 472 (0x1d8)
read(3,0x28c66000,32768)  = 0 (0x0)
close(3)  = 0 (0x0)
and then
Code:
connect(3,{ AF_INET 192.168.0.1:22 },16)  ERR#61 'Connection refused'

Nothing tries to connect^W send to :53.

Juha

Crazy idea: world-writable /etc/hosts?
 
The compat lines are there for supporting the NIS/YP convention of +/- lines that tell the system to append users and groups from NIS/YP at those lines. Almost nobody uses NIS/YP anymore so it's safe to use just files/dns where appropriate.
 
I tried your changes to nsswitch.conf; they don't help in my case -- I still get the long delay.

As for your "crazy idea":
Code:
-rw-r--r--  1 root  wheel  1678 Sep 13 10:59 /etc/hosts
 
Even stranger idea, how about something as simple as a typo in /etc/hosts?

Well, I eyeballed it and it looks fine. And if it were invalid, running 'ssh <local machine name>' wouldn't work at all. It does work -- eventually.
 
Actually, that doesn't make sense. If your hosts file were skipped and only DNS were used, you would get errors like "unable to resolve <hostname>" and it would not connect at all. But that's assuming none of the configured DNS servers are actually able to resolve the hostname.
 
AltGr-space aka NoBreakSpace is a funny gremlin. Some programs think it's whitespace, some don't, only a few with Powers can show it. Turn a language- or charset-knob and situation changes.

Not likely here, of course, but ...
 
No, what *you* said doesn't make sense, because your understanding of the problem is wrong. "If your hosts file is skipped and only DNS is used" is not what I said -- see my original post. What is happening is that it goes to DNS first, which takes a long time to fail, and then it *gets the address from /etc/hosts and the command succeeds*. It just takes a long time because things are happening in the wrong order and not in the order specified in /etc/host.conf or /etc/nsswitch.conf.
 
Anything in your ssh config that would force a name lookup? It might not even be looking up the destination host, but something else.

Those Canonica* sound suspicious.

Also, to make sendmail happy, you have to have
Code:
192.168.0.1 hopo hopo.local
I don't know the details, or whether it's related.
 
Here's mine for comparison
Code:
Host cough
Port sneeze
IdentityFile ~/.ssh/id_rsa_linkki

Host aspi

Host *

Protocol 2
AddressFamily inet
UseRoaming no
ServerAliveInterval 121
ServerAliveCountMax 3

SendEnv TERMCAP
Additionally
Code:
$ ssh -V
OpenSSH_7.2p2, OpenSSL 1.0.1s-freebsd  1 Mar 2016
$ hostname
hopo
This ssh is the one that came with 10.3-RELEASE, and the hostname does not contain a surprising domain part.


Naah, not that either. Just tried misconfiguring the hostname, and nothing made ssh try to reach DNS.
 
My /etc/ssh/sshd_config is stock, as it came from the factory:
Code:
#   $OpenBSD: sshd_config,v 1.98 2016/02/17 05:29:04 djm Exp $
#   $FreeBSD: releng/11.0/crypto/openssh/sshd_config 296633 2016-03-11 00:15:29Z des $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented.  Uncommented options override the
# default value.

# Note that some of FreeBSD's defaults differ from OpenBSD's, and
# FreeBSD has a few additional options.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

# The default requires explicit activation of protocol 1
#Protocol 2

# HostKey for protocol version 1
#HostKey /etc/ssh/ssh_host_key
# HostKeys for protocol version 2
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_dsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Lifetime and size of ephemeral version 1 server key
#KeyRegenerationInterval 1h
#ServerKeyBits 1024

# Ciphers and keying
#RekeyLimit default none

# Logging
# obsoletes QuietMode and FascistLogging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#PermitRootLogin no
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

#RSAAuthentication yes
#PubkeyAuthentication yes

# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#RhostsRSAAuthentication no
# similar for protocol version 2
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# RhostsRSAAuthentication and HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# Change to yes to enable built-in password authentication.
#PasswordAuthentication no
#PermitEmptyPasswords no

# Change to no to disable PAM authentication
#ChallengeResponseAuthentication yes

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes

# Set this to 'no' to disable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
#UsePAM yes

#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
#X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
#PrintMotd yes
#PrintLastLog yes
#TCPKeepAlive yes
#UseLogin no
#UsePrivilegeSeparation sandbox
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS yes
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum FreeBSD-20160310

# no default banner path
#Banner none

# override default of no subsystems
Subsystem   sftp   /usr/libexec/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
#   X11Forwarding no
#   AllowTcpForwarding no
#   PermitTTY no
#   ForceCommand cvs server
 
My apologies.

I also run unbound just to cache stuff, and I ass-u-me'd that it would not matter here. And I did not start sshd either to test & observe properly. Never learning. Sigh.


The suspension point you marked with the arrow comes way later than I had thought. The connection is already up by then. tcpdump shows that there are queries:
Code:
23:49:01.666168 IP localhost.48681 > localhost.domain: 41925+ [1au] SSHFP? hopo. (33)
23:49:01.666242 IP localhost.domain > localhost.48681: 41925* 0/0/1 (33)
23:49:01.777628 IP localhost.23625 > localhost.domain: 19087+ PTR? 1.0.168.192.in-addr.arpa. (42)
23:49:01.777712 IP localhost.domain > localhost.23625: 19087* 1/0/0 PTR hopo. (60)
23:49:01.777908 IP localhost.19182 > localhost.domain: 63833+ A? hopo. (22)
23:49:01.777949 IP localhost.domain > localhost.19182: 63833* 2/0/0 A 192.168.0.1, A 192.168.1.1 (54)

Adding hopo. (with a trailing dot) to the hosts file removes the address queries, but the fingerprint query ... That might be hard to work around, depends how your nameservers respond to it. Or not :)
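
As a rough cross-check, the 33-byte SSHFP query for hopo. in the tcpdump output and the 34-byte sendto in the earlier truss excerpt are both the size you would expect for a one-question query carrying an EDNS0 OPT record, for a four-letter and a five-letter name respectively. The sketch below builds such a packet by hand. The query ID and the 4096-byte UDP payload size are assumptions chosen for illustration, and the truss output is truncated before the query type, so this only shows the sizes and leading bytes are consistent with an SSHFP lookup. It does not prove the type.

```python
import struct

SSHFP = 44  # RR type for SSH key fingerprints (RFC 4255)
OPT = 41    # EDNS0 pseudo-record type (RFC 6891)


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def build_query(name, qtype=SSHFP, query_id=0x0014):
    """Build a recursive query with one question and one EDNS0 OPT record."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1.
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 1)
    # Question: encoded name, query type, class IN (1).
    question = encode_name(name) + struct.pack("!2H", qtype, 1)
    # OPT RR: root name, type 41, UDP payload size (assumed 4096),
    # extended RCODE/version/flags (all zero here), zero-length RDATA.
    opt = b"\x00" + struct.pack("!HHIH", OPT, 4096, 0, 0)
    return header + question + opt
```

With these assumptions, len(build_query("hopo")) is 33 and len(build_query("franz")) is 34, and the first 14 bytes of the latter match the bytes truss printed for the sendto calls. Here franz is only the name visible in the known_hosts read, so treating it as the queried name is an inference.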

Juha

always enjoys a good mystery
always volunteers to throw sand in the soup

Simply
Code:
VerifyHostKeyDNS false
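
Note that VerifyHostKeyDNS is a client-side ssh_config(5) option, so it belongs in ~/.ssh/config or /etc/ssh/ssh_config rather than in sshd_config. To confirm what the client will actually use for a given host, ssh -G prints the effective settings, and a small sketch like the following can pick one out. The helper names effective_option and dump_client_config are made up for this illustration.

```python
import subprocess


def effective_option(config_dump, option):
    """Return the value of `option` from `ssh -G` style output, or None."""
    wanted = option.lower()
    for line in config_dump.splitlines():
        # Each line of the dump is "keyword value..." with a lowercase keyword.
        key, _, value = line.strip().partition(" ")
        if key.lower() == wanted:
            return value.strip()
    return None


def dump_client_config(host):
    """Ask the local ssh client which settings it would apply to `host`."""
    result = subprocess.run(
        ["ssh", "-G", host], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Something like effective_option(dump_client_config("somehost"), "VerifyHostKeyDNS") should then report the setting in effect (somehost is a placeholder; the value may be printed as false or no depending on the OpenSSH version).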
 