LACP lagg not connecting well to some machines

Hi all,

I have a FreeBSD 10.0-RELEASE-p2 machine that shows some weird behaviour with LACP: it does not connect properly to some machines, but connects fine to others.

My rc.conf:

Code:
ifconfig_em0="up mtu 9000 polling"
ifconfig_em1="up mtu 9000 polling"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1"
ifconfig_lagg0_alias0="inet 192.168.0.70 netmask 255.255.255.0"
ifconfig_lagg0_alias1="inet 192.168.0.71 netmask 255.255.255.255"
ifconfig_lagg0_alias2="inet 192.168.0.72 netmask 255.255.255.255"
defaultrouter="192.168.0.1"

My sysctl.conf:

Code:
security.bsd.see_other_uids=0
net.link.lagg.0.lacp.lacp_strict_mode=0

ifconfig output:

Code:
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
	ether 00:1b:21:53:b6:96
	inet6 fe80::21b:21ff:fe53:b696%em0 prefixlen 64 scopeid 0x1 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
	ether 00:1b:21:53:b6:96
	inet6 fe80::21b:21ff:fe53:b697%em1 prefixlen 64 scopeid 0x2 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 
	inet 127.0.0.1 netmask 0xff000000 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
	ether 00:1b:21:53:b6:96
	inet 192.168.0.70 netmask 0xffffff00 broadcast 192.168.0.255 
	inet 192.168.0.71 netmask 0xffffffff broadcast 192.168.0.71 
	inet 192.168.0.72 netmask 0xffffffff broadcast 192.168.0.72 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

So far all seems well. The lagg is connected to a Netgear GS724Tv3 switch, which does support LACP (through its LAG feature). The LAG is configured with the two ports enabled, admin enabled, dst enabled, flow control enabled, etc. Link state is up.

In lagg debug mode, the LACPDU transmits to and receives from the Netgear switch look fine:

Code:
May  7 17:33:02 phenomium kernel: em0: lacpdu transmit
May  7 17:33:02 phenomium kernel: actor=(8000,00-1B-21-53-B6-96,008B,8000,0001)
May  7 17:33:02 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:02 phenomium kernel: partner=(0064,08-BD-43-CE-A6-5F,0034,0080,0001)
May  7 17:33:02 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:02 phenomium kernel: maxdelay=0
May  7 17:33:02 phenomium kernel: em1: lacpdu transmit
May  7 17:33:02 phenomium kernel: actor=(8000,00-1B-21-53-B6-96,008B,8000,0002)
May  7 17:33:02 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:02 phenomium kernel: partner=(0064,08-BD-43-CE-A6-5F,0034,0080,0002)
May  7 17:33:02 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:02 phenomium kernel: maxdelay=0
May  7 17:33:05 phenomium kernel: em0: lacpdu receive
May  7 17:33:05 phenomium kernel: actor=(0064,08-BD-43-CE-A6-5F,0034,0080,0001)
May  7 17:33:05 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:05 phenomium kernel: partner=(8000,00-1B-21-53-B6-96,008B,8000,0001)
May  7 17:33:05 phenomium kernel: em1: lacpdu receive
May  7 17:33:05 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:05 phenomium kernel: actor=(0064,08-BD-43-CE-A6-5F,0034,0080,0002)
May  7 17:33:05 phenomium kernel: maxdelay=0
May  7 17:33:05 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:05 phenomium kernel: partner=(8000,00-1B-21-53-B6-96,008B,8000,0002)
May  7 17:33:05 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:05 phenomium kernel: maxdelay=0
May  7 17:33:32 phenomium kernel: em0: lacpdu transmit
May  7 17:33:32 phenomium kernel: actor=(8000,00-1B-21-53-B6-96,008B,8000,0001)
May  7 17:33:32 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:32 phenomium kernel: partner=(0064,08-BD-43-CE-A6-5F,0034,0080,0001)
May  7 17:33:32 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:32 phenomium kernel: maxdelay=0
May  7 17:33:32 phenomium kernel: em1: lacpdu transmit
May  7 17:33:32 phenomium kernel: actor=(8000,00-1B-21-53-B6-96,008B,8000,0002)
May  7 17:33:32 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:32 phenomium kernel: partner=(0064,08-BD-43-CE-A6-5F,0034,0080,0002)
May  7 17:33:32 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:32 phenomium kernel: maxdelay=0
May  7 17:33:35 phenomium kernel: em1: lacpdu receive
May  7 17:33:35 phenomium kernel: actor=(0064,08-BD-43-CE-A6-5F,0034,0080,0002)
May  7 17:33:35 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:35 phenomium kernel: partner=(8000,00-1B-21-53-B6-96,008B,8000,0002)
May  7 17:33:35 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:35 phenomium kernel: maxdelay=0
May  7 17:33:35 phenomium kernel: em0: lacpdu receive
May  7 17:33:35 phenomium kernel: actor=(0064,08-BD-43-CE-A6-5F,0034,0080,0001)
May  7 17:33:35 phenomium kernel: actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:35 phenomium kernel: partner=(8000,00-1B-21-53-B6-96,008B,8000,0001)
May  7 17:33:35 phenomium kernel: partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
May  7 17:33:35 phenomium kernel: maxdelay=0
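
For anyone reading the debug output above, here is a minimal Python sketch that decodes the `actor.state` / `partner.state` bitmask and splits the `actor=(...)` / `partner=(...)` tuples. The bit names follow the LACP state octet of IEEE 802.3ad. The field names given to the five tuple members (system priority, system MAC, key, port priority, port number) are my reading of the log format and should be treated as an assumption, as should the helper names `decode_state` and `parse_peer`.

```python
import re

# LACP actor/partner state octet, least significant bit first (IEEE 802.3ad).
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]

# Matches e.g. "(8000,00-1B-21-53-B6-96,008B,8000,0001)".
PEER_RE = re.compile(
    r"\(([0-9A-Fa-f]{4}),([0-9A-Fa-f-]{17}),"
    r"([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4})\)"
)


def decode_state(value):
    """Return the names of the bits set in an LACP state octet."""
    return [name for bit, name in LACP_STATE_BITS if value & bit]


def parse_peer(text):
    """Split an actor=/partner= tuple into fields (field names are assumed)."""
    match = PEER_RE.search(text)
    if match is None:
        raise ValueError("no LACP peer tuple found in: %r" % text)
    sys_prio, mac, key, port_prio, port = match.groups()
    return {
        "system_priority": int(sys_prio, 16),
        "system_mac": mac.upper(),
        "key": int(key, 16),
        "port_priority": int(port_prio, 16),
        "port": int(port, 16),
    }


if __name__ == "__main__":
    print(decode_state(0x3D))
    print(parse_peer("actor=(8000,00-1B-21-53-B6-96,008B,8000,0001)"))
    print(parse_peer("partner=(0064,08-BD-43-CE-A6-5F,0034,0080,0001)"))
```

Decoded this way, `3d` has the TIMEOUT bit clear on both sides, which corresponds to the long LACP timeout and fits the 30-second spacing between the transmits in the log. The partner tuple the switch sends back also matches the actor tuple the host transmits, so the negotiation itself looks consistent.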

Still, I get weird behaviour when connecting over NFS to my Synology DS211j or when connecting with SSH to an Ubuntu server. It *does* connect, but hangs somewhere in the middle:

Log from SSH to Ubuntu box:

Code:
[unix@phenomium ~]$ ssh -vvvv bitcoin1
OpenSSH_6.4, OpenSSL 1.0.1e-freebsd 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to bitcoin1 [192.168.0.33] port 22.
debug1: Connection established.
debug1: identity file /home/unix/.ssh/id_rsa type -1
debug1: identity file /home/unix/.ssh/id_rsa-cert type -1
debug3: Incorrect RSA1 identifier
debug3: Could not load "/home/unix/.ssh/id_dsa" as a RSA1 public key
debug1: identity file /home/unix/.ssh/id_dsa type 2
debug1: identity file /home/unix/.ssh/id_dsa-cert type -1
debug1: identity file /home/unix/.ssh/id_ecdsa type -1
debug1: identity file /home/unix/.ssh/id_ecdsa-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.4_hpn13v11 FreeBSD-20131111
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.2p2 Ubuntu-6ubuntu0.1
debug1: match: OpenSSH_6.2p2 Ubuntu-6ubuntu0.1 pat OpenSSH*
debug1: Remote is not HPN-aware
debug2: fd 3 setting O_NONBLOCK
debug3: ssh_load_hostkeys: loading entries for host "bitcoin1" from file "/home/unix/.ssh/known_hosts"
debug3: ssh_load_hostkeys: loaded 0 keys
debug1: SSH2_MSG_KEXINIT sent

However, when SSHing to my MacBook Pro (OS X 10.9.2) there is no problem connecting and sending files over with scp, for instance. Connecting to another FreeBSD box also works without problems.

Anybody have an idea? I have searched Google but couldn't find anything resembling this (with an answer). I also had this behaviour in 9.1 (which was the reason I upgraded to 10.0), but on 8.2 LACP worked flawlessly...

Note 1: with a single (non-lagg) connection, there are no problems at MTU 1500 or at MTU 9000.
Note 2: with a LACP connection at MTU 1500, there are no problems either (weird!!)
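
Since the trouble only shows up with LACP and MTU 9000 together, one way to narrow it down is to send full-size pings with the Don't Fragment bit set to both a working and a failing peer. The Python sketch below only builds the command line. It assumes a plain 20-byte IPv4 header without options and FreeBSD's ping(8) flags `-D` (set DF), `-c` (count) and `-s` (payload size). The helper names are made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options
ICMP_HEADER_BYTES = 8


def max_icmp_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def df_ping_command(host, mtu, count=3):
    """Build a FreeBSD ping(8) argument list that probes `host` at `mtu`."""
    return ["ping", "-D", "-c", str(count), "-s", str(max_icmp_payload(mtu)), host]


if __name__ == "__main__":
    # 192.168.0.33 is the Ubuntu box (bitcoin1) from the SSH log above.
    for mtu in (1500, 9000):
        print(" ".join(df_ping_command("192.168.0.33", mtu)))
```

If the 1472-byte probe gets replies and the 8972-byte one does not, the large frames are being lost somewhere between the lagg and that peer. That would also fit SSH stalling right after `SSH2_MSG_KEXINIT sent`, although this is only a guess on my part.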

Thanks for your help; if you need more information, I'd be happy to provide it.

Cheers,

Rick
 