FreeBSD gateway periodically stops responding

Hi.

I have a small NUC-type system running FreeBSD 11.2-RELEASE-p8 serving as the gateway for my home network. It's got an N3160 Celeron CPU (4C/4T), 8GB RAM, and two igb(4) interfaces. igb0 is LAN, igb1 is WAN. pf(4) is configured to NAT my LAN subnets behind a single public IPv4 address. It's also a router for an IPv6 /56 prefix.

Several times each day, the gateway will stop responding on its LAN interface. This can last up to two minutes before it recovers. The system hasn't rebooted and there's nothing in the logs to indicate any problem has occurred.

I am observing this issue from my WiFi-connected laptop, but have ruled out the WiFi connection as a possible cause by pinging both the gateway and another system connected to the same switch as the gateway. Only the pings to the gateway time out.

I've tried bumping mbufs up with sysctl kern.ipc.nmbclusters=1048576, but it's had no effect.
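
To see whether mbuf exhaustion is involved at all, the denied and delayed counters in netstat -m on the gateway are worth a look. A minimal sketch, assuming the usual "N/N/N requests for ... denied" line format; mbuf_trouble is a made-up helper name:

```shell
# Print only the "denied"/"delayed" counter lines from `netstat -m`
# whose leading counters are not all zero. No output means no
# allocation failures have been recorded since boot.
# (mbuf_trouble is a hypothetical helper name.)
mbuf_trouble() {
    awk '/ (denied|delayed)/ && $1 !~ /^0(\/0)*$/'
}

# Usage on the gateway:
#   netstat -m | mbuf_trouble
```

If this prints nothing, raising kern.ipc.nmbclusters is unlikely to change anything.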

I previously ran pfSense 2.4.4 (based on FreeBSD 11.2) on this system without any problems, so this issue is quite frustrating. Can anyone make any suggestions on how to diagnose this further?

Thanks.
jem
 
Have you checked the obvious things? Like bad/dodgy ethernet cable? Or a dodgy port on the switch?
 
Oops, I had not. I've swapped over the cable for now.

I should add that my internet connection is not particularly fast (20Mbps down, 3Mbps up) so I doubt it's a network load issue.
 
Even if you manage to completely saturate your uplink that still shouldn't result in getting disconnected on the LAN side. Is there anything in /var/log/messages that might give some clues?
 
Nothing in /var/log/messages corresponding to the times of the outages.

However, after further ping testing, maybe it's not the gateway that's at fault after all. I'm getting the following results:
  • laptop to gateway IPv4 ping: periodically times out
  • laptop to server IPv4 ping: no problems
  • server to gateway IPv4 ping: no problems
  • laptop to gateway IPv6 ping: no problems
It seems to be only laptop-to-gateway IPv4 connectivity that's occasionally screwing up, but the laptop is running Linux so I'll look elsewhere for clues.
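
To line the four results up against each other, each path can be probed in the background with timestamps, so the outage windows can be compared afterwards. A rough sketch only: probe_loop and PING_CMD are names invented here, 192.0.2.1 is a placeholder address, and no reply timeout is set, so ping's platform default applies.

```shell
# Probe one target repeatedly and print a timestamped line for each
# probe that gets no reply. Arguments: target, number of probes,
# seconds between probes (default 1).
# PING_CMD is a hypothetical override: leave it unset for IPv4 ping,
# or set it to ping6 for the IPv6 test.
probe_loop() {
    target=$1
    count=$2
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        if ! ${PING_CMD:-ping} -n -c 1 "$target" >/dev/null 2>&1; then
            printf '%s %s FAIL\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$target"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Usage (192.0.2.1 is a placeholder for the gateway's LAN address):
#   probe_loop 192.0.2.1 3600 > gateway-v4.log &
```

Running one loop per path on both the laptop and the server should show whether the failures really are limited to laptop-to-gateway IPv4.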
 
If it's only with IPv4 it might be an ARP issue. IPv6 works differently in this respect (it uses ICMPv6 Neighbor Discovery instead of ARP).

What happens if you turn things around? Ping the laptop from the gateway.
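
To check the ARP theory, look at the neighbour cache on both ends while an outage is in progress. A small sketch, assuming the Linux laptop has iproute2; neigh_state is a made-up helper and 192.0.2.1 stands in for the gateway's LAN address:

```shell
# Print the neighbour-cache state for one address, reading
# `ip neigh show` output on stdin (the state is the last field).
# (neigh_state is a hypothetical helper name.)
neigh_state() {
    awk -v ip="$1" '$1 == ip { print $NF }'
}

# On the Linux laptop, during an outage:
#   ip -4 neigh show | neigh_state 192.0.2.1
# On the FreeBSD gateway, the raw ARP table is shown by:
#   arp -an
```

A state of FAILED or INCOMPLETE on the laptop during an outage would point at ARP rather than at pf or the NIC.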
 
It might be a DNS issue. I sometimes have very similar symptoms when my Internet connection drops for some reason.
Try pinging an IP address with "-n", so that ping uses numeric addresses only and skips DNS resolution. If pinging by IP address works but DNS resolution times out, you have the culprit.

Also try using DNS resolver utilities like dig or drill. If they time out but ping -n is fine, it's a DNS issue.
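
The two checks can be combined into one helper that is run during an outage. This is a sketch under assumptions: the gateway is also the DNS server, dig is installed on the laptop, and dig exits non-zero when no server replies. classify_outage, PING_CMD and DNS_CMD are names invented for illustration.

```shell
# Classify one moment in time: is the gateway unreachable by address,
# or reachable but not answering DNS? Arguments: gateway address,
# name to look up. PING_CMD and DNS_CMD are hypothetical overrides.
classify_outage() {
    gw=$1
    name=$2
    if ! ${PING_CMD:-ping} -n -c 1 "$gw" >/dev/null 2>&1; then
        echo "ip-unreachable"
    elif ! ${DNS_CMD:-dig} +time=2 +tries=1 "@$gw" "$name" >/dev/null 2>&1; then
        echo "dns-failure"
    else
        echo "ok"
    fi
}

# Usage (192.0.2.1 is a placeholder for the gateway's LAN address):
#   classify_outage 192.0.2.1 freebsd.org
```

Note that this only catches DNS timeouts; an error answer such as SERVFAIL still counts as a reply.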

The problem is that when I configure my router as the DNS server, it probably doesn't cache or handle this optimally and always forwards DNS requests to its primary DNS server. When the connection drops (even for a couple of minutes), programs using TCP hang, even for local connections inside my LAN.
Sometimes this happens if a firewall is misconfigured. For example, if the firewall blocks an outgoing connection attempt but drops the packet without sending back a response, the program waits for a couple of minutes until the connection attempt times out. The fix is to make the gateway firewall reject the packets with ICMP host unreachable or ICMP port unreachable. This lets the app know that a connection cannot be made.

Also, enable firewall logging on the gateway and post the log here. If packets are dropped, the log will show it.
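
On pf that could look roughly like the fragment below. It is only a sketch: it assumes igb0 is the LAN interface as described in the first post, and that the existing ruleset has pass rules for the LAN traffic that should get through (in pf the last matching rule wins, so a block placed first acts as the default).

```
# /etc/pf.conf fragment (sketch; merge with the existing ruleset)

# Options section: answer blocked packets with a TCP RST or an
# ICMP unreachable instead of dropping them silently.
set block-policy return

# Filter section, ahead of the existing pass rules:
# log whatever ends up blocked on the LAN interface.
block log on igb0 all
```

Check the syntax with pfctl -nf /etc/pf.conf before loading it with pfctl -f /etc/pf.conf. With pflog enabled (pflog_enable="YES" in /etc/rc.conf), the logged packets can be watched live with tcpdump -n -e -ttt -i pflog0.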
 