Workstation losing Ethernet connection every 48 hours or so, requiring a reboot

This just started happening on Saturday, 2020-05-02, out of the blue, with no network config changes made on this end. Running
Code:
# service netif restart
produces
Code:
Ethernet: no carrier
in the output. The only solution has been to reboot the machine.

The machine's config is described in detail here. FreeBSD 12.1-RELEASE-p4 (p3 kernel), latest pkg branch. Any ideas?
 
FuryBSD is a derivative.
PC-BSD, FreeNAS, XigmaNAS, and all other FreeBSD Derivatives


It has everything except the thing that's causing problems, the ethernet card. What card is it? What driver is it using?
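For anyone unsure how to answer those two questions, here is a minimal sketch. The interface name `re0` is only an assumption for illustration, and the helper names `driver_of` and `show_nic` are made up here. It relies on the FreeBSD convention that an interface name is generally the driver name plus a unit number, and on pciconf(8) for the hardware details.

```shell
#!/bin/sh
# Sketch: report which driver and PCI device back a FreeBSD network interface.
# "re0" in the usage notes is an assumed interface name; substitute your own.

# Interface names are generally the driver name followed by a unit number,
# so stripping the trailing digits yields the driver (re0 -> re).
driver_of() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

# Print the PCI entry (vendor, device, class) for the interface.
# Only meaningful on FreeBSD, where pciconf(8) exists.
show_nic() {
    pciconf -lv | grep -A4 "^$1@"
}

# Usage on the affected machine:
#   driver_of re0    # prints the driver name; its man page is in section 4
#   show_nic re0     # vendor and device strings for the card
#   ifconfig re0     # current media, status, and enabled options
```

Posting the `show_nic` and `ifconfig` output along with the relevant dmesg lines gives people something concrete to work with.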

We've been over this before. It's just an installer that bundles a DE with FreeBSD. There's no difference between FuryBSD and FreeBSD after the installation. Anytime the FreeBSD project wants to disincentivize people from using DE installers, they can bundle a DE with FreeBSD or write clear documentation to install one that actually works. Neither of those exist. I know because I tried myself, documented in a thread you yourself participated in.

CyberCr33p: DHCP.

Alain De Vos: I said what else I tried in my OP.
 
I had similar problems with two separate Intel NICs. In one case it was a bad network drop. In the other it was a PCI bus incompatibility. Same discrete card worked fine in a different motherboard.

These might not be your issue, though, 'cause in my case the interface would start to flap under load. It seems like yours flakes off after a long idle, which is why I suspect power management.
 
( Not related to the problem ) Have you noticed the message from GEOM in dmesg?

Code:
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
GEOM: ada2: the primary GPT table is corrupt or invalid.
GEOM: ada2: using the secondary instead -- recovery strongly advised.
 
I got those when I added a drive that had had a GPT partition to a zpool. They're harmless in that case. I took the time to zero out the end of the disk with dd(1), but just 'cause I like a clean dmesg.
 
Code:
re0: watchdog timeout
re0: link state changed to DOWN

The handbook mentions watchdog:
To resolve watchdog timeout errors, first check the network cable. Many cards require a PCI slot which supports bus mastering. On some old motherboards, only one PCI slot allows it, usually slot 0. Check the NIC and the motherboard documentation to determine if that may be the problem.
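To see how often this actually happens, something like the following could tally the events from the logs. It is only a sketch: it assumes the kernel messages look like the two `re0` lines quoted above, and the function name `count_nic_events` is made up for this example.

```shell
#!/bin/sh
# Sketch: count driver watchdog timeouts and link-down events for one interface.
# Assumes messages of the form "re0: watchdog timeout" and
# "re0: link state changed to DOWN", as in the dmesg excerpt above.

count_nic_events() {
    # $1 = interface name (e.g. re0); log text is read from stdin.
    awk -v ifn="$1" '
        index($0, ifn ": watchdog timeout")           { wd++ }
        index($0, ifn ": link state changed to DOWN") { down++ }
        END { printf "watchdog=%d linkdown=%d\n", wd, down }
    '
}

# Usage on the affected machine (/var/log/messages is the FreeBSD default):
#   count_nic_events re0 < /var/log/messages
#   dmesg | count_nic_events re0
```

If the link-down count climbs while the watchdog count stays at zero, the cable, switch port, or power management are more likely suspects than the driver's watchdog.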
 
The handbook mentions watchdog:

"Many cards require a PCI slot which supports bus mastering. On some old motherboards, only one PCI slot allows it, usually slot 0."
Well, this might neatly explain my problem, though I didn't see any watchdog errors. Thanks!
 
In fairness he did say that it takes 48 hours for the problem to manifest. Let's give him at least that long to try things out.
 
The RE8111E interface in my workstation also drops connections under heavy load after some time. Re-plugging the cable *sometimes* works, but mostly the NIC is completely locked up and the host needs to be rebooted. As I've seen many weird behaviors and errors with Realtek NICs (usually broken offloading!!), I just think it's yet another crappy NIC from them. I resorted to handling all heavier workloads over the Intel NIC and avoiding Realtek for stuff I'd like to 'just work'.
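If offloading is the suspect, one thing worth trying is switching it off. The fragment below is a sketch, not a confirmed fix: it assumes the interface is `re0` and that it is configured by DHCP, and whether it helps with this particular lockup is untested.

```shell
# /etc/rc.conf fragment (sketch): disable TSO and checksum offloading.
# Assumes the interface is re0 and uses DHCP; adjust both as needed.
ifconfig_re0="DHCP -tso -rxcsum -txcsum"
```

The same flags can be applied to the running system with `ifconfig re0 -tso -rxcsum -txcsum` to test before making the change permanent.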
 
The OP also stated they tried restarting netif.
Fair enough.
I also mentioned the lack of any information.
By all means everyone help, but most of the help is speculative because the OP has provided very little information. Even in his hardware list, as SirDice pointed out, it didn't mention the ethernet cards/wifi.

My 2 cents: Ditch any wifi card that is Realtek, they're just bad. We had a batch of laptops from a Chinese company previously owned by a Big Blue company, fitted with these dud cards that continually dropped out and locked up (under Windows 10). Under warranty all 40 were replaced with Intel cards and NO ONE has had any problems since.

They're just junk.
 
I had a re(4) interface on a machine once. For some reason it just crapped out at irregular intervals. It would only work again after I powered down the machine completely; just restarting netif didn't help, and rebooting didn't help either. I've had so many weird issues with Realtek cards, I absolutely hate them. Bought a box of 10 Intel PRO/1000 cards (OEM) and replaced all Realtek network cards in my machines. Never had an issue since.
 