igb0 up/down

I have been experiencing frequent igb0 up/down events with FreeBSD 14.1. An identical computer also running FreeBSD 14.1 does not have any up/down events. I am still unlikely to call it a hardware problem because I have ethernet problems with FreeBSD all the time with lots of different adapters and computers.

Error computer - dev.igb.0.mac_stats.crc_errs: 4
Control computer - dev.igb.0.mac_stats.crc_errs: 0
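
To watch whether crc_errs (or any other counter) keeps climbing rather than comparing single readings, something like the sketch below could help. It assumes the counters live under dev.igb.0.mac_stats, as the crc_errs line above suggests, and that sysctl prints them as "name: value" lines. The function names are made up for illustration.

```python
import subprocess


def parse_sysctl(text):
    """Parse 'name: value' lines from sysctl output into {name: int}.

    Lines whose value is not a plain non-negative integer are skipped.
    """
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue
        value = value.strip()
        if value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def changed_counters(before, after):
    """Return {name: delta} for counters that increased between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


def snapshot(oid="dev.igb.0.mac_stats"):
    """Read one snapshot from the running system (assumes a sysctl binary in PATH)."""
    out = subprocess.run(
        ["sysctl", oid], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl(out)
```

Taking one snapshot() before and one after a streaming session and passing both to changed_counters() shows which counters moved. Normal traffic counters will increase too, so the interesting entries are the error ones.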

Changing the port on the switch did not help. Changing the ethernet cable did not help the first time, so I changed to a brand new ethernet cable and got the same problem. Is the next step to disable EEE? Which settings exactly should I apply?
 
I have ethernet problems with FreeBSD all the time with lots of different adapters and computers.
I have the complete opposite - rock solid Ethernet on a wide variety of machines.

I think with the newer igc drivers there were some occasional firmware version issues to consider but not with older em/igb drivers.

Can you connect the computers directly to eliminate the switch?

Are you sure the computers are identical especially the NICs?

Try a Linux live image and see if it has any issues.

What hardware exactly? Any USB adapters or anything like that?
 
I am still unlikely to call it a hardware problem because I have ethernet problems with FreeBSD all the time with lots of different adapters and computers.
Not the first time...
 
The NIC is an Intel I211 on both systems. Both systems are pretty much identical except one was upgraded and one was clean installed. One has FIBs set up in rc.conf and sysctl.conf and the other one does not yet. Both systems use an Asus mini ITX motherboard. Both systems also use a USB ethernet adapter for their secondary ethernet connections. These USB ethernet devices have their own problems with going up/down without some workarounds. Also, I have used the error system for years without these problems appearing. One other thing I noticed is that ue0 on the error system is error free with no up/down events, while on the control system ue0 (a different model adapter) has been having some up/down events. I know the workaround for it; I just have not set it up yet.

The directly connected method is fraught with problems, like both systems being down and the fact that without continuous data being transferred, as on the error system, the problem may not show itself anyway. The error system is continuously streaming 2 HD ATSC streams, which likely stresses the connection. The ATSC streams are also blinking on and off today, and I am not sure whether that is just transmitter issues/interference or a sign of system trouble. It has been glitching heavily like this since yesterday, and this is with both FreeBSD and the live USB. I am probably going to tune one of the streams to a completely different transmitter to troubleshoot that.

I am now posting from the Debian live USB. Another problem, besides seeing the up/down in the log, was that apcupsd, for which the error system was the master, was causing disconnects around the network. I have enabled apcupsd on this live USB and started the 2 ATSC streams, and I guess I will wait it out for a few days to see what happens unless anyone else has any suggestions. So far there are no ethernet disconnects, no errors logged in "ifconfig", and no apcupsd disconnects in the 45 minutes or so I have been booted up in the live USB. My guess is there will not be any real trouble here and the problem may be magically "fixed" when I boot back into FreeBSD due to the Linux driver setting something in the hardware.

This reminds me of the time, 10 or more years ago, that I bought a low cost laptop and when I booted it into FreeBSD the ethernet died permanently. I think I just returned that one and shrugged when someone tried to look at the OS on it. It was one of the few times that I have seen software kill hardware, along with the old AMD CPUs that did not have temperature throttling. I have also seen Linux ALSA fry a tablet speaker by sending it the wrong voltage or something. Story time over.
 
I have booted back to FreeBSD because I am just not convinced there is actually anything wrong with the ethernet. I could not just wait for something to happen.

Some key differences between error system and control system:
master on apcupsd vs slave
ue0 workaround (usbconfig -d ugen0.2 set_config 1) set vs not set
streams from hdhomerun vs streams from streaming sites

If the igb0 up/down events continue I will just conclude that the ethernet is suddenly incompatible with FreeBSD and use another USB ethernet adapter as main. The cause would not be ghosts but something like files left over from an upgrade (conditions on the network could have changed such that only a computer with old driver files has glitches), stress from continuous streaming (I only started streaming with this computer recently), malformed packets from the hdhomerun that the driver does not like dealing with (only this computer streams from this device), or something along those lines.

Certainly these hypotheses are a stretch but this is a very hard problem to track down.
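
One way to check the streaming-stress hypothesis against the logs is to count the igb0 DOWN events per hour and see whether they line up with the times the streams were running. The sketch below assumes the usual syslog line layout ("Mon DD HH:MM:SS host kernel: igb0: link state changed to DOWN"); if the log on the error system is formatted differently, the pattern needs adjusting. The function name is made up for illustration.

```python
import re
from collections import Counter

# Captures "Mon DD HH" as the hour bucket, then the interface name and new state.
# Assumes traditional syslog timestamps at the start of each line.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}):\d{2}:\d{2}\s.*\b(\w+): link state changed to (UP|DOWN)"
)


def down_events_per_hour(log_text, iface="igb0"):
    """Count 'link state changed to DOWN' lines for one interface, bucketed by hour."""
    per_hour = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(2) == iface and m.group(3) == "DOWN":
            per_hour[m.group(1)] += 1
    return per_hour
```

Feeding it the contents of the system log (for example /var/log/messages, if that is where kernel messages end up on this system) gives a per-hour tally that can be compared with when the hdhomerun streams were active.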
 
I've found BSD systems having some issues with USB devices, but never any problem with Intel NICs.

The i211 is a low-end Intel NIC but the ones I've got haven't had any issues.

I can't explain your experiences but they do not match mine (and if FreeBSD was as flaky as you are seeing with Intel NICs in general then I don't think it would have lasted as long as it has.)
 
I got my Multi WAN/Multi switch setup working using another USB device instead. Both USB ethernet devices are 1 Gb, are attached by USB 3.0, and show speed SUPER in usbconfig. Even though both USB ethernet adapters are identical Insignia brand devices, only one has needed the set_config workaround so far. I was worried at first because I was seeing no carrier on the device that I had previously set with set_config, but after unplugging and replugging and using set_config again both are now working without errors. The only other problem is that the naming of the devices at boot time can be inconsistent, but this system, being the main one, has a large UPS attached, so I am happy.

edit:
I wanted to mention that I needed to unplug/replug at the USB side to reset the no carrier problem
Also the NICs in both systems are built into the mini ITX motherboards
Also the control system is still doing fine with no resets after streaming many videos
Also the problem is obviously still not fixed but only worked around
 
Processor in both is Ryzen 2400GE. I ultimately needed the set_config workaround for the replacement USB adapter also to keep it from going up/down. I also needed to unplug/replug the USB for this adapter before using set_config to get it to behave.

Both the error system and the control system have:
hw.pci.enable_msix="1"
hw.pci.enable_msi="1"
So I think that is a red herring.

Another difference is the error system has 32 GB RAM and the control system has 64 GB RAM.

I have been using the igb0 without errors for years until I packaged the homemade laptop workstation and moved to a new location. I am willing to accept that during the move something broke in the motherboard that does not allow use of the ethernet with FreeBSD any more. As I have mentioned it is error free in the Debian Linux live USB. The laptop is not about to break down though. It is running strong with no errors in ZFS or further errors in dmesg.
 
I have been using the igb0 without errors for years until I packaged the homemade laptop workstation and moved to a new location. I am willing to accept that during the move something broke in the motherboard that does not allow use of the ethernet with FreeBSD any more. As I have mentioned it is error free in the Debian Linux live USB.
It's just a bit strange how it works with Debian but not FreeBSD.

If it's broken hardware, it's broken hardware.

But if it has been solid for years and you've changed absolutely nothing (apart from the physical location of the machine) - then maybe the move upset something (but what and how?)
 