I may need to replace a 1000BaseT PCIe adapter that appears to be suffering from an intermittent hardware fault.
Looking for any recommendations for replacements. A quick check of our local suppliers doesn't show much on offer: only ASUS or TP-Link.
I'm also considering a 2.5Gb or faster adapter, but that may prove problematic in the short term. Are there any likely caveats to using a 2.5/10Gb replacement for only the problematic NIC while the other end of the connection is still 1000BaseT?
Alternatively, I'm open to performing further diagnostics if anyone has ideas. Details follow.
Thanks.
The setup is a FreeBSD system (#1) that does a nightly backup transfer over ssh of a single file of around 1.5TB to another FreeBSD system (#2) over a direct 1000BaseT connection. Very occasionally, say a handful of times per year, the transfer aborts with `ssh: connect to host 172.22.222.9 port 22: Operation timed out`.
At that point system #1 has lost all connectivity to #2. A warm reboot of #1 is enough to restore connectivity. Nothing else tried so far restores it: cable disconnect/reconnect, warm or cold restart of #2, and `ifconfig` down/up on both #1 and #2 all fail.
No system error messages pertaining to the connection failure have been observed. The fault has persisted across a variety of FreeBSD versions; the current one is 14.4-RELEASE.
The NIC is a PCIe adapter installed via a riser in a 1RU server. The mainboard also has two onboard Intel I210 NICs.
System #1:
System Information
Manufacturer: Intel Corporation
Product Name: S1200SP
Version: LR1304SPCFG1R
Handle 0x0005, DMI type 41, 11 bytes
Onboard Device
Reference Designation: Intel I210
Type: Ethernet
Status: Enabled
Type Instance: 1
Bus Address: 0000:03:00.0
Handle 0x0021, DMI type 41, 11 bytes
Onboard Device
Reference Designation: Intel I210
Type: Ethernet
Status: Enabled
Type Instance: 2
Bus Address: 0000:04:00.0
Mar 13 02:36:40 mail kernel: pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Mar 13 02:36:40 mail kernel: pci0: <ACPI PCI bus> on pcib0
Mar 13 02:36:40 mail kernel: pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
Mar 13 02:36:40 mail kernel: pci1: <ACPI PCI bus> on pcib1
Mar 13 02:36:40 mail kernel: em0: <Intel(R) Gigabit CT 82574L> port 0x5000-0x501f mem 0x9cd80000-0x9cd9ffff,0x9cd00000-0x9cd7ffff,0x9cda0000-0x9cda3fff irq 16 at device 0.0 on pci1
Mar 13 02:36:40 mail kernel: em0: EEPROM V1.8-0
Mar 13 02:36:40 mail kernel: em0: Using 1024 TX descriptors and 1024 RX descriptors
Mar 13 02:36:40 mail kernel: em0: Using 2 RX queues 2 TX queues
Mar 13 02:36:40 mail kernel: em0: Using MSI-X interrupts with 3 vectors
Mar 13 02:36:40 mail kernel: em0: Ethernet address: 68:05:ca:15:80:f4
Mar 13 02:36:40 mail kernel: em0: netmap queues/slots: TX 2/1024, RX 2/1024
The complete boot-up log in /var/log/messages is attached.
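In case it helps with further diagnostics, here is a rough sketch of what I could run on #1 the next time the link hangs, before rebooting, and once again after the reboot for comparison. It only wraps standard FreeBSD tools (`ifconfig`, `netstat`, `sysctl`, `vmstat`, `pciconf`, `arp`). The function names `run_cmd` and `snapshot_nic`, the output directory, and the assumption that the adapter's sysctl node is `dev.em.0` are mine and purely illustrative.

```shell
#!/bin/sh
# Hypothetical helper: save a snapshot of NIC state to a timestamped file.
# IFACE comes from the dmesg output above; OUTDIR and SYSCTL_NODE are assumptions.
IFACE="${IFACE:-em0}"
SYSCTL_NODE="${SYSCTL_NODE:-dev.em.0}"
OUTDIR="${OUTDIR:-/var/tmp/nic-snapshots}"

# Run one command under a header line; never abort if it fails or is missing.
run_cmd() {
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "(command not found: $1)"
    fi
}

# Collect link, counter, interrupt, PCI and ARP state; print the file name.
snapshot_nic() {
    mkdir -p "$OUTDIR" || return 1
    out="$OUTDIR/$(date +%Y%m%dT%H%M%S).txt"
    {
        run_cmd ifconfig -v "$IFACE"      # media, status, flags
        run_cmd netstat -I "$IFACE" -d    # packet, error and drop counters
        run_cmd sysctl "$SYSCTL_NODE"     # driver statistics for the adapter
        run_cmd vmstat -i                 # interrupt counts (are the MSI-X vectors still firing?)
        run_cmd pciconf -lvbc "$IFACE"    # PCI config and capabilities
        run_cmd arp -an                   # is the peer's ARP entry still resolved?
    } > "$out" 2>&1
    echo "$out"
}
```

Calling `snapshot_nic` while the link is dead and again after the reboot, then running `diff` on the two files, should at least show whether the interrupt counters or driver statistics stop moving while the interface still reports an active link.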