Constant slowdown with Chelsio T310 10G network card

I currently have a NAS running FreeBSD 13.0-RELEASE-p11 with a Chelsio T310 10G network card. It's connected via a QNAP 10G switch to my Linux box. As you can see, I can normally get 10Gb speeds (Linux on the left, FreeBSD on the right):

1657381042647.png


However, after the FreeBSD box has been up for some time, the speeds slow to a crawl:

1657381308557.jpeg


Sometimes, after a few hours, speeds return to normal. They always return to normal if I reboot the FreeBSD box. I've run `iperf3` to/from other machines on the other side of the fiber connection and I'm able to get 1Gb speeds fine, so I don't think it's the transceivers or equipment. Is there something wrong with this card, or is something weird happening between the card and the FreeBSD drivers/network stack?
 
Have you considered airflow? Chelsio cards need a lot of airflow, and they have published specs for it.
I would expect a slowdown if the card overheated. Do you have very good airflow over the card?

I mounted a healthy fan right behind my three T540s and made a crude duct, pushing air across the cards and out the back.

On the T420s and T422s, I mounted a fan taken from a VGA card right on the card,
using plastic standoffs from a 3.5mm plastic fastener kit I have.
Those cards had holes near the CPU for mounting a fan above the heatsink.

The earlier the card, the more heat it puts out. Simple as that.
As CPU/ASIC die fabrication shrank, power efficiency increased.
 
I was wondering if it was thermal throttling. I know one of the 10G cards I purchased was missing a heatsink, but when I opened my NAS back up the Chelsio had one installed, so I assumed it was good. Turns out I was wrong. The heatsink was hot to the touch. I slapped a cheap fan on top of the case and the throttling issue went away after the card cooled back down. I guess this card was pulled from a server with more fans or a better cooling solution. I'll get a slot fan as a longer-term solution.


nas-1.jpeg
 
The spec is: Airflow: 200 lf/m (linear feet per minute)

There are also useful sysctls:
sysctl -a|grep temp
dev.t5nex.2.temperature: 41
dev.t5nex.1.temperature: 55
dev.t5nex.0.temperature: 54
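
If you want to see whether slowdowns line up with temperature, one option is to log those readings over time. Below is a minimal sketch, not a tested tool: the function names `filter_temps` and `log_temps` are made up for illustration, and it assumes the driver exposes OIDs ending in `.temperature`, like the `dev.t5nex.N.temperature` ones above. Other card generations or drivers may not expose one at all. The filter is stricter than `grep temp`, which also matches unrelated names such as `template` or `tempaddr`.

```shell
#!/bin/sh
# Sketch: log NIC temperature readings over time so a slowdown can be
# compared against the temperature at that moment.
# Assumption: the driver exposes sysctl OIDs ending in ".temperature".

# Read "name: value" lines on stdin and keep only OIDs whose name ends
# in ".temperature", printed as "name=value".
filter_temps() {
    awk -F': ' '$1 ~ /\.temperature$/ { print $1 "=" $2 }'
}

# Print one timestamped line per sample, forever.
# $1 is the sampling interval in seconds (default 60, an arbitrary choice).
log_temps() {
    interval="${1:-60}"
    while :; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(sysctl -a 2>/dev/null | filter_temps | tr '\n' ' ')"
        sleep "$interval"
    done
}
```

To use it, source the file and run something like `log_temps 60 >> nic-temps.log` in the background (the log file name is just an example), then compare the timestamps against the times the `iperf3` numbers drop.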
 
Well, looks like I spoke too soon. I'm still getting slowdowns, dropping back down to a trickle in `iperf3`. The fan is still running on top of the open case and the heatsink barely feels warm now. When using the above `sysctl` command, it doesn't look like my motherboard or NIC report any useful temperatures:

Code:
vm.pfault_oom_attempts: 3
net.inet6.ip6.use_tempaddr: 0
net.inet6.ip6.temppltime: 86400
net.inet6.ip6.tempvltime: 604800
net.inet6.ip6.prefer_tempaddr: 0
hw.usb.template: -1
kstat.zfs.misc.arcstats.arc_tempreserve: 0
2022-07-18_17-23.png


So I'm back to thinking it has to be a software/driver issue (or maybe a hardware issue with the card itself?). What else can I do to diagnose this problem?
 