unstable network connection

I have three FreeBSD 8.1 servers running on three different hardware platforms, each with a different network adapter (bce, bge and igb). The network connection seems unstable: when I scp files larger than 10 MB, the transfers do not always complete successfully. I checked with my network admin, and he claims the problem is caused by the network driver not being able to handle the load. He pinged the servers with a huge packet size (around 15k), and my servers dropped packets consistently at regular intervals. I doubt this explanation, since the three servers use three different network adapters and therefore three different drivers, and it seems very unlikely that all three would show the same problem.
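
A side note on the 15k ping test: a packet that size does not fit in one Ethernet frame, so it is sent as several IP fragments, and the whole ping is lost if any single fragment is lost. The rough sketch below shows how much that amplifies a small per-frame loss. It assumes a standard 1500-byte MTU, IPv4 without options, and independent losses; the function names are only for illustration.

```python
import math

def fragments_for_ping(payload_bytes, mtu=1500):
    """Number of IPv4 fragments needed for an ICMP echo with this payload."""
    icmp_len = payload_bytes + 8            # 8-byte ICMP header
    # 20-byte IPv4 header (no options); fragment data must be a multiple of 8
    per_fragment = (mtu - 20) // 8 * 8
    return math.ceil(icmp_len / per_fragment)

def datagram_loss(frame_loss, fragments):
    """Chance that at least one fragment is lost (independent losses assumed)."""
    return 1 - (1 - frame_loss) ** fragments

n = fragments_for_ping(15000)
print(n, "fragments per ping")
for p in (0.001, 0.01):
    print(f"frame loss {p:.1%} -> ping loss {datagram_loss(p, n):.1%}")
```

Under these assumptions a 15000-byte ping needs 11 fragments, so a frame loss of 1% already shows up as roughly 10% ping loss. A large-packet ping therefore exaggerates any loss on the path and does not by itself say where the loss happens.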

Since then I have tried to tune performance by adjusting values in /etc/sysctl.conf, with no luck.

Code:
# Expire TCP host cache entries (cached per-host RTT, ssthresh, etc.) after 1 second
net.inet.tcp.hostcache.expire=1

Our network admin also claims to see a lot of link up/down events in the Cisco switch log, but I cannot find any up/down messages in dmesg. I have also checked netstat -s but have no concrete idea what to look for. I suspect the counters below may provide some clues.

Code:
tcp:
133695291 packets sent
134898031 packets received
190712 duplicate acks
114 completely duplicate packets (135202 bytes)
27 old duplicate packets
64181 packets received after close
45192 discarded due to memory problems
390871 retransmit timeouts
ip:
685307 output packets dropped due to no bufs, etc.
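
To put those counters in proportion, here is a small sketch that parses them and prints the retransmit-timeout rate together with the two buffer-related counters. The numbers are copied from the output above; the helper name parse_counters is made up for this example. Non-zero "memory problems" and "no bufs" counters may point to mbuf shortage on the host rather than to a NIC driver, which netstat -m should be able to confirm or rule out.

```python
import re

# Counters copied from the netstat -s output above.
NETSTAT = """\
tcp:
133695291 packets sent
134898031 packets received
190712 duplicate acks
114 completely duplicate packets (135202 bytes)
27 old duplicate packets
64181 packets received after close
45192 discarded due to memory problems
390871 retransmit timeouts
ip:
685307 output packets dropped due to no bufs, etc.
"""

def parse_counters(text):
    """Map 'proto: description' to the integer count on each counter line."""
    counters = {}
    proto = None
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(":") and " " not in line:
            proto = line[:-1]               # section header such as "tcp:"
            continue
        m = re.match(r"(\d+)\s+(.*)", line)
        if m and proto:
            counters[f"{proto}: {m.group(2)}"] = int(m.group(1))
    return counters

c = parse_counters(NETSTAT)
sent = c["tcp: packets sent"]
rto = c["tcp: retransmit timeouts"]
print(f"retransmit timeouts per packet sent: {rto / sent:.2%}")
for key in ("tcp: discarded due to memory problems",
            "ip: output packets dropped due to no bufs, etc."):
    print(f"{key}: {c[key]}")
```

These counters are totals since boot, so comparing two snapshots taken a few minutes apart during a failing scp would say more than the absolute values.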

Anyone got an idea what might be the possible cause?
 
I replaced all Broadcom interfaces with Intel ones, because the Broadcom ones caused problems like irreducible, steady, and sometimes quite high packet loss under load. I don't care if it's the hardware or the drivers or a combination of both, but when it comes to performance and reliability under load, I think you'd better find alternatives. I can vouch for the Intel PRO/1000 interfaces.
 
Also see if you can fix the speed/duplex settings on both the switch and the NIC. I've noticed that some combinations do not autonegotiate well, which will cause errors.
 
In my case, no change to interface settings (tried about everything ifconfig had on offer) solved the problem (hence: irreducible).
 
We switched from bge to Intel NICs because the Broadcom chips either didn't work or stopped working after some time in use on FreeBSD 8.0 AMD64. According to the 8.1 release notes most of these bugs should be resolved, but I have had no chance to test it :/ .

http://www.freebsd.org/releases/8.1R/relnotes-detailed.html

Code:
The bge(4) driver now supports BCM5761, BCM5784, and BCM57780-based devices.

The bge(4) driver now supports TSO (TCP Segmentation Offloading) on BCM5755 or newer controllers.

A long-standing bug in the bge(4) driver which was related to ASF heartbeat sending has been fixed.

A long-standing stability issue of the bce(4) and bge(4) driver due to a hardware bug in its DMA handling when the system has more than 4GB memory has been fixed. This applies to BCM5714, BCM5715, and BCM5708 controllers.

A bug in the bge(4) driver that incorrectly enabled TSO on BCM5754/BCM5754M controllers has been fixed.

Well, I don't trust Broadcom chips until I can test them for myself ...

Are the network ups and downs only on the servers with Broadcom chips, or on the igb one too?
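
If the servers do log link events (FreeBSD normally writes lines like "bge0: link state changed to DOWN" to /var/log/messages), counting them per interface would answer that quickly. A minimal sketch follows; the sample lines, including the hostname srv1 and the timestamps, are made up for illustration and are not taken from the servers in question.

```python
import re
from collections import Counter

# Made-up sample lines in the format the FreeBSD kernel uses for link events.
SAMPLE_LOG = """\
Dec 20 10:15:01 srv1 kernel: bge0: link state changed to DOWN
Dec 20 10:15:04 srv1 kernel: bge0: link state changed to UP
Dec 20 11:02:17 srv1 kernel: igb0: link state changed to DOWN
Dec 20 11:02:20 srv1 kernel: igb0: link state changed to UP
Dec 20 11:40:09 srv1 kernel: bge0: link state changed to DOWN
"""

LINK_EVENT = re.compile(r"kernel: (\w+): link state changed to (UP|DOWN)")

def count_link_downs(log_text):
    """Count link-down events per interface name."""
    downs = Counter()
    for line in log_text.splitlines():
        m = LINK_EVENT.search(line)
        if m and m.group(2) == "DOWN":
            downs[m.group(1)] += 1
    return downs

for ifname, n in sorted(count_link_downs(SAMPLE_LOG).items()):
    print(f"{ifname}: {n} link-down events")
```

To run it against a real log, read the file contents and pass them to count_link_downs in place of SAMPLE_LOG. If the switch reports flaps that never show up on the server side, that would be worth telling the network admin.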

A problem with the igb driver is fixed in 8.2 beta1

http://wiki.freebsd.org/Releng/8.2TODO

Code:
20101215 - In igb(4) remove a test for min frame size which may fail in some situation (jfv@) (r216173, merged as r216467)

But this bug doesn't sound that serious ...
 
[Image: epic-win-photos-service-win.jpg]


This is similar with regards to network cards :e
 
I tried forcing both the NIC and the switch port to 1000baseTx full-duplex and to 100baseTx full-duplex, with no luck.
Perhaps bge, bce and igb are really too buggy for production use...
 
Oops, one of the servers actually has an Intel PRO/1000 (the igb one) and it also has a similar problem. Perhaps I have to switch back to Linux, because not a single Linux server has this problem.
 