Why is FTP speed 100x different on the same network?

I have three machines: drone, queen and worker.

drone is running a very old FreeBSD 4.10; queen and worker run FreeBSD 8.0-p2.
drone uses wu-ftpd, I believe, and queen and worker use vsftpd. All transfers were done after setting "epsv off".

hardware:
drone: Pentium II 350 MHz, 640 meg
worker: AMD quadcore 2.3 GHz, 4 gig
queen: AMD quadcore 3.2 GHz, 4 gig

from drone to queen, one 8 gig file takes 13 min:
Code:
2% |*                                                 |   196 MB    11:50 ETA^
226 File receive OK.
206810477 bytes sent in 18.09 seconds (10.90 MB/s)

the same file I want to get from queen to worker:
it takes about 17 hrs

Code:
  0% |                                     | 19328 KB  129.71 KB/s 17:22:34 ETA^C
226 Transfer complete.
19857408 bytes sent in 02:29 (129.49 KB/s)

from drone to worker it takes 99 hrs!

so why would the speed be about 100x different? they all use the same defaultrouter and subnet mask. I was expecting a faster transfer between queen and worker because of the much faster hardware.
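
As a sanity check on those numbers, here is a small sketch that turns a file size and an ftp transfer rate into an expected duration. It assumes "8 gig" means 8 * 1024^3 bytes and that ftp's KB and MB are 1024-based; `eta_hms` is a made-up helper name, not part of ftp(1).

```shell
# eta_hms BYTES RATE_KB_PER_SEC: print the expected transfer time as h:mm:ss.
# Hypothetical helper; assumes 1 KB = 1024 bytes.
eta_hms() {
    awk -v bytes="$1" -v kbps="$2" 'BEGIN {
        secs = int(bytes / (kbps * 1024))
        printf "%d:%02d:%02d\n", int(secs / 3600), int((secs % 3600) / 60), secs % 60
    }'
}

# Assumption: the file is exactly 8 * 1024^3 bytes.
eta_hms 8589934592 11161.6   # drone -> queen at 10.90 MB/s (= 11161.6 KB/s)
eta_hms 8589934592 129.49    # queen -> worker at 129.49 KB/s
```

With these inputs the two calls come out at roughly 12.5 minutes and 18 hours, which is in line with the ETAs ftp reported, so the progress meters are at least consistent with the measured rates.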

something wrong on my worker?
even between 2 old machines (drone and cell, both P2-class machines), I get 805.91 KB/s...

the network is supposed to be 1 Gbit/s, I believe.
 
Is throughput from drone to worker as bad as from queen to worker? Which network devices are in those boxes? Anything unusual in /var/log/messages?
 
bschmidt, thanks.

drone to worker is about 6x faster than queen to worker, but still only 1/14 of drone to queen...

31165440 bytes sent in 40.21 seconds (756.85 KB/s)

nothing unusual in worker's /var/log/messages

Code:
Sat Apr 10 12:25:05 2010 [pid 81327] [b] OK UPLOAD: Client "queen", "/usr/home/bee/April9.tar", 19856424 bytes, 129.47Kbyte/sec

Sat Apr 10 14:49:12 2010 [pid 81632] CONNECT: Client "drone"
Sat Apr 10 14:49:14 2010 [pid 81631] [b] OK LOGIN: Client "drone"
Sat Apr 10 14:50:39 2010 [pid 81633] [b] OK UPLOAD: Client "drone", "httpd-error.log", 31164416 bytes, 756.79Kbyte/sec
 
queen:

Code:
re0: <RealTek 8168/8168B/8168C/8168CP/8168D/8168DP/8111B/8111C/8111CP/8111DP PCIe Gigabit Ethernet> port 0xe800-0xe8ff mem 0xfdfff000-0xfdffffff,0xfdff8000-0xfdffbfff irq 19 at device 0.0 on pci2
re0: Using 1 MSI messages
re0: Chip rev. 0x28000000
re0: MAC rev. 0x00000000

worker (same card! it is on the motherboard; would a PCI card be faster or slower?):

Code:
re0: <RealTek 8168/8168B/8168C/8168CP/8168D/8168DP/8111B/8111C/8111CP/8111DP PCIe Gigabit Ethernet> port 0xe800-0xe8ff mem 0xfdfff000-0xfdffffff,0xfdff8000-0xfdffbfff irq 18 at device 0.0 on pci2
re0: Using 1 MSI messages
re0: Chip rev. 0x28000000
re0: MAC rev. 0x00000000

drone:
Code:
rl0: <Accton MPX 5030/5038 10/100BaseTX> port 0xcc00-0xccff mem 0xd7000000-0xd70000ff irq 10 at device 11.0 on pci0
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 
Ok, Realtek chips. I remember those having issues with Gbit LAN (at least mine does). You can play around with tso, rxcsum and txcsum settings to see if those do make any kind of difference.

# ifconfig re0 -tso -rxcsum -txcsum

Also forcing 'em to 100Mbit might be an option.
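
A rough sketch of what forcing the link could look like on FreeBSD, assuming the interface is re0 as in the dmesg output above and that the driver lists 100baseTX with full-duplex as a supported media type. The function names are made up for illustration. One caveat: if only the host is forced while the switch port keeps autonegotiating, the switch may fall back to half duplex, and a duplex mismatch can itself cause slow, lopsided transfers, so it is worth checking both ends.

```shell
# Hypothetical helpers; run as root. "re0" is the interface from this thread.

# Pin the interface to 100 Mbit/s full duplex instead of autonegotiating.
force_100_fdx() {
    ifconfig "$1" media 100baseTX mediaopt full-duplex
}

# Go back to autonegotiation.
restore_autoselect() {
    ifconfig "$1" media autoselect
}

# Example (not run here):
#   ifconfig -m re0          # list the media types the driver supports
#   force_100_fdx re0
#   restore_autoselect re0
```

The change made this way does not survive a reboot, which makes it a low-risk experiment.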
 
thanks. I tried the command on both worker and queen, still 120 KB/s. I googled around but did not find instructions for forcing it to 100 Mbit.
 
I forgot: queen is directly on the net, worker and drone are behind switches... but the switches should be 100 Mbit or faster.
 
DutchDaemon said:
Is everything on full-duplex?

I do not know..how do I check?
some cables might be old (circa 1997)...are there old cables using half duplex?

probably simpler for me to move them to direct connection without switches...
 
ok, ifconfig says I am at full duplex and "100baseTX"

Code:
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
plip0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
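
For comparing several machines, the negotiated speed and duplex can be pulled out of the ifconfig output with a small filter. This is only a sketch: `media_of` is a made-up helper name, and it assumes the usual FreeBSD layout where each interface starts with a `name:` line followed by an indented `media:` line.

```shell
# media_of: read ifconfig(8) output on stdin and print the negotiated media
# for each interface that reports one. Hypothetical helper name.
media_of() {
    awk '
        /^[a-z]+[0-9]+:/ { ifname = $1; sub(/:$/, "", ifname) }
        $1 == "media:"   { sub(/^[ \t]*media:[ \t]*/, ""); print ifname ": " $0 }
    '
}

# Usage on the machine itself:
#   ifconfig | media_of
```

Running it on both ends of a slow link shows quickly whether one side reports half duplex or a different speed than the other.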
 
in addition, getting from drone to queen (i.e. ftp from queen to drone, then "get file") is even slower:

844392 bytes received in 00:33 (24.54 KB/s)

The 130 KB/s figure was from ftp'ing from drone to queen and using "put file".
 
more tests.

I also have "lappy", which shares the same switch with worker.

1. I get 11 MB/s between lappy and worker (same switch).
2. I get 11 MB/s between lappy and queen (through switch, so switch is good).

3. I get 1.34 MB/s between worker and my home. this is the same as queen to my home.


so it seems it is NOT a bad switch, since lappy (dual-core laptop with 2 gig ram) can get 11 MB/s to queen from the switch it shares with worker.

something is not right with worker. somehow it only performs badly to queen and drone, but not to my home...

and it does fine to lappy at max speed (100 megabits/sec).
 
I need to draw a graph...

1. drone is alone, with a switch or hub
2. queen is connected directly to the Ethernet jack
3. cell/lappy/worker all share the same 10 megabyte/s (100 Mbit) switch.

not sure why cell to worker is only 0.7 megabyte/s, they are quite close.

right, it looks like I have to send files from queen to lappy (10 meg/s) and then from lappy to worker (also 10 meg/s). if I go from queen to worker directly, I get 0.3 meg/s.
 
some questions:
1) have you already tried switching cables and switch ports between the machines?
2) have you tested with smaller files, say 200MB?

Even if the machines only have 100Mbit interfaces, you should see ~10-11MB/s with FTP.

Maybe the netstat(1) command can shed some light.
Code:
# clear netstat statistic
netstat -s -z > /dev/null

#now start the transfer
netstat -s -p ip | tee transfer1.log

# wait at least 30sec.
netstat -s -p ip | tee transfer2.log

diff -u transfer1.log transfer2.log
 
worker did have one error:

Code:
+pid 78463 (mv), uid 0 inumber 47152 on /: filesystem full

df says:
Code:
Filesystem         1K-blocks     Used     Avail Capacity  Mounted on
/dev/mirror/gm0s1a   1012974   189218    742720    20%    /
devfs                      1        1         0   100%    /dev
/dev/mirror/gm0s1e   1012974      892    931046     0%    /tmp
/dev/mirror/gm0s1f 686168364 28039306 603235590     4%    /usr
/dev/mirror/gm0s1d   5077038   252062   4418814     5%    /var
devfs                      1        1         0   100%    /var/named/dev

how do I fix that?

iostat and netstat both show no errors. I have not tried switching cables and ports yet since I am at home (the machines are in the office).

tomorrow I plan to:
1. swap cables and ports together: between cell and worker
2. if that does not work, mirror queen's disk and put it inside worker; if that works, then throw worker's hd away (software issue).
3. if that does not work, then move worker to a place with no switch.
4. if that does not work, then I do not know where the problem is and give up.

ok, I retested with an 11 meg file. in most of the cases I did not consider direction...
only after the diagram was done did I find out that, from worker to cell,
upload is 11 MB/s but download is only 0.023 MB/s! what a difference:

Code:
local: bee.log remote: bee.log
227 Entering Passive Mode (xxx,4,192,97)
150 Opening BINARY mode data connection for 'bee.log'.
100% |*************************************| 11068 KB   11.21 MB/s    00:00 ETA
226 Transfer complete.
11333983 bytes sent in 00:00 (11.18 MB/s)
ftp> get bee.log
local: bee.log remote: bee.log
227 Entering Passive Mode (xxx,4,192,98)
150 Opening BINARY mode data connection for 'bee.log' (11333983 bytes).
100% |*************************************| 11068 KB   23.98 KB/s    00:00 ETA
226 Transfer complete.
11333983 bytes received in 07:41 (23.98 KB/s)
ftp> quit
221 Goodbye.

so take the graph with a grain of salt...sorry for the copyright stamp...forgot to turn it off.
network-diagram.jpg
 
also, worker --> lappy is 11 MB but lappy --> worker (uploading to worker) is only 1.12.

since transfers between worker and two other hosts (lappy and cell) are not symmetrical, and uploading to worker is the slow direction, this suggests it is not the cable/switch, but that worker has a problem....

not sure if the file sys full is causing this.
 
worker to queen also not symmetrical:
upload to queen is 11 MB, but down from queen to worker is only 0.13 MB.

the pattern is everyone can download from worker fast (except drone, which might be limited by the hub), but uploading to worker is always slow...

cell to worker: 0.023 MB/s!
queen to worker: 0.13 MB/s
drone to worker: 1.3 MB/s

why are they all so different? definitely narrowed down to a problem on worker...
 
netstat 1 on worker:
Code:
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
        79     0     110882         62     0       4484     0 <during uploading to worker
        98     0     141155         77     0       5626     0
       100     0     141432         76     0       5408     0
        95     0     140954         76     0       5408     0
       100     0     141307         76     0       5408     0
       103     0     141490         76     0       5408     0
        45     0      55627         34     0       2456     0
         5     0        332          1     0        170     0
         3     0        180          1     0        170     0
         6     0        360          1     0        170     0
         4     0        240          1     0        170     0
         5     0        300          1     0        170     0
         4     0        240          1     0        170     0
        10     0       1101          1     0        170     0
         9     0        806          1     0        170     0
      2232     0     147419       3330     0    5090716     0 <downloading from worker started)
      3014     0     198926       4508     0    6760975     0
         4     0        437          1     0        170     0
         2     0        120          1     0        170     0
         2     0        120          1     0        170     0
         2     0        120          1     0        170     0
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         7     0        420          1     0        170     0
        10     0        600          2     0        388     0
         5     0        300          1     0        170     0
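
The byte columns in `netstat 1` are per-second counts, so they can be converted straight into transfer rates. A minimal sketch, assuming the seven-column layout shown above (input packets, errs, bytes; output packets, errs, bytes; colls) and 1 KB = 1024 bytes; `rates` is a made-up helper name.

```shell
# rates: read `netstat 1` output on stdin and print input and output
# throughput in KB/s for every data line. Header lines are skipped because
# their first field is not a number. Hypothetical helper name.
rates() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 7 {
        printf "in=%.1fKB/s out=%.1fKB/s\n", $3 / 1024, $6 / 1024
    }'
}

# Usage on the machine itself (stop with ^C):
#   netstat 1 | rates
```

Applied to the capture above, the upload phase works out to about 138 KB/s coming into worker, close to what ftp reported, while the download phase sends several MB/s out of worker.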
 
solved: bad cable...when I swapped ports, nothing happened...then I swapped cables with my xp machine...bingo! both directions 10 meg...I could have come to the office instead of drawing that and testing and re-testing....sometimes the simplest solution is the best...
 
strange...it worked for a while...now the same thing again...slow downloading to worker...not the cable, after all?
 
so, I swapped the two machines (queen and worker) and still worker receives at 0.1 while sending at 10 MB. I moved queen to another location (nothing in my office now!) with direct connection...now both directions are fine...not sure why...some kind of interaction between worker and the switch (Netgear FS108). no switches in between, no problems...
 