
FreeBSD slow NFS read speed on FreeBSD 9.1

Discussion in 'Web and Network Services' started by belon_cfy, Apr 25, 2012.

  1. belon_cfy

    I'm experiencing very slow read speeds when connecting to the NFS server. The average read speed is only 5-10 MB/s, while the write speed sustains 50 MB/s (80-100 MB/s in async mode).

    I found other people reporting the same issue when mounting NFS over TCP instead of UDP. I understand that mounting with UDP would avoid the problem, but ESXi does not allow us to do that.

    Both of my servers are running FreeBSD 9.0-RELEASE with the following NFS parameters in /etc/rc.conf:
    Code:
    nfsv4_server_enable="YES"
    rpcbind_enable="YES"
    nfs_server_enable="YES"
    nfs_server_flags="-u -t -n 128"
    mountd_flags="-r"
    rpc_lockd_enable="YES"
    rpc_statd_enable="YES"
    
    SCP speed is about 80 MB/s from the server to the client.
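    For anyone reproducing this: after editing /etc/rc.conf, the daemons can be restarted without a reboot, roughly along these lines (a sketch, assuming the stock 9.x rc scripts):
    Code:
    service rpcbind restart
    service mountd restart
    service nfsd restart
    # then watch the server-side NFS counters during a client read (add -e for the new NFS server)
    nfsstat -s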

    Any idea?
     
  2. belon_cfy

    I found that NFS reads through the igb interface are much faster, while reads through other interfaces such as em or bce are slow. However, the scp and nc transfer speeds are the same on all of them. Does anyone know why I'm getting inconsistent NFS read speeds through different interfaces?
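    For comparison, a raw TCP stream over the same interface can be pushed with nc, which takes NFS out of the picture entirely (the port number is arbitrary):
    Code:
    # on the FreeBSD server: listen and discard
    nc -l 5001 > /dev/null
    # on the client: push 1 GB of zeroes through the link
    dd if=/dev/zero bs=1M count=1024 | nc <server_ip> 5001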
     
  3. belon_cfy

    It seems the problem has existed for more than five years, and it is still not resolved:

    http://lists.freebsd.org/pipermail/freebsd-bugs/2007-January/021928.html

    The problem still persists, even after changing the network cable and connecting directly to different clients. Only NFS reads over TCP hit the "bug"; SCP, iSCSI, and other protocols have no issue at all.

    I can't switch to UDP because ESXi doesn't support NFS over UDP.
     
  4. belon_cfy

    Speed increases slightly, from 10-20 MB/s to 30-50 MB/s, after changing the MTU to 256, but it still lags far behind the igb interface.
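    The MTU change itself is just a runtime ifconfig setting, and it has to match on both ends (interface names here are examples):
    Code:
    # FreeBSD side
    ifconfig em0 mtu 256
    # Linux client side
    ip link set eth0 mtu 256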

    Any idea?
     
  5. belon_cfy

    I switched to an Intel PRO/1000 PT dual-port PCI-E card for testing. It is detected as em0 and em1, and performance is still the same as before: systat -ifstat 1 shows only 10-20 MB/s on NFS reads, while other traffic can push 100 MB/s of output, e.g. when running ping -i 0.0001 -s 8192 xxx.xxx.xxx.xxx.
     
  6. belon_cfy

    Problem solved by adding the interface to a VLAN; both interfaces are now able to achieve a consistent 100 MB/s.
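    For anyone wanting to make that persistent, a VLAN on top of em0 can be configured in /etc/rc.conf along these lines (the VLAN tag 10 and the address are placeholders):
    Code:
    cloned_interfaces="vlan10"
    ifconfig_vlan10="inet 10.90.1.246 netmask 255.255.255.0 vlan 10 vlandev em0"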
     
  7. belon_cfy

    5th Feb 2013
    Previously I managed to solve this by adding a VLAN on top of the em interface (there is no issue on the igb interface), but the same workaround no longer works on FreeBSD 9.1.

    I have verified this by switching back to FreeBSD 9.0, where it works fine again and the server is able to saturate the 1 Gbps link. After reinstalling the same server with FreeBSD 9.1, the problem came back: it is only able to push about 210 Mbps at most.

    Is there a driver update in FreeBSD 9.1 that could be causing this bad performance?

    By the way, below is my test environment on ESXi 5 with an NFS mount:
    - Running a CentOS 6.3 64-bit guest on ESXi 5
    - Running dd if=/dev/sda of=/dev/null bs=1M iflag=direct in the VM
    - iostat shows only 20 MB/s read, and systat -ifstat 1 on FreeBSD 9.1 shows the same result.


    Thanks.
     
  8. Sebulon

    Hey man,

    I'm soon upgrading a machine with igb to 9.1, so I'll keep this in mind. Our problem has been getting jumbo frames to work properly with igb; we have to use these sysctls:
    /etc/sysctl.conf:
    Code:
    kern.ipc.nmbclusters=262144
    kern.ipc.nmbjumbo9=38400
    Otherwise the network wouldn't come up at all.
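    A quick way to see whether those limits are actually being hit is to check the mbuf statistics while traffic is running (just a suggestion):
    Code:
    # current values of the two tunables above
    sysctl kern.ipc.nmbclusters kern.ipc.nmbjumbo9
    # cluster usage, plus any denied or delayed allocations
    netstat -m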

    /Sebulon
     
  9. belon_cfy

    But we are able to get 1 Gbps on the igb interface even without jumbo frames; both em and igb are using the default MTU of 1500.

    The em interface is able to saturate the 1 Gbps link if a VLAN is added on FreeBSD 9.0, but the same method no longer works on FreeBSD 9.1. I only get 100-200 Mbps through the em interface with the exact same cable and port.

    By the way, the em interface does not support jumbo frames (note the missing JUMBO_MTU option in the output below).
    Code:
    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:6c:23:0c
            inet6 fe80::225:90ff:fe6c:230c%igb0 prefixlen 64 scopeid 0x1
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:6c:23:0d
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect
            status: no carrier
    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
            ether 00:25:90:6c:23:0c
            inet6 fe80::225:90ff:fe68:99fe%em0 prefixlen 64 scopeid 0x7
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
            ether 00:25:90:68:99:ff
            inet6 fe80::225:90ff:fe68:99ff%em1 prefixlen 64 scopeid 0x8
            inet 10.90.1.246 netmask 0xffffff00 broadcast 10.90.1.255
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
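
    If anyone wants to double-check whether the em driver accepts a jumbo MTU at all, something like this should tell (best tried on an idle interface):
    Code:
    ifconfig em0 mtu 9000 && echo "jumbo MTU accepted"
    ifconfig em0 | grep mtu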
    
     
  10. Sebulon

    Oh yes, they most certainly do. We have several storage units with em and igb hardware that are configured for jumbo frames and are performing magnificently. Here's how, on top of a lagg trunk too (in /etc/rc.conf):
    Code:
    ifconfig_em(or igb)0="mtu 9000 up"
    ifconfig_em(or igb)1="mtu 9000 up"
    cloned_interfaces="lagg0 vlanX"
    ifconfig_lagg0="up laggproto lacp laggport em(or igb)0 laggport em(or igb)1"
    ifconfig_vlanX="inet XXX.XXX.XXX.XXX netmask 255.255.255.0 vlan X vlandev lagg0 mtu 9000"
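
    Filled in for a hypothetical em0/em1 pair on VLAN 100 (the address is a placeholder), that template would look something like:
    Code:
    ifconfig_em0="mtu 9000 up"
    ifconfig_em1="mtu 9000 up"
    cloned_interfaces="lagg0 vlan100"
    ifconfig_lagg0="up laggproto lacp laggport em0 laggport em1"
    ifconfig_vlan100="inet 192.168.100.10 netmask 255.255.255.0 vlan 100 vlandev lagg0 mtu 9000"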
    
    This has been confirmed with tcpdump, which shows transfers exceeding 1460 bytes, the maximum payload with a 1500 MTU:
    Code:
    [CMD="jumboclient#"]ping -s 8192 jumboserver.foo.com[/CMD]
    8200 bytes from jumboserver.foo.com: icmp_req=237 ttl=64 time=0.606 ms
    8200 bytes from jumboserver.foo.com: icmp_req=238 ttl=64 time=0.599 ms
    ...
    Code:
    [CMD="jumboserver#"]tcpdump -i vlanX | grep jumboclient[/CMD]
    10:08:07.953778 IP jumboclient.foo.com > jumboserver.foo.com: ICMP echo request, id 11436, seq 75, length 8200
    10:08:07.953798 IP jumboserver.foo.com > jumboclient.foo.com: ICMP echo reply, id 11436, seq 75, length 8200
    /Sebulon
     
  11. belon_cfy

    Hi Sebulon,
    Thanks for the information. I'm able to obtain much higher speeds than with the previous setup by increasing the MTU on the em interface and on the client side.

    I'm just wondering why the em interface performs slower at the default MTU size while igb is faster?
     
  12. Sebulon

    No problem, happy it helped. How much higher is "much higher", by the way? :)

    It's not very likely here, since you stated it worked better with older software, but in other cases network congestion would be an explanation: the network gets overloaded with more packets than the switches/routers can handle, which ends up hurting throughput. Jumbo frames reduce the packet count, which can gain back throughput.

    /Sebulon
     
  13. belon_cfy

    Reads from FreeBSD were 10-20 MB/s before jumbo frames; after enabling them I can reach 50-60 MB/s, the same read speed as on the igb interface.

    I don't think congestion is the main cause of the slowness on the em interface, because both tests used the same cable. I have benchmarked the same environment several times with the same results.

    By the way, thanks for sharing your knowledge on enabling jumbo frames. :beer
     
  14. Sebulon

    @belon_cfy

    As I said, I don't think congestion is the explanation in this case either, since your situation largely improved just by using an older version of FreeBSD. I will run my own tests, compare, and report back.

    /Sebulon
     
  15. belon_cfy

    Thanks Sebulon. Awaiting your confirmation.
     
  16. Sebulon

    @belon_cfy

    I haven't gotten around to completing the update on that system just yet. I have, however, managed to do it on the storage system hosting our oVirt platform. Although that system is equipped with em NICs, it might be useful for comparison. This output is also from a CentOS 6.3 amd64 VM, configured with an 80 GB VirtIO HDD:
    Code:
    [CMD="#"]dd if=/dev/vda of=/dev/null bs=1M iflag=direct[/CMD]
    81920+0 records in
    81920+0 records out
    85899345920 bytes (86 GB) copied, 876.063 s, 98.1 MB/s
    So no problems there, at least. :)

    /Sebulon
     
  17. belon_cfy

    Hi Sebulon,
    What hypervisor are you using? Is it mounted over NFS with TCP?

    We have more than ten servers with different hardware, and all of them show slow reads with ESXi 4.1-5 + NFS on the em interface.
     
  18. Sebulon

    Fedora 17 at the moment; we're planning to upgrade to 18 as soon as we're done testing the procedure in our test environment.

    Yup. And Jumbo Frames as well.

    Our oVirt hosts are a mix of HP DL380 G5 and Sun Fire X4140, whereas the VMware hosts are all HP DL380 G6. The storage for our oVirt platform is Supermicro-based with em NICs, while the VMware storage is an HP DL180 G6 with igb NICs.

    /Sebulon
     
  19. belon_cfy

    How about testing it without jumbo frames?
     
  20. Sebulon

    Ugh, I'd rather not; it's too much of a fuss to set up. But here's the same benchmark with ESXi 4.1.0 and a CentOS 6.3 guest on an igb-based 9.1-RELEASE ZFS storage system:
    Code:
    [CMD="#"]dd if=/dev/sda of=/dev/null bs=1M iflag=direct[/CMD]
    81920+0 records in
    81920+0 records out
    85899345920 bytes (86 GB) copied, 785.259 s, 109 MB/s
    So whatever problems you may have, I doubt that it's FreeBSD's "fault" directly. FWIW, here are the sysctls used to get lagg working correctly with igb:
    /etc/sysctl.conf:
    Code:
    kern.ipc.nmbclusters=262144
    kern.ipc.nmbjumbo9=38400
    Maybe the next logical step for you is to be suspicious of the network between them? Set up a FreeBSD client and server alongside it all and test with benchmarks/iperf, preferably both with and without jumbo frames, to see if there's any difference. A minimal iperf invocation is sketched below.
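    This is only a sketch; iperf here is the benchmarks/iperf port, and the server address is a placeholder:
    Code:
    # on the box acting as the iperf server
    iperf -s
    # on the other box: a 30-second TCP test (add -u -b 1000M for a UDP run)
    iperf -c <server_ip> -t 30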

    /Sebulon
     
  21. pboehmer

    Just out of curiosity, what does tcpdump show as the packet length during your NFS transfers? The best I can get is 1448; shouldn't it be higher with jumbo frames? It doesn't matter whether I use TCP or UDP, and there is no change regardless of the rsize/wsize settings. I can ping -s 8192 and get the appropriate response.
     
  22. Sebulon

    It looks like this:
    Code:
    [CMD="jumboserver#"]tcpdump -i vlanX[/CMD]
    11:15:59.354059 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34278944:34287892, ack 293405, win 7742, options [nop,nop,TS val 111012298 ecr 215023984], length 8948
    11:15:59.354066 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34287892:34296840, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354077 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34296840, win 28987, options [nop,nop,TS val 215023984 ecr 111012298], length 0
    11:15:59.354183 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34296840:34305788, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354190 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34305788:34314736, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354194 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34305788, win 29127, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354307 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34314736:34323684, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354317 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34323684, win 28987, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354432 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34323684:34332632, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354437 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34332632:34341580, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354439 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34332632, win 29127, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354488 IP jumboserver.foo.bar.nfs > jumboclient.foo.bar.2041124902: reply ok 160
    Note the value of "length" at the end of each line. It is good to know that cranking the frame size up on both client and server isn't enough; jumbo frames also need to be enabled on the switches between them, unless they are connected with crossover cables, of course. :)
    Also know that jumbo frames can only be used within the same switched network (layer 2 traffic), because there is a difference between the switched frame size and the routed frame size, which would (and should) still have the default of 1500.
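    A related end-to-end sanity check is a don't-fragment ping at jumbo size; 8972 = 9000 minus 20 bytes of IP and 8 bytes of ICMP header. The hostname is a placeholder:
    Code:
    # FreeBSD: -D sets the don't-fragment bit
    ping -D -s 8972 jumboserver.foo.com
    # Linux equivalent
    ping -M do -s 8972 jumboserver.foo.com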

    /Sebulon
     
  23. belon_cfy

    We bought a brand new Intel Xeon server today (with an Intel Xeon 1230 V2); unfortunately, the network interfaces show up as emX instead of igbX. Of course, read performance is as slow as before. No jumbo frames are configured.

    Host: CentOS 6.3 64bit
    Storage: FreeBSD 9.1-RELEASE-p3 64bit

    Code:
    [root@nmekvm2 mnt]# mount 10.50.5.251:/vol/vol00 /mnt/storage8_vol00/ -o nolock,tcp,vers=3
    [root@nmekvm2 mnt]# cd /mnt/storage8_vol00/
    [root@nmekvm2 storage8_vol00]# dd if=test2 of=/dev/null bs=1M iflag=direct
    ^C799+0 records in
    798+0 records out
    836763648 bytes (837 MB) copied, 29.1752 s, 28.7 MB/s
    Code:
    [root@nmekvm2 mnt]# mount 10.50.5.251:/vol/vol00 /mnt/storage8_vol00/ -o nolock,udp,vers=3
    [root@nmekvm2 mnt]# cd storage8_vol00/
    [root@nmekvm2 storage8_vol00]# dd if=test2 of=/dev/null bs=1M iflag=direct
    ^C3067+0 records in
    3066+0 records out
    3214934016 bytes (3.2 GB) copied, 31.3094 s, 103 MB/s
     
  24. Sebulon

    Hey @belon_cfy!

    Have you made any tests with benchmarks/iperf yet? It would also be logical to connect the systems with a crossover cable, just to rule out the network between them.

    /Sebulon
     
  25. vlho

    Hi @belon_cfy,

    I have the same problem with very slow read speeds (5-8 MB/s) on VMware ESXi connecting to an NFS server running on NAS4Free (based on FreeBSD 9.1). Write speed is OK (60-100 MB/s). Read speed over other protocols, e.g. CIFS, FTP, etc., is also OK.

    I tested various hardware and, I don't know why, but on some machines the read speed is very good (at least 70 MB/s), e.g.:
    • HP ML115 G5, onboard controller, LAN is bge
    • HP ML110 G7, onboard controller, LAN is em
    • IBM x3250, onboard controller LSI, LAN is bge (Broadcom 5721)
    • IBM x236, controller Promise, LAN is bge (Broadcom 5721)
    I see bad read speeds on these machines:
    • HP ML110 G5, onboard controller, LAN is bge
    • HP ML110 G6, onboard controller, LAN is bge
    • HP Microserver N40L, onboard controller, LAN is bge
    • HP DL320 G5p, controller SmartArray, LAN is bge
    I tested your tip with jumbo frames and indeed, read speed increases to 30-50 MB/s.

    Because it is impossible to set up jumbo frames on all LAN adapters, I was looking for another solution. Yesterday I found that a similar effect to jumbo frames can be achieved with this command: ifconfig <interface_name> -tso. Unfortunately, the read speed is still slower than the write speed. A persistent version of that setting is sketched below.
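    A persistent variant of that workaround would be to append the flag to the interface line in /etc/rc.conf; bge0 and the address below are placeholders:
    Code:
    ifconfig_bge0="inet 192.168.1.50 netmask 255.255.255.0 -tso"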
     