
FreeBSD slow NFS read speed on FreeBSD 9.1

Discussion in 'Web and Network Services' started by belon_cfy, Apr 25, 2012.

  1. belon_cfy

    I'm experiencing very slow read speeds when connecting to the NFS server. The average read speed is only 5-10 MB/s, while the write speed sustains 50 MB/s (80-100 MB/s in async mode).

    I found other people reporting the same issue when mounting NFS over TCP instead of UDP. I understand that mounting with UDP would avoid the problem, but ESXi does not allow us to do that.

    Both of my servers are running FreeBSD 9.0-RELEASE with the following NFS parameters in /etc/rc.conf:
    Code:
    nfsv4_server_enable="YES"
    rpcbind_enable="YES"
    nfs_server_enable="YES"
    nfs_server_flags="-u -t -n 128"
    mountd_flags="-r"
    rpc_lockd_enable="YES"
    rpc_statd_enable="YES"
    
    SCP speed is about 80 MB/s from the server to the client.
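    For anyone reproducing this: after editing /etc/rc.conf, the daemons can be restarted without a reboot, roughly along these lines (a sketch, assuming the stock 9.x rc scripts):
    Code:
    service rpcbind restart
    service mountd restart
    service nfsd restart
    # then watch the server-side NFS counters during a client read (add -e for the new NFS server)
    nfsstat -s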

    Any idea?
     
  2. belon_cfy

    I found that NFS reads through the igb interface are much faster, while reads through other interfaces such as em or bce are slow. However, the scp and nc transfer speeds are the same on all of them. Does anyone know why I'm getting inconsistent NFS read speeds through different interfaces?
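    For comparison, a raw TCP stream over the same interface can be pushed with nc, which takes NFS out of the picture entirely (the port number is arbitrary):
    Code:
    # on the FreeBSD server: listen and discard
    nc -l 5001 > /dev/null
    # on the client: push 1 GB of zeroes through the link
    dd if=/dev/zero bs=1M count=1024 | nc <server_ip> 5001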
     
  3. belon_cfy

    It seems the problem has existed for more than five years, and it is still not resolved:

    http://lists.freebsd.org/pipermail/freebsd-bugs/2007-January/021928.html

    The problem still persists, even after changing the network cable and connecting directly to different clients. Only NFS reads over TCP hit the "bug"; SCP, iSCSI, and other protocols have no issue at all.

    I can't switch to UDP because ESXi doesn't support NFS over UDP.
     
  4. belon_cfy

    Speed increases slightly, from 10-20 MB/s to 30-50 MB/s, after changing the MTU to 256, but it still lags far behind the igb interface.
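    The MTU change itself is just a runtime ifconfig setting, and it has to match on both ends (interface names here are examples):
    Code:
    # FreeBSD side
    ifconfig em0 mtu 256
    # Linux client side
    ip link set eth0 mtu 256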

    Any idea?
     
  5. belon_cfy

    I switched to an Intel PRO/1000 PT dual-port PCI-E card for testing. It is detected as em0 and em1, and performance is still the same as before: systat -ifstat 1 shows only 10-20 MB/s on NFS reads, while other traffic can push 100 MB/s of output, e.g. when running ping -i 0.0001 -s 8192 xxx.xxx.xxx.xxx.
     
  6. belon_cfy

    Problem solved by adding the interface to a VLAN; both interfaces are now able to achieve a consistent 100 MB/s.
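    For anyone wanting to make that persistent, a VLAN on top of em0 can be configured in /etc/rc.conf along these lines (the VLAN tag 10 and the address are placeholders):
    Code:
    cloned_interfaces="vlan10"
    ifconfig_vlan10="inet 10.90.1.246 netmask 255.255.255.0 vlan 10 vlandev em0"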
     
  7. belon_cfy

    5th Feb 2013
    Previously I managed to solve this by adding a VLAN on top of the em interface (there is no issue on the igb interface), but the same workaround no longer works on FreeBSD 9.1.

    I have verified this by switching back to FreeBSD 9.0, where it works fine again and the server is able to saturate the 1 Gbps link. After reinstalling the same server with FreeBSD 9.1, the problem came back: it is only able to push about 210 Mbps at most.

    Is there a driver update in FreeBSD 9.1 that could be causing this bad performance?

    By the way, below is my test environment on ESXi 5 with an NFS mount:
    - Running a CentOS 6.3 64-bit guest on ESXi 5
    - Running dd if=/dev/sda of=/dev/null bs=1M iflag=direct in the VM
    - iostat shows only 20 MB/s read, and systat -ifstat 1 on FreeBSD 9.1 shows the same result.


    Thanks.
     
  8. Sebulon

    Hey man,

    I'm soon upgrading a machine with igb to 9.1, so I'll keep this in mind. Our problem has been getting jumbo frames to work properly with igb; we have to use these sysctls:
    /etc/sysctl.conf:
    Code:
    kern.ipc.nmbclusters=262144
    kern.ipc.nmbjumbo9=38400
    Otherwise the network wouldn't come up at all.
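    A quick way to see whether those limits are actually being hit is to check the mbuf statistics while traffic is running (just a suggestion):
    Code:
    # current values of the two tunables above
    sysctl kern.ipc.nmbclusters kern.ipc.nmbjumbo9
    # cluster usage, plus any denied or delayed allocations
    netstat -m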

    /Sebulon
     
  9. belon_cfy

    But we are able to get 1 Gbps on the igb interface even without jumbo frames; both em and igb are using the default MTU of 1500.

    The em interface is able to saturate the 1 Gbps link if a VLAN is added on FreeBSD 9.0, but the same method no longer works on FreeBSD 9.1. I only get 100-200 Mbps through the em interface with the exact same cable and port.

    By the way, the em interface does not support jumbo frames (note the missing JUMBO_MTU option in the output below).
    Code:
    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:6c:23:0c
            inet6 fe80::225:90ff:fe6c:230c%igb0 prefixlen 64 scopeid 0x1
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:6c:23:0d
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect
            status: no carrier
    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
            ether 00:25:90:6c:23:0c
            inet6 fe80::225:90ff:fe68:99fe%em0 prefixlen 64 scopeid 0x7
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
            ether 00:25:90:68:99:ff
            inet6 fe80::225:90ff:fe68:99ff%em1 prefixlen 64 scopeid 0x8
            inet 10.90.1.246 netmask 0xffffff00 broadcast 10.90.1.255
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (100baseTX <full-duplex>)
            status: active
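
    If anyone wants to double-check whether the em driver accepts a jumbo MTU at all, something like this should tell (best tried on an idle interface):
    Code:
    ifconfig em0 mtu 9000 && echo "jumbo MTU accepted"
    ifconfig em0 | grep mtu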
    
     
  10. Sebulon

    Oh yes, they most certainly do. We have several storage units with em and igb hardware that are configured for jumbo frames and are performing magnificently. Here's how, on top of a lagg trunk too (in /etc/rc.conf):
    Code:
    ifconfig_em(or igb)0="mtu 9000 up"
    ifconfig_em(or igb)1="mtu 9000 up"
    cloned_interfaces="lagg0 vlanX"
    ifconfig_lagg0="up laggproto lacp laggport em(or igb)0 laggport em(or igb)1"
    ifconfig_vlanX="inet XXX.XXX.XXX.XXX netmask 255.255.255.0 vlan X vlandev lagg0 mtu 9000"
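
    Filled in for a hypothetical em0/em1 pair on VLAN 100 (the address is a placeholder), that template would look something like:
    Code:
    ifconfig_em0="mtu 9000 up"
    ifconfig_em1="mtu 9000 up"
    cloned_interfaces="lagg0 vlan100"
    ifconfig_lagg0="up laggproto lacp laggport em0 laggport em1"
    ifconfig_vlan100="inet 192.168.100.10 netmask 255.255.255.0 vlan 100 vlandev lagg0 mtu 9000"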
    
    This has been confirmed with tcpdump, which shows transfers exceeding 1460 bytes, the maximum payload with a 1500 MTU:
    Code:
    [CMD="jumboclient#"]ping -s 8192 jumboserver.foo.com[/CMD]
    8200 bytes from jumboserver.foo.com: icmp_req=237 ttl=64 time=0.606 ms
    8200 bytes from jumboserver.foo.com: icmp_req=238 ttl=64 time=0.599 ms
    ...
    Code:
    [CMD="jumboserver#"]tcpdump -i vlanX | grep jumboclient[/CMD]
    10:08:07.953778 IP jumboclient.foo.com > jumboserver.foo.com: ICMP echo request, id 11436, seq 75, length 8200
    10:08:07.953798 IP jumboserver.foo.com > jumboclient.foo.com: ICMP echo reply, id 11436, seq 75, length 8200
    /Sebulon
     
  11. belon_cfy

    Hi Sebulon,
    Thanks for the information. I'm able to obtain much higher speeds than with the previous setup by increasing the MTU on the em interface and on the client side.

    I'm just wondering why the em interface performs slower at the default MTU size while igb is faster?
     
  12. Sebulon

    No problem, happy it helped. How much higher is "much higher", by the way? :)

    It's not very likely here, since you stated it worked better with older software, but in other cases network congestion would be an explanation: the network gets overloaded with more packets than the switches/routers can handle, which ends up hurting throughput. Jumbo frames reduce the packet count, which can gain back throughput.

    /Sebulon
     
  13. belon_cfy

    Reads from FreeBSD were 10-20 MB/s before jumbo frames; after enabling them I can reach 50-60 MB/s, the same read speed as on the igb interface.

    I don't think congestion is the main cause of the slowness on the em interface, because both tests used the same cable. I have benchmarked the same environment several times with the same results.

    By the way, thanks for sharing your knowledge on enabling jumbo frames. :beer
     
  14. Sebulon

    @belon_cfy

    As I said, I don't think congestion is the explanation in this case either, since your situation largely improved just by using an older version of FreeBSD. I will run my own tests, compare, and report back.

    /Sebulon
     
  15. belon_cfy

    Thanks Sebulon. Awaiting your confirmation.
     
  16. Sebulon

    @belon_cfy

    I haven't gotten around to completing the update on that system just yet. I have, however, managed to do it on the storage system hosting our oVirt platform. Although that system is equipped with em NICs, it might be useful for comparison. This output is also from a CentOS 6.3 amd64 VM, configured with an 80 GB VirtIO HDD:
    Code:
    [CMD="#"]dd if=/dev/vda of=/dev/null bs=1M iflag=direct[/CMD]
    81920+0 records in
    81920+0 records out
    85899345920 bytes (86 GB) copied, 876.063 s, 98.1 MB/s
    So no problems there, at least. :)

    /Sebulon
     
  17. belon_cfy

    Hi Sebulon,
    What hypervisor are you using? Is it mounted over NFS with TCP?

    We have more than ten servers with different hardware, and all of them show slow reads with ESXi 4.1-5 + NFS on the em interface.
     
  18. Sebulon

    Fedora 17 at the moment; we're planning to upgrade to 18 as soon as we're done testing the procedure in our test environment.

    Yup. And Jumbo Frames as well.

    Our oVirt hosts are a mix of HP DL380 G5 and Sun Fire X4140, whereas the VMware hosts are all HP DL380 G6. The storage for our oVirt platform is Supermicro-based with em NICs, while the VMware storage is an HP DL180 G6 with igb NICs.

    /Sebulon
     
  19. belon_cfy

    How about testing it without jumbo frames?
     
  20. Sebulon

    Ugh, I'd rather not; it's too much of a fuss to set up. But here's the same benchmark with ESXi 4.1.0 and a CentOS 6.3 guest on an igb-based 9.1-RELEASE ZFS storage system:
    Code:
    [CMD="#"]dd if=/dev/sda of=/dev/null bs=1M iflag=direct[/CMD]
    81920+0 records in
    81920+0 records out
    85899345920 bytes (86 GB) copied, 785.259 s, 109 MB/s
    So whatever problems you may have, I doubt that it's FreeBSD's "fault" directly. FWIW, here are the sysctls used to get lagg working correctly with igb:
    /etc/sysctl.conf:
    Code:
    kern.ipc.nmbclusters=262144
    kern.ipc.nmbjumbo9=38400
    Maybe the next logical step for you is to be suspicious of the network between them? Set up a FreeBSD client and server alongside it all and test with benchmarks/iperf, preferably both with and without jumbo frames, to see if there's any difference. A minimal iperf invocation is sketched below.
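    This is only a sketch; iperf here is the benchmarks/iperf port, and the server address is a placeholder:
    Code:
    # on the box acting as the iperf server
    iperf -s
    # on the other box: a 30-second TCP test (add -u -b 1000M for a UDP run)
    iperf -c <server_ip> -t 30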

    /Sebulon
     
  21. pboehmer

    Just out of curiosity, what does tcpdump show as the packet length during your NFS transfers? The best I can get is 1448; shouldn't it be higher with jumbo frames? It doesn't matter whether I use TCP or UDP, and there is no change regardless of the rsize/wsize settings. I can ping -s 8192 and get the appropriate response.
     
  22. Sebulon

    It looks like this:
    Code:
    [CMD="jumboserver#"]tcpdump -i vlanX[/CMD]
    11:15:59.354059 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34278944:34287892, ack 293405, win 7742, options [nop,nop,TS val 111012298 ecr 215023984], length 8948
    11:15:59.354066 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34287892:34296840, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354077 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34296840, win 28987, options [nop,nop,TS val 215023984 ecr 111012298], length 0
    11:15:59.354183 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34296840:34305788, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354190 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34305788:34314736, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354194 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34305788, win 29127, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354307 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34314736:34323684, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354317 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34323684, win 28987, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354432 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34323684:34332632, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354437 IP jumboclient.foo.bar.768 > jumboserver.foo.bar.nfsd: Flags [.], seq 34332632:34341580, ack 293569, win 7742, options [nop,nop,TS val 111012299 ecr 215023984], length 8948
    11:15:59.354439 IP jumboserver.foo.bar.nfsd > jumboclient.foo.bar.768: Flags [.], ack 34332632, win 29127, options [nop,nop,TS val 215023984 ecr 111012299], length 0
    11:15:59.354488 IP jumboserver.foo.bar.nfs > jumboclient.foo.bar.2041124902: reply ok 160
    Note the value of "length" at the end of each line. It is good to know that cranking the frame size up on both client and server isn't enough; jumbo frames also need to be enabled on the switches between them, unless they are connected with crossover cables, of course. :)
    Also know that jumbo frames can only be used within the same switched network (layer 2 traffic), because there is a difference between the switched frame size and the routed frame size, which would (and should) still have the default of 1500.
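    A related end-to-end sanity check is a don't-fragment ping at jumbo size; 8972 = 9000 minus 20 bytes of IP and 8 bytes of ICMP header. The hostname is a placeholder:
    Code:
    # FreeBSD: -D sets the don't-fragment bit
    ping -D -s 8972 jumboserver.foo.com
    # Linux equivalent
    ping -M do -s 8972 jumboserver.foo.com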

    /Sebulon
     
  23. belon_cfy

    We bought a brand new Intel Xeon server today (with an Intel Xeon 1230 V2); unfortunately, the network interfaces show up as emX instead of igbX. Of course, read performance is as slow as before. No jumbo frames are configured.

    Host: CentOS 6.3 64bit
    Storage: FreeBSD 9.1-RELEASE-p3 64bit

    Code:
    [root@nmekvm2 mnt]# mount 10.50.5.251:/vol/vol00 /mnt/storage8_vol00/ -o nolock,tcp,vers=3
    [root@nmekvm2 mnt]# cd /mnt/storage8_vol00/
    [root@nmekvm2 storage8_vol00]# dd if=test2 of=/dev/null bs=1M iflag=direct
    ^C799+0 records in
    798+0 records out
    836763648 bytes (837 MB) copied, 29.1752 s, 28.7 MB/s
    Code:
    [root@nmekvm2 mnt]# mount 10.50.5.251:/vol/vol00 /mnt/storage8_vol00/ -o nolock,udp,vers=3
    [root@nmekvm2 mnt]# cd storage8_vol00/
    [root@nmekvm2 storage8_vol00]# dd if=test2 of=/dev/null bs=1M iflag=direct
    ^C3067+0 records in
    3066+0 records out
    3214934016 bytes (3.2 GB) copied, 31.3094 s, 103 MB/s
     
  24. Sebulon

    Hey @belon_cfy!

    Have you made any tests with benchmarks/iperf yet? It would also be logical to connect the systems with a crossover cable, just to rule out the network between them.

    /Sebulon
     
  25. vlho

    Hi @belon_cfy,

    I have the same problem with very slow read speeds (5-8 MB/s) on VMware ESXi connecting to an NFS server running on NAS4Free (based on FreeBSD 9.1). Write speed is OK (60-100 MB/s). Read speed over other protocols, e.g. CIFS, FTP, etc., is also OK.

    I tested various hardware and, I don't know why, but on some machines the read speed is very good (at least 70 MB/s), e.g.:
    • HP ML115 G5, onboard controller, LAN is bge
    • HP ML110 G7, onboard controller, LAN is em
    • IBM x3250, onboard controller LSI, LAN is bge (Broadcom 5721)
    • IBM x236, controller Promise, LAN is bge (Broadcom 5721)
    I see bad read speeds on these machines:
    • HP ML110 G5, onboard controller, LAN is bge
    • HP ML110 G6, onboard controller, LAN is bge
    • HP Microserver N40L, onboard controller, LAN is bge
    • HP DL320 G5p, controller SmartArray, LAN is bge
    I tested your tip with jumbo frames and indeed, read speed increases to 30-50 MB/s.

    Because it is impossible to set up jumbo frames on all LAN adapters, I was looking for another solution. Yesterday I found that a similar effect to jumbo frames can be achieved with this command: ifconfig <interface_name> -tso. Unfortunately, the read speed is still slower than the write speed. A persistent version of that setting is sketched below.
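    A persistent variant of that workaround would be to append the flag to the interface line in /etc/rc.conf; bge0 and the address below are placeholders:
    Code:
    ifconfig_bge0="inet 192.168.1.50 netmask 255.255.255.0 -tso"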
     