Poor samba performance

I'm experiencing poor samba 3.4 performance under FreeBSD 8.1-RELEASE AMD64.
Under Ubuntu and ext4, I got sequential throughput over 70MB/s for both read and write over samba. Under FreeBSD and UFS2, however, I get about 32MB/s for both read and write. The drive I'm using under FreeBSD right now is older and slower, but its local throughput is still well above what I'm getting through samba.
Code:
$ dd if=/dev/zero of=/usr/home/vcn64ultra/share/zerofile.000 bs=1m count=10000
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 185.196407 secs (56619673 bytes/sec)
$ dd if=/usr/home/vcn64ultra/share/zerofile.000 of=/dev/null bs=1M
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 171.272849 secs (61222547 bytes/sec)
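
To compare these dd figures with the samba numbers, it may help to convert bytes/sec into MB/s. The sketch below is only illustrative: the function name `to_mb_per_s` is made up here, 1 MB is taken as 10^6 bytes, and the samba rate is the rough 32MB/s figure quoted above.

```python
def to_mb_per_s(bytes_per_sec):
    """Convert a dd-style bytes/sec figure to MB/s (1 MB = 10**6 bytes, an assumption)."""
    return bytes_per_sec / 1e6

# Figures copied from the dd output above.
local_write = to_mb_per_s(56619673)
local_read = to_mb_per_s(61222547)
samba_rate = 32.0  # approximate samba throughput reported in the post, MB/s

print(f"local write: {local_write:.1f} MB/s")
print(f"local read:  {local_read:.1f} MB/s")
print(f"samba reaches about {samba_rate / local_write:.0%} of local write speed")
```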

The system is an Intel D945GCLF2 (Atom 330) board with 2GB DDR2-800.

Are there any tweaks or fixes to help improve my samba performance? Thanks in advance!
 
That was with default settings on everything that matters, with samba 3.4.8.

I've tried new settings with samba 3.5.6 with little improvement.
New settings are
Code:
aio_load="YES"
in /boot/loader.conf (the Asynchronous IO option was checked during the ports install of samba)

Code:
min receivefile size = 131072
aio read size = 1
aio write size = 1
use sendfile = yes
in /usr/local/etc/smb.conf

Code:
kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
kern.ipc.somaxconn=32768
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxvnodes=800000
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.sendbuf_inc=524288
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65536
net.local.stream.recvspace=65536
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.mssdflt=1460
in /etc/sysctl.conf

Basically I followed this thread:
http://forums.freebsd.org/showthread.php?t=9187

On rare occasions I was able to benchmark the samba throughput higher (like 45MB/s), but most of the time it is in the 20s and low 30s. There has been little improvement with the new samba version and new settings. What's also bugging me is that the performance isn't consistent. The system seems to perform better or worse on its own whims. This inconsistency exists with both the old and new settings, as well as on my actual server (the system I'm talking about in this thread was built for tweaking research).
 
For samba there are a lot of changes I apply, from the OS up to the application, such as mount options for the share. Could you show your smb.conf?

Code:
  testparm -v -s
 
The post was too long so I removed some of the fields that had nothing after the '=' sign
Code:
$ testparm -v -s
Load smb config files from /usr/local/etc/smb.conf
Processing section "[homes]"
Processing section "[share]"
Processing section "[printers]"
Loaded services file OK.
Server role: ROLE_STANDALONE
[global]
        dos charset = CP850
        unix charset = UTF-8
        display charset = LOCALE
        workgroup = WORKGROUP
        netbios name = DIAMONDVILLE
        server string = Samba Server
        bind interfaces only = No
        security = SHARE
        encrypt passwords = Yes
        update encrypted = No
        client schannel = Auto
        server schannel = Auto
        allow trusted domains = Yes
        map to guest = Never
        null passwords = No
        obey pam restrictions = No
        password server = *
        smb passwd file = /usr/local/etc/samba/smbpasswd
        private dir = /usr/local/etc/samba
        passdb backend = tdbsam
        algorithmic rid base = 1000
        guest account = nobody
        enable privileges = Yes
        pam password change = No
        passwd chat = *new*password* %n\n *new*password* %n\n *changed*
        passwd chat debug = No
        passwd chat timeout = 2
        password level = 0
        username level = 0
        unix password sync = No
        restrict anonymous = 0
        lanman auth = No
        ntlm auth = Yes
        client NTLMv2 auth = No
        client lanman auth = No
        client plaintext auth = No
        preload modules =
        dedicated keytab file =
        kerberos method = default
        map untrusted to domain = No
        log level = 0
        syslog = 1
        syslog only = No
        log file = /var/log/samba/log.%m
        max log size = 50
        debug timestamp = Yes
        debug prefix timestamp = No
        debug hires timestamp = Yes
        debug pid = No
        debug uid = No
        debug class = No
        enable core files = Yes
        smb ports = 445 139
        large readwrite = Yes
        max protocol = NT1
        min protocol = CORE
        min receivefile size = 131072
        read raw = Yes
        write raw = Yes
        disable netbios = No
        reset on zero vc = No
        acl compatibility = auto
        defer sharing violations = Yes
        nt pipe support = Yes
        nt status support = Yes
        announce version = 4.9
        announce as = NT
        max mux = 50
        max xmit = 16644
        name resolve order = lmhosts wins host bcast
        max ttl = 259200
        max wins ttl = 518400
        min wins ttl = 21600
        time server = No
        unix extensions = Yes
        use spnego = Yes
        client signing = auto
        server signing = No
        client use spnego = Yes
        client ldap sasl wrapping = plain
        enable asu support = No
        deadtime = 0
        getwd cache = Yes
        keepalive = 300
        lpq cache time = 30
        max smbd processes = 0
        paranoid server security = Yes
        max disk size = 0
        max open files = 32768
        socket options = TCP_NODELAY
        use mmap = Yes
        hostname lookups = No
        name cache timeout = 660
        clustering = No
        ctdb timeout = 0
        load printers = Yes
        printcap cache time = 750
        cups encrypt = No
        cups connection timeout = 30
        disable spoolss = No
        show add printer wizard = Yes
        mangling method = hash2
        mangle prefix = 1
        max stat cache size = 256
        stat cache = Yes
        machine password timeout = 604800
        logon path = \\%N\%U\profile
        logon home = \\%N\%U
        domain logons = No
        init logon delay = 100
        os level = 20
        lm announce = Auto
        lm interval = 60
        preferred master = No
        local master = Yes
        domain master = Auto
        browse list = Yes
        enhanced browsing = Yes
        dns proxy = No
        wins proxy = No
        wins support = No
        kernel oplocks = Yes
        lock spin time = 200
        oplock break wait time = 0
        ldap delete dn = No
        ldap passwd sync = no
        ldap replication sleep = 1000
        ldap ssl = start tls
        ldap ssl ads = No
        ldap deref = auto
        ldap follow referral = Auto
        ldap timeout = 15
        ldap connection timeout = 2
        ldap page size = 1024
        ldap debug level = 0
        ldap debug threshold = 10
        lock directory = /var/db/samba
        state directory = /var/db/samba
        cache directory = /var/db/samba
        pid directory = /var/run/samba
        socket address = 0.0.0.0
        nmbd bind explicit broadcast = Yes
        afs token lifetime = 604800
        time offset = 0
        NIS homedir = No
        registry shares = No
        usershare allow guests = No
        usershare max shares = 0
        usershare owner only = Yes
        usershare path = /var/db/samba/usershares
        host msdfs = Yes
        passdb expand explicit = No
        idmap backend = tdb
        idmap cache time = 604800
        idmap negative cache time = 120
        template homedir = /home/%D/%U
        template shell = /bin/false
        winbind separator = \
        winbind cache time = 300
        winbind reconnect delay = 30
        winbind enum users = No
        winbind enum groups = No
        winbind use default domain = No
        winbind trusted domains only = No
        winbind nested groups = Yes
        winbind expand groups = 1
        winbind nss info = template
        winbind refresh tickets = No
        winbind offline logon = No
        winbind normalize names = No
        winbind rpc only = No
        create krb5 conf = Yes
        read only = Yes
        acl check permissions = Yes
        acl group control = No
        acl map full control = Yes
        create mask = 0744
        force create mode = 00
        security mask = 0777
        force security mode = 00
        directory mask = 0755
        force directory mode = 00
        directory security mask = 0777
        force directory security mode = 00
        force unknown acl user = No
        inherit permissions = No
        inherit acls = No
        inherit owner = No
        guest only = No
        administrative share = No
        guest ok = No
        only user = No
        hosts allow =
        hosts deny =
        allocation roundup size = 1048576
        aio read size = 1
        aio write size = 1
        aio write behind =
        ea support = No
        nt acl support = Yes
        profile acls = No
        map acl inherit = No
        afs share = No
        smb encrypt = auto
        block size = 1024
        change notify = Yes
        directory name cache size = 100
        kernel change notify = Yes
        max connections = 0
        min print space = 0
        strict allocate = No
        strict sync = No
        sync always = No
        use sendfile = Yes
        write cache size = 0
        max reported print jobs = 0
        max print jobs = 1000
        printable = No
        printing = cups
        cups options =
        print command =
        lpq command = %p
        lprm command =
        lppause command =
        lpresume command =
        queuepause command =
        queueresume command =
        printer name =
        use client driver = No
        default devmode = Yes
        force printername = No
        printjob username = %U
        default case = lower
        case sensitive = Auto
        preserve case = Yes
        short preserve case = Yes
        mangling char = ~
        hide dot files = Yes
        hide special files = No
        hide unreadable = No
        hide unwriteable files = No
        delete veto files = No
        veto files =
        hide files =
        veto oplock files =
        map archive = Yes
        map hidden = No
        map system = No
        map readonly = yes
        mangled names = Yes
        store dos attributes = No
        dmapi support = No
        browseable = Yes
        access based share enum = No
        blocking locks = Yes
        csc policy = manual
        fake oplocks = No
        locking = Yes
        oplocks = Yes
        level2 oplocks = Yes
        oplock contention limit = 2
        posix locking = Yes
        strict locking = Auto
        share modes = Yes
        dfree cache time = 0
        dfree command =
        copy =
        preexec =
        preexec close = No
        postexec =
        root preexec =
        root preexec close = No
        root postexec =
        available = Yes
        volume =
        fstype = NTFS
        set directory = No
        wide links = No
        follow symlinks = Yes
        dont descend =
        magic script =
        magic output =
        delete readonly = No
        dos filemode = No
        dos filetimes = Yes
        dos filetime resolution = No
        fake directory create times = No
        vfs objects =
        msdfs root = No
        msdfs proxy =

[homes]
        comment = Home Directories
        read only = No
        browseable = No

[share]
        comment = "test share"
        path = /usr/home/vcn64ultra/share
        force user = nobody
        force group = nobody
        read only = No
        force create mode = 0777
        force directory mode = 0777
        guest ok = Yes

[printers]
        comment = All Printers
        path = /var/spool/samba
        printable = Yes
        browseable = No
$
 
Well, reading your config and the thread you point to:

1; Mount the fs where you will save the samba shares with noatime, maybe you already did this.

2; Samba, the other thread says:

Code:
aio read size = 16384
aio write size = 16384
aio write behind = true

Your config has this:

Code:
aio read size = 1
aio write size = 1
aio write behind =

Play with these options:

Code:
socket options = SO_RCVBUF=8192 SO_SNDBUF=8192

8192, 16384, 32768, etc, etc.

What about your network speed? Is it running full?

If you move a file to the other server and vice versa, do you get full speed, 10MB/100MB, using other programs like ftp, scp, etc.?

What about your disk speed?

Have you run bonnie or bonnie++ on the fs where you save your shares?

Are there other processes eating CPU? When you are doing IO work, does your system show high IO wait?

Samba has a lot of options to work on, see you later :)
 
Thanks for helping! Sorry about taking a while to respond, I've been busy with stuff.

I assume the network speed is at 1000Mbit/s since the speeds I'm getting are above 100Mbit/s and 1000Mbit/s is the next step above 100Mbit/s. "Full" as in full duplex? Even with 1000Mbit/s at half duplex wouldn't I still be getting a theoretical max of 62.5Mbyte/s?
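
The arithmetic behind that question can be sketched roughly as follows. This is an approximation under stated assumptions: standard 1500-byte Ethernet frames, a 1460-byte TCP payload (matching the mssdflt value above and ignoring the extra bytes used by RFC 1323 timestamps), and 38 bytes of per-frame Ethernet overhead. The function names are hypothetical. Duplex mostly determines whether both directions can transmit at once; it should not by itself halve the ceiling of a one-way transfer.

```python
LINK_BITS_PER_SEC = 1_000_000_000  # gigabit Ethernet line rate

MTU = 1500             # IP packet size in bytes (standard Ethernet, assumed)
TCP_PAYLOAD = 1460     # MTU minus 20-byte IP and 20-byte TCP headers
FRAME_OVERHEAD = 38    # Ethernet header 14 + FCS 4 + preamble 8 + inter-frame gap 12

def raw_mb_per_s(bits_per_sec):
    """Line rate in MB/s (1 MB = 10**6 bytes), before any protocol overhead."""
    return bits_per_sec / 8 / 1e6

def tcp_payload_mb_per_s(bits_per_sec):
    """Approximate ceiling for TCP payload once per-frame overhead is subtracted."""
    efficiency = TCP_PAYLOAD / (MTU + FRAME_OVERHEAD)
    return raw_mb_per_s(bits_per_sec) * efficiency

print(f"raw line rate:    {raw_mb_per_s(LINK_BITS_PER_SEC):.1f} MB/s")
print(f"TCP payload rate: {tcp_payload_mb_per_s(LINK_BITS_PER_SEC):.1f} MB/s")
```

Under these assumptions a gigabit link tops out well above both the dd figures and the samba figures in this thread, so the wire itself is unlikely to be the limit.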

Sorry, I have never done a command line based transfer to another computer before. All of my inter-computer transfers have been managed by GUI programs, but neither of my servers has a GUI installed.
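
One way to test the network alone from the command line, without ftp or scp, is a small script that pushes zeros over a plain TCP connection and discards them on the other end. The following is a minimal sketch, not a replacement for a proper benchmark tool: the script name `tcptest.py`, the function names, and the 64KB chunk size are all arbitrary choices made for illustration.

```python
import socket
import sys
import time

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)

def make_listener(port):
    """Open a listening TCP socket on all interfaces (port 0 picks a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", port))
    listener.listen(1)
    return listener

def receive_all(listener):
    """Accept one connection, discard its data, return (bytes, MB/s)."""
    conn, _addr = listener.accept()
    total = 0
    start = time.monotonic()
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender closed the connection
                break
            total += len(data)
    elapsed = max(time.monotonic() - start, 1e-9)
    return total, total / elapsed / 1e6

def send_zeros(host, port, total_bytes):
    """Send total_bytes of zeros to host:port, then close the connection."""
    payload = bytes(CHUNK)
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent

if __name__ == "__main__":
    args = sys.argv[1:]
    if len(args) == 2 and args[0] == "recv":
        total, rate = receive_all(make_listener(int(args[1])))
        print(f"received {total} bytes at {rate:.1f} MB/s")
    elif len(args) == 4 and args[0] == "send":
        send_zeros(args[1], int(args[2]), int(args[3]) * 1_000_000)
    else:
        print("usage: tcptest.py recv PORT | tcptest.py send HOST PORT MEGABYTES")
```

Run the `recv` form on one machine and the `send` form on the other (for example with 1000 megabytes), then swap roles to test the opposite direction. Because no disk or samba is involved, a result far above the samba numbers would point away from the network.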

Nothing is eating CPU or doing anything whatsoever, especially on the Atom test system: it is a clean install of FreeBSD with the addition of samba and, to my amazement, 55MB of used RAM.

Code:
$ bonnie++
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
diamondville     4G   106  99 47792  25 17006  17   276  99 49481  16 162.9  39
Latency               285ms     267ms    1066ms   42729us   44813us     707ms
Version  1.96       ------Sequential Create------ --------Random Create--------
diamondville        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 12696  78 +++++ +++ 28569  99  7931  52 +++++ +++ 26889  99
Latency             73885us      82us      76us   70804us      64us     118us
1.96,1.96,diamondville,1,1288961299,4G,,106,99,47792,25,17006,17,276,99,49481,16,162.9,
39,16,,,,,12696,78,+++++,+++,28569,99,7931,52,+++++,+++,26889,99,285ms,267ms,1066ms,
42729us,44813us,707ms,73885us,82us,76us,70804us,64us,118us
$

I don't know what the +++++ stuff is about...
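
The bonnie++ documentation appears to describe `+++++` as what is printed when a test finishes too quickly to be timed accurately, so those columns can probably be ignored here. The sequential block columns are the ones comparable to the samba numbers. The sketch below pulls them out of the CSV line above and converts them to MB/s. The field positions were worked out by matching the CSV values against the table in this post rather than taken from a specification, K is assumed to mean 1024 bytes, and the names are hypothetical.

```python
# CSV summary line from the bonnie++ run above, re-joined into one string.
CSV = (
    "1.96,1.96,diamondville,1,1288961299,4G,,106,99,47792,25,17006,17,"
    "276,99,49481,16,162.9,39,16,,,,,12696,78,+++++,+++,28569,99,7931,52,"
    "+++++,+++,26889,99,285ms,267ms,1066ms,42729us,44813us,707ms,"
    "73885us,82us,76us,70804us,64us,118us"
)

# Field positions inferred by matching values against the table (an assumption).
BLOCK_FIELDS = {"block write": 9, "rewrite": 11, "block read": 15}

def k_per_sec_to_mb(k_per_sec):
    """Convert K/sec to MB/s, assuming K = 1024 bytes and MB = 10**6 bytes."""
    return k_per_sec * 1024 / 1e6

def block_rates(csv_line):
    """Return {name: MB/s} for the sequential block columns of one CSV line."""
    fields = csv_line.split(",")
    return {name: k_per_sec_to_mb(float(fields[i])) for name, i in BLOCK_FIELDS.items()}

for name, rate in block_rates(CSV).items():
    print(f"{name}: {rate:.1f} MB/s")
```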

I should mention that through samba, when the write speed isn't just in the 20s or low 30s, it starts in the 20s for a few seconds then shoots up to 45-50MB/s. This causes the benchmark result to be lower than 45-50MB/s (but higher than low 30s). From now on I'm going to observe what the sped-up value ends up being according to my network meter. The read speed is constant and slower than write.

I tried the line
Code:
socket options = SO_RCVBUF=8192 SO_SNDBUF=8192
with some values from 8192 to 134217728. With the low values both read and write speeds were horrible, in the single digits (read was sometimes 0.70MB/s). With the higher values the write speed became the same as what bonnie++ says is the max of the HDD. That's good. However, the read speed is still horrible at 1.60MB/s or so.
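
One possible explanation for the small-buffer results is that a fixed socket buffer caps how much data can be in flight, so throughput cannot exceed roughly the buffer size divided by the round-trip time. The sketch below illustrates that bound. The 0.5 ms round-trip time is purely hypothetical (nothing in this thread measured it), and the function name is made up for illustration. The largest value tried is also above the kern.ipc.maxsockbuf setting shown earlier, so the kernel may have refused it. None of this explains the slow reads seen with very large buffers.

```python
def window_limit_mb_per_s(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most window_bytes can be in flight."""
    return window_bytes / rtt_seconds / 1e6

ASSUMED_RTT = 0.0005       # 0.5 ms, hypothetical LAN round-trip time
MAXSOCKBUF = 16777216      # kern.ipc.maxsockbuf from the sysctl.conf above

for size in (8192, 16384, 32768, 65536, 134217728):
    limit = window_limit_mb_per_s(size, ASSUMED_RTT)
    note = " (exceeds kern.ipc.maxsockbuf)" if size > MAXSOCKBUF else ""
    print(f"{size:>9} bytes -> at most {limit:,.1f} MB/s{note}")
```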

When I commented out the above line the speeds went back to normal where both read and write were in the 20s or low 30s. This should mean that while I did do the other changes such as "aio read size = 16384", only the above line actually made a difference.

I tried again with
Code:
socket options = SO_RCVBUF=134217728
to see if the read speed would go back to normal (although I want better than normal! :D) but it didn't. I still get less than 2MB/s for read speed.

Assuming the write is all fine now, what can I do about the read speed?? Thanks!
 
Regarding your terrible performance:

Can I ask one question? Is this a Dell server, or does it have an LSI RAID controller? I have seen 0.6 and 1.2 MByte/s transfers before with LSI RAID controllers. Some genius made their controllers turn the cache off unless you specifically request that it be turned back on (a bit like giving you a race car with a hidden hand brake on).
Have you tried gstat to see the disk load while you're transferring over SMB?
Have you looked at vmstat to get the IO stats? As the previous poster asked, have you tried another kind of transfer? scp might not be a good benchmark because, unless you install OpenSSH with the HPN patch, its buffers never keep up with the transfer. ftp would be your best bet for testing, but even WinSCP from Windows over ssh would at least tell you if the disk IO is too high.
There is an entry you put in sysctl.conf to enable the cache. Let me know if it's an LSI controller and I'll dig out the entry for you.
 
The controller would be whatever controller is integrated in the Intel south bridge (model NH82801GB).

However, I have given up on FreeBSD. NexentaStor gets significantly better throughput on ZFS, which should be due to its kernel CIFS implementation being better than Samba.
 