Solved Samba sends file in bursts.

I have been battling with a strange issue for the past two weeks, the like of which I have neither experienced nor heard of. I know it is advisable to be verbose when you're describing a technical problem to technicians, but I'll try to explain using as few words as I can, hoping someone will understand what is going on.

There is a ProLiant DL120 G7 machine (16 GB of RAM) running FreeBSD 14.1 and Samba 4.16. There are 7 Kingston enterprise SSDs connected to a flashed Dell PERC H310 card. (This is the second time I have built such a machine; the previous one ran version 14.0.) When I copy a file from a Windows client, it starts fast, at almost 1 Gbps; 7 to 10 seconds later it drops to 0 Gbps, and after staying like that for another 7 to 10 seconds, it quickly resumes at full speed. All the while, pinging continues without any interruption. When I connect to it directly, i.e. with a CAT6 patch cable, it works fine.

I believe it has to be a problem with cache or something. Does anyone have an idea what is going on?

Code:
[global]
    server string = NAS
    netbios name = nas
    workgroup = WORKGROUP
    security = user
    force group = wheel
    create mode = 777
    wide links = no
    force create mode = 777
    max log size = 50
    delete veto files = yes
    veto files = /Thumbs.db/.DS_Store/._.DS_Store/.apdisk/desktop.ini/
    force directory mode = 777
    use sendfile = true
    socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
    aio write size = 16384
    aio read size = 16384
    aio write behind = true
    delete readonly = yes
    os level = 20
    min receivefile size = 16384
    load printers = no
    force user = root
    directory mode = 777
    read raw = true
    dos charset = cp866
    unix charset = UTF-8
    store dos attributes = no
    ea support = no
    map archive = no
    map hidden = no
    map system = no
    map readonly = no
    server signing = mandatory
 
What filesystem? ZFS?
Yes. There are two pools:
Code:
[NAS][root ~] > zpool status
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz3-0  ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          ada0p3    ONLINE       0     0     0

errors: No known data errors

Just tried using rar to copy a 1.7GB MP4 file from tank into zroot and there were no issues. (I used rar since it shows progress.)

Or even locally?
Tried the same within tank, and it worked fine.

Does the same behavior appear with NFS?
Created an NFS share and mounted it on the Windows 11 machine; again it worked.

Now I am getting convinced that this has to be a Samba issue.

PS: I started tinkering with FreeBSD's net.inet.tcp.recvspace and net.inet.tcp.sendspace sysctls, setting them both to 131072, and with Samba's SO_RCVBUF and SO_SNDBUF, both at 262144 now, and I have seen an improvement. The copy operation now drops to 0 Mbps only once for the 1.74GB MP4 file.
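
For context on those buffer sizes, a rough bandwidth-delay product calculation shows how much data has to be in flight to keep a link full. This is only a sketch of the arithmetic; the 1 ms round-trip time is an assumed LAN figure, not something measured on this network, and bdp_bytes is a made-up helper name.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_microseconds: int) -> int:
    """Bytes that must be in flight to keep the link busy for one round trip."""
    # bits/s * seconds = bits; divide by 8 for bytes.
    # Integer arithmetic avoids floating-point rounding.
    return bandwidth_bits_per_s * rtt_microseconds // 8_000_000


# Assumed values: 1 Gbps link, 1 ms round-trip time (hypothetical).
needed = bdp_bytes(1_000_000_000, 1000)
print(needed)            # 125000
print(needed <= 131072)  # True: a 128 KiB buffer covers this assumed case
```

Under those assumptions a 131072-byte buffer is already about one bandwidth-delay product, so larger socket buffers would not be expected to change much on a healthy LAN.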
 
A lot of those options are either redundant or can cause interesting performance characteristics.

Try something like this in your global section; you should change interfaces, though.

Code:
[global]
workgroup = WORKGROUP
server string = Storage Server Blobbie
log file = /var/log/samba.log
max log size = 10240
bind interfaces only = true
interfaces = igc0
disable netbios = yes
directory name cache size = 0
aio write size = 0
load printers = no
printing = bsd
printcap name = /dev/null
veto files = /Thumbs.db/.DS_Store/._.DS_Store/.apdisk/
delete veto files = yes
enable core files = no
multicast dns register = no
 
So use that and ditch all the others?

PS: Done that. The copy operation of the 1.74GB MP4 file comes to a halt just once.
 
What do you have in sysctl.conf?

Code:
[NAS][root ~] > cat /etc/sysctl.conf
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
vfs.zfs.min_auto_ashift=12

net.inet.tcp.recvspace=131072
net.inet.tcp.sendspace=131072
 
Update:

It works fine if I disconnect the pfSense (v2.7.2) box, which is configured to run dnsmasq.

And this is my resolv.conf file:
Code:
nameserver 172.16.50.1 # the pfSense 2.7.2 router

Does that ring any bells for anyone?
 
DNS is only involved initially when the connection is set up, to get the IP address. Once the connection is made DNS is irrelevant.

When I connect to it directly, ie with a CAT6 patch cable, it works fine.
It begs the question, what's normally in between the client and the server? Are you routing through that pfSense box too? That might be throttling the traffic?
 
So what is the problem? Why would the NAS send the file at full speed at first and then come to a halt about halfway through?
 
It begs the question, what's normally in between the client and the server? Are you routing through that pfSense box too? That might be throttling the traffic?

Nothing but the switch. The server is on two VLANs: 1U, 101T. The client is on one VLAN: 101U.
 
Don't try to fix something that's not broken (i.e. your sysctl.conf; the defaults are more than good).
Check the MTU / MSS and ICMP filtering on the firewall.
 
Just tried blocking all traffic from the server to the router; it didn't solve the problem. So I guess there is more to this than I had anticipated.

Also try an scp / ftp / netcat copy. If that works, then continue with Samba.
scp definitely works; I believe there is something about Samba 4.16 that causes this.
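
When comparing protocols like this, it can help to log bytes per second for each transfer and look for runs of zero. Here is a small sketch of that check. The sample numbers are made up to resemble the pattern described in the first post, and stall_periods is a hypothetical helper, not part of any tool mentioned in this thread.

```python
from typing import Iterable, List, Tuple


def stall_periods(bytes_per_second: Iterable[int],
                  min_seconds: int = 2) -> List[Tuple[int, int]]:
    """Return (start_second, duration) for each run of zero-byte seconds
    that lasts at least min_seconds."""
    stalls = []
    run_start = None
    # A non-zero sentinel at the end closes a stall that runs to the last sample.
    samples = list(bytes_per_second) + [1]
    for second, count in enumerate(samples):
        if count == 0:
            if run_start is None:
                run_start = second
        elif run_start is not None:
            duration = second - run_start
            if duration >= min_seconds:
                stalls.append((run_start, duration))
            run_start = None
    return stalls


# Made-up samples: 8 s at roughly line rate, 8 s stalled, then 4 s fast again.
samples = [110_000_000] * 8 + [0] * 8 + [110_000_000] * 4
print(stall_periods(samples))  # [(8, 8)]
```

If scp and NFS show no such runs while the SMB copy does, that narrows things down to whatever differs about the SMB path, which is not necessarily Samba itself.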
 
Just curious... is there a caching problem on the receiving end?
Perhaps it is throttling the sender due to slow writes on the receiving end.
 
No issues when just the server and the client are connected to the switch. I am planning to do further testing. Any ideas? Suggestions?
 
Check the routing between the vlans. Allow all ICMP and verify the MTU.
If there's some packet inspection you can try to turn it off.
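
Verifying the MTU usually means sending pings of a known size with the Don't Fragment bit set and seeing whether they get through. A sketch of the arithmetic behind the usual numbers, assuming IPv4 and TCP headers without options and a standard 1500-byte Ethernet MTU (the function names are made up for illustration):

```python
IPV4_HEADER = 20   # bytes, no IP options
ICMP_HEADER = 8    # bytes, echo request header
TCP_HEADER = 20    # bytes, no TCP options


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss(mtu: int) -> int:
    """TCP maximum segment size expected for the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(max_ping_payload(1500))  # 1472
print(tcp_mss(1500))           # 1460

# The payload size is what goes into a don't-fragment ping, e.g.
#   FreeBSD: ping -D -s 1472 <host>
#   Windows: ping -f -l 1472 <host>
```

If a 1472-byte don't-fragment ping fails between the VLANs while a smaller one succeeds, some hop has a smaller MTU, and blocked ICMP would keep path MTU discovery from correcting it.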
 
I believe I have got to the bottom of the mystery.

First of all, my NAS was misconfigured:
Code:
[NAS][root ~] > cat /etc/rc.conf | grep defaultrouter
defaultrouter="172.16.51.1"

So I corrected that first:

Code:
[NAS][root ~] > cat /etc/rc.conf | grep defaultrouter
defaultrouter="172.16.50.1"
Then rebooted.
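
For anyone checking their own defaultrouter value, one quick sanity test is whether the gateway address sits inside a subnet the machine is directly attached to. A minimal sketch follows; the thread does not state the NAS's addresses or netmasks, so the /24 network below is purely an assumption and on_link_network is a hypothetical helper.

```python
import ipaddress
from typing import Iterable, Optional


def on_link_network(gateway: str, local_networks: Iterable[str]) -> Optional[str]:
    """Return the first local network that contains the gateway, or None."""
    gw = ipaddress.ip_address(gateway)
    for net in local_networks:
        if gw in ipaddress.ip_network(net):
            return net
    return None


# Hypothetical: the interface facing the router is on 172.16.50.0/24
# (the prefix length is assumed, not taken from the thread).
local = ["172.16.50.0/24"]
print(on_link_network("172.16.50.1", local))  # 172.16.50.0/24
print(on_link_network("172.16.51.1", local))  # None
```

On a multi-homed host the gateway may well be on-link for one of the VLANs and still be the wrong choice, so this only catches the most obvious mistakes.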

Then I created a block rule on the subnet that PCs are on:
[Attached screenshot of the rule: 1730820079217.png]


Then rebooted the client. Now it works.

Please accept my apologies.
 