Solved Samba36 performance problems

I have a strange Samba behavior here. CPU usage rises to 100% for the single thread copying files to or from my Samba file server.

Situation:
FreeBSD 10.1-RELEASE upgraded from 9.2-RELEASE
net/samba36 within ezjail
zpool as RAID10, compression for ZFS dataset set to lz4
Gigabit em0: throughput when copying (when not hitting the problem, according to systat/ifstat; a sketch of how this was sampled is below)
- 110.187 MB/s (peak)
- roughly 50 MB/s (copying medium-sized files)
- around 10 MB/s (copying small files, ~100 kB each)
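For reference, this is roughly how those numbers were sampled (interface name as above, exact invocations from memory):
Code:
systat -ifstat 1      # live per-interface throughput, 1 second refresh
netstat -I em0 -w 1   # alternative: bytes in/out on em0 every second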

Tests I did so far:
- Activated and deactivated various options in /usr/local/etc/smb.conf (socket options, aio options, etc.) with no change
- Copied about 300 GB to and from the server with mixed file sets
- Took packet captures on the server and the workstation (roughly as sketched below) and found no errors like dropped packets or retransmissions
- Updated the base system with freebsd-update and the ezjails with ezjail-admin update -u, then restarted the server
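On the server side the capture was taken roughly like this (port 445 is an assumption for plain SMB over TCP; the workstation used its own capture tool):
Code:
tcpdump -i em0 -s 0 -w /tmp/smb_server.pcap port 445
# inspected afterwards in Wireshark, filtering on tcp.analysis.retransmission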

Results so far:
- When the single smbd thread hits the 100% CPU limit, performance drops to about 1 kB/s without seeing any problems in systat/iostat or zpool iostat
- Switching compression on the ZFS datasets on and off does not change the performance
- Tweaking sysctl values (like those shown at https://calomel.org/freebsd_network_tuning.html, as I did on 9.2-RELEASE) or leaving them at the 10.1-RELEASE defaults does not change the performance; examples of the kind of values tried are sketched below
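To give an idea of the kind of values involved, these are illustrative examples only, not the exact set from that page:
Code:
sysctl kern.ipc.maxsockbuf=4194304        # raise the socket buffer ceiling
sysctl net.inet.tcp.sendbuf_max=4194304   # let TCP send buffer auto-tuning grow further
sysctl net.inet.tcp.recvbuf_max=4194304   # same for the receive side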

Ideas:
- Is it dependent on a specific file type (content)? No reason why that should have any impact.
- Is it dependent on the number of files within the target directory? Does performance drop with increasing number of files in the directory?

What is strange: the same configuration worked fine with samba36 on 9.2-RELEASE, and I did not change anything in it during the upgrade.
 
Performance seems to be impacted by the number of files in the target directory. When copying files to a subdirectory containing:
- about 10,000 files, throughput drops to around 5 MB/s
- about 40,000 files, throughput drops to around 2 kB/s

As soon as the copy job reaches another subdirectory with no or just a few files, performance is immediately restored.

Any ideas why this happens? As far as I remember this behavior appeared with the upgrade to 10.1-RELEASE, but that may be a coincidence. Is this influenced by some kind of file alteration monitoring or directory index caching? I use the standard package version, net/samba36-3.6.24_2.

Thanks in advance.

P.S. I am not too stupid to organize my files ;) but I sometimes have to do some forensics and file carving so I end up with a lot of files of one file type in a subdirectory.
 
Is this influenced by a kind of file alteration monitoring or directory index caching?

For this question in particular, check the following during a heavy copy.
Code:
sysctl kern.maxfiles
sysctl kern.maxfilesperproc
sysctl kern.openfiles
fstat | grep smb

That should reveal just how much load opening a file descriptor to monitor files is creating. I would be curious whether running /usr/share/dtrace/toolkit/opensnoop during a copy reveals anything extra.
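For example, something along these lines should narrow the output down to Samba (assuming the toolkit's opensnoop supports the usual -n name filter; otherwise just grep the output):
Code:
/usr/share/dtrace/toolkit/opensnoop -n smbd
# or, without the filter:
/usr/share/dtrace/toolkit/opensnoop | grep smbd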

In context of your entire issue, I'm not sure what else to add just yet. Nothing else stands out or comes to mind.
 
For this question in particular, check the following during a heavy copy.
Code:
sysctl kern.maxfiles
sysctl kern.maxfilesperproc
sysctl kern.openfiles
fstat | grep smb

That should reveal just how much load opening a file descriptor to monitor files is creating. I would be curious whether running /usr/share/dtrace/toolkit/opensnoop during a copy reveals anything extra.

Here are the new results for the suggested checks:
kern.maxfiles = 200000
kern.maxfilesperproc = 32684
kern.openfiles: I polled it several times during a heavy copy; it stayed in a constant range of roughly 3000 to 4500 open files.
fstat | grep smb: Where can I see the load resulting from these calls? I checked the output but cannot find anything unusual.
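In case it helps, the fstat output can be summarized roughly like this (the PID is a placeholder for the busy worker):
Code:
fstat | grep smbd | wc -l    # total descriptors held by smbd processes
fstat -p <smbd-pid> | wc -l  # descriptors of the single busy worker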

Further tests show that a local copy on the server between different ZFS datasets shows no (or only a minimal) performance decrease as the number of files in the target directory grows, so I do not think it is a problem with ZFS. As I do not see any dropped packets, I also rule out the network buffers as the bottleneck. Only the smbd thread climbs to 100% as the number of files in the target directory grows.
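Something along these lines is what I mean by a local copy between datasets (paths are placeholders):
Code:
time cp -R /tank/data/testset /tank/other/manyfiles/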

I tested several options in /usr/local/etc/smb.conf, but no luck here:
Code:
# file attribute settings general   
store dos attributes = no   
#unix charset = UTF8   
unix extensions = yes   
wide links  = no  # disabled by unix extensions anyway
nt acl support  = yes   
inherit acls  = no   
map acl inherit = yes   
vfs objects  = zfsacl   
nfs4:mode  = special   
nfs4:acedup  = merge   
nfs4:chown  = yes  

# tuning from http://ogre.ikratko.com/archives/347   
# more tipps from https://www.samba.org/samba/docs/man/Samba3-HOWTO/speed.html   
kernel oplocks = no  # only on Linux systems   
sync always = no   
strict locking = no   
strict sync = no   
getwd cache = yes   
log level = 0   
#max xmit = 65535   
#read size = 65535   
#min receivefile size=16384   
#use sendfile=true   
aio read size = 16384  # Use asynchronous I/O for reads bigger than 16KB request size (default 0)
aio write size = 16384   
write cache size = 1048576  # 262144 for a 256k cache size per file   
#aio write behind = true   
#directory name cache size = 0 # need to be turned on for BSD systems (default 0)   
#stat cache = yes  # should never be changed (default yes)   
#max stat cache size = 2048  # in kB, should not be changed (default 256)   
#min receivefile size = 128  # max 128k (default 0)

/usr/share/dtrace/toolkit/opensnoop showed only a single difference compared to a local copy with mc (which I used just to get a progress bar and some kind of throughput indication at the application layer): when copying via Samba the file descriptors cycle through 1-200 without any error codes, while the mc copy just cycles through 9-10. That may not be important, though.

/usr/share/dtrace/toolkit/hotkernel during copy shows:
Code:
[...]
kernel`bzero 17 0.0%
zfs.ko`zap_leaf_array_read 18 0.0%
kernel`__rw_wlock_hard 22 0.0%
kernel`sx_try_xlock_ 23 0.0%
zfs.ko`zap_leaf_lookup_closest 31 0.1%
kernel`_sx_xunlock 31 0.1%
kernel`atomic_add_long 37 0.1%
zfs.ko`lz4_compress 39 0.1%
kernel`bcopy 40 0.1%
kernel`spinlock_exit 80 0.2%
zfs.ko`lzjb_compress 102 0.2%
zfs.ko`list_prev 110 0.2%
zfs.ko`l2arc_feed_thread 114 0.2%
kernel`cpu_idle 2341 5.0%
kernel`acpi_cpu_c1 43288 92.4%
 
I'm not a Samba expert at all, so I can't add anything about your configuration. The file handles and the DTrace output suggest things are normal: files are opened and closed, and there don't appear to be any issues with the file alteration monitoring or directory index caching you mentioned. On the kernel side, it looks like things are mostly idle.

Maybe somebody with more Samba experience will drop by in the meantime. I would think this is a good opportunity to try out profiling with a DTrace flame graph. That's what I would try next, though for what it's worth I've never used it to solve a production issue before and have only tried it to see how it worked. If something is pegging the thread at 100%, I would expect it to show up on the flame graph. Take a look at the link below, under DTrace. They describe profiling mysqld and it should be pretty trivial to run the same steps against smbd.

http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
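Roughly, it boils down to the recipe on that page with mysqld swapped for smbd (adjust the sampling window to taste):
Code:
# sample on-CPU user stacks of smbd for 60 seconds
dtrace -x ustackframes=100 -n 'profile-997 /execname == "smbd" && arg1/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.stacks
# fold the stacks and render the SVG with the FlameGraph scripts
git clone https://github.com/brendangregg/FlameGraph
FlameGraph/stackcollapse.pl out.stacks > out.folded
FlameGraph/flamegraph.pl out.folded > smbd.svg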
 
... If something is pegging the thread at 100%, I would expect it to show up on the flame graph. Take a look at the link below, under DTrace. They describe profiling mysqld and it should be pretty trivial to run the same steps against smbd.

http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

Thank you very much, I will definitely have a look into this. I have no DTrace knowledge, but it seems only "mysqld" needs to be changed to "smbd". I did some more testing but cannot find the root cause yet. I think there is only a single bit of information I am missing, maybe even a misconfiguration aka "tuning" by myself...
I think I'll try that on another machine and/or start over from a fresh smb.conf. testparm and testparm -v (to show all available options for smb.conf) do not give any hints or errors; the invocations I mean are sketched below. testparm from net/samba4 does complain about the trailing "# ..." comments, e.g. behind "aio read size", so I deleted the appended hashes and comments.
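For the record (config path per this setup):
Code:
testparm -s /usr/local/etc/smb.conf                   # parse check, dump the non-default settings
testparm -sv /usr/local/etc/smb.conf | grep -i aio    # include defaults, peek at the aio values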
 
Interesting, is that behavior net/samba36 specific or common to net/samba4 and net/samba41 as well? I guess it makes sense that the one worker process is at 100% if it spends most of its time opening and closing files to check them rather than handling network traffic. I'd be curious to find out if there is a performance difference in newer versions.
 
Ok, I finally found the critical bit :) It is the option
Code:
case sensitive = Auto
in /usr/local/etc/smb.conf. "Auto" is the default; it makes smbd compare new names against the existing directory contents case-insensitively, which leads to the low performance described above when writing to subdirectories with 10,000+ files.
As soon as I set
Code:
case sensitive = Yes
the performance is restored and normal.

Could someone please confirm that?
I tested it only with net/samba36-3.6.24_2, with Linux and Windows clients.
Thanks.
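For anyone trying to reproduce this, the change plus a config reload is roughly (from memory, so treat it as a sketch):
Code:
# in /usr/local/etc/smb.conf, globally or in the affected share:
#   case sensitive = yes
smbcontrol smbd reload-config               # or: killall -HUP smbd
testparm -sv | grep -i "case sensitive"     # confirm the effective value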
 
With a for i in `seq 1 100000`; do touch `jot -r 1`; done to make a directory full of test files and a jail running net/samba41, I see CPU hit 90% when I browse to that directory and 60% when I set case sensitivity to yes. Seems like there's no way around the workload of listing everything in the directory.
 
With a for i in `seq 1 100000`; do touch `jot -r 1`; done to make a directory full of test files and a jail running net/samba41, I see CPU hit 90% when I browse to that directory and 60% when I set case sensitivity to yes. Seems like there's no way around the workload of listing everything in the directory.

junovitch: Please try writing to that directory while switching between "case sensitive = Auto" (the default) and "Yes". Listing/reading/browsing does not change because it does not invoke the "compare the new file name against the existing file names in that directory" step. The impact when writing should be much bigger.
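Something along these lines should show the difference when writing (server, share, user, and directory names are placeholders):
Code:
dd if=/dev/zero of=testfile bs=1m count=10
time smbclient //server/share -U user -c 'cd manyfiles; put testfile'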
 
Confirmed on net/samba41. My share is a read-only nullfs mount on this test jail, but with case sensitive set to yes the response that I don't have write access is instant. With case sensitive set to auto it takes several minutes to come back and tell me that. The DTrace flame graph doesn't show the stack names in the latter case, but it does show a massive amount of time spent on what must be case sensitivity checks. The stack frames that show up in the baseline, when the copy is not running, account for less than 10% of what is running on the CPU.

Also, I meant to say I did for i in `seq 1 100000`; do touch $i; done. What I posted above was the first command I tried, but it was a mistake because it just re-touches the same ~100 files 100,000 times. The corrected one creates 100,000 distinct files; see the comparison below.
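For clarity, the two loops side by side (jot -r 1 prints one random number between 1 and 100 by default):
Code:
for i in `seq 1 100000`; do touch `jot -r 1`; done   # mistake: re-touches the same ~100 files
for i in `seq 1 100000`; do touch $i; done           # intended: creates 100,000 distinct files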
 