ZFS Limit Delete Priority

vermaden

Son of Beastie

Thanks: 1,078
Messages: 2,710

#1
The problem:

When I start deleting a large amount of data (5-10 GB), ZFS prioritizes the deletion and the system becomes VERY unresponsive.

Which ZFS parameter should I change to give file deletion a much lower priority? ... or how can I prioritize reads with ZFS?
 
OP
vermaden

Son of Beastie

Thanks: 1,078
Messages: 2,710

#3
@VladiBG

Thank you for trying, but it's not that case. It can be ONE large file 5-10 GB in size (a movie or a VM disk) and the behavior is the same: ZFS tries to 'deallocate' all used blocks as fast as possible, reads are dead during that time, and the system literally freezes, which is unacceptable.

I think I have seen a discussion somewhere about sysctl(8) values that can be set to slow this down, but I cannot find it anymore.
 

sko

Well-Known Member

Thanks: 213
Messages: 419

#4
I *think* what you are looking for are the various sysctls for dirty data, namely vfs.zfs.dirty* and vfs.zfs.vdev.async_write_active_[min|max]_dirty_percent.
I suspect that deleting a big amount of data fills up the maximum allowed amount of dirty data ("in flight" writes not yet committed to disk), so ZFS is constantly trying to get the TXGs committed to disk.
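To see what you are currently running with, something like this should work on 11.x (the exact OID names can vary between releases, so grep the subtree first):
Code:
% sysctl vfs.zfs | grep dirty
% sysctl vfs.zfs.vdev.async_write_active_min_dirty_percent
% sysctl vfs.zfs.vdev.async_write_active_max_dirty_percent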

The default values usually work at least "good enough" except for some really extreme edge cases, and should never cause the problems you are seeing. So I highly suspect there is another root cause - e.g. a dying disk that degrades pool performance, or heavy memory pressure on the system.
I write and delete large disk images of well over 100 GB (full-disk backups of some clients) on our storage server and have never seen anything like the behaviour you describe. Even my desktop machine, with much less RAM and fewer disks, never became unresponsive when I hammered its single ZFS pool with similar tasks.

If you still want/need to adjust some ZFS knobs, there is no "master recipe" for tuning these (or other ZFS-related) sysctls - you have to carefully monitor the system behaviour under load to understand where the bottleneck is. DTrace is your very best friend for this. I can *highly* recommend reading the sections on "Performance" and "Tuning" in "FreeBSD Mastery: Advanced ZFS" by Michael W. Lucas and Allan Jude. They provide a structured method for identifying performance bottlenecks, as well as some example DTrace scripts that can be adjusted to your needs. The DTraceToolkit (available from ports and packages) also has a lot of ZFS-, disk- and I/O-related scripts that can help narrow down the exact bottleneck.
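If I remember the port name correctly it is sysutils/DTraceToolkit, and the install path below is from memory (check with pkg info -l DTraceToolkit if it differs); roughly:
Code:
# pkg install DTraceToolkit
# kldload dtraceall
# ls /usr/local/share/DTraceToolkit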
That being said, I still suspect there is a much easier solution to your problem - so you should start from a high level and narrow down the true root cause.

What layout does the pool on which you are seeing this behaviour have? Does the pool's ashift fit the drives' block size?
Any errors reported by zpool status? Any memory throttle counts reported by zfs-stats -A?
During deletion of large files, try monitoring the pool with zpool iostat -v 1 - do the "operations" and "bandwidth" numbers look plausible, and are they relatively evenly distributed across all vdevs and providers? As said, a single dying or misbehaving drive can send the performance of the whole pool into the abyss. SSDs that have reached their maximum wear level (or minimum wearout indicator for Intel) are notorious for this, because they tend to throttle back to sub-1 MiB/s throughput levels.
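So, while a big rm is running in another terminal, roughly this sequence (nothing here is destructive):
Code:
% zpool status -v
% zfs-stats -A
% zpool iostat -v 1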
 
OP
vermaden

Son of Beastie

Thanks: 1,078
Messages: 2,710

#6
@ sko

Thanks, I will look into it.

This is a single SSD drive on GELI (aligned to 4k), and the ZFS pool is also aligned to 4k (ashift=12). The disk is not dying. I will check the sysctl(8) values you mentioned. Thanks.
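For the record, this is roughly how the alignment can be double-checked (the pool name zroot below is just an example, and zdb -C needs the pool to be in zpool.cache):
Code:
% gpart show                     # partition offsets should be 4k-aligned
% geli list | grep Sectorsize    # GELI provider should report 4096
% zdb -C zroot | grep ashift     # ashift: 12 means 4k blocks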
 
OP
vermaden

Son of Beastie

Thanks: 1,078
Messages: 2,710

#7
zfs_free_max_blocks and zfs_free_min_time_ms - neither of them exists here:
Code:
% sysctl zfs_free_max_blocks
sysctl: unknown oid 'zfs_free_max_blocks'

% sysctl zfs_free_min_time_ms
sysctl: unknown oid 'zfs_free_min_time_ms'

% uname -spr
FreeBSD 11.2-RELEASE amd64

% sysctl -a | grep zfs_free_min_time_ms
(none)

% sysctl -a | grep zfs_free_max_blocks 
(none)
 

t1066

Active Member

Thanks: 84
Messages: 226

#9
Actually, they are vfs.zfs.free_max_blocks and vfs.zfs.free_min_time_ms.
And you can get a list of the ZFS sysctl variables, each with a one-line description, by using
sysctl -d vfs.zfs
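For example (the value below is just an illustration, not a recommendation, and setting it needs root):
Code:
% sysctl -d vfs.zfs | grep free
# sysctl vfs.zfs.free_max_blocks=100000
To make it persistent, the same setting can go into /etc/sysctl.conf.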
 