ZFS High CPU on snapshot destroy

I have a PostgreSQL DB server running FreeBSD 10.3, and any time I prune snapshots the CPU spikes and I see database timeouts. Is there a sysctl tunable that affects the priority of snapshot deletion? I was looking at the vfs.zfs.vdev.(sync|async|scrub)_* parameters, but it isn't clear to me whether the "scrub" values would have any impact on snapshot reclamation.

Any info would be greatly appreciated.

Thanks-
 
Deleting a snapshot typically uses very little CPU. Are these large snapshots? Or are you recursively deleting? Recursion might be the reason why it's using a lot of CPU. Or maybe it's the script that looks for snapshots to delete that's using the CPU?
 
Thanks for the quick reply.

It is not recursive. I do typically delete a few at a time using the snap1%snap2 range syntax, but that's about it. The "referenced" size is generally around 2.3T, and each snapshot is only around 160M (with the exception of the earliest, which is usually in the 30G range). Snapshots are taken every 2 hours.

I found a tunable (vfs.zfs.free_min_time_ms) that I doubled to see if it makes a difference.
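
When experimenting with a tunable like this, it can help to record the value before and after the change so the effect can be compared later. Below is a rough Python sketch that parses the "name: value" lines printed by sysctl. The function names are my own, and the sample value in the usage note is hypothetical, not taken from this server.

```python
import subprocess


def parse_sysctl(text):
    """Turn 'name: value' lines from sysctl output into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # ignore lines that do not look like 'name: value'
            values[name.strip()] = value.strip()
    return values


def read_sysctls(names):
    """Run sysctl for the given names (needs a host where they exist)."""
    result = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)
```

For example, parse_sysctl("vfs.zfs.free_min_time_ms: 2000") gives a dict with that single key mapped to the string "2000"; read_sysctls(["vfs.zfs.free_min_time_ms"]) would do the same against the live system.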
 
In that case you may want to try 10.4 instead. FreeBSD 10.4 was released after 11.1, so it's possible some or all of those improvements also made their way into 10.4. Upgrading a minor version is usually quite easy, and typically easier than upgrading a major version.
 
We are also hovering near the 80% threshold on the zpool and the fragmentation level is relatively high (63-70%), so I'm wondering if there are a few things contributing to the issue.
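
To keep an eye on both numbers over time, here is a minimal Python sketch that parses the output of `zpool list -H -o name,capacity,fragmentation` and flags pools above a limit. The function names and the 50% fragmentation limit are assumptions of mine for illustration; only the 80% capacity figure comes from the situation described above.

```python
import subprocess


def _percent(value):
    """Convert '63%' to 63; return None for '-' or anything non-numeric."""
    value = value.rstrip("%")
    return int(value) if value.isdigit() else None


def parse_zpool_list(text):
    """Parse `zpool list -H -o name,capacity,fragmentation` output."""
    pools = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, capacity, fragmentation = fields
        pools.append({
            "name": name,
            "capacity": _percent(capacity),
            "fragmentation": _percent(fragmentation),
        })
    return pools


def pools_needing_attention(pools, cap_limit=80, frag_limit=50):
    """Return names of pools at or above either limit (limits are assumed)."""
    flagged = []
    for pool in pools:
        cap, frag = pool["capacity"], pool["fragmentation"]
        if (cap is not None and cap >= cap_limit) or (
            frag is not None and frag >= frag_limit
        ):
            flagged.append(pool["name"])
    return flagged


def zpool_list_output():
    """Fetch live data; only works on a host with the zpool command."""
    cmd = ["zpool", "list", "-H", "-o", "name,capacity,fragmentation"]
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout
```

On a real system, pools_needing_attention(parse_zpool_list(zpool_list_output())) would list the pools worth a closer look. A fragmentation value of "-" (pool without that statistic) is treated as unknown rather than zero.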
 