bhyve ZFS properties focusing on performance, for bhyve datasets?

kalleboy · Dec 21, 2023

Hi BSD fellows.

Does anyone have experience with setting ZFS properties for bhyve datasets?

Should I set atime=off and sync=disabled to improve performance?

Finally, is there any performance comparison (article or chart) between UFS and ZFS for bhyve usage? Which one is faster?

Regards.

blazingice · Dec 21, 2023

This link makes reference to changing recordsize to 64K. That’s all I do, but would love to know if others do more.
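
For anyone who wants to try it, a minimal sketch of that change. The dataset name `zroot/vm` is just a placeholder, and the `run` helper is my own addition so the script only prints the commands unless you set `APPLY=1`. Keep in mind that recordsize only applies to blocks written after the change, so existing image files have to be copied or rewritten to pick it up.

```sh
#!/bin/sh
# Placeholder dataset that holds the VM image files; adjust to your layout.
DATASET="${DATASET:-zroot/vm}"

# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

set_recordsize() {
    # recordsize is a filesystem property; zvols use volblocksize instead.
    run zfs set recordsize=64K "$1"
    # Show the effective value and whether it is local or inherited.
    run zfs get -o name,property,value,source recordsize "$1"
}

set_recordsize "$DATASET"
```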

sko · Dec 21, 2023

atime should always be off unless it is *really* needed for a dataset, because it causes a lot of metadata to be changed and rewritten constantly.

Regarding bhyve performance: setting the storage type to "nvme" usually gives by far the biggest performance boost for a VM.
I also remember that image files residing on ZFS datasets are considered to be faster than zvols, but the margins there (and with dataset properties) are a lot smaller than the difference between the virtio and nvme storage types. So if the guest supports nvme, use that in bhyve.
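
A rough sketch of what that looks like on a plain bhyve command line, where the disk emulation is the second field of the `-s` slot argument. The slot number and the zvol path are placeholders, and `disk_slot` is only a hypothetical helper to show both variants side by side. If you use vm-bhyve, the equivalent should be `disk0_type="nvme"` in the guest config, but check that against your version.

```sh
#!/bin/sh
# Placeholder backing store; an image file path works the same way.
DISK="${DISK:-/dev/zvol/zroot/vm/guest0}"

# Build the bhyve "-s slot,emulation,path" argument for a disk.
# $1 = PCI slot, $2 = emulation type, $3 = backing file or device
disk_slot() {
    printf '%s %s,%s,%s\n' -s "$1" "$2" "$3"
}

disk_slot 4 virtio-blk "$DISK"   # paravirtualized block device
disk_slot 4 nvme "$DISK"         # emulated NVMe controller
```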

I thought there was an article about the virtio/nvme comparison (and also image files vs. zvols) by Klara Systems, but I can't find it right now...

Edit: bingo! found it: https://klarasystems.com/articles/virtualization-showdown-freebsd-bhyve-linux-kvm/ (no "nvme" or "zfs" in the title, hence I didn't recognize it in the overview...)

kalleboy · Dec 21, 2023

Thanks sko, yes, I have been reading about nvme on Klara.

However, that article doesn't mention optimizing ZFS properties for bhyve-specific usage at all. That's why I wanted to ask about it here.

A question: when should atime really be on? Which services need it? Mail servers?

Regards.

sko · Dec 21, 2023

One thing I remember from testing/using larger volblocksize values: back when a contractor wanted to install a Windows 2019 + MSSQL 2019 server, MSSQL kept crashing and causing bluescreens with various random hints at filesystem-related problems. It turned out that junk somehow can't handle anything bigger than a 4K blocksize (Windows itself should be/is able to handle it). This cost us almost 2 days plus an unplanned night shift... I have absolutely NO idea why that thing should care about the blocksize, as this is PURELY a matter for the filesystem, but it did. With some digging we found several reports of exactly this issue, some of them already 2+ years old (of course with zero usable replies from MS support).
So if you are talking about running MS stuff in those VMs, be careful with large (vol)blocksizes!
That server ended up running with 8K volblocksize for the OS zvol and 4K for the database zvol, because MSSQL couldn't handle anything bigger (and IIRC Windows didn't want to install on anything larger than 8K).
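
Roughly how those two zvols would be created, as a sketch. The names and sizes are made up for illustration, and `run` is just a dry-run helper that prints the commands unless `APPLY=1` is set. Note that volblocksize can only be chosen when the zvol is created, so getting it wrong means recreating the volume and copying the data over.

```sh
#!/bin/sh
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# $1 = zvol name, $2 = size, $3 = volblocksize (fixed at creation time)
create_zvol() {
    run zfs create -V "$2" -o volblocksize="$3" "$1"
}

# Hypothetical names and sizes; only the block sizes come from the post.
create_zvol zroot/vm/win2019-os 80G 8K
create_zvol zroot/vm/win2019-db 200G 4K
```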

This might not be an issue when using image files on a dataset with large blocksizes, but I chose to use zvols because of the easier (i.e. "direct") handling of snapshots and backups of VM images. Plus, everything that really needs performance is running on nvme-based pools anyway, so the few percent of performance lost compared to image files are negligible for us.
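
To illustrate what "direct" means here, a minimal sketch of snapshotting one zvol and sending it to a backup dataset. All names are placeholders, and the function only prints the commands unless `APPLY=1` is set. This is a full send to a target that does not exist yet; repeated backups would normally use an incremental send (`zfs send -i`) instead. A snapshot taken while the guest is running is only crash-consistent from the guest's point of view.

```sh
#!/bin/sh
# Snapshot a zvol and replicate it to a backup dataset.
# $1 = zvol, $2 = snapshot name, $3 = receive target (all placeholders)
backup_zvol() {
    snap="$1@$2"
    if [ "${APPLY:-0}" = 1 ]; then
        zfs snapshot "$snap" && zfs send "$snap" | zfs receive "$3"
    else
        # Dry run: only print what would be executed.
        printf '+ zfs snapshot %s\n' "$snap"
        printf '+ zfs send %s | zfs receive %s\n' "$snap" "$3"
    fi
}

backup_zvol zroot/vm/guest0 "$(date +%Y%m%d)" backup/vm/guest0
```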

kalleboy said:
A question; when should atime really be on? What service needs it? Mail servers?

atime (= access time) triggers an update to a file's metadata every time the file is accessed. So no, on a mail server (especially a busy one) you don't want that, nor on a database server, where every access to the DB would also trigger an atime update.

I always set atime=off on the root dataset of every newly created pool. And TBH, off the top of my head I can't think of any kind of dataset where atime=on would serve a valuable purpose, except maybe some auditing stuff where you need to know when a file was read...
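
As a sketch, this is what that looks like. The pool name `tank` is a placeholder and `run` is a dry-run helper that prints the commands unless `APPLY=1` is set. Child datasets inherit the value from the root dataset unless they carry their own local setting, which the second command lists.

```sh
#!/bin/sh
POOL="${POOL:-tank}"   # placeholder pool name

# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

disable_atime() {
    # Set on the pool's root dataset; children inherit it unless overridden.
    run zfs set atime=off "$1"
    # List filesystems with a local atime setting (the root plus overrides).
    run zfs get -r -t filesystem -s local atime "$1"
}

disable_atime "$POOL"
```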

covacat · Dec 21, 2023

As a compromise, you may want to check the relatime property.
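
A minimal sketch, with a placeholder dataset name and a dry-run `run` helper that prints the commands unless `APPLY=1` is set. relatime only has an effect while atime=on; with both set, the access time is updated only if the old one is earlier than the last modify/change time or more than about 24 hours old, which cuts most of the write traffic.

```sh
#!/bin/sh
DATASET="${DATASET:-zroot/mail}"   # placeholder dataset that needs atime

# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

enable_relatime() {
    # relatime is ignored while atime=off, so both have to be on.
    run zfs set atime=on "$1"
    run zfs set relatime=on "$1"
    run zfs get atime,relatime "$1"
}

enable_relatime "$DATASET"
```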
