Solved: Reclaim unused space on a sparse virtual machine

Hello,
I have this exact issue,
https://blogs.oracle.com/solaris/zf...tted?cid=57c69e34-f90a-4064-bb6b-30a9375352a4

I have virtual machines (FTP, web, and MySQL servers) on KVM. Some time ago they were filled with files and sites that are no longer needed, so I cleaned about 200-300 GB out of a 500 GB total, which should leave a virtual machine of roughly 80 GB. I've looked for some info, and the images still seem to be filled with the deleted blocks.

I already reclaimed space on my other VMs, based on CentOS 7 and 6, with fstrim: http://xfs.org/index.php/FITRIM/discard
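(For reference, the reclaim on the Linux guests was roughly along these lines; the mount point is just an example:)

fstrim -v /     # discard unused blocks on the root filesystem
fstrim -av      # or trim every mounted filesystem that supports it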

Does ZFS/FreeBSD provide a tool for this?
My current setup is a CentOS 7 hypervisor running FreeBSD 11 and CentOS Linux VMs.

Any light on this will be greatly appreciated,

Martin
 
As noted, if the guest supports UNMAP or TRIM, this should be doable. I am not familiar enough with KVM's storage to know if there is a filesystem like VMFS in the middle. If there is, that needs to support UNMAP/TRIM as well.
 
Sorry guys, I'm still without success.

From what I've found, it seems like ZFS does not release/unallocate freed blocks inside the VM back to the hypervisor (in this case KVM). This normally works with the virtio-scsi driver, which provides the UNMAP feature. This is not related to SSD disks; what I am trying to do is keep a KVM virtual machine image sparse.

What happens on Linux (CentOS 7) is that you install fstrim and free the blocks that are no longer in use; you need to declare the discard/unmap option in fstab and lvm.conf for it to work. The freed space is returned to the hypervisor, so the VM image "shrinks" and grows as data is removed or added inside the VM.
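Roughly, the relevant bits on the Linux side look like this (device names are just examples):

# /etc/fstab -- online discard on the filesystem (ext4/xfs)
/dev/mapper/centos-root  /  xfs  defaults,discard  0 0

# /etc/lvm/lvm.conf -- let LVM pass discards down when LVs are removed or shrunk
devices {
    issue_discards = 1
}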

Googling around, I found that I asked about the same thing some time ago:
https://forums.freebsd.org/threads/...qcow2-thin-provisiones-virtual-machine.66833/

This issue/feature also comes up here, without a resolution:
https://forum.proxmox.com/threads/zfs-trim-and-over-provisioning-support.32854/
https://forum.netgate.com/topic/112410/ssd-zfs-enable-trim

As you can see, I'm a noob on ZFS and FreeBSD, so, in order to shrink my VM, I will try replacing the VM's zroot disk with a new one, expecting it to resilver only the data and not the unused blocks.

Any help will be greatly appreciated,
 
As noted, if the guest supports UNMAP or TRIM, this should be doable. I am not familiar enough with KVM's storage to know if there is a filesystem like VMFS in the middle. If there is, that needs to support UNMAP/TRIM as well.

KVM either provides you with an emulated storage device (AHCI, SCSI, IDE) that you treat the same as physical hardware, or it provides a virtio-based storage device (virtio-blk, virtio-scsi) that, again, you treat the same as physical hardware. Both support TRIM/UNMAP.

As long as the filesystem being used in the VM supports TRIM/UNMAP, and the backing store of the VM supports TRIM/UNMAP, everything should just work automatically. You may need to fiddle with sysctls in the VM and on the host to make sure TRIM/UNMAP is enabled.
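For example, on a FreeBSD guest something along these lines should show whether TRIM/UNMAP is actually in play (da0 is just an assumption for a virtio-scsi disk; adjust the unit to your setup):

sysctl vfs.zfs.trim.enabled           # ZFS TRIM on/off
sysctl kern.cam.da.0.delete_method    # should report UNMAP, not NONE, for the virtual disk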
 
Thanks for the answer, phoenix.
The emulated storage is virtio-scsi.

Sorry for the uppercase,
### CHECKING IF TRIM IS ENABLED ON FILESYSTEM (ZFS)
root@WEB02:~ # sysctl vfs.zfs.trim
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1

### USAGE OF DISK
root@WEB02:~ # df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default    952G     72G    880G     8%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/tmp             881G    261M    880G     0%    /tmp
zroot/usr/home        880G     88K    880G     0%    /usr/home
zroot/usr/ports       881G    666M    880G     0%    /usr/ports
zroot/usr/src         880G     88K    880G     0%    /usr/src
zroot/var/audit       880G     88K    880G     0%    /var/audit
zroot/var/crash       880G     88K    880G     0%    /var/crash
zroot/var/log         888G    8.0G    880G     1%    /var/log
zroot/var/mail        880G    1.9M    880G     0%    /var/mail
zroot/var/tmp         880G     92K    880G     0%    /var/tmp
zroot                 880G     88K    880G     0%    /zroot

### SIZE OF THE IMAGE
[root@MARTE02 images]# ls -hall
-rw-r--r-- 1 qemu qemu 462G Oct 29 17:45 SPARSE-WEB02.qcow2
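(Note: ls reports the apparent file size; to see what the image actually occupies on the host, something like the following can be compared against it:)

du -h SPARSE-WEB02.qcow2          # blocks actually allocated on the host filesystem
qemu-img info SPARSE-WEB02.qcow2  # "virtual size" vs. "disk size"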

### RELEVANT CONFIGURATION FOR VM
<disk type='file' device='disk'>
  <driver name='qemu' discard='unmap' type='qcow2'/>
  <source file='/var/lib/libvirt/images/SPARSE-WEB03.qcow2'/>
  <target dev='sda' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

Thanks again for the help,
 
OK, so it looks like, from FreeBSD's point of view, it is issuing the TRIMs and getting confirmation (no errors) back from the VM host.

One experiment you can do is to fill up a file (use /dev/random as a source so it doesn't get compressed away) to some easy-to-see size, say a few GB, and check that the utilization outside the VM before and after creating the large file is what you expect. Then delete the file and wait. ZFS doesn't issue all the trims immediately, but it should before too long if the FS isn't busy. You can watch for the zio_trim.bytes number to go up by the amount you deleted. (Make sure it wasn't retained by a snapshot.) If everything hangs together inside the VM (you see the trimmed bytes go up by roughly the size of your freshly deleted test file), then check on the VM host to see if the backing store has been appropriately deflated.

If this all seems to work (you see the backing store grow and then shrink), then you’re left with why it is out of “sync” now. I see your KVM config has discard enabled on the backing store. Has it always been on? If not, previous discards will never be reissued; but you can fill up the drive (similar to the above experiment; don’t go too crazy, 100% is a really bad place to be) and then erase the file; this will issue fresh discards, and hopefully shrink you back down to size.
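Roughly, something like this (sizes and paths are examples; if I recall correctly, the counter shows up under kstat.zfs.misc.zio_trim):

# inside the VM: write a few GB of incompressible data, note the counter, delete, wait
dd if=/dev/random of=/root/testfile bs=1M count=4096
sysctl kstat.zfs.misc.zio_trim.bytes
rm /root/testfile
sysctl kstat.zfs.misc.zio_trim.bytes   # after a while, should be up by roughly 4 GB
# on the VM host: compare the image's real allocation (du -h / qemu-img info) before and after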
 
Hello Eric, thanks for your answer. It seems like ZFS is not returning the freed space to the KVM host.

One experiment you can do is to fill up a file (use /dev/random as a source so it doesn't get compressed away) to some easy-to-see size, say a few GB, and check that the utilization outside the VM before and after creating the large file is what you expect. Then delete the file and wait. ZFS doesn't issue all the trims immediately, but it should before too long if the FS isn't busy. You can watch for the zio_trim.bytes number to go up by the amount you deleted. (Make sure it wasn't retained by a snapshot.) If everything hangs together inside the VM (you see the trimmed bytes go up by roughly the size of your freshly deleted test file), then check on the VM host to see if the backing store has been appropriately deflated.

If this all seems to work (you see the backing store grow and then shrink), then you’re left with why it is out of “sync” now. I see your KVM config has discard enabled on the backing store. Has it always been on? If not, previous discards will never be reissued; but you can fill up the drive (similar to the above experiment; don’t go too crazy, 100% is a really bad place to be) and then erase the file; this will issue fresh discards, and hopefully shrink you back down to size.

I will do the experiment. However, as I'm slowly moving away from some old Linux boxes into BSD, this lack of sparseness impacts my KVM host's storage capacity. I presume there should be a way, but it is not clear to me. At the moment I'm recompressing my FreeBSD VMs with the qemu-img convert -c -p -O qcow2 fat.qcow2 slim.qcow2 command. A VM image of 270 GB turns into 80 GB, which is good for me, at the expense of some CPU; I'm not sure how much it will impact overall host performance. This way I reclaim some space on the host side, but I still feel it should be managed by the VM's filesystem.
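The full round trip looks roughly like this (the domain and file names are just examples; the image must not be in use while converting):

virsh shutdown WEB02
qemu-img convert -c -p -O qcow2 SPARSE-WEB02.qcow2 SLIM-WEB02.qcow2
virsh edit WEB02     # point <source file=...> at the new image
virsh start WEB02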

My goal is to move to BSD in the near future; however, I still don't know how to manage things properly using jails or bhyve the way I currently manage my VMs with KVM.
 
Hello Eric, thanks for your answer. It seems like ZFS is not returning the freed space to the KVM host.

That's not what I said; it looks like (inside the VM) FreeBSD+ZFS is saying "Yep, I see that I can discard; and I've issued lots of discards successfully."

My guess is that you did not always have the discard flag set in KVM. On ext4, you can (after some period of usage) enable discard in KVM, reboot, use fstrim, and get your space back. Here fstrim is effectively issuing trim commands for all the not-currently-used space on the disk. (You can also set the discard option on the mountpoint, as I recall, or periodically use fstrim, to do either as-needed (zfs style) or bulk trims, respectively.)

There is no similar after-the-fact-trim-everything-unused tool for ZFS, but ZFS does support trim, and the sysctl output above shows that (as far as FreeBSD is concerned) trims are currently being issued successfully. As new files are created (and deleted) the deletions will issue fresh trims. Create/delete cycles prior to setting the discard flag in KVM will not be retrospectively reaped; that's why I suggested (assuming I'm right, and the experiment pans out) creating a dummy file filling up 90-95% of your drive and then deleting it (while making sure it isn't retained in a snapshot.)
 
Hello Eric,
Thanks for the follow-up.
I want to share some of my experiments with you guys, as suggested.
I filled the disk with this: dd bs=10M count=1000G if=/dev/zero of=archivaso

1.- I needed to disable dedup and compression on ZFS, since with them enabled the file showed its size but was not actually filling the disk.
2.- When I did the dd, it filled the disk, as shown below (I created the file on the /root partition, so it shows 100%):

# df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default    701G    701G      0B   100%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/tmp             208K    208K      0B   100%    /tmp
zroot/usr/home        370G    370G      0B   100%    /usr/home
zroot/usr/ports       666M    666M      0B   100%    /usr/ports
zroot/usr/src          88K     88K      0B   100%    /usr/src
zroot/var/audit        88K     88K      0B   100%    /var/audit
zroot/var/crash        88K     88K      0B   100%    /var/crash
zroot/var/log          17M     17M      0B   100%    /var/log
zroot/var/mail        384K    384K      0B   100%    /var/mail
zroot/var/tmp          88K     88K      0B   100%    /var/tmp
zroot                  88K     88K      0B   100%    /zroot

3.- I deleted the file: rm -rf archivaso
4.- I waited some minutes, but it seems nothing happened; the VM image is now 1 TB.
5.- After deleting the file archivaso, this is the new capacity:
root@NETTIX:~ # df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default    701G    1.2G    700G     0%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/tmp             700G    208K    700G     0%    /tmp
zroot/usr/home        1.0T    370G    700G    35%    /usr/home
zroot/usr/ports       701G    666M    700G     0%    /usr/ports
zroot/usr/src         700G     88K    700G     0%    /usr/src
zroot/var/audit       700G     88K    700G     0%    /var/audit
zroot/var/crash       700G     88K    700G     0%    /var/crash
zroot/var/log         700G     17M    700G     0%    /var/log
zroot/var/mail        700G    384K    700G     0%    /var/mail
zroot/var/tmp         700G     88K    700G     0%    /var/tmp
zroot                 700G     88K    700G     0%    /zroot

What I will do next is recompress the image, to see if I can recover space at the qemu level. Thanks for your comments.
 
UPDATE: For the record, I had success trimming the fat of my VM.
My VMs were not originally set up with the virtio-scsi driver, which is the one that supports UNMAP. So, since they were running without it, they kept using the blocks but never returned them to the hypervisor (in this case, KVM).

So, in order to reduce the size of a ZFS VM on KVM, I followed these steps (a rough consolidated command sketch follows the list):
1 I changed the disk driver to virtio-scsi.
2 In the VM's XML config file, I added discard='unmap' to the driver section: <driver name='qemu' discard='unmap' type='qcow2'/>
3 If enabled, disable compression and dedup: zfs set compression=off and zfs set dedup=off
4 Fill the disk with data: dd bs=10M if=/dev/zero of=archivaso
5 Remove the big file: rm archivaso
6 Stop the VM.
7 Sparsify the VM image (again): virt-sparsify FAT_VM.qcow2 SLIM_VM.qcow2
8 Change the image in your KVM config from FAT_VM to SLIM_VM.
9 Make sure you repeat step 2.
10 Start the VM.
11 If needed, re-enable compression and dedup: zfs set compression=lz4 and zfs set dedup=on
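Putting the commands together, a rough consolidated sketch (zroot as the dataset and the FAT_VM/SLIM_VM names are just placeholders):

# host: make sure the disk is on virtio-scsi with discard enabled in the domain XML
#   <driver name='qemu' discard='unmap' type='qcow2'/>
# guest:
zfs set compression=off zroot
zfs set dedup=off zroot
dd bs=10M if=/dev/zero of=archivaso    # runs until the pool is full, then errors out
rm archivaso
# host, with the VM stopped:
virt-sparsify FAT_VM.qcow2 SLIM_VM.qcow2
# swap the image in the domain XML, start the VM, then in the guest (only if they were on before):
zfs set compression=lz4 zroot
zfs set dedup=on zroot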

Hope this helps anyone looking to "shrink" their ZFS VMs.
 
I had a similar issue with FreeBSD running under VMware: the same problem as the OP, even with the pvscsi back-end.
The solution is partially what you mentioned above, steps 3), 4), and 5). But instead of powering off the VM, a command from vmware-tools can be used:

vmware-toolbox-cmd disk shrinkonly

This is not an online operation per se, as the VM is frozen while the command runs. Depending on your setup, this may or may not be a problem.
Then compression/dedup can be enabled again.
 