ZFS, UNMAP and VMware ESXi troubles

Tomas Engin

New Member


Messages: 3

Hello.

I'm seeing quite a lot of "UNMAP failed" messages on our FreeBSD servers ever since we migrated from VMFS5 to VMFS6 datastores on our vSphere cluster. It happens on both thin and thick provisioned disks.

There have been instances where servers just stopped responding after a couple of failed unmap operations.

I'm not sure if we have a configuration issue somewhere or if we stumbled upon a bug.

Anyone here have some ideas?



The error messages look something like this:
Code:
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): CAM status: SCSI Status Error
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): SCSI status: Check Condition
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): Command byte 7 is invalid
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): Error 22, Unretryable error
Nov 10 08:26:38 b-test-1 ZFS[46103]: vdev state changed, pool_guid=$15317486287551845913 vdev_guid=$12174472687408032879

Looking at the relevant sysctls:
Code:
kern.cam.da.0.unmapped_io: 1
kern.cam.nda.max_trim: 256
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vol.unmap_sync_enabled: 0
vfs.zfs.vol.unmap_enabled: 1
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
vfs.ffs.dotrimcons: 1
vfs.unmapped_buf_allowed: 1
vfs.aio.num_unmapped_aio: 0
hw.storvsc.use_pim_unmapped: 1
kstat.zfs.misc.zio_trim.failed: 37
kstat.zfs.misc.zio_trim.unsupported: 150
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.bytes: 0
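The zio_trim kstat counters at the bottom already point at the problem: every TRIM the pool issued was either rejected or reported unsupported, and none succeeded. A small sketch (a hypothetical helper, plain awk) that summarizes those counters from `sysctl` output — a captured sample from the listing above is piped in here so the snippet is self-contained; on the affected host you would feed it `sysctl kstat.zfs.misc.zio_trim` instead:

```shell
# Summarize the zio_trim kstat counters and flag the failure pattern.
trim_summary() {
  awk -F': ' '
    /zio_trim.failed/      { failed = $2 }
    /zio_trim.unsupported/ { unsup  = $2 }
    /zio_trim.success/     { ok     = $2 }
    END {
      printf "failed=%d unsupported=%d success=%d\n", failed, unsup, ok
      if (ok == 0 && (failed > 0 || unsup > 0))
        print "TRIM/UNMAP is being rejected by the backing storage"
    }'
}

# Captured sample from the sysctl output above:
printf '%s\n' \
  'kstat.zfs.misc.zio_trim.failed: 37' \
  'kstat.zfs.misc.zio_trim.unsupported: 150' \
  'kstat.zfs.misc.zio_trim.success: 0' | trim_summary
# prints: failed=37 unsupported=150 success=0
```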

---

Hypervisor: VMware ESXi 6.7 (16713306)
Storage: Nimble 5.x

OS: FreeBSD 12.1-RELEASE (also tried 12.2-RELEASE)
Guest OS: FreeBSD 12 or later versions (64-bit)
Compatibility: ESXi 6.7 Update 2 and later (VM version 15)
VMware Tools: Running, version:2147483647 (Guest Managed)
SCSI controller: LSI Logic SAS
 

Tomas Engin

New Member


Messages: 3

Yes. The automatic reclaim of space is what triggers the errors. But why the errors? I'm wondering whether the problem lies with our storage, ESXi, ZFS, or maybe the mpt driver in FreeBSD.

Until I can resolve this I'm migrating the FreeBSD VMs back to VMFS5 datastores.
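If migrating back is disruptive, another interim option (an assumption on my part, not something from this thread) would be to stop ZFS from issuing TRIM/UNMAP at all while you debug the storage side, since the pool never hits the failing path then. On FreeBSD 12.x that is a boot-time tunable:

```
# /boot/loader.conf -- hypothetical interim workaround: disable ZFS TRIM
# so no UNMAPs are sent to the virtual disk. Takes effect on reboot.
vfs.zfs.trim.enabled=0
```

You lose space reclamation on the datastore while this is set, so it only makes sense as a stopgap.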
 

Emrion

Aspiring Daemon

Reaction score: 221
Messages: 681

To begin with, you should check whether UNMAP is actually supported on your system. The command line to test this is in the first link.
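A quick check you can run on the FreeBSD guest itself (a sketch, assuming the disk is da0 as in the logs above): the da driver reports which delete method is in use, so "UNMAP" means the disk advertised support, while "NONE" means the kernel has given up after failures like the ones quoted. Guarded so it is a no-op on systems without camcontrol:

```shell
# Inspect the delete method the CAM da driver chose for da0.
if command -v camcontrol >/dev/null 2>&1; then
  sysctl kern.cam.da.0.delete_method
  camcontrol inquiry da0   # vendor/product of the virtual disk
fi
```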
 

Ophiuchus

Member


Messages: 29

I'm experiencing the same issue. Did you find a solution or a viable workaround?
I was getting the same error on one of my FreeBSD VMs (VMware). I cloned the VM and stopped getting that error message.

Some extra information here:
* I get this "UNMAP failed, disabling BIO_DELETE" only once during startup, and then not again until I restart the VM
* There are other FreeBSD VMs on the same volume (VMware datastore) and I've never seen this message from them
* After cloning the VM, disk read/write operations were still very slow, so I guess this doesn't really qualify as a "workaround"
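Seeing it only once per boot is expected: the kernel disables BIO_DELETE after the first failure and stops retrying, so one match in the log is the most you will get per boot. A self-contained way to count occurrences (a captured sample stands in for live `dmesg` output here; on the VM you would pipe `dmesg` itself):

```shell
# Count occurrences of the UNMAP failure message in a log sample.
sample='(da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
(da0:mpt0:0:0:0): CAM status: SCSI Status Error'
printf '%s\n' "$sample" | grep -c 'UNMAP failed, disabling BIO_DELETE'
# prints: 1
# On the VM: dmesg | grep -c 'UNMAP failed, disabling BIO_DELETE'
```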
 