ZFS, UNMAP and VMware ESXi troubles

Tomas Engin

New Member


Messages: 3

Hello.

I'm seeing quite a lot of "UNMAP failed" messages on our FreeBSD servers ever since we migrated from VMFS5 to VMFS6 datastores on our vSphere cluster. It happens on both thin and thick provisioned disks.

There have been instances where servers just stopped responding after a couple of failed unmap operations.

I'm not sure if we have a configuration issue somewhere or if we stumbled upon a bug.

Anyone here have some ideas?



The error messages look something like this:
Code:
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): CAM status: SCSI Status Error
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): SCSI status: Check Condition
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): Command byte 7 is invalid
Nov 10 08:26:37 b-test-1 kernel: (da0:mpt0:0:0:0): Error 22, Unretryable error
Nov 10 08:26:38 b-test-1 ZFS[46103]: vdev state changed, pool_guid=$15317486287551845913 vdev_guid=$12174472687408032879

Looking at the relevant sysctls:
Code:
kern.cam.da.0.unmapped_io: 1
kern.cam.nda.max_trim: 256
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vol.unmap_sync_enabled: 0
vfs.zfs.vol.unmap_enabled: 1
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
vfs.ffs.dotrimcons: 1
vfs.unmapped_buf_allowed: 1
vfs.aio.num_unmapped_aio: 0
hw.storvsc.use_pim_unmapped: 1
kstat.zfs.misc.zio_trim.failed: 37
kstat.zfs.misc.zio_trim.unsupported: 150
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.bytes: 0
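The zio_trim kstat counters at the bottom already point at the problem: every TRIM the pool issued was either rejected or reported unsupported, and none succeeded. A small sketch (a hypothetical helper, plain awk) that summarizes those counters from `sysctl` output — a captured sample from the listing above is piped in here so the snippet is self-contained; on the affected host you would feed it `sysctl kstat.zfs.misc.zio_trim` instead:

```shell
# Summarize the zio_trim kstat counters and flag the failure pattern.
trim_summary() {
  awk -F': ' '
    /zio_trim.failed/      { failed = $2 }
    /zio_trim.unsupported/ { unsup  = $2 }
    /zio_trim.success/     { ok     = $2 }
    END {
      printf "failed=%d unsupported=%d success=%d\n", failed, unsup, ok
      if (ok == 0 && (failed > 0 || unsup > 0))
        print "TRIM/UNMAP is being rejected by the backing storage"
    }'
}

# Captured sample from the sysctl output above:
printf '%s\n' \
  'kstat.zfs.misc.zio_trim.failed: 37' \
  'kstat.zfs.misc.zio_trim.unsupported: 150' \
  'kstat.zfs.misc.zio_trim.success: 0' | trim_summary
# prints: failed=37 unsupported=150 success=0
```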

---

Hypervisor: VMware ESXi 6.7 (16713306)
Storage: Nimble 5.x

OS: FreeBSD 12.1-RELEASE (also tried 12.2-RELEASE)
Guest OS: FreeBSD 12 or later versions (64-bit)
Compatibility: ESXi 6.7 Update 2 and later (VM version 15)
VMware Tools: Running, version:2147483647 (Guest Managed)
SCSI controller: LSI Logic SAS
 

Tomas Engin

New Member


Messages: 3

Yes. The automatic reclaim of space is what triggers the errors. But why the errors? I'm wondering whether the problem lies with our storage, ESXi, ZFS, or maybe the mpt driver in FreeBSD.

Until I can resolve this I'm migrating the FreeBSD VMs back to VMFS5 datastores.
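If migrating back is disruptive, another interim option (an assumption on my part, not something from this thread) would be to stop ZFS from issuing TRIM/UNMAP at all while you debug the storage side, since the pool never hits the failing path then. On FreeBSD 12.x that is a boot-time tunable:

```
# /boot/loader.conf -- hypothetical interim workaround: disable ZFS TRIM
# so no UNMAPs are sent to the virtual disk. Takes effect on reboot.
vfs.zfs.trim.enabled=0
```

You lose space reclamation on the datastore while this is set, so it only makes sense as a stopgap.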
 

Emrion

Aspiring Daemon

Reaction score: 221
Messages: 681

To begin with, you should check whether UNMAP is actually supported on your system. The command line to test this is in the first link.
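A quick check you can run on the FreeBSD guest itself (a sketch, assuming the disk is da0 as in the logs above): the da driver reports which delete method is in use, so "UNMAP" means the disk advertised support, while "NONE" means the kernel has given up after failures like the ones quoted. Guarded so it is a no-op on systems without camcontrol:

```shell
# Inspect the delete method the CAM da driver chose for da0.
if command -v camcontrol >/dev/null 2>&1; then
  sysctl kern.cam.da.0.delete_method
  camcontrol inquiry da0   # vendor/product of the virtual disk
fi
```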
 

Ophiuchus

Member


Messages: 29

I'm experiencing the same issue. Did you find a solution or a viable workaround?
I was getting the same error on one of my FreeBSD VMs (VMware). I cloned the VM and stopped getting that error message.

Some extra information here:
* I get this "UNMAP failed, disabling BIO_DELETE" only once during startup, and then not again until I restart the VM
* There are other FreeBSD VMs on the same volume (VMware datastore) and I've never seen this message from them
* After cloning the VM, disk read/write operations were still very slow, so I guess this doesn't really qualify as a "workaround"
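Seeing it only once per boot is expected: the kernel disables BIO_DELETE after the first failure and stops retrying, so one match in the log is the most you will get per boot. A self-contained way to count occurrences (a captured sample stands in for live `dmesg` output here; on the VM you would pipe `dmesg` itself):

```shell
# Count occurrences of the UNMAP failure message in a log sample.
sample='(da0:mpt0:0:0:0): UNMAP failed, disabling BIO_DELETE
(da0:mpt0:0:0:0): CAM status: SCSI Status Error'
printf '%s\n' "$sample" | grep -c 'UNMAP failed, disabling BIO_DELETE'
# prints: 1
# On the VM: dmesg | grep -c 'UNMAP failed, disabling BIO_DELETE'
```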
 