bhyve Windows Server slow IO

stratacast1

Well-Known Member

Reaction score: 24
Messages: 281

I have Windows Server 2016 installed in a bhyve VM on FreeBSD 12.0 and overall it runs well, except that I/O operations are slow. For example, simply extracting a 12KB zip archive can take close to 10 seconds. I ran iostat against its device (/dev/vmm/winserver2016); here's the output I got during the extract:

Code:
       tty            cpu
 tin  tout us ni sy in id
   0    27  0  0 31  0 69
   0    27  0  0 24  0 76
   0    27  0  0 33  0 67
   0    27  0  0 33  0 67
   0    27  0  0 44  0 56
   0    27  0  0 51  0 49
   0    27  0  0 50  0 50
The VM has 2 cores and 3GB of RAM on an i5-4570 (the host has 8GB total). Everything runs on an SSD, and the VM was installed in a ZFS dataset with vm-bhyve. Any insights would be great. It's not used for production, at least.
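One likely reason the output above shows only the tty and cpu columns is that /dev/vmm nodes aren't disk devices, so iostat has nothing to report for them. To see the actual disk activity during the extract, you'd watch the host's physical disk or the pool instead; a sketch (the device and pool names are examples, adjust for your system):

```shell
# Extended per-device stats for the host SSD, refreshed every second
iostat -x -w 1 ada0

# Or watch the ZFS pool that holds the VM dataset
zpool iostat -v zroot 1
```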
 

Zirias

Aspiring Daemon

Reaction score: 226
Messages: 609

For me, using virtio-blk instead of ahci-hd for the virtual disk made a huge difference. But there's a gotcha: if you don't run -CURRENT, you have to apply a patch; otherwise bhyve crashes quickly when a Windows guest uses a virtio-blk disk with the Red Hat virtio Windows drivers:

Code:
--- head/usr.sbin/bhyve/virtio.c    2019/05/18 17:30:03    347959
+++ head/usr.sbin/bhyve/virtio.c    2019/05/18 19:32:38    347960
@@ -3,6 +3,7 @@
  *
  * Copyright (c) 2013  Chris Torek <torek @ torek net>
  * All rights reserved.
+ * Copyright (c) 2019 Joyent, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -32,6 +33,8 @@
 #include <sys/param.h>
 #include <sys/uio.h>
 
+#include <machine/atomic.h>
+
 #include <stdio.h>
 #include <stdint.h>
 #include <pthread.h>
@@ -422,6 +425,12 @@
     vue = &vuh->vu_ring[uidx++ & mask];
     vue->vu_idx = idx;
     vue->vu_tlen = iolen;
+
+    /*
+     * Ensure the used descriptor is visible before updating the index.
+     * This is necessary on ISAs with memory ordering less strict than x86.
+     */
+    atomic_thread_fence_rel();
     vuh->vu_idx = uidx;
 }
 
@@ -459,6 +468,13 @@
     vs = vq->vq_vs;
     old_idx = vq->vq_save_used;
     vq->vq_save_used = new_idx = vq->vq_used->vu_idx;
+
+    /*
+     * Use full memory barrier between vu_idx store from preceding
+     * vq_relchain() call and the loads from VQ_USED_EVENT_IDX() or
+     * va_flags below.
+     */
+    atomic_thread_fence_seq_cst();
     if (used_all_avail &&
         (vs->vs_negotiated_caps & VIRTIO_F_NOTIFY_ON_EMPTY))
         intr = 1;
--- head/usr.sbin/bhyve/block_if.c    2019/05/02 19:59:37    347032
+++ head/usr.sbin/bhyve/block_if.c    2019/05/02 22:46:37    347033
@@ -65,7 +65,7 @@
 #define BLOCKIF_SIG    0xb109b109
 
 #define BLOCKIF_NUMTHR    8
-#define BLOCKIF_MAXREQ    (64 + BLOCKIF_NUMTHR)
+#define BLOCKIF_MAXREQ    (BLOCKIF_RING_MAX + BLOCKIF_NUMTHR)
 
 enum blockop {
     BOP_READ,
--- head/usr.sbin/bhyve/block_if.h    2019/05/02 19:59:37    347032
+++ head/usr.sbin/bhyve/block_if.h    2019/05/02 22:46:37    347033
@@ -41,7 +41,13 @@
 #include <sys/uio.h>
 #include <sys/unistd.h>
 
-#define BLOCKIF_IOV_MAX        33    /* not practical to be IOV_MAX */
+/*
+ * BLOCKIF_IOV_MAX is the maximum number of scatter/gather entries in
+ * a single request.  BLOCKIF_RING_MAX is the maxmimum number of
+ * pending requests that can be queued.
+ */
+#define    BLOCKIF_IOV_MAX        128    /* not practical to be IOV_MAX */
+#define    BLOCKIF_RING_MAX    128
 
 struct blockif_req {
     int        br_iovcnt;
--- head/usr.sbin/bhyve/pci_virtio_block.c    2019/05/02 19:59:37    347032
+++ head/usr.sbin/bhyve/pci_virtio_block.c    2019/05/02 22:46:37    347033
@@ -3,6 +3,7 @@
  *
  * Copyright (c) 2011 NetApp, Inc.
  * All rights reserved.
+ * Copyright (c) 2019 Joyent, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -55,7 +56,9 @@
 #include "virtio.h"
 #include "block_if.h"
 
-#define VTBLK_RINGSZ    64
+#define VTBLK_RINGSZ    128
+
+_Static_assert(VTBLK_RINGSZ <= BLOCKIF_RING_MAX, "Each ring entry must be able to queue a request");
 
 #define VTBLK_S_OK    0
 #define VTBLK_S_IOERR    1
@@ -351,7 +354,15 @@
     /* setup virtio block config space */
     sc->vbsc_cfg.vbc_capacity = size / DEV_BSIZE; /* 512-byte units */
     sc->vbsc_cfg.vbc_size_max = 0;    /* not negotiated */
-    sc->vbsc_cfg.vbc_seg_max = BLOCKIF_IOV_MAX;
+
+    /*
+     * If Linux is presented with a seg_max greater than the virtio queue
+     * size, it can stumble into situations where it violates its own
+     * invariants and panics.  For safety, we keep seg_max clamped, paying
+     * heed to the two extra descriptors needed for the header and status
+     * of a request.
+     */
+    sc->vbsc_cfg.vbc_seg_max = MIN(VTBLK_RINGSZ - 2, BLOCKIF_IOV_MAX);
     sc->vbsc_cfg.vbc_geometry.cylinders = 0;    /* no geometry */
     sc->vbsc_cfg.vbc_geometry.heads = 0;
     sc->vbsc_cfg.vbc_geometry.sectors = 0;
I'm using this patch with 12.0-RELEASE and it's working fine :)
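With vm-bhyve, switching the virtual disk type is a one-line change in the guest's config file. A sketch (the path, disk name, and VM name are assumptions, not from the thread):

```shell
# /zroot/vm/winserver2016/winserver2016.conf (example path)
# Before: disk0_type="ahci-hd"
disk0_type="virtio-blk"
disk0_name="disk0.img"
```

Note that the Windows guest needs the Red Hat virtio storage drivers (virtio-win) installed before it can boot from a virtio-blk disk.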
 
stratacast1 (OP)

Darn, compiling base with a patch, hehe. It's not a big deal for me to do (probably worth it for myself), but I was hoping I could tell my friends bhyve was ready for virtualizing Windows.

I'll have to give this patch a spin soon, hopefully it'll get ported back to 12.0
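For reference, applying a patch like this to a release source tree and rebuilding only bhyve looks roughly like the following (the source tree location and the patch file name are assumptions):

```shell
# Assumes the 12.0-RELEASE sources are in /usr/src
cd /usr/src
patch -p1 < /tmp/bhyve-virtio-blk.patch  # hypothetical file name; -p1 strips the "head/" prefix
cd usr.sbin/bhyve
make && make install                      # rebuild and install just the bhyve binary
```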
 
stratacast1 (OP)

It is. It has worked perfectly here for 3 years: Windows 7 / 10 / 2019 on ZFS. I never noticed any IO issue.
Wish I could say that. Networking is great, the install was great, but a 1MB zip extraction takes a minute. That's not working perfectly.
 

Zirias

I'll have to give this patch a spin soon, hopefully it'll get ported back to 12.0
That's unlikely, as it fixes neither a security hole nor a "bug" (bhyve is documented as incompatible with Windows guests using virtio-blk).
But since the patch is pretty small and doesn't change any other behavior (as far as I can tell), I could imagine it being included in an upcoming 12.1.
Wish I could say that. Networking is great, the install was great, but a 1MB zip extraction takes a minute. That's not working perfectly.
I didn't do any exact measurements, but for me a Windows guest using ahci-hd was usable, it just "felt" a bit slow. Switching to virtio-blk sped things up.
 
stratacast1 (OP)

That's unlikely, as it fixes neither a security hole nor a "bug" (bhyve is documented as incompatible with Windows guests using virtio-blk).
But since the patch is pretty small and doesn't change any other behavior (as far as I can tell), I could imagine it being included in an upcoming 12.1.

I didn't do any exact measurements, but for me a Windows guest using ahci-hd was usable, it just "felt" a bit slow. Switching to virtio-blk sped things up.
Even if it only showed up in 12.1, that would be neat. I can't say my Windows VM "feels" slow; it IS slow. A 1MB extract of a .zip takes a minute. That isn't "feeling" slow. I wish it did better here, because quite a few people have turned down the idea of testing FreeBSD simply because bhyve can't virtualize Windows Server as well as KVM. bhyve is still pretty young compared to other hypervisors, though, so here's to it catching up for Windows guests!
 

free-and-bsd

Aspiring Daemon

Reaction score: 76
Messages: 706

That isn't "feeling" slow. I wish it did better here, because quite a few people have turned down the idea of testing FreeBSD simply because bhyve can't virtualize Windows Server as well as KVM. bhyve is still pretty young compared to other hypervisors, though, so here's to it catching up for Windows guests!
Alas, it is even so. I have to use VMware Player: even though it's not a hypervisor, it still gives me a faster Windows. Or maybe it's the virtual networking driver that slows things down, I don't know. When I connect to my Windows machine on my real office network using xfreerdp, as aragats suggests here, Windows is lightning fast in the RDP window. Connected to bhyve, it reminds me of the Win95 days, when computers used to hang every now and then. I just can't believe a virtual NATed network can be that much slower than the real one.
 

`Orum

Active Member

Reaction score: 30
Messages: 234

When I connect to my Windows machine on my real office network using xfreerdp...Windows is lightning fast... Connected to bhyve it reminds me of the Win95 days, when computers used to hang every now and then.
Are you connecting to the Windows servers using VNC? Yes, that's pretty slow, and aside from installation or recovery I'd advise against it. Connecting to Windows bhyve VMs via RDP works fine for me.
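For comparison, an RDP session straight to the bhyve guest can be opened with xfreerdp; the guest address, user name, and resolution below are placeholders:

```shell
# Connect to the guest over RDP instead of the bhyve VNC framebuffer
xfreerdp /v:192.168.8.10 /u:Administrator /size:1920x1080
```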
 

free-and-bsd

Are you connecting to the Windows servers using VNC? Yes, that's pretty slow, and aside from installation or recovery I'd advise against it. Connecting to Windows bhyve VMs via RDP works fine for me.
No, man. I'm using RDP. I'm comparing an RDP connection to bhyve vs. RDP to a Win 10 machine located on the office network. Fair comparison.
 