I'm running FreeBSD 14.3 on old server hardware (SM487 chassis) with the following (copied from dmesg):
Code:
CPU: Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz (3300.22-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x7fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
AMD Features2=0x1<LAHF>
Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
Structured Extended Features3=0x9c000000<IBPB,STIBP,L1DFL,SSBD>
XSAVE Features=0x1<XSAVEOPT>
VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
TSC: P-state invariant, performance statistics
real memory = 277025390592 (264192 MB)
avail memory = 267661074432 (255261 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <ALASKA A M I>
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads
The drives involved are 35 Seagate Exos X16 16TB drives and one Seagate Exos X18 18TB (one of the ZFS hot spares; it was a warranty replacement). 24 drives are on the front backplane (including the X18); the rear backplane has 12 (all X16). The zpool configuration is 3x 11-wide RAIDZ3 with 3 hot spares: two vdevs entirely on the front backplane and the third entirely on the rear backplane, with two spares on the front and one on the rear.
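Schematically (the pool and vdev names here are illustrative, not copied from `zpool status`):
Code:
tank
  raidz3-0   11x Exos X16   (front backplane)
  raidz3-1   11x Exos X16   (front backplane)
  raidz3-2   11x Exos X16   (rear backplane)
spares: two on the front (one of them the X18), one on the rear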
I'm considering GELI-encrypting the drives, provided the performance is sufficient, so I set up a test by removing the three hot spares from the array and trying them as individual drives. In this setup performance seemed pretty reasonable: when zeroing out the drives, dd reported about 220 MB/s per drive, down from 270 MB/s for the raw unencrypted drives. Performance was the same whether I was writing to one drive at a time or three.
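For reference, the per-drive setup and test looked roughly like this (the device name is illustrative; the GELI parameters match the `geli list` output further down):
Code:
# 256-bit AES-XTS with 4K sectors, matching the geli list output below
geli init -e AES-XTS -l 256 -s 4096 /dev/da33
geli attach /dev/da33
# zero the encrypted provider and watch the throughput
dd if=/dev/zero of=/dev/da33.eli bs=1M status=progress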
So far so good; I added the da*.eli devices as spares to my ZFS pool and then offlined three drives (one per vdev). The resilvering performance is hard to judge; I think it's taking a little longer than normal, but I can't remember the last time I resilvered three drives at once. So I'm putting that aside for the moment, other than noting that it's going on in the background. I did check CPU usage during the resilver, and no CPU reported much more than 20% utilization.
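The spare swap was something along these lines (pool and device names illustrative):
Code:
# add the encrypted providers as hot spares
zpool add tank spare da33.eli da34.eli da35.eli
# pull one drive per vdev and resilver onto an .eli spare
zpool offline tank da5
zpool replace tank da5 da33.eli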
However, while the resilver is happening, the dd processes zeroing out the next set of three drives report an absolutely abysmal slowdown: from 220 MB/s per drive down to 80 MB/s per drive. It doesn't seem to matter whether one drive is being zeroed or three. Rechecking CPU usage shows a couple of CPUs at 22-23%, with the rest still in the mid-to-high teens; if there's a CPU bottleneck, I can't see it. The processor supports AES-NI (it shows up in the dmesg above), and all of the GELI devices report "accelerated software" crypto. I ran `geli list`, and all of the drives look the same, so I'm only showing the output for one drive:
Code:
Geom name: da0.eli
State: ACTIVE
EncryptionAlgorithm: AES-XTS
KeyLength: 256
Crypto: accelerated software
Version: 7
UsedKey: 0
Flags: AUTORESIZE
KeysAllocated: 3726
KeysTotal: 3726
Providers:
1. Name: da0.eli
Mediasize: 16000900657152 (15T)
Sectorsize: 4096
Mode: r1w1e0
Consumers:
1. Name: da0
Mediasize: 16000900661248 (15T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r1w1e1
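For what it's worth, the per-CPU numbers above came from top; something like the following also shows the per-drive g_eli kernel threads (standard FreeBSD top flags):
Code:
# -S includes kernel threads (the g_eli workers), -P breaks usage out per CPU
top -SP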
Anyone have any ideas where else I can look for the cause of the slowdown? Writing all zeroes to a drive shouldn't be a difficult workload (and in any event I'm comparing the same workload both times). I'm concerned I might get half of my pool encrypted only to find out that performance has completely tanked. I already set `kern.geom.eli.threads` to 1, and while that has limited GELI to one thread per drive, I see very little difference in performance.
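For completeness, that was just the stock sysctl; note it's read when a provider is attached, so it only affects providers attached after the change (or set it in /boot/loader.conf):
Code:
# one GELI worker thread per provider instead of one per CPU
sysctl kern.geom.eli.threads=1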
Or is it just normal for unrelated I/O to tank completely on a machine that is resilvering a ZFS pool?