ZFS throughput incredibly slow

Hi,

I've just upgraded from 9.1-RELEASE to 10.0-RELEASE and am seeing some insane problems with my zpool.

Reads and writes are incredibly slow. If I run iozone -a, it pretty much freezes at the first test - the output shows KB 64, reclen 4, then... eventually... a figure will appear under 'write' for the first test. Then a long pause... etc.

(I'll paste the output here after there's something to show!)

I have 6GB of RAM - here's the output of top:

Code:
last pid:  1882;  load averages:  0.31,  0.43,  0.49                                                                           up 0+04:38:34  21:40:46
47 processes:  1 running, 46 sleeping
CPU:  0.8% user,  0.0% nice,  0.4% system,  0.0% interrupt, 98.8% idle
Mem: 165M Active, 309M Inact, 4215M Wired, 164M Buf, 1261M Free
ARC: 3903M Total, 952M MFU, 2898M MRU, 24M Anon, 18M Header, 11M Other
Swap: 3598M Total, 3598M Free

It's taken 4 minutes for iozone -a to report the following:

Code:
	Iozone: Performance Test of File I/O
	        Version $Revision: 3.420 $
		Compiled for 64 bit mode.
		Build: freebsd 

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
	             Vangel Bojaxhi, Ben England, Vikentsi Lapa.

	Run began: Sat Aug 30 21:39:35 2014

	Auto Mode
	Command line used: iozone -a
	Output is in Kbytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
              64       4       3  252098   614542   666414  673098  261173  473593   279676   582535   246994   225423  210569   224480
              64       8   43388  111300   232854   220064  250919   55269  681644        4   874936        4        4  551422  1013707
              64      16

I'm seeing other weird behaviour too - running ls on directories in the pool takes forever, for example.

zpool status shows everything's fine. It was scrubbed just before the upgrade. Now, if I try to start a scrub, the command takes forever to execute and the scrub itself runs at just 112KB/sec.
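(Not something I've run yet, but while the scrub crawls it might be worth watching the individual disks - a single drive that is slow to respond without actually erroring will drag the whole pool down and won't show up in zpool status. A sketch, assuming a pool named tank - substitute the real pool name:)

```shell
# Per-vdev throughput, refreshed every 5 seconds ('tank' is a placeholder)
zpool iostat -v tank 5

# Per-disk busy% and latency from GEOM; one disk with ms/r or ms/w far
# above its siblings often points at failing hardware even when
# SMART and zpool status look clean
gstat
```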

There are no warnings logged in /var/log/messages or dmesg.

I have tried a bit of zfs tuning, but it's not helped. Currently in /boot/loader.conf, I have:

Code:
vfs.zfs.arc_max=4092461056
vfs.zfs.write_limit_override=268435456
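
It might also be worth confirming those loader.conf values actually took effect after the reboot, e.g.:

```shell
# Effective ARC maximum vs. the value set in /boot/loader.conf,
# plus the ARC's current size and target
sysctl vfs.zfs.arc_max
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max
```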

Any idea what the problem could be?

Thanks,
Chris
 
A bit more info:

This pool had been upgraded from version 28 in FreeBSD 9.1 to 5000 in 10.0-RELEASE. I have another pool which I've not yet upgraded, and that one seems OK.
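(To double-check the upgrade itself completed cleanly - a sketch, with 'tank' standing in for the real pool name:)

```shell
# A v5000 pool reports '-' for version; an incomplete upgrade would
# still show a numeric version ('tank' is a placeholder pool name)
zpool get version tank

# Lists any pools still running an older on-disk version
zpool upgrade
```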

I ran zdb on the problem pool about 10 minutes ago - it has yet to produce any output.

If it's any help, my uname -a output is:

Code:
FreeBSD trillian 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #0 r270797: Fri Aug 29 10:37:33 BST 2014     root@trillian:/usr/obj/usr/src/sys/TRILLIAN  amd64

(the 'TRILLIAN' kernel config is just GENERIC, plus the options required to run a VPN server).
 
Pool layout? Number of drives? How are they connected? Do you have any cache (L2ARC) devices? Number and speed of CPUs?

We need more information. :)
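Something along these lines would cover most of it:

```shell
# Pool layout, vdev types and per-device status
zpool status -v

# How the drives are attached (controller, negotiated transfer speed)
camcontrol devlist

# CPU model, core count and physical RAM
sysctl hw.model hw.ncpu hw.physmem
```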
 