ZFS Max Reads : 3x1TB + 1xSSD L2ARC - The FreeBSD Forums

#1 einthusan (Junior Member), May 3rd, 2012, 22:07

Hey guys,

As the title says, I will be using 3 x 1TB drives and one SSD for L2ARC. I'm looking to get the fastest read throughput. I'm a bit confused, obviously: can you mirror the 1TB of data on all 3 drives, so that in the end your system has only 1TB of storage capacity instead of 3TB?

Does this involve striping, or do I bypass striping altogether? My reasoning is that if I'm doing file serving and the same data is redundantly stored on all 3 drives, the IOPS should be 3 times as high.

Also take note that I am not really worried about data backup; I've got that covered, so any data redundancy is purely for performance reasons.

Last edited by DutchDaemon; May 4th, 2012 at 00:16.

#2 phoenix (Moderator), May 3rd, 2012, 22:40

Quote:
Originally Posted by einthusan
As the title says, I will be using 3 x 1TB drives and one SSD for L2ARC. I'm looking to get the fastest read throughput. I'm a bit confused, obviously: can you mirror the 1TB of data on all 3 drives, so that in the end your system has only 1TB of storage capacity instead of 3TB?

Yes, you can create a 3-way mirror, where the data is the same on each disk, giving you 1 TB of usable storage, with the ability to lose 2 disks without losing any data:
Code:
# zpool create poolname mirror disk1 disk2 disk3
Quote:
Also take note that I am not really worried about data backup; I've got that covered, so any data redundancy is purely for performance reasons.

Redundancy doesn't give you performance. If you want the absolute best performance, then just create a pool of individual disks:
Code:
# zpool create poolname disk1 disk2 disk3
That will create the equivalent of a RAID0 stripe across the three drives, giving you 3 TB of disk space and the most IOPS. Of course, lose any one drive and the whole pool is gone. You also lose the ability to repair errors in any data, as there is no redundancy in the pool.
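
As a rough illustration of the trade-off between the two commands above, the following shell sketch computes usable capacity and tolerated disk failures for each layout. The function name layout_summary and its arguments are made up for this example; it only does arithmetic and never touches a pool.

```shell
#!/bin/sh
# Hypothetical helper, not part of ZFS: summarize a single-vdev layout.
# usage: layout_summary <mirror|stripe> <disk_count> <disk_size_tb>
layout_summary() {
    layout=$1; disks=$2; size_tb=$3
    case "$layout" in
        mirror)
            # every disk holds the same data: capacity of one disk,
            # survives the loss of all but one disk
            usable=$size_tb
            tolerated=$((disks - 1))
            ;;
        stripe)
            # data is spread over all disks: full capacity, no redundancy
            usable=$((disks * size_tb))
            tolerated=0
            ;;
        *)
            echo "unknown layout: $layout" >&2
            return 1
            ;;
    esac
    echo "$layout: ${usable} TB usable, survives ${tolerated} disk failure(s)"
}

layout_summary mirror 3 1
layout_summary stripe 3 1
```

In either layout, the SSD from the thread title would be attached separately as a cache vdev, for example with zpool add poolname cache followed by the SSD's device name.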

#3 wblock@ (Moderator), May 3rd, 2012, 22:57

A mirror can give some performance increase in reads depending on the mirror algorithm, but writes suffer. I tested this recently with two 80G IDE drives:

Code:
           write  read
lone drive 37608  55994
gstripe    26945  78086
gmirror    13460  71698
That's with the gmirror(8) load algorithm.

#4 einthusan (Junior Member), May 4th, 2012, 01:42

I had always thought that mirroring performed better than striping. Your tests indicate that striping performs better in reads as well.

This article concludes by saying that mirrors are always faster than RAID-Z groups for file serving.
http://constantin.glez.de/blog/2010/...nd-performance

Am I understanding correctly that a RAID-Z group is the same as adding raw drives into a ZFS pool? This is surely some confusing stuff. If I had a spare machine I would be able to run some tests myself but I don't have one.

#5 phoenix (Moderator), May 4th, 2012, 04:57

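No. Adding raw drives to a pool gives you a plain stripe with no parity. A raidz1 vdev is similar to a RAID5 array in that out of 'n' disks, you get the capacity of 'n-1' disks for data and one disk's worth of parity. A raidz2 vdev is like a RAID6 array, where you get the capacity of 'n-2' disks for data and two disks' worth of parity.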
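
The 'n-1' and 'n-2' rule can be sketched as a small calculation. The helper name raidz_usable_tb is invented for this example, and the result is only an approximation: it ignores metadata and allocation overhead.

```shell
#!/bin/sh
# Hypothetical helper: approximate usable capacity of one raidz vdev.
# usage: raidz_usable_tb <disk_count> <parity_level> <disk_size_tb>
raidz_usable_tb() {
    disks=$1; parity=$2; size_tb=$3
    # a raidz vdev needs more disks than its parity level
    if [ "$disks" -le "$parity" ]; then
        echo "need more disks than parity level" >&2
        return 1
    fi
    # (n - parity) disks' worth of space is left for data
    echo $(( (disks - parity) * size_tb ))
}

echo "raidz1, 3 x 1 TB: $(raidz_usable_tb 3 1 1) TB"
echo "raidz2, 6 x 1 TB: $(raidz_usable_tb 6 2 1) TB"
```

For the three 1 TB drives discussed in this thread, raidz1 would leave roughly 2 TB for data, between the 1 TB of a 3-way mirror and the 3 TB of a plain stripe.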

#6 jalla (Member), May 5th, 2012, 10:43

Quote:
Originally Posted by wblock@
A mirror can give some performance increase in reads depending on the mirror algorithm, but writes suffer. I tested this recently with two 80G IDE drives:

Code:
           write  read
lone drive 37608  55994
gstripe    26945  78086
gmirror    13460  71698
That's with the gmirror(8) load algorithm.

It depends mostly on the number of data drives. Testing a stripe of three disks vs. three mirrored pairs, I can't really tell the difference.

Code:
             -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
r0-3x1   16384 30372 10.5 24044  2.8 22521  3.2 224798 63.4 218721  8.2 490.2  0.6
r1-6x3   16384 30104  7.4 23427  2.7 21700  3.1 237144 74.8 221707  7.6 468.3  0.6

#7 mav@ (FreeBSD Developer), May 5th, 2012, 16:18

Quote:
Originally Posted by wblock@
A mirror can give some performance increase in reads depending on the mirror algorithm, but writes suffer. I tested this recently with two 80G IDE drives:

Code:
           write  read
lone drive 37608  55994
gstripe    26945  78086
gmirror    13460  71698
That's with the gmirror(8) load algorithm.

Make sure that those drives do not share one IDE port. The low numbers for mirror writes are suspicious. In general, if there are no other limitations and the benchmark is multi-threaded, gstripe should give x2 performance on both read and write, while gmirror should give x2 on read and x1 on write.
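
To compare the posted figures against those expected x2/x1 factors, a short sketch can divide each row by the lone-drive numbers. The helper name relative_to_lone is made up for this example; the values are copied from the quoted table, with units as posted.

```shell
#!/bin/sh
# Throughput figures copied from the table quoted above (units as posted).
results='lone 37608 55994
gstripe 26945 78086
gmirror 13460 71698'

# Hypothetical helper: print each row's write and read speed relative to
# the first row (the lone drive).
relative_to_lone() {
    printf '%s\n' "$1" | awk '
        NR == 1 { base_w = $2; base_r = $3 }
        { printf "%s write x%.2f read x%.2f\n", $1, $2 / base_w, $3 / base_r }'
}

relative_to_lone "$results"
```

This gives roughly x0.72 write and x1.39 read for gstripe, and x0.36 write and x1.28 read for gmirror, all well short of the ideal factors, which is consistent with some other limit being in play.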

#8 wblock@ (Moderator), May 5th, 2012, 17:58

They were on different ports on an old Promise PCI IDE controller. There may be bottlenecks on the card, but it was the only way other than IDE/USB converters to attach these to a recent motherboard. I figured the mirror write slowdown was due to the mirror having up to twice the rotational latency of a lone drive.

#9 mav@ (FreeBSD Developer), May 5th, 2012, 20:17

Quote:
Originally Posted by wblock@
I figured the mirror write slowdown was due to the mirror having up to twice the rotational latency of a lone drive.

You are right about latency, though it is not twice; I think it is x1.5 on average. But read-ahead/write-back of the file system should hide it.
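
A back-of-the-envelope check of that factor, under a simplified model that is an assumption and not something measured in this thread: the rotational delays $X_1$ and $X_2$ of the two drives are independent and uniformly distributed over one rotation period $T$, seek time is ignored, and a mirrored write completes only when the slower drive finishes.

```latex
\begin{align*}
E[X_1] &= \frac{T}{2}, \\
E[\max(X_1, X_2)] &= \int_0^T x \cdot \frac{2x}{T^2}\,dx = \frac{2T}{3}, \\
\frac{E[\max(X_1, X_2)]}{E[X_1]} &= \frac{2T/3}{T/2} = \frac{4}{3} \approx 1.33 .
\end{align*}
```

Under these assumptions the expected rotational latency of a two-way mirror write is about x1.33 that of a lone drive: somewhat below the x1.5 estimate and well below x2.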

Last edited by mav@; May 6th, 2012 at 06:23.

#10 einthusan (Junior Member), May 21st, 2012, 00:02
l2arc writes more than reading

After striping together three disks and adding an L2ARC device, I enabled caching of streaming data and let the L2ARC warm up for ten hours. The read rates from the L2ARC are lower than those of disk reads. The L2ARC keeps writing at 40 MB/sec and is read at only 20 MB/sec. When I tested the SSD device using Bonnie++, the throughput was amazingly high. However, under real-world streaming load, it's as if the L2ARC wants to keep on caching disk reads instead of helping to improve overall read throughput. Any advice/tips would be much appreciated!

Last edited by DutchDaemon; May 21st, 2012 at 00:31.

#11 t1066 (Member), May 21st, 2012, 02:42

First make sure you have
Code:
vfs.zfs.l2arc_noprefetch=0
set in /etc/sysctl.conf. Alternatively, set it on the command line:
Code:
# sysctl vfs.zfs.l2arc_noprefetch=0

Next install sysutils/zfs-stats and run zstat. It will show the efficiencies of ARC, L2ARC and ZFETCH. Write them down for future reference.

Now comes the hard part. Determine the size of your working set, then make sure it is less than the capacity of your L2ARC. This can be done in two ways: add more cache drives, or put the files into different filesystems and leave secondarycache=all set only on the filesystems you want to cache. Rerun zstat after each change to see what the improvement is, if any.
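
The second option can be sketched as a dry run that only prints the commands it would issue. The function name plan_l2arc_scope and the dataset names are placeholders invented for this example; secondarycache=none is the property value that keeps a dataset out of the L2ARC.

```shell
#!/bin/sh
# Dry run: print the commands instead of executing them.
# Dataset names passed in are placeholders chosen for this example.
# usage: plan_l2arc_scope <dataset_to_cache> <dataset_to_skip>...
plan_l2arc_scope() {
    cached=$1
    shift
    # keep prefetched (streaming) reads eligible for the L2ARC
    echo "sysctl vfs.zfs.l2arc_noprefetch=0"
    echo "zfs set secondarycache=all $cached"
    for ds in "$@"; do
        # secondarycache=none keeps this dataset out of the L2ARC
        echo "zfs set secondarycache=none $ds"
    done
}

plan_l2arc_scope pool1/streams pool1/archive pool1/logs
```

Printing the commands first makes it easy to review the plan before changing properties on a live pool, and to rerun the statistics after each individual change.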

Last edited by DutchDaemon; May 21st, 2012 at 02:58.

#12 einthusan (Junior Member), May 21st, 2012, 07:04

Quote:
Originally Posted by t1066
First make sure you have
Code:
vfs.zfs.l2arc_noprefetch=0
set in /etc/sysctl.conf. Alternatively, set it on the command line:
Code:
# sysctl vfs.zfs.l2arc_noprefetch=0

Yes, I had this value set.

Obviously the L2ARC wasn't working as I expected since it seems to be in degraded mode. I'll try to read up on this more. Thanks.

Code:
L2 ARC Summary: (DEGRADED)
	Passed Headroom:			2.16m
	Tried Lock Failures:			28.29k
	IO In Progress:				178
	Low Memory Aborts:			3
	Free on Write:				117.27k
	Writes While Full:			40.71k
	R/W Clashes:				91
	Bad Checksums:				64
	IO Errors:				0
	SPA Mismatch:				0

L2 ARC Size: (Adaptive)				29.78	GiB
	Header Size:			0.15%	45.87	MiB

L2 ARC Evicts:
	Lock Retries:				213
	Upon Reading:				391

L2 ARC Breakdown:				29.18m
	Hit Ratio:			24.30%	7.09m
	Miss Ratio:			75.70%	22.09m
	Feeds:					111.90k

L2 ARC Buffer:
	Bytes Scanned:				45.67	TiB
	Buffer Iterations:			111.90k
	List Iterations:			6.63m
	NULL List Iterations:			1.04m

L2 ARC Writes:
	Writes Sent:			100.00%	86.82k
Code:
  pool: pool1
 state: ONLINE
  scan: scrub repaired 0 in 0h1m with 0 errors on Fri May 18 11:08:59 2012
config:

	NAME          STATE     READ WRITE CKSUM
	pool1         ONLINE       0     0     0
	  mirror-0    ONLINE       0     0     0
	    ada0p2    ONLINE       0     0     0
	    ada2p2    ONLINE       0     0     0
	  gpt/disk1   ONLINE       0     0     0
	cache
	  gpt/cache1  ONLINE       0     0     0

errors: No known data errors
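
The hit and miss ratios in the "L2 ARC Breakdown" section above follow directly from the two counters. A minimal sketch, with l2arc_ratio as a made-up helper name; on a live FreeBSD system the raw counters are typically exposed through the kstat.zfs.misc.arcstats.l2_hits and l2_misses sysctls, but here the rounded values from the summary are used.

```shell
#!/bin/sh
# Hypothetical helper: L2ARC hit and miss percentages from raw counters.
# usage: l2arc_ratio <hits> <misses>
l2arc_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "no L2ARC accesses"; exit }
        printf "hit %.2f%% miss %.2f%%\n", 100 * h / total, 100 * m / total
    }'
}

# Rounded counters from the summary above, in millions
l2arc_ratio 7.09 22.09
```

With the posted counters this reproduces the 24.30% hit and 75.70% miss figures, so only about one in four reads that reached the L2ARC was served from the SSD.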

Last edited by DutchDaemon; May 21st, 2012 at 21:01. Reason: "I'll" and "Ill" are different things.

#13 einthusan (Junior Member), May 21st, 2012, 07:56

Quote:
Originally Posted by t1066
Next install sysutils/zfs-stats and run zstat. It will show the efficiencies of ARC, L2ARC and ZFETCH. Write them down for future reference.

I don't see L2ARC :S
Code:
ZFS real-time cache activity monitor

Cache efficiency percentage:
                  10s    60s    tot
          ARC:  68.79  70.48  70.48
       ZFETCH:  96.88  96.94  96.94
VDEV prefetch:   0.00   0.00   0.00

#14 einthusan (Junior Member), May 21st, 2012, 09:14

Got it to work. Made the changes you suggested. Does this look okay?
Code:
ZFS real-time cache activity monitor

Cache efficiency percentage:
           10s    60s    tot
   ARC:  76.44  78.37  80.52
 L2ARC:  12.12  17.14  15.28
ZFETCH:  98.56  98.61  98.77

Last edited by DutchDaemon; May 21st, 2012 at 21:02.

#15 t1066 (Member), May 21st, 2012, 11:12

Your L2ARC has 64 bad checksums, which is why it is classified as DEGRADED.

The efficiency of your L2ARC is less than 20%, which is pretty bad unless it is just warming up. I would try to get the efficiency up to at least 70%; ideally, it should be over 90% most of the time. You should try to improve the whole setup by monitoring the size of the L2ARC and its efficiency. If you fill up the cache drive but still get low efficiency, you have to either add more cache drives or restrict caching to certain filesystems.
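
Those rule-of-thumb thresholds can be written down as a small check to run against each new reading. The helper name rate_l2arc and its labels are invented for this example; the 70% and 90% cut-offs are the ones given in this post, not values defined by ZFS.

```shell
#!/bin/sh
# Hypothetical helper applying the rule-of-thumb thresholds from this post.
# usage: rate_l2arc <efficiency_percent>
rate_l2arc() {
    awk -v e="$1" 'BEGIN {
        if (e > 90)       print "ideal"
        else if (e >= 70) print "acceptable"
        else              print "low: add cache devices or restrict caching"
    }'
}

# 15.28 is the L2ARC "tot" value posted earlier in the thread;
# 75 and 93.5 are made-up values for comparison
for pct in 15.28 75 93.5; do
    echo "$pct% -> $(rate_l2arc "$pct")"
done
```

A low reading shortly after boot may only mean the cache is still warming up, so the check is more meaningful once the cache device has filled.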

Last edited by DutchDaemon; May 21st, 2012 at 21:02.