Chances of failure to drives on a 4 drive ZFS raidz1

Postby overmind » 27 Oct 2010, 21:35

What are the chances of 2 drives failing at the same time in a 4-drive raidz1?

I've read that roughly 1 in 10 drives fails over a period of a year or more.
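Under simplifying assumptions (an annualized failure rate of about 10%, as mentioned, and statistically independent failures, which is optimistic), the chance of two or more of four drives failing in the same year can be sketched as:

```python
from math import comb

def p_at_least_k_failures(n_drives, k, afr):
    """Probability that at least k of n_drives fail within one year,
    assuming independent failures with annualized failure rate afr."""
    return sum(
        comb(n_drives, i) * afr**i * (1 - afr)**(n_drives - i)
        for i in range(k, n_drives + 1)
    )

# 4 drives, ~10% AFR: chance that 2 or more fail in the same year.
print(round(p_at_least_k_failures(4, 2, 0.10), 4))  # 0.0523
```

About 5%, though this ignores correlated failures (same production batch, shared power supply) and says nothing about the much shorter window that matters in practice: a second failure during a resilver.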

What is the best approach for getting more space while still having redundancy?
3 drives in raidz1?
6 drives in raidz2?

I would like to use a pool of 4 x 1.5 TB drives for a mini backup server.

I've read some posts regarding ZFS here on the forum and on http://www.solarisinternals.com/, but I'd like your opinion: from your experience, how often do hard drives fail?
overmind
Member
 
Posts: 315
Joined: 18 Nov 2008, 12:29

Postby gkontos » 27 Oct 2010, 21:48

Once I had a failure of 2 drives in a 3-disk RAID 5 :\
The system was not using ZFS; in fact it was using ext3 on CentOS (can't remember the version). It turned out that a voltage issue caused both drives to fail at the same time. So, honestly, the more redundancy the better. And of course never forget that a good RAID remains good only as long as it is backed up regularly.

George
Powered by BareBSD
gkontos
Senior Member
 
Posts: 1370
Joined: 09 Dec 2009, 08:36
Location: Polidendri, GR

Postby shitson » 27 Oct 2010, 22:01

I'm sorry to be the bearer of bad news, but anything would just be a massive guess. It is possible for 2 drives to fail at the same moment (lightning storm, faulty firmware, faulty batch...), but you can mitigate the chances by running your gear on a UPS that is both surge and spike protected. Bear in mind you're really dealing with consumer-grade hardware.

There is always a chance, so never leave your data solely in the hands of the hardware gods ;)
"Virtually everything worth doing has a learning curve associated with it", anomie.
shitson
Member
 
Posts: 181
Joined: 17 Aug 2010, 02:51
Location: Australia, Wollongong

Postby overmind » 27 Oct 2010, 22:17

I will use a server motherboard and a UPS to reduce some of the risks.

Postby phoenix » 28 Oct 2010, 05:20

The chance of two drives failing at the same time is pretty low. HOWEVER, the chance of a second drive dying while going through the stress of a resilver is much higher. And if that second drive dies while the first drive is rebuilding ...

A 3-drive raidz1 is no better than a mirror (same amount of disk space), will be slower than a mirror, and will "waste" an extra drive of disk space compared to a mirror.

A 4-drive raidz1 will be slower than a pair of mirrors, and will have less redundancy (raidz1 can only lose 1 drive; a pair of mirrors can lose two drives if they are the right two).

A 4-drive raidz2 is no better than a mirror (same amount of disk space), but will be slower than a mirror, and will "waste" two drives of disk space compared to a mirror. A pair of mirrors has the same amount of redundancy, but better speed.

A 6-drive raidz2 is the "sweet" spot where raidz overtakes mirrors in terms of disk space and redundancy.

If you are absolutely paranoid about redundancy, and can't stand to lose a shred of data, then use 3-disk mirrors. :) Or, use OpenSolaris with 8-drive raidz3 vdevs.

Or, go with whatever you are most comfortable with. :) There's no "right" answer.
Freddie

Help for FreeBSD: Handbook, FAQ, man pages, mailing lists.
phoenix
MFC'd
 
Posts: 3349
Joined: 17 Nov 2008, 05:43
Location: Kamloops, BC, Canada

Postby danbi » 28 Oct 2010, 05:35

phoenix, wouldn't a 3-drive raidz1 have the space of two drives striped? A mirror will have the space of only one drive. Then a 4-drive raidz1 will have the space of three drives, etc.
raidz is, however, slower than mirrors for random writes and for writing in general.
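danbi's space arithmetic can be checked with a small sketch (drive counts and the 1.5 TB size are just the figures from this thread; metadata overhead is ignored):

```python
def usable_tb(n_drives, drive_tb, layout):
    """Approximate usable capacity for a ZFS vdev layout."""
    if layout == "mirror":           # striped mirror pairs (RAID 1+0 style)
        return (n_drives // 2) * drive_tb
    if layout.startswith("raidz"):
        parity = int(layout[-1])     # raidz1 -> 1, raidz2 -> 2, raidz3 -> 3
        return (n_drives - parity) * drive_tb
    raise ValueError(f"unknown layout: {layout}")

print(usable_tb(3, 1.5, "raidz1"))   # 3.0 TB, vs 1.5 TB for a 2-drive mirror
print(usable_tb(4, 1.5, "raidz1"))   # 4.5 TB
print(usable_tb(4, 1.5, "mirror"))   # 3.0 TB for two mirrored pairs
```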

In any case, off-line backup is what saves your data. The redundancy of mirror or raidz{123} is there to let your system keep running while there is a disk failure. Some systems must run at all times, no matter what; others can tolerate extended downtime (to restore from backup).

Not making backups and hoping your data will be safe is... well, hoping your data will be safe :)
danbi
Member
 
Posts: 227
Joined: 25 Apr 2010, 09:32
Location: Varna, Bulgaria

Postby aragon » 28 Oct 2010, 08:24

Prepare for the worst. Run smartd ([port]sysutils/smartmontools[/port]) so that you're notified early of potential drive failure, and purchase a cold-standby drive which you keep in a safe place.
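A minimal sketch of what that might look like in smartd.conf (device names and the mail address are placeholders; consult smartd.conf(5) for the full directive syntax):

```
# /usr/local/etc/smartd.conf
# -a: monitor all attributes; -o on: enable automatic offline testing;
# -S on: enable attribute autosave; -s S/../../7/03: short self-test
# every Saturday at 03:00; -m: mail address for failure reports.
/dev/ada0 -a -o on -S on -s S/../../7/03 -m admin@example.com
/dev/ada1 -a -o on -S on -s S/../../7/03 -m admin@example.com
```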
aragon
Giant Locked
 
Posts: 2031
Joined: 16 Nov 2008, 17:04
Location: Cape Town, South Africa

Postby User23 » 28 Oct 2010, 11:59

There is no way to be fully prepared for the worst.

If all your drives were produced on the same date, there could be a production problem, so all the drives could fail at nearly the same time.

Smartmontools, for example, won't help you if your power supply kills your drives with too much voltage.

Even if you have a backup, if it is in the same room or house it could be destroyed by fire.

http://en.wikipedia.org/wiki/Murphy's_law
User23
Member
 
Posts: 336
Joined: 17 Nov 2008, 14:25
Location: Germany near Berlin

Postby overmind » 28 Oct 2010, 14:20

danbi wrote:Not making backups and hoping your data will be safe is... well, hoping your data will be safe :)


The idea is that the machine I am talking about will be a backup server. Should I then back up the backup server?

Until now I've used gmirror on all my servers and it has worked fine; I've had no data loss. All servers are connected to a 1000VA UPS.

Postby phoenix » 28 Oct 2010, 15:19

danbi wrote:phoenix, wouldn't a 3-drive raidz1 have the space of two drives striped? A mirror will have the space of only one drive. Then a 4-drive raidz1 will have the space of three drives, etc.


Hee hee, oops. You're right. My math is off.

Postby fgordon » 29 Oct 2010, 00:58

At the moment I'm using a 12-drive ZFS pool (raidz2) and, as a backup system, a Linux software RAID-6 setup, also with 12 drives (Samsung 2 TB, 5400 rpm; they stay cool while running).

Former systems had up to 24 drives in a single array (RAID-6), and in ~10 years I have never had to use my backup system, even with PATA and 24-drive configurations :D (no server drives)

I think normal drives are a lot better than one might expect, though I do have a backup server of course.

I'm not using a UPS, but electricity is very stable here; I think a really good PSU is enough (at least for home use).
fgordon
Junior Member
 
Posts: 33
Joined: 28 Mar 2010, 11:44

Postby wonslung » 01 Nov 2010, 05:31

I've got a few servers in production using 20 1TB or 2TB drives in 10-drive-wide raidz2 vdevs. They work fantastically as backup servers and/or streaming servers.

As long as you don't make the mistake of using Western Digital Green drives, I see no problem with using somewhat wide raidz2 stripes (10-12 drives). The main thing to remember is that random I/O will be very limited due to the variable block size (with raidz1/2/3 a block spans the full stripe width, so you get the IOPS of a single drive), but sequential access is fine and can make good use of ZFS prefetching.

The biggest issue with wide stripes, other than random I/O, is resilver time, but I get 200-600 MB/s resilvers and scrubs with this layout. I was using 3-4 vdevs, but for backup servers and servers doing mostly sequential access this turned out to be a waste.

The bottom line is that you should test it for your workload, but I see no real issues.
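For a sense of scale, the resilver rates quoted above translate to roughly these per-drive times (the drive size and throughput figures are just the ones mentioned, assuming a full, sustained-rate resilver):

```python
def resilver_hours(drive_tb, mb_per_s):
    """Hours to rewrite one drive's worth of data at a sustained rate."""
    return drive_tb * 1e12 / (mb_per_s * 1e6) / 3600

print(round(resilver_hours(2.0, 200), 1))  # 2.8 hours at 200 MB/s
print(round(resilver_hours(2.0, 600), 1))  # 0.9 hours at 600 MB/s
```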
wonslung
Member
 
Posts: 850
Joined: 07 May 2009, 00:15

Postby overmind » 01 Nov 2010, 14:54

@wonslung
I have WD Green drives with gmirror for backup (not ZFS) and I've had no problems. Tell me more about WD Green; what could be the problem with them?
(I used the green ones for their low power consumption.)

So, from a disk-space point of view, a 10-drive raidz2 is OK for a backup server, right?
And two striped 5-drive raidz1 vdevs would be a little faster but with less space?

Postby fgordon » 01 Nov 2010, 16:52

WD Greens are sold with a VERY short idle timeout before parking the heads, ~8 seconds. If you don't change it (with wdidle.exe), the load/unload cycle count will grow really fast, and eventually this will lead to a SMART failure, as the number of load/unload cycles is limited.
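The arithmetic behind that concern, as a rough sketch (the 300,000-cycle rating and the one-park-per-minute access pattern are illustrative assumptions, not WD specifications):

```python
RATED_CYCLES = 300_000   # assumed load/unload rating for a desktop drive
parks_per_hour = 60      # assumed worst case: bursty I/O re-parks the heads every minute

cycles_per_day = parks_per_hour * 24
days_to_rating = RATED_CYCLES / cycles_per_day
print(round(days_to_rating))  # 208 days to exhaust the rating
```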

Postby jalla » 01 Nov 2010, 18:13

overmind wrote:@wonslung
So, from a disk-space point of view, a 10-drive raidz2 is OK for a backup server, right?
And two striped 5-drive raidz1 vdevs would be a little faster but with less space?


I'd say not faster, and identical in space.
The big difference is in safety: one 10-disk raidz2 is much more robust than two 5-disk raidz1 vdevs combined.
Practical latin
Amicule, deliciae, num is sum qui mentiar tibi?
But dear, could I ever lie to you?
jalla
Member
 
Posts: 369
Joined: 06 Aug 2009, 12:41
Location: Bergen, Norway

Postby phoenix » 01 Nov 2010, 18:43

overmind wrote:@wonslung
I have WD Green drives with gmirror for backup (not ZFS) and I've had no problems. Tell me more about WD Green; what could be the problem with them?
(I used the green ones for their low power consumption.)


Search the forums and the freebsd-stable/-current/-fs mailing lists. There are *lots* of posts about just how horrible the WD Green-series, WD GP-series, and WD "Advanced Format" versions really are. Just avoid them all. The only place they are useful is if you need a super-low-power and very quiet drive for putting into an HTPC or similar. However, do not use them in RAID, do not use them in a server, and do not use more than 1 per system. Just ... don't. It's not worth the hassle.

Postby Christopher » 02 Nov 2010, 00:42

phoenix wrote:A 4-drive raidz2 is no better than a mirror (same amount of disk space), but will be slower than a mirror, and will "waste" two drives of disk space compared to a mirror. A pair of mirrors has the same amount of redundancy, but better speed.


Yes, but in the event of a single drive failure a raidz2 retains redundant copies of all the information, unlike a pair of mirrors. This is useful during the stressful resilvering process: on a degraded mirror a single I/O error can result in data loss, which is not true of a raidz2.
Christopher
Junior Member
 
Posts: 57
Joined: 16 Nov 2008, 17:15

Postby mix_room » 02 Nov 2010, 08:37

phoenix wrote:A 4-drive raidz2 is no better than a mirror (same amount of disk space), but will be slower than a mirror, and will "waste" two drives of disk space compared to a mirror. A pair of mirrors has the same amount of redundancy, but better speed.


That is not strictly true. A 4-drive raidz2 has slightly better redundancy than a pair of mirrors (assuming that by "mirror" you mean some form of RAID 1+0). With raidz2 you can lose any 2 disks, while with mirrors you can only lose specific combinations of disks.
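This point can be enumerated directly. For four drives arranged as two striped mirror pairs versus one raidz2 vdev (the drive labels are illustrative):

```python
from itertools import combinations

drives = ["a1", "a2", "b1", "b2"]        # two mirror pairs: (a1,a2) and (b1,b2)
pairs = [{"a1", "a2"}, {"b1", "b2"}]

def mirror_survives(failed):
    """The striped-mirror pool dies if any pair loses both members."""
    return not any(p <= set(failed) for p in pairs)

two_drive_failures = list(combinations(drives, 2))
survived = [f for f in two_drive_failures if mirror_survives(f)]

print(len(two_drive_failures))  # 6 possible two-drive failures
print(len(survived))            # the mirrors survive only 4 of them
# A 4-drive raidz2 survives all 6: any two drives can fail.
```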
mix_room
Member
 
Posts: 561
Joined: 07 Aug 2009, 16:31

Postby wonslung » 03 Nov 2010, 08:18

overmind wrote:@wonslung
I have WD Green drives with gmirror for backup (not ZFS) and I've had no problems. Tell me more about WD Green; what could be the problem with them?
(I used the green ones for their low power consumption.)

So, from a disk-space point of view, a 10-drive raidz2 is OK for a backup server, right?
And two striped 5-drive raidz1 vdevs would be a little faster but with less space?


There are a lot of problems with them, but the biggest is the so-called "Advanced Format".

The sector size is physically 4K, but the drive reports 512 bytes, and for raidz this is just about the worst possible situation.

Raidz uses a variable block size. ZFS tries to turn random writes into sequential writes by buffering them in RAM and flushing them to disk periodically, and with raidz it will try to write blocks as wide as the raidz group (so it will break 1 block into 5 parts for a vdev with 5 drives, including parity).

Because the drives report a 512-byte sector size, ZFS will sometimes save these blocks in 512-byte pieces across the drives; this forces the drive to read and write over and over for what should be normal disk operations.
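A rough sketch of the chunk sizes involved (the record size, vdev width, and sector sizes are illustrative; real raidz allocation has further subtleties):

```python
def per_disk_chunk(record_bytes, vdev_width, parity, reported_sector):
    """Bytes written per data disk for one record, rounded up to the
    sector size the drive *reports* (not its physical sector size)."""
    data_disks = vdev_width - parity
    chunk = record_bytes // data_disks
    sectors = -(-chunk // reported_sector)   # ceiling division
    return sectors * reported_sector

# A 2 KB record on a 5-drive raidz1 whose drives report 512-byte sectors:
print(per_disk_chunk(2048, 5, 1, 512))   # 512: sub-4K writes force read-modify-write
# The same record if the drives reported their real 4K sectors:
print(per_disk_chunk(2048, 5, 1, 4096))  # 4096: whole physical sectors
```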

Postby wonslung » 03 Nov 2010, 08:21

jalla wrote:I'd say not faster, and identical in space.
The big difference is in safety: one 10-disk raidz2 is much more robust than two 5-disk raidz1 vdevs combined.


Yes, I agree (and thought I conveyed this), but the usual advice is not to use wide stripes. It DOES depend on your use, though: two 5-drive raidz1 vdevs will be better in some situations than a single 10-drive raidz2 vdev.

Postby overmind » 04 Nov 2010, 14:02

wonslung wrote:There are a lot of problems with them, but the biggest is the so-called "Advanced Format".

The sector size is physically 4K, but the drive reports 512 bytes, and for raidz this is just about the worst possible situation.

Raidz uses a variable block size. ZFS tries to turn random writes into sequential writes by buffering them in RAM and flushing them to disk periodically, and with raidz it will try to write blocks as wide as the raidz group (so it will break 1 block into 5 parts for a vdev with 5 drives, including parity).

Because the drives report a 512-byte sector size, ZFS will sometimes save these blocks in 512-byte pieces across the drives; this forces the drive to read and write over and over for what should be normal disk operations.


Then, which hard drives (1TB and 1.5TB) are OK?
- The Western Digital Caviar Black 1.5 TB (7200 rpm, 64MB, SATA2) is kind of expensive.
- I've noticed Samsung has green drives too (Samsung F2 EcoGreen 1.5 TB, 5400 rpm, 32MB, SATA2); I wonder if they have the same issue as the WD Green ones.
- There's also the Samsung SpinPoint 1.5 TB (5400 rpm, 32MB, SATA2), which I think is not a green drive.

If you use some of these and they work OK, please advise.

I would not choose Seagate; I had lots of problems with them in the past (high failure rate). Also, I would prefer drives that run cooler.

Thank you and best regards!

Postby aragon » 04 Nov 2010, 14:16

overmind wrote:Then, which hard drives (1TB and 1.5TB) are ok?

Everyone seems to rate the Hitachi highly:

http://www.newegg.com/Product/Product.aspx?Item=N82E16822145369

Although it looks like Newegg won't be selling the 2TB model anymore....

overmind wrote:- I've noticed Samsung has green drives: HDD Samsung F2 Eco Green Series 1.5 TB, 5400 rpm, 32MB, SATA2, I wonder if is the same issue with them as with WD green ones

There's also the F3 series if you can find them for sale somewhere; they don't have sector emulation like the F4 does.

overmind wrote:Also I would prefer ones that runs cooler.

Notebook hard drives are an option too.

(I'm still on the fence regarding the severity of 4k sector emulation when we have gnop at our disposal)

Postby overmind » 04 Nov 2010, 15:08

Yes, notebook hard drives are interesting if we can get 1TB at a low price, because of their low power consumption and high density. Check this out: http://www.chenbro.com/corporatesite/products_detail.php?sku=117 (there's also a smaller, 24-drive version).

...

Postby fgordon » 05 Nov 2010, 14:15

At the moment I'm using 24 Samsung F3 2 TB drives and have had no failures so far (~1 year).

They stay really cool(!). I log the temperature of every drive, and they are normally only 10 degrees (C) warmer than room temperature, even under heavy load for hours (scrub).

Hmm, I thought the only problem was 4K drives that report 512-byte sectors? Does the F4 have this "emulation"? If a drive reports itself as 4K, ZFS should not have a problem with it, as long as one partitions it correctly, or uses it without any partition...?

Postby phoenix » 05 Nov 2010, 15:26

The problem with ZFS is that it uses a variable block size and will gladly use 0.5 KB, 1 KB, or 2 KB blocks, which plays havoc with disk alignment after a bit of use.

Yes, one can force a single block size onto a ZFS filesystem via the recordsize property. However, that eliminates a lot of the potential performance gains of using a variable block size, as recordsize is both a minimum and a maximum for all blocks.

What one needs to do is recompile the ZFS code to set the minimum block size to 4 KB without affecting the maximum. There's still a lot of debate over where in the code this needs to be set (FreeBSD patches set it in one place, OpenSolaris patches set it in a different place, and nobody knows if either is enough).

Aligning the first partition is not enough for ZFS to stay aligned.

Until the OpenSolaris and/or FreeBSD devs come up with a way to autotune this based on what the drive reports, set it at runtime, or even add a knob to set it at compile time, one really should avoid all 4K drives when using ZFS, or suffer through sub-par performance.
