ZFS write performance issues with WD20EARS

mjrosenb said:
who did you write to, customer service department?
Yeah, I just emailed support. I believe I used this link.

"Pablo R." replied the following back to me:
I apologize for the inconvenience. I will forward your request to our engineer department.

Please bear in mind that this is not an indication that they will make the firmware update.

Still, I feel like we should all email them in the hopes that our voices will be heard. It's not like it's hard to write a short email requesting a firmware change.

mjrosenb said:
also, how do you have your disks attached to the system?

They're attached as single disks through my 3ware 9650SE-24M8 with no auto-verify and the storsave profile set to perform. On top of that, I have gnop providing 4K-sectored disks, and ZFS uses those 4K transparent providers. So, all in all, my zpool status looks like this:

Code:
  pool: pool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        pool          ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da6.nop   ONLINE       0     0     0
            da9.nop   ONLINE       0     0     0
            da5.nop   ONLINE       0     0     0
            da1.nop   ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da2.nop   ONLINE       0     0     0
            da4.nop   ONLINE       0     0     0
            da0.nop   ONLINE       0     0     0
            da7.nop   ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da3.nop   ONLINE       0     0     0
            da10.nop  ONLINE       0     0     0
            da16.nop  ONLINE       0     0     0
            da12.nop  ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da8.nop   ONLINE       0     0     0
            da19.nop  ONLINE       0     0     0
            da13.nop  ONLINE       0     0     0
            da17.nop  ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            da15.nop  ONLINE       0     0     0
            da11.nop  ONLINE       0     0     0
            da18.nop  ONLINE       0     0     0
            da14.nop  ONLINE       0     0     0
        cache
          ad8         ONLINE       0     0     0

errors: No known data errors

ad8 is a 40 GB SSD.
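For anyone wanting to replicate the gnop arrangement described above, here is a minimal sketch (device names taken from this pool; run as root). Note that .nop providers disappear on reboot, which is harmless once the pool exists, since ZFS records the 4K alignment (ashift) at pool creation time:

```shell
# Create transparent providers that report 4096-byte sectors
# on top of each raw disk (repeat for every disk in the pool).
gnop create -S 4096 /dev/da6
gnop create -S 4096 /dev/da9
gnop create -S 4096 /dev/da5
gnop create -S 4096 /dev/da1

# Build the pool from the .nop devices so ZFS picks up the 4K
# sector size at creation time (one raidz1 vdev shown; the real
# pool has five, plus the SSD cache device).
zpool create pool raidz1 da6.nop da9.nop da5.nop da1.nop
zpool add pool cache ad8
```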
 
Wow, your setup is looking mighty similar to mine.
Is there a reason that you have the drives distributed throughout the zpools like that?

I found that I get an improvement from 8 MB/s to 11 MB/s write speeds by changing from
Code:
zpool create store raidz1 /dev/da0.nop /dev/da1.nop /dev/da2.nop /dev/da3.nop
to
Code:
zpool create store raidz1 /dev/da0.nop /dev/da4.nop /dev/da8.nop /dev/da12.nop

and the same question goes for the ordering of the devices within the raidz's and the ordering of the raidz's.

and what speeds are you getting while writing to your raidz now? still the 16.1 MB/s you stated earlier?
 
mjrosenb said:
Is there a reason that you have the drives distributed throughout the zpools like that?

Yes, although I'm not sure of its legitimacy. My reasoning was that I wanted to un-correlate drive failures as much as possible since, for example, two drives failing in the same raidz is much worse than two drives failing with each in a different raidz. So I arranged the drive assignments such that no two drives in any raidz were physically adjacent in my case.

mjrosenb said:
and the same question goes for the ordering of the devices within the raidz's and the ordering of the raidz's.

When it comes to the ordering of the vdevs, you may have no control over that--at least no useful control. It seems highly plausible to me that zfs will decide how to stripe blocks independently of how the vdevs are ordered when you create the pool, or how they're ordered when you do a zpool status.

mjrosenb said:
and what speeds are you getting while writing to your raidz now? still the 16.1 MB/s you stated earlier?

I removed the cache device, in case it might have gotten in the way of my benchmarks.

I created a 2 GB ramdisk, copied a bunch of random data from /dev/random to it, and tried dd-ing that onto ZFS. With a bs of 4m, it claimed 150 MB/s write speeds. With a bs of 1m, it claimed 210 MB/s.
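The ramdisk-to-pool test above can be reproduced roughly like this (a sketch; the md unit number, mount points, and file sizes are assumptions):

```shell
# Stage ~2 GB of incompressible data on a swap-backed ramdisk so
# the source side cannot be the bottleneck.
mdconfig -a -t swap -s 2g -u 1
newfs /dev/md1
mount /dev/md1 /mnt/ram
dd if=/dev/random of=/mnt/ram/testdata bs=1m count=2000

# Write it to the pool with different block sizes. Note that a
# 2 GB file fits in ARC on an 8 GB machine, so these numbers
# partly measure buffered throughput, not raw disk speed.
dd if=/mnt/ram/testdata of=/pool/testfile bs=4m
dd if=/mnt/ram/testdata of=/pool/testfile bs=1m
```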

Doing a sustained rsync (of huge amounts of data--8 TB--way more than the RAM I have) from one ZFS filesystem in the pool to another, I got sustained simultaneous rates of 80 MB/s read and write.

I should also note that I've done absolutely no tuning of ZFS. Stock FreeBSD 8.0 installation, 8 GB DDR2 RAM, 3.2 GHz quad-core AMD Phenom II X4 955.
 
I can also give the physical port <--> unit mapping from my 3ware configuration, since it's not completely straightforward.

Code:
# tw_cli /c0 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u1    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u2    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u3    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u4    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u5    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u6    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u7    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u8    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u9    SINGLE    OK             -       -       -       1862.63   RiW    OFF
u10   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u11   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u12   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u13   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u14   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u15   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u16   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u17   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u18   SINGLE    OK             -       -       -       1862.63   RiW    OFF
u19   SINGLE    OK             -       -       -       1862.63   RiW    OFF

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u10  1.82 TB   SATA  0   -            WDC WD20EARS-00MVWB0
p1    OK             u11  1.82 TB   SATA  1   -            WDC WD20EARS-00MVWB0
p2    OK             u12  1.82 TB   SATA  2   -            WDC WD20EARS-00MVWB0
p3    OK             u13  1.82 TB   SATA  3   -            WDC WD20EARS-00MVWB0
p4    OK             u14  1.82 TB   SATA  4   -            WDC WD20EARS-00MVWB0
p5    OK             u0   1.82 TB   SATA  5   -            WDC WD20EARS-00J2GB0
p6    OK             u1   1.82 TB   SATA  6   -            WDC WD20EARS-00J2GB0
p7    OK             u2   1.82 TB   SATA  7   -            WDC WD20EARS-00J2GB0
p8    OK             u15  1.82 TB   SATA  8   -            WDC WD20EARS-00MVWB0
p9    OK             u16  1.82 TB   SATA  9   -            WDC WD20EARS-00MVWB0
p10   OK             u17  1.82 TB   SATA  10  -            WDC WD20EARS-00MVWB0
p11   OK             u18  1.82 TB   SATA  11  -            WDC WD20EARS-00MVWB0
p12   OK             u19  1.82 TB   SATA  12  -            WDC WD20EARS-00MVWB0
p13   OK             u3   1.82 TB   SATA  13  -            WDC WD20EARS-00J2GB0
p14   OK             u4   1.82 TB   SATA  14  -            WDC WD20EARS-00J2GB0
p15   OK             u5   1.82 TB   SATA  15  -            WDC WD20EARS-00J2GB0
p16   OK             u6   1.82 TB   SATA  16  -            WDC WD20EARS-00J2GB0
p17   OK             u7   1.82 TB   SATA  17  -            WDC WD20EARS-00J2GB0
p18   OK             u8   1.82 TB   SATA  18  -            WDC WD20EARS-00J2GB0
p19   OK             u9   1.82 TB   SATA  19  -            WDC WD20EARS-00J2GB0
 
also, for whatever reason, my VPorts start at p8 and run up to p29.
Code:
Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u1    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u2    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u3    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u4    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u5    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u6    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u7    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u8    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u9    SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u10   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u11   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u12   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u13   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u14   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u15   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u16   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u17   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u18   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u19   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u20   SINGLE    OK             -       -       -       1862.63   RiW    OFF    
u21   SINGLE    OK             -       -       -       1862.63   RiW    OFF    

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p8    OK             u0   1.82 TB   SATA  -   /c0/e0/slt0  WDC WD20EARS-00S8B1 
p9    OK             u1   1.82 TB   SATA  -   /c0/e0/slt1  WDC WD20EARS-00S8B1 
p10   OK             u2   1.82 TB   SATA  -   /c0/e0/slt2  WDC WD20EARS-00S8B1 
p11   OK             u3   1.82 TB   SATA  -   /c0/e0/slt4  WDC WD20EARS-00J2GB0
p12   OK             u4   1.82 TB   SATA  -   /c0/e0/slt6  WDC WD20EARS-00S8B1 
p13   OK             u5   1.82 TB   SATA  -   /c0/e0/slt7  WDC WD20EARS-00S8B1 
p14   OK             u6   1.82 TB   SATA  -   /c0/e0/slt8  WDC WD20EARS-00S8B1 
p15   OK             u7   1.82 TB   SATA  -   /c0/e0/slt5  WDC WD20EARS-00J2GB0
p16   OK             u8   1.82 TB   SATA  -   /c0/e0/slt9  WDC WD20EARS-00J2GB0
p17   OK             u9   1.82 TB   SATA  -   /c0/e0/slt10 WDC WD20EARS-00J2GB0
p18   OK             u10  1.82 TB   SATA  -   /c0/e0/slt11 WDC WD20EARS-00J2GB0
p19   OK             u11  1.82 TB   SATA  -   /c0/e0/slt12 WDC WD20EARS-00S8B1 
p20   OK             u12  1.82 TB   SATA  -   /c0/e0/slt13 WDC WD20EARS-00S8B1 
p21   OK             u13  1.82 TB   SATA  -   /c0/e0/slt14 WDC WD20EARS-00S8B1 
p22   OK             u14  1.82 TB   SATA  -   /c0/e0/slt15 WDC WD20EARS-00J2GB0
p23   OK             u15  1.82 TB   SATA  -   /c0/e0/slt16 WDC WD20EARS-00J2GB0
p24   OK             u16  1.82 TB   SATA  -   /c0/e0/slt17 WDC WD20EARS-00J2GB0
p25   OK             u17  1.82 TB   SATA  -   /c0/e0/slt18 WDC WD20EARS-00S8B1 
p26   OK             u18  1.82 TB   SATA  -   /c0/e0/slt19 WDC WD20EARS-00S8B1 
p27   OK             u19  1.82 TB   SATA  -   /c0/e0/slt20 WDC WD20EARS-00S8B1 
p28   OK             u20  1.82 TB   SATA  -   /c0/e0/slt22 WDC WD20EARS-00J2GB0
p29   OK             u21  1.82 TB   SATA  -   /c0/e0/slt23 WDC WD20EARS-00J2GB0
 
Just out of curiosity, have you tried doing the RAID on the 3ware controller itself? That is, not using ZFS at all, or using it on a single exported volume. That way, you could figure out whether the 3ware/WD drive combo is part of the problem.

It is sort of worrying that you can only get up to 20 MB/s writing to a single drive...
 
I have 4 EARS disks running in raidz and the performance is bad as well.
When I run gstat I can see that the reason for the bad performance is that one disk is saturated with write requests and takes a long time to complete them.
This one disk is always the same one, which makes me believe that the real reason is the ZFS intent log, which is written on that disk. IIRC the intent log writes many small blocks to the disk...
Did any of you have a look at gstat as well and see similar issues?
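For anyone who wants to check the same thing, one way to watch per-disk load during a write workload (a sketch; the device-name filter is an assumption):

```shell
# Refresh once per second, showing only the da* disks. A single
# disk sitting near 100% busy while its raidz siblings idle
# points at that disk (or its alignment) as the bottleneck.
gstat -I 1s -f '^da[0-9]+$'
```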
 
The underlying issue with EARS (aka 4K Advanced Format) drives is that the firmware lies to the OS about the size of the physical sectors. If you query the disk, it tells you it has 512 B logical sectors *AND* 512 B physical sectors. Thus, all filesystems and OSes on top try to use it like a normal hard drive with 512 B sectors, leading to all kinds of performance and misalignment issues.
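You can see the lie directly by asking the drive what it claims (a sketch; device names are assumptions, and older smartctl versions may format the sector-size line slightly differently):

```shell
# A correctly-reporting Advanced Format drive shows
# "512 bytes logical, 4096 bytes physical"; the WD20EARS
# reports 512 bytes for both.
smartctl -i /dev/da0 | grep -i 'sector'

# On FreeBSD, diskinfo shows what the kernel believes:
diskinfo -v /dev/da0 | grep -E 'sectorsize|stripesize'
```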

There's nothing that can be done until either WD fixes their firmware to report the correct physical sector size, or DES finishes his work to incorporate a workaround for these drives (see the mailing list archives for more details).

IOW, there's no point in beating this horse anymore. It's been done to death already on the forums, the mailing lists, and all over the Internet.

Avoid EARS (Advanced Format) drives from WD.

4K drives from other manufacturers work correctly, and specify proper values for logical and physical sector size.
 
It's obvious this is all about money.

WD has done a few things recently which make this clear. They've made WDTLER.EXE stop working on drives, for one.

They want people running their consumer drives to only use them in Windows or on a desktop, and they want people who are using RAID arrays to buy the more expensive RAID drives.

This is why I only use Hitachi 2TB drives and Seagate and Samsung 1TB drives.
 
wonslung said:
It's obvious this is all about money.

WD has done a few things recently which make this clear. They've made WDTLER.EXE stop working on drives, for one.

They want people running their consumer drives to only use them in Windows or on a desktop, and they want people who are using RAID arrays to buy the more expensive RAID drives.

This is why I only use Hitachi 2TB drives and Seagate and Samsung 1TB drives.

I agree with you 100%. I plan never to suggest WD to anyone after the BS I had to deal with from these drives.
 
phoenix said:
4K drives from other manufacturers work correctly, and specify proper values for logical and physical sector size.

Other manufacturers? Who else is making 4K drives at the moment?
 
vermaden said:
Watch out also for the Seagate 7200.11 series (omit them).

This isn't entirely true.

It was only one batch of 7200.11s that was flawed. I had a few of them.

Seagate not only released a firmware patch/fix for these drives, but they also extended the warranty of the drives by 3 years. I've only had one of the 7200.11 drives fail and need to be replaced, and I've had tons of them in working systems which have worked fine since I bought them.

Granted, the 7200.12s are generally a much better drive (and these are the ones I currently use), but the 7200.11s are fine if you get a good deal on them.

The main reason to go with the .12s over the .11s has nothing to do with the firmware issue (which has been solved); the reason to use the .12s is that they use less power and generally perform better. But even the .11s will outperform the 4K drives in ZFS-based systems.
 
jem said:
Other manufacturers? Who else is making 4K drives at the moment?

The entire hard drive industry is making 4K drives now.

I'm not sure how many others have hit the market yet, but I know for sure that Seagate, Maxtor, Fujitsu, and Hitachi are making them.
 
WD20EARS drives use these 'problematic/hidden' 4K sectors, but WD20EADS drives use 512 B sectors, so just use the 'non-problematic' version, at least for future purchases.
 
The latest EADS drives also have a firmware problem. You have to change a setting from MS-DOS or MS-Windows so the drive does not die after approximately 2 months (the Load_Cycle_Count bug). The ataidle and camcontrol commands cannot switch off power management on these latest drives (the older EADS drives still support it). The new series simply does not support "advanced power management" anymore and takes full control of it by itself.

I recommend avoiding all WD Green drives for a while, until WD decides to consider how operating systems (!= MS-Windows) work.
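To see whether a Green drive is affected, you can watch SMART attribute 193 (a sketch; the device name is an assumption). If the raw value climbs by thousands per day, the drive is parking its heads constantly:

```shell
# Attribute 193 counts head load/unload cycles; WD rates the
# Green series for roughly 300,000 of them over the drive's life.
smartctl -A /dev/da0 | grep -i 'Load_Cycle_Count'
```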
 
@nakal

That is the reason I sold my WD Green drives. I thought that the EARS are also 'broken' in that way (8 secs idle means power off), so I definitely OMIT WD Green drives now (while my WD Blue 3.5" and 2.5" drives, and my current WD Passport 2.5" 1TB, work OK). Seagate LP drives (5900 RPM) seem a good alternative to the WD Green here; at least I haven't heard any bad input about them.
 
I have 2 of these problematic .11 drives from Seagate. The nice thing is that you get a bootable ISO image that can upgrade the firmware, so you (mostly) don't need to install any weird operating systems to get rid of the problem.

Also, SMART on these 2 drives is totally broken; it shows
Code:
197 Current_Pending_Sector  0x0012   001   001   000    Old_age   Always       -       2047

on both drives. It looks dangerous, but they are completely OK. There are no read errors at all on the entire surface.

At the moment I prefer Samsung drives without any "green"/low-power technologies. 2 watts more, but at least the power management works as I want it to work. They are a bit noisy while starting and seeking, but they are twice as fast compared to such power-saving drives.
 