AMD Brazos platform and WD30EZRX problems

Arnoud · Oct 12, 2011

On FreeBSD 8.2-RELEASE-p3

I have an ASUS E35M-m1 PRO motherboard, which uses the AMD Brazos platform with Hudson M1.
Recently I bought 5 WD30EZRX disks (3TB, SATA 6Gb/s).
My other disks are Seagate ST31000340AS and ST31000330AS (1TB, SATA 3Gb/s).
System disk is a Seagate ST380815AS 4.AAB.
I also use Intel SASUC8i and Highpoint RocketRaid 620 PCIe SATA/SAS cards.
I use ZFS on every disk.

The new 3TB disks give me lots of problems on the ata ports:
- warning - write_dma
- timeout - read_dma
- sudden reboots

I RMA'd one disk (ad12) because there were some S.M.A.R.T. errors, which were confirmed, and I am now waiting to get that disk back.
Meanwhile another disk started giving problems after more than 24 hours of copying.

Code:

ad8: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=422471168

Also I see (for the Highpoint RocketRaid 620):

Code:

atapci1: <Marvell AHCI controller> port 0xd040-0xd047,0xd030-0xd033,0xd020-0xd027,0xd010-0xd013,0xd000-0xd00f mem 0xfe910000-0xfe9107ff irq 16 at device 0.0 on pci3
atapci1: [ITHREAD]
atapci1: AHCI v1.00 controller with 2 6Gbps ports, PM supported
ata8: <ATA channel 0> on atapci1
ata8: port is not ready (timeout 0ms) tfd = 00000180
ata8: software reset clear timeout
ata8: port is not ready (timeout 0ms) tfd = 00000180
ata8: software reset clear timeout
ata8: [ITHREAD]
ata9: <ATA channel 1> on atapci1
ata9: [ITHREAD]

Original setup with no problems:
- 3 1TB disks and system disk on onboard controller
- 4 1TB on 2x Highpoint RocketRaid 620

Current setup with problems:
- 5 3TB disks on onboard controller
- system disk on Highpoint RocketRaid 620
- 7 1TB on Intel SASUC8i (~LSI SAS3081E-R)

dmesg also shows some ACPI warnings, errors and exceptions:

Code:

real memory  = 8589934592 (8192 MB)
avail memory = 8199028736 (7819 MB)
ACPI APIC Table: <ALASKA A M I>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20101013/tbfadt-655)
ioapic0: Changing APIC ID to 0
ioapic0 <Version 2.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <ALASKA A M I> on motherboard
acpi0: [ITHREAD]
ACPI Error: [RAMB] Namespace lookup failure, AE_NOT_FOUND (20101013/psargs-464)
ACPI Exception: AE_NOT_FOUND, Could not execute arguments for [RAMW] (Region) (20101013/nsinit-452)
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900

Zhwazi · Oct 12, 2011

It's probably those green drives. Green + RAID = bad. They are consumer class drives not intended for use in RAID.

If you google for "wd30ezrx raid" one of the results features the following statement on the page where those drives are being sold:

*Business Critical RAID Environments - WD Caviar Green Hard Drives are not recommended for and are not warranted for use in RAID environments utilizing Enterprise HBAs and/or expanders and in multi-bay chassis, as they are not designed for, nor tested in, these specific types of RAID applications. For all Business Critical RAID applications, please consider WD's Enterprise Hard Drives that are specifically designed with RAID-specific, time-limited error recovery (TLER), are tested extensively in 24x7 RAID applications, and include features like enhanced RAFF technology and thermal extended burn-in testing.

arp242 · Oct 12, 2011

WD GP drives work fine in a RAID setup. In fact, there is absolutely no difference between a "normal" or "RAID" setup as far as the disks are concerned. Are WD Enterprise disks *more* reliable? Yes, but they're also twice as expensive...

Just to be clear: the disks that give errors are *only* the new 3TB drives, which are all connected to the onboard controller that used to work fine (and not to the new Intel controller)? And the other drives on the other controllers give no errors?

I assume that you checked the SMART status of the offending disk (ad8 in this case)?

What kind of power supply are you using? According to the WD website, the WD30EZRX requires 1.78A on the 12V rails. So that's a total of 149W for 7 disks.
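
The arithmetic behind that estimate, as a minimal sketch. The 1.78A figure and the count of 7 disks are taken from the post; the function name `twelve_volt_load_watts` is made up for illustration, and the calculation ignores the 5V rail and every other component in the box.

```python
def twelve_volt_load_watts(amps_per_disk: float, disks: int, volts: float = 12.0) -> float:
    """Return the combined load on the 12V rail in watts (P = V * I per disk)."""
    return amps_per_disk * volts * disks


per_disk = twelve_volt_load_watts(1.78, 1)
total = twelve_volt_load_watts(1.78, 7)
print(f"{per_disk:.2f} W per disk, {total:.2f} W for 7 disks")
# prints "21.36 W per disk, 149.52 W for 7 disks"
```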

Zhwazi · Oct 12, 2011

Carpetsmoker said:
WD GP drives work fine in a RAID setup. In fact, there is absolutely no difference between a "normal" or "RAID" setup as far as the disks are concerned. Are WD Enterprise disks *more* reliable? Yes, but they're also twice as expensive...

I see far more people having problems with WD Green drives in RAID configurations than any other kind of drive, and I've seen lots of Samsung, Hitachi, and Seagate. Greens won't fail absolutely every time, but it's far from "working fine". There are differences between desktop-class drives and RAID-class drives, particularly in firmware. Here's a relevant and particularly obnoxious and common example that causes a lot of misbehavior of Green drives on RAID controllers: http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery

arp242 · Oct 12, 2011

Yeah, you'll want to run the wdidle tool to change the timeouts. Other than that, most of the problems in my experience come from the 4K sector size ... I stumbled across this:
http://forums.freebsd.org/showpost.php?p=109071&postcount=31

I don't know how relevant this is now. Do these disks still report having 512-byte sectors (you can check with diskinfo -v)? And have the ZFS tools been modified in the meantime?
I don't think this relates to the original problem though, but I must admit I have no experience with ZFS. Perhaps Freddie can pitch in here ...

Zhwazi · Oct 12, 2011

Timeout issues cause disks to fail from RAID arrays. And you'll notice the original post indicates he's getting timeouts. As far as I'm aware, the 4k sector issue just causes terrible performance and probably undue drive wear, but I am not aware of any mechanism by which 4K sector disks would be more likely to time out than 512B disks, nor have I seen any evidence that they are.

Whether it's for the 4K sector issue or the crappy timeouts or the fact that many if not all of those Green drives put themselves to sleep after several seconds of idleness, the advice to avoid WD Green, no matter how cheap they are, probably still holds.

Arnoud · Oct 12, 2011

Drives without TLER or similar give problems on RAID controllers, not on HBA cards or onboard controllers.
For the moment I use an Antec TrioPower 650W PSU, but I want to switch to one with a lower wattage (should save more power).
All hard disks use 4k sectors.

Code:

nas# zdb | grep ashift
                ashift=9
                ashift=9
                ashift=12

The last one is for the 3TB disks.
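
For reference, ashift is the base-2 logarithm of the smallest block size ZFS will use on a vdev, so the values in the zdb output translate to bytes as in this small sketch (the helper name `ashift_to_bytes` is hypothetical):

```python
def ashift_to_bytes(ashift: int) -> int:
    """ZFS ashift is the base-2 exponent of the vdev's minimum block size."""
    return 1 << ashift


for ashift in (9, 9, 12):  # values from the zdb output
    print(f"ashift={ashift} -> {ashift_to_bytes(ashift)} bytes")
```

So ashift=9 means 512-byte blocks and ashift=12 means 4096-byte blocks.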

I did not know of the wdidle tool. Might try that.

As for S.M.A.R.T.:

Code:

nas# smartctl -a /dev/ad8
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD30EZRX-00MMMB0
Serial Number:    WD-WCAWZ1162714
LU WWN Device Id: 5 0014ee 2b0d477bf
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Oct 13 00:33:59 2011 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (48300) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   235   151   021    Pre-fail  Always       -       5250
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       55
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       197
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       53
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       51
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       343
194 Temperature_Celsius     0x0022   123   113   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   066   066   000    Old_age   Always       -       65503
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I thought everything was OK, but what about the Current Pending Sector stuff?
Also a DOA?
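
A minimal sketch of how one might pick the worrying rows out of a `smartctl -a` attribute table like the one in this post. The set of attributes to watch and the function name `nonzero_watched` are assumptions chosen for illustration, not an official list; the sample rows are copied from the output for ad8.

```python
# Attributes where a non-zero raw value usually deserves a closer look.
# This selection is an assumption for illustration, not an official list.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def nonzero_watched(smart_rows: str) -> dict:
    """Map watched attribute names to their raw value when it is non-zero."""
    found = {}
    for line in smart_rows.splitlines():
        fields = line.split()
        # An attribute row has 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


sample = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   066   066   000    Old_age   Always       -       65503
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""
print(nonzero_watched(sample))
# prints {'Current_Pending_Sector': 65503}
```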

arp242 · Oct 13, 2011

Yeah, that looks like another DOA!

re: GP drives timeout values/wdidle:
I've never seen drives error on this; the problem is that they keep spinning down very fast, reducing lifetime. It's a long-term problem, not something that you'll notice in the short term.

Zhwazi · Oct 13, 2011

Carpetsmoker said:
Yeah, that looks like another DOA!

re: GP drives timeout values/wdidle:
I've never seen drives error on this; the problem is that they keep spinning down very fast, reducing lifetime. It's a long-term problem, not something that you'll notice in the short term.

I think there's some confusion, I was talking about TLER, and it looks like you were talking about the head parking during moments of idleness. Rereading it now makes more sense.

As far as I'm aware TLER doesn't only affect drives on RAID controllers, although RAID cards are much less lenient with non-responsive drives.

arp242 · Oct 13, 2011

Zhwazi said:
I think there's some confusion, I was talking about TLER, and it looks like you were talking about the head parking during moments of idleness. Rereading it now makes more sense.

Aha, and your comments now make more sense to me.

serverhamster · Oct 13, 2011

I have a very similar setup, but with 6x 3TB WD30EZRS drives (SATA 2, not 3).
The RAIDZ1 performs very well and has proven to be reliable so far. TLER hasn't caused any disks to be thrown out of the array. The TLER Wikipedia article talks about RAID, but are we sure enabling TLER is also recommended when using RAID-Z? After all, it's not exactly the same. (I don't know the specifics.)

As for the head parking: that doesn't sound like a large issue to me unless your drives see a lot of short bursts of activity. If that results in a lifetime of 8 years[1] instead of 10, it's not that bad.

[1] Yes, I might be a tad too optimistic.

Zhwazi · Oct 13, 2011

Enabling TLER for use in any kind of redundant array of disks makes sense, whether it's hardware or software. ZFS behaves a lot like a RAID controller in that, if the drive simply reports that it has a bad block or there is a checksum error, ZFS will try to write the correct data back to the block from the other mirrors/stripes and parity, and the operation won't block all the I/O to the array while the system waits for the one drive for very long. Without TLER the disk would spend up to almost a minute trying to get data back on its own when it would take a fraction of a second to recalculate the block and write it back to the disk. You may not see the timeouts because ZFS is probably less strict than a hardware RAID controller, but it should still help.
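
To make the timing argument concrete, here is a toy model. All numbers are illustrative assumptions: the only figure from this thread is that a drive without TLER may spend up to almost a minute on recovery. The 7-second recovery limit and the 20-second host timeout are hypothetical values, not taken from any datasheet or from FreeBSD, and `io_outcome` is a made-up name.

```python
def io_outcome(drive_recovery_s: float, host_timeout_s: float) -> str:
    """Classify one failed read in a redundant array (toy model)."""
    if drive_recovery_s <= host_timeout_s:
        # The drive gives up and reports the error in time, so the array
        # can rebuild the block from redundancy and rewrite it.
        return "error reported, block rebuilt from redundancy"
    # The drive is still retrying when the host gives up on the command.
    return "command timed out, drive may be dropped from the array"


HOST_TIMEOUT_S = 20.0  # hypothetical host/controller timeout

print(io_outcome(7.0, HOST_TIMEOUT_S))   # drive with time-limited recovery
print(io_outcome(55.0, HOST_TIMEOUT_S))  # drive retrying for almost a minute
```

The point of the sketch is only the comparison: what matters is whether the drive reports the error before the host loses patience, not how long a single recovery takes in absolute terms.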

Arnoud · Oct 13, 2011

RMA was approved, so I will wait for the 2 disks to return and then hope it's ok.

What about the ACPI warnings and the timeouts on the RocketRaid 620? (See first post)
Any idea?

mav@ · Oct 13, 2011

Arnoud said:
What about the ACPI warnings and the timeouts on the RocketRaid 620? (See first post)
Any idea?

The Marvell chips used in the RocketRaid 620 are not very AHCI-compatible. The newer ahci(4) driver includes the required workarounds for them.

Arnoud · Dec 1, 2011

I have 6 WDC WD30EZRX-00MMMB0 disks that give SMART errors. Two were DOA and RMA'd to the reseller a few weeks ago. Both give errors again, and so do two of the other disks. I am using FreeBSD 8.2-RELEASE-p3 amd64. Motherboard is ASUS E35M-m1 PRO (uses the AMD Brazos platform with Hudson M1). Onboard SATA controller + Highpoint RocketRaid 620. (I also have an Intel SASUC8i with 7 disks; none of them give errors.) All the disks are properly mounted and cooled using Supermicro CSE-M35T-1B Mobile Racks.

SATA port / serial number / errors:

Code:

Thu Oct 13 2011:
ad4  WD-WCAWZ113**** SMART OK
ad6  WD-WCAWZ112**** SMART OK
ad8  WD-WCAWZ116**** Current_Pending_Sector=65503
ad10 WD-WCAWZ114**** SMART OK
ad12 WD-WCAWZ110**** Reallocated_Sector_Ct & Reallocated_Event_Count
=> RMA'd ad8 and ad12
Thu Dec  1 2011:
ad4  WD-WCAWZ113**** SMART OK
ad6  WD-WCAWZ112**** Current_Pending_Sector=65533
ad8  WD-WCAWZ106**** Current_Pending_Sector=65535
ad10 WD-WCAWZ114**** Raw_Read_Error_Rate=7 Seek_Error_Rate=37
ad12 WD-WCAWZ114**** Current_Pending_Sector=4,Offline_Uncorrectable=4,Multi_Zone_Error_Rate=2086

Is this very very bad luck? A bad batch? Could it be a FreeBSD problem? Chipset problem?

Arnoud · Dec 20, 2011

I replaced all the Western Digital disks with Hitachi 5k3000 and the problems are gone.

xibo · Jan 2, 2012

Do the 5k3000s allow enabling TLER? And also, is there a way to force them to use 5400rpm (instead of spinning up and down)?

Just curious as I'm interested in getting some of those, too.

Arnoud · Jan 5, 2012

xibo said:
Do the 5k3000s allow enabling TLER? And also, is there a way to force them to use 5400rpm (instead of spinning up and down)?

Just curious as I'm interested in getting some of those, too.

I think you are referring to WD disks and not the Hitachi 5k3000. TLER is something only Western Digital uses. Hitachi/Samsung use CCTL. Also, the Hitachi 5k3000 spin somewhere around 5900rpm. They use a motor that uses less power and generates less heat (Coolspin). They do not spin down unless you ask them to.

Software RAID, like RAID-Z, should not have any problems with any disks, regardless of the TLER/CCTL feature.

Many Western Digital drives just have incompatibility problems, I guess. I have used many disks in the past, for all sorts of RAID configurations. Only WDs gave problems, even with software RAID!

The new 5k3000s are running like a charm, by the way.