Offline uncorrectable sectors

Hello,

Recently I have been getting these errors:

Code:
Apr 1 18:11:05 server smartd[5286]: Device: /dev/ada0, 20 Currently unreadable (pending) sectors
Apr 1 18:11:05 server smartd[5286]: Device: /dev/ada0, 22 Offline uncorrectable sectors
Apr 1 18:41:05 server smartd[5286]: Device: /dev/ada0, 20 Currently unreadable (pending) sectors
Apr 1 18:41:05 server smartd[5286]: Device: /dev/ada0, 22 Offline uncorrectable sectors

Does anyone know more about this? Can it cause problems, or can the error be ignored? The system has 2x WDC WD20EARX-008FB0 drives mirroring each other.

Code:
server:~# smartctl -A /dev/ada0
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   184   183   021    Pre-fail  Always       -       5800
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       42
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       322
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       377
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       42
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       196
194 Temperature_Celsius     0x0022   118   109   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       20
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       22
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       33
 
ilaurens said:
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       322

That, and the error messages, mean the disk is failing. Save what data you can recover from it right now, because it can fail entirely at any second. No, this should not be ignored. Replace the drive.
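
For anyone who wants to watch these counters from a script, here is a minimal sketch rather than a definitive check. It assumes the `smartctl -A` table layout shown in this thread, where RAW_VALUE is the tenth whitespace-separated column. The function name `flag_bad_sectors` and the choice of attribute IDs (5, 196, 197, 198) are illustrative assumptions, and other drive models may format raw values differently.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the sector-health
# attributes (IDs 5, 196, 197, 198) whose RAW_VALUE is non-zero.
# Returns 1 if any attribute was flagged, 0 otherwise.
# Assumption: RAW_VALUE is column 10, as in the output in this thread.
flag_bad_sectors() {
    awk '
        $1 ~ /^(5|196|197|198)$/ && $10 + 0 > 0 {
            printf "%s %s raw=%s\n", $1, $2, $10
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Demo on two rows copied from the ada0 output above.
# On a live system the input would be: smartctl -A /dev/ada0 | flag_bad_sectors
flag_bad_sectors <<'EOF' || echo "disk needs attention"
  5 Reallocated_Sector_Ct   0x0033   193   193   140    Pre-fail  Always       -       322
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
EOF
```

On a disk where all four raw values are zero, the function prints nothing and returns 0.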
 
wblock@ said:
That, and the error messages, mean the disk is failing. Save what data you can recover from it right now, because it can fail entirely at any second. No, this should not be ignored. Replace the drive.

Hello wblock, thank you for the fast response. Can you take a look at this one as well? It does not look good either:

Code:
server:~# smartctl -A /dev/ada1
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   185   184   021    Pre-fail  Always       -       5716
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       386
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       43
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       199
194 Temperature_Celsius     0x0022   116   110   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

I bought both of them about a month ago, at the beginning of March, so for them to fail is a bit shocking. Fortunately I do not have much important data on them; they hold movies. Of course I would like to use them for important data as well, which is why I'm running them as a mirror.
 
ilaurens said:
I bought both of them about a month ago, at the beginning of March, so for them to fail is a bit shocking. Fortunately I do not have much important data on them; they hold movies. Of course I would like to use them for important data as well, which is why I'm running them as a mirror.

I'm kind of surprised at the number of errors on the other drive given its Power_On_Hours (377 hours, roughly 16 days). I'd think you should be able to get it replaced under warranty.
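
The figure in parentheses is just the raw hour count divided by 24. A small sketch, assuming the raw Power_On_Hours value really counts hours (it appears to on these drives, but some models use other units); `hours_to_days` is a hypothetical helper name:

```shell
#!/bin/sh
# Convert a Power_On_Hours raw value to days, rounded to one decimal.
# Assumption: the raw value is a plain count of hours.
hours_to_days() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 24 }'
}

hours_to_days 377   # ada0 in this thread, prints 15.7
hours_to_days 386   # ada1 in this thread, prints 16.1
```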
 
t0ken said:
I'm kind of surprised at the number of errors on the other drive given its Power_On_Hours (377 hours, roughly 16 days). I'd think you should be able to get it replaced under warranty.

Yes. I've never had any drive manufacturer refuse an RMA for even a single offline uncorrectable error.

The drive may have been damaged in transit - you'd be amazed how many "authorized" vendors ship drives in an antistatic bag and maybe one or two air pillow bags, rattling around in a cardboard box. Yup, I'm talking about you, *****g. It is possible for the heads to unlock and chip a sliver of media off a platter. This may not even be in the recordable area of the drive, but now you have a metal particle floating around inside the drive and potentially causing errors.

In the "old days", people would need to low-level format a drive before using it, and this sort of shipping damage would usually be detected during the format. For many years now, drives arrive pre-formatted and even if they accept a low-level format command, they don't actually perform a low-level format.

There have also been problems in the past where certain batches of drive platters would "rot" over time due to improper manufacturing or incomplete cleaning. The drive makers generally don't acknowledge issues like that, except under non-disclosure to their largest customers. Instead, they process RMAs where the replacement drives may have the same problem, since replacements are often refurbished, and if the original problem wasn't inside the sealed HDA (Head Disk Assembly), they won't open the HDA during the refurbishing process.
 
Terry_Kennedy said:
Yes. I've never had any drive manufacturer refuse an RMA for even a single offline uncorrectable error.

Some companies even replace the drive when there are only "pre-failure" warnings. But yeah, offline uncorrectable errors shouldn't pose a problem when returning the drive.
 
Thank you for the fast responses;

I did notice that the speed fluctuates, which might be caused by that hard disk (or by the NAS software). I have decided to wait a few more days and run more checks on it. Fortunately, there is no important data on the server and the drive is mirrored, so whether it fails or not does not matter much to me; a failure might even make things easier. Of course, I will contact the shop in the meantime about the errors. I will make sure to report back.
 
ilaurens said:
Of course, I will contact the shop in the meantime about the errors. I will make sure to report back.

You purchased this drive at a local shop?

It certainly seems to have been unused previously, since you only have about two weeks of power-on hours reported by sysutils/smartmontools. However, it could have been sitting in inventory there for some time while the warranty was ticking away.

I'd suggest using the drive manufacturer's "check warranty status" web page when you get the replacement to see if the drive warranty runs for the whole time period you expect. There are two cases where it may not:

  • If you were sold an OEM version of the drive, there is no end-user warranty from the drive manufacturer. For example, if it is a Seagate drive built for HP, Seagate expects HP to deal with warranty replacements and end users.
  • Normally, the warranty starts when the drive ships from the factory. If you have a receipt from an authorized reseller and it is still within 5 years (or whatever the warranty period is) from when you bought the drive, the manufacturer should honor the warranty. But if you bought it from a local shop, it is possible they aren't an authorized reseller and the manufacturer won't extend the warranty past its normal expiration date.

The above is the way things work in the US. If you are not in the US, laws regarding warranty performance may differ.

Also, this is for drives returned as "DOA" to the seller. If you go through the manufacturer's warranty process, you will get a drive with a warranty equal to either the remaining warranty on the original drive or some short period (30 to 120 days is typical), whichever is longer.
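
The "whichever is longer" rule for manufacturer replacements amounts to taking a maximum. A minimal sketch, using a hypothetical helper `replacement_warranty_days` and made-up day counts; the actual minimum period varies by manufacturer:

```shell
#!/bin/sh
# Warranty (in days) on a replacement drive under the rule described above:
# the longer of the original drive's remaining warranty and a short minimum.
# Both arguments are whole numbers of days and are purely illustrative.
replacement_warranty_days() {
    remaining=$1
    minimum=$2
    if [ "$remaining" -gt "$minimum" ]; then
        echo "$remaining"
    else
        echo "$minimum"
    fi
}

replacement_warranty_days 700 90   # plenty of original warranty left, prints 700
replacement_warranty_days 20 90    # original warranty nearly over, prints 90
```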
 
I decided to test both drives with SeaTools, and the one with the errors did not work correctly and produced a warranty message. So I will go to the shop tomorrow or early next week.

@Terry_Kennedy, it was bought in a local shop, so I will return it there. I do not know the warranty period, but the minimum warranty period in Europe is 2 years, so I have plenty of months left. :D
 
The WD drives I registered started the warranty on the day I registered them. The first time any drive manufacturer tries to play that "the warranty starts when the drive was built" game is the last time I buy a drive from that company.
 
wblock@ said:
The WD drives I registered started the warranty on the day I registered them. The first time any drive manufacturer tries to play that "the warranty starts when the drive was built" game is the last time I buy a drive from that company.

Yes, but didn't you have a warranty from the shop? Couldn't you take it there? And a warranty that starts on the manufacturing date? That is just crap; personally, I think you could push back on them.

I brought my drive back today and they took it without a problem. I said that it has a high reallocated sector count and that the drive shows a failure when tested with SeaTools.
But then again, it was less than a month old.
 