I have an HP ProLiant ML110 G6 with a 120GB SSD for my OS and two 2TB WD Enterprise edition HDDs in a zpool mirror. Tonight I started getting these messages on my screen:
Code:
smartd[11102]: Device: /dev/ada2, 10 currently unreadable (pending) sectors
smartd[11102]: Device: /dev/ada2, 44 Offline uncorrectable sectors
These messages came along with preceding CAM control errors pertaining to my backup HDD, which I had plugged in via USB. After some digging, I saw that my /dev/ada2 was about to kick the bucket, so I plugged in a new identical HDD and ran:
zpool replace mypool ada2 ada3
This then resilvered my disk and dropped ada2 from my pool. I then ran a scrub of my pool and got the same smartd error as above on my brand new ada3 drive, this time with 3 unreadable (pending) sectors but no mention of offline uncorrectable sectors.
Maybe I'm fooling myself, but I have a hard time believing that one drive died after 4 months and its replacement started dying in a matter of minutes. This is the new readout of my pool:
Code:
  pool: mypool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Fri Sep 8 02:28:27 2017
        43.8G scanned out of 191G at 78.5M/s, 0h32m to go
        0 repaired, 22.89% done
config:
        NAME          STATE     READ WRITE CKSUM
        mypool        ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            ada3      ONLINE       0     0     1
            ada1      ONLINE       0     0     0
errors: No known data errors
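In case it helps anyone reading that output, here is a rough sketch of a filter that lists only the devices whose READ, WRITE, or CKSUM counter is nonzero. The function name `zpool_nonzero_errors` is made up for illustration, and it assumes the usual `zpool status` table layout (a `NAME STATE READ WRITE CKSUM` header row followed by one row per device), so treat it as a starting point rather than anything robust.

```shell
# Hypothetical helper: reads `zpool status` text on stdin and prints
# every table row whose READ, WRITE, or CKSUM counter is not zero.
zpool_nonzero_errors() {
    awk '
        # The device table begins right after the NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The "errors:" summary line marks the end of the table.
        $1 == "errors:" { in_table = 0 }
        # Rows with fewer than 5 fields (blank lines, section labels) are skipped.
        in_table && NF >= 5 && ($3 != 0 || $4 != 0 || $5 != 0) {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

It would be used as `zpool status mypool | zpool_nonzero_errors`. Fed the readout above, it should print a single line for ada3 with cksum=1.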
I have been thinking that maybe this is because I put in 2 more 4GB sticks of ECC memory, since none of this was an issue before I did that. I have since taken out those 2 new sticks to see what happens. Any thoughts?