I have replaced a failed drive in my NAS and am having real difficulty getting ZFS to recognise this. There have been a few stages to this problem, so I'll describe them here:
I set up my FreeNAS with 8.0.0 a few months ago with 3x Samsung 2 TB F4EG (HD204UI) drives in a raidz1 configuration. A few weeks ago some S.M.A.R.T. errors appeared (197/0xC5 Current_Pending_Sector had a raw value of 2). A few days later the drive failed altogether, first slowing a scrub right down, with multiple read and write errors appearing on that drive.
I exported the pool, shut down, removed the failed drive and installed 8.0.2 onto a new USB stick. I booted from that stick and imported my freenas-1.db. The NAS found the (degraded) pool and everything was as I expected. I ran a scrub on the degraded pool just to make sure the two remaining drives were fine.
This is where the confusion starts. I put an identical Samsung drive in place of the failed one, and noticed in the GUI that I now had two ada2 drives. On the terminal, I 'offline'd the failed drive. In the web GUI I tried to replace the failed drive with the new one, but when the resilvering was done, the failed drive was still in the list, the pool stayed 'degraded', and the status still read 'replacing' even though the resilver had finished.
Now when I did a scrub, I got some fresh problems:
Code:
zpool status -v
  pool: pool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 10h9m with 1 errors on Mon Jan 9 19:38:49 2012
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool                                            DEGRADED     2     0     0
          raidz1                                        DEGRADED     2     0     0
            gptid/41542893-cfe7-11e0-8f22-78acc0f799d0  ONLINE       0     0     0
            3150351496029849676                         UNAVAIL      0     0     0  was /dev/gpt/ada1
            ada1p2                                      ONLINE       2     0     0

errors: List of errors unavailable (insufficient privileges)
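To make the table above easier to read at a glance, here is a rough Python sketch that pulls out every row that is either not ONLINE or has a non-zero error counter. The function names (parse_config, suspicious) are made up for illustration and are not part of FreeNAS or ZFS. It assumes the counters are plain integers, as they are in this output. It only summarises what zpool already printed and cannot tell whether the read errors on ada1p2 come from a failing disk.

```python
# Config rows copied from the `zpool status -v` output above.
STATUS_TEXT = """\
NAME STATE READ WRITE CKSUM
pool DEGRADED 2 0 0
raidz1 DEGRADED 2 0 0
gptid/41542893-cfe7-11e0-8f22-78acc0f799d0 ONLINE 0 0 0
3150351496029849676 UNAVAIL 0 0 0 was /dev/gpt/ada1
ada1p2 ONLINE 2 0 0
"""


def parse_config(text):
    """Return one dict per device row of the config table (header skipped)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A device row needs at least NAME STATE READ WRITE CKSUM.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, state, read, write, cksum = fields[:5]
        # Assumption: counters are plain integers; rows with abbreviated
        # counters (for example "1.2K") are skipped by this sketch.
        if not (read.isdigit() and write.isdigit() and cksum.isdigit()):
            continue
        rows.append({
            "name": name,
            "state": state,
            "read": int(read),
            "write": int(write),
            "cksum": int(cksum),
            "note": " ".join(fields[5:]),  # e.g. "was /dev/gpt/ada1"
        })
    return rows


def suspicious(rows):
    """Rows that are not ONLINE or that report any read/write/checksum errors."""
    return [
        r for r in rows
        if r["state"] != "ONLINE" or r["read"] or r["write"] or r["cksum"]
    ]


if __name__ == "__main__":
    for row in suspicious(parse_config(STATUS_TEXT)):
        print(row["name"], row["state"], row["read"], row["write"], row["cksum"], row["note"])
```

Run against the output above, this flags the stale 3150351496029849676 entry and ada1p2, plus the pool and raidz1 rows that aggregate them.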
How do I get rid of 3150351496029849676 forever, and make the new drive (in the ada1 slot now) resilver properly? I can live with losing one file, but I don't want to lose any more!
Does ada1p2 also appear to have a real failure, or could this have happened some other way?