ZFS error detected in the following files

[FONT=Arial]I am running the latest release of FreeNAS on a Supermicro server that I repurposed. It was originally a VMware host, so it is pretty high-end for a NAS server. Anyway, I boot from a USB drive, and the eight 2TB SAS drives are controlled by a MegaRAID SAS controller and configured as RAID 10. All was running fine until I lost one drive (5). It is in a remote location, so I had someone pull the bad drive and bring it to me. When I checked the server I saw another drive offline (4). Unfortunately it was the mirror of the drive that was already missing. Luckily this remote NAS is just a backup of a backup.

I stopped the iSCSI service and wiped the volume from FreeNAS, but this was after FreeNAS reported a corrupt volume. Anyway, I completely reinitialized the RAID 10 from the MegaRAID controller and then re-added the volume to FreeNAS. I created a new empty data file for iSCSI to use and added it back. The next day I received an e-mail from FreeNAS that an error was detected on a file in the iSCSI volume. I deleted the file that was reported by zpool status -xv and then ran zpool status again, and now it is reporting another. It took almost 24 hours to restore the backup files, but worst case I guess I wipe out everything, including the FreeNAS install, and start over.[/FONT]
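For reference, the delete-and-rescan cycle described above looks roughly like this (a sketch only; the placeholder file name stands in for whatever zpool reports on your system):

```shell
# List files with permanent errors on the affected pool
zpool status -xv iSCSI0
# Remove the reported file (only works if it resolves to a real path)
rm "/mnt/iSCSI0/<file-reported-by-zpool>"
# Re-scan the pool; further errors may surface as the scrub progresses
zpool scrub iSCSI0
# Check whether the error list is now clear
zpool status -xv iSCSI0
```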

[FONT=Courier New]This is currently what zpool is telling me:
[root@RVDPA-NAS00 /mnt/iSCSI0]# zpool status -xv
pool: iSCSI0
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: scrub in progress since Fri Dec 16 13:47:04 2016
54.9M scanned out of 1.24T at 1.28M/s, 282h49m to go
0 repaired, 0.00% done
config:

NAME                                         STATE     READ WRITE CKSUM
iSCSI0                                       ONLINE       0     0     1
  gptid/77ce790e-a60b-11e6-b0bb-00259098d838 ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

iSCSI0/.system/syslog-7f4d67ae16c94917b949456bb9f364ad:<0x25>
[root@RVDPA-NAS00 /mnt/iSCSI0]# [/FONT]

[FONT=Arial]I'm not sure what it is pointing at because it doesn't look like a file. The iSCSI0 volume is at /mnt/iSCSI0 and the only two files in that folder are data and jails. data is the empty container file I created for the iSCSI service to use.[/FONT]
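(For anyone else hitting this: when zpool status cannot resolve a damaged object back to a path, it prints the dataset plus the object number in angle brackets, so `<0x25>` is object 0x25 (37 decimal) inside the hidden `.system/syslog` dataset, not a file under /mnt/iSCSI0. If you want to poke at it, zdb can dump the object read-only; this is a sketch, and the exact output format varies by ZFS version:)

```shell
# Dump metadata for object 37 (0x25) in the .system/syslog dataset
zdb -dddd iSCSI0/.system/syslog-7f4d67ae16c94917b949456bb9f364ad 37
```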

[FONT=Courier New][root@RVDPA-NAS00 /mnt/iSCSI0]# ls -lah
total 2660642797
drwxr-xr-x 3 root wheel 4B Dec 14 08:20 .
drwxr-xr-x 3 root wheel 128B Dec 13 15:58 ..
-rwxrwxrwx 1 root wheel 10T Dec 16 08:44 data
drwxr-xr-x 2 root wheel 2B Nov 8 17:41 jails
[root@RVDPA-NAS00 /mnt/iSCSI0]#
[/FONT]
[FONT=Arial]Is there a way to fix this?[/FONT]
 
[FONT=Arial] the 8 2TB SAS drives are controlled by a MegaRAID SAS controller and configured as RAID 10.[/FONT]

ZFS has no way of detecting which drive failed or returned corrupted data - the RAID controller completely hides the existence of the individual drives from ZFS. As stated constantly in the documentation and in every ZFS tutorial: don't use HW-RAID!

As there is no redundancy at the vdev level, ZFS can't correct errors (and can't even attribute them properly, because the RAID controller returns data from whichever drive responds first). The only safe way forward is to restore from backup onto a pool that is built directly on the drives, with redundancy handled by ZFS.
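As a sketch of what that rebuilt pool could look like with these eight disks - device names (da0..da7) are assumptions and depend on how the controller enumerates them, and this requires the MegaRAID to pass the disks through individually (JBOD mode) or be replaced with a plain HBA:

```shell
# ZFS equivalent of RAID 10: four two-way mirrors striped together.
# ZFS now sees every disk, so checksum errors can be traced to a
# specific drive and self-healed from its mirror partner.
zpool create iSCSI0 \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7
```

This gives the same usable capacity as the current RAID 10 but keeps redundancy under ZFS control.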
 