For the past few months I have been collecting data on an external ZFS disk connected via USB 3. The pool was working as expected, and zpool status wasn't showing any errors. I'm not sure whether I ever ran zpool scrub on it, but I think I did and it came back OK.
But a few nights ago I started copying (rsync-ing) the data to another disk. Everything was fine until rsync started to show errors.
I thought that I had accidentally moved the USB cable and disconnected the disk while it was being read, but moving the disk to an internal HDD slot didn't help.
Output of zpool status after running zpool clear and scrubbing the pool (again):
Code:
  pool: bckp-ext
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 00:49:08 with 708391 errors on Thu Sep 12 18:45:40 2019
config:

        NAME        STATE     READ WRITE CKSUM
        bckp-ext    ONLINE       0     0  699K
          ada1p1    ONLINE       0     0 1.38M

errors: 708357 data errors, use '-v' for a list
rsync errors:
Code:
slike svedska - sortirano/Linköping Gamla/17030104.jpg
read errors mapping "/mnt/bckp-ext/slike/slike svedska - sortirano/Linköping Gamla/17030104.jpg": Input/output error (5)
slike svedska - sortirano/Linköping Gamla/17030105.jpg
read errors mapping "/mnt/bckp-ext/slike/slike svedska - sortirano/Linköping Gamla/17030105.jpg": Input/output error (5)
slike svedska - sortirano/Linköping Gamla/17030106.jpg
...
WARNING: slike svedska - sortirano/Linköping Gamla/17030104.jpg failed verification -- update discarded (will try again).
WARNING: slike svedska - sortirano/Linköping Gamla/17030105.jpg failed verification -- update discarded (will try again).
WARNING: slike svedska - sortirano/Linköping Gamla/17030106.jpg failed verification -- update discarded (will try again).
Code:
zpool status -xv | grep 1703010
...
/mnt/bckp-ext/slike/slike svedska - sortirano/Linköping Gamla/17030104.jpg
/mnt/bckp-ext/slike/slike svedska - sortirano/Linköping Gamla/17030105.jpg
/mnt/bckp-ext/slike/slike svedska - sortirano/Linköping Gamla/17030106.jpg
...
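To double-check that the files rsync fails on are the same ones ZFS reports as damaged, a comparison along these lines could be scripted. This is only a rough Python sketch: the function names and the idea of saving both outputs as text first are my own assumptions, not anything provided by rsync or ZFS, and the regular expression only covers the exact "read errors mapping" message shown above.

```python
import re

# Matches rsync's read-error lines, which quote the full path of the file.
# Assumption: only the "Input/output error (5)" form seen in this thread is handled.
RSYNC_ERR = re.compile(
    r'^read errors mapping "(?P<path>.+)": Input/output error \(5\)$'
)


def rsync_failed_paths(rsync_log):
    """Return the set of full paths that rsync reported as unreadable."""
    paths = set()
    for line in rsync_log.splitlines():
        match = RSYNC_ERR.match(line.strip())
        if match:
            paths.add(match.group("path"))
    return paths


def zfs_damaged_paths(status_output, mountpoint):
    """Return the lines of saved `zpool status -v` output that are paths under mountpoint."""
    prefix = mountpoint.rstrip("/") + "/"
    return {
        line.strip()
        for line in status_output.splitlines()
        if line.strip().startswith(prefix)
    }


if __name__ == "__main__":
    # Hypothetical file names: the two outputs saved to disk beforehand.
    with open("rsync.log", encoding="utf-8") as fh:
        failed = rsync_failed_paths(fh.read())
    with open("zpool-status.txt", encoding="utf-8") as fh:
        damaged = zfs_damaged_paths(fh.read(), "/mnt/bckp-ext")
    print("rsync failures not listed by ZFS:", sorted(failed - damaged))
    print("ZFS-listed files rsync has not hit yet:", len(damaged - failed))
```

If the first list comes out empty, every rsync failure is accounted for by the checksum errors ZFS already knows about.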
Code:
# zdb -l /dev/ada1p1
------------------------------------
LABEL 0
------------------------------------
version: 5000
name: 'bckp-ext'
state: 0
txg: 15108279
pool_guid: 7930193435851463028
hostid: 1061846452
hostname: 'ProjectBSD'
top_guid: 3352315782749581932
guid: 3352315782749581932
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 3352315782749581932
path: '/dev/ada1p1'
whole_disk: 1
metaslab_array: 37
metaslab_shift: 31
ashift: 12
asize: 246955900928
is_log: 0
DTL: 102
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
------------------------------------
LABEL 1
------------------------------------
version: 5000
name: 'bckp-ext'
state: 0
txg: 15108279
pool_guid: 7930193435851463028
hostid: 1061846452
hostname: 'ProjectBSD'
top_guid: 3352315782749581932
guid: 3352315782749581932
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 3352315782749581932
path: '/dev/ada1p1'
whole_disk: 1
metaslab_array: 37
metaslab_shift: 31
ashift: 12
asize: 246955900928
is_log: 0
DTL: 102
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
------------------------------------
LABEL 2
------------------------------------
version: 5000
name: 'bckp-ext'
state: 0
txg: 15108279
pool_guid: 7930193435851463028
hostid: 1061846452
hostname: 'ProjectBSD'
top_guid: 3352315782749581932
guid: 3352315782749581932
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 3352315782749581932
path: '/dev/ada1p1'
whole_disk: 1
metaslab_array: 37
metaslab_shift: 31
ashift: 12
asize: 246955900928
is_log: 0
DTL: 102
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
------------------------------------
LABEL 3
------------------------------------
version: 5000
name: 'bckp-ext'
state: 0
txg: 15108279
pool_guid: 7930193435851463028
hostid: 1061846452
hostname: 'ProjectBSD'
top_guid: 3352315782749581932
guid: 3352315782749581932
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 3352315782749581932
path: '/dev/ada1p1'
whole_disk: 1
metaslab_array: 37
metaslab_shift: 31
ashift: 12
asize: 246955900928
is_log: 0
DTL: 102
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
smartctl -a /dev/ada0
Code:
smartctl 7.0 2018-12-30 r4883 [FreeBSD 12.0-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Scorpio Blue Serial ATA (AF)
Device Model: WDC WD2500BPVT-22JJ5T0
Serial Number: WD-WX31A13F3617
LU WWN Device Id: 5 0014ee 6587db803
Firmware Version: 01.01A01
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Thu Sep 12 18:36:11 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 7260) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 75) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x7035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   144   141   021    Pre-fail  Always       -       1800
  4 Start_Stop_Count        0x0032   085   085   000    Old_age   Always       -       15175
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1759
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       210
191 G-Sense_Error_Rate      0x0032   016   016   000    Old_age   Always       -       84
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       66
193 Load_Cycle_Count        0x0032   131   131   000    Old_age   Always       -       208731
194 Temperature_Celsius     0x0022   105   100   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The smartctl output has barely changed over the last couple of days (diff between two runs):
Code:
16c16
< Local Time is: Wed Sep 11 07:47:27 2019 CEST
---
> Local Time is: Thu Sep 12 17:57:27 2019 CEST
61c61
< 4 Start_Stop_Count 0x0032 085 085 000 Old_age Always - 15139
---
> 4 Start_Stop_Count 0x0032 085 085 000 Old_age Always - 15175
64c64
< 9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1725
---
> 9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1759
70,71c70,71
< 193 Load_Cycle_Count 0x0032 131 131 000 Old_age Always - 208689
< 194 Temperature_Celsius 0x0022 116 100 000 Old_age Always - 27
---
> 193 Load_Cycle_Count 0x0032 131 131 000 Old_age Always - 208731
> 194 Temperature_Celsius 0x0022 108 100 000 Old_age Always - 35
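To make the diff above easier to read, the change in each raw value between the two runs could be computed with something like the following. This is just a sketch under my own assumptions: the function name is made up, it expects the plain output of diff on two saved smartctl reports, and it only understands attribute rows whose RAW_VALUE is a single integer (as is the case for this disk).

```python
def smart_raw_deltas(diff_text):
    """Map attribute name -> (old_raw, new_raw, delta) from a diff of two smartctl reports."""
    old, new = {}, {}
    for line in diff_text.splitlines():
        fields = line.split()
        # Attribute rows in the diff have 11 whitespace-separated fields:
        # marker, ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW
        if len(fields) != 11 or fields[0] not in ("<", ">"):
            continue
        if not fields[1].isdigit() or not fields[10].isdigit():
            continue
        target = old if fields[0] == "<" else new
        target[fields[2]] = int(fields[10])
    # Only report attributes that appear on both sides of the diff.
    return {
        name: (old[name], new[name], new[name] - old[name])
        for name in old
        if name in new
    }


if __name__ == "__main__":
    # Hypothetical file name: the diff output saved to disk beforehand.
    with open("smart.diff", encoding="utf-8") as fh:
        for name, (before, after, delta) in smart_raw_deltas(fh.read()).items():
            print(f"{name}: {before} -> {after} ({delta:+d})")
```

For the diff above this works out to 34 more power-on hours, 36 more start/stops and 42 more load cycles, with none of the error-related attributes appearing in the diff at all.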
There are some valuable things on that disk (vacation photos from past years and the like) that were copied onto this temporary disk (not the best backup strategy, I know), so any help will be greatly appreciated.