Hello all,
I was trying to get the ath10k driver working for my Wi-Fi card, and after several attempts to compile and load the driver (crashing the system along the way), I noticed that some directories were slow to access.
When I ran "zpool status", it showed some files as corrupted. I removed the files manually and noted which packages they came from so that I can restore them.
Now the only errors left are some invalid metadata<0x123> entries that I can't fix by running "zpool clear && zpool scrub".
My questions are:
1- I've seen some places suggesting a zpool export followed by a zpool import, but I think I can't do that on the same disk, right?
2- My SSD is almost new (3 to 4 months of use, dual-booting Gentoo/Win10 and now FreeBSD/Win10) and it has never had any issues with file errors.
ZFS says that I have checksum errors; could that be because of the invalid metadata?
3- Is the invalid metadata a problem? I mean, will it affect performance or block reuse, or cause anything harmful in the long term?
I noticed that if I remove the invalid files and then recreate them, they show up in the status as invalid again.
For example, if I add a lot of files to ZFS by running "npm install" in some of my projects, chances are high that some file will be corrupted.
The output of smartctl is:
Code:
=== START OF INFORMATION SECTION ===
Model Family: WD Blue and Green SSDs
Device Model: WDC WDS480G2G0B-00EPW0
Serial Number: 183541800480
LU WWN Device Id: 5 001b44 8b9628e44
Firmware Version: UK450000
User Capacity: 480.113.590.272 bytes [480 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: M.2
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Sep 2 12:34:29 2019 -03
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 32) The self-test routine was interrupted
by the host with a hard or soft reset.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x15) SMART execute Offline immediate.
No Auto Offline data collection support.
Abort Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 85) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 3281
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 570
165 Block_Erase_Count 0x0032 100 100 000 Old_age Always - 1025
166 Minimum_PE_Cycles_TLC 0x0032 100 100 --- Old_age Always - 7
167 Max_Bad_Blocks_per_Die 0x0032 100 100 --- Old_age Always - 0
168 Maximum_PE_Cycles_TLC 0x0032 100 100 --- Old_age Always - 15
169 Total_Bad_Blocks 0x0032 100 100 --- Old_age Always - 204
170 Grown_Bad_Blocks 0x0032 100 100 --- Old_age Always - 0
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
173 Average_PE_Cycles_TLC 0x0032 100 100 000 Old_age Always - 7
174 Unexpected_Power_Loss 0x0032 100 100 000 Old_age Always - 112
184 End-to-End_Error 0x0032 100 100 --- Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 2
188 Command_Timeout 0x0032 100 100 --- Old_age Always - 0
194 Temperature_Celsius 0x0022 063 053 000 Old_age Always - 37 (Min/Max 6/53)
199 UDMA_CRC_Error_Count 0x0032 100 100 --- Old_age Always - 0
230 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 0x025c0128025c
232 Available_Reservd_Space 0x0033 100 100 005 Pre-fail Always - 100
233 NAND_GB_Written_TLC 0x0032 100 100 --- Old_age Always - 3227
234 NAND_GB_Written_SLC 0x0032 100 100 000 Old_age Always - 11545
241 Total_Host_GB_Written 0x0030 100 100 000 Old_age Offline - 4632
242 Total_Host_GB_Read 0x0030 100 100 000 Old_age Offline - 4956
244 Temp_Throttle_Status 0x0032 000 100 --- Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Aborted by host 90% 3166 -
# 2 Short offline Completed without error 00% 2522 -
# 3 Short offline Interrupted (host reset) 90% 2070 -
# 4 Short offline Interrupted (host reset) 90% 2040 -
# 5 Short offline Interrupted (host reset) 90% 2011 -
# 6 Short offline Interrupted (host reset) 90% 2011 -
# 7 Short offline Aborted by host 80% 2011 -
# 8 Short offline Completed without error 00% 624 -
# 9 Short offline Aborted by host 30% 433 -
#10 Short offline Aborted by host 90% 401 -
#11 Short offline Completed without error 00% 400 -
#12 Short offline Completed without error 00% 321 -
#13 Short offline Completed without error 00% 106 -
#14 Short offline Self-test routine in progress 20% 106 -
#15 Short offline Aborted by host 90% 11 -
Selective Self-tests/Logging not supported
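To make the interesting counters in the attribute table above easier to spot, here is a minimal sketch that pulls out the error-related attributes with a non-zero raw value. The `WATCHED` set is my own guess at which attributes matter, not an official list, and `nonzero_watched` is just a hypothetical helper name; the sample rows are copied from the output above.

```python
import re

# Assumption: these are the attributes worth watching for signs of trouble.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Unexpected_Power_Loss",
    "UDMA_CRC_Error_Count",
    "Program_Fail_Count",
    "Erase_Fail_Count",
}

# One attribute row:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+\S+\s+(.+)$"
)


def nonzero_watched(smart_text):
    """Return {attribute_name: raw_value} for watched attributes whose raw value is not 0."""
    found = {}
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not an attribute row
        name, raw = m.group(2), m.group(3).strip()
        # Only plain integer raw values are compared; anything else is skipped.
        if name in WATCHED and raw.isdigit() and int(raw) != 0:
            found[name] = int(raw)
    return found


# Rows copied from the smartctl output above.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 0
174 Unexpected_Power_Loss 0x0032 100 100 000 Old_age Always - 112
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 2
199 UDMA_CRC_Error_Count 0x0032 100 100 --- Old_age Always - 0
"""

if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))
    # prints {'Unexpected_Power_Loss': 112, 'Reported_Uncorrect': 2}
```

On the rows shown, only Unexpected_Power_Loss (112) and Reported_Uncorrect (2) come back non-zero. Whether either one explains the ZFS checksum errors is exactly what I am unsure about.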
The output of "zpool status -v" is:
Code:
mario@freebsd-g3 ~ sudo zpool status -v
pool: zroot
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: scrub repaired 0 in 0 days 00:00:28 with 37 errors on Mon Sep 2 12:30:27 2019
config:
NAME STATE READ WRITE CKSUM
zroot ONLINE 0 0 226
ada1p6 ONLINE 0 0 482
errors: Permanent errors have been detected in the following files:
<metadata>:<0x6>
<metadata>:<0x9>
<metadata>:<0x10a>
<metadata>:<0xb>
<metadata>:<0x10c>
<metadata>:<0x10>
<metadata>:<0x16>
<metadata>:<0x117>
<metadata>:<0x12b>
<metadata>:<0x46>
<metadata>:<0x4b>
<metadata>:<0xac>
<metadata>:<0xb3>
<metadata>:<0xde>
<metadata>:<0xec>
<metadata>:<0xee>
zroot/ROOT/default:<0x0>
//usr/local/share/PySide2/glue/qtqml.cpp
zroot/ROOT/default:<0x3bbe>
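In case it helps to track the error list between scrubs, the sketch below splits the entries into objects that ZFS reports only by number (the `<dataset>:<0x...>` form) and real file paths. It assumes the list is everything after the "errors:" line, as in my output; `split_errors` is a hypothetical helper name and the sample is a shortened copy of the list above.

```python
import re

# A "<dataset>:<0xNN>" entry: an object reported by number instead of by path.
OBJECT_ENTRY = re.compile(r"^(?P<dataset>.+):<0x(?P<obj>[0-9a-fA-F]+)>$")


def split_errors(status_text):
    """Split the permanent-error list into (objects, paths).

    objects is a list of (dataset, object_number) tuples,
    paths is a list of file paths that ZFS could still name.
    """
    objects, paths = [], []
    in_list = False
    for line in status_text.splitlines():
        entry = line.strip()
        if entry.startswith("errors:"):
            in_list = True  # the entries follow this header line
            continue
        if not in_list or not entry:
            continue
        m = OBJECT_ENTRY.match(entry)
        if m:
            objects.append((m.group("dataset"), int(m.group("obj"), 16)))
        else:
            paths.append(entry)
    return objects, paths


# Shortened copy of the error list above.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:
<metadata>:<0x6>
<metadata>:<0x10a>
zroot/ROOT/default:<0x0>
//usr/local/share/PySide2/glue/qtqml.cpp
zroot/ROOT/default:<0x3bbe>
"""

if __name__ == "__main__":
    objects, paths = split_errors(SAMPLE)
    print(len(objects), "numbered objects,", len(paths), "named files")
    for path in paths:
        print("restore from package:", path)
```

In my full list, only one entry is a named file that I can restore from its package; everything else is metadata or a numbered object.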
Thank you