UFS external USB Drive - Cannot find file system superblock - Repair possible ?

veryfoot · Dec 27, 2017

Hi all,

I have an external USB 2.0 hard drive formatted with UFS. It was working perfectly, but after a power outage I am no longer able to mount or fsck it.

Code:

freebsd-version
11.1-RELEASE-p6

Code:

dmesg | grep ^da
da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
da0: <ST950032 5AS > Fixed Direct Access SCSI-2 device
da0: Serial Number 006000003DC7
da0: 40.000MB/s transfers
da0: 476940MB (976773168 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>

Code:

usbconfig | grep -i iomega
ugen4.2: <Iomega Select HDD Iomega> at usbus4, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)

Code:

gpart show
=>       34  156301421  ada0  GPT  (75G)
        34        128     1  freebsd-boot  (64K)
       162  146800512     2  freebsd-ufs  (70G)
  146800674    7815168     3  freebsd-swap  (3.7G)
  154615842    1685613        - free -  (823M)

=>       63  976773105  da0  MBR  (466G)
        63  976773105    1  freebsd  [active]  (466G)

=>        0  976773105  da0s1  BSD  (466G)
         0  976773105      1  freebsd-ufs  (466G)

=>       63  976773105  diskid/DISK-006000003DC7  MBR  (466G)
        63  976773105                         1  freebsd  [active]  (466G)

=>        0  976773105  diskid/DISK-006000003DC7s1  BSD  (466G)
         0  976773105                           1  freebsd-ufs  (466G)

Code:

fsck -t ufs -y /dev/da0s1a
** /dev/da0s1a
Cannot find file system superblock

LOOK FOR ALTERNATE SUPERBLOCKS? yes

SEARCH FOR ALTERNATE SUPER-BLOCK FAILED. YOU MUST USE THE
-b OPTION TO FSCK TO SPECIFY THE LOCATION OF AN ALTERNATE
SUPER-BLOCK TO SUPPLY NEEDED INFORMATION; SEE fsck_ffs(8).

Is there a way to fix this ?

Thanks

Regards,

Vince

veryfoot · Dec 27, 2017

Update :

Code:

newfs -N /dev/da0s1a
/dev/da0s1a: 476940.0MB (976773104 sectors) block size 32768, fragment size 4096
    using 762 cylinder groups of 626.22MB, 20039 blks, 80256 inodes.
super-block backups (for fsck_ffs -b #) at:
 192, 1282688, 2565184, 3847680, 5130176..............973414656, 974697152, 975979648

Code:

fsck_ufs -y -b 192 /dev/da0s1a
Alternate super block location: 192
** /dev/da0s1a
192 is not a file system superblock

Any help would be appreciated.

Regards,

Sensucht94 · Dec 27, 2017

Hi. When using fsck, try replacing da0s1a with da0s1, in accordance with the name shown in your output.

Looking at your gpart show output, I guess you chose an MBR partition table with the "legacy" freebsd partition type (see gpart(8)), which in turn contains the single da0s1 slice you mentioned above. Slices are named da0s* (da0s1, da0s2...).

With a GPT partition table instead, if you had chosen to subdivide your USB HDD into specific FreeBSD partitions (freebsd-ufs, freebsd-boot, freebsd-swap...; again, see gpart(8)), your disk would not be seen as a single partition with a BSD label, but as storage containing multiple "true" partitions, named da0p* (da0p1, da0p2...).

As far as I know, partitions named da0<letter> (da0a, da0b...) are created instead either for the partitions of a disk marked with a BSD label, or for the slices of a disk marked with an MBR table containing a classic freebsd partition whose BSD label has not been written over the disk's MBR.

I don't think (but may be wrong) that partitions named da0s1a could ever exist, as it would imply making BSD partitions in a slice

veryfoot · Dec 28, 2017

Hi and thanks for your help.

The external drive was formatted by a friend, so I didn't really choose a particular type. My friend just asked me, "Do you want the entire disk?" I said yes, and he did the rest.

Sorry for my ignorance about that.

Anyway, I have tried what you suggested, with no success:

Code:

fsck_ufs -y  /dev/da0s1
** /dev/da0s1
Cannot find file system superblock

LOOK FOR ALTERNATE SUPERBLOCKS? yes

SEARCH FOR ALTERNATE SUPER-BLOCK FAILED. YOU MUST USE THE
-b OPTION TO FSCK TO SPECIFY THE LOCATION OF AN ALTERNATE
SUPER-BLOCK TO SUPPLY NEEDED INFORMATION; SEE fsck_ffs(8).

Code:

fsck_ufs -y -b 192 /dev/da0s1
Alternate super block location: 192
** /dev/da0s1
192 is not a file system superblock

I don't want to lose the data on the USB drive (if possible, of course).

TestDisk says:

Warning: Bad ending head (CHS and LBA don't match)

Is there a simple way to try to recover from that?

Thanks,

Vince

Snurg · Dec 28, 2017

Vincent FAUQUEZ said:
I don't want to lose the data on the USB drive (if possible, of course).

Then stop playing around with the original.
Make a copy onto the HDD:
dd if=/dev/da0 of=<your_selected_filepath>
and verify that the image has the same size as the USB drive (i.e. that the copy is complete).
Then store the USB drive somewhere safe and play around with the image copy you just created.
Otherwise your chances of making the situation even worse are quite good.
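
A minimal sketch of that copy-and-verify step, assuming there is enough free space and using a made-up target path (/backup/da0.img). The conv=noerror,sync options ask dd to carry on past read errors and pad unreadable blocks with zeros. image_complete is an illustrative helper, not a system tool; it compares the image size with the sector count that dmesg reported for da0.

```sh
# Copy the whole disk (run as root; the target path is only an example).
# bs=4k divides this disk's size evenly, so the conv=sync padding cannot
# make the image longer than the disk:
#   dd if=/dev/da0 of=/backup/da0.img bs=4k conv=noerror,sync
#
# image_complete IMAGE SECTORS [SECTOR_SIZE]
# Succeeds when IMAGE is exactly SECTORS * SECTOR_SIZE bytes long.
image_complete() {
    expected=$(( $2 * ${3:-512} ))
    actual=$(wc -c < "$1" | tr -d ' ')
    [ "$actual" -eq "$expected" ]
}
```

For the drive in this thread that would be: image_complete /backup/da0.img 976773168 && echo "image is complete"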

Maelstorm · Dec 28, 2017

Sensucht94 said:
I don't think (but may be wrong) that partitions named da0s1a could ever exist, as it would imply making BSD partitions in a slice

I actually had that naming issue myself, which necessitated a complete reformat and reinstall of the system. In my case, I wrote the BSD disklabel directly onto the MBR sector. While FreeBSD is OK with it, many third-party disk utilities will complain about it, and may even nuke the disklabel to install a proper partition table. This is known as a dangerously dedicated disk.

Looking at the OP's question, da0s1 indicates that he does not have a dangerously dedicated disk, so the partitions within the slice are going to be da0s1a, da0s1b, etc.

Sensucht94 · Dec 28, 2017

I see. I did know about the distinction between writing the BSD label over the MBR and then partitioning (which I think is the same as setting BSD as the partition table and then partitioning), which gives da0a, da0b... partitions, and preserving the MBR and making stand-alone UFS slices (da0s1, da0s2...). However, I did not realize that one can also create a separate BSD label inside a slice of an MBR disk and then subdivide it into partitions, as you stated, which end up named da0s1a, da0s1b... So sorry for that, Vincent FAUQUEZ: Maelstorm was right and I had got it wrong, and I had not looked at your output carefully enough to grasp how things stood in the first place.

I have a USB storage device where I wrote the BSD label over the MBR (I guess it is to be called a dangerously dedicated disk now), and it looks like this:

Code:

gpart show da0
=>      0  7954432  da0  BSD  (3.8G)
        0  7954432    1  freebsd-ufs  (3.8G)

and, as expected, BSD is shown as the partition table, while the only UFS partition present is called da0a.

Now, since the OP has an MBR disk containing a stand-alone BSD label (da0s1), which in turn hosts a single freebsd-ufs partition, that one is correctly named da0s1a.
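
To summarize the naming rules discussed so far, here is a small sketch. describe_dev is a hypothetical helper that only pattern-matches the device name; it does not query GEOM, so treat it as a memory aid rather than a detection tool.

```sh
# describe_dev NAME: guess the partitioning layer from a FreeBSD device name.
describe_dev() {
    case $1 in
        *[0-9]s[0-9]*[a-h]) echo "BSD partition inside an MBR slice" ;;
        *[0-9]s[0-9]*)      echo "MBR slice" ;;
        *[0-9]p[0-9]*)      echo "GPT partition" ;;
        *[0-9][a-h])        echo "BSD partition on an unsliced disk" ;;
        *)                  echo "whole disk" ;;
    esac
}
```

With the names from this thread, describe_dev da0s1a reports a BSD partition inside an MBR slice, while describe_dev da0a reports the unsliced ("dangerously dedicated") case.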

Snurg made a good suggestion about cloning it to your hard drive. However, bear in mind that unreadable blocks are likely to be found, and dd may abort as soon as it hits one. If that happens, you can try sysutils/safecopy to create a raw image of your corrupted UFS partition (see safecopy(1) for more details).

An example with increased verbosity, stronger recovery attempts, and a more in-depth check:
safecopy /dev/da0s1a ~/ufs.img ~/ -r 5 --debug 4 --stage3

Then you should be able to make a virtual drive out of it with mdconfig(8):
mdconfig -f ~/ufs.img -u 0

and mount it as usual:
mount -t ufs /dev/md0 /mnt

You may want to run fsck_ffs on the image before ever creating the md device, otherwise chances are high that it will fail to mount.

I'd suggest trying the other superblocks listed by newfs -N. It has happened to me before: the second or third alternate superblock may turn out to be a valid one, even though they look very far away from the first.
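
A rough sketch of how that could be automated. It assumes the backups are evenly spaced, which matches the values newfs -N printed in this thread (first at 192, then every 1282496 sectors, 762 cylinder groups). Both function names are made up for illustration. fsck_ffs is run with -n so that nothing is written, and any output other than the "is not a file system superblock" message seen above is crudely treated as a hit, so check the result by hand.

```sh
# sb_offsets FIRST STRIDE COUNT: print COUNT evenly spaced backup locations.
sb_offsets() {
    i=0
    while [ "$i" -lt "$3" ]; do
        echo $(( $1 + i * $2 ))
        i=$(( i + 1 ))
    done
}

# first_good_sb DEVICE: read candidate locations on stdin and print the
# first one that fsck_ffs does not reject. Set FSCK to another command
# for a dry run.
first_good_sb() {
    while read -r sb; do
        out=$(${FSCK:-fsck_ffs} -n -b "$sb" "$1" 2>&1 </dev/null)
        case $out in
            *"is not a file system superblock"*) ;;
            *) echo "$sb"; return 0 ;;
        esac
    done
    return 1
}
```

Usage would look like: sb_offsets 192 1282496 762 | first_good_sb /dev/da0s1a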

Another thing worth trying is to take the superblock location reported by dumpfs /dev/da0s1a | grep superblock. See dumpfs(8).
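
Since dumpfs itself needs a readable superblock, a lower-level check is to look for the UFS magic number directly. The sketch below is built on assumptions that should be verified against ufs/ffs/fs.h: the magic field sits 1372 bytes into the superblock, the UFS2 magic is 0x19540119, the UFS1 magic is 0x011954, and the file system was written on a little-endian machine. ufs_magic_at is a hypothetical helper; it takes the location in 512-byte sectors, the same unit newfs -N prints, and reads whole sectors because FreeBSD disk devices do not allow unaligned reads.

```sh
# ufs_magic_at DEVICE_OR_IMAGE SECTOR
# Print UFS2, UFS1 or none depending on the magic found in the superblock
# that would start at SECTOR (128 is the usual primary UFS2 location).
ufs_magic_at() {
    # 1372 bytes = 2 whole sectors + 348 bytes into the third one
    bytes=$(dd if="$1" bs=512 skip=$(( $2 + 2 )) count=1 2>/dev/null |
            od -An -tx1 -j348 -N4 | tr -d ' \n')
    case $bytes in
        19015419) echo UFS2 ;;
        54190100) echo UFS1 ;;
        *)        echo none ;;
    esac
}
```

For example, ufs_magic_at /dev/da0s1a 128 checks the primary superblock and ufs_magic_at /dev/da0s1a 192 checks the first backup.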

Then checking the drive for bad blocks/bad sectors would be a good idea. I'd try both smartctl(8) from sysutils/smartmontools, which is noticeably better, and the deprecated badblocks(8) from sysutils/e2fsprogs, for a double-check.

First run (unless you have smartd(8) enabled and perform daily scheduled tests that include the external drive):
smartctl -c /dev/da0 to display how long a long test would take (on ~400 GB it should be around 90 min)

Then start a long test and wait for the predicted time:
smartctl -t long /dev/da0

Look up the report for errors:
smartctl -ax /dev/da0 | grep -i errors

Look it up for reallocated sectors as well, as wblock@ suggested in Thread repair bad block.27507:
smartctl -ax /dev/da0 | less -Sp Reallocated_Sector_Ct

Move to the right of the smartctl summary and look at the value displayed in the RAW_VALUE column of the highlighted Reallocated_Sector_Ct row. If the count is > 0, then the FS is probably corrupted, and blanking the disk (dd if=/dev/zero of=/dev/da0 bs=512 conv=sync; it erases everything and will take ages on such a large drive, but 512 as block size is the safest bet for blanking) followed by a reformat might be your only choice.
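
Instead of scrolling in less, the raw value can be pulled out directly. This is a small sketch assuming the brief attribute format used in this thread, where RAW_VALUE is the 8th column; smart_raw is a made-up helper name.

```sh
# smart_raw NAME: read `smartctl -A -f brief` output on stdin and print
# the RAW_VALUE column (8th field) of the attribute called NAME.
smart_raw() {
    awk -v name="$1" '$2 == name && NF >= 8 { print $8 }'
}
```

For example: smartctl -A -f brief /dev/da0 | smart_raw Reallocated_Sector_Ct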

If smartctl does not reveal any issue, try badblocks, although the chances that it will succeed where smartmontools failed are slim:
badblocks -fnv -b 2048 /dev/da0 ....it will take a long time

To reformat your external HDD with GPT instead:

Code:

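# remove the BSD partition, then the now-empty BSD label inside the slice
gpart delete -i 1 da0s1
gpart destroy da0s1
# remove the MBR slice and the MBR itself
gpart delete -i 1 da0
gpart destroy da0
# create a GPT table with a single UFS partition (soft updates enabled)
gpart create -s gpt da0
gpart add -t freebsd-ufs da0
newfs -U -o space /dev/da0p1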

veryfoot · Dec 28, 2017

Hi, and thanks for this complete reply.

Unfortunately I don't have enough free space to back up my external USB hard drive.

The data on this hard drive is not critical. I will try to save it, but if it goes wrong, I will reformat it.

I tried what you proposed with the newfs command, but with no success.

After that I tried the smartctl way.

Code:

smartctl -c /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 i386] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 136) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

Code:

smartctl -t long /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 i386] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 136 minutes for test to complete.
Test will complete after Thu Dec 28 15:40:37 2017

Use smartctl -X to abort test.

But it returns to the prompt immediately.

I tried to check the logs and got this:

Code:

 smartctl -ax /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 i386] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9500325AS
Serial Number:    5VE5CQV2
LU WWN Device Id: 5 000c50 01bfa7de3
Firmware Version: 0002BSM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 1.5 Gb/s
Local Time is:    Thu Dec 28 13:50:42 2017 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unknown

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 248) Self-test routine in progress...
                                        80% of test remaining.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 136) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   114   099   006    -    78071056
  3 Spin_Up_Time            PO----   099   098   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    188
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   075   060   030    -    38662406
  9 Power_On_Hours          -O--CK   032   032   000    -    59569
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   037   020    -    188
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   099   000    -    4304404624
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   061   049   045    -    39 (Min/Max 33/39)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    78
193 Load_Cycle_Count        -O--CK   072   072   000    -    57089
194 Temperature_Celsius     -O---K   039   051   000    -    39 (0 16 0 0 0)
195 Hardware_ECC_Recovered  -O-RC-   038   038   000    -    78071056
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01       GPL,SL  R/O      1  Summary SMART error log
0x02       GPL,SL  R/O      5  Comprehensive SMART error log
0x03       GPL,SL  R/O      5  Ext. Comprehensive SMART error log
0x06       GPL,SL  R/O      1  SMART self-test log
0x07       GPL,SL  R/O      1  Extended self-test log
0x09       GPL,SL  R/W      1  Selective self-test log
0x10       GPL,SL  R/O      1  NCQ Command Error log
0x11       GPL,SL  R/O      1  SATA Phy Event Counters log
0x21       GPL,SL  R/O      1  Write stream error log
0x22       GPL,SL  R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    2248  Device vendor specific log
0xa8       GPL,SL  VS      65  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xb0       GPL     VS    2864  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Error Log Version: 1
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 80%     59569         -
# 2  Extended offline    Interrupted (host reset)      00%     59568         -

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 80%     59569         -
# 2  Extended offline    Interrupted (host reset)      00%     59568         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    39 Celsius
Power Cycle Min/Max Temperature:     33/39 Celsius
Lifetime    Min/Max Temperature:     16/52 Celsius
Under/Over Temperature Limit Count:   0/24

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/ 0 Celsius
Min/Max Temperature Limit:            0/ 0 Celsius
Temperature History Size (Index):    128 (92)

Index    Estimated Time   Temperature Celsius
  93    2017-12-28 11:43    30  ***********
 ...    ..(  3 skipped).    ..  ***********
  97    2017-12-28 11:47    30  ***********
  98    2017-12-28 11:48    31  ************
 ...    ..( 18 skipped).    ..  ************
 117    2017-12-28 12:07    31  ************
 118    2017-12-28 12:08    32  *************
 ...    ..( 24 skipped).    ..  *************
  15    2017-12-28 12:33    32  *************
  16    2017-12-28 12:34    26  *******
  17    2017-12-28 12:35    26  *******
  18    2017-12-28 12:36    28  *********
  19    2017-12-28 12:37    28  *********
  20    2017-12-28 12:38    30  ***********
  21    2017-12-28 12:39    30  ***********
  22    2017-12-28 12:40    31  ************
  23    2017-12-28 12:41    29  **********
  24    2017-12-28 12:42    29  **********
  25    2017-12-28 12:43    26  *******
  26    2017-12-28 12:44    26  *******
  27    2017-12-28 12:45    27  ********
  28    2017-12-28 12:46    27  ********
  29    2017-12-28 12:47    28  *********
  30    2017-12-28 12:48    28  *********
  31    2017-12-28 12:49    27  ********
  32    2017-12-28 12:50    27  ********
  33    2017-12-28 12:51    27  ********
  34    2017-12-28 12:52    29  **********
  35    2017-12-28 12:53    29  **********
  36    2017-12-28 12:54    29  **********
  37    2017-12-28 12:55    30  ***********
  38    2017-12-28 12:56     ?  -
  39    2017-12-28 12:57    18  -
  40    2017-12-28 12:58     ?  -
  41    2017-12-28 12:59    23  ****
  42    2017-12-28 13:00    23  ****
  43    2017-12-28 13:01    24  *****
  44    2017-12-28 13:02    25  ******
  45    2017-12-28 13:03    26  *******
  46    2017-12-28 13:04    27  ********
  47    2017-12-28 13:05    28  *********
  48    2017-12-28 13:06    28  *********
  49    2017-12-28 13:07    31  ************
 ...    ..(  3 skipped).    ..  ************
  53    2017-12-28 13:11    31  ************
  54    2017-12-28 13:12    32  *************
  55    2017-12-28 13:13    32  *************
  56    2017-12-28 13:14    32  *************
  57    2017-12-28 13:15    31  ************
 ...    ..(  3 skipped).    ..  ************
  61    2017-12-28 13:19    31  ************
  62    2017-12-28 13:20    33  **************
  63    2017-12-28 13:21    30  ***********
  64    2017-12-28 13:22    30  ***********
  65    2017-12-28 13:23    30  ***********
  66    2017-12-28 13:24    31  ************
  67    2017-12-28 13:25    32  *************
  68    2017-12-28 13:26     ?  -
  69    2017-12-28 13:27    33  **************
  70    2017-12-28 13:28    33  **************
  71    2017-12-28 13:29    34  ***************
  72    2017-12-28 13:30    35  ****************
  73    2017-12-28 13:31    35  ****************
  74    2017-12-28 13:32    35  ****************
  75    2017-12-28 13:33    36  *****************
  76    2017-12-28 13:34    36  *****************
  77    2017-12-28 13:35    36  *****************
  78    2017-12-28 13:36    37  ******************
  79    2017-12-28 13:37    37  ******************
  80    2017-12-28 13:38    37  ******************
  81    2017-12-28 13:39    38  *******************
 ...    ..(  4 skipped).    ..  *******************
  86    2017-12-28 13:44    38  *******************
  87    2017-12-28 13:45    39  ********************
 ...    ..(  4 skipped).    ..  ********************
  92    2017-12-28 13:50    39  ********************

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            1  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

Sensucht94 · Dec 28, 2017

veryfoot said:
But it returns to the prompt immediately.

The SMART test runs in the background, so you have to wait 136 minutes for it to finish before checking the results. Your output in fact shows that 80% of the test was still remaining when you checked the status.
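
A small sketch for waiting on the background test instead of guessing. It assumes the "NN% of test remaining." wording shown in the smartctl output above; selftest_remaining is a made-up helper name.

```sh
# selftest_remaining: read `smartctl -c` output on stdin and print the
# percentage still to go, or nothing when no self-test is in progress.
selftest_remaining() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)% of test remaining\.*$/\1/p'
}
```

It could then be polled every ten minutes or so: while [ -n "$(smartctl -c /dev/da0 | selftest_remaining)" ]; do sleep 600; done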

By the way, when you're done with it, in addition to the couple of things I pointed out above, it would be better if someone else here analyzed the reported results more thoroughly, as I'm by no means competent in that.

veryfoot said:
Unfortunately I don't have enough free space to back up my external USB hard drive.

If the data on it were really important to you, you could ask a friend to lend you another HDD, or even buy one; they're really cheap nowadays!

veryfoot · Dec 29, 2017

OK for SMART... but I don't see any errors, so I'm a bit confused.

I have launched a TestDisk scan; we will see tomorrow whether TestDisk is able to recover something or to back up some files.

If not, it will have to wait a bit, because Santa was greedy this year ^^

The backup disk will have to wait a bit!

Thank you very much anyway for all your fast and complete answers.

Best regards,

Sensucht94 · Dec 29, 2017

Would you mind pasting the complete output of:

  sudo smartctl -a -f brief -l xerror,error -l xselftest,selftest -v 200,writeerrorcount /dev/da0

after a completed long test?

Run the test as soon as you log in and wait for the predicted time, then run the command above.

In particular you should look at:

1 Raw_Read_Error_Rate: whether or not RAW_VALUE >> THRESH. In your case, with only 20% of the test completed, read errors were already way beyond the normal threshold (I have 0).
5 Reallocated_Sector_Ct: whether or not RAW_VALUE > 0. It corresponds to the count of bad sectors found. If the value is > 0, given that you cannot run fsck, you should blank the disk (one or more times, until the reallocated sector count drops to 0) and reformat it as suggested in the thread I mentioned above, hoping to bring the disk back to a usable state.
7 Seek_Error_Rate: not very meaningful, as it may vary. However, since it corresponds to the error rate of the HDD's magnetic heads, it may be taken into account, as your problem arose after a sudden power cut.
10 Spin_Retry_Count: whether or not RAW_VALUE > 0. It is the number of retries needed to spin the disk up to operational speed after a failed first attempt. Fundamental: if the count is > 0, you may consider replacing your HDD.
187 Reported_Uncorrect: whether or not RAW_VALUE > 0; very important as well.
195 Hardware_ECC_Recovered: whether or not RAW_VALUE >> THRESH. Errors that could not be corrected here fall under the Reported_Uncorrect category above.
197 Current_Pending_Sector: important if RAW_VALUE >> 0; these are unstable sectors waiting to be remapped.
198 Offline_Uncorrectable: bad if RAW_VALUE > 0. If any is found, chances are high that it won't be corrected the next time the sector is written either, and you may want to replace your HDD.
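
One caveat when reading the table: smartctl itself marks an attribute as failing by comparing the normalized VALUE column with THRESH, while RAW_VALUE is vendor-specific and not on the same scale, so a large raw number is not automatically a failure. The sketch below applies that VALUE/THRESH comparison to the brief-format table; smart_failing is a made-up helper name, and attributes with a threshold of 000 are skipped.

```sh
# smart_failing: read `smartctl -A -f brief` output on stdin and print the
# name of every attribute whose normalized VALUE is at or below THRESH.
smart_failing() {
    awk 'NF >= 8 && $1 ~ /^[0-9]+$/ && $6 + 0 > 0 && $4 + 0 <= $6 + 0 { print $2 }'
}
```

For example: smartctl -A -f brief /dev/da0 | smart_failing (no output means no attribute has reached its threshold).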

And for:

SMART Extended Self-test Log Version
.........
#1 Long Offline ..should say "completed without errors" if everything's fine

Also, if you enable smartd(8), errors found by its background self-tests will be sent to syslog (on FreeBSD typically /var/log/messages), where you can check them.

Moreover, the smartctl(8) man page contains a useful bash snippet to quickly decode the exit status of smartctl:

Code:

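#!/usr/local/bin/bash
# smartctl is called here so that $? below holds its exit status;
# on its own, the man page loop must be run right after smartctl.
smartctl -a /dev/da0 > /dev/null
status=$?
for ((i=0; i<8; i++)); do
    echo "Bit $i: $((status & 2**i && 1))"
done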

Copy it to a file in $HOME, like ~/.smartbits,
mark it as executable (chmod +x ~/.smartbits), then run it.
The normal output should be:

Code:

Bit 0: 0
Bit 1: 0
Bit 2: 0
Bit 3: 0
Bit 4: 0
Bit 5: 0
Bit 6: 0
Bit 7: 0

If any bit is 1, look at the legend provided by the man page to interpret the output:

Code:

      Bit 0: Command line did not parse.

      Bit 1: Device open failed, device did not return an IDENTIFY DEVICE
             structure, or device is in a low-power mode (see '-n' option
             above).

      Bit 2: Some SMART or other ATA command to the disk failed, or there was
             a checksum error in a SMART data structure (see '-b' option
             above).

      Bit 3: SMART status check returned "DISK FAILING".

      Bit 4: We found prefail Attributes <= threshold.

      Bit 5: SMART status check returned "DISK OK" but we found that some
             (usage or prefail) Attributes have been <= threshold at some
             time in the past.

      Bit 6: The device error log contains records of errors.

      Bit 7: The device self-test log contains records of errors. [ATA only]
             Failed self-tests outdated by a newer successful extended self-
             test are ignored.
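
The legend can also be folded into a helper so that only the bits that are set get printed. This is a sketch in plain sh, with the descriptions shortened from the man page text above; explain_smart_status is a made-up name.

```sh
# explain_smart_status STATUS: print one line per bit set in STATUS,
# where STATUS is the exit status of a smartctl run.
explain_smart_status() {
    i=0
    for text in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or other ATA command failed" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes <= threshold" \
        "attributes were <= threshold in the past" \
        "error log contains errors" \
        "self-test log contains errors"
    do
        if [ $(( ($1 >> i) & 1 )) -eq 1 ]; then
            echo "Bit $i: $text"
        fi
        i=$(( i + 1 ))
    done
}
```

It has to be called right after smartctl in the same shell, for example: smartctl -a /dev/da0 > /dev/null; explain_smart_status $?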

Finally, run badblocks(8) as well, with the command above: badblocks -fnsv -b 2048 /dev/da0 (you could also replace -b 2048 with -b 4096 to speed it up).

This will run a non-destructive test (preserving data). The -s option I added shows the progress as a percentage. If you add the -o switch, you can specify an output file in which to store the result instead of stdout (like -o ~/.badblocks).

Leave the computer idle for as long as badblocks is running (you could do it overnight); in particular, do not copy, delete or move files, or you may raise the reported error count, since read and written data would not match. Running badblocks with -n, the output should look like:

Code:

Checking for bad blocks in non-destructive read-write mode
From block 0 to 244190645
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern:   0.04% done, 0:21 elapsed. (0/0/0 errors)

.........

And the final result (either printed to standard output or written to the specified file):

Code:

Pass completed, 0 bad blocks found (0/0/0 errors)

where the first number is the count of read errors, the second of write errors, and the third of read/write mismatch (corruption) errors.

Anyway:

- if major issues are reported, I'd just bin the HDD, after a backup attempt

- if minor or no errors are reported, I'd clone it first, then blank it a couple of times, and reformat it the way I'm more comfortable with

veryfoot · Dec 29, 2017

Hi again

That's very nice of you to send me all this info and these solutions.

OK, I have started the long test. I'll keep you informed. It is running...

After :

Code:

Please wait 136 minutes for test to complete.
Test will complete after Fri Dec 29 14:36:33 2017

I have tested the smart script too:

Code:

bash .smartbits 
Bit 0: 0
Bit 1: 0
Bit 2: 0
Bit 3: 0
Bit 4: 0
Bit 5: 0
Bit 6: 0
Bit 7: 0

Looks pretty good.

After the result of the long SMART test, I will launch the badblocks search.

For the moment smartd is not activated. I will do that after the test.

I'll let you know within two hours.

For info, Testdisk was not happy, and says that the FAT (???) partition is corrupted... I will not try to back up the HD this way any more. I will use "dd", as you said, but later, when I have another hard drive to back up the USB one to.

Thanks ^^

Regards

Sensucht94 · Dec 29, 2017

veryfoot said:
Hi again

That's very nice of you to send me all this info and these solutions.

I just wrote down what proved useful when I bumped into a similar problem in the past.

veryfoot said:
I have tested the smart script too:

Code:

bash .smartbits
Bit 0: 0
Bit 1: 0
Bit 2: 0
Bit 3: 0
Bit 4: 0
Bit 5: 0
Bit 6: 0
Bit 7: 0

Looks pretty good.

Actually, the script is made to print the exit status of the smartctl command in a way that is easy to understand. If you run it after any other command, it will simply print the exit status of that command to stdout. Even in tcsh, try typing ls -a followed by echo $?: it will return 0, meaning the ls command exited without error. If you instead type a command with wrong syntax, like ls -v followed by echo $?, it will return 1, meaning ls exited with an error, as -v is an illegal option. If you run the script after an ls command, its output will have no relation to smartctl.

So you should instead run the script immediately after smartctl: drop to a bash shell and run: smartctl -a -f brief -l xerror,error -l xselftest,selftest -v 200,writeerrorcount /dev/da0 && ~/.smartbits

Obviously, always do this after at least one completed test.
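
One caveat with the one-liner above: `&&` only runs the right-hand command when smartctl exits with 0, and a script started as its own process (for example with `bash .smartbits`) begins with a fresh `$?` of 0, so it would print all zeros regardless of what smartctl returned. A minimal sketch that sidesteps both, assuming a POSIX shell, passes the status in as an argument. `decode_smart_status` is a hypothetical helper name, and the value 4 is only an example input:

```shell
#!/bin/sh
# Decode a smartctl exit status into its eight bits.
# The status is passed as an argument, so it is not lost
# when the decoding happens in a function or a new process.
decode_smart_status() {
    val=$1
    i=0
    while [ "$i" -le 7 ]; do
        # Shift bit i down to position 0 and mask everything else off.
        echo "Bit $i: $(( (val >> i) & 1 ))"
        i=$((i + 1))
    done
}

# Example input only: a status of 4 has just bit 2 set.
decode_smart_status 4
```

With the real tool, the function would be called in the same shell right after smartctl, separated by `;` rather than `&&`, for example `smartctl -a /dev/da0; decode_smart_status $?`.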

veryfoot said:
After the result of the long SMART test, I will launch the badblocks search.

For the moment smartd is not activated. I will do that after the test.

Remember that you'll need a smartd.conf(5) file, inside /usr/local/etc for smartd to work. Use the sample file as base:
cd /usr/local/etc/; sudo mv smartd.conf.sample smartd.conf
Then edit it to match your case and the checks you want to perform

veryfoot said:
For info, Testdisk was not happy, and says that the FAT (???) partition is corrupted...

I've never used testdisk, so really can't help you with that....however, maybe you accidentally ran it over ada0 instead of da0, so it's just complaining about your EFI boot partition?

Best Regards

veryfoot · Dec 29, 2017

Results of the test :

Sensucht94 said:
smartctl -a -f brief -l xerror,error -l xselftest,selftest -v 200,writeerrorcount /dev/da0

Code:

smartctl -a -f brief -l xerror,error -l xselftest,selftest -v 200,writeerrorcount /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 i386] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9500325AS
Serial Number:    5VE5CQV2
LU WWN Device Id: 5 000c50 01bfa7de3
Firmware Version: 0002BSM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 1.5 Gb/s
Local Time is:    Fri Dec 29 15:09:22 2017 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (    0) seconds.
Offline data collection
capabilities:             (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   1) minutes.
Extended self-test routine
recommended polling time:     ( 136) minutes.
Conveyance self-test routine
recommended polling time:     (   2) minutes.
SCT capabilities:            (0x103b)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   117   099   006    -    152460769
  3 Spin_Up_Time            PO----   099   098   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    188
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   075   060   030    -    38750218
  9 Power_On_Hours          -O--CK   032   032   000    -    59594
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   037   020    -    188
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   099   000    -    4304404624
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   072   049   045    -    28 (Min/Max 27/40)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    78
193 Load_Cycle_Count        -O--CK   072   072   000    -    57098
194 Temperature_Celsius     -O---K   028   051   000    -    28 (0 16 0 0 0)
195 Hardware_ECC_Recovered  -O-RC-   033   033   000    -    152460769
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Error Log Version: 1
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     59593         -
# 2  Extended offline    Completed without error       00%     59575         -
# 3  Extended offline    Interrupted (host reset)      00%     59568         -

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     59593         -
# 2  Extended offline    Completed without error       00%     59575         -
# 3  Extended offline    Interrupted (host reset)      00%     59568         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And then I checked the return status:

Code:

echo $?
4

So I re-ran the command on its own, because the part after "&&" is not executed when the exit code is 4:

smartctl -a -f brief -l xerror,error -l xselftest,selftest -v 200,writeerrorcount /dev/da0

And right after that I ran .smartbits:

Code:

bash .smartbits
Bit 0: 0
Bit 1: 0
Bit 2: 0
Bit 3: 0
Bit 4: 0
Bit 5: 0
Bit 6: 0
Bit 7: 0

I will now launch the badblocks command and keep you informed.

Regards,

veryfoot · Dec 29, 2017

Sensucht94 said:
I've never used testdisk, so really can't help you with that....however, maybe you accidentally ran it over ada0 instead of da0, so it's just complaining about your EFI boot partition?

Testdisk only works on an unmounted device, so it was certainly working on /dev/da0.

Testdisk can retrieve and back up data from a corrupted HD. It is able to retrieve some data from FAT or NTFS partitions. But this time, it is not working.

veryfoot · Dec 29, 2017

Sensucht94 said:
badblocks -fnsv -b 2048 /dev/da0

Code:

badblocks -fnsv -b 2048 /dev/da0
Checking for bad blocks in non-destructive read-write mode
From block 0 to 244193291
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: set_o_direct: Inappropriate ioctl for device
0.00% done, 1:15 elapsed. (0/0/0 errors)

It is going to take a while... I'll get back to you when it's over.
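
As a sanity check on the block range printed above, the last block number can be derived from the sector count that dmesg reported for da0 and the `-b 2048` block size. This is only a sketch of the arithmetic, assuming a POSIX shell; the variable names are illustrative:

```shell
#!/bin/sh
# Values taken from the dmesg line for da0 and the badblocks -b option.
sectors=976773168      # 512-byte sectors reported by dmesg
sector_size=512
block_size=2048        # badblocks -b 2048

# Number of badblocks blocks, and the index of the last one (counting from 0).
blocks=$(( sectors / (block_size / sector_size) ))
last_block=$(( blocks - 1 ))
echo "blocks: $blocks, last block: $last_block"
```

This prints `blocks: 244193292, last block: 244193291`, which matches the "From block 0 to 244193291" line, so badblocks is scanning the whole disk.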

Sensucht94 · Dec 29, 2017

Hi, looking at your test, it seems there's no big issue with your HDD. Hardware_ECC_Recovered matches Raw_Read_Error_Rate, meaning ECC was able to recover all errors encountered while reading sectors, hence SMART did not log any problem and reported 0 errors.
Once badblocks has finished, I'd definitely try running fsck_ffs again, and if it still cannot read the FS superblock, I'd reformat the HDD for reuse.

Best wishes

PS: Thanks for showing Testdisk, it will surely prove useful

veryfoot · Dec 29, 2017

Good news for the USB HD.

For badblocks... we will probably have the answer in 2018, because:

0.22 % done in 1h15...

So the estimate is around 5 hours for 1%

500 hours for 100 %

Around 20 days! lol! Am I correct?

Happy new year to you

Cheers !
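
For reference, the extrapolation above can be checked with a few lines of shell. This is a rough sketch that assumes the scan keeps a constant speed; `estimate_total` is a hypothetical helper name, and the inputs are the 0.22 % and 1h15 (75 minutes) quoted in the post:

```shell
#!/bin/sh
# estimate_total PERCENT_DONE ELAPSED_MINUTES
# Linear extrapolation of the total run time from the progress so far.
estimate_total() {
    awk -v pct="$1" -v min="$2" 'BEGIN {
        hours = (min / 60) * (100 / pct)
        printf "%.0f hours, about %.1f days\n", hours, hours / 24
    }'
}

estimate_total 0.22 75
```

This prints `568 hours, about 23.7 days`, so the rough figure of around 20 days is in the right range, if slightly optimistic.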

Sensucht94 · Dec 29, 2017

veryfoot said:
Good news for the Usb Hd

For badblocks... we will have the answer probably in 2018 because :

0.22 % done in 1h15...

So estimation is around 5 hours for 1%

500 hours for 100 %

Around 20 days ! lol ! Am i correct ?

Happy new year to you

Cheers !

Ahahahah, this is fun. I admit I have only run badblocks on small USB flash drives, and with a large block size (4M), so I didn't realize it would take so long on a large HDD. Well, considering that no unrecoverable/reallocated sectors were reported among the attributes, that no error was logged, and that smartctl exited with code 4, which according to the man page means only bit 2 is set (some SMART or ATA command failed, most likely the SMART status query that returned "Incomplete response" over USB) while none of the "disk failing" bits are set, I'll move to the next step and start thinking about how to recover data from the UFS partition.
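
The attribute check mentioned above can be made repeatable with a small filter. This is a sketch, assuming the `-f brief` column layout shown in the smartctl output earlier (raw value in the eighth column); `check_sector_attrs` is a hypothetical helper name, and the sample lines are copied from that output:

```shell
#!/bin/sh
# Sample attribute lines copied from the smartctl output in this thread.
attrs='  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0'

# Print the sector-related attributes (IDs 5, 197, 198) with their raw
# value (column 8 in brief format) and flag any that are non-zero.
check_sector_attrs() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
        status = ($8 == 0) ? "ok" : "CHECK"
        printf "%s raw=%s %s\n", $2, $8, status
    }'
}

printf '%s\n' "$attrs" | check_sector_attrs
```

On a live system the same filter could be fed from `smartctl -A -f brief /dev/da0 | check_sector_attrs`. For the values in this thread all three attributes come out as `raw=0 ok`.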
