USB disks hang system

Thanks for your replies, everyone!

Yes there is. Refer to tunefs and section 18.7 of the handbook: Disk Labels. This is generally how you would handle USB detachable disks, anyway.
Thank you for the pointers, I'll RTFM. :)

If you're interested, see below for problems I encountered when trying mjollnir's suggestions to the same effect.

Also, about your Seagate issue, I would presume/assume this drive is SMR, so can you run zonectl on it and report back the results?
It seems that neither drive uses SMR:
Code:
# zonectl -d /dev/da0 -c params
Zone Mode: None
Command support: None
Unrestricted Read in Sequential Write Required Zone (URSWRZ): No
Optimal Number of Open Sequential Write Preferred Zones: Not Set
Optimal Number of Non-Sequentially Written Sequential Write Preferred Zones: Not Set
Maximum Number of Open Sequential Write Required Zones: Not Set
# zonectl -d /dev/da0 -c rz
zonectl: DIOCZONECMD ioctl failed: Invalid argument
# zonectl -d /dev/da2 -c params
Zone Mode: None
Command support: None
Unrestricted Read in Sequential Write Required Zone (URSWRZ): No
Optimal Number of Open Sequential Write Preferred Zones: Not Set
Optimal Number of Non-Sequentially Written Sequential Write Preferred Zones: Not Set
Maximum Number of Open Sequential Write Required Zones: Not Set
# zonectl -d /dev/da2 -c rz
zonectl: DIOCZONECMD ioctl failed: Input/output error


Have you previously provided the results of camcontrol identify on this drive to the forum?
Nope, sorry! Here's the camcontrol identify output for the Seagate drive:
Code:
# camcontrol identify da0    
pass2: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
pass2: 400.000MB/s transfers

protocol              ACS-2 ATA SATA 3.x
device model          ST4000DM000-1F2168
firmware revision     CC54
serial number         [REDACTED]
WWN                   [REDACTED]
additional product id 
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 4096, offset 0
LBA supported         268435455 sectors
LBA48 supported       7814037168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6 
media RPM             5900
Zoned-Device Commands no

Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
Native Command Queuing (NCQ)   yes              32 tags
NCQ Priority Information       no
NCQ Non-Data Command           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    no
NCQ Autosense                  no
SMART                          yes      yes
security                       yes      no
power management               yes      yes
microcode download             yes      yes
advanced power management      yes      yes     254/0xFE
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            yes      no
write-read-verify              yes      no      0/0x0
unload                         no       no
general purpose logging        yes      yes
free-fall                      no       no
sense data reporting           no       no
extended power conditions      no       no
device statistics notification no       no
Data Set Management (DSM/TRIM) no
Trusted Computing              no
encrypts all user data         no
Sanitize                       no
Host Protected Area (HPA)      yes      no      7814037168/7814037167
HPA - Security                 yes      no 
Accessible Max Address Config  no

Interestingly, this disk shows support for "power-up in Standby", but that feature is disabled. Can I enable it in some way or is it just disabled because I've set APM level 254?


ls /dev/{diskid,gpt{,id},label,msdosfs,ufs,zvol/t450s}
Code:
ls: /dev/diskid: No such file or directory
ls: /dev/label: No such file or directory
ls: /dev/ufs: No such file or directory
ls: /dev/ufsid: No such file or directory
/dev/gpt:
DUMP IRST efiboot0 gptboot0

/dev/gptid:
3354896e-ab2e-11ea-a908-507b9d666b68 f3587124-b087-11ea-903f-507b9d666b68
33612b8d-ab2e-11ea-a908-507b9d666b68

/dev/msdosfs:
EFISYS

/dev/zvol/t450s:
SWAP
These are filesystem labels, partition labels, and, under /dev/label, IIRC geom labels (RTFM glabel(8)). I find it handy to name the zpool(8) after the machine model or name, the disk model, or some other unique name like bob or mary, or something functional like dmz-host. I.e. give a unique name to avoid getting confused when moving disks between machines. If you have identical disk models, stick a written label on them, numbered and/or otherwise uniquely named. I recommend using functional partition labels in fstab(5).
ralphbsz TL;DR
Here's the output of ls -lF /dev/{diskid,gpt{,id},label,msdosfs,ufs,zvol/t450s} for me:
Code:
ls: /dev/gpt: No such file or directory
ls: /dev/label: No such file or directory
ls: /dev/ufs: No such file or directory
ls: /dev/zvol/t450s: No such file or directory
/dev/diskid:
total 0
crw-r-----  1 root  operator   0x7e Aug 23 21:30 DISK-20190615006211F

/dev/gptid:
total 0
crw-r-----  1 root  operator   0x7f Aug 23 21:30 6c654b1b-d4b1-11ea-91ef-3085a9a86c56

/dev/msdosfs:
total 0
crw-r-----  1 root  operator   0x80 Aug 23 21:30 EFISYS
/dev/diskid/DISK-20190615006211F seems to be /dev/da2, the Toshiba disk, as that's what shows up in zpool status -v. It can't be used interchangeably with /dev/da2 though:
Code:
# camcontrol apm /dev/diskid/DISK-20190615006211F -l 254 
camcontrol: cam_get_device: unable to find device unit number

I've also done a possibly weird thing in that I've not partitioned the USB disks, but just added them to zpools as-is, i.e.:
Code:
# zpool create pool1 /dev/da0
# zpool create pool2 /dev/da2
...which means that glabel(8) gets confused:
Code:
# glabel label -v twilk-server-seagate /dev/da0
glabel: Can't store metadata on /dev/da0: Operation not permitted.
# glabel label -v twilk-server-toshiba /dev/da2
glabel: Can't store metadata on /dev/da2: Operation not permitted.
 
I found, with the PREVIOUS generation of USB drivers, that removing the USB disks and reconnecting them as SATA/EIDE was more reliable in the long run... unless one just onlines them for r/w with a slow parameter [such as rsync's --bwlimit=700], then umounts them again. FWIW.
 
Interestingly, this disk shows support for "power-up in Standby", but that feature is disabled. Can I enable it in some way or is it just disabled because I've set APM level 254?

That would be my interpretation.

I didn't realise this is a zfs pool, so in that regard, having PUIS enabled would be a "bad thing" [tm].

Looking at your Seagate drive, the difference seems to be that it's failing on both reads and writes. Have you run SMART on this drive to see if there are any reported issues?
It could be a cable or it could be a power supply or it could be a disk platter about to head off into space.

Make sure the cable is seated correctly, first off.

Forgive me if I asked this before, is the drive in an enclosure made by Seagate?

Edit: Forgot, has the drive any firmware available?

You can then flash it with https://www.seagate.com/au/en/suppo...as-drive-firmware-using-seaflashlin-007806en/
 
  • On labels: t450s is the name of my zpool(8) (because it's the disk in my ThinkPad T450s laptop), you should have replaced that with your zpool name. Do not give the same name to different disks: the benefit of labels is to have a unique identifier. ZFS labels disks, if you give it whole disks; thus the device can not be labeled by another utility.
  • See the drive's Extended Power Conditions: camcontrol epc da0 -c status
  • If the drive supports it, you could try to disable standby with camcontrol epc da0 -c state -d -p Standby_y -s and camcontrol epc da0 -c state -d -p Standby_z -s
    The downside is no power saving when the system goes to standby. Or disable the EPC timer values; RTFM camcontrol(8)
 
Thanks for your replies, jb_fvvm2 and mark_j!

I found, with the PREVIOUS generation of USB drivers, that removing the USB disks and reconnecting them as SATA/EIDE was more reliable in the long run... unless one just onlines them for r/w with a slow parameter [such as rsync's --bwlimit=700], then umounts them again. FWIW.
I've tried something like this and it seemed to work -- in my case, I wrote a script to copy files one-by-one, waiting before each file until the disk temperature was below 50°C. That seems impractical for normal use as I'd like to serve files off that disk over HTTP.
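For anyone wanting to try the same workaround, here is a minimal sketch of such a temperature-gated copy. It is not the script mentioned above: the function names, the 120-second polling interval, and the choice of reading SMART attribute 194 via smartctl -A are all assumptions made for illustration. The 50°C limit and /dev/da0 come from this thread.

```python
import shutil
import subprocess
import time
from pathlib import Path


def parse_temperature(smartctl_output: str) -> int:
    """Return the raw value of SMART attribute 194 (Temperature_Celsius)."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0] == "194":
            return int(fields[9])
    raise ValueError("attribute 194 not found in smartctl output")


def read_temperature(device: str) -> int:
    """Ask smartctl for the attribute table and extract the temperature."""
    # smartctl's exit status is a bitmask, so a non-zero status is not fatal here.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_temperature(result.stdout)


def copy_when_cool(sources, dest_dir, device="/dev/da0", limit_c=50, poll_s=120):
    """Copy files one by one, waiting before each until the disk is below limit_c."""
    dest = Path(dest_dir)
    for src in sources:
        while read_temperature(device) >= limit_c:
            time.sleep(poll_s)  # hypothetical polling interval
        shutil.copy2(src, dest / Path(src).name)
```

A call might look like copy_when_cool(["/tmp/a.iso"], "/pool1/data"), where both paths are made up. As noted above, this only helps for bulk copies and does nothing for a disk that has to serve requests continuously.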

That would be my interpretation.

I didn't realise this is a zfs pool, so in that regard, having PUIS enabled would be a "bad thing" [tm].

Looking at your Seagate drive, the difference seems to be that it's failing on both reads and writes. Have you run SMART on this drive to see if there are any reported issues?
It could be a cable or it could be a power supply or it could be a disk platter about to head off into space.

Make sure the cable is seated correctly, first off.

Forgive me if I asked this before, is the drive in an enclosure made by Seagate?

Edit: Forgot, has the drive any firmware available?

You can then flash it with https://www.seagate.com/au/en/suppo...as-drive-firmware-using-seaflashlin-007806en/

The Toshiba drive was failing on both reads and writes as well, before I set APM level 254.

That Seagate page shows no firmware updates available.

The drive is in its original Seagate-branded enclosure. I've checked both the USB and power cables; they're definitely seated correctly.

I'm running smartd(8), which checks the drive every night. Here's smartctl -a /dev/da0:
Code:
smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.1-RELEASE-p8 amd64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Desktop HDD.15
Device Model:     ST4000DM000-1F2168
Serial Number:    [REDACTED]
LU WWN Device Id: [REDACTED]
Firmware Version: CC54
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Aug 27 11:05:26 2020 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  117) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 518) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   106   099   006    Pre-fail  Always       -       11391720
  3 Spin_Up_Time            0x0003   094   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       809
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   070   060   030    Pre-fail  Always       -       43051781288
  9 Power_On_Hours          0x0032   054   054   000    Old_age   Always       -       41087
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       521
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   055   041   045    Old_age   Always   In_the_past 45 (Min/Max 45/47 #38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       208771
194 Temperature_Celsius     0x0022   045   059   000    Old_age   Always       -       45 (0 16 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       12622h+56m+20.040s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       67676762018
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       134357918370

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     41079         -
# 2  Short offline       Completed without error       00%     41055         -
# 3  Short offline       Completed without error       00%     41031         -
# 4  Short offline       Completed without error       00%     41007         -
# 5  Extended offline    Completed without error       00%     40991         -
# 6  Short offline       Completed without error       00%     40959         -
# 7  Short offline       Completed without error       00%     40935         -
# 8  Short offline       Completed without error       00%     40911         -
# 9  Short offline       Completed without error       00%     40887         -
#10  Short offline       Completed without error       00%     40864         -
#11  Short offline       Completed without error       00%     40845         -
#12  Extended offline    Completed without error       00%     40828         -
#13  Short offline       Completed without error       00%     40796         -
#14  Short offline       Completed without error       00%     40772         -
#15  Short offline       Completed without error       00%     40764         -
#16  Short offline       Completed without error       00%     40751         -
#17  Short offline       Completed without error       00%     40729         -
#18  Short offline       Completed without error       00%      5945         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Apparently no problems except for the temperature becoming too high once -- that was when I did a dd if=/dev/da0 of=/dev/null to see if there was a specific threshold temperature that would cause failures. Sadly, it didn't seem so clear-cut: I've had write failures of the Seagate drive at 51°C, but that dd(1) command carried on reading all the way up to 55°C, at which point I stopped it. The temperature also got up to 55°C once during an extended smart self-test overnight.

I haven't had problems with the Seagate drive for a few days, but for a while I'd regularly get unkillably stuck find(1) commands from various periodic(8) scripts running overnight, which were scanning that disk. Here's all of the messages in /var/log/messages from around that time:
Code:
Aug 22 03:50:00 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)
Aug 22 03:50:00 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)
Aug 22 03:50:00 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 22 03:50:00 server kernel: da0: <Seagate Expansion Desk 0712>  s/n [REDACTED] detached
Aug 22 03:50:00 server kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Aug 22 03:50:00 server kernel: umass0: detached
Aug 22 03:50:08 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1
Aug 22 03:50:08 server kernel: umass0 on uhub0
Aug 22 03:50:08 server kernel: umass0: <Seagate Expansion Desk, class 0/0, rev 3.00/1.00, addr 1> on usbus1
Aug 22 03:50:08 server kernel: umass0:  SCSI over Bulk-Only; quirks = 0x0100
Aug 22 03:50:08 server kernel: umass0:6:0: Attached to scbus6
Aug 22 03:50:08 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 22 03:50:08 server kernel: da0: <Seagate Expansion Desk 0712> Fixed Direct Access SPC-4 SCSI device
Aug 22 03:50:08 server kernel: da0: Serial Number [REDACTED]
Aug 22 03:50:08 server kernel: da0: 400.000MB/s transfers
Aug 22 03:50:08 server kernel: da0: 3815447MB (976754645 4096 byte sectors)
Aug 22 03:50:08 server kernel: da0: quirks=0x2<NO_6_BYTE>
The timing (disconnection at exactly 03:50:00) seems too specific to be random, though there's no cron(8) job scheduled then. I've set smartd(8) to run short self tests on Mon-Sat mornings between 3 and 4 am and extended self tests on Sundays at the same time. 22 August was a Saturday, so this error could correspond to a short self-test. However, that seems unlikely to cause this error, especially since I've run these tests every day since and they haven't caused the same error! According to my logs (sampling smartctl(8) every 2 minutes), the disk temperature was a constant 49°C that whole night.
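To check whether disconnections cluster at particular times, the kernel messages can be scanned mechanically. The following is a rough sketch, assuming the syslog line format shown above; the function name and regular expression are illustrative, and the year has to be supplied because syslog timestamps omit it (2020 is taken from the smartctl output in this thread).

```python
import re
from datetime import datetime

# Match only the ugen line of each disconnect so one event is not counted twice
# (the umass line for the same event also ends in "(disconnected)").
DISCONNECT = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"ugen\d+\.\d+: <([^>]*)> at \S+ \(disconnected\)"
)


def disconnect_events(lines, year=2020):
    """Return (timestamp, product) pairs for every USB disconnect in the log lines."""
    events = []
    for line in lines:
        match = DISCONNECT.match(line)
        if match:
            when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, match.group(2)))
    return events


sample = [
    "Aug 22 03:50:00 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)",
    "Aug 22 03:50:00 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)",
]
for when, product in disconnect_events(sample):
    # Print the weekday too, which makes it easy to compare against a test schedule.
    print(when.strftime("%A %Y-%m-%d %H:%M:%S"), product)
```

Feeding it the whole of /var/log/messages (and any rotated copies) would show at a glance whether 03:50 recurs or was a one-off.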
 
Thanks, mjollnir!
  • On labels: t450s is the name of my zpool(8) (because it's the disk in my ThinkPad T450s laptop), you should have replaced that with your zpool name. Do not give the same name to different disks: the benefit of labels is to have a unique identifier. ZFS labels disks, if you give it whole disks; thus the device can not be labeled by another utility.
Unfortunately, /dev/zvol/ doesn't exist at all on my system! I never ran zfs create, just zpool create. That gave me a filesystem to mount, and zfs list shows the ones I created that way. Was that the wrong thing to do?

  • See the drive's Extended Power Conditions: camcontrol epc da0 -c status
Code:
# camcontrol epc da0 -c status     
camcontrol: The epc subcommand only works with ATA protocol devices
# camcontrol epc da2 -c status
camcontrol: The epc subcommand only works with ATA protocol devices


  • If the drive supports it, you could try to disable standby with camcontrol epc da0 -c state -d -p Standby_y -s and camcontrol epc da0 -c state -d -p Standby_z -s
    The downside is no power saving when the system goes to standby. Or disable the EPC timer values; RTFM camcontrol(8)
Thanks, I'll have a look!
 
Well, nothing stands out in the SMART info, bearing in mind that Seagate is notorious for producing convoluted numbers that only their own software interprets correctly.
E.g., your raw Seek_Error_Rate of 43051781288 decodes to 10 actual seek errors out of 102108328 seeks.
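The arithmetic behind that can be sketched as follows. It assumes the commonly reported (but, as far as I know, not officially documented) layout for these Seagate attributes: a 48-bit raw value whose upper 16 bits hold the error count and whose lower 32 bits hold the operation count. The function name is made up for illustration.

```python
def decode_seagate_raw(raw: int):
    """Split a 48-bit Seagate raw value into (errors, operations).

    Assumed layout: upper 16 bits = error count, lower 32 bits = operation count.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations


# Raw values taken from the smartctl output posted in this thread.
print(decode_seagate_raw(43051781288))  # Seek_Error_Rate     -> (10, 102108328)
print(decode_seagate_raw(11391720))     # Raw_Read_Error_Rate -> (0, 11391720)
```

On that reading, the large raw numbers are mostly operation counters rather than error counts.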

I guess the important point is they are showing either pre-fail or old age. (And, pre-fail is a horrible notation unless it is actually in pre-fail mode, that is).

Why do you suspect the drive plays up at the same time? You've only shown one log entry. Are there others?

Remember, the OS has its own cron jobs that run. See /etc/crontab and /etc/periodic.

A gripe of mine is the vagueness of the error messages. The CDB messages are less than useless, unless you take them as authoritative and assume the reported blocks are actually unreadable/unwritable (which I doubt); a timeout is more likely. This is why I would have almost bet the house on this drive being shingled.

It's a quandary. Your drive, as reported by SMART, looks OK. That leads to the inevitable: it's FreeBSD. I need to think more about this.
 
Why do you suspect the drive plays up at the same time? You've only shown one log entry. Are there others?
Here are the errors for the Seagate disk in my /var/log/messages:
Code:
[snip]
Aug 20 20:15:39 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)
Aug 20 20:15:39 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ab 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ab 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ab 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ab 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ab 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ac 00 00 04 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ac 00 00 04 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ac 00 00 04 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ac 00 00 04 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 07 a3 ac 00 00 04 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b 80 af 0b 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b 80 af 0b 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b 80 af 0b 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b 80 af 0b 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b 80 af 0b 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 82 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 82 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 82 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 82 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 82 00 00 02 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 80 89 c3 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 80 89 c3 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 80 89 c3 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 80 89 c3 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0c 80 89 c3 00 00 01 00 
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 20 20:15:39 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 20 20:15:39 server kernel: da0: <Seagate Expansion Desk 0712>  s/n [REDACTED] detached
Aug 20 20:15:39 server kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Aug 20 20:15:39 server kernel: umass0: detached
Aug 20 20:15:39 server ZFS[67324]: vdev state changed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 20 20:15:39 server ZFS[67640]: vdev is removed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 20 20:15:43 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1
Aug 20 20:15:43 server kernel: umass0 on uhub0
Aug 20 20:15:43 server kernel: umass0: <Seagate Expansion Desk, class 0/0, rev 3.00/1.00, addr 1> on usbus1
Aug 20 20:15:43 server kernel: umass0:  SCSI over Bulk-Only; quirks = 0x0100
Aug 20 20:15:43 server kernel: umass0:6:0: Attached to scbus6
Aug 20 20:15:43 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 20 20:15:43 server kernel: da0: <Seagate Expansion Desk 0712> Fixed Direct Access SPC-4 SCSI device
Aug 20 20:15:43 server kernel: da0: Serial Number [REDACTED]
Aug 20 20:15:43 server kernel: da0: 400.000MB/s transfers
Aug 20 20:15:43 server kernel: da0: 3815447MB (976754645 4096 byte sectors)
Aug 20 20:15:43 server kernel: da0: quirks=0x2<NO_6_BYTE>
Aug 22 03:50:00 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)
Aug 22 03:50:00 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)
Aug 22 03:50:00 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 22 03:50:00 server kernel: da0: <Seagate Expansion Desk 0712>  s/n [REDACTED] detached
Aug 22 03:50:00 server kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Aug 22 03:50:00 server kernel: umass0: detached
Aug 22 03:50:08 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1
Aug 22 03:50:08 server kernel: umass0 on uhub0
Aug 22 03:50:08 server kernel: umass0: <Seagate Expansion Desk, class 0/0, rev 3.00/1.00, addr 1> on usbus1
Aug 22 03:50:08 server kernel: umass0:  SCSI over Bulk-Only; quirks = 0x0100
Aug 22 03:50:08 server kernel: umass0:6:0: Attached to scbus6
Aug 22 03:50:08 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 22 03:50:08 server kernel: da0: <Seagate Expansion Desk 0712> Fixed Direct Access SPC-4 SCSI device
Aug 22 03:50:08 server kernel: da0: Serial Number [REDACTED]
Aug 22 03:50:08 server kernel: da0: 400.000MB/s transfers
Aug 22 03:50:08 server kernel: da0: 3815447MB (976754645 4096 byte sectors)
Aug 22 03:50:08 server kernel: da0: quirks=0x2<NO_6_BYTE>
[snip]
Aug 23 22:34:42 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)
Aug 23 22:34:42 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 0b 80 09 75 00 00 09 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 0b 80 09 75 00 00 09 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 0b 80 09 75 00 00 09 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 0b 80 09 75 00 00 09 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 0b 80 09 75 00 00 09 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 61 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 61 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 61 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 61 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 61 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 60 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 60 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 60 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 60 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 0b b5 60 1e 00 00 20 00 
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 23 22:34:42 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 23 22:34:42 server kernel: da0: <Seagate Expansion Desk 0712>  s/n [REDACTED] detached
Aug 23 22:34:42 server kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Aug 23 22:34:42 server kernel: umass0: detached
Aug 23 22:34:42 server ZFS[24697]: vdev state changed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 23 22:34:42 server ZFS[25288]: vdev is removed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 23 22:34:47 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1
Aug 23 22:34:47 server kernel: umass0 on uhub0
Aug 23 22:34:47 server kernel: umass0: <Seagate Expansion Desk, class 0/0, rev 3.00/1.00, addr 1> on usbus1
Aug 23 22:34:47 server kernel: umass0:  SCSI over Bulk-Only; quirks = 0x0100
Aug 23 22:34:47 server kernel: umass0:6:0: Attached to scbus6
Aug 23 22:34:47 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 23 22:34:47 server kernel: da0: <Seagate Expansion Desk 0712> Fixed Direct Access SPC-4 SCSI device
Aug 23 22:34:47 server kernel: da0: Serial Number [REDACTED]
Aug 23 22:34:47 server kernel: da0: 400.000MB/s transfers
Aug 23 22:34:47 server kernel: da0: 3815447MB (976754645 4096 byte sectors)
Aug 23 22:34:47 server kernel: da0: quirks=0x2<NO_6_BYTE>
[snip]
The full log is attached as var-log-messages.txt.
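
To put the CDB lines above in concrete terms, here is a small sketch that decodes a READ(10)/WRITE(10) CDB as it appears in the log. The field layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) is the standard SCSI one; the 4096-byte sector size is taken from the da0 probe lines, and the function name is hypothetical.

```python
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex, sector_size=4096):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in OPCODES:
        raise ValueError("not a READ(10)/WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")       # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")    # transfer length in blocks
    return {
        "op": OPCODES[cdb[0]],
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * sector_size,
        "bytes": blocks * sector_size,
    }


# First failing command from the Aug 20 excerpt.
print(decode_rw10("2a 00 0c 07 a3 ab 00 00 01 00"))
```

Applied to the first failing command from Aug 20, this gives a single-block (4096-byte) write.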

A gripe of mine is the vagueness of the error messages. The CDB messages are less than useless, unless you take them as authoritative and assume the reported blocks are actually unreadable/unwritable (which I doubt); a timeout is more likely. This is why I would have almost bet the house on this drive being shingled.

It's a quandary. Your drive, as reported by SMART, looks OK. That leads to the inevitable: it's FreeBSD. I need to think more about this.
It seems unlikely to me, too, that the blocks are actually unreadable/unwritable as e.g. force-rebooting the system "fixes" the problem and lets me read/write those blocks again.

Is it possible that the drives use SMR internally but don't expose it over USB? I forgot to mention earlier, when running zonectl -d /dev/da0 -c rz, I get the following error in /var/log/messages:
Code:
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): ZBC IN. CDB: 95 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): SCSI status: Check Condition
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): Error 22, Unretryable error
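
As a rough aid for reading the ZBC IN line above, here is a sketch that splits the 16-byte CDB into its fields. It assumes the REPORT ZONES layout from the ZBC standard (service action in byte 1, zone start LBA in bytes 2-9, allocation length in bytes 10-13, reporting options in byte 14). Note that the CDB is what the host sent: the allocation length is the size of the reply buffer the host offered, and the sense data shows the device rejected the opcode itself. The function name is hypothetical.

```python
def decode_zbc_in(cdb_hex):
    """Decode a 16-byte ZBC IN (opcode 0x95) CDB given as space-separated hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x95:
        raise ValueError("not a 16-byte ZBC IN CDB")
    return {
        "service_action": cdb[1] & 0x1F,                    # 0x00 = REPORT ZONES
        "zone_start_lba": int.from_bytes(cdb[2:10], "big"),
        "allocation_length": int.from_bytes(cdb[10:14], "big"),  # host buffer size
        "reporting_options": cdb[14] & 0x3F,
    }


print(decode_zbc_in("95 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00"))
```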
 

Code:
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): ZBC IN. CDB: 95 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): SCSI status: Check Condition
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
Aug 29 11:10:19 server kernel: (da0:umass-sim0:0:0:0): Error 22, Unretryable error

This does suggest it does zone block reads because there's allocation size specified in the CDB. Perhaps it is "just" a misreading of the firmware by the driver, ie, a bug.

I did trawl through your logs, and it does seem you do need a firmware update:
Code:
Aug 13 18:14:02 server smartd[37433]: Device: /dev/ada0, WARNING: A firmware update for this drive may be available,
Aug 13 18:14:02 server smartd[37433]: see the following Seagate web pages:
Aug 13 18:14:02 server smartd[37433]: http://knowledge.seagate.com/articles/en_US/FAQ/207931en
Aug 13 18:14:02 server smartd[37433]: http://knowledge.seagate.com/articles/en_US/FAQ/223651en

If you go to the second link it definitely shows a different firmware number.
See: https://www.seagate.com/au/en/support/kb/barracuda-1tbdisk-platform-firmware-update-223651en/

Looking in the log you will see:
Code:
Aug 14 20:38:01 server kernel: ada0: <ST1000DM003-9YN162 CC4B> ATA8-ACS SATA 3.x device
Aug 14 20:38:01 server kernel: ada0: Serial Number S1D4H88M

Taking that serial number, selecting US as the region and using this URL it will show your drive is requiring a firmware update:

So, I suggest in the first instance updating the firmware (backup data first, if you need).
 
This does suggest it does zone block reads because there's allocation size specified in the CDB. Perhaps it is "just" a misreading of the firmware by the driver, ie, a bug.
Hm, that's interesting. Assuming it isn't a bug and the disk does zone reads internally, how do I tell whether that's the origin of the problems/how do I fix it if it is?

I did trawl through your logs, and it does seem you do need a firmware update:
Code:
Aug 13 18:14:02 server smartd[37433]: Device: /dev/ada0, WARNING: A firmware update for this drive may be available,
Aug 13 18:14:02 server smartd[37433]: see the following Seagate web pages:
Aug 13 18:14:02 server smartd[37433]: http://knowledge.seagate.com/articles/en_US/FAQ/207931en
Aug 13 18:14:02 server smartd[37433]: http://knowledge.seagate.com/articles/en_US/FAQ/223651en

If you go to the second link it definitely shows a different firmware number.
See: https://www.seagate.com/au/en/support/kb/barracuda-1tbdisk-platform-firmware-update-223651en/

Looking in the log you will see:
Code:
Aug 14 20:38:01 server kernel: ada0: <ST1000DM003-9YN162 CC4B> ATA8-ACS SATA 3.x device
Aug 14 20:38:01 server kernel: ada0: Serial Number S1D4H88M

Taking that serial number, selecting US as the region and using this URL it will show your drive is requiring a firmware update:

So, I suggest in the first instance updating the firmware (backup data first, if you need).
True, though that's ada0 (the internal SATA-connected hard disk), not da0 (the USB-connected hard disk that I'm having problems with). That internal disk is working perfectly fine, though you're right, I should update its firmware. It seems unlikely that that would solve the problem with the external disk though.
 
Hm, that's interesting. Assuming it isn't a bug and the disk does zone reads internally, how do I tell whether that's the origin of the problems/how do I fix it if it is?


True, though that's ada0 (the internal SATA-connected hard disk), not da0 (the USB-connected hard disk that I'm having problems with). That internal disk is working perfectly fine, though you're right, I should update its firmware. It seems unlikely that that would solve the problem with the external disk though.
Oops sorry I mistook ada for da.
 
Hm, that's interesting. Assuming it isn't a bug and the disk does zone reads internally, how do I tell whether that's the origin of the problems/how do I fix it if it is?

Well it reports itself as not SMR, so I guess you have to take the firmware's word for it.

The ZBC command returns an error, so the device does not support zoned block commands. Perhaps the only other definitive way is to re-format the disk as UFS, run dd on it and monitor the I/O. If it's SMR, it will show a burst of fast writes, then drop precipitously and settle at some really mediocre write rate. That's the shingled drive's modus operandi, because the write band or zone lies under another 'shingle' of data.
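
A minimal sketch of how the result of such a dd test could be evaluated, assuming you have logged the write throughput at regular intervals (however you choose to sample it). The function name, the burst length, and the 50% threshold are arbitrary assumptions for illustration, not established cut-offs.

```python
from statistics import median


def looks_like_smr(samples_mib_s, burst_len=10, drop_ratio=0.5):
    """Heuristic: True if sustained throughput falls well below the initial burst.

    samples_mib_s: write throughput samples in MiB/s, in time order.
    burst_len:     number of leading samples treated as the initial burst (assumed).
    drop_ratio:    sustained/burst ratio below which we flag the run (assumed).
    """
    if len(samples_mib_s) <= burst_len:
        raise ValueError("need more samples than burst_len")
    burst = median(samples_mib_s[:burst_len])
    sustained = median(samples_mib_s[burst_len:])
    return sustained < drop_ratio * burst


# A flat write rate is not flagged; a burst followed by a collapse is.
print(looks_like_smr([150] * 60))
print(looks_like_smr([180] * 10 + [40] * 50))
```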

(Those drives are so dodgy I personally believe it's a criminal act to sell them, especially as Seagate and Western Digital go to pains to hide the fact.)

The other possibility is that it's drive-managed, so it's "hiding" the SMR from the OS. Can you provide the diskinfo -v da0 output?

Output of camcontrol zone da0 -v -c rz?

(I apologise if I've asked these before, it's hard to keep track.)

If that still reports it as non-zoned, then there are 5 potential causes:

1. Disk is failing. (But SMART should give some indication.) Throw it away.
2. Power supply is failing. Take disk out of enclosure, fit it into another.
3. USB cable is damaged. Swap it out.
4. USB female socket is damaged on the host. Swap it to another USB port.
5. There's a bug in the CAM driver. Advise via PR.
 
Thanks for the suggestions!

Well it reports itself as not SMR, so I guess you have to take the firmware's word for it.

The ZBC command returns an error, so the device does not support zoned block commands. Perhaps the only other definitive way is to re-format the disk as UFS, run dd on it and monitor the I/O. If it's SMR, it will show a burst of fast writes, then drop precipitously and settle at some really mediocre write rate. That's the shingled drive's modus operandi, because the write band or zone lies under another 'shingle' of data.

(Those drives are so dodgy I personally believe it's a criminal act to sell them, especially as Seagate and Western Digital go to pains to hide the fact.)

The other possibility is that it's drive-managed, so it's "hiding" the SMR from the OS. Can you provide the diskinfo -v da0 output?

Output of camcontrol zone da0 -v -c rz?

(I apologise if I've asked these before, it's hard to keep track.)
I wiped the disk before formatting it with ZFS by dd(1)'ing from /dev/zero; that wrote the whole 4 TB at a constant 150 MiB/s, which seems decent over USB-3.
Code:
# diskinfo -v da0
da0
        4096            # sectorsize
        4000787025920   # mediasize in bytes (3.6T)
        976754645       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        60800           # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
        Seagate Expansion Desk  # Disk descr.
        NA4MHXW9        # Disk ident.
        No              # TRIM/UNMAP support
        Unknown         # Rotation rate in RPM
        Not_Zoned       # Zone Mode

# camcontrol zone da0 -v -c rz
(pass2:umass-sim0:0:0:0): ZBC IN. CDB: 95 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 
(pass2:umass-sim0:0:0:0): CAM status: SCSI Status Error
(pass2:umass-sim0:0:0:0): SCSI status: Check Condition
(pass2:umass-sim0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)


If that still reports it as non-zoned, then there are 5 potential causes:

1. Disk is failing. (But SMART should give some indication.) Throw it away.
2. Power supply is failing. Take disk out of enclosure, fit it into another.
3. USB cable is damaged. Swap it out.
4. USB female socket is damaged on the host. Swap it to another USB port.
5. There's a bug in the CAM driver. Advise via PR.
  1. Hopefully not, and Linux seemed to handle the disk fine even under heavy load while I got errors from FreeBSD.
  2. I'd really like to avoid this, as the enclosure is sealed and I'd have to break it to get the disk out; I can't just unscrew it.
  3. I've tried swapping the Toshiba disk's cable with the Seagate one (the Toshiba works perfectly now), but that didn't make a difference.
  4. I've tried that too, no difference.

I've had another one of these errors, by the way:
Code:
Aug 31 21:05:13 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1 (disconnected)
Aug 31 21:05:13 server kernel: umass0: at uhub0, port 1, addr 1 (disconnected)
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 11 0d 89 14 00 00 20 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 11 0d 89 14 00 00 20 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 11 0d 89 14 00 00 20 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 11 0d 89 14 00 00 20 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 11 0d 89 14 00 00 20 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 00 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 00 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 00 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 00 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 00 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 3a 38 17 42 00 00 02 00 
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Aug 31 21:05:13 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 31 21:05:13 server kernel: da0: <Seagate Expansion Desk 0712>  s/n [REDACTED] detached
Aug 31 21:05:13 server kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Aug 31 21:05:13 server kernel: umass0: detached
Aug 31 21:05:13 server ZFS[77721]: vdev state changed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 31 21:05:13 server ZFS[78365]: vdev is removed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Aug 31 21:05:19 server kernel: ugen1.2: <Seagate Expansion Desk> at usbus1
Aug 31 21:05:19 server kernel: umass0 on uhub0
Aug 31 21:05:19 server kernel: umass0: <Seagate Expansion Desk, class 0/0, rev 3.00/1.00, addr 1> on usbus1
Aug 31 21:05:19 server kernel: umass0:  SCSI over Bulk-Only; quirks = 0x0100
Aug 31 21:05:19 server kernel: umass0:6:0: Attached to scbus6
Aug 31 21:05:25 server kernel: da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
Aug 31 21:05:25 server kernel: da0: <Seagate Expansion Desk 0712> Fixed Direct Access SPC-4 SCSI device
Aug 31 21:05:25 server kernel: da0: Serial Number [REDACTED]
Aug 31 21:05:25 server kernel: da0: 400.000MB/s transfers
Aug 31 21:05:25 server kernel: da0: 3815447MB (976754645 4096 byte sectors)
Aug 31 21:05:25 server kernel: da0: quirks=0x2<NO_6_BYTE>
Sep  1 10:28:40 server ZFS[40593]: vdev state changed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
Sep  1 10:28:40 server ZFS[43612]: vdev state changed, pool_guid=$12612782409786294928 vdev_guid=$3198167944910114318
(Those last two lines are me running zpool clear $da0pool.)
This error happened while serving a 1.6GiB file over HTTP to another device on the LAN.
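
For reference, the CDB bytes in the log can be decoded to see which blocks the failing READ(10) was asking for. Below is a minimal Python sketch; the helper name `decode_read10` is made up for illustration, and the sector count and sector size are taken from the `da0:` attach lines above. It assumes the standard READ(10) layout (opcode 0x28, 4-byte big-endian LBA, 2-byte transfer length).

```python
import struct

def decode_read10(cdb_hex: str):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Hypothetical helper, not part of any FreeBSD tool.
    """
    cdb = bytes.fromhex(cdb_hex.strip())
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Byte 0: opcode, byte 1: flags, bytes 2-5: LBA (big-endian),
    # byte 6: group number, bytes 7-8: transfer length in blocks,
    # byte 9: control.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return lba, blocks

# Values copied from the kernel log above.
TOTAL_SECTORS = 976754645   # "976754645 4096 byte sectors"
SECTOR_SIZE = 4096

lba, blocks = decode_read10("28 00 3a 38 17 42 00 00 02 00")
print("LBA:", lba)                                  # 976754498
print("bytes requested:", blocks * SECTOR_SIZE)     # 8192
print("sectors before end of disk:", TOTAL_SECTORS - lba)  # 147
```

So the read that exhausted its retries was a two-block (8 KiB) request only 147 sectors from the end of the device. That alone doesn't say why it failed, but it may be worth checking whether the failures always hit the same region.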
 
Well, where to now?
I think the next step, should you be willing, is to test the USB subsystem. It will also involve using dtrace to attempt to find the potential software issue.
 
I have the same (or almost the same) Toshiba 4TB USB hard drives and the same issue on FreeBSD 13.0. I set up a ZFS raidz with 4TB Toshiba USB 3.0 hard drives. Everything works great until they go to sleep; once they do, they will not wake up. Any file I/O hangs the calling process to the point where it cannot be killed in any way. I need to hold the power button for 10 seconds because the system can't even shut down.

I get a bunch of "ccb request completed with failure" errors until it gives up. I tried everything in this thread, and searched for and tried every camcontrol power/sleep/standby setting, and nothing makes a difference. I tried them in USB 2.0 ports and still have the same problem.
Unfortunately I ended up having to move to Linux, where everything worked without issue using the same setup and ZFS. On another system I have different issues with even a simple 60GB mirror on FreeBSD, USB, and ZFS. I guess some hardware really doesn't work well with ZFS and USB.
 

There are lots of settings you can tweak in sysctl for timeouts, read_cache/write_cache just to name a few if you want to track down this issue, but it seems you've moved on. That's fine. Use whatever gets the job done quickest for you.
 
Well, I really only moved on to see if it works, which it does, and works great actually (Ubuntu Server 21.04, OpenZFS, Samba server 4.13.3). The disks go to sleep when there's no activity, and wake right up once needed. That being said, I'd still much rather go back to FreeBSD for this setup as I know it and like it better.
I made a raidz ZFS pool using zstd-6 compression with 3 external 4TB USB 3.0 hard drives and a Samba server to be used on my internal network. That way I can easily access it from any OS on my network: Windows, Mac, Linux, and FreeBSD. I also have an internal HD that runs the OS, so the 3 external HDs are completely separate for extra storage. This gives me about 7.5TB of space where 1 of the 3 drives can completely fail without losing any data. Been trying it out today and I really like it so far (except that on FreeBSD `mount_smbfs` only works with SMBv1, but that's another story).
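
As a rough sanity check on that capacity figure, here is a small Python sketch of the raidz arithmetic. It assumes single-parity raidz (raidz1) and drives of exactly 4 × 10^12 bytes, and it ignores ZFS metadata, padding, and compression, so the number `zfs list` reports will differ somewhat. The function name is just for illustration.

```python
def raidz_usable_bytes(drives: int, drive_bytes: int, parity: int = 1) -> int:
    """Idealised usable capacity of a raidz vdev: (drives - parity) * size.

    Ignores ZFS metadata, allocation padding, and compression.
    """
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * drive_bytes

DRIVE_BYTES = 4 * 10**12  # assumed: a "4TB" drive as marketed (decimal TB)

usable = raidz_usable_bytes(3, DRIVE_BYTES)
print(f"{usable / 10**12:.2f} TB (decimal)")   # 8.00 TB
print(f"{usable / 2**40:.2f} TiB (binary)")    # 7.28 TiB
```

Two drives' worth of data plus one of parity comes to 8 TB in decimal units, or about 7.28 TiB as most tools display it, which is in the same ballpark as the figure above.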

So basically my only issue is, once the hard drives go to sleep after a few minutes of no activity on FreeBSD, nothing will wake them up. I tried different combinations of things like:
```
camcontrol apm da2 -l 254
camcontrol standby /dev/da2 -t 3600
```
But no luck. Once they go to sleep, any file system call will indefinitely hang that process and terminal. No signals will work at all.
If I have to stay with Linux for this I will but I'd be willing to try some other things to get it working with FreeBSD. What else could I try? I really don't want them spinning 24/7 either so there's got to be a way to have them wake up properly. Thanks!
 