Generally my ZFS RaidZ2 pool (6 disks) works just fine. But more often than not, I get errors like the ones shown below when I run a scrub (twice a month), and the disk then ends up as unavailable. Smartd is of course running and does a short test twice a week and a long test twice a month.
/dev/da0 -a -o on -S on -n standby -s (S/../../(1|4/01)|L/../(07|22)/./02) -W 10,35,35 -p -m foo@bar.foo
But smartd has never logged any warnings or errors, and
smartctl -a /dev/da0
gives nothing either. So I tend to reboot the server, and then the disk works fine again. But something is clearly wrong. Any idea what?
Code:
Apr 25 02:00:24 freebsd kernel: (da0:mps0:0:0:0): READ(10). CDB: 28 00 bc 0c 21 f0 00 00 08 00
Apr 25 02:00:24 freebsd kernel: (da0:mps0:0:0:0): CAM status: SCSI Status Error
Apr 25 02:00:24 freebsd kernel: (da0:mps0:0:0:0): SCSI status: Check Condition
Apr 25 02:00:24 freebsd kernel: (da0:mps0:0:0:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Apr 25 02:00:24 freebsd kernel: (da0:mps0:0:0:0): Retrying command (per sense data)
Apr 25 02:05:52 freebsd kernel: (da0:mps0:0:0:0): READ(10). CDB: 28 00 24 48 39 10 00 00 08 00 length 4096 SMID 340 Command timeout on target 0(0x0009) 60000 set, 60.189811576 elapsed
Apr 25 02:05:52 freebsd kernel: mps0: Sending abort to target 0 for SMID 340
Apr 25 02:05:52 freebsd kernel: (da0:mps0:0:0:0): READ(10). CDB: 28 00 24 48 39 10 00 00 08 00 length 4096 SMID 340 Aborting command 0xfffffe000269e8e0
Apr 25 02:05:52 freebsd kernel: mps0: Finished abort recovery for target 0
Apr 25 02:05:52 freebsd kernel: mps0: Unfreezing devq for target ID 0
Apr 25 02:05:52 freebsd kernel: (da0:mps0:0:0:0): READ(10). CDB: 28 00 24 48 39 10 00 00 08 00
Apr 25 02:05:52 freebsd kernel: (da0:mps0:0:0:0): CAM status: Command timeout
Apr 25 02:05:52 freebsd kernel: (da0:mps0:0:0:0): Retrying command, 3 more tries remain
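
To make the logged commands easier to read, here is a minimal sketch that decodes the two READ(10) CDBs from the log above. It assumes the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the function name decode_read10 is made up for illustration and is not part of any tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Assumes the standard layout: byte 0 is the opcode (0x28),
    bytes 2-5 the starting LBA, bytes 7-8 the transfer length in blocks.
    """
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# The two CDBs that appear in the kernel log
for cdb in ("28 00 bc 0c 21 f0 00 00 08 00", "28 00 24 48 39 10 00 00 08 00"):
    info = decode_read10(cdb)
    print(f"{cdb} -> LBA {info['lba']}, {info['blocks']} blocks")
```

Both commands ask for 8 blocks, which matches the logged "length 4096" if the disk presents 512-byte logical sectors. The two reads also start at different LBAs, so the timeout is at least not tied to the same block as the first error.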