SCSI CAM errors in dmesg, but zpool status shows no errors?

Hi,

I'm seeing this in dmesg:
Code:
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 77 28 00 00 00 d0 00 00 length 106496 SMID 278 terminated ioc 804b loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 78 50 00 00 00 20 00 00 length 16384 SMID 356 terminated ioc 804b l(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 77 28 00 00 00 d0 00 00 
oginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 78 50 00 00 00 20 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 76 f0 00 00 00 20 00 00 
(da36:mpr1:0:8:0): CAM status: SCSI Status Error
(da36:mpr1:0:8:0): SCSI status: Check Condition
(da36:mpr1:0:8:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da36:mpr1:0:8:0): Info: 0x565cf7700
(da36:mpr1:0:8:0): Error 5, Unretryable error
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a1 c0 00 00 01 00 00 00 length 131072 SMID 1127 terminated ioc 804b(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a0 d8 00 00 00 20 00 00 
 loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a2 c0 00 00 01 00 00 00 length 131072 SMID 922 terminated ioc 804b (da36:mpr1:0:8:0): CAM status: SCSI Status Error
(da36:mpr1:0:8:0): SCSI status: Check Condition
(da36:mpr1:0:8:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da36:mpr1:0:8:0): Info: 0x565cfa0d8
(da36:mpr1:0:8:0): Error 5, Unretryable error
loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a3 c0 00 00 01 00 00 00 length 131072 SMID 216 terminated ioc 804b (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a1 c0 00 00 01 00 00 00 
loginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a2 c0 00 00 01 00 00 00 
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a4 c0 00 00 01 00 00 00 length 131072 SMID 485 terminated ioc 804b (da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a3 c0 00 00 01 00 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a5 c0 00 00 00 08 00 00 length 4096 SMID 792 terminated ioc 804b loginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a4 c0 00 00 01 00 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf a5 c0 00 00 00 08 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
    (da36:mpr1:0:8:0): WRITE(16). CDB: 8a 00 00 00 00 02 7d dc b2 88 00 00 00 08 00 00 length 4096 SMID 460 terminated ioc 804b loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 5c a5 21 28 00 00 00 18 00 00 length 12288 SMID 433 terminated ioc 804b l(da36:mpr1:0:8:0): WRITE(16). CDB: 8a 00 00 00 00 02 7d dc b2 88 00 00 00 08 00 00 
oginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 5c a5 21 40 00 00 00 20 00 00 length 16384 SMID 274 terminated ioc 804b loginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 04 28 d6 b8 00 00 01 00 00 00 length 131072 SMID 874 terminated ioc 804b (da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
loginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 5c a5 21 28 00 00 00 18 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 04 28 d7 b8 00 00 00 98 00 00 length 77824 SMID 990 terminated ioc 804b l(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 5c a5 21 40 00 00 00 20 00 00 
oginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
    (da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 04 f1 57 65 40 00 00 00 18 00 00 length 12288 SMID 926 terminated ioc 804b l(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 66 bb af 20 00 00 00 20 00 00 
oginfo 31080000 scsi 0 state 0 xfer 0
    (da36:mpr1:0:8:0): READ(10). CDB: 28 00 49 32 04 d8 00 00 18 00 length 12288 SMID 215 terminated ioc 804b loginfo 31080000 scsi 0 state 0 xfer 0
(da36:mpr1:0:8:0): CAM status: SCSI Status Error
(da36:mpr1:0:8:0): SCSI status: Check Condition
(da36:mpr1:0:8:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da36:mpr1:0:8:0): Info: 0x566bbaf28
(da36:mpr1:0:8:0): Error 5, Unretryable error
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 04 28 d6 b8 00 00 01 00 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 04 28 d7 b8 00 00 00 98 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 04 f1 57 65 40 00 00 00 18 00 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(10). CDB: 28 00 49 32 04 d8 00 00 18 00 
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command

However, zpool status shows 0 read, write, or checksum errors for this device. Is this normal? Does it mean the drive eventually completed these commands after a few retries? Will ZFS only react if the drive goes completely offline?

Please advise,

Thanks in advance
 
My recollection is that it will retry at the lower levels either two or three times, so if a retry succeeds, the error won't be reported to the higher levels (zpool).
 

That's understandable, but there are a few of these: "(da36:mpr1:0:8:0): Error 5, Unretryable error", which should have gone up the chain to ZFS, no?
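
To see how many commands were merely retried and how many failed outright, the messages can be tallied. This is only a minimal sketch: it assumes the log text keeps the wording shown in the excerpt above, and `summarize` is a hypothetical helper name, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Message fragments copied from the dmesg excerpt in this thread.
PATTERNS = {
    "retried": re.compile(r"Retrying command"),
    "unretryable": re.compile(r"Error 5, Unretryable error"),
    "medium_error": re.compile(r"SCSI sense: MEDIUM ERROR"),
}
# Captures the device name and the hex value on "Info:" lines.
INFO_RE = re.compile(r"\((da\d+):[^)]*\): Info: (0x[0-9a-fA-F]+)")


def summarize(log_text):
    """Count retry/failure messages and collect the values from Info: lines."""
    counts = Counter()
    info_values = []
    for line in log_text.splitlines():
        # findall copes with lines where two kernel messages got interleaved.
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(line))
        for dev, value in INFO_RE.findall(line):
            info_values.append((dev, int(value, 16)))
    return counts, info_values


sample = """\
(da36:mpr1:0:8:0): CAM status: CCB request completed with an error
(da36:mpr1:0:8:0): Retrying command
(da36:mpr1:0:8:0): READ(16). CDB: 88 00 00 00 00 05 65 cf 76 f0 00 00 00 20 00 00
(da36:mpr1:0:8:0): CAM status: SCSI Status Error
(da36:mpr1:0:8:0): SCSI status: Check Condition
(da36:mpr1:0:8:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da36:mpr1:0:8:0): Info: 0x565cf7700
(da36:mpr1:0:8:0): Error 5, Unretryable error
"""

counts, info_values = summarize(sample)
print(dict(counts))
print([(dev, hex(value)) for dev, value in info_values])
```

Feeding it the whole excerpt instead of `sample` would separate the "Retrying command" cases from the three "Unretryable error" cases in the post.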
 
Yes, it goes up to ZFS. But ZFS may then simply try the I/O using another disk (if the pool is redundant), or retry the I/O on the same disk, as Eric said, and if that succeeds, it will say that there is no problem. That's because there is no problem with the pool itself. Remember, ZFS is a file system with built-in RAID, but it does not contain built-in disk health management. There are other file/RAID systems that have built-in disk diagnostics and health management, and even orchestrate disk replacement for you, but ZFS doesn't.

Clearly, your disk is ill, and you should do some work with SMART to find out how bad it is, and perhaps consider retiring and replacing it.
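
To see where on the disk the unrecovered reads are, the CDB bytes from the log can be decoded. The sketch below assumes the standard READ/WRITE(10) and (16) CDB layouts and assumes the "Info:" value is the LBA the drive reported as failing. `decode_rw_cdb` is a hypothetical helper name, and each CDB/Info pairing is taken from the log lines just before the corresponding MEDIUM ERROR.

```python
def decode_rw_cdb(cdb_hex):
    """Return (opcode, start_lba, block_count) for READ/WRITE(10) and (16)."""
    cdb = bytes.fromhex(cdb_hex)  # spaces between byte pairs are ignored
    opcode = cdb[0]
    if opcode in (0x88, 0x8A) and len(cdb) == 16:    # READ(16), WRITE(16)
        lba = int.from_bytes(cdb[2:10], "big")       # 8-byte LBA
        count = int.from_bytes(cdb[10:14], "big")    # 4-byte transfer length
    elif opcode in (0x28, 0x2A) and len(cdb) == 10:  # READ(10), WRITE(10)
        lba = int.from_bytes(cdb[2:6], "big")        # 4-byte LBA
        count = int.from_bytes(cdb[7:9], "big")      # 2-byte transfer length
    else:
        raise ValueError("unsupported CDB: " + cdb_hex)
    return opcode, lba, count


# (CDB, Info) pairs for the three "Unretryable error" events in the log.
failed = [
    ("88 00 00 00 00 05 65 cf 76 f0 00 00 00 20 00 00", 0x565CF7700),
    ("88 00 00 00 00 05 65 cf a0 d8 00 00 00 20 00 00", 0x565CFA0D8),
    ("88 00 00 00 00 05 66 bb af 20 00 00 00 20 00 00", 0x566BBAF28),
]

offsets = []
for cdb_hex, info in failed:
    _, lba, count = decode_rw_cdb(cdb_hex)
    inside = lba <= info < lba + count
    offsets.append(info - lba)
    print(f"start {lba:#x}, {count} blocks, Info at offset {info - lba}, in range: {inside}")
```

All three Info values fall inside the range of the read that failed, which is consistent with a few specific bad sectors. The log also shows "length 16384" for a 0x20-block read, which suggests 512-byte logical sectors on this drive.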
 