Other SCSI Error after installing beta 3 release 11

nedry · Jul 29, 2016

I am using Hyper-V on Windows Server 2012 R2. Hyper-V is configured with 10000 MB RAM and a 127 IDE drive. After installing, I get the following error:

Code:

(da0:blkvsc0:0:0:0): storvsc scsi_status = 2
(da0:blkvsc0:0:0): WRITE(10). CDB: 2a 00 09 b4 d0 80 00 00 40 00
(da0:blkvsc0:0:0): CAM status: SCSI Status Error
(da0:blkvsc0:0:0): SCSI status: Check Condition
(da0:blkvsc0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating system definition)
(da0:blkvsc0:0:0:0): Retrying command (per sense data)

This happens because I took a checkpoint while the system was running rather than turned off. I pressed Enter and the system seems OK. With beta 2 I took checkpoints without turning the virtual machine off and did not get this error.
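For anyone trying to read the WRITE(10) line in the log above, here is a rough sketch of how the ten CDB bytes break down. It assumes the standard WRITE(10) layout (opcode 0x2a, a 4-byte big-endian logical block address, a 2-byte transfer length); the function name `decode_write10` is made up for this illustration and is not part of any FreeBSD tool.

```python
import struct


def decode_write10(cdb_hex: str) -> dict:
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration only.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    # Big-endian layout: opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length (in blocks), control byte.
    opcode, flags, lba, group, blocks, control = struct.unpack(">BBIBHB", cdb)
    return {"opcode": opcode, "lba": lba, "blocks": blocks}


# CDB bytes copied from the log line in this post.
info = decode_write10("2a 00 09 b4 d0 80 00 00 40 00")
print(info)
```

So the command that got bounced was an ordinary 64-block write at LBA 0x09b4d080; nothing about the command itself looks unusual.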
nedry

zapata · Aug 2, 2017

This problem is still present in 11.1-RELEASE. Is there a switch/patch to fix this? Or can we ignore this error?

zapata · Aug 2, 2017

zapata said:
Is there a switch/patch to fix this?

I get this error also on FreeBSD 12.0-CURRENT (r321915). :-(

SirDice · Aug 2, 2017

Has anyone created a PR for it yet? I know the Microsoft engineers working on FreeBSD/Hyper-V are usually quick to pick up issues like this. But it does require a proper PR to set things in motion.

ralphbsz · Aug 2, 2017

zapata said:
Or can we ignore this error?

Good question. How do we answer that question? By looking more at the log file: After this one error happens, does it repeat again?

If no, then it is only a cosmetic problem: The "disk drive" (virtualized disk provided by Hyper-V) is reporting a unit attention condition. That is normal SCSI tradition: if something "interesting" has happened, then the next time the initiator (the FreeBSD client in this case) executes a normal read or write operation, the disk drive tells it "I didn't do that command, because I first want to tell you about a really interesting condition". In this case the "really interesting condition" is: ASC/ASCQ=3F/2, meaning "changed operating system definition", and I have no clue what that means, nor do we need to care. The correct response from the initiator (FreeBSD) is to take note of the "really interesting condition" (for example by printing it to the system log), and then try the IO again.
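To make the "take note, then retry" behaviour concrete, here is a toy simulation in Python. It is only a sketch of the idea, not how FreeBSD's CAM layer or the storvsc driver is actually written: `FakeDisk`, `write_with_retry`, and the retry limit are all invented for this example. The only values taken from the real world are SCSI status 0x02 (Check Condition, the "scsi_status = 2" in the log), sense key 0x6 (Unit Attention), and the ASC/ASCQ pair 3f/02.

```python
from dataclasses import dataclass, field

GOOD = 0x00
CHECK_CONDITION = 0x02   # "scsi_status = 2" in the log
UNIT_ATTENTION = 0x06    # SCSI sense key


@dataclass
class FakeDisk:
    """Hypothetical virtual disk holding one pending unit attention."""
    pending: list = field(default_factory=lambda: [(0x3F, 0x02)])

    def write(self, lba, blocks):
        # A pending unit attention is reported INSTEAD of doing the I/O.
        if self.pending:
            asc, ascq = self.pending.pop(0)
            return CHECK_CONDITION, (UNIT_ATTENTION, asc, ascq)
        return GOOD, None


def write_with_retry(disk, lba, blocks, max_retries=4, log=print):
    """Return the number of retries needed; raise if the write never succeeds."""
    for attempt in range(1 + max_retries):
        status, sense = disk.write(lba, blocks)
        if status == GOOD:
            return attempt
        if sense is not None and sense[0] == UNIT_ATTENTION:
            # Take note of the condition, then try the same I/O again.
            log(f"UNIT ATTENTION asc:{sense[1]:x},{sense[2]:x}; retrying")
            continue
        raise IOError(f"unrecoverable SCSI status {status:#x}")
    raise IOError("retries exhausted")


messages = []
retries = write_with_retry(FakeDisk(), 162844800, 64, log=messages.append)
print(retries, messages)
```

In this toy model the first attempt is rejected, the condition is logged once, and the second attempt goes through, which is the harmless case.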

If this happens exactly once, or very rarely, and it does *not* happen when the IO is retried, then everything is fine, since the IO will succeed on the second try. If it happens every time and after a while FreeBSD gives up, then we have a problem.
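A quick way to answer the "does it repeat?" question is to count the relevant lines in the log. The sketch below does that for a text blob; the sample is pieced together from the messages quoted in this thread. One assumption to flag: I am guessing that a line containing "Retries exhausted" is what shows up when FreeBSD gives up on a command, so adjust that pattern to whatever your log actually says. The function name `summarize` is made up.

```python
import re

# Sample lines taken from the messages posted in this thread.
SAMPLE_LOG = """\
(da0:blkvsc0:0:0:0): storvsc scsi_status = 2
(da0:blkvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)
(da0:blkvsc0:0:0:0): Retrying command (per sense data)
"""


def summarize(log_text: str) -> dict:
    """Count unit attentions, retries, and (assumed) give-up messages."""
    counts = {
        "unit_attention": len(re.findall(r"UNIT ATTENTION asc:3f,2", log_text)),
        "retried": len(re.findall(r"Retrying command", log_text)),
        # ASSUMPTION: this is the wording used when retries run out.
        "gave_up": len(re.findall(r"Retries exhausted", log_text)),
    }
    # Harmless pattern: the condition was seen, but no command was abandoned.
    counts["looks_cosmetic"] = (
        counts["unit_attention"] > 0 and counts["gave_up"] == 0
    )
    return counts


summary = summarize(SAMPLE_LOG)
print(summary)
```

To use it on a real system, feed it the contents of /var/log/messages or the output of dmesg instead of `SAMPLE_LOG`. A large `unit_attention` count, or any nonzero `gave_up`, would be the case worth worrying about.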

Starting a PR is still a good idea, because this is at the minimum a cosmetic problem: Systems shall not print scary-looking (but perhaps harmless) error messages to the log, because it scares the horses (old joke, the users). Maybe the emulated disk provided by Hyper-V should not be reporting this condition in the first place. Maybe FreeBSD should know that emulated disks will occasionally report this condition and not print it.

Personal side remark: I hate the old-fashioned SCSI standard. Implementing it correctly is unnecessarily hard, because its mindset and data model are from the 1970s, when 50-pin parallel ribbon cable ruled the world, and SCSI controllers had to be emulated using two dozen NAND gates. The unit attention thing is like the old joke: "Just because I have attention deficit disorder doesn't mean ... oh look a squirrel". When implementing SMART, this style of error handling (and the many incompatible options the SCSI standard allows) just means that writing code is a lot of work, and error-prone. That crazy dance could be handled sanely, by having a sensible error and status reporting mechanism, which clearly distinguishes between errors (which prevent the requested command from being executed) and side conditions. Alas, this is the crappy standard we'll have to live with.

xanaduregio · Jul 18, 2018

I can also confirm this happens for Windows Server 2016 (Build 1607) using FreeBSD 11.2-RELEASE with a Generation 1 Hyper-V Virtual Machine using a vhdx.

Except my error message is slightly different. Pasting here for others that may be searching.

Code:

(da0:blkvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)

However, using either the vhd from the downloads page or manually installing to a manually created vhd, the issue is not reproduced.

But after converting either the downloaded vhd or the manually installed vhd to vhdx, the issue can be reproduced. So the issue occurs when using vhdx for the Hyper-V hard drive.

The best config I have found through testing at this time with Hyper-V on either Server 2012 R2 or Server 2016 is:
Generation 1 Virtual Machine
Use a VHD and not a VHDX

So, if you don't need more than 2TB of space, no problem.

ralphbsz said:
Starting a PR is still a good idea, because this is at the minimum a cosmetic problem: Systems shall not print scary-looking (but perhaps harmless) error messages to the log, because it scares the horses (old joke, the users). Maybe the emulated disk provided by Hyper-V should not be reporting this condition in the first place. Maybe FreeBSD should know that emulated disks will occasionally report this condition and not print it.

SirDice said:
Has anyone created a PR for it yet? I know the Microsoft engineers working on FreeBSD/Hyper-V are usually quick to pick up issues like this. But it does require a proper PR to set things in motion.

Hi, ralphbsz or SirDice, do either of you know if a PR has been done yet? I don't know what a "PR" is.

I suspect there is actually something a little deeper here with VHDX and further exploration with the SCSI controller is necessary. Any thoughts?

ralphbsz · Jul 18, 2018

A "PR" is a problem report. Other people might call it a bug; it is the thing that is tracked, to make sure bugs eventually get dealt with (for example, fixed). To learn about what they are, how to open them, and how to look for them, go to the Bug reports web page. I certainly have not filed a PR for your problem. That page shows how to search for it; I'm not volunteering to do this for you, because I'm super hectic busy today. If after searching for it you find that something closely related already exists, you might want to add a comment to the existing report, for example saying that you have found the same problem in a different context, or why in your opinion this is important to fix.

alexg · May 26, 2019

SirDice said:
Has anyone created a PR for it yet? I know the Microsoft engineers working on FreeBSD/Hyper-V are usually quick to pick up issues like this. But it does require a proper PR to set things in motion.

Hey SirDice, I've reported the issue https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236042
However, I'm unsure whether any Microsoft Hyper-V developers have looked at the bug report. If I email bsdic@microsoft.com, a bounce-back error is returned.

Cheers,
Alex.
