How do I determine which disks are nvme or USB in a script?

I wrote a script that runs every day and reports on emerging system problems (such as high file system capacity). In particular, I check SATA disk health with smartctl, using the approach documented by Backblaze: if any of five SMART attributes has a non-zero value that historically correlates with future failure, then tell me to replace that disk. For nvme drives, these SMART attributes don't apply, so I use the nvme CLI to check temps, wear, overall health, ... The current version of this script runs on one system and I manually provide the SATA or NVMe device names to separate functions. I'd like to run this on multiple systems, automate the discovery of disks, and call the appropriate tool (smartctl or nvme). I'd also like to skip USB-attached storage, which often does not work with smartctl. Here's the camcontrol devlist output.

Code:
 # camcontrol devlist
<WDC WD10JFCX-68N6GN0 82.00A82>    at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD10JFCX-68N6GN0 82.00A82>    at scbus2 target 0 lun 0 (pass1,ada1)
<HL-DT-ST BD-RE  WH16NS40 1.02>    at scbus3 target 0 lun 0 (cd0,pass2)
< USB Flash Memory 5.00>           at scbus4 target 0 lun 0 (da0,pass3)
<Seagate Expansion 9300>           at scbus5 target 0 lun 0 (da1,pass4)

The first two are SATA HDDs attached to SATA ports on the motherboard. The third is a SATA-attached optical drive, the fourth is a USB flash drive, and the fifth is a USB HDD. The text in the angle brackets seems to come from the device itself; for example, when I use a Toshiba flash drive, the text says Toshiba 8GB, with no hint that this is USB or flash. My question is: how do I determine from a script which of these are USB-attached? I can't figure out how to correlate the device names from camcontrol with the output from usbconfig or lsusb.
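For reference, here is a minimal sketch of that kind of per-disk check (the five attributes Backblaze highlights are 5, 187, 188, 197, and 198; the device name is just an example):
Code:
#!/bin/sh
# Sketch: flag a SATA disk if any of the five Backblaze-tracked SMART
# attributes (5, 187, 188, 197, 198) has a non-zero raw value.
disk=/dev/ada0
smartctl -A "$disk" | awk -v dev="$disk" '
    $1 == 5 || $1 == 187 || $1 == 188 || $1 == 197 || $1 == 198 {
        if ($10 != 0) printf "%s: attribute %s (%s) raw value %s\n", dev, $1, $2, $10
    }'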

Paul
 
I have been working on this problem myself for a while, primarily for a drive maintenance and USB imaging utility. In a default installation, FreeBSD makes this much easier than Linux or macOS, as USB, SATA, and NVMe drives will all have different device name prefixes.
SATA drives will be ada(0-9)
USB drives will be da(0-9)
NVMe drives will be nvd(0-9) or nvme(0-9)

The main thing to worry about is if the system has SCSI drives, which will also be da(0-9). I'm not sure if this also applies to SAS, as I don't have any of those to test.

Now, I want some fail-safe assurance rather than just assuming, even though the assumptions about device types and names are fairly safe in FreeBSD. I'll have to go back and review my scripts to get the actual commands, but for now, here's the basic logic I'm working with:

Check if da(0-9) exists
If yes, verify it is a USB drive
If yes, then proceed with USB utilities.

There are probably much better ways to determine this, and I'm sure some more experienced scripters and admins will chime in, but maybe this can get you started.
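In the meantime, a rough sketch of that check (assuming diskinfo(8)'s Attachment field reports a umass device for USB-attached disks; verify that on your own system):
Code:
#!/bin/sh
# Sketch: for each daN device present, check whether it is USB-attached
# (umass) before running USB-specific utilities.
for disk in /dev/da[0-9]; do
    [ -e "$disk" ] || continue
    attach=$(diskinfo -v "$disk" | awk '/# Attachment/ {print $1}')
    case "$attach" in
        umass*) echo "$disk is USB-attached ($attach)" ;;
        *)      echo "$disk is not USB ($attach)" ;;
    esac
done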
 
Isn't this information in the SMART info? You're fetching that in any case, regardless of the type of drive. But the simplest way would be just to look at the device name: nvd* is NVMe, ada* or da* is SATA/SAS.
 
In my experience, ZFS will usually throw errors long before the disk firmware admits something is wrong and increases any counters (let alone changes that useless health status...), so monitoring the ZFS error counters for all pools should be the priority.

Regarding smartctl:
Why don't you just enable smartd and define the disk types you want to monitor in smartd.conf(5)?
Apart from just sending a warning email, you can run scripts via -M exec and take immediate action when a drive shows errors (e.g. trigger resilvering onto a spare drive).
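A minimal smartd.conf along those lines might look like this (the alert script path is just a placeholder):
Code:
# /usr/local/etc/smartd.conf (example)
# Monitor all SMART attributes on the SATA disks, mail warnings to root,
# and run a script for immediate action when a problem is reported.
/dev/ada0 -a -m root -M exec /usr/local/sbin/disk_alert.sh
/dev/ada1 -a -m root -M exec /usr/local/sbin/disk_alert.sh
# Or let smartd discover devices itself:
# DEVICESCAN -a -m root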
 
I have been working on this problem myself for a while, primarily for a drive maintenance and USB imaging utility. In a default installation, FreeBSD makes this much easier than Linux or macOS, as USB, SATA, and NVMe drives will all have different device name prefixes.
SATA drives will be ada(0-9)
USB drives will be da(0-9)
NVMe drives will be nvd(0-9) or nvme(0-9)

The main thing to worry about is if the system has SCSI drives, which will also be da(0-9). I'm not sure if this also applies to SAS, as I don't have any of those to test.

Now, I want some fail-safe assurance rather than just assuming, even though the assumptions about device types and names are fairly safe in FreeBSD. I'll have to go back and review my scripts to get the actual commands, but for now, here's the basic logic I'm working with:

Check if da(0-9) exists
If yes, verify it is a USB drive
If yes, then proceed with USB utilities.

There are probably much better ways to determine this, and I'm sure some more experienced scripters and admins will chime in, but maybe this can get you started.
Thanks! This is helpful and confirms my suspicion of drive name consistency. I'm also hoping for some assurance :cool:

Paul
 
Isn't this information in the SMART info? You're fetching that in any case, regardless of the type of drive. But the simplest way would be just to look at the device name: nvd* is NVMe, ada* or da* is SATA/SAS.
Additional disambiguation may be done with the diskinfo(8) command:
Code:
[sherman.144] # diskinfo -v /dev/ada0 | grep Attachment
    ahcich0         # Attachment
[sherman.145] # diskinfo -v /dev/da7 | grep Attachment
    mps1            # Attachment
[sherman.146] # diskinfo -v /dev/da8 | grep Attachment
    umass-sim0      # Attachment
The "Attachment" identifies the achi(4) ahci(4), mps(4), and umass(4) drivers for SATA, LSI, and USB respectively.
 
For nvme drives, these SMART attributes don't apply, so I use the nvme CLI to check temps, wear, overall health, ...
I think you're confusing several things here.

One is the attachment of the disk: the wire between the computer and the disk may be SATA, SAS (a form of SCSI), USB, or NVMe. That classification is actually already over-simplified, as USB disks use a special form of SCSI to communicate. Correlated with the attachment question is: does this disk drive implement SMART? Today, all SATA and SAS disks do; USB devices are hit and miss, and NVMe is different and does not implement traditional SMART.

The other is the type of disk drive: spinning rust (platters, heads, actuators) versus flash chips (SSDs of various forms, which includes today's NVMe disks). In theory, it would be possible to connect spinning disks via NVMe, but I don't know of any system that does this in production. So NVMe effectively implies flash chips.

Now, the Backblaze "predictive failure analysis" or PFA applies only to spinning disks that are connected via SATA. So how about doing this: the device name tells you whether a disk is NVMe or not. If it is NVMe, use the NVMe/flash-specific form of PFA. Then, check whether the disk is spinning (that's easy to do: the RPM line in "camcontrol identify" tells you whether the disk spins). The same camcontrol identify command also tells you whether the disk supports SMART. If it does not, I don't know what to do. Otherwise, use the Backblaze PFA on SMART-enabled spinning disks only.

In a nutshell: A series of if statements.
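A sketch of those two tests against camcontrol identify output (the grep patterns are assumptions; check them against your own drives):
Code:
#!/bin/sh
# Sketch: pick the right health check for one SATA disk.
disk=ada0
ident=$(camcontrol identify "$disk")

if echo "$ident" | grep -q "media RPM.*non-rotating"; then
    echo "$disk: SSD, the spinning-disk PFA attributes do not apply"
elif echo "$ident" | grep -q "^SMART.*yes"; then
    echo "$disk: spinning disk with SMART, run the Backblaze PFA check"
else
    echo "$disk: no SMART support, needs manual attention"
fi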
 
Additional disambiguation may be done with the diskinfo(8) command:
Code:
[sherman.144] # diskinfo -v /dev/ada0 | grep Attachment
    ahcich0         # Attachment
[sherman.145] # diskinfo -v /dev/da7 | grep Attachment
    mps1            # Attachment
[sherman.146] # diskinfo -v /dev/da8 | grep Attachment
    umass-sim0      # Attachment
The "Attachment" identifies the achi(4), mps(4), and umass(4) drivers for SATA, LSI, and USB respectively.
Thanks! This attachment info is part of what I need. As ralphbsz pointed out, I also need to separate out SSDs on any attachment, since SSDs are not covered by the Backblaze SMART attribute tests. The attachment info also helps with my next project: finding hardware errors in syslog. Thanks again.
 
I think you're confusing several things here.

One is the attachment of the disk: the wire between the computer and the disk may be SATA, SAS (a form of SCSI), USB, or NVMe. That classification is actually already over-simplified, as USB disks use a special form of SCSI to communicate. Correlated with the attachment question is: does this disk drive implement SMART? Today, all SATA and SAS disks do; USB devices are hit and miss, and NVMe is different and does not implement traditional SMART.

The other is the type of disk drive: spinning rust (platters, heads, actuators) versus flash chips (SSDs of various forms, which includes today's NVMe disks). In theory, it would be possible to connect spinning disks via NVMe, but I don't know of any system that does this in production. So NVMe effectively implies flash chips.

Now, the Backblaze "predictive failure analysis" or PFA applies only to spinning disks that are connected via SATA. So how about doing this: the device name tells you whether a disk is NVMe or not. If it is NVMe, use the NVMe/flash-specific form of PFA. Then, check whether the disk is spinning (that's easy to do: the RPM line in "camcontrol identify" tells you whether the disk spins). The same camcontrol identify command also tells you whether the disk supports SMART. If it does not, I don't know what to do. Otherwise, use the Backblaze PFA on SMART-enabled spinning disks only.

In a nutshell: A series of if statements.
You are correct. I knew the Backblaze PFA logic only applies to spinning disks and I neglected to address this in my code :-( So far, my testing has omitted SATA SSDs, but I have a few which I'd like to support in this script (based on the rotation info). My series of if statements should also report any combinations that the checks don't address. For example, I'm not expecting to purchase SAS hardware and am not adding SAS-specific logic, but I should report it if I do see it.

Thanks for the help!
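Something like this dispatch, with an explicit default branch so anything unexpected shows up in the report (check_nvme, check_backblaze, and check_ssd are placeholders for the existing functions in my script):
Code:
#!/bin/sh
# Sketch of the per-disk dispatch with an explicit "unhandled" branch.
check_nvme()      { echo "nvme checks for $1"; }       # placeholder
check_backblaze() { echo "Backblaze PFA for $1"; }     # placeholder
check_ssd()       { echo "SSD checks for $1"; }        # placeholder

check_disk() {
    disk=$1; kind=$2    # kind: nvme | usb | sata-hdd | sata-ssd | ...
    case "$kind" in
        nvme)     check_nvme "$disk" ;;
        usb)      ;;  # skipped: smartctl is unreliable over USB bridges
        sata-hdd) check_backblaze "$disk" ;;
        sata-ssd) check_ssd "$disk" ;;
        *)        echo "REPORT: $disk has unhandled type '$kind'" ;;
    esac
}

# Example: check_disk ada0 sata-hdd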
 
In my experience, ZFS will usually throw errors long before the disk firmware admits something is wrong and increases any counters (let alone changes that useless health status...), so monitoring the ZFS error counters for all pools should be the priority.

Regarding smartctl:
Why don't you just enable smartd and define the disk types you want to monitor in smartd.conf(5)?
Apart from just sending a warning email, you can run scripts via -M exec and take immediate action when a drive shows errors (e.g. trigger resilvering onto a spare drive).
I have code to look at the pool status, and I run scrubs periodically. I still see value in checking the SMART attributes (for example, boot and swap partitions are not ZFS, so SMART attributes may go non-zero without being detected by zpool scrubs). My choice not to use smartd is to allow full automation (not needing to craft smartd.conf on each system) and to have all detected issues in one report. But I agree the disk checks in my report could be done by smartd.
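A minimal version of that pool check looks roughly like this (zpool status -x prints a one-line summary when all pools are healthy):
Code:
#!/bin/sh
# Sketch: report unhealthy pools and any non-zero per-device error counters.
status=$(zpool status -x)
[ "$status" = "all pools are healthy" ] || echo "REPORT: $status"

# Config lines have READ/WRITE/CKSUM in columns 3-5; print any with errors.
zpool status | awk '$3 ~ /^[0-9]/ && ($3 > 0 || $4 > 0 || $5 > 0)'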
 